From YouTube: 2022-03-15 CNCF TAG Observability Meeting
Description
* Hubble, cilium.io, https://github.com/cilium/hubble
* Pixie, https://pixielabs.ai, https://github.com/pixie-io/pixie, blog.px.dev
* KubeCon EU, TAG Observability Meetup
#observability #ebpf #cncf
A
We've got an exciting agenda today; we have two folks, two teams here to present. This is the mid-month meeting of TAG Observability; it's a CNCF-related event. As such, the code of conduct does apply, so please don't do anything that would be in violation of that code. And again, apologies for being a little tardy. Let's just start right away at the top of the agenda.
B
Yeah, so I actually put that in, and my point should be very quick. I wanted to announce that there was an opportunity to organize project-specific meetings at the next KubeCon in Europe in Valencia, which is in May, and we had the opportunity to create some dedicated time for our TAG meeting. So it's an opportunity to meet everyone who will be joining KubeCon in Europe. I just wanted to announce that it will be around 11:00 a.m. on Monday, I believe—so Monday.
B
I double-checked: the 16th of May—is that the Monday? Yeah, probably. Essentially we have two dedicated hours to speak about anything and just meet each other, and if we need any special equipment or anything, let me know. We also probably want to decide if we want to make it in-person only, or whether we want to sync up and have a virtual component as well, together with the in-person meeting. We can do that as well; we have a projector and so on.
B
Okay, if there are no other comments, then I guess we can go to the next agenda item, which is the Hubble project presentation.
C
Hey folks, this is Thomas.
C
My name is Thomas. I'm the co-founder of Isovalent and also one of the creators of Cilium. Cilium is essentially the base project, the overall project, and Hubble, the observability layer of Cilium, is what we're talking about today. Cilium itself is a CNCF project at incubation level.
C
So when we're talking about Cilium and eBPF, those are open source projects; I'm also a creator and founder of Isovalent, the company behind Cilium and Hubble.
C
What is Cilium? Cilium is actually more than observability, but today we will talk about the observability layer, which is called Hubble. Overall, Cilium also provides networking and load balancing, Cilium is a CNI, and we can do service mesh. We can do a lot of network security and also runtime security. But what's interesting today is the observability layer: we can provide Prometheus metrics, extensive flow logging, OpenTelemetry output, and service dependency graphs. What is unique about Cilium is that Cilium itself is entirely based on eBPF and uses eBPF to its full extent.
C
In fact, we extended eBPF for the first two years before we had even started the Cilium project. So let's talk briefly and look into this eBPF technology, for those of you who have never heard about it. eBPF is actually very, very simple to understand: it makes the Linux kernel programmable, essentially allowing you to run a program such as this one.
C
This is C code, but it's also possible to write this in higher-level languages—to run a program when certain events in the kernel happen. In this case we're using a system call, but this could also be a network packet, a storage access, a function call of a user-space application, a kernel tracepoint, and so on. We can then use that program to actually extract visibility.
C
The program I'm showing here is actually called every time the exec system call is made—for example, when somebody invokes a new command in the shell. We can then export statistics like CPU usage: what is the PID, what is the UID, and so on. This allows us to build flame graphs and get a lot of additional visibility.
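For readers who have never seen such a program, here is a minimal sketch of the kind of thing described above, assuming a libbpf-style build: an eBPF program attached to the execve tracepoint that reports the PID and UID of each new process. The section name, helpers, and trace-pipe output are illustrative assumptions, not Cilium's actual code.

```c
// Minimal sketch (not Cilium's code) of the kind of program described:
// attach to the execve tracepoint and report PID and UID for every new
// command. Builds with clang -target bpf against libbpf headers.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("tracepoint/syscalls/sys_enter_execve")
int trace_execve(void *ctx)
{
    __u64 pid_tgid = bpf_get_current_pid_tgid(); /* upper 32 bits: tgid */
    __u64 uid_gid  = bpf_get_current_uid_gid();  /* lower 32 bits: uid  */

    /* Write to the kernel trace pipe; a real collector would push the
       event to user space through a ring buffer or perf event array. */
    bpf_printk("execve pid=%d uid=%d",
               (int)(pid_tgid >> 32), (int)(uid_gid & 0xffffffff));
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```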
C
What is very unique about eBPF is that it is essentially a general-purpose, or almost general-purpose, runtime. It's very, very similar to JavaScript in the browser, but for the Linux kernel. We can load bytecode into the Linux kernel and essentially run it, and that means we can all of a sudden add functionality to the Linux kernel that was not there before. So we can add our tracing and observability functionality; we can parse HTTP headers.
C
I'm showing one example here where we can make the impact of this obvious. This is a quick benchmark of how efficient eBPF-based visibility can be. In this case we're showing the difference between a proxy- or sidecar-based HTTP visibility approach—the yellow one, with injected proxies—and an eBPF-based one, which is the red one; the baseline, essentially just the benchmark itself, is the blue bar. This shows, in particular at higher requests per second, how minimal the eBPF overhead is to, for example, provide HTTP visibility. It's just one example, but this is essentially true across the board: eBPF is both powerful and super low overhead, which is an ideal combination in the observability space. Overall, the opportunity is actually much bigger, because the kernel is a super powerful place to extract visibility—the kernel can see everything—but it has historically been very hard to get kernel changes into the hands of end users.
C
I was a kernel developer myself for more than 10 years, at Red Hat, and it usually took years and years for a new kernel version to make it into the hands of end users, which made it very difficult to extend the kernel or build new functionality into it and get that out to users quickly. eBPF is changing this because, all of a sudden, we can make changes in real time—at any given time—and just load the programs and run them.
C
So this has allowed for a very similar innovation to when JavaScript was added to browsers: all of a sudden we no longer needed to upgrade our browsers just to load a new website, which was clearly the case 20 years ago, when we had to upgrade our browsers frequently. So that's the power of eBPF. We'll skip the rest here and go into Hubble.
C
Hubble provides a lot of different things. It provides the visibility on the left—we'll look into that briefly. It provides metrics, Prometheus metrics, but they can also be exported into a SIEM, to Elasticsearch or something similar. And then there's also a network tap, essentially distributed pcap, where you can get real copies of the network traffic that we're seeing.
C
This is the example I'm showing here in terms of how Hubble is evolving network flow visibility. For the networking folks, this is a classic flow log; it represents typical five-tuple-based logging. It's essentially "this IP is talking to this IP, number of bytes, number of packets" and so on—not really useful in the context of containers, Kubernetes, and cloud native.
C
This is the visibility that we provide, which at first glance doesn't even look like a network flow log, but it actually shows you not only the Kubernetes namespace and the pod, but the entire process ancestry—who is invoking what. We can see that dockerd is the runtime here, spawned by systemd; then we see a crawler binary which is running containerized.
C
Then we see that this crawler binary is invoking a node app, and then—surprise, surprise—there was actually a compromised app here. We see that a reverse shell was used to reach out of the cluster and apparently received instructions, and then an attacker was using curl locally. We see the actual network connections as arrows here, with the destinations they reached out to, and we see that the actual workload is talking to our Elasticsearch server here, but the attacker also attempted to reach out to it.
C
We can even see the layer 7 observability data here: we're actually observing an HTTP GET to /user/search. So this is very, very powerful visibility, because eBPF can see everything from the network layer to the runtime layer and so on. But let's switch over to a live demo, where we can actually see what type of visibility we can provide on top of this. So let me switch my screen here and show you a couple of the metrics that we can generate.
C
I hope—excellent. So there's a ton of dashboards that you can build; I'm showing a few of them right now. Obviously we can look at the raw network level: this is the Prometheus export with just the standard Grafana dashboard. We can export the same metrics as OpenTelemetry metrics as well, or in any format that you really want. So we can see, for example, forwarded versus dropped traffic—we can see that a certain amount of traffic is constantly being dropped for policy-deny reasons.
C
Looking at the network layer, we see what type of TCP events are ongoing. We see how many TCP SYNs are being sent without being responded to—essentially a graph that shows you how many connections are timing out. We can see why packets are being denied. Let's go into the HTTP layer: we can observe the entire HTTP traffic that we're seeing, and all of this completely transparently—no changes to apps or anything like that.
C
You can deploy this as a DaemonSet into your cluster and transparently get these metrics out: for example, what type of HTTP traffic, what type of responses—we're not getting any 404s right now, which is good—and latency, just graphing p50 and p99 here. Then DNS, probably the favorite dashboard of platform teams and also app teams: we can see what type of DNS requests and responses, how many errors, and which pods are currently receiving DNS errors.
C
That's something people look at very quickly, because DNS is the source of issues so many times—but also, what are the queries that are actually being made? So that's the network view of things.
C
Let's also quickly look at some more advanced metrics. We can, for example, hook into the TCP layer and graph the smoothed round-trip time of all TCP connections. We can look into which pod is consuming or producing how much traffic. You can see, for example, the max traffic per pod here—this is a service called node-exporter, pumping out at most around 70 megs per second.
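The smoothed round-trip time mentioned above is the kind of value eBPF can read directly out of kernel TCP state. As a rough, hedged illustration of the idea only (not Hubble's implementation), a CO-RE style program could hook a TCP receive path and read the socket's srtt field:

```c
// Hedged sketch only: read the kernel's smoothed RTT for established TCP
// connections. Assumes a BTF-enabled kernel, a generated vmlinux.h, and
// libbpf CO-RE; the attach point and names are illustrative assumptions.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

SEC("fentry/tcp_rcv_established")
int BPF_PROG(observe_srtt, struct sock *sk)
{
    struct tcp_sock *tp = (struct tcp_sock *)sk;

    /* srtt_us is stored left-shifted by 3 in the kernel. */
    __u32 srtt_us = BPF_CORE_READ(tp, srtt_us) >> 3;

    /* A real exporter would aggregate this into a per-pod histogram map
       that a user-space agent scrapes; here we just log it. */
    bpf_printk("srtt_us=%u", srtt_us);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```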
C
You can also look at the average, and we see that the Hubble Timescape ingester is averaging the most traffic. You can look at how much traffic each pod is generating, and we can then go down to the binary level—not just the pod level, but which binary inside which pod is consuming or producing how much traffic. That also means we can look at the kubelet, for example—not just containerized or pod-level traffic, but the kubelet as well.
C
You can see that there is a source controller which is talking to github.com frequently, and apparently that is occasionally seeing TCP retransmissions. So, lots of different visibility. We could even go into lower-level metrics: these are the network interface statistics, where you can see which node is sending or receiving how much traffic on which interface, interface errors, and so on. So there's a massive amount of statistics and visibility that we can explore.
C
So to summarize—I know this was a quick intro—Hubble is the observability layer of Cilium, all eBPF-based. It provides Prometheus metrics from the lower network levels all the way to layer 7: HTTP, Kafka, DNS, and so on. We can use it at the platform-team level, for security metrics and network policy, all the way to building golden-signals dashboards. Because it's using eBPF it's completely transparent, and it's part of the Cilium project, which is a CNCF project at incubation level.
C
I hope this was a good initial intro to Hubble. If you're interested and want to learn more, feel free to go to cilium.io or join our Slack—lots of people are happy to answer questions there. You can obviously also DM me on Twitter or on Slack, and I'm happy to answer questions as well.
A
No, we—we absolutely do. Thank you so much. If you could also, afterwards, put a link to the slides that you presented into the doc. But there is a question here from Dan; I think he's still here, so I'll let him ask it.
B
Yeah, great presentation. We're actually exploring adopting a mesh right now, and the key thing that's been murky to me is how I actually enrich my metrics, so this was a great presentation. Can you go into a little bit of detail on how that actually works? Let's say I have a metric X that I want to tag—we do ownership-driven tagging, so I want to say this team owns this metric—through the sidecar, how would I achieve that?
C
So I think the wonderful part here is that Cilium achieves all of this visibility without sidecars. When we talk about service mesh, we're not talking about the sidecar model; we're talking about a sidecar-free service mesh model that's entirely eBPF- and Envoy-driven, but without sidecars. From a label perspective and an observability perspective, we have programmable metrics, which means you can add as much context to every Prometheus metric as you want—namespaces, labels, whatever—and then we have RBAC functionality where, based on those labels, you can restrict who sees which metric. So you can, for example, have a Prometheus scraping endpoint that only exposes the metrics of one particular namespace, and then have a dashboard specific to a team that only sees that namespace as well. Does that make sense?
C
So there's static configuration, in Golang as well—a lot can be done in static, CRD-based configuration—and then, if you want to go further, there's essentially a Go plug-in layer that allows you to create more metrics, aggregate them, or use a different aggregation form, and so on.
E
Tom, can I ask a question? I recently met with our director of networking, and what you just showed is exactly what he's looking for. We're mostly an AWS shop, and he's desperate to find something that will show all the dependencies between all these network pieces. And I'm curious about this not just for Kubernetes but in general, because Kubernetes is just part of this picture. So how exactly do you extract this data?
E
Do you just suggest using a simple Grafana/AWS interface, or do you extract this data using some AWS connectors or exporters and then work with that data afterwards? Yeah.
C
So for metrics we have Prometheus and OpenTelemetry support. For flow logging, that's essentially event-based—we're seeing this connection, this connection, this connection—and it's JSON.
C
We have a fluentd plugin, so you can export this into whatever system you want. Then we also have a Timescape-based time-series database where you can store this persistently in your own Kubernetes cluster, and we have Hubble UI, which is a service dependency graph that ingests the flow data and the metric data and shows it. That can, for example, show you a service dependency graph—who's depending on what—but you can also achieve that with the OpenTelemetry export if you're using something like Jaeger or another tool.
E
I see. So I'm using fluentd anyway to push this data somewhere I can look at it, right? Yep. So, if I'm understanding correctly, I just have to push this to another plugin—your plugin—and you would take care of it; you would build all these objects. Okay, exactly, yes. Okay, but this only covers what can be discovered and mapped into a relation map inside of a Kubernetes cluster itself, okay.
C
You can also run Cilium on any Linux machine, and eBPF has recently been ported over to Windows as well—though right now Cilium on Windows is still at kind of an alpha level. But you can run Cilium on any machine you want; it's not actually Kubernetes-specific. If there are no Kubernetes pods, it will simply report the raw processes.
A
Yeah, we'll have all the links and everything; I can send out a blast to the email list as well. I think Eric had a question as well—I think he's still online.
F
Yeah, so I actually have a lot of questions, but my most basic one is: with all of those statistics running, what kind of overhead does that take? Because I know you showed that one graph where, the more requests you had, it was just a tiny overhead, but you're gathering a ton of information across a lot of different network interfaces.
C
Yeah, so I think the massive benefit of eBPF is that all of this histogram collection—which in the past you used to do by exporting a lot of samples from the kernel to user space and then aggregating and collecting histograms in user space—is now done in-kernel, which is very efficient. So the real overhead actually comes at the level of Prometheus and so on, like Prometheus memory.
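To make the "aggregate in the kernel, scrape only the summary" point concrete, here is a hedged sketch of the common pattern, assuming libbpf-style map definitions; the map name and bucket count are illustrative assumptions, not Cilium's or Hubble's code:

```c
// Hedged sketch of in-kernel aggregation: keep a log2 latency histogram
// in a BPF array map so user space reads ~27 counters instead of
// receiving every individual sample.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define MAX_SLOTS 27

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, MAX_SLOTS);
    __type(key, __u32);
    __type(value, __u64);
} latency_hist SEC(".maps");

/* Called from whichever hook measures a latency in microseconds;
   the bucket index is the log2 of the value, capped at MAX_SLOTS - 1. */
static __always_inline void record_latency_us(__u64 us)
{
    __u32 slot = 0;

    while (us > 1 && slot < MAX_SLOTS - 1) {
        us >>= 1;
        slot++;
    }

    __u64 *count = bpf_map_lookup_elem(&latency_hist, &slot);
    if (count)
        __sync_fetch_and_add(count, 1);
}

char LICENSE[] SEC("license") = "GPL";
```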
C
That's where you think about things like label complexity and how much data to add; the actual collection of the metrics itself is incredibly low overhead. Even with very extensive configurations it's somewhere in the five to twenty percent range, and it will depend a little bit on how many HTTP requests per second you have, for example, or whether you have lots of low-level UDP connections and you're looking at every packet. Lots of things can be configured, but the actual collection layer is incredibly efficient.
C
Usually the overhead, or the bottleneck, is things like JSON encoding, or how much memory the actual Prometheus instance will use, and so on.
C
I mean, we started out with networking only; now we've added this process context as well, and we're expanding. Right now we're essentially very network-heavy, very connectivity-heavy.
F
Okay. And do you support—are you adding all the eBPF bytecode yourself, or is there a modular layer where, if people wanted to add something into the Hubble project, they could do that themselves? Is that something you support, or do they have to contribute it upstream? How does that process work?
C
You would have to change Hubble's—Cilium's—code base to change the eBPF programs. So yes, Cilium loads the eBPF programs; there is no pluggable infrastructure right now where a user could add raw eBPF programs. That could definitely be done and added, but it doesn't exist right now.
A
If I could tack on to that last question, partially as a segue but also because we ask it of a lot of the folks who come to present: could you provide, either in the doc or as a short overview, how someone might engage with the project—Cilium, Hubble, all of it—if they wanted to contribute or had ideas to make it better? Where would they start, and what's the community and governance model like, pragmatically?
C
You'll find pointers to Hubble there. The best community entry point is cilium.io, which will point you to the Slack channel we have—about 10,000 people—and there's a development channel there. All the development and roadmap planning is happening on GitHub and Slack, and you'll find all of those pointers on cilium.io.
A
Awesome, thank you very much—and thanks again for presenting, this is really cool. Are there any other questions before we move on to the next presenter, to Pixie?
A
Okay, well, thank you very much. If there are further questions for Thomas or the Hubble team, feel free to use our Slack channels or mailing list or other means. But thank you again.
D
Hey everyone, I'm Zain—hopefully you can remember me.
D
Okay, sorry about that—oh, you know why: I was probably broadcasting from two devices. Is it any better now?
D
Okay, great. So I wanted to give a quick update on the Pixie project—Matt asked us to do a quick recap—so let me pull up the slides real quick. There are a few slides. For people who are not familiar with Pixie: we are currently a sandbox CNCF project that uses eBPF to enable a bunch of different observability use cases, specifically on Kubernetes.
D
We solve some similar problems, but we have a slightly different approach to it—Thomas and I know each other, and we overlap on a bunch of things. So anyway, without further ado: what is Pixie? Pixie is a developer debugging platform with the goal of providing out-of-the-box visibility for your Kubernetes cluster. Once you get Pixie installed, without having to modify anything in your cluster, you'll get a bunch of different pieces of information.
D
Things like service health, logging, request tracing, Kafka, and a bunch of different things out of the box, and all of this is done pretty transparently using eBPF. You can check out our website, px.dev, for a lot more of the technical details.
D
We were built on three core principles. The first one is that everything is code-driven, on the fly, with no manual instrumentation: Pixie uses PxL, which is basically a pandas-like Python dialect, and it allows you to process and build data pipelines over the data. Second, we split our storage between edge and cloud: the high-level idea is that we collect a ton of data using eBPF and we want to make sure we can store it efficiently.
D
So most of the data is actually stored on the node until we need to pull it out or something. And the third thing is that everything is API-driven, so you can access all of the data available through Pixie via an API to enable other tools—for example, there is a Pixie Grafana plug-in that uses the Pixie API.
D
What's new? I'll kind of walk through the whole thing, because I'm not sure how familiar everyone is, but what's specifically new in the recent releases is that we now have continuous profiling and flame graphs for Java programs, we have support for Node.js and OpenSSL-encrypted requests, Kafka tracing, and a bunch of other minor things. So I'll just go through and do a quick overview demo of Pixie, along with showing off a couple of the new features like Java continuous profiling and Kafka.
D
Over here, this is the main Pixie UI. Once you get into Pixie, you can get a very high-level overview of all the HTTP requests going on in your cluster, along with their throughput, latency, and other information like error rates, broken down by service. There's actually a load generator running; it's making a bunch of requests, which then cause a bunch of downstream requests to happen.
D
This gives you a sense of how Kubernetes is doing at that moment in time. Beyond that, we instrument everything using eBPF, so you don't have to do any additional work, and we can dive into more details over here. For example, I can dive into this online boutique pod, and you can see, specifically for this namespace, px-online-boutique, where the requests are happening, which are the slower calls, and the ranges of requests that we're seeing. If you click into a specific service—let's say I want to see what the checkout service is doing—
D
I can get a more granular overview of the HTTP requests, errors, and latencies, and I can see a sample of all the slow requests. If I click on this, I can see that this is an HTTP/2 request—a gRPC request—and here is the protobuf that was sent in, which caused the request you saw. So you can use that for debugging.
D
If this is Go or Java or C++, and a few other languages, and you click in on the pod, you can actually get pretty detailed information like the flame graph. So if you dig in over here, you can see where in the pod CPU time is being spent.
D
If you're familiar with flame graphs: basically, the wider the box, the more time is being spent there. Honestly, this pod is pretty idle, so you're not seeing a lot of activity, but if it were very active you'd see a bunch of horizontal bars that are very, very wide.
D
The one thing I'm going to point out is that all of this is basically done by a script—you can probably see up here that the scripts change with different arguments. So there's a little Python-like script that goes and grabs this data and captures it, and there's an open repository of scripts that you can submit to on our GitHub, or you can have private scripts if you don't want to share them with others. But as soon as you create a script...
D
One of the quick things I'm going to show is a new feature here, which is the Kafka support. If you're interested in Kafka, you go to the Kafka overview script: we can basically scan the entire cluster across all the namespaces and say, oh look, we see some Kafka traffic.
D
This is kind of a toy application, so you're going to see that there's a producer talking to the order service, which then talks to consumer-shipping and consumer-invoices—you're basically seeing all the Kafka traffic flow through. You can then go into specific information about topics and brokers and the actual data.
D
...that's going through the Kafka cluster. For example, you can see: here is the published topic, here is the destination, and then the actual data that we're seeing.
D
So we capture a fair amount of very detailed information—here's something else being produced—we basically capture all of the Kafka information and put it in here. What else can I show you?
D
Oh, I don't think this one's instrumented yet, but if you have the right topic, you can actually take a look at things like producer and consumer lag. If you enable it for the topics, we'll be able to tell you what the producer-to-consumer lag is between different consumer instances. In terms of Java—that was the last thing I wanted to point out—actually, I think it's in our other demo.
D
So if I pull up a service that's using Java, like the orders service over here—and let me pick the right pod.
D
This is one of the things that actually took a fair amount of work to figure out—you can read all about it on our blog and in our docs—but basically we instrument the JVM in order to be able to capture actual Java flame graphs.
D
Typically, there is a bpftrace script, and you can deploy it across the cluster and then process the output using some PxL. Once I run it over here, what you'll see is that we're now going to deploy the eBPF probe across the cluster, it's going to take a few seconds to capture data, and then you're going to see all the TCP drops that are happening across the entire cluster. So it's pretty easy for us to embed new eBPF scripts and pull the data in.
D
I think that's about all I had. We support other protocols like Postgres and HTTP if anyone is interested in the details, but we went over that earlier. What's coming soon: right now we're pretty actively working on OpenTelemetry export, and we're working on this thing called plugins, because Pixie only does data storage for a short period of time—
D
—usually a few hours. With the Pixie plugin support, we'll basically be able to plug in a different backend like Prometheus or Timescale, or commercial systems, to be able to increase how long we can retain data and also do things like alerting. Then we're adding support for script versioning, and the last item is on the governance side: we're trying to move more and more into public board meetings and a public governance process, hopefully on our way to becoming an incubating project.
A
Yeah, thank you. There are two questions already in the doc that we could start with, but folks, feel free to jump in. I think Ken had the first one.
B
Yeah, I wrote down the question, but it was basically: for the Java flame graphs, are you utilizing the data from JFR to build those flame graphs, or—I think you mentioned—did you actually build a special eBPF process to do that?
D
Yeah, so what we do for Java—right now, for the continuous profiler—is that we basically capture all the stacks.
A
Okay, cool. Is there any sort of RBAC or control over who can log into the UI and run queries, or any kind of RBAC around who can deploy trace eBPF programs? The capability is amazing, but if you're thinking about putting this into a product, into an organization, I'm sure a lot of CIOs would have some heartburn with all of this.
D
So right now we have support for data redaction and what we call high-security mode, which restricts some of the features we're talking about, like dynamic eBPF script injection, so that they only happen with authentication, and which redacts all the data that might have sensitive information in it. In the UI you can go view HTTP requests, right? An HTTP request might have PII data in it, and we have support to basically obfuscate all of that data.
D
We are working on adding support for signed scripts, so that if there's a new script being executed, you can require an admin approval before it gets executed. That way people can't just randomly add in, you know, their own programs or something, and you'll be able to know that a script has been verified and is safe to execute on your system.
D
We plan to add in basic RBAC. We have some RBAC support right now—there's a split between what admins can do and what users can do—and the RBAC support will make that a little bit cleaner. Then, over the next quarters, we'll add table-level RBAC, so for every piece of information we collect, we can restrict who can access it.
D
Then there's column-level RBAC, which will restrict which columns of data people can access, and then there's RBAC by entity, which will probably land next year—hopefully sooner—and that will allow things like: if you're in this namespace, you can access data about this namespace but not others. So that's part of our security roadmap; I shared the link.
D
It's a public document—we still need to do a better job of documenting all this stuff—but we do plan to add in better RBAC support; it's pretty limited right now.
A
Okay, I guess it's quiet. Oh, hello.
A
All right, I guess that's it then. Thanks very much for the overview. I should end with the same question I asked for Hubble: what's the best place to go to engage with the project, if you have ideas on how to improve it or you're interested in just joining the community?
D
And then we have a pretty active blog as well, which has a bunch of information about how a lot of this stuff is implemented under the hood.
A
All right, well then, thank you—thank you both for presenting. It's open floor, or we could return four minutes to everyone's day. Awesome.