From YouTube: Introduction to Tetragon
A: All right, I think it's a good time to start this webinar, this introduction to Cilium Tetragon. Welcome to all of you who have joined. To once more go over housekeeping and logistics: all of you who have joined have been automatically muted to make the experience as interruption-free as possible. If you have questions, feel free to ask them in the Zoom chat as a message to everybody, and we will either answer them on air as we have time or answer them in the chat directly.

So let's jump in. This series will be introducing Tetragon: eBPF-based security observability and runtime enforcement.
So let's jump right in and get a first overview of Tetragon. What is Tetragon? Tetragon is essentially an agent that can run on any machine, any Linux machine. This could be a Kubernetes worker node; it could also be a non-Kubernetes node, essentially any machine, and it will use eBPF to extract security-relevant observability and also provide runtime enforcement.

It covers the virtual file system in terms of file access; TCP layers, for example to introspect TCP sequence numbers and identify sequence-number attacks; as well as the system and process execution layer. But it does not only cover the system level. It also covers applications, so we can also, for example, extract function calls or function traces, look at executed code, and so on. Very important: Tetragon is transparent, which means no code changes are required.
A
All
of
the
observability.
All
of
the
enforcement
capabilities
are
provided
completely
transparently,
the
all
the
observability
data,
all
the
policies
that
come
in
they
are
integrated
with
other
systems,
and
you
can
see
many
of
them
listed
above
metrics,
for
example,
prometheus
grafana
for
a
lot
of
the
security
relevant
events
they
will
typically
go
into
an
siem
can
be
streamed
by
a
fluency
to
auto
systems
as
well
as,
for
example,
grafana
elasticsearch,
as
well
as
open
telemetry
or
the
raw
json
output.
Tetragon is part of the Cilium project family and with that is automatically part of the Cloud Native Computing Foundation. So it is essentially an independent project from a technical perspective, an independent project under the Cilium umbrella, but it benefits from and is governed by the Cilium open source governance model.
Let's jump into why Tetragon. In terms of runtime security and security observability, what is needed, and why we created Tetragon, is that security has to be done in real time. So when we protect workloads that are running, we need to be able to detect malicious activity in real time. We need to report when malicious events occur and then, even better, prevent them before they perform any damage, and we'll look at a variety of examples of how that can be achieved.
This can be done, or has been done, in a variety of different ways in the past, so next we'll cover why we have created Tetragon by looking at existing solutions and existing approaches and then comparing them to Tetragon. This includes LD_PRELOAD, ptrace, seccomp, LSM and LSM BPF, as well as other eBPF approaches to perform this type of security.
This is probably the oldest, or one of the oldest, approaches: LD_PRELOAD, the ability to load a library into an application without the awareness of, or without changing, that application. With LD_PRELOAD we can essentially load a library that will inject itself into the application and have all the system calls that the application performs be handled by that library instead of by the kernel itself. This is called a syscall proxy or LD_PRELOAD proxy. This is great, but it can be bypassed.
Obviously, if the binary of the application is statically linked, LD_PRELOAD will have no effect: we lose all visibility, and any enforcement done there is ineffective. So this approach was essentially abandoned quickly. From that perspective, we can instead do system call checking when system calls enter the kernel, at the syscall entry. Examples of this are ptrace, seccomp, as well as eBPF kprobes or syscall-entry-based eBPF checks.
This is already massively better than LD_PRELOAD, because the application cannot easily bypass the injection, but it is vulnerable to the so-called TOCTTOU attack: time of check versus time of use. This means that the hook point, where the eBPF program or the solution sees the system call, is before the last moment at which the application can change the system call arguments, and you can see this in the picture here.
A
Essentially,
the
hook
is
at
the
entry,
but
the
system
called
handling
copying
the
essentially
the
memory
that
contains
the
system
called
arguments
is
after
this
entry
point,
which
means
the
application
could
actually
create
a
system
called
present
arguments
in
the
system
call
such
as
I
want
to
open
this
file
and
then
the
hook
point
runs
it
validates
and
afterwards
the
application
could
still
change
what
file
it
wants
to
open.
Some of you may have heard of LSM, or Linux Security Modules. This is a relatively old API itself. It allows doing Linux security checks or additional security enforcement at the right level, and it is a stable interface and a very safe place to make checks. But it is very static, and it essentially requires additional kernel modules to be loaded as additional LSM probes. Better known, or better suited, is actually eBPF LSM, or BPF LSM, which allows using eBPF to make LSM dynamic.
This is already a major step forward and actually pretty close to what we want. The problem with this is that it needs kernel version 5.7, and it is limited to the hook points that LSM itself provides. So if any additional hook points are needed, we again need to change kernel code, and the kernel requirement goes up even further.
This is essentially why we have created Cilium Tetragon. We want the same properties in terms of safety, security, and hook points as eBPF LSM, but we want to avoid the recent-kernel requirement, and we want to add additional hook points that are not found in LSM, as well as have the flexibility to have multiple eBPF programs share state with each other using maps. This is the silo or database item that you can see on the right here.
So let's jump right into observability: what is the type of observability that Tetragon can provide? And I see the first question that came in as well: what will be one of the main differences between using Tetragon and the Datadog agent running runtime security features? We can actually address that right away as we go through the observability page here. The basis of Tetragon is this agent, which uses in-kernel, eBPF-based collectors to collect a variety of different observability data types:
Process execution, system call activity, file access, TCP metadata, namespacing information, capability changes, privilege changes, data access on the storage and file access side, and a lot of different network activity visibility functions, including raw layer 3 and layer 4 as well as different protocols, thanks to eBPF's smart collector capability. eBPF has specialized map types and functions such as stack traces, ring buffers, metrics, and hash maps, so this can be done very, very effectively, and we can combine this deep visibility so you can see across the stack.
A
We
can
extract
visibility
from
lower
levels,
network
storage,
all
the
way
up
into
the
application.
We
can
combine
this
deep
visibility
with
the
transparency,
so
it's
our
app
agnostic
and
no
changes
to
the
applications
needed
so
far.
This
is
in
line
with
other
collectors
as
well.
Many
of
them
also
have
pretty
deep
visibility
where
it
where
it
becomes
unique
and
different
is
the
low
overhead.
A
You
can
see
this
smart
collector
item
in
the
kernel
portion
of
the
tetragon
piece
there
on
the
left.
All
of
the
filtering.
The
aggregation
is
done
in
kernel,
which
means
we
can
massively
reduce
the
amount
of
data,
the
amount
of
observability
data
that
is
sent
from
kernel
so
from
the
kernel
runtime
to
the
tetragon
agent.
This
is
the
arrow
in
between
kernel
and
the
bigger
being
on
top,
and
that
is
typically
the
biggest
overhand.
A
So
if
we
send
a
lot
of
observability
data
from
the
kernel
into
the
agent
in
user
space
that
will
impose
a
lot
of
foreign,
so
the
more
filtering
the
more
aggregation
we
can
do
in
kernel.
The
lord
overhead
make
a
concrete
example.
It
is
massively
more
efficient
to,
for
example,
collect
metrics
such
as
a
rate
or
a
histogram
in
kernel
compared
to
sending
individual
events
to
the
user
space
and
accounting
for
the
metric
in
user
space.
A
So
that's
the
main
difference
to
existing
or
other
collectors.
Ebpf
gives
this
like
foundation
this
framework
to
to
provide
massively
low
overhead
observability
very
similar
to
how
perth
some
of
you
might
heard
have
heard
of
the
perth
performance,
troubleshooting
and
perf
trouble
or
tracing
you
tracing
utility
that
also
uses
the
same
mechanisms
to
provide
high
performance
visibility
more
into
the
the
function,
call
and
memory
and
cpu
consumption
or
memory
and
cpu
usage
aspects.
A
Lastly,
integrations
all
of
this
visibility
is
useless
if
we
cannot
integrate
this
into
existing
systems.
What
we
currently
support
is
prometheus
profana,
a
variety
of
sims
or
sims
flu
and
d,
open
telemetry
as
well
as
elasticsearch,
but
with
the
json
export
and
in
particular,
with
the
prometheus
capabilities.
This
can,
for
example,
also
go
into
a
datadock
dashboarding
or
into
a
variety
of
monitoring
platforms
that
cloud
providers
offer.
A
First
of
all,
context
is
everything
in
terms
of
security
right,
so
we
need
to
understand
as
much
content
as
possible
and
we'll
see
that
as
we
go
into
the
examples
next,
because
the
better
the
context,
the
easier
it
will
be
to
understand
for
security
teams
in
log
files
and
the
more
accurate
the
alerts
will
be.
This
means
that,
based
on
logs
and
alerts,
we
can
quicker
and
easier
identify.
What
is
the
cause
and
what
is
affected?
So,
let's
look
at
a
couple
of
examples
and
we'll
start
very,
very
basic
and
then
go
go
further.
A
I'm
starting
with
very
basic
network
interface
metrics
like
how
much
traffic
on
what
network
interface
right,
boring.
But,
yes,
natural
con
can
do
this
as
well.
Let's
go
further
and
let's
look
at,
for
example,
tcp
latency.
This
is
already
a
lot
more
interesting,
transparently,
measuring
the
round
trip
time
for
tcp
connections
combined
with
dns
visibility.
So
we
can
see
the
round
trip
time
over
time
to
a
variety
of
external
dns,
endpoints
or
external
endpoints,
and
essentially
labeled
by
the
dns
name.
That
was
used.
A
So
we
can
see
the
latency
to
stats.profile.org
api.twitter.com,
a
variety
of
aws
endpoints,
and
so
on
already
pretty
interesting
and
all
of
things.
This
is
done
completely
transparently,
so
you
can
identify
what
connections,
what
endpoints
are
subject
to,
for
example,
higher
round-trip
latency,
but
then
also
traffic
accounting.
In this example, a dashboard shows which Kubernetes pod is egressing or transmitting how much traffic. This is in this case on a pod level, so just at the pod-name level, but this could also be annotated with the label that represents the namespace, the region, or the availability zone, so you can easily measure cross-region or cross-AZ traffic with this as well.
A
In
another
prometheus
metric
example,
we
can
look
at
tls
and
ssl
two
examples
here,
for
example,
matching
or
extracting
the
sni
name.
So
what
are
the
the
different
sni
domain
names
or
host
names?
That
connections
use,
so
we
can
easily
see
what
our
apps
or
what
host
names
or
our
our
apps
reaching
out
to
as
well
as
tls,
handshake,
so
understanding
what
connection
or
which
type
of
endpoint
network
endpoint
is
receiving
tls
handshakes.
We
could
just
annotate
this
further
with,
for
example,
tls
version
or
cipher,
we'll
see
examples
of
that.
Next, as I mentioned, all of this observability can go into a SIEM such as Elasticsearch or Splunk or something else, and then you can query it. This is an example query to detect weak or vulnerable TLS versions. We can see that we are querying all events where we have TLS information implying TLS version 1.0 or 1.1, and then also showing things like the process name, the namespace, the pod name, the SNI, the port, the IPs, the start time, the PID, and so on.
So we can get rich context while we detect weak or vulnerable use of TLS. Diving deeper into the networking side, this is an example of the networking-related events when a connection happens, so we can observe everything from DNS to HTTP and TCP. If you go from the top to the bottom, you see at the very beginning that a process is started: curl, essentially invoked with the argument cilium.io. We can then see the DNS resolution.
In this case this is a Kubernetes pod, so it will attempt to resolve a variety of different Kubernetes service names, essentially expanding the name into what could be a Kubernetes service name. These all fail, so the name does not resolve until we actually go and resolve cilium.io and see the IP returned. Then we see the connect system call, we see that this is a TCP connection, and we see HTTP here.
We can see that cilium.io actually returns an HTTP 301 to essentially redirect us to the HTTPS version, and then we see that a socket gets opened. We see the amount of traffic that was caused on that socket, both on receive and transmit. So you see a variety of different observability data here, from process execution to the DNS layer to the connect system call itself, all the way into HTTP traffic parsing. But then we can go further into the security side, for example auditing.
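To make the connect-level tracing concrete, a policy along these lines hooks the kernel's tcp_connect function. This is a minimal sketch, with the policy name chosen here for illustration and modeled on the kprobe-based tracing policies discussed in this webinar:

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: observe-tcp-connect    # illustrative name
spec:
  kprobes:
  - call: "tcp_connect"        # in-kernel TCP connect path, not a syscall
    syscall: false
    args:
    - index: 0
      type: "sock"             # decoded into source/destination address and port
```

Once loaded (for example with kubectl apply on Kubernetes), every outgoing TCP connection surfaces as an event carrying the same process and pod context shown above.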
What are all the ports that applications are listening on? We can query our entire database: tell me all the pods that are listening on particular ports. You can see the result at the bottom: we see the pods, with all their labels, that are listening on, for example, port 9080 or port 5333.
A
We
see
the
actual
binary,
so
we
see
that
in
one
case
this
is
netcat
essentially
listening
on
port
five,
three
three,
two
three.
In
other
case,
this
is
a
python
application.
We
can
also
see
who
has
been
invoking
this,
so
we
see
that
in
one
case
this
was
directly
spawned
from
a
shell.
In
other
case,
it
was
container
d
shim
we
can
detect
dns
bypass
attempt,
so
let's
say
a
pod.
Instead
of
talking
to
cube
dns
or
kubernetes
dns
attempts
to
directly
talk
to
an
external
or
outside
dns
server,
we
can.
We can easily identify such network flows and query them. In this example, we see that there was a workload with a set of labels, running in the tenant-jobs namespace, that attempted to directly talk to an egress DNS server, bypassing or attempting to bypass kube-dns, the Kubernetes DNS server. Going further, we can detect, for example, nmap (network mapper) scans, in this case filtering for a specific value in the user-agent field of an HTTP scanner. We can see not only when that scan occurred.
We see what the user agent was, but we also see what the process name was, what the HTTP parameters were, what the time was, and so on. So we have full context into when a particular HTTP nmap scan happened or occurred. Of course, moving away a little bit from the networking side, Tetragon can also do raw system call and process execution visibility.
This case is showing the raw JSON output. On the right, you can see a tracing policy, and this tracing policy essentially indicates that I want to observe all mount system calls; it also shows what type of arguments we are interested in. On the left, you can see a small subset of the full context that we can provide: obviously the process itself with the binary, the current working directory, the UID, the PID, the start time, and the pod label information.
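A tracing policy of the shape described here could look roughly like the following sketch; the name and the exact argument indices and types are assumptions on my part, following the mount(2) signature:

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: observe-mount          # illustrative name
spec:
  kprobes:
  - call: "sys_mount"          # hook the mount(2) system call
    syscall: true
    args:
    - index: 0                 # source device being mounted
      type: "string"
    - index: 1                 # target mount point
      type: "string"
    - index: 2                 # filesystem type, e.g. ext4
      type: "string"
```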
So which pod name and which namespace, the pod labels, but then also all the way into the container image: the container ID, the image, the Docker ID, as well as the entire process ancestry. This is just a very small subset of the full context that we can provide; every event contains a massive amount of context that goes along with it.
This is an example of the UI version of this. It shows, in this case, a cluster (a minikube cluster), a namespace (tenant-jobs), and a pod (crawler), and you can see the entire process ancestry tree, not only of the container itself but also of the Kubernetes control plane, including the kubelet. You can see the process executions and which process makes, attempts, or has established what network connections.
A
Those
are
the
network
connections,
so
we
can
see
that
there
is
a
variety
of,
in
this
case
a
node
app
invoking
server.js
is
reaching
out
to
an
external
ip
to
elasticsearch
and
to
api
or
twitter,
and
we
also
see
that
there
is
a
reverse
shell
that
has
been
invoked
by
a
net
cap.
This
is
the
the
line
at
the
bottom,
which
is
reaching
out
to
another
domain
like
this
blubberish,
not
a
reverse
shell,
and
we
can
see
which
individual
process
made.
This
request
from
a
networking
perspective
would
be
very
hard
to
spot.
This is showing a Kubernetes-specific example, but this functionality is actually not Kubernetes-specific in any way; this works for any process running on a Linux machine. Detecting late process execution: it is actually very common that you will have containers or workloads that run a single binary, and you want to identify which containers or workloads have had a process or a binary executed some time after the container was started. This can often reveal a compromised pod or container, because this is not what the application typically does.
A
Let's
say
you
have
like
a
single
statically,
combined,
binary
running
ass
application.
You
can
easily
rule
out
that
this
container
will
never
start
a
process
or
a
binary
like
10
seconds
or
one
minute
after
the
container
has
been
started,
so
you
can
easily
identify
hey,
let
me
know
which
containers
have
had
processes
or
binaries
started,
one
minute
or
30
seconds
after
the
container
itself
was
started.
A
This
is
very
likely
actually
reveals
a
compromised
container
or
pod,
or
some
other
malicious
intent,
monitoring
file
access
so
going
down
or
go
moving
over
to
the
storage
site.
This
is
showing
a
splunk
integration
here
that
shows
which
part
which
container,
which
workload
is
accessing
certain
files.
In
this
case,
we
are
we're
monitoring
in
couple
files,
so
such
as
etsy
password
ash
history,
shadow
file
and
we
can
see
which
part
but
also
which
process
is
accessing
what
file
and
what
is
the
file
operation?
What
is
the
operation?
They
are
performing.
A
That's
just
the
monitoring
side
of
things.
Then
we
can
go
further
and
look
at,
for
example,
network
policy,
compliance
and
look
for
what
are
what
connections
have
been
subject
to
what
policies.
So
we
can
look
at
all
the
the
allowed
connections
and
identify
what
was
the
policy
that
was
used
to
allow
this
traffic
and,
even
more
importantly,
we
can
identify
what
was
allowed
without
any
policy
at
all,
for
example.
So
we
can
clearly
we
can
clearly
validate
and
audit
whether
we
are
achieving
from
a
policy
perspective
what
we
intended
you
can
observe
http
and
grpc.
A
This
is
showing
example,
where
we
show
and
detect
cross-scripting
attempts
in
the
uri,
essentially
querying
in
this
case
splunk
with
particular
search
query
that
will
request
or
will
show
http
flows
with
just
with
the
name
script
in
the
uri.
In
this
case,
it
just
surfaced
a
simple
cross,
scripting
attempt
here
now
switching
gears
a
bit
and
go
into
the
enforcement
side.
So
we've
seen
the
full
width
of
observability
that
we
can
provide,
like
from
network
to
file
to
system
call.
A
We
can
do
enforcement
on
a
vast
majority
of
this
observability,
but
before
we
go
into
complete
examples,
a
couple
of
high-level
points
on
how
this
enforcement
works.
First
of
all,
it
is
preventive
security
to
them
to
the
that's
like
the
cornerstone,
the
the
most
important
aspect
of
tetragon,
so
essentially
preventing
malicious
in
malicious
actions
or
malicious
attempts
before
they
can
do
damage
to
the
system
or
to
application.
This
includes
the
system,
but
also
the
network,
the
file
system,
as
well
as
application
behavior.
A
It
is
synchronous
and
we'll
get
to
that.
So
it
is
essentially
doing
this
in
kernel.
In
terms
of
policy,
we
have
a
couple
of
integrations.
You
can
define
policies
with
kubernetes
crds.
There
is
a
json
api
as
well,
or
a
json
configuration
method
as
well
as
open
policy
agent
that
can
be
used
and
we're
looking
to
convert
or
looking
to
support
converting
from
existing
rule
sets.
Such
as
falco
rule
sets
or
pod
security
policies
as
well.
A
So
if
there
are
other
forms
of
intent
where
you
essentially
already
define
what
your
application
should
be
able
to
do
or
not,
we
will
look
at
supporting
them
in
terms
of
preventive
actions
from
user
space.
This
is
what
we
are
trying
to
avoid,
or
this
is
what
what
tachogon
is
not
vulnerable
to,
which
means
that
typically
systems
that
rely
on
a
observability
with
a
user
space
rule
engine
are
essentially
vulnerable
to
the
following.
A
The
part
or
application,
or
the
process
is
compromised
or
has
malicious
intent
and
performs
either
an
exploit
or
a
malicious
attempt
in
the
kernel
and
changes
behavior
or
attempts
something
maliciously
the
observability
piece
in
the
kernel.
Let's
say
it's
k-pro
based
or
it's
second
based
will
export
this
visibility
with
a
asynchronous
notification
to
the
user
space
agent
running
running
there,
and
you
have
a
rule
engine
there.
This
rule
engine
will
consume
this
observability
and
will
detect
that.
A
Oh,
this
observability
indicates
that
something
bad
is
going
on
and
will
then
kill
the
container
or
kill
the
process.
This
is
asynchronously,
so
it
happens
essentially
after
the
malicious
attempt
has
already
been
performed.
So
while
it
is
strictly
better
than
not
doing
anything,
it
can
often
already
be
too
late
in
terms
of
preventive
action.
What
tetragon
does
instead
is
doing
this
filtering
and
this
rule
engine
in
the
kernel.
So
I
think
one
of
the
question
was:
how
does
this
compare
to
falco?
A
This
is
one
of
the
big
differences
that,
instead
of
using
evpf,
primarily
from
a
visibility,
extraction,
perspective
tetragon,
does
the
filtering
and
the
rule
engine
part
in
kernel,
which
means
that,
as
it
processes
the
observability
data
in
the
kernel,
it
can
immediately
kill
the
process
it
can
be,
and
you
can
even
prevent
the
activity
itself.
So
let's
say
we
have
a
system
call
that
should
not
be
allowed.
We
will
not
allow
that
system
call
to
be
executed
at
all.
We
will
not
report
that
the
system
call
happened
and
then
kill
the
process
in
hindsight.
A
Looking
at
a
couple
of
examples
here,
this
is
an
example
how
we
can
prevent
access
to
a
sensitive
file,
for
example,
in
this
case
edc
shadow.
So
we
have
a
policy
that
essentially
the
policy
is
not
matching.
The
examples,
no
worries,
so
the
example
is
showing
how
this
this
is
done
for
to
protect
authorized
keys
file
for
ssh.
The
example
is
showing
this
for
egc
shadow
with
very
similar
use
case.
A
We
want
to
prevent
write
access
to
a
particular
file
and
tetragon
will
immediately
kill
the
process
that
attempts
to
write
to
that
file,
but
obviously
all
to
be
doing
so
just
to
open
the
file
or
read
from
the
file
and
so
on.
This
is
an
example
where
we
want
to
allow
reading
from
a
file,
but
immediately
prevent
any
process
or
a
particular
process,
or
a
particular
parts,
to
write
to
a
particular
set
of
files.
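To give a feel for the shape of such an enforcement policy, here is a minimal sketch that kills any process opening /etc/shadow. It hooks the fd_install kernel function and does not distinguish reads from writes (the demo policy shown in the webinar is more selective than this); the policy name and hook choice are my assumptions:

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: protect-etc-shadow     # illustrative name
spec:
  kprobes:
  - call: "fd_install"         # runs whenever a file descriptor is installed
    syscall: false
    args:
    - index: 0
      type: "int"              # the new file descriptor number
    - index: 1
      type: "file"             # the file object, matched by path below
    selectors:
    - matchArgs:
      - index: 1
        operator: "Equal"
        values:
        - "/etc/shadow"
      matchActions:
      - action: Sigkill        # synchronously kill the offending process
```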
We can also do things like detecting remounting of the root file system; this is an example of how that can be done using the pivot_root system call. Let me check the question in the chat: what are the available actions other than SIGKILL, if any? Obviously, there is an action that provides visibility itself, there is an action to SIGKILL, and for some of the hook points you can essentially prevent the operation or change the return code.
A
So,
for
example,
for
a
system
call
you
can,
you
can
have
the
action
say,
don't
execute
the
system,
call
and
return
with
an
error
instead.
So
essentially,
when
you
are,
or
when
you're
operating
at
a
hook
point
where
you
can
change
the
verdict,
then
obviously
you
want
to
prevent
that
photo
processing
and
just
return.
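Expressed as policy, that verdict change is an action on a matching hook. A minimal sketch, taking the mount(2) syscall as an illustrative target:

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: deny-mount             # illustrative name
spec:
  kprobes:
  - call: "sys_mount"          # a hook point where the verdict can be changed
    syscall: true
    selectors:
    - matchActions:
      - action: Override       # do not execute the syscall at all
        argError: -1           # return -EPERM to the caller instead
```

The key difference from the SIGKILL action is that the caller simply sees a failed system call; nothing is executed first and killed afterwards.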
A
Monitoring
and
preventing
capabilities
abuse,
so
this
example
is
showing
when
the
monitoring
of
capabilities
is
enabled
and
we're
seeing
here,
process
execution
that
shows
a
pod
test.
Pod
actually
using
ns,
enter
to
essentially
change
the
the
mount,
the
pid,
the
network,
the
uts
and
the
ips
name
space,
and
it
can
do
so
because
it
has
capsis
admin,
privileges
or
capabilities.
So
it
is
succeeding.
So
we
normally
see
that
the
the
ns
enter
process
or
command
is
executed.
We
can
also
see
that
it
performs
set
namespace
functionality
to
change
or
adjust
namespacing
context.
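Capability context can also be used as a match condition in policies. As a rough sketch (the matchCapabilities selector shape here is an assumption on my part, based on the capability filtering described in this webinar), a rule could be restricted to processes whose effective capability set includes CAP_SYS_ADMIN:

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: watch-setns            # illustrative name
spec:
  kprobes:
  - call: "sys_setns"          # the syscall nsenter uses to switch namespaces
    syscall: true
    selectors:
    - matchCapabilities:
      - type: Effective        # check the effective capability set
        operator: In
        values:
        - "CAP_SYS_ADMIN"
```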
This is subject to the point that anybody with CAP_SYS_ADMIN can automatically access any file, for example; Tetragon is independent from that perspective. This shows both the prevention of file access again, and also the ability to monitor capability changes and the capability context of any system call and any runtime behavior observed.
A
I
think
john
already
answered
this
question
so
we'll
move
on
and
actually
start
summarizing
a
little
bit.
So
we've
seen
a
variety
of
things.
At
this
point,
we've
seen
tetragon
be
able
to
provide
both
observability
across
the
stack.
So
we've
seen
file
access,
we've
seen
data
access,
we've
seen
a
variety
of
network
behavior,
both
from
a
connectivity
perspective,
protocol,
parsing,
http,
dns,
tls,
you've
seen
capabilities
tracing.
So
what
are
the
capabilities?
Is
capsis
admin?
Is
it
capnet
admin?
Is
it
ppf
so
seeing?
A
What
are
the
capabilities
of
a
particular
system
call
or
process
execution
or
some
other
criminal
activity,
as
well
as
privilege,
escalation,
so
being
able
to
understand
the
privilege
that
a
particular
system
call
is
subject
to
or
is
is
equipped
with
the
file
access?
You
know:
we've
seen
the
tcp
visibility
with
the
round
trip,
time,
visibility
and
one
of
the
initial
slides,
as
well
as
the
raw
system,
call
visibility.
What
are
the
system
calls
being
made
as
well
as
the
process
execution,
including
the
process
ancestor
tree,
so
understanding?
A
Not
only
what
is
my
process
but
who
has
spawned
me
and
who
has
spawned
the
process
that
spawned
that
process
and
so
on?
We've
seen
examples
of
prometheus
metrics
we've
seen
the
example
of
grafana
dashboards,
we've
seen
in
particular
the
splunk
integration
we've
seen
the
json
output
that
can
be
fed
with
fluency
into
any
system.
You
want,
for
example,
for
example,
into
an
into
an
elasticsearch
cluster.
The
tracing
in
the
metrics
can
also
be
exported
using
open
telemetry.
If
there
is
desire
all
of
cilium.
A
Before
we
do
that,
let's
jump
back
and
answer
this
question:
are
there
any
plans
in
maintaining
rule
sets,
or
is
this
already
part
of
tetragon?
So
yes,
actually,
let's
go
through
this
slide
because
it
mentions
this,
so
there
is
essentially
tetragon
that
is
available
in
the
in
the
and
the
cilium
tetragon
repository.
A
What
is
part
of
the
open
source
repository
is
the
following
from
a
visibility
perspective.
The
process
and
system
called
visibility
that
we've
seen
all
the
layers
through
there
for
network
visibility
and
file
access
monitoring,
as
well
as
basic
capabilities
and
name
spacing
visibility
and
on
the
enforcement
we
can
do.
The
system
call
based
enforcement
based
on
k,
probes
and
trace
points
in
addition
to
that,
isovenant
offers
a
tetragon
enterprise
distribution.
A
First
of
all,
it's
it's
a
hardened
enterprise
distribution
of
tetragon,
so
it
has,
for
example,
extended
end-of-life
support.
We
of
course
offer
enterprise
support
for
tetragon
as
well,
but
then,
in
addition
to
that,
it
has
advanced
capabilities,
including
extended
network
visibility.
This,
for
example,
includes
the
round
trip
time
or
the
latency
measurement
on
the
tcp
side,
as
well
as
the
dns
visibility,
the
hdp
and
https
visibility
with
ktls,
as
well
as
all
of
the
tls
visibility
that
we've
seen.
A
It
features
the
siem,
the
siem
integration
directly
with
splunk
splunk
caps,
the
process
and
street
three
information,
so
understanding
the
full
context
of
who
has
spawned
whom,
as
well
as
high
performance
protocol
parsers
and
extended
aggregation
and
filtering
logic
on
the
file
access
side.
While
the
open
source
version
features
file,
access,
monitoring,
the
enterprise
version
can
also
do
file
integrity,
monitoring
with
documents,
shaw,
256
as
well,
on
the
runtime
or
on
the
enforcement
side.
A
The
enterprise
version
features
extended
runtime
enforcement
capabilities
that
are
more
automated,
so
the
system
call
based
enforcement
in
the
open
source
version,
cid
based
or
json-based
can
enforce
rules
as
written.
The
advanced
enterprise
edition
has
additional
automation
around
kubernetes
and
it
has
a
baseline
policy
set
which
can
do
threat
detection
for
known
threats,
as
well
as
simplify
the
installation
of
enforcement
rules
as
well.
A
We
have
already
a
couple
of
covered
a
couple
of
questions,
but
I
see
more
questions
coming
in.
Can
you
run
tetragon
on
a
cluster
mesh
deployment
and
would
applying
tetragon
policy
follow
the
same
principle
of
cmp,
where
you
need
to
apply
to
each
cluster
manually
versus
a
single
touch
point?
A
Yes,
you
can
apply
tetragona,
you
can
run
tetragon
in
a
cluster
mesh
deployment.
Tetragon
can
be
deployed,
it
can
be
deployed
independently
of
psyllium.
It
does
not
require
psyllium
to
run
if
selim
is
running.
Tetragon
will
extract
additional
visibility
from
cilium
itself,
so
it
will
benefit
from
a
silver
installation,
but
it
is
not
required
to
be
there
in
terms
of
policy.
The
policies
work
exactly
the
same
as
the
cilium
network
policy
in
a
cluster
mesh
context,
so
you
will
have
to
install
them
or
load
them
into
individual
clusters.
B: There was also a question about whether we use tracepoints versus kprobes, because our examples are kprobes. The Tetragon base also knows how to do tracepoints, so if you want, you can do tracepoints, but the problem with tracepoints is that they need to be at the syscall level or in specific spots already in the kernel. So we do use them in some places, and Tetragon will try to smartly use them in the right places and use kprobes where it can.

Our policies might say kprobes, but under the covers the Tetragon agent is trying to find the best mechanism that your kernel can support to do the filters, so on newer kernels you'll even get some of the fancier mechanisms.
A: Awesome, great. Before we go to the Q&A: if you want to learn more, Tetragon is covered in the Security Observability with eBPF booklet, or report, that we have done with O'Reilly. There's actually a bigger book coming on eBPF, but for now you can freely download this Security Observability with eBPF report, which gives an introduction to Tetragon and some of the background on why we created Tetragon.
Tetragon is also featured in the enterprise hands-on labs, which give you a way to try out Tetragon and, with Instruqt, actually get your hands dirty with Tetragon without having to install it yourself. It essentially sets up a sandbox environment for you, so you can try out Tetragon and play around with it. You can also attend the virtual summer school that starts July 19; it's an entire day focused on Tetragon, service mesh, and a variety of other topics.
The link is in the slides, and we'll make it available to all attendees afterwards as well; if you are interested, you can sign up. I see a couple of questions are already coming in. If anybody has questions, feel free to ask them in the chat and we'll be happy to answer. I see Cornelia is also posting all the links, that's great. A comment from Matthias: a pack of easily installable security rules is missing, like in Falco. Yes, in fact, we don't necessarily want to recreate everything.
A
So,
as
mentioned,
we
are
currently
in
implementing
a
file
called
rules.
Translators
you
can
actually
bring
your
falco
rule
sets
whether
this
is
the
existing
falco
rule
set.
That
is
in
the
repository.
You
may
have
your
own
and
enforce
them
with
tetragon
the
benefit
there
is
that
you
will
essentially
benefit
from
the
real-time
enforcement
behavior
of
touchagon,
so
you
can
enforce
income
with
in-kernel
capabilities
instead
of
going
to
user
space.
A
A
Also,
john,
if
you
want
to
add
anything
to
to
any
of
the
points,
feel
free
to
do
so
as
well.
B: ...you know, further reducing the overhead of some of these hooks. But most of the Tetragon hooks are out of the hot path, which is sort of the advantage of doing these networking hooks versus inline methods, like if you were to think of sFlow or NetFlow, where you grab every packet and then try to analyze the data. Tetragon works at the socket layer and in the kernel, so a lot of these things are not hot-path items.
A: Yep. I also see another question that came in that's not strictly Tetragon-related, but we can of course answer that as well: a reaction to the blog post from Buoyant, or Linkerd, on sidecar proxies in service mesh. For those without the context: we released a service mesh as part of Cilium at a beta level last December, and we are marking the Cilium service mesh GA with 1.12, coming out in about two to three weeks. The big difference of the Cilium service mesh is that, in addition to the existing Istio integration, it offers a sidecar-free version of a service mesh. It allows running a per-node proxy, allows some of the service mesh functionality to be done entirely in eBPF without a sidecar, or allows running the proxy at a different granularity, for example per namespace or per service account. And there is debate going on about whether this is the right model, or what the right model is.
What is the better model? This blog post pointed out several questions, or several aspects. I think there is a lot of good content in that blog post, some of which I don't necessarily agree with. From a multi-tenancy perspective, Cilium has been running in a per-node proxy configuration for years, very successfully, in very large deployments. I think the claim that this is a lot of hard work and impossible is a bit weird, because we have been running in that configuration for years successfully. And I think there is another angle which is very, very interesting.
A
The
the
claim
or
the
the
aspect
that
a
per
node
proxy
is
dangerous
from
a
perspective
of
having
a
single
proxy
share,
multiple
secrets,
which
is
actually
something
that
we
agree
with,
but
we've
found
what
we
believe
is
a
better
solution
which
is
to
actually
extract
the
mtls
portion
outside
of
the
datapath
proxy
entirely
and
make
it
separate.
There
is
a
blog
post
on
this
that
has
been
released
and
we
can
link
to
it,
which
essentially
shows
a
model
where
the
mutual
authentication
is
done
with
a
separate
user
space
agent.
A
So
we
see
that
as
a
very
ideal
solution
in
terms
of
security
from
a
mutual
authentication
perspective,
so
I
think
that's
also
not
necessarily
a
valid
argument
against
the
sidecar
free
model.
That
said,
we
are
not
in
a
position
where
we're
saying
nobody
should
be
running
cycle
free
proxies
at
all.
A
In
fact,
we've
done
the
istio
integration
first
and
have
been
running
that
for
years
with
users,
so
that's
still
kind
of
the
first
implementation
we
have
done
and
then
based
on
a
lot
of
user
feedback
who
has
or
have
requested,
can
you
find
a
way
to
run
or
provide
service
mesh
functionality
without
a
site
called
implement
this
additional
way
of
running
service
mesh?
I
hope
this
was
a
kind
of
sufficient
answer
to
this.
I
don't
think
there
is
necessarily
a
right
or
wrong.
A
We
are
trying
to
operate
as
much
as
possible
on
user
feedback
and
implement
and
provide
what
our
users
are
asking
us
for
we're
not
preventing
or
trying
to
prevent
anybody
from
running
a
sidecar.
If
that's
the
model,
they
would
like
to
run.
A
Another
question
is
prometheus
exporting
supported
in
the
open
source
flavor.
Yes,
it
is
so
the
metrics
and
premium
export
is
supported.
The
enterprise
version
does
have
additional
visibility
as
as
laid
out
here
in
terms
of
dns
http
https
tls.
The
process
ancestor
tree
as
well
as
some
of
the
high
performance
protocol
parsers
and
some
of
the
network
visibility
is
extended,
but
the
metrics
themselves.
The
metric
export,
is
all
in
open
source.
A
Don't
know
if
this
is
the
best
place
to
ask,
but
I
would
like
to
know
where
does
tetragon
write
its
events
to
I'm
using
export
finally
modify
and
then
reading
it
with
tail?
Follow
this
works,
but
I'm
sure
there
is
a
better
way.
I'm
running
tetragon
natively
on
kubernetes.
Maybe
sean.
Can
you
answer
that
question
briefly?
Yeah.
B: Yeah, so: Fluentd, and then export this into their SIEM, whatever that happens to be. You can also use Fluentd just to aggregate the logs and dump them somewhere else, which is what I do a lot of times in development: if you just have a lot of nodes, you want to see all the logs aggregated. So those are sort of the common, I'd say, use cases that are in production. There's also a gRPC endpoint you can attach to.
A: Another question: is there any sort of admin UI, maybe integrated with Hubble? There is an integration with Hubble UI, which we have not released yet; we will release it soon. It is an integration where the visibility from a runtime perspective will essentially be visualized in Hubble UI, as part of the existing Hubble UI.

All of these events can also be fed into Timescape, which is part of our enterprise offering. It's a time-series database where you can essentially collect all of this observability, store it persistently, and then query it, and again run Hubble UI on top of that. The time-series database actually offers the plain Hubble API, so you can run the Hubble observe CLI, the API, the Hubble UI, and all the Hubble tooling.
A
On
top
of
the
timescape
time
series
database
again,
we
will
be
looking
into
policy
into
runtime
policy
management
as
well
from
a
centralized
perspective.
Right
now
we
have
automation
with
a
variety
of
automation,
tools
like
cf
engine,
puppet
ansible
and
so
on,
but
we
will
be
looking
at
providing
something
similar,
as
we
have
done
with
the
network
policy
editor
for
the
runtime
site
as
well.
Now I see two questions in the Q&A section. What's the expected resource usage of Tetragon per node; it runs as a DaemonSet, right? Yes, it runs as a DaemonSet, so there's an agent running on each node. The overhead will very much depend on the tracing policies that you load and the aggregation that you configure.

If we go back here: because of the flexibility of eBPF, we can do a lot of aggregation in the kernel, so depending on whether you want to see every single system call that is being made, or whether you want to see, for example, only namespace changes or only access to certain files, the overhead will differ. It can be anywhere from one percent to 25 percent,
I would say. It really depends on how much you want to see, at what granularity you want to aggregate, and whether you want to see only certain sensitive events or have a full system call log. What's the pricing and licensing model of Tetragon Enterprise? Tetragon Enterprise is part of Cilium Enterprise from Isovalent, and we will embed it into the price; it's very similar to Cilium Enterprise. It's a per-node subscription at the base, with a scale discount as your infrastructure grows.
A
Of
course,
as
I
mentioned,
you
can
run
tetragon
completely
independently
of
sodium,
so
you
can,
of
course,
also
purchase
tetragon
enterprise
separately.
If
you
run
both,
we
will
of
course
give
you
a
discount.
But I think we have covered all the questions that were posted, so let me maybe repeat the follow-ups here again. The eBPF report, the booklet: a great way to get involved and read more about Tetragon. The hands-on lab with Instruqt: a great way to, within minutes, have essentially a sandbox Kubernetes environment with Tetragon installed, where you can try out Tetragon but also other aspects of Cilium Enterprise. And then the virtual summer school day on July 19, where we will host Tetragon as well as service mesh.
A
On
top
of
that,
there
is
a
tetragon
slack
channel
on
the
psyllium
slack.
So
if
you
go
to
psyllium
dot
io,
you
will
find
a
button
to
dislike.
You
can
join
the
slack
or
slack
server.
There
is
a
tetragon
channel
with
all
the
tetragon
developers
on
and,
most
importantly,
if
you
want
to
get
involved
outside
or
in
addition
to
using
tetragon
feel
free
to
contribute,
tetragon
open
source
repository,
github,
slash
students,
tetragon,
we
very
much
encourage
contributions
in
all
forms
doesn't
have
to
be
code
contributions,
but
also,
let
us
know
what
features.
A
Would
you
like
to
see?
We
already
got
some
feedback
today.
Rule
sets
we
would
love
to
have
a
discussion.
What
type
of
rule
sets?
What
integrations
do
you
want
us
to
implement,
for
example,
part
security
policies
automatically
that
are
getting
deprecated?
Do
you
want
us
to
support
something
other
than
the
falco
rule
set,
and
so
on
with
that,
I
would
like
to
thank
everybody
for
attending
this
webinar.