Description
This is a BrownBag Session (https://gitlab.com/gitlab-org/secure/brown-bag-sessions/-/issues/33) about creating a snapshot-based, feedback-guided fuzzer that uses perf events for feedback. Project with example code: https://gitlab.com/gitlab-org/vulnerability-research/kb/presentations/creating_a_snapshot_feedback_guided_fuzzer
So, I am James Johnson. I am a staff security engineer on the vulnerability research team at GitLab, and a lot of my background has to do with fuzzing. It's something I find really fun to develop tools for — you get tossed into a lot of interesting situations and problems to solve, and, yeah, it's just interesting to me, and I have done it a lot at past jobs. And this is a link to all of the example material in the slides.
All right, so this is what we will be talking about today. We'll start at a relatively high level with debugging: what we want to cover, why we want debugging, and the types of information we might want to capture. Then we'll cover mutation, snapshots, and feedback, and then we'll touch briefly on using a corpus of inputs for fuzzing.
All right, so you could call this a very basic debugger. It's a bash script — and, oh, here we go, that's better, all right. It runs a process, you get the exit code, and if the exit code isn't zero, it wasn't successful: you could say it crashed, or it operated in a way that wasn't intended, but you have no idea what's going on. And bash, I will say, is probably not the best way to implement a fuzzer or a debugger.
So here's roughly the same thing, except written in Rust. A lot of the source code in this is in Rust. It's something that I've enjoyed learning. I don't really use it much here at GitLab, but a lot of the source material and research for this presentation comes from things I've done on the side — just-for-fun programming — and that has been in Rust, so it's been easiest to do a lot of the code in this presentation in Rust. But this is doing pretty much the same thing: it spawns a new process and checks to see if it's successful, and if it's not successful, we'll say it crashed. All right.
All right, so if we run this locally — you can just compile it with gcc or clang if you want — and you run it like so, we'll see... oh, I did not have — oh, here we go. We'll see something like this: we'll see a segfault, and if we check the status, or the exit code, of the process that we just ran, it is non-zero, so our basic bash, or super basic Rust, debugger will see that it didn't exit cleanly.
Now let's go back, though — and this is... actually, yeah, that slide didn't need to be there. But we need more than this. This isn't enough information to really have a robust debugger, or to figure out what's going on, or whether the crash was interesting.
So just looking at the output from here, we can tell that the exit code was 139, but we don't really know much else. We know this is the code that caused the crash — we wrote it, and it's very simple — but we don't really know much more, besides that it had a non-zero exit code and that the exit code is 139 (which is how the shell reports a fatal signal: 128 plus 11, the signal number for SIGSEGV). So we need more information.
Now, we're not going to automate gdb. In the past I've done that type of debugging quite a bit, where I just wrap an existing debugger. I've used that a lot from Python, both on Windows and on Linux — I've written generic debugger wrappers that wrap either gdb or cdb, the command-line WinDbg — and it works pretty well, but for really fast, higher-performance fuzzing it does not work that well. And when I was using it, it was mostly for things like Adobe Reader and browsers, that type of thing, and your performance on those is pretty small anyway.
That's just a segmentation fault — I'm not sure what the V stands for (it's "violation": segmentation violation) — but the SIGSEGV signal is sent by the kernel when memory errors occur. You can think of signals this way: in the same way that interrupts relate to the kernel and hardware, signals relate to processes and the kernel. So with hardware, an interrupt can be sent to the kernel that absolutely needs to be handled; if it isn't handled, then the computer itself will just crash.
The same relationship exists with processes and signals: the kernel can send a signal to a process, and the process may handle it or not. That is one of the big differences — most handlers are optional, or they have default actions. SIGKILL and SIGSTOP can't be handled, though. If SIGKILL is sent to a process, the process just dies; there's no way to stop it. All right.
So let's go through a few more examples of contrived crashes in programs, just so that we can look at the signals they generate. Now, this one is a double free: we allocate some data and we free it twice in a row. This does come up in complicated C-based codebases — you can have double frees — and this does not generate a segfault; you get a SIGABRT, so slightly different behavior.
So if we are looking at trying to glean as much information as we can from the target as it's running, knowing which signal was sent to the process can be very useful — or more useful than not having it, let me say that. All right, so let's look at another one.
This is a stack overflow — so we're not overflowing a buffer, just a stack overflow: infinite recursion, and we run out of stack space. This one is also a segfault, and again we're doing the same thing: we compile it and we run it with gdb.
So we can see this signal. And here's another one — this one is a little more interesting. I have seen this before, especially if you're fuzzing a target that has some aspect of JIT to it.
You can have it execute invalid instructions, or, if you have a use-after-free and you make the instruction pointer jump to some random place in memory that happens to be executable, you can see this error: an invalid instruction. So 0x90 is a NOP — that's here — and then 0xCC is an int3, then a NOP and an int3, and this one, 0x06, is an invalid instruction on x64. What I'm doing here is: I allocate some data, I set it to be read-write-execute, and then I cast that data location to be a function — a callback — and then I call it to make the instruction pointer go to that data. Then we start executing these raw opcodes directly, and this is where it crashes; this also happens to be a segfault. All right, and this is actually the full list of signals.
One of the ones that I recently had to deal with was — oh, it's not going away — all the way at the bottom: there's SIGWINCH, and I'm pretty sure it means "window change". When the window resizes, that signal is sent to the program.
So if you are just generically capturing all signals sent to the process — and I actually haven't fixed this in the code — if you run the sample fuzzers in the examples directory and resize the terminal window, it will think a SIGWINCH signal is a crash. So yeah, there are a lot of signals going on, and it's kind of interesting having that insight into what's happening with the process. All right. So we've got signals, we know the exit code, but we really want more information than that.
So next up, we really need to have our own debugger. On Linux, this is where I would start using ptrace, and that's what we'll be talking about. We can use it to fully debug and control a process: we can read and write registers, we can look at the process state, read memory from it, change register values, monitor the signals — everything. And actually, this is what gdb uses: gdb uses ptrace. All right, and so I'll be quoting the man pages
a lot in here; they're the very succinct, clear definitions of what all these functions are. So, ptrace: there's a ptrace function, and you send it a request, and the request values — like PTRACE_TRACEME — are the main way to use ptrace.
So the way you start using ptrace is: you have a child process, and in the child process you have to call ptrace with the request PTRACE_TRACEME. Once that is done, then you can start debugging the process and tracing it.
So, in the examples directory there's the b_spawn_with_ptrace example, very similar to the a example, and here we are doing exactly what I said: we spawn a new process — that's what we're doing here — and when the process is spawned it's initially stopped, so we execute some code first in that context, and that's where we do the PTRACE_TRACEME request, and then, after that is done, the process is created.
How are we doing on time? Okay, we're doing just fine, all right. So if we run this with each of the different example targets that crash in different ways, we can see that we are capturing the different signals: all of them are segfaults except for the double free, which is the SIGABRT. So yeah — our debugger is working.
So knowing the signal still isn't enough. We want the registers; we can also look at the last instruction; we can start analyzing the stack and the heap. Having a debugger in place where you can actively inspect a process is very critical in automating this type of thing.
All right, so another way that we can gain additional information about how a process is crashing, or how these errors are occurring, is using sanitizers and clang.
So, sanitizers. Clang is part of the LLVM project, and part of how the LLVM project works is that it takes source input and transforms it into LLVM's intermediate language — LLVM IR — and then transformations are performed on top of the IR, and after that the IR is transformed into architecture-specific machine code.
So the sanitizers operate on the IR, and they insert instrumentation in different ways so that we can get extra feedback about how things are occurring. We may get more insight into how something crashed, but they can also ensure that things do crash when they go wrong. Certain types of errors — use-after-free is a good one. If you're not running with one of the sanitizers, you could free an object, and the memory of the freed object is still floating around on the heap; it isn't cleared out.
So if you have a stale pointer pointing to this freed object, you could still use it. It may have been overwritten by something else, or partially overwritten, and then, when that stale pointer is actually used, it is going to use a corrupted object.
So some of the sanitizers add checks into the code at compile time to make sure that never happens — it will crash if there is ever a use-after-free. So here's a list of some of the main sanitizers; there are a few other ones I'm not that familiar with. The most important, or most used, ones to me are AddressSanitizer — that's the one I pretty much always use — and there's also MemorySanitizer, which detects use of uninitialized memory. So AddressSanitizer is — I think I quote it on the... oh.
I didn't add it, all right. So AddressSanitizer itself is pretty interesting. It does operate on the code at compile time and, if I remember correctly, every heap allocation gets surrounded by poisoned redzones in memory, so that if anything is read beyond the scope of the allocation, it causes a crash. So if you allocate a buffer on the heap and you try to read or write beyond the bounds of the buffer, it crashes.
So for the double free, AddressSanitizer will output additional information and context on why something crashed, and we'll go through the different sample — or contrived — targets and how they look with AddressSanitizer.
All right, and this one is the stack overflow. It does give you very nice error messages, saying a little more clearly exactly what happened, and maybe a little more context about what caused it. And this one is the very basic one that crashes if you give it "gitlab" and it tries to dereference null. So here we go: a hint that the address points to the zero page, so the signal is caused by a write memory access.
So, to summarize so far: we can launch a process, we can monitor it and debug the process, and we can use sanitizers. But now we need to actually start sending inputs to the target process. All right, so with mutation we'll keep it pretty straightforward: we will just mutate random bytes in existing data. We're not going to worry about changing the size of the data being sent or anything.
So here's example c — it's another Rust project. This takes an existing array of bytes — a Vec&lt;u8&gt;, so each element is just a character, a byte — and a Rand object, which knows how to generate random numbers. For n number of times, we choose a random index and we set it to some random value in the character set, and that's it: a very simple mutation. And here we are in the main function, actually using it.
So we are spawning the process every single time, and we have this scratch buffer that we just keep reusing — we keep overwriting it. We copy the original input in, and then we mutate the scratch buffer, passing in the Rand object and the scratch-space buffer; then we spawn the process, and then we monitor it to see how it crashed. Now, I let this run for a while, and it never found "gitlab". Yeah.
It just never found it, and that's actually expected. We've got six characters here, so if we do — one, two, three, four, five, six — that many options, then one in that number are the odds of randomly generating the correct value. And actually the odds would be even lower, because we're not mutating all six bytes every single time; we're mutating a random number of bytes. So the odds are even lower that we will find "gitlab" with this method, all right.
So, moving on to the next phase: snapshotting. It's a little bit of a gear shift, but why would we want to snapshot? Process creation is very slow: if we are creating a new process every single time, that's a lot of setup time and teardown time for that process.
After all the setup has occurred, you fuzz only the interesting part, and you also have options to make it more deterministic. It's not always the case, but that is something that applies in general to the concept of snapshot-based fuzzing: it can be more deterministic.
Actually, let's go back to this and look at the iterations per second. Creating a new process every single time, we're getting about 700 iterations per second — so it's faster than some other things, but in general that's not too fast.
All right, we're doing good. So what does a snapshot actually mean to me? It means you're recording the state and restoring it. It should be that straightforward. It does get a bit complex, and there are shortcuts you can take — you don't have to do the full thing — but if you're fuzzing, or wanting to snapshot, really complicated targets, it gets a lot more complicated than that.
If you take a snapshot and then the process starts allocating things on the heap, or starts making state changes on the stack or wherever, you are going to want to be able to reset all of those changes back to their original state from when you took the snapshot. So: register values — there are standard and floating-point registers, and you need to snapshot both of those sets and save them so they can be restored. File descriptors — let's say a process has 10 files open.
How are you going to handle those? Let's say you snapshot the process and, you know, you continue doing the fuzzing, and it closes five of the file handles — how do you restore it back to its original state? Or let's say that a file was mapped into memory, or there are existing maps that were then closed — how do you deal with those types of things? It does get very complicated.
For the examples in this presentation, we won't deal with file descriptors or network sockets or mapped data; we're really only going to focus on memory and register values. But the other ones are definitely things to think about. All right, so procfs. Procfs is a pseudo-filesystem, and it gives you access to kernel data structures.
It's not the fastest thing in the world, but it does give you the information you might need if you're going to implement some sort of snapshot system. So in general, procfs is our friend — but really it's our frenemy. We like it because it gives us information we need, but we actually really don't like it, and there are reasons for that; we'll get to those later, though.
So here are some important procfs files for snapshotting. There's /proc/&lt;pid&gt;/maps — oh, and for each of these, if you want to test them out or look at them on your own system, you go to /proc/self/ and then the file name; "self" refers to the current process. So you could do /proc/self/maps, and that will be the maps file for that process. All right, so first off, /proc/&lt;pid&gt;/maps: it contains a list of all of the mapped regions for that process.
So if you cat /proc/self/maps, you'll see something like this, and you can see /usr/bin/cat is actually loaded multiple times with different sets of permissions. I'm not going to go into that, but this is really interesting to look at, to get a feel or an understanding for what's going on in the kernel when a process runs. Down here, you can actually see this is the stack region; as functions are called, they leave stack frames on the stack.
The stack would definitely be a memory region that you would want to restore. So if you're thinking about snapshotting, this gives you a lot of information, and you can probably see ways where you don't need to capture all of this stuff in memory in order to restore it: if you can't write to it, it's probably never going to change, so maybe you don't need to save it, for example. All right, so this one is another interesting file.
This is the memory of the process. It's not actually just a file with all of the process's memory — remember, these are an interface to kernel data structures. So if you open /proc/&lt;pid&gt;/mem and seek to an address in memory — let's say you know of an address in a process and you want to read that value using the mem procfs file — you would open it, seek to the address, and then read however many bytes, and that would be the memory from that process. One of the examples in the examples directory is procfs_readmem.c, and it does exactly that; this is what it looks like. Here we go: we're reading /proc/self/mem, and "self" will always be the current pid.
So what should occur is that we will print out "hello world", except it will be from new_data, and that will be a value that we read through the mem file in procfs — and that's exactly what happens. So I don't know why you would actually want to do that — maybe there are reasons — but for normal programming I've never needed to do it.
All right, so clear_refs. If we go back to this: this is a lot of data that's loaded into memory. These regions can be fairly large, and so, if we're restoring each of these on every single iteration during fuzzing, that's going to be a serious bottleneck. So keep that in mind as we go through these next steps.
So, clear_refs: it's a write-only file. Again, you're setting some value in kernel memory — you're accessing kernel data structures through procfs — and what this does is it clears all of the dirty flags on all of the pages for the process.
What you can do with this becomes clear when we look at the next one: pagemap. There's a 64-bit value in this file for every single page loaded into the process, and the 55th bit of that value, for each page in the process's memory, indicates whether or not that page is dirty — that is, written to since the last time clear_refs was written. So now, if we write the value 4 to /proc/&lt;pid&gt;/clear_refs, we can later check pagemap to see exactly which pages have changed since.
So that's where this concept originated: it was created in order to do snapshotting, or to make that easier, and it is actually used a lot with Docker containers. Yeah, there's a lot of research around how this applies to Docker. All right. So why do we really care which pages are dirty? It's pretty much performance.
If we only have to restore one page in memory, versus all of the regions that we know about for the process, we will be much faster. Also, we're operating on x64: the address space here is incredibly huge, and we do not want to be trying to save all of the process's possible memory for every snapshot. All right — now remember, procfs is our frenemy. It's not our friend; it's useful, but it's very, very slow.
So this — I'm not sure if this is his handle, I think, but his blog talks about how procfs is not that fast, and he had a fork of the Linux kernel that was trying to expose data structures instead of using a file-based system, where you have to use multiple syscalls just to get at the data. His fork of the Linux kernel was using a different method in an effort to speed it up, and if you search around you'll see a lot of links talking about how procfs is really not that fast. We'll talk a little bit more about that on these slides: reading and writing memory through /proc/&lt;pid&gt;/mem requires a few syscalls.
You have to open the file, you have to seek, then you have to read or write the values, and then you have to close the file. Now, you could leave the file descriptor open and then just seek and read or write as you want, closing it when you're done — so you could kind of rule out the first two — but even then, for every single value you want to grab from the process's memory, you have to do at least two syscalls: you have to seek, and you have to do the operation.
So there are better APIs for doing this type of thing: there's process_vm_writev, and also process_vm_readv. These both take two arrays of what are called I/O vectors: a local iovec array and a remote iovec array. For writes, the local iovecs indicate what data to write, and the remote iovecs indicate where to write it to, so you can pass a whole list of data-and-address pairs.
You can send those lists to process_vm_writev and it will do all of them in one syscall — I actually thought it took more than that, but it does it all in one shot, instead of having to do two syscalls for every operation. And it's the inverse for readv.
So for readv, you have an array of buffers with known sizes that data from the remote process is going to be read into, and then you have an array of addresses in the remote process that indicate where the data will be read from. All right, so we have a lot of the building blocks in place now; let's start putting them together a bit more.
If we want to record a snapshot, the process must be stopped, and we need to record the registers — those parts are done with ptrace. Then, with procfs and the process_vm functions, we need to copy all of the writable regions indicated in maps, and then we write 4 to clear_refs, which will clear the dirty-bit flag on all the pages in the process's memory.
And I mentioned this before: this does ignore all of these things. I didn't mention child processes — what if it spawned a child process and we're resetting things? If the target that you're fuzzing is very complex, you will probably have to deal with these and figure out how you want to approach it. Maybe it's fine, when you restore, to just kill all the child processes that didn't exist at the time the snapshot was taken.
Maybe that's all you need to do, but you still have to deal with multiple threads, memory-mapped regions, new file descriptors that were opened — lots to consider. All right, so snapshotting is super useful; there is a lot of research going on in that realm. And I did not add a link to something else that I wanted to: AFL, a very popular fuzzer. There's AFL++, and some of the AFL++ folks have been working on this AFL snapshot.
That's a Linux kernel module that takes care of all the snapshotting on the kernel side, so you can have high-performance snapshot-based fuzzing without having to do multiple syscalls. Oh — one of the other things I was going to mention here is that there is also research going on in emulated fuzzing, where you emulate a different architecture — possibly a simpler architecture — and its memory, in an emulator.
In the emulator, you have full insight into everything that's going on in the process, or whatever you're emulating, so you're able to capture everything you need without needing to do anything with the kernel. It's been emulated, so you have absolute control over everything, and that's another very interesting area of snapshot fuzzing.
So the main function calls do_something, and this intentionally takes maybe a second, or maybe a little less. So if you were to fuzz this program end to end, you would not have a fast loop that you're iterating through — you would have like five iterations per second or something.
Okay, so this data is coming directly from the command line of the process — and I'm realizing I did not put the screenshot that I wanted to show in there. All right, so let's go back to here: we've changed the target process to have a slow section, with the interesting bit coming after the slow section, and the data that is being provided to the program just comes from a command-line argument. Now, if we're using a snapshot-based system, we'll create the process once and then keep resetting it.
The intention here is that you could have an automated way to find these locations where you want to set a breakpoint, and I'll talk about some ideas on that in a little bit. But in general, the example fuzzers work like this: the target flags the data that will be fuzzed, then it triggers the snapshot — when to take it — and then it triggers the snapshot restore. So the target itself has been manually instrumented to have these interactions with the fuzzer.
So this is how we're flagging the data inside the target process: we're using some raw assembly, and we are triggering an int3, which generates a SIGTRAP signal.
But before that occurs, we put a special value into rcx — 0xf00dfeed. If the fuzzer sees 0xf00dfeed, then we know that SIGTRAP is at the location where the memory address was tagged, and when that happens we grab rax from the registers — that's the address — and rbx indicates the size of the data. Now we know the address in the target process and the length of the data that will be fuzzed.
something
else
here.
You
might
notice
at
the
bottom.
A
We
say
set
watch
point,
and
that
is
what
we
do
on
the
next
slide.
So
taking
the
snapshot,
this
is
actually
setting
the
watchpoint
or
a
hardware
breakpoint.
That is done on the address of the data to be fuzzed. So in the target process, argv[1] is passed in, and that gets put into rax, which becomes this overwrite-data address, and then we set a watchpoint on it. So the next time that data address is accessed, a SIGTRAP will be triggered — sent to the process — and what that gets us is this.
Let's see — what that gets us is: we don't need to manually figure out the best place to insert the breakpoint that triggers the fuzzing. So now we take the snapshot as soon as the data of interest — the input to the program — is actually used, and in this contrived example that gets us past the slow section. And where that actually takes us — let's go all the way back to here, one more, all right — where that actually takes us is inside of strlen.
This is the first time the data is accessed, and that's where the hardware breakpoint triggers the SIGTRAP, and at that point — that is where the snapshot is taken, and this is completely after the slow section. During the development of the snapshot fuzzer code that I did in my free time, I actually had a problem where I wasn't saving the floating-point registers.
A lot of the string functions in the standard library use floating-point registers to, you know, increase performance, and I wasn't saving those when I was doing the snapshotting, and it kept crashing — sometimes here, inside of strlen, after I restored — and it took me a while to realize that I wasn't restoring the floating-point registers, because those are what strlen was actually using.
All right, okay: so we have the hardware breakpoint, and it sends a SIGTRAP inside of the strlen function, and that's the point at which the snapshot is taken. So at this point we need to — actually, that's right: this slide is talking specifically about how the hardware breakpoint is set. It's pretty interesting.
We are starting to run out of time, so I'm going to kind of gloss over it a little bit, but there are debug registers on Intel processors, and DR0, DR1, DR2, and DR3 each track a specific location in memory that can be set. So you can have four hardware breakpoints going at a time with those four registers.
The DR7 register is a debug control register, and you set values inside of that register to enable or disable each of those DR0-DR3 registers. It's a little more complicated than that, but that's the gist of it. And you set these registers specifically with PTRACE_POKEUSER — you can't use the normal GETREGS request.
All right, so, restoring the snapshot: there's another breakpoint triggered at the end of the target program, and this is watched for in the fuzzer.
If a SIGTRAP occurs after we've already started the fuzzing loop, then we know it's ours, and we just break there. Now, really, a SIGTRAP could occur in the process for other reasons — maybe the developers of an actual real application left in an assert that would trigger a SIGTRAP.
More logic would need to be added here to handle that type of thing. All right, so if we run this, we'll see that we receive the tagged memory address, we have the max data length, and we have the address of the data to be fuzzed.
And here — this is the watchpoint, or the hardware breakpoint, being hit. This is inside of the strlen function, and so we took the new snapshot, we copied all of the writable regions from memory, and then we start the fuzzing loop.
So at the start of every fuzzing iteration we restore the snapshot and then do the fuzzing — and this thing keeps showing up, but if we look at this, we're getting 32,000 iterations per second. If you remember, when we were creating a new process for every iteration, we were only getting about 600 iterations per second. So snapshot fuzzing does have huge potential, and it's really not even so much the snapshotting itself that is getting us that.
All right, let's see. So let's say we're generating inputs to send to the program: some inputs will be more interesting than others. The general theory behind a corpus is that interesting inputs will need to be tracked, and maybe prioritized based on how interesting they are, and non-interesting inputs — maybe ones that you've seen before — will just be discarded. So this corpus works hand in hand with a feedback metric, or a fitness function, and here — this is all I need for a corpus.
All right. Usually, using feedback in a fuzzer looks something like this: you might revert the snapshot, start recording — whatever that means for the type of feedback — run the target, and then you get your metrics from the feedback; then you check in the corpus whether you've seen that feedback metric or not, and decide if you want to save that input. So, types of feedback: coverage is the obvious one, but there are other ones — performance counters, manual breakpoints, anything meaningful that can indicate progress in the fuzzing process.
So coverage is the default. Clang supports coverage as a sanitizer, so it gets inserted into the code at compile time, and you can access those coverage metrics directly in your fuzzer — except you have to know about it, and you have to access them, so you need to read the clang documentation to be able to work with clang's SanitizerCoverage. The feedback metric that I'm using in these examples is actually performance counters.
A
It was very interesting to me, and that's why I used it. It was different, and I wanted to use a different feedback than coverage, because coverage is what everything uses, and I wanted to see if something like this would even work.
A
So these are all of the types of performance counters that are tracked by the perf Linux subsystem. There's quite a bit, and these are for the user or system-wide, as well as the kernel; you can specify what type of counter you want to have for each of these. Part of the problem with using performance counters is that they are non-deterministic.
A
The perf subsystem is sample-based, so it will record samples of each of these counters throughout the recording process. So two consecutive runs of a program will end up with different counter values.
A
So if we run this command, all we're doing is echoing hello, and we're recording the number of cycles, the number of instructions, and bus cycles; I could have added branches in there too.
A
They have pretty largely different numbers on these. Now, does that actually matter? It turns out it doesn't, or not as much as you might think it does. So cycles definitely matters: there's a huge variance in cycles, like 4,000 different. Instructions vary a lot less, and branches actually vary very little.
A
A
Will cause you to save extra inputs that look interesting just because of the jitter, but that could help you in the fuzzing process to not give up on certain paths too early. All right.
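One hedged way to keep that jitter from flooding the corpus (my suggestion for illustration, not necessarily what the talk's fuzzer does) is to bucket the raw counter value before comparing it, for example by dropping the low bits, so two runs that differ only by noise map to the same feedback value:

```rust
// Bucket a raw perf-counter reading: keep the magnitude, drop noisy low bits.
fn bucket(counter: u64) -> u64 {
    counter >> 6
}

fn main() {
    // Two runs differing by ~40 cycles of jitter land in the same bucket...
    assert_eq!(bucket(10_000), bucket(10_040));
    // ...while a genuinely different path (thousands of cycles apart) does not.
    assert_ne!(bucket(10_000), bucket(14_000));
}
```

The shift width trades sensitivity for stability: too small and jitter still looks novel, too large and real new behavior gets ignored.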
So this is an example. This is the side project that kind of spawned all of this for me, of doing snapshot-based fuzzing in Rust with performance counters. So, resmack-fuzz-test: I call the fuzzer resmack, and this is just my experimental fuzzer for that.
A
So this is that running on a target. It's a pretty much identical setup, except it's looking for the word resmack, and this is the full thing working with perf counters as its feedback mechanism.
A
So if you remember, we had run 18 million iterations with the snapshot-based fuzzer and we still didn't see the crash. This is looking for an even longer input, and we get the crash in about 60 or 70 thousand iterations, so it took like two seconds to find it, and then we got it. Now.
A
We are very, very close to being out of time. Does anybody have questions? Oh, I've got two slides left, so think of your questions and I'll cover these real quick. So perf events definitely has pros and cons. A huge pro is that I didn't need to instrument the process at all; all I needed was to have the perf subsystem in Linux working, or functional, on my system, and that was it. The problem, though, is that there's a 4x overhead to recording performance counters while you're running a process.
A
A
So this here is how long it took to run a very basic function with no perf, and this is with perf counters turned on. So there are downsides to this. If you spend a lot of time in the target process, this will have a much bigger impact than on our trivial targets, which, you know, spend just a very, very small fraction of their time actually running their own code. And yeah, that is the end of this presentation. Did anyone have any questions?
B
Yeah, thank you very much. It was very cool, super interesting. I just have, like, two questions on the agenda. The first one was about what, basically, the best point of creating a snapshot is. So if you have, like, inputs that are used multiple times in your program, how do you know what the best points in the program's execution are to take the snapshot?
A
Yeah, so that kind of comes down to knowing the target, right? So, let's see, there is a thought that...
A
A
Those do tend to be more successful than just plugging and playing, right? So, knowing where to do the snapshot: it would have to be targeted. You could try to figure out where the process spends most of its time; maybe that could work. You could use perf to figure that out.
A
You could look at the functions that have the most time spent in them, but really, if you're looking for bugs, you're not really looking for where it spends most of its time, right? You want to try and leverage, or cause, the most code in the program to be executed, basically. Or, if it's more of a state-machine type thing, you want to cover all the different possible state transitions, right? Yeah. So there isn't an easy answer to that.
B
Yeah, yeah. The second question is about something I think I read in a paper a while ago; I'll have to dig it up somewhere to link it. But they were storing the state of a process by using, like, fork calls. So they were forking a process, which basically creates, essentially, like, you have the memory mappings and everything available in the child process, and then they were, like, running it, and it would fail.
B
They were restoring the parent process, so they were basically using fork calls to create, like, a copy of the parent process, and I was wondering if this could be also useful for fuzzing, or if this is something that wouldn't... I mean, if you have access to the source code, you could inject fork calls into the source.
A
That is actually a method that a lot of fuzzers use: forking, and using that to track the child process.
A
A
All right, I think... did anybody else have questions?