A: Okay, great. So we'll start with just some quick points of order. Zach, are you here?
A: Zach's not here, but I'll take this from Zach. So, the EMEA-friendly meetings have been rescheduled: they are at a new time that is more EMEA-friendly, and they'll also take place on Thursday. So, if you're in East Coast or other U.S. time zones, this is early Thursday morning, and Zach will be chairing those meetings, I think indefinitely going forward, because they're outside my working hours; but he might appreciate some fellow chairs and folks to help him with the chairing.
A: Those have been updated on the OpenSSF public calendar, and also the shared calendar invites that have RSVPs, and we'll also update links and stuff elsewhere when those happen. So that'll be the next meeting, in two weeks, on November 16th; which, I guess, will actually be the 17th, but yeah, it'll be a Thursday, and it'll be updated. Any questions about that?
A: The meeting will be on December 15th. Okay, cool. Real quick, I want to welcome new friends. So, if you haven't been to this working group meeting before, can you unmute real quick, just say hi, and introduce yourself and your affiliation?
C: Hello, I'm Zach Steindler (not to be confused with Zach from Chainguard), and I'm working at GitHub on npm, linking packages to their source code and build instructions using Sigstore.
H: Hi, I'm Ian Malloy, from IBM Research.
A: I guess maybe you want to give us a little preview of what you're going to talk about, or a little more detail about your organization at IBM?
E: Sure, maybe I can start with that, and then Ian and Jiong can bring up the presentation and take you through it. You know, we've been represented at the OpenSSF by two of our colleagues who are on this call: there's Jeff Borek, and there's Matt Rutkowski. And, of course, we know Jamie Thomas, one of our vice presidents, is one of the leaders at the OpenSSF, working with Brian, who's also on the call.
E: We've worked on Sigstore, for example; that's one of the projects in the OpenSSF, which is more based around the reputation of developers, credentials, and so on. We've been working on a different angle on this, which we actually call a code genome. It's really based on the idea that, even if you raise the bar on security, it's possible to spoof things like developer credentials, and other threats and other attacks become possible.
E: Is there a way that you can go down to the code level, and what is it that you're able to do at the code level? For the better part of this year, we've been working on some of these ideas and sketching them out, and Jiong will take us through this. Maybe you can start presenting, guys. And our request to you, since you guys are spending a fair bit of time across the industry in this space:
E: I would love to get some feedback from you on some of these ideas, on how we are pursuing this, and on how we might be able to make it much more usable, consumable, and practical going forward. So I'll pause here. Let me also ask Jeff: do you want to weigh anything in, please, before we turn this over to Ian and Jiong to present? Jeff Borek.
K: Thanks, JR. Yeah, we're excited to preview this technology with the Securing Software Repositories working group, and we're interested in finding ways to effectively share some of this, to help remediate some of the challenges that span the open source ecosystem. So we're here to preview it and, again, be flexible on the approach, be radically transparent, and see where we go from here.
K: Over to you.

H: Right, thanks. Can everyone see my screen? I had to do a quick shuffle there, actually, while JR was speaking. Okay, great. So, again: I'm the department head of the security research group in Yorktown, and I want to bring forward some of the work we've been doing.
H: These ideas have been kicked around for a number of years. I think some of the recent events that happened, probably in the last, let's say, six to twelve months, really spurred this on to develop it a whole lot more, and you'll understand that as we go through. It's an interesting time. Let's just get started; we probably don't have to go through any of this.
H: This is one of our slides for when we bring people up to speed who aren't familiar with it, but supply chain attacks are pretty prevalent these days. We actually see quite a few of them, at all different levels of the supply chain: from attacks against developers, developer credentials, and repositories, to attacks going after IT systems.
H: You know, when you go from the developers who are going to add that code, they're going to build it, compile it, and then eventually distribute it, and there's a whole bunch of tooling that has built up to help secure that: whether that's looking at vulnerabilities, or all the great things that integrate directly with tools like GitHub and let you find them as early as possible.
H: You know, integrity of the CI/CD pipelines; reproducible builds; all the great work that's been going on with in-toto and SLSA; and then eventually the SBOMs, the bills of materials, that you can actually provide to the end users to give some form of assurance of what's going on. What we're actually going to present and propose here is something that's complementary to these technologies.
H: So it's not one of those things where you wouldn't need these anymore, but it might address some additional security gaps that we actually see. JR already mentioned this a little, but what we actually want to propose is something like a software fingerprint, to help raise assurance of the software, the code, the binaries that we actually see. When we think about this, we start with a hash: it's something we can use to verify that files match precisely.
H: If there's a single bit of difference, there are large differences in the hash. Signatures then allow us to assign some form of trust to that: assuming you trust whoever signed it, they're effectively attesting that, yes, this is what they have, it's a blessed copy, it hasn't been modified in any way, and you know who it claims to have come from. But this actually requires you to have full trust in the signer, and, as we saw on the previous chart, sometimes developers do actually get compromised.
H: They might sign a commit, but sometimes it might not be something that any given package maintainer can actually have; it's very, very costly, and some vulnerabilities or breaches can still make it through. Fuzzy hashes, something like ssdeep, are another, similar technique, which can provide a partial match for a file against known files. What this tends to do is find small differences between files, but it really lacks a semantic understanding of the files, their purpose, and what they're actually trying to do. Just to show a quick example:
H: I was playing around with running ssdeep on different builds, different distributions, of ssdeep itself: the RHEL version of 2.14, two different Ubuntu versions of 2.14, and one I compiled with Brew on my laptop. They all come up with wildly different fuzzy hashes; none of them actually match. So it's really difficult to verify that they're all different builds of effectively the same binary, the same code.
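The exact-match limitation described above is easy to demonstrate. The sketch below uses only a standard cryptographic hash (ssdeep itself is a third-party tool, so it is not shown): flipping a single bit of a file changes essentially the whole digest, which is exactly why plain hashes cannot say "these two builds are almost the same".

```python
import hashlib

# A cryptographic hash verifies only *exact* matches: flipping a single
# bit yields a completely different digest (the avalanche effect), so it
# cannot distinguish "slightly modified" from "totally unrelated".
original = b"#!/bin/sh\necho hello world\n"
tampered = bytearray(original)
tampered[0] ^= 0x01  # flip one bit

h1 = hashlib.sha256(original).hexdigest()
h2 = hashlib.sha256(bytes(tampered)).hexdigest()

# Roughly half the bits (so most hex digits) differ after a 1-bit change.
differing = sum(a != b for a, b in zip(h1, h2))
print(h1)
print(h2)
print(f"{differing}/64 hex digits differ")
```

Fuzzy hashes such as ssdeep trade this avalanche behavior for similarity scoring, but, as the speaker notes, they still have no notion of what the code actually computes.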
H: We can also think back to the old Ken Thompson paper, "Reflections on Trusting Trust", where you might not know where you've been compromised: was it the developer that was compromised? Is there something the compiler is actually doing?
H: We've actually done some tests recently where we thought we had disabled certain features, but they still make it through to the final binary. So what we really want is a way to verify, from the source code to the binary, the full integrity of everything; to do that across deployments; and to address some of the recent things we've seen, where you might have a version number that doesn't actually match, because someone added an additional patch, or backported some security patch or extra functionality, because they had to do that for your specific system.
H: So we go back to the supply chain view. While there are all these tools doing a great job at locking it down, there are still small places where there are potential gaps, potential weaknesses, that attackers can actually leverage: again, anything from the point of entry, compromising developers; and then, at the end, you can think of it this way:
H: Well, we have a huge legacy problem. Until we're all at a good SLSA level for build systems, there can be tons of code out there, tons of legacy deployments. Not everyone has great CMDBs or knows exactly what's being deployed, so the ability to go back and verify and fingerprint all of it is eminently useful.
I: All right, thank you. Again, we call our project the code genome, because, fundamentally, what we're trying to achieve is to find a meaningful fingerprint that represents the functionality of the code; this goes beyond syntactic matching. We want to understand, or we're trying to find, the inner functionality of the code.
I: These are actually the same computation: as you can see, we injected some assembly code, or sometimes changed the control flow, doing some kind of obfuscation or modification of the entry point. And, of course, when you compile them, they result in different binaries; even the sizes are different, as you can see from the objdump screen behind.
I: What we're doing here is generating the code genome from each of the binaries, which is at the bottom center: you see the black screen that looks like some kind of IR format, and that generates a graph, a graphical representation. This is actually the code genome we generate, via IDA Pro, from the binary; and then everything comes down to the same representation, so that we can search across different architectures, compilers, and optimization levels. Of course, this is a quite challenging problem, as you may already know.
I: So we cannot claim 100% coverage, but we are getting there: we keep trying to improve our technology to handle many different corner cases, and we'll keep improving our code genome functionality and quality. So this is the high-level overview; maybe we'll go on to the next chart.
I: This is how we generate the code genome, in terms of the pipeline, the steps. On the bottom left you see there is source code, which is optional, because we mainly start from the binary. Of course, if we have the source code, we can compile it to get the binary. Eventually, what we care about is the IR, the intermediate representation, because, as you may already know, that is architecture- and platform-independent intermediate code.
I: So this is how we generate the code genome, starting from the actual machine code or the source code; eventually we get to a representation, the code genome, with which we can compare code against code. The main benefit is that we don't necessarily rely on the source code. Of course, if we have the source code, that's great, and then we can do some kind of ground-truth matching.
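The speakers don't show their actual implementation; purely as a much-simplified, hypothetical sketch of the idea, one could canonicalize an IR-like instruction stream by dropping registers, immediates, and addresses and hashing only the opcode sequence, so that two builds differing only in register allocation or layout map to the same fingerprint:

```python
import hashlib

def gene(ir_instructions):
    """Toy 'gene' for one function: keep only the opcode sequence,
    dropping registers, immediates, and addresses, so builds that differ
    only in register allocation or layout normalize to one fingerprint."""
    opcodes = [insn.split()[0] for insn in ir_instructions]
    canonical = "\n".join(opcodes)
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

# Same logic, two "compilations" with different registers and addresses.
build_a = ["load r1, [0x1000]", "add r1, r2", "store r1, [0x2000]", "ret"]
build_b = ["load r7, [0x5550]", "add r7, r3", "store r7, [0x6660]", "ret"]
# A changed computation (add -> xor) should produce a different gene.
tampered = ["load r1, [0x1000]", "xor r1, r9", "store r1, [0x2000]", "ret"]

assert gene(build_a) == gene(build_b)   # same functionality, same gene
assert gene(build_a) != gene(tampered)  # changed computation, new gene
```

A real system would of course operate on graphs (control flow, data flow) rather than a flat opcode list, which is what makes it robust to reordering and obfuscation.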
I: That's why we focus on binary analysis. Also, as Ian mentioned, there might be a compromise of the compiler; XcodeGhost is one example. That's why we don't necessarily want to trust the whole build process: we ultimately verify the binary code, which is the code that actually runs on the system in the end, because source code and binary code aren't necessarily guaranteed to correspond to the same thing. So that's why we want to inspect and verify the binary code.
I: Okay. Using this technology, the team has been exploring many different use cases, and we picked maybe the two most relevant. The first use case was, as you know, of course, last year, at the end of the calendar year, at the perfect time, over the holidays: the big news about Log4j. And, as I mentioned, the software genome is about finding the core functionality, the core representation, of the code.
I: So we took that idea: from the vulnerable versions of Log4j, we find where the vulnerability is, and use that to search through our systems, infrastructure, and organization, to find where the vulnerable project is still deployed and used. This also raises quite interesting challenges: software is not necessarily just a single binary. There are always packages, and packages of packages, and, of course, packages of packages of packages.
I: Like zip files inside zip files, you can imagine: there are lots of dependencies, and also a lot of layers of software. We have to peel the onion to get to the actual code, and the dependencies get quite complicated. So there is a lot beyond the surface; we're trying to come up with a robust technology, a framework, for analyzing down to the nugget of actual code.
I: So that's how we scanned our organization, and we found a lot of matches in the deployed systems. Here we presented Log4j specifically; of course, the same technology can be used for other types of vulnerabilities. The team is currently pulling in the latest versions: whenever a new vulnerability comes out, we pull it, generate a quote-unquote signature, the genome representation, and then search through to find the vulnerability.
I: This one might be more interesting for today's audience. Another use case we're focusing the genome technology on is SBOM verification. As Ian mentioned, the SBOM is a great technology, a standard for describing what the actual ingredients in the software are whenever it's deployed or delivered. The issue here is that the user, or end user...
I: Did I miss something? Well, sometimes, with some of the vendors, and we know there are many instances of this, the vendor embeds open source, like GPL-licensed code, but does not include or claim it as part of the SBOM, because they don't want to get into the legal side. So there's always a chance the SBOM might be incomplete or incorrect, and that's why we want to verify whether the SBOM really matches what's inside the source code, or inside the software.
I: That's why we use our technology to understand what's inside the software: meaning, we do software composition analysis using the genome technology. Given the software, we verify, or generate, what's inside, and then regenerate the SBOM, a verified SBOM, to be able to say: okay, this is the correct SBOM, this is what they claim, this matches; or, maybe, this does not match. That's the capability we are currently actively building. So, on the next chart:
I: These are great sources of information for generating an SBOM, but the issue is that it might not be complete. For example, in the top right corner there is an article about official Docker images, where some of the images claim an incorrect version of the software they ship.
I: As you see in this simple Dockerfile: starting from the Ubuntu image, you update, and then you install wget; so, of course, from the SBOM generation you'd expect wget to show up as part of your SBOM, which is true. But on the right-hand side you see line number six, which effectively rips out the package manager: the metadata information is removed. When that happens, the generated SBOM doesn't contain wget at all.
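Whatever the scanning machinery, the verification step being described reduces to comparing the component set the SBOM declares against the component set actually observed inside the artifact. A minimal sketch (component names are invented for illustration):

```python
# Toy SBOM verification: compare what a vendor's SBOM declares against
# what a (hypothetical) scan of the artifact actually observes.
declared = {"ubuntu-base", "openssl"}           # components the SBOM claims
observed = {"ubuntu-base", "openssl", "wget"}   # components the scan finds

undeclared = observed - declared   # shipped, but missing from the SBOM
stale = declared - observed        # claimed, but not actually present

print("undeclared:", sorted(undeclared))  # ['wget']
print("stale:", sorted(stale))            # []
```

In the Dockerfile example above, deleting the package-manager metadata makes a metadata-based SBOM generator miss wget, while a scan of the binaries themselves would still place it in the observed set.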
H: So, Jiong, in the interest of time, let's start jumping to the asks; I'll show just a couple of other things. We had a demo planned where we were going to show that we could download and recompile something like wget as an unknown package, and have our tool automatically identify and recognize it, but we'll skip that.
H: What we currently have is: several different techniques that compute these different genes; support for multiple formats, for binaries, packages, and even different types of bytecode and interpreted code; and a very large cloud-native application for processing all this. We're trying to process as much software as we can, to make sure that the genes are robust and that we can correctly identify all the different software packages, and we're currently trying to perform a large-scale evaluation.
H: What are we actually planning on releasing? Well, we're hoping that some of the initial versions of the gene-creation techniques we have are things we can actually open-source, plus the service whose screenshots I just showed, which would demonstrate the technology.
H: Sorry about that. And utilities for being able to handle and query the large database that we're building up.
K: I think we can come back for a demo, or set up a special one, but I'll defer to the chair of the working group, given the time allowed.
A: Yeah, I'd like to take questions, and then maybe run through the rest of the agenda quickly and see if we have more time at the end for a demo. Okay, go ahead.
C: Yes, first of all: very, very cool technology. I'm wondering how you're proposing to generate all of the hashes associated with the vast number of software libraries that are in use in the world.
H: We're probably going to start a little small, looking at all the major packages that we see in major distributions; that's probably where we're going to start building this up. And if people have packages they think we definitely need to ingest, we're willing to take that as thoughts and feedback.
H: I think we have different mechanisms where we'll be syncing different repositories, and different projects that we'll start pulling in and using as a base, and then building on that as we go, making sure the genes we compute are meaningful first, before we go and scale up.
J: Thank you all for the presentation; I think this is a really exciting approach. I'm curious whether you've considered how the genomic fingerprint of software relates to dependency resolution. Is it, for example, somehow inferable from a gene sequence, if you will, what the contents or the dependencies were? Or is there some way to connect them, or map them? You've probably seen my work on GitBOM, which is really focused on dependency resolution; this looks not quite the same, but I'm curious whether there is overlap.
I: I think that's a really great question. In the backend, exactly to provide the capability to infer dependencies, we are not storing things as a single relational database. Instead, we are actually building a knowledge graph behind the scenes, where we represent the dependencies, the relationships: between the binaries, binary to package, package to Docker container. We want to keep those relationships between the binaries, and then, from there:
I: Of course, we can do many interesting things, like provenance of the code, where the code came from; or, when a vulnerability comes out, whether that vulnerable code is also used in other packages or projects; all kinds of interesting reasoning we can do on top of it. But yeah, that is a really great question.
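The kind of reasoning described, walking from a vulnerable gene back up to every container that ships it, can be sketched with a toy graph of "contains" edges. All names below are invented for illustration; a real system would use a graph database rather than a dictionary:

```python
# Toy knowledge graph: container -> package -> binary -> function gene.
contains = {
    "container:app-a": ["pkg:liblog-2.14"],
    "container:app-b": ["pkg:liblog-2.17", "pkg:libssl-3.0"],
    "pkg:liblog-2.14": ["bin:liblog.so.2.14"],
    "pkg:liblog-2.17": ["bin:liblog.so.2.17"],
    "pkg:libssl-3.0": ["bin:libssl.so.3"],
    "bin:liblog.so.2.14": ["gene:jndi-lookup-vuln"],
    "bin:liblog.so.2.17": ["gene:jndi-lookup-fixed"],
    "bin:libssl.so.3": ["gene:tls-handshake"],
}

def affected_containers(vuln_gene):
    """Invert the edges once, then walk ancestors up to the containers."""
    parents = {}
    for node, children in contains.items():
        for child in children:
            parents.setdefault(child, []).append(node)
    frontier, roots = [vuln_gene], set()
    while frontier:
        node = frontier.pop()
        for p in parents.get(node, []):
            if p.startswith("container:"):
                roots.add(p)
            else:
                frontier.append(p)
    return roots

print(affected_containers("gene:jndi-lookup-vuln"))  # {'container:app-a'}
```

This is exactly the query shape behind "which deployments still ship the vulnerable Log4j code," regardless of how many packaging layers sit between the gene and the container.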
H: One thing I wanted to chime in with there, which was kind of interesting: I guess two things. One would be, you know, we used Log4j as an example. It was the gift that kept on giving, and one of the reasons for that is, when you look at a jar, very, very infrequently...
H: ...do you actually have, like, a log4j.jar that we could find, because lots of people build with Maven, and it packages all the dependencies inside, and so it actually became very, very difficult. I think Jiong mentioned it, but we scanned hundreds, if not thousands, of systems within Research, with a very, very early version of our tool, and actually found hundreds of running instances of Log4j that all the other scanners they had tried had missed; partially because we were, you know (I like to use the word "turducken"), unpackaging, unrolling, and unwrapping all the different layers of the software dependencies, until you actually find that, hey, within this jar are the following class files, and those class files actually match the gene sequence. And I'm really curious about eventually applying this to things like Go applications, which tend to be statically compiled, and being able to look at one and say, hey:
H: We've got little bits of code from here, little bits of code from there, and they've all been added together into this larger package. So it's kind of about the dependencies, but looking at these different layers.
H: I'm not sure if we actually mentioned it, but we hope to have genes computed at multiple different levels of granularity. The preliminary version is going to be at the file level; we have things at the function level, and we're also evaluating what the right level of granularity is.
A: Well, that goes right into my question, which I think was next in the queue, which is about granularity. You showed very small snippets in the slides, and I think that makes sense, but what level of granularity are you looking at? And how do you account for, say, a malicious snippet or a vulnerable snippet that's included in one file, but is also at a completely different position in another file? Does this account for that?
H: The hope is that we do. At the function level we would have large sequences, large collections of genes, that we actually see within a given file, and if you reorder the positioning, the hope is that, yes, the gene would still be there. The hope is that it is robust to different forms of obfuscation, as Jiong showed.
H: We've had that in mind as we go through into the evaluation.
I: So maybe there are too many genes we would need to create, and maybe some of the genes may not be that meaningful; that's the reason we want to start from the function level, which is probably the right level, because a function, by definition, should be one of the units that provides a meaningful computation. That's why we decided the function would be the right granularity to target. But we definitely keep track of function to file, file to package, and package to container; those relations should still be captured in the knowledge graph.
L: Could you talk a little bit about that obfuscation and how you're handling it, and, actually, more generally, what the threat model is there? You've mentioned a couple of times trying to keep some of the malicious suppliers honest, if they're lying about stuff in the SBOM, and all of that sort of thing.
L: How confident are you in being able to handle, effectively, a malicious supplier? Is that part of your threat model, and how is it going to work?
I: Yeah, that is, of course, a part of the threat model. I guess there is a reason we say we're starting with a preview of our technology: we are developing, evolving, our technology to address many different corner cases. But we are pretty sure that in this space, as you've experienced, there will be a lot of very creative ways to bypass all the different kinds of de-obfuscation technology, so we're trying to incorporate many different techniques, many different optimizations, at the IR level, as much as possible.
I: We are trying to develop more and more technology to address many different corner cases, and this is actually one of the reasons we want to interact with the community: there's lots of development in binary analysis, I know, from the research community and, of course, the industry community. We want to learn, adapt, and incorporate that, so that we can leverage and benefit from the developments in the community; and, of course, we want to give back, as a kind of service. Because this is a challenging problem.
H: Maybe two more proof points. Some of the folks on Jiong's team have done some work in malware analysis; that's actually how some of the ideas originally started. They're looking at how two different disassemblers might come to wildly different conclusions, based off of non-word-aligned instructions and some interesting little things like that that malware tends to do.
A: I'll add that one thing this seems particularly useful for, and that might require a much smaller data set, is malware detection; I'm assuming you've probably already thought about that. Maybe a useful first step here would be a service where, you know, we have a data set of known malware fingerprints and all that, and then we can provide a service that can scan some arbitrary piece of code or software and see if any of the malware is present.
I: That would be a really useful security technology, or use case, but at the moment, as we presented in use case two, we're currently focusing on the SBOM, more focused on open source. The reason is that, since we are still developing the technology, we want a good mechanism to evaluate whether our findings, or the results, are meaningful. That's why we're focusing on the open source side, where we have the ground truth from the source code, and can check whether our findings really make sense or not.
K: Thanks, Dustin. Yeah, I think we're probably ready to yield time now, but I wanted to conclude by basically doing a quick show of hands: if the folks attending the call want to either provide a thumbs-up or raise their hand, as an indication that this looks like an interesting technology that the working group would like to follow up on, we can talk about coming back and doing a demo.
K: We also want to continue to socialize this with perhaps another working group at the OpenSSF, too, because we think it has broad applications. But if the working group here is interested in this, Dustin, whatever the most appropriate way to capture some feedback would be, that would be appreciated. We'd like to continue to preview this, and we may even discuss it publicly next week at the member summit, but we'd like to come back and line up a demonstration as appropriate.
A: Yeah. So, this is a working group that's focused on the software repositories themselves, which is why I was sort of asking a little bit about malware, because that's something we've discussed in the past. So, yeah, I think there are maybe some ideas about how a software repository could use a service like this; malware is the first thing that comes to mind for me. But I think we also have a working group around identifying security threats.
A: So that opens things up, and they might be interested to see this. And, yeah, I can't think whether there's another group that might be a good fit as well, that might like to hear about this. But...
A: So, yeah, I'll say thanks for coming and chatting with us. This seems super interesting. I'd say definitely come back when there's a service that we could play with; I think we'd be very interested to see that and try to experiment with it a little bit.
K: Yeah, we'll also try to come back at a time that's maybe more EMEA-friendly as well, so we capture that side of the group too.
A: Yeah, you're welcome to do that. We do record these as well, so, ideally, folks can watch the presentation; but it's hard to do Q&A over a recording. Okay. So, moving on with the agenda real quick: let me just take over the screen.
A: Okay, you can hear me now? Yeah. So, we have Justin here on the call, and Justin's been working with folks in the Python ecosystem for a while on a proposal, one that's been around for a while, to integrate TUF into the Python Package Index, to do repository-signed metadata and artifacts.
A: This has been in progress for a very long time, but we're at the point now where one of the things that sort of came out of this work was the creation of a design doc that very specifically describes how an existing repository would add support for TUF. I wanted to share that with this group here, for folks from other language ecosystems that might be thinking about integrating with TUF, because I think this would be a really helpful resource; but, also, Justin's here to answer questions.
A: I think so; I wasn't sharing anything, but it's this live line item in the notes. It's actually quite a long document, but it's about how PyPI will integrate with TUF.
G: Hi, this is Simona. I'm one of the people at NYU who've worked with Justin a little bit on this, and I just wanted to mention that there are a number of people who worked on this; you'll see their names in the comments, especially.
G: On the document, I just want to give a shout-out to the people at NYU and elsewhere who have helped contribute to this, to make it more likely that we'll be able to work out all the design details so that we can integrate TUF here.
F: Okay, yeah, I'm happy to take that. So, we've been talking with the Sigstore folks, and I think the likely path forward for Sigstore is going to be that the key type in Sigstore, which is handled by Fulcio, will become a key type inside of TUF. So, effectively, people that want to integrate Sigstore will integrate TUF and then use that key type, rather than just doing Sigstore and not getting the namespacing and related things from the TUF side.
F: So right now we're in the process of working through some of those details. As for how in-toto relates to this: there was a very excellent blog post put out by someone at Datadog. Let me think. Oh yeah, it was Trishank, who described a lot of the integration work and things like that, and really did some fantastic work in that area.
F: So, you know, we would probably follow that model of integration, because that's one of the common pathways that we've had for TUF and in-toto integration in the past, and it will probably be that way in the future, given some of the related work for doing it and adopting it in other places.
M
I don't have a pretty presentation, but I talk better when I have some points, so I hope you bear with me with my Google Doc presentation. Okay, well, hello, I'm Betty from Shopify. As Dustin mentioned, I'm just giving an update on how the proposal presentation went at the TAC meeting yesterday.
M
Also, as Dustin mentioned, the shared repository help desk proposal is largely championed by Jacques, so I'm the less charismatic person that you're stuck with for the moment. I also don't have as much context on this, so I can't speak to it as well as y'all can. That said, here we go: some background context on what the proposal is, for anyone who's not aware.
M
So we have a shared help desk proposal that was shared and surfaced, and the key goal of it was to propose a solution to help software repositories manage MFA reset requests, since that is high risk and time consuming. The goal is for software repositories in general to roll out MFA requirements to larger and larger cohorts, and in order to support that, we proposed creating a shared help desk to alleviate some of the MFA reset request portion of the support burden that comes with it.
M
And so, as a recap from the last meeting of the Securing Software Repositories working group: Jacques had asked the working group if they were in favor of supporting bringing this proposal to the TAC. There were no objections; everyone supported it, and so the next piece was actually bringing it to the TAC.
M
No one was, and so I, as one of his team members, along with other team members, kind of held that torch for him in his absence. Ashley Pierce was the one who presented to the TAC, and the goal was to bring it to the TAC and ask them to bring it to a vote on whether they support the proposal.
M
It's currently too ambiguous for the TAC to vote on. I think that's fair feedback, and they had really great questions that they asked. So, instead of them voting on something they're unclear about, the goal was to take it back; we need to do some rework.
M
That said, there is some good news to share, which is that the sentiment and the theme that I think we took away from that meeting was that members of the TAC support this idea conceptually and agree that the problem is worth solving. So what I heard was that no one was saying no.
M
So that's what I'll be planning to do, probably later today if not tomorrow: I'll create an issue in our working group's repo, which is this repo right here. I'll share the link in the doc once I'm done presenting, and in there we'll have an issue where we can have the discussion, get more suggestions, and get feedback. The goal is to create a concrete list of questions that we can then focus on addressing. I also wanted to call out one more thing.
M
I know we're short on time, but in the conversations with the TAC and other members in that meeting, they had mentioned that today could have been a good opportunity for them to join us and have some discussions here. I know we're short on time, so it might not be viable at this moment. So, given that, I think the area where I'm going to focus everyone to start getting that feedback would be the issue that we're going to create in the working group's repo. I do have an ask.
M
You know, as I mentioned, this is largely Jacques's initiative; he's worked with this working group and he's gotten feedback from other ones as well. But I was wondering if anyone is available over the next few weeks, while Jacques is away, to continue making progress on it. If not, I think the best path forward is to, one, collect more feedback in that issue and gather those questions, and then, two, wait for Jacques's return to work on the proposal.
M
One more thing I want to highlight or call out is that the governing board meets to plan out the 2023 budget in early December. I think Jacques comes back around late November, so I don't know if that'll be sufficient time to get a more fleshed-out proposal in for the budget planning. But from what I understand, even if it isn't presented before then, it is possible, even early next year, to continue this conversation. So the door to getting this funded is not immediately closed.
A
If someone wants to lead that in the interim until Jacques returns, feel free to take ownership of the issue in that repository when it's created. And Betty, I'd ask you to just drop a link to that issue in the Slack channel and the notes and that kind of thing when you create it. Cool, awesome. Any questions?
M
I don't think I have any follow-up questions yet, but if we do have any, we'll funnel them your way.
A
Okay, five minutes left. Last thing on the list: Zach Newman is not here, but had thrown in an idea to talk about RSTUF at a future meeting. For the TUF folks around the call, maybe you could tell us what this is, and we can decide if we want to hear more about it.
N
It's a simulator for TUF. The idea is basically for it to be used as kind of a sandbox for the repository side of TUF, to experiment with the way repository setup could go for PyPI and others in the same sandbox. Should that work? I haven't worked on it myself; I think a lot of the folks at VMware have done a lot of the work there, and they're all, I think, in Europe, and so probably aren't here today. So maybe we can do it at some future meeting.
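The sandbox idea, experimenting with repository-side setup without touching a real deployment, can be illustrated with a toy in-memory repository. This is a made-up sketch of the concept only; it is not the API of the actual simulator or of python-tuf, and the class and method names are invented.

```python
# Toy sketch of a "repository sandbox": metadata versions live in memory,
# adding a target bumps the roles that describe it, and a fake client can
# "fetch" metadata without any network or real signing involved.
class InMemoryRepo:
    def __init__(self):
        # Version numbers for the four top-level TUF roles.
        self.metadata = {"root": 1, "timestamp": 1, "snapshot": 1, "targets": 1}
        self.targets = {}

    def add_target(self, path, content):
        # Register a target file and bump the metadata chain that lists it.
        self.targets[path] = content
        for role in ("targets", "snapshot", "timestamp"):
            self.metadata[role] += 1

    def fetch_metadata(self, role):
        # A real client would download and verify role metadata;
        # here we just return the current version number.
        return self.metadata[role]

repo = InMemoryRepo()
repo.add_target("packages/demo-1.0.tar.gz", b"...")
print(repo.fetch_metadata("timestamp"))  # → 2
```

The value of a sandbox like this is that repository operators can try out delegation layouts and update flows for an ecosystem like PyPI and see the metadata consequences immediately.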
A
Cool, okay. So, yeah, interesting. I'm guessing the IBM demo doesn't fit into four minutes; is that an accurate assumption?
I
So, a quick preview: we don't have a fancy UI yet. We are contributing one on top of it, but this is more about what's behind the scenes, how it's developed, kind of the core service. So, with this first UI, what I'm going to do is this: one day, while we were developing our technology, Ian gave me some random Debian packages, and he just asked me, can you tell me anything about these Debian packages?
I
So, okay, we are currently building this; let's try to figure out what this package is. We submit it to the system, and what it does behind the scenes is analyze this binary and try to collect all the information. This is the job ID I just submitted, and this will show the progress for the job.
Of course it started, that is great. And for the Debian packages, as you may know if you've worked with deb packages, a .deb is actually an ar archive, with the contents inside.
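The point about a .deb being an ar archive can be shown with a few lines of code. The sketch below builds a minimal fake archive in memory (a real .deb holds `debian-binary`, `control.tar.xz`, and `data.tar.xz` members) and lists its member names by walking the 60-byte ar headers; the helper names are ours, not from any demo.

```python
def ar_members(data: bytes):
    """List member names of an ar archive (the container format of .deb)."""
    assert data[:8] == b"!<arch>\n", "not an ar archive"
    names, off = [], 8
    while off + 60 <= len(data):
        header = data[off:off + 60]          # fixed-size ASCII member header
        name = header[:16].decode().rstrip(" /")
        size = int(header[48:58].decode().strip())
        names.append(name)
        off += 60 + size + (size % 2)        # member data is 2-byte aligned
    return names

def make_member(name: bytes, body: bytes) -> bytes:
    # Build one ar member header (name/mtime/uid/gid/mode/size + "`\n" magic).
    header = (name.ljust(16) + b"0".ljust(12) + b"0".ljust(6) + b"0".ljust(6)
              + b"100644".ljust(8) + str(len(body)).encode().ljust(10) + b"`\n")
    return header + body + (b"\n" if len(body) % 2 else b"")

# A minimal fake .deb: just an ar archive with two members.
fake_deb = (b"!<arch>\n"
            + make_member(b"debian-binary", b"2.0\n")
            + make_member(b"data.tar.xz", b"dummy"))
print(ar_members(fake_deb))  # → ['debian-binary', 'data.tar.xz']
```

Unpacking `data.tar.xz` from the archive is then what exposes the actual installed files, which is the step the demo walks through next.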
I
Of course, there are many dependencies inside; these are the two child nodes of the analysis, which I'm going to skip for now. If I just refresh: it needs to unpack what's inside the Debian binary, and there's the data.tar.xz, which also has children of its own. Okay, it's still unpacking it.
I
Yeah, so sorry about that; it already cleaned up somehow. This thing decided that. So, after that, what happened is that one of the files is actually this binary inside it, and then we query our knowledge graph: do we know anything about this binary? And this is the query wizard, for example.
What
do
we
know
anything
about
this
binary,
and
this
is
the
coriander
wizard,
for
example.
I
This query: this is one of the files we just submitted from the Debian package, and inside there is a file called "unknown". The first hit in the results is actually the match with wget 1.19.4, which has a match count of 118. These are the list of functions that matched with the specific binary we just submitted to the system, and then there is more detail about what each one's type is, what its size is, and what its value is.
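The matching step being demoed can be sketched as set intersection: count how many function signatures an unknown binary shares with each known package in the knowledge base and rank the candidates. This is a toy illustration with made-up signatures and scores; the real system's features and counts are its own.

```python
from collections import Counter

# Toy knowledge base: package → set of known function "signatures"
# (here just placeholder tokens; real systems hash normalized code).
knowledge_base = {
    "wget 1.19.4": {"f_read", "f_retr", "f_url", "f_host", "f_ftp"},
    "curl 7.58.0": {"f_read", "f_url", "f_easy", "f_multi"},
}

def closest_matches(unknown_sigs):
    # Score every known package by how many signatures it shares
    # with the unknown binary, best match first.
    scores = Counter({pkg: len(sigs & unknown_sigs)
                      for pkg, sigs in knowledge_base.items()})
    return scores.most_common()

unknown = {"f_read", "f_retr", "f_url", "f_host"}
print(closest_matches(unknown))
# → [('wget 1.19.4', 4), ('curl 7.58.0', 2)]
```

With real data the leading match count (118 functions in the demo) is what lets the analyst say the unknown file is almost certainly wget.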
I
And then this will show what the closest match is. That was the top one, and the second one is actually coming from the official deb package from Ubuntu. This is the package name, wget 1.19.4, and this is the wget inside that deb package, which also has the 118 functions matching the signature, or the genome, of the binary. And then, from here, I can confirm: yeah, looks like this is wget, and I think that would be my best guess.
A
That's perfect. No, that's great, yeah. I would say also, if you want to put together a little screen recording or video or something, we could share that in Slack; I think folks would be interested to see that as well. But yeah, that's great. All right, awesome. All right, we're out of time. Thanks, everyone, for joining, and see you in two weeks at the media-friendly meeting. Thank you all.