Description
CodeQL is GitHub's expressive language and engine for code analysis, which allows you to explore source code to find bugs and security vulnerabilities. During this beginner-friendly workshop, you will learn to write queries in CodeQL and find known security vulnerabilities in open source Java projects.
GitHub Satellite: A community connected by code
On May 6th, we threw a free virtual event featuring developers working together on the world’s software, announcements from the GitHub team, and inspiring performances by artists who code.
More information: https://githubsatellite.com
Schedule: https://githubsatellite.com/schedule/
A
Okay, so hi everyone. My name is Luke Cartey and I'm a security solutions architect in the professional services team at GitHub. Day-to-day I work with our customers to help them deploy and use the cool security features in our products, and one of my favorite parts of the job is teaching people how to write CodeQL, so I'm genuinely really excited to be presenting today's session on finding security vulnerabilities in Java with CodeQL.
A
So today, what we're going to cover is what CodeQL is and how you use CodeQL to identify patterns in source code, and then we will provide a hands-on session where we'll guide you through the process of writing a query to find a known security vulnerability in Apache Struts. We're going to start off with a short slide deck here, about 15 minutes, and then we'll move into the hands-on session. The session is designed for total beginners.
A
So don't feel like you need any existing experience of CodeQL. Also, although this session is focused on analyzing a Java project, you don't need extensive experience of Java either, and the CodeQL skills that you learn today will actually be transferable to analyzing other languages with CodeQL. Now, if you want to follow along with the hands-on session, you will need to set up the CodeQL development environment. The link is on the screen here to the repository containing the prerequisites.
A
A
B
A
A
So what can I do with it? Well, the main use case is to find bugs and security vulnerabilities in your code. Developers are human and, unfortunately, humans make mistakes. Not only that, but they make the same types of mistake over and over again. What CodeQL provides is a way to describe and automatically search for the patterns underlying those mistakes.
A
The CodeQL language is very high-level and very expressive, which makes it very quick and easy to write or refine queries and to provide really accurate results. What this allows us to do is to help automate repetitive security code review by taking the expertise of security researchers and allowing them to express their knowledge in a codified, readable, executable form in these queries.
A
So some might ask: why a new language? Aren't there already enough languages in the world? Well, CodeQL has a fairly unique combination of features which makes it ideally suited to writing queries for analyzing code. First, it is a logic programming language. What this means is that you specify logical conditions that describe the patterns you want to find.
A
These conditions are combined with operators such as and, not, and or. Second, it is declarative, which means that the order in which you specify the conditions does not matter and that there are no side effects. Together, these features mean you can focus on describing what you want to find, not how you want to find it. Instead, we provide an advanced query optimizer, which takes the high-level declarative code of your query and automatically compiles it to an efficiently executable form.
A
Thirdly, CodeQL is object-oriented, and this may seem very strange for a query language. We'll explain exactly what this looks like in more detail in a few slides' time, but the benefit here is the same as in other object-oriented languages: we can encapsulate data and operations on that data together, and we can benefit from inheritance, composition, and other object-oriented patterns.
A
A
A
The where clause specifies some conditions on those variables in the form of a logical formula, and the select clause specifies what the result should be. It can refer to variables that are defined in the from clause. There's also a series of imports at the top of the query, which let us reuse logic defined in other libraries. So this import java allows us to import the standard library for Java.
A
Let's take a closer look at this example and see if we can figure out what it's doing. The from clause is defined as a series of variable declarations, where each declaration has a type and a name. For example, this declaration here is for a variable called ifStmt, and it has the type IfStmt, which is from the CodeQL standard library for Java, which we imported above. Types represent sets of values.
A
So, for example, this IfStmt type here represents the set of all if statements in the program. Variables also represent sets of values, initially constrained by the type of the variable. So this variable here initially represents the set of all if statements in the program, and we will later refine that with the conditions that we apply. Similarly, the second variable here, block, represents the set of all blocks in the program, i.e. elements surrounded by curly braces, typically a sequence of statements.
A
Now, the where clause allows us to specify some conditions on these variables and combine them with logical operators such as and. Here we provide a condition that specifies that the block variable is equal to ifStmt.getThen(). There are two important aspects here. Firstly, ifStmt.getThen() shows the object-oriented nature of CodeQL: getThen is an operation that is provided on the type IfStmt, and it returns the then part of an if statement.
A
If you think of what an if typically looks like, it's: if some condition, then something happens, else something else happens. So getThen is the thing that happens if the condition is true. This is demonstrating the object-oriented nature of CodeQL. The second aspect here that's important is that this equals is equality, not assignment. We are not assigning this value into the variable block; we are saying that it is equal to this.
A
So if we step back a bit here, we can see that this query is actually finding if statements which have an empty then block. Finally, if we have a look at the last part of the query here, the select clause, this is what defines the result of the query. Here we can see that we select an element in the program and a message that we're going to report for this element. This is how we report static analysis issues in your code base.
A
It's a location in the code base and a message at that location. One way to think of the select is that it's going to produce effectively a table of results, where the first column is the if statements and the second column is the message associated with those if statements. What we're going to do now is take a brief tour through some of the reuse and encapsulation features in CodeQL before we actually start the hands-on session. These are the building blocks of queries.
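Pulling the pieces of the walkthrough together, the query on the slide can be sketched roughly as follows. The message string is an assumption, and the class names are the ones used in the talk; newer releases of the CodeQL Java library call the block class BlockStmt.

```ql
import java

// Class names as used in the talk; recent library versions rename Block to BlockStmt.
from IfStmt ifStmt, Block block
where
  // Equality, not assignment: block must be the then part of ifStmt.
  block = ifStmt.getThen() and
  // The block contains no statements.
  block.getNumStmt() = 0
// First column: the program element; second column: the message (wording assumed).
select ifStmt, "This if statement has an empty then block."
```

Because the language is declarative, swapping the two conditions in the where clause should not change the results.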
A
The first feature that we'll introduce you to is predicates. These provide a way to encapsulate portions of logic in the program so that they can be reused. You can think of them as a mini from-where-select. Like a select clause, they produce a set of rows in a result table. The difference here is that you can name the table of results and you can reuse it. So here's an example.
A
A
What we can say is that isEmpty takes a single variable declaration, block, and what it's going to do is represent the set of all blocks in the program which are empty. Now, how do we define that? Well, we need to put a condition in the body of the predicate, and this condition is the same condition that we had before.
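A minimal sketch of such a predicate, together with one way a query might call it, is shown here. The message string is an assumption, and Block is the class name from the talk (BlockStmt in newer library versions).

```ql
import java

// Holds for every block that contains no statements.
predicate isEmpty(Block block) { block.getNumStmt() = 0 }

from IfStmt ifStmt
// The predicate call is just another logical condition.
where isEmpty(ifStmt.getThen())
select ifStmt, "This if statement has an empty then block."
```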
A
We can then use this predicate to simplify our query by using it as a logical condition in the where clause, to say that the then part of the if statement must be empty. You can think of this in one of two ways. You can think of it in the logical way, as the logical conditions in the predicate being effectively inlined here, or you can think of it in a sort of set way. You can think: well, let's take the set of all...
A
The next feature that we'll look at today is classes. Classes allow you to define new types in CodeQL and, like all types, they describe sets of values. We've already seen two classes in CodeQL, Block and IfStmt, and those are defined in the standard libraries. For example, we could define a new CodeQL class to represent the set of empty blocks. The way that we define this is to use the keyword class.
A
We provide a name for our class, in this case EmptyBlock, and then we provide a set of supertypes, in this case Block. All classes in QL must have at least one supertype, and the supertypes define the initial set of values in our class. In our case, EmptyBlock starts with all the values in the Block class.
A
However, a class that can only represent the same set of values as another class is not particularly interesting. We can therefore provide what we call a characteristic predicate, which is this thing here. It looks a little bit like a constructor. Don't be deceived: it isn't really. But effectively what it allows you to do is to define some additional conditions that restrict the set of values further.
A
So in this case, what we're saying is we start off with the set of all blocks in the program, and what it means for a block to be an empty block is the same condition that we had before: that the number of statements in that block equals zero. You'll notice that within here we can use this magic variable called this. It refers to the block that we're starting with and allows us to define logical conditions on the instance of the class.
A
Now, so far this class is actually equivalent to the predicate solution that we saw previously. We're in fact supplying the same set of conditions, and we will calculate the same set of values as with the predicate case. In both cases we're calculating, effectively, a table that says: here are all the empty blocks in the program.
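The class described above might be written along these lines. This is a sketch using the class names from the talk (Block is BlockStmt in newer library versions); the small query at the end is only there to list the values of the class.

```ql
import java

// A new type: the subset of blocks that contain no statements.
class EmptyBlock extends Block {
  // Characteristic predicate: restricts the values inherited from Block.
  EmptyBlock() { this.getNumStmt() = 0 }
}

// Lists every empty block in the program.
from EmptyBlock block
select block
```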
A
A
Now, we can actually use this another way. The other way that we can use this is by defining a variable, a temporary variable here, and for this temporary variable we can specify the type directly as EmptyBlock. Now when we apply this logical condition here, we're only starting with the set of empty blocks to begin with, so we're only going to match this logical condition against the empty blocks. So again, this is equivalent to the previous queries that we've seen.
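As a sketch of this second usage, the variable is declared with the new type directly, so the emptiness condition no longer appears in the where clause. The message string is an assumption, and Block is the class name from the talk.

```ql
import java

class EmptyBlock extends Block {
  EmptyBlock() { this.getNumStmt() = 0 }
}

// block ranges over empty blocks only, so one condition is enough.
from IfStmt ifStmt, EmptyBlock block
where ifStmt.getThen() = block
select ifStmt, "This if statement has an empty then block."
```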
A
Okay, I've done more than enough talking now, so let's move on to the hands-on part of the workshop. If you haven't already done so, please follow the link on the slide and get VS Code and the CodeQL extension set up. If you have any questions, then please do ask on the Slack channel. As I said, there are people there to help answer your questions.
A
A
A
Okay, let's get started. This workshop is on finding an unsafe deserialization issue in Apache Struts. Serialization is the process of converting in-memory objects to text or binary output formats, usually for the purpose of sharing or saving program state. The serialized data can then be loaded back into memory at a future point through the process of deserialization. In languages such as Java, Python, Ruby, and C#, deserialization provides the ability to restore not only primitive data but also complex types, such as library and user-defined classes.
A
This provides great power and flexibility, but it also introduces a significant attack vector if the deserialization happens on untrusted user data without restriction. I'm sure you're all familiar with Apache Struts. It's a popular open source MVC framework for creating web applications in Java. In 2017, a researcher from the predecessor of the GitHub Security Lab found a CVE, CVE-2017-9805, which was an XML deserialization vulnerability in Apache Struts, and it was severe enough that it would allow remote code execution.
A
The problem occurred because, included as part of the Apache Struts framework, is the ability to accept requests in multiple different formats or content types. Apache Struts actually provides a pluggable system for doing this through this ContentTypeHandler interface, and that interface provides an interface method, toObject.
A
So you can define a new content type handler, say for XML or JSON, by implementing this interface and defining your own toObject method, which takes data in the first parameter in the form of a Reader and uses that to populate the target object, which is taken as the second parameter. Typically this reader, this input, is provided directly from the user without any sort of validation, so it can't really be trusted. And typically, the way a content handler works...
A
So what we're going to do in this workshop is write a query to find the CVE in a database that's specifically built from the known vulnerable version of Apache Struts. As I said before, if you haven't already, please follow the setup instructions for Visual Studio Code. You'll need to install the Visual Studio Code IDE and then install the CodeQL extension for Visual Studio Code. You can install it from the extensions panel here.
A
You also need to set up the starter workspace, and the reason for this is that the starter workspace has links to the standard libraries. If you don't set up the starter workspace, you don't get the standard libraries and you aren't able to easily write queries. Once you have the starter workspace, you'll need to open that within VS Code.
A
You also need to download and unzip the vulnerable database, and you'll need to choose this database in CodeQL, using Ctrl+Shift+P to open the command palette and selecting CodeQL: Choose Database. Then what you need to do is create a new file in the codeql-custom-queries-java directory called UnsafeDeserialization.ql. You can see that's what I've done here: I've got this UnsafeDeserialization.ql file, and I've got this open underneath here.
A
Okay, so the way this workshop is going to work is that it's split into several steps. It's up to you: you can either write one query per step, or you can write a single query that you refine at each step. I'm going to go for the approach of writing a single query, refined at each step. Each question has a hint associated with it, and that hint describes some useful classes and predicates in the standard libraries for Java.
A
You can explore these in your IDE, using the autocomplete suggestions and also the jump-to-definition command. What I'm going to do is, after reading out each of these questions, pause and give you the opportunity to answer it for yourself. I'll leave a gap of around two minutes for these first ones, and maybe a little bit longer for some of the later ones.
A
What I'll do is actually start a two-minute timer in VS Code, and at the bottom here it will give us a timer to work to, just so we keep to time. Don't worry if you get left behind: we're going to build in a couple of pauses to answer questions and things, and if it's going too fast or too slow,
A
You can ask questions on the Slack channel at the same time. Now, before we get started, I'm just going to show you a couple of features of how to actually use VS Code to write CodeQL. You can see we've got this UnsafeDeserialization query file here, and we've added the import at the top for the standard library. What I'll do is just write a simple query here.
A
Let's write a query just to find all the if statements in the program. You can see here we get autocomplete when we type from. We can hit Ctrl+Space here to bring up the autocomplete as well. You can see it's giving us type suggestions, because the first thing we need to do is provide a CodeQL type.
A
So, IfStmt: you can see that it's provided us with the IfStmt type as an option. We can hit Enter, we can provide a name for this if statement, and we can provide a select, ifStmt. Again, you can see it's also completed that, and we can save that file. You'll note here that you can actually write a from-where-select without the where; in fact, only the select is mandatory. Right, now we can run this query. There are two ways that we could run the query.
A
One is to right-click in this file here and click CodeQL: Run Query. The other way, which I typically use, is to open the command palette, again that's Ctrl+Shift+P in VS Code, and choose the CodeQL: Run Query option here. So you can run this query and you see we get the results back on this right-hand side. The results are numbered, and the ones that are elements in the program, the ones that have locations in the program, are actually clickable links.
A
If I click on this first one, you can see it takes us to the read-only copy of the Apache Struts source code, to the line at which this program element exists. So we can see here, this was all the if statements in the program. There are eight and a half thousand of them, and if we click on them, it will take us to where each one of these is defined.
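The warm-up query typed during this demonstration is small enough to show in full. This is a sketch of what was described; the variable name is an assumption.

```ql
import java

// No where clause is needed: this simply lists every if statement.
from IfStmt ifStmt
select ifStmt
```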
A
Okay, so section one: finding XML deserialization. XStream is a Java framework which is used for serializing and deserializing Java objects to and from XML, and it's used by Apache Struts. It provides a method called fromXML, which is used for deserializing XML to a Java object. By default, the input is not validated in any way. XStream does come with some validation features, but they are not on by default, and so it is vulnerable to remote code execution exploits.
A
A
A
B
A
A
Okay, let's go through the answer to question two. We have the query from MethodAccess call select call, and the question is to update the query to report the method being called by each method call. If we expand out the hints, the suggestions are to add a CodeQL variable called method with the type Method; that MethodAccess has a predicate called getMethod for returning the method; and the final suggestion here is to add a where clause.
A
So this is similar to the if statement and empty block query. Again, we have two variables here and we're just relating them with a logical condition. We're saying: find all pairs of method accesses and methods in the program where the method being called by the method access is this method variable, and just report all of these. What we can see is we now have two columns of results, where the first column is the method call, the same as before, and the second column is the method being called.
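Following the hints read out above, the answer to this step might look like the sketch below. MethodAccess is the class name used in the talk; recent versions of the Java library call it MethodCall.

```ql
import java

// Every pair of a call and the method that the call targets.
from MethodAccess call, Method method
where call.getMethod() = method
select call, method
```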
A
A
Okay, we'll go through the answer to question three. If we expand out the hint here, you see the hint says Method.getName() returns a string representing the name of the method. So the way that we've changed it is we're going to add a second condition into the query, to say that the name of the method that we've identified as the target of the call equals fromXML.
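A sketch of the query after this step, with the second condition added. MethodAccess is the class name from the talk (MethodCall in recent library versions).

```ql
import java

from MethodAccess call, Method method
where
  call.getMethod() = method and
  // Keep only calls whose target method is named fromXML.
  method.getName() = "fromXML"
select call, method
```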
A
A
And we can see we now only get two results. Remember, from where we started at the top, we had 68 thousand results for that. We've narrowed them down to only the calls to the fromXML method, and again we can click in here and jump to each location. All right, question four: the XStream.fromXML method deserializes the first argument, i.e. the argument at index 0. Update your query to report the deserialized argument.
A
Before we start that, there is one optimization we can actually do for this particular query. In general, for this query, we are not actually going to need the method itself. The thing we'll be interested in is the call, so we could actually just remove the method from the select clause here.
A
This is exactly the same, apart from the fact that we're only reporting the call. Again, you can read this as we read the other logical conditions, as saying: for a method call, get me the method, get the name of that method, and assert that it is called fromXML. So this is just a simplification of this query, and if we run it again now, we can see we still get the same two results here. Okay.
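One way to write the simplified query just described is to chain the two member predicates, so the separate method variable is no longer needed. This is a sketch using the class name from the talk.

```ql
import java

// For a call, get the method, get its name, and assert it is fromXML.
from MethodAccess call
where call.getMethod().getName() = "fromXML"
select call
```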
A
A
A
A
Okay, let's go through the answer. The hint here is that MethodAccess.getArgument(int i) returns the argument at the i-th index, and that arguments are expressions in the program, which are represented by the CodeQL class Expr. So the suggestion here is to introduce a new variable to hold the argument expression. We can do that by saying Expr arg, and then we can add an additional condition here to tie together the arg and the method call. What we're going to say is that arg is equal to call.getArgument(0).
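Put together, the query for this step might look like the following sketch. Selecting both the call and the argument is an assumption; the question only asks for the argument to be reported.

```ql
import java

from MethodAccess call, Expr arg
where
  call.getMethod().getName() = "fromXML" and
  // Index 0 is the first argument, the one that gets deserialized.
  arg = call.getArgument(0)
select call, arg
```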
A
A
A
This question is about converting your previous query into a predicate which identifies the set of expressions in the program which are deserialized directly by fromXML. We've got a template here that you can follow, and this actually introduces a new concept in CodeQL, which is this exists. You can think of an exists as a mechanism, effectively, for introducing temporary variables with a restricted scope. Again, a bit like predicates themselves, you can think of them as their own mini from-where-select.
A
You have a series of variable declarations at the top here, and then some conditions in the body that must hold. We're using this exists here because we actually don't care about the method access itself outside this predicate. What we want to get out of it is the argument that is going to be deserialized.
A
Okay, let's go through the answer here. Let's start off by copying the template that we provided above. What we're essentially going to do is take the contents of the where clause that we wrote before and put them in the body of this exists, because the conditions that we want to apply here are effectively the same. I will need to do a little bit of renaming, because we previously called the method access call, and now we're going to call it fromXML. And so, from a set point of view...
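The conversion described above can be sketched as follows. The predicate name isXMLDeserialized is an assumption, since the template itself is not read out in the transcript.

```ql
import java

// Holds for expressions passed as the first argument of a fromXML call.
predicate isXMLDeserialized(Expr arg) {
  // fromXML is a temporary variable, scoped to this exists only.
  exists(MethodAccess fromXML |
    fromXML.getMethod().getName() = "fromXML" and
    arg = fromXML.getArgument(0)
  )
}

from Expr arg
where isXMLDeserialized(arg)
select arg
```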
A
C
A
Is there an argument... oh, I see. Okay, I understand the ask. Actually, if we have a look at these predicates here, if you mouse over them we get some context-specific help about what each one does. You can see the getArgument predicate here gets the argument at the specified zero-based position in this method access, and you can also see that it has a signature here, which helps us understand what parameters it takes and what it might return.
A
So in this sense it doesn't make sense to have a specific QL type for argument, because this can be any type of expression. So you may as well use the same QL class that we have defined for that. Hopefully that answers that particular question. Is there any feedback on timings or anything?
C
Not at the moment. Someone's asking if there's human-readable documentation outside of Visual Studio available for all the predicates and libraries. Yes.
A
A
So you can see all this QL documentation, the stuff that comes up in the overlay here: that's all published to the public help site, and Learning CodeQL for Java also has tutorials for using different aspects of the standard library, such as data flow or using the AST nodes. You should be able to find everything that you need there, as well as in VS Code.
A
A
A
A
Well, I'll start off with that kind of exists. Exists is a special kind of component in QL called a quantifier, specifically an existential quantifier. If you've done any formal logic before, you'll recognize this terminology. The structure of this exists is that it has a series of variable declarations and then some conditions within the exists. Typically, in the simple case, you'll just see us using the standard logical connectives to combine the formulas that we need to specify for the body of the exists.
A
Now, there's also a two-part version of this, and the reason there exists a two-part version of this is because there are other quantifiers. In particular, there's a quantifier called forall, which, again, if you've done any formal logic, is terminology you'll recognize, and which talks about having a series of declarations, a range for those declarations, and something that has to be true for all of them.
A
Now, in this case that doesn't particularly make sense, but the exists syntax that uses a second bar is there because the forall syntax requires the bar. In actual fact, for exists this is entirely equivalent: the way I just described it with the two bars here, we have a range and something that needs to be true for at least one thing in the range.
A
Well, if you think about that a little bit, that's actually just the same as combining these two conditions with and. Now, one area where it can be a little bit helpful to use this is if you have something like this, where you might add an or on fromXML.getArgument, maybe to say that another argument could also be deserialized as well. Well then, you could write it like this, but you can see...
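To illustrate the two forms discussed above, here is a sketch with hypothetical predicate names. The second argument index (1) is purely an assumption for illustration; the workshop itself only treats index 0 as deserialized.

```ql
import java

// One-bar form: the range and the condition are joined with `and`,
// so the `or` needs its own parentheses.
predicate deserializedOneBar(Expr arg) {
  exists(MethodAccess fromXML |
    fromXML.getMethod().getName() = "fromXML" and
    (arg = fromXML.getArgument(0) or arg = fromXML.getArgument(1))
  )
}

// Two-bar form: range first, then the condition. Same meaning,
// but the `or` no longer needs extra parentheses.
predicate deserializedTwoBar(Expr arg) {
  exists(MethodAccess fromXML | fromXML.getMethod().getName() = "fromXML" |
    arg = fromXML.getArgument(0) or arg = fromXML.getArgument(1)
  )
}

from Expr arg
where deserializedTwoBar(arg)
select arg
```

Both predicates should select the same expressions; the choice between them is a matter of readability.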
C
A
So I think there's two answers to that question. If you think about the fundamental process that we're doing here, which is building a database of facts and then writing queries on a database of facts, then there's no theoretical reason why you couldn't do this also on binaries, right? There's no kind of limitation there.
A
So that's the theoretical side. The practical side is that we don't support analyzing binaries at the moment. The only type of binary support that we do have is for C#, and for C# we can analyze DLLs, assemblies that are written in .NET and have bytecode, so-called CIL. Now, one of the reasons that we don't support this is because it can be tricky. Firstly, the main purpose of what we do is static analysis of source code, and so it
B
A
Hasn't necessarily been our focus to look at binary analysis, because it's a different area and a different topic, and often different use cases. One of the challenges with binary analysis is that what we usually focus on is trying to report results to developers so that they can actually fix them. If you do a binary analysis, then it's tricky to correlate that back to some source code where you can actually report an issue. So this is one reason why, from a pragmatic point of view, we haven't really
A
We haven't really looked at it. Now, the C# case is actually quite interesting, because what we do for C# is analyze the assemblies that your source code depends on. When you do a build of a C# project, typically when you're doing the compilation you depend on some DLLs, some assemblies, and we actually analyze those assemblies and their bytecode.
A
We put a database representation of the bytecode in the database alongside the database representation of your source code, and then we use that to analyze data flow through your libraries as well as your source code. So that's one area where binary analysis, or bytecode analysis, can be really useful for source code analysis. As I say, that's something that we have for C# at the moment. We don't have it for other languages.
A
A
Right, great. Okay, hopefully everyone's ready to move on to section two. Section two is about finding implementations of the toObject method from ContentTypeHandler. Like predicates, classes in CodeQL can be used to encapsulate reusable portions of logic, and we saw this in the slide deck. Classes represent single sets of values and, importantly, they can also include operations, known as member predicates, that are specific to that set of values.
A
We've seen numerous instances of this already: we saw IfStmt and IfStmt.getThen, MethodAccess and MethodAccess.getMethod, Method and Method.getName. One thing I didn't show you before was the jump-to-definition feature that's available in CodeQL. If you right-click on a type, you can choose Go to Definition. You can also do that using F12, and you can actually see the source code of the class that we're depending on here. And so actually all the CodeQL classes,
A
All the standard library is actually open source. It's all available in the github/codeql repository; there's a link at the bottom there. You can jump to the definition here and see which member predicates are defined. You can see here, for MethodAccess, definitions of things like getAnArgument, getArgument, and the getMethod implementation we were looking at before. That's often helpful when you're exploring the standard libraries.
A
So the first question here is to create a CodeQL class called ContentTypeHandler to find the interface org.apache.struts2.rest.handler.ContentTypeHandler. As a reminder, there's a template here of what a class looks like. In this case we're going to extend the standard library type called RefType. RefType here stands for reference type, and this is basically the set of things, like classes in Java, that are referred to by reference.
A
A
A
Let's go through the answer. Again, the question was to write a CodeQL class called ContentTypeHandler to find this particular interface, and you can use this template to fill in. So what we're going to do is actually write this out. We can say class ContentTypeHandler. Remember, classes have a name and they have to extend an existing type.
A
In this case we're going to extend RefType, which is classes, interfaces, type parameters, arrays, and so forth. That would be a valid class in and of itself, but it still only represents the set of all the ref types in the program. So we need to provide a characteristic predicate that specifies what conditions make content type handlers special from the rest of the ref types, and the answer here is that they have a name which looks like this.
A
If we have a look at the hint, it says: use RefType.hasQualifiedName, which takes two parameters, package name and class name, to identify classes with the given package name and class name. We've got a little example here of what that looks like. You can say from RefType r where r.hasQualifiedName, with a package name and a class name, in this case java.lang and String, and then you can select r. And remember,
A
As we talked about before, within the characteristic predicate you can use the magic variable this to refer to the ref type. So we're going to say this.hasQualifiedName, and what we're going to do is copy out this name in two parts here: one that specifies the package, and then the type name separately.
A
And there we go. What we're saying here is that a ContentTypeHandler is a RefType which has the qualified name org.apache.struts2.rest.handler.ContentTypeHandler. Now, it would be nice to be able to test this without having to write a whole query, and in fact we can actually do that. We can right-click either on the class or on the characteristic predicate and choose quick evaluation.
A
You can run Quick Evaluation on predicates, on classes and on complete formulas, and this is a way that you can very easily start to debug and further understand your CodeQL. We've written this class, and we want to make sure it's returning the right thing before we continue with the next part of the query. You can see this has returned one result: ContentTypeHandler has returned the interface ContentTypeHandler, and that's what we expect, because there's only one definition of this interface in the Struts codebase.
All right, so question two: create a CodeQL class called ContentTypeHandlerToObject for identifying methods called toObject on classes whose direct supertypes include ContentTypeHandler. Okay, that's a bit of a mouthful. If you're not sure where to start, then you'll be able to go to the hint here.
A
Okay, we'll go through the answer now. If we expand out the hint here, there are a few hints. The first one is: use Method.getName() to identify the name of the method. I think we've seen that before. And to identify whether the method is declared on a class whose direct supertypes include ContentTypeHandler, you will need to, firstly, identify the declaring type of the method using Method.getDeclaringType().
A
We need to add a characteristic predicate. So far this is pretty similar to the RefType one that we have above, except that we're starting from the set of all methods. We want to refine it to only those methods that are called toObject and that are on something that implements this ContentTypeHandler interface. Okay, so the first thing we're going to put in is that this.getName() equals toObject.
A
Then the second part that we're going to put in is this aspect here. Firstly, we need to know what type this method was declared on. We can say this.getDeclaringType(), which gets the type in which this member is declared. Now, having got the declaring type, what we want to know is whether that type, the type this method is declared in, extends ContentTypeHandler, because not all the methods named toObject in the program may be on things that extend ContentTypeHandler.
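Putting the two conditions together gives something like the sketch below. The supertype check is an assumption: the captions do not preserve which predicate the hint names, so this uses `RefType.getASupertype()` from the standard Java library together with `instanceof`.

```ql
import java

class ContentTypeHandler extends RefType {
  ContentTypeHandler() {
    this.hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler")
  }
}

/** A `toObject` method declared on a type that directly extends or implements `ContentTypeHandler`. */
class ContentTypeHandlerToObject extends Method {
  ContentTypeHandlerToObject() {
    this.getName() = "toObject" and
    // Assumed predicate: getASupertype() yields the direct supertypes of the declaring type.
    this.getDeclaringType().getASupertype() instanceof ContentTypeHandler
  }
}

// Added for illustration: list the matching methods.
from ContentTypeHandlerToObject m
select m
```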
A
There's really no preference. Under the hood these are both predicates: this one is a predicate that has a result type, and this one is one that takes a parameter. Under the hood it's calculating the same thing, so it's more about personal preference as to what reads better to you. For simplicity, I've been using the same one all the way through, making sure I use this.getName() equals, so I'll continue to do that.
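The two styles being compared can be sketched side by side. The captions do not name the second predicate, so `hasName` is an assumption here; it is the parameter-taking counterpart of `getName()` in the standard Java library.

```ql
import java

from Method m
where
  // Predicate with a result: compare the returned name to a string.
  m.getName() = "toObject" and
  // Predicate with a parameter (assumed to be the alternative discussed): same check, redundant here.
  m.hasName("toObject")
select m
```

Either condition alone selects the same methods; both are written out only to show the two forms.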
C
C
A
So we do this for some of our languages, and there are kind of two ways that we can look at dependencies. As I said before, for compiled languages like C# or Java, when you compile the code you have to provide a reference to the libraries that you're actually compiling against. So when we populate our database with facts about the program, we observe that compilation.
A
We look at the arguments to the compiler and we see that you've passed in these particular dependencies, and so when we build our database of facts, we store a reference to the fact that you depend on these particular binaries. Now, it's not always a straightforward process to go back from those binaries to find out what particular version they were; it's possible for some languages. For other languages, like JavaScript or Python,
A
We depend a little bit more on package references in the repository itself, so in those cases we can actually build those package references into the database and potentially query those as well. Now, this is an interesting question because it touches on the vulnerable dependencies topic, and GitHub actually has a native feature that looks for vulnerable dependencies in your source code. It doesn't use CodeQL.
A
Rather, it's just looking at your dependency definitions, things like your Maven POM files, or your packages for Python, or your npm module dependencies for JavaScript. That's sort of a lightweight solution, and what it can do then is match that against a database of known vulnerabilities for different library versions, and it reports that within GitHub itself. So that's a feature that's already in GitHub. You can already use that today, on open source code and private code as well. Now, the value that CodeQL can bring...
A
What CodeQL can also bring is knowledge of which aspects of those libraries you use. One problem with typical vulnerable dependency analysis today is that it'll say you're depending on a vulnerable version of library X, but we don't know if you're using it in a way that actually makes you vulnerable.
A
So, for example, if there's just one API call in this library you're using that is known to be vulnerable, and that's why, say, a CVE was raised on it, the kind of vulnerable dependency analysis that doesn't use CodeQL doesn't know that. So one interesting aspect here is whether you can use CodeQL to help identify unsafe uses of vulnerable libraries, and there are actually a couple of examples of public queries we have that do that. I think the JavaScript prototype pollution query is one.
A
C
A
The reason that we do this is because, for compiled languages, it is often important for accuracy to actually see what the build does. There are numerous examples of the build generating files as it goes along, which help tie the project together, and of builds using multiple different versions of dependencies, so you don't know which version is going to be used until the build actually occurs.
A
These kinds of aspects are only really revealed when you observe a build, and so for that reason, for buildable languages, we want a build command in order to be able to do the extraction. This does mean that we can build partial projects: if you can compile something, then we can build it. So if you can, say, compile one component of a project, then you can build a database for it. But if you can't compile it, then unfortunately you can't build a database for those ones. And I should note as well that this is quite important for Java and C#, and very important for C and C++. It is extremely difficult to do any kind of sensible source code analysis on C and C++ without running a build, because C and C++ files often use the preprocessor, which effectively arbitrarily rewrites the files at build time. So for C and C++, yeah.
A
C
C
A
So at the moment we don't have any other languages on the roadmap, but we are continually evaluating the set of languages that we support. The considerations there are typically the popularity of the language, and also the rate of change of that popularity (is the language becoming more popular or less popular?), and also, obviously, commercial decisions as to what our customers are interested in. So this is definitely an area where we value feedback from the community.
A
So the language itself is not open source. I should say, there's a language specification that we publish openly, but the implementation of the language, the compiler and, in particular, the query optimization engine are all closed source. The aspect which is open source is the queries themselves: the queries for all of our languages are open source, and they're in the github/codeql repository. The motivation here really is that we think the way to best help the world secure...
A
You might remember the Zip Slip vulnerability. That was, well, at least labeled by Snyk, and they found a bunch of cases in open source software. At the time, and still at the moment, Microsoft was one of our biggest customers and one of our biggest users of CodeQL. They saw this Zip Slip vulnerability, which is essentially a tainted path vulnerability, revealed, and the very next day, in about twenty minutes,
A
They wrote a query to find instances of Zip Slip in C# code. The day after that, they had run it on dozens of their largest codebases, and the month after that they released the query to be open sourced. So there's a nice pattern here of, firstly, being able to rapidly identify new and emerging security vulnerabilities and, secondly, contributing those checks back to the community so that everybody benefits from them. All right, thanks.
A
OK, let's move on to the next question, the next exercise here, and we can come back to more questions later on in the process. All right, so question three: toObject methods should consider the first parameter as untrusted input. Write a query to find the first (i.e. index 0) parameter for toObject methods. And so the way I suggest you do this, if you've been following along with what I've been doing, is...
A
Okay, let's go through the answer here. So, reading the question: toObject methods should consider the first parameter as untrusted user input, so we just want to write a query to find the first parameter for toObject methods. If we look at the hints, it says: use Method.getParameter to get the i-th index parameter, and create a query with a single CodeQL variable of type ContentTypeHandlerToObject. So let's do that.
A
We're going to call this variable toObjectMethod, and we're going to select toObjectMethod.getParameter. We want the zeroth parameter, so this is going to return the parameters in the program that we think are taking in untrusted user input. You can see that there are eight of them here.
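The query for question three, as described, can be sketched like this. The two class definitions are repeated so the block stands alone, and the supertype check via `getASupertype()` is an assumption, since the captions do not preserve the exact predicate used.

```ql
import java

class ContentTypeHandler extends RefType {
  ContentTypeHandler() {
    this.hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler")
  }
}

class ContentTypeHandlerToObject extends Method {
  ContentTypeHandlerToObject() {
    this.getName() = "toObject" and
    this.getDeclaringType().getASupertype() instanceof ContentTypeHandler
  }
}

// The first (index 0) parameter of each toObject method is treated as untrusted input.
from ContentTypeHandlerToObject toObjectMethod
select toObjectMethod.getParameter(0)
```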
B
A
There are eight implementations here in this version of Apache Struts that implement the toObject method. You can see here we have, for example, the HTML handler, whose toObject method is empty; the JSON-lib handler, which is obviously doing some JSON processing; the multipart form handler, which doesn't do any handling in toObject; the XStream handler, which is handling XML; and a couple of test ones here as well. So we seem to be picking the right things. And again, if we have a look at what's happening here, look at the JSON one.
A
We'll assume not, so let's move on. All right, section three: unsafe XML deserialization. We've now identified the places in the program that receive untrusted data and the places in the program that perform potentially unsafe XML deserialization. What we want to do now is combine those together and ask the question: does this untrusted data flow to the potentially unsafe XML deserialization call? In program analysis, we typically call this a data flow problem. Data flow analysis helps us answer that question.
A
Now the way to think about data flow problems is to visualize them as one of finding paths through a directed graph. This directed graph has nodes, which are elements in the program, and edges that represent the flow of data between those elements. Once you have this graph, you know that if a path exists between two nodes in the graph, then data flows between those two nodes. I've got a little example here, so consider this example Java method.
A
So it's a function, it returns an integer, and it takes a parameter that we've called tainted, which is an integer. You can see that this value effectively propagates through this method in different ways. Firstly, the tainted parameter is accessed and stored in x. Then, depending on the value of some condition, x is either stored in a variable called y and then used in a call to foo, or x is just returned. And in the final case we return -1.
A
Now, if we want to think about what the data flow graph for this method will look like, well, it'd be something like this. We will have a data flow node that represents the parameter, and that's going to be called tainted. This represents this piece of source code here. And we also have a data flow node representing the access of that parameter here, which is going to be an expression.
A
Now, of course, this parameter is stored in the variable x, and x is then used in one of two ways: either it's assigned to y, or it's used in this return. Well, what does that mean for the data flow graph? It means there are going to be two nodes in the data flow graph: one to represent the access of x here, and one to represent the access of x there. You can see this is represented by these two nodes here.
A
Then y is used in the argument to the call to foo, which we know is an expression, so there's another expression node here, y, to represent this bit of the program, which again has data flow from this assignment of x into the variable y. So this is the data flow graph that we'll be analysing for something like this, and the thing to bear in mind is that this is all about flow through this graph.
A
If there's a path through this graph from, say, tainted to y here, then data flows between the two. Now, CodeQL for Java provides data flow analysis as part of its standard library. You can import it using semmle.code.java.dataflow.DataFlow, and the library models nodes using the DataFlow::Node CodeQL class. This data flow graph is separate and distinct from the AST that we talked about before, so you'll effectively have in the database one tree that represents the basic structure of the program, and the data flow nodes are a separate representation. The reason for this is that we want to provide some flexibility in how data flow is modeled, so we don't want to be tied to only having data flow nodes for things that appear in the AST. There are a small number of data flow node types. The most common ones, the ones we saw above, are the expression nodes and the parameter nodes, but there are other types as well, such as definitions and so forth.
A
Now what I'm actually going to do is write out this data flow template from scratch in my query, to explain what each of these features is for. Feel free to copy the template from the markdown file once we've finished going through the explanation. All right, so the first thing I'm going to do is add a new import to our query, semmle.code.java.dataflow.DataFlow, and this imports the data flow library so that we can use it.
A
The next step is that you have to define what we call a data flow configuration. The data flow configuration specifies what the sources are for this data flow problem and what the sinks are. In other words, what are the things that we want to find flow from, and flow to? The reason that we have a configuration for this is that what we're doing here is global data flow, or interprocedural data flow. We're looking across the whole program as we see it in the database, and we're actually finding flow across method boundaries and across file boundaries, across the whole program. Because of the scale of this, it's not possible up front to build a table saying this is how data flows from every node in the program to every other node in the program. So what we're doing when defining a configuration here is specifying the conditions that make up the source and the sink, so that we can calculate this data flow on a more restricted set of data.
A
Now we actually have a counterpoint to this global data flow, called local data flow, and that we can actually compute for the whole program, because we just restrict it to the data flow dependencies within a single method. That doesn't need a configuration; it has a simple predicate that you call that says: is there local flow between x and y, for example. Okay.
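As a rough illustration of local data flow, the sketch below asks whether the `tainted` parameter of the example method reaches the argument of the call. The Java method in the comment is reconstructed from the spoken description, so its exact shape and the names `func`, `someCondition` and `foo` are assumptions. `MethodAccess` is the class name from the library version used around the time of this session; newer releases call it `MethodCall`.

```ql
import java
import semmle.code.java.dataflow.DataFlow

/*
 * Java method assumed for illustration (reconstructed, not taken from the slides):
 *
 *   int func(int tainted) {
 *     int x = tainted;
 *     if (someCondition) {
 *       int y = x;
 *       foo(y);
 *     } else {
 *       return x;
 *     }
 *     return -1;
 *   }
 */

from Parameter p, MethodAccess call
where
  p.getName() = "tainted" and
  call.getMethod().getName() = "foo" and
  // Local flow: zero or more data flow steps, all within a single method.
  DataFlow::localFlow(DataFlow::parameterNode(p), DataFlow::exprNode(call.getArgument(0)))
select p, call.getArgument(0)
```

No configuration class is needed here because the analysis never leaves the enclosing method.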
A
And so we should note that this is a special kind of CodeQL class. We call it a configuration class, and its purpose is slightly different to the classes that we've seen before: it's actually designed as a wrapper, a container for a series of related predicates. In this case, a predicate for defining sources and a predicate for defining sinks. By defining these within a single class, it means that we can tie the source and sink definitions together.
A
Okay, and so what we're going to need to do is fill this in, in order to specify what the sources for our problem are, and we're going to need to fill this in, in order to specify what the sinks for our problem are. This is going to be our first question. But before we get to the question, I want to show what this actually looks like in a query. The way this works in a query is that you specify the configuration you're interested in.
A
Now we can see StrutsUnsafeDeserializationConfig appears in autocomplete, and we can use that. So typically you select the configuration you're interested in, you have a variable for each of the source and the sink, and then in the where clause you say config.hasFlow(source, sink). This is applying the data flow analysis for this particular configuration.
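A skeleton of the configuration class and the query that uses it might look like the sketch below. It follows the class-based `DataFlow::Configuration` API used in this session; more recent CodeQL releases replace it with a module-based API, so it may need adapting. The `none()` bodies are placeholders added here, not part of the workshop template.

```ql
import java
import semmle.code.java.dataflow.DataFlow

class StrutsUnsafeDeserializationConfig extends DataFlow::Configuration {
  // The characteristic predicate gives the configuration a unique name.
  StrutsUnsafeDeserializationConfig() { this = "StrutsUnsafeDeserializationConfig" }

  // Placeholder: no sources defined yet.
  override predicate isSource(DataFlow::Node source) { none() }

  // Placeholder: no sinks defined yet.
  override predicate isSink(DataFlow::Node sink) { none() }
}

from StrutsUnsafeDeserializationConfig config, DataFlow::Node source, DataFlow::Node sink
where config.hasFlow(source, sink)
select sink, "Unsafe XML deserialization"
```

With both predicates returning `none()`, the query compiles but yields no results until sources and sinks are defined.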
A
A
B
A
A
B
A
A
A
Okay, so let's go through isSource. The hint here is that you can translate from a query clause to a predicate by converting the variable declarations in the from part to the variable declarations of an exists, placing the where clause conditions, if any, in the body of the exists, and adding a condition which equates the select to one of the parameters of the predicate. And remember to include the ContentTypeHandlerToObject class you defined earlier. So we've still got the predicate.
A
It says to place the where clause conditions, if any, in the body of the exists; we don't have any where clause conditions for this one. And to add a condition which equates the select to one of the parameters of the predicate. So what we're going to say is that the source is equal to toObjectMethod.getParameter(0).
A
Now we can't actually say that directly because, as I mentioned before, data flow nodes are not the same as AST nodes. You can see this is even highlighted here: Node is incompatible with Parameter. The way this actually works is that data flow nodes have a converter on them, which allows you to get the equivalent AST node for your data flow node. In this case, this is called asParameter.
A
For isSink, what we're going to do is the same as I said above, just following the hint: we're going to convert the where clause to a condition in the body of the exists, and we're going to equate a parameter of the predicate to the select. Before, we used asParameter because it was a parameter. In this case, it's actually an expression that we're interested in, because the argument to the fromXML call is an expression.
A
So you say sink.asExpr() equals arg, and save that here. Okay, so we now actually have a completed query that we could use. What we can do at this point, just to validate that we have the right things, is click on these isSource and isSink predicates, run them, and validate that we have the results that we expect. So here we can see those eight parameters from earlier.
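With both predicates filled in, the complete query could look roughly like this. The `isXMLDeserialized` predicate comes from an earlier part of the workshop that is not shown in this section, so its body here is a reconstruction and should be treated as an assumption, as should the `getASupertype()` check. The class-based `DataFlow::Configuration` API and `MethodAccess` match the library of the time; newer releases use a module-based API and `MethodCall`.

```ql
import java
import semmle.code.java.dataflow.DataFlow

class ContentTypeHandler extends RefType {
  ContentTypeHandler() {
    this.hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler")
  }
}

class ContentTypeHandlerToObject extends Method {
  ContentTypeHandlerToObject() {
    this.getName() = "toObject" and
    this.getDeclaringType().getASupertype() instanceof ContentTypeHandler
  }
}

// Assumed reconstruction: holds if `arg` is the first argument of a call to a method named fromXML.
predicate isXMLDeserialized(Expr arg) {
  exists(MethodAccess fromXML |
    fromXML.getMethod().getName() = "fromXML" and
    arg = fromXML.getArgument(0)
  )
}

class StrutsUnsafeDeserializationConfig extends DataFlow::Configuration {
  StrutsUnsafeDeserializationConfig() { this = "StrutsUnsafeDeserializationConfig" }

  override predicate isSource(DataFlow::Node source) {
    // The `from` variable becomes an `exists` variable; asParameter() bridges to the AST.
    exists(ContentTypeHandlerToObject toObjectMethod |
      source.asParameter() = toObjectMethod.getParameter(0)
    )
  }

  override predicate isSink(DataFlow::Node sink) {
    // The sink is an expression, so asExpr() is used instead of asParameter().
    exists(Expr arg |
      isXMLDeserialized(arg) and
      sink.asExpr() = arg
    )
  }
}

from StrutsUnsafeDeserializationConfig config, DataFlow::Node source, DataFlow::Node sink
where config.hasFlow(source, sink)
select sink, "Unsafe XML deserialization"
```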
A
We should actually be able to find the sink of the unsafe XML deserialization. You can see we've got one result, and actually, if we click into this, this is in fact the CVE that our colleagues on the GitHub Security Lab team actually reported. It's taking an untrusted user input in, the Reader in, and it's passing it straight to this xstream.fromXML, and it's doing nothing to protect this in any way.
A
So XStream itself has some options that you can set before calling fromXML to prevent arbitrary deserialization. None of those are actually used here at all, and so this allowed remote code execution, arbitrary code execution. Now, in this case, it's kind of easy to verify that this is a true positive, because the parameter, which is the source, and the argument, which is the sink, are both in the same method. This is often not the case for security vulnerabilities.
A
Now there are five parts to converting a query to a path-problem query. The first is converting the @kind from problem to path-problem. The query, and I think we've seen this on a couple of occasions already, can have some metadata at the top, and one of the pieces of metadata is a kind, which can be problem or path-problem. For a problem query, we'll get just the result that we saw before, with slightly improved formatting.
A
What we want to do is convert this @kind from problem to path-problem, and this tells the CodeQL toolchain to interpret the results of this query as path results. The second thing that we'll need to do is add a new import, DataFlow::PathGraph, which reports the path data alongside the query results.
A
The toolchain combines this data with the path information from the import of PathGraph to build the paths. We are close to running out of time here, so I'm going to leave this as an exercise you can complete on your own. The solution is available here under the expansion, and we have a bit more information at the bottom.
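For readers attempting the exercise, one possible shape of the converted query is sketched below. Only the first two steps (the `@kind` change and the `PathGraph` import) are described in this session; the remaining changes shown here (`DataFlow::PathNode`, `hasFlowPath`, and the four-column select) are assumptions based on the class-based data flow API of the time, and the `@name` and `@id` values are hypothetical. As elsewhere, `isXMLDeserialized` and the `getASupertype()` check are reconstructions.

```ql
/**
 * @name Unsafe XML deserialization (workshop sketch)
 * @kind path-problem
 * @id java/workshop-unsafe-xml-deserialization
 */

import java
import semmle.code.java.dataflow.DataFlow
import DataFlow::PathGraph

class ContentTypeHandler extends RefType {
  ContentTypeHandler() {
    this.hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler")
  }
}

class ContentTypeHandlerToObject extends Method {
  ContentTypeHandlerToObject() {
    this.getName() = "toObject" and
    this.getDeclaringType().getASupertype() instanceof ContentTypeHandler
  }
}

predicate isXMLDeserialized(Expr arg) {
  exists(MethodAccess fromXML |
    fromXML.getMethod().getName() = "fromXML" and
    arg = fromXML.getArgument(0)
  )
}

class StrutsUnsafeDeserializationConfig extends DataFlow::Configuration {
  StrutsUnsafeDeserializationConfig() { this = "StrutsUnsafeDeserializationConfig" }

  override predicate isSource(DataFlow::Node source) {
    exists(ContentTypeHandlerToObject toObjectMethod |
      source.asParameter() = toObjectMethod.getParameter(0)
    )
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(Expr arg | isXMLDeserialized(arg) and sink.asExpr() = arg)
  }
}

// Path nodes carry the step-by-step path; the select lists element, source, sink, message.
from StrutsUnsafeDeserializationConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select sink, source, sink, "Unsafe XML deserialization"
```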
A
You can actually see this working on a vulnerable copy of Apache Struts, so this is the one that we actually built the database from. You can see the clone of this on GitHub here, and it's been analyzed on LGTM.com, which is our free open source analysis platform. If you click on this link here, it will take you to the result in XStreamHandler on the actual vulnerable version of the code, as found by the out-of-the-box query.
A
Now, there's a wealth of information that you can move on to after this. There are a few tutorials available for Java; in particular, the one on analyzing data flow is probably an interesting one for what to do next. After this, we actually have full CodeQL training courses for Java available for free online. They're all in the format of a slide deck that you can flick through. And you could also, as I mentioned at the start, enter the CodeQL Java capture-the-flag challenge that we've launched for Satellite.
A
That's on the GitHub Security Lab website, and you have a chance to win a prize, so you can put your newfound Java CodeQL skills to use. We also have some older capture-the-flag challenges there as well. We have ones for C and C++ and for JavaScript that you can try out. No prizes for those ones anymore, they've finished, but they're good for learning anyway.
A
We also have a CodeQL course on GitHub Learning Lab. And if you want to find out more about how you can find vulnerabilities using CodeQL, then the GitHub Security Lab research blog is a really good place for security researchers to go to understand how the GitHub Security Lab actually uses CodeQL on a daily basis to find new and interesting security vulnerabilities. And of course, as I've mentioned numerous times already, all of the CodeQL queries and all the libraries are open source, so you can click this link and visit those. Just as importantly, if you write a new query and you want to contribute it back to the community, you can do that as well, and we have a contributing guide here that you can follow to do that. And so thank you very much for listening.
A
I don't think we quite have time to answer any more questions on the audio channel, but we'll be on the Slack channel for a bit longer. So if you have any burning questions, please ask them there and we'll all be there to answer your questions. So thank you very much for turning up. Thank you very much to my helpers and to the team that set this all up. And hopefully talk to you soon about CodeQL. Thanks.