From YouTube: dast-benchmark Details and Workflow
Description
Covers the dast-benchmark tool and workflow for creating baseline applications for benchmarking.
A: Right, so this is covering the dast-benchmark tool and how to go through the workflow of adding a new application to the benchmark. I want to quickly start with some of the issues you run into when you do a benchmark, or an analysis comparison of scans, to see whether your tool is actually getting the flaw coverage and scan coverage that you expect.
A: Some of the issues that crop up when you're doing this really come down to the comparison logic. What ends up happening is that a scanner, as it's scanning an application, will be trying either random inputs, or its request sequences will be a little bit different. So one scan might request a page using a GET method instead of a POST method first, find a flaw, and report it there; another scan might do the POST request first and then report the flaw. So the reports will make it look like the vulnerability exists in a different place, when it's really the same vulnerability, just accessed with a different method. The benchmark tool has a number of techniques it uses to make sure these comparisons are actually valid, rather than comparing too strictly and reporting incorrect results.

One thing you'll end up realizing as you work with these tools is that URLs are not very good for determining uniqueness, so doing a direct comparison on URLs to see if they match doesn't actually work.
A: In a lot of cases, for example, applications will do cache busting: they'll append something like a random ID, or maybe a timestamp, to the end of the URL, and the next time you access it, it'll have a different value so that the browser doesn't cache it (say, /products?id=5&_=1616161616 on one request and /products?id=5&_=1616161617 on the next). When you go to do your comparison and ask, does this URL match, the answer will be no, even though it's the same resource. So what we end up doing comes down to two techniques.
A: One of the solutions is to look at the actual values and determine whether they're randomized; if they are, the tool will basically ignore them. It says: OK, this is a dynamic parameter, it's probably going to change in the next scan, so don't pay attention to it when you do your comparison.
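A minimal sketch of that first technique, assuming dynamic parameters are detected by watching whether a parameter's value changes across repeated requests to the same path (the function name and heuristic here are my own illustration, not the tool's actual implementation):

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl

def find_dynamic_params(urls):
    """Collect the values seen for each (path, parameter) pair; a
    parameter whose value differs across requests to the same path is
    likely dynamic (cache buster, timestamp, random ID) and should be
    ignored during report comparison."""
    seen = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        for name, value in parse_qsl(parts.query):
            seen[(parts.path, name)].add(value)
    return {key for key, values in seen.items() if len(values) > 1}

# Hypothetical example: "_" is a cache buster that changes every request.
urls = [
    "http://app.test/products?id=5&_=1616161616",
    "http://app.test/products?id=5&_=1616161617",
]
print(find_dynamic_params(urls))  # {('/products', '_')}
```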
The other technique is to allow the user, or the person creating the expected baseline report, to specify it directly: this is going to be a dynamic value, so ignore it completely. And there are a couple of things that you can ignore.
A: You can ignore parameter names, parameter values, the entire parameter, or parts of the URL path, so it's fairly flexible. It lets you say: OK, this vulnerability is going to exist here, and these are the constraints that are really required for that vulnerability to be matched against. Besides that, host names may also change: if it's running in a CI/CD pipeline, it might generate a random URL or host name.
A: So we ignore those: we basically strip out all the host names and just replace them with a dot-star. Another issue is mismatched query parameter ordering. Some applications may just change the order, so there's an example where the first request does x=1 and y=3, but then the same page will flip those parameter key-value pairs around.
A: So what we do is basically parse up the URI and sort the parameters so that they always match in the same order.
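Putting those normalizations together, a rough sketch of canonicalizing a URL before comparison might look like this (my own illustration, not the tool's code): the host is replaced with a wildcard, query parameters are sorted, and any parameters flagged as dynamic are dropped.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

def normalize_url(url, dynamic_params=frozenset()):
    """Canonical form used for comparison: wildcard host, sorted query
    parameters, dynamic parameters removed."""
    parts = urlsplit(url)
    params = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query)
        if name not in dynamic_params
    )
    return f"{parts.scheme}://.*{parts.path}?{urlencode(params)}"

# Different hosts and parameter orders normalize to the same string.
a = normalize_url("http://job-1234.ci.test/page?y=3&x=1&_=99", {"_"})
b = normalize_url("http://job-5678.ci.test/page?x=1&y=3&_=42", {"_"})
assert a == b  # both become "http://.*/page?x=1&y=3"
```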
And then there's evidence and attack strings: obviously the tool has many different attack strings it can send, so in some cases we want to actually ignore the attack string or the evidence.
A: The evidence basically states: here's how we determined this was a flaw, whether that was a URL or parts of the page that demonstrate the flaw was actually found. In some cases you do want to take that into consideration. For example, if it's a particular type of link that must exist, or it's missing some property that needs to exist and is otherwise considered vulnerable; in that case we do want to account for the evidence.
A: In other cases, like a SQL injection attack string, we don't care what the attack string is, so we just ignore it. That attack string may also show up in the URL, so we need to cleanse the URL of the attack string when we do our comparisons, because the next time it runs it may have tried a different attack string and found the same flaw. But again, the flaw is exactly the same; it's just a matter of doing a decent job of comparing against it.
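A sketch of that cleansing step, assuming the report carries the attack string alongside each finding (an illustration rather than the tool's actual code):

```python
def cleanse_url(url, attack_string):
    """Remove the reported attack payload from a URL so that two findings
    of the same flaw compare equal even when different payloads were used."""
    if not attack_string:
        return url
    return url.replace(attack_string, "")

finding_a = "http://app.test/item?id=1'%20OR%20'1'='1"
finding_b = "http://app.test/item?id=1'%20UNION%20SELECT%201--"
# Once each payload is stripped, both findings point at the same endpoint.
assert cleanse_url(finding_a, "'%20OR%20'1'='1") == \
       cleanse_url(finding_b, "'%20UNION%20SELECT%201--")
```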
A: So, as I mentioned, the expected report allows you to ignore various fields: parameter names, parameter values, path indexes. We could say, OK, we're going to ignore this particular path, and the comparison just walks through all of that. So that covers why it's so hard to do a comparison between two scans. Now let's look at how it actually does it. There are two key metrics that we're looking for when we do a baseline or benchmark comparison of a scan.
A: The first is scanned resources, which basically determines the coverage of the DAST tool's crawler. If there are a hundred links and it only found 50 of them, you have 50% scan coverage. For vulnerabilities, we have flaw coverage, and flaw coverage really depends on what types of flaws exist, how many exist, where they exist, and other properties that allow us to do the comparison. So, for scanned resources:
A: There's a diff algorithm that basically takes the two sets, all the URLs from the expected report and all the URLs from the scan report, and goes through them asking: OK, was this found, yes or no? If something from the expected report wasn't found by the scan, that is a false negative. If something exists in the new scan but not in the expected report, that would be a potential false positive. I say potential because a lot of times, when you're doing these baseline analyses, you as the person creating the report may have missed a link that the scanner just happened to find. So usually you want to look at those results to see whether any of these potential FPs are actual true positives, in which case you need to update your baseline report.
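In set terms, the scanned-resources comparison reduces to something like this (a simplified sketch; the real tool also applies the URL normalization described above):

```python
def compare_resources(expected_urls, scanned_urls):
    """Compare two sets of normalized URLs and report scan coverage,
    false negatives (expected but not scanned), and potential false
    positives (scanned but not in the baseline)."""
    expected, scanned = set(expected_urls), set(scanned_urls)
    false_negatives = expected - scanned
    potential_fps = scanned - expected
    coverage = len(expected & scanned) / len(expected) if expected else 1.0
    return coverage, false_negatives, potential_fps

cov, fns, fps = compare_resources(
    ["http://.*/a", "http://.*/b"],
    ["http://.*/a", "http://.*/c"],
)
print(f"{cov:.0%}")  # 50%: /b is a false negative, /c a potential FP
```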
For vulnerability instances, it's a bit more complicated.
A: One of the issues that we ran across with the DAST tool is that it doesn't have a concept of a unique vulnerability ID. So what ended up happening is we had to go through all of the alerts that ZAP, the DAST tool, could create, then create vulnerability IDs ourselves and use those as our primary key to do our comparisons against. So in the expected report you'll see vulnerability IDs, which are essentially subsets of CWEs. You can see one here:
A: 79.1, for example: 79 is the CWE for cross-site scripting, and the .1 means it's, say, reflected. There are other types of alerts that may exist in the report, and I'll show those in a bit. We go through this pretty similarly to how scanned resources are compared: we create these two sets and then compare them. If a finding doesn't exist in the expected report, it's considered an FP; if it doesn't exist in the scan result, it's considered a false negative; and so on and so forth.
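Since the scanner has no stable vulnerability ID, the comparison is keyed off the alert titles. A rough sketch of the idea (the titles and ID values below are illustrative, not the actual rule list):

```python
# Hypothetical mapping from scanner alert titles to benchmark vulnerability
# IDs, where the ID is the CWE number plus a sub-index for the variant.
ALERT_TITLE_TO_VULN_ID = {
    "Cross Site Scripting (Reflected)": "79.1",
    "Cross Site Scripting (Persistent)": "79.2",
    "SQL Injection": "89.1",
}

def vuln_id_for(alert_title):
    """Resolve an alert title to the benchmark's primary key."""
    return ALERT_TITLE_TO_VULN_ID.get(alert_title)
```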
A: Any questions so far? No? Pretty good, thank you! So next I'm going to do a quick walkthrough of the workflow for creating this type of expected report. What I have here is a new application, NodeGoat. I did a scan earlier, running it locally, because it was crashing our DAST tool (it was finding a flaw and then crashing), so I've got to fix that.
A: But this is the repository for NodeGoat, and it has a customized CI job to basically build it as an image and run it. The CI template will create the image with the NodeGoat application built into it, and then we have a secondary project (you could put it in the same one, it doesn't really matter) to actually run the DAST scan and get the results. Then we download the results, and this is kind of the end product of that.
A: So this just tells you, if you're going to benchmark an application, this is how you would set it up. A lot of times, when you're creating these expected or baseline reports, you don't want to do everything manually: the scanner is going to find stuff that you as a human will either miss, or it's just too tedious to report everything by hand. So what we end up doing is, let me pull this up real quick.
A: Here is the report; it has all our flaws, all that good stuff in it, and we're going to take it and actually generate an expected report from it. The tool will go through and do that for us. So we run dast-benchmark with a few options, and by default it will output an expected report. If you open that up, let me make it bigger again, this is what the expected report looks like: it has some configuration stuff, rule files, etc.
A: Here you can see the evidence; in this case the evidence is important, and the attack was not included. It just has these different types of instances, because each flaw can obviously be found multiple times, and we want to account for that. This one was the anti-CSRF token scanner; here's 79, which is cross-site scripting, and you can see it's in the POST username parameter of the signup page. And if we actually look at it here, we can take this attack string...
A: It uses a single quote, a double quote, and then a script tag. I've already tried this once before, and I believe this attack string actually doesn't work, but if you modify it a bit, it does work. So it actually found a flaw; it just reported it incorrectly. So it's not really an FP, but if it were an FP, what you'd end up doing is just deleting this instance from the expected report, so that you have a clean result set. So that's pretty much what that looks like.
A: Right, so here's a great example: we have CWE 16, which is a content security policy issue, and a lot of times this will exist on every single page it tries, because the page is missing the CSP header, or the CSP header has some sort of, yeah, in this case a wildcard directive. So in a lot of cases you could probably say: OK, no matter what, if this is found in a POST request, it's going to be the same flaw. In this case...
A: The URLs are different, so we might want to keep that; a lot of times you don't want to, but it really depends on the flaw itself. Here's another one, directory browsing. There we go: you can see a GET request and a POST request to the same exact URL, with the same evidence. OK, this is clearly the same flaw, so let's just delete one of them and mark this one as matching either POST or GET or any other method.
A: OK, so here's one where we don't care what the parameter value is. It could be anything; it's still going to be the /learn URL, with the url parameter being the actual vulnerable endpoint. So what we do is delete that value and say we're going to ignore the url parameter's value, because we don't care what it is; as long as everything else matches, we're good. So we add that and go back to our results.
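To give a sense of the shape, an expected-report instance with an ignore constraint might look roughly like this (the field names and the 601.1 ID are illustrative guesses, not the tool's actual schema):

```python
# Illustrative only: one vulnerability instance in an expected report,
# telling the comparison to ignore the value of the "url" parameter.
instance = {
    "vulnerability_id": "601.1",    # hypothetical: CWE-601 open redirect
    "method": "GET",
    "uri": "http://.*/learn",
    "parameter": "url",
    "ignore": ["parameter_value"],  # match regardless of the value sent
}
```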
A: All right, so now if we do a diff of our expected report and our expected report with URLs, you'll see that it added these new URLs for us, so the next time we do our comparison it'll include these as well. That's great for URLs, but sometimes you need to add new vulnerabilities as well. Sometimes you, as an auditor, will be looking at the code and say: hey, here's a flaw that the DAST tool missed, and we need to add those vulnerabilities too.
A: Right, so this is what the rule list looks like. It just has the vulnerability ID; the CWE, which is actually the Java class name minus the .java; and then the alert title. A lot of the time we key off the vulnerability titles, because again, there's no concept of a unique ID; so we key off the vulnerability title and then assign it the vulnerability ID. And these two fields are for ignoring parameters or ignoring evidence.
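A hypothetical rule-list row, just to give the flavor (the column names and values here are made up for illustration):

```
vulnerability_id,cwe_class,alert_title,ignore_parameters,ignore_evidence
79.1,CrossSiteScripting,Cross Site Scripting (Reflected),false,true
```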
A: You'll see that it added this new attack evidence to the login URI, method GET, parameter user. This just makes it easier when you're trying to add new vulnerabilities to a baseline report: you don't have to go through and add all these JSON fields and try to get all your syntax correct; it'll just read the CSV file. You can have newlines, too; if the evidence requires newlines, you just double-quote it to keep it together as one chunk.
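Since evidence can span multiple lines, standard CSV quoting applies; a quick illustration of how a multi-line field round-trips (my example, not the tool's code):

```python
import csv
import io

# Evidence containing a newline survives as one field when double-quoted.
row = ["79.1", "GET", "/login", "user", "line one\nline two"]
buf = io.StringIO()
csv.writer(buf).writerow(row)
print(buf.getvalue())  # the evidence field is quoted automatically
assert next(csv.reader(io.StringIO(buf.getvalue())))[4] == "line one\nline two"
```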
So now we have that; our vulnerability is added.
A: So now we can do a comparison. Let's say we had either a new scan, or we just want to compare what we converted, this NodeGoat ZAP report, against our current expected report. So we're going to run dast-benchmark again, and this is all done in a CI job as well; we give it the path name, and the type is full.
A: It compares against the expected report, the one with the URLs and vulnerabilities we added, and by default it will spit out the results. You can see here it found a number of duplicates; the tool accounts for duplicate true positives and duplicate false positives. A lot of these vulnerabilities are found multiple times in very similar places, so we count those as duplicates, because they're not technically unique.
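A sketch of that duplicate accounting, assuming each finding reduces to a normalized comparison key (my illustration):

```python
from collections import Counter

def count_with_duplicates(finding_keys):
    """Findings that reduce to the same normalized key count once; the
    rest are tallied as duplicates rather than as unique findings."""
    counts = Counter(finding_keys)
    unique = len(counts)
    duplicates = sum(n - 1 for n in counts.values())
    return unique, duplicates

keys = ["79.1|POST|/signup|username"] * 3 + ["89.1|GET|/item|id"]
print(count_with_duplicates(keys))  # (2, 2): two unique, two duplicates
```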
So now we have the benchmark results; we open that up, and you can see what it does.
A: It creates statistics for each type of vulnerability ID. So in this case there are 182 of these; this is the user agent fuzzer, which usually produces a lot of results. We don't really consider these true positives; I would probably remove that and consider this whole class of issues as false positives. You can see, for each one, there are, say, two true positives, duplicate true positives, and so on and so forth. Another key metric is the expected true positives.
A: So if there are only supposed to be three, you need to match against those to see whether you actually have the correct number. We should see one FN: as you remember, we added that cross-site scripting flaw, and obviously it's not going to exist in the scan report because we just added it, so it's marked as a false negative. And then we had a number of duplicate true positives.
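Roughly, the per-ID statistics reduce to instance counts like these (a sketch; the tool's actual report carries more fields):

```python
from collections import Counter

def per_id_stats(expected, found):
    """Compare per-ID instance counts: if three instances of 79.1 are
    expected but only two are found, one is a false negative."""
    exp, got = Counter(expected), Counter(found)
    stats = {}
    for vid in set(exp) | set(got):
        tp = min(exp[vid], got[vid])
        stats[vid] = {
            "expected": exp[vid],
            "true_positives": tp,
            "false_negatives": exp[vid] - tp,
            "false_positives": got[vid] - tp,
        }
    return stats

# One expected 79.1 instance is missing; 16.1 was not expected at all.
print(per_id_stats(["79.1"] * 3, ["79.1", "79.1", "16.1"]))
```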
A: Yep, yeah, and you'd want to manually verify and review these. Obviously the verification process is the most time-consuming part, and it requires the kind of skill set you need for identifying these types of issues, especially if you're trying to add new vulnerabilities that the tool has not found before. But overall, if you wanted to just, for example, take this expected report and then use it to do a kind of base comparison, you could say:
A: OK, I ran one scan and got these results, so I'll generate my expected results from that. Then I do a new scan: how much does it differ? Am I finding 150 new vulnerabilities? That means there's some kind of variance in your scan; maybe the DAST tool is doing something wrong, or it found something new. So you could still use it that way.
A: Now, this is tied to our own report format only. In the future it would be nice to be able to compare against how other tools do as well; that would be nice, but I think for v1 we just want to be able to use this tool as more of a QA process, to make sure that our scanner is getting the proper amount of flaw and scan coverage.
A: Every time this benchmark is run on the master branch, it will generate a new entry. So let's take a look; right now only DVWA is in there, but we're going to load that up. This gives you a nice summary: OK, we got 48% scan coverage here, and the average is looking good, above 70%. Then for each different flaw we can see the expected count versus how many it actually found, and again the false positive rates, false negatives, duplicates, and so on and so forth.
A: Hmm, that was weird. OK, so here it's doing the comparison, and you can see we got the same exact results, which is good. This is what you want to see if you run the same configuration against the same application twice. I don't have the tables for doing the comparison, just because the tables are kind of hard to work with, but yeah. Well, I definitely appreciate it.
A: If you wanted to look at the chart view of this, it would probably be easier to see what was different. Obviously these are all going to be the same bar charts, because it found the same exact issues, but if we compared it against, say, a baseline scan, which doesn't do actual attacks, you're going to see very different results between the two. If it loads.
B: Meanwhile, while it's loading, I was curious: I don't think I've seen anything in the rule set around headers. Is that something we intentionally omitted, or is it just up to the test itself whether to use headers or not? It seems like everything is around the query string and HTTP; nothing for headers.
A: So in this case, for the evidence or the parameter, I believe the header name would be the parameter value in here. So if you were, say, attacking a header, and the header name was something like Content-Type, then that header name would be filled in here as the parameter value.