A: So, I will start recording. Hello everyone; let's see who we have here. I think everyone has the correct Zoom link. This is our brown bag session, an informal technical meeting about security report parsing and ingestion. If you're expecting this to be highly technical, it probably won't be; I'll just try to go through report parsing and ingestion at a really high, abstract level.
A: Let me introduce myself. My name is [inaudible]; people usually pronounce it wrong. I'm working as a software developer on the Threat Insights team. The agenda: first I'll show you the different channels through which our users can see the vulnerabilities that appear in their code base, and then I'll talk about the previous versions of report parsing, the pipeline security tab, and report ingestion.
A: There was no "report ingestion" before; it's a term we recently coined. We previously had a store report service. I'll also talk about the current versions of report parsing, the pipeline security tab, and report ingestion, and then the future versions of all three. At the end, I'll try to answer your questions, if you have any.
A: The different channels we have within the system are: the pipeline security tab, which lists all the vulnerabilities found in the code base for that particular pipeline run; and the vulnerability report.
A: We have a vulnerability report at the project, group, and instance levels, and then the MR widget, which shows which vulnerabilities are new and which ones have already been fixed. The previous version of report parsing was basically "just parse the JSON if you can". We didn't have any schema validation; we just tried to parse the JSON, create plain old Ruby objects, and let the rest of the application handle any problems with the data.
A: In the previous version of the pipeline security tab, we were downloading and parsing all the security report artifacts for each HTTP request, which mostly timed out due to the size of the artifacts.
A: Imagine you're loading the first page of the security report in the pipeline security tab, and the security analyzers uploaded 10 gigabytes of reports. We parse all of it and create a Ruby hash, which is then serialized to JSON, just for the first few findings to be viewed by the user. When you try to load the second page, we do the exact same thing: download all that data, parse it again, and so on.
A: So of course Thiago was sitting in front of his adjustable desk, trying to understand why it was loading so slowly. Of course it was slow: the application was trying to download the universe and then parse the universe. That was the previous version of the pipeline security tab. As for the previous version of report ingestion: we had one Ruby class called StoreReportService. It contained 499 lines of code, and everything in it was highly coupled.
A
It
was
hard
to
make
changes,
because
when
you
touch
one
place,
it
was
breaking
another
place
which
is
completely
unrelated
and
also
it
was
really
hard
to
test
because,
like
everything
was
coupled
writing
a
unit
test
in
isolation
was
impossible.
So
you,
your
test,
basically
had
to
run
all
the
logic
in
in
that
class
to
be
able
to
test,
maybe
a
small
edition.
A: It was also creating the vulnerability records one by one and was therefore taking a long time to complete. People usually talk about the N+1 query problem while reading data, but this was a similarly important query issue on the write side. Now, the current version.
A: First of all, I want to fix the terminology, because I was using it wrong: I was saying "soft validation" and "hard validation". We came to the conclusion that the terminology is: validation means validating the given security report and nothing more, and enforcement means skipping ingestion of an invalid security report. From now on, we will use these terms.
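The validation/enforcement split described here could be sketched roughly like this. This is illustrative Ruby only; the required keys and method names are assumptions, not GitLab's actual schema-validation code.

```ruby
require 'json'

# Illustrative only: the required keys and method names here are
# assumptions, not GitLab's actual schema-validation code.
REQUIRED_KEYS = %w[version vulnerabilities].freeze

# Validation: check the given report and collect problems, nothing more.
def validate(report_json)
  report = JSON.parse(report_json)
  (REQUIRED_KEYS - report.keys).map { |key| "missing required key: #{key}" }
rescue JSON::ParserError => e
  ["parsing error: #{e.message}"]
end

# Enforcement: skip ingestion entirely when the report is invalid;
# without enforcement, the same problems are surfaced only as warnings.
def ingest(report_json, enforce:)
  errors = validate(report_json)
  return { status: :skipped, errors: errors } if enforce && errors.any?

  { status: :ingested, warnings: errors }
end
```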
A: This is how it looks in the pipeline security tab: a yellow background with a warning sign, and you can see what the warning is. That is the current version without enforcement. With enforcement, if the report doesn't comply with the schema it must conform to, we just discard it; we don't even try to ingest it.
A: If you see a user who wants to experiment with the enforcement feature, you can suggest that they set this CI variable in their .gitlab-ci.yml file.
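A hedged example of what that could look like in `.gitlab-ci.yml`; the variable name follows the talk, so check the current GitLab documentation for the exact spelling and scope:

```yaml
# Enables schema validation enforcement for security report artifacts.
variables:
  VALIDATE_SCHEMA: 'true'
```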
A: This is how it looks: under the variables section, we have VALIDATE_SCHEMA set to true, which means it will validate and enforce. And this is how it looks in the pipeline security tab when enforcement is enabled: those messages are no longer warnings but actual errors, and we won't even try to ingest the vulnerabilities contained in that scan. We can also have warnings and error messages showing up at the same time, because we also show warning messages like "this report version has been deprecated and will not work with the next major version of GitLab" to inform users about an upcoming breaking change.
A: The current version of the pipeline security tab has two modes. The first is the old one: try to download everything, parse everything, and prepare the response. The other uses the security_findings table to download just a subset of those artifacts. In security_findings, we store only the metadata of the findings, such as severity, confidence, report type, and so on.
A: This works better, so this time Thiago is sitting in front of his adjustable desk trying to understand why it's a bit faster now; that's the reason it's a bit faster. As for the current version of report ingestion: invalid reports are discarded if enforcement is enabled, which means that if there's an error for the security scan, we don't even try to ingest its reports, and vulnerabilities are created in batches of 50 records at a time.
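The batching idea might be sketched like this. It is a minimal, self-contained Ruby sketch; `bulk_insert` stands in for a single multi-row write such as ActiveRecord's `insert_all` and is an assumption, not GitLab's API.

```ruby
# Instead of one INSERT per vulnerability, records are grouped into
# slices of 50 and each slice is written with one bulk statement.
BATCH_SIZE = 50

# `bulk_insert` is a stand-in for a single multi-row INSERT
# (e.g. ActiveRecord's `insert_all`); here it is just a block so
# the sketch stays self-contained.
def ingest_in_batches(findings, &bulk_insert)
  findings.each_slice(BATCH_SIZE).map { |batch| bulk_insert.call(batch) }
end
```

With 120 findings this issues three bulk writes (50, 50, and 20 records) instead of 120 single-row inserts.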
A: So we're not doing this one by one anymore. There's a separate task class for each individual entity; if you're curious, we can also go into the code base and check what those tasks are.
A: Tests are separate for each task, each with its own spec file. So if you want to test something you recently introduced in the ingestion logic, you don't have to run the whole ingestion; you can just test what you implemented, and the tests run faster because the whole logic doesn't run over and over again. The ingestion logic itself now also runs faster, with less resource consumption; I'll show some metrics afterwards.
A: We haven't seen any unexpected errors since we enabled it, literally zero errors in the ingestion logic, which actually helped us a lot in improving our error budget; it's almost green, at 99.94%, I think. There is a downside to this approach, though: any ActiveRecord validation error...
D: Hey, sorry, quick question on that. Since we are showing ingestion errors, does that only apply to schema validation, or, to the point above, if there's a problem during the model ingest, say an ActiveRecord problem, would we surface that as well right now?
A: Okay, perfect. The tag of the error message will be different. For example, if it's a schema validation error, the tag will be "schema". If it's an ingestion error, then we say "ingestion". And if we have a parsing error, for example the scanner provided a file that isn't even valid JSON, we say "parsing". So we have pretty granular error messages there.

D: Fantastic.

A: Okay, so this...
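The granular error tagging just described might look like this minimal sketch; `SchemaError` and `error_tag` are illustrative names, not GitLab's actual code.

```ruby
require 'json'

# Stand-in error class for reports that fail schema validation.
SchemaError = Class.new(StandardError)

# Map each failure mode to its own tag so the UI can tell them apart.
def error_tag(error)
  case error
  when JSON::ParserError then :parsing   # scanner produced invalid JSON
  when SchemaError       then :schema    # report does not match the schema
  else                        :ingestion # failure while writing the records
  end
end
```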
C: Yeah, sorry, in terms of the whole batch: for the current validation error, is it the whole batch per table that gets discarded for every single failure, or the entirety, identifiers and all?
A: The entire batch, because we don't really want to leave the database in an inconsistent state. For example, imagine we created the vulnerability finding but not the finding links: we just roll back the transaction, and the database stays in a consistent state.
A: That's a really nice question. And this chart shows the 75th and 95th percentiles of the report parsing time, sorry, the report ingestion time, and I think it's not that hard to see the place where we enabled the feature flag.
A: It's now faster, and for the 75th percentile duration there's a consolidation, so it's more predictable in terms of the time it takes. As a bonus, we did the same thing for the store scans service, exactly the same approach, creating records in batches. The ones who play the stock market will understand.
A: This chart had a head-and-shoulders pattern, and after the head and shoulders it started going down a lot. I also prepared some flow charts. There's probably no point in me going through these charts now, but you can check them later on your own, or maybe we can even put them into our documentation page for other people as well.
A: I'll just tell you how to read them. You'll see there are some sub-procedures; for example, in the middle of the flow there's one shown in a circle, and there's a second sub-procedure. Those procedures are described in a different flow chart, like this one. As you can see, I also show the previous chart and the next one, just so you don't lose the context.
A: I think you're fine, yeah. Perfect. Also, yes, these are the actual ingestion tasks we have. You may wonder why they're colored: because I wanted to highlight them, and also because, after preparing the presentation, I realized it's possible to give them a background color, so I thought they'd look fancier.
A: So maybe I can read through the tasks, so you'll have an understanding of what tasks we have. The first thing is that we ingest identifiers, because the identifiers are shared between different findings. Then we ingest the findings, because to be able to ingest the vulnerabilities we first need to ingest the findings; if that fails, there's no point in creating the vulnerabilities. Then we attach the findings to the vulnerabilities, basically updating the vulnerability_id attribute of the finding.
A: With this design, whenever a scanner group needs to add a new task to this pipeline, they can introduce a completely new task. They can own the feature, they can run it against the code base, and they can test it easily without needing Threat Insights to check the logic. Of course, we're here to help.
A: If you need help with how to design a task or how to implement one, we'll be more than happy to help, but it's far easier than trying to place your single function within a huge service class.
A: You can just implement a new task. And this, Lucas, is what you already asked about: if there's an error in this pipeline, we roll back the transaction here and also save the error on the security scan, so it will be visible to the user.
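The task pipeline and rollback behaviour described here could be sketched roughly as follows. All class and method names are illustrative, not GitLab's actual ones; the failure flag is only there to make the rollback path demonstrable.

```ruby
# One task class per entity, executed in order inside what would be a
# single database transaction in the real service.
class IngestIdentifiers
  def self.execute(context)
    context[:log] << :identifiers
  end
end

class IngestFindings
  def self.execute(context)
    # Hypothetical failure hook, just to exercise the error path below.
    raise 'finding failed' if context[:fail_findings]

    context[:log] << :findings
  end
end

class AttachFindingsToVulnerabilities
  def self.execute(context)
    context[:log] << :attach
  end
end

TASKS = [IngestIdentifiers, IngestFindings, AttachFindingsToVulnerabilities].freeze

def ingest_scan(scan, context)
  # The real service wraps this in a transaction, so a failure rolls
  # back everything written by earlier tasks.
  TASKS.each { |task| task.execute(context) }
  :ingested
rescue => e
  scan[:error] = e.message # recorded on the security scan, visible to the user
  :failed
end
```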
A: So, the future versions. For the future version of report parsing, as I mentioned, validation will be mandatory and a permanent feature by 15.0; we're almost there. For the future version of the pipeline security tab: I mean, the previous version wasn't scaling.
A: The current version doesn't scale well either, so we're trying to find a way to make it more scalable. We basically can't store all the security findings in PostgreSQL, because PostgreSQL is not horizontally scalable, and we have gigabytes, maybe terabytes, of data when it comes to vulnerabilities.
A: Imagine scanners generating thousands of vulnerabilities for each pipeline run. This is why we're looking at another database engine, maybe Elasticsearch, because we can scale it horizontally: with five or ten shards, each shard holds just a partition of the data, which will make it easier. Recently I also started thinking about using Crystal DB as a manager for different data stores, but it's just a rough idea, and I didn't even put it into the presentation.
A: For the future version of report ingestion: as I mentioned, an ActiveRecord validation error currently discards the whole batch. I think discarding just that single record might be a better alternative, so we will try to achieve that. We're also working on UX improvements to give users an indication on the vulnerability report page that there was an error while ingesting the report.
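The proposed per-record alternative could be sketched like this. It is hypothetical; the validity check is a stand-in predicate, not the real ActiveRecord validation.

```ruby
# Instead of rolling back the whole batch when one record is invalid,
# partition the batch and ingest only the valid records, discarding
# the rest individually. The validity predicate here is a stand-in.
def partition_batch(records)
  records.partition { |record| record[:name] && record[:severity] }
end
```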
A: We already have this in the pipeline security tab, but we have an issue open to show a warning message on the vulnerability report page that links back to the pipeline security tab, so users can easily check what the error was. So, yeah, as I said, it wasn't that technical. If you have any questions, I'm happy to answer, or if you want me to show you where this new ingestion logic lives in the code base, I can also open the code base and show it to you.
C: Since we have a couple of questions already that are more high level, maybe we'll just do those, but it might ultimately be worth digging into the code if we have time at the end, because Ross and I are actually working on something right now that is in the code.
C: I guess first, I was just curious. Thanks for that; it was really helpful to get the overview. Do we have any integration tests? Sorry, that's the part where Ross is mentioned in that paragraph, so number two is part of number one in the talk, but Ross and I have been working on this kind of multi-step flow.
C: So, do we have an integration test around the whole pipeline? We're trying to test something that involves modifying security findings prior to ingestion, so it's a fairly integrated task, and we're trying to avoid writing what I'd call an old-school test, like a very large, integrated store security report service spec. I'm curious whether we have anything that ensures it's all working as expected, from report through ingestion.
B: I don't know how much more knowledge I have on that, but I know that Harsha has been looking at the integration tests and trying to find the missing places there. He put out an issue; let me find it.
A: But my understanding is that instead of an integration test, you're really asking whether we have a unit test that crosses the borders of different units, right? Yes, something like: I created the pipeline and I see the result on the vulnerability report page, but more like the whole coverage for the ingestion logic.
C: Yeah, I guess I wouldn't call it an end-to-end test. Here, I can just link to what I've been using to test this feature, which is basically just this. Yeah, it starts at that comment in the store security reports worker.
C: Yes, I'll create a follow-up issue and then we can just continue the conversation there.

A: Yeah, that'd be awesome. Thanks, thank you.
C: Yeah, sure. I'm kind of curious: in doing this rewrite, I know there's a bunch of planned work around data model cleanups, things like using the vulnerability_reads table more and removing unused columns, but I guess I'm curious about your high-level takeaways on how we can better model our domain objects.
A: I mean, we could maybe go with a star model, or try to, how do you say, normalize the data as much as possible, but then the query impact would be painful. This is why we introduced vulnerability_reads: to segregate the read queries from the writes.
A: That's a really huge topic. I mean, we definitely need to discuss this a lot; maybe we can even have a spike and then try to find what needs to be done.
A: But since you're asking this question, you've probably also seen some opportunities, some room for improvement. If you have, please don't hesitate to share them.
C: I guess the only other item here would be a deeper use case for this, which would involve just talking about the problem that Ross and I have been working on solving around the ingestion service. But because that's a very specific use case for a deep dive here, I would want to make sure that anyone else who has questions or topics they'd like to cover gets to go first. I don't know, was this set for half an hour? Are we over time?
C: Okay, cool. Well, if no one else has a topic: the issue that I linked to at the top there, around a multi-step workflow.
C: There's a couple of different ways we've been brainstorming how to do that. Ross and I have both been spiking on two very similar but different approaches, but it basically involves iterating over identifiers and reordering them, so that a lower identifier becomes the primary, and recalculating the UUIDs for a lookup for us. I don't know if you want to talk about the approach you have in place there, but that's the basic idea.
C: Part of this does touch on needing something like an end-to-end test here, because it involves a fairly far-reaching component beyond a specific task. We essentially need to update the primary identifier ID on the vulnerability itself, or the finding itself, along with re-sorting the identifiers, to generate the UUIDs correctly.
A: Or imagine we have multiple jobs running for different pipelines on the same project, all trying to, you know, readjust the data.
A: I think that's the right place to discuss this. What do you think?
C: Sure, I'm happy to bring it there; that's where we brought it before, when we initially talked about the solution. When is that, though?
A
Is
it's
on
tuesday,
probably
it
the
time
works
for
you.
B
It's
yes!
Next
tuesday,
on
the
29th,
it's
9
a.m:
central
yeah,
so
you're
in
pacific
right,
lucas,
yep
yeah!
So
I
mean
we
can
we
could
I
mean
we
could
probably
bump
it
back
an
hour
so
that
you're
not
having
to
get
up
and
on
it.
C
If
there's
a
like,
if,
if
synchronous
works
for
y'all,
I
can
wake
up
early,
but
we
we
just
linked
the
on
number
three
here,
the
two,
mrs,
so
we're
kind
of
like
looking
at
different
approaches.
C
So
maybe,
if
there's
a
way
of
doing
this
asynchronously,
that
would
be
great
just
like
if
that,
if,
if
you
were
going
to
discuss
this
during
your
refinement,
that
would
be
great.
Just
like
any
comments
you
want
to
leave
on
the
mrs
direction,
we're
just
kind
of
looking
for
a
general
direction
check
on
whether
this
this
makes
sense
or
if
there's
a
better
way
to
plug
in.
We
can
figure
out
the
actual
logic
separately,
but
more
high
level
architecture,
wise.
A: We can definitely add this to our agenda for the next meeting and discuss it there, yeah.
B: Hey, we do have the recording from a couple of weeks ago, from when we talked to them the first time on this as well.
A: Seems like there's nothing else, so thanks a lot for your time and for joining this meeting. Am I still recording? Yeah, because I stopped presenting, not the recording.