From YouTube: Scalability demo 2021-12-16
A
Thanks, yeah. So I've got the first and only item on the agenda, which is es-query, and this is a tool that I developed this week, or last week I guess. It came out of some analysis I wanted to do during a recent incident, where I wanted to do some post-processing on data that is stored in Elasticsearch. So our logs are in Elasticsearch, and I wanted to do some local post-processing.
A
So, just to quickly share the repo: there's some documentation and instructions here in the readme, so you can kind of see some of the stuff that this tool is capable of. But I will also give a quick demo, and I did prepare something that hopefully will not include any sensitive information, because we are dealing with our production logs here.
A
So here's a sample query where we're querying the gcp-events index. This includes things like instance maintenance, so it's useful to see if a host was rebooted due to a maintenance event. You can give it a query using this -query parameter, where you kind of dump the JSON in there, but you can also provide the query on standard input. That is particularly useful if you use Kibana as a query builder: you've got an Elasticsearch query that you can get from the Inspect tab in Kibana, you paste that into a file, and then you'd basically do cat query.json, extract the query field out of that thing, and then pipe that into es-query.
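As a sketch, that Kibana-to-es-query pipeline could look like the following; the --index flag and the exact invocation are assumptions, since the demo doesn't show them verbatim:

```sh
# Pull the query Kibana shows in its Inspect tab out of the saved
# JSON and feed it to es-query on stdin. Flag names are assumed.
cat query.json | jq '.query' | es-query --index 'gcp-events-*'
```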
A
True, yeah. So that's kind of the basic idea. So with that, let me just go ahead and run this. And you can see I'm piping into jq here to extract some not-so-sensitive information out of these logs, because I think they did include some IP addresses and such, which I don't want to include on this recorded call.
A
But
this
kind
of
shows
what
you
can
do
with
this
tool,
and
you
know
you
can
do
whatever
kind
of
analysis
you
want.
So
I'm
just
looking
at
how
many
log
lines
were
present
during
this
time
range,
which
is
you
know,
a
fairly
inefficient
way
of
getting
that
count.
But
it's
very
efficient
on
human
time,
because
it's
very
easy
to
put
together
this
kind
of
query
this
kind
of
pipe
line.
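For example, the "inefficient but cheap on human time" count could be as simple as this, assuming es-query emits one JSON document per output line (an assumption about its output format):

```sh
# Count matching documents by fetching them all and counting lines;
# wasteful on the wire, but trivial to put together.
cat query.json | jq '.query' | es-query --index 'gcp-events-*' | wc -l
```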
A
But as you can see, it's actually doing quite a bit of stuff just to paginate through the results, and the Elasticsearch pagination API is not very curl-friendly. So that's kind of why you would want to use this tool in the first place over curl.
C
I kept thinking the whole time: what's the difference with curl? But you answered that at the end. Yeah, it's annoying to do with curl due to the pagination.
A
Each request basically says: please give me some new results and a new scroll ID. That's kind of how you paginate through: you need to keep getting the token from the last response and piping it through to the next one, which is pretty obnoxious to do with curl. And then, to be nice, I kind of delete those scroll contexts at the end. In this case we could also wait for one minute and they'll clean themselves up, so that is not strictly necessary, but it's nice to clean up after yourself. Sorry.
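For reference, here's roughly what that scroll loop looks like with plain curl; a sketch assuming a cluster on localhost and the one-minute keep-alive mentioned above, not the exact commands from the demo:

```sh
# Open a scroll context (1m keep-alive) with the query from Kibana.
resp=$(curl -s -XPOST 'localhost:9200/gcp-events-*/_search?scroll=1m' \
  -H 'Content-Type: application/json' -d @query.json)
scroll_id=$(echo "$resp" | jq -r '._scroll_id')

# Keep feeding the scroll ID from the last response into the next request.
while [ "$(echo "$resp" | jq '.hits.hits | length')" -gt 0 ]; do
  echo "$resp" | jq -c '.hits.hits[]'
  resp=$(curl -s -XPOST 'localhost:9200/_search/scroll' \
    -H 'Content-Type: application/json' \
    -d "{\"scroll\": \"1m\", \"scroll_id\": \"$scroll_id\"}")
  scroll_id=$(echo "$resp" | jq -r '._scroll_id')
done

# Be nice and delete the scroll context when done (it would also
# expire on its own after the keep-alive).
curl -s -XDELETE 'localhost:9200/_search/scroll' \
  -H 'Content-Type: application/json' \
  -d "{\"scroll_id\": \"$scroll_id\"}"
```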
A
So this is the feature flags, actually. In Rails we kind of log which feature flags were checked as part of that request and whether the result was true or false, and there's a lot of interesting stuff that you can do with that data. But one of the concerns I had was that this would massively increase our log volume in terms of the size of the log lines, and Elasticsearch doesn't store the size of the _source object anywhere.
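That's exactly the kind of question local post-processing answers; a minimal sketch, again assuming es-query prints one document per line:

```sh
# Estimate how big the logged documents actually are, since
# Elasticsearch won't tell you. The NDJSON output format of
# es-query is an assumption.
cat query.json | jq '.query' | es-query --index 'gcp-events-*' \
  | awk '{ total += length($0) + 1 } END { print NR " docs, " total " bytes" }'
```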
B
You know, you can use the JSON objects directly to, like, aggregate — you can run a function on all of those — but that's not very convenient sometimes, if you want to do something complicated. Every time you do it, it's going to have to refetch everything and recompute it, and especially if it doesn't really fit into that process-one-document-at-a-time model, it's also challenging. So, yeah.
C
Thanks, Steve. Maybe I can talk a little about the patch I sent to the git mailing list, because it ended up getting progressively more boring, which is good, and I'm still waiting to hear more, but it's looking good.
C
The problem I'm trying to solve is that the I/O sizes we use when transferring pack file data are not optimal. This is a typical thing: if you do I/O, then you can tune the I/O sizes — the size of the chunks of data you send across — and sometimes it's irrelevant, and sometimes it matters. And we send so much pack file data that it matters, apparently. So, yeah, we had to come up with something different.
C
We had to figure out how to solve the problem, and I got feedback from Patrick from the Gitaly team, and we came up with something where — actually this was my first idea — we use stdio. So then, let me use the file browser here.
C
I had to make some changes in upload-pack so that it uses stdio, and I had to create another function that uses stdio instead of regular Unix syscalls. And because the default stdio buffer size is not that big, just using stdio on its own is not enough to get larger writes.
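To make that concrete, here's a minimal sketch of the idea — enlarging a stream's stdio buffer so small writes coalesce into bigger chunks. This is an illustration, not the actual git patch, and the 128 KiB size is an arbitrary example:

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	static char buf[128 * 1024];

	/* Must happen before the first write to the stream; without
	 * this, fwrite() flushes in small default-sized chunks. */
	if (setvbuf(stdout, buf, _IOFBF, sizeof(buf)) != 0)
		return EXIT_FAILURE;

	for (int i = 0; i < 1000000; i++)
		fwrite("some pack data", 1, 14, stdout);

	/* stdio coalesces these into ~128 KiB write(2) calls. */
	return fflush(stdout) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```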
C
I also had to create a configuration mechanism with an env var to have a larger buffer, and this is also iffy, because if you want to reconfigure the buffer, you want to do it as early as possible in the process life cycle. So that happens in a function in common-main, which is really the main function of all git subcommands.
C
I learned in a previous attempt that that is the right place to do it, but it also makes it harder to sell: if you're adding stuff to the main startup function, that affects everything. So, yeah, that was 74 lines.
But then I was going back and forth with Patrick, and I was trying to write a cover letter for the git mailing list, and I realized I'm asking for more than what we actually need, and there's something simpler we can do, which is—
C
So what if we can just configure that? And it's okay to compile in a different size, because we compile git when we build Gitaly, so we have our own git anyway. So we can use a different number here, and this is actually better than just tweaking the write sizes, because this also makes the reads bigger.
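The shape of that compile-time knob, as a hypothetical sketch — the macro name here is invented, not the one from the actual patch:

```c
#include <stdio.h>

/*
 * Hypothetical compile-time buffer-size constant. Override at
 * build time, e.g.: cc -DPACK_IO_BUFFER_SIZE=131072 ...
 * Using one constant for both reads and writes is what makes this
 * better than only tweaking the write sizes.
 */
#ifndef PACK_IO_BUFFER_SIZE
#define PACK_IO_BUFFER_SIZE (8 * 1024)
#endif

int main(void)
{
	printf("pack I/O buffer size: %d bytes\n", PACK_IO_BUFFER_SIZE);
	return 0;
}
```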
C
So it saves even more time. So I thought, okay, but yeah, then we have a new compile-time constant; do people like that? Well, let's just try. And then the funny thing on the git mailing list was that they said: well, why don't we just make this buffer bigger and forget about the constant? And that didn't even occur to me, because everybody who's not using the pack-objects cache is probably going to do eight-kilobyte writes, so they don't benefit from making this buffer bigger. But the git maintainer said, why don't we make the buffer bigger, and that's actually the simplest thing to do. It would have been nice if it was just this one line where I changed the number, but the line is funny, because the calculation of the number is weird.
C
So calloc gives you a zeroed piece of memory — before, we also got a zeroed struct — so the patch ended up very small.
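A paraphrase of that detail, with invented names rather than the literal patch: the state struct used to be zero-initialized on the stack, and once the buffer makes it too big for that, calloc keeps the "starts out zeroed" property:

```c
#include <stdlib.h>

/* Illustrative only; struct and field names are invented. */
struct output_state {
	char buffer[128 * 1024];  /* now too big to put on the stack */
	size_t used;
};

int main(void)
{
	/* Before: `struct output_state s = { 0 };` on the stack.
	 * After: heap-allocated; calloc returns zeroed memory, so
	 * the struct still starts out zeroed. */
	struct output_state *s = calloc(1, sizeof(*s));
	if (!s)
		return 1;
	free(s);
	return 0;
}
```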
The other nice thing is that I've been doing experiments to compare these things, where I start a flame graph and I do a clone on the VM where I know it's a cache hit, and then I just count the number of samples in the different programs. I can say: well, if git is using 200 samples without this tweak and 120 samples with this tweak, then that is less CPU. And then I add that up.
C
So I look at the gitaly-hooks, Gitaly and git samples. Let me just show what that looks like.
C
So what I was trying to say is that my approach has been: make a flame graph and add up the different parts that I care about, and then I can make comparisons of the number of samples.
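A sketch of that counting workflow; the talk doesn't name the exact tools, so perf and the FlameGraph scripts here are assumptions:

```sh
# Record stacks system-wide while the cache-hit clone runs.
perf record -a -g -- sleep 30

# Collapse stacks (stackcollapse-perf.pl is from the FlameGraph
# repo), then add up samples per program of interest. Folded lines
# look like "progname;frame1;frame2 count".
perf script | ./stackcollapse-perf.pl > stacks.folded
for prog in git gitaly gitaly-hooks; do
  printf '%s: ' "$prog"
  awk -v p="$prog" 'index($0, p ";") == 1 { n += $NF } END { print n+0 }' stacks.folded
done
```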
But what one of the git maintainers did was that he just simulated our cache by creating a shell script that gets a pack file, and he measured the throughput on that, and he got roughly the same kind of 30 percent speed increase. So I got a 30 percent drop in CPU frames, or stack frames.
C
That's also good. And he got roughly a 10-megabyte-per-second increase when he tried this. So that was a really different, but also very sensible and valid, way to approach the impact. It was nice to see that it had an impact, and it was really nice that this person took the time to set up that experiment, because he was interested.
C
Once a week the git maintainer posts an update to say what the status is of all these patches, so my patch will start being mentioned in "What's cooking in git", which is a weekly email. Then I can keep an eye on the "What's cooking in git" email, and at some point it will say it is in the next branch, and that is our internal goalpost: once a commit is in next, it is okay.
C
The Gitaly team will accept it as a custom git patch on their build. So once it's in next, I can accelerate it a little bit; otherwise you wait until next becomes the current git version, but that's once a month or something of that order. So it's maybe two — yeah, sorry, I didn't prepare this, so I was rambling a bit. Any questions or comments?
A
That's great. I mean, also thanks for sharing a little bit about the development process of git itself; that part was really interesting to me.
C
Yeah, it's quite unusual. Actually, I don't know if it's unusual compared to what we do, but they don't use anything like GitLab or GitHub: it's a pure email-based workflow, and everything goes through a single person. So you have maintainers who, I guess, in the Linux kernel community would be the lieutenant role — though actually not quite, I think. With Linux, once certain people have reviewed something, Linus Torvalds doesn't really look at it, or he just accepts big chunks of work that other people have reviewed. But here the benevolent dictator reads everything.