From YouTube: 2023-03-16 Scalability Team Demo
Description
No description was provided for this meeting.
A
So the first thing I want to show: Vasily is working on some stuff regarding knowing how Redis is used by stage groups, and he's added a kind of wrapper around Rails.cache that can be used as a drop-in replacement, but when initializing it you need to pass in a feature category, call site, that kind of thing.
So that means, let me share my screen. Wait, let me first clean up my screen a bit and then share.
A
So that we can have cache metrics. This shows cache hit ratios by feature category and call site, so then we can see where things are coming from and how things change.
This is a first step; he's going to be adding that everywhere. The idea is then also to build maybe structured keys, so when we do a scan of the Redis key space, we can see what a key belongs to, to reuse that kind of information in the key. But that's kind of far out.
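The "structured keys" idea mentioned here could look something like the sketch below: encode the attribution metadata into the cache key itself, so a later SCAN of the Redis key space can map each key back to a feature category. All names (`StructuredCacheKey`, the `cache:` prefix, the field order) are illustrative assumptions, not the actual GitLab implementation.

```ruby
# Hypothetical sketch: embed feature category and call site in the key
# so a key-space scan can attribute keys without extra lookups.
module StructuredCacheKey
  PREFIX = "cache".freeze

  # Builds a key like "cache:source_code:protected_branches:project-42".
  def self.build(feature_category:, call_site:, key:)
    [PREFIX, feature_category, call_site, key].join(":")
  end

  # Reverse operation, used when scanning the key space for attribution.
  # Returns nil for keys that don't follow the structured format.
  def self.parse(raw_key)
    prefix, feature_category, call_site, key = raw_key.split(":", 4)
    return nil unless prefix == PREFIX

    { feature_category: feature_category, call_site: call_site, key: key }
  end
end
```

The `split(":", 4)` limit keeps any colons inside the application's own key intact, so parsing stays lossless for keys that follow the convention.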
A
We can also add more metrics here. Right now it's just cache hit and miss that we measure; we also measure the duration of the cache generation. So if it's a cache miss, how long does it take to generate? That's in there. Other things that we could add: there is the size of the thing going into the cache, but there's some difficulty there regarding compression and whatnot, yeah.
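The wrapper being described, a drop-in `fetch` that records hits, misses, and generation duration labelled by feature category and call site, might be sketched roughly as below. The class name, metric names, and the plain-hash backing store are assumptions for illustration; the real implementation wraps ActiveSupport's cache and reports to the metrics system.

```ruby
# Minimal sketch of an attributed cache wrapper: counts hits/misses and
# times block execution on a miss, keeping attribution labels around.
class InstrumentedCache
  attr_reader :metrics, :labels

  def initialize(backing_store, feature_category:, call_site:)
    @store = backing_store # anything hash-like: responds to key?, [], []=
    @labels = { feature_category: feature_category, call_site: call_site }
    @metrics = Hash.new(0)
  end

  # Drop-in replacement for Rails.cache.fetch(key) { ... }
  def fetch(key)
    if @store.key?(key)
      @metrics[:hits] += 1
      @store[key]
    else
      @metrics[:misses] += 1
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      value = yield
      # On a miss, record how long the value took to generate.
      @metrics[:generation_seconds] +=
        Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      @store[key] = value
    end
  end

  def hit_ratio
    total = @metrics[:hits] + @metrics[:misses]
    total.zero? ? 0.0 : @metrics[:hits].to_f / total
  end
end
```

Because the labels are fixed at initialization, every call site gets its own hit ratio for free, which is exactly the breakdown shown in the demo dashboard.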
A
Yeah, so that's what Vasily is working on; I'll link the epic from the doc later. This is part of getting some attribution on cache usage, on Redis usage, related to stage groups.
B
Yeah, I guess it depends how compressible it is, but none of the things we're looking at here actually relate to the size anyway.
Right, and these are all pretty good. While we're looking at here, I've still got my tab on the Thanos tab. So what that tells me is that the content one has a very high hit rate and it's not very expensive to regenerate, whereas the protected branches service thing has a lower hit rate but is more expensive to generate, yeah.
A
And the thing is, it's a drop-in replacement for Rails.cache, well, for any ActiveSupport cache thing, whatever it is. So the idea would be to start using that instead. I don't know if there's something magical we could do that makes everybody use it, but this was easy to do because we already separated off the Redis repository cache, so yeah.
B
I think Andrew might actually be typing the same thing, or maybe it's a related point. Can we figure out how much attributed and unattributed cache usage there is by endpoint, like in the logs or something, and then sort of go back to stage groups and say, hey, you know, Source Code makes a bunch of requests that have cache attribution and...
A
It looks like Rails.cache.
E
My question is really: this is great, but unless we can use it in a way that we can drive change with it, it's not going to be as useful as it could be, right? So how do we use this? Is there a way that we can include this in a budget? Have you thought about that? And I'm sure there...
A
So that's the end game, but there are currently two ideas floating around on how to do that. One is measuring the throughput there with the custom cache, and the other is periodically scanning the key space, attributing it to feature category, and then including it in the report that Sean and Stephanie are building.
B
Just as a side point, another way we can actually use this is with MRs, right? Because I've had this in the past, where I've been reviewing an MR and someone added a cache, like a Redis cache for something, and I'm like, yeah, but what's the usage pattern of this endpoint? Is it called a lot with the same data, or is it called for different data all the time?
B
What's the actual expected hit rate of this? It's quite hard for someone to answer that, and it's also quite hard for us to answer after the fact how useful that cache is. But if we can put it behind a feature flag, or just measure it after rolling it out, then we can say: oh, we can see the hit rate for this specific cache, it's like 30%, and it takes no time to generate, so it's absolutely kind of pointless.
A
That's what we could do with the metrics already? Yes.
B
Exactly, that's what I mean, sorry. This is one way we could use this already, but that's in a non-structured way. That's not aggregating data; that's just saying, on a case-by-case basis, I want to know how effective my change has been or is going to be, and this is a way of showing that.
A
Yeah, so do we want to have that in something like error budgets that the teams are already looking at, or do we want to have a new thing?
E
And it's always those kinds of parts. The problem is that you've got to come up with a single number where you mix in, well, this is how much of this there is. Like with Postgres, right: this is how much autovacuum this group is generating, this is how much CPU.
E
This is how many connections. And then you've got to come up with a way of mixing all those things together into some sort of metric, and that's the really difficult part, because I think a lot of people will just think, well, it's not really a real number. It's a synthetic number, which it is, but you've got to make it so that it kind of reflects reality in some way. Yeah.
B
And we also can't measure Redis CPU. I know that for the cache most of it's going to be GET and SET, but we can't measure Redis CPU with it because it's a client-side metric, right? So if it's something like the issue we had with SMEMBERS or whatever before, we can't measure that in this way, unfortunately, yeah.
B
If it's called a lot, we don't capture that. But I think this is sort of down-the-line stuff; I think initially it's quite useful just for poking around manually, if nothing else. Like I said, the two hit rates you showed here are what, around 85% and around 98%, which both sound okay.
B
But it would be interesting to see that across a wider variety of cache identifiers and see: are there some that look really low, and if so, what are we doing with that cache? Why are we doing it? Although, I mean, Jacob's been working on a cache thing for something slightly different, and there are a lot of variables and there's not necessarily one answer.
D
Well, I think if you have a new data source, it's also okay to start with ad hoc analysis, learn what you can from the data source and how you can use it, and over time maybe realize things. Like with the Redis optimizations, I think it's become very clear that just the sheer volume of requests is often the most important, useful thing to optimize.
A
What's interesting about this as well is that we slot it in between. That means we can add more things to it. Right now we're just adding metrics, because that was the easiest thing to do at first, but we could start writing stuff out to files, whatever we want, now that we're in between, yeah.
A
Right now, it's for two endpoints that Vasily just had.
D
Well, that is another interesting problem to look at: how to get everybody to go through this mechanism.
A
GitLab-wide, yeah. We were thinking, because as we've seen from the cache instance, most things are Rails.cache.fetch, and it's a drop-in replacement for that. So my idea would be to drop it in by just doing a find and replace, let the specs call out if it doesn't work, have it with nothing set to start with, and then start attributing like we did for controllers and Sidekiq jobs and all that stuff.
E
Am I hearing it right that Rails developers aren't suggesting monkey patching the underlying Rails.cache, or whatever cache?
D
Ones it knows about. But Andrew, the problem as I'm hearing it is that we need to say somewhere who the owner is, so.
E
Okay, right. I mean, the other way that you could do it, and I don't want to spike things up too much here, but: if you're entering a block of code where the ownership of everything inside there, Postgres, Redis, changes to a different team, you could almost have a yield block or something where...
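The yield-block idea being floated here can be sketched with a thread-local ownership context that a block temporarily overrides and then restores. The module and key names below are illustrative assumptions, not GitLab's actual Labkit/ApplicationContext API.

```ruby
# Hypothetical sketch: a nestable, thread-local ownership context.
# Anything instrumented inside the block reads the current category.
module OwnershipContext
  KEY = :ownership_feature_category

  def self.current
    Thread.current[KEY]
  end

  # Override the feature category for the duration of the block, then
  # restore the previous value even if the block raises.
  def self.with(feature_category)
    previous = Thread.current[KEY]
    Thread.current[KEY] = feature_category
    yield
  ensure
    Thread.current[KEY] = previous
  end
end
```

The `ensure` makes nesting safe, which matters for exactly the case discussed next: a controller sets an outer owner, and inner code (say, a repository class) overrides it for its own section.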
A
Yeah, yes. Because you would add a block then in something; the repository class is maybe not a good example, but then we'd start, for example, at the controller, and as we see now, not everything within that controller should be owned that way.
A
Yeah, and there's also the thing that in some cases we push something onto the context, but it's not in a yield block, and then...
E
Yeah, like, say for notes or for to-dos or something; you add to-dos from all over the show. I don't know if that's a good example, but when you enter that code, it kind of, yes.
D
I think this is a fundamental problem for what we're doing with error budgets and our monolithic code base. I also remember from reading the SRE book that they were saying people are on the fence, or divided, on how to attribute resource usage.
A
No, I don't think it is, because we're going in a direction where we have these teams, we call them stage groups, and they're building their thing, and now we have teams on the other side, in reliability, that are the stable counterparts or whatever. Then it's going to be a collaboration: we know that this is for this feature, and then it's up to a small group of people to decide who's going to act on whatever it is.
D
That's not what I meant. We were talking about protected branches, and you're saying lots of different features use protected branches. So whose budgets or spends does a use of protected branches count towards?
D
If so, protected branches would be a Source Code feature, but if somebody other than Source Code does something that needs a protected branch, then we're saying that feature category's stage group is accessing the cache.
A
If pipelines need protected branches, to know whether the branch the pipeline is running on is protected or not, then they need a way to figure out if this branch is protected in an efficient way, and it's Source Code that's responsible for making sure that they can do that efficiently. That's my thing.
D
Yeah, and context objects get set at the controller level, on the outer layer, once something comes in. So they are never quite the right thing, because you then go through everything and you touch things that are owned by different teams at different layers of the call stack. But...
A
These things can be narrowed down. As Andrew mentioned, now we just set it on the outside, on the controller or the API endpoint or the Sidekiq job, but we could set it again, override it, for certain pieces of code. For example, when you do something inside the repository.rb class, it's owned by Source Code. So if you call out...
D
Yeah, but what you're then recreating is the call stack. We use thread-local storage to remember the context of the request, and then you have a long call stack, you go all over the place, and the Ruby language knows where you are in the call stack and what file you're in. I mean, it's about files, really: we want to attribute things to files, and files are owned.
D
Okay, or methods in files, but those are all things that are part of the language, of the call stack. So it seems like if, all the time when we're calling functions, we are maintaining this parallel call stack for our own observability, then we're slowing down Ruby and we're sort of recreating the structure of the code.
A
Did we pick the current number? The current number was picked because we used to call out to Thanos for every run, and then we added the cache, and then we added the more reliable cache. You remember the data warehouse in, what is it, one of the package components, where we upload the zip every time to GitLab.
D
Oh right, that crazy idea. I was selling it.
A
I guess because the CI cache might, at random moments, not be available, and then we'd have to refetch everything again.
A
We do that manually and it's stored in here.
A
But yeah, once you've extracted it, we now have about a year's worth of data.
A
Soon we'll have more, but I think a year might be good enough to show. I'm not hearing any objections. It does compress everything a little bit, but.
B
Well, the doc that you linked, it's like outlier support, so you're like, oh great, it can handle outliers, and what it says is basically: remove them. Once you've removed them, it will ignore the empty data points. That's how it supports them, but you have to remove them yourself, which from our perspective is the easiest thing to do as well; we don't have to do anything fancy with them.
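The "remove them yourself" approach described here can be sketched as a pre-processing pass: blank out points outside an interquartile-range fence, leaving nil gaps that the forecasting tool (Prophet, in this discussion) treats as missing data. The 1.5 × IQR fence and the helper name are conventional choices for illustration, not anything Prophet-specific.

```ruby
# Mask outliers in a time series before forecasting: values outside
# [Q1 - fence*IQR, Q3 + fence*IQR] are replaced with nil gaps.
def mask_outliers(series, fence: 1.5)
  sorted = series.compact.sort
  q1 = sorted[sorted.length / 4]
  q3 = sorted[(sorted.length * 3) / 4]
  iqr = q3 - q1
  low  = q1 - fence * iqr
  high = q3 + fence * iqr

  # Keep in-fence points; turn outliers (and existing nils) into nil.
  series.map { |v| v && v.between?(low, high) ? v : nil }
end
```

This keeps the series length (and thus the timestamps) intact, which matters because the forecaster aligns values to dates rather than to positions in a compacted array.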
B
Exactly. We know that, oh hey, everything blew up, so we don't need to consider that in the forecast if it wasn't germane to future trends. I mean, we do need to be a bit careful with that, because we need to be clear on the distinction between something that was a genuine outlier and something that we wish hadn't happened but was based on real usage or whatever, so yeah.
E
So when we get to two or three years, are we going to turn on annual trends?
B
That would help with the seasonality, right? Because I think part of the problem with the seasonality thing is that Prophet can detect these now, I think, but we don't like the...
B
Yeah, it would be nice to have metrics that have been around for that long and are stable, that mean the same thing for that long as well, which we do have quite a few of, so yeah.
A
I
think,
as
long
if
we
change
them,
like
we've
done
before,
we've
worked
with
the
V2
thing
and
so
on
to
have
it
separate
I
think
we
should
figure
out
a
way
to
not
do
that,
so
we
can
have
the
same
name
and
then
we
can
add
the
change
point
or
like
have
it
detect
a
change
point
but
yeah
for
now.
Let's
just
start
with
a
Year's
worth
of
data
and
see
where
that
gets
off.
The
last
thing
I
wanted
to
call
attention
to
was
the
postgres
primary
CPU.
The
issue
is
linked.
There.
A
People are working on it, but I'm just spreading the word a little bit, because this is something to worry about, and it's often not an easy thing to resolve. I don't have anything more than that to say about it. Okay.
F
That is a pretty short one, so there's still room for more points. I think in the last two weeks, I was explaining the mystery error with Redis cluster rate limiting. The patch was very simple, so I brought two issues to discuss with the maintainers, and I think they were pretty excited; they're pretty much all right with accepting it for version five, because that's the one being maintained now, but not 4.8.
F
The maintainers said that they are not open to backports. In that issue I had all the GitHub links. So yeah, I think it's a little bit about the monkey patch: you can't really put it in and then enjoy the code changes on the gem side. But the good news is, for Redis v5, when we upgrade, we can enjoy that change without having to do any patch of our own. I don't think we'll be getting it soon, though, because we are kind of blocked on 307, followed by Sidekiq 7.
F
Yeah, so other than that, not much for me. I could show the patch, because it's pretty simple, but I think in the previous demo I've shown a bit of it already, the version five patches; there's a select modification. So yeah, take over.
D
Thanks. I was on call as IMOC for a couple of days this week, and I was paged on Tuesday, as Andrew said, for an incident where we spent a long time trying to understand what the problem was, and there's one thing that maybe this is a good audience for.
D
Something I found confusing, and sorry, I didn't prepare this, but I'll just share my screen and quickly show it because that works better: we were getting an alert from patroni main, or rather this apdex thing was alerting, and we really had a problem on the CI cluster, but patroni CI doesn't have an apdex.
D
So
I'm
curious,
why?
Why
isn't
there
an
aptx
here.
E
The way that it's built is interesting because, sorry, just before you move on: does that mean that we have a lot of transactions that are being held open across both of the Postgres instances, and then if one of them slows down, the other one basically struggles because there are lots of client transactions? Is it something like that?
D
I
am
not
sure
I
was
in
the
incident
for
three
hours
and
by
the
end
of
that
it
was
still
not
clear
what
was
going
on
and
I
hand
it
over
to
the
next
iMac
and
I
haven't
caught
up
yet
on
what
the
cause
of
what
was
really
going
wrong.
But
it
took
quite
a
long
time
to
figure
out.
D
And there was a different incident yesterday, so I also didn't read up on what happened yesterday. But going back to Tuesday.
D
Another thing that was confusing, and I just want to highlight it here because people know about Sidekiq, is what the Sidekiq database load balancer does if it can't find a secondary that is up. Sidekiq database load balancing works by putting an attribute on the job that says: I want to see these WAL offsets. If it can find these WAL offsets on a replica, then it's going to use that replica and not the Postgres primary, and that's good, because then you're not using the primary.
D
The
problem
is
that
if
this
job
gets
picked
up
by
the
psychic
server
process-
and
it
doesn't
see
these
offsets,
it
uses
an
exception
to
force
a
psychic
retry,
because
the
I
think
the
idea
is,
if
we
wait
a
while,
maybe
the
replicas
have
caught
up.
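The mechanism being described can be sketched roughly as follows. The job carries the WAL location it needs to see; each replica is asked whether it has replayed past it; if none has, an exception forces a Sidekiq retry. Class and method names are illustrative, and LSNs are simplified to integers; the real GitLab code compares actual Postgres LSNs.

```ruby
# Raised to make Sidekiq re-enqueue the job, hoping replicas will have
# replayed the WAL by the next attempt.
class WalLocationNotReplayedError < StandardError; end

# job:      a hash with the "wal_location" the job must observe
# replicas: array of { name:, replayed_lsn: } stand-ins for connections
def pick_replica(job, replicas)
  wanted = job.fetch("wal_location")
  caught_up = replicas.find { |r| r[:replayed_lsn] >= wanted }

  # No replica has caught up: force a retry instead of silently
  # falling back to the primary.
  raise WalLocationNotReplayedError if caught_up.nil?

  caught_up
end
```

The subtlety discussed in this incident is that this same exception path also fires when the replica check fails for unrelated reasons, which makes a connection problem look like replication lag.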
A
Yeah, the delay there is the better option, rather than using the primary, but it's still not good and we'd still want to know.
D
Yeah. So that was very confusing, and then the other fun one was...
D
This class, this method, is where we check if a replica can be used, and notice this rescue here.
D
So
when
this
rescue
happens,
this
method
returns
false
and
if
all
the
replicas
return
false,
then
the
database
load,
balancing
code
and
psychic
raises
this
exception
of
I
want
to
be
retried,
so
what's
happening
here
is
that
we
were
masking
real
errors
and
instead
saying
oh,
there
is
a
database
load
balancing
problem,
because
the
problem
wasn't
even
that
the
replicas
weren't
up
to
date.
D
There
was
something
else,
I
think
the
the
the
thing
was
to
saturated,
like
all
the
the
back
end
connections
between
PG,
Bouncer
and
but
the
the
postgres
surface,
where
constantly
in
use.
So
we
were
getting
one
of
these
connection,
errors
which
are
which
include
PG
error.
D
So
this
is
the
idea
of
just
masking
all
of
these
and
instead
saying
well,
let's
retry
the
load
balancing.
It
is
not
good
for
discovering
what's
what's
wrong,.
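The masking problem can be illustrated with a pair of sketches: a broad rescue turns every failure, including real connection errors, into "replica not usable", while a narrower rescue only treats the expected error as lag and lets genuine errors propagate. The exception names are illustrative stand-ins, not GitLab's actual classes.

```ruby
class ReplicaLagError < StandardError; end  # expected, retryable
class ConnectionError < StandardError; end # real problem, should surface

# The problematic shape: any error at all becomes "not usable",
# so a saturated PgBouncer looks identical to replication lag.
def replica_usable_broad?(check)
  check.call
  true
rescue StandardError
  false
end

# The narrower shape: only the expected lag case returns false;
# anything else propagates and shows up in error tracking.
def replica_usable_narrow?(check)
  check.call
  true
rescue ReplicaLagError
  false
end
```

With the narrow version, the Tuesday incident would have surfaced the underlying connection errors directly instead of presenting as a generic load-balancing retry storm.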
A
Yeah, there is a connection error in there, yeah.
D
I'm not even sure we should be catching these, so.
E
But do we not want to just do that by getting Sidekiq to repost the job and retry it? That's...
D
Yeah, I don't know why I'm doing this to myself, but at least I get quick answers to why things work the way they work if I ask. Okay, I guess I need to make corrective actions now, but thanks for looking at this with me.
E
Yeah, but how do we, you know... we're not going to do another functional decomposition.
E
So what's the... I mean, I know that there are some fairly noisy pieces, like the container scanner or the license scanner or whatever it is. But in fact, do we have good accounting of exactly what's in there, how much of our resources those sorts of features are consuming? Is there something where we sort of...
A
We've done a few things that Nikolai from postgres.ai pointed out, some kind of report that highlights some expensive queries, but I think that resulted in two things that will have been done. But I don't know, we don't...
E
I think, you know, when infradev was happening, one of the things that really helped is if you could go there and really point and say: here is a big problem. The scalability team is quite uniquely placed to do that: really put it on somebody's lap and say, this needs to be fixed straight away, and then everyone can rally around that. And maybe it's up to the scalability team to help with, I don't know.
D
So
one
thing
I
found
strange
or
or
frustrating
in
in
the
that
first
incident,
I
was
in
I
mean
apart
from
I,
was
the
eye
marker
I'm
not
supposed
to
investigate,
but
even
if
I
hadn't
been
the
iron
mock,
I
I
was
would
have
been
pretty
helpless,
yeah
analyzing,
what's
going
on
with
postgres
I,
don't
have
the
expertise
to
look
at
a
melting,
postgres
server
and
see
what
it's
basically
doing,
but
how
how
many
people
know
how
to
do
that?
D
How
easy
is
it
to
do
that,
so
how
many
yeah,
if
we
can
find
things
and
then
we
like
Andrew,
says
we
can
hand
them
to
development
and
say
Here's
the
thing
that
is
inefficient
or
is
clearly
costing
a
lot
and
it's
something
done
about
them.
But
how
do
we
find
them
who's
able
to
find
them,
and
we
train
more
people
or
Empower
more
people
to
find
them.
E
I'm
curious,
you
know,
there's
all
that
stuff.
A
while
ago,
around
marginalia
plus
tying
that
to
statement
IDs,
there's
a
there's,
a
doc
in
the
Run
books
called
postgres
mapping,
something
like
that
which
describes
like
with
people
using
that
sort
of
information,
the
marginalia
and
the
and
the
and
the
PG
stat
statements.
And
if
you
can
kind
of
tie
all
those
things
together
now
and
it
gives
you
a
much
better
view
of
what's
Happening
inside
posters.
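The Marginalia-style attribution mentioned here works by appending a structured SQL comment to each query, which can then be matched against normalized entries from pg_stat_statements to map a query back to an endpoint. The helper names and field names below mirror the "key:value,key:value" comment style but are illustrative assumptions, not the exact GitLab comment format.

```ruby
# Append a Marginalia-style attribution comment to a SQL statement.
def annotate_query(sql, application:, endpoint:, correlation_id:)
  comment = "application:#{application}," \
            "endpoint_id:#{endpoint}," \
            "correlation_id:#{correlation_id}"
  "#{sql} /*#{comment}*/"
end

# Recover the attribution fields from an annotated statement, e.g. when
# correlating slow-query logs with application endpoints.
def parse_annotation(sql)
  match = sql.match(%r{/\*(.+?)\*/\s*\z})
  return {} unless match

  match[1].split(",").to_h { |pair| pair.split(":", 2) }
end
```

One caveat relevant to the pg_stat_statements side: Postgres strips comments before normalizing queries into query IDs, so the comment is visible in logs and pg_stat_activity, and the mapping doc describes how to join those sources back together.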
A
It's
still
a
manual
thing
and
I
I,
don't
know
off
the
top
of
my
head
to
figure
out
the
query
IDs
that
are
causing
a
problem
because
it.
D
Okay,
but
nobody
in
the
internet,
myself
included,
knew
that
that
was
there
or
how
to
find
it
or
to
start
using
that,
and
eventually
people
started
people
who
knew
enough
started
to
look
in
the
right
places.
D
I'm not in a lot of incidents, but what I found interesting was that the initial reaction here was more along the lines of: we need to increase the limits on PgBouncer, or we need to tweak some infrastructure knobs. Because I'm a backend developer, my first thought is: the application is doing something bad; we need to find out what bad thing the application is doing. Yeah.
D
What I wanted to ask is: for the people in here who observe more incidents, and I think there are some, how often is it a problem you solve by turning Postgres knobs, and how often is it the fault of the application, roughly?
B
This
is
another
thing:
I
mentioned
to
Jacob
is
I,
think
we've
gone
from
the
same
few
people
being
like.
Basically,
every
incident
call
which
is
bad
to
having,
certainly
on
the
internet
manager
side
like
a
huge
voter,
which
means
that,
like
like
Nick
upset,
he
doesn't
have
much
experience
to
be
an
assistant
manager,
even
though
he's
been
on
the
rotor
for
quite
a
while
because,
like
you
know
and
I'm,
not
saying
people
to
have
more
ships,
I,
don't
I,
don't
have
an
answer
to
this.
D
Well,
my
question
is
my
instinct
is
to
say
we
need
to
spend
more.
We
need
to
invest
in
understanding
how
the
application
is
driving
postgres
and
if
we
can
find
wins
there,
then
we
can
buy
Headroom
on
postgres,
but
I
don't
know
enough
about
the
actual
practice
of
what
goes
on
with
postgres
in
the
application.
If
that's
true.
E
So
so,
going
back
to
your
your
point
about
like
infrastructure
people
looking
at
infrastructure
knobs
to
turn
like
I,
think
that's
where
the
scalability
team
are
quite
uniquely
placed
because
they
can
kind
of
translate
between
the
infrastructure
world
and
the
application
World
much
better,
and
they
can
say
you
know
sure
we
can
change
this
PG
bouncer
connection
tool
but
like.
Why
do
we
fix
this
thing
in
the
application,
because
there's
a
client
transaction,
that's
holding
on
to
a
transaction
for
10
minutes
or
whatever
it
is
and
I
mean?
E
Maybe
one
of
the
things
to
do
is
to
start
saying
that
you
know
we
need
to
help
out
on
more
like
if
things
are,
if
the
temperature's
rising
and
things
are
kind
of
getting
more
scary.
Maybe
it's
time
like
we
become
more
involved
in
those
things
and
actually
because
I
don't
know
about
you,
but
I've
I've
been
very
disconnected
from
from
incidents
and
gitlab.com
for
a
while
I,
don't
I,
don't
know
what
they're
fighting
and
how
they're
fighting
it
and
but.
E
And maybe it's something where we should be suggesting application changes as well in some places.
E
Yeah, but I mean, common sense sort of dictates that if you have 500 database connections, and you have something that's taking a very long time, it doesn't matter how you shuffle the pools around for those database connections. There are still only 500 database connections, and the consumption is the problem, right? You've got a stream, but yeah.
B
Well, yeah, I think it's pretty clear that the Postgres saturation is the key issue. Like you said, Bob's right to keep raising it up, because, I mean, Hercules and I noticed the side effect of it when we were looking at just some endpoint that's called a lot, and we noticed that it got slower during some periods. There's no reason for the endpoint to get slower, but it turns out that the connection pool was slightly more saturated during that time.
A
I,
like
things
are,
things
are
moving
on
that
issue
and
people
are
doing
stuff.
The
thing
that
I
haven't
that
worries
me
is
that
I
don't
know
how
much
of
an
effect
they
will
have
like
I
saw
Nikolai
mention
an
issue
about
a
query
that
I
know
of
so
I
picked
the
right
teams
with
suggestions
on
how
to
fix
it.
But
I,
don't
know
if
that's
going
to
be
a
10
drop
in
CPU
utilization
or
not.
This
is
going
to
be
visible
at
all
or
not,
but.
D
We
got
better
at
this
with
redis
I
feel,
like
I
I
know:
we've
taking
shots
at
Red.
It's
like!
Oh,
let's
optimize
this
and
optimize
that
and
then
some
things
did
almost
nothing
and
some
things
paid
off
and
I
feel
like
over
time.
We've
gotten
a
better
handle
of
what
what
is
going
to
pay
off,
make
a
difference,
but
I
I,
don't
know
who's
been
doing
this
with
the
database
or
who
is
that
knowledge
has
been
doing
it
for
long
enough
to
to
recognize
the
good
opportunities.
B
I mean, for Redis that's famously been Matt, but yeah.
E
We have these reports, effectively linting reports, and no one looks at them. There are literally thousands of items in them, and it's this feature that GitLab's got, and every time you run a pipeline it's generating more of these results. We're not using them, and it's unusable because there's too much information in there, and I wonder how much of this is related to that as well.
D
I want to react to this, because this is a good example of something that looks fishy or that might be an opportunity. But the problem is, we have limited capacity to solve problems, so we need to pick the most impactful ones, and even if this is an impactful one on Dedicated, that doesn't tell us that it would be impactful on .com.
E
Yeah, but what I'm thinking is, if there are lots and lots of teams that have all got this turned on by default... I'm just trying to find a good example now, but actually, interestingly enough, the one I looked at has got zero, so maybe I'm being a bit unfair. Maybe it's already been improved. But let's just go take a look at this one. Yeah, I don't know, it sounds like we need to look into this a bit more.
E
I can't actually find the type of issues that I was complaining about, so maybe it's gone away already.