From YouTube: 2023-04-20 Scalability Team Demo
A
So this is an incident where we crossed the tipping point, and what you can see here is a high-level, generic representation of our connection pool becoming saturated. Obviously, right here, that's an effect, not a cause. It cascades to the rest of the application stack in kind of the usual way when more people want to run queries than the system can handle. So I'm going to dig from here into some causes.
A
There are a lot of interesting things to show, and I'm going to resist the temptation to go too deep into the tracing.
A
Okay, so this is our incident, and what we're looking at here is: a few times a minute, we take a snapshot of a standard Postgres catalog view called pg_stat_activity. It has one row per Postgres backend, and some of the really interesting data it gives is what that backend is currently waiting on, if anything. If it's active and running on CPU and not currently in a wait state, then there won't be a value for the wait event; but in this case we're filtering to when the wait event is the lock manager.
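A minimal sketch of the kind of sampling query described here; the team's actual collector isn't shown in the talk, so the exact shape is an assumption:

```sql
-- Count backends currently stalled on the LockManager lightweight lock.
-- Sampled a few times a minute, as described above. Note: this wait event
-- is named 'LockManager' in PostgreSQL 13+, 'lock_manager' in older versions.
SELECT now() AS sampled_at,
       count(*) AS stalled_backends
FROM pg_stat_activity
WHERE wait_event_type = 'LWLock'
  AND wait_event = 'LockManager';
```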
A
Lightweight locks in general are... let me take a step back, just because the vocabulary can be a little confusing. Normally, when we talk about database locks, we're talking about what is formally called heavyweight locks. This is a lock on a table or an index that says: hey, I'm using this table, please don't change its schema at this very moment, for example. When we need to update in-memory data structures safely and concurrently, we use kind of a mutex mechanism called lightweight locks, and in this case the specific flavor of lightweight lock we're talking about is the one that guards the hash table that records who holds heavyweight locks. That's why this lightweight lock is called the lock manager.
A
So I just want to clear up the terminology, because that can really throw people off. As that description kind of implies, when you see contention over the lightweight lock that guards access to the heavyweight lock table, what we're really seeing is contention over the frequency with which backends are trying to change the state of which tables and indexes they have locked.
A
So the resource we're starving for here is access to this particular lightweight lock, and it's specifically driven by how often we acquire and release heavyweight locks on tables and indexes. I'm kind of jumping into the solution before I've gone through the whole problem, but I'm gonna roll with it. By extension, this means that when we partition a table, say into 10 pieces, we now need 10 times as many locks, and it's not just locks on the table.
A
It's
locks
on
all
of
the
indexes
of
the
table.
So
if
you
run
something
like
something
like
select
star
from
projects
where
ID
equals
one
you're
going
to
acquire
the
lock
on
the
table
and
every
index
for
that
table
and
each
one
of
those
each
one
of
those
heavyweight
lock
acquires
requires
acquiring
and
releasing
one
of
the
lock
manager
locks.
There's
only
16
of
these
locks,
every
table
and
index
is
guarded
by
an
arbitrary
one
of
those
16.
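You can watch this happen from a session; a small sketch, assuming a "projects" table like the example above:

```sql
-- Relation locks are held until commit, so inspect them mid-transaction.
BEGIN;
SELECT * FROM projects WHERE id = 1;

-- One row per heavyweight lock this backend holds: the table itself
-- plus every one of its indexes.
SELECT relation::regclass, mode, fastpath
FROM pg_locks
WHERE pid = pg_backend_pid()
  AND locktype = 'relation';
COMMIT;
```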
A
But
what
we're
looking
up
to
here
is
things
like
schema
changes
where
we
do
partitioning
or
adding
indexes
or
or
something
even
even
less
obvious,
where
we
perhaps
change
some
application
codes
so
that
it
needs
to
do
an
extra,
join
or
or
similar
kinds
of
things
where
we
do
refactoring
on
on
a
query
that
to
split
into
two
part,
two
queries
instead
of
one,
if
those
queries
happen
in
separate
transactions,
which
is
very
likely
for
the
way
that
we
we
we
manage
our
connection
pooling
we'll,
have
to
acquire
and
release
those
heavyweight
locks
separately
for
those
two
queries,
so
some
very
benign
and
often
helpful
changes
on
the
application
side
and
the
schema
side
can
alter
the
rate
of
of
acquisition
of
this.
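To make the transaction-splitting point concrete, a hedged sketch using the earlier example table:

```sql
-- One transaction: the relation locks on "projects" are taken once and
-- released once at COMMIT, even across multiple statements, because a
-- lock already held is found in the backend's local lock table.
BEGIN;
SELECT count(*) FROM projects WHERE id = 1;
SELECT *        FROM projects WHERE id = 1;
COMMIT;

-- Split into two transactions (the usual effect of transaction-level
-- connection pooling): the same relation locks are acquired and
-- released twice, doubling the lock manager traffic for this work.
BEGIN; SELECT count(*) FROM projects WHERE id = 1; COMMIT;
BEGIN; SELECT *        FROM projects WHERE id = 1; COMMIT;
```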
A
And that's, I think, what we don't have a good way to measure at this point. Query rate is the best surrogate that we have, and it's not enough. When I say it's not good enough, what I mean is: during the incident, we knew that each replica DB was getting about 60,000 queries per second. When we reintroduced the two additional replicas, that went down to about 50,000 queries per second, and we were okay at that point, and we were not okay at 60k. But historically, a month earlier, we had been perfectly fine at 60k, and I think the difference is the rate of how often we acquire these lock manager lightweight locks.
B
Can I interrupt you and ask some clarifying questions? Because I want to keep up with what you're saying. So these are replicas; they serve read traffic. But read queries still require heavyweight locks? These are not locks that are happening because we're modifying something, but read queries require them anyway?
A
That's true. I should also clarify that the same thing happens on the primary database, but what we starved for was on the replica databases. The primary is just as susceptible, though, so yeah.
A
Yeah, there are several modes that heavyweight locks come in, and the weakest mode is called access share, and it only conflicts with access exclusive.
A
There's a big caveat that I'm trying to find the right place to add. Because this is historically a known contention point, Postgres introduced what it calls a fast path for trying to acquire a heavyweight lock, which essentially says: under the right conditions, we can avoid having to update the shared memory data structure and only update a local hash table instead. And the requirements... I'm going to gloss over the requirements for it for now.
B
So naively, every query needs to take a heavyweight lock, and they've done lots of optimizations to reduce the pain of this, but it still happens. And then the other problem you're saying is that, depending on things like joins in queries, you don't know how many heavyweight locks a single application query needs. So you can get unlucky and run a bunch of queries that don't hit an optimized path and do need a whole lot of locks.
B
And then we end up in this zone where we have to update one of these 16 things too often, and that creates contention.
A
Yes, great summary, thanks. Yep, absolutely. So, back to the graph: there are two visual aids I wanted to kind of talk through. One of them is this graph, where we can see... this is obviously the incident that we're talking about, but this is a stacked graph, grouped by replica. Sorry, the vertical axis is counting how many backends, at the moments that we polled the state, were stalled waiting for this particular kind of lightweight lock.
A
So if we go to just a line graph, we can see that, on the most badly impacted replica, there were 659 stalled Postgres backends, all of which were waiting for this type of lock. And this is really, really serious: practically all of them were stalled waiting.
A
These lightweight locks are acquired and released on a scale of microseconds, so there are many, many of these events that we're not actually witnessing. We can take this as a very small, representative sample of the amount of contention that's happening. And I wanted to mention that mainly because these other points in time, these other kind of normal weekdays, also exhibit this kind of contention, and we're only seeing the tip of the iceberg for them as well. As a practical matter, it's normal to have contention over, you know, locks and lock-like data structures for short periods of time. It's really only a problem when the contention escalates to the point where it impacts latency and limits throughput. And the difference between the workload on this terrible day and these more normal days wasn't too huge. It really was just that we pushed a few more percentage points toward this replica by taking a few of its peers out of use. Like, we went from about sixty thousand to fifty thousand when we kind of backed off during this day, which is the same place this adjacent day was at, and you can see that we still have contention on those days when we've got 50k queries per second hitting the replicas, but it's tolerable; we're not violating the Apdex. We most definitely were violating the Apdex on this day.
A
We are in a dangerous position for having this come back, especially if we take one or both of those old replicas away. I think most of the folks here are probably aware, but all of these database servers are running kind of old VMs; they're the N1 machine family, and we want to switch them to N2, or the AMD variants of N2. And these last two, the nodes 101 and 121, are specifically those newer machine-type candidates, and there's been some hopeful talk about: maybe it will be okay when we upgrade all of the replicas, or rather all of the nodes, to these newer machine types. They definitely performed better during the incident, and I'll kind of show here that we didn't have any samples where they were overwhelmed. But I also want to point out that they also only got a slight bump in the query rate.
A
So yeah, sorry, I kind of glossed over the details. Prior to the incident... let's see... 04 is the primary, so I'm going to leave it out of the mix here. Prior to the incident, we had 01 through 03 and 05 through 07, and those are the old six, and then 101 and 102 were the seventh and eighth nodes. We had removed nodes 06 and 07, which are old nodes, the day before the incident, during the APAC shift, and...
A
The old ones, that's correct, yes. And then we put the old ones back in service to prevent a recurrence of the incident the next day, so yeah. Why did they do better?
A
I think they did slightly better, and it was just barely enough. But that's a fantastic question, and that's kind of where I was going. For purposes of this conversation, I'm going to talk about CPU speed really in terms of instructions per second, so assembly instructions per second. If we say, and I'm totally making these numbers up by the way, that these N2 machines were able to achieve 10 or 20 percent more instructions per second, then the amount of time spent holding each occurrence of this lock manager lightweight lock would be proportionally shorter and therefore less likely to be contended. So I think that this gives us marginally larger headroom, because you only get contention when the rate of acquiring the lock times the mean duration of holding the lock runs up close to the saturation point.
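A rough back-of-the-envelope form of that saturation condition (the symbols and numbers are illustrative, in the same made-up spirit as the speaker's):

$$ U \approx \lambda \cdot \bar{t}_{\text{hold}} $$

where $\lambda$ is the acquisition rate of one lock manager lock and $\bar{t}_{\text{hold}}$ is its mean hold time. For example, $\lambda = 3 \times 10^5$ acquisitions per second with $\bar{t}_{\text{hold}} = 2\ \mu\text{s}$ gives $U \approx 0.6$; queueing delay grows sharply as $U$ approaches 1, which matches the cliff-like tipping point described here.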
And I think that we're just barely avoiding that under normal circumstances on the old nodes, and the new nodes have a higher instruction throughput, so they are a little bit further away from the lock being held 100 percent of the time. Does that make sense?
C
Yeah, thanks. Then everything we're looking at here is suffering from this, like the resolution problem that you brought up. So even the new nodes that barely show up when you highlighted them could already be super close to the tipping point; they're just a little bit farther off than the older ones.
A
Yes. We only see spikes on this graph once we've reached that saturation point. So we could be 90 percent of the way there and not see anything here; it's not until we actually cross that line. And this is very susceptible to, I don't want to call them microbursts, but, you know, small-scale variations in the lock acquisition rate can cause spikes on this graph and lead to contention.
A
So, the other thing... I really do need to find my other issue; why did I not keep that tab open? I think this is... yeah, okay. So this is just kind of walking through the effects of the saturation; I'm going to gloss over that. This is the graph we were just looking at, zoomed in on that day, so it's a little bit easier to see the shape of it.
This is, I think, a two-week timeline, so we can see adjacent days and how it looks there. Let me just find a green version of what we were just looking at.
Yes, this one. Okay, so I mentioned that these particular locks, the lock manager lightweight locks, are acquired on a microseconds time scale.
I'm going to gloss over the details of how I captured this, unless someone's interested, because I find it very interesting, but I'm going to just focus on the data right now. So, for a 10-second time span, we captured every time we tried to acquire one of these 16 lock manager locks and it was not immediately available.
A
In other words, when it was contended, how long did it take to acquire the lock? This is the distribution we're seeing. So in 10 seconds we had maybe 45,000 or 50,000 points where it was contended, and most of them resolved in less than four microseconds. But the long tail is what really worries me. And this was just a random 10 seconds, by the way; it wasn't during an incident. This was just: I got up, I ran my capture utility, and this was the result. There's a lot of variation in this long tail at different points in the day, and I've seen this long tail in production go several times higher, where we're waiting for several milliseconds to acquire the lock. I suspect those are the stalls that we were actually witnessing in the spikes we were looking at on the other graph. So understanding the long tail, why we have these ridiculously long stalls, is...
A
This is one of the open questions that I'd like to try to answer, and I've got a few hypotheses about what could be causing it. Some of them I've ruled out, and some of them are still kind of in the running. For a while I thought that hyperthread contention might potentially play a role, and I think that's unlikely at this point, because... sorry, I shouldn't talk about discarded theories. So this is an open question. But I guess the main reason I wanted to show this distribution is to put some hard data in front of you. Oh, and I forgot to mention: there are two lock modes that come into play with lightweight locks, a shared mode, which is a reader lock, and an exclusive mode, which is really a writer lock. These lock manager locks are almost always acquired in exclusive mode.
There are cases where it's acquired in shared mode, but the analysis that we've done here really highlighted that it's the exclusive mode where the contention is happening. And if we capture every event, every call to LWLockAcquire, rather than just the contended ones, it highlights that we far more often acquire in exclusive mode rather than shared mode.
So this is the mode to focus on, and unfortunately that means that contention escalates very quickly. I'll stop sharing now, just because there's nothing interesting to see there.
A
I think this will make everyone much more comfortable in planning out our capacity requirements, as well as being comfortable making query changes or schema changes. We know that some of our tables are too big, and we know that the solution to this is to partition them, and there are lots and lots of great benefits to doing this partitioning.
But that is also something that can potentially increase the risk of this lock manager LWLock contention, particularly if we have any queries that don't do efficient partition pruning as part of the planning process.
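A quick way to see whether a query prunes partitions at plan time is to check its plan; a sketch, assuming a hypothetical "events" table partitioned by project_id:

```sql
-- If pruning works, the plan scans (and therefore locks) only the
-- matching partition; without the partition-key filter, every partition
-- and all of its indexes must be locked.
EXPLAIN (COSTS OFF)
SELECT * FROM events WHERE project_id = 1;
```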
A
Exactly, and to be perfectly honest, that's one of my biggest concerns: making developers afraid to move queries off the primary and onto the replicas. It's super helpful for them to do those migrations, because we can scale out the replica fleet, and we can't scale the primary without a major architectural change.
So yeah, I want them to feel comfortable doing that very important work. I'm glad you mentioned the fast path again. I'll put this in the issue, but I wrote up a really concise bullet-point list of the prerequisites for hitting the fast path, and many of our queries do hit it. So one of my ideas for follow-up is to try to identify queries that don't.
Like, one of the requirements is: each backend has 16 slots for recording fast-path locks, and if we fill up those 16 slots within the transaction, we can't use the fast path anymore for the rest of the queries in that transaction. And effectively, that's a hard-coded number, so... but...
B
And, like, to focus on good metrics, so that we can do the capacity planning. Yes.
A
Exactly. So that's kind of the way I'm leaning, although I also kind of want to be able to give advice about, you know, how to construct queries to hit the fast path more often. But really, I agree: I don't think it's practical to expect everyone to become an expert in this particular, really weird optimization goal.
A
Yeah, yeah, exactly. Kind of crystal ball time: I think one of the easiest ways for us to run into this problem is to add an index to a frequently accessed table. And perhaps, you know, some set of existing queries were already joining that table to one other table, and the total number of indexes across those tables...
So you've got two tables, and if you've got 14 indexes on those tables and we add one index to either of those tables, guess what: you no longer qualify for the fast path. So now, every time we run that particular query, it has to go to the shared memory lock table and compete for these LWLocks. And this is no one's fault. It wasn't a code change; it was a schema change, or vice versa.
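A rough pre-flight check for that failure mode; "projects" and "issues" are hypothetical example tables, and the threshold is the 16 fast-path slots mentioned above:

```sql
-- Count the relation locks a two-table query would need: each table
-- plus all of its indexes. A result above 16 means the query can no
-- longer take all of its locks via the fast path.
SELECT count(*) AS relation_locks_needed
FROM (
    SELECT 'projects'::regclass AS rel
    UNION ALL
    SELECT 'issues'::regclass
    UNION ALL
    SELECT indexrelid::regclass
    FROM pg_index
    WHERE indrelid IN ('projects'::regclass, 'issues'::regclass)
) t;
```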
A
I'm gonna show why the answer is no. And Bob, you may have seen this, but this is cool enough that I think it's worth 60 seconds, if I can find the thing in that short amount of time. Oh no, the zoom widget's in the way. Is this it? That's not it!
A
That's it! Okay, okay. So the way BPF instrumentation works is: you want to catch calls to a function, say the entry to a function like LWLockAcquire. The way that works is, in the binary, the first instruction that's part of entering that function gets replaced with a single-byte instruction called int3, which is effectively a call out to an interrupt handler.
Every call to that function is going to hit this instruction, and whatever the original instruction was, whether it was a single-byte or multi-byte instruction, we will run that instruction after calling the hook that the interrupt handler is handing off to. So what we're seeing in this trace is: we're observing with perf-based CPU profiling that's capturing a stack trace from our Postgres processes 99 times a second.
So this is just standard CPU profiling on a timer basis, and I ran that for 60 seconds. During that 60-second window I also ran the 10-second BPF tracer, and extracted only the stacks where that tracer was active, where we're calling the LWLockAcquire function. And you can see, just proportionally on the graph, this is how much time we spent in LWLockAcquire normally.
B
It's because eBPF runs in the kernel, so you cannot run your BPF program without going into the kernel and back. So, like, you never want to do extra function calls, and there's a reason this isn't... it's almost always terrible to make a user-space function into a system call, yeah.
B
Frequency, that's true. If it's something like you're doing writes on a socket, then that doesn't happen that often, and it's a system call anyway. Actually, then you wouldn't even have to, because it is a system call.
A
Yeah, exactly. So that's why we can't do this approach. There are a few other models we could maybe use. This is highly speculative and may not actually be practical, just to preface it with a caveat. The two kind of harebrained ideas I'm entertaining right now are: one, we could...
We could do the lightweight CPU profiling with perf that we usually do, and count how many times we observe LWLockAcquire being called in that timer-based sampling.
The big caveat is that doesn't tell us which kind of lightweight lock, and there are, I don't know, 80 or so different kinds of lightweight locks, so it's not always going to be this kind. But we can infer from the rest of the stack, because... oh yeah.
I think so. So that's a maybe. It's definitely lightweight enough to do. It may be useful or it may not be useful; I'm not sure yet, I haven't tried it.
B
The problem with any of these things is, if you want to have a utilization metric, like, what is 100 percent? Yeah.
B
Also, if we were able to get those histograms: you don't want high tail latency, but, like, what is the worst that it can get? So if we could measure how many things are waiting, how long is the queue allowed to be?
A
Yes, yeah, exactly. And the wait event sampling that we're doing from pg_stat_activity: once we've crossed that line into having contention, we can kind of estimate it from there, but before we reach that point, it's really hard.
A
So the other harebrained idea I have, which also may not work out, is: there's another catalog table called pg_locks that describes what heavyweight locks are currently held by open transactions, and one of the fields it includes is whether we acquired this lock through the fast path. So that can let us differentiate between whether we're acquiring the lock on the fast path or not.
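A minimal sketch of what that sampled metric could look like; the aggregation is my assumption, not a query shown in the talk:

```sql
-- Share of currently held relation locks that bypassed the shared
-- lock table via the fast path. Sampled like pg_stat_activity above.
SELECT count(*) FILTER (WHERE fastpath)     AS fastpath_locks,
       count(*) FILTER (WHERE NOT fastpath) AS shared_table_locks
FROM pg_locks
WHERE locktype = 'relation';
```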
Polling that the same way we poll pg_stat_activity, you know, a couple of times a minute, may not give us a lot of insight. I'm not really sure, but I figure it's worth a shot. So that's the other thing on my list to try out, to see if we can (a) get useful information from it and (b) come up with a way to make a metric out of it. That's as far as I've gotten so far on ideas for how we can make a metric to measure utilization, or...
A
And how many of the backends are holding more than 16 locks that were not on the fast path... or, I guess that's a little silly, because... never mind, forget that. If it wasn't fast path, it doesn't matter why it wasn't fast path. It wasn't fast path, yeah.
A
Yeah. So that's kind of the whirlwind tour; hopefully that was interesting. And, circling back to what we do about it: I think it's important to get a utilization metric, and it's also important to avoid incidents. I feel like we're currently in a dangerous spot where, if we lose a replica, whether it's planned or unplanned, we'll be in a state where we could potentially, you know, have an incident. So I think we should add at least one additional replica, like, nowish.
B
There's one more question I wanted to ask about this. You said this is an old problem in Postgres that they've been optimizing over time, yes? What do other people do about this?
B
Yeah: other big Postgres sites that run into this bottleneck, what did they do about it?
A
Well, the last time I ran into this bottleneck, at a previous company about six years ago, we did actually rewrite queries. We identified which queries were the greediest consumers, and we rewrote them with this as an optimization goal: reducing the acquisition rate for the lightweight lock.
In our case, we knew what the candidate queries were, and it was a small enough number that we could afford to manually look at them and do some tuning, and that got us most of the way out of the pinch. I also built a custom Postgres build that gave us more lightweight locks in the pool, and a few versions later the community did the same thing, which is why we have 16 now as a standard.
B
I guess that's... is adding replicas something people do?
A
Yeah, yes, I think so. I'll see if I can dig it up: I think I saw a write-up from one of the Postgres-as-a-service clones, it might have been Aurora, but I can't remember for sure, that was talking about how to interpret contention over lock manager lightweight locks in particular.
So that's exactly the problem we're talking about, and they were talking about increasing the number of replicas. Which, given that their Postgres-as-a-service makes more money if you run more replicas, I kind of take with a grain of salt, but it is good advice also. So yeah, great question. I guess that's kind of my half-baked answer: yeah, adding replicas, and trying to optimize queries to reduce the acquisition rate.
I guess, from our perspective: we do a lot of caching of query results. So from the application side, if there are frequently run queries that we could afford to cache more aggressively at any of the other layers, that's another way to mitigate, by just reducing the lock manager acquisition rate.
A
Cool. I've only documented about half the research I've done so far, and a lot of the flame graphs we looked at are not in the issue yet, but I intend to get them in there tomorrow. So if you happen to want to see more of that, give me a ping, or just wait a couple of days and they'll be there.