From YouTube: Scalability Team Demo 2022-12-08
A
So today I wanted to share just some things I've been looking at for container memory saturation. This came out of a couple of capacity planning issues, the initial one being one where we said that our web service was saturating on container memory, and I think Bob asked Matthias from the application performance team, like, hey, what's up with this? And Matthias was like, oh, we're working on it. A couple of months later Bob says this is still saturated.
A
And Matthias was like, it shouldn't be; you know, the metrics I'm looking at say it's fine. So we discovered there are some significant differences there, some of them, you know, fine: Bob was looking at container-level metrics and Matthias was looking at process-level metrics. So obviously the container-level metrics may include memory that's not included in that particular process. But we also discovered, or I say we, Matt pointed out, showed us, that the current metric we are using is not suitable.
A
So cAdvisor essentially reports three different types, three different metrics, for memory usage in a container, cAdvisor being the thing that takes the cgroups memory data and reports it out as Prometheus metrics. So we have usage, which is everything, so that sounds like what you want, but the problem is it includes things that will be evicted before an out-of-memory kill happens. So, for instance, if you have file-backed memory that's inactive, that can be reclaimed by the OS before an out-of-memory kill happens. So even if your usage goes up to 100%, you might not necessarily get out-of-memory killed; those pages can just be reclaimed.
A
But this sounds like what you want again; the second one, working set, sounds plausible: taking the total, subtracting the things that can be easily evicted in the case of memory pressure, and calling it good. But this does still include active file-backed pages, and Matt pointed out that, you know, in memory pressure cases the OS can also reclaim those, even though they're active. So it will result in some, you know, thrashing; presumably the program that's running will immediately need to get that memory back.
A
It's
not
guaranteed,
and
so
it
also
represents
an
over
count,
because
that
memory
can
be
reclaimed
by
the
OS
and
it
isn't
directly
attributable
to
the
application
anyway.
Necessarily
so
you,
you
can't
necessarily
say
that
this
this
shows
this
application
is
saturating.
Our
memory,
just
because
working
set
size
is
very
close
to
the
memory
limit
for
the
container.
A
So what we want is RSS, resident set size, which is only the anonymous memory (so not the file-backed memory) plus swap, but we don't use swap, so it's just anonymous memory. That sounds good. So I have a chart somewhere, which I should probably load up, which shows what happens if we switch these. So sorry, I should have prepped this a bit more.
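To make those three cAdvisor metrics concrete, a minimal sketch (not part of the demo) that pulls them from a Prometheus server is below. The metric names are the standard cAdvisor exports; the Prometheus URL and the container selector are placeholder assumptions.

```python
# Minimal sketch: compare cAdvisor's three per-container memory metrics.
# Assumes a Prometheus server at PROM_URL; the container label value is a
# placeholder. Only the standard /api/v1/query endpoint is used.
import requests

PROM_URL = "http://localhost:9090"   # assumption: local Prometheus
CONTAINER = "fluentd"                # placeholder container name

QUERIES = {
    # everything charged to the cgroup, including evictable file-backed pages
    "usage": f'container_memory_usage_bytes{{container="{CONTAINER}"}}',
    # usage minus inactive file-backed pages
    "working_set": f'container_memory_working_set_bytes{{container="{CONTAINER}"}}',
    # anonymous memory (plus swap, if the container uses it)
    "rss": f'container_memory_rss{{container="{CONTAINER}"}}',
}

for name, query in QUERIES.items():
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": query})
    resp.raise_for_status()
    for series in resp.json()["data"]["result"]:
        timestamp, value = series["value"]   # instant vector sample
        pod = series["metric"].get("pod", "?")
        print(f"{name:12s} {pod:40s} {int(float(value)) >> 20} MiB")
```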
A
So let me just share that. These charts are going to be a bit noisy, but I've excluded the Rails app here, just because we know this works better for the Rails app; this was about exploring what happens with other services. So the top chart will be working set size and the bottom chart will be resident set size.
A
Also note that, because we're doing this max by here, that does make some of this stuff harder. Not harder to understand, but you will need to drill down to understand what's going on, because every time you do an aggregation you're sort of throwing away a layer of data that you could use, so you then have to undo the aggregations. In our case that's actually pretty easy, so it's not a huge problem, but I just want to call it out.
C
A
Into that, exactly, and we have to do this in capacity planning quite often. Oh good, it turned out I could probably do it in a day or two days. It's not a huge deal.
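As an illustration of undoing an aggregation when a max-by chart needs explaining, a rough sketch follows: run the aggregated expression, then the same expression grouped one level finer to see which underlying series is actually driving the max. The label names and the Prometheus URL are assumptions.

```python
# Sketch: drill down from an aggregated max to the per-pod series behind it.
# Label names ("type", "pod") and the Prometheus URL are assumptions.
import requests

PROM_URL = "http://localhost:9090"

AGGREGATED = "max by (type) (container_memory_working_set_bytes)"
DRILL_DOWN = "max by (type, pod) (container_memory_working_set_bytes)"

def instant_query(expr):
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": expr})
    resp.raise_for_status()
    return resp.json()["data"]["result"]

# The aggregated view shows *that* the max moved; the finer grouping shows
# *which* pod moved it (e.g. one fleet stepping down while another steps up,
# leaving the aggregate looking flat).
for row in instant_query(AGGREGATED):
    print("aggregate ", row["metric"], row["value"][1])
for row in instant_query(DRILL_DOWN):
    print("drill-down", row["metric"], row["value"][1])
```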
A
So a good example of that, Igor, is actually, I think, the Go memory metric for monitoring, where we saw that step up because of the OS upgrades we did for the Postgres servers. That meant that the Prometheus DB servers were the ones with the highest level of memory utilization, and then, when they stepped down, we didn't see a step down in the max, because something else (I think the Gitaly ones) had added cgroups, and the max had gone up again, but for a different reason.
A
So when the cause for the first step went away, you didn't see that reflected in the charts, because it's aggregating across everything. So it's kind of hard to untangle that sometimes and say what is reasonable. And I'm not saying we should do this at a more granular level, because I think we already have too much noise in these metrics, but yeah, okay, fine.
D
A
If this isn't going to time out, I can just show these charts. So this was from a couple of weeks ago, or no, it's not, it's from over a month ago, wow. So most services (sorry, one is here, in both) look kind of similar; like I said, it's kind of noisy.
A
What I want to look at is logging, which goes from this, so, you know, eighty percent then down to 40 percent, to this, which is a minimum of eighty percent, very, very close to 100. So this is using resident set size, which should be better.
A
We investigated that in a separate issue. I say we; again, obviously Matt. What this was down to was (okay, fine, I don't know why Thanos doesn't like me today) lazy-freed memory. So this is where a program can say to the OS: this block of memory, I'm done with it for now, but I might want it back. So next time it allocates memory,
A
it can get that memory back without causing a page fault, and that is included in RSS but not included in working set. So then we have these Elastic, sorry, fluentd containers that show this effect. Let me see if I can load anything in Thanos. Yeah, okay.
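Since Python 3.8 the standard mmap module exposes madvise, so the lazy-free mechanism described here can be sketched directly on Linux. This only illustrates the mechanism; it is not the fluentd or Ruby code path.

```python
# Sketch of lazy freeing: fault in some anonymous memory, then tell the kernel
# we are done with it for now via MADV_FREE. The pages stay charged to RSS
# until the kernel actually reclaims them (typically under memory pressure),
# which is why RSS can read higher than working set in this situation.
# Requires Linux and Python 3.8+; MADV_FREE may be absent on other platforms.
import mmap

SIZE = 64 * 1024 * 1024  # 64 MiB of anonymous memory

def rss_kib():
    """Read this process's resident set size from /proc (Linux only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return 0

buf = mmap.mmap(-1, SIZE)        # anonymous private mapping
buf.write(b"\xff" * SIZE)        # fault the pages in so they count in RSS
print("RSS after touching pages :", rss_kib(), "kB")

if hasattr(mmap, "MADV_FREE"):
    # "I'm done with this for now, but I might want it back": reuse needs no
    # page fault, and the actual reclaim happens lazily.
    buf.madvise(mmap.MADV_FREE)
    print("RSS right after MADV_FREE:", rss_kib(), "kB  (usually unchanged)")
else:
    print("MADV_FREE is not available on this platform")

buf.close()
```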
A
I was just checking I could do that. So this is the ratio of working set to RSS, but it might be better to do something like...
D
A
Site... that's probably a bad name too. Let's call it... so we already have a type label; now, we don't have that on these, because they're not the labeled ones.
A
So, from looking at the individual containers, Matt saw this was due to lazy-freed data, which is included in the RSS metric but not in the working set one, which is a bit annoying. So...
A
The name is meant to indicate it's lighter than a thread, right; it's a concurrency primitive that the Ruby runtime manages, whereas Ruby uses OS threads for Threads. And when the fiber goes away, its stack gets freed, and if you have madvise available, it will use madvise free. And so I was able to write a test program that just generated a lot of fiber stacks and then immediately threw them away, and I could see that lazy-freed memory increased.
A
Because of that, that seemed plausible. But last week I found out that was wrong, because this code isn't in the version of Ruby that is on the containers we're running fluentd on. This was added in Ruby 2.7 and we're running Ruby 2.6. They do use jemalloc, which can use madvise free, so I need to go back and check if that's the cause, but either way that doesn't really fix
A
this. I think what we need to fix this, and what I'm working on once I can get a reproduction case, is to get cAdvisor to be able to report the active and inactive anon metrics from the cgroup directly, and then we can just sum those, because that's the metric that we want: just the anonymous memory usage. I think Matt said that lazy-freed memory ends up in active file, which is not really accurate, but it is because it's not anonymous, essentially, and because it's active, so it kind of ends up in this weird-sounding bucket.
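The anon numbers described here already exist in the cgroup's memory.stat file, so a rough sketch of the value the proposed cAdvisor change would report, read straight from cgroup v1, could look like this. The cgroup path is a placeholder, and the field names differ under cgroup v2.

```python
# Sketch: compute "just the anonymous memory" for a container directly from the
# cgroup v1 memory controller, the way the proposed cAdvisor change would.
# The path is a placeholder; inside a container it is often just
# /sys/fs/cgroup/memory.
CGROUP = "/sys/fs/cgroup/memory"   # assumption: cgroup v1 memory controller

def memory_stat(path):
    stats = {}
    with open(f"{path}/memory.stat") as f:
        for line in f:
            key, value = line.split()
            stats[key] = int(value)
    return stats

stats = memory_stat(CGROUP)
anon = stats["active_anon"] + stats["inactive_anon"]       # the metric we want
file_backed = stats["active_file"] + stats["inactive_file"]

print(f"anonymous   : {anon >> 20} MiB")
print(f"file-backed : {file_backed >> 20} MiB")
print(f"rss counter : {stats['rss'] >> 20} MiB")   # cgroup v1 also reports rss
```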
A
So that's where I am with that. In lieu of updating cAdvisor, another option might be to try and fix this on the metric side. I'm not super convinced about the options here, but a couple we have are: we could say that some services or components use working set and some use resident set, so we could say that fluentd uses working set and everything else uses resident set, essentially; or we could say just take the lower of the two, because in both cases the estimation error is an overestimate.
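For the take-the-lower-of-the-two option, a rough sketch of what the saturation calculation could do is below: fetch both series and keep the per-container minimum. Joining on the id label and the Prometheus URL are assumptions.

```python
# Sketch of the "lower of the two overestimates" option: fetch working set and
# RSS per container and keep the smaller value. Joining on the "id" label
# (the cgroup path) is an assumption about how the series line up.
import requests

PROM_URL = "http://localhost:9090"

def fetch(metric):
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": metric})
    resp.raise_for_status()
    return {
        row["metric"].get("id"): float(row["value"][1])
        for row in resp.json()["data"]["result"]
    }

working_set = fetch("container_memory_working_set_bytes")
rss = fetch("container_memory_rss")

for cgroup_id in sorted(working_set.keys() & rss.keys()):
    lower = min(working_set[cgroup_id], rss[cgroup_id])
    print(f"{cgroup_id}: {int(lower) >> 20} MiB")
```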
A
So we want the lowest overestimate, but I'm a bit worried with both of those that it gets too confusing. So I mentioned at the start the drill-down thing, where you have the max across everything within this service, every container.
A
If you have the max across a synthetic metric with everything in this service, then I'm a bit worried that actually investigating these issues will become even harder. Because at the moment, for instance, you can have a case where, say, in the logging service, one single fluentd container is at, say, 80% memory usage, and then that container goes away, and then a pubsubbeat container goes up to 81%, and so on.
A
The chart then just looks like a fairly steady line, but it's actually two completely different things that are happening. So if we also consider the possibility that it could be two different metrics, I'm worried that gets a bit too confusing. Basically, the current memory metrics we have aren't super useful, though, because of this accounting issue. There was something else I wanted to mention there, but I've forgotten what it was.
A
Oh yeah, Go. So I mentioned Ruby uses that when it frees fiber stacks; Go also did this for a while and then stopped using it, because people kept complaining that the memory metrics reported were confusing in this way.
A
Sorry, the Go runtime did do this, and then stopped doing it a while ago. I created an issue to ask Ruby to stop doing it but, like I said, in the container that we're actually running, that we see this on, Ruby isn't running a version that does this anyway. So it's not coming from the Ruby runtime directly anyway, so that wouldn't get us out of this problem.
A
At the moment... So previously, when Matt looked at this... let me find the issue, I think.
A
I remember that, but I can't find it in the issue. But yeah, there was... was it just, like, a stripped... was the problem that there was a stripped binary or something, or what was the issue? Yeah.
A
It was coming from the Ruby process, but that Ruby binary, like, that Ruby version, doesn't use madvise free directly, so this is jemalloc, because what
C
else is it going to be? So, well, my follow-up question was: if it is jemalloc, could there be an option for jemalloc to inhibit that, or...
A
So, yes, I believe I did make a note of this somewhere. I think there is an option for jemalloc to disable that; let me see if I can find it.
C
A
I just don't remember if it's a build-time option or not for jemalloc. I think I remember it being that, but I can't find my note with it. But yes, so what I've been trying to do, and failing so far, is to reproduce this just using a local setup: just have Prometheus, cAdvisor and fluentd running, with the same version of fluentd (the other versions don't really matter too much) and cgroups v1, because that's what we're using. I haven't actually been able to reproduce it yet.
A
When I run it locally, the two metrics track each other pretty closely, so I need to dig into that a bit more.
A
You know, it's quite easy to write a reproduction case directly just using madvise free, but I'm a little bit concerned that if I can't reproduce the exact case that we're seeing, and we don't know exactly why we're seeing it, then I might not be looking at the right thing, because I've already been looking at the wrong things several times when I've been looking at this. So yeah, that's where I am with that. I'm just going to peter out, unless anybody else has got anything else to say.
E
So the MR merged yesterday caused some problems, thankfully just in staging. It's really weird, because its effect also showed up like a step, and it is likely because of the MR where we shifted the feature flag into the instrumentation layer.
E
But the odd thing is that I found out about this while working on the MR; it showed up in one of the pipeline runs, and I caught it and I fixed it. And then, when it was accidentally reintroduced, the tests that caught it originally didn't fire. So that was the strange part that I'm trying to replicate now; I'm trying to understand how it's not caught by the specs. So for that, what I'm doing right now is breaking
E
the original MR into smaller pieces that are more just a pure refactor, and then fixing the other things I found. I found one issue, yeah, and a solution, so.
A
Just to clarify, the issue here is that we introduced a feature flag check inside our Redis instrumentation, but our feature flag checks use Redis, and we were trying to cache the result of the feature flag check to resolve that. But obviously that doesn't work if you need to hit Redis anyway, because you will end up hitting Redis to find out what the feature flag is.
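To make the recursion concrete, here is a small self-contained sketch (all names are made up; this is not GitLab's code) of a flag check inside Redis instrumentation, together with the workaround discussed below: only performing the check for multi-key calls, since the flag lookup itself is a single-key read. Caching the flag value alone would not break the loop, because the very first lookup still goes through the instrumented path.

```python
# Sketch of the recursive feature-flag problem and the multi-key workaround.
# Everything is hypothetical: the fake Redis client and flag storage stand in
# for Flipper/Redis in the real application.

class FakeRedis:
    """Stand-in Redis client so the sketch runs on its own."""
    def __init__(self):
        self.data = {"feature:instrument_cross_slot": "on"}

    def call(self, command, *keys):
        return self.data.get(keys[0], "OK")

redis = FakeRedis()

def feature_enabled(name):
    # The flag lookup is itself a single-key Redis read, so it goes back
    # through instrumented_call() below.
    return instrumented_call("GET", f"feature:{name}") == "on"

def instrumented_call(command, *keys):
    # Naive version: checking the flag for *every* call recurses forever,
    # because feature_enabled() re-enters this function.
    # Workaround from the discussion: only check the flag for multi-key calls;
    # Flipper's lookup uses a single key, so it never takes this branch.
    if len(keys) > 1 and feature_enabled("instrument_cross_slot"):
        print(f"would record cross-slot instrumentation for {command} {keys}")
    return redis.call(command, *keys)

if __name__ == "__main__":
    instrumented_call("GET", "some_key")           # single key: no flag check
    instrumented_call("MGET", "key_a", "key_b")    # multi key: flag checked once
```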
E
A
Because we're using Kubernetes, as far as I'm aware this didn't cause any downtime, right? Like, it should have failed to deploy because the pods didn't become ready. Yeah, so, yeah. But I don't know how we detect this, because I think in the specs a lot of things, certainly the stub feature flags thing... I'm not sure if it actually uses the real feature flag. Maybe, Bob, do you...
F
E
Yeah, the test that caught it was actually the feature flag specs. Also, it wasn't our spec, it was the feature flag spec. So I think one or two of the feature flag specs actually run with the Rails cache, so they actually use real values; there are a couple of specs that test the case where there's a cache miss, so it actually looks at Redis, or rather
E
it looks up the ActiveRecord. So those tests caught it, but they didn't catch it the second time around, when we accidentally reintroduced it, because the way around that, breaking that recursive condition, is to only run it if it's a multi-key lookup, because for us, for Flipper, it's just a single-key lookup. So if you run it only for multi-key, then we sort of get around that. But it was accidentally reintroduced when I added in the safe request store. Yes.
E
I found... ah, here. So this was one of the reasons why, when you run GDK Rails locally, it doesn't catch anything. So I tried it out and instantly it broke the local GDK. But the odd thing is that the tests still pass, so I'm trying to get a test to fail on this commit, which is just a commit in between the merge and the revert.
E
A
We could, at the start of the request, somehow put this feature flag in the request store, and then, if it's in the request store, the instrumentation uses it, and if it's not, it doesn't. But I don't like that, because the start of a request is kind of a fuzzy thing in the Rails app, so I think it's probably better to do a different approach.
E
Yeah, I think for now I'm trying to think of a way to not add in so many metrics and log lines that we need the feature flag for; I think the primary goal of the feature flag was to control how much extra log lines and metrics we are pumping out. So yeah, the plan is to just add in the counts of allowed cross-slot commands, just the counts of how many are allowed.
E
A
Ah, okay, I see what you mean. Yes, yeah.
E
So those would pass the specs, because it's not wrapped, yeah, but those are the ones we wrapped within allow-cross-slot commands. Like, I mean, we could check other implications, but if it's not using hash tags, it's likely going to be an invalid cross-slot command anyway; it's going to be a cross-slot command. So we could just count the number of allowed cross-slot commands that are wrapped in the allow block, and then we could put it within every single log
E
line, like, every log line that we log for the request would just have a Redis allowed-cross-slot count, and then from there we can extrapolate. You can estimate fairly cheaply what the amount of cross-slot requests happening right now is, without having to pump out extra log lines or extra metrics. And we could do this as a stopgap for the problem until we find a way to catch it in the specs.
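A rough sketch of that counting idea, with made-up names, is below: tally allowed cross-slot commands during a request and attach a single count field to the request's structured log line, instead of logging each command or adding a new metric.

```python
# Sketch: count cross-slot Redis commands allowed during a request and emit the
# total as one extra field on the request log line. All names are illustrative,
# not the real implementation.
import contextvars
import json

# Per-request storage (stand-in for the Rails request store).
_cross_slot_count = contextvars.ContextVar("cross_slot_count", default=0)

def allow_cross_slot_commands(fn):
    """Marks a block where cross-slot commands are permitted, and counts it."""
    def wrapped(*args, **kwargs):
        _cross_slot_count.set(_cross_slot_count.get() + 1)
        return fn(*args, **kwargs)
    return wrapped

@allow_cross_slot_commands
def fan_out_mget(keys):
    return [f"value-for-{k}" for k in keys]   # stand-in for the real Redis call

def handle_request():
    _cross_slot_count.set(0)                  # reset at the start of the request
    fan_out_mget(["a", "b"])
    fan_out_mget(["c", "d"])
    # One extra field on the existing structured log line for the request:
    print(json.dumps({
        "path": "/api/v4/example",
        "redis_allowed_cross_slot_count": _cross_slot_count.get(),
    }))

handle_request()   # -> {"path": ..., "redis_allowed_cross_slot_count": 2}
```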
E
Yeah, that would work too, but I was thinking about the trade-off, like, why are we doing this in the first place? To estimate how many cross-slot commands are happening. So we could do it without the feature flag, without having to use env vars, I think, but it means you have to go into the Helm charts, pull and then push several commits... not several commits, but, to get it in, like, yeah.
C
We can, yeah, we can override it, so it doesn't require a change to the Helm chart itself, but it does require a change to the gitlab-com one. Yeah, but that's a single change, so I think that's a feasible workaround.
E
Yeah, I mean, we should get it to work instead of, like, betting on it, but I believe the test should check for any sort of recursive issue, so we prevent this altogether. Then we could start, like, outlining... it sounds like best practice, because I checked the feature flag docs, and there was no mention of being cautious about introducing them at places where it could cause such problems. But then again, this is a fairly hot path, so I don't think we...
F
I could show the occurrence-based SLA thing, because that hasn't been brought up yet. I'll fill in the agenda later and clean up my screen a bit so I have something to share.
F
So let's go to... can you see my screen, and am I on the issues page or am I in another browser window here? I'm in another browser here, yeah, yeah. This is issues, and where is the...
F
A little under one week, and we're at 99.91, and this is counting successful Apdex events and successful requests for the four services we call primary, so API, Git, registry and web. Yeah, we do that by just summing everything over time over the 30 days. Yeah, that's about it.
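A sketch of that calculation, successes divided by totals summed over the whole window, might look like the query below. The recording-rule names and the Prometheus URL are placeholders, not the real rule names; only the shape of the expression (a 30-day sum over the four primary services) follows what was shown.

```python
# Sketch of an occurrence-based availability number over 30 days: successful
# Apdex events plus successful requests, divided by the corresponding totals.
# The recording-rule names below are placeholders, not the real rules.
import requests

PROM_URL = "http://localhost:9090"   # assumption

QUERY = """
(
    sum(sum_over_time(sli:apdex:success_rate_1h{type=~"api|git|registry|web"}[30d]))
  + sum(sum_over_time(sli:requests:success_rate_1h{type=~"api|git|registry|web"}[30d]))
)
/
(
    sum(sum_over_time(sli:apdex:total_rate_1h{type=~"api|git|registry|web"}[30d]))
  + sum(sum_over_time(sli:requests:total_rate_1h{type=~"api|git|registry|web"}[30d]))
)
"""

resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": QUERY})
resp.raise_for_status()
result = resp.json()["data"]["result"]
if result:
    print(f"30d availability: {float(result[0]['value'][1]):.4%}")   # e.g. 99.9100%
```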
F
I was hoping that this would be closer to 99.95, so we could use it for the general SLA metrics, but that's not it. So I'm hoping we can use it as an internal measure, because this is going to be closer to an error budget for stage groups than the general SLAs are. Right now... let's see what those look like.
A
Thanks. Do we, like, know... the issue with this before was, you know, request volume, but you said it averages out over a month and it should be fine. I'm assuming you're not going to pay for Dedicated if you don't have a reasonable amount of requests, but is there, like, a level below, like a request volume below which, this is just not going to work?
F
Yeah, I don't know... I don't know the level below which it wouldn't work. It should be measured over a month: if you have less than, how much, 100,000 requests a month, then... something like that.
A
All right. Igor?
C
B
F
Because they're not... it's a number that we publish, and that is mentioned sometimes, if we have to, in contracts and stuff, so...
B
They are; there's one contract that I know of that has it. I don't know if there's more than that, but the SLA is written into some contracts.
B
Yes, and I say it like that because I don't know the specific wording of it, but I do know that the last time we were looking to change how the SLA was done, there were concerns about making sure that it was correctly reflected there, I think.
B
If we've come up with this, if we have one that is more true or more proper, then it's really a case of writing up in English what that is, trying to get that adopted, and taking it to Steve and saying, this is what we would like to do on gitlab.com, and how do we move the process forward that way? I
B
don't think that we should be stuck to an SLA calculation if it's not the most correct or the most true version that we have, and I think it's completely fine that these things evolve over time. If we've found something that is better, then let's start the process of having that be the one that we adopt.
B
And that all starts from an issue. So: creating the issue that says, this is how we are calculating it now, pros and cons of that; this is what we want, this is how we want to calculate it going forward, pros and cons of that; and then we raise it with Steve and say, right, how do we take this forward?
F
A
F
I'm going to drop it in the channel, and then we're going to see how we move it forward, but I think Steve is going to want to see at least, like, a full month of data, so a number for a month, and then putting both numbers next to each other. So let's start by writing it up and then move it forward after we have numbers for an entire month.
D
Yeah, a couple of days ago there was an issue in staging because the request urgency label was missing for the error rate. The request urgency is not that relevant for the error rate, and that's why it was missing here, but we have a validation to guarantee that the same counter has the same labels, so it crashed. So we had this beautiful stack trace in staging after enabling the feature flag.
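The failure mode here is the same class of error the Prometheus client libraries guard against: a metric family is declared with a fixed label set, and observing it with a different set raises instead of silently recording. A minimal Python illustration (not the GitLab Ruby code, with placeholder names) is below.

```python
# Sketch of the label-consistency error class: a counter is declared with a
# fixed set of label names, and incrementing it with a label missing raises
# instead of recording inconsistent series. Illustrative only; the actual
# validation that produced the staging stack trace lives in GitLab's Ruby code.
from prometheus_client import Counter

sli_errors_total = Counter(
    "example_sli_errors_total",                    # placeholder metric name
    "Error-rate SLI events",
    labelnames=["endpoint_id", "request_urgency"],
)

# Consistent labels: fine.
sli_errors_total.labels(endpoint_id="GET /api", request_urgency="default").inc()

# Missing the request_urgency label: raises ValueError, the analogue of the
# crash seen in staging when the label was missing for the error rate.
try:
    sli_errors_total.labels(endpoint_id="GET /api").inc()
except ValueError as exc:
    print(f"rejected: {exc}")
```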
D
So this is how, I quickly drafted this yesterday, we would call it this way, and yeah, the only change that would be required would be in the Apdex class, at the increment method, to include the other label names, and the same thing for the error rate.
D
I was inspecting the code yesterday, and all of the request SLIs have the same labels; they just have different values, and this works. Well, maybe we don't need exactly this solution.
D
I was trying to think of other possible solutions to avoid this error in the future, not just this one. I thought, and...
B
F
D
B
D
B
All right, well, thank you so much for the conversation, thanks for sharing all the things. I hope you all have a good rest of your day.