From YouTube: Infrastructure Group Conversation
A
Okay, so welcome to the Infrastructure Group Conversation. It's a short one, because the topic is actually significant and there isn't really a whole lot else to talk about other than availability in the past few weeks. I'm keeping an eye on the document as people post questions or comments, and if there are none I'll dive a little bit into the slides.
A
We've done a fair amount of analysis on why we are where we are. Essentially, we hit a ceiling on Redis, and a ton of work happened in quick fashion to address that. We also had some blind spots on observability, and there is work going on to address that as well. One of the things we noticed is that there was a cadence mismatch between development and infrastructure.
A
Essentially, we introduced continuous delivery, and that changed the cadence to be more of a daily or weekly thing versus the traditional monthly cadence, where development was in feature development for part of the month and then very focused on fixes. With the introduction of continuous delivery that cadence had to change, and we were a little late to align with development to make that change. So that's been addressed, and we in infrastructure need to go back to focusing relentlessly on observable availability. To do that, we've kicked off this initiative to maximize the observability of the infrastructure, and we're developing some channels to be able to expose issues to the development organization as soon as possible, so they have time to maneuver and address them. I put a little bit of detail on slide number three about the dashboards that we've created and the boards that we're using to manage these two initiatives.
B
Real quick, I got the first one, a jury, it might be a very quiet one but sure comes fishbones. So first thing, kudos to the infrastructure team for all the hard work in the last month, with all the outages and challenges. I know it's been a tough month, and I really appreciate the hard work of everybody on that team.
A
But we didn't react to it, and it was hard to pinpoint when this was going to become an issue. Andrew has done some work on creating two new metrics, a saturation metric and then a subject metric, which is kind of a saturation optic, and these give us better visibility into where those points are. There are links to those dashboards. We're also shifting a little bit more into capacity planning, so it's not just looking at the metrics and seeing how things are going, but trying to be more proactive about where we are going to hit a wall, and making sure that we have awareness of the walls, not just in a sort of instinctive way or by looking at very specific graphs, but in a more, and I'll use a buzzword here, holistic way. So one of the dashboards that Andrew created is really the world of capacity planning. It looks at all the services and tries to evaluate their capacity and their saturation points.
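As an illustration of the capacity-planning idea described above (not something shown in the meeting), here is a minimal sketch of a saturation metric and a naive "when do we hit the wall" forecast. It assumes saturation is measured as used capacity divided by total capacity, sampled once per day, and that a straight-line trend is a good enough first guess. The function names, the linear model, and the sample numbers are all hypothetical.

```python
from typing import Optional, Sequence


def saturation(used: float, capacity: float) -> float:
    """Fraction of a resource's capacity in use (1.0 means fully saturated)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return used / capacity


def days_until_saturated(samples: Sequence[float],
                         threshold: float = 1.0) -> Optional[float]:
    """Fit a least-squares line through daily saturation samples and return
    how many days after the last sample that line reaches `threshold`.

    Returns 0.0 if the last sample is already at or above the threshold,
    and None if the fitted trend is flat or decreasing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if samples[-1] >= threshold:
        return 0.0

    # Ordinary least squares with x = 0, 1, ..., n-1 (day index).
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    return max(crossing_day - (n - 1), 0.0)


if __name__ == "__main__":
    # Hypothetical week of samples for one service.
    week = [0.90, 0.92, 0.94, 0.96, 0.98]
    print(days_until_saturated(week))
```

A forecast like this is only as good as the linear assumption. A deploy that changes load abruptly will not show up in the trend beforehand, which is one reason to track headroom as well as the projected date.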
C
A similar problem, in that we had been really close to capacity in different parts of the system and everything was working just fine, because we were under capacity. Then we deployed a new piece of code into production and we went from being at something like 98% of capacity to exceeding that capacity, and then, instead of things degrading gracefully, things started falling over in quite a bad way. We were caught by surprise because, you know, the day before everything had been perfect. So in the beginning we were asking: is this a single piece of code that's changed? And then we realized that in multiple places we were just really, really at capacity. In some places, like with Redis, the immediate instinct was to just throw more CPU at the problem. But then we realized that we are pretty much on the biggest machine that you can get in GCP, and because Redis is single-threaded, we pretty much can't scale up any more vertically. And so what Terry was talking about with saturation is on lots of different metrics.
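To make the single-threaded point concrete, here is a small sketch, added for illustration, of why host-level CPU numbers can hide this kind of ceiling. It assumes a process that does its main work on one thread, so it can use at most one core no matter how many the machine has. The function names and figures are hypothetical and are not measurements from the incident.

```python
def single_thread_saturation(cpu_seconds: float, wall_seconds: float) -> float:
    """Share of one core consumed by a single-threaded process over an interval.

    A single thread cannot use more than one core, so the value is capped at 1.0.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return min(cpu_seconds / wall_seconds, 1.0)


def host_cpu_utilization(cpu_seconds: float, wall_seconds: float, cores: int) -> float:
    """The same CPU time expressed as a share of the whole machine."""
    if wall_seconds <= 0 or cores <= 0:
        raise ValueError("wall_seconds and cores must be positive")
    return cpu_seconds / (wall_seconds * cores)


if __name__ == "__main__":
    # Hypothetical: the process burned 58.8 CPU-seconds in a 60 s window.
    cpu, wall = 58.8, 60.0
    print(f"thread saturation:   {single_thread_saturation(cpu, wall):.2%}")
    for cores in (64, 128):
        # Doubling the core count halves the host-level number but leaves
        # the single thread exactly as saturated as before.
        print(f"host view, {cores} cores: {host_cpu_utilization(cpu, wall, cores):.2%}")
```

Under these assumptions the per-thread figure is the one that signals the wall, which is why a bigger machine does not help once one core is saturated.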
A
Obviously, right now it looks kind of busy because we only started using this last week, in response to the incidents over the past couple of weeks. We already have a weekly meeting where we discuss these items and make sure that they get attention and that they're prioritized based on severity as well, so that we can make sure these things get addressed. So this is what we were saying.