From YouTube: 2021-06-28 Delivery team weekly EMEA/AMER
B
Cool, I'll get started. MTTP is actually looking not too unhealthy, given the week we had: 36 hours of blockers last week. Almost all of that was down to the GCP incident, and of course we had Family and Friends day, which never plays nicely with MTTP. So, not too bad.
B
All things considered, we're still just over the current target. So, announcements: I'm going to go through all of these, but I do just want to highlight a few of them. Mainly, welcome Reuben to the team, great to have you. If you haven't all scheduled coffee chats yet, please go ahead and do that so you can all meet. And Henry's out this week.
B
A couple of things I want to highlight on items C and D. Reliability have made a few changes to their processes. On C, just an FYI: they are now grouping all of their tasks under Reliability, so it won't be the individual workstreams of Observability, Datastores and Core Infra. So if you're adding work in there, just be aware of that. And then D is one that will be more visible to us, and that we'll have to manage a little more carefully, I think: incidents will also close.
B
So previously, what tended to happen was they'd be mitigated, then resolved, and then there'd be a period of time where they'd sit with the resolved label: the issue itself would stay open and we'd do the corrective actions and things on there. All of that will still go on, but after you add the resolved label, the issue will now close.
B
So, to answer your question about the difference between resolved and mitigated: say, for example, we have a revert MR that we're waiting to pick into a deployment. Once you get it, you can mitigate the incident to get your deployment going, since the issue is more or less solved. Or, if we've taken any kind of short-term action to recover from the problem and things are generally looking fine, that would be mitigated. It's resolved at the point where it's absolutely done, this thing is finished.
B
So yeah, just be aware of that. The bit that we'll really need to keep a check on (and you can do it straight after as well) is to please make sure we're adding summaries and timelines to our incidents. Generally, I find it's a little easier to do that before I add the resolved label, so we get that bit in.
B
This is also a new process, so if you have feedback, please go ahead and add that in as well. Cool. And then Starbucks got some time off as well.
C
Okay, so today I was looking at the Prometheus Pushgateway for tracking deployments, and basically there is an underlying problem, which is that deployments happen over a very long period of time and the Pushgateway just keeps erasing information. When we push to the Pushgateway, we wipe out the old metrics and replace them with new ones, and every run can only add one deployment, because it runs at the end of a deployment.
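As a rough illustration of the problem just described, a pipeline job pushing a deployment metric to the Pushgateway with the Go client might look something like the sketch below; the URL, job name and metric name are placeholders, and Push() behaves like an HTTP PUT, replacing everything previously pushed for that group.

    // Hypothetical sketch, not our actual release-tools code: a CI job
    // reporting one deployment to the Pushgateway at the end of a pipeline.
    package main

    import (
        "log"

        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/push"
    )

    func main() {
        duration := prometheus.NewGauge(prometheus.GaugeOpts{
            Name: "deployment_duration_seconds", // invented metric name
            Help: "Duration of the last deployment in seconds.",
        })
        duration.Set(7200) // e.g. a two-hour deployment

        // Push() replaces every metric previously pushed for this job and
        // grouping, which is why each run only ever shows one deployment.
        err := push.New("http://pushgateway.example.com:9091", "deployments").
            Grouping("environment", "gprd").
            Collector(duration).
            Push()
        if err != nil {
            log.Fatalf("could not push to Pushgateway: %v", err)
        }
    }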
C
So many of the metrics derived from this are not able to understand that there was a metric reset. There are some screenshots in the links and things like that. Basically, what happens is that some of them do account for the resets, so Prometheus is smart enough to understand that this is a new value, but many don't, so it basically keeps thinking that we still have the same data as before.
C
So my proposal there is to run some kind of helper daemon. I've written a simple one in Go; it's about 20 lines of code. Basically, you just inform it "I did a deployment, this was the time it took", and it publishes that information for Prometheus to scrape. It runs in memory and it never resets unless we restart it. So that's the thing. Robert, you have a comment?
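The daemon itself wasn't shown in the meeting; a minimal sketch of what such a helper could look like, assuming an HTTP endpoint that records each deployment's duration into an in-memory histogram that Prometheus then scrapes (the /record path, port and metric name are invented here).

    // Minimal sketch of the helper daemon idea described above.
    package main

    import (
        "log"
        "net/http"
        "strconv"

        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    var deployDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
        Name:    "deployment_duration_seconds", // invented metric name
        Help:    "Duration of deployments in seconds.",
        Buckets: prometheus.LinearBuckets(1800, 1800, 10), // 30-minute steps
    })

    func main() {
        prometheus.MustRegister(deployDuration)

        // A deployment pipeline calls e.g. /record?seconds=7200 when it finishes.
        http.HandleFunc("/record", func(w http.ResponseWriter, r *http.Request) {
            seconds, err := strconv.ParseFloat(r.URL.Query().Get("seconds"), 64)
            if err != nil {
                http.Error(w, "invalid seconds parameter", http.StatusBadRequest)
                return
            }
            deployDuration.Observe(seconds)
            w.WriteHeader(http.StatusNoContent)
        })

        // Prometheus scrapes /metrics; the values only grow while the process
        // is up, so they behave like normal in-memory counters and histograms.
        http.Handle("/metrics", promhttp.Handler())
        log.Fatal(http.ListenAndServe(":8080", nil))
    }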
C
It really depends on how much we want to move away from running everything in batches in CI. The thing that I wrote in Go is simple enough to just put it somewhere and forget about it. I mean, we have plenty of Kubernetes clusters; I think it would cost pennies to run it on something that we already have, because it's just keeping numbers in memory, and if it restarts, in that case it would be no problem because you would keep counting.
C
It's just a very quick example. Let's say that instead of this we were counting the number of API calls, instead of just counting the number of deployments. Every run would have a different number of API calls, so it would be easier to understand that there was a reset, because the number is moving. But in our case it's one pipeline, one deployment, so it always pushes one.
D
Yeah, this is something that we noticed the other day with Robert, that it was resetting the value. So we did something very hackish to work around that, but I was not aware that it was also happening on the cumulative buckets of the histogram, which is also a problem.
C
Yeah, because if you look at the documentation, they just say that if you run a batch job something like every 15 minutes, they suggest you change it to a daemon; or, if you have something that runs for a short time, then they hope that the numbers you count are different, so that they can understand the result. Otherwise they just say: run a daemon.
C
Yes, but basically, Andrew was explaining this to me: histograms are kind of a hack on top of the metric system, so at the Prometheus level the parts don't really belong to the same thing. In theory you could work out that, because the sum counter actually reset, the overall thing was reset, but in the internals of Prometheus every one of them is a different metric. Each bucket is a metric; the sum and the count are all different metrics.
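For context, a single Prometheus histogram is exposed to the scraper as several independent time series, which is the point being made here; a small sketch (metric name invented) of the series one histogram produces.

    // Illustration only: one histogram turns into several separate series.
    package main

    import "github.com/prometheus/client_golang/prometheus"

    var deployDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
        Name:    "deployment_duration_seconds",
        Help:    "Duration of deployments in seconds.",
        Buckets: []float64{3600, 7200, 10800},
    })

    func main() {
        prometheus.MustRegister(deployDuration)
        // On scrape, this single histogram appears as separate time series:
        //   deployment_duration_seconds_bucket{le="3600"}
        //   deployment_duration_seconds_bucket{le="7200"}
        //   deployment_duration_seconds_bucket{le="10800"}
        //   deployment_duration_seconds_bucket{le="+Inf"}
        //   deployment_duration_seconds_sum
        //   deployment_duration_seconds_count
        // Each of these is its own metric, so a reset in one of them is not
        // automatically linked to the others.
    }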
C
And because we are tracking time, the only thing that changes is the total time: sometimes a deployment takes two hours, then the next takes two hours point something, so that metric is reset properly. But the other ones you're just changing by one. So you say "I did one deployment", and then six hours later you just replace that value with "I did one deployment" again. You're just pushing the old value, so it's still the same one.
C
Yeah, and then he told me that they're doing something like that in Tamland, no, in the murky customer project. They have a little program running in a cloud function, I don't even know exactly; there's something running there that in that case is scraping Elasticsearch and keeping the information in memory, because yeah, the biggest problem is histograms.
D
I don't know if that works for you, Alessi, or do you prefer to have all of this in a single issue?
B
Okay, well, will we lose data on this one? You said that if it restarts, then, I mean, we'd know about it because it's restarted, but are we creating kind of a risk around this stuff?
C
I don't think we keep this information for that long, because Prometheus has a retention policy. I think it's around, I don't know if we changed it, maybe a week; it could be a month or a couple of months. But the point is that the process has the information in memory, and when you scrape it, the Prometheus instance keeps it in its own data storage.
C
So the thing that you lose is your local information in your daemon, which is expected, because Prometheus keeps the thing stored, and that's basically it. But if the values change properly, so they just monotonically increase and then reset, then Prometheus can understand that the process restarted, and its helper functions will just do the right thing and give you the monotonic rate of increase. But if you just go from zero to one, and then again one, and then again one, and then again one, there's no way to tell.
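A rough sketch of the reasoning above, not Prometheus's actual implementation: counter-reset handling only works when a sample drops below the previous one, so repeatedly pushing "one deployment" is indistinguishable from nothing changing at all.

    // Conceptual illustration of counter-reset handling.
    package main

    import "fmt"

    // increase sums the growth of a counter, treating any drop in value as a
    // restart from zero (roughly what Prometheus does for counters).
    func increase(samples []float64) float64 {
        if len(samples) == 0 {
            return 0
        }
        total := 0.0
        prev := samples[0]
        for _, s := range samples[1:] {
            if s < prev {
                // Reset detected: count the new value from zero.
                total += s
            } else {
                total += s - prev
            }
            prev = s
        }
        return total
    }

    func main() {
        // Long-running daemon: the counter grows, restarts, then grows again.
        fmt.Println(increase([]float64{1, 2, 3, 1, 2})) // 4: the drop makes the reset visible

        // Pushgateway case: every pipeline pushes "1 deployment" again.
        fmt.Println(increase([]float64{1, 1, 1, 1})) // 0: nothing appears to change
    }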
C
That's what I was thinking about. We can probably put this in a subfolder in release-tools so that we have everything together. I just wrote it in Go because I know how to do this in Go. We could write it in Ruby; it's probably going to be more expensive to run than the Go daemon. So I'm kind of open to suggestions here.
C
So this thing just works, and I think it's easy enough to understand how it works; also, if someone is not really proficient with Go, it should be quite easy to eventually change some behaviour there. But if we don't want to go in the Go direction, we can rewrite this in Ruby, and we should also think about how we want to deploy the thing. In my mind, in the end we're going to build a Docker image and then run it somewhere.
B
So, do you want someone to put an issue together that we can look at? What I'd like, I think, is a kind of balance between getting something up and running, so we can see this data and make progress on the deployment SLO, while at the same time we know this is one iteration of many in terms of the overall metrics.
B
So if we can find a nice balance between getting what we need now, without needing to do weeks of work, and something that we know is going to be okay for a few months.
B
And then we can put input in there. Cool, okay, thanks very much. Let's work out on the issue how we're going to deploy that.
B
Cool. And then item B, I think, might be a bit of a similar type of conversation. Following that incident, we need to modify our deployment pipeline so that we're not running Gitaly deployments ahead of Rails. At the moment, in canary we do Gitaly, Praefect, Rails, and then we go into the main fleet and we do all the rest of Gitaly, the rest of Praefect and the rest of Rails.
B
That's a bit of a risk, because actually we have Rails changes ahead of the full Gitaly fleet, so we need to make a change. Java has made a proposal that we could just lift, like literally just pick up, the production Gitaly and Praefect jobs and put them next to the canary ones. So we would kick off canary Gitaly and, sorry, production Gitaly at the same time, then we'd do canary Praefect and production Praefect, then we'd do canary web, canary Rails, and then we would roll in.
B
So it certainly seems like an easy initial iteration for this. I'm wondering if, kind of longer term, we might actually want to split the Gitaly deployment off from these other ones and sort of stagger it in, but that might be a little bit more work. So I kind of have two questions. One is on the proposal that Java's come up with, which looks like a really straightforward piece of work.
B
I do think there's a slight risk, because we'll have done a full Gitaly deployment and we may not even promote that thing to production, right? There are instances of that, I guess, and then there won't be any tracking of that deployment. So it's definitely not perfect. I mean, a better solution would be to also create Gitaly tracking jobs that then sit with the actual deployment, but of course that's a bit more work.
B
Okay, I'll add some more comments on this issue. I think we need to think through what this might mean. We do need to change this, but I'm wondering if there may not be a quick hack; we might actually need to think about how this properly fits in the pipeline.
B
Okay, let's move on to your point.
E
Ever since we switched to bridge jobs, the warm-up, which used to run in parallel with the canary deploy, has stopped doing so. I was wondering if we still had an issue to address pushing that back, so that the warm-up runs at some point.
B
Yeah, we should readdress that. I'm trying to gather up all of the kind of additional, sort of random little release-tool type things that we have; there'll be lots of them. I think things will come out as we get the deployment SLO up and running, and we'll see lots of areas for improvement. Thanks, Robert.
B
All the links, thanks. Thanks, both. One thing we are getting close to is thinking about OKRs for Q3, so we might want to see if we can wrap a load of these things up into that. But if it's a small change and someone wants to just go for it, go for it; otherwise, I think in Q3 we can have a think about what changes we want to get in place to improve all the things, basically.
B
Awesome. Is there anything else anyone wants to discuss?
B
Awesome, all right, thanks for that. I will pull that onto the board, so we've got a few of these things circling around. Yeah, okay, I'll dig it out and see what we can do with that. Awesome. I shall stop the recording.