B
First, I was reviewing the data from the Project X program in production, and I think I'm pretty happy with the results. We also looked at data from staging, since we were, or are, running on VMs and pods there as well, with a slightly more consistent workload. But I think what we see is the same or better performance.
B
The better performance I'm attributing to NFS. I didn't get into it too much, but I think that's most likely what it is, because we have this outstanding issue where we're writing to the uploads directory for project exports, because of CarrierWave. So that's that; I don't think there's much else to discuss there. Scarback, do you want to take the next one?
C
Yeah, what I discovered was that there's a disconnect between the rules that we generate for Prometheus versus the current mixin used by the community, which we're sourcing for our dashboards. So I found a few rules that we were missing, related to disk metrics, and I added those back in, but thus far there's a blocker that is preventing me from applying these rules across all Prometheus instances.
C
Throughout all environments. So right now it's still broken; I've got a merge request that hopefully addresses this issue, which I've assigned to Andrew. That's completely unrelated to this; it's just that there's something wrong with a set of rules and Prometheus is crashing, Prometheus is not happy, so I'm trying to fix that. It's a work in progress.
C
And that merge request is associated with an issue that I've been doing some investigation on, to figure that out.
A
What I'm concerned about here is that it's different when we have VMs versus autoscaling, where we could have a user start up 100 export jobs for projects that are 100 gigabytes in size, and they could cause our autoscaler to kick in continuously for a long period of time and starve out everything else that we have in the queue. Not starve out.
A
Sorry, block everything else that we have in the queue, and instead of having a very short burst, we would have a very long delay with a very long tail of actual proper jobs that would succeed. And we don't have any limit, any control, beyond giving those three retries, or however many retries we have, I don't know exactly how many, which will not be sufficient, because we'd be in a state of degradation for a while, maybe even down.
B
This wasn't that reason, but yeah, there are limits for per-project exports. Now, of course, you can create lots of projects to get around it; that's how the load tester works. I guess, Marin, the question might be: are we more susceptible to abuse in Kubernetes than on VMs, particularly with the autoscaler?
C
We can certainly spin up a dedicated node pool which has a different set of node CPU and memory requirements, and then we could apply a deployment to that node pool, potentially. Jason Plum had an idea; he commented on the issue where I'm chasing this exact question about what to do with resources, and he suggested using the vertical pod autoscaler to see what recommendations it may have, because we can run it in a mode where it doesn't actually apply anything to the deployments, it just provides us with information.
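
(For illustration only: the vertical pod autoscaler can indeed run in a recommendation-only mode, where its updater never touches the deployment and you simply read back its suggested requests. A minimal sketch of doing that with the Python Kubernetes client; the namespace and VPA name below are assumptions, not taken from the meeting.)

```python
from kubernetes import client, config

# Sketch: read recommendations from a VerticalPodAutoscaler that runs with
# updateMode "Off", i.e. it only observes and never mutates the deployment.
config.load_kube_config()
api = client.CustomObjectsApi()

vpa = api.get_namespaced_custom_object(
    group="autoscaling.k8s.io",
    version="v1",
    namespace="gitlab",            # assumed namespace
    plural="verticalpodautoscalers",
    name="gitlab-sidekiq",         # assumed VPA object name
)

recs = vpa.get("status", {}).get("recommendation", {}).get("containerRecommendations", [])
for rec in recs:
    # "target" holds the suggested CPU/memory requests for each container
    print(rec["containerName"], rec["target"])
```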
C
I would like to enable that for the next time we perform an experiment in production. I just haven't had any time this week to chase that down, as I've been working on some of the auto-deploy stuff right now, so hopefully next week that's something I could dig into more, if that's an option we want to go down. Or we could try two of these at the same time: set up another node pool and see what it takes to apply Sidekiq dedicated to a separate node pool.
B
Right, I think from a cost point of view it's probably not a problem; I mean, I think we're talking about at most one node. If we isolate, is it possible to have workloads prefer one node but also allow them to run on additional nodes if there's a need, or is it isolated within the node pool?
A
I would rather focus on optimizing, controlling the blast radius, because that actually has potentially more of a money impact than, you know, nodes booting up and spending some money, or rather spending some cycles, because being down actually costs us more money, or being in a disrupted state costs us more money. And then once we have a grasp on that, we can focus on how we optimize the cost as well.
B
One thing that might restrict us: the chart doesn't support it. I did look into upping the concurrency of Sidekiq, you know, just running Sidekiq with a higher concurrency, and it's not clear to me; I know that this creates multiple threads to work on jobs, and I thought there would be some concurrent job processing, but for project export it was still one job at a time, not more than that. So I'm not sure how to get more concurrency for project export within a pod.
D
I was imagining it like this; like, let's talk about it. The way I imagined it is: you know, we've got that open issue about the Sidekiq dry run, and instead of it just printing out, sorry, printing out a bunch of stuff in JSON or whatever it is, it actually prints out the exact command, like sidekiq blah blah blah. And then there's some shell script, whatever you guys use in Kubernetes land, I guess it's just bin/sh, and it runs sidekiq-cluster.
D
You know, with the dollar queue selector or whatever. It gets back the command, and then it literally does an exec on that, and so at the end of that boot sequence, all that's left is Sidekiq. There are no intermediaries, you know, because the exec will replace bin/sh with the Sidekiq process. So there's nothing in between in that process tree, right; it's just Sidekiq and Kubernetes.
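
(A rough sketch of that boot sequence, purely to illustrate the exec-replaces-the-wrapper idea being described; the dry-run flag, path, and queue selector variable here are assumptions, not the actual chart entrypoint.)

```python
import os
import shlex
import subprocess

# Hypothetical wrapper: ask sidekiq-cluster what it would run (dry run),
# then exec that exact command so this wrapper process is replaced and
# Sidekiq is the only thing left in the container's process tree.
queue_selector = os.environ.get("SIDEKIQ_QUEUE_SELECTOR", "*")

dry_run = subprocess.run(
    ["bin/sidekiq-cluster", "--dryrun", queue_selector],  # assumed invocation
    capture_output=True, text=True, check=True,
)

cmd = shlex.split(dry_run.stdout.strip())
os.execvp(cmd[0], cmd)  # no intermediary: exec replaces this process
```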
D
Okay, yeah, you don't need to do that inside, like, Kubernetes land, because then it gives us the mapping functionality and stuff, but it doesn't get in the way. Because, obviously, if we run sidekiq-cluster as process one, with Sidekiq inside, there are just more things that can kind of go wrong, you know.
A
The thing is, when we're talking about GitLab.com and the work we are doing here, we get to utilize whatever we build almost immediately. We have auto-deploy built in charts, or it is being worked on in charts and in omnibus, which means the timeframes we are talking about here are shorter. So instead of having it officially released in 12.10, we could have it, I don't know, next week, for example. So that's the timeframe we need to talk about when we're talking about .com, okay?
C
If
we
moved
a
production
and
we're
running
16
pods-
and
we
kick
in
at
a
deploy
complete
within
25
minutes,
I'm
hoping
to
test
on
staging
what
it
looks
like
because
the
staging
nodes
are
a
lot
larger
than
the
nodes
in
pre.
So
we
might
be
able
to
see
a
difference
and
how
long
it
takes
an
employee.
So
currently
we
time
out
after
5
minutes
that
may
not
be
sufficient
for
the
length
of
time
it
takes
sidekick
to
get
up
and
ready
and
running
and
running
and
ready.
C
This was related to the fact that it takes so long for the pod, this is poorly titled, this is how long it takes for the pod to transition from running to ready. We're discovering it takes between one and two minutes in certain cases, but, like in the previous issue, we've seen it take over five minutes and such. So there's an issue inside of charts to figure out what we need to do in that particular situation.
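
(As a side note, one way to put numbers on that running-to-ready window is to read the pod status directly. A quick measurement sketch with the Python Kubernetes client; the namespace and label selector are made up for illustration.)

```python
from kubernetes import client, config

# Sketch: for each Sidekiq pod, report how long it took to go from the
# container starting (running) to the pod's Ready condition flipping true.
config.load_kube_config()
core = client.CoreV1Api()

for pod in core.list_namespaced_pod("gitlab", label_selector="app=sidekiq").items:
    ready = next((c.last_transition_time for c in (pod.status.conditions or [])
                  if c.type == "Ready" and c.status == "True"), None)
    started = min((s.state.running.started_at for s in (pod.status.container_statuses or [])
                   if s.state and s.state.running), default=None)
    if ready and started:
        print(f"{pod.metadata.name}: {(ready - started).total_seconds():.0f}s running -> Ready")
```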
C
We've got Robert working on a part of it related to the release tools. I picked up some work, also related to release tools, about waiting for stuff. That way we could start moving the triggers between omnibus and release tools, and then it's a matter of supplementing our current deployer with the necessary tooling such that it can reach out to the Kubernetes workloads and perform a trigger in that regard. So it's a work in progress.