From YouTube: [US] Container Registry Interactive Demonstration
Description
Skarbek works with other SREs to demonstrate how to deploy, view metrics, and find logs related to our recent service migration of the Container Registry from VMs into Kubernetes.
A: Perfect, you're only missing an optional dependency. We don't care about that because we're not doing anything related to it, so you don't need to worry about that at this moment in time. Your workstation is supposedly ready to go.
So if you want to run the... I'll type it in here. Okay: helm list against the staging (gstg) context. This will provide you a listing of a wide variety of things. In this case, it will tell us that we've got a few things installed inside of the staging cluster, so we should see.
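For reference, the command is roughly the following. This is a sketch; the exact invocation was hard to hear, and the context name gstg is an assumption:

    # List the Helm releases installed in the staging cluster
    helm --kube-context gstg list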
A: Perfect. So, as you can see, we've got a few things to show here. The first item I want to point out is the fact that we ran the helm list command, which is just asking the metadata inside of Kubernetes: what do you have installed? So we have three things installed. GitLab, which contains the GitLab application, which in this case is restricted down to the container registry; we're using our own Helm chart for this, but we disable everything except for the registry.
The next item is gitlab-monitoring, probably not the greatest naming choice, but this is just the use of the stable Prometheus Operator Helm chart provided by the community. This provides us our mechanism for monitoring all of our clusters, so you'll see this installed in every single one of our clusters that we run today that is relatively new. Our old GKE runner clusters are not running this stuff. And then there's what I want to call "Platinum L", but it's the PlantUML service. This is just a fancy integration that GitLab offers that Jarba has been trying to push into production.

And then the last listing you see, we did a kubectl get secrets. The registry requires a few specific secrets. If you watched the presentation on Wednesday, there are three of them that are required for the Docker registry to communicate cleanly with the GitLab API, as well as cleanly with our clients. So those secrets are manually created and dropped in here as necessary. So at this point you can now run a list, which is fantastic. So that's great.
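The secrets check, sketched as a command; the namespace name is an assumption:

    # Show the secrets present in the registry's namespace
    kubectl --context gstg --namespace gitlab get secrets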
So the next thing I want to look at is the pipelines. So what I'm going to do is merge the merge request that you put in, if I can find that tab, because I've lost it already. I'm going to approve it and I'm going to merge it. So what I'd like for you to note next is that GitLab does not run any of the actual pipelines; it's only the source of truth for our repos. That way it's publicly available. Our pipelines are going to be in the ops instance; I'll paste it.
You open up the dry run. So the dry... so our pipeline is relatively straightforward. The first thing we do is a basic check to make sure the shell scripts have all the necessary stuff ready to go, kind of needless, but we also do a set of dry runs. So you're looking at the staging dry run. So if you scroll up a little bit, we're going to see a few things inside of this particular job.
The first thing you're going to see, if you scroll about halfway up the screen, is a diff. Yeah, so we use the plugin helm-diff. This provides us a view into what is proposed to change inside the configuration. So somewhere in this general vicinity we should see the change that we want to make... there, the change, exactly right. There we're changing the version of the image registry, doing a downgrade for this demo. The second half, if you scroll back down again: Helm is doing a dry run of the upgrade command.
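Both halves of the job, sketched as commands; the release name, chart path, and values-file names here are assumptions:

    # Show what would change in the rendered configuration (helm-diff plugin)
    helm diff upgrade gitlab ./gitlab -f values.yaml -f gstg.yaml
    # Exercise the same upgrade without applying anything
    helm upgrade gitlab ./gitlab -f values.yaml -f gstg.yaml --dry-run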
The real good part is going to be the next job after this, the upgrade. Yeah. So this is going to run the exact same commands, but it peels off the dry-run flag. So you'll see right now it's attempting to perform the deploy as we speak. So that's good news; we're right where we want to be.
So while it's deploying, let's go look at our metrics. I'll put a new link into the Zoom, if you want to click on that one.
A: So there's a few things I want to point out on this view. The top right shows us the active replica sets. If you're unaware: anytime you create a Deployment inside of Kubernetes, it creates a ReplicaSet. That ReplicaSet is what maintains the pods that are responsible for that Deployment. So you can see the yellow line goes back to infinity; it's currently running three pods. That's our old replica set, so that's the old version of the registry, probably running registry version 2.7.1. Now, as you've got your mouse highlighted over it,
we see that small snippet of what's trying to start up right now. There's two pods attempting to run the new version of the GitLab registry, and as you can see, they're kind of struggling, I guess, because below that on the far right there's unavailable replicas: 40% of all of our pods are not running properly. So three out of five pods are running, so two of them are having problems. So at this point you know something's wrong, but we have ways to look into this information.
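The same picture is visible from the CLI as well as the dashboard; a sketch, with the label selector and deployment name assumed:

    # Old and new replica sets, with desired/ready counts
    kubectl --context gstg get replicasets -l app=registry
    # Watch the rollout itself; this blocks until it succeeds or times out
    kubectl --context gstg rollout status deployment/registry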
A
So,
let's
hop
over
to
our
Long's
and
see
if
we
can
find
out
our
failure,
so
I
dropped.
Another
link
in
the
zoom
chat
for
you
and
just
to
make
this
a
lot
easier.
So
we're
not
pecking
I've
got
a
copy
and
paste
of
the
appropriate
filter.
So
when
this
loads
switch
the
index
to
the
gke
staging
index
and
then
use
that
as
your
filter,
so
we
could
look
for
the
logs
associated
with
the
container
register
in
that
environment.
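As an alternative to the Kibana filter shown in the demo, the same pod logs can also be pulled straight from the cluster; names are assumed:

    # Tail recent logs from the registry deployment's pods
    kubectl --context gstg logs deployment/registry --all-containers --tail=100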
A: It's the icon that looks like two panes; if you hover over it, it says "Toggle column in table". Scroll down to the expanded view of... okay, you can do it from there. Now just scroll on the listing of the log that you've expanded; scroll down to the area where the message block is, and there's a set of icons to the left of message. Yeah, that.
A: Perfect. Okay, now we can see these are the logs that are coming from the container registry. Relatively quiet, because this is the staging environment, so we don't see a couple million logs from people pushing and pulling images or logging in. But based on these messages here, we can clearly see that there's something wrong with the pods that are attempting to come back up: the storage driver is not registered in this particular version of the Docker registry. We know something is wrong with it for the purpose of this demonstration.
This was easy to utilize as a way to find logs and troubleshoot, because we know exactly what's wrong. If we were troubleshooting a different issue, this is kind of, you know, the direction to go, but in this case we know precisely what is wrong: this particular image of the Docker registry was missing something when it was compiled, hence 2.7.1 was released immediately after it.
A: It has... so the job has failed, as to be expected. Without scrolling, if you look near the top of the job output, you'll see that it noted that the upgrade failed, and the very next message was that it timed out. So we've got a configuration inside of Helm that says: hey, wait five minutes; if you're not successful, perform a rollback. And as you can see, the very next message is it performing that exact item.
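That wait-then-roll-back behavior maps onto Helm's upgrade flags; a sketch assuming Helm 3 syntax, not necessarily the job's exact mechanism:

    # Roll back automatically if the release isn't healthy within the timeout
    helm upgrade gitlab ./gitlab -f values.yaml -f gstg.yaml \
        --atomic --timeout 5m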
So if you go back to our metrics, for example, and refresh it: in the right pane you'll see that the gitlab-registry-555vb-whatever replica set started to spin up a few pods but immediately shut down, and the existing registry replica set remained intact. So during this point in time we never had an outage.
A: I forget who brought it up, maybe it was Stan, but everything in GKE is going to that one index. So this includes all the audit logs associated with GKE, as well as the cluster operations. So anytime the API is doing something, those logs are all going to the same place, and then every single pod that logs, all those logs are going to the same place. I made it easy here because I gave you a filter that's relatively quick to, you know, limit it down to the registry pods.
A: It'll be a new subsection... yeah, here we are. So in this block we're defining a few things. Most of this is just modifications to the defaults. And where is the HPA configuration? Scroll down a little bit more. I don't see what I'm looking for... there it is! No, that's not it. Oh, that's right, the HPA is not in this file. Go back, go back to the root of the repo and then look for, say, the prd YAML.
No, not this one. Okay, go to... either production, maybe. Yeah, go to the production one, gprd.yaml. Somewhere in here... perfect, okay. So what the YAML is doing, excuse me, what Helm is doing, is combining all these files. So we take the values.yaml file and we take whatever environment we're operating on; we just operated on the staging environment, so we would have looked at the gstg.yaml file, and we're mashing all these values together. So in this case we're telling the registry for production we want to run image version
2.7.1. We're going to tell our HPA that we want to utilize at most one hundred and fifty pods; by default that max value is set to ten, and the minimum for that is set to two. So when a new deployment comes up, it'll automatically come up with two pods from the get-go. When a deployment occurs, because we're using autoscaling, and this is autoscaling based off of CPU utilization, the deployment is going to read how many pods are currently running and try to scale to that number.
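A sketch of the kind of override being described; the exact keys in the chart's values files are assumptions:

    # gprd.yaml: production overrides, merged on top of values.yaml
    registry:
      image:
        tag: v2.7.1
      hpa:
        minReplicas: 2     # default minimum
        maxReplicas: 150   # production override; the default max is 10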
So if you go back to the metrics, we saw that we first tried to bring up one pod, but for whatever reason the Kubernetes cluster decided to scale up, so I guess soon after that it tried to bring up two pods during the deployment process. But both of those pods were failing the entire time, so they were eventually shut down when we reached our time limit. Right now we're still running at four pods; we'll probably see that scale back down, like, shortly after this meeting. I think it's every three minutes.
A: Okay, I don't know where this is, but I can explain it to you without having to show you anything. Open up our metrics real quick. So the container registry is going to scale based on CPU usage. Kubernetes maintains the concept of a millicore: every one core is a thousand millicores of CPU.
We request 0.05 of a core for the container registry. When a pod gets spun up, if Kubernetes finds that a node does not have that 0.05 of a core available, it won't schedule that pod, or it might go to a different node. The CPU usage is measured against that request, so we might only be using a tiny fraction of it; in this case we're using twenty-two percent of that CPU request.
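In Kubernetes resource notation, 0.05 of a core is written as 50m (50 millicores); a sketch of the request being described, as it would appear in a pod spec:

    resources:
      requests:
        cpu: 50m   # 0.05 of a core; the scheduler reserves this per pod,
                   # and HPA utilization percentages are computed against it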
These values are much higher in production, because 0.05 of a core is very minute; that's probably equal to doing math on a computer very quickly. So the HPA is going to take the CPU usage and average it across all running pods, and if it's over 75 percent it'll scale up additional pods as it deems necessary. There's this really cool formula on the Kubernetes website as to how it determines how many pods it should scale up.
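That formula, from the Kubernetes HPA documentation, scales the current replica count by the ratio of observed to target utilization:

    desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)

For example, five pods averaging 90% CPU against a 75% target gives ceil(5 * 90 / 75) = 6 pods.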
B: I have a quick question. (A: Sure.) So I put this in the notes, and this might be too much of a Kubernetes-specific question, but, you know, whenever we were looking at the part where we had the MR, we had a bad build. It was trying to spin up the pod and it failed. What kind of capabilities exist there, as far as, like, it knowing that there is really a failure, per se?
A: For example, we had a situation where a pod came up and it may have been responding, but for whatever reason it's got an issue where it just sinks the CPU, or it pegs the CPU usage really high. We are implementing limits on most CPU and RAM use, so if pods exceed those limits, Kubernetes will deschedule the pod; it will terminate it for us. That way we prevent the cluster from running out of resources and being starved from operating other workloads.
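A sketch of limits alongside the request above; the numbers here are illustrative, not the real production values:

    resources:
      requests:
        cpu: 50m
      limits:
        cpu: "1"      # CPU usage above the limit is throttled
        memory: 1Gi   # memory above the limit gets the container OOM-killed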
B: And I can also assume that, in theory, with a proper pipeline, the Docker image itself, you know, whatever core changes inside between 2.7 and 2.7.1, it's hypothetical that there's some work that could be done there, to maybe do proof of concept, or, you know, things outside of this structure. Maybe we had it in staging, it's been running for two days, we haven't seen a problem (A: Yep), promote to production now, and that can get some programmatic problems worked out. (A: Yeah.)
A: So, theoretically, specific to the container registry, there's an open issue for the quality team to generate the necessary checks to run QA against the container registry, and we would love to push that into our CI pipeline. That way it's full CD. Right now production is gated, but we would like to not have to gate that by a human; instead we want to run QA, and if it passes, go ahead, promote it on as you see fit.
We do that today with auto-deploy; it'd be nice to trickle that over into our Kubernetes work as well. Along that note, as we migrate more services, we're going to be pushing QA harder to get that kind of capability. Anyways, I think the next thing we want to move is Sidekiq, and I don't know what kind of QA capabilities we'll have on that front, but it'd be nice if we could have a pipeline where we don't have to touch it; it's just: make your change and watch it go through the pipeline.