From YouTube: 2021-01-28 GitLab.com k8s migration APAC
C: Yeah, pretty good. We're more or less living a pretty normal life at the moment, I guess, all things considered. I know that you're probably more on the opposite end of the spectrum, yep.

C: Oh man, yeah. No, we're basically... trying to think if there's any restrictions at the moment. I think there's some general restrictions, but yeah, even public events like football games and stuff are back on, and so... wow.

B: Yeah, we keep on dithering between various states of lockdown, I would say, yeah.

B: Lockdown, currently. The new strain from the UK is slowly getting into the country, so they're talking about moving us back to a full lockdown. But for now, I mean, stores are closed, but you can order online and order.
A: Cool, so we've got a few suggestions, but if there's anything else you want to cover on this as well, Graham, then happy to. Do you want to start, java, with whatever's the most interesting topic?

B: Yeah, sure, Graeme. So I'll take this opportunity, with you here, to talk a little bit about some of the issues we're facing with the websocket service. We're seeing errors on pod cycles, and I've done some testing on pre-prod and I can reproduce this, and apparently it looks like that...

B: We start to see 500 errors when old pods are terminated while new pods are ready to take traffic. Yes, maybe I can demonstrate this in real time and show you, or show everyone.
B: This was my first instinct: okay, this is just websockets. It's particularly bad because it's websockets. And I talked to the developers and they're like, well, it doesn't really matter, we handle these failures gracefully, the client can just retry and it will gracefully degrade as well. So that's why we've just silenced all of the alerts around these errors on the service. But what's interesting, though, is that I can take websockets out of the equation and I'm still...

B: Which is... which is not so nice. First question, maybe for you, which I'm still not 100% clear on: the PDB, the pod disruption budget. If we set the max... we default this to a maxUnavailable of one pod.
C: Yeah, the surge settings and everything, that kind of thing, the maxSurge and all that kind of stuff... I'm trying to think. Those settings have moved, they've moved around a bit; I think the specification for a Deployment has changed. But I'm happy to admit I'm by no means a deep expert on PDBs, and once again that's something that's changed a lot as the spec has grown. But my understanding is no, that shouldn't... the PDB should be taking effect for just the standard...
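For reference, the kind of PodDisruptionBudget being discussed looks roughly like the sketch below; the name, labels and API version are placeholders, not the actual chart output. Note that a PDB only constrains voluntary evictions (node drains and the like); a Deployment rollout is governed by its own rollingUpdate maxSurge/maxUnavailable settings.

```yaml
# Minimal sketch, assuming a policy/v1beta1 cluster (policy/v1 only exists from 1.21).
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: gitlab-websockets        # placeholder name
spec:
  maxUnavailable: 1              # at most one pod may be voluntarily evicted at a time
  selector:
    matchLabels:
      app: websockets            # placeholder label
```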
B: Okay, so yeah, because I kind of freaked out a little bit yesterday, because this service is unique in that we only have one pod running per cluster (oh really? okay), yeah, and the reason for that is that... well, we have three clusters, so we have AZ coverage, so there's...

B: And there didn't seem to be a reason... with our default spec there's no way we would scale up to more than one pod. I mean, we basically have a min replica of one and the max sitting at one. So I thought, okay, maybe this is part of the problem, so I bumped that to two yesterday, but it almost seems like we're still seeing errors. But going back to having one pod, a min replica of one, or a min and max replica of one...
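The min/max replica shape being described (effectively pinning the service to a single pod per cluster) would look roughly like this; the names and the CPU target are illustrative only, not the real autoscaler values.

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: gitlab-websockets          # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gitlab-websockets
  minReplicas: 1                   # bumped to 2 during the test described above
  maxReplicas: 1                   # effectively pins the service to one pod
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75   # illustrative target
```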
C: Sorry, refresh my memory: the websockets is still the full Cloudflare, HAProxy, everything else? Yeah.

B: So it's a little bit different, but not that different. It goes proxy to the service endpoint directly, no NGINX ingress. Okay. And we made...

B: Yeah, Cloudflare, yes. So I'm gonna share my screen and we'll do a little live test right now. So, upper right, this is the pre-prod cluster. You can see that we have two git web service pods, one web service pod, and then here's our websockets pod. I've gone ahead and adjusted the settings by hand (I'll set them back when I'm done), just because I want to make sure that the PDB maxUnavailable... I made maxUnavailable one. I guess this doesn't matter.
B: Okay, well, I think we're good. So we have one websockets pod, and what I'm gonna do is just use this little load testing tool, which sends traffic to the service endpoint, which is directly to Workhorse.

C: Okay, so this is going to the... this is from one of the proxy nodes to the Google ILB that sits in front.

B: Exactly, exactly. So I'm sending it to this external IP on port 8181, which is, of course... so this is going... correct, this is going through. So if I just stop this, I can see that I'm getting all 200s. The rate is not very... it's 10 requests a second, the duration's 20 minutes or so. So I'm going to do that and then roll them down.
C: Just do, like, an annotation or something, maybe. Or maybe not.

B: Oops, I'll get out of here. So now we have the new pod coming up; the old pod is running, so so far everything is fine. I'm still only seeing 200s.

C: I think, you know, we can... the takeaway, if I'm hearing it correctly, is... you know, the video kind of cut out just before that pod came ready, but I think... I assumed that we were seeing 500s during...
B: Okay, so we still have this pod terminating, but you can see here, I started to see 500s as soon as the pod started terminating, even though the new pod was running. So what's going on here? I guess at first I thought, okay, the new pod, although it's running, is not really ready to take traffic.

B: That would be my first guess, but I tried doing this test with a min pod of two, which... and because of our max surge... yeah, exactly. So I still saw the problem. So I think what's happening is that somehow requests are going to the terminating pod, maybe, but I haven't validated that. And what I don't see are 502s... I'm sorry, I don't see 503s in the Workhorse log; like, I don't see these requests going to Workhorse.
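One generic mitigation for the "requests still reach a terminating pod" theory is to delay container shutdown so that the endpoints list and the ILB backends have time to catch up before the listener goes away. This is a standard Kubernetes pattern, sketched here under assumed names; it is not the actual chart configuration, and the image and timings are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitlab-websockets                  # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: websockets
  template:
    metadata:
      labels:
        app: websockets
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: workhorse
          image: registry.example.com/workhorse:latest   # placeholder image
          lifecycle:
            preStop:
              exec:
                # keep serving while endpoint/backend updates propagate,
                # then let in-flight requests drain before SIGTERM lands
                command: ["sh", "-c", "sleep 15"]
```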
C: Like, if it's slow in syncing that, then, you know, the ILB's list of, you know, peers, I guess you could say, or endpoints, may be slower. The second thing is: is this a new service? The Service object it's pointing to... can you actually show me the Kubernetes definition for that? I want to see what...

C: ...additional annotations are on it, or something. Because, with GCP 1.17, they're starting to force all services to do pod-native... pod-native load balancing, and they're switching from using instance groups and kube-proxy to trying to push everyone to network endpoint groups. That is only for new services that are created, so that's only for new services that are created. So if this was created before, say, the 1.17 upgrade, it shouldn't be an issue. But what it does mean is they are changing their release...
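For context, the Service shape under discussion, an internal TCP load balancer pointing straight at Workhorse with no NGINX ingress in between, looks roughly like the sketch below. The annotation key shown is the newer GKE one (older clusters use cloud.google.com/load-balancer-type), and the names, labels and ports are illustrative only; the commented NEG annotation is the container-native load balancing opt-in mentioned above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: gitlab-websockets                              # placeholder name
  annotations:
    networking.gke.io/load-balancer-type: "Internal"   # provisions a Google ILB
    # opting a Service into network endpoint groups looks roughly like:
    # cloud.google.com/neg: '{"ingress": true}'
spec:
  type: LoadBalancer
  selector:
    app: websockets                                     # placeholder label
  ports:
    - name: http-workhorse
      port: 8181          # the ILB port being load-tested
      targetPort: 8181
      # without NEGs the ILB backends are the cluster nodes on an auto-assigned
      # NodePort (e.g. 30898 in the demo) and kube-proxy forwards on to the pods
```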
B: I can't say, I don't know, but, you know... I haven't really... one thing is that what's new here is that we're not going through the NGINX ingress, yep.

B: So we're going through the TCP LB, so maybe that's coming into play. Sure. So that's new, yeah. Other than that, I can't think of anything else. Yeah, that's...
C: I agree, and I would say... definitely, assuming it is an issue with the ILB and not something else, then I would definitely say this is, like, a failure, like a bug or something, because this obviously shouldn't be happening. I'm interested... look, in the interest of keeping this meeting short and not, you know, spiraling out into a debugging session... it's really good for me to see this. I'm actually interested now in going to look through Stackdriver and poking at some of the GCP... getting some of the GCP...

C: ...from the ILB that actually implements this, and just having a look and seeing if we can confirm. Because what I would like to see is, in theory, the ILB that this maps to, which we should be able to see in the GCP console... what we would expect to see is that the backends for it are all the nodes in the cluster, on whatever the NodePort is for this service. So what's the NodePort here? It should be somewhere here... a1 node, port 30898, I think, is the...
C: But yeah, I would double-check that what we're seeing on the ILB side matches, and that that configuration is not, for whatever reason, changing. I wonder if I can actually watch Stackdriver as well, to see if it thinks that the endpoints flap up and down at all, because once again they shouldn't, and if we do see that, then that would be something else that's suspicious.

C: Okay, yeah... no, it's a black screen now again.

B: Well, anyway, yeah. So yeah, I'm gonna spend a little bit more time today troubleshooting this.
B: I think there's a key difference here, right? Because in our other configurations we have an internal ILB, but it's in front of NGINX, and NGINX...

B: Yeah, but it's funny, because we were definitely seeing a lot of errors on NGINX pod churning, and what we did is we just upped the resource allocation so that we never scale NGINX, so that's very stable. But that doesn't help us here, because now we're bypassing NGINX, yeah. And so... this is kind of good.
B: It's good that we're seeing it for websockets because, like I said, errors here don't matter too much, at least that's what I've been told. But it's still something we should get to the bottom of, especially if we're gonna... my intention is to move git HTTPS to this configuration. Oh.

C: This is, yeah, crazy, because, as you said, this is completely boring... this should be an absolutely bulletproof, rock-solid configuration or setup. Like, if we're seeing these problems, whether with NGINX in front of it or without, you know, we need to make sure the pod communication is working as we expect.
B: But this is why I'm suspicious, because it's so boring. Like, okay, come on, why hasn't anyone else reported this? And this is why I think that there could be a delay between the time that a pod is ready and the time that a pod is able to accept traffic, and that would explain it, right? Like, the new pod is ready, the old pod is terminated, but if there's nowhere for traffic to go, because the new pod isn't actually ready, then that would explain the 500s.
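If the theory is "Ready but not actually able to serve yet", the readiness probe is the knob that decides when a pod joins the Service endpoints (and therefore the ILB backends). A generic sketch only; the path, port and timings are invented, not the real chart values.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitlab-websockets            # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: websockets
  template:
    metadata:
      labels:
        app: websockets
    spec:
      containers:
        - name: workhorse
          image: registry.example.com/workhorse:latest   # placeholder image
          readinessProbe:
            httpGet:
              path: /-/readiness     # placeholder health endpoint
              port: 8181
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 2
          # the pod is only added to the endpoints list once this probe passes;
          # if it passes before the listener can really accept connections,
          # requests can 500 in exactly the window described above
```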
B: So what I would love to do is be able to see the health of the backends. I think you were saying this too: see the health of the backends in real time, to see when, at the L4 layer, the layer 4 load balancer... to see when the load balancer is dropping them, right? Because the load balancer has a health check, right, or not? Yeah, it does.

C: So when traffic hits a node, kube-proxy will manipulate the iptables rules to say you can or cannot go to this pod, and kube-proxy syncs from the Kubernetes API, and in theory, like, in theory, there could be lag there. But once again, that would be a huge Kubernetes bug that a lot of people would pick up on, you would think, right? You can do... there are setups where you can do this, like...
C: Basically, you get rid of kube-proxy, which is what I've done for KAS and stuff. I set it up... because it's a new service and I had some time, I set it up using container-native load balancing and stuff, and so when you look on the load balancer, the actual pods themselves, like...
C: ...kube-proxy. That being said, I don't think we should be doing something as drastic as that for this problem, because this should just work, so we need to... we need to figure out... You could even do something interesting, like go into the ILB and manually drop every other node out, so there's only one actual physical node listening on a NodePort that traffic is going to go through, and, I don't know, see if that changes anything, see if, like... and if you have...

C: Yeah, so the whole thing is the ILB does not know how to... the whole point of kube-proxy is that the ILB only has to know about nodes, and then it can go to any node, and then it's like, oh, there's one pod and it's over on node three, I will mangle the packets and forward them on over to node three, so at least you'll get that one. All I'm trying to say is, I guess, with that, at least you'll bottleneck the incoming connection from the ILB to one node, and maybe that...
B: Okay, well, I think what I'll do is debug... or, yeah, I think I'll do some more debugging today and...

C: ...that it's coming from, because is that, like, a Service IP? Is it a pod IP? Is it, like, something else? Like, yeah... I might have... I'm going to have a look at... Let me know how you go, because I'm definitely interested in having a poke around this if we have no luck, but we should also definitely squeeze Google support for this, because this sounds like a fairly standard question they should be able to answer us on, yeah.
B: Okay, that's pretty much all I have, Amy.

D: Do you want to talk a little bit about the Kubernetes upgrade, Graeme? Yeah, I was just wondering, Graham, like, you're about...
C: Yeah, so, once again, I know, try and keep it short, keep this meeting on time. So the short answer is, it looks like we've identified at least two incidents that could be maybe not alleviated, but helped a lot, by the GKE 1.18 upgrade I'm talking about now.

C: It's actually also made me think about this problem as well, because at the moment one of the things we've highlighted, and even Google have now acknowledged, is that the TCP settings they have on all their nodes are incorrect, and so we saw that cause problems with mailroom. And I'm actually wondering (maybe this is crazy), I am actually wondering if that could also be causing some kind of weird connection issues we are seeing. But it probably wouldn't explain the 503s, so maybe not. So, basically, I've done a bunch of the prep work.
C: The only thing... so there are two main... well, there are three major things that come as part of this upgrade, and actually the next four Kubernetes upgrades are going to be more painful than the last ones. They've fully removed a bunch of API versions for deployments, pods and stuff, so unless we've got manifests that are really old and we've never updated them, we should be fine. I've identified one spot, in PlantUML, so I'll probably put a merge request up to just, you know, yeah, fix that.
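The kind of change the removed API versions force is just bumping old beta groups to their GA equivalents, for example (a hypothetical manifest, not the actual PlantUML one):

```yaml
# Before (old beta groups, removed in recent Kubernetes releases):
#   apiVersion: extensions/v1beta1
#   kind: Deployment
# After:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plantuml                 # placeholder name
spec:
  selector:                      # spec.selector is required under apps/v1
    matchLabels:
      app: plantuml
  template:
    metadata:
      labels:
        app: plantuml
    spec:
      containers:
        - name: plantuml
          image: plantuml/plantuml-server:latest   # placeholder image
```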
C: The second part is they've changed the Ingress spec. So in 1.18 they've finally solidified and got the Ingress spec out of beta, so that's going to change a bunch of stuff, and that's going to be really invasive for the GitLab chart, I believe. But we don't have to do that before the upgrade; that's going to happen after. So, you know, that's another thing we need to do. And then the final issue is they've done a nice rename and changing and removing of metrics, so I've got to...

C: I asked Anthony to get someone in his team to review, but I think, obviously, now with him...
C: ...he's probably... he hasn't picked up the ticket and he hasn't responded, but I'll get someone from Observability to basically go through all of the documentation and confirm that this isn't going to cause metric issues. And then, basically, I'm ready to green-light: I've got the MRs ready to do, like, ops and stuff, and I'm keen to do this as quickly as possible, especially if we think it's causing issues.

C: In the meantime, I have deployed into pre and staging what is essentially a workaround fix for the TCP issues, so I can actually roll that into production any day now. Basically, it's just deploying a DaemonSet that runs sysctl to change the settings. So it's like getting...
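The workaround described follows a common pattern: a privileged DaemonSet that runs sysctl on every node. A rough sketch is below; the image, the key and the value are placeholders, not the settings in the actual change request.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-tcp-sysctl            # placeholder name
spec:
  selector:
    matchLabels:
      app: node-tcp-sysctl
  template:
    metadata:
      labels:
        app: node-tcp-sysctl
    spec:
      hostNetwork: true            # so net.* sysctls apply to the node, not the pod
      containers:
        - name: sysctl
          image: alpine:3.13       # placeholder image
          securityContext:
            privileged: true       # required to write node-level sysctls
          command:
            - sh
            - -c
            - |
              # example key only; the real values come from the change request
              sysctl -w net.ipv4.tcp_keepalive_time=300
              # keep the pod alive so the DaemonSet stays healthy
              while true; do sleep 3600; done
```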
C: ...but now... so that's in pre, it's in staging. So, obviously... actually, it's not going to fix this issue, because if it's in pre and staging and we're still seeing it, then it doesn't cause it. But I'm ready to put together a change request and roll that out. So that kind of gets the one benefit, or one...

C: ...you know, corrective action from the 1.18 upgrade out there, and then, obviously, when we upgrade to 1.18 I'll just remove that fix and keep going. But yeah, besides Observability and those little bits I've talked about, which I'm pretty much okay with, I think, you know, we can basically do it whenever.
C: ...release, but I don't know, are they backporting it to 1.17? I'm not sure; I can probably ask them. But yeah, I think we don't want... we want to do the 1.18 upgrade sooner, because what's going to happen is eventually they'll force us to do it, and that's what happened with the 1.17 upgrade, and it was a little bit more, you know, nerve-wracking being forced to do it rather than doing it on our own terms.

C: Yeah, look, honestly, the biggest blocker at the moment is just getting Observability to confirm that, you know, we're not going to lose anything, because we did last time: when I did the 1.17 upgrade, suddenly a bunch of dashboards stopped working and I was scrambling to fix it. So I'd like to get ahead of that this time. Yeah, so tomorrow I'm gonna sit down and probably get most of the merge requests ready.
C: Honestly, and, you know, as soon as I get the kind of rubber stamp from Observability, I can probably do it next week. I think I can fit it in next week. Or... I'm on call next week, so... that actually doesn't work out too badly, because if things break, I like being the person on call when I do it, because, you know, at least I get the alerts and fix it. So I would like to at least get the smaller environments, like even pre or ops, or anything.
A: I'll ask... I'll have Brent look at the metrics side, so if we can get that prioritized and done, because, yeah, it'd be great to get this upgrade, you think?
C: Question... so I probably need to look at this in depth: do we actually have a policy on what Kubernetes versions we support? Because, technically speaking, all the versions we run are... like, with the exception of Red Hat and OpenShift, because, you know, they're the "we'll support it long after upstream supported it" model... I don't think we should be having to support that many old versions, like maybe 1.16 and 1.15, or 1.17 and 1.16. And I think, I think we're okay, but... you're right, I should...

C: We should double-check, or I have a way to figure that out a bit better.

B: Okay, yeah. I think we do support explicit versions, but you have to type the distribution to see.
C: I was gonna say... I don't think... just a small question: with the gitlab-com repo, we still can't take changes off master yet? Is that still being held up on this, like, the websockets and moving NGINX and all that stuff? Yeah.
B: This week we'll understand better this issue that we're seeing, and whether we want to move forward with the git HTTPS NGINX bypass. My hope is we do that, and then we can just upgrade the NGINX ingress controller, which is going to be a no-op. If we can't do that, then we need to just do a cluster-by-cluster upgrade, which is really not that bad; I mean, we've done it twice already, it's just kind of high-touch, you know, that's all.
C: Yeah, right, okay, yeah. Just curious, because, yeah, I just want to start getting some more changes in, but that's fine. The only other thing, actually... I realized we've only got five minutes, so I'll mention this briefly. Let me see if I can share my screen. So I've been spending a bit of my spare time playing around with...
C: Is this gonna work? Yeah, cool. Playing around... so we've got an issue open... if I can... basically, I kind of talked about this a few months ago: decoupling helmfile execution from syncing from Chef. So I actually had a little bit of spare time and I had a crack at implementing it, basically using jsonnet, so that the values from external sources, instead of being a Go template, are JSON, essentially, and just passing the values in and using jsonnet to pull them out.
C: So, basically, what I've got in this commit (and I can pop it in the doc)... there we go. Here's the jsonnet values file, and, as you can see here, I just basically pull in a bunch of stuff using jsonnet external variables, which is the Chef roles. So all of the Chef roles are JSON, I pull the load balancer IPs from the Google API, which is also JSON, and then it makes it so easy to just manipulate and pull out all the values. I don't need to shell out to jq.
C: I don't need to do any of the other stuff, and so you can see here all these settings. I've just got, like, the Chef rails conf, which is just, you know, the default attributes on the gitlab.rb, and so then you can just see: all of this is basically just all those settings mapped to the values we need. So it becomes a little bit nicer, a little bit easier to read, doing conditionals based off things like the Redis configuration.
C: You know, it also becomes a lot easier... there are actually other YAML files we have, like the init values.yaml and stuff, which contain a whole bunch of, you know, base... we have a lot of very awkward logic that we do in Go templating, and I'm playing around with the idea of using jsonnet, because it's a bit higher-level and has got some nice features for us to make that simpler.
C: But then the end result is it just generates JSON files that helmfile then consumes, but because they actually live in git, you basically have this process, similar to what we have in the runbooks: you do, like, a "make generate", it, you know, pulls all the values from Chef, writes out the files for every single environment, you commit it all in one commit, and then all of the pipelines run and never have to talk to Chef again. You know, the pros of that...
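On the helmfile side, the decoupled flow could look something like the sketch below: each release just reads a pre-generated values file that was written by the generate step and committed to git, instead of templated "values from external sources". The release name, chart and path are invented for illustration.

```yaml
releases:
  - name: gitlab                     # placeholder release
    namespace: gitlab
    chart: gitlab/gitlab
    values:
      # written out by `make generate` from the jsonnet + Chef/GCP inputs and
      # committed, so the deploy pipelines never talk to Chef at run time
      - generated/gprd/values.json   # one generated file per environment
```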
C: ...are: it's a lot faster; we're no longer depending on a service that might go down, causing helmfile to fail; it becomes easier for people to see and read, you know, where a setting is coming from. I actually found a whole bunch of settings, like one or two settings, in our production or staging environments that weren't set, because helmfile was pulling them incorrectly and then just failing silently and setting them to empty. So, like, Sentry...
C: ...and things like that, that weren't there. So yeah, as I said, it's something... I haven't opened an MR or anything kind of concrete yet, I'm still just kind of poking around with it, but yeah, I just wanted to put out there something I've been playing with.

B: So there's, I guess, there's like three external sources: we have the Chef repo, we have secrets, and then we have GCP. For the Chef repo, I think, yeah... you know, I mean, it's hopefully temporary, right? Like, I think once we move...
C: ...and then pass that into jsonnet as well. And that's why, like, for example, you can see here I've written, like, a function that's, like, "find gcloud address", so it takes the whole JSON object with every single address and all the settings (oh, sorry, this is probably really small), yeah, and then I can just, like, you know, find the... you can just call this function, like, get me...
C: ...the address for git HTTPS, and because I've, you know, given it the entire JSON object from Google, with every single address we have, it becomes really easy to just, you know, write these helper functions to find addresses or get extra information out of it. And likewise, you know, I wrote a function for mapping the Gitaly... so the Gitaly transformation is really ugly, which we use in jq at the moment; it's horrible. You know, it makes it a little bit easier: I can convert Gitaly entries and just map that to arrays.
C: So, ultimately, if we think this idea has legs, you know, this would just basically replace the values from external sources. But there's nothing saying that we couldn't make this more sophisticated and get it to the point where it's more or less one jsonnet document, with all the conditionals, where some variables are passed in per environment, like we do now, and it just generates the one... so basically helmfile just consumes one JSON file out of jsonnet, instead of, like, values.yaml, environment.yaml, values from external sources.

C: We just use jsonnet to do all of that complicated logic around values, and then helmfile just simply executes with: okay, I've just got one values file to consume for this environment, and I'll just consume that. And, you know, it just keeps the job of doing the helm upgrade stuff, whereas we pull the environment logic out of helmfile, maybe, yeah.
C: If we... this kind of jsonnet approach also means, you know, because I'm externalizing the execution of Chef (although, you know, we do that in helmfile anyway), we could just change this to point to, you know, whatever puts the JSON source in; it doesn't really matter. And, in fact, I think when we talk in the discussions around replacing Chef for the nodes that we are going to keep, there's a bigger discussion there about how we do things like service discovery for things.
C: Like, I would personally almost argue: should we be putting Consul... sorry, Gitaly servers in Consul? Like, should we be relying on text files and roles for, like, our service discovery for Gitaly nodes and things like that? Do you know what I mean? So, I don't know.
C: Over time this will just become easier anyway, because a lot of it is just, oh, we need to sync a list of servers we have from Chef somewhere, and it's like, well, should we actually be having that in our Chef or Ansible or whatever system at all? Should that actually be in Consul, where, you know, it's a live set? It's...
B: Yeah, yeah, I guess, but the thing is, a Gitaly server doesn't need to know... so Gitaly servers are going to be managed by Chef for the foreseeable future. At least, we don't have a plan to move them into Kubernetes, unless we switch to something... no.

C: No, no, there is this... but there's a discussion to replace the configuration management. So, basically, we've got a deadline on Chef: we either have to pay for an enterprise license or move to something else, and that's supposed to be tackled in March or whatever. So there is a... this comes out of the compliance audit or something. So all we've got to do is either pay the money for Chef Enterprise or... and...
B: But yeah, I was just making the point that the list of file servers, the list of Gitaly nodes... once we move the front end to Kubernetes, all that configuration will be... Chef will no longer need that configuration at all, because... Kubernetes, right? So, okay, but...
B: So yeah, this looks pretty cool. Like, I'm... yeah, I think we need to kind of figure out the timing, whether it might...

C: Absolutely, I agree, I think. And then once we hit that point, we kind of... because at the moment we're like, oh, Chef is the single source of truth, let's pull from Chef. But once we do that flip, we'd almost say that, well, now the Kubernetes manifest repo is perhaps the single source of truth, and I don't...
A: ...dropped it. If you've got your MR, I'll put the issue in, but yeah, it'll be great to see.