From YouTube: CNCF End User Lounge: Platform Evolution - 5 years of Kubernetes at Sky Betting and Gaming
A: All right, so hello everyone, welcome to the CNCF End User Lounge, where we explore how cloud native technologies are adopted by end user organizations across different industries and sectors. The CNCF end user community is formed of more than 155 vendor-neutral companies that use open source software to deliver their products. I'm Ricardo Rocha, I'm a computing engineer at CERN, and today I have Andy Burgin as a guest speaker.

A: So in this live stream we bring end user members to showcase how their organizations navigate the cloud native ecosystem to build and distribute their services and products. You can join us every fourth Thursday at 9:00 a.m. Pacific. This is an official live stream of the CNCF, so it's subject to the CNCF code of conduct. Please do not add anything to the chat or questions that would be in violation of that code of conduct. Basically, please be respectful of all of your fellow participants and presenters. If you have any questions for us during the stream, we will be monitoring the chat, so make sure you ask them in the live stream chat.

A: So this week we have, as I mentioned, Andy Burgin here. He will talk to us about platform evolution and five years of Kubernetes at Sky Betting and Gaming. Before we dive into the questions, Andy, do you want to briefly introduce yourself?
B: Yeah, hi everyone, thanks for joining the stream or watching the recording, and hi to you, Ricardo, nice to be here today and thanks for inviting me on. So I am a lead platform engineer within the infrastructure and platform engineering squad, inside the infrastructure and platform tribe, at Sky Betting and Gaming.
B: I've been at Sky Betting and Gaming for over seven years now. I originally started as a devops engineer in the Bet Tribe, moved to work with Hadoop in the Data Tribe, and for the last three and a half years I've been working with the Kubernetes team, which has been great. Before that I did lots of things in digital marketing: many different hats, many different skills, from dev, from ops, from production management, finance and all sorts of things, but I'm really enjoying being back in the tech.
B: Now, outside of my day job, I run the local devops meetup in Leeds, in West Yorkshire in the UK, and I'm also part of the organizing team around DevOpsDays London, which is a conference that supposedly happens annually, but obviously over the last year things have been somewhat difficult around that, as we all know. We hope to be back next year, so we're looking forward to that, and looking forward to just going to conferences in general, including KubeCon in that list right now.
A: Awesome, that sounds pretty exciting, lots of things, yeah. I agree about the conferences; it's been pretty good to have the first physical one after all this time, with North America. But I guess we can dive into the questions. I would start: maybe you can tell us a bit more about the infrastructure setup at your company, and specifically maybe you could explain a bit what the specific technical hurdles are that bookmakers have to face.
B: Okay, let me set the scene a little bit on that. So we are an online bookmaker. We've actually been around for over 20 years, initially as the betting arm of Sky television in the UK. We were to be the red button on the remote: you would press that and be able to place a bet, was the idea behind it. But that was a very long time ago, and since then obviously things have evolved. Sky Betting and Gaming started its own technology stack over a decade ago and that's been growing steadily. We offer a range of products and services around sports betting and also gaming as well, so that encompasses poker and sports and slots and all sorts of other entertainment products like that.
B: I think the main thing with our industry, where it perhaps differentiates from a lot of others, is really the nature of the traffic patterns we get, and the technology stack we have to have to deal with that, coupled with the regulatory work we have to do as well to make sure that we are looking after our customers and are compliant with the regulations. It gives us a number of problems which we have to use technology to solve, particularly to do with load.
B: I think many people who work in retail will be familiar with the busy days, the Black Fridays etc. Well, we tend to have at least one of those a week in our industry, and we have to deal with unpredictable demand. Perhaps on the gaming side of things, and this is a sweeping generalization, we know the patterns and can plan around promotions and things like that, but with sports betting we really are at the whim of what happens in the sports game.

B: So typically in the UK the soccer games kick off at 3 p.m. on a Saturday afternoon, and there's a large spike in activity of people placing bets up to that mark. Several years ago it would then have dropped off immediately, and we would have been really quite quiet until the end of the day, when we were settling the results. But now we've got in-play markets etc., so we don't quite know the demand we're going to have on the services, depending upon what events happen in the sports games.
B: So you know, we've got a very spiky traffic pattern which is kind of unpredictable as well, so we have to have systems which can deal with that sort of scale and obviously make sure they're available for our customers. They're the kind of challenges we face. And I suppose, to answer your question, the traditional stack that we had in the pre-Kubernetes days would have been very much VM based, running out of data centers, and building applications with enough capacity to deal with the load. Obviously things are a little different now, but that's kind of how things were when I started.
A: That's super interesting, actually. I guess one of the questions, or one of the points I'll save for later, is understanding how you manage these spikes, and maybe the over-provisioning of resources; I'm actually also interested in whether you're running on premises or on public cloud. But maybe we can start with your transition to Kubernetes. You just mentioned the virtual machines: can you tell us a bit about your transition to Kubernetes and cloud native, and how did you get that going?
B: Okay, yeah, great, good question. So yeah, you've set that one up nicely, haven't you? So we were about to start on a Kubernetes journey, and this is back in 2016.
B: It wasn't meant to be a Kubernetes journey. It was going to be a journey of finding what could provide the next generation of hosting platform for the Bet Tribe: what platform could we put together that really made it easier for developers?

B: I think, as kind of operations engineers, we maybe approach problems by thinking about what technology can solve for us, but really this whole journey started with how do we get our developers to go quickly. We're in a very fast-changing market, with a lot of companies in competition with us. How can we get products to customers? How can we make it easy for our developers? Because unless code is in front of our users, it is kind of worthless.
B: So how can we make that easy, and how can we address some of the problems that we were having with the more traditional infrastructure we had, you know, the quality gates, the bottlenecks? How can we remove those, but still do it in a safe way? The objective really was around creating a platform which had as few human interactions as possible between somebody pushing code to a repo and an automated process getting that onto the servers in front of people.
B: That was the objective, and to do that we set about building out some proofs of concept. First of all, obviously, you've got a technology choice, and back in 2016 Kubernetes was still relatively new and wasn't as mature as some of the other container stacks that were around then, so there was some technical evaluation done on that. Also, at the time, this was one of the first pieces of work which wanted to run in public cloud as well.
B: So we did the initial proof of concept to check out the technologies and settled on Kubernetes. I'm very pleased to say it was before my time in the team, but I'm glad they chose it. After that, it became: how do we build, with the developers, the platform that they need? So we worked with a team that has one of these spiky workloads I referred to earlier, what we call our push team; they handle the updates on a typical sports website.
B: There are lots of events happening, not just football games but things that happen in football games, or netball, or whatever it is, and these events can change the prices of markets that people can gamble on. So there's literally thousands of updates a minute going through, which all need to be reflected on the device the user is using to interact with us.
B: So we worked with the push team to build out an MVP first of all, using Container Linux, as it was called at the time, on AWS, provisioning the cloud storage and cloud load balancers we needed for that. What that allowed us to do, for the updates platform, was that when there weren't many updates we could scale it right down, and when it was busy we could scale it up. That platform was very successful, and it went on to become the Kubernetes platform, which was fairly widely adopted around the business.

B: And here we are now, five years later, with a whole bunch of people around the business, from different departments, using it.
A: Super interesting. I think the early adopters all went through that same process of deciding which orchestrator and container platform they should choose, so that's also very nice to hear. Maybe you can also dig into the stack: you just mentioned that you deploy on AWS. Do you use managed Kubernetes, or is it your own deployment?
B: Yeah, sure. So obviously we're talking 2016, and I think GKE was available back then, but in its very early days too. So we decided to do things the hard way, as was the way to do things back then. Well, we didn't do things completely the hard way. As I mentioned before, we use Container Linux, so CoreOS, as the base for our solution, and we provision that through a bunch of Terraform, which in our case provisions EC2 instances that run CoreOS.
B: They PXE boot into CoreOS and work with some Container Linux technologies called Matchbox and Ignition to pull down pre-rendered configuration for those nodes, and then effectively boot from scratch. They go into the early user space, where they take the Matchbox and Ignition configuration and apply it to the OS before it goes into proper user space.

B: So it's kind of a pre-boot thing inside Container Linux. We use it to provision and set up the node with all the specific settings and configuration, mainly systemd files, and then we boot properly, and that's when the operating system spins up. That means that if we reboot a node, we start from scratch. We do have some persistent storage on there, volumes mounted off file shares to store things like Docker images, because that can act as a cache.
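To make that flow concrete, here is a minimal sketch of the kind of Container Linux Config that Matchbox renders into Ignition JSON for a booting node; the unit and kubelet settings are illustrative assumptions, not Sky Betting and Gaming's actual configuration.

```yaml
# Illustrative Container Linux Config (transpiled to Ignition by Matchbox).
# The unit, flags and addresses below are assumptions for the example.
systemd:
  units:
    - name: kubelet.service
      enabled: true
      contents: |
        [Unit]
        Description=Kubernetes kubelet
        After=docker.service
        [Service]
        ExecStart=/usr/bin/kubelet --config=/etc/kubernetes/kubelet.yaml
        Restart=always
        [Install]
        WantedBy=multi-user.target
storage:
  files:
    - path: /etc/kubernetes/kubelet.yaml
      filesystem: root
      mode: 0644
      contents:
        inline: |
          kind: KubeletConfiguration
          apiVersion: kubelet.config.k8s.io/v1beta1
          clusterDNS: ["10.3.0.10"]
```

Because the node applies this in the pre-boot phase every time it starts, the running OS carries no hand-made state, which is what makes the reboot-from-scratch model he describes work.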
B: We don't have to pull down all the containers every time we start them up, but other than that a lot of stuff is held on a memory disk, so it's a slightly different setup to some other clusters. What that really means is that things like upgrades are a change of a version number in a repo; we then republish all the Matchbox and Ignition stuff through Terraform, and when the nodes reboot they pull in a new image of CoreOS. That works really well.

B: We run the control plane in high availability, across a couple of nodes. The etcd database is backed up very regularly and, yes, we have tested that we can restore it as well. Like you say, we use a lot of Terraform to actually do the provisioning, plus the other system components you would expect, like a monitoring stack based on Prometheus.
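A regularly backed-up etcd can be as simple as a scheduled snapshot job; this is a minimal sketch assuming etcdctl v3 against a local control-plane endpoint, with the image, certificate paths, node label and schedule all being assumptions rather than their actual tooling.

```yaml
# Illustrative etcd snapshot CronJob; paths, image and schedule are assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 * * * *"              # hourly snapshot
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true          # reach etcd on the node's loopback
          nodeSelector:
            node-role.kubernetes.io/master: ""   # label name varies by setup
          containers:
            - name: snapshot
              image: quay.io/coreos/etcd:v3.5.9
              command:
                - /bin/sh
                - -c
                - >
                  etcdctl --endpoints=https://127.0.0.1:2379
                  --cacert=/etc/etcd/ca.crt
                  --cert=/etc/etcd/client.crt
                  --key=/etc/etcd/client.key
                  snapshot save /backup/etcd-$(date +%Y%m%d%H%M).db
              volumeMounts:
                - name: etcd-certs
                  mountPath: /etc/etcd
                  readOnly: true
                - name: backup
                  mountPath: /backup
          restartPolicy: OnFailure
          volumes:
            - name: etcd-certs
              hostPath:
                path: /etc/etcd
            - name: backup
              hostPath:
                path: /var/backups/etcd
```

The restore test he mentions matters just as much as the snapshot itself: a backup that has never been restored is only a hypothesis.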
B: We use some of the other services which were already running and supplied for developers around the business. So we don't run our own logging stack: that goes into our Elastic stack, which is run by one of the other teams in the business, so we kept that familiar stack of tooling which the developers knew. And now we don't just run on AWS; we also run on-prem, using the same Terraform and Ignition scripts, although they are slightly customized for different provisioners, for things like storage and for the virtual machines as well.
B: So we run those on VMware, but it's essentially the same regardless of which environment you're running in, apart from the nuances of storage and load balancers etc. Essentially we keep the same stuff, and that's allowed us to keep parity between all the environments. We run about five clusters.
B: We don't have a lot of clusters, and we run them independently as well. We don't have a cluster mesh over the top of them, although we do run a service mesh, Istio, but we run that locally on each cluster.
A: Super nice. One question I have, just before we jump into the Kubernetes details: you have a bunch of small socks behind you?
B: So yes, on the wall behind me. We recently moved offices during the pandemic, so I left the office in March 2020 and I've only been back once, to collect stuff, because we moved offices. We're now in a building that's entirely owned by the company, which is really nice, and it's a completely custom setup. But in the office we had a few bits of customized stuff.
B: We had things around the place just to make it feel like home, so the Kubernetes sign behind me, which says platform engineering, used to hang above our desks, and basically, when I went in to collect my stuff, I stole it. I don't think work knows that, so I probably shouldn't have said that out loud. The socks behind me are conference swag; many of those will have been from a KubeCon or two. And at the time we were going through our SOC audit, so there's kind of a pun there.
B: We used to have those in the office with a graph of the number of socks we had versus the number of socks we were going to declare to the auditors, as kind of a joke. So yeah, that was our official "socks audit" for the Kubernetes platform.
A: Beautiful, pretty good. All right, so I guess, digging a bit more into the Kubernetes part: you mentioned how you started, and eventually you had to manage growth as things picked up. So I guess I have two questions here. One is just about the growth in usage once things got popular; the other is related to what you mentioned at the beginning about handling spikes. Do you have some sort of autoscaling, and how do you manage that?
B: Yeah, the growth of the cluster. So, as I said, we built it for one customer and one use case, and it gained popularity really quickly, and with that come some challenges, because you've not only got to scale tech, you've got to scale people, and you've got to scale the way you work as well.
B: So I think when we moved to on-prem there were some changes we had to make around the code base; we made some optimizations at that point to handle some of the growing pains we'd seen in the first iteration of the cluster. For example, on AWS we could use the cluster autoscaler to deal with workloads: as things got busy we could pop up more EC2 instances to run more workloads on, and obviously, as it got quiet, we could scale that down as well.
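For reference, the AWS cluster autoscaler he describes is typically wired up along these lines; a sketch of the relevant container spec assuming auto-discovery via Auto Scaling group tags, since the talk doesn't show their actual flags, and the cluster name is made up.

```yaml
# Illustrative cluster-autoscaler wiring for AWS; cluster name, tags and
# version here are assumptions, not the actual deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.22.2
          command:
            - ./cluster-autoscaler
            - --cloud-provider=aws
            # Discover node groups via tags on the EC2 Auto Scaling groups.
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
            - --balance-similar-node-groups
```

When pods become unschedulable the autoscaler grows the tagged ASGs, and it shrinks them again when nodes sit underused, which is exactly the busy/quiet cycle he describes; on-prem there is no equivalent elastic pool, hence the over-provisioning mentioned next.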
B: So that was great on AWS, but on-prem that's not something we can do; we have to kind of over-provision for on-prem. The bits we had to swap out were things like the storage provisioners. If you want a slice of storage on the AWS-based clusters, you get an EBS volume; if you're on-prem, you get a slice of NetApp provisioned through our in-house storage arrays. For load balancers, you get an ELB in AWS.
B: So although the provisioners for the cloud were fairly well understood, we had to write some custom stuff to do that on-prem, because we didn't want developers to be slowed down by having to configure F5s or request storage. We kept the same volume storage interface and just changed the provisioner, which makes it sound really simple: a lot of work went into that, and the same with load balancer provisioning.
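That provisioner swap usually comes down to publishing the same storage class name with a different backend per environment; a sketch assuming the stock EBS provisioner in AWS and NetApp Trident on-prem, whereas their on-prem provisioner was custom-built.

```yaml
# Illustrative per-environment StorageClasses behind one common name.
# AWS clusters: dynamic EBS volumes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
---
# On-prem clusters: NetApp via Trident (an assumption; SB&G wrote custom tooling).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: csi.trident.netapp.io
parameters:
  backendType: ontap-nas
```

Because the class name is identical everywhere, a developer's PersistentVolumeClaim is portable across clusters, which is the parity being described.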
B: Obviously, we didn't necessarily want our developers to be logging into F5s and configuring those when they could just declare the state they wanted their network connections to be in and have the cluster do it for them, and obviously we put that together as well. So those were some of the challenges we faced in keeping parity as we changed environments.
B: In terms of growing, I think when we were working more closely with certain teams we hadn't necessarily anticipated the challenges ahead, particularly multi-tenancy. I think the initial year of the cluster was without RBAC, because it wasn't there yet; RBAC was added shortly before I arrived at the cluster, but that presented some challenges, because how do we manage that for both environments? We've got a solution based off Vault and LDAP groups, which allows teams to authenticate and get access to the cluster.
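The mapping from LDAP groups to cluster permissions he describes typically surfaces as ordinary RBAC bindings once authentication has resolved the groups; a minimal sketch with hypothetical group and namespace names.

```yaml
# Illustrative RBAC: members of an LDAP group (surfaced as a Kubernetes group
# via the Vault-backed auth flow) get edit rights in their own namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: push-team-edit
  namespace: push-team            # hypothetical tenant namespace
subjects:
  - kind: Group
    name: ldap-push-team          # hypothetical LDAP group name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                      # built-in aggregated role
  apiGroup: rbac.authorization.k8s.io
```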
B: From that, they're restricted in which namespaces they can work in. We've done a lot of work on putting in sane defaults and least-privilege security when those namespaces are created, so that when you get on the cluster you're kind of locked down to start with, until you unlock the bits you need: you have to set up your network policies, you need to set quotas and so on. By that we've kind of managed the expectations of the customers getting on.
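Those locked-down namespace defaults often look like the following pair of objects stamped out at namespace creation; this is a sketch of the pattern with illustrative values, not their actual policy.

```yaml
# Illustrative namespace defaults: deny all traffic until the team opens it up,
# and cap resource consumption until quotas are consciously raised.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: push-team            # hypothetical tenant namespace
spec:
  podSelector: {}                 # applies to every pod in the namespace
  policyTypes: [Ingress, Egress]
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: default-quota
  namespace: push-team
spec:
  hard:
    requests.cpu: "4"             # illustrative starting limits
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```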
B: We've got a support channel where people can raise support requests and tickets and ask questions, and we can help them there. But I think the main thing we found, in terms of that growth, was that our users didn't always understand the line between what was a Kubernetes thing and what was our Kubernetes thing. And there was an expectation from ourselves there: we expected our development teams to learn how to build apps for Kubernetes and also how to maintain and manage those. Off the back of that we've put in a lot of training: we've trained over 400 developers on a couple of different courses on how to build and write apps for Kubernetes, so they can get that right.
B: I've heard it said that Kubernetes isn't a developer tool. I'm not sure whether I agree with that definition, but I think there's definitely a barrier to entry there, and whether it's massive or small largely depends on the developers we're working with.
B: As an example, we've got developers who would gladly be given root access on everything and would love to insert records directly into the etcd database in the control plane, given the opportunity to do so. But at the other end of the spectrum, we've got people that just want to put a few lines of YAML together and aren't that interested in what's behind it, because, quite rightly, developers have got a whole lot of other stuff to deal with.
B: You know, the domain knowledge of the actual problems they're trying to solve, the code they're trying to write, the business logic. I think the expectation that they'll go away and learn Kubernetes as an afterthought is something that doesn't really work. I think we got bitten a little bit by that, and hence we had to retrospectively do quite a bit of training to help developers easily understand what they need to do on our clusters.
A: All right, that's very interesting. Maybe I have another question about the management of the clusters: looking back, is there anything you would have done differently?
B: I think, given a time machine, we would have put more developer tooling in place, or encouraged the teams that we worked with initially to do that. If we were starting again from scratch now, we would certainly have some opinionated ways of building apps for Kubernetes and of what was supported on there. But as with all ecosystems that evolve, the Bet Tribe in particular has now put together a standard way of building applications. After a couple of years of people going off and doing their own thing, or being influenced by what other teams have done, a pattern has evolved of how things should be done, and we have a team that is building an application Helm chart, which allows developers to build applications based off a set of base images which are regularly updated.
B: They can take their applications, there are pipelines built to deploy them onto the clusters, they get a set of standard dashboards, and they get a bunch of tooling and references to where they can find the logs etc.
B: As I say, if we could start again we would perhaps have done things a little differently. One of the things we've done over the last three years, and certainly in my day job as well, is lean heavily on this focus on developer experience to solve a lot of the growing pains we had with the cluster.
B: We got a lot of users on there fairly quickly, and I think we suffered the growing pains internally, in how we were working with the clusters. So we've done a heck of a lot over the last three years to smooth that out, starting with just basically talking to more and more of our users about what they want from the cluster and how they're going to use it. Understanding who was actually using our cluster was quite a big undertaking.
B: We tried to understand which workloads belong to which teams, because they can move around as well. So we now tag all the workloads on the clusters with metadata: there are labels which indicate who owns what, and that's allowed us to do a load of really cool stuff. It's allowed us to shard the logging, so rather than just having one logging pipeline we can do that per tribe. There's been a lot of work done on that. As I mentioned, we go out, we speak to teams, we talk about requirements.
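That ownership metadata is usually just a consistent label scheme applied to every workload; a sketch with hypothetical label keys, since the talk doesn't name the actual convention.

```yaml
# Illustrative ownership metadata on a workload; the label keys and values
# are hypothetical, not SB&G's actual convention.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: odds-updater
  namespace: push-team
  labels:
    sbg.example.com/tribe: bet          # owning tribe, used to shard logging
    sbg.example.com/squad: push         # owning squad, for support routing
spec:
  replicas: 2
  selector:
    matchLabels:
      app: odds-updater
  template:
    metadata:
      labels:
        app: odds-updater
        sbg.example.com/tribe: bet      # repeated on pods so collectors see it
    spec:
      containers:
        - name: odds-updater
          image: registry.example.com/push/odds-updater:1.4.2   # hypothetical
```

Once every object carries labels like these, log shippers, cost reports and policy dashboards can all group by owner without any extra bookkeeping.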
B: We take that feedback back so we can act on it and understand the workloads they're running. But it's enabled other stuff as well, around things like best practice and standards. We put together a whole bunch of ideas and goals and best practices, call them standards: the principles of how you build and run an application at Sky Betting and Gaming on a containerized platform. So we've got standards around build, run and deploy now, and that was built with input from everybody who was using our cluster.
B: So we've got a collective mindset on that; it's not just our opinionated version of what looks good. Then we built tooling around that to check on it as well, and to provide dashboards etc. that indicate where things aren't following the rules, along with some possible solutions to fix that.
B: So we've done a lot of work on that, and it's evolved further into things like understanding costs and education on resources, so that we can run things efficiently as well.
A: Yeah, I think you covered a lot of the challenges, and it sounds very good. But one thing, and maybe you already mentioned it: if you had to name the main problem you face today while running your clusters, would you highlight something? You mentioned a bunch of stuff that is tricky to handle.
B: Well, I think from the technical side, you're always going to have, not so much a Kubernetes problem, just a running-computer-systems problem, really a distributed systems one: you're going to face problems with problem workloads and with components of the system not behaving.
B: There's constantly keeping things up to date, the evergreening and the management of that, and then of course probably the big one, whichever system you're running, is going to be capacity, particularly in an on-prem environment. Do you have enough storage? Do you have enough network bandwidth?
B: Is your monitoring able to scale with your workloads? And then it comes down to right-sizing workloads, getting the right requests and limits on them, and trying to support teams to get that right. We find that particularly challenging, because I don't think there's a great range of tooling out there to help with that. We've built some in-house tools and we're building more; we know this is a problem, and in order to get our development teams to understand and to set their requests and limits correctly, we need to help them do that.
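Right-sizing here means matching a container's declared resources to what it actually uses; a minimal sketch of the fields in question, with made-up numbers.

```yaml
# Illustrative right-sizing: requests reflect observed steady-state usage,
# limits leave headroom for the traffic spikes described earlier.
apiVersion: v1
kind: Pod
metadata:
  name: bet-placement
spec:
  containers:
    - name: app
      image: registry.example.com/bet/placement:2.0.1   # hypothetical
      resources:
        requests:
          cpu: 500m        # roughly the observed typical usage; drives scheduling
          memory: 512Mi
        limits:
          cpu: "2"         # spike headroom; too tight and the app gets throttled
          memory: 1Gi      # exceeding this gets the container OOM-killed
```

A memory limit set too low produces exactly the OOM kills mentioned next, and a tight CPU limit shows up as throttling in the metrics.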
B: We can't just produce graphs and then point out inefficiencies, or things getting OOM-killed or CPU-throttled etc.; that's not going to help. So we need to put better tooling around that. There are some other day-to-day challenges, but many of them are just keeping things up to date, making sure we're maintaining uptime, keeping things reliable.
A: Sounds very good. I'm just checking if there's a question; I don't see any. So maybe we can switch the topic slightly, away from the technical or tooling part: can you tell us what your experience has been as an end user in this community? What's your feeling about the interaction with other end users and with the tools you mentioned?
B: Yeah, I think, from the end user community side, being a member of that is a real boon when you're at the conference. I think attending KubeCon is something the team have really enjoyed. I've not actually been to one yet, I've got to be honest about that, but I am hoping to get there. I do like physical conferences: I've been to the virtual ones, but I love the whole hallway track etc.
B: But then again, I am a conference organizer, so I'm a little bit opinionated on that. But yeah, KubeCon is certainly something which the team have been to, and they have come back full of ideas, full of different approaches to doing stuff.
B: I think the main takeaway I get from the team when they've been and come back is that they had a plan of what they were going to see, and obviously, as you'll know, KubeCon is a massive conference with many, many tracks of talks to see, and they always come back waxing lyrical about the things they didn't expect. Almost all of them have said that when they went to the popular talks they couldn't get in, and actually the ones they went to because they were nearby, or looked interesting, were the ones where they picked up these little tidbits, these little interesting bits of knowledge, which have come back and been used.
B: I can't remember if it was Copenhagen, I think that was the one we went to before they went to Barcelona, but they came back from that saying: this is brilliant, we have to use OPA, it's obviously something we can use to help our teams on the cluster. Without actually ending up in that talk, we would obviously have known about it eventually, because it's a huge topic now, but I don't think we'd have had that kind of early visibility of it. I think a lot of our early adoption was based around talks and examples and demos and talking to other people at KubeCon. So yeah, I think it's even more than just the conference, which of course is great and important.
B: I think supporting the CNCF is important, because we rely heavily on the projects which it looks after, so supporting that is super important to us. So yeah, the end user community is really important, and so is KubeCon.
A: That's brilliant, and yeah, we're all hoping that normality will come back.
A: Fingers crossed, it looks like it's happening. So you actually mentioned a lot of the tools that you are relying on: you mentioned of course Kubernetes, you mentioned Prometheus, you mentioned Helm, and OPA just now. I'm kind of curious, because you have a pretty large deployment and, interestingly, you have both on-premises and public cloud deployments, so it's multi-cluster.
A: You mentioned that you don't do any kind of communication between the clusters, which is also kind of common, I think, from what I hear. You also mentioned challenges in costs and things like this. Are there any tools or technologies that you're particularly interested in integrating in the near future, or that you're looking forward to looking at?
B: Yeah, there are a couple I can mention. So, for example, we're heavy on our Prometheus adoption and we have had constant requests for long-term storage of metrics, so VictoriaMetrics is something we're heavily looking into now. Obviously we want to manage that carefully, because we're aware that long-term storage means different things to different users, and we're particularly careful about how we manage our Prometheus instances as it is, based on things like the amount of cardinality the metrics have, start-up times, etc. So VictoriaMetrics is something we've rolled out and are starting to roll out to our customers now. That gives us long-term storage, but we want to do it in a manageable way; that's one of the things we're doing.
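Feeding Prometheus into VictoriaMetrics for long-term storage is normally a remote_write hookup of this shape; a minimal sketch assuming a single-node VictoriaMetrics at a hypothetical in-cluster address.

```yaml
# Illustrative prometheus.yml fragment: keep local retention short and ship
# samples to VictoriaMetrics for long-term storage. The URL is hypothetical.
global:
  scrape_interval: 30s
remote_write:
  - url: http://victoria-metrics.monitoring.svc:8428/api/v1/write
    queue_config:
      max_samples_per_send: 10000   # batch size; worth tuning for high cardinality
```

Keeping Prometheus itself lean while remote-writing matters for exactly the reasons he gives: cardinality and restart times grow with local retention.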
B: We've also used Gatekeeper, which is a tool that allows you to basically report on OPA policy states.
B: We've used that for our standards and best practice dashboard. We wrote an exporter which takes that data out in the format we want, because we've also got a lot of metadata tagging in there which can identify workload ownership and so on, so in the dashboards we can visualize that by ownership as well. So that's been a very useful technology.
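As a flavor of that Gatekeeper reporting, here is the stock required-labels constraint from the upstream gatekeeper-library pointed at the kind of ownership label discussed earlier; enforcementAction: dryrun makes it report violations through audit rather than block, which matches the dashboard use case, and the label key is hypothetical.

```yaml
# Illustrative Gatekeeper constraint: audit (don't block) workloads that are
# missing an ownership label. Assumes the K8sRequiredLabels ConstraintTemplate
# from the upstream gatekeeper-library is installed.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: deployments-must-declare-owner
spec:
  enforcementAction: dryrun        # record violations via audit, never reject
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
  parameters:
    labels:
      - key: sbg.example.com/tribe # hypothetical ownership label key
```

An exporter like the one they wrote can then read the violations Gatekeeper records in each constraint's status and expose them as metrics for the standards dashboards.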
B: There are various updates to the networking stack going on, and updates to Istio at the minute. The 1.22 upgrade is not without its challenges, I don't think.
B: We are working closely with our users, and I suppose the great thing about already having those communication channels with our users in place is that it's actually made that fairly straightforward: we're able to identify workloads and go and talk to the teams that are going to have problems when we do the upgrade. We're hoping to have everything in place in the next couple of weeks so we can go to 1.22. But I suppose the overall thing with our cluster is that, although we're always looking at new bits of technology and replacing existing functionality with newer bits, the thing we care most about is just stability, and updates to things. We've got a lot of operators that we run.
B: We want them to be stable. We want the underlying Kubernetes system to be stable. We want the monitoring stack to be stable. We want all the things that send data from the cluster to all of the other services which ingest our stuff to be stable. So stability is a big thing, and it's a dull answer, not a very exciting one, because it's not the new shiny tech, but we kind of like things just working.
B: I think one of the things we're really pleased about with our cluster is its stability, and we'd like to keep it like that. Nobody likes getting paged, and we want to keep it that way as best we can.
A: I think that's pretty fair, and yeah, I think the interesting bit here is also that you have a pretty large deployment, and it's interesting to see how you scale things like Prometheus and metrics, and that you're looking at these new products to handle that. I think for other end users this feedback is extremely useful.
A: So I don't think we have any questions. One thing maybe I would put here is: do you have something else that you would like to tell other end users or the community that we didn't cover here?
B: Yeah, one thing I haven't really covered, which has been really important to the team, is how we use Kubernetes not just to run workloads: we also use it to provision infrastructure. Now, obviously I mentioned things like load balancer provisioning and storage provisioning, but those are kind of the basic built-in primitives for Kubernetes.
B: So if you want a PV, you will get some storage; if you want a load balancer, I mentioned that's configured for you. But the automation really hasn't stopped there, so I'll give you some examples. The team that did the F5 automation are now trying to automate more things.
B: If you want to configure an F5 now for a virtual machine, you can do that through a code base where you commit YAML definitions for the load balancers you require. So even if they're outside of Kubernetes, we can use our provisioner inside the cluster to actually configure load balancers for things that aren't in the cluster.
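The talk doesn't show the committed format, but a definition for an in-cluster F5 provisioner plausibly looks like a custom resource of this shape; the API group, kind and every field here are invented for illustration.

```yaml
# Entirely hypothetical custom resource for an in-cluster F5 provisioner,
# describing a load balancer for a VM-based service outside Kubernetes.
apiVersion: lb.sbg.example.com/v1alpha1
kind: LoadBalancer
metadata:
  name: legacy-bet-api
  namespace: lb-definitions
spec:
  virtualServer:
    address: 10.20.30.40          # VIP on the F5
    port: 443
  backends:                       # VM targets outside the cluster
    - host: bet-api-01.example.internal
      port: 8443
    - host: bet-api-02.example.internal
      port: 8443
  healthCheck:
    path: /healthz
    intervalSeconds: 5
```

The pull request approval he mentions next is the gate: once a definition like this is merged, a controller reconciles the declared state onto the F5.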
B: Obviously there's a pull request approval on that, but it means that rather than logging into an F5 and configuring it for teams, it's now all done as code, which is obviously a massive benefit. The same with DNS entries: we've done a lot of automation in the cluster, and if you want a DNS entry, you can create a DNS object in the cluster.
B: That's got an operator behind it which will provision a DNS entry in our DNS provider through their API, and we'll handle all that and tear it down when you don't want it, etc. But equally, we've got another repo where people can put those DNS definitions and they'll just get created, even if the DNS record isn't used by something inside the cluster. So we're starting to automate bits of infrastructure through Kubernetes, even if it isn't Kubernetes.
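Likewise, the exact schema isn't given, but the DNS object he describes would be a small custom resource along these lines, with the group, kind and fields invented for illustration.

```yaml
# Entirely hypothetical DNS custom resource; an operator watches these and
# creates matching records in the external DNS provider via its API.
apiVersion: dns.sbg.example.com/v1alpha1
kind: DNSRecord
metadata:
  name: bets-public
  namespace: dns-definitions
spec:
  zone: example.com
  name: bets.example.com
  type: CNAME
  value: lb-frontdoor.example.net
  ttl: 300
# Deleting this object triggers teardown of the record, as described.
```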
B: Two more examples of that. Firewall automation is something we've been working on: every organization wants software-defined networking, and obviously we have over a decade's worth of network configuration in data centers, in offices, etc. We're now starting to build tooling which will configure some of the firewalls from things that are provisioned in Kubernetes.
B: Another example is cert-manager, which people are very familiar with: we're using that with our certificate providers now to manage certificates, and we're hoping to offer that outside of the cluster too. So there's lots of automation that we run inside Kubernetes to manage the resources that teams, developers, and indeed people in infrastructure need, but we can also offer that as the way to manage this stuff in an automated way outside of the cluster.
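For reference, requesting a certificate through cert-manager looks like this; the issuer and DNS names are placeholders, and the talk doesn't specify their actual issuer integration.

```yaml
# Illustrative cert-manager Certificate request; issuer and names are placeholders.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: bets-tls
  namespace: push-team
spec:
  secretName: bets-tls            # cert-manager writes the key pair here
  dnsNames:
    - bets.example.com
  issuerRef:
    name: internal-ca             # hypothetical company issuer
    kind: ClusterIssuer
```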
B: That's something we're building on and building on, and I think a lot of the automation we're going to be doing over the next 18 months for infrastructure is going to be powered by Kubernetes as well, even though it may never have a related workload on the cluster.
B: Yeah, and of course we're also doing that for things in the cluster as well. Operators: things like our in-house MySQL provisioner, which, with a small chunk of YAML, will create in your namespace however many nodes you want of a container-based replicated MySQL cluster, with the right amount of resource, and with the tooling to back it up and restore it built into that operator. So we've done a lot of work on that kind of thing.
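The small chunk of YAML for that in-house MySQL operator might plausibly look like this; the API group, kind and every field are invented for illustration.

```yaml
# Entirely hypothetical custom resource for the in-house MySQL operator.
apiVersion: db.sbg.example.com/v1alpha1
kind: MySQLCluster
metadata:
  name: bets-db
  namespace: push-team
spec:
  replicas: 3                     # replicated, container-based cluster
  version: "5.7"
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
  backup:
    schedule: "0 2 * * *"         # built-in backup tooling, as described
    retentionDays: 14
```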
B: We do offer some operators which we didn't write. Obviously the Prometheus operator is one which we use a lot to manage the Prometheus instances on the cluster, but we've got stuff for Redis and a few other bits and pieces as well. So that means that, obviously, developers don't have to go...
A: Yep, that's brilliant! Actually, we still have a couple of minutes, so I just thought of something else, because you were mentioning managing things that are not in the cluster, and you also mentioned that you have multiple clusters, multi-tenant. Just out of curiosity: how do you handle this setting up of external resources when you have multiple clusters? Are users allocated to a certain cluster, or do they see these resources everywhere?
B: Yeah, so I think the LDAP groups are obviously shared in the organization, so the LDAP groups you're in define which set of permissions you get, and then those are bound to access on certain mount points within Vault. We use Vault on a per-environment basis. I don't think we have a Vault in every environment, but I think the environment configurations are set within the same Vault instance, so they're not shared, but they may be managed on the same one.
A: All right, very interesting. I think this has been fascinating; thanks so much for all the information.
B: I don't know, that would be great. I like to say I spent eight years organizing meetups, then a few years organizing conferences, and I haven't done any of that for coming up on two years now, and I miss it. I'm looking forward to getting back to doing that and, yeah, talking to people about stuff and finding out what they're up to. So that'd be great.
A: Okay, okay, super cool. So then, thanks everyone for joining this episode of the Cloud Native End User Lounge. It was great to have Andy talking about Sky Betting and Gaming and how they use Kubernetes.
A: Don't forget, as we mentioned a couple of times already, to join us at KubeCon + CloudNativeCon EU; it's May 17-20 and we'll have a lot of the latest information from the cloud native community. Also, if you would like to showcase your usage of cloud native tools as an end user, you're welcome to join the end user community, with more details at cncf.io/enduser. So thanks again, everyone, for joining us today, see you next time, and thanks a lot, Andy, for the great session.
B: You're welcome, thank you, thanks for having me. It's been good fun, nice to share with people what we've been up to.