From YouTube: CNCF End User Lounge: How Airbnb manages a dense SOA of 1000s of services across dozens of clusters
A: Hello, everyone, and welcome to the CNCF End User Lounge, where we explore how cloud native technologies are adopted by end user organizations across different industries and sectors. Just as a reminder, the CNCF End User Community is formed of more than 150 vendor-neutral organizations that use open source tooling to deliver their products.
A: Also as a reminder, this is an official live stream of CNCF and as such subject to the CNCF code of conduct, so please be respectful to all of the fellow participants and presenters. If you have any questions for us, we'll be monitoring them throughout the stream, so make sure to ask them in the live stream chat. As mentioned, today I have Stephen and Sunil from Airbnb, and we're going to discuss how Airbnb manages a dense service-oriented architecture of thousands of services across dozens of clusters. Now, before we jump into the questions, Stephen, Sunil, would you like to introduce yourselves, please?
B: Sure. I've been working at Airbnb for the past three and a half years, and I've had the opportunity to work on two different teams working with cloud native technologies. The first one is our compute infra team, which manages the operations, scalability, performance, and so on of our Kubernetes clusters, as well as the tooling on top of them, like how we generate manifests and how we integrate with the existing infrastructure. The second team, which I'm currently on, is our service mesh team, which is building out the next generation of how our services are made observable, how they are secured, and how they discover each other.
C: Yeah, my name is Sunil. I manage the compute infra team at Airbnb; I've been here around 18 months. Prior to that I did a very similar thing at Yelp, but we used Mesos, and prior to that I worked at a company called Mesosphere, which developed the open source Apache Mesos project. So Kubernetes is kind of new to me since I came to Airbnb, but I'm very familiar with the idea of container orchestration, and the team is really focused on making Kubernetes the de facto compute platform at Airbnb.
B: I think, in order to get a good sense of our current setup, I can talk about our journey here from the very beginning of Airbnb. Back in 2008, Airbnb started out as a single monolithic Ruby on Rails app running in a single AWS account, and that worked very well for the early days and for launching lots of features. Then the team grew, and the company grew in terms of users.
B
We
naturally
had
to
start
splitting
things
up
right
so
that
we
had
a
lot
of
tooling
that
started
out
dedicated
to
just
this
kind
of
monolithic
app.
We
had
a
separate
deploy
app,
we
configured
the
hosts
which
ran
their
application
with
chef,
and
then
this
didn't
scale
so
well
as
we
split
up
into
an
soa,
so
users
had
to
end
up.
You
know
going
into
multiple
different
repos
to
change
their
configuration
they
had
to
each
like
deploy
each
repo
in
a
different
way.
Deploy
that
configuration
it
was
kind
of
error
prone.
B
They
also
had
to
manage
their
own
posts
right
and
not.
Everyone
is
comfortable
doing
that.
Auto
teams
were
comfortable
doing
that,
and
so
we
wanted
to
reach
for
kind
of
a
more
centralized
solution,
something
that
allowed
users
to
deploy
their
code
and
configuration
the
same
repository.
The
same
way
we
didn't
want
them
to.
B
And
so
we
started
looking
for
you
know
what
are
the
tools
that
we
can
use
to
to
help
us
achieve
these
goals
and
one
of
the
early
tools
that
we
reached
for
was
kubernetes,
and
so
this
was
about
in.
B
2016.,
you
know
there
were
many
iterations
in
between
I'll
start
with
kubernetes,
so
what
kubernetes
does
is
it
lets
us
integrate
really
well
with
our
existing
infrastructure?
So,
like
previously,
we
had
like
I
mentioned
users
had
to
go
into
one
repo
to
configure
their
hosts,
one
repo
to.
B
Configure
their
alerts
and
their
dashboards
and
so
on,
and
so
for
we
built
a
abstraction
on
top
of
kubernetes
called
onetouch,
and
that
allows
users
to
have
this
folder
called
like
underscore
infra
in
their
application.
Repo
and
under
underscore
infra
there's
a
few
files.
B
One
of
them
is
called
cubegen,
and
so
it's
just
a
yemo
file
which,
where
it
allows
users
to
configure
things
like
the
services
that
they'll
need
to
discover
their
cpu
and
memory
requests
and
so
on
and
that
gets
generated
into
kubernetes
manifests
and
there's
also
files
on
the
underscore
info
folder,
like
alerts
which
allow
you
to
they
become
transformed
into
custom
resources
and
that
that
integrates
well,
because
when
we
deploy
all
our
kubernetes
manifest
and
the
custom
resources,
then
we
have
a
custom
controller
which
listens
and
then
kind
of
make
sure
that
our
alerts
provider
is
synced
up
with
the
definition
that
users
have
and
so
there's
a
lot
of
a
lot
more
things
that
we
can
dive
deeper
into.
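As a rough illustration, a kubegen file of the kind Stephen describes might look something like this; the field names here are assumptions to show the shape (service discovery plus resource requests), since the real schema isn't shown in the talk:

```yaml
# Hypothetical sketch of an _infra/kubegen file; field names are invented
# to illustrate what is described (discovered services, CPU/memory requests),
# not Airbnb's actual schema.
name: listings-service
discovery:
  # downstream services this app needs to discover
  - pricing-service
  - reviews-service
resources:
  cpu: "2"
  memory: 4Gi
replicas: 3
```

A generator CLI would expand a file like this into full Kubernetes Deployment and Service manifests, so service owners never hand-write them.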
A: That's really great. You've mentioned that the previous state of the infrastructure didn't really allow for scale, or even give you a manageable way to deploy your services. That's definitely one of the things I've seen in the past: this has been quite a core motivation for end user companies to move to cloud native. Now, you've been mentioning Kubernetes, which you use in your platform, but you mentioned when you introduced yourself that you are involved with the service mesh. So could you please touch upon some of the other technologies used in addition to Kubernetes? Maybe what you're using for logging or for authentication, if you're allowed to say; some of the other core parts of your platform.
B: Sure, yeah. I mentioned Kubernetes a lot, but for deploys we're using Spinnaker. For our service mesh, we're building on top of Istio, and for our logging stack we're making a lot of use of the ELK stack. On authentication, I'll talk mostly about the internal service-to-service side: in our service mesh, Istio comes with SPIFFE, which is an authentication framework that allows all of our services to communicate over TLS.
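The talk doesn't show any configuration, but in stock Istio the service-to-service TLS Stephen mentions is typically enforced with a `PeerAuthentication` resource like this one:

```yaml
# Standard Istio resource enforcing mutual TLS between sidecar-injected
# workloads; under the hood, workload identities are SPIFFE IDs issued
# by Istio's certificate authority.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # placing it in the root namespace makes it mesh-wide
spec:
  mtls:
    mode: STRICT            # reject plaintext traffic between services
```

Whether Airbnb uses STRICT mode mesh-wide isn't stated; this is just the usual mechanism for the behavior described.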
A: It's really great to hear that an organization like Airbnb already integrates that to get secure communication between services. Now, you were talking about OneTouch; the first time I heard about OneTouch and this abstraction you have on top of Kubernetes was during KubeCon North America in 2018 in Seattle, which feels like a very long time ago now. You've mentioned that this is an abstraction that helps your developers deploy services, but I would like to ask how it actually impacts the maintenance of your clusters and services. Is it easier for you to deploy changes, or even completely roll out a service or delete a service? And another question I have in regard to this: how does it impact the immutability of your infrastructure and clusters?
B: Yeah, so I'll talk a little bit about how we can move services between clusters. I mentioned earlier that in each application or service repository there's the `_infra` folder, and inside it there's a kubegen configuration file. The structure of the kubegen file is that there are multiple environments for the app, so you might have production, development, and staging environments, and in each of those environments there's a single YAML field called context. The context is basically the cluster that this environment gets deployed to. When our kubegen CLI reads in the kubegen file, it generates out all the manifests, and then during deploy time that context is checked and all those manifests are sent to the correct cluster. So instead of having to specify the cluster multiple times, you just have a single field. And because the kubegen file is just plain YAML, it's easy to run automated refactors across these service repositories, where you change the context of an environment, so you can almost automatically move services between clusters.
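The per-environment context field Stephen describes could be pictured like this; again a hypothetical shape, with the cluster names invented:

```yaml
# Hypothetical kubegen environments block: each environment names the
# cluster ("context") it deploys to, so moving a service to another
# cluster is a one-line, mechanically refactorable change.
environments:
  production:
    context: prod-cluster-03   # change this line to migrate the service
    replicas: 10
  staging:
    context: staging-cluster-01
    replicas: 2
```

Because the whole file is plain YAML, a bulk refactor tool can rewrite the `context` value across hundreds of repositories at once.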
B
First,
you
can
deploy
to
the
new
cluster
and
then
gradually
scale
down
the
service
in
the
old
cluster,
and
we
can
also
even
run
automated
canary
analysis.
So
we
run
like
one
replica
in
the
new
cluster
one
replica
in
the
old
cluster
and
they
both
receive
production
traffic,
and
then
you
can
compare
to
make
sure
that
the
service
is
running
normally
on
the
new
cluster
before
migrating
the
rest
over
and
so.
B: For more challenging migrations, like when we're switching our CNI provider, our CNI plugin, we can create a new cluster with the new CNI plugin, move our services over one by one, and then gradually spin down the old cluster. It makes these kinds of transitions of cluster setup go from something that's really risky with a large blast radius to something much more routine.
A: Now, this sounds really cool, because migrating between clusters is challenging, especially in an automated fashion, but it seems like you already have the processes built in-house to allow this kind of operation. Another question I have is on developer experience, which is tied into how these services are delivered. With Airbnb, one of the talks that I've seen mentioned that you've gone from managing hundreds of services to hosting thousands of them in less than three years, and I'd like to ask how this impacted the developer experience. Do you maybe have new methodologies to troubleshoot, maintain, and debug applications when something is going wrong there?
B: I'm not sure exactly how many, but our goal initially was to allow service owners to create a production-ready service in just one hour, and this required a lot of the consolidation effort that I talked about at the beginning: moving from multiple tool repositories that service owners had to edit and deploy, into one single application repo. That's why we call it OneTouch, because it's just one place to install your code and configuration.
A
That
so
this
is
pretty
much
the
developer,
experience
that
you've
covered
so
far
and
I'm
curious
if,
for
example,
as
an
engineer,
I'm
using
one
touch
to
deploy
my
application.
But
what
is
the
process
if
I
would
like
to
troubleshoot
it
or
maybe
verify
if
it's
running
in
the
right
clusters
with
the
right
amount
of
pods?
If
I
can
get
the
logs
like
what
are
those
troubleshooting
procedures
internally,.
B
Got
it
yeah
so
for
that
we
have
a
command
line
tool,
just
it's
called
k,
which
is,
you
know,
similar
to
a
lot
of
people's
aliases
for
not
qctl,
but
in
this
case
it's
our
own
cli
that
wraps
around
a
lot
of
common
workflows
that
you
mentioned
like
executing
into
a
pod
for
interactive
debugging
or
like
removing
it
from
service
discovery
for
that
kind
of
interactive
debugging,
while
making
sure
that
it's
deleted
later
checking
the
logs
and
some
of
the
different,
like
the
reason
why
we
have.
B
This
is
because
we
want
a
more
kind
of
like
application-centric
focus
on
on
the
cli.
So
some
of
the
arguments
that
you
can
provide
to
it
are
like
the
app
and
the
environment,
and
then
we
derive
the
name
space
from
that
by
concatenating.
The
app
and
environment
together.
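A minimal sketch of that app-centric idea, assuming a simple `app-environment` concatenation scheme (the naming convention and commands here are illustrative, not Airbnb's actual code):

```python
# Hypothetical sketch of how a wrapper CLI like Airbnb's `k` might derive
# the Kubernetes namespace from app and environment before shelling out
# to kubectl. The concatenation scheme is an assumption for illustration.
import shlex


def namespace_for(app: str, env: str) -> str:
    """Derive the namespace by concatenating app and environment."""
    return f"{app}-{env}"


def k_logs_command(app: str, env: str, pod: str) -> str:
    """Build the kubectl invocation the wrapper would run for `k logs`."""
    ns = namespace_for(app, env)
    return f"kubectl logs {shlex.quote(pod)} --namespace {ns}"


print(k_logs_command("listings", "production", "listings-7d9f-abcde"))
# → kubectl logs listings-7d9f-abcde --namespace listings-production
```

The point of the wrapper is that developers think in terms of apps and environments, never raw namespaces or contexts.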
A
You
know,
that's
really
cool
I'll.
I
would
definitely
like
to
try
try
that
out.
Do
you
by
any
chance,
do
you
maybe
have
a
version
which
is
open,
source
and
open
source,
and
the
listeners
can
can
verify
by
themselves
or
they
just
have
to
inspire
themselves
from
from
what
you've
been
telling
so
far,.
B: We've talked about that for a little while and have been considering it. Sunil, do you have any insight into the current thoughts on that, for OneTouch or for the `k` tool?
C
Yeah,
we
haven't
really
pushed
boards
on
that
because
I
think
there's
a
lot
of
duplication
between
k
and
cube
ctl
and
a
lot
of
that.
The
sort
of
functionality
provides
is
very
specific
to
how
airbnb,
like
you
know
how
we,
how
we
manage
namespaces,
how
we
manage
applications
and
so
there's
a
little
bit
of
like
proprietary
stuff
there.
C
I
know
we
have
spoken
in
the
past
about
making
elements
of
one
touch
open
source,
although
I
suspect
the
industry
has
moved
forward
since,
because
I
think
at
the
time
it
was
a
relatively
novel
way
to
manage
application
resources.
So
we'd
love
to
but
yeah,
not
a
huge
amount
of
progress.
Yeah.
B: There are also other ways that kubectl now allows you to integrate, like kubectl plugins, which are another good way of adding your own workflows around it. Initially, `k` started as scripting around kubectl, but now it's a full-fledged CLI and we've got plenty of other workflows. For example, we also use `k` to allow developers to get access to their Kubernetes clusters, to get their authentication credentials.
A
I
wanted
to
mention
before
that,
if
there
are
still
any
kind
of
intentions
to
open
source
that
I
think
the
community
is
going
to
look
forward
to
to
these
tools,
because
I
think
there
are
so
many
organizations
that
are
still
at
the
beginning
of
the
journey
to
adult
cloud
native.
I
think
that's
definitely
going
to
be
useful
to
some
to
some
sectors
and
you've
mentioned
cubesat
plugins.
I
think
again.
This
is
a
great
way
to
personalize
and
tailor
the
way
you
consume.
A
Cubesat
commands
and
the
way
you
interact
with
your
cluster,
so
definitely
kind
of
a
great
usage
at
airbnb
is
great
to
hear
about
this
one
another
point
I
wanted
to
make
about
the
developer
experience,
which
is
a
bit
maybe
less
technical,
being
developer
centric,
it's
quite
important.
B
Yeah,
I
think,
like
I
mentioned,
that
one
of
the
benefits
of
separating
out
the
the
management
of
infrastructure
and
product
teams
is
that
there's
a
there's
the
ability
to
specialize
right.
So,
if
you're
on
a
product
team
or
service
owner
you
can
now,
you
can
more
confidently
deploy
your
changes
to
to
production
and
because
we
have
a
more
homogeneous
and
and
managed
infra
and
there's
less
time.
That's
required,
for
you
know,
firefighting
and
distractions
on
reliability
issues
so
that
allows
our
product
teams
to
achieve
a
better
velocity.
C
I
think
the
reliability
story
is
really
interesting
too,
because
one
of
the
benefits
of
like
the
reason
organizations
move
to
microservices
based
architecture
is
because
it
really
helps
your
engineering
teams.
Scale
right,
like
airbnb,
has,
I
think,
at
this
point
thousands
of
engineers
and
the
monolithic
application.
I
think,
wasn't
working
well
it's
at
the
time,
just
because
of
the
velocity
of
work
that
people
were
doing,
and
so
we
moved
to
service
oriented
architecture
but
maintaining
a
common
standard
for
how
we
want
these
applications
to
be
operationalized.
A: Oh, that's really cool. Thank you for giving such a full introduction to your clusters and the deployment process on the platform. Another question I have is more about future challenges. Do you feel like, at this point, there are any challenges that you're going to face in building your clusters, maintaining the clusters, or deploying your applications? Or maybe there are some new technologies that you'd like to adopt and that are on your radar at the moment?
B: Yeah, so some of our current challenges are around how we deploy services to multiple clusters, and rethinking some of our fault domains. Right now, all of our clusters run across multiple availability zones, and we deploy one single service environment to one cluster, and so what that means is that the cluster right now is still the fault domain for a particular service environment. We want to rethink that a little bit, because one of the problems we've seen with running clusters across multiple zones is actually balancing replicas evenly across the zones. With things like topology spread constraints that has become a little bit easier, but we still want to maintain really even capacity in case our underlying cloud provider loses some capacity in a given availability zone, and also to avoid traffic going between those zones. So we're thinking about how we can restructure these clusters. Could we maybe run one cluster in a single availability zone and then deploy a service environment to multiple clusters? In order to do that, we need to have a strong idea of our federation story.
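The topology spread constraints Stephen refers to are a stock Kubernetes feature; a pod spec fragment like the following asks the scheduler to keep a service's replicas balanced across zones (the app label is illustrative):

```yaml
# Stock Kubernetes pod spec fragment: keep replicas of this app within
# one pod of each other ("maxSkew: 1") across availability zones, and
# refuse to schedule a pod that would break the balance.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: listings-service   # illustrative label, not Airbnb's
```

As noted in the discussion, this balances pod counts but does not by itself guarantee even spare capacity per zone, which is part of why a cluster-per-zone design is attractive.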
B
So
how
are
we
going
to
abstract
away
the
underlying
set
of
clusters
from
users
and
allow
them
to
maybe
just
specify
some
of
the
constraints
like?
Maybe
they
would
like
to
run
on
one
set
of
hardware,
but
they
don't
have
to
worry
about
whether
it's
going
to
you
know
prod
a
or
prod,
b
or
part
c.
A
Now
again,
like
sounds
really
exciting
and
hopefully
going
to
share
some
of
these
thoughts
when
you
actually
implement
it
during
different
talks
and
sessions
at
kubecon
and
the
next
section
that
the
next
set
of
questions
that
I
have
are,
of
course
around
your
kubecon
and
cloud
nativecon
participation,
because
airbnb
in
the
past
have
given
many
talks
and
even
keynotes
during
kubecon.
A
So
the
keynote
that
I
was
talking
about
and
refers
to
to
untouch
we've
covered
this
quite
quite
heavily
in
the
in
the
first
section
of
the
stream.
However,
there
is
one
talk
that
you've
delivered,
stephen,
which
is
did
kubernetes,
make
my
p95
spores
now.
Could
you
share
airbnb's
journey
on
performance
gains
and
losses
and
its
mass
migration
to
kubernetes.
B
Sure
yeah,
so
you
know
that's
one
of
the
one
of
the
challenges
that
came
with
making
sure
that
all
of
our
existing
services
adopted
kubernetes
as
well
as
our
new
ones,
was
making
sure
that
developers
had
a
good
sense
of
whether
their
application
was
running
faster
or
slower
when
migrating.
B
And
then
we
also
encounter
lots
of
like
interesting
performance
regressions,
which
we
shared
in
that
talk,
and
so
some
of
the
gains
that
we
saw
in
terms
of
performance
were.
Some
of
them
were
around
like
efficiency
of
resource
usage.
So
you
know
we
had
more
uniform
provisioning
because
of
the
the
bin
packing
that
we
were
able
to
do.
B
We're
able
to
we're
able
to
enforce
a
certain
percentage
of
resources
are
actually
being
used
on
the
nodes
and
then
auto
scale
up
our
the
number
of
nodes
in
our
cluster,
when
we
hit
say
like
85
percent
utilization,
where,
where
I
define
utilization
as
the
all
the
requests
of
the
pods
for
like
their
cpu
and
memory,
compared
to
the
cpu
memory
offered
by
all
the
minions,
and
so
you
can
compare
this
to
like
previously
when
service
teams
were
in
charge
of
their
own,
their
own
hosts,
not
all
of
the
services
were
auto
scaled.
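The utilization metric Stephen defines can be sketched in a few lines; the numbers and the 85 percent threshold mirror the talk, while the code itself is purely illustrative:

```python
# Minimal sketch of the cluster utilization metric described: the sum of
# pod resource requests divided by the total capacity the nodes offer,
# used to decide when to add nodes.

def utilization(pod_requests, node_capacity):
    """Fraction of offered capacity that pods have requested."""
    return sum(pod_requests) / sum(node_capacity)


def should_scale_up(pod_requests, node_capacity, threshold=0.85):
    """Trigger node autoscaling once requested capacity crosses the threshold."""
    return utilization(pod_requests, node_capacity) >= threshold


# Three nodes offering 4 CPUs each; pods requesting 10.5 CPUs in total.
print(utilization([2.0, 4.5, 4.0], [4, 4, 4]))      # → 0.875
print(should_scale_up([2.0, 4.5, 4.0], [4, 4, 4]))  # → True
```

Note this measures requests against capacity, not actual usage, which is exactly why enforcing sane requests matters for bin packing.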
B: That led to some pretty inconsistent provisioning: some services were massively over-provisioned, and others, when traffic increased a small amount, would have to rapidly scale up. So that's one of the things that we got: central control over how we're composing our fleet and how we're bin packing.
B: So that's an easy win. Then for some of the losses: because we're running multi-tenant, we've got interference, so we had to do a lot of research with regard to CPU limits and latencies. For example, is it better to set a CPU limit equal to the request, to set a really high CPU limit, or to set no limit at all? Currently we're not recommending that users set CPU limits, but we still want to alert on utilization relative to their requests.
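In plain Kubernetes terms, that recommendation translates to container resources along these lines; this is a sketch, and the memory limit shown is a common companion choice rather than something stated in the talk:

```yaml
# Requests are set (for scheduling, bin packing, and utilization alerting)
# but no CPU limit is set, so the container can burst without CFS
# throttling hurting tail latency.
resources:
  requests:
    cpu: "2"
    memory: 4Gi
  limits:
    memory: 4Gi   # memory often stays capped; CPU deliberately is not
```

The trade-off is that without CPU limits, noisy neighbors are contained only by requests-based CPU shares, which is the multi-tenant interference research Stephen mentions.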
A
We
will
try
to
point
to
the
actual
talk
as
well,
which
was
given
at
kubecon
and
another
session
that
I
want
to
mention,
and
scaling
has
been
mentioned
quite
heavily
today
throughout
the
entire
discussion
and
one
of
the
talks
that
was
given
was
scaling
kubernetes
to
thousands
of
nodes
across
multiple
clusters
calmly.
Now
this
is,
I
think,
quite
an
important
characteristic
of
managing
clusters,
because
when
you
scale,
when
you
increase
the
amount
of
infrastructure
you
have
to
manage,
usually
calmly
is
not
something
you
you
would
introduce
in
that
situation.
A
So,
within
this
talk
pretty
much,
it
describes
how
airbnb
scaled
from
six
hundred
branches
clusters
to
five
six
hundred
nodes
to
five
thousand
nodes
and
tens
of
clusters.
Maybe
could
you
briefly
share
how
how
you
kind
of
completed
this
migration?
Maybe
some
of
the
challenges
that
were
faced
during
this
migration,
any
specific
approaches
that
you
could
recommend
to
the
listeners
pretty
much
everything
in
this
context.
B: Yeah, a lot of that talk was motivated by our journey from running one single production cluster, which was our initial attempt, to breaking that into dozens of clusters. We learned a lot about cluster scalability with that first approach: we had to understand etcd scalability, things like events, scheduler algorithm efficiency, some DNS issues, and lots more. We've shared some of those stories individually in other talks, like "Did Kubernetes Make My p95s Worse?" as well as a series of talks that we've got called "Ways to Blow Up Your Kubernetes Cluster". So pretty early on, when we were migrating our services over,
B
We
realized
that
we
would
need
multiple
production
clusters,
but
but
because
of
that
initial
experience,
we
had
good
guidelines
around
how
big
to
make
each
production
cluster.
So
currently
we
have,
we
run
each
cluster
capped
around
1000
nodes,
and
then
we
have
other
limits
on
things
like
pot
update
rates
and
like
endpoints
per
service,
and
we
follow
the
guidelines
from
the
sixth
scalability.
Pretty
closely
and
another
like,
besides
scalability
of
our
clusters,
we
also
touched
in
this
talk
on
provisioning,
automation
and
and
speed.
B
So
our
first
clusters
were
set
up
in
the
style
of
kind
of
like
kubernetes,
the
hard
way
so
very
hand
rolled
and
so
for
creating
many
clusters,
we're
looking
to
raise
the
level
of
abstraction.
So
we
looked
through
some
of
the
existing
projects
at
the
time
for
bootstrapping
clusters
like
cops
and
cubeadm.
B
So
ultimately,
we
decided
to
create
apis
that
were
inspired
by
these
projects,
and
then
we
wrote
scripts
underneath
that
generated
our
own
configuration,
so
that
allowed
us
to
integrate
with
our
existing
vm
management
infra,
and
so
some
of
the
key
ideas
from
that
talk,
one
that
you
wanna
have
an
api
to
describe
your
cluster
state,
which
is
still
evolving
in
the
community.
B
And
then
we
want
to
group
similar
clusters
into
a
cluster
type
and
a
cluster
type
just
means
like
common
configuration
that
doesn't
really
differ
across
multiple
clusters
and
so
in
the
future.
We're
hoping
to
draw
some
analogy
lines
between,
for
example,
how
replica
sets
specify
like
the
number
of
clusters
and
deployment
manage
rollouts
of
cluster
changes.
So
you
can
actually
imagine
changing
that
api
into
some
sort
of
resource
that
some
sort
of
custom
resource
that
allows
for
smooth
rollout
of
new
changes
across
multiple
clusters.
A
Sounds
like
operators
are
still
a
driving
force
when
it
comes
to
maybe
day
two
or
even
day,
free
kubernetes.
So
again,
if
you,
if
you're
gonna,
build
something
around
this,
actually
the
talks
that
already
have
been
delivered
in
this
topic,
they
have
great
content.
So
I
definitely
would
like
to
encourage
everyone
to
watch
those
and,
if
there's
maybe
further
work
on
this
definitely
be
great
to
hear
about
it.
Now
going
back
to
kubecon
cloud
native
con
north
america
this
year,
maybe
sunil
can
have
some
inputs
here
as
well.
C: Yeah, I can share one thing. We're at this point now where almost all stateless services run on Kubernetes, and the infrastructure team at Airbnb is getting really excited about running stateful things on Kubernetes, especially as we start thinking about how we manage our infrastructure at large scale, across different regions, different availability zones, and so on. So we're really interested in how we can onboard stateful services. This is things like online and offline databases and other distributed systems, things like Kafka. People have started doing this in the industry, but now it feels like other companies are starting to reach the stage where they're running these things in production. So one of the things we're really interested in is learning more about how people are running stateful services on Kubernetes, and how we do that in a reasonable and safe way at Airbnb scale.
B
I
can
also
mention
that,
from
the
service
mesh
side
of
things,
we're
interested
in
some
of
those
recent
efforts
on
the
part
of
kubernetes.
So
like
they're,
you
know
native
kubernetes
resources
that
are
and
works
to
enable
easier
integration
with
service
mesh
like
multi-cluster
services,
for
example.
B
A: It's really cool to hear about this. I think there are definitely new colocated days during KubeCon, and some of them are going to be focused on managing data, so maybe some of them will be quite insightful for managing stateful applications. Another cool topic that Stephen mentioned, of course, is how you can use a service mesh across multiple clusters and actually make sure that services in different clusters communicate with each other securely. And maybe taking a step away from Airbnb, taking the Airbnb hat off and putting the community hat on: what kind of predictions do you have in regard to emerging themes and technologies within the wider ecosystem? This can be completely unrelated to your current work at Airbnb.
B
Yeah,
so
one
of
the
things
I'm
personally
excited
about
is
learning
or
watching
identity
and
authorization
projects
grow
and
gain
more
adoption.
So,
for
example,
you
know
open
policy
agent,
recently
graduated
from
cncf
and
there's
more
more
adoption
that
it's
seeing
and
I'm
excited
to
see
how
that's
going
to
be
used,
not
just
as
a
mission
controller
or
for
like
service
service
authorization,
but
as
a
component
that
people
use
for
general
policy
enforcement
across
all
their
infrastructure
projects
and
on
the
identity
side
of
things.
B
I'm
looking
to
forward
to
seeing
how
spiffy
and
spire
can
see
widening
adoption
across
the
stack
not
just
for
services,
service
authentication,
but
also,
maybe
like
user
authentication
and
access
to
services
or
infrastructure.
If
you
could,
if
we
could
use
some
of
these
projects
and
extend
them
to
allow.
C: I'm a little out of touch with the community, but I know there is a set of startups that are looking at this idea of runbooks as code, which I find kind of interesting: the idea of using tooling to augment your on-call engineers' ability to investigate issues with clusters. That seems really powerful to me, because at a company like Airbnb we have a large engineering team, and it's not really sustainable for our team to be involved in every incident to do with a service
having issues, and so anything we can do to programmatically enrich the data that goes to users is really helpful. That's the direction I'm really interested in. There are lots of challenges there, because in order to do that well you have to integrate a lot of different providers and systems, and questions of permissions and where you host this data are really interesting and confusing, but I think that's something that has a lot of potential to make on-call a lot easier for people managing lots of clusters.
A
I'm
definitely
curious
to
see
these
areas
growing
within
within
the
ecosystem
as
well,
and
the
last
set
of
questions
I
have
is
in
regards
to
your
experience
as
an
end
user.
Now
airbnb
is
a
cncf
end
user
member
quite
recently.
Actually
they
joined
a
couple
of
weeks
ago,
and
I
know
it
hasn't
been
too
much.
But
still,
I
would
like
to
ask
about
your
experience
of
being
a
cncf
end
user
and
your
experience
with
communicating
or
reaching
out
to
the
community
adopting,
tooling
and
so
forth.
B
Yeah,
so
I
think
in
general,
the
community
has
been
really
accommodating
and
welcoming,
and
so
like
project
maintainers,
are
always
ready
to
discuss
our
requirements
and
any
issues
that
we
bring
up,
and
you
know
if
they
feel
like,
for
example,
like
we
raise
a
request
and
they
think
that
it's
better
to
be
fulfilled
outside
of
the
project
or
like
it's
not
likely
to
be
prioritized
in
the
near
future.
B
They
can
communicate
that
as
well,
and
then
we
can
work
together
to
either
find
you
know
some
extension
mechanism
or
know
that
we're
going
to
build
our
in-house
solution
for
the
time
being
as
well.
Sunil
do
you
have
any
thoughts
on
experiences
and
users.
C
Yeah,
it's
been
pretty
great
so
far.
I
think
I'm
really
excited
to
see
how
much
more
we
can
do
in
the
future.
I
mean
it's
been
nice,
having
kind
of
the
ability
to
communicate
directly
with
other
members
of
the
community
and
even
though
we
were
kind
of
open
to
talking
to
other
companies
before
it's
kind
of
like
a
explicit,
like
you
know,
badge
saying:
hey
we're
we're
open
to
sharing,
which
is
great
we've
already.
I
already
had
a
couple
of
linkedin
conversations
with
people
in
the
community.
C
Now,
a
member
of
this
community
we'd
love
to
talk
more
about
communities,
so
that's
been
great.
A
Awesome
and
another
thing
that
I
one
of
my
last
questions.
Actually,
I
know
that
airbnb
has
been
quite
active
in
outreach
to
the
community.
You've
been
provisioning,
a
lot
of
talks
around
how
you
set
up
your
infrastructure
scale,
deploying
your
application
and
so
forth,
and
one
of
my
questions
is:
how
do
you
think
in
this
organization
can
contribute
and
give
back
to
the
ecosystem?
B
So
one
of
the
great
things
is
that
everyone,
not
just
the
core
maintainers,
can
file
bugs
bug,
reports
and
and
patches
as
well
and
and
even
feature
development.
A
B
That's
some
of
the
things
that
we
we've
been
doing
regularly.
We
also
try
to
attend
working
groups
and
special
interest
groups
to
read
design
documents
as
they're
they're
in
progress
and
mention
our
use
cases
and
requirements
so
that
we
can
help
motivate
specific
solutions,
and
then
we
can
also
discuss
those
extension
points
that
I
mentioned,
which
might
allow
projects
to
be
decoupled
from
business,
specific
logic
or
policy,
and
so
this
allows
for
a
wider
adoption
of
these.
B
C: I think Stephen covers it pretty well. The big thing for us is really pushing code upstream as much as possible, because we do run into some interesting edge cases with the projects we use, just by the nature of our scale and setup, and I think it's helpful for us, and also for everyone else, to get more eyes on our code. Upstreaming as much of this stuff as possible reduces the maintenance overhead for us, but it also really gives back to the community, so that's something we're definitely trying to do more of in the next few years.
A
Awesome
well,
I'm
looking
forward
to
all
of
your
contributions,
be
it
in
encode,
be
it
in
talks
being
good
outreach
to
the
community.
I
think
these
are
all
great
ways
for
everyone
to
reach
out
and
I
think
airbnb
is
doing
a
great
job
of
doing
those
so
far.
Now
these
are
pretty
much
all
of
my
questions
for
both
of
you
today.
Thank
you
for
everyone
who
joined
and
listened
to
this
stream
from
the
cncf
end
user
lounge.
A
Just
as
a
reminder,
we
try
to
bring
these
latest
non
cloud
native
and
user
stories
on
every
fourth
thursday
of
the
month
at
9,
00
am
pt,
and
another
thing
I'd
like
to
mention
is
don't
forget
to
join
us
for
kubecon
and
cloud
nativecon,
virtual,
actually
hybrid
north
america,
which
is
going
to
be
in
october
12th
to
15th,
and
if
you'd
like
to
showcase
your
usage
of
cloud
native
tools
as
an
end
users,
you
can
join
the
end
user
community
and
you
can
find
more
details
on
the
cncf.io
forward.
Slash
end
user.