From YouTube: Kubernetes UG VMware 20210204
Description
February 4 meeting of the Kubernetes VMware User Group, which discussed running clustered stateful apps and services on Kubernetes hosted on vSphere, addressing interaction with backing storage. There was also discussion of the challenges of supporting services along with additional layers of abstraction related to Kubernetes and hypervisor layers.
A: Hi, welcome to the February 4th meeting of the Kubernetes VMware User Group. On today's agenda we've so far got one item, and this was something a member brought up at the last meeting, but we didn't have time to get into it. We're going to discuss hosting replicated stateful apps on Kubernetes, while Kubernetes is hosted on top of vSphere.
A: So with that said, let me share my screen.
A: Okay, I assume you can all see this, because I tested it a little bit a few minutes ago. If not, say something now. This is just a mini deck to kick off some more knowledgeable discussion by Miles. Of the two of us, I think Miles, who is attached to the storage group at VMware, is more authoritative than I am; I know enough to be dangerous, though. But we're talking about running replicated stateful apps on Kubernetes.
A: What are these, and how are they different? Well, we're going to talk about that, and then some factors as to why you might choose different settings in your configuration or architectural deployments of these. So, this category of replicated or clustered stateful apps: these are apps that are designed from their inception to run on multiple nodes spread across failure domains, and they implement their own replication.
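A minimal sketch of what "spread across failure domains" can look like on the Kubernetes side, using topologySpreadConstraints; the names, labels, and image here are hypothetical placeholders, and a real clustered app (Cassandra and friends) needs far more configuration than this:

```yaml
# Hypothetical sketch: ask the scheduler to spread the replicas of a
# clustered stateful app across zones (failure domains). Names and
# image are placeholders, not a production configuration.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: clustered-app
spec:
  serviceName: clustered-app
  replicas: 3
  selector:
    matchLabels:
      app: clustered-app
  template:
    metadata:
      labels:
        app: clustered-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone  # one replica per zone where possible
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: clustered-app
      containers:
        - name: app
          image: example.com/clustered-app:1.0  # placeholder image
```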
A: A counter-example is most of your traditional SQL databases. Some of those can indeed be sharded, but in their default or typical implementations these do not generally implement replication, and if you really want these things to be bulletproof with high availability, well, they typically came from an era where that responsibility was left with the backing storage layer.
A: So one thing I found useful: when it comes to running these sorts of apps on top of Kubernetes, there are a lot of people doing that in a lot of places. vSphere isn't the only one; Kubernetes, of course, was originated by Google to run on Google's public cloud, and it's very popular in public clouds. When I went out there to do a little research on best practices, I actually found a lot more material on hosting these things on AWS as compared to vSphere, but this really can be analogous to running it on vSphere in many respects. With AWS you get options for your storage with these clustered stateful apps.
A: Your storage disappears with it, or, you know, goes down, is unreachable, whatever. In many ways these are comparable to using VMware's vSphere hypervisor with an option of DAS in an ESX host (consuming a local disk with no attempt to do replication or clustering) versus using vSAN, or one of the other non-vSAN forms of externally attached storage with vSphere.
A: You have to ask yourself if you're paying that pain twice, and I think you are. You know, there are costs associated with redundancy, but there are also benefits, and it isn't necessarily crazy talk to do it in two places. From these articles, they pointed out that there are some use cases where it's not worth quibbling about this. One is if your deployments are really small; the cost of storage these days isn't that great, necessarily, in comparison to your operational costs.
A: You know, if you have to jump through a lot of hoops to manage recoveries and go place these nodes across failure domains, and we're talking about a tiny MongoDB or Cassandra instance, maybe it's just not worth the bother to worry about it. Let's just say it's 100 megabytes of storage or something like that, maybe even half a terabyte; I don't know that those costs are going to be huge.
A: There are also some scenarios, like throwing up these apps for your developers or for testing, where people don't actually cluster them, and these things are out there as a single node anyway; it's possible to do that. The production use case kind of anticipates that you're going to put out many nodes, but there are deployments for dev and test that are not multiple nodes, in which case go ahead and don't worry about using the redundant backing store.
A: They engage in cross-node traffic that can be very substantial. You know, I've been to sessions on some of these things; I think the session I went to was more on Ceph, which is actually a backing store, but it was pointing out that realistically, in an on-prem scenario, this storage layer and its replication could gobble up 40 gigabit per second of networking pretty easily, and that's a cost. You can either pre-allocate that level of storage bandwidth or suffer service-level brownouts when you do have failures, because this cross-node network traffic could really impact other application-layer production workloads, to the point where you can't meet service-level guarantees, unless you kind of, let's say, over-provision your storage.

A: And when these clustered apps get big enough (I mean, there are people who take these to hundreds of nodes), arguably you could have a situation where you're having a failure per week, just because the numbers are sufficiently large. Even if you have only a fractional failure percentage per node per week, get enough nodes and you could have failures half the time. So make sure you take this potential network usage into account, is my advice, because the whole reason you're doing this is that you're planning for failure, and you have to expect that failures will happen.
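To make the "sufficiently large numbers" point concrete, this is the standard arithmetic; the failure rate used is an illustrative assumption, not a figure from the talk:

```latex
% Probability of at least one node failure in a week, with n nodes
% failing independently with probability p per node per week:
\[
  P(\text{at least one failure}) = 1 - (1 - p)^{n}
\]
% Illustrative numbers (assumed): with p = 0.005 (half a percent per
% node per week), 1 - 0.995^{139} is roughly 0.50, so a cluster of
% about 140 nodes already sees a failure in half of all weeks.
```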
B: Sure. I mean, this is open for anyone else to pitch in their thoughts here, because I know there are some very experienced architects in this space as well who know what they're talking about. But I think you kind of hit the nail on the head there: there is an operational cost to achieving storage efficiency at the infrastructure layer.
B: So even if your app (Cassandra, Kafka, you name it) does do its own replication and data services, if we assume the data is replicated in a RAID-1 fashion at the infrastructure layer and in a similar sort of fashion at the application layer, there's going to be a two-times storage consumption overhead, right? And until you get to a scale which is probably in the petabytes, not the terabytes, it probably doesn't make sense to try to realize that storage efficiency at the infrastructure layer, because of the operational cost and operational complexity around co-locating data with the storage.
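As a worked example of that overhead (the replication factors here are typical defaults, assumed for illustration):

```latex
% Raw disk consumed per logical byte when replication is stacked
% (replication factors assumed for illustration):
\[
  \text{raw bytes per logical byte} = RF_{\text{app}} \times RF_{\text{infra}}
\]
% e.g. an application replication factor of 3 on mirrored (RAID-1)
% backing storage, RF_infra = 2, gives 3 x 2 = 6 bytes of raw disk per
% byte of data, versus 3 bytes on unreplicated backing storage: the
% mirrored layer is exactly the two-times overhead described above.
```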
B: So take Kubernetes as an example: you've got a Cassandra deployment out there, and you have three Cassandra nodes. For it to make sense, for it to work, for you to have that storage efficiency at the underlying infrastructure layer, we're assuming you have some kind of direct-attached disk; think of a local VMFS datastore on an ESXi host.
B: You would need to make sure that the compute, the pod, is scheduled on the same node as that storage for it to be able to access it. That is non-trivial. You would have to write some kind of controller for that to run inside Kubernetes, to make sure that whenever the pod is scheduled, it's scheduled along with the volume. You could do that with WaitForFirstConsumer or some other parameters that are in CSI.
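For reference, the knob being described lives on the StorageClass as volumeBindingMode; a minimal sketch (the class name is a placeholder, and local volumes still have to be provisioned per node):

```yaml
# Delay PersistentVolume binding until a pod that uses the claim is
# scheduled, so the scheduler can pick a node where the local disk
# actually lives. Class name is a placeholder.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-disk
provisioner: kubernetes.io/no-provisioner  # static, pre-created local PVs
volumeBindingMode: WaitForFirstConsumer
```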
B: But when it comes to failures: if you have a failure, that pod obviously can't come up anywhere else, because its PVC is going to be isolated to that one node. So you kind of have to ask yourself: is it worth trying to save money on disk, which is pretty cheap these days, when you're going to create an operational minefield?
B: Realistically, you know, unless you have some really robust integration between the infrastructure layer and the application layer, you're probably going to have a bad time if you try to do something like that, at least today. The Kubernetes community is getting more and more aware of storage affinity to compute nodes and stuff like that, so that will improve in future. It's just that, as of today, every time I've looked at this and every time I've tried to run it, it's been painful.
A: Another comment that just occurred to me, in terms of Kubernetes community effort for this: they did initiate the StatefulSet as an attempt to address deployments of these cluster-aware stateful apps. I went back, just as research for today's meeting, and looked at that. I was involved with the planning for StatefulSets, and we had admirable goals.
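For context, the core of what a StatefulSet provides is stable per-replica identity plus one PVC per replica stamped from a volumeClaimTemplate; a minimal sketch with placeholder names:

```yaml
# Minimal StatefulSet sketch: replicas get stable names (db-0, db-1,
# db-2), and each gets its own PVC from the template below. Names,
# image, and storage class are placeholders.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: example.com/db:1.0  # placeholder
          volumeMounts:
            - name: data
              mountPath: /var/lib/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-disk  # ties back to the class sketched earlier
        resources:
          requests:
            storage: 100Gi
```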
A: But one thing that disturbed me: the docs on those things don't appear to have been updated for years at this point, and it would make me a little nervous as to whether that initial momentum behind it has been maintained. I'm not saying it hasn't; it could be that the stuff is great. I haven't personally used it, but my research today just led me to notice that a lot of the documentation is looking pretty stale, in Kubernetes years.
A: Let's just say, you know, this project moves so fast, particularly on the CSI front, that I'm a little nervous if something hasn't had a lot of attention for over a year. The other thing regarding cost is that, if your use case for these stateful apps is predominantly read-only, then if you look at the underlying way things work, throwing in extra disks (I don't think they're rotating spindles anymore, but even if they're SSDs) isn't necessarily a hundred-percent doubling of the cost.
A: If the application works in such a way that you get additional read IOPS, and you're predominantly read-heavy versus write-heavy in your usage of this thing, it could be that you really could benefit greatly from those extra IOPS. If, under the covers, you've got more SSD drives there, they should be providing double the IOPS, and if you needed them anyway, well, suddenly there's no extra cost to what you've got going on there. So maybe it was just a freebie, and even ignoring the operational benefits, you might still come out ahead.
B: And I mean, I think you're right in that StatefulSet had admirable goals, and StatefulSet works perfectly as long as you deploy on bare metal and you don't move your VMs around, because there would be no VMs. As soon as you add an abstraction layer at the infrastructure level which makes VMs portable, you immediately run into the storage gravity problem, which is: the storage is over here, and the compute can move anywhere. That, fundamentally, is the problem.
B: It needs to have legs into the underlying infrastructure to understand, you know, the topology and stuff like that. And I know we're not meant to talk about product and stuff here, but there's a reason we brought out the Data Persistence Platform, right? It solves that challenge, because we realized you need the infrastructure to talk to K8s. That's not saying you can't do it in an open-source-friendly way; I'm almost certain you can.
B: But you know, these are sort of hard problems, and you really need to ask yourself: is it worth the amount of pain that I'm going to have to put the operations team through to make a saving at that kind of layer? So...
A: So yeah, I'm a little skeptical myself on, certainly, hand-rolled code to go accommodate these kinds of migrations and failure-domain impacts. It might actually be pretty costly to take on, both operationally and from a development-and-test perspective, if you're going to write custom things to do it yourself.
C: So when it comes to designing architectures like this, kind of the rule of thumb I'm starting to use is: have those layers do what they're best at, and try to keep them separate. I remember with Cloud Foundry (and in this case I'm talking about the Pivotal implementation of that) there was this interesting discrepancy where the BOSH project was able to do something like VM resurrection, where I could simply say: if I lose a node here, I'll simply recreate the node after a certain timeout. But you could either see that as complementary to, or in opposition to, vSphere HA. So you could either say: well, if I make sure the timings work, I can have them both do their thing.
C: I can rely on whichever of the two technologies is faster to save me first, and then, if that doesn't work, I can almost have the other one kick in; but then you've got to tweak timings. Or you make a choice and say I'll use one over the other, and in those cases we would usually say, well, we'll turn the Resurrector off, because HA is going to be able to spin up that VM faster, especially if it's on shared storage, than BOSH is able to recreate the VM. With Kubernetes we kind of have the same challenge; the timings are tighter, though. It was interesting: I saw a tweet today that the Tanzu RabbitMQ broker was released, or, not sure "broker" is the right word, but the managed RabbitMQ.
C: Yeah, yeah, and that's of course analogous to what we know from PCF as the RabbitMQ tile; it does the same thing, more or less. It gives you the ability to manage the lifecycle of an HA RabbitMQ cluster. With RabbitMQ specifically, there's very little value in all of the underlying IaaS high-availability options around that, because of the context of what that specific thing is: in this case it's a backend service, right? It's not actually a customer workload, but a backing service.
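To give a sense of what "managing the lifecycle of an HA RabbitMQ cluster" looks like through an operator, here is a minimal custom-resource sketch in the style of the open source RabbitMQ Cluster Operator; exact fields may differ by version, and the name and sizing are placeholders:

```yaml
# Declarative three-node RabbitMQ cluster; the operator handles pod
# identity, clustering, and lifecycle. Name and sizes are placeholders.
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: my-rabbit
spec:
  replicas: 3
  persistence:
    storage: 20Gi
```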
C: The whole context of what that thing is trying to maintain, from an HA point of view, is known by that thing itself; in this case we're talking about message queuing. At the IaaS layer, you know, you can even do Kubernetes-native versions of spinning that pod back up, but it's not going to save your messages.
C: So my way of thinking is that this goes first: if the application is in itself able to save whatever the core object is that needs to be highly available, that's your first choice. And with Kubernetes, then you go to the Kubernetes layer, and if that can't save you, then you go to the IaaS layer. That's how I've started to think about these architectures.
A: Yeah. The other issue, and I like your point about the timing on this, is that maybe it should be set quite differently at each layer, because your worst case would be, as I can see from your description: you've got potentially three abstraction layers trying to do a recovery, the app itself, Kubernetes, and then the underlying infrastructure, whether it be vSphere or AWS or whatever, and they might all have a similar timeout for concluding "there's some problem that I need to engage in heroic efforts on."
A: Let's just say, for the sake of discussion, it's two seconds, and every one of these kicks in at that two-second mark at the same time. Oh, I'd hate to think about the potential, even non-deterministic, behaviors. And they could influence each other, causing a different layer to think: gee, I tried doing this and I'm seeing this bizarre thing, so I need to do even more, or even less. And you'd get these horrible oscillations.
D: That's what happened also in Cluster API for vSphere; that came up multiple times with the machine health checks and how it does its remediation of nodes with vSphere and things like that. I mean, it comes up even at the Kubernetes level, without going up to the application level, and when you go to things like Cluster API, or BOSH with PKS, for sure.
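For reference, this is where those remediation timeouts live in Cluster API; a minimal MachineHealthCheck sketch, where the cluster name and labels are placeholders, and the timeout values are exactly the kind of knob that has to be tuned against the other layers' timeouts:

```yaml
# Remediate worker machines whose node stays NotReady/Unknown for 5
# minutes, or never comes up within 10. Names/labels are placeholders.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: workers-unhealthy-5m
spec:
  clusterName: my-cluster        # hypothetical cluster name
  nodeStartupTimeout: 10m
  selector:
    matchLabels:
      node-pool: workers         # hypothetical label
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 300s
    - type: Ready
      status: Unknown
      timeout: 300s
```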
C: I guess it gets even more complicated when you get above Kubernetes and into the infrastructure layer, where you're doing data center failover. The customers I've dealt with really, really like the idea of being able to fail over between data centers. Now, those of us with VMware experience, you know, stretched clusters and all that kind of stuff, we've done that, and NSX now supports it as well.
C: So it's coming back again in a big way, but there are multiple layers of timing there, and the question of which component will interact with which component to fail over. The funny thing (I mean, this is kind of a side step) with data center failovers is that to do it properly, you need the entire networking stack, the physical networking stack, to be involved in that, and there you notice...
C: Yeah, and then you get really funny effects between any systems (and this could be at any level, right) that are active-active, so things that will replicate state with each other. Think of a load balancer that's replicating state, or think about a firewall that's replicating session state, or think about a storage layer that's replicating, what's the word, in-flight I/O, right? So think of something like a VPLEX. All these layers have to interact, and if you lose data at any of these stages, you have a problem. So timing becomes crazy, but there are very few people who can see the whole stack, right? So these things become exponentially complex, and we're kind of creating that in microcosm with these abstraction layers within Kubernetes, and we always have to keep that in mind.
B: So I would love to hear everyone's opinions on putting Kubernetes on stretched clusters, because to me it's a terrible idea and you should never do it; you should have a discrete Kubernetes cluster per site, and not try to fail it over like a traditional system. But VMware customers in particular are used to stretched clusters: put it on a stretched cluster and life will be good. And you have to have that awkward "that's not really how things work around here anymore" type of conversation.
A: All I can say is, I've not done data gathering, Miles, but I can tell you that there is an awful lot that can go wrong there, so you're taking on an awfully big problem if you go that direction.
F: The problem here, and I shouldn't say problem, but I think the natural reason that this is being expressed is because we're talking about VMware customers, and we're talking about a long tail of historical behavior around a given abstraction layer, and the fact that moving to a higher abstraction layer, for many, doesn't compute. Pun intended there, obviously. And the whole stretching thing doesn't even touch on networking, where the soup du jour is: let's stretch L2 as well, everywhere and anywhere, even out to the public cloud.
F: If we can. And this is also compounded (Miles, with your experience, I'm sure you see this continually) by the idea, coming from vendors to consultants to others, of saying: well, don't re-architect your application to do it the Kubernetes way, let's just drop it in a container. Oh, and by the way, we failed to mention that if you buy us, then we'll allow you to do that.
F: Then you'll get all of those cloud-native benefits. And this just compounds the problem, because it doesn't acknowledge this different level of abstraction and the different ways of doing things. But don't worry, you don't have to do any of that; you can just put it in a container. You can continue to stretch them across data centers, across countries, across continents, even across public clouds, and your failover will work just fine and dandy.
B: You're exactly right, Chip. I think vendors are as much to blame as anyone else when it comes to this, because it's the whole CIO talk: we don't need you to re-architect your application for this new platform, we'll just make it work. When in reality, yeah, maybe they can just make it work, but all it does is kick the can down the road. It doesn't solve the problem; it makes the problem the future's problem and not the now's problem. But yeah, I think it's kind of short-sighted, and I say so any time I do a talk with a customer.
F: Yeah, and I think, you know, as a community we're largely on the same page. But as a company leader, that's a very difficult pill to swallow when someone comes in and says: okay, it's going to cost you 80 million dollars to completely rewrite all of this in a cloud-native way. And yes, many times that's true, because of, you know, a legacy of 25 years' worth of operations, where you're talking to a petabyte-large Oracle database.
F: Okay, there are limited things that we can do there, but there are some benefits that can be achieved by adopting some of these better practices. The thing that far too many people (vendors, consultants, etc.) are afraid to speak up and say is: you may not get all of those benefits. You might get some of them, but don't expect to realize all of them. Cloud native is not a silver bullet, and failover between data centers is not going to help you with your silver-bullet manufacturing process.
F: So there's blame on both sides, but it's a challenging thing. I think the reason I bring this up in the VMware group is because VMware has so many customers that there's still a very, very large ramp-up, where a lot of those customers that have been doing nothing but vSphere for a decade are now switching and doing Kubernetes. Now they're having to get ramped up on this learning curve and figure out how to understand these concepts: well, you know, is that like DRS?
F: Is that like vMotion? Is it like all this and that? And no, not necessarily. So it's about trying to translate this information in a way that makes sense, but also convey: these are the things that you should really, really do, and these are the things that you really shouldn't think of doing anymore.
F: Yes, you used to do that, but now there's a different way, and it's just going to take time. People with the expertise, like yourself and Steve and others from the former Pivotal organization, help to try to drive some of that education. But it is an education problem, and we're having an effect overall; there's just a flood of users coming in that need education on how to do this stuff properly.
F: And it's not easy, and it's going to take time. And anyone who's promising to come in and drop your app that has run on IIS since '97 into a container, so that all of a sudden you're going to be digitally transformed: it's snake oil, and you need to be cautious of that.
A: You know, I agree with you, and I'm really still thinking that when Robert kicked off this discussion, pointing out, you know, the dueling abstraction layers, I thought that was great. I did invest some time in doing a search out there for prior articles, and this isn't heavily covered in a way that users are likely to discover. So maybe, as a group, we could do the world a service by writing something up.
A: So, Robert, have you come to any conclusion yet, in terms of timing for recovery? If you face the fact that the application has potential instrumentation and recovery efforts, going down to Kubernetes, then going down to infrastructure: which one do you think should have the fast reaction time versus the slow? Or have you still found nothing but kind of bad counter-examples, and not discovered the magic solution yet?
C: Well, like I said earlier, I think it's better to lay the emphasis with the application as much as possible. Treat Kubernetes, or any PaaS platform, as a landing place, right? An airfield, but separate airfields. You know, don't try to solve everything at the bottom layers; leave as much of that responsibility with the app as possible. But yeah...
C: You know, like Chip pointed out, the learning curve for some of these application teams: they're not ready for that. They're not ready for accepting the responsibility of things like state failover. They don't know how to do it, they don't know how to build their application architectures like that, and especially if you have an off-the-shelf application, you're not in control of how the application handles the failure of a container or the failure of a node.

C: The responsibility then devolves automatically to the layer beneath it. I think the challenge for us, especially those of us who have one foot in both worlds, is to keep reminding both sides of that world where these interactions are, and to keep trying to think about application architecture. I like your idea of maybe sitting down and figuring out what a modern reference architecture would look like, considering all these layers. I'm sure other people have tried this already, but, I mean, you can have reference architectures, and that doesn't mean that everyone's read them, yeah.
A: Well, here's something interesting, too. I found that when I did this research out on Amazon, there were plenty of vendors who wrote these apps writing articles on, you know, how to do it on Amazon. But the Amazon approach itself seemed to be: hey, we're going to package this up as a service so that you don't need to worry about how we did this; just trust us.
A: We had a bunch of smart guys put this together, we're going to meter it in a different way, and you can get whatever (Cassandra or MongoDB or whatever) from us as a service. Now, when you get to on-prem, you sit there wondering: is that the right approach? Rather than educate thousands of potential users to go independently figure out how to deploy the reference architecture themselves, is there a way that they've...
A: These are all common enough that, ultimately, the right way to do this is to come up with an as-a-service equivalent for on-prem, where some vendor or service organization has figured out all the hard parts and comes up with a packaged solution, so that a user organization really doesn't have to engage in a whole lot of education themselves. They just buy it from a vendor, like you would in a public cloud.
D: And that's kind of what's happening with the operator world within Kubernetes. I mean, when you look at how the operators are working, whether it's the Tanzu data services or the public open source ones (you know, Postgres and hundreds of others using Operator Lifecycle Manager), that's the whole idea behind it, which is very strong. It's bringing that same SaaS offering, whether open source or enterprise-supported, with these reference architectures just built into it, so you don't need to worry.
D: The difficulty is with the in-house solutions, when it's your own application and how you do it for your application. Because for any off-the-shelf one, there have always been solutions for provisioning databases on premises; maybe not as great as something like RDS or things like that, but they've existed, and they exist now as operators.
B: I think there's another challenge. Well, not really a challenge, but I mean, the entire point of Kubernetes is to abstract the public cloud and the infrastructure, so that you write your app once and you run it anywhere. As soon as you start consuming services provided by the platform, you are tied to said platform; it's no longer a fully generic application.
D: I think the worrying thing is the operators that are coming out of the different public clouds today. For example, the S3 operator: within Kubernetes, you're actually just using public cloud resources; it's just using the AWS SDK inside to create an S3 bucket for you. Or the entire AWS marketplace these days; Azure did the same thing, and Google is doing it also. I mean, unfortunately, the operator framework, with how open it is, allows us to do that completely today.
F: Well, that's their goal, after all: to keep you a customer, not to incentivize you to become an un-customer. So it behooves them to try to deliver value in ways which make it easier for you to consume their services, and not more difficult. But you touched on a point with the operator framework, and this is something that I see as a boon and also a real big Achilles heel, in that people seem to be operator-crazy. You know, everything needs an operator; it doesn't matter what the application is.
F: When should we architect and design and create a proper deployment strategy, versus when should we invest all of this into something else that's going to solve those problems for us? It creates brittleness; it creates difficulty later in managing that application and being able to do things that might need to be done with it. So do they have a use case? Absolutely.
F: But I think a bad use case for an operator is something that attempts to absolve you of needing to design a proper deployment pipeline for an application. In other words, as an example, let's say that you had an Apache web application of some sort, not necessarily a stateless application or a stateful application. If you needed to deploy that, then rather than creating a series of good hygienic practices around how you're going to develop your code, how you're going to semantically version your images, and how you're going to store them, you instead put a lot of that knowledge into an operator, run it inside the platform itself, and then try to figure out how to string it along to get your desired result. Better to maintain discipline during your development and deployment phases, so that you get the same end result, but without being tied to something that's as brittle, or that has as much technical debt, as writing your own controller and CRD and maintaining that yourself.
B: So do you mean, say, an actual service? Like, you know, to pick a crap example: WordPress. Building an operator for WordPress is the wrong way to do things, because it doesn't make you think about each component?
F: It depends how you use WordPress. If your business is to establish WordPress as a service, that may be an okay thing for you to do, because you're constantly having to deploy this thing and figure out how to lifecycle it. But if you're consuming its services in a pedestrian fashion, you probably don't need to invest in writing an operator for that.
F: I don't think there's a hard and fast line, but my point is that just because you're consuming an application, or consuming services from an application, doesn't mean that you need to write something to control its lifecycle at every point in time, in the manner that, you know, other Kubernetes-native resources are being controlled by things like, for example, Cluster API or CSI or whatnot, because those have problems of scale that you don't have.
D: I think the other issue that we're seeing in the operator world is, if you just go to OperatorHub today: I mean, there are the five stages of maturity of an operator, and 90 percent of the operators out there are just using the Helm operator SDK, meaning they only get to the first two. So even the operator framework, and how it's being implemented today, I think, is just wrong.
D: An operator just adds more technical debt for them, because having to deploy that, then upgrade it, then change your operator in order to update your applications afterwards, adds a huge level of complexity that you're just pushing off, and it doesn't give you the flexibility that you really want today.
D: I don't believe in open source that has two stars on GitHub and that no one actually uses, just calling it open source because technically it is. But I think things like the Carvel tool set out there, which I think is very promising in terms of how to do things like that, using things like ytt and kapp, are an alternative to necessarily going down the path of needing an operator for everything. I think operators make sense for off-the-shelf applications, usually for data services, for things that are not your expertise; your application is your expertise.
D: You should know how to write it into a Deployment and do everything that you need in a correct pipeline for deploying it. Postgres is not your business; Postgres is something that you are consuming, and the people that build Postgres usually know better than you how to configure HA. For those cases, I think an operator makes sense: for the off-the-shelf data services, for off-the-shelf products. For your own application, I think the best way is to use different tooling. Helm has kind of taken the marketplace, even though I hate Go templating, so I'm a much bigger fan of ytt, but there are other options out there, jsonnet for one, and there are different ways of doing these things. I think we need to readdress that subject in the community, because for an operator for your own application, unless you're deploying it a hundred times a day, I don't see the big benefit over just creating a Service and a Deployment in Kubernetes, yeah.
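To illustrate the ytt style being contrasted with Go templating, a minimal sketch, with hypothetical file names and values: the template stays plain, schema-valid YAML, with the templating carried in comment-style annotations rather than text substitution.

```yaml
#! config.yml -- the template: plain YAML plus #@ annotations
#@ load("@ytt:data", "data")
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: #@ data.values.name
spec:
  replicas: #@ data.values.replicas
  selector:
    matchLabels:
      app: #@ data.values.name
  template:
    metadata:
      labels:
        app: #@ data.values.name
    spec:
      containers:
        - name: #@ data.values.name
          image: #@ data.values.image

#! values.yml -- defaults, kept in a separate file in practice
#@data/values
---
name: my-app                          #! placeholder
replicas: 3
image: example.com/my-app:1.0         #! placeholder
```

Rendered and deployed with something like `ytt -f config.yml -f values.yml | kapp deploy -a my-app -f -`, which is the Carvel pairing being referred to; no controller has to live in the cluster for your own app.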
F: Maybe you do if you're running lots of it, and you're running it at high scale, and you have a lot of dependencies on it. But as microservices typically have their own dedicated data stores, you're probably not talking about an 80-terabyte PostgreSQL database that's being used by 400 different microservices; I mean, hopefully not. So, you know, what is your sphere of concern? If you're wanting to control one or two, or maybe one or two clusters' worth, of Postgres, maybe you don't use an operator; maybe you just figure out how to operate Postgres properly without delegating that to a piece of machinery. But if you're doing it at significant scale, you know, if you're a Lyft or an Airbnb, and you are running that, and you do have terabytes of databases and these complex demands, then maybe you do need to look at an operator, and that's the time for you to go into some technical debt for some of the benefits that it brings.
C: Can I ask you guys, as a slight step back: we're talking here about ancillary services that we're exposing through operators, not so much the customer workloads themselves?
F: Well, from my experience, what I'm seeing and what I'm advising is a minimum level of competency with these things. You know, if you're going to be in the business of going for a ride in a car, perhaps you should understand its most basic features. Does that mean that you need to be able to pull a camshaft and readjust valves? No, it doesn't, but you should understand the function of a wheel and a gas pedal and a brake.
F: But if you're going to be consuming that car in a fashion where you know you're going to have extraordinary demands, then that's going to level up your need to know more about it. If all you're interested in is consuming a ride, then that's where those abstractions come in. That's what systems like Heroku and Cloud Foundry, and its commercial variant, got very, very right, because it allows you to remove a lot of that burden.
F: But, you know, Kubernetes is sort of seen as a departure from that, because a lot of the reasoning was: well, this is too expensive, it's got all this stuff, we want to do it ourselves. Okay, that's fine, but you have to take the good with the bad. If you're going to roll it yourself, and you're not going to use a PaaS, and you're not going to have access to those abstraction layers, then you're taking more of the onus on yourself.
F: So there are ways that you can mitigate some of that; on the operator subject, you can use operators, and some of those concerns are ameliorated. But if you're going to take the bull by the horns, sometimes you're going to get stuck, and if that's not something that you're willing to do, then perhaps you should be using a PaaS instead, where they provide those guardrails, and they provide those fuzzy pool noodles that protect you from the sharp edges.
B: I can see the reasoning why people don't want a PaaS; again, it's the whole vendor lock-in thing, it's the lack of portability across PaaSes, that kind of stuff. So maybe what's missing, like, Robert, to your concern of a customer that doesn't want to be an expert in each one of these things, is a Cloud Foundry type of service, but built on Kubernetes. So you still get the portability of the Kubernetes API underneath everything, but you can still opt to have vendor support if you need it for particular services.
F: MySQL and whatnot, those were examples where you might want to delegate that to a vendor. And, you know, very few people are in the business of running Prometheus. Maybe you are, and you sell that as a service, but that may be a prime opportunity where: look, I don't want to figure out how to scale it, I don't want to figure out how to do that, I just want to consume it.
F: So let me go and pay somebody else to do that. And you might consume that as a service, in the format of, you know, Tanzu Observability by Wavefront or something else, or you may just say: I want to run it, but I want to have somewhat of an easy button, and a number to call if things go awry.
C: I've seen our team falling into the trap of building support around these services themselves, because from the consumer end it's as simple as an operator in the case of Kubernetes, or a tile in the case of PCF (trying to be vendor-neutral here). But it's a discussion at a higher level. It's indeed: what are you willing to pay for? Which responsibilities are you willing to take on board? How much expertise do you want in your team? I mean, I'm speaking from very practical experience. As good as the vendor support was for our managed MySQL and managed RabbitMQ, the MySQL one stopped, and with the RabbitMQ one, it turns out that to actually help your own customers, you need to be able to respond faster than waiting on a four-hour support ticket to fix certain problems with queues, or things getting stuck when they won't restart.
C: I put a question on Twitter today towards the people bringing out the new operator for RabbitMQ for Tanzu, and I said: look, it had better be better quality than what you had for Pivotal, because that is exactly the stuff we ran into, where lifecycling highly available RabbitMQ clusters kept failing. You're so dependent on, in this case, the commercial vendor, but it might as well be an open source project, building the quality into the operator.
C: If the quality is not there, if it sticks at two stars, you know, then I can't rely on that thing to fix error cases quickly.
D: Point taken, and I completely agree. And by the way, the RabbitMQ operator is much better; it's much more configurable. I did some of the beta testing on some of that, and it's a very fun project. I mean, it's been in public beta for like a year already, and it's really awesome, much more configurable. But I definitely think that with the operators, the way to go is with the open source.
D: It's also a great way to get in, because you do have all of this tooling out there, and then when you need the enterprise support, the truth of the matter is that it's a very easy move over. I mean, there are currently three open source Postgres operators, plus one more that is VMware's, and the difference in the spec of that CRD is close to zero between any of them. So moving between different operators is okay: this one allows you to configure one more parameter, and that's the difference; there's no real difference. It has to do with whether you're going to get the support from VMware or not, or whatever. So I think it's a great way to get in and up and running, especially in dev environments and things like that. It really helps testing, and you don't need the expertise that you do without an operator, because you do not need the expertise for how to set up an HA cluster.
C: So it kind of depends which skill set is deeper, right? The one to set up an HA cluster, or the one to troubleshoot a failed queue failover, right? I'm not too sure about where that balance falls.
D: There's no question, and listen, RabbitMQ: I just did a huge project with RabbitMQ, and it gets very complex in Rabbit, especially with the new types of queues that they've added, things like quorum queues; they get very fun. But overall, in the end, setting up the HA cluster may be the lower burden, if you don't need to carry it; but any burden you can take off the team, any burden you can take off at any level, already makes it a quicker ramp into the product. Because if they had to take the open source RabbitMQ binaries, build Docker images out of them, and then go figure out how to do HA, you've added a whole new level of complexity for them. So it's not that you took the hard work away, but you took away some of the grunt work, especially the upgrades, because upgrades of RabbitMQ clusters are not fun.
A: Okay, a time check here: a lot of people have already dropped off, and I think we're a little past the limit already. I found this a fascinating conversation. Miles and I thought we might run short before this started, but this has been one of our better meetings, I think. I don't even remember who put this on the agenda, but thanks, this was a great topic. And speaking of agenda...
D: All I will say is that I agree with Robert, that he agrees with you, about the idea of some blog post or some talk or something, you know, on this stuff. Because I think it really is interesting, the different levels of abstraction, and it's something that isn't tackled at all, that I think every one of us deals with on a fairly regular basis, and it just hasn't been tackled by anyone.