From YouTube: Kubernetes Machine Learning WG 20180329
B
I had a question. The idea, initially, was to start off by having different groups show or demonstrate what their current machine learning pain points are when running on Kubernetes. Is that still of interest to people? Or it seems like nobody's willing to step up at this point.
C
At least... but yeah, we're actually finding that it tends to be the students who are bringing this stuff, Kubernetes, containerization, and things of that nature, to the research themselves. Yeah.
C
The other thing is that, for us, Kubernetes is still fairly small. Mostly we have lots of experience with our students in, like, JupyterHub and things of that nature, and getting up and going on there. But our shop is a very classic big HPC group, so some of the container experience they're getting is using Singularity and all that to run things through Slurm on our HPC.
C
But again, the students and people like that would much prefer to, you know, use Kubernetes, use containers, just use JupyterHub and get going on there, rather than try to push something through a batch processing system, especially when it comes to things like parameter optimization and all the other fun things.
C
Just because of the massive growth of things like JupyterHub, and more things that are not your classic batch scheduler. So for us as research support staff, if we can, you know, have Kubernetes as the unified platform that things sit on, and then, if we want to spin up Slurm or something like that, that can just live in a namespace or something like that in there. It allows us to make better usage of our resources as things sort of shift in these different directions.
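
As a rough illustration of that "scheduler stack in a namespace" idea, here is a minimal sketch using the official Kubernetes Python client; the namespace name and quota values are illustrative, not anything the group agreed on:

```python
# Sketch: carve out a namespace with a resource quota so a separate
# scheduler stack (e.g. a Slurm deployment) shares the cluster without
# starving other tenants. Assumes a configured kubeconfig.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

ns = "slurm-poc"  # illustrative name
v1.create_namespace(client.V1Namespace(metadata=client.V1ObjectMeta(name=ns)))
v1.create_namespaced_resource_quota(
    ns,
    client.V1ResourceQuota(
        metadata=client.V1ObjectMeta(name="slurm-quota"),
        spec=client.V1ResourceQuotaSpec(
            hard={"requests.cpu": "64", "requests.memory": "256Gi"}
        ),
    ),
)
```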
C
We have basically a proof-of-concept thing where a user SSHes in and it spawns a container as their UID and GID, mounts their home directory, and does all that. It is a little bit of a challenge because, you know, as a campus we have all these central resources with, you know, big shared storage and all the other fun things, a big scratch disk.
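
A minimal sketch of the proof of concept described above, assuming Docker is available on the login node; the image name and helper are illustrative:

```python
# Sketch of the SSH-spawn pattern: run a container as the calling user's
# UID/GID with their home directory mounted. Assumes this runs inside the
# user's interactive SSH session (so a TTY is available for -it).
import os
import subprocess

def spawn_user_container(image: str = "jupyter/base-notebook") -> None:
    uid, gid = os.getuid(), os.getgid()
    home = os.path.expanduser("~")
    subprocess.run(
        [
            "docker", "run", "--rm", "-it",
            "--user", f"{uid}:{gid}",   # run as the SSH user's UID/GID
            "-v", f"{home}:{home}",     # mount their home directory
            "-w", home,                 # start in their home directory
            image, "/bin/bash",
        ],
        check=True,
    )

if __name__ == "__main__":
    spawn_user_container()
```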
A
Okay. So Mission and I were having some discussion about potentially trying to improve, not for everything, but some of the documentation surrounding the Google ML products and some of the examples there. By products I don't mean our actual products; I mean the open source stuff that we do, like TensorFlow, for instance, and Kubeflow, and just trying to come up with some...
A
You know, onboarding and best practices that generally pertain to running machine learning workloads on top of Kubernetes, and maybe trying to aggregate some of the general limitations that were discussed last time. Like, you know, depending on what you actually do and what your storage and networking solution looks like, we really don't have a lot of good answers around high-performance networking and storage solutions directly integrated with Kubernetes, but at least we could capture the state of the art in terms of what you could do.
A
What is available: if you did want to run some type of batch machine learning workload across an HDFS partition, that is feasible. It's just probably not such a great idea to try to run your HDFS cluster inside of Kubernetes, but you could do things like co-locating the Kubernetes cluster with the HDFS cluster on the same physical machines, but having them logically partitioned.
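
For illustration, a sketch of what that co-located setup looks like from the job's side: the training code runs in a Kubernetes pod but reads from the adjacent HDFS cluster via its own namenode. This assumes pyarrow built against libhdfs with Hadoop client configuration available; the host, port, and path are illustrative:

```python
# Sketch: read training data from an external, co-located HDFS cluster
# from inside a Kubernetes pod, rather than running HDFS in Kubernetes.
import pyarrow as pa

fs = pa.hdfs.connect(host="hdfs-namenode.example.internal", port=8020)
with fs.open("/datasets/train/part-00000") as f:
    data = f.read()
print(f"read {len(data)} bytes from HDFS")
```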
D
Hey, it's Jeremy from Kubeflow. So I think within Kubeflow we're trying to solve a lot of those problems. I mean, I think a lot of the issues you just mentioned are coming up, and so we have these open issues to sort of provide recommendations about how you deploy Kubernetes in various settings, various cloud providers, various Kubernetes distributions, to sort of get the things you need. And some of the issues that have come up so far...
D
...are secure ingress. We've also got an issue about storage, and, you know, how new things come in, like object stores, shared file systems, POSIX-compliant file systems. We haven't specifically looked at, you know, other high-performance networking yet. I think one of the big ones for us next is things like monitoring, you know, especially integration with things like Istio, and also monitoring for things like RPC metrics and latency with respect to your model servers. So any guidance or help with that would be amazing. Yeah. So you're talking about...
A
If we want to do that... I talked to Bishop about this. If we're gonna do that: inside the SIG... I'm sorry, inside the working group and community, we have a machine learning repo, basically. So anything that we want to collect in terms of documentation, that would probably be the best place to put it, just because of the kind of rules around what's a SIG and what's a working group. So if we want to start owning code, we've gotta, like, form a SIG.
B
Because I think that's the lowest hanging fruit in terms of just getting started and addressing a lot of the points that, you know, Bob pointed out just now, and things we've seen ourselves. And then all of the harder problems that we definitely want to get to as well, like Jeremy was talking about, you know, those can happen... this kind of points to those, right. This is kind of the landing zone or entry point to those.
A
So I just linked the location of the ML working group, for anybody who's unaware. We can put whatever documents we want in this subdirectory in the community repo to capture all of this, and according to the steering committee, that's the appropriate place. So we can do that, or, I mean, if people prefer to use Google Docs, that's also another option, and we could just link out to the information from there. It's really up to, you know, what people's preference is with respect to, you know, whether they prefer to produce stuff in Markdown or edit docs.
A
But did somebody want to take an action item to do something a little bit more specific in terms of aggregating some of the pain points and...
D
Sorry, I think my Zoom crashed. I think, you know... so we want to do it not just for TF Serving; we have these other model servers, and we also want to do it for our batch jobs, like TensorFlow. We'd like to have, you know, basic metrics like CPU and memory. And I think one of the questions I have, that's not clear to me...
D
...but, you know, cloud-specific and distribution-specific backends, and I don't know what the best practices are. So the more that we can have the Kubernetes community tell us what the best practices are, so we can just kind of follow them, that'd be great, in my opinion. So, I mean, I think it depends on what you were...
D
For the proxy, we were looking at things like RPC metrics, so RPC counts, rates, and errors. Errors and latencies, I'd say, are the ones we want. So ideally it would be great if gRPC sort of exported those, or TensorFlow Serving exported those automatically, but I think you and I talked about this in the past, and from what I recall of our conversation, it doesn't happen that way today. And once...
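
Since TF Serving and gRPC don't export these automatically today, here is a minimal sketch of exporting the metrics named above (counts, errors, latencies) from a model-server process, using the prometheus_client library; the metric names and the predict() stub are illustrative, not TF Serving's API:

```python
# Sketch: instrument a serving endpoint with RPC count, error, and
# latency metrics, scraped by Prometheus from /metrics on port 9090.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

RPC_COUNT = Counter("model_rpc_total", "Total RPCs", ["method", "status"])
RPC_LATENCY = Histogram("model_rpc_latency_seconds", "RPC latency", ["method"])

def predict(payload):
    # Stand-in for the real model-server call.
    time.sleep(random.uniform(0.001, 0.01))
    return {"ok": True}

def handle_predict(payload):
    start = time.time()
    try:
        result = predict(payload)
        RPC_COUNT.labels(method="Predict", status="ok").inc()
        return result
    except Exception:
        RPC_COUNT.labels(method="Predict", status="error").inc()
        raise
    finally:
        RPC_LATENCY.labels(method="Predict").observe(time.time() - start)

if __name__ == "__main__":
    start_http_server(9090)  # exposes /metrics for Prometheus to scrape
    while True:
        handle_predict({"x": 1})
```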
A
Specifically, I mean, it's not brain surgery to get it out, but it's not something that I'm gonna have the time to go do a PR for myself. So even if you're looking at doing autoscaling, or if you're looking at telemetry, another way to do it would be: if you put a proxy in front of it, you can extract all of that directly, with the exception of the actual hardware telemetry. I think you could autoscale off of that too, if you wanted to, I think.
D
I'm leaning towards the proxy solution, mainly because it seems like we get the most bang for the buck, and I think integrating with Istio to do centralized policy management is something that we want to do anyway. So I don't know the details, it's not an area of my expertise, but my expectation is that once you're using Istio, you also need an Envoy proxy, and so it just seems like that all fits together nicely. But the details elude me, so...
D
But my main point was that if you plan on moving in the direction of Istio, you're gonna have an Envoy sidecar in front of everything anyway. So it's not like we're introducing an Envoy proxy just for metrics, or thereby incurring latency. Another, I guess, other issue, just for the metrics, I mean...
A
...is something people are interested in. I think the hard part to get my head around is, like, you know, the TF Serving model server to some extent, and XGBoost I've played with a little bit, but there just seems to be such a large collection of potential frameworks that would have to be supported in order to have a unified solution that works for everybody. And then each of those frameworks takes a different...
A
Even if it's really just a different type of input from the application layer or the transport layer: most share the same transport, but the XGBoost one is structured differently, for instance, right? I mean, it's not the same. I don't know what a unified proxy that works for both of those looks like.
B
So is there interest in doing essentially request batching for better throughput under certain latency budgets at the proxy level? I know this is kind of a direction that the RISE Lab at Berkeley has gone with the Clipper project, but it seems like that naturally kind of fits into the Envoy, you know, service mesh layer.
B
Sorry, my audio cut out, can you repeat that? Right, so: has anyone heard of interest in essentially another feature at the service mesh layer for machine learning inference, which would be kind of request batching, to increase throughput while maintaining a certain latency SLO? And, you know, there are kind of academic projects, like RISE Lab's Clipper, which are designed to do this, but it seems to fit pretty naturally into an existing service mesh like Istio or Envoy, rather than having a separate project to do this.
D
So where we hear a lot about batching is actually with inference for GPUs, in the serving setting, because with GPUs you really need to batch multiple requests together to get, you know, high throughput and sort of manage latency. You can't process them sequentially. So that's the use case I've heard of most where this has come up. And, you know, one thing that's been mentioned is, I guess, NVIDIA has... I forget what it's called. It's like...
D
I think the problem is that you need to process multiple requests together to get the efficiencies of the GPU. So if you just process each request as soon as it comes in, then while your last request is processing, it's blocking all the other requests; it's not like it's multi-threaded. So if you have requests coming in every millisecond, right, you get better performance if you actually wait 10 milliseconds, aggregate all those requests, and then process them all at once.
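
A minimal sketch of that micro-batching idea: wait up to a small latency budget, aggregate the queued requests, then run one batched call. run_batch() is a stand-in for the real batched GPU inference, and the 10 ms window matches the example above:

```python
# Sketch: micro-batching requests for a single-worker model server.
import queue
import threading
import time

REQUESTS = queue.Queue()   # items are (input, reply_queue) pairs
BATCH_WINDOW_S = 0.010     # wait up to 10 ms to collect a batch
MAX_BATCH = 32

def run_batch(inputs):
    # Stand-in for one batched GPU inference call.
    return [x * 2 for x in inputs]

def batcher():
    while True:
        batch = [REQUESTS.get()]              # block for the first request
        deadline = time.time() + BATCH_WINDOW_S
        while len(batch) < MAX_BATCH:
            remaining = deadline - time.time()
            if remaining <= 0:
                break
            try:
                batch.append(REQUESTS.get(timeout=remaining))
            except queue.Empty:
                break
        outputs = run_batch([x for x, _ in batch])
        for (_, reply), out in zip(batch, outputs):
            reply.put(out)                    # hand each caller its result

def infer(x):
    reply = queue.Queue()
    REQUESTS.put((x, reply))
    return reply.get()

threading.Thread(target=batcher, daemon=True).start()
print([infer(i) for i in range(4)])           # each call served from a batch
```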
D
Yeah, I think that's accurate. I'm not sure about the exact, you know, under-the-hood details, but essentially the whole point of the GPU is that it's high throughput because you can process multiple requests in parallel simultaneously, so you need to have multiple requests; otherwise you're not gonna get high throughput. So for Istio...
A
Or Envoy. I think that would be very challenging for them to handle at that level, because they basically have to intercept every request going through the entire cluster once you've enabled the service mesh network, and because of that they try to be super lightweight in terms of touching the traffic. I don't know if they'd be... I could talk to them and see if they'd be willing to add a feature to do something like buffer a series of requests and then pass them back through.
A
The other thing is, I'm not entirely sure... I mean, they're HTTP-aware, for, like, layer 7; I'm not sure how aware they are in terms of application-level traffic. Like, I mean, I know they do gRPC and HTTP, and they do certain things at the transport layer. I think UDP has some nascent support, but not too much.
D
Honestly, in my opinion, GPU serving is, I think, still very much an unknown, you know, whether that's actually the right thing to do, or whether you're better off trying to serve efficiently from CPU, maybe by, you know, compressing your model or doing some other approximation. So I don't know, at this time, that I would necessarily invest in trying to convince them, right.
A
I'm not sure; that's why I say maybe it doesn't hurt to ask. So, I mean, I would think that this would be something that might be challenging for them, but we could always just reach out to them and say: hey, have you thought about request batching? Because there are definitely features they do in terms of, like, draining connections, circuit breaking, and so forth. So there is definitely intelligence built into the proxy; it's just a question of how much they're willing to do.
B
I think another angle on this, too, is that if this is kind of the best practice we're recommending, you know, "this is how we recommend you serve models: you're using Istio and Envoy," then that's just more moving parts that, you know, smaller shops have to set up. So I think that just underlines the kind of amount of lift that's required for people to get up and running with these systems. So I think that's something to keep in mind as well.
A
The other thing would be, if you tried to batch in your network: if ultimately you're trying to batch for the silicon that's serving, you'd have to make your network, at that point, aware of the silicon that it's trying to serve to, right. That doesn't seem like it's a super awesome separation of concerns, normally.
A
You could, so, I mean, if you're getting the network latency out, it's more a question of where you actually insert the batching logic, right. Like, in terms of getting the network intelligence, Istio, Envoy, or another type of proxy is a good place to get that, but for the batching implementation, I don't know if you necessarily need to put that into the proxy as well.
D
So I think, yeah, that's part of it. I think, you know, there's a question of, like, you know, if we recommend object store, how do we come up with a good reference implementation for object stores that can run anywhere, so that you're not just running on the clouds that have them, but you can also do it on-prem. And then I think there are use cases for shared POSIX, because you've got applications...
D
...applications like HDF5 that require it, so I guess shared POSIX is one of them. But I guess, you know, another potential solution is to use something like ABC and automatically sync data back and forth behind the scenes. But yeah, I think for us we're just kind of trying to figure out at this point, you know... we've got some issues about, like, what are, you know, the requirements and best practices that we can sort of recommend to people, so we can say, like, okay, for examples and whatnot...
D
...we make use of a shared POSIX-compliant file system. So on these types of clusters or Kubernetes distributions, go and provision NFS, and here's how you can do that; or on, you know, this cloud provider there is a cloud filer solution already, so you can use that. Basically, that sort of thing, so that, you know, in our solutions and examples we don't sort of have to reinvent the wheel every time.
D
So I think, you know, as an example, there's some discussion about, like, MinIO versus Rook, right, and, you know, which one of those gives you more bang for the buck in terms of getting object store and shared NFS, and is it performant, and what use cases you can use it for. So it's just coming up with a good recommendation, so that we're not just telling people, like, okay, here's a list of options, go and figure it out yourself.
A
That would get you up and running for, like, demo purposes; coming up with something that's robust and scalable would be challenging, though. For object storage, I think the biggest problem I've seen is that most people I see want to use the S3 interface for their object storage. Like, GCS, for instance, has an S3-compatible interface; Azure Blob Store...
A
Other than coming up with a friction log about, like, for this provider, this is how you use it; for that provider, this is how you use it. For Rook, if you're turning up Ceph clusters, that's, I mean, that's a somewhat S3-compatible API; they have their own semantic differences there. But we could come up with something like: if you're using this one, here's what you should be aware of, I guess.
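
To illustrate the S3-interface point, a sketch showing that the same boto3 client code can target AWS S3, MinIO, or another S3-compatible store just by swapping the endpoint; the endpoints, bucket, and file names are illustrative:

```python
# Sketch: one S3 client, multiple S3-compatible backends.
import boto3

def make_client(endpoint_url=None):
    # endpoint_url=None talks to AWS S3 itself; override it for other stores,
    # e.g. a MinIO deployment at "http://minio.example.internal:9000".
    return boto3.client("s3", endpoint_url=endpoint_url)

s3 = make_client("http://minio.example.internal:9000")
s3.upload_file("model.pb", "models", "resnet/1/model.pb")
# Semantic differences still leak through (consistency, multipart limits,
# ACLs), which is why a per-provider "here's what to be aware of" doc helps.
```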
D
Yeah, I think, you know, we're iterating on all that ourselves, and I don't think we sort of understand all the requirements yet, like, you know, how much is needed, whether object store is enough, whether NFS is needed, and whatnot.
D
It's proved convenient for a couple of things, and we do have some use cases. So I think that's where we're just going to learn over time, as we have sort of these examples running in clusters and we get more feedback from the customers, and we see what solutions, you know, work and don't work, and what the problems are, I think.
B
One thing that simplifies the problem is that, in my experience, the storage solution you need on your cluster often doesn't have to have the resiliency or robustness guarantees that you're used to from storage providers, because these kinds of clusters tend to be more in the computation domain. I mean, this doesn't hold in all cases, but what I mean is, you know, some grad student lab will have their NFS box, but then they have a separate cluster...
B
That's
the
compute
cluster,
and
so
you
just
want
to
get
the
data
in
a
format.
That's
closer
and
more
scalable,
but
you
don't
necessarily
need
there
were
buses,
because
it's
a
second
copy,
the
data
that
doesn't
always
hold,
but
that
it
does
vastly
simplify
the
problem
in
terms
of
making
it
not
have
to
be.
You
know
highly
available
and
full
tall.
Aren't
all
these
kind
of
things.
A
Okay, I don't really have anything else either. Do we wanna end early and take back 15 minutes, then?
A
Might as well. Okay, but before we go, I want to discuss: does anybody have anything specific that they want to add to the working group repository between now and the next meeting in two weeks? I know Mission definitely wanted to capture some things. Jeremy, did you want to capture some of the feedback that you just gave, in terms of potentially at least outlining what we might do for storage and proxying?
A
All right, cool. All right, guys, we'll see you in two weeks then. Thanks. Okay.