From YouTube: Kubernetes Machine Learning WG 20180510
C: Hello. Look, I added that. Okay, so this comes in from the first meeting, where we talked about user pain points from interviews that we did with data scientists. So let me start sharing my screen. I have already put a link to the repository, as well as the slides, in the meeting notes.
C: To give you an overview of how this project started, we talked to some data scientists and machine learning practitioners. There are a few different types of user categories that we talked to. Some of them are researchers: they do basic research in machine learning, like optimizing models, what type of models to use, any kind of new direction in creating models, and so on. There were also ML practitioners.
C: To start off, I want to give a brief overview of the ML workflow that's being used. The first part of this is iterative training, where code changes need to be made by the user, by the ML practitioner or the researcher, in order to get to a model that better suits their needs or requirements. So what they would do is: you have some hyperparameter input, they make some code changes, run the model training process, and then it spits out some output.
C
If
the
output
is
what
they
expect,
then
it
is
more.
Otherwise
they
would
again
try
to
alter
it.
This
process
over
again
by
making
some
more
code
changes
and
I
just
think
their
model
to
suit
their
needs.
So
this
is
just
illustrating
that
they
would
use
the
same.
Might
redo
process
with
some
other
values
in
it.
C: The other workflow is hyperparameter optimization, where the user wants to run the same model training script or code with different hyperparameters, and they want to look at which hyperparameters give them the best output. The best output is dependent on metrics like accuracy, for example; it could also be time to accuracy in some cases. And what the typical workflow looks like is...
C: What we really saw was some kind of common workflow, and in each case there is a data component involved; it is an important component in this workflow. The blue boxes in each of these cases are the data associated with both of these flows. One of them is the data that goes into training, the other one is the model training output itself. Even in hyperparameter tuning, or hyperparameter optimization, the same applies.
C: So, at the start of this working group, maybe in the second meeting, we presented a bunch of user pain points related to different user pain categories, like, for example, data or scheduling. In this case, I just picked out some of the user pain points which form the basis of the motivation for creating the Kubernetes Volume Controller.
C: The other part of this workflow is that the user categories that I mentioned can be involved in multiple stages of this workflow. If the user, the ML practitioner or the data scientist, is developing, they might have their own data set, which needs to be either locally cached or made available through PVs or PVCs so that the job can consume it. Or there could be a handover, where an operator could prime or cache the data sets that are frequently used.
C
C
So
so
we
created
KBC
2
who
solved
some
of
the
user
pain
points
that
I
just
mentioned.
So
what
does?
What
is
kubernetes
volume?
Controller
KBC
leverages
the
operator
pattern
to
man,
I'm
kubernetes,
so
KBC
handles
data
from
different
sources
such
as
s3
and
NFS.
For
example.
We
also
have
some
source
types
related
to
pachyderm
file
system.
Now
KBC
maintains
the
metadata
required
to
establish
the
relationship
between
data
and
volumes
and
kubernetes,
so
so
start
off
with
just
to
go
a
little
bit
more
deeper.
C: What we really did was use custom resource definitions, so we extend the Kubernetes API server with a resource called VolumeManager. We also load custom controllers to drive these custom resources, in this case the VolumeManager, to the desired state. The current status is that we support some data source types, and you can look at our repository to get more information on what we support.
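To make the operator-pattern description above concrete, below is a minimal sketch of how such a VolumeManager resource could be registered with the API server. It is illustrative only: the API group, version, and resource names are hypothetical placeholders, not the ones actually used in the KVC repository.

```yaml
# Minimal sketch of registering a VolumeManager custom resource type.
# Group, version, and names below are hypothetical placeholders.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: volumemanagers.vck.example.com
spec:
  group: vck.example.com        # hypothetical API group
  version: v1alpha1             # hypothetical version
  scope: Namespaced
  names:
    kind: VolumeManager
    plural: volumemanagers
    singular: volumemanager
```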
C: A typical goal for a user journey that could be supported by KVC is this: I'm a researcher or an ML practitioner, I have some data in S3, I want it downloaded and replicated across multiple nodes in the cluster, and I also want to label it so that I can remember and keep track of it. One use case this replication enables is running hyperparameter optimizations against the data replicated on different nodes.
C: The way the S3 data source type in KVC works is that when you create the custom resource, you specify the source type, or the data source type, some labels, the location of the S3 bucket, and the number of replicas you want. Sorry, this is not the labels, these are the labels. When this CR is created, what typically happens is that the custom resource goes into a pending state, and the controller takes over from there.
C: The controller looks at the custom resource that was created and downloads the data using pods and the Kubernetes scheduler. When the download is completed, it will provide you with a volume source, in this case a hostPath volume source, and the node affinity details, in order for you to guide your pods, whether they are part of a job or a higher-level object, either a deployment or a job, in the cluster.
C: If the download fails, or if for some reason there is some other failure, the error is bubbled up to the custom resource, so that you have a single place to look at whatever errors might occur during the download. This is just for one source type, but there are multiple source types within KVC. They are all behind an interface, so you can implement the interface in order to get support for a new data source type.
C: This is a bunch of labels. Here there are options which provide the credentials, as part of a secret, in order for us to get access to the data itself; then the source URL itself; and then an endpoint URL. The endpoint URL is added because we support, I wouldn't say any, but three different S3-compatible storages: AWS S3, GCS, and Minio.
C: We use the Minio client, so we have to specify an endpoint URL in order for it to download from the different S3-compatible storages. So this is a typical custom resource. When you create the custom resource, what it really does is simply run multiple pods in the cluster. In this case the replica count is just one, so it will run a single pod and download the data onto a set of nodes.
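As a rough sketch of what such a custom resource might look like, here is a hedged example that assumes hypothetical field names derived from the fields just described (source type, labels, replicas, credentials secret, source URL, endpoint URL); the real schema in the KVC repository may differ.

```yaml
# Illustrative VolumeManager custom resource; the apiVersion and all field
# names are assumptions based on the fields described in the talk.
apiVersion: vck.example.com/v1alpha1
kind: VolumeManager
metadata:
  name: s3-dataset-example
spec:
  volumeConfigs:
    - id: dataset
      replicas: 1                                  # number of nodes to replicate the data onto
      sourceType: S3
      labels:
        dataset: imagenet-sample                   # user-chosen labels to keep track of the data
      options:
        sourceURL: s3://my-bucket/path/to/data     # location of the data in the S3 bucket
        endpointURL: https://s3.amazonaws.com      # S3-compatible endpoint (AWS S3, GCS, or Minio)
        credentialsSecretName: s3-credentials      # secret providing access credentials
```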
C: It then provides you with the volume source and the node affinity required in order to consume that data when you are running a job. So we create the custom resource that was just shown here, and when you do a get on the volume managers you see that the custom resource was created, and when you actually describe the custom resource that was just created, what you will see here is the volume source we just... sorry.
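The sketch below shows how a job's pod template might consume what the controller reports back: a hostPath volume source plus node affinity that steers the pod to a node holding the downloaded data. The hostPath location and the node label key are hypothetical placeholders; in practice you would copy them from the custom resource's status.

```yaml
# Hypothetical job consuming the hostPath volume source and node affinity
# reported by a VolumeManager; path and label key are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: training-job
spec:
  template:
    spec:
      restartPolicy: Never
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: vck.example.com/dataset    # label assumed to mark nodes holding the data
                    operator: Exists
      containers:
        - name: train
          image: tensorflow/tensorflow:1.8.0
          command: ["ls", "/var/datasets/dataset"]  # placeholder workload that just reads the mounted data
          volumeMounts:
            - name: dataset
              mountPath: /var/datasets/dataset
      volumes:
        - name: dataset
          hostPath:
            path: /var/datasets/dataset             # path where the download pods are assumed to have placed the data
```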
C: You wouldn't need CSI for that. I understand that, but you have to pass in the node affinity information. Yes, node affinity can be taken care of in the scheduler. I think that's the problem there. I mean, I'm not saying this as negative criticism of CSI; I'm just saying it's not supported yet, but in the future it may be supported.
C: So this is a problem where we may need to have a choice of PVCs: the user could specify multiple PVCs in the pod template spec, and if we are able to attach to one of them on any of the nodes, the pod might start running. Something like that, maybe a solution like that, would be useful.
C: Yeah. The other problem, and I think they're working on it, is the CSI driver installation: how do we deploy the plugins for CSI? The other miscellaneous comment here is that I don't think Kubernetes has support for, or maybe has even thought about adding support for, object store interfaces yet. So those are a couple of issues that are outstanding around container storage and S3 interfaces.
C: So this is one more requirement on top of that. Let's just say there is a deployment that uses a local PV, and assume that all the pods, all the replicas in that deployment, are not able to land on the same node. Then the rest of the replicas won't run at all, because the deployment is attached to a single PVC, and that PVC is bound to a single node, right?
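A minimal sketch of the situation being described: a Deployment whose pod template mounts a single PVC bound to a local PV. All names are hypothetical; the point is that every replica is forced onto the one node backing the local PV, so replicas that don't fit there stay unscheduled.

```yaml
# Illustrative Deployment mounting one PVC that is bound to a local PV.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: training-workers
spec:
  replicas: 3                          # all three pods must land on the single node backing the local PV
  selector:
    matchLabels:
      app: training-workers
  template:
    metadata:
      labels:
        app: training-workers
    spec:
      containers:
        - name: worker
          image: tensorflow/tensorflow:1.8.0
          command: ["sleep", "infinity"]           # placeholder workload
          volumeMounts:
            - name: dataset
              mountPath: /data
      volumes:
        - name: dataset
          persistentVolumeClaim:
            claimName: local-dataset-pvc           # hypothetical PVC bound to a local PV on one node
```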
A: What you'd want to do for something like this, so if you're trying to do distributed training, what you really want is some combination of job and stateful set. You want an individual PVC that's generated for each pod, right? You don't want to try to mount them all to the same PVC, even though the PVCs are backed by the same actual data. And then you want jobs, because you want the pod to run to completion, in either success or failure, not something that continues to be restarted indefinitely. Right?
C
So
in
a
job,
I
know
so
in
a
job
when
you
do
this
part
template
spec,
so
you
can
the
partner
plate.
Spec
is
same
for
all
the
parts
and
when
you
mention
one
PVC
in
that
part
to
volume,
UNK
and
that
PVC
happens
to
be
a
local
PB,
let
you
say-
and
it
has
not
affinity
associated
with
it.
If
the
other
resource
requests
are
not
satisfied
by
the
same
node,
the
pod
has
to
land
on
a
different
node
and
it
won't
be
able
to
land
and
well.
C: Makes sense. That's what I'm saying: that kind of a feature would be very useful. I mean, I'm just talking about the current state of any high-level object within Kubernetes, like a job or a deployment, where if you specify this in a pod template spec, like a volume on a single local PV, yeah, it's not supported.
C: The hook is actually an admission controller, so what you would do is specify the label name of the KVC CR that you want to associate yourself with, and the admission controller would actually rewrite your pod template spec, or the pod spec itself, and mount all the volume sources associated with that KVC CR at admission time.
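A hedged sketch of what opting in to that admission controller could look like from the user's side: the pod carries a label naming the KVC custom resource, and the webhook is expected to inject the corresponding volume, volume mount, and node affinity at admission time. The label key used here is a hypothetical placeholder, not the actual key used by KVC.

```yaml
# Hypothetical pod opting in to the KVC admission controller via a label.
apiVersion: v1
kind: Pod
metadata:
  name: notebook
  labels:
    vck.example.com/volumemanager: s3-dataset-example   # names the VolumeManager CR to associate with
spec:
  containers:
    - name: notebook
      image: jupyter/tensorflow-notebook
      # No volumes are declared here; the admission controller is expected to
      # rewrite this spec and mount the data before the pod is scheduled.
```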
A: I think CSI would play a role there. There is a proposal to actually mount Kubernetes PVs that are backed by HDFS, but the throughput you get on HDFS really isn't what you'd want to see for a PV, for network-attached storage. On the other side of it, depending on how your network is actually set up, you get data locality issues, and the main use case for HDFS is basically: I want to set up a bunch of processes that all read from replicated partitions across the cluster.
A
But
I
want
to
be
able
to
schedule
my
pods
close
to
the
partition
that
they're
going
to
read
from
so
I.
Don't
have
to
take
the
extra
network
hop
now
on
GCP,
it's
not
as
big
of
a
deal.
The
network
like
we
did
some
experiments
and
the
network
is
actually
fast
enough
that
data
locality
was
it
really
didn't
matter
like
setting
up
with
a
custom,
node
topology,
so
that
we
were
able
to
achieve
data
locality,
didn't
drastically
improve
performance
for
read
over
just
throwing
them
randomly
across
the
cluster.
A
If
you're
more
concerned
about
running
into
on
cram
or
places
where
the
network
isn't
quite
that
over-provisioned,
then
data
locality
is
gonna
become
an
issue
like
AWS.
It
was
definitely
an
issue
and
you
could
notice
decretive
degraded
performance.
The
challenge
is
really
being
able
to
expose
the
correct
network
topology
for
the
nodes
and
pepper
data's
actually
put
some
work
into
doing
this
and
then
the
other
side
of
it
for
HDFS
is.
A
There
are
people
who
are
running
Cades
clusters
and
HDFS
side
by
side
and
the
same
footprint
so
basically
like
you'll
have
caves
one
portion
of
the
rack
and
then
you'll
have
HDFS
on
another
portion
of
the
rack
on
storage,
dense
nodes
and
they've
had
some
level
of
a
success.
Doing
it
that
way,
so
you're
not
going
to
co-locate
directly
onto
the
same
node
as
your
but
you'll
collocate.
A
On
the
same
rack
and
you're,
just
gonna
have
to
go
over
the
top
of
relax
which,
in
order
to
act
as
the
data
for
the
HDFS
cluster,
so
good
was
good
results
there,
but
I'd
definitely
encourage
you
to
reach
out
to
the
pepper,
d2
guys
or
maybe
go
to
sing
big
data.
Let
me
know
please
and
just
talk
to
them
about
it,
because
if
it's
something
you're
interested
in
working
toward
it's
definitely
something
they're
interested
in
working
toward
as
well.
Okay,.
A: Okay, we didn't really have anything else scheduled for the agenda. Does anyone want to talk about KubeCon?
B: Sure, so I had a lot of great conversations with people, like Rhys and others; they have a great tool, and hopefully they'll participate. It would be great to find some way to integrate with them and work with them. We had a lot of discussions. We met with Hall and Williams about simplifying the workflow for data scientists in terms of building your code.
B: You would modify your code, build your container, and then submit something like a TensorFlow job for that container. So we're hoping to get some experiments going with tools like Skaffold and Draft, to see how much of what already exists we can leverage. Give me a second to pull up my notes.
B: Let's see. We were pretty happy with the number of folks who signed up as part of the Kubeflow community; I think we have over 20-plus organizations now listed, so that's pretty exciting. I was pretty excited to see that Eclipse Che now works on Kubernetes; I'd love to get that integrated as part of the Kubeflow story, to have a good IDE to support that use case.
B: We had a group, High Cloud, share some notes about some feedback and UX research they had done with data scientists, so if anyone else is interested, I'd encourage you to reach out to the High Cloud folks and ask them if they're going to share that. We chatted with One Convergence; they've done a number of UI improvements in Kubeflow, and also, I think, they added a lot of support to Ambassador, which is really nice, so we're hoping that they upstream those changes and get that into Ambassador.
B: And then those UI improvements to Kubeflow, that would be fantastic. And then I spoke with a company called Dotmesh, and as I understand it, they have a versioning system for data that works with K8s volumes. If I understand correctly, it's supposed to allow you to do things like create a PVC and say, I want this snapshot of the data, and then it will automatically copy the data to that PVC and make it available. So that sounds like it could be very useful, and yes, that sounds interesting.
A: Okay, I'm going to take silence as a no. All right then, I guess that'll be it for this week, and we'll get 15 minutes back. The only thing that it looks like we have laid out for next week at the moment, or I'm sorry, not next week, but bi-weekly, is an overview of FfDL from IBM, their deep learning fabric, which promises to be interesting. All right guys, have a good day, and I'll see you in a couple of weeks. Take care.