From YouTube: OpenShift Commons Briefing: MLflow Intro and Operator, with Mani Parkhe (Databricks) and Zak Hassan (Red Hat)
Description
OpenShift Commons Briefing
ML SIG March 2019 Full
MLflow Intro and Operator
Mani Parkhe (Databricks)
Zak Hassan (Red Hat)
Diane Mueller (Red Hat)
Mani Parkhe (Databricks): Hello everybody, good morning. My name is Mani Parkhe, and I am an engineer at Databricks working on MLflow. I'm here to talk to you all about what MLflow is, the motivation behind building this machine learning framework, what the components of MLflow look like, and where we're going to take it. I'm going to introduce all of this in the next 15 minutes or so as a preamble to Zak's presentation.

I'm sure everybody on this call appreciates how complex machine learning development is, and also appreciates that it's somewhat harder than traditional software development as we have practiced it over the last few decades. To tease this out a little bit, I want to talk about some differences between traditional software development and machine learning development.
Let's start with the problems in each. Take a problem in traditional software: say you're building a credit card transaction system or a functional verification system. You start with a functional specification; you know exactly the terms and conditions and what product you're trying to build, so the goal is pretty clear. In machine learning, the goal is to optimize a metric, so there is no perfect answer; you're just trying to get better and better. The metric could be increasing accuracy, or it could be a vector of different metrics that you're trying to optimize.

The other difference is quality. In traditional software, quality is largely a property of the code and the system you build.
When we go to machine learning, quality depends not only on the code and the system, but also on the data we use for training the models, how well the model performs, how regularly we continue to update it with fresh data, and whether we have to retune the models, use different algorithms, and so on. So quality is a shifting goal and a moving target, just like the goal we talked about.

The third thing I want to talk about is that in traditional software,
you develop on top of a common software stack, something the team is flexible with, has worked with in the past, has released different projects on, and is pretty well acquainted with. In machine learning, on the other hand, you want to constantly experiment with new libraries that keep coming out, different frameworks, new algorithms, various types of models. And beyond experimenting with them, you must also know how to productionize them; just having a model built with some new framework is not good enough.
All of this presents its own challenges. Machine learning practitioners are incentivized to use different algorithms and different frameworks in order to get the best model quality we've been talking about. So if optimizing a metric is the goal, you pretty much end up trying out various tools. On top of that, just being able to train and create a model is not sufficient; we all know that tuning with the right parameters is quintessential to getting the right model.
You may also have multiple teams involved: a team of data engineers working on the ETL and logging part, a set of data scientists working on the training aspect, and systems engineers on the deployment part. So you have to work through the lifecycle of a model, how it passes through these different teams, and the governance of it all, and then finally there is building the model and putting it in production.
You also want to be able to use other systems in your ecosystem alongside your machine learning models. For example, you may want to test different styles of models, track how these experiments flow through different stages of the lifecycle, or have a more automated, orchestrated way of training these models and managing their lifecycle. And finally you have to start worrying about how models drift from the expected quality, data drift, and so on.
So what we have seen is that this, in itself, is an extremely complex process. There is not a single tool out there that makes all of it easy, and in fact you end up using multiple tools that you and your team are comfortable with. So when we built MLflow as an open platform for users to manage their models and machine learning workflows, we wanted it to work with all of these
existing tools: it makes it easier to work across the stack of these various tools, not necessarily replacing any one of them, but working with a lot of them to solve problems that had not been solved. So, moving on, let me talk about what MLflow is in a quick three-bullet slide. MLflow is an open platform that helps you manage your machine learning development lifecycle, and it does that in three ways. One, we have lightweight APIs that work with any ML library.
As we talked about, everybody has their favorite library to use, and in fact you want to try out different libraries, which could be in different programming languages. The key thing we saw here was that data scientists and machine learning practitioners don't want to get locked in to one particular library or one particular language.
You want to be able to use any kind of language, so we built MLflow with an API-first approach: you can talk to MLflow's components using the REST API, or through Python, Java, and R clients that are all built on top of that basic REST API and let you interact with MLflow. The second thing we wanted to do was make reproducibility of runs a primary concern.
For instance, you typically train your model on your local machine, and then you want to make sure it reproduces the exact same results when you run it on any kind of cloud platform. What makes this hard is actually getting that reproducibility. If you send your code to some other engineer on your team, they should be able to run it the exact same way.
The first component is the MLflow tracking server, and we're going to spend some time talking about that. It's essentially a centralized repository that stores all the critical information that is generated from, and required to generate, your machine learning run. This could be the input parameters and configs, and also the metrics that come out. It has mechanisms to query all of that, so it's like a database for all your machine learning runs. The second one,
MLflow Projects, is a code packaging format targeted at making your runs reproducible when you run them on any cloud platform, or at any point in time later on. It's a way to store everything such that you can guarantee you are reproducing those runs anywhere, and anybody else can do it as well. And the third one, MLflow Models, is a generic
packaging format for models, such that a model written out once by MLflow can be deployed across a variety of production platforms: maybe a real-time scoring format, batch scoring, or streaming platforms. So those are the three big components, and I'm going to spend a few minutes talking about each one of them.
Let's jump right into the key concepts of MLflow Tracking. We talked about tuning your machine learning algorithm, so parameters, which could be numeric or string values, are key to making sure you get what you want out of your machine learning run. Data scientists may try out thousands of experiments with the same algorithm just to get
the right results they are looking for. So one of the key concepts is a dictionary of parameters associated with your particular run. Another is that after your run is done, you generate numeric values by scoring your test data: metrics such as accuracy, error, and so on. Those become the things the engineer might want to keep track of in this consolidated repo through the tracking server.
Finally, if you start looking at your models, the model generated from the training run could itself be an artifact that you want to store away for tracking or governance, or even use for scoring later on. Along with that, you want to keep track of metadata around the run: how the model was trained, what source code was used, the exact version, maybe the Git repo and the Git hash.
Those are some of the interesting things you may want to store alongside your training data. Finally, there could be some text you want to store, some document that captures the details of your model, and for that we have high-level tags or notes that you can record for your training run. At the bottom right of the slide you can see how all of this shows up in the MLflow UI.
This is the query layer I was referring to: you have the ability to list all your runs and then, using a SQL-like syntax, query those runs by metrics and parameters and slice and dice your data based on specific values. For example, you can say metrics.accuracy is greater than 0.98, and so on.
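A rough sketch of that filter syntax; the experiment ID is a placeholder, and the mlflow.search_runs helper shown here comes from later MLflow releases, so treat this as illustrative rather than what was shown in the briefing:

```python
import mlflow

# Query the tracking server for runs whose logged accuracy beats 0.98.
# "0" is a placeholder experiment ID; the filter string mirrors the
# SQL-like search box in the MLflow UI.
runs = mlflow.search_runs(
    experiment_ids=["0"],
    filter_string="metrics.accuracy > 0.98",
)
print(runs)  # returns a pandas DataFrame, one row per matching run
```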
Okay, so how do people use this consolidated storage? Your MLflow tracking server, as I said, is a database, and you can have various people writing into it for each training run, whether from hosted notebooks, a local app running on your machine, or a cloud job. They write to this centralized database through the REST API or the Python API, as we'll see, and then we have built a UI layer, plus an API layer, to query that database.
In this picture we've abstracted away the database. We started off, when we open-sourced MLflow, with a file-backed store, and now we are releasing a SQL-backed store to hold all the metadata around runs. You can query it using the UI we've built or using the Python API.
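As a minimal sketch of that setup (the server URL is a hypothetical example, and the CLI invocation in the comment assumes the SQL-backed store that was being introduced around MLflow 0.9):

```python
import mlflow

# Point the client at a remote tracking server (hypothetical URL).
mlflow.set_tracking_uri("http://mlflow-tracking.example.com:5000")

# The server itself can be started against a SQL-backed store and an S3
# artifact root, roughly like this (CLI shown as a comment):
#   mlflow server --backend-store-uri sqlite:///mlflow.db \
#                 --default-artifact-root s3://my-bucket/mlflow-artifacts
```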
Finally, what does the lightweight API I was talking about look like? It's very simple. You start a run, and then you can record the parameters that are important for that particular run, and you can see that highlighted in green here. Then you paste in your training algorithm, and at the end of it you compute some metric based on some test data and log that metric.
You can also log the model itself, or some higher-level artifact like, for instance, a plot of metric data. In this particular example the model happens to be a TensorFlow graph, so we log it as a TensorFlow graph, and MLflow has packages within it that understand that this is a TensorFlow graph and write it out as a model. Any tool that understands a TensorFlow graph can then use that for deployment.
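A minimal sketch of that pattern, not the exact code on the slide (the slide's example logged a TensorFlow graph; this sketch uses scikit-learn and synthetic data purely for brevity):

```python
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error

# Tiny synthetic dataset so the sketch runs end to end (placeholder data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)
train_x, test_x, train_y, test_y = X[:80], X[80:], y[:80], y[80:]

with mlflow.start_run():
    alpha = 0.5
    mlflow.log_param("alpha", alpha)          # hyperparameter for this run

    model = ElasticNet(alpha=alpha)           # your training code goes here
    model.fit(train_x, train_y)

    mse = mean_squared_error(test_y, model.predict(test_x))
    mlflow.log_metric("mse", mse)             # metric computed on test data

    mlflow.sklearn.log_model(model, "model")  # the model itself as an artifact
```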
So with a very simple API that wraps around your existing training algorithm, you can now very easily use the tracking server, and whatever happens during the run gets logged to it. Alright, that was tracking; now let's talk about Projects. As a reminder, this is a code packaging mechanism that makes it easy for you, or anybody on your team, to reproduce your runs on any cloud platform at any later time.
So what do we need to be able to do that? In this packaging format we obviously need the code that is needed for training. You may also need some configs and some pointer to the data. Together this effectively constitutes what would be needed to reproduce the run, whether that means running it locally on somebody else's machine or running it on some remote cloud.
So what exactly does this packaging format look like? Here's an example. It's essentially a directory structure in which you have a file called MLproject, and if you look at the file (this is a very small version of it), it tells you right off the bat that it needs a conda environment. This is the conda environment the data scientist used to train their model, and its contents are a YAML file included within that directory structure.
It also says that the entry point runs a simple Python command, main.py, and passes it the training data and a lambda parameter. And obviously the code, main.py, is in the directory too, and there could be other dependent modules included as well; this is the code you have written to create your training run. So, pretty simply, when you write a training algorithm you include it within that directory structure, create a simple MLproject file, capture your conda environment, and boom:
anybody is able to reproduce your run using a couple of simple MLflow commands that take your existing code, use their data and their parameters, and create a new run, or reproduce your run exactly. How does somebody do that? From the command line it's "mlflow run" followed by a Git repo URL, or a path to a local directory structure. If you want to do this programmatically, there is also a Python API for it.
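As a sketch of what that looks like (the repository URL and the alpha parameter below are placeholders, not the project from the talk):

```python
import mlflow

# Programmatic equivalent of "mlflow run <uri> -P alpha=0.5".
# The URI can be a Git URL or a local directory containing an MLproject file.
submitted = mlflow.projects.run(
    uri="https://github.com/example/my-ml-project",
    parameters={"alpha": 0.5},
)
print(submitted.run_id)  # handle to the run that was just created
```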
Now, the third component, MLflow Models. Again, this is a format such that once you write out a model using MLflow Models, it should be easy for you to deploy it in any kind of environment that can host that model. Let's take an example: you have created a model, say a TensorFlow graph model, in your local environment, and you want to write it out in such a format that it can be deployed anywhere.
This is a way of saying that the same model used by one team for, let's say, batch scoring is also used by another team for some real-time scoring, and both use the same model written out by MLflow. So what does this look like? Again, it's a packaging format, and it is very similar to what you'd see in MLproject. There is an MLmodel file, and it has a few things. First, it gives you some metadata about when the model was created.
Then it describes how the model can be loaded: an environment that can run TensorFlow can deserialize this model and make it executable, and any environment that uses Python can use the MLflow module called mlflow.tensorflow to load this model and then execute it like it would any Python function. Again, it all depends on making sure that all the dependencies are in place.
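A rough sketch of that loading step; the run ID below is a placeholder, mlflow.pyfunc.load_model is the generic loader in current MLflow releases, and flavor-specific loaders such as mlflow.tensorflow.load_model return the native object instead:

```python
import pandas as pd
import mlflow.pyfunc

# "runs:/<run_id>/model" points at a model previously logged to the tracking
# server; the run ID below is a placeholder, not a real run.
model = mlflow.pyfunc.load_model("runs:/1234567890abcdef/model")

# The generic pyfunc flavor exposes the model as a plain predict() function.
predictions = model.predict(pd.DataFrame({"x1": [0.1], "x2": [0.2], "x3": [0.3]}))
print(predictions)
```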
So how can you learn more? This was a little flavor of what MLflow looks like; there is a lot more in terms of documentation and training examples, for instance if you want to do some hyperparameter tuning, multi-step workflows, or batch scoring. You can start off by installing MLflow through PyPI, or you can get it through the GitHub repo and try it out. The key place to go is mlflow.org, which hosts the website and has all the examples.
Finally, we recently ran a survey about MLflow on our website, and about 30 organizations chose to publicly list themselves as users of MLflow. I want to point out that this survey is still open, and I invite everybody to go look at mlflow.org, where there's a Google form for it. We want to know whether you have used MLflow, and if you have not, what sort of things you are looking forward to in it.
That could be system-level things like Git integration and cloud support, UI-level work to visualize experiments, or integrations with different packages. A lot of contributions have come from external contributors; these are some of the big ones, and there is much more than this (the slide is slightly dated).
Currently we are at version 0.8.3, and as we start going into 0.9 and 1.0, in 0.9 we will be adding a SQL-backed store for the tracking server, which is one of the most frequently requested features, and we are adding fluent Java and Scala APIs and a lot of UI scalability work.
A lot of requests came around custom code: I want to use a model, save it, and then inject some custom code around it. It could be querying from a feature store, or transforming the features a little bit. So custom logging and custom code is something that is being worked on actively, and we'll start releasing it in 0.9 and 1.0.
You can already do multi-step workflows using MLflow Projects, since you can have multiple entry points, but it would make sense to have a UI to edit or modify your multi-step workflow. We're also looking at adding some telemetry components for logging data and metrics and feeding them back into analytics tools, and we're looking for feedback around that. So again, another plug for the survey on mlflow.org: we want to know which libraries and frameworks you use with MLflow and
what you would like us to add as integrations in MLflow. So that's the survey. Thank you all for your time. I want to again point you to mlflow.org, or join us on Slack, or there's an email thread you can join. Thank you for giving me the time to talk today. If anybody wants to come to the Spark + AI Summit, here is a discount code I wanted to share with you all.
Diane Mueller (Red Hat): That was great. Quick question: what are the dates for the Spark + AI Summit?
Awesome, I'm sure we'll have some people there from Red Hat for sure. So, next up, if there are no questions, or maybe we'll save the questions for after Zak's demo: Zak, if you want to take over sharing the screen, we will get you set up. And Mani, if you could add the link to your Bay Area meetup into the notes, I think that would be of interest for folks.
Zak Hassan (Red Hat): What are these things called hyperparameters? As Mani pointed out in his presentation, when we're creating machine learning models we may want to do some hyperparameter tuning. When we're creating ML models, we're presented with different ways to define the model architecture, and in the beginning you don't know what the optimal architecture should look like for your model. You need to do multiple runs and test your model with multiple
different parameters to find the optimal model architecture. The parameters that define the model architecture are called hyperparameters, and the process of searching for the ideal model architecture is called hyperparameter tuning. Let me tie this back to the use case here. Applying this to our use case: let's say we have an unsupervised machine learning problem and a machine learning technique such as k-means.
You might ask, what is k-means? K-means is used to partition data points into K clusters. Suppose K is three, so we have three clusters, and we have ten data points. What k-means does is take the features of the ten data points and assign each point to cluster one, two, or three, and data points that are similar will be grouped under the same cluster.
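A minimal sketch of that idea, combining k-means with the kind of hyperparameter sweep and MLflow logging described above (the data, the candidate values of k, and the choice of inertia as the comparison metric are illustrative assumptions, not taken from the demo):

```python
import mlflow
import numpy as np
from sklearn.cluster import KMeans

# Ten toy data points with two features each (placeholder data).
rng = np.random.default_rng(1)
points = rng.normal(size=(10, 2))

# Try a few values of k (the hyperparameter) and record each attempt as a run.
for k in (2, 3, 4):
    with mlflow.start_run():
        model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(points)
        mlflow.log_param("n_clusters", k)
        # Inertia (within-cluster sum of squares) is one metric you might
        # compare across runs when picking k.
        mlflow.log_metric("inertia", model.inertia_)
        print(k, model.labels_)  # which cluster each of the ten points fell into
```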
So that's more on the use case side. The ML lifecycle is a complex thing, and we wanted something off the shelf and open source that we could pull in. We're running on OpenShift, which is our distribution of Kubernetes, so we wanted something that runs on OpenShift to do things like training the model, doing hyperparameter tuning, building, and versioning. We found MLflow to be a great tool that has all these great features, plus experiment tracking.
So we have OpenShift here, and we have our operator here, the MLflow tracking operator, which I've already deployed. We have one instance of the server, and MLflow is already running here; that instance is pointing to an S3 bucket called zak-hassan, and that S3 bucket is over here. Okay, so what I'm going to do is run this code, which is the standard example from the MLflow website, but I'll just point you to where the code lives,
which is here. It's just the standard example, and it's using the same MLflow APIs: it's logging the parameters, logging the metrics, and then storing the model in the server. It chooses different parameters, and then we can compare the three different runs. So, without further ado, this is our UI. We're going to refresh, just so there's no smoke and mirrors, and you can see there are no runs in there right now.
So I'll go ahead and do a run. We already did a build of that Git repo; it's already here, and we're going to inject a secret, and that secret is here. We don't want to ask users for their AWS access key and secret key in the clear; we want to keep them in secrets, and then those secrets get injected into the container.
So, just to wrap this up: what's next for this? The next thing we did was create a PR in the MLflow repo (like all things open source, it always starts with a PR), and we're contributing this operator to the MLflow community and collaborating there. That's pretty much it.
Hema (audience): Hi, this is Hema. I'm currently working with the AI CoE team, and I've had conversations with Zak as well about the different use cases for MLflow, particularly on the machine learning side. As a data scientist I was testing and trying it out, so I just had a couple of questions for Mani.
It was great to hear that there are new versions of MLflow coming with some of the features you mentioned. One of the questions I had as I was testing out MLflow: let's say we have three or four data scientists who are working and testing different kinds of models. One is running k-means, someone else is running k-means++ or some spectral clustering.
When we run the MLflow server, which is one server shared by multiple users, is there a way to identify who ran what if we're all pushing under one experiment? Is there a way to parallelize this, or do we all just push under one particular experiment ID or give it a name?
Mani Parkhe: That's a really good question, Hema, and I can answer it a couple of ways. An experiment, if you start thinking about it, is your way of organizing different runs. So let's say four or five data scientists are working on the same project, you want to work off of each other's runs, you want to share your data with each other, and it's all related to one exact problem you're solving; then it actually does make sense to wrap them up as one experiment.
In that case you are all submitting your runs to the exact same experiment, and they will be recorded as different runs (runs are identified by UUIDs), and in the UI each run is recorded with the specific user ID of whoever submitted it. So it's very easy to do that.
Now, it might get a little harder to compare across those, but if you really need them all to be together, MLflow supports that. If you prefer to have them as different experiments, you can just create different experiments and store them that way. For example, let's say all of you are working on the same problem and you want two different experiments for two different classes of models: say one is a deep learning model
and another is a traditional model. When you're working on the deep learning part of that project you record runs to the deep learning experiment, and when you're working on the traditional model they go to the other one. So it all depends on how you want to use it; MLflow supports all of this, and as I said it is designed to scale across a lot of users. In fact, a lot of users do use it that way.
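A small sketch of what that organization might look like in code; the experiment names, tag, parameters, and metric values below are hypothetical examples, not from the discussion:

```python
import mlflow

# Runs land under whichever experiment is currently selected; MLflow also
# records the submitting user on each run automatically.
mlflow.set_experiment("clustering-deep-learning")
with mlflow.start_run():
    mlflow.set_tag("owner", "hema")            # optional extra attribution
    mlflow.log_param("model_type", "k-means")
    mlflow.log_metric("silhouette", 0.42)      # placeholder metric value

mlflow.set_experiment("clustering-traditional")
with mlflow.start_run():
    mlflow.log_param("model_type", "spectral-clustering")
    mlflow.log_metric("silhouette", 0.37)      # placeholder metric value
```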
Diane Mueller: I'm just looking to see if there are any other questions. We're almost at the top of the hour. I always love it when I think I'm giving everybody 15 minutes, but you just blow it out of the water and use the time wisely. So I'm going to thank everybody for their time today. All right then, perfect.