From YouTube: What's Next? Machine Learning on OpenShift Panel - Matt Farrellee, David Aronchick, Kris Overholt
Description
OpenShift Commons Gathering December 5th 2017 Austin, Texas
What's Next? Machine Learning on OpenShift Panel
Panelists: Matthew Farrellee (Red Hat), David Aronchick (Google), Kris Overholt (Anaconda)
Tushar Katarki, Red Hat, Moderator
Tushar Katarki (Moderator): Obviously, I'm very delighted to have you all here. I know it's late in the afternoon and the refreshments are waiting for us. This is the last panel, so I will try to keep it very exciting and hopefully not put you to sleep. I am Tushar Katarki, a product manager on the OpenShift team at Red Hat, and this is the panel discussion on AI/ML on Kubernetes and OpenShift.

I'm going to say this a couple of times so that it sinks in: we have a new SIG that we are creating in OpenShift Commons, which Dan mentioned. So, and obviously I'm going to plug it now and at the end, go and sign up on OpenShift Commons for this SIG if you're interested in this topic, especially toward the end, once you've heard from this esteemed panel here.

So the logistics are: I thought we would do some introduction to the topic and to the panelists for about 10-15 minutes, then get into the main discussion itself with some questions, then I'll give you an opportunity to ask some questions for, say, ten minutes, and then we'll wrap up with some more plugs. Does that make sense? All right, cool.

So, a brief introduction to this topic. It's very exciting, right? AI, artificial intelligence, and machine learning are already touching our lives: be it driverless cars, be it personal assistants like Alexa and Siri, be it Netflix recommending your favorite movie or Pandora your favorite music, or be it optimizing your energy usage with Nest thermostats. There is some AI in there; it's really already there. We are saying it's new, "what's next," but in some ways it's already there and affecting our lives, so I'm very excited to talk about this today. My personal favorite, quote-unquote, use case really was this thing.
The one thing is, AI is not easy, right? AI has been talked about for a long, long time in academia. So why has it not happened so far? It's because we didn't have things like cloud, and we didn't have things like big data. So it's certainly building upon a couple of huge technology trends, but those technology trends and those use cases are also moving very fast, and AI is building upon that. So it is complicated. It involves a number of languages (we'll talk about that), it means a number of frameworks, and it is very computationally intensive and resource hungry, so you really want to optimize the use of it. And it touches a number of different roles, data scientists being an obvious new one.

So we have all this complexity, and one of the advantages that we have, which we're going to talk about today and over the next few days, is containers. Simplistically put, hopefully we can contain some of this complexity away; that's how I'm looking at it. Containers are also lightweight, they are fast and efficient, and they're portable across a hybrid cloud footprint. And Kubernetes and OpenShift have really emerged, as you saw in Chris Wright's keynote, as a very powerful container platform; in fact you might call it the de facto platform. So it is great to moderate this panel
on AI/ML on OpenShift and Kubernetes. So with that, let me start with the introduction of the panel. I'll start with David. David Aronchick is a product manager at Google. He's the product manager for Google Container Engine and has been shipping software for 20 years in various roles; he has been a founder of three startups and has had stints at Microsoft, Amazon, Chef, and obviously at Google.

Next we have Kris Overholt. He is a product manager at Anaconda, the product manager for the data science platform. He has expertise in distributed systems, data engineering, and computational science workflows. He has a PhD in civil engineering from UT Austin, and prior to Anaconda he was at the National Institute of Standards and Technology (NIST), Southwest Research Institute, and the University of Texas at Austin. Hi, and anything else?
Kris Overholt: Hello everyone, thanks for having me here; it's an honor. The headquarters of Anaconda is about five blocks over, so this is our home base. We have about a hundred employees here. If you're not familiar with Anaconda, it's the leading open source data science distribution. We have just about 5 million unique data science users around the world, on Windows, Mac, and Linux, and it's a lot of the foundational pieces of data science and machine learning, and a gateway for folks to get into things like notebooks and machine learning with TensorFlow. So I'm excited to talk about that here today.
Tushar Katarki: Five million. That's a big number. Thanks, Kris. And last but not least, we have Matt Farrellee, a senior engineer and architect from Red Hat. He is a founding member of the radanalytics project, which he's going to talk about in a bit, and which is an open source data analytics and ML platform based on Apache Spark on OpenShift and Kubernetes. He was one of the founding members of Sahara (I could say that, right?) in OpenStack, which is the OpenStack big data processing project, and he was involved in the University of Wisconsin Condor project, which some of you might know; that was the high-throughput computing project, with some of the early pioneers in distributed cluster computing. Matt, say hi.

Matt Farrellee: Hi.
Tushar Katarki: All right, thanks. I've known Matt for several years now, including through that University of Wisconsin Condor project. All right, so let's dive right into it. That was the introduction, so let's dive right in, so that we have a common understanding. What I'd like to ask is: what does AI and ML mean to you? I'll start with you, David, and we'll circle around, or we can go in any order. What does it mean to you? Scope it a little bit: what is it, and what is it not?
David Aronchick: Absolutely. So I always think it's funny, and somewhat not great, when people joke about AI coming to murder us all, and I am as guilty of that as anyone. But please: there are many, many people in the world who do not understand AI and ML, and hearing it, even as jokes, from the people who do know is not so great. So let me plead about that.

When I think of ML, I basically think there are three categories of problems that it really unlocks that you've never seen before. The first are where things are hard today but they're tractable; you could at least potentially do it. That might be, if you had a million pictures, identifying which pictures had dogs in them. You could do that today with a standard algorithm, you could do it today with humans, and so on and so forth. It would just take a really long time.

Then there's the second category, which is problems you know how to solve, but effectively they would be impossible to solve with computers at all, and that would be something like beating Go, for example. That's something where we know the rules of Go, and theoretically you could beat it, but no, you could not use standard computational techniques today to beat Go, or be better than humans at that problem.

And then the third is where we can't even really describe how to solve the problem. We know what a solution looks like, or when we've succeeded, but there's no algorithm that we could come up with to solve it, and that would be something like identifying cancer in radiology, for example. We have kind of a generalized heuristic, but if you got ten doctors together, they might still disagree. Yet even now you have AI, excuse me, ML, being able to look at this problem, make an assessment, and be better than humans are today, and it's still improving even beyond that.

So that's all to say: those are the three commonalities of those problems, and that's where I'd say the definition of ML is. ML is being able to solve a problem without necessarily understanding exactly the method used to get there. And that's not great, right?
Kris Overholt: Yeah. When I think of machine learning, if I initially scoped that to a library, say in Python or R, it's just a collection of algorithms, statistical algorithms, that can be applied to different data sets. So, sort of: import a library, run it on some data. That's the beginning of machine learning. But I often think of it on an implementation timeline: how we make that useful for other people and how we democratize it.

The next two big stages I think of are these. You have a library, it has some statistical functions, you do things like model selection; there are dozens of steps beyond that. Once you have something sort of working ("it works on my machine"), how do I share this and make it useful for the rest of the world to build and improve on, and to match up with the open source philosophy? I think things like standardized formats, whether it's serializing data or sharing models in efficient ways, allowing other people to build on that modularly, are a big deal. And when you start working with larger and larger groups, whether that's a foundation or an enterprise team, things like reproducibility, governance, and traceability of those models become very important. Then there's deploying that out, so people don't have to follow dozens of steps to get it up and running; it should be very easy to get up and running in any environment: HPC, cloud, on-premise.

And then beyond that stage of deployment and usability really comes the consuming of it, right? We actually want our greatest audience to be able to consume that in an interactive visualization or just a browser. There may be a complicated technical stack underlying that, and it all starts at the library and infrastructure level, but we think a lot about this, and as I've watched different industries evolve, it's all been about consolidating these models and APIs on a common framework and common tool set, to really democratize the audience of people building and consuming.
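The "import a library, run it on some data" starting point Kris describes can be sketched minimally in plain Python. The hand-rolled nearest-neighbor classifier below is a stand-in for what would normally be a single library call (for example from scikit-learn); the data and function names are illustrative, not from the panel.

```python
# Minimal sketch of "run a statistical algorithm on some data":
# a 1-nearest-neighbor classifier written in plain Python.

def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(range(len(train_points)),
               key=lambda i: sq_dist(train_points[i], query))
    return train_labels[best]

# Toy data set: two clusters in a 2-D feature space.
X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
y = ["a", "a", "b", "b"]

print(nearest_neighbor_predict(X, y, (0.1, 0.0)))  # prints "a"
```

Everything after this step (sharing, reproducibility, deployment) is the implementation timeline the panel goes on to discuss.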
Matt Farrellee: Interesting answers. I take kind of more of an engineering approach to it. I think of AI as a large body of research that's been ongoing for many, many decades. Within that you have things like knowledge representation and machine learning, and then within machine learning you have things like neural networks and deep learning and whatnot. So I kind of think of it from a structured perspective that way.

When it comes to the scope or the impact, it's more about how AI and machine learning are giving us ways to interpret the world, interpret all the data that's around us, and giving us new ways to interact with the world and with other people. You gave some examples of machine learning apps that people might interact with on a daily basis if they could, say, buy a Tesla or something like that, but in reality AI and machine learning are really ubiquitous already. Google search is an example of this that's part of people's lives at this point.

On the question of what it is not: I'm going to kind of violate David's comments a little bit, but it's not the destruction of humanity, it's also not the savior of humanity, and really, it's also not a salad dressing, although given how hyped it is right now, people might say it is.
Tushar Katarki: You know, one of the standard things you hear in this context is, "Oh, it'll kill all the jobs," and I think it's not even that. Just take the example of driverless cars: driverless doesn't mean that you can sleep inside. For the next 10-15 years you'll probably still have to pay some attention.
Matt Farrellee: It may not cause as many of these revolutions that people think are going to destroy humanity or whatnot. And, spoiler, we're still here, and things are, for the most part, getting better. The same thing will happen with AI: I suppose as a lot of the hype settles down, the reality of what people can do with it, and how it interacts with your life, actually becomes more clear.
David Aronchick: I'd like to support that; I actually just want to say one thing on top of it. Yes, it won't kill us, and it won't kill all the jobs or anything like that, but only if all the people in this room and the people watching are responsible and think about it. You are all technology implementers and creators, and so on. Please do think as you're doing this stuff; don't rely on someone else doing the hard work and being aware of it.
Tushar Katarki: Very good, and that's a good introduction to the topic. So next, what I was going to ask each one of you (and you can go in any order) is: what is a favorite use case for you? Something that you get excited about: "Oh, I want to make this work today, because it'll solve this problem." What gets you excited?
Matt Farrellee: One: I've never been particularly good at foreign languages. I took lots of foreign languages in high school and college, but I never really immersed myself in environments where I could actually use them to communicate with people. I think the translation capabilities that are coming out right now are really going to make it much easier for people to communicate, and much easier for me to communicate with people I wouldn't otherwise be able to.
Kris Overholt: For me, a little bit of my background is in civil engineering, specifically life safety systems and building protection systems, so I've really kept an eye on building systems and building integration as it comes together across many different manufacturers. If you look at a building, inherently it's sort of boring: a boxy structure with rooms. But as soon as you start recording and logging information like energy usage, temperature, and occupancy, and you put all of that together in aggregate, you actually get a really beautiful picture of how that building behaves as it interacts with people and people interact with it. At a city scale it helps things like emergency responders, and it's a nice example of how to bring something that was formerly static online, something we can monitor over time that becomes integrated with many, many different subsystems of many different types. So to me, the complexity of that, and how we wrap it all up into some useful metrics for people, is pretty awesome.
David Aronchick: I spend all my time in this space now, so I'm always astounded at everything. I'll try and keep it super brief. I have a talk tomorrow, and I'm going to give away some of the stuff I'm talking about, but one example that I love is from Google. As you may know, we have a lot of data centers, and we hire some smart data center people, and there's this term in data centers called PUE: power usage effectiveness. Our data center engineers looked at all these fans and water and cooling and things like that, and they kind of look like signals for ML. So we hooked it up, and the power usage effectiveness dropped, bam, right down. Literally, and we're very public about this, we saved 15% on our power just by using ML against these data centers.
David Aronchick: The incredibly simple summary is: you take two AIs, or excuse me, two ML models, and you pit them against each other. The first one tries to figure out a solution, and the second one tries to figure out something that breaks the first one's solution. You just force them to go against each other.
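The "pit two models against each other" idea David describes can be illustrated with a toy sketch (my own minimal example, not Google's system): a proposer holds a candidate model, and an adversary keeps supplying the inputs where that model is currently wrong, driving the proposer toward the true rule.

```python
import random

random.seed(0)

# Ground truth the proposer is trying to learn: points >= 0.6 are "positive".
TRUE_THRESHOLD = 0.6

def proposer_predict(threshold, x):
    # The proposer's current model: a simple threshold classifier.
    return x >= threshold

def adversary_find_counterexample(threshold, trials=200):
    """Search for an input where the proposer's current model is wrong."""
    for _ in range(trials):
        x = random.random()
        if proposer_predict(threshold, x) != (x >= TRUE_THRESHOLD):
            return x
    return None  # no counterexample found; the models (approximately) agree

threshold = 0.0  # proposer's initial, deliberately bad, model
for _ in range(1000):
    x = adversary_find_counterexample(threshold)
    if x is None:
        break
    # Nudge the proposer's model toward correcting this mistake.
    threshold += 0.05 if x < TRUE_THRESHOLD else -0.05

print(round(threshold, 2))  # settles near 0.6
```

Real adversarial setups (GANs, self-play) use learned models on both sides, but the feedback loop is the same shape: one side's failures become the other side's training signal.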
Tushar Katarki: Cool, very cool. All right, so what I'll do next is try to get deeper into it. I talked earlier about how this all sounds very exciting, but it's also complex, and it's not easy. So, starting with David: what are some of the challenges in AI and ML that you see today, and how is Kubernetes playing a role in it? And maybe this is a time to talk about one of the things that you were going to talk about.
David Aronchick: Sure. The first is really the approachability of ML today. If I sat any of you non-ML practitioners down and walked you through what the average ML person does, you'd be shocked at how absolutely tribal and back-of-the-envelope ("oh, maybe I'll tweak this number") the way it is right now, which is really disturbing for a bunch of us folks who think there should be a standard process. We're going through this exploration, and I think we'll get a lot better there.

But part of it relates to the second big problem, which is real transparency and understanding: being able to probe into a model. If I run an application today, I can attach to the process and see exactly what's being called at what time. With an ML model, everything is very bespoke: you kind of piece together whatever works and however you would like to approach it, and that's not great, because it means standard tooling doesn't work anymore. You now need to not only create this stack, but then, on the other half of that, create a set of tooling that supports your stack and lets you introspect, lets you understand what's going on, lets you auto-tune, and all those various things.

That said, that's where I hope Kubernetes can help us. Today, people generally build their stacks from the ground up. They understand exactly what version of Python, what libraries they're running, what networks, all these various things, and that's just too much for the average data scientist to approach; the data scientist shouldn't have to think about that. That's where Kubernetes has really changed the game. It creates this wonderful standard abstraction over the infrastructure that you're running on, and not just an abstraction, but actually rich objects that allow you to interact with various components of the platform and, to your point, help you wire a bunch of services together. So I am very optimistic, and like I said, I'll talk in just a little bit about what I think the future looks like for ML on Kubernetes.
Kris Overholt: Definitely. In terms of data science with Python and R, for many of our users, if you sort of climb the stack over the past couple of years with them, Anaconda solved the problem of "I need to get Jupyter up and running with TensorFlow, with all of its Fortran and C dependencies, as quickly as possible, across Windows, Mac, and Linux." So that was a good thing.

The next question was: now I want to share this analysis, or this model, or this server, or this visualization, with my buddy on a different operating system. So they need to install system packages, allow these things on their firewall, get these other libraries, and oh, they missed the version of this. So Docker was a nice addition to something like the Anaconda Python distribution: sort of where the open source Conda package manager left off in terms of environments, it took over and said, now I can bake everything into an image, and it's very portable.

And then the next layer was resource management, scalability, and orchestration, and that's where Kubernetes came in. Because what we found about a year ago was that our users were building amazingly different things, amazing things, with Anaconda, and the last mile was: now what? How do I deploy this thing out as a data science deployment? Now they no longer have to worry about things clobbering one another in an environment; it all just works at that abstraction layer. These are people who just want a microservice: they just want their model to run, to share it, and to run it alongside others so they can build on top of it, without having to go through all of that over and over and over. So this has been very important for enterprises adopting things like machine learning, for large organizations working together, and for really democratizing environments.

The fact that they can deploy to the cloud or on-prem without changing anything is a game changer. That means we don't have to switch APIs every single time we want to deploy somewhere. So really, in the last year, data science deployment of anything, from interactive visualizations to models to machine learning libraries and all of the above, has just become that much more pervasive through things like Kubernetes, containerization, isolation, and orchestration.
Matt Farrellee: So, quickly: I think Kubernetes has done a tremendous job of providing the API, the interface, that's expected by operations people, by sysadmins, by developers. It has really codified a lot of their best practices from many, many decades of experience. With AI and machine learning, there is a shift in the way that the systems operate and the way that they're built, which Kubernetes is going to have to adapt to, to some extent. There's an understanding that I think is really being formed (we'll talk about it with data later) of how data science operates, what expectations data scientists have, and then what expectations the things that they build have on the infrastructure that they're running on.

A concrete example that we usually use is thinking about rot. Rot from a developer perspective: you've deployed some piece of code, and rot is something that happens over long periods of time, usually with some sort of dependency changes or some sort of input changes, and things fail, and they fail in a fairly drastic fashion. The infrastructure, Kubernetes, understands how to deal with systems that do that. AI and machine learning systems are inherently more statistically based. They don't give you that clear "boom, something fails." They just start performing sub-optimally, and the challenge is detecting that suboptimal performance and being able to build infrastructure that can respond to it.

The second thing we're doing is starting to have the conversation as to how we influence, how we give back, what these best practices are. Chris mentioned the radanalytics work that's going on; this is an output which is starting to capture some of the understanding that we've built up over the last number of years using machine learning.

And then the third thing we're adding to this is that we're actually putting out services and software for our customers that don't have a big "AI/machine learning" stamp on them, because in the end it's a tool to do something, but that are actually powered by AI and machine learning. Chris mentioned two earlier today; you mentioned the Red...
Tushar Katarki: Who wants to talk about some of the work that is happening? I mean, we talked a little earlier about the Kubernetes resource management working group, some of the work that is happening with respect to GPUs and the enablement of that, and some of the work that is happening in Kubernetes. Who wants to talk about that?
Matt Farrellee: I'll throw in a couple of words. There's work happening with those working groups around the kind of hardware technology that is becoming more and more important for these machine learning algorithms, making sure that that hardware is exposed and accessible to the algorithms that are actually running on top of OpenShift. There's also the performance-sensitive application pod work that's happening, to really make sure that Kubernetes has a very solid foundation.
Tushar Katarki: All right, so the next question I was going to tee off is this. One of the things that everybody is thinking about, as everybody makes these decisions, is: why should we care now? Can you address that, especially around how data is important, and that even if you decide to do something today, you might not have collected the data?
David Aronchick: That's fine. So I think that we are awash in data in a way that we've never been before. Literally, we are collecting data from every movement: every device, Fitbit trackers, the sensors in this room reading heat, thermostats, and so on and so forth, all the way up to the largest possible number of queries, user behavior, and things like that. So we're at this phase where it's just absolutely transformative relative to data, and I think there will be a really important transformation that goes on. Researchers out there nowadays would argue that, with all the hardware investments and all the data investments, we have everything that's necessary to make these great decisions.

Building a model is very, very small versus ingesting, getting rid of outliers, feature engineering, transforming it, moving the data around in a pipeline in a regular way. Let alone, after that comes out: are you tracking it? Are you being responsible security-wise? All that good stuff. I think a lot of this is gated on the process and the pipelines, rather than just the actual implementation of building your model.
Kris Overholt: That's a lot of hard work that goes into that, and things like standard data formats and best practices, for things like Apache Parquet or columnar data stores, have become so important. From the Python and R data science perspective, Python can connect to just about every data format and data source you can imagine; it's part of the Python data science philosophy that it just connects to all sorts of remote data and compute sources. But what really happens is we're seeing users exercise this: for a given problem, it's best to use Parquet stored in this particular data store, for performance reasons, for training reasons. So we see these patterns get exercised in different verticals.

Another interesting thing I've seen in the last year is generating synthetic data for training. Sometimes you just don't have enough data. It started with needing to do model selection on natural language processing or image classification, and we've seen really interesting use of generating huge datasets in parallel that can be used for the iterative training process, and then you can bring in the real data on a rolling basis.

So between those two things, data formats and data storage, especially with many remote sources: it is hard work, and it's something that was recognized up front in Kubernetes and containers, and it's going to be hard work to continue to maintain. But we're going to learn the high-value connections of things like standardized data formats. Whether the training is MNIST classification or NLP, it's orders of magnitude of difference in performance when you use the right tool for the right job, with the right data format and the right data storage. So that's all starting to come together, and I think we're learning a lot about that together, even in the open source and cloud activity that's going on.
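The synthetic-data idea Kris mentions can be sketched with a toy example (my own illustration, not Anaconda tooling): generate a large labeled training set cheaply from a known rule, fit a trivial model to it, then apply that model to a handful of "real" points that arrive later.

```python
import random

random.seed(42)

# Known rule used to label synthetic samples: class 1 if x + y > 1.
def true_label(x, y):
    return 1 if x + y > 1.0 else 0

# Generate a large synthetic training set cheaply and in bulk.
synthetic = [(random.random(), random.random()) for _ in range(5000)]
labels = [true_label(x, y) for x, y in synthetic]

# "Train" a trivial model: estimate the decision threshold t in "x + y > t"
# from the boundary between the two classes in the synthetic data.
pos = [x + y for (x, y), lbl in zip(synthetic, labels) if lbl == 1]
neg = [x + y for (x, y), lbl in zip(synthetic, labels) if lbl == 0]
threshold = (min(pos) + max(neg)) / 2

# Real data arrives later, on a rolling basis, and reuses the same model.
real_points = [(0.9, 0.8), (0.1, 0.2), (0.7, 0.6), (0.3, 0.3)]
predictions = [1 if x + y > threshold else 0 for x, y in real_points]
print(threshold, predictions)  # threshold near 1.0; predictions [1, 0, 1, 0]
```

In practice the generator is much richer (rendered images, templated sentences, simulations), but the workflow is the same: synthetic data bootstraps model selection before enough real data exists.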
David Aronchick: Thank you. Well, my plug is for my talk tomorrow, so please come, but I do want to talk about something that we're doing, just between everyone in the room and those on Facebook. We're launching something which is designed to solve exactly a lot of the problems that we talked about here on stage. It's called Kubeflow, and the idea is that it is a standard ML stack for running ML on top of Kubernetes. It is not about re-implementing all the great, hard work that's out there in the world: TensorFlow, XGBoost, scikit-learn, anything like that, or any of the UIs or any of the transformation tools. Much in the same way that Kubernetes didn't go and re-implement a database or a serving tool or anything like that, it just allowed you to take that containerized tool and spin it up in a very elegant way, and not just elegant, but also portable and very scalable. So you could deploy it to your laptop, you could deploy it to a GPU rig, you could deploy it to a cluster, all with the same command, repeatably. And that is something that we're very, very happy to get out the door, because this is something that we hear so often from customers: "Oh geez, I wanted to go do ML, but I had to completely re-implement that stack, or I had to build it myself, or my data scientists had the wrong version of Python and so everything failed."
Kris Overholt: So if you haven't tried Anaconda, it's a free download for Windows, Mac, and Linux, with up to a thousand libraries for Python and R in any area you can think of: image classification, natural language processing with NLTK and gensim, Jupyter notebooks, and machine learning. We've been very busy adding more and more libraries, including TensorFlow with GPU support, and the nice thing is you just conda install it. It's all precompiled across Windows, Mac, and Linux, which makes it very easy to use, and free to use, on any of those platforms.

And then we have Anaconda Enterprise, which you can sign up for a 30-day trial of. It's pretty much the manifestation of a data science platform, with collaboration, authentication, and security, but it's all powered by the underlying Anaconda distribution and the Conda package manager. So if you haven't used it, and you're tired of living in dependency hell and dealing with Fortran and C system libraries when doing machine learning, try out Anaconda and let us know what you think.
Matt Farrellee: Cool. I'm going to hook into that here too. I think one of the really important things that we should be looking at, when it comes to something like Kubeflow and what David is going to talk more about, is that there are many, many organizations out there who have been producing bespoke solutions for building these pipelines, building these flows, trying to put them into production, trying to address how data scientists work and how operations folks work.
Tushar Katarki: I'm going to put in one more plug too. If you go to commons.openshift.org, halfway down the page there is an ML working group that we're starting up on the OpenShift Commons. So if you're interested in this, and you want to get involved and hear more about the best practices and lessons that we're learning around Kubeflow, please sign up there as well. So, do we have any questions in the audience? I know it's towards the end of the day. There's...
[Audience question, inaudible.]
Now, for me, it's a little bit more rudimentary, but it's exciting to watch our users going through the process of, you know, turning their batch jobs into models that are constantly training, constantly running, instead of sort of this daily or weekly thing. That's exciting, because there's a lot of wasted time that goes into these daily iterations, as opposed to just bringing something online and having it run on an ongoing basis. And the other part that's interesting, again.
D
It's not cutting edge, but it's watching our users refactor the way that they work into microservices. So what would have previously been a monolithic image classifier with a UI built onto it, with a very specific declarative way of doing something, let's say recognizing images or edges, is now completely different. What we're seeing our users do in the past year is build a specialized API that just does the classification, and a specialized front end for that.
D
That front end is modular and can swap between the different backends. So actually watching that roll out to the larger masses, and not just the bleeding-edge developers, is really nice to watch, and it lets us sort of focus on the best tool for the job instead of a monolithic approach to everything. So we're starting to see those projects get deprecated and sometimes broken up into microservices that are actually healthier than the original monolith. So it's exciting to watch that happen as things get adopted more and more across the industry. So.
E
Just two quick things then. One is, I want to add on to David's comments about transfer learning. People need to watch this space, as the vast majority of the complexity that's happening in data science is in the data engineering work, the model design, and whatnot, and being able to reuse that as a developer is going to be hugely empowering. So that's really key to watch out for, to your question about something happening in the predictive space.
E
A
I mean, at least from my perspective, to add to that: something that you mentioned, using the digital exhaust, like logs and metrics and stuff like that. How do we make our systems smarter in terms of scaling, or in terms of even better fault tolerance, et cetera? That is something of a lot of interest for us from the Red Hat perspective. One more question: all right, hi.
G
E
We really need to become data literate. We need to understand what the sources are. We need to be teaching people to understand what data is, how data is used, and what the potential is. And really, from a personal perspective, also looking at understanding what, to use the term, our data exhaust is in the world today, I think.
D
I think a big piece of empowering both the producers and consumers of ML and AI is transparency and reproducibility of the models and the analyses themselves. So I think a bad example is treating something as a black box: it only runs in a certain environment, and we don't really know why it works so well, but it works great and saves us money. That's not a good approach. You know, I come from civil engineering, very hands-on physical engineering. I think ML, to me, is the same.
D
I often relate to it in this way: when I think about the hyperparameters or distributions that are going through, I want to see those all the way through. I don't ever want to see a step where I don't really know what happened to that distribution or hyperparameter, but it looks good. So I think, as producers, we need to be very careful to document and to be very open about that.
D
D
C
I think those are both terrific answers. If I were going to generalize a little bit, there are two key things that these both factor into. Anyone familiar with Chaos Monkey knows it's the Netflix tool that they used to actively kill machines at random to tease out issues. Technology is not neutral. We need to be aware of that. We are technologists; we need to be aware that it is not neutral.
C
It has a positive or a negative effect, and it is up to us to be our own chaos monkeys for the technology we roll out. We need to be probing in every possible way, and be mindful: hey, have I checked to make sure that this model I rolled out doesn't actively bias against a certain population, whatever that might be? Have I checked what the edge cases look like here? And hey, is this an area.
F
C
F
To follow on: you guys were talking about it from the perspective of a producer of ML technologies, and I'm thinking about it as a consumer of ML technologies. Is there development of some kind of transparency guidelines that we can use to figure out, okay, when an ML model makes a certain decision, why is it making that decision? And is there a way to tell if I'm consuming, say, Google's version of this algorithm versus Microsoft's version of this algorithm?
F
A
C
You will never know that; yeah, that's their goal. But with any AI you like, or any solution (and this is not AI- or ML-specific), they're just going to experiment. They want to see whether or not a new thing works. So that's the problem. Well, not the problem; you know, I know technology is not the solution or a panacea or anything like that. My hope is that, and I know.
C
This is my job, to pitch my new thing, but my hope is that by creating standardized ML stacks with somewhat standardized, reusable components, we will develop standardized, reusable transparency tools for that. Right now, though, it is impossible. For example, there are two very, very popular image recognition models out there right now, ResNet and ImageNet. They're both very, very successful; they both perform better than a human right now. You could not use transparency analysis tools that you built for one with the other.
C
They are just completely different layouts and models, and so on. And so my hope is that by building some of these standards, you can do it. But let me make a pitch out there: I would love someone to build Chaos Monkey for ML, meaning you don't need to introspect into the model. You could build this and say: hey, I have a set of multiple different population types as data that I can feed into your model and get the results back on the other end, and it doesn't have to be real humans.
C
They can be totally anonymized. But if, at the end, your model comes out and it's biased, then you're like, eh, something bad is happening here, and that didn't give you the transparency that we all should demand. And literally there are a hundred PhDs working on introspecting into models today. But at least then we have some awareness. And so I will pitch that, and I will endorse it and find Google engineers to help you, if you want to lead.
C
You know, that kind of thing exactly, but test cases where it's not like we know what the population source was. The population source is not made available to the model; you just hand these objects in, and some results come out the other side, and that test on the other side looks for bias against populations.
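A minimal sketch of the "Chaos Monkey for ML" idea just described: treat the model as a black box, feed it cohorts of anonymized records that differ only in group membership, and compare outcome rates on the other side. The model, records, and tolerance here are all made up for illustration:

```python
from typing import Callable, Dict, List

def positive_rate(model: Callable[[dict], bool], cohort: List[dict]) -> float:
    """Fraction of a cohort that receives a positive outcome."""
    return sum(model(record) for record in cohort) / len(cohort)

def probe_bias(model: Callable[[dict], bool],
               cohorts: Dict[str, List[dict]],
               tolerance: float = 0.1) -> bool:
    """Black-box probe: pass only if positive-outcome rates across
    population cohorts stay within `tolerance` of each other.
    The model is never introspected, only queried."""
    rates = [positive_rate(model, cohort) for cohort in cohorts.values()]
    return max(rates) - min(rates) <= tolerance

# A deliberately biased toy model: approves based on income,
# which in this synthetic data correlates with cohort membership.
toy_model = lambda record: record["income"] > 50

cohorts = {
    "group_a": [{"income": 80}, {"income": 90}],
    "group_b": [{"income": 30}, {"income": 40}],
}
# probe_bias(toy_model, cohorts) fails here: group_a is approved
# 100% of the time and group_b 0%, far outside the tolerance.
```

The point is exactly the one made above: the probe never sees inside the model, so the same harness works against ResNet, a competitor's API, or anything else you can query.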
C
E
So, two quick things. One, going back to my definition of what AI is, what machine learning is, and whatnot: we need to understand that there are some areas in machine learning, and neural networks and deep learning capabilities are one particular area, that are being applied a whole lot right now and that specifically have interpretability concerns associated with them.
E
There are other approaches that are better in some use cases and worse in others, like image recognition and speech, things like that, but that are interpretable. When it does come to neural networks, I think we need to extend your question: it's not just a focus on the model; it gets more to, sort of, what David is talking about.
E
I
Then one more question. Yeah, David, to your point: I think we had Chaos Monkey, like, ten years ago in the financial industry; they all used AI, and they never saw it coming. But what I wanted to ask is: where do you think the main contributions of AI will be when we talk about things like the self-driving data center? When I listen to your answers, David, I think you're hedging a little bit, in that you say right now the complexity is too high and we have to focus on abstraction.
I
C
So, if that's what you took away, I apologize; I don't think the complexity is too high at all. We have existence proofs of us solving that problem. I think, in my opinion, right now the problem generally relates to the approachability of using AI or ML for your data center, for your self-driving data center. That is too high, and by that I mean literally the interface between a model and your system is broken.
C
It's highly bespoke, meaning either I have to rewrite my model in some very specific way, or I have to build some crazy feature engineering tool to translate the data that I have into something that's actually usable, or, even if I get answers, do I have the correct feedback loop so that, as I take action on my answers, it's feeding back properly? All of that is broken right now. It's less about complexity and more that it's not very approachable or very implementable.
C
D
C
I'm not holding this up as the end-all, be-all solution, but my hope is that we're able to develop some standards as an industry around stacks: ways to ingest data, ways to spit out answers, getting feedback loops, all that kind of stuff. And to be clear, like I said, we have data centers where we do it; at Google we have internal services that literally self-drive our data centers. So it's absolutely possible; it's just, how do we make that available to everyone?
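One way to read that hope about industry standards is as a common contract for ingesting data, returning answers, and closing the feedback loop. A hypothetical sketch (this is not Kubeflow's actual API, just an illustration of the shape such a standard might take):

```python
from abc import ABC, abstractmethod
from typing import Any, List

class ServableModel(ABC):
    """Hypothetical standard contract: ingest data, answer queries,
    and accept feedback, regardless of what model sits behind it."""

    @abstractmethod
    def ingest(self, records: List[Any]) -> None:
        """Accept raw data in a standard format."""

    @abstractmethod
    def predict(self, record: Any) -> Any:
        """Return an answer for one record."""

    @abstractmethod
    def feedback(self, record: Any, outcome: float) -> None:
        """Close the loop: report the observed outcome so the model
        (or its next retraining run) can improve."""

class MeanPredictor(ServableModel):
    """Trivial implementation: predicts the running mean of the
    outcomes it has been fed back so far."""

    def __init__(self) -> None:
        self.outcomes: List[float] = []

    def ingest(self, records: List[Any]) -> None:
        pass  # nothing to precompute for this toy model

    def predict(self, record: Any) -> float:
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def feedback(self, record: Any, outcome: float) -> None:
        self.outcomes.append(outcome)
```

With a contract like this, the "self-driving" controller only depends on `predict` and `feedback`, so swapping a toy model for a production one is a deployment detail rather than a rewrite.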