From YouTube: 16 OpenShift and Machine Learning at ExxonMobil
Description
OpenShift Commons Gathering @ KubeCon NA, San Diego, November 18, 2019
Audrey Resnick: Okay, so I'll tell you our story. My name is Audrey Resnick; I'm a data scientist, and I work with the Computational Data Sciences group within ExxonMobil. As a data scientist, probably two years ago, there were just three of us in this group, and what we were trying to figure out was: how can we get our proofs of concept over to our customers? And this was actually a challenge for us, because we'd come up with these really great ideas, and then we'd sit there, look at them on our screens, and go, darn.
I mean, who wants to sit there and install a lot of software when you just want to see if the problem you have can actually be solved, or whether it's even a problem you should be looking at? So we started our search, and we first started with Jupyter notebooks, and we said, well, this is a good way.
Then, having lunch with one of my colleagues, Chad Furman (we're going to have to get "Friends of Chad" buttons, because he goes around ExxonMobil giving out all these ideas, and some of us end up here talking about the stuff we've come up with), he said: well, why don't you take a look at OpenShift? Because, basically, what you can do is take your entire environment and create a container.
So, instead of worrying about giving people local admin access, or worrying about the latest source code, or even worrying about some of the dependencies that you have, you can contain all of this in an atomic unit that you can go ahead and deploy. And we sat back and said: that's really a great idea; if this works for us, we would really have something.
It ties really well into the agile process, so that we could work through our code a lot more iteratively and really quickly push out those minimum viable products. And I'll give you a taste of that right now: last year at this time, we had pushed out two minimum viable products, because we had just started using OpenShift in earnest. This year at this time, we're already past seventy. Okay, that's seventy, seven-zero. That's huge!
So my goal last year, as a data scientist, was: I want a data science environment for myself, my colleagues, and my users that is interactive and reproducible and gives me some sort of collaboration. So what we ended up doing is saying: let's get away from this snowflake-type setup and create some sort of workflow where our code would be coming out of a Git repository. And for you guys still here, yeah: for some of the scientists, this was a new thing.
Their Git repository was their H: drive, okay? And that's a challenge: you have to get folks to determine, you know, how are we going to work together? And OpenShift actually allowed us to do this as well, because we could then put a workflow together: we would take the code out of Git, build the notebook into an image with source-to-image (S2I), push it to OpenShift, and then basically have our users or our colleagues be able to hit a URL. And that was huge for us, because that's the interactive, reproducible, and collaborative environment that we wanted.
So what we did last year is we took that one step forward. We said: hey, data scientists, we're going to use agile methodology. We want to be able to talk about the problem that you're developing.
We have this way that you're going to go ahead and deploy your product to your user, and, by the way, we're doing this using an OpenShift environment, and guess what: your user or your colleague has the ability, just by clicking on that URL, to go ahead and take a look at your product. For us, that's actually groundbreaking. It may not be for many of you, but for us this was huge, because before this, if we wanted to deploy a proof of concept, it would take us three weeks on something that we called "quick app delivery."
The other thing that we brought with us is that interactive feedback. As I mentioned, it tied in really nicely with agile development: we could then see, is this a solution? Okay, if it's not, we can go ahead, recycle, and keep on going, and for us, as I mentioned, that was just very nice.
And the other thing that we have with this data science delivery model: in the process of working through agile, if we determine that we have to bring in some sort of external package, or if we have to change direction a little bit, how are we going to add more packages or libraries? We can't just download them; ExxonMobil has this thing where you don't want to go ahead and download anything into our environment, because we're worried about malware and attacks. So what we have is an actual security portal, where our data scientists request packages.
Therefore, when you go ahead the next time and you want to actually use this package, guess what: we're not going to have 20 instances of it all over the place. It's going to be in one central location, where we can actually get the exact package that everybody will be using. Again, for us, that was another huge win, because I can attest to it: as a data scientist three years ago, with just the three of us, we had multiple versions of Python.
We had multiple versions of NumPy, and we could never agree on which one was correct; and some of you are laughing, yeah.
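The "one exact package for everybody" idea above comes down to pinning versions centrally and checking environments against the pins. A minimal sketch of that check, with made-up package names and versions (these are illustrative, not ExxonMobil's actual pins):

```python
# Compare the versions a team member has installed against the pinned,
# centrally approved versions, and report every mismatch.

def check_pins(pinned, installed):
    """Return human-readable mismatches between pinned and installed versions."""
    problems = []
    for name, version in pinned.items():
        have = installed.get(name)
        if have is None:
            problems.append(f"{name}: pinned {version}, not installed")
        elif have != version:
            problems.append(f"{name}: pinned {version}, found {have}")
    return problems

# Hypothetical central pins vs. one scientist's laptop:
approved = {"numpy": "1.17.3", "pandas": "0.25.3"}
on_laptop = {"numpy": "1.16.0", "pandas": "0.25.3"}

for line in check_pins(approved, on_laptop):
    print(line)  # flags the NumPy version drift
```

In practice this is what a single internal package index gives you for free: everyone resolves the same name to the same artifact, instead of twenty ad hoc copies.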
So you understand that. So, having this data science delivery model, and then being able to get these external packages and pull them in nicely, made this environment something that was very valuable to our data scientists and optimization engineers. And I'll just quickly go through this, because the other thing that it did is it actually turned some of our data scientists and optimization engineers into developers.
We don't tell them that. We tell them that, with the OpenShift environment, we have a platform where we're helping you develop success skills. So now, when you see that figure with the developer, just think of that as the data scientist. We're saying: this way, we're helping you easily store your code somewhere where somebody can get at it, and it's being automatically built into an image with source-to-image, and guess what, you don't have to worry about finding images.
We have those images within an image registry, and you know what, we can easily deploy them. So, unbeknownst to them, they're working in this methodology. We have over 40 data scientists at this point in time who are actually using source-to-image and are actually becoming developers; but, as we tell them, you're developing these great success skills to make you a better data scientist, by being able to release your proofs of concept very quickly.
So let me just switch gears a little bit and talk about the machine learning that we have at Exxon. One of the examples that my group particularly works with is optimization and surveillance, and the chart that you're looking at right there is just a machine learning model to predict the well flow, within the OpenShift environment.
One of the things that we looked at, and kind of struggled with at the very beginning, was: if we were going to go ahead and work with the data scientists, we didn't want to have over 70 different containers. So what we quickly did, looking at the different types of problems that people worked on, is come up with three images that we use among our data scientists and that we also give out to the rest of the ExxonMobil scientists when they want to use them.
You can kick the tires on your model and see how well it works. And then, finally, we have an advanced container that we're building that only a very few of our data scientists are using right now, and that's because we use this container for some of the GPU work that we're doing, which is still a proof of concept. So, for those that are using PyTorch or TensorFlow and taking a look at some of our more advanced models, they're going to go ahead and use that third container.
So, while I'm talking about machine learning: for us, right now, we're doing our final setup. I'm really hoping this is the final setup, because we've been having fun with this since May. We have our final GPU cluster; we're using some NVIDIA V100s, and we also have some internal services with our high-performance computing center, where they also have an NVIDIA V100 cluster set up. And there are really two proofs of concept that we're working with, and I'll talk about them on the next slide.
So, for those of you who are not geologists (and I think I'm the only geologist in here, plus I guess I'm a data scientist, software engineer, jack-of-all-trades, whatever): that's rock data. We're looking at the porosity and the permeability, and we're determining: can we take a look at a number of our reservoirs from all over the world and see if we can make matches, and say, based on this type of reservoir, we see these types of characteristics? You know what, when we had a field in this type of reservoir, we produced X amounts of hydrocarbons.
This other reservoir looks similar in terms of the permeability and porosity and some of the other characteristics; it might be worth taking a bet on that one and seeing if it turns out the same.
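The matching idea described above, comparing reservoirs on properties like porosity and permeability and finding the closest known analogue, can be sketched as a simple nearest-neighbor lookup. This is a hedged toy version; the field names, numbers, and the choice of distance are illustrative assumptions, not the actual model:

```python
import math

def most_similar(candidate, known):
    """Return the name of the known reservoir closest to the candidate.

    candidate: (porosity_fraction, permeability_millidarcies)
    known: dict mapping reservoir name -> (porosity, permeability)
    """
    # Permeability spans orders of magnitude, so compare it on a log scale.
    def features(porosity, perm):
        return (porosity, math.log10(perm))

    cf = features(*candidate)
    return min(known, key=lambda name: math.dist(cf, features(*known[name])))

# Invented example reservoirs:
reservoirs = {
    "Field A": (0.22, 500.0),  # high porosity, high permeability
    "Field B": (0.08, 5.0),    # tight rock
}
print(most_similar((0.20, 300.0), reservoirs))  # closer to Field A
```

A real analogue search would use many more characteristics and a learned similarity, but the shape of the question is the same: which produced field does this new reservoir most resemble?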
So those are the two proofs of concept that we're working on right now, and with that I'm going to hand it over to Cori. I'm just going to mention: last year, there was myself and just one Red Hat contractor, and as of two months ago, we were actually able to create an actual enablement team for our computational data scientists. Cori?
Cori: Thanks, Audrey. All right, so yeah, one of the things that I guess Audrey sort of tricked me into was heading up this team that's called the enablement team; thank you. And so one of the things our purpose is to do is to create appropriately awkward conversations with the data scientists. We do peer reviews; the data scientists generally operate by themselves, focused very specifically on their machine learning or their models. So one of the things that we talk about with them is creating a pipeline of automation.
One of the things that we specifically go over with them is that one size does not fit all. We tried to actually give them a solution, "if we build it, they will come," and they did not like that; it didn't fit their needs. And we found out that a lot of these things are very quick and iterative; there's a huge wastebasket of ideas. So we simply enabled them to use webhooks with S2I. That was the simplest solution, and it was one that they were very happy with.
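The webhook pattern here is just: a Git push event comes in, and a new S2I build is kicked off only if the push touches the branch the deployment tracks. A minimal sketch of that decision; the payload field names mimic a generic Git push webhook and are assumptions, not a specific provider's schema:

```python
def should_trigger_build(payload, tracked_branch="main"):
    """Decide whether a push-webhook payload should kick off a rebuild."""
    ref = payload.get("ref", "")       # e.g. "refs/heads/main"
    branch = ref.rsplit("/", 1)[-1]
    commits = payload.get("commits", [])
    # Rebuild only for non-empty pushes to the tracked branch.
    return branch == tracked_branch and len(commits) > 0

# Invented example payloads:
push_to_main = {"ref": "refs/heads/main", "commits": [{"id": "abc123"}]}
push_to_feature = {"ref": "refs/heads/experiment", "commits": [{"id": "def456"}]}

print(should_trigger_build(push_to_main))     # True: rebuild and redeploy
print(should_trigger_build(push_to_feature))  # False: leave the app alone
```

In OpenShift itself this logic lives in the BuildConfig's webhook trigger, so the data scientists never write it; pushing code is the whole workflow.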
Another big thing that we learned:
We had one project with a month's budget that was burned through in about three days, using GPUs in a cloud provider. I won't tell you which one, but basically, this is one of those things where, when you're looking at GPU training or any kind of GPU use, you decide whether or not it makes more sense to actually just buy a rack of GPUs every two or three months instead of putting it in cloud space. That was one hard lesson that we learned.
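The buy-versus-rent decision above is back-of-the-envelope arithmetic: how many months of cloud GPU spend would pay for a rack you own? A sketch with placeholder prices (the rates and rack cost are invented, not real quotes or our actual numbers):

```python
def breakeven_months(rack_cost, cloud_rate_per_gpu_hour, gpus, hours_per_month=730):
    """Months of equivalent cloud usage whose cost would cover buying the rack."""
    monthly_cloud_cost = cloud_rate_per_gpu_hour * gpus * hours_per_month
    return rack_cost / monthly_cloud_cost

# Example: a $100k rack of 8 GPUs vs. renting 8 GPUs at $3/GPU-hour,
# assuming they run around the clock:
months = breakeven_months(100_000, 3.0, 8)
print(f"{months:.1f}")  # roughly 5.7 months to break even
```

With utilization that high, buying wins quickly; with GPUs that sit idle most of the month, the cloud side of the comparison shrinks and renting can still make sense.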
Another one: a lot of the data scientists are very focused, again, on specifically solving a very, very niche problem, so we've had to sort of bring them out of that mindset, and we do it by asking really simple questions. These are some of the questions we usually ask. Where's your data? We talk about data gravity, and we help them be more aware of that. We also ask them: where are your customers?
This is sort of an overall architecture that we're helping them to think about; this is sort of our stack, I guess you would say. Up in the top left, it's talking about cloud-ready applications. So we try to help facilitate the data scientists to think as developers, to break these apps up. A lot of times it's a single Python script that they've got, like 2,000 lines of code or whatever, and we're trying to break that out, make it more modular, and help them to think about collaborating with their peers.
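The modularity conversation above can be made concrete with a toy before-and-after: instead of one long script that does everything inline, break the work into small named functions behind an entry point, so each piece can be tested and reused by a peer. The pipeline steps here are invented placeholders:

```python
def load_data(raw):
    """Parse raw comma-separated sensor readings into floats."""
    return [float(x) for x in raw.split(",")]

def clean(values, low=0.0, high=100.0):
    """Drop readings outside the plausible sensor range."""
    return [v for v in values if low <= v <= high]

def summarize(values):
    """Compute the statistic the report actually needs."""
    return sum(values) / len(values)

def main(raw):
    # The 2,000-line script collapses into a readable pipeline.
    return summarize(clean(load_data(raw)))

print(main("12.0,15.5,-999.0,14.5"))  # the -999.0 sentinel is dropped
```

The payoff is exactly the "how are we supposed to support that?" question: a colleague can now swap in their own `clean` or `summarize` without touching the rest.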
Again, we create that awkward conversation of: well, how are we supposed to support that? And that really helps them to sort of get out of their own heads with this. So, specifically on the team, one of our personal focus areas is not necessarily trying to do something perfect. That is an ExxonMobil thing; we've always wanted to do something flawless, but this is a cultural thing we have to sort of break out of. So success is not about doing things perfectly.
It's about willingness to change and to be honest about where you are. Ultimately, this is far more important than anything else that you'll do. So, in the upstream enablement team, our focus areas, specifically around what we're delivering: consulting, which is probably 80% of our time right now, just because we're trying to bring people up (this is technical debt, but in the people and skill-set area), and education. A lot of these consulting engagements become education.
We do workshops that have been really helpful, just teaching them. We have various layers in Git: we teach them how to use Git as an individual, how to use Git as a team, and then how to use Git to collaborate externally, and to just look at the bigger picture. The other thing is that all of these things we're doing either lead to collaboration or to partnering with organizations internally.
Ideally, hopefully, we'll get to the point where we can collaborate externally as well; we're working on that. One of the things is JupyterHub: we're looking at Open Data Hub and JupyterHub as one of those enablers for self-service, and then bringing GPUs to the masses. So, right at the bottom here, I'm going to turn it back over to Audrey; she'll talk about sort of why we ended up where we are. And this is the user story, right?
Audrey: So again, I mentioned that last year at this time we had two proofs of concept, and these were for our clients up in Calgary, specifically within the Kearl mine. They had a number of trucks that would deliver material around the mine, and you can imagine, if you have 30 or 40 trucks on one road, and these trucks each weigh a couple of tons, that the road is going to degrade.
So one of the problems that we were given is: if we give you a starting time for these trucks, and we say when they're picking those loads up and where they're supposed to go, can you optimally create some sort of system where we can spread the trucks across different roads, making sure that when they get to the dump location or to an actual crusher location, they're actually taking the ore there, so we either get rid of the ore or we crush the ore finer?
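A toy version of the routing idea above: assign each haul to whichever road currently carries the least load, so no single road takes all the traffic and wear. This is a hedged sketch; the greedy rule, road names, and tonnages are illustrative assumptions, not the actual optimizer:

```python
def assign_hauls(hauls, roads):
    """Greedy load balancing: route each haul over the least-loaded road.

    hauls: list of (truck_id, tonnes) pairs
    roads: list of road names
    Returns (truck_id -> road assignment, tonnes carried per road).
    """
    load = {road: 0.0 for road in roads}
    assignment = {}
    # Place the heaviest hauls first so they spread across roads.
    for truck, tonnes in sorted(hauls, key=lambda h: -h[1]):
        road = min(load, key=load.get)  # least-loaded road so far
        assignment[truck] = road
        load[road] += tonnes
    return assignment, load

# Invented fleet and roads:
hauls = [("T1", 240.0), ("T2", 220.0), ("T3", 250.0), ("T4", 230.0)]
assignment, load = assign_hauls(hauls, ["north road", "south road"])
print(load)  # tonnage ends up split evenly between the two roads
```

The real problem also has pickup times, dump-versus-crusher destinations, and road conditions in it, but even this simple balancing shows why "buy more trucks" isn't the only answer.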
You might want to say, well, not really that, but you get the idea: we were able to say, no, don't buy more trucks; we're going to have a better way of actually going ahead and telling the trucks which location to go to. And I think another example I'll give is with graders. With some of the roads that we looked at, we said: you know, you say that you also need more graders.
Well, actually, we can show you that 60% of your graders are sitting in different locations and not working as efficiently as they could.
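That 60% figure is the kind of number that falls out of very simple fleet-status accounting. A minimal sketch, with an invented status snapshot standing in for the real telemetry:

```python
def idle_fraction(statuses):
    """Fraction of graders whose latest status is anything but 'working'."""
    idle = sum(1 for s in statuses.values() if s != "working")
    return idle / len(statuses)

# Hypothetical latest-status snapshot for a five-grader fleet:
fleet = {
    "G1": "working",
    "G2": "parked",
    "G3": "parked",
    "G4": "working",
    "G5": "parked",
}
print(f"{idle_fraction(fleet):.0%}")  # 60% of this toy fleet sits idle
```

Before buying more equipment, a chart like this answers the cheaper question first: is the equipment you already have actually being used?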
So those were some of the items that came out of the proof of concept. And one of the things I think was really important about this is that we did it as data scientists in the research center.
So
at
the
end
of
the
day,
I
think
that's
what
we
think
of
any
ways
as
democratizing
data
science-
and
here
are
my
colleagues
I
made
them
posed
for
this
picture
here
and
they're
very
happy
now,
because
we
don't
have
to
cram
everybody
into
a
room.
However,
we
can
deliver
that
proof
of
concept,
so
maybe
some
of
our
other
colleagues
in
in
Alberta
or
elsewhere
or
in
India,
can
actually
group
around
a
computer
and
take
a
look
at
a
proof
of
concept
that
we
deliver.