Description
OpenShift and Machine Learning at ExxonMobil with Cory Latschkowski of ExxonMobil.
Filmed on October 28th, 2019 in San Francisco.
So, my name is Cory Latschkowski, and I'm with what's called the Upstream Integrated Services Technology Enablement group at ExxonMobil. I'm going to disclaim really quickly: these slides were recycled slides that were approved by legal, so it's going to be a little painful. As you're going through this first part, I'm going to tell you a few stories that don't necessarily have slides. I'm grateful to be here. I'm grateful to Exxon for letting me come and share some of these experiences, and also to Red Hat for inviting me.
If you haven't heard of ExxonMobil, we're sort of a large organization. Some people would say that at our core we're actually a risk management company that happens to deal in oil and gas. We take safety very seriously, and that builds a culture around it. Also, a brief intro to me: I've been with ExxonMobil for about eleven years, maybe twelve now; I've lost track. I've moved around a bit.
I was an Active Directory domain admin for a research company for a few years. I eventually moved into HPC, high-performance computing, at ExxonMobil, where I was focused on large data processing. I became an RHCE, and ironically I don't think I had any Red Hat subscriptions that I was actually managing at the time. Then I moved to cybersecurity: I was an SME for internal digital forensic cases, did a log aggregation project, and worked with Hadoop, Splunk, and some other technologies. Looking back at it, there would have been a really interesting use case in there.
We had a data breach analysis that we had pulled in, and machine learning would have been really fun with some of that. About two years ago, I was pulled in as the platform architect for OpenShift. Why did I leave cybersecurity? It sounded really cool to work with Kubernetes and with OpenShift. Just curious: how many people in here started their journey with OpenShift before version 3?
Anybody in here? Man, you are a brave soul. We started out around version 3.5. I'm going to talk about this a little bit more later, but that's sort of where we started out. I came onto a team that was an agile team. It was part of a digital transformation effort, and we did everything from the full stack, hardware, to onboarding of customers.
One of the big wins there that I want to share: with such a large organization, there's a lot of overhead and process. One of the things we did was stand up an OpenShift instance in a cloud provider with GitLab, and we used GitLab as basically the authentication provider. We said, if you have a company email address, you can go and use this, and that was a huge win.
Another one of the big wins there was partnering with Red Hat. We didn't internally have the experience to pull this off, so we had to partner with Red Hat and also pull in contractors to build this team. I guess you could say we also had some accidental wins: we got lucky. We got some really good people who worked really well together.
A
They
were
integrated
during
some
restructuring
and
Leedy,
but
by
the
name
of
Audrey
Resnick.
She
partnered
with
Red
Hat
contractor
who
was
a
full-stack
developer
to
help
with
some
of
this.
This
work,
okay,
yep
we're
good.
So in this picture you'll see there's a Jupyter notebook running. This was what I like to call the beginning of the snowflake factory. We started out with: you get a Jupyter notebook, you get a Jupyter notebook, okay, everybody gets one. That was not a very consistent experience, and when you're doing machine learning you want to have reproducible results, so that created a few challenges.
So the goals of the work were to create an interactive, reproducible, and collaborative environment for the data scientists, and Jupyter notebooks were selected, as you're seeing here.
The main thing was: people were running these in the HPC environment, on Linux and Windows, on port 8000 or whatever you liked; it was all over the place. So the goal was to move from this local PC environment to more of an OpenShift one, and this was a huge win for the data scientists, to start having this standardization. It forced a lot of different things as well: it inherently made for more of a DevOps approach to things. I'll probably explain another example of some lessons learned there later.
But this accelerated a lot of the POCs that were being done. This is sort of the model they went through: more of an agile model, then pushing code. One of the big things was using S2I, source-to-image, to deploy these proofs of concept, then having a way to demo that, and then getting feedback from it.
One of the big wins here is also that, because OpenShift was risk-assessed, a lot of the controls had already been documented. Also security, so that was a big one, as well as bringing in the dependencies for your model: we had a security pipeline to bring in the artifacts for Python, which went into a Nexus repo, and the images that came out of some of these base images for the Jupyter notebooks were also part of that Nexus repo.
So we did a few things here. One is we took some situations that would generally take anywhere from months to deploy, using waterfall and some of the overhead of our internal procedures, and changed that down to minutes. Before this effort first started, between the data scientists and developers, I think they were producing one or two proofs of concept that were actually getting to customers.
Right now, I think we're around 70-plus POCs being produced by the same group. One big thing is source-to-image: not only the Jupyter notebooks, but also using source-to-image to deploy these POCs, and reusing data and connections to data. That was a huge one. In these images, a lot of the connections to data were through SQL Server or Oracle, and having those drivers already built into the images was very helpful.
Another thing that we looked at recently, working with Red Hat and Will Benton, was the idea of doing source-to-image model training: actually doing your training during the build process. So we're also seeing a CI/CD pipeline maturing.
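To make the source-to-image model training idea concrete, here is a minimal sketch of the kind of training script an S2I build could run, so the fitted model ships inside the image itself. Everything here is hypothetical: the data points, the `model.pkl` file name, and the ordinary-least-squares fit are stand-ins for illustration, not our actual pipeline.

```python
# Hypothetical sketch of "training during the build": an S2I build could
# run a script like this, baking the trained model into the image.
import pickle

# Toy training data; a real build would pull this from a data source.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

# Fit y = a*x + b by ordinary least squares, stdlib only.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

# Persist the fitted model as a build artifact.
with open("model.pkl", "wb") as f:
    pickle.dump({"slope": a, "intercept": b}, f)
```

Because the model is produced at build time, every deployed pod starts from the same artifact, which is exactly the reproducibility the notebooks effort was after.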
We had a few things that we learned in trying to solve this. One of the biggest problems is: where's your data? We had on-prem databases in various countries, and certain agreements limited where we could move or access data. And while development and deployment in Jupyter notebooks was much better, there were also some lessons learned, context that had not been captured along the way.
For example, here's a horrible story to tell: somebody was using Jupyter notebooks on their local machine, and they moved to OpenShift. They started working on it, they had done a lot of work, and their pod scaled down. Okay. Someone hadn't told them about persistent volumes, so you can imagine how that was a very painful but good lesson. That was sort of a journey in itself, getting to understand that context.
Also, one size does not fit all. We found that data scientists are very special, and one size does not fit all. We tried that anyway, and we found out, if you're thinking about the MVP model, some of them just wanted shoes: they didn't want a skateboard, they didn't want a bicycle or a car, and we were trying to get them all to work on the bicycle.
So again, we just focus on basic fundamentals, like webhook integrations and using integrations with Jenkins and OpenShift.
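To sketch what those webhook fundamentals look like in practice, here is a hypothetical push-event check. The `X-Webhook-Token` header name, the secret, and the branch filter are all made-up illustrations, not the actual GitLab or Jenkins API.

```python
# Minimal sketch of the webhook idea: a Git push event arrives with a
# shared-secret header, and if it checks out we would kick off a build.
# Header name, secret, and payload shape are hypothetical.
import hmac
import json

EXPECTED_TOKEN = "s3cret"  # hypothetical shared secret

def should_trigger_build(headers: dict, body: bytes) -> bool:
    """Return True if the webhook is authentic and targets the main branch."""
    token = headers.get("X-Webhook-Token", "")
    # Constant-time comparison to avoid timing leaks.
    if not hmac.compare_digest(token, EXPECTED_TOKEN):
        return False
    payload = json.loads(body)
    return payload.get("ref") == "refs/heads/main"
```

In a real setup this check would sit behind the HTTP endpoint that GitLab or Jenkins posts to; the point is just that a push can mechanically trigger a build on OpenShift.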
So here's some of what I can legally share with you. This image here is actually some of the flow modeling that was done with machine learning: understanding what the well flows are going to be over the lifetime of the well, using TensorFlow, PyTorch, scikit-learn, all those libraries and dependencies, and building them into the Jupyter notebooks and other Python base images. Currently we're looking at actually using GPUs on OpenShift and seeing the benefits there.
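The actual flow models aren't something I can share, but as a toy illustration of predicting flow over the lifetime of a well, here is a simple exponential decline curve with entirely made-up numbers; the real work used TensorFlow, PyTorch, and scikit-learn rather than a closed-form curve.

```python
# Toy illustration of lifetime well-flow prediction (not ExxonMobil's
# actual models): exponential decline q(t) = q0 * exp(-d * t).
import math

def flow_rate(q0: float, d: float, t: float) -> float:
    """Predicted production rate at time t (years), from initial rate q0
    and a constant fractional decline d per year."""
    return q0 * math.exp(-d * t)

def cumulative(q0: float, d: float, t: float) -> float:
    """Closed-form integral of the rate from 0 to t."""
    return q0 * (1.0 - math.exp(-d * t)) / d

# Hypothetical well: 1000 units/day initial rate, 30%/year decline.
rate_after_5y = flow_rate(1000.0, 0.3, 5.0)
```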
So these are some of the efficiencies that we're finding here: we're not just optimizing slightly; these are millions of dollars, if not billions, that we're finding we can save, or at least avoid in cost, in certain areas. In natural language processing, we're taking a lot of technical texts and trying to create a repository library around that. And also, as was talked about earlier, sharing those machine learning models as APIs through things like Open Data Hub.
One of the lessons learned, also, was with GPUs, and this may or may not be applicable to you, but hopefully it is. We found that, because GPUs are billed at a premium in the cloud, you may want to do an analysis to decide if you want to just buy that new rack of GPU-laden servers every month instead of paying a cloud provider to run it for you. We had several budgets that were burned through by some of the data scientists running their models in the cloud. We also learned to train models locally.
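That cloud-versus-on-prem GPU analysis can be as simple as a break-even calculation. All of the rates below are hypothetical placeholders; substitute your own hardware quote and cloud pricing.

```python
# Back-of-the-envelope cloud-vs-on-prem GPU analysis like the one
# described. Every number here is hypothetical.

def breakeven_months(server_cost: float, hourly_cloud_rate: float,
                     utilization: float = 1.0) -> float:
    """Months of cloud GPU spend that would pay for buying the hardware."""
    monthly_cloud_cost = hourly_cloud_rate * 24 * 30 * utilization
    return server_cost / monthly_cloud_cost

# Example: a $90,000 GPU server vs. $12/hour of cloud GPU time, fully
# utilized: the hardware pays for itself in roughly 10.4 months.
months = breakeven_months(90_000, 12.0)
```

The utilization factor is what burned our budgets: models left training around the clock in the cloud push you toward the buy side of this equation much faster than occasional bursts do.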
So we broke it down into some really basic questions when we go into some of these discussions. Where is your data? Because the question being asked by the data scientists was: well, where do I put this? Do I put it in the cloud? Where do I put my data? Where do I run my application? So we broke it down into these three questions: where is your data, what is your data (and discuss data sovereignty), and where are your customers: internal, external, or a mixture?
Ultimately, this is far more important than what your current abilities are. So one of my favorite conversations that I had recently: we were in a room of very intelligent people, and I'm pretty sure I was the only one without a PhD in that room, and someone asked, "Why is it called a cloud?" I was like, okay. I looked around and was like, does anyone else want to field this question? I realized nobody in the room actually knew the answer.
I was like, we should just Google this, guys. But no, it actually turned into a really good discussion. I talked about doing network diagrams back in the 90s and how the cloud shape was abstracting those details; it was just abstraction and trust, and that's what cloud was all about. Also, it's really easy to draw. But it sparked a conversation that was very helpful, and we realized that we're never too smart to learn more.
So, talking specifically about some of the effort and some of the lessons learned that came out of the first part of this journey: I'm part of, again, the upstream data science enablement team. I was the platform architect, and I was moved over because of this unique set of skills, or experience, that I had, to try to add context. My team is specifically there to fill in these gaps in knowledge and also in culture.
These are big gaps; we have a lot of non-IT engineers at ExxonMobil. Gaps in machines create failure, and we want to avoid that in our culture and in these efforts with data science. So, consulting with the data scientists: we spend probably fifty to sixty percent of our time doing that right now. In doing that, we also get a lot of really good feedback, which turns into education. This turns into building "success skills," as we've termed it, which are really just developer practices.
A lot of data scientists at ExxonMobil did not come from a development background. They didn't grow up in that world, so they're not familiar with Git branching; some of them aren't even familiar with Git at all. It's a challenge; there's a wide spectrum of people we're working with. Another big one is collaboration and partnering. Listening to one of the internal talks at ExxonMobil, we were looking at the number of patents that were released over the last decade.
It was very low, and we realized that collaboration is something we don't do well, so we're focused on doing that. Our team's purpose is not to be a linchpin in collaboration, but to be an enabler: to get out of the way of it and help it happen organically. We're also focusing on self-service; that was a big one.
A
We
want
enough
of
a
paved
path
for
our
data
scientists
to
be
able
to
use
these
tools,
and
that
was
why
Jupiter
notebooks
came
in
for
collaboration
with
other
people.
That's
also
that's
also
one
purpose
that,
as
our
team
someone,
someone
asked
us
the
other
day.
So
what
do
you
guys
really
do
and
we're
like?
Well,
we
force
awkward
conversations
and
that's
literally,
what
we
do
is
we
come
in
and
people
are
like
well,
can
you
help
us
and
I'm
like
sure,
but
can
we
do
a
peer
review
of
your
code
and
they're
like
well?
So some of the things that we've discussed in retrospectives is what actually builds a successful enablement team. One big thing is we leave our egos at the door: if we don't know something, we tell someone that, and we go figure it out, or we figure it out with them. We're also full-stack developers, so we work on making others successful. As a team, we demonstrate what's called healthy disagreement.
We all have very strong opinions at times, but we know how to disagree appropriately, and we demonstrate that to data scientists who inherently don't collaborate; they're scared to, or it's just not natural for them. So we try to be a good example in that area. Also, to give you an idea, there are four people on this team, and we're supporting about 90 data scientists. In no way is that the perfect ratio.
Don't take that back home with you. We definitely have a lot of work that we do, but we've seen a lot of success even with a small number of people. And here, this is basically the legally released picture of some of our data scientists: this is them collaborating around a Jupyter notebook.
One of the things that is the best to see is when they really understand these things, and when we can answer those really simple questions, like why it's called a cloud, and talk about the fundamentals and context around OpenShift. It has been a huge enabler for our data scientists, and I'm glad to be sharing that with you. If you have any questions, I hope we'll talk later. Thank you.