Description
MLOps with Jenkins-X: Production-ready Machine Learning - Terry Cox, Bootstrap Ltd
Speakers: Terry Cox
Explore ways to treat Machine Learning assets as first class citizens within a DevOps process as Jenkins-X MLOps Lead, Terry Cox demonstrates how to automate your training and release pipeline in Cloud environments, using the library of ML template projects provided with Jenkins-X.
For more Continuous Delivery Foundation content, check out our blog: https://cd.foundation/blog/
Hello, and thanks for joining me on this session about using MLOps with Jenkins X. We have quite a broad audience today, so what I'd like to do is start with a little bit of a history lesson, back in the dim and distant past of not that long ago.
So everybody would be making incremental changes to bits of code, and then at some point somebody, typically a release manager, would be responsible for trying to work out how to integrate all of those changes into a known release version, which could then be compiled and moved on to a set of machines where it could be tested. And typically, testing would involve a lot of customers using the code, trying it out, finding out what didn't work, and then reporting that feedback back to the project manager.
So a release to production in those days typically involved packaging up an executable and some written instructions on how to deploy it, and then firing the whole lot over the wall to an operations team. Once that stage had been done, there would be a big party, the developers would get very drunk, and then go on holiday or move on to a different project.
So, as you can probably guess, there were lots of risks associated with this way of working. It was a very, very manual process, and so there were very, very many opportunities for human error to occur, and because there were so many knowledge gaps, it was very easy to forget to include things in a build, or forget to document how to use something in the release package, or just forget to mention essential dependencies that were needed in production environments.
Now, of course, this isn't the only way of working, and many people in the audience will be much more familiar with the idea of a DevOps process for managing software releases.
You're validating the assets that are being created and going through a highly automated governance process, where you're making sure that approvals have been put in place before, eventually, your continuous deployment system is allowed to create the environments and populate your production system with the various containers containing your product.
So under these ways of working, you're typically dealing with a situation where there are no changes to anything in a production environment without some sort of audit pathway, and some way to easily undo any changes that might fail for some reason in production. You've eliminated your knowledge gap, because you've shifted all of the aspects of the solution right into the design phase, where they can be taken into account properly as the software is being built and tested.
Typically, what happens when you're dealing with machine learning assets is that you have a data science team, and that data science team will be involved in aggregating particular chunks of data to make training sets and test sets, which have been carefully designed to prove specific aspects of the learning problem that you're trying to solve, and those may be very large data sets.
A
You
know,
potentially
in
the
order
of
petabytes
of
data,
and
then
you
will
be
creating
training
scripts
to
train
your
machine
learning
models
and
those
training
scripts
are
often
written
in
the
form
of
jupyter
notebooks
and
then,
at
the
same
time,
you'll
also
be
doing
a
lot
of
data
analysis,
work
to
try
and
understand
the
nature
of
both
the
data
you're
working
with
and
the
models
that
you're
creating
to
validate,
whether
what's
being
learnt,
is
actually
accurate
and
to
detect
whether
there's
bias
in
your
data
and
in
your
learned
models
if
the
models
are
acting
fairly
to
your
customers.
So there's a fair amount of ad hoc data flying around, and then at some point somebody will build some infrastructure to run a training event on, which usually involves setting up some cloud infrastructure so that you can throw a bunch of CPU or GPU or TPU compute resource at the problem. And then you may run that for a few hours, or days, or weeks, until at some point you spit out a model.
That model will then be evaluated to check its accuracy, and once you're satisfied with the model you've got, the route to deployment typically involves moving that model into a model server to make it available for use in the broader application.
Now, what that actually means in practice is that you've just thrown that model over a wall to the DevOps team, who are responsible for the rest of the product. And to put things in context, the machine learning elements of a given product typically represent about five percent of the overall effort, and so there's a lot of activity that's going on outside of the data science team to build the whole product, which involves integration and, you know, user interfaces and sales channels, etc.
You really need to get a handle on versioning that data, because if you want to be able to have any sort of audit capability, then you need to know which set of data a particular model was trained on, and potentially be able to replicate that training set and test the model against the original data, to see if anything has changed.
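The audit trail described here can be sketched in code. This is an illustrative assumption rather than anything shown in the talk: one simple way to "version" a training set is to record a content hash of it alongside each trained model, so you can later prove which data the model saw and detect whether that data has since changed.

```python
import hashlib
from pathlib import Path

def dataset_version(data_dir: str) -> str:
    """Compute a stable content hash over every file in a dataset directory.

    Storing this hash with each trained model gives a minimal audit
    capability: re-hashing the directory later reveals whether the
    training data has changed since the model was trained.
    """
    digest = hashlib.sha256()
    # Sort paths so the hash is independent of filesystem ordering.
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file():
            digest.update(path.relative_to(data_dir).as_posix().encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()
```

In practice, dedicated tools handle this at petabyte scale, but the principle is the same: the model artifact and the data version are recorded together.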
A
The
data
you're
working
with
is
often
sensitive
personal
data.
So
you
have
the
full
stack
of
challenges
around
security
and
privacy
for
managing
those
types
of
data
sets,
and
you
are
often
dealing
with
dedicated
hardware
and
cross-platform
challenges,
because
you
may
need
a
lot
of
very
expensive,
gpu
or
tpu
resource
to
train
your
model,
and
you
may
want
to
deploy
your
model
onto
an
edge
device
such
as
a
phone
where
you
need
to
be
able
to
execute
that
model
on
a
completely
different
set
of
hardware.
And that means that we need to move into risk mitigation for challenges like bias, ethics and regulatory compliance. As you'll no doubt be aware, there's a lot of focus on regulation of AI at the moment, and the bar to clear will be very high, and so governance for machine learning systems will need to be very tight.
So how do we actually start to address these problems in practical ways? Well, realistically, what we need to do is put the focus on the product, not on the machine learning. What we actually want to do, if we're going to be successful as a product commercialization team, is to optimize the management of all of our assets, rather than optimize for machine learning, or optimize for user interface, or optimize for back-end. We've got to think about this end-to-end.
But initially the model deployment will have no model associated with it, so it can't deploy until you've trained your model. The training will execute within Jenkins X: it will build you the specified training environment, it'll run your training, create an instance of a model, and then it will test that model to see if it passes your acceptance criteria.
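The acceptance-criteria test at the end of that training step might look something like the sketch below. The metric names and thresholds are hypothetical, not part of any Jenkins X API: the point is simply that a pipeline step exits non-zero when the freshly trained model underperforms, which fails the build.

```python
import sys

# Illustrative thresholds; in a real project these would live in the
# training repository's configuration, not be hard-coded here.
ACCEPTANCE_CRITERIA = {"accuracy": 0.92, "recall": 0.85}

def passes_acceptance(metrics: dict) -> bool:
    """True only if every required metric meets or beats its threshold.

    A metric missing from the evaluation results counts as a failure.
    """
    return all(metrics.get(name, 0.0) >= threshold
               for name, threshold in ACCEPTANCE_CRITERIA.items())

def gate(metrics: dict) -> int:
    # Exit code 0 lets the pipeline continue to the promotion step;
    # exit code 1 fails the build so you can retune and retrain.
    return 0 if passes_acceptance(metrics) else 1

if __name__ == "__main__":
    sys.exit(gate({"accuracy": 0.95, "recall": 0.88}))
```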
Now, if it doesn't, then it will fail out, and you can run another training instance with some different tunings, and keep repeating that process until you get a pass. But if it passes, then it will actually automatically create a pull request onto the model deployment repository and move a version of that trained model into that repository, triggering the release process for the second repository. And so you get an automated end-to-end process, which allows you to run trainings and then deploy them into test environments, where you can evaluate them as part of the broader application.
Now, all the governance processes around Jenkins X fit neatly into your standard release processes, so you can just add in whatever ML-specific governance checks you want to have, like automated bias checking, for example, and just integrate those into the release criteria that need to be passed before you can put something into a production environment.
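To make the automated-bias-checking idea concrete, here is a minimal sketch of one such release criterion. The metric chosen (demographic parity gap) and the tolerance are assumptions for illustration; the talk doesn't prescribe a specific bias measure, and real deployments would use richer fairness tooling.

```python
def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups.

    A gap near 0 suggests the model grants positive outcomes at a
    similar rate across groups, on this deliberately simple measure.
    """
    rates = {}
    for pred, group in zip(predictions, groups):
        hits, total = rates.get(group, (0, 0))
        rates[group] = (hits + (1 if pred == 1 else 0), total + 1)
    positive_rates = [hits / total for hits, total in rates.values()]
    return max(positive_rates) - min(positive_rates)

def bias_check(predictions, groups, max_gap=0.1):
    """Release criterion: pass only if the parity gap is within tolerance."""
    return demographic_parity_gap(predictions, groups) <= max_gap
```

Wired into the release pipeline like any other check, a failing `bias_check` blocks promotion to production just as a failing unit test would.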