From YouTube: How to Build Data Science Pipelines with OpenShift using Ceph, Kafka and Knative Guillaume Moutier
Description
How to Build Data Science Pipelines with OpenShift using Ceph, Kafka and Knative
Guillaume Moutier (Red Hat)
OpenShift Commons Gathering on Data Science
January 28, 2021
https://commons.openshift.org/gatherings/OpenShift_Commons_Gathering_on_Data_Science.html
To find out more about OpenShift Commons, visit: https://commons.openshift.org
The most important thing that I want you to retain from this presentation is to embrace the cloud native way of doing things. I'm not saying that you must run everything in the cloud, no, not at all. But the architectural patterns that we have seen emerge from cloud services can apply anywhere, including on-prem, and this is exactly what helps a lot with data pipelines.
But first things first: how do we define this cloud native approach? Here's my totally, very opinionated short list of characteristics that we must aim for in a cloud native data platform. First is agility and elasticity: tools, frameworks, and data sets evolve constantly and very rapidly, so you must be able to adapt your infrastructure accordingly. Then, cloud standards: it's important to avoid any vendor lock-in with proprietary tools and formats, and we must embrace widely recognized open source protocols and standards. And finally, hybrid cloud architecture.
What you are designing in terms of architecture must run anywhere without any change, or maybe with some small configurations that you can adapt, but without changing the architecture itself.
What I don't like about this is the coupling that you have at different points: the user who has to mount, let's say, a P: drive on their computer, and the same thing on the server side, the application, which relies on a very specific configuration of the application server.
The use case in this demo is pneumonia detection from chest x-rays using an automated data pipeline. So imagine the problem is this one: we have some x-ray images to review, some from people with pneumonia and some from people with normal chest x-rays, and we want to automate this process.
So, of course, we think that an AI/ML model can help, and we can use tools that are provided by Open Data Hub, for example Jupyter notebooks and TensorFlow. We can train a model to do some inferencing on those images and determine whether the new images that we want to process come from people with pneumonia or not.
So we have this model, but it has to scale, so we have to automate it. The question now is: how can we analyze those images as they come in, in a continuous flow of thousands of images? And if we want to retrain the model and redeploy it seamlessly at various locations simultaneously, how can we do that efficiently?
So here is our demo environment. Let's say we are at a hospital and we are generating new x-ray images. What we will do is send all those images into a Ceph bucket that has been instantiated by OpenShift Container Storage, and this bucket has been configured to send notifications whenever a new image comes in.
Those notifications will be sent to a Kafka topic that is linked to a Knative Eventing and Serving function, and the container that is spawned when a new message comes in will do a risk assessment on this new image, basically using the model that we have trained to try to infer whether there is a risk of pneumonia or not. In a standard production scenario, all results would be sent to a doctor, but here I added a special step because, of course, not all models are totally perfect.
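Ceph bucket notifications follow the S3 event record format, so the first job of the function behind the Kafka topic is typically to pull the bucket name and object key out of each message. A minimal sketch of that parsing step, assuming the standard S3 `Records` layout (the sample payload and function name are mine, not from the demo):

```python
import json

def extract_object_refs(message_value: str):
    """Parse an S3-style bucket-notification payload and return
    (bucket, key) pairs for every record it contains."""
    payload = json.loads(message_value)
    refs = []
    for record in payload.get("Records", []):
        s3 = record["s3"]
        refs.append((s3["bucket"]["name"], s3["object"]["key"]))
    return refs

# Hypothetical notification, shaped like a Ceph RGW / S3 event
sample = json.dumps({
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "incoming-images"},
            "object": {"key": "xray-0001.jpg"},
        },
    }]
})

print(extract_object_refs(sample))  # [('incoming-images', 'xray-0001.jpg')]
```

With the key in hand, the function can fetch the image from the bucket and run inference on it.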
What the process will do is anonymize the image, so that it can be further processed in a central data science lab, for example. And normally, again, in a standard production environment, you would have a doctor, a specialist, doing a manual assessment and classification of the images for which the model was not able to infer a result.
The image would be classified as a risk of pneumonia or as normal, and this would trigger a retraining of the model, which could then be re-injected back to our hospital of origin through a standard OpenShift CI/CD. This second part, with the model retraining and everything, we won't see in the demo. It's not implemented because training a model like this takes a certain time, but I have a way to simulate a new model being used to do those inferences, to make the link with the scenario that we described before.
We can imagine that in multiple hospitals the same model is being used to do inferencing on images, and again, the images the model is not so sure about would be anonymized and sent here for further processing. Okay, let's see that live. Let me walk you through the environment I have prepared. Here is my OpenShift cluster, where I have a few things that I have created for this demo.
First, there is this deployment config of what I call the image generator. This is a container that, well, you know, in fact, it won't really generate x-ray images. It will just randomly pick from source x-ray images that I have in a bucket and copy them to an incoming bucket. We can see here that the image generator is deployed. There is one pod running: the blue circle here indicates that this deployment configuration is up and running, but at this moment it doesn't do anything.
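The generator's loop is simple enough to sketch. Here plain dicts stand in for the two buckets, and `copy_random_image` is a hypothetical name; the real container presumably talks to Ceph over the S3 API, but the pick-and-copy logic would look much the same:

```python
import random

def copy_random_image(source_bucket: dict, incoming_bucket: dict) -> str:
    """Pick a random object from the source bucket and copy it,
    under a fresh unique name, into the incoming bucket."""
    key = random.choice(list(source_bucket))
    new_key = f"img-{len(incoming_bucket):06d}-{key}"
    incoming_bucket[new_key] = source_bucket[key]
    return new_key

source = {"normal-01.jpg": b"...", "pneumonia-01.jpg": b"..."}
incoming = {}
for _ in range(5):  # in the demo this runs once per second
    copy_random_image(source, incoming)
print(len(incoming))  # 5
```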
So what this container is doing is just listening to a Kafka topic and waiting for some messages to come in, and we can see here that this Kafka source is connected, is linked, to this serverless Knative service, you can see the logo here, the Knative service that is called risk-assessment.
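The wiring between the topic and the service is done with a KafkaSource object. A sketch of what it might look like for this demo (the names, topic, and bootstrap address are assumptions, not taken from the actual cluster):

```yaml
apiVersion: sources.knative.dev/v1beta1
kind: KafkaSource
metadata:
  name: xray-images-source
spec:
  bootstrapServers:
    - my-cluster-kafka-bootstrap:9092
  topics:
    - xray-images
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: risk-assessment
```

Every message on the topic is then delivered as an event to the risk-assessment service, which Knative scales up from zero on demand.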
So this is a full serverless deployment, meaning that right now, as there is nothing to process, it's just sitting idle. We can see here there's no blue circle around the container, meaning that it is scaled down to zero: there is no instance of the risk-assessment container running.
I have a few other things that are deployed. First is my Kafka cluster, deployed through the AMQ Streams operator. It's very basic here, only one instance each of Kafka and ZooKeeper, a totally ephemeral Kafka cluster. Please don't do this at home: normally you don't want to run with only one instance of each, but for resource purposes here it's enough for what we want to do.
So all the notifications from the Ceph bucket that receives the images will be sent to a topic in this Kafka cluster, and it is this cluster that our Kafka source subscriber is listening to, on the specific topic we want. We also have a deployment of Grafana, with its own operator. That's a dashboard that will allow us to see what's going on. And I have also a few helpers here.
I have a small database, a very basic MariaDB database, where I will record the names and timestamps of the images as they come in, are processed, or are anonymized, and this is what we will also display in the Grafana dashboard. And finally, I have a small image server: as you will see on the dashboard, we will display the images directly as they come in. So here is everything that I have deployed, and we are now ready to launch the image generation.
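The helper database just keeps one row per pipeline event. A sketch of that recording step, using the standard library's `sqlite3` as a stand-in for the demo's MariaDB (the table and column names are my assumptions):

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")  # MariaDB in the real demo
conn.execute(
    "CREATE TABLE image_events ("
    " image_name TEXT, stage TEXT, event_time TEXT)"
)

def record_event(conn, image_name: str, stage: str) -> None:
    """Insert one row per stage: 'uploaded', 'processed', or 'anonymized'.
    The dashboard counters are then simple COUNT(*) queries per stage."""
    conn.execute(
        "INSERT INTO image_events VALUES (?, ?, ?)",
        (image_name, stage, datetime.now(timezone.utc).isoformat()),
    )

record_event(conn, "xray-0001.jpg", "uploaded")
record_event(conn, "xray-0001.jpg", "processed")
count = conn.execute("SELECT COUNT(*) FROM image_events").fetchone()[0]
print(count)  # 2
```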
What I will do now is launch the demo, and to do that I will patch the image generator. Remember, its rate parameter is set to 0 to idle it; I will now set it to 1, which means that a new image will be generated, will be copied, into our incoming bucket every second. Let's launch that. So I've launched the command, and the image generator will be patched with this new version. We can see here that it has already deployed; it went very fast.
So here is the Grafana dashboard that represents, in real time, what's happening in our pipeline. On the top left here we have a summarized schema of this pipeline, so we can see the images being sent to an incoming bucket here, and we have the counter of the number of images that have been uploaded so far.
Then, as notifications are sent to the Kafka topic and the risk-assessment container is launched, we have the number of images processed. And again, if the certainty of our model is less than 80%, then we will have another function that will anonymize those images. Okay, so we can see that the pipeline is running. On the right side we have the list of the last 10 uploaded images. And don't worry, those are totally randomly generated names, birth dates, and other personal information; those are not real patients.
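The 80% rule is easy to state as code. A sketch of the routing decision (the function name and return labels are mine, not from the demo):

```python
def route_assessment(label: str, confidence: float,
                     threshold: float = 0.80) -> str:
    """Decide what happens to an image after inference.

    Below the confidence threshold the model is 'unsure', so the
    image goes to anonymization for review in the central data
    science lab; otherwise the model's own label stands."""
    if confidence < threshold:
        return "anonymize"
    return label

print(route_assessment("pneumonia", 1.00))  # pneumonia
print(route_assessment("normal", 0.91))     # normal
print(route_assessment("pneumonia", 0.55))  # anonymize
```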
Here we have also, again, the list of the last 10 uploaded images, then the last 10 processed images, and we'll see in a few seconds what is happening to those images, and then the last 10 anonymized images. We have some counters on the left side: the CPU and RAM usage, which you can see has increased, because now we have some processing to do, and the number of risk-assessment containers which have been launched so far to be able to handle the load.
Again, this is something that is automatically scaled by OpenShift Serverless. Then we have here the risk distribution: so far, among all the images that have been uploaded, we have the distribution between the ones that have been assessed as normal, as a risk of pneumonia, or as unsure. We have here, in this small graph, the number of images that have been processed by model version, and we'll see in a few seconds what happens when you change the model. And we have here a counter of the number of deployments of the risk-assessment pods.
Okay, while I explain to you what is happening to the images, I will do two things. First is to increase the rate at which the images are sent; so far, it's only one per second. And I will also change a parameter here, simulating that we will have a model v2 now that will be used to do this processing. So I will do the first patch here, and then the second patch.
So, while my containers are being updated to reflect those changes, let's have a look at our images. Here I have another special dashboard with a bigger version of the displayed images, and maybe I will wait for another one to refresh so that we can see better.
It refreshes every five seconds. Okay, let's stop here. So what happens? This is a base image, one of the images that I have prepared beforehand. I have about 800 of those images, which are chest x-rays with some personal information printed on them; those are, as I said, randomly generated information. When a risk assessment is made by the model, what my processing container does is write, on top of the image, the assessment that has been made: here, a risk of pneumonia, with the confidence level at which the model made this assessment, so risk 100%. So the model is pretty sure that there is a risk of pneumonia for this specific x-ray image. But when the model is not sure and the risk confidence is less than 80%, what we do, and you can also see it here, is blur the personal information that was on this specific x-ray.
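The blurring itself operates on the image pixels. The real container presumably uses an imaging library, but as a stdlib-only illustration, here is a sketch that redacts a rectangular region of a grayscale image represented as a list of rows (function name and coordinates are mine):

```python
def redact_region(image, top, left, height, width, fill=0):
    """Overwrite a rectangular region of a 2D pixel grid with a
    constant value -- a crude stand-in for blurring the printed
    personal information on the x-ray."""
    for r in range(top, min(top + height, len(image))):
        for c in range(left, min(left + width, len(image[r]))):
            image[r][c] = fill
    return image

img = [[255] * 6 for _ in range(4)]   # tiny all-white "x-ray"
redact_region(img, 0, 0, 2, 3)        # blank the top-left label area
print(img[0][:4])  # [0, 0, 0, 255]
print(img[3][0])   # 255 (region outside the rectangle is untouched)
```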
First, the usage of CPU and RAM has further increased because, if you remember, I increased the rate at which the images are processed a lot; now it's 10 per second. So here those counters are growing much faster, and we can see that OpenShift has done its magic and automatically scaled the number of containers, the number of pods, it needs to be able to handle the load. Of course, we can see here that many more images have been assessed.
And at the same time, we can see here that I am now using the v2 model to make the risk assessment. So here, with this model change, I am simulating that, following image anonymization and manual classification in the central data science lab, a model has been retrained and pushed back here to our hospital, so that it can be used from now on.