From YouTube: Next Generation AI/ML Workloads on OpenShift using Knative and Infinispan - Ian Lawson (Red Hat)
Description
Next Generation AI/ML Workloads on OpenShift using Knative and Infinispan
Ian Lawson (Red Hat)
This OpenShift Commons Gathering was held on July 6, 2022, live in London, England.
https://commons.openshift.org
So hi everybody, my name is Ian Lawson. I'm an account member working at Red Hat. The reason I'm a little bit vague at the moment is that I've just got over Covid, courtesy of the Dublin OpenShift Commons, which seemed to be the super spreader event.
So what we're going to talk about today is basically the concepts of next generation AI and ML, using some of the new technologies we've got as part of the OpenShift package. I come from an AI background: I worked for some agencies that shall remain nameless for a long, long time, working on huge data loads, so I've got a personal interest in this. I've been sneakily doing this instead of my day job, and I don't think there are any salespeople, apart from Dave, in the audience, so I can't get told off for that.
Firstly, it's the decoupling of application and infrastructure, and this is incredibly important, because if you come from the backgrounds I've been in, in terms of AI and machine learning, all of the experimentation you did was bolted onto machines. You used to have huge Hadoop lakes with which you'd actually process, and all those kinds of things; you'd have to keep the hardware up and running all the time, and there was a very tight bonding between the application and the infrastructure.
The whole point behind containers is that you remove that binding. In the old days I used to call it taint, and the reason I called it taint is that I used to write applications and give them to ops, and I'd say: here's my application, it's a lovely piece of Java; oh, by the way, you have to install this JVM on the machine, and you have to install this library, and you have to install this database. And by the time they'd installed all the bits and pieces they needed for my application,
A
The
entire
machine
was
tainted.
It
wouldn't
run
anybody
else's
application.
The
advantage
of
using
containers
now
is
all
the
taint.
The
things
that
made
ops
hate
us
developers
travels
with
the
containers
and
that's
that
kind
of
distinction
between
the
old
days
of
having
everything
locked
down
to
a
machine
or
traveling
with
the
actual
container.
It's all about agility of application creation as well. We always forget this. When I'm talking to customers, I say: well, I'm going to fire up this application, fire up this demo, and within 30 seconds, bang, I've got a running application. I know when I used to work in development, and I expect it hasn't changed in the real world,
it used to take four to eight weeks to get a virtual machine, and that virtual machine wouldn't be anything like the one I asked for. So this kind of concept, where you can actually just fire things up instantaneously, is amazing, and that's a huge opportunity for AI and ML: on-demand execution. I will get excited about Knative, which again might blow my blood pressure through the roof; it's a new technology we've got as part of OpenShift
that really makes this on-demand execution shine. When I'm talking about on-demand execution, it's the ability to spin up an application and only consume the resources you need for that application for the duration of the call, not the lifetime of the application itself. I'm jumping ahead slightly at this point, but it's like this: say you had an application that consisted of four services. Three of those services are called once every 24 hours; one is called once every 100 milliseconds.
If you're running a standard system, or a standard Kubernetes system, you have to have all four of those applications up 24/7, waiting for those requests to come in; it's part of the design of the way Kubernetes works. What Knative does, and I'll explain in a little bit more depth when we get to it, is actually allow you to have applications that spin up on demand, and they only exist and consume resources for the duration they're being called. And it finally, elegantly solves the classic developer 70/30 problem.
Have you ever heard of this? Good, because I made it up about six years ago, and I was hoping it would have disseminated through the industry by this time. What I mean by the 70/30 problem is that when I was a developer I used to get paid reasonably well; well, really well. In terms of development, developers never get paid well. Anybody? Any employers in the room?
You know, developers don't get paid enough. But I used to get paid a reasonable amount of money, and I used to waste 70% of my time when I was developing. I'd spend 70% of my time building frameworks, installing libraries and writing boilerplate code, and on average I ended up spending 30% of my time actually writing the core code I wanted to write. When you use a technology like OpenShift, when you use the technology of containers, that goes up to about 95%: you're not writing all that boilerplate.
If you're doing an experiment that requires, let's say, a thousand iterations, ten thousand iterations, it needs to spin them up, consume them, and then they go away; these things don't have to persist. In the past this was impossible, just down to the tight binding between the application and the infrastructure itself.
If you wanted to set up a machine to run Hadoop, that machine would only run Hadoop, and it'd be sat there with the Hadoop lake ready to process jobs, but it would be there all the time, assigned to that Hadoop workload. With Kubernetes and containers, and specifically Knative, this has massively changed. So, understanding the container mindset: I love the first line on this slide.
Containers are file systems with delusions of grandeur. That's literally all they are: file systems that think they're operating systems. It's a set of files that's executed in a process-bound space and thinks it's an operating system; it's just a file system with delusions of grandeur. To exploit the design features of Kubernetes correctly, applications and experiments need to follow certain design patterns.
They should be stateless; they should be sausage machines. And it's not a limitation when you're actually designing applications from a container perspective, because you can use a persistence method, or a technology like Data Grid, which I'm going to show a little example of, or persistent volumes, which come as part of the OpenShift package and are absolutely perfect for this. For those who don't know what it does, a persistent volume actually expresses a file system into the back end of a container.
The container sees it as part of its own file system, but when the container goes away and is restored to its original image, you can reattach that file system and it can carry on from where it was, which is brilliant. It adds that kind of point-to-point state persistence you don't get by using standard containers out of the box.
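For reference, here is a minimal sketch of what that looks like in practice; the claim name, image and mount path are illustrative assumptions, not taken from the talk:

```yaml
# A claim for storage that outlives any one container.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: experiment-state          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# A pod that mounts the claim; the container just sees /data as part
# of its own file system, and a replacement pod can reattach it later.
apiVersion: v1
kind: Pod
metadata:
  name: experiment
spec:
  containers:
    - name: worker
      image: quay.io/example/worker:latest   # hypothetical image
      volumeMounts:
        - name: state
          mountPath: /data
  volumes:
    - name: state
      persistentVolumeClaim:
        claimName: experiment-state
```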
So, introducing Knative serverless, and this is where I tend to get a little bit overly excited. Knative as a concept is simple. I don't like the word serverless, and I normally get told off at this point, because this is normally being filmed, so I'll stop myself from swearing. But when I'm talking to customers, you know, they talk about serverless, and I say: when you're talking about serverless, you're talking about the unicorn's arse, because there's no such thing as serverless; it's someone else's server, or it's someone else's resource.
...it actually exists on the worker node, but it's offlined to the point where it's not consuming any resources whatsoever, and we've got two triggers that allow it to actually come back to life. One is called Knative Serving, and that's the standard one. For those who know Kubernetes already, what this does is create a service ingress point that sits at the service point for the application, looking for traffic. But what that service point does, rather than just pushing the traffic into the application itself,
is check whether the application exists, and if it doesn't exist, it spins it up. When the application spins up, it processes the ingress traffic, and then there's a time limit during which, if it receives any more traffic, it remains alive; but if it doesn't receive traffic, it spins down to zero.
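As a sketch of what one of those scale-to-zero services looks like as a Knative Service (the name and image are hypothetical; the annotation keys are the standard Knative autoscaling ones, though they have varied slightly between releases):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: sporadic-service                        # hypothetical: the once-a-day service
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"  # allow scale to zero (the default)
        autoscaling.knative.dev/window: "60s"   # roughly, the idle window before it spins down
    spec:
      containers:
        - image: quay.io/example/sporadic:latest  # hypothetical image
```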
So it goes away, consuming no resources, and that's massively efficient. Again, I normally get told off by salespeople when I'm talking about this, because my first pitch to customers is: you need smaller clusters, you need fewer worker nodes; you can put loads more onto far fewer worker nodes.
Of course, the salespeople take me outside and kick me, because we want to sell more worker nodes. But the other type is called Knative Eventing, and this is the one I really like. This is to do with a new concept called cloud events. So, instead of actually having an ingress point that's based on a service, it has events that drive the creation of the actual application.
Now, I used to work with eventing: I used to work with message queues, I used to work with Kafka and all those kinds of things, and every single one has got a different protocol, and every single one is very, very confusing. So what we've done is we've abstracted it and made it incredibly simple. In fact, I made it slightly more simple, because I complained to the people that wrote this.
So you set up your Knative applications as being driven by a trigger, which is driven by a type of cloud event, and the broker works out which of the actual applications to push it to. Now, out of the box, when you install one of these brokers they're actually ephemeral, so when you put one up and throw cloud events at it, it'll go through its triggers.
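A minimal sketch of that wiring: a Broker, plus a Trigger that filters on a cloud event type and points at the Knative Service to wake up. The names and the event type here are hypothetical:

```yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: neuron-trigger                   # hypothetical name
spec:
  broker: default
  filter:
    attributes:
      type: dev.example.neuron-fire      # only events of this type reach the subscriber
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: neuron                       # the Knative Service to spin up
```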
So you can have a Kafka topic sitting behind the broker, which is actually feeding those cloud events into the broker, which is driving the actual recreation of the applications through the events. But because it's Kafka, you can wind the temporal stamp back, so you can replay the actual messages through. So you get all the advantages of a complex messaging system such as Kafka, but you get the simplicity of this interface. And a very quick point on this:
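For what that looks like, here is a hedged sketch of a Kafka-backed Broker, assuming the Knative Kafka broker implementation is installed; the ConfigMap name and namespace follow the upstream convention but may differ on a given cluster:

```yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
  annotations:
    eventing.knative.dev/broker.class: Kafka   # use the Kafka-backed broker class
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: kafka-broker-config    # holds bootstrap.servers, partitions, replication
    namespace: knative-eventing
```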
there was a problem with this, and it comes from the work I used to do with, let's say, crap communications and things like that. If you've got a broken packet (let's say the event arrived at the actual application and the JSON itself was slightly broken), what it was doing was automatically decoding that payload into a form that could be consumed, and if it failed the formatting, if the JSON was broken, it would report it as an error and it wouldn't get to the actual service.
So why is this relevant to artificial intelligence and machine learning? Well, AI/ML workloads are all about size and repetition. They're all about massive experiments, but they're all about repeating, repeating, repeating, and most organizations are limited by resource, either by size or by cost. You know, machines are expensive, AWS is expensive, cloud is expensive.
Containers and Knative technologies allow for massive experiments in much smaller footprints, and that's the key thing with this. In fact, if you took a big system and did functional decomposition down to atomic services, you could represent every one of those atomic services as an individual Knative service, and suddenly you've got massively complex systems made of thousands of these connected services, where the services only exist for the duration of their calls, and suddenly it becomes beautifully elegant and efficient. And, as I say, OpenShift has also got this targeted orchestration and strict resource control.
...what capabilities it had. So now you've got a situation where you can stand up a worker node that's got GPUs, that's got NUMA zones, and those can be expressed through the object model into OpenShift, and OpenShift can use that information to correctly orchestrate jobs. Combine that with the Knative Serving or the Knative Eventing approach, and you can see that you could build a model such that, if parts of your experiment required a huge amount of GPU processing, you could target the GPU hardware, and it all comes out of the box. And, as I say, much more efficient resource consumption means much better results for less outlay. Right, the fun stuff.
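As an illustration of that targeting, a Knative Service can request an extended resource such as a GPU, so its pods only land on nodes that expose one. This assumes the NVIDIA device plugin is installed; the service name and image are hypothetical:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: gpu-stage                 # hypothetical: the GPU-heavy part of an experiment
spec:
  template:
    spec:
      containers:
        - image: quay.io/example/gpu-stage:latest   # hypothetical image
          resources:
            limits:
              nvidia.com/gpu: "1" # schedules only onto GPU-bearing worker nodes
```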
I've been obsessed with neural nets for years. I love the concept of neural nets, partly because I want to know what this thing does when it's not suffering from fever. Now, to demonstrate the theory behind this dynamic execution, I've applied an idea around these things called neural nets. Neural nets work by combining atomic components called neurons.
Your brain is full of them, and what a neuron does is take a number of inputs, aggregate those inputs together against a threshold, and generate events at the far end depending upon those thresholds, and you can build massively complicated systems.
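As a one-line sketch of the textbook model being described here (not anything specific to this demo), each neuron aggregates weighted inputs and fires against a threshold:

```latex
y = f\Big(\sum_{i=1}^{n} w_i x_i\Big), \qquad
f(s) = \begin{cases} 1 & \text{if } s \ge \theta \\ 0 & \text{otherwise} \end{cases}
```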
So what I was thinking about was: using the Knative Serving and cloud events, you could build and simulate these neurons. The problem being that, with Knative Serving and with standard container technology, there's no persistence between calls.
So what I've done is I've actually installed Infinispan, or Red Hat Data Grid, and what that is, is an in-memory data store. What happens is the neurons are very small containers that are spun up on demand, driven by cloud event types; I have different cloud event types that are actually generating different payloads into the neurons. When a neuron starts up, it will go to the data grid and get its latest memory state.
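The repo mentioned at the end is said to include the actual YAML; as a stand-in, a minimal Infinispan cluster created via the Infinispan Operator looks roughly like this (the name is hypothetical, and the spec is an illustrative assumption rather than the speaker's exact file):

```yaml
apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: neuron-memory       # hypothetical: holds each neuron's latest state
spec:
  replicas: 1
  service:
    type: Cache             # plain in-memory cache service
```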
Now, the dangerous part of the presentation: I'm going to do a demo. These never work, and I think people just come to see them fail. But for what I want to show you, I'll log into this cluster here.
What I've got is what looks like a slightly complex setup of some Knative Serving stuff, so I'll let it start and let it just get into a situation where you can see it. Now, you probably can... oh, actually, you can read it. So what we've got here is basically a little setup to show you some of the examples I've been talking about. In the center here, that's the broker; I've got a single broker, and the broker is namespace-bound. And for us old people who were around when the web first came about,
you can actually emit cloud events into the broker just by doing a POST: you put the ce-type as a header on the POST, and you put the actual physical payload into the body of the POST itself. To demonstrate this, I've actually got an application running here called Cloud Emitter, and, as I say, I apologize profusely, because I'm still using style sheets I wrote in 1996.
What this allows me to do is target individual brokers and push named cloud events into them. So what I'm going to do is push a cloud event to that broker. That cloud event has got a type of quarkus-event. So the broker is waiting for these events to come in; the broker's got triggers that fire off these Knative services, which are actually offlined at the moment. I'm not going to send it initially, because I want to show you the trigger.
I've got a Quarkus one set up, because Quarkus is a very, very fast way to actually spin up these kinds of things. I've also got a Camel K definition here, so I've got a little Camel route, and that Camel route is waiting for a certain type of event in the broker itself. You see I've got two triggers: this trigger here is actually looking for the type quarkus-event, and it drives requests into the Quarkus service.
So if I emit a quarkus-event from that cloud event emitter, we should see that spin up. What the Quarkus service is going to do is just log the information it gets and then push another event back to the broker. The event it pushes back to the broker is a techtalk event, for which there is a trigger into the Camel K service.
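Putting the chain together, the two triggers described would look roughly like this; the event type and service names are guesses at the demo's setup, for illustration only:

```yaml
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: quarkus-trigger              # hypothetical name
spec:
  broker: default
  filter:
    attributes:
      type: quarkus-event            # first hop: wakes the Quarkus service
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: quarkus-processor        # hypothetical name
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: techtalk-trigger             # hypothetical name
spec:
  broker: default
  filter:
    attributes:
      type: techtalk-event           # second hop: resurrects the Camel K service
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: camelk-processor         # hypothetical name
```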
So what we should see is a chain: when I push the event into the broker itself, the Quarkus service should kick off, the Quarkus service should process, and it should then generate an event back into the broker itself, which drives the creation, or rather the resurrection, of the Camel K one. It's a very pithy example, but I'll show you it running. So I'll emit the cloud event, and I've got to zip back to the topology very quickly.
You'll see, bang, the Quarkus service immediately fires up. So what's happening is the broker has actually seen that event and pushed it into the actual Quarkus processor; the Quarkus processor has pushed one out, which has gone into the Camel K. After a certain timeout, which is configurable, both of them will go away again. It's a very pithy example, but you can see how dynamic this system is, and that's the whole point behind this. Now, it does require that you're actually building experiments and doing stuff like that.
It is a work in progress, but there is a GitHub repo where I've got basically a nice little white paper on what's going on, and all the kinds of bits and pieces you need to do it, including the YAML for creating the Infinispan data grids and my first feeble attempts at writing the Quarkus processors.
But for me, this is the next generation of programming. This is the next generation of development, and it just feels delightfully elegant. That was pretty much it. I'm not sure if we're allowed questions or not; I've managed to stay just within time, but I hope that was useful. I will be around for the rest of the day.