Description
"Data Hub" is a collection of open source and cloud components deployed as a "machine learning-as-a-service" platform to solve internal business problems at Red Hat, enabling teams to build, deploy, and execute analytic, machine learning, and AI models.
Repeatable human tasks are being replaced by automation, creating significant opportunity and risk for Red Hat. AI can be applied to our core business and direct customer services. To do so, data must be seamlessly unified from a broad range of sources and made accessible to analytic models.
This presentation is about how Red Hat runs AI and machine learning workloads on OpenShift.
A: So, thank you, everyone, for inviting me to speak on what we're doing with AI and machine learning at Red Hat. I wanted to introduce you to a project that we have going on here called the Data Hub. The Data Hub started off as a reference architecture for how we are doing AI on OpenShift, along with some other open source technologies, and it has since spawned off into also solving internal problems at Red Hat.
A: You can think of the Data Hub as a collection of open source components, with OpenShift and Kubernetes as its foundation. Some of the things we're tying into it are data streams and big data, model training, execution of those models, basic ETL requirements, providing APIs, and then visuals and reports.
A: On top of that, in terms of why we created the Data Hub: we were initially tasked with solving some of the interesting build issues we were having at Red Hat around continuous integration and continuous delivery. The Data Hub actually spawned off as a way for us to aggregate and collect all of that information, and it quickly moved into the AI-as-a-service category.
A: We started to think about, well, as we're collecting all of this data, what can we start to infer from it, and how can we enable other teams to do the same? So what we tried to do is provide a platform that takes some of those mundane, repeatable human tasks, such as reacting when a build fails or deriving insights from log data or metrics data, and starts to automate them. What we found is that's a great way to augment our core business: giving developers the ability to automate a lot of these tasks just makes things a little more efficient on our end. But in order to do that, the first thing we had to do was unify the data, which is why it's called the Data Hub.
A: Some of the things that we talk about when we start to describe AI and ML: AI is like the new BI, right? Everyone says it, but depending on who you're talking to, it has different connotations. So one of the things we did in the Data Hub team and the AI Center of Excellence, which is what the Data Hub team reports into, is level set on this.
A: What do we mean when we say machine learning and artificial intelligence, and how does that stack up against some of the other things like statistics and predictive analytics? As for the bread and butter of the Data Hub, we focused more on the right side of this chart, where we do things like natural language processing, autonomous decisions, and anything from anomaly detection to those types of pattern recognition. That's really where we're focusing, and where our data scientists are spending a lot of their time.
A: To put this in perspective, this is where we get into what I mentioned about using the Data Hub as a reference architecture. At the core of the Data Hub, you'll see on the left side of this, is Ceph object storage. We're using that much like you would use S3 in Amazon's infrastructure: it is our data lake. We have lots of different types of data stored there, anything from Red Hat cloud services; we have data pumping into that.
A: We also have metric data coming in from services such as Prometheus. We also have data that's more operational in nature, and then we also have basic customer information that we store there: support tickets, feedback loops, things like that. We use Ceph as the way to collect all of that data, as Ceph is great for streaming data into those systems. We also use Elasticsearch for more of our raw log analysis.
A: Sometimes, when you have just terabytes of log data coming in, you need an engine that lets you quickly sift through that information and do some kind of visualization, so we use Elasticsearch for that use case. And then we use JanusGraph, a graph database (depending on what part of the world you come from, you might pronounce that differently), for some of the work that we're doing with stacks and intelligent stack recommendations. If you are deploying a stack that's focused on artificial intelligence, it gives you recommendations like: hey, you might want to add these packages, or your packages may be becoming out of date, and here's the impact on your system. We use a graph database to handle those types of use cases. On the ingestion side of things, we are using Kafka; there's a project called Strimzi that will be part of AMQ as well, and the Strimzi project runs all on Kubernetes too. We also use Logstash.
A: We have a homegrown shipping service which basically takes Jenkins artifacts from our build systems and pumps them into our system so that we can analyze those artifacts. We also use OpenWhisk: if you're familiar with serverless actions, OpenWhisk is an open source technology for those serverless actions, and that allows us to do things.
A: We use Spark, again on Kubernetes; there's a project called radanalytics, and we're leveraging their technology for Spark on Kubernetes. We also have JupyterHub deployed, and JupyterHub allows our data scientists to get access to the data in Ceph and Elasticsearch using Spark as the processing engine, but they can also use other images and other types of notebooks as well, with scikit-learn, PySpark, and things like that. On the reporting end, we have Kibana; we use that for our basic visualizations, and it's hooked into Elasticsearch.
A: All of that rolls into our service layer, where we have something called a common AI library. Internally, our use case for that is this: as the data scientists on our team and on other teams create these analytical models, there's a lifecycle that has to happen. They may play around with things, but then they publish a model into the execution engine, and we say, well, you know what, that was actually pretty cool; I think other teams might like this anomaly detection. So we're building out an AI library that allows the data scientists to take those models and put them in a place where other teams can leverage them: take that model, deploy it, but then pass their own data through it and get some results out. We're doing that in a number of use cases that we've just started, and we'll be publishing that AI library pretty soon. All of that sits on top of monitoring and alerting.
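The publish-then-reuse pattern described above could be sketched roughly as follows; every name here (register_model, score, the "anomaly-demo" model) is a hypothetical illustration, not the actual common AI library API, which had not been published at the time of this talk.

```python
# Minimal sketch of a shared "AI library" registry: data scientists publish a
# model under a name, and other teams pass their own data through it.
_registry = {}

def register_model(name, predict_fn):
    """Publish a trained model's predict function for other teams to reuse."""
    _registry[name] = predict_fn

def score(name, records):
    """Run someone else's published model against your own data."""
    return [_registry[name](r) for r in records]

# A toy "anomaly detection" model: flag values far from an expected baseline.
register_model("anomaly-demo", lambda value: abs(value - 100) > 20)

print(score("anomaly-demo", [95, 180, 104]))  # → [False, True, False]
```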
A: I'm going to skip this slide; it kind of shows the OpenShift side of things. A basic workflow that I'll show really quickly is how we have several different data sources. The top part here shows a little bit of how we consume data into Elasticsearch and Kibana: we have various data ingestion services that pull in data from various build systems, we take those logs and pump them through Kafka, again with the Strimzi project, and then we do some kind of normalization on that data.
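The normalization step just mentioned can be sketched in pure Python: raw build records arrive with inconsistent field names, and they are mapped onto one common schema before indexing. The field names here are illustrative assumptions, not the actual Data Hub schema.

```python
# Hedged sketch: map a raw Jenkins-style build record onto a common schema
# before it is indexed into Elasticsearch.
def normalize_build_record(raw):
    """Map a raw build-system record onto a common schema."""
    return {
        "job": raw.get("jobName") or raw.get("job_name", "unknown"),
        "build_id": str(raw.get("buildNumber") or raw.get("id", "")),
        "status": (raw.get("result") or raw.get("status", "UNKNOWN")).upper(),
        "duration_ms": int(raw.get("duration", 0)),
    }

raw = {"jobName": "rpm-build", "buildNumber": 42, "result": "failure", "duration": 91000}
print(normalize_build_record(raw))
# → {'job': 'rpm-build', 'build_id': '42', 'status': 'FAILURE', 'duration_ms': 91000}
```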
A: The model-designing side of this just shows a basic use case: we have data coming in and landing in Ceph, and then the data scientists use JupyterHub right now, along with Spark, to get access to the data that's stored in Ceph. We use a combination of Hadoop and Amazon drivers to get access to that data. We don't actually use Hadoop underneath the covers; we just use their jar files to get access to it using Spark.
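The access path described above (Spark reaching Ceph through the Hadoop S3A connector, the "Hadoop and Amazon drivers") is typically wired up with configuration along these lines; the endpoint and credentials below are placeholders for illustration, not the actual Data Hub values.

```properties
# spark-defaults.conf fragment: point the S3A connector at a Ceph
# S3-compatible RADOS Gateway instead of Amazon S3 (illustrative values).
spark.hadoop.fs.s3a.endpoint           http://ceph-rgw.example.com:8080
spark.hadoop.fs.s3a.access.key         <ACCESS_KEY>
spark.hadoop.fs.s3a.secret.key         <SECRET_KEY>
spark.hadoop.fs.s3a.path.style.access  true
spark.hadoop.fs.s3a.impl               org.apache.hadoop.fs.s3a.S3AFileSystem
```

With settings like these, notebooks can read `s3a://bucket/path` URIs from Spark without running any Hadoop cluster, which matches the "just their jar files" point above.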
A: On the deployment and execution side: what happens after the data scientists take their data model and deploy it? Usually they team up with a data engineer, and we have a number of different environments that they can deploy it on. If it's more streaming data or ad-hoc requests coming in, we normally push that to our serverless actions. An example of that: we just launched, actually next week we'll be launching, a sentiment analysis service. That sentiment analysis service will exist in OpenWhisk actions, so that as data streams in from various systems, we can do an analysis on that data in real time, process the results, and return the results back to users. But for some of the batch jobs that we have, such as model training or doing a batch execution of a model, that's when we go back to the workflow engine, and the data usually comes in off of Ceph to do that training.
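OpenWhisk Python actions expose a `main(args)` entry point that takes and returns a dict, so the sentiment service described above might look something like this sketch. The tiny word-list "model" is purely illustrative; it stands in for the real trained model the talk describes.

```python
# Sketch of a sentiment-analysis serverless action in the OpenWhisk style.
POSITIVE = {"great", "good", "love", "works"}
NEGATIVE = {"bad", "broken", "fails", "hate"}

def main(args):
    """OpenWhisk-style entry point: dict in, dict out."""
    words = args.get("text", "").lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"sentiment": label, "score": score}

print(main({"text": "The install fails and the docs are bad"}))
# → {'sentiment': 'negative', 'score': -2}
```

Because each invocation is stateless, actions like this can be triggered directly off the Kafka stream for the real-time analysis mentioned above.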
A: There's a little bit of a combination of both. The other thing we're working on, which would be part of the serverless actions as well, is the feedback loop. As models need to be revisited and corrected for accuracy, that's going to be done through the serverless actions too. An example of that is the sentiment analysis: if the entity detection or the sentiment analysis of that data comes back incorrect, then we have mechanisms, APIs that are going to be hooked into a UI, that allow the end users to modify that information, and then we retrain the model based on the information that came in. Then, on the tail end of this, is where we get into the AI and ML side of things.
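The feedback loop just described (users correct a prediction through a UI-backed API, corrections accumulate, the model is retrained) could be sketched like this; the function names and the threshold are assumptions for illustration only.

```python
# Minimal sketch of a correction-driven retraining loop.
RETRAIN_THRESHOLD = 3
corrections = []

def submit_correction(text, predicted, corrected):
    """Record a user-supplied label fix; retrain when enough accumulate."""
    corrections.append({"text": text, "was": predicted, "now": corrected})
    if len(corrections) >= RETRAIN_THRESHOLD:
        return retrain(corrections)
    return {"status": "queued", "pending": len(corrections)}

def retrain(batch):
    # Stand-in for kicking off a batch training job on the workflow engine.
    n = len(batch)
    batch.clear()
    return {"status": "retraining", "examples": n}

print(submit_correction("works great", "negative", "positive"))    # queued
print(submit_correction("totally broken", "positive", "negative"))  # queued
print(submit_correction("love it", "neutral", "positive"))          # triggers retraining
```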
We provide a number of different libraries for the data scientists, including the Spark ML library, scikit-learn, NLTK, and Keras, and we're also going to be rolling out TensorFlow with GPU enablement very soon, hopefully in the next month or so.
A: We'll have that available for the data scientists to work with, again all of that being part of OpenShift, and as more types of AI and ML models become available, we'll continue adding those to the images that we have, to make them available for the data scientists. And to kind of wrap this up: I talked a little bit about some of the services that we have, so this just goes through some of those again. For our cloud services, we're doing anomaly detection for infrastructure.
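A hedged sketch of what infrastructure anomaly detection on metric data can look like at its simplest: flag points that sit far from the series mean in standard-deviation terms. A real deployment would use a trained model; this z-score rule is only a stand-in to show the shape of the check.

```python
# Flag metric samples whose z-score exceeds a threshold.
import statistics

def anomalies(series, threshold=3.0):
    """Return indices of points more than `threshold` stdevs from the mean."""
    mean = statistics.mean(series)
    stdev = statistics.pstdev(series)
    if stdev == 0:
        return []  # flat series: nothing stands out
    return [i for i, x in enumerate(series) if abs(x - mean) / stdev > threshold]

cpu = [48, 52, 50, 49, 51, 50, 95, 50, 49]  # one obvious spike at index 6
print(anomalies(cpu, threshold=2.0))  # → [6]
```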
A: For some of our customers that are on the cloud services, we're actively monitoring the collective nature of all of our customers to spot interesting patterns, and then we're working with the service teams to help either resolve issues or offer up new opportunities for customers by analyzing that information. Then on the customer side, on the sentiment analysis and entity detection, we have an ongoing project with a number of teams where we're looking at support tickets, looking at feedback from engagements with customers and deployments of customer environments, feeding all of that data back into the Data Hub, and providing insights on what customers are talking about: trending information, what's working, what's not working, any kind of information we can use to help out on the support side of things. And then we also provide visualizations on top of the Data Hub to analyze the build information that comes into the system.
A: It becomes very important to add the right level of security, metadata, lineage, and auditing capabilities on top of that, so we're actually working with a number of customers to identify commonality across all the customers and provide recommendations, and also to loop that back into the internal Data Hub so that we can add to the governance. We're looking at things like OPA, as well as some of the Apache products, to help us with the data governance side of things. And then, on the AI model lifecycle, we need to elaborate on the use case of storing the models: giving them a proper repository, working on promotion from dev to test to prod, providing performance monitoring of how a model is actually executing in production, whether it's accurate or inaccurate and how it can be made more efficient, and then also backup of those AI models.
A
That's
that's
a
overview
of
where
we
are
with
the
data
hub
and
AI
as
it
as
it's
being
worked
on
in
Red
Hat,
and
certainly
it's
a
very
challenging
but
interesting,
and
we
will
be
publishing
very
shortly.
Not
only
articles
on
on
what
we're
doing,
with
with
the
different
deployments
of
the
data
hub,
but
then
also
providing
all
of
a
public
git
repository
as
well
as
quai
images,
so
that
anybody
can
take
this
deployment
of
the
data
built
on
top
of
open
ship
and
put
that
in
their
own
environments.
And
that
is
it.
B: Hey, really, really cool stuff, and certainly I've heard a lot of these problems over and over again, so I'm really excited to hear you're working on them. I was wondering: is there something we can do in Kubeflow to make doing this easier, or to make integration with the services that you're providing easier?
A: Yes, I'm glad you mentioned it; I can't believe I didn't bring up Kubeflow in all of this. Kubeflow is very much at the forefront of what our team is looking at as well. We're doing a lot of investigation with TensorFlow and getting an integration with Kubeflow going, so I don't have any strong answers as to how that's going to turn out just yet.
A
But
we
do
have
lots
of
Engineers
working
on
integrating
that
with
the
rest
of
the
ecosystem,
especially
as
it
deals
with
in
the
first
passages
is
making
making
cue
flow
available
for
the
data
scientists.
You
know
through
the
UI
and
then
once
we
do,
that
working
on
the
deployment
side
of
it
and
we've
looked
at
a
number
of
different
schools
to
help
out
with
the
promotion
of
models.
We
just
haven't
narrowed
down
on
one
that
we
really
like
just
yet
yeah.