Description
AMA on Open Data Hub with Landon LaSmith, Red Hat, 07/20/2020
A
We are bringing in one of the upstream projects that is one of the more interesting workloads these days (not that your workload is not important, but this is one of the more interesting ones): Open Data Hub, the AI platform, and its team, to come and tell you a little bit about this project at Red Hat. We have a number of members here: Juana, Vaclav, Chad, Landon, and Beverly. Landon LaSmith is going to walk us through a little overview of what Open Data Hub is first, and then we're going to open it up to Q&A and have an AMA session on this, as we like to do, along with a little bit of a demo of it.
A
So queue up your questions wherever you're watching this, whether it's Facebook, Twitch, or YouTube, or if you're in the BlueJeans session, and we'll aggregate those questions and answer them, hopefully after the demo and lecture part, and have a conversation about what Open Data Hub is and how to use it. So take it away, Landon. Hi.
B
So I'm just going to give a quick overview of Open Data Hub, and hopefully we can answer all of your questions. In this slide we're going to cover what Open Data Hub is, give a brief introduction to Kubeflow, which is the upstream project that we're in sync with, tell you where Open Data Hub is used, and give you a quick demo of how you can deploy Open Data Hub.
B
So what is Open Data Hub? The original goal of Open Data Hub is to build a platform for data science. We want to make it as easy as possible for a data scientist to stay within their workflow; we know that they have many tools that they use for model training, model development, and model serving.
B
We want a team of data scientists to be able to work on shared data using some type of storage, and to use a development environment that they're comfortable with, in this case Jupyter notebooks, but also to allow data engineers and DevOps to work within that workflow to create the best solution possible.
B
So this began what we are now calling the Open Data Hub. Open Data Hub is not an official Red Hat product; it is a community project. We set out to create a reference architecture to provide best practices on how you can deploy these different tools within this data science workflow.
B
So, with this operator, we can deploy the different tools that will be used in the workflow by a data engineer or data scientist, and make it easy for DevOps to deploy this project. If you want to deploy Open Data Hub, you can find it on any OpenShift cluster under the OperatorHub on that cluster; look for "opendatahub". It is a community operator that's available to install for free, no Red Hat subscription required.
B
So the Open Data Hub ecosystem combines a lot of different parts from which we gather input on the best use cases and best practices for Open Data Hub. We work with a lot of customers, internal and external, to lay out how we want Open Data Hub to proceed, and we take public requests: you can contribute to Open Data Hub.
B
We work with Red Hat partners to see if their tool helps further the Open Data Hub, and we work with a lot of upstream components that have downstream projects within Red Hat. Our goal is to use completely open products within the Open Data Hub, and also to provide a path where you could substitute in these downstream products if necessary. But everything is freely available.
B
These are a few of the components that are in Open Data Hub, in this nice graphic. We focused on Jupyter notebooks for the development environment, object storage provided by Ceph, Apache Spark for data engineering, Seldon for model serving, and Argo Workflows as the core pipeline technology that we've used in the past, plus Prometheus, Grafana, TensorFlow, and Kafka.
B
With the release of Open Data Hub 0.6 (we're currently on version 0.7), we became an official downstream of Kubeflow. The Kubeflow project brings together all of these data science tools into an ecosystem that works on Kubernetes, and we do the work to make sure that this workflow also works on OpenShift. We also bring in a lot of products that aren't covered by Kubeflow, and all of this is available in OperatorHub. All right.
B
This graphic shows our original release, so a little bit of backstory about Open Data Hub. Probably a year ago we had our official release of 0.5. This contained a few of the components: JupyterHub; a data catalog that contains Hue, Hive, and Thrift; GPU support; and Argo, all in the Ansible operator. With the switch to being a downstream of Kubeflow, we refactored and updated our operator so that it's purely based on Go.
B
It works with the KfDef manifest and it fully supports Kubeflow products. So, using this Open Data Hub operator, you can deploy Kubeflow on OpenShift, in addition to the Open Data Hub components.
B
In the current release, 0.7, you can see a few of the components we have released. We have full support for Kubeflow version 1.0: you can deploy that with our operator on OpenShift. There is KFServing support with our operator (I think this might be mixed in; we could use this with Open Data Hub), and full CI testing on all of our updates and releases.
B
As soon as we submit any updates to Open Data Hub, we run a full battery of CI tests to make sure that a new component doesn't break any existing functionality, but also provides working new functionality. You can mix and match ODH and Kubeflow components.
B
Right now we're verifying a small subset, but with the 0.8 release we plan to verify and test all of the default Kubeflow 1.0 components mixed in with ODH components and OpenShift Container Storage.
B
The current operator for Open Data Hub is a phase one, basic-install operator.
B
This means that it will deploy Open Data Hub and do some minor updates, but for the most part we're doing a full install. We have plans, as time goes on throughout the year, to bring this up to a phase five operator, but those are long-term plans. As of right now, you can deploy your Open Data Hub ecosystem.
B
Kubeflow, for those that may not be aware of it, is an open source project dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.
B
A lot of the work we did to bring Open Data Hub in line with Kubeflow was to make sure that there are no issues when deploying from Kubernetes to OpenShift. We had to introduce a lot of updates and fixes to make Kubeflow more secure.
B
We want to make sure that not every container is running with elevated privileges, that you don't have to elevate any container privileges beyond the standard runtime permissions, and then we verify and make sure that model training and serving work on OpenShift.
B
These are a few of the goals for Open Data Hub in working with Kubeflow. We want to incorporate best practices.
B
Okay, sorry, and a simplified install. We want to use the UBI, or Universal Base Image, as the base for all of the Open Data Hub components. This provides anybody deploying Open Data Hub with the level of security that comes with using that UBI base image.
B
So you get a lot of the Red Hat effort on providing a secure base image in Open Data Hub, and we also want to make sure that we secure the deployment of Open Data Hub and, by extension, Kubeflow; that's done using well-defined permissions.
B
This is a quick graph of some Open Data Hub components that we are bringing to the new releases, 0.7 and 0.8.
B
We're working on allowing you to deploy storage along with Open Data Hub, based on Ceph object storage.
B
We have components that are using Postgres, and as of right now you can deploy Kafka and you can deploy Spark clusters. We're working on updates to provide data exploration, though we do have Superset, which allows you to do data visualization.
B
So you can work directly with your external databases or data sources to visualize that data. We're working on adding data cataloging with Hue, so that you can navigate your object storage but also run Spark SQL queries on that data; we're hoping to get that into the next release. And I think we currently do support the ability to mix Open Data Hub and Kubeflow components for TF Serving.
B
We fully support OpenShift authentication for those notebooks, JupyterHub being a multi-user notebook server.
B
If you want to join Open Data Hub or follow it, as always, feel free to go to our website at opendatahub.io; we are fully functioning on GitHub under github.com/opendatahub-io.
B
If you want to track any issues or progress that we're making in the project, all of our Open Data Hub projects exist under that opendatahub-io organization. Again, we're a community project, so feel free to take a look, file issues if something doesn't work correctly, or submit PRs. If you see an issue or you want to add a new feature, definitely go there and submit a PR.
B
If you want to track progress, we have an announcements list you can subscribe to, and then a contributors list if you go the extra mile to submit PRs and want to become a contributor. We also have bi-weekly Open Data Hub community meetings, whose archives you can track on our GitLab site. I want to clear up some confusion here: our old operator exists on GitLab, but we needed to make sure that we can stay in sync with Kubeflow updates and become a fully functioning downstream of Kubeflow.
B
So we migrated to GitHub, but a lot of our old projects are still on GitLab, the Open Data Hub community being one of those, and it's still current for the Open Data Hub community. You can see old meetings and get notes from any meetings where we have guests present use cases that are utilizing Open Data Hub, or volunteering and opening the discussion to add new features to Open Data Hub.
B
These are some examples of where Open Data Hub is being used. Originally, Open Data Hub was an internal project that started with the basic ELK stack, if I remember correctly, and we worked with internal customers so that they could work with their data in an easy fashion. We provided storage and Elasticsearch to interact with that data, and from that we got a lot of customer use cases that helped to form the Open Data Hub.
B
One of the early adopters of Open Data Hub is the Massachusetts Open Cloud. It's a collaborative effort of a few universities to run their data science and high-resource workloads on an open, high-availability cloud, so Open Data Hub is part of the backbone for some of this work, where professors, researchers, and even some students can get access to run their workloads.
B
So I'll give a quick demo. I just want to demonstrate how you can get access to Open Data Hub and deploy it within your workspace. Let me switch over to my OpenShift console. Here I have a basic OpenShift cluster; potentially you could deploy this on any OpenShift cluster. Right now I'm using a three-worker-node cluster, which is pretty standard for any OpenShift install.
B
We do have support for deploying on something as small as a CRC (CodeReady Containers) cluster. You could also use OKD, which I think just went GA (general availability) for OpenShift 4 clusters. Right now the current iteration of Open Data Hub supports OpenShift 4.x; the current version is 4.5, which I think was released a week or two ago.
B
And again, it's available as a community operator, which means it's freely available for anybody to deploy on any OpenShift cluster. You'll get a rundown of the current components that we deploy as part of Open Data Hub, with additional info about where you can track the project, the operator image we're pulling from, and other information. This also describes the available channels that you'll see in the next step.
B
The update channel is beta; beta is what you want to use right now, as that is where we're hosting our new operator. Legacy is the older namespace-bound operator, the older Ansible operator that still works, but we are providing only minimal support for it, so a lot of the components that are deployed there will not be receiving updates, since we're doing all our updates on the beta channel. We'll leave the approval strategy as Automatic.
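The channel and approval settings described here map onto a standard OLM Subscription object. A minimal sketch, assuming the community catalog source and operator package names commonly used on OpenShift (check the OperatorHub entry for the exact current values):

```yaml
# Illustrative OLM Subscription for the Open Data Hub community operator.
# 'beta' channel and Automatic approval match the settings discussed above.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: opendatahub-operator
  namespace: openshift-operators
spec:
  channel: beta                    # 'legacy' is the older Ansible-based operator
  installPlanApproval: Automatic   # new versions install as soon as they are released
  name: opendatahub-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
```

With Automatic approval, OLM replaces the running operator whenever a newer version lands on the beta channel, which is exactly the behavior described above.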
B
This means that whenever we release newer versions of Open Data Hub, they'll be available and installed; the operator will update automatically. And hit Subscribe. Now we're just waiting for the operator to be installed by OLM. OLM is the Operator Lifecycle Manager, and we utilize OLM a lot.
B
In the older operator, one of the issues that we encountered was that we had to recreate the deployment strategy for every component we deployed. So if we deployed Prometheus, we had to create the deployment objects, the roles, the service accounts, every single item that was required to deploy a component.
B
Now, if there is a component that Open Data Hub uses that is available in OperatorHub, so whatever component has put forth the effort to be listed on OperatorHub, we can easily leverage that OperatorHub entry for Open Data Hub. We're not recreating the deployment strategy or plan for every component; we can literally say, for Seldon version 1.2, reach out to OLM and deploy that operator.
B
That's good because we aren't required to stay in sync with their update strategy: as Seldon updates their operator and pushes that to the OpenShift OperatorHub, we automatically get those updates for that version, and OLM will handle the deployment strategy. So now that the operator has deployed, we'll just click on Open Data Hub and you'll get another overview of the deployment.
B
Any time you create a KfDef resource, which is essentially the Kustomize manifest format for Kubeflow, on an OpenShift cluster, the Open Data Hub operator will see it and, based on the information that's in there, deploy it. So we'll go ahead and click Create Instance.
B
Each entry has the same basic format: a Kustomize config with a repo ref and a name, then another Kustomize config, repo ref, and name. This determines what is getting deployed as part of this KfDef.
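The "Kustomize config plus repo ref plus name" structure described here looks roughly like the sketch below. The component names, paths, and repo URI are illustrative, not copied from the demo; the real example KfDef is linked from the operator's OperatorHub page:

```yaml
# Sketch of a KfDef: each application entry pairs a kustomizeConfig
# (with a repoRef pointing into a manifests repo) with a name.
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: opendatahub
  namespace: odh
spec:
  applications:
    - kustomizeConfig:
        repoRef:
          name: manifests        # the odh-manifests repo (Open Data Hub proper)
          path: kafka/cluster    # 'cluster' portion: CRDs, cluster-wide options
      name: kafka-cluster
    - kustomizeConfig:
        repoRef:
          name: manifests
          path: kafka/kafka      # application portion: the Kafka deployment itself
      name: kafka
  repos:
    - name: manifests
      uri: https://github.com/opendatahub-io/odh-manifests/tarball/master
```

Editing this resource (adding, removing, or reordering applications) and saving it is what triggers the operator to reconcile, as shown later in the demo.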
B
Here you'll see that we're deploying ai-library-cluster and the ai-library operator. One of the things that we set out to do whenever we add a component to Open Data Hub is to separate out the cluster-wide permissions and cluster-wide actions, mainly things like deploying to a cluster-wide namespace.
B
Checking that required CRDs exist lives in that cluster component, and anything specific to the deployment of the operator or application exists in the operator deployment. Sorry, so a lot of these components will have two portions, or two configs. Here you'll see kafka-cluster and kafka: anything that's not named "cluster", so kafka, actually has the deployment files necessary for the Kafka deployment.
B
"Cluster" is generally the CRDs and any required cluster-wide options. As you look through this KfDef, you'll see all the components that we're deploying: Kafka, Grafana, the radanalytics Spark operator, Prometheus, JupyterHub (JupyterHub will be the entry point to a lot of the use cases for Open Data Hub if you watch any demos or examples), Airflow, Argo, and so on and so forth.
B
With the latest release of Open Data Hub, this is one of the new features we wanted to focus on. You'll see in this repos section that we have kf-manifests and the regular manifests. kf-manifests is a downstream fork carrying the fixes and updates that are required to deploy Kubeflow on OpenShift.
B
If you go to github.com/kubeflow/manifests, that is the pure vanilla Kubeflow deployment that will work on Kubernetes, and they do have support for additional cloud providers: Azure, IBM, Google Cloud. But in this opendatahub-io fork of manifests are all the files, updates, and fixes that you need to deploy on OpenShift, and the plain manifests, which is the odh-manifests repo, is the Open Data Hub proper.
B
If you see anything that references kf-manifests as a repo name, it is based on the upstream deployment of Kubeflow, to which we have added a few fixes to make sure that it deploys successfully on OpenShift. Right now I think everything just references manifests, but as the next versions are released, you'll start to see more and more mixing of Kubeflow and Open Data Hub components, so potentially you'll see the TFJob operator, the PyTorch operator, maybe even some pipelines work.
B
And now we just wait for everything to deploy. Slowly you'll see the different components come online based on that KfDef: the AI Library operators, the Seldon controller, Superset, and so on and so forth, and once these pods come online, they've deployed successfully.
A
Certainly, we always have questions. Here's one of them; maybe you can explain a little bit while you're doing this. One of the quick questions that's often asked is: is Open Data Hub available for generic Kubernetes? Which flows into the question about whether Open Data Hub is available on operatorhub.io.
B
Yeah, so there's a lot of confusion between the OperatorHub that you see in OpenShift and operatorhub.io. The operatorhub.io website is for operators that are certified to work on vanilla Kubernetes, so not OpenShift but the upstream Kubernetes server.
B
We are certifying that we work on OpenShift, which means that we are only available in the OpenShift OperatorHub that is deployed with all OpenShift clusters. So just because you don't see us on operatorhub.io does not mean that Open Data Hub isn't available on OperatorHub; it just means that we're certifying that we work on OpenShift, so any OpenShift deployment, whether it's OKD, CodeReady Containers, or OpenShift on AWS or OpenStack.
A
Yeah, and you did mention, and I'll mention this while we watch your screen scroll here, that OKD is now available. OKD is the open source distribution of OpenShift, and it went into general availability on July 15th. It's running on Fedora CoreOS, and you should be able to deploy from the OperatorHub.
A
The Open Data Hub operator should install easily from the OperatorHub on OKD, and I don't know if anybody's tested that yet, but if you have, let me know. I'm one of the chairs of the OKD working group; we'd love to get your feedback on that and help you through it if there are any issues whatsoever. I don't think anyone on the ODH team has done that yet; it's probably too soon, since that release was just last week, so we'd definitely have to get that tested.
B
Yeah, and just to build on top of what Diane just said: if you deploy it on OKD, or on an OpenShift cluster on any infrastructure provider, and you're experiencing issues, please, please submit the issue to any of our projects.
B
If you are deploying pure Kubeflow on OpenShift, then feel free to file that against the opendatahub-io organization's manifests repo, and if you file it against the wrong one, that's fine; we will definitely make sure it goes to where it needs to be. Definitely.
A
We'll straighten you out and point it in the right direction. And really, if you're listening to this and you are running this reference architecture, or want to, please do reach out. I'm seeing it pop up in lots of conversations across the ecosystem, from health care to COVID-tracking work to all kinds of interesting things, so it's definitely starting to get a lot of overflow into other spaces, market spaces, and use cases.
B
Everything's deployed; we're missing one key thing for JupyterHub, but I'll investigate that and we'll go from there, so we can open it up to other components. So, are there questions? Let's see. One of the things I'll say: whenever you deploy Open Data Hub, we make sure that everything's ready in a state where you can use it automatically.
B
If any of the components need to be accessible, so they're not just backing components where component A is just utilizing a service from component B, if it's something that the user needs to interact with, we make sure that there's an OpenShift route created to it, so that once Open Data Hub is deployed, you can just go to Networking, then Routes, and access that component. Here you'll see Superset.
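The route creation described here is ordinary OpenShift routing: each user-facing component gets a Route pointing at its in-cluster Service. A sketch of what such a Route looks like, with illustrative names (the operator creates the real ones for you):

```yaml
# Sketch of the kind of Route the deployment creates so user-facing
# components (e.g. Superset) are reachable from outside the cluster.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: superset
  namespace: odh          # illustrative project name
spec:
  to:
    kind: Service
    name: superset        # in-cluster Service fronting the Superset pods
  tls:
    termination: edge     # TLS terminated at the OpenShift router
```

Backing-only components (component A consuming component B's service) stay on plain Services with no Route, which matches the distinction drawn above.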
A
Okay, maybe while you're doing this, we can answer a few more questions, and I'll unmute some of the other folks from your team. You can debug it and just raise your hand when you figure it out, or not, and we can go from there. So let's see who else we have: Juana is here.
A
I'm sorry, and Vaclav is here. So, while he's debugging, a couple of other questions. I think you answered the one about where it stands in terms of vanilla Kubernetes versus OpenShift, and I think we do have a pretty strong, fully open source stack with the complement of OKD now.
A
So anybody who wants to do a full stack without licensing OCP could, if they would, and I'll see if I can get the OKD working group to find someone to test it out for us. But some questions came in, and Beverly is probably going to guide us through some of them, maybe starting with the first question, if you want to go through that.
D
Absolutely. So, Juana: are all components from Kubeflow available or included in Open Data Hub?
C
Actually, no, not all of them are. For example, I could say KFServing today doesn't work with the Kubeflow 1.0 that we have, and if you look at the example manifest that is linked through our operator's main page description, you'll see that some components are commented out. These are the components that we are actually still working on, to get them working on OpenShift.
C
I wanted to add something, but I forgot what it was. Anyway, that's mainly what it is: our community meeting is always busy with many different developers from different companies, and we do work really closely with many of the component owners, such as Seldon and Kubeflow.
A
Yeah, so I think, as this community expands, the end users become really important, because they're giving feedback on how it's being used, and the integration partners like Seldon and others become important too. So it will be interesting to see how the ecosystem grows around this, because you have incorporated a whole lot of partner and integration points there. That's going to be fun to watch as we go through.
C
From a use case perspective, from an industry perspective, we do have one use case that's already out there, which is the fraud detection use case. We have all the code and all the instructions on GitLab for it. Then we're working on a couple more: we also have AI on the edge, which Landon's working on, and then we're working on other industries; I think we have one in the banking industry and a couple more coming down the line.
D
I mean, that answers the question, but we could also look at it in terms of: do we maybe have clients that are already using Open Data Hub in their infrastructure?
C
Yeah, so we do have a couple of clients. We have ExxonMobil using it, and they did many presentations with regard to using Open Data Hub. We also have an internal implementation of Open Data Hub that is being used by internal data scientists and data engineers at Red Hat, and then we also have the MOC that Landon described, and I'm sure Vaclav can add a couple more about this and where it is today.
E
So with the MOC, we are working on support for Open Data Hub on POWER9 machines and clusters of OpenShift, and we have Open Data Hub deployed in the MOC, where it is being used by students for their research work. We had a couple of early adopter projects.
E
It's been kind of hard to keep it running there, so we are working with them; we have weekly syncs to see where they are, and when they have OpenShift 4 ready for us, we will come back to having Open Data Hub fully running there. Part of our roadmap, which you can find on opendatahub.io as well, is for the next release to have a plan for how we could do continuous deployment.
E
Landon mentioned we have an internal deployment of Open Data Hub running internally at Red Hat, and then we have that partially public deployment on the MOC, where the researchers that are part of the Massachusetts Open Cloud can use it. Our goal for the next release will be to come up with a plan for a reproducible, continuous deployment solution, or process, rather.
E
A process where our new releases of Open Data Hub would go to our internal instance and to the MOC-deployed instance, and it would hopefully also be reproducible for our users, where they can use that process to bind their own deployments to our releases, and so on.
E
Yeah, definitely, it is a big ask, and not only for Open Data Hub but also for Kubeflow, the upstream project that we pull components from. There are plenty of people that are running disconnected, be that with edge deployments or in remote locations where they maybe only have mobile connections or something like that, and they need to be able to make sure that they can control the traffic that is coming in and out of their clusters.
E
So we want to make sure that that is possible with Open Data Hub, and that when deploying it, everything goes smoothly: they can pre-pull the images and they can deploy to a disconnected environment whenever they are ready. We'll be looking at that probably this fall.
E
We've been looking at that for some time in our previous versions, which were based on the Ansible operator, but it was kind of hard, because that required a lot of parameterization with Ansible; it was all the repos, all the registries, all the images, and it was kind of a mess. So we hope that, with this Kubeflow-based solution, it will be a bit easier. Kubeflow was also working on that in the past; I'm not 100% sure if they were able to finish it.
E
But we will definitely look at the Kubeflow solution for that, if there is any, and if not, maybe we can help finish it or bring it back to the community and see what they have in mind for that.
A
Thanks. I know, because we had someone come to the OKD working group who wanted to run an ML use case on ARM64 using OKD in a disconnected fashion.
A
So I think maybe Open Data Hub is a bit of overkill for what they were trying to do, but I think it gives them a good roadmap and maybe a good collaboration point to work through, so I'll see if I can feed you that use case as well.
E
It's an interesting point whether Open Data Hub is overkill: you don't have to use all the components, right? If your only reason to run Open Data Hub is to deploy Seldon and something else, then maybe it's still good to use Open Data Hub, because we have verified that the components run well on OpenShift, and there are some integrations, with more coming. If it's just one component, and it's already on OperatorHub and we just depend on it, maybe it doesn't make sense to use Open Data Hub.
E
But if it's, like, three things that you would be running, it gives you kind of a single point where you just apply that one custom resource and it all comes up, all integrated and configured.
B
Yes. So, just so it doesn't look like magic or anything, here's what we did. I was playing around with the KfDef, and the operator was throwing an issue with the Grafana deployment. It was a timing issue: we're relying on OLM to deploy Grafana based on the grafana-cluster configuration, since we have it separated into grafana-cluster and the grafana application.
B
We needed a little wait time of a few seconds in between the two, because one of the dependencies that the Grafana deployment required wasn't present yet; it would have been deployed by grafana-cluster. So it was a small race condition.
B
So let me just go over what we did.
B
I moved the Grafana component to the bottom. This is pretty simple: I just cut this text from higher up in the KfDef YAML and moved it to the bottom. What happens is, once we save that, it triggers an update, which the operator will detect, and then it will reprocess that KfDef.
B
Because of the previous attempt to deploy Grafana, all the dependencies were already installed, and now we can deploy Grafana successfully. That's all I did, and what that did was unblock the dam: once that was resolved, all the components below Grafana deployed successfully. So you'll see we have a lot more deployments now; Argo is available, and Grafana is online.
B
Luckily, I didn't expose my password to the world. Again, we're using OpenShift OAuth for a lot of these components by default. And here is one of the customizations we've added to JupyterHub, where you can select your notebook from a list of notebooks. We have a minimal notebook, which is just bare bones.
B
I think it just has Python installed. Then there's a SciPy notebook for the SciPy library; a Spark notebook that has Spark 2.4.5 and Hadoop 2.7.3; the Spark SciPy notebook; and a TensorFlow notebook. Any user that deploys this has access to these, so if they have basic, minimal read access to the namespace, they can deploy their own notebook.
B
If you wanted to change the small/medium/large configurations to be, say, 10 CPUs and 256 gigabytes of memory, or even larger than that, or a profile with a small CPU count but large memory, you can do that. If the user wants to add any environment variables, they can, and then they just spawn the notebook. So this is under the full control of the user.
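Under the hood, those small/medium/large choices are just container resource requests and limits applied to the spawned notebook pod. A hypothetical sketch of what such a sizing configuration could look like; this shows the idea only, and the field names are not the exact schema the ODH JupyterHub customization uses:

```yaml
# Hypothetical notebook size profiles: each size maps to resource
# requests/limits on the spawned notebook container.
sizes:
  - name: small
    resources:
      requests: {cpu: "1", memory: 2Gi}
      limits:   {cpu: "2", memory: 4Gi}
  - name: large-memory          # small CPU count but large memory, as in the example
    resources:
      requests: {cpu: "2", memory: 96Gi}
      limits:   {cpu: "4", memory: 256Gi}
env:                            # user-supplied environment variables
  - name: S3_ENDPOINT           # illustrative variable name
    value: "http://ceph-rgw.odh.svc:8080"
```

The spawner then merges the chosen size and any user-provided environment variables into the pod spec it submits.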
B
At
this
point
they
don't
have
access
to
the
project
space
where
jupiter
hub
is
running,
but
they
have
full
access
to
their
notebook
pod.
B
D
Yeah, so we've got a question: since Open Data Hub is a platform, or a blueprint for building an AI-as-a-service platform, can you talk about whether it works with GPUs?
B
Yes, it does. We have full support for GPUs. Open Data Hub itself does not do the GPU enablement, but in the notebook spawner we just used, a user has the option of requesting enabled GPUs.
B
Okay, so since it's in Red Hat Operators, you will need a fully subscribed cluster, if I'm not mistaken, but access to the NVIDIA operator is "free" (I'm using air quotes). If your cluster has access to Red Hat Operators, then you have access to the NVIDIA operator, and the NVIDIA operator is responsible for doing the GPU enablement. You provide the GPU node and install the NVIDIA operator, and it will handle the rest; usually it requires one other dependency to be installed, the Node Feature Discovery operator.
B
The Node Feature Discovery operator is a dependency for NVIDIA; it will essentially catalog every node in the cluster and give you labels for the different hardware features that are available. Once it sees the right label, the NVIDIA operator will go out to that node and install the appropriate drivers for the GPU that's installed. Once that's done, the GPUs become requestable resources.
B
So at this point, Open Data Hub can request X number of GPUs, and from there you can spawn any notebook that will request the GPUs, and then you can use them in your model development.
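The "request X number of GPUs" step boils down to a standard Kubernetes resource request once the NVIDIA operator has exposed the `nvidia.com/gpu` resource. A minimal sketch (the pod and image names here are made up):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: notebook-gpu                        # hypothetical
spec:
  containers:
    - name: notebook
      image: example.com/notebook-image     # hypothetical
      resources:
        limits:
          nvidia.com/gpu: 1                 # number of GPUs this pod needs
```

The scheduler only places the pod on a node with an unassigned GPU, which is the behavior described in the answers below.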
D
Yeah, that was great, Landon. And can you talk about what the AI Library is?
B
That's
a
good
question.
Maybe
chad
has
an
answer
for
that.
I
know
he's
done
some
work
with
the
ai
library.
B
So again, if you have any questions about any of the components that we provide in Open Data Hub, or you want to know more about them, feel free to go to opendatahub.io. We're always improving the docs and increasing the amount of documentation that's available for the different components. So as we add a new component, or update a component and add features, it will be available on the opendatahub.io website.
B
But I think the AI Library is a collection of models that you can use in your workflow, and we're using Seldon. Seldon is a dependency for the AI Library: any of these models will be deployed with it, and the API will be available for you to submit data to.
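For context on what "submit data to the API" looks like: Seldon's default JSON protocol wraps inputs in a `data.ndarray` field. A minimal sketch that only builds the request body (the endpoint URL is hypothetical and depends on your deployment name and cluster route):

```python
import json

# Hypothetical route; the real URL depends on the Seldon deployment
# name and your cluster's ingress configuration.
ENDPOINT = "http://<seldon-route>/api/v1.0/predictions"

def build_payload(rows):
    """Wrap a batch of input rows in Seldon's default JSON protocol."""
    return json.dumps({"data": {"ndarray": rows}})

payload = build_payload([[1.0, 2.0, 3.0]])
# This string would be POSTed to ENDPOINT with
# Content-Type: application/json to get a prediction back.
```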
A
Then, back to the GPU topic, we just got a question coming in from Oleg on YouTube: how do you automatically run calculations on a free GPU? Is there any spawner for that, for GPUs with some RAM available?
E
Sure.
How does it generally work in OpenShift with GPUs? Basically, when you spawn a container, it requests some resources, which can be CPU, memory, or special resources like GPUs. The container will run on a node that, based on that configuration, can give it those resources. So for memory, if you ask for 100 gigabytes of RAM and there is a node that can accommodate that container, it will get it. The same goes for GPUs:
E
If
there
is
a
node
that
has
a
free,
gpu,
unassigned
gpu,
it
will
get
that
gpu
right
now.
There
is
no
good
solution.
As
far
as
I
know
for
like
splitting
gpus
or
something
so
we
are
talking
about
using
a
gpu.
It
will
be
one
gpu
per
container
or
multiple
gpus
per
container,
but
cannot
be
multiple
containers
per
gpu.
There
are
hex
around
it,
but
it
doesn't
really
really
really
work
yet
so
for
this
we
cannot
automatically
run.
E
So,
if
you
just
say
in
your
code
running
in
a
container
hey
if
it's
there,
if
there
is
a
gpu
I'd
like
to
run
this
on
gpu,
that's
not
how
it
works.
The
container
the
pod
itself
has
to
specify
that
it
requires
the
gpu
to
run,
and
if
there
is
a
free
gpu,
it
will
be
assigned
to
that
node
and
it
will
be
run
on
that
node
and
it
will
get
the
gpu
kind
of
mounted
into
the
devices
of
the
container
and
the
code
inside
a
container
can
use
that
problem
with
that.
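Inside a container that did get a GPU, the device shows up under `/dev`, so code can branch on that at runtime. A hedged sketch (the device path assumes the NVIDIA driver's usual naming; frameworks such as PyTorch offer their own equivalent checks):

```python
import os

def pick_device() -> str:
    """Return "cuda" if an NVIDIA device is mounted into this container,
    otherwise fall back to "cpu".

    This is only meaningful after scheduling: as described above, the
    pod must have requested nvidia.com/gpu to get the device at all.
    """
    return "cuda" if os.path.exists("/dev/nvidia0") else "cpu"

device = pick_device()
```

This handles the in-container fallback only; it does not solve the cluster-level "run on a GPU only if one is free" scheduling question discussed next.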
E
As long as that container runs, the GPU is allocated to it and cannot be allocated to anything else. So there is no smart way right now to say, "okay, if there is a free GPU, run on the GPU; if not, do not run on the GPU." We don't have that, and I don't think anyone really does.
E
Potentially, you could write an operator that would take your code and inject some information about whether there is a free GPU or not, and then based on that the code would change its execution path, and based on that information from the cluster it would either get GPUs or not. But I haven't seen such a solution yet. In general, there is no automated way to decide this, in Open Data Hub or in OpenShift in general.
A
Cool, thanks for that. I'm looking at the time and we're almost at the end of the hour, so maybe, Landon, if you want to share that slide where people can find additional resources again while we blather on a little bit more, that would be a great way to end the hour. And I'm just wondering, because you have lots of different partners and different integrations into this: have you done anything, just because they are now part of our family, with the IBM Watson stuff?
A
Has anyone integrated any of that and used it from a Jupyter notebook, or is that something for a future briefing?
E
We actually have an issue created around that. IBM provides a CUDA-enabled container image, and we basically cannot redistribute those as Open Data Hub: as Red Hat, we cannot redistribute or put CUDA binaries in our images, we always have to build them on the spot.
E
So
if,
if
you
deploy
open
data-
and
you
want
to
use
gpus-
and
you
want
to
have
good
enabled
images-
you
have
to
build
them
in
your
cluster,
which
is
fine,
we
provide
all
those
build
configs
and
everything
it
just
takes
time
and
some
resources
for
the
build,
whereas
ibm
provides
these
images
in
actually
red
hat
registry.
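As a sketch of the in-cluster build approach (the names and repository URL below are placeholders, and the real Open Data Hub build configs differ in detail), an OpenShift BuildConfig that produces a CUDA-enabled notebook image looks roughly like:

```yaml
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: cuda-notebook                              # hypothetical
spec:
  source:
    git:
      uri: https://example.com/cuda-notebook.git   # placeholder repo
  strategy:
    dockerStrategy: {}      # build from the repo's Dockerfile on-cluster
  output:
    to:
      kind: ImageStreamTag
      name: cuda-notebook:latest
```

Building on-cluster like this is what keeps the CUDA bits out of redistributed images, at the cost of build time and resources.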
E
So
we
have
an
issue
for
looking
at
whether
we
can
use
that
image
as
a
base
for
some
of
our
jupiter
notebook
images
so
that
we
don't
have
to
rebuild
build
them,
but
we
can
actually
leverage
what
they
already
provide
on.
The
other
front.
Ibm
is
very
active
in
keep
flow
communities,
so
we
are
talking
to
them.
Often
in
our
community
calls
and
keep
flow
community
calls
and
coordinating.
E
Basically, the Kubeflow Operator, which is the base for the Open Data Hub operator, has been built by the IBM team with our guidance, and with our contributions of ideas, documentation, and some code, but they did the majority of the work, the open source team at IBM. So a really, really good collaboration there.
A
Awesome. Well, we're going to have to get them on again soon and see if we can't make that all work, and explain how that all works too. I really want to thank Beverly for stepping up and organizing this today and making it happen, and the whole team from Open Data Hub for coming, answering questions, and sharing your wonderful project. And congratulations: it's really come a long way since the last time I did an open upstream conversation on it.
A
It's really amazing to see all this. And I know I've been talking with folks like Guillaume Moutier about some of the work that he's doing up in Canada, with the COVID project that the Ontario folks are doing, and hopefully we can get him back on again to talk about that.
A
And we will upload this, and link it and the slides on the YouTube channel, RH OpenShift, and I'm sure the Open Data Hub folks will steal that video and put it out on their feeds as well. So look for that shortly.
Thank you all for taking the time today. Take care and be well.