From YouTube: Lightning Talks: Diane Feddema & Zak Hassan: Red Hat
A
Okay, so Zak and I are going to talk briefly about machine learning. We're both practitioners: we're using machine learning on Kubernetes right now, and we're using the s2i tools that we talked about earlier. We are part of the radanalytics.io team, which is creating the tooling to make it really easy to run these machine learning algorithms and include them in your pipeline on OpenShift.
This is a really simple overview of the software stack: OpenShift, then our radanalytics.io tooling on top of that, then Apache Spark, which Zak will talk about next, and then your application. That could be something like an online retail site. I have an application for running all of our performance tests, and I've added an intelligent portion to it: a machine learning component, which improves the user experience and does some prediction for me. So Zak will tell us a little bit about Spark now and what it does.
B
So we built an analytics platform on top of OpenShift, and Apache Spark is the core engine for our analytics. It comes with different APIs: you can use machine learning, streaming, or graph processing, as well as Spark SQL. It also comes with lots of language bindings, so if you want to do your work in Python, Scala, or Java, there are s2i builder images that you can utilize.
A
Primarily I've used it myself for the algorithms listed at the bottom: clustering, and things like random forests and regression. Here are some examples of how you might take a regular application that's just doing your transactions on the web and turn it into something that uses one of these machine learning algorithms. On the Airbnb site, for instance, they use alternating least squares to give you recommendations about places you might like to stay. Say you go to the site and a place where you would normally like to stay is already booked.
A
They will use alternating least squares to give you a bunch of other recommendations about where you might want to stay instead.
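To make the alternating least squares idea concrete, here is a minimal plain-Python sketch. It is not Spark's implementation and not anything Airbnb has published. The guests, listings, scores, and the `REG` penalty are all made up for illustration, and the model is deliberately rank 1 (one number per guest, one per listing) so that each update has a simple closed form.

```python
# Hypothetical ratings: (guest, listing) -> score. Missing pairs are unknown.
ratings = {
    ("ann", "loft"): 5.0, ("ann", "cabin"): 4.0,
    ("bob", "loft"): 4.0, ("bob", "cabin"): 3.0, ("bob", "villa"): 5.0,
    ("cat", "cabin"): 2.0, ("cat", "villa"): 4.0,
}
guests = sorted({g for g, _ in ratings})
listings = sorted({l for _, l in ratings})
REG = 0.1  # ridge penalty; an assumed value, not a tuned one


def loss(u, v):
    """Squared error on observed ratings plus the ridge penalty."""
    err = sum((r - u[g] * v[l]) ** 2 for (g, l), r in ratings.items())
    penalty = REG * (sum(x * x for x in u.values()) + sum(x * x for x in v.values()))
    return err + penalty


def als(steps=20):
    """Alternate between solving guest factors and listing factors."""
    u = {g: 1.0 for g in guests}
    v = {l: 1.0 for l in listings}
    history = [loss(u, v)]
    for _ in range(steps):
        # Hold listing factors fixed; each guest factor is a 1-D ridge solve.
        for guest in guests:
            seen = [(l, r) for (g, l), r in ratings.items() if g == guest]
            u[guest] = sum(r * v[l] for l, r in seen) / (REG + sum(v[l] ** 2 for l, _r in seen))
        # Hold guest factors fixed; solve each listing factor the same way.
        for listing in listings:
            seen = [(g, r) for (g, l), r in ratings.items() if l == listing]
            v[listing] = sum(r * u[g] for g, r in seen) / (REG + sum(u[g] ** 2 for g, _r in seen))
        history.append(loss(u, v))
    return u, v, history


def recommend(guest, u, v):
    """Listings the guest has not rated, best predicted score first."""
    unseen = [l for l in listings if (guest, l) not in ratings]
    return sorted(unseen, key=lambda l: u[guest] * v[l], reverse=True)


if __name__ == "__main__":
    u, v, history = als()
    print(recommend("ann", u, v))
```

Because each half-step exactly minimizes the loss with the other side held fixed, the loss cannot go up from one sweep to the next, which is what makes the alternating scheme attractive.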
You can do clustering, where you might want to cluster all your customers and tailor their experience on your website based on which of these clusters they fall into.
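As a rough illustration of customer clustering, the following sketch runs k-means in plain Python. It is a toy, not the Spark API. The customer features (visits per month, average basket in dollars), the data, and the starting centroids are all hypothetical.

```python
import math

# Hypothetical customers: (visits per month, average basket in dollars).
customers = [(1, 20.0), (2, 25.0), (1, 22.0), (12, 180.0), (15, 200.0), (14, 190.0)]


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, max_iters=100):
    """Lloyd's algorithm: assign points, move centroids, repeat until stable."""
    centroids = list(centroids)
    for _ in range(max_iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                # New centroid is the per-feature mean of its members.
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    # Final assignment against the final centroids.
    return [nearest(p, centroids) for p in points], centroids


if __name__ == "__main__":
    labels, centroids = kmeans(customers, [customers[0], customers[3]])
    print(labels, centroids)
```

The features here are left unscaled, so the basket size dominates the distance. With real data you would normally scale the features first, and a site could then choose what to show each visitor based on the cluster label.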
I personally used random forests to help me with my performance monitoring, and I'm able to pick the top ten configuration parameters that I've set in my experiment and see which ones are most influential on the overall performance of the codes that I'm running.
A
So this just gives you some examples, a small subset of all the ML algorithms we have available in Spark. And this is the good news: I've done all this performance testing, and so far the overhead has been 10% running in Kubernetes clusters instead of just on bare metal. Zak will talk to you a little bit about how easy it is to use this. You don't really have to be a data scientist to do this work; the API is pretty much that easy.
B
So there's lots of tooling around that. When you're designing models and whatnot, there are data scientists; we do have data scientists on staff who work on the algorithms. But then, once you train the model, you deploy the model, and then you can do things like predictions and solve different problems with your data. So I think it's very interesting.