From YouTube: OpenShift Commons ML SIG April 5 Full Mtg Recording
Description
Diane Mueller - co-chair
Introductions
AIOps SIG announced
Drew Minter (UBIXLabs) – UBIX Platform Intro
A
Four minutes after the hour, I think that's polite enough time. Welcome, Jonathan and Tushar, and everybody else, and please feel free to jump in. I wanted to do a quick announcement about a new SIG that's just been created here, AIOps. We had our first meeting about a week ago. All the videos and everything are here on the Commons OpenShift SIGs HTML page, which I'll pop over to here, and there's a mailing list, a Google Group, for it as well that we've just started up on the topic of AIOps, if you have a chance or an interest in applying your tooling to operations. This is sort of the beginning of a conversation that we've been having internally at Red Hat with a number of partners, so we're socializing that and creating a group around it. You can join that, or you can join directly through the AIOps group itself. So you can see there's not a lot of topics there, but we are just getting started, so join us if you're interested in that space. Today, let's see who we have on the call. What I usually try to do is make everybody introduce themselves, especially the ones that I don't know, so if you want, let's go through the list here. Bob, if you want to just introduce yourself, that would be great. I'll unmute you now.
B
A
C
Sure, everyone. Drew Minter, CTO and chief data scientist at UBIX Labs; I've been there for five years. My history with OpenShift and Red Hat really only started a couple months ago, but you know, we've been very open source friendly, stem to stern, and I'll be doing the demo to introduce you to our platform, which will be at Red Hat Summit in more detail.
A
D
F
My name is Stefan; I'm based in Germany. I have a background in AIOps from the monitoring space. I'm a long-term Hewlett Packard Enterprise, now Micro Focus, employee, responsible for monitoring products and also machine learning areas and AIOps, but I'm on the way to transitioning to a solution architect role in Germany.
G
A
Otherwise, what I'd like to do today is, and I don't see our second speaker, Jeff Bean, having joined us yet, so I'm going to ask Drew Minter to share his screen. I'm going to stop sharing my screen and ask Drew to share his and walk us through one of the topics for today around the UBIX platform, tell us what that is and what it can do. That would be great to hear.
E
A
C
All right. So, you know, this looks like a pretty technical audience, and so in service of that I've cut out about two-thirds of the slides that we normally cover. But again, it's not always one person in AI who has the entire scope of where we're playing, so I still think it's instructive to get a little bit of background on where we're coming from.
G
C
As a company, just briefly before I go through our messaging: at first we were originally building something that would compete with Databricks back in 2013, but we decided to build something easier than Scala, our own DSL. That started in March of 2013, back in Spark 0.5 I believe it was. I came in right around when Spark 1.0 happened in May of 2014, and we came out of the Frost Data Capital infrastructure. We had a restart.
C
While there's a significant amount of promise, there's also a lot of challenge, as far as the landscape: the amount of data that's being generated, the amount of demand for applications and, just frankly, how hard it is to actually deliver not only a quality model but also the processes around it. You know, there's been a lot of Six Sigma in other places where things have become commoditized, very highly automated and repeatable processes.
C
And even though conceptually, you know, we've got CRISP-DM and SEMMA or what have you, operationally it's kind of all over the place. There's also a famous diagram that Google published in one of their documents showing the level of technical debt that is taken on that is really external to the modeling process itself.
C
We're really trying to help on all levels of the entire process, certainly with the emphasis on the citizen data scientist and the hardcore data scientists and professionals having sort of one place to be able to work together, but we see more there. So, you know, when we look at the data science process:
C
We've taken an idea from, you know, continuous improvement and continuous releases, and we want to make sure that we're facilitating a comprehensive view, so that regardless of the type of data or the type of analytics you're doing, and the type of automation, where we would be sort of the sense-making of the sensory part and there would be some other control loop integrated with that, we can essentially be plugged in at all levels.
C
And so when you look at it from a technology perspective, how we're looking here: again, we will be porting for Red Hat Summit, a phase where we're going to be actually deployed on OpenShift as a platform, as availability. I'll talk a lot more about that later, but we knew that we needed to have a reference platform for all of our own research on being able to have model refreshes online, etc.
C
So what I'm going to do is introduce you to the deployment infrastructure we have, give you an idea of some of our DNA if you will, and then we're going to spend the majority of the time going through a couple of workflows. One of them will be a traditional sort of IoT scenario of a pump failure and how to predict it and prevent it in the future, and then there's going to be another one.
F
C
So the short answer is that it's grayed out, in that, you know, JupyterLab has so much momentum behind it that I think it would be foolish for any software developer to try to make their own notebook infrastructure. You know, even Databricks, I'm sure, at some point will have to do something more with JupyterLab, is my prediction, but I do have a nice slide on that subject.
C
Okey doke, so what we're looking at here is the interface that I use to spin up the instances that we'll be looking at today in the demo. And so, because we are a dyed-in-the-wool open source stack (we had actually considered having part of our software be open source at one point), we have the capacity to not only spin up our own instances of these major subsystems, but also, as we're planning, for the first version in OpenShift.
C
The images we'll be using are our own ones, but we're certainly transitioning; we expect soon after Red Hat Summit to be able to start using, you know, the OpenShift images for Kafka, etc., and we already have a configuration available for them. But for development purposes, a lot of times having just one server is enough, if you're not doing heavy streaming work.
C
So with this information, we're expecting that we'll be able to give invitation codes to people at Red Hat Summit, and then they can come to the site and put in credentials. There'd also be another version for OpenShift that will give similar availability, so people can start playing with this on their own account.
C
But one thing I just want to quickly show is that, because we manage this and you can go across clouds, we can actually manage instances centrally, and so here you can see some of the management tools we have. Let me just go down real quick to show.
C
So, any questions? This is usually when I stop talking about infrastructure and dependencies, so if there are questions about deployment, hardware, software requirements or dependencies. Just as a minor roadmap item that's not in the presentation: after Red Hat Summit we're also looking at being able to, you know, we've been doing a lot more deep learning, and we found that for a lot of occasions
C
we need to be able to switch more dynamically from CPU workloads to, like, GPU support. So we're looking at more dynamic container management and also extending the data science language, which is a layer that unifies all the access to these services that we use internally, to be able to have that support deep learning on GPU as well.
A
No, not seeing anybody questioning you yet, so we'll hold that till the end. Okay.
C
So this is our infrastructure, as far as the Solution Space application that we built on our DeepSpace, or our instance management if you will, that coordinates and deploys all those different pieces. One thing we also have, a benefit of the layer that we built, is that we can essentially back up and restore any of the models and any of our scripting and operators; those can be deployed to other versions and other instances of UBIX, so that you can do federated learning pretty easily.
C
But, you know, we're going to go kind of end to end in our demonstration today, looking at the process here. So what I'm going to do is just take our basic CSV load, and we're going to take data that's from a pump. The one that we worked with was the size of a house, but we've simplified the number of sensors from it for demo purposes.
C
So I'm going to name this training data, and this is normally all we need to do for citizen data scientists to be able to get the data in, but we also have much more extended capabilities because we have native Spark integration. So, for example, you can choose how to store the data frame, or the RDD, depending on how you look at it.
C
However you want to use the different storage options that are built into Spark as well; but in this case we'll just keep it simple for demo purposes, and so I'm going to start this one. What I'm also going to do is start a load for another set of data that we will have for later on in the demo here. This is essentially sort of fast-forwarding to see some of the model performance: what do you do when you're in production?
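For anyone who wants to map those two load steps onto plain open-source Spark, a minimal PySpark sketch follows. The UBIX DSL itself isn't shown in the talk, so the file names, the registered view names, and the choice to cache are assumptions for illustration only.

```python
# Minimal PySpark sketch of the two CSV load steps described above.
# File names, view names, and the caching choice are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pump-demo-load").getOrCreate()

# Load the pump sensor CSV with a header row and inferred schema.
training = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("pump_training.csv")
)

# Register it under a name so later steps (and SQL clients) can reference it,
# and cache it since the demo reuses it repeatedly.
training.createOrReplaceTempView("training_data")
training.cache()

# A second data set, held back to simulate "production" batches later on.
scoring_batch = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("pump_scoring_batch.csv")
)
scoring_batch.createOrReplaceTempView("scoring_batch")
```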
C
Running a little bit hot on my computer there, okay. Normally what you'll see is the screen refresh right away; there we go, okay. So you can see that there are several different steps that we run beyond just the basic load of the data, and we do this for two reasons. One reason is that, you know, we found from researching a thousand different workflows that data scientists generally have a lot of sort of basic questions
C
they have for data that they want to have answered, and so we try to stay sort of one step ahead of the data scientist in our recommendations. Plus, we also use that information for feeding our learning engine, by taking basic profile information and deriving a feature space for meta-learning, or what we call our meta space. Sorry, just give me a moment.
C
And so I'll show you, as soon as this comes back, that there is a feature similarity that we do, and we use an OpenML data set, which is a meta-learning research site, so that if people are interested in some of the profiling of their data, they have a reference they can look at externally.
C
You know, I assumed a technical audience, so I'm always prepared for more things than I get to sometimes, but anyway. Just to go back to what I was talking about before here, you can see, for example, when I talk about our DSL language, this is essentially loading from HDFS into Spark memory, into the structure internally that we use for our task SDK.
C
So, for example, people can learn the DSL and write their own tasks, and we can learn some from them, as long as they have the right annotations of what data they produce for the actions that we want. And then, even though this is the DSL dialect that's being used in this phase, we also have JavaScript support; that's how we accomplish our Airflow integration and the MLflow integration that we're working on right now. But just to go back to what I was talking about with the feature similarity here.
C
So this is an example: essentially, for all of the sensors that came in, these are datasets that are from the OpenML research site, these are where we saw similar features, and then this is sort of the cluster they belong to and the likelihood rank on the scale there. They tend to be part of that question that helps us narrow the search space when we're making recommendations for the data science workflow.
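As a rough illustration of that meta-space idea (not UBIX's actual implementation), a dataset can be profiled into a handful of generic meta-features and ranked against reference datasets pulled from OpenML. The particular meta-features, reference dataset names, and file name below are assumptions.

```python
# Sketch of "meta space" similarity: profile a dataset into a small meta-feature
# vector and rank reference OpenML datasets by cosine similarity.
# The meta-features chosen here are generic; UBIX's actual profile is unknown.
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.metrics.pairwise import cosine_similarity

def meta_features(df: pd.DataFrame) -> np.ndarray:
    numeric = df.select_dtypes(include=np.number)
    return np.array([
        np.log1p(len(df)),                        # size
        df.shape[1],                              # dimensionality
        numeric.shape[1] / max(df.shape[1], 1),   # fraction of numeric columns
        numeric.skew().abs().mean(),              # average skewness
        numeric.kurt().mean(),                    # average kurtosis
        df.isna().mean().mean(),                  # overall missingness
    ])

# Reference datasets pulled from OpenML (names are examples only).
references = {}
for name in ["diabetes", "kin8nm"]:
    data = fetch_openml(name=name, version=1, as_frame=True)
    references[name] = meta_features(data.frame)

# Profile the local pump training data and rank the references.
pump = pd.read_csv("pump_training.csv")
target = meta_features(pump).reshape(1, -1)
ranked = sorted(
    references.items(),
    key=lambda kv: -cosine_similarity(target, kv[1].reshape(1, -1))[0, 0],
)
for name, _ in ranked:
    print("similar reference dataset:", name)
```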
C
Taking that, I'm copying the name of the table that we just created and I'm plugging it into this ugly little URL that has the JWT token for authentication and then just the name of the table. Because, you know, with self-service analytics on one hand, and data science, with very handcrafted model fact tables or facets, on the other,
C
however you want to look at your inputs to the models, there's kind of been a breakdown of the corpus callosum of the data brain, corporate-wise. And so our philosophy essentially is that if you have the right authorization, you should be able to have access to any work that anybody is doing as soon as it's finished, without the publishing having to come, as before, behind an isolation and governance scenario on the instances. So this is an example of the data.
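The "table name plus JWT" URL pattern described above might be consumed roughly like the sketch below; the host, path, query parameter, and response format are hypothetical, and only the general shape (a named table fetched over HTTP with a bearer token) comes from the talk.

```python
# Sketch of pulling a published table over HTTP with a JWT, as described above.
# Host, path, parameter names, and response shape are assumptions; only the
# general pattern (table name + bearer token) comes from the talk.
import io

import pandas as pd
import requests

UBIX_HOST = "https://example-ubix-instance.example.com"   # hypothetical host
JWT_TOKEN = "<jwt issued for this instance>"

resp = requests.get(
    f"{UBIX_HOST}/api/tables/training_data",               # hypothetical path
    headers={"Authorization": f"Bearer {JWT_TOKEN}"},
    params={"format": "csv"},
    timeout=30,
)
resp.raise_for_status()

# Hand the result straight to pandas, a BI tool, or anything else that can
# consume CSV over HTTP.
table = pd.read_csv(io.StringIO(resp.text))
print(table.head())
```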
C
or any of your BI clients; this is just one example of our integration here. Another example, and this is coming soon, it will be live on stage for the fifth, is in a slightly different, newer version of the software: this is the classic iris dataset, look here.
C
Okay, so here's our inline analytics. One recent feature that we've added is that you can look at the view of the process itself and take anything from any stage where there's been a new sort of standing query or options generated, and have it brought directly into an inline tool. We're still rebranding, essentially; for those of you in visual analytics, this is Vega Voyager, now embedded within UBIX.
C
But, you know, this is something that we'll be able to have as a longer-term way so you're not having to copy and paste the URL that we saw earlier there. Anyway, we're still just kind of in the data world; we haven't gotten to the fun stuff of the AI yet. So, just as a quick look at the data from the training set there.
C
If we look at the data that's in there, we can see, just as a rough cut, that the pump seal pressure is the most anomalous feature in the data set. And so what I can do is select that pump seal pressure, and there's a query to our learning engine that uses the profile of the data and the meta features, the meta-space features that we've essentially generated, and it makes a recommendation of some tasks that we can do next. And so this particular task is an annotation.
C
We make a distinction between whether it's the user domain that's driving the feature engineering or whether it's an analytic domain; that's separate from the taxonomy, another type of sort of semantics that you're looking at here. So this is one of our basic ones, because, since we know there's a date/time and there's an anomaly,
G
C
we now have the data shaped in a way that we can usually make a predictive model that we could use to prevent this type of failure in the first place. And so what I can do is select that anomaly that was just created, I get a recommendation for a classification, and so we can use that as the target. Then we can go back here and select the features that we want to use.
C
Model 1.0, it will be perfect, and I will go ahead and give it a name. And so from a UBIX perspective, you know, this is where we are: we're building with MLflow integration, and I'm actually a little bit intrigued with it, especially the ability to have bake-offs outside of our cluster. But because we are native, we do have native Spark ML at scale-out and native H2O Sparkling Water at scale built in.
C
So, you know, it's not going to require a lot of different options to have some variety here. And so what we come up with is essentially two different models that came out of the bake-off, and there's a significantly better scenario with the Spark candidate; you'll get to see more of that later. But the nice thing is that essentially we have everything encapsulated, and with the same type of idea that we showed of automatic availability within the rest of the organization, we don't really look at deployment as being anything but more of a governance issue.
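To make the bake-off idea concrete, here is a small sketch that trains two candidate classifiers on the same split and keeps the better AUC, using Spark ML only. The sensor column names and the specific pair of algorithms are assumptions; the talk only says that several native engines (Spark ML, H2O Sparkling Water) are compared.

```python
# Sketch of a two-model "bake-off" on the pump data using Spark ML only.
# Column names and the particular algorithms are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression, GBTClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = SparkSession.builder.appName("pump-bakeoff").getOrCreate()
raw = (spark.read.option("header", True).option("inferSchema", True)
       .csv("pump_training.csv"))

sensor_cols = ["seal_pressure", "flow_rate", "vibration"]        # hypothetical names
data = (VectorAssembler(inputCols=sensor_cols, outputCol="features")
        .transform(raw)
        .withColumn("label", col("anomaly").cast("double")))     # anomaly flag as target

train, test = data.randomSplit([0.8, 0.2], seed=42)
evaluator = BinaryClassificationEvaluator(metricName="areaUnderROC")

candidates = {
    "logistic_regression": LogisticRegression(maxIter=50),
    "gradient_boosted_trees": GBTClassifier(maxIter=50),
}
scores = {name: evaluator.evaluate(est.fit(train).transform(test))
          for name, est in candidates.items()}
print(scores, "-> winner:", max(scores, key=scores.get))
```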
C
Mapping the name of the bake-off to this endpoint: this is written in the DSL that we have here, so we can also configure it as needed. What's coming in here, in the features parameters, is essentially a JSON representation of one sensor reading; these are the same features that we were looking at before, just with slightly different values. And so with the right authentication, based on the JWT header, and also having the right email that matches it being sent in,
C
we essentially have this model available for anybody who wants to automatically look at the data and start using it someplace later on. And so what we get back from this round trip here is metadata that shows the data types of what we used in the model, and also feedback on whatever columns you pass through it, and then we also get a yes/no answer.
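A minimal sketch of that scoring round trip is below. The endpoint URL, header names, and JSON field names are assumptions; the transcript only establishes a JSON sensor reading going in, with a JWT and a matching email, and metadata plus a yes/no answer coming back.

```python
# Sketch of calling the published bake-off model endpoint described above.
# URL, headers, and JSON schema are assumptions.
import requests

ENDPOINT = "https://example-ubix-instance.example.com/api/models/pump-bakeoff/score"
JWT_TOKEN = "<jwt issued for this instance>"

reading = {
    "seal_pressure": 87.2,      # hypothetical sensor values
    "flow_rate": 13.4,
    "vibration": 0.021,
}

resp = requests.post(
    ENDPOINT,
    json={"features": reading, "email": "operator@example.com"},
    headers={"Authorization": f"Bearer {JWT_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
result = resp.json()
print(result.get("metadata"))       # column names/types the model expects
print(result.get("prediction"))     # the yes/no anomaly answer
```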
C
We try to organize our task development so that they publish operator functions, or analytic assets, that can be reused, so that we can have tasks generating tasks, essentially just compiling those into, you know, a deterministic workflow. But in simpler terms, at this point I want to just fast forward the timeline to a week later, where we use the data set that we built earlier.
C
I can add it to that sentry task, and this gives me all the configuration to evaluate how things have been going since the model's been in production. This sample is just a simple one for CSVs, but we do have support in our DSL for Kafka and Spark streaming constructs, so you can do this as a continuous batch; it's just easier to follow this way. And so, if we look here, you can see the physical model that's associated with the bake-off model.
C
That's the best one at this point, and you can see the drop in the AUC score. So it's not doing as well as we were thinking previously, and usually the most common problem is that there's been drift in the data. So we also provide you with information about the distribution of the training set features versus the batch itself, and you can see that they're close enough that there really hasn't been any change
C
that would be a representative difference. So this is concept drift, as a polite way of saying it: the model itself has encountered something new that you haven't thought about before and now have to think about. Normally that's kind of a new exercise, but because of the way we've architected the dependencies of our analytic assets and metadata, we have subscriptions that you have.
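The monitoring step just described, recomputing AUC on a new batch and comparing training versus batch feature distributions, can be sketched with scikit-learn and scipy. The per-feature two-sample KS test below is a stand-in for whatever statistic the platform actually uses, and the file and column names are assumptions.

```python
# Sketch of the "sentry" check described above: recompute AUC on a production
# batch and compare each feature's distribution against the training set with
# a two-sample KS test (a stand-in; the talk does not say which statistic
# UBIX uses).
import pandas as pd
from scipy.stats import ks_2samp
from sklearn.metrics import roc_auc_score

train = pd.read_csv("pump_training.csv")          # hypothetical file names
batch = pd.read_csv("pump_scoring_batch.csv")     # new data with ground truth

feature_cols = ["seal_pressure", "flow_rate", "vibration"]

# 1) Model quality on the new batch (assumes batch["score"] holds predictions).
print("batch AUC:", roc_auc_score(batch["anomaly"], batch["score"]))

# 2) Data drift: a small p-value flags a feature whose distribution has moved.
for name in feature_cols:
    stat, p = ks_2samp(train[name].dropna(), batch[name].dropna())
    flag = "DRIFT?" if p < 0.01 else "ok"
    print(f"{name}: KS={stat:.3f} p={p:.4f} {flag}")

# If no feature drifts but AUC still drops, suspect concept drift: the
# relationship between the features and the label has changed.
```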
F
To recap a bit: in the beginning you mentioned that the mode is aimed at the citizen data scientist. Do you really target these more high-level users? Because I see a lot of details, which is great by the way, but is the citizen data scientist really the target user for your tool?
C
Well, that's an excellent question. You know, we started kind of in the middle of what we consider our process from a technology standpoint, because there's really a citizen data scientist as someone who hasn't embraced all the true complexity of data science. And we know that in some ways this view of building tasks may be very intuitive to people who are data professionals, you know, people who do ETL, people who do data development, BI developers.
C
What have you. There is a simpler sort of dashboarding approach of adding a layer of insights on top of the kind of raw annotations here. So you can think of this as the workbench that we built first for our own research, for our own people, but we also wanted to have the breakdown of the tasks in a way that could be
G
C
I think that's your earlier question about how we want to make it easier for data scientists to be available in this ecosystem. So I think you're spot-on in saying that this seems a little bit more technical than the citizen data scientist might be, but probably too simple for most data scientists; where we're serving this is sort of the core of the infrastructure, almost the audit trail of what we see from the other interfaces.
C
A
Please go ahead, because it's really interesting to us. I think this is all new to me because, as I said, I only heard about UBIX, U-B-I-X.
C
Ubiquitous analytics was the origin of the name, and when we were thinking of trying to compete directly with Databricks, we were calling it Ubiquity. But the interesting note, just as a technical piece, is that I think there's somebody I saw who was at Databricks once upon a time. I'm just going to go in, just as a legacy instance code here, if you want to kind of go a little bit deeper.
C
I'll just show you here, and I just restarted this instance before the demo ran, I think it's usually about 60 seconds. So this, essentially, if you want to see an idea of all of what's running: the whole demo itself took 606 Spark jobs that were run in the background here, and you know, considering that usually they measure things in single jobs,
C
it's because it's one big batch that people are bringing to Databricks; they're very excited. But the challenge we have is that we built this data science layer to unify a lot of non-Spark technologies in the same, you know, AST that we create, and so we have to kind of break up our DSL into a more modular language to make a Spark-only version that will run in their library system.
C
Okay, so this is where the OpenShift work starts. Essentially I'm giving a preview to people on the call; I just didn't get a chance to get this completely tested to make sure that everything ran properly before the meeting, but I'm close to essentially having the solution.
C
So I'll talk a little bit more: essentially, if you look at the second half of 2019, we're rebranding the management workbench, DeepSpace, and what we were looking at here, the Solution Space. Inside Solution Space is essentially a little bit of a taste of that: our visualization, also our chat, being able to run tasks from the chat interface, and also just a lot of plugins using Plotly, and ways to have analytics that are driven by that, like for semi-supervised learning or labeling, what have you, in the domain space.
C
One thing on the subject of JupyterLab is that, you know, we are very passionate, I mean, about Jupyter. We really felt like we had to wait for JupyterLab to come along to be able to have a lightweight way to engage data scientists, because, from our perspective, ultimately we have a project we're working on with our deep AI team of being able to translate Python from JupyterLab notebooks into UBIX tasks, or at least recommend replacements that are already scaled out.
C
They don't have to be rewritten for production; we already have in the engine the ability to take Python and R code, and even our workspaces, without recoding, and integrate them as well. But we know there are a lot of use cases where people want to go directly from a Python tool and put it into a Solution Space here. So we call that that integration, where, you know, we have sort of a Jupyter version of Solution Space.
C
So, just in conclusion, talking about where we're trying to go: we've taken on a very, you know, big tent, if you will, and I think that's why it's taken us so long to get to market, but we're really excited about what we've done and we're really excited about joining the OpenShift community and embracing it. And so that's the core of it. So, you know, again.
C
D
C
That's an excellent question. I'll just briefly show you some of the tasks available. So as an example, what we used was, well, this one has a little bit more robust instrumentation than some of the other tasks we currently have, but essentially we have a lot of different ways to get at static data that's directly available. For example, we have wrappers that take advantage of the AWS parallel APIs for, you know, loading files in certain formats.
C
We also have JDBC that we wrap around for Spark SQL, but we have also made a deep investment in streaming, in kind of two ways, in that all of our logical operations for our pipe functions, if you will, in our DSL can take static or streaming data as a, you know, as a producer of data. So we can not only listen to a Kafka agent; you can basically set up Kafka topics and read from Kafka topics directly in our DSL, but you can also.
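As a rough open-source equivalent of that static-or-streaming pipe idea, reading sensor events off a Kafka topic with Spark Structured Streaming looks like the sketch below; the broker address, topic name, and JSON schema are assumptions, and this is not the UBIX DSL.

```python
# Sketch of reading sensor events from a Kafka topic with Spark Structured
# Streaming. Broker address, topic name, and the JSON schema are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, DoubleType, TimestampType

spark = SparkSession.builder.appName("pump-stream").getOrCreate()

schema = StructType([
    StructField("timestamp", TimestampType()),
    StructField("seal_pressure", DoubleType()),
    StructField("flow_rate", DoubleType()),
    StructField("vibration", DoubleType()),
])

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")   # hypothetical broker
    .option("subscribe", "pump-sensors")               # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("reading"))
    .select("reading.*")
)

# The same transformations written for a static DataFrame can now run
# continuously; here micro-batches are simply printed to the console.
query = events.writeStream.format("console").outputMode("append").start()
query.awaitTermination()
```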
D
The reason why I'm asking is, as you move more to OpenShift, there are different services that you could be layering on top of, or consuming, that give you all that data acquisition and aggregation from different sources with different connectors, etc., in general, that you could be consuming out of the box to facilitate getting data from different points. So I would encourage you to look at that and see if you can get some more value added from the partnership, sorry, from the OpenShift services.
C
Yeah, it's definitely on the roadmap; you know, sort of the price for admission is to make sure our instances work first, but I definitely welcome that. I mean, we've been working with them as a partner, and, I don't know if you've seen, Domo has a lot of sort of this pretty connector view for when you want to actually get at a bunch of services. That looks a lot like what I saw in OpenShift, and I was like, why don't
C
No, I appreciate it, because, like I said, we've certainly tried to concentrate on having a very broad footprint for the product, you know, across sort of the data science discipline, and at least having good hooks. But to your point, now that we're building partnerships for cloud integration, trying to take advantage of those things that are uniquely OpenShift is definitely very high on our list. So, okay, thank you.
A
You know, I'm not seeing any more questions yet. I do want to save a couple of minutes at the end, so if you want to do another little demo for about five minutes, that would be great, and then I just want to talk about future events, and people can be thinking about future topics. That would be great.
C
So what we're looking at here, this has about 60 or 70 different features itself. It doesn't take very long to look at, but the goal here is, you know, we were looking at a binary classification problem and we had a user-domain feature that we engineered with the annotation. This demo, just to give an idea, is what I call our force multiplier demo, which is essentially trying to show some of the ways that we've taken
C
research from sort of the current Python ecosystem, and also our own learning engine research, to be able to help data scientists get through problems a lot faster. It's a similar scenario to the one we talked about before; I don't need to go through all the demo pieces for the background, it's the same task, but I'm going to scroll down here to, yeah, sales price. Okay.
C
So what I'm doing here is choosing: we have a main target of sales price, and I'm choosing some of the features that I want to use as ways to look at distinct values, essentially to do subpopulations, if you will, for calculation. So I'm going to have my list of a few features that I'm going to add to the feature synthesis as slicers, if you will.
C
Then I'll pick the ID column that we can use, and then what I'm going to do is look at, for the sales price, a series of different partitions of the sort of distributions of subpopulations based on those. And, you know, again, as long as the grain is the same, it's one path of execution there. Oops, sorry.
C
Well, you know, we're pretty close to our funding, so I'll be able to get some more testers pretty soon here, but anyway. So, behind the scenes, what we're essentially going to be doing here is to build.
C
Yeah, well, I'll probably do it later today, because Doug owns the showcasing, sort of branding stuff that we do here. But I'll just show here real quick; these are just times from another, slightly different version here. So essentially, in selecting that, we have a deep feature synthesis, and then what we do is set that as the main aggregation target and, putting in the ID, we set up the different aggregations, and then what we get out is essentially, what it does is it looks
C
at the ones it uses as main groupings, and it looks for other places where it can find other aggregations of those that make sense, so that you have a very deep decomposition. So essentially in about two minutes you get about 500 features that would, by themselves, be about 70 different SQL queries to generate. And then the next step is that we essentially wrap around the garden-variety scikit random forest our own learning algorithm that we use for feature selection.
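Deep feature synthesis, the idea popularized by the featuretools library, boils down to stacking aggregations over chosen ID and slicer columns. A minimal pandas sketch of the idea follows, using the sales-price example; the file and column names are assumptions.

```python
# Minimal pandas sketch of what deep feature synthesis does: pick an ID and a
# few "slicer" columns, then stack aggregations over each grouping so that a
# handful of base columns fans out into many derived features.
# Column names are assumptions drawn from the sales-price example in the talk.
import pandas as pd

sales = pd.read_csv("house_sales.csv")           # hypothetical file
slicers = ["neighborhood", "house_style", "year_built"]
numeric_cols = ["sale_price", "lot_area", "living_area"]
aggs = ["mean", "min", "max", "std", "count"]

features = sales[["id"]].copy()
for slicer in slicers:
    grouped = sales.groupby(slicer)[numeric_cols].agg(aggs)
    # Flatten the (column, agg) MultiIndex into names like sale_price_mean_by_neighborhood.
    grouped.columns = [f"{c}_{a}_by_{slicer}" for c, a in grouped.columns]
    derived = (
        sales[["id", slicer]]
        .merge(grouped, left_on=slicer, right_index=True)
        .set_index("id")
        .drop(columns=[slicer])
    )
    features = features.join(derived, on="id")

print(features.shape)   # a few base columns fan out into many derived features
```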
C
So essentially, you know, we're taking the same metric that we used to build all of the features and we're using that as the target to predict. And essentially what we get out of it is that we do two rounds: one where we do random sets of a hundred different features from the five or six hundred that are in there, and then we apply our learning algorithm; then we make other rounds. So essentially all the purples are rounds that were done using our algorithm, seeded by the orange one.
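A rough sketch of that selection loop, with a plain scikit-learn random forest standing in for UBIX's own learning algorithm (which is not public), might look like this; the subset sizes and round counts are simplified assumptions.

```python
# Rough sketch of the feature-selection rounds described above: score random
# subsets of ~100 features with a plain scikit-learn random forest, keep the
# most important ones, then run a final round seeded by those survivors.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

X = pd.read_csv("synthesized_features.csv").fillna(0)   # hypothetical DFS output
y = X.pop("sale_price")

rng = np.random.default_rng(0)
survivors = set()

# Round 1: several random subsets of up to 100 features each.
for _ in range(5):
    subset = list(rng.choice(X.columns, size=min(100, X.shape[1]), replace=False))
    forest = RandomForestRegressor(n_estimators=200, n_jobs=-1, random_state=0)
    forest.fit(X[subset], y)
    ranked = sorted(zip(subset, forest.feature_importances_), key=lambda t: -t[1])
    survivors.update(name for name, _ in ranked[:20])

# Round 2: a final pass seeded by the survivors of the random rounds.
final = RandomForestRegressor(n_estimators=500, n_jobs=-1, random_state=0)
final.fit(X[list(survivors)], y)
selected = sorted(zip(survivors, final.feature_importances_), key=lambda t: -t[1])[:25]
print([name for name, _ in selected])
```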
C
G
A
Awesome. How about you put up a final slide about how to get more information and how to contact you? Okay, then we'll kind of move into, I'll steal back the screen for a few minutes after you do that. I'm seeing they'll get there, so anyways, we're at the top of the hour, folks, and I just wanted to mention a couple of things. We're not going to meet next month, because it's right in the middle of Red Hat Summit, and you know how crazy that is, and we're.
A
So if there are any other adjacent events that you're all planning on attending, please let me know about them, so that maybe I can plan the gathering to be next to that, and you only have to travel once if you're coming from outside of the Bay Area. With that, Drew, the question I have for you is: where is UBIX based? Are you in the Bay Area? Where are you?
C
We've always been a geographically diverse company. I'm based in the Dallas Metroplex, our CEO lives in the Madison, Wisconsin metro, and our majority shareholders, and the Frost Data Capital ecosystem we came from, were in San Juan Capistrano. At one point we had seven FTEs there, and we've always had contractors in the Philippines or Ukraine or Romania. So, you know, we've been following the sun, yes.
A
We all are these days, so that's great. Just keep it on your radar and watch for announcements on the mailing list; I'll add you guys to the mailing list too. Thank you, Drew, really, for taking the time to do this today; I know it was short notice. I just wanted to let people know that Jeff Bean did ping me, and he's with Ververica; they're in the middle of things since they, formerly data Artisans, were acquired, and so that's.
A
Those are Mondays, and we're going to skip May, so our next meeting will be June 7th. And please, Stefan and people who are new to this group, David, if there are topics you want to talk about or projects you're working on that you want to showcase, let me know, and I will happily add you to the agenda. But with that, thank you Drew and everybody.