Description
AI/ML and Operators Case Study: Databricks, Azure and Kubernetes with Azadeh Khojandi and Jordan Knight from Microsoft.
Filmed October 28th, 2019 in San Francisco.
A: Yeah, so, as Diane said, my name's Jordan Knight, and with me I have Az. We're both software engineers working for Microsoft out of Australia, and we're here today because we built an operator (when I say "we", I mean Az). It turns out that when you build and post an operator on GitHub, that's like sending up a bat signal for Diane, because I think within minutes of us actually making that project public, Diane was in contact with us.
A: Diane's been super supportive in helping us get that operator up to the stage where we're nearly ready to pop it onto OperatorHub. So today we're going to run you through some of the reasons why we built an operator, and it's for reasons far more than just that operators are shiny and everyone was busting to build one. We actually had to find a real business reason to do it.
A: What I'm actually going to do is run through a bit of background. Basically, I was a designer on a customer project that had a need for an operator, and Az was the lead developer on that operator, so we're in the same team. I've got the background on how we go about using the operator, and Az has the background on how the operator itself works.
A: Just a little bit of background about our team: we're called Commercial Software Engineering (I think Diane's still on the mic out there somewhere). In the past, Microsoft engineers have been locked behind closed doors up in Redmond and other cities around the world, working mainly on Microsoft products. So what Microsoft decided to do was build a software development team that's free and available to go and work on projects other than Microsoft software.
A: So, for example, we've got a customer that had a need for an operator, and we came in and made an open-source project to build that operator. Working with this particular customer, we had a use case to build a highly cohesive, loosely coupled, multi-directional, complex, multi-configuration, multi-component, multi-technology, multi-platform, high-scale, high-availability, low-latency big data system; or, in other words, a stream processing pipeline. In this case it's a flexible stream processing platform that can be reused for many different scenarios.
A: The scenario we had for this particular customer was collecting a lot of data from a river system: water quality data, nitrate levels. Basically, we want to be able to measure the quality of the water coming down this river, to see if the farming operations upriver are having an impact on ecological areas downstream, such as the Great Barrier Reef. The problem we had (and that big blob of text about the pipeline on the slide is a bit of a joke) is that in reality, pipelines are fairly complex systems.
A: If you break pipelines down into their semantic parts, or at least conceptualize them a little, the reality is that they're a highly cohesive system, but loosely coupled: the components don't really know about each other, but they know the interface between the components; they know what to expect coming in. The other thing with pipelines is that relationship is king, so we need to bring relationships up into a first-class consideration as part of our software delivery.
A: That's especially important when none of the relationships, none of the things we're being asked for, are actually available at design time of the system. The particular platform we've been building is reusable: we don't actually know what the pipeline is going to look like when we build the code and deliver it through DevOps into the cluster. It's up to the customers who then use this pipeline platform we've built to define those pipelines and deliver them.
A: We got asked by the customer to build a reusable set of components that they can pull off the shelf and string together in various relationship orderings to put together a pipeline, using a UI or a simple configuration language that we came up with after the fact. So long after we're gone from this customer, they can come along and decide to implement scenario A, B or C using this pipeline system.
A: So it's not just one pipeline we get to build; that would still be difficult, but we could at least hard-code a lot of that stuff. We had to build a pipeline system with component configurations and relationships unknown until later on, highly flexible and reusable, with one-to-one or one-to-many custom transforms. There might be a whole bunch of Spark jobs or Python scripts or anything else running as part of this pipeline, and there could be forks in the data stream: some of it could go off to a modern data warehouse.
A: Some could go to storage, and some can continue on a hot path through the system; we don't know at design time. So the idea we had was to tackle that complexity, tuck it away, and not worry about it at the start, which sounds like an easy thing to do. What we did, obviously, was compartmentalize the whole problem, and we came up with a single point of configuration for these pipelines. You can design the pipeline and have a look at it, and all your relationships, components and everything are laid out in a single file, which makes sense when you consider that a pipeline in many ways is just a directed acyclic graph, a DAG. So we created this configuration system, and the rest of the DevOps componentry then goes through and makes it real: it turns that configuration into the real pipeline.
A: We start with this DAG, this directed acyclic graph. You can visualize it (we actually visualize it automatically when it gets PR'd into the production branches), and then the DevOps makes it real, as I said. In the background, the DAG is essentially translated into a series of Helm charts that then get deployed into the cluster. It could instead be Ansible, or Terraform, or even just manual build scripts; it doesn't really matter.
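To make that concrete, here is a purely hypothetical sketch of what a single-file pipeline configuration of this kind could look like. The schema and the component names are invented for illustration; the project's actual configuration language isn't shown in the talk.

```yaml
# Hypothetical pipeline definition: a list of components plus the edges between them.
pipeline: water-quality
components:
  - name: ingest            # pulls sensor readings off the stream
    type: eventhub-source
  - name: clean             # a custom transform, e.g. a Databricks notebook
    type: databricks-notebook
    notebook: /Shared/clean-readings
  - name: warehouse         # one fork of the data stream
    type: sql-sink
edges:                      # the relationships, forming a directed acyclic graph
  - from: ingest
    to: clean
  - from: clean
    to: warehouse
```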
A: The point is that the config up front has no idea about what's happening behind it. That way, we've got this first-class view of what the pipeline will look like, and we had these configurable first-class objects, these concepts that we can take and put into these pipelines. But we had a problem: when you have these first-class objects, you need a way to go and make them real. We thought, oh, we could have all these scripts to do it.
A: We could have had all these other things, but it turns out we're using Kubernetes, and Kubernetes is so much more powerful than just a system to hold pods that do work. In fact, we've now got a lot of scenarios where we can use Kubernetes with no pods in it other than operators. It's just such a powerful system for managing configuration, for our desired state configuration.
A: That basically means we can say: hey, we want a Spark notebook job, and it's got these parameters, and it will go and fire that up in Databricks as if it were firing up a pod in Kubernetes. And as a designer of one of those configurations, I can actually see my whole catalog of things that I can do. So we don't just have to use operators to create things in Kubernetes.
A: We use them because they're just such a nice way to package up and compartmentalize a piece of complexity that you can reuse over and over again in many different ways. We can create these reusable modules, and once they're good and tested, we can publish them and folks can pull them off the shelf and use them in any other project. They're completely black-boxed, and they're also a first-class object. It's a concept you can get around: hey, do you know how to use the Databricks operator? Where's the documentation for the Databricks operator?
A: Think in those terms and it actually becomes a thing; it's not just some bash scripts sitting in a DevOps pipeline. It really creates even an internal community around it. You can very easily represent them as a line in a directed acyclic graph, in a config file, or anything like that. They're also easy to represent in things like Helm or other well-known Kubernetes delivery packages, be that Ansible or any of those other styles of project.
A: You can easily represent an operator, and they're obviously extremely easy to deploy, update and remove, because the operators themselves just work using Kubernetes manifests, and people know how to do that. But I think one of the main reasons we became very accepting of operators, and are even promoting them internally at Microsoft, is that they're becoming well known, and so there are skills being built in the industry around operators.
B: Thanks, Jordan, for explaining why we needed operators and why operators are awesome. Now let's look at what exactly the Azure Databricks operator is and what it does. For those who are not familiar with Azure Databricks: Azure Databricks is a Spark-based analytics platform.
B: Databricks was created by the original creators of Spark, and, as the name suggests, Azure Databricks is optimized for Azure. You can create a Spark cluster in Azure in a few minutes. It's designed for large-scale data processing, and it's ideal for ETL, stream processing and machine learning. Spark, by nature, performs all of its operations on in-memory objects.
B: That's why it's really fast. Spark on Databricks decouples the query engine and compute from the data storage, and that gives us a huge advantage: you can provision a cluster and connect to the data where the data lives. You don't need to copy your data over onto the cluster, and after you finish running your script, you can shut down the cluster without worrying about losing your data.
B: Databricks is secure: it's integrated with Azure Active Directory, so you can get granular permissions. It also provides an interactive workspace where data scientists and data engineers can write their Spark code in Python, Scala, R and SQL. It also supports Java, and machine learning frameworks like PyTorch, TensorFlow and scikit-learn.
B: This is a very basic hello-world Databricks notebook. As you can see, it just shows "hello" and the name parameter, the name of the user. To run this at the moment, you go to the portal, the Databricks dashboard, and you create a job or submit a job and then run it. But from the ops perspective, if you ask your SREs or your ops team to just go to the Databricks dashboard and use the UI, they will be really angry.
B: So there are two different approaches currently: you can either use the portal, or you can call the Databricks API or use the Databricks CLI. But we saw that there is a space for an operator to extend Kubernetes functionality. What if, similar to submitting a YAML file for a deployment, you could submit a YAML file to run the Spark notebook, the Databricks notebook, from Kubernetes?
B: I'd normally like to do a live demo, but this session is really short, so I pre-recorded the demos, commented over them, and changed the playback speed to 2x so it's a little bit faster. So, for example, this is a Databricks Run. As you can see, in the spec you can create a Spark cluster with three nodes, you specify the location of your Databricks notebook, and then you pass the parameters. All you need to do is submit your manifest.
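A sketch of what such a Run manifest might look like, based on what the demo describes. The spec shape follows the Databricks runs-submit API that the operator wraps, but treat the field names here as illustrative and check the repo for the real schema:

```yaml
apiVersion: databricks.microsoft.com/v1alpha1
kind: Run
metadata:
  name: hello-world-run
spec:
  new_cluster:                         # a fresh three-node Spark cluster for this run
    spark_version: 5.3.x-scala2.11
    node_type_id: Standard_D3_v2
    num_workers: 3
  notebook_task:
    notebook_path: /Shared/hello-world # where the notebook lives in the workspace
    base_parameters:
      name: operator                   # the "name" parameter the notebook prints
```

Submitting it is just `kubectl apply -f run.yaml`, the same as any other Kubernetes object.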
B: As you can see, if I get the Databricks runs, there are no runs running. But if I apply my manifest, it starts running: the first time, it tries to provision the cluster; after provisioning, you can see the provisioned cluster is there; and then, once the cluster's provisioning finishes, it runs the script and shows the updated status.
B: So, to recap: I just applied my manifest; as you can see, at first it's pending; after the cluster finishes provisioning, it runs the script; and then it shuts down the cluster. It's very fast and it does its job. But you might ask: what if I want to run a job at intervals? Do I need to provision a cluster, shut it down, and re-provision the cluster again? Or what if I want to have multiple workloads on the same cluster? And the answer is yes, that's possible; with the operator you can do so.
B: Databricks has the functionality of an interactive cluster: you can create a cluster that keeps running, and then you can attach your Databricks notebooks to it. For that, as you can see, I have a manifest for a Databricks cluster. You can set autoscaling with minimum and maximum numbers of workers, and you can specify the environment variables for your Spark cluster. After that, you need to create a Databricks job; as you can see in this sample, I run my HelloWorld script.
B
So
again,
if
I
check
the
cluster,
there
is
no
cluster
on
my
and
then
I
can
apply
my
manifest.
So
after
applying
my
manifest
I
will
have
my
cluster,
so
you
can
see
that
is
a
start.
Provisioning,
Interactive,
cluster
and
data
breaks
gives
me
the
ID.
So
after
getting
the
cluster
ID
I
can
update
my
data
breaks
job
manifest
so
I
copy
over
the
cluster
ID
and
I
update
my
data
breaks
job
and
then,
if
I
apply
it
yep.
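A sketch of the job manifest at that point, again with illustrative field names. The schedule uses the Databricks jobs API's Quartz cron syntax to match the every-minute job shown in the demo:

```yaml
apiVersion: databricks.microsoft.com/v1alpha1
kind: Djob
metadata:
  name: hello-world-job
spec:
  existing_cluster_id: "<cluster-id>"     # pasted in from the interactive cluster's ID
  notebook_task:
    notebook_path: /Shared/hello-world
  schedule:
    quartz_cron_expression: "0 * * * * ?" # run once a minute
    timezone_id: UTC
```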
B: The good thing about using the operator is that ops teams don't need to learn new stuff; they can use all the tools and monitoring they're already familiar with. So, to recap: I applied that and created a Databricks job, and then, as you can see, I can go to the portal and see the output of every single job that it runs, every one minute.
B: Now, let's look at something similar and closer to the real world. Imagine that you have a pipeline to analyze tweets. The first step is ingesting tweets: you want to ingest tweets based on a hashtag or a certain keyword. But even before that first step, you need to connect to Twitter to get the tweets and put them into a stream. It can be Event Hubs, it can be Kafka; it doesn't matter. What is important here is that you need to manage secrets, because you are connecting to third-party services.
B
Is
you
need
to
manage
secret?
So
you
need
to
connect
to
the
third-party
services.
So
there
is
a
concept
in
databases
called
secret
scope
that
you
can
provide
keeper
values
for
your
password.
But
what
we
did
here
we
said
that
what
if,
because
up
seems
they
want
to
manage
all
of
the
secrets
and
all
of
the
password
as
a
kubernetes
secrets?
What
if
in
a
secret
scoping,
you
can
create
your
key
keeper
values,
but
what?
If
we
read
this
secrets
from
the
kubernetes
secrets?
So
that's
what
we
did
here.
So
you
have
your
see.
B
We
have
secret
scope
here
for
the
connecting
to
the
event
hubs
and
Twitter,
and
after
that
you
need
to
run
data
wix1
again.
We
are
attaching
to
the
existing
cluster
or
you
can
provision
new
cluster
and
then
potentially
for
this
ascribe
you
need
some
third-party
libraries,
it
can
be
maven
or
it
can
be
Python
libraries.
B
So
what
you
see
here,
you
would
say
that
I
run
these
libraries
on
the
cluster
before
running
the
scripts,
so
the
data
breaks
automatically
runs
these
libraries
and
pull
that
libraries
for
you
and
after
that
it
runs
the
script
word
again,
a
short
video.
So,
as
you
can
see,
I
have
my
community
secrets
and
then,
if
I
apply
a
my,
so
if
I
apply
my
secret
scope,
it
creates
the
secret
scope
in
data
breaks
and
in
coverages,
for
you.
B: The first time I run it, it says that it's running, and then, if I use kubectl describe and provide the name of my run, similar to how you work with other Kubernetes objects, I can see the output of my run: you can see that it passed the parameters, installed the dependencies, and it shows the tweets that it extracted. I have another notebook to test my Twitter ingestion.
B
My
my
my
Twitter
ingestion
and
it's
called
even
hub,
ingest
and
I-
can
attach
my
data
breaks
to
the
current
cluster
that
is
created
by
the
operator.
I
can
read
this
secret
scopes
that
is
created
by
operator
and
then
I
can
run,
and
it's
just
for
monitoring
to
see
that
it
actually
reads
the
tweets.
So
I
can
read
all
of
the
messages
that
I
have
in
my
even
hub
and
if
you're
patient
you
can
see,
you
can
see
they're
the
exact
same
to
it,
that
we
ingest
it
yep.
B
So
to
recap:
I
have
my
secret
scope.
I
have
my
runs,
so
you
can
see,
and
after
that,
I
ran
and
I
ingested
tweets
and
then
with
the
even
hub
ingest.
That
was
my
data
breaks,
notebook
to
just
check
that
stream.
So
I
can
see
the
value
of
my
stream
yeah,
so
plus
I
like
to
share
that
Harvey
built
this
operator
and
then
share
some
of
the
lesson
learned
that
we
learned
along
the
way
for
building
the
operator.
We
use
queue
builder.
There
are
so
many
tools
and
frameworks
that
we
could.
B
You
could
use
a
new
skill
builder
behind
the
scene
to
use
this
kubernetes
api,
machineries
and
customize,
and
it's
really
easy
for
creating
custom
resource
definition
for
those
who
are
not
familiar
with
custom
resource
and
custom
controller
and
custom
resources
endpoint
in
kubernetes.
That
allows
you
to
save
a
structure
data,
and
but
with
that,
it's
not
enough.
So
if
you
want
to
be
the
operator,
you
need
a
custom
controller.
You
need
the
logic
to
set
up.
B: CRDs have been used for third-party extensions, and we are using CRDs; and now, more recently, you can see that even in Kubernetes itself they are adopting CRDs for built-in functionality. Tim Hockin, one of the co-founders of Kubernetes, shared his vision recently that everything is going to be CRDs soon, and there shouldn't be anything that they can do that you can't do. Another tool that I'd like to share with you, one that helped us a lot, is kind.
B: kind allows you to create a local cluster, and what I really like about kind is that it uses Docker and works on different operating systems. Our team was distributed and using different operating systems, so we used kind for creating our local clusters. The benefit of using kind is that you can create a cluster and tear it down in two or three minutes, and if you build the image of your operator on your local machine, you can load that image onto the node of your local cluster with kind load docker-image.
B: That saves you a lot of time and compute, because you don't need to push the image to an image repository, Docker Hub or another registry in the cloud, and then pull it down when you are testing, especially when you are writing webhooks. Everything is local, on your own computer, and it's especially good in your test pipeline, so you can actually test your operator.
B: Another thing that we found very useful, in terms of onboarding, is using a dev container. A dev container runs the source code inside a container. Before using the dev container, our onboarding process was taking about half a day; after using the dev container, it was reduced to three minutes. You just clone the repository, and then, with Visual Studio Code and the dev container, everything runs inside the container.
B: It has all the setup for the code, all the setup for using Kubernetes and kind, and it also gives you the ability to debug and run, so it's very powerful and really easy. If you have a distributed team, or a team with different setups, I highly recommend checking it out; we got a lot out of using the dev container. This is our GitHub repo; please check it out, and if you have any use case or anything you'd like to chat about with us, Jordan and I will be around.