From YouTube: OpenShift Commons Briefing: GCP Kubernetes Operator for Spark - Chaoran Yu (Lightbend)
Description
From OpenShift Commons Operator Framework SIG, October 2018
Lightbend's Chaoran Yu on the GCP Spark Operator
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator
Slides: https://docs.google.com/presentation/d/15WnTa7WYuQ5klkWbKmf0Ch_bVMR5rpgPfkfZmWwLlmk/edit?usp=sharing
Okay, yeah, sure. Today I can talk a little bit about the Google Cloud Platform Spark operator. I think two weeks ago I attended a meeting where someone presented their own Spark operator, but this one is from Google, and I think it has more momentum and more contributors and users. So let's look at what it's about. I have a few things I want to talk about today, so let's get started.
I'll just talk briefly about what the operator pattern is, though I think most people are already familiar with it; the GCP Spark operator is basically an implementation of this pattern. Then I'll talk about the architecture of this operator, how to install it, and what some of its basic features are. After that I'll talk about a CLI tool provided in this operator project called sparkctl.
It makes some of the workflow of managing Spark jobs easier, as we'll see. Then comes a feature called the mutating admission webhook. This is a feature that the Spark operator leverages to provide a lot of flexibility in customizing your Spark driver and executor pods; I think this is one of the most useful features in this project. As the last thing, I'll talk about exporting and looking at Prometheus metrics with this Spark operator, and I'll conclude with some future work that you can contribute to this project.
Now let's quickly get into the gist of the talk, the specifics of the GCP Spark operator. It was created by Yinan Li at Google, and it's open source; the link is provided on the slide. The approach it takes to managing Spark jobs is that it creates two custom resource definitions, or CRDs: one called SparkApplication and another called ScheduledSparkApplication. These CRDs represent the abstractions of a Spark job, and they are what make Spark jobs native citizens in Kubernetes.
We'll see what the spec looks like in a minute, but once you have the Spark job spec documented in the YAML, you would use kubectl or sparkctl, which we will talk about, to submit your YAML to the API server. Once the API server receives your request to, for example, create a new CRD object, a SparkApplication or a ScheduledSparkApplication, there is a component called the controller in the Spark operator that gets this request, assembles all of those configurations, and passes them to another component called the submission runner.
Among the other basic features: because it uses a YAML file to document the spec of a job, and YAML is declarative in nature, it's easy to do things like version control. And because, under the hood, what it really does is run the spark-submit command, everything that spark-submit takes, all those configuration options, this Spark operator also supports; you only need to figure out what to put in the YAML. For that translation there's good documentation you can use to figure it out.
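For example, a generic spark-submit option passed with --conf can usually be carried in the spec's sparkConf map. A minimal sketch of that fragment, assuming the sparkConf field documented for the SparkApplication CRD (field availability depends on the operator release):

```yaml
spec:
  sparkConf:
    "spark.eventLog.enabled": "true"   # equivalent to --conf spark.eventLog.enabled=true
    "spark.ui.retainedJobs": "100"     # any other --conf option goes here the same way
```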
It also supports cron-like scheduled Spark jobs; that's what the ScheduledSparkApplication CRD is for. Another interesting feature is the mutating admission webhook: the operator uses it to enable pod customization, so you can mount config maps or volumes into your driver and executor pods. I'll show that in a few slides. You can also use the Spark operator to enable automatic job restart and resubmission,
if you would like to change the spec of an existing Spark job, or to restart it upon failure, if that's what you want. And finally, it also supports exporting Prometheus metrics. This is an incomplete list of features, but I think these are the main things.
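As a rough sketch of those two pieces, a ScheduledSparkApplication wraps an ordinary application template in a cron-like schedule, and a restart policy on the application controls resubmission on failure. The field names follow the operator's documented CRDs, but the API version and the exact shape of restartPolicy differ between releases, so treat this as illustrative only:

```yaml
apiVersion: sparkoperator.k8s.io/v1alpha1   # newer releases use v1beta1/v1beta2
kind: ScheduledSparkApplication
metadata:
  name: spark-pi-nightly                    # hypothetical name
spec:
  schedule: "@every 24h"                    # cron-like schedule
  template:                                 # an ordinary SparkApplication spec
    type: Scala
    mode: cluster
    image: gcr.io/my-project/spark:v2.4.0   # hypothetical image
    mainClass: org.apache.spark.examples.SparkPi
    mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
    restartPolicy:
      type: OnFailure                       # resubmit the job when it fails
      onFailureRetries: 3
```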
Let's talk about prerequisites. It requires Kubernetes 1.8 and above, because it relies on garbage collection of custom resources, which is only available starting with 1.8. And if you would like to use the mutating admission webhook, then Kubernetes 1.9 and above is required, because that feature only becomes a beta feature starting with 1.9. Exactly which distribution of Kubernetes you install the operator on doesn't really matter; personally I have used it on GKE and OpenShift and both work fine. And, yeah, how to install it:
It's easy to install because there's an incubator chart in the central Helm charts repo. You basically add that repo to Helm and then helm install it, just as you would do for any other standard chart, and there are options to customize it. For example, you may want to install it in a given namespace, or there are some components that you would like to enable or disable, but you can look at the link for that.
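As a rough sketch of that install, assuming the incubator chart named sparkoperator and the standard incubator repo URL of the time (chart names, repo URLs, and option names like enableWebhook may differ between chart and Helm versions):

```sh
# Add the incubator Helm charts repo (URL current as of 2018; it may have moved since).
helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com

# Install the operator chart into its own namespace, enabling the
# optional mutating admission webhook component.
helm install incubator/sparkoperator \
  --namespace spark-operator \
  --set enableWebhook=true
```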
Now for the example spec. So basically, you name your Spark job, say spark-pi, and you would like to run it in the default namespace; that's where you specify your namespace. Then you provide your image, your main class, and your application file, all standard things. Then I can configure the driver pod with some memory and resource requirements, or the service account that it uses, and how many executor instances I would like to launch and the resource requirements for the executors.
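Put together, a spec along those lines might look like the sketch below; the structure follows the operator's documented SparkApplication CRD, but the API version, image, and jar path are placeholders for illustration:

```yaml
apiVersion: sparkoperator.k8s.io/v1alpha1   # newer releases use v1beta1/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: gcr.io/my-project/spark:v2.4.0     # hypothetical image
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark                   # service account used by the driver pod
  executor:
    instances: 2
    cores: 1
    memory: 512m
```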
So this is a very simple YAML, and you can have all sorts of other configurations, as long as the operator supports them and you figure out the corresponding spec in the YAML. But yeah, this is what it looks like.
The basic operations are very easy, because now that the CRDs are there, you can just create a Spark job the same way you would create a pod: kubectl apply the YAML. To list all the jobs, it's kubectl get sparkapplications (the name of the CRD), and you can describe a job to get other details.
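A minimal sketch of those kubectl operations, assuming the spec is saved as spark-pi.yaml and the job is named spark-pi as before:

```sh
# Create the Spark job from its YAML spec.
kubectl apply -f spark-pi.yaml

# List all Spark jobs managed through the SparkApplication CRD.
kubectl get sparkapplications

# Show details and events for one job.
kubectl describe sparkapplication spark-pi
```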
Let's now look at the custom CLI provided in this project, called sparkctl. Here I said it complements kubectl to make some operations easier, but I would say it can fully replace kubectl when working with SparkApplication or ScheduledSparkApplication CRDs, because everything that kubectl can do, sparkctl can do, and it makes things easier. For example, listing your Spark jobs: you can do sparkctl list, but with kubectl you would have to do
kubectl get sparkapplications, things like that, so it's longer, while this one is shorter. The same goes for the status of a Spark job: sparkctl status spark-pi, whereas with kubectl the command is again a little longer. Getting the events is again shorter, but with kubectl you would have to do kubectl describe sparkapplication and things like that, which is longer. And getting the Spark job logs is again a one-liner, but with kubectl you first need to find the pod.
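A few of those commands side by side, sketched under the assumption that the job is named spark-pi (the exact kubectl equivalents depend on how much of the output you want):

```sh
# List Spark jobs.
sparkctl list
kubectl get sparkapplications

# Check the status of one job.
sparkctl status spark-pi
kubectl get sparkapplication spark-pi -o=jsonpath='{.status}'

# Get events / details for one job.
sparkctl event spark-pi
kubectl describe sparkapplication spark-pi

# Get the driver logs; with kubectl you must locate the driver pod first
# (shown here assuming the default <name>-driver pod naming).
sparkctl log spark-pi
kubectl logs spark-pi-driver
```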
Besides that, there are a few other features. sparkctl also supports port forwarding to view the Spark web UI. Again, this is something kubectl can do too, because with kubectl you can just figure out the pod first and then do a port-forward on that pod. But here you don't need to find the pod; you just need to know that the Spark job name, here, is spark-pi. So it's a little easier.
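Sketched with the same assumed job name, the two approaches look roughly like this (the driver pod name and the UI port are illustrative):

```sh
# sparkctl: forward the Spark web UI by job name (driver UI defaults to port 4040).
sparkctl forward spark-pi

# kubectl: find the driver pod first, then forward its UI port.
kubectl get pods | grep spark-pi
kubectl port-forward spark-pi-driver 4040:4040
```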
It also supports staging local dependencies to S3 and GCS. So for the dependencies that you specify in your SparkApplication YAML, you can specify a GCS bucket or an S3 bucket to upload them to, to a remote place, but you need to configure your authentication and so on up front. The details are in the documentation, which I won't talk about here.
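A rough sketch of that workflow, assuming sparkctl's create command with the upload option described in the project documentation (flag names, URI schemes, and the credential setup vary by release, so check the docs for yours):

```sh
# Dependencies listed with local file:// paths in the YAML get staged to the
# bucket, and the submitted spec points at the uploaded copies instead.
sparkctl create spark-pi.yaml --upload-to gs://my-spark-deps   # hypothetical GCS bucket; an S3 bucket works similarly
```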
Okay, so now let's go to the mutating admission webhook. This is a feature of Kubernetes itself rather than of the operator, but the Spark operator leverages it to enable flexible customization of pods. What this feature is, is a so-called admission controller that intercepts requests to the API server and modifies the object before the object is persisted. As I mentioned, it's a beta feature in 1.9 and above, and the Spark operator uses it to achieve three use cases.
The first use case is mounting config maps in driver and executor pods. The second is mounting volumes. The third is setting pod affinity and anti-affinity, things like which nodes you would like to run on or which nodes you would like to avoid. Let's look at a few sample use cases. So when would you like to mount config maps in your Spark job pods? Here's an example: it's very common to have some custom configuration for a job in spark-defaults.conf, spark-env.sh, or log4j.properties.
You first create these files as config maps, and then in your YAML file you simply refer to the config maps that you created. Then, when the Spark CRD object, the Spark job, is created, those config maps that you pre-created will be automatically mounted inside the pods, inside the driver and executor pods, and your Spark job will be automatically configured as desired.
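A sketch of what that reference looks like in the spec. The operator documents per-pod configMaps lists (and a sparkConfigMap convenience field) for this; exact field names depend on the API version of your operator release, so the fragment below is illustrative:

```yaml
spec:
  driver:
    configMaps:
      - name: spark-config        # created beforehand, e.g.
        path: /etc/spark/conf     #   kubectl create configmap spark-config --from-file=conf/
  executor:
    configMaps:
      - name: spark-config        # the webhook mounts it into every executor pod too
        path: /etc/spark/conf
```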
Another use case for the mutating admission webhook is mounting volumes. Here's a use case that I've been working on myself: the use of the Spark history server. In this case, both the driver and executor pods of a Spark job need to log events to the same volume, which is also the volume used by the history server pod itself, for displaying them on its UI, for example.
So here, for example, I have a PVC as the volume, and then, in order to have the driver and executors log to that volume, you need this part with the volume mounts: you specify the name of the volume that's declared here, and then the path at which you would like to mount the volume. This way the volume is available at this /mount directory, and then you can log events there, which is configured here. Yeah.
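A sketch of that history-server setup, assuming a pre-existing PersistentVolumeClaim and using spark.eventLog.dir to point Spark at the mounted path (the names and the mount path are placeholders):

```yaml
spec:
  sparkConf:
    "spark.eventLog.enabled": "true"
    "spark.eventLog.dir": "file:/mnt/spark-events"   # the path mounted below
  volumes:
    - name: spark-events
      persistentVolumeClaim:
        claimName: spark-history-pvc                 # hypothetical PVC, shared with the history server pod
  driver:
    volumeMounts:
      - name: spark-events
        mountPath: /mnt/spark-events
  executor:
    volumeMounts:
      - name: spark-events
        mountPath: /mnt/spark-events
```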
So these are some use cases for the mutating admission webhook. I think it's a pretty useful feature, but it's an optional component; you can disable it if you don't want to use it. Last but not least, let's talk about Prometheus metrics. The Spark operator configures a Prometheus JMX exporter to run as a Java agent in the operator pod itself, but it also supports emitting Prometheus metrics in the driver and executor pods themselves.
So there are two sets of metrics. First, application-specific metrics, for example driver metrics like the app's status and job duration; these are metrics specific to that job, coming from a driver or executor pod. Then there's also a set of metrics at a higher level, for example spark_app_running_count; these are metrics provided by the operator pod itself, so these are application-level metrics. But note that some setup is needed to expose the driver and executor metrics.
For that first set, the Spark application image that you specify in your YAML needs to contain the Prometheus JMX exporter Java agent jar, otherwise the metrics won't be exported. But once you have that jar available in your image, it's easily configurable to have those metrics exported. Here's an example YAML file that configures the driver and executor metrics to be exported; this is for the first set of metrics that I showed on the previous slide. The operator pod itself is already configured to export the application-level metrics.
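A sketch of that monitoring section of the spec, following the operator's documented monitoring fields; the jar path inside the image and the port are placeholders, and exact field names depend on the API version of your release:

```yaml
spec:
  monitoring:
    exposeDriverMetrics: true
    exposeExecutorMetrics: true
    prometheus:
      jmxExporterJar: /prometheus/jmx_prometheus_javaagent-0.3.1.jar   # must exist in the application image
      port: 8090                                                       # metrics endpoint on driver/executor pods
```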
This slide is just the general point about how you look at those metrics. You can look at them in the Prometheus UI, or, for example, if you would like to verify the list of metrics exported by the Spark operator pod itself, you can find the pod and then do a port-forward on it. The default port is 10254, and once you have that port forwarded, you can go to the /metrics endpoint to see the list of metrics, the application-level metrics, that would be shown there.
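A quick sketch of that check, assuming the operator was installed in a namespace named spark-operator and keeps the default metrics port of 10254 mentioned above (the pod name is a placeholder):

```sh
# Find the operator pod (its name and labels depend on how the chart was installed).
kubectl get pods -n spark-operator

# Forward the operator's metrics port to localhost.
kubectl port-forward -n spark-operator <operator-pod-name> 10254:10254

# Inspect the application-level metrics it exposes.
curl http://localhost:10254/metrics
```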
The same is true for the driver and executor metrics; you can also look at them there. For future work: the current status of the project is that it's fully compatible with Spark 2.3 and 2.4; it's been tested with the 2.4 release-candidate versions, and it's currently alpha, but it will be upgraded to beta once 2.4 is officially released. We here at Lightbend are actively evaluating and contributing to this project.
Our past contributions included the Helm chart and the integration with Prometheus and the Spark history server. The project is still in its early stage and it requires lots of testing to make it mature, so more integration tests, I think, need to be added, and we are working on that, as well as code coverage reporting, which is currently lacking. Also, sparkctl doesn't have very good support for ScheduledSparkApplication; that's also something to be added. Just a few words about the team: I work at Lightbend.
The team I'm working on is called Fast Data Platform. Our product is a curated, fully supported platform that helps developers design, build, and run data pipelines, and our focus is on streaming data, data that moves in real time, so Kafka is obviously a very important component. Our entire architecture is built on top of Kubernetes; we used to be based on Mesos DC/OS, but now we are really betting on Kubernetes, and this Spark operator is what we are going with, well, at least for now.