A: Yeah, so welcome everyone to our bi-weekly meeting. Today we have Abdullah from Google presenting the work that has been done on supporting Kubernetes-native job queueing. Abdullah, maybe you can also briefly introduce the Batch Working Group and the work that has been put in there, because it's on topic. If you can say a couple of words; the link has been circulated, but just to put it in context as well. Otherwise I suggest we listen to Abdullah, and then we should have plenty of time for discussion. This should be a good one.
B: Okay, sorry, this camera... this is the one that's working. Thank you for having me. I'm Abdullah, I'm a contributor to Kubernetes, a co-chair of SIG Scheduling, and part of a recently formed working group within Kubernetes called the Batch Working Group.
B: I work for Google, as part of the GKE team, focused on batch as well. So, as Ricardo mentioned, a month or two ago we proposed to form a new working group within Kubernetes to focus on batch, to reduce the fragmentation of efforts related to batch workloads within core Kubernetes, and to try to make batch a first-class citizen of Kubernetes.
B: We feel that until recently, batch has been a guest on the platform rather than at home the way services are, and we want to push that use case forward. Based on the charter we've agreed on, the goal of the working group is threefold.
B: One is to come up with reasonable APIs to start jobs. We already have an API, the Job API, but we're looking to improve its capabilities, its reliability and scalability, and its applicability to various types of batch workloads, and to see how it can be reused to cut down the fragmentation we have in the community around building job APIs. For example, how can we run MPI workloads on top of the Job API? How can we run TensorFlow or reinforcement learning workloads on top of it? So that's one pillar.
B: The second pillar is job-level management. Most of the components within Kubernetes are pod-centric, whether that's the scheduler or the autoscaler.
B: Even quotas mostly work at the pod level, and this does not lend itself well to jobs and batch workloads in general, where most of the time you want to manage the whole job, not just a single pod. This is part of what we propose and what I'll be discussing in this presentation on Kueue. The third pillar is mostly focused on HPC, mostly at the node level.
B: That means enhancements to use special accelerators and special types of hardware, and how that works with scheduling, like NUMA-aware scheduling, or how we can better use FPGAs and so on. They have their own needs, like resetting the FPGA before a pod can use it, and all these kinds of hooks that allow special hardware to be better used within the Kubernetes ecosystem.
B: Do you have any questions about the working group? Thanks, Ricardo, for posting the PR for the working group; the charter has been merged. Once I'm done I will also post links to the mailing list, the Slack channel, and the poll where we're trying to decide what time the meeting is going to be. It's probably going to be on a weekly or bi-weekly basis, for one hour.
C: On that note, last time I checked in with Klaus, I think I was the only person who'd responded to the Doodle to try and pick the time. If people are interested in the conversation, there is a channel in Slack. What is it called, batch-wg or wg-batch?
C: Got it.
B
Yeah-
and
it
is
like
to
your
point,
one
of
the
basically
things
that
we're
planning
to
do
as
well
is
to
try
and
help
defragment
the
community
try
to
reach
out
to
cnc.
And
that's
what
we're
trying
to
do
here
as
well.
Presenting
what
we're
planning
to
discussing
within
core
kubernetes
to
the
larger
community
and
make
sure
we're
aligned
in
in
our
efforts
got.
B: I don't know about the CNCF one; I didn't read any charter for that group. But the Kubernetes one is focused on core Kubernetes enhancements: what do we do with the core Kubernetes feature set to make it easier to run batch workloads. The people working on it are leads in SIGs within Kubernetes. So the working group is going to set recommendations for how we can improve core Kubernetes to better execute batch workloads.
B
The
individual
sigs,
like
six
scheduling,
sig
apps,
auto
scaler,
sig
node,
will
take
on
the
execution
for
these
enhancements,
and
so
that
is
that
is
the
goal
of
the
kubernetes
working
group.
But
I
can't
speak
too
much
about
the
cncf
one.
A: Maybe I can say a couple of words. We had a few discussions about this in the TOC as well, and the goal is really to try to promote progress in this area as much as possible. That can happen in the Kubernetes core, like what Abdullah was describing, but there are quite a lot of initiatives in other CNCF projects that can have a different release cycle or focus than the Kubernetes core necessarily has. So the goal is really to see how these two groups go.
G: We are also declaring a lot of things that are very important for a broader batch approach as out of scope for the Kubernetes group. For example, there is no intent in Kubernetes to handle workflows, or many other aspects of job orchestration, or the experience of how the researcher or data scientist interacts with the whole stack. So we do indeed need to coordinate between the two working groups, but there is a lot of scope of work to be done.
G: That is way broader than Kubernetes, and there is a need for leadership and coordination to drive it, which is declared as out of scope for the Kubernetes working group. This working group wants to make sure that the Kubernetes primitives work very well with things outside. We do need to work on drawing the lines and making sure it's well coordinated, but I think it makes sense to have separate, very focused, and clearly defined work streams.
C: Hello, you'll be pleased to know that I added MCAD to the charter that we drew up for the higher-level CNCF working group. That's where we'll talk about things like Armada and Volcano and MCAD, and how we should all be working together on those pieces.
C: A huge part of working together will also be watching, reacting, and contributing to what the Kubernetes working group is doing, because that will play into all our eventual aims. But yeah, as we know, there's a whole bunch of other pieces to think about, like multi-cluster stuff and how we've all solved that. Those discussions can take place in the CNCF working group that we're trying to put together.
B: Sounds good. With that said, I'm here mostly to present Kueue. This is a new proposal that, again, focuses on the second pillar I mentioned for the Batch Working Group within Kubernetes, which is job-level management within core Kubernetes.
B: I wanted to start with definitions, but this audience is already aware of what a job is and how we define it. Quickly, though: we're thinking of jobs as computations that run to completion, basically a group of pods that run either independently, like Monte Carlo simulations, or collaboratively, like an MPI job, or even a reinforcement learning job where you have workers and drivers.
B: A job is sometimes flexible on when it could start, on location, like which zone it could run in, or even on the type of resources. Type of resources could mean the type of provisioning, for example spot versus on-demand, or even the type of accelerator, like whether it could run on GPU model X or Y.
B: On-prem clusters do have flexibilities in one way or another, but in the cloud this becomes an even bigger issue, because in the cloud we have way too many types of resources that users will look at to manage the trade-off between performance and cost. So this is a focus for us as well; it's a problem we want to solve. At a higher level, what is job queueing, or the type of job queueing we're looking at, and how do we define it?
B: Basically, what we're trying to do is have mechanics and mechanisms to manage access to a limited pool of resources shared by multiple tenants. What job queueing does is decide which jobs should wait and which can start now, based on a number of constraints.
B: And why do we need job queueing? On-prem this is clear: you have static and sometimes smaller-scale clusters. But in the cloud it's sometimes less clear why you need queueing, since people sometimes think of the cloud as infinitely scalable, able to absorb every single workload you have. That's not true. There are a number of aspects here. One is utilizing discounts.
B: Cloud providers offer discounts if you pre-declare how many resources you want. In Google, for example, we have something called committed use discounts: you can pay, for example, to use a number of cores over a three-year or one-year period. Now that you've paid for them, you always want to have them used, and you don't want to use more than that. So you've basically created your own static cluster within the cloud, and you want to manage access to those resources.
B: Another thing is that users have spending limits. They can't just keep executing every single job that gets created; they want to control their budgets, and so they have spending limits. They also want to introduce per-tenant limits: the users that run batch workloads are not individuals but big organizations with different research groups, etc., and you want to set limits per tenant, even in the cloud. And last but not least, we have cluster size limits; Kubernetes itself can't scale infinitely.
B: In GKE we do support up to 15,000 nodes, but in many other instances you can't scale to more than 5,000 or even 1,000 nodes, depending on your workload.
B: So what exactly do we think users want from queueing? Obviously, you want queueing: jobs that don't fit the existing capacity should basically wait and execute when capacity becomes available. You want knobs where users can decide on execution order, and knobs for fair sharing of the available capacity between multiple tenants.
B: Also, budgeting is not only about how many resources you can use at a specific point in time, but also over a period of time, plus the ability to set policies for who can use which types of resources and up to what limit. We have customers, for example, that open the tab for their users on preemptible or spot VMs.
B: There you can use as much as you want, run as many jobs as you can, but when using on-demand you have a specific limit, or you can't use it at all. Same thing with GPUs: those are expensive, scarce resources that you don't just hand to any tenant. And the last one is flexible placement, again across different resource types, locations, and time.
B: When your job is submitted to the queue, you want the ability to start the job based on what is available in your infrastructure and on the flexibility of the job as declared by the user.
B: Any questions? Do those requirements resonate? Do they capture the use cases that you have in mind for queueing?
B: I will get to the APIs in a second, but conceptually it's not tied to that. Initially we will implement it as a within-cluster controller, but I can imagine it being run in a nodeless cluster, for example, or running as a controller outside that manages multiple clusters and watches for jobs being created across them.
B: We need to fine-tune these concepts a little bit, but I don't see a problem having this controller running and watching over multiple API servers, trying to manage resources across multiple clusters.
I: So it's not tied to the single-cluster story, I guess. Is that a use case that you want to focus on for the first release, or does it come later?
B: The MVP is focused on running on the master of a single cluster; that is the MVP. The next step is how this can run outside to manage multiple clusters.
A: I'll add those links in the agenda as well, so that we can go back to them.
C: Thank you. Actually, one thing on what users want: I would add speed, or the scale of things.
C: I don't know that that's explicitly listed here, but when we tried to do this with just the regular Kubernetes scheduler, or by building a custom scheduler a couple of years ago, it just wasn't fast enough, especially when we scaled the cluster up to a really big size. So all of these things speak to us for sure, and then we have a few others.
B: Scaling to, I don't know, thousands of jobs or a million pods, that kind of scale?
B: You mentioned that this captures a subset of your requirements. If there are other requirements, please post them, maybe in the chat; we just want to make sure that we take them into consideration as we move forward.
B: So why a new controller? As you've noticed, plain Kubernetes doesn't really lend itself well to managing jobs with respect to queueing. In general, for anything you create on Kubernetes, the whole cluster is going to try to reconcile itself to create the pods, schedule the pods, and start the pods. There's no way to say: if there aren't enough resources, just don't do anything and wait until resources become available.
B: It will continuously attempt to do that, and it will work itself to death, especially when you have thousands or hundreds of thousands of jobs being created. And Kubernetes quotas are not dynamically enforced; enforcement basically happens at resource creation time. So it's a question of whether you are able to create the job in the first place or not, and if you don't have quota to create the job, then there's no place to park it until resources are available to run it.
B: Volcano is one of the most famous schedulers for gang scheduling. Our issue with Volcano is that it re-implements a number of existing functionalities. It is a scheduler, so it's a second scheduler running side by side with kube-scheduler, and that causes a number of issues related to race conditions, to the re-implementation of some features, and to how it can catch up with the features that we are actually pushing in upstream Kubernetes.
B: The second thing is that it has its own job APIs; it has a job lifecycle controller, so again it re-implements the Job API that we have in core Kubernetes. The other thing is that it lacks a clear integration with autoscaling. One important design aspect we have in Kueue is that it needs a clear integration with the cluster autoscaler, because that is an extremely important aspect of managing jobs: you want to allocate resources for the whole job.
B: How do we do that before the job actually starts? And how do you send it to a specific location, or a specific GPU model, or a specific CPU, or a provisioning type like spot versus on-demand? That's basically the last one: it lacks clear support for resource fungibility or flexibility. So these are the issues we have with Volcano. Here I also want to mention that GKE, Google Cloud, had a previous effort that has been decommissioned.
B: It was called Batch on GKE, a couple of years ago. It had similar issues: it reinvented scheduling, job lifecycle management, and autoscaling. The other thing is that it was closed source, so it was hard to meet customers' portability requirements. Customers want to run this on-prem; there are a ton of batch workloads that will continue to run on-prem, and we need to speak to those customers.
B: We want them to be able to manage their jobs on-prem and maybe sometimes spill into the cloud, or have a multi-cloud story, or an on-prem-plus-cloud hybrid story that's really well thought out. So our thought here is: let's try to come up with a proposal that, again, should be open source, driven by the community, that addresses the requirements we mentioned before, and that plays to the strengths of both the cloud and on-prem. The cloud has a ton of capabilities.
B: Those capabilities are exposed through autoscaling, and autoscaling should be a central piece of the design of any job management controller. That's, I guess, how we look at it. Any questions on this quick related-work review?
D: A question, sorry. Having too much focus on the autoscaler kind of leaves out the people running it on-prem, right? I would say that if the autoscaler is going to come first and foremost, then we don't care about bare metal, where you have a fixed set of machines.
D: It should be something where we care about the autoscaler, but we also care about people that are going to have a fixed set, because I feel like this story is being told as if batch is for autoscaling on a queueing system. You keep repeating that the autoscaler is the most important thing that we need to integrate, and yet I see people on this call, from academia and universities, who have fixed-size systems.
B: Existing systems were designed for fixed clusters, and I'm trying to emphasize that this is changing: we need to take the cluster autoscaler into consideration in an environment where you have a ton of elasticity and flexibility, and where batch workloads are migrating from on-prem into the cloud. I did not mean to dismiss fixed clusters; part of why Batch on GKE didn't succeed, for example, is exactly what you mentioned, and that's why we want to start from the position that it needs to be open source. Autoscaled environments have never been top of mind for those systems, and I was trying to emphasize that point, but maybe I overdid it.
I: A question: there is queueing and there is batch. In this proposal, are we going to combine both of them together, or are they still going to be two different entities?
B: Sorry, I didn't get that. What do you mean by batch and queue?
I: I mean, a queue can apply on top of the normal Kubernetes scheduler, right? You have a queue and you can put it in front of the normal Kubernetes scheduler, which schedules things one at a time. Now, batch is scheduling things together. So are we going to combine this batch scheduler and the queueing capability, or are they going to be two different things?
B: That's a great question, and it touches one of the main design principles that we're carrying here, which is: don't reinvent the wheel. When you say kube, we're talking about the kube-scheduler, am I right? (That's right, yeah.) This is exactly the point of this slide: we don't want to have a second pod-to-node scheduler running.
B
Package,
we
should
just
reuse
that
same
thing
with
job
lifecycle
management.
I
don't
want
to
propose
a
new
job
api.
We
just
need
to
manage
the
existing
job
api
and
have
hooks
to
manage
custom
workloads,
custom
jobs,
build
that
cannot
basically
reuse
the
job
api,
but
we
don't
want
to
force
like
we
don't
want
to
introduce
a
new
api
for
creating
jobs.
Basically,
so
so
yeah
we're
not
doing
that,
and
the
advantage
is
that
we're
using
significant
existing
functionality,
we're
not
concerned
about
functionality,
divergence.
B
It
enforces
separation
of
concerns
in
a
sense
that,
like
the
control
that
we're
proposing,
is
not
going
to
do
auto
scaling,
it's
not
going
to
assign
parts
to
nodes,
it's
not
going
to
create
the
parts
of
a
job.
All
of
that
is
the
existing
components.
It
will
only
decide
when
the
job
should
start
and
using
kate's
native
scheduling
directors
like
node
affinity,
chance
integration
to
direct
the
job
to
the
place
where
it
should
run
based
on
existing
capacity
of
the
cluster.
F: Yeah, so first, I'm 100 percent with you on separation of concerns: scheduling is separate, and this is a job meta-scheduler. I agree with you on not reinventing the wheel. The question I have is about the job representation and job lifecycle management, because I want something that is general enough that I can have a Spark job or a Ray job or whatever kind of job, however complex it is; it may include multiple deployments, etc.
F: I want to be able to say: this is my job, and queue it as one entity. So I'm not sure if the current Kubernetes Job specification is general enough to accommodate all those types of jobs.
B: This is a great point, and I will address it in the next slide. We want to support both. Again, we have users whose journeys are simple: they just want to run a batch job with the Job API, and we're trying to fix the Job API. But let me finish this slide and go to the next one to address that point. We do acknowledge that there are some cautions or limitations in this approach.
B: It creates two layers of resource management, so we need to make sure that we address that point. We have multiple components involved in starting the job, which may add extra latency and, again, could make things harder. All of these are things we need to make sure we address in how we design the controller and the UX.
B: Yeah, as I mentioned, we're trying to fix the Job API. For example, you mentioned array jobs, or Indexed Jobs: we introduced Indexed Jobs in the v1 Job API, and we fixed completion status tracking.
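As a side note, a minimal sketch of what an Indexed Job looks like with the batch/v1 Go types; the Indexed completion mode and the injected JOB_COMPLETION_INDEX environment variable are real parts of the feature, while the image and counts here are made up:

```go
package sketch

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/utils/pointer"
)

// indexedJob builds an "array job" of 10 completions; the job controller
// gives each pod a stable index via the JOB_COMPLETION_INDEX env var.
func indexedJob() *batchv1.Job {
	mode := batchv1.IndexedCompletion
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "array-job"},
		Spec: batchv1.JobSpec{
			CompletionMode: &mode,
			Completions:    pointer.Int32(10),
			Parallelism:    pointer.Int32(10),
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{{
						Name:  "worker",
						Image: "example.com/worker:latest", // hypothetical image
					}},
				},
			},
		},
	}
}
```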
B: Completion tracking was pretty much broken. Tracking was based on pod objects: if a pod completes, the pod object itself needs to continue to exist in the API server for the job to be tracked as complete. That did not work in environments where you have, for example, spot VMs: when a spot VM gets preempted, any pod on the API server that had a node name assigned to that node gets garbage collected, even if it had completed, so you basically lose progress in the job.
B: So we fixed that as well. We also introduced some new status fields, like tracking ready pods in the job status, which is required to implement TensorFlow and MPI jobs on top of the Job API. The point is that we are trying to improve the Job API to address the simple use cases and make it usable for implementing more complex workloads. But we do acknowledge that there will always be a percentage of workloads that will not be able to use the Job API; that's absolutely true.
B: That's why we have a concept like this in the resource model for Kueue: the concept of a QueuedWorkload. It's basically an abstract representation of any job in the queue, and the idea is that this QueuedWorkload object and API will serve as a proxy between the actual job, whether that's, for example, a Spark job as you mentioned, and what Kueue is queueing.
B: We also have the concept of a resource claim here. It's maybe a bit early to introduce, but it's an API that we'll be introducing to the cluster autoscaler to ask for resources, and this is what I meant by needing that native integration with autoscaling. It's not strictly necessary to have, for example, in on-prem environments.
B: The whole thing could still work without the resource claim, but in the cloud it will be quite powerful, because before starting the job we want to ask for resources: we communicate with the cluster autoscaler, the autoscaler tells us, okay, I have these resources in zone X or Y, and then we start the workload by injecting affinities into it, to send it to the resources that the cluster autoscaler provisioned for us. The last two concepts are maybe not completely surprising.
B: There is the queue, which is basically an organizing concept for grouping, managing, and reasoning about closely related jobs, and then there is the concept of a capacity, which defines how many resources exist.
B: For different tenants, we are reusing the namespace as the tenant concept, which seems to be a well-accepted concept in Kubernetes now. In this case, you would basically model your teams as namespaces. You would create queues for them; these queues are namespaced, and they point to a capacity. The capacity is a cluster-wide resource, so usually the cluster admin or the batch admin would be the one managing, and creating, the capacity and queue resources.
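A rough sketch of the shapes being described, with type and field names paraphrased from the talk rather than taken from a final API: the namespaced Queue is little more than a pointer to a cluster-scoped Capacity that holds the actual quota.

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Queue lives in a team's namespace and points at the capacity it draws from.
type Queue struct {
	metav1.ObjectMeta
	Spec QueueSpec
}

type QueueSpec struct {
	// Capacity names the cluster-scoped Capacity object backing this queue.
	Capacity string
}

// Capacity is cluster-scoped and owned by the cluster or batch admin.
type Capacity struct {
	metav1.ObjectMeta
	Spec CapacitySpec
}

type CapacitySpec struct {
	// RequestableResources is the total quota available to all queues that
	// point here, e.g. cpu: 1000, nvidia.com/gpu: 8.
	RequestableResources corev1.ResourceList
}
```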
B: These are the personas we're focused on. The batch user basically just runs and monitors jobs: they create the job, and the admin sets up the queues and capacities that decide when the job will start and how many resources exist for each tenant.
B: So this is a quick slide on the theory of operation. Sorry, I'm not paying attention to the questions in the chat; I hope someone like Aldo or Maciej is answering them, or please interrupt me if there's something I need to clarify more.
A: Just in the interest of time: we have around 15, maybe 20 minutes if we overrun a bit. If it's not a lot more, I would suggest we go through it and take questions at the end, but I don't know what people prefer; or we can just interrupt and see where we get.
B: I think after this slide the story will become a little bit clearer, and then I can show a couple of use cases and we can have questions. Here I'm just trying to show how this Kueue controller is going to work. As I mentioned, we are reusing a lot of existing functionality: the red boxes are existing controllers that are part of Kubernetes, and we're introducing a new one called Kueue. So, at time zero:
B: In the top left corner here you can see that the batch admin creates the queue and capacity resources, and they could have gatekeeper-type policies to select who can submit where. Then the batch user starts the job; let's assume here, again, that we're using the v1 Job API. They set the name of the queue where the job should be queued, and the job starts in a suspended mode.
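A minimal sketch of that first step, using the real batch/v1 suspend field; the queue-name annotation key is an assumption for illustration, since the exact key isn't spelled out here:

```go
package sketch

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/utils/pointer"
)

// queuedJob is what the batch user submits: a plain v1 Job that names its
// queue and starts suspended, so the job controller creates no pods yet.
func queuedJob() *batchv1.Job {
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "simulation",
			Namespace: "team-a",
			// Hypothetical annotation key; Kueue matches the job to a Queue by name.
			Annotations: map[string]string{"kueue.x-k8s.io/queue-name": "team-a-queue"},
		},
		Spec: batchv1.JobSpec{
			Suspend:     pointer.Bool(true), // Kueue flips this once capacity is granted
			Parallelism: pointer.Int32(4),
			Completions: pointer.Int32(4),
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers:    []corev1.Container{{Name: "main", Image: "example.com/sim:latest"}},
				},
			},
		},
	}
}
```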
B: We're going to have, for example, a webhook, and we're also discussing ways of setting policies within the Kubernetes community, so that we can enforce that jobs start suspended. While suspended, the job controller is not going to act on the job; it just ignores it.
B: The second step is that Kueue, which is watching these jobs, will assign them to a capacity. It will create a resource claim, if it has an integration with the cluster autoscaler, to understand where the resources are going to come from. Once the cluster autoscaler fulfills this resource claim, Kueue, which is watching it, will unsuspend the job. Once the job is unsuspended, the rest works the same as it does today: the job controller creates the pods.
B: So how do we do job-level scheduling? As I mentioned before, we use Kubernetes-native scheduling directives. Based on where the resources were allocated via the resource claim, Kueue will inject affinities, or even tolerations, into the job, to send it to a specific place.
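A small sketch of that hand-off for a plain batch/v1 Job, assuming the autoscaler came back with a zone; the function is made up, but the topology label and the suspend field are standard Kubernetes:

```go
package sketch

import (
	batchv1 "k8s.io/api/batch/v1"
	"k8s.io/utils/pointer"
)

// admit directs a suspended Job at the capacity the autoscaler provisioned
// (here, a zone) and then unsuspends it so the job controller creates pods.
func admit(job *batchv1.Job, zone string) {
	if job.Spec.Template.Spec.NodeSelector == nil {
		job.Spec.Template.Spec.NodeSelector = map[string]string{}
	}
	// Standard topology label; tolerations could be injected the same way.
	job.Spec.Template.Spec.NodeSelector["topology.kubernetes.io/zone"] = zone
	job.Spec.Suspend = pointer.Bool(false)
}
```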
B: For custom workloads, the idea is that once the custom workload is created, we would need a controller that understands it and translates it by creating a QueuedWorkload resource. I don't have the spec for the QueuedWorkload here, but it's basically how many resources you need.
B: It's basically a pod template and a count, maybe even an array of those, because you could have a driver and, for example, workers, like in a Spark job. Kueue would be aware of these QueuedWorkloads, watching them and assigning them to capacity, and then it would mark the workload as fulfilled, and the controller here would be the one that starts the custom workload.
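That description suggests a shape roughly like the following; again, these names paraphrase the talk and are not a final spec:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// QueuedWorkload is the proxy object a workload-specific controller creates
// so Kueue can queue a custom job without understanding its API.
type QueuedWorkload struct {
	metav1.ObjectMeta
	Spec QueuedWorkloadSpec
}

type QueuedWorkloadSpec struct {
	// QueueName is the namespaced Queue this workload waits in.
	QueueName string
	// PodSets lets a single workload carry several pod shapes,
	// e.g. one driver plus N workers for a Spark-style job.
	PodSets []PodSet
}

type PodSet struct {
	Count    int32
	Template corev1.PodTemplateSpec // used to compute the resources to reserve
}
```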
B: The main requirement is for custom workloads to support suspension. Basically, a workload needs the ability to start in a suspended mode, and there has to be a way for us to start it by setting suspend to false. This provides an agnostic way of deciding when the job can start and when it should stop, meaning preempted, for example.
B: We're discussing introducing a suspend sub-resource, similar to the scale sub-resource, if you're aware of it, which allows the HPA, Horizontal Pod Autoscaling, to work agnostically across different types of deployments. We are thinking of suspend the same way.
F: Yeah, just to make sure I understand: in this case, for example, if I'm interested in, let's say, Spark jobs that are started by a Spark custom resource, I need to make sure that whatever Spark controller there is implements the suspend API.
B: Yeah, you would need the top-level resource object that represents the job to have the ability to be told: okay, suspend, or resume. So this is the integration point, and we feel that this is a really small surface area of integration, relatively speaking. The complexity here is that we're at a point where Kubernetes is extremely flexible and allows you to build anything you want, and yet you want to manage all these types of custom resources.
B: So that's the design that we came up with, and the integration surface seems to us reasonably small. Hopefully we will not be proven wrong; we'll see how it works.
F: Yeah, I mean, we're putting a requirement on everybody: whoever is implementing the Spark controller, the Ray controller, whatever you name, the TensorFlow job, the PyTorch job, everybody has to implement that suspend interface. I guess that's the uphill battle here.
J: Hi, so it's not that uphill. I'm already a contributor to Kubeflow, so I can do this for the MPI operator, and, for example, I think Alex is here; he has discussed this already with the maintainers of the training operator, and they are fine with the idea. We just need to implement the change, at least for Kubeflow. I think this battle is pretty simple; it's not really a battle.
J: And I'm pretty sure we can work with other communities to integrate it. As Abdullah said, it is a simple field that doesn't require much thought.
G: I would also add that, ideally, that should not even be necessary: we would love for most of these tools to use the Job API, so that we can actually consolidate the base job lifecycle on the core API in Kubernetes.
G: Now, not all jobs fit; Spark is a good example that probably has some specific requirements that the Job API will not be able to meet. But we are at least looking into curating a stack-ranked list of all of the tools that do need to have this integration.
G: If we see that this gets traction with early adopters and first users, one of the elements of work, and help we could use, would be a targeted effort down that stack rank, starting with the likes of Airflow, Argo, and Kubeflow, in their various flavors, making sure that we have this integration everywhere.
B: The other thing here is that this idea helps in addressing scaling concerns: we don't want the pods to be created right from the beginning. That will help us scale. If you have hundreds of thousands of jobs being created and you want to queue them, you don't want all of them to create pods, leaving you to manage a million pods when only a tenth of them will actually execute at a time.
B: I feel this could also encourage a shift to a new design pattern that should be more scalable moving forward.
B: As you mentioned, we don't have a lot of time. The controller design is a different beast that we'll leave for another day, but the design document is there. We created a repo, and we have a proof of concept that we're planning to open source next week, so hopefully the community can start looking at it and helping us ship it and improve it.
B: I don't think we'll have time for the APIs; let's see if we have more questions.
D: A question on that. I've been thinking, after reading the proposal: adding a new controller to Kubernetes feels like a really heavy thing to do, right? Do you see this as something that can actually happen? I feel that for the last few years Kubernetes has focused on stability, and maybe on edge cases like telcos and all of that, and then this is adding a new controller.
D: It is a use case for a lot of people, but not for, I would say, eighty percent of Kubernetes use cases. So adding a new controller will make Kubernetes heavier. How does the Kubernetes community feel about this?
B: That's a very good question. We're starting as a subproject, not in core; we want to prove the case, to prove that this works. We are planning to integrate this with Kubernetes; that's why we're designing it such that it integrates with existing controllers, so that's one rock we're trying to avoid from the beginning. The other thing is that we have the kube-controller-manager, so it's not going to be a new binary, a new executable on its own.
B: It would just be another controller that gets created within the kube-controller-manager's set of reconcilers. That's also why we formed the Batch Working Group: to convince the community that there's conviction around these ideas, that there's momentum, that there's a new type of workload that we need to open Kubernetes up for. I can't tell you that it will happen, but we're trying, and we're making decisions right now that will hopefully help us make the case for having it in core Kubernetes in the future.
K: Sure, so we run HPC systems here at PNNL. I was curious how these queues are going to interact with each other. Usually we give each project a namespace, so they would kind of have their own queue in this API, but on our HPC systems we have queues where each of the projects submits their jobs, and they can see where they are in the overall view of the system queue.
K: So they know, hey, it's going to be two days before their job starts, or whatever, and all the projects' jobs are fairly scheduled across the different projects, so one project doesn't dominate the whole system. How does having separate queues at the namespace level work for that use case?
B: Queues are simply, if you look at the API, a pointer to where the actual capacity is. Having the queue be namespaced solves a couple of problems. One is discoverability: users usually only have access to their namespace, where they can list the queues. The quota lives in the Capacity API, which is not a namespaced object, and it's something that multiple queues can point to. Even if you have multiple namespaces, you can group them using labels and say, okay, all of them point to the same capacity and share the same quota.
B: The other thing that the queue being namespaced helps us with is a case that someone brought up while discussing this in open source. Consider the case where a user wants to run an experiment, say thousands of jobs, but they don't want it to use more than, for example, 8 GPUs, for whatever reason. The queue gives users themselves the ability to create a queue in their own namespace and set limits within it.
B: Those queue limits don't give you any promise on whether you will get the capacity or not, but they cap the maximum amount of resources that your experiment is going to use. So I would imagine users creating a queue even when running large-scale experiments themselves, and setting those limits for that specific experiment.
K: So jobs are assigned to queues, but there's kind of a scheduler-level queue that aggregates all of the Queue API objects into the capacity, and it's looking at when the various jobs were submitted and still scheduling them evenly?
B
Exactly
yeah
like
at
the
end
of
the
day,
your
actual
key
is
going
to
be
the
capacity
where,
like
basically
we're,
gonna
decide,
okay,
which
one
is
going
to
get
executed
first
or
not
like
they
will
all
be
basically
dependent
on
the
capacity.
B: Does it open okay? So we will have everything there: this is the repo, we will upload the code there, and we'll have the links to the design documents and the API, etc. If you have specific suggestions, please create issues to help us better shape this project. Right now it's just a template, there's nothing in the repo, but we should upload something this week or next week.
A: And I guess people can have a look at the proposal in the Google Doc as well and put as much input as they can there. There was a lot of discussion going on while we had some time, but I see that people have a lot more feedback to give, so I think we can interact there; there's also the mailing list.
A: I also linked that in the agenda, so I suggest everyone checks those links. And let's say we sync again in a couple of months; we'll make sure there's a slot for this. It's been great, it's been super nice.
A: I saw a couple of new faces, a first-timer as well, so I hope you're here again in two weeks so that we can have a proper introduction. Otherwise, does anyone else have anything else to raise?
A
If
not
like,
thanks
again
to
abdullah
mache
canaldo
for
for
the
really
nice
presentation
and
we
meet
again
in
two
weeks
march,
2nd
in
principle,
the
topic
will
be
air
gap
solutions
and
we
stick
with
the
topic
for
now,
but
we'll
we'll
send
the
reminders
as
usual.