Description
AI and data are changing the healthcare industry. Over the last year, Argo has helped us bring medical-grade AI workflows to production and improve the lives of over 20 million patients.
In this talk, I will share the practices and techniques we use to build reusable, production-grade AI workflows with Argo Workflows, how you can write your workflows as more reusable pipelines, and finally, how we integrated Argo with Jupyter notebooks for robust and fast experimentation.
Hi everyone, good morning. Today I want to talk with you about components, workflows and cookbooks, but more specifically, what I want to talk with you about today is machine learning, pipelines and heart attacks. Before I do that, I wish to do a short introduction.
So, I'm Omri, I'm an architect at a company called Diagnostic Robotics. I'm an engineer; while I'm not a data scientist myself, I can probably understand every second word that you say. What we do at Diagnostic Robotics is AI-driven precision population health, and you might be asking what AI-driven precision population health means. It basically means something like this: a nice meme from the Liam Neeson movie Taken. One of the issues with healthcare in the United States, and in the world in general, is that too many people don't get the healthcare that they need, whether it's because they don't have enough knowledge, or enough resources, or because their insurance company can't give them enough attention.
Now, MLOps is a lot of things, but today I wish to focus on pipelines, and pipeline orchestration is basically a solved problem. There are more than enough tools to choose from, and the one we chose to use is Argo. We love Argo, mostly because it's Kubernetes native and we are a Kubernetes shop, and because of its API and CLI.
It also has a really nice UI, and it allows us to create really robust deployments with Helm and Argo CD, which we use extensively. That lets us take our AI, which started as very basic notebooks, and make it work at scale. Here is an example of part of the pipelines we run; it's actually a very small screenshot of a very long pipeline that continues above and below what you see here. So: excellent job.
Then the business people said: now, can you do the same thing for another condition? I looked at the data scientists on my team, they looked back at me, and after thinking for a second we said: okay, oh dear. Because while going from research to production, we had somehow lost something along the journey. When we look at a pipeline, it might look simple enough in a demo.
But reality is much more complex, and development of these pipelines becomes complicated; nobody wants to work with 15,000-line YAML files to define these pipelines. Deployment becomes much more complicated too, because we have different models at different stages of life and different levels of maturity that need to run together in the same environment and on the same data. Now you have to make sure that one researcher does not create issues or problems for other researchers, or even for our production code, and iterative experimentation suddenly became very slow.
In the old notebook days, we just had to make a change, hit enter, and see the change actually take effect. Now we need to wait for an image to build, wait for a pipeline to actually be deployed, and only then run it. That can take a few minutes, and it really hinders our ability to do fast research. It led us to the understanding that when we're talking about an AI product, or a machine learning product, there are actually two modes we need to think about.
One mode is the research mode, where we need to be very fast, very experimental, very adjustable, and it's more or less an iterative process; in this process, the data scientist is the king. And then there is the development and production mode, where we need to think about scalability and reliability, and make sure that the thing is governable, which is especially important when we're talking about medical-grade products, and we need it to work in a fire-and-forget mode.
So when we're talking about the progression, or the spectrum, between research and production, it ranges from the most basic steps, where we're trying to do basic exploration, just going all over the place and looking for interesting data; there we need to be very, very fast, and we don't care much about scale or the quality of our code. Then, as we move more and more towards production, we start to look into things like retrospective studies, which actually need some governance and some level of scale.
But we still need the ability to react fast and to adjust fast. Once we move beyond that, we need to pass clinical validation, where an actual medical doctor looks at the model and actually signs it off. Then we need to make sure that our code and our system are governable, and that we don't just change things without any control. And then we go into prospective research, which is basically sending it into the market and testing whether it works or not in real life, on a real population.
MLOps adds complexity and gets in the way of experimentation, and the solution, surprisingly, was not necessarily more tools; the solution was architecture. So today I wish to share with you five lessons that we've learned about moving from research to production and back again.
So the first lesson is: Pythonize your pipelines. The reason you might want to Pythonize your pipelines is, first of all, that it's developer friendly. People know how to write Python code; they don't necessarily know how to write YAML. Python is much nicer to work with, and we get all the nice things that we get in code, whether it's abstractions, autocomplete, schema validation, testing or linting.
There are several ways to do that. The way we chose in the beginning was to just generate code from the OpenAPI schema, but today there are a lot of nice open source libraries you can use to generate pipelines from your Python code. Our first attempt looked more or less like this; it's basically a hello-world pattern and how it looks in Python. But it quickly turned out not to be good enough.
It wasn't good enough because it wasn't easy enough for data scientists, and we kept finding more and more ways to make it easier for our data scientists and data engineers to write better and better Python code. The thing we found complex is that for research we sometimes actually needed to run these pipelines locally. We didn't want to go through the entire process of pushing the image, pushing the pipeline to production, and running it on a remote cluster for scale.
Sometimes all we needed was to run our code very quickly, or just run our pipeline locally and maybe stop it or debug it in the middle. When we tried to do that, we actually ended up building another set of pipelines, written in Python, that could only work on our local computers, and now we suddenly had two sets of pipelines.
One is a pipeline for scale, and one is a local pipeline for running locally and debugging, and this actually created an issue, because we had to maintain both of these pipelines. Then we thought about it and asked: can we run our code locally for research, and then run the same code at scale in Argo, without the need to maintain both code bases? We went back to the drawing board and thought about writing a bit of a new DSL, a new library, for writing better pipelines. And here enters the Pythonic DSL.
What is a Pythonic DSL? Well, a Pythonic DSL is a mostly backward-compatible subset of Python. What does a mostly backward-compatible subset of Python mean? It doesn't mean that every piece of code you write in Python will automatically get translated into Argo pipelines. It does mean that everything you write in this DSL can run locally as standard Python code.
So let's have a look at how it works. Here is a very basic piece of Python code, probably the most basic map-reduce job that I can think of, and it looks like legitimate Python code. It has some sort of generate-list step that creates chunks of data, a map task that should be parallelized, and some sort of reduce task that ties it all together.

To turn it into an Argo DAG, all we need to do is add some decorators that define which of these are tasks, which are DAGs that call other tasks, and where we wish to run them; in this specific case, we want to run them on a Python 3 image. The neat thing about this is that from the same Python methods we can create a workflow that will run at scale in Argo.
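As a rough illustration of the idea only (the decorator names and compilation mechanism here are assumptions for the sketch, not the actual library described in the talk), a Pythonic DSL of this kind might look like this:

```python
# Minimal sketch of a "Pythonic DSL" for pipelines: plain Python that runs
# locally as-is, while decorators capture enough metadata to later compile the
# same functions into an Argo Workflows DAG. All names here are illustrative.
from __future__ import annotations

def task(image=None):
    """Mark a function as a pipeline task; `image` hints where it should run."""
    def wrap(fn):
        fn._pipeline_meta = {"kind": "task", "image": image}
        return fn
    return wrap

def dag(fn):
    """Mark a function as a DAG that orchestrates calls to tasks."""
    fn._pipeline_meta = {"kind": "dag"}
    return fn

@task(image="python:3")
def generate_list(n: int) -> list[int]:
    return list(range(n))          # produce the chunks of data

@task(image="python:3")
def square(x: int) -> int:
    return x * x                   # the "map" step, parallelizable in Argo

@task(image="python:3")
def total(values: list[int]) -> int:
    return sum(values)             # the "reduce" step

@dag
def map_reduce(n: int) -> int:
    chunks = generate_list(n)
    mapped = [square(c) for c in chunks]   # fan-out in Argo, a plain loop locally
    return total(mapped)

if __name__ == "__main__":
    # Runs as ordinary Python on a laptop; a compiler pass over the
    # _pipeline_meta attributes could emit the equivalent Argo Workflow spec.
    print(map_reduce(5))
```

Locally this is just Python; the idea is that a small compiler can walk the decorated functions and emit the corresponding Argo templates and DAG tasks for running at scale.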
Now, if you don't know what In-N-Out Burger is: In-N-Out is a very famous fast food chain on the West Coast, and what they became famous for is their secret menu. The secret menu basically allows you to mix and match lots of In-N-Out ingredients into customizable meals with crazy names like the 4x4 Animal Style or the Flying Dutchman.
But when you look a little deeper at the magic of the In-N-Out secret menu, the really neat thing about it is that they let you customize their menu to your will, yet they do it with the same four or five or six ingredients that they already have and already use for their basic meals. And that's, I think, the really nice thing: on one hand they allow you to customize your meal, and on the other hand they use the same system, the same ingredients and processes, which allows them to create something reproducible, always at a high level and very, very consistent.

I promised to tell you about machine learning, pipelines and heart attacks, so that's probably the part about heart attacks. Yes, you can basically customize your burger into crazy territory; I don't know why you would do that, but you can. So what can we learn from In-N-Out about building our pipelines?
When we looked at our pipelines, we actually split them into three types of elements. One element is components: components are basically reusable pieces of code and patterns that we wish to abstract away. Workflows are more or less like recipes: they are reusable, atomic pieces of work that can be consumed independently or as part of a larger cookbook. And cookbooks are a combination of one or more workflows, tailored for a very specific use, just like a very specific customizable meal that is tailored for a specific customer.
So let's drill down and see how we use them. The first thing is components. What are components? Components are basically the pieces of the workflow that we wish to abstract away from our users: things like Kubernetes resource management, configuration injection, secrets, common environment settings. Basically, all the things that our engineers and our data scientists don't really care about. All they want is to configure their workflows and start running; they don't care about resource management for Kubernetes.
What we do is take this configuration and pack it into a really nice abstraction. In this example, we took a very specific configuration of a Kubernetes resource and simply named it a medium-memory machine. Now, when someone wants to write a pipeline that is intended for ETL, they will use this type of machine; when they want a pipeline or a task for training, they can use a training machine, or a train-with-GPU machine, and so on, without thinking about how it works behind the scenes.
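A minimal sketch of that idea, with made-up preset names and values (not our actual component library), could look something like this:

```python
# Sketch: named machine presets that hide Kubernetes resource details from
# pipeline authors. The preset names and sizes here are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class MachinePreset:
    cpu: str
    memory: str
    gpu: int = 0

    def to_k8s_resources(self) -> dict:
        """Expand the preset into a Kubernetes resources block."""
        spec = {"cpu": self.cpu, "memory": self.memory}
        if self.gpu:
            spec["nvidia.com/gpu"] = str(self.gpu)
        return {"requests": spec, "limits": spec}

# The "menu" that engineers and data scientists actually pick from.
MEDIUM_MEMORY = MachinePreset(cpu="2", memory="16Gi")
TRAINING = MachinePreset(cpu="8", memory="64Gi")
TRAINING_GPU = MachinePreset(cpu="8", memory="64Gi", gpu=1)

# A task author only says "run this on a medium-memory machine"; the
# abstraction turns that into the resource spec Argo and Kubernetes need.
print(MEDIUM_MEMORY.to_k8s_resources())
```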
The next part, or the next element, is workflows. A workflow is an atomic piece of pipeline that can work independently: it has an input and an output, and it does some sort of useful job. The basic difference between a workflow and just a simple DAG is that for a workflow we define another decorator, called a workflow template.
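Continuing the hypothetical DSL sketch from before (the decorator name and behaviour are illustrative assumptions), marking a DAG as an independently consumable workflow might look like this:

```python
# Sketch: a workflow is a DAG that is also registered as a named, reusable,
# independently runnable unit (conceptually an Argo WorkflowTemplate).
def workflow_template(name: str):
    """Register a function as a named, reusable workflow with inputs and outputs."""
    def wrap(fn):
        fn._pipeline_meta = {"kind": "workflow_template", "name": name}
        return fn
    return wrap

@workflow_template(name="train-model")
def train_model(dataset_path: str) -> str:
    # ...load the data, train, persist the model...
    model_path = f"{dataset_path}.model"
    return model_path  # the workflow's output, usable on its own or inside a cookbook
```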
The third part is cookbooks. So we have components, which are basically about abstractions, and workflows, which are reusable, well-maintained, production-grade blocks of work. Now we have cookbooks, which are basically a mix and match of these well-maintained pieces of work, plus lots of add-on pieces of work, or ad hoc workflows, that you can combine in order to create a customized workflow that we can use, whether it's for training or serving or whatever.
In the most basic manner, we can just use this to write down our entire pipeline, or a customized pipeline for a specific use case. If our use case becomes more complex, we can easily change it, and if we wish to run it, let's say for testing, we can just add another task, called sampling, and run it really, really quickly.
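As a hedged sketch of how such a cookbook might be composed from the earlier pieces (every step function here is made up for illustration), the shape is roughly:

```python
# Sketch: a "cookbook" composes maintained workflows plus ad hoc steps into a
# pipeline tailored to one use case. All step functions are illustrative stubs.

def extract_cohort(source: str) -> list[dict]:
    return [{"patient": i, "source": source} for i in range(100)]   # ETL workflow

def sample(rows: list[dict], fraction: float) -> list[dict]:
    return rows[: max(1, int(len(rows) * fraction))]   # ad hoc step for quick tests

def train_model(rows: list[dict]) -> str:
    return f"model trained on {len(rows)} rows"        # reusable training workflow

def training_cookbook(source: str, quick_test: bool = False) -> str:
    rows = extract_cohort(source)
    if quick_test:
        rows = sample(rows, fraction=0.1)   # bolt on sampling for a fast iteration
    return train_model(rows)

print(training_cookbook("claims-db", quick_test=True))
```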
The assumption is that cookbooks don't need to survive long, although some cookbooks might need to survive longer; those are more or less production cookbooks, while others are experimental and can live for two or three or four weeks until they're not necessary anymore. This allows us to mix and match the things that need to be more stable with the things that need to be more experimental.
The third lesson that we've learned is about deployment. We've talked about how we have a cookbook menu and how we automatically deploy our cookbooks into our Argo clusters, so our data scientists can run their cookbooks from the UI, and that makes things a lot easier for them. But the issue with that is that very quickly we got a lot of junk into the clusters. Someone wants to create a version with a small change.
They don't want to hurt other developers, so what they do is rename the workflow a little bit and push it. And then we end up with a lot of slightly different versions, and we need to figure out which of them need to be maintained and which don't. The solution we found for that is basically namespace isolation for pipelines: each branch in our Git repository receives its own namespace.
When you push an update to Git, your namespace will be created or updated automatically, and it will never hurt other researchers' namespaces. Each namespace can be isolated by resources and artifacts, so no data gets mixed between production namespaces and research namespaces.
We basically use naming conventions to define these types of namespaces. There are namespaces used for serving, whose branches need to start with the word serving; there are namespaces meant for research; and there is a master branch namespace intended for continuously validating the health of our system. We are, of course, using Argo CD to deploy all of this, and it makes our life a lot easier.
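A tiny sketch of what such a branch-to-namespace convention could look like (the exact prefixes and mapping rules below are assumptions for illustration, not our production logic):

```python
# Sketch: derive an isolated namespace from a Git branch name by convention.
# The prefixes and the mapping below are illustrative assumptions.
import re

def namespace_for_branch(branch: str) -> str:
    slug = re.sub(r"[^a-z0-9-]", "-", branch.lower()).strip("-")
    if branch == "master":
        return "master"              # continuous validation of system health
    if slug.startswith("serving-"):
        return slug                  # serving namespaces keep their prefix
    return f"research-{slug}"        # everything else is a research namespace

print(namespace_for_branch("serving-er-triage"))   # -> serving-er-triage
print(namespace_for_branch("feature/JIRA-123"))    # -> research-feature-jira-123
```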
The fourth lesson I wish to talk about is interactive experimentation. When we look at pipelines, especially in the experimentation phase, sometimes I would like to work at scale, then stop for a second, run a small piece of code locally on my machine, have a look at it, maybe go through a few iterations, and then, once I'm happy with the result, continue with the entire pipeline. The question is: how can I do that with Argo? Our experience in the beginning wasn't very nice; it was a very fragmented experience.
We went into the code, started writing something, started running something, then went into the Argo UI, launched our long-running workflow from there, stopped the workflow, went back to the IDE, back to the UI, and so forth and so forth. One of the things that we found became very effective is a small library that we wrote: a library that allowed us to integrate our Argo workflows into our notebooks and into our basic debugging and research code.
Basically, it looks like a very simple Argo client that allows you to very easily submit a job, whether it's a job that you've just created or a job that already sits on the cluster itself. You can simply call a Python method to initiate the job, wait for it to complete, and get all the relevant logs back to the console.
You can suspend and resume a job, whether it's long-running or not, and you can check the status of the job if you wish. One of the nice things is that you actually get the job results serialized and deserialized for you, so if you need to look into the outputs or artifacts of the job, all you need to do is call results.outputs, name the output, and you can treat it just like a regular Python object or Python dictionary. The same goes for artifacts.
If you wish to read CSV results, you can easily do that. And the really nice thing is that it integrates well with our Pythonic DSL, so we can take the code that we wrote, the pipelines that we wrote in a very Pythonic way, and on one hand run them locally, and on the other hand, with two lines of code, run them on a cluster, which makes things a lot easier.
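To give a feel for the shape of such a notebook-friendly client, here is a hedged, self-contained sketch; the class, method, and parameter names are assumptions made for illustration and the "cluster" behaviour is stubbed out, so only the style of interaction mirrors what the talk describes:

```python
# Sketch of a notebook-friendly client for submitting and inspecting workflows.
# Everything here is an illustrative stub (a real client would talk to the Argo
# API server); only the shape of the interaction mirrors the talk.
import io
import pandas as pd

class RunResults:
    def __init__(self, outputs: dict, artifacts: dict):
        self.outputs = outputs          # deserialized output parameters
        self.artifacts = artifacts      # raw artifact bytes, by name

class WorkflowRun:
    def __init__(self, results: RunResults):
        self.results = results
        self.status = "Running"

    def wait(self, print_logs: bool = False) -> None:
        if print_logs:
            print("workflow logs would stream here")
        self.status = "Succeeded"

    def suspend(self) -> None:
        self.status = "Suspended"       # pause a long-running experiment

    def resume(self) -> None:
        self.status = "Running"

class ArgoClient:
    def __init__(self, namespace: str):
        self.namespace = namespace

    def submit(self, workflow: str, params: dict) -> WorkflowRun:
        # A real client would POST the workflow to Argo; here we fake a result.
        outputs = {"metrics": {"auc": 0.91}}
        artifacts = {"predictions": b"patient,score\n1,0.8\n2,0.3\n"}
        return WorkflowRun(RunResults(outputs, artifacts))

# Typical notebook flow: submit, wait, then inspect outputs and artifacts.
client = ArgoClient(namespace="research")
run = client.submit("training-cookbook", params={"source": "claims-db"})
run.wait(print_logs=True)
print(run.results.outputs["metrics"])                           # a plain Python dict
df = pd.read_csv(io.BytesIO(run.results.artifacts["predictions"]))
print(df.head())                                                # CSV artifact as a DataFrame
```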
The second lesson is how to treat our pipelines as code, and that doesn't only mean writing our pipelines as code, but actually treating them the same way as code: making sure that we can run them the same way that we write and run our code. The next lesson is pipeline architecture: thinking about how to build your pipelines so they are reusable and stable over time, without losing the ability to do research.
So that's basically it. I invite you to go and use Argo Workflows, the Pythonic DSL that we wrote, and the client API that we wrote, and to talk with me more about how to create better Argo pipelines and about Argo pipeline architecture.