From YouTube: SEG MLOps Update - December 14th 2021
Description
This week: Running Jupyter Notebooks as Gitlab Pipelines,
Feedback and Ideas for Jupyter Support: https://gitlab.com/gitlab-org/gitlab/-/issues/343024
Feedback and Ideas for Pipeline Experimentation: https://gitlab.com/groups/gitlab-org/incubation-engineering/mlops/-/epics/6
All Updates: https://gitlab.com/gitlab-org/incubation-engineering/mlops/meta/-/issues/16
A
Today we're going to talk about pipelines, about Jupyter, and about further news from the industry. Just as a starter, remember the vision of the SEG, of what I'm doing here: I want to make GitLab a tool that data scientists want to use. Not just have to use, but want to use. We are doing that by looking at our portfolio and seeing where we can augment data scientists' and machine learning engineers' workflows with GitLab. What can we update? What can we change to make their lives better?
A
In that sense, let's begin with what was done this past week, starting with Glitter. What is Glitter? Glitter is GitLab plus Jupyter. A very common workflow in data science is that the data scientist creates the model in a Jupyter notebook, does everything in the notebook, and then wants to put that model into production. But production doesn't really play well with Jupyter notebooks. What usually happens is that a machine learning engineer or a software engineer picks this up, or the data scientist migrates the code from Jupyter into a Python script. A lot is lost in this translation, or it takes a lot of time, or the result doesn't really work as expected.
A
So why not try to avoid moving the code away from Jupyter? This is where Glitter comes in. Glitter is a way of converting a Jupyter notebook, a sequence of cells, into a GitLab pipeline, and this week I created a POC for it. This is part of our exploration series on GitLab pipelines for hyperparameter optimization. We're not really working on the hyperparameter side in this step, but more on the Jupyter side, and it's really interesting. Suppose that I have this notebook over here. It is a very simple one.
A
It has three steps. There is something here that is not shown, but there is a cell over here: if I show the raw version, there is a configuration part as well. Then come the three steps: the first prints hello one, the second prints hello two, and the third prints hello three. So what do I do?
A
I come over here and I create a parent pipeline that calls Glitter. Glitter parses this notebook into a YAML file, a valid CI file, and then the pipeline runs it. It's a very simple script. For example, this part over here creates each step. This is a very early concept, very much a POC. It creates a pipeline which in turn calls glitter run, a function that just executes the code within a given cell of a notebook.
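As a rough illustration of the notebook-to-CI step described above, here is a minimal Python sketch. It is not Glitter's actual code: the function name, the `run_cell.py` runner script, the `python:3.9` image, and the choice to count only code cells are all assumptions made for the example. It emits one stage and one job per code cell, so the cells run in order.

```python
def notebook_to_ci_yaml(notebook: dict, notebook_path: str) -> str:
    """Build a GitLab CI YAML document with one job per code cell."""
    # Positions count code cells only, so job 0 is the first code cell
    # even when a raw configuration cell precedes it (an assumption).
    code_cell_count = sum(
        1 for cell in notebook["cells"] if cell["cell_type"] == "code"
    )

    # One stage per cell: GitLab runs stages in order, so cells run sequentially.
    lines = ["stages:"]
    lines += [f"  - cell_{index}" for index in range(code_cell_count)]

    for index in range(code_cell_count):
        lines += [
            "",
            f"cell_{index}:",
            f"  stage: cell_{index}",
            "  image: python:3.9",
            "  script:",
            # run_cell.py is a hypothetical runner script, not Glitter's real CLI.
            f"    - python run_cell.py {notebook_path} {index}",
        ]
    return "\n".join(lines) + "\n"
```

A parent pipeline job could write this string to a file, publish it as an artifact, and trigger it as a child pipeline.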
A
I pass a path and an index, and it runs just that piece of code. On the pipeline it looks like this: first it generates the notebook CI, then it runs the notebook CI. Like I mentioned, there is one job for each cell. For example, this is cell zero, and job zero prints hello one, hello one, hello one, as expected.
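The "pass a path and an index" idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the real glitter run: the function names are hypothetical, the index counts code cells only, and each cell runs in a fresh namespace, so no state is shared between cells.

```python
import contextlib
import io
import json


def load_notebook(path: str) -> dict:
    """Read a .ipynb file, which is plain JSON on disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def cell_source(notebook: dict, index: int) -> str:
    """Return the source of the code cell at position `index`."""
    code_cells = [c for c in notebook["cells"] if c["cell_type"] == "code"]
    source = code_cells[index]["source"]
    # The notebook format allows either one string or a list of strings.
    return source if isinstance(source, str) else "".join(source)


def run_cell(notebook: dict, index: int) -> str:
    """Execute one code cell in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(cell_source(notebook, index), {"__name__": "__main__"})
    return buffer.getvalue()
```

A CI job would then call something like `run_cell(load_notebook(path), index)` and print the result.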
A
This is a very early concept, and of course there is a lot to be done here: sharing state between cells, running multiple cells as part of one job, manual events, better configuration, or running some cells in parallel instead of sequentially. So there is lots to be done, but I think it's a very promising start, and it's something that nobody in the community is really doing.
A
There's this big bias, or not bias but line of thought, that anything that is production shouldn't be done in a Jupyter notebook. I kind of disagree with that. I think that if the team is comfortable with notebooks and wants to use them, there's no reason why not. The problem is not the Jupyter notebook; it is the tooling, and the lack of tooling, that makes it complicated.
A
So this is what we're tackling a bit here. After the pipeline, we also worked on the Jupyter experience. Those are the two main lines of work we are looking at right now. We started by refactoring the code for the diff that we released before. With that first version, we wanted to publish something rather than be really thoughtful with the code base, but now it is live and it is being used.
A
We wanted to expand on it, so refactoring was important. From user feedback, which was really good to get because it means users care about this, we know that they still want the raw diff. They like the rendered diff, but the raw diff is still important.
A
So what we're working on now is a way to show both of them at the same time. That is a bit challenging, because it means we need to know the conversion between line numbers so that comments land in the right place. For example, if I comment on the raw diff, the comment should display in the correct place on the rendered diff, and vice versa. It's technically a lot more challenging, but nothing that can't be done, so we are working on this.
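To make the line-number conversion concrete, here is a small Python sketch of one possible source map. It is a heuristic illustration, not the library's implementation, and it rests on several assumptions: the rendered view is each cell's source lines followed by one blank line, each source line sits on its own line of the raw file (as in the usual indented .ipynb layout), and non-ASCII text is escaped the same way `json.dumps` escapes it. The function name is hypothetical.

```python
import json


def build_source_map(raw_text: str) -> dict:
    """Map 1-based raw .ipynb line numbers to 1-based rendered line numbers."""
    notebook = json.loads(raw_text)
    raw_lines = raw_text.splitlines()
    raw_to_rendered = {}
    cursor = 0         # position in raw_lines; only ever moves forward
    rendered_line = 0  # number of rendered lines emitted so far

    for cell in notebook["cells"]:
        source = cell["source"]
        if isinstance(source, str):
            source = source.splitlines(keepends=True)
        for text in source:
            rendered_line += 1
            encoded = json.dumps(text)  # how the line appears inside the raw JSON
            for position in range(cursor, len(raw_lines)):
                # Drop indentation and the trailing comma between list items.
                if raw_lines[position].strip().rstrip(",") == encoded:
                    raw_to_rendered[position + 1] = rendered_line
                    cursor = position + 1
                    break
        rendered_line += 1  # assumed blank separator after each rendered cell
    return raw_to_rendered
```

Inverting the dictionary gives the other direction, so a comment placed on a rendered line can be translated back to a raw line and vice versa. Raw lines with no rendered counterpart, such as metadata, simply have no entry.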
A
It might take a while to get everything right, but it's really comforting that we are receiving this feedback and that users care about this. They care enough about the diffs that they come over, ask questions, and talk about it.
A
We are rendering about 12k Jupyter diffs per day, which is not a small number considering it was never something we gave attention to. So 12k is quite a nice number, and we'll be building on top of it. Some extras: Bailey shared something on her Twitter. She is a director of AI or ML at GitHub, I believe; I can't remember exactly. She asked the community on Twitter: what is the one thing that she could change at GitHub for machine learning or data science?
A
I linked that right over here, if you want to go over it later, but the highlights are very interesting. The majority is Jupyter: most replies talk about either the Jupyter diff or Jupyter rendering. It's clear this is a must. That was already clear to us, but it's nice to keep receiving signals that we are on the right track.
A
The second thing they requested a lot is improvements to GitHub Actions for machine learning and data science, which is also something we are working on with our pipeline efforts. That is great: both of the things that users are asking for the most are exactly what we had already decided to work on, so I'm very happy with that. There was even one comment that mentioned a GPU runner, GPU for Actions, and it's great that we have this already as well.
A
So it was very interesting to read that; it really improved morale. Up next, we'll finish the source map that I've been building. It's going well, and we should be done very quickly, at least with implementing it at the library level. Integrating it with GitLab, with merge requests and with diffs, might take a bit of time, but we are going in that direction.
A
Meanwhile, the diff that we render is a Markdown version; it doesn't show everything in the Jupyter notebook. As a mitigation, for now we can try to render pretty diffs, like the notebooks themselves, using review apps. Over the next week I'll be creating a POC for this to see how it goes. We also have a few customer conversations coming up.
A
So again, if you have ideas, or if you're interested in what we are doing here, comment on the issue, subscribe to the update epic, and reach out. Have a good one.