From YouTube: GitLab Experiment Tracking - January 2023 Overview
Description
Epic: https://gitlab.com/groups/gitlab-org/-/epics/9341
Feedback Issue: https://gitlab.com/gitlab-org/gitlab/-/issues/381660
All Updates: https://gitlab.com/gitlab-org/incubation-engineering/mlops/meta/-/issues/16
Hello, my name is Eduardo. I'm an incubation engineer for MLOps here at GitLab, and I've been working for a little bit on machine learning experiment tracking, incubating this feature in GitLab. Today I want to talk a little bit, not really a demo, but an overview of this project: what is the reason, what is it that I'm doing, why am I doing it, how am I doing it, and what are the next steps. To get started with machine learning experiment tracking, I'm going with the assumption that we don't know what it is.
To create a machine learning model, you often need three pieces of information. One is the code that generated the machine learning model. Second is the data that was passed through the code. And three, the hyperparameters, so the configuration of the code, the data, and the environment. With these three components, we have a machine learning model.
Now, when a data scientist is working, something that happens often is that we don't know what the best hyperparameters are for the specific code and data. So what we do is something called hyperparameter tuning, where we try a bunch of different hyperparameter sets on the same code and the same data. Each hyperparameter set will generate a specific model candidate, and then we compare which model candidate performs best and go on with it.
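The tuning loop described above can be sketched in a few lines of plain Python. This is a toy illustration, not GitLab or MLflow code: the hyperparameter grid, the `train_and_score` function, and its scoring formula are all made up for the example.

```python
from itertools import product

# Hypothetical hyperparameter grid: every combination is one candidate.
grid = {
    "learning_rate": [0.01, 0.1],
    "max_depth": [2, 4],
}

def train_and_score(params):
    """Stand-in for training: same code, same data, different params.

    Returns a fake validation score so the loop is runnable.
    """
    return 1.0 - params["learning_rate"] * params["max_depth"]

# Each hyperparameter set produces one model candidate.
candidates = []
for values in product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    candidates.append({"params": params, "score": train_and_score(params)})

# Compare the candidates and keep the best-performing one.
best = max(candidates, key=lambda c: c["score"])
print(best["params"])  # → {'learning_rate': 0.01, 'max_depth': 2}
```

With real training code, each iteration of that loop is exactly one "model candidate" in the sense above, and the collection of all of them is one experiment.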
A
An
experiment
in
this
case
is
a
collection
of
very
different
model
candidates
that
are
measured
according
to
that
they
are
comparable.
So,
for
example,
they
are
comparable
based
on
the
model
metrics
that
they
perform.
There
are
collection
of
model
of
candidates
trained
on
different
code
on
different
data,
on
different
hyper
parameter
sets,
but
that
are
comparable
in
some
way
or
another
and
experiment
tracking
is
a
registry
where
you
save
all
of
these
experiments
right.
So
this
is
what
we're
trying
to
build
over
here.
How are we implementing experiment tracking in GitLab? The largest player over here, the largest open source player, is MLflow. MLflow has a really good client, the part of the code that the data scientist writes to save information into MLflow. It has a lot of features, it has a large user base, and it is open source.
It has a large number of integrations, other tooling that integrates with it. But it doesn't really provide support for authentication or any of the corporate expectations of a tool to be used: authentication, user management, all this kind of stuff. In its setup, in its deployment, it's another tool that you're deploying internally.
So you need a platform engineer who will work on publishing this, either on Kubernetes or whatever, and setting up a way to store the artifacts and everything else. Another issue is that it's siloed knowledge. Sure, you can access the information that is in MLflow through an API, but that's extra effort you need to put in. So the knowledge, the information, stays a little bit siloed in it.
We have a few North Stars here that we want to work toward. One: we want to create an experience where there's zero setup needed; if you have GitLab, it will work for anyone that has GitLab. Two: for the data scientist that already has MLflow, or wants to use MLflow, minimal to zero code changes. Their code should just work, either on GitLab or MLflow; it doesn't really matter.
Third: leverage the GitLab platform. We can build a feature, but we want to go beyond that. We want to use this feature to inform the other stages of the DevOps lifecycle, and to use information from the rest of the DevOps lifecycle to improve the feature itself. So it's not something apart; it's part of the GitLab experience, it's part of the DevOps experience for the data scientists. And now, more on the technical side.
So how are we approaching this? How are we making sure that we're following these North Stars? MLflow is composed of two components, for this case. One, it has a client, which is the code that goes into the data scientist's code base, where they write: okay, I want to run this experiment and log this data. And there's the backend, which is where the information will go.
What we're doing here is replacing the backend with GitLab. By just switching the URI on the MLflow client, in the code that they are using or in the environment variable, so that instead of pointing to MLflow it points to GitLab, it just works. This is where we want to be.
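That switch, in practice, is just a configuration change. A minimal sketch follows; the instance URL, project ID, and token are placeholders, and the exact endpoint path and token variable are assumptions based on GitLab's experiment-tracking documentation, so verify them against the docs for your GitLab version.

```python
import os

# Hypothetical values for illustration: your GitLab instance, the numeric
# ID of the project that should hold the experiments, and an access token.
gitlab_url = "https://gitlab.example.com"
project_id = "12345"

# Point the stock MLflow client at GitLab instead of an MLflow server.
os.environ["MLFLOW_TRACKING_URI"] = (
    f"{gitlab_url}/api/v4/projects/{project_id}/ml/mlflow"
)
os.environ["MLFLOW_TRACKING_TOKEN"] = "<your-access-token>"

# The data scientist's existing training code stays unchanged: the MLflow
# client reads these variables, so its logging calls now write to GitLab.
print(os.environ["MLFLOW_TRACKING_URI"])
```

Because only environment variables change, the same training script runs against either backend, which is exactly the "minimal to zero code changes" North Star.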
We are creating a drop-in replacement for the MLflow backend within GitLab. The positive side is that, one, if the user has GitLab in their organization, or if they use GitLab.com or whatever, they automatically have experiment tracking. They don't need to set up a different service; they don't need to set up anything. It just works. Second, it's easier to integrate across the platform on our side, so I don't need to be calling an API to fetch information.
There is zero additional setup necessary to deploy; like I said, it just works. There's no platform engineer's help needed, no setting anything up. We have authentication by default here: by leveraging project and group user management, we already give data scientists authentication and user management automatically.
Where are we on this? I have a video over here that shows a little bit of an example. On the right side, I have the training code; on the left side, I have GitLab working. I'm not going to go through this video; I have the link over there, and I'll link it below as well. So at this point, we can already use GitLab as a drop-in for the MLflow backend. This is already possible.
We have already implemented the necessary endpoints and the minimum UI necessary for this to work. It already saves the artifacts in GitLab itself: if you log artifacts with MLflow, they will be saved on GitLab using the package registry.
We are currently dogfooding the MVP, so our data scientists are testing this out, pointing out the failures, requesting features or improvements, and I'm working to iterate on those. And what are