From YouTube: Applied ML weekly team meeting - Sep 23 2021
A
All right, so let's see where to start in the Applied ML weekly team meeting. The first thing: what are the open questions on our new approach for milestone two? For me, this is about which decisions are one-way doors (hard to undo, hard to go back on) and where we may not have buy-in from the stakeholders yet. I've got the issue here, and we've had a really lively debate in the issue. What are folks' thoughts? Alexander, looks like you have the first question.
B
So that's the main question, because I think the other things are consistent with the old version of milestone 2, except maybe for the Reviewer Roulette replacement, because we decided not to replace it.
A
Yeah, so that's definitely an easy one to answer for me. Milestone two is selected customers, including ourselves. It is definitely not all customers; even if we had the ability to do that today, I don't think we would. We'd want to try it out for a couple, learn from it, get feedback, etc.
A
One of the big questions (and I know, Alexander, you and I discussed this earlier this week; I think it's in an issue as well) is, depending on the approach, when we enable on a per-customer basis: is it a separate environment that's created to do the ML workloads for that customer, or is it the same shared environment with the data partitioned by customer? And how much manual effort is it to set up the environments if it's a separate one per customer?
B
So it means that we have a predefined CI template to make recommendations and generate the artifacts JSON file. Then we can parse this file, so that's fine, right? We can use either the UI integration or bot integration. But things like data extraction, model tuning, and training, I think they should go outside of customer CI jobs, so we can orchestrate them somehow, for instance using Airflow or some similar tool, because we already have this pipeline, like these stages. We only need to run them.
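As a rough illustration of the orchestration being described here, a minimal Airflow DAG sketch; the task names and bodies are placeholders, not the team's actual pipeline:

```python
# Hypothetical Airflow DAG sketching the "extract -> tune/train" stages
# run outside customer CI jobs. All task bodies are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_data(project_id: str, **_):
    print(f"extracting merge request history for project {project_id}")


def tune_and_train(project_id: str, **_):
    print(f"tuning and training the reviewer model for project {project_id}")


with DAG(
    dag_id="reviewer_recommender",
    start_date=datetime(2021, 9, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(
        task_id="extract",
        python_callable=extract_data,
        op_kwargs={"project_id": "example-group/example-project"},
    )
    train = PythonOperator(
        task_id="tune_and_train",
        python_callable=tune_and_train,
        op_kwargs={"project_id": "example-group/example-project"},
    )
    extract >> train  # train only after extraction succeeds
```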
B
I think the main question is who should maintain this infrastructure. We wouldn't be introducing something completely new for the company, for GitLab, because I see that the data team already uses Airflow and some other ETL tools. But in our case, for instance, if we launch all these things on our side, who should maintain them? One solution is to do almost the same thing as with Dataflow, right, because it's only a backend for Apache Beam. So we can use, for instance, Google Cloud Composer.
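For context, a toy Apache Beam pipeline in Python: the point is only that the same code runs on the local runner or on a managed backend such as Dataflow, which is the property being discussed; the pipeline content is invented:

```python
# Toy Apache Beam pipeline: identical code runs locally (DirectRunner)
# or on a managed backend such as Google Cloud Dataflow, which is why a
# managed Airflow (Cloud Composer) is the analogous orchestration choice.
import apache_beam as beam

with beam.Pipeline() as pipeline:  # DirectRunner by default
    (
        pipeline
        | "ReadMergeRequests" >> beam.Create(["mr-1", "mr-2", "mr-3"])
        | "ExtractFeatures" >> beam.Map(lambda mr: (mr, len(mr)))
        | "Print" >> beam.Map(print)
    )
```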
A
I started out poor at it and I still am not great at it, so take this from there. Should the infrastructure team maintain the infrastructure? I think definitely once it's in full production, maybe not in milestone two. But unless we have a good reason not to, I would still do that in milestone two.
A
How about this: we implement via CI job integration for one, then two, of our own projects, so maybe for Gideon, for GitLab itself, see what we learn from it, and then decide, rather than trying to predict the future perfectly, which is very, very hard to do. Do it more iteratively. What do you think?
C
Ultimately, if we have an applied ML feature, a reviewer feature, which we can dogfood, we can use it in our own project; on gitlab.com we can enable it for customers; and on self-managed, they can extract their own data and even set it up on their own. That's the ultimate target. Now, every component we use matters for that, like:
C
If you go and use a certain cloud provider, or a tool which is not an open-source component that we can bundle, then we'll have trouble. But those are very long-term architectural concerns. Ultimately, let me give you a silly example: Postgres. It's an open-source project, it's available everywhere.
D
Overall, I like this strategy. I just want to say: if we decide to, we don't need to go the CI template route. That was just an idea I put out there, but I could see how that could end up working long term, even for self-managed too. So I think this makes sense. This gives us a very easy pattern for how we can do feature flagging, as well as selective customer enablement.
A
Absolutely, yeah. We definitely don't want to finalize our short-term plan until Mon has a chance to weigh in, and even after that we want to try to make decisions two-way doors as appropriate. So let's say we go with the CI job integration, which in some ways I really like, because we do that for other things, and in some ways I don't: the user experience is not great. I know we do that.
A
At least, we did do that for the secure scanners, where you'd have to edit your CI YAML file, and if you got it wrong, things wouldn't work correctly. It'd be better for it to be a button or even a feature flag, but these are very early days, so I'm totally fine with it for now.
A
With the CI approach, we tell the customer how to integrate, or to try out, the CI job we've created. That will initiate the data extraction, the running of the models, the tuning of the models, and then the output: putting the comment in the issue, sorry, in the MR, with the recommended reviewers. So we'd kick it off with the CI job. Then where do we extract the data to and run the models? Is it a shared environment?
A
But the nice thing is it's one infrastructure that needs to be set up once and maintained in one place. If we build the infrastructure on the fly, we could partition customer data by...
B
Yeah, but the best thing is that, okay, I started to work on this issue today, and I found how to do it, because we can reuse GitLab tokens to authorize this, to authorize the CI template for making recommendations, and then...
B
So yeah, it depends. The main thing, as I understood it, is that we cannot use CI job tokens generated by the CI job. We cannot use these tokens to extract data via the API, because, first, they live only while the job is running, and they have a limited scope. So we cannot...
B
We cannot extract data with them. That's why I think we can generate a project token for each customer, or a customer can generate it, and then we can reuse these tokens, for instance to authorize UnReview, I mean to authorize it when we recommend reviewers, and to extract data. But there is one thing: if we decide to move the entire pipeline to CI jobs, they are limited in time. As I understood, Alper said there is only one hour, right?
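A hedged sketch of that project-token idea: Python pulling merge request participants through the GitLab GraphQL API with a placeholder token; the query and field selection are simplified:

```python
# Sketch: use a long-lived project access token (placeholder value)
# instead of a short-lived CI job token to extract MR data via GraphQL.
import requests

GITLAB_GRAPHQL = "https://gitlab.com/api/graphql"
PROJECT_TOKEN = "glpat-..."  # placeholder: generated per project/customer

QUERY = """
query($fullPath: ID!) {
  project(fullPath: $fullPath) {
    mergeRequests(first: 50) {
      nodes {
        iid
        participants { nodes { username } }
      }
    }
  }
}
"""

resp = requests.post(
    GITLAB_GRAPHQL,
    json={"query": QUERY,
          "variables": {"fullPath": "example-group/example-project"}},
    headers={"Authorization": f"Bearer {PROJECT_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
for mr in resp.json()["data"]["project"]["mergeRequests"]["nodes"]:
    users = [p["username"] for p in mr["participants"]["nodes"]]
    print(mr["iid"], users)
```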
C
That's the default, and a lot of people are not aware of it. I've done a lot of stuff inside CI. GitLab shared runners also have a limit, which is three hours. That's something I think we can never change on gitlab.com, but a customer on their self-managed instance can go and set any timeout they want. Defaults are three hours for shared runners and one hour for projects at this moment.
A
Could we do something like this: the CI job kicks off for UnReview, initiates the data extraction, etc., but it's not the actual job that runs everything. We put something in a message queue to say we need to do the extraction for this customer, then run the models, et cetera, et cetera. So the CI job just kicks it off and finishes very quickly.
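A sketch of that hand-off, assuming Google Cloud Pub/Sub (which comes up later in the call) as the queue; the topic and attribute names are invented:

```python
# Sketch: the CI job only publishes a "please extract/train" message and
# exits; a separate worker consumes the queue and does the heavy lifting.
# Topic name and message attributes are invented for illustration.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("example-gcp-project", "reviewer-training")

future = publisher.publish(
    topic_path,
    data=b"train-request",
    gitlab_project="example-group/example-project",
    requested_by="ci-job",
)
print("queued:", future.result())  # message ID once the publish succeeds
```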
B
I just have one thing from our previous conversation: we have the CI job to recommend, or to train. I mean, initially, with the first run, we need to start extracting data, but we can do that in the background somehow. So we just start this process and then we say: okay, you need to wait a bit while the model is trained.
C
A question for Alexander. Ultimately, let's say we have 10 million projects with repositories, and we go crazy and we want to train UnReview in each repository, for every project on GitLab, and make it available, let's say, not inside CI but as an infrastructure effort. How long will it take, or how big an effort is it?
B
It could take around one hour right now to train the model; it's quite fast. But I'm afraid that when we start to improve the model, for instance adding some NLP layers, I mean when we start to process merge request descriptions, we'll need much more time, much more, and I'm afraid of that. Because, for instance, customers paid for some CI/CD minutes and we will start to spend them.
B
Of course, there are so many things that we need to address, and we cannot change them all at the same time. For instance, as I understood, we can't automatically restart a CI job right now. Is that right?
B
I don't know how it works right now, but it takes a lot of time to extract even 25 or 50 participants using the GraphQL API, so sometimes the jobs fail because of that. It's easy to recover: we restart the job and then everything works fine.
C
The question: so the data you are expecting, is it available?
A
Sorry, no.
C
So, in short...
B
Yeah, right. I mean, I'm afraid that we can't automatically restart some of these CI jobs. For instance, if we need to extract participants, this job can easily be broken, because of that.
C
That's a good concern. In short: CI is made for testing, so jobs fail, your test fails, and you're happy you caught something. It's not made for this. If you need retry logic, I can't respond to that immediately; I'm sure there might be a feature there, but in short, if you need retry logic, you should implement that yourself in your own CI script. Yes, yes. However, let's investigate, and I think Fabio is more knowledgeable.
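A minimal sketch of the hand-rolled retry logic a CI script could carry, in Python; the flaky call stands in for the GraphQL extraction mentioned above:

```python
# Sketch: retry wrapper a CI script could use around a flaky extraction
# call (stand-in function), since CI itself won't retry this for you.
import time

import requests


def fetch_participants(url: str) -> dict:
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.json()


def with_retries(func, *args, attempts: int = 5, base_delay: float = 2.0):
    for attempt in range(1, attempts + 1):
        try:
            return func(*args)
        except requests.RequestException as exc:
            if attempt == attempts:
                raise  # out of retries: let the CI job fail for real
            delay = base_delay * 2 ** (attempt - 1)  # exponential backoff
            print(f"attempt {attempt} failed ({exc}); retrying in {delay}s")
            time.sleep(delay)


# data = with_retries(fetch_participants, "https://gitlab.example/api/...")
```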
B
Yeah, that's what I'm trying to say: we have a lot of things we need to address. So if we choose this way, we need to understand that it may take some time, because we'd need to collaborate with some other teams to adapt, for instance, the pipeline for this strategy, for this integration.
C
Say you have additional repository data every day. Okay, I have two questions. Do you do incremental training, or do you need to dump the data from scratch regularly? Or do you store the data somewhere and train incrementally?
B
Okay, so right now it's a full retraining, but a better solution, for instance, when we have new data, we can...
C
The second thing: the data you are getting from the GraphQL API, does it come from the database, or from the repository git history, or both? I'm trying to figure that out. Because if it's both, and you said it's too slow, the only other alternative is that we get some data directly from the database.
C
Somehow, because we have ways to connect to a replica; the data team has a direct connection, you know, that's what they do. Or, for the repository, we can do a non-shallow, full-history clone, and if we can work with that, at least during milestone two, we can check out a repository, then incrementally check out more and keep that somewhere in some cache.
B
There are some fields that are not stored in the database, like merged_at or something like that; I don't remember, Mihail knows better. And, for instance, we need to extract the changed files, and right now we cannot do that using the pure GraphQL API, so we need to work with the local repository at the same time.
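A sketch of that local-repository work in Python: deepening a shallow clone incrementally and listing changed files per commit with git; the repository path is a placeholder:

```python
# Sketch: work with a local clone when the API can't provide the data.
# Deepen a shallow clone incrementally, then list changed files per commit.
import subprocess

REPO = "/tmp/example-repo"  # placeholder path to an existing clone


def git(*args: str) -> str:
    return subprocess.run(
        ["git", "-C", REPO, *args], check=True, capture_output=True, text=True
    ).stdout


# Fetch 500 more commits of history instead of re-cloning from scratch.
git("fetch", "--deepen=500")

# Commit hash plus the files it touched, for feature extraction.
log = git("log", "--name-only", "--pretty=format:commit %H", "-n", "20")
print(log)
```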
C
Yeah, you mean merged_at, you said, just...
B
Yes, I think it's this field; there is an issue I can send you, that's for...
A
Alexander and Alper, this is a great discussion; we can definitely go over on time if you both can. I think that'd be great. I do want to jump to agenda item number five: Taylor's got an announcement. Then we can come back to this.
D
There we go; sorry, I couldn't find the unmute button. Sorry, I'm driving someone to the airport. So basically, I interviewed for the Applied ML PM role. I am thrilled to say that I have accepted it, technically as of 10/1. I'm off at the end of next week, so I'll officially be your product manager starting 10/4. I'm around, as you all know, so I'll be able to give this more dedication.
A
Great, good, awesome. You're officially part of the team now; even unofficially it was fine too for a while, but that's awesome news, Taylor.
C
In a bit, I'm interviewing a candidate in just a few minutes, so I think I'll drop off, and I'll add it to next week's agenda.
A
Do you have more time to discuss what I interrupted, or do you not have time? If you don't, that's fine; we can continue some other time.
A
Okay, so sorry about interrupting you both; it's a great discussion. Why don't we get back to it? Where did we leave off?
C
If that's too detailed, Alexander, I can sync with you separately, because there are little unknowns from my side on how UnReview works, and from your side on how GitLab works. Because I was on the product intelligence team, I was able to see how different parts of the system work, since I was doing instrumentation there. So that might help us answer some questions.
C
No, I mean anything in milestone two, any concerns, you know... oh yeah.
B
Okay, so I'm trying to find the best way, or maybe not the best, the simplest way at this time, to automate this full pipeline that we have right now. I see that in general there are two ways. The first way: we use something like Airflow. The other way: we use GitLab CI jobs, CI pipelines. Both, I think, are fine, but it looks like pipelines are cool, yet maybe not adapted right now for the kind of task that we have, right?
B
You said that if we start to adapt them, it will be cool, because it will be cool for the company and for the customers, because we can help improve the tool. But I'm afraid that it can take a lot of time.
C
Your concern is totally right, but in turn, Taylor said he is also responsible for ModelOps, and Eduardo Bona is also quite interested in these discussions. Long term, if we really want to have applied ML in GitLab and in customer projects, we should solve this issue without a third-party cloud service. That's the question: how can we do it?
C
I don't know how Airflow is licensed, how it can be bundled. Apache, okay. In short, those are all questions, you know, and introducing a new component to bundle, or even onto gitlab.com, is hard. But doing a single instance where you run it on Confluent or GCP for yourself is possible now as part of milestone two, right? Because what you want to do is learn now.
C
So that's why, yeah, let's be concerned, and you're right: GitLab CI is not made for this exactly. Depending on how long the training takes and how long the repository extraction takes, you might hit the timeout, and also the data storage size, if you want to extract a huge git repository with a lot of storage and history.
A
What about the CI job being extremely simple, in terms of kicking off or queuing up a job elsewhere, something completely unrelated to CI, that is in UnReview, to say, you know, running the model has been requested for this project, for this customer. Then that happens completely separately: the CI job finishes basically almost right away, but all the UnReview stuff happens separately and in parallel, not related to the CI; the CI job just kicks it off.
C
Vane, I totally agree there. So one thing I envision now is: let's say I go to my project, I enable UnReview, I click a button that starts some background jobs, and it says: hey, your UnReview will be available in some hours and we'll email you. Then we really run a background job, we gather the data which Alexander needs in a certain format, do the extraction in the back end, and put it somewhere. Now, there are still questions here.
C
I don't know, but once the data is there, in the CI pipeline the training and, finally, the running of the model is kind of cheap, and the training and so on is not subject to a lot of delays from extracting or transforming that data. So I think we can do what you describe by just, maybe, clicking a button in some project, you know, initially behind a feature flag.
C
Something becomes available, and then I have UnReview, and then I can have a CI job or whatever. So that's one option.
A
With the CI job, the customer, the user, would have to add it specifically to their CI job. You know, we tell them in the documentation: if you want to turn on UnReview, add this text to your CI job, which is great; that would kick it off. Where does the feature flag play in? I think a feature...
C
I think so, and why? Because, you know, CI, as you said, is a manual process; developers love it, but it's not easy. Okay, you include a template, but it's finally something we as GitLab engineers love and do, and DevOps engineers do. And coming back, the feature flag could only be cosmetic on top of that.
A
What it means is, the feature flag would just be another way to kick off UnReview: when CI jobs run, they'd also kick off UnReview. So one way to do it is via the CI job configuration; the other would be the feature flag, which basically would insert itself when CI jobs are run for that project, to say, kick off UnReview. So it's two different ways to enable the same thing, it sounds like, if I'm understanding correctly.
C
Yeah, yeah, totally. But in short, let's go back to what we have to do. We have to train; we have the exact-data hard problem. If you do that in the back end somewhere and make the data ready for UnReview, that's going to be behind that button which I'm imagining behind the feature flag.
C
You click on that, it does something and prepares all the data for this project. But running the model, then making actual recommendations of reviewers, could be done in a CI job easily for the time being. Long term, though, anything is better in the UI. I mean: you go to your merge request and there you have... I don't know, you'd need a product designer. Let's say I go to my project and I enable UnReview.
C
Then I go to my MR and I now see the reviewers for my MR coming from machine learning. That's the ultimate experience. I think, for GitLab, Sidekiq is just a background job framework which we have available; that's why we like to dogfood it and we want to use it. But overall, long term, anything is better in the UI, I mean, or in the background job.
B
We have a situation where, for new projects, we can collect data step by step. For instance, let's say that UnReview is integrated into the core; once a user creates a project, we can collect the required data step by step with every new merge request and new commits. But we will also have some old projects, and for these projects we need to collect the past history, and that's a long-running task sometimes, right, for huge projects? Yeah, totally.
E
So one thing: do you know the people in charge of the GPU runner? Because, based on the things I'm hearing you say, something Alper said resonated: that CI pipelines are made to fail, to catch something in testing, and that's it. So this is not a machine learning workflow, and this tool, for that purpose, probably is not the best. But I know that there is this team working on it.
E
The GPU-enabled runners. I didn't know that, for example, no more than three hours would be possible for using a runner, but I would say it's standard that training a machine learning model can take more than three hours. Do we know, or would it be good to ask them, what their approach is?
E
How are they thinking about solving this problem? Because the value proposition of the GPU runner is to train machine learning models, but if there is a hard stop after three hours, it wouldn't make any sense. So I would be curious to know how they are planning to approach this, because if they really are marketing it for training machine learning models, it should be longer than three hours.
C
What I mentioned is that the current shared runners have these limits on gitlab.com. However, GPU runners are probably not those shared runners, and they could probably set any limit they want.
B
It's a very simple way, maybe not very reasonable for the customers, but anyway, it would look like an ML, an MLOps platform, yeah. We definitely can't...
A
...depend on what the ModelOps team is building for milestone two, because then milestone two won't happen until much further out. Although we do want to keep an eye on it: we don't want to be blind to what they're doing either, and do things that we'd need to rewrite without at least planning for that.
C
Yeah, totally, and your concern is right, Alexander. But overall, in short, that's the start of the journey, and I think you specifically should focus on the smaller problem. Everything raised here as problems are the problems of the industry, and if GitLab is going to be finally useful for customers to do their own ML broadly, which is not the concern of Applied ML for the time being, it's going to be really cool to solve all these problems, you know, which we...
C
I totally agree, but I am not aware of the long-term plans there. Let's say we can make UnReview work; we could keep exactly the same infrastructure you were using before GitLab, because you could give us an API and we could make it work.
C
That could make it work, but then it might not be easy to convert into a feature. So I think, on one extreme, we make UnReview work exactly as it is, which is easy, but then we don't end up with a GitLab feature. On the other extreme, we convert GitLab into a full MLOps platform with reviewer capacity, with machine learning.
C
So yeah, I'm not trying to suggest one thing, but at the end of the day, we are GitLab; we are providing the tool for other people to use, and if we don't make it reusable... Let me give you a silly example (I'm not well-versed): let's say the whole of UnReview relies on a server, and all customers who want to use it on gitlab.com or elsewhere need to go and open an account in Confluent or, let's say, GCP. That's going to be not very good long term.
B
Yeah, I see what you mean; I agree. But there are two cases. Look, the first one: for instance, we choose Airflow and Pub/Sub. Then it means that, okay, with the predefined CI template we recommend, plus Airflow and Pub/Sub on Google and Google Dataflow, we extract and prepare everything, right? So it means that when we start to work on the third milestone, we need to move data from self-hosted customers somehow to our side.
B
For instance, we can do that through Pub/Sub, right? If we choose GitLab pipelines, maybe we can train even on the customer side, even on the self-managed customer side, because in that case we don't need to move data outside of their instance.
C
Even on GitLab, you know: when we enable UnReview, going into the repository history of a customer and doing some training without the customer's knowledge, without them initiating it first themselves, is something they won't like. You might want to think about the implications of that.
B
But maybe in this case we don't need to make this model too complicated, to make it trainable on the customer's side. That's what I mean: maybe, for instance, if we choose GitLab pipelines and we do all of these things on the self-managed customer's side, that would be one model; but we could have another, more complicated model on our side, where we can spend more time to train it.
C
I did a lot of side analysis there, but what you say is, I think, true. In short, for gitlab.com we can do something different, and ideally, of course, we want to have exactly the same thing on gitlab.com and on self-managed long term. But gitlab.com can always be the forerunner, the pioneer of new things, and in the meantime self-managed evolves and people can do more there, because ModelOps or other teams do great things there, or GPU runners...
C
Then that's going to be viable for us, and we can do it that way in the future.
C
I mean, huge repos aside, I think let's go the way we described, naively, and let's see if we hit all the challenges, you know, on the data extraction. And we can, really; finally, we are GitLab the company, and we can go and change anything. Like William suggested, we can go and check the GPU runners, we can check the runner timeouts, have more runners, have a special runner with a different timeout, have Applied ML's own runners which no one else uses, stuff like that, which could be easier.
B
Yes. So do we have time to test, to try, for instance, this solution of using CI pipelines? But probably we need to improve something, I mean, something in the usual way these pipelines are used right now. What do you think?
B
No, it's like Alper suggested: if we use GitLab pipelines, it means that we can improve them. Is that right? So we...
A
We may not be able to improve them in the short term and still achieve our goals in Applied ML, so we can't count on improving them. Sure, everybody can contribute, but that doesn't mean that all changes are accepted, or that all teams agree with other teams' changes, etc. They might love it, or we might give them pause; I don't know what changes we would make.
C
What I meant is just increasing the timeout settings and having shared runners, which is not changing code, which could be easier to do.
A
If it's enabled by a feature flag, maybe, I don't know, would it be kicking off a Sidekiq job, perhaps? The Sidekiq job then goes and does the extraction, and stores the data wherever it needs to be. Then, when we actually want to run the model later and output the recommendation, we put that in the CI job, because running the model and updating the MRs with comments saying who's recommended is going to be much faster, nowhere near an hour.
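For that fast final step, a hedged Python sketch posting the recommendation as an MR comment through the GitLab REST notes API; the IDs and token are placeholders:

```python
# Sketch: post recommended reviewers as a comment (note) on an MR via
# the GitLab REST API. Project ID, MR IID, and token are placeholders.
import requests

GITLAB = "https://gitlab.com/api/v4"
TOKEN = "glpat-..."      # placeholder project access token
PROJECT_ID = 12345       # placeholder numeric project ID
MR_IID = 42              # placeholder merge request IID

reviewers = ["alice", "bob"]  # output of the (already trained) model
body = "Suggested reviewers: " + ", ".join(f"@{u}" for u in reviewers)

resp = requests.post(
    f"{GITLAB}/projects/{PROJECT_ID}/merge_requests/{MR_IID}/notes",
    headers={"PRIVATE-TOKEN": TOKEN},
    json={"body": body},
    timeout=30,
)
resp.raise_for_status()
print("comment created:", resp.json()["id"])
```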
A
You need to run the model, so how do you get the data into the model? I think we get the data into the model via a feature flag; it's Ruby code controlled by a feature flag, not the CI configuration. And customers would have to use both, right? You can't do one without the other, and if they do one without the other, it won't work. Of course, that's not a good user experience, but that's still okay for an MVC, I think.
B
So we need to do this somewhere outside of these jobs. Then, for instance, a user can add a CI template, and this CI template will generate recommendations that can be parsed later. So, I mean, the recommendation process works in CI jobs.
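A tiny sketch of the parsing half, assuming the CI job writes its recommendations to a JSON artifact; the file name and schema are invented:

```python
# Sketch: parse a recommendations artifact written by the CI template.
# File name and schema are invented for illustration.
import json
from pathlib import Path

artifact = Path("recommendations.json")
# Example of what the CI job might have written:
# {"merge_request_iid": 42, "reviewers": ["alice", "bob"]}
data = json.loads(artifact.read_text())

print(f"MR !{data['merge_request_iid']}: "
      f"suggest {', '.join(data['reviewers'])}")
```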
A
I think, if the data extraction is not done yet or the model is not ready yet, it's fine that when the CI job runs, it just says it's not running it; it just outputs that in the logs. Most MRs don't succeed on the first try anyway, you know, because automated tests don't succeed, etc.
A
So they'll most likely run it again and get a recommendation later. And actually, even if the code is mergeable on the first run of all the pipelines, that's okay: they can rerun. They can kick off the pipelines again once the model's ready; they can click the run button again and it'll kick off later, manually. I think those cases are fine.
A
It's sort of like the security scanners: if you want to rerun your security scans on unchanged code, where it ran the first time, you run it again to see if there are any new vulnerabilities based on new things the scanners find since you last ran them. You just hit the run button again to rerun your pipelines, and it'll pick up the latest configuration.
B
Oh, sorry, there is also another strategy. For instance, we have this predefined CI template where we recommend, and once this template is run, we start the extraction process. I mean, we trigger somehow, for instance, another CI pipeline, maybe, or something else, where we extract data. So there is another case like that.
C
Yeah, I think that can also work fine; I mean, we have several alternatives. In short: you go there, you check if there's model data; if no, you extract it; if yes, then you launch. The pipeline architecture is quite rich, finally: you can launch a new job, or do all of this inside one job. And then, finally, if you don't have enough data, you can say, as Vane mentioned, that at the moment the training is continuing.
C
One trouble I have to mention, while defending the back-end approach: Sidekiq jobs in GitLab have a five-minute recommended run time, so that means you have to divide the job into very small pieces. When I was calculating other stuff there, I would never expect anything to run on the back end for more than three hours.
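A hedged Python sketch of that divide-into-small-pieces pattern: process one batch, then re-enqueue a continuation rather than holding one long-running job; the queue is an in-memory stand-in for Sidekiq (which itself is Ruby):

```python
# Sketch of the "small idempotent batches" pattern for a tight per-job
# budget: process one slice, then re-enqueue a continuation. The queue
# is an in-memory stand-in for Sidekiq / Pub/Sub / etc.
from collections import deque

BATCH_SIZE = 100
queue: deque = deque()


def fetch_merge_requests(project_id: int, offset: int, limit: int) -> list:
    return []  # stand-in so the sketch runs


def store(item) -> None:
    pass  # stand-in: persist extracted features


def extract_batch(project_id: int, offset: int) -> None:
    items = fetch_merge_requests(project_id, offset, BATCH_SIZE)
    for item in items:
        store(item)
    if len(items) == BATCH_SIZE:  # more to do: continue in a fresh job
        queue.append((project_id, offset + BATCH_SIZE))


queue.append((1234, 0))  # placeholder project ID, start at offset 0
while queue:  # worker loop standing in for the job runner
    extract_batch(*queue.popleft())
```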
B
So we have three hours. But yes, that's still also a good question for me, because I would like to have some kind of A/B test, as Alper said in one of the issues, to understand: do we need to improve the current model? How can we improve the current model? Maybe we need to change direction completely. For instance, right now I thought that maybe we need to introduce some text description features to the dataset, but maybe we don't need those things at all.
B
Maybe customers, users, want something else, so it would be good to have this kind of testing. But at the same time, we don't have, let's say, a platform where we can test all these things, I mean this MLOps platform. So we can't write a model and run it in five minutes in production or in staging.
B
Yes, any machine learning model; we don't have this ability. We have two problems at the same time, and this is why we need to find some kind of trade-off, right?
C
Yeah, totally, I mean, a lot of...
C
Because, finally, we are the pioneer here in DevOps; we are trying to do applied ML, and that's why all the challenges we face are true challenges which everyone would face, and the solutions to them are going to need to be innovative. That's what I believe, I mean. So we can try multiple alternatives, like you said, too.
B
Yes, but yeah, of course, there are a lot of open-source projects that can help us, like MLflow or some others, but we still need to maintain them, we need to install them, we need to do a lot of things, so we cannot take them all and introduce them at the same time, right?
A
Yeah, lots of hard, challenging, but fun decisions to make. So yeah, thank you, thank you. I know this went a lot longer than we scheduled, but that's just fine. I know, Alexander, you've been wanting to bounce a bunch of the ideas you were thinking about off somebody who knows the GitLab product in detail, and thank you, Alper, for being that person today, so you could give us all your great advice.
C
Yeah, and sorry for not giving exact answers. As you see, there are trade-offs in every choice, and we are in a challenge; that's the fact. Absolutely.