From YouTube: Applied ML weekly team meeting June 17, 2021
B
Listen, for the people who haven't attended previously and came today, maybe a super quick introduction from Kai and Sean would be great.
A
Sure. So I'm Sean Carroll, the engineering manager for Source Code, and I'm very interested in this project and what's going on here. Just quickly about myself: I'm not by any means an expert in machine learning, but I did work in a lab at a Swiss university for nearly four years supporting machine learning projects, mostly from an infrastructure perspective.
C
Cool, and I'm Kai, the product manager for Code Review. I'm here because we're going to be your first implementation point at some point in the future, so we thought we'd come say hi. We probably won't come back for at least a little while, a couple of milestones maybe.
B
Yeah, sure, thanks guys. We usually don't screen share in meetings, but I think it's warranted here. Item number one is just to go through the issue board. Issue boards in GitLab are not obvious until you've worked with them, at least in our usage of them and our specific labels, so I'll share my screen for maybe two or three minutes.
B
In our product development flow, the labels we're currently using include things like Scheduling. You can open and close these lists, so I can open and close them. Scheduling means we haven't started work on it, but we will in the future; it needs to be scheduled.
B
Blocked means it's blocked on something: it could be blocked on another issue or something else, or, like this one, blocked on time. For example, I'm not officially transitioning to this team for another week or so, so I'm just blocked based on time. Then there's Ready for Development, which means we are ready to start developing on it but haven't, and In Development, which means we're actively developing. There are others like In Review, In Production, etc.
B
So the ones that Alexander and I want to be working on now, with others joining in as interested, are: determining the plan for the open source components that don't exist in GitLab infrastructure, which we've been collaborating on async in the issue, which is great; and publishing the plan in a handbook page based on that, which I'll do based on everyone's feedback. I'll create that initial handbook page, tag anybody interested to review it as we merge it, and then iteratively improve it. And then creating the infrastructure request.
B
So we can get that in for the infrastructure team for the proof of concept. Going back a bit, the big change to the codebase is replacing ADF with Google Dataflow or similar for the proof of concept. Alexander, once you start working on that, just change the label from Ready for Development to In Dev and it'll show up here; that will be kind of the next task to go from there. I think we're closing in on the In Dev ones as we finalize, for now,
B
at least for this iteration, the architectural plan: document it and then create the infrastructure request. Alexander and I, and now the others, have been collaborating on that. So does the issue board make sense? Any questions about issue boards in general, or about these plans here?
B
Great, I'll stop sharing here. Alex, you had two comments on the current issues. Do you want to verbalize them?
D
Second comment: just a note that I have updated the descriptions of how UnReview uses the different components, so you can find it there. I have attached the link.
B
I
review
that
that
looks
great
I'd,
encourage
others
as
well
as
we
update
the
integration
plan.
D
So
I
have
some
other
questions.
So
can
we
start
working
on
the
handbook
page?
So
we
can
open
a
merge
request.
We
can
collaborate
there
because
I
found
that
we
can
start
working
because
we
have
the
first.
We
know
the
picture
before
we
know
the
picture.
We
know
the
future
picture,
so
we
can
start
describing
these
things.
B
Absolutely
I'll
work
on
the
first
draft,
I'm
going
to
do
handbook
pages
because
then
you
can
spend
your
time
out
on
handing
pages
as
much
and
on
actually
changing
the
review
code
base
is.
D
That,
okay
to
you,
it's
like
I'm
waiting
for
my
laptop.
So
still
I
have
time
to
update
the
this
page.
B
Yeah, when is the laptop supposedly going to arrive?
D
B
Okay, good job. You also...?
D
Number
three
yep,
so
we
found
how
to
replace
hive
and
azure
data
factory,
so
that's
cool,
but
we
also,
you
know
we
have
to
decide
how
to
deploy
kafka.
D
I
have
attached
the
comment,
so
I
found
three
options:
how
we
can
do
that,
but
still
I'm
thinking
that,
like
we
focus
right
now,
we
focus
more
on
on
the
book
right.
So
we
don't
need
to
create
the
high
availability
cluster
or
something
like
that.
B
Sounds
great
and
then
anyone
interested
can
subscribe
can
watch
the
issue
and
comment
on
it.
We
can
continue
working
on
it.
That's
great.
It
helps.
D
Number
four
right:
I
have
a
lot
of
questions,
so
unreview
requires
a
scheduler
to
automatically
start
the
extract
stage
on
the
training
stage,
because
I.
B
D
B
F
Yeah, wouldn't once a week be too much? I think we generate a lot of data in a week, so I imagine training the model daily, or at least ingesting the data daily, would put a lot less pressure on the components involved, especially the database. A database outage can come even from a read; maybe not for a long time, but we can get a statement timeout, so we need retries. We need to put...
A
B
D
But, you know, maybe that's not necessary for the first milestone, because we're trying to produce the PoC, right? So maybe we can move that out; for example, if we finish everything for the first milestone in one month and we have one more month, maybe we can do it then.
F
Is
there
a
significant
difference
in
how
often
you
inject
the
data
with
regards
to
training
the
model?
What
do
you
mean?
I
mean
if
you
want
to
like,
if
you
want
to
run
the
data
ingestion
once
a
week,
is
there
going
to
be
a
different?
It's
going?
Is
there
going
to
be
a
big
difference
between
ingesting
it
daily
and
once
per
week
once
per
week?
F
If you want to fetch data from a week before, then you're going to have one huge query with a ton of results; if you fetch data per day, then you're going to have lots of smaller queries. I imagine the database people would be far happier with small queries rather than one huge query. That's basically my main point here.
D
I mean, for instance, we can extract data from two years ago, then from one year ago, and then for the current year, like that.
F
B
F
And one week ago? I still think that's going to be too much and you're going to get a statement timeout. So we're looking at preferably getting one day of data per query, or maybe per hour; the actual interval between queries needs to be adjusted. But if we try to get a week's worth of data in one query, then we're going to get a database timeout, and someone's not going to be happy if we do that far too often.
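[The chunked-extraction idea being discussed could be sketched roughly as below. This is only an illustration, not the actual pipeline: `daily_windows`, `extract_range`, and the `fetch_window` callback are hypothetical names standing in for whatever query the real extraction job runs.]

```python
from datetime import date, timedelta

def daily_windows(start, end):
    """Split [start, end) into one-day windows so each extraction
    query stays small, instead of one huge week-long query."""
    day = start
    while day < end:
        yield day, day + timedelta(days=1)
        day += timedelta(days=1)

def extract_range(start, end, fetch_window, retries=3):
    """Run fetch_window(window_start, window_end) for each day,
    retrying a few times on a statement timeout instead of
    failing the whole range."""
    results = []
    for w_start, w_end in daily_windows(start, end):
        for attempt in range(retries):
            try:
                results.extend(fetch_window(w_start, w_end))
                break
            except TimeoutError:
                if attempt == retries - 1:
                    raise
    return results
```

[A week-long range then becomes seven small queries rather than one large one, which is the pressure argument being made here; the window size, per day or per hour, would still need tuning as discussed.]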
F
B
The proof of concept is definitely a proof of concept; it doesn't need to be production-ready. We definitely need to not cause too high a load on the database, nor have it fail because the queries time out. However, I think automated scheduling to pull the data and process it is probably not needed for the proof of concept; we can babysit it manually.
B
Maybe if the data pull doesn't work, for example, or things like that. But we don't need to decide that now; I'm okay with manual scheduling for the proof of concept.
F
B
D
So can we start a new issue to discuss thoughts on how UnReview might be used by self-hosted customers? I feel that we need to discuss these things, because every time we propose something, other team members say something like: okay, we cannot use this or that technology, because we need to support self-hosted customers. So maybe we need to start discussing these things to understand.
D
Maybe we need to introduce several strategies: for example, one strategy for gitlab.com and another for self-hosted customers, because we need to understand the pressure we can produce while training models and extracting data.
B
You know, it's a great question, Alex. The concern is that we have too many things going on at once and we need to make progress on some of them. However, some of these decisions we make may not be two-way-door decisions, meaning once we go through the door it's hard to go back through it. So the current philosophy is that we believe most self-hosted customers...
B
This can change, but they wouldn't want to run this themselves, due to not having the processing power to do so, or not wanting to dedicate it. So instead what we do is leverage something that doesn't exist yet: an ability for self-hosted customers, if they choose to, where we grab the data within the customer's instance, send the data to our cloud, process it, compute, run the models, and then send the resulting recommendations back. So we can depend on cloud-based resources.
B
C
Yeah, I think so. There's also still a thought that this could potentially be tied to usage ping, to incentivize people to turn that on, and because we already have a mechanism there for sending data back to GitLab. I'm not convinced that's the right choice here; I think talking to some customers would help decide that. So I think we can start having some of these exploratory conversations.
C
B
Can we make the assumption that we can address that later? We may need to do some refactoring if we decide to support self-hosted in a different way, but not making that assumption makes it harder for us to progress on the PoC and the gitlab.com, the GitLab-hosted one. So I'd...
A
B
C
The other thing I would say is that, as we run this on our own codebase, we'll get a better sense of what the actual compute requirements are. If it's not crazy, there is very much a possibility that we may just offer this as a specialized runner or something. So I think we still just need a little bit more information about how this is going to run, and what the performance and cost profiles look like, just running within our own data set.
D
Yeah, thanks. Right now the performance is not, I mean, crazy; it doesn't require a lot of resources, because the model is not so difficult. But as soon as we start to improve the model, we'll produce much more pressure.
D
Sorry, right now we can use a GPU to train the model, because it supports GPU.
D
F
E
Sure. Wayne, I just mentioned this; I don't know if exactly the same strategy would apply here, but it might be worth checking out how we did it for Sourcegraph. It's different, because it's a third-party software application, but we do allow customers that have air-gapped instances to set up their own instance of Sourcegraph, and GitLab then connects to their local instance of Sourcegraph instead of relying on sourcegraph.com.
E
C
To mention, there are three other models for running additional infrastructure: Sourcegraph, Gitpod, and Elasticsearch. All three of them require separate infrastructure on self-managed if you want to run them, and have different configurations. So there are three places, or groups, you can talk to. I worked on all of them, but I'm happy to let you go talk to the PMs who work on them.
B
Still, I think we actually happened to cover your comment already, but did we cover it fully, or not yet?
A
Yeah, sorry, just as I was typing it, I think it was actually answered on both parts: do we need a GPU, and what do we do with self-hosted. It sounds like we have that answered.
E
Sure. So, yeah, hi everyone, just wanted to say hi from the Code Review group. Excited to see the work of this group, excited for the news. I understand that right now we're at an implementation stage, setting up the groundwork and everything, so it might be a little bit too soon, but I just wanted to make ourselves available if you need to know anything about Code Review, anything down the line. I'm the frontend engineering manager for the group.
E
B
I appreciate it. Actually, your and Kai's and Sean's groups have already given us some great ideas. We're not planning to integrate with the GitLab product at all until milestone two, which on our current schedule wouldn't be going live for four months, and we're not going to start working on that for two months.
B
We're working on the proof of concept now, which doesn't change any code in the GitLab product itself, and then milestone one. But glad to see you here; of course you're always welcome, and we'd love to have your feedback on things. We won't be asking you to do anything other than give us feedback on our plans for probably about two months.
E
That's fine, I'll be hanging out on the weekly whenever I can, but if you ever need anything, feel free to ping me asynchronously and I will help. Cool, thanks.
B
Okay, we're about out of scheduled time. Anything else we should discuss?