From YouTube: GitLab 14.10 Kickoff - Enablement:Database
Description
Kickoff for the Database Group for the GitLab 14.10 release
Planning issue: https://gitlab.com/gitlab-org/database-team/team-tasks/-/issues/242
Database Group Past Kickoff Videos: https://youtube.com/playlist?list=PL05JrBw4t0KqP3MYrcoQHrqPUqn_jJZSN
14.8 Kickoff video with a more detailed presentation of features: https://youtu.be/eek6Sn0p5gc
Presentation by: Yannis Roussos, Sr. Product Manager, Memory and Database Groups
Hi, I'm Yannis Roussos, the product manager of the Database Group, and I'd like to take you through what we're trying to achieve in GitLab 14.10, which is scheduled to be released on April 22nd, 2022.

As a quick recap of what we are doing: the core mission of the Database Group is to build the application code, tools, and frameworks that allow every GitLab feature to interact with the database in the most reliable and performant way possible.
We also build the tools and product features that allow any GitLab team member to efficiently develop code that interacts with the database, test against production-grade data sets, and make informed, data-driven decisions before submitting any update to the GitLab project.
So let me take you through our top priorities for 14.10. The first one is batched background migrations. I have already provided a few in-depth presentations of this new framework in past kickoff videos.
Batched background migrations is a new framework that we are building from the ground up with a focus on reliability, self-monitoring, and auto-tuning. We anticipate that it will be a major step forward for the stability and availability of GitLab instances of any size, while making most background data operations complete considerably faster in most cases. There is a lot more information in the related epic.
Our work in 14.10 is driving us towards the general availability of the batched background migrations framework. We are already working with a few groups internally that use batched background migrations for very large, difficult migrations.
We even had a small incident yesterday on GitLab.com related to write-ahead log (WAL) replication. At that moment we had a batched background migration running, and because it was a batched background migration, the engineers on call were able to pause the migration, let the system recover, and then restart the migration without causing any problems and without a hiccup. So in 14.10 we are driving towards general availability very quickly.
First of all, support for multiple databases. This is something we have been working on for the whole of 14.9,
and it should be completed in a few days. Then we are going to work on a few last core updates: for example, updating our scheduling and auto-tuning algorithm to be able to figure out, for background jobs that fail, how they fail, whether they error out or they time out. If they time out, which means that the batch size may be too large, we want to be able to split them into smaller chunks and execute them as smaller jobs; a minimal sketch of that idea follows.
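This is a sketch of the timeout-based splitting just described, assuming a failed batch is simply retried as two half-sized jobs; `FailedBatch`, `split_on_timeout`, and `MIN_BATCH_SIZE` are illustrative names, not GitLab's implementation.

```ruby
require 'timeout'

MIN_BATCH_SIZE = 100

FailedBatch = Struct.new(:start_id, :end_id, :error) do
  def size
    end_id - start_id + 1
  end
end

# Errors other than timeouts are re-raised: shrinking the batch only helps
# when the batch was too large to finish in time.
def split_on_timeout(batch)
  raise batch.error unless batch.error.is_a?(Timeout::Error)
  raise 'batch already at minimum size' if batch.size <= MIN_BATCH_SIZE

  mid = batch.start_id + batch.size / 2
  [[batch.start_id, mid - 1], [mid, batch.end_id]] # two smaller jobs to enqueue
end

p split_on_timeout(FailedBatch.new(1, 10_000, Timeout::Error.new))
# => [[1, 5000], [5001, 10000]]
```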
Similarly, adding the ability to disable the auto-tuning algorithm, so that for very sensitive, very delicate background jobs whose batch size we don't want changed, we can limit them to whatever we set during scheduling. Of course, also developer documentation, which is very important for the general availability of the framework. And finally, additional optimizations to our batch optimizer, so that we can support more use cases of data operations, like partial updates, by adding custom batching strategies, etc.; a toy illustration of such a strategy follows.
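To give a feel for what a custom batching strategy could look like, here is a toy sketch that batches only the rows that actually need a partial update, so untouched regions of the table are skipped. The class and method names are invented for illustration and are not GitLab's API.

```ruby
class PartialUpdateBatchingStrategy
  def initialize(batch_size:)
    @batch_size = batch_size
  end

  # Returns the [start_id, end_id] of the next batch after `after_id`,
  # or nil once the rows needing the update are exhausted.
  def next_batch(ids_needing_update, after_id)
    batch = ids_needing_update.select { |id| id > after_id }.first(@batch_size)
    batch.empty? ? nil : [batch.first, batch.last]
  end
end

ids = (1..25).select(&:odd?) # pretend only odd ids need the partial update
strategy = PartialUpdateBatchingStrategy.new(batch_size: 5)

cursor = 0
while (batch = strategy.next_batch(ids, cursor))
  puts "update ids #{batch.first}..#{batch.last}"
  cursor = batch.last
end
```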
Our second top priority is a health check mechanism for heavy data updates, which can cause a system to start failing and having multiple issues. The idea here is to build a mechanism that will monitor the health of the production database and respond to it by lowering the rate of updates, throttling the updates, or even pausing them. This is very important for us; it is the last missing piece of the puzzle
on top of what I just discussed for batched background migrations. That's why, when we implement this mechanism, we are going to introduce it as part of the auto-tuning layer of batched background migrations. The idea there is that we are going to have real-time monitoring of the production system, with various mechanisms watching various signals that will tell us if there is a problem with the production system, and then either lower the batch sizes and the rate of updates, or even completely stop a batched background migration, wait for the system to recover, and then restart it without requiring any manual intervention. Our first step in 14.10 will be to identify those signals: what are we going to measure? Is it the rate of updates? A sketch of the overall loop follows.
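Here is a minimal sketch of the monitor-and-respond loop described above, using replication lag as the single example signal; all names and thresholds (`HealthCheck`, `LAG_SLOW_DOWN`, `LAG_PAUSE`) are assumptions for illustration, not GitLab's implementation.

```ruby
class HealthCheck
  LAG_SLOW_DOWN = 30   # seconds of replication lag at which we shrink batches
  LAG_PAUSE     = 120  # seconds of lag at which we pause entirely

  def initialize(lag_probe)
    @lag_probe = lag_probe # a callable returning the current replication lag
  end

  def advise(batch_size)
    lag = @lag_probe.call
    return [:pause, batch_size]      if lag >= LAG_PAUSE
    return [:shrink, batch_size / 2] if lag >= LAG_SLOW_DOWN

    [:proceed, batch_size]
  end
end

# Simulated lag readings: healthy, degraded, then critical.
readings = [5, 45, 150].each
check = HealthCheck.new(-> { readings.next })

batch_size = 10_000
3.times do
  action, batch_size = check.advise(batch_size)
  puts "#{action}: next batch size #{batch_size}"
end
# proceed: next batch size 10000
# shrink: next batch size 5000
# pause: next batch size 5000
```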
Our third top priority is automatic database testing using clones. This is a project that is already fully running internally in GitLab, and it is an amazing capability: we are shifting left our ability to preemptively find database-related regressions and performance issues by testing all database updates against a production clone of the GitLab.com database. So this is amazing: any update that we make right now is automatically executed and tested in a CI pipeline against a clone of our production system, and we have already completed most of the coverage.
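To make the idea concrete, here is a hypothetical sketch of the core of such a pipeline step: run each pending migration against a freshly provisioned clone and capture its wall time for the report. The migrations here are just callables standing in for real DDL and data updates.

```ruby
require 'benchmark'

migrations = {
  'add_index_on_notes' => -> { sleep 0.2 },  # stands in for index creation
  'backfill_routes'    => -> { sleep 0.05 }, # stands in for a data update
}

report = migrations.map do |name, migration|
  seconds = Benchmark.realtime { migration.call }
  { name: name, seconds: seconds.round(2) }
end

# In the real pipeline, a report like this is posted back on the MR.
report.each { |r| puts "#{r[:name]}: #{r[:seconds]}s" }
```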
We have covered regular migrations and post-deployment migrations; those are migrations that run after the code deployment in no-downtime updates. The next step, which we already started working on in 14.9 and completed the first part of, is support for background migrations.
We have already completed support for regular background migrations, our existing background migrations, and in 14.10 we are going to add support for testing batched background migrations. Then we are going to update the way we return results on the MR to the GitLab developers. And finally, there will be another step of testing that will allow us to estimate how long a batched background migration will run, because there is a problem you can run into:
you may block the auto-tuning, cap the migration with a max batch size, and by mistake set it to a very low value, let's say a thousand records per two minutes. This is very low, and it could result in the migration running for 30 days, and we don't want migrations running for 30 days. So we will add checks for that as well and inform developers that what they are doing may look safe, but it will result in an update that lasts for 30 days, and they will have to wait 30 days for it to finish. A rough version of that estimate is sketched below.
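The arithmetic behind that check is simple: one batch runs per interval, and rows divided by batch size gives the number of batches. At 1,000 records every 2 minutes, that is 720,000 records per day, so a table of about 21.6 million rows (an invented figure, chosen so the example comes out even) takes 30 days.

```ruby
# Rough runtime estimate: one batch per interval, rows / batch_size batches.
def estimated_days(rows:, batch_size:, interval_minutes:)
  batches = (rows.to_f / batch_size).ceil
  (batches * interval_minutes) / (60.0 * 24)
end

puts estimated_days(rows: 21_600_000, batch_size: 1_000, interval_minutes: 2)
# => 30.0
```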
Finally, in addition to all the other steps, we are also going to test the background migration scheduling logic.
This is a very special use case, and we have seen a few problems with it in the past, because when you're scheduling a background job, the job is created and enqueued, and you can set when it gets out of the queue.
So we have had some problems in the past with very specific parameters on background jobs that passed and could execute without issues on development or testing environments, but when the migration reached production, they failed for very specific reasons. So this is one more layer of protection for background migrations that are shipped to production.
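For context, this follows the usual Sidekiq pattern GitLab builds on: a job's arguments are serialized when it is enqueued and only used when a worker picks it up later, which is exactly the window where environment-specific parameters can go wrong. `perform_in` is real Sidekiq API; the worker and its arguments below are invented for illustration.

```ruby
require 'sidekiq'

class BackfillWorker
  include Sidekiq::Worker

  # Arguments arrive exactly as serialized at scheduling time, so they must
  # hold for production data, not just for a development database.
  def perform(start_id, end_id)
    puts "backfilling rows #{start_id}..#{end_id}"
  end
end

# Enqueued now, executed two minutes later: the parameters are frozen into
# the queue here, long before the job actually runs.
BackfillWorker.perform_in(2 * 60, 1, 10_000)
```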
Finally, our last top priority is the data dictionary. I have also discussed it in past kickoff videos. Very quickly: we have more than 400 tables, and we want to label them using some metadata, so that we can record who the owner is, that is, which group owns the table, along with additional metadata like a description or other data classification metadata.
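As a sketch of what an entry and the lookup it enables could look like (the fields and values are assumptions, not the final schema):

```ruby
# Hypothetical shape of a data dictionary entry.
DICTIONARY = {
  'ci_builds' => {
    owner_group: 'Verify::Pipeline Execution',
    description: 'Jobs executed as part of CI/CD pipelines',
    classification: 'internal',
  },
}.freeze

# During an incident, the lookup we need is table name -> owning group.
puts DICTIONARY.dig('ci_builds', :owner_group)
```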
We want to start by labeling the tables themselves. This is the most impactful change for us, because in case of an incident or a bug report it will help us very quickly figure out who the owner of the table with the domain-specific knowledge is, and address the issue as fast as possible. So we're going to work on adding support for labeling in 14.10 and add some documentation.