From YouTube: GitLab 15.3 Kickoff - Enablement:Database
Description
Kickoff for the Database Group for the GitLab 15.3 release
Planning issue: https://gitlab.com/gitlab-org/database-team/team-tasks/-/issues/259
Database Group Past Kickoff Videos: https://youtube.com/playlist?list=PL05JrBw4t0KqP3MYrcoQHrqPUqn_jJZSN
Presentation by: Yannis Roussos, Sr. Product Manager, Memory and Database Groups
Hi, I'm Yannis Roussos, the product manager of the Database group, and I'd like to take you through what we're planning for GitLab 15.3, scheduled to be released on the 22nd of August 2022. There is not a lot of new work on our side; we keep working on our top priorities. While we are pretty low on capacity, our top priority continues to be batched background migrations, and we are almost done.
The most important piece is the effort to improve batch handling. This is important for two reasons. The first is that we want to cover all cases and make batched background migrations feature complete, so that we are able to run all the types of background migrations we want to execute. The second is that batching is the core task performed by batched background migrations: those asynchronous jobs that run in the background and update data.
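To make the batching idea concrete, here is a minimal sketch of such an asynchronous batched update, assuming a psycopg2 connection and a hypothetical events table with an old_payload/new_payload column pair. GitLab's real framework lives in its Rails codebase, so this only illustrates the technique:

```python
import time
import psycopg2

BATCH_SIZE = 1_000      # rows scanned per batch (illustrative)
SUB_BATCH_SIZE = 100    # rows updated per transaction (illustrative)
PAUSE_SECONDS = 0.1     # throttle between sub-batches

def backfill(conn, start_id, max_id):
    """Walk the table in primary-key ranges and update one small
    sub-batch per transaction, sleeping between sub-batches so the
    migration never monopolizes the database."""
    for batch_start in range(start_id, max_id + 1, BATCH_SIZE):
        batch_end = min(batch_start + BATCH_SIZE - 1, max_id)
        for sub_start in range(batch_start, batch_end + 1, SUB_BATCH_SIZE):
            sub_end = min(sub_start + SUB_BATCH_SIZE - 1, batch_end)
            with conn, conn.cursor() as cur:
                # Example data operation: copy one column into another.
                cur.execute(
                    """
                    UPDATE events
                       SET new_payload = old_payload
                     WHERE id BETWEEN %s AND %s
                       AND new_payload IS NULL
                    """,
                    (sub_start, sub_end),
                )
            time.sleep(PAUSE_SECONDS)
```

The small per-transaction sub-batch keeps row locks short-lived, and the sleep between sub-batches is what later lets a runner throttle or pause the whole job in response to database health signals.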
So, that they can use batched background migrations: what's the problem here? We have a lot of helpers and libraries, for example for doing database operations and partitioning; helpers that rename columns, move columns, copy data from one column to another, change the data type of a column, and many more. All those database operations that also perform data operations are not using batched background migrations right now. So the idea, now that the batched background migrations framework has matured and we really trust it, is to rewrite all our database libraries to use batched background migrations, which will make all our operations way more efficient and also way better monitored.
The final one is improving the self-managed experience. This is important for us: whatever we can do to make running a GitLab instance as seamless as possible, and also to help GitLab instance administrators address issues during updates, etc.
So this is our work on monitoring production database clusters and responding in real time to changing conditions: by throttling our background jobs and the rate at which we update data, or even stopping and pausing those background jobs when we see that there is a big problem on a production system, so that we don't affect it any more.
We have two initiatives that we were working on throughout 15.1 and 15.2, and they are ready to be released. The first action that we are going to release is pausing a migration while autovacuum is running on a table. Autovacuum is the most important maintenance operation that a PostgreSQL server runs.
We don't want to run data migrations at the same time as a vacuum runs, so the idea is: check whether autovacuum is running, pause any data migrations on that table, wait for it to finish, and then restart. This is completed; we have merged the code.
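As a rough illustration of such a check (my own sketch, not GitLab's implementation), PostgreSQL's pg_stat_progress_vacuum view reports running vacuums, including autovacuum workers, so a migration runner can poll it and wait:

```python
import time
import psycopg2

def autovacuum_active(conn, table_name: str) -> bool:
    """Return True if a (auto)vacuum is currently running on table_name.

    pg_stat_progress_vacuum has one row per running VACUUM, including
    those started by the autovacuum daemon."""
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT 1
              FROM pg_stat_progress_vacuum
             WHERE relid = %s::regclass
            """,
            (table_name,),
        )
        return cur.fetchone() is not None

# A migration runner could pause while the check returns True:
#   while autovacuum_active(conn, "events"):
#       time.sleep(60)
```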
The second one: we want to pause the migration and give the system and the archival process some time to archive all the remaining WAL segments. This is also almost done; it is in review. We expect to release it, test and validate it in 15.3, and then release it to self-managed instances.
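A hedged sketch of estimating the archiver backlog: compare the WAL file currently being written with the last file pg_stat_archiver reports as archived. The segment arithmetic below assumes the default 16 MB WAL segment size and a single timeline; the actual check GitLab ships may differ:

```python
import psycopg2

WAL_SEGMENT_SIZE = 16 * 1024 * 1024            # PostgreSQL default
SEGMENTS_PER_LOG = 0x100000000 // WAL_SEGMENT_SIZE  # 256 for 16 MB

def _segment_number(wal_file_name: str) -> int:
    # WAL file name layout: TTTTTTTT XXXXXXXX YYYYYYYY
    # (timeline, log number, segment within log), all hex.
    log = int(wal_file_name[8:16], 16)
    seg = int(wal_file_name[16:24], 16)
    return log * SEGMENTS_PER_LOG + seg

def pending_wal_segments(conn) -> int:
    """Approximate archiver backlog in WAL segments."""
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT pg_walfile_name(pg_current_wal_lsn()),
                   last_archived_wal
              FROM pg_stat_archiver
            """
        )
        current_wal, last_archived = cur.fetchone()
    if last_archived is None:
        return SEGMENTS_PER_LOG  # nothing archived yet: report a big backlog
    return _segment_number(current_wal) - _segment_number(last_archived)

# A migration runner might pause while pending_wal_segments(conn) > 32.
```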
On new initiatives: we also want to add code that pauses migrations when the Patroni apdex drops below an SLO. Patroni runs our database servers, and we have an apdex that checks their health: how well are things going on our database servers?
A
This
is
the
main
monitoring
signal
we
have
that
things
are
not
going
well
and
it's
a
very
early
lead
indicator.
So
the
moment
we
see
that
patronio
objects
going
down
so,
for
example,
in
a
gitlab.com
below
99.99.
we want to immediately stop everything, because this leading indicator means something is happening with the production system. Let the production system recover, and then restart the migrations.
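For illustration only, a stop-the-world check along these lines could poll a Prometheus-style endpoint; the endpoint URL and metric name below are hypothetical, not GitLab's actual monitoring wiring:

```python
import requests

PROMETHEUS_URL = "http://prometheus.example.com"  # hypothetical endpoint
APDEX_QUERY = "avg(patroni_apdex_ratio)"          # hypothetical metric name
SLO = 0.9999

def patroni_apdex_ok() -> bool:
    """Return False when the (hypothetical) Patroni apdex drops below
    the SLO, signalling that background migrations should pause."""
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query",
        params={"query": APDEX_QUERY},
        timeout=5,
    )
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    if not result:
        return False  # no data is itself a bad signal: pause
    apdex = float(result[0]["value"][1])
    return apdex >= SLO
```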
Finally, we also want to throttle migrations when the WAL rate exceeds a threshold. If the rate of generating WAL segments exceeds the threshold, that is also an indicator that there are too many writes. There is something happening there; it's not about an incident, but some other process wants to do a lot of writes.
A
Let's
pause
the
background,
the
jobs
that
execute
the
synchronous
updates,
let
them
finish
they'll,
have
a
better
a
higher
priority
and
then
continue.
That's
it
on
this
effort.
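As a sketch of measuring the WAL generation rate (the threshold and sampling interval are illustrative assumptions), one can sample pg_current_wal_lsn() twice and convert the difference to bytes per second with pg_wal_lsn_diff:

```python
import time
import psycopg2

WAL_RATE_THRESHOLD = 50 * 1024 * 1024  # bytes/second, illustrative

def wal_bytes_per_second(conn, interval: float = 5.0) -> float:
    """Sample the current WAL insert position twice and return the
    rate at which the server is generating WAL, in bytes per second."""
    def current_lsn(cur):
        cur.execute("SELECT pg_current_wal_lsn()")
        return cur.fetchone()[0]

    with conn.cursor() as cur:
        first = current_lsn(cur)
        time.sleep(interval)
        second = current_lsn(cur)
        cur.execute(
            "SELECT pg_wal_lsn_diff(%s::pg_lsn, %s::pg_lsn)",
            (second, first),
        )
        return float(cur.fetchone()[0]) / interval

# A runner could throttle while wal_bytes_per_second(conn) exceeds
# WAL_RATE_THRESHOLD, and resume once the rate drops back down.
```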
Finally, we have this issue that we're going to finish in 15.3, about a bug in the database load balancer: when we mix writes and an idle-in-transaction timeout fires between the writes, the transaction of course rolls back on PostgreSQL.
But there is an edge case scenario where the load balancer decides to retry the transaction starting from the middle, and this is, of course, a big problem for the consistency of the stored data. We are almost done: we have a theory about why this is happening and we have a fix. We want to release it, monitor, and check whether the fix we have implemented solves the problem.
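To illustrate why retrying from the middle is dangerous, here is a generic sketch of the safe pattern, not the load balancer's actual code: on failure, the retry must re-run the whole transaction from its first statement:

```python
import psycopg2

def transfer(conn, src, dst, amount, max_retries=3):
    """Safe retry: re-run the ENTIRE transaction on failure.

    Retrying only the statement that failed (i.e. resuming from the
    middle) would re-apply the later writes without the earlier ones,
    corrupting the stored data."""
    for attempt in range(max_retries):
        try:
            with conn, conn.cursor() as cur:  # one atomic transaction
                cur.execute(
                    "UPDATE accounts SET balance = balance - %s WHERE id = %s",
                    (amount, src),
                )
                cur.execute(
                    "UPDATE accounts SET balance = balance + %s WHERE id = %s",
                    (amount, dst),
                )
            return  # committed: both writes applied together
        except psycopg2.OperationalError:
            conn.rollback()  # discard everything, then start over
    raise RuntimeError("transfer failed after retries")
```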
Finally, an internal one: we want to add one additional node to our internal testing database cluster, for testing against the new decomposed database. At the moment, we have both databases on the same database server; this is a matter of allowing us to scale our testing capabilities.