From YouTube: GitLab 14.2 Kickoff - Enablement:Sharding
Description
Kickoff for the 14.2 release for the Sharding team.
Planning issue: https://gitlab.com/gitlab-org/sharding-group/group-tasks/-/issues/1
A
Continuous integration tables are a significant part of our database size but, more importantly, they account for a very large portion, roughly 50 percent, of our writes. Writes are usually not particularly well scalable via read replicas; you only scale reads with those. So this is why we're doing it. It is a project that also involves many other groups inside GitLab, such as Distribution and Infrastructure, and I wanted to highlight that we have this progress.
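A minimal sketch of why read replicas help reads but not writes, and why moving CI tables to their own primary does reduce write load. The traffic numbers are hypothetical and chosen only to keep the arithmetic simple; the only figure taken from the talk is the rough 50 percent CI share of writes.

```python
def load_per_node(reads_per_s: float, writes_per_s: float, replicas: int) -> dict:
    """Model one cluster: the single primary takes every write, while
    reads are spread evenly over the read-only replicas."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return {
        "primary_writes_per_s": writes_per_s,
        "reads_per_replica_per_s": reads_per_s / replicas,
    }


# Hypothetical traffic figures (assumptions, not measurements).
READS, WRITES = 9000.0, 1000.0
CI_WRITE_SHARE = 0.5  # "roughly 50 percent of our writes"

# Adding replicas lowers the read load per replica...
two_replicas = load_per_node(READS, WRITES, replicas=2)
six_replicas = load_per_node(READS, WRITES, replicas=6)

# ...but the primary still sees every write. Only splitting the writes
# across two primaries (decomposition) changes that number.
main_writes_after = WRITES * (1 - CI_WRITE_SHARE)
ci_writes_after = WRITES * CI_WRITE_SHARE
```

In this model the primary's write load is identical with two or six replicas, while decomposition halves it.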
A
I'm going to go through a few things that the sharding group itself is going to do in 14.2, and I'll stay at a relatively high level because there's a lot of things going on. I want to give you a broad overview of where we are at. The first thing that the sharding group is going to continue to work on is supporting many databases in GitLab. As I elaborated earlier on, right now we have a single database that we need to manage, and with decomposition we will have more than one.
A
That actually needs to happen. Following that, we're working on the decomposition of CI tables, so this is the specific area of database functionality that we're interested in. What we're trying to do here, and you can see there's a lot going on in this epic, is to essentially look at our specific merge requests and try to make the tests pass when we have this in multiple databases. While doing it, we're generating quite a few interesting follow-up items, but the team is swarming on this specific area to make sure that we can get this POC merge request into a shape that allows us to ultimately merge it.
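A minimal sketch of what decomposing CI tables means for routing, assuming (for illustration only) that every table whose name starts with `ci_` moves to a separate `ci` database and everything else stays on `main`. The prefix rule, the database names, and the function names are hypothetical and are not taken from the GitLab code base.

```python
from collections import defaultdict

# Assumption for this sketch: the table-name prefix decides the database.
CI_PREFIX = "ci_"


def database_for(table: str) -> str:
    """Return the name of the database that would hold the given table."""
    return "ci" if table.startswith(CI_PREFIX) else "main"


def group_by_database(tables: list[str]) -> dict[str, list[str]]:
    """Group table names by the database they would live in."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for table in tables:
        grouped[database_for(table)].append(table)
    return dict(grouped)


# Illustrative table names only.
layout = group_by_database(["projects", "ci_builds", "users", "ci_pipelines"])
```

Any code path, including tests, that assumed all of these tables share one connection has to be revisited once the layout has more than one key.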
A
Then there's a couple of other things that I would like to highlight. We're also really interested in creating all of the necessary observability tools for multiple database servers. You can imagine: right now we have one server, and we need to monitor it.
A
We need to be aware of what is going on to be able to address any concerns that may come up. Now, with potentially more than one database cluster being available, we need to make sure that we have the right dashboards in place, that we collect the correct metrics, and that we have all of the visibility for these additional components that we require. Then there are two items here specifically to do with infrastructure. This is actually being worked on already, and we are working on it at speed.
A
For one, we need to be able to benchmark this new way of running GitLab, because we need to ensure that there are no performance regressions and that everything works as expected when we have more than one database. In order to do this, we need to create these environments to actually measure this before we roll it out into production. Tied to that is the ability to create clusters of databases in a repeatable manner, because there's more complexity behind this.
A
This is a second database cluster with a single primary and then multiple read-only replicas to establish high availability. So, rather than having this snowflake sort of architecture where we did it once, we wanted to be able to repeat this process over and over. This will become much more important when we are considering maybe decomposing further, or when we are looking at different sharding strategies.
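A minimal sketch of what "repeatable" could look like: a cluster is described as data (one primary plus some number of read-only replicas), and the same function expands any such description into a node list. The `ClusterSpec` type, the naming scheme, and the replica counts are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Declarative description of one database cluster (hypothetical)."""
    name: str
    replicas: int  # number of read-only replicas behind the single primary


def nodes(spec: ClusterSpec) -> list[str]:
    """Expand a spec into node names: one primary, then the replicas."""
    if spec.replicas < 1:
        raise ValueError("high availability needs at least one replica")
    primary = f"{spec.name}-primary"
    replicas = [f"{spec.name}-replica-{i}" for i in range(1, spec.replicas + 1)]
    return [primary] + replicas


# The same function builds the existing cluster and any new one, so a
# further decomposition is a new spec rather than a hand-built snowflake.
main_nodes = nodes(ClusterSpec(name="main", replicas=3))
ci_nodes = nodes(ClusterSpec(name="ci", replicas=2))
```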
A
So this is planned for 14.2. There are also other activities going on right now in other areas, for example, fixing security features for cross joins across this specific table.
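A minimal sketch of why cross joins become a problem: once tables live in different databases, a single SQL join cannot span them, so any query touching tables from more than one database has to be found and rewritten. The table-to-database mapping below is an illustrative assumption, and the helper names are hypothetical.

```python
def databases_touched(tables: list[str], table_to_db: dict[str, str]) -> set[str]:
    """Return the set of databases a query's tables live in.
    Raises KeyError for a table that is missing from the mapping."""
    return {table_to_db[table] for table in tables}


def is_cross_database_join(tables: list[str], table_to_db: dict[str, str]) -> bool:
    """A join is 'cross' when its tables span more than one database."""
    return len(databases_touched(tables, table_to_db)) > 1


# Illustrative mapping only (an assumption for this sketch).
TABLE_TO_DB = {
    "projects": "main",
    "users": "main",
    "ci_builds": "ci",
    "ci_pipelines": "ci",
}

same_db = is_cross_database_join(["ci_builds", "ci_pipelines"], TABLE_TO_DB)
cross_db = is_cross_database_join(["projects", "ci_builds"], TABLE_TO_DB)
```

Here `same_db` is `False` and `cross_db` is `True`; the second kind of query is the one that breaks after decomposition.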
There are also a few other identified broken features with multiple databases, and this work happens in other groups. Check out this table; it's a great overview. I'm excited about the progress that the team is making, and if you're particularly interested, the epics are the single source of truth.