From YouTube: GitLab 15.3 Kickoff - Enablement:Memory
Description
Kickoff for the Memory Group for the GitLab 15.3 release
Planning issue: https://gitlab.com/gitlab-org/memory-team/team-tasks/-/issues/118
Memory Group Past Kickoff Videos: https://youtube.com/playlist?list=PL05JrBw4t0Kq1HDOIfQ8ov6lfyJkWK2Yr
Presentation by: Yannis Roussos, Sr. Product Manager, Memory and Database Groups
So our top priorities are more or less the same as the ones we had for the past few milestones, but we have seen traction on multiple fronts, so I have multiple new, exciting things that we are going to discuss. The first top priority is investigating Puma's long-term memory use. Puma is our web server, and we have observed that the memory of the service kept increasing during periods when we did not have any deployments.
The reason for that is that we cannot continue with such a deep dive by only using, for example, Prometheus metrics or the logs that we store in Kibana. We have to run diagnostic reports like, for example, jemalloc stats, Ruby heap dumps, and so on.
A problem there is that, in order to do so, we don't have access to the production servers, so every time we have to ask an SRE to help us. So we want to build a way to generate and collect those diagnostic reports. In the first iteration, we will focus on producing those reports on the running instance.
We will worry about collection in a later iteration, so in this first iteration the collection of the reports will still require an SRE to help us fetch and analyze them. The interesting part here is that we have to assume that most of those reports and diagnostic tools incur a significant cost and may interfere with serving user traffic, so they may interfere with the availability of our nodes, which is the most important thing. So we have to make sure that these reports are generated and collected in a way that minimizes the impact on node availability.
Our second effort on this front is identifying the usefulness of the diagnostic reports, especially the ones that require SSH access to the nodes. So, for example, generating various different reports, like the Ruby heap dumps, the process memory maps, and so on, and then running a few sessions during 15.3 to determine which ones are the ones we really need in order to be able to do our analysis and our investigations.
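As a point of reference, the Ruby heap dumps mentioned here can be produced with the standard objspace library; the sketch below is illustrative only, and the output path and trigger mechanism are assumptions rather than GitLab's actual diagnostic tooling:

```ruby
# Minimal sketch of producing a Ruby heap dump from a running process.
# The path and trigger are illustrative, not GitLab's diagnostic tooling.
require "objspace"

# Optionally record allocation sites so the dump includes file/line info.
ObjectSpace.trace_object_allocations_start

def write_heap_dump(path = "/tmp/heap-#{Process.pid}.json")
  File.open(path, "w") do |file|
    # Writes one JSON document per live object on the Ruby heap.
    ObjectSpace.dump_all(output: file)
  end
  path
end

write_heap_dump
```

Dumps like this are exactly the kind of report that is expensive to produce, which is why generating them without impacting node availability matters.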
The next one, which is related to the previous ones as well, and to memory usage in general, is tuning the jemalloc settings for GitLab.com. jemalloc is a variant of the malloc library; it is the way our code allocates memory. The problem here is that we have never managed to tune jemalloc for GitLab, for GitLab.com.
But now, first of all, we have the jemalloc stats and, secondly, by the time we have finished this effort, we will also be able to collect them. So we are planning to use those jemalloc stats to fine-tune the settings of jemalloc on GitLab.com.
The thing there is that, because it is not tuned, it can result in ever-growing memory usage for GitLab.com.
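For illustration, jemalloc is normally tuned through the MALLOC_CONF environment variable that the allocator reads at process start; the option names below are real jemalloc knobs, but the values are placeholders rather than the settings GitLab will end up choosing:

```ruby
# Illustrative sketch only: jemalloc tuning options are passed via MALLOC_CONF.
# Option names are real jemalloc settings; the values are placeholders.
TUNING = {
  "background_thread" => "true", # purge unused pages on background threads
  "narenas"           => "2",    # limit the number of allocation arenas
  "dirty_decay_ms"    => "1000", # return dirty pages to the OS sooner
  "stats_print"       => "true"  # print allocator statistics when the process exits
}.map { |key, value| "#{key}:#{value}" }.join(",")

# The variable has to be present in the environment that launches Puma/Sidekiq;
# setting it from inside an already-running Ruby process is too late.
puts %(MALLOC_CONF="#{TUNING}" bundle exec puma -C config/puma.rb)
```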
The next one, an effort that has come out of the investigation into Puma's long-term memory usage, is considering replacing the Puma Worker Killer. Initially, we figured out that the worker killer did not work correctly; that is why the memory kept increasing. So, the main task of the Puma Worker Killer is this:
It is a process that monitors the memory that the other processes are using and, when it passes a certain threshold, it kills them, it reaps them. It also occasionally reaps workers based on a timer so that they are refreshed. The problem there is that, first of all, it uses RSS, which is not a good measure of real memory use.
So our approach here, our idea, is to add a new memory watchdog for Puma. The idea is that, instead of having a static memory limit, we use heap utilization instead: how efficiently the workers are using the heap, are using the memory. Our thinking is that high memory use is not a bad thing as long as the memory is used efficiently.
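As a rough illustration of what heap utilization could mean here (a sketch of the general idea, not the actual watchdog logic, and with an arbitrary threshold), the Ruby GC exposes counters from which a live-slots-to-available-slots ratio can be derived:

```ruby
# Sketch: derive a "heap utilization" ratio from Ruby GC statistics.
# The formula and the 0.5 threshold are illustrative, not GitLab heuristics.
def heap_utilization
  stats = GC.stat
  # Fraction of allocated heap slots currently occupied by live objects.
  stats[:heap_live_slots].to_f / stats[:heap_available_slots]
end

# High memory use with high utilization means the worker is using what it asked
# for; low utilization points at fragmentation or bloat worth reaping.
warn "low heap utilization: #{heap_utilization.round(2)}" if heap_utilization < 0.5
```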
Going on to our next top-priority effort: optimizing the Sidekiq workers that consume a lot of memory and can cause out-of-memory kills. There is a theme here. This is work we have been doing for a few months; in the past, we figured out that there are workers that use too much memory. For example, some of those may use more than a gigabyte of memory, or even go up to six gigabytes of memory.
Such a worker may not even have the time to log, to add a log entry saying "I am being killed because I used too much memory." Or there are other issues: for example, on a node where you have multiple workers running, one worker consumes too much memory and then the whole server, the whole node, is killed, and you see six worker kills and you are not sure which one was the one causing the problem.
That is something we can dive into, but it causes a lot of noise for us. So the idea here is: can we use different, other ways to monitor those problems? One idea is to use the Sidekiq memory killer. This is similar to the Puma Worker Killer. Currently, it does not run on all types of nodes on GitLab.com.
It runs only on a few types of workers, and the idea here is to enable it everywhere and, instead of allowing it to kill the processes, to use it as an early warning: to log that there is a problem before the Linux out-of-memory killer gets in and kills the process, so that we have more details, more data. There are two things here.
So maybe we don't even need to use the Sidekiq memory killer on that front, so we will first work on this. But the second thing that we also want to do before moving forward is to identify why we see the Sidekiq memory killer not being triggered before pods are killed by the Linux out-of-memory killer. We want to identify that and understand what happens with our memory allocation and what other triggers can cause pods to be killed.
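To make the early-warning idea concrete, here is a minimal sketch of the mechanism (this is not the actual Sidekiq memory killer; the threshold, interval, and log message are made up for illustration):

```ruby
# Sketch of an early-warning monitor: when RSS crosses a threshold it only
# logs, so the event is recorded before the Linux OOM killer steps in.
RSS_WARN_BYTES = 2 * 1024**3 # 2 GiB, arbitrary for this sketch
CHECK_INTERVAL = 15          # seconds between checks

def current_rss_bytes
  # VmRSS is reported in kB in /proc/<pid>/status (Linux only).
  File.read("/proc/#{Process.pid}/status")[/^VmRSS:\s+(\d+)\s+kB/, 1].to_i * 1024
end

Thread.new do
  loop do
    rss = current_rss_bytes
    warn "memory early warning: RSS=#{rss / 1024 / 1024} MiB" if rss > RSS_WARN_BYTES
    sleep CHECK_INTERVAL
  end
end
```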
The second way of addressing this is to use the GitLab Sidekiq reliable fetcher feature. This is an add-on on Sidekiq that allows us to track the interruption count, to track how many times workers have been interrupted, like, for example, when the Linux out-of-memory killer gets in, interrupts the process, and kills the pod. So we will investigate this as well.
Finally, our last top priority is creating the custom SLIs for Global Search. This is to increase the visibility into how our Global Search works. We are deep into the implementation and expect to finish it by the end of 15.3, which will allow the Global Search team to have much better visibility into the different types of searches, basic or advanced, and into the different types of scopes, with separate metrics.
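For readers unfamiliar with custom SLIs, here is a generic sketch of the pattern using the prometheus-client gem (this is not GitLab's SLI framework; the metric names, labels, and success condition are made up for illustration): count total and successful searches per type and scope, and derive the indicator as success over total.

```ruby
# Generic SLI sketch with prometheus-client: a total counter and a success
# counter per search type and scope. Names and labels are illustrative only.
require "prometheus/client"

registry = Prometheus::Client.registry

search_total = registry.counter(
  :global_search_sli_total,
  docstring: "Total global search requests",
  labels: [:search_type, :scope]
)
search_success = registry.counter(
  :global_search_sli_success_total,
  docstring: "Global search requests that met the target condition",
  labels: [:search_type, :scope]
)

# Recording one observation:
labels = { search_type: "advanced", scope: "issues" }
search_total.increment(labels: labels)
search_success.increment(labels: labels) # only when the request met the target

# The SLI itself is computed at query time, for example in PromQL:
#   sum(rate(global_search_sli_success_total[5m]))
#     / sum(rate(global_search_sli_total[5m]))
```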
That's it. Thank you for watching, and talk to you next month.