Description
Q&A session between Alberto Ramos (Engineering Manager, Reliability) and Matt Allen (Sr. Technical Recruiter) discussing DBRE roles at GitLab. For more information, please visit our Job Family page: https://about.gitlab.com/job-families/engineering/database-reliability-engineer
A
Hey, I'm Matt Allen, one of the senior recruiters here at GitLab. I support our product and infrastructure teams, and today I'm visiting with Alberto Ramos, who's one of our engineering managers on our infrastructure team, specifically for reliability engineering. Today we're going to talk a little bit about the database reliability engineering role that we recently opened up for his team here at GitLab. So, Alberto, if you could maybe start off by telling us a little bit more about yourself.
B
Totally, thank you, Matt. So I'm an SRE manager. I've been looking after cloud systems, and pretty much security systems at scale in large environments, for many years already. Now I'm getting deeper into SRE and DBRE, particularly here at GitLab, where I have a bigger opportunity to develop that position further.
B
For the last 10 to 15 years I've been working in the software industry, for, let's say, Hewlett Packard back in the day, when I was working in Spain and in Southeast Asia, Australia, New Zealand and so on. Then I moved over to Ireland. I live in Ireland now, and I had the opportunity to join
B
Companies like Zendesk, after that Microsoft, and now I'm very happily working for GitLab. We've been in Ireland, me and my family, for the last seven years. I have a beautiful wife and two great kids, and we are intending to stay in Ireland for the rest of our careers, maybe another 15 or 20 years, if possible. That sounds great at the moment.
B
Again, talking about my career, I'm just focusing on SRE as much as possible and enjoying this position at GitLab. On the personal side of things, I love exercising and I love outdoor activities: running, climbing, working out, cycling, swimming. And I definitely love traveling. I travel with my family as much as possible. Finally, on the more elevated side of things, I love reading about cultures and politics in the world, and that's something that occupies my days.
A
All right, very good. And you're still fairly new here at GitLab. You've been here, what, about three months now?
B
Yeah, three and a half months, and I'm still finishing my landing in the company, but I'm already quite happy with what I see and comfortable with the flow of things and so on.
B
Sure. So we are a pretty solid and mature reliability team, an SRE team following the SRE tenets as much as possible. Everybody knows the profession quite well, as Google defined it 15 to 20 years ago, and we're pretty close to that definition. Most of our folks are software developers, and they have a knack for production systems.
B
They have a passion for managing end-to-end production systems, so they make a very good reliability team, and I'm quite happy to be here. I can learn from them, and they're all pushing to evolve and to do better. The challenges are massive but still manageable, and I think that we're doing a pretty good job. I'm quite happy with the team so far.
B
Sure. So we are a fully remote company, and we were created as a remote company seven or eight years ago, so that changes things quite a lot. It's not that we've changed recently. I mean, we've always been remote, and that makes us more...
B
Our workflows are more asynchronous, and there's a massive push towards efficiency, so we try to get the max out of our time. There aren't a lot of distractions or interruptions from colleagues that you stumble across in your workplace or similar. It's an environment where everybody wants to get the max out of their time and also enjoy their personal life and their free time as much as possible.
B
So that's the main difference that I've found so far compared with some other, more traditional companies in the sector.
A
Sure. And obviously, yeah, remote is kind of built into the foundation of everything that we do here at GitLab, but it's not always sunshine and rainbows being a remote worker. What are some things that you have found challenging so far in terms of working this type of role remotely?
B
Cool. So for a manager like myself, there's initially a lot of effort that you need to put into connecting with your team, establishing a rapport with them, getting to like them, and getting to be able to work with them very closely. But eventually that happens and starts to work. Now, the whole bonding in the team happens slower than in some other environments.
B
In some other environments you have physical contact, and that definitely makes things flow faster. In that regard, here it takes a little more intentional investment, and also a little more time, to become a team. We need to be a little more patient in crystallizing as a team than at some other companies. I'll tell you that, in general, bonding and crystallizing as a team are the only challenges that I've seen so far.
A
B
I mean, 50 percent of the time you're looking after production systems and also trying to improve how production is working, based on what comes out of incidents and out of purely observing production environments. The second part of the job is about running projects that are going to transform how we do operations in production and automate those operations.
B
So it means that a typical day would be about being on call, getting all these pages, reacting to the pages, and trying to mitigate incidents that are starting or evolving as soon as possible. Then, later on, trying to find root causes for these incidents and, of course, corrective actions that are going to address some of the shortcomings that took us to that incident.
B
All of that takes some investigation: going through monitoring systems, going through production systems and automation, and also documenting all of the findings and then taking action on the points that we find lacking. Then the other part of your day would be around project work.
B
So if you are assigned to a project, you'll be coding, you'll be configuring systems, and you'll be investigating incidents that some others experienced or caused, when they're reaching out to you for help. So the project work side of the picture takes, as I said, mostly coding, reviewing PRs and MRs for different peers and so on, and definitely documenting and configuring systems.
A
B
Oh, so the answer to the question is pretty similar to the one that you would have for some of the companies that are not doing SRE. When you're looking after the production systems, you're still working very closely with the developers, with the support organization, and also with QA.
B
All in all, the developers are our brothers and sisters, and we're working with them all the time. They're developers, and here in SRE and DBRE at GitLab we're also developers, which means that we're exchanging ideas all the time. We're reaching out to them to understand how the code and the applications are working, and then providing advice and collaborating, working together in investigations and in different projects.
B
So the collaboration in this case is very fluid, given that, at the end of the day, the profession is the same. It's software development for both teams, SRE and the dev organization. With support, the interaction is more on the incident management side of the picture.
B
So when you're on call, or you're driving an incident resolution or trying to mitigate as soon as possible, you're collaborating with them: engaging with our customers, trying to understand whether our customers are causing some type of abusive behavior or traffic, and then talking to the customers, validating what they're seeing and what they're not seeing, whether they have any problems connecting to our system, to our front ends, etc. And also getting them to help us with our communication.
B
So anytime we engage our customers for communications, the support organization is driving all of that, and we synchronize with them very closely to make sure that the most transparent and meaningful information about how the incident is going gets across to our customers. And finally, with the QA team,
B
It's all about collaborating with them in staging environments, making sure that their suites of tests are as complete and elaborate as possible, and working side by side on some of the projects that require changes to production.
B
Well, actually, most of our projects require changes to production, so they're going to be running, again and again, the suites of tests that validate that the change we've made in production is not breaking things and is working as expected. So yeah, I'd say that these three teams would be the main ones.
A
Yeah, definitely. One thing that I've noticed is that a lot of the collaboration here at GitLab happens within the product itself. So how does your team dogfood our own product here? How do they use GitLab in their day to day?
B
Yeah, that's a good question. Dogfooding is at the core of the company. It's one of the basic foundations, or values, of the company, so you cannot get around it. I mean, you need to dogfood, whether you like it or not.
B
In our case, we enjoy it quite a lot. There are a few areas where we might have some challenges dogfooding, but in general I would say 80 percent of our work happens easily while dogfooding our application. We're talking about managing workflows with our issue management and our boards management for all of the product management side of things, and we are, of course, managing all of our code in our code repositories.
B
Also, we interact with our code via pipelines, and we deploy to production using our pipelines in GitLab. All of our documentation system runs on GitLab: issue and epic documentation, and also our handbook, which is kind of the manual that we follow as employees in the company. All of that runs on GitLab.
B
So I'd say a big chunk of our day runs on GitLab, and perhaps the only area where we're still not quite there for dogfooding is the incident management area. On that front we sometimes need a more specific tool, so we are using a combination of some basics of GitLab plus some other tools out there, like PagerDuty. And what else do we have there? We have Google Drive and a few other systems.
A
All right, very cool. Now, what are some of the technical skills that are most important when you're looking at DBRE candidates, for them to ultimately find success in this role?
B
So our core database is Postgres. We're using a combination of some other cloud databases, but that's the main one that runs our production GitLab.com system, and it's pretty large: it's composed of eight massive nodes. The more experience our candidates and our DBREs have playing with these environments, breaking these environments, automating these environments, the better. What else? It's definitely very important to have a lot of experience automating operations around the database, in this case Postgres. And by automating,
B
We're talking about coding, pure coding: not just scripting, but also proper coding, which in our case happens sometimes in Go or in Ruby, because we're mostly a Ruby shop and not Python. So, yeah, any sort of language works for us. What else? I think that having a bit of a sixth sense, or a knack, for system troubleshooting is required.
B
You need to be pretty good at troubleshooting: understanding what is broken without spending three days investigating it, following your intuition, and being able to very quickly follow a few signals and hints to understand what could be at the core of an incident, and then being able to troubleshoot and mitigate it as soon as possible.
B
And finally, I think that it's important to have well-rounded knowledge of software-as-a-service systems architecture, because you're not going to be touching only the database. You're going to be touching some other systems that are super important for GitLab.com.
B
Like storage systems, and also front-end systems, networking, caching, CDNs. All of it, all of our proxies, are super critical to our traffic and to GitLab.com. So having a broad understanding of how these systems work, and some in-depth knowledge of the particular areas that are related to the database, is super important.
A
B
Cool. So we used to call them soft skills back in the day, but now more and more we call them core skills, because in a remote company like GitLab it's super important that the soft skills are ingrained within your values and how you work. You need to be as good a communicator as you are a technician, and sometimes [unclear], as a peer or a colleague, right?
B
So some of the core skills that we highlight are definitely transparency, being open and honest, and also being not shy, a little bit more extroverted than maybe the very typical IT person who would be locked in a corner working on his or her own. Be warm and forthcoming with other team members, so that you foster collaboration and communication with them.
B
Everything that has to do with communication, communicating clearly, is also very important. Be intentional about how you communicate, either verbally or in writing. I mean, be very sure of why you're using certain words, so that the right message gets across to the right person and puts the right ideas in their mind.
A
B
Gitaly, definitely. Yeah, that's the other massive system that we're looking after in our team. It's the Git protocol backend system, and it manages the repo storage, their availability, the replication and balancing, and all the features that make it possible to serve, via the Git protocol,
B
All the information that is stored in a repo. So there's a big push now towards defining how these repos and replicas are configured, how the replication parameters are set, how many replicas and shards you have in every one of these Gitaly clusters, and also how all of it relates to the different contracts that customers have, the different tiers: you could be Gold, you could be Silver, you could be Bronze and so forth. And then how
B
All of that gives you more or less resiliency for your data and your repos. So it's a very sophisticated system. We still have a lot to evolve there, and we work very closely with the developers on the application front of the house, while we work on the infrastructure side of the house, to evolve Gitaly and take it to the next level.
A
Very interesting. All right, I know we're running up on time, so I wanted to save the fun question for last. If someone was to ask, "Why would I want to join the reliability engineering team here at GitLab?", what would you tell them?
B
Cool, I guess that's the million-dollar question, right? So I'd say: if you love SRE and you love these two systems at scale in the cloud, if you love software development and the systems around it, and you're curious about how to work efficiently and asynchronously in a remote-born company. Also, if you're interested in being able to work from any point in the world, at any moment.
A
All right, well, hey, I really appreciate your time, Alberto. It was great to chat with you today, and thanks for sharing all your information with us here about our reliability engineering team.