Description
2022 version of a video originally made by Lee Matos.
Link to presentation: https://docs.google.com/presentation/d/10SpbXwBy5f_zQ42RJexOuquRQ7CcsFcL4dKwev6ErIc/edit?usp=sharing
A
This is a GitLab debugging techniques video: GitLab debugging techniques from a support engineering perspective. Before we get going, Justin, can you introduce yourself?
B
Sure, my name is Justin. I'm a support engineer in Auckland, New Zealand, and I've been here for coming up on two years now. As for claims to fame, the ones I could think of about myself are that I'm probably the least young GitLab support engineer on the team, and that I'm quite easily confused, so it's great to work through these debugging techniques repeatedly for me.
A
So this video is about GitLab debugging techniques. It's originally focused on, or targeted at, the Support Engineering group; those are the people we have in mind and who we're trying to make this presentation for, but it's for everyone. So we're going to use terms like "customers" and point to specific channels within Slack that we'd use as a support group, but they're open to anybody within GitLab.
A
If
you're
viewing
this
presentation
outside
of
the
get
like
organization
I'm,
sorry,
you
won't
be
able
to
view
some
of
these
internal
channels
or
discussions,
but
they're
there
for
us
at
least
I
want
to
mention
that
this
is
kind
of
version.
Two
of
a
video
that
Lee
had
made
many
years
ago
in
this
slideshow
was
an
original
video
and
original
deck
link
and
I'll.
Try
to
make
this
presentation
accessible
outside
of
the
GitHub
organization,
and
then
you
can
view
it
later.
A
The reason why I felt we needed to remake this was because the original video was showing its age. It had a lot of great information and tips on things to look out for, but a lot of those things are not really useful anymore; there have just been a lot of changes since the last one. The last presentation didn't exclude things exactly, it just didn't have a lot of the information. It was made around version 11, and we have a lot of new things now.
A
So we'll start with the common problem areas and I'll walk through them. In the original presentation Lee had this big chart with the GitLab architecture, and this is available in our docs. I always say check the docs; it's part of our troubleshooting too, just check the docs, it's how I always reference things. So in the docs is this architecture chart. In Lee's original video he'd reference this chart and point to various portions of it as he went through each section.
A
Don't
really
want
to
go
through
this
chart
again
because
it
became
massive,
it's
huge
now.
So,
if
we
zoom
in
this
little
section,
this
Puma
section
we'll
see
that
it's
just
this
tiny
piece
in
this
massive
puzzle
and
gitlab
is
just
huge
now.
So
this
I
want
to
mention
this
specifically
because
we
won't
be
able
to
walk
through
everything.
A
We won't have every single problem addressed; there are so many different facets and different pieces of GitLab that we won't be able to do everything, but we can go through some of the common ones, and many of them are still similar to the ones in the original video. But we won't be doing that architecture breakdown anymore. Waming actually has another video about the architecture that you can reference and use as part of our learning.
A
If
we're
using
this
as
a
learning
presentation,
so
we
won't
be
doing
the
architecture
breaks
anymore,
but
it's
very
important
that
you
take
a
look
at
this
documentation
and
use
this
information
when
you're
debugging
gitlab,
because,
for
example,
there's
a
certain
component,
that's
not
working
as
expected.
You
need
to.
You
can
use
this
to
check
to
see
what
what
might
be
the
cause
like.
A
What
component
is
connected
to
what
so
we'll
start
with
the
scenario,
for
example,
Puma
errors,
so
Puma
is
kind
of
the
core
app
rails.
It
is
things
that
it
controls
the
core
app
of
gitlab
they're
in
troubleshooting
this
they're
four
logs.
We
should
pay
attention
to
and
these
are
available
in
our
Docs,
but
we
almost
always
have
to
look
at
these
types
of
logs
whenever
we're
troubleshooting,
almost
anything
gitlab.
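For reference, on an Omnibus install you can follow these logs live while reproducing a problem; a minimal sketch (the service name argument is optional):

    # follow every GitLab log on the node
    sudo gitlab-ctl tail

    # or just the Rails application logs
    sudo gitlab-ctl tail gitlab-rails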
A
So
whenever
we
request,
information
from
customer
will
almost
always
ask
for
these
logs
unless
we
know
exactly
what's
going
on,
but
there's
a
little
asterisk
in
there.
So,
for
example,
we'll
see
this
Puma
memory
killer,
we'll
see
in
the
Puma
I
see
the
outlog.
This
is
a
very
common
error
and
there's
a
way
to
fix
this.
Within
the
logs,
and
generally
just
to
increase
the
memory,
if
this
happens
too
frequently
or
if
it
happens
consecutively
through
many
many
workers,
then
this
is
a
problem.
A
It
should
be
addressed
and
we
can
take
a
look
at
the
docs
to
see
how
to
increase
it.
This
is
something
that
has
popped
up
again
over
the
years.
This
is
something
that
was
a
very
very
common
a
couple
of
years
ago,
kind
of
fell
off,
because
our
memory
usage
was
much
better
but
started
popping
up
again
more
recently.
In
my
opinion,
this
is
just
something
that
kind
of
pops
up
what's
old
is
New.
Again
is
something
I
think
about
when
I
see
the
Puma
memory
killer,
errors.
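As a rough illustration of the kind of change the docs describe, the Puma worker count and per-worker memory limit live in /etc/gitlab/gitlab.rb on Omnibus installs. The setting names below are the ones I'd look up first; confirm them, and sensible values, against the documentation for the customer's version before suggesting anything:

    # /etc/gitlab/gitlab.rb -- example values only, not a recommendation
    puma['worker_processes'] = 4
    puma['per_worker_max_memory_mb'] = 1200   # threshold the memory killer enforces per worker

    # then apply the change:
    # sudo gitlab-ctl reconfigure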
A
Next, 500 errors: something failed, essentially the thing that you expected to work failed. When the Puma app was trying to retrieve some data for the app itself, it failed to retrieve it, either through timeouts, or the data didn't match what was expected in the database, or it just didn't exist. This is a common error, and it's something we can see in the logs as a 500.
A
Additionally,
when
you
see
a
500
error
in
the
UI
there's
always
a
correlation,
ID
and
I'll
talk
about
this
a
lot
a
lot
later,
especially
about
correlation
ID
and
logs
another
one
is
deadline
exceeded.
This
was
a
common
error
quite
a
few
years
ago,
and
mostly
attributed
to
giddly
being
slow
because
of
disk
problems.
Either
the
disk
performance
was
not
keeping
up
with
what
Kidley
demanded
of
it
or
giddly
was
just
requesting
too
much.
A
I
I
mean
this
essentially
the
same
thing
worded
twice,
but
nowadays
it's
more
of
a
red
herring
error.
We'll
see
this
a
lot
in
the
logs
just
move
past
this
when
you're.
If
you
see
lots
of
these
in
the
logs,
it
could
be
a
problem.
But
if
you
see
this
once
in
a
while,
I
wouldn't
rely
on
this
to
tell
you
whether
or
not
something
is
correct,
and
if
you
look
at
the
timestamps,
you
can
see
that
this
is
from
Lee's
original
presentation
from
2019.
A
So
this
is
a
very
old
error,
but
it's
kind
of
new
again
it's
popping
up,
but
look
Beyond
this
one,
because
it's
not
always
disk
performance
error
nowadays.
A
So
lee
has
this
really
great
concept
about
scared,
customers
and
Savvy
customers.
Scared
customers
are
those
who
request
a
problem
or
tell
us
about
a
problem
that
they're
experiencing
and
they
don't
always
include
the
full
problem
description.
For
example,
we
were
seeing
a
500
error
when
we
visit
a
repo
and
all
repo
is
not
very
clear.
Is
it
all
repos?
Is
it
one
repo?
Is
it
a
group?
Is
it?
Is
it
pushing
or
pulling
it's,
not
very
specific
or
users
can't
see
stuff.
A
Please
help
was
an
actual
error
we
received
just
a
few
days
ago
in
an
emergency
ticket.
So
these
are
scared,
customers
that
don't
fully
understand
the
problem
that
they're
experiencing
and
we
want
to
make
them
more
Savvy
customers.
So
what
we
do
with
an
unclear
problem
is
we
start
with
a
problem
description?
What
is
the
user
describing?
Were
they
expecting
to
see
what
and
where
is
the
error
occurring,
so
they
have
500
errors
or
503s?
What
are
they
experiencing?
We
get
a
gitlab
Osos,
which
is
a
set
of
logs.
A
This
is
a
really
great
project
and
I'm
going
to
mention
a
few
times.
It's
going
to
be
mentioned
quite
a
few
times
in
this
presentation,
because
it's
very,
very
good
output
providing
a
full
trace
and
correlation
ID.
You
can
get
the
full
Trace,
either
from
well
from
the
logs,
if
you're
getting
the
the
full
Trace,
it's
definitely
from
the
logs
correlation
ID
can
be
presented
in
the
UI
or
the
logs.
A
If they don't have this information, it just takes a long time to debug; we just don't know what it is. It's really hard to troubleshoot a problem when you don't know what the problem is. It's like when you take your car to the mechanic and it just doesn't make that sound anymore: you don't know how to fix the problem, because you don't know where the sound is coming from, and the mechanic doesn't know how to recreate it.
A
It
just
really
difficult
to
do,
and
also
we
have
to
define
a
common
language
since
there's
terms
defining
a
common
language
is
important
too,
and
I
really
want
to
mention
this
one,
because
it's
not
about
English,
it's
not
about
getting
the
same
language
like
spoken
language.
It's
more
about
terms!
If
someone
uses
a
phrase
about
their
Runners,
not
working
expected,
they
they
need
to
be
clear.
Is
this
like
a
gitlab
runner,
that's
not
running,
or
is
this
something
in
the
pipeline?
A
That's
not
running
on
the
runner,
defining
a
Common
Language
and
where
things
occur
really
matter,
and
we
need
to
give
the
customers
from
that
scared
state
to
a
Savvy
State,
and
this
is
the
way
these
are
ways
we
can
do
it.
So,
for
example,
a
good
problem
description
is,
we
are
seeing
a
500
error
when
we
visit
this
repo,
so
sometimes
they'll
have
an
image
link
to
a
500
error,
but
if
it
doesn't
include
that
correlation
ID,
it's
not
very
useful.
It's
just
that.
A
We
know
that
there's
a
500
error
happening
if
they
include
a
timestamp,
that's
better.
So
if
they
include
this
correlation
ID,
it's
good
if
they
paste
it
in
the
tickets,
even
better.
If
they
have
accompanying
logs
included
with
the
ticket
when
they
they
submit
the
ticket,
that's
that's
great
and
if
they
Pace
the
entire
stack
Trace,
that's
a
great
way
to
start
off
a
ticket
to
debug
a
problem.
We
can't
debug
a
problem
unless
we
know
what
the
problem
is
even
working
towards
that
problem.
A
We
can't
find
good
solutions
to
work
towards
that
problem
unless
we
know
where
to
look
and
if
a
scared
customer
does
isn't
able
to
tell
us
it
just
delays
everything.
So
we
need
to
make
the
customers
into
more
Savvy
customers,
so
troubleshooting
the
problem
or
debugging
the
problem.
We
have
several
ways
of
doing
it.
A
In
the
past,
it
was
always
recommended
to
do
things
like
check
a
database
or
the
rails
console
or
API
and
which
one
should
we
use
well
API,
if,
if
you
can,
for
example,
if
they're
trying
to
query
a
bunch
of
Mrs
and
they're,
just
not
displaying
on
the
page,
use
the
API
to
see
if
you
can
query
those
Mr
Mrs
during
the
normally
and
if
they
return
in
the
time
that
you
expect.
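As a quick sketch of that kind of check (the host, token, and project ID below are placeholders; the endpoint is the standard merge requests API), you can time a request for the same data the page would load:

    # placeholders: substitute the real host, a token with read_api scope, and the project ID
    time curl --silent --header "PRIVATE-TOKEN: <your_token>" \
      "https://gitlab.example.com/api/v4/projects/<project_id>/merge_requests?state=opened&per_page=20" \
      | head -c 300

If the API answers quickly with the expected records, the blank page is more likely a frontend or view problem than a problem with the data itself.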
A
So if the MRs page is just blank, or stuck showing a bit of a loading screen, and you use the API and the data shows up fine, maybe there's something else happening, maybe a UI or UX issue or something. We also want to choose between the Rails console and Postgres, and I recommend going with the one you're most comfortable with, because you don't want to make mistakes. That doesn't mean be afraid; if you're not comfortable with either of them, or even just one or the other (and sometimes you need to use both), make sure you get someone to help.
A
Sometimes
you
need
to
use
both
make
sure
that
you
get
someone
to
help
someone
that
might
understand
a
little
bit
better
or
just
run
run
it
by
someone.
It
really
depends
on
the
circumstances
and
I
I
just
want
to
make
sure
that
to
start,
if
you're
with
a
scared
customer,
you
don't
want
to
start
giving
them
a
bunch
of
commands
to
start
pasting
the
chat
and
or
start
pasting
into
their
console
and
then
just
start
running
without
them
understanding
it.
A
So,
for
example,
for
example,
rails
console
move
to
the
browse
console
because
I'm
most
comfortable
with
using
the
rails,
console
and
I
think
a
lot
of
us
in
sport.
Engineering
are
the
original
presentation
that
Lee
had
made
had
talked
about
using
sorry.
My
screen
is
jumping.
A
So
the
original
presentation
that
Lee
had
made
talked
about
the
rails
console
being
kind
of
a
dangerous
place,
but
the
rails
console
has
improved
lately
and
we
have
we've
really
gotten
it
so
that
you
can
stay
kind
of
within
rails.
You
don't
go
off
the
rails
with
the
active
record,
especially
it's
hard
to
break
things.
A
So,
whenever
you're
talking
to
a
customer
and
trying
to
work
through
a
problem,
make
sure
you
test
the
commands
ensure
that
the
customer
is
not
copying
and
pasting,
all
of
them
at
the
same
time
go
line
by
line
and
paste
them
have
someone
to
look
over
your
commands.
A
The
reason
why
I
mentioned
the
line
by
line
thing
is
because,
if
you
paste
it
all
in
one
big
glob,
sometimes
it
could
throw
could
have
formatting
errors
that
you
didn't
expect
like
Carriage
returns
in
the
wrong
spot,
or
something
is
happening
too
quickly,
like
a
a
read
command
that
is
being
posted.
We
need
to
verify
that
the
the
things
that
we're
pasting
in
the
command
are
the
things
we're
expecting
and
I.
A
So, for example, with Rails ActiveRecord, when you're looking at a model (say the model behind an MR, if you're pushing or displaying an MR and looking at how it's shown in the UI), you have sets of data that you can look at within the Rails console. If you're viewing a certain set of data in the Rails console and it just doesn't display, or it errors, that's a good way of telling where the 500 error happens.
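As a small sketch of what that looks like in practice (the project path, MR iid, and the diff_refs call below are illustrative placeholders; substitute whatever object and method the stack trace points at):

    sudo gitlab-rails console
    # inside the console:
    project = Project.find_by_full_path('group/project')   # placeholder path
    mr = project.merge_requests.find_by(iid: 123)          # placeholder iid
    mr.attributes.slice('id', 'state', 'title')            # sanity-check the record itself
    mr.diff_refs                                           # example method; if it raises, you've found the failing piece

If the record loads fine but one particular association or method raises, that usually lines up with the failing line you'll see in the stack trace in the logs.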
B
Sure. So when I started, I knew my way around Postgres reasonably well, and my way around Rails not at all, so Postgres was the one that I started off using to look at things.
B
I'd use it to look at this or that in the database. But I do totally agree with Matthew that if you can get the hang of the Rails console, for a lot of applications, for debugging and testing things, it is the way to go, because it has all the logic built in there as to what GitLab does and how the various tables are grouped together into relationships. But the Postgres console does come in handy, just for checking things.
B
Sometimes
what
is
in
a
was
in
a
table,
or
else
what
the
schema
of
a
table
is
to
make
sure
it's
got
all
the
columns
and
indexes
and
things
that
at
once
so
I
mean
you
can
make
updates
to
data
and
gitlab
through
both
the
rails
and
the
postgres
console.
But
the
postgres
console
will
allow
you
to
make
changes
in
isolation
from
other
things.
So
you
could
easily
wind
up
updating,
merge,
requests,
but
not
updating
some
other
record
that
needs
to
be
kept
in
sync
with
or
consistent
with
it.
B
So
I
do
suggest
you
don't
use
postgres
for
updating
data
deleting
inserting
is
just
for
looking
at
tables
and
schemas
unless
you
absolutely
have
to
and
when
you
do
have
to
up,
and
there
will
be
a
know
and
work
around
for
a
problem
and
it
will
have
been
tested
and
tried
by
other
people.
So
you
can
be
reasonably
confident
that
it
will
work
but
always
be
aware
of
what
you're
about
to
update.
B
If
you
do,
the
BET
make
sure
you
verify
it
for
a
select
the
same,
we're
closed
so
before
you
go
ahead
and
do
an
update
or
a
delete
or
something
like
that
to
get
into
the
database
console
from
Standalone,
Omnibus,
installation
and
or
Docker
installation,
you
can
just
run
get
web
Dash,
psql
and
it'll
connect
to
the
local
database.
B
If
your
database
is
hosted
externally
that
won't
work,
so
you
can
use
different
rails.
Db
console
minus
minus
database
main
instead.
That
will
ask
you
for
the
password
for
the
configured
good
lab
user
as
per
the
gitlab.rb
file.
So
you'll
need
to
know
what
that
is
to
get
into
the
console
and
really
Danny
took
my
head
to
that
for
using
for
using
the
console
as
the
backslash
X
option,
which
will
change
the
output
formatting
sort
of
vertically
rather
than
horizontally.
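Putting those pieces together as a short sketch (the table, column and id are placeholders; the commands are the ones just mentioned):

    # bundled database on an Omnibus or Docker install:
    sudo gitlab-psql

    # externally hosted database, via Rails (prompts for the gitlab user's password):
    sudo gitlab-rails dbconsole --database main

    -- inside psql: switch to vertical output, then verify with a SELECT
    -- using the same WHERE clause before any UPDATE or DELETE
    \x
    SELECT id, title FROM merge_requests WHERE id = 1234;   -- placeholder id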
A
Right, that's a good one. So one thing, too, that I didn't mention (I think it was on the other slide): here I wrote that migrations are direct database changes, and when you look at the migration files you will see that they are database operations, an insert into a table, or creating a new table, or something along those lines. Now, it's not Postgres as such, but migrations: if you have a failed migration, the Postgres console is a great way to troubleshoot it. So I'll move on to logs. Logs are very important.
A
I
listed
it
quite
a
few
times
in
the
presentation,
because
it's
it's
very,
very
important
now
the
slide
show
is
only
one
single
slide
about
logs,
but
it's
probably
the
most
important
thing,
because
we
really
need
the
logs
to
trigger
try
to
figure
out
what
a
problem
is
to
see
what
the
issue
is.
We
really
need
to
find
out
the
logs.
A
We
can
get
information
from
a
500
error
and
we
can
try
to
get
some
information
from
the
findings.
The
logs
are
most
important.
So
if
you
check
the
docs
I
included
a
link
in
this
presentation,
the
docs,
the
docs
check
the
docs,
because
it
includes
a
list
of
all
the
logs.
We
have
and
examples
which
are
really
great
because
you
can
see
what
an
expected
log
out
was
supposed
to
look
like
it's
the
expected
log
output
and
then
the
correlation
ID,
which
I
mentioned
several
times
so
far.
A
If
you
see
this
within
the
browser,
you
can
get
that
correlation
ID
from
the
user
and
you
can
search
the
log
support.
This
is
important
on
SAS
too,
because
when
a
SAS
user
has
an
issue
and
they
they
give
you
that
correlation
ID,
you
can
go
into
our
log
system
and
look
it
up
when
you
you
have
the
correlation.
Id
you'll,
see
it
across
multiple
Services
too.
So
you
can
see
whether
Puma
was
behaving
as
expected.
A
If
giddly
was
behaving
behaving
as
expected,
correlation
ID
is
very,
very
important
and
you
can
see
that
in
the
API
too,
if
you're,
using
that
with
the
X
request,
ID
header,
it's
it's
really
really
good.
I
use
this
a
lot
too,
when
I'm
searching
through
logs
and
I
see
a
500
error
occurred.
One
place
that
I
want
to
see
if
this
is
attributed
to
the
same
problem,
that
the
user
is
reporting
or
is
it
just
another?
500
error
that
they
happen
to
be
experiencing:
are
they
even
related?
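A small sketch of both sides of that (host, token, and log path are placeholders or Omnibus defaults; the x-request-id response header is where the correlation ID shows up on API calls):

    # grab the correlation ID for an API request (placeholder host and token)
    curl --silent --dump-header - --output /dev/null \
      --header "PRIVATE-TOKEN: <your_token>" \
      "https://gitlab.example.com/api/v4/projects" | grep -i x-request-id

    # then search the Rails logs on the instance for that ID (default Omnibus path)
    sudo grep '"correlation_id":"<the-id>"' /var/log/gitlab/gitlab-rails/production_json.log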
A
So
it's
really
really
good
to
get
this
correlation
ID
when
you
can,
when
the
user
can
provide
it
so
stack,
traces
and
back
traces
are
great
because
they
can
show
you
where
the
code
failed,
and
this
is
really
cool
working
at
gitlab,
because
we
can
just
kind
of
Link
directly
to
where
the
code
failed.
So
you
can
go
on
gitlab.com
and
then
find
the
version
of
the
user
is
using.
A
You
can
find
that
where
the
stack
Trace
fails-
and
you
can
just
point
to
this-
this
section
of
the
code
is
where
you're
having
a
problem,
and
this
is
useful
too,
because
you
can
use
that
that
section
of
code
to
find
the
model
and
go
back
to
the
rails
console
and
then
just
check
to
see
if
the
model
is
presenting
as
expected,
and
you
can
just
query
that
model
and
see
if
or
query
that
record
I
mean
and
see
if
they're
still
throwing
errors
or
try
to
repair
it
using
that
we'll
leave
repair
it
to
another
time,
but
logs
logs
are
vitally
important
to
figuring
out
the
the
problem
and
if
you
just
want
to
skip
to
a
single
log
exceptions,
Json
log
is
the
best
log
to
use
I.
A
Think
anyway,
you
can
just
kind
of
skip
to
it.
It
has
all
the
Puma
in
gitlab
dash
rails
folder.
You
can
see
the
exceptions
Json
log,
because
it'll
just
show
all
the
exceptions
and
across
application
production
and
Workhorse
logs
too
I
believe
and
I
mentioned
the
Json
logs,
because
they're
somewhat
new
compared
to
version
11.
A
They've been added to and improved more and more since the previous presentation. This is really important: if we don't find the error, you just can't find the root cause; it's just very, very hard.
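A quick sketch of skimming that log (default Omnibus path; the dotted field names follow the examples in the logging docs, but verify them for the version in question, and the correlation ID is a placeholder):

    # list the most recent exceptions with their correlation IDs
    sudo jq -r '[.time, .correlation_id, ."exception.class", ."exception.message"] | @tsv' \
      /var/log/gitlab/gitlab-rails/exceptions_json.log | tail -n 20

    # or just grep for the correlation ID the user got from the 500 page
    sudo grep '<correlation-id>' /var/log/gitlab/gitlab-rails/exceptions_json.log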
A
It's
just
very,
very
hard,
so
logs
logs
logs
logs
just
keep
asking
the
customer
for
the
logs
if
they
can't
provide
the
logs,
for
example,
they're
on
a
closed
system,
ask
them
to
look
for
the
logs
and
just
know
where
the
logs
are
at.
You
can
even
Point
them
to
the
documentation,
tell
them
where
this
the
logs
exist
and
then
that
then
ask
them
to
look
in
the
logs
for
that
type
of
error.
A
Look
for
a
specific
type
of
thing,
like
a
500
error
or
use
that
correlation
ID,
that
they
got
from
the
browser
and
then
look
for
that
in
the
logs.
You
can
ask
the
customer
to
do
this.
If
they're
I've
worked
with
customers
that
just
can't
share
any
of
the
logged
information,
they
can't
do
a
gitlab
SOS.
They
can't
share
even
just
snippet
of
the
logs,
but
if
we
ask
them
to
find
the
logs,
they
can
provide
small
outputs
that
are
very
much
redacted.
B
We've
mentioned
this
already
a
few
times,
gitlab
SOS.
So
this
is
a
really
really
important
and
helpful
tool
for
us
in
support
and
it
saves
us
and
the
customers
from
a
lot
of
time
in
the
first
instance
when
investigating
a
problem,
and
we
want
to
know
as
much
about
their
environment
as
possible
and
as
quickly
as
possible.
So
this
is
a
project
and
there's
links
to
it
in
the
handbook,
and
you
can
just
go
to
this.
You
can
run
it
either
by
cloning.
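A minimal sketch of the cloning route (the project path and script name below are what I'd expect from the support toolbox project, but check the gitlabsos README for the current URL and invocation before sending it to a customer):

    # on the GitLab node, as root or with sudo
    git clone https://gitlab.com/gitlab-com/support/toolbox/gitlabsos.git
    cd gitlabsos
    sudo ./gitlabsos.rb
    # the script writes a compressed archive for the customer to attach to the ticket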
B
It'll do a listing of CPUs (how many CPUs), memory and disk space, so you can tell if the system has run out of space, for instance, which might be causing the problem. It also collects a copy of the current log of all the different GitLab services that are on the system, as well as the OS syslog or messages.
B
So
all
sorts
of
things
in
one
place,
with
one
documented
way
of
for
the
customer
to
create
the
file
and
attach
it
to
the
ticket
and
then
you're
off
to
a
good
start
to
to
try
and
get
to
the
bottom
of
whatever.
The
issue
is.
B
One
thing
to
mention
is
that
the
latest
versions
of
good
lab
SOS,
if
you
run
it
after
cloning,
the
project,
so
the
first
sort
of
way
of
running
an
app
so
is
it
will
include
a
sanitized
copy
of
the
get
their
blood
out
in
configuration
in
the
SOS,
which
is
also
really
helpful
because
that's
sort
of
the
other
thing
we
tend
to
have
to
ask
customers
for
so
we
can
see
what
their
current
settings
are,
but
just.
B
Do
that
if
the
customer
runs
it
directly
using
curl,
so
that
that
file
may
not
be
included
and
you'll
have
to
ask
for
it
separately?
B
Yeah
I
was
just
important
just
before
we
started
here
that
one
of
our
Engineers
Kenneth
was
actually
in
the
process
of
working
on
enhancement
to
get
their
bsos.
That
will
allow
you
to
specify
a
Time
range
for
the
logs
that
you
want
to
include
in
it.
So,
as
I
said
at
the
moment,
it
will
just
include
the
most
recent,
the
current
active
log
file
for
each
service
and
on
a
busy
system.
B
Those
files
can
get
rotated
very
quickly,
so
we
do
always
recommend
the
customer
reproduces,
whatever
problem
they're
having
and
then
immediately
runs,
we've
got
lab
SOS
so
that
it
will
have
those
errors
in
the
current
log,
but
often
even
if
it's
around
15
minutes
or
half
an
hour
later,
you
might
find
that
the
time
period
involved
is
not
included.
So
that
will
be
a
a
good
enhancement
when,
when
it
gets
released,.
B
When
you
get
it
back,
it's
a
it's
a
compressed
tar
file,
so
you
extract
it
to
your
to
your
local
machine
and
it
extracts
it
into
a
hierarchical,
folders
and
log
files
and
things.
And
then
you
can
visually
inspect.
B
But we also have some other projects that people have created to help with the interpretation and parsing of those SOS files.
B
So
the
two
key
ones
are
fast
bets,
which
is
specifically
for
extracting
performance
information
from
the
get
their
blog
files
and
it'll
show
you
a
number
of
operations
that
are
different
types
of
operations
that
are
performed
how
long
they
took
where
they
spent
their
time,
whether
it
was
in
database
access
or
queuing
or
CPU,
and
that
sort
of
thing
and
the
requests
per
second
involved
of
that
operation.
So
that's
a
way
of,
especially
for
performance
issues.
When
customers
say
the
system
is
running
slowly
you
can
check
and
see.
B
Is
that
one
particular
operation
is
running
very
slowly.
Are
there
hundreds
or
thousands
of
operations
requests
per
second
being
issued
for
it
for
some
for
something
which
is
just
overwhelming
the
system
in
the
information
like
that,
you
can
get
out,
and
you
can
also
compare
particular
log
files
statistics
to
the
benchmarks
for
that
version
of
gitlab
to
see
if
it's
sort
of
behaving
as
as
expected
or
not
so,
there's
lots
of
documentation
around
how
fast
Tax
Works
it
can
produce
graphs
as
well,
showing
showing
the
metrics.
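As a rough sketch of the simplest use (assuming the fast-stats binary from the support toolbox is installed; see its README for the exact subcommands, comparison options and graph flags, which I'm not reproducing from memory here):

    # summarize a Rails production log pulled out of a GitLabSOS archive
    fast-stats production_json.log

    # the tool's own help lists the available reports and flags
    fast-stats --help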
B
So
it's
a
very
powerful
Tool,
bringing
performance
related
issues
and
then
green
hat
is
another
there's
another
one
which
is
a
sort
of
user-friendly,
take
space
interface
into
a
bunch
of
options
to
pass
the
log
files
examine
the
information.
That's
in
there
print
out
the
system,
configuration
and
metrics
and
state
in
in
the
easily
read
format
and
a
whole
bunch
of
other
ones.
B
One
thing
to
remember,
though,
with
this
OS
is
it's
very:
it's
often
the
very
first
thing
we
ask
on
ticket
and
it's
a
customer's
one
of
those
very
special
excuse.
Me.
Special
ones
actually
includes
SOS
with
the
first
of
the
first
Contact
on
the
ticket,
but
often
the
very
first
thing
we'll
go
back
to
them
and
ask
for
is:
can
you
please
send
us
a
get
letter
SOS
from
your
instance
now
for
a
large
installation,
maybe
using
a
reference
architecture?
B
They
may
have
upwards
of
30
nodes
configured
as
part
of
a
you
know:
5000
user,
get
that
reference
architecture
and
even
for
the
non-reference
architectures
they
may
have
multiple
rails
nodes,
multiple.
They
will
have
potentially
multiple
giggly
nodes,
multiple
psychic
notes.
So
when
you
ask
them,
can
I
get
a
bit
SOS,
please
you
do
have
to
be
a
bear
in
mind.
You
might
be
asking
them
to
do
this
across
a
dozen
or
more
instances
at
once.
B
So, along with GitLabSOS, which is used for our Docker and Omnibus based installations, we have KubeSOS, which is used for our Helm chart based installations. It's the same idea: it's designed to be a sort of one-line tool, a command you can run to get a file together containing a whole bunch of useful information about how your Kubernetes cluster is set up, and also collecting all the GitLab logs.
B
It's
a
project
that
you
that
you
clone
and
then
run
the
run
the
command
it
does
require
you
to
run
it
from
a
machine
that
has
a
coupe
CTL
access
to
the
cluster
that
it
can
interrogate
and
get
the
required
information,
and
you
do
have
to
tell
it
which
namespace
in
your
class
to
look
at
levels
installed,
because
for
many
of
the
commands
that
runs
the
namespace
specific.
So
we've
got
neighbors
into
your
default
namespace
and
you
don't
specify
the
default.
B
The
namespace
you'll
get
a
bunch
of
information
back
that
isn't
all
that
helpful,
someone's
key
things
it
does
include,
though
it
includes
the
currently
applied
pound
chat,
values
that
have
been
used
to
configure
the
gitlab
deployment,
and
it
also
includes
the
log
files
from
all
of
the
different
services
that
run
in
pods
within
the
classes.
So
you
have
psychic
pods
you'll
have
web
service
pods.
Definitely
pod
and
you'll
get
a
log
file
produced
from
each
of
those.
B
Now,
unlike
the
yes's,
which
collect
the
individual
log
files
for
each
service
in
the
single
nicely
formatted
file,
the
kubernetes
logs
from
a
pod
will
include
logs
from
all
the
containers
running
within
their
pod
and
you'll.
B
That
means
that,
for
instance,
for
the
web
service,
part
you'll
have
Workhorse
logs
mixed
up
with
rails
type
logs
and
the
whole
thing
there's
a
little
bit
of
a
jumble
of
logs
from
different
services,
and
if
you
do
want
to
apply
those
logs
to
something
like
Fast
debts,
you
will
need
to
do
some
selective
gripping
of
the
lines
that
are
relevant
from
from
those
files
to
get
them
into
a
format.
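A rough sketch of that filtering step (the file name is a placeholder, and the field I grep for is just one that appears in Rails production JSON lines; adjust it to whatever reliably distinguishes the lines you want):

    # keep only the Rails production JSON entries from a mixed webservice pod log,
    # then feed the result to fast-stats
    grep '"controller":' webservice-pod.log > production_json_only.log
    fast-stats production_json_only.log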
B
Their
past
steps
can
can
run
against,
and
the
other
really
useful
thing
in
the
file
is
the
events
logs
from
the
cluster
software
with
kubernetes.
B
It
may
be
evicting
pods
that
are
because
the
memory
in
the
environment
is
too
too
load
and
often
you'll
get
information
about
those
from
the
actual
cluster
event
logs,
and
you
can
go
back
to
the
customer
and
say
well
actually
this
you
need
to
increase
the
memory
you
have
for
your
nodes
are
running
too
much
too
many
points
in
there,
you're
evicting
them
and
as
per
the
service,
it
is
best
to
run
this
as
soon
as
possible,
after
reproducing
whatever
the
problem
is
because
the
kubernetes
is
likes.
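If you need to spot-check the same things by hand, the namespace and pod name below are placeholders and these are standard kubectl commands rather than anything GitLab specific:

    # recent cluster events: evictions, OOM kills, failed scheduling
    kubectl get events -n gitlab --sort-by=.lastTimestamp

    # pod health: restarts and Evicted/CrashLoopBackOff statuses
    kubectl get pods -n gitlab

    # logs from every container in one pod, last hour
    kubectl logs -n gitlab <webservice-pod-name> --all-containers --since=1h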
B
Okay,
so
this
section
is
just
about
the
different
kinds
of
deployments
and
ways
you
can
deploy
good
lab
and
I
guess.
This
has
changed
a
lot
over
the
years
as
as
more
and
more
options
are
developed
and
made
available.
B
So
Geo
troubleshooting
is
sort
of
a
thing
unto
itself
in
a
way,
because
it's
not
a
Geo
is
our
multi-site
deployment
method,
which
primarily
is
about
bringing
copies
of
repositories
and
database
information
into
different
geographical
locations
that
are
closer
to
the
end
users
to
make
things
like
cloning,
and
you
know,
pushing
repos
and
things
faster
for
people,
because
they're
they're,
if
they're
another
part
of
the
world
and
they're
trying
to
access
the
lab
server
someone
far
away,
then
it
will
take
a
lot
longer.
B
But
the
other
thing
that
Geo
provides
is
a
disaster
recovery
mechanism
whereby
you
can
have
your
primary
site,
your
replicated
secondary
site
that
it's
been
used
by
people
in
that
region
and
then,
if
something
happens
to
the
primary
you
can
switch
over
to
the
secondary
and
not
have
any
loss
of
data
or
much
in
the
way
of
a
downtime.
B
So
it's
quite
a
lot
of
moving
Parts
involved
in
how
that
replication
between
the
primary
and
the
secondary
is
performed
for
all
the
different
kinds
of
objects
in
the
lab
environment.
So
you
have
your
database,
which
is
being
replicated
by
a
postgres.
B
You
have
your
repositories
which
are
being
replicated,
and
then
you
have
all
your
different
kinds
of
objects
like
uploads
or
that's
their
Snippets
and
and
things
like
that,
which
need
to
be
transferred
from
one
site
to
the
other
or
you
know
as
soon
as
possible
after
they've
changed.
B
Now, I'll just mention one tip in the troubleshooting for Geo, which is to reset the secondary site, which performs a full resync of all the data from the primary to the secondary. I think that's an option that has certainly been used to fix problems.
B
Think
because
Geo
is
such
a
Dynamic,
yes,
it's
being
updated
all
the
time
bugs
have
been
fixed
and
new
features
are
being
deployed
that
possibly
there
are
a
lot
more
cases
with
it
that
is
required,
what's
required
in
the
past,
as
a
as
the
last
resort
to
to
get
things
working
again,
I'd
say
that
these
days,
there's
possibly
less
less
necessary
to
go
there
and
also,
if
you
are
going
to
suggest
it,
just
bear
in
mind
that
it's
sort
of
a
you
know
nuclear
option
in
terms
of
you're
going
to
knock
out
the
Dr
side
and
it
may
take
hours
or
days
to
get
the
sinking
back
in
sync
again.
B
So
it's
not
something
the
customer
may
be
there,
keen
on
doing
so,
just
a
bit
sensitive
around
that
when
you're
suggesting
it
and
explore
other
options.
First
and
all
sorts
of
things
can
come
into
play
when
troubleshooting
Geo
issues,
apart
from
problems
with
gitlab
itself,
there's
a
lot
of
performance
aspects
that
can
cause
things
to
get
out
of
sync
and
backlogs
to
develop.
So
you
really
have
to
be
looking
at
the
Italy
prefix,
Network
and
database
performance
at
both
sites.
B
Potentially,
if
there's
a
problem
with
replication,
just
not
happening
happening
as
quickly
as
it
as
a
customer
wants
or
two
or
as
it
should
be
yeah,
so
reference
architectures.
So
there's
a
lot
of
work
again
in
this
area
has
happened
in
recent
years,
so
we
have
our
reference
architectures
that
we
recommend
to
customers
as
a
reliable
tested
and
benchmarked
way
to
provision
a
gitlab
environment
for
a
particular
user
cap
based
on
certain
assumptions
about
what
typical
users
do.
B
So
we
have
reference
architectures
going
from
500
or
1000
users
up
to
50
000
users,
and
you
can
see
all
this
fix
for
those
in
terms
of
machine
types
and
numbers
and
architectures
and
the
documentation.
B
The
reference
architectures
provide
High
availability
now
just
put
a
star
next
to
that.
Just
to
remind
me
to
mention
that
there
is
some
caveats
around
that
that
are
mentioned
in
the
docs
and
one
of
those
things
is
around
prefect
database.
B
It's
not
it's
not
h
a
and
the
reference
architectures
unless
you
post
it
externally
on
a
on
a
database
database
platform
that
that
is
highly
available,
but
otherwise,
nearly
as
far
as
I
know.
All
the
other
feature
parts
are
good.
There
can
be
provisioned
in
a
distributed
way
to
make
them
highly
available.
B
So if you're troubleshooting an issue, you might be tempted to say to a customer: oh, you know, you're having performance problems, you should deploy a reference architecture, have a look at the 3,000-user one. And that one says you need to deploy 28 or 31 nodes or something, and I had a customer who was rightfully upset about that suggestion, because they just didn't have the workload that's associated with a 3,000-user system, but they wanted high availability in their environment.
B
So
you
can
reduce
that
down,
but
the
risk
you
take.
There
is
just
that
the
performance
won't
be
as
good
as
it
is
guaranteed
to
be
by
the
reference
architectures.
B
Sort
of
stateless
parts
of
gitlab,
and
then
we
use
on
the
bus
deployments
of
and
Prospect
to
store
a
home
repository
data
and
that
also
leverages
object.
Storage
to
store
the
other
information
which
external
and
sometimes
when
it
cut
again
when
it
comes
down
to
Performance
and
get
LED
having
more
nodes
can
be
available
alternative
to
just
having
a
single
larger
nodes
and
it
might
even
cost
customers
less.
B
So
we
do
have
to
work
through
what
the
customer's
actual
environment
and
requirements
are
and
if
they're
having
performance
issues,
then
these
are
all
different
options
that
can
be
suggested
and
hopefully
link
to
in
our
doc.
So
the
customer
can
do
their
own
research
and
decide
which
ones
are
most
appropriate.
Events.
B
Oh
so
yeah
so
probably
mentioned
some
of
this
already
one.
B
So
one
thing
to
do
with
with
large
environments
and
and
things
like
reference
architectures
is
to
don't
just
assume
there
might
be
a
single
instance
good
lab
so
remember
to
find
out
before
you
start
suggesting
things
to
do
for
their
to
address
their
issue.
You
can
find
this
out
by
asking
them.
B
You
can
also
have
a
look
at
prior
tickets,
because
often
they
will
have
had
the
same
questions
asked
for
them
in
the
past
by
other
support
Engineers
on
other
tickets
and
animations
there,
and
some
of
our
customers
actually
have
architecture
issues
linked
to
from
the
help
desk.
So
you
can
look
at
those
and
see
a
hopefully
recent
architecture,
diagram
and
other
information
about
them,
as
I
mentioned
before,
be
selective
when
requesting
your
services
in
large
environments,
because
that
can
be
a
lot
I
guess.
So
this
is
so.
B
If
you
only
need
to
see
sidekick
logs,
then
just
request
your
services
from
Psychic
nodes,
and
you
can
even
reduce
that
down
to
ask
for
just
the
log
files
themselves
if
you're
reasonably
searching
about
what
it
is.
You
want
to
want
to
check
and
be
aware
that
they
may
be.
The
environment
may
be
using
external
external
postgres
and
object
storage
or
what
they
may
be
using
those
Services
as
they
are
deployed,
I
get
led.
B
The
South
has
per
the
reference
architectures
options
to
say
a
whole
bunch
of
possibilities
there
and
one
thing
to
bear
in
mind.
This
applies
to
to
any
gitlab
installation
in
the
cloud.
Not
just
reference
architectures,
but
for
performance
issues.
Again,
do
be
aware
of
the
potential
for
a
mismatch
between
instance,
types
and
storage
Types
on
there
Cloud
compute
instances.
B
So
troubleshooting,
Cloud
native
or
which
is
how
we
refer
to
kubernetes
deployments
of
gitlab
we've
talked
about
kubi,
so
this
and
I'm
just
seeing
if
there's
anything
much
there
I've
been
mentioned.
B
Can
rotate
quickly
be
aware
of
that?
We
have
two
files
that
get
included
in
the
Cooper.
So
it's
that's
it
sometimes,
because
sometimes
it
seems
that
information.
Isn't
there
I'm
not
here,
to
go
back
and
ask
the
customer
directly
for
it,
but
that's
extremely
helpful,
especially
the
user
supplied
values
to
see
exactly
what
configuration
values
have
been
applied,
make
sure
your
objective
event
logs
as
well
in
case
the
issue
is
not
collab
at
all,
but
it's
being
imposed
on
it
by
the
cluster
itself,
due
to
Resource
limitations
or
other
areas.
B
Network
errors,
DNA
series
that
sort
of
thing
an
ability
to
pull
down
images
from
external
places
that
and
other
things
external
to
get
there
and
speaking
for
myself
chat
values
for
kubernetes
deployments
can
be
confusing.
B
I,
often
struggle
to
know
exactly
where
they
should
be
specified
and
and
what
you
know
what
sub
subheadings
should
be
associated
with,
and
so
I
make
good
use
of
a
test
test
cluster
that
I
have
to
try
things
out
and
make
sure
I'm
not
going
to
tell
the
customer
something
and
it
is
actually
incorrect
or
misformatted
foreign.
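Two cheap checks I lean on when a test cluster isn't handy (release name, namespace and values file are placeholders; helm get values shows what the customer actually has applied, and helm template renders the chart locally, which catches malformed YAML and some misplaced keys, though a real cluster remains the better test):

    # what user-supplied values are currently applied to the release
    helm get values gitlab -n gitlab
    helm get values gitlab -n gitlab --all    # include computed defaults too

    # render the chart locally against proposed values without touching any cluster
    helm repo add gitlab https://charts.gitlab.io/
    helm repo update
    helm template gitlab gitlab/gitlab -f proposed-values.yaml > /dev/null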
B
.Com
or
SAS
troubleshooting,
so
a
lot
of
troubleshooting
of
issues
reported
by
SAS
customers
is
similar
to
what
we
do
for
customers
who
manage
their
own
good
lab
environments.
B
But
some
of
the
processes
are
also
different
and
the
one
key
thing
we'll
keep
things
that
I
might
have
to
remind
myself
constantly
to
remember
to
do
this
when
I
am
dealing
with
the
ticket,
especially
if
it's
for
something
that
sounds
like
it.
B
You
know
a
key
part
of
gitlab.com
is
not
working
properly
and
it's
something
that
isn't
specific
to
anything.
A
particular
customer
has
has
configured
is
to
check
whether
it's
a
known
issue
already,
because
there's
so
many
custom
people
using
github.com
chances
are
anything
major
will
have
already
been
reported
and
logged
by
someone
else.
B
You
can
save
yourself
a
lot
of
time
by
just
checking
in
the
slack
with
a
Incident
Management
Channel,
whether
the
incident
has
been
declared
relating
to
a
particular
problem.
You
can
check
our
state
of
stock
atlib.com
page
for
similar
information
and
there's
also
issue
tracker
called
reliability,
engineering
team
that
records
slower
priority
issues
or
longer
running
knowing
issues,
and
the
other
key
thing
is
to
check
for
recent
recent
similar
tickets
and
that
can
save
a
lot
of
time.
When
it's
been
I've
been
out.
You
know
now
trying
to
figure
something
out.
B
If you do need to go hunting down a particular problem, we can't just log on and look at the logs, and we can't get an SOS run, so we have tools to let us do those things instead. Kibana is the tool for searching all the log files from all the different gitlab.com instances and components, against an Elasticsearch backend.
B
You
do
need
to
remember
when
you
go
into
that
to
choose
which
log
Source
you're
interested
in
whether
it's
get
to
Lee
or
sidekick
or
collect
rails
and,
as
Matthew
mentioned
earlier,
having
the
correlation
ID.
What's
your
customity
from
the
error
page
and
bitlab,
when
it
appears,
it's
really
helpful
to
crack
things
down,
and
there
is
a
correlation
dashboard
available
in
Cabana
that
it's
huge
again,
the
correlation,
ID
internet
searches
across
multiple
log
sources
that
you
need
messages
with
that
correlation
ID
in
them,
which
can
be
a
great
time.
B
Saver,
bear
in
mind,
there's
a
seven
day
retention
of
those
logs,
so
you
need
to
for
an
issue
the
customers
reporting
happen
more
than
seven
days
ago.
You
need
to
get
it
reproduced
to
try
and
hunt
it
down.
B
Century
is
the
other
talk,
so
Century
will
actually
collect
similar
errors
into
issues
and
that's
a
good
tool
for
seeing
as
a
particular
type
of
error
happening
a
lot
over
a
given
time
period
across
lots
of
customs
and,
if
you
do
think,
you've
identified
an
issue
that
hasn't
been
reported
before
then.
There's
processes
in
the
handbook
for
using
Century
to
create
an
issue
for
the
site,
reliability
teams
to
look
into
and
see
if
the
action
needs
to
be
taken
to
fix
that.
A
Anything
Stevens,
so
the
presentation,
hopefully
you've
learned
something.
This
is
what
we
normally
do
in
day
to
day
at
gitlab
to
troubleshoot
issues
and
troubleshoot
problems
for
customers
and
ourselves
caleb.com,
so
I'm
gonna
close
it
here
and
stop
the
recording,
take
care.