From YouTube: How Gitaly fits into GitLab: Episode 3 – Git push
Description
A 1-hour training video for contributors new to GitLab and Gitaly.
A closer look at the final stage of git push where the
git hooks run and the refs get updated. Interaction between the git
hooks and GitLab internal API. The Git object quarantine mechanism.
Preview of Git HTTP (to be discussed next time).
Recorded 2019-03-07
A
So last time we were looking at git push over SSH and how that first hits gitlab-shell, and then from there it pre-authenticates with GitLab, and then it establishes a Gitaly connection, and then more stuff happens. I want to get to that "more stuff", because there are some surprising things going on during the final stages of a git push. Maybe I should start with a high-level overview. So what happens during a git push?
A
A
This one is maybe easier to understand. This one is less mind-boggling, because this triggers CI and creates notifications in GitLab that the actual push happened. So you see in the UI that the push happened because the post-receive hook ran. We have to tell the Rails application at some point, and when the receive-pack process is started on the server, that is not the time to send notifications into the system saying, "Hey, user X pushed something," because at that point you haven't even received the data; you don't know what they pushed. So this is mostly about notifications. Another thing that it does: you have this feature where, if you push a new branch, GitLab gives you a URL that you can click to make a new merge request. All that stuff is printed by the post-receive hook. Okay, so that's the easy one.
This is the tricky one, the pre-receive hook, because what Git does before it runs these hooks is that the data it receives, the pack file data, it puts in quarantine. It gets written to an alternate object directory that is a temporary directory. So nobody knows about it unless you know the actual path to that directory, and you can only run Git commands that look at that data if you know that quarantine directory. And specifically, that means it gets really interesting, because the pre-receive hook is where we enforce features like protected branches. How do we block a push to a protected branch?
A
Well, the pre-receive hook gets input from Git that says the user is updating branch X from commit Y to commit Z, and GitLab can then look at that and say, "Hey, branch X is a protected branch, and this user does not have the permissions to push to this branch." So we block the push. This thing can abort the whole push. But for GitLab to look at what's happening and to apply these validations, it needs to look at the Git repository.
So this thing actually runs on the Gitaly server and makes a call back into GitLab with an HTTP API call, and then GitLab makes RPC calls back into the Gitaly server to look at the Git data that is about to be committed and decide if it looks right. This is fairly complex because that Git data is in quarantine. So you need to be careful that all the RPC calls that run on the GitLab side during the pre-receive hook know where the quarantine data lives.
B
A
So yeah, this is super tricky, what's happening here. Basically, this state of the push transaction is very tricky because you're using quarantined data. And it will also be very interesting how we solve this in our HA implementation, because if you replay a push to several Gitaly servers, each of them will create its own random quarantine directory for the objects, and any lookups you do of those quarantined objects are super specific to the right Gitaly server.
A
Exactly. So that's roughly what I want to look at. And I double-checked before I started this call, so I don't say something stupid: we don't have distributed tracing yet in gitlab-shell, sorry, in the hooks, so the API calls that get made here are invisible in the tracing. We're still building out the tracing anyway; it's still a work in progress, so I can't show you this with tracing. Also, I don't know how to use the tracing, but otherwise I would.
A
So I'm going to do this the low-tech way. I want to show the API calls; that's where I want to start. First, I have a repository here and I'm going to make a new branch. I have no idea what this is about. I think this is because it's a test repository with bad data in there. Oh, that's because I actually ruined that.
A
A
Let my commit message be "hello", and I do a git push, and I expect this will get pushed to my GitLab. I am running GDK, so this is pushing via SSH to my local GitLab server, and yes, that works. So let's make that a function, so it's a little easier to repeat. Now I can just run p and it will push. Good. Now I want to show what that does, so let's tail some logs.
I want to look at the access logs of the Rails application, because I already know that that's where the API calls are being made. I'm going to truncate the log, because this is development and it is full of stuff I don't care about. First I need to go to the GitLab directory: log/development. Let's see how big that is. That is big.
A
A
A
A
Okay, well, there we have it. What is happening here is, I'm not sure why there are two calls here, but this is part of the gitlab-shell part where we interact with the SSH daemon and do pre-authentication. So this is the setup that allows gitlab-shell to establish a Gitaly connection for this particular push. And then, during the push, we make this API call and this API call from the hooks.
A
A
A
A
But I was saying that during the push there are these extra objects in quarantine, and we want to see that. Now, the nice thing about how the hooks work is that they're Ruby scripts that are executable, so I can just edit them and wave to the camera. I know that in this setup they are in the gitlab-shell repo, which is a weird artifact of history. So here's the pre-receive hook. Let's see if I can wave.
A
A
A
Now p doesn't do anything? Let's rerun the find. So here you see that all the objects are either in these directories, which start with two hexadecimal characters, which is Git's fan-out scheme to make sure that you don't end up exhausting the maximum number of directory entries, or they're in the packs. Right, here we have some loose objects and pack files and this "incoming" stuff. So that is the quarantine area I was talking about.
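A minimal sketch of the fan-out scheme described here, assuming hex object IDs: the first two hexadecimal characters name the subdirectory and the rest name the file. The function name `loose_object_path` and the example directory are hypothetical and only for illustration; this is not code from the recording.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")


def loose_object_path(objects_dir, object_id):
    """Return where a loose object with this hex ID would live under objects_dir.

    The first two hex characters become the fan-out directory, the
    remaining characters become the file name inside it.
    """
    if len(object_id) < 3 or not set(object_id) <= HEX_DIGITS:
        raise ValueError(f"not a lowercase hex object ID: {object_id!r}")
    return PurePosixPath(objects_dir) / object_id[:2] / object_id[2:]
```

With 256 possible two-character prefixes, the objects are spread over at most 256 subdirectories instead of piling up in one.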
A
How does this work? How do we even know where to look for these things? Because we're not supposed to run find and say, "Well, I found some random extra directories here, so let's assume that these objects belong to the repository." What's actually happening is that Git is telling us about it.
B
A
A
B
A
A
A
So, if you look at this, yeah, like you said, this points to a partial list of only the new objects, and this is not a complete Git repository; it's not a valid Git repository. For that, we also need to remember where the rest of the repo is, and that's why Git sets these two variables.
A
I wasn't actually aware that Git sets this one. I'm not sure what it's for, but it is sort of backing up my story that it's actually called quarantine path. And you can see it just repeats this, so maybe we can look up another time why that one is there. We only rely on these two. So we do something pretty complex, because these two things are important and we only know about them in the context of this hook.
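The transcript does not name the variables shown on screen. A hedged sketch, assuming they are the ones Git's receive-pack quarantine environment uses: `GIT_OBJECT_DIRECTORY` (the quarantine directory with the new objects), `GIT_ALTERNATE_OBJECT_DIRECTORIES` (where the rest of the repository's objects live) and `GIT_QUARANTINE_PATH` (the third one, which this sketch ignores, as the speaker does). The function name `object_directories` is hypothetical.

```python
import os
from typing import List, Mapping


def object_directories(env: Mapping[str, str]) -> List[str]:
    """Collect every object directory a hook would need to see all objects.

    Assumption: the quarantine directory is in GIT_OBJECT_DIRECTORY and the
    existing object directories are listed in
    GIT_ALTERNATE_OBJECT_DIRECTORIES, separated by the platform's path
    separator (":" on Unix).
    """
    dirs = []
    primary = env.get("GIT_OBJECT_DIRECTORY")
    if primary:
        dirs.append(primary)
    alternates = env.get("GIT_ALTERNATE_OBJECT_DIRECTORIES", "")
    dirs.extend(path for path in alternates.split(os.pathsep) if path)
    return dirs
```

Inside a hook one could call `object_directories(os.environ)`; outside of a push both variables are normally unset and the result is an empty list.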
A
A
A
A
What happened here is that gitlab-shell, so this part, the session setup part for SSH, this code lives in the gitlab-shell repo. For legacy reasons I don't want to go into, all the code that does this also lives in the gitlab-shell repo, and it shares implementation code, even though they are part of completely different things. Right, because the session setup is part of starting the git receive-pack process, and this stuff is part of the hooks, which is all the way down here.
A
Why is this in the same repo, and why is this called gitlab-shell? There is no answer to the why, but it's how things are. This leads to all sorts of... I'm breathing wrong into the microphone. This leads to all sorts of confusion, because in particular we're reusing these endpoints: either to establish a Gitaly connection after receiving an SSH session, or we're using the same endpoints when we're in the middle of a Git hook that runs on the Gitaly server.
A
Are you following? Are you still with me? So logically, these are two separate things, right? This should be calling a different API endpoint; really it should be calling this. But for legacy reasons it's calling the same endpoint, and that's because a whole bunch of code that is in gitlab-shell was just calling the same API endpoints. And that API either returns the data you need for the session setup, or it does something completely different, which is this.
A
A
Oh boy. Yes, here we are. Let me zoom in, you know, make it bigger. So this is the implementation of the "allowed" POST. First of all, the actor can either be identified by an SSH key, guess what, that's when we're up here, or it can be identified by something that points to a user ID, guess what, that is this case, because these hooks of course also run during an HTTP push, which has zero to do with SSH. But the code is mixed because legacy.
B
A
A
A
A
So in this first one there's a changes param on this HTTP POST, which has a bogus value, "_any", and that is because it is just gitlab-shell trying to establish an SSH session, and it has no idea what the changes are, because the data hasn't been sent down the line yet. And we still hit the same API endpoint, because... no reason, but we do. And then the second time around we're actually in the pre-receive hook, and here we have concrete information about a change.
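A small sketch of how the two uses of the same endpoint could be told apart by the changes param, assuming the request parameters arrive as a dictionary. The dictionary shape and the function name `is_pre_receive_call` are hypothetical; only the placeholder value `_any` comes from the recording.

```python
from typing import Mapping

# Placeholder sent during session setup, when no data has been pushed yet.
SESSION_SETUP_PLACEHOLDER = "_any"


def is_pre_receive_call(params: Mapping[str, str]) -> bool:
    """Return True when the changes param carries real change data.

    During SSH session setup the param holds the bogus placeholder; in the
    pre-receive hook it holds concrete information about the changes.
    """
    changes = params.get("changes", "")
    return bool(changes) and changes != SESSION_SETUP_PLACEHOLDER
```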
A
A
It is two different purposes, and for the pre-receive we only care about the second time we hit it. One way to recognize that is that the changes field is populated and has real data in there. Okay, how did I get here? I wanted to show you the quarantine directory. So let's see where that is. That is here.
A
A
A
A
And this whole env thing is empty because, duh, it's only relevant for the other use of the dual-use endpoint. This thing is empty. And I put the whitespace in the wrong spot; I need to put it here. Now, this is the actual Git pre-receive hook API call. Now here we have this fun information, so we have alternate object directories.
A
A
A
The hooks and later code that looks at the Git repo might run on different machines where the repos are on different mount points, so absolute paths would break. We actually had this break on us. Because Git gives you absolute paths, right? You see here, this is an absolute path, but that is not a sane thing to pass around the network between servers, so we convert them to relative paths, relative to the repo directory.
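A minimal sketch of the conversion described here, assuming POSIX-style paths. The function name `relative_object_dirs` and the example paths are hypothetical; this is not the code shown in the recording.

```python
import posixpath
from typing import Iterable, List


def relative_object_dirs(repo_path: str, absolute_dirs: Iterable[str]) -> List[str]:
    """Rewrite absolute object directories as paths relative to the repo.

    Relative paths stay meaningful on another machine where the same
    repository sits under a different mount point.
    """
    return [posixpath.relpath(d, start=repo_path) for d in absolute_dirs]
```

For example, `/repos/group/project.git/objects/incoming-abc123` becomes `objects/incoming-abc123` when the repository path is `/repos/group/project.git`.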
A
A
Yeah. Strictly speaking, the value of the alternates is also always the same, but we don't make an assumption about that. Exactly. Later on, we throw these two things together, because we don't really care which one is which; we just want to make sure we have everything, because we're not doing any writes.
C
A
Okay, so you're starting to see now, maybe, why I was saying this stuff is very subtle and good to know about if you want to know what happens during a git push. Because the other thing, maybe I can show it, I'm scrolling back here: this is one random string. Of course, during my last push, it was a different random string, right? It's a temp directory. It's different every time.
A
A
So this is using a request store. A request store is a wrapper around thread-local storage. So, regardless of whether we use a multi-threaded or a single-threaded Rails application server, this will be local to the request. And the other thing that's important, or good to know, about this stuff is that there's a middleware in the Rails stack that resets this request store after each request. So it's thread-local, and because of the middleware, it is local to the request. I think, Fran, you've seen this before. Paul?
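The same idea, sketched in Python rather than Ruby: a store backed by thread-local storage, plus a wrapper that plays the role of the middleware and resets the store around each request. The names `RequestStore` and `with_request_store` here are illustrative Python stand-ins, not the Ruby library's API.

```python
import threading


class RequestStore:
    """A per-thread dictionary, reset around every request."""

    _local = threading.local()

    @classmethod
    def store(cls):
        # Each thread lazily gets its own dictionary.
        if not hasattr(cls._local, "data"):
            cls._local.data = {}
        return cls._local.data

    @classmethod
    def clear(cls):
        cls._local.data = {}


def with_request_store(handler):
    """Wrap a request handler so the store never leaks between requests."""

    def wrapped(request):
        RequestStore.clear()
        try:
            return handler(request)
        finally:
            RequestStore.clear()

    return wrapped
```

Thread-local storage alone only isolates threads from each other; the reset in the wrapper is what makes a value local to one request when a thread serves many requests in a row.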
C
B
A
If this were Go, I guess we would be using context, and variables you would set on a context. Except it's not Go, so we have some roughly equivalent Ruby thing. Okay, so these things get stored here at the start of the handler for this API call, and later they get used. I'm going to ignore this Rugged thing, because that's legacy. They get used here in the Gitaly client, because when we create a Gitaly repository object, we need to send these over to Gitaly.
A
B
A
A
These things are relative paths. Those go into the Gitaly repository message, and then, of course, on the Gitaly side we take those relative paths and we join them with the absolute path to the repo, because that's where we know the absolute path, and then we hand absolute paths back to Git just to be safe. I think it actually can handle relative paths, but we're not counting on that. And yeah, let me show the Gitaly part, because it is still, one way or another...
A
There we go. In Gitaly, we have a thing that resolves the repository path from the repo message; the repo message was here. The repository path is resolved by looking up the storage name in Gitaly's config in memory, mapping that to an absolute path on disk, and then just joining this path to it, right? And then you have the repo path. And then we also look at these fields, and these fields, I guess we just put them in a list of strings that are a representation of environment variables.
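The steps just described can be sketched as follows, in Python rather than Gitaly's Go. Everything here is an assumption for illustration: the function name `git_env_for_repo`, the shape of the storage mapping, the parameter names, and the environment variable names (taken to be Git's standard `GIT_OBJECT_DIRECTORY` and `GIT_ALTERNATE_OBJECT_DIRECTORIES`). The sketch does not check that the relative paths stay inside the repository, which a real implementation would want to do.

```python
import posixpath
from typing import List, Mapping, Sequence, Tuple


def git_env_for_repo(
    storages: Mapping[str, str],
    storage_name: str,
    relative_path: str,
    rel_object_dir: str,
    rel_alternates: Sequence[str],
) -> Tuple[str, List[str]]:
    """Resolve the repo path and build KEY=VALUE strings for Git.

    1. Look up the storage name to get an absolute path on disk.
    2. Join the repository's relative path onto it.
    3. Join the relative object directories onto the repo path, so Git
       receives absolute paths again.
    """
    repo_path = posixpath.join(storages[storage_name], relative_path)
    env = []
    if rel_object_dir:
        env.append("GIT_OBJECT_DIRECTORY=" + posixpath.join(repo_path, rel_object_dir))
    if rel_alternates:
        joined = ":".join(posixpath.join(repo_path, d) for d in rel_alternates)
        env.append("GIT_ALTERNATE_OBJECT_DIRECTORIES=" + joined)
    return repo_path, env
```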
A
A
Manually here, maybe because... no, I have no idea why we do it manually. So that's where this thing trickles through. We had the push, and we saw that during the push Git sets these variables for us. Then we interpret that during the hook, we send it along on an API call, we store it in thread-local storage on the Rails server, and then all outbound RPC calls from the Rails server pass this value back into Gitaly, to the same Gitaly server where this directory actually exists.
A
A
A
C
A
A
A
A
A
A
...directory, and it sees, "Oh, there's a hook, it's executable, I'm going to run it and feed this stuff on standard out", sorry, on standard in. If the hook exits with zero, then the push is allowed; if the hook exits with one, the push gets denied. Okay, maybe I should demonstrate that for a moment.
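This is not the speaker's demonstration, only a minimal sketch of the contract just described: Git feeds one line per updated ref on standard input, in the form "old-id new-id ref-name", and the hook's exit status decides whether the push goes through. The function name `pre_receive`, the set of protected refs and the permission flag are hypothetical.

```python
import sys
from typing import Iterable, Set


def pre_receive(lines: Iterable[str], protected_refs: Set[str], user_may_push: bool) -> int:
    """Return the exit status a pre-receive hook would use.

    0 allows the whole push; a non-zero status (1 here) rejects it.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Each line reads: <old object ID> <new object ID> <ref name>
        old_id, new_id, ref = line.split(" ", 2)
        if ref in protected_refs and not user_may_push:
            print(f"denied: {ref} is protected", file=sys.stderr)
            return 1
    return 0
```

In an actual hook script one would pass `sys.stdin` as `lines` and hand the result to `sys.exit`. Because a single non-zero status aborts the whole push, one rejected ref blocks every ref update in that push.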
A
C
A
C
Okay, okay, that is where we interact with our rules and our... yes.
A
Yeah, exactly. So all the hooks and rules are gathered underneath that class. And I guess, Fran, you're more familiar with this class because you worked on the hooks in the wikis. Yeah. So this thing fans out into a whole lot of checks and, if you're very unlucky, it fans out into expensive checks that slow down your push, but that's a different story. Okay.
A
A
So what is Workhorse? I mentioned Workhorse before. It is a reverse proxy that sits in front of the Rails app. One way to explain it, if you don't know the concept, is: imagine we took nginx and stuffed it full of plugins, except it's not nginx, it's a custom Go app, and we wrote all the plugins in Go. I don't know if that helps. But it's a reverse proxy with lots of custom features, and it's a weird architecture thing that happened because in the beginning we used to just have Rails, and these Ruby processes have their limitations, and this was a way for us to hack things into the request cycle that we'd rather not do in Rails. Over time it got bigger and bigger, and anything that is slow, like an upload, is better done in Workhorse, because that is a Go process.
A
It's just a goroutine. So having a goroutine that takes five minutes is no problem. Having a Rails request that takes five minutes often is a problem, particularly with Unicorn, which is single-threaded. So then you're hogging a process with several hundred megabytes of memory for five minutes, so that would never be a good idea.
A
So that's what Workhorse is, and actually the original use case for Workhorse was to do Git over HTTP, because Git over HTTP can take however long it needs to take. If you have a very big repo and try to clone it, it's going to take a long time just because you need to copy a lot of data. And before we had Workhorse, we had Unicorn workers with a one-minute timeout. So if you wanted to clone a large repo, you would hit the one-minute timeout and be out of luck.
A
So Git HTTP was a completely inferior transport compared with Git SSH. That was the state a long time ago. And with Workhorse, the first thing that we solved was trying to get Git HTTP to behave better and offload that from the Unicorn process to this Workhorse process. So what does that look like? I haven't rehearsed how to approach this.
A
Let me first maybe try and explain what the transport looks like, irrespective of GitLab. So in the SSH case you establish one session and then you have bi-directional traffic across that session, because that's how SSH works. You can have bi-directional communication, and you get your data, and then the session is done. HTTP doesn't work like that; at least HTTP 1.1 and 1.0 don't work like that. You have a fixed request-response cycle, so you can only say one...
A
...the process used for the Git transport into HTTP requests. And the major version, the most common version, is called the smart Git HTTP protocol. That's because there used to be a dumb protocol that we don't care about. We don't support the dumb protocol, so we don't have to talk about that. The smart protocol emulates the stuff that happens during an SSH push, only within a request-response cycle, and the way that works...
A
A
B
A
The Git server returns a list of all references, and what I mean by that is it looks a little bit like this. It's formatted slightly differently, but it's a list of IDs and references. I guess we can actually run the... I'm not going to do that. I could run the actual command that does this, but it's a distraction. So that goes back to the client. The client looks at the objects it already has... well, no, sorry. The client first looks at this list and decides what it wants to have.
A
A
Yes, that's in the POST body, and then the POST response is a pack file with the requested objects and a ref updates description. I think this is repeated in the response; I'm actually not entirely sure. Maybe we can look at this next time. If you want to see it, if you use mitmproxy or something like that, you can actually intercept this stuff and look at what goes on on the wire, but we don't have to. So that's the cycle.
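For readers who do want a look at the wire format: in the smart HTTP protocol the reference list in the GET response is framed as pkt-lines, where each packet starts with four hexadecimal digits giving its total length and `0000` is a flush packet. The sketch below parses such a response into a mapping of reference names to object IDs. The function names are hypothetical, and the sketch assumes the original (version 0) advertisement layout, where the first reference line may carry a capability list after a NUL byte.

```python
from typing import Dict, Iterator, Optional


def encode_pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line: 4 hex digits of total length, then data."""
    return b"%04x" % (len(payload) + 4) + payload


def read_pkt_lines(data: bytes) -> Iterator[Optional[bytes]]:
    """Yield each packet's payload, or None for a flush packet (0000)."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4].decode("ascii"), 16)
        if length == 0:
            yield None
            pos += 4
        elif length < 4:
            raise ValueError(f"invalid pkt-line length {length}")
        else:
            yield data[pos + 4:pos + length]
            pos += length


def parse_ref_advertisement(data: bytes) -> Dict[str, str]:
    """Map each advertised reference name to its object ID."""
    refs = {}
    for pkt in read_pkt_lines(data):
        if pkt is None or pkt.startswith(b"#"):
            continue  # skip flush packets and the "# service=..." banner
        line = pkt.split(b"\0", 1)[0].rstrip(b"\n")  # drop capabilities
        object_id, name = line.split(b" ", 1)
        refs[name.decode()] = object_id.decode()
    return refs
```

A client compares this mapping with what it already has to decide whether a follow-up POST is needed at all.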
A
So, instead of one SSH session, you have a GET followed by a POST. The fun fact is that in production you see way more of these GET requests than POST requests. So apparently, based on GitLab.com, a lot of clients are just checking to see if there are changes, because if you do git fetch and git fetch comes back and says nothing changed, you're already up to date, that means that it did that GET request, it got the list of everything that's there, and it decided there's nothing new it wants to have. So that is the mechanics of the transport, and this is the main difference with Git SSH, because the basic idea is still the same. So what we have in Workhorse is HTTP routes that intercept these specific requests and do something special with them.
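A rough sketch of what "routes that intercept these specific requests" could look like, in Python rather than Workhorse's Go. The URL suffixes `info/refs`, `git-upload-pack` and `git-receive-pack` are the standard smart HTTP ones; the regular expressions, the handler names and the fallback name are hypothetical and not Workhorse's actual routing table.

```python
import re

# (HTTP method, path pattern, handler name); all names are illustrative.
GIT_HTTP_ROUTES = [
    ("GET", re.compile(r"^/.+\.git/info/refs$"), "info_refs"),
    ("POST", re.compile(r"^/.+\.git/git-upload-pack$"), "upload_pack"),
    ("POST", re.compile(r"^/.+\.git/git-receive-pack$"), "receive_pack"),
]


def route(method: str, path: str) -> str:
    """Pick the special Git handler, or fall through to the Rails app."""
    for route_method, pattern, handler_name in GIT_HTTP_ROUTES:
        if route_method == method and pattern.match(path):
            return handler_name
    return "proxy_to_rails"
```

Anything that does not match one of the Git routes is proxied on unchanged, which is what lets the reverse proxy treat only the slow Git transfers specially.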