From YouTube: Gitaly Training - Architectural Overview
C
I don't think I can bring myself to do the Twitch thing.
B
Oh, looks like we have a good number of folks, and this is being recorded as well, so I think we can go ahead and get started. So, my name is John, I'm an EM on the Gitaly team, and yeah, we wanted to do a series of talks, I guess we'll call them training sessions. Originally the audience was support folks, because Gitaly is a pretty complex topic and it comes up a lot: issues happen with customers, or, you know, even on GitLab.com. And so we just wanted to provide a series of talks to kind of dive into different aspects of the Gitaly code base. And not just the code base, but how it works, and how it works with other parts of GitLab.
B
Just so folks can get more familiar. And so today we'll have Will Chandler; he's a senior backend engineer on the Gitaly team. He has a support background as well, actually. So he's going to give our first session, an overview of Gitaly architecture. So take it away, Will.
C
Let me just make sure we're recording... okay, we're recording, cool. All right, yeah. So this is just supposed to be a little bit of an overview of Gitaly, mostly from the admin perspective, and how the components fit together. All right, can you see my browser?
B
Cool. Sorry, Will, I forgot to say one thing: we're going to have Q&A at the end, so... oh, whoops, I just sent the wrong link. Okay, I'm going to send the agenda doc over in the Zoom chat, so feel free to put down any questions under the questions section, and then we'll have about 20 minutes to answer those questions. So, sorry about that. Go ahead, Will.
C
Of course. Okay, yeah. So basically my goal is to walk through what this diagram means in a little more detail and give you some more ideas of what Gitaly is doing under the hood.
C
In the past we had a sidecar called gitaly-ruby; that is gone as of 16.0. I don't know... was it Toon who did that work? Thanks, Toon. And then, as of 16.4, there is also the other binary we launched, called gitaly-git2go.
C
That used libgit2, but it has now been fully replaced with just git. So for any git operation you're doing, we are just forking a git process, basically standard git. I think we occasionally will use our own patches on our git version, but typically we just try to stay with the upstream version of git. And so we're just forking there and reading from standard out, typically, to get whatever the details are.
C
Okay, so there are kind of three main clients that Gitaly serves. So there's Rails itself, so like a Puma process: anytime you're actually looking at a project, that's Puma sending RPCs. If you're familiar with the performance bar, which you can bring up by hitting P and then B on your keyboard, you can just click here and see the RPCs Rails is sending, so FindLicense here, and you can see all the parameters we're sending. So typically, anything that Rails is sending is going to be small.
C
You
know,
like
hey,
show
me
what
files
just
by
this
name
or
what
are
the
references
that
exist
and
so
forth,
and
so
those
are
all
going
to
be
over
grbc
and
they
may
or
may
not
be
streaming
grpc
responses,
sometimes
they're
unary
or,
like
a
you,
have
single
response.
Sometimes
they
stream
back
depending
on
the
size,
but
that's
that's
puma
and
then
next
would
be
Workhorse.
C
So Workhorse is going to be a longer-lived connection, because, you know, it's pretty unusual to restart Workhorse. So any HTTP git operation, which on GitLab.com will be, you know, 95% CI, or in any environment where you're using GitLab CI, it's going to be HTTP, right? So usually you'll see most of your clone or fetch traffic come in over Workhorse. And so when we're doing any kind of clone or fetch, we use what's called the sideband. Originally we were sending the pack file details over gRPC, which involved getting the output from git, encoding it into your gRPC buffer, sending it over to Workhorse, and then decoding it and writing it out to the client, and that was inefficient.
C
So the Scalability team added what's called a sideband connection: we multiplex the gRPC connection to create a very simple pipe on there that will just send the pack information in the clear, so we don't have to encode it. So that saves a lot of CPU time. And that's why you'll see things like SSHUploadPack or PostUploadPack "with sideband"; that's what "with sideband" is referring to: this multiplexed connection that we have running.
C
Oh yeah, thank you: "sideband" is git's own mechanism, ours is the sidechannel. Thank you, Toon. Right, and so, like I said, any kind of git client operation, like a CI runner cloning or fetching, or someone just in their own terminal doing a fetch, will all use Workhorse. And then the final connection, or client, is Shell. So that'll be for incoming SSH requests; on GitLab.com we're using gitlab-sshd.
C
So that is, you know, a stable process, and it will have an existing connection out to Gitaly, and so we can kind of reuse that connection. On self-managed instances that's off by default, so when a new connection comes in, the standard OpenSSH daemon is going to fork a gitlab-shell process, which will then establish a new connection. So you'll see a lot more connection churn there.
C
Things will be much more ephemeral when it comes to that, but it's the same deal as Workhorse, just over SSH: it'll be any git client operation, so that is more likely to be humans in most environments. Some self-managed customers use SSH for their CI, so it's not always, it depends. So, there is... this is the closest thing we have to a diagram, which is this page here.
C
This doesn't mention Shell versus... well, this is in the Gitaly docs. There is also a larger GitLab.com architecture diagram somewhere in the docs; I don't know if someone knows where that is offhand, I don't have the link handy, but that one's pretty enormous. You can pick out all the stuff I'm talking about there too, if you know what you're looking for; there's just a lot more going on.
C
Okay, and then occasionally you also have Gitaly nodes talk to each other. For example, if I am forking a project that's not using an object pool, I might need to go from one Gitaly server to another, to fetch the details of that project and then clone it over to my own Gitaly server. And I'm not actually sure if this loopback connection to itself really happens these days; that used to be for gitaly-ruby, where we would connect back over to the host address that Rails provided. I don't think, offhand, that we still do that, so this might be a little out of date, this section here at the bottom.
C
Okay, all right. So those are the three clients that we have. What was I going to show you... oh yeah, one thing that's kind of interesting to see. So here is a standalone Gitaly node, which corresponds with the storage section in the Gitaly configuration; you can see here there's only one storage name listed. So if I wanted to fork... let's say I had a repository on Gitaly 1 here, and I wanted to clone it over to the Gitaly 2 shard, say, because it's a private project and we can't use object pools. Then what you'd have to do is create a connection, but Gitaly itself is not aware of any other Gitaly nodes. So, like, on GitLab.com...
C
Actually, if you look at the configs, you'll see that all the Gitaly storages are listed in the config, but that's only there for ease of configuration; in reality, each node is only using whatever storage it is actually configured for. So that's slightly confusing. But, basically: how does this Gitaly node know to talk to that one? Why isn't it in the config? The way it works is, in the Rails node... excuse me.
C
Rails will bundle the addresses it's aware of into the gRPC headers, so that it can basically tell the Gitaly client: here's the node you need to go to. Let me show that in action real quick. Right, we moved... I don't know why they got rid of the wrench. Oh, that's fine. Right: Repository... what do I have as the primary storage right now? Okay, so my cluster is where everything's going, so let's fork this.
C
So this logging setting here should be "debug"; debug enables extremely verbose logging on Gitaly, which you would not want to do in a production environment, all right, but for our purposes it's handy. So you can see it was writing out each header that it's receiving on a request. All right, so let's do...
C
Okay, here we go. Yeah, so you can see we get this gitaly-servers field, and then you get the base64-encoded string. And then, if you expand that, you can see that it's a JSON blob that says: hey, here's this Gitaly server, shard, which you can see here in the Rails config, and then here's the address, and here's its token, so you can authenticate yourself. So that is how Gitaly servers know how to communicate with each other; there's never any config on the Gitaly node itself, as of today at least, to talk between nodes.
C
This
a
lot
of
this
is
subject
to
change
like
we're
planning
to
completely
overhaul
how
gitly
cluster
works
and
that's
going
to
add
some
client-side
routing
which
hasn't
been
outmitted
yet
so
that
may
so
this
this
may
be
something
to
change,
but
right
now
this
is
how
it
works.
B
Could you explain real quick why, during a forking process, Gitaly would need to know about another Gitaly node?
C
Yeah, so it depends. If you have a public project or an internal project, Gitaly will create what's known as an object pool, which is kind of like a separate git repository, which will be prefixed with an @pools path on disk, and that will have all the objects in common between the forks. So that way we deduplicate most of the repository, usually almost all of it, and we can save a lot of space.
C
So if you have a project like gitlab, which has hundreds or thousands of forks, I don't remember off the top of my head, we don't have 500 copies of the repository; we have 95% of it in one copy, and then five percent scattered around across the other ones, where they differ. So that lets us be much more space-efficient.
C
...repository storage. So this page here in the admin area, under Repository, lets you set a relative weight for where your new repositories are created. So right now I have all of my repositories being created in my default Gitaly Cluster, and none of them go to the shard, and I can change that to whatever weighting I wanted.
C
If I had certain nodes that I wanted to fill up, and other ones not to be touched, then I would adjust that value. Anyway, in this case, because I'm creating all new repositories in default, and the existing copy of the repository was on the shard, the default Gitaly nodes, which are in a Gitaly Cluster, will have to reach out to the shard and say: hey, give me this repository. So it'll internally do a little fetch, get all the objects, and create its own copy of the repository.
C
So, by default, this is the directory that an Omnibus instance will put the repositories in. You'll see this .gitaly-metadata file, which historically has been used to identify whether or not we're using NFS. So, basically, the way it would work is, Rails would validate what this file looks like on its end, and then the Gitaly server would also say what UUID it has; if they match, then we say, okay, this must be NFS, and previously we'd use Rugged, which is the libgit2 bindings for Ruby.
C
Sure. Anyway, I can't remember off the top of my head if we've removed Rugged entirely or not. I think we've disabled it, right, John?
B
I think we... yeah, well... no, you can still use it if you really wanted to, through feature flags, but for all intents and purposes it's not being used by the application.
C
Great, okay; well, we won't talk about it too much, then. All right. And you have this... the +gitaly directory, excuse me, and this is kind of Gitaly's scratch directory.
C
Where,
when
we're
leading
a
repository,
for
example,
we
will
move
rename
it
as
a
repository
name
Dash
deleted
in
this
directory,
so
we
can
atomically
move
it
out
of
place
and
then
from
there
we
can
delete
it
and
we
create
reference
or
archives
or
archives
caches
for
the
the
Reps
that
we're
serving
that's
another
thing:
that's
in
there
it's
really
implementation
detail
and
then
there's
a
cluster
directory
will
have
the
actual
repositories
itself
and
if
I
had
any
pool
repositories,
there'd
be
a
pools
directory.
Here
too.
C
So you can see we have some bare repositories in here. Let's see: commit graphs, pack files, the reverse index, bitmap indexes, all the different housekeeping things that help keep git efficient. And this is a file that Gitaly creates to say: hey, when's the last time we did a full repack, so we can keep track of that as well. As I go here... I think... yeah, okay, let's go here.
C
Yeah, sorry. Okay, so just to explain a little bit more: this host here, gitaly-2 in my training setup, is within a Gitaly Cluster, and so Praefect will have it use this @cluster format for its paths. Whereas this one is the shard again, so standalone Gitaly, which is what almost all of our Gitaly nodes on GitLab.com are, and what you'd see in a typical small self-managed instance as well. And so here we use the @hashed format for the path.
C
This is the path that you'll see internally in Rails as well. So if we go to Projects, and look here, I can say: okay, this is the path that Rails has for the project, and this matches up with what we have on the shard. However, if you look at a project that's in the cluster, it'll show up as hashed, but there is no @hashed/6b/86... path on the file system here.
C
So we have to go into the Praefect database, which I mentioned, I'll get back to that in a second, to map between them. So, before I go any further, let me actually talk about what Gitaly Cluster is. Gitaly Cluster is a way of having multiple copies of a repository that are kept in sync with each other, while exposing only a single instance to Rails. So it's a way of getting redundancy, which is not something that we have with sharded Gitaly.
C
There's no way to replicate a repository to multiple different shards in the current setup. So, what that does: typically you'll have a load balancer, and that'll be the address that you give to Rails. Let me show you here: so, for example, this could be my load balancer. In actual fact I only have a single Praefect node; in production you'd typically have three, maybe more. One failure mode we've seen somewhat often with self-managed
customers is: if you have many Gitaly nodes behind Praefect, they can send so much traffic that it overwhelms the network connection, or the network bandwidth, that is available to the Praefect node, and that will cause silent slowdowns. It used to cause memory blowups in Praefect, but that's been fixed, in something like 15.5 or so. Anyway, so Praefect nodes: they're stateless.
C
So if we go back: here it is at @hashed/6b/86, etc. But if you look here, it actually maps to the @cluster repositories path, 4732 here, and you can see Gitaly 1 is the primary node for that, which makes Gitaly 2 the secondary. I only have two Gitaly nodes in this cluster; typically you'd have three, but two is fine if a customer only wants two, there's no need for quorum or anything like that. And this generation counter is saying how many changes have happened.
C
So this is how we keep track of which node is in what state. So, if we look at storage_repositories, you can see here they both have the same generation. So in this case, a read request for this repository could be routed to either Gitaly 1 or Gitaly 2, because they're both at the same generation.
C
When a secondary is behind, it has to replicate, basically do a fetch from the primary. If it's a very large repository, that fetch might take 30 seconds, in which case, if changes are coming in faster than that, the secondary might not catch up to the primary for hours. And so in that case we wouldn't be able to distribute reads: all reads have to go to the primary, along with any writes, and so you're losing most of the benefit, other than, you know...
C
...having redundancy in terms of, you know, the primary's hard drive failing. In this scenario you can't spread the load, so that's a failure mode that we see somewhat often. It's been improved; I think pretty much all RPCs at this point won't do that. For a while we had some that would always trigger a replication, and, you know, that just decreases the efficiency.
C
Okay. So this is kind of how Praefect nodes internally track, you know, how can I route a request. So, basically, the request comes in to the load balancer, the load balancer chooses any of the stateless Praefect nodes, and Praefect will check the database and say: okay, I can send this request for repository 6b86 to either Gitaly 1 or Gitaly 2, because the generation is 3. It'll route it, and it'll proxy the gRPC request down to one of the Gitaly nodes, and then Gitaly will do its thing.
C
It sends it back out to Praefect, which proxies it back out; the load balancer will send it back to the client, and so on. So there are a lot of steps there. So, how does Gitaly actually get git data? I'll show that real quick.
C
All right, that's actually a bit ahead... it's kind of hard to view, isn't it, sorry. Anyway, let me walk through what's happening here. So Rails is going to request to pull up a project, so it's going to send a few RPCs.
If you look here, you can see it sent SearchFilesByName; in this case it's looking for markdown files from the wiki repository, and you can see the regex that we're sending. And then, in addition, we're also going to the wiki... that's interesting, yeah, I'm surprised we're doing that; that's fine. And HasLocalBranches: so it's saying, hey, what branches do you have for this repository?
C
And so here you can see the file that's actually being forked. So it's not... it's embedded in the Gitaly binary: when it launches, Gitaly will extract its own copy of the git binary. So let's actually just expand that directory. Here you can see everything that's there: you've got gitaly-git2go, which, as I said, should be gone as of the next release, and then you have gitaly-gpg.
C
We use that to sign and validate signatures on commits. gitaly-hooks is used for, well, the pack-objects cache is kind of the primary thing, which I'll get into in a little bit.
C
gitaly-lfs-smudge is used for handling LFS objects, and that's kind of a long-lived process that we use to make that more efficient. And then gitaly-ssh is used to connect internally between Gitaly nodes. And then we also have the hooks and socket directories in there as well. Okay, and so, sorry, I just started printing, it's confusing, all right. So then we've got this git command here, saying, hey...
C
We're saying how to treat CR/LF line endings for whatever you're receiving; don't use replace refs, which has been a security issue; and these are the things that will be fsynced, so objects, metadata, and references are fsynced by git whenever we make a change there, and fsync is fsync. And this one, I believe, has to do with reducing contention on the packed-refs file.
C
Okay, and so then here's the actual command running, which is git cat-file. This is saying: give me a blob. We're running with -Z, which makes the input and output NUL-terminated, and then --batch-command, which says: we're going to send you a number of commands over time. So let's see if we have...
C
...any git cat-file processes right now, and we don't. So, what Gitaly does: for these Ruby or Rails operations that are often making lots of little requests, like, I want to look at this file, I want to look at that file, within the context of a given Rails request, and I think it works out to a single correlation ID...
C
...we will fork a cat-file process and then keep it around, so that for each file that we're trying to view we're not forking a new cat-file process. That lets us be a little more efficient with forking processes.
C
So when you look at a production server, you'll probably see a whole lot of cat-file processes hanging around. Those are mostly just there, waiting to see if any more requests come in for that given overall web request that's coming in. So that's why they're there. And then the other thing you'll see a lot of will probably be upload-pack, or...
C
Is there anything else I want to talk about... well, let me pause right here first. Any questions that are immediately relevant that anyone wants to ask?
C
...with the Omnibus, but this way we didn't have to rely on Omnibus updating what it bundled; we could control exactly what we were executing. And clients, or admins, can still override the git binary, so I don't believe we force the use of this. It's not something we recommend, but it is something that you may encounter in the wild if you're dealing with a self-managed customer.
B
Yeah, another reason is, sometimes, if there's a security fix or some high-priority fix, we will build a git binary with some of our own changes. Usually we ship the standard git binary that's released, but sometimes we'll ship with some of our own changes. So yeah, if self-managed customers use their own git binary, it might break, because it doesn't have the features that Gitaly needs.
C
So right now, for this repository, its primary is going to be Gitaly 1, so that is the Gitaly... no, that's not the one, let's go back. Is it this one? Yeah, this one. Okay.
C
No, actually, I was just reading it wrong. It did update, okay.
C
Maybe it already changed and I didn't even look; great, that's probably it. Anyway, so you can see now, for this repository only, we have updated the primary to be Gitaly 2. So the way failover works is: when a request comes in that the primary needs to handle, and the primary is not available, then we will update the primary to be one of the other in-sync nodes in the Gitaly Cluster.
C
If
there
are
none,
then
the
repository
becomes
will
become
unavailable
until
the
a
up-to-date
node
returns,
but
we
only
do
that
for
repositories
that
come
in
the
have
requests
come
in
so
you'll
note
that
this
other
repository
has
not
updated.
Even
though
gidly
one
is
offline.
Originally,
we
did
update
all
of
the
repositories
and
that
caused
a
lot
of
work
in
postgres,
and
it
was
just
it
was.
You
didn't
have
to
do
it
that
way
right
so
now
it
only
it
lazily
updates
the
primary
as
needed.
C
Yeah, and then I'll just quickly mention: you may see "deadline exceeded" errors in the logs. That means that some operation that Rails sent (it's almost always going to be Rails or Sidekiq) took longer for Gitaly to serve than the timeout configured in the admin area. So if we go back again and leave this page...
C
Preferences, Gitaly timeouts. So, whatever these values are set to; which timeout applies will depend on the operation that you're doing, but if it takes longer than this, particularly if it's a large repository, then you'll see a deadline exceeded. The other error you'll see a lot is "context canceled". That typically means that somewhere between Gitaly and the client, the connection went away. Maybe the client closed their git process, maybe a load balancer went down; it could have been anywhere along that chain.
C
If the connection is lost, you'll see that "context canceled" error message in Gitaly. So it basically just means that the client went away for some reason; we don't really know why. You should look upstream to try and figure out why that happened.
C
Okay, yeah, that covers everything I wanted to go over. Let me stop sharing for a quick second and I'll pull up the doc.
B
Thanks, Will. So yeah, I know that was pretty technical. So if you have any questions, or if you want Will to go over maybe one of the parts that he already went over again, please put those in the questions section and we'll go over them one by one.
C
Yeah, Karthik has answered that. So we're using a built-in git feature called alternates. Basically, you can say: here's a repository on disk, and then it'll have an alternates file that says, okay, also go to this other directory and use that to get additional objects from. So for all of your projects that are using a given pool repository, we add that alternate pointing at the pool to them, and then git itself just knows: okay, I need to go to this other directory, which is just a relative path, to fetch the objects. So there's a limitation, which is that, because it's a relative path, it has to be on the same node; we can't tell git to use an HTTP request or something to go do it. So that limits any fork network to a single Gitaly node, or the one Gitaly Cluster.
C
In theory, you still have the forks spread out, or replicated, amongst all the nodes. That was broken for a long time; I think Justin has gotten that basically fixed at this point, or close to it.
C
Are we adding extra overhead by translating the path through Praefect? So, I think the main answer there is that we wanted to have a permanent home for a given repository, but Rails was still asking us to move repositories around on disk, particularly in the context of Geo. So you'd have a new repository created in a temporary Geo location, and then once the fetch completed, it would get moved into place.
C
So, actually, the reason that we have the @hashed naming originally on Gitaly shards is the same logic. Originally, the path on disk was the repository name, including the project, so it'd be like /var/opt/gitlab/git-data/repositories/gitlab-org/gitlab would have been the path for GitLab.
C
But whenever someone renamed their project, or renamed their organization, we ended up doing a ton of work on disk to rename all these directories, and it was expensive and error-prone. So that's why we created the hashed format, and this @cluster thing is doing the same thing...
C
...just at another level of abstraction. This way, Gitaly Cluster knows that a repository is always at this path, even if Rails thinks that it's moved somewhere else. So that's why we added that.
E
It does add another layer of complication. The @cluster path is always the same, no matter which Gitaly shard you talk to, correct?
C
Back one... yeah, sorry, that was actually over here. So if you look at the path, it's actually the hash of the ID, but instead of taking the full hash, like we do with @hashed, we just put the actual number here. So this is the full hash of the number one, and this is also hash one, it just uses a one at the end. But they don't always correspond with the ID that Rails has; like, this one in Rails is project one, and in the cluster it's project... or, sorry.
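For reference, the @hashed path is derived from the SHA-256 of the project's decimal ID, which is why project 1 showed up under 6b/86 in the demo:

```python
import hashlib

def hashed_storage_path(project_id):
    """Disk path for a project under hashed storage: SHA-256 of the
    decimal project ID, with the first two byte-pairs as directories."""
    digest = hashlib.sha256(str(project_id).encode()).hexdigest()
    return "@hashed/{}/{}/{}.git".format(digest[:2], digest[2:4], digest)

print(hashed_storage_path(1))
# @hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git
```

Note this maps the Rails project ID to a disk path one way; as discussed next, there is no stored reverse mapping on the Gitaly node itself.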
C
So there's just no way, unless you're backing up your database, which I don't think we really advise in the docs, to re-establish that mapping. You can kind of fake it: you can go through the logs, so for any project that was creating or getting traffic, you could at least look at recent logs and say, okay, I can translate between those two, because they'll have both. But if it's an idle project, then you're probably not going to be able to easily do it.
C
Yeah, for a long time there used to be a gitlab section in the git config that would list the project name, so you could relate it back that way. I think we added that around version 10, but that's going away now, so you can't rely on that either.
C
Okay, Sean asks: do some customers run Gitaly independently of GitLab? Hugging Face, I think, is certainly the most prominent; there might be others doing it that I'm not aware of. This isn't something that we officially support, in the sense that the support team is not going to help them. Most logic in Gitaly is pretty agnostic there; there is some stuff that is GitLab-specific, you know, like authorizations, I think, are hard-coded to go to the internal allowed endpoint.
B
Yeah, and regarding the documentation for the RPCs: there is a good amount that's missing, and that's something that we're working to address.
C
This
is
all
just
automatically
generated
from
the
comments
on
our
protograph
definitions,
so
it
may
or
may
not
be
entirely
useful
understanding
how
it
works,
but
that
it
is
something
that's
out
there,
and
this
is.
This
is
generated
automatically
if
it's
part
of
the
CI
pipeline,
so
it'll
it'll
be
up
to
date.
F
They just want to use their own front end, so they are able to do this, but it is not something we would recommend at this point. However, where we want to be is to offer a clean RPC interface, with all the implementation details and internal logic, some of which is currently managed in Rails, moved into Gitaly. For example, how object pools are created and deleted is kind of a detail that doesn't belong in a client, whereas we want to keep all the business decisions, things that need access to the main Postgres database, like customer tier or customer settings on the repository or on their projects.
F
It may never be marketed outside; that is a product decision. But it does help us to be more independent, and to develop GitLab overall with less friction, if we have a clean interface between components.
D
As was briefly mentioned, from the product perspective this is also really important, because we've formed some very strategic partnerships through this sort of opportunity. We get a lot of contributions back from external people who are, you know, interested in Gitaly; Hugging Face is a great example, but there are others as well that we've seen over time, where we've gotten contributions from other teams sort of looking at it, and it gives us another perspective from a product point of view. And it does ensure that we are continuing to develop Gitaly in the most independent way, which is part of the reason why, as the direction page shows, we're trying to put the business logic into the Rails side of things, to make Gitaly somewhat agnostic to who is calling in. So yeah, it's been an overall win for us, but, as was said, we're not actively marketing or pushing it as an independent product.
C
All right, thanks everyone. So, next question, from Clement: interested in the cost of serving, what are the most costly operations? So, I know on GitLab.com, fast SSD storage has been the largest cost component of Gitaly by far. It really depends on the instance, though: GitLab.com is getting, you know, lots and lots and lots of repositories, whereas on, you know, a self-managed instance...
C
...you might not see as many; you might see a single monorepo, which will still be large, but, you know, not the scale of the 100 terabytes that you see on GitLab.com for the vast number of repositories there. So typically storage is a big expense, just because for a reliable and performant service you need to have fast storage. And then, depending on the size of the repository, you may also see significant expense in terms of compute.
C
So if you have, you know, a truly vast repository, let's say 20 gigs, it can easily use 10 or 20 gigs per operation sometimes, depending what it is. So if you're running multiple operations at once, you know, and you've only got 10 gigs of RAM on the box, you're gonna run out real fast, right? So you're gonna have to have a really large host in that case, just to serve a repository of that size. But more typically, storage is the larger cost.
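The capacity math being described can be sketched roughly like this. All the numbers here are hypothetical illustrations of the point above, not GitLab sizing guidance:

```python
# Rough capacity sketch: how many concurrent Git operations a host can
# safely run, given per-operation peak memory. Numbers are hypothetical.

def max_concurrent_ops(host_ram_gb, per_op_peak_gb, os_headroom_gb=2):
    """Return how many operations fit in RAM after reserving OS headroom."""
    usable = host_ram_gb - os_headroom_gb
    if usable < per_op_peak_gb:
        return 0
    return int(usable // per_op_peak_gb)

# A huge monorepo where one operation can peak at ~10 GB:
print(max_concurrent_ops(host_ram_gb=10, per_op_peak_gb=10))  # 0: the box is too small
print(max_concurrent_ops(host_ram_gb=64, per_op_peak_gb=10))  # 6 concurrent operations
```

In other words, a 10-gig box cannot safely run even one such operation, which is the "you're gonna run out real fast" case above.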
C
Yeah, Gitaly Cluster, you know, I would say that you'd expect specs that are the same as a standalone Gitaly node, because, you know, internally we're still presenting that as a single node to the world, right? So I don't think you'd want to skimp, because you're still running the same operations on a given node, just maybe fewer of them, and any of them could become the primary at any point, right?
C
So basically, when a fetch or a clone comes in, git will generate the pack file that's going to be sent to the client, and then, using the gitaly-hooks binary, we'll write that pack to disk. So if an identical request comes in, we don't need to go to git to recompute that, so we save all the CPU power that would have been used to create another pack file; we can just copy the bytes from disk and serve it out to the client.
C
So it's usually a win; usually you're more CPU-bound than storage-bound if you have a whole lot of clones coming in at once. But it depends on the instance and the traffic pattern, so it's something to look into in general, but maybe not always a win.
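The caching idea described above can be sketched like this. This is a toy model of the general technique (identical requests served from stored bytes), not Gitaly's actual pack-objects cache implementation, whose keying and storage details differ:

```python
import hashlib

class PackObjectsCache:
    """Toy pack-objects cache: identical fetch requests are served from
    stored bytes instead of re-running the expensive pack generation."""

    def __init__(self):
        self._store = {}   # request digest -> pack bytes (stands in for disk)
        self.misses = 0

    def _key(self, repo, want_refs):
        # Identical (repo, wanted refs) requests hash to the same key.
        h = hashlib.sha256()
        h.update(repo.encode())
        for ref in sorted(want_refs):
            h.update(ref.encode())
        return h.hexdigest()

    def get_pack(self, repo, want_refs, generate):
        key = self._key(repo, want_refs)
        if key not in self._store:
            self.misses += 1
            self._store[key] = generate()  # expensive: build the pack file
        return self._store[key]            # cheap: copy the bytes back out

cache = PackObjectsCache()
gen = lambda: b"PACK...bytes..."
cache.get_pack("group/project.git", ["refs/heads/main"], gen)
cache.get_pack("group/project.git", ["refs/heads/main"], gen)
print(cache.misses)  # 1: the second identical request skipped pack generation
```

The trade-off in the transcript falls out of this shape: the cache spends disk bytes and disk reads to avoid repeating CPU-heavy pack generation, so it pays off when the workload is clone-heavy and CPU-bound.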
C
Okay, I think that covers that. Terry asks: any epics or issues on the sync operations, specifically Gitaly Cluster replication? Yeah, so I mentioned this in passing earlier: we are moving Gitaly, in the long run, to a new architecture which will remove the need for Praefect. So that's going to be, on each Gitaly node, we're going to have a write-ahead log, basically like what Postgres uses, or other databases, where a change comes in,
C
we, you know, save its changes to disk, and then we apply them one after the other, you know, serialize them, and replicate the changes out to any other nodes that also happen to have this given repository. And then the client, so Gitaly will be routing clients without Praefect. That's pretty well covered in the blueprint for this, if I remember the detail.
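The write-ahead-log flow just described (save the change to a log, apply entries in order, replicate to other nodes holding the repository) can be sketched minimally. This illustrates the general technique only, not Gitaly's actual design; the class and field names are made up:

```python
class WALNode:
    """Minimal write-ahead-log node: changes are appended to a log first,
    then applied in order, then shipped to replicas of the same repo."""

    def __init__(self):
        self.log = []       # ordered log of changes (stands in for disk)
        self.state = {}     # applied state: ref name -> commit id
        self.replicas = []  # other nodes holding this repository

    def write(self, ref, commit):
        entry = (len(self.log), ref, commit)
        self.log.append(entry)        # 1. save the change to the log
        self._apply(entry)            # 2. apply entries one after another
        for r in self.replicas:
            r.replicate(entry)        # 3. replicate out to the other nodes

    def _apply(self, entry):
        _, ref, commit = entry
        self.state[ref] = commit

    def replicate(self, entry):
        self.log.append(entry)
        self._apply(entry)

primary, secondary = WALNode(), WALNode()
primary.replicas.append(secondary)
primary.write("refs/heads/main", "abc123")
print(secondary.state["refs/heads/main"])  # abc123: the change reached the replica
```

Because every node applies the same serialized log in the same order, replicas converge on the same state, which is what removes the need for an external coordinator like Praefect in this model.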
C
So there's gonna be a lot more complexity in how Gitaly talks to clients when this is done. You know, right now it's basically: Rails has an address, it goes to the address; if there's a problem, if it returns an error, that's it, there's no retry logic, that's that. So the client will have to do a lot more work whenever that comes around, but that's still a ways out.
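The extra client-side work being contrasted with today's one-address, no-retry model might look roughly like this. This is a hedged sketch of the general failover idea, not any planned GitLab client code; the node addresses and error types are invented:

```python
class AllNodesFailed(Exception):
    """Raised when every candidate node rejected the request."""

def call_with_failover(nodes, rpc, max_attempts=3):
    """Try an RPC against each candidate node in turn, instead of the
    current model where one error from one address ends the request."""
    errors = []
    for node in nodes[:max_attempts]:
        try:
            return rpc(node)
        except ConnectionError as e:
            errors.append((node, e))  # remember the failure, try the next node
    raise AllNodesFailed(errors)

def flaky_rpc(node):
    if node == "gitaly-1.internal":   # hypothetical dead node
        raise ConnectionError("connection refused")
    return f"ok from {node}"

print(call_with_failover(["gitaly-1.internal", "gitaly-2.internal"], flaky_rpc))
# ok from gitaly-2.internal
```

Today's behavior is the `max_attempts=1`, single-node case: the first error is final, which is why moving routing into the client adds real complexity.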
C
Okay, so that basically covers the question, sorry Terry. But yeah, so we're not putting a ton more effort into Praefect. You know, we're still polishing it up, but, like I mentioned, you know, like other RPCs that previously forced a replication job, we've got that down to like three, I think, at this point, or something, but we're...
G
No, yeah, I was just being nosy. I've seen that issue, or epic, that was linked, thank you. I didn't get a chance to read through it all the way, but we are working on Zoekt's replication strategy, so I was just curious. You know, at least it's good, if not great, to hear the problems that you're encountering, but at least it's something to think about when we're looking at how the replication is going to work on the Zoekt side.
C
Yeah, I'd be very interested to see what your team comes up with.
C
Okay, I think that's... we're at time. That's all the questions in the doc. I'm happy to hang around for another couple minutes if anyone has anything else; otherwise we can call it.
B
Cool, so thank you so much, Will, for that session.
B
You can take a look at that issue that's linked there for upcoming sessions. Yeah, by the way, I haven't sent out... I haven't set up the calendar event yet, so I'll do that shortly and send it out on Slack. But that is it. Thank you, everyone, for joining. Have a good day. Bye-bye.