From YouTube: GitLab 13.0: Gitaly and Praefect
Description
Simon Mansfield (Solutions Architect) and Christiaan Conover (Manager, Technical Account Management, East Enterprise) covering Gitaly and Praefect.
A
Cool, so this is a pretty highly anticipated capability we've been talking about for, you know, probably over a year for many of us with customers, because when we talk about setting up GitLab in an HA configuration, the one caveat we've always had to put on it is "except Gitaly, which isn't actually HA-capable yet, and you have to use NFS." It becomes a little bit of an annoying asterisk. But with GitLab 13, we have finally released our first iteration of what we're calling Gitaly Cluster, which is our HA solution for that.
A
Essentially, what this is is a high-availability solution for Gitaly. It allows you to build out scalable and redundant repository storage, and it negates the need for NFS. With this, we have now gotten to the point where you can deploy GitLab without any NFS behind it at all.
A
Right, so fair point: yes, there is that asterisk, but yes, for an HA environment. Now, technically you don't need NFS really at all, depending on how you set it up, but especially not for your Git repositories, which is the key thing here. So this basically addresses the last remaining single-point-of-failure element of GitLab. This is a big thing for customers who are running in cloud providers, on-prem, what have you.
A
So here's the architecture from our documentation that describes what it looks like when you set up a high-availability environment. You'll note that the bottom half of this outlines what it looks like, and there are a few key elements here that we're going to talk through in a moment, but you'll see that essentially the ingress point for the Gitaly Cluster is a load balancer in this architecture.
A
The reason for that is that you can actually have multiple Praefect nodes in a cluster, so you can even remove the single point of failure of the Praefect component in your cluster. Each Praefect node can communicate, in any combination, with the PostgreSQL database that is set up for Praefect, act as a load balancer itself, and handle any request.
A
So it's not a requirement to have a load-balanced Praefect environment. You can have just a direct connection from your GitLab instance to a single Praefect node, up to you, but it is architected to allow you to do it in a way where even Praefect is redundant. And then you'll notice here that the actual Gitaly nodes are set up in cluster groups, and you can also shard across multiple Gitaly instances behind Praefect if you needed to do so, for example if your storage volumes can't support a horizontally scalable Gitaly environment for IOPS purposes, and you need to have multiple different physical storage volumes behind it. There are use cases there; it gets complicated, but it can be done. The idea here is that this provides a variety of permutations for setting up an HA environment for your Gitaly storage.
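For concreteness, here is a minimal sketch of what wiring one virtual storage to several Gitaly nodes might look like in a Praefect node's /etc/gitlab/gitlab.rb, loosely following the 13.0-era documentation. The hostnames, ports, and tokens are placeholders, so verify against the current docs before copying anything:

```ruby
# /etc/gitlab/gitlab.rb on a Praefect node (illustrative sketch only;
# hostnames, ports, and tokens are placeholder values).
praefect['enable'] = true
praefect['listen_addr'] = '0.0.0.0:2305'
praefect['auth_token'] = 'PRAEFECT_EXTERNAL_TOKEN'

# One virtual storage ('default') fronting three Gitaly nodes.
# Praefect replicates writes across them and fails over as needed.
praefect['virtual_storages'] = {
  'default' => {
    'gitaly-1' => {
      'address' => 'tcp://gitaly-1.internal:8075',
      'token'   => 'PRAEFECT_INTERNAL_TOKEN',
      'primary' => true
    },
    'gitaly-2' => {
      'address' => 'tcp://gitaly-2.internal:8075',
      'token'   => 'PRAEFECT_INTERNAL_TOKEN'
    },
    'gitaly-3' => {
      'address' => 'tcp://gitaly-3.internal:8075',
      'token'   => 'PRAEFECT_INTERNAL_TOKEN'
    }
  }
}
```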
A
So, Praefect specifically. We've mentioned this a few times, and Simon, I know I keep talking here, so I'll let you carry this one if you want to discuss it.
B
Yeah, I mean, most of it's already been said. Praefect is this thing that sits in front of Gitaly. It's sort of transparent to GitLab; GitLab doesn't know that it's talking to Praefect. It's the thing that's responsible for syncing all the Gitaly nodes, and it's the thing that's responsible for really implementing the replication. As Christiaan said, you can have multiple Praefect nodes so that the Praefect node itself isn't a single point of failure. And it also does some load balancing as well; the Praefect nodes have this element of balancing traffic between the various Gitaly nodes. So yeah, that's really what Praefect is. You will hear it mentioned a lot if you're talking about Gitaly Cluster.
A
You're probably going to hear this referred to quite a bit. So Simon and I on Monday actually went through the process of setting up a Gitaly Cluster, and we started from an existing environment that I created, with a single GitLab app node and a Gitaly node separately configured. We took this approach figuring that it matches what the majority of our customers currently looking to do this would be starting from.
A
So I wanted to replicate roughly what you'd likely encounter when you're dealing with an existing environment with Gitaly, from the perspective of a prospective HA configuration. And so we went through the process of building out the Gitaly Cluster, integrating it into GitLab, and then figuring out how you would migrate data from your existing Gitaly storage to the new cluster. We won't take you through the multi-hour process in depth. If you really are interested in doing that, you can watch the video that we put on the GitLab Unfiltered YouTube channel, which I pared down in a few spots where we fumbled through GCP for our own lack of knowledge there and eventually figured out the solutions. But you can see the progression from start to finish of how we got that done and the pitfalls we encountered. Here are some of the takeaways from going through that process. Simon, if you want to go through some of these, I'll let you.
B
Yes, something to be aware of at the moment is that the Praefect leader election currently favors availability over consistency, so at the moment there is a possibility of data loss. It's very unlikely; it's when a node has failed and the leader election takes place. There is an issue in flight at the moment to change that, to default to using the PostgreSQL database that Praefect uses to do leader election, and that's going to be made the default in 13.1. It's actually there now; it's just not the default, I think because it was still untested when it was launched. That will give you consistency over availability and kind of solve those issues.
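If you want to opt in before 13.1 flips the default, it should be a one-line change on each Praefect node. Treat the exact key name here as an assumption based on the 13.0-era Omnibus settings and verify it against the docs:

```ruby
# /etc/gitlab/gitlab.rb on each Praefect node (assumed 13.0-era setting;
# confirm the key name in the documentation before relying on it).
# 'sql' selects the PostgreSQL-backed elector described above instead
# of the in-memory one, trading availability for consistency.
praefect['failover_election_strategy'] = 'sql'
```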
B
The next point I added, so I'll talk about it: at the moment, when you install Omnibus, you kind of get everything all in one, and you get the PostgreSQL database there for all the data as well. But Omnibus does not include a PostgreSQL database for Praefect, and Praefect needs its own database. It can actually go inside your GitLab database, and this comes up again in a later point, but it's not supported in that configuration when you're using Geo. So for that reason, myself and Christiaan would both recommend, I think, that you separate it out straight away, because if you ever want to go to Geo, you then have to untangle your database. So there's no Praefect database going to be installed in Omnibus by default at the moment. Do you want to take the next one, Christiaan?
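As a reference for that separation, a rough sketch of pointing Praefect at its own PostgreSQL instance in gitlab.rb might look like the following. The host, credentials, and database name are placeholders, and the exact keys should be checked against the documentation:

```ruby
# /etc/gitlab/gitlab.rb on a Praefect node: use a dedicated PostgreSQL
# database rather than sharing the GitLab application database.
# Host, credentials, and database name are placeholder values.
praefect['database_host']     = 'praefect-postgres.internal'
praefect['database_port']     = 5432
praefect['database_user']     = 'praefect'
praefect['database_password'] = 'PRAEFECT_SQL_PASSWORD'
praefect['database_dbname']   = 'praefect_production'
```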
A
Sure. So, as I mentioned earlier, one of the things that we went through in our setup process was the effort to migrate data from the existing Gitaly instance to the Gitaly Cluster. When you create the cluster, you then obviously have to go into the gitlab.rb file and add it as a data directory that GitLab can use to store information, but all storage volumes are created equal in GitLab's eyes.
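That gitlab.rb step might look roughly like this on the GitLab application node, following the git_data_dirs pattern from the docs. The storage name, hosts, and token are placeholders:

```ruby
# /etc/gitlab/gitlab.rb on the GitLab application node (sketch).
# Registers the cluster as an additional repository storage; the
# address points at Praefect (or its load balancer), never at a
# Gitaly node directly. Names, hosts, and the token are placeholders.
git_data_dirs({
  'default' => {
    # The pre-existing standalone Gitaly node.
    'gitaly_address' => 'tcp://gitaly-old.internal:8075'
  },
  'cluster' => {
    # The new Gitaly Cluster, reached through Praefect.
    'gitaly_address' => 'tcp://praefect-lb.internal:2305',
    'gitaly_token'   => 'PRAEFECT_EXTERNAL_TOKEN'
  }
})
```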
A
So
you
have
to
tell
git
lab
where
you
want
projects
to
be
stored
and
to
migrate
from
an
existing
one
to
a
to
a
new
one.
Requires
api
calls
right
now.
I
don't
know
of
any
utility
that
has
been
created
to
help
support
the
at
the
migration
of
data
in
in
a
batch
process
from
one
location
to
another.
A
I've actually started poking at it myself, building a proof of concept that might be able to do that as a very simple CLI, just as a personal project, to maybe facilitate that more easily. We'll see what comes out of that and if it's usable. But it is something that, right now, you do have to script via API calls to migrate from one to the other.
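To make the "script it via API calls" point concrete, here is a hypothetical sketch of such a batch move using the Projects API, where an admin updates a project's repository_storage attribute. The URL, token, storage names, and the omission of pagination and error handling are all assumptions for illustration; test anything like this in staging first:

```ruby
#!/usr/bin/env ruby
# Hypothetical batch-migration sketch: move every project on one
# repository storage to another by scripting the GitLab API, which is
# roughly what has to be done by hand today.
require 'net/http'
require 'json'
require 'uri'

GITLAB_URL = 'https://gitlab.example.com'    # placeholder instance URL
TOKEN      = ENV.fetch('GITLAB_ADMIN_TOKEN') # admin personal access token
FROM       = 'default' # existing standalone Gitaly storage
TO         = 'cluster' # virtual storage served by Praefect

def api(method, path, body = nil)
  uri = URI("#{GITLAB_URL}/api/v4#{path}")
  req = method == :put ? Net::HTTP::Put.new(uri) : Net::HTTP::Get.new(uri)
  req['PRIVATE-TOKEN'] = TOKEN
  if body
    req['Content-Type'] = 'application/json'
    req.body = body.to_json
  end
  res = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(req)
  end
  JSON.parse(res.body)
end

# List projects (repository_storage is visible to admins) and move any
# that still live on the old storage. Pagination is omitted for brevity.
api(:get, '/projects?per_page=100').each do |project|
  next unless project['repository_storage'] == FROM
  puts "Moving ##{project['id']} #{project['path_with_namespace']} -> #{TO}"
  api(:put, "/projects/#{project['id']}", 'repository_storage' => TO)
end
```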
B
Yeah, the next one I've really already said, which is: separate your databases out. And then the final point is that, at the moment, there's very limited scope for admins to actually be able to monitor the cluster. There's an epic around this; it's something that the Gitaly team is working on, and it's something that they're really pushing for.
A
One final point I want to make on that first bullet here, if you're not familiar with clustered environments and data-consistency architectures and that sort of thing. The main reason there's a potential for data loss with the way it's currently set up is that if your primary Gitaly node fails, what Praefect does right now is just pick another one. It doesn't worry about how up to date it is. It just picks one, uses that as the leader, and goes forward from there. The database that gets created for Praefect is also used to track the changes that occurred over time and which nodes have which data sets, but Praefect is not currently utilizing any of that information to actually pick the leader. So there is a possibility, if you get out of sync, that it will say "we don't know which nodes have what," and it'll put you in a read-only state for any repositories that don't have up-to-date synchronization. So that's the reasoning behind that point. They're working on that in 13.1; hopefully we'll see a resolution there, and it won't even be a concern by the time our customers are deploying this in production.
B
Okay, yeah. So we went into this process without having read up on anything, and without having spoken to the product team about it specifically, so we did it as if a customer were running through the process. We made notes throughout, and we followed the documentation, and actually the documentation is really solid. There was pretty much only one point where we truly got blocked, and that was really due to the cloud provider's documentation rather than our own. Everything we've got feedback on, we're going to feed back to the Gitaly team and try to get added to the docs. One point of note is that the reference architectures, so the 5k, 10k, 25k, and 50k architectures, do not yet include Gitaly Cluster. That is something that people are aware of, and they're adding that support.
A
I made a note here at the bottom about the GitLab orchestrator project. This is something that I was actually made aware of by Jason Plum when he graciously joined our session on Monday to try to help us work through some GCP issues. I will update the deck here with a link to that project if you're curious. Jason gave us all sorts of disclaimers about it: this is not production-ready, this is not productized yet, your mileage may vary. Customers can use it, but it's up to them how they do it, all that kind of stuff. But it could be a very useful resource for any customers you're working with who are interested in having some sort of infrastructure-as-code configuration and are looking to us to provide guidance or best practices on how to do so, because it looks pretty well rounded right now as to all the components that it covers.
A
So it's something worth looking at as an aside, if you're interested. The main takeaways from going through this exercise that Simon and I did, and from reading through the docs: Gitaly Cluster is really well architected. It's clearly thought out to be scalable and redundant and to address all the concerns somebody would have in building an HA solution, especially when you're dealing with the kinds of demands that Git transactions place on storage. So I think it's going to be a great solution. Obviously, there are a few limitations right now that I would say make it a little premature to implement in a production environment. I've been giving my customers guidance, before this was even released, that they should probably hold off until 13.1 or 13.2 before they look at doing this in their actual production environments.
A
Just anticipating that, since this was the first GA release, there were naturally going to be bugs surfaced by it being out in the wild, and that aligns with what we've been seeing from some of the known caveats that they're expecting to resolve in the next couple of releases. So I've encouraged customers to set this up in staging environments, to test it out and understand the process. But I would say it's probably not production-ready for most of them until later this summer. And, as you know, Simon and I have also agreed that we're going to help the Gitaly team develop the docs to be even more useful than they currently are from a customer-facing perspective, so that, ideally, customers can walk through this step by step with little to no assistance from us.
A
We may be using it in production on GitLab.com, but let's not forget: we also run beta releases of the product itself on GitLab.com. So I think we're more risk-tolerant, with our dedicated infrastructure team, to do so, and we probably have it built out in such a way that we're limiting the risk of data loss because of the scale that we're running it at. That would be my assumption, but yes.
B
And one of the biggest things is the leader election, which, as I said, is not switched on by default but is available. So I imagine that our infrastructure team has probably switched that on. That would be the main thing.
A
But from the perspective of, you know, customers who have more limited resources for this type of thing than we do, I've generally recommended to them: just give it a couple of months so this thing stabilizes and gets to fully production-ready.
A
Well, all right, so here's why. In a traditional NFS model, you probably have some combination of a compute instance and a storage volume to support your NFS, right? And that compute instance may or may not even be present: if you're not needing to manage and run a compute instance yourself, it might be provided by your cloud provider, or you may have a SAN or something in your infrastructure. But in general, you're probably going to have one Gitaly node that is connected to that storage volume for NFS, and even if you have multiple Gitaly nodes talking to it, it's still just those nodes.
In a Gitaly Cluster environment, you're necessarily adding, in addition to your N Gitaly nodes, at least one Praefect node, at least one PostgreSQL database, which is probably on its own node as well, and possibly a load balancer in front of all of that, between your GitLab environment and your Gitaly environment. So your compute resources alone are likely to be higher.
You may see some savings if you're not having to use a managed NFS solution, but it's not necessarily going to be enough to offset the additional cost of setting up this infrastructure. That being said, our reference architectures don't even start talking about HA until you're at around three or five thousand users, so it may be a negligible difference from the perspective of the service provider.
C
But one would also expect that one would gain some amount of additional performance out of this, because NFS is, you know, pretty much a pig with regard to locking and everything else, and has its own overhead.
B
Absolutely. So one benefit of this is that the actual implementation of your Gitaly nodes becomes much more important, because if Gitaly is talking to NFS, it's more about the NFS storage and how performant that is than about the Gitaly node itself. Now you've got local SSDs attached to your Gitaly nodes, right, and that is where the data is coming from. So potentially there are performance benefits there.
A
I would argue that maybe holistically, when you factor in not just the infrastructure costs but also the increased productivity from faster performance, as well as, hopefully, the lower amount of infrastructure management that would have to take place for fine-tuning your Gitaly instances to talk to an NFS solution, you might see an overall total cost reduction from that perspective. But purely in terms of the bill that you pay your cloud provider, it's probably going to be a little higher, yeah.
B
Performance-wise, I think it's too early to say at the moment. We were really looking at this from an implementation perspective, not a performance one, but I can get in touch with the Gitaly team, and we can ask them if they've got any data on that.
A
All right, do that, yeah. I'll just say I know we're done here with ours, and we're probably over time, so we'll hand it over to Chloe next. I think she's next, right?
B
Yeah, effectively, I think what happens with LFS is that there's a link to the LFS object stored in Gitaly, in terms of, like, "this file exists in your Git repo, but it's not actually here." Yeah.
C
Right, I mean, that's how LFS works, because we want to keep that stuff out of the repo. But the question is what we do with that, and Pages, and other data that sort of exists outside of the normal Git infrastructure, stuff that lives on the file system on its own. We still have the same problem with that that we did before, I guess.
A
Well, for the majority of that other type of data, we already have architectural support for using things like object storage, to make it so that you don't rely on single-point-of-failure storage solutions and you are on a scalable option. Obviously that doesn't work for people who are entirely on-prem and don't have an object storage layer that they can use, but our application architecture does support the use of things like S3 for all those other components.
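As one example of that architectural support, moving LFS objects to object storage is a gitlab.rb change along these lines. This is a sketch based on the S3-style settings in the docs, with the bucket, region, and credentials as placeholders:

```ruby
# /etc/gitlab/gitlab.rb: store LFS objects in S3-style object storage
# instead of on the local file system or NFS. Bucket, region, and
# credentials are placeholder values; other components such as CI
# artifacts and uploads have analogous settings.
gitlab_rails['lfs_object_store_enabled'] = true
gitlab_rails['lfs_object_store_remote_directory'] = 'gitlab-lfs-objects'
gitlab_rails['lfs_object_store_connection'] = {
  'provider'              => 'AWS',
  'region'                => 'us-east-1',
  'aws_access_key_id'     => 'AWS_ACCESS_KEY_ID',
  'aws_secret_access_key' => 'AWS_SECRET_ACCESS_KEY'
}
```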