From YouTube: GitLab Support Gitaly Cluster Deep Dive - AMER
Description
Recording of GitLab Support AMER deep dive on Gitaly Cluster
A: Okay, so welcome to the Gitaly Cluster deep dive for AMER. Gitaly Cluster, to give a real high-level overview, is our high-availability solution for Gitaly. A little bit of background: initially we had Rugged, that is, libgit2, and we accessed Git data directly from the application server; then we split that out into Gitaly. Gitaly in its original form didn't really support high availability at all. So what you had to do was shard your data, so that, say, projects one, two, and three are on shard one and projects four, five, and six are on shard two. That gives you some mitigation of data loss, because you only lose half of your projects if one of those nodes goes down, but you still don't have any redundancy there.
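(For illustration, a sketch of that pre-Cluster sharded layout on the Rails side; this config is not shown in the talk, and the storage names and addresses are hypothetical.)

```ruby
# /etc/gitlab/gitlab.rb on the Rails node: two standalone Gitaly shards.
# Each project lives on exactly one shard, so losing the node behind
# "storage2" takes down every project assigned to it.
git_data_dirs({
  "default"  => { "gitaly_address" => "tcp://gitaly-shard1.internal:8075" },
  "storage2" => { "gitaly_address" => "tcp://gitaly-shard2.internal:8075" },
})
```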
A: So again, what Gitaly Cluster does is it lets you run multiple Gitaly servers that are kept in sync and presented to Rails as a single Gitaly backend. Rails doesn't even know that this is Praefect, or that there are multiple Gitalys behind it. That has the obvious advantage that now you can lose one of these nodes, or even two, without losing all your data.

The way we mediate that is with a program called Praefect. You need at least three Praefect nodes; three is what we recommend. I don't think I've seen anyone using more. You could in theory, but you need these three so they can agree on who the primary is among the Gitaly nodes. In terms of the Gitaly servers themselves, you typically see three as well. Again, you could have more, but three is most common.

Anyway, the Praefects don't directly talk with each other. What they do is share a Postgres database, which in theory you could put on the Rails database, but in practice we recommend putting it on its own separate database. Most customers are doing that in the cloud, using RDS or an equivalent from other cloud providers; for customers who are self-managed, I think there are some limitations right now. I can't remember off the top of my head what they are, but essentially we'd recommend a separate Postgres for Praefect, because it can put a lot of load on that server. Particularly if you have distributed reads enabled, it can generate a fair amount of traffic. It used to be to the point where we actually disabled that feature, because on large instances it would just tank PostgreSQL. That's been resolved now by adding some caching, so that we're not going to the database as often, but that also required... okay, here it is. That means we also require a direct connection to that Postgres from each Praefect.

Actually, let me back up for a second. Most connections will go through PgBouncer, or at least we recommend PgBouncer, but we also need the more efficient connection for this distributed-read cache, so you need both in your Praefect config. Let me just pull up the gitlab.rb here. You have your database host, which in this particular demo is a direct connection, but for a customer this would be PgBouncer, and then you also have this no-proxy host, where you're connecting directly to, presumably, your primary Postgres server.
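(In a 13.x-era `/etc/gitlab/gitlab.rb`, the two connections described here look roughly like the following sketch. Hostnames and credentials are placeholders, and the exact setting names may vary by version.)

```ruby
# /etc/gitlab/gitlab.rb on a Praefect node. Regular traffic goes
# through PgBouncer like any other database client:
praefect['database_host'] = 'pgbouncer.internal'
praefect['database_port'] = 6432
praefect['database_user'] = 'praefect'
praefect['database_password'] = 'PRAEFECT_SQL_PASSWORD'
praefect['database_dbname'] = 'praefect_production'

# The read-distribution cache relies on LISTEN/NOTIFY, which PgBouncer's
# transaction pooling can't carry, hence the direct "no proxy" connection:
praefect['database_host_no_proxy'] = 'postgres-primary.internal'
praefect['database_port_no_proxy'] = 5432
```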
A: Yeah, actually, this is the downside for self-managed. If you're not using a highly available Postgres like RDS, then you have the downside that if your primary Postgres server goes down and PgBouncer fails over to a secondary, this no-proxy connection no longer works. Gitaly and Praefect still work, but now we're much less efficient, because we no longer have this cache. I don't actually know... this hasn't happened in anyone's production yet, but I'd guess we'd probably see a lot more load on Postgres when it does. It should be interesting to see, whenever that eventually happens, what the consequences are.
A: Yeah. So that's why we have this separate no-proxy connection here. As for the distributed-reads flag: the advantage of distributed reads is that previously, by default, you had your primary Gitaly server and we would do all reads and all writes there. That means that even though you have three servers, you functionally have one handling all the traffic. Distributed reads lets you, well, distribute the reads, like you might expect, so your read traffic can be spread across the secondaries instead of all landing on the primary. So that's the reason for the feature, and some of the complications with it.

Okay, where was I? All right: you've got the Praefects talking to Postgres via two different connections, and what they're basically doing is saying... well, let's actually pull up Postgres here.
A: Sure, yeah. So you can see the generation here, which is basically how many replications have happened. Praefect does not directly track the state of the repository; it doesn't say "HEAD is SHA one-two-three-four-five." It's saying "I replicated this many times for this project," and given that, it knows what the state of the repo is.
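(Those generation counters can be inspected directly in the Praefect database. A sketch using the 13.x-era schema; verify the table and column names against your version.)

```sql
-- In the praefect_production database: per-storage generation counters.
-- A storage whose generation trails the others is behind on replication.
SELECT virtual_storage, relative_path, storage, generation
FROM storage_repositories
ORDER BY relative_path, storage;
```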
A: What that kind of complicates is something a lot of customers have asked about: can I hot-swap in a new server? That's something we don't support at all right now. Hey Caleb, welcome! There is an epic for that, but it hasn't gotten any traction yet, so eventually we'll support something like that, but right now we don't. And this here is at least part of the reason why: we don't really know in Praefect what the state is directly; we just know we replicated this many times. I know that Gitaly two has not run that replication yet, so I know it's behind. We're not going out and directly checking, you know, the HEAD of the repo; we're just saying "this is the replication count, and based on that, I know what the state is." Okay, so the repository... the repository assignments.
A: All right, that's actually empty. Okay, the replication queue should be empty, because nothing is actually happening right now. Yeah, okay. And this is kind of tracking when the primary changes.
A: So actually, let me back up: there are kind of two ways that we do this. Initially we had eventual consistency. When a change was made, we'd update the primary, and then we'd schedule a replication job to say "all right, now Gitaly two and three, go fetch the changes from Gitaly one, which is the primary." So we'd add an entry here, and then Praefect would fire off jobs to the two other Gitaly servers to have them fetch the changes.

Then there's another feature, strong consistency, which uses what are called reference transactions. Basically, when a change comes in, the primary will call back out to Praefect and say: "hey, I'm going to make a change, and I want to synchronize with the other two Gitaly servers." Then they'll establish a quorum. The details of that I'm a little bit fuzzy on; maybe Catalan knows. But anyway, they establish a transaction that says "we're going to update all three of these servers in lockstep," and that way you don't have this replication lag. When someone pushes, Gitaly one, two, and three are all, through this mechanism, updated simultaneously. That obviously makes it more reliable, so that if the primary goes down, or even one of the secondaries, the other two are much more likely to be up to date and in a good state than if we were replicating as changes came in.
A
We
had
some
issues
in
the
past,
so
one
thing
I
should
note
is
that
we
really
don't
want
customers
to
be
using
this
with
an
old
version
of
gitlab.
It's
like
you
really
want
them
to
be
on
the
latest
version.
I
think
up
to
13.5,
there
is
a
bug.
Replication
would
just
freeze
you
know
every
hour
or
two,
and
so
you'd
have
this
huge
backlog
of
jobs
that
would
build
up
until
you
restarted
prefect
and
that
would
clear
out
the
race
condition
that
was
breaking
it
until
it
happened
again.
A
So
if
you
see
someone
using
an
older
version
of
getlab
with
the
github
cluster
strongly
recommend
that
they
move
up
to
the
latest,
because
there
are
more
features,
there
have
been
a
lot
of
bug
fixes.
You
know
it's
so
much
very
much
a
feature,
that's
in
the
process
of
being
polished
and
improved.
So
this
is
not
something
you
want
to
like
hop
up
to
thirteen
point
one
and
say
all
right:
I'm
gonna
use
good
catalytic
cluster.
Don't
do
that!
That's
a
really
bad
idea,
make
sure
you're
using
the
latest
version.
A: Okay, right, so that's strong consistency. We'll actually still see some replication jobs. I think... Catalan, I want to ask you to jump in here real quick: you had a customer who was using Geo with Gitaly Cluster, and they were seeing a backlog of replication jobs building up over time.
D: Yeah, in our case it was performance related: Geo was syncing more repositories than Praefect could clear from the backlog.
A
Were
they
using
strong
consistency?
No
okay.
That
makes
sense
all
right.
I
was
wondering
how
their
how
this
backlog
is
building
out
when
they're
using
charcoal
synthesis,
okay,
yeah
one,
and
so
the
the
other
major
pain
point
with
strong
consistency,
or
maybe
the
the
major
strength
pain
point
is
that
so,
like
I
mentioned,
italy
is
going
to
create
a
new
connection
from
itself
to
the
prefix
host
that
called
into
it.
A
Oh,
that's,
marina,
and
so
the
way
it
does
that
is,
you
know
if
you
look
at
the
gateway
config,
let's
just
pull
up.
A: It has no knowledge of Praefect at all. As far as Gitaly is concerned, it's just a standalone Gitaly. It's not aware that it's in a cluster, so it doesn't have any way of connecting to Praefect, and it doesn't know there are other Gitalys; it's just hanging out. So how do we handle that? How does it call back into Praefect? Gitaly will read the incoming address of the connection request from Praefect, and we assume that that is an address we can connect to in the other direction. But that's not always the case. If you were NATed, so that it wasn't actually the correct address, that's not going to work. The thing that we've seen multiple times now is in a containerized environment. I don't actually know how this works with the chart.
A: I don't know if it does; I haven't tested the chart, so the chart may not work either. But certainly in Docker, which is the case we saw the most, what will happen is Gitaly will see the connection coming in (if you're using bridge networking, the default) from something like the bottom address of the bridge network's address space. It tries to call back out to that, and it doesn't work, because that's not actually an address that routes back to Praefect.
A: So if you're using VMs, this should be fine. In Docker you can turn on host networking, but that might cause other problems: if you have multiple containers on the same host, they'll try to bind to the same ports. So my recommendation would be to just turn off reference transactions, just Feature.disable the Gitaly reference transactions flag, if they're in Docker. I don't know what's going to happen with the charts, but it would be the same there; you'd just disable the feature.
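(Concretely, that would be done from the Rails console. The flag name below is the 13.x-era one, so treat it as an assumption and check the flag list for your version.)

```ruby
# In `gitlab-rails console` on a Rails node: turn reference transactions
# off instance-wide (assumed 13.x flag name).
Feature.disable(:gitaly_reference_transactions)

# Confirm it took effect:
Feature.enabled?(:gitaly_reference_transactions)  # => false
```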
A: If it even is an issue there; I'm not sure, probably. There is an MR to fix this, and I'll put that in the doc after the call. It's been worked on, but Paul, the developer who was working on it, just left the company like two days ago, so someone else is taking it over. It was supposed to be for 13.9; now I know it's supposed to be 13.10.
A: I don't think it's going to make 13.10. So eventually this feature will work for everyone; right now it is a source of some issues. Okay, all right. So that's strong consistency and distributed reads.

Okay, other things that can go wrong. In terms of networking connections, the standard way it's set up is you have your Rails server... let me actually just pull this up. This is using a Terraform script that's in the Gitaly repo itself, so anyone can use it. If you have GCP access, you just go into the gitaly repo under _support/terraform, and you can run that and spin up a demo cluster. It's really easy; there are just two scripts you run, create-demo-cluster and configure-demo-cluster, and you're up.
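(If you want to reproduce the demo environment, the flow looks roughly like this sketch, based on the scripts named in the talk; paths and prompts may have changed since.)

```shell
# From a checkout of the gitaly repository, with GCP credentials set up:
git clone https://gitlab.com/gitlab-org/gitaly.git
cd gitaly/_support/terraform

# Provision the VMs (Rails node, Praefects, Gitalys, Postgres, load
# balancer), then configure GitLab across them:
./create-demo-cluster
./configure-demo-cluster
```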
A: So, what that does, and what you typically see, is that you have your Rails nodes; in this case there's only one database (this is GCP), and then there's a load balancer, and behind the load balancer you have your three Praefects, and then you have three Gitaly hosts as well. So in terms of network connections, you're going to see: Rails to the load balancer, the load balancer into the Praefects, and the Praefects into the Gitalys. The Gitalys will talk amongst each other; then, if you have reference transactions, they connect directly back to Praefect, and I believe they'll also call back out to the load balancer in some cases, and then also, using gitlab-shell, out to the Rails node.
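(Summarizing those paths; my sketch of the topology as described, not an official diagram.)

```
Rails -> load balancer -> Praefect (x3) -> Gitaly (x3)
Gitaly -> Gitaly              replication fetches between nodes
Gitaly -> Praefect            reference-transaction votes (direct)
Gitaly -> load balancer       some callbacks re-enter via the LB
Gitaly -> Rails internal API  gitlab-shell hook callbacks
Praefect -> Postgres          PgBouncer plus the direct no-proxy link
```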
A: So let's actually look at that real quick. Oh, hey Ben! So let's see if we can get that to happen; I'll just edit this.
A: Okay, yeah. So here we're connecting on port 8075; that's probably the replication connection, to gitaly-ruby locally. All right, and here we're calling out to .53. I think that's Rails, right? Yeah, okay, so that's calling the internal API. And where are we calling Praefect? We should be calling it... I don't see it. I did this earlier and it worked.
B

A: Interrupt me at any point with questions. Okay, actually, it's interesting: I don't see that we called Praefect here. I had that feature turned on, and distributed reads is true. Okay, weird. Sorry, reference transactions, that's the one I care about. Yeah. So in terms of observability, we have logs, but those can be verbose.
A: You've got at least six servers that you need to check, because you probably need to check all three Praefects and all three Gitalys, and that can just be a lot of data to have to deal with. For a support engineer that's not as big a deal, but it's still a pain to have to go through all of it. We have a number of useful Prometheus metrics, but on self-managed that's often not an option.
A: So, let's see. You can see here replication latency, so how long it takes to keep the other two nodes in sync. I'm surprised there's any at all, since I have strong consistency on, but maybe I'm just not understanding that correctly. And here, this is interesting: with read distribution it should be (apologies for the dog; I guess she feels strongly about this) sending most reads to the secondaries. So right now, Gitaly three is our primary, and let me look at the IPs here. Okay, so Gitaly three is .54, but we actually see that .54, and what is that, .27, which is two up here, are getting most of the traffic, and Gitaly one is getting less. That isn't what I'd expect, because it should be taking the primary and sending most requests to the secondaries with read distribution on.
A: I know that's what happened on GitLab.com, but okay. So, let's get into a little bit of... actually, let me stop first. Particularly for anyone who just joined, any questions you have right now?

B

A: Okay, so let's get into breaking things. So let's see, we have Gitaly three as our primary, so let's turn off the secondaries first.
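(On an Omnibus install, this kind of failure injection is just stopping the service on the relevant nodes; hostnames here are the demo's.)

```shell
# On each secondary Gitaly node in the demo cluster:
sudo gitlab-ctl stop gitaly

# On a Praefect node, follow the health checks while the node is down:
sudo gitlab-ctl tail praefect

# Bring the node back when done:
sudo gitlab-ctl start gitaly
```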
A: Yeah, okay. And you can see here, this is the replication queue of jobs that need to be run. You can see we want to replicate to Gitaly two, and we have these pending jobs, but they can't run. You can actually see what's in there: we have a cleanup, we have an update, we have an incremental repack, and all of these were triggered by that one push.
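(A sketch of how to see that queue in the Praefect database, using the 13.x-era schema; verify against your version.)

```sql
-- Outstanding replication jobs, oldest first. The job payload records
-- the change type (update, cleanup, repack, ...) and the target storage.
SELECT id, state, job, created_at
FROM replication_queue
WHERE state NOT IN ('completed', 'dead')
ORDER BY created_at;
```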
A: And one thing that has come up is that Praefect won't immediately see a Gitaly that you bring back online as healthy. I think it needs three successful health checks before it'll say "okay, this one's definitely up," so there will be about a five-to-ten-second lag between Gitaly becoming healthy and it actually starting to take traffic again. That can be a little bit confusing.
A: And if we run the dataloss command... where is it? We see that we have our pending jobs; Gitaly two is behind now. To go to Praefect, let's try that again. Do we need to pass the partially-replicated flag? That's it.
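(The command in question, with Omnibus default paths; the flag shown matched the 13.x era and has been renamed in later versions, so verify.)

```shell
# On a Praefect node: list repositories whose replicas are out of date.
sudo /opt/gitlab/embedded/bin/praefect \
  -config /var/opt/gitlab/praefect/config.toml \
  dataloss -partially-replicated
```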
A: I think that's because here it's just validating that... yeah, here we go. Okay, yeah: Gitaly two is recognized as being behind by two changes or less. That's because we're just using the replication count here, so this can be a little bit vague. We don't tell you "you are behind by exactly n changes"; we're just saying "you're roughly two changes behind, probably." So it can be a little bit frustrating, because you don't really know how many changes have been missed.
C

A: There we go, okay. And now let's run that again, and now everything's great. Okay, all right! So now let's take down the primary, which should still be Gitaly three. Yeah, okay. So really, there is no convenient way other than Grafana to find out which node the primary is. I don't believe we say it in the logs anywhere; it's kind of a pain. We just assumed Grafana usage, so we should definitely recommend customers set it up. You know, it's built in, so it's not exactly a big deal.
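(One fallback if Grafana isn't available is the election state in the Praefect database. My sketch; the per-shard election schema below is the 13.x-era one and changed in later releases, so check the schema first.)

```sql
-- Praefect database: which Gitaly storage each Praefect has elected
-- as primary for each shard.
SELECT shard_name, node_name, elected_by_praefect, elected_at
FROM shard_primaries;
```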
A: All right, let's just watch Grafana here for a minute, because we should see a change here in the flapping graph. Basically, what this graph is saying is: are we switching between primaries rapidly, do we see any kind of instability there? Okay, so now Gitaly two is primary there. Let's just refresh this. Yeah, okay: two out of three Praefects recognized that Gitaly three is down. Okay, now all three Praefects have switched over to recognizing Gitaly two as the new primary.
A: So now let's just navigate around and make sure we can view things. Okay, so we've successfully handled the old primary going down. Dataloss shows Gitaly three is now behind, so we've seamlessly handled that, and then, if we restart...
A: Yeah, here it is, data recovery; 13.4, I guess. I've modified this config slightly. Yeah, okay, so we've fixed that; Gitaly one is back up.

C

A: Okay, yeah, let me clear that so it's a little clearer. All right, so you can see that all three repos have the same HEAD for the master branch. Okay, right. Now, I was talking about the replication and reconciliation scheduling interval. That's configurable: I set it to 30 seconds just so it would be a little bit faster in the demo, but I think by default it's five minutes.
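(That interval is set in the Praefect section of `gitlab.rb`; a sketch using the 13.x-era setting name, so verify for your version.)

```ruby
# /etc/gitlab/gitlab.rb on each Praefect node: run the reconciler every
# 30 seconds, as in this demo, instead of the roughly five-minute default.
praefect['reconciliation_scheduling_interval'] = '30s'
```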
A: So just one thing to be aware of: if your server goes down and then you check it immediately, it may still not be up to date, simply because Praefect probably has not actually told it yet to go and make sure that's been fixed. So okay, there's that, and replication jobs...
A: Now, that's probably failing over to the primary right now, yeah. So you will see some transient failures like this as things are getting back up to date. Hopefully... actually, yeah: we can't even come to a quorum right now, because we're down two servers. Okay, so we have no primary, all right. So let's just... yeah, all right. So that's not a great error.
B

A: Yeah, we'll test that next. I'm just surprised that we're even trying to reach out here, because I'm not actually sure why we're doing that.
A: Well, so Gitaly doesn't force an election; that's Praefect. That's right, yeah. So, you know, Gitalys do talk to each other, like if we need to replicate something. Let's check PostgreSQL. Yeah, so we don't have any pending replication jobs, so I'm not actually sure why Gitaly is trying to go out to its colleagues.
C

A: And that's .22, which I believe is... oh, that's right, but .60 is Praefect one. Okay, yeah, so we're getting something in from Praefect one: health checks, right, that makes sense. Must have typed that wrong. This is an interesting error. I don't think it did, since it works. That's weird; let's just look at the config.
A: Okay, so the primary is up and things are still viewable. It's a little surprising to me that it broke completely, because the secondary that we had was still in sync with the primary. All right, and things are still writable with the primary up. Interesting. Let's do...
B

A: Yeah, yeah. So I think that kind of makes sense, right? On the one hand, you can't establish a quorum about which ones are healthy, because there's only one server. On the other hand, Praefect knows that all three of these are up to date, so if it only has one healthy server, you'd think it would be smart enough to just say, "well, okay, I can just take this one that's left and use it." But it's probably more complicated than I'm making it sound.
A: Okay, all right. So the primary goes down; we can still make changes. Let's bring them back. I have a replication job, strong consistency, distributed reads, yeah. All right, so let me ask: what's unclear? Anything that seems vague or not obvious to anyone?
A: No, they're really vague. This is something I need to add; I've been meaning to. Let me just throw this in the doc, I guess. Where'd it go... here it is.

C
A: Okay, maybe I can write now? Okay, now I can write. Anyway, this MR I just pulled up (actually, let me share my screen): it's been in the works for a long time, like I said, but my assumption was that by the time I went in and made this change, it would already be in, and so there'd be no reason to update the docs to show how the current networking works. But yeah.
A: Yeah, I mean, it's in text; we talk about it, but we don't diagram it. It's definitely something we should rectify, yeah. So, you know, if you're using a cloud service it's not really necessary, right, because you only have one Postgres you're going to hit. But if you're using, say, an Omnibus Postgres, then yeah, it would make sense to put PgBouncer in there, for sure. All right, yeah, good question, Ethan. Any other questions?
C

A: So far, so good. Yeah, all right, that's errored. So this time we elected Gitaly three as the primary; last time this kind of just dropped out entirely.

B

A: We didn't have a primary; weird. Yeah, all right, so this is more like what I would expect.
A: Neat. Yeah, we could try to really mess with it now. So let's stop three.

C

A: Okay, although I think that is the correct data.
A: And I think it'll stop us from writing here. Maybe I'm wrong, though; let's find out. Yeah, all right, let's give it a minute to recognize that Gitaly one is the new primary. So, like I said earlier, and like you've seen, this transition between nodes is not particularly smooth, and part of that is just Praefect waiting to get multiple health checks before it says, "yeah, definitely go to this one."
C

A: Oh, I guess my Do Not Disturb just turned off. All right, let's also turn off Gitaly one again and turn on Gitaly two, the previous primary, and see if that helps.
A: What's that address? I don't actually see it in my list of addresses; I don't know what that is. Oh, it's the load balancer! Okay, yeah, there we go. All right, that makes sense. So what we're calling here is the Praefect load balancer, and that's what Rails sees. Let me pull up the config real quick.
A: In terms of what you see from Rails, in git_data_dirs we just have a Gitaly address, right? Rails just thinks this is another Gitaly server. It doesn't know that it's Praefect; all of that is abstracted away. And this part here is just for Prometheus; it's not something the Rails application itself is aware of.
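(On the Rails side, that is just an ordinary storage entry pointing at the load balancer; a sketch, with the address and token as placeholders.)

```ruby
# /etc/gitlab/gitlab.rb on the Rails node. "default" looks like one
# ordinary Gitaly storage, but the address is really the load balancer
# sitting in front of the three Praefects.
git_data_dirs({
  "default" => {
    "gitaly_address" => "tcp://praefect-lb.internal:2305",
    "gitaly_token"   => "PRAEFECT_EXTERNAL_TOKEN",
  },
})
```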
A: As expected, two is the primary and we are in read-only mode. Okay, good, that's what I expect, because this one is not up to date with the primary. Let's actually check the...

C
A: All right, let's just build it... that still fails. Okay, cool. Okay, let's bring them back up and make sure it actually recovers.

C
Yeah
yeah,
yeah,
and
so
now
now
we're
recognizing
that
q3
is
primary,
because
that
was
the
most
recent
change.
Okay,
all
right
all
right!
So
we're
pretty
much
in
time
all
right,
any
any.
Last
questions
for
anyone,
yep
all
right
cool!
Well,
thanks
for
joining
everyone!
I
hope
this
was
useful.