From YouTube: GitLab Geo - Self-service framework discussion
B
It'd be a very informal discussion of how the Geo self-service framework works to replicate and verify everything that it does. It kind of would be nice to have a whiteboard.
A
Would slides maybe help describing things? If not, don't worry, we can…
B
Like this little diagram here. So first, Geo: the idea is, you have your GitLab deployment, and you have maybe another location with many developers at that location.
B
You know, I guess we may as well go through an overview of all of the pieces. What's underneath here is the Postgres database, which is the main database. All the data in GitLab there is being replicated with just standard Postgres streaming replication to the secondary site, and then it's kind of Geo's job to replicate everything else, which is Git repos and blobs of various types.
B
That seems very simple on its face.
B
But when you get into how do you make sure that everything is replicated, and how do you make sure that there was no corruption on the way over, for example, or how do you make sure on the secondary site... like, say a sysadmin accidentally deleted a whole directory of blobs or repos: how do you know about that? Because it really needs to be there for people to use it and for disaster recovery. So…
B
It worked, and we also added support for uploads, which are, for example: if you make a comment in an issue or merge request and you attach an image, that's an upload. If you…
B
Things
like
that,
a
lot
of
things
fall
into
that
table
and
then
we
also
replicated
lfs
objects
and
drive
artifacts,
so
that
was
all
working.
Okay,
except
that
every
time
we
added
a
new
type
like
job
artifacts,
it
was
like
a
massive
effort.
It
would
take
a
single
developer.
You
know
six
months
to
accomplish.
B
And meanwhile, we also added verification for Git repos, for project and wiki Git repos, and what that would do is it would take all of the…

B
We don't have everything on the secondary now... that's kind of... yeah.
B
So if we go into what the structure of a Git repository is: the refs are pointers to…
A
As an example: if there was a commit, there would be a ref for that commit. Would that be accurate? Or is there a ref for a commit, a ref for a comment, etc., etc.? Or have I misunderstood the concept?
B
So, you know, now you're testing my Git knowledge, so I'm just trying to remember what it's like. I think it's more like: say you have a Git repo with a whole bunch of commits, but only one branch, master or main, same name. And so what you're going to have in the structure of the .git directory is, you're gonna have…
B
So that's a good enough way to spot-check something. The refs are not enough to know that nothing is corrupted, but it's a pretty good way to say that if it doesn't match, then you definitely don't have everything. Gotcha. Yeah, so anyways, that verification piece was added, and as part of that, you know, just managing the process of: okay, checksum all of the repos on the primary, checksum all of the Git repos on the secondary.
B
…Which has tables like the project registry. So for every project, it's going to have a corresponding row in the project registry table on the secondary, and it's going to have data like: is this synced, or did it fail to sync? When was a sync attempted? If it failed, when should we retry a sync? When was verification attempted? Did verification fail? If so, when should we retry verification? All that kind of stuff. And there's a table for each: for projects, for uploads, for LFS objects, for artifacts, in the Geo tracking database.
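To make that shape concrete, a tracking-database registry table along these lines would hold the per-resource bookkeeping just described (a sketch; the column names are assumptions based on the fields mentioned, not the exact schema):

    # Sketch of the per-resource bookkeeping the Geo tracking database holds;
    # column names are illustrative assumptions based on the discussion above.
    class CreateProjectRegistry < ActiveRecord::Migration[6.1]
      def change
        create_table :project_registry do |t|
          t.integer  :project_id, null: false    # which project this row tracks
          t.integer  :state, default: 0          # is this synced? did it fail?
          t.datetime :last_synced_at             # when was a sync attempted?
          t.datetime :retry_at                   # if it failed, when to retry
          t.integer  :retry_count, default: 0
          t.datetime :verification_started_at    # when was verification attempted?
          t.text     :last_sync_failure          # why the last sync failed
        end
      end
    end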
B
For Git repos, the basic way that something is replicated to the secondary is that the secondary actually does something similar to what a user would do.
B
It
does
a
git
fetch
against
the
primary
site,
for
that
particular
git
repo
and
it
authenticates
with
jwt
authentication,
which
I'm
not
a
security
expert.
But
it's
it's
that
that
piece
was
needed
because
it's
like
it's,
not
a
user
on
the
secondary
doing
a
get
fetch.
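Roughly, the exchange might look like the following sketch. The token scheme, claims, and endpoint here are assumptions for illustration only; the actual Geo request signing is GitLab-internal:

    require "net/http"
    require "jwt"

    # The secondary signs a short-lived token with a secret shared between the
    # sites, instead of presenting user credentials.
    shared_secret = ENV.fetch("GEO_SHARED_SECRET")  # assumed way to obtain the secret
    token = JWT.encode(
      { scope: "geo_repository_fetch", repo: "group/project", exp: Time.now.to_i + 60 },
      shared_secret,
      "HS256"
    )

    uri = URI("https://primary.example.com/group/project.git/info/refs?service=git-upload-pack")
    request = Net::HTTP::Get.new(uri)
    request["Authorization"] = "GL-Geo #{token}"    # header scheme is an assumption
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }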
A
So, if I may: the secondary fetches all of these individually. So it'll get a list of objects, blobs for example, that the primary has, into its tracking database. Then it knows what it has synced and what it hasn't, and then it will fire off jobs to fetch the missing ones.
A
Let's take blobs as an example. To go fetch those, will there be a separate request for each item from the secondary? It wouldn't be like: I want these items as a collection? It would be individually firing off requests... yeah, yeah.
B
The nice thing with doing the Postgres streaming replication, that kind of underlying layer, is that the primary can send data to the secondary just through that, so we don't have to make calls for absolutely everything. So yeah, the primary will just checksum things and have it stored in the database.
B
Yep, and the secondary will see that. So that kind of gets into backfilling versus events... but yeah, do you have any other questions?
A
I think you've touched on it, so I'll wait for that topic to be discussed before I ask any questions. So yeah, let's get into it.
B
So, let's just go with... now we're going to just talk about the self-service framework and how that works, because we're just going to be talking about the logic, just like the logic of how we do this and do that. And we're migrating projects to the SSF currently, so we're pretty far along, and we don't…
B
Yeah, so there's a lot of moving pieces, so we'll start with kind of how the development of the SSF started and how it grew. So the beginnings of it were…
B
We know that we want the secondary to do things responsively. So if something changes on the primary, we want it to be reflected on the secondary as soon as possible, and that means that we really shouldn't be doing it with some kind of background process churning through every single thing looking for what needs to be synced, because that's too slow. So…
B
So we need to implement some kind of eventing system, and there are so many ways to do it. One of the choices that we made was, one, to not introduce new infrastructure, like RabbitMQ, you know, things like that... but it certainly was brought up, because it was built for that kind of thing.
B
We just built it on top of Postgres. We already had some kind of architecture doing that for the legacy logic, so we kind of just built on top of that.
B
Yeah, there should be a very high bar for introducing new architecture, new infrastructure. At GitLab, you know, one of the values is boring solutions, and I think that makes sense for our purposes, and it certainly has been working so far. Okay, so…
B
…And LFS objects. So we knew we wanted to make everything that could be reused, reusable. So…
B
So there is a document with kind of an overview of the self-service framework. Now, it hasn't been updated a lot in recent months, but it does still basically work the same, for the most part, yeah. So maybe before we start, it is worth just briefly talking about the names of things, because many of them are overloaded.
B
I mean, it was 'a model' before, because that was kind of confusing, so I'll say 'a resource': a resource is the thing that you want to replicate, basically. A data type is... okay, so we've kind of done a terrible job of sticking to this terminology.
B
Okay, a replicable: that's the thing you want to sync. It's a resource that Geo wants to sync. So…
A
Gotcha. So there isn't a data type that is a database today, but we could have something like that in the future?
B
Yeah... well, I don't think so, because for database kinds of data it'll always be different. If we were going to make the SSF handle some kind of thing like that, it would have to be like: for some reason you've got, like, a hundred container registries or something in GitLab. I don't know why it would be like that, but you know, otherwise I don't think we're gonna…
A
I don't quite follow, but we can come back to that. I was just curious, because I totally understand the Git repositories, the blobs... the database, obviously there is a Postgres database, but to me it feels like the Postgres database forms part of the SSF framework, rather than the SSF framework managing the Postgres database. So yeah.
B
That's definitely accurate. And, you know... so the SSF doesn't handle the container registry, but the container registry is basically a database.
B
But anyway, moving on: a replicator, that's kind of more for when you're in the code. If you're looking at the Ruby code, you need to know that the replicator is the object that knows how to replicate something.
B
You'll call a method on the replicator to fire events, which is from the primary, and then on the secondary you'll call a method on the replicator to consume those events.
B
Yeah, so because of this whole eventing idea, in order to keep the secondary responsive, the primary needs to, somewhere, somehow, when something's changed or created…
A
…Fire... put it into the database, which gets sent across to the secondary. So it's the primary creating the entry in the database, that's the producing part, and then it gets replicated. Then the consumer would be the secondary, when it actually sees the replicated data at that point. Gotcha, just wanted to make sure I understood the model. Thank you.
B
Yeah. Oh, and initially we tried to write this so that it could swap out the underlying eventing system if needed. I think it's turned out that we won't need that, but yeah.
B
Okay, that's something to know: the old code that we used to carry these events from one side to the other is not described in here, but remind me to talk about that later.
B
So now we're going to start looking at the Ruby code, the DSL that we have going on in the codebase.
B
Yeah, so to replicate something, you have to write a replicator for it. So that's going to be, like, a PackageFileReplicator class, and ideally you don't need to write a whole bunch of specific code yourself, because, as we know, downloading a package as an HTTP file transfer is the same process as downloading an upload or a job artifact. And, you know, this object…
B
If
you
call
dot
file
on
this
thing,
then
that's
going
to
give
you
that's
going
to
give
the
strategy
this
thing
that
it
needs
carrier,
wave,
uploader,
okay
or
the
model
here
packages
calling
colon
package
file
is
the
active
record
model
that
represents
the
packages
number
sport
package,
underscore
files
table.
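Roughly, the replicator being described looks something like this (a sketch modeled on the public SSF docs; exact class and method names may differ by version):

    module Geo
      class PackageFileReplicator < Gitlab::Geo::Replicator
        # Reuses the shared blob replication logic (HTTP file transfer).
        include ::Geo::BlobReplicatorStrategy

        # The ActiveRecord model backing the packages_package_files table.
        def self.model
          ::Packages::PackageFile
        end

        # Gives the blob strategy the CarrierWave uploader it needs.
        def carrierwave_uploader
          model_record.file
        end
      end
    end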
B
And
so
yeah,
all
the
all
the
reputation
logic
is
hidden.
You
know
you
only
need
to
to
say,
like
the
actual
specifics.
A
So you're inheriting from the superclass, as you say. All…
B
Yeah, yeah. And here, in that ActiveRecord model, the package file model, we tell you: you've got to include this thing, because that's going to provide the ability for, like, the replicator to call certain methods on this model, and also to say that the replicator that replicates this thing is PackageFileReplicator.
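The model side then looks roughly like this sketch (names modeled on the SSF docs; the include and the with_replicator declaration are the two pieces just described):

    module Packages
      class PackageFile < ApplicationRecord
        # Provides the methods the replicator needs to call on this model.
        include ::Geo::ReplicableModel

        # Declares which replicator class replicates this resource.
        with_replicator Geo::PackageFileReplicator
      end
    end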
B
So,
for
example,
if
you
have
in
if
we're
talking
about
rails
code
now
I
mean
you
know,
because
this
is
active
record,
you
can
do
package
file
dot,
find
that
id
you
can
call
dot
replicator
on
it,
and
it
gives
you
an
instance
of
package
file.
Replicator,
okay
and
package
file
replicated.
Oh
sorry,
go
ahead.
B
It's
the
id
in
the
table
of
that
row
a
particular
row,
okay,
so
like
in
in
in
your
in
your
browser
here,
for
example,
like
issues
number
11,
that's
the
iid
so
that
that
item.
B
And here it says you can use the replicator to generate events. So again, if we're on the primary: if you call publish created event on a replicator, like, if you got it by doing this... you have the package file model instance and then you do .replicator, and then you call publish created event, that will create an event in the database that says 'oh, this package file number four was created', so that the secondary can act on that event.
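Put together, that console flow is roughly the following (the method name follows the phrasing used in the discussion and may differ slightly in the code):

    # In a Rails console on the primary:
    package_file = Packages::PackageFile.find(4)
    replicator   = package_file.replicator   # => a Geo::PackageFileReplicator instance
    replicator.publish_created_event         # records "package file 4 was created"
    # ...as an event row in the database, for the secondary to act on.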
A
Say on the primary, initially, somebody added a whole bunch of files, all right... blobs, I'm going to stay generic: a hundred blobs.
A
Would the producer generate a hundred events immediately, or is there a throttling algorithm there to not overwhelm... for the secondary not to overwhelm the primary? Or does the secondary manage that on its own? What mechanism stops that from happening?
B
So, like, if you push to a Git repo…
B
I believe that's true in the case of Git pushes and Geo, like, repository created events being created... that's done asynchronously, so…
A
Yep, and the queue is serviced... but something's dispatching the work, right? So it'll pick... So I guess the Sidekiq queue is on the secondary side, right?
B
For creating those events on the primary, the producer side, right... I believe that's done asynchronously, so you could end…
A
Gotcha. So, just to make sure I understand this correctly: when you commit a change or create... let's say you create a project. That initiates a Sidekiq job that's responsible for writing to the database, which then gets replicated to the secondary. That's the flow of events, so nothing writes directly…
A
The task of writing to the database gets enqueued onto a Sidekiq job and... okay, understood. I believe so, yeah.
B
Cool, yeah. So see, we've got this PostReceive worker... post-receive is, like, the main... the big thing that happens on a push. Post-receive is like one of the Git hooks. So this is a worker, so…
B
Yeah,
this
is
a
scientific
worker
or
or
a
sidekick
job
yeah.
Basically,
and
in
here
we
have
a
hook
into
the
korean
open
source
code
after
project
changes.
B
If
this
is
a
geoprimary,
then
call
this
repository,
updated
service
and
in
there
we
create
that
record.
Okay.
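A simplified sketch of that hook path, assuming the names mentioned in the discussion (not the exact code):

    # Inside the PostReceive Sidekiq worker, after project changes are handled.
    def after_project_changes(project)
      return unless ::Gitlab::Geo.primary?   # only the primary produces events

      # Creates the repository-updated event record in the main database;
      # streaming replication then carries that row to the secondary.
      ::Geo::RepositoryUpdatedService.new(project.repository).execute
    end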
B
So, like, let's say the primary Sidekiq is, you know, fast enough, so a hundred repository created events get created in the span of a second. Postgres streaming replication has to bring those rows over to the secondary, and then... I guess here's the 'what else': this is where I asked you to remind me about what happens.
B
Here's where we get into a detail of the event system. On the secondary there's a process called the Geo Log Cursor, and all this process is doing is watching those tables. There's actually one event log table, and many specific event tables that feed into it... they are tracked by that one event log table, yeah, okay. So it's watching the event log table, actually, and when there's a new reference that it hasn't seen before…
B
…It calls the specific event code, the event-processing code for that specific type of event, like repository created. So specific code is plugged in for each of those types of events, and in the case of a repository created event, what happens is it…
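As a rough mental model of that loop (a conceptual sketch, not GitLab's actual code; GeoEventLog, handler_for, and load_last_processed_id are illustrative names):

    last_id = load_last_processed_id

    loop do
      # Poll the event log table for rows this cursor hasn't seen yet.
      events = GeoEventLog.where("id > ?", last_id).order(:id).limit(1000)

      events.each do |event|
        handler_for(event).process(event)  # e.g. repository created/updated code
        last_id = event.id
      end

      sleep 1 if events.empty?             # polling interval is illustrative
    end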
B
Oh, so that's a little bit different from the SSF, although in the end it's the same. The SSF has one event type, because we're trying to get away from, like, every time you create a new…
B
Yeah, data type or event type, you need to, like, create a new table, and it references the Geo event log. That's, like, a whole bunch of overhead that's not helpful.
B
So in the SSF there's only one event type: it's just Geo events. It had to be generic.
B
So,
like
a
new
event
comes
in
like
creatively
down
to
an
updated
event
or
deleted
event,
then
it
creates
a
geo
event,
job
where
it
where
that
job
has
all
of
those
arguments.
It
says
like.
Oh,
this
thing
was
created
or
this
thing
was
updated
or
this
thing
was
deleted,
and
so
in
the
end
you
get
a
sidekick
job
that
will
do
a
replication
or
deletion.
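A sketch of that generic shape (names here are illustrative): one event table, one event type, carrying which resource changed and what happened to it.

    event = {
      replicable_name: "package_file",   # which kind of resource
      event_name:      "created",        # created / updated / deleted
      payload:         { model_record_id: 4 }
    }

    # On the secondary, consuming the event enqueues a Sidekiq job that either
    # replicates the thing (created/updated) or removes the local copy (deleted).
    Geo::EventWorker.perform_async(event[:replicable_name], event[:event_name], event[:payload])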
A
Or
whatever,
okay,
I,
and
if
it's
a
deletion,
it
would
need
to
do
that
locally.
But
if
it
was
a
additional
replication,
then
you
would
need
to
establish
a
call
to
the
primary
and
fetch
the
data.
A
Okay,
yeah,
oh
I
guess.
If
it's
an
addition
or
a
deletion,
it
would
need
to
go
if,
in
the
case
of
blobs,
it
would
need
to
go
bring
that
would
the
blob
have
the
same
reference.
If
it
was
the
it
was
updated,
it
would
just
have
the
same
references
to
indicate
that
it's
been
updated.
Yes,.
B
Yeah. So almost every blob is immutable, which is super convenient for Geo. Not just convenient, it's, like, the right way to do it. And so there's no update event for them. But we actually have some blobs that are mutable, and it's terrible, and I think it's avatars: like, if you update your user avatar... there's some bug around that currently, okay, because they're just updated in place.
A
Right,
you're
right:
okay,
yeah,
okay,
but
in
general
it's
immutable,
so
there
will
just
be
add
and
delete
as
a
a
a
a
a
group,
just
assume
that
okay
cool
and
so
the
sidekick
is
responsible.
Ultimately,
it
is
the
worker
that's
responsible
for
talking
to
the
primary
and
managing
the
download
of
the
gotcha.
B
And
you
know
one
of
the
one
of
the
things
is
like
if
you
like,
the
the
the
web
notes
right
like
the
the
ui
the,
but
it
has
short
timeouts,
because
if
you
don't
do
that,
then
you're
susceptible
to
ddos
attacks
and
all
kinds
of
problems,
not
religious
ones,
but
sidekick.
Can
you
know
it's
not
ideal?
B
A
Yeah, yeah, so that's okay: Sidekiq can keep that channel open. Or is this a problem that we face at the moment?
B
I mean, it is a problem if that is actually happening. If you get any kind of, like... you end up being very susceptible to, like, corruption and whatever, but, you know... it's that kind of thing.
B
Yeah, so there have been discussions recently around, like, establishing expectations for Geo in different scenarios, and this is one area where, yeah, we haven't really done a lot yet. In the case where you've got, like, a terabyte of data in your primary, and your secondary is on the other side of the world with a 10 megabit connection, you're just not gonna have a great experience.
B
It
may
never
catch
up
it.
You
know
it
may
not
be
technically
possible
to
catch
up
sure.
A
Sure, gotcha, okay. So I've heard Workhorse being mentioned. So what's... and I'm kind of going off reservation here, apologies for that... you've got Sidekiq, which manages, you know, writes to the database, also manages calls out to the primary from the secondary, and manages the downloads.
B
So if you imagine, like, the web UI of GitLab: those requests are served by Rails, but in between we have a component written in Go called Workhorse, and the reason is, there are a number of very specific types of requests that can be handled much more efficiently not by Rails.
B
So
like,
for
example,
if
you
are
doing
a
get
pull
of
a
huge
repo,
we
really
don't
want
that
request
to
be
served
directly
by
rails.
B
And
and
and
so
what
happens
is
workforce
is
able
to
talk
to
italy
and
bypass
rails,
I
mean
it.
Workhorse
also
is
able
to
talk
directly
to
the
rails.
App
to
say,
like
is
this
user
authenticated
for
this
thing,
but
when
it
gets
that
authentication,
then
they
can
fast
get
away
directly
for
okay,
right
and
and
you
bypass
rails
completely.
So
for.
A
…At the moment, Geo doesn't use Workhorse; it's just Sidekiq?
A
You've got the single... sorry, Mike, I'm going into details here, but you've got a single API; something intercepts the API call and decides whether the job... whether the endpoint is pinned to Sidekiq or Workhorse, kind of?
B
Not Sidekiq. So the Rails app is running in, like, Puma…
A
Yeah, so... okay, cool. So when you ask for the repo, it basically just hits the Workhorse server, and it just gets it from there. So the API hangs off Workhorse?
A
So, my question again: for different API endpoints, do they have different... are they backed by different services? Like, let's say a Git request goes to Workhorse, but let's say a blob or something like that, right: does that get serviced by Sidekiq in the end? When they come in, you've got the same API, but they've got different endpoints within the API. So I guess you're hitting a request to replicate Git repositories, or we get another request for blobs: are they all hitting Workhorse?
B
…Or what. Okay, so, like, if you were imagining your typical GitLab deployment: nginx first, and then from there it gets to Workhorse, right, and then Workhorse will split off a number of things. Like, it'll handle Git requests itself.
B
Well, there's a couple of ways that it does that... but anyways, yeah. So it handles those, and then, if it's a blob download request…
B
…We wrote the API endpoints in Rails. So if it's not... if it's not stored in object storage... actually, I'm wondering now, maybe it is limited by a…
A
Let's go on... but that's a good insight. I've kind of learned a lot with regard to Sidekiq and Workhorse here, so thanks for that. But let's continue with the SSF; sorry for the digressions.
B
No, I mean, that is part of it... the endpoint for downloading blobs from the primary: it's the same, one endpoint for all the different kinds of blobs, and it's kind of all written within the blob…
B
So, outside of the event system, there's a couple of things that need to be handled. One is backfilling: if you have a large GitLab deployment and you've got a ton of data from the users already using it, and you're not using Geo, and then you add your secondary site, yep, then everything that you have needs to be replicated to the secondary, regardless of whether someone is actively updating those things or whatever.
B
So
that's,
that's
that's
one
thing
that
needs
to
happen.
Another
thing
that
we
learned
with
geo
is
that.
B
We can't depend on the eventing system one hundred percent, and that's for more than one reason. One is, like, if somebody does a push on the primary and there's some kind of transient infrastructure problem, or there's a bug in the current release where the post-receive worker, for example, raises under some specific condition before the Geo repository created event is created... exactly, whatever, there could be many reasons. Or a Sidekiq job is just lost, because that can happen. And…
B
If
you
do,
then
you'll,
eventually
see
or
you'll
have
customers
seeing
like
hey
this
thing
supposed
to
be
synced,
but
it's
not
synced
like
and
now
now
what
we
have
to
kind
of
assume
that
events
are
not
reliable.
Okay,.
B
So that's happening, and then there's another job that is checking only the registry table, because you don't want to do big cross-database queries.
B
I
see
you
don't
want
to
do
prosthetic
expiries
in
general,
like
they're,
expected
yeah,
they're,
expensive,
they're,
slow,
they're,
yeah,
there's
a
number
of
things
that
can
go
wrong
like
but
anyways,
so
there's
another
job
that
is
constantly
just
looping
over
their
registry
to
people.
Well,
now,
I'm
sorry
not
looping.
Under
the
registry
tables,
it's
doing
queries
on
them,
because
now
we
can
do
full
queries
like
give
me
everything
that
is
pending.
Sync,
that's
never
been
synced,
okay
in
queue,
jobs
give
me
everything
that
has
failed.
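A sketch of those registry-only queries (scope and worker names here are assumptions): because all the sync state lives in the tracking database, no cross-database join is needed.

    never_synced = Geo::PackageFileRegistry.where(state: 0)   # pending, never synced

    failed_and_due = Geo::PackageFileRegistry
      .where(state: 3)                                        # failed before
      .where("retry_at <= ?", Time.current)                   # and due for retry

    (never_synced.to_a + failed_and_due.to_a).each do |registry|
      Geo::SyncWorker.perform_async("package_file", registry.package_file_id)
    end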
B
So that's a consideration also: even if the deployment has tons of Sidekiq workers, you don't necessarily want to just saturate all of them with Geo sync jobs at the same time.
B
…Are set when you create a new secondary, and that default is, like, totally inappropriate for everything except for the specific size of deployment that it is good for, right, okay, yeah. So, like, it's inappropriate for the GDK; in fact, I always, like, go in and put, like, yeah, something like this, I see, or for the Geo tests. So yeah, that's a problem that we have, but you can tweak those, you know. Yeah, that's easily worked around if you have a problem like 'oh, I've got, you know, too many jobs'.
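For example, tweaking those limits for a small setup might look like this hypothetical Rails console sketch (the setting names are assumptions and vary by version; check the current Geo docs):

    node = Gitlab::Geo.current_node
    node.update!(repos_max_capacity: 2, files_max_capacity: 2)  # shrink for a GDK-sized site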
B
This we can... yeah, we can do tests on a specific setup that we can define, and we can say, like, it's on GCP. We can... I just added this: take a measurement of bandwidth between sites. So we can tell people, like: okay, if you've got A in GCP us-east and B in GCP europe-west, of these sizes, these reference architectures of GitLab, and your bandwidth, you know, is approximately like this…
B
This
is
the
behavior
that
you
can
kind
of
generally
expect
from
gm.
Now
it
depends
on
like
so
many
things
like
how
many
users
are
currently
using
your
primary
and
secondary.
What
are
they
doing?
B
Cool, yeah. Currently there's no guidance... there's, like, no way to give it.
B
Originally
geo
mostly
was
developed
for,
like
we've
got
a
single
node
for
each
site
and
you
know
maybe
it
was
like
not
a
small
node,
but
I
think
those
defaults,
probably
getting
out
of
that
since
you
set
up
on
the
bus
gitlab
on
both
sides
on,
like
a
you
know,
healthy
sized
node,
then
the
defaults
of
25
10
10,
like
you'll,
be
okay,
okay,
okay,
cool.
A
All right, so we discussed the concurrency settings with this.
A
Yeah, so, the backfilling: let's say you're syncing a large number of jobs; let's say you just started, and it's a huge repository or a large number of projects, right. Well... I think you've answered that question: the concurrency throttles you from overwhelming the primary, or overwhelming the secondary... overwhelming itself. So I think you've already answered that question there. In terms of information with regard to the sync activity: what type of information do we have with respect to, let's say, a repository sync?
A
What type of information do we have in relation to where the primary and secondary are with respect to each other? You know: how many outstanding requests, etc. What's the deviation between the two? How many are in flight, things like that? Do we typically have that type of info? This relates, or doesn't relate, back to the question, yeah. And you mentioned there was... is it this information that we have?
B
Fields, so we can talk about some of these. So, like, state: that's the big one, where it can have three states... no, another one, I think four states. Zero is pending; I mean, like, it just needs to be synced, it didn't necessarily fail before, yeah. And then one is started... because of race conditions, okay, a record can get stuck in one. Two is success, so it was successfully synced before. And three is failed. And those are... this is the big one, the main field to look at if you're wondering.
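In other words, the four sync-state values as described (a summary sketch; the registry models define these via a state machine):

    STATE_PENDING = 0  # needs to be synced; hasn't necessarily failed before
    STATE_STARTED = 1  # a sync is in progress; races can strand a record here
    STATE_SYNCED  = 2  # the last sync succeeded
    STATE_FAILED  = 3  # the last sync failed; retry fields govern the next try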
A
That's fine. If there's a single file, then I'm happy to kind of eyeball it, just to get an idea of the different values that these could have.
B
Yeah
so,
and
this
replicable
registry
is
included
in
all
of
the
registry
models.
B
Yeah
so
state
machines
provided
by
the
extinguishing
gem
to
give
this
give
us
this.
B
Dsl
domain
language
around
managing
states
so
like,
for
example,
if
you
in
the
code
remove
a
registry
to
start
it,
then
we
set
last
sync
dat
field
to
time:
dot
correct.
I
see
so
that's
that's
setting
like
when
did
it
start
and
then,
if
you
move
it
to
pending,
then
we
zero
out
the
failure.
We
try
out
and
retry
count.
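An assumed shape of that DSL in the registry models, based on the transitions just described (the exact GitLab definition may differ):

    state_machine :state, initial: :pending do
      state :pending, value: 0
      state :started, value: 1
      state :synced,  value: 2
      state :failed,  value: 3

      before_transition any => :started do |registry, _transition|
        registry.last_synced_at = Time.current  # record when the sync began
      end

      before_transition any => :pending do |registry, _transition|
        registry.retry_at    = nil              # reset the retry bookkeeping
        registry.retry_count = 0
      end
    end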
A
And you've got a last sync failure field; I'm assuming that's a free-flow text field?
B
But
yeah
so
they're
totally
inconsistently
set,
but
what
I
want
to
get
to
is
a
place
where
we
have
a
text
field
with,
like
maybe
4,
000
characters
allowed
and
we
drop
in
the
whole
back
trace
because,
like
it's
not
enough
to
just
give
me
a
one
little
last
error
message.
A
That was gonna be my question: is it possible to isolate the back... you know, the failure log messages for this particular event? It sounds like there is, which is great news.
B
Yeah, yeah... cool. All right, the number of places where the last sync failure is set... let me see... maybe not that many, so you'll see it below.
A
So, one other question: when you look at the logs, I'm assuming this information comes out of Rails? Is it the Rails logs that this information is coming from, yeah, yeah? If you look at the console, you wouldn't see this output, right... the Rails console, which... so…
A
When syncing fails, for example, and the backtrace... where would you find that information? Before we surface it anywhere else, where would I normally go to look for this information if I was a sysadmin?
B
The Sidekiq ones. So if you, like... if you go onto the Sidekiq node, if that's a good name, and you tail the Sidekiq logs, then I think you should see backtraces, or, you know, yeah.
A
And would it be possible to... I'm assuming Sidekiq does more than Geo work, right? Would it be possible to identify the Geo logs from there? Are there any kind of prefixes or anything that you could look for to isolate it, to filter on the Geo-specific logs?
B
Oh, you know what... I'm sorry, okay. So, if we... I might have misspoken about that. I think that's part of the problem with the way things are at the moment: if you transition something to failed…
B
Yeah, we need to double-check all that. Yes, because, yeah, we know, like, when there's a problem with syncing or something, we really need to see the backtraces.
B
So it helps if that can be found in multiple places. Even, like, if we store that and still raise, so that Sidekiq jobs are, you know, raising exceptions with backtraces, then Sentry tracks it; that would be good. That is something that we need to double-check on, what it does exactly, okay.
B
…After the call, it must be... okay, yeah, it must be swallowing it somewhere; I'm just missing it at the moment. Because, like, in order for failed to happen, right, it's got to be executed. So the error is not being raised... it's not being raised here. So yeah, I think it's being swallowed at the moment.
A
Okay... maybe that's something we could look at improving, like you say.
B
Awesome. Oh, so we were talking about the states... syncing.
B
I thought... oh no, yeah, okay. I'm working on this code at the moment, okay, and this is master, and it's not my code. So, yeah: after syncing, what happens is, we need to mark the verification state pending, so that, you know, regardless of whatever happened with verification before, we know that it just finished syncing, so it's probably changed, so we'll just say, like, this needs to be verified.
B
Yeah
and
it
doesn't
have
to
like-
we
don't
think
you
a
job
here
or
anything
for
gravitation,
because
it
really
doesn't
have
to
occur
that
the
point,
the
there's
already
a
verification
job.
That's
like
looking
for
things
that
are
hanging
just
like
the
backfill
one
or
something
it's
like.
So
anything
like
this
verification
editing.
Is
there
anything
about
verification,
that's
ready
for
retry.
B
So after sync, that happens, and I'm going to gloss over this other stuff. That's fine.
A
So it basically gets the checksum, which is synced across from the primary. The verification job runs a... is it checksums that we use for verifying? Yeah, yeah. So it's got the checksum from the original on the primary, because it came across in the streaming replication, and then it does it on the actual object that it's got a copy of, checks those, and if they match, we're all good, we're off.
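Conceptually, that comparison is just this (a sketch; the primary's checksum arrived via streaming replication, and SHA-256 here is an assumption about the algorithm):

    require "digest"

    def verified?(local_path, primary_checksum)
      # Recompute over the local copy and compare with the primary's value.
      Digest::SHA256.file(local_path).hexdigest == primary_checksum
    end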
B
Yeah
yep,
so
I
guess
we
can
just
get
into
verification
that
I
think
we're
people
there.
A
Yeah, and there was another one, related to... just bringing it up in case we forget it, Mike: it was, if the primary hadn't done the checksums. I think there was an issue if it hadn't done them, but it's been copied... how would... how is that even possible? But we can come back to that one. Right, let's talk about verification.
B
Right, well, okay, yeah. So, we've already covered the overview of verification: the primary basically checksums, the secondary then checksums and compares it. If it doesn't match, then the secondary says: verification failed, sync failed, because apparently this thing is not the same.
B
So, interestingly, what that means is: failed verification on the secondary side is not a valid state, because it should always immediately transition to failed sync. And if something's failed sync, verification doesn't come into play; it's not pending, it's not success... but it makes sense: we record the message for the verification failure, but, like, it's not really 'verification failed', it's really 'sync failed'.
A
So... but that bit of information might be useful for someone troubleshooting it, right? So we would hold that information, that it has failed, because, I guess, if it's happening fairly regularly, then it's more of a systemic issue, rather than 'oh, that just got corrupted on the way over' or something like that. If everything's failing verification, then there's probably something bigger going on.
A
They're just seeing syncing fail, but actually it's not just syncing: things have come across, but look, the verification is failing. So I guess, as long as the information is there, they can decipher what's gone wrong; but if we're hiding that information, I think that could be more difficult for someone to troubleshoot.
B
Who's... yeah, okay. So, verification checksum is 'what did I checksum on this site', and if a mismatch was detected, then this becomes true instead of false, and the primary side's checksum gets recorded in here. Right, okay, yeah... that's not totally obvious.
B
Yeah. Also, currently, when verification fails, it goes straight to failed sync.
B
Actually, we have some other fields here, so, to mention: force to redownload. For Git repos, there's, like, two approaches to syncing... well, now there's technically three, but: git fetch, versus tarballing the Git repo on the primary side and having the secondary download it from an endpoint.
B
And that's called snap... we call it snapshotting, and that's what we use during a redownload, okay. And that one... that one's come up, you know, not infrequently with customers, because it's possible for one approach or the other to be not working while the other one is working. And so force to redownload comes into play where, when you go to the UI on the secondary and you click the redownload for a particular thing…
B
Okay, yeah. Let me just…
B
But the problem is, like, too many times we ran into the case where, for some reason... well, there was a bug, there was one for a long time: redownload... the snapshotting way wasn't working, but git fetch was okay. So it's, right, extremely painful for customers to be like: you know, resyncing keeps trying to happen, but it just fails every time; but then we go in on a Rails console and make it sync with a git fetch, and it works.
B
I put that down, okay, good, yeah. So that's because projects are using the legacy stuff, but for framework repositories there are some services used by the SSF to sync them.
B
So, one thing to mention... oh, there's a big thing to mention: verification state. There's a verification state for resources on the primary side as well, right, to manage... we're gonna track them: failed, didn't try…
B
That's fine in many cases, but we have some cases, like job artifacts, where there are, like, over a hundred million job artifacts on GitLab.com, and so, like, extending the job artifacts table is bad for performance.
B
Bad form... well, mainly performance, but, like, you know, there's more than one reason why that is the case. But anyways, it's a huge table, so we don't really want to add these fields, especially if they're only, like... like, on GitLab.com they're not using Geo, so they're not even relevant.
B
…Or self-managed customers might not be using Geo. So anyways, I think it was for job artifacts specifically that we added the verification fields in a separate table.
B
That adds a whole bunch of complexity, but it was seemingly necessary, and one of the bits of complexity that came out of that was this worker. And so this worker iterates over the table, the source table…
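A sketch of that separate-table idea (the table and column names here are assumptions): rather than widening a huge source table, the verification columns live in a parallel table keyed one-to-one to it.

    create_table :ci_job_artifact_states, id: false do |t|
      t.bigint   :job_artifact_id, primary_key: true, null: false  # 1:1 with the artifact
      t.integer  :verification_state, default: 0, null: false
      t.datetime :verification_started_at
      t.binary   :verification_checksum
      t.text     :verification_failure
    end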
A
If you're pointing to another window, I can't see that... what can you see? I see your browser, so I can see the Geo reference architecture. Oh…
A
There we go, now we can see it.
B
All right, here's my IDE, which I was pointing to for a long time in the other discussions. The verification state backfill worker iterates over the resources' verification states…
A
Mike... oh, I was wondering if you…
B
Anyways... so the verification state table is separate for, well, especially these big tables, and I think we decided that this is just the way going forward for everything.
A
How about we do a follow-up session on this, Mike? I know we've been going for some…