From YouTube: 2020-11-17: Experiments with Git object offloading to object storage and partial clone
A: That's got it. Well, there we go. Welcome to the partial clone demo. The script is provided by Chris, and we've iterated on this a couple of times.
A: The idea is to use partial clone to basically scrub repositories on the server side of some data, and we're mainly thinking about large blobs right now. So let's say someone pushes a blob of 100 megabytes. We store it on disk right now, but that is expensive. If we could upload that to object storage, that would be great. Cool, let's get started.
A: Like this, detach, and go.
A: Okay, I was going to copy this slide.
A: Those config options are set by default and, funnily enough, I just noticed that for HTTP they're set in our code base, and for SSH they're set in Rails. Not great, but it works, I guess.

A: All right, and now we're gonna clone from the just-cloned repository, and we're using --no-local, so it doesn't hard link on disk — it actually copies the data and pretends to be a real remote.
A: Cool. Yeah, so now we might not have all the data. Now we're gonna show how many blobs we're missing. It should be 23, and it's kind of cool, this one-liner.
A: So it is git rev-list, which is super nice as a tool. It has 25 flags, which are all pretty cool, especially if you combine them. Now, we just list all the objects and, if they're missing, we print them, and then we pipe it to perl just for the regex match: if a line starts with a question mark, we know the object is missing. Then we pipe it to wc -l, so 23 lines were output, so we're missing 23 objects, which then implies...
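The one-liner can be sketched end to end on a throwaway repository. The names below (server.git, client1) are made up for the sketch — the real demo cloned a full gitlab.com repository — and the filter here is blob:none rather than the demo's size limit, just to guarantee a missing blob:

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"

# Throwaway "server" repository with a single committed file.
git init -q server.git
cd server.git
git config user.name demo
git config user.email demo@example.com
git config uploadpack.allowfilter true   # let clients pass --filter
echo hello > file.txt
git add file.txt
git commit -qm 'add file'
cd ..

# Partial clone that omits every blob; --no-checkout so git doesn't
# immediately fetch the blob back for the working tree.
git clone -q --no-local --no-checkout --filter=blob:none server.git client1
cd client1

# The demo's one-liner: list all objects, print missing ones with a
# leading '?', and count them.
git rev-list --objects --all --missing=print \
  | perl -ne 'print if /^[?]/' \
  | wc -l
```

Here the count is 1 (the single blob behind file.txt); in the demo the same pipeline reported 23.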
A: If we need one of those objects, we need to go to the server to get it — for example if we do a checkout where we are missing those objects, those blobs. Now we're going to validate that the server has them all. The server here is a remote repo — a local repository on disk, which we cloned from with --no-local, but it was a full clone from the gitlab.com repositories.
A: For demo purposes. So the server replied first, and then 204, and if we curl this endpoint it's going to show what blobs it has. It should have one blob, and the hash should be equal to the one we just posted — there we go — and this is the size and, as you can see, it's over the 800 kilobytes.
A
So
it
is
considered
a
big
blob
for
the
purpose
of
this
demo.
We're
gonna
hijack
my
path.
Well,
it's
not
really
hijacking
but
we're
gonna
prepend,
my
path
to
use
utility
functions
in
the
test
directory.
A: If I'm going too slowly for you both, you can just tell me, but I'm trying to make it clear for anyone.
A: Great question — it's not. So here we were... wait, where's my mouse... all right, yeah, here. So we clone into client1, right? So what happens first is: when you're reading objects, we're receiving, yada yada — and then git does exactly the same loop again with two objects, and these two objects are in the current checkout.
A: So first it tells the server: give me everything except all the blobs larger than 800 kilobytes. And then git actually needs two of those objects, and goes to the server again and says: hey, you know what, those two SHAs — I want those. So, on demand, it's actually doing a git fetch with two SHAs. This is the second fetch right here. So it went exactly as you thought it went, but then it was clever enough to just fetch those objects it was missing.
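That two-fetch behavior can be reproduced on a scratch repository: one fetch with the size filter, then a second, transparent fetch for the big blob the checkout needs. All names below are invented for the sketch, and the two uploadpack settings stand in for whatever the real server had configured:

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"

# Scratch "server" with one blob over the 800 KB limit and one under it.
git init -q server.git
cd server.git
git config user.name demo
git config user.email demo@example.com
git config uploadpack.allowfilter true          # allow --filter
git config uploadpack.allowanysha1inwant true   # allow on-demand blob fetches
head -c 1000000 /dev/urandom > big.bin          # ~1 MB, over the limit
echo small > small.txt
git add .
git commit -qm 'big and small'
cd ..

# Fetch 1: "give me everything except blobs larger than 800 KB".
# The checkout then needs big.bin, so git runs a second fetch,
# on demand, for exactly that blob.
git clone -q --no-local --filter=blob:limit=800k server.git client1

# The working tree is complete anyway: the blob arrived transparently.
cmp client1/big.bin server.git/big.bin && echo 'big.bin fetched on demand'
```

The filtered clone also marks origin as a promisor remote automatically, which is what makes the second fetch legal.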
A: Yeah, so... wait. Oh, the...
A: Yeah, the second one is done because git wants to do the checkout, notices that it's missing blobs — and then doesn't do it... no: notices the missing blobs, and then does do it. So what it had to do is enumerate the objects; it wants two additional objects, and then it's going to fetch those. But what we could do is...
A: I'm just gonna press enter, and maybe I'm wrong and then we can all laugh about it, but I think this is how it works.
A: Yeah, okay, so everyone's back on board, like, this is okay. Cool. Then let's continue on — but it was a good question indeed, that was.
A: There you go. Cool. And this is also what I really like about partial clone: how transparent it is for the end user. So if we were to use this in the GDK, for example, where a program is cloning for you, or in CI.
A: And then we're gonna set this one remote as a promisor remote, and "promisor" in this context means: if I'm missing objects, I can iterate through all the promisor remotes, and any one of those I can just ask, hey, do you have this SHA? And this is used to differentiate remotes from one another. So let's say you have the same repository on multiple servers, and some of them do act as a promisor remote and the others don't — then this is still compatible throughout all those remotes.
A
All
right
so
yeah
a
cat
file,
which
is
a
git
plumbing
feature.
It
just
shows
you
whatever
is
in
the
hash
and
the
dash
b
means
print,
and
this
is
one
of
those
big
files.
The
hash,
it's
the
big.
The
hash
is
actually
this
value
because
we
already
uploaded
this.
We
must
be
missing
it
and
we
pipe
it
through
less
because
it's
a
big
file
and
then
it's
gonna
take
a
while.
A
Yeah
all
right,
I
don't
know
how
to
debug
this,
but
we
can
do
this
async
or
do
you
wanna?
You
have
ideas
now,
chris.
A: Okay, but the point of this cat-file was just to require the object, like the checkout we just discussed: if git needs something and it doesn't have it, it goes to one of the promisors and tries to obtain the object that is missing. So it just did that — it went to one of the promisors, and the promisor returned the object, so the missing object count has decreased by one.
A: But the origin is in the pack... oh, it is. Okay — sorry, blob... sorry, Toon. Wow, this went very wrong in my head. Excuse me. Okay, but...
A: Okay — oh, you wanted to add more color, Chris? Did I miss something?
A: For now, we'll continue with the update hook. So this command I'm just going to execute copies a shell script in place as the update hook. The update hook runs for every reference update — so if you update a branch from one SHA to another. This happens on a push... well, this happens on a push, period, and then it runs the update hook per one of those refs. So if you push multiple branches, then it runs the update hook for each of those branches.
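The interesting part of such a hook is spotting the oversized blobs among the pushed objects. A minimal sketch of that detection, walking HEAD of a scratch repository for simplicity — a real update hook would walk `<new-sha> --not --all` and then upload each match, and the repository contents here are invented:

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q repo
cd repo
git config user.name demo
git config user.email demo@example.com
head -c 1000000 /dev/urandom > big.bin
echo small > small.txt
git add .
git commit -qm 'big and small'

# List every reachable blob over 800 KB (819200 bytes): rev-list emits
# "<sha> <path>", cat-file annotates each with type and size, awk filters.
git rev-list --objects HEAD \
  | git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' \
  | awk '$1 == "blob" && $2 > 819200 {print "large blob:", $3}'
# prints: large blob: big.bin
```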
A
This
hook
well,
first
create
a
big
blob
and
then
I'll
explain
later
so
what
this
does?
It
creates
one
mega
one
megabytes
I
think
yeah
so
had
one
meg
with
count
of
one
megabyte
from
defu
random
and
redirect
standards
out
to
git
client,
one
new
random
data,
which
just
means
create
a
new
blob
in
git
client,
one
new,
random
data.
A
And
we
paste
that,
and
now
we
can
push
that
to
the
origin.
Remote
there
we
go
and
what's
interesting
now
is
that
there's
the
hook
is
actually
being
run,
so
large
push
attempted
with
a
file
named
and
the
size
is
over
the
one.
No,
this
is
like
the
one
megabyte
yeah,
never
mind,
and
this
is
the
upload
blob
output.
A: All right, so now let's clone it again. The host where we clone from is the full repository; again, we apply the same limit.
A: So there we go! Oh, and now we fetch three objects, because in the checkout there's the new-random-data file as well. So that's what the counter means in the "writing objects" line.
A: Okay, so...
A: It doesn't make sense why it triggers... so, it's fine. I think I could reason about it within a few seconds, and if I could do that in a few seconds, everyone can. Git client3 — and I guess the point here is: new-random-data is no longer in the checkout, so we won't fetch it again. So the additional fetch will fetch two objects, not three.
A
A
A
A
So
then
we
remove
a
whole
section
remote
at
origin.
This
is
a
quick
little
trick
that
you
do
it
from
through
config
and
not
through
remote,
and
then
some
zip
command
fun.
Little
story,
remote
is
actually
a
very
thin
wrapper.
The
remote
subcommand
is
a
very
thin
wrap
around
the
config
and
then
the
second
command
just
checks.
If
there's
anything
in
the
config,
which
is
has
the
text
origin
and
the
master
branch
still
points
to
the
origin
as
a
remote.
A: Okay... oh, so that's why: if I interrupt and then just rerun, the repack will not be triggered, so it's much faster, but in the end it just...
B
It's
the
fact
that,
when,
when
it's
fresh
fetch
from
the
http
remote
as
the
fast
import
protocol
is,
is
used
by
this
fetch
inside
it,
it
takes
a
lot
of
time.
But
I
haven't
tried
to
optimize
this.
No.
A: Cool, so...
A
So
these
are
all
the
blobs.
If
I
were
to
upload
blob
and
then
this
sha,
would
it
just
reject
it?
Well,
no.
A
A
So
repack
dash
all
is
dash
a
is
all
I
think,
and
the
dash
d
is
recalculated
deltas.
A: New keyboard — I'm blaming that. So that would be...
B: When you request something from a server — and here we use it to repack, not to send stuff anywhere. So usually, when you repack, you don't use a filter. That's why — I guess that's why they disabled the filter option when you don't choose stdout, but it's an artificial...
B: Yeah, we have to do some kind of the stuff that git repack does.
B
Yeah
by
default,
when
d3
pack
uses
some
options
that
we
should
remove,
because
otherwise
it
it
doesn't
work
with
the
dash
filter.
So
that's
why
I
I
we
have
to
do
a
kind
of
manual
repack
by
using
pack
objects
or
self
and
then
by
moving
the
the
pack,
they
generated
the
pack
files
so
that
they
replaced
the
the
old
ones.
A: It's pretty clever, especially this part where you create a new pack file.
A: But yeah, you can disagree, but I really like how exposed git is — like, no other database allows you to just move some files around. Well, maybe some do; maybe I'm just unaware of them.
A: I just wanted to call out for a second that we touched a promisor pack as well. So if, like, the pack didn't change — so all the objects are in this pack, and there's an index on this pack — and if it's missing, then there's an empty pack, but it has the .promisor extension, which then means that it's a promisor pack — its objects are promised — and yeah, that's why you might see that file.
B: Yeah, I'm not sure why, but I also got 26 sometimes, and sometimes I got 25, so I don't know what's going on there, but yeah — it seems to have worked.
A: Okay, now without piping — I just want to wait — if I pipe this through sort...
A: Oh, that's not sorted... this is sorted. And we just count how many start with six-three: one, two, three. Okay, this is rudimentary checksumming, but I'm just gonna claim that they're equal. So it might be 25 or 26, but the main point is: we've got it on the server if we're missing it in the local repository. And just to show that we can fetch one of those...
A: A lot to digest, especially this part — that was wrapping my brain a little when I saw it — but I also like git just because you can do this. But, any questions from Toon or Pablo?
A: You might hit the 75...
C: Not into the last part as well yet, but yeah — interesting stuff.
A: Yeah, the last part is really scrubbing data from a repository, which is then only present on the HTTP server, because in the previous sections we didn't do any of the weird repack thingies, so the blobs were all still present. And now we had one repository which didn't have any.
B
So
so
the
the
the
goal
of
this
back
object
stuff
is
to
just
to
remove
them
to
remove
these
large
blocks
from
the
from
the
from
the
git
that
database
on
the
server
something
it's
something
we
could
do,
for
example,
every
night
or
or
every
I
don't
know.
A
Yeah
and
the
advantage
of
that
would
be
that
we
have
them
local
if
there
is
a
load
coming
so
only
the
first
time,
it's
missing
and
then
the
rest
of
the
day.
We
have
it.
Let's
say
a
new
repository
is
very
popular
and
then
at
night
we
scrub
it,
and
then
we
try
to
go
without
having
that
foul
locally
again.
A
Because
wait
like
what
I'm
trying
to
prove
is
that
this
gitlab
git
is
it's
missing
objects.
We
just
shown
that
the
26
we
fetched
one.
So
it's
25
now,
but
this
repository
should
be
gitlab.com
like
in
a
few
iterations
of
backend
server
work,
but
for
the
customer
we
want
to
do
a
git
clone
and
then
dash.
No
local,
so
don't
reuse
any
objects
from
gitlab
git.
Oh
I
yeah.
B
Yeah
we
can.
We
can
show
that
yes,.
A
It
might
not
work
well,
we
answer
questions
still,
but
it
would
be
so
cool
if
this
completes
in
in
a
few
seconds,
because
that
would
prove
it's
fully.
A
Then
we
also
see
d,
and
we
just
run
this
one
liner
to
to
counter
misses
missing
objects,
because
then
it
would
be
fully
transparent
to
the
end
user
and
like
in
in
air
quotes.
The
only
thing
we
still
got
to
do
is
make
sure
that
hit
biff
and
all
those
server
actions
do
not
eagerly
fetch
all
the
big
blobs.
A: Oh, this is so cool, okay. This makes me super happy. Very cool, all right.
A
There's
probably
a
lot
of
work
in
the
till
end,
but
the
first
steps
look
super
promising.
A
Cool
very,
very
cool
chris.
I
I
am
like
a
charles
on
christmas
morning,
so
further
questions
from
tone
above
hello.
B: We can pass a URL, and we can work with this URL.
B
The
I
can
send
you
the
url
of
the
of
the
script,
so
that
you
can
see
how
it.
B: It's because this script is named git-remote-test-http. And git, when it's passed a URL that starts with some scheme, tries to see if there is a git-remote-&lt;scheme&gt; script in the PATH, and it uses that as a helper to access the remote. That's what's going on when it tries to fetch from the promisor remote.
C: Great, cool, nice, yeah. I'm not sure if this is something we want to answer now, or have thought about at this point. Is it — because now we have, like, one web server that lists a bunch of blobs — is it the intention to group those blobs into subdirectories for each repository, or is it becoming considered, like, one big object storage which has blobs for everyone who likes to store blobs on my HTTP server?
B: ...you know, all kinds of archive stuff, and also because Git LFS, I think, uses some kind of HTTP server by default, or... well, I don't know, but...
B
So
so,
and
so
I
don't,
I
don't
think
the
way
we
organize
stuff
on
the
on
the
on
the
http
server
on
the
promiser
remote
server
is
is
really
relevant,
but
I
I
don't-
and
I
don't
know
what
we
would
like
to
do
in.
A: There's one giant bucket on some object storage, and the advantage there is that if you fork a repository — and we already have this one giant blob in the bucket — then we don't upload it twice. The disadvantage is that if I know the SHA of a blob I'm interested in, then I can pretend that I have this SHA in my repository, push it, and pull it down, because I now have access to it — and, given SHA-1 is SHAttered...
C
Yeah,
I
think
that
that
answer
that
it's
something
we
we
need
to
decide
on
how
we
want
to
do
that,
depending
on,
what's
what's
the
most
beneficial
according
to
yeah,
the
features
like
like
we
have
that
elevates
like
the
ups
and
down
sides
there.
There
are
like
arguments
for
both
both
ways
to
implement.