From YouTube: Gitaly: partial clone with blob filter
Description
A quick demo of partial clone filtering by blob size
A
Hi, my name is James Ramsay and I'm a product manager at GitLab. Last week, or the week before last, I uploaded a video showing partial clone with a filter spec to exclude different file paths, or rather, with the filter spec you specifically describe the file paths you want to clone. So today I thought I'd show you another kind of partial clone, which is by blob size. Since this takes a little while with the large repository I'm demoing with, I'll kick off the command and then we can take a look at it.
A
You just get a copy of everything you need, a full copy that you can work on offline, with all the different feature branches, and that's going to include tree objects as well as file objects, or blobs, and commit objects. What the --filter=blob:limit=10k flag does is exclude the blob objects that are larger than 10 kilobytes. So what we can see in the Git output is a little different to the usual Git output.
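A minimal sketch of the kind of command described above, so it can be tried without the large repository from the demo. Everything except the --filter=blob:limit=10k flag is an assumption made for illustration: the scratch "origin" repository, the file names, and the two uploadpack settings that a file:// remote needs before it will serve filtered clones.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch "remote" standing in for the real repository.
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" config uploadpack.allowFilter true          # remote must opt in to filtering
git -C "$work/origin" config uploadpack.allowAnySHA1InWant true   # lets the clone ask for single blobs later

commit() {
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

# First commit: one small file and one file of roughly 20 KB.
printf 'hello\n' > "$work/origin/small.txt"
head -c 20000 /dev/urandom > "$work/origin/big.bin"
git -C "$work/origin" add .
commit -m "first"

# Second commit: replace the big file, so the old version lives only in history.
head -c 20000 /dev/urandom > "$work/origin/big.bin"
commit -am "second"

# Partial clone: blob:limit=10k leaves out blobs of 10 KiB or more,
# except those needed to check out the current commit.
git clone -q --filter=blob:limit=10k "file://$work/origin" "$work/partial"

# Objects that are reachable but not stored locally are printed with a "?" prefix.
missing=$(git -C "$work/partial" rev-list --objects --missing=print HEAD | grep -c '^?' || true)
echo "blobs still only on the remote: $missing"
```

The working copy of the clone is complete, while the older version of big.bin stays on the remote until something asks for it.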
A
So this is in comparison to the usual one, where it all happens in one step. For comparison, let's do a clone in the typical fashion and compare that for speed and how that looks. So, let's take a look at how large this partial clone is. It's 1.6 gigabytes, but more interesting, perhaps, is to look at just the .git directory, which excludes the working copy, and that's 830 megabytes, which is pretty close to 735 plus 64. So that's not surprising, and that's what we kind of expect.
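The two sizes quoted above (the whole clone and just the .git directory) can be measured with du. This is a rough sketch, not the commands used in the demo: repo_sizes is a hypothetical helper, and the scratch repository exists only so the snippet runs on its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the total size, the .git size, and the difference (the working copy), in KB.
repo_sizes() {
  local total gitdir
  total=$(du -sk "$1" | cut -f1)
  gitdir=$(du -sk "$1/.git" | cut -f1)
  echo "total=${total}KB git_dir=${gitdir}KB working_copy=$((total - gitdir))KB"
}

# Hypothetical scratch repository; point repo_sizes at a real clone instead.
demo=$(mktemp -d)
git init -q "$demo"
head -c 200000 /dev/urandom > "$demo/file.bin"   # untracked file: counts toward the working copy only

repo_sizes "$demo"
```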
A
We can compare this to the full bare repository. This is the complete bare repository that I've fetched, which is 9.2 gigabytes, so it's enormous. We'll see slightly less when we take a look at the full copy here. But if I take a look inside the partially cloned copy that we've got here, this is a complete copy of the master branch: every single file we need at the head of the master branch is here, and we can work on it like a usual Git repository.
A
The interesting thing is when we take a look at a different branch or a different revision. We should be missing some objects, because we excluded a whole lot of blobs and we've only downloaded specifically the blobs larger than 10 kilobytes that we need for this commit. If I check out a different commit, we're going to have to download data on demand. So let's take a look at that: let's go back 100 commits from the head.
A
Yeah, and here we can see that we're talking to the remote to get all the files that we need to check this out. It's actually downloading quite a bit of data, because there are a lot of images and other things that get updated frequently on the GitLab website. So these will need to be downloaded.
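The on-demand download described above can be reproduced on a small scale. This is a sketch under assumptions: the scratch repository, file names, and uploadpack settings are invented for illustration, and it steps back one commit (HEAD~1) rather than the 100 used in the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch "remote" with two versions of a ~20 KB file.
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" config uploadpack.allowFilter true
git -C "$work/origin" config uploadpack.allowAnySHA1InWant true

commit() {
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

printf 'hello\n' > "$work/origin/small.txt"
head -c 20000 /dev/urandom > "$work/origin/big.bin"
git -C "$work/origin" add .
commit -m "first"
head -c 20000 /dev/urandom > "$work/origin/big.bin"
commit -am "second"

git clone -q --filter=blob:limit=10k "file://$work/origin" "$work/partial"

# Count reachable objects (across all refs) that are not stored locally.
count_missing() {
  git -C "$work/partial" rev-list --objects --missing=print --all | grep -c '^?' || true
}

before=$(count_missing)
# Checking out the older commit needs the old big.bin, so Git fetches it from the remote.
git -C "$work/partial" checkout -q HEAD~1
after=$(count_missing)

echo "missing before checkout: $before, after checkout: $after"
```

The count drops after the checkout because the blob for the older revision has now been fetched.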
A
So that's a pretty similar workflow that's happening here, except using partial clone instead of LFS means that we don't actually have to decide upfront where an object is going to be stored: is the file going to be stored in Git, or is it going to be stored in Git LFS? Partial clone means I, the user, can decide when I download it which objects I want: do I want just the small ones, or do I want the large ones?
A
That's really nice, because it means that if I introduce a large object, I don't have to rewrite history to remove it and put it somewhere else. I can just use Git. I don't have to worry about where I'm putting objects, and I don't have to have LFS getting in the middle, using clean and smudge filters to decide where to put things, intercept different commands and download files. It's just going to do this natively. So in theory, if you've got a large project that already has binary files deep in the history, you can now just use this blob filter to exclude them.
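To illustrate the point that the choice is made by whoever clones, not by whoever committed, here is a sketch that clones the same scratch repository with three different filter specs and counts how many objects each clone leaves on the remote. The repository, file names, and the missing_after_clone helper are assumptions for illustration; blob:none and blob:limit=<size> are standard Git filter specs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch "remote": a small file and a ~20 KB file, each with two versions.
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" config uploadpack.allowFilter true
git -C "$work/origin" config uploadpack.allowAnySHA1InWant true

commit() {
  git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

printf 'v1\n' > "$work/origin/small.txt"
head -c 20000 /dev/urandom > "$work/origin/big.bin"
git -C "$work/origin" add .
commit -m "first"
printf 'v2\n' > "$work/origin/small.txt"
head -c 20000 /dev/urandom > "$work/origin/big.bin"
commit -am "second"

# Clone with the given filter spec and print how many reachable objects were not downloaded.
missing_after_clone() {
  git clone -q --filter="$1" "file://$work/origin" "$2"
  git -C "$2" rev-list --objects --missing=print HEAD | grep -c '^?' || true
}

none=$(missing_after_clone blob:none "$work/none")                # no historical blobs at all
small_only=$(missing_after_clone blob:limit=10k "$work/limit10k") # skip blobs of 10 KiB or more
everything=$(missing_after_clone blob:limit=1m "$work/limit1m")   # limit far above any file here

echo "blob:none=$none blob:limit=10k=$small_only blob:limit=1m=$everything"
```

The repository history is identical in all three cases; only the clone command differs.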
A
So I guess that's more similar to LFS, but the advantage of partial clone is you get to choose how you want to download the data. You don't have to make that decision in advance. Your colleague who accidentally does the thing that you don't want them to do, and puts it in LFS or doesn't put it in LFS, that doesn't matter anymore: you're in control. So that's a quick demo of partial clone filtering by blob size. We're hoping to enable this, or add a feature flag, in an upcoming version of GitLab.