From YouTube: 2020 06 24 GSoC Git Plugin Performance Project
Description
Jenkins project office hours for the Git Plugin Performance Improvement project.
A
Okay, so let's start with the agenda directly. The first thing I wanted to discuss today was the interactive testing I did for the redundant fetches, and the modification for which I had a discussion with Mark. So the first thing is interactive testing. I shared the plan with you guys yesterday: I took some scenarios I thought would affect the results of removing the second fetch. The first one was with advanced clone behaviors, because of course we are cloning when we do this.
A
This is directly related to the fetch, so the first test was to see if enabling or disabling fetch tags would directly result in any kind of difference in the information we have for the repository. Apart from looking at the result, I also looked at the code, at how the second fetch is handling these same behaviors, because if there was a difference in behavior, then there would be a chance that the results would be affected by removing the second fetch.
A
That is what we saw with one of the issues Mark caught, which was that some of the references were not being handled by the first clone API, but were actually handled by the second fetch we perform. So with fetch tags, the second fetch is doing the same thing as the first fetch, so there is no difference: if we enable them, the first fetch will bring all the tags; if we disable them, there will also be no difference.
A
The second one I was interested in was the shallow clone, first with no depth set. I wanted to see if by default the second fetch is providing a depth while the first is not, so that there might be a difference in the commit history. When I compared the code, I could see that the clone API basically shares the same implementation for doing a shallow clone, so if I don't provide any depth, they both use a default depth.
A
One level, for doing shallow clones — so there is no difference; the behavior is the same. If I remove the second fetch, the first fetch will take care of performing a clone with depth equal to 1. Then the third test was shallow clone with depth 2; I think it's the same thing — it won't make any difference in the results.
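A minimal local sketch of the shallow-clone behavior described above (throwaway paths under /tmp; both the initial clone and a later redundant fetch use the same depth, so the history stays at one commit either way):

```shell
# Build a throwaway upstream repository with two commits.
rm -rf /tmp/upstream /tmp/shallow
git init -q /tmp/upstream
git -C /tmp/upstream -c user.email=a@b -c user.name=a commit -q --allow-empty -m one
git -C /tmp/upstream -c user.email=a@b -c user.name=a commit -q --allow-empty -m two

# Shallow clone with depth 1: only the newest commit is present.
git clone -q --depth 1 "file:///tmp/upstream" /tmp/shallow
git -C /tmp/shallow rev-list --count HEAD

# A redundant fetch at the same depth does not deepen the history.
git -C /tmp/shallow fetch -q --depth 1 origin
git -C /tmp/shallow rev-list --count HEAD
```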
Then timeout was one I just wanted to try, to see if there would be a difference with a timeout specified. Basically I would specify this: if cloning a repository takes more than, say, five minutes, it's going to abruptly cancel the build. That's what happens, is that right, Mark? Yes — so that is timeout; by default it's ten minutes for any remote operation.
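The timeout semantics described — abort the remote operation once it exceeds a limit — can be illustrated with a plain `timeout` wrapper. This is only an analogy for how the plugin enforces its default, not how the plugin actually invokes git; paths are throwaway:

```shell
# Throwaway local repository so the fetch has something to talk to.
rm -rf /tmp/to-upstream /tmp/to-clone
git init -q /tmp/to-upstream
git -C /tmp/to-upstream -c user.email=a@b -c user.name=a commit -q --allow-empty -m init
git clone -q "file:///tmp/to-upstream" /tmp/to-clone

# Kill the fetch if it runs longer than 600 seconds (the plugin's
# default of 10 minutes); here it finishes almost instantly.
timeout 600 git -C /tmp/to-clone fetch -q origin && echo "fetch ok"
```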
A
Yes. The second scenario was with wipe workspace and force clone. This I just tried because it's cleaning the repository and forcing a re-clone, so I wanted to see if that would somehow change anything. I enabled wipe workspace and compared the results without the fix and with the fix, and I could not see any difference in the repository information we had. Then it was checkout for a specific branch, and I realized —
A
So this is the third scenario, and I realized that checkout is basically something we do after the step which involves these double fetches. It's in a function called retrieveChanges; checkout is a stage which comes at a later part of the clone. Then there's this interesting behavior I found, which is not user visible; it's called GitSCMSourceDefaults. As far as I could understand, it's basically done —
A
I was reading about it; let me recall. It's done to revert to the default behavior, that is, enabling an honor-refspec, and the second thing I don't remember exactly. I was not sure if this would make any difference; I tried it and it did not. But since it involved honoring a refspec, I just wanted to check this behavior, although I could not understand how it is being called — I did not go too deep into how this behavior was working.
B
I would not expect it to make any difference, but it's an interesting question, because pre-build merge requires at least two branches inside the workspace, right? There's got to be a source branch and a destination branch, and you've got to have that one way or the other, whether it's from a refspec in the SCM definition or from an honor-refspec used to declare both branches. So I think it is unaffected by this.
A
So after doing these interactive tests, there was one interesting problem which Mark pointed out, and that problem was with the refspecs, while we are fetching them. We have multiple kinds of references, and refspecs are basically mappings for the references between the remote repository and the local repository. The refspecs which the first fetch handles are related to the references of branches, that is, refs/heads — any named branch, or a star (*) which brings all the branches.
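The branch mapping being described can be seen on any fresh clone; the default refspec maps every branch head on the remote into a remote-tracking reference locally (throwaway repository for illustration):

```shell
# Throwaway upstream with one commit.
rm -rf /tmp/refspec-upstream /tmp/refspec-clone
git init -q /tmp/refspec-upstream
git -C /tmp/refspec-upstream -c user.email=a@b -c user.name=a commit -q --allow-empty -m init

# Cloning installs the default wildcard refspec: every refs/heads/* on
# the remote is mapped to refs/remotes/origin/* in the local repository.
git clone -q "file:///tmp/refspec-upstream" /tmp/refspec-clone
git -C /tmp/refspec-clone config --get remote.origin.fetch
# → +refs/heads/*:refs/remotes/origin/*
```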
A
So I had a discussion with Mark on how we could safely retain the fix and modify the code so that we do not break any existing use case while still not having to call the second fetch. Although, with the current modification I have tried, there are cases where I would have to call the second fetch to not break a use case. So right now, would you guys like me to go through the fix?
A
The modification I have tried, and then the interactive testing I tried on top of that modification, to check the cases Mark pointed out for which the code was breaking the use cases — I tried those cases and now it's working with them. So should I explain and go through that code, or would you guys review that from the PR, so that we don't use this time on discussion of that code? So I'm —
B
I'm open to either; I will have to review the code either way. So let's look to Omkar and to Justin and to Fran: what's your preference? Do you want to skip detailed code review in this session and go on to other topics? Because you've got several other topics we need to address, right, Rishabh? This is not the only topic for the session today.
A
Justin, Fran, would you like to discuss it, or should we move forward? I think we can keep going — yeah, okay. So the second agenda item is related to the performance benchmarks and benchmarking in general. I have found benchmarking to be surprising and irritating at the same time right now. So what has happened is, first of all, one of the important things I have to discuss is that I was profiling —
A
— the Jenkins instance, with whatever changes I did, with the fix and without the fix, with Java Flight Recorder. Using that profiler, what I was experiencing with consecutive builds was some kind of issue. I could not find out what the issue was, but there were huge time differences in the git fetch calls between some repositories, which was what I showed in the platform meeting, and which was wrong.
A
It also took a lot of time for me to change the repositories and then run the instance with the fixes again, and I wanted to do it with a lot of repositories to actually see how we are doing with the redundant fetch — what kind of performance overhead would be reduced if we are avoiding that fetch. So I shifted to using JMH benchmarks, because I would just have to write a benchmark.
A
I will have enough parameters that I can directly feed in multiple repositories, and then I don't have to do anything — I just have to wait for the results. Also, theoretically, benchmarking is a good way to get at the root cause: if you have to do a root-cause analysis, it's one of the best things to do, is what I thought. So I've written two benchmarks related to the redundant fetch, and I'd like to show you the results and the benchmark.
A
So the first benchmark — I've raised a PR for the redundant-fetch benchmark, and meanwhile I've also written another benchmark; I'm going to show you that one first. With this one we have written two benchmarks. The first is going to use the initial clone: it clones the repository, and we see what kind of time it takes to clone the repository for the first time. Then, in the second benchmark —
A
The first benchmark acts as a baseline experiment that we can compare against when we actually add the second operation, that is, the second fetch call: how much time are we gaining because of that? So the second benchmark is basically again the same initial clone, and then again a fetch operation on top. One thing I realized by writing the benchmarks was that I was not doing any kind of validation to check whether these operations are actually doing what they should do.
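The baseline-versus-second-fetch experiment has roughly this shape. The real benchmarks are JMH ones in Java; this shell sketch with throwaway paths only mirrors the idea of timing a bare clone against a clone plus redundant fetch:

```shell
# Throwaway upstream repository.
rm -rf /tmp/bench-upstream /tmp/bench-a /tmp/bench-b
git init -q /tmp/bench-upstream
git -C /tmp/bench-upstream -c user.email=a@b -c user.name=a commit -q --allow-empty -m init

# Baseline: initial clone only.
time git clone -q "file:///tmp/bench-upstream" /tmp/bench-a

# Variant: initial clone followed by the redundant fetch.
time sh -c 'git clone -q "file:///tmp/bench-upstream" /tmp/bench-b &&
            git -C /tmp/bench-b fetch -q origin "+refs/heads/*:refs/remotes/origin/*"'
```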
A
What motivated me to do that was that some of my benchmarks were giving me results which I could not understand completely. I was like: either I don't know how to write a benchmark, or I am just not able to understand this. So the initial validation I've put here is based on the first thing we do in the benchmarks, which is to clone the repository from an upstream source to a local place for the benchmark instance.
A
The observation from that is whether those operations are doing what they should. So this is the first validation I have put here, but I am thinking of adding more validations to actually see whether the operations I'm trying to benchmark are working or not, and whether the times we get are valid. So with this benchmark, the results I have — I'll explain what you see here. These are the two benchmarks; the first benchmark is here, with the initial clone.
A
This is with the double fetch calls — that is the second benchmark — and this is the first benchmark. What the color grading basically means is that we have git and JGit, two implementations; I'm actually testing that as well, to see how both implementations behave. The two bars here represent the two repositories I took: the first is the Jenkins repository and the second is the Ruby repository. Why I took those two repositories — let me quickly show you the reason.
A
For the Jenkins repository, the number of commits is 13,000 and branches 31; with the Ruby repository I have 61 thousand commits — it's basically double. One more thing I'll discuss after this is that I'm not able to find constant-size repositories, or anything near that, when I'm looking at real repositories. With the Jenkins one, this was the closest thing I could find: this is 366 MB and this is 471 MB — there's a good 100 MB difference — but this was —
A
I thought maybe I could get something out of this, because the commit count is doubling; with the branches it's not — actually, let's not consider the branches there. First, let's see if there's actually an effect from the commits. So the results here are for those two repositories, the Jenkins repository and the Ruby repository. The first bar is the Jenkins repository with the git implementation; the second bar is the Ruby repository with the git implementation.
A
Then the next, lighter blue bar you see, and the green bar, are both of the repositories with JGit. This is for the first benchmark, and then the same thing for the second benchmark. So if we look at the results, technically, in terms of real-life performance, theoretically from these benchmarks —
A
If
I
could
infer,
there
is
no
difference,
no
tangible
difference
between
between
a
single
fetch
and
adding
a
second
fetch
on
that
same
repository
and
as
you
can
see,
the
first
benchmark
11
seconds
per
operation
with
the
second
with
Jenkins,
it's
again
11
seconds,
there
was
some
difference.
It
was
some
microsecond
difference,
a
millisecond.
Sorry,
not
much
so
weird.
So
I
took
the
time
unit
for
seconds
this
time,
because
I
actually
wanted
to
see
real-life
differences.
B
So you're confident that it was really using JGit? Those implementations' results are so similar to each other, it seems like they could either both be using CLI git, or both JGit. That's fascinating — I can't explain what you're seeing, but that's really interesting.
A
I usually log — I usually print which implementation I'm using when I'm in the benchmarks. That is how I'm sure of how it's being calculated. And since this is looking a little odd, I actually have the place where I ran these benchmarks, and I can show you. Yes.
A
So this is the run that happened; the visualization you're seeing — these are the results in this form. Here you can see that with the first benchmark, which is just the initial clone, git is giving us 11 seconds, and with git again on the second benchmark, which has two fetches, it's giving 11.181 seconds. So the difference is very minute — which is actually not a good thing, no.
B
But okay, that supports the observation I had earlier. Initially people told me this fetch is enormously expensive. You have obviously found at least one case where the redundant fetch is not enormously expensive. That doesn't mean it's always free, but at least it means you found one case where it is free — surprisingly low cost. Interesting, fascinating. I wonder — when you reference the repository on a local disk, do you reference it by absolute path?
B
Because
you
may
want
to
read
the
CLI
get
documentation,
I,
don't
think
they'd
do
the
same
optimizations
for
jacott,
but
CLI
get
may
do
some
things
where
they
say.
I
know
this
is
local
and
remember
that
the
person
who
started
writing
this
was
Linus
and
therefore
he
thought
very
seriously
about
file
systems.
He
says
if
I
know
it's
local
I'll
just
do
hard
links
or
I'll
do
symbolic
links
or
I'll.
Do
you
know
there
are
all
sorts
of
things
that
he
could
do
knowing?
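The local-clone optimization being described is visible from the command line: cloning from a plain path (rather than a `file://` URL) allows git to hard-link objects instead of copying them, and `--no-hardlinks` turns that off. A hypothetical sketch with throwaway paths:

```shell
# Throwaway source repository.
rm -rf /tmp/local-src /tmp/local-dst /tmp/local-dst2
git init -q /tmp/local-src
git -C /tmp/local-src -c user.email=a@b -c user.name=a commit -q --allow-empty -m init

# Path form: git may hard-link files under .git/objects instead of copying.
git clone -q /tmp/local-src /tmp/local-dst

# --no-hardlinks (or a file:// URL) forces a real copy / normal transport.
git clone -q --no-hardlinks /tmp/local-src /tmp/local-dst2
```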
A
A real use case — a real-life use case. Because with profiling, what I saw — I saw results, not huge results, but there was at least a 10-second difference: the second fetch was costing around 10 seconds, or maybe 8 or 12 seconds, at least that much. So this was a little bit surprising. Well —
A
One more observation, which I think we've discussed already, is that with a larger-size repository, JGit is performing way worse than what CLI git is doing, right? I kind of have a question: why do we offer JGit as an option when we're seeing that? I actually don't know why we use JGit — I have never asked you that. Why are we using it when we see that for any normal-sized repository JGit is going to perform worse than CLI git? Yeah.
B
So
the
the
original,
the
original
dream,
many
many
years
ago
before
I
became
a
plug-in
maintainer
was
that
Jake
it
would
be
every
bit
as
good
as
command
line
yet
and
we
would
get
better
results
by
being
by
using
a
full
native
implementation.
The
reality
about
a
year
into
using
that
implementation
was,
we
learned
very
painfully
that
Jake
it
was
not
a
complete
implementation
of
CLI
get
and,
and
since,
since
that
time,
the
evidence
has
proven
it
will
probably
never
be
a
full
implementation
of
CLI
get
the
people
who
maintain
Jake.
B
It
are
very
committed
to
it
and
they
do
great
work
for
the
things
they
need
from
it.
But
but
of
course
they
work
on
the
things
they
need,
and
so
so
that
the
the
one
one
use
case
where
Jake
it
is
very
very
helpful,
is
if
you
have
a
platform
where
you
can
get
Java,
but
you
don't
have
a
command
line,
get
port
Jake.
It
will
will
still
work
for
you,
so
so
in
that
case,
it's
interesting
for
large
repositories.
It
looks
like
we
have
clear
evidence.
It's
never
interesting
for
you.
B
The
other
danger
with
large
repositories
is
its
using
its
using
java
virtual
machine
memory
to
do
the
clone
and
and
therefore
you
have
to
worry
about
memory
leaks
inside
or
an
inadequate
garbage
collection
etc
inside
the
Jake
it
implementation,
whereas
with
CL
I,
get
it's
always
a
sub
process.
The
operating
system
will
garbage
collected
for
you,
so
so
yes,
your
observation
is,
is
very
wise,
but
why
use
Jake
it
for
anything
larger
than
larger
than
about
ten
megabytes.
A
Okay,
and
so
the
next
benchmark
I
I
have
actually
raised,
appeared
for
that.
So
this
with
this
benchmark,
what
I'm
doing
this
and
I
think
it
shows
benchmark
as
well,
so
the
dis
benchmark?
We
have
multiple
depositories,
it's
it's
from
the
Jenkins.
These
are
Jenkins
repositories,
small
plugins,
I,
just
incrementally
increase
the
size
and
number
of
commands
number
of
branches
to
see.
A
We
set
those
parameters
for
repositories.
We
create
ourselves
while
we're
benchmarking,
but
but
to
have
a
clear
sensitivity:
analysis
where
we
directly
want
to
find
out
how
this
parameter,
like
the
number
of
commits,
would
affect
the
execution
time
for
gate
which,
without
freezing
without
taking
the
size
of
the
repository
constant
I,
am
not
sure
how
we'll
be
able
to
confidently
say
that.
A
It's
actually
not
doing
the
same
thing.
Here's
a
differ.
The
difference
is
that,
with
the
earlier
benchmark,
I
was
actually
cloning.
The
repository
for
the
first
time
within
the
benchmark,
so
I
was
benchmarking.
The
execution
time
for
that
operation
as
well.
Here
the
that
operation
is
is
taking
place
in
the
setup
before
the
benchmark.
It's
it's
it's
happening
before
the
benchmark,
so
ideally
it
should
not
affect
that
time.
So
clearly,
I
should
what
I
should
get
is
the
execution
time
when
I
am
the
results?
A
— the execution time for the incremental fetch is what I should get from this benchmark. So I'll just show the benchmark. This is it: it's an incremental fetch; the git client I'm using references a git repository it should already have, fetched from the local git repository I have. In the results here, the colors you see are basically multiple repositories with git and then with JGit. It's just one benchmark, so we don't have a confusing result.
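The setup-versus-measurement split works like this in a crude shell sketch. In the real JMH benchmark the initial clone lives in the setup phase and only the incremental fetch is timed; the paths below are throwaway:

```shell
# Throwaway upstream repository.
rm -rf /tmp/inc-upstream /tmp/inc-clone
git init -q /tmp/inc-upstream
git -C /tmp/inc-upstream -c user.email=a@b -c user.name=a commit -q --allow-empty -m base

# Setup (not measured): the initial clone happens once, up front.
git clone -q "file:///tmp/inc-upstream" /tmp/inc-clone

# New upstream commit, so the fetch below has an increment to pick up.
git -C /tmp/inc-upstream -c user.email=a@b -c user.name=a commit -q --allow-empty -m change

# Measured region: only the incremental fetch.
time git -C /tmp/inc-clone fetch -q origin
```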
A
It's
not
that
much
confusing,
so
so
we'd
get
as
we
increasing
repository
size.
One
positive
result
I
can
see
is
that
the
execution
time
is
increasing,
for
the
cost
of
having
an
incremental
fetch
is
increasing,
though
the
increases
in
microseconds
milliseconds,
but
it's
an
increase
and
I'm
sure
as
I
increase.
The
size
I
take
it
to
may
be
much
larger
repositories
mean
have
a
change,
but
what
I
have
to
do
is
after
this.
One
of
the
most
important
thing
is
to
map
this,
the
theoretical
or
derivation,
with
practical
observation
and
to
do
a
practical
observation.
A
To make a practical observation, what I've seen is that I can use the JFR profiling tool to see, for those repositories, what kind of performance overhead I am reducing while avoiding the second fetch. With this, we can see that, okay, now there is a change, there is a difference, there is an increase when we increase the size of the repository.
A
The
number
of
commits
also
increase
the
number
of
branches
increase,
but
I
can
never
say
for
sure
what
is
contributing
the
most
for
the
keep
stretch
right
now,
because
since
the
size
of
the
repository
is
increasing,
III
there's
no
way
I
can
say
that.
Okay,
the
commits
is
why
this
is
happening.
For
that
to
happen.
I
need
maybe
to
500
MB
repositories
with
one
bit
having.
A
There
should
be
a
clear
difference
in
the
number
of
commits,
possibly
something
like
20,000
commits
in
one
and
second
might
have
30,000
or
40,000,
so
that
I
can
see
okay
for
these
constant
size
repositories,
if
the
number
of
commits
are
increasing.
This
is
how
the
execution
time
is
increasing
or
decreasing,
or
it
is
having
no
effect
so,
but
that
is
a
yeah.
B
I
thought
I
thought
our
intent
here
was
trying
to
understand
which
things
should
we
include
in
the
sizing,
heuristic
and
isn't.
Isn't
your
observation
here
saying
we
should
include
both
repository
size
on
dist
and
number
of
commits
because
they
seem
to
both
show
as
they
increase.
We,
the
execution
time
increases.
So
do
we
already
have
enough
information
here
to
say
yeah
number
of
number
of
commits
in
the
repository
and
size,
the
repository
and
the
disk
are
both
relevant
to
to
performance.
So
we
include
them
in
the
heuristic.
A
Yes,
mark
you're
right,
they
are
limiting
for
doing
that
is
to
to
find
what
performance?
How
is
it
how
much
affected
from
what
predictors,
but
what
I'm
saying
is
that
we
are
not
able
to
test
them
independently,
not
as
independent
variables.
Here
they
are
depend.
I'm,
not
sure,
is
if
the
file,
what
is
contributing
more
to
the
performance
changes
in
the
get
fetch?
Is
it
the
file
size?
Is
it
the
size,
the
pack,
the
size
of
the
pack,
dot
pack
object
or
is
it?
Is
it
the
number
of
cores?
A
My hypothesis, which I wanted to test, was: if we have a repository with a large history — maybe not a considerable size, but a large history — would that affect the second fetch more? Because what I assumed about the second fetch was that the first fetch would clone all the objects, the packed objects, and the second fetch does not have to do that. What it should do — this is what I think; I haven't checked —
A
I
haven't
looked
in
to
confirm
this,
but
it
showed
the
ways
to
iterate
through
the
list
of
the
commit
history
or
basically
it
has
to
get
the
increments
in
references
or
any
changes
in
the
repository.
The
second
fetch
we
would
want
to
do
that
and
to
do
that
it
would
go
through
the
history
and
so
my
my
hypothesis,
what
that
was
that
the
the
redundant
fetch
would
actually
have
a
considerable
performance
overhead.
If
we
have
repositories
where
the
history
and
the
branches
they
they're
they're
larger,
then
there
are
cons.
A
I
would
say
a
considerable
number
is
there
for
those
repository,
so
that
is
something
I
wanted
to
test
and
I'm,
not
I'm,
still
not
sure.
With
these
we're
sure
that
with
increasing
the
size
and
all
of
those,
the
number
of
commits
we're
seeing
that
the
the
performance
overhead
of
the
second
leg
is
going
to
increase,
we're
sure
about
that,
because
we
can
see
that
with
those
other
benchmarks,
not
for
the
second
one.
A
First
one
too
much,
but
this
in
in
whether
this
microscope
the
time
unit,
we
can
see
clear
difference
but
I'm,
not
sure
independent
variables,
how
they're
contributing
to
the
performance,
and
so
again
we
can
see
that
jagged
is
actually
performing
better
for
us
for
small
size
repositories
that
to
think
we
have
the
observation.
We
have
that
for
a
small
size,
repository
Jake
is
going
to
perform
better
than
cake.
A
We
were
seeing
that
with
these,
this
benchmark
as
well
that
it's
performing,
but
though
it's
the
difference
is
not
much
in
real-time
I
think
we
see
the
differences
with
much
larger
repositories
Jake.
It
is
not
fun
good,
I'm,
not
sure
how
much
this
would
affect
tea
performance
for
a
user
noticeable
changes,
but
theoretically
it's
Jake.
It
is
performing
better
than
the
first
one
size
repositories.
A
So
yes,
so
with
benchmarking
strategy
I
have
so
if,
if
our
aim
is
say,
if
our
aim
is
just
to
see
that
so
we
need
to
make
an
estimator
and
to
make
make
an
estimator
to
estimate
the
size
of
the
repository.
What
kind
of
parameters
we
need
to
see
so
the
obvious
one
is
the
size
of
the
compare.
The
two
objects,
the
second
it's
safe
to
assume
it's
number
of
commits
number
of
branches,
but
how
much
how
much
independently
they
affect
the
performance
is
something
I
haven't,
not
able
to
figure
out
right
now.
B
So
I'm
I
think
you've,
you've,
you've
answered
the
question.
Should
we
include
size
and
number
of
commits
in
in
the
in
the
assessment?
Absolutely
and
we've
got
you've
got
data
here
that
says
yes,
Jake
it
for
small
size
repositories
is
marginally
faster.
So
so
there's
there's
another
incentive
to
say:
okay,
we
should
now
probably
look
at
code
and
say
or
put
you
into
code
and
say
all
right.
How
do
we
use
this
now
to
implement
the
heuristic
or
to
implement
the
estimator,
the
size,
estimator
and
and
start
seeing?
A
Okay,
that
is
what
I
thought
as
well,
that
we
could.
We
have
clear
evidences
that
some
of
the
parameters
there
how
they
are
affecting,
so
we
can
start
working
with
the
estimator
and
and
I
think
the
next
agenda
good
thing
I
had.
Then
there
was
analysis
on
fine.
So
we
have
discussed
this
performance.
Predictors
forget
wretch.
A
So I experimented with VS Code — Microsoft's VS Code. I cloned it, and I also tried the API provided by GitHub, to check what size it was returning to me. The size according to the GitHub API was around 300 MB, but when I cloned it, it was around nine hundred — that's a huge difference. So I checked around, and I found out that on GitHub's servers they have bare repositories, so they gave that size as the result.
B
I have no idea what that number represents — okay, all right, so that number, I don't know what it represents. I usually look at the size in the `du -s` output for the .git directory, because what that tells you is the size on disk of what is, fundamentally, almost the bare repository as represented on the other side.
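The `du`-based check Mark describes, sketched on a throwaway repository (the size on disk of the `.git` directory approximates the bare repository as it would exist on the server side):

```shell
# Throwaway repository with one commit.
rm -rf /tmp/size-demo
git init -q /tmp/size-demo
git -C /tmp/size-demo -c user.email=a@b -c user.name=a commit -q --allow-empty -m init

# Size on disk of the .git directory, in kilobytes.
du -sk /tmp/size-demo/.git | cut -f1
```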
A
So
I
think
I
have
to
confirm
that
I
haven't
IIIi.
Think
I
did
check
the
object,
dot
pack
object,
which
is
downloaded
by
go
by
clicks
alone,
so
I
could
see
similar
sizes
from
some
of
these.
From
this
thing
and
from
from
that
object,
but
I
think
I
check
that
mark
first
to
see
if
that
is
working
and
so
and
with
estimators
with
the
estimated
class.
So
right
now
the
object
one
option
mod
gave
was
the
grade.
Option
is
2.
B
Since the execution of most of the logic is happening on the master for you, you can ask questions of the cache on the master pretty directly. Things like: when the GitSCM object is created, you can assume that's on the master, and that SCM object can then look at the local cache and interrogate it. So I don't think you have to — I think it'll be pretty straightforward.
B
Actually,
if
you
just
use
that
I,
don't
even
think
you'll
have
to
do
a
cash
lock
in
all
seriousness,
because
I
think
all
you're
trying
to
do
is
look
at
the
file
system,
so
you
get
to
get
the
a
directory
of
the
cache
and
then
knowing
the
directory
name.
You
go
use.
File
system
calls
to
ask
for
the
size
of
the
contents
of
that
directory,
and
that
gives
you
a
relatively
quick
approximation
of
the
size
of
that
repository
deposit.
Yes,.
A
Like git ls-remote — I was interested to see how that would work; maybe I'll have some interesting observation to show. Going ahead, lastly, I want to expand the benchmarking study for those two operations. That's the first thing we could show. The second thing would be the redundant fetch work — how we have done it. So I was thinking, because of the demo —
A
I
I
would
have
to
show
what
I
would
have
to
show
something
visually
so
I
as
a
feature
or
something
in
the
user
interface
and
motion
things
I
have
it's
usually
code
or
weird
results,
so
I'm
actually
not
sure
what
what
are
your
guys
expectations?
How
is
the
are
you,
the
guys
who
will
be
my
will
be
the
panel
and
evaluation
in
the
evaluations,
or
is
it
the
cool
committee
of
Jenkins?
How
so
we're.
B
I think, if you show graphs — I don't think you have to show the Jenkins UI as much as graphs and highlights of: hey, here's what we've learned as part of this exercise; look at this, look at this; here's an improvement here, here's an improvement there. People will actually be more impressed with graphs and charts of performance comparisons than they ever would be with being shown a Jenkins UI, because we knew this was a performance project.
C
In my experience last year, we had some other projects that were similar to this as well, and it's not a big deal. If it's a plugin-based thing where you're actually building and you pull again, yeah, you might get into demos of how that works and use your experience and stuff like that. But yeah, like Mark said, I think I'd definitely focus on the meat of this project, and people will like it.
B
You know, in a perfect world they will see nothing different — it'll just be faster, right? So if you show: I'm going to show you nothing, except it's faster — that should already delight people. It's like, wow, that's great, because usually it's: it's faster, and I had to break the following things in order to make it faster.
A
Okay. So what I'm thinking is: the first thing is the benchmarking strategy with git fetch — what I did and how I improved the benchmark, and the understanding of Jenkins and everything. One thing which is missing right now, which I haven't shown you guys, is integrating the JMH visualizer plugin or the Jenkins page. I have to do that, because I think it's going to be a great improvement: we will be able to see visually how the results are shaping up.
A
So that's something I'm going to do, and I'm going to do it for git ls-remote as well — so, the benchmarking strategy for these two operations. Then, with the redundant fetch: would you guys be interested in seeing, from the fix, the testing scenarios and the cases we considered while fixing this — the use cases we had to consider, whether we would break them, how to do this safely, the whole thing? Or is that something we don't have to discuss?
B
For me, I'd keep the testing in your back pocket in case somebody asks: hey, how did you check this? I suspect the audience — I'm gravely concerned about not breaking compatibility; that's a big deal for me. But the larger audience will probably just assume that: of course no one's going to break compatibility. So they'll be more interested in your results, with numbers and performance figures, and your observations on: hey, here are the characteristics we saw.
A
Okay. So with the benchmarks, as we've seen, the theoretical results are not showing much of a difference. So what I want to say is, with the redundant fetch, the results I would like to show are the profiling results, as much as I can, so that I have a large sample and the results are not something we do not expect. Well, I think —
B
It's okay to show that — show the surprises as well and say: welcome to the real world; sometimes we get surprised by how software behaves. I feel no shame in declaring that we were — I was, you were — completely surprised to see this result compared to this other result, and that more investigation is needed. That's perfectly okay.
A
Okay. So the benchmark results and the profiling results, both of them, for the redundant fetch issue. And then I think the third thing would be the estimator class, if I'm able to create it with some heuristics we've thought about. I need to first consolidate the approaches we can take, and whether it's even possible the way we want to do it, because right now I'm not too sure, since with the APIs I was actually seeing —
A
— something which I discussed: the difference in the sizes because of the bare repositories and the .git objects. I have to confirm that against the cache. So if the cache doesn't work, then what do we do? Because with the cache, I think it's simple to estimate the size; but if we don't have that, then it's the real work, where we would have to understand how we could estimate the size. I was hoping that the number of commits and branches would have a great —
C
One other thing to maybe try, if you wanted to rule out the disk things — like the Linus thing that Mark was talking about — is you could set up a Bitbucket or GitLab server on your local network, put these repos on there, and then that would get you to a point where maybe it's not going to optimize for being on the file system. Oh, that's a possibility.
B
That's an optional possibility. Well, Rishabh, I have an environment that we could use to simulate exactly what Justin described: I have a local git server on my network that happens to be full of all sorts of interestingly sized repositories. So Justin's idea is good. However, even before that, I would take one more step. I think you've learned something in this exercise — you've gained a crucial piece of knowledge that I don't think you highlighted nearly enough in your summary.
B
We
did
not
know
that
I
had
not
I
had
an
assumption,
but
I
had
no
data
to
support
that
and
what
you
have
is
you
have
hard
data
which
says,
as
these
attributes
of
the
repository
increase
but
carry
the
characteristic
performance
of
git
is
like
this
and
jagged
is
like
this
and
that
curve?
If
that
curve,
is
your
opening
slide?
Even
for
me,
that
would
be
great
because
it
says,
oh,
oh,
everybody
should
be
aware
of
this
characteristic
of
the
jagged
implementation.
B
You've
done,
you've
done
concrete
measurements
and
they
measurements
showing
over
and
over
again
this
exact
same
story
that,
with
large
repositories,
jagged
is
a
poor
choice,
and
so
people
should
be
aware
of
that.
You've
already
contributed
to
the
body
of
knowledge.
Just
with
that
that
that
initial
graph.
B
I propose that you show us your initial framework of the presentation on Friday, if you would be willing, so that we have a chance to give you feedback. For instance, it asks for a blog post, and I thought: you know what, the performance results you've seen would be a great blog post. Let's say, look, just for the information of Jenkins users, without any code change: you should be aware that if you choose JGit and your repository size is larger than such-and-such, you are sacrificing performance intentionally.
B
To
highlight
we
use
jagat
as
the
implementation
on
CI
that
Jenkins
that
I,
oh
and
that's
fine
for
small
repositories,
but
remember
that
the
documentation
repository
and
the
Jenkins
core
repository
are
both
well
beyond
the
threshold
size
that
you've
identified.
So
I
already
have
an
improvement
to
make
in
CI
a--
jenkins
that
io
to
get
it
to
get
some
performance
back.
A
Okay,
so,
okay,
so
I
think
this
is
that
I
think
this
is
what
I
wanted
to
discuss
with
the
block
which
I
wanted
to
ask.
Do
I
have
to
do
that
on
a
Jenkins
Rodya
or
can
I
do
it?
Oh,
it's
it's
mandated.
We
would
have
been
Chang
I
thought.
I
was
setting
up
a
gator
page
blog
and
I
was
thinking
that
I
could
do
it
there.
You.
C
Plus one for demoing your demo — that was a good way for us to give feedback before. And one thing that we did before, which is up to you: I think we had done it in Google Slides — it doesn't matter what technology you use — but if you want to share that with us, we can do markup and comments, if you want feedback on things. Totally up to you.