From YouTube: 2020 06 17 GSoC Git Plugin Performance Project
Description
Google Summer of Code Jenkins git plugin performance project office hours from June 17, 2020. Topics include repository size estimation heuristics and
B
Okay, so for today's meeting agenda, just one second. The first thing on the agenda is an update on our git redundant fetch issue. We had three behaviors which we had to add to the clone API, which is the API needed to check out for the first time. Of those three behaviors, the first one was clean before checkout, and we don't need that, because we found out that for the first checkout we don't need a clean: there is no git repository yet.
B
B
So that's a discovery, and I think that makes things easier, because the only change we have now is a small one to the unit tests. I think the fix is good to go, because we had to cover all the possible use cases we would miss if we avoided the second fetch, the redundant one, and I think we have covered all of those cases. Mark, what would you say?
A
Also, I think there's more to test, but we may be at the point now where the best form of testing is interactive, where we do some exploration of various permutations and combinations of job settings to see: hey, did this deliver us what we expected? With some interactive tests, I was so dismayed by the realization that I personally didn't understand the conditions around the prune and clean, and we had to be taught by a test, an automated test. So there's more work to do there in terms of the interactive testing.
B
A
So one thing that I was thinking of is maybe what we need is interactive testing that takes, let's say, the set of available extensions and arguments, chooses a subset of them, and then says: okay, I want to run these with the redundant fetch and without, and compare the repositories in the workspaces that result. So we have some form of repository comparison. Did we get all the references we expected? Did we get all the branches we expected?
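Editor's sketch: the repository comparison described here could look roughly like the following. It assumes the references of each workspace have been captured as `git show-ref` text (one `<object id> <ref name>` pair per line). The helper names `option_subsets`, `parse_show_ref` and `compare_refs`, and the option names in the usage note, are hypothetical and are not part of the git plugin.

```python
from itertools import combinations


def option_subsets(options, size):
    """Return every subset of the given size from the available job options."""
    return [set(combo) for combo in combinations(sorted(options), size)]


def parse_show_ref(output):
    """Parse `git show-ref` style output into {ref name: object id}."""
    refs = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            object_id, ref_name = parts
            refs[ref_name] = object_id
    return refs


def compare_refs(expected, actual):
    """Report refs that are missing, unexpected, or point at a different object."""
    shared = set(expected) & set(actual)
    return {
        "missing": sorted(set(expected) - set(actual)),
        "extra": sorted(set(actual) - set(expected)),
        "changed": sorted(name for name in shared if expected[name] != actual[name]),
    }
```

In use, one would run the same job twice for each subset (for example `option_subsets(["prune", "shallow", "tags"], 2)`), once with the redundant fetch and once without, and pass the two parsed ref maps to `compare_refs`. An empty report suggests the two workspaces ended up with the same references.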
A
A
Now, my usual game plan with that kind of interactive testing is that the interactive testing is intentionally rapid. The goal is not to make it particularly repeatable, just to keep notes as you go. But then, when you find a real problem, that's the excuse to write an automated test, eventually, which shows the problem reliably. If we focus on automating it too soon, we spend all our time on the automation without doing the exploration. Rishabh?
B
So after this, the next thing on the agenda is the discussion on implementing the performance improvement in the git plugin and the git client plugin. Last time we discussed that we could have a checkbox, a way that we enable the performance improvement by default and people can revert to the old behavior if they want to. So I tried doing it, and I have a lot of questions related to it.
B
B
Yeah, so there is a boolean, enable performance improvement, in the descriptor class. One of the first things I have in my mind, which I haven't explored yet, is: how will I ship this boolean to every corner of the git plugin? I understand that this descriptor is an object for the SCM class.
B
So when I create an SCM class, I would have this as an object that I can access. So I created a method which is basically a getter to get this boolean. But as I understand, we need to create the client within the GitSCM class, so I'm not sure if I'll be able to access this variable within the SCM class, because the descriptor class is a different class, right? It is basically an object of this class. So I haven't explored that part.
B
If I am clearly wrong and you can tell me right now where I'm wrong, that's okay, or else I'm going to explore this more. I just implemented it to show what I mean. So the first part of this whole implementation is to figure out how to bring the boolean, the choice the user will have, to the code. So this is the first part.
B
The second part, which I was thinking about, is that we need to selectively choose between implementations, based on the analysis we get from the JMH benchmarks. We know that, for example, for git fetch the choice is heavily dependent upon the repository size. So first I was thinking about: how do I get to know the repository size of a particular repository? So I was looking at a command called git.
B
B
So that was what I had with respect to the repository. While I was doing this, one of the revelations I had was that I have to know the size of the repository before creating the client. That means the first question is: can I access the repository I am going to use with the client I am going to create before actually creating the client? If I'm not able to do that, then I would have to create...
B
I would have to clone the repository whenever I have to make this decision, maybe just once. But then that means, if I have a repository of size 300 and I want to improve the performance, I would add a considerable amount of time while I'm cloning that repository and then estimating its size first. So maybe I would also have to implement the git command for unpacking objects.
B
The other thing is that maybe I would have to create a client with a temporary local repository for the lifecycle of making that decision, because I would not have any repository to create a client with in order to check the size of the repository before creating the client for the git plugin's functionality. So I think the first part is: how do I do this?
B
A
A
There we go. So that is a pattern for the kind of thing that I think you need to add, because it uses the exact same technique. It has something in the descriptor, and then there are things inside the GitSCM class, or one of its related classes, which ask the GitSCM object if "show entire commit summary in changes" is enabled or not. So I think you can just leverage that; look for it. I confess, dealing with descriptors versus the parent class, I don't remember any of it.
A
C
B
A
I think you can't, or you probably could, because it's statically available, right? It's a global, so it's got to be available somehow. But I'm not sure you will actually need to. That's the piece where I suspect you'll want to make the determination of which path to use at runtime, after already instantiating the GitSCM object. I don't think you're going to need to make the decision beforehand. I suspect you'll want to make it at runtime, with the object already in memory and constructed. Okay.
C
C
B
A
A
I don't recall. This may now be cargo cult programming, and I apologize if it is, but I don't remember why. But it was easier for me to deal with things that had a default value of false, rather than starting them with a default value of true, in these descriptors, when trying to retain compatibility. Okay.
A
C
A
So one piece of a fallible rule might be to say: if a repository has more than some threshold of branches, we will assume, you know, try to do a correlation between branch count and approximate repository size. It's fallible. We know it is. We absolutely know it's fallible, but it's cheap, because we're already asking that question, and so we can use the answer that we already have.
A
If we remember the answer, or remember data from the answer, we could then use that as part of our decision. Okay, I saw a repository that had 500 branches; it's probably not a one-megabyte repository, it's probably more towards the large size than the small size. Now, there are plenty of repositories that have two branches and are still hundreds of megabytes.
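Editor's sketch: the branch-count rule could be expressed as below, working on text captured from `git ls-remote`, which prints one `<object id><TAB><ref name>` pair per line. The function names and the threshold of 100 branches are hypothetical; no threshold value was agreed in the meeting, and the rule is fallible in exactly the way described above.

```python
def count_branches(ls_remote_output):
    """Count refs under refs/heads/ in `git ls-remote` style output."""
    count = 0
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].startswith("refs/heads/"):
            count += 1
    return count


def guess_size_class(branch_count, threshold=100):
    """Fallible guess: many branches suggests a larger repository.

    The threshold is an illustrative assumption, not a measured value.
    """
    return "large" if branch_count > threshold else "small"
```

A repository with two branches and hundreds of megabytes of history would be misclassified as "small" by this rule, which is why it can only serve as one cheap hint among several.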
A
So it is a fallible rule, but one thing would be to use git ls-remote, because we're already calling it. I really liked your idea, and I thought it was brilliant, to look for git commands that might tell us the size of the local repository, if we've got one. Because besides count-objects, there are probably things like it which will count inside pack files as well. So I think it's worth exploring that further to see what's available.
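Editor's sketch: `git count-objects -v` already reports packed data as well as loose objects. Its output is a list of `key: value` lines, where `size` is the disk space used by loose objects and `size-pack` the disk space used by packs, both in KiB. The sketch below only parses such text, so it assumes the command has already been run against a local clone; the function names are hypothetical.

```python
def parse_count_objects(output):
    """Parse `git count-objects -v` output into {key: integer value}."""
    stats = {}
    for line in output.splitlines():
        key, separator, value = line.partition(":")
        value = value.strip()
        if separator and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def approximate_size_kib(stats):
    """Sum loose-object and pack sizes, both reported by git in KiB."""
    return stats.get("size", 0) + stats.get("size-pack", 0)
```

This only helps when a local repository exists, which matches the caveat "if we've got one" in the discussion.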
B
A
D
B
D
D
A
I think you've got a very good point, Omkar. The fun part there is that LFS objects are cached in the .git directory as well. So Rishabh, if you ask for the disk usage, the literal disk usage on the drive, of the .git directory, that's a very, very good approximation, if you've got it. So if you've got a local clone, you can certainly ask the question. LFS is an important one to consider. Excellent point, Omkar. Certainly, if LFS is in use, that's probably a hint.
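Editor's sketch: measuring the disk usage of a `.git` directory can be approximated by walking it and summing file sizes. The function name is hypothetical, and the sketch sums apparent file sizes rather than allocated disk blocks, so it can differ slightly from what a tool such as `du` reports.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files below path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total
```

Called on the `.git` directory of a local clone, this covers loose objects, pack files and any LFS objects cached there, which is the property that makes it a useful approximation.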
C
C
With things like GitHub and GitLab, you could potentially do something like that. It's not generalizable for everything, but it would maybe be an optimization for GitHub and GitLab: you could check their APIs, or use the APIs, to get the repository size, potentially.
A
Now that is a very bold one, because this would be the first time someone's introducing REST API calls into the git plugin. All of the REST API calls are done by higher-level plugins like GitHub and GitLab and Gitea. But I think Justin's got a good point: another really excellent heuristic is that those providers may have API calls that will tell you the approximate size of the repository with a single API call.
C
A
B
A
Right, and so for that one, however, as soon as you involve another plugin, you're also contingent on their release of that plugin for you delivering that feature. So it's much more challenging. It's architecturally very elegant, but it can be more challenging for you. It feels like, though, Rishabh, we may have described that there is something like a class that you'll need inside...
A
C
Maybe you start with implementing something in the git plugin itself. That's like the best generalizable kind of way of doing it. You use this class, provide that to the other classes, and then maybe those plugins implement that later. That's not necessarily in the scope of your work, or if we have time, then maybe you add that as a little possibility.
A
To support Justin's idea, Jenkins has the concept of an extension point that allows other plugins to add to you, and so you could conceivably create this as an extension point in the git plugin, which others could contribute to if they wanted to and say: hey, I want to provide an even better implementation than the git plugin's naive implementation. They could do that through this extension point system.
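Editor's sketch: the idea of letting other plugins contribute a better size estimate can be illustrated with a plain registry that falls back to a naive estimator. This is not the Jenkins extension point API, only a language-neutral picture of the pattern; the class name, the estimator signature (a remote URL in, a size in KiB or `None` out) and the example values in the usage note are all hypothetical.

```python
from typing import Callable, List, Optional

# An estimator takes a remote URL and returns a size in KiB,
# or None when it has no opinion about that repository.
Estimator = Callable[[str], Optional[int]]


class SizeEstimatorRegistry:
    """Try contributed estimators in order, then fall back to a naive one."""

    def __init__(self, fallback: Callable[[str], int]) -> None:
        self._fallback = fallback
        self._estimators: List[Estimator] = []

    def register(self, estimator: Estimator) -> None:
        """Add an estimator contributed by another component."""
        self._estimators.append(estimator)

    def estimate(self, remote_url: str) -> int:
        """Return the first contributed answer, else the fallback's answer."""
        for estimator in self._estimators:
            size = estimator(remote_url)
            if size is not None:
                return size
        return self._fallback(remote_url)
```

A provider-specific estimator would return a number only for remotes it recognizes and `None` otherwise, so every other repository still gets the naive estimate.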
A
B
B
B
So one more question I had related to the git plugin was about the SCM API. I have read the SCM consumer and implementation guides, but I have never mapped that to what I know of the GitSCM class. I haven't mapped out how the git plugin is using the SCM APIs. So I wanted to ask how much of that I should research before thinking of implementing this feature or performance improvement.
A
I think you could do it in parallel or even after, because the concepts that you're introducing are below the level of the SCM API. They are specific to git internally. I don't think anything we've described so far needs it. A possible exception, maybe, is that if you ultimately decide that you want to allow implementations to offer a better way of estimating the size of a repository, that might need an addition to the SCM API. But my guess for right now is...
A
This is entirely inside the git plugin for now, so the sophisticated and very capable things that are inside SCM can be largely ignored. But Fran is better experienced in this than I am. Justin, I don't know about your experience in it, but my hunch is that it's probably above the level that you're working at.
C
B
So for the next week, one of the possible tasks I have is, first, to interactively test the git redundant fetch fix with the possible permutations and combinations, and the second is to decide on or test these possible heuristics we've mentioned and create an extension, if I can. That would be the tangible outcome of the other task: to create an extension which would provide a heuristic to calculate the size of a repository.
A
And I'm not worried about it being an extension point, but something that represents the estimate of repository size, I think, is a good thing. To my mind, it absolutely does not need to be an extension point. You don't need that complexity yet. Remembering the statistics from git ls-remote is probably enough to do the job that you need. If you remember: hey, this thing, the last time I did ls-remote...
B
A
B
C
B
C
And I guess another thing that I thought about too, and I'm not sure how simple this is behind the scenes: I know there are some Jenkins plugins that cache some things. So maybe, if you've seen a git repository before, and the workspace is gone, that agent is gone, you don't have access to it anymore, but your instance has seen that repo before, perhaps there's a way to store that information on the primary Jenkins. I don't know if anyone else has any experience with those APIs. I've seen it done.
E
For caching, yeah. Well, there is no really careful caching. There is a piece of code which has been copy-pasted between plugins with some level of success, and yeah, I believe we had a discussion about that maybe one year ago, because it creates a lot of issues with backup management, etc. Because, yes, right now we do not have a standard location for caching in Jenkins.
E
B
Well, I think I can do both. Initially, for the time being: tomorrow I have a demo in the Platform SIG meeting, and maybe with that I can release it to the community. In the Google Groups forum, I can post the results I have, and then, when I have multiple results with multiple git operations, I can create a blog post and aggregate all of those results there.