Description
Git is a great distributed version control system used by software developers around the world. But, its user base is expanding beyond the core developers, with new problems to solve. Luckily, Git is very extensible. This talk will cover techniques for building a Git tool your users will love, and that feels like a natural part of Git.
A
I'd like to introduce our next speaker. He's also going to be talking about a scalability issue with Git and how we've started to address that at GitHub, and so forth, so I'll hand over to Rick. The other notable thing about Rick is that he learned Vim in the Air Force. So I'm going to welcome Rick Olson from GitHub.
B
All right, cool. So, hi. I apologize for the vague talk title; until yesterday, I couldn't actually mention what I was going to talk about. So this is a blog post that we shipped last night, Paris time, about Git Large File Storage. It's an open source Git extension that I've been working on, so I want to talk about building it and why it works the way it does. I'm Rick Olson, I go by technoweenie on the net, and I work at GitHub.
B
My team works primarily on the features of GitHub that work with binary files, but not in Git. So, like, avatars, and the release binary uploads, and now Git LFS. So what problem am I trying to solve? What scaling issue? Basically, a lot of teams are using Git successfully right now. They're doing it correctly, right? After they've gone through the learning process, they're working with text: source code files, documentation.
B
You know, if they're working on websites, they're working with small images, and yeah, Git works great for that. That's exactly what it is designed to do.
B
Problems come up when people start working with bigger files. I'm not talking about big repositories, like Will with the Twitter repository. I'm talking about storing individual files that are really big: beyond 10 or 50 or 100 megabytes. And initially, the real trouble here is that you don't even notice it's a pain point. At first you're just committing these files happily, and then slowly you start noticing issues, like it takes longer and longer to do a fresh clone.
B
So the first thing that we did is we set up server-side limits that analyze your git push on the server. If you have any files over 50 megabytes, we print a little warning in your terminal, and if it's over 100 megabytes, we just reject the push. This was nice because then people weren't pushing giant files onto our servers and they weren't causing problems.
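The two thresholds described here can be sketched as a small check over the files in a push. This is only an illustration of the rule as stated in the talk, not GitHub's server code: the function name `check_push`, the message wording, and the choice of 1 MB = 1024 * 1024 bytes are all assumptions.

```python
from typing import Dict, List, Tuple

MB = 1024 * 1024          # assumption: binary megabytes
WARN_LIMIT = 50 * MB      # files above this trigger a warning
REJECT_LIMIT = 100 * MB   # files above this cause the push to be rejected


def check_push(file_sizes: Dict[str, int]) -> Tuple[bool, List[str]]:
    """Return (accepted, messages) for a push, given path -> size in bytes."""
    accepted = True
    messages: List[str] = []
    for path, size in sorted(file_sizes.items()):
        if size > REJECT_LIMIT:
            accepted = False
            messages.append(f"error: {path} is {size / MB:.1f} MB, over the 100 MB limit")
        elif size > WARN_LIMIT:
            messages.append(f"warning: {path} is {size / MB:.1f} MB, over 50 MB")
    return accepted, messages
```

A push containing one 60 MB file would be accepted with a single warning, while any file over 100 MB flips `accepted` to `False`.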
B
Like making our servers generate giant pack files on clones and things like that. It's also giving feedback to the teams: hey, you're using Git incorrectly, you're about to have a bad time, so maybe you should look at your development process. But I never really liked that solution, because you get these teams and, you know, they've gone, yeah...
B
They've crossed the first hurdle, kind of learning how Git works, and they start using it and they run into this issue immediately. Then they get to learn about git filter-branch and have to rewrite their repository history, and that's no fun, and I think it gives a bad initial perception of Git. So I really wanted to provide a better experience for these users. A lot of the open source projects that I've been involved in are all about scratching my own itch.
B
These are problems that I'm experiencing myself, so then I can better make decisions on how to solve those problems. But this was not like that. This isn't a problem that I was having, or really anyone at GitHub was having. It's really our users coming to us about it, and talking to other people that work on Git servers and things like that, like Atlassian, and they're...
B
You know, she spoke to the company about turning everyone that worked on product into user researchers, and I'm like, that's BS. So I made it a point to reach out to her and see if she could help me out with this project: talk to our users, get insight into their workflow, and see what we can do to help them out. We started out looking at metrics.
B
Having is that they'll have an artist or someone that's working on some big file, like a Photoshop file or whatever, and they go to push it, and pushing from South Africa to github.com in America is really slow, especially if you're pushing a giant file. And push sniping is when, while they're making their push, someone else will change a README, or do a normal git push, and their push will complete while the other one is still uploading.
B
And then, when that one finishes, it looks at the master ref or whatever they're updating, and it says: oh well, the ref has changed, you've got to pull and start over. So yeah, that was a really unique problem. I really liked talking to those guys. At the end of this, Chrissy and I prepared a final report with recommendations and aspirations, and this helped inform my team on some of the things to do.
B
That we wanted to experiment with. It also informed the rest of the company, to help prove that this is something worth taking on.
B
So, what are first principles? This is actually a physics term. It's something we talk about internally in the company, but I couldn't find any reference of us talking about it publicly. I did find this article, an interview with Elon Musk, the CEO of Tesla and SpaceX, and he talks about first principles. He was talking about how they design their batteries, basically just ignoring the current understanding on building batteries, and they broke the problem down to its most...
B
You know, to very basic elements, and re-examined it with a clear focus, and then they're able to build really good batteries and all that stuff. So in our user research, there were two themes that came up, things that were very important to me and that I thought needed to be a focus for this. The first one is usability.
B
There are actually a couple of tools that exist now, git-media and git-annex and a few others, and they're built for Git experts. They require a lot of upfront configuration in the repository. They sometimes introduce new commands. They don't quite always work with the existing workflows. They don't work with any of the hosted services. And I really wanted to solve that problem.
B
The second thing: GitHub hosts code, so I was really interested in finding a way for this Git Large File Storage extension to work with github.com. And I wanted to do it in a way that didn't have any vendor lock-in, no proprietary solutions. I wanted it to be an open API, just like Git, right? Because all it is is your client speaking over some defined API to another server, and you could be talking to GitHub.
B
So how does... awesome, the animation works. So, yeah, this is a diagram of how Large File Storage works, kind of at the high level. You've got your local repository and your remote at the top, and your code files.
B
They just go directly into the Git repository. But for larger files, say a Photoshop file, a pointer, that's what we're calling it, goes into the Git repository. It's really just like a link: it's not the actual file, it's a substitute. And then the actual file goes up to a Large File Storage server.
B
Yeah, let's see. Okay. And the really cool thing about this is it works without adding a lot of extra stuff to your Git workflow. So this is what the setup looks like. When you first install the tool, you need to run git lfs init, and this sets up some global Git configuration values. And then you need to tell the .gitattributes file...
B
What file types you want to store in the Git Large File Storage server. This shows the track command, which does that for you, or you can just open up the .gitattributes file and edit it yourself. And this is what a clone is like, and it's simple: the same clone command you're used to, but then at the bottom you see it downloading some file.zip. So that's where Git LFS is doing its thing.
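What the track command does to .gitattributes can be pictured with a short sketch. The talk does not show the exact line that gets written, so the attribute string below is an assumption: it is the form current Git LFS releases write, and the version demonstrated here may have differed. The helper names `track_line` and `add_tracking` are hypothetical.

```python
def track_line(pattern: str) -> str:
    """Build the .gitattributes entry for one pattern (assumed modern format)."""
    return f"{pattern} filter=lfs diff=lfs merge=lfs -text"


def add_tracking(attributes_text: str, pattern: str) -> str:
    """Return .gitattributes content with the pattern tracked, without duplicates."""
    line = track_line(pattern)
    lines = attributes_text.splitlines()
    if line in lines:
        # Already tracked: leave the file content untouched.
        return attributes_text
    lines.append(line)
    return "\n".join(lines) + "\n"
```

Calling `add_tracking("", "*.psd")` yields a one-line file, and calling it again on that result changes nothing, which mirrors the idea that editing the file by hand and running the command are interchangeable.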
B
And then this is the really simple pull request workflow, and there are no new commands or anything like that. You just create your branch, add your file, commit, and push, and then you see at the top the uploading message. So your standard Git workflow will keep working.
B
So how does it do that? Well, Git attributes, and specifically the smudge and clean filters. These are kind of awkward to talk about; people get them confused. So I like to think of the Git repository as a clean room, and everything is sterile. When you're running the git add command...
B
You have this dirty file in the working directory, and as you add it, it gets cleaned up into the Git repository. What that's doing is converting it to that text pointer. And then, when you check out, it does the reverse of it: you've got that clean text pointer, and it's going through the smudge process as it writes it to your working directory, and then it spits out your actual large file.
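The clean and smudge directions can be sketched as a pair of functions. This is a simplified model, not the Git LFS implementation: an in-memory dictionary (`STORE`, hypothetical) stands in for the large file storage, and the pointer here carries only an object ID and a size, which is an assumption about the minimum a pointer needs.

```python
import hashlib
from typing import Dict

# Hypothetical stand-in for the large file store, keyed by SHA-256 hex digest.
STORE: Dict[str, bytes] = {}


def clean(content: bytes) -> str:
    """git add direction: stash the real bytes, return a small text pointer."""
    oid = hashlib.sha256(content).hexdigest()
    STORE[oid] = content
    return f"oid sha256:{oid}\nsize {len(content)}\n"


def smudge(pointer: str) -> bytes:
    """git checkout direction: turn the text pointer back into the real bytes."""
    fields = dict(line.split(" ", 1) for line in pointer.splitlines() if line)
    algorithm, _, oid = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash: {algorithm}")
    content = STORE[oid]
    if len(content) != int(fields["size"]):
        raise ValueError("stored object does not match pointer size")
    return content
```

Round-tripping any byte string through `clean` and then `smudge` returns the original bytes, while the pointer itself stays under a hundred characters regardless of file size.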
B
So this is what the text pointer looks like. It's similar to the git-media one, but we added some more metadata to it. One is the version string. This gives us flexibility to increment the text pointer format in the future, and also, if you happen to clone a repo and you've never heard of Git LFS or anything, you just see these tiny PSD files, and you open them up and they don't open in Photoshop.
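A tool could use the version string to tell a pointer apart from a real file, as in the sketch below. The slide is not reproduced in this transcript, so the layout is an assumption taken from the published Git LFS pointer format: space-separated key/value lines, with a version value that is a URL under `https://git-lfs.github.com/spec/`. The function name `parse_pointer` is hypothetical.

```python
from typing import Dict, Optional

# Assumed prefix of the version URL, per the published pointer format.
SPEC_PREFIX = "https://git-lfs.github.com/spec/"


def parse_pointer(text: str) -> Optional[Dict[str, str]]:
    """Return the pointer fields, or None if the text is not a pointer."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, separator, value = line.partition(" ")
        if not separator:
            return None  # every pointer line must be "key value"
        fields[key] = value
    if not fields.get("version", "").startswith(SPEC_PREFIX):
        return None
    return fields


example_pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:" + "ab" * 32 + "\n"
    "size 12345\n"
)
```

Because the version value is a URL, someone who opens one of those tiny files in a text editor also has somewhere to look for an explanation.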
B
So another really important part is native app support. This is a screenshot from the GitHub desktop client for the Mac, and I don't know if you can see that, but that's a progress bar, and at the bottom, it's downloading a large asset.
B
Yes, all right. The interesting thing, though, is the GitHub for Mac client was written before libgit2, so it still shells out to git in a few spots, and they're actually the first ones to implement Git LFS support.
B
This is a pull request from Amy on the desktop team, and she's adding support for the improved clean/smudge filter implementation in LibGit2Sharp.
B
Other cloud hosts or on-premise installs, like GitHub Enterprise, or even the really small ones, like Gitolite or whatever, could in theory implement this. And this should be run next to your Git server, so it can take advantage of that server's built-in authentication and authorization code.
B
So when you access Git LFS through the API, or when the client does, it knows who you are. It uses the same access controls that github.com does. And then the client itself doesn't need to support all these different backends. It just supports this one API, and then anyone can implement this API and they can use Git LFS.
B
And one of the cool things about running this server next to your Git server is now your Git host can understand these large objects; they're not just text blobs. This is a Photoshop file that is being viewed through our render feature, and that file is stored in Git LFS.
B
And yeah, so here's what the API looks like. If you're not a JSON API person, then this is probably kind of Greek to you. This is an API call to download a file, and the server returns some JSON properties. You get the object ID, which is a SHA-256 signature of the object contents, and then the file size, and then there's that links property, and that includes some hypermedia links. That basically just means...
B
The Git LFS API is telling the client how it can download the file. So in this example, it's saying you can get it from git-lfs-server.com, at this URL, and then it can also specify the HTTP headers to set. So here we're saying, set your Authorization header to this token.
B
So that you have access to download this file, and then the client will follow that link and download the file. An upload request is similar, but you're sending the OID and the size to the Git LFS server. So this request is saying: hey, I want to upload this file, tell me where to put it. The JSON output has more hypermedia links. There's an upload link, and that's saying, yeah, you can put that file on this.
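Following one of these hypermedia links amounts to building an HTTP request from the link's URL and headers. The sketch below shows that step only; it does not send anything. The JSON shape is an assumption based on the description in the talk (an object ID, a size, and a links property whose entries carry a URL and headers); the key names `_links`, `href`, and `header`, the example URL, and the token are all hypothetical.

```python
import urllib.request
from typing import Any, Dict

# Hypothetical response body for a download call, already decoded from JSON.
example_response: Dict[str, Any] = {
    "oid": "ab" * 32,
    "size": 12345,
    "_links": {
        "download": {
            "href": "https://storage.example.com/objects/abc123",
            "header": {"Authorization": "Basic hypothetical-token"},
        }
    },
}


def request_from_link(response: Dict[str, Any], rel: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP request described by one link."""
    link = response["_links"][rel]
    method = "PUT" if rel == "upload" else "GET"
    return urllib.request.Request(
        link["href"],
        headers=link.get("header", {}),
        method=method,
    )
```

The client never needs to know where the bytes live: it uses whatever URL and headers the server hands back, which is what lets the same client talk to different storage backends.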
B
So if the location of the files is separate, the client may need to talk back to the Git LFS API to say: hey, I uploaded the file, you can make it available. As a real-world example, on github.com, when you use this, we're going to return S3 links with the headers necessary to sign the request, and that will give you just that temporary access to that...
B
To that key, to either upload or download it. But we don't have access to S3, really. I mean, it is our S3 account and we can set up some stuff on the back end, but I wanted to build it into this API in case people want to put this in front of other storage services. So once the client has uploaded to S3, it talks back to the Git LFS API and says: hey.
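The two-step upload described above, where the client sends the bytes to the storage service and then reports back to the API, can be sketched with the transport passed in as a function. This is an illustration under assumptions rather than the real client: the link names `upload` and `verify`, the `header` key, and the `send` callback signature are hypothetical.

```python
from typing import Any, Callable, Dict

# send(method, url, headers, body) performs one HTTP request; supplied by the caller.
Send = Callable[[str, str, Dict[str, str], bytes], None]


def upload_object(links: Dict[str, Dict[str, Any]], content: bytes, send: Send) -> None:
    """Upload to the storage link, then call back to the API if asked to."""
    upload = links["upload"]
    send("PUT", upload["href"], upload.get("header", {}), content)

    # Only needed when storage is separate from the API server (for example S3).
    verify = links.get("verify")
    if verify is not None:
        send("POST", verify["href"], verify.get("header", {}), b"")
```

Passing the transport in keeps the sketch testable without a network, and shows that the callback step is optional: a server that stores objects itself simply would not return that second link.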
B
So, authentication. This was a big part of it. It's making API calls from the Git LFS client, and those API calls will require some form of authentication, but we did not want people to set up dual passwords. Since the server is hosted alongside your Git server, then, depending on the implementation, it should be able to take the same passwords or tokens or whatever.
B
So, if you're using HTTPS remotes, there's an internal git credential command, and it can say: hey, I want to talk to github.com, do you have a stored password? And then git credential will just send it back, because if you're using HTTPS remotes, you've probably already entered your password before.
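Asking Git for a stored password can be sketched with the `git credential fill` command, which reads `key=value` lines on standard input and prints the completed set on standard output. The helper names below are hypothetical, and this is one way a tool might do it, not necessarily how the Git LFS client does. Note that `fill_credentials` runs Git and may prompt interactively if nothing is stored.

```python
import subprocess
from typing import Dict
from urllib.parse import urlsplit


def credential_query(url: str) -> str:
    """Format the input for `git credential fill` (terminated by a blank line)."""
    parts = urlsplit(url)
    return f"protocol={parts.scheme}\nhost={parts.netloc}\n\n"


def parse_credential_output(output: str) -> Dict[str, str]:
    """Parse the key=value lines that `git credential fill` prints."""
    credentials: Dict[str, str] = {}
    for line in output.splitlines():
        key, separator, value = line.partition("=")
        if separator:
            credentials[key] = value
    return credentials


def fill_credentials(url: str) -> Dict[str, str]:
    """Ask Git's credential machinery for a username and password."""
    result = subprocess.run(
        ["git", "credential", "fill"],
        input=credential_query(url),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_credential_output(result.stdout)
```

Since the same credential helpers back normal HTTPS pushes, a user who has already pushed to the host would not be asked for anything new.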
B
So on our servers, we have a proxy set up with all these links, and it's looking for the different URLs. If it's just github.com, like the home page, it's going to the Rails app, but if it's a Git URL, then it sends it back to our Git service, and now, if it has this info/lfs suffix, then it sends it to our LFS server.
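The routing rule can be sketched as a function from request path to backend. This is a toy model, not GitHub's proxy configuration: the backend names are hypothetical, and the set of paths treated as "a Git URL" is an assumption based on the standard smart HTTP endpoints.

```python
# Standard smart HTTP endpoints that a Git client requests under a repository URL.
GIT_SUFFIXES = ("/info/refs", "/git-upload-pack", "/git-receive-pack")


def route(path: str) -> str:
    """Pick a backend name ("lfs", "git", or "rails") for a request path."""
    if "/info/lfs" in path:
        return "lfs"    # Git LFS API, e.g. /user/repo.git/info/lfs/objects
    if path.endswith(GIT_SUFFIXES):
        return "git"    # ordinary clone, fetch, and push traffic
    return "rails"      # everything else, such as the home page
```

Checking for the LFS marker first matters, because LFS paths live underneath the same repository URL that Git traffic uses.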
B
But we don't want to force this on people, because you may be using a Git host that doesn't support the Git LFS API. Today there are no services besides our reference implementation; this isn't even quite available yet on github.com. So you can set lfs.url in the Git config and it will use that instead, and this could be a server on Heroku or whatever.
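Endpoint selection with this override can be sketched as follows. The configuration is modelled as a plain mapping rather than read from Git, and the default rule (remote URL plus `.git/info/lfs`) is an assumption based on the info/lfs suffix mentioned in the talk; the function name `lfs_endpoint` is hypothetical.

```python
from typing import Mapping


def lfs_endpoint(config: Mapping[str, str], remote_url: str) -> str:
    """Return the LFS API URL: the lfs.url override if set, else a derived default."""
    override = config.get("lfs.url")
    if override:
        return override
    base = remote_url.rstrip("/")
    if not base.endswith(".git"):
        base += ".git"
    return base + "/info/lfs"
```

With an empty config and a remote of `https://git.example.com/user/repo`, this returns `https://git.example.com/user/repo.git/info/lfs`; with `lfs.url` set, the remote is ignored entirely, so the Git host and the LFS server can be different machines.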
B
Also, not everybody uses HTTPS remotes; a lot of people use SSH. So part of Git LFS is a new SSH command that it runs, and basically it returns back the header necessary to authenticate with the API, so you don't have to mess with git credential setup. So the initial announcement and release is just version 0.5 of the client library, and we don't have full support on github.com yet. There's a waiting list, and you can go to...
B
I should have had the URL somewhere: git-lfs.github.com, or go to the blog, and you can read about it. When we open up the wait list, then you can start using it on GitHub. But the project is still new, so I want to go over some of the bigger ideas with this.
B
I feel like, oh geez. So one of the things I want is narrow downloads: the idea that when you check out a repository with lots of files, you don't necessarily need to download everything.
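One way to picture narrow downloads is as a filter over the paths that have large-file pointers, keeping only those that match patterns the user cares about. This is purely a sketch of the idea: the talk presents it as something still being explored, and the function name `select_downloads` and the use of shell-style patterns are assumptions.

```python
from fnmatch import fnmatch
from typing import Iterable, List


def select_downloads(paths: Iterable[str], include: Iterable[str]) -> List[str]:
    """Keep only the paths matching at least one shell-style include pattern."""
    patterns = list(include)
    return [
        path
        for path in paths
        if any(fnmatch(path, pattern) for pattern in patterns)
    ]
```

For example, with the pattern `audio/*`, only files under the audio directory would be fetched, and every other pointer would stay as a small text file until it is needed.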
B
Maybe you're a music composer and you just want the audio files, or whatever is in a specific directory. So that's one of those ideas that we're kicking around. Another one: this is not a popular idea in Git, because branches should make this obsolete, right? But this is something very important to these users, because maybe two people start touching the same Photoshop file, and that's not a format that you can really merge.
B
Another thing too: the actual Git LFS client right now is written in Go. I love Go, but for this project that doesn't really matter. What matters is that we can put out a statically compiled binary that users can download. They don't need to install Go, they don't need to install Ruby or Python with the right version and all the dependencies and stuff. I mean, as a Ruby developer, it's not that difficult, but it's not something I would wish on someone that isn't a Ruby developer.
B
So yeah, that's my talk. We'd love feedback. I would love for other hosting services to look at this, and maybe we can come up with some solution that we can all use and agree on. This is the current core team: that's myself and rubyist. Yeah, and that's it. Let's go drink.
B
So then you have to go through the one-time, painful process of rewriting your history and pulling those objects out and kind of retroactively adding LFS support. Or you could just say, screw it, we've been using this repository, and starting today we'll use Git LFS. That's not something I would ever do on the GitHub repo, because we have, you know, vendored gem files. Cool, anyone?
B
I'm sorry, what about garbage collection? Oh, garbage collection, yeah. So are you talking on the local machine or on the server? Yeah, I mean, it's up to the server to implement it. So they would have to know that, if a branch gets deleted, they'd have to go through and delete the...
A
B
Yeah, so Dave is asking about garbage collection on the client and prefetching. I haven't even thought about prefetching right now. I would love to talk more about that, and I think John, who's going up after me, he's got some ideas on garbage collection on the client, so yeah, we'll hear what he has to say in a bit, I guess.