From YouTube: Webrecorder: Web archiving for all - Ilya Kreymer
Description
Join Ilya as he demos high-fidelity web archiving using the ArchiveWeb.page browser extension, and talks about how web archives created in the browser can be directly shared with others using an experimental feature that uses js-ipfs in the browser.
For more information on IPFS
- visit the project website: https://ipfs.io
- or follow IPFS on Twitter: https://twitter.com/IPFS
Join your local IPFS meetup to attend our next event: https://www.meetup.com/pro/ipfs/
Sign up to get IPFS news, including releases, ecosystem updates, and community announcements in your inbox, each Tuesday: http://eepurl.com/gL2Pi5
A
Yeah, thank you for having me. I wanted to talk about high-fidelity web archiving and do a quick demo of some of the tools that I've been working on, so we'll go ahead and start. I work on a project called Webrecorder, and its motto is web archiving for all.
A
So the idea is to allow anyone to create web archives of exactly what you see in your browser, and to archive them at full fidelity. Very briefly, what are some of the goals of the Webrecorder project? It's basically to focus on capture and replay, and to archive things as accurately as possible.
A
So you might think that there are some obvious approaches to start with, including, for example, the browser's "Save Page As". Every browser has that, and you can go to a page and try to save it. If you actually do that on any page that is not mostly just a static HTML document, you'll quickly find that what you save generally doesn't work very well when you try to load it back up.
A
Part of it is that it doesn't save the network traffic that got you to that point. It only saves a static snapshot, and it doesn't save any of the state and JavaScript. Oftentimes modern web pages, which are really complex applications, don't really work when just loaded from your local file system.
A
You could also use wget: you point it at a website, it retrieves the HTML, extracts all the links, and repeats recursively. But again, there's no JavaScript being run, and you'll find that you get the static assets from a site, but not anything that's loaded dynamically.
A
Anything that's loaded through JavaScript generally doesn't work, and so these are what I would call lower-fidelity approaches to web archiving. High-fidelity web archiving is attempting to archive exactly what you see and hear and do in the browser, and essentially to capture the interactive experience of websites while keeping them interactive.
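To make the limitation of link-following crawlers concrete, here is a minimal Python sketch. It is not how wget is implemented; it only assumes that a static crawler sees the links written literally in the HTML markup. The page content and URLs are made up for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href/src values that appear literally in the HTML markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


# Hypothetical page: two static links, plus one image that only exists
# after the inline script has run in a real browser.
PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <img src="/static/logo.png">
  <script>
    var img = document.createElement("img");
    img.src = "/api/" + "timeline.png";
    document.body.appendChild(img);
  </script>
</body></html>
"""


def static_links(html):
    """Return the links a crawler that does not run JavaScript can see."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


found = static_links(PAGE)
print(found)
```

The script-generated image URL never shows up in `found`, which is the gap that capturing the browser's actual network traffic is meant to close.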
A
And so, since modern websites are still made up of HTTP network requests, that's basically what we're attempting to capture.
A
So if we look, for example, at a site like Twitter, that is highly dynamic. If you look at the dev tools, we'll see that when we load Twitter, everything that's being loaded is being served over the network, and you can actually see the network requests coming in in DevTools. The browser already has all of this, obviously, in order to create the page.
A
What if we're able to capture all this network traffic and simply recreate it later? That's basically the idea with high-fidelity web archiving, and for that we have a browser extension that's available on archiveweb.page. So it's easy to remember: you can go to archiveweb.page and get it from there.
A
If you're using a Chromium-based browser, it'll take you to the Chrome Web Store. You can also download it as a desktop app and run it that way. What this extension does is essentially archive the exact network traffic that's being loaded in the browser. Before I show that, I'll very quickly cover the way that it works: it archives all the traffic via the Chrome debug protocol, which is what DevTools also uses, and it stores the data in the browser in IndexedDB.
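The capture-then-recreate idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the extension's code: the `Archive` class is a hypothetical in-memory stand-in for the IndexedDB storage, and the URLs are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    headers: dict
    body: bytes


@dataclass
class Archive:
    """Hypothetical in-memory stand-in for the browser-side storage."""

    records: dict = field(default_factory=dict)

    def record(self, method, url, response):
        # Capture: store each response under the request that produced it.
        self.records[(method.upper(), url)] = response

    def replay(self, method, url):
        # Replay: answer the same request from storage, never from the network.
        missing = Response(404, {}, b"not in archive")
        return self.records.get((method.upper(), url), missing)


archive = Archive()
archive.record(
    "GET",
    "https://example.com/api/timeline",
    Response(200, {"content-type": "application/json"}, b'{"tweets": []}'),
)

hit = archive.replay("get", "https://example.com/api/timeline")
miss = archive.replay("GET", "https://example.com/never-visited")
print(hit.status, miss.status)
```

Anything the browser requested during capture can be answered again later; anything it never requested cannot, which is why the archive reflects exactly what was loaded.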
A
And then it can serialize that data into a file format that's downloadable, and that format can really be stored anywhere, including IPFS.
A
And what you can do after you've stored that data, of course, is replay it, and replaying websites is actually even harder than capturing them. You have to rewrite the URLs and you really have to emulate the JavaScript environment. It's basically a mini Wayback Machine that's running entirely in your browser. I won't have time to cover all of that, but that's the idea behind this, and I'll go ahead and start a quick demo.
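As a rough illustration of the URL rewriting step mentioned above, here is a Python sketch in the style of Wayback-like replay URLs. The replay prefix, the timestamp format, and the regex-based attribute rewriting are all assumptions made for this example; a real replay system also has to handle JavaScript, CSS, and many other cases.

```python
import re
from urllib.parse import urljoin

# Hypothetical replay endpoint, chosen only for this example.
REPLAY_PREFIX = "https://replay.example/archive/"

# Matches simple double-quoted href/src attributes only.
ATTR_RE = re.compile(r'\b(href|src)="([^"]*)"')


def rewrite_url(url, page_url, timestamp):
    """Resolve url against the page it appeared on, then point it at the archive."""
    absolute = urljoin(page_url, url)
    return f"{REPLAY_PREFIX}{timestamp}/{absolute}"


def rewrite_html(html, page_url, timestamp):
    """Rewrite every href/src attribute so the page loads from the archive."""

    def repl(match):
        attr, value = match.group(1), match.group(2)
        return f'{attr}="{rewrite_url(value, page_url, timestamp)}"'

    return ATTR_RE.sub(repl, html)


rewritten = rewrite_html(
    '<img src="logo.png"><a href="/about">About</a>',
    "https://example.com/a/b",
    "20210101000000",
)
print(rewritten)
```

Without this step, an archived page would try to fetch its assets from the live site instead of from the archive.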
A
So let's say I'm on a Twitter page here, and I have this extension installed. I'll just create a new archive here, called demo2, and click Start, and you can see this size counter going up. That's all of the network traffic we just saw in DevTools being archived into the browser as more things are loading on this Twitter page.
A
You can see this size counter going up, and the extension tells you whether more requests are being loaded or if it's done. When it's green like this, that means it's no longer loading anything additional. I can scroll down, and then it'll start loading additional things and the size counter will go up. And since it's using the debug protocol, it tells me that ArchiveWeb.page is debugging the browser.
A
This is a Chromium-based security setting, so it's always there. I could also click on another page. Let's say I click on the Webrecorder home page; then it'll archive this page as well. Let's say I'm done archiving: I can click Stop, and then I can go to Browse Archive, and this shows me the two pages that I have just archived.
A
It'll probably stop at some point, since I only went that far. I can also click on this site. Oh well, maybe that didn't work because there's a redirect, but I can also click on it like this and load the home page this way. You'll also notice that when I look at this page, I'm logged in as myself.
A
So this is my view of twitter.com/webrecorder_io, just the Webrecorder Twitter page, but it's logged in as me, and so this is a unique view of the web. If someone else goes to this URL, they'll see something different, because it won't be logged in as me. This extension really allows you to archive exactly what you see in the browser, your own unique view of the web, which for most social media sites, or many sites, is entirely different for each individual.
A
What we also have is the sharing option. This is where I can go and click Start Sharing, and now this archive has been written to IPFS. I'll cover what that means, but why don't I go ahead and paste the link here into the chat, because it might take a little bit of time. So let's see here.
A
I'll just stop sharing for a second and paste this link. It might take a little bit for it to load, and in the meantime I'll go ahead and cover how it works.
A
So how is this data actually serialized to IPFS? We have a format called WACZ, which stands for Web Archive Collection Zipped. It's a ZIP-based format. Inside of it, it stores the data in another format called WARC, which is a standardized format created by the Internet Archive, and it also stores the raw indices into the data.
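A ZIP container holding the WARC data next to its indices can be sketched with Python's standard `zipfile` module. The member names below follow my understanding of the published WACZ layout and should be treated as assumptions, and the payloads are placeholders rather than real WARC or index records.

```python
import io
import json
import zipfile

# Assumed member names; the manifest content below is hypothetical.
WARC_PATH = "archive/data.warc.gz"
INDEX_PATH = "indexes/index.cdx"
PAGES_PATH = "pages/pages.jsonl"
MANIFEST_PATH = "datapackage.json"


def build_wacz_like(warc_bytes, index_text, pages):
    """Pack placeholder archive data and its index into one ZIP file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Stored without ZIP compression (an assumption in this sketch),
        # so byte offsets into the data stay directly addressable.
        zf.writestr(WARC_PATH, warc_bytes, compress_type=zipfile.ZIP_STORED)
        zf.writestr(INDEX_PATH, index_text)
        zf.writestr(PAGES_PATH, "\n".join(json.dumps(p) for p in pages))
        manifest = {"resources": [WARC_PATH, INDEX_PATH, PAGES_PATH]}
        zf.writestr(MANIFEST_PATH, json.dumps(manifest))
    return buf.getvalue()


def read_index_only(wacz_bytes):
    """Open the container and read just the index member."""
    with zipfile.ZipFile(io.BytesIO(wacz_bytes)) as zf:
        return zf.read(INDEX_PATH).decode("utf-8")


wacz = build_wacz_like(
    b"placeholder WARC records",
    "placeholder index line\n",
    [{"url": "https://example.com/", "title": "Example"}],
)
with zipfile.ZipFile(io.BytesIO(wacz)) as zf:
    names = zf.namelist()
print(names)
```

Because ZIP keeps a directory of its members, a reader can locate the index without first reading through the archived data itself.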
A
And so the idea is that, even if you have a large archive, you don't have to load everything all at once, and that's a key requirement for this to work. What's actually written to IPFS is four files: the index, a service worker sw.js, a UI file ui.js, and then the actual archive in this WACZ format, which is also stored as part of the multihash.
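The "don't load everything at once" requirement comes down to looking up an offset and length in an index and reading only those bytes. The Python sketch below assumes a simplified index that maps each URL to an (offset, length) pair; real indices carry more fields, and the records here are invented.

```python
import io


def build(records):
    """Concatenate payloads into one blob and index each by (offset, length)."""
    blob = io.BytesIO()
    index = {}
    for url, payload in records.items():
        offset = blob.tell()
        blob.write(payload)
        index[url] = (offset, len(payload))
    return blob.getvalue(), index


def read_range(f, offset, length):
    """Read exactly one record from a seekable file, skipping everything else."""
    f.seek(offset)
    return f.read(length)


def range_header(offset, length):
    """Equivalent HTTP Range header value; the end byte is inclusive."""
    return f"bytes={offset}-{offset + length - 1}"


data, index = build({
    "https://example.com/": b"<html>home</html>",
    "https://example.com/app.js": b"console.log(1)",
})

offset, length = index["https://example.com/app.js"]
record = read_range(io.BytesIO(data), offset, length)
header = range_header(offset, length)
print(record, header)
```

Over HTTP, the same lookup turns into one small range request per record, so the cost of loading a page does not grow with the size of the whole archive.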
A
And so what I just shared in the channel is basically a link to this multihash that was just created directly in the browser. I shared a link to an HTTP gateway. That's one way to load the data; there are actually multiple ways.
A
These are the sharing options.
A
This is basically the sharing menu in the extension. It allows you to get the multihash that was just shared, or get a shareable URL through ReplayWeb.page, which is another hosted site that will then use js-ipfs to load that hash. Or you could just get an HTTPS gateway link, which is perhaps the most compatible.
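The three sharing options can be pictured as three strings derived from the same hash. In the Python sketch below, the gateway host, the `source` query parameter on ReplayWeb.page, the `archive.wacz` file name, and the hash value itself are assumptions chosen for illustration, not values taken from the extension.

```python
from urllib.parse import quote

# Assumed public gateway and replay site; either could differ in practice.
GATEWAY = "https://ipfs.io/ipfs/"
REPLAY_SITE = "https://replayweb.page/"


def ipfs_uri(cid):
    """Native IPFS address of the shared directory."""
    return f"ipfs://{cid}/"


def gateway_url(cid):
    """Plain HTTPS link that works in browsers without IPFS support."""
    return f"{GATEWAY}{cid}/"


def replay_url(cid, wacz_name="archive.wacz"):
    """Link that asks the replay site to load the archive from IPFS.

    The 'source' parameter and the file name are assumptions in this sketch.
    """
    source = f"ipfs://{cid}/{wacz_name}"
    return f"{REPLAY_SITE}?source={quote(source, safe='')}"


cid = "bafyEXAMPLE"  # placeholder, not a real hash
links = {
    "ipfs": ipfs_uri(cid),
    "gateway": gateway_url(cid),
    "replay": replay_url(cid),
}
for kind, url in links.items():
    print(kind, url)
```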
A
That's the most compatible option, I would say, but not necessarily the fastest. Here at the top is the status. As you probably noticed, I was using the Brave browser, and the reason for that is that Brave has native support for go-ipfs, which is really great. That allows me to connect to the native go-ipfs daemon that has been started by Brave. Currently, the way this works is that I first had to enable it in Brave manually, and then I check which port the IPFS API server is running on. It's on one of these predefined ports, depending on whether it's a Brave nightly or a Brave production release.
A
Eventually, I would like it to be possible to determine this through an API. Brave does actually have this API, chrome.ipfs, for extensions, but currently it's only available to IPFS Companion. I'm working with Brave, and they're trying to make that more generically accessible to other extensions as well, which will make this a little bit simpler.
A
You could also share in the Electron app, which I didn't show. You can basically download ArchiveWeb.page as an app, which has essentially the same UI. This app starts a local js-ipfs daemon node, so it connects to a locally running instance, and that also generally works pretty well. Then, of course, the more general use case is if you're using this in Chrome, where there are still some issues to resolve.
A
Of course, since there are no direct p2p connections in Chrome, everything must be served over a WebSocket. There's no WebRTC either, because I'm using a service worker. The way that connecting to IPFS works in Chrome, or in any browser that doesn't have native support, is by connecting to a preload node, which then loads the entire hash.
A
The preload node essentially acts as a kind of proxy, and that can be problematic, especially if your archive is very large: I essentially have to sync everything that I just archived over that WebSocket. Also, the WACZ format is designed for random access, whereas preloading is not. So those are some of the current limitations.
A
One-minute warning to wrap up. I think some of the current limitations of the system are that the preloading isn't yet working as reliably as I would like it to. When loading over a gateway, there are occasionally timeouts, since it's making a lot of small range requests, and when using ReplayWeb.page, it also requires a preload server. Yeah, and that's basically it. Maybe I'll also quickly share another.
A
And yeah, so that's basically it. The idea is that you can create archives directly in your browser and then share them with others. If you're using Brave, it works really well; if you're using Chrome, hopefully we'll have it working better in the future. And so you can share your unique view of the