From YouTube: Gareth Ellis - Node.js Live London
Description
Gareth Ellis, a runtime performance analyst at IBM and a member of the Node.js benchmarking working group, provides a brief introduction to benchmarking and performance testing, the different approaches you can take to performance testing, what to do if you identify a regression, and the tools that you can use for benchmarking. He also provides an overview of what is happening in the Node.js benchmarking working group.
So what we're going to talk about today, then: I'm going to start off with a brief introduction to benchmarking and performance testing, look at some of the key challenges and the different approaches that you can take to performance testing, and then what to do if you identify a regression and how to find out what it is.
So, an introduction to benchmarking, then. One of the most important things when benchmarking or performance testing is to change one thing and one thing only between the different runs and the different things that you're comparing. Typically, the thing that you change will be whatever it is you're wanting to check the performance of: if you're wanting to check whether the latest version of your application code performs the same as or better than previous versions, you'll just be comparing the old version of the code with the new one.
But if you've changed more than one thing and something does go wrong, it's going to be very difficult to try and work out what it is that's causing the issue. It's worth mentioning that performance testing is quite different from functional testing. In functional testing, typically you'll run something a number of times and, if it works, great. With performance testing, there is no single answer that you'll get out of your benchmark, and this is one of the key challenges.
So no matter how many times you run your benchmark, chances are each time you run it you'll get a slightly different answer. Whether you're measuring, say, startup time, the chance of it starting in the same number of milliseconds each time is fairly small. So one of the things that you need to be aware of is this sort of fundamental run-to-run variance.
This can sometimes lead to false positives if you've got a theory about something that's going to happen. You might think, oh, my new application code is definitely going to run faster. You might just do one run of your old one: okay, that took 200 milliseconds. Then one run of your new one: that takes 190. Brilliant, it's faster, job done. But typically you'll get a large range of results, so it's very important to ensure that you run your benchmark a good number of times to give you an idea of the sort of expected variance in your scores.
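To give a feel for that, here is a minimal sketch, assuming a made-up bench() workload and a run count of ten, that repeats a measurement within one process and prints the spread you actually saw (for real comparisons you would run the whole process fresh each time, as described below):

'use strict';

// Hypothetical workload; swap in whatever you actually want to measure.
function bench() {
  const start = process.hrtime();
  for (let i = 0; i < 1e6; i++) Buffer.from([1, 2, 3, 4]);
  const [s, ns] = process.hrtime(start);
  return s * 1e3 + ns / 1e6; // elapsed milliseconds
}

const results = [];
for (let i = 0; i < 10; i++) results.push(bench());

const min = Math.min(...results);
const max = Math.max(...results);
const mean = results.reduce((a, b) => a + b, 0) / results.length;
console.log(`min ${min.toFixed(1)} ms, max ${max.toFixed(1)} ms, mean ${mean.toFixed(1)} ms`);
console.log(`spread ${(((max - min) / min) * 100).toFixed(1)}%`);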
It's also worth mentioning that if you go and run it, say, ten times and you find you've got a fifteen percent difference between the lowest and the highest number, that could well mean that you would find it difficult to measure, say, a regression or an improvement of five or ten percent. To be able to measure that sort of thing you'd need to run it a good number of times. Something that can help to reduce the variance is making sure you've got a consistent environment.
So, each time you run your benchmark or test your application code, it's good to try and have the machine that you're running it on in the same state. One of the things that we do is try to reboot our machines before each run. The longer a machine has been up, the more you'll find that the performance may change slightly, and whilst it might seem quite good to keep things consistent by staying on the same boot, if you then have to reboot your machine, say you've installed a kernel update or something like that, you're back in a state that is going to be very difficult to compare with the state you think you're normally in. So we found that rebooting the machine before each run at least gets us back to a position that we can easily recreate.
Something else is making sure that the machine is isolated from outside interference. That could be making sure your co-workers aren't logging into the machine that you're doing your testing on and running things that take away all the CPU, or, if your benchmark is also using the network, having a private network so you can make sure that somebody else isn't transferring a big file across the network and affecting your scores.
Something else that you can try is interleaving the runs of the two things that you're trying to compare. That would be doing one iteration of your, say, good build and one iteration of the build that you're testing, and alternating between the good build and the one that you're testing.
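A minimal sketch of that interleaving, assuming two node builds at made-up paths and a bench.js that prints a score:

'use strict';
const { execSync } = require('child_process');

// Hypothetical paths to the "good" build and the build under test.
const builds = [
  ['good', '/opt/node-good/bin/node'],
  ['test', '/opt/node-test/bin/node']
];

for (let i = 0; i < 10; i++) {
  // Alternate between the two builds on every iteration rather than
  // running all of one and then all of the other.
  for (const [name, bin] of builds) {
    const score = execSync(`${bin} bench.js`).toString().trim();
    console.log(`iteration ${i}, ${name}: ${score}`);
  }
}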
The final key challenge is jumping to conclusions. It's very easy, as I said before, to say: oh, I think my recent code change is going to improve startup time.
There are two different approaches you can take towards benchmarking, the first one being micro benchmarks. These can be quite good for measuring a specific function or API change, for example creating a new buffer, and they're good for comparing key characteristics. However, there are some downsides to micro benchmarking as well.
One is that you risk not measuring exactly what you think you're measuring. V8 has an optimizing compiler, and it's looking for ways to improve your code to make it run faster. It may notice that some expensive operation, for example assigning a variable from the result of some function, produces a value that you never actually do anything with, so it might just take that work away completely. You think, brilliant, it's really, really fast, but actually you're not testing what you think you are.
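A sketch of that pitfall, with a made-up expensive() function: in the first loop the result is never used, so the optimizer is free to throw the work away; keeping a value that depends on every result is one simple guard.

'use strict';

function expensive(n) {
  let total = 0;
  for (let i = 0; i < n; i++) total += Math.sqrt(i);
  return total;
}

// Naive timing: nothing uses the result, so the optimizing compiler may
// eliminate much of the work and the number you get back means very little.
let start = process.hrtime();
for (let i = 0; i < 1e4; i++) expensive(1e4);
console.log('unused result:', process.hrtime(start));

// Guarded timing: print something derived from every result so the work
// cannot simply be dropped.
let sink = 0;
start = process.hrtime();
for (let i = 0; i < 1e4; i++) sink += expensive(1e4);
console.log('result kept:', process.hrtime(start), sink);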
The other approach towards benchmarking is whole system benchmarking, so benchmarking perhaps an expected customer use case or a larger application. Something that we use in the community is Acme Air, which is a fictional airline company, and users can simulate creating themselves an account, booking flights, checking in, all that sort of stuff. This does have downsides as well: the more you test and the more code is involved, the more scope there is for variability. So, you've run a good number of tests and you think you've found a regression. What do you do now?
The first thing is to check: are you sure? Have you definitely not missed something, and actually measured somebody logging in and running something that's eating away at your CPU? Have a look at the expected variance: if you've got a variance of ten percent and you think you've found a one percent regression, you need to be quite sure, otherwise it's going to be very difficult, as you go through trying to make changes, to detect whether you've fixed it or not. If you're sure there is a regression, then you can have a look at what's changed.
Is it your application code? Is it node? Is it that you've moved to a different machine? There are a few different things that you can use to try and work out what the cause of the regression is. There are various tools, which we're going to look at in a second; another option would be just doing a binary search of the code changes between your good and your bad build, which is useful in some situations, maybe not so useful in others.
So, if we're looking at Node.js, we need to understand that there are a lot of places a regression could come from. It could be perhaps a change to some of the native JavaScript libraries. It could be that you've just upgraded to a new version of node that has pulled in a new version of V8.
It could also be that the new version has pulled in an OpenSSL security fix, which can sometimes cause performance regressions because you may not have been doing everything that you were supposed to be doing before. It could be a libuv update. It could be that you've pulled in a new dependency or an updated module that has had an adverse effect on your performance. There are some different tools that we can use. We could use a JavaScript profiler: there's one built into node through V8, and there are also other external packages.
So, as an example, this is a micro benchmark that we've been running at IBM, and it simulates creating a buffer from an array of numbers. We go and repeat this operation a large number of times and then run it through a test harness, which keeps going either until it gets good quality data or until it hits the maximum number of iterations that we've defined.
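A stripped-down sketch of that kind of benchmark (the real harness, iteration counts and quality checks are not shown; Buffer.from here stands in for however the buffer is actually created):

'use strict';

// Build an array of numbers to turn into a buffer.
const input = [];
for (let i = 0; i < 1024; i++) input.push(i % 256);

// Repeat the operation a large number of times and report the elapsed time.
const iterations = 1e5;
const start = process.hrtime();
for (let i = 0; i < iterations; i++) Buffer.from(input);
const [s, ns] = process.hrtime(start);
console.log(`${iterations} iterations in ${(s * 1e3 + ns / 1e6).toFixed(1)} ms`);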
You can run that by passing --prof on your node command line. It will then go and generate a file called isolate-, then some hex, then -v8.log in your current directory. You can then use the post-processor, which is built into node, by passing --prof-process and then this isolate log that it's created, and it will give you a load of different bits of output and metrics. appmetrics, as I said before, is another option, installable from npm.
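A minimal sketch of that workflow; bench.js below is just a stand-in workload, and the exact hex in the log file name will differ on your machine:

// 1. Run with the V8 profiler enabled:
//      node --prof bench.js
//    This writes a file such as isolate-0x<hex>-v8.log into the current directory.
// 2. Post-process it with the tool built into node:
//      node --prof-process isolate-0x*-v8.log > profile.txt
//
// bench.js can be any workload, for example:
'use strict';
let sum = 0;
for (let i = 0; i < 1e6; i++) sum += Buffer.from([i & 0xff])[0];
console.log(sum);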
But if it's a massive change, for example you've just upgraded from node 0.8 to node 6, then there are going to be a lot of changes to go through to try and work out what's caused the regression. git bisect is an option which could help you, if the change that you're looking at is in git. So, before, I showed you the V8 profiler, --prof. The profile at the top here is from node 4.3.2, and we can see the hottest method.
It is this lazy compile of fromObject, and it gives us the line number in the native JavaScript buffer library. We can see that about 23.9 percent of the ticks are happening in this lazy compile; when we go to node 4.4, that jumps up to 47 percent, so that's certainly something that's worth looking at. The lazy compile is part of the compilation, so it's not necessarily a regression in fromObject itself.
It's the compilation that's taking a lot of the ticks. With perf, the system profiler, you can do a similar sort of thing. There's a massive number of options that you can pass, which you can get from the perf man pages, and by passing --perf-basic-prof on your node command line you supply perf with the V8 symbols, so it can match the JITted or compiled code to what it actually was. And again we go with this example.
We can see twenty-three percent of the time in node 4.3.2 and forty-six percent of the time in node 4.4, so we're seeing something to do with compilation. There are some extra options that we can pass in; again, these are V8 options that we can pass straight to node: tracing optimizations and also deoptimizations, --trace-opt and --trace-deopt. This is what happens when we go and do that on node 4.4: we can see, first of all, that the profiler spots that fromObject is a hot method.
It's being called a lot of times, so it's very hot and therefore worthy of being compiled, so V8 goes and compiles it using Crankshaft. It goes into some further optimizations, completes the optimization, and then straight away goes and deoptimizes it. So there's a good chance that this is what's using some of our time.
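As a rough sketch of how you would capture that kind of trace on your own code (bench.js is a placeholder, and the exact output format is V8's and changes between versions):

// Pass the V8 tracing flags straight through node:
//   node --trace-opt --trace-deopt bench.js
// The output then shows which functions get picked up by the optimizing
// compiler and which ones get deoptimized again.
'use strict';
function hot(arr) {
  let total = 0;
  for (let i = 0; i < arr.length; i++) total += arr[i];
  return total;
}
let sum = 0;
for (let i = 0; i < 1e6; i++) sum += hot([1, 2, 3, 4]);
console.log(sum);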
After an issue and a pull request, it turned out that the problem was that in node 4.4 we'd gone through and, in all the for loops, changed the step variable to be declared with let rather than var, which the current optimizing compiler in V8 has an issue with. This will be fixed when TurboFan becomes the default, but in node 4.4 it isn't the default.
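To illustrate the kind of change involved, this is a hedged sketch rather than the actual Node.js commit: the loop counters had effectively moved from the var form to the let form, and at the time Crankshaft could not optimize functions containing let, which is what showed up as the optimize-then-deoptimize churn.

// Before: loop counter declared with var.
function sumVar(arr) {
  var total = 0;
  for (var i = 0; i < arr.length; i++) total += arr[i];
  return total;
}

// After: loop counter declared with let. Functionally the same here, but
// Crankshaft (the optimizing compiler in V8 at the time) bailed out on it.
function sumLet(arr) {
  let total = 0;
  for (let i = 0; i < arr.length; i++) total += arr[i];
  return total;
}

console.log(sumVar([1, 2, 3]), sumLet([1, 2, 3]));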
So we reverted the change and got the performance back, which is good. The next bit I want to talk about, then, is what we've been doing in the node community benchmarking working group. The working group's goal, or mandate, is to track and evangelize performance gains between node releases. So we've been defining use cases: where is it that people have been using node, and what are the areas that we can be looking at, trying to get as many real-world examples of people using node as possible?
That way we can make sure we're looking in the right places for any performance regressions. We've identified some benchmarks, but we still have more to identify. Indeed, if you're aware of any, you're more than welcome to come and raise an issue, or even submit a pull request with some more benchmarks, so we can be running those in the community on regular builds and spot any regressions that may be coming in, which means that you don't end up deploying them in production and then having issues.
We've been running and capturing the results, so you can go to benchmarking.nodejs.org, where we've got a set of graphs of the current benchmarks, with results tracking node 0.12, node 4, node 6 and also the master branch. We've currently got 13 members and we have meetings roughly every month to month and a half, something like that.
Here are some of the use cases that we've defined so far. The first one would be back-end API services, so REST or REST-like, typically over HTTP on public infrastructure. The main focus there would be trying to ensure that you can get good performance over public infrastructure, where things such as latency and bandwidth may be a concern. Then there are service-oriented architectures.
The next one would be microservice-based applications, so nimble, low-resource, quick-startup apps. Typically these sorts of things may also use some different types of networking, perhaps UDP, to try and get stuff to happen as quickly as possible, and we want to make sure that we can track these so that we don't go and regress them in node. Then there's generating and serving dynamic web page content, so things such as Express, Hapi, Koa, React, all that sort of thing, very popular frameworks.
We need to make sure that we've got benchmarks that cover these, so we don't go checking in changes that could potentially affect lots and lots of users. Then there are single page applications, which is typically where the main GUI of an application is served via an HTTP request and then further updates are done over either WebSockets or HTTP/2. And finally, agents and data collectors.
For all of those use cases there's a number of metrics that we'd probably be interested in looking at: consistently low latency, the ability to support high concurrency, high throughput, fast startup, shutdown and restart times, and also low resource usage. As for benchmarks that we've currently got running in the community, we've got some that are tracking startup time, we also look at the footprint of a small process and the time to require modules, which is something that lots of people are going to be hitting, and we have Acme Air running, which measures throughput.
We look at the response time and also at footprint measurements whilst the application is running. It's all well and good if it's very small at startup, but once the application gets going most people are going to have quite a bit of load applied to it, so we want to make sure that it's not growing out of control. We've also recently checked in a Dockerfile, so you can go and build your own Docker image, which will compare two versions of node and then throw out a comparison at the end.
I'd encourage you to have a look at that. We've got a number of other benchmarks in progress as well, looking at the performance of URL, and also trying to run the benchmarks which are in the Node.js source and actually graph some of the results from those. As I mentioned before, at benchmarking.nodejs.org we've got lots of graphs like this.
This one is a graph where higher is better, so we can see there at the top that the purple and the blue, which are node 6 and node master, are faster than previous releases, which is what we want to see. There are lots more graphs on benchmarking.nodejs.org, so I'd encourage you to go and have a look there. And then finally, how you can get involved: as I mentioned before, github.com/nodejs/benchmarking, go and have a look at what's going on.