From YouTube: PGO Deep dive - special benchmarking meeting
B: Hi everyone. Over the last few weeks at Microsoft I've been studying the performance benefit we get from profile-guided optimization. This is the research I did and the results I've seen; I wanted to present it and hear what people think about it.

So let's start with the basics: what is PGO? PGO stands for profile-guided optimization. You first compile an instrumented binary, which means the compiler adds probes to the control-flow graph of the source code. Once you have the instrumented binary compiled, you run it through the training scenarios, and the instrumented binary records all that information and dumps it into profile data files. Once your training scenarios are done, you take those profile data files and recompile the binary to get the optimized binary.
What that does: from the profile information, the compiler knows which hot code paths were taken while the training scenarios were executed, and it optimizes the code for those paths. That's why it is an optimized binary compared to the normal binary.
Some of the benefits we get from PGO: it enhances program locality — basically, it knows which basic blocks jump to which, so it brings them closer to each other. There is also the benefit of virtual call speculation: if you have multiple derived classes and some virtual calls, it knows from the profile information which derived class was most frequently called into, and it can optimize for that. It also does function inlining and better register allocation, because it knows what kind of assignments you were doing while recording that scenario, and it learns the branch behavior for branch prediction.
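The virtual-call speculation mentioned above can be shown in miniature with a C function pointer (the same profiling applies to C++ virtual dispatch). Everything here — the function names and the guarded rewrite — is a hypothetical hand-written sketch of the transformation the compiler performs for you:

```c
typedef int (*op_fn)(int);

int add_one(int x) { return x + 1; }  /* dominant target during training */
int negate(int x)  { return -x; }     /* rarely seen target */

/* Before PGO: a plain indirect call -- opaque to the optimizer. */
int dispatch_plain(op_fn f, int x) {
    return f(x);
}

/* After PGO, conceptually: the profile says add_one is almost always
 * the target, so the compiler emits a guarded direct call that it can
 * inline, keeping the indirect call only as a fallback. */
int dispatch_speculated(op_fn f, int x) {
    if (f == add_one)
        return x + 1;   /* speculated, inlined fast path */
    return f(x);        /* cold fallback for any other target */
}
```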
B: So, for example, take a look at this graph. Say the flow goes from A to B once; then there is a condition where one side is taken just once but the other side goes into block C a hundred times; C has another condition where ninety times it goes through the right flow and ten times through the left flow; and then from E it goes to F ninety times.

Normally, in the normal binary, the basic blocks will be arranged in source order — A B C D E F, just whatever the compiler has seen. But with profile-guided optimization, it knows the path C-E-F was taken more than C-D-F, so it moves block E closer to C. Your jumps are then not far jumps but near jumps, and the infrequently executed blocks get moved down — for example, block D has been moved down. That's one example of how it benefits from the information it records; there's a lot of other information gathered, like what I called out on the previous slide, and using all of that it produces the optimized binary.
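The layout decision in this A-to-F example has a manual counterpart: GCC's __builtin_expect lets you state the 90/10 bias by hand, and the compiler then keeps the likely successor on the fall-through path — exactly what PGO infers automatically from the recorded counts. The function and threshold below are invented for illustration:

```c
/* PGO would learn that this branch is taken ~90% of the time and lay
 * out the hot successor (the "block E" of the example) right after the
 * test ("block C"), so the common case is a short fall-through jump. */
int classify(int reading) {
    if (__builtin_expect(reading < 90, 1)) {
        return 1;   /* hot path, kept near its predecessor */
    }
    return 0;       /* cold path ("block D"), moved out of line */
}
```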
B: There are also some case studies. ASP.NET Core recently published a blog where they used a PGO binary and saw five to ten percent improvements in the startup of, for example, the MusicStore app. Chromium posted a blog recently talking about building Chromium with PGO binaries; Mozilla has had that for a while, and Python does it too.
B: So with that, I wanted to experiment with how beneficial it is for Node. Basically, I chose a training set: the Acme Air benchmark, the TechEmpower benchmark, the core benchmarks that Node has, and the top ten node modules that most other modules depend on. I've included a link to where I took this list from, and the right-hand side shows the ten modules I used. I just ran the unit tests of those ten modules — and obviously npm install to install them. That was my training set. I produced a PGO binary using this training set and then measured the performance improvement on Acme Air, TechEmpower, and the core benchmarks.
B: This is the result I saw, based on a Node build as of January 3rd. On Acme Air I saw about a five percent improvement; on TechEmpower, seven percent; and there are a lot of improvements in the core benchmarks. I have the details of the individual benchmark improvements listed in my GitHub issue, but to summarize: buffer has 25 to 50 percent improvement, querystring up to 30 percent, timers have up to 20 percent, and HTTP around 15 percent. And that's without hand-coding anything — just running the training scenarios and recompiling the Node binary with that training data.
B: I didn't have to do anything else.

A: Of all these benchmarks and improvements — were there any that went down?
B: No — at least on this slide. Basically, whatever training scenarios I used, I measured the performance on those same benchmarks, so it's very likely I wouldn't have seen any regression on those benchmarks. But it is definitely possible that some other scenarios might have regressed.
B: And that's a point I'll talk about in the challenges section. But yeah — whatever training scenarios you run, those are the benchmarks that see the improvements. The other experiment I did was to use just Acme Air and TechEmpower as the training set, and there I saw a fourteen percent improvement in TechEmpower, versus the seven percent I saw previously. So it depends on what training set you've used.
B: So the challenges with this PGO approach: you need to choose the right training set, which means you want to make sure that you have improved the common scenarios without regressing the uncommon scenarios much. That's definitely a trade-off, but you want the most common scenarios to benefit from the PGO binaries.
The other challenge is that you need robust automation for executing the training scenarios, because if we want to do this in CI, what we are saying is: you build Node, you execute the training scenarios, and then you rebuild Node — and that's where you get the PGO binary. So you want to make sure all the training scenarios run without failures, that there is recovery if there are failures, and that the profile data files are not missed when we execute the training set. That's one challenge. Another is that the profile data files are not shareable across architectures, platforms, or builds, so you can't take just one profile data file and rebuild the optimized binary from it on different architectures. You have to do this exercise separately for every platform.
B: We'd have to rebuild per platform. I haven't tried it across architectures — I can give it a try — but as far as I know, and from what I've read, you need to make sure the PGC files are generated separately. Part of the reason is that there are multiple files, at least on Windows: multiple files are produced once you build the instrumented binary, and those files are used when you build the optimized binary. So the first build has to happen on the same matrix as the second. And that brings me to the last challenge, which is that the build time will increase: it will be roughly double, plus the time spent running the training set.
B: So the downsides are: you have to build twice, you need to find the right training set, and there has to be some way to evaluate the side effects on other scenarios. I just wanted to get the community's thoughts on how we can address those side effects and downsides, and what people think of shipping a binary like this.
A: What would be interesting for me, anyway: if we trained it on Acme Air only, do we actually see degradation in the micro-benchmarks? Or, if we trained it on the micro-benchmarks, do we see degradations in Acme Air? It would be worth trying that exercise, because if we can't find a case where it makes other things worse, then it makes you a little bit less worried about that, right?
A: That would be an interesting data point. The other one is the build front — there'd be a significant challenge there, I think, because I don't think doubling the build time is going to fly with people for regression testing. At the same time, that's all the testing we have when we go to do a release.
A: Right — the thing is, the current CI jobs for releases don't actually run any tests, but they do build again. I believe what we do is build and run the tests against a particular commit in the regular CI, and once we say everything's good, we run the release job, which just creates the binaries from that tag.
B: Yeah, right — definitely the build time is doubled, plus the time for the training scenarios. But again, that depends: if we just trained with Acme Air and saw significant improvements in the micro-benchmarks, then it might be worth even a try, right?
A
Yeah
I
yeah
I'm
just
trying
to
think
how
to
like
I
think
first
yeah
we'd
have
to
see.
Is
there
I
mean
I,
don't
know
if
there's
any
way
to
really
mitigate
that
doubling
of
the
time
other
than
did
like
it's
too
bad.
You
couldn't
do
the
training
once
and
then
reuse
that
training
data,
even
if
it
wasn't
quite
as
good
right
yeah,
but
it
doesn't
sound
like
you
can
do
that
right.
Right.
A: I'm wondering — thinking along what Gareth is mentioning there — the first step might be to try to integrate into the makefile a target which would basically build. I guess the thing is, it would have to build, run the tests, run the training scenarios, and then —
B: We run the benchmarks, and then you set a flag saying: okay, now I have run the benchmarks, it is time to generate the optimized binary. That flag tells the build to use the profile data files to produce the binary. As for where those profile data files get generated: they land in the same directory where node.exe is.

A: Okay — so those could be copied, then?

B: Yes.
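The two-phase make targets being discussed could look roughly like this; the target names and the flag plumbing are hypothetical, not something in Node's actual Makefile:

```make
# Hypothetical sketch -- not Node's real Makefile.
pgo-instrument:
	./configure
	$(MAKE) CFLAGS="-fprofile-generate" CXXFLAGS="-fprofile-generate"
	$(MAKE) test benchmark   # training scenarios; dump the profile data files

pgo-use:
	$(MAKE) CFLAGS="-fprofile-use" CXXFLAGS="-fprofile-use"
```

On Windows the same two phases would map onto MSVC's instrumented and optimized link steps instead of the gcc flags shown here.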
A: Those two extra targets might be useful. And I don't think it needs to happen at the same time — we could separate it. Say a new release gets published at a particular hash; you could then envision a job that goes off in the background and says: ah, there's a new release, I'm going to build it in this PGO-enabled mode, I'm then going to run this set of benchmarks, and —
A: — then do make dist in this particular configuration. Now, the wrinkle with that is, we do have our release machines separated out, and I don't think we'd want to be running all the benchmarks and that kind of stuff on those machines — there just aren't as many of them. It might be something like we would run them on the regular farm instead.
A: But for example, on Linux we may test across Fedora and a bunch of other distributions, but we build on CentOS. I think we do at least have some test machines at the earlier levels as well; I just know our release machines are at the earliest levels, so that we actually get support across the different versions.
B: Yeah — definitely the builds have to be compatible. For example, if you generate the profile data files with one version of Visual Studio, I don't think you'd be able to build the optimized binary with a different one. But moving the files between machines is definitely possible, and that's how I did it.
A: Yeah, okay. So there'd be a bunch of wrinkles in trying to get that process in place. I just can't see us integrating it into our regression tests, and even getting it into our standard releases is going to be harder. But there might be an alternate flow where there's a second set of binaries — the PGO-enabled binaries. There'd be a bunch of work related to that, but I think the first step — actually making it easy, with make targets, to do the steps — isn't a bad thing, because then if somebody wants to try it out, they can. And like Gareth said, if you want to build Node yourself and optimize it for your app, that's not a bad thing. And then once that's there —
A: — if we have enough people willing to actively work on it, we could work through the steps of an offline sort of flow that generates builds people could at least try out to start with, like: here's the optimized binary versus the unoptimized one.
B: Yeah — I'm experimenting with some other partners within Microsoft that use Node. I recently shared with them the binary I optimized, so they can just try it for their app. I haven't heard back from them yet, but that was just yesterday.
D: I think there's probably still quite a bit to learn, especially if we look at the sort of optimizations that are being applied. It could point out areas of Node that aren't normally optimized very well, and perhaps code changes could be made to exploit those same optimizations in a normal binary, if there is something that we notice.
B: While building, it does call out that it has optimized X number of functions for speed, but it doesn't say exactly which functions. I can dig through and see if I can get that information out, and then we could hand-optimize those functions.
A: That way, like Gareth was saying, if somebody wants to build it themselves, they can turn on a few options to do that. That sounds good — and we would want those options anyway if we were ever going to use this in the regular binary production, right?
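For anyone who does want to build an optimized Node themselves: on gcc toolchains, Node's configure script has since grown switches for exactly this two-phase flow (--enable-pgo-generate and --enable-pgo-use). The training command below is only a placeholder — you would substitute your own workload:

```shell
./configure --enable-pgo-generate   # phase 1: instrumented build
make -j4
./node benchmark/run.js http        # placeholder training scenario
./configure --enable-pgo-use        # phase 2: rebuild using the profile
make -j4
```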
A: So it sounds like there are some decent next steps. Any other things we should talk about or discuss?
B: That's about it. This was just a first step — to see what people think about it, based on the results that I've seen. From what I hear, it's a good thing to try out as a next step, and the main pain points, I guess, are finding the right training set and the extra build time.
A: I think if we were completely convinced that it's always going to give us, say, a ten percent win, then you get to the question of how much effort we'd have to go through to actually produce binaries that give us that win. I think the flow I just walked through, where it's basically an offline process —
B: — is the way to go, as far as we can, right. I've been through that benchmark, and it is more realistic because it creates a lot of objects, versus Acme Air, which doesn't create many. So that's definitely a realistic test. If we can see improvements with PGO on those benchmarks and also on the micro-benchmarks, then it's worth pursuing all of those, I guess.
A: I think so, yeah. Long term it sounds good — ten percent is enough benefit that it's worthwhile; it's just work to get there as well. We need to build out our benchmarks, but if you have time to invest in it, it seems worthwhile.
A: Okay, sounds good. So for the viewers: what I'm going to do is look on node-dev — if you go to node-dev and ask a question, I'll check to see if there are any there. It'll just take me a second to log in.
A: I don't see any questions there yet, but we'll give people a minute or two in case. For anyone wondering what that is: it's the IRC channel, node-dev on IRC. That's one way people can ask us questions. I'm just trying to think if there's any other way — I don't know if there's a way for people to ask questions through YouTube.
A: There is a YouTube live chat which I've opened, though I don't know if people can access that directly from the outside. That's another way you could ask a question. Could we share a link below? I'm not sure it's just a link — basically, you have to join node-dev.