From YouTube: Large scale Hybrid Quantum Workflows with PennyLane
Lee J. O'Riordan (Xanadu)
So: large-scale hybrid quantum workflows with PennyLane. The general idea is that, first of all, I'll introduce everybody to what PennyLane is and our goals for the project and the library; then discuss a little bit about integrating with HPC tooling and software; and then move on to the results we've had from using Perlmutter over the past year.
So, great. I'll just give a quick introduction to PennyLane first. PennyLane itself is, in our eyes, the best way of allowing a researcher to build state-of-the-art hybrid, device-agnostic quantum algorithms. One of its really powerful points is that PennyLane allows you to incorporate multiple different types of devices in a single workflow.
Another thing I should say is that we work with a lot of people in a lot of places. On the hardware side, we're building our own hardware at Xanadu; on the software side, we partner with a lot of different organizations and companies to add support to PennyLane and to make use of PennyLane; and the same holds from the applications point of view as well.
We do a lot of development and research with other organizations, and so the goal of all this is to support quantum programming on essentially any platform.
Okay, so we really want to make sure that no matter what hardware you have available, you should be able to build an example within PennyLane and use PennyLane for your hardware access, whether that be integrated photonics for our own stack, superconducting qubits, simulators, or trapped ions.
We want to make sure you can also integrate your workflows into machine-learning tools like Torch, TensorFlow, and JAX, as well as make use of HPC platforms and cloud-native platforms. We do this with a composable hybrid quantum-classical design philosophy, treating quantum circuits themselves as functions.
So if you have some type of quantum problem that evaluates and feeds results to some classical neural network, or to some other part of a pipeline, which then offloads to another quantum device, we want to make that as seamless as possible with PennyLane. We do this with some help from what we call quantum nodes, or QNodes. This is where the integration comes in between quantum computers and classical scientific libraries within PennyLane.
If you have made use of TensorFlow, JAX, or Torch in the past, you'll notice that they each have their own native tensor-like objects, and each has its own way of tracking gradients, tracking the operations through a classical machine-learning model. So, to make sure we have a nice integration with the quantum circuits, we unwrap these tensors, feed them into our quantum circuits in the most efficient way, and then convert the output of a quantum circuit back into a tensor again.
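That unwrap-and-rewrap step can be sketched in plain Python. This is a conceptual stand-in, not PennyLane's actual implementation: a classical function plays the role of the quantum circuit, and the helper names are mine.

```python
import math

def unwrap(params):
    # Convert framework-native tensor-like objects into plain floats.
    # Here we accept anything float()-convertible, or a list of such objects.
    if isinstance(params, (list, tuple)):
        return [float(p) for p in params]
    return [float(params)]

def circuit(params):
    # Stand-in for a quantum circuit evaluation: returns a single
    # expectation-value-like number from the unwrapped parameters.
    return math.cos(sum(params))

def qnode(params):
    # The QNode pattern: unwrap the tensors, run the circuit, and hand a
    # plain numeric result back for the ML framework to re-wrap.
    return circuit(unwrap(params))

print(qnode([0.0, 0.0]))  # cos(0) = 1.0
```

In the real thing, the re-wrapping also restores the framework's gradient tracking, which is what lets the circuit sit inside a larger differentiable model.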
As far as what goes on inside the QNode, the classical machine-learning frameworks don't care: within the QNode itself it can be a simulated quantum circuit, it can be hardware that the parameters are passed on to in order to build a circuit and do evaluations, or some kind of hybrid combination. Okay, so just a quick overview as well: there are plenty of examples to go around. PennyLane has put education at the forefront, for both upcoming and cutting-edge research.
We build examples from papers all the time, and if you're looking to learn more I would suggest visiting the pennylane.ai/qml website. We don't stick to just our own simulators: there are non-Xanadu devices there, and we have GPU simulations, CPU simulations, everything you might want to see.
We expect the development of quantum computers, and of software for quantum computers, to be very tightly integrated with HPC platforms, so we started by focusing on ensuring PennyLane has, I guess, the suitability and tooling native to the HPC space. Right now there's a lot of focus, at least, on integrating with CUDA and especially cuQuantum from NVIDIA, to make sure we can take advantage of those A100s on Perlmutter in the most efficient ways.
We also have native C++-backed simulators, which offload with OpenMP to make sure we always run well on the given hardware, as well as new support for Kokkos, which I'll discuss briefly in a following slide. Some of the other work we're doing: obviously I mentioned the machine-learning framework integration, so PyTorch, TensorFlow, JAX.
These are natively supported with PennyLane, and you can easily build a hybrid quantum job that will work within a given workflow for these platforms. Last on the list, we're discussing distributed workloads, not with MPI this time, but with Ray and Dask. If anyone has ever played with the likes of Ray or Dask, these are great tools for task-based computation.
They natively support distribution of workloads, and we've had great success with that; I'll come back to Ray as well when I discuss the results on Perlmutter. Just to give a quick overview of our software suite: this is the busiest slide in the deck, and I'm sorry for all the words, but it's just easy to have it all in one place.
We want to make sure that PennyLane runs on everything, and we want to make sure that PennyLane runs fast on everything. To start with, we built a device we call lightning.qubit, which is a modern C++20 codebase. The idea is that if you have a modern compiler it will run natively, and we support batching of observables with gradients.
One of the things with PennyLane is that we want gradients supported natively, and you're going to be calculating gradients with respect to observables in your circuit. So we can independently batch these observables over OpenMP threads on a given CPU system. One of the things we added recently is automatically dispatched SIMD kernels.
In this case, if you have support for AVX-512 on your hardware, or AVX2, or even just AVX, we can query which instruction sets your hardware supports, and our internal dispatcher chooses the set of gate operations that will be the most performant on your system. The nice thing here is that you don't actually need to compile any of this from scratch to make it work: you just pip install pennylane, and you get this on Windows, on Mac, on Linux. This is our bare-bones device, and it was the fastest simulator we had built up until a few months back.
Obviously, recently we've been focusing a lot on lightning.gpu, which is our cuQuantum-backed simulator. With this we're getting the best performance on NVIDIA hardware, but we've also implemented native GPU support for what's called the adjoint method of gradient evaluation. This is a way of efficiently evaluating quantum-circuit gradients in, I guess, the most performant manner for classical hardware, where you have actual access to a state vector. And last but not least, we have multi-GPU support for the batching of these observable gradients as part of this device as well.
If you want to install this, assuming you're running on a Linux machine, you pip install pennylane-lightning-gpu and pip install cuquantum, and you're ready to go; again, no compilation needed. And lastly, the Kokkos device is one that I want to draw a little bit of focus to.
This is something we've put together recently, because we want to make sure we can support pretty much any accelerator that's available on the market right now, as well as those coming out over the next year or two, and it automatically multi-threads our gate kernels for us.
That can run over OpenMP or, previously, C++ threads, depending on how you compile it; we also support CUDA natively, HIP and ROCm if you want to compile for AMD GPUs, and SYCL if you want to compile for a SYCL-supported platform. Okay, so why all of these tools? The goal: variational quantum optimization problems, with a little bit of a detour as well into something called circuit cutting.
So let's focus a little bit on variational algorithms and their gradients. The general idea with a variational algorithm, or a parametric quantum circuit if you will, is that you have some set of parameters and some quantum circuit which will accept those parameters, and you can treat this effectively as a function: a black-box function where your function is your circuit.
Your parameters are passed in, and the output from your quantum circuit will differ depending on the incoming parameters and the algorithm you're putting together. One of the big things we support in PennyLane, as I mentioned, is native gradient support. We want to make sure that the parameters we're passing in can be updated based on some cost function, or some relative gradients, that we're interested in evaluating. This allows us to effectively navigate a potential landscape and find some type of solution to a given problem that is of interest to us.
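As a toy illustration of navigating that landscape, here is gradient descent on the cost cos(θ), the ⟨Z⟩ expectation after a single RX(θ) rotation on |0⟩. This is a minimal stand-alone sketch using the analytic gradient, not a real PennyLane workflow.

```python
import math

def cost(theta):
    # <Z> after RX(theta) on |0> is cos(theta): a one-parameter landscape.
    return math.cos(theta)

def grad(theta):
    # Analytic gradient of the cost: d/dtheta cos(theta) = -sin(theta).
    return -math.sin(theta)

theta, lr = 0.5, 0.4
for _ in range(100):
    theta -= lr * grad(theta)  # standard gradient-descent update

# The minimum of cos(theta) is at theta = pi, where the cost is -1.
print(theta, cost(theta))
```

In a real variational workflow, the only change is that cost and grad come from circuit executions rather than closed-form expressions.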
Next on the list: quantum circuits are natively differentiable, provided they're parametric, and this is easily supported in PennyLane right out of the gate. Two methods are, I guess, most prominently known. One is finite differences, which we're all familiar with from even classical types of problems. But in the quantum world we also have the parameter-shift rule, a way of saying we can build gradients from multiple executions of quantum circuits, and we know the scaling for these methods.
For n parameters passed in, we need to evaluate many circuits (with the parameter-shift rule, typically two circuit evaluations per parameter), and this can cause a problem depending on the type of workload you're trying to put together.
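The two gradient recipes just mentioned can be compared on the same toy cost cos(θ). For a gate generated by a Pauli rotation, the parameter-shift rule with shifts of ±π/2 is exact, while finite differences is an approximation; this is a minimal stand-alone sketch, not PennyLane's implementation.

```python
import math

def cost(theta):
    # Expectation <Z> after RX(theta) on |0>: cos(theta).
    return math.cos(theta)

def parameter_shift(f, theta):
    # Two circuit evaluations per parameter, shifted by +/- pi/2.
    s = math.pi / 2
    return (f(theta + s) - f(theta - s)) / 2

def finite_difference(f, theta, h=1e-6):
    # Classical central finite difference, also two evaluations.
    return (f(theta + h) - f(theta - h)) / (2 * h)

theta = 0.7
print(parameter_shift(cost, theta))    # matches -sin(0.7) exactly
print(finite_difference(cost, theta))  # approximates -sin(0.7)
```

Both recipes cost two evaluations per parameter here, which is where the "lots of parameters means lots of circuits" scaling problem comes from.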
Okay, a quick detour for a moment.
I'm going to talk about circuit cutting for the next minute or so, and just say that a tensor network is a quantum circuit, and a quantum circuit is a tensor network: these things are interchangeable.
Depending on how you formulate your problem, you can always convert a tensor network into a quantum circuit, or a quantum circuit into a tensor network, provided you choose appropriate operations that are supported by both paradigms, and in this regime you can apply methods that work on tensor networks to native state-vector simulation and quantum circuits.
One of the operations we can perform on a tensor network is cutting the indices connecting the tensors, or cutting the gates (which would be the tensors) themselves. So we can effectively break these components apart and evaluate them independently of one another, and we can actually do this to quantum circuits too. Take, for example, a 60-qubit quantum circuit.
Assuming that we have an appropriately built problem, we can decompose it into a large number of smaller circuits that can potentially fit on the available hardware. And with physical quantum systems coming online, and a certain number of logical qubits that will be available on those systems, we kind of expect there will be limits to what we can run on the hardware as it becomes available.
So the idea with this is to be able to break a problem down into bite-sized chunks that we can run on quantum hardware, and by taking these methods we can break circuits up.
We can stitch them back together classically after we evaluate all of their individual components, the nice thing being that we can evaluate all of these cuts independently from one another. You can then ask: can we farm these out to combinations of CPUs, GPUs, and QPUs? The answer is yes.
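The classical stitching step rests on a standard identity: the identity channel on the cut wire can be expanded in an operator basis (here, the Paulis), so the full expectation value becomes a sum of products of fragment expectation values that can each be evaluated on its own. Schematically, for a single cut wire:

```latex
% Resolving the identity channel on the cut wire in the Pauli basis:
\mathrm{Id}(\rho) \;=\; \frac{1}{2} \sum_{O \,\in\, \{I, X, Y, Z\}} \operatorname{Tr}\!\left[ O \rho \right] O
```

Each trace term becomes a measurement appended to the upstream fragment, and each basis operator O becomes a state preparation prepended to the downstream fragment, which is exactly why all of the resulting sub-circuit evaluations are independent and can be farmed out.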
Obviously, that is one of the nice things we've shown with this work, but first I want to take another quick detour into an example of how you can do this in PennyLane.
So here is our quantum circuit. What I'm going to do is create a two-qubit device, which in this case is our lightning.qubit simulator; next, I'm going to enable our circuit-cutting functionality in PennyLane; and then I'm going to build a three-qubit parametric circuit.
Obviously, if you're trying to simulate a three-qubit problem on a two-qubit device, you're going to run into issues. However, by adding a wire cut, we can effectively say:
we can break this circuit into multiple pieces, each smaller than the original composition, and then evaluate some type of expectation value function. The idea is that we want to make this as seamless as possible for the user, and the circuit cutting and stitching is effectively hidden behind the scenes.
As far as you're concerned, you just tell it to do the circuit cutting, and it evaluates everything as though it were the full three-qubit circuit.
You pass in your parameter, which needs requires_grad=True set to make sure your system is trainable, and you can evaluate your circuit with that parameter, or evaluate the gradient of the circuit with respect to that parameter. This works out of the box with PennyLane currently.
So now let's talk about scaling this up. Circuit execution for this type of problem runs into lots of evaluations pretty quickly.
The idea is that we can take a single forward pass of a quantum circuit: pass in parameters, break the circuit apart with our circuit cutting, and evaluate it as a given function within this type of workflow. The circuit transform effectively cuts our circuit down into smaller chunks; we evaluate those circuits independently of one another and get an output.
Then, using a post-processing classical reduction, we bring the results back into one numeric value of interest. As I mentioned before, gradients are also of interest to us, and each slice of a gradient is effectively a forward-pass execution in this framework. If you're passing in lots of parameters and you want to calculate lots of gradients, well, you need to scale up the number of evaluations you can perform independently.
So this is the workload we would expect to see for a large-scale run using this type of circuit-cutting-plus-quantum-gradients workload.
And what are we doing this for? The idea is to do quantum parameter optimization for QAOA, and the results are in the paper listed at the bottom of the page; I would suggest everybody have a read, as there are some very nice analytical results, but I'm most interested in the numerics in this talk.
The idea is that we can use this workload, where we're building the forward pass and building the gradients, to calculate the variational energy for QAOA problems. The first one I'm going to demonstrate is a 129-qubit problem, and then we'll look at variational parameter optimization of a 62-qubit problem that fits into this result space. So, on to the numerics.
Well, I think I have two minutes left. In terms of the variational energy calculation, we went up to 129 qubits, and we had some very nice analytical results that allowed us to check the same quantities for the problems we were looking at.
In this case we have two QAOA circuit layers, 25 nodes per QAOA cluster, and one node per inter-cluster connection.
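As a sanity check on the qubit counts quoted, here is the arithmetic, assuming the clustered-graph construction where each cluster contributes 25 nodes and each of the n-1 inter-cluster connections contributes one separator node. This formula is my reading of the setup, not stated explicitly in the talk.

```python
def qaoa_qubits(n_clusters, cluster_size=25):
    # Each cluster contributes cluster_size nodes; each inter-cluster
    # connection contributes one separator node.
    return n_clusters * cluster_size + (n_clusters - 1)

print(qaoa_qubits(5))  # 5 * 25 + 4 = 129, the largest problem shown
```

Growing the cluster count is then the knob for scaling the qubit requirements of the problem.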
This is how we build our problem graph, and the idea is that we can increase the number of clusters to increase the qubit requirements of our problem. We were able to run this nicely with PennyLane right out of the gate. The second problem we had was looking at parameter optimization over 62 qubits, for certain parameters of certain QAOA circuits.
Again, I'd suggest anyone interested feel free to jump back and have a read. In building this we employed a few tools: we used PennyLane to do all of the problem definition, the circuit cutting, and everything; we used Ray as part of the orchestration, with each of the individual circuits as a Ray remote task; we used NVIDIA's cuQuantum simulator as part of our GPU simulating device; and then, obviously, the wonderful Perlmutter to do all of the heavy lifting.
We were quite happy to see some very decent scaling results for this. Obviously it's strong scaling, so we had a fixed problem size; we could definitely make the problem harder and heavier to ensure we scale better with different problems, but in terms of what we wanted, we were very happy with the results.
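Strong scaling here means the total problem size stays fixed while the node count grows, so the usual figure of merit is parallel efficiency. A minimal helper makes the definition concrete; the timing numbers in the example are hypothetical, not from the Perlmutter runs.

```python
def speedup(t_base, t_n):
    # Ratio of baseline runtime to runtime at a larger node count.
    return t_base / t_n

def efficiency(t_base, n_base, t_n, n):
    # Strong-scaling efficiency: achieved speedup over ideal speedup.
    return speedup(t_base, t_n) / (n / n_base)

# Hypothetical timings: 4 nodes -> 100 s, 16 nodes -> 30 s.
print(speedup(100.0, 30.0))            # ~3.33x speedup
print(efficiency(100.0, 4, 30.0, 16))  # ~0.83 efficiency
```

With a fixed-size problem, efficiency tends to fall as nodes are added, which is why heavier problem instances would be expected to scale further.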
All of the data I've shown here today is in this repository on GitHub, and that is pretty much it from me.