Description
More about this lecture: https://dl4sci-school.lbl.gov/tess-smidt
Deep Learning for Science School: https://dl4sci-school.lbl.gov/agenda
A: Good morning, everyone. Welcome again to another Deep Learning for Science School lecture. I'm really excited to have Tess Smidt with us today to talk about symmetry and equivariance in neural networks, one of the most important topics we need to tackle in order to incorporate physical knowledge into our neural network architectures.

A: Tess Smidt is the 2018 Alvarez Postdoctoral Fellow in Computing Sciences here at Berkeley Lab. Her current research interests include building neural networks from first principles for rich data types, accelerating existing techniques, and creating new capabilities for computational chemistry and materials science.

A: Tess earned her PhD in physics from UC Berkeley in 2018. As a graduate student, she used quantum mechanical calculations to understand and systematically design the geometry and corresponding electronic properties of atomic systems. Tess has been working on developing neural networks for applications in chemistry and the physical sciences for a while now. She interned on Google's Accelerated Science team, where she developed a new type of convolutional neural network called tensor field networks that can naturally handle the 3D geometry of physical systems.

A: This is a topic very close to what she will be talking about today, so I'm really pleased to have Tess here. Tess, thank you for joining us. For everyone else: please remember to ask questions in the Q&A, not in the chat; that helps us route the questions to Tess. And yes, please ask questions.
B: Well, Mustafa, thank you so much for that lovely introduction, and thank you so much for having me. Hello, everyone, I'm Tess. I'm really excited to be here and to share a bit of what I love about symmetry and equivariance in neural networks. It's a really large topic, so I'm not going to be able to completely do it justice today, but I'm certainly going to try, and I'm going to focus on things that are relevant to the scientific questions you're interested in investigating.

B: Because this is Zoom and I can't see all of your lovely faces, what I will need your help with is asking questions whenever I'm saying something that doesn't make sense. I will also be monitoring the Zoom chat, and I'm going to check in pretty frequently: before I go on to the next slide or the next concept, I'll be checking the chat. I would really, really love it.
B: If we don't get past slide five, I mean, that'd be a bit surprising, but that would be fine. And so, with that, I first want to explain what this image is that I'm showing you. I wanted to give an example of why symmetry and equivariance matter in scientific data, so I thought this would be a good example. On the, let's see, on the left (I always have trouble with left and right), so on the left:

B: We have a water molecule that's just rotating in 3D space, and then on the right we have this matrix, and this matrix happens to represent the electronic interactions between the different orbitals on the different atoms of water. So you have the oxygen orbitals and the two sets of hydrogen orbitals, and it's a matrix because you're basically asking how strongly these orbitals interact with respect to each other.
B: The reason I'm showing you this diagram is that while the water molecule being rotated looks pretty simple (you're like, okay, a neural network should probably be able to understand that a water molecule is rotating, that's not too hard), the problem is that a lot of the quantities we're interested in in science look like this. This is actually the Hamiltonian matrix of water, which, as you can see, transforms in a very complicated and hard-to-recognize pattern as the molecule rotates.

B: So if you're asking a neural network to learn that all of those different matrices actually mean the same thing, because they're all just the same Hamiltonian for water in different rotations, you can imagine that might be difficult. This is where symmetry comes in: if I know that I can rotate a water molecule into any coordinate system, and that the Hamiltonian matrix also rotates with any coordinate system, then I know I only actually need one of these Hamiltonians to reconstruct any of them.
B: So this is kind of the power of symmetry if you employ it in machine learning models; I'll specifically talk about neural networks, but this applies more broadly to other methods.

B: Let's see if I can get my cursor over here. Okay, so a quick outline. I want to talk about what kinds of assumptions are actually built into neural networks and into their operations, and then I want to talk more about why symmetry appears in scientific problems, or just in general computational tasks. Then, at some point in that explanation (I may have actually reordered things a little bit), I will describe symmetry: invariance versus equivariance.
B: These two words come up a lot together and are often used interchangeably, but they actually mean something very different; related, but different. Invariance means things don't change under some transformation; equivariance means things do change. Physicists in the audience will also recognize the word covariance: a covariant tensor is an equivariant quantity. Once I've disentangled those two definitions, which may actually happen combined with point four, I'll describe how you actually make models symmetry aware, the reasoning for why you do it certain ways versus others, and the pros and cons of each approach. Then I'm going to do a case study: I'll describe how you get equivariance to Euclidean symmetry, specifically in what are called Euclidean neural networks, which has now become a superset of things like tensor field networks, Clebsch-Gordan nets, and 3D steerable CNNs.
B: We've kind of rebranded them all as Euclidean neural networks, since they are all equivariant to the same symmetry. I think that's going to be something that will probably be new for a lot of people, but it should hopefully be insightful as to what considerations you need to think about when making these models. Okay.

B: Point five: then we'll talk about the consequences of making your model symmetry aware. Often you make a model symmetry aware for one reason, and then you get a bunch of consequences that were not things you intended, and I can give some very concrete examples from when we made tensor field networks. Then I'll give a brief recap, and then I'll put up a slide that just has a bunch of resources (not exhaustive, but better than none), with links and papers that might be of interest to you.
B: Okay, I don't see any questions, so I'll go ahead. Neural networks are specially designed for different data types, and assumptions about the data type are actually built into the operations. In these next figures, W is going to roughly represent the neural network or something learnable (some learnable parameters), and x is going to be whatever I give to that neural network.

B: If I have a bunch of data arrays, I might use a dense neural network. This is often considered the most general type of neural network; basically, it's some linear operation mixed in with a bunch of nonlinear operations. But it's all operating on these data vectors, and the assumption built into the network is that the components of the vector are independent.
B: I don't need any special consideration for how these different components relate; there's nothing special in the relationship between those pieces of data. If I have a 2D image, I might use a convolutional neural network, and what's baked into a convolutional neural network is that the same feature can appear anywhere in the image and mean the same thing.

B: So if I get a fluffy ear in this part of the image or a fluffy ear in that part of the image, it's a fluffy ear; they're not different types of fluffy ears, they're just fluffy ears. And this has an assumption of locality: pixels next to each other have a closer relationship, so if I want to learn about this pixel, I should look at the pixel next to it.
B: If we have text: I know recurrent neural networks have maybe been superseded by transformer networks, but they're still good for illustrating this point. Recurrent neural networks expect sequential data, and specifically the assumption is that the next input or output depends on what came before.

B: And then graphs. If you have a graph, you might use a graph neural network, and I put graph convolutional neural network because there's actually an explosion of types of graph neural networks and they can all do very different things. I'm going to talk more specifically about graph convolutional neural networks, where you have an aggregation approach that depends on your nearest neighbors. But this is a case where you have some graph, which is just topological data, and the nodes have features on them.
B: The edges can have features too, and the network passes messages between the nodes via the edges; that's kind of a broad definition. And then I'm putting this one in a special box because, as I said, it's probably the one you're less familiar with: if you have 3D physical data, so data that was generated in 3D space by an experiment or in a simulation,

B: you might want to consider using a Euclidean neural network, because for data in 3D Euclidean space you always have the freedom to choose your coordinate system, and that assumption is baked into the network. So as you can see, there are a lot of assumptions going on that I think a lot of people don't talk about with these different networks, and the key thing about these assumptions is that symmetries actually emerge from them.
B: For 2D images, we have translation symmetry. For recurrent neural networks it's kind of interesting: you have time translation symmetry, but only in the forward direction, because you're using the same module over and over again. You're assuming that, regardless of when something happens in time, it should be interpreted in the same way; the module still takes in the history, but the module itself is identical.

B: So it has a sort of forward time translation symmetry. For graph convolutional neural networks, not in all cases but in most cases, the desire is to have permutation symmetry: I can order the nodes any way in my computer, but the graph should still mean the same thing if I permute them, and I'll talk about that a little bit more. And then lastly, for Euclidean neural networks, you have Euclidean symmetry. So I'll quickly check whether anyone has questions there.
B: I'm also going to switch the power from my iPad to my laptop to make sure my laptop doesn't die. Okay, I don't see any questions, so I will continue. So yes, symmetry emerges when different ways of representing something mean the same thing, and you can talk about symmetry in many different ways. You can talk about the symmetry of the representation, the space of possibilities that something can be. Operations themselves can preserve symmetry.

B: So if a certain space has a given symmetry, does this operation muck that up or does it preserve that symmetry? And then objects themselves, objects existing in these spaces and representations, can have symmetry. So it can get really confusing as to what in the world people actually mean by symmetry. I'll leave the operations that preserve symmetry for later in the talk, but I'll at least give some examples of what it means for a representation to have symmetry versus an object having symmetry.
B: For example, I've highlighted this little kagome tiling; that's a specific basket weaving, but it also shows up in materials science, in case people are wondering where the name came from. You have these upside-down-triangle features, and if I'm assuming translation symmetry in 2D, I'm saying that all those upside-down triangles are the same; they mean the same thing. But this right-side-up triangle pattern does not mean the same thing.

B: It would mean the same thing if I had 2D Euclidean symmetry, where I was including rotations and things like that, but it's not included if I purely have translation symmetry, which is the symmetry assumed by a convolutional neural network. So convolutional neural networks do not think of those as the same thing: they don't think of the orange box and the blue box as the same thing. Okay, so what about the symmetry of 2D objects?
B: Something I think is kind of interesting is that the boundary of your image actually breaks translation symmetry. A purely convolutional neural network, if you don't tack on a dense network at the end, can apply to an image that's infinitely large; it generalizes to that case, so it can operate on images of any size. The only thing that limits what size of image you can use is typically the dense layer

B: that's used for classification, for example. So that's kind of interesting: you can have these sub-symmetries within the image. But once you have the boundary, you've sort of fixed the origin. If you put back periodic boundary conditions, you would recover discrete translations: you don't have continuous translation of 2D space, but you have discrete translations, like if you had a unit cell.
B: I don't see anything, so we'll go ahead. Okay, let's talk about permutation symmetry. Permutation symmetry is the symmetry of sets, and it means that if I have 10 items, I can list them in any order, but it's still the same set of 10 items. In the case of graphs, where this comes up, I might have a graph with a particular topology, like this graph in pink here, and I might have the data for each node stored in memory,

B: you know, in array locations 0, 1, 2, 3, 4, 5. But the graph means the same thing if I decide to reorder those nodes. So I don't want my algorithm to be sensitive to the order, because the only reason those nodes are ordered is the details of how computers compute: the fact that they need to store things in memory and that they need to compute.
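To make that concrete, here is a minimal sketch of my own (with made-up node features, not anything from the talk): a readout that sums over node features gives the same answer no matter how the nodes happen to be ordered in memory.

```python
# Minimal sketch (illustration only): a sum over node features is
# permutation invariant, so reordering the nodes in memory changes nothing.
import numpy as np

rng = np.random.default_rng(0)
node_features = rng.normal(size=(6, 4))      # 6 nodes, 4 features each

perm = rng.permutation(6)                    # an arbitrary reordering in memory
readout_original = node_features.sum(axis=0)
readout_permuted = node_features[perm].sum(axis=0)

assert np.allclose(readout_original, readout_permuted)
```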
B: So it's funny, because with symmetry there's sort of the platonic ideal of what are called groups; the group of 3D rotations is this platonic ideal, but normally, when we think of rotations, we're thinking of three-by-three rotation matrices or an axis and an angle. The actual group is sort of the ideal of what it means to operate on something with some operation, but how you actually go about expressing that transformation depends on your representation.

B: It depends on what I am acting on. Am I acting on Cartesian space with a three-by-three matrix? Am I acting on spherical harmonics? What am I acting on? So I'm not sure if that really answers your question, but I certainly do want to; I believe the question you're getting at is important.
B: Yes, okay, thanks for the clarification. So what Michael was saying is that I'm just talking about the order of the nodes in memory and not necessarily saying that these nodes are the same. For example, if I look at node three in the top picture, it's connected to three different other nodes, but in the bottom picture it isn't; and yes, when I'm labeling them, I'm labeling them just as they are in memory, not as them being the same.

B: Cool, it's very exciting; thanks for asking questions, folks, love it. Okay. So if I have a graph, I can also say this graph has a certain symmetry, and I think this is what's touching on Michael's question. If you've heard the term graph automorphism, this is where it comes in: specific nodes are indistinguishable because they have the same global connectivity. If I look at their nearest neighbors, they look the same; next-nearest neighbors, they look the same; and so forth.
B: So if I look at their entire connectivity, you can have nodes that look the same. In this little bowtie graph, you can see that the two orange nodes are symmetrically indistinguishable, as are the blue nodes. So hopefully that answers a bit of Michael's question, and now let me check on the next question.

B: Okay, so yeah, permutation symmetry is really important in any neural network where you basically have nodes or points or geometry: anything where you have to store it in memory but it needs to be treated as a set. And, you know, even when you do a convolution on images,
B: one of the ways that convolutions get away from this problem is that they're dealing with geometry. How you apply the operation doesn't at all depend on which index something has in memory; all that matters is: are you in my neighborhood, and what is my relative distance?

B: Okay, Carrie Anne is asking if I can read out the question that I'm answering. Absolutely, got it; oh, I see, it doesn't appear because some of them are in the panel. Okay. So the question was: how is permutation symmetry applicable to neural networks? Neural networks, in my understanding, are directed acyclic graphs, so order matters. I think the picture being referred to here is the picture with all the nodes and connections that people often use to describe dense neural networks. So, yeah:
B: I don't think about neural networks that way, because I find it confusing. When I think of dense neural networks, I literally think about this picture here (I hope you can see my cursor): I basically just have matrix operations applying to vectors. That's what I think about when I think of a dense network, so I don't think of the neural network itself as needing permutation symmetry; the symmetry is about the actual operations and the data they act on.

B: I'm not sure if that answers your question, but I can try again later too. Thank you guys so much for the questions; I will go ahead and proceed. We talked a bit about permutation symmetry; now I want to talk about Euclidean symmetry, and again I want to emphasize that the reason we have symmetry in 3D space, despite the fact that the world is messy and asymmetric, is that we always have the freedom to choose our coordinate system, and the physical system should mean the same thing.
B: So it's kind of unintuitive, because the world is not symmetric and yet there's an underlying symmetry of 3D space. Basically, if we took everything out of 3D space and were left with empty 3D space, we wouldn't be able to tell the difference if we'd moved a bit, rotated a bit, or inverted our coordinate system; it would all look the same. So the underlying space, the underlying representation, has the symmetry.

B: But we can also talk about the symmetry of geometric objects, and I think this concept is more familiar to folks, especially in chemistry and materials science. They're much more familiar with the symmetry of geometric objects; they're also familiar with the symmetry of 3D space from dealing with things like geometric tensors, 3x3 matrices, elasticity tensors, and all that jazz. But I think when people say an object has symmetry, they usually mean the symmetry of the object and not of the underlying representation.
B: This benzene molecule, for example, is highly symmetric. It has a six-fold rotation axis, several mirror planes, and several two-fold rotation axes, so it's a highly symmetric molecule; in case you're interested, the point group is D6h. I just wanted to introduce the concept that the representation has a symmetry and the object can have a symmetry.

B: So I hope that's a little clarifying. Now let's talk about how we actually go about making models symmetry aware, and I'm going to use the example of how we make a model that understands the symmetry of an atomic structure, taking benzene as the example again. The most general way to express any geometry is to use coordinates, but this is problematic because, in general, coordinates are sensitive to translations, rotations, and inversions: the numbers change.
B: So if I want to give these numbers to a neural network, chances are it's going to be sensitive to any of these changes. There are roughly three different ways we might handle this. Approach one: it doesn't matter, we're doing deep learning, throw all the data at the model and see what you get. That is a very valid approach, and in many cases it will actually work, with enough training and GPU burning and things like that.

B: Approach two is to convert your data to a representation the neural network can't possibly mess up. You've basically modded out, gotten rid of, all of these sensitivities to any choice of coordinate system or, in the case of graphs, any sort of permutation. You just have, say, a number or a set of numbers that describe things about the graph without actually addressing any individual nodes. That would be an invariant representation.
B: So basically you mod out all the tricky bits, and that's a fairly good approach too. Approach three: if there's no model that naturally handles the symmetry of your system, you can build one.

B: I guess the point is rigor, or, you know, you have certain guarantees in certain cases. So approach one is basically the approach of data augmentation: if I just show my model enough rotated examples, it'll eventually get the idea that these things are the same. And while it's true that it does learn to give the same output for them, it doesn't actually learn that they are the same. So yeah, data augmentation, and also putting constraints in your loss function, which is typically very similar to data augmentation.
B: So I'm going to treat data augmentation and loss function constraints as roughly the same thing; these are very good approaches, and I'll discuss them. Approach two is basically changing your input, making your data fit the model rather than making the model fit your data. And then approach three, which I'll go into in more detail, is more similar to what I talked about on my first slide, where these different models build in certain assumptions about data types.
B: Great, okay: let's talk about data augmentation. I said this a little bit on the previous slide, but it's always good to re-emphasize: data augmentation is the brute-force approach to teaching your model to be symmetry aware, or at least to emulate it. Again, there's not really a guarantee that it's always going to behave predictably when you rotate an object, but in most cases it will. In 2D, for images, people typically get away with about a two- to ten-fold augmentation, and there are various reasons for that: image

B: datasets tend to be pretty large and have a lot of rotations within them already. But if you have 3D data, like molecules in 3D space, then data augmentation gets very expensive.
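To make the cost concrete, here is a minimal sketch of rotation augmentation for a point cloud (my own illustration, not code from the talk; it uses SciPy's rotation utilities): every training example gets duplicated once per sampled rotation.

```python
# Minimal sketch of 3D rotation augmentation for a point cloud (illustration only).
import numpy as np
from scipy.spatial.transform import Rotation

def augment_with_rotations(points, n_copies=10, seed=0):
    """Return n_copies randomly rotated versions of an (N, 3) point cloud."""
    rotations = Rotation.random(n_copies, random_state=seed)
    return np.stack([rotations[i].apply(points) for i in range(n_copies)])

water = np.array([[0.00, 0.00, 0.00],     # toy coordinates, not real geometry
                  [0.76, 0.59, 0.00],
                  [-0.76, 0.59, 0.00]])
augmented = augment_with_rotations(water, n_copies=10)
print(augmented.shape)  # (10, 3, 3): ten rotated copies of the same molecule
```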
B: What kind of data... sorry, okay, I'll finish this slide and then I'll get to the question. Sorry, I'm realizing that I'm very distractible. So yes, it's very expensive. I wanted to give this example: if you're training without symmetry and you want to train a model to recognize that this is a cube,

B: you'd have to show all these different rotations of the cube (I'm assuming this is a 3D model; I'm showing a 2D picture, but it's a 3D model), and each of these rotations looks very different. But if my model knows that things are the same up to, you know, maybe some rotation or inversion or translation,
B: I only need one example, and that's very helpful. Okay, now I'm going to go to the Q&A. What kind of data type is used for the benzene example? Yeah, so that data type would technically be a geometry, or a geometric tensor, but you can also make invariant representations of the benzene molecule. So it's actually kind of an open question how you want to handle that. Does that answer your question?

B: I do want to emphasize that data augmentation does make sense in many cases, particularly if it's very difficult to formalize when two things are similar; you're like, okay, these things are kind of similar, but I can't formally show how I transform this one into that one. That can be quite difficult. Or if there's some quantity that you want to conserve that isn't so easy to write down,
B: basically, anything that's hard to articulate as a group or as a transform. Then by all means feel free to add that to your loss function terms or to use data augmentation, especially for things that are kind of messy, like anything that's approximately similar, perturbations, things like that. So I think there are cases where data augmentation makes sense.

B: They don't tend to come up in what I do, just because I'm always dealing with geometry, and so you really do need to deal with geometric tensors, but yeah, it totally can be appropriate for your needs. Okay, I will go back to the Q&A. The question is: is there a benefit in terms of self-supervised learning with the equivariant nature of data? Does that influence how the neural network structure is also made equivariant, for symmetric self-supervised learning?
B: I'm not exactly sure what you mean by self-supervised learning; I don't know if that means something like an autoencoder, which is kind of unsupervised or semi-supervised. But I would say that, in general, having equivariance and symmetry in your network just always helps if you can make that assumption safely

B: with your data, which in many scientific data sets you can: permutation symmetry and the symmetry of 3D space are pretty good assumptions to make. That's only going to help, because you're basically allowing your model to focus on the actual data rather than on learning what a rotation is doing. So I think it makes your model able to focus on the more pertinent features, the things you actually wanted it to pay attention to. So I would say it probably helps; I don't know if that answers your question.
B: But that's my answer. Great, another question: can we learn the symmetry if it's hard for us to write down? Yes, you can. I didn't put this in the resources, but there's this really pretty paper by Taco Cohen and friends called homeomorphic variational autoencoders, something like that, and there are actually a lot of people working on this. I do not work on this, so there is a way to do it,

B: but I don't know how to do it; again, I use the same group for everything. There are ways of doing this, I just don't often know what they are. On the last slide I have a link: there's actually a workshop tomorrow on equivariance and data augmentation, and I think there are at least three or four talks on learning symmetries or learning invariants from data, so that would probably be a better source of information to answer your question. Okay, cool, thanks for the questions. I'm gonna go ahead and continue.
B: So we talked about data augmentation; that's approach one. Let's talk about approach two. Let's say I have some invariant representation that basically sweeps under the rug all the complexities of the symmetry of how I would most naturally represent my object. An invariant representation might be: I have a molecule, and I say, well, an invariant representation is how many atoms it has, how many carbon atoms and how many hydrogen atoms, because no matter how many times I rotate my molecule,

B: it has the same number of atoms. That's what we mean by an invariant representation: you featurize the data object to have features that don't change under a change of coordinate system or a change of permutation, anything like that.
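Here is a minimal sketch of that idea (a toy example of my own, not the SOAP-style descriptors discussed later): featurize a molecule by element counts and sorted pairwise distances, both of which are unchanged by rotating, translating, inverting, or re-indexing the atoms.

```python
# Toy invariant featurization: element counts plus sorted pairwise distances.
# Both are unchanged by rotations, translations, inversions, and atom reordering.
import numpy as np
from collections import Counter

def invariant_features(symbols, coords):
    counts = Counter(symbols)                        # e.g. {'C': 6, 'H': 6}
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    iu = np.triu_indices(len(symbols), k=1)
    return counts, np.sort(dists[iu])                # sorting removes atom order

symbols = ["O", "H", "H"]
coords = np.array([[0.0, 0.0, 0.0], [0.76, 0.59, 0.0], [-0.76, 0.59, 0.0]])
print(invariant_features(symbols, coords))
```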
B: The nice thing about invariant representations is that you can throw them at any neural network and it can't mess them up.

B: It'll be fine. And the nice thing, too, is that if you make a really good invariant representation, you can use it with any machine learning algorithm; it doesn't have to be neural networks. I think you see this a lot in cheminformatics and materials informatics: people have spent many, many years of their lives crafting these gorgeous invariant representations of chemical and material systems that are very effective for certain things.
B: I will say, however: if you can craft a good representation, that's fantastic, that's great, but deep learning's specialty, its secret sauce, is its ability to learn representations. So if you really want to use a specific invariant representation, you may want to go with a different machine learning model. There are so many machine learning models that are more interpretable than neural networks; neural networks are sort of the most black-boxy.

B: So if you're not using a neural network to learn those features, you may just have a better time, or be able to get more out of your model, by using something like a kernel method or a decision tree. Those are all fantastic models for getting concrete insights from your data. So those are my thoughts on that one. Okay, one question in the Q&A: an invariant representation will need feature engineering effort, right? If so, what's the advantage of deep CNNs in terms of automatic feature extraction?
B: Okay, this touches on what we were just talking about. Yes, invariant representations do need feature engineering; that's sort of the definition, so you can think of an invariant representation as equivalent to feature engineering. The advantage of a deep CNN in terms of automatic feature extraction is, well, the fact that it's automatic. Also, something that can be nice is if you really want to use the fact that your model is differentiable and be able to,

B: you know, update your input with respect to gradients, which is really a fun game to play: I put my input through the model and I get some output, but what if I want the output to be slightly different, how does that change the inputs? So if you want to use this differentiability, which can be especially fun for geometry, that can be nice. What are other reasons?
B: I guess, if you want to be open to being able to toggle on and off different interactions between the inputs and outputs. This is something more relevant to Euclidean neural networks, where all the interactions in the network have very specific data types, so you can actually turn some of them off and turn others on, and you can use that to run experiments on how important, say, vector-vector interactions or 3x3-matrix-by-3x3-matrix interactions are.

B: So I think it allows you to run experiments. That's what is really interesting and powerful about neural networks: if you craft them well, they become little toys that you can experiment with to learn about your data, what does and doesn't work.
B: I'm in the middle of a different point, but let's really hone this notion of invariance versus equivariance. An invariant does not change under any transformation; it's the same number. You know, mass: your mass is the same regardless of your orientation.

B: Otherwise, that'd be a fantastic weight loss program; you could just gain or lose mass by rotation. But that's not the case; it's a scalar. So invariants do not change, and things that are equivariant change deterministically.
B: They do change under a specific operation, but if you handed me that operation, I would know exactly how the quantity transforms. Let's take the example of a 3D vector. A 3D vector has three properties: it has a magnitude, which I'm picturing here in orange with the bars; it has a direction, which I'm picturing with the pink arrow; and it has a location, the little purple dot. If we consider how these properties change under translation, rotation, and inversion, they each transform differently. The magnitude of a vector is invariant.

B: If I have two particles and some relative distance between them, that distance, that magnitude, doesn't change no matter how I change my coordinate system. So relative distances are invariant under rotation, translation, and even inversion. There's some interpretive group theory dance for you.
B: The direction: it doesn't matter where the vector is located, it still points in the same direction. However, if I rotate it or invert it, it is different; it transforms predictably. I can apply a rotation matrix or an inversion operator to see how that vector changes, but it is different.

B: Last but not least, the location. Points in 3D space are sensitive to all of these: translation, rotation, and inversion, except in the very special case where the point is at the center of rotation or inversion. But generally speaking, they do change if I rotate around another point or invert across another point. Okay, I'll look at the Q&A.
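A quick numerical check of those three behaviors (a sketch of my own, with arbitrary numbers): the magnitude of a relative vector survives a random rotation plus translation, the direction rotates with it, and an absolute position picks up both effects.

```python
# Sketch: how the three properties of a vector behave under a rigid motion.
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(1)
p1, p2 = rng.normal(size=3), rng.normal(size=3)
R = Rotation.random(random_state=1).as_matrix()
t = rng.normal(size=3)

r = p2 - p1                          # relative vector between two particles
r_new = (R @ p2 + t) - (R @ p1 + t)  # the same vector after rotating + translating

assert np.isclose(np.linalg.norm(r), np.linalg.norm(r_new))  # magnitude: invariant
assert np.allclose(r_new, R @ r)                             # direction: equivariant
assert not np.allclose(p1, R @ p1 + t)                       # location: changes
```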
B: Quick question: what types of properties in materials or chemistry or physics depend strongly on symmetry? All of them. Okay, that sounds maybe a bit facetious, but it actually is all of them. I don't know if any of you have had a professor who says "and by symmetry we can see this, and by symmetry we can see that"; these are the types of arguments that a lot of people use to describe materials.

B: Here's an example: if I have a crystal that is symmetric under inversion, so I can take (x, y, z) to (-x, -y, -z), there is absolutely no way that that material can host any property that looks like a vector. So it can't have a polarization; polarization is the first one that comes to mind. And this impacts what the elasticity tensor looks like.
B: So if something has inversion symmetry, basically how much it compresses in all directions is the same; it has an isotropic elasticity. I think that's correct; I'm pretty sure that's correct.

B: I think that's the case. So there are a lot of them: normal modes of materials, or normal modes of molecules, meaning how molecules wiggle in response to light, are symmetry dependent. So,
B: there are a lot of them, and I certainly have not done full justice to this question. All right, I'm going to go to the next question: what about discrete rotations, would you say, like the cubes? Yeah, the cubes have a symmetry, definitely. I maybe should have included a slide specifically about space groups and point groups. You have 3D Euclidean space, and then there are these subgroups, which are space groups and point groups, and space groups are how you can tile patterns in 3D space.

B: So ignore the heading at the top, because it doesn't relate to what we're talking about at the moment. But yes, you can have discrete rotations and discrete translations, and that can still be a symmetry; these are actually subgroups of Euclidean symmetry. What's really cool is that if you have a network that has Euclidean symmetry, you get all of these subgroups for free. I have Euclidean symmetry in space, but if I put a sphere in space, then the symmetry of my sphere is only 3D rotations and inversions, which is the group called O(3).
B: If I put a cone into 3D space, I still have rotational symmetry this way, and I have mirrors along here, but I don't have continuous symmetry like this. If I have a cube in 3D space, I have lost all continuous symmetries, but I still have discrete rotations.

B: I have a three-fold axis along the diagonals, four-fold axes along the faces, and a bunch of other symmetries; this is the point group Oh, the octahedral point group. The octahedron has the same symmetries as the cube, fun fact. And then space groups are what you get if you, let's say, take a cube and tile it in 3D space. And actually this is wrong: it should be 230 for the space group number.
B: Okay, so hopefully this clarifies a little bit what invariance versus equivariance means. And yeah, it's totally okay if something is equivariant to only discrete rotations; it doesn't have to be all rotations. In this special case, for this vector, it is all rotations, but a cube is invariant under certain discrete rotations. Actually, a vector is invariant under rotations around its own axis, so maybe that connects more closely to the question that was asked.

B: Okay, I wanted to give an example. Again, I'm a bit biased because I work on atoms, so all my examples are for atoms; I hope those of you who do not work on atoms do not feel alienated. But I wanted to give an example of some invariant featurization algorithms and how they can actually be really sophisticated and, especially, very expressive if well crafted. One such invariant representation is SOAP kernels, and I have a link to the paper on my resources slide.
B: There are equivariant operations that then produce invariant quantities, and I'll talk about what it means to be an equivariant versus an invariant operation on a later slide, but I just wanted to touch on this before we move from representations to models. A SOAP kernel roughly works like this: let's say I have this ethane molecule, where the yellow atoms are carbon and the hydrogen atoms are blue, and I'm going to project the local neighborhood of those carbon atoms onto spherical harmonics; I won't go through exactly what that procedure is.

B: There is actually a backup slide showing it that I'm happy to go through if that's of interest to folks, but you can basically see that if my carbon atom is here, this roughly represents where my hydrogen atoms are for this first carbon, and then this represents where the hydrogens, and this carbon atom over here, are for the other carbon. So we have these two signals, and you can see that they are equivariant quantities, because if we rotate them, they look different.
B: But what we're going to do, now that we have these two equivariant quantities, is basically perform a dot product, kind of the tensor equivalent of a dot product, and that's what we're doing down here. All this shows is what that looks like numerically, and I'm separating it by which spherical harmonic the coefficients are generated from; there are more spherical harmonics the higher in frequency you go. So this shape just corresponds to these numbers, and what we can do is a dot product: basically multiply these elements.

B: So this element multiplies this element, and then we sum across the l's, and we get this number here. So we get these seven scalars. This is a simplified version, but it's a relatively sophisticated operation, and it turns out that these types of operations, if you have enough of them (you do this for all the different atoms in your local environment), can be an extremely expressive invariant representation of geometry.
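Here is a rough sketch of the flavor of this construction (a simplification of my own, not the actual SOAP implementation): project neighbor directions onto spherical harmonics, then contract the coefficients within each l; the resulting per-l "power spectrum" does not change under rotation.

```python
# Rough sketch of SOAP-flavored invariants (a simplification, not the real SOAP):
# project neighbor directions onto spherical harmonics, then sum |c_lm|^2 over m
# within each l.  That per-l power spectrum is unchanged by rotations.
import numpy as np
from scipy.special import sph_harm

def power_spectrum(neighbor_vectors, l_max=3):
    x, y, z = neighbor_vectors.T
    r = np.linalg.norm(neighbor_vectors, axis=1)
    theta = np.arctan2(y, x)              # azimuthal angle
    phi = np.arccos(z / r)                # polar angle
    spectrum = []
    for l in range(l_max + 1):
        c_lm = np.array([sph_harm(m, l, theta, phi).sum() for m in range(-l, l + 1)])
        spectrum.append(np.sum(np.abs(c_lm) ** 2))
    return np.array(spectrum)

neighbors = np.random.default_rng(2).normal(size=(4, 3))   # toy local environment
print(power_spectrum(neighbors))
```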
B: That's just to give an example. Okay, I'm going to look at the Q&A: do these networks work well, achieve high accuracy, on crystal structures that have periodic boundary conditions? Okay, I'm going to assume you mean Euclidean neural networks. Euclidean neural networks can handle periodic boundary conditions very naturally, and we haven't yet tested them much on crystal structures, just because we don't have enough people using them; if you would like to do so, I would love to chat with you about that.

B: But if you look at analogous models like SchNet, which is sort of an invariant version of Euclidean neural networks, they get pretty good accuracy on several crystal prediction tasks, and our network should do as well, if not much better, purely from an expressivity point of view: the operations we have in that network are just more expressive, they can express more complex interactions. So in theory it should, but we haven't totally demonstrated it yet, just because of time and person power. So yeah, great, okay.
B: Now let's talk about invariant versus equivariant models, because we talked about invariant representations, but you could still, for example, give your network your data in its full, unaltered glory, with all the messiness of translations and rotations, and have the model operate in a way that only acts on invariant quantities. So it only acts on relative distances rather than relative distance vectors; it doesn't have the xyz components, it just has the distance. An invariant model would handle the distance;

B: the equivariant one would handle a vector. For a function to be equivariant (and this is general; it's not with respect to any specific group, just any set of operations), the function is equivariant if we can either act on the inputs or act on the outputs. In the case of rotation, I could take my molecule and rotate my molecule, or, let's say I was predicting forces on a molecule,
B: I could instead rotate the forces that were predicted for that molecule. I can do it in either order; the symmetry operation commutes with the function. So I can apply it either to the inputs or to the outputs. For the case of an invariant function, what that means is that g is the identity: the inputs to that function are invariant quantities, and the outputs of that function are also invariant quantities. So that's what it means to be an equivariant versus an invariant function.
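Written out, the condition is f(g · x) = g · f(x) for every g in the group. Here is a minimal numerical sketch of my own (not from the talk): the center of mass of a point cloud is equivariant to rotation, while the radius of gyration is invariant.

```python
# Sketch of the equivariance condition f(g . x) = g . f(x), checked numerically
# for two simple functions of a point cloud.
import numpy as np
from scipy.spatial.transform import Rotation

def center_of_mass(points):             # equivariant: rotates with the input
    return points.mean(axis=0)

def radius_of_gyration(points):         # invariant: a plain scalar
    return np.linalg.norm(points - points.mean(axis=0), axis=1).mean()

points = np.random.default_rng(3).normal(size=(5, 3))
R = Rotation.random(random_state=3).as_matrix()

assert np.allclose(center_of_mass(points @ R.T), center_of_mass(points) @ R.T)
assert np.isclose(radius_of_gyration(points @ R.T), radius_of_gyration(points))
```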
B: Okay, a question: is the importance of detecting these symmetries, or of symmetry-preserving neural networks, to save computation, or am I missing something? This is great. There are many reasons. Probably the easiest one to motivate is getting rid of data augmentation, so it does save a lot of training time, and it saves on how many parameters you need. And actually, I'm about to talk about this next, but I'll talk about how your data goes further with equivariant functions.

B: There are other reasons too. There are a lot of these kind of unintuitive consequences, which I'll talk about, that end up being super beneficial. As a spoiler alert: Euclidean symmetry has a bunch of really interesting consequences, space groups, point groups, geometric tensors, second-order phase transitions. You would think you needed thermo for that, but it actually falls out naturally from studying Euclidean symmetry, and I'll show you some examples of that if we have time. How are we doing on time?
B: I think it's easier to interpret what the model is doing, and depending on how you craft your operations, you can again experiment a bit more, asking: does this interaction help model my data, yes or no? But in order to ask those questions, you do have to have a pretty tailored network, so putting effort into the network makes it a better tool for answering scientific questions.

B: Okay, so why limit yourself to equivariant functions? Why not use a more general function? One of the reasons is that you can substantially shrink the space of functions you're searching over. Let's say I have inputs and I have outputs, and I want some function that maps my inputs to my outputs, and I know that these pieces of data have some symmetry, say Euclidean symmetry.
B: I know that I can change my coordinate system. Now, you could just do data augmentation and teach the model to learn about rotations and things like that, or I could have a model that inherently has those operations built into it under the hood, that respects those operations. So if I take all learnable functions, that's a huge, huge space; then there's all equivariant functions, which is a much smaller space. It's still a very large space, but it's smaller, and physics lives only in equivariant functions: all physical phenomena are equivariant.

B: The functions you really wanted to learn are the overlap of equivariant functions and functions constrained by your data. So by constraining your network to be equivariant, your data goes a lot further, because you're being a lot more specific; you're narrowing in on which function you want. Okay, I want it to be equivariant: cool, you've narrowed down that space. And then I have this data and the function needs to be compatible with that: okay, well, that narrows it down to this spot.
B: So it makes your data a lot more powerful. Okay, I'll look at the Q&A really quick: do we need to implement the function g as a neural network layer? Great question: no, this is just a property of the function. It's more of a condition that your neural network must satisfy.

B: You never have to know what g is, because you prove it for all g, so it just is the case. And this is really interesting, because you don't have to know the symmetry of the object for the symmetry to be preserved by the network. The network doesn't know the symmetry of your object; it never does. It just says: I'm going to act in a way that preserves whatever symmetry this thing has. And what's really interesting is that, computationally speaking, knowing what something is versus preserving a certain property it has
B: are two very different computational tasks. It's really easy to make something have permutation symmetry, whereas it's actually really difficult to tell whether two graphs are the same; graph isomorphism is a very hard problem. But ensuring that if they are the same, you get the same answer,

B: that's a lot easier. Now, another graph might also give the same answer, and that's why isomorphism is hard. But yeah, I think that's really important: symmetry-equivariant networks don't actually know the symmetry of whatever you're giving them, they just can't violate it. You're just kind of handicapping them, tying their hands behind their back, saying: okay, you will preserve rotation and translation and inversion operations.
B: Great question, okay. Another question: have you been benchmarking Euclidean neural networks on different prediction tasks against other high-performing models, such as different types of graph neural networks? I have not, but my good colleague Ben Miller has; well, I mean, we did, but Ben really did all the heavy lifting.

B: We do have a paper out on arXiv right now, benchmarked against QM9, which is a data set of small molecules, and I have the arXiv number on the resources slide. It is a top performer on predicting the dipole moment, the magnitude, which is interesting because it's secretly a vector quantity even though it's a scalar in that data set. So yes, we have, but we would like to do more benchmarking; again, there's only so much time in a day.
B: Okay, another question: systems governed by partial differential equations, like, say, fluid turbulence, have symmetries and conservation laws. Yes, they do. You've shown some excellent examples from atomic physics and chemistry; do you have pointers on how I might derive symmetries of PDEs and incorporate them effectively, either through data transformations or by constructing clever models? And then they say something about Lie algebras being incomprehensible math. I'm totally there with you; Lie algebra is a bit of a mess. I mean, it's very elegant, but

B: it's not the most fun thing to learn. Yeah, so PDEs are really interesting. There's a bunch of work on neural ODEs and on constructing models where, if you kind of know the partial differential equation you're solving, you can fit certain parameters that are unknown, and this is out of my element. Stephan Hoyer at Google (I worked with him while I was on the Google Accelerated Science team),
B: this is something he's super passionate about, and I think Mustafa and Steven, I think you guys are probably more plugged into this than I am, so maybe you can provide some suggestions. I know very little about using neural networks to solve partial differential equations.

B: I think there's a lot of really interesting stuff coming out, and I think those methods, coupled with a Euclidean neural network, could be super powerful, so I'm really excited about any potential applications there. Yeah, I was talking to Stephan about that once and saying that'd be kind of the next step for some of the stuff they're doing in 3D. But I'm sorry, I'm not as helpful discussing partial differential equations; it's a little bit outside my knowledge zone.
B: Cool, thanks, Mustafa. All right, another question: does applying symmetry work well with transfer learning? Does applying equivariant constraints make it more challenging to apply your network, trained on one data set, to another? Would you have to change symmetries and constraints so it will work on the other data set? Great question. Typically you would do transfer learning between, say, different sets of molecules, so I would typically stay within the same symmetry. I don't think it would make it harder.

B: I can't immediately think of a use case where you would want to apply, say, a permutation-symmetric graph network to something that's Euclidean. I mean, you could, but typically, again, the data type kind of governs which neural network you use. You can definitely get very good transfer learning between differing molecules or different sizes; say you train on ten different molecules with a very high-accuracy calculation and you extrapolate to a hundred. These types of methods actually do really well with that. So that's it.
B: It does seem to be very helpful, and what's been interesting is that there are invariant models too (you can take a Euclidean neural network and make an invariant version), and I was actually discussing with some colleagues yesterday that we were trying to figure out exactly why the equivariant models seem to be so much more data efficient than the invariant ones. It intuitively makes sense: okay, you're able to extract more geometric information more easily. But we don't totally understand, mathematically, the specific mechanism allowing for that.
B: They're able to do much, much better transfer learning. Okay, and then one more question, and then I'll go to the next slide: are you applying Euclidean neural networks after you've obtained data, like some MD data, or applying your neural network during the MD simulation?

B: You can use a trained Euclidean neural network to generate molecular dynamics forces, but typically you first have to train it on some data. So you can do both, but typically you want to do it with a trained network first. There are certain cases where you can actually use the network to uncover certain things, and that's in later slides, which I'm not sure we'll get to, but that's okay. Hopefully that answers your question.
B: Okay, what I might do is go through some slides for a bit and then we can go back to questions, because I haven't even explained to you how Euclidean neural networks work, and you guys are asking me a bunch of questions about them. So I'd better get to that part, so that we can have even more fun questions.
B
B
So
what
I
want
to
point
you
to
is
the
functions
that
you
actually
wanted
to
learn.
They
may
actually
be
an
invariant
function.
In
that
case,
you
are
great.
You
have
nothing
to
worry
about.
However,
it
could
be
that
it
was
like
an
almost
invariant
function,
but
there
was
a
bit
of
equivalence
and
in
that
case,
you're
going
to
be
throttled
on
accuracy
and
there's
certain
situations
where
inherently,
what
you
wanted
to
learn
was
an
equivalent
function
and
the
invariant
function
is
just
not
going
to
be
able
to
do
it.
B
B
So
this
is
kind
of
the
three
different
situations
you
can
find
yourself
in
okay,
so
I
want
to
talk
a
minute
about
convolutions
and
how
it
relates
to
local
versus
global
symmetry,
because
this
is
really
interesting
and
this
kind
of
touches
upon
a
lot
of
the
questions
you
guys
have
been
asking
about:
transfer
learning
so
convolutions
capture,
local
symmetry.
B
So
they
just
look
at
the
kind
of
other
points
around
it
or
the
other
geometry
or
pixels
around
it,
and
the
data
on
those
pixels,
and
it's
it's
only
through
interactions
with
features
in
subsequent
layers
that
yield
sort
of
a
more
global
symmetry.
So,
to
give
an
example,
this
is
a
rubidium,
manganese
chloride
crystal,
and
it
has
these
octahedral
motifs
in
the
crystal
that
occur
in
different
orientations
and
locations.
B
So,
if
we're,
if
we
have
a
convolution
that
is
equivalent
to
3d
rotation,
it
will
understand
that
these
two
octahedron
are
the
same
thing.
However,
they
may
be
in
different
contexts
now,
in
this
particular
case,
their
environments
look
pretty
similar
because
they're
symmetrically
the
same
atom,
basically
but
kind
of
as
you
go
further
out
and
further
out,
as
you
go
from
layer
to
layer,
it
will
eventually
see
oh
well.
My
system
doesn't
have
octahedral
symmetry.
It
actually
has
some
space
group
symmetry
due
to
translations
and
kind
of
the
local.
B
Okay,
so
now
we're
going
to
talk
about
how
do
euclidean,
neural
networks
achieve
euclidean
symmetry
equivalence
and
I'm
going
to
do
this
at
a
fairly
high
level,
because
normally,
if
I
give
a
talk
just
on
this,
it
is
an
hour
so
just
to
kind
of
give
context.
So
euclidean
neural
networks
are
very
similar
to
convolutional
neural
networks,
but
rather
than
operate
on
images.
B
We
operate
on
points
because
for
the
cases
that
were
the
first
applications
that
we
were
most
interested
in,
we
were
interested
on
atomic
structures
and
it's
just
a
lot
easier
to
represent
an
atomic
structure
as
a
point
cloud,
but
these
methods
very
readily
transfer
back
over
to
images.
So
you
can
use
euclidean
neural
networks
on
on
fixed
grids.
Meshes
any
data
type
is
totally
fine.
B
Because we're using points, we use what's known as a continuous convolution. Rather than having a fixed grid for every convolution center to apply its operation on, we compute the relative position between each of the neighbors, and then we plug that r vector into our filter function. So our filter function is actually a function that takes in that r vector, versus just being a pixel grid; there's a slight difference. And of course, one thing that's very different about it is that not only do we have 3D translation equivariance, we also have 3D rotation equivariance, so the network will know that this is a benzene molecule no matter how I translate or rotate it. So again, a quick recap: translation equivariance, the fact that I can identify the bunny rabbit in this image whether it's over here or over there, stems from the fact that we're going to use a convolutional neural network, and the main feature of a convolutional neural network that allows it to be translation equivariant is that it only uses relative coordinates.
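To make the idea of a continuous convolution concrete, here is a minimal sketch of my own (not the speaker's code or any particular library's implementation): each point gathers messages from its neighbors by evaluating a filter function on the relative position vector, instead of indexing a fixed pixel grid. The Gaussian-times-vector filter below is an arbitrary illustrative choice.

```python
import numpy as np

def filter_fn(r_vec):
    """Toy continuous filter: evaluated on a relative position vector,
    not looked up on a pixel grid. (Illustrative choice, not the real filter.)"""
    r = np.linalg.norm(r_vec)
    return np.exp(-r**2) * r_vec  # returns a 3-vector "message"

def continuous_convolution(points, features):
    """For each point, sum filter(r_ij) * feature_j over its neighbors j."""
    out = np.zeros((len(points), 3))
    for i, x_i in enumerate(points):
        for j, x_j in enumerate(points):
            if i == j:
                continue
            r_ij = x_j - x_i  # relative coordinates -> translation equivariance
            out[i] += filter_fn(r_ij) * features[j]
    return out

points = np.random.randn(5, 3)   # a small random point cloud
features = np.ones(5)            # one scalar feature per point
print(continuous_convolution(points, features))
```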
B
Oh, there's a quick question: can you share your slides with us? I think Mustafa is handling how those are shared, so I'm going to just note that to Mustafa. Okay, yes, so translation equivariance: we have convolutional neural networks. But how do we handle rotation equivariance? Harkening back to some of the things we talked about, we could do data augmentation, so we just show a bunch of rotated bunny rabbits, or the molecules in a rotated bunny rabbit. Or we could do something invariant: an invariant model where we only care about the distance, not about the relative direction of one pixel with respect to another or one atom with respect to another, so that would be some radial function. Really, though, we want a network that preserves the geometry and exploits the symmetry of the problem.
B
So we have something that's similar to a convolutional neural network, except we have very special filters, and everything in our network is tensor algebra: rather than scalar multiplication, we have the tensor generalization of that, and I'll talk about that on the next slide. Our convolutional filters are separable into two components: a learned radial function, basically saying "if I'm this far away, what's my output," which we then multiply by a spherical harmonic to handle the angular distribution. The essential reason for this, and I'm happy to go into more detail at the end of the talk, is that spherical harmonics have very beautiful properties under rotation, very nice transformation properties. If I give you a linear combination of spherical harmonics with l equals two and then I rotate my coordinate system, that signal still only has l equals two components. You can think of it as the angular frequency being preserved: the specifics of which l equals two spherical harmonics describe the signal will change, but the frequency of the signal doesn't change.
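As a rough illustration of this separable filter structure (a sketch of my own, not the actual tensor field network filters): the filter value on a relative position r is a learned radial profile R(|r|) multiplied by spherical harmonics of the direction. Here the "learned" radial function is a tiny hand-written stand-in for an MLP, and only the real l = 1 spherical harmonics are shown, which are proportional to the normalized (y, z, x) components.

```python
import numpy as np

def radial_mlp(r, w1, w2):
    """Stand-in for a learned radial function R(|r|): a one-hidden-layer MLP."""
    h = np.tanh(w1 * r)        # hidden layer; weights are placeholders
    return float(w2 @ h)

def sph_harm_l1(r_vec):
    """Real l=1 spherical harmonics, proportional to (y, z, x)/|r|."""
    return r_vec[[1, 2, 0]] / np.linalg.norm(r_vec)

def filter_l1(r_vec, w1, w2):
    """Separable filter: learned radial profile times angular part."""
    r = np.linalg.norm(r_vec)
    return radial_mlp(r, w1, w2) * sph_harm_l1(r_vec)

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=8), rng.normal(size=8)   # placeholder "learned" weights
print(filter_l1(np.array([1.0, 2.0, 0.5]), w1, w2))
```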
B
Okay, I'll take one quick question: "One symmetry that I don't think you've mentioned is scale invariance. What sort of models can exhibit scale equivariance?" Okay, this is a really interesting question. For Euclidean symmetry we don't assume scale invariance, and the reason for this is that physics is different at different length scales. There are models that handle scale equivariance; I didn't link to any of them, but I know Daniel Worrall was working on some of them.

B
There are definitely papers on this. In a lot of cases they'll do it by augmenting their filters: if you have a filter with fixed learned parameters, you make, say, five copies of the filter that basically zoom the filter in and out. So you can do it in that sort of way, or you can do dilated convolutions, but I think there are also more rigorous ways of doing it.

B
I'm just not aware of them offhand, but it is something that's discussed in the literature. It's very relevant for image recognition, because if you want to recognize a cat in an image, you want to recognize a tiny cat as well as a close-up of the cat. So it's very relevant, and there is work on this, but I am largely ignorant of it.
B
So thanks for the question. Okay, so yeah, we have special filters based on spherical harmonics and learned radial functions, and then we basically have to replace all the scalar operations in our network. Normally you get your filter function and your input, and you just multiply them and sum. Multiplication in this case can no longer just be scalar multiplication; it actually has to be a tensor multiplication, or a tensor product. To give an example of why this is necessary:

B
how do you multiply two vectors? There are sort of three different answers. I could take two vectors and compute their dot product, and that gives me a scalar, an invariant quantity. I could take two vectors and compute their cross product, and that gives me back a vector. Or I could take the outer product, and that will actually give me a matrix. So those are three different ways you could combine or multiply vectors.
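A quick numerical illustration of these three "multiplications" of two vectors (just plain NumPy, not code from the lecture):

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 2.0, 0.0])

scalar = np.dot(a, b)     # dot product   -> invariant scalar (l = 0)
vector = np.cross(a, b)   # cross product -> (pseudo)vector   (l = 1)
matrix = np.outer(a, b)   # outer product -> 3x3 matrix, a rank-2 tensor

print(scalar)   # 0.0
print(vector)   # [0. 0. 2.]
print(matrix)   # 3x3 array
```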
B
It turns out that when you put them in neural networks, everything in the network is a geometric tensor, which surprised us, and I can talk about that a little bit more. Everything in the network now has to obey the rules of tensor algebra, and if you've ever heard of things like Clebsch-Gordan coefficients or Wigner 3j symbols, that's where these come into play; and if that doesn't mean anything to you, don't worry about it. Okay.

B
So what do you get from this? Because this seems very mathematically not-fun; what do you get from it? Well, if I give a Euclidean neural network a molecule and a rotated version of it, and I want to predict molecular dynamics forces, for example, the forces predicted for the rotated version will be the same as for the original, modulo the rotation of the molecule.
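This property is easy to check numerically. Below is a minimal sketch of my own (with a toy placeholder standing in for the real model) of the test you would run: predict forces for a molecule and for a rotated copy, and confirm that rotating the input is the same as rotating the output.

```python
import numpy as np

def predict_forces(positions):
    """Placeholder for an equivariant model. Here: a toy pairwise spring force,
    which happens to be exactly rotation equivariant, standing in for the network."""
    forces = np.zeros_like(positions)
    for i in range(len(positions)):
        for j in range(len(positions)):
            if i != j:
                forces[i] += positions[j] - positions[i]
    return forces

# rotation by 30 degrees about the z axis
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

pos = np.random.randn(4, 3)                     # a small "molecule"
f_of_rotated = predict_forces(pos @ R.T)        # rotate input, then predict
rotated_f = predict_forces(pos) @ R.T           # predict, then rotate output
print(np.allclose(f_of_rotated, rotated_f))     # True for an equivariant model
```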
B
Additionally, these networks generalize very well to molecules with similar motifs, and again that's because the convolutions are sensitive to local symmetry. They're able to understand local symmetry, and then through exchanging messages between different atoms you get a global symmetry picture, but that happens in a hierarchical fashion.

B
Okay, another thing is what happens if I have a Euclidean neural network and I show it these unit cells. There's a very easy way to articulate periodicity in Euclidean neural networks, basically using graphs. So when I have a crystal, represented by a 3D box that I then tile in 3D space (this is silicon), I can choose to represent it in the smallest unit cell, the primitive cell, the conventional unit cell, or some supercell. To a Euclidean neural network these all look the same; you're guaranteed to get the same per-atom output regardless. So rather than worrying about putting your unit cell in a specific convention, you just give it whatever unit cell you have on hand, as long as it's numerically precise enough to have the symmetry you want.
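To make the per-atom claim concrete, here is a small sketch of my own (not the lecture's code) using a simple invariant per-atom descriptor on a 1D periodic chain: a sum of Gaussians over periodic neighbor distances within a cutoff. The same atom gets the same value whether the chain is described with a one-atom cell or a doubled supercell.

```python
import numpy as np

def per_atom_descriptor(cell_length, positions, cutoff=6.0):
    """Invariant per-atom descriptor for a 1D periodic chain:
    sum of exp(-d^2) over all periodic neighbor distances within a cutoff."""
    n_images = int(np.ceil(cutoff / cell_length)) + 1
    descriptors = []
    for x_i in positions:
        total = 0.0
        for x_j in positions:
            for n in range(-n_images, n_images + 1):
                d = abs(x_j + n * cell_length - x_i)
                if 0.0 < d <= cutoff:
                    total += np.exp(-d**2)
        descriptors.append(total)
    return np.array(descriptors)

primitive = per_atom_descriptor(1.0, [0.0])        # one atom per cell
supercell = per_atom_descriptor(2.0, [0.0, 1.0])   # same chain, doubled cell
print(primitive, supercell)                        # identical per-atom values
print(np.allclose(primitive[0], supercell))        # True
```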
B
Then
then
that'll
be
fine
with
the
network
and
then
going
back
to
this
first
example
that
I
showed
on
my
slide.
If
you
train
a
euclidean
neural
network
on
a
single
water
molecule
in
a
single
water
hamiltonian,
you
naturally
get
back
the
rich
variety
of
what
this
matrix
looks
like
under
rotation.
So
this
is
what
you
get.
B
This
is
the
payoff
of
all
the
math
that
which,
by
the
way,
is
under
the
hood,
so
you
don't
have
to
deal
with
it,
which
is
why
we
made
a
framework
for
doing
this
and
I'll
link
that
in
okay,
so
we're
coming
up
a
bit
on
time,
so
I
want
to
just
quickly
go
through
some
unintended
intuitive
consequences
of
equivariance.
B
So suppose I asked you, for example: let's take this bow tie graph and partition it with a permutation equivariant function into two different sets, so basically split it up into two even groups using ordered labels. I want you to train a model that learns to predict zero-one or one-zero on each of the nodes.

B
What you'll find is that the network can't do it, and the reason why is not because your model is broken but because the question is ill-posed. You can imagine: oh well, why didn't it learn either the left graph or the right graph, why didn't it learn one of these partitions? The thing is, the network can't distinguish these two outputs, and so the best thing it can do is produce their average; it'll actually produce a degeneracy.

B
There's nothing to distinguish the orange partition or the blue partition as being first or second; they are themselves sets of partitions. And so this is something where you're like, oh darn it, I really wanted to use this for graph partitioning, and I really need this permutation equivariance, but now I can't use it. So you have to figure out how to ask your question in a way that still respects the symmetry of your data type. That's kind of an interesting unintended consequence.
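Here is a tiny numerical way to see the averaging effect just described (a sketch under my own assumptions, not the lecture's experiment): on the bow-tie graph, swapping the two "wings" is a graph symmetry, so the two degenerate target labelings are mapped onto each other by a permutation the model cannot distinguish. The best symmetric least-squares fit to either valid answer is their average.

```python
import numpy as np

# A 5-node bow-tie graph: nodes 0,1 = left wing, 2 = shared center, 3,4 = right wing.
# Two degenerate targets: the same partition, viewed before and after swapping the wings.
target_a = np.array([0.0, 0.0, 0.0, 1.0, 1.0])   # group 0 = left wing + center
target_b = np.array([1.0, 1.0, 0.0, 0.0, 0.0])   # group 0 = right wing + center

# A permutation equivariant model cannot prefer one wing over the other, so its
# output must be invariant under the wing swap. The symmetric least-squares fit
# to either valid answer is the average of the two degenerate targets:
symmetric_prediction = (target_a + target_b) / 2
print(symmetric_prediction)   # [0.5 0.5 0.  0.5 0.5] -- the degenerate "average" output
```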
B
Okay, I see there's a question here: "Interesting phenomena such as coherent structures in spatiotemporal systems can be defined as local deviations from the symmetries of the system. The symmetries can be hard to define simply, but presumably can be learned by the neural network. I'm wondering if a neural network that has learned the underlying symmetries of the system can be used to detect coherent structures in new data as broken symmetries. Can you think of a way to build a neural network that does that?"

B
So thank you for the question. One thing that I love about working with these networks is that I've learned a lot; it's really changed how I think about symmetry. One of the ways is that I don't think of symmetry as an on-or-off thing. When we talk about space groups, a structure is either in a space group or not, but when it comes down to it, what symmetry really is is the cancellation of certain interactions.
B
So,
like
you
know,
if
I
have
an
atom
to
my
left
and
an
atom
to
my
right,
I
can
get
cancelling
contributions
if
they're
like
equally
spaced
from
me,
but
if
one's
slightly
to
the
right
or
you
know
slightly
perturbed,
it's
still
a
fairly
symmetric
configuration,
I
will
mostly
get
cancellation
from
those
two
quantities
and
so
again
these
networks,
don't
necessarily,
I
don't
know
specifically
about
making
a
network
that
detects
partial
symmetries.
B
But
even
if
you
do
have
something,
that's
perturbed
that
it's
going
to
be
hard
for
the
network
to
fight
the
fact
that
it
still
looks
mostly
symmetric
and
this
this
actually
does.
This
is
physical.
We
often
think,
oh,
if
I
you
know,
take
a
hill
and
I
roll
down
the
hill
or
like.
If
I
perturb
a
little
bit,
then
I
roll
down
the
hill,
but
that's
assuming
dynamics
and
dynamics
can
make
small
changes
grow.
B
So
unless
you
learned
like
a
really
huge
weight
to
be
paid
on
to
this
very
small
difference,
it
won't
really
be
able
to
amplify
that
perturbation.
I'm
not
sure
if
that
actually
answered
your
question,
but
I
think
it
was
related.
B
You
can
craft,
I
believe
you
can
use,
for
example,
including
neural
network,
to
detect
space
group
symmetries.
I
think
you
can
do
that.
How
you
would
articulate
it
is
actually
a
little
complicated
because
of
how
you'd
represent,
like
certain
symmetry
operations,
and
I
assume
that
nuance
would
transfer
over
to
other
groups
if
you
in
case
you're
interested
in
things
that
are
not
the
euclidean
group,
but
again,
as
far
as
for
building
neural
networks
that
can
learn
or
detect
certain
symmetries.
B
I
highly
recommend
the
workshop
that
I'm
going
to
put
on
the
resource
slide
because
they
might,
they
might
have
some
better
answers
for
you,
okay
and
then
one
more
question.
When
optimizing
euclidean
neural
networks,
do
we
have
issues
of
getting
stuck
and
degenerate
subspaces?
Yes,
you
do,
and
I
will
talk
about
that
in
another
slide.
So
first
I
want
to
just
emphasize
that.
Okay,
we
have
some
of
this,
so
equivalencies
again
can
have
unintended
consequences,
and
so
the
input,
intermediate
and
output
data
of
neural
networks
must
be
geometric.
Tensors.
B
We
didn't
realize
this
when
we
started,
we
just
wanted
a
network
that
had
rotation
equivalence.
We're
like
all
I
wanted,
was
rotation
experience
and
I
got
geometric
tensors
space
groups.
All
this
other
stuff
kind
of
goes
to
show
that
sometimes
it's
worth
going
through
the
hassle
of
getting
that
echo
variance,
because
you
may
end
up
getting
a
bunch
of
things
that
you
secretly
wanted,
but
you
didn't
know
that
you
could
get
so
geometric
tensors.
B
These
really
just
lovely
objects,
I'm
extremely
biased,
because
I
work
with
them
all
the
time
but
essentially
like
let's
say
I
have
a
three
by
three
matrix
which
I'm
representing
with
these
colors
and
the
x
x
and
x
y
and
all
this
stuff.
I
could
equally
represent
that
as
a
linear
combination
of
spherical
harmonics
as
a
shape.
So
it's
interesting
because
geometric
tensors
can
be
as
much
thought
about
as
geometric
shapes
as
a
numerical
objects,
and
then
you
also
get
a
bunch
of
really
interesting
data
types.
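As a concrete version of "a 3x3 matrix as a combination of spherical-harmonic-like pieces" (my own sketch of the standard decomposition, not code from the talk): any 3x3 matrix splits into an l = 0 part (the trace), an l = 1 part (the antisymmetric piece, equivalent to a pseudovector), and an l = 2 part (the symmetric traceless piece), and these pieces don't mix under rotation.

```python
import numpy as np

M = np.random.randn(3, 3)

l0 = np.trace(M) / 3.0 * np.eye(3)                       # l = 0: isotropic (scalar) part
l1 = (M - M.T) / 2.0                                      # l = 1: antisymmetric part (a pseudovector)
l2 = (M + M.T) / 2.0 - np.trace(M) / 3.0 * np.eye(3)      # l = 2: symmetric traceless part

print(np.allclose(M, l0 + l1 + l2))   # True: the three pieces rebuild the matrix
print(np.trace(l2))                    # ~0: the l = 2 piece is traceless
```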
B
When you deal with geometric tensors, for example, in 3D space there are four distinct vector-like quantities. There's the classic vector, which is equivariant under inversion, equivariant under rotation, and invariant under reflection along the vector. A pseudovector does not flip when you invert space. Then, if you have a double-headed ray, it has a lot of the same properties as a vector, except you can invert it. And then you have things like a spiral, which you can rotate by 180 degrees, and you can invert it.

B
So there are all these little data types, and you can actually tell the network specifically, "this is a spiral." That granularity of defining what your object is is super, super fun and super useful. Okay, so Euclidean neural networks can also only produce outputs that have equal or higher symmetry than the inputs.
B
So
these
are
two
different
situations:
I'm
going
to
input
either
a
tetrahedron,
so
I'm
going
to
input
a
tetrahedron
geometry
or
an
octahedron
geometry,
and
these
are
the
outputs
of
three
randomly
initialized
models
that
have
been
asked
to
generate.
Spherical
harmonics
from
l
equals
zero
to
l
equals
six,
and
you
can
see
that
these
shapes
all
look
substantially
different,
but
they
all
have
the
same
symmetry
as
whatever
I
gave
it
as
input.
B
So
there's
many
different
ways
to
produce
signals
that
have
a
certain
symmetry,
and
so
this
is
sort
of
something
that
we
realized
after
the
fact,
but
that's
been
useful
and
one
reason
why
it's
useful
is
that
you
can
use-
and
this
this
pertains
to,
I
believe,
nicholas's
question
about.
Do
you
end
up
getting
degeneracy
issues?
B
Yes,
you
do,
but
you
can
also
figure
out
a
way
to
get
out
of
them
because
of
the
equivariance
of
your
network,
so
equivalent
neural
networks
can
be
used,
and
this
this
follows
for
any
equivalent
neural
network,
not
just
euclidean
ones.
The
euclidean
ones
produce
nice
pictures,
but
this
also
works
for
permutation,
equivalent
neural
networks
and
other
things.
You
can
use
them
as
symmetry
compilers
and
you
can
use
them
to
find
symmetry
implied
missing
data.
B
So
I'm
going
to
take
two
tasks:
I'm
going
to
start
off
with
some
geometry:
let's
say
this
rectangle
a
specific
rectangle.
So
I'm
going
to
show
this
specific
rectangle,
I'm
going
to
say
I
want
you
to
learn
displacements
of
these
points
to
form
a
square
and
the
task
two
is
kind
of
reversing
this.
I'm
going
to
give
you
a
square,
and
I
want
you
to
distort
the
square
into
this
rectangle
and
basically,
what
we
show
is.
It
can
do
the
first
task,
no
problem,
it's
going
from
low
symmetry
to
high
symmetry.
B
It
can't
do
the
second
task,
and
the
reason
for
this
is
because
it
is
symmetrically
ill-defined
because
the
question
is
well
which
rectangle
did
you
want?
There's
two
degenerate
rectangles.
If
I'm
a
rotation
equivalent
network,
I'm
like
well,
I
can't
tell
whether
you
want
the
one
around
what
it's
oriented
around
y
or
the
one
around
z,
because
you
could
just
or
x
you
could
just
change
your
coordinate
system
and,
if
you
this
is
in
a
recent
paper
that
we
have
on
the
archive
and
it's
it's
in
the
resource
slide
later.
B
This
is
what
the
network
output
is.
So
if,
instead
of
articulating
the
displacements
as
vectors,
I
do
it
as
in
terms
of
spherical
harmonics.
So
we
kind
of
see
the
full
symmetry
of
the
problem.
What
you
can
see
is
that
so
the
blue
points
on
the
left.
The
blob
is
just
supposed
to
overlap
with
the
orange
point.
That
means
it's
doing
a
good
job
if
it
overlaps
with
the
orange
point,
it's
doing
a
superb
job,
so
the
one
on
the
left
perfect,
the
one
on
the
right.
B
You
can
see,
there's
actually
a
degeneracy
in
its
predictions.
That's
why
the
lobes
are
smaller,
because
you're,
basically
averaging
two
signals.
So
if
you
imagine
kind
of
the
normally
sized
blob
in
a
normally
sized
blob-
and
you
sum
them-
you
get
you
get
this
or
you
get
like
an
average
yeah.
So
it's
going
to
average
those
two,
those
two
symmetrically
degenerate
choices,
so
that's
kind
of
cool,
but
what's
even
cooler
is
that
we
can
actually
use
gradients
of
the
network
to
the
input
to
figure
out.
to figure out how we would need to change the input such that we could do this task. The input on each of these individual points is just a one, just a number, a scalar. But if I allow it to have additional inputs, which I initially set to zero, that can be higher-order spherical harmonics,

B
what it learns is: oh, you can have l equals two or l equals four contributions on each of these points, and that breaks symmetry such that it can fit the model. And what this actually means is that it's choosing to say the y direction is different from the x direction. That's what this blob is showing you: you can see it starts off as a sphere and then it deforms, and so it's kind of going,

B
okay, I'm symmetric, I can't tell the difference between, you know, minus x and minus y, those look the same, it's like a double-headed ray, but I can tell that this direction is different from that one. That's what it's able to learn from gradients. And so in the paper that I have linked below, we basically mathematically prove that all equivariant neural networks have this property and that you can use it to find symmetry-breaking inputs.
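Here is a rough sketch of the gradient trick just described, under my own assumptions (PyTorch autograd, a toy placeholder model, and a zero-initialized extra input feature; this is not the paper's actual code): you attach higher-order input features initialized to zero, backpropagate the task loss to those features, and the nonzero gradient components indicate which symmetry-breaking inputs the task implicitly requires.

```python
import torch

def model(positions, extra_features):
    """Placeholder for an equivariant network; here just a differentiable toy
    that mixes the geometry with the extra per-point features."""
    return positions.sum(dim=0) + extra_features.sum(dim=0)

positions = torch.randn(4, 3)
# Extra input features (standing in for higher-order spherical harmonic channels),
# initialized to zero and marked as requiring gradients.
extra = torch.zeros(4, 3, requires_grad=True)

target = torch.tensor([1.0, 0.0, 0.0])   # a target output for the toy task
loss = ((model(positions, extra) - target) ** 2).sum()
loss.backward()

# The gradient on the zero-initialized features indicates which input components
# would need to be "turned on" to break the symmetry and fit the task.
print(extra.grad)
```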
B
It's okay, yeah. So just a quick recap, and since you'll have the slides I might just skip through this: we've talked about symmetry; we've talked about how to make a model symmetry-aware; we've talked about the difference between invariant and equivariant and why you might want to use an equivariant neural network; and we've seen that these models can have unintuitive consequences when you embed these symmetries.

B
So I want to give a big shout-out to my collaborators and the developers of e3nn, which is the open-source repository that we use for Euclidean neural networks; if you're interested, this is the repository for you. I also want to give a shout-out to my friend Tawny, who helped me with some of the graphics in this presentation.
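For readers who want to experiment, here is a minimal sketch of the kinds of building blocks involved, written against the e3nn library's o3 module as I understand it (treat the exact function names and arguments as assumptions to check against the current e3nn documentation): spherical harmonics of relative positions, combined with features through an equivariant tensor product.

```python
import torch
from e3nn import o3  # assumes a recent e3nn release with the o3 module

# Spherical harmonics (l = 0, 1, 2) of some relative position vectors.
rel_pos = torch.randn(10, 3)
sh = o3.spherical_harmonics([0, 1, 2], rel_pos, normalize=True)

# An equivariant "multiplication": combine vector features with the spherical
# harmonics, producing scalar, vector, and l = 2 outputs (the tensor-algebra
# generalization of scalar multiplication discussed in the talk).
tp = o3.FullyConnectedTensorProduct("1x1o", "1x0e+1x1o+1x2e", "1x0e+1x1o+1x2e")
vec_features = torch.randn(10, 3)
out = tp(vec_features, sh)
print(out.shape)  # 10 points, 1 + 3 + 5 = 9 output components
```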
B
I want to give a shout-out to the tensor field networks team that I was part of; this was one of the first implementations of Euclidean neural networks. And then here are all the resources, and I'm going to leave it at that. Feel free to reach out to me via email if you have questions beyond this lecture, and with that said, I'll take any remaining questions that folks have. But yeah, thanks.

A
Thank you, Tess, for this great talk and for all the work that you put into all the illustrations and graphics; it's really awesome, a very good overview of the field. And thanks to everyone for the questions. I'm not sure if there are more questions; from what we see in the chat, I think it's just people commending you on the lecture and the slides.
A
And you're a lively speaker; you've given animated explanations of things. So yeah, okay, I think we have one last question.

B
One thing about variational autoencoders: at least in the context of Euclidean neural networks, where I've done some work on making variational autoencoders, it's hard to do with discrete geometry in a way that I'm happy with, but it is possible. You want to make sure that your middle-layer representation isn't just reduced to scalars; or, if you do reduce it to scalars, you will need to put orientation information back in before generating.
B
This
is
actually
something
that
I'm
exploring
with
some
of
my
collaborators.
We
have
some
slides
here,
so
we've
been
working
on,
oh
where'd,
it
go.
B
There
yeah
so
we've
been
interested
in
working
on
variational
autoencoders,
where
we
basically
take
local
environments
and
encode
them
as
scalars,
so
learning
invariant
representations
and
then
being
able
to
pop
them
back
up
into
a
geometric
object,
but
you
need
to
introduce
coordinate
frames
somehow,
and
so
the
the
point
of
that
paper
is
to
talk
about
all
the
nifty
ways
that
one
can
do
that
that
are
still
faithful
to
the
problem.
So
I
don't
know
if
that
completely
answers
your
question,
but
it
maybe
answers
some
aspect
of
it.
B
Can
you
post
the
link
to
the
workshop
on
the
slack
channel?
I
am
not
actually
on
the
slack
channel,
but
I
think
mustafa
can't.
B
Yeah
so
yeah
sorry,
so
this
this
workshop
tomorrow
should
be
really
good.
It
starts
at
6am
pacific
time
because
it's
an
east
coast
workshop,
but
I
really
the
the
talks
in
the
beginning
are
definitely
worth
it.
There's
some
great
speakers,
particularly
I've
taco
cohen,
actually
came
to
berkeley
lab
a
while
ago
and
is
a
good
colleague
of
mine
and
he's
giving
a
talk
and
that'll
be
on
a
lot
of
these
natural
graph
networks.
So
that
should
be
really
interesting.
A
Sounds
good,
I
think
that's
the
last
of
the
questions
again.
If
you
have
more
questions,
you
want
to
look
at
the
material
and
slides
and
you
have
more
of
them,
and
maybe
there
will
be
like
more
of
the
talks
tomorrow
and
you
have
questions
on
those
especially
to
tests.
Please
post
them
on
slack
channel
and
I'll
find
tests
trying
to
get
her
to
answer
some
of
those
questions.