From YouTube: DevoWorm (2023, Meeting #26): Genomics of differentiation, Thompsonian grid divergence, DevoGraphing
Description
GSoC updates (nucleus segmenter pre-topological data, DevoLearn). Open problem: genomic signatures of differentiation trees. Protein space navigation, biology is more theoretical than physics, Thompsonian grids and connections to development. Papers on directed graphs, transformers, and topological data analysis. Attendees: Sushmanth Reddy Mereddy, Bradly Alicea, Morgan Hough, Himanshu Chougule, Susan Crawford-Young, Richard Gordon, and Jesse Parent.
A
Well, yeah, welcome to the meeting. We have Himanshu here, and Morgan, Dick, and Susan. Himanshu, can you give an update on your things? I don't know if Sushmanth will be here today, but.
B
Yep, yeah, okay, so I'm still continuing my work from last week on the nucleus segmenter and the membrane segmenter. Basically, my main issue with Colab was memory — the free resources were not there; I'd used up all my compute. I changed my workflow to just train the model, tried to change some things, and finally got it running on Kaggle as well.
B
So basically, right now I changed a few things in the code. I found two other ways of getting the bounding boxes from the masks: one is to find the x-min, x-max, y-min, y-max positions over all the coordinates of the mask, and the other is to use PyTorch's own implementation, which gives you similar results.
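A minimal sketch of the first approach — deriving a box from a binary mask by taking the min/max coordinates of its nonzero pixels; `torchvision.ops.masks_to_boxes` computes the equivalent per mask (the toy mask below is made up for illustration):

```python
def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) for a binary mask.

    `mask` is a list of rows, each a list of 0/1 values — the same
    min/max-over-coordinates idea described above.
    """
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    return (min(xs), min(ys), max(xs), max(ys))

# A toy 4x5 mask with a blob covering columns 1..3, rows 1..2.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
print(mask_to_box(mask))  # (1, 1, 3, 2)
```

In PyTorch this would be one call per mask tensor, which is why the built-in route is usually preferable.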
B
But, you know, I found that it's better to use the built-in function, because it's quicker and more intuitive. Also, one thing I tried to do was to train for a longer period of time.
B
The advantage over Colab is that Kaggle gives you around 15 to 16 GB of GPU for around 30 to 40 hours of usage per week, and it's much better than what Google gives you, because in Colab the free resources get renewed every month, while here it's only one week.
B
So I tried to train the model once again, for more epochs, and I ran into a problem.
B
My learning rate changed to zero — it was zero and it was not learning further — so I tried to Google the error and see how to change a few things so that it gets up and going. One thing I noticed from this was that it was only finding one mask out of all the masks that are given. I'm not sure if I have an image to show.
B
So basically, if these are the bounding boxes, it was only focusing on one bounding box. The problem is with respect to the gradients: the gradients are vanishing while updating the weights during training, which is why it's only focusing on one bounding box.
B
The trick to overcome that is to hyperparameter-tune it much better than what I've been doing right now, and also to change the optimizer. I've used stochastic gradient descent here, which is known to do that. I also tried it with Adam, and with Adam the gradients were exploding: the loss was, for example, around 1.3 or 1.2, and then it just jumps to a really big loss, like 5000 or something.
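The exploding behavior described here can be reproduced on a toy problem, and gradient clipping (in PyTorch, `torch.nn.utils.clip_grad_norm_`) is one common mitigation. A minimal pure-Python sketch, assuming a simple quadratic loss just for illustration:

```python
def step(w, lr, clip=None):
    """One gradient-descent step on loss(w) = w**2 (gradient = 2*w),
    optionally clipping the gradient magnitude to `clip`."""
    g = 2.0 * w
    if clip is not None and abs(g) > clip:
        g = clip if g > 0 else -clip  # gradient clipping
    return w - lr * g

# Too-large learning rate: the iterates diverge (the "loss jumps to 5000" effect).
w_div = 1.0
for _ in range(20):
    w_div = step(w_div, lr=1.5)
print(abs(w_div) > 1000)  # True — exploding updates

# Same learning rate with clipping: updates stay bounded.
w_clip = 1.0
for _ in range(20):
    w_clip = step(w_clip, lr=1.5, clip=0.1)
print(abs(w_clip) < 5)    # True
```

This is only a caricature of the optimizer issue, but it shows why hyperparameter tuning (learning rate, clipping threshold) changes diverging loss into stable training.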
B
The main thing I'm focusing on right now is hyperparameter tuning.
B
And also optimizing things so that I overcome this. I'm also reading papers on the graph part of it. There is a paper on using persistent homology in graph neural nets — it's doing link prediction, which is similar to the cell tracking we're trying to do, as the step before the final task.
B
It's similar to what we were discussing last week, so yeah. Right now I'm just looking into the papers part of it — you could say the reading material. I'll get to the rest once that is done, and once my laptop actually starts working again, because it has an issue right now, so I've come on my brother's laptop. Basically, my CPU fans stopped working completely, so I have to change them. Yeah.
A
That's
great:
let's
go
back
to
the
paper.
What
is
that
paper
from?
Is
it
from
a
journal
or
oh,
it's
archive.
Okay,.
A
Yeah, we were talking about that last week; that'll be a good thing to go over. So they're doing link prediction here, but the code you're working on now is for, sort of, segmentation — getting to that point of creating links: we're creating nodes and then maybe creating edges from that. So you're having problems with the gradient vanishing and the gradient exploding, so it's clear something is wrong there with the hyperparameters. What's your strategy for solving that?
B
And
since
it
was
like
a
tifs
file
which
was
changed
into
like
some
like
more
images
like
you
know,
we
converted
the
trf
file
into
pngs
and
it
was
like
a
series
of
images
so
right
now,
I'm
Focus
I'm,
trying
to
like
take
the
best
images
with
which
has
the
like
kind
of
correct
bounding
boxes,
which
was
and
then
to
focus
only
on
the
pages
and
try
to
fine
tune
it
on
that,
and
also
remove,
like
all
the
noisy
uhness
in
the
data
like
some
more
but
and
after
that,
I'll
focus
more
on
changing
these
parameters
and
to
see
which
one
is
the
most
perfect
ones
like
now
I'm
using
the
step
LR
so
for
the
LR
scheduler,
which
basically
changes
the
learning
rate
with
respect
to
the
epochs
and
and
so
I'll
just
play
around
with
these
and
try
to
get
better
results.
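What PyTorch's `StepLR` scheduler (in `torch.optim.lr_scheduler`) computes can be sketched in a couple of lines: the base learning rate is multiplied by a decay factor `gamma` once every `step_size` epochs:

```python
def step_lr(base_lr, epoch, step_size, gamma):
    """Learning rate after `epoch` epochs under a StepLR-style schedule:
    base_lr decayed by `gamma` every `step_size` epochs."""
    return base_lr * gamma ** (epoch // step_size)

# With base_lr = 0.01, halving every 10 epochs:
for epoch in (0, 9, 10, 25):
    print(epoch, step_lr(0.01, epoch, step_size=10, gamma=0.5))
# epochs 0 and 9 -> 0.01, epoch 10 -> 0.005, epoch 25 -> 0.0025
```

The `step_size` and `gamma` values here are made-up examples; they are exactly the knobs one would sweep when tuning the schedule.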
B
My main concern was the GPU problem in Colab; right now I can do more iterations of the parameter tuning and everything, so it's better now.
A
Okay, yeah, well, that's good. Keep working on that and, you know, keep us posted — hope your computer gets back to normal soon. Yeah, all right. Thanks for the update.
D
Right now, my progress is that I've almost implemented SAM. I'm following a tutorial to implement it — I'm using the Hugging Face Transformers library, and I'm using this person's dataset to implement it. I'm almost done with it; let me see — am I sharing my screen? Oh, so.
D
I almost have SAM implemented; I just need to write a training loop, and then the model will be training. I'm also having issues with the GPU, so yesterday I got some from Paperspace — it took 40 or something, so I took it; it was the cheap one. Then my plan is to start implementing it, and on DevoLearn also. Sorry.
A
So yeah — he was doing his update; he's also using Kaggle notebooks, and we're working on that. I'll meet with him and Maya tomorrow — you know, he's been working with Maya in a separate context, but we'll kind of bring those together tomorrow, so we'll see. Yeah. So, any updates from Susan or Morgan or Jesse or Dick?
C
Okay, now we have differentiation trees for C. elegans, and a partial one for axolotl. You know, they're not complete yet — yeah, okay, well, C. elegans might be the place to look. Okay, so I'm going by a very simple premise: one, that differentiation is represented in the DNA — great, okay — and two, that differentiation occurs by copying a group of genes and then having them diverge.
C
Okay, that's the basic idea. Well, you know, it's an untested idea. So, as I said in a note to you, the problem of finding the DNA responsible for differentiation might be approachable by saying: let's suppose that there's a motif in the DNA, right, which is similar for all steps of differentiation.
C
Yeah, I'm not sure how to go about it, obviously, or where in the DNA. We don't know the number of base pairs involved in differentiation, and once we know it, we don't know what varies in it and what's constant in it.
C
I don't know how many of the people joining us are familiar with BLASTing and stuff like that.
A
A little bit. So, you know, a differentiation tree is where we have —
A
Yeah, a differentiation tree is where we have a tree that diverges: we have, like, a single cell type, and then it splits into two cell types, and then four cell types, and we go down — but we'll just use this as a quick example. Okay, now, of course, DNA has a structure of base pairs, so you have these combinations, these codes that kind of come out: if you were to take, like, 100 base pairs, there'd be a code there — a sequence of different letters.
A
Let's take a six-base-pair code here — I'm just kind of making one up, just for the sake of simplicity. So this is a six-base-pair code. Now, what you're saying is that this might be involved in differentiation. It'd probably be longer than this, because this probably doesn't give you much. Sometimes you get repeats of things — you know, tandem repeats, which are like two letters repeated.
A
Well, yeah, so you can basically go through and look for different motifs. If you see that, you could design an algorithm to go through a draft genome and look for different repeats or different motifs of different types.
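A minimal sketch of such a motif scan — counting (possibly overlapping) occurrences of a candidate motif in a genome string; the toy genome and six-base motif below are made up, like the six-base example in the discussion:

```python
def count_motif(genome, motif):
    """Count occurrences of `motif` in `genome`, allowing overlaps
    (overlaps matter for tandem repeats)."""
    n, m = len(genome), len(motif)
    return sum(1 for i in range(n - m + 1) if genome[i:i + m] == motif)

genome = "ATGACGTACGTACGATGACGTT"     # hypothetical toy sequence
print(count_motif(genome, "ACGTAC"))  # 2 — a made-up six-base motif
print(count_motif("ACACAC", "ACAC"))  # 2 — overlapping tandem repeat
```

A real pipeline would run this over a draft genome (or use BLAST for approximate matches), but the counting idea is the same.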
A
If you predict that — you know, because we know what the codons should look like, and we know what the amino acids that come from the codons should look like — we could say, well, we're interested in this combination of amino acids, and that would give you a motif, and then you could see how many times you find it in a draft genome. So you could search for, say, these six bases at the top, and it would give you a number.
C
Yeah, okay, it might be possible to do this, but, you know, finding what to look for — it's sort of like, you know, Carl — the analogous case is Carl Woese, yeah, okay. He had to find something that was conserved enough to be in common and evolve slowly, and look for it, and he ended up with the 16S RNA from bacteria, right.
A
Yeah, so one thing — I mean, what I'm getting from this is that you have these sets of genes for each node, kind of drawing a Venn diagram over a tree, where each circle is a set of genes that are involved in, sort of, differentiation.
A
We don't need a quantum computer yet. Oh no, I mean, the way this works generally is that usually we have, like, a set of candidates. So people will do searches: they'll search for some string, or they'll search for a set of genes, or maybe different regions of the genome that they want to search for, and then they say, well, there's a certain probability that this is a match in this area or that area — we know, kind of, from functional studies.
A
— what different parts of the genome do, whether it falls within a gene or not; and then also, we can predict, kind of, its likelihood that it's, you know —
C
Something that may be involved in a proper new differentiation.
A
Right, okay, yeah. And then the other problem, too, is that if you have different cell types — are you talking about the things that make them unique, or the things that are involved in differentiation? I mean, a lot of people will go in and look at, like, different tissue types, and know that there are differences in tissue types, but that's just a bunch of cells in the tissue that they sequence — those aren't involved in differentiation per se; that's a different thing.
A
So
that's
that's!
The
other
thing
is
that
you
want
to
be
specific
to
that,
and
it
might
not
be
that
hard,
because
a
lot
of
things
are
just
kind
of
like
this
is
specific
to
bone.
This
is
specific
to
muscle,
and
it's
really
about
you
know
getting
like
upregulation
of
those
genes
more
than
like
the
actual.
C
Of
C
elegans
is
that
every
time
there's
an
asymmetric
division,
you
get
two
new
cell
types,
okay,
yeah,
and
you
only
have
about
50
cases
where
it
doesn't
happen
where
you
get
a
pair
of
cells
that
are
supposed
to
be
similar
right.
So
that's
why
I
think
C
elegans
might
be
the
right
right
organism.
A
So, you know, I don't know if people are interested in that — if people out there want to work on that problem, you know, we could talk. We've been trying to do something like this for a while, but it's hard to kind of get it going and do something interesting with it. It's hard work because it's a lot of bioinformatics, and not everyone has that skill set.
C
Yes, yeah — I mean, conceptually it's not that hard, but the combinatorics are tremendous, right.
C
So that's why I think this would be a major step towards testing whether differentiation trees are the basis of differentiation — and if they are, okay, what is the representation in the DNA? Yeah, yeah, okay. We have the strange situation of mapping a tree into a linear structure, right, okay, and that's why it might be more complicated than that, I imagine. But okay, yeah — if you try to take a tree and map it onto a linear structure, you'll see the problems, right.
A
So yeah, it's usually, you know, either going to be the same sequence that's conserved across different cell types, or — probably more likely — you're going to have some changes, depending, I guess. If the differentiation process is the same, it would be basically conserved. So, you know, it would basically be maybe like an instruction of change to the cell type; or, a lot of times, you have genes that are for things like stemness, or things that enforce the identity of a cell.
A
Well, in C. elegans you have a deterministic differentiation pattern, so, you know, certain cell types — they start from the one-cell stage as sort of developmental cells, which are basically pluripotent — well, they're not pluripotent in the conventional sense, but they're not yet in their adult, terminally differentiated form until they get to a certain point, and then they differentiate.
A
Up the tree or down the tree — I guess, somewhere on the tree.
A
I don't think so. I mean, there's a genome, and — because I know they do a lot of single-cell sequencing, so they may have single-cell data available. So it'd be interesting to see what the differences are in the actual sequence.
A
I think — yeah, people have worked on different things, like, I think, methylation markers and things like that in the genome. I don't know if people have also done things with epigenetic inheritance; but associated with that work, they've done some work on looking at the epigenome. I don't know where those data are — I'm sure those data are out there.
A
Yeah, it would be great. I mean, I don't think people have — I think people think about it more in terms of, like, the cell type, what's upregulated in the cell type, but that's kind of beside the point. Is there a mechanism that sort of guides the cell towards a fate? So, like, you know, obviously you have the cell and its associated things — things associated with that cell type.
A
Right, yeah, that'd be good. Well, maybe I'll take a look and see what kind of data there is for these types of things.
E
Okay, I tried the other day, and I said, well, skip the first, like, seven pages of this —
E
No, I decided that the paper I was trying to duplicate ignored the instabilities in the tensegrities they were using: they just mapped the graph before the instability and then graphed afterwards, made a nice smooth curve, and said, oh, there are instabilities — like, yeah, there are. So that's my half-baked idea.
E
I solved the rotation problem by placing the triangle in the negative y direction, so that the y is actually pointing out of the plate. Okay, I was trying to explain that last time, whether anybody cared, but anyway, I did solve that — good. So now, at least for stress-strain curves, you can make them point in any direction you want.
E
Doesn't
that's
not
very
good
either,
but
because,
apparently,
if
you
make
the
the
top
and
the
bottom
and
the
bars
infinite
elasticity,
then
you
get
a
J
curve,
which
is
what
you're
looking
for
in
a
cell.
E
So
somehow
cells
have
a
kind
of
an
infinite
elasticity
like
over
200
gigapascals,
wow
I'm
going.
Maybe
if
you
include
the
myosin
and
the
fact
that
they're
actually
Contracting
I,
don't
know
anyways,
it's
a
mystery
and.
A
So I'm going to go back to sharing my screen here — or have I shared my screen? I did share my screen. Okay, going back to sharing my screen. Actually, I'm going to talk about this first — and I know Morgan knows about this, because he has the T-shirt. There's this organization, of course, called Fermat's Library, and they have a paper of the week, every week.
A
Fermat's Library is like a set of papers where people can build a library in this specialized PDF interface, where you can mark up the document collectively — you can take notes on a paper, or whatever. So that's where this is coming from, and this last week they had a paper from John Maynard Smith, from 1970: he published a paper entitled "Natural Selection and the Concept of a Protein Space."
A
This
proposes
a
simple
analogy
for
the
incremental
process
of
adaptive.
Evolution
is
protein.
Space
analogy
contains
the
basic
basis
for
many
Central
ideas
in
evolutionary
genetics,
so
he
just
he
talked.
He
likened
The
evolutionary
trajectory
of
proteins
to
a
simple
word
transformation
game
in
this
letter,
this
game
with
the
goal
of
evolving
one
word
into
another
by
means
of
a
single
letter.
Substitution
serves
as
a
striking
model
for
protein
evolution,
some
of
the
work
I've
done
with
vehicles
and
composing
organisms,
there's
sort
of
a
quasi-developmental
process.
A
I
I,
you
I,
illustrate
this
with
these
kind
of
word
games
and
I.
Actually
wasn't
I
didn't
remember
that
this
was
this
sort
of
what
this
was
this
exercise.
But
it's
basically,
you
know
taking
like
a
string
of
letters,
it
could
be
DNA
or
it
could
be
like
the
English
alphabet,
which
has
26
letters
and
permutating
those
letters.
But
you
start
with
like
a
word,
and
then
you
permutate
the
letters
to
make
new
words,
but
they
have
to
be
words
that
mean
something
they
can't
be
gibberish
they
have
to.
A
You
know,
have
like
an
a
meaning
to
them.
So
that's
the
challenge,
and
so
it's
like
you
know,
in
a
protein
sequence
or
even
in
a
DNA
sequence,
where
you
have
a
string
of
letters,
you're
mutating
them,
but
you
don't
want
to
mutate
to
something:
that's
not
functional.
You
want
to
mutate
to
something,
that's
functional.
So
that's
the
idea,
so
you
have
26
letters
you
have
so
you
you
know
26
possible
States
in
each
position.
A
If
you
have
a
string
of
four,
you
have
four
to
the
26th
possibilities
which
you
don't
like
I
said.
You
don't
want
to
use
all
those
possibilities.
You
just
want
to
use
the
possibilities
that
have
some
meaning.
So
usually
what
that
means
is
that
you
can
build
a
transformation
path
from
one
word
to
another
by
mutating
one
letter
and
then
maybe
mutating
another
letter,
and
it's
a
little
bit
artificial,
because
you're
kind
of
doing
this
consciously
to
sort
of
to
hit
words
with
no
meaning.
A
So
an
example
would
be
like
word
to
war,
so
you're
mutating
the
last
letter
from
D
to
E.
Then
the
third
letter
is
Gore
or
the
third
word
is
Gore,
which
is
mutating
the
first
letter
from
W
to
G
and
then
the
fourth
word
is
gone,
which
is
where
you
mutate,
the
third
letter
from
R
to
n
and
then
the
fifth
word
is
Gene,
which
is
mutating
the
second
letter
from
o
to
e.
So
you
get
go
from
word
to
Gene
and
in
five
steps,
and
you
can
map
that
out
in
a
tree.
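The Maynard Smith word game can be checked mechanically: each step must be a single-letter substitution, and every intermediate must be a real word. A small sketch, using the path itself as the "dictionary" just for illustration:

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

def is_valid_ladder(path, dictionary):
    """True if every word 'means something' (is in the dictionary) and
    each consecutive pair differs by exactly one letter."""
    return (all(w in dictionary for w in path) and
            all(hamming(a, b) == 1 for a, b in zip(path, path[1:])))

path = ["WORD", "WORE", "GORE", "GONE", "GENE"]
print(is_valid_ladder(path, set(path)))  # True
print(len(path) - 1)                     # 4 substitution steps
```

With a real English word list as the dictionary, the same function enumerates which mutational paths stay "functional" at every step — the point of the protein-space analogy.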
A
Oh, right, yeah — I mean, it probably has a lot of similarities to that. It's kind of a fun thing to do, because you see, kind of, similarities between things. There are other games that you can play with this: you can build, like, a sequence, and then you can build alternate sequences, so it's not just one sequence.
A
It's
like
you,
have
multiple
paths
from
word
to
Gene
and
you
can
just
mutate
different
letters
and
get
there
in
different
ways,
and
so
this
is
similar
to
the
concept
of
a
neutral
space
and
it
you
know,
do
not
familiar
the
neutral
space.
It's
basically
the
space
of
possible
genotypes
that
if
you
were
to
just
let
you
know
your
system
sort
of
go
at
random
and
explore
different
states.
You
would
end
up
in,
like.
A
Sometimes people characterize this with a hypercube: they have different states at the nodes of the hypercube, and you have to traverse a path through the hypercube to get to the other side, and you have to count how many steps it takes you to get there. So this is sort of similar to that as well, and you can look at optimality criteria and all that. And then, you know, there's this sort of analogy between functional proteins and words in English — or words in any language.
A
You
know
you
have
a
finite
alphabet,
you
can
look
at
their.
You
know
the
transitions
between
different
words
and
build
a
space
from
that,
and
so
that's
sort
of
treating
you
know.
Protein
sequences
and
Gene
sequences
as
like
a
linguistic.
A
— sort of dictionary. So this is kind of — so this was from 1970, and then this is the paper here, where the notes are taken. So these people just pop notes in here — and I don't think this is the paper; I think I just took a screenshot of it — but this is sort of where he starts talking about natural selection and the concept of a protein space.
A
So,
instead
of
thinking
about
it
as
a
neutral
space,
he
talks
about
natural
selection,
which
would
be
where
there's
a
directional
imperative
for
selection
to
go
to,
like
the
other
side
of
the
hypercube
in
that
neutral
space
in
a
minimal
number
of
steps.
So,
instead
of
just
exploring
every
mutational
pathway,
there's
an
imperative
for
natural
selection
and
if
you
think
about
like
the
sort
of
the
artificial
nature
of
this
or
you
go
from
word,
Gene
you're
really
imposing
sort
of
artificial
selection.
On
this
to
say,
I
I
want
to
go
to
a
certain
location.
A
It
is
similar
to
a
lot
of
games
that
people
played
it's
very
game-like.
You
know
you
have
a
lot
of
States,
you
can
visit
and
it's
yeah,
it's
nice,
but
that's
that's
something.
You
know
that
relates
to
what
we're
talking
about
before,
because
you
know
you're
looking
for
things
that
are
permutating,
maybe
from
different
for
different
cell
types.
You
might
have
differentiation
genes,
for
example,
that
all
are
sort
of
Highly
conserved.
So
you
have
this
one
process
that
unfolds
in
every
cell
type
and
it
just
tells
it
to
go.
A
One
way
you
know
tells
it
to
say
differentiate
now
or
you
could
have
something
that
mutates
a
lot
and
it
has
a
specific
code
to
say
it's,
this
cell
type
or
you
know,
which
is
maybe
less
likely,
because
you
know
you
it's
through
different,
like
throughout
Evolution
you'd,
have
to
really
kind
of
Target
or
it'd
have
to
be
under
high
selection
for
certain
cell
types.
So
a
little
bit
of
a
word
on
what
hyperspaces
are
so
a
lot
of
times.
A
If
you
want
to
Envision
a
sort
of
an
evolutionary
space
that
is
subject
to
neutral
processes
or
neutral
Theory
people
of
God,
with
this
idea
of
hyperspace,
so
I'm
going
to
draw
a
simple
example
here,
where
we
have
a
cube-
and
this
Cube
has
a
number
of
binary
States,
we
have
eight
nodes
and
we
characterize
it
with
a
three
bit
register.
So
we
have
this
path
from
zero:
zero
zero
to
one
one
one.
A
So
we're
going
from
the
bottom
left-hand
corner
to
the
top
right
hand
corner
of
this
Cube,
and
the
idea
is
that
you
can
navigate
across
from
node
to
node,
and
each
node
represents
some
mutation
on
zero
zero
zero.
So
you
could
have
zero
zero
one
zero
one,
zero
one:
zero
zero!
Basically
it'll!
Allow
you
to
move
to
111,
so
you
can
have
each
step
as
a
mutational
step
and
it
changes
what
we
call.
They
call
this
the
phenotype.
A
So
a
lot
of
these
models
have
been
developed
in
RNA,
looking
at
RNA
molecules
and
looking
at
their
phenotypes.
So
a
mutation
in
an
RNA
molecule
is
a
change
in
the
phenotype.
It
can
change
the
conformational
structure.
It
can
change
the
message,
that's
inherent
in
the
molecule,
so
people
also
use
hypercubes.
I.
Think
I
mentioned
this
in
the
lecture
where
you
have
a
hypercube.
A
So
you
have
a
cube
within
a
cube.
Basically,
sometimes
people
call
these
tesseracts
and
I
think
they
use
the
word
tesseracting
correctly
in
the
Marvel
series
of
comics
but
whatever.
So
this
is
our
hypercube
and
it's
basically
a
cube
within
a
cube.
A
So
has
a
lot
more
States
and
you
can
explore
more
space
with
this
and
there
are
other
sort
of
mind
bending
topologies
that
you
can
use.
It
gives
you
a
larger
number
of
states,
but
it
also
gives
you
this
larger
space
to
Traverse,
and
the
idea,
though,
again,
is
to
go
across
the
space
traversing
it
one
step
at
a
time
and
then
figuring
out
how
many
steps
you
need
to
take
to
get
to
your
endpoint
now.
A
The
other
thing
I
want
to
mention
is
that
you
know
I
talk
about
this
as
if,
like
it's
a
magical
process,
but
it's
really
about
you
know
what
what
are
the
constraints
that
you
impose
on
these
models?
So
you
know
we
talk
about
neutral
processes,
we're
just
basically
talking
about
a
system
that
explores
a
space
at
random,
so
this
space
has
eight
eight
possibilities,
and
so
you
can
explore
it
at
random.
So
you
could
start
with
zero
zero
zero
and
you
could
move
up
to
zero
one
zero.
A
That's
the
shortest
route,
but
in
you
know,
in
a
neutral
process,
you
might
end
up
achieving
that
in
seven
steps.
It's
just
a
matter
of
random
mutation
and
not
favoring
any
one
site
over
another,
not
favoring
anyone's
strategy
over
another,
and
so
that's
and
then
natural
selection
is
more
reflective
of
this
shorter
path
of
this
optimized
path
from
one
phenotype
or
phenotypic
state
to
another.
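The contrast here — a neutral random walk versus the direct route — can be sketched on the three-bit cube: the shortest path length between two corners is their Hamming distance, while an unbiased walk (no site or strategy favored) usually takes longer:

```python
import random

def hamming(a, b):
    """Shortest path length between two corners of the hypercube."""
    return sum(x != y for x, y in zip(a, b))

def neutral_walk(start, goal, rng):
    """Flip one randomly chosen bit per step until the goal genotype
    is reached; return the number of steps taken."""
    state, steps = list(start), 0
    while "".join(state) != goal:
        i = rng.randrange(len(state))            # pick a site at random
        state[i] = "1" if state[i] == "0" else "0"
        steps += 1
    return steps

rng = random.Random(0)
print(hamming("000", "111"))                      # 3 — the optimized route
walks = [neutral_walk("000", "111", rng) for _ in range(1000)]
print(sum(walks) / len(walks) > 3)                # True — neutral drift is slower on average
```

Every walk still needs at least three steps (the Hamming distance), but the average over many runs is well above it, which is the "seven steps instead of three" effect described above.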
A
The other thing I want to talk about is this.
C
Interesting,
maybe
one
approach
would
be
to
say,
make
a
particular
cell
type
and
you
get
for
an
organism.
That's
been
sequenced
and
then
pick
another
organism
with
the
same
cell
type
and
see
what's
similar
between
them.
Yeah.
A
Yeah, that would definitely yield an answer of some type. At least you'd have an idea of what to look for.
A
Well,
people
have
well
people
have
compared
different
species
and
cell
types,
I'm,
not
sure.
They've
looked
asked
that
for
that
specific
question,
so
they're,
probably
not
looking
for
you
know
they
might
be
looking
for
a
specific
functional
Gene,
like
maybe
a
muscle
Gene
in
you,
know,
muscle
specific
Gene
in
two
species
and
then
and
then,
but
then
you
know,
you'd
have
to
decide
whether
you
want
something
that's
closely
related
or
not,
because
a
lot
of
people
look
in
model
organisms
and
it's
like
you
know,
a
mouse
and
a
c
elegans.
E
For the organism — I was thinking the eye, the retina. Eyes in invertebrates might be interesting, yeah.
A
No — yeah, I guess, yeah. If you take a sample of tissue from any organ, you're going to get — well, I mean, you'll get cell types at different levels of differentiation sometimes, and sometimes you'll get, like, a single cell type, like in liver, versus, say, if you were to sample a retina, where you have, like, a bunch of cell types jumbled up in there.
A
But yeah, so, I mean, the dataset you would get would probably not be one cell type; we'd have to work with the data we have available. I mean, you could do the experiment — you could maybe separate out different cell types — but, yeah, that's a good point. There are, of course, classic experiments where you just take a cell type from one organism and put it in another organism, or, you know, yeah.
A
So, I think Susan threw a bomb earlier when she talked about physicists not having a lot of biology background — which is somewhat true — but there's this argument that biology is more theoretical than physics, and this is a newish paper. Yeah.
A
Yeah, yeah, okay. So this is a paper from 2013, actually — so it's about 10 years old now — and it's called "Biology Is More Theoretical Than Physics." And so it's kind of an argument that, you know, we think of physics as maybe really theoretical and biology as not being so theoretical, but this paper makes the argument that the opposite is maybe true. And so the abstract reads: the word "theory" is used in at least two senses —
A
— one, to denote a body of widely accepted laws or principles, as in "Darwinian theory" or "quantum theory"; and two, to suggest a speculative hypothesis, often relying on mathematical analysis, that has not been experimentally confirmed. So it's basically these widely accepted laws or principles that, you know, have been developed, versus speculative hypotheses. So this is just the way that people use "theory," and these are only two definitions — there are probably more.
A
It
is
often
said
that
there
is
no
place
for
the
second
kind
of
theory
in
biology
and
that
biology
is
not
theoretical
but
based
on
interpretation
of
data.
So
this
is
where
often
people
will
go
and
get
lots
of
data,
but
they
have
no
framework
or
they
may
have
a
framework
for
it
in
maybe
like
accepted
laws
or
principles,
but
they
don't
really.
You
have
a
lot
of
room
for,
like
you
know,
building
speculative
hypotheses
that
go
just
kind
of
Beyond
like
Data
Mining,
and
that
sort
of
thing.
A
Maybe
it's
sort
of
the
way
that
the
field
works,
I
don't
know,
but
anyways
here,
ideas
from
a
previous
essay
are
expanded
upon
to
suggest
to
the
contrary
that
the
second
type
of
theory
is
always
played
a
critical
role
in
that
biology,
therefore,
is
a
good
deal
more
theoretical
than
physics
and
says
kind
of
goes
through
some
of
these
things
that,
in
a
previous
lessee
I
pointed
out
the
Curious
Case
of
the
enzyme,
substrate
complex,
which
was
widely
used
to
understand
enzymes
before
any
enzyme,
substrate
complex
was
shown
to
exist.
A
So
this
is
where
you
have
this
idea
of
this
structure,
or
this
thing
we're
talking
about
with
differentiation
genes.
We
don't
know
if
it
exists
or
not.
No
one's
shown
it
exists,
but
you
kind
of
build
a
model,
and
you
say
this
exists
and
I'm
going
to
show
that
exists
and
in
a
way
that's
how
you
kind
of
do
biology.
Although
you
know
a
lot
of
tennis,
people
will,
like
I
said
you
know,
they'll
collect
data
on
you
know
some.
A
You
know
comparing
two
different
species
or
you
know
looking
for
all
functional
genes
of
a
certain
category
or
something
like
that.
But
this
is
where
you
actually
build
a
you
know
a
sophisticated
model
of
something,
and
then
you
kind
of
kind
of
realize
it
with
data
so
Britain
chance
who
brought
these
hypo
hypothetical
entities
into
existence
was
in
no
no
doubt
that
he
was
providing
the
first
experimental
confirmation
of
a
theory.
A
So
this
is
that
one
I
think
Britain
chance
was
also
the
person
who
developed
fnir
if
I'm
not
mistaken,
but
in
the
interviewing
intervening
30
years,
biochemist
happily
used
a
theoretical
entity
because
it
was
so
useful
and
explain
so
much
expediency
overcame
the
kind
of
philosophical
Scruples
that
would
make
a
physicist
swing,
and
so
this
is
something
where
perhaps
this
is
a
marginal
episode
among
enzymologists,
which
can
be
glossed
over
in
favor
of
the
party
line.
That
biology
is
not,
of
course,
theoretical.
A
I
claim
that
this
is
far
from
the
case.
In
fact,
similar
episodes
have
occurred
throughout
biology
involving
some
of
its
most
important
entities,
and
so
he
talks
about
the
receptor
and
some
of
the
equations
that
they
developed
to
describe
a
receptor,
and
these
are
all
theoretical
things
where
you
have
like
these
constants
and
variables
that
you're
kind
of
coming
up
with
you
know.
This
is
what
it
should
look
like.
A
So this is the Hill function, and this is just something that had to be experimentally verified: you build the model and then you eventually realize it. It took 30 years for the enzyme-substrate complex to become a chemical reality; the receptor took a good deal longer.
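The Hill function mentioned here fits in a few lines. A minimal sketch (the variable names L for ligand concentration, K for the half-occupancy constant, and n for the Hill coefficient are my own choices, not from the talk):

```python
def hill(L, K, n):
    """Fractional receptor occupancy: theta = L^n / (K^n + L^n)."""
    return L**n / (K**n + L**n)

# At L = K the receptor is half-occupied regardless of n,
# and a larger n makes the binding curve more switch-like.
```

The point of the passage stands out here: K and n are exactly the kind of theoretical constants that were posited first and only later pinned to chemical reality.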
A
So this is all thinking about theory and how you realize it in data. Related to that is this paper by Wallace Arthur, and this is sort of biological theory. This is on D'Arcy Thompson's morphological transformations: issues of causality and dimensionality.
A
This is the person we've talked about before, D'Arcy Thompson. He did a lot of things with mathematical transformations, looking at organismal phenotypes and how they change across species.
A
So if you take a fish and put it on a grid, and you map the intersections of the grid onto the phenotypic landmarks of the fish, then if you take a different fish and put it on that grid, you warp the grid to match the landmarks. You end up with this transformation between a straight-line grid and a warped grid. And this is interesting, because D'Arcy Thompson was building these theoretical models.
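As a toy version of this landmark-based warping, one can fit a single affine map from the landmarks of one form onto the other and apply it to any grid point. This is only a sketch; Thompson-style warps are generally nonlinear (e.g. thin-plate splines), and the landmark coordinates below are made up:

```python
import numpy as np

def affine_from_landmarks(src, dst):
    """Least-squares affine map sending src landmarks onto dst landmarks."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    X = np.hstack([src, np.ones((len(src), 1))])  # homogeneous coords, (n, 3)
    M, *_ = np.linalg.lstsq(X, dst, rcond=None)   # (3, 2) transform matrix
    return M

def warp(points, M):
    """Apply the fitted map to arbitrary grid points."""
    P = np.asarray(points, float)
    return np.hstack([P, np.ones((len(P), 1))]) @ M

# Hypothetical landmarks on species 1 and their positions on species 2
src = [[0, 0], [1, 0], [0, 1], [1, 1]]
dst = [[0, 0], [2, 0], [0.5, 1], [2.5, 1]]   # stretched and sheared
M = affine_from_landmarks(src, dst)
```

Every intersection of the straight grid can then be pushed through `warp` to draw the deformed grid for the second species.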
A
And it's something that is a nice tool, but it's also kind of a theory of transformational change. "D'Arcy Thompson's drawings showing coordinated differences between the shape of an individual of one species and the shape of an individual of another have been reproduced and discussed countless times. However, while they are widely regarded as inspirational, their interpretation in causal terms has proven difficult, and there is no consensus on the matter." So these models are somewhat predictive, but they're also mostly inspirational.
A
So it's not that we understand the cause of those transformations; we just understand that those transformations exist. We're not really at the stage of a predictive theory; we're at the stage of a nice tool that can demonstrate these transformations, but we don't really know why they exist in that form.
A
"Here I approach these Thompsonian transformations from a particular angle, namely their dimensional insufficiency. I argue that this problem must be solved before the issue of causality can be considered." So a lot of times in theory we want to know about causality, we want to know if one thing causes another; but maybe that's the wrong question to start with.
A
Maybe you need to understand what he calls dimensional insufficiency. This approach "leads to the conclusion that Thompsonian transformations," or morphological transformations (remember, those transformations of the grid), "have not taken place in evolution and logically can never do so, because they involve the direct conversion of the adult form of one species into that of another."
A
So this is really interesting: the whole idea of the Thompsonian grids is to show this transformation between the adult phenotype of one species and the adult phenotype of another species. You look at these mathematical transformations and you assume it must have gotten there in development, but we're not actually measuring that; we're measuring adult form versus adult form. So what he's saying here is that these transformations are all adult-to-adult comparisons, and so you can't necessarily say that these are...
A
Can these conversions represent what's happening, say, in development? That's where you have this problem: you use a nice approach like that, but you also have to have this developmental aspect. So what he does here is talk about developmental transformations, which are also important, and which would actually involve the phenotype developing along a certain trajectory. That's really the whole business of evo-devo: to explain, if you have adult forms that are very variable, and especially a trait...
A
...that's variable, how it gets there in development, or what the developmental mechanisms underlying it are. In contrast, developmental transformations do occur in the short term, and those developmental transformations don't necessarily show up in those grid transformations. The grid transformations are sort of the end product of the developmental transformations, so you have to think about it in a different way: it's not that those transformations are going to explain anything on their own.
A
They just show what happens when some X factor, which is the developmental transformations, unfolds. So there are these developmental transformations that occur in the short term, within the lifetime of a single individual, and evolutionary transformations that occur in the long term, "in a context that can be described, following Scott Gilbert, as having five dimensions." Scott Gilbert is a developmental biologist who described these as five-dimensional transformations. "I argue that these two kinds of transformation, developmental and evolutionary, have different causal agencies."
A
He considers the possible nature of these agencies and, related to that, the way in which Thompson's work connects with Darwinian evolutionary theory. So he really lays this out in terms of a lot of what D'Arcy Thompson did as it relates to developmental biology, and that involves the expression of Hox genes and other things.
A
So this is an example of these grids: laying different adult phenotypes on them, using different landmarks, like the tip of these antennae, and then warping the grid accordingly. So if there's a larger distance between this left appendage and the antenna, that grid will warp accordingly: it's shorter here and longer in this species, and then it diverges outward in this species.
A
On some of these issues: I'm not going to get into this too much today, but there is this document, "101 Unsolved Puzzles in Evo-Devo" (I mentioned evo-devo, or the evolution of development). These are all unsolved problems, problems that we've observed in biological systems but don't really have a theory of how they happened. So there are a lot of puzzles involving origins, evolvability, the genome, symmetry and asymmetry, digits, behavior, the nervous system.
A
What's the cellular basis for monozygotic twinning? How does twinning occur in armadillos? So you go from the level of different traits, to the level of things in specific species, to the cellular basis for epithelial fusion. You have a process where you want to know the mechanism: what causes the loss of tissue at symmetric fusion planes?
A
So these are numbers in brackets that denote references as enumerated in "Quirks of Human Anatomy"; for context and perspective on these topics, see "Quirks". For definitions of evo-devo terms, there's this book, "Keywords and Concepts in Evolutionary Developmental Biology", which is a good book if you're interested in the jargon of that area of evo-devo; they have a lot of things on evolution and development.
A
It's a good book as an introduction to that area. And so they're putting these puzzles together largely from these two sources, and I think it's a nice list, because it gives all these different examples of biological questions and potential theoretical questions. It gives a nice diversity of ways we think about theory and how we think about asking questions in biology.
A
I think there's a similarity in the math, so a lot of times they're using the same sort of techniques. Of course, D'Arcy Thompson was doing it by hand.
A
Yeah, well, they usually don't have it as a grid; it's like, if I take an image, I can warp parts of the image, so I have these features in the image that I can pull apart, or dilate, or make smaller.
C
The biogenetic law, right? What Haeckel noted is that embryos at early stages all look alike. Okay, I was joking about it: my wife's been watching some movies where people are raising birds from eggs, and they can't tell what bird it is until it grows up.
A
Yeah, and in the Wallace Arthur paper, I think he mentions Haeckel's work. I haven't read through what he says about it specifically, but...
A
Well, I think the point, though, is that the transformations between adults tell you something about the adult phenotypes, but the process from the embryo, or the process of evolving from the common ancestor between them, or rather what's going on in their development, is a separate thing. It may ultimately be responsible for that transformation, but we can't just say, okay, there's something that moved this here and moved that there; it's a much more complicated process than that.
A
All right, any other comments or questions?
E
No, I don't have comments or questions.
A
Now I'd like to talk about two papers: one is the one Himanshu talked about, and the other is something I found that's related to it. So let's get into it. The first one is link prediction with persistent homology; this is the one Himanshu talked about in the meeting. This is kind of moving from microscopy, segmenting things and finding features, to building nodes and then links. At that point you need to predict your links, and you can use persistent homology to predict links.
A
So this is what this paper is about. Again, you have this underlying graph structure of the data and you're trying to predict links, so "link prediction is an important learning task for graph-structured data." This is an interesting approach.
A
What they're saying is that the data you have is inherently structured as a graph: it has connections, it has interactivity inherent in the data, and so you need to find the important links in the data set. So it's an important learning task. When we talk about graph neural networks and topological data analysis, and that sort of intersection, we need to think of link prediction as a learning task.
A
"In this paper we propose a novel topological approach to characterize interactions between two nodes." So they're actually using a topological approach, instead of, say, a statistical approach like we saw in the hypercube example or the neutral-space example: simple associations based on the states of the nodes, asking what the path from one node to another is; is it mutational, is it something else? But this is a case where we don't necessarily have that natural guide.
A
We have to predict the links from the data, from the association of the features with one another. "Our topological feature, based on the extended persistent homology, encodes rich structural information regarding the multi-hop paths connecting nodes." So this is where, again, we have this cube: we have 000, we have 111, we have a path between them, and we need to predict these multi-hop paths.
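The cube example can be made concrete. A small stdlib-only sketch (my own construction, not the paper's code) enumerating the monotone multi-hop paths from 000 to 111 on the 3-cube, where each hop flips one bit that moves you closer to the target:

```python
from itertools import product

def hypercube(n):
    """Adjacency of the n-cube: bit strings joined by single bit flips."""
    nodes = [''.join(b) for b in product('01', repeat=n)]
    return {v: [v[:i] + ('1' if v[i] == '0' else '0') + v[i + 1:]
                for i in range(n)] for v in nodes}

def shortest_paths(nbrs, s, t):
    """All shortest multi-hop paths from s to t (Hamming-monotone walks)."""
    dist = lambda a, b: sum(x != y for x, y in zip(a, b))
    def go(v, path):
        if v == t:
            yield path
        for w in nbrs[v]:
            if dist(w, t) < dist(v, t):   # only hops that make progress
                yield from go(w, path + [w])
    return list(go(s, [s]))
```

On the 3-cube there are 3! = 6 such paths, one per ordering of the three bit flips; the richness of this path structure is exactly what the topological feature tries to capture.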
A
Multi-hop is a communication-networks term, but it basically means having multiple steps to get from one point to another in the network. So we need to be able to predict these paths, and in doing so we predict links based on this feature. "We propose a graph neural network method that outperforms the state-of-the-art on different benchmarks."
A
As
another
contribution,
we've
proposed
a
novel
algorithm
to
more
efficiently
compute
the
extended
persistence,
diagrams
or
graphs.
This
algorithm
can
be
generally
applied
to
accelerate
many
other
topological
methods
for
graph
learning
tasks,
so
they
kind
of
go
through
this.
They
show
two
example
graphs
here,
so
you
have
these
multi-hop
topologies,
where
you
go
from
one
red
dildo
to
another
red
node
and
you
can
see
as
they
get
more
complicated.
So
this
is
kind
of
analogous
to
our
Cube
versus
our
hypercube
example.
A
When I was talking about neutral spaces and that sort of thing. So in this graph on the left you have this multi-hop architecture: you want to go from one red node to the other, and you ask what the intervening nodes are, and then, obviously, the intervening links. The second one is a little harder because there's more interactivity: in the first one you have three alternate paths, and in the second one you also have three alternate paths, but those paths interact in different ways.
A
So that's what we have here: there's a lot of data that we can capture and characterize. This is an example, in figure two, of extended persistent homology. Panel A is where they plot the input graph with a given filter function, so they're plotting this over time: at different time points these nodes appear, and they are linked together in this way. Panel B shows the ascending and descending filtrations of the input graph.
A
So they have filtrations of it by time: at each time a new node pops up, and as those pop up you get new links. For example, node one appears at time one, and node two appears at time two, but you can't necessarily find a connection between nodes one and two. At time three, node three pops up, and of course there is a connection between three and two, and between three and one, but not between one and two.
A
So it's not just connecting anything; it's connecting things as they appear, and then the evidence appears that they're connected. At time four you have one, two, three and four, and they're connected, and then at time four you also have the disappearance of one, two and three. So they disappear here: these blue nodes die, you get a red node four, then four becomes a blue node, you get a red node three that appears at time three, and then time two. So you're going...
A
I guess you're going up to time four and then you're doing a reverse analysis. So "the bars in the brown and blue colors correspond to the lifespans of connected components and loops, respectively." There are lifespans: things die off, things come back to life, as you do this reverse process.
A
You get these connected components, that is, all the components that are connected in a graph. Sometimes you'll have nodes that are sort of out there without any connections, but a connected component is a part where you can go from any one node to any other node through the connections that have been built.
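The birth-and-death bookkeeping for connected components in the ascending pass can be sketched with a union-find structure: nodes are born at their filtration time, edges enter at the max of their endpoint times, and when two components merge, the younger one dies (the "elder rule"). This is a generic 0-dimensional persistence sketch, not the paper's actual implementation:

```python
def zero_dim_persistence(node_time, edges):
    """(birth, death) pairs of connected components in a graph filtration.

    node_time maps node -> birth time; an edge appears at
    max(node_time[u], node_time[v]), the ascending filtration rule.
    The oldest component never dies (death = None).
    """
    parent = {v: v for v in node_time}
    birth = dict(node_time)

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    pairs = []
    for u, v in sorted(edges, key=lambda e: max(node_time[e[0]], node_time[e[1]])):
        t = max(node_time[u], node_time[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                 # edge closes a loop (a 1-dim feature)
        if birth[ru] > birth[rv]:    # elder rule: younger component dies
            ru, rv = rv, ru
        pairs.append((birth[rv], t))
        parent[rv] = ru
    survivors = {find(v) for v in node_time}
    return pairs + [(birth[r], None) for r in survivors]
```

With nodes 1, 2, 3 born at times 1, 2, 3 and edges (1,3), (2,3), the components born at times 3 and 2 die when the merges happen, and the component born at time 1 survives.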
A
"The first four figures are the ascending filtration, and the last four figures denote the descending filtration." So there's this filtration: like I said, you do a forward filtration and a reverse filtration.
A
In the ascending filtration, f(u,v) = max(f(u), f(v)), while in the descending filtration, f(u,v) = min(f(u), f(v)). So they're basically using a different criterion for each one, the maximum value and the minimum value, and you're getting two different ways of building the links for the graph. "In the resulting extended persistence diagram, red and blue markers correspond to zero-dimensional and one-dimensional topological structures." These are, of course, the zero-dimensional points and the one-dimensional points.
A
One axis is death time and the other is birth time. So you have nodes that are being born, nodes that are dying off, and you have this intersection: high death times with lower birth times are the zero-dimensional points, and higher birth times with lower death times are the one-dimensional points. So this is kind of their approach.
A
Their contribution is, actually: "In this paper we propose a pairwise topological feature to capture the richness of the interaction between a specific pair of target nodes. We compute topological information within the vicinity of the target nodes, which is defined as the intersection of the k-hop neighborhoods of the nodes. It has been shown that such locally enclosed graphs carry sufficient information for link prediction." So they're able to boil this down to these local graphs.
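The "vicinity of the target nodes" can be sketched with plain BFS: the intersection of the two k-hop neighborhoods gives the locally enclosed subgraph that the features are computed on. A stdlib-only sketch (the node names in the example are made up):

```python
from collections import deque

def k_hop(nbrs, source, k):
    """All nodes within k hops of source (breadth-first search)."""
    seen = {source: 0}
    q = deque([source])
    while q:
        v = q.popleft()
        if seen[v] == k:
            continue              # don't expand past the k-hop frontier
        for w in nbrs[v]:
            if w not in seen:
                seen[w] = seen[v] + 1
                q.append(w)
    return set(seen)

def enclosing_subgraph(nbrs, u, v, k):
    """Vicinity of the target pair (u, v): intersection of k-hop balls."""
    return k_hop(nbrs, u, k) & k_hop(nbrs, v, k)
```

On a path a-b-c-d-e, the 2-hop balls around a and e intersect only in the middle node c, which is exactly the region the multi-hop paths must pass through.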
A
This carries enough information for link prediction, so you're basically evaluating small parts of the graph; they have enough information to tell you about the links. You find the intersection of these neighborhoods in multi-hop relationships, you predict the links from that, and then you build up from those neighborhoods to a larger graph structure.
A
And that is here: this is the paper by Cohen-Steiner et al., "Extending Persistence Using Poincaré and Lefschetz Duality". This is the paper that they're drawing from.
A
And so, "in summary, our contribution is threefold: we introduce a pairwise topological feature based on persistent homology to measure the complexity of the interaction between nodes. We compute the topological features specific to the target nodes using a carefully designed filter function and domain of computation." So they define their topological features and their nodes very carefully, and they can predict links from that. They have these neighborhoods, they have this sort of birth-and-death process, and that's what they're using to make these predictions.
A
"We use the pairwise topological feature to enhance the latent representation of a graph neural network and achieve state-of-the-art link prediction results on various benchmarks." So they do this in a graph neural network context, running it as a graph neural network algorithm. "We propose a general-purpose fast algorithm to compute extended persistent homology on graphs. The complexity is improved from cubic to quadratic in the input size. It applies to many other persistent-homology-based learning methods for graphs." So they're able to do comparatively well. I'm not going to go through the rest of the paper; they do a lot of highly technical work here, and we can talk about it later. Himanshu might want to present on some of this later.
A
But this is a nice paper for making predictions of links. Link predictions generally involve some sort of strength of the connection, and then you're using different criteria, like a maximum value or a minimum value, to predict whether a link should be there or not. So this is a nice paper on how to construct these, especially in a developmental context, where we have cells that are being born, cells that are dying, and cells that are diverging from their ancestral cells.
A
The descendants are diverging, and we actually have two connectivity networks of interest: a lineage tree, which predicts which cells emerge from which other cells, so they're going to have an affinity; and a spatial network, which predicts where they are in space, where they are in the embryo, and so forth. So this is useful; I think this will be a useful guide to some of these things.
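A lineage tree is just a directed graph from mother to daughter cells; a minimal sketch using the early C. elegans founder-cell names for illustration (the spatial network would be a separate adjacency structure keyed by position):

```python
# Mother cell -> daughter cells: a directed graph (lineage tree).
children = {
    "P0": ["AB", "P1"],
    "AB": ["ABa", "ABp"],
    "P1": ["EMS", "P2"],
}

def descendants(cell):
    """All cells that emerge, directly or indirectly, from `cell`."""
    out = []
    for d in children.get(cell, []):
        out.append(d)
        out.extend(descendants(d))
    return out
```

Link prediction on this structure asks: given partial observations, which mother-daughter edges (or which spatial-adjacency edges) are present?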
A
Another paper I want to talk about is "Transformers Meet Directed Graphs". We talked about directed graphs: lineage trees as directed graphs, differentiation trees as directed graphs. You saw in the example with the differentiation tree that we have a mother cell, we have daughter cells, and we have this replicated, sort of binary division. Sometimes...
A
...you have things where multiple cells originate from an ancestral cell, and sometimes you have asymmetric divisions where you only have one daughter cell, but most of the time this can all be characterized as a directed graph. So we can use directed graphs, but we can also use Transformers, which are a machine learning technique; as opposed to, say, graph neural networks, we can use Transformers to understand these, or use them in a similar way.
A
So, from the abstract: "Transformers were originally proposed as a sequence-to-sequence model for text, but have become vital for a wide range of modalities, including images, audio, video, and undirected graphs." These are different types of input data: video you might want to analyze, audio, or still images, but also undirected graphs, which are just graphs with interactions, like we saw in the last paper.
A
You just have these interactions you're trying to predict: sometimes you're trying to predict maybe causal linkages, and sometimes something like ancestry, where you have some information about the direction of your graph. "However, Transformers for directed graphs are surprisingly underexplored, despite their applicability in ubiquitous domains including source code and logic circuits," and development, so we'll put that in there. "In this work, we propose two direction- and structure-aware positional encodings for directed graphs." The positional encoding is, like, what's the position in the order?
A
So, is it before or after? Basically, "the first is the eigenvectors of the Magnetic Laplacian, a direction-aware generalization of the combinatorial Laplacian, and the second is directional random walk encodings." A random walk is where you take a step, you draw from a random distribution, you take another step, and so on and so forth. A directional random walk is one where you have a certain direction: you might randomize the steps, but you have to go in a certain, like, forward direction.
A
In a true random walk you could go in any direction you wanted, and so you end up with these little clusters of movement around a central point. A directional random walk goes, maybe, all the way to the top of the screen, if you're doing a simulation, or to one side of the embryo or another, if you're in a biological system. Basically, you're randomizing your movement, but always in one direction, and this is good for directed graphs, because that's what directed graphs are: they proceed in a certain direction.
A
"Transformers have also had success in graph learning tasks, from predicting the properties of molecules to other things. However, virtually all prior work focuses on undirected graphs." So they tackle directed graphs here, and the attention mechanism needs to become aware of the graph structure (Transformers use an attention mechanism): "for example, prior work modified the attention mechanism to incorporate structural information" (I think that's a different paper) "or proposed hybrid architectures that also contain graph neural networks." And so this is where you have...
A
...the attention mechanism in the Transformer working, instead of on visual features, on some of these aspects of graph structure.
A
That is the way they're going to go forward in this paper. "Another complementary option are positional encodings, which are used by many, if not most, structural Transformers." These are positional encodings, which is where one thing comes first and then another thing. This Min et al. paper is actually about the properties of molecules, so they're using this structure where the Transformer is predicting molecules, and then there's this other paper, which is actually...
A
...about applying attention in graph Transformers; this is another example on arXiv. And this is the Min et al. paper, "Transformer for Graphs: An Overview from Architecture Perspective"; it's also on arXiv.
A
There are different ways you can do this, by merging attention mechanisms in graph Transformers and graph neural networks. Okay, so now we have the motivation for building directed graphs. This is an example of the first eigenvector of the magnetic Laplacian: node size encodes the real value, and the colors encode the imaginary value. So they have these different ways that they've implemented this: the sequence, the undirected sequence. You have the sequence, which goes in a certain direction.
A
It's like going from one to ten; the undirected sequence, which goes in different directions from a source node; a binary tree, which has splits between a single node and two different nodes; and then a "trumpet", which is this thing here, where you have a directed sequence but with a shortcut that cuts across. These are, I guess, examples related to sorting networks, but they also show the first eigenvector of the magnetic Laplacian. "The goal is then to predict whether the sequence is a correct sorting network based on the sequence of operations."
A
"Moreover, we show that ignoring the edge direction maps both correct and incorrect sorting networks to the same undirected graph, losing critical information." So, the main contributions here: they make the connection between sinusoidal positional encodings and the eigenvectors of the Laplacian; they propose spectral positional encodings.
A
They
extend
random,
walk,
positional
encodings
to
directed
graphs.
They
excessively
assess
the
predictiveness
of
structural
or
positional
encodings
for
the
set
of
graph
distances.
A
They
introduce
the
task
of
predicting
the
correctness
of
sorting
Networks
the
canonical
ambiguity,
free
application,
where
directionality
is
essential,
the
model,
a
sequence
of
program,
statements,
the
directed
graph
and
rethink
the
graph
construction
from
source
code
to
boost
predictive
performance
or
robustness,
and
then
they
do
the
benchmarking.
So
there
they
have
a
number
of
different
things.
A
They
introduce
here
basically
they're
trying
to
find
the
position
when
coding
through
these
kind
of
signal
processing
techniques,
then
they
have
the
eigenvector
or
laplacian,
which
is
this
graph
4A
transformation,
so
they're
taking
the
graph
and
they're
using
a
Fourier
transformation
on
it.
Then
they
have
directional
spectral
encodings.
A
They
Define
that
here
they
Define
this
in
terms
of
directedness.
So
we
next
illustrate
how
eigenvectors
of
magnetical
flossing
in
code
direction
for
the
special
case
of
the
sequence
the
eigenvectors
are
given
by
this
function.
This
corresponds
to
the
cosine
transformation
type
2,
with
additional
factor
that
encodes
the
node
position
V.
A
So
the
eigenvectors
of
the
magnetic
laplacian
also
encode
the
directionality
in
arbitrary
or
directed
graph
topologies
or
graph
topologies
that
are
directed
but
also
arbitrary,
where
each
directed
Edge
encourages
a
phase
difference
in
the
otherwise
constant
first
eigenvector
between
these
two
and
so
basically,
what
you
have
is
you
have
this
phase
difference,
so
each
Direction
Edge
is
sort
of
a
phase
difference
in
this
eigenvector.
So
then
they
tell
what
directional
random
logs
random,
logs
and
graphs,
which
you
often
use
to
like,
find
these
paths.
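The phase-difference idea can be seen in a few lines of numpy. This is a minimal sketch of a magnetic Laplacian (the paper's normalization and choice of the potential q differ in detail): on a directed path, the eigenvector of the smallest eigenvalue picks up a constant phase step per edge, which is exactly how direction gets encoded.

```python
import numpy as np

def magnetic_laplacian(n, edges, q=0.25):
    """Hermitian magnetic Laplacian on nodes 0..n-1.

    Each directed edge (u, v) carries a phase exp(2*pi*i*q);
    q = 0 recovers the Laplacian of the underlying undirected graph.
    """
    A = np.zeros((n, n), complex)
    for u, v in edges:
        A[u, v] += np.exp(2j * np.pi * q)
        A[v, u] += np.exp(-2j * np.pi * q)   # Hermitian conjugate entry
    D = np.diag(np.abs(A).sum(axis=1))
    return D - A

# Directed path 0 -> 1 -> 2: direction appears as a phase gradient
L = magnetic_laplacian(3, [(0, 1), (1, 2)])
w, V = np.linalg.eigh(L)
z = V[:, 0]   # eigenvector of the smallest eigenvalue
```

With q = 0.25, the zero mode satisfies z[u] = i * z[v] along each edge, so consecutive nodes differ by a 90-degree phase, and reversing an edge flips the sign of that phase step.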
A
We
showed
in
the
last
paper,
where
you
go
from
one
node
to
another,
you're,
basically
crossing
the
graph
and
you're
trying
to
find
these
pathways.
So
to
overcome
the
issue
of
only
walking
in
the
forward
Direction.
We
can
additionally
consider
the
reverse
Direction.
Additionally,
we
add
self-loops
to
sync
nodes.
This
avoids
that
a
might
be
no
potent
and
ensures
that
the
landing
probability
is
sum
up
to
one.
So
we
then
Define
the
positional
encoding
for
node
V,
and
so
then
that's
how
they
build
those
kind
of
random
walks
that
are
directional
making.
A
You know, sort of converting that into a directional encoding. So, one important point about directed graphs versus sequences: we showed the example of the sequence where you go around in a circle; each thing has something that it's connected to in a certain order, but that's not really a directed graph. A directed graph is where you have a branching process that is dependent in time, and that's a little bit different from a sequence, where the order is kind of obvious.
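A sketch of the random-walk side (my own simplification of the scheme just quoted): build the transition matrix of the directed graph, add self-loops at sinks so every row sums to one, and take the k-step return probabilities as positional features; the reverse-direction walk would use the transpose of the adjacency in the same way.

```python
import numpy as np

def rw_encoding(n, edges, k=3):
    """k-step random-walk positional features on a directed graph."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = 1.0
    for v in range(n):
        if A[v].sum() == 0:
            A[v, v] = 1.0                    # self-loop at a sink node
    T = A / A.sum(axis=1, keepdims=True)     # rows sum to one
    feats, P = [], np.eye(n)
    for _ in range(k):
        P = P @ T
        feats.append(np.diag(P))             # probability of landing back home
    return np.stack(feats, axis=1)           # shape (n, k)
```

On the directed path 0 -> 1 -> 2, only the sink node ever returns to itself, so the encoding separates the end of the path from everything upstream.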
A
"In figure seven we show the number of topological sorts over the sequence length p for a type of compact, deterministically constructed sorting network. For such networks, at a sequence length of just eight, the number of equivalent sequentializations already exceeds one million." So this is a pretty large combinatorial space. "Note that in the worst case, a directed graph has n! topological sorts." That's a big number; therefore, representing directed graphs as sequences can introduce a huge amount of arbitrary orderedness.
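The n! figure is easy to check by brute force on small DAGs; a sketch counting topological sorts (linear extensions) recursively:

```python
from math import factorial

def count_topo_sorts(nodes, edges):
    """Number of topological sorts (linear extensions) of a DAG."""
    preds = {v: set() for v in nodes}
    for u, v in edges:
        preds[v].add(u)

    def go(remaining):
        if not remaining:
            return 1
        # place next any node whose predecessors are all already placed
        return sum(go(remaining - {v})
                   for v in remaining if not (preds[v] & remaining))

    return go(frozenset(nodes))

# With no edges at all (the worst case), every ordering is valid: n! sorts.
```

A chain 0 -> 1 -> 2 -> 3 has exactly one valid ordering, while the edgeless graph on the same nodes has 4! = 24; the gap between those two extremes is the arbitrary orderedness a sequence representation has to absorb.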
A
A note on some of these different applications: there's an application to sorting networks. This is something that goes back to Donald Knuth. These are classic comparison-based algorithms with the goal of sorting any input sequence of fixed size with a static sequence of comparators.
A
In a sorting network you have these different lines, you're comparing between them, and out of that you can build a directed graph. It's a little bit hard to see how this applies to development, but I think it has a nice set of applications for thinking about Transformers, thinking about topological data analysis, and how to connect all of this with development.
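The sorting-network idea in one sketch: a network is a fixed list of comparators, and by the zero-one principle it sorts every input of length n if and only if it sorts all 2^n binary inputs. The 3-input example network below is a standard one, not taken from the paper:

```python
from itertools import product

def apply_network(network, seq):
    """Run a fixed list of comparators (i, j) over a sequence."""
    seq = list(seq)
    for i, j in network:
        if seq[i] > seq[j]:
            seq[i], seq[j] = seq[j], seq[i]
    return seq

def is_sorting_network(network, n):
    """Zero-one principle: check all 2^n binary inputs."""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product([0, 1], repeat=n))

# A correct 3-input sorting network as a static sequence of comparators
net3 = [(0, 1), (1, 2), (0, 1)]
```

Interpreting the comparator list as a directed graph of data dependencies is what makes this a natural benchmark for direction-aware Transformers: dropping the edge directions collapses correct and incorrect networks onto the same undirected graph.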