From YouTube: 04 - TensorFlow 2.0 Ecosystem - Josh Gordon
Description
Deep Learning for Science School 2019 - Lawrence Berkeley National Lab
Agenda and talk slides are available at: https://dl4sci-school.lbl.gov/agenda
But if you go to tensorflow.org, what you should do is ignore pretty much everything on the website except what you find at tensorflow.org/beta. Go to Learn and then TensorFlow (our website has been a little slow recently, not sure why) and then click on TF 2.0 beta. I know it's sort of hidden up here; you'll find it more easily through the URL. What you'll find on this tab are all the latest tutorials and guides and everything we're writing for version 2.
If you haven't learned TensorFlow 1: well, first of all, you do not need TensorFlow 1 knowledge at all to use TensorFlow 2, and if you have not previously used TensorFlow 1, you should just start directly with TensorFlow 2. Google has learned a lot, and the community has learned a lot, about developing deep learning frameworks in the last few years. The field itself is advancing really fast, but so is the software engineering side, and as we've learned about the needs of Googlers and the community, we've adjusted the library to match.
So, first of all, every single tutorial and guide that you find on this page is runnable end to end, and we're really proud of this. It means the code all works. All of these tutorials are actually Jupyter notebooks on GitHub, and the way we build our website is that we have a script that reads each notebook, converts it to HTML, and sticks it on the web page. So for any of these, you can go to GitHub.
If you want, you can download one and run it locally, or you can run any of these in Colaboratory. Let me give you a quick overview of Colab. I know you're going to be using lab hardware today, but Colab is awesome if you're at home. Colab is a free Jupyter notebook environment provided by Google Research; it's basically Jupyter notebooks in the cloud. What's important about Colab for our examples is that it comes with a free GPU. So when you open up a Colaboratory notebook... whoops.
So don't stick any data on there that you want to keep. You'll notice I'm connected now, and that means I can use this exactly like a Jupyter notebook. At the start of a lot of our tutorials there's one thing you'll need to keep in mind: by default, Colab has all sorts of software installed on it. It's got scikit-learn, it's got TensorFlow 1, it's got PyTorch, it's got Keras, all sorts of great libraries.
You need to install TensorFlow 2 in Colab at the start of all our 2.0 beta tutorials, so you run that first. Commands that start with a bang run as shell commands inside Colab: if you do !ls, you'll see the directory you're in on your VM; if you do !pip, you can install software. This is the only thing that takes a moment, but you'll have to kick it off when you start any of our tutorials. Here's how you enable a GPU.
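As a rough sketch, the first cell of those beta tutorials looked something like this (the exact version pin is an assumption; it moved forward with each beta release):

```python
# Install the TF 2.0 beta inside Colab; the pinned version is an
# assumption and changed as new beta builds shipped.
!pip install tensorflow-gpu==2.0.0-beta1

import tensorflow as tf
print(tf.__version__)  # should report a 2.0 beta build
```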
If you go to Edit and then Notebook settings, you can pick a hardware accelerator. Don't choose a TPU: TPUs will require a small amount of code changes, and we're working on making that no code changes, but for right now, GPUs on a single machine will just work out of the box, and your code is identical. There's nothing else you need to do.
So you can enable the GPU, and when you reconnect and hover over this guy up here, you'll see it's connecting (I know, it's like two-point text) to a GPU back end, which is great. All right, before I explain what this tutorial does, I want to show you the code for a neural network, and what we're looking at here is this.
If we had written and defined our model like this, we would have multi-class logistic regression. The simplest way to define a neural network in TensorFlow 2 is using what we call the Sequential API, which says: I'm going to define my neural network as a stack of layers, one after the other. Ninety-plus percent of machine learning problems in practice fall into exactly this pattern.
What you can see is that if instead we wanted a neural network, rather than a logistic regression or linear model, what we could do is this: we add another layer, we choose the width of the layer, and we give it a ReLU activation, and now we have a neural network. And if we wanted a deep neural network (this is a new laptop, so I'm not familiar with the keyboard), now we have a deep neural network, and now we have a deeper neural network.
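A minimal sketch of that progression (the 28x28 input shape and layer widths are assumptions, in the style of the MNIST beginner tutorial):

```python
import tensorflow as tf

# A single dense softmax layer: multi-class logistic regression.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Add hidden ReLU layers and it becomes a (deeper and deeper) neural network.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
```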
One thing that's a blessing and a curse about deep learning, as you're finding out, is that there's a ton of concepts. No one should come to a school like this new to this stuff, see terms like dense layers and ReLU, and go, oh yeah, I totally have intuition for what's happening at all these layers. It takes a long time to learn the concepts. But what I hope you've just seen is that writing the code is much easier today than it used to be, which is a really good thing.
Sequential models (and I'll explain what Keras is in a second, because this is slightly weird): when people develop with this style of code, it feels imperative; it feels exactly like regular Python programming. But what you're actually doing behind the scenes is defining a data structure, and your data structure here is very simple: it's a stack. What this means should be interesting even if you've been developing with Keras for a while and didn't realize it.
When you say model.compile here, we're using a very high-level API to set up our model. You can choose the way you want gradient descent to work (maybe you've seen SGD, maybe you've seen Adam or things like that); you can choose optimizers out of the box. You can choose your loss function: categorical cross-entropy is what you would use when you're doing classification; it's basically a loss function that compares two probability distributions. And you can choose the metrics that you want printed out.
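A sketch of that compile step, assuming the model above and integer labels (with one-hot labels you would use plain categorical cross-entropy instead):

```python
# Optimizer, loss, and metrics are all chosen out of the box here.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```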
What's interesting is that when you say model.compile, at compile time we can obviously do checks, basically making sure the shapes of your layers are compatible, and that means we can catch programming bugs before you start training and running your models, which is a really good thing. The next style of code, which I'll show you in a minute, is a little bit more flexible than this, but we can't check things at compile time.
You can train your model with a single line of code, and whenever you train a deep learning model, the very first thing you should look at is overfitting and underfitting. One of the reasons deep neural networks are so successful for things like classification is that they're super powerful: if you define a really deep network (by deep I mean the number of layers), you make the layers wide enough (by that I mean the number of neurons, or units, per layer), and you train the thing for a long time,
it's probably going to memorize your training data. Because of that, it's also very likely to badly overfit and do a horrible job on your validation and testing data. There are lots of different knobs you can tune to prevent networks from overfitting: there are things you can do like adding regularization, like L2 or dropout, and you can reduce the size of your layers.
You can mess around with your optimizer, but the most important thing is just this single number here: epochs. An epoch means you've used every example in your training set once to update all the weights of your model. So an epoch is a single sweep through your training set, a single round of gradient descent with every example. The longer you train these things, the more tightly you're going to fit the training data, and so you can basically figure out the right number.
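Training really is one line; a sketch (the x_train/y_train and validation names here are assumptions):

```python
# epochs is the knob being discussed; validation_data makes Keras report
# the validation loss alongside the training loss each epoch.
history = model.fit(x_train, y_train,
                    epochs=10,
                    validation_data=(x_val, y_val))
```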
You make a plot: you plot the loss on your training data, and you also plot the loss on your validation data. When you start training, all your weights are initialized randomly, so the loss on both your training and validation data will be decreasing. As your model begins overfitting, or memorizing the training data, the loss on your validation data is going to start increasing; it's going to get worse and worse and worse, and when that begins to occur, that's the right number of epochs to train for.
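A sketch of that plot, using the History object returned by fit() above (assumes matplotlib is available):

```python
import matplotlib.pyplot as plt

# Training loss keeps falling; validation loss turns upward once the
# model starts memorizing. Stop training around the turning point.
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()
```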
So you train these models until your validation loss starts increasing. A lot of what people focus on when they start learning deep learning (and one cool thing about giving this talk at the lab) is messing around with stuff like this: what's the right model architecture, what types of layers should I use? But in practice, this is by far the smallest part of doing deep learning successfully. It's all about thinking about your problem. What are you trying to model, or what are you trying to classify?
How do you evaluate it? How do you know you've done a good job? How do you know your models are going to work in production, when they're deployed on data you've never seen before? It's really thinking about your experiment that's hard, setting up the right experiment; once you have that done, this part you can mess with and figure out. Anyway, that's what you'll start on today. Let me just run this code really quickly.
Anyway, it's running model.fit right now. There are also methods like model.predict, and you can see that for every epoch it's reporting the loss on the training data; there are parameters you can set to have it automatically report the loss on the validation data too. Anyway, here's the first thing that's weird. Who has used a library called Keras before? Okay, so half of you. Here's what's interesting: if you go to keras.io (keras.io is wonderful), this is an independent open-source project, nothing to do with TensorFlow.
If you do pip install keras, you get what you find at keras.io. Behind the scenes, Keras will automatically install another deep learning library, and that can be TensorFlow, it can be CNTK, it can be MXNet, whatever. What Keras is, and why Keras is so successful (and you've just seen this in the getting-started-for-beginners example), is that Keras is an API spec. Keras basically says: there are different ways to define your deep neural networks. One such way is the Sequential API, where you define a model and you add layers to it.
You can compile it. What Keras doesn't say anything about is how to multiply matrices quickly. So when you actually need to train these things, Keras uses another library behind the scenes to do the math, and another library to handle getting the work onto the GPU or whatever. Anyway, this was extremely successful. This was the first deep learning library that was really, really user-focused, really clear and easy to use, and we loved it on the TensorFlow side.
So now it's also built into TensorFlow. If you do pip install tensorflow, every example that you find at keras.io will work if you change the imports: you take an example, and instead of importing from keras.models, you import from tensorflow.keras.models, and the rest is the same. TensorFlow 2 is a superset of Keras, and it has stuff that you won't find on that web page. If you're new to deep learning, keras.io is a wonderful place to start: anything you learn at keras.io will feed directly into TensorFlow, so you're not wasting your time.
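For example, a sketch of porting a keras.io snippet, where only the import lines change:

```python
# Standalone Keras (what keras.io shows):
#   from keras.models import Sequential
#   from keras.layers import Dense

# TensorFlow's built-in Keras: same code, different import.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([Dense(10, activation='softmax', input_shape=(784,))])
```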
In fact, there's a whole book that I'd really strongly recommend: Deep Learning with Python by Francois Chollet, which is by far the best book to start your deep learning journey from the practical developer side. It's 40 bucks, there's no math, it's not an academic textbook; it's basically "here's how you do the thing."
So if your goal is "show me the simplest way to train an image classifier or a text classifier" (and by simplest I mean simplest but not black box; it's not "model equals my magical text classifier, train it", you're at least defining the model piece by piece so you understand what's happening inside), it's wonderful: Deep Learning with Python. TensorFlow adds a lot on top of this, and right now we're talking only about Python. So let me show you some Python that's different in TensorFlow 2, which you won't find at keras.io.
If we go back to our getting-started page and we look at TensorFlow for experts, this is a very similar model to what we saw in the beginners example (it's another MNIST thing, which maybe I'll walk you through in a sec), but the model is defined in a very, very different way.
If you've been doing deep learning for a while, this might look like Chainer or PyTorch. What we're doing here is defining our model by subclassing a class defined by the library; here in TensorFlow we call it Model, and different frameworks will call it different things. This should feel a lot like object-oriented NumPy development: you set things up in the constructor, and by the way, of course, you can add parameters as you like.
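A minimal sketch of the subclassing style (the layer sizes here are assumptions, not necessarily the tutorial's):

```python
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # Layers are created in the constructor...
        self.flatten = tf.keras.layers.Flatten()
        self.d1 = tf.keras.layers.Dense(128, activation='relu')
        self.d2 = tf.keras.layers.Dense(10, activation='softmax')

    def call(self, x):
        # ...and the forward pass is ordinary, imperative Python.
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

model = MyModel()
```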
If you have a tensor, you can just call t.numpy() and now you're back in NumPy, which is great. So this is wonderful for learning, not the basics, excuse me, but exactly the details of what's going in and out of these layers: what are the shapes, what does my data look like? It's great for debugging. One thing that's new in TensorFlow 2 that I also want to show you is that there are two ways to train, and by the way, this is also a Keras model.
Keras has three ways of defining models. It has Sequential, which you should always start with and always use first, because sequential models are the easiest to debug and they're also the easiest to share with friends. If I'm looking at code from a student and she writes it using the Sequential model and there's a bug, I can immediately see it; it takes like 30 seconds tops. If she writes it using this subclassing style, it can take me 15 minutes to find it.
That's because this style is new and there are few standards for how you write your code this way. With the Sequential model, your code is a data structure; with the subclassing style, your model is Python bytecode, which means you can do whatever you want, but it's also tricky to debug. So that's the trade-off. They're both wonderful; you can't go wrong with either. There's also the functional API in Keras, which some people really love. Sequential is a stack; the functional API is what you would use if your model is a DAG, a graph.
It gives you a little bit more expressivity than the Sequential model. All of these models can be trained in two different ways. Regardless of whether you use the functional, Sequential, or subclassing style, you can train your model with model.fit, which you should always start with unless you want to poke around with the details. And here's how you poke around with the details: in TensorFlow 2, you can also train your models with what we call a gradient tape. Basically, what a deep learning library is,
in a nutshell, is a matrix multiplier, because under the hood almost all these layers go forward and backward by multiplying matrices. This is true of TensorFlow, CNTK, MXNet, all of them: they multiply matrices, and they can also do that on a GPU, great. All of the deep learning libraries have different ways of defining layers, and they have automatic differentiation, and that's what we're seeing here under the hood. TensorFlow uses reverse-mode autodiff, and this is basically writing model.fit from scratch.
The tape will trace all the operations that are nested in this "with" block, and it literally plays them back to compute the gradients. So here, this is our forward pass: we're making some predictions on images and calculating our loss, which is a single number. What's great is that if we say tape.gradient, we're saying: TensorFlow, please give me the gradients of the loss with respect to all the variables in my model. If you print those out, you'll get the raw gradients; they're Python lists.
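A sketch of that custom training step (names like images, labels, loss_fn, and optimizer are assumptions):

```python
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam()

def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images)          # forward pass, traced by the tape
        loss = loss_fn(labels, predictions)  # a single scalar
    # Gradients of the loss w.r.t. every trainable variable in the model.
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
```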
This means that if you happen to be much smarter than me, and you're doing research on optimization and implementing, say, the new Berkeley optimization method, you can implement it in just regular Python, and it's very, very easy to poke around with. One cool thing that TensorFlow 2 does, the only piece of non-standard Python: if you did TensorFlow 1, you learned about sessions and graphs and placeholders, and all that stuff is very cool, but TensorFlow 2 does away with it.
If you do tf.function, here's what it will do. One source of slowdown in deep learning libraries is that behind the scenes, TensorFlow is a C++ engine and we're writing code in Python. When these operations are actually executed, we go from Python to C++, compute the result, and send it back to Python, so we're ping-ponging back and forth line by line, which is slow. If you add @tf.function, what you're saying is: hand this entire function to the C++ back end. I'm not a compilers engineer, but if you are, then you probably know all the different optimizations
you can do to code if you analyze it statically. So: compile the code, optimize it, compute it, send back the result once. This can give you anywhere from a 0 to 10x speedup on your code, and it's a single line; it's awesome. Anything that you can stick inside tf.function, you can stick inside a TensorFlow SavedModel if you're exporting things. The catch is that tf.function makes your code slightly harder to debug; the error messages might make less sense.
So the way this works: when you're developing your models, don't use tf.function; develop your model, debug it, all that stuff. When you've finished developing, if you care about speed (which I often don't, but if you do), add a single tf.function. The way to find the best practices for this is to look at our advanced tutorials. You don't need to add it on top of everything; it's recursively applied, so you don't need to sprinkle your whole code with it.
Usually we just stick it on top of our training loop, and that's it. So we like this a lot; it's super user-friendly.
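A tiny sketch of the decorator (the function and values here are made up purely for illustration):

```python
import tensorflow as tf

# The decorated function is traced once and then runs as a graph,
# avoiding the per-op Python/C++ round trips described above.
@tf.function
def scaled_sum(x):
    return tf.reduce_sum(x) * 2.0

print(scaled_sum(tf.constant([1.0, 2.0, 3.0])))  # tf.Tensor(12.0, ...)
```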
What's really relevant for the lab is distributed training. In the guides on tensorflow.org/beta, you'll find a guide for distribution strategies, and I want to show you what distribution strategies are. Here's some Keras model that we've defined to do whatever, and we want to run it distributed. The most common case of distributed training is one machine with multiple GPUs, and this is called data parallelism.
There's a parameter called batch size, which is how many examples you use per round of gradient descent. Larger batch sizes mean more accurate updates. So the simplest way to do distributed training, given one box with a lot of GPUs, is to increase your batch size as you add GPUs. Let's say the most one GPU can handle is 32: with two GPUs, you give 32 examples to each GPU, they each do the forward pass and backward pass, and you average the gradients.
What's nice about distribution strategies is that that's the complete code for data parallelism, and more strategies are being developed: different strategies for different network configurations, different numbers of machines, and different numbers of accelerators. It's really cool stuff, and what I like about them is that they're super user-friendly.
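A sketch of the single-machine, multi-GPU case with MirroredStrategy (the model itself is a placeholder):

```python
# Build and compile the model inside the strategy's scope; fit() then
# splits each batch across the available GPUs and averages the gradients.
strategy = tf.distribute.MirroredStrategy()
print('Replicas in sync:', strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
```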
Even on a single machine, by the way, a big gotcha is your data input pipeline. Let's say you're doing something like training an image classifier and you're reading examples off disk. A huge slowdown here is what's called GPU starvation: one issue might be that your GPUs are faster than the code you've written to read images off disk. If you have a small dataset, the easiest way around this is to just use NumPy: load the whole thing into memory, you don't have to worry about it, have a nice day. If you have more energy to invest, you can use something called tf.data.
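A sketch of a tf.data pipeline for images on disk (the glob pattern and image size are assumptions); the point is to overlap reading and training so the GPU stays fed:

```python
import tensorflow as tf

def load_image(path):
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image, channels=3)
    return tf.image.resize(image, (224, 224)) / 255.0

dataset = (tf.data.Dataset.list_files('images/*.jpg')
           .map(load_image, num_parallel_calls=tf.data.experimental.AUTOTUNE)
           .batch(32)
           .prefetch(tf.data.experimental.AUTOTUNE))  # overlap I/O and compute
```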
So, we haven't published these yet; they should be on the website later this week. I just sort of pirated the code and uploaded it, but that's okay, it's about to be open sourced anyway. What this is is a tutorial for image segmentation: it will train an image segmentation model on the Oxford Pets dataset. The reason I wanted to give you a link (and you can just jot this down
A
So
you
have
it
later:
it's
bitly,
/tf
seg,
it's
just
a
jupiter
notebook
and
what's
nice
about
this,
is
it
runs
in
about
five
minutes
or
less
so
a
lot
of
our
advanced
tutorials
like
cycle
gand
I'll,
show
you
in
a
sec
can
take
a
little
bit
longer
to
train,
but
this
is
fast
enough
that
you
can
do
it
almost
interactively,
and
so
it's
a
really
nice
advanced
example.
That's
fun
that
you
can
play
with
quickly,
so
TF,
seg
and
I
can
give
you
the
things
later
too
and
there's
another
one.
A
that I'll show you; I'll walk you through this. I just made these slides a second ago, so it's a little funny. This is a code example for Deep Dream, and it's based off a GitHub repo that I have, but this one is cleaned up so it runs a lot faster, using TensorFlow 2 best practices; mine's kind of crap. So this is bit.ly/tfdream.
Oh, yes: the reason this requires sign-in is that I didn't have time to upload it to my GitHub account, so what this is is just a notebook sitting on my Google Drive. If you can't access it, I'll fix that in a sec; I just want to get it to you. If other people are having trouble accessing it, I'll fix that right after the talk; I probably just messed up the sharing settings. So: tfseg and tfdream.
Let's do this: let me talk briefly about TensorFlow beyond Python, and then I want to talk a little bit about deep learning very, very quickly. I know you've covered a little bit of convolution; I just want to say a few more words about it. Then we'll do linear regression, just to show you the mechanics of writing TensorFlow 2 code from scratch, and then we'll do Deep Dream, and there's a reason we're doing Deep Dream and linear regression together.
Every time I pick up my laptop to do this, I unplug things; it's horrible. What we're looking at (carefully, carefully; it is not meant for this many people) is a model called PoseNet, and even though I'm filming you right now, this is relatively private: no data is being sent to the cloud. This is all running locally in Chrome, and it works in Firefox too, your favorite browser. It's entirely in JavaScript, and it's GPU accelerated, so it's fast. And what's interesting is, I
just bought this a couple of days ago because my personal laptop died. So this is my home laptop, and it's the cheapest MacBook you can buy right now. It's still not cheap, right, it's like 1,100 or 1,200 bucks, but the point I'd like to make is that even on this hardware, this is running at, I can't see it, probably like 20 frames a second. This is fast, and it's doing something that's pretty sophisticated.
So the reason I'm showing you JavaScript (and this may not be relevant to the stuff you're working on at the lab, but it's really cool)... did I destroy it? No, it still works. Great question: is it related to the Xbox? My guess would be that it's not related to the Xbox, I suspect, but I could be totally wrong. The Xbox has some sort of radar-type device to physically measure your distance; this is purely vision.
Why on earth would we go to JavaScript, which is probably even slower? As soon as I saw stuff like this, I realized I was totally wrong. The reason we care about doing machine learning in JavaScript, and this is a huge game-changer: of course we want JavaScript developers, and people using other languages who haven't switched, to be able to develop deep learning, obviously. But the real reason we care about JavaScript is that it runs client-side, so this gives you another deployment option.
As a Python developer, the way I deploy models is to start up a REST API, and the crappy way to do that is with Flask or whatever you want. If you're at a large place like a lab or a company, you can use something called TensorFlow Serving, which is part of the TensorFlow ecosystem. It's exactly the same code that Google uses to serve models internally. You can download it; it's a C++ library; it will load models you saved in Python and throw up a REST API.
So that's high performance, but it takes some time to set up. Deploying in JavaScript means you can just push models out to your users, and they run client-side, which is really, really cool. It's a new paradigm, it's only about a year old, and we're seeing tons of cool applications all over the place. Since I'm talking about the ecosystem: TensorFlow is a very, very big project.
A huge thing right now is Swift for TensorFlow, and this is something Chris Lattner and others are working on.
Swift is a modern language; it's compiled, it's fast, and there are a lot of engineering hours being poured into this project, basically implementing TensorFlow in Swift. There's a whole class you can take on it from a company called fast.ai. It looks really promising, so if you happen to be a Swift developer, that is a completely legit place to be; there's no need to use Python. For R, J.J. Allaire from RStudio also did a phenomenally good job implementing TensorFlow in R.
One really useful project is TensorFlow Hub, and TensorFlow Hub is a library of pre-trained models. I want to give you some caveats: they're working on upgrading TensorFlow Hub for TensorFlow 2 right now, so some examples work with TensorFlow 2 and some don't, just FYI. Let me talk a little bit about deep learning, and then we'll do some more code. Deep learning is representation
learning, and I want to add a few more words on convolution. Usually when I teach deep learning, I start with convolution instead of dense layers because, if we had more time, it's very easy to interpret what a single dense layer is doing; but when you have a DNN defined as a stack of dense layers, who knows exactly what features the subsequent layers are looking at. You can see it very easily with convolution, though. Convolution is not a machine learning concept, and it's something you've probably used.
This is just a code example, in SciPy, to do convolution on an image to detect edges. Convolution is how all the filters in Photoshop work, for sharpening things and blurring things and finding edges and stuff like that. Something interesting: SciPy has a bunch of built-in pictures, and this one shows the astronaut picture, so, I like astronauts.
My first question is: does anyone know who the astronaut built into SciPy is? Because this is a science-y place; she's famous enough to get built into SciPy. Yeah, I know you could use your phone. Anyway, it's Eileen Collins, and she was the first woman to command the Space Shuttle Columbia, which is a big deal. Anyway, I just want to show you what convolution means. You'll see this a lot in deep learning: we take terms that mean a lot of things.
If you're an electrical engineer, you know more about convolution than I ever will. In deep learning, by convolution we mean slide, and here's how we slide to detect edges; I'll show you this fast and then slow. Convolution starts with something called a kernel, or a filter (you'll see in deep learning there are multiple names for the same thing, all the time, just to make it fun to learn). Here's a filter that can detect edges. There are a couple of important things about it. The first thing you notice is that this filter has nine numbers.
Eight of them are negative one; one is eight. The intuition for why this can detect edges is: if we take the dot product of this filter and an area of the image, the dot product is going to be zero if all the pixels have the same intensity, and it's going to be a larger number if the pixels are different, that is, if there's an edge.
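A hedged sketch of that edge-detection convolution (the exact code on the slide may differ, and loading the astronaut image from scikit-image is an assumption):

```python
import numpy as np
from scipy import ndimage
from skimage import data

# The edge-detecting kernel: sums to zero over flat regions,
# responds strongly where neighboring pixels differ.
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]])

image = data.astronaut().mean(axis=2)    # convert to grayscale
edges = ndimage.convolve(image, kernel)  # large values near edges
```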
If you think about it, if you wanted to detect edges in an image using a dense layer, you're going to have a bad time, to quote South Park. You're going to need a very wide layer, and that dense layer needs to learn to detect edges separately at every chunk of the image. You'll have different neurons that learn to say: is there an edge in the top-left corner? You'll have more neurons: is there an edge next to that? This is stupid. With just these nine numbers, though, we can find edges all across the image.
So it's many, many, many times more efficient. This actually isn't much more than linear regression: y = mx + b, to find the best-fit line, is two parameters, the slope and the intercept, and here it's just seven more, and we get edges everywhere in the image. So convolution has a lot of nice properties. Here's what I mean by the dot product, just so you can see, and this is how the library works.
We take an image, we take our filter, and we have an output image. We drop the filter on some chunk of the input image and take the dot product: that's just 1 times 2, plus 0 times 0, plus 1 times 1, and so on and so forth. You sum it up, and that's the output pixel. Then we convolve, meaning we slide, and we take the dot product again, and we get another output pixel. We convolve and we convolve, and now we have an output image. And there's more to it, too.
Here's what the deep in deep learning means. I stole these slides from a friend of mine, Martin Gorner; he's a much better artist than I am. Here's an image, and an image isn't 2D, it's 3D: an image often has three color channels, red, green, and blue, and if you printed the shape of this thing in NumPy, you might see ten by ten by three, for height, width, and depth (red, green, blue). And what's interesting is... too much Diet Coke.
We can convolve in 3D in exactly the same way we convolve in 2D, so we're still doing a dot product, but now our filters are three-dimensional; filters always pass through the entire depth of the image. So if we start convolving this filter, we're still taking a dot product, and as we slide we get output pixels, and I'm just going to fly through it: if we slide for a long time, we get an output image. In Photoshop, you write the filters by hand; here, they're learned.
Great question: if your image doesn't have symmetry, or the filters don't evenly divide the image, you can use concepts like padding and stride to deal with that. But for right now, just to fly through it (excellent question), I'm going to pretend the filter evenly divides the image; you don't have to worry about that. Padding and stride are also, by the way, part of how you deal with images that are different sizes.
If you look at the TensorFlow API docs for convolution (maybe I'll show you that in a sec), you'll see a whole slew of options. But here's the important point in deep learning: with more filters, you get more output images. These might be edges in different orientations, and you can have as many as you like. Let me show you what such a layer might look like in code.
If we pull up our beginner tutorial on convolutional neural networks: this tutorial is a bit of a lie. It's not really a CNN tutorial; it's trying to say, "I am the minimum amount of code you need to quickly train a convolutional neural network," but it doesn't teach you about CNNs. FYI, maybe we'll update it later. What we're looking at with this layer here is saying: give me 32 filters. So let's say the input image to this layer was 10 by 10 by 3.
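A sketch of the kind of layer that tutorial uses (the 3x3 kernel size and the padding choice are assumptions):

```python
# 32 filters, each sliding over the full depth of the input.
# With a 10x10x3 input and 'same' padding, the output is 10x10x32:
# one output image per filter.
layer = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same')
```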
Each filter outputs an image, so that output would be 32 deep, all detecting different features. And what's really nice is that the deep learning part happens at the next convolutional layer. At the first layer, the filters are learning features of pixels, which are basically edges and colors. At the next layer, if we have a bunch of filters again, we can have a very, very deep image, but now these filters are learning features of features, and features of edges are basically shapes. At the next layer,
you're learning features of features of features, which are textures and whatnot. So deep learning is this: you learn a hierarchy of features, all of which are learned automatically, and it's a really powerful concept. What's great about it is that if you start your deep learning journey with convolution, you can visualize exactly what these filters are detecting, and you can actually do tricks with them using things like Deep Dream. And then, only at the very end, by the way,
(the filters are very easy to understand at the first couple of layers, but if you have a CNN that's 20 layers deep, who knows what layer 19 is detecting; you can use Deep Dream to poke around with it) at the very, very end is when you have a dense layer, and your dense layer does the actual classification. So what your CNN is doing is acting as a really, really cool feature preprocessor that you get for free. And there are a couple of different cool research directions, too.
I may not have the slide I'm looking for. What you learn about when you start deep learning is image classification, which is really, really important: given a picture of a cat or a dog, predict whether it's a cat or a dog. Key skill; you can spend a long time on that. A much more interesting question is: given that the model says you've got a cat, why did it say I have a cat?
What features is the model looking at that it used to make the prediction? You might not care about that for cats, but it's useful for doing basic science in other domains. I'm sure you've all heard about this: Lily Peng, a few years ago, became really famous for doing work on diabetic retinopathy detection, so you've probably seen these pictures.
If you Google around for "Google research blog diabetic retinopathy," you'll find she did an experiment where, you know, a patient takes a picture of their retina, and she tried to classify the scans of the retinas as diseased or healthy, to assist ophthalmologists. Basically, Lily did really amazing work, not in writing a fancy image classifier, but in applying it to an important domain. So that was image classification applied somewhere where it mattered.
Can you predict somebody's blood pressure based on a picture of their eye? The answer turned out to be yes, and then also: which pixels in the eye are indicative of blood pressure? And you can actually see them. Anyway, image classification and interpreting how these things work are sort of two sides of the same coin; both are important.
So let me show you some... all right, let's do this: I'm going to point you to a couple of examples, then I'll walk you through linear regression, and then I'm going to walk you through Deep Dream. In terms of examples, here are some of the latest tutorials we just published. (I'm good, I've got Diet Coke, but I've been having too much Diet Coke. Thanks.)
Great question. The question was: within the same layer, do the filters all need to be the same shape? Usually you want the output to be the same shape; all the outputs here are these rectangular volumes, which just makes it easier for the next layer. But you could absolutely have different shapes in your output, and you could absolutely have different shapes of filters.
There are things like ResNet that have, you know, skip connections and things between the layers. But one such paper, which you just hinted at, is called Inception. One question in computer vision research is: what's the right filter size? And what Inception basically said was, we don't know; so at each layer, Inception will run, like, a one-by-one filter, a three-by-three, a five-by-five, and it basically combines the results. It's sort of "let's try everything," and it helped. So yes, you can have different filter sizes. In our tutorials,
almost always we'll just have a single size. One challenge with deep learning is that it's hyperparameter soup: for almost any paper, you'll find a million different parameters you can play with. How many layers? What are the sizes? What are the different activation functions? (Thank you so much.)
You can experiment for a long time developing these models (I was going to say divining; it's a bit more of an art than a science right now). One such project, speaking of the TensorFlow ecosystem, and I don't have slides for this: if you Google "Keras Tuner," Keras Tuner is a library that makes it easy to do hyperparameter tuning, which means trying different combinations of things and seeing which works well. I only know a couple of ways to do this manually. One way to tune hyperparameters, which you should not do, is grid search.
That's trying combinations of a couple of different settings, and it's very slow. Slightly faster than grid search is random search. The reason you don't do grid search is that settings that are very close to each other often have basically the same performance, so you're wasting time by doing grid search; so you use random search. Even better, if you're a mathematician, which I am not, there are all sorts of search algorithms you can apply to hyperparameter tuning, and the Keras Tuner library has some of these built in. It looks really, really, really good.
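A hedged sketch of Keras Tuner's random search, based on the early kerastuner API (names and arguments may have shifted between releases):

```python
from kerastuner.tuners import RandomSearch

def build_model(hp):
    model = tf.keras.Sequential()
    # Let the tuner pick the hidden layer width instead of grid search.
    model.add(tf.keras.layers.Dense(
        units=hp.Int('units', min_value=32, max_value=256, step=32),
        activation='relu'))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

tuner = RandomSearch(build_model, objective='val_accuracy', max_trials=10)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val))
```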
In TensorFlow right now, we have really, really good GAN tutorials, and the reason is that they're fun to look at, so we've been spending a lot of time developing them, and we have this nice sequence of GANs. What a GAN is: almost all the problems you look at while you're learning are classification problems (given a picture, classify it), or you might do regression (you know, predict a price, or a probability, or the weather).
A much harder problem is image generation. If I say to you, don't classify the image, but synthesize me a picture of a cat, that type of problem is of a very different order than classifying things. The reason it's hard is that in deep learning, everything we do needs a loss function: all these DNNs are trained by gradient descent, and the way we get the gradients is by backprop.
The problem is, if you want to synthesize a picture, we need a gradient that tells us whether our picture is good or not. The insight, in 2014, from Ian Goodfellow, is that the way we can generate images is by training two networks in parallel. We use one model, the discriminator, and the discriminator is just an image classifier; its only job is, given a picture, to say: is this a real cat, or is this a cat that somebody synthesized?
We have a second network, which is a generator, and the generator starts out knowing nothing about cats; we teach it to generate increasingly realistic cat photos over time by training it against the discriminator. This is called adversarial training, and it gives us a loss function that we can optimize against. By the way, all these tutorials should also link to the papers they're based on, so you can read more detail. Our first GAN tutorial is important because it runs fast, but it's boring.
It works with MNIST, which you'll see, you know, forever. What we're looking at here is just a little GIF the tutorial produces that shows you the digits it's learning to generate over time: it starts with random noise, and they get increasingly better. By the way, just a detail: we seed the generator with random noise so it doesn't learn to produce exactly the same image again and again and again, and we've fixed the random seed for each of these plots, which forces it to generate the same image,
so you can actually see the progression. Anyway, DCGAN: great, proves the point. That was a later paper, but very quickly you can do much more sophisticated things with GANs. This is a model a lot of you have probably heard about recently; it's called pix2pix, and it's from a wonderful group out of Berkeley. There are a whole bunch of datasets for it; the input images here are these facades, well, not beautiful ones: the output is beautiful.
Here are these (probably grad-student-produced) cartoon drawings of facades, and here are the buildings they correspond to, and this is the output of the pix2pix model. The reason I'm showing you this is that this little web page will run end to end in Colab: if you click the Run button, it will download exactly the dataset you see here, train the model, and show you the output, which is this. So it's beautiful. I mentioned experimental design being important.
Another thing that's obviously important, I'll just say, is not being a... and I bet a lot of you have heard about pix2pix recently just because some people built a crappy company based on pix2pix, which I think is now sunset. There's a lot of good work you can do with deep learning; you can think about,
say, how these models are analyzing patients' eyes. But if you're also, like, a teenager, you can do really silly, pointless things, and that's just something we're dealing with as a community right now. Anyway, another beautiful, beautiful paper from the same group at Berkeley is CycleGAN, and this is real; this is what the tutorial makes. With CycleGAN,
there are cases where it's very hard to collect paired training data. One such example is day and night: if you have a picture of downtown San Francisco during the day, it's hard to get exactly the same picture at night, because cars move around and stuff like that. But there are also datasets where it's almost impossible to get paired training data, because the paired training data doesn't exist in nature.
What the authors of this paper realized is that although you can't get a one-to-one mapping, what you can do is get a directory of horses and a directory of zebras. The adversarial learning problem here is that the generator produces an image of a zebra, and the discriminator can't figure out whether it's real or fake: could this image of a zebra belong in my zebra directory? The loss function also forces the generated image of a zebra to closely match the input image of the horse.
If you stack the two up, you'll see they're almost identical. So we have these two loss functions, and if you look at the code, you'll see it's almost identical to pix2pix. In fact, that's how we wrote the tutorial: we import the entire pix2pix model and slightly change the loss function. A lot of these cool tricks in deep learning are just thinking up new loss functions that describe the problems you care about, and then training models. Yeah? Yes, good eyes.
So, the question was about the background noise and stuff like that in the output. There are a couple of reasons for the background noise. Almost all of these tutorials will run in a few minutes; CycleGAN is one of the few that does not, and it starts to push the limits of Colab. Colab is meant for interactive research or interactive development; we just run our tutorials in Colab because that's what we expect users to do before they install TensorFlow on their local machine, and we didn't train this one that long.
Yeah, there's a wonderful journal called Distill, which is nuts. It's at distill.pub, that's d-i-s-t-i-l-l dot pub, and it has some of the best work around on understanding exactly what these networks are doing under the hood to classify images or do whatever you want. It's research in interpretability, and it's the best that I'm aware of; all of their articles have these beautiful, interactive demos.
If you want to learn about checkerboard artifacts in convolution, they have a whole little piece explaining exactly why those happen; it's an artifact of just the way filters work. So distill.pub is nuts: they publish very rarely, but they maintain a super high quality bar. I think it was started at Google by Chris Olah, but he left recently, and there are contributors from all over the place.
Another thing that might be of interest, and we just published this tutorial, I think, about a week ago, so I'm not an expert in this area: if you'd like to learn about adversarial examples, you've probably heard about these. An adversarial example is an image. This is a panda; to me, this looks like a panda. But by adding this noise to the panda, we can trick the classifier into thinking that it's something totally different. Adversarial
examples are interesting because they reveal weaknesses in the way these models work. Often we'll train these image classifiers and go, yes, pat self on the back, I have this super 99%-accurate model; but really, under the hood, it's not doing what we think it is. This ties in really nicely to the work on interpretability: it would be good if we understood these models better, so we'd have more confidence in the way they work.
Yeah, great question: for this, and for the GANs, what level of TensorFlow are we using to write them? The answer is that it's often a mix. So, let's check. Here, with adversarial examples, we're using the gradient tape, and the reason is that we need the gradients. The simplest way to create an adversarial example, which is what's implemented here, is to get the gradients of the loss with respect to the image.
Basically (and I could be wrong), we get the gradients just as if we were going to do a normal step of gradient descent, and then what we do is take a giant step, really quickly, in the wrong direction. Maybe, under the hood, all these images lie on some manifold, and we're just jumping way off it, and that totally fools the classifier. Because we need the gradients, we're writing this part with the gradient tape.
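A hedged sketch of that fast-gradient-sign idea ("model", "image", and "label" are assumed to exist already):

```python
loss_object = tf.keras.losses.CategoricalCrossentropy()

with tf.GradientTape() as tape:
    tape.watch(image)                    # the image is an input, not a variable
    prediction = model(image)
    loss = loss_object(label, prediction)

# Gradient of the loss with respect to the input image itself.
gradient = tape.gradient(loss, image)

epsilon = 0.1                            # step size: a giant step the wrong way
adversarial_image = image + epsilon * tf.sign(gradient)
```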
However, to actually get the image classifier, we're using basically regular Keras. In addition to TensorFlow Hub, there's something wonderful called Keras Applications, and this is what I would personally recommend: Keras, in both standalone Keras and tf.keras, has a whole box of famous image models built in. Here we're downloading one such model, called MobileNet, and MobileNet has gotten very popular recently because there's a lot of interest in running models on phones, or in web browsers, or on mobile devices generally.
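A sketch of loading a pretrained model from Keras Applications (MobileNetV2 here; the tutorial may use a different variant):

```python
# Downloads ImageNet weights on first use.
pretrained = tf.keras.applications.MobileNetV2(weights='imagenet')

# Helper for turning output probabilities back into readable ImageNet labels.
decode = tf.keras.applications.mobilenet_v2.decode_predictions
```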
That relates to one research direction recently which isn't rocket science but is super, super valuable: basically, how can we train accurate models with fewer parameters? Fewer layers, smaller layers, more efficient functions, so they can run on different devices; there's always the speed-accuracy trade-off. Anyway, MobileNet has a whole bunch of different versions that run fast on different devices.
So this one is a mix. And if you look through the GANs, like DCGAN, you might see that one model is defined using the Sequential API, and you might find another defined using something else. The nice thing about TF 2.0 is that you have different options based on what you're doing, and you can mix and match.
Another really good collection (we're going to separate these out when we launch the library) is for when you want to learn a lot about how this works under the hood. So if you're saying, Keras is great, but I have my own idea for the Berkeley layers library or anything else like that, and you want to write your own, let me point you at resources you can use to figure out how to do that. On the tutorials web page there's this guides section, which will break off later.
If you want to learn exactly how Keras layers and models work, this one is really, really excellent, and it will also introduce the Sequential, functional, and subclassing APIs, which is great. We also have this awkwardly named, needs-to-be-expanded collection which walks you through exactly what tensors are and how they interoperate with NumPy, how exactly tf.function and AutoGraph work, and stuff like that. Some of these guides are excellent, so there are lots of details for you to chug through.
And yeah, you can see this is under active development, so you can see there are different strategies supported right now for different styles of TensorFlow. By the way, in TensorFlow 2, Keras is what we call the recommended API, so if you're starting TensorFlow 2 now, you should use the Keras libraries. TensorFlow is a huge project, and there's another wonderful API called Estimators.
These were originally inspired by scikit-learn, but they grew to become a little bit more complicated. They're very, very popular internally, they're totally supported in TensorFlow, they're wonderful, and they're fast. But if you're starting today, you should probably start with Keras, just because it's a little bit easier to use. If you have existing code that happens to use these things, it's still supported; no worries. So, your question was: in TensorFlow 1.6, it was difficult to write layers (yes, I agree with you); has this problem been solved in TensorFlow 2?
Yes, it has, so I'm very, very happy with TensorFlow 2. I don't want to make an Apple joke, like "courage," but it took courage for the team to pivot the library, and this wasn't BS, unlike the courage from Apple to ditch the adapters, or ditch the ports, I forget exactly what they did, but yeah. So this guide will show you how to write custom layers, and what's really nice about it, as with all the guides, is that you can run it.
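A minimal sketch of a custom Keras layer in the style that guide teaches: weights are created in build(), the math goes in call():

```python
class MyDense(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Weights are created lazily, once the input shape is known.
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
```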
TensorFlow 2 is a big project, and one of my favorite things about Google is that it's very much bottom-up. If I had an idea, which I don't, but if I had an idea for, like, the Josh library (instead of Keras, I want to do JoshLib), I could probably take a swing at it, and if it was good, maybe I could open-source it and get users. So we have a lot of people trying a lot of different ideas.
This is how Keras came to be: no one told the Keras team, no, don't do this, so they did it, and it worked really well. We've had a lot of ideas that haven't worked so well. Slim, by the way, works extremely well, but it has a relatively small user base that has done really, really good work. In particular, Slim has a ton of really awesome
pre-trained models; they've done an excellent job. TF-Slim has a GitHub repo with, like, 10, 20-plus famous image classifiers with complete code and all that. It's great, but it's not what we're standardizing on. It's not that Slim is bad; we just picked Keras because it's got a larger user base and it's a little bit more mature and easier to use. I don't know if Slim is deprecated or not. Another thing that's interesting about TensorFlow: when you're writing your data input pipeline, you've basically got two choices.
You can use NumPy, which is what you should start with, and then, if you feel like it, you can graduate to tf.data. tf.data can be faster, but it's also harder to use, and it's a trade-off: basically, if you have an engineering team, tf.data is what you want; if you're a single developer just hacking around, probably start with NumPy and use tf.data if you feel like it. It's just that performance tuning takes some hours to get right.
So if you're writing your input pipeline with tf.data, you should probably benchmark it and start playing around with it. Anyway, I'm going to show you linear regression from scratch, and then I'm going to show you Deep Dream, and the code is almost the same, which is exactly why I want to show you this. Surprisingly! Yes?
Can I talk about TF-Agents? Unfortunately, no; I'm not a reinforcement learning expert, and I've never used TF-Agents. My manager does know a lot about it; you can find him on Twitter, Magnus Hyttsten, or check out the TF-Agents GitHub repo. Almost certainly there's somebody in the room who knows a lot about reinforcement learning and can talk to you about it; you can try and find them during the workshop.
That's another interesting thing about TensorFlow, by the way: on the TensorFlow GitHub site, you will find a whole zoo of different projects like TF-Agents that are being implemented in TensorFlow, in the actual TensorFlow code base. You'll find a ton of them, and this is a really, really nice thing, both for us at Google and outside of Google.
Sketch-RNN is awesome. It looks like a toy for kids, but it's not. If you learn about RNNs, you'll learn about classifying text and about generating text, which is great; we have a tutorial that will teach you how to generate Shakespeare. You can also apply the same idea of generating text to drawings, and that's Sketch-RNN. It's tiny, but here it's loaded for pineapples, so if I start drawing a pineapple, Sketch-RNN is going to try to autocomplete my pineapple.
So this is extremely cool, and it's also very surprising, right? Obviously this is not going to put an artist out of work any time soon, but you can imagine a more serious implementation of this, where you had a tool that... you know, I get writer's block a lot; maybe you could help artists with artist's block, or if my job was to generate clip art, maybe I could see a bunch of possibilities to speed up the process. What's cool is, probably, if I start drawing an octopus... that was amazing.
Yes? (No glasses, can't see you... I see now.) Yes, exactly: it was an app that Google made that had people draw stuff, and it's called Quick, Draw!. You'll notice there's a privacy note somewhere on here: when you play Quick, Draw!, there's nothing identifiable, but it saves your drawings. What's interesting is that Quick, Draw! used to be really easy; it would be like, draw, I don't know, a truck. Now it's really, really hard, because they have a lot of data in there enriching the training set.
So I can't do this... it says draw a camera, but if you start drawing a camera, Quick, Draw! will guess things like glare or suitcase. Oh, I know, it's a camera. Anyway, that drawing goes into the Sketch-RNN database, and what's interesting about the drawings is that I think of drawings as pictures; they're not, they're sequences. And because we have a sequence of brush strokes, you can train an RNN to continue the scene. That's how Sketch-RNN came to be, and on the Magenta website, which is magenta.tensorflow.org,
they have implementations of all this stuff. I should also mention, it's awesome that they share their code. A lot of our tutorials on the website are meant to be relatively minimal examples; it's not "let's train the world's most accurate image classifier," it's "show me some code that will get me started."
These examples, though, are intended to be awesome: they're the code directly from the papers, so they take a lot more time to go through, but if you're serious about learning how this stuff works, it's all right there, which is super, super cool. The other project I wanted to mention, when I just randomly went to the GitHub repo, is called Mesh TensorFlow, and this is probably really interesting to Berkeley Lab: it's for, like, super-distributed training. There's a talk from the TensorFlow Dev Summit
you can watch that will go into Mesh TensorFlow in more depth, if you're, say, a statistician, which I am not. This is another cool thing about deep learning, by the way: I'm an average Python developer, which means I can help people out with their deep learning models, and I'm okay with ML, but you'll see, right off the bat, that there are all these really, really deep sub-disciplines.
Another cool thing: if you feel like contributing to the TensorFlow ecosystem, our whole docs repo is on GitHub. So if you see something in one of these tutorials that can be improved, or you see something that doesn't make sense, please file a pull request or raise an issue, and we'll do our best to fix it.
A
Another really interesting project is federated learning, and this might be of interest to you if you're doing research in privacy. Federated learning asks this question: let's say all of us are users, and we want to train a model that can tag our photos.
A
So Google Photos does this now, but let's say Google Photos doesn't exist. You have a picture on your phone, and we want a model that says: that's a picture of you on vacation with your dog. Let's say that we want to train this model together, but none of us wants to upload our images to a server, which we don't. How can all of us learn a model together while keeping our data private? That's called federated learning, and it's a really cool research area.
A
There's an implementation in TensorFlow, and there's an article on our blog that you can read about it; that's TensorFlow Federated. Something to be aware of for some of these projects, by the way: there's a ton of them, and what I would do before you dive into one is check the activity log. You want to find projects that are being actively developed, maintained, and worked on; there might be some stuff in here that's a little bit older.
A
So let me just explain why we have this. What this notebook does: this is writing linear regression at the lowest level possible. You could do this with Keras, but this is pretending we didn't have it. What this notebook does (I'll go really quickly and just give you the highlights): it generates some random data, a noisy distribution, and, as you might expect, it finds the best-fit line. The other thing this notebook does, in case...
A
...this is the first time you've seen it, and you're new to gradient descent and you want to poke around with exactly how gradient descent works: at the end, the notebook has code to produce this plot. What we're looking at here is that when we do linear regression, we start with a random guess for m and b, the slope and intercept. Our random guess might be up here; it's plotting what those values were, and on the z-axis it's plotting the loss, the squared error. And then, at each step of gradient descent...
A
...you can see how the loss decreases, and it's just a nice diagram. The reason I like this is that it's real. I've seen this diagram in a lot of slides, including slides that I've made, but it's nice just to have a little code that makes it. It also makes it easy to think about gradient descent, right? So here we know that linear regression has a global minimum; deep neural networks do not, as a piece of trivia. It's been a long time since I took calculus, but I remember it, and I hadn't heard of neural networks...
A
...they weren't a thing then. But when I took calculus, I learned about local minima and global minima, right? And if somebody had told me at the time, "hey, with these DNNs, it's unknown if they have a global minimum, and if they do, we don't know if we can ever find it," I would have said, "right, okay, then training...
A
...these things with gradient descent is probably not going to work," because that's what I learned in school. And my intuition would have been totally wrong, and it turns out that a lot of people made the same mistake. It turns out that to train a DNN to be accurate, you don't need to find a global minimum; you just need to find some point on the surface that works well enough, and it turns out that we can find points that work extremely well. And also, because these DNNs have so many parameters...
A
...apparently it's much harder to get stuck in a local minimum that's very bad. This also makes things easier when you learn about deep learning: there's a whole box of optimizers you can use. This notebook just uses gradient descent written by hand, but you'll learn about things like RMSprop and Adam and stuff like that, and a lot of those ideas have good intuition behind them. So you might look at this and say: well, you know, when we have our initial guesses for m and b, they're probably really bad, because they're random guesses.
A
So when we get the gradient, we might want to take a very large step; and then, after we've taken a bunch of steps, our guesses are probably getting a little bit better, so we'll take slower and slower steps, and you might invent the idea of an adaptive learning rate, or a decaying learning rate. Other things you might invent: if you saw the surface, you might come up with things like momentum, to help you roll out of little local minima and stuff like that. Anyway, it's just really nice.
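Those ideas map directly onto the optimizers Keras ships with; roughly, as a minimal sketch:

```python
import tensorflow as tf

# Momentum: keep rolling through small bumps in the loss surface.
sgd_momentum = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)

# A decaying learning rate: big steps while the guess is bad,
# smaller and smaller steps as it improves.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1, decay_steps=1000, decay_rate=0.96)
sgd_decay = tf.keras.optimizers.SGD(learning_rate=schedule)

# Adaptive per-parameter step sizes, the idea behind RMSprop and Adam.
adam = tf.keras.optimizers.Adam(learning_rate=0.001)
```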
A
The thing I wanted to show you... I just want to show you two things, and then we'll go into DeepDream. For these DNNs, you always need three ingredients. You need a forward pass, or a way to make predictions, and here's the way we make predictions: in TensorFlow 2 these are tensors, but it looks exactly like regular Python. Our forward pass is y = mx + b; so, given an x, predict y. Then our loss function is the squared error. And, oh yeah...
A
What we need is the gradients of the loss with respect to m and b, and the way we get that is: given our training data, we make some predictions, we calculate our loss, and then, outside of the "with" block, we use the gradient tape to get the gradients directly. This is also a really nice example to have, because you can just print these things out and see exactly what they are and what they represent, which is really nice.
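A minimal sketch of that loop, in the spirit of the notebook (not its exact code):

```python
import tensorflow as tf

# Noisy synthetic data scattered around a known line.
x = tf.random.uniform([100])
y = 3.0 * x + 2.0 + tf.random.normal([100], stddev=0.1)

# Random-ish initial guesses for the slope and intercept.
m = tf.Variable(0.0)
b = tf.Variable(0.0)

def predict(x):
    # Forward pass: y = mx + b, written as ordinary Python on tensors.
    return m * x + b

def squared_error(y_true, y_pred):
    # Loss: mean squared error.
    return tf.reduce_mean(tf.square(y_true - y_pred))

learning_rate = 0.1
for step in range(200):
    with tf.GradientTape() as tape:
        loss = squared_error(y, predict(x))
    # Outside the "with" block, ask the tape for dloss/dm and dloss/db.
    dm, db = tape.gradient(loss, [m, b])
    # Gradient descent by hand: step against the gradient.
    m.assign_sub(learning_rate * dm)
    b.assign_sub(learning_rate * db)
```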
A
Another thing, by the way, about this style of code: if you're doing gradient clipping or something like that, you can implement it in regular Python. But what I want you to look at is the training loop; so, it's that. And now I want to explain DeepDream, and it's going to look very, very similar. This is the new example that will be on the website, hopefully this week.
A
So the result was an LSD trip. This was one of the original meme makers: if you were a Reddit user and you had your hands on DeepDream early, you now have a lot of karma, because people just banged out psychedelic images, and they're really cool. So when you look at this, what do you see? First of all, what's the picture that this started life as?
A
Starry Night, by van Gogh, okay. And what has Starry Night become? Or: what is in Starry Night now that van Gogh might not have put in his original painting? Eyes, animals... and because this is by far the highest-resolution screen I've ever presented on, by the way (this is nice), I can see there are wheels, and that looks to me like a dog, right?
A
It hit the internet. So this is a generative model; by a generative model, I mean DeepDream is producing this image, we're not doing classification. All of the things that you see in DeepDream appear in ImageNet, the famous big image database. And the reason that we see lots of eyes and dog faces... there's a cute little nose...
A
ImageNet happens to have lots of pictures of dogs, flowers, snakes, cars, stuff like that. Normally in deep learning, what you do is you have a model, and the model has variables, or parameters, and you adjust the parameters to fit the data; you train the classifier by tweaking these weights. In DeepDream, we start with a pre-trained image classifier, and the goal is not to adjust the classifier at all. DeepDream is an experiment to understand how image classifiers work.
A
What are the convolutional layers that I showed you earlier actually doing? I had this kind of hand-wavy thing, like, "yeah, layer 4 is detecting textures." DeepDream asks: is it really detecting textures, and can we see what the filters are detecting? So the idea of DeepDream is that we're going to start with an input image, and we're going to modify the image to increasingly excite a filter in a pre-trained image classifier.
A
So say we downloaded MobileNet or VGG, or a model that you trained yourself. In the forward pass, we take an image and pass it through the classifier: it goes through layer 1, layer 2, layer 3, blah blah blah, and the softmax at the end says "it's a cat." In DeepDream, we stop at the layer that we care about. So I might stop at layer 4, and I'll ask the model to actually print out the activations, the things that come out of the ReLUs at layer 4. That will give me a list of numbers.
A
There's a very small amount of code, which is why I find it so interesting. Here we're downloading the model; this is the Keras application, it's pre-trained, and we're getting the ImageNet weights right here. If we had a cat image in memory and called base_model.predict on it, it would probably say it's a cat. It's an image classifier. Great!
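Roughly what's on screen, assuming InceptionV3, which is the model the TF2 DeepDream tutorial uses:

```python
import tensorflow as tf

# Download a pre-trained classifier with its ImageNet weights.
# (Assumption: InceptionV3, as in the TF2 DeepDream tutorial.)
base_model = tf.keras.applications.InceptionV3(weights='imagenet')

# With the classification head attached, predictions on a batch of
# correctly sized images come back as ImageNet class scores, e.g. "cat":
# preds = base_model.predict(batch_of_299x299x3_images)
```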
A
Yeah, if you do model.summary, by the way, you'll see a giant list of all the layers in the model, and the important thing here is that these layers have names. And what we're doing is this: the first thing we need is a forward pass. So when we push our image through the Inception model, we get the output of some layers, and here we've selected these layers. And there are two ways to do DeepDream.
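A sketch of that layer selection; the layer names here are the ones the tutorial picks for InceptionV3, so check model.summary() for your own model:

```python
import tensorflow as tf

# base_model is the pre-trained InceptionV3 loaded above.
# Pick layers by the names shown in model.summary().
names = ['mixed3', 'mixed5']
layers = [base_model.get_layer(name).output for name in names]

# A model whose forward pass returns those layers' activations
# instead of the final softmax.
dream_model = tf.keras.Model(inputs=base_model.input, outputs=layers)
```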
A
Our loss (and there's some code here to simplify this) is just the sum of the activations. Normally we do gradient descent; here we're doing gradient ascent, so we actually want to maximize this loss. We want to modify the image to make this number higher, which means that whatever features these filters are detecting, there's more of them in the image. And then, in gradient ascent...
A
...here we get the loss. So this function will forward the image through the network and sum up the activations in some layer. We need to go in the opposite direction, so we're taking the negative of it, and the magic of autodiff is that once we have it set up this way... Before they cleaned this code up, it looked basically the same as the linear regression, only slightly tighter. The magic is autodiff, because we have everything in TensorFlow.
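A simplified sketch of that ascent step (the tutorial layers more tricks on top of this):

```python
import tensorflow as tf

def calc_loss(img, model):
    # Forward the image through the network; sum up the chosen activations.
    activations = model(tf.expand_dims(img, axis=0))
    if not isinstance(activations, list):
        activations = [activations]
    return tf.reduce_sum([tf.reduce_mean(a) for a in activations])

def dream_step(img, model, step_size=0.01):
    with tf.GradientTape() as tape:
        tape.watch(img)
        loss = calc_loss(img, model)
    grads = tape.gradient(loss, img)
    grads /= tf.math.reduce_std(grads) + 1e-8
    # Gradient *ascent*: step up the gradient so the activations get larger.
    return tf.clip_by_value(img + step_size * grads, -1.0, 1.0)
```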
A
The reason this one is a little bit lower-res is this code: there's a whole bag of tricks that you can add to this to generate higher-res, really pretty images, and this is the minimum amount of code to make it work. That's what we get from these filters, but you can play with it to detect different things, and here we're getting lots of eyes and stuff like that. So DeepDream is this really, really, really cool result.
A
What DeepDream is doing is proving that, yes, in the process of training a CNN, you're learning filters; the automatic feature engineering magic is learning filters that detect things we see in the world. And if you look at the older implementation in TensorFlow 1, which has all the tricks (but it's way, way longer)... ignoring the code: the authors of this library visualized every single filter in every layer of a pre-trained CNN, and you can see exactly what you see in a lot of diagrams.
A
So that's layer one. As you move up the network, you'll start to see filters that are responding to textures of different types. These are a little bit harder for me to interpret, but the point is that the patterns are getting more abstract as we move up the network. And the deeper you go, they still don't make sense to me, but they start to resemble things that you might, I don't know, be able to name...
A
...if you really tried. All right, some are pretty, and as you go really deep, you start to get things that are semantic. So here, whatever this is (and it looks like some strange combination of cute dogs and eyes and snakes, and who knows)... what this literally is, is some filter. So if, say, layer 5 of the network is conv2d_64, this might be, like, the eighth filter in that convolutional block, and this is an image that will make that filter super excited.
A
So whatever that filter detects is right here. The reason it's tessellating across the image is that the convolution is sliding; that's why we see the same pattern repeating. But this is a really big deal; it's an amazing insight. And if you play with this for a long time, you'll see some things that are really creepy, because of the snakes. This one is pretty, though; those look like trees.
A
Thanks, Google, for the photo. So this is the MIT Stata Center. Anyway: you start with a photograph, and you start with a painting, and you try to produce a new image that merges the two. By the way, the way you would do style transfer now is with a GAN, which is both simpler and works better, but this is a very close friend of DeepDream. And what you do... we're not just stacking these images. This is also one of these magical artifacts of: given that we have an image classifier...
A
...what else can the classifier do? The idea is this: if we forward both of these images through the classifier, the layers close to the input of a CNN detect edges and shapes, and those are texture-like things, right? The layers close to the output detect eyes and stuff like that, and those are content-like things. So we can write a loss function: we start with an image that's random noise, and the goal is, when we forward this image through the network...
A
There's some math, but anyway, if you scroll through, you'll see the loss function. So that's style transfer. The way you would do style transfer today is CycleGAN, and the authors shared these graphics with us; they're from the paper, but they gave us the high-resolution versions, which I really appreciate. With CycleGAN you can do style-transfer-ish things, right, except they're a little bit higher quality. So you can go from photos to different artists, you can transfer between different artists, and you can do winter to summer.
A
One thing you could do, too (and you won't have time for it during the workshop; you need to train this for like 10 hours, so it's going to be difficult to do in Colab, and you'll want to use your own hardware; the lab has fast hardware, so leave it running overnight and you'll have a good time): winter-to-summer works really, really well. One thing I want to show you about the CycleGAN tutorial...
A
...is that if you want to modify CycleGAN to go from summer to winter or whatever, you can just change a single keyword in that tutorial, run the model, and it will do it. Or you can collect your own directories of images; you could transfer between Berkeley and Livermore or whatever you want. So it's really, really cool. This uses a thing called TensorFlow Datasets; by the way, TensorFlow Datasets is, conveniently, different from tf.data (TensorFlow data).
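Loading one of those datasets looks roughly like this; the dataset name is the one TensorFlow Datasets uses for the summer/winter Yosemite photos, but check the TFDS catalog:

```python
import tensorflow_datasets as tfds

# Load the unpaired summer/winter photo collections as tf.data pipelines.
# ('trainA'/'trainB' are the two image domains in the cycle_gan datasets.)
dataset, info = tfds.load('cycle_gan/summer2winter_yosemite',
                          with_info=True)
train_summer = dataset['trainA']
train_winter = dataset['trainB']
```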
A
TensorFlow Datasets is a large collection of datasets, things like MNIST and other famous ones like ImageNet, that you can import in tf.data format. Great question: what do we do for feature engineering in TensorFlow? Yes, there's a bunch of stuff. So in deep learning, the best place to start... there are, broadly, two classes of machine learning problems.
A
So, first: if you have structured data (and by structured data, I mean you've got a spreadsheet or a CSV file where the rows could be customers, and the columns, which are your features, might describe things like demographic data; ignoring fairness and privacy for the moment: age, gender, income, whatever), it's a small number of features that are very meaningful to us. When you have data like that, traditional models like trees work extremely well; it's very, very hard to beat a decision tree with deep learning...
A
...until you have a lot of data. The other type of problem you can have is a deep learning problem, where you have lots of features, like pixels or words, where individual pixels don't mean much to us, but, because of this feature engineering trick, the network can transform them into more meaningful representations. So you have these two flavors of problems. Deep learning does work for structured data too, but often you need more data, lots of data, before you start outperforming those methods. So if you have a structured data problem, a really strong baseline is a decision tree, and slightly stronger is a random forest.
A
A gradient boosted tree is probably going to be even better. Start there, and then you can see if deep learning can compete. I'll come back to feature engineering in deep learning in a sec. Here's what you can do with deep learning, though, that you can't do with trees: Kaggle and Petfinder have this really awesome new dataset, and the authors at Petfinder gave us permission to use it, which I really appreciate.
A
The reason this is a cool dataset (it's an important problem, but it's also a cool dataset) is that it has three types of data. It has structured data, or tabular data, which is basically fields like the ID of the pet, the name of the pet, the breed of the pet, the gender; these are scikit-learn-style fields, and you might use a tree for them. It also has pictures of the pets, and presumably, if you looked at a picture of the pet, that would be an informative feature. And it has free text.
A
That's something that they wrote, like, you know: "Fluffy is a six-year-old whatever, and she's really awesome and playful." So we have these three types of data: unstructured text, tabular data, and images. And what this means is that this is a good use case for deep learning on structured data, because you can train a joint model that takes all of these things at once. That's really when you want to use deep learning on structured data.
A
Let me point you to the tutorials we have, which are okay; we're working on improving them. I'm not sure this belongs in "machine learning basics," but it's there, a bit haphazardly: "classify structured data." This is just a starting point, so don't copy and paste it and try to train it on a large dataset; it's importing a small dataset, about 300 rows, from the Cleveland Clinic for heart disease, and it's predicting...
A
...whether a patient has heart disease based on this data. And what this is doing is showing different ways that you can represent that data for a neural network. So we do do some feature engineering, even with structured data.
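A sketch of that kind of representation, using the feature-column API the tutorial is built on; "age" and "thal" are columns from that heart-disease CSV, and the bucket boundaries are arbitrary:

```python
import tensorflow as tf

# A raw numeric column, used as-is.
age = tf.feature_column.numeric_column('age')

# The same column, bucketized: the model sees age ranges instead of
# exact ages. (Boundaries chosen here just for illustration.)
age_buckets = tf.feature_column.bucketized_column(
    age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])

# A categorical column, one-hot encoded from a fixed vocabulary.
thal = tf.feature_column.indicator_column(
    tf.feature_column.categorical_column_with_vocabulary_list(
        'thal', ['fixed', 'normal', 'reversible']))

# A Keras layer that turns a dict of raw features into a dense vector.
feature_layer = tf.keras.layers.DenseFeatures([age_buckets, thal])
```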
And let me point you to a tool that you can use; it's called Facets. The way to find this tool: Facets is from a team called PAIR, who do people-and-AI research. So if you search for "PAIR Facets," you'll find this tool.
A
It's not a TensorFlow tool; it's just a useful thing, but it can demonstrate what we do. Facets has this nice thing here where you can upload a CSV file and visualize it; there's a nice little button you can use. The CSV file that's already here is from the US Census (it's a subset of the 1990 census), and the goal (this is like a perfect storm for fairness) is to predict...
A
...whether somebody makes more or less than fifty thousand dollars a year. So we can color the dataset by... nothing will happen... nice. So what I've done is colored the dataset: blue dots are less than 50k, red dots are more than 50k. What's cool is that if you click on a dot, you'll see the row from the CSV file that corresponds to that dot. So this person is 38, they had a capital gain, and this type of data makes sense. Actually, I'm surprised; I think I clicked on a blue dot.
A
So, less than 50k: I'm surprised that somebody had a capital gain that large and made less than 50,000, so I think this is probably an outlier; but anyway, they're a high school grad. Anyway, structured data. What "facets" means here is bucketing; this is basically a tool to poke around with your data and get to know it. So let's say I wanted to facet it, or bucket it: what I could do is bucket it by age.
A
So now we've divided it into age buckets, and we can see that these kids very rarely make more than 50k, and as you move toward people that are in, I don't know, their prime income years or whatever, you see that the ratio changes. And you can bucket it again: if you wanted to poke around, you could bucket it by whatever you want; you could do it by education and jobs and so on. "Facet" is a fancy word for bucket, but one type of feature engineering you might do in deep learning...
A
...is bucketing your data. You might try to make it easier for the model: if you knew, off the top of your head, that it didn't matter whether someone was 33 or 34 or 35, you might get rid of those features and just replace them with simpler ones by bucketizing the data. I think more interesting than this is another tool that they've just released; it's called the counterfactual tool, or rather, it's the What-If Tool.
A
Yes? Oh, thanks, I should stop talking. Thanks, I'll give this last slide... am I out of time? Thanks for reminding me. So here's the last slide, and then I'll give you books and stop talking. This is a tool called What-If, and it's new to me. It finds something called counterfactual examples, and what that means is: let's say you are predicting...
A
I'll give you two books. So the last one is Deep Learning with Python; that's the Keras book, and it's awesome. If you want the TensorFlow 2 book, there's only one TensorFlow 2 book, by Aurélien Géron; it's the top one there. Only get the second edition, which is not released yet, but you can start reading it on O'Reilly's website; they have a free trial. Only get the second edition: the first one teaches TensorFlow 1, which you don't want. So thanks very much, and I'll be around during the hands-on workshop; I can help with any questions.