From YouTube: 01 - Introduction to Machine Learning - Brenda Ng
Description
Deep Learning for Science School 2019 - Lawrence Berkeley National Lab
Agenda and talk slides are available at: https://dl4sci-school.lbl.gov/agenda
A: Good morning, all. Welcome to Berkeley Lab, welcome to the Deep Learning for Science Summer School. We are really glad to have you here; thanks for waking up bright and early on a Monday morning. So, you know, I did want to acknowledge early on that deep learning, of course, has been taking off in recent years, and over the last five years we've really seen deep learning for science take off. The reason that we have 150 of you here, pretty much the room's capacity, is because we all feel there's a lot of promise in applying deep learning techniques to the scientific process. You can read a lot of generic introductions to deep learning and machine learning on the web, but there really is no definitive resource you can turn to for deep-learning-for-science material.

A: So this is really the brainchild of one of our own. I think about a year ago, he felt that there was a need, a gap in the community: to create a targeted event where we could really go into depth on what it would take to get deep learning to work for scientific applications. And that's the reason why this summer school exists. So today we're going to kick things off with Brenda Ng. So Brenda is...
B: So these are the learning objectives in particular. Hopefully, at the end of my talk, you guys, you guys and ladies, will be able to answer these questions: What is machine learning? What's the relationship between deep learning and machine learning and AI, and all that good stuff. You guys, can we... okay?
B: And so with these definitions, again, these days we're still kind of in the early days of deep learning, but already these definitions firm up what researchers like yourselves are going to research, and perhaps aim your curiosity towards. And so let me get into the relationship between AI, ML, and DL, because sometimes there's a lot of confusion. It's actually...
B: First, we have computer science, and within the field of computer science we have artificial intelligence. Essentially, artificial intelligence is the engineering of intelligent machines that can think kind of like humans, and it has its roots back in the 1950s. Back in those days, I still remember, there was propositional logic and all that stuff, but nonetheless there still needed to be some knowledge rules encoded in those propositions. And so how about machine learning, in like the 1980s?
B: Is it possible, just by giving the machine examples alone, for the machine to extract knowledge without explicitly programming such rules? And so deep learning is yet another subset of machine learning: it is machine learning, but it is using neural networks as the vehicle for the mathematical models with which we do this machine learning. And so it has really taken off since the nineteen twenties... and even now.
B: Oh sorry, 2010; it has really taken off since 2010, and it's really proliferating even right now. So I want to give you guys a perspective from a layman's point of view. I gave you these sets of Venn diagrams, okay, but I want to motivate what artificial intelligence is really like. Well, this whole progression from artificial intelligence and machine learning to deep learning is really driven by our human...
B: Laziness, if you will. So artificial intelligence is essentially this: perhaps you have a job that's super tedious and you really don't want to do it, but it's still not so easy that you can just, you know, get a robot to do it, and it requires some troubleshooting and whatnot. So is there a way that perhaps you can, you know, write a script to do it? Artificial intelligence pretty much is motivated by the fact that you don't want to do it; you want to train a machine to do it. Now...
B: Machine learning is the motivation of: okay, granted, I don't want to do it, I want to train someone else, this machine, to do it. But is it possible that, instead of writing down these rules or these conditions for how I want this task to be done, I could just give it a whole bunch of examples, whereby the examples are processed in a way that highlights the important features of the problem? And so that's machine learning.
B: Now deep learning is like yet another level of laziness, where it's like: oh, I don't even want to do any feature engineering; I just don't know how to, or it's just too annoying. So is it possible for me to give tons of examples to a machine for it to just learn the important features by itself? So essentially it's a progression of laziness, but nonetheless it's productive laziness, because here we are in this revolution of deep learning.
B: So there's some history, again, to motivate this: artificial intelligence is really from the fifties, and then machine learning took off, but deep learning is a relatively new field. And so one should not use deep learning and AI in a synonymous sense. It's true that deep learning is the ML algorithm du jour, but it doesn't supplant all of artificial intelligence and the other machine learning algorithms.
B: So now that I've given you a really quick overview of the relationship between AI, ML, and DL, I'm going to get into the workflow, and this workflow is going to be a bit more detailed than the one you're probably used to, because I want to give you a sense of how you would actually do it if you were to go home tonight super motivated to train an ML model.
B: Okay, so generally, in a problem we have inputs, and these inputs might be, you know, from our experiments. So we have some inputs, or knobs, that we can turn, and then, when we run our experiments, generally we get some kind of target. And so that's x and y, and so I'm going to talk about a machine learning workflow in a supervised learning sense, where from an input...

B: ...we do get the labels, or targets, which is the y. Now, in the olden days before deep learning, we had to do feature engineering, and the reason is, for example: if you were taking pictures of everybody here and we wanted to train a face recognition algorithm, we'd probably have to engineer some features, like maybe the shape of the eyes or the width of the nose and others. So that's like before deep learning: someone with a lot of subject matter expertise has to use that knowledge and encode it into this f function.
B: Now, before we train, generally we have to split our data into three partitions, and partitions mean, essentially, three disjoint sets. Generally I do a rule of 80/10/10, meaning: if I have my data, I would generally partition 80% of it into the training data, 10% into the validation set, and 10% into the test set. And think of it like this...
B: Training data is like when you're studying for an exam, like a calculus exam: training data is when you're reading through your notes, your book, and you're looking through the worked-out examples, because immediately, as you see the question, the x, you also see the y, in the same place. So think of it as the worked-out examples, just for your studying, and that's for training of the model. But validation data...
B: It's kind of like the examples that you're supposed to try out, and you should try them out first, before looking at the answer key, because that is to test how well you actually know the material, from the information gleaned from the training data. And then test data, clearly, is like the examples that you would get in an exam: you do it, and that's super important, because you get graded on it. But generally you may not...
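The 80/10/10 split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the talk; the helper name `split_80_10_10` and the toy data are invented for the example.

```python
import random

def split_80_10_10(pairs, seed=0):
    """Shuffle (input, target) pairs, then partition into three
    disjoint sets: 80% train, 10% validation, 10% test."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)  # shuffle so splits share the same distribution
    n_train = int(0.8 * len(pairs))
    n_val = int(0.1 * len(pairs))
    train = pairs[:n_train]
    val = pairs[n_train:n_train + n_val]
    test = pairs[n_train + n_val:]
    return train, val, test

data = [(x, 2 * x) for x in range(100)]  # toy (x, y) pairs
train, val, test = split_80_10_10(data)
print(len(train), len(val), len(test))  # 80 10 10
```

Shuffling before splitting matters for the point Brenda makes later: all three sets should follow the same distribution, which a random shuffle of a homogeneous data set gives you.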
B: So I'm going to take my training data and I'm going to train my algorithm, and generally, before I do that, I need to decide what kind of algorithm, or family of algorithms, to use. So when I say an ML algorithm, the M... it's really just a mapping, and it's not any scarier than a mathematical mapping from inputs and parameters to a predicted output, such that, you hope, the predicted output matches the true output once you've tuned your parameters well enough.
B: Essentially, we are exposing the model to all the instances in our training set, which is here, but all the while we are trying to tune our parameters based on that data, and that's why I've highlighted it as well. And generally, how do we tune these parameters? Well, we tune them based on some loss function that we pre-specified.
B: So, for example, if you're predicting housing prices, a viable loss function might be the MSE. But if, for example, what you're predicting is a class, like, is it a dog, is it a cat, and so on and so forth, then you might want to use something that can handle categorical items, which is the cross-entropy loss. So there are different types of loss functions, but generally, based on what you know about your problem, you have to specify that a priori. Nonetheless, these two are the most popular ones.
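The two losses mentioned here can be written out directly. A minimal sketch (function names and numbers are illustrative): MSE for continuous targets, and cross-entropy as the negative log-probability assigned to the true class.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error, for continuous targets like housing prices."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_entropy(y_true, probs):
    """Cross-entropy for class labels: average of -log(probability
    the model assigned to the true class)."""
    return -sum(math.log(p[t]) for t, p in zip(y_true, probs)) / len(y_true)

print(mse([200.0, 310.0], [210.0, 300.0]))  # 100.0
# two examples: true classes 0 and 1, with predicted class probabilities
print(cross_entropy([0, 1], [[0.9, 0.1], [0.2, 0.8]]))
```

Both losses go to zero as the predictions approach the targets, which is what makes them usable as the training signal in the workflow described above.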
B: So generally, because we have more than one training instance in our training set, we would iterate, and because this is a deep learning summer school, we know that generally, when we train a neural network, we use an iterative process called stochastic gradient descent, or mini-batch versions of it. So essentially, that's the picture: at every iteration, imagine that you have this unknown loss function that you don't quite see, that slopes down, but nonetheless you're able to evaluate it at each point given the theta. So at each iteration...
B: You have your parameters, the theta, and you have your training data, and so you apply those to estimate your target, and from the target you can compute your loss. And so that's what these balls represent: each one represents the loss of a specific instance. And because we use gradient descent...
B: This is the update rule that we see: we have a theta, which is the current parameter, and it's being tuned by some learning rate times the gradient. And we see that when the surface is sloping down, this naturally guides your parameters toward the optimal point, which is at the bottom of this surface.
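The update rule theta ← theta − learning_rate × gradient can be demonstrated on the smallest possible model. This sketch (all names and data invented for illustration) fits a one-parameter linear model y ≈ theta·x by full-batch gradient descent on the MSE loss:

```python
def train_linear(data, lr=0.01, epochs=100):
    """Fit y ~ theta * x with plain gradient descent on the MSE loss.
    Each step applies the update rule: theta <- theta - lr * dLoss/dTheta."""
    theta = 0.0
    for _ in range(epochs):
        # analytic gradient of the mean squared error w.r.t. theta
        grad = sum(2 * (theta * x - y) * x for x, y in data) / len(data)
        theta -= lr * grad  # the update rule from the slide
    return theta

data = [(x, 3.0 * x) for x in range(1, 6)]  # toy data with true slope 3
theta = train_linear(data)
print(round(theta, 3))  # converges to about 3.0
```

Stochastic or mini-batch gradient descent differs only in computing `grad` from one example or a small batch per step instead of the whole set.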
B: So, even though this is an introduction, for those of you who are doing deep learning: oftentimes there are optimizers that do adjustable learning rates. So generally, if you are doing deep learning, you don't really have to worry about the learning rate, because you can use Adam or other types of more advanced optimizers that can tune it for you.
B: But the idea is that, given your training set, you expose the instances of your training set to your algorithm, and that's called a learning epoch. And say you expose the training data multiple times, like multiple passes over this training data: you iterate over them and you tune your theta, with the hope that you're probably pretty close to the optimum, because you've been tracking your loss and it's going down. Then, let's say, we freeze it.
B: How do we know how good the model is? And again, the model is just a mathematical function that takes x and the theta, which you're now freezing, to predict your y. And so that's where the... oh, before you get there: when we ask how good a model is, immediately concepts such as underfitting and overfitting are relevant. So the idea is that M is a function, a mapping from the inputs to the predicted targets.
B: Clearly we don't want something that is underfitting like this: it's clearly not fitting the training examples, so this one clearly needs to go back to the training, the training dojo, I guess, to get trained a bit more. But then here, it's trained so much that it's just memorizing all of the training examples, such that if I were to give it a new example, something that is not in the training set, like this red point...

B: ...it would not fit. And so how do we know whether I'm going to get a good fit? That's where our validation data set comes in. And so you might be thinking, well, yeah, finally, okay: what do we do with the validation data? The validation data is what we use to compute the loss again. So recall that I froze the parameters, because I've already done a lot of training.
B: I've exposed the algorithm to the instances of the training data to the point where I think I'm pretty happy with my theta, so I froze that, and that's what's being passed here. So now, with that model frozen, the parameters frozen, I'm going to use it to evaluate my validation data. And what I would do, generally, even when I do this back in the office: it's super crucial for us to plot the losses, because they give you a sense of what is going on in your training.
B: But we know that generally, again for deep learning or other methods, if we're doing an iterative type of optimization algorithm, we need to iterate this a number of times. So previously I was iterating over the instances in the training data, and now I'm doing this whole thing in an iterative manner, and every time I iterate, think of it as: I am improving my theta. So each of the x's here is one theta, and as I'm training, hopefully my theta...
B: If my training is running right, it should do better. So that's why we see that, at least for the training data, the loss is going down, and it should go down, because otherwise there's something wrong with your code. But for the validation data, you see that at some point it starts to go up, and you might be wondering: why is that? Well, the reason is, when you train too hard... oh, okay. So, ideally, at the optimum, you want to stop training.
B: You want to stop training when the validation loss is at this bottom, and the reason is: generally, when you split into training data, validation data, and test data (I mentioned that you'd split into an 80/10/10 kind of percentage), usually when you split these data sets you also want to make sure that they have the same distribution. Meaning, again using the calculus example: if you have been studying derivatives, that's your training data, and suddenly your practice...

B: ...exam is integrals and stuff. If you'd learned integrals, you would do well; but imagine you guys hadn't learned integrals yet: you would do super horribly on it. So it is very crucial for us to have these three splits follow the same kind of distribution, maybe even have the same support. Okay, so going back: this is the optimum point, because beyond that point we see that the validation loss is going up.
B: However, imagine if you had stopped before the optimum point: then, in a way, you still have room to improve on your validation loss, you see that? So that's what's called underfitting, and this picture, you guys remember this picture from like two slides ago: usually underfitting corresponds to this scenario again.
B: But then, if you train for many epochs and aren't really watching what's going on, as mentioned, you might kind of be over-tiring yourself with your studying, and in machine learning terms, pretty much, you are starting to memorize all your training examples, so that you're not really generalizing anymore. And that's why you see this trend: past a certain point the generalization error, sorry, the validation error, would go up, and you really want to stop before it does. And that's the case.
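"Stop when the validation loss is at its bottom" is the early-stopping rule, and it can be sketched in a few lines. This is an illustrative helper (the name `early_stop`, the patience rule, and the toy loss curve are all invented for the example, not from the talk):

```python
def early_stop(val_losses, patience=2):
    """Return the epoch with the lowest validation loss seen so far,
    giving up after `patience` epochs without improvement."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss has turned up: overfitting has begun
    return best_epoch

# validation loss dips, then climbs as the model starts to memorize
val_losses = [0.9, 0.6, 0.4, 0.35, 0.4, 0.5, 0.7]
print(early_stop(val_losses))  # 3, the bottom of the curve
```

Deep learning frameworks ship equivalents of this (e.g. an early-stopping callback), typically also restoring the parameters from the best epoch.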
B: If we're happy with the model in terms of its performance, then we have a model that's ready to go, that could be deployed on whatever problem you guys might have. But if it's really not so great, then you kind of have to go back and troubleshoot, and I'm not going to sugarcoat it: machine learning types of troubleshooting can be very frustrating. Sometimes, even working with TensorFlow code (can I... okay, sorry, Google people in the room) or any other deep learning library, sometimes they might change...
B: ...like the order of arguments. So, I don't know, things can be very subtle, such that you may not notice unless you really check the sizes of all your tensors. But I'm going too far into implementation. In general, things that you can check are, like: well, do you have sufficient data? Generally, if you... let me go back to this chart.
B: What you would generally want to do is make sure that your training loss gets as close to zero as possible when it converges, and if it doesn't, that tells you that maybe the model doesn't have enough parameters, like expressive power, to solve the problem, or that you may not have enough data. And so those are some of the considerations there. Or remember...
B: This is not deep learning (even though I've been mixing deep learning into this talk); we are using, you know, old-school features, so maybe they're the wrong features for this problem. And so there could be a whole bunch of issues that you might want to consider if your test performance isn't up to what you expect. So now, hopefully, you guys have a good appreciation of what it takes to train an ML algorithm. I'm going to get into deep learning, and then later I'm going to contrast...
B: ...how deep learning is different from traditional machine learning. Oh, I guess I'm doing it out of order, okay. So traditional machine learning, as I mentioned, has this really tedious aspect of feature extraction, and generally it requires expert knowledge about the problem in order to extract the right features, yeah.
B: So, for example, if you are a real estate agent and you're trying to predict housing prices: I mean, if you're a real estate agent, you know whether you're in a good school district, or whether you're too close to the highway, and, I don't know, houses too close to the highway tend to be cheaper than the ones that are in, like, a cul-de-sac. Okay, but yeah.
B: But what I'm saying is that, generally, if you are to do this kind of traditional machine learning, you really need to understand the problem at hand and use that knowledge to craft your features. And then you can put it into, you know, your favorite classification algorithm, and then hopefully it will give you the desired output, which, in this instance, is: it is...

B: ...it is a car. But with deep learning, it's kind of an end-to-end situation, and so you see how this poor person, an Amazon Turk worker maybe, is no longer needed, because we can just dump raw images into the neural network, and as part of its training it's able to learn hierarchical features. And so we don't have to do manual feature extraction anymore; we can just do this end to end. And that's why people really like deep learning: because, as I mentioned, we have better things to do in life than craft features or design features. So what is deep learning? Let's get back to basics. The basics of deep learning: we start with the artificial neuron.
B: Actually, deep learning comes from neural networks, and neural networks are composed of these artificial neurons. And really, these neurons are not simulating how the wonderful neurons in our brains work; they are more like inspired by them. Each is simply a mathematical model inspired by the fact that, just like a regular biological neuron, we have synapses that take signals from neighboring neurons; those come in and interact with this neuron's cell body, and then out comes an output signal, which is then the input to the next neuron down the chain.
B: So if you look at this (and most of you guys have probably seen this before), your input is something that comes from the previous layer, the previous neuron, and then we have these parameters, these w's, that you multiply with your input, and then we also add a bias term, which is the b.
B: And then we pass this sum through a nonlinearity f, and this altogether is what's being passed out as the output signal. And so, if f is just a linear function, then all we can learn are linear models. But generally we choose f to be nonlinear, because chaining a whole bunch of nonlinear functions is what really gives deep neural networks such expressive power. So, yeah: the w's and b's are the parameters.
B: So previously I made a big deal about the thetas; in neural networks, the W's and the b's are your parameters, and those are the ones that you have to tune by exposing your deep learning model to data. So a neural network is really just neurons, but, you know, arranged in a graph. And so, where I showed you one neuron...
B: Imagine now you just have a whole bunch of them, and they're connected in this kind of graphical form. Generally, the input to your problem constitutes the input layer, and then it depends on how many hidden layers you want to put in your model; again, the more hidden layers you have, the more chaining of nonlinearities you're adding, and that will increase the complexity of your model. And generally, your output layer is also dictated by the inference task at hand.
B: So if it's a multi-class classification, then you might have, you know, multiple neurons, corresponding to the number of classes; or if it's just regression, then you might just have one number, because that's what you're predicting. So let's dig deep and look a little bit at what the math is all about. Let's just focus on the input layer: I've been using x as my input, so that's fine.
B: You still have the x, but then, as I propagate x through this model's first layer, I'm multiplying it with the W. So the W1 and b1 are the parameters specific to this first layer, okay? So this is just what I showed you with the neuron cell, but maybe I've put it in kind of matrix notation; this shouldn't come as a surprise.
B: Now, as we propagate the signal from the first layer to the second layer, we see that we've added yet more of the magenta math to it. So the second layer is now transforming this output signal by multiplying it with yet another weight that's specific to this second layer, adding the second-layer-specific bias, and then putting it through the nonlinearity. So I apologize that I'm a little bit lazy here.
B: I should say that you can choose different nonlinearities specific to each layer, but here I've just put f in general; you don't have to, you can have different nonlinearities. And now, what about our last layer, the output layer? Generally, for the output layer, if it is regression, we generally just keep it as a linear thing, without the nonlinearity. And so, altogether...
B: This chaining of mathematical transformations across layers is what constitutes your model, and so that's the same model that was in the workflow earlier, and now the theta are the things highlighted in yellow. So when you're going through your machine learning workflow and trying to train your model, you are actually tuning all these W's and all these b's.
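The layer-by-layer math above can be written out for a tiny two-layer network. This is an illustrative sketch, not code from the talk: tanh stands in for the hidden-layer nonlinearity f, the output layer is kept linear as described for regression, and the weights are made up.

```python
import math

def matvec(W, x, b):
    """Compute W x + b, with W as a list of rows."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def forward(x, W1, b1, W2, b2):
    """Two-layer network: h = f(W1 x + b1) with f = tanh,
    then a linear output layer y = W2 h + b2 (typical for regression)."""
    h = [math.tanh(z) for z in matvec(W1, x, b1)]  # hidden layer
    return matvec(W2, h, b2)                       # output layer, no nonlinearity

# made-up parameters: 2 inputs -> 2 hidden units -> 1 output
W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
W2, b2 = [[1.0, 1.0]], [0.1]
print(forward([2.0, 1.0], W1, b1, W2, b2))
```

All of W1, b1, W2, b2 together are the theta that training tunes; adding a layer just chains one more `matvec` plus nonlinearity.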
B: So what's the difference between just a vanilla neural network and a deep learning neural network? It really is just the fact that you have more layers; that's the "deep". And so, back in the 80s, pretty much, they didn't really have the data nor the hardware to achieve the kind of massive models that we have now. And what's really nice about the kind of neural networks that we can train now, being so massive, is that we can kind of peel back the layers and examine what the model is really learning.
B: This is how it's doing it, essentially for free: it's built into the graphical structure that we can peel away these layers and be able to see what features are relevant. And so sometimes, when we do deep learning, it's not just about training a neural network to predict something. Sometimes, as we will see later, we might want to artificially pose a problem to the neural network, to trick it into learning something cool in here, so that we can then take those features and do something else with them.
B: So just keep that in mind. And so you guys might think, well, what happened? Well, for those of you who were alive then, I guess: what happened between the 80s and now?
B: Well, it's really the confluence of three things. So, long ago, in the beginning, in like the 50s, essentially they could only train really small models, because first they were hardware-limited, and they were also really data-limited. And then in the 80s they were able to figure out, you know, more tricks in order to develop the early neural networks. But the poor researchers were still stuck with really limited hardware; but at least, hey, they got this. But now...
B: We have ImageNet and other big datasets; we have so much data. And for those of you who, you know, take pictures and post them and are super active on social media: you are contributing data every day. So essentially, on data, we have a really good handle. And also, with the investments of Nvidia and other hardware companies really investing in their hardware infrastructure...
B: It's really the confluence of everything, including super-smart researchers who are figuring out new ways to train things deeper and better, with residual layers and things like that. It's a confluence of the smarts, the hardware, and the data that's really letting us overcome those sad, sad AI winters and get to the explosive growth right now. And, you know, explosive growth: I can't leave this without just pitching again that now deep learning is truly everywhere.
B: It's in image classification, as you can tell; so, yeah, when you upload your pictures to any kind of cloud service, oftentimes they immediately categorize your faces. That's all deep learning. And it's also starting to play a really big part in medicine and biology as well, such as: when people have diabetes, they often can go blind because of some kind of diabetic...

B: ...retinopathy (I'm not pronouncing this right). But we are also involved in some healthcare projects where we're trying to help build deep learning models to help diagnose illnesses based on multimodal data, such as radiology reports as well as images, and so on and so forth. Essentially, you are pretty much touched by deep learning everywhere: if you have a phone, it is touching you right now, yeah.
B: So that brings us to the end of a really quick intro to deep learning. Now I'm going to get into the three main branches of machine learning, and again, I promise you, I will not get bogged down spending too much time on the super-classical methods. The reason is, if we were to do this lecture right...
B: ...if it's categorical, that's classification; or if we are predicting a real-valued number, that's regression. Now, unsupervised learning is kind of like things like clustering and dimension reduction, where you really don't get the target: you don't get the y's, you only get the x. Sometimes it's just too expensive to gather the y, so you just want to see what you can do with the x. And in general, in unsupervised learning...
B: The goal really is to uncover structure in this unlabeled data, and so, naturally, things like clustering and dimension reduction seem to be the kinds of approaches one would take for unsupervised learning. Now, reinforcement learning has been getting a lot of attention, because of AlphaGo and all the other cool things, and even autonomous cars. In a nutshell, it is learning actions based on feedback from the environment.
B: Even though most of the time, you know, we're kind of focused on supervised learning, all the other areas play out in everyday life as well. And I was going to say that, even though these branches have technical names, even in our everyday life these types of learning are not too far removed from the way we think. So, for example, imagine you are learning how to drive.
B: You don't know how to drive yet, but you're learning how to drive, and so you're taking your driving school, or maybe your parents are watching you and teaching you. So immediately, that's supervised learning, because if you make a turn too close to the curb, someone's going to, like, I don't know, step on the brake or do something; so, very supervised learning. But then, say, once you're comfortable enough to drive on your own, as you're driving, of course, you are taking actions.
B: You are reacting to the environment. Imagine you have to make a left turn, one of those left turns with no, like, stop sign; you just have to be aggressive. So say the first time, when you had to get to work, you never got to work on time, because you were stuck there for like 15 minutes; but then the next day you're going to be better, and so on and so forth. So, learning from experience: that's reinforcement learning. And unsupervised learning is sometimes, say, when you're driving in a different city.
B: So recently I went to Rome, and those people, they just drive; they don't stop. So, in a way, you can partition, you know, people's internal driving behaviors, so that you can react accordingly. So, pretty much, all three of these branches, even though they're in machine learning, really have, you know, analogs in our everyday life, so they're not that foreign. So let's talk a bit more deeply about how they're different. As we know, in supervised learning...
B: We have our targets, that's the label, right? And generally, once we train our neural network or machine learning model, we would then compare against it. Sorry, so this is what we are predicting, and we're comparing it against the ground truth. And generally, you know, if it's continuous, remember, we use some kind of root mean square, and if it's categorical, we use some kind of cross-entropy. And so that's the loss that is then used as a signal to tell us how well we are doing.
B: How good is my model, and, since it depends on the theta, whether I still need to tune my theta. So that's supervised learning, and we know it gives immediate feedback, because if you're not doing well, immediately you know you're not doing well; you can use that loss signal to tune your parameters. But reinforcement learning, on the other hand, is a little bit different.
B: It's kind of like, well, you have what are called delayed rewards, because sometimes, say you are playing a video game and you are moving your joystick, or, if it's Xbox, you do those things like that, and you may not know that, oh, you should have done something else, until you might die, maybe five minutes later, because you'd used up your ammo or something. So, in a way, the reward is not immediate.
B
It's more like: you took an action, it influenced the state of the world, and that in turn gives you a reward, which you can use as your signal for how well you're doing. Now, unsupervised learning gives you no feedback; it's more like exploring. You just predict, but you may not know exactly how well you did, at least in a mathematical sense — although of course, when we predict something, we bring certain hypotheses and domain knowledge that come with doing machine learning.
B
So now let's dig deeper into each one of them. Supervised learning is mainly split into two sub-categories: classification and regression. Classification is when you are trying to predict something that is a class or category, whereas regression is when you're trying to predict something that is a real number — you know, how tall I should have been if I had slept more when I was younger.
B
Again, it's super simple, and because of that, spammers have gotten really smart about adding specific words that would bias the algorithm, so people generally don't use naive Bayes spam filters anymore. Now, decision trees — again, this is pre-deep-learning — essentially you give them data and they partition the data into very interpretable branches, so that the leaves of the tree are what give the prediction.
B
So the decision tree has the great quality that it's very interpretable. Now, SVMs were pretty hot back when I was in grad school. Essentially you have your data, and the goal is to separate the classes with a hyperplane that maximally separates them; generally, you might transform your data into a higher dimension so that this is possible. Nonetheless, these three — I mean, there are others, but these are the classical algorithms from the classification sub-branch.
B
Now, as for regression: regression has its roots in statistics, and linear and polynomial regression are pretty much what you see here. So I've just covered the classical methods; now let's talk about deep learning. With deep learning, how do we do regression and classification? Well — sorry, I got tired and didn't do the cool animations anymore, so bear with me.
B
The idea is that you still have a network, and if you are doing classification where it's just true or false, then essentially you want to predict a probability: the number you're predicting is the probability that class one is true, and the other class is then one minus that probability. So that's why you only have one output neuron, and it's predicting within the range zero to one.
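That single output neuron can be sketched like this — the weights and inputs are toy values I made up, not anything from the slides:

```python
import math

def sigmoid(z):
    # squashes any real number into (0, 1), so the output reads as a probability
    return 1.0 / (1.0 + math.exp(-z))

def output_neuron(x, w, b):
    # one neuron: weighted sum of the inputs plus a bias, then the sigmoid
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    return sigmoid(z)

p = output_neuron([0.5, -1.2], [2.0, 0.3], 0.1)
print(p, 1 - p)  # P(class 1) and P(class 2) = 1 - P(class 1)
```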
B
So if it's just true/false, once you get your prediction, you compare it against whether the label is actually true or false, using this cross-entropy loss to help tune your parameters. But say we have a very related problem, except now we have multiple labels. So previously it could be: am I female or male? And next it could be: am I over 40 years old?
B
So I could be true on multiple categories, and that's why, instead of just one number, we are now predicting several numbers; those numbers correspond to the number of classes in which I can participate, and that corresponds to the number of output neurons as well. You can see we could keep most things the same: if your input is practically the same, all these layers could pretty much stay fixed, and all you're doing is adjusting the output.
B
First of all, you need to make sure your target is now multi-class, and then you make adjustments to the output layer and adjust your loss if necessary — here we can still use the cross-entropy loss. But if, for example, we wanted to do regression instead, I am predicting just one number, and it's a real-valued number, so it's between negative infinity and infinity. Then you see how I changed my non-linearity from sigmoid to linear, because I don't want to squish the output.
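The swap from sigmoid to linear can be made concrete in a couple of lines — a minimal sketch, with an illustrative `head` function of my own naming:

```python
import math

def head(z, activation):
    # same network body, different output activation
    if activation == "sigmoid":   # classification: squash into (0, 1)
        return 1.0 / (1.0 + math.exp(-z))
    if activation == "linear":    # regression: leave the real value alone
        return z

print(head(5.0, "sigmoid"))  # close to 1 — squished into (0, 1)
print(head(5.0, "linear"))   # → 5.0 — free to range over (-inf, inf)
```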
B
You know, the starting point of the bounding box, x and y, and then how big the box is; and then there are ways to combine both losses together to actually solve this problem, whereby, given this picture, the network will be able to output the category as well as a bounding box. To my knowledge, I don't think classical methods can do that.
B
So that's a big win for those of us who have access to deep learning. Deep learning for supervised learning, again, is pretty straightforward. As for unsupervised learning — that's the second branch, where you don't have true labels, and so you don't really get any feedback as to how well you do — generally we do things like clustering, dimensionality reduction, or some kind of association. Let's dig in a little and quickly go over some of the classic methods. So here, this is k-means.
B
With k-means, essentially you have some data, and you also have to specify the number of clusters. You randomly initialize your cluster centers to some points, then you assign the neighboring points to the nearest cluster as shown, and based on that assignment you recompute the centroid of each cluster. Then you do it again and again until it partitions the data into these kinds of nice neighborhoods — so these are our clusters.
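The assign-then-recompute loop just described can be sketched in plain Python — toy data and a fixed seed, just to show the two alternating steps:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    # pick k random points as the initial centroids
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # update step: recompute each centroid as its cluster's mean
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return centroids

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(sorted(kmeans(pts, 2)))  # one centroid lands in each blob
```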
B
DBSCAN is another one of these clustering algorithms. Imagine your data points are people, and you're telling them to hold hands if they're close to a neighbor — that's really what it does. It associates any neighbor that is close enough into the same cluster, and any point that doesn't have its hand held is a weirdo: it's flagged as noise. And then there's dimensionality reduction, which we generally do as well.
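The hold-hands idea can be sketched roughly as follows — a simplified version in which clusters are just connected components of points within a distance `eps` (real DBSCAN additionally requires a minimum number of neighbors for core points); the names and toy points are illustrative:

```python
def neighbor_clusters(points, eps=1.5):
    # union-find: link any two points within eps of each other ("hold hands")
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps ** 2:
                parent[find(i)] = find(j)
    # connected components with >1 member are clusters; singletons are noise
    comps = {}
    for i in range(n):
        comps.setdefault(find(i), []).append(i)
    clusters = [c for c in comps.values() if len(c) > 1]
    noise = [c[0] for c in comps.values() if len(c) == 1]
    return clusters, noise

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
clusters, noise = neighbor_clusters(pts)
print(len(clusters), noise)  # → 2 [5]  (the point at (50, 50) is the weirdo)
```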
B
Deep learning, I feel, is actually pretty exciting in terms of unsupervised learning, because people doing unsupervised learning within deep learning have been quite clever: they've been leveraging a lot of what's called self-supervised learning. So imagine that we want to compress data, and we have a picture like this. Remember I told you that sometimes we trick our neural network into doing something dumb so that we can get something cool on the inside? This is exactly one of those times. I'm telling it, hey —
B
This is my input: reconstruct this input. That's just weird, right? But by having the layers get progressively smaller, as shown — I've drawn it like a bowtie thing, not because it's pretty, but because it actually narrows the layers — we get down to this latent feature, which is what I want to force the network to compress the data into. And the decoder is essentially the inverse: the encoder, again, compresses the input into this latent feature.
B
So it's a lower-dimensional representation of my data. If this latent vector is called h, then h = f(x), where f is my encoder; the decoder g takes h, and I want to get back my original input. An autoencoder is all of these chained together, where I want the input x and the reconstruction r to be really close, and if this is trained properly, you get a really nice compressed feature that represents your input without all the dimensions of your input.
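The x → h → r chain can be sketched with a tiny linear encoder/decoder — the weight matrices here are toy values chosen by hand, not trained:

```python
def matvec(W, x):
    # plain matrix-vector product
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def autoencoder(x, W_enc, W_dec):
    h = matvec(W_enc, x)   # encoder f: latent feature, shorter than x
    r = matvec(W_dec, h)   # decoder g: reconstruction, back to x's size
    loss = sum((xi - ri) ** 2 for xi, ri in zip(x, r))  # want x ≈ r
    return h, r, loss

# 4-dim input squeezed through a 2-dim bottleneck
x = [1.0, 2.0, 3.0, 4.0]
W_enc = [[1, 0, 0, 0], [0, 0, 0, 1]]       # keeps dims 0 and 3
W_dec = [[1, 0], [0, 0], [0, 0], [0, 1]]   # puts them back
h, r, loss = autoencoder(x, W_enc, W_dec)
print(h, loss)  # h has only 2 numbers; dims 1 and 2 are lost, so loss > 0
```

Training would adjust W_enc and W_dec to drive that loss down.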
B
But the thing is, imagine apple and orange: alphabetically they're kind of far apart, but semantically they're both food, both fruits, and maybe they share that reddish-orange color. Ideally, if you think about it, you want them close in similarity when you represent them numerically, and that's what word2vec is about. It says: hey, given this representation — we call these one-hot vectors —
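A one-hot vector is just all zeros with a single 1 at the word's index — every word is equally far from every other, which is exactly the lack of semantics that word2vec's dense embeddings fix. A minimal sketch, with an illustrative toy vocabulary:

```python
def one_hot(word, vocab):
    # all zeros except a 1 at the word's position in the vocabulary
    v = [0] * len(vocab)
    v[vocab.index(word)] = 1
    return v

vocab = ["apple", "orange", "car"]
print(one_hot("orange", vocab))  # → [0, 1, 0]
```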
B
So you might think: well, that's interesting, but how do you do that? It's actually very similar to the autoencoder. Instead of reconstructing the actual words, what we do instead is predict the neighboring words. It's kind of like: given a word, I'm going to predict my neighbors. Let me make this more concrete for you.
B
For example, if you have these sentences here, like "I like playing...", what you do is take a center word. There are two flavors of this algorithm. You have your center word, and the window of words before and after it are what you call the context words. You see how, as you slide your center word along, you get different context words; your window shifts.
B
As the window shifts, when "playing" is now the center word, "I" and "like" and the words after it become the context words, and so on and so forth. So again, even though with deep learning I said you don't need to do feature engineering, with text you need to at least do some of this one-hot transformation before you can do things with it.
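Generating the (center, context) training pairs from a sliding window can be sketched like this — the sentence and window size are illustrative (this is the skip-gram direction; CBOW just flips which side predicts which):

```python
def skipgram_pairs(tokens, window=2):
    # pair each center word with every context word up to `window`
    # positions before or after it
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sent = "I like playing chess".split()
for c, ctx in skipgram_pairs(sent):
    print(c, "->", ctx)
```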
B
So say you have encoded your text in this kind of data format. Then, depending on whether, given the center word, you want to predict the context words, or, given the context words, you want to predict the center word, you get the two flavors of models that allow you to learn these dense vectors, which we call word embeddings. So again, compared with the autoencoder —
B
Where the autoencoder goes through a lot of layers, word2vec is actually a very shallow model: it's just the one-hot vectors and one hidden layer, and that's essentially it. The word vectors are the ones we're interested in — they're right here — so it's a really shallow network.
B
The way that we're now able to represent words in a semantically relevant way has really paved the way for a lot of the more advanced natural language processing tasks, like captioning and all that. So, another really cool idea in unsupervised learning: GANs. GANs are interesting in that they were originally proposed as an unsupervised method, but they've now been used for semi-supervised learning and more. Nonetheless, you don't really need to have the labels, because all you really need is the input.
B
Essentially, it's a generative model. Previously we talked about classification — is it class 1 or class 2 — and all those models are more like discriminative models, where from the inputs you're just modeling what the output would be. Here, though, we're trying to model, given the inputs, the probability distribution around those inputs, such that we can perhaps use that probability distribution to generate more inputs. The way it works is that there are actually two parts to it: the generator and the discriminator.
B
So it's this two-player game that it's got going. The discriminator, again, is the neural network that takes a picture, not knowing whether it came from the generator or from the true training set, and it tries to maximize the probability of being able to distinguish between the two. The generator, meanwhile, is trying to learn the distribution of the true training data set so that it can essentially fool the discriminator into thinking its images are real.
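One round of the two-player objective can be sketched as follows — purely illustrative: the toy generator and discriminator here are fixed hand-written functions, and a real GAN would backpropagate through both losses rather than just compute them:

```python
import math
import random

def gan_round(real_data, generator, discriminator):
    # D outputs its belief in (0, 1) that a sample is real;
    # G maps a noise value z to a fake sample
    fakes = [generator(random.random()) for _ in real_data]
    # D's loss: it wants D(real) high and D(fake) low
    d_loss = -sum(math.log(discriminator(x)) + math.log(1 - discriminator(g))
                  for x, g in zip(real_data, fakes)) / len(real_data)
    # G's loss: it is happy when D is fooled into scoring fakes high
    g_loss = -sum(math.log(discriminator(g)) for g in fakes) / len(fakes)
    return d_loss, g_loss

real = [0.9, 0.8, 0.7]
gen = lambda z: z * 0.4                   # toy G: outputs stay below 0.4
disc = lambda x: 0.9 if x > 0.5 else 0.1  # toy D: a simple threshold
d_loss, g_loss = gan_round(real, gen, disc)
print(d_loss < g_loss)  # → True: this D separates real from fake easily
```

Training alternates: lower d_loss by improving D, then lower g_loss by improving G.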
B
So it's this two-player game, and even though it's a really cool idea, it's sometimes hard to train. Nonetheless, if you are brave enough to check it out, people have used GANs for data augmentation. What I'm showing you here is from a paper on generating fluid simulations with GANs, where they gave it a whole bunch of 2D and 3D fluid simulations as well as some simulation inputs. This is one of those plots that I showed you, where —
B
This axis is the epoch and this is the loss, and at each step I'm also showing you what the generated fake picture looks like. You see how in the beginning it's not doing so well — maybe it's kind of underfitting — but as it continues, it looks like real fluid simulations. So I've covered autoencoders, word2vec, and GANs.
B
It would be nice to just tell my car to take me to LA and have it do it — we'll see. But yeah, reinforcement learning has been pretty big, because Uber and all the autonomous car companies, as well as a lot of the automation out there in these industries, are putting investment into reinforcement learning. But okay, what is reinforcement learning? As I mentioned: imagine you're an agent.
B
You have a brain, and you get some observation from the environment. From this observation you internalize and think: okay, the state of the world is probably this. Then you take an action, and that action changes the state of the world, which in turn generates a reward, which you receive, along with another observation, and that seeds your next action. This process iterates, and again, it's very intuitive.
B
It's like any one of you who plays video games, or even tries to bake muffins — anything. Essentially, as you're doing something new, you try different things, and then, based on the outcome of whether you succeeded or not, you modify your actions according to what you observe as having been done right or wrong, as well as the rewards.
B
So, as mentioned, the reward is a time-delayed feedback, and the agent's job is really to take actions that maximize the cumulative reward — meaning, if this is a game, you want to maximize not just your next-time-step reward, but your rewards across all the time that you're playing this game.
B
So let's formalize this a little bit more. A Markov decision process is a nice way of describing this kind of decision process. With an MDP, we have an observation space, an action space, and a way to encode how states and actions make the world transition to the next state.
B
So that's the state transition function. There are also your rewards, because that's the signal you will get, and some kind of discount factor, because if you were playing over an infinite horizon, you'd want to discount the rewards so that they're temporally relevant. An MDP also satisfies the Markov property, such that the state at the next time step only depends on your current state and your current action. So with this —
B
Essentially, when you formulate an RL problem, you pretty much have to know your observation space, your action space, all these formulations, and some extra concepts. In general, the agent is trying to maximize the reward, and this reward, as shown, is a weighted sum — a discounted sum of the rewards it would get at each round. This is what the agent wants to maximize.
B
A policy describes how an agent should behave: it's a mapping from state to action. And the value function is essentially a way of saying, hey, if I'm in this specific state, what is my expected reward — how good am I doing, am I sitting pretty in this state? These are the V and Q values that represent how good it is to be in a certain state.
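The discounted sum the agent maximizes can be written out directly — gamma and the reward lists here are toy values:

```python
def discounted_return(rewards, gamma=0.9):
    # sum of gamma**t * r_t: rewards far in the future count less than rewards now
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

print(discounted_return([1, 1, 1]))   # 1 + 0.9 + 0.81 = 2.71
print(discounted_return([0, 0, 10]))  # 8.1 — a delayed reward is discounted
```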
B
Now, how do you go about learning an optimal policy? Ultimately we want to maximize this, and we know that our reward comes from taking good actions and being in the right states. Therefore we need to learn a good mapping from state to action — a good policy. So imagine if we knew how good it is to be in a specific state — or rather, if for every action we knew how good the expected reward would be.
B
Then we would be able to just do an argmax. But this Q function is tricky: how do we know it exactly? We kind of have to play the game and observe what rewards we get in order to find out. So what we can do is frame it as another one of these regression problems, where at any given time you're trying to model what your Q is: you have a model that is trying to predict Q.
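That regression view of Q can be sketched with a table standing in for the network — a toy tabular Q-learning step, with made-up states and actions; DQN replaces the table with a neural network doing the same job:

```python
def greedy_action(q, state, actions):
    # once we know Q(s, a), the policy is just an argmax over actions
    return max(actions, key=lambda a: q[(state, a)])

def q_update(q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    # nudge Q(s, a) toward the observed reward plus the discounted
    # best Q of the next state (the regression target)
    target = r + gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (target - q[(s, a)])

q = {("s0", "left"): 0.0, ("s0", "right"): 0.0,
     ("s1", "left"): 0.0, ("s1", "right"): 0.0}
q_update(q, "s0", "right", 1.0, "s1", ["left", "right"])
print(greedy_action(q, "s0", ["left", "right"]))  # → right
```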
B
Essentially, they take several frames and put them through one of these neural networks with convolutional layers — which will be covered later on — and out come the Q values, one per action, as shown. As a result, anywhere reinforcement learning has a place where it's approximating a function, deep learning has found its way in to fulfill that need, and to do it really well.
B
That's pretty much what I have for reinforcement learning, but I also want to call to your attention that there are other types of learning, such as transfer learning, semi-supervised learning, and active learning. Transfer learning is: once you have a model that you've trained and you now have a related problem, you don't want to throw away all your hard work of tuning the parameters — is there a way you can transfer that knowledge over? That's transfer learning. And semi-supervised learning is kind of a mishmash between supervised and unsupervised.
B
Imagine you can only afford a really small training set that has labels, but you have a whole bunch of other data that is not labeled — that's the unsupervised part. Is there a way to perhaps learn structure from the unlabeled data and then use that to provide good initial values for the supervised problem? So there are different ways of mixing supervised and unsupervised. And lastly, active learning is when, again, you don't have much training data — you only have a few labeled instances.
B
Is there a way to have a model that not only makes the prediction, but also tells you how uncertain it is, so that you can compute some kind of entropy? You would then use that information to know where your model is uncertain at a given time, so that you can selectively say: okay, now I want to choose the point where I know my model does not do well, and get a label for it.
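That entropy-based selection can be sketched in a few lines — the predicted class distributions below are illustrative stand-ins for a real model's outputs:

```python
import math

def entropy(probs):
    # Shannon entropy of a class distribution: higher means more uncertain
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(pool_predictions):
    # pick the unlabeled point with the highest-entropy prediction —
    # its label buys the model the most information
    return max(range(len(pool_predictions)),
               key=lambda i: entropy(pool_predictions[i]))

preds = [[0.95, 0.05], [0.5, 0.5], [0.8, 0.2]]
print(most_uncertain(preds))  # → 1 (the 50/50 point is the one to label)
```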
B
Deep learning is really a team player with all three of them. With supervised learning, we've already seen it: it's super easy to go between regression and classification — you can even do them together; remember the cat and the bounding box, which is super cool. With unsupervised learning, we saw that people use clever ways of creating a fake signal: remember, for the autoencoder the target is the input itself, and for word2vec —
B
— the signal was the nearby words. So there's really no true labeling involved, but it uses an artificial yet meaningful question within the structure itself to do supervised learning of that structure. And reinforcement learning — well, I've only talked about DQN, which uses deep learning to approximate the Q function, but there are other works where reinforcement —
B
Sorry — deep learning can also be used to learn the policy mapping directly, as well as to approximate the models that go into the Markov decision process. And once you have those models, if you have a good approximation, it boils down to a planning problem, which is easier to solve — well, in most cases. But that's pretty much it, and if you have questions, feel free to speak up.