Description
Date: 02/05/21
Presenter: Hunter Schafer
Institution: University of Washington
Title: "Designing CSE 163, an Intermediate Data Programming course"
http://sbdh-prod.ideas.gatech.edu/resources/newsblog/education-and-workforce-working-group
A
Engineering here in Seattle, at the University of Washington. Yeah, I see David's here, so the question about his experience with assessment of data science courses and programs. So what I'm going to talk to you all about today is the design of a new course, about two years old at this point, that I was hired to create here at the University of Washington, called Intermediate Data Programming, and I'll explain a little bit.
A
How do you give students meaningful, open-ended projects to work on things that they're interested in, that really motivate them, but in a way that somehow makes sure that they meet learning objectives? And how do you frame that to students in ways that they can understand? I'll talk a little bit about one idea we had, which is this notion of challenge goals: giving them explicit ways of spelling out how they can challenge themselves and working on that. So I'm going to go ahead and share my screen.
A
I have some slides that I tried to get down to about 10 minutes, and I'll try my best, but I want to give you a little bit of a lay of the land of what we're doing here and what this course is. I don't have a bunch of... my camera went all the way over there. There we go. Can you all see my slides all right?
A
Great, thank you. I don't have a bunch of numbers for you today. This is more of an experience report, but hopefully you can see some things that we tried here that you might find interesting, and definitely feel free to use any of this as inspiration.
A
I want to give a shout-out to my partner, who actually drew this intro slide for me. It's a very nice-looking thing that actually covers everything that we do in this course. So before I explain what the course is, I've got to give a little bit of context of where it fits in. Before I got to the University of Washington, we had a main intro programming series, CSE 142 and 143.
A
These are large intro programming courses in Java, and they used to serve as one of the main routes by which students learn programming at the University of Washington. This is great for CS majors, but a little less great for everyone else on campus, because a lot of the need right now is in processing data for research.
A
So about 10 years ago we created a CS1 course in Python, CSE 160. This is a much smaller course. It's usually taken by grad students, but it's intro programming, and we found it limiting: because it's a CS1 course, they're learning for loops and functions, and it's really hard to get very far when you are teaching them everything from scratch. So this is where I came in: trying to create a course that follows on from this, that gets...
A
It gets to go a little bit more advanced into data analysis, but really focusing on the programming perspective, and so this is where, whoops, this is where CSE 163 came from. There, good. Sorry, my scrolling didn't work quite right. Importantly, though, a lot of students end up taking our first course, CSE 142 in Java, because it's the main programming course, and we didn't want students to be siloed off. So one of the really interesting things about this course is that anyone can take it as long as they have some prior programming experience.
A
I was teaching about 90 students, and this year I'm teaching about 300 students a quarter, so we're rapidly growing this course and we want to grow it more. I'm not going to talk too much today about broader data science education at the University of Washington. Actually, my colleague Ben Marwick will be here next month talking to you about one of our new data science minors, but this is one course that is a really core part of that minor in terms of the programming aspect.
A
So I talked about the context. What is this class? I kind of describe CSE 163 as four key learning competencies, or high-level learning objectives. We want to do more advanced programming concepts, because it's a CS2-like course. It's not just learning for loops or learning functions; we're doing more advanced programs, writing things that maybe take more than 100 lines of code, maybe working with classes and objects.
A
Our second is working with types of data, so almost all of the programming concepts we introduce are tied into working with real data: namely tabular data like Excel or CSV, unstructured text data, and working with images or geospatial data. So we have a bunch of case studies on different data types and how to process them.
A
We also want to introduce students to modern data science tools and libraries, so throughout we're also teaching them how to process this data efficiently without having to write all this code from scratch, using things like pandas or scikit-learn or NumPy for processing. And then, because we're a computer science course, we also try to teach some intro computer science concepts, namely efficiency, talking about how to use data structures, and talking a bit about memory management in a computer.
A
When we... I want to make sure I open the chat in case I see any questions. Oh sorry, over here on the Etherpad. Sorry, I'm so used to teaching where I have to look at chat at the same time. Working with different data types: I mentioned we do tabular. We also work with time series, unstructured text, geospatial data, and images. In terms of data science tools and libraries, we teach some Jupyter notebooks. We also teach them how to write Python scripts in Visual Studio Code and how to install Python with Anaconda.
A
We teach a lot of libraries for data science that are very, very popular. And then in terms of computer science topics, we talk about efficiency and computer memory. We talk about some of the computing ideas behind how to make things efficient, like hashing and indices, and we also talk a little bit about why Python is generally a slow language and why that's okay for data scientists, why it's very powerful despite being slow, because that's something that usually freaks students out. So that's what the course is about.
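As a rough illustration of the hashing idea mentioned here, the sketch below compares membership tests against a Python list (a linear scan) and a set (a hash lookup). The function names and the sample numbers are hypothetical and are not taken from the course materials; the timings will vary by machine.

```python
import time

def count_hits_list(queries, items):
    """Count queries found in `items`; each `in` on a list scans it element by element."""
    return sum(1 for q in queries if q in items)

def count_hits_set(queries, items):
    """Count queries found in `items` after building a set, so each `in` is a hash lookup."""
    lookup = set(items)
    return sum(1 for q in queries if q in lookup)

# Hypothetical data: 2,000 even numbers, and 2,000 queries of which half are even.
items = list(range(0, 4000, 2))
queries = list(range(2000))

start = time.perf_counter()
hits_list = count_hits_list(queries, items)
list_seconds = time.perf_counter() - start

start = time.perf_counter()
hits_set = count_hits_set(queries, items)
set_seconds = time.perf_counter() - start

print(f"list: {hits_list} hits in {list_seconds:.4f}s")
print(f"set:  {hits_set} hits in {set_seconds:.4f}s")
```

Both functions return the same count; only the cost of each lookup differs, which is the kind of trade-off a discussion of hashing and indices would typically cover.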
A
I want to talk a little bit about... oh sorry, one other thing I forgot: all of these things are programming concepts, but importantly, they're all grounded in data science applications.
A
We do focus all our applications on processing and analyzing data, but we're not doing things like statistical validity, because it's not a statistics course, it's a programming course. Those things are crucially important, but we're not the only class they're going to be taking related to data science. So we talk about principles of visualization and about what machine learning is at a high level.
A
So one thing I've found very effective when teaching this course is running it as a flipped classroom. What that means is students do all the preparation outside of class, where they read or watch videos. Students have reported that sometimes they don't like reading, so I recorded videos of myself talking over the reading.
A
Most of our offerings have unfortunately been on Zoom at this point, just with the length of the pandemic, but we walk around the Zoom room and we help them with these problems and give them more individualized help. I think this has been really helpful, because I believe that data science is best learned through experience, and I think that the flipped classroom model really helps us. I have a picture of the learning tool that we use, called EdStem.
A
This tool has been fantastic in transforming my course. It lets us host Python environments for students, so they don't have to install anything on day one. We teach some installation later, but it's a really nice warm-up. It lets them interact with code and lets us write auto-graded tests, so this EdStem thing has been absolutely fantastic for me in this course.
A
That's what they're doing in class. We also have some assessments of their learning each week, so we have programming assignments. I want to just highlight a couple of fun programming assignments we use. We have them write programs, then we assess them on their behavior and their code quality: do they follow style principles of writing good, well-behaved code?
A
We have them write a search engine: taking unstructured text data, talking about TF-IDF scores, and trying to rank search results. This is actually the most complicated assignment in terms of just raw programming, but we do tie it into working with text.
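To give a sense of what ranking by TF-IDF involves, here is a minimal sketch in plain Python. It is not the course's assignment code: the function names, the whitespace tokenizer, the tiny corpus, and the particular TF-IDF variant (term share times the natural log of N over document frequency) are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()

def build_index(docs):
    """Return per-document term counts and, per term, how many documents contain it."""
    term_counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    return term_counts, doc_freq

def tf_idf(term, counts, doc_freq, num_docs):
    """TF is the term's share of the document's words; IDF is ln(N / document frequency)."""
    if term not in counts:
        return 0.0
    tf = counts[term] / sum(counts.values())
    idf = math.log(num_docs / doc_freq[term])
    return tf * idf

def search(query, docs):
    """Rank documents by the sum of TF-IDF scores over the query terms, best first."""
    term_counts, doc_freq = build_index(docs)
    scores = {}
    for name, counts in term_counts.items():
        score = sum(tf_idf(t, counts, doc_freq, len(docs)) for t in tokenize(query))
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical three-document corpus.
docs = {
    "a.txt": "the cat sat on the mat",
    "b.txt": "the dog chased the cat around the yard",
    "c.txt": "dogs and cats and birds",
}
results = search("cat mat", docs)
print(results)
```

In this toy corpus, `a.txt` ranks ahead of `b.txt` because it contains both query words and is shorter, while `c.txt` is dropped because this simple tokenizer treats "cats" and "cat" as different terms.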
We have them analyze food access data. This is one of my favorite assignments because it actually has them work with two data sets, the census data and a food access data set, and join them together.
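The join itself can be sketched in a few lines. The example below uses only the standard library rather than pandas, and every column name and value in it is invented for illustration; it is not the assignment's data or its actual solution.

```python
# Hypothetical rows standing in for the two data sets.
census = [
    {"tract_id": "001", "population": 4200},
    {"tract_id": "002", "population": 3100},
    {"tract_id": "003", "population": 5600},
]
food_access = [
    {"tract_id": "001", "low_access": True},
    {"tract_id": "003", "low_access": False},
]

def inner_join(left, right, key):
    """Join two lists of dict rows on `key`, keeping only rows present in both.

    Assumes `key` values are unique within `right`.
    """
    right_by_key = {row[key]: row for row in right}  # dictionary lookup on the join key
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined

merged = inner_join(census, food_access, "tract_id")
low_access_population = sum(r["population"] for r in merged if r["low_access"])
print(merged)
print(low_access_population)
```

With pandas, the same idea is usually expressed with `DataFrame.merge` on the shared column; the dictionary version is shown here only to keep the sketch self-contained.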
A
But ultimately our final goal is a final project, which is to work on a data analysis of their choosing, something they find authentic for themselves. Basically the only requirement is that it has to use Python in a meaningful way, because it is a Python programming course. We give them a heuristic that it should be at least 120 lines, but that is not a hard limit.
A
It's just whatever seems challenging enough. As I previewed at the beginning, one of the things that we've been pioneering for the past year now is this notion of what makes a project challenging enough, which is identifying challenge goals. So we tell them: here are a couple of ways that a project can be challenging.
A
You could work with multiple data sets. You can work with really messy data. A lot of students try to do web scraping, and that is usually just really gross code to write, and I want to reward them for it: if they don't have the most stellar analysis, but they did a really cool data processing task, that's still important.
A
They can go into applying statistical validity or other advanced data science topics that we just don't get time to talk about. We can talk about more advanced ways of training ML models, or learning a new library. And importantly, we want to give students the opportunity to express themselves in their own way, so we always have the option to say: define your own challenge goal. We will work with you and make sure that it feels like the right level, but do feel free to challenge yourself in a different way.
A
We think this is a relatively diverse set of ways that you can have a challenging project, but it's not the only way. And that's the course so far. So the last slide I have (and I haven't been watching the time, so hopefully I'm not going over) is: what challenges have I run into while designing this course, and where do I see it going in the future?
A
So far, the biggest challenge for me, or the thing I've been thinking about a lot, is better integration of ethics and societal impacts of data science throughout the course. So far they're kind of segregated into two days, which is better than nothing, but I wish we had a way of approaching these throughout the course. The challenge for me is: how do you make time to bring it up in an in-depth way while still covering the things that you need? So that's a challenge.
A
We want to continue to scale our course. Ben, next month, is going to be talking about our data science minor, and we need this course to be larger to serve more students, because this is one of the main programming courses that they might take for a data science minor. The current bottleneck is TA workload, because grading this final project is actually not just grading at the end.
A
We also want to experiment with new methods for assessment to make more equitable and more effective assessments. Right now we're working in what we call mastery-based grading, which is: we dropped points and we grade everything on an ESNU scale (Exemplary, Satisfactory, Not yet, and Unassessable), and we give students the option to resubmit. So it's really more focused on feedback rather than points.
A
I can't tell you how that's going right now; we're still in the process of trying it out. And then we want more classes for students to take after 163. This is one of the big things: we're trying to expand our offerings of data science courses. And lastly, the one that I'm most excited about: we're actually launching 163 in high schools.
A
So next year we're going to be teaching students in greater Seattle area high schools, after their AP CS, which is our intro Java equivalent, how to do this in high school. There are lots of challenges to come there, namely technology, because not every student has access to a computer that can run Python. So we really need to figure out a story here on how to do this, and I'm really excited to be working on that this year.
A
That is all I have for you today. We're going to do questions after, I believe; I'm not quite sure of the structure. But you're also always welcome to reach out to me if you want to talk more about this. Thank you so much.
B
Thank you so much, Hunter. This is a really interesting approach and structure for the course. There are a couple of questions that are already in the chat. Yeah, [unclear] about the TA load for the course.
A
Yeah, so I've been very thankful that my department's been able to actually give me two TAs per... so we usually have quiz sections at UW, and we usually have one TA per section. Because it's a new course, I have actually gone to two TAs per section, so it's about a ratio of 15 students to one TA. We use undergrad TAs quite a lot at UW.
A
So once a student has taken the course, they're eligible to TA it, so there are lots of challenges in trying to train undergrads in teaching. But yeah, that's what we usually have. We tell the TAs they're working about 15 hours a week, which covers both teaching and grading.
B
That's it. And then on the mastery grading with resubmit, especially for large classes: it's awesome for learning, but will it just multiply the grading and the TA bottleneck, because they get a lot of resubmissions? Is that the case?
A
Yes, that's a great question. That was my absolute biggest nightmare with this kind of system: is it just going to make us work more? One thing that makes it a lot easier is that without points you never have to worry about the difference between a 39 and a 40. You know how much time and mental energy you waste just thinking, "Oh, is this fair to take off, because they might not have known that yet?"
A
In some sense, you can always bias toward being a little stricter or having higher standards, because students have the option to resubmit. So it's not the end of the world if you give them an S instead of an E, because they can resubmit it in a future week after responding to the feedback. I was really worried about the resubmission workload, but it turns out it only takes you less than five minutes a student, because the changes, the delta, tend to be small.
A
You go look at your past feedback and just say, "Oh yeah, they responded to it. That's great." So far we're about three weeks into resubmissions, and the TAs have not told me anything about workload. They actually said it decreased a bit, because ESNU grading in the first place is just easier. Nice. And then there's a question about prereqs. Yeah, it's just some programming course; it can be either intro Java or intro Python, so yeah, that's...
A
Yeah, we tell them that we have a little less support for them. Most of my materials... I failed to mention, because of these many paths: if you go look back at this (oh, animations), if you go look back at this diagram I had, and you look at the scale of the intro Python course, it's like 300 students compared to the 4,000-ish students that are taking either 142 or 143, so they're...
A
I have like 90 of my students in 163 coming from the Java background, and so most of my materials constantly reference Java, like, "Oh, you already know this idea in Java; this is just slightly different words in Python." I just don't have analogs for them for C++, because I just haven't made materials for C++. But they're welcome to take the course; we just generally say you might need a little bit more support.
A
Great question. No. So to manage the workload, they can make one resubmission a week on any assignment, so they can either resubmit homework zero from a little while ago or they can resubmit the most recent homework. One of the challenges that we're running into, actually, is that a lot of students seem to... we tell them they can resubmit any work as long as it's had feedback on it and they made an initial submission by the deadline.
A
So one of the challenges we're running into is that it seems like students are putting off the work: they submit basically nothing for the first submission, and then they're now falling a bit behind. This is a little bit higher than we had before, and we're trying to think about how to better support students, because it's going to get pretty bad in a couple of weeks, once we start getting to this final project and they're four homework assignments behind.
B
Yeah, okay, so I think we have one more. And Kendra pointed out as well: if you don't get to ask your question, or you think of one later, put it in the Etherpad. So Hunter, if you see questions underneath, where we took notes, you can also add in things that we left out in the note-taking. This is just, you know, an attempt to capture here some of the things you talked about.
B
We'll also have this recording, but you can respond directly to people in the notes, and we'll try to put, or you can even put, your contacts. I think it's already in there too, for people to reach out. Okay, so last question. Andrew, I think you said: do you use GitHub or some other version control system for homework submissions?
A
Yeah, so we don't use GitHub. I think that it's a fantastic tool; it's just a little hard for students who are mostly in their freshman year. This EdStem system we use actually does a kind of diffing between previous assessments for you, without students needing to use Git. I've used it in other courses, some slightly more advanced courses, but in this one there are already so many technologies going on that I'm just a little hesitant to add one more, even though it's a very important and useful tool.
A
It started as just something I got for this course, and we're actually growing it to be something more, at least in the Allen School, or computer science department.
B
Great course. So we'll switch over. Please put in your comments, and if you have experience with courses in this area, or suggestions, add that too. You know, we want to go around for some of the challenges that Hunter mentioned; if you have experience with that, put it in the Etherpad as well.