From YouTube: Education & Workforce WG | Developing Data Science Curriculum for Postdocs in Bio & Medicine
Description
January 2022
DS+X: Developing Data Science Curriculum for Postdoctoral Scholars in Biology & Medicine
Speaker: Arko Barman; Rice University
A
All right, thanks a lot for introducing me. I probably don't have to add much, but I just want to mention that I've been involved in curriculum development, mostly focusing on interdisciplinary activities with other departments and with people at different career stages. This one, for example, is for postdoctoral scholars, who are often ignored when it comes to education because, as we all know, they're always in their labs, and they don't have much time to do anything else besides their own experiments, publishing, applying for grants, and so on.
A
So I started developing this curriculum while I was myself a postdoc, and that was the first initiative that a postdoc had ever taken at the university to develop a curriculum for postdocs. It came from peer feedback I got: postdocs were interested in coding, programming, and data science, but they didn't have access to a lot of resources, so anything they picked up was from blog posts and YouTube videos. It was very unstructured, with no one to guide them and advise them on how to proceed.
A
So I took it upon myself to try and develop this curriculum. I'm no longer at the university where I did this, but the program continues. Other people have taken it up, and I'm very happy that it still continues even after I left. So my motivation was to introduce data science to other STEM disciplines.
A
Also, data science is now a big thing, and it's being introduced at the undergrad level at various universities. However, there are people at advanced stages of their career, for example postdocs, who did not have the opportunity while they were undergrads to be exposed to concepts in data science. The same goes for a lot of faculty members in different STEM disciplines.
A
Also, I was at a university that was focused on biology and medicine, so I realized that there's a growing use of machine learning, AI, and data science in biology and medicine, as in the work I was involved in. People on the biology side, with expertise in biology and medicine, did not have the training or background to even convey what they were trying to do to a machine learning expert like myself. They were fumbling, and they could not translate the concepts in biology and medicine to the concepts in data science.
A
So this was a course offered at UT Health, which is the University of Texas Health Science Center at Houston. It was offered by the Office of Postdoctoral Affairs, and they already have a certificate training program, so it was offered as a part of that program, and we had a lot of support from the postdoc association over there.
A
So the goals and objectives of this course were to introduce researchers to a programming language or software, and given the nature of data science, the course was offered in both Python and R at different points in time. Another goal was to introduce very broad categories of problems, such as classification, regression, clustering, exploratory analysis, and so on, just to give them a flavor of the different kinds of problems in data science that they can use in their own research. The goal is, of course, to use these methods in their own research.
A
We also encouraged the postdocs attending to try to translate their own research problems in their domains to data science problems like regression, which is of course very common, or clustering, or dimensionality reduction, and so on. The goal was also to do expeditious learning through the use of data sets that they're familiar with, and to improve learning outcomes through active learning, with active involvement of the postdocs in the classroom setup.
A
The curriculum was split into two broad segments. The first one was coding, of course, because most of them did not have any experience in coding, and the second part was a bit more technical, on the data science side, with some machine learning concepts being taught. The coding segment consisted of either Python or R. There were demos and examples in the class, and there were pair programming exercises both in and out of class, using optional homework assignments.
A
There were think-pair-share sessions where the postdocs were asked to talk about their own research data sets and how they could come up with problems that they can solve using data science methods.
A
Another thing that I incorporated was a floating lecture. The last lecture session was a floating lecture where they could actually request that I cover a certain topic that they wanted covered. For example, I got a lot of requests for PCA; supposedly that's pretty popular in the biosciences.
A
I got requests for doing heat maps by clustering and things like that, and that's something which postdocs particularly appreciated, because it covered some topics that they'd heard about or used but did not know much about.
A
So the only way to evaluate over here was the final project, because we focused on experiential learning. I handcrafted teams, basically using people from different departments, backgrounds, genders, and ethnicities, and they were encouraged to meet once every week outside of class time to discuss and practice. They had to use a data set from their own lab, from their own research, to complete the project. They had to frame their data science objectives or problem using that data set, and they required a lot of help from me, which I was happy to give. The project usually comprised four components: exploratory data analysis, classification, regression, and clustering. Towards the end of the course, on the last day of the class, they presented their findings to the others in the class, and we had a very healthy discussion on what things could be done, what was correct and what was not, what would be done in future, and so on.
A
Likert scale: strongly disagree, disagree, agree, and strongly agree. Mostly, people were very appreciative of the course in a lot of ways. Everyone found that the course supplemented their laboratory training and research (question C over here), and they found the course a good use of their time outside of the lab, which is particularly difficult to do for postdocs. Everyone said that their postdoc mentor was also supportive of their participating in this course. And actually, for a few months after the pandemic started, we had to move the course to an online session, and it seemed it did impact learning for a few, but mostly the students, the postdocs, were okay moving to an online platform.
A
Right. Finally, some lessons learned from this experience. Firstly, there are a lot of time constraints for postdoctoral researchers, as everyone knows; I've been one myself. Taking time out of the lab to learn something useful that they can use back in their own lab is very much appreciated.
A
Assessment was a challenge because we cannot have exams and such for postdocs, so there were optional homework exercises, which most of them actually did, and the final project, which was actually very impressive. Also, the use of domain-specific data sets and problems made the class more interactive and more approachable for everyone.
A
The floating lecture was very much appreciated. Framing of the final project did need a lot of help from my side, and it still needs a lot of help from the person who's the instructor right now.
A
We also tried introducing guest lectures, bringing in researchers and professors working in data science and machine learning, but this was not appreciated, which is an interesting thing to note. The postdocs felt that it would be better for them to learn methods and learn APIs and libraries to do data science tasks for themselves, instead of bringing in professors who talked about very high-level stuff of what they're doing in their labs. So that did not help at all.
A
Also, towards the beginning of the course, a lot of them needed help with software installation. It's good to have a TA, if possible, to do that towards the beginning of the course. Now, there's a sister course to this called Statistics for Biomedical Researchers. Of course, we didn't have time to cover stats along with coding and machine learning, so this is a sister course that was offered, and it was also pretty popular.
B
Thank you. Thank you for that. I really like this course. I'm thinking back to my postdoc days and thinking this would have been a really good thing to have, so, pretty impressive. So does anyone have any questions? I have questions for Arko as well. Did I miss it, or what was the time frame over which you implemented this, I mean, that you offered this course?
A
This was offered over a standard 14-week semester, and the class time was one hour, given the time constraints of postdocs. But I was holding office hours, and a lot of people dropped in, and even outside of class I kept receiving emails, so it was actually well past one hour that the postdocs were spending on this course.
B
A
I expected no background. I interacted with them before offering this course, and I expected no background at all. Coding was started from scratch, and machine learning concepts were taught from scratch. They were also provided reading materials if they wanted to read further. So yeah, basically, anyone with no knowledge of coding or machine learning could take this course and hopefully learn something.
A
Unfortunately, no. I don't have the bandwidth at the moment to offer this again. Let me mention, this was offered as a service, so it's not that I got paid for it or anything. Being faculty now, I don't really have the bandwidth to do that anymore.