Description
Presenter: Ashok Krishnamurthy & Isma Gilani
Institution: University of North Carolina - Chapel Hill
A
So, my name is Ashok Krishnamurthy. I am currently the interim director for RENCI, the Renaissance Computing Institute. RENCI is a research computing institute that is housed at the University of North Carolina at Chapel Hill, but it has a unique role in the sense that, by state mandate, we actually work closely with both Duke University and North Carolina State University, so we have a role with all three. I am also a research professor in the computer science department at UNC Chapel Hill. Isma, you want to introduce yourself?
B
Sure. I'm Isma Gilani, and I am the interim director of the software architecture group here at RENCI. We are responsible for building the HeLx and EduHeLx platforms that Ashok and I will be talking about today.
A
Very good, thank you. So I want to get started on the topic itself, and I think Isma will give you a little bit more detail about this. As part of an NIH-funded project, we created a software platform, a workspace platform called HeLx, that was used, to begin with, to provide imaging services for the National Heart, Lung, and Blood Institute.
Since then, through the leadership of Isma, and Steve Cox before her, we have made HeLx into a multi-purpose platform, so we can build many different vertically integrated applications with HeLx as the core technology. One of the things we did when we were looking at HeLx was to say: here is a fantastic platform for doing things around data science education and training. I have been teaching a course that is around data science, you could call it an introduction to data science course, for several years, and when we saw where we could take HeLx, it looked like a great way to create a version of HeLx that would be extremely useful and address many of the issues I was running into.
Stan Ahalt, John Majikes, and I are the three people who teach this course when we run it.
So that's how we created this version of HeLx, or this vertical of HeLx, called EduHeLx. Last year, UNC Chapel Hill also launched a data science minor, and I am one of the faculty members on the minor committee. Right now we have over 500 students; as of the spring semester, I think more than 500 students have signed up to be a part of this minor. The requirements for the minor are fairly straightforward.
The minor consists of five courses, three of which are core courses, and then two electives after that. The idea is that the core courses give you some of the fundamentals you need to be an effective data science person, and then you choose the electives based on your particular area of interest. So, for example, you could choose electives in machine learning if you wanted to.
You can see there are three core requirements: Data and Computational Thinking; Data and Statistical Thinking; and Data, Culture and Society. The particular core requirement I'm going to talk about is Data and Computational Thinking. Okay, go to the next slide. So the basic idea is that there have always been courses in computational thinking, so courses in computational thinking are not necessarily new, and you can think of it in many different ways.
The angle that we brought in is to say that a lot of data analysis, data science, and data management also requires you to be able to think about how you can process, manipulate, collect, distribute, analyze, and visualize the data itself.
It is worthwhile to have a course that provides the computer science skills, the coding skills, around what would be needed to be an effective data scientist. There is not a single course that you have to take to fulfill this requirement. As you can see, right now there are four different courses that you could take to fulfill that requirement. Two of them are from the computer science department.
One is from geography and one is from political science. The basic requirement that we have in all of these courses, the way we chose these courses, is that each has to be built around an ability to teach you programming, so that even if you come with no programming or coding background, it will actually give you some.
The exact language that is used is not as important as giving you the necessary concepts. The programming or coding that you are taught has to be hands-on and assignment-based, so that it's not just a theoretical exercise, and the assignments necessarily have to do with data: you can't just run a simulation and say, "Okay, I now know how to write for loops."
That's because there's a name that existed long before data science ever became what it is now, so you could say that its real name is an introduction to computational thinking and data science. That's what it really is. Okay, we have been enrolling about 400 students a year, about 250 in the fall semester and about 150 in the spring semester, and we are basically topped out there because of room sizes and things like that. Okay, next slide, please.
Everything we do in the course is based on Jupyter notebooks. Lectures, worksheets, quizzes, examinations, programming projects: these are all the different artifacts that are involved in the course, and every one of these exists as a Jupyter notebook that is then shared with the students. To help us handle these very large class sizes, a couple of the faculty members in the computer science department developed a set of tools that makes managing the course a lot easier.
These tools are all basically Python libraries. We have a tool called fetcher, which allows students to easily download course material; a tool called submitter, which allows students to submit their completed work; and a tool called checker, which provides feedback to the students as they are working on their code.
As you know, a Jupyter notebook is based on cells. So you can complete a cell and actually run it, and then, if you run the checker after you've written your code, it will give you feedback. There are three levels of feedback, with increasing levels of specificity. For example, if it is a worksheet, which is the learning medium that we use, or if it is in a lecture, the feedback may actually say, "Well, I was expecting an integer here, but what I got from you is a string." So it's giving you quite a bit of a hint as to what may be wrong with your code. Students can run the checker even in an exam; we include the checker in an exam too. In that case the feedback may be just, "Okay, I got your answer," so that's more to tell the student we accepted it. It doesn't give them any hint as to whether they are on the right track or not. So you can adjust the level of feedback.
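The adjustable feedback described here can be illustrated with a minimal sketch. This is not the course's actual checker: the `FeedbackLevel` names, the `check` function, and the middle level are assumptions made for illustration, since the talk only describes the least and most specific levels.

```python
from enum import IntEnum


class FeedbackLevel(IntEnum):
    """Hypothetical names for the three levels of feedback."""
    ACKNOWLEDGE = 1  # exam: only confirm that an answer was received
    RIGHT_WRONG = 2  # assumed middle level: say whether the answer matched
    DETAILED = 3     # worksheet or lecture: hint at what looks wrong


def check(answer, expected, level=FeedbackLevel.DETAILED):
    """Return a feedback message about `answer` at the requested level."""
    if level == FeedbackLevel.ACKNOWLEDGE:
        # No hint about correctness at all.
        return "Okay, I got your answer."
    if type(answer) is type(expected) and answer == expected:
        return "Correct."
    if level == FeedbackLevel.RIGHT_WRONG:
        return "That is not the expected answer."
    # Most specific level: point at the likely problem.
    if type(answer) is not type(expected):
        return (f"I was expecting {type(expected).__name__} here, "
                f"but what I got from you is {type(answer).__name__}.")
    return f"The type is right, but the value {answer!r} is not what I expected."
```

In a worksheet, `check("42", 42)` would report the type mismatch, while in an exam the same call with `FeedbackLevel.ACKNOWLEDGE` would only confirm receipt.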
We also have a grader, which is basically an ability to take an exam, a quiz, or a worksheet, put it into a containerized environment, and run it, so we can do autograding.
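As a rough illustration of the autograding idea, the sketch below runs submitted code in a separate Python process and compares its printed output with an expected answer. The function names and the scoring rule are hypothetical, and a plain subprocess is only a stand-in for the containerized environment mentioned in the talk; it provides no real isolation.

```python
import subprocess
import sys


def run_submission(code, timeout_s=10):
    """Run student code in a separate interpreter and return its printed output.

    Returns None if the code fails or runs longer than `timeout_s` seconds.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None  # treat runaway code as no answer
    if result.returncode != 0:
        return None  # the code raised an error
    return result.stdout.strip()


def grade(code, expected_output, points=1):
    """Award `points` if the printed output matches the expected answer."""
    return points if run_submission(code) == expected_output else 0
```

A real grader would execute the whole notebook inside a container image with the course's Python environment; the comparison step at the end would look much the same.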
The one thing I would have to say is that both the checker and the grader only check the correctness of the answer; they're not necessarily checking the code. There's a little bit of checking that may happen. We will be able to see if, for example, a student is using a list comprehension when we had said you couldn't, so there may be a little bit of things like that, but it's mostly checking the answer.
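One way such a code-level check could be done, sketched here as an assumption rather than as the course tools' actual method, is to parse the student's source with Python's standard `ast` module and look for list comprehension nodes. The function name is hypothetical.

```python
import ast


def uses_list_comprehension(source):
    """Return True if the source code contains a list comprehension."""
    tree = ast.parse(source)
    # ast.walk visits every node in the syntax tree, including nested ones.
    return any(isinstance(node, ast.ListComp) for node in ast.walk(tree))
```

This only detects the `[... for ... in ...]` form; a generator expression passed to `list()` is a different node type and would need its own check.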
Like I said, we have been teaching this course for a while, but we have used EduHeLx to teach it twice, in fall 2021 and spring 2022. Okay, go to the next slide, please. So why did we choose to go to EduHeLx? Before fall 2021, we were asking students to install Anaconda on their computers. This worked.
We have taught it many times using that, but it has many issues, and really it results in instructor and TA distraction and student frustration. Some of the things are: somebody has Windows, somebody has macOS, and we actually had a few students whose preferred operating system is Linux.
We always have issues with versions and version control, different versions. If the students have a problem and we need to debug it, we literally have to sit next to them before we can see their code, which works okay, but of course with remote learning it's a little harder. We would have to use Zoom and look at the student's screen, and that necessarily limited how many students we could work with at the same time.
So there were some issues there, not all of which HeLx solves, but some of them. Laptop failures were a problem. When you have 250 students, you can be sure that somebody's laptop is going to crash in the middle of the semester, and then they are in a total state of panic: "I lost everything. What do I do now?"
We even used to have networking issues. The checker, the fetcher, and the submitter all talk to a server, and sometimes, because of some networking changes that somehow magically happened to the student's laptop, suddenly it doesn't work. We would constantly run into conflicts because some software somewhere auto-upgraded and made some changes, so suddenly their Jupyter notebook is not working, et cetera, et cetera.
So there used to be a net loss of time that was unrelated to the actual instructional material we want to convey to the students. With EduHeLx, all they need is a Chrome browser, so they don't even need a full laptop. If you have any kind of device that provides you the Chrome browser, that's all you need, and you can do all of the material that you could do before.
B
Thank you, Ashok. So, as Ashok stated earlier, EduHeLx is a vertical domain of HeLx. HeLx is built by RENCI, and it is a scientific computing platform built to address the need for robust cyberinfrastructure in the cloud. We have thus far deployed specific HeLx domains to aid research and science by sharing, ingesting, and analyzing scientific data.
HeLx provides a wide array of data science tools in a modern cloud-native environment, and it provides that with appropriate security, networking, and persistent storage. It is deployed as a customizable and configurable domain specific to the user community, so the researchers in genomics may have a different suite of tools than the users of EduHeLx.
It empowers researchers with computational workspaces close to the data in the cloud, and it aids instruction and learning in a classroom environment by providing customizable computing environments for data science courses, as Ashok has mentioned, for EduHeLx.
... developed at RENCI by the software architecture group developers, which provide the launching of apps and all the network services. Then we have custom apps, based on the domain, to give the users and the researchers the appropriate tools that they need to execute their use cases.
It was specialized for COMP 116, so for an individual educator and course, and we, the HeLx team, worked with UNC ITS to deploy EduHeLx for COMP 116 in GCP for the fall '21 and spring '22 semesters.
The COMP 116 instance of EduHeLx was built with workspaces consisting of specialized Jupyter notebooks preloaded with the required Python environment and modules needed to complete and submit assignments and exams, and all of this was hosted in a Kubernetes environment. We had Dockerized processes running on servers in the computer science department, which were used for grading student submissions, and these were the fetcher and the checker that Ashok had mentioned earlier.
We provided continuous monitoring of cluster resources and provided metrics to educators and to the users for insights into resource allocation and usage. By users, what I mean here is the cloud, the provisioners of the technology, not the user base itself.
This is the EduHeLx COMP 116 architecture at a glance, very similar to, and based on, the HeLx architecture. We have Kubernetes in the mix. We have a HeLx UI, which uses SAML authentication, so Onyen-based authentication for each user in the class. We have custom apps, consisting of Jupyter and a file browser for saving the data locally, with Jupyter notebooks compiled with the COMP 116 modules to execute the grading and the fetching of assignments.
The future of EduHeLx: we want to use it to do great things, especially in the area of data science. We will be collaborating with the School of Data Science and Society to leverage EduHeLx as its educational platform. To support this initiative, and this is a very ambitious initiative, the development team is compiling a list of essential enhancements, including unified workspaces for multiple course offerings, integration with better grading tools, the addition of R and RStudio to the computational suite, and integration with the campus LMS.