From YouTube: Lightning talks and discussions on Clojure data science - Scicloj meeting 12 - Aug 2020
Description
On Aug 30th, 2020, the Scicloj community had a public meeting with diverse lightning talks and discussions about data science in Clojure.
Background and agenda:
https://scicloj.github.io/posts/2020-08-26-public-meeting-2020-08-30/
Text chat:
https://tinyurl.com/yymuadpb
Some notes:
https://clojureverse.org/t/scicloj-needs-you/6467
A
So hello, everybody. This is the August Clojure data science meeting, and let us mute ourselves as much as possible while others are speaking. Maybe in a moment we will hear ourselves discussing some stuff. In the first part of the meeting we will have a few very short talks by several of us, and afterwards we will have a discussion.
A
The idea is mainly to catch up, to hear a little bit about the projects which are going on in the ecosystem we are building, and to see everybody here. It is heartwarming because, of course, it is an unusual time and people are having their private, local struggles, and being here matters, I think, for us.
A
For us, looking for a way to somehow make sense of what is going on, I think that at least for this community we can have good hopes for the coming weeks and months, and let us discuss that a little bit later.
A
What we are going to do now is have a sequence of short lightning talks, where people will be telling what they are about and what they have been feeling recently. I guess we will try a method where each person may speak for five minutes, maybe less if they wish (less is also good), and we will use a gong sound so that people can know how much time has passed.
A
The gong will be heard at the beginning and then after three minutes, four minutes and five minutes, so that you can see where it is starting to end. The first person to speak will be Anthony Khong, about the new Spark wrapper he has been building. Let us just hear the gong for a moment.
A
Is it okay? Could you hear it? Yeah? Right. So that sound will happen at the beginning and after three, four and five minutes. More people are joining. And so, Anthony, are you ready to begin?
B
Okay, please let me know once you can see the screen.
B
All right, great. So, first of all, thank you so much for having me. Since we only have five minutes, I'll just cut to the chase. I'm talking about Geni, which is a Clojure data frame library that runs on Spark.
B
There are two things that I want to cover today: I'll tell you what it is, and also some design goals that went into it. So let's go to what Geni is. The first thing to say about it is that it's an idiomatic Clojure data frame library. What I mean by idiomatic Clojure here is that it should be nice to read and nice to write, while also being a data frame library.
B
It does what you'd expect it to do: you require it, you load data, and then you can do your count and look at your columns, for instance, and all of this just runs on Spark.
B
So this is an example of doing a group-by and aggregate, and when you show it, you can see the results here. Being hosted on Spark, you get Spark ML for free, so there's a reasonably rich machine learning library that goes with it. This is an example of how you'd create a machine learning pipeline, and then it's just a matter of, like, dot.
B
The next thing I want to say is that it's not just a library: it comes with a command-line interface as well, which gives you a standalone REPL or lets you submit scripts. So it's a bit like the Spark shell: when you fire it up, you don't need any requires. You get the data frame and you get the library; they're required already.
B
So that's a brief tour of what Geni is. Now, some design goals. I think the overarching objective of the project is to provide an environment where you can have fast feedback from the data. I think this will be familiar to a lot of data scientists: you're constantly living in this cycle where you get some idea about your data, then you do some query, then you get the result, and then you build on top of that idea and get new ideas from that.
B
So the idea is that we want to optimize this feedback, and some important factors are: kicking off the feedback cycle, so it must be fast for you to start doing it; then translating your ideas to queries; and, finally, your query speed. A fast and accessible REPL is one of the design goals that we're trying to hit, and the use case for this is spontaneous queries. For instance, you're working on a data set and you're thinking, okay, what's the most expensive region in Melbourne?
B
I think Python is really good at this, because it starts very quickly, and then you just do import pandas as pd, you read your Parquet file, or whatever file you store it in, and then you do your query and you get the answer. That's fine! Basically, we want to be able to do the same thing, but with Clojure.
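As a rough illustration of such a spontaneous query, here is a minimal sketch in plain Python of the group-by-and-aggregate step behind a question like "what's the most expensive region in Melbourne?". The records, the field names and the helpers mean_price_by_region and most_expensive_region are made up for illustration; this is neither the Geni API nor the pandas API, only the shape of the computation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records; a real session would load them from a file.
SALES = [
    {"region": "Northern", "price": 900_000},
    {"region": "Northern", "price": 700_000},
    {"region": "Southern", "price": 1_200_000},
    {"region": "Southern", "price": 1_000_000},
    {"region": "Western", "price": 650_000},
]


def mean_price_by_region(rows):
    """Group rows by region and return a dict of region -> mean price."""
    prices = defaultdict(list)
    for row in rows:
        prices[row["region"]].append(row["price"])
    return {region: mean(values) for region, values in prices.items()}


def most_expensive_region(rows):
    """Return the region with the highest mean price."""
    by_region = mean_price_by_region(rows)
    return max(by_region, key=by_region.get)


if __name__ == "__main__":
    print(mean_price_by_region(SALES))
    print(most_expensive_region(SALES))
```

The point of the talk is that the idea-to-answer loop should stay this short even when the data lives on Spark.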
B
So that's where the Geni CLI really comes in: no need for lein new or lein run, no need for requiring this and that. It starts up and then you're good to go. It's only a little bit slower than Python in this case. The second thing is about translating your ideas into code, and one thing to say is that it has to be nice to write. There's nothing stopping you from writing interop code such as this one.
B
This is pure Scala interop, but it's not good, because it's quite verbose, and a lot of these queries would be short-lived: the lifespan is under one minute, and if you have to write a lot of throwaway code like this, it's really not nice.
B
With Geni, you can write this instead, which is a little bit cleaner and shorter. It has to be nice to compose as well, so instead of getting Scala data structures, you want to get Clojure data structures that you can plug in somewhere else. So with g/collect here, you get a sequence of nested maps. On query speed: this is a very quick and dirty benchmark that I did, resembling a project that I did earlier this year. But Geni is fast.
A
Thank you so much. That last benchmark was amazing, and we'll discuss it.
A
So I will share my screen, and what I am trying to show in this workflow is a couple of tools that are growing these days.
A
What we have here is a certain Kaggle avocado data set, which we are reading thanks to the Tablecloth library, which is a wrapper of tech.ml.dataset. Tablecloth allows us to do pandas-like processing of a data-frame nature. The visualizations we're having are by some part of the Pink Gorilla ecosystem, which offers a certain DSL for viewing data using certain JavaScript libraries that are wrapped by that DSL. What we are also having here is some UI element, some user interface, that can control our REPL state.
A
So, for example, what we are doing here is using Tablecloth: we are taking all the avocado sales in Seattle and grouping them by type (you see conventional and organic), and looking at some statistics of price and volume. We can filter these, and we can see that if we take a minimal volume which is big, we have only conventional, and if we change it a little bit, then we have organic too. All that is happening in the REPL state.
A
This uses the Pink Gorilla ecosystem, but also the cljfx library, which offers a really neat way to manage state on the JVM in Clojure, and I encourage you to look into it. So we see here Tablecloth and Pink Gorilla and also, behind the scenes, the tech.viz library that generates Vega specs for visualization, and all these compose in pure data transformations. That is what I wanted to show: how these few libraries can connect.
A
The point is that it is now very easy to see how things can connect with very thin layers, so it is very easy to experiment. Let us talk about further experiments on how REPL state, browser state, editor state and your minds can connect together. So thank you, and I'll stop sharing now.
C
Okay, so I'm just going to give a quick overview of what this is, then a little bit about Shiny, and then how to do Shiny-like things inside Saite. Saite is a self-contained interactive environment for exploring data, creating and saving visualizations, and creating shareable interactive documents. Actually, this presentation is a Saite document.
C
Some features: there are main editors. The main editor is like this stuff over here, and then you create the document, which is on the right. Those editors can create code, visualizations, Markdown, LaTeX and reactive components. There is automated boilerplate: it has a powerful template substitution system borrowed from Hanami. It's not cellular or modal or linear like typical notebooks. Doc body editors can be live and interactive.
C
You can directly mix code from Clojure, ClojureScript and R; Python is in the works. There are multiple namespaces, so that in a single document you can have different data scenarios captured without having to reload everything, and you can organize documents along chapters and sections, which is sort of like this tab stuff. Shiny is an R system for visualizing R computations. It uses the web, and it's a push-based design.
C
R code generates HTML and pushes it out to the client in the browser, and it's intended to be a simple and easy way for R users to visualize stuff that they're working on. The key aspects involved here today: the processing is server-based, so you can leverage all the capabilities there; the front end provides for user interaction; and it has high-leverage components for widgets and charts. I think this is probably a big one, this bit here. So, a couple of simple examples.
C
Or maybe this is going to be too long. Eventually this will come up. So, inside Saite, what we need to do the similar kind of thing is data computations on the server. That's not a problem: you can use Neanderthal, tech.ml and so on. Visualizations are provided via Vega-Lite and Vega (there could be others), and you again have very high leverage via templates.
C
If we come back over here quickly, maybe we can see: here's the Shiny version of this, and we should be able to update this stuff a bit. Yeah, so that's going on. Every time you do these clicks here, you're going out to the server; it recomputes and then pushes a new version of the visualization. Here, we're not doing that, we're pulling things, but we can just move this around in exactly the same way.
C
These are again calling out to the server, and we can do the same thing here with this stuff. If we open this up and take a look at how we did it, we have these high-leverage components like slider input and text input. It creates this, and then another one that creates the exact same thing here.
C
So again, these widgets are... just to take a quick look at the widgets, go over here. This thing is loading all kinds of stuff, but the definitions are in these.
D
C
All this stuff... a moment. Here we can use this and change our state. We can update the computation for the sliding window average. You can come over here and look at the similar kind of thing, or this is per state.
A
Thank you so much, John, that was magnificent. So we saw earlier one way of building dashboards through the REPL state, and now you showed us what may happen when one uses the browser state. That was really magnificent, and thank you so much. The next speaker will be Mike. And to everybody who has joined us recently: thank you for your patience. We are now having this short sequence of lightning talks, and afterwards we'll have a short discussion. Mike, would you like to share your screen?
E
Yeah, all right, I'm going to get started. I'm going to talk about Enflame, which is a visual query builder against a Datomic knowledge base. The domain that I work in is cancer immunotherapy. Cancer immunotherapy is a new technique for treating cancer that leverages the body's own immune system.
E
The Parker Institute does research into this process: we're trying to understand it and make it better. The only thing I can say here is that the immune system is really complicated, and there's just an incredible diversity of data and things we want to represent. So we built a knowledge base called CANDEL, the Cancer Data and Evidence Library, based on Datomic.
E
This is a graph of all of the classes involved, and I don't know if you can read that. Some of the central classes are subjects, or patients; samples, which are tissue samples that we get from patients; and measurements, where you run some experiment on a sample and see what's in the genome, or other measurements.
E
So there are 30 or so classes in CANDEL. Genes are another important data type that's sort of central, so we have genes and gene variants, and the samples are linked to that. So the one question is: we have a lot of users of this database who aren't programmers. How can we make it possible for them to query the database?
E
Here are some typical queries a scientist might want to ask of this, such as: what are the outcomes of patients who have renal cell carcinoma and variants in PBRM1, which is a particular gene, when treated with anti-PD1, which is a particular class of drug? How do responses vary by gender and body mass index across cancer types? Things like that, and they can get fairly complicated. So that's some background. For the other piece of background, I'm going to switch gears fast.
E
Scratch is a children's computer programming language that some of you may be familiar with. It's a visual tool designed for children that makes it possible to compose programs by snapping blocks together.
E
It came out of the MIT Media Lab. There's a bunch of similar systems, including Blockly, which comes out of Google. So my big idea was to hook up Blockly with CANDEL to make a visual query builder, and I'll show you what that looks like. The system is called Enflame, because everything we do is candle-related.
E
So this is what Enflame looks like. There's a construction space where you pull out blocks and snap them together and make a query. That query gets translated into Datalog, and then you can view and browse the results down here. So, for instance, here's a typical query a scientist might want: find subjects with stage four cancer who have variants in PBRM1.
E
So to build a query, you sort of identify what entities you're interested in and their classes, and then you pull out blocks that represent those categories and snap them together. So this is a block translation of that query.
E
This is part of what the block construction process looks like. All of the 30 classes are here, and they each have a color. Because we have about 30 classes, we're sort of pushing the boundaries of what you can do with color encoding. There are blocks to query the subjects and the properties something might have, and some of those blocks come with subqueries of their own.
E
Here's a somewhat more complicated example. You can see you can also do things like counting, and control some of the data that's returned.
E
So, yeah, to describe some of the details: like I said, classes are mapped to color. Some blocks have a little output knob, which indicates that they produce a set of objects, and query blocks have this gap. I'm sort of abusing some of the Scratch metaphors, but that's a way you can add multiple clauses: basically, you can put in query constraints and they get ANDed together in there. As you can see down here, we're querying for subjects.
A
So that was six minutes, if you wish.
E
All right, all right, I'm about done. These are some of the underlying components: it's ClojureScript, re-frame and Datomic, and we're using the Blockly library from Google. Part of this is a library called Blockoid, which we've open-sourced and which is a wrapper for Blockly. So now it's relatively easy to create block languages in ClojureScript.
E
I encourage anyone interested in that to make their own. Oops. And then here are some of the other people who work with me at Parker on this. That's it.
G
Okay, does everyone have that? Cool. Let me go ahead and start. Hi everyone, thanks for being here. I'm excited to be showing you an environment I've been working on for analyzing scalable, open-ended feedback as produced by Polis, which is a digital democracy tool for engaging citizens in decision-making processes.
G
So the first thing I'll show you here is just the repository up on GitHub, under pol-is analysis. If you'd like to take a look and play around with some of this data, I welcome you to do that. It's got instructions for running everything here. Right now, I'm going to just go ahead and kick off the Docker image. And sorry, I missed a piece here.
G
Really, what this is is a Docker container, which has a bunch of analyses that have been built upon tech.ml.dataset and libpython-clj, with visualizations built using Oz and Vega. All this comes together in two form factors: one, a custom Clojure kernel for folks who like the traditional Jupyter notebook style of environment, and, additionally, a set of Oz-style analysis notebooks, which I'll be showing a little bit more in a second. So, kicking this off with Docker Compose, down at the bottom...
G
Yeah, and actually I need to make it smaller before I make it bigger, because I need to click on this link and it won't work if it's on multiple lines. But down at the bottom here, once we run this, we'll get some URLs printed out for opening up the Clojupyter, or the Jupyter, notebook. And now that I've got that, I can make this a little bigger.
G
So once we're in here, we can go ahead and click through to notebooks, jupyter, and there's a Clojupyter example here. A little bit more? Yeah, sure. I'm not going to go into anything in too great detail here, just that it has some basic examples of loading up tech.ml.dataset and libpython-clj, requiring some of those Python libraries (don't worry about that), and a little Oz example here.
G
This piece is still a little bit bare-bones, and really what I'd like to show you more of right now is the Oz side of things. So, coming back over here, you'll see that just underneath the link that we clicked it says: nREPL server started on port 3850. So whatever sort of tool you're using, you should be able to connect to that.
G
I'm using Vim here, and what I'm going to do is first evaluate the Polis math namespace in this project. I just want to point out that this is running in the Docker container, but with the nREPL port file, it's able to find the right connection into the Docker image, with all this stuff sort of baked together.
G
Once the Polis math namespace is loaded, we can then go in and evaluate the user namespace, where we have some stubbed-out code for starting the Oz processes. Now, it should be the case that you can just immediately evaluate the user namespace directly and have it require math, but there's a little bug that's preventing that from happening right now, so just a heads-up if you're poking around at this. Once we get that running, we get a message here: web server running at localhost:3860.
G
So we can go ahead and plug that in here in the browser, and we get this little message saying that it's ready for a spec to load. The first thing I'll show is a basic Vega visualization with a couple of data points, just to kick the tires here and make sure things are working.
G
What we really want to take a look at is this Oz build process here, which kicks off a live code-reloading process that gives you a sort of notebook-like environment: something between a notebook environment and a live-coding environment with, say, Reagent and Figwheel or shadow-cljs, if you're used to those tools. Once we've got this running, you'll see that it is looking at the directory notebooks/oz.
G
We can open up this file in that directory. This is just a regular Clojure file. The only interesting thing is that in this particular mode of evaluation, whenever it sees a literal vector form, it will interpret that as Hiccup and render it to the page. So all I have to do to kick-start this is save the file. It sees that something's changed, and you can see over here on the right that it's reloading the file.
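To make the "literal vector is interpreted as Hiccup" idea concrete, here is a minimal sketch in Python, using nested lists in place of Clojure vectors. The function render_hiccup is a hypothetical illustration written for this note; it is not how Oz itself is implemented, and it ignores details such as void elements and keyword tag shorthand.

```python
from html import escape


def render_hiccup(node):
    """Render a Hiccup-style nested list as an HTML string.

    A node is either a plain value (rendered as escaped text) or a list of
    the form [tag, optional-attribute-dict, child, child, ...].
    """
    if not isinstance(node, (list, tuple)):
        return escape(str(node))  # text leaf
    tag, *rest = node
    attrs = {}
    if rest and isinstance(rest[0], dict):
        attrs, *rest = rest  # peel off the attribute map if present
    attr_text = "".join(f' {key}="{escape(str(value))}"' for key, value in attrs.items())
    children = "".join(render_hiccup(child) for child in rest)
    return f"<{tag}{attr_text}>{children}</{tag}>"


if __name__ == "__main__":
    doc = ["div", {"class": "note"}, ["h1", "Votes"], "a < b"]
    print(render_hiccup(doc))
```

Because the document is plain data, a tool can re-evaluate the file on save and re-render only what changed, which is the workflow shown in the demo.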
G
It prints out information about long-running forms that are being processed, as well as how long it takes them to run, and once that's done, we get a visualization and scientific document here on the right. It looks like I'm right at five minutes, so I'll try to speed through this.
G
This particular data set is from a consultation done in Taiwan around how to regulate Uber in the nation. You see there are 1,200 participants here and 50,000 votes, an average of 40 per participant. We can put all these votes together in a matrix of voter by comment, which is cool-looking but not necessarily very useful. Where this gets a little more useful is where we start applying dimensionality reduction to project this really high-dimensional data set into a lower-dimensional space that we can visualize. So this here is a PCA projection, where we're coloring by some groups that have been assigned using clustering algorithms, and we can interact with that using this hover-over here. And here we're looking at another dimensionality reduction called UMAP, which has a few more degrees of freedom for finding finer-grained structure. And so, just to demonstrate some of the interactive features:
G
Here, we can actually see how these two projections relate, which is just really cool. So I think I'll wrap up here. Some longer-term goals for the project: I'd like to eventually migrate the core Polis tool itself to this tech.ml.dataset and libpython-clj stack, as appropriate.
G
Right now, it's just kind of an experimental place where we can build more custom analyses, tool around and investigate the data. The other thing I'm really excited to explore is finding a way to take this API and make it something that people can use from the Python ecosystem, so that data scientists who are more familiar with that can take advantage of this same kind of core logic in playing around with some of this civic data. So yeah, I think that's it.
G
Thanks, everyone, for listening, and I look forward to answering questions later. Thanks.
H
A
I
Let's see, are you able to see my screen now? Yes? Oh, okay! Thank you, Daniel, and thank you all for having me here. This is my first time here, and I've been enjoying listening to the different talks and the tools. I'm a clinician, a physician by training, doing clinical oncology and knowledge engineering.
I
So what I'm talking about here in terms of imprecise data is coming from that ontological perspective. Imprecise means lacking exactness and accuracy of expression or detail. And so the question is: why do we care? When it comes to data, why do we care whether we do things very precisely or not?
I
We do care because it has very serious consequences. For example, many years back (it's probably more than a decade now) we had the Hubble telescope, which was launched into space, and after launch they found out that the images were not sharp. They finally traced it back to an issue with one of the mirrors, which was made with a one-millimeter difference, and that one-millimeter flaw led to a huge problem in terms of images being sharp or not.
I
Now, I come from the clinical space, so I want to give you an example from medicine. We do something called the eGFR, or the estimated glomerular filtration rate, and that is necessary for diagnosing or staging chronic kidney disease.
I
Now, when you do calculations based on this, it can lead to a big difference in the results. The calculated eGFR can vary quite widely depending upon what method you use. So at an individual level, if you are calculating the eGFR and trying to see if the patient has chronic kidney disease or not, you could be off by a little bit, and a patient who does not have chronic kidney disease could very well be diagnosed with kidney disease and start being treated for it. Now, the sources of imprecision in data can be twofold.
I
One source is at the time data is generated, because you're using different methods, different instruments with different calibrations, and things like that. But what I want to talk about today is more the storage, computation and exchange of data, where imprecision can be far more insidious, in the sense that we don't even realize it.
I
This is something that I found in Java about 15 years back, when I was working on some data: simple calculations using a float can lead to an imprecise number. I tried to replicate that in Clojure and, you know, 15 or 20 years later, it's giving me a similar kind of result. If you add 3.3 and 3.3, you get 6.6, but if you do it three times you get 9.89999 instead of 9.9.
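The same effect can be reproduced outside the JVM. The following is a minimal sketch in Python (whose float type is an IEEE 754 double, like Clojure's default decimal literals); the variable names are illustrative, and the use of the decimal module is one possible remedy chosen here, not something shown in the talk.

```python
from decimal import Decimal

# Binary floating point: 3.3 has no exact binary representation.
float_two = 3.3 + 3.3
float_three = 3.3 + 3.3 + 3.3

# Decimal arithmetic keeps the base-10 digits exact.
decimal_three = Decimal("3.3") + Decimal("3.3") + Decimal("3.3")

if __name__ == "__main__":
    print("two terms equal 6.6:    ", float_two == 6.6)
    print("three terms equal 9.9:  ", float_three == 9.9)
    print("three-term float sum:   ", repr(float_three))
    print("decimal sum equals 9.9: ", decimal_three == Decimal("9.9"))
```

With doubles, the two-term sum compares equal to 6.6 but the three-term sum does not compare equal to 9.9, while the decimal version does; on the JVM the analogous exact type would be BigDecimal.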
I
The problem with this is that, by itself, if you round it you'll get the right answer, but once you start multiplying it, adding it to other things, subtracting and doing all kinds of computations, the end result can be very problematic. Now, when it comes to dates, I'm seeing that we have a huge issue, and this occurs in medicine, when we're dealing with a lot of events and trying to figure out what happened before and what happened later.
I
It's a big issue. Now, if you want to create a date instance in Java or in Clojure, this is what you get: you get a year, month, date, and then the time, down to the millisecond level.
I
The problem is that in real life we don't deal with events at this precision. If you have a birthday, you're typically dealing with it down to the year, month and day.
I
Now, how do you represent this when your precision goes down to the millisecond level? When you say something like January 2011, I found that different systems in different places compromise in different ways. They say January 1, 2011, or January 31st, or sometimes they go into the middle, January 15th. So when you start comparing dates, what happened before and what happened later, this causes problems.
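One way to avoid picking an arbitrary day is to keep the imprecision explicit. The sketch below, in Python, represents "January 2011" as an interval of possible dates and only claims an ordering when it holds for every date in the interval. The helpers month_interval and definitely_before are hypothetical names introduced for illustration; the talk does not prescribe this particular representation.

```python
import calendar
from datetime import date


def month_interval(year, month):
    """Return (first_day, last_day) for a date known only to the month."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


def definitely_before(a, b):
    """True only if every date in interval a precedes every date in interval b."""
    return a[1] < b[0]


january_2011 = month_interval(2011, 1)
known_event = (date(2011, 1, 20), date(2011, 1, 20))  # a fully known day

if __name__ == "__main__":
    # The three common compromises disagree about the same question.
    for guess in (date(2011, 1, 1), date(2011, 1, 15), date(2011, 1, 31)):
        print(guess, "before the event?", guess < known_event[0])
    # The interval representation declines to answer either way.
    print(definitely_before(january_2011, known_event))
    print(definitely_before(known_event, january_2011))
```

Here neither ordering can be asserted, which is the honest answer when the day is unknown.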
I
So this is one of the biggest areas where I face an issue, in terms of handling dates in medicine. Now, Mike earlier talked about querying for data, where he mentioned, you know, find all subjects with stage four cancer, and things like that. Now, what does subject mean? What does stage four cancer mean? These are the kinds of things that look very simple, but there's a lot of ambiguity when it comes to the exact meaning of these terms, and that's what ontology is all about.
I
So precision is very important when it comes to dealing with data, and it's very important to have the exact meaning, whether it is numbers or dates or terms. They all follow the same pattern: it's very, very necessary to have the exact meaning for these things.
A
Thank you so much. It has opened so many questions, but, you know, at least conceptually, to me, it helped to see the way you are conceptualizing this. And so the last talk before the discussion will be by Lukáš. Yes, oh, you're here. Oh, hello, hello. Can you hear me? Yes? Yes.
J
Unfortunately, anonymization does not work. Basically, you can cross-reference and de-anonymize data sets. It happens all the time; it's really horrible, and there are some horror stories you can read online. So we need something better. Fortunately, we now have differential privacy. Researchers now call it the gold standard for privacy, and the US Census has already been using differentially private algorithms. Big corporations are doing it, and there are more and more open source tools, but the idea is not widely known, which is why I'll try to explain what it is.
J
So the general idea is: your private data is safe if the query cannot even reveal whether your data is there in the data set or not. Here we have a data scientist. He's looking at some query results and he cannot even figure out whether these are coming from the full data set or from one where all your data is missing, and this holds for all individuals in the data set.
J
Basically, this parameter epsilon here expresses the trade-off between privacy and utility. For perfect privacy we would have epsilon equal to zero; this thing would then be one, and for all pairs of such data sets we would not be able to distinguish between the two data sets. So we have to set this parameter to something higher than zero. This value should not be high, but it has to be higher than zero.
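The slide being described is not part of the transcript. For reference, the standard definition of epsilon-differential privacy, which matches the description given here, can be written as follows. The symbols are notation chosen for this note, not taken from the talk: M is the randomized query mechanism, D and D' are any two data sets that differ only in one individual's data, and S is any set of possible outputs.

```latex
\Pr[\, M(D) \in S \,] \;\le\; e^{\varepsilon} \cdot \Pr[\, M(D') \in S \,]
```

With epsilon equal to zero the factor e^epsilon is exactly one, so the two output distributions coincide and the result cannot depend on anyone's data at all, which is why a useful query needs a small but positive epsilon.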
J
We call it a privacy budget, and this is still a very strong property. So this is the definition and the desired property. To actually achieve this, we need to use random noise. We can either add random noise to the data itself, which is called local differential privacy, or we can add noise to the query results, which is called global differential privacy. In that case there would be a trusted curator.
J
We would send the query to the trusted curator, who would compute the query results from the data set, add the right amount of noise and send it back to us, so we would have this interactive mode of work here. It seems the whole differential privacy field is moving more towards this model, where there's a trusted curator and the noise is added to the query results, not to the data itself.
J
We need enough noise to protect individuals' privacy, but not too much, so that analysis is still good. Even machine learning with differential privacy is possible. It's all fascinating, but I have to move fast, and before I show something that works, I need to talk about OpenMined.
J
We are an open source community, working on all kinds of privacy-related technology. There will be a free online conference at the end of September; if you're interested, check out those links. I'm a member of the differential privacy team in OpenMined, and we have a little Clojure library for differential privacy.
J
It's
actually
a
wrapper
for
a
Java
library
from
Google
and
there's
great
value
in
having
this
Java
library
under
the
hood,
namely
in
differential
privacy.
We
have
to
worry
about
attacks
on
implementation,
so
this
is
a
little
similar
to
cryptography
in
that
way.
So
having
this
audited
and
battle-tested
library
under
the
hood
is
a
great
benefit
for
us.
All
right,
so
I'll
show
you
a
notebook,
real
quick.
So
here's
a
bar
chart
showing
counts
of
visits
per
hour,
so
in
the
restaurant
right.
J
So
the
red
bars
are
showing
true
unaltered
and
real
numbers
of
visitors
per
hour
in
the
restaurant
and
the
blue
bars
are
showing
the
same
values,
but
with
a
little
bit
of
noise.
The
noise
is
generated
randomly;
it's
Laplacian
noise,
if
you're
curious.
So
you
can
see,
the
blue
bars
are
slightly
different
than
the
red
bars.
J
There's
some
distortion,
but
not
too
much
so
you
can
still
see
the
pattern
right
and
so
even
from
the
blue
bars
themselves,
you
can
you
can
see,
for
instance,
when
the
restaurant
is
more
busy
and
to
compute
these
differentially
private
values,
that
is,
to
add
the
right
amount
of
noise,
we
need
to
use
functions
from
the
library,
here's
a
count
function
for
counting
visits
right,
so
it's
basically
it
takes
a
collection,
and
then
you
need
two
extra
parameters.
The
first
one
is
privacy
budget
and
the
other
parameter.
J
Long
story
short
means
how
much
an
individual
can
contribute
to
the
algorithm's
result
and
in
general
the
more
an
individual
can
contribute
to
the
algorithm's
result.
The
more
noise
needs
to
be
added.
I
hope
it
makes
sense.
It's
all
about
hiding
individuals'
contributions.
So
you
can
compute
a
differentially
private
count,
you
can
compute
a
differentially
private
sum
of
elements
from
a
collection,
also
a
mean,
and
some
other
functions
will
be
available.
It's
all
work
in
progress.
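The two parameters described above can be illustrated with a minimal sketch of a noisy count. This is not the API of the Clojure library or of the Google Java library it wraps; the function names, the `max_contribution` parameter name, and the use of Python are all assumptions made for illustration. The sketch adds Laplace noise whose scale is the maximum contribution of one individual divided by the privacy budget epsilon, so a larger contribution or a smaller budget both mean more noise.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    # The difference of two i.i.d. Exponential(rate = 1/scale) variables
    # follows a Laplace distribution with location 0 and the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(items, epsilon: float, max_contribution: int,
                  rng: random.Random) -> float:
    """Return len(items) plus Laplace noise calibrated to epsilon.

    epsilon          -- privacy budget (hypothetical parameter name)
    max_contribution -- how many items one individual can add to the count
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be greater than zero")
    if max_contribution <= 0:
        raise ValueError("max_contribution must be positive")
    scale = max_contribution / epsilon  # more contribution => more noise
    return len(items) + laplace_noise(scale, rng)


if __name__ == "__main__":
    rng = random.Random()
    visits = ["visitor"] * 120  # 120 visits in one hour (made-up number)
    print(private_count(visits, epsilon=1.0, max_contribution=1, rng=rng))
```

A naive floating-point sampler like this one is exactly the kind of implementation that can be attacked, which is the reason given in the talk for relying on an audited library; the sketch is only meant to show how the two parameters interact.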
A
Discuss
now
we
have
17
minutes
to
the
official
end
and
for
some
people
it
is
late
hour
at
night,
but
I
guess
after
the
official
end,
some
of
us
may
wish
to
stay
more
because
of
all
the
trouble
in
the
beginning.
F
F
E
I
guess,
yeah,
Datomic's
not,
you
know,
the
world's
fastest.
Actually,
we
are
working
with
Cognitect,
who
are
building
a
query
optimizer
engine
for
us,
since
they
have
something
against
putting
that
into
Datomic
itself.
So
we
we've
been
doing
some
work
on
that.
So
it's
not
the
fastest
graph
database
in
the
world.
H
So
I
have
a
question
to
Mike
as
well,
so
if
I
understood
correctly,
the
database
has
a
graph
model
or
or
which
model
if
it
is
on
the
graph.
E
I
I
think
I
I
think
we
chose
a
graph
representation
because
it
was
necessary
for
the
complexity,
complexity
of
the
of
the
domain
of
the
data
domain,
and
we
chose
Datomic
over
another
graph
database
for,
you
know,
more
social
reasons
than
anything
else.
E
That
was
before
my
time
on
the
project,
actually.
I
mean,
at
the
time,
you
know,
there
were
other,
like,
RDF
databases
which,
from
the
kind
of
representation
and
query
perspective,
are
pretty
similar
to
Datomic.
H
Yeah
yeah,
you
have,
I
mean
my
question
was
which
was
the
the
model
you
use.
So
it's
a
graph.
E
Yeah,
it's
it's
a
it's
a
graph,
we've
sort
of
imposed
a
schema
over
it,
which
was
you
know
the
subject
of
that
that
that
graph
diagram
I
have
so
there's
there's
about
30
classes
and
a
class
can
have
you
know
in
our
case.
Sometimes
you
know
many
hundreds
of
thousands
of
entities
associated
with
it.
For
instance,
like
I
think
your
measurements
are
the
biggest
class
in
CANDEL,
because
that
represents
basically
every
piece
of
information
you
get
from
a
sample.
E
So
there's
you
know,
there's
you
know
hundreds
of
subjects,
but
you
know
hundreds
of
thousands
of
of
measurements
and
some
more
samples
or
somewhere
in
between
that
and
then
so.
And
one
thing
I
didn't
talk
about
so
I
wrote
a
system
called
Alzabo,
which
generated
that
diagram
and
manages
the
schema.
So
it
makes
it
a
little
bit
nicer
to
to
create
and
manipulate
schema
level
information
with
Datomic
and
that's
something
I'm
working
on,
I'm
hoping
to
open
source
soon.
G
Just
wanted
to
say,
I
love
seeing
that
you're
able
to
open
source
the
ClojureScript
Blockly
wrapper.
That
seems
like
a
really
cool
thing
for
being
able
to
expose,
you
know,
computational
functionality
to
folks
who,
who
aren't
you
know
ready
to
dive
into
code
yet.
I
So
most
of
the
most
of
the
ones
like
Vega-Lite
and
all
don't
have
that
functionality
you
have
to
dip
into
either
D3
or
or
something
else
like,
I
think,
loom
is
another
one
that
has
some
functionality,
but
otherwise
there's
not
a
whole
lot
of
choice
out
there.
G
Yeah,
I
think
you
can
use
Vega
for
doing
some
of
that.
I
don't
I
mean
you
can
do
some
graph
work
directly
from
Vega-Lite
and
some
of
the
layouts
have
gotten
a
little
bit
better
there
recently
I'll
just
end,
I'm
always
pitching
Vega.
The
team
is
just
one
thing
I
love
about
them
is
the
team
is
just
so
responsive
to
to
requests
and
questions,
and
I
had
thrown
out.
G
I
was
trying
to
build
a
phylogenetic
tree,
visualization
toolkit
and
didn't
have
the
right
layout
to
to
kind
of
make
things
look
right
for
a
phylogenetic
tree
and
they
added
it
like.
I,
I
don't
remember
it
was
like
a
few
weeks
later
like
it
was
pretty
pretty
awesome,
but
but
yeah,
I
think
I
think
you
can
I'd
have
to
look,
but
I
think
you
can
do
if
you
can't
do
it
with
Vega-Lite,
I
think
you'd
be
able
to
do
some
basic
stuff,
like
that,
with
Vega.
C
And
you
can
do,
I
think
you're
right
like
basic
stuff,
but
I
I've
talked
to
the
IDL
folks
as
well
about
some
of
this
and
and
they
realize,
there's
there's
a
lot
of
limitations
there
and
it's
just
not
high
on
their
radar.
It's
not
something
that's
really
going
to
get
on
the
roadmap
very
soon.
C
I
I
Now
I'm
not
a
programmer
or
a
developer,
so
I
do
most
of
the
Clojure
work
from
a
hobby
perspective,
and
so
one
of
the
challenges
that
I
find
is
you
know
you
guys
are
showing
some
amazing
tools
and
all
that
stuff.
But
the
the
learning
curve
to
get
set
up
with
any
of
these
tools
is
quite
a
lot
and
if
you're
not
working
with
them
constantly,
then
it's
really
difficult
to
wrap
your
mind
around
it.
I
You
know
to
see
how
it
works
and
all
that,
so
I
think
I
think
one
of
the
things
that
will
help
is
if,
if
there
is
any
chance
I
know
all
of
a
lot
of
this
is
like
open
source,
voluntary
work
and
all
that,
but
it
will
help
as
if,
if
you
can
put
together
a
small
hello
world
kind
of
thing,
for
each
of
the
things
and
and
and
and
you
know,
go
from
there
work-
take
a
single
example
and
make
it
more
and
more
complex,
rather
than
giving
different
examples.
C
Yeah,
I
think,
there's
actually
two
pieces
to
that
question
or
point
one.
Is
the
idea
of.
C
Getting
the
infrastructure
up
and
working
at
all
and
then
on
the
second
one.
The
second
part
is
understanding
the
details
of
the
particular
system
or
library
or
whatever,
so
those
are
kind
of
two
different
things.
In
my
in
my
mind,
for
something
like
Saite,
you
really
just
need
Java
8
right
now.
It
still
hasn't
really
been
ported
to
11,
but
that's
all
you
need
and-
and
it
has
it's
a
self-contained
uberjar,
and
then
it
will
even
install
the
MKL
libraries
for
Neanderthal
for
you.
C
So
you
don't
and
you
don't
need
I
mean
well,
you
still
have
to.
If
you
know
Emacs,
the
editors
are
all
basically
Emacs-like;
they're,
not.
D
C
L
A
K
G
Yeah,
if
I
can
add
something
to
this
too,
I
think,
and
I
didn't
have
a
lot
of
time
to
touch
on
this,
but
one
of
my
goals
with
the
Docker
environment
that
I
put
together
and
demonstrated,
is
that,
while
I
think,
as
Jon
Anthony
was
pointing
out,
Clojure,
by
virtue
of
being
on
the
JVM,
is
pretty
good
about
just
kind
of
running
and
being
sort
of
repeatable
and
not
being
different
on
different
systems,
et
cetera,
but
now
that
we're
kind
of
bridging
out
into
the
space
of
creating
better
interfaces
with
languages
like
Python
and
R,
and
especially
with
Python,
G
Some
of
those
environmental
kind
of
complications
can
be
quite
challenging.
I
mean,
if
you've
worked
with
Python
for
more
than
any
period
of
time
at
all,
you've
you've
probably
had
an
issue
with
virtual
environments,
and
this
version
of
that
not
working
with
that,
and
it's
just
it's
still
a
mess
there.
Unfortunately,
and
so
one
of
my
goals
with
this
was
just
to
put
together
this
package
that
it's
all
baked
together,
it's
all
baked
into
a
clojupyter
jar.
G
So
if
you
again,
if
you
like,
using
it
from
the
kind
of
notebook
environment,
that's
easier
for
some
folks
to
get
started
with,
then
you
have
these
tools
put
together
there,
if
you,
if
and
if
you
want
to
branch
out
into
kind
of
working
more
directly
from
a
REPL
connection,
as
I
was
demonstrating
and
and
as
Jon
Anthony
was
kind
of
pointing
out,
then
then
that
stuff
is
all
kind
of
there
for
you,
too,
so
yeah.
G
So
I'd
encourage
people
who
are
kind
of
interested
in
building
environments.
Like
that
I
mean
I
realized
what
I
what
I
put
together
here
is
very
kind
of
specifically
focused
on
this.
G
These
kind
of
Polis
data
sets,
but
I'd
love
to
see
people
taking
kind
of
the
guts
of
that
docker
environment
and
sort
of
repurposing
them
for
other
uses,
because
I
think
that
there's
a
lot
that
we
can
do
now
that
we
have
this
ability
to
to
stitch
together
these
different
these
different
ecosystems,
but
it
it
becomes
it's
a
problem.
G
If
you
have
to
set
up
those
environments
yourself,
I
mean
it's
hard
to
set
up
a
python
environment,
setting
it
up
to
merge
with
Jupyter
and
with
all
this
Clojure
stuff.
It
can
be
a
little
bit
daunting
if
you're
not
oriented.
So
just
throwing
that
in
there
yeah.
I
totally
agree.
C
With
that,
I
do,
on
the
other
hand,
I
would
say:
Python
is
hands
down
the
worst;
R
isn't
anywhere
near
as
bad.
Yeah.
I
agree.
B
But
on
that,
may
I
ask
a
sevaran
like
whether
or
not
you
have
like
a
some
sort
of
a
reference
point
right
when
you
say
like
maybe
getting
started,
is
not
as
good
in
Clojure
or
it's
not
as
easy
like
you
have.
What
are
you
comparing
that
to
and
what
is
it?
Is
it
the
tutorials?
Is
it
the
the
docker
container
that
would
make
it
easier.
I
So
I
actually
love
working
in
Clojure,
so
I
had
actually
given
up
on
doing
any
programming
work
about
about
10
years
back
so
before
that
I
used
to
do
is
again
do
a
lot
of
stuff
in
using
java,
mainly,
and
then
I
got
tired
of
java
with
all
its
verbosity
and
everything
right.
It
was
painful
to
set
up
and
take
down
every
single
thing,
so
move
more
into
management.
But
of
course
you
know
the
the
development
itch
was
always
there
and
then
came
across
Clojure
and
yeah.
I
It
was
a
steep
learning
curve,
but
I
I
stuck
to
it
and
I
actually
enjoy
working
with
Clojure,
so
pretty
much
everything
I
do
now
is
in
Clojure,
and
so
don't
get
me
wrong
when
I
say
that
you
know
the
the
there
should
be
better
tutorials
or
better
hello-world
kind
of
things.
I
think
it's
it's
even
after
five
years
I
find
that
you
know
with
some
of
the
new
libraries
when
I
want
to
bring
them
in
and
start
using
them.
I
It's
not
that
easy
to
to
get
to
get
up
to
speed
with
them
and
and
and
many
of
them
don't
seem
to
kind
of
like
what
work
in
the
way
that
that's
described.
Maybe
it
is
because
of
version
differences
or
whatnot,
but
there
are
still
challenges
now,
a
lot
of
them
work
very
well.
It's
just
that
getting
up
and
running
with
them
can
be
made
a
little
bit
easier.
G
Yeah,
I
I
I
don't
want
to
stray
away
from
this
topic
too
much,
because
I
think
it's
an
important
one.
But
there
are
some
questions
I
wanted
to
sort
of
throw
out
there
around
some
of
the
kind
of
interactive
features
that
in
particular
what
you
were
kind
of
demonstrating
Daniel,
as
well
as
what
you're
working
on
Jon
and
some
things
that
I've
been
thinking
about
for
Oz.
So
I
guess
I'll
I'll
put
that
out
there
that
I'd
like
to
at
some
point
asking
questions
about
that.
G
But
I
think
if
I'd
also
like
to
leave
more
space,
if
folks
want
to
continue
talking
about
this
particular
thread.
M
So,
for
example,
I
think
that
I've
seen
three
different
notebook
kinds
of
solutions
demonstrated
here,
either
as
the
main
event
or
some
factor
in
a
presentation.
There
was
clojupyter
and
Oz
and
Saite
and
Tablecloth,
and
I'd
be
interested
to
learn
more
about
what
they're
based
on
and
how
they
differ.
How
you
choose,
which
one
what's
possible
to
accomplish
whether
their
strengths
are
mainly
in
the
discovery
mode
like
a
REPL,
a
rich
REPL,
or
a
presentation
mode,
a
supercharged
powerpoint.
A
Yeah,
so
I
guess
one
way
to
address
that
is
to
have
a
sequence
of
small
meetings
where
we
will
be
discussing
each
and
every
tool
of
these.
I
think
both
Saite
and
Oz
have
had
some
detailed
presentations
that
are
available
in
video
and
but
they
have
evolved
since
then,
and
yes,
we
need
to
keep
describing
what
we
are
having
here
and
let
us
have
that
in
further
more
meetings,
and
we
are
now
at
the
official
end
and
in
a
moment
we
will
say
goodbye.
N
I
have
a
quick
question
for
lucas.
Perhaps
maybe
I
don't
understand
it
well
enough.
I
actually
have
access
to
a
large
number
of
customer
contact
center
transaction
logs
involving
you
know:
digital
chats
phone
calls,
web
self-service
sessions,
etc.
I
was
wondering
if
his
privacy
package
might
be
applicable
to
create
larger
and
scalable
data
sets
sufficiently
protecting
individual
privacy
or
whether
or
not
that's
not
the
purpose
of
the
library,
I'm
just
curious
about
whether
it
could
be
applied.
That
way.
J
So
it's
not
production
ready
yet.
Can
you
hear
me?
Yeah.
So
yeah,
this
is
exactly
financial
data.
Medical
data
data
about
crime,
this
type
of
stuff
is
exactly
what
we
care
about,
why
these
tools
are
there.
Yes,
so
there's
in
general,
there
is
this
need
to
protect
some
data
and,
at
the
same
time,
somehow
allow
analysis
right
or
even
machine
learning
and
there's
there's
so
much.
J
We
could
do
with
machine
learning
on
medical
data,
for
instance,
but
it's
all
locked,
it's
it's
somewhere
there
in
hospitals
and
it's
for
for
good
reasons.
It's
protected,
it's
private
and
the
same
thing
with
financial
data.
Yes,
that's
exactly
what
what
these
things
are
for.
What
I
just
showed
you
is
something
very
small:
it's
all
work
in
progress,
not
production
ready
at
all,
but
I
encourage
you
to
go
to
openmined.org.
We
have
all
kinds
of
solutions
there
there's
the
educational
materials
we
we're
all
about
that.
J
G
Yeah
and
if
I
can
ask
my
question
of
daniel
real
quick
so
that
you
had
the
thing
with
the
slider,
I
think
this
is
maybe
hopefully
a
short
question,
but
you
had
the
thing
with
the
slider
where
it
updated
the
visualizations
and
you
made
it
sound
like
the
slider,
was
updating
state
on
the
server
or
you
know
in
the
jvm
process,
and
that
then
that
was
triggering
processing
that
that
flowed
back
through
to
the
browser.
So
you
said
that's
using
jfx,
is
that
right,
or
JavaFX?
G
A
G
A
Just
say
briefly,
and
afterwards
I
will
just
make
a
video
about
that
on
one
day
and
yeah,
but
I
guess
we
have
seen
a
different
dashboard
like
systems
today,
one
by
john,
which
was
magnificent,
and
what
john
was
showing
us
was
how
we
can
manage
states
in
the
browser
in
the
client
and
use
the
back-end
JVM
just
for
computation
and
as
a
data
source
and
a
similar
library.
A
A
It
is
a
wrapper
of
JavaFX
for
building
desktop
applications,
but
Notespace
does
not
use
the
whole
cljfx.
It
only
uses
the
core
of
cljfx,
which
is
used
for
managing
state
in
the
JVM,
and
this
core
is
similar
to
what
you
may
see
in
ClojureScript
libraries
for
client
development,
but
kind
of
different
and
refreshing.
A
So
thank
you
for
that,
and
I
guess
maybe
that
is
a
good
moment
to
say
goodbye
and
thank
you
so
much
for
everybody
here
who
have
been
here
in
unusual
hours
in
the
end
and
beginning
of
their
day,
and
I
guess
after
the
goodbye
we
can
stay
and
keep
checking
and
keep
recording
if
you
wish.
A
But
at
that
moment
thank
you
so
much
to
everybody,
everybody
who
needs
to
leave
and
see
you
next
time
and
we're
only
scratching
the
surface,
of
course,
and
we
need
to
talk
again
more
and
more
and
now
let
us
keep
chatting
if
you
wish.
H
H
So
my
my
question
would
be:
is
there
some
kind
of
of
mentoring
activity
around
and
some
anyone
knows
about?
I
mean
I,
I
know:
there's
ClojureFam;
I
signed
up
to
it,
but
unfortunately
the
the
problem
is
that
there
weren't
enough
mentors.
H
So
it's
it's
a
little
bit
disappointing
that
I
mean
it's
nice
to
have
this
one
person
who
knows
more
than
the
rest,
so
that
person
can,
you
know,
lead
and
help
the
others
yeah.
That's
that's.
That's
mainly
my
my
question.
H
O
Well,
we
did
try
to
do
some
mentoring
work
on
Athens.
O
I
was
involved
in
Athens
development
for
a
while,
which
is
a
tool
to
build
a
knowledge
base,
a
clone
of
Roam
basically,
and
our
strategy
was
to
get
more
Clojure
developers
by
teaching
Clojure,
and
then
we
tried
to
match
up
mentors
and
and
mentees,
and
it
seemed
really
simple
on
the
surface
because
we
wanted
to
make
something
that
was
scalable
and
we
just
needed
to
attract
people.
O
But
when
you
get
into
the
weeds
it
gets
difficult
to
organize,
because
you
need
really
invested
mentors
and
really
invested
mentees
and
what
you
do
when
some
people
sign
up
and
don't
really
want
to
to
contribute
and
other
people
have
different
motivations
in
mind.
But
that's
kind
of
that's
the
low
end
story.
There
are
cool
parts
as
well,
because
sometimes
they
just
hit
it
off
and
it
works
fantastically
for
a
while.
O
I
was
doing
code
reviews
on
a
mentee
who
was
working
through
Clojure
for
the
Brave
and
True
and
he
put
his
exercise
in
a
gist
and
later
on
YouTube.
And
then
I
did
code
reviews
which
was
interesting
yeah.
So
an
interesting
experience,
I
don't
know
whether
that
really
answered
your
question.
I
didn't.
I
don't
think
I
understood
perfectly
what
you
were
asking
for.
H
Yeah,
in
fact,
ClojureFam,
I
think
it's
organized
by
the
same
people
from
Athens,
so
yeah.
N
No,
not
about
Clojure
form
itself,
but
just
you
know
specific
things
that
I
get
blocked
by
and
I'm
expecting.
Somebody
would
have
a
quick
answer
and
some
of
those
forums
have
dozens
of
people
signed
in,
but
nobody
really
active.
So
I'm
just
trying
to
figure
out
where
the
action
is
taking
place.
N
I
used
to
be
a
Lisper
way
back
when
Interlisp
was
my
favorite
language
on
a
Xerox
Dorado
36
machine,
but
you
know
in
intervening
years.
I
got
tired
of
Java
and
it
was
very
verbose
and
you
know
the
whole
Rich
message
about
that
was
right
on.
So
I
just
recently
just
five
six
months
ago,
re-engaged
with
Clojure
and
I'm
having
more
fun
than
you
are.
I
suspect,
so
I'm
really
trying
to
get
back
into
it.
Myself.
A
Beautiful
and
Leandro,
I
I
just
wish
to
say
that
I
remember
your
ideas
about
mentoring
and
the
discussion
we
have
had
about
it,
and
I
think
you
have
had
beautiful
vision,
a
beautiful
vision
of
how
we
could
organize
a
mentoring
system
where
we
could
help
each
other,
and
let
us
try
it
out.
I
think
it
will
be
challenging
as
Theodore
said,
and
the
challenging
thing
would
be
to
find
some
continuity,
some
people
who
can
commit
to
a
process
and
let
us
keep
trying.
H
Yeah,
actually
I
got
into
Clojure,
I
mean,
the
strongest
way
through
which
I
got
into
Clojure
was
via
a
Clojure
developer
I
met
here.
He
was
living
here
in
Buenos
Aires
and
we
got
together
and
we
did
programming
and
it
was
really
awesome
and
it
was
very
important
to
have
someone
to
ask
about
how
do
you
do
this
simple
little
thing
in
Emacs
like
this
way?
Okay,
then,
okay
I'll
keep
on
working.
H
For
instance,
like
a
couple
of
days
ago,
I
was,
I
was
swamped
because
I
wanted
to
create
a
new
namespace
and
like
a
new,
a
new
file.
Actually-
and
I
wanted
I
mean-
and
I
and
I
was
asking
myself:
isn't
there
a
way
to
automatically
create
it
like
so
Emacs
will
take
care
of
it
like
Eclipse
does
when
you
want
to
create
a
new
class.
Something
like
this
like
like
that,
and
I
ended
up
doing
it
by
hand.
H
I
know
there
are
also
very
there's
very
good
documentation
from
Practicalli;
that
that's
a
very
good
project,
I,
like
it
very
much
but
yeah,
I
mean,
of
course
it
cannot
cover
everything,
and
those
are
the
things
that
I
could
not
easily
find.
G
Yeah,
if
I
could
throw
something
out
there,
I
feel
like
there's,
there's,
I
think,
kind
of
the
key
to
finding,
and
this
is
kind
of
in
response
to
your
your
question,
Sam.
I
I
think
that
there's
a
little
bit
of
a
it's
like
it.
Sometimes
it's
not
just
finding
the
right
place,
but
like
the
right
space
within
the
place,
if
that
makes
sense.
G
So,
for
example,
if
you're,
if
you're
on,
if
you're
on
the
Clojurians
Slack
group,
you
know
you
might
ask
in
a
general
room
some
kind
of
question
about
whatever
and
really
not
get
much
of
a
response,
but
then
find
that
there's
actually
a
more
focused
group
somewhere,
say
talking
about
DataScript
or
Datomic,
where
you
can
get
answers
to
questions
really
quickly,
and
so
I
think
it
depends
a
little
bit
on
the
community
and
like
what
the
particular
question
is
and
yeah.
G
I
wish
I
had
a
better
answer
than
like
find
the
right
place,
but
I
think
for
the
data
science
community,
a
lot
of
us
are
on
are
on
Zulip.
Now
there
is,
I
mean
people
are
still
on
the
data
science
channel
in
the
Clojurians
Slack,
but
there's
a
lot
of
activity
in
Zulip
as
well,
and
so
so
yeah,
I
think
it
kind
of
comes
down
to
what's
the
question
and
I
think
sometimes
with
tooling
questions
which
they
may
feel
sort
of
ancillary
or
you
know
secondary
to
you,
know
kind
of
core
pro
core
questions
about
how
things
work
in
the
language
say,
but
they're
also
really
important
right
because
there's
little
those
little
bits
of
you
know,
I
mean
so
much
of
my
workflow
being
something
that
I
can
be
fast
and
effective
in
comes
down
to
just
how
I
have
my
editor
and
my
environment
set
up
right,
and
it's
sometimes
hard
to
know
where
to
ask
those
questions.
G
But
but
yes,
I
just
say
like
poking
around,
and
you
know
you
know
not
being
afraid
to
ask
in
multiple
places,
sometimes
to
kind
of
figure
out
where
what
questions
get
answered
with
what
frequency
is
kind
of,
maybe
the
best
advice.
I
could.
G
Give
also,
I
guess
one
further
thing
like
asking
questions
on
Stack
Overflow,
I
feel
like
I
don't
know
it
maybe
doesn't
quite
happen
as
much
as
it
used
to
in
the
Clojure
space,
but
it's
great
that
when
you
do
ask
questions
there.
If
it's
like
the
right
kind
of
question,
you
know
those
things
are
then
really
easily
searchable,
which
is
which
is
nice
and
can
be
improved
upon
again.
You
know.
C
M
N
M
G
Yeah,
the
Google
Clojure
group
used
to
be
kind
of
the
place
to
go.
I
think-
and
that
has
I've
noticed
way
less
activity
there.
So
that
makes
sense,
that's
kind
of
where
that's.
A
P
A
Yeah,
so
we
are
thinking
about
it.
A
It
will
be
on
an
hour
which
is
more
comfortable
for
the
friends
in
East
Asia,
which
is
earlier,
and
what
we
are
planning
to
do.
More
is
small
meetings
like
a
small
interview
of
one
person
telling
their
story
and
a
few
others
listening
and
asking
questions.
This
kind
of
format
is
something
that
we
are
going
to
do
more,
because
we
want
to
have
also
more
focused
discussions
alongside
the
the
public
ones.
Does
it
make
sense
any
any
thoughts
about
it?
P
Yes
sure,
because
it
is
wonderful
to
have
the
chance
to
to
have
access.
I'm
living
in
Buenos
Aires,
Argentina,
so
here
the
Clojure
community
is
almost
non-existent,
so
having
the
chance
to
have
direct
access
to
to
more
major
communities.
All
around
the
world
is
wonderful
for
us.
So
that's
why
I'm
interested
in
keeping
in
touch
with
you.
G
I'll
add
that,
well
I
I
yeah,
while
that
I
really
like
having
kind
of
more
focused
sessions
on
you,
know
one
person
talking
about
a
thing
that
they're
doing
being
able
to
dive
in
a
little
bit
deep
on
that.
G
It
is
also
really
refreshing
to
every
now
and
then
have
something
kind
of
closer
to
this
format
with
lightning
talks
and
a
little
more
open
discussion,
because
it's
yeah,
it's
you
can
kind
of
like
take
in
a
more
bird's
eye
view
of
what
folks
are
working
on,
and
it's
been
fun
today
to
to
see
that
from
folks.
G
I
Yeah,
I
agree
with
that
sentiment
too
and
regarding
the
smaller
meetings
right,
what
about
something
like
doing
workshops
on
a
regular
basis
where
one
person
like
Chris
or
somebody
can
you
know,
go
with
the
tools
that
they're
doing
and
start
from
scratch
and
build
up
a
project
and
work
through
some
small,
well-defined
problem
in
a
couple
of
hours
yeah.
I
love
that
you
know
everybody
else
can
follow
along
with
that.
G
Yeah
that's
great
and
that
that
seems
to
thread
into
some
other
stuff
we
were
discussing
with
you
know:
how
can
we
onboard
more
people
and
yeah?
I
like
that
idea.
I
I
think
having
it
recorded,
I
think,
is
always
useful
right,
because
we
can,
you
know
a
lot
of
times.
It's
yeah
you
want
to
follow
along,
but
it's
not
that
easy
many
times
practically
and
you
may
want
to
go
back
and
pause
it,
get
your
environment
in
order
right,
get
and,
and
then
start
you
know
continue
from
there.
I
So
I
think
recording
it
would
be
really
useful
and
part
of
the
reason
is
that
you
know
when
you
have
a
workshop
kind
of
a
thing
you
have
some
discussion
going
on.
People
are
asking
different
kinds
of
questions
right,
and
I
think
that
adds
to
the
value
of
the
of
the
of
the
discussion
of
the
workshop
itself,
rather
than
somebody
just
giving
a
tutorial,
which
is
one-sided,
yeah.
N
C
G
Yeah,
I
think
so.
I
think
what
happened
was
that
there
were
some
changes
to
what
libraries
came:
pre-packaged
with
Java
between
yeah
between
eight
and
eleven
somewhere
and
yeah.
G
It's
subtle:
it's
subtle
things,
because,
while
Clojure,
I
think
by
itself,
is
great
at
maintaining,
you
know,
backwards
compatibility
and
things
kind
of
just
working
years
later,
if,
if
something
breaks
in
the
underlying
or
yeah,
what,
if
something
changes
in
the
underlying
Java
system,
C
G
Kind
of
screwed-
and
so
I
think
there
hasn't
been
too
much
of
that,
but
it's
funny
how
just
the
one
or
two
things
that
have
caused
friction.
They
seem
to
kind
of
grate
again
and
again
so
yeah.
C
And
stuff
from
eight
to
nine,
that
was
a
complete
mess
and-
and
really
I
think
that
was
one
of
the
dumbest
things
they've
ever
done.
But
I
guess,
if
you
can
get
past
that,
then
you
know
things
like
11
going.
You
know
the
big
difference
between
10
and
11
and
11
and
13
or
whatever
is,
is
not
that
bad.
It's
it
was
that
yeah
eight
to
eight
plus
that
killed
everything
I
mean
it's.
It's
amazing
how
many
things
just
don't
even
work
on
on
nine
or
anything
like
nine
plus.
C
So,
like
I
mean
like
Saite
doesn't
really
I
mean
it
kind
of
works
like
on
11,
but
not
really,
and
it's
mostly
because
they
just
haven't
gone
in
and
fixed
all
these
things
that
the
JVM
broke,
and
you
know,
and
it's
bad
as
it
is.
I
mean
I
guess
from
surveys.
It
looks
like
something
like
70,
80
percent
of
all
JVM
shops
still
run
on
Java
8.
N
So
from
where
I
sit,
if
anybody
wants
ideas
for
workshops
connecting
to
you
know
one
thing
I
mentioned
earlier:
connecting
to
SQL
Server
16
or
19,
or
setting
up
a
notebook
or
Notespace-style
interactive
environment
or
any
kind
of
integration
with
libpython-clj
and
just
getting
that
just
basically
working,
I
mean
I
know,
there's
some
readme
stuff
there,
but
even
little
gotchas,
like
you
know,
I
didn't
get
this
version
of
something
right
would
help
unblock
me.
C
These
documents
inside
you
can
save
these
things
and
push
them
out
to
github
and
then
and
then
just
load
them
from
any
given
running
instance
of
the
thing
and
there's
definitely
versions,
you
have
definitely
notebooks
out
there
that
do
JDBC,
next.jdbc
stuff.
it
doesn't
connect
to.
I
mean
I've
used
it
mostly
on
like
MySQL
and
Postgres,
so
I
haven't,
I
haven't
actually
used
it
on
SQL
Server,
but
I
get.
I
actually
think
next.jdbc
is
really
a
solid
piece
of
work
and
it
just
and
and
it's
not
very
high
level.
C
So
it's
unlikely
that
you'll
need
more
than
just,
you
know,
the
adapter
for
the
database.
That's
probably
the
only
thing.
C
Oh
well,
yeah,
okay,
who
knows
yeah,
and
I
you
know,
I
have
documents
that
that
that
run
with
clutches
or
but
it
does
suggest,
as
Chris
points
out
that
this
you
know
these
extra
environment
things
having
to
do
with
other
ecosystems
are
probably
much
more
painful
than
than
anything
associated
with
Clojure
or
the
JVM.
C
C
I
actually
think
I
hate
apple
now
and
I
probably
would
be
more
willing
to
go
with
windows,
except
that
everybody
in
the
labs
uses
Mac.
C
Kind
of
stick
with
the
mac,
but
R
on
Linux
is
dead
easy.
It
just
works.
You
don't
have
to
do
anything;
on
Unix
it's
like
a
no-brainer.
It's
a
complete
no-brainer.
A
A
C
G
Yeah,
I
think
there
are
a
lot
of
fundamental
assumptions
baked
into
Python
and
how
it
thinks
about
libraries.
That
I
mean,
I
think
the
original
sin
is
just:
G
You
cannot
install
multiple
versions
of
a
library
in
the
same
environment
which,
like
that's
so
many
decades
ago
now,
but
whether
you
look
at
you
know,
like
you
know,
obviously
Clojure
or
Java
or,
you
know,
npm
and
JavaScript,
I
mean
that's
just
not
you
know,
we
don't
have
these
these
problems
anymore
and
they
still
haven't
sort
of
and
from
that
we
get
virtualenv
and,
you
know,
all
these
other
challenges.
So
yeah,
it's
it's
it's
a
challenge.
G
I'd
be
interested
in.
I
I
think
yeah,
the
other
thing
that
that
has
occurred
to
me.
Just
recently
we
had
we
had
this
Docker
environment
working
and
all
of
a
sudden,
someone
tried
to
run
it
and
and
it
stopped
working
and
it
all
came
down
to
well.
One
of
these
pip
packages
had
updated,
and
now
they
were
using
they're
compiling
against
a
different
like
lower
level.
You
know
system,
library
and,
and
now
nothing
was
working
right
because
it
couldn't
it.
G
Couldn't
bind
to
that,
and
so
you
know
you
can
tell
people
to
well
just
include
the
versions
of
whatever
pip
packages
or
whatever
you
want
to
install,
but
there's
there's
something
really
nice
about
languages
that
just
force
you
to
or
environments.
I
guess
that
force
you
to
to
include
your
versions
when
you,
when
you
make
your
declarations,
it's
something
I've
come
to
really
appreciate
with
Clojure.
G
You
just
cannot
specify
your
dependencies
without
saying
what
versions
you
want
to
point
to,
and
it
just
kind
of
forces
us
all
into
the
habit
of
like
oh
now,
this
thing's
repeatable
so
yeah.
I
think
that
you
know
it
doesn't.
It
doesn't
solve
the
problem
once
we
start
trying
to
interface
with
the
rest
of
the
world,
but
but
it
is
it's
something.
That's
nice
that
that
we
have
at
least
in
our
core
kind
of
ecosystem.
G
So
I'm
interested
in
thinking
about
I
mean
I
I
don't
know
if,
like
docker
is
the
only
answer
to
this
and
it's
just
like.
We
just
have
to
make
a
container
for
every
possible
like
configuration
of
environments
we
might
want,
or
whether
there's
some
possibility
of
us
kind
of
bringing
a
little
bit
of
that
sanity
that
we
have
in
the
Clojure
world
to
to
maybe
not
to
these
other
worlds
by
themselves,
but
at
least
to
the
way
that
we
interface
with
them.
I
mean
it'd,
be
sort
of
wonderful.
G
If
we
had
an
extension
of,
like,
a
deps.edn
where
you
could
specify
the
the
say,
the
Python
packages
you
wanted
or
the
r
packages
you
wanted
or
whatever,
and
that,
maybe
it
even
forced
you
to
specify
the
versions
so
that
so
we
get
a
little
bit
more
of
that
repeatability.
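The "force you to specify the versions" idea can be sketched as a small check that refuses any dependency without an exact version. No such deps.edn extension exists in the discussion; the mapping of package name to version string, the function names, and the example versions below are all hypothetical, and the only real convention used is pip's `name==version` requirement syntax.

```python
import re

# Accept only exact dotted versions such as "1.19.1"; ranges like ">=1.5" fail.
_EXACT_VERSION = re.compile(r"\d+(\.\d+)*")


def unpinned(packages: dict) -> list:
    """Return the sorted names of packages lacking an exact version."""
    return sorted(
        name
        for name, version in packages.items()
        if not version or not _EXACT_VERSION.fullmatch(version)
    )


def to_pip_requirements(packages: dict) -> list:
    """Turn {name: exact_version} into pinned pip requirement lines."""
    missing = unpinned(packages)
    if missing:
        raise ValueError("exact versions required for: " + ", ".join(missing))
    return [f"{name}=={version}" for name, version in sorted(packages.items())]


if __name__ == "__main__":
    # Illustrative package versions only.
    for line in to_pip_requirements({"numpy": "1.19.1", "pandas": "1.1.0"}):
        print(line)
```

Pinning the Python packages this way does not pin the system libraries they compile against, so it addresses only part of the breakage described earlier.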
G
I
don't
know
if
then
you'd
have
to
like.
Maybe
you
would
have
something
that
wrapped
virtualenv
or
constructed
Docker
images
for
you
or
whatever,
but
it's
interesting
about
how
we
could
sort
of
improve
upon
that
situation.
G
C
C
Is
kind
of
geared
more
towards
just
a
big
blob
that
will
run
your
stuff
everything
everything
is
in
it.
whereas
Docker
is,
yeah,
a
little
bit
more
of
a
DevOps
kind
of
a.
G
Thing
yeah
I've.
When
I
was
last
working
at
the
Hutch
some
folks,
there
were
starting
to
use
it
kind
of
in
the
bioinformatics
space
and
seemed
to
find
that
it
was
a
good,
a
good
fit
for
that.
C
Yeah
it
just
yeah
because
of
all
these,
especially
in
that
space,
you
know,
like
I
have
this
thing,
that
one
of
the
PIs
wants
to
start
pushing
out
to
other
labs
and
was
like
wow.
I
mean
it's,
it's
it's
an
environment
problem,
because
zillions
of
these
C++,
typically
C++
packages
and
they're,
all
you
know
trying
to
build
these.
You
can't
run
it
across
operating
systems.
You
can
build
all
the
binary.
You
build
all
that
stuff,
you
might
work
across.
You
know
if
you
build
it
on
Ubuntu.
C
G
Yeah
I
wish
I
had
a
better
answer.
I
never
I
never
dug
too
deep
into
it,
but
it
seems
I
mean
I
guess
I
it
seems
more
flexible.
I
think,
and
I
I'm
trying
to
remember
whether
it
was
to
remember
what
it
was
that
was
making
folks
sort
of
switch
to
it
at
the
Hutch.
G
It
might
have
had
something
to
do
with
like
ease
of
access
to
say,
local
data
files
or
sort
of
like
that
definitely
yeah,
so
crossing
the
sort
of
file
system
boundary
was
something
that
was
a
little
bit
easier.
That
might
have
been
one
of
the
main
appeals
of
it.
D
C
That's
for
sure
what
is.
K
C
I
So
it's
a
government
organization,
it
says
lbl.gov
for
Singularity.
D
A
Yeah,
so
maybe
I
guess
let
us
continue
for
a
few
more
minutes.
If
that
is
okay,
I
will
have
to
leave
soon
and
I
guess
everybody
is
invited
to
stay.
Is
there
any
other
topic
that
we
wish
to
discuss
today?
G
Can
I
add,
can
I
add
one
thing
for
for
Taylor's
question?
One
thing
I
remember
also
being
an
issue
is
that
if
you're
on,
like
a
shared
compute
system,
where
you
don't
have
sudo
access,
it
can
be
sometimes
difficult
to
get
docker
set
up
to
work
in
those
environments,
and
that's
I
I
was
just
skimming
through
here.
The
docs
and
that's
one
of
the
other
things
I
think
was
appealing
to
folks
about
Singularity
is
that
you
can
kind
of
set
it
up
with
fewer
permissions
or
lower
permissions.
D
C
A
Guix
is
a
package
manager
written
in
Scheme,
in
Guile
Scheme,
and
it
is
based
on
functional
programming
principles
and
allows
for
very
flexible
ways
to
build
systems,
and
there
is
also
some
new
project
called
Hermes,
which
is
like
Guix
but
written
in
Janet,
which
is
a
Clojure-like
lightweight
language,
and
all
these,
I
guess,
are
worth
looking
into,
even
though
they
are
maybe
less
popular
again.
Daniel
yeah.
So
I'll
share
the
link
one
project
I'll
just
send
you
a
link
to
the
chat
yeah.
So
one
project,
I'm
sharing,
is
Guix
and
oh
yeah.
A
Okay,
is
there
anything
else
that
we
wish
to
discuss
today?
I
Yeah,
deep
learning-
and
you
know
mlb
or
any
of
those
kind
of
things.
G
Yeah,
so
the
Smile
library,
which
is
what
it
stands
for,
but
it's
one
of
the
things
that's
backing
the
tech.ml.dataset
and
we'll
take
the
entire
tech.ml
stack
and
it
it
has
some
great
stuff
in
there.
I
was
really
surprised
to
find
out
that
he
had
actually
implemented
UMAP,
which
you
know
was
implemented
in
python,
and
so
I
thought
it
would
be
forever
until
we
had
something
kind
of
a
native
Java,
but
I
was
sort
of
stunned
to
find
that
he
had
done
the
work
there.
C
G
Yeah,
there's
also,
of
course,
with
libpython-clj,
now
we
have
access
to
anything
that
we'd
have
in
python
and
we've
seen
some
great
examples
from
Carin
Meier
who
was
on
earlier
of
you
know,
connecting
to
different
well
the
original
UMAP
and,
I
think,
Keras
and
some
other
some
other
kind
of
Python
libraries.
She's
also
working
on
the
Clojure
bindings
for
MXNet
and
yeah.
I
mean.
I
About
that
area
that
I
find
are
you
know,
there's
it
seems
to
be
there,
but
when
you
actually
go
to
the
websites
and
check
them
out,
the
Clojure
versions
are
not
there.
G
To
my
knowledge,
she's
still
working
on
that-
and
that's
still
I
mean
for
at
some
point-
they
ended
up
featuring
Clojure
on
like
the
main
MXNet
page
as
like
supported
language.
So
I
think
that
one
of
the
one
of
the
arguments,
some
folks
at
least,
have
been
making
about
the
kind
of
value
proposition
there
with
MXNet
is
because
that's
an
Amazon-sponsored
library.
G
You
know
that
you
know
that
you
know
that
it's
gonna
run
well
on,
like
Amazon
Web
Services
right,
so
this
kind
of
this
kind
of
deployment
path
issue
from
from
you
know,
getting
getting
to
your
you
know,
development
environment,
or
you
know,
kind
of
data
science
exploration
environment
to
something
that
can
be
kind
of
productionized
is
going
to
be
a
well-trod
path.
So
it's
one
argument
that
I've
heard
folks
make
for
for
the
MXNet
route,
but
yeah.
G
I
think
it
kind
of
depends
on
what
specifically
you
want
to
do,
and
I
I
I'm
a
little
bit
and
I'm
a
little
bit.
I
think
there's
there's
an
extent
to
which,
like
the
deep
learning
stuff,
has
taken
a
lot
of
air
out
of
the
room
from
other
really
interesting
stuff,
that's
going
on
in
in
the
machine
learning
space.
G
So
you
know
with
stuff
like
I
mean
again,
UMAP,
there's
this
other
kind
of
interesting
dimensionality
reduction
algorithm
called
TriMap,
which
again
actually
Carin
Meier
pointed
out
to
us,
and
you
know,
there's
a
lot
of
stuff
out
there,
so
yeah.
So
it's
for
yeah.
I
can
so
it's
the
second
of
the
dimensionality
reductions.
G
I
showed
off
in
the
the
presentation,
but
let
me
see
if
I
can
get
the
the
Python
thing
for
it
yeah
and
actually
I'm
not
sure
whether
they
should
share
the
I'll
share,
both
the
like
the
Read
the
Docs
page
and
there's
a
link
there
to
the
to
the
academic
publication,
which
I
highly
recommend.
G
Anyone
read
who's
interested
in
math,
because
it's
it's
a
fascinating
paper.
It
gets
into
a
ton
of
really
cool
category
theory
and
like
topology
and
stuff
geometric,
or
what
is
it,
geometric
algebra,
algebraic
geometry,
and
yeah.
It
was
just
it
was
sort
of
refreshing
I
feel
like
so
much
kind
of
machine
learning
ends
up
being
just
very
matrixy
and
it's
just
really
cool
to
read
a
paper
that's
got
a
bunch
of
really
interesting
category
theory
and
stuff.
A
C
Yeah,
I
totally
agree
it's
it's.
It's
kind
of
amazing.
I
C
D
C
Yeah
I
mean
in
general,
though
yeah,
if
you're
going
to
try
and
use
Neanderthal
to
build
a
deep
neural
net
or
just
a
neural
net,
then
you
have
to
you
have
to
do
what
Dragan
is
doing
in
Deep
Diamond,
yeah.
So
it's
it's
more
of
the
low
level
stuff,
but
it,
but
I
just
yeah
I
had
to
use.
I
used
it
in
some
doing
large-scale
injury
calculations
across
a
bunch
of
genomes.
C
And
it
it
and
it's
it's
it's.
I
doubt
that
there's
anything
else
out
there
other
than
maybe
like
tbm,
that's
faster.
I
don't
care
forget
about
it.
You
know
NumPy
or
any
of
this
stuff.
G
Yeah
he's
he's
put
out
some
really
impressive
benchmarks.
It's
been
really
cool
to
see
what
he's,
what
wizardry
he's
been
cooking
up.
C
M
Also
impressive
is
that
Dragan
has
made
his
work
function
with
not
only
the
Intel
but
also
the
AMD
hardware,
yeah,
or
the
NVIDIA
and
AMD
hardware
and
Intel
hardware,
so
that,
if
you're
not
aiming
at
Amazon
Web
Services,
if
you're
aiming
at
desktop
or
laptop
computers,
just
about
any
desktop
or
laptop
computer
might
work
nowadays,
that
is
with
it
with
a
certain
level
of
of
computing
functionality
in
the
GPU.
C
Yeah,
I
I
think
that's
right.
Actually,
we
use
it,
though,
what
some
of
our
servers
are
GPU
servers
and
it
works
there.
I
So
how
about
doing
some
kind
of
workshop
which,
where
you
take
what
Dragan
has
done
right
his
libraries
and
things
like
that
and
doing
something
fairly
simple
like
word2vec
or
doc2vec
or
NLP
kind
of
stuff
and
and-
and
you
know,
as
a
both
both
to
be
able
to
you
know
for
for
a
lot
of
us
who
have
not
used
these
things
to
get
up
to
speed
with
it
and
also
like
it's
a
good
tutorial
on
the
whole
data
science
pipeline.
H
I
And-
and
maybe
you
know
like
Chris,
you
mentioned
Smile
right
so
to
use
something
like
that,
and
I
think
what
would
be
useful
is
the
the
challenge
I
think,
with
any
Clojure
developer
faces
is
that
which
library
to
use
right,
like
even
with
with
the
dates
and
times
you
want?
I
Do
you
want
to
use
java
time
or
time
or
which
one
you
know
so
so
something
just
using
Smile
and
Neanderthal
or
any
of
those
other
libraries,
you
know
maybe
one
week
do
this
one
week
do
the
other
one
and
on
the
same
data
set
for
the
same
outcomes.
You
know
you
want
to
do
that
and
see
how
they
work
out.
C
I
C
M
C
G
Yeah
and
to
add
to
that,
it
also
has
really
good
interlinking
with
the
libpython-clj
stack,
which
he
also
wrote
so
not
entirely
surprising
there
but
yeah.
I
100%
echo
everything
John
just
said,
and
in
fact
that's
that's
the
reason
why
in
this
you
know
new
sort
of
analysis,
notebook
stuff
that
I
was
showing
off.
G
That's
exactly
why
we're
kind
of
going
the
tech.ml
route
because
recognize
that
if
we
go
that
route
we
have
quick
access
to
all
of
the
libpython-clj
stuff
and
and
that
you
know
those
and
as
well
as
Smile
right,
because
it's
built
on
top
of
Smile
and
so
and
and
again
connecting
to
Neanderthal.
You
got
that
there
and
so
those
those
three
and
and
the
the
parquet
and
stuff
I
mean
those
all
those
things
kind
of
collectively
together.
G
I
think
it's
kind
of
rapidly
becoming
a
sort
of
go-to
go-to
place
for
you
know,
get
your
data
in
and
be
able
to
do
a
lot
of
stuff
with
it
pretty
quickly.
C
It
yeah.
D
C
With
the
data
sets
that
don't
even
fit
in
memory,
the
capability
has
that
that's
I
mean
it's
a
game
changer
and
and
given
and
is,
is
Anthony
still
here?
maybe
maybe
maybe
not,
but
you.
C
The
the
that
that
he
was
really
cool
because
he
included
in
that
in
that
benchmark,
the
tech.ml
stuff
and
even
with
the
thing
that
just
floored
me
was
even
the
you
know,
the
very
idiomatic
kind
of
Tablecloth,
you
know,
dplyr
API
version
in
tech.ml
was
almost
I
mean
it
was
within
a
few,
this,
a
tick
or
so
of
the
of
data.table
yeah
in
terms
of
raw
speed
and
data.table.
For
me,
is
you
know
it's
always
been
sort
of
the
gold
standard
for
like
there's,
hardly
not
going
to
be
much
out
there.
N
How
are
you
going
to
decide
the
date
and
time
or
hour
of
day
for
the
next
meeting?
A
Yeah,
so
so
I
guess
next
time
we
will
come,
we
will
have
to
do
some
kind
of
survey
among
the
people
who
are
not
here,
naturally,
so
that
we
see
what
would
be
comfortable
to
the
people
who
couldn't
make
it
on
these
times
and
certainly
we
need
to
diversify
the
times
across
the
day
and-
and
we
just
have
to
have
enough
meetings
so
that
everybody
can
attend
sometimes.