From YouTube: CI WG demo: Brown Dog
Description
Date: 11/11/2016
Presenter: Kenton McHenry
Institution: National Center for Supercomputing Applications (NCSA)
Midwest Big Data Hub
So what is a DIBBs — a Data Infrastructure Building Block — out of NSF? It's basically a call to build components that would make up a data infrastructure for the scientific community, to support their various data needs. These days most scientific domains, if not all of them, require some aspect of digital data and digital software. So supporting that, preserving that, keeping the science reproducible, is a big concern of NSF — and hence DIBBs.
So in the community, one of the things we often do at the very first stage — if not throughout the entire project — is this data wrangling, where we're basically finding data collections, bringing them where they need to be, converting them to formats we can work with, and extracting the subsets of the data that we're interested in —
— we don't want the whole thing, for example — or cleaning the data, filling in gaps or other kinds of things in the data, so that we can do what we're actually interested in: the scientific effort of the project. And so basically, what Brown Dog is aiming to do is really simplify those ad hoc steps that often occur in these things. People will bring together code from all over the place: you've got to find the code to convert X; you've got to find the code to pull out this part;
you've got to figure out how to run the code if you can't find the documentation or don't have the platform, and so forth; or write the code yourself if it doesn't exist. So we're trying to basically eliminate as much of that as possible, so that, in terms of the scientific workflow, you can start right after that with what you're interested in — the science. So, digging into that a little more: data wrangling.
So we start at the upper left here, where basically we have our collections somewhere out there on the web, or maybe on somebody's hard drive, or on somebody's shelf somewhere. Let's assume they're all online for the moment, so there are URLs and filesystems. Or, you know, some datasets we deal with aren't even digitized — they're paper documents. One of those, from ecology, is forest inventories from centuries ago that they would like as numerical data.
That's valuable because it basically gives some way of validating the models that's far older than any digital data source they have otherwise. And so you have your data collections out there, and you basically want to get those. Well, the way the data is stored — assuming it's digital, again — varies widely, so you're dealing with a bunch of different file formats. They could be commercial file formats that are proprietary: you don't have access to the software, or to the specification, to access the contents directly.
So you need some special software to get at the contents. They could be old formats where basically there is no software easily accessible to get at that data. They could be made-up file formats — this happens all the time in science; people just make up their own format. And then there's what we refer to as embedded data, where you've embedded kind of a stream of data within another format — spreadsheets are a great example of that.
We define specifically what a conversion is, to distinguish it from the other kind of transformation. Basically, it's a transformation that largely preserves the entirety of the data — you can potentially go back and forth; it's a one-to-one mapping. In reality that's not the case — you often end up with information loss in these tools — but ideally a conversion is a one-to-one mapping. And so the second step often involved in this data wrangling comes once you have access to the contents, because that's not really the end point.
You might want to find a specific tree within a forest, or do something with specific trees, and so forth — you don't want the entire thing. And so this next kind of transformation is one that takes that data and does something to get the data that you want. We refer to these as extractions: basically we're taking the original data and turning it into —
— turning it into a hamburger, so to speak: some custom derived product. This could take the form of something like metadata that's useful for searching through a large collection of data — you know, tagging this slide, or a piece of data, as having a river in it, or this location as having a river, and so forth. Tags like that, so you can query a large collection to find the stuff you want. Other derived products, such as visualizations — maybe something convenient for previewing —
— the data on the web, thumbnails, stuff like that; cleaned versions of the data sets — things without holes in them — and so forth. All kinds of derived products come out of this kind of transformation. And so this is the second type of transformation that Brown Dog supports, all towards getting to where we want to be: some sort of scientific application, which can take it from here to do analysis — further analysis on the data and so forth.
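The two transformation types just described — conversions that (ideally) losslessly map one format to another, and extractions that boil data down to a derived product — can be sketched as a toy in Python. The formats and field names here are illustrative only, not Brown Dog's:

```python
# Toy illustration of the two transformation types from the talk.

def conversion(rows):
    """CSV-ish rows -> TSV-ish rows: a (near) one-to-one mapping you
    can invert; the whole dataset survives."""
    return [r.replace(",", "\t") for r in rows]

def extraction(rows):
    """Rows -> a derived product (here, a row count and the first
    column as 'tags'); most of the original data is deliberately dropped."""
    return {"count": len(rows), "tags": [r.split(",")[0] for r in rows]}

data = ["oak,12.5", "pine,7.1"]
print(conversion(data))   # ['oak\t12.5', 'pine\t7.1']
print(extraction(data))   # {'count': 2, 'tags': ['oak', 'pine']}
```

The key design distinction is that a conversion is (ideally) invertible, while an extraction is lossy on purpose — it produces the searchable or analyzable product you actually wanted.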
The idea here is that, as we move from left to right, we're moving towards data that's more usable by scientists, versus the raw data at the beginning. So, just as a kind of visualization of this: you have some data — some files, some data set — that comes in. All right, that's what you've got. Step one is you go through this transformation of a conversion, so you get to something you can see.
So in this case it was an image from one of my colleagues — a forest plot with some trees — and the conversion allows you to see the data; you keep the pixels. And then the next step is an extraction. An example of that is we basically spit out anything we can extract from this data. We do this in Brown Dog as JSON, so this is basically a JavaScript array. Lots of web technologies utilize this as a convenient way of spitting out data, and it can be utilized by other applications.
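As a rough illustration of that JSON output — the field names below are hypothetical, not Brown Dog's actual schema — an application can pull the extracted pieces apart with any JSON parser:

```python
import json

# Hypothetical extractor output; the actual fields Brown Dog emits
# may differ -- this only illustrates consuming the JSON.
raw = '''
{
  "filename": "forest_plot.png",
  "tags": ["trees", "river"],
  "metadata": {"green_index": 0.72}
}
'''

result = json.loads(raw)
print(result["tags"])                      # tags usable for querying a collection
print(result["metadata"]["green_index"])   # a derived numeric value
```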
How much foliage is in that image, for example — so it has a numerical value for that; a model of human preferences for green infrastructure and landscape design as well — basically annotating the data with all that information. There's a suite of transformations that Brown Dog will support: it would basically spit out all the information it can for a specific data type and provide that to the user, or to the application that uses Brown Dog, to do whatever they want with it. And so, what kind of data do we need to deal with?
In the scientific community it's very diverse — all kinds of different data. For the past decade or so, working with scientific communities, these are just some of the things we've dealt with: image data, satellite data, radar data, video data, all kinds of different things — and all kinds of different steps that we do on them, such as, for example, land coverage: identifying a piece of a satellite image that has an urban landscape, or a forest, or water.
Things like that; extracting rivers from an old historical map so that you can overlay it on a modern-day location, as those things change over time; hyperspectral data, color corrections, all kinds of different stuff. So the idea here is that it's diverse — lots of different kinds of data sources in the scientific community. And this is actually where Brown Dog gets its name: there are lots of tools out there, and, first of all, we're not in the business of trying to reinvent the tools that exist already.
The idea is to bring them all together into this consistent, coherent service — basically wrangling all those tools. They're also different kinds of tools, whether it be scripts in different languages running on different platforms, different services, open source, closed source, GUI-driven applications — it doesn't matter, so long as they open some file and save it in a different format, or spit out some content that's useful for scientists sifting through data.
We want to use it, and that's basically where the name Brown Dog comes from — a mutt among a lot of sniffing software. So the idea of supporting all those different kinds of technologies in one consistent framework that's easily leverageable by the community is what we're after here. With Brown Dog, the main key factor — again, because of all the different kinds of data within the scientific world — is extensibility: different kinds of software, and being able to change over time.
Those tools — the software used by the sciences — are always pushing the forefront. It's never going to stop; it's always going to change. So being able to easily add new transformations — converters, extractors — to the system is key, and being able to support whatever platforms are out there to run these tools and extractors on is key. And so basically this piece of cyberinfrastructure we're building here is built around the idea of running, potentially, anything as part of the service. And so that's one —
— of the key elements of this. We're also encapsulating that software in a manner that's more preservable. We used virtual machines, and now we've switched over to Docker — Docker used to be Linux-only, but they've announced they're supporting Windows as well. Basically, it's a convenient way of encapsulating not just the tool but also the environment around the tool, to be able to run it in the future as well, and to grow the system elastically and virtually later on.
The second aspect is an API. The idea here is to allow the very diverse scientific community to utilize this across all the applications and tools they currently use — whether it be their own code, whether it be ArcGIS, whether it be whatever else. So we've designed the system specifically to emphasize a programmable interface. That's what an API is: basically a way for programs to talk to it.
So a program can interface with this and say: I have this data; transform it into this. And it does — it returns the data to you. You can do that programmatically, so you can loop through a directory, do it to an entire collection, or whatever else you want to do, and set up programs and other tools to plug in and leverage it as well — again, supporting diverse usage. Because, again, in the scientific community, across all the different domains out there, they have lots of different client applications.
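As a sketch of that programmatic, batch-style use — `convert` below is a hypothetical stand-in for a call to the service, not the real client API:

```python
from pathlib import Path

def convert(path, output_format):
    """Hypothetical stand-in for a Brown Dog conversion call; the real
    service would be invoked over its REST API with the file's bytes."""
    return path.with_suffix("." + output_format)

# A directory listing (hardcoded here so the sketch is self-contained);
# in practice this would come from Path("plots").glob("*.tif").
files = [Path("plots/a.tif"), Path("plots/b.tif")]

# Request the same conversion for every file -- the kind of batch,
# programmatic use the API is designed for.
converted = [convert(f, "png") for f in files]
```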
Publication is a key aspect as well: basically capturing these tools that are being developed within a scientific community to do specific things for their project, but that other people could potentially use too. And all of this is open-source software. So, the Brown Dog API, just briefly: the main interface, again, is a REST interface — a REST programmable interface. Basically that means that in URL form, over HTTP, you issue commands to it at the lowest level: give it a data set, get something back —
— some transformed version of it. And so you go to the service URL at NCSA (an illinois.edu address), and these are the things you can do: authorization and basic housekeeping kinds of things; conversions, where you specify inputs and outputs; extractions, where you specify individual extractors or a whole suite of extractors; and provenance, so you can tell what's going on where. And so this is at the lowest level, and for some people that's fine.
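A minimal sketch of what such a lowest-level REST call could look like from Python — the host name and route layout here are assumptions for illustration only; the real Brown Dog endpoints and authentication are documented with the service:

```python
from urllib import request

BASE = "https://bd-api.example.edu"  # hypothetical host, not the real service

def conversion_url(output_format):
    """Build the URL for a conversion to the given output format
    (route layout is assumed for illustration)."""
    return f"{BASE}/conversions/{output_format}"

def convert_file(data: bytes, output_format: str, token: str) -> bytes:
    """POST file bytes to the conversion route; the response body would
    be the converted file, or a pointer to poll for it."""
    req = request.Request(
        conversion_url(output_format),
        data=data,
        headers={"Authorization": token},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return resp.read()
```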
For others it's not fine, and that's where the client applications come in. But before I get into the clients, I just want to throw out this analogy, in terms of how this fits into either the data infrastructure world of science or the internet of today. It's kind of analogous to DNS. DNS is the domain name service; if it were ever to go away, you would notice immediately — you couldn't go to google.com, because your machine wouldn't be able to translate it. What DNS does is basically translate —
— names into IP addresses, which is what machines need to know to route your traffic. Brown Dog is kind of analogous to that, except it doesn't transform something that simple — it transforms basically anything. The key here is data: file formats, or extracting these derived products. Either way, going with the analogy again, there's the idea of the distributed nature: both of these things, I want to emphasize, don't live at one location, but are made up of a bunch of different pieces that talk to each other —
— to provide the service. I won't get into that too much today, but that's a key aspect of this as well. And so, how this fits into things, in basically three major parts. One is into other data cyberinfrastructure components that are being built out there — I'll touch on those. The second is with regards to the users — you — facilitating your data wrangling needs. And lastly, again, publishing these data transformation tools so that other people can leverage them. So, in terms of the data management landscape, these are just some of those.
These are the DIBBs — the Data Infrastructure Building Blocks — and the DataNets, so the NSF's data portfolio at the moment, and there are a number of these things that deal with aspects of it. Down here are the computational and data storage resources that NSF provides; then, things that kind of abstract away the storage layer; things that provide data sources for communities like astronomy and others — ecology and earth science and so forth — and then service kinds of stuff: curation tools —
— on top of that. So the idea with Brown Dog is basically servicing all these things — not just those building blocks, but everything that has been built and given to the community to date: things like iRODS and Dataverse and DSpace, which are curation tools for preserving data and sharing data and so forth — allowing them to build on top of it to offload the transformation part. So they don't have to worry about it, and they can also leverage what everybody else is doing as well.
So as an example of that, one of these is PEcAn. This is an ecological workflow system for basically connecting a bunch of models in the ecological community and doing analysis on them — to see which ones are sensitive to which data inputs, trying to do predictions of carbon storage over large periods of time, say millennia, and so forth. And so it has a lot of transformations it has to deal with: it has to convert for all those different models, which have different input formats.
It takes in data sources from a variety of places, and each of those is different. So basically it has to transform the data sources into the model formats, which is a lot of work. And so what we've been working with them to do is offload their transformation needs to Brown Dog. They've added this little checkbox to their interface — this is their interface, where you basically select a model, select a location, and run it — and when they select "use Brown Dog" here —
— basically, it offloads the data transformations, so they can just worry about what comes after that. So that's one example there. Another thing we've done, to highlight one of the possibilities that opens up when you start thinking this way — separating out the data wrangling stuff from the science stuff — is providing resources not just for your project but for other projects; in this case, for the ecological modelers themselves.
The people that built those models are not all part of PEcAn. For example, ED is out of Harvard, and there are other models out of Europe and so forth — 30 or so different models — and they all could basically leverage these transformations that were built for this tool, independently of that tool. And so one use case we did here is basically building a kind of data products platform, based on those transformations, for the ecological modeling community —
— for those modelers, in a language that those modelers are very familiar with: R. And so, using Shiny, which is a way of doing interfaces in R, with a couple pages of code with calls to Brown Dog, we were able to put together this little site where basically you can select a location and select a format — this example here is for ED, the Ecosystem Demography model; you can't quite see it on the screen — and then download the data.
This is not an actual data product produced by anybody — in this case we're selecting a flux tower data source in the Northwest, and they have not produced this data product. It only looks this way because the transformation service in the middle there is doing the transformation to make this data product. And that's a key thing: lots of these scientific endeavors produce data products, but there's always more that can be done for different users. NASA is a similar example —
— they produce data products, the MODIS satellite data products, but there's always more than that they could produce. This is kind of a way of handling that. SEAD is basically one of the building blocks that deals with curation — think of it like a smart drop box, bringing in the necessary metadata tags. Part of that is also this possibility of auto-curation, where they actually leverage this technology in Brown Dog to do extractions.
Basically, a file gets uploaded and stuff gets spit out as metadata, and so the transformations we build within Brown Dog can be leveraged directly back into the SEAD effort, in this drop-box-like environment called Clowder. In this example here there's a Google Maps kind of view in the middle: somebody's uploaded some lidar data, and what's happened is that flood layers have been extracted and overlaid on top of the lidar data.
This is basically being leveraged in another data infrastructure component, and that's the idea. So, again going along with that DNS analogy: browsers use it, operating systems use it, lots of different things use DNS; likewise, lots of different things will use this DTS — this data transformation service. Scientific workflow systems are another popular thing being built these days — basically a graphical, reproducible way of coordinating what happens as you use software and data in science: little components that you connect together.
That uses the API directly, which I'll touch on a little bit later. On to a number of client applications that we've kind of bootstrapped as well. Ideally, the community will start building these client applications themselves, but to help that along we've built a number of them already. The first one I'll show you is the Windows client — so, basically like Dropbox.
When you install it — you know how you've got lots of clients natively for Windows or for other operating systems — we have one for Brown Dog. Basically, you can go to your file manager, right-click on a file, and access these capabilities, such as converting to a different format, or right-click on a folder of files and basically index it. What it does then is run all the extractions on it.
It then builds a little database, and it gives you a search capability here too: it basically goes through that and says, find me all things with a green index greater than X, and so forth — those kinds of things. So here's an example from one of our hydrology use cases. They want to do an analysis on lidar data — so there's a GeoTIFF file — and to do the analysis they want —
— it has to be in a band-interleaved format. So the first thing the person has to do is convert it to that. They right-click on it and convert it to this BIL format, and once they do that it basically gets added to their desktop there. The next step is to do the analysis, which is part of the extraction suite.
So they right-click and do an extraction, and what gets sent back to them is the JSON file — with stuff that an application can then use to get at the data that was spit out — but also all the derived products that were generated, in a folder created for them. As part of that there are lots of different things: some of these are plots of the channels and that kind of thing as well. They can do whatever they want with that, or leave it as it sits — all that information in kind of a raw format.
So that's one example. A command-line interface is another. Here's an example where there's an old PhotoCD image that I can't open, so I'm converting it to a PNG, and then, typing into the command line again, extracting everything I can from it. In this case it's spitting out all the metadata that it could get from the file.
You know, things that it detects: metadata, geolocations, and so forth — and again, this is in the JSON form that an application can pull apart. One of the more user-friendly interfaces we've worked on — one that's pretty portable in terms of web browsers, for users — is a bookmarklet. What that is, is basically a piece of JavaScript that's contained inside of a bookmark.
I don't know if any of you have seen these before, but basically you bookmark this thing — you can drag and drop it into your bookmarks — and then you go to some random web page, click on that bookmark, and it executes some JavaScript on that page for you. What this one does is: if you go to some page with links on it — this is a simple page, but any page with links on it — it adds menus to it, so that you can right-click on a link.
It's kind of hard to see on screen here, but it puts a menu there with the different formats, and then you can download the data in a different format. So here's an example in Dropbox doing this as well; or you can basically go to a web page and index its entire contents.
In this case, the same thing — the tree delineation — doing the same kind of thing in an open-source environment. Excel: this is a very popular piece of software across nearly every scientific community we've encountered in our support of them, so that was one of the key ones we wanted to make sure we could interface with. In this example — this is from our civil engineering use case — they basically record the paths of people walking through cities, as coordinates, latitude and longitude, in a spreadsheet.
It's actually spit out of a sensor that the people carry. So what we could do is create plugins, built on this Brown Dog API, to do stuff. In this case, they wanted a green index of that route: how much foliage was encountered by the pedestrian as they walked through that particular landscape. So they can run that extractor on some subset of cells there, and what it does is basically generate a new sheet at the bottom here, with the green index information that was extracted out of that for them.
So that's another example of a client interface — again, going with that DNS analogy, lots of different things can be built on top of it. One of the main interfaces we try to highlight when people sign up — kind of the one that has all the possibilities — is this next one I want to show you, called BD Fiddle. It's based off of an idea for web developers called JSFiddle.
JSFiddle kind of lets web developers play with JavaScript, CSS, and so forth; in the case of Brown Dog, this is for navigating the transformations. What you can do here is try out the transformations: you can upload a file, select an output format or select some suite of extractions, and then get the output here — but not just that, you also get the code snippets that you can copy and paste into your own code, whatever it might be.
So we have examples for just running it in the terminal, or in Python — we have a wrapper library, which simplifies things, for Python, and we also show you what it would take in native Python, using something like the requests package to interface with the REST API and so forth, which is a little bit longer. There's code for R, where again we have a library to simplify things; code for MATLAB, where we also have a library to simplify it; and even JavaScript.
So you can do this in a web page, execute these kinds of transformations, and so forth, and you can basically take these code snippets and do stuff like run them in a Jupyter notebook. You can actually download the Jupyter notebook here — in the new version of BD Fiddle there's a link right under it that can open up a Jupyter notebook and run the code right from there, or copy the same code.
Now, this is an example from one of our use cases again, in green infrastructure, and this is what the code looked like beforehand. It's basically identifying green infrastructure types in images — I think what it's identifying here is bioswales in satellite images. He wrote this code — basically, he wrote the extractor to do this — and the code comes to about 270 lines of Python.
It includes some packages like OpenCV, which is not the funnest thing to install on a system, and so to rerun this stuff you'd have to have all of that there. The idea here is to highlight: what about the next person — the person who could leverage this green infrastructure type identification code — not having to rewrite this, not having to reinvent the wheel, not having to install any of this stuff? So here's an example of what that might look like.
We rewrote his code using Brown Dog, with his extraction logic behind an extractor — you basically just choose the extractor — and that takes it down significantly. The next person utilizing his GI type identification has roughly one page of code instead of 270 lines, and they don't have to install anything special like OpenCV; it's much simpler to run. And that's without the little wrapper library — including that as well takes it down a few more lines, to only 47 lines of code. So, significantly reduced.
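The resulting "one page of code" pattern might look roughly like the sketch below — `extract` is a hypothetical stand-in for the wrapper library's call, since the point is that the OpenCV-heavy logic lives behind a named extractor on the server rather than on the user's machine:

```python
from pathlib import Path

def extract(path, extractor):
    """Hypothetical stand-in for the wrapper library's extraction call;
    the real call would upload the file and return the extractor's JSON
    (the extractor name here is illustrative)."""
    return {"file": path.name, "extractor": extractor, "regions": []}

# The "next person" only loops over images and submits each one to the
# hosted GI-type extractor -- no local OpenCV install required.
images = [Path("imgs/site1.png"), Path("imgs/site2.png")]
results = [extract(p, "gi-type-identifier") for p in images]
```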
It does the exact same thing: it basically goes through a folder of images and pops them up, drawing bounding boxes around the subsections that have these GI types — bioswales, in this case. So the idea is basically leveraging somebody else's transformation tool and simplifying your own work as much as possible going forward. Which raises the last part, which is about publishing data transformation tools. So, for example, for that code written by Tom, the idea is that he can publish that code — and this is kind of a movement —
— that's happening these days with regards to not just preserving data but also preserving software, making them first-class citizens in the scientific world. In terms of your CV, you get credit for publications, but publications and papers alone are not enough to reproduce your results anymore — your science, the data, and the software are all part of it. So there's a movement towards making those first-class citizens, and being able to track —
— how often it's used and give that back to them, so they can get credit for it. And, simply, we'll provide DOIs, so that people can cite that code when they utilize it — basically preserving those kinds of capabilities for other people to leverage as well. So, in this project we're moving towards our beta release early next year, and as part of that we'll have a number of transformations we've built with our three use cases. On the biology side of things —
— a lot of them take the form of these ecological models: being able to convert to the formats they need, or to convert the data sources they utilize to different formats — as well as a few others; you can see one here, for example, to extract traits from images of plants, and so forth. Our civil engineering use case is in green infrastructure, so a number of the transformations there are with regards to things like GI type identification, and social media stuff like human preference score extractions and so forth.
Finding images with certain characteristics; counting the people in them, who's talking to whom, and stuff like that. With regards to genomics and medicine: stuff like extracting things in biopsy images — where the features are at, or classifying features — and general stuff as well, so basically document conversions, document extractions, extracting faces, things like that, which anybody can potentially use. So, in all, about 82 different transformations will be part of that initial suite.
But the idea is to continue to grow that over time. Just to lay out the process: as part of the signup process you'll get access to BD Fiddle and all these clients, and a sample page with a bunch of data that you can try stuff out on, and kind of move on from there — ideally, over time, working with you to get more tools added to the system.
So, in a nutshell — I wasn't going to get into the underlying technology that's utilized in Brown Dog, but there are a number of pieces there to do this, aimed specifically at ease of use, utilizing a large number of tools, and simplifying the process of adding new tools. I'll stop there. This is kind of a high-level picture of what it is, what it does, and how others can potentially use it in the scientific world. So that's it — I'll stop there and take any questions.