Date: 6/1/2018
Presenter: Natalia Ruiz Juri
Institution: The University of Texas at Austin
South Big Data Hub
A: All right, well, let me introduce Natalia Ruiz Juri, another person I've been working with for a while. Natalia is here at the University of Texas, working with the Center for Transportation Research. She has been doing work, both simulation-based and data-driven, on improving what we consider a rather terrible traffic situation in Austin and across the state, and she is going to be presenting about the notion of sharing transportation data across a wider array of folks, to improve research outputs and outcomes.
B: What network modeling means is that there is a variety of models that agencies use to understand how traffic responds to different types of changes. These could be changes to the technology used, for example, for traffic signals; changes to the infrastructure itself; or changes to the way people travel, through incentives, for example. If you want people to travel earlier or later during the day, what would that look like when implemented in the long term? Agencies use a variety of models for this.
B: We want to get a taste of these new models and how they could be better, and in a way create a demand, maybe, for new commercial software products that can satisfy new needs, because things are moving quite fast in the industry. One of the big challenges with using these models, what we call advanced transportation models, is that they require more computational resources; they also take more data as input, and they produce outputs at high resolution in terms of space and time.
B: One of our first ventures into handling large datasets was actually trying to find a way in which we could better visualize and understand the model outputs and convey them to decision-makers. We had a joint effort here with the Texas Advanced Computing Center, and we developed a tool to visualize model outputs that is agnostic to which modeling tool you use.
B: So we came up with this concept of the Data Rodeo, which was: how about we talk to all the transportation agencies, convince them to share their data, and then create this wonderful environment that will let us do all the things on this slide, which is basically awesome. We were thinking of something along the lines of what exists with fibers.
B: That's going to be just one more thing they have to think about when they think about managing their data. And something else: even though we all have this feeling that data is going to be super helpful in transportation, and help us have systems that are much more efficient and work better, there is really not a clear path between "I have this extra data" and "my system is going to be this much better."
B: So there is this concept that we know it should help, and there are certain applications where we can see it is helping, but it's sometimes hard for agencies to justify what the return on investment is going to be for making a huge effort in terms of data sharing and gathering. So what we decided to do was take more of a bottom-up approach, in which we started small.
B: We started with more stakeholder engagement: we had a couple of hackathons that were very successful, the generation of static datasets, and prototype data applications, with the idea of demonstrating the value of data, or the potential value of data; understanding what the critical workloads are; promoting this idea of data sharing; and also trying to better understand what the actual challenges are, which may not be technical. There are some technical challenges in some cases, but a lot of them are institutional.
B: My next slides are about some of these prototypes that we've been working on. For example, for the City of Austin, what we did is take data from the traffic monitoring cameras, and we've also been working with the Texas Advanced Computing Center on developing methodologies that help us analyze these data systematically. There are a lot of commercial products now that use video to compute traffic metrics, but they require installing new hardware, they're usually pricey, and they usually just do one thing.
B: So what we were thinking is: the city already has an extensive network of traffic cameras; is there anything we can do to use those cameras in a better way? Right now they're just used manually for traffic operations, so when there's an incident, to see what's happening, or to detect problems or unusual traffic conditions. We were thinking: can we use them in a more systematic way, to do either data collection or some type of long-term monitoring?
B: We worked on this prototype, and ultimately what it's doing is using some existing libraries to recognize objects in the video. One thing worth noting is that a lot of the existing products work only in real time: the idea is that they're used for what we call traffic operations, which is detecting an incident, or counting traffic turning movements in real time to adjust signal timing, for example.
B: But what we are trying to do is look at larger datasets and maybe be able to analyze changes in movement patterns over time. Our first product had to do with vehicular traffic, but we are now also looking into pedestrian traffic. The pictures on the top show hotspots: locations where we see many pedestrians, or where pedestrians spend a lot of time. Then, on the bottom slides, we show the trajectories.
B: The reason the trajectories are interesting is that you can see a lot of trajectories moving across the roadways at locations where there are no crosswalks whatsoever. That happens a lot in Texas, because not all the roadways are very pedestrian friendly. Sometimes you have to walk half a mile to cross the street, and people just don't; they cross wherever it's more convenient, and that can cause safety issues. So the city has been trying to see:
B: Can we use this technology to identify problem areas? And also, if some type of measure is taken to improve conditions, can we look at data over time and see if things are better? Usually these things are evaluated manually: people either go and observe, or take video for a small amount of time and then look at it manually. So having a tool such as this one could help them be more efficient in evaluating the impact of their techniques.
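The hotspot maps described above can be sketched as simple spatial binning of detected pedestrian positions: flag grid cells where pedestrians appear (or linger) most. Everything here (grid size, threshold, points) is illustrative, not the actual methodology.

```python
# Sketch of pedestrian "hotspot" detection: bin (x, y) positions from camera
# detections into a coarse grid and keep cells with many observations.
from collections import Counter

def hotspots(positions, cell=50, min_count=3):
    """Return grid cells (col, row) containing at least min_count observations."""
    counts = Counter((x // cell, y // cell) for x, y in positions)
    return {c for c, n in counts.items() if n >= min_count}

# Synthetic detections: a cluster near (120, 130) plus scattered points.
pts = [(118, 128), (121, 133), (124, 129), (10, 400), (300, 20)]
print(hotspots(pts))  # -> {(2, 2)}
```

Comparing the flagged cells before and after a treatment (a new crosswalk, say) is the kind of over-time evaluation the talk describes replacing manual review with.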
B: Something else we've been doing is using data from vendors. This is travel-time data along corridors and roadways in the city, and we are trying to see if they can use that data to be more efficient in the way they decide which traffic signals to retime. It's common practice that if an agency operates, say, a thousand signals, they're going to retime, or improve the timing plans of, a portion of them every year, because they usually don't have the resources to retime all of them. And how is that portion selected?
B: Often the simplest way is just to make sure that everything gets retimed over a reasonable period of time, so maybe everything gets retimed every three years or something like that. But now there is more data available, so we were trying to help them come up with a process that will help them prioritize which third of the traffic signals, in this case, should be retimed, and what would benefit a larger portion of the population. This has been pretty successful. The data, though, is the vendor's data.
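The data-driven prioritization described could look something like the sketch below, under the assumption that the vendor travel-time data gives a baseline and a current travel time per corridor: rank corridors by volume-weighted travel-time growth and retime the top third. The scoring rule and all numbers are hypothetical, not the actual methodology.

```python
# Hypothetical sketch of retiming prioritization from vendor travel-time data.

def retiming_priority(corridors, fraction=1/3):
    """corridors: list of dicts with name, baseline_tt, current_tt (seconds),
    and volume (veh/day). Returns the top `fraction` by volume-weighted
    travel-time growth, i.e. the corridors to retime this cycle."""
    def score(c):
        growth = (c["current_tt"] - c["baseline_tt"]) / c["baseline_tt"]
        return growth * c["volume"]  # rough proxy for total person-delay

    ranked = sorted(corridors, key=score, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return [c["name"] for c in ranked[:k]]

# Illustrative corridors (names and values made up).
corridors = [
    {"name": "Lamar",  "baseline_tt": 300, "current_tt": 420, "volume": 2000},
    {"name": "Burnet", "baseline_tt": 240, "current_tt": 250, "volume": 1500},
    {"name": "S 1st",  "baseline_tt": 180, "current_tt": 260, "volume": 900},
]
print(retiming_priority(corridors))  # -> ['Lamar']
```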
B: So it's not data that can be made publicly available, because the city is paying for it. But this kind of data is becoming more common from different vendors, and it's a pretty valuable source of data. This is just showing how we can look at a corridor, typically an arterial street, and see how speed has been changing over time.
B: We try to see whether that is the right corridor to retime, and we're developing a methodology based on this. We also used multiple data sources from the Texas Department of Transportation to try to develop methodologies to understand the cost of a lane closure. When we close a lane, either for construction (a planned closure) or because there's an incident that needs to be cleared up, there are delays, and those delays have a cost to the users.
B: There are simple ways of estimating those costs, but as more and more data becomes available, we can develop more refined techniques that let us better understand what the impact is: how effective are our methodologies at informing the public and avoiding excessive delays when implemented? So we've been working with the Texas Department of Transportation on using data from multiple sources, some of which are displayed in this image. Here the challenge is usually data aggregation: how do we get this data from all these different types of sensors?
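One common "simple way" of estimating lane-closure delay cost, of the kind the talk contrasts with data-driven refinements, is a deterministic queueing calculation: while demand exceeds the reduced capacity a queue builds, and total vehicle-hours of delay is the area of the resulting queueing triangle, converted to dollars with a value of time. The inputs below are purely illustrative.

```python
# Deterministic-queueing sketch of lane-closure user cost.

def closure_delay_cost(demand, reduced_cap, full_cap, duration, vot=20.0):
    """Total user delay cost ($) of a lane closure.

    demand, reduced_cap, full_cap: flow rates in veh/h (demand < full_cap).
    duration: closure length in hours.  vot: value of time, $/veh-h.
    """
    if demand <= reduced_cap:
        return 0.0  # no queue forms, no delay
    queue = (demand - reduced_cap) * duration        # vehicles queued at reopening
    clear_time = queue / (full_cap - demand)         # hours for queue to dissipate
    total_delay = 0.5 * queue * (duration + clear_time)  # veh-h: triangle area
    return total_delay * vot

# Example: 1800 veh/h demand, capacity drops to 1200 veh/h for a 2 h closure,
# full capacity 3600 veh/h, $20/veh-h value of time.
print(closure_delay_cost(1800, 1200, 3600, 2.0))  # ~ $32,000
```

More refined techniques would replace the constant demand and capacity with profiles observed from the sensor data being aggregated.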
B: But they want to see how they can, in a way, get the data used by agencies. And agencies that have real needs and problems that need to be solved are very helpful; we find it extremely important to really understand the value of data. It's often the case that data is being collected for no specific purpose, and that is challenging when we get a project that has to do with analyzing the data, because we don't have guidance.
B: How do we get guidance in terms of what it is we want to get out of the data? These relationships help us understand that, so that we can identify the best approach to really accelerate innovation through collaboration. There are a number of products out there, and sometimes I feel that agencies are overwhelmed by the availability of data platforms, data integration software, and data collection tools. So understanding what the landscape looks like, and what would be a good role for the university and for research, I think is critical, because the products are out there.
B: And then we feel this collaboration also helps promote the sharing of data. We've seen an increase in interest: agencies are more interested, and especially the City of Austin is very interested in opening their own data, which has been really great for us. We also find that working with real data, both in education and research, has been very valuable for students; they're starting to use more data in class projects.
B: They work with us on projects, and then they're interested in doing research with the data, so for us that has been very encouraging. In terms of actually developing this environment for data integration and tools, we're still trying to find the right way to do it, so that we really get traction and don't overlap with other efforts that already exist at the agency level. So this is what I have. I don't know how I did with time; I apologize if I spoke too much.
A: I know it's difficult to get information across to different agencies, and from different agencies. Along those lines, have you been seeing much of a change in that, just in the City of Austin? Or are other agencies starting to understand the value of pooling efforts and bringing things together? And if so, what are the next steps you see to pushing that forward?
B: It's always a bit scary for agencies to share data: they're worried about what people will do with the data, and they're worried about complaints about the data quality, so it's kind of slow. I haven't seen any drastic change in other agencies, but definitely more interest in collaborating. I've seen the city talking more with the Texas Department of Transportation, and I think some of these projects that actually show some outcome from the data analysis ("we're using data to solve this problem") are appealing to agencies, even when we do it with the data from just one agency.