Description
Jim is a senior research scientist in the cybersecurity group at NCSA, where he leads the CILogon and SciTokens projects, which support identity and access management for research collaborations. Jim is also the deputy director of Trusted CI, and he is the chair of the Trustworthy Data Working Group. Jim received his Ph.D. in computer sciences from the University of Wisconsin-Madison, where he worked on the HTCondor project.
Date: 10/02/20
Presenter: Jim Basney
Institution: National Center for Supercomputing Applications
A
The second meeting in the fall semester of the Big Data Hubs Cyberinfrastructure Working Group: we have some folks here, and hopefully we can, as you said, get the recordings out and get other folks to be able to consume this information when they're not sitting in a Zoom meeting, staring at a different little box on their computer.
A
I'm Niall Gaffney, director for data computing at the Texas Advanced Computing Center, and I've been chairing this for almost two years at this point. It's a fun little group to get together with and have these discussions. We normally just go through the working group business next, so that we can have the lightning talk and not talk over everything, because that's the fun part of the meeting. So let's get through this part first, which has some good parts to it.
A
A
So why don't we start? Well, I guess we could start... let's start in the West; let's start on the left.
A
We don't have the West, that's right; Melissa's not here yet. So we can go to the South next.
C
Great, thank you. I think Florence referenced the slide she's working on. We have an all-hubs meeting with NSF on Monday, so people are kind of scrambling for that. Regarding our seed fund for the South Hub, as Niall knows well, we had the meeting, I guess, two weeks ago, and were able to award one large, two mediums, and three smalls, so there may be some press coming out about that.
C
A
Well, I will add, on the funding for the seed funds: it was a very educational experience going through those, and I hope folks are sharing some of the lessons of all of that as we go forward, because I found there were some really great things that came out of that. And I hope we can help find others like that, to really make the funding have the most impact that it can. All right, why don't we go to the Midwest?
D
So yeah, I can start. Jim, I don't know if you have anything to add, but we also made some seed funding decisions recently, and so you'll see some announcements about that coming up soon. We decided on eight projects to take forward, mostly small, but all community driven and very focused on our priority areas.
D
So that's great news: we're looking forward to partnering with a lot of folks within the region on those projects over the next year or so. I was going to mention the COVID Information Commons webinar, which is coming up next week, if Florence was not here, but I'll turn it over to her next to talk about that.
D
D
So that's the 16th, at 11 a.m. Central, 9 Pacific, and noon Eastern. It's our second round of these PI lightning talks from the NSF-funded COVID projects under the RAPID awards. We've had a great response from the community on those projects, and a really nice effort last time from the folks. Anything else, Florence, that you want to mention about that?
B
No, that's great. Do you have more stuff to talk about?
A
I was going to ask if you've got a link for that. If we could get that in the agenda, at least, so that we can hopefully spread that around and get people interested, because those should be some very interesting talks.
B
You're so thoughtful. What a great team we are. So we share the COVID Info Commons. Not much money; we don't really get any money, but we share what we're doing with the researchers, what's going on in the Northeast, all sorts of stuff. So you know about the COVID Info Commons. We've actually updated the website recently, and we can put a link in for that as well. It's covidinfocommons.net, and we've added a Meet the Researchers page, which is really fun. So there are the PI lightning talks.
B
We've separated them into their separate five-to-seven-minute lightning talks, so you can listen to what they do and see who they are and where they are. We've done interviews with some of the researchers, and that's posted as well. John's done some in the Midwest; we did a bunch in the Northeast. So we're trying to really humanize it, to enable this,
B
You know, human collaboration, research collaboration, which is really fun. And we're planning monthly webinars, because we asked for two PIs to speak in the launch meeting in July and 42 offered, which totally shocked us. So we said: okay, congratulations! You're now a COVID Info Commons community. We're going to talk every month, whether we need it or not. So we have PIs that talk each time, and they're already collaborating. It's really cool.
B
We have Meet the Researchers, and we have a bunch of, like, 44 data sets from around the world, from six continents. People send us this stuff because they know we have this. And a bunch of funding opportunities and groups and organizations and all sorts of stuff. And we submitted a proposal to speak at the CODATA GO FAIR conference, which is virtually in Paris, and so, if it gets accepted, I'm going to bring
B
You know, eight or nine researchers with me virtually, and we can all present. I'll present the Commons, they can present their stuff. We'll see if it gets accepted. And we're going to be presenting the COVID Info Commons at the ADSA Leadership Summit, and then we're having a diversity, equity, and inclusion BoF on Friday, October 16th, that I'm co-leading with the Fred Hutchinson research center, which they call themselves Fred Hutch
B
or the Hutch (they have so many different cute little nicknames for themselves), and the Chan Zuckerberg Initiative. That's really cool. We did our seed fund round. We had a round one and a round two. People were worried. We wanted to get it off the ground because we didn't even do it last year, so we're like, oh my gosh, we have to hurry up and use half a million dollars by May. So August 31st
B
was the first round, and we had 28 proposals, and so we're just reaching out to the top 10 or 12 researchers that we've chosen. We have some we're bringing over to regular admission that we're going to be looking at with round two. Round two just ended last night. We haven't told people how many it is, but it's a dozen, so altogether it's about 40, which is really cool. And then we'll figure out those. And we say up to $25,000, so most of them are like $24,997.
B
Yeah, and to make you feel better or worse, at the retreat on Monday with NSF, we actually have a seed fund learnings panel with the seed fund chairs that are going to be presenting. So I think it's supposed to be the start of: what are we learning? What do we have to do? And it gets complicated on the finance side, is my perception. I'm planning on bringing my swimmies for that. So that's just starting for us, and it's been a lot of fun.
B
It's been super busy. Oh, and then we're looking at what's the future of the COVID Info Commons, and I just want to tell you guys what I'm thinking, because it has a lot to do with this working group.
B
So we want to make sure we enable more researcher collaboration, so we're going to go to NSF with, like, a one-pager of what we're thinking. I don't know if they're going to like it or not, but right now it's the COVID RAPIDs. I think it should be all NSF awards related to COVID. I'd also like to play with, you know, NIH and international. I mean, go for it: if it's a COVID Info Commons and it's an open portal, what the heck, right? So we're going to see what they say.
B
But then I'm also wondering if we should try to create something like an LHCONE, but for COVID, and without the hierarchy. So the idea of creating, like, a distributed ecosystem of cyberinfrastructure and data sharing, so that when the researchers find each other, they can actually collaborate and work on some cyberinfrastructure and have test beds together. So I found one of the PIs that presented in September, Dominique Duncan. I don't know if you all know her; she's at USC. Anybody know her? She's really good, really smart.
B
Of course they all are, but she actually has this ARC thing where it's COVID-related NIH and NSF projects, eight of them maybe, and they can do the compute for their testing, if they find a data set, either on the USC infrastructure or their own. They have this interesting thing going on. So I'm wondering if we want to think about, like, an integrated, connected, distributed cyberinfrastructure of compute, storage, and networking to support the COVID researchers.
A
The answer may be yes and yes. Okay, no, and that's why I think I like it, because I think, in fact, it might be a good thing to have as soon as we could. I don't want to bump anybody from talks coming up, but it might be good to bring folks in and see if we can, you know, put together some pieces on this and see what we can do, because I know, I mean, we're doing all sorts. We've got,
A
You know, all sorts of COVID work running on Frontera and everything else, but it'd be nice to be able to disseminate some of that. The only thing standing up on the back of my neck right now is whether there are any security concerns around any of the data that we would need to deal with.
A
B
And I know John had already reached out about NCSA and some stuff they have. That would be, you know, HIPAA, FISMA. They say that, and I'm like, no, okay, that could be yours, but maybe we can connect you to people. So we'd have to look at those things, and I think it would be maybe a step function we could consider, but I'm thinking open first.
A
B
Actually, we have a researcher at the Data Science Institute at Columbia (because I work there now; I have to keep getting used to this), and he's helping us think through what type of search we could do.
A
B
We could actually have data sets find data sets and, you know, PIs find PIs, where you can actually use contextual data to find other data that can help you with your challenge. So he already has some ideas on that. So if we want to schedule a talk about this... a discussion, not a talk, you know, not a presentation.
A
A
It's something we can do as an offshoot from this, and not do it as a formal meeting, but more as a, you know, let's see what we can do in the...
A
From my background, it's an old Russian folktale, the stone soup idea, where everybody brings a little bit and we can actually make this happen, and then maybe go get some money to make it happen for real.
B
Well, actually, NSF has asked us for our proposal for the next version of the COVID Info Commons, so that's what this is about. That's exactly what it is. And actually, I'm going to have a first discussion with Jim Hendler, who is on our steering committee, who you probably all know; he's at RPI, and he was involved with the HPC Consortium, which is interesting, but that's very formal. He's like, for them to have a formal relationship, OSTP has to sign off. I'm like, no, this is like, we'll just point to them.
B
Then don't get excited, you know. And then John Goodhue's going to be on that call, and me and Jeannette, just to start thinking about it, but we can have another call. Anyone else want to be in on the... I'm calling it "Kixy" right now. Isn't that cute? It's the COVID Info Commons and cyberinfrastructure.
A
B
D
E
Just a follow-up, thanks, John. We did host an agriculture-focused event on data ownership, and the recordings for that will be available on the link that I provided. And then, just in terms of sort of nationally, and in the spirit of promoting awareness of all things data, APLU and AAU
E
(many of your organizations are a member of either of those two organizations, if not both) hosted their summit, or a lead-up to the summit, just last week as well. I'm part of that steering committee, and we're working very diligently to release some recommendations that were vetted by the group of 33 organizations that were part of that meeting. So just a heads-up on that: you can keep your radar out for further announcements regarding both of those two themes.
A
E
Yeah, I like how this is coming. It's really sort of focusing in on case studies, Niall. To me, those are the most useful, when you can sort of see, well, how did people do it? What stage are they at right now? And so that's really been the focus now.
A
That's great stuff; glad to hear it. All right. Well, I guess Melissa didn't make it today, so we're not going to hear, unless anybody else has any updates that they've heard, through the grapevine or otherwise, about the OSN.
A
I know it's coming along, and I have a feeling they're at that phase. There's always the phase where you talk about what you're going to do, and then there's a great panic, and then you talk about what you actually got done. I think they're in that middle phase still. Right now the hardware is all coming together and the systems are working, so it's all good. So anyhow, that, I think, takes us through the business section here.
A
Oh, sorry, we were going to talk a little bit about any sort of updates on outreach and engagement, anything that we need to do or should be doing around that to maybe grow this group a little bit or reach out to other communities. I think we talked a little bit about it this spring, and it wound up on the agenda here, and I think, you know, the fact that it is only 11 of us, as opposed to the usual 20-some...
A
B
That's okay. I think what we probably want to think about (but I want to get to John's, Jim's presentation; we could take this at another time) is what we focus on, because it's very general, right? Cyberinfrastructure and data sharing. So you have to kind of, you know, tune in and see what's really going on. It's not like you're looking at Law & Order: SVU; you know, it could be,
B
You know, Romper Room. Not that we're Romper Room, but it could be anything. So it could be, as we progress, maybe there's some themes that we have. I don't know, but maybe if this COVID Info Commons comes along, or health becomes a big issue, maybe if we had something more domain specific, because that's one of the things I've been thinking about the hubs for years, as compared to just the big...
B
B
That could be really valuable as people move things forward. I don't know if that makes sense, but that's one of the things I've felt for years, and now that we have the COVID Info Commons we get to start doing some of that, which is kind of cool. And, like the ag stuff that Jim was talking about, let's make this specific. So that's just a thought that we may want to think about, like in the new year.
C
Yeah, just to build on that, I think the best practice from another working group that I'm involved in is, every six months, they set a theme and a goal: what they want to try to do with these meetings for the next six months. It doesn't mean, if that goal or theme doesn't fit with your research, you're going to tune them out forever.
C
You just realize, for the next six months, I'm probably not going to make it a priority to attend, or it happens to be a theme I like. And the group picks the theme. So, to build on what Florence said, I think that's a good way to maybe create some more energy.
A
Yeah, no, I can see what you're saying. I've really enjoyed hearing all of the different things. There's always a downside to every selection. It's been nice to have such a variety of things, but it may be good to be a bit more thematic on sort of the semester level. You know, it's only four months.
B
And then it might be more of a value proposition for people to present, because if they know they'll be with like-minded people and they can collaborate... that's, I think, one of the values of the COVID Info Commons. They know that they're all working on this problem from many different facets, but they know they're all working on it, and that could actually be an attraction, maybe. Just a thought.
A
Yeah, I think maybe we should discuss that and come up with something. I've got a few ideas, but maybe at the next meeting that should be part of our working group. First step is, let's start looking at a theme, whether it's, you know, how to do reproducibility or whatever we can do, or how to do better than just writing a Jupyter notebook for a semester.
A
Nothing wrong with them. All right, so with that, unless there's any other business that people want to bring up, why don't we move on to Jim? I think everybody knows Jim. He's at NCSA, where he leads up the cybersecurity division, and he's going to give us the results of the Trustworthy Data Working Group survey. So, without further ado, why don't I turn things over, and we can find out if screen sharing is working.
F
F
Good, good. Okay, so what I'll be talking about is guidance and survey results from the TDWG, the Trustworthy Data Working Group, which is a nice collaborative effort that has really good participation from the BD Hubs. Here's a list of my co-authors on the guidance report that we published on Wednesday. You'll see Florence there, and John is there, and other names that you'll recognize. So it's really great to have an active working group working on these topics, and especially focused on trustworthy data.
F
So in developing the survey we wanted to understand existing guidance and discussions of trustworthiness in the literature. We did not find a single definition of trustworthiness, and so one theme working through our working group, which you'll see in my presentation also, is: how do we get at the attributes of trustworthiness? But certainly data integrity is a core attribute of trustworthiness, and that's what NIST 1800-25 and NIST 1800-26 focus on, and we see it also in other references.
F
So we know that integrity is a key part of trustworthiness, and we also looked at some big failures of trustworthiness in the community. There's an example of a script used in nuclear magnetic resonance research that impacted a lot of results. Also, the integrity checking in the Pegasus scientific workflow integrity tool detected multiple errors in jobs run on the Open Science Grid.
F
So those were two really interesting trustworthiness failure cases for us to look at in the working group. Based on that background, we developed a survey with four sections: demographics about the respondents; information about their views on trustworthiness and the data that they work with; any tools and technologies that they work with to keep their data trustworthy; and a few wrap-up questions. So it was a total of 16 questions: seven multiple choice, five Likert scale, and four free-form responses.
F
So the Likert scale went from strongly disagree to strongly agree, like we have in this example here. Thanks to the promotion, the spreading of the word that the BD Hubs and others did, I was really pleased that we got 111 responses to our survey from April 21st to May 31st. And so here's one of the questions.
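As a rough illustration of how answers to one Likert-scale question like the ones described above might be tallied, here is a minimal sketch. It assumes a five-point scale; only the two endpoint labels come from the talk, and the three middle labels, the `summarize` helper, and the sample answers are hypothetical, not taken from the survey.

```python
from collections import Counter

# Assumed five-point scale; only the two endpoints are named in the talk.
LIKERT = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]


def summarize(responses):
    """Tally one Likert question, ignoring blanks and unknown labels."""
    counts = Counter(responses)
    tallies = {label: counts[label] for label in LIKERT}
    answered = sum(tallies.values())
    agree = tallies["Agree"] + tallies["Strongly agree"]
    return {
        "n": answered,                      # respondents who answered this question
        "counts": tallies,                  # per-label counts in scale order
        "share_agree": agree / answered if answered else 0.0,
    }


# Illustrative answers only, not survey data; None stands for a skipped question.
example = ["Agree", "Strongly agree", "Neutral", None]
print(summarize(example))
```

Reporting `n` per question, as the speaker does with n = 110 out of 111 responses, keeps skipped questions from distorting the shares.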
F
For this question, we got a response from n = 110, so only one person didn't respond to this specific question when they filled out the survey, and it tells us a bit about who responded to the survey. You see here: 54 were research computing facilitators, 46 infrastructure providers or operators. So you see this is the community that many of us work with.
F
We see computer and information sciences being at the top of the list, but also many other areas of science are being supported, either directly by the researchers or because of the work that the facilitators and the infrastructure operators do. And so, to get at this idea of what it means for scientific data to be trustworthy, we assembled our first set of attributes about trustworthy data based on the literature.
F
We looked at accuracy, integrity, methodology, provenance, reproducibility, reputation, responsible stewardship, and significance. We gave these questions that are shown on the slide and asked the survey takers which of these attributes made sense to them as being part of trustworthy data. Integrity and reproducibility got the most checks from our respondents, but there was also good agreement with provenance, methodology, and responsible stewardship.
F
Interesting that not a lot of people selected significance. I'll come back to some of the other responses that we got that gave us some input into our guidance report.
F
Would you apply additional guidance if you received it? 93 said yes or maybe, and so that was good motivation for us to work on the guidance report. So, a quick overview of the guidance that we've written up and published this week. We distilled from the survey responses four categories of stakeholders for guidance: data users, data providers, secure infrastructure providers, and facilitation and compliance professionals. And so you see the facilitators went into the facilitation category.
F
Infrastructure providers have their own category, but there are also data users and data providers, and of course many people in the community have multiple roles, so the guidance for these different stakeholders can speak to people in their different roles in the community. In the guidance report, we thought more about the attributes of trustworthiness, and some of the write-in responses we received were about confidentiality and data encryption.
F
Also, I noted that significance wasn't selected by a lot of our respondents, so we dropped significance, but we brought in authorization and authenticity, and also did some rewording on attributes like accepted techniques of creation. We also looked at the TRUST Principles for data repositories: transparency, responsibility, user focus, sustainability, and technology. Transparency was another one of our write-in attributes that we had in the survey.
F
So that's another important one for us, and we thought about how those attributes compared to the FAIR attributes that we all know and love. I'm really interested in this group's input on these different attributes of trustworthiness, and that could definitely be a topic of discussion for us today and in the future. And so what we did in the report is, for the four categories of stakeholders and each attribute:
F
We gave some guidance for each of these attributes and talked about what the needs are for each of those stakeholders. So the data user, just looking at integrity, needs to understand that the data that user is using is not corrupt; it's free from errors. The data provider wants to know that the integrity of the data they're providing is maintained.
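One common way a data user can check the integrity need described above is to compare a cryptographic checksum of the downloaded file against a digest the data provider publishes. The following is only a minimal sketch of that general idea, not the method from the guidance report; the choice of SHA-256 and the helper names `sha256_of` and `is_intact` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, published_digest: str) -> bool:
    """True if the file's digest matches the one the provider published."""
    return sha256_of(path) == published_digest.strip().lower()
```

A provider would compute `sha256_of` once and publish the result alongside the data; a user would call `is_intact` after download. A match shows the bytes were not altered in transit or storage, but it says nothing about whether the data was accurate to begin with.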
F
So there are lots more rows in this table in the report, but in the interest of time this is the only row I'm going to show today. We have another table in the report looking at tools and technologies and how they help with different attributes of trustworthy data. One thing that we found in this analysis is that some of these attributes are much better supported by the tools that we have at hand than other attributes. So we've got lots of help with availability and integrity and managing authorization to our data.
F
F
And so I think I went even shorter than 15 minutes, so we'll have some time for a discussion here, but the next steps are...
F
So I'm happy for people to join that. Our goal is to wrap up the working group at the end of the calendar year, in December, and that will include a revision to our guidance report based on feedback that comes out of our discussion today, that comes out of the webinar next Wednesday, and other input that we receive between now and the end of the year.
F
And of course, thanks very much to all the working group members who've participated in developing these reports, the 111 people who responded to our survey, and our sources of funding. And so, without further ado, I'm happy to take comments and questions.
B
So Jim, could you go back to the page with that pretty new chart that you created? Yes, that one. I haven't seen it so well laid out before. Now, when you were presenting this, you were talking about the types of tools that you think would be valuable. Can you just spend a little more time on that?
F
Yep, this was the list of tools that we listed in the survey. So we've got third-party data repository, network and cloud storage, archival storage, integrity checking. So, like, for access controls... let's see, what's the best way to review, let me see.
B
Well, you're being very thoughtful, thank you, but I thought I heard you say, if you know of tools that people could use, let us know. So are we looking to fill some of these? Do we know if there are gaps here that we're looking to fill?
F
Understanding the methodology and the techniques that were used to produce the data, and publishing that in metadata about the data, is, I think, one area where we don't have a lot of tools that we're referencing right now. So that could be tools for, like, gathering the provenance of how the data is collected and produced. So far we don't have information about available provenance tools in the reports.
B
If we don't understand it well... so have, like, a combined Big Data Hubs and Trusted CI trustworthy data workshop or something, because I think we can do a better job of making sure that the data owners and data creators are really thinking about cybersecurity. I think some of them take it for granted that it's someone else's problem or they don't have to worry about it. I wouldn't say it to them that way, but I think it's a problem.
F
But it does remind me of one of the questions we did ask. I've got some extra slides here. We did ask this question: do your job responsibilities include establishing or maintaining the trustworthiness of scientific data? And that's where I was thinking we were going to have some people say it's...
A
Yeah, I think there's a little bit of bias on that, but it's better than it being just those people thinking about it.
B
Yeah, and when they say they agree, but they don't strongly agree, it could be that they agree but they're not the only ones that are responsible for it, right? And, you know, just looking at it, they may take some of it for granted. So I know that, coming into the Northeast Big Data Hub (and John, I'd love to hear what you have to say about the Midwest; maybe, since Trusted CI's part of your family, it's not)...
B
You know, and even now, we're going to host a connected healthcare cybersecurity workshop, because I lead this IEEE working group in that area on clinical IoT, and we're going to talk about security, privacy, and ethics. So expanding it a little bit, not just security. But still, I have a lot of IEEE people that want to be in it, and I have, like, a handful of people from the hub. So I feel like it's something that's not top of mind for the data folks. But John?
D
I think I agree, but I think it's due to the fact that a lot of our stakeholder community is really domain focused. Someone working in ag or water quality is very focused on their scientific questions and not necessarily on the underlying cybersecurity issues, even though they should be in a lot of cases. We have had a cybersecurity working group for a couple of years that has done very minimal work to date, given the lack of engagement that we've had.
D
So that's really been a priority for me for this upcoming year: to get that relaunched and get folks more engaged in thinking across some of those disciplinary boundaries about the questions that everybody faces, particularly the issues that Jim talked about in this report. What does it mean to have trustworthy data? How do I know that what I'm using for my research is actually data that I can rely on? So I definitely think that it's an issue that we should all be pushing harder on.
B
Yeah, so, based on that, I'm glad to hear that. Not that it's a happy answer, but I'm glad I'm not the only one. So maybe we could think about it: you wanted to get it kick-started in the new year, John, and I would like to increase focus on it. Maybe we could think about some type of co-sponsored workshop, Jim.
B
You know, you have at least me and John. Maybe Shannon says "yea, verily," and the people in the West. But once we get it started, we just invite the family. That's how we work as the hubs now. But maybe we could think about what that would look like, and how do you leverage this great data that you pulled together, very clearly represented?
B
Thank you for making it so clear for people to get underneath this and help people, because a lot of people said, yeah, hey, if you have something that could help me, that would be great. So I think it's a chance to increase awareness for domain scientists and data scientists and then work on it, and I think it's going to be a journey. I don't think they're all going to say, oh my gosh, yeah, I've been hoping...
F
I think that workshop idea sounds great. And there's that aspect of matchmaking, where you're sharing: here's the good practice that I'm following, or here's a new solution that I've developed and it's available for others to adopt to improve the trustworthiness of their data. I think it would be excellent to have a workshop to facilitate that information sharing.
D
Yeah, absolutely, and I think that there's interest from the other hubs as well. I won't speak for Shannon, but I know in the South they've had a different sort of flavor of cybersecurity, more focused on the social infrastructure side of things, and that's definitely an interest for us as well.
D
So I think, given the fact that we have a strong materials and manufacturing focus, which has some industry concerns about cybersecurity and data sharing, and other areas as well, there's some good overlap amongst all the hubs.
B
B
You know, technical breakouts that focus on a certain way of doing cybersecurity, or a cybersecurity tool, or something like that. Maybe we could get the domain and data scientists and cyberinfrastructure people together. Maybe; it's just an idea. I don't know what the right answer is going to be, but I think it would be interesting for us to think about.
B
And
then
we
co-brand
it
and
we
don't
have
to
spend
any
money.
We
just
have
to
use
zoom.
It's
a
beautiful
thing
these
days
right,
but
I
do
have
a
cybersecurity
workshop
grant
award
that
I
have
to
use
up
by
the
end
of
February,
and
so
I've
been
thinking
about.
If
we
look
at
you
and
look
at
you,
you
know
if
they
watch
what
I
say
and
you're
recording
this.
B
They
have
to
cut
all
this
stuff
out,
but
you
know,
maybe
you
know
we
can
get
students
involved
and
you
know
pay
a
little
time
for
them
or
something
you
know.
I
think
it
has
to
be
humans
that
we,
my
new
shtick
is
dollars
to
humans.
That's
like
my
strategy
now
because
they
can't
travel.
So
maybe
you
know
we
could
do
something
interesting
together.
B
I
think
we
can
figure
it
out
and
bring
it
to
them
a
proposal
of
what
we're
thinking.
Yeah-
and
you
know
definitely
you
know,
share
it
with
your,
you
know,
ed
friends,
this
isn't
a
secret
and
see
what
they
think,
but
since
John
and
I
tend
to
have
passion
around
it,
we're
happy
to
co-lead
the
thinking
I
think,
but
all
ideas
welcome.
A
That's
what
we
do
if
you're,
if
you're
an
ultraviolet
astronomer,
it's
okay,
if
you're
an
x-ray
astronomer,
you
get
two
photons
and
that's
a
statistically
significant
event
to
sort
of
publish
a
paper
on
so
yeah
statistics
get
thrown
out
the
window
on
these
110,
but
I
found
it
interesting
and
I
think
this
may
be
part
of
the
selection
effect
of
this-
that
if
you
looked
at
on
slide
seven,
where
you're
breaking
down
the
the
respondents
in
classification
of
what
they
do,
you
sort
of
see,
I
would
say
a
reflection
of
people
who
are
sort
of
I'll
say
aware
of
cybersecurity,
and
so
the
interesting
thing
to
do
here,
because
astronomers
aren't
interested
in
cybersecurity.
I
mean
the
data
is
worthless,
which
is
why
it's
so
great
to
work
with
there's
no
economic
value
to
it.
So
I
was
just
curious
if,
in
that,
if
there
were
any
differences
that
you
noticed
where
above
this
line,
particular
categories
of
things
were
more
important
and
below
that
it
wasn't,
and
I'm
going
to
say
the
reason
I'm
thinking
about
this
is
because
down
the
bottom
are
a
lot
of
sort
of
the
areas
that
we're
working
on
campus
to
put
together.
F
You
know
for
some
area
of
science,
maybe
they
were
going
to
say
you
know
it's,
we
don't
feel
responsible
for
trustworthy
data,
and
so
we
did
a
lot
of
cross-correlation
analysis
between
the
different
answers
and
we
didn't
find
any
significant
correlations
between
areas
of
science
and
differences
in
the
responses
to
other
questions.
So
I
think
that's
the
sort
of
thing
that
we
could
always
keep
looking
for
in
the
data,
but
at
least
so
far,
in
the
searches
we
ran,
we
weren't
able
to
detect
it.
A
That's
you
know
seven
people
who
are
interested
enough
to
respond
to
this
there's
another
bias,
but
it's
just
it's
something
that
I'm
trying
to
look
at
while
we're
designing
a
lot
of
these
things
and
making
sure
when
we're
building
up
the
the
people
who
are
not
only
responding
to
these
but
who
are
attending
workshops
make
sure
that
we
encourage
folks
who
aren't
necessarily
on
this
chart
to
help
because
I'd
love
to
have
you
know
the
the
folks
who
are
doing
you
know
the
social
science
and
history
and
other
pieces
which
have
exactly
the
same
cybersecurity
issues
and
trustworthiness
issues
and
other
things,
but
the
way
that
their
communities
go
about.
B
So
you
know
you
bring
up
some
very
interesting
points.
I
actually
find
it
a
little
scary,
because
some
people
should
be
really
worried,
like
health
and
medical
sciences
right,
they
should
be
super
worried
and
to
say
that
we
really
can't
tell
the
difference
between
them
and
social
behavioral
scientists
and
physicists
actually
scares
me
a
little
bit
because
I
think
some
people
should
be
hyperaware,
right,
and
really
worried,
but
maybe
that
brings
up
the
point
that
there
are
two
maybe
two
facets
and
there
can
always
be
more
than
maybe
two
facets.
B
They
can
have
it.
I
have
a
backup
copy;
they're
not
going
to
do
ransomware
for
my,
like,
you
know,
whatever
data.
Do
you
care
about
them
changing
it?
Now
wait
a
second,
I
really
care
about
them
changing
it.
Every
scientist
should
worry
about
that,
right?
The
provenance.
Do
you
need
security
for
that?
Yeah,
you
do.
A
But
then
there
are
people
who,
you
know,
have
a
data
set
that
is
their
career,
right,
and
all
of
these
questions
become
very
different
then:
how
do
I
share
that
so
that
people
can
be
confident
in
my
results
without
giving
away
my
life's
work?
And
those
are
also
issues
that
need
to
be
addressed
in
all
these
systems.
A
And
they
need
to
be
done
in
ways
that
a
non-technical
person
feels
the
trust
that
they
feel
when
they
have
it
on
the
Zip
drive
or
whatever
they
have
in
their
office,
that
nobody
else
in
the
world
can
read
right,
you
know,
and
so
I
was
sort
of
interested
to
see
if
anything
in
the
lower
part
did
come
out.
It's
interesting
that
it
didn't.
B
Maybe
you
know
we
talk
with
other
people
who
do
have
the
data,
it's
their
data,
and
some
of
us
are
them,
or
we
have
friends
or
both
and
see
what
would
be
interesting
to
them.
Maybe
talk
to
some
of
you
know
the
research
computing
leaders
that
we
know
that
are
dealing
with
this
type
of
stuff.
You
know:
do
they
think
that
a
cybersecurity
workshop
would
be
valuable
for
domain
scientists?
Is
it
for
research
computing
facilitators?
B
Is
that
where
we
should
start,
you
know
because
they're
kind
of
like
in
the
middle,
they
touch
the
data
and
they
touch
the
infrastructure.
You
know
they
touch
both.
So
maybe
we,
you
know,
we
think
about
this
workshop
idea.
We
gather
more
input
through
our
friends
and
family
and
networks
of
humans
and
then,
when
we
talk
about
it
again
like
in
a
month
or
two
or
whatever,
we
have
some
more
insight
on
what
might
make
sense
for
the
community.
F
We
might
organize
the
agenda
of
the
workshop
by
different
themes,
so
have
a
few
hours
on
social
and
behavioral
sciences,
a
few
hours
on
health
and
medical
sciences,
with
the
recognition
that
we
would
have
different
speakers
and
get
different
input
about
different
trustworthy
data
concerns
from
those
different
communities.
B
That
would
be
dynamite
and
maybe
get
that
NIAID
guy
everyone
loves
that
does
IAM;
I
forget
his
name,
but
everyone
knows
him.
You
know
to
talk
in
the
health
and
medical
sciences
one.
And
then
if
we
announce
that
and
say
these
are
the
two
topics
and
people
come
to
us
and
say
wait
a
second.
What
about
geosciences?
B
You
know
then
we'll
identify
more
demand,
and
you
know
then
we
can
kind
of
you
know,
do
it
step
by
step,
because
I
think
part
of
it
is
increasing
their
awareness
of
what
the
challenges
really
are
and
that
they
can
either
do
something
about
it
or
make
sure
somebody
else
is
doing
something
about
it.
Sometimes
just
have
to
educate
them
enough
to
say,
ask
for
this.
You
know.
F
Because
in
some
cases
it's
fairly
low-hanging
fruit
to
just
enable
the
data
integrity
flag
on
the
tool
you're
using.
So
it's
not
necessarily
onerous
to
have
some
real
improvement
in
the
trustworthy
management
of
the
scientific
data.
B
And
how
valuable
would
that
be?
People
come
to
this
webinar,
they
find
out
that
they
just
have
to
do
this
and
this,
and
they're
more
protected.
Whoa,
right?
If
I
get
one
thing
out
of
a
webinar,
it's
a
big
day,
that's
how
I
look
at
it,
so
I
don't
feel
bummed
after
I
go
to
one
you
know,
but
that
would
be
very
valuable.
A
Well,
but
even
going
outside
of
that,
I
think
you
know
you're
going
to
say
cybersecurity
and
a
lot
of
people
use
the
same
password
everywhere,
and
they
think
that's
good
enough;
it's
not
their
problem,
and
so
what
we
need
somehow
is
to
fill
in
those
folks
here,
because
they
need
to
be
part
of
the
solution
to
understand
what
it
is
to
check
the
box
saying
this
is
trustworthy
data
or,
you
know,
it
isn't,
I
just
found
it
on
a
USB
stick.
B
Really,
we
really
have
to
show
that
advertisement,
but
you're
right
we'd
want
to
go
broad,
and
this
could,
even
if
we
could
do
it
at
PEARC
too,
or
we
do
it
at
SC,
or
you
know
like
we
could
go
broader
and
broader.
A
Yeah
well,
and
maybe
even
some
focus
on
some
some
topical
meetings
that
aren't,
you
know,
the
standard
set,
or
our
standard
set,
yeah.
B
Yeah,
so
we
should
have
that
on
our
agenda
again
and
Jim,
if
you
and
you
know
the
Trusted
CI
or
whatever
I
mean
I
kind
of
swing
I
play
with
everybody,
but
I'm
not
interested
in
cia
anymore,
but
we
all
work
together.
If
you
know
you
have
thoughts
too,
we
could
schedule
another
call
or
a
separate
call
or
just
do
it
on
another
one
of
these
calls,
whatever
you
all
think.
A
This
was
a
really
productive
meeting,
good
discussions
all
around,
and
some
good
ideas.
I've
even
jotted
down
some
ideas
here
for
possible
themes,
and
so
hopefully,
next
time
we
can
talk
to
some
of
those
and
start
pulling
all
this
together,
and
so
with
that,
I
will
bid
everybody
a
happy
Friday
and
a
good
weekend
and
happy
October.