From YouTube: Connecting Communities and Business to Create Data-Driven Decisions - Cali Dolfi (Red Hat)
Description
Connecting Communities and Business to Create Data-Driven Decisions
Guest Speakers: Cali Dolfi and Brian Proffitt (Red Hat)
Hosted by Diane Mueller (Red Hat)
OpenShift Commons Briefing
2021-03-12
https://commons.openshift.org/events.html
A
Okay, so hi, my name is Brian Proffitt. I am a manager of the Community Insights team within the Open Source Program Office, and part of what our team does is gather metrics and data to try to figure out how healthy communities are and make sure that they are growing and thriving the way open source projects should. One of the people that I work with on my team, an amazing talent, is Cali Dolfi, and I'll let her introduce herself.
B
Hi all, my name is Cali Dolfi, and I'm a data science intern here at Red Hat. I've been here for about a year, and a lot of the things that you'll be seeing today have been my work over the past six months that Brian Proffitt and I have been doing. With that, I am also finishing up my degree in computer science at Boston University.
A
Okay, great. So we'll go ahead and get started. The first question that we always ask ourselves here is how to discover community health and sustainability.
A
You know, how strong and how healthy a community will be, if they're going to invest time and money and resources into that given project. So next slide, please, Cali. Okay, so historically, this has always been done kind of by the seat of our collective pants. We've always tried to figure out community health around things that were fairly innocuous and seemed very obvious, such as: how popular is a given open source project? How many people are using that project?
A
How many downloads are coming in for that free software project? Because if you look at it from there, obviously those seem like really good, strong signs of community strength and health. As platforms like GitLab and GitHub came into being, you could start looking at the first-level activity of a given project on those platforms.
A
You might infer popularity and strength from stars, such as what GitHub has, and there's nothing really wrong with any of those metrics. The problem is this: they don't necessarily give you a true signal of what the community's health is. Things like popularity, downloads, and the number of stars are great, and they look awesome when you're trying to market your project and tell people how wonderful it is, and they may very well be true indicators of how wonderful your project is. But think about this.
A
Suppose you have a project that has a bazillion downloads, and there's no indication from the user consumption side that there might currently be some kind of infighting going on in that project. There's been a massive battle going on on the project mailing list for months, and the project is about to fork, which will then lead to a degradation of the quality of the software that's coming out of that project.
A
Nobody on the user side will know, because it's just still being downloaded. So it's not that downloads and stars and popularity in general are a bad thing, but we have found over time that they are not really solid, true indicators of how healthy a community can be. Next slide, please.
A
So what we're doing in the Open Source Program Office, specifically on the Community Insights team, is trying to move beyond those gut feelings. We're trying to deploy analytical rigor, and we've been able to do that through the rise of two key movements within the broader open source ecosystem.
A
One is the standardization of metrics, and we'll talk about each of these in just a second. The other, alongside that standardization and how we can look at all these different kinds of community projects in the same way, is the evolution of tools, and that is again something we're going to be talking about as we move through this. So next slide, please.
A
So right now the tools are these, and they are all coming around that standardization of metrics that I referred to earlier. That is actually coming from a project independent of Red Hat called Project CHAOSS. Project CHAOSS is a Linux Foundation-sponsored project that is very keen on getting together metrics that can apply to any project, because in the past a lot of the pushback has been that you can't really apply the same metrics to, say, a project about databases versus an academic project, or something different.
A
But that turns out not to be the case. All projects, no matter how they are constructed or what they are producing, have similar aspects which can be measured. To measure that, we at Red Hat and in the OSPO are currently using three sets of tools.
A
One is known as Cauldron. Cauldron is an open source project that is based on Grafana and Elasticsearch, and tools coming from GrimoireLab. These are all tools that are put together by a wonderful vendor in the open source community known as Bitergia. Bitergia has put together Cauldron as a hosted service.
A
You can go to cauldron.io now and start running your own metrics against projects that are out in the open source world, whether they're hosted on GitHub or GitLab. What Cauldron does is give us a very quick analytical and graphical snapshot of what a Git-hosted project is looking like. What are the activity levels? How many developers are there now? How many developers were there last year?
A
How fast are pull requests being looked at for the first time? How fast are those pull requests and issues getting closed? These are all things that Cauldron can do. In fact, we liked it so much that we have actually stood up our own instance of Cauldron so that we can run our research projects that much faster.
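The two pull request questions above (how fast a first look happens, how fast things get closed) can be illustrated with a minimal Python sketch. This is not how Cauldron computes its figures; the record layout, the field names `opened`, `first_response`, and `closed`, and the sample timestamps are all assumptions made up for illustration.

```python
from datetime import datetime
from statistics import median

# Hypothetical pull request records; field names and values are illustrative only.
PULL_REQUESTS = [
    {"opened": "2021-01-04T09:00", "first_response": "2021-01-04T15:00", "closed": "2021-01-06T09:00"},
    {"opened": "2021-01-10T12:00", "first_response": "2021-01-12T12:00", "closed": "2021-01-20T12:00"},
    {"opened": "2021-02-01T08:00", "first_response": "2021-02-01T10:00", "closed": None},  # still open
]


def hours_between(start, end):
    """Return the elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def median_hours(records, end_field):
    """Median hours from 'opened' to end_field, skipping records where it is missing."""
    durations = [hours_between(r["opened"], r[end_field]) for r in records if r[end_field]]
    return median(durations) if durations else None


print("median hours to first response:", median_hours(PULL_REQUESTS, "first_response"))
print("median hours to close:", median_hours(PULL_REQUESTS, "closed"))
```

A median is used here rather than a mean so that one very old pull request does not dominate the figure; that is a design choice of this sketch, not something stated in the talk.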
A
Alongside that, we are working with members of Project CHAOSS on another tool, called Augur, that does a lot of deep diving into data sets. Augur doesn't give us a graphical picture of what a community is looking like, but it does give us really solid, connected pieces of data that show us how different contributors are working within different projects.
A
You know, not just the strength of that project; what you're really seeing is how it connects to all the other projects that are related to it, which is an amazing set of insights. Then beyond that, we take the information from these two tools and build something known as community report cards. These report cards, or analytical reports, basically give us the graphical and analytical tools to define how a community is doing, so we can run these at any time.
A
We are working to make them as automated as we possibly can, so that somebody can run the report, look at the data, draw conclusions from the data based on what they know about the project and how it's historically been, and give that information back to the community itself. Then they can work on identifying those things that might be problems and need to be solved.
B
Coming from somebody who is relatively new to the world of open source, I've always been asking the question: what should I be paying attention to whenever we start looking at new angles of the open source communities? And where do you go from looking at the intuitions of people of the past to trying to take that knowledge and make it data driven?
B
This looks at GitHub in a completely new way, using AI-powered natural language processing, and we can start to understand the industry's language, where it's uniquely looking at open source keywords and different things along those lines. This is a way that we can start grouping together repositories, not by certain metrics or by whether the same contributors all look at them and work on them together, but by seeing if they're using similar language. Are they talking about similar things within their READMEs?
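The idea of grouping repositories by the language in their READMEs can be sketched with a much simpler technique than the one in the talk: word-count vectors compared by cosine similarity. This swaps in a basic bag-of-words approach purely for illustration and is not the Watson-based method; the repository names, README snippets, and stopword list are invented assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical README snippets; real input would be full README files.
READMES = {
    "repo-a": "A container runtime for running Kubernetes pods and containers",
    "repo-b": "Tools to build container images for Kubernetes clusters",
    "repo-c": "A plotting library for statistical charts and graphs",
}

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "and", "for", "of", "the", "to"}


def word_counts(text):
    """Lowercase the text and count its alphabetic words, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 when either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


vectors = {name: word_counts(text) for name, text in READMES.items()}


def most_similar(name):
    """Return the other repository whose README wording is closest to `name`."""
    others = (other for other in vectors if other != name)
    return max(others, key=lambda other: cosine_similarity(vectors[name], vectors[other]))


print(most_similar("repo-a"))
```

Here the two container-related repositories pair up because they share vocabulary, even though nothing about their contributors or activity metrics was consulted.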
B
The next stage of this is Project Debater, which is one of the ones I'm really excited about. Project Debater takes in a set of data where you can find arguments for and against a certain position. For example, you could type in "Is OpenShift the best container platform?" and you can see the arguments that are going through it, see the weight, whether on the positive or negative side, and see what people are saying about it.
B
Then you can start to figure out where the different places are that need weight. That applies whether you're looking at it from a point of view like "What do I need to do to take one step up?" or, if you're looking at it from an outsider's point of view, "Is this something that I want to become involved in?" Then from here we can start to use the experimental method.
B
This is whenever we start to use multiple different searches with a few of our different tools to see what the impacts are of the work that you're doing. What is the impact of the different events that we're having, or the discussions like the one we're having here, around different search targets? When we bring all this together, you start to see open source communities in a new light and start to get one step ahead of everyone else in the communities.
A
Yeah, thank you. So taking the new tools that Cali just outlined, we're going to take all of the things that we mentioned, the tools that I was talking about and also the new tools that we're working with IBM to create around Mode and Debater and Watson, and now we're going to start looking at things that we've never really been able to do before. One of those aspects is going to be the business impact of a given community.
A
In the past, measuring the return on investment for a community has usually been rather difficult, because you have to have a community and be a part of an open source project and put some time and effort into that. But what is the business getting? You can say, "Well, we're getting a commercial product that we're selling, and we're making money off of that," and that is certainly true.
A
However, there are more aspects that you can pull out when you look at all the different elements of working with a community, and this is what our tool set is trying to do. We're going to be looking at things like whether a given organization is really interested in raising a certain conversation within the broader open source ecosystem.
A
You know, comments in issue trackers, in mailing lists, any place where there are public conversations going on. We can quickly look for the terms and the conversation that we really want to see raised. So hypothetically, if a company were trying to really talk up the container space and Kubernetes-based tools, like what we see around the OpenShift ecosystem, you could start looking to see if those conversations were happening.
A
Are they positive conversations? If they're not positive conversations, maybe there's something going on. Maybe there's a problem going on with your tool set that you weren't aware of before, and we can start dialing in and figuring out what those conversations are about. We can also look at things like how resources can be calibrated towards community health.
A
Another thing that these tools are going to be able to help us do is get into that element of sustainability. Because now, with all of these tools at our disposal, we can really measure risk at many different levels. We can still measure the internal project health, which is what we've been doing for quite some time with Cauldron and Augur, especially with Cauldron, but now also with Augur and Mode.
A
None of us who are watching this want to see another OpenSSH, where you have a project that is maintained by a very small number of people, and yet so many projects rely on it. If those people are no longer able or willing to maintain that project, then there is a very large problem in the broader open source ecosystem. The other thing that we're doing here is trying to detect these risk factors as early as we possibly can, because the earlier we can figure out that there's going to be some kind of risk to sustainability for a given project, the faster all the businesses involved with that project can make a business decision and rescue it. That way you're not in constant firefighter mode; you're actually planning ahead and making business decisions based on project risk as early as possible. Cali, tell us a little bit more about the other things that we want to try to do.
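The sustainability risk described above, a widely used project resting on very few maintainers, can be approximated with a simple concentration count. The sketch below is an assumption-laden illustration, not the team's risk model: the commit log, the author names, and the 50% threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical commit log: one author identifier per commit.
COMMIT_AUTHORS = ["ana"] * 60 + ["raj"] * 25 + ["li"] * 10 + ["sam"] * 5


def contributors_for_share(authors, share=0.5):
    """Smallest number of top contributors whose commits cover at least `share` of the total."""
    counts = Counter(authors)
    needed = share * len(authors)
    covered = 0
    for rank, (_, commits) in enumerate(counts.most_common(), start=1):
        covered += commits
        if covered >= needed:
            return rank
    return 0  # only reached for an empty log


print(contributors_for_share(COMMIT_AUTHORS))       # contributors covering half the commits
print(contributors_for_share(COMMIT_AUTHORS, 0.9))  # contributors covering 90% of the commits
```

A result of 1 for the default threshold means a single person accounts for at least half of the commits, which is the kind of early signal the talk suggests acting on before a maintainer steps away.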
B
Absolutely. The next stage of this is looking at strategic investment, kind of taking that one step ahead. By putting together the different tools that we have here, we can start looking at new companies and new communities that are bolstered by this contributor and project data, and start to see what the anomalies are.
B
What are the new players coming into the field that we have not necessarily paid attention to before? This can go from just being a community to being a buzzword. We can start looking at what is going to be the next "containers." If we start to see buzz and different talk, whether it be in GitHub activity or at different events, we can start to see how this activity compares to the large, exploding buzzwords of the past.
B
And then, once we start to see these certain words come out, we can start tracking their trends, looking at their GitHub data, and seeing what the issues and discussions are around them. What are people saying about them?
B
If there's any chance that y'all can't see my screen, please just let me know. Right here we have a demo of the Mode tool on just the Discovery side, and the terms that are getting used to bring all the data in. We're looking specifically at Red Hat, Fedora, and CentOS Stream, and we can change what those key terms are.
B
We could say, okay, we want to look just in general and just say "containers," but for this tutorial, those are the terms that we're going to be looking at, and they determine what is taken from the large set of data that's going into this. One thing that I've really picked up on is that you can start examining the impacts of events through changes in GitHub activity around that date. One example of this is DevConf.CZ, which usually occurs around January.
B
We can see that in 2019, looking right after the event, there is a boost in the number of projects created. Even more, around that time, leading up to the event and after the event, the number of commits per week is on a large upward trajectory, which I think is a huge thing to look into here. Projects being created is one portion of it, but if there's more activity around the different projects that are already established, we start to think: okay, what was being talked about during this event that's making this activity go up?
B
Is there something new, or is there something that's been on the stage a little bit but is really taking off because of the talks that are occurring at this event? That's when you can start to get into a little bit more analysis and start seeing where the "why" is. From here, we can also look at the growing technologies and buzzwords. So, as a hypothetical, say this event was really big.
B
And so we can see right here how the different data changes over time, and that the subset of data that has to do with containers almost has more of a peak than even the large overall set. So you can start to think that maybe this has a large impact on it, and that starts sending you down the rabbit hole, which really starts to bring in new ideas. You can start looking into who the contributors and companies are that are involved in this.
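The comparison of commit activity before and after an event date, as described for DevConf.CZ above, can be sketched as a ratio of weekly averages. The event date, the commit dates, and the three-week window below are invented for illustration and are not taken from the demo data.

```python
from datetime import date, timedelta

# Hypothetical event date and commit dates, for illustration only.
EVENT_DATE = date(2019, 1, 25)
COMMIT_DATES = (
    [date(2019, 1, 7)] * 4 + [date(2019, 1, 15)] * 5 + [date(2019, 1, 22)] * 6
    + [date(2019, 1, 29)] * 9 + [date(2019, 2, 5)] * 11 + [date(2019, 2, 12)] * 10
)


def weekly_average(dates, start, weeks):
    """Average commits per week in the window [start, start + weeks)."""
    end = start + timedelta(weeks=weeks)
    return sum(start <= d < end for d in dates) / weeks


def event_change(dates, event, weeks=3):
    """Ratio of average weekly commits after the event to the average before it."""
    before = weekly_average(dates, event - timedelta(weeks=weeks), weeks)
    after = weekly_average(dates, event, weeks)
    return after / before if before else None


print(event_change(COMMIT_DATES, EVENT_DATE))
```

A ratio above 1 only shows that activity rose around the event; as the talk notes, working out why it rose still takes further analysis and domain knowledge.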
B
Obviously, a lot of these are more Red Hat-centric, because right now we're looking at terms that are focused specifically on Red Hat, but overall we can start looking at the top contributors as well. Who are the big players here? We can also start to see what their activity is. Are they going and branching out onto some new projects as well? It just really has this branching effect that starts to give way more ideas of what's going on in our communities.
B
Then from here, say we got something that was starting to spark our interest. This is when I really think Cauldron starts to come into play. So here we have a project that takes in a community's repo data. Maybe I was thinking to myself that I want to learn a little bit more about what companies are involved in this project.
B
We can go to our visualization tool here and create a new visualization to start to see what is actually happening. Whenever I'm starting to look at which companies are coming into play, I personally go for the pie option, because you can see what the percentages are, and it's a lot more visually appealing for me, but there are so many different options here to build upon. I really like to use the goal tool to start to see how the activity around the commits is going. How quickly is it happening?
B
Is it reaching the goals or some type of threshold that I have found for this community? Like I said, there are just so many different options here when it comes to visualizations, but for this example, we're going to go to the pie tool, and we're going to use their Git data as the source to create this. So here we just have a random circle.
B
Now we start to look at what aggregation we want to use. From here we're going to want to take a unique count of author IDs, because a lot of the time people will commit multiple times on different repos, and so you want to make sure you're uniquely counting each time a new one comes into play.
B
Right there. This is looking at their email domain, which is not a perfect metric, but it can really get you started looking at an idea. I feel like that's the biggest thing I'm getting out of all these tools: they're not something where you just look at them and, poof, they give you all the answers.
B
They're going to be grouped into an "other" category, and we're going to update and then look at what we have here. We can start to click and see which ones are actively involved in this community specifically. Gmail is a pretty common one, which doesn't tell us too much. We can see here that Red Hat is actively involved in it, and there are a lot of others, and then there are also these other ones in the corner. suse.com, I've personally never heard of it.
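The aggregation behind the pie chart above (a unique count of author IDs, split by email domain, with the smaller slices folded into "other") can be sketched in plain Python. This is not the visualization tool's own query; the commit pairs, author IDs, and the `top_n` cutoff are hypothetical.

```python
from collections import defaultdict

# Hypothetical (author_id, email) pairs, one per commit.
COMMITS = [
    ("a1", "ana@redhat.com"), ("a1", "ana@redhat.com"), ("a2", "raj@redhat.com"),
    ("a3", "li@gmail.com"), ("a4", "sam@suse.com"), ("a3", "li@gmail.com"),
    ("a5", "kim@example.org"),
]


def unique_authors_by_domain(commits, top_n=3):
    """Count distinct author ids per email domain; fold domains beyond top_n into 'other'."""
    authors = defaultdict(set)
    for author_id, email in commits:
        # Using a set means repeat commits by the same author are counted once.
        authors[email.rsplit("@", 1)[-1].lower()].add(author_id)
    # Largest domains first; ties are broken alphabetically for a stable result.
    ranked = sorted(authors.items(), key=lambda item: (-len(item[1]), item[0]))
    result = {domain: len(ids) for domain, ids in ranked[:top_n]}
    other = sum(len(ids) for _, ids in ranked[top_n:])
    if other:
        result["other"] = other
    return result


print(unique_authors_by_domain(COMMITS))
```

As the speaker says, the email domain is an imperfect proxy: personal addresses such as Gmail hide the employer, so the counts are a starting point for questions rather than an answer.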
B
You might have more of your major players on hand, but those are the two main tutorials that we have to show today to give an idea of what these tools do and how they could work together. From there, we'll open the table for some questions and see if there are any other things that you'd like to analyze with the tools that we have available. We would definitely be open to looking at some different keywords using Project Mode.
B
Cauldron sometimes takes a little bit longer to do some specific searches, but if it's something quick, we can definitely make that happen.
C
Well, this is awesome, Cali. I really love seeing you guys using this for the health and sustainability of communities. It's really key, and a lot of the tooling here that you're showing, especially the pie chart, is something that we've been using in the OpenShift community for quite some time. One of the things that I always say and try to preach, and I think I'm very preachy about it (Brian might agree with that), is that we kind of think of community management and community development as an art.
C
But this is trying to bring to bear some of the data-driven approaches that we use in our sales and marketing and everything else. Why shouldn't open source communities have access to this to do the same sort of stuff? So it's wonderful to see you guys using all of this.
C
I used to be in Massachusetts, right near you in Beverly, and a shout-out to UMass. I know you're at BU, so we had a little competition going there. But there is this hockey puck concept of looking for new emerging technologies, and one of the things that we've been doing quite extensively is watching the migration of resources, and even our end users, from different projects, using these same tools, the Bitergia tools and the network analysis ones.
C
I think if anyone knows me, they've seen me throw up what I call the jellyfish diagram, because it's always pink with many tentacles, for watching how people collaborate across communities. These tools have been available to us for a while, but what is really interesting to me is the use of Watson and the AI stuff to maybe do some predictive things, to do more than just watch what I call the migratory processes.
C
So yeah, I'd like to hear a little bit more about how the Watson part plays into this, if you can explain, because that was slightly different from the pie charts and dividing things out. Those are tools that I've had, but the Watson and Mode stuff is really cool, as is how you're working together with IBM.
B
Yeah, so this project evolved a little bit over the past six to eight months or so. It started out with IBM reaching out to us, wanting a little bit more of a community perspective. They said, "We are big on the research side. We've done a lot of this analysis from the standpoint of academic papers, but we understand that's not how and where we're going to find the useful data to understand the open source ecosystem."
B
So we brought in more of the community and open source perspective to start to see what the main things to look at should be when it comes to analyzing GitHub. We can actually go and look back at the visualization here.
B
So with this Project Mode tool, all of these terms are being grouped together by using sentiment analysis, mainly on the READMEs. They're also doing a little bit of community metrics, such as the number of these repos that have community guidelines, which is something that's also very interesting to look at.
B
But it's taking the strength of Watson, like Debater and Discovery, to group repos together using sentiment analysis, not necessarily grouping them together because they share similar contributors. You can't just look at the entirety of GitHub when you're looking at this Mode tool, but you can say, "Okay, I want this specific subset."
B
For this demo, the subset of terms that we're looking at is anything that has to do with Red Hat, Fedora, and CentOS Stream. Maybe we wanted to look at containers. Maybe we wanted to look at hybrid cloud. Maybe we wanted to look at Google, or see what is going on specifically with a direct competitor like VMware. So you can go and look at the repos and group them together in a completely different way, however you want to do that.
C
If I hear you right, this Mode tool is looking at READMEs and contributor guidelines. Is it looking at, like, the mailing lists? What are the data sets? Is it mining the Slack, the blogosphere, Stack Overflow, or any of that kind of content, or is this strictly the READMEs and...
B
We don't have access to an interface to show today, sadly, but we can take in the mailing list data, which is actually something that I worked on, cleaning and preparing data for a different project that I was working on during my internship. We've talked pretty heavily about taking the tools that we already have for preparing the mailing list data and putting it into tools like this. For this demo, we aren't looking at that, but we can make that happen pretty easily.
B
This setup can be applied to a way wider scope than just READMEs. This is just the starting point.
C
Yeah, no, this is great. Like I said, we do a wide range of things with the Bitergia tools beyond just community health. But one of the other things that I think is really important to emphasize to people watching this is the importance of domain knowledge. This happens in any data science tooling, too.
C
If there isn't somebody with a bit of domain knowledge about the domains and how they interact, the tooling is really lacking. I've worked in other spaces, finance, accounting, auditing, and stuff like that, and if you didn't know what you were auditing or what was appearing, you could go down a wormhole, let's say, or make the wrong assumption. So I think taking it step by step, doing the READMEs and then adding in new stuff, is really the right thing to do.
C
And as we torture you and make you learn all about open source and communities, we'll eventually add in, hopefully, all of the CNCF projects, the Cloud Native Computing Foundation ones, which is really where most of the projects that I work on and interact with live and breathe, along with the OpenShift ecosystem too. So I'm really looking forward to getting this with a wider data set, and that'll scare everybody over at IBM, you know, in terms of getting that set up, because we have a lot of that data, but not with the semantic analysis or sentiment analysis stuff going on.
C
The other thing that I think is really important to understand is who is leading up the repos, and I think you have a little bit of that, like you can see some of the people's names.
C
But is it possible here to do analysis that drills down on an individual contributor to a project, or is this really a higher-level thing? So, for example, to see where Kelsey Hightower is playing these days?
A
Yeah, so answering that question, it's a little bit of a fine line. The short answer to your question is yes, we should be able to do that between the tools we have here and also Augur, which we haven't demonstrated.
A
It just looks for all connections and finds out where that person is working. So we can do it, but we're a little bit hesitant about how we apply that, because we don't want to get into privacy issues. Historically, and you know this too, Diane, because of your work with Bitergia and the stuff that the OSPO used to do, when we approached this giant fire hose of information, we were always very careful to try to keep the user data as aggregated as possible.
A
And it's always been a fine line, because when we look at things like the pie chart that Cali showed us earlier, we're trying to identify, domain by domain, who's working for whom. Are they working for Red Hat or Google or SUSE?
A
We can do that, but to refine that, we kind of need to know a little bit more about the person. At Red Hat, for example, we are not all required to use our redhat.com addresses when we participate in an open source project, so there could be more Red Hatters on any given project beyond just the redhat.com ones, and it might be similar for other organizations too.
A
So, yes, and as we move forward, we're really trying to be very mindful of individual privacy, especially as we're getting into situations with GDPR and the California equivalent. And now Japan has one and Brazil has one, and I just heard, I think, that another state in the United States has something going.
A
So there are a lot of individual municipalities and countries that have privacy laws in place, and we have to be mindful of those as well.
C
Yeah, definitely, and that's really one of the things that, like you were mentioning, Bitergia and the SortingHat and Cauldron projects and stuff like that have been very mindful of, ensuring that it's following those things. That's especially important when we talk about taking some of this tooling and making it available publicly as open source projects, so that's really been key. But I'm a huge fan of this stuff, so I'm totally thrilled.
C
I really want to have you back to demo the Augur stuff, and maybe take some time, Cali, to look at where OpenShift lands in this and where OKD lands in this. This is really very timely. I love the emphasis early on, Brian, when you were talking about the ROI for vendors.
C
There are always two sides to every open source initiative: the end users, and the value that they get out of the project through their participation and their use of your project, and then all of the vendors who are collaborating, and the value that they get from participating in those open source projects. Sometimes in our jobs we have to justify resources being allocated to different projects, which, trust me, happens all the time inside of Red Hat.
C
These kinds of tools really help us with those judgment calls about where the resources go. As I said, I think the other big piece of this is using it to see where the hockey puck is going, where we are shooting to.
C
We're always in the present when we're doing this; we're always trying to suss out some current kerfuffle in a mailing list or something that's going on over here because somebody's unhappy with it, and we often don't do any forward looking at what's coming down the pike. Where are our end users playing?
C
You know, there was a great example of it a while back with a company, Amadeus, that is using OpenShift. They came to an event and gave a talk on their use of Kafka, and it was really early, very, very early days of Kafka. But I think, had we had tools like this, we probably could have seen them starting to log issues in Kafka, do a PR, those kinds of things.
C
So for vendors who are looking to see where their customers are playing, where their end users are playing, these kinds of tools, not just for Red Hat or IBM but for everybody who's working in open source, really help us justify continuing to engage and do this. As we all know, pretty much every company on the planet these days is using something open source, whether you're making candy or manufacturing rocket ships or writing software.
C
So this is really part and parcel of every business organization's decision-making process now, and the work that CHAOSS is doing, that all of the different InnerSource and other communities are doing, is really very, very important.
C
So I can't say thank you enough, Cali and Brian, for highlighting all this work, and I am so thrilled to see it being done and getting the airtime and resourcing internally at Red Hat. I'm really looking forward to picking your brain, Cali, and running some OKD stuff there. The other one I want to see you run is operators, that term, the Operator Framework, because that has really got its tentacles in so many different things beyond containers.
C
This is where, once you've got these tools refined and we've got some processes in place for it, getting the lead project engineers who are working on it, whether it's in the emerging tech office at Red Hat or one of the project leads for one of the CNCF projects, so that they have the domain knowledge and can tweak and see this, is really hugely helpful for community development efforts. So again, thank you very much for coming today and putting up with some of our technical issues this morning. We are definitely having you back, Cali. This was great.