From YouTube: Data Driven Approach to Community Development Diane Mueller (Red Hat) Daniel Izquierdo (Bitergia)
Description
Data Driven Approach to Community Development
Diane Mueller (Red Hat) Daniel Izquierdo (Bitergia)
OpenShift Commons Briefing
July 2, 2020
https://commons.openshift.org/events.html
A
Well, everybody, welcome to yet another OpenShift Commons briefing. This is going to be a fun one for me, because I have a colleague of mine that I've been doing some research with, Daniel Izquierdo from Bitergia. A week ago Saturday we presented a very, very abbreviated version of this at ICGSE, so we thought we would do a deep dive today, because there were a lot of questions about how to use the analytics tools and why we're doing it.
So we're going to take this opportunity, when most of you are probably off on vacation, to steal an hour from you and talk about how, in OpenShift Commons and in the OpenShift, Kubernetes, and CNCF ecosystems, we've taken a data-driven approach to community development, and how that has helped me be effective and nurture a healthy, diverse, and hopefully very engaged community around OpenShift, OKD, Kubernetes, all the CNCF projects we're incubating, operators, all those kinds of good things.
A
You've probably seen this screen before, and you know that we're really all about open source. We believe very deeply, and it's in our DNA, that open source is the source of all the technology innovation that's happening in the world today. GitHub is where we live and breathe. These numbers are a little out of date, and they've grown exponentially; I think it's a hundred and twenty-five million repositories at this point. It's huge, and there are just a few of them on the screen.
OKD, which was formerly known as OpenShift Origin, is the one that we're going to focus on a little bit today. If you don't know OKD, it is the OpenShift distribution of Kubernetes. Basically, we like to say it's a function of Kubernetes plus plus: all of the other things we add into it at Red Hat. OpenShift is easy to find. It's going to be GA hopefully next week: with the 4.5 release of OpenShift we'll have a distro of OKD 4 for you, and you can try it out at okd.io. It is basically a community distribution of Kubernetes. One of the things that happened over the course of time, maybe four years ago, is that we switched from a standalone open source project, which was Origin, to rebasing and re-architecting OpenShift on top of Kubernetes and heavily using containers.
A
So if you're an oldster from OpenShift like me, you still remember gears and cartridges. When we switched over, we really had to refocus how we looked at what community was. The reality check, and the honest thing, was that for the most part the contributions to Origin were Red Hat based; it was Red Hat dominated. Once we took Red Hat out, there were a lot of external folks contributing to the project, but still today the value-added parts of OpenShift are primarily things that are integrated and added by OpenShift. I'm not going to change my tune on that; that is really where most of it is. But the big change for us, and the complexity, comes in with this ecosystem-based model that we switched to. If you go to Commons, you'll see there are right now over five hundred eighty-five member organizations that are part of that.
A
These are end users, integrators, cloud providers, upstream project leads: tons of people having conversations that we have to interact with and understand where they're coming from. Then we've seen what I call the rise of the interrelated cloud native ecosystems. Everybody shows this picture. It's crazy, I know, but it actually is very helpful when you filter it down to some of the open source projects that are being incubated, and that's really what I tend to focus on: the ones that are either incubated or graduated. Trust me, I look at all the sandbox ones too, but for this analysis we're just going with graduated and incubated projects.
Then we're adding in the wonderful world of Operator Framework. The vote just took place and it's just been accepted as an incubated project. I think it's going to officially be announced probably next week; I think the 9th is when the press release goes out. So we have to add in all the operators that we're building, things that are in OperatorHub, and the Operator Framework itself.
A
So this landscape just keeps growing, and it's impossible to really understand all of the relationships or to know all the people in your community. What I like to say is that in the past, community managers usually focused on one single project and on getting people to work on just that one, and we don't have that luxury anymore. There are so many interdependencies among the different projects that are working on pieces that are layered on top of OpenShift, are integrated into OpenShift, or run underneath OpenShift. All of those release cycles, product roadmaps, feature requests, issues, everything you can possibly imagine, you name it, all have an impact on each other.
Then there is the human side of it as well. I think that, from a community development point of view, it's unknowable without using a data-driven approach. I can create all the spreadsheets I want from mailing lists and analyze them up the wazoo by myself, by hand, but... I think it was... did we decide when we first met, Daniel?
A
I've been looking at this magic for a long time and have been applying it: first the dashboard, which gives you the pie chart and breakdown of the contributors, and then this network analysis stuff. We've sort of ingrained this into my day-to-day approach to working with the many communities, so I've been able to scale myself in a way that I couldn't formerly do without that data-driven approach. Sales teams use these data-driven approaches; CRMs are them. Those are customer relationship management things. This should be a community relationship management tool; that is the way we look at this.
Basically, what we're doing is applying some data science and analytics to the problem space of understanding who's in your community, how to nurture them, how to support them, and how to reach out and connect with them and connect them to each other. I'm going to stop.
B
So the analysis is based on Git repositories. If we think about the usual data sources in any open source community, we have a bunch of them. By data source, I mean the infrastructure that we may be using. You have already mentioned some of them, such as the mailing lists, or we have Slack channels. We have Git repositories; some communities are using GitHub, some of them are using GitLab or some other stack. So there are several of them, and typically there are from five to ten of those data sources if we think about development activities, communication channels, and outreach to the general public. For today, we are focusing just on the analysis of Git, which is a big chunk of the activity, and we are focusing on CNCF, OpenShift, and operators.
Go to the next slide, please. So, for the tooling, and this is how we are moving from art to science, how to apply this data-driven approach to community development: we are using GrimoireLab.
B
One of the tools is GrimoireLab, which is what we are presenting today, and I am one of the original developers here. Then we have Augur, which is similar but pretty focused on GitHub, as far as I remember, and then there are a couple of extra projects around. This is the architecture that you can see. It's not only about retrieving information; there is also pre-processing and post-processing of existing data. There are specific problems that we have to deal with, such as identities or affiliation management, how to automate all of this, and how to produce value for the end user. To start from the left side of the chart, we have a bunch of data sources. Some of them we have mentioned: Git repositories, Docker, Jira; we have Bugzilla and some others.
B
Then, right after this, we have Perceval, which is the tool that retrieves all of these, and this is producing some data transformation. So this is your front end to transform any kind of log or API into a JSON document. This is temporarily stored in some database, but at the very end this is creating a new index in Elasticsearch. Elasticsearch is the persistent database we are using here, and there we are creating the raw indexes.
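As a rough illustration of the retrieval step described here, the sketch below turns one entry of `git log` style text into a JSON document of the kind that could be stored in a raw index. This is not Perceval's actual code or output schema; the function name `commit_to_document`, the field names, and the sample log text are assumptions made for illustration.

```python
import json

# Hypothetical sample of git-log style text (not real project data).
SAMPLE_LOG = """\
commit 1a2b3c4d
Author: Jane Doe <jane@example.com>
Date:   2020-06-27 10:15:00 +0200

    Fix typo in README
"""


def commit_to_document(log_text, origin):
    """Parse one git-log style entry into a plain dict (assumed schema)."""
    doc = {"origin": origin, "message": ""}
    message_lines = []
    for line in log_text.splitlines():
        if line.startswith("commit "):
            doc["commit"] = line.split(" ", 1)[1].strip()
        elif line.startswith("Author:"):
            doc["author"] = line.split(":", 1)[1].strip()
        elif line.startswith("Date:"):
            doc["date"] = line.split(":", 1)[1].strip()
        elif line.startswith("    "):
            # Indented lines belong to the commit message.
            message_lines.append(line.strip())
    doc["message"] = "\n".join(message_lines)
    return doc


document = commit_to_document(SAMPLE_LOG, origin="https://example.com/repo.git")
raw_json = json.dumps(document, sort_keys=True)  # what a raw index would store
print(raw_json)
```

The point of the sketch is only that every data source, whatever its native format, ends up as a uniform JSON document before anything else happens to it.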
At the same time, the tool that you can see right in the middle, GrimoireELK, is the data processor. It is kind of saying: okay, I have a new JSON document, so I am storing this in Elasticsearch, and at the same time I am asking SortingHat, "Hey, we have a new identity here. What do I do with it?" SortingHat is the tool that takes care of all of the identities and affiliations, and SortingHat uses another database. Why do we have this? In this case, this is to be GDPR compliant. We have a kind of external or third-party database where we can store everything, and then everyone can opt in or out from the rest of the visualizations and so on.
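The identity problem can be sketched as follows: the same person shows up with several name and email pairs across data sources, and those need to be merged into one profile before contributions are counted. This is a minimal illustration, not SortingHat's API or matching algorithm; the `merge_identities` and `visible_profiles` functions, the match-by-email-or-name rule, and the sample identities are all assumptions.

```python
def merge_identities(identities):
    """Group (name, email, source) tuples that share an email or a name."""
    profiles = []  # each profile: {"names": set, "emails": set, "sources": set}
    for name, email, source in identities:
        key_name, key_email = name.strip().lower(), email.strip().lower()
        matches = [p for p in profiles
                   if key_email in p["emails"] or key_name in p["names"]]
        merged = {"names": {key_name}, "emails": {key_email}, "sources": {source}}
        # Fold every matching profile into the new one, then replace them.
        for p in matches:
            for field in merged:
                merged[field] |= p[field]
            profiles.remove(p)
        profiles.append(merged)
    return profiles


def visible_profiles(profiles, opted_out_emails):
    """Drop any profile whose owner opted out of the visualizations."""
    opted_out = {e.lower() for e in opted_out_emails}
    return [p for p in profiles if not (p["emails"] & opted_out)]


# Hypothetical identities seen in three different data sources.
identities = [
    ("Jane Doe", "jane@example.com", "git"),
    ("jane doe", "jdoe@corp.example", "mailing-list"),
    ("J. Doe", "jdoe@corp.example", "issues"),
    ("Sam Lee", "sam@example.org", "git"),
]
profiles = merge_identities(identities)
print(len(profiles), "unique profiles")
```

Keeping the merged profiles in their own store, separate from the activity data, is what makes an opt-out a single lookup rather than a rewrite of every index.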
So, once we have SortingHat doing its job and we have the raw indexes, the next step is to enrich those indexes. Enriching basically means creating specific indexes focused on your business model, and the business model we are talking about today is community development, so we are turning those datasets that are in the raw indexes into something more meaningful for the final user.
An example: if we think about the Git activity, we have a bunch of commits. In a commit we see the author and the committer, we have the date, we have the time zone, and we know the files that were modified, moved, copied, or created from scratch. Then we have the lines for each of them that were added, removed, or modified as well. All of this information can be parsed and transformed. For instance, by default GrimoireLab is producing, as far as I remember, three or four indexes based on Git information. One of them is at the granularity of commits, so we can go there and check who is working with whom, in what commits, on what file paths, etc. The next one gives you a finer granularity: we can go to the level of the file path, so we know where specifically certain organizations or people are participating. If we have some critical area in our open source project and those developers leave the community, because turnover happens, we can look for the right expertise to try to fill that knowledge gap. But we need the data in advance to understand what's going on.
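To make the file-path granularity concrete, here is a small sketch that maps each top-level area of a repository to the developers who have touched it, and then flags areas that only a single developer has changed, which is one simple way to spot the kind of knowledge gap mentioned here. The commit structure, the `area_owners` and `single_owner_areas` helpers, and the one-contributor threshold are illustrative assumptions, not GrimoireLab's enriched index format.

```python
from collections import defaultdict

# Hypothetical commits; authors, organizations, and paths are made up.
commits = [
    {"author": "alice", "org": "Red Hat", "files": ["pkg/api/server.go", "docs/intro.md"]},
    {"author": "bob", "org": "Uber", "files": ["pkg/api/client.go"]},
    {"author": "alice", "org": "Red Hat", "files": ["pkg/storage/etcd.go"]},
]


def area_owners(commits):
    """Map each top-level directory to the set of authors who changed files in it."""
    owners = defaultdict(set)
    for commit in commits:
        for path in commit["files"]:
            area = path.split("/", 1)[0]
            owners[area].add(commit["author"])
    return dict(owners)


def single_owner_areas(owners):
    """Areas touched by exactly one author: candidates for a knowledge gap."""
    return sorted(area for area, authors in owners.items() if len(authors) == 1)


owners = area_owners(commits)
at_risk = single_owner_areas(owners)
print(at_risk)
```

In practice the area would likely be a deeper path prefix than the top-level directory; the single level is used only to keep the example short.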
Then there is another index that we can create, which is what we call the onion analysis. We think of open source communities as an onion: it's a bunch of layers. At the very center we have the core set of developers; by definition we define them as those producing 80% of the activity, of the commits.
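The core layer of the onion can be sketched as the smallest group of top contributors whose commits add up to 80% of the total. The sketch below assumes a plain list of commit authors as input; the function name `core_developers` and the sample counts are hypothetical, and the outer layers of the onion are left out because only the core threshold is defined here.

```python
from collections import Counter


def core_developers(commit_authors, share_percent=80):
    """Return the top contributors who together reach share_percent of commits."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    core, covered = [], 0
    for author, n in counts.most_common():
        # Stop as soon as the developers already selected cover the share.
        if covered * 100 >= share_percent * total:
            break
        core.append(author)
        covered += n
    return core


# Hypothetical activity: one author name per commit.
authors = ["ana"] * 50 + ["ben"] * 30 + ["cy"] * 15 + ["dee"] * 5
core = core_developers(authors)
print(core)
```

Integer arithmetic is used for the threshold so that a group landing exactly on 80% is not affected by floating-point rounding.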
B
From just, let's say, one data source, which is Git, we can start producing specific indexes. This is what we mean by enriching the indexes. Then, at the very end, at the bottom right part of the chart, you see Kibiter. Kibiter is a downstream version of Kibana with, let's say, certain extra vitamins, plugins, and so on. Everything is open source, by the way. And then there is the end user, who can visualize all of this information, navigate through the data, and create new visualizations.
A
A couple of points about this. I think you went a little fast over the SortingHat and identity merging, and I just want to harp a little bit on this. Notice all of those different data sources and think about how many different email addresses you use across all of them, and the idea that we, as community managers, would know who you are: oh, this is my Stack Overflow... If people think that they're still anonymous in the world, we really need to let them know that with a very simple open source tool and engine like this, you're really no longer anonymous. I guess that's the point I'm trying to get to here. That brings in another level of conversation about moving from art to science, as well as: are we GDPR compliant? Are we following...?
A
So as a community person working with this data set, you really need to have some domain expertise with it. If you look at someone's GitHub profile, it may have contributions to Kubernetes and Prometheus, and then there's some gaming platform over in left field. You need to know enough about the ecosystem to know that that gaming platform isn't really, or hopefully isn't really, something that has a repercussion for your ecosystem. So having domain expertise about whatever you're analyzing is really important. I'll move to the next slide and let you take it from there.
B
Perhaps just another point: we are not the only ones doing this. There are already open source communities providing such information about identities and affiliations, specifically for attribution, which is what we are doing here, to help advance the development of the community and have everyone on board earlier or faster with the proper tools. Communities such as OpenStack or CNCF already have certain public datasets with specific identities and affiliations for all of the developers, and this is community created. That means that you, as a member of the community, can go there and say: I am this person, and I have been working at companies A, B, and C for the given years. Then your contributions will be correctly attributed, and this is in the end important for organizations, so they can see specifically what's going on.
B
We can have some other discussions about what influence means in an open source community, for instance. We can talk about specific roles such as maintainers or core developers, who is playing that role, and what company that person is coming from. And if we take a more aggressive perspective, we can ask specific questions such as: what are my competitors doing in the technologies that are key for my technological stack? For that you need to have certain knowledge, and this whole data-driven approach is quite useful for understanding it.
A
I think I started out using the network analysis stuff to understand who was in my community. I always say, when I'm talking to people who do community development, that the most important first step is knowing who's in your community, how to connect with them, and how they're connected to each other. You can do all the content development and write all the documentation you want, but if you don't even know who your audience is or who the participants are in your community, you're going to end up rewriting or reframing it in some way.
There's also the idea, and we talk about it quite often, that this is one way to see where the community is going. In some of the earlier analysis we've done, you can see, as things like Jaeger took off, and OpenTracing and Zipkin, people moving from one project to the next. That historical analysis, and hopefully predictive analysis, is the next layer that we might want to add to this too. Then, once you have that grasp of your community, you can apply it to pay attention to new projects, serverless or Argo or a bazillion other projects, as they pop up, because then you can start watching the key folks and what they're contributing to. It's really amazing what you can learn from this. And you can get lost; it's sort of like social media. You can go down a wormhole too, but you always come back up and see how things are interrelated.
B
Indeed, from that perspective I think it's worth mentioning that, before entering the metrics discovery process, it is really useful to have a certain strategy on the table and certain priorities. People tend to have metrics for the pleasure of having metrics, and the problem sometimes is that you may lose track of where you were going. If you have a proper method, strategy, and action plan, then you can play with the data, but you know that you have a path.
A
The other thing, and we'll get to the demo in a second here, that's really important for people to understand is that pretty much every large project has a dashboard that shows you the static stuff: who's the biggest contributor to this project and who's doing the most in it. It's a bragging right for corporate contributors or individual ones, and it's a great way to know how to reward people. But those static pie charts are really almost useless for doing community engagement. You really need to understand the relationships, not the numbers, and I think that's what this demo will hopefully show you a little bit of. So I'm going to stop sharing my screen and let you share yours, and then we'll see how we're doing here for time. We're doing okay. Daniel and I could talk about this for days.
B
B
A
I think that is the basis of the jellyfish for me, like the jellyfish diagram we use in the article. This is really the thing that you can't see in screenshots and stuff, but you can dive in here: these are the connectors. The large jellyfish there is Kubernetes and the smaller one is OpenShift, so we can look at the relationships between who's contributing to OpenShift and who's contributing to Kubernetes. If you keep diving, as the complexity gets bigger, you can start to see who is in there: Luca is in there, Seth is in there. Because I've been working in the OpenShift community, I know almost everybody here, but if a new person pops in, then I become aware of it. You can also get list views of this and all kinds of cool stuff.
A
It also starts to show you, if you zoom back out (I think you've added in Jaeger here), who's working on OpenShift, who's working on Kubernetes, and who's also working on Jaeger. This became important for me when the Jaeger team from Uber and Red Hat said, "Okay, we'd like some help from you, Diane, to get us into incubating status over at CNCF," and I did not know everybody in the community. So I was able to pull in this data and look at who from Red Hat was contributing, and who from Uber and other places, and these were the key people to connect with to help move that project through to the next level. The team did an awesome job. You can see Yuri's there and a bunch of other folks. They may not have been contributing to my project, OKD, Origin, OpenShift, but they were contributing to a key thing in the ecosystem, Jaeger and OpenTracing, that was integral to people successfully using OpenShift and us deploying it.
A
B
A
B
So, just to explain a bit more how this works, since we didn't cover it in the previous slide: the dots that we see are developers, and they are displayed if they have committed something during the last year. You can see here the time filter. The projects are, in this case, Kubernetes, OpenShift, and Jaeger. I think Jaeger is already a graduated project; here we have it still assigned to incubating. But in any case we specified this filter here, so we are sure that we are analyzing only Kubernetes, OpenShift, and Jaeger, and that is why we know this is the data and not any other project in the incubating or graduated ecosystem.
The bigger you are, the more commits you have made to that specific project. So the dots that are bigger than the others are developers that have contributed more commits than the average. We can see some of them here. And then we see this number of developers here that have a link into Kubernetes and a link into OpenShift. This means that in the last year those developers have contributed to both projects, in this case Kubernetes and OpenShift.
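The network diagram described here can be approximated with a few lines: developers are nodes, a link exists when a developer committed to a project inside the time window, the node size is the commit count, and the developers linked to two projects are the ones in between the two clusters. The data layout, the function names, and the `dev_*` sample developers are assumptions for illustration; this is not how the dashboard itself is implemented.

```python
from collections import defaultdict
from datetime import date, timedelta


def build_network(commits, today, window_days=365):
    """Return {developer: {project: commit_count}} for commits inside the window."""
    since = today - timedelta(days=window_days)
    edges = defaultdict(lambda: defaultdict(int))
    for dev, project, day in commits:
        if since <= day <= today:
            edges[dev][project] += 1
    return {dev: dict(projects) for dev, projects in edges.items()}


def node_sizes(network):
    """Node size: total commits per developer across the selected projects."""
    return {dev: sum(projects.values()) for dev, projects in network.items()}


def connectors(network, project_a, project_b):
    """Developers with a link into both projects."""
    return sorted(dev for dev, projects in network.items()
                  if project_a in projects and project_b in projects)


# Hypothetical commits: (developer, project, date).
commits = [
    ("dev_a", "openshift", date(2020, 5, 1)),
    ("dev_a", "kubernetes", date(2020, 3, 10)),
    ("dev_a", "kubernetes", date(2020, 6, 1)),
    ("dev_b", "jaeger", date(2020, 6, 20)),
    ("dev_c", "kubernetes", date(2019, 12, 1)),
    ("dev_old", "openshift", date(2018, 1, 1)),  # outside the one-year window
]
network = build_network(commits, today=date(2020, 7, 2))
both = connectors(network, "kubernetes", "openshift")
print(node_sizes(network), both)
```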
B
So these are the basics of the network diagram. In addition to this, on top of it we can specify certain filters, such as the ones we already provide. We can go to the time picker here and select the last month, if we are interested, and then we can produce other kinds of data sets or widgets. For instance, you were specifically mentioning the newcomers: we can have a list of the very latest people that joined the community.
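A newcomers list of this kind can be sketched by taking each author's first commit date and keeping the authors whose first commit falls inside the selected time window. The `newcomers` function, the 30-day default, and the sample data are assumptions for illustration rather than the widget's actual query.

```python
from datetime import date, timedelta


def newcomers(commits, today, window_days=30):
    """Authors whose first-ever commit is inside the window, most recent first."""
    first_seen = {}
    for author, day in commits:
        if author not in first_seen or day < first_seen[author]:
            first_seen[author] = day
    since = today - timedelta(days=window_days)
    recent = [(day, author) for author, day in first_seen.items() if day >= since]
    return [author for day, author in sorted(recent, reverse=True)]


# Hypothetical commits: (author, date).
commits = [
    ("dev_a", date(2019, 1, 10)),
    ("dev_a", date(2020, 6, 25)),  # active recently, but not new
    ("dev_b", date(2020, 6, 20)),
    ("dev_c", date(2020, 6, 28)),
    ("dev_d", date(2020, 5, 1)),   # joined before the window
]
latest = newcomers(commits, today=date(2020, 7, 2))
print(latest)
```

Note that the first-commit date has to be computed over the full history, not only over the window, otherwise long-standing contributors would be reported as new.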
A
I think one of the things that is hard to tease out is retention of newcomers and engagement with newcomers, and new organizations. From my perspective, I'm very organization based, so when a new organization starts contributing to OpenShift or starts using OpenShift, I want to know about it. This data also includes when they log an issue. Just being aware is huge, because then, when they show up at maybe your event or they ask a question, you're aware that they're in the community, and that gives you a step ahead. We'll talk about this a little bit later, but there are a number of personas that we tease out from this data that really help.
Maybe if you dive into the Clayton Coleman analysis, that will help a little bit too, if you could explain what you're showing here. If people don't know Clayton, then they don't know Kubernetes; I think that's a bumper sticker somewhere. He is one of the lead contributors and architects for OpenShift and for Kubernetes itself. Watching someone like him evolve over time is really a good example of how someone onboards and gets deeply involved in a project.
B
Yes, so this dashboard contains a bunch of widgets, as you can see, and this is filtered for the year 2012, so this is eight years ago. On the left we have the number of commits for each of the projects, and then, repeated for each of the years, we have each of the bars. We will see more bars in the following years for Clayton. For each of them, this is split into the different repositories this developer has been participating in. So this is the beginning in OpenShift: we have origin, a WordPress example, and then Django examples. Those are the two or three main projects Clayton was contributing to in this case. We move on, and this is 2013. We can see that there are some more repositories: a Python interface, the website for OpenShift, I think, one for Java, a client, and some others. We see how the network is kind of growing in OpenShift.
B
In 2014 we can still see OpenShift with most of the activity for Clayton, but then we start to see certain projects in the CNCF ecosystem. Instead of listing the individual projects, such as Kubernetes or Jaeger, we have grouped them as graduated and incubating. You will see how this keeps growing. If we go to the specific repositories, we can see that this is Kubernetes, this is the API, and then these are examples of how to use Kubernetes.
B
This is the whole activity of Clayton in 2014. In this case we still see OpenShift: most of the work is in OpenShift Origin, but more and more commits start to appear in the CNCF ecosystem. In 2016 there are even more repositories, and then we have incubating projects, so probably some new projects in the CNCF ecosystem, plus all of the graduated ones. Most of them, as you can see, are Kubernetes: examples, community, cluster registry, the API, and Kubernetes itself. Then we can go to 2017, and Clayton keeps growing. Then we have some activity in the Operator Framework, where he has started to participate, and then 2019: we have Operator Framework, some incubating projects, graduated projects, and OpenShift. And then nowadays, the last six months approximately; this is the most recent data I think we have for Clayton.
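The year-by-year walk through one developer's activity boils down to grouping that developer's commits by year and by repository. The sketch below shows that grouping, plus a count of distinct repositories per year as a crude measure of how the developer's footprint grows. All names and numbers in the sample are hypothetical and do not reproduce the dashboard's data.

```python
from collections import defaultdict


def commits_by_year_and_repo(commits, developer):
    """Return {year: {repo: commit_count}} for one developer, years in order."""
    table = defaultdict(lambda: defaultdict(int))
    for author, repo, year in commits:
        if author == developer:
            table[year][repo] += 1
    return {year: dict(repos) for year, repos in sorted(table.items())}


def repos_per_year(table):
    """Number of distinct repositories the developer touched each year."""
    return {year: len(repos) for year, repos in table.items()}


# Hypothetical commits: (author, repository, year).
commits = [
    ("dev_a", "project-x/origin", 2012),
    ("dev_a", "project-x/origin", 2012),
    ("dev_a", "project-x/examples", 2012),
    ("dev_a", "project-x/origin", 2014),
    ("dev_a", "upstream/core", 2014),
    ("dev_a", "upstream/api", 2014),
    ("dev_b", "upstream/core", 2014),
]
timeline = commits_by_year_and_repo(commits, "dev_a")
breadth = repos_per_year(timeline)
print(timeline, breadth)
```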
A
Theoretically, we could have started to see the importance and the rise of Kubernetes from this. Anybody outside of Red Hat probably could have seen it. I think we saw it inside because Clayton was vociferously endorsing the work that was going on in Kubernetes, but you can see from this example that there are also ways to start seeing, as people move to other technologies, whether they're edge or IoT, or they start using Open Data Hub or different networking solutions or load balancers or whatever it is, when they start contributing to other projects or posting questions about them. You can start to see where things break down, where things are picking up speed, and where projects are maturing. So it's a really useful set of tooling for people who are ecosystem watchers, like myself.
B
We were discussing that newcomers are important for you, in the sense of new organizations coming to the community, and then the relations with other communities for an organization. In this case, the example we see in this chart is Uber activity in the whole CNCF plus OpenShift plus operators. The dots again are developers, and we can see the specific repositories they connect to. So we have OpenTracing, then we can see some more OpenTracing, Jaeger in this case, and then we have the developers working there. You see more OpenTracing, gRPC, Prometheus. And then perhaps, if we move to the next one, we can see how this is related to Red Hat. Diane, maybe you can elaborate a bit more about the importance of connectors.
A
There are a couple of things that this is showcasing. One is that I look at OpenShift through an organization-based set of glasses. So I like to look at, for example, Uber, who is not an OpenShift customer, and how they touch down in our ecosystem, and then at people who are end users, at their different spheres of influence and how we're connected to them. But this also shows me, if I need to find someone to talk about not just OpenTracing but maybe Prometheus or chaos engineering or whatever it is, the people who are the influencers or the connectors between projects. Say I'm looking for someone who's done something with Grafana, OpenTracing, Kubernetes, and OpenShift, to speak internally, or at Uber, or at a CNCF conference. It allows me to figure out and trace (not to be using a pun) the relationship back to someone who might either be that person to speak, or know the person, or help us.
B
So in this case, what we can see in this example are Uber and Red Hat contributions to those projects of the ecosystem: the CNCF graduated and incubating projects, operators, and OpenShift. And then the legend of colors: this purple is Red Hat, and Uber is this kind of orange color. We can see that there are some Uber developers, and then there are relations, because we can see that there are different...
A
If you go back up a little bit, that really big dot there is Travis Nielsen. I don't know if you added Rook in here; he's one of the leads on Rook. So it's interesting to see where people pop up in other diagrams as well, and there's a whole slew of work there. So which repository is that one connecting to, that Travis is in the center of?
The other thing that it lets you tease out is where other people you know in the community are, like Greg Swift. I had no idea he had any connection to Jaeger, so it was really pretty cool to be able to do this. That kind of first pass at really leveraging the data led us to start talking about OKD personas, because OKD is really the project that I try to foster, along with a few others like Quay and operators.
A
Assigning personas to these folks helps me sort of untangle the community relationships. At the moment I have about five personas that I look at and categorize people as. Tangential personas are people who are working in one community but not working in others, so they're kind of tangential to your project. They may not be working on OpenShift, but they're still important to OpenShift, like Yuri from Uber. Connector personas are working in multiple ones; those are really good. Then, as we mentioned earlier, there are newcomer personas. A very important part of community development is flagging new entrants, fostering them, and understanding how long they stay and how long it takes them to get deeply involved.
Another very important aspect of community development is identifying project lead personas. Clayton, of course, was a known entity to anyone inside of Red Hat and pretty much anyone inside of Kubernetes, but we are starting to figure out how to identify other folks, as we want to create more diverse and healthy ecosystems, and someday Clayton might want to retire.
A
So who are we going to level up and put in maintainer and contributor roles, to make sure we have a diverse and healthy group of project leads? And then again, for me, there are organizational personas. That's when you aggregate everybody from, whether it's Uber, Amadeus, or any one of the end users that are using your project, to really understand how they're using it and what other projects they're using.
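A very rough way to turn these persona descriptions into something computable is sketched below. The rules are assumptions made for illustration only: a newcomer is someone whose first commit is in the current year, a tangential persona works in the ecosystem but not on the home project, and a connector works on the home project plus at least one other. Project lead identification is left out because no rule for it is given here, and the `contributor` fallback label is not one of the personas named in the talk. The organizational persona is approximated by aggregating projects per organization.

```python
from collections import defaultdict


def persona(dev, home_project, current_year):
    """Assign one persona label using simple, assumed rules."""
    projects = dev["projects"]
    if dev["first_year"] == current_year:
        return "newcomer"
    if home_project not in projects:
        return "tangential"
    if len(projects) > 1:
        return "connector"
    return "contributor"  # fallback label, not from the talk


def organizational_view(devs):
    """Aggregate the projects touched by everybody from the same organization."""
    by_org = defaultdict(set)
    for dev in devs:
        by_org[dev["org"]] |= dev["projects"]
    return dict(by_org)


# Hypothetical developers; names, organizations, and projects are made up.
devs = [
    {"name": "dev_a", "org": "OrgOne", "projects": {"okd", "kubernetes"}, "first_year": 2014},
    {"name": "dev_b", "org": "OrgTwo", "projects": {"jaeger"}, "first_year": 2017},
    {"name": "dev_c", "org": "OrgTwo", "projects": {"okd"}, "first_year": 2020},
    {"name": "dev_d", "org": "OrgOne", "projects": {"okd"}, "first_year": 2016},
]
labels = {d["name"]: persona(d, home_project="okd", current_year=2020) for d in devs}
orgs = organizational_view(devs)
print(labels, orgs)
```

Real persona assignment would still need the domain expertise discussed earlier; rules like these only produce a first list to review by hand.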
A
So it's really a very interesting way to see how people show up in communities and where things are going. A small part of OpenStack was a project called Solum, which was supposed to be OpenStack's platform as a service back in the day, if anyone remembers that. Shout out to Adrian Otto. So we could really dive into that.
A
We know that from a product perspective and from a community perspective. This is a good one here; I'll just walk through it quickly. This was all of the projects that CERN was contributing to. And then the other persona that we started to look at was when we dove down into an individual person, because we knew Greg Swift, who is now at LogDNA but at the time was at Rackspace, so he had some OpenStack connections.
A
A
There
Amadeus
has
been
huge
open
ship
commons,
community
members,
they've
been
onstage
at
Red
Hat
summit
they've
been
in
CN
CF
talks
we
but
being
able
to
really
see
where
they're
going
and
what
new
technologies
they
might
be
working
on.
We
had
them
on
talking
about
Kafka
not
too
long
ago
on
stage,
because
they
were
some
of
the
leading
lights
using
in
an
enterprise
situation,
Kafka
and
willing
to
talk
about
it.
So
that
was
a
great
opportunity
to
do
that.
There's
also
I
mentioned
going
down
wormholes.
A
One
is
that
the
data
is
not
always
perfect
and
every
once
in
a
while
we
do
the
SortingHat,
and
this
is
where
we
go
back
to
now,
having
domain
expertise
teasing
out
why
Kim
Min
showed
up
as
a
contributor
to
OpenShift,
turned
out
to
be
a
misinterpretation
of
the
data
in
terms
of
one
of
the
issues
or
something
that
was
logged
to
something.
However,
it
did
give
me
a
very
weak
signal
that
at
Alibaba
and
Alipay
they
were
looking
at
OpenShift
and
OKD
in
Origin,
which
then
merged
into.
A
They
eventually
had
a
deployment
of
OpenShift
and
OKD
there,
so
it
was,
and
I
ran
into
them
at
one
of
the
CNCF
events
or
was
a
Linux
Foundation
event,
and
they
came
up
to
me
afterwards
and
said:
hey,
yeah,
yeah,
we
are,
this
is
who
I
am.
But
correctly
identifying
people
is
pretty
important
too,
and
then,
as
everybody's
well
aware
of,
we
have
another
problem
space
too,
now
that
IBM
and
Red
Hat
are
conjoined
twins
and
are
all
under
one
umbrella.
A
Learning
who
is
in
the
IBM
world
that
are
also
contributing
to
the
different
projects
so
that
we
can,
you
know,
make
the
best
and
take
advantage
of
where
we
have
other
representation
and
other
network
connections
in
projects.
So
that's
another
thing
that
we've
been
looking
at
closely
with
all
of
this
data,
so
yeah
those
are
pretty
important
relationships.
A
Obviously
it
really
has
helped
us
a
lot
from
the
Commons
model,
which
is
ecosystem
based
or
open
source
community
development
that
we're
working
with
here
at
OpenShift
and
at
Red
Hat
and
really
what
our
goal
is
is
not
to
we're
not
trying
to
stop
people
or
do
that
we're
really
trying
to
promote
peer-to-peer
interactions.
So
it
allows
us
to
understand
where
those
interactions
are
happening
across
projects
and
nurture
them
too,
as
I
always
like
to
say,
give
away
the
podium,
because
it's
often
not
about
the
code
contribution
at
all.
A
It's
more
about
sharing
the
information,
the
knowledge
making
the
connections
so
that
somebody's
working
on
one
feature
in
one
project
that
impacts
another
one,
getting
them
to
connect
or
be
able
to
facilitate
your
feature
getting
into
their
roadmap.
Making
those
connections
are
really
the
things
that
community
development
is
now
all
about,
rather
than
trying
just
to
get
everybody
to
contribute
code
to
yours.
So
what
your
metrics
on
this
Stackalytics
or
whatever
the
dashboard
it
looks
great.
A
We
all
know
that's
a
wonderful
thing
to
be
the
number
one
contributor
to
a
project
or
whatever,
and,
you
know,
our
powers-that-be
love
us
to
be
there.
However,
the
more
important
thing
is
that
all
of
the
communication
and
the
network
of
peers
is
nurtured
and
healthy
and
again
diverse
and
well
engaged
and
know
how
to
engage
with
each
other.
So
that
has
really
been
the
model
that
we've
been
going
for
with
A
OpenShift
Commons
is
giving
away
the
podium
pulling
in
the
people
to
speak
at
things
like
OpenShift
Commons
briefings
on
topics
that
you
might
not
have
thought
were
relevant,
but
once
you
look
at
the
model
you
can
see,
oh,
there's
this
project
out
there
that's
about
to
hit
you
all
like
a
ton
of
bricks,
so
you'd
better
know
something
about
it.
So
we'll
pull
someone
in
there
and
give
them
the
podium.
A
So
that's
really
kind
of
what
we've
been
teasing
out
over
the
past
couple
of
years
and
whenever
anyone
hears
me
talk
about
jellyfish,
they
probably
shut
down
their
ears
now.
But
these
are
the
kinds
of
tools
that
we
really
think
help
build
healthy
communities,
because
it's
not
possible
any
longer
with
the
complexity
in
these
communities
and
these
relationships
to
do
it
on
gut
or
personal
relationships.
A
There
are
just
way
too
many
repos
to
watch.
There
are
way
too
many
people
in
those
repos.
There
are
way
too
many
relationships,
and
so
much
of
our
companies
and
our
customers
and
our
end
users
depend
on
these
things
being
well-oiled
machines
that
we
can't
really
risk
it
on
a
gut
instinct
or
Diane,
putting
a
mailing
list
into
a
spreadsheet
doing
analytics
on
it
anymore.
We
haven't
done
that
for
a
long
time.
A
Having
domain
knowledge
is
really
then
key,
and
this
is
not
really
an
attack
on
old-school
community
individual
management;
that
kind
of
nurturing
still
needs
to
happen
for
your
project.
You
can't
abandon
that,
but
it
does
behoove
you
to
take
a
more
ecosystem
approach
and
to
help
you
do
that
with
some
data-driven
tools
and
then,
as
Daniel
always
tells
me,
data
matters.
A
I
had
in
the
beginning
a
routine
every
Saturday
morning,
I'd
sit
down
with
a
cup
of
coffee
and
run
the
report
and
see
who
the
outliers
were
and
where
there
was
duplication,
where
the
SortingHat
didn't
work
and
have
to
go
back
in
and
do
that
clean
up
work,
so
I
think
that's
been
for
me,
one
of
the
habits
that
I
would
like
to
see
more
community
people
develop
and
incorporate
is
really
to
start
understanding
who's
in
your
community.
I
can't
say
that
more
vociferously.
B
A
Yeah,
the
one
question
that's
come
in,
which
I
think
is
a
good
one,
too,
is
what's
the
correlation
between
code
collaboration
between
personas
and
the
company
team
membership.
That's
an
interesting
one.
I've
used
the
tooling
so
far
to
identify
that
team
from
say,
Amadeus
or
Uber,
who
you
know
who
is
working
on
the
open-source
side.
It
doesn't
give
me
insights
into
who's
behind
the
firewall.
I
don't
always
know
everybody
at
Amadeus
or
that,
but
it
does
give
me
a
way
to
do
that.
A
We
could
easily,
with
this
tool,
watch
the
development
like
we
did
with
Clayton's
analysis.
Instead
of
just
doing
an
individual
watch,
the
growth
of
open
source
participation
in
different
repos
or
an
entire
organization.
And
that
would
show
us,
I
think,
a
bit
of
the
correlation
between
code
collaboration
between
the
personas
and
that
what
we
haven't
done
is
tagged.