From YouTube: Getting Started with Augur, November 14, 2022
Description
This is a workshop-type meeting where Sean Goggins explains how to get started with the Augur software.
A: And I will also turn on the live transcript. All right, everyone, thank you for coming to this Augur hackathon, getting to know Augur. We have an agenda that we're going to follow, which is right here: we're going to talk about everything about Augur and how to contribute, look at the file structure and what the files contain, and go through installation and the process of adding a new worker.
A: One thing I'll do real quick before we get started: the one thing I am less familiar with than my team right now is how to add a new worker, because we just changed that process. I'm sending a Discord message to my team to see if anyone's able to join us.
A: I just have to add the participants and copy the invite link.
A: The first thing that I'm going to show you is just the front end. There are a couple of things that... I can't remember.
A: All right, well, it'll come to me momentarily. So, Augur, effectively: if you look at the groups, under Groups you'll see a list like this in the front end. Under Repos you'll see a list like this, which is each of the repositories in the various groups, and I can order them by group name. And under Insights, there aren't any, I don't believe, at this particular Augur location, and I'm trying to remember one that... oh, I know what it is.
A: So I'll use the eBay instance. Under Insights, Augur has some workers that look for anomalies. The way that works is: it trains on the last thousand days of commit, issue, and pull request activity and looks for changes from the moving average, using a random forest algorithm, to essentially identify whether or not any particular repository has seen a statistically significant increase or decrease in activity in the last 14 days. I'll explain the machine learning workers when we go through the parts of Augur.
A: The purpose of Augur is to give someone who's looking at a collection of open source projects some kind of idea of how much activity, and what kinds of activity, exist in each repository. So, similar to GrimoireLab and other tools, we're trying to present a sense of the overall health using Augur metrics. And when you're looking at a list of the repos...
A: It's somewhat intuitive that, for example, some repos will have more activity than others. Is this in the direction of what we're looking for under "everything about Augur and how to contribute"?
A: If I widen this a little bit, you can see the lines of code added by the top ten authors, and if I click a year, you can see how much by month; let me go back there. This shows average lines of code per commit. It shows organization information, which has to be provided by whoever the organization using Augur is, and then it gives you reviews and pull requests by the week, and the number accepted by week.
A: Pull requests declined; issues opened; issues closed; new issues; code changes, which are either lines or commits (these are lines of code added per week); and then, if there were library files used, they would be listed over here.
A: There's another tab called Risk Metrics, which shows the number of forks by week, the number of committers by week, whether or not there's a CNCF best-practices badge, the types of licenses that are declared, and where there's no assertion of the license but a license of some kind is declared, plus the coverage; it also indicates the percent of OSI-approved licenses among those that exist. So this is Augur's original overview of information, provided by week.
B: I have a question, a developer question. Would it be possible for you to show us how Augur works with an example? Does it work instantly when you plug in a repository, where it gathers all these insights? Say, for example, you take a random repository and it puts up all the details. I suspect it doesn't work instantly, but if yes, could you show us an example?
A: Sure. It doesn't do it instantly. Most tools, in fact all tools that are in the CHAOSS project, don't provide data instantly. There is a data collection period, so it takes a certain amount of time. Right now, the current version of Augur is under the augur-new branch, and I'll just make some notes here on the agenda.
A: I can type it: the current branch we are working on is augur-new, and I expect to release it any day now. I've been working through some final glitches in augur-new; really very minor glitches.
A: Now we can collect all of the data for 10,000 repos in less than a week, and that's because of some new technologies that we've come to employ. So when it comes to adding new information to Augur, there are two approaches. The one I think is the place to start is immediately following the installation of Augur, at the command line, and I'll show you that briefly. First, I'll just show you the new Augur interface.
A: Augur has a new interface right now, and this new interface does a couple of things that I'll explain briefly. One of the things it does is: if I create a user, I can go to that user's profile and add a new repo or organization that I want to collect data for. Many of the ones that would be obvious, like CHAOSS, are already collected. Does anyone have a GitHub organization that they'd like to have new data collected for?
B: And we get that too? Okay, I think I'll drop Drupal in the chat.
A: So what happens here is I'll add Drupal, and that will take a minute; we're working on it. You can see by the X not being finished here that it's still working. One of the changes that John, one of our maintainers, is creating in this interface is that it will show you a waiting sign, and then it'll say "successfully added repo or org," because right now the state isn't apparent. We're also going to make it a little bit more clear what your user repos are.
A: Let me resume recording. So when you do that step where you add a repo group, it ends up creating an entry for all of those repos. I'm just sharing a portion of my screen right now. This is the repo database, and you can see... I don't know, can you see this okay? Because, unfortunately, I can't make it bigger.
A: You can see that it added all of the repositories under Drupal. One of the indications that it's not collected yet is: when you add the repos, it'll queue them for collection, and repos that have just been added will list the git URL in the repo database, but the repo status will be "New" and it won't yet have processed the repo name or the repo path, which are internal to Augur.
A: So next, maybe to explain how Augur works... that's basically showing you that when a repo is added, the repo is created in the repo database and is queued for collection.
A: At that point... which is useful, to say the least. And here are some repos that have had the data collected for them already.
A: But your question was kind of how all the pieces work together, right? Like, what's the technical flow of Augur? Am I getting that correctly?
A: PRs, et cetera. So that becomes relevant a little bit later, but these are some of the other things that the tasks also collect. For a task that collects... let me use one example: pull request information.
A: It also includes the... what do you call it? The head.
A: And the base, so we know what fork was used to create that pull request and where it was merged. This pull request base also includes a status.
A: It's either open or closed; "merged" means the same as closed, so merged and closed both mean closed. If the status is closed, it means it was closed without being merged, and if it was merged, of course, it means it was merged, and then, of course, when it's merged, it's closed. It also includes things like assignees.
A: ...and files and commits. So in the case of pull requests, all of the things are in there: the pull request isn't just one table, it's all the things about a pull request. The same holds for issues; the same exact metaphor. An important piece of all of this is the...
A: I have a big chunk... oh, okay, all right, so, great. But I will still try to get to that part of the question while you're here early, so you don't have to spend two hours with us.
A: One thing to know, and you may already have figured this out in a sense about Augur or anything else, is that knowing who did something is kind of important and useful. Isaac, specifically, has created some logic here, because, if you recall, we're doing a clone count for the commits. One of the distinct and interesting things about commits is that the platform identifier for the person is not included in the git log, so those identifiers by default are emails. Everything else is a GitHub identifier (I'm just using GitHub as an example), which is both an ID, which is numeric, and a username, as an aside.
E: Oh yeah, for contributors, in terms of how we're identifying them: we have a UUID that basically takes into account the user ID and the platform ID together, and that's just one value that's universal for all contributors across all platforms.
E: It's like the canonical ID. It depends on the platform, because platforms have different things, but for GitHub it's the platform and the user ID. Obviously, the platform ID will have to be there for all platforms, because it distinguishes the platform, but the individual user ID will be a bit different because, you know...
A: And we're using the ID because people can now change their username. So if we used the username, it's possible a username will go away; the ID won't go away. If you change your username from Fred to Tony, then we would lose all of Fred's references, but Tony and Fred will always have the same ID.
A: So it's the platform, in this case GitHub, plus the ID that GitHub assigns to every user. And then Isaac's processes, together with the commit counting tool, which is called Facade (derived from work that Brian Warner did ten years ago and significantly evolved, with Brian's blessing and permission, in Augur), will resolve all that information to the same contributor.
A: Why this is important is because, if you collect UUIDs for a collection of a thousand repos, those UUIDs will be exactly the same on any other Augur instance where I contribute with my goggins.com email or any of my other emails. For example, in my case, I've probably contributed to GitHub using 12 or more emails, and my UUID will be exactly the same on all instances of Augur, which ultimately would make it easier to integrate all of the data from all the instances of Augur that you might have.
A: ...if you chose to do that. So, as for what the platform API stores: if I'm looking through the platform API, the contributor is automatically stored using this same UUID.
A: And then the same would hold for issues; contributors are gathered following this same example. Any time, for example, a pull request encounters a user that isn't already in the contributors table (and contributors are stored in the contributors table), it will go and actually retrieve the information for that user that isn't already in the database.
A: Okay. Probably the most important other thing to share, conceptually at least, that's important to understand, is messages.
A: We see messages in many places, but the two main places are on issues and pull requests. The way that we've organized messages is that there's... and you can see at the bottom, I hope... I have an issue here.
A: I'm pretty sure this is one-to-one, where each individual pull request message has a single message in the messages table. The reason that we have this bridge entity, in relational terms, is so that we can distinguish the origin of the messages. Now, in hindsight, we could have done this differently, but, as I said, I designed this four years ago and probably made it overly relational. Two important things: one is that you can distinguish pull request messages from issue messages, or other messages that we gather, and there's only one...
E: Yeah, it's basically like a concurrent thread that Augur can run; not really a thread, but you can think of it that way.
C: Is there some kind of ordering?
E: Yeah, tasks are organized into phases, basically. So there are large groups of tasks that are differentiated because, for one reason or another, they absolutely cannot, or are not supposed to, run at the same time as other groups of tasks.
E
So,
like
an
example
would
be
like.
We
have
a
preliminary
phase
where
currently,
the
only
thing
that
we
do
is
we
check
all
of
our
sources.
In
like
say
we
have
like
10
GitHub
repos.
We
check
to
make
sure
those
haven't
moved,
URL
or
anything
before
we
run
the
rest
of
the
data
collection.
E
Another
example
would
be
the
machine
learning
workers
like
they're,
pretty
resource
intensive,
so
we
don't
want
them
running
at
the
same
time
as
anything
else
so
they're
in
their
own
phase
and
within
the
phase
you
have
various
ways
of
organizing
individual
tats
that
are
given
to
you
by
celery,
like
you,
can
put
all
a
bunch
of
tasks
in
a
group
and
they'll
run
all
at
the
same
time
put
a
bunch
of
paths
in
a
chain
and
they'll
run
sequentially
and
there's
more
stuff.
E: Are you asking if, in the repo collect phase, things run concurrently? Yeah, they do. If things aren't dependent on each other and they're in the same phase, there's no reason why they wouldn't be run at the same time. Obviously, there's stuff like the task for pull request files, which can't run until pull requests has run, because it's dependent on the pull requests existing; so that is a direct relationship.
E: That's specified within the phase. But something like, I don't know, Facade can run at the same time as we collect pull requests, because there's no reason why they can't.
A: So, one of the things I mentioned earlier is that where it used to take over a month to collect data for 10,000 repos, now it takes a week or so. One of the reasons is that the augur-new branch, which will soon be the main branch, does a massive amount of parallelism compared to the prior version of Augur.
C: Just one last question over here: this parallelism that we're talking about, is it done by the use of multiple servers, or are we using multi-threading on the same server to achieve that parallelism?
E: It can be done with either. Right now, I've just been testing it on the same server machine, but it's possible to do both. Celery supports both, because whenever you schedule a task, or a phase, or a group of tasks, it just queues that all up on something like Redis (there are other backend queuing services, but we use Redis). And so, if you were to have your Celery instance on a different server, you could totally do that.
A: Oh, okay. There was something else I was thinking that might warrant explanation right now, just...
B: One second... it's fine, yeah. Manuela has a question. Oh...
B: Okay, the question is: they're very interested in Augur, and so, how can they contribute? I think that's the second part, yes.
A: The thing here is there are two places to contribute. One is the piece where we actually go through and install it, which is obviously kind of a prerequisite. But the first place that I would point someone who maybe wants to make a contribution is in the augur directory: so, under the root of wherever you clone Augur, in the augur-new branch.
A: Take repos: the things under the API, if they're a standard metric... for example, one of the standard metrics. These all return JSON objects that provide specific information; in the case of repos, it's the ID, a name, and sometimes, if it's got git coverage... actually, we're not going to include those extras.
A: Okay, so for code changes, I go under metrics, where the metric is installed.
A: There's a metric called code changes, with endpoints, and a standard metric like that will give you the name of the repo.
A: So that's one API doc endpoint, and that's how they're... so, if I go in here, under the API metrics again, there are our standard metrics, routes, and non-standard metrics. If I go into one of the standard metrics, this is actually a very easy pattern to follow. So if the metric that you want to develop is a CHAOSS metric, you can take a look at any of the files under this metrics directory, and they share a common structure at the very top.
A: There are these twelve lines of code with the SPDX identifier, a description of what's in the file, the libraries, and it instantiates a database connection. Each individual metric can be developed just using SQL. So if there's data for a CHAOSS metric or metric model that you want to build... first of all, some of the SQL may at first appear somewhat complex, but keep in mind that there are hundreds of different queries already developed, and you can use those as a pattern to follow for developing metrics.
A: If there is no repo group ID provided, you need to provide a repo ID; that's the most common use. The parameters are defined here, and in a standard metric these are always the same. If the application or end user does not provide a begin and end date, simply the beginning of time according to computers, January 1st, 1970, is provided, and the end date is essentially the very second that you are making the request.
A: The SQL variable is set to None, just in case it's previously been set, and then the code checks to see if a repo ID has been passed. If it has, it uses the SQLAlchemy function sqlalchemy.sql.text; you can see up above that we've aliased SQLAlchemy to "s," so that all we have to do down here is call s.sql.text.
A: You return: triple quote, put in your SQL, close triple quote. If a repo group ID is provided, it's a separate query. And then the results in a standard metric are always returned by pandas reading the SQL, connecting to the database with these parameters to get the results, and then, oops, the method...
A: Excuse me. The method returns the results, and those results get processed by Augur into this JSON output as an API endpoint. So that is one easy way, without getting real deep into Augur, that you can get started helping, because it applies some very easily templated things, and then you can build the metric.
B: So I think maybe I'll let you, Manuela, ask a question, because that's our question.
A: While you're preparing your question: the other things that we'll probably talk about are the tasks for data collection, as well as installation and deployment. But if you want to get started, you know, we can spend an entire session like this going through getting you started.
D: Oh my God. So, I wanted to ask about these sprint events, you know, in order to be able to have better clarity on how Augur works and how to contribute to it.
B: Okay, Sean, so Ahmad is the person I talked about from PyData Ghana that wanted to do a sprint with Augur, yeah.
B: I did send a Slack chat; well, yeah, I'm not sure you've seen it yet.
B: Yeah, so, Ahmad, are you asking about what ways they can contribute, or what was your question?
D: Yeah, so it's kind of everything together: what ways we can contribute. I'm sure most people would want to contribute through code, or some people would want to contribute through code, and I just wanted to have a better idea. I joined earlier, but I got distracted with work, and so I didn't get some of the details. So I just wanted to see if, looking at the file structure, we could get kind of what each of those files is doing, and also...
B: I want to add some more context. PyData Ghana is like a community, and Ahmad wants to run a sprint with Augur. So he wants to kind of understand how PyData community members can contribute, and what's available that they can contribute to, since they have a lot of people interested in Python. So I think from the file structure as well, we can get through how different members from PyData can contribute via the sprint.
A: Okay, my thinking is... I'm going to turn to Isaac here before I start jumping off, because Isaac is deeper into things than I am. Isaac, what I'm thinking of for a group like PyData, who have a lot of Python skills, is that we might arrange or coordinate sprints around some of the tasks that we have yet to move over from the previous version of Augur, like the value and dependency workers especially. But I don't know if that's... is that all right? Yes?
E: It would be great just to have the experience of having new people look over the tasks in the phase system that I designed, basically making sure that they can make sense of it and write good documentation for it. And basically, we should have a worker template for the tasks that we have now, like we have in the old version, but yeah.
A: Okay, that's a good idea. So...
A: What I'd like to suggest that we do next is: I'll provide a brief overview of the file structure in augur-new, and then I'll start talking about tasks and kind of set up a question for Isaac to help us walk through that part.
A: The .github directory is essentially any GitHub Actions-type things that we've organized; you don't need to look at that. The augur directory includes the API; the application, which is kind of the core Augur piece, which you can ignore; the tasks, which I just mentioned; and then utility functions, which, I assume, are simply things that are shared across different parts of Augur.
A
Isaac
may
have
stepped
away.
Oh
sorry,
what
the
the
util
directory
is
just
a
bunch
of
utilities
that
may
be
shared
across
different
parts
of
auger.
Yes,
exactly,
okay.
A
The
other
directories
are
Docker
somewhat
self-explanatory.
That
this
is
where
our
Docker
stuff
exists.
Front
end
is
the
directory
for
our
what
I,
what
I'm,
calling
our
old
front
end,
which
is
basically
this.
A
That's
all
view
JS
not
to
worry
about,
for
the
most
part.
Scripts
are
primarily
things
that
we
use
to
install
and
configure
auger.
We
have
control
scripts,
Docker
scripts
and
install
scripts,
so
those
are
mostly
shell
scripts
that
are
used
to
get
augers
set
up
and
tests.
Our
unit
tests
that
primarily
Andrew
has
written
that
effectively
test
the
different
parts
of
auger
and
eventually
we'll
reintegrate
that
into
a
GitHub
workflow
of
some
kind.
A
It
used
to
be
Travis
CI,
but
Travis
CI
kind
of
blew
up
its
whole
model
for
everyone,
and
so
now
we're
putting
the
tests
there,
so
the
meat
of
where
contributions
would
be
most
most
welcome
and
helpful.
Obviously,
in
any
place
where
you
find
something
you
want
to
tidy
up
or
whatever
always
welcome,
but
under
the
auger
directory.
It's
this
tasks
directory,
which
is
where
the
data
collection
work
takes
place,
and
this
data
collection
work
is
divided
into
four
main
categories.
A: ...exists there. So this is a place where contributions would be useful; very useful, extremely useful. To give you an idea, the tasks directory originates from a different directory in our current main branch, and it's essentially the way that we have been able to parallelize a lot of the work to enable much faster collection.
E: I was gonna pull up just one of the simple tasks and show how it's organized, yeah.
A: If you're able to; I know you're probably running on a hardcore version of Linux, yeah.
E: Yeah, it should work, one sec, because I'm using VS Code and that's an Electron app, but yeah, one sec... here, I'm ready.
A
So
while
we,
while
we
wait
for
Isaac
to
come
back
the
different
tasks
for
data
analysis,
this
is
where
the
machine
learning
workers
principally
live
the
clustering
worker
clusters
repositories
based
on
the
patterns
of
communication
that
are
identified
as
present.
It
also
does
topic
modeling
for
each
of
the
repositories.
So
we
can
see
what
kind
of
topics
are
discussed.
A: Discourse analysis identifies eleven different categories of discourse, which can then be used to discern a time-sequence analysis of how conversations go around pull requests and issues on individual repositories. And the message insights worker uses a software-engineering-tuned sentiment analysis and a novelty detection algorithm to identify the nature of speech, in terms of inclusive or not inclusive. There are also a couple of repositories that we're adding to this worker to look at inclusiveness specifically, and also ableist language, and the pull request...
A: The git worker: this is entirely Facade, so it runs the old Facade tasks as we modify it, and there are some user utilities that we have in here.
The
move,
detection
worker
is
the
first
one
that
runs
and
it
determines
if
a
repository,
that's
currently
in
our
set
for
collection,
has
moved
so
it's
more
frequent
than
you
might
think
than
that
a
repository
will
change
organizations
or
change
its
name,
and
when
that
happens,
for
a
period
of
my
anecdotal
observation
up
to
about
a
year
and
a
half
GitHub
will
continue
to
resolve
all
of
the
old
links
to
this
new
location.
But
we
just
go
about
proactively
moving
it
events.
Look
at
the
event
stream
on
GitHub
GitHub.
A: GitHub does have a limit of 400 pages of 100 instances each, so you always get the last 40,000 events, but the longer you're collecting, the less likely you are to have gaps. Facade GitHub is really focused on that contributor resolution piece; it's in the GitHub directory because it uses the GitHub APIs to resolve contributors. Issues and pull requests are fairly straightforward: they gather all of the issue- and pull-request-related data.
A: As I mentioned, all the messages are in the same table. So, once all the issues and all the pull requests are created, we'll start collecting messages for each of them, and, generally speaking, we literally get every single message, and its metadata, that's issued against a pull request or an issue on a platform. Releases specifically looks at... if you were to go to GitHub, for example, it gets this metadata; you can see there's a releases section down in the lower right. Anytime there's a release...
A: ...you get all of the data about the release, and all of the releases on a repository get collected. Release data can be especially useful when you're trying to look at activity in a time period that reflects the interests and needs of a repository. So if I look at, for example, my cycle of pull requests, issues, and commits, and I just look at them by month, those months have less meaning in terms of the cycles of a project than if you were to look at them...
A: ...in the context of time between releases; time between releases tends to be a good indicator of the cycles. Now, that said, not all GitHub repositories use releases. I would say somewhere between half and three quarters do, but a quarter to a half don't, so obviously that data is only useful when they actually are issuing releases.
A: Repo info is especially important for Augur, because what repo info shows you is all of the data about a repository that is platform metadata. So if I go to...
A
If
I
look
at
the
number
of
forks,
the
number
of
Watchers
number
of
committers
all
very
interesting
where
it
gets
super
interesting,
let
me
go
down
where
there's
some
actually,
some
data
issues
count
because
we're
collecting
all
the
issues
and
we
get
the
issues
count
metadata.
We
can
know
if
we
have
them
all
same
with
pull
requests.
If
I
get
the
pull
request.
Count
I
should
have
1459
pull
requests
for
this
repository
and
462
issues
for
this
Repository.
A
That's
that's
important,
because
now
I
now
I
can
tell
with
some
with
a
great
deal
of
confidence
that,
when
you're
doing
analysis
of
pull
requests,
issues
and
commits
that
you
do
in
fact
have
all
of
the
data
most
other
tools.
Do
not
do
this
verification
or
give
you
visibility
into
this
verification,
and
so,
for
example,
with
GH
torrent
or
get
archive.
A
There's
no
metadata,
and
we
know
there
is
data
missing
with
with
other
tools.
There
is
no
validation
that
we
have
the
correct
count,
and
so
one
of
the
things
that
I
think
auger
does
well.
That's
super
important
is
validate
against
the
platform
metadata
to
ensure
that
we
have
everything
any
questions
or
should
I
keep
talking.
E: I should be able to share my screen.
A: It looks like it's working. Okay, you're sharing your screen now, I believe. Are you sharing your screen in Zoom? Yeah? Okay, so everyone can see this? Yeah, okay, awesome. So I think what we want to know, well, one of the things we want to know, is: let's take the value worker, for example. I guess you could explain it first, but what I'm thinking of is: what are the steps if one wanted to convert the value worker into a task?
E
Well,
I
actually
did
this
with
the
release
worker,
which
I've
converted
into
the
release
task
right
here.
Your
collect
releases,
okay
and
I
got
I,
went
to
the
old
like
logic
and
I
put
it
in
well.
First
of
all,
I
put
the
whole
thing
in
the
and
then
GitHub
getting
the
GitHub
file
under
cast
okay
and
then
I
made
a
folder
for
what
task
it
is.
So
it's
the
releases
stuff
like
that
goes
in
its
own
folder.
So
your.
A: First... so, converting... I'm just going to put the steps in the notes: steps to convert old workers in main to tasks in augur-new. So step one is you copy the old worker into a task file under tasks.
E: I wouldn't copy the entire file. What I would do is just get the directory structure organized before I would start writing. I would just, like... you create a folder...
A: You know what, I guess you can just walk us through it. So what does that mean, "create a folder"? So I'm imagining I've got a workers directory with the value worker in it. Would you just first create a value directory in one of the tasks directories?
E: Yeah, pretty much. I mean, it depends... what does the value worker do?
E: And does it interface with the GitHub API? It does not? Yeah, then it should probably go in either git or its own folder, if it's not related to the git log or GitHub, because it's mainly organized by data source.
E: And then, in the git folder, I'd create the value worker folder, and then in that folder what you want is a tasks.py and a core.py.
A: A tasks.py and a core.py. And is there any template, or what would the template for those be? Maybe you could walk us through what...
E: It would look a lot like the releases one. Well, I chose the releases model just because it's really simple. First, you would just import all the database stuff that you need to run it: you need the database session that we have. Well, first you need to import the other file, because that's a part of it, and then you basically just need to import the database stuff and the other stuff that you're writing over here.
A
So
is
this
the
so
core
I
see
so
the
task.pi
Imports
core.pi,
but
what
goes
in
core
dot
pi.
E
Core.py is the actual functionality of the worker; tasks.py is just the logic that starts it. You can have easy error handling for the whole model from there, and then all the actual manipulation of data and insertion into the database happens in core.py.
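A minimal Python sketch of this tasks.py/core.py split. Every name below (releases_model, collect_releases) is illustrative, not Augur's actual API; the real task would also use the project's database session and logging.

```python
# Sketch of the split described above: core.py holds the worker's
# functionality, tasks.py is the thin entry point that starts it and
# wraps coarse error handling around the whole model run.
# All names here are hypothetical.

# --- core.py: the actual functionality of the worker ---
def releases_model(session, repo_git):
    """Fetch release data, shape it, and hand it to the database layer."""
    # Placeholder: in a real worker this queries the data source and
    # inserts via the ORM; here we just return the rows it would insert.
    return [{"repo_git": repo_git, "release": "v1.0.0"}]

# --- tasks.py: the logic that starts the model ---
def collect_releases(session, repo_git):
    try:
        return releases_model(session, repo_git)
    except Exception as exc:
        # In a real task this failure would be logged before re-raising.
        raise RuntimeError(f"release collection failed for {repo_git}") from exc
```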
A
Okay,
so
when
so,
when
you
did
this
I
assume
you
had
to
change
by
Ruth
I
assume
that
you
had
to
change
lots
of
things
or
some
things
about
what
was
in
the
value
worker.
E
I
didn't
have
to
change
as
much
as
I
thought,
I
pretty
much
just
had
to
like
manipulate
it,
so
that
it
would
interface
nicely
with
our
database
orm
and
insert
it
correctly.
But
it
wasn't
that
hard
to
do
like
you're.
E
In
my
opinion,
at
least
just
like
any
of
the
the
list
of
dictionaries
of
data
that
you
need
to
insert
at
the
table
that
you're
inserting
to
get
into
and
do
UDP
on
that
table
for
support
for
on
conflict
to
update
and
if,
if
you're
not
doing
it
on
conflict,
you
update
insert,
then
it
is
most
likely
better
to
do
it
in
an
actual
like
SQL
text
and
then
just
executing
that
SQL
text.
E
But
in
most
places
you
you
would
you
want
to
do
the
on
conflict?
You
update.
E
That's
the
here:
well,
the
insert
data
method
is
for
the
on
conflict
to
update.
There
is
an
option
to
not
to
change
it
to
an
on
conflict.
Do
nothing
like
that's
what
the
insert
data
method
is
for.
E
If
you
want
to
insert
it
manually,
you
still
can,
with
the
you,
can
do
like
a
like
an
s,
dot,
sql.txt
option.
You
can
just
write
SQL
here
like.
E
Or
you
can
just
execute
that
but
yeah,
but
in
most
cases
you
want
to
do
the
on
conflict.
You
update
where
the
ion
clock
will
do
nothing,
which
is
why
it's
just
called
insert
data
and
it's
a
general
method
right.
There
are
cases
in
which
you
would
need
to
do
like
a
specific
SQL
query,
and
you
can
still
do
that.
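The two insert styles being discussed might look like this in SQLAlchemy (which the s.sql.text call suggests Augur uses): a generic on-conflict-do-update upsert built with the PostgreSQL dialect, and a hand-written SQL string for anything the generic helper can't express. The table and column names here are invented for illustration, not Augur's real schema.

```python
# Hypothetical sketch: an ON CONFLICT DO UPDATE upsert vs. raw SQL text.
import sqlalchemy as s
from sqlalchemy.dialects.postgresql import insert

metadata = s.MetaData()
# Made-up table standing in for whatever the worker inserts into.
releases = s.Table(
    "releases", metadata,
    s.Column("release_id", s.String, primary_key=True),
    s.Column("name", s.String),
)

def upsert_stmt(rows):
    """Build INSERT ... ON CONFLICT (release_id) DO UPDATE for `rows`."""
    stmt = insert(releases).values(rows)
    return stmt.on_conflict_do_update(
        index_elements=["release_id"],            # the natural key
        set_={"name": stmt.excluded["name"]},     # columns refreshed on conflict
    )

# Fallback for queries the generic method can't express: raw SQL text,
# executed directly against the connection.
raw_query = s.sql.text(
    "INSERT INTO releases (release_id, name) VALUES (:release_id, :name)"
)
```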
A
So, probably, Ahmed, in what time frame is your group thinking of doing your Augur sprint?
D
Yeah, sorry. I think sometime over the weekend. So maybe we could just galvanize those who will be interested to register for the event, and then let them have an idea of what the project is about so that they can prepare for it. And then we have, like, four-to-five-hour sprints where you can get them involved and contributing in different capacities.
A
Okay,
what
I'm,
what
I'm
thinking
is,
so
you
kind
of
I'm
fine,
so
the
way
that
I
think
it
could
be
done
is
if
we
the
way
that
I
think
this
might
work
is
if
we
would
do
two
things
to
support
that
effort.
A
One
is,
depending
on
the
time
of
day,
possibly
I
could
be
available
for
basic
questions.
Obviously,
there's
a
getting
auger
installed,
part
that
needs
to
take
place.
What
operating
system
are
most
of
the
folks
working
in.
D
Most of them are working with the Windows operating system. For me, I'm working with a Linux machine, and we also have a few people working with Mac laptops, but the majority will be Windows.
D
Yeah
I
think
some
of
the
intermediate
users
will
be
more
comfortable
with
it's.
The
Linux
command
line,
yeah.
A
Okay, so with 20 participants, for new data collection, Isaac, one thing I can think of is if we created a branch that sort of templated out the basic steps, and created an issue explaining what needed to be done for three workers.
A
And
then
so,
if
we
did
that,
then
so,
essentially
we
could
create
a
branch
where
the
the
workers
weren't
working
yet,
but
we
had
some.
You
know
we
described
in
the
issues.
What
needed
to
be
done
for
each
of
the
three
that
I
have
in
mind
and
then
with
20
people,
I
think
if
we
had
a
general
template
they
they
could.
So
everything
that
we
get
right
now
is
from
from
GitHub
I,
don't
know.
E
Yeah, Andrew would definitely appreciate it if people wanted to help with the endpoints.
A
Okay,
perfect,
so
I
met
I.
Think
if
you're
doing
it
this
weekend,
I
think,
maybe
if
you
could
give
us
a
few
days
here
to
put
together
those
templates
and
issues
or
how
much,
how
much
in
advance
do
folks
need.
D
Yeah, we can do it this weekend, or, to get more people, we can have it next weekend. But I also have a question: is there a way we could help people who, for some reason, are maybe not able to finish their tasks that day, and maybe need some extra support to complete their tasks after the event?
A
I
think
one
one
way,
of
course,
is
always
issues
another
another
good
way
is
I
will
schedule
another
one
of
these
sessions
like
if
you
did
it
in,
if,
if
it's,
if
you're
flexible-
and
you
did
it
like
not
this
coming
weekend
but
the
weekend
after
that-
would
give
us
more
time
to
be
prepared.
A
Like
the
weekend
of
the
my
knowledge
of
calendars
is
limited,
hang
on.
Let
me
find
let
me
find
my
calendar.
A
Okay, yeah, so all right, that would work. I can be around to do some support on that day. But I think probably the last thing that we need to think about is that Augur has historically been difficult for people with Windows computers to install. I mean, you're in a Python community; you understand, Ahmed, that everything works differently on Windows. And I don't know how experienced your community is with dealing with all of those idiosyncrasies.
D
Okay, so what we can do is we can just streamline this test cohort to people who actually have a Unix machine, either Mac or Linux. That kind of sets a bar so that we don't waste too much time trying to help people.
E
If it wouldn't be too much harder, the Docker container is functional, although I don't know if it's production-ready. It doesn't have to be production-ready.
E
That's
fair:
it's
definitely
like
ready
to
like
Tinker,
with
at
least
like
last
time.
I
ran
it.
It
ran,
although
I
have
not
run
it
in
a
while,
since
I've
been
trying
to
do
a
bunch
of
facade
stuff.
A
Yeah,
so
so
maybe
the
thing
that
we
can
get
ready
for
you
this
week,
Achmed
is
Docker
for
Windows,
okay
and
the
task,
the
task,
examples
and
stubs,
and
maybe
something
similar
for
API
endpoints
and
perhaps
you
and
I
could
have
a
conversation.
A
Maybe
on
this
day
next
week,
where
we
we
go
through
some
of
that
in
a
bit
more
detail.
Okay,.
A
So
I
will
I,
will
just
put
a
I'll
just
put
the
same
session
on
the
calendar
on
the
chaos
calendar
for
next
week,
and
we
we
can
catch
up.
Then,
if
that,
if
that
works
for
you.
D
Yeah, I think this time next week also works.
D
Is
is
it
it's
it's
for
450
here
right
now,
so
I
don't
know
what
time.
A
I'll just go ahead and make this particular meeting occur again next week, at the exact same time.
A
And
I
I
have
a
dentist
appointment
at
11,
20,
so
I
think
we're
gonna
fall
short
of
getting
through
our
entire
agenda.
Most
significantly,
the
auger
install
but
I
will
I'll
make
a
separate
recording
of
that
later
today,
so
that
so
that
I
can
refer
you
to
that,
and
this
recording
are
you
in
the
chaos
Slack?
A
Yeah, I mean, I'll probably just share it in the general channel, so that others have access to it. There's also an Augur channel on the CHAOSS Slack, and I can share it in that conversation as well. I just wanted you to know I would also be sharing it in the Augur channel on the CHAOSS Slack, so that others are aware of what we talked about here.
A
Well, thank you, Ahmed, nice to see you again. Hopefully you got some stuff out of this. I think, Ruth, we've certainly learned a bit, and we'll talk with you all again soon. All right.
C
Bye-bye. Can you just add that Slack bot to the meeting as well, so that we can get a reminder on Slack for the meeting next week?
C
I guess Ruth was just mentioning to me that there are some things, especially, to be done. Okay.
A
I'll
I'll
ask
Elizabeth
what
do
I
have
to
do
with
the
slack
bot.
A
All
right,
I'll
check
I'll,
take
care
of
that.
Yes,
right
now,.
C
Right
away
from
this
portion,
like
of
developing
the
worker
and
all
we'll
go
through
it
the
next
week
only
right,
yeah,.
A
I
mean
I'll
record
installing
auger.