Description
🤖 This Open Source Friday we're exploring the role that data labeling plays in LLMs.
📆 Join us on June 16 at 1pm ET as Erin Mikail Staples and Chris Hoge discuss how Heartex empowers AI's evolution.
A
Hello everyone, welcome to Open Source Friday. Open Source Friday is a weekly stream that we do on Fridays (hence the name) at 1 pm ET, where we chat with open source maintainers, core contributors, or developer advocates about different open source projects, so that we can get exposure to open source, get inspired, and figure out how we can get into open source as well. Today I'm joined by Chris and Erin, and I would love for them to introduce themselves. Hey Chris and Erin, what's going on?
B
Yeah, so I'm Chris Hoge. I am the head of community for Label Studio at Heartex, and I've been working in open source for a long time. I actually got my start at the University of Oregon. I was building a scientific computing cluster for the College of Arts and Sciences, and this was when AWS was just brand new and cloud was brand new.
B
It was all the big buzz, and we were trying to decide what to install on the system. There were lots of new cloud computing infrastructure choices at the time, and through the advice of a good and dear friend of mine, we went with OpenStack, which wound up being the front runner in the whole cloud infrastructure open source community. That turned into a job supporting OpenStack for the public community at Puppet Labs, which eventually led to a job working for the OpenStack Foundation, helping drive their initiatives on how to improve interoperability, improve the developer community, and grow the ecosystem around OpenStack.
B
But I'm trained as a mathematician; that's how I wound up at the University of Oregon. I've done machine learning in the past, and I went back and worked on the Apache TVM project for a couple of years at OctoML, which was really exciting. It was nice to get back into that open source machine learning world, because it's changed so much in the last few years. When I was doing AI, it was like, you know,
B
"This AI thing is never going to take off," and now it's the only thing that people can talk about. I've been working on Label Studio since about October of last year, so it's pretty exciting to be working in the space. I'm really lucky to have had such a long career in open source, and to continue to grow that and...
B
that I'm really passionate about, with a broader audience. So...
A
Oh yeah, I was just saying you've had a really rich career. It sounds really awesome. I think the note you made about how you were working on AI some time ago (I don't want to say back in the day, that makes it sound like a long time ago) is interesting: now it's kind of resurfaced and become this cool thing that everyone wants to do. People say AI is new, and it's not new.
A
It's been around, but now we're getting to a different stage where people are embracing it, and I think generative AI and LLMs make it easier and more accessible for other developers. But Erin, what about you? Tell me about yourself. And then also, people in the audience, I see y'all saying hi; tell me where you're tuning in from, like which country or state.
D
Yeah, hi everyone, I'm Erin Mikail Staples. I'm a senior developer community advocate at Label Studio and Heartex. I did not take a very linear career path, whereas Chris very much had that very linear career. My formal education is actually in journalism and fandom, so not developer-related at all, but I've accidentally been in developer or developer-adjacent spaces pretty much my entire career. It was not part of the career plan, and then I fell into open source.
D
When I was at a startup as a product manager, actually for a headless CMS e-commerce site, they were like, "Well, you're running this Discord for the beta testers, you should probably be our open source community manager," and I was like, "Cool." And then step one was, "What does an open source community manager do?" and it was downhill from there. So yeah, I think I came into open source just from the journalism background.
D
In school, I worked a lot with the open journalism project and had really loved the philosophy of it.
D
My advisor in graduate school actually wrote a lot about the parallels between open source and journalism. That was Professor Jay Rosen at NYU, so I'd really experienced it from the very theoretical, academic angle. Then I came along and have been at Label Studio since December, and it was kind of fun to go back to working with data and thinking about those problems. I think the thing that makes it really exciting is...
D
If you had asked me five years ago if I would even end up in the developer space... five years ago, I was going to be a news producer, or a lawyer or something. We've changed a lot. But I'm glad that we ended up here, and the open source accessibility in machine learning is really exciting to me.
D
D
It's a very fascinating field if you love pop culture and geeking out, but it's actually really relatable to open source. There's actually a whole study of what open source people can learn from The Sims fandom community; someone has done an entire academic study on it. It's super fascinating, but it comes down to collaboration and accessibility at its core.
A
A
B
Yeah, well, so the project is called Label Studio, and it's an open source platform for data labeling. If we think about the broader machine learning community, you have things like PyTorch or TensorFlow, or even more fundamental libraries like NumPy and pandas: these things that people use to do data science and machine learning. All this stuff has helped this explosion of innovation
B
that's happened in the machine learning world. But one piece that was missing was this: if you go and learn machine learning right now, someone will probably say, "We're going to build a model, and we're going to take all the housing data and do analysis on the housing data," or "we'll take the iris data and do stuff on the iris data," or the handwriting models.
B
"This handwriting is a number six, and this one is a number five," and doing that for thousands of data points. So what the founders of Label Studio realized was that we need an open source platform to be able to do this. It needs to be highly configurable, because you want to label all different types of data; it needs to be easy to use; and it needs to have a collaborative interface so that it's like...
B
D
D
I was doing machine learning through some sentiment analysis and spreadsheets a few years back, but the analogy that got me to wrap my brain around what data labeling is was very much this: computers don't speak English as much as we want them to speak English. We have to tell the computers what something is, and labeling is the process of translating into computer-speak and providing context to it.
A
I think both explanations really did help me to understand, because even when I first saw the title, I was like, "Label Studio, what does that have to do with it?" But that makes a lot of sense. I'm just going to highlight some comments from the audience. learner0418 said, and I guess they're asking for clarification, "So it makes supervised learning easier?" And it looks like you're nodding your head.
A
They also, I think, were commenting on when Erin talked about open journalism; they said, "That's awesome to hear." And then also just shouting out a couple of people: when I asked where they're tuning in from, people are tuning in from Ethiopia and Iran and Morocco and South Africa and Afghanistan and Colombia. So thanks to y'all for tuning in.
B
I love seeing how international this community is. It's one of the amazing things about open source: it opens up so many opportunities for us to collaborate together throughout the world, and it breaks down barriers in time and space and language.
B
It's one of the things that, having worked for the OpenStack Foundation, where we were an international organization and had contributors from all over the world, and going out and meeting everyone... I'm just so excited that everyone is here. When I talk about why I do open source and why I stay in open source, this is a huge reason.
A
B
A
Of course, and yeah, I echo your sentiments. That's why I always ask people, "Where are y'all tuning in from?" Because when I see them say hello, I'm like, "Oh okay, they're tuning in from America," but then when they start telling me, I'm like, "Oh my gosh, you're from everywhere. This is so exciting." But okay, so you all even answered my question on what exactly data labeling is, I guess, unless you have more to add to that. And I also had asked...
B
A
B
I mean, a huge part of why it's so important is this: if we think about what machine learning is, in some ways I just have a giant mathematical model with sometimes hundreds, sometimes thousands, and nowadays billions of parameters. These large language models have billions of parameters that you can turn and that you can tune, and the way we tune
B
those parameters is: we put input into the model and we get output out, and then we ask ourselves, "Well, how close was the output to what I expected it to be, or what I wanted it to be?" Data labeling is actually defining what those inputs are. The input might be the image of a cat, and we would expect the output to be, hey,
B
this is a cat. Then you feed that into the model, and if it comes out and says, "This is a dog," you say, "No, that's the wrong answer," and there are mathematical steps that you can go through to go back and tune all those parameters. So having properly labeled data means that, as you're tuning this model and trying to bring it from a spot where it's always giving you the wrong answer to giving you the right answer
B
99% of the time, the only way you can do that is if the data you're feeding into the model is accurately labeled. That's why it's such a critical step to be able to make it work. These models require a ton of data to turn them from random number generators, which is usually how they start out, into things that are very accurate in making predictions.
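The loop Chris describes (feed in an input, compare the output to the label, nudge the parameters) can be sketched in a few lines. This is a minimal illustration, assuming a tiny logistic-regression model and made-up features; the feature names, data, and hyperparameters are hypothetical and are not from the stream or from Label Studio.

```python
import math
import random

def predict(weights, bias, features):
    """Return the model's probability that the example is labeled 1 ("cat")."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))

def train(examples, epochs=200, learning_rate=0.5, seed=0):
    """Tune parameters so outputs move toward the human-provided labels."""
    rng = random.Random(seed)
    n_features = len(examples[0][0])
    # The model starts out as little more than a random number generator.
    weights = [rng.uniform(-1, 1) for _ in range(n_features)]
    bias = rng.uniform(-1, 1)
    for _ in range(epochs):
        for features, label in examples:
            # How far was the output from what the label says it should be?
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical toy features: (meows, barks); label 1 = cat, 0 = dog.
LABELED = [((1.0, 0.0), 1), ((1.0, 0.1), 1), ((0.0, 1.0), 0), ((0.2, 1.0), 0)]

weights, bias = train(LABELED)
accuracy = sum(
    (predict(weights, bias, f) > 0.5) == bool(y) for f, y in LABELED
) / len(LABELED)
print(f"training accuracy: {accuracy:.0%}")
```

If the labels in `LABELED` were wrong, the same loop would faithfully tune the model toward the wrong answers, which is why accurate labeling matters so much.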
A
Interesting, thank you for that. Okay, and then someone asks for clarification: "The expectations and inputs are your labels, right?"
C
A
That's completely... I'm sure we'll see it in the demo. I just have one other question before we go into the demo: what inspired y'all to get involved in this particular project? From Chris's point of view, I could assume it's because he's had this long career in AI and open source, but I'm just curious from both of your points of view, especially since, Erin, you have less of a developer background, and I'm particularly scared of math. So just curious, yeah.
D
Yeah, I mean, for me, I was very fortunate to take a break between jobs after the last job that I had. I think the things that really drew me to the machine learning world were two points. One: I'm a nerd at heart (it comes with the glasses). I love the academic world, and I was really fascinated by the number of academics in machine learning. A lot of people in machine learning
D
just genuinely enjoy learning for the sake of learning, and they genuinely enjoy this discovery and the new things out there. I actually did love math. I actually hated science in school, because I didn't want to write a lab report, and I learned in my undergrad that if you were a liberal arts major, you could skip your science credits if you took physics. So I was probably the only journalism student to pass Calc 3 and, like, two physics classes, because I didn't want to write lab reports.
D
So don't ask me about biology, but I enjoyed that, and so that was kind of a draw there. And then it was really important to me to be at an open source company. Lauren, who's the VP of marketing at Label Studio, had reached out to me, and at first, I will be totally honest,
D
I was kind of like, "I don't know if machine learning is quite my jam; it's not really my thing." And then I started exploring and playing around with it and was really excited about the amount of research going into it. A lot of it is open source, which is really cool, and we are learning new things in this industry literally every day, which, I think, is for me.
C
A
So that was the other thing. Also, I should rephrase the way that I asked the question, because it sounds like I just assumed. I meant to say I'm interested in AI, but I'm scared of it, because I'm not good at that. So I was interested in, like, yeah.
D
A
D
It's much more... it's like, if you like puzzles and your brain likes that kind of thing, which I think most developers tend to. Now that I'm going back into it: in my last job, I was dabbling with Ruby or doing some Python stuff, but it was more JavaScript stuff and a lot of API calls. Going into this, I don't see myself doing that kind of work as much, but it is very focused on...
D
how do you understand data and what does data actually mean, which does have a more logistics-like flow to it. I think there's a whole thing with analytics: in analytics, we can all make up a number and say, "This is the best number in the world," but without context, that number can mean absolutely anything.
D
B
D
B
I mean, there's something kind of important too about Label Studio and the concept of open data. Let's look at the two biggest open source platforms for machine learning right now. We have TensorFlow, which was developed by Google. It's kind of the older one, it's received a lot of adoption, and it really kicked off this era of exploring deep learning in an open way, right? This...
D
B
amazing open source, given away by one of the biggest companies in the world. And then we have PyTorch, which is kind of the new kid on the block; it's the new hotness, and a lot of new development is happening there. It's part of the Linux Foundation now, and it came out of Facebook, another huge company. And it's like, why are these companies giving away this core technology? Why are they just giving it away?
B
It's because there's a huge amount of value that's locked up in the data, and oftentimes it's treated as very secret and very proprietary. Working for a company that is building an open data labeling platform, with the goal of being able to share and collaborate on data, means that we're helping to drive towards a more open data world where we can share datasets with one another. And Laura in the chat brought up a great point of, like, do we practice...
B
A
Love that, thank you. Oh, and then Erin also put, "If you're a geek about public transparency and open data, you'll likely love machine learning. The rabbit hole I went down was amazing." All right, so I loved all these answers, and love the understanding of what the project is and why you all got involved. I think it's a great time to pivot to the actual demo, so we can get a chance to see how it's working under the hood, or just how we can leverage it ourselves.
D
D
D
But to make things easier, we actually do have a Hugging Face Space, so we'll be launching it from there today. The cool thing is that Hugging Face is kind of a giant place where there's a bunch of open models that you can go ahead and play with without having to download a tool or a container tool like Docker, and it's also pretty open access. So for today's example, we'll just be operating within our Hugging Face Space, and I'm dropping that in the chat as well. And Maddie two shoes, I agree.
D
This is very, very cool. I think discovering Hugging Face Spaces was a huge unblock for me in learning a lot about it. Again, would love to see you on there, but we're going to go ahead and click into the Spaces. We also do have some sample datasets that go along with our tutorials.
D
So if you do check out our tutorials on our website, they tell you how to get started with data labeling, why it's important, and where it actually fits in the machine learning process, so go ahead and check them out. In the meantime, I'm just going to open up this Hugging Face Space, and it'll give you a little warning like, "Hey, this does get reset." Sometimes I've opened up this Hugging Face Space and it has like 30 projects.
D
Consider this like a Google Doc that everybody can edit. So someone can come in and actually delete it all. If you are using it, you can fork it, like on GitHub, but for the sake of today, I'm kind of like, you know what, let's edit in this fun little Google Doc platform. Oh, and yes, I can zoom in a little for you. There you go.
D
Let me know if that's good. So I'm going to go here. You can see I've started to play around with these cool Beyoncé datasets. I'm really official with my title naming, but Rizel and I were talking about Beyoncé before the stream, so naturally I had to find a great dataset. If you are interested: I don't believe in boring learning tutorials, but I do have a list here of pop culture datasets that I have actually been practicing machine learning on, and feel free to add
D
if you find a cool one. There are some things like boy bands in there as well. So we'll create a new project. A project in Label Studio is just your way of creating a window for a portion of your data to label. The dataset that I've already got downloaded on my computer is the Beyoncé concert performances dataset.
D
D
If you have it in a database, you can use the URL, or on AWS, but for the sake of this, we're going to use the really easy CSV upload. So I'm simply uploading a CSV, and then it's going to ask me if I want to treat this as a list of tasks or as time series or whole data. In this case, with a list of tasks, each task is an item that needs to be labeled, as opposed to time series or whole data. Label Studio works with a bunch of different data types.
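To make the "list of tasks" idea concrete, here is a minimal sketch of turning CSV rows into one task per row. It assumes the common convention of wrapping each row's fields under a `data` key; the column names and the two sample rows are hypothetical and are not the actual dataset used in the demo.

```python
import csv
import io
import json

def csv_to_tasks(csv_text):
    """Turn each CSV row into one task whose fields live under a "data" key."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"data": dict(row)} for row in reader]

# Hypothetical two-row sample; the real dataset's columns may differ.
SAMPLE = (
    "song_title,album\n"
    "Crazy in Love,Dangerously in Love\n"
    "Halo,I Am... Sasha Fierce\n"
)

tasks = csv_to_tasks(SAMPLE)
print(json.dumps(tasks, indent=2))
```

Each dictionary in `tasks` is one item waiting for a label, which is the sense in which a CSV becomes a list of tasks.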
D
And now that we have generative models, we have people who are using this for images and annotating parts of the image; we have people using this for video; we have people using this for audio. You can build things like training data for better podcast transcriptions. So what a company would do, if you were using this as an enterprise tool, is like... I think of, I was editing
D
video last night for a personal side project, and I use a tool that helps me put a transcript on my video at the bottom. That has to be trained by an algorithm. A computer doesn't naturally know, "When Erin says this sentence, this is what it means in text."
D
Different accents actually aren't recognized as well, and there are a lot of projects out there that are trying to label different languages, because we often, unfortunately, think English-first, but that's not really fair or equitable. Insert "why open source is important" here. But anywho, for the sake of this dataset, we're going to click List of Tasks and then we'll click Save. Oh, and look, I already went ahead and imported it here.
D
I skipped a step, so I want to go in and just change the settings. Label Studio is extremely customizable, and you can change your interface based on different use cases. We already have a bunch of templates designed, so you can see here all the different ways that you can label data. You can have anything from semantic segmentation with polygons to bounding boxes for image labeling. Computer vision kind of just means image labeling: you're teaching the computers how to... You also have natural language processing.
D
So if you're building a chatbot, you might use question answering or sentiment analysis, or you might do text classification, and each of these is completely customizable. We also have generative AI ones. This is actually part of our 1.8 release, so it literally came out last week, brand new stuff. If you want to retrain GPT-2 or GPT-4 or any of those larger models out there, here's how you would do so, and there are a ton of tutorials.
D
We have a repo on our GitHub on how you can start to retrain them, but for the sake of this demo, just being mindful of time, I think we'll just do a simple text classification here and customize this template. This is typically used for sentiment analysis. I'm actually going to use it to rank the awesomeness of Beyoncé songs, because I think that would be pretty fun.
D
So I'm going to click song title. We have three choices: positive, negative, and neutral. I think I'm going to change those to "great song," and I'm just going to click Add.
D
So our dataset is naturally going to be biased, because I can assume that we're all Beyoncé fans. When we talk about bias, that's how easy it is to inject bias into AI. So actually, thanks for bringing it up. We are all Beyoncé fans; we would probably never give Beyoncé a terrible rating. But say I handed this dataset to someone who was like, "I absolutely hate Beyoncé." And this is also why, like, again, I mentioned early on in the stream that I have a liberal arts background.
D
That's, I actually think, more important than ever in machine learning. So if you are not technical, or you're coming from a liberal arts background, I think it's super important right now that you are curious about machine learning, because context is so important. If you've ever designed a survey, you know how easy it is to bias a survey.
D
The difference is this. Sometimes annotation, or labeling, is outsourced. If you outsource it to a team that isn't provided the context of why you're labeling it, you might be missing things. For the sake of today, we could just be simply creating a log of Chris, Erin, and Rizel's favorite Beyoncé songs, but if we were to outsource this to someone without that context, we might be missing things. And then, back to the Label Studio side of things.
D
We do have an XML file here if you want to further customize it. You can change things like adding context, you can add pop-ups, you can go completely wild if you want to take the time to do that. I am constantly amazed by what people are doing in this area, so I'm always interested if you're building something cool. Additionally, if you've decided, "Meh, I want to do more with my labeling," you can set up things like adding a model, so you can do predictions or get help with assisted labeling.
D
You can do that as well. Assisted labeling means, like, I'll do 100, and then the model trains, understands how you rated those, and then tries to predict the rest in the future.
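The idea behind assisted labeling (label a batch by hand, let a model learn from it, then have it suggest labels for the rest) can be illustrated with a deliberately naive sketch. The word-overlap "model" below is an assumption chosen for brevity and is not how Label Studio's ML backend works; the titles and labels are hypothetical.

```python
from collections import Counter, defaultdict

def train_word_votes(labeled):
    """Count how often each word appears under each human-assigned label."""
    votes = defaultdict(Counter)
    for text, label in labeled:
        for word in text.lower().split():
            votes[word][label] += 1
    return votes

def suggest_label(votes, text, default):
    """Suggest the label whose training words overlap most with the text."""
    tally = Counter()
    for word in text.lower().split():
        tally.update(votes.get(word, {}))
    return tally.most_common(1)[0][0] if tally else default

# Hypothetical hand-labeled examples standing in for "the first 100".
labeled = [
    ("Dangerously in Love", "awesome"),
    ("Lemonade", "awesome"),
    ("Austin Powers soundtrack", "overrated"),
]

votes = train_word_votes(labeled)
suggestion = suggest_label(votes, "Another movie soundtrack", default="awesome")
print(suggestion)
```

In practice a human would still review each suggestion, so the model speeds labeling up without replacing the annotator.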
D
So again, clicking Save, and we can see our basic task view. You'll see each task has an ID and whether the task has been completed or not. There's the total annotations per task, whether you skipped the annotation, and total predictions, again if we had a model added here. It'll also have a nice little "annotated by," which will have my name here, and a song title or item title. We can customize these depending on how it goes, and this dataset has 642 Beyoncé performances in it.
D
D
So pre-processing... there we go, let's see if it'll pop up here. It's like cleaning up the data beforehand. So this is CSV data; if you have, like, missing areas in your CSV.
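As a rough illustration of that kind of pre-processing, the sketch below drops CSV rows with missing required fields before they would be imported. The column names and sample rows are hypothetical, and real cleanup usually needs more care than this.

```python
import csv
import io

def clean_rows(csv_text, required):
    """Keep rows whose required columns are non-empty; strip stray whitespace."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Short rows yield None values; extra cells land under a None key.
        row = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        if all(row.get(column) for column in required):
            cleaned.append(row)
    return cleaned

# Hypothetical messy sample: one blank title and one short row.
RAW = (
    "song_title,album\n"
    "Halo,I Am... Sasha Fierce\n"
    ",Lemonade\n"
    "Formation\n"
)

rows = clean_rows(RAW, required=["song_title", "album"])
print(rows)
```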
C
D
C
D
D
when you pull a random dataset off the internet. So I will have to kind of play around with that, because you can see where I'm getting, like, a... so.
B
But we can actually go back into the labeling interface configuration, and let's take a look at the code, because not only is there a visual editor in Label Studio, but there's actually an XML-type markup language that allows you to define what you want this interface to look like.
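A text classification config in that markup language might look roughly like the string below. This is a sketch under assumptions: the `$album` column and the three choice values are hypothetical stand-ins for what was typed in the demo, and the small checker is illustrative, not part of Label Studio.

```python
import xml.etree.ElementTree as ET

# Assumed config: show the "album" column and offer three choices.
LABEL_CONFIG = """
<View>
  <Text name="album" value="$album"/>
  <Choices name="rating" toName="album" choice="single">
    <Choice value="Awesome"/>
    <Choice value="Soundtrack"/>
    <Choice value="Overrated"/>
  </Choices>
</View>
"""

def check_config(config_xml):
    """Parse the config and confirm every toName points at a named tag."""
    root = ET.fromstring(config_xml)
    names = {el.get("name") for el in root.iter() if el.get("name")}
    dangling = [
        el.get("toName")
        for el in root.iter()
        if el.get("toName") and el.get("toName") not in names
    ]
    choices = [choice.get("value") for choice in root.iter("Choice")]
    return dangling, choices

dangling, choices = check_config(LABEL_CONFIG)
print("dangling references:", dangling)
print("choices:", choices)
```

The `value="$album"` attribute is what ties the interface to a column in the imported data, so a mismatch between that name and the CSV header is one plausible reason a field fails to render.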
D
B
Yeah, I'm worried that that backslash in the dataset might be confusing the labeling interface.
D
Gotcha. Oh, I see. I actually do have another Beyoncé dataset that...
C
D
pulled up. So for the sake of this, we can actually create it in here. Oh, it's doing the same thing, but this one does have albums, so we can rank it by albums. So that is a backslash. I'm going to go ahead here and just quickly speed-run our last section. Instead of spelling out title/album title (that's a good call), we're just going to go straight from album, okay, and then do "awesome."
D
That's going to be our labels. Again, very quick labeling here. It should be pulling up from that album text, but go ahead and click Save, and then we'll start labeling. So it says Austin Powers in Goldmember. This is the album that it's from, meaning it's probably on a soundtrack. I'm just going to put "soundtrack." It might be slightly overrated because it's based on the movie, not necessarily the album itself, and then I click Submit. Dangerously in Love, feel like, yeah.
D
C
A
D
off of your thing. But again, we're just kind of going through. I'm going to just randomly click now, and you'll see it's going to start having the annotations here. We can go ahead and refresh, and we'll start seeing those annotations pop up: when it's been completed and who it's been completed by, which is, you know, my very official email right there that I made in
B
10 seconds. Yeah, well, and actually I could jump in and log in as another user at this point, or...
A
B
D
C
A
Okay, this makes so much sense. I've been playing around a little bit with building different applications with AI, and I've been like, "I don't understand exactly how I fine-tune the data. Do I really have to sit here and specify each thing?" Now that I'm seeing it live, I'm like, this is what I need: I just need to be able to label it and say, "This is a good example, this is a bad example." Love it.
D
So that's what you can do. If you're wanting to do some more advanced stuff: Nikolai, who's our CTO and co-founder, and I went to PyData Berlin, and we actually did a talk about RLHF. So if you're like, "Erin, what the heck is RLHF?", it's basically just retuning or fine-tuning models. This takes GPT-2 and shows how to use Label Studio to help fine-tune a large model.
D
So where labeling fits in with this generative AI and large generative model, or foundation model, perspective is this: these models are so big that first you have to have labelers to actually create the initial model. But the value is that you don't want to have to teach a computer to learn English again. We've already done that; there's...
A
D
out there who's already tackled that step. You're just wanting to make sure that you're reducing bias or improving context. So one case that you can go through is that not every answer is the best answer, so we do have a ranking template in Label Studio, and I can actually show you what that would look like, where you can start to rank items. Now, this dataset isn't necessarily set up to create new items or do ranking, but we can show you what it would look like.
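One common way ranked answers feed into fine-tuning is by expanding each ranking into pairwise preferences. The sketch below shows that expansion under assumptions: the `prompt`, `chosen`, and `rejected` field names and the sample responses are hypothetical, and this is not the specific pipeline from the PyData talk.

```python
from itertools import combinations

def ranking_to_pairs(prompt, ranked_responses):
    """Expand a best-to-worst ranking into (chosen, rejected) preference pairs."""
    return [
        {"prompt": prompt, "chosen": better, "rejected": worse}
        for better, worse in combinations(ranked_responses, 2)
    ]

# Hypothetical annotator ranking, best answer first.
ranked = [
    "Labeling gives the model examples of the right answer.",
    "Labeling is a thing people do with data.",
    "I don't know.",
]

pairs = ranking_to_pairs("Why does data labeling matter?", ranked)
for pair in pairs:
    print(pair["chosen"], ">", pair["rejected"])
```

Because `combinations` preserves input order, every pair keeps the higher-ranked response as `chosen`, so three ranked answers yield three preference pairs.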
B
I think this is a great time to jump in and talk about this. If you go through the templates on the left there, we have computer vision templates, so that you can bring in image data and start doing segmentation analysis or bounding box analysis on it; natural language processing, which is kind of like the sentiment analysis; and audio and speech processing.
B
So you can load audio datasets. We have a bunch of users who are interested in improving the call center experience, and they just have lots and lots of audio data. Annotators go through and say, "The person is happy here. The person is unhappy here. They're asking a question about this," and that helps to build better systems for helping people out.
B
There's time series analysis, too. We have a company out in France that has a whole bunch of devices out in the field, and they want to be able to predict when those devices fail. There's just all of this telemetry that comes out of them, like temperature, CPU load,
B
motion, you know, and then they stream this data back in, and it's all just time series data of events that are happening. It was TotalEnergies, and they were able to build out all of these models to be able to predict, like, "Oh, we have a..."
B
B
We have a template to work on it, and we're able to import it and actually do the data labeling on that. It's really super cool. And we also have a playground where, without bringing up Label Studio, you can go and just try out different labeling interfaces too. Hey,
C
B
want to try out something super complicated? You can just go and try things out. Anyway, I didn't mean to interrupt you on the things that you...
D
I was going to say, that's actually perfect. I still forget all the different things that we have, but yeah.
D
If you want to get started and start exploring, here's a great way to do so, and the Hugging Face Space is another great way to do so. And if you're interested in just learning more about machine learning, the two places that were the most helpful for me: first, Hugging Face has a great Discord where they do really, really awesome tutorials and workshops together, and they have all their assets async. And one thing
D
I love about them is their transparency, commitment to open source, and just the ethics behind it; they have really great ethicists on the team. The other place is actually Bloomberg's open source repo. The Bloomberg machine learning and data team is doing some amazing things and teaching other people in public. I was at PyCon and literally fangirled over them, and this poor person... I was like, "Oh my goodness, you have the best repo, I taught myself so much off of your repo," and then the guy was like, "I didn't build..."
B
Speaking of resources for learning, I think Erin's being a bit too modest, because if you want to get started in Label Studio, if you're looking at this and you're like, "I want to understand more and I want to get started on this," Erin has actually written a tutorial called "Zero to One with Label Studio."
A
Awesome, I'll definitely look into that blog post and all those other links, and I'll try to resurface them again near the end of the stream. Okay, so we have a couple of questions. If you see me look away, for some reason we got disconnected from LinkedIn, not the actual video but the comments, which was really weird, so I'll be going back and forth between the tabs. But we have a couple of questions.
B
Yeah, no, I mean, that's something that you would have to... So when you label an entire dataset and you export that dataset for use in a machine learning model or to share with other people, the things that you're going to get out of that are links to the original data, so, what is the original data. You'll probably be storing your images somewhere on the web or inside of an S3 bucket or Google Cloud Storage, so you'll get links to that original data, and then you'll get a lot of metadata about what the annotation was and who the annotators were and those sorts of things. So you have that long, rich provenance, but there's really no facility to generate that more human-readable "this is where the data came from, this is the bias..."
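To make the export description above concrete, here is a minimal Python sketch of turning an exported file into the kind of human-readable provenance summary that is described as missing. It assumes a JSON export shaped as a list of tasks, each with a `data` object holding the link to the original item and an `annotations` list carrying `completed_by` and `result` entries. The sample records, bucket name, URLs, annotator ids, and the `summarize_provenance` helper are all hypothetical, and a real export may contain more fields or structure them differently.

```python
import json

# Hypothetical sample shaped like a labeling-tool JSON export. The URLs,
# annotator ids, and label names are invented for illustration only.
SAMPLE_EXPORT = """
[
  {
    "id": 1,
    "data": {"image": "s3://example-bucket/dogs/001.jpg"},
    "annotations": [
      {"completed_by": 7,
       "result": [{"type": "choices", "value": {"choices": ["Dog"]}}]}
    ]
  },
  {
    "id": 2,
    "data": {"image": "https://example.com/dogs/002.jpg"},
    "annotations": [
      {"completed_by": 7,
       "result": [{"type": "choices", "value": {"choices": ["Dog"]}}]},
      {"completed_by": 9,
       "result": [{"type": "choices", "value": {"choices": ["Not a dog"]}}]}
    ]
  }
]
"""


def summarize_provenance(tasks):
    """Return one readable line per task: source link, annotators, labels."""
    lines = []
    for task in tasks:
        # Links to the original data live under the task's "data" object.
        sources = sorted(str(v) for v in task.get("data", {}).values())
        annotators = set()
        labels = set()
        for annotation in task.get("annotations", []):
            annotators.add(str(annotation.get("completed_by")))
            for result in annotation.get("result", []):
                labels.update(result.get("value", {}).get("choices", []))
        lines.append(
            f"task {task.get('id')}: source={', '.join(sources)}; "
            f"annotators={', '.join(sorted(annotators))}; "
            f"labels={', '.join(sorted(labels))}"
        )
    return lines


report = summarize_provenance(json.loads(SAMPLE_EXPORT))
for line in report:
    print(line)
```

A summary like this only restates what the export already records (where each item lives and who labeled it). Describing bias in the data would still need human judgment on top of it.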
A
Perfect, thank you. Other questions that I saw come in were mostly on LinkedIn. A lot of people are saying, "How do I get involved in this project?" and I've also seen learner 041a ask that as well. They want to get involved, they want to maybe contribute or something like that. How do they do that?
B
Yeah, I mean, join our Slack community. It's a great place to start and engage with other members of the community. If you want to download the source code and try it, it's all on GitHub; you can just download the Label Studio project. We have lots of getting-started links to just run the software and get involved. You can also contribute. Now, Label Studio is run mainly by a core team that works for Heartex, but we do accept outside contributions. One of the best and easiest ways to get involved is if you have machine learning models that you want to be able to do predictive analysis with, or that you want to do cool things with. We had a community member who went and built an interface to Segment Anything, so that if you want to label a dataset, say, "I just need to label tens of thousands of dogs, different types of dogs," you can click in the interface and it'll just flood fill, like, "this is a dog, I'm labeling it," and send that label back to you. They were talking about how, for the project they were working on, it sped up their annotation process by somewhere between 10 and 100 fold, maybe. And this is just a machine learning model that plugs into our regular interfaces. So there are tons of ways to get involved, and we'd love to see people join the community and engage with what we're doing.
D
Oh, go ahead, Erin. I was just gonna say, on the contribution side, we do have some pretty exciting contributions. The one that I am most recently geeking out about is this Segment Anything one, which was completely contributed (and I think Chris will geek out on this as well) by a college student who was not wanting to study for their finals.
B
But yeah, a little. And not only that, but when this patch started coming in, people from all over the community were like, "When is this thing gonna merge? I have to have this thing, I absolutely need it." And this wasn't the core community. This is just a college student, like, "Hey, this is going to be cool and it's gonna help me in my research."
A
It's just so amazing. This is awesome. I'm, like, mad excited about this product; I didn't even know it was this cool. I think someone asked, and I think Erin also answered, but I'll just say it out loud for people. Someone asked, "I want to label data for sentiment analysis, but the data is in an Indonesian language. Can Label Studio do that?" Looks like Erin said yes, like a resounding yes. Okay, so we learned a lot about how this tool works and how people could get involved. I think maybe I should resurface some of the links that you all showed, and then I'll ask you all some of the wrap-up non-technical questions, because somehow this was already 50 minutes, but it didn't feel like it, just listening to Chris and Erin talk. You all are passionate about this, so it made it really exciting. Okay, so some of the links that we starred. I think people were asking how to get involved and stuff like that. I remember huggingface.co Label Studio was one of the first things that you showed. I don't remember why you showed that, though, if you wanted to clarify.
D
...want to install it via Docker, that Zero to One tutorial does get you started using Docker. I'm a Docker fangirl, so much so that I had to close out of a Docker container before I started the stream, because I was wondering why my computer was running so slow. But if you're like, "I don't even want to deal with that," Hugging Face makes it easier to just get it all started right within the browser.
A
Gotcha, that's what it was for. And I definitely welcome the summary of different links, but we have it on so many different platforms, so I just want to make sure that everybody... because I think I have it on Twitter, YouTube, LinkedIn, everything. So just a double check. I remember you also said the RLHF repository was for if you wanted to figure out how to refine or fine-tune specific data models, so if you had a really large data model like GPT-2 or something like that, you can use this. There were a couple of blog posts. There are a lot of links there: the playground to try things out; the blog post on zero to one, on getting started with Label Studio, which Erin wrote, but Chris is saying it's amazing, it's really great, and I'm definitely going to check it out; and then the awesome pop culture data, if you all wanted to check out that particular one. I think that was what we used as the demo today, if you wanted to check that out as well. And if you want to join the Slack, there's this link as well. Feel free to re-watch the video; I know we went through some of the links really quickly, and if Erin adds those in the show notes...
Okay, next thing I want to switch into, since there's only 53 minutes left, I guess... there are so many questions I still have. I want to try it out; I think I'm going to try it out after this stream. But let me see, just really quickly. I think the only thing before I go into some of the non-technical questions is just more of y'all's thoughts on AI. What are your predictions for the future of generative AI or LLMs?
B
Sometimes they get them horribly wrong. But if we think about what was one of the biggest innovations in the tech space, it was a text box. Back when Google came out, it was like, how does search get better? Search on the internet is awful. We have giant indexes, and search is terrible. And then Google was like, "We're just gonna give you a little text box, and you type in the things that you want to get back out," and you get back magic, just knowledge. This is the exact same thing that's happening right now. You get little text boxes, you type in the things that you're interested in, and you get back what a lot of people are considering to be magic. But it's also about understanding the limitations: these aren't thinking machines. They still heavily depend upon making sure that the data we put into them is the best data possible, which means that it's safe, it's not biased, it's truthful. And that can't be done automatically; that actually takes a lot of attention to detail. There's always going to be a space for making sure the data we put into these things is correct. The other big shift that I'm seeing is that ML is accessible to everybody now, and I think that is the most important outcome.
A
Yeah, I love that. I think the point you made about it being more accessible, and then also that they don't actually know things, right? An LLM, gathering from what y'all have said and my use with Copilot, is just a statistical probability machine. It's almost like autocomplete, but at a larger scale.
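The "autocomplete at a larger scale" description can be illustrated with a deliberately tiny Python sketch: a bigram model that counts which word follows which and suggests the most probable next word. This is only an analogy under stated assumptions. The corpus is made up, the function names are hypothetical, and real LLMs use neural networks trained on vastly more text rather than a table of counts.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus. A real LLM is trained on vastly more text and uses a
# neural network rather than a lookup table of counts.
CORPUS = "open source is great . open source is open . open data is great ."


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def next_word_probabilities(follows, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    counts = follows.get(word, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def autocomplete(follows, word):
    """Suggest the single most likely next word, or None if unseen."""
    counts = follows.get(word)
    return counts.most_common(1)[0][0] if counts else None


model = train_bigrams(CORPUS)
probs = next_word_probabilities(model, "open")
suggestion = autocomplete(model, "is")
print(probs, suggestion)
```

The model never "knows" anything about open source; it only reproduces the statistics of the text it was given, which is why the quality of that text matters so much.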
A
So I think that was a good note. And then Erin wrote, "Data labeling and processing is probably going to be essential for the future, in combination with keeping this open source, if we want to make this an ethical field with integrity." Love that. I think that goes into my next question before the non-technical ones; I just really wanted to know y'all's answer to this. So sorry, I hope we don't go over time too much. Why do you all feel that open source is key to machine learning and AI? I think you all went through it a little bit in the beginning, especially with Erin's background in open journalism and Chris's background in open source and AI. But yeah.
D
Yeah, and again, this was a huge consideration when I wanted to get into this, but I really think open source is super essential. A lot of these models were actually built because of open data. For those of you who aren't familiar with what open data is, it was actually a government initiative put on by the Obama administration to increase transparency in government. This is where my journalism geek hat comes out a little bit. There are open data weeks...
B
Yeah, I mean, I am... well, Erin is frozen. Sorry, maybe...
D
Yeah, it started thunderstorming outside my house, so that is likely my internet. We love it. But so, open data: I don't know where I was at when it got cut off, but open data was actually part of a government initiative in the U.S., designed by the Obama administration, to increase transparency in our government. Basically, there were funds and initiatives created to make our government projects a little more accessible, and a lot of this has been maintained over the years, much like many open source projects...
A lot of this is done in Python, so you're using a notebook. If you've never used a Python notebook before, one of the things that's really great is that they have, like, footnotes on what each thing does, what these lines of code do, so it provides context. A really good Python notebook is amazing, because you can almost follow someone's steps and see what they were thinking. By putting those in the open, you can actually learn from one another, and it allows for advancement of the total field. It also means that if there is bias, we can see it, because everybody has their own biases.
B
I mean, I think what Erin is talking about is important, but even just the openness... openness elevates everybody. If you go to Kaggle, if you go to Hugging Face, there are all these open datasets. You can find the large open image datasets. It's called...
...clutching onto our knowledge and not sharing it. I don't know, this is why I love open source, and this is why I do this thing. I just so strongly believe in it. It's such an important thing to me.
A
Yeah, I love that about tech overall. I think in our industry we might be a little bit more willing to share things than other industries. We'll be like, "Oh yeah, here's this piece of code or code snippet," like Stack Overflow and all that, whereas maybe some other industries are a little bit less willing to share things like that. Okay, we reached the hour mark. I forced it with all these questions; I just had a lot of questions. So I'll quickly do the non-technical ones. What's the first programming language you all learned?
D
My LiveJournal templates, so it's like CSS and whatever I needed, because I was a Sims 2 custom content creator. Wow, flashback. I don't even remember what that was, to customize your profile. And then I think the only actual one in the work world, I did Liquid. It was the first one.
B
So this is going to date me. This is going to be like, how old are you without saying how old you are? My very first program was 10 PRINT "HELLO", 20 GOTO 10, in BASIC on an Apple II Plus. That was the first computer program I ever wrote, and it was mind-blowing to me: I can make this thing say hello over and over and over again.
A
...learned MySpace. I learned SQL, for some reason. I was like, "I'm gonna learn how to code," and I learned SQL, and I was like, "What does this have to do with coding?"
Sure, but okay, if money wasn't an issue, how would y'all ideally spend your time, whether it's job-wise or not job-wise?
B
You can tell, there's a bunch of instruments behind me. I would get better at all those, if time and money weren't an issue. But I do think that giving back is so important also. Before the pandemic, I made it a point to go and pack food at the food bank once a week, and doing more of that is also...
A
No, that's fair, using your privilege to help others. I love that. Okay, what's a dream open source project that you would like to create one day? I know that maybe Label Studio is your dream, but your other dream.
D
I think, actually, this is what I'm giving a talk on at Strange Loop, open data. I would really be interested in supporting more open data projects. I have contributed to thecity.nyc, which is a local non-profit newsroom committed to making sure that every New Yorker has fair and equitable access to information, so they have zero paywalls. Also organizations like The Marshall Project, which does accountability journalism on incarceration. So doing open data projects would probably be how I'd spend my time.
B
I mean, it already kind of exists, but I would love to give even more time and attention to it. There's an open source project called Processing, which is an art platform. It's a platform for creating audio and visual...
B
Yeah, "Scrubs." I mean, the song we sing around the house the most is "Scrubs."
A
That works, y'all. Okay, let me see if anybody else has any questions in the comments. No, no more questions. Okay, cool. I had a really great time with y'all. I'm sorry, I know we went 10 minutes over and you all probably have very busy days ahead of you. So I'm very sorry, I was just really interested in this conversation. Chris, did you freeze? Just so you know. Okay, no, you did. Cool.
A
...so much for coming on and sharing your knowledge, both of y'all's perspectives on open data and AI and machine learning. It was really an enriching conversation for me, and I think it was for the audience. I also want to say thanks to the audience too, because every time I do a stream they're tapping in, they've got things to say, they have good questions, and I really appreciate that as well. I hope to see y'all next week for Open Source Friday. Next week we have another exciting, interesting session. And then Chris and Erin, are there any last things you wanted to say, any things you wanted to promote? Maybe you're speaking at an event, or, I know we had a whole bunch of links up. Let me just put some of them up for people to see. But are there any last things you wanted to drop and promote?
D
I think, don't be a stranger. If you need anything, definitely reach out, and drop us a star on the Label Studio repo, if that's your jam!
A
Right now, or not. I shouldn't say "guys," but people, folks, let's...
Cool, all right. Chris, did you have any last thing you wanted to say?
B
No, no, just thanks for having us on, and yeah, really had a great time.