From YouTube: Education & Workforce WG: Knowledge Graphs
Description
July 2022
Presenter: Pramod Misra
A
I have also worked with companies like Novartis and Procter & Gamble, primarily in data science and machine learning, for more than 20 years in this domain. Today I will be covering the topic of knowledge graphs. One disclaimer: it is a huge topic in itself. There are many books on it.
A
Universities sometimes run a three-credit course on this, but I will try to cover some of the pointers in each of these areas. Feel free to ask questions wherever you have any doubts.
A
We will look at what is needed to deploy it, and I will also talk about one or two use cases that I have recently worked on, to give you a flavor of how it can help. Now, the first question is why anyone, whether in industry or in academia, needs knowledge graphs at all. I'm sure all of us have worked with SQL databases at some point.
A
As long as you are working with a small or limited number of tables, SQL works pretty well. Here is one example: a company has a salary list, an ERP and a CRM, and it works. The moment we start adding more and more data tables, the performance of the SQL queries starts going down, and as it goes down we tend to spend a lot more time just running those queries.
A
For that, we definitely need graph databases, and that is where the need for knowledge graphs and network science comes in. In terms of a brief history, the reason for highlighting 2012 is that in 2012 Google first published on its blog that it was using knowledge graphs for its search engine. Facebook and most of the other big tech companies then started publishing about it as well. Consider the period before 1999:
A
Most search engines were based on indexing. Then we had a whole decade of the PageRank algorithm and its optimization, and I think the period after that is where companies started talking about their work on connected data. The big part of this is that you probably have multiple data tables and multiple data sources, and you want to connect them efficiently.
A
You want to run these algorithms and these queries efficiently. In fact, this has also led to a new designation in companies, the knowledge engineer, which is a kind of interface between data scientists and business users: someone who has expertise in knowledge graphs. So what exactly is a knowledge graph? From a data architect's perspective, it is simply a virtual data layer that sits on top of all your existing databases.
A
It basically facilitates connecting multiple databases, tables and so on, whether the data is structured or unstructured: comments, posts and queries, or the multiple tables you have in your team.
A
I'm sure many of you have seen the first two parts of this picture, where we talk about how the era of computing has evolved. It starts with the procedural era, where we code certain rules, apply them to the raw data and get the answers, and we can also explain those answers with that procedural code. Then we have the second one here, the machine learning era.
A
Here we use machine learning models: I feed in the raw data and my requirement, and we get the rules. The only limitation of the machine learning era is that most of the time you do not get the reasons or explanations for why something is happening. If an algorithm predicts a higher or lower credit score, or recommends a particular medicine, why is it doing so?
A
That has led to the third era, which is the knowledge graph era, where I use machine learning algorithms but at the same time also use knowledge graphs to ascertain what the reason could be. You will not always get an answer, but most of the time it gives you some connections linked with your recommendations or predictions. So for all of this we can use knowledge graphs.
A
If you look at the early examples or use cases of knowledge graphs, they are mostly in places where we have a lot of data points, or a lot of segregated data tables.
A
For example: routing in public transport; insurance risk analysis, where we try to link the coverage with the potential risk to the insurer; and content management, which is again a big area. Companies have a great many data points and documents in their repositories, but if you want to search for one contract, it takes a lot of time to get the right document at the right time. Then, from a telecom perspective:
A
Network asset management is one big area. If you have a failure in the network and want to quickly find out which tower or cell is causing the issue, that is where we use knowledge graphs.
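As an illustration of the fault-tracing idea just described, here is a minimal sketch in Python with NetworkX. The topology, the node names and the `likely_fault_source` helper are all invented for this example; it assumes the network can be modelled as a directed dependency graph and is not the presenter's implementation.

```python
import networkx as nx

# Hypothetical topology: an edge points from upstream equipment
# to the equipment that depends on it.
topology = nx.DiGraph([
    ("core", "tower_1"),
    ("core", "tower_2"),
    ("tower_1", "cell_a"),
    ("tower_1", "cell_b"),
    ("tower_2", "cell_c"),
])


def likely_fault_source(graph, failing_nodes, root="core"):
    """Return the deepest node that is upstream of every failing node."""
    # Nodes that every failing node depends on, directly or indirectly.
    shared = set.intersection(*(nx.ancestors(graph, n) for n in failing_nodes))
    # The shared node farthest from the root is the one closest to the failures.
    return max(shared, key=lambda n: nx.shortest_path_length(graph, root, n))


suspect = likely_fault_source(topology, ["cell_a", "cell_b"])
print(suspect)
```

When the two failing cells hang off the same tower, the sketch points at that tower; when they sit under different towers, the only shared upstream node is the core.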
A
Another big case is web search, where Google started using knowledge graphs, followed by many other companies. Then there is gene sequencing. In portfolio analysis, some of the leading Western banks use it to analyze their portfolios and the potential risk in them. In social media, a lot has been done on finding how people are connected to each other and how that affects decisions.
A
I will talk about one of these examples and how we have used it for one of our customers. Now, to implement any knowledge graph, what are the basic elements we need? One element is a graph database: you need data storage and processing, that is, graph computing. You also need a framework to work with it.
A
In terms of the main varieties of knowledge graphs, I find what Gartner described as the five graphs pretty interesting; in a way it covers all five major use cases. There is the interest graph: how people get interested, and how that is linked to their profile. Then there is the payment graph, which is very interesting and is now being used in fraud and many other use cases; some companies are using it even in money laundering cases, to see how the money moves. Then there is the social graph.
A
Another big one is the intent graph: understanding, when someone buys a particular product, why they are buying it and what their intention is. Last but not least is the mobile graph. So these are the five big graphs that Gartner predicted.
A
If you look at the deployment of these graph databases, one of the key elements is the graph database cluster. You can take Neo4j or any other existing graph database you want to use, and then connect this graph database to all your existing databases.
A
You can do any of the usual things with it, for example ETL. For those who do not know, ETL is extract, transform, load, where we transform our data into a certain format that helps us produce reports.
A
The biggest leverage you get from these graph databases is in visualization. Sometimes we do not get enough information from a table, even when we look at the values, whereas when we represent the data as a network of linkages it is much easier to connect with the story.
A
This is one of the use cases where I have seen this working pretty effectively.
A
This is Glassdoor's integration with Facebook. When Glassdoor started, they had difficulty getting enough job recommendations, because these recommendations mostly travel through your friends-of-friends network. They connected the Glassdoor data with the Facebook data, in this particular case using a Neo4j cluster as the graph database, and they found two pretty interesting things:
A
One is insights. The second is the right recommendations for the job service. So that was a very interesting and powerful use case. Having said that, it is not limited to this kind of problem; it can be used anywhere, from finding the right job to finding the right web page or the right product.
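To make the friends-of-friends idea concrete, here is a small sketch in Python with NetworkX. Every name, employer and job below is invented, and the `recommend_jobs` helper is hypothetical; the real integration ran on a Neo4j cluster, and this only illustrates the kind of traversal involved.

```python
import networkx as nx

# Invented friendship network, employers and open positions.
friends = nx.Graph([("ana", "ben"), ("ben", "dev"), ("dev", "eli")])
employer = {"ben": "Acme", "dev": "Initech", "eli": "Globex"}
open_jobs = {
    "Acme": ["data analyst"],
    "Initech": ["database admin"],
    "Globex": ["ml engineer"],
}


def recommend_jobs(person, max_hops=2):
    """Jobs at companies employing anyone within max_hops friendship links."""
    # Map each reachable contact to the number of friendship links away.
    hops = nx.single_source_shortest_path_length(friends, person, cutoff=max_hops)
    recommendations = []
    for contact, distance in hops.items():
        if contact == person:
            continue
        company = employer.get(contact)
        for job in open_jobs.get(company, []):
            recommendations.append((distance, job, company, contact))
    # Closest contacts first.
    return sorted(recommendations)


for distance, job, company, contact in recommend_jobs("ana"):
    print(f"{job} at {company} (via {contact}, {distance} hop(s) away)")
```

With `max_hops=2`, jobs reachable through a friend or a friend of a friend are returned, while contacts three links away are ignored.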
A
This is another one, where 5g vector worked with one of the leading insurance companies. For those of you who have worked on natural language processing: even now, one of the biggest difficulties in NLP is understanding the semantic rules, or getting the intent of the text. That is where the challenge in NLP comes from.
A
The challenge in NLP is that we break a sentence into small groups of words and try to find the intent from those, and that is where the algorithms start performing worse. The help we get from a knowledge graph, or any similar solution, is that it tries to put linkages between the different words. That is where it becomes extremely useful in deep text analysis.
A
For example, in this case the insurance company wanted to connect the reviews people had written about their products on Google or other social media with the kinds of products most purchased on their website. To do that at scale, you need something that can quickly query this data and return results.
A
That is where the knowledge graph is extremely useful, because it adds the context of the different keywords within the same sentence. That is how it helps in this whole semantic analysis.
A
This was one use case where we worked with one of the leading telecom companies. The company had around 12 different sales territories. In each sales territory they have a sales manager or sales lead, and below each sales lead a team of probably 10 to 12 people.
A
The question they were trying to answer was whether there is a link between the performance of a sales lead and the kind of communication they have within their team and across teams. We worked with this company on that question.
A
One interesting data set here was the email interactions of the sales team. We did not take the text or content of the emails, because that was of no interest to the company or to us. What we were interested in was, for a given person A, with whom they interact most frequently by email and with whom least. The hypothesis was that someone who is more interactive, or at the center of the graph, will do better as a sales lead.
A
The contrast is someone who is isolated. In the charts on the right side you can see someone at the center; this is one sales lead who interacts very frequently. The small numbers show how many times in a month people interacted with each other. We also found many islands: people who are not interacting with others.
A
Or they are interacting only within a smaller group. When we presented these insights and linked them with sales performance, the company got very interesting results. For some people the betweenness centrality is pretty high, so they seem to be big influencers, not only within their own team but across all 12 regions.
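The centrality analysis described here can be sketched with NetworkX, the library the presenter names later in the Q&A. The people and monthly email counts below are invented for illustration and are not the customer's data.

```python
import networkx as nx

# Invented monthly email counts between sales staff (illustrative only).
email_counts = [
    ("lead_1", "rep_a", 40),
    ("lead_1", "rep_b", 35),
    ("lead_1", "rep_g", 12),
    ("lead_1", "lead_2", 20),
    ("lead_2", "rep_c", 30),
    ("lead_2", "rep_d", 25),
]

G = nx.Graph()
G.add_weighted_edges_from(email_counts, weight="emails")

# Total emails per person: who interacts most and least.
volume = dict(G.degree(weight="emails"))

# Betweenness centrality: how often a person lies on the shortest path
# between two other people (the email counts are ignored in this step).
centrality = nx.betweenness_centrality(G)

for person in sorted(centrality, key=centrality.get, reverse=True):
    print(f"{person}: betweenness={centrality[person]:.2f}, emails={volume[person]}")
```

In this toy data the two sales leads come out on top, because every message path between their teams runs through them, while the individual reps score zero.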
A
There are some who are on an island, hardly interacting with the rest of the team, and they were the ones complaining about difficulty in getting support from the central office.
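The "islands" can be found with standard connected-component queries. The sketch below uses NetworkX with invented names; the `central_office` node is an assumption added for the example, standing in for whoever provides support.

```python
import networkx as nx

# Invented email links; "central_office" is a hypothetical support contact.
G = nx.Graph()
G.add_edges_from([
    ("lead_1", "rep_a"),
    ("lead_1", "rep_b"),
    ("lead_1", "central_office"),
    ("rep_e", "rep_f"),  # two people who only email each other
])
G.add_node("rep_h")      # someone with no email links at all

# Connected components, largest first: the main group, then the islands.
components = sorted(nx.connected_components(G), key=len, reverse=True)
main_group = components[0]
islands = [sorted(group) for group in components[1:]]

# People with no links at all, and people with no path to the central office.
isolated = sorted(nx.isolates(G))
cut_off = sorted(n for n in G if not nx.has_path(G, n, "central_office"))

print("islands:", islands)
print("no route to central office:", cut_off)
```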
These are some of the interesting insights we got from doing this analysis. So that is all I wanted to cover. As I said, it is a huge area and the use cases are unlimited.
A
Yes, I have done this with one university. The use case was that they wanted to look across their different courses.
A
They wanted to take the major themes they offer to students and link them with what students are saying about those courses on Google or social media. That was the use case we worked on: whether people are talking more about, say, a computer science or education course, or about a humanities course, and how that is linked with the enrollment they get in that course.
C
This is Mark. Are you using any specific query language to query graphs, or what kind of language are you using?
A
Yes. In our case we are a team of Python developers, so we primarily use Python. But it varies across the different solutions. For example, Neo4j's graph database works with both Python and Cypher. Cypher gives you a bit more in terms of language constructs.
A
C
I think the graph representation depends on the type of problem, because graphs can model anything. It can be a structure; it can be some statistics. So, about the language:
C
When you are building this graph and then try to query it, the query is based on the semantics of the graph. So I am trying to understand whether the language is specific to a particular type of graph or is general purpose, because in Python you need to specify what you are looking for. If you represent a structure, such as connections of routes or social media, the semantics are different.
A
Apart from that, in some places our customers are already using, for example, Neo4j. In that case we end up writing queries in Cypher as well. Because those queries are pretty similar to what we do in Python, there are only small changes. Otherwise, for general-purpose graphs, I have found NetworkX the most useful.
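For readers who have not used either tool, here is a rough sketch of a pattern-style lookup in NetworkX, with an approximate Cypher equivalent in a comment. The triples and the `match` helper are invented for this example, and the Cypher line assumes nodes that carry a `name` property; neither is taken from the presenter's projects.

```python
import networkx as nx

# A tiny knowledge graph stored as (subject, relation, object) triples.
triples = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Bob", "WORKS_AT", "Acme"),
    ("Carol", "WORKS_AT", "Globex"),
    ("Alice", "KNOWS", "Carol"),
]

kg = nx.MultiDiGraph()
for subject, relation, obj in triples:
    kg.add_edge(subject, obj, relation=relation)


def match(graph, relation, target):
    """Subjects s for which the edge (s)-[relation]->(target) exists."""
    return sorted(
        source
        for source, _, data in graph.in_edges(target, data=True)
        if data["relation"] == relation
    )


# Approximately the Cypher query:
#   MATCH (p)-[:WORKS_AT]->(c {name: "Acme"}) RETURN p.name
acme_staff = match(kg, "WORKS_AT", "Acme")
print(acme_staff)
```

The Python version spells out the traversal by hand, which relates to the point raised in the question: a general-purpose library leaves it to you to state exactly what you are looking for.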
B
That's talking about how they are using natural language processing to look at their course connections. It seems like something that could be done at a university level, to look at what courses are being taught, how connected they are, and how related they might be to a specific topic area like data science, big data or artificial intelligence. Have you seen that done anywhere, or is this something that is generally not used? Have you seen interesting use cases of this kind of internal evaluation for universities with knowledge graphs?
A
Yes, you are right. I had a discussion with one university where they suggested something similar. One of the prerequisites for knowledge graphs is that the data needs to be captured, at least in structured form, in different databases. Then it becomes easier to implement. That was the issue in the case where I had this discussion.
A
In that university they did not have the data readily available, so they took some time; they said they would probably start three months later, once they had been capturing this data for at least a three-to-six-month period. But yes, it is a very interesting use case. That is the power of knowledge graphs: as long as you have one connector, one unique identifier, you can connect multiple data points.
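A minimal sketch of that "one unique identifier" idea, using the university example and NetworkX. The course IDs, titles, enrollment figures and post IDs are all made up; the only point is that a shared `course_id` key lets separate sources be joined in one graph.

```python
import networkx as nx

# Three invented sources that share only the course ID.
catalog = {"CS101": "Intro to Data Science", "HUM210": "World Literature"}
enrollment = {"CS101": 120, "HUM210": 45}
mentions = [("CS101", "post_1"), ("CS101", "post_2"), ("HUM210", "post_3")]

g = nx.Graph()
for course_id, title in catalog.items():
    g.add_node(course_id, kind="course", title=title,
               enrolled=enrollment.get(course_id))
for course_id, post_id in mentions:
    g.add_node(post_id, kind="mention")
    g.add_edge(course_id, post_id)  # the shared ID is the connector

# For each course: (number of linked mentions, enrollment).
summary = {
    node: (g.degree(node), data["enrolled"])
    for node, data in g.nodes(data=True)
    if data.get("kind") == "course"
}
print(summary)
```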
A
D
Yeah, on the interesting question you asked about the knowledge graph: I am blanking on the name now, but I know there is at least one math placement assessment program out there that actually does this around fundamental knowledge graphs and what students need to know in order to progress through topics. It is able to use the knowledge graph to pick out what questions to present to students in order to place them into more advanced math courses. I don't remember whether it is out of Wiley or somewhere else.
D
There have been a bunch of presentations at some of the jms and things like that. It would certainly be very applicable to be able to pull out those graphs of course topics. I have seen someone talking about doing that sort of thing, but I don't remember where. It is out there somewhere, if someone wants to go dig it out.
A
Yes, you are right. I have recently had some active discussions with some industries, and it is along exactly the same lines. This lets you connect many different parts of the story that a business user or a domain expert would like to know, but which they have not been able to connect so far because of the limitations of their databases or data points. So yes, definitely.
E
I would add to your remarks, and also Carl's: at the University of North Carolina Greensboro we have used mapping software to analyze the course descriptions of close to 3,600 courses, in and out of data science, and have mapped all of that. We are currently using the same program with another university, which has purchased our services to map out its entire curriculum, and we have added both undergraduate and graduate courses. So we would be happy to present the results of that.
E
It is intriguing, because we have discovered that data science, broadly speaking, including the methodologies within that broad topic, is replete across the university. One of the findings, obviously, is that faculty are not speaking with one another. We have also done this for specific topics being taught in various departments and divisions, whether it is music, sociology, data analysis or whatever it may be. So it is incredibly insightful. And Carl, I'm wondering whether that's the...
B
It was actually different work. It was your colleague who presented with me, Carl and Uma on a panel for a data science education series. He is coming back in August to speak to this group and present on that work, so that is what I was thinking about; we got a preview of the outputs on the panel. Yes, as Carl says, that did happen last month, and he did an excellent presentation.
B
So we wanted him to come and speak to the full group. In August he will be one of the speakers. Okay.
B
So it is Samia.