Description
This time the ad-hoc topic is the NebulaGraph cache design, for now and the future, by Wen Hao, our contributor in Storage.
A: Today's ad-hoc topic will be shared by windhawk, our community contributor. She will share the cache design, for now and the future, in NebulaGraph.
A: One thing to mention is that we are actually changing the frequency of the community meeting: I will make it every four weeks instead. We will see how it goes at this new frequency, and if you have any suggestions or new ideas, just let us know.
A: You can find everything in this shared pad or in our community repo, nebula-community, under the vesoft GitHub organization. We don't have new members to introduce today. We will have this meeting every four weeks; we will go through specific topics that anyone would like to bring, and have open discussion in this stage. Anyone who would like to bring up ideas can let us know in Slack, in GitHub Discussions, or by sending emails to us.
A: In the last four weeks we have had a bunch of updates touching the ecosystem. The first one is that our contributor helped create a MyBatis integration for NebulaGraph, and the repo is under the nebula-contrib organization. This is the link, so if you are interested, just feel free to check out this repo. We also received some contributions regarding the Flink connector; one of the big things is getting started on supporting Flink SQL, by Spike Liu and liuxiaocs7.
A: They are working together in this domain, and there is actually one PR merged this week. With the help of this work, we now support Flink up to 1.14. The other one is that this week we just merged a PR to help users leverage PySpark with the nebula-spark-connector; that one was contributed by me. Then I will briefly preview some of the contents of the 3.2.0 release.
A: On this page they are all enhancements. The first one is that we actually revisited our default configuration values: we added a bunch more to the default configuration, so that users don't have to dig into the system to figure out some of the configurations, and we changed some of the default values to ones that make more sense. You can also see that we made a bunch of optimizations on specific operators in the query engine.
A: There is also newly added syntax support, which is the extract() function in the MATCH query, so you can do some regular-expression work with the help of this function.
A: The final one is that we are optimizing memory allocation with an arena allocator, which is another improvement in performance. And you can see there are a bunch of other updates that I will not dive into here. Then I want to bring in windhawk, our community contributor, for the sharing regarding the NebulaGraph cache.
B: In NebulaGraph we use an adjacency list to represent a graph, because an adjacency list is good for getting the neighbors of a vertex.
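As an illustrative sketch (plain Python, not NebulaGraph's actual C++ storage code), an adjacency list makes "get the neighbors of a vertex" a single map lookup:

```python
# Minimal adjacency-list sketch: each vertex maps to the list of its
# out-neighbors, so a neighbor query is one dictionary lookup.
graph = {
    "v1": ["v2", "v3"],
    "v2": ["v3"],
    "v3": [],
}

def neighbors(vid):
    # O(1) to locate the list, O(degree) to return it
    return graph.get(vid, [])

print(neighbors("v1"))  # ['v2', 'v3']
```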
B: Now let me briefly discuss the issues that we are going to solve with the NebulaGraph cache. Actually, the issues that we are going to address come from our findings on graph-database storage access patterns. The first finding is that vertices in a graph database usually have low spatial locality. Let's take the example of a simple query: get the neighbors of a vertex.
B: After retrieving the edges, basically the keys of the edges, we can easily get the destination IDs of these edges, which point to the neighboring vertices. And we know that in NebulaGraph we use hashing to partition the vertices. That means the neighboring vertices and the source vertex may or may not end up in the same partition.
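A toy sketch of that point (the partition count and the hash here are illustrative, not NebulaGraph's actual scheme): because vertex IDs are hashed into partitions, a vertex and its neighbors usually scatter across different partitions.

```python
# Hash-partitioning sketch: vertex IDs are assigned to partitions by a
# hash function, so neighboring vertices scatter across partitions.
NUM_PARTS = 4

def partition_of(vid: str) -> int:
    # stable toy hash; NebulaGraph uses its own hash over the vertex ID
    return sum(vid.encode()) % NUM_PARTS

src = "v1"
neighbors = ["v2", "v3", "v4", "v5"]
parts = {v: partition_of(v) for v in [src] + neighbors}
# The neighbors typically do not all share the source's partition.
```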
B: And that means they may not reside on the same storage host. This process continues if we go on to retrieve the properties of neighbors which are more than one hop away. So essentially, in the diagram here we have a lot of vertices, and because we are using a hash function to partition them, these vertices may end up on different storage hosts.
B: What this brings about is that retrieving the properties of multiple vertices will usually require accessing multiple partitions. And if we want to traverse the graph and retrieve the properties of vertices which are n hops away, with n greater than one, then the number of random vertex accesses will increase exponentially with the number of hops.
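A quick back-of-the-envelope sketch of that growth (assuming, purely for illustration, a uniform average out-degree d): an n-hop traversal touches on the order of d + d² + … + dⁿ vertices.

```python
# Random vertex accesses grow exponentially with hop count.
# avg_degree is an illustrative uniform average out-degree.
def accesses(avg_degree: int, hops: int) -> int:
    # vertices reached within n hops: d + d^2 + ... + d^n
    return sum(avg_degree ** k for k in range(1, hops + 1))

print(accesses(10, 1))  # 10
print(accesses(10, 2))  # 110
print(accesses(10, 3))  # 1110
```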
B: Therefore the vertices will usually have low spatial locality. And we know that in RocksDB we use the block cache to provide some of the caching capability, so that means the vertices in the block cache in RocksDB have low spatial locality. The second key finding is about empty-key accesses. We know that in a graph database the data is schema-less, which means the schema of the data in a graph database is not fixed, and the way this is achieved in Nebula is by using tags.
B: So let's look at an example here. We have a vertex and we have three tags: person, student, and athlete. The vertex can be associated with one or multiple tags, so a vertex can be a person, can be a student, can be an athlete, or any combination of these three. By this means, vertices in the graph database can have different schemas.
B: Now assume we have a query like this: it means retrieving the properties of vertices which have the tag person, but returning the properties of such a vertex with all possible tags, and we already know that a vertex may be associated with one or multiple tags. The way we accomplish this in NebulaGraph is by concatenating the vertex ID with all the possible tag IDs, constructing all the candidate vertex keys, and then trying to retrieve the properties from RocksDB with all the possible keys. If a key doesn't exist in RocksDB, then we know that the vertex is not associated with that particular tag ID.
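A toy sketch of that probing pattern (the key layout here is simplified; Nebula's real vertex key also encodes the partition ID and more): candidate keys are built as (vertex ID, tag ID), and every miss is an empty lookup that only tells us the vertex lacks that tag.

```python
# Sketch of vertex-key probing: the store keeps one key per (vertex, tag)
# pair, and a lookup with an unknown tag set must try every candidate key.
store = {
    ("v1", "person"): {"name": "Alice"},
    ("v1", "student"): {"school": "X"},
}
all_tags = ["person", "student", "athlete"]

def fetch_all_tags(vid):
    hits, empty_lookups = {}, 0
    for tag in all_tags:
        props = store.get((vid, tag))   # one point lookup per candidate key
        if props is None:
            empty_lookups += 1          # key absent: vertex lacks this tag
        else:
            hits[tag] = props
    return hits, empty_lookups

hits, misses = fetch_all_tags("v1")
# "athlete" is probed but absent, producing one empty lookup
```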
B: So what this brings about is that it causes a lot of empty accesses in RocksDB. These are actually the two key findings that we came across in Nebula's storage access patterns.
B: How we improve on this is by designing the Nebula storage cache. This is the architecture of the Nebula storage cache: it has a component inside the RocksDB part and components outside the RocksDB part. Inside RocksDB, we still provision a block cache, which is essential for things like filter blocks and index blocks. The block cache can also hold edges, because edges usually have better locality than vertices in a graph database.
B: Outside RocksDB, we provision a cache space using CacheLib, and we further divide the cache space into two pools: one is the existing-key cache pool and the other is the empty-key cache pool. The existing-key cache pool is mainly used to store the keys and properties which exist in RocksDB, and the empty-key cache pool is mainly used to cache the empty keys, which were queried but do not exist in RocksDB.
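A minimal sketch of the two-pool read path (illustrative Python; the real implementation sits in the C++ storage engine on top of CacheLib): the empty-key pool acts as a negative cache, so repeated lookups for absent keys skip RocksDB entirely.

```python
# Two-pool storage-cache sketch: hits are served from the existing-key
# pool, known-absent keys from the empty-key pool; only cold keys fall
# through to the backing store (a dict stands in for RocksDB here).
existing_pool, empty_pool = {}, set()
rocksdb = {"k1": "props1"}
backend_reads = 0

def get(key):
    global backend_reads
    if key in existing_pool:          # positive cache hit
        return existing_pool[key]
    if key in empty_pool:             # negative cache hit: known absent
        return None
    backend_reads += 1                # cold: go to the backing store
    value = rocksdb.get(key)
    if value is None:
        empty_pool.add(key)           # remember the miss
    else:
        existing_pool[key] = value
    return value

get("k1"); get("k2")                  # two cold reads hit the backend
get("k1"); get("k2")                  # both now served from the pools
```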
B: And this is the configuration of the storage cache, so let me briefly talk about the options one by one. This is the main switch: enable_storage_cache is the main switch for the storage cache, and this is the total capacity that we allocate to the storage cache. Pay attention here that the block cache size is outside this section; it is an existing configuration option in our configuration file.
B: So the configuration here only manages the storage-cache space, the part which we implement using CacheLib. And the next configuration here is a very important one, which is very sensitive to performance.
B: It requires you to put in the estimated number of cache entries on this storage node, as a base-2 logarithm. If the number you put here is too low, meaning it is much lower than the actual number of entries on the storage node, you may suffer from low performance. Then these two sections are the configurations for the existing-key cache pool and the empty-key cache pool, respectively.
B: The first section has the switch for the existing-key (vertex) cache pool, the capacity for that pool, and the TTL for the items in it. The second one manages the empty-key cache pool: the switch, the capacity, and the TTL as well.
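Putting the options just described together, a hedged sketch of what the storaged configuration section could look like (the flag names and values below are illustrative, patterned on the description above; check the release's own nebula-storaged.conf for the exact names):

```
# Main switch and total CacheLib-managed capacity (MB) for the storage cache
--enable_storage_cache=true
--storage_cache_capacity=1024
# Estimated number of cache entries on this node, as a base-2 logarithm;
# setting this far below the real entry count hurts performance
--storage_cache_entries_power=20
# Existing-key (vertex) pool: switch, capacity (MB), item TTL (seconds)
--enable_vertex_pool=true
--vertex_pool_capacity=512
--vertex_item_ttl=300
# Empty-key pool: switch, capacity (MB), item TTL (seconds)
--enable_empty_key_pool=true
--empty_key_pool_capacity=256
--empty_key_item_ttl=300
```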
B: So here is the performance improvement that we can achieve by using the Nebula cache. First, with this GO one-step query with a tag, which means we explicitly specify which tag we are going to access, there will be no empty keys when running this query, so we can achieve a 20% latency decrease directly with the existing-key cache pool.
B: Similarly, if we run this fetch of neighbors' properties with a given tag, again the tag is specified, so there will be no empty keys and we only provision the existing-key cache pool; we can achieve a 16% latency decrease. The next two queries try to access the data with all the possible tags.
B: So there will be a lot of empty keys. For GO one step, if we provision both the empty-key cache pool and the existing-key cache pool, we can achieve a 49% latency decrease and a 77% QPS increase, and for GO two steps we can achieve an even larger latency decrease and QPS increase. This is because it is more than one hop: as I discussed earlier, if we have more than one hop, the number of random accesses for vertices increases exponentially.
B: So the more steps you have, the higher the potential performance improvement you can achieve with the Nebula cache.
B: Okay, let me briefly talk about our future projects around the Nebula cache.
B: In our next few projects, we are going to provide in-memory caching in the cloud to improve performance, and also a local storage cache to improve performance for deployments that put data in object storage in the public cloud. We are also going to provide caches for other layers in the Nebula architecture; for example, we can provide a cache for the query result, and we are also going to provide a cache for the graph structure.
A: Well, thank you so much for the excellent sharing. This is actually the first time that we have tried to invite a contributor to share with the community on different domains of NebulaGraph.