A
Hey everyone, welcome to a Konveyor meetup. We're super excited to share this new tool with you. Just some housekeeping rules: if you have questions, please put them in the chat, and we'll get to them at the end of the session. That way, the presenters have enough time to get through all the material. If you have a question that we don't get to, feel free to go to the Konveyor Slack channel and ask there. I'll put a link to that Slack channel in the chat as well.
B
Thanks, Jonathan, and thanks, everyone, for listening today. I'm John Rofrano, a senior technical staff member at IBM Research. I'm joined by my colleague, Rahul Krishna, who is a research staff member, and we're going to talk about a project that we've been working on in the Konveyor community called Data Gravity Insights.
B
So, what we're going to discuss: I'm going to talk briefly about the problem we're solving, and we'll look at a broad overview of DGI, Data Gravity Insights. Then Rahul is going to take you through a deep dive, hopefully not too deep, but he's going to go deep. We're going to open the hood and show you what's inside, because we want you to help us build this, right?
B
This is not all built. He's going to do a demonstration of DGI, and then we'll come back and talk about some future work that the community can help us with. So, application modernization, right? Making little ones out of big ones: taking this monolith, which is largely organized by technologies (front end, application, back end) and not around business domains, and breaking it up into microservices that are business driven. So why would I want to do this? I'm going to take this monolith,
B
I've got this one thing that works great, and I'm going to break it into a lot of little things and make a headache for myself? Well, your primary goal should be this: I've got 50 programmers working on this monolith, and I want to have 10 teams of five programmers working much faster around business domains. So how can I figure out what the business domains in the monolith are, the ones I can wrap a small team around so they can be autonomous and move quickly? That's really
B
The thing is moving faster, right? Moving to market faster. So if you look at the state of the art today in any of the tools that help you turn your monolith into microservices, they scan the code and find all these connections. You get some graph like this: a whole bunch of little things, all connected, with lots of lines, and you're trying to figure out the best way to slice between them. Usually they're looking at how often they call each other,
B
you know, things like that, to understand where to partition. But just the dependency between the clusters isn't enough. We need to understand where the big gas giants are that are lurking in your application. What are those objects that everything gravitates to? Because those are probably the center of a microservice, right? So we want to understand these heavy objects. We call it Data Gravity Insights.
B
We want to understand how we find these heavy objects that may be the center point of a microservice, with all these other things orbiting around them. So we take a slightly different approach: what's the most important thing to the customer? The data that they persist. It was important enough that they persisted it in a database. Hello, the data is kind of important. You can't just look at the code. So we took the approach of: yes, the code, the application call graph, important stuff. But what about the schema?
B
What about the relationships within the schema? And then you take the third leg of that and ask: what about the transactions between the code and the data? All of that has to be taken into account, so Data Gravity Insights takes a holistic approach. Look at the code. Look at the data. How is the code accessing the data? When is it accessing it? You want to get a holistic view of your application and how it's put together.
B
So, if I look at the call graph, this is from the famous DayTrader, right? I've got account data beans and quote data beans and market summary beans and stock beans, all sorts of beans, lots of beans in here, and there's a call graph between them. Then I look at the schema. Nobody's
B
looking at the schema. I look at this schema and I've got an account table and a quote table, and there are some foreign keys between, maybe, the holding table and the quote table. So now I've got a different view of the application, where I can see foreign key relationships. I can see which tables have foreign keys into other tables. That's a whole bunch of relationships in the domain, right?
B
If you want to understand the business domain, look at the schema, because DBAs usually do a pretty good job of ignoring technology, the front-end, back-end stuff, and they're just dealing in the business domain. So you look at the schema, then you overlay these views on top of each other, and now you can see: hey, I've got some calls being made at the code level that aren't represented in the schema.
B
I've got some things done in the schema that maybe aren't represented in the code, and so I can see those paths. But I also want to find those gas giants. I want to find those heavy objects and say, these look like the center of a microservice. And as I look at this partitioning, I can see: here are my APIs. All those red lines are the calls that are going to cross partitions, and so this is how I have to build my API.
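The overlay described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DGI's implementation: the class names, the table names, and the `table_of` mapping from a class to the table it persists to are all hypothetical.

```python
def overlay(call_edges, foreign_keys, table_of):
    """Compare class-level calls with schema-level foreign keys.

    call_edges:   (caller_class, callee_class) pairs from the call graph
    foreign_keys: (table, referenced_table) pairs from the schema
    table_of:     hypothetical mapping from a class to the table it persists to
    """
    code_pairs = set()
    for caller, callee in call_edges:
        a, b = table_of.get(caller), table_of.get(callee)
        if a and b and a != b:
            # Project the class-level call onto the pair of tables behind it.
            code_pairs.add(frozenset((a, b)))
    schema_pairs = {frozenset(fk) for fk in foreign_keys}
    code_only = code_pairs - schema_pairs      # calls with no matching foreign key
    schema_only = schema_pairs - code_pairs    # foreign keys with no matching call
    return code_only, schema_only


# Hypothetical example data, loosely modeled on the beans and tables in the talk.
CALL_EDGES = [("HoldingBean", "QuoteBean"), ("AccountBean", "QuoteBean")]
FOREIGN_KEYS = [("holdingejb", "quoteejb"), ("holdingejb", "accountejb")]
TABLE_OF = {
    "HoldingBean": "holdingejb",
    "QuoteBean": "quoteejb",
    "AccountBean": "accountejb",
}

CODE_ONLY, SCHEMA_ONLY = overlay(CALL_EDGES, FOREIGN_KEYS, TABLE_OF)
```

Relationships that show up in only one of the two views are the ones worth a closer look before drawing a partition boundary.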
B
The problem is that's just a 2D view of the world, like an X-ray. An X-ray is fine: I can see broken bones and stuff, but I don't know what's going on behind all that white stuff. That's the myopic view I think of with this 2D flat plane. What we need is an MRI. I need to be able to take the code and turn it around and look under it and see,
B
you know, pull these things apart and see who's really talking to whom. Different filters and different ways of looking at the code are extremely important to understanding all the different relationships in the code. So what I want to do is tilt that view and look under it, and be able to see how those relationships are coming together. We don't have this view yet, so don't get too excited. We want you to help us build this view, but we think we have all the underpinnings, right?
B
Who's talking to whom? When do I have to partition these things, and how should I partition? So, just to go through some of the possibilities, and then we'll go into the technical stuff. Clearly, queries to run to understand the dependencies: those are things that we've already built, triangulating the database and the code. Dynamic calls are very important too. Right now we're just taking a static view of the world. It'd be nice to add a dynamic view of a call:
B
you know, when and where. Watch the application run, because it's important to understand: if this code calls this other code, does it call it once at startup, or does it call it a thousand times a second? That's a really different relationship, so it's important to understand the dynamics. And then find the data centrality and the code centrality. These are the important things in the code. These are the important objects in the data. How do they relate to each other?
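To make the "once at startup versus a thousand times a second" distinction concrete, here is a minimal sketch that weights static call edges with counts from a runtime trace. The trace format and all class names are assumptions for illustration; DGI itself only takes a static view today, as noted above.

```python
from collections import Counter


def weight_static_edges(static_edges, trace):
    """Attach an observed call count to each statically known call edge.

    static_edges: (caller, callee) pairs found by static analysis
    trace:        (caller, callee) events observed while the application ran
    """
    counts = Counter(trace)
    # A Counter returns 0 for edges that never appeared in the trace.
    return {edge: counts[edge] for edge in static_edges}


# Hypothetical data: one startup call, one hot path, one edge never exercised.
STATIC_EDGES = [
    ("Startup", "Config"),
    ("TradeServlet", "QuoteBean"),
    ("TradeServlet", "Log"),
]
TRACE = [("Startup", "Config")] + [("TradeServlet", "QuoteBean")] * 1000

WEIGHTS = weight_static_edges(STATIC_EDGES, TRACE)
```

Two edges that look identical in the static graph end up with very different weights, which is the information a partitioner would want before cutting between them.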
B
So can I find classes that are accessing the data outside of that centrality? Now I've got distributed transactions, and what do I do about those? Do I refactor my data? Do I refactor my code? Or do I create a distributed transaction, or do something like a saga pattern? So that's very important to understand. Then, can I find these anchor classes, these entry points? If you view this graph with all these bubbles around it, you say: hey, this is a really important object, look, everybody's pointing to it.
B
Then you find out it's a servlet. It's the entry point to the system, so of course everybody has to come through it, but it's not an important business object. It's just a router, a traffic cop. So can we annotate the class to say, okay, this one is an entry point? That means doing some annotation on the classes, which we don't have yet and which we hope to add, hopefully with the community. And then, what about the framework being used?
B
So, if I know a little bit about the framework, say I'm using Spring Boot with some model-view-controller, can I label the classes? These are model classes, these are view classes, these are controller classes. That's got to be important information when you're trying to figure out how to refactor this application. And then, identify things like utility classes. Again, I've got this one class that everybody points to, and it turns out it's the log class.
B
No, it's not the most important thing in the system. It's the least important thing. This is a utility class. You just copy it into all the microservices. But it's important to understand that. We've done some work to identify utility classes and say: okay, take all those little utility classes and get them out of my view. They're just clouding up the view. I want to see the business objects. So what can you come up with?
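As a rough illustration of why utility classes cloud the view, the sketch below ranks classes by how many other classes point at them, with and without a set of known utility classes. The edge list and class names are hypothetical, and in-degree is used here only as a simple stand-in for whatever centrality measure DGI actually computes.

```python
def in_degree(edges):
    """Count how many distinct classes point at each class."""
    callers = {}
    for src, dst in edges:
        callers.setdefault(dst, set()).add(src)
    return {cls: len(srcs) for cls, srcs in callers.items()}


def rank_business_classes(edges, utility_classes):
    """Rank classes by in-degree after removing edges that touch utility classes."""
    filtered = [
        (src, dst)
        for src, dst in edges
        if src not in utility_classes and dst not in utility_classes
    ]
    scores = in_degree(filtered)
    # Highest in-degree first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical call edges: everybody points at the log class.
EDGES = [
    ("TradeServlet", "Log"),
    ("AccountBean", "Log"),
    ("QuoteBean", "Log"),
    ("TradeServlet", "AccountBean"),
    ("OrderBean", "AccountBean"),
    ("OrderBean", "QuoteBean"),
]

RAW = in_degree(EDGES)
RANKED = rank_business_classes(EDGES, utility_classes={"Log"})
```

Before filtering, the log class has the highest score; once it is set aside, the business objects rise to the top.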
B
I mean, this is what we really want to do here today with the meetup: show you what we've done and say, come help us build more of this. We've got some foundational work done, but there are lots of possibilities, and we're hoping that you can help us create them.
C
All right, thanks, JR. So we'll do a quick deep dive. I've broken this down into two parts. First we look at Data Gravity Insights a little more closely: what it's comprised of, how we build the graph, and how we can visualize some of the use cases that JR mentioned. Then we look at CARGO, an approach that we built on top of DGI to partition monolithic applications into potential microservice recommendations. So this is the overview of DGI.
C
We start with the source code and package it into one of many formats, and then we extract three key relationships from the application: code-to-graph, schema, and transaction relationships. Once we have these, we persist them in a graph database, which lets us use query languages like Cypher to look for interesting insights. Code-to-graph captures the static dependencies between the various methods, instructions, and classes in the application.
C
We've categorized these dependencies into call-return dependencies, data-flow dependencies, and heap-allocation dependencies. In addition, we have schema-to-graph, which looks specifically at the relationships between the database tables and the columns in the database. Examples include foreign key relationships, among others.
C
So what does this give us? It lets us analyze the source code dependencies, so we know which classes talk to which other classes, where the utility classes are, which classes have a lot of traffic, and so on. It also gives us code-to-database dependencies, which tell us how the source code interacts with external resources or persistent databases.
C
In addition, we have database-to-database dependencies, which let us look at how the various tables in different databases relate to one another. And finally, we would like to think of this as a continuous modernization approach, where we look at runtime statistics, operational traces, and telemetry from tools like Jaeger and Instana.
C
So the question is: what can we do with this data? Here are some examples of what we can build. These include transaction scopes, looking at various data synchronization issues, and inspecting call and control dependencies. In addition, this lets us look for potential RESTful service transformations. We can identify opportunities for code and data refactoring and maintenance, identify distributed transactions, and come up with remediation strategies to handle these distributed transactions, as well as other synchronization issues across services.
C
We have a getting started guide on the Konveyor repository page that gives you detailed instructions on how to start using DGI for your application. It's available as a pip package, so all you need to do is install DGI using pip. The rest of the instructions are here. They're pretty detailed, so I'll just go over the commands themselves and what they do.
C
Once you install the pip package, the command-line tool is dgi. I start with the dgi help, which should give us an overview of what the tool contains. There are a few options that let us interact with the graph database, as well as some command-line options like verbosity and other information. But the key,
C
The key component of DGI is a set of commands that help us build this graph. Here are a few. There is c2g, which stands for code-to-graph, and it lets us add the call-return dependencies, heap dependencies, and other things to the graph. We'll skip partition for now. We have schema-to-graph, or s2g, which parses the SQL schema, potentially through a DDL file, into the graph, and transaction-to-graph, or tx2g, which adds edges that denote the CRUD operations to the graph.
C
And finally, we have partition, and I'll do another deep dive in the next part of this talk about what this is. At a very high level, partition is a command that runs an algorithm called CARGO, which I'll discuss, and which lets us identify potential partitioning strategies in the DGI graph. So, to use DGI: once we've followed the getting started page and we have an application, we can call one of the subcommands. I'm just going to show one example,
C
and this is code-to-graph. The help here should provide more details on what it does, but essentially code-to-graph takes a directory that contains a lot of data that we've mined from the application. You can provide an abstraction level depending on what structure you want to look at. This could be class, method, or full, which includes class, method, and instruction. Once we do that, this will take a while, but it's going to go through the program data and start populating the Neo4j graph with a lot of dependencies.
C
Right now it's doing heap-carried dependencies, and this is going to take a while because there are thousands of relationships to populate. So for the sake of this demo, I have a running example from after running code-to-graph in DGI, and I'll show you how we can interact with it. This is Neo4j Desktop. There is a graph database running underneath which has all the relationships that we're populating. There are a couple of ways to interact with it, and today I'll walk you through Bloom.
C
Bloom is a graphical user interface that comes with Neo4j Desktop, and this is what it looks like. This is a very high-level overview. We can think of the data that we have in DGI in terms of perspectives. There is a class perspective, which looks at all the code dependencies, and there is a database perspective, which looks at the class dependencies as well as the SQL tables and the dependencies between the databases.
C
We can look into this one. We have two types of nodes, the class node and the table node, and a number of relationships between these nodes, like call-return dependencies, foreign key relationships, and so on. In addition, we have a set of queries that we've created. These are just starter queries; as the use cases evolve, we can write more complex ones. As an example,
C
here is a query that we can use to identify data centrality. The search bar lets you run the queries. We can look for data centrality, and this should populate the graph that we see here with a number of relationships between the SQL table nodes, shown in blue, and the class nodes, shown in gray.
C
Bloom also lets us add conditional rules to visualize these. If you look at any of these database tables, for example quoteejb, there should be a centrality score that indicates how central that entity is to the program. A higher value indicates it's more important, and lower values indicate that it's less important. There are rules we can use to differentiate between the most important and the least important classes. In this view we have an example with the database tables:
C
the larger ones are more central, the gas giants analogy, if you will, and the smaller nodes are less central to the application. The edges between them indicate the transaction relationships. In this view, Bloom lets us dismiss other nodes and inspect only a few, if we choose to. Each database table has a set of properties associated with it, and so does every class. For example, there is a centrality measure.
C
This tells us the signature of the class, as well as whether the class is a bean, an entry point, a servlet, and other things. Each relationship indicates the nature of the transaction. This one is a transactional read, so the class reads from the quoteejb table. It tells us the method that initiates this transaction read, as well as the action that initiated it.
C
So this is just a quick overview of some of the options in DGI. In addition to these, we can also inspect individual classes. I can take one example over here. This shows how call-return dependencies exist between classes.
C
One such example is to identify strategies to decompose a monolithic application into a set of microservices. To do that, we used DGI and built an algorithm called CARGO, which was presented at a conference quite recently. CARGO takes the DGI graph and attempts to identify microservice boundaries, like we see here. This is the overview of the approach. I'll go into detail on each of these steps, but in essence, we start with the DGI graph, which is the first step.
C
So the first step is to build a program dependency graph, which is the graph that we have in DGI; that's just the technical term for it. We build what is known as a context-sensitive program dependency graph. If you look at DGI, every node has a context associated with it. What a context does is emulate dynamic interactions in the program. Because we do a static analysis, we really don't have runtime information, and context
C
sensitivity is a way to impart that runtime information into the program. Without context sensitivity, we might miss some key interactions that only appear at runtime and not at static compile time. To give you a quick example of what this means, we have some pseudocode here with a few classes and interactions. We have two objects of type A, as shown here, and both of these objects call A.foo.
A
C
So the first step is, we allocate an object, A1, and it calls A.foo. In a context-insensitive graph, there is a call graph edge between main and A.foo. On the right, in a context-sensitive analysis, the graph not only indicates that there is a call edge, but also which receiver object is initiating that call edge. As we walk through the program, we'll see that A.foo is called twice, from two receiver objects, A1 and A2. In a context-
C
insensitive analysis, this relationship is missed. As we walk through the program, this becomes more of a problem in context-insensitive analysis, where we miss many more of the relationships that actually exist. On the right, we see that context-sensitive analysis includes all the relationships between our two methods, and it also highlights which receiver object initiated the call. By isolating these context snapshots, we can look closely into different dynamic states of the program. Here is the graph again.
C
It's important to note that this graph, although complete, shows all possible dynamic states of the program. But at any given time in a single-threaded application, we can only be in one state: A.foo can be called by either A1 or A2, but not by both simultaneously. To capture this, we extract snapshots. A snapshot is a small example of a dynamic state of the program, which we can derive from the context-sensitive graph.
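The A1/A2 example can be written down as a small Python sketch. It is a simplified model, assuming a call is recorded as a (caller, callee, receiver) triple; the real analysis works on a full program dependency graph, and the data structure here is an illustration rather than DGI's representation.

```python
# Each call site is (caller, callee, receiver object), following the talk's example:
# main allocates two objects of type A, and each of them calls A.foo.
CALLS = [("main", "A.foo", "A1"), ("main", "A.foo", "A2")]


def context_insensitive(calls):
    """Keep only caller -> callee, discarding the receiver object."""
    return {(caller, callee) for caller, callee, _receiver in calls}


def context_sensitive(calls):
    """Keep the receiver object as the context of each call edge."""
    return {(caller, callee, receiver) for caller, callee, receiver in calls}


def snapshots(calls):
    """Group call edges by context: one snapshot per receiver object."""
    by_context = {}
    for caller, callee, receiver in calls:
        by_context.setdefault(receiver, set()).add((caller, callee))
    return by_context


INSENSITIVE = context_insensitive(CALLS)
SENSITIVE = context_sensitive(CALLS)
SNAPSHOTS = snapshots(CALLS)
```

The context-insensitive view collapses both calls into a single edge, while the context-sensitive view keeps them apart, and each snapshot holds the edges of exactly one context.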
C
Along the same lines, we can also extract snapshots that have to do with database transactions. Since the DGI graph has transaction relationships, we can extract subgraphs from it that indicate interactions between the database tables and the classes in the program. Once we do this, we have a set of discrete snapshots, to which we can then apply
C
an algorithm called label propagation, which tries to identify communities in the graph. Label propagation starts with a set of initial assignments and then tries to propagate those assignments through the entire graph to identify partitions. These initial assignments can be random, in which case it would be completely unsupervised, but they can also be user-preferred assignments, if there are specific preferences such as grouping all the classes that handle the web interface together, or those that handle database interactions.
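For readers unfamiliar with label propagation, here is a minimal seeded version in Python. It is a generic textbook-style sketch, not CARGO's algorithm: it runs on a single undirected graph rather than on snapshots, it keeps seed labels fixed, and the graph, class names, and labels are hypothetical.

```python
from collections import Counter


def propagate_labels(adjacency, seeds, max_rounds=100):
    """Spread seed labels to unlabeled neighbors until nothing changes.

    adjacency: dict mapping each node to a list of its neighbors
    seeds:     dict mapping some nodes to a user-preferred label
    """
    labels = dict(seeds)
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adjacency):      # fixed order keeps the run deterministic
            if node in seeds:
                continue                    # seed assignments stay fixed
            votes = Counter(labels[n] for n in adjacency[node] if n in labels)
            if not votes:
                continue                    # no labeled neighbor yet
            # Most common neighbor label wins; ties go to the smallest label.
            best = min(votes, key=lambda label: (-votes[label], label))
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Hypothetical graph with two loosely separated groups of classes.
ADJACENCY = {
    "AccountBean": ["AccountServlet", "ProfileBean"],
    "AccountServlet": ["AccountBean"],
    "ProfileBean": ["AccountBean"],
    "QuoteBean": ["QuoteServlet", "MarketSummaryBean"],
    "QuoteServlet": ["QuoteBean"],
    "MarketSummaryBean": ["QuoteBean"],
}
SEEDS = {"AccountBean": "accounts", "QuoteBean": "quotes"}

LABELS = propagate_labels(ADJACENCY, SEEDS)
```

With seeds, the result respects the user's preferences; starting from random labels instead would correspond to the fully unsupervised mode mentioned above.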
C
So that's the overview. We've packaged CARGO as part of DGI. It's also available as a standalone tool, and it has a lot of options for adjusting how the label propagation behaves, soliciting user feedback to initialize the label propagation, and so on.
C
I'm going to go over the evaluation, just to complete the thought process on how CARGO works and how it performs compared to some other algorithms. We looked at a few applications, as shown here. They span several Java frameworks. They have a number of classes. These are toy examples, so there are just a few hundred classes in many cases, and a few SQL tables.
C
We also looked at some additional approaches that are available in the scientific literature, like Mono2Micro and a few others, and we used these algorithms along with CARGO to see if running DGI and CARGO can enhance their partitioning recommendations. When we do that in our experiments, we use the notation ++ for brevity.
C
We look at a few research questions here to see if this technique works. We evaluated how effective it is in remediating distributed transactions. We looked at the latency and throughput improvements that we might get when we deployed these as running microservices. And we looked at the partitioning quality and architectural metrics that we might obtain if we were to partition the monolith using CARGO.
C
The first question was about distributed transactions. We want to minimize distributed transactions, and to do this, to the extent possible, we want each database table to be accessed by just one microservice partition. To measure that, there is a metric called transaction purity, which measures how pure the transactions are. If the transaction purity is low, a table is accessed by multiple microservices, potentially requiring distributed transaction management.
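One simple way to put a number on this idea is sketched below. The formula is an assumption chosen for illustration (the share of a table's accesses that come from its dominant partition) and is not necessarily how transaction purity is defined in the CARGO evaluation; class, table, and partition names are hypothetical.

```python
from collections import Counter, defaultdict


def table_purity(accesses, partition_of):
    """Score each table by the share of accesses from its dominant partition.

    accesses:     (class_name, table_name) pairs, one per observed access
    partition_of: dict mapping each class to its partition id
    A score of 1.0 means only one partition touches the table.
    """
    per_table = defaultdict(Counter)
    for class_name, table in accesses:
        per_table[table][partition_of[class_name]] += 1
    return {
        table: max(counts.values()) / sum(counts.values())
        for table, counts in per_table.items()
    }


# Hypothetical partitioning: accountejb is local, quoteejb is shared.
ACCESSES = [
    ("AccountBean", "accountejb"),
    ("ProfileBean", "accountejb"),
    ("QuoteBean", "quoteejb"),
    ("OrderBean", "quoteejb"),
]
PARTITION_OF = {"AccountBean": 0, "ProfileBean": 0, "QuoteBean": 1, "OrderBean": 2}

PURITY = table_purity(ACCESSES, PARTITION_OF)
```

Under this simplified measure, a table scoring below 1.0 is touched by more than one partition and is a candidate for distributed transaction handling.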
C
While it didn't fully eliminate them, it made them much fewer in number, so that they're easier to handle. And finally, just running CARGO without any seed examples, in a random manner, also achieved a transactional purity of one, meaning it could partition the application such that all the tables were local to their partitions.
C
In addition to looking at transactions, we deployed two versions of the applications as microservices. The first used the original partitioning from a technique called Mono2Micro, and for the second we used CARGO to refine those partitions, to see if we can get improved latency and higher throughput. We ran these under various loads, ranging from 2,000 to a million users, on a number of use
C
cases, and the key takeaway here is that in all cases, using CARGO and DGI to do this refactoring reduced the latency by 11 percent and increased throughput by about 120 percent, which was quite considerable in our use case. Finally, we have to talk about cohesion and coupling, which we use to evaluate the architectural quality of these partitions. We measured some of these metrics and observed that, again, using CARGO reduced the coupling and increased the cohesion of the applications compared to the state-of-the-art techniques.
C
There are some areas where we think CARGO could do better. One example is business context purity, which measures how closely tied each partition is to a business use case. Since CARGO does not, in its current state, use any business context, it didn't really do well at creating partitions that stuck to a specific domain. We think that with some additional work, and by engaging the community, we can make the partitions from CARGO more aligned with the domains that they tackle. All right.
C
So CARGO is available as a standalone PyPI package, and it's used as one of the dependencies in DGI. So when you install DGI using pip, it should come pre-built with CARGO. But there is a standalone tool, in case there are options to enhance some of the partitioning functionality in CARGO.
C
Okay, so let's clear the screen here. We use CARGO as a subcommand of DGI, and that is dgi partition. I'm just going to ask for help here, so we can see how we would invoke it from the command line. So dgi partition has a few options. The seed input is optional, but if we do provide it, it consumes the user-defined seed partitions. So if you have some preferences on classes belonging to a specific microservice, this is the place to provide them.
C
It doesn't have to be exhaustive, and it does not have to cover all the classes. Any recommendations or preferences can be provided, and the partitioning algorithm will try to respect those initial partitions. Along with that, we have other options like maximum partition size. If there is a preference for having just three or four microservices, for example, that can be provided as an option, but this is also optional.
C
If you don't provide a number, CARGO will infer a sensible partition size and use that internally. To use CARGO, we call it with one of these options, so I'm just going to call it with a partition size of five. Once you do that, this is going to take a few minutes, but I'll walk you through what is happening
C
underneath. CARGO is looking at the DGI graph that we showed, and it's going to make a local copy of it, because we didn't want to tie it to any specific graph database or technology. So it's going to make a local copy, run the partitioning algorithm as I described, find the partitions for every class, and then update the DGI graph with a new property on every node indicating its partition.
C
So I'm going to go back to this view. We ran CARGO once, and I'm just going to show you what it might look like. These are all the classes in the application, or a set of classes that we can visualize, and right now they're all gray. But if you look at any one of these classes, the market summary bean for example, it should have a partition ID. Likewise, another class would have another partition ID, and these partitions were obtained by running CARGO. To visualize it better,
C
An obvious question here is: how could this be useful, apart from visualizing classes and different partitions? One thing DGI can help with is visualizing distributed transactions. Even after we run CARGO, there are cases where we'll have distributed transactions, and it is important to remediate them. We have some Cypher queries that we use to compute distributed transactions, so running the distributed transactions query should populate a graph that contains tables, classes, and the distributed transactions.
C
As we see here, the larger blocks indicate components that are more central, and you'll observe that there are classes that are colored differently, indicating that they belong to different microservices. We see at least three or four microservices here, in yellow, lavender, and orange, and they all talk to certain databases.
C
As an example, we can pick a set of classes to see what interactions they have. This is a quick example of the quoteejb table having transaction reads from two different classes, one from a ping EJB class and another from a servlet class, and they're both reading from the quoteejb table. If you look at the properties of every transaction read, we have a unique transaction ID, and in cases where the transaction ID is the same,
C
as in this example, the transactions would be potentially distributed, because they're both part of the same global transaction reading from the database. So that's a quick example of what we can do with CARGO and DGI to visualize the various interactions and distributed transactions.
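The check described above, where a shared transaction ID spans classes in different partitions, can be sketched as follows. The tuple layout, class names, transaction IDs, and partition IDs are all hypothetical; in DGI this information lives as properties on graph nodes and relationships and is queried with Cypher rather than Python.

```python
from collections import defaultdict


def distributed_transactions(tx_edges, partition_of):
    """Return transaction IDs whose participating classes span several partitions.

    tx_edges:     (tx_id, class_name, table_name, action) tuples
    partition_of: dict mapping each class to its partition id
    """
    partitions_by_tx = defaultdict(set)
    for tx_id, class_name, _table, _action in tx_edges:
        partitions_by_tx[tx_id].add(partition_of[class_name])
    return sorted(tx for tx, parts in partitions_by_tx.items() if len(parts) > 1)


# Hypothetical edges: two classes in different partitions read quoteejb in tx1.
TX_EDGES = [
    ("tx1", "PingEJB", "quoteejb", "read"),
    ("tx1", "TradeServlet", "quoteejb", "read"),
    ("tx2", "AccountBean", "accountejb", "write"),
]
PARTITION_OF = {"PingEJB": 0, "TradeServlet": 1, "AccountBean": 2}

DISTRIBUTED = distributed_transactions(TX_EDGES, PARTITION_OF)
```

Each ID in the result marks a transaction that would need remediation, for example by moving a class, refactoring the data, or introducing something like a saga.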
C
That brings me to the end of my talk. I want to hand it back to John, who will talk you through some additional use cases that we have in mind.
B
Yeah, thanks, Rahul. If we could just bring up my slide. Thank you. So, future work. This is where you come in, right? One of the things that we could do: if you look at the output of DGI, which we didn't show you, it's just a JSON file, or, I forget, maybe a plain text file, but it's nothing to look at. So the idea is, you know,
B
As I mentioned, we want to have dynamic operational data, so doing some dynamic scanning, traces through the program as it's running, and adding that to the graph. Once again: okay, this is calling that, but is it calling it once at the beginning, or a thousand times a second? New languages: right now, DGI only works with Java, but Java is not the center of the universe. There's lots of Java out there, but Python and Go are becoming very, very popular for microservices.
B
Could we support other languages, especially C#? There's lots of Windows stuff out there. Enhancing the support for the Java frameworks that we have, Spring Boot and other frameworks, and adding more frameworks that we understand. Remember, I talked about model-view-controller: by understanding the framework,
B
Can we infer what these classes are being used for? Then support for distributed transactions, and being able to generate code that uses Saga patterns. In other words, you give the architect this report. Now what do you go do? Right now, it's the exercise for the student. So we would like to be able to generate code, generate stubs, and take care of distributed transactions.
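For readers unfamiliar with the Saga pattern mentioned here, the following is a minimal Python sketch of the general technique, not code that DGI generates: each step is paired with a compensating action, and when a step fails the already completed steps are undone in reverse order. The function `run_saga` and the order/payment steps are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# (step name, action, compensating action); all names are hypothetical.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _compensate = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, _action, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            break
        completed.append(step)
        log.append(f"done:{name}")
    return log


state: Dict[str, int] = {"stock": 5}


def reserve_stock() -> None:
    state["stock"] -= 1


def release_stock() -> None:
    state["stock"] += 1


def charge_card() -> None:
    raise RuntimeError("payment service unavailable")


def refund_card() -> None:
    pass  # nothing to undo: the charge never happened


saga_log = run_saga([
    ("reserve_stock", reserve_stock, release_stock),
    ("charge_card", charge_card, refund_card),
])
```

Here the stock reservation is rolled back by its compensating action when the payment step fails, which replaces the single global transaction the monolith relied on.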
B
We need a UI for visualization. It's great using Bloom, and it got us pretty far, but we would love to have someone who understands human-computer interaction really build that 3D view, where we can turn things around, look behind them, look under them, and see what's going on. It's kind of screaming for a really cool visualization that we need to build. And then we're using DiVA, which is another Konveyor project, and it has a set of persistence frameworks it supports, and there are always more persistence frameworks.
B
So we're looking at enhancing the persistence frameworks in DiVA, whether we do them as part of DiVA or as a set of adapters, either here or there. We'd love to have the community's input on what you think is the best way to do that. But the goal is enhancing the persistence frameworks that we support, so that we can understand the distributed transactions going on. We are also currently enhancing schema-to-graph, looking at triggers.
B
So it's great to understand: here's the schema, here's a relationship. But what about all those triggers, where when this gets updated, that automatically gets updated, and the application doesn't know what's going on? What about stored procedures? There's lots of stuff with stored procedures out there. So could we use the information from the stored procedures to understand, again, when this is being updated, is something else being updated? What's happening?
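One way to picture the trigger question is as extra edges in the schema graph. The Python sketch below is an illustration under assumptions, not how DGI models triggers: it assumes trigger definitions have already been reduced to a mapping from an updated table to the tables its triggers write to. The function `tables_touched` and the table names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def tables_touched(updated: str,
                   trigger_edges: Dict[str, List[str]]) -> Set[str]:
    """Return every table written, directly or via triggers, by one update."""
    seen = {updated}
    queue = deque([updated])
    while queue:
        table = queue.popleft()
        for target in trigger_edges.get(table, []):
            if target not in seen:  # guard against trigger cycles
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical triggers: updating orders writes inventory,
# and updating inventory writes audit_log.
trigger_edges = {
    "orders": ["inventory"],
    "inventory": ["audit_log"],
}

touched = tables_touched("orders", trigger_edges)
```

The application code only mentions `orders`, yet the update reaches two more tables, which matters when deciding which tables a candidate microservice would really own.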
B
And then, what can you think of for future work? Open an issue. Let us know what you think. If there are other ideas that you have, we would love for you to join us and help us build this. The pointer to the GitHub repository is down there at the bottom. We're actually using GitHub Projects, so we've got a kanban board and we've got stories on the kanban board, but we would love for the community to come help us.
B
You know, areas not just Java, but C# and whatnot, and visualization. We need your help to make this thing as cool as we possibly can, to be really useful. There's never going to be a tool where you push the button and it makes microservices. You're always going to need an architect who's guiding it along the way. So I totally believe that the tool needs to assist the architect in making architectural decisions, give them all the information they need to make
B
Those decisions, and show them different ways of viewing their application. But at the end of the day, I would not hire an insurance architect to re-architect my banking application. I want someone who understands the banking industry. You need to have that context. So we envision this as a tool that is going to assist the software engineer, the architect, who's going to re-architect or redesign this application, and help them understand
B
You know, where those big heavy objects are, where the microservices are, and where the business domains should be. So please come help us. I'm pleading with you, but we'd love to have you join the team, join the community, and help us make this into something great. So, Jonathan, back to you. That was my plea. Awesome.
A
Thank you, John. Thank you, Rahul. Such an awesome demo and show. For anyone who has questions, feel free to put them in the chat right now. While we have John and Rahul here, we can get them to answer a few. And in case you don't have any questions now but you may later, whenever you start trying to use the tool, I put the link to the Konveyor Slack channel in the comments, and you can see it on the screen now.
A
B
That's a great question. Yeah, I think it's time to do that. We've been having internal meetings, but now that we've announced it to the community, I agree it's time to have a weekly community meeting, or maybe a semi-weekly community meeting, where we're discussing these and having our scrum calls, so to speak.
B
So yes, we will post that in the README in our DGI repo. It's time to have community meetings now, so we will start those up, absolutely, and hopefully you'll join us, so it won't just be the same people. I want a community meeting.
A
All right, well, with that we're going to call it a show. John, Rahul, thank you again so much, and people will be pinging you in Slack once they get to using it.