Description
gRPC in .NET, Python and Go machine learning systems
How gRPC is used to minimize latency in a distributed machine learning system. Different languages and operating systems communicate quickly without the overhead of REST communication: when avoiding even a 0.1 ms delay matters.
In our case, at GhostWriter AI, which is a product of my company, we use advanced natural language processing algorithms to understand text, and now images, just to understand what people are talking about: the contacts in the CRM, on the social networks, on the calls with the company.
The idea is to collect data, to analyze this data and to provide hints and suggestions, arriving in the last year at automatically creating text and automatically creating advertising. That is something that now looks quite normal, not so crazy, but it was crazy when GhostWriter and the guide product were created and the idea was born. Just imagine: we are talking about nine years ago.
So if I said "I want to visit Rome", the guide would scrape the voices about Rome, the events, the museums, the tickets, and, knowing my preferences, add my preferences to the preferences of my friends and create a real-time guide, with text, opening hours, tickets to buy. So we are talking about something from nine years ago, something really old in the age of technology.
That means there was no TensorFlow and no PyTorch. There were no labeling support services. So if we wanted to create a dataset, we had to collect the dataset ourselves. If we needed data, we had to create data. If we wanted to cluster something, we had to build by ourselves a system to cluster data for analysis.
So everything inside the machine learning system communicated through REST APIs, including the high-level APIs that aggregate machine learning algorithms inside our platform, and the integration with the other services, where our customers are, was also REST API.
This is not the real architecture, just a very simple abstraction. Imagine that the laptop is our customers. We know what each customer uses, like when you create tokens on Google Cloud: we know the APIs, we know everything about the services the customers use. These customers arrive and, to have access to the public services, the first thing is authentication and authorization; then load balancing of the REST APIs, and an application firewall, just to understand how to mix this information and to authorize certain kinds of operations.
Then there are the analytical services. These high-level services can communicate among themselves and use the low-level exposed machine learning models to collect the data, gather the information and aggregate a response. It means that a single call that arrives from the laptop is completely different from what arrives at the end of the flow; it is not one-to-one, because a high-level service can aggregate a lot of other services.
So a customer said: "I want an SDK to do all the calls in Go, because I see your REST APIs. Your REST APIs are a lot of calls, and it's very complicated, because I have to mix them together, and sometimes it takes a lot of time to respond, because I asked for something very complete."
Okay, so we started to do everything in Go, and then our second customer asked: okay, I want the same for Java. So within a week we started to create that SDK too, but this is a lot of work; there were only five of us at the time. And many times, as you know, these are problems that we have without seeing them: inside the data centers of customers, they started to suffer from a lack of data.
So, okay, is this an area that is not possible to optimize? It was already optimized, but there is a 10 percent, from the arrival of the first call, through the machine learning area, and back to the response, that we thought was possible to optimize. And just imagine: with 200 thousand calls every hour, a ten percent optimization is not nothing.
I mean, that is not a small amount of data. So, okay, we have to optimize the data center area. If we optimize the data center area and optimize the machine learning area, we can do something: we can remove some API services. What did I remove? If you go to the slide, you can see the highlighted services.
We included ONNX on the machine learning side, so we no longer have a high-level service that calls a machine-learning exposed model; instead the model is included in the relevant service. But it is then not possible for this model to be shared, and we have some other kinds of problems, because we have to put this inside our applications and we have 400 instances.
The second problem we had was the load balancer. We moved the load balancing directly onto the client with gRPC. It was also easier to manage, and to reach the other systems. Just imagine: before, everything was routed through the load balancer to do REST API calls.
Our customers, obviously, have networks, as you can imagine, with COBOL and so on, and they asked: okay, but we don't have the firewall authorization and the people to immediately switch some of our services to call gRPC. So the switch was not on the client side, just internal on our side. So we also used KrakenD, just to say: okay, we can create a gateway for gRPC, so our customers can just continue to use REST API, and we can move the inside to gRPC.
Now we can see just a few calls from the guide: the C# calls to create a channel with SSL, and the same with Python. With gRPC, if you already have something in place to authenticate people, you can just use it, and there is a way to automatically attach a token, just like what you do on a cloud platform.
Four milliseconds: that seems like not so much, but for a few hundred thousand calls per hour, it means that we save 13 minutes per hour. So just by saying "we are rewriting something in gRPC", okay, we gained around 20 percent just on the channel, on the movement of the data inside the platform. gRPC was new for us; we didn't know gRPC so well. So we configured all the services using unary calls, and some streaming channels also, just to start to have a look.
We didn't worry too much about latency and throughput at first, because that was not what we were looking for. So okay, there is the REST API, and this is the counterpart in gRPC; we had the possibility to move something and to say: okay, what do we have to change, because we need one on one side and one on the other. And this is interesting, because we have 108 projects, and of these 108 projects, a lot are completely different from the others.
A
Just
have
a
look.
If
we
receive
a
text,
a
text,
we
have
to
call
machine
learning
model
to
be
aggregate
data
to
create
sentiment,
analysis,
hashtags
entity,
recognition
links
and
so
on.
With
thousands
of
data,
the
image
is
different.
I
want
a
semantic
classification,
so
I
want
to
know
that
this
is
inside.
There
is
a
kids.
There
is
a
car.
There
is
a
knight
within
the
city
the
colors
are
parker
vivid
and
so
on.
There
is
my
brand
inside.
So when we had to resolve this issue, you could say: okay, we have resolved an issue, because we are using less computational resource since we switched to gRPC; but we had no idea whether we could scale something that isn't predictable.
What does that mean? It means that we are able to change how all the old system exposes its calls and channels. The clients and the servers are on gRPC, and the systems can close one of these channels and reopen the connection in a completely different way; they can change the number of channels to use.
The way the services are exposed changed too: instead of adding one more machine for a service that is slow, we accept that it is not fast, because it doesn't matter; it has to wait for another system anyway, so it is not possible to make it more responsive. Obviously, that was the easy part; the difficult part was that we now had more complexity.
We have to consider the switch time: okay, I can be faster, but the time to restart the services with another configuration takes time. There are problems with the programming languages' policies, for example: if you look at the optimization system and Python, the streaming connections are not so good on Python, so for some pieces you don't gain the same performance as you do switching on C# or on Go.
There is a downtime policy: if the system goes down because you're restarting too many machines, you have to understand whether your message is lost or is in a queue, whether you have a default to start from, and give the customer the possibility to stop something and say: okay, you don't have to get more machines or create more channels, because you are saturating the network connection, or your machines are taking all the bandwidth that we have.
The impact of just leaving the gRPC machine learning side to optimize the gRPC connections by itself was a gain of 0.003 seconds per call, which again is about 10 minutes per hour, on top of the roughly 15 minutes we gained switching from REST API to gRPC, without knowing anything about how to do the optimization. Because we had to configure 108 projects, we just told the machine: okay, optimize by yourself. I just give a language, a description, to say: this is important, this is connected.
This is not important, you don't have to use too many resources like CPU, RAM, network. And it optimizes itself and predicts, because, you know, after two months, three months, you start to understand that it works, the consumption, the prediction. So the big picture of just leaving gRPC to optimize by itself is a saving of 22 minutes per hour in total. In general, every day, we started to save eight or nine hours.
So what we do is just switch to gRPC and say: okay, manage by yourself what is important for you. We can adjust it, we have some limits, and this map shows what happens in our cloud. You can see the most used configurations from our customers and our systems: the stream is most used for analyzing long texts and, obviously, for images; streaming is never used on .NET Core on Windows, so there the machine learning algorithms, as tested, just continue to use unary calls.
So now we have the possibility to do things in another way. Something that is very important: all of this is very constant. Once it settles, we dismiss the optimization and just use the resulting configuration. So if we see that it is stable over one month, two months, it is obviously constant, and now, automatically, we know it is the best configuration for the customers.