Description

Tigran Bayburtsyan knows how to scale
a network service without fail
using #rustlang, MIO,
and a threadpool to go
so fast all competitors pale

(Limerick by @llogiq)

Real-time networking applications are becoming more popular, but building backend systems for them is challenging in terms of memory and CPU efficiency. This is a story about how at TreeScale (github.com/treescale) we got 10x+ memory and CPU efficiency using Rust's MIO as the main TCP/UNIX network handling system, combined with thread pools.

https://paris.rustfest.eu/sessions/scalable-networking-with-rust
Thanks for the great introduction. I'm Tigran, and for the past eight years I've been doing systems engineering, mostly helping companies optimize their cloud environments, especially for network-heavy applications. You would be amazed how much you can save as a company just by optimizing your network stack; it saves a lot of your cloud computing resources. I write in many programming languages per day, four or five of them.
That's my daily work, and Rust is not the major one, but it's something I enjoy even after work; this project actually started as a way to feed that interest. Besides that I do a lot of adventure motorcycle riding, and also skiing.
That's a lot, yeah. So, a few words about TreeScale. We're not going to dive too deep into how it actually behaves.
In a few words, it's a scalable pub/sub system where you have an entire event distribution system without any central point and without any single point of failure; it just routes the events. The first implementation was in Go, obviously, because it's just easy to code. But after running it in production at very heavy scale, it turned out that Go's garbage collection interferes a lot with memory deallocation under heavy traffic, so the second implementation was in C++.
At that time we had run some experiments with Rust, but we thought, okay, maybe Rust is too early. Then we actually saw that it's a pretty mature language to use. For our specific needs we had built our own event loop library in C++, which is actually almost the same thing as mio but with fewer features, just the specific features we needed at that time. Once we saw that mio covers all the features we need, we started writing on top of it, effectively rewriting everything in Rust.
We benefited a lot from mio; if the Rust community didn't have that library, we wouldn't have started with Rust at all. The basic usage of mio is shown in the example code, which is very simplified. Basically, what it does is build an event loop around the operating system's existing facility, epoll or kqueue depending on the operating system. It registers specific sockets to receive events and does some data processing on every single event. As you can see, there is an infinite loop which contains your entire logic and operates for as long as your application is alive. So it's based on the event loop principle, and it's single-threaded.
If you can imagine an application which works with an event loop, it's something like this: you have the infinite loop, which produces specific actions based on kernel events; then, using a thread pool to optimize your processing, you pick a thread inside that pool to perform the action and return back to your event loop to continue processing.
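That loop-plus-pool shape can be sketched with just the Rust standard library. In this minimal sketch an mpsc channel stands in for the kernel event source that mio's Poll would provide, and the numeric tokens and the doubling "work" are purely illustrative, not TreeScale's actual code:

```rust
use std::sync::mpsc;
use std::thread;

/// Dispatch "events" (here just numeric tokens) from a single event loop to
/// a small thread pool and collect the processed results. An mpsc channel
/// stands in for the kernel event source (epoll/kqueue) that mio's Poll
/// would provide; the doubling "work" is purely illustrative.
fn run_event_loop(tokens: Vec<usize>, pool_size: usize) -> Vec<usize> {
    let (done_tx, done_rx) = mpsc::channel::<usize>();

    // A tiny fixed thread pool: each worker blocks on its own job channel.
    let mut workers = Vec::new();
    for _ in 0..pool_size {
        let (job_tx, job_rx) = mpsc::channel::<usize>();
        let done_tx = done_tx.clone();
        thread::spawn(move || {
            for token in job_rx {
                // "Process" the event off the event-loop thread.
                done_tx.send(token * 2).unwrap();
            }
        });
        workers.push(job_tx);
    }
    drop(done_tx); // only the worker clones remain

    // The single-threaded event loop: hand each event to a pool thread and
    // keep looping instead of blocking on the work itself.
    for (i, token) in tokens.into_iter().enumerate() {
        workers[i % workers.len()].send(token).unwrap();
    }
    drop(workers); // closing the job channels lets the workers exit

    let mut results: Vec<usize> = done_rx.iter().collect();
    results.sort();
    results
}

fn main() {
    println!("{:?}", run_event_loop((0..8).collect(), 4));
    // prints [0, 2, 4, 6, 8, 10, 12, 14]
}
```

In the real system the loop would block on poll() and the workers would do protocol work, but the dispatch-and-continue structure is the same.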
This is the base for almost any kind of single-threaded application, but we actually faced some issues with it, especially performance issues. The first goal of TreeScale is to scale, and it has to be a super heavy network application, so having it work on all CPUs instead of a single thread is a pretty important part. Rust helped us develop a technique with mio, which looks something like this.
So basically we have a control system over multiple I/O loops, and we got this performance mainly because of Rust's thread-safe threading model; the entire process works completely non-blocking. Everything is written with thread channels, which gives pretty awesome performance in terms of real code execution. Here is an example of the main handler loop.
Basically, whenever you get some TCP socket to accept, or if it's a client socket, you make some validation around it, maybe certificate checking or data validation, whatever you can perform there. Then, using the mio principle, what you do is deregister that TCP socket from the current loop and pass it, using Rust channels, to one of the threads operating another loop. That's the main principle of the transfer, and after this transfer operation the main handler loop doesn't know anything about that socket anymore.
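A minimal sketch of that handoff, using std blocking sockets on loopback instead of mio so it stays self-contained; in the real mio-based system the acceptor loop would first deregister the socket from its poll before sending, and the names here are illustrative, not TreeScale's actual API:

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::mpsc;
use std::thread;

/// Accept one connection on an "acceptor" thread, transfer the socket over
/// a channel to a worker thread, and let the worker answer the client.
fn handoff_demo() -> std::io::Result<String> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Channel carrying ownership of accepted sockets to the worker loop.
    let (tx, rx) = mpsc::channel::<TcpStream>();

    // Worker thread: a stand-in for a second event loop that now owns
    // every socket it receives.
    let worker = thread::spawn(move || {
        for mut socket in rx {
            let _ = socket.write_all(b"hello from worker");
        }
    });

    // Acceptor thread: accepts and immediately hands the connection off.
    let acceptor = thread::spawn(move || {
        if let Ok((socket, _)) = listener.accept() {
            let _ = tx.send(socket); // after this, the acceptor forgets it
        }
    });

    // Client side: connect and read whatever the worker wrote.
    let mut client = TcpStream::connect(addr)?;
    let mut reply = String::new();
    client.read_to_string(&mut reply)?;

    acceptor.join().unwrap();
    worker.join().unwrap();
    Ok(reply)
}

fn main() {
    println!("{}", handoff_demo().unwrap());
}
```

The key point the talk makes is that TcpStream is Send, so the compiler verifies that ownership really moves to the worker; after tx.send the acceptor can no longer touch the socket.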
This is a little bit more code, but the concept is that the child handler loop receives the socket and just registers it inside its own poll, inside its own event loop. That way we can transfer connections between multiple event loops and operate completely without blocking on any data. Those are the main benefits and optimizations. We had a customer operating a few petabytes of network data transfer per day, especially images, and they got from 6 to 10 times optimization in terms of memory.
That was after deploying this principle, compared to the Go version. And the main benefit for us is that using this technique we are able to scale the code, because Rust itself is checking the safety. If you are, for example, hiring a new developer who doesn't know this threading model and he writes some component around it, Rust just prevents memory leaks when passing data between channels and threads. And of course we are using multiple cores as multiple event loops, not only for tasks.
It just helped us a lot to scale the code base from a few threads to many threads, even without changing anything inside the base code. This is the main feature that we liked a lot after moving to Rust in terms of infrastructure. We have partially open sourced the base technology itself; TreeScale is open source, but it's some kind of a demo.
Yeah, so the communication protocol itself is a custom binary protocol, but we have API integrations with high-level applications, including WebSockets. One of our clients is using it inside a mobile app: we compiled our Rust SDK into the mobile app and are providing them this real-time networking feature for their mobile app. So basically it's easy to integrate, and thanks to mio we can integrate it into any kind of platform.
We have multiple actions, but the data itself is not copied. We basically have the byte array, and every time we do something, we actually work by reference to that array. In terms of the protocol, we just append around 60 bytes to the original data; we don't do any data manipulation on the customer's original data whenever you have some API endpoint and transfer data through TreeScale.
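The borrow-the-payload idea can be sketched like this. The real TreeScale header is around 60 bytes; for illustration this hypothetical write_frame only prepends the 4-byte length field mentioned later in the talk:

```rust
use std::io::{self, Write};

/// Only the small header is built; the payload stays borrowed and
/// untouched. Hypothetical layout: just a 4-byte big-endian length.
fn header_for(payload: &[u8]) -> [u8; 4] {
    (payload.len() as u32).to_be_bytes()
}

/// Write a frame as header followed by the borrowed payload, without ever
/// building a combined buffer: the customer's bytes are not copied or
/// modified by the framing layer.
fn write_frame<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    out.write_all(&header_for(payload))?;
    out.write_all(payload)
}

fn main() -> io::Result<()> {
    let mut wire: Vec<u8> = Vec::new();
    write_frame(&mut wire, b"customer data")?;
    // 4-byte length (13) followed by the untouched payload bytes.
    println!("{:?}", wire);
    Ok(())
}
```

Because write_frame takes `&[u8]`, the borrow checker guarantees the framing layer cannot mutate or outlive the customer's buffer.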
We are using some little pieces of unsafe code just to make that manipulation easier, because in some places we have big-endian and little-endian conversions just to figure out the length of the bytes. But that protocol came from the C++ code and we didn't change it; we just wrote the unsafe Rust code for it.
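For what it's worth, the endian conversions themselves no longer need unsafe in current Rust: since Rust 1.32 the integer types provide to_be_bytes/from_be_bytes and little-endian variants, so a length field can be converted safely. A small sketch (the function names are illustrative):

```rust
/// Encode a length field in network byte order (big-endian), no unsafe
/// needed.
fn encode_len(len: u32) -> [u8; 4] {
    len.to_be_bytes()
}

/// Decode the wire bytes back into a length on the receiving side.
fn decode_len(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}

fn main() {
    let len: u32 = 0x0001_02FF;
    let wire = encode_len(len);
    assert_eq!(wire, [0x00, 0x01, 0x02, 0xFF]);
    assert_eq!(decode_len(wire), len);
    // Little-endian view, e.g. for an in-memory layout inherited from C++.
    assert_eq!(len.to_le_bytes(), [0xFF, 0x02, 0x01, 0x00]);
    println!("round-trip ok");
}
```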
No; if you are using a Linux-based environment or Unix it works, but it wouldn't work on Windows. For Windows we don't do this passing of TCP sockets between threads; we use a different technique. But we have only one customer which requires Windows, so it's not a big deal.
You take existing messages that you receive, and you're prepending some stuff at the beginning. How do you make sure that you have room to do that, or are these fixed-length headers rather than dynamic ones? And how do you make sure, when you actually read the data in, that there's enough room before the data you're getting in order to place your headers?
So if you can imagine the write flow: we're basically getting your data and making sure that we have a proper length. Because of the nature of TCP, we know that if the byte flow ends at some specific point, then that's your data. We basically measure the length of the bytes and put it at the beginning, so the other end knows you have this amount of data; it's a four-byte integer for us.
So basically, whenever another node reads your data, it first takes the first four bytes to decode and understand how much data it needs to accept from the other node. That's how we transfer data and make sure that there is no data loss: we basically transfer the length as the first four bytes.
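Under that framing, the read side can be sketched as follows; `read_message` is a hypothetical name, but the wire format (a 4-byte big-endian length followed by the payload) is as described above:

```rust
use std::io::{self, Read};

/// Read one length-prefixed message: a 4-byte big-endian length followed
/// by that many payload bytes.
fn read_message<R: Read>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    input.read_exact(&mut len_bytes)?;
    let len = u32::from_be_bytes(len_bytes) as usize;

    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;
    Ok(payload)
}

fn main() -> io::Result<()> {
    // A wire buffer holding two frames back to back.
    let mut wire: Vec<u8> = Vec::new();
    for msg in [&b"hello"[..], &b"treescale"[..]] {
        wire.extend_from_slice(&(msg.len() as u32).to_be_bytes());
        wire.extend_from_slice(msg);
    }

    let mut cursor = io::Cursor::new(wire);
    let first = read_message(&mut cursor)?;
    let second = read_message(&mut cursor)?;
    println!("{} / {}",
        String::from_utf8_lossy(&first),
        String::from_utf8_lossy(&second)); // prints hello / treescale
    Ok(())
}
```

Since TCP delivers a byte stream with no message boundaries of its own, the length prefix is what tells the reader where each message ends and the next begins.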
No; we have one customer with a web integration, but we provided WebSockets for them. With WebAssembly, I guess, it's really complicated, because not all production browsers support it right now, and not all customers want to see hacky WebAssembly on their website right now, because generally it's not production-ready. Most companies don't want to see that in their environment. That's from my experience, because we also experimented in that direction.