From YouTube: Optimize Ceph messenger Performance
Description
Presented by: Chunsong Feng
1. The NIC SR-IOV is used. Each OSD uses an exclusive VF NIC.
2. The DPDK interrupt mode is added.
3. The single-CPU core and multiple NIC queues are implemented to improve performance.
4. The admin socket command is added to obtain the NIC status, collect statistics, and locate faults.
5. Adjust the Ceph throttling parameters, TCP, and DPDK packet sending and receiving buffer sizes to prevent packet loss and retransmission.
6. The Crimson message component uses the Seastar DPDK.
A
We implemented the messenger worker balancer to balance the load between messenger workers, and we also implemented TCP zero copy.
A
But when we tested the one-megabyte [unclear], it decreased the [unclear] by 10 percent. Then we enabled DPDK and tested it with [unclear, heard as "tcpm"]. Because each DPDK thread will occupy a CPU core, we only use two messenger workers to avoid using too many CPU cores.
A
It decreased the [unclear] by four percent, and [unclear] increased by [unclear, heard as "13" or "30"] percent. In the one-megabyte test, the [unclear] could not finish because there were too many slow ops. We used tcpdump and found that there were many retransmissions. So the conclusion is that RDMA has no evident advantage compared with TCP: it failed to work in the one-megabyte tests, and it had no benefit in the 4K IO tests.
A
The next thing we tested is a new feature, on-wire compression of the data transferred between OSDs. [unclear] will support on-wire compression with Snappy [unclear].
A
We use four messenger workers because compression uses a lot of CPU, so we need to increase the number of messenger workers. We tested with different compression ratios, from 20 to 100 percent, and [unclear].
A
It has the best performance when the compression ratio is 100 percent; it increases the [unclear, heard as "313 milliseconds"].
A
When we tested the Ceph messenger, we found [unclear]. Let me introduce LSE. ARMv8.1 has the new Large System Extensions (LSE). An LSE instruction loads and modifies an atomic variable in the L3 cache, while the load and store instructions load and modify the atomic variable in the L1 cache.
A
We found that disabling LSE works really well, with about a 39 percent [unclear], because [unclear].
A
We implemented a new feature, the messenger worker balancer. We found that the workloads are imbalanced. Whenever an imbalance between messenger workers occurs, for example in the sequential [unclear] tests, the messenger worker balancer migrates an optimal connection in order to rebalance the workload. For example, messenger worker one has the most workload and messenger worker three has the least workload. We rebalance them by moving one optimal connection from messenger worker one to messenger worker three.
A
The load will then be more balanced. If, after the migration, the workload imbalance still exceeds the predefined threshold, we redo the rebalance: we move one optimal connection from the most loaded worker to the least loaded worker to rebalance the whole workload.
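The rebalancing loop described here can be sketched roughly as follows. This is only an illustration, not the Ceph implementation: the `Worker` class, the per-connection load numbers, the `threshold` value, and the rule used to pick the "optimal" connection (the one whose load is closest to half the gap between the two workers) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    # Hypothetical per-connection load figures (for example, bytes per second).
    connections: dict = field(default_factory=dict)

    @property
    def load(self):
        return sum(self.connections.values())


def rebalance(workers, threshold=0.2, max_moves=16):
    """Move one connection at a time from the busiest to the idlest worker."""
    moves = []
    for _ in range(max_moves):
        busiest = max(workers, key=lambda w: w.load)
        idlest = min(workers, key=lambda w: w.load)
        gap = busiest.load - idlest.load
        total = sum(w.load for w in workers)
        # Stop once the imbalance is within the (assumed) predefined threshold.
        if total == 0 or gap / total <= threshold:
            break
        # Assumed meaning of "optimal": moving load x changes the gap to
        # |gap - 2x|, so pick the connection whose load is closest to gap / 2.
        conn = min(busiest.connections,
                   key=lambda c: abs(busiest.connections[c] - gap / 2))
        load = busiest.connections[conn]
        if load <= 0 or load >= gap:
            break  # moving this connection would not reduce the gap
        del busiest.connections[conn]
        idlest.connections[conn] = load
        moves.append((conn, busiest.name, idlest.name))
    return moves
```

Calling `rebalance` on a list of workers returns the migrations it performed as `(connection, from_worker, to_worker)` tuples and stops as soon as the remaining imbalance falls under the threshold.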
A
Kernel 4.14 implements send-message zero copy (MSG_ZEROCOPY) to reduce the overhead of memory copies. We implemented dynamic send-message zero copy in the Ceph POSIX stack, which is only used when the data size is not less than a threshold. When a connection is created, we also set the socket option for zero copy, and then create a [unclear] file descriptor and register it for error events.
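The "dynamic" part, using zero copy only for payloads at or above a size threshold, could look roughly like the sketch below. The helper name `send_flags` and the 16 KiB cut-off are hypothetical; the talk does not give the value used in Ceph. On Linux the socket must also opt in with the `SO_ZEROCOPY` socket option before the flag takes effect.

```python
import socket

# Linux value from the kernel headers, used as a fallback when the running
# Python build does not export the name.
MSG_ZEROCOPY = getattr(socket, "MSG_ZEROCOPY", 0x4000000)

# Hypothetical cut-off, chosen only for illustration.
ZEROCOPY_THRESHOLD = 16 * 1024


def send_flags(payload_len, zerocopy_enabled, threshold=ZEROCOPY_THRESHOLD):
    """Return the flags to pass to send(): zero copy only for large payloads."""
    if zerocopy_enabled and payload_len >= threshold:
        return MSG_ZEROCOPY
    return 0
```

Small messages keep the ordinary copying path because pinning pages and handling completion notifications has its own cost, which a large payload amortizes better.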
A
When the send of a message is finished, the kernel signals the messenger worker. The messenger worker then calls recvmsg to get the [unclear, possibly the error queue], which gives a low-to-high sequence range. With that sequence range we clear the corresponding buffers. When we send a message, we append the buffer to the [unclear, possibly "pending"] list. [unclear]
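The bookkeeping described here, holding each buffer until the kernel reports its send as complete, can be sketched as below. The class and method names are hypothetical. The sketch assumes the Linux behaviour in which every zero-copy send on a socket gets the next value of a per-socket counter starting at 0, and each completion notification carries an inclusive range `[lo, hi]` of those counters. Counter wraparound is ignored for brevity.

```python
class ZeroCopyTracker:
    """Keeps buffers alive until the kernel reports their send as complete."""

    def __init__(self):
        self.next_id = 0   # mirrors the per-socket zero-copy send counter
        self.pending = {}  # send id -> buffer still referenced by the kernel

    def on_send(self, buffer):
        """Record a buffer handed to a zero-copy send and return its id."""
        send_id = self.next_id
        self.pending[send_id] = buffer
        self.next_id += 1
        return send_id

    def on_completion(self, lo, hi):
        """Release every pending buffer whose id lies in [lo, hi]."""
        released = []
        for send_id in range(lo, hi + 1):
            buf = self.pending.pop(send_id, None)
            if buf is not None:
                released.append(buf)
        return released
```

In a real messenger, `on_completion` would be driven by the notifications read from the socket after the error event fires; here it simply takes the range as arguments.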
A
[unclear] He ran some experiments on the Crimson OSD messenger using Seastar DPDK, and he found the following issues. First, with Seastar DPDK the system cannot establish a loopback connection on the native stack.
A
We added a check in the [unclear] transmit function: we check whether the destination MAC address is the same as the source MAC address. If it is the same, we call [unclear] to forward the packet to the upper layer, which fixes this issue.
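A minimal sketch of that loopback check, assuming plain Ethernet framing (destination MAC in bytes 0 to 5, source MAC in bytes 6 to 11). The function name and the two callbacks are hypothetical stand-ins for the real transmit path and the upper-layer receive path; this is not the Seastar code.

```python
def transmit(frame, send_to_wire, deliver_up):
    """Send a frame to the NIC, or loop it back if it is addressed to ourselves."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst_mac = frame[0:6]   # destination MAC comes first in the header
    src_mac = frame[6:12]  # source MAC follows it
    if dst_mac == src_mac:
        # Same address on both ends: hand the frame straight to the upper
        # layer instead of putting it on the wire.
        deliver_up(frame)
        return "loopback"
    send_to_wire(frame)
    return "wire"
```

For example, passing `list.append` methods as the two callbacks makes it easy to see which path a given frame took.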
A
But Seastar [unclear] support [unclear], so we use SR-IOV to generate many VF NICs, so that we can assign each OSD an exclusive VF NIC. Each OSD has its own NIC when it starts. Another issue is that [unclear] is not supported, and we need to improve it.
A
We need to implement a packet forwarding mechanism to forward packets across [unclear, possibly "cores"], and with it the performance may not be [unclear]. [unclear]
A
The last issue is that the [unclear] Crimson messenger test fails to work with too many jobs or a large [unclear], as I mentioned before. We also looked into this issue. We used the tcpdump command to capture the packets and found a large number of fast retransmissions [unclear] performance.
A
We found that a 25-gigabit NIC can provide [unclear] of 616K IOPS, and the classic OSDs on one host cannot deliver that many IOPS.
A
So we think a 25-gigabit NIC is enough. [unclear] cannot further improve IOPS. To offload the network stack, we could use the RDMA stack, which can offload some of the network stack. But we used [unclear, possibly perf] to attach to one OSD and found that [unclear] is about 59 percent.
A
Almost [unclear] percent of the CPU [unclear] process, then the network transmit, and then the messenger send. In the 4K random write test, transmission occupied only [unclear, heard as "pennies three"] percent of the CPU. We assign each OSD [unclear] CPUs, so the network occupies only about one CPU. Therefore, offloading it from the CPU cannot bring much benefit.
A
[unclear] I think the messenger needs to implement the RDMA read feature and use RDMA read to transfer large packets, which is more efficient than [unclear]. With RDMA read, the client sends [unclear] a request to the server, and the server actively reads the large packet, which avoids congestion. As I mentioned before, in the one-megabyte test the sequential read is the lowest in [unclear, heard as "tcpm"].