From YouTube: Ceph Month 2021: Optimizing Ceph on Arm64
Description
Presented by: Richael Zhuang
Full schedule: https://pad.ceph.com/p/ceph-month-june-2021
A
Hello, I'm Richael, and my topic is Ceph on Arm64. It's about our practice on the Ceph storage ecosystem on Arm servers. This is the agenda of my slides. Firstly, I will give an overview of the Ceph-related work we have done on Arm servers.

Yeah, this is the framework of Ceph. Our practice to enable and improve the Ceph storage ecosystem on Arm64 includes using Arm-specific instructions or features to do some common library optimizations, like UTF-8, CRC, and ISA-L. Some Ceph features require the support of surrounding projects, so we also enabled those projects on Arm64, including SPDK, Seastar, Ceph-CSI and so on. Besides, we also tried to do some optimization in these projects, and support for using Ceph as the storage backend for OpenStack and Kubernetes on Arm64 has also already been verified.
For UTF-8, in the original implementation encoding and checking are done byte by byte, which is not so efficient; our optimization can give up to an 8x boost for string validation and a 50% gain for string encoding by operating on several bytes at a time.
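As an illustration of the several-bytes-at-a-time idea, here is a minimal sketch (not the actual Ceph patch): a chunk of pure ASCII can be validated with one 64-bit test instead of eight byte tests, and only non-ASCII chunks need the full UTF-8 state machine.

```cpp
#include <cstdint>
#include <cstddef>
#include <cstring>

// Returns true if the buffer is pure ASCII (a valid-UTF-8 fast path).
// A byte is ASCII iff its top bit is clear, so one 64-bit AND checks
// eight bytes at once; the tail is handled byte by byte.
bool is_ascii_fast(const unsigned char* s, size_t len) {
    size_t i = 0;
    for (; i + 8 <= len; i += 8) {
        uint64_t chunk;
        std::memcpy(&chunk, s + i, 8);      // safe unaligned load
        if (chunk & 0x8080808080808080ULL)  // any high bit set?
            return false;                   // hand off to the full UTF-8 check
    }
    for (; i < len; ++i)
        if (s[i] & 0x80) return false;
    return true;
}
```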
Another optimization is about the widely used CRC32 implementation; this one is based on the Armv8 PMULL instruction.
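For context, Armv8 also exposes dedicated CRC32 instructions as compiler intrinsics; the sketch below uses those (the PMULL carry-less-multiply folding used for large buffers is more involved and omitted here):

```cpp
// Compile on Arm64 with: g++ -O2 -march=armv8-a+crc crc.cpp
#include <arm_acle.h>   // __crc32cd / __crc32cb intrinsics
#include <cstdint>
#include <cstddef>
#include <cstring>

// CRC32C over a buffer using the hardware instructions, 8 bytes at a time.
uint32_t crc32c_hw(uint32_t crc, const uint8_t* data, size_t len) {
    while (len >= 8) {
        uint64_t v;
        std::memcpy(&v, data, 8);
        crc = __crc32cd(crc, v);
        data += 8;
        len -= 8;
    }
    while (len--) crc = __crc32cb(crc, *data++);
    return crc;
}
```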
Besides, because ISA-L provides multibinary versions of some functions, developers can deploy a single binary with multiple function versions and then choose among them at runtime. For Arm64, we added utility functions to get the CPU feature set and provided a framework to pick the function version based on the feature set, and now we are working on some other algorithms like AES-XTS and multi-hash SHA1 plus murmur3.
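ISA-L implements that dispatch in assembly at first call; the sketch below shows the same idea in C++ on Arm64 Linux, reading the kernel-reported feature bits with getauxval (the sum_* functions are illustrative placeholders, not ISA-L APIs):

```cpp
#include <sys/auxv.h>   // getauxval(AT_HWCAP)
#include <cstdint>
#include <cstddef>
#include <cstdio>

#ifndef HWCAP_PMULL
#define HWCAP_PMULL (1UL << 4)   // bit position from <asm/hwcap.h> on arm64
#endif

// Two stand-in implementations: a real library pairs a portable routine
// with a PMULL/NEON-accelerated one.
static uint64_t sum_portable(const uint8_t* p, size_t n) {
    uint64_t s = 0;
    for (size_t i = 0; i < n; ++i) s += p[i];
    return s;
}
static uint64_t sum_fast(const uint8_t* p, size_t n) {
    return sum_portable(p, n);   // placeholder for the accelerated body
}

using sum_fn = uint64_t (*)(const uint8_t*, size_t);

// Resolve once, based on the CPU feature set the kernel reports.
static sum_fn resolve() {
    return (getauxval(AT_HWCAP) & HWCAP_PMULL) ? sum_fast : sum_portable;
}

int main() {
    uint8_t buf[16] = {1, 2, 3};
    std::printf("%llu\n", (unsigned long long)resolve()(buf, sizeof buf));
}
```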
Yeah, we also enabled and bench... sorry, benchmarked Ceph on a 64 KB kernel page, because Arm64 has 64 KB kernel page support, and a large page has some benefits on Arm platforms compared with the small 4 KB page.
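As a side note, a quick way to confirm which kernel page size a node is actually running with (a minimal sketch, nothing Ceph-specific):

```cpp
#include <unistd.h>
#include <cstdio>

// Prints 4096 on a 4 KB-page kernel and 65536 on a 64 KB-page kernel.
int main() {
    std::printf("%ld\n", sysconf(_SC_PAGESIZE));
}
```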
Here we used a Ceph test cluster with one monitor, one manager, and three OSDs to do the benchmark. Each OSD backend is one P4610 NVMe SSD. We tested RBD sequential and random read and write with different block sizes.
The yellow bars are the Ceph RBD bandwidth with a 4 KB kernel page size, and the orange ones are with a 64 KB kernel page size. From the graphs we can see we get about a 3 to 11 percent boost for sequential read, depending on the block size from 4 KB to 1 MB; an 8 to 22 percent boost for sequential write; 6 to 10 percent for random read; and about 6 to 15 percent for random write. So Ceph benefits in each of our test cases when using the 64 KB kernel page size.
In addition to the work we have done, we are investigating new optimization points, like leveraging a new feature, the Scalable Vector Extension (SVE for short), to do some optimization. We can also try to leverage the non-temporal instructions to prevent cache pollution in some cases, and we are also investigating RocksDB's compression libs for optimization potential.
And then let's go to Ceph storage with SPDK. Yes, SPDK, the Storage Performance Development Kit, can achieve high performance by moving all of the necessary drivers into user space and operating in polled mode instead of relying on interrupts, which avoids kernel context switches and eliminates interrupt-handling overheads.
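A schematic of the polled-mode idea (nothing below is SPDK API; the atomic counter is a stand-in for a device completion queue):

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<int> completions{0};   // stand-in for an NVMe completion queue

void device() {                    // pretend hardware completes 5 IOs
    for (int i = 0; i < 5; ++i) completions.fetch_add(1);
}

// Polled mode: a dedicated user-space thread spins on the queue, so there
// is no interrupt and no kernel context switch per IO; that is where the
// latency win comes from.
void poll_loop() {
    int handled = 0;
    while (handled < 5) {
        int n = completions.exchange(0);   // drain whatever is ready
        for (int i = 0; i < n; ++i) std::printf("completion %d\n", ++handled);
    }
}

int main() {
    std::thread d(device), p(poll_loop);
    d.join();
    p.join();
}
```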
So SPDK can be used to accelerate the block service built on Ceph. We can use SPDK's user-space NVMe driver instead of the kernel NVMe driver in BlueStore, and besides, as mentioned earlier, the SPDK iSCSI target or NVMe over Fabrics target can be leveraged to accelerate client IO performance on a Ceph cluster.
Unlike x86's strong memory model, Arm has a weak memory model, and this essentially means that few guarantees are given as to the observed order of CPU memory accesses. Thus any load or store operation can effectively be reordered with any other load or store operation, as long as it would never modify the behavior of a single, isolated thread.
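A minimal C++ illustration of what the weak model means in practice: publishing data across threads needs a release store paired with an acquire load, because plain stores may be reordered on Arm even though each thread looks correct in isolation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;
std::atomic<bool> ready{false};

void producer() {
    payload = 42;                                  // 1: write the data
    ready.store(true, std::memory_order_release);  // 2: then publish the flag
}

void consumer() {
    while (!ready.load(std::memory_order_acquire)) {}  // wait for the flag
    assert(payload == 42);  // guaranteed: acquire synchronizes with release
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
}
```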
Apart from the above, we also did some optimization in SPDK NVMe over TCP. For example, we can leverage TCP's incoming-CPU feature to get the CPU affinity of a socket, and then we can distribute the processing of this socket to specific CPUs, which provides optimal NUMA behavior and keeps the CPU cache hot. This can bring about a 10 percent performance boost in our tests, and all the related patches are already in the SPDK project. There are some other optimizations as well, but I'll not go further on this.
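The socket feature referred to here is, as far as I can tell, Linux's SO_INCOMING_CPU option; a minimal sketch of querying it for a connected TCP socket (the actual scheduling of the socket's work onto nearby CPUs is left out):

```cpp
#include <sys/socket.h>
#include <cstdio>

#ifndef SO_INCOMING_CPU
#define SO_INCOMING_CPU 49   // value from <asm-generic/socket.h>
#endif

// Returns the CPU the kernel last processed this socket on, or -1.
int incoming_cpu(int sockfd) {
    int cpu = -1;
    socklen_t len = sizeof(cpu);
    if (getsockopt(sockfd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len) != 0) {
        std::perror("getsockopt(SO_INCOMING_CPU)");
        return -1;
    }
    return cpu;   // pin this socket's processing near that CPU
}
```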
The yellow bars are based on BlueStore without SPDK, and the orange ones are based on SPDK. From this graph, actually, we don't see any obvious performance improvement in our test case, yeah. There is even some drop in write performance here, since it's hard to get benefit from SPDK with the traditional Ceph OSD framework.
Yeah, note that the Ceph OSD itself is being refactored based on Seastar, a high-performance framework, for the age of persistent memory and fast NVMe storage systems.
Yeah, about the Seastar work: what we have done includes upgrading Seastar's DPDK to leverage new hardware. Seastar uses a DPDK path to achieve zero copy between the Seastar heap and the NIC. It used physical addresses for DMA, obtained by referencing the file /proc/self/pagemap, which is a legacy method; this update leverages the IOMMU to map Seastar heap virtual addresses directly to IO virtual addresses for DMA, which makes full use of modern hardware and significantly simplifies the code. This upgrade also moves to the new DPDK APIs.
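For reference, a minimal sketch of the legacy /proc/self/pagemap translation (entry layout per the kernel's pagemap documentation: bit 63 = present, bits 0-54 = physical frame number; modern kernels require CAP_SYS_ADMIN to see real frame numbers):

```cpp
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Translate a virtual address to a physical one via /proc/self/pagemap.
// Each virtual page has one 64-bit entry in the file.
uint64_t virt_to_phys(const void* vaddr) {
    long page = sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) return 0;
    uint64_t entry = 0;
    off_t off = ((uintptr_t)vaddr / page) * sizeof(entry);
    ssize_t n = pread(fd, &entry, sizeof(entry), off);
    close(fd);
    if (n != sizeof(entry) || !(entry & (1ULL << 63)))  // page not present
        return 0;
    uint64_t pfn = entry & ((1ULL << 55) - 1);          // bits 0-54
    return pfn * page + (uintptr_t)vaddr % page;
}

int main() {
    int x = 0;
    std::printf("virt %p -> phys 0x%llx\n", (void*)&x,
                (unsigned long long)virt_to_phys(&x));
}
```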
Besides, to make Seastar work on Arm64, we also fixed some bugs, like a network stack crash issue on Arm, and fixed Seastar's 64 GB per-shard memory limit. As for the network stack crash issue, it's actually caused by a code snippet like this, yeah.
It's a function call with two parameters: the first parameter is an expression that frees a pointer, and the second is an expression that dereferences that pointer. This was okay on x86, as parameters were evaluated from right to left, in stack-pushing order, so the second parameter, which uses the pointer, was evaluated before the first one. But it fails on Arm, as parameters are evaluated from left to right to leverage the abundant general-purpose registers, so the pointer is freed before it is used, which can lead to a crash.
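A distilled, hypothetical reproduction of that pattern (not the actual Seastar code): the C++ standard leaves the evaluation order of function arguments unspecified, so the same source is a use-after-free on one platform and fine on another.

```cpp
#include <cstdlib>

struct Buf { int value; };

int take_and_free(Buf* b) { int v = b->value; std::free(b); return v; }
int read_only(const Buf* b) { return b->value; }

void f(int freed, int used) { (void)freed; (void)used; }

int main() {
    Buf* b = static_cast<Buf*>(std::malloc(sizeof(Buf)));
    b->value = 1;
    // BUG: if take_and_free(b) is evaluated first (as observed on Arm64),
    // read_only(b) is a use-after-free. The fix is to force an order with
    // named locals:  int u = read_only(b); int fr = take_and_free(b); f(fr, u);
    f(take_and_free(b), read_only(b));
}
```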
Yeah, this is the Seastar httpd benchmark on Arm servers. We can see that the performance scales up almost linearly with the addition of CPU cores. But actually we haven't verified and benchmarked Ceph SeaStore on our servers; I will do this later, yeah, and see how the performance is.
The current Ceph-as-OpenStack-storage backend is mature, and the basic functions, like using Ceph as the OpenStack Swift, Cinder, and Glance backend, all work well on Arm64. In addition to OpenStack, we have supported Ceph as the Kubernetes container cloud storage backend on Arm64. We added official support for some critical container images, such as the Kubernetes CSI sidecar images, and added Arm64 image support for Ceph-CSI; the Ceph-CSI community also added Arm64 jobs in its community CI.
Yeah, we also support Rook on Arm64, which can simplify the deployment of a Ceph cluster in Kubernetes, and the follow-up related work we're trying to do includes supporting the Container Object Storage Interface on Arm64 and some work on Kubernetes storage e2e test improvements, yeah.
I think that's all I want to share here. Any questions?
C
Hey, thanks. Well, it was interesting. I guess I'm a naive user, but I didn't realize that other architectures can implement the ISA-L libraries. So my question is basically: until now we never used ISA-L for erasure coding, because we were afraid of being locked into Intel, from an operations point of view. Can we safely use ISA-L erasure coding and then be free to move to other, non-Intel CPUs in the future, with the same OSD pools, same OSDs, or same pools?
D
Hello, yes, yeah. I work with Richael, so I'm more familiar with ISA-L; I did some work in the ISA-L library. You mean ISA-L, the Intelligent Storage Acceleration Library? Yes.
D
So for ISA-L, we actually contributed some Arm-related acceleration code to the ISA-L library, and for EC, I remember we upstreamed an Arm-based implementation of the erasure coding you see in the ISA-L library. So the ISA-L library supports both Intel and Arm.
C
I mean, yeah, thanks, Sage. It was Andreas at CERN who wrote the first ISA-L support for EC, but yeah, like I said, we've never even used it, because we had the impression that our vendors could switch us to AMD at any time. Actually, our next block is EPYC, and we were just afraid to ever use it.
E
I guess the other thing you might take a look at: I believe there's an erasure-code non-regression data corpus that's part of the repository. I'm not sure if it's been refreshed with new data as new EC libraries have been added, but it probably should be. The idea there is that there's a whole bunch of erasure-coded data stored in git, and then the unit tests just read it and make sure that it's readable.
E
Thank you. Actually, I guess, since we have a couple of minutes, I'm just curious here: John in the chat was mentioning a Raspberry Pi 2 / Pi 4. I'm curious what people who are running stuff on Arm on Raspberry Pis are using for the actual storage, because maybe the Pi 4 has better data ports, but the three, I think, didn't, right? You had to use like a USB adapter or something like that.
B
I actually have a colleague who has a Raspberry Pi cluster at home. He talks about it all the time, way too much.
B
I don't actually know what he's using for the data. I always assumed it was just USB drives and that he just maxed out the data ports you're talking about, and then if you have like five or six, then you know that's the limit. But five times three is fifteen OSDs, right? But I don't actually know; I can ask him. I think he's on holiday at the moment.
E
I guess that question could go to Mike, actually, because you put together a Raspberry Pi cluster, right? Yeah, it's along the same lines of using a USB adapter, though, as well. Yeah, okay.