From YouTube: CDS Jewel -- Hadoop over Ceph RGW
A: [Inaudible.]

B: Okay, thanks Patrick. So this is actually a status update on the blueprint from Infernalis. At Infernalis we presented a Hadoop solution over RADOS Gateway with SSD cache, and since then, over the past few months, we have some updates we would like to share with you. The agenda for today's status update is: first, we are going to recap the design of Hadoop over RADOS Gateway with SSD cache, and then we are going to update the status since Infernalis. We actually have about 70 percent of the code done.
B: So this is the general design of Hadoop over RADOS Gateway. There are actually three parts in this project. The first one is RGWFS, which is an additional Hadoop-compatible file system plugin; we have actually done about 70 percent of the code there. The second part is the RGW web proxy, which is a RESTful service based on a Python WSGI module, and it can give out the location of the data based on the object name and the container name. The third part is RADOS Gateway with SSD cache.
B: Okay, so this is a detailed status update since Infernalis. The first part is the RGW proxy part. We have actually done a small demo based on a Python WSGI module, which accepts RESTful requests, like the curl command here: you can query the data location with a RESTful request.
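A minimal sketch of what such a WSGI location service could look like. The endpoint path, the query parameter names, and the hard-coded topology below are all assumptions; the talk does not show the actual code:

    from wsgiref.simple_server import make_server
    from urllib.parse import parse_qs
    import json

    # Hypothetical mapping from (container, object) to the RGW instances
    # closest to the data; a real service would consult the cluster
    # topology and the object manifest instead of a static table.
    LOCATIONS = {("logs", "part-00000"): ["rgw1.example.com:7480"]}

    def app(environ, start_response):
        qs = parse_qs(environ.get("QUERY_STRING", ""))
        key = (qs.get("container", [""])[0], qs.get("object", [""])[0])
        body = json.dumps({"locations": LOCATIONS.get(key, [])}).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]

    if __name__ == "__main__":
        make_server("", 8000, app).serve_forever()

Such a service could then be queried with something like curl 'http://localhost:8000/location?container=logs&object=part-00000'.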
B: It's actually a fork of Hadoop's SwiftFS, and with that RGWFS can talk to a single RADOS Gateway directly without much modification. But being able to talk to only a single RADOS Gateway instance actually limits how the solution scales, so we modified part of the code, and now, with the RGW web proxy, RGWFS can talk to multiple RADOS Gateway instances.
B: Okay, this is a detailed update for the RGWFS part. In general there will be a new file system URL with an rgw:// prefix, and with this protocol we can make Hadoop talk to a RADOS Gateway cluster. Basically, it's a fork of Hadoop's SwiftFS, but underneath there are some modifications to the code.

B: With those modifications RGWFS is able to talk to multiple RADOS Gateway instances, and we have also added a new block concept to RGWFS, because in Swift there is actually no block-level concept: all the gets and puts happen at the object level, which would make all the cache reads and puts go through the proxy side.
B: So with this new block concept we can actually make some improvements on the read, on the get side: based on the block location, we can choose which RADOS Gateway instance to read from. Basically, we are using the range GET API. For the put, however, we still go through a single RADOS Gateway instance, since there is no multi-gateway put path there.
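As a sketch, a block read over the range GET API could look like the following; the host name, the auth handling, and the /swift/v1 path are assumptions based on RGW's Swift-compatible API:

    import requests

    BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, matching the cache-layer stripe size

    def read_block(host, container, obj, block_idx, token):
        # Fetch exactly one block with an HTTP range GET, directed at the
        # RGW instance that the proxy reported as closest to this block.
        start = block_idx * BLOCK_SIZE
        headers = {"X-Auth-Token": token,
                   "Range": "bytes=%d-%d" % (start, start + BLOCK_SIZE - 1)}
        r = requests.get("http://%s/swift/v1/%s/%s" % (host, container, obj),
                         headers=headers)
        r.raise_for_status()
        return r.content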
B: Okay, like I said, we have 70 percent of the code done here, and there is still an issue with objects larger than five gigabytes: if the object is larger than five gigabytes, there will be a zero-byte manifest file and lots of small chunks. We are still trying to resolve this issue, but for objects smaller than five gigabytes RGWFS is actually working now.
B
Do
you
little
data?
Is
yes,
so
we
actually
have
done
some
special
configuration
in
the
idw
cast
a
layer
or
the
straps
eyes
had
been
configured
as
64
megabyte
Alexis.
So
we
also
in
increase
the
max
chunk
size
and
to
make
sure
or
the
chunk
size
r
equals
M,
though
this
we,
we
may
make
missing
that
much
call
64
mega
byte
blocks
here,
and
then
we
we
actually
get
some
a
pack
in
a
GW
proxy
side.
The
first
thing
is,
we
would
use
some
Leigh
brothers
and
get
X
attr
API
to
get
the
manifest
file.
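A minimal sketch of that step with the Python rados bindings. The pool name, the head-object naming, and the exact xattr key are deployment- and version-specific, so treat them as assumptions:

    import rados

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        # ".rgw.buckets" was a common default RGW data pool name at the time.
        ioctx = cluster.open_ioctx(".rgw.buckets")
        # RGW keeps the object manifest in an xattr on the head object;
        # "user.rgw.manifest" is the usual key (an assumption here).
        manifest = ioctx.get_xattr("<bucket_marker>_myobject", "user.rgw.manifest")
        print("manifest is %d bytes" % len(manifest))
    finally:
        cluster.shutdown()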
B: That is the xattr from the head object. So you actually know all the object block names from the manifest file: the head object name is actually the object name, and then you know the rest of the blocks, named from the head object name plus a tag from the manifest, plus dash one, dash two, dash three, right? So.
C: For put, if you're using multipart upload, and you're going to use 64-megabyte chunks, then the objects are going to be named after, you know, it's going to be the bucket ID, then the name of the object, then the upload ID, dot, some kind of running number, okay? And you have the upload ID because you initiated the upload, right.
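Roughly, the part names would be composed like the sketch below. The exact separators and prefixes differ across RGW versions, so this is only an approximation of the scheme being described:

    def multipart_part_name(bucket_id, object_name, upload_id, part_num):
        # The head object holds the manifest; each uploaded part becomes
        # its own RADOS object named from the bucket ID, the object name,
        # the upload ID and a running part number.
        return "%s__multipart_%s.%s.%d" % (bucket_id, object_name,
                                           upload_id, part_num)

    print(multipart_part_name("default.1234.1", "bigfile", "2~deadbeef", 1))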
B: Yeah, actually it's not a big problem for the first version, since we know the workloads here are very much read-heavy workloads, so there's not much put traffic, I think. Alright, okay. So that is the RGW proxy part, and this.
D: [Inaudible.]

B: Yes, this, this is Python. Okay.
B: So for the RGW proxy part, this is a Python WSGI demo that accepts RESTful requests and gives out the closest RADOS Gateway instance. First we generate a topology file of the cluster; so, for example, with 40 OSDs there, we use a topology file that says, for example, RADOS Gateway 1 is mapped to the first 20 OSDs and RADOS Gateway 2 is mapped to the second 20 OSDs.
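The talk does not show the topology file format, so here is a hypothetical JSON layout expressing that mapping; the field names are invented:

    import json

    # Hypothetical topology: two RGW instances, 40 OSDs split between them.
    topology = {
        "rgw1": {"endpoint": "node1:7480", "osds": list(range(0, 20))},
        "rgw2": {"endpoint": "node2:7480", "osds": list(range(20, 40))},
    }

    with open("topology.json", "w") as f:
        json.dump(topology, f, indent=2)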
B: So this is actually the topology of the entire cluster. And, as I have said before, the first step is that the RGW proxy will try to get the manifest from the head object, using the Python librados API with getxattr, and then the RGW proxy will try to follow the CRUSH map to get the location of each block. That is actually the next step.
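One standard way to resolve an object's OSDs through the CRUSH map is the ceph osd map command, wrapped here in Python; the pool and object names are placeholders:

    import json
    import subprocess

    def osds_for_object(pool, rados_object):
        # `ceph osd map <pool> <object>` computes the placement group and
        # the up/acting OSD set for an object via the CRUSH map.
        out = subprocess.run(
            ["ceph", "osd", "map", pool, rados_object, "--format=json"],
            capture_output=True, text=True, check=True)
        return json.loads(out.stdout)["up"]

    print(osds_for_object(".rgw.buckets", "default.1234.1_bigfile"))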
D: [Inaudible.]

B: Yeah, this is, yeah, this is actually the same issue we mentioned: for objects larger than five gigabytes there is also a zero-byte manifest file, and currently we have a solution in RGWFS, which is to do an additional translation, mapping between the zero-byte manifest file and the real data chunks. That is.
C: Yes, so for the multipart upload you have an issue there: the first object is just going to hold the manifest and not anything else, and the rest is going to be in the tail objects. Now, maybe you can tweak RGW so that for objects larger than five gigabytes it just does the same as it does with regular objects; maybe the regular object upload can work for you for larger than five gigs.
B: Okay, so these are some results we have for Hadoop over HDFS versus Hadoop over Swift. There are three different deployment configurations. The first one is Hadoop over HDFS; this is the typical setup. Then we have Hadoop over Swift with the list_endpoints middleware, which is a special configuration for Hadoop over Swift that lets Hadoop do some locality-aware reads; but for the writes, it's actually going through the proxy server anyway.
B: So, on the right side, we have some performance numbers, with HDFS as the baseline, and we can see the list_endpoints middleware's impact is huge: there's about a 40 percent degradation without the list_endpoints middleware. We have done some analysis on this, and we found the rename overhead is quite big on the Swift side, because in HDFS a rename is a simple metadata change in the name node, that is the HDFS rename, while in Swift a rename is a copy-and-delete process, which is quite heavy. Actually, I guess this.
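To make the comparison concrete: Swift has no server-side rename, so a "rename" is a full object copy followed by a delete, sketched below with the standard X-Copy-From header (the endpoint and token are placeholders):

    import requests

    SWIFT = "http://swift-proxy:8080/v1/AUTH_test"  # placeholder endpoint
    HEADERS = {"X-Auth-Token": "<token>"}

    def swift_rename(container, old_name, new_name):
        # Step 1: server-side copy, which still moves all the object data.
        requests.put("%s/%s/%s" % (SWIFT, container, new_name),
                     headers=dict(HEADERS,
                                  **{"X-Copy-From": "/%s/%s" % (container, old_name)}),
                     data=b"").raise_for_status()
        # Step 2: delete the original. Compare this with an HDFS rename,
        # which is a single metadata update in the name node.
        requests.delete("%s/%s/%s" % (SWIFT, container, old_name),
                        headers=HEADERS).raise_for_status()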
B: Okay, okay. We are going to take that into account when we look at the performance for Hadoop over RADOS Gateway later, okay? So this is the next step. On the development side we still have 30 percent left, and I think we can finish the development quite soon; then we are going to complete the performance test work, that is, Hadoop over RADOS Gateway with local block-level read. I think it's going to be done in July or August. And based on the Swift performance results,
B: we are going to need to resolve the rename issue. But if the whole solution is okay, then we may need to investigate the copy implementation on the RADOS Gateway side, and then this might not be an issue for RADOS Gateway. And we actually have a code repo on GitHub. Currently it's private, but we are trying to open-source the code; I think that's going to happen very soon.
C: There are not many places in the code that you actually need to change; it might be something that would make sense to make configurable. The problem with this specific feature is that certain things aren't going to work, like multi-site and multi-region, but that's not anything you're actually interested in. So it might be that you can explore that.
C: Another thing, and that's something that went in right recently: there is a way to bump up the number of librados connections between the gateways and the backend; there's a new configurable to do that. That's another configurable that you can try to look at. I'm not sure, actually, if it went into Hammer; probably not.
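If the configurable being referred to is rgw_num_rados_handles (a guess; the talk does not name it), it would be set in ceph.conf for each gateway, something like:

    [client.rgw.gateway1]
    ; number of librados handles between this RGW instance and the cluster
    rgw num rados handles = 8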
C: Yeah, I've no more questions.