From YouTube: Ceph Science Working Group 2021-09-22
A: All right, it's a couple minutes after the hour, so let's start with my quick little spiel. I put the link to the Ceph pad in the chat; feel free to sign in there or add any topics you'd like to discuss.
A: If you haven't joined this group before, we're just a bunch of sysadmin type people who manage Ceph clusters, usually in scientific or research computing environments, and this is just an open discussion. I don't particularly present or anything; we're just chatting about whatever affects us and what's going on in the Ceph world for us lately. Keep in mind these meetings are recorded and posted to the Ceph YouTube channel.
A: I'd like to apologize: I meant to get one going in August, but between life and work I was busy. A project I work on had a CubeSat launch in late July, and it just slipped my mind, unfortunately, so I forgot to set that one up, but I made sure to get one in September here.
A: I tend to just go through the topics one by one, so feel free to add anything there, or otherwise speak up. Or, if anybody is new to the group and wants to introduce themselves, or say what type of clusters they run or anything like that, we can do that first.
A: I'm pretty sure I took a selfie like that during open days, Dan.
C: I've not joined the call before, so I figured I'd say a quick hello and give a quick intro. I'm at the National Solar Observatory in Boulder, Colorado, and just starting on a Ceph journey. We've got basically six petabytes on Octopus right now, pre-production; it'll probably grow significantly over the years. So I don't have a lot of — well, I have no operational experience at this point.
D: Are you running your Ceph environment on SUSE, Red Hat, CentOS, Debian — whatever flavor?
A: What's your primary use case going to be — RADOS Gateway, CephFS?
C: It's hard to say any specific, immediate things to know; it's just a learning curve for everything, and paying attention to tunables and settings, I suppose.
B: Okay, it might be. So you're using this for S3 — is that what you said? Yes. Then you'll probably find that the bucket indices will be very slow, so things like listing the buckets, especially when they get large, will be quite slow.
B: Okay, cool, thanks. Yeah — we have a cluster like that. Our backup S3 cluster is like that: the OSDs for the data are HDD-only on old hardware, and then there's an SSD in each server that we use for the bucket indices, and it's okay for S3. It depends on your use case, but if you have an interface with lots of small objects, it can take a long time later when you need to rebalance data around, because there are lots of small objects.
B: This can take a long time — longer than, say, for those of us running block storage, where rebalancing is usually quite quick because the objects are all something like four or eight megabytes and it's uniform. But with S3, if you let people do anything, then they can have a few kilobyte-sized objects, and you end up having to move around billions of them, which takes longer.
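The SSD-backed bucket index layout described above is usually done with device classes and a dedicated CRUSH rule — a minimal sketch, not from the meeting (the rule name is made up; the pool name is RGW's default index pool):

```shell
# Hypothetical rule "rgw-index-ssd": replicate across hosts,
# but only onto OSDs whose device class is "ssd".
ceph osd crush rule create-replicated rgw-index-ssd default host ssd

# Point the bucket index pool at that rule so index omap data
# lands on SSD while bucket data stays on the HDD OSDs.
ceph osd pool set default.rgw.buckets.index crush_rule rgw-index-ssd
```

These need a live cluster, so treat them as an outline of the setup rather than something to paste blindly.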
B: Well, if you're coming from somewhere like us — I think we normally have the privilege that we know our users closely, and we can give them some advice on how best to use the clusters more effectively, I think.
C: This telescope that's almost done in Hawaii — it's a known source, and there will be some variability in that data, but it's basically all going to be the solar observation data. So I think we'll do okay on that account; we'll at least have, you know, limited variability in terms of what's being loaded into the cluster.
B: In theory we can just store trillions of tiny objects and it's going to work very well, but in reality you probably don't want to exceed around a million objects per bucket. So if they can use many buckets, it's better, if they're really going to have billions of objects. And as for the object size — I mean, they can still store them.
B: So in the end, with our ATLAS — we have an ATLAS event server, where each particle collision is stored in one event, a one-megabyte event, and they're storing, well, hundreds of millions, because it's not the main storage, but say hundreds of millions, and it's fine. One megabyte is fine; that's not too small.
E: Yeah, I guess one thing I learned with Ceph is to relax when things go wrong and let it just sort itself out. At one stage we had a panic and tried to fix too much, when we should have just waited and it would have sorted itself out. We've been very brutal to our cluster and it survived all that abuse, and we now know what's okay and what isn't. It does take a little while to get your head around it, coming from sort of a more traditional bunch of network storage.
A: I'll just start hitting some of these things down the topic list. The broad ones I ask about at every one of these are: does anybody have any crazy, unexpected outages — due to bugs, or not due to bugs, maybe they caused it and want to confess; upgrade experiences going from, you know, Nautilus to Octopus, Pacific, whatever; or any important bugs or fixes you've noted in the latest releases that could be relevant to anybody else.
D: Well, I put the firmware information in for all of you, because I'm currently setting up a new Ceph cluster with 3,000 OSDs, and we found out that half of those had a different firmware, and there was a critical patch for that; the other half of the OSDs also had a critical patch from that vendor.
D: I know that half of those disks are coming from Seagate, but rebranded by that vendor. For the rest, I can't find who the specific vendor is — they're OEM-branded disks. But I still remind you that whenever you are facing some issues, there might also be critical firmware bugs at the disk level, and if you are not aware of those, you can ruin your data quite badly.
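One cheap guard against mixed firmware batches like this is to dump the model and firmware revision of every drive and look for mismatches — a sketch assuming smartmontools is installed and drives appear as /dev/sd* (device names are examples):

```shell
# Print model and firmware for each drive; a mismatch within
# one purchase batch is worth raising with the vendor before
# it bites in production.
for dev in /dev/sd[a-z]; do
    echo "== $dev =="
    smartctl -i "$dev" | grep -E 'Model|Product|Firmware|Revision'
done
```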
B: We couldn't find — I don't think this is a known issue, and our procurement people said okay, it's probably not going to reproduce in production, because they were only running badblocks with something like eight processes and hammering these things. So we put them in production anyway and we'll see. But also, they could only reproduce this on CentOS 7, and we're running CentOS 8, and I think maybe they had some weird power management.
F: We had a similar issue with Western Digital NVMes in our load tests. Three of four servers went down because of it, and after contacting the vendor, they checked the hardware and noticed the old firmware had a bug: when you test under high load, the NVMes were powering off.
D: Okay — did you check the temperature? Because I've seen pretty low allowed running temperatures for NVMe, around 60 Celsius, and under high load the overall heat from CPUs and other disks can actually push those NVMes to a higher temperature if they are located in the rear of the machines.
F: The same, but when we rebooted the box, we didn't see any of the NVMes, and that was very interesting, because we were preparing to create the technical preview and this was the last test for us.
B: It's especially interesting because I've seen a message on the mailing list recently, people debating how many block DBs to store on one NVMe. I mean, the reality is that you can now buy two- or four-terabyte NVMes, and it's tempting to put lots and lots of block DBs on one NVMe. On our backup S3 cluster we're putting 24 block DBs on one NVMe.
G: On the Sanger clusters we put 30 on an NVMe, yeah.
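Packing several block DBs onto one NVMe like this is typically done with ceph-volume's batch mode — a sketch with example device paths (the tool carves one DB logical volume out of the shared NVMe per HDD OSD):

```shell
# Four HDD-backed OSDs, each with its block.db placed on a
# slice of the shared NVMe device.
ceph-volume lvm batch /dev/sdb /dev/sdc /dev/sdd /dev/sde \
    --db-devices /dev/nvme0n1
```

The trade-off discussed here applies: when that one NVMe dies, every OSD using it for its DB has to be rebuilt.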
G: Yeah, because we get just the occasional NVMe where the wear ratings will be fine and then it will just fail, which, you know, is kind of inconvenient, and then obviously you have to rebuild all the OSDs that were using it. Touch wood, only a couple of them have done it, but it's quite disappointing: you record all these metrics about device wear, and then they just go and randomly stop for no readily apparent reason, anyway.
F: Yeah, it happens sometimes — once every two or three months one of the NVMes dies, and we lost 17 OSDs at once.
B: I can just briefly say, and then I have to leave: we did a Nautilus-to-Octopus upgrade that I think I posted about on the mailing list, and it went very well. The other thing I want to mention is that we had a power outage in our data center for 30 minutes — actually up to two hours for some of the hosts — and everything came back; we didn't have to manually intervene.
B: The only thing we did was, on the clusters we could still access during the power outage, we set noout, so that the mons wouldn't decide to start rebalancing everything while half the machines were missing. But otherwise everything just recovered by itself, and we had no lost objects or unknown PGs or anything like that. It was a nice recovery, and no BlueStore corruptions or anything like that either.
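The noout flag mentioned here simply stops the mons from marking down OSDs out, so no rebalancing starts while hosts are missing — for example:

```shell
# Before (or during) an outage: down OSDs stay "in", so no
# data movement begins while half the machines are offline.
ceph osd set noout

# Once the cluster is healthy again, clear the flag.
ceph osd unset noout
```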
B: Yeah, and the rejoin step depends on the number of open directories, and that's why it takes 10 minutes to rejoin on this cluster — something like, I don't know, 200,000 open directories or something like that, according to the open files table.
F: It looked quite okay, but after two weeks we noticed that it was still backfilling: it had about 6,000 PGs waiting to backfill and only 15 or 16 backfills ongoing, and we started investigating what had happened inside. We noticed there were waiting backfills that could have started — the osd_max_backfills parameter was one, but they were waiting for something — and after internal debugging we noticed that some of the shards in placement groups had different states.
F: For example, when we had a PG in the active+backfill_wait state, the shards for this PG were peering, active+clean, backfilling — and it looks like it peered wrongly. So we tried to repair all the PGs in the cluster with the command ceph pg repair, and it looked a little better after that, and we're still waiting for the backfill to finish, because we want to check the states of the cluster again. It looks like some split brain in peering, something like that.
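For stuck backfills like this, the usual first checks are the backfill throttle and the per-shard state of one affected PG — a sketch (the PG id and the value 4 are just examples):

```shell
# osd_max_backfills caps concurrent backfills per OSD
# (it was set to 1 on the cluster discussed here).
ceph config get osd osd_max_backfills

# Temporarily allow more parallel backfills.
ceph config set osd osd_max_backfills 4

# Dump one suspect PG, including the state each shard reports.
ceph pg 2.1f query
```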
F: It returns from this function — it hits when the global PG state known by the monitor was backfill_wait and it was trying to backfill this group, and then the OSD was checking the shards, seeing the backfilling or recovering state, and just returning from this function without doing anything.
A: My outage was: we had a power event — a couple-hour power outage — but it also managed to fry a bunch of network switches that were behind UPSes, including one of my poor switches that hosted the majority of my Ceph servers for one of my clusters, and it took a couple of days to get a replacement. So, unfortunately, I had a cluster that was kind of partially up.
A: The nodes were up, but without networking; some of them had no network for a couple of days. But once I got that new switch in and brought up the rest of the network, it took a little time, but it recovered without any lost data or any real problems, and it was really nice to see a multi-day outage recovered so easily.
D: So there was no physical disk for osd.0, but it wasn't in the CRUSH tree at the same level as everything else, like under root or default or anything like that, and it somehow triggered the monitors to fail with the communication. I think they were trying to balance the data between that rogue OSD and the others, since it wasn't in the CRUSH tree in a proper place or anything.
D: And we had to add more disk space on those nodes to keep up with the growth of the monitor node database while the cluster was trying to recover itself from this situation.
G: I think Pacific has reduced the maximum size that the mon DB can grow to. I think I've seen it be over 200 gigs on one of our big production clusters in the middle of a very large rebalance, but I think Pacific has some fixes that reduce the size it can grow to.
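Mon store growth during a long rebalance can be watched from the filesystem, and a manual compaction sometimes helps on releases without the Pacific fixes — a sketch (the path layout is the default; mon id "a" is an example):

```shell
# See how large each mon's RocksDB store has grown.
du -sh /var/lib/ceph/mon/*/store.db

# Ask one mon to compact its store in place.
ceph tell mon.a compact
```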
D: It's doable, but I prefer upgrading them as a whole to Pacific, and this time that wasn't an option; we had this problem. So be aware that there might still be some bugs in Nautilus, even though it's end-of-life stuff.
A: It looked like somebody said they also had switches disappear, kind of similar to my outage, and for them it looked like once they restarted stuff, it all came back fine. That's great to hear.
A: Usually, with thousands of clients — you know, only 100 or whatever — but they seem fairly consistent in maintaining their connections, even when they're under load.
E: So, I mean, it works; I'm just wondering if there's some performance hit as well, or something that we should chase up, you know. We do have quite a lot — it's basically shared compute nodes, where we've got 50, 60 users and they've got their own home directories.
A: Interesting, yeah. Is there any QoS on your network that could be prioritizing other traffic over the standard traffic, just in the way it's defined?
F: I've seen, if you're using systemd, of course, that when there are problems with some connections the OSD process will stop and then be restarted by systemd, and in our configuration we have it restart only three times, and then the process stays down.
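The three-restarts-then-stop behavior described here comes from systemd's StartLimit settings on the ceph-osd unit; they can be loosened with a drop-in override — a sketch with example values:

```shell
# Allow up to 10 restarts per 30 minutes before systemd marks
# the ceph-osd@ unit as failed.
mkdir -p /etc/systemd/system/ceph-osd@.service.d
cat > /etc/systemd/system/ceph-osd@.service.d/override.conf <<'EOF'
[Service]
StartLimitInterval=30min
StartLimitBurst=10
EOF
systemctl daemon-reload
```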
F
So
after
some
huge
issues
with
network,
we
have
to
restart
all
osds,
but
the
problem
is
not
with
safe
software,
but
with
network
and
the
underlying
of
systemd.
So
my
question
was
the
the
core
of
the
problem.
Was
the
some
safe
software
with
calls
some
bugs,
or
we
just
the
operating
system
under
it?.
E: Yes, so we do have the database and the data on a single disk, so that's not the problem. I think — I can't quite remember — they were disabled, and I think systemd had put them into a failed state. I can't quite remember what happened, and I just had to manually restart those OSDs, and that did the trick.
E: It could be an issue with the underlying OS, and the reason I'm hesitant is that our systems are configuration-managed, and we use a very odd, non-standard flavor of configuration management: we use LCFG to manage our systems, and Ceph doesn't sit very easily with that. But it's working. So in this case manually restarting the OSDs did the trick.
A: All right, I guess one of the next things on the list here was the Nautilus end of life. Has everybody pretty much moved on to Octopus or Pacific now from Nautilus, or is anyone still on that as well?
E: So we are still on Nautilus, and some other schools at the University of Edinburgh are looking to deploy Ceph; we're going to wait and see what they are doing and then basically use their configuration to upgrade our cluster. And yeah, we have to figure out what we do. We are a bit worried about the containerization, because we just don't know how that will play with our managed system.
A: Similarly, both of mine are all on Nautilus still, and it's just a matter of finding time with everything going on. I want to do some CentOS 8 or Stream upgrades at the same time, and I've got to shuffle some things around, shuffle mons in, and there are three different clusters to do it on, too. So one of these days I just need to pull it together, start on the test cluster, and go from there.
A: Now, one thing I was kind of curious about was whether anybody was trying Rocky Linux — if you haven't heard, that's the new CentOS, and I think they have a fairly stable release now. Was anybody considering Rocky Linux, or are you just going to stick with CentOS Stream, Ubuntu, or whatever you're using now, instead of trying a new distro?
H: We're planning to move to Rocky. This is Graham from the Minnesota Supercomputing Institute; we're currently on CentOS 7 and had been kind of waiting to see what happened with the CentOS 8 world, and I think Rocky's our default choice. We haven't actually started moving yet.
A: Yeah, that's where I am — I want to at least move my mons to CentOS 8 and, at the bare minimum, go to Octopus.
H: And then, looking further forward — I mean, we manage our clusters using Puppet, but to be honest, I always found ceph-deploy the easiest thing for deploying Ceph itself, and that's going away. I think it's still present in Octopus, as far as I can tell — I think I saw it as an RPM — but I guess it's not going to be around for long, and we'll need to come up with something different for Pacific.
D: We are in the same phase as the others: we need to phase out CentOS 7 and Python 2.7, and we are heading to CentOS Stream, because I think that's one way to get to Rocky Linux — if you have a kind of upgrade path from CentOS 7 to something 8-ish, like CentOS 8 or CentOS 8 Stream. So that's our approach: we will go with CentOS Stream 8 with a fixed set of RPMs.
D: So we are not updating daily; we deploy a whole cluster with the same subset, and then we have a test cluster where we test new packages, and when we are happy with that, then we go and upgrade. I know that's clumsy compared to some other approaches, but for the time being, I think that's the easiest way for us. So we will take a snapshot of CentOS Stream 8 from time to time and then build from that.
A: I'll be interested to hear how things work out on Rocky for Graham, and Sebastian in the chat was going to be on Rocky as well. So maybe in a future meeting you guys can give us a review of how that's going for you.
A: All right, well, if nobody's got anything else, thanks for joining. The next one — we do these about every two months.
A: So if you want the private email reminder for this meeting, use the email address up there, or contact me directly if you don't want it publicly on the list. Otherwise, thanks everybody for joining; great talking to you all, and we'll see you in about two months.
A: Yeah — I'm sorry, life and work got me busy and I missed one there; I do apologize.