From YouTube: Ceph Orchestrator Meeting 2021-04-13
B
Yes, yep, though the current pull request is focused initially just on rgw; that's the only thing it implements. I think the main thing missing right now is that there's no test coverage in teuthology, and the reason for that is that we don't have a way to allocate a virtual IP that won't conflict with other nodes, one that's routable across the lab. So we need to figure out a way to allocate virtual IPs.
B
I have a couple of ideas for how to totally hack around that with, like, bash in the test: make up fake private IP addresses that map onto the machine, so they won't conflict. We might want to do something more sophisticated there, but I'm going to start with the hack, just because I think it won't rely on any other teuthology changes, you know.
B
Because it comes up. Like, I started writing the test yaml (it doesn't work, obviously, but just to see what it would look like), and it has, for example, one part of the task that does the cephadm apply (here, I can share my screen), and one part that does some shell to actually run some commands. In both cases I need to substitute in the virtual IP, so ideally teuthology, for both of those tasks, understood that {{vip}} or whatever means substitution, and could substitute in the same value there.
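For illustration, a minimal sketch of what that fragment could look like, assuming a cephadm.apply task and brace-style substitution; the {{VIP}} placeholder, its expansion, and the service names are all hypothetical here:

    tasks:
    - cephadm.apply:
        specs:
        - service_type: ingress
          service_id: rgw.foo
          spec:
            backend_service: rgw.foo
            # teuthology would replace {{VIP}} with the allocated address
            virtual_ip: "{{VIP}}/24"
            frontend_port: 8080
    - cephadm.shell:
        host.a:
        # the same substitution has to happen for shell commands, too
        - curl http://{{VIP}}:8080/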
B
That would be nice, but if we don't do that, then we could rely on just doing it through shell, doing the apply through the shell as well. That would be kind of annoying. I don't know.
B
Yeah, and certainly for tests, I mean, I think the only time DNS really is relevant for the testing is when you're using SSL and you want the certificate to match, but, like, whatever.
B
But I mean, so, what I was trying to write here is a test that would apply it: create an rgw and an ingress service that's using the virtual IP, and then it would basically wait for them all to be active, make sure it can access it, and then it would stop and start all the backend ones and make sure things worked.
B
Ideally, we'd also have something that would kill the keepalived nodes, and the actual haproxies themselves, taking each of those down in turn and making sure things behave. That, I think, would be the second half of this test. It's not a load test or anything; it just makes sure that it does the most basic thing.
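A rough sketch of that second half, assuming a cephadm.shell task and the orch daemon stop/start commands; the daemon names are made up:

    - cephadm.shell:
        host.a:
        # take each backend daemon down in turn; the VIP should keep answering
        - ceph orch daemon stop rgw.foo.host1
        - curl http://{{VIP}}:8080/
        - ceph orch daemon start rgw.foo.host1
        - ceph orch daemon stop rgw.foo.host2
        - curl http://{{VIP}}:8080/
        - ceph orch daemon start rgw.foo.host2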
A
So today we're missing proper teuthology integration, or any kind of teuthology integration, for the rgw HA implementation, right? Yeah, those are the gaps that you had.
B
Yeah, I mean, I wonder if the way to do this is to, like, somehow... I mean, we already have a machine-locking thing. We could have "machines" that are just virtual IPs: you just lock a virtual IP using the same basic infrastructure. But then the teuthology locking has to be changed so that you can lock multiple types of things.
B
I mean, ideally, the virtual IP would be an IP that's routable, that's within the same network subnet as the existing IPs, because that's how... So one of the sort of wonky things that the ingress service does is that it basically only lets you configure the virtual IP on interfaces that already have another IP, because it uses those other IPs and networks to decide which interface to use.
B
So I wrote a doc about this. I mean, you can have the virtual IP in a totally unique network, but what the documentation suggests is that, if that's the case, if it's in a different network than any of the existing IPs, you basically go and configure sort of fake, unused IPs on those interfaces to essentially mark the interfaces, and those can be in a totally unrelated network. So you can specify a subnet that you look for to choose the interface.
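What he's describing maps roughly onto the ingress spec below; the values are made up, and virtual_interface_networks is the "subnet you look for" option:

    service_type: ingress
    service_id: rgw.foo
    placement:
      count: 2
    spec:
      backend_service: rgw.foo
      # the VIP itself can live in an otherwise-unused network...
      virtual_ip: 192.168.20.1/24
      # ...if this tells ingress which existing subnet marks the right interface
      virtual_interface_networks:
      - 10.10.0.0/16
      frontend_port: 8080
      monitor_port: 1967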
B
And then it uses that interface, and then, yeah. So you can just make up some private network, configure those addresses statically on the right interfaces, and then use that as the selector or whatever, yeah. So, anyway: for rgw, yeah, that's the test. I need to get the virtual IP thing done; that's the next step. I'm breaking that into a separate branch, because I think we can merge the first part of it without the teuthology integration. My manual tests, yeah, it works like a charm.
B
As far as I can tell, anyway. So somebody else should probably give it a go too, before we ship it or anything. But I haven't seen any issues.
B
The next piece I was going to do is the manager: just make sure that we can set up ingress in front of the manager, once that other standby thing merges, because then we can turn off all of the standby managers, so that haproxy will only direct traffic at the one that is actually responding.
B
And
then,
but
we
can
set
up
a
manager
ip,
basically
or
approve
this
or
for
whatever.
The
dashboard.
B
So it's a proxy... I mean, I think the one that we most actually care about is NFS, because actual users want that, and my thinking there is just to get the absolute minimum thing that we can possibly do that gives you a single IP that will fail over if, like, things reprovision. And it might be that we're not doing any...
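For reference, a minimal sketch of the Pacific-era nfs spec that would sit behind such an IP; the ids and the pool/namespace values are made up:

    service_type: nfs
    service_id: foo
    placement:
      count: 1
    spec:
      # RADOS pool/namespace holding the ganesha config objects
      pool: nfs-ganesha
      namespace: foo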
B
Yeah, I don't know. Yeah, I mean, we might even not use keepalived; it could just be cephadm configuring the IP. I don't know, I'm just not really sure about that one. But...
B
Yeah, it does make me a little bit nervous, because the nice thing about keepalived is... we'd have to be careful, basically, because keepalived, when you shut it down, deconfigures the virtual IP. So anything we do, we should do the same thing, where it's linked to a systemd service that will turn the VIP off when the service stops.
D
Isn't there an issue with client state, though, when we actually cut off one of the daemons, so you can't recover that?
B
That's
the
that's.
The
challenge
is
that
if
you,
if
you're
floating
an
ip
between
different
ganeshas,
the
ganeshas
have
to
have
the
same
rank,
which
means
that
they
can't
both
be
running
at
the
same
time
so
like.
If
you
you
need
to
migrate
it,
then
you
have
to
stop
the
old
one
and
start
the
new
one.
A
I've had some thoughts about trying to speed up the failover when a daemon dies. We just can't use SSH for that, because if you have a big cluster, then SSH is just too slow, right? We're going over all hosts, mainly consisting of hosts with OSDs, and then discovering minutes later that the daemon died. No, that's not...
B
And
it
seems
like
the
way
to
make
the
agent
model
scalable
is
to
have
probably
the
simplest
model
would
be
to
have
an
agent
on
each
host.
That's
just
responsible
for
like
monitoring
and
reporting
state
for
that
and
any
time,
and
it's
like
checking
containers
every
10
seconds
or
whatever,
like
some
high
frequency,
and
if,
if
it
ever
sees
a
change,
then
it
like
pokes
the
manager
and
tells
it
instead
of
having
relying.
A
But then we have to push information onto the manager. Yeah, but that's also going to reduce the load, right, because we don't fetch from the host just to discover that nothing changed. Yeah, right.
A
On
the
other
hand,
we
then
have
to
care
about
a
problem
where
all
of
a
sudden
all
hosts
are
pushing
changes
for
that
we
have
to
find
a
solution.
B
Anyway,
that's
this:
let's
see
this,
I
think
this
ingress
and
rgw
stuff.
We
should
try
to
get
into
like
you
know,
pacific.1
or
that
two
or
whatever-
and
maybe
the
manager
thing
too,
if
it's
like,
I
think,
it'll
be
easy,
and
so
I
kind
of
like
to
get
that
one
in,
but
nfs
will
be
later
next
next
phase,
yeah.
B
Okay,
well,
it's
the
I've
addressed
the
review
comment
so
far.
I'm
running
it
through
testing
again
so,
as
I
rename
some
variables
and
did
some
cleanup,
but
if
you
guys
want
to
take
a
second
look
at
it,
I
guess
the
one
thing
I
didn't
look
at
actually
is:
what
happens
if
you
have
an
octopus
cluster
which
rgwha
deployed-
and
you
upgrade
like
in
theory,
we
could
make
the
migration
thing
try
to
like
convert
an
rgwha
into
an
ingress
in
practice.
A
What do you think? You raised some concerns. So, sorry, can you say it again? I think it was within ceph-volume that there was a regression that you...
E
Oh yeah, yeah, I'm sorry, yeah! Yes, so I didn't have time to take a look yet, but hopefully tomorrow I will be able to take a look at this. Try to not block this PR on it, yeah. I need to take a look at what we have currently in ceph-ansible, yeah, yeah. Sorry about that. I will try to take a look at this tomorrow.
B
And cephadm, yep. I guess probably the first thing to check is just to make sure that it has sufficient flexibility. I simplified out a lot of random stuff that it didn't seem like any real user would ever want to modify, like the URL for the health check; I don't know why you would ever care. But anyway, yeah.
B
I changed it; probably there's a change on the prometheus side. I don't know exactly how prometheus is configured, honestly. I made it so that the prometheus exporter endpoint in haproxy is unconditionally turned on, but I'm not sure if we have it wired up. I don't know if cephadm does a thing where it helps prometheus generate the config file, so it knows what to scrape. Do we?
A
Yeah, we are creating the prometheus config.
A
Here's the Jinja2 template for prometheus, and that contains the scrape configs, and we are writing the managers in here, the managers from the prometheus mgr module, as the target. We'll probably also add...
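A simplified sketch of what that rendered scrape config amounts to; the host names are made up, and 9283 is the mgr prometheus module's default port:

    # simplified rendering of cephadm's prometheus.yml template
    scrape_configs:
    - job_name: ceph
      static_configs:
      - targets: ['mgr-host-a:9283', 'mgr-host-b:9283']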
B
I think eventually we'll want to scrape... we talked in CDS about switching, or being able to switch, to a model where we're scraping all the daemons directly instead of the manager. But that's eventually, yeah; we might have all the daemons listed there, I guess.
B
It's basically just a few defaults, but it's a little bit... it's opinionated in a slightly weird way: it just changes to 2x replication by default, min_size 1, for example, which I think probably makes sense for a single-node cluster, but not necessarily for a single-OSD cluster.
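Assuming the option maps onto the usual pool defaults, the effect would be roughly the following; this is a sketch, not the actual implementation:

    - cephadm.shell:
        host.a:
        # what "2x replication, min_size 1 by default" amounts to
        - ceph config set global osd_pool_default_size 2
        - ceph config set global osd_pool_default_min_size 1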
B
I don't know. I think it's probably fine for now, but we might want to do something different later. And I think the one thing missing from that pull request is a test case: I need to add a case to the smoke suite where we set that option and then deploy lots of managers, just to make sure that it works, that they all come up.
A
I was thinking about that. We have a cluster, right, a cluster in the smoke test that consists of one or two hosts.
B
Only somewhat, because some things won't work, because they still have to deploy across hosts, like the ingress service with keepalived. I guess you can deploy it on one node, but it won't do anything.
B
You'd only have one monitor... I don't know. I think I'll probably just create a separate directory. Well, I'll look at the existing smoke tests, but probably there need to be two slightly different sets of tests, because some of the smoke tests don't make sense single-node. Okay, yeah, that'll be nice, because those tests will run more quickly, they're only one node, and we can probably shift a whole bunch of them to be single-node.
A
Properly
the
graffana
database
across
all
the
daemons
of
of
that
service
would
make
sense,
but
I
have
a
real
trouble
finding
a
proper
back-end
stuff.
For
that
right
I
mean
we
can't
put
the
graffana
database
into
raiders,
because
it's
just
not
going
to
work.
We
can.
A
Yeah
we
found
in
star
because
it's
too
big
and
monster
is
not
going.
Monster-
is
not
supposed
to
contain
those
blobs.
So
we
need
to
find
a
different
way
to.
C
I think the main use case for this was just... well, first of all, you have to have multiple Grafana instances, and then you have to customize one of them, because with a single one, I mean, you can persist that into the local storage, so you don't really need shared storage. So it only comes down to whether we want to provide that use case, or fulfill
C
that use case. There have been some discussions on whether we want to support customized Grafana dashboards and the cost of it, and at least from a design perspective it's adding complexity, and that's a clear example.
A
It's the same problem, right: you need the monitoring in case something failed, yeah, and you're only able to get things from RADOS if the cluster is at least not in an error state. So as soon as your cluster is broken, you can no longer access your monitoring, and that's the wrong way around.
A
Okay, the next topic I had on my mind was the nfs cluster create type. Mike, we've talked about that, was it yesterday evening, yeah?
D
The orchestrator side doesn't accept the type (it configures for both rgw and CephFS), but the volumes module, during cluster create, has a type, right?
D
So I think my proposal was to just drop the positional arg before Pacific goes on too long, and if we feel we actually really need it during a cluster create, we can re-add it as a named arg; that would be fairly simple. Otherwise we're carrying it around with us long term, and we're not sure if that's really the way we want to create these clusters. I don't know.
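A sketch of the command shape under discussion, before and after dropping the positional type; the cluster id and placement are made up:

    - cephadm.shell:
        host.a:
        # today: the export type is a positional arg
        - ceph nfs cluster create cephfs mycluster "1 host1"
        # proposed: drop it, re-adding it later as a named arg only if needed
        - ceph nfs cluster create mycluster "1 host1"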
D
So the exports... in my mind, I kind of think about it in a way that the orchestrator should be responsible for the service, the nfs service, and then the volumes module can have just a simple hook into that for its cluster create, and then...
D
Other
export
type
logic
should
reside
up
in
this
volumes
module,
and
you
know
the
separation
that
way
concerns
between
the
components.
D
I mean, I think after this I'll try and spend some time with Marsha, maybe, you know, whoever else, and then we'll talk about the rgw block some more, if we can get resolution there.
A
We're also deploying nfs, not in the smoke test, but in the CLI test. So we have proper nfs coverage in the cephadm suite: not in the smoke test, but in the cephadm suite itself.
D
That way we're testing both at the orchestrator layer and also at the volumes layer, and then I think the next thing is kind of more of an end-to-end, organic thing, kind of like how we do it with CephFS, where we run through and mount it and then, you know, generate some I/O.
A
But if we want to continue having those kinds of traditional cluster specifications in teuthology, then... deploying a cluster using the spec makes more sense to me, right, but that's not how ceph... that's not how...
B
Yeah, I think we should be moving away from the role thing. I mean, the main thing that the role gives you, now that cephadm is in the picture, is that you can have a server name in there, and then in your client task, like a fuse mount or an nfs client mount or whatever, you can just refer to the role name and it figures out who it is.
B
I mean, we'd have to update a bunch of tests, but it seems like the better direction to go in. The only real thing that the cephadm task really needs to understand is how to do the bootstrap and, I guess, mostly the OSDs. Even the configuring of additional monitors we could probably remove; well, I don't know, it might not be worth it. And the managers, like adding additional managers: we could probably do away with some of that stuff.
A
Makes sense, right, but do we have time to do that right now? Or when should we do that?
B
I mean, we could also... we could have, yeah, I mean, you could try to do both, I don't know. It's just, the thing with this mode is that it's slightly different.
B
It ties you to the roles configuration, so it makes the test less flexible, and it means that we don't test the automatic placement scheduling stuff, although that might not matter; I think the placement is relatively well covered by the unit tests.
B
Yeah, like, I guess this particular test case could be replaced with a bit of yaml in the cephadm smoke suite: there's a directory that has services, and you could just add an nfs.yaml in there, and it could just be the cephadm.apply and then the spec, and that's it, basically, without writing this code. I think we should do that instead.
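Something like the following nfs.yaml, assuming the cephadm.apply task takes a list of service specs; the ids and values are made up:

    tasks:
    - cephadm.apply:
        specs:
        - service_type: nfs
          service_id: foo
          placement:
            count: 1
          spec:
            pool: nfs-ganesha
            namespace: foo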
B
And then, at the point at which we want the end-to-end test where it actually gets mounted, we'll have to add support to the nfs client task so that it can discover what the nfs endpoint is, which we can gather from the orch ps output, or, yeah, orch ls, if we are using ingress, eventually.
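The discovery step could be as simple as parsing structured orchestrator output; a sketch:

    - cephadm.shell:
        host.a:
        # the client task could parse this JSON to locate the ganesha host/port
        - ceph orch ps --format json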
C
Yeah, yeah. Well, this is just to share this: he's trying to run a more realistic vstart-based environment, or, well, a laptop-sized environment, rather than a VM-based one or something that requires a bare-metal environment. So the idea is just to run vstart plus the cephadm flag and also deploy new hosts from containers. So he's trying this docker-in-docker thing with systemd and all the complexities around that, but yeah, I think the outcome...
A
...into the containers, while still just coding on your local machine in a virtual multi-host setup; that would be great.
C
Yeah, let's see how far we get with this. I tried that a while ago, and the first issue I faced was creating the OSDs from loopback devices; that was an issue I faced with ceph-volume. But I tried that from a cephadm bootstrap, and he's trying this from vstart; maybe he's luckier than I was with this approach.
C
But can it be a regular loopback device, or does it have to be...?
B
Yeah, it could be; I think it can be anything. I'm not sure, it's not really...
B
I
don't
think
stuff
am
supports
it,
so
we
might
need
to
buy
some
work
on
this
episode
to
make
sure
you
can
actually
do
that,
but
because
it
basically
like
bypasses
all
the
drive
group
stuff
like
it's
just.
A
Yeah, that was implemented because users ran into the problem when reinstalling their host OS, and then they needed a way to recreate all the existing OSD daemons based on what was deployed.
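As I understand it, cephadm grew an activate command for exactly that flow; a sketch, with the hostname made up:

    - cephadm.shell:
        host.a:
        # re-adopt OSDs whose data devices survived the OS reinstall
        - ceph cephadm osd activate host1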