A
So thanks for going through these items already; it sounds like you already have a discussion here. First of all, I just wanted to summarize how this will be set up. To answer your question: I think between five and ten shards total across all environments, where each shard is a full Patroni cluster. Hopefully that helps with sizing and gives you an idea.
A
It's going to start out pretty basic. The first phase is probably going to be a replica: we'll create a new database cluster that functions as a replica, and then there'll be a feature flag that will start using the replica for certain tables, probably CI to start. Eventually we'll make it so that cluster is no longer a replica but functions on its own, independent of the main cluster.
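As a rough illustration of the flag-gated routing just described, here is a minimal Python sketch; the flag name, connection strings, and table list are hypothetical, not the actual implementation.

```python
# Hypothetical sketch of feature-flag-gated routing of certain tables to a
# replica shard. Flag name, DSNs and table names are illustrative only.
import psycopg2

FEATURE_FLAGS = {"route_ci_reads_to_replica": True}  # e.g. fetched from a flag service

MAIN_DSN = "host=main-patroni.internal dbname=gitlabhq_production"
CI_REPLICA_DSN = "host=ci-shard-patroni.internal dbname=gitlabhq_production"

CI_TABLES = {"ci_builds", "ci_pipelines"}  # tables served by the new shard

def connection_for(table: str):
    """Connect to the CI replica when the flag is on, else to the main cluster."""
    if table in CI_TABLES and FEATURE_FLAGS.get("route_ci_reads_to_replica"):
        return psycopg2.connect(CI_REPLICA_DSN)
    return psycopg2.connect(MAIN_DSN)
```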
A
So what is a shard? A shard is a Patroni cluster consisting of N Postgres nodes, plus a Consul cluster, plus PgBouncer and a load balancer, and I think that's it.
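As a rough aid for sizing, a small Python sketch modeling the per-shard components just listed; the node counts are placeholders, not agreed numbers.

```python
# Illustrative model of one database shard as described above.
# Node counts are placeholders, not agreed sizing.
from dataclasses import dataclass

@dataclass
class Shard:
    name: str
    postgres_nodes: int = 3      # Patroni-managed Postgres instances
    consul_nodes: int = 3        # Consul cluster backing Patroni's DCS
    pgbouncer_nodes: int = 2     # connection pooling
    load_balancers: int = 1      # fronts PgBouncer

    def total_vms(self) -> int:
        return (self.postgres_nodes + self.consul_nodes
                + self.pgbouncer_nodes + self.load_balancers)

# Five to ten such shards across all environments were mentioned above.
print(Shard(name="ci").total_vms())
```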
B
A
No, well, there's going to be a dedicated GCP project per environment, similar to what we have now. I've created these; they're called something like production-db and staging-db, and then there are some sandbox projects. Within each project there'll be N database shards, and each shard will have its own network space. The current idea is that we'll peer the VPC in each environment to the corresponding environment, so production-db will get peered to production, and then using firewall rules we'll give production access to individual shards.
B
Makes sense. So, going to my next question: my understanding is that this entire effort is driven primarily by the scalability limits of the existing Postgres cluster. We're starting out to get some scalability headroom, but obviously at some point we'll reach a limit with that as well, and I'm curious; obviously it's not going to be in a few months.
A
The forecast for how much runway we have for the database is sometime next year; I think the estimate was April or May of next year before it becomes critical that we have to do this. We're kind of doing the sharding now to get ahead of that, because we figure there's probably going to be lots of issues, and we don't know; it's possible that timeline could change, right. Okay, cool!
A
Great. So the reason why I scheduled this with you is because you're the most knowledgeable person on the monitoring end.
A
Okay, great, I was kind of on the same page there. I don't think it's going to be operationally difficult to have one per shard, but I thought it was overkill, so I figured one project.
A
What we've just described, yeah, okay, so that's simple. So, down to... well, let's jump to 2c. You made a comment here about indices. So I'm thinking... I don't know what I'm thinking here, to be honest; I have no idea. But maybe it's better not to use the existing index, so we don't mess it up, and we create a new one, something like pubsub-postgres for staging-db, or for the db shards. Yeah, I was thinking something like that.
B
So there's a few things to consider here. One thing is the volume of logs. One of the reasons we were splitting indices in the past was that the log volume was massive and Elastic just wasn't able to cope with it, so splitting it into separate indices allowed it to use more shards, but...
B
So if we had, for example, a Postgres index per GCP project, it would be easy to find the relevant logs. The reason I say that is because that was the naming convention we used in the past: component and then the GCP project. But that might not be the best fit; we might want to reconsider it for whatever reason. I'm just voicing what we were doing in the past.
B
It might be perfectly reasonable to say we're sending all Postgres logs from production to the single index, because we don't want to split them into separate indices, because there will be a transition period where the existing Postgres shard will still be running in the gitlab gprd project. So the question would be: how do you transition? Do you send to both of them, etc.?
A
Did we... I forget: does the index name use gprd and gstage as the environment names, or does it use production and staging as the environment?
A
So, given that we have env equals gprd-db and env equals gstage-db, should I just create new indexes using gprd-db and gstage-db, keeping everything else the same? Then we just use the project, because it's not quite the project name (the project name is production-db), but it's the same kind of thing; it follows the same convention.
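For illustration, a small Python sketch of the "component plus environment" naming convention being discussed; the exact index names such as pubsub-postgres-inf-gprd-db are assumptions based on that convention, not confirmed values.

```python
# Hypothetical helper following the "component + environment" index naming
# convention discussed above. Prefixes and environment names are assumptions.
def index_name(component: str, environment: str) -> str:
    return f"pubsub-{component}-inf-{environment}"

# Existing-style name vs. proposed per-DB-project names:
print(index_name("postgres", "gprd"))      # e.g. pubsub-postgres-inf-gprd
print(index_name("postgres", "gprd-db"))   # e.g. pubsub-postgres-inf-gprd-db
print(index_name("postgres", "gstg-db"))   # e.g. pubsub-postgres-inf-gstg-db
```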
B
How do you think, from the perspective of querying those logs, how difficult would that be?
B
Well, but it would still only be Postgres logs, so I wouldn't expect there to be... and presumably it's deployed using the same method, so it wouldn't be the case that in one environment or GCP project it's running in Kubernetes and in the other it's in Chef, and thus you've got different fields. So I guess it's something to figure out as we move along.
B
On the other hand, someone like the EOC might get alerted for Postgres performance, go to the... yeah, that's true, yes, and forget that there are multiple indices they need to check. I don't know; I think we'll need to, or you'll need to, figure it out as we go and basically make a decision here.
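One way the multiple-indices concern is often handled is with a wildcard index pattern at query time; a minimal sketch with the Python Elasticsearch client follows, where the host, index pattern, and field names are assumptions for illustration.

```python
# Minimal sketch: querying across per-project Postgres indices with a wildcard
# pattern, so the on-call doesn't have to remember each index name.
# Host, index pattern and field names are assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://logs.example.internal:9200")

resp = es.search(
    index="pubsub-postgres-inf-*",          # matches gprd, gprd-db, gstg-db, ...
    query={"match": {"json.message": "duration"}},
    size=10,
)
for hit in resp["hits"]["hits"]:
    print(hit["_index"], hit["_source"].get("json", {}).get("message"))
```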
A
Yeah
I
mean
I
could
it's
it's.
We
can
probably
delay
it
like
we
could
do.
We
could
use
the
existing
index
on
staging
we're
not
going
to
do
this
on
prod
for
a
while.
Yet
so
we
do
use
the
existing
index
on
staging.
We
have
to
come
up
with
a
shard
label
for
the
main
cluster,
so
we'll
call
it
like
main
or
something.
A
All right, great. Okay, on to number three. So obviously we moved a lot of our monitoring stack into Kubernetes, but for this I'm thinking that's overkill.
A
Knowing what you know, do you tend to agree, or do you think you would try to get this into Kubernetes?
B
Running Kubernetes would be overkill... well, we would still be running... so, in point 2d you mentioned that you wanted to get the metrics into Thanos. That means that some components of Thanos will have to be running in that project, and Prometheus as well. So we've got at least one Thanos component, I'm thinking about the Thanos sidecar, plus some Prometheus.
B
Both of them need to be deployed, configured and maintained, so at some point we'll need to update them. My thinking is: what's the easiest way to update? Let's say we hit a bug in Thanos, which is something that happened in the past. We've had to upgrade monitoring components on multiple occasions, and on one occasion we had to patch the Thanos binary, and just having a single source of that binary was very...
A
Well, right now we're flirting with the idea of using Omnibus, but you know, we're not ready to marry it. Right now we're just doing Ansible on base Ubuntu with Omnibus installed, and we're doing that everywhere. Part of this is not using Chef at all.
A
So
I
don't
think
it
would
be
complex
to
like
configure
prometheus
and
thanos
and
even
pub
sub
with
ansible,
but
you're
kind
of
right
like
having
one
process
for
pushing
those
changes
with
gitlab
home
file
like
with
kubernetes
and
another
process,
configuring
it
with
ansible.
Maybe
it's
not
ideal.
B
Having said all of that, I don't know if we have a case where Prometheus is running inside of Kubernetes and scraping endpoints on GCE VMs, and I don't know what the networking implications of that would be. It might be trivial; it might just require the VPC peering between the Kubernetes networking and the VMs. It might be as simple as that, I don't know, so that's something to consider. So again, there's pros and cons to both, I think. What's your take on this?
A
My
take
is
like
for
expediency.
I
feel
like
it'd,
be
simple
for
infant.
Like
you
said
like
for
inventory
generation,
it's
really
simple
to
do
this
and
ansible,
because
I
can
just
write
out
the
prometheus
config.
I
don't
think
setting
up
thanos
and
pub
sub
is
gonna
be
tricky
either.
I
would
just
like
create
one
vm
and
install
all
of
these
components
on
it
and
then
be
done
with
it.
That's
my
that's
what
I
was
thinking.
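As a rough illustration of "just writing out the Prometheus config" from an inventory, here is a hedged Python sketch that renders a static scrape config; the host names, job names, and ports are made up, and in practice this would live in an Ansible template.

```python
# Minimal sketch: rendering a Prometheus scrape config from an inventory list,
# the way an Ansible template might. Hosts, ports and job names are invented.
import yaml

inventory = {
    "patroni": ["db-shard-ci-01.internal", "db-shard-ci-02.internal"],
    "pgbouncer": ["pgbouncer-ci-01.internal"],
}

scrape_configs = [
    {
        "job_name": job,
        "static_configs": [{"targets": [f"{host}:9187" for host in hosts]}],  # exporter port, illustrative
    }
    for job, hosts in inventory.items()
]

print(yaml.safe_dump({"scrape_configs": scrape_configs}, sort_keys=False))
```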
B
So I was updating Alertmanager the other day, and some of the Prometheus instances didn't really deal with that very well, and I had to go through the Prometheus config, basically go through the config on all the Prometheus instances, and the config is spread across multiple places. If we were to run this with Ansible, that's yet another way to configure it. I'm just voicing that.
B
I'm
not
saying
that
it's
a
bad
thing
or
a
good
thing,
necessarily
and
my
point
being
since
we
intend
to.
I
will.
I
don't
know
if
that's
still
the
case,
but
up
to
my
best
knowledge,
we
intend
to
move
chef
conflict
to
ansible
conflict.
So
perhaps
this
is
a
good
first
candidate
because
he
would
be
starting
from
scratch,
and
we
once
we
once
we
have
all
the
ansible
playbooks
for
prometheus,
then
moving
the
existing
prometheuses
that
are
managed
with
chef
would
be
much
easier.
A
I guess that's a possibility too, yeah. I don't know if we intend to do that, or to try to go fully to Kubernetes with the ability to scrape things outside of the cluster, right, yeah.
B
Yeah, because one of the things you could leverage, if Prometheus was running on a VM, is the Prometheus service discovery mechanism, so it can talk to GCE to discover scrape endpoints. I don't think that's possible... well, I don't know if that's possible in Kubernetes. It might be possible that you just tell a Prometheus instance running in Kubernetes: hey...
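For reference, Prometheus's GCE service discovery is configured through a gce_sd_configs block; below is a hedged sketch, rendered from Python for consistency with the other examples, where the project, zone, port, and relabeling are placeholder values.

```python
# Sketch of a Prometheus scrape job using GCE service discovery (gce_sd_configs).
# Project, zone, port and job name are placeholders.
import yaml

job = {
    "job_name": "patroni-gce",
    "gce_sd_configs": [
        {"project": "production-db", "zone": "us-east1-c", "port": 9187}
    ],
    "relabel_configs": [
        {
            # keep the GCE instance name as the standard instance label
            "source_labels": ["__meta_gce_instance_name"],
            "target_label": "instance",
        }
    ],
}

print(yaml.safe_dump({"scrape_configs": [job]}, sort_keys=False))
```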
A
What about... I guess if we create the Kubernetes cluster in these new environments, we might as well put Pub/Sub in them as well, right?
B
Well, not necessarily. If your intention is to use the existing indices, then you could just configure fluentd on the VMs to forward logs to a topic in another project.
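In practice that forwarding would be done by fluentd's Pub/Sub output, but as a minimal illustration of publishing to a topic owned by a different GCP project, here is a hedged Python sketch using the Cloud Pub/Sub client; the project and topic names are made up.

```python
# Minimal sketch: publishing a log record to a Pub/Sub topic owned by another
# GCP project (the VM's service account just needs publish rights on it).
# Project and topic names are illustrative.
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Topic lives in the shared logging project, not in production-db.
topic_path = publisher.topic_path("gitlab-production", "pubsub-postgres-inf-gprd")

record = {"message": "checkpoint complete", "shard": "ci", "tag": "postgres"}
future = publisher.publish(topic_path, data=json.dumps(record).encode("utf-8"))
print(future.result())  # message ID once the publish succeeds
```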
B
The existing Pub/Sub infrastructure would just pick up those logs and you wouldn't have to do anything there. You might need to resize some of the pubsubbeat deployments due to the increased volume, but I doubt it; Postgres doesn't log that much.
A
Does
make
it
simpler,
yeah,
okay,
yeah,
you're
right,
I
guess
we
use
the
same
index.
We
would
do
that.
A
Okay,
cool
thanos
yeah,
we'll
have
to
create
the
side
car.
If
we
deployed
to
vm
we'll
have
to
configure
it,
but
I
don't
imagine
that'll
be
too
complicated.
A
Thanos
stores,
independent
defender
sidecar
so
so
does
thanos
store,
have
to
like
be
in
the
same
project
or
how
does
that
work
now,
like?
We
have
final
store
running
in
each
environment's
kubernetes
cluster.
B
That's a good question, and the short answer is: I don't know. I think they managed to move all of it to Kubernetes, but I don't know; I've definitely seen some VMs named after Thanos components, but I'm not sure if they're still being used or what's running on them. The short answer is I don't know. I think the intention was to move all Thanos stores to run in Kubernetes, and that has been started.
B
So if you wanted to have the metrics discoverable in Thanos, that would be one way to do it: basically have a GCS bucket, Thanos store and Thanos compact per GCP project.
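For context, the sidecar, store, and compactor all share an object-storage configuration pointing at that bucket; here is a hedged sketch, again written from Python for consistency, of the kind of GCS objstore config Thanos expects, with a made-up bucket name.

```python
# Sketch: the object-storage config shared by Thanos sidecar, store and compact
# (passed via --objstore.config-file). The bucket name is a placeholder.
import yaml

objstore = {
    "type": "GCS",
    "config": {
        "bucket": "gitlab-production-db-prometheus",  # hypothetical per-project bucket
    },
}

with open("objstore.yml", "w") as f:
    yaml.safe_dump(objstore, f, sort_keys=False)
```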
A
Yeah,
okay
yeah:
this
sounds
to
me
like
the
yeah.
The
easiest
thing
would
be
just
to
have
a
kubernetes
cluster
and
you
know
just
have
to
figure
out
how
to
scrape
these
extra
because
we're
not
going
to
be
really
monitoring
nothing.
It's
great
writing,
including
it's
all
going
to
be
outside
like
petroni,
and
I
mean
I
guess
we
could
run
some
exporters,
like
maybe
a
stackdriver
exporter
or
something
in
kubernetes,
but
I
think
most
of
you.
A
Okay, b I think we already touched on, and c we already touched on as well. So I think I'm good. What I'll do is I'll spend some time thinking about how we can monitor and scrape endpoints outside the cluster.
B
I'm curious to know which way you'll go, because if you decide that you want to provide an Ansible module or play for managing Prometheus, that would be really helpful to know.
A
Yeah, the issue is...
B
Sorry I couldn't be of more help. If you've got any further questions, feel free to message me and I'll try...
A
To
help
you
yeah
sure,
yeah,
no
you've
been
a
lot,
a
big
help
and
yeah.
I
appreciate
you
taking
the
time
I'll
I'll,
let
you
know
how
it
goes
cool
all
right.
Thanks
all
right
talk
to
you
later
ciao.