From YouTube: Kubernetes SIG K8s Infra - 20230607
A
So before we start, do we have anyone new on this call? One, two... some quick introductions, no obligation to do it.
C
I'm new.
A
Welcome, yeah, welcome to the gang. We really appreciate what Peter Martin is doing for us, so happy to see more people.
A
Okay, welcome also, folks. Please don't forget to put yourselves in the attendee list, because I see the number of hands is not matching the number of attendees listed. So thank you so much, that's all. Okay, do we have anyone else?
A
How am I gonna do this? Let me introduce the last one.
A
But also, yeah, the situation, also yeah. We brought it back down to kind of a normal thing, so hopefully there should be enough.
G
To see the change in the artifact registry after we fixed the S3 buckets.
A
Yeah, I think we made a mention of that in the last meeting, that we have a graph showing hugely decreasing costs just for...
A
No, I think, yeah. This view is better, yeah. So, yeah, the registry is still the highest cost, because we don't cover all the regions, and prow is slowly growing.
A
If I... let me put that, so yeah. Let me do it this way.
I
Yeah, I think if we look at the cost updates that I did at the bottom, it tells about the same story. I think, the way it runs now, we should be at around 2 million at the end of the year; there doesn't seem to be a massive change happening now. It's slowly creeping down on the one side, and AWS...
I
So AWS is already at 59; two or three weeks ago, or two or three meetings ago, it was around about 45, and now it's almost 60, so AWS is nicely growing to match. And also the credits are looking good, with 150,000 left of the current budget.
B
I have a quick question: do we have any estimation, or are we getting ready to go live with an AWS plan for reserved instances, because we have persistent expected usage on that matter?
B
There are also some solutions for this on the GCP side; for AWS there are Marketplace reservations and Marketplace utilization with automated third-party solutions. So do we want to check that out?
A
We don't currently... I don't want to say we have a full understanding; it's more that we don't yet capture the entire infrastructure we plan to have, and I always say, for example, 2024, because we still have things we need to balance between GCP and AWS. So until we get that right, there are no real plans to do capacity planning and cost optimization right now, and our budget is fine to accept any kind of experiment. Basically, we're getting things ready.
K
I think you can. So one thing you need to do is label the buckets that you want to project costs for, and then you should be able to understand which set of labels is costing what.
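A minimal sketch of that bucket-labeling idea, assuming the google-cloud-storage Python client; the project name, bucket name, and label keys here are hypothetical. GCS bucket labels flow into the billing export, so spend can then be grouped per label.

```python
from google.cloud import storage

# Hypothetical project and bucket names, for illustration only.
client = storage.Client(project="k8s-infra-example-project")
bucket = client.get_bucket("k8s-infra-example-artifacts")

# Merge in cost-attribution labels; existing labels are preserved.
labels = dict(bucket.labels or {})
labels.update({"owner": "sig-k8s-infra", "purpose": "artifacts"})
bucket.labels = labels
bucket.patch()  # persist the label change on the bucket

print(f"labels on {bucket.name}: {bucket.labels}")
```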
A
So two meetings ago we had the conversation with Ben, Justin, and James about basically how we can add more images to graduate from k8s.gcr.io to the new registry, and I think we were in agreement on getting that information first, before we move to any conversation. So it was considered as an action item. I'm not sure if we still want to do that.
J
And I think the AI was on me, right? And if so, I haven't yet had time. Okay, but I can still do it. If we don't want to do it, then that's fine, but I did the analysis before, so I assume that's why the AI, I think, fell on to me.
L
Thanks, thanks Justin. Let's do one more planning cycle just to get the next set of 10 images and see if we want to do it at all, right? Like, I'm not saying that we should do it; I just want to see what will show up in that bucket, and then we can decide whether to do it or not. And at this point I don't want to go increase the workload on all the Googlers that are behind the scenes, you know, handling k8s.gcr.io.
L
Unless we really think that, you know, we would make a dent in something or the other, right?
G
Without looking at individual images, just looking at the overall bandwidth graphs, we have dropped it off so much. You can also just see that the k8s-artifacts-prod project has dropped: instead of being like 60-plus percent, it's less than half of our spend, even though it still includes the new Artifact Registries. So I've just been keeping an eye on those overall metrics, and my current recommendation is that we let that traffic age out.
L
So, just so you all know, I cleaned out the last image that was there in the Docker Hub registry. I don't know if you knew about those things, so I cleaned them out today. There were three tags on the pause image, and if you go there, it's empty now. I got access to it from Tim Hockin and went and cleaned that out just an hour or so ago.
L
Anyway, sorry, let's go on.
A
Cool, okay, we can jump into the open discussions, right?
A
Yeah, okay, so let's first start with me: there's a draft of an article basically announcing that dl.k8s.io will use the CDN to serve binaries. So I put the link to the pull request that is open, and I invite people to do some kind of review. There's a preview if you want to read the current state of the article. We hope to get that merged by the end of the week, so everybody, feel free to drop comments.
A
Is it really a problem to have that cleaned up and back to normal? Because I think it's not really changing anything. We kept that for like three months, traffic didn't go up, and we still have like 60, 55 percent of traffic captured by the new registry, so I don't think it's worth maintaining that banner at all.
K
Right, sounds good. Also, has Google mapped out the redirects, or are we still doing a select set of images?
A
Yeah, I think that goes back to the question; that basically comes back to the request by teams about getting a new data point to see if it's worthwhile adding another image to the redirect. So I think we have a conversation to have about this as well.
A
Okay, okay, so I'm done about the blog post. Next is Patrick. Let me put it up; do you want to present something, or do you just want to talk about it?
F
You can just open those links that I've put here. Maybe let's start with the first one. So, yeah, I just want to briefly discuss the GitOps solution that we introduced to the EKS prow build cluster.
F
So here at the bottom there is a GitOps section, and it describes how to create customizations and how you can basically deploy things to our clusters without having direct access, basically via GitOps. I don't want to get into the details; you can read it on your own and you can reach out to me later. Yeah, it's the GitOps section; it describes how to install it on an empty cluster and also how to extend what we already have.
F
To be honest, there was no... I mean, it was chosen subjectively; it was chosen because it subjectively feels like a simpler solution than, for instance, Argo. But if we have any strong arguments, I guess we can simply migrate; it shouldn't really be that hard.
F
It's not that much automation. So, yeah, we have some dashboards on Grafana so that you can track whether the thing that you added has been successfully deployed. What's missing right now, and what we still need to improve, is to present somehow what type of error happens if something goes wrong; but this could probably be achieved through Loki connected to Grafana, with some logs from the specific components that Flux has, or maybe notifications. This is something that we can work on.
F
To be honest, I don't really know if we already have some notifications or if we leverage them. If it's easy to get, let's say, a Slack token, I don't know, we can consider that; it's not a must-have. We can also configure Loki, as I said before. So this is it; if you want to discuss any alternatives, I'm open.
F
If you have any comments or questions, we can discuss them quickly right now, or you can reach out to me later. So, yeah, that would be it from my side.
K
Yeah, I have a question for you. Can you just go to the markdown document in the second tab?
K
Can you split this document so that there's a section, the bootstrap section, right? That should be something that you run one time on a cluster when you create it, and you don't really need to do it again. And then there should be a section for people who are writing new stuff for the cluster, about what they need to look at.
K
So I don't need to worry about deploying Flux, but I do need...
F
Yeah, I mean, that's a good point. We still haven't added the monitoring section to the documentation. So, yeah, we will prepare some developer-dedicated, let's say, README, so that someone who's not interested in the administrative part of EKS could leverage that. That's a good idea.
L
Yeah, so basically persona-based, right? Like, have some personas, and most of the people won't need to know how to install things.
L
I mean, this is good. Let's go with what you have and see how it works over a period of time, right? Only when we start getting into maintenance mode will we know the actual problems: hey, somebody's not there, so we don't know what is happening, you know, things like that. Right? We should be able to, for people who are watching...
L
It should be able to tell what it's doing reasonably well, and when it goes bad we should all get to know, one way or another, right? So some alarms, monitoring, that kind of stuff would be good.
D
Thanks, James, Patrick, this is good work. It models a lot of the common patterns I've seen with Flux used elsewhere, and also here in...
L
Right, one simple pattern for the errors is probably going to be to just email people when the GitOps operation itself fails; just send out an email or reopen a GitHub issue like we do in publishing-bot, one of those two. You know, simple, it works.
M
Yeah, do you want me to share the screen, or do you want to go through the board?
M
No? Okay, yeah, sure. So, first thing: in the past week I have worked on adding a new dashboard that's going to allow monitoring the resource consumption of prow jobs. This is one of the key blockers that we have for migrating jobs from the default, Google-internal Kubernetes cluster to the EKS prow build cluster, because in the EKS prow build cluster you need to specify resource requests and limits for each prow job, and in the default cluster most of the jobs running there don't have that. And the idea...
M
This is some work that we did internally at Kubermatic a few years ago, actually, when we migrated to prow: to have a dashboard, like this one, that's going to show the memory usage and CPU usage for pods that are basically running prow jobs, and it shows the status of that and how it works. Basically, there are multiple dashboards; you can see information at the organization level or repo level, or something like that.
M
But this builds dashboard is the most important one, and as you can see on the screen, it presents, for a job that you select, the memory usage and the CPU usage. I think memory usage is the most straightforward thing: it shows how much memory is needed. CPU is a little bit more complicated because of Go and GOMAXPROCS, so Go is trying to use more CPU than it can get.
M
Then some throttling comes into effect, and it's hard to estimate exactly how much CPU you're going to need. That's going to be based more on trial and error: you're going to try to reduce CPU, you're going to look at the graph to see whether throttling is happening too often, and you are eventually going to see if the job duration increased, if the flakiness increased, and stuff like that.
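The dashboard itself isn't captured in the transcript; as a rough sketch of the kind of query that could back it, assuming cAdvisor container metrics are scraped by an in-cluster Prometheus and that a prow job's pods can be matched by a name pattern (both the endpoint and the pattern below are assumptions), peak working-set memory per pod could be pulled like this:

```python
import requests

# Hypothetical Prometheus endpoint and pod-name pattern for one prow job.
PROM = "http://prometheus.monitoring.svc:9090/api/v1/query"
JOB_POD_RE = "pull-kubernetes-unit.*"  # placeholder job/pod pattern

# Peak working-set memory per pod over the last 24 hours.
query = (
    "max_over_time(container_memory_working_set_bytes"
    '{pod=~"' + JOB_POD_RE + '", container!=""}[1d])'
)

resp = requests.get(PROM, params={"query": query}, timeout=30)
resp.raise_for_status()
for sample in resp.json()["data"]["result"]:
    pod = sample["metric"].get("pod", "?")
    peak_mib = float(sample["value"][1]) / 2**20
    print(f"{pod}: {peak_mib:.0f} MiB peak")
```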
G
Looking at memory usage, like that specific example: usually we're constrained on scheduling CPU, so reducing the memory allocation for a job doesn't help us schedule more workloads, and it does increase the odds that a pod gets killed or something, and things will change their behavior based on how much memory is available to them, performance-wise. So I would keep that in mind in the future when looking at this dashboard. I think for most of them we want to dial in a reasonable CPU limit, and the amount of memory we allocate should probably be based on the shape of the underlying machine, the ratio of CPU to memory, because CPU is almost always our most constrained resource. It's possible we'll find that that changes, but I've looked at a lot of CI job workloads and I don't think we're going to find that, and it's probably fine and expected that we're over-allocating memory to pods.
M
Yeah, I agree with Ben. That's why I said I think resource optimization for existing jobs shouldn't be too big a priority. The main driver for this is to help facilitate migration for jobs that don't have any resource requests and limits, so that folks have at least some basis to figure out what's going on and how they should adjust their jobs.
M
Yeah, so a jobs migration update. As some of you might know, we had an outage for the Google-internal prow build clusters, so jobs were not getting started, and because of the U.S. holidays that weekend, when this happened, a lot of folks were trying to find a solution. One of the solutions was to migrate jobs to the EKS prow build cluster, because we have more coverage.
M
In terms of support, like folks who can ship a fix if something is wrong. And we actually had some major repos migrate as much as possible, and this includes test-infra; a lot of the test-infra jobs are migrated. The ones that are not migrated are those that require access to the Google infra, like to push images; those are still in place. There are other projects, like Kueue, I think it's called, and the Cluster API provider for AWS, and probably some other projects that migrated. I think everything has been working for quite a while; so far, no major issues. Like, there were no major issues; some smaller issues about resource limits and requests, but that was fixed; that was mainly for CAPA.
L
I had a question, not necessarily for Marco, but thanks to Marco for providing that context on, you know, the problem that happened and how we were able to mitigate some of it. But the question is: how many clusters do we have, where are they running, and who has access? I went looking for it and I couldn't find it in a single location.
A
The first one is the default, owned by Google; the last one is owned by Google; those in the middle are community-owned.
L
Right, we need to put this information somewhere, for sure, right? Yeah, yeah. The other one that we were talking about that day was, you know, for prow job definitions, the cluster is not mentioned when the cluster is the default. Then we were talking about, hey...
L
Can we put, you know, cluster: default there to indicate it, so that we can have a count of how many jobs are running where, and how many are actually running in the default cluster, using grep, kind of thing, right? Or write some scripts to go around figuring out how many jobs are there and where they are running.
L
You know, how many of them can we move from here to there, that sort of thing we can start doing, right? Like, we can go tell SIGs, saying: hey, you have this many jobs running in this cluster, can you move these to that cluster, that kind of information. Go ahead, Ben.
G
I don't think we actually want to centralize tracking that; I think that's going to be a big headache. I think we want to put out a good PSA outlining for people why it's valuable to them and to us to migrate, and what sort of jobs make sense, and just let them migrate what they can, and then we can evaluate what's left. And I believe you already found that we can...
G
We have the config loading; prow has a package that's aware of loading the config properly, that does all the defaulting and everything, and we don't want to add the field to every... yeah.
G
But we can also get the data without, like, enforcing that people add this as boilerplate to every job. It's noise to actually have it on disk, and if it doesn't have the cluster field, then we know what it is, and, you know, it's more reliable to actually load the config with the package.
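A rough sketch of the kind of counting script mentioned above, assuming a checkout with the usual test-infra job config layout (config/jobs/**.yaml containing presubmits, postsubmits, and periodics); jobs without an explicit cluster field are counted as "default". As noted, prow's own config loader remains the authoritative way to do this, since it applies all the defaulting.

```python
import pathlib
from collections import Counter

import yaml

# Rough approximation: walk the job configs and count jobs per cluster.
# The config/jobs path and section names follow the usual test-infra layout.
counts = Counter()
for path in pathlib.Path("config/jobs").rglob("*.yaml"):
    doc = yaml.safe_load(path.read_text())
    if not isinstance(doc, dict):
        continue
    for section in ("presubmits", "postsubmits"):
        # presubmits/postsubmits map a repo to a list of job definitions.
        for jobs in (doc.get(section) or {}).values():
            for job in jobs:
                counts[job.get("cluster", "default")] += 1
    # periodics are a flat list of job definitions.
    for job in doc.get("periodics") or []:
        counts[job.get("cluster", "default")] += 1

for cluster, n in counts.most_common():
    print(f"{cluster}: {n}")
```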
A
That's fair, yeah. I think right now we need to basically push the SIGs to start thinking about the migration; I think that's the one thing, and to provide documentation about it. The one thing I think would be interesting is to see a document that guides the migration, where we literally say to people: you should specify the cluster field, because it's explicit, we know where it's running. From that migration guide we then push the SIGs to do the migration, because we'll be talking about improving prow.
L
And the thing that I've also noticed is: unless there is an emergency, people don't move, right? So, yeah, we can't get folks to do anything unless there is a real urgency. Exhibit A: that is what I was doing last weekend, which was to send emails to, you know... actually, not emails, opening up GitHub issues for all the SIGs, saying: hey, here are your 10 jobs that have been failing for the last year or so, right?
L
So I was sending those kinds of GitHub issues, and, you know, we had two CI jobs which had been failing on our side for three months too, and nobody looks at them, because there is no urgency, right? So we've got to create the urgency somehow, and I don't know how to do that, and I'm looking for ideas.
G
I mean, this is the kind of task that shards out really well. We've done some similar ones recently, where I was like, hey, we should be using E2 nodes for builds, and contributors just sharded out that work and PR'd all the repos and updated them. The config has so many jobs that trying to centralize keeping track of all of them... I mean, SIG Testing hasn't been able to keep on top of that, and we're not even responsible for all the rest of the infrastructure.
G
The rest of us can review and approve those PRs, and you don't even have to have access to any of this or be an expert in any of it. For most jobs it will work fine; you're literally just going to add one field and say cluster: eks-prow-build, done. Just add that field to the job, send that PR, and then just be around to check.
G
I think if we can even get a fraction of the project to pick this up, it will go pretty well and we'll move a lot of jobs, and then we can start looking at what's left. And we don't actually have to get people to personally do this for their own jobs; we'll ask them to, but it's okay if someone else picks it up.
M
One thing to mention about it: we had a discussion in some channels, and the idea that we had is to come up with a sort of data spreadsheet or whatever that's going to include a list of jobs that make sense to migrate. This is something that I also had a chance to discuss with hippie, and dims also came up with some spreadsheet.
M
The idea is that, based on stuff like labels, the environment variables, the job specs that you're seeing, and stuff like that, we can relatively reliably find out which jobs are a good candidate to be moved. For example, if you see a job that's using GCP, I don't know, using the preset for GCP credentials, that's a sign that you don't want to migrate it.
M
There are also jobs that might make sense to migrate, like ones using the e2e tests for AWS, for example, but this is something that has to be coordinated with us, the maintainers of the EKS prow build cluster, because we need to keep an eye on stuff like Boskos pools, to make sure that we have enough accounts and a bunch of other stuff.
M
So the idea was to try to come up with a spreadsheet, try to do some data magic on it, and figure out what jobs might make sense to migrate, export that, and maybe create some phased approach: like, let's first migrate jobs that are easy, that don't use anything and just run some script, make lint, make build, whatsoever; then go to e2e jobs that are possible to migrate, and stuff like that.
L
And the other thing is, if we have to do it, the best time to do it is now, I guess. Otherwise, once we go deeper into the 1.28 release cycle, we don't want to break things, because then people are not getting good CI signal, sort of thing. So that's the other thing, right? If we have to start doing it, we have to start doing it now, early.
G
You just change the cluster field; I don't think we need to build a spreadsheet or a dashboard or anything. I think once we've gotten people to work on this for a bit, we will have scaled down the problem a lot, in terms of how many jobs are remaining on the default cluster, and we probably won't even need to build anything. We can just look at the prow.k8s.io dashboard, filtering for the default cluster, and look at what's still there. And I don't even think we need a really detailed guide.
G
I mean, we basically outlined it in the chairs and tech leads meeting. We could probably just go transcribe what Marco and I said in that last meeting and send it out. Okay.
A
Yeah, sorry, folks, we have 10 minutes left and I think we still have like four things to talk about. So I want to basically ask you to have this conversation afterwards, or in private DMs, whatever you want; we need to move forward in this meeting. So, last question, quickly: can we commit to work on that until the end of the year, to move the 2,000 prow jobs?
A
Good, so let's move forward; we can follow up on this on Slack. Next is, and I might mispronounce that, from John Oyan.
C
Yeah, both are actually correct, thanks. And I will return the favor and probably mispronounce your name, Arno.
C
Perfect, thanks. So from my side, just a super quick one: one PSA and one open-ended question. The PSA is that I made an effort to kind of bring parity between the Grafana instances for the build clusters. I took a look at the 10 dashboards in the GCP cluster and I managed to successfully port three out of the ten. The other seven had either no data or were just not relevant for what we have deployed in the EKS cluster. So I think that is reasonably good.
C
The open-ended question: the security team, or the security response team, from AWS has reached out saying that potentially we are sharing too much, read-only, on the Grafana for EKS; mainly, I think, they mentioned the node exporter. They haven't found anything sensitive, but they also said that potentially we may want to keep that, even for read-only access, restricted to authorized and authenticated users.
C
At the moment the Grafana doesn't really have the ability to authorize or authenticate yourself; it's just anonymous users. But there is a thought that we can potentially split the dashboards into two Grafana organizations. The public one will have only the super-safe ones, whatever we feel very confident can be exposed to everybody, and there will then be a second organization which will be hidden, only for those who are authenticated.
C
Writing is disabled, but the security response team mentioned that even the read-only access to the node exporter is something that triggered an alert, and they mentioned that, while they don't have any concrete problems with it and they couldn't find any concrete piece of information that is too sensitive, they would prefer us to have that hidden as well. And having some dashboards hidden is, as far as I understand Grafana, best done by having multiple organizations, Grafana organizations.
A
Okay, yeah, I think we should just follow the security response team's recommendation right now. Let's just remove any dashboard that's supposed to expose sensitive information, because I think right now we only expose CPU, memory resources, and private IP addresses. So if they feel like it's too sensitive, just remove the dashboard.
E
So Arno reached out to me wanting to get something going with Okta; it turns out we do have a contact with Okta. The idea here was to start granting access to AWS and GCP using GitHub SSO via Okta. We have a free trial account, and we are working with Okta directly to make that not a free trial account, but the free trial account is going to have a lot more of their knobs enabled than a normal account. So that's moving.
K
Just before we move on from that: that's good to hear that we've found an IdP provider. So does that mean we're parking Azure AD and we're going to use this instead?
A
Yeah, we can do that; we can basically sync the GitHub username to Okta, and from that we make sure there's authentication, and allow, from Okta, access to any platform we want; authorization is then handled by each cloud platform we want to integrate with. That's the plan. We don't need a specific policy regarding MFA, like what Azure is proposing; we just need authentication. So I think we can start with Okta until we get Azure and improve things later. But Okta is a good start for me, yeah.
M
For OBS, the core implementation is done; we discussed it in a meeting earlier, and everything is merged. The first release for which we are going to have packages built on OBS is going to be 1.28.0-alpha.2, and that's probably going to happen tomorrow, so if you're interested, keep an eye on that. We are still going to see how comms and all that is going to look, but for now just know that this is going to happen.
L
Yeah, Marco, quick thing: is there a GitHub issue for planning of the sizing for the OBS network traffic? Let's open that up, please, because, you know, we mentioned it a couple of times, but we haven't made any progress on that.
M
Yeah, we will have to figure that out, yeah.