From YouTube: INFRA Weekly Meeting 2020 06 09
Description
Jenkins Infrastructure Project Meeting - 2020-06-09
Notes - http://bit.ly/2T0oZ9v
A
A
So basically, what we discovered is that around February, the volume that stores the backups switched to read-only mode, so we have had no backups for that database since February, and so we weren't able to restore all the users. The first focus was to restore all the services on that cluster, and now we are investigating what the issues are with the users. So first, do you have any questions regarding the AKS cluster?
B
C
It was running, but whenever any changes were attempted, there were issues. So we hadn't been able to run the deployment job for the last couple of weeks because it kept failing, and I think Olivier was trying to update the ingress records in preparation for the next version, and it was kind of not working properly, and yeah.
A
That's what happened. The cluster is monitored: we are using Datadog to monitor the cluster, and every service running on the cluster is monitored too. One of the things that we could have monitored better is that database. But I mean, I'm looking at what happened with the cluster, and I'm not sure we could have handled it in a better way, except by having more clusters and spreading the risk across multiple clusters. Otherwise, that was a weird issue. Something that I also tried was to open a ticket with Azure.
A
So this is also something that we discovered. I'm not sure that we really need that; I think we just had to clean up everything. That cluster had been running for two years, or something like that. Sorry, that one had been running for a year, and even with the monitoring, I don't think that we could have caught this.
A
A
And so when we want to use an updated version of Kubernetes, we just ask Microsoft to update the version to the next one that we want to use, and Microsoft does the management to turn machines down and up and so on. So we don't have visibility there. Basically, what happened here is that we had a lot of timeout issues between the agent and the master, so it was not possible to, let's say, SSH into the machine and debug what was happening with etcd or whatever.
A
It was just like a black box to us. So that's why, I guess, we tried to upgrade, hoping that upgrading the version, even to a minor version, would just restart the nodes on the Azure side. That's one of the things that we tried. I was also hoping that maybe Azure support would help us in this case, but that was not the case. So that's why we decided to just redo everything, because at that time it was also the quicker solution to the problem.
A
I did not realize at that time that the backup was not being done anymore. So, basically, the way that these backups are done is that every day there is a cron job that dumps the database to Azure File Storage, and each time we stop the container, we also generate a backup of the database on that storage. That Azure storage is replicated in multiple regions, so there is no reason for the backup to be gone. And so, basically, here is what happened.
A
We just mounted that Azure file storage in read-only mode instead of read-write. Something that we need here is a monitoring job that just says, "It looks like your backup is quite old: it's older than one week, or two weeks, or one day, whatever." This is something that we could have seen earlier, but yeah.
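The monitoring job described here could look roughly like the following. This is only a minimal sketch, not what the infrastructure team runs: the function names, the two-day threshold, and the idea that backups are plain files in one mounted directory are all assumptions made for illustration. It reports a problem when the backup volume is not writable (as with a read-only mount) or when the newest backup file is older than the threshold.

```python
import tempfile
import time
from pathlib import Path
from typing import List, Optional


def newest_backup_age_seconds(backup_dir: str, now: Optional[float] = None) -> Optional[float]:
    """Return the age in seconds of the most recent file in backup_dir, or None if there is none."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).iterdir() if p.is_file()]
    if not mtimes:
        return None
    return now - max(mtimes)


def is_writable(backup_dir: str) -> bool:
    """Try to create (and immediately remove) a temporary file; a read-only mount raises OSError."""
    try:
        with tempfile.NamedTemporaryFile(dir=backup_dir):
            pass
        return True
    except OSError:
        return False


def check_backups(backup_dir: str,
                  max_age_seconds: float = 2 * 24 * 3600,  # hypothetical threshold: two days
                  now: Optional[float] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means the backups look healthy."""
    problems = []
    if not is_writable(backup_dir):
        problems.append("backup volume is not writable")
    age = newest_backup_age_seconds(backup_dir, now)
    if age is None:
        problems.append("no backup found")
    elif age > max_age_seconds:
        problems.append(f"newest backup is {age / 86400:.1f} days old")
    return problems
```

A scheduled job could call `check_backups("/path/to/backups")` once a day and raise an alert in whatever monitoring system is in use whenever the returned list is not empty.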
So something that you have to keep in mind is that the LDAP service has been running on Kubernetes for three years.
A
Multiple times we had to move the container to a different cluster, because we had recreated the cluster with particular permissions. It was always almost transparent, because the process was done in a few seconds and we were able to back up and restore very quickly. So this was really the first time that we had such issues with the backups.
B
A
Those kinds of issues were really small, but it means, for example, DNS changes: we had to wait for hours before the DNS was totally propagated, and stuff like this. So I have a list of things that I changed. Everything has been pushed to the jenkins-infra charts repository, or at least most of the changes. But yeah, we still have to do the retrospective regarding that outage once everything is totally fixed.
D
A
D
A
A
D
E
D
A
A
A
B
D
B
It's all on the same machine, that's the problem, yeah. That's why I'm asking whether we could have a snapshot for that. So, for example, for the wiki we can just keep one snapshot forever, because we don't expect any changes to happen now; I mean, we cannot lose it. And for Jira it would still be useful in general, so that if anything happens, we at least have one snapshot of the most relevant version, where you have the historic data. And yeah, we could think about what we do next.
C
A
So, for example, you mentioned pkg.jenkins.io. Yes, you're right, most of the packages are also on Azure, but for example, we don't have the very old versions, like from before we started to upload there, especially for the release lines that are not used anymore. The same goes, for example, for certain packages. That's what I mean by saying most of the data can be retrieved in some way, but it really depends on the services and the data. So, for example, for me hast, you know off student by unit, yeah.
A
B
A
Again, I think the first step would just be to take snapshots of the different machines. Let's do it. If you just want to do one snapshot today to be sure, it's quick. If you want to put in place some script to do that on a regular basis, then you have to work on the script, and you have to work on the monitoring as well.
A
So yeah, let's go back to the LDAP database. Something that you have to understand is that we use LDAP as the source of identity, but you also have multiple services that synchronize that database into their own service and so keep a local version of the users. And so basically what happened here is that we lost the LDAP database.
A
A
So yeah, and now I'm looking at how to restore the people who created an account but do not have admin access. I checked just before the meeting, and it's around 9,000 users that were removed from the database, so now we have to bring them back into LDAP. That means that we have to write a script for this, but that's kind of the current state.
B
F
A
B
So we have some super users, for example KK or Jesse Glick, who should be able to release components now using the current set of permissions. And yeah, probably we could start adding some contributors to the allow list, maybe additional users or whatever, so that permissions for accounts are granted as they are asked for. I don't think it's the end of the world if we delay restoring these permissions in general, if we have a workaround for now.
A
E
We have identified, I think, 50 accounts that are maintainer accounts that we no longer have or that have been recreated. So what we can do is remove permissions from these accounts in the permission files in the repository permissions updater and just restore the old behavior, because we know which accounts are potentially compromised or should not have access, and everyone else is fine. So we can do that.
B
A
Sounds right. So I guess we can move to the last big project: the work being done on the automated release. Obviously I did not have the time to work on this over the last week, but basically, just before the outage happened, I merged a major PR with which we can do stable, security, and weekly releases directly from the release environment. For the stable one, I think it's ready, but before releasing a stable I'd like to be sure that we can do a security one, and yeah.
A
E
Arguments have been introduced that make the entire thing configurable, so we can set up an environment where we would release a weekly release as if it were a security update and see whether that works, or we just create repos for maven 148 and pretend there's a security update happening. We can do either of these things.
A
Yeah, so basically this is something that we could test now. Right now we have two jobs: the one that releases, which uses the Maven release plugin, and the one that packages everything. We can also promote artifacts at the end of the release or at the end of the packaging. Everything is parameterized now, so we just have to sit together to test that it's working well.
C
F
B
A
So, basically, what is being suggested is that instead of creating a weekly release on Monday, we create a security release based on weekly content. So we just fetch the data from the Jenkins master branch. We do not introduce any security fixes or whatever, but we proceed just like we would for a security release, and.
D
A
D
A
We can do it on Tuesday. It was just a confusion on my side, because I missed the email where we said, "Okay, we are going to do the release on Tuesday." So initially I put a cron job on Monday, and the change was not taken into account. So we did release yesterday, but it will not happen anymore. We can do it on Tuesday; I'd prefer that as well. Oh yeah, so it sounds like we can. I need to see how we can do it, right.
A
Another thing: the release environment now has a folder called components, and under components we can now release the remoting component. So we can use the code signing certificate for the remoting component, and the next release will happen from that environment. I know there are people who were interested in using the code signing certificate to sign components. So yeah, that's one of the changes to the release environment.
E
E
A
Basically, for the remoting component, I just reused most of what we do for Jenkins core, except that I removed some parts, so I do not expect it to be a big piece of work here. It was more of a proof of concept to see if we can release remoting. So if you think that we need to introduce a staging environment, that's not a big deal.
C
We've got two issues on the weekly at the moment with the release process. One is that Windows is broken due to a Microsoft security update, and we need to rebuild our images; we're having issues with that. The other is that if the packaging fails, it seems like we get some things uploaded, but not the packages. From what we've gone through, it looks like the Debian and Red Hat ones are failing, even though they stay hidden, so.
A
Now, regarding the issue with the Debian package that is not published, that is something weird to me right now, because the way the packaging happens is that you have Debian happening at the same time as the others, such as SUSE and Windows, and they are also published. If, for some reason, one of them is broken, it will still finish the other steps; for example, the Debian one should be completed. And there is another thing that needs to be improved in the release process, at the end of the release, at the end of the packaging process.
C
C
H
A
Just on the Windows packaging: the packaging job has been failing since we upgraded the cluster. The reason is that we were using an old version of Windows on the old cluster, and since we upgraded the cluster, we now have an up-to-date version of Windows on the nodes. There was a security issue in the old version of Windows, and in order to fix that security issue, they introduced a breaking change, so now the old images do not work with the new Windows nodes.
A
So that's one of the things. A general issue, which is related to Windows but also to the infrastructure, is that when we started using Windows nodes for the release process, we had one big image containing the JNLP content and more, and we also had to put in place specific infrastructure labels on those specific nodes. And so now that your creds are every Jenkins instance. So we are web created to changing shots, that we are using to infrastructure and Sun.
A
We are putting a lot of logic in that specific Windows container in order for it to work in our infrastructure, and this does not scale. It's also difficult to test on local machines, because we have to make a lot of specific configuration changes in order to test the Windows packaging. So that's why we really need new containers.
A
A
B
So, for today it is an enhancement, because we prefer to have components like Jenkins Configuration as Code using GitHub issues, and right now it goes to Jira, which is, well, not that relevant. Although I'm not sure what still needs to get implemented for that. I believe we need some updates in the metadata, so there was a pull request from Tim for that, and then we need to apply some magic to get it posted. But yeah, that render link is already there.