From YouTube: Project Authorizations - Lunch and Learn Session
A
Yes, can you see my screen? Yes, yeah.
B
Okay, so today we will be talking about project authorizations: the past, the present, the challenges, and the future of project authorizations. My name is Manoj. I am a backend engineer in the Tenant Scale group, but previously I used to be in Manage Auth and Manage Organization, and during my time in Manage I worked a lot on project authorizations. So that is how I know most of this.
B
Yeah, so the first part is a very basic introduction to project authorizations and what it is about. Project authorizations is basically a cache for storing the access level that every user has to the projects they have access to.
B
So it's a very simple table consisting of three columns: user ID, project ID, and access level, where you store that specific data. I have put a question mark next to "cache" because I don't know if you can really call this a cache. In cache systems, if the value is missing from the cache, we call it a cache miss, and we are still able to obtain the right value by making the calculation without the cache.
B
So if there's a cache miss, you go directly to the source and figure out the value. But in this case that's not really happening, because the expectation that the system runs on is that the access level value is present in project authorizations. So if there is a cache miss, the whole system goes for a toss. "Cache" is probably not the right term for it in the strict sense. Maybe you can call it a precomputed value, but for simplicity's sake,
B
we can assume it to be a cache. We can always rebuild the cache, and there are systems for that, but if there is a cache miss, the system goes for a toss, which means that somebody does not get access to the project. So that's the basic detail, and here I have an example of what it looks like in our app.
B
Yes. So there is a uniqueness scope on the combination of user ID, project ID, and access level, which means that a specific user can have only one kind of access level to a project. You can be a maintainer in a project, but you cannot also be an owner of that project, so it's always unique that way. That is one of the constraints we have on the table. Next slide.
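As a rough illustration of the table described so far, here is a minimal in-memory sketch in Ruby. It is not GitLab's code: the class name, method names, and the owner value of 50 are assumptions made for the example (the talk only mentions 40 for maintainer). It shows the two properties discussed: one access level per user and project, and no fallback when a row is missing.

```ruby
# Hypothetical in-memory stand-in for the project authorizations table.
# All names here are illustrative, not GitLab's actual code.
MAINTAINER = 40 # access level mentioned in the talk
OWNER = 50      # assumed value for the owner level

class ProjectAuthorizationTable
  def initialize
    # Keyed by [user_id, project_id], so a user holds at most one
    # access level per project.
    @rows = {}
  end

  # Store (or overwrite) the precomputed access level.
  def upsert(user_id:, project_id:, access_level:)
    @rows[[user_id, project_id]] = access_level
  end

  # No fallback to the members table: a missing row simply returns nil.
  def access_level(user_id:, project_id:)
    @rows[[user_id, project_id]]
  end

  # A missing row is treated as "no access".
  def at_least?(user_id:, project_id:, level:)
    stored = access_level(user_id: user_id, project_id: project_id)
    !stored.nil? && stored >= level
  end

  def size
    @rows.size
  end
end

table = ProjectAuthorizationTable.new
table.upsert(user_id: 42, project_id: 1, access_level: MAINTAINER)
table.upsert(user_id: 42, project_id: 1, access_level: OWNER) # replaces the row
```

A lookup for a user with no row, such as `table.at_least?(user_id: 7, project_id: 1, level: MAINTAINER)`, returns false here, which mirrors the point that a "cache miss" means the user gets no access.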
A
B
Yes, so the question that can come up at this point is: why do we use project authorizations? The reason is that we use it as a cache because we want the precomputed value to always be there. Otherwise we would be fetching it from the members table. Say you add somebody as a member to a project: if you compute access via the members table, it takes a lot of time, because that computation is hard and expensive.
B
So what we do is precompute the access level that a specific person has to a project and store it in the project authorizations table. That is why we have project authorizations. The corresponding question is: why don't we have group authorizations? Users also have access to groups, and we don't have a table called group authorizations. If the computation is hard or expensive, why don't we have group authorizations as well?
B
So the reason is that we kind of have satisfactory performance on the group side of things, but we did not have satisfactory performance on the project side. When you do a git push to a private repo, you want it to be as instantaneous as possible, and we do not have time to go into the members table and ask, "Hey, does this user have high enough rights to push to this repo?"
B
On the project side there is also this: if you imagine it as a tree, a group is at the top level and a project is at the leaf level. Once you traverse to a leaf, there are already more components involved in the calculation, which makes it more expensive. A group is at the top, so you can imagine that it's easier to figure out the value at the top level. That is why we don't have a group authorizations table, but it might come at some point.
B
A
B
Next slide. Yeah, so there are some existing concerns around project authorizations before we dive deep into it.
B
So the first point is that this cache can be considered wasteful, because we have to store every user's precomputed value for every project they are part of. For example, in my day-to-day job I use the GitLab Rails project, but since I am part of the gitlab-org group, I am part of thousands of projects underneath it, and my access level to each of these projects is stored in the project authorizations table, even though I virtually never access those projects. In my day job I maybe go to the handbook repo and the GitLab repo.
B
Those are the two projects that I use, but I have entries for all the other thousands of projects, which in a sense is wasteful. The second point is that, like I said, this is a precomputed value, not really a cache, so it is a cache that only grows. It never shrinks unless a user is deleted or a project is deleted.
B
So you can imagine that the size of this table is only going to keep growing as GitLab grows. The next point is that maintaining any cache is hard, but maintaining the project authorizations cache is doubly hard, because any problem you have in this table is a severity 1 issue. If the access levels differ, then somebody is going to get elevated access, or somebody is not going to get the proper access.
B
A
C
So does the whole concept of customizable roles, since this is all about storing roles and access, make it even more complicated now?
B
Yeah. I was not part of the custom roles work, but if I understand correctly, we do not store the custom role in this table; it is calculated on top of that. So it definitely makes it more expensive. You have to find that this person is a maintainer, so they will have an access level of 40, but they may also have some owner rights, and that is calculated differently.
D
And this is also the exact reason why we first had the technical discovery on customizable roles: we were very concerned about this very issue, that this might have very poor performance.
D
So it is not entirely decoupled. We still have the uncustomized roles, the base access level, and that is reflected in this table. What we try to do is move away from relying on project authorizations for these specific permissions and rely on on-the-fly calculations instead. So we can check if the user has the custom permission that they need, but that needed to be very performant.
B
It's more like one on top of the other: the existing system continues, but you also check whether this user has any custom roles and calculate that as well. Okay, yeah. So we have this cache, but we need to keep it updated. Otherwise, like I told you, people don't get the proper access to the projects.
B
So the next question is: when do we update this cache? The update happens whenever we trigger an action that needs to change project authorizations: when you add a member to a group or a project, when you delete a member from a group or a project, or when you transfer a group. When you transfer a group, the hierarchy changes, so in that case you also have to update.
B
So all these actions that warrant a change to your project authorizations are when we trigger this update. To update project authorizations we use workers, which are Sidekiq workers. They do it asynchronously behind the scenes, so it does not hold up the request. The worker we use for this is called AuthorizedProjectsWorker, and AuthorizedProjectsWorker always refreshes authorizations
B
on a per-user basis. By per-user, I simply mean that the argument that goes into the worker to perform the job is the user ID of the person. As an example, when a user with ID 42 is added to a new project or a group, we would usually run AuthorizedProjectsWorker.perform_async(42), and that will do the calculations and figure out
B
whether the user with ID 42 needs to have access to any new projects or has lost access to any project, and we remove and add project authorization records, the rows in that specific table, as necessary. So that is what we are used to, and we now call this system the legacy system, because we have something new to do this.
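The per-user refresh described above amounts to comparing what the table currently holds for one user with a freshly computed set of authorizations, then removing and adding rows as necessary. The following is a minimal Ruby sketch of that comparison only. The function name and data shapes are hypothetical, and how the fresh set is computed from memberships is left out.

```ruby
# Hypothetical sketch of the diff step in a per-user refresh.
# Both arguments map project_id => access_level for a single user.
def refresh_plan(current_rows, fresh_rows)
  # Rows to delete: the project is no longer accessible, or the stored
  # access level no longer matches the fresh calculation.
  to_remove = current_rows.reject { |project_id, level| fresh_rows[project_id] == level }.keys

  # Rows to insert: newly accessible projects, or projects whose level changed.
  to_add = fresh_rows.reject { |project_id, level| current_rows[project_id] == level }

  { remove: to_remove, add: to_add }
end

# Example data for one user; 30 and 40 are just sample access levels.
current = { 1 => 40, 2 => 30 }          # what the table holds today
fresh   = { 1 => 40, 2 => 40, 3 => 30 } # what membership data implies now
plan = refresh_plan(current, fresh)
```

In this example the unchanged row for project 1 is left alone, the row for project 2 is replaced because its level changed, and a row for project 3 is added.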
B
So when we say the legacy worker, it is actually AuthorizedProjectsWorker that we are talking about. We had many challenges around AuthorizedProjectsWorker. The main challenge, as I told you, is that it works on a per-user basis, and per-user is a very expensive calculation. It is also not really necessary to do things on a per-user basis.
B
As an example, when a new user is added to a project, you do not need to fetch data about this user's access to all other projects. If I'm added to the GitLab project, essentially I need to add an entry for Manoj with an access level of 40 to the GitLab project. But what happens inside this worker is that it also fetches all of my existing project authorization records.
B
So that is a very expensive calculation, and that was the major challenge, because the time required to do this is quite high. When you have so many jobs in the queue that do this, you have infrastructure challenges; we couldn't meet SLOs on certain days. The worker that runs this does a recursive query, because we have a hierarchy: we have groups and then projects, and subgroups and then projects inside those.
B
So it used to do a recursive query, but in May 2023 we changed it to two linear queries. I think Abubakar worked on it. I still don't know if the feature flag is turned on, but we have been moving in that direction, which removes the expensive part. It's not recursive anymore; it is now based on linear queries, so it should improve the performance a bit. Next. Yeah, so, like I told you.
B
So around mid-2020, when we started having infradev issues around project authorizations, we used to have a lot of problems. That is when we decided to think about alternative solutions, something that could improve performance and not have so many infradev issues coming our way. The alternative we came up with is called a scoped worker, which would be much more performant. Let's look at scoped workers. The scoped worker we have right now is called ProjectRecalculateWorker.
B
Yeah, so this one does refreshes on a per-project basis. If you remember, the last one did refreshes on a per-user basis; this one does it on a per-project basis. For example, when a project with ID 40 is transferred from one group to another, we simply call ProjectRecalculateWorker.perform_async(40), where 40 is the project ID.
B
Inside this job, it is able to fetch all the members arising from all the different sources: from within the group, the ancestors, direct members of the project, and people from project-group links. It is able to fetch all of them together and figure out all of the users that have access to this particular project. That is what we mean by doing the work on a per-project basis. Yeah, so this is the diagram of the case I was trying to describe.
B
So a project P is moved from group A to group B, and now it is part of group B. You can imagine that there are users in group A and users in group B. Previously, users in group A had access; now they will lose access, and people in group B should get access. So what used to happen earlier
B
is that we used to take a union of all user IDs in group A and group B and run the legacy worker for all those users, which means multiple jobs were run for all those users. If there were 100 users combined in both groups, that was a hundred different jobs, so that people in group A would lose access and people in group B would gain access. But now this is much simpler. This is just one job.
B
We simply pass the project ID of that project, and it is able to figure all this out inside one single job. When we deployed it, we had a really good improvement in all the metrics we calculated, close to 99 percent improvement everywhere: in the number of jobs being generated, because this is N jobs versus one job, and also in the time spent in the DB and the number of queries fired. So yeah, close to a 99 percent improvement in efficiency there.
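To make the "N jobs versus one job" comparison for a project transfer concrete, here is a small Ruby sketch that only counts the jobs each approach would enqueue. The user ID ranges and job labels are made up for the example; nothing here calls Sidekiq or GitLab code.

```ruby
require "set"

# Hypothetical user IDs for the two groups (they overlap for users 41 to 60).
group_a_user_ids = Set.new(1..60)
group_b_user_ids = Set.new(41..100)

# Legacy approach: one per-user job for every user in either group.
legacy_jobs = (group_a_user_ids | group_b_user_ids).map { |user_id| [:user_refresh, user_id] }

# Scoped approach: a single per-project job for the transferred project.
project_id = 40
scoped_jobs = [[:project_refresh, project_id]]

puts "legacy jobs: #{legacy_jobs.size}, scoped jobs: #{scoped_jobs.size}"
```

With 100 distinct users across both groups, the legacy approach enqueues 100 jobs where the scoped approach enqueues one.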
A
B
Yeah, so everything was...
B
So everything was not really as nice as the last slide showed you. When we started getting good results with the new worker, we thought, why not apply the same thing everywhere so that we would have better results everywhere? That is when our problems started.
A
B
...to 100 different ProjectRecalculateWorker jobs. But we thought that, since the performance of this worker is far better than the legacy worker, it should be okay, and we shipped this change. This led to an incident. In detail, what happened during the incident is that the QA tests started failing, and on checking we came to know that too many jobs were waiting in the queue and the queue depth was increasing; it was higher than usual. We realized that it was because of the changes we deployed for group member updates. In simple terms,
B
what went wrong is this: it is a case of a frequent action giving rise to N jobs in one go, where N is a very large number. Adding a member to a group is, if you consider GitLab.com, a very frequent action, and when you add members over the API that happens in quick succession, and each of them is triggering a thousand jobs or more. Every job is in the queue, and then the queue gets clogged. So that is what happened in this case.
B
So this gave rise to way too many ProjectRecalculateWorker jobs, and they happened one after the other, because we have no control over how customers use the API; they can add a thousand different members to each group if they want. So this is what happened, and the takeaway we have from this incident is: never have N refresh jobs being generated from one action.
B
But if you have one refresh job generated per action, that is kind of correct. So the takeaway here is: when using scoped workers, use them for the right scope. Where we went wrong is that we had a project-scoped worker, but then we started using it for an action that takes place in the scope of a group. That should not happen. You should scope the worker correctly and also use it for the right scope.
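The fan-out behind the incident can be sketched with simple arithmetic. The Ruby helpers below are hypothetical and only count jobs: a project-scoped worker used for a group-level action creates one job per project under the group, while a worker scoped to the user (or to the group, which is only a proposal in the talk) creates one job per action.

```ruby
# Hypothetical helper: refresh jobs created by one "add member to group" action.
def jobs_per_action(projects_in_group:, worker_scope:)
  case worker_scope
  when :project
    projects_in_group # one job per project under the group
  when :user, :group
    1                 # one job for the affected user or group
  else
    raise ArgumentError, "unknown scope: #{worker_scope}"
  end
end

# Total jobs enqueued when the same action is repeated, for example over the API.
def total_jobs(actions:, projects_in_group:, worker_scope:)
  actions * jobs_per_action(projects_in_group: projects_in_group, worker_scope: worker_scope)
end

# Made-up numbers: 1,000 members added to a group that contains 1,000 projects.
wrong_scope = total_jobs(actions: 1_000, projects_in_group: 1_000, worker_scope: :project)
right_scope = total_jobs(actions: 1_000, projects_in_group: 1_000, worker_scope: :user)
puts "project-scoped: #{wrong_scope} jobs, user-scoped: #{right_scope} jobs"
```

With these made-up numbers the mismatched scope produces a million jobs for the same thousand actions, which is the kind of burst that clogs a queue.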
B
So we continue using a combination of both right now. Regarding the future, there are two different areas that we want to focus on. One is that we have a proposal for a group-scoped worker. We do not have it currently, and it could solve a lot of problems. Hopefully, if that happens, we might be able to retire the legacy worker and use group-scoped workers in areas where the legacy worker is currently being used, like adding a member to a group,
B
moving a group, group transfer, all of that stuff. We also want to remove the safety net workers from across the codebase. On GitLab.com the safety net workers run on the replica, so there isn't much damage being done that way, but on self-managed instances these safety net jobs run on the primary, and for such installations it might be a problem. So we want to remove the safety net jobs at some point. But before that we have to make sure that our new workers give us the right output.
B
We haven't been able to measure that yet, but once we measure it, and if the results are right, we can also remove the safety net calls from the codebase. This is how the safety net jobs function. The first line is the new worker, of course. With the second line, until we compare the consistency rates, we also enqueue the job in the legacy worker, but with a low priority and a one-hour delay.
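The slide itself is not captured in the transcript, so the following Ruby sketch only illustrates the pattern as described: enqueue the new scoped job immediately, and also enqueue a legacy per-user job at low priority with a one-hour delay. The queue class, job fields, and worker labels are all hypothetical and do not use Sidekiq's API.

```ruby
# Minimal in-memory queue; every name here is hypothetical.
Job = Struct.new(:worker, :arg, :run_at, :priority, keyword_init: true)

class FakeQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  # Times are plain integers in seconds to keep the example deterministic.
  def enqueue(worker, arg, now:, delay: 0, priority: :default)
    @jobs << Job.new(worker: worker, arg: arg, run_at: now + delay, priority: priority)
  end

  # Jobs whose delay has elapsed at the given time.
  def ready(now)
    @jobs.select { |job| job.run_at <= now }
  end
end

ONE_HOUR = 3600

def refresh_with_safety_net(queue, project_id:, user_id:, now:)
  # First line: the new project-scoped job, run right away.
  queue.enqueue(:project_recalculate, project_id, now: now)
  # Second line: the legacy per-user job as a delayed, low-priority safety net.
  queue.enqueue(:legacy_user_refresh, user_id, now: now, delay: ONE_HOUR, priority: :low)
end

queue = FakeQueue.new
refresh_with_safety_net(queue, project_id: 40, user_id: 42, now: 0)
```

Only the scoped job is ready at time 0; the legacy job becomes ready an hour later and would correct any row the new worker got wrong.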
B
D
I have one question. You described the incident where the queue was clogged with a bunch of jobs. I wonder what happens if, let's consider ProjectRecalculateWorker, let's assume that the queue is already clogged and a member is added to a project.
B
Yeah, that can happen, but the good thing with the new worker is that the deduplication key is the project ID, so in that case the jobs will be deduplicated. In the other case, the deduplication key was the user ID. So if you add different users to the same group, each job is unique, because the key is the user ID, and it is never deduplicated.
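The deduplication point can be illustrated with a small queue that skips a job when an identical one is already waiting. This Ruby sketch assumes deduplication is keyed on the worker name plus its argument; the class and worker labels are hypothetical, not Sidekiq or GitLab code.

```ruby
require "set"

# Hypothetical queue that drops a job if the same worker/argument pair is pending.
class DedupQueue
  attr_reader :jobs

  def initialize
    @jobs = []
    @pending_keys = Set.new
  end

  # Returns true if the job was enqueued, false if it was deduplicated.
  def enqueue(worker, arg)
    key = [worker, arg]
    return false unless @pending_keys.add?(key) # add? returns nil when already present
    @jobs << key
    true
  end
end

new_user_ids = [1, 2, 3, 4, 5]

# Project-scoped: every addition to project 40 produces the same key.
project_queue = DedupQueue.new
new_user_ids.each { project_queue.enqueue(:project_recalculate, 40) }

# User-scoped: every addition produces a different key, so nothing is dropped.
user_queue = DedupQueue.new
new_user_ids.each { |user_id| user_queue.enqueue(:user_refresh, user_id) }

puts "project-scoped pending: #{project_queue.jobs.size}, user-scoped pending: #{user_queue.jobs.size}"
```

Five additions collapse into one pending job when the key is the project ID, but stay as five jobs when the key is the user ID.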
C
You've talked a lot about these jobs getting stuck in the queue. What does that actually do? Does it bring down GitLab, or is it just a case of maybe you don't have access to the project you should? What is the user-facing impact of it?
B
So it's a mix of both, I would say. When jobs wait, the access policies are not applied correctly, so you do not get the access you should. And at a higher level, our Sidekiq systems as a whole come to a stop, because there are also other jobs in other parts of the codebase that need to use Sidekiq, and since it is clogged at this particular point, they also cannot be processed. It sort of leads to a cascading effect.
B
Maybe we can briefly talk about what JC has also asked. JC has asked whether the async nature of the project authorization worker has ever caused any problems, and I have replied to that. Async means that it happens behind the scenes: you add a member to a group and they do not get access instantly. It is near instant, because the job has to go to the background and be processed.
B
So we have no control over when that job finishes. For example, if jobs are stuck in a queue, it may finish after one minute or so. This was not always the case. We did not have async refreshes from the beginning; it used to be sync refreshes, which means that as soon as the request finished, you also had access to the project. But at some point we had to change that, because it is not always nice to wait on the request.
B
It will increase the response time, and it will also skew your metrics for error budgets and things like that. We also had other reasons to go async everywhere, so we shifted to async mode. But we are also aware of the fact that access is not instant; it is near instant, and we had to make that call.
B
C
B
To be honest, we haven't reached the stage where we have started to think about project authorizations yet, because users will still access projects. So we are confident that some version of the table will continue to exist. But I think we were also talking about whether group authorizations need to happen, because now organizations will have groups, and it might be much easier to precompute that value and store it somewhere rather than figuring it out on the go. So that might happen, I guess.
B
C
D
A related question to that: there's also the group/project consolidation effort, and I was wondering, since you mentioned the group authorizations table that you are considering, would it make sense to just create a namespace authorizations table and use that for both projects and groups instead of treating the two as different entities? Or are there benefits to keeping them as separate entities?
B
So what I think is that there are variations between projects and groups in how we get memberships, because there are group-group shares but there are no project-project shares, so variation exists that way. Even if you make it namespace ID, access level, and user ID, internally we will have to differentiate somewhere between a group and a project, then fetch their members and do the whole thing.
B
So an if condition would appear somewhere in that case. But yeah, having it as namespace would make sense, because then you can contain it in one single table. Otherwise you would have to split it into project authorizations and group authorizations.
A
C
D
Yeah, and I think this summary and the presentation you provided are very, very useful, because there's a lot of detail and historical context to project authorizations, and this was a very, very nice story. Yeah.
B
I always think that somebody in the group should be acting as the historian of Manage Auth, because there's a lot of context. At this point I think there is, but somebody needs to be there, because it is very easy to lose context. There is a lot of historical context around the considerations we made to reach this point.