Description
This is a brief overview of the architecture, design, and project setup for the External License DB.
Chapters:
0:00 - Architecture Overview
04:10 - Namespace Layout
04:30 - DB Schema
06:17 - Feeder
09:30 - Interfacer
11:00 - Processor
13:20 - Exporter
14:53 - Deployment
16:43 - Terraform Environments
18:58 - Deployment Jobs
20:00 - Documentation & Wrap Up
Hello everyone, I'm going to be walking through the architecture and some of the information necessary to understand how the external license database works and how it's designed and set up.
What we're looking at here is the architecture diagram of all the various components. Down below you will see GitLab, above you'll see Cloud Platform, and to the left are either instances or the package registries. These are the three major components that we are communicating with.
Everything pretty much starts within GitLab, where we have a number of repositories. These repositories are either containers that get pushed into a container registry (the GitLab one as well as the Google one), or they're binary releases that are run from our CI/CD jobs inside of the deployment repository.
So we have schema, which is the schema of the database that lives over here in Cloud SQL. The schema includes migration scripts, which I'll walk through. The processor is for communicating directly with the database; it's one of the only components that actually talks to the database, the other one being the exporter.
The interfacer is what listens for the package names coming from the feeders (which I'll explain in a bit). Those messages tell it how to communicate with the package registry: how to get the version information, pull out the licenses for those versions, and then push that over to the processor.
The exporter does what its name suggests: it communicates with the database and then exports all of the data in a structured format that the SCA team is aware of. The feeder itself is what initially kicks off the whole process of communicating with the package registry and getting a list of packages. Sometimes it gets versions, but usually it just gets the packages, and then it pushes them over a Pub/Sub topic, which is just a message queue. It pushes these package messages, and they go to the interfacer.

The interfacer then calls out to the package registries, gets the information for the licenses, pushes that over the licenses Pub/Sub topic, and then calls into the processor. The processor batches all of those up and inserts them into the database every minute or so. That way we don't destroy the database with thousands of connections; it batches them up nicely, so we keep the utilization pretty low. And that is roughly how this is designed.
One thing about the migration: it's actually deployed as a service, only because Terraform, which we use for deployment automation, doesn't support Cloud Run jobs. So it's actually running as a service, but once the deployment process kicks off, it will deploy a new version of the schema if there are any changes; otherwise it just does nothing.
The major components here are the deployment project, our GitLab container registry, the GitLab repositories, and the Artifact Registry for Google. We're actually pushing these containers from the GitLab registry over to the Artifact Registry inside of Google. That way, Google can pull them in for the interfacer, processor, migrator, and so forth.
A couple of things to note about the Pub/Sub: there is a nice little interface between Pub/Sub and Cloud Run. Cloud Run is just running a container as a service, but it also runs it as an HTTP server, and Pub/Sub communicates over HTTP. It also has some additional tricks it can do, such as deciding, depending on the number of Pub/Sub messages, how much to scale out the containers. So it'll tell each container how many messages it can accept at once.
Each container can accept n number of messages, as well as scale out to n number of instances. So when we talk about scaling out horizontally, we're talking about scaling out the number of instances; when we talk about scaling vertically, we're talking about the number of concurrent messages that each container can handle. I hope that makes sense. That is the design of our architecture.
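Those two scaling knobs can be sketched in Terraform roughly like this; the service name, region, image, and values are illustrative assumptions, not the real deployment's settings:

```hcl
# Sketch only: horizontal scale = instance count (maxScale),
# vertical scale = concurrent Pub/Sub pushes per instance.
resource "google_cloud_run_service" "interfacer" {
  name     = "license-interfacer"
  location = "us-central1"

  template {
    metadata {
      annotations = {
        "autoscaling.knative.dev/maxScale" = "20" # horizontal limit
      }
    }
    spec {
      container_concurrency = 10 # messages accepted at once per instance
      containers {
        image = "us-docker.pkg.dev/example-project/registry/interfacer:latest"
      }
    }
  }
}
```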
You should already have access to all of these. I'm going to briefly walk through each one of these repositories so you have a better understanding of how they're laid out and how they're structured. We'll start off with schema, because that's where everything starts from. Here we have our migrations. In each one of our projects, the main binary entry point is under the command directory, followed by the name of the project. So here we just have our project.
We are using the standard urfave/cli library. This one has a number of command line arguments that you will need to know. These also get set up in the Cloud Run Terraform deployment, which I'll cover in another segment or another video. So that's how our commands are set up. The scripts are all for the release and deployment process: the lib/scripts directory is how the deployment and the release get set up.
We are gating each release, so any time you want to actually release a version, you will need to go through and click the manual release button after you've committed and merged to main.
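In GitLab CI terms, a gated release job typically looks something like this; the job name and script path are illustrative, not the project's actual configuration:

```yaml
release:
  stage: release
  script:
    - ./scripts/release.sh   # illustrative script path
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual           # the "click to release" gate
```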
Migrations for schema are contained in here. We're using Goose, which is a nice migration system for Go, so you can have either SQL files or Go files. Sometimes you'll need to do more complex, dynamic work: communicate with the database, get information, and then do migrations that way. In those cases, you'll want to create a Goose Go migration script. In the standard cases, you'll just have your up statements and then, if you need to roll back, your down statements.
So that's how that is structured, and that's it for schema. Next, working forward, we're going to look at the license feeder. The license feeder is what kicks off the entire process. It's run as a CI/CD job, so keep that in mind.
If you're wondering how this whole process kicks off, it's a scheduled job within the deployment project. For building, we have the feeder here. Again, command/feeder is the main entry point, and it has a number of environment variables. These are usually set in the GitLab CI YAML, so you don't need to worry about them too much; we just need to register them, and then it pulls them off.

One thing that is kind of interesting about our design is that we try not to have passwords anywhere, so we use impersonation. The deployment project has a deployer key that we have set up to allow it to create tokens to impersonate other services, so we're basically dropping our privileges to do this type of work. In this case, we're just going to impersonate whatever user is provided to it, which I think is a feeder user. Then it goes through and, depending on the type of registry, it'll start feeding out the packages.
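The impersonation setup can be sketched in Terraform as an IAM binding that lets the deployer mint short-lived tokens for another service account; the account names here are assumptions:

```hcl
# Allow the deployer service account to create short-lived tokens
# for the feeder service account (i.e. impersonate it).
resource "google_service_account_iam_member" "feeder_impersonation" {
  service_account_id = google_service_account.feeder.name
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = "serviceAccount:${google_service_account.deployer.email}"
}
```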
If we look at the structure here, the main interesting parts are going to be the actual registry feeders, the ones that talk to the registry. We have Golang feeders, and we have most of these in by now; I think we're only missing one at this point, which is the RubyGems one.
It's all structured so that everything follows the same interface, a very basic interface of just feed and registry name. Then, depending on how they communicate with the registry, they're obviously going to work differently. There are also some helpers, like the publishers, which handle Pub/Sub. I can create videos on all of these.
If you want to go into more detail on the architecture and design of each individual project, I can do that, but I'm just going to go through it lightly right now. Again, most of these things have interfaces that allow you to test them. We also have the concept of dry runs, so you can see how it would work before you actually blast out millions of messages over Pub/Sub. Again, lib/scripts handles the release; in this case the feeder is a binary release, so it's pushing into the package registry.
So here you have each feeder version getting released. That's the feeder. The other thing is the bucket: this handles storing state for the feeder. The feeder will sometimes save cursors or timestamps so it can continue where it left off, and that uses Google's storage buckets.
Oh, and one other thing I should probably mention real quick: all messages are just JSON. So inside of the data you'll see a package message; this is just the Pub/Sub message that's basically encoding or decoding the data. Right now it's very simple: it's just a package registry, package name, version, and then any sort of metadata. Sometimes some packages will require additional information, so that's available there.
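As a sketch, a package message might look like this; the exact field names are assumptions based on the description above, not the real schema:

```json
{
  "registry": "pypi",
  "name": "requests",
  "version": "2.31.0",
  "metadata": {}
}
```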
So that is the message. Once it leaves the feeder, it's going to go over Pub/Sub into the license interfacer. The license interfacer again has a bucket and the same information inside of command. Again, we have the interfacer; the message senders are just for testing. Inside of this we determine or configure various things, like where to keep errors. If we fail to look up a package, we'll save it in GCP, and whether you want to do that or not is controlled by feature flags we enable for it.
So again, the interfacer is an HTTP server. Inside of the dispatcher you will see your usual HTTP server stuff, as well as handling of the incoming messages and determining whether a message is a dead letter, meaning that Pub/Sub tried it 10 times and it failed. In that case it's like, okay, we need to give up: it'll dead-letter the message and store the information that would otherwise have been lost into a GCP bucket, and then it just goes through.
Again, I don't want to go into too much detail here, but depending on the package registry type, it will then call into the interfacers, and once again we have it split up by package registry. So if it's PyPI, we're going to handle it this way, where we have this handle method that goes through, processes each one of the messages, and then looks it up.
It does its business, and then each one of the interfacers will return from the handle method back to the dispatcher. I believe it will say "interfacer handled message", and then, provided it's legitimate, it will push the message off and publish it, and we're done. So that is the interfacer. Next up is the processor.
The processor is what's communicating with the database, so this one does a little bit more setup. It's a little bit more involved because it has to communicate with Cloud SQL. Again, it is an HTTP server, so in this case we're creating a new server here and initializing the database before that, making sure we have all the information we need. That configures everything, and then inside of this server we are again handling the incoming HTTP message, or Pub/Sub message; it's the same thing.
Otherwise, we do something a little bit different. Since we need to batch up these messages, we also need to keep track of the requests that came in, because the way that Pub/Sub works is that when a message comes in, it's going to have a timeout of, I think, 600 seconds. So we need to track and close out that connection. If we just accepted the message and closed the connection, Pub/Sub would think it's done and it wouldn't be able to retry.
So we had to get a little bit fancy and create a data structure that has a channel inside of it, saying: okay, accept this message and put it into a queue, but leave the connection open, because we don't want to return until the batched insert has actually completed on the database side. That way we can track request failures. So we create this little data structure that queues it up, and once the batch insert happens, we close that channel.
Then it's able to return from the HTTP request, and Pub/Sub knows that that message has been handled and can continue to send out a new one. So it's a little bit funky there, but it's the best approach we could find that avoids creating thousands of connections to the database or having thousands of insert statements. It uses a different style of batching things up: copying things into temporary tables and then inserting them. Again, I can do another video on that one as well.
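The temporary-table pattern looks roughly like this in SQL; the table and column names are made up for illustration, not the real schema:

```sql
-- Stage the batch in a temp table, then do one set-based insert
-- instead of thousands of individual INSERT statements.
CREATE TEMPORARY TABLE tmp_licenses (LIKE licenses INCLUDING ALL) ON COMMIT DROP;

-- The batch rows are bulk-loaded here (e.g. via COPY), then merged:
INSERT INTO licenses
SELECT * FROM tmp_licenses
ON CONFLICT DO NOTHING;
```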
So that's the processor. Next we have the exporter. At this point all the data should be in the database. The exporter is going to run as a CI/CD job, again from the deployment project, which I'll cover in a second. It has the main exporter command here, which again uses the same structure of reading environment variables, configuring the connection to the database, and then how you want to export the data: from what start period until what end period. So that's the exporter main.
There's a little bit of complexity around how we're monitoring how much we've written to the object, because we're rotating it much like an Apache log rotator: after, say, 10 megabytes, it'll rotate to the next file. So there's a little bit of complexity in tracking all that.
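Size-based rotation can be sketched with a small byte-counting writer; this is a simplified illustration, not the exporter's actual implementation:

```go
package main

import "fmt"

// rotatingWriter is a sketch of size-based rotation: it counts bytes
// and advances to a new "part" once the limit would be exceeded,
// similar to how the exporter rotates its output objects.
type rotatingWriter struct {
	limit   int64 // max bytes per part (the walkthrough mentions ~10 MB)
	written int64
	part    int
}

func (w *rotatingWriter) Write(p []byte) (int, error) {
	if w.written+int64(len(p)) > w.limit {
		// In the real exporter this would close the current GCS
		// object and open the next one.
		w.part++
		w.written = 0
	}
	w.written += int64(len(p))
	return len(p), nil
}

func main() {
	w := &rotatingWriter{limit: 10}
	for i := 0; i < 4; i++ {
		w.Write([]byte("abcdef")) // 6 bytes each write
	}
	fmt.Println("parts used:", w.part+1)
}
```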
But again, we can make another video to cover that so everyone's aware of how it works. Then it's the usual: connecting to the database, creating a cursor to iterate through and pull out the components from the database, and then storing them as CSV files in a public GCP bucket. That is probably the major difference with this one.
On to deployment. Deployment is the heart of this whole project. It's made up of Terraform, and we're actually using all of the GitLab features: we're using Terraform with GitLab's HTTP backend, so we're storing all the state inside of the deployment project. That way, it's able to remember exactly what has already been deployed. As for how the pipeline works, let's look at Pipelines in the deployment repository.
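The GitLab-backed state setup looks roughly like this; the address is a placeholder, and in practice the values are usually passed via `-backend-config` in CI rather than hard-coded:

```hcl
terraform {
  backend "http" {
    # GitLab's Terraform state API for this project; <project-id> and
    # the state name ("dev") are placeholders.
    address        = "https://gitlab.com/api/v4/projects/<project-id>/terraform/state/dev"
    lock_address   = "https://gitlab.com/api/v4/projects/<project-id>/terraform/state/dev/lock"
    unlock_address = "https://gitlab.com/api/v4/projects/<project-id>/terraform/state/dev/lock"
  }
}
```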
The pipeline is made up of a number of steps. Once you merge some change, say you modify a Terraform file, it goes through its validation and builds out the container information. It's actually pulling containers from GitLab and pushing them into Google's cloud container registry, called Artifact Registry. We're doing the same thing for production, so we're keeping them separate. Then it goes through a stage where you have to manually release to the development environment. So what happens in these stages?
It's going to prepare the plan: Terraform will create a plan of how it's going to change the infrastructure, and then to actually apply that plan, you have to click this play button. Once you've confirmed your changes work in dev, you will then click the play button for prod. Again, you can change this if you want, but I figured this would make the most sense for people who are coming into this project.
The way that deployment is set up, we opted to use these different environments. Local is for your own environment: if you're testing locally, I created some helper scripts to, for example, push images from GitLab's container registry to your personal GCP project's container registry, as well as automate the whole Terraform plan-and-apply for you. Each one of these environments has a main.tf, which defines the infrastructure you're going to create, and it references modules.
The variables file determines the type of variables that this deployment requires, and then to actually provide the values for them you have a tfvars file. That's where this is all set, so you're going to have to modify it for your particular instance, for example your email address and whatnot. The environments, again, are helpers for applying stuff to your environments, so definitely take a look at those; we have guides on how to set all this up in the project.
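The variables-plus-tfvars split can be sketched like this; the variable names and values are made up for illustration:

```hcl
# variables.tf: declares what the deployment requires
variable "project_id" {
  type        = string
  description = "GCP project to deploy into"
}

variable "alert_email" {
  type        = string
  description = "Where to send notifications"
}
```

```hcl
# dev.tfvars: supplies the actual values
project_id  = "example-dev-project"
alert_email = "you@example.com"
```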
Prod right now is not set up; we're leaving that as an exercise for SCA, or we're going to work with you to get it built out. Honestly, it's just copying what's in dev, putting it into prod, and changing some names; it's not that much work. The difference here is that we have the backend.tf. This is where it's actually storing the state information for dev. Again, we keep the environments separate, so this one has the backend information for that.
That way all your state is stored in GitLab and not on the developer's desktop. Once again, we have the dev tfvars; this is all the same information, broken into each module. And that's how the deployment works. Now, for the actual process of that deployment, you can look into this file. We're using GitLab's new secure files.
We have a single key JSON, or two actually: one for dev and one for prod. That's the secure file, which is the service account JSON file. There are also the dev and prod environments, and then the registered versions. So if you want to, say, change the processor, you will go through the processor project, release it, and then you'll have to bump these versions manually, create an MR in this deployment project, release it, and then click play to deploy to dev.
The pipeline has a number of steps: validation, which I showed earlier, and then we have these feeder and exporter job stages, which are only run using rules that say "only if this flag is set", which comes from the CI/CD schedules. Right now we have these set up so that if you want to kick off, say, an npm run, you just click play. Eventually these will be scheduled, but this gives you an idea of how it's currently set up.
If you are feeling overwhelmed and you just don't understand, and you need to change something or get some tests in, we did create a significant amount of documentation. It covers pretty much everything you need to know, from preparing your environment, to how to communicate with the database, to developing schema changes. Each project has its own documentation that you can go through.
For example: if I need to create a new feature for the processor, do all these things. It'll walk you through deploying to your personal environment to test, deploying on your local development setup, and then deploying it properly using that local environment I showed, with the apply and push-local-images shell scripts. Once you've confirmed it's working, you can move on to changing the dev environment and the prod environment to reflect any changes, if you add a new environment variable or whatnot.
All of this information is here again for each individual project, so hopefully that helps you, because again, this is kind of an overwhelming system. We also have the architecture in here, as well as the security guide. We've already supplied this information to the security team, so they are aware of it. And that pretty much wraps up the introduction to the external license database. I hope that was helpful, and I will do further videos if required. Thank you.