From YouTube: Database Office Hours 2021-01-27 - Thin cloning demo, Database Lab, Database Migration Testing
Description
Agenda doc (internal): https://docs.google.com/document/d/1wgfmVL30F8SdMg-9yY6Y8djPSxWNvKmhR5XmsvYX1EI/edit#
We give a demo of accessing database thin clones using psql, talk about Database Lab, and discuss our plans to implement fully automated database (migration) testing.
B: All right, so this is the Database Office Hours call, and we can start right off. Today I put a topic on the agenda about the thin cloning that we're doing using postgres.ai and Database Lab.
B: I wanted to give a quick demo of how that looks with psql, and we can talk about how we plan to use it going forward. I don't think there are any other topics on the agenda; feel free to add some.
B: If there are more... I think everybody's using postgres.ai right now, right? Has anybody used psql access to a thin clone already? So basically, you can use the postgres.ai Database Lab: you go to the UI, you log in, you use Database Lab, you can run queries, you can get query plans, and that's all great.
B: You can explore the data there, but you're directly connected to a production replica, which means it's part of the production cluster. You don't want to mess with that too much, and it's also read-only, so you can't even create temporary tables, or tables for that matter, or any indexes.
B: You can't mess with the data, you can't change anything, and that is sort of a limitation. With thin clones, what you can do is actually grab a thin clone, make it your own, and then you have a fully read-write database cluster. You can use psql to access it, you can create indexes, you can change data, or you can export data any way you want. Then you start over: you create another thin clone and start fresh, and that's all within seconds.
B: Would it make sense to run through a quick example of how that looks? Cool. This isn't very well documented yet, so that is something we still have to do; otherwise we would just point to the documentation for this. I'm just going to share my screen, just a second.
B: So basically, we currently have one GCP instance that runs Database Lab. It's not exposing anything on the public network except for SSH, so what we have to do is a bit of SSH port forwarding to get there. There are basically two things that we want to do. One is talk to the API; it has a nice command-line...
B: ...tool that you can use, and a nice API that you can use to create thin clones. In order to do that, we still have to expose the API somewhere, because it's not on the public network. So what I do here, and this is what you can do once you're set up with your SSH key and all that, is basically just forward the API port to my local machine, so the app is available on that port on my localhost.
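A rough sketch of that port forward (the hostname, user, and port number here are placeholders, not the actual internal values):

```shell
# Forward the Database Lab API port from the private GCP instance to
# localhost over SSH; host, user, and port 2345 are all placeholders.
ssh -N -L 2345:localhost:2345 dblab-user@dblab.internal.example.com
```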
B: You bet. All right, thanks for noting that. So this is just the SSH port forward for the API. That gives us the one instance that we're talking to, and then it's really as easy as using the dblab CLI tool: you can do dblab clone create, and then you basically specify the postgres user name and password that you want to use.
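A sketch of that invocation (the flag names follow the dblab client; the user name and password values are placeholders):

```shell
# Create a thin clone, specifying the postgres credentials that will be
# created for connecting later; values are placeholders.
dblab clone create --username dblab_user --password super_secure_password
```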
B: So this is the postgres account that's going to be created for you to connect with later, and that's basically it. What you're not seeing here is that there is a token that you have to configure, but that is basically all you need to do to create a clone. And this is the time it takes: about 10 seconds, and you have the thin clone available.
B: What you're getting back there is the connection information. So basically, now on the Database Lab instance we have a full postgres cluster running on this port, and in order to connect to it we can use the user and password combination we specified before, with the super secure password. And we have to remember the port here, because this is what we have to forward again: it's only exposed locally on the Database Lab instance, so you can't connect to it from the outside.
B: I forward that to my local port, using the same instance again, and then what I can do is just use my psql client, or whatever you like, any UI tool. You can connect to that local port, and you're connected to a full read-write thin clone of the production database.
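Sketched out, the second forward plus the connection might look like this (host, port, and credentials are again placeholders; the clone's actual port is the one reported back by the clone creation):

```shell
# Forward the clone's postgres port to the local machine, then connect
# with psql; 6000 stands in for the port the clone was assigned.
ssh -N -L 6000:localhost:6000 dblab-user@dblab.internal.example.com
psql "host=localhost port=6000 user=dblab_user"
```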
B: So you can see that here. This is the actual copy of the production database, terabytes in size. It's been renamed: gitlabhq_production is the actual production database name, and upon thin cloning it is renamed, because you want to have some indication that you're not working on a production instance.
B: Otherwise it's very easy to mess up and drop tables in the wrong console, I guess, so the renaming helps in that regard. And then you can do anything you want with that instance; it's fully your own. I can create indexes...
B: I can drop tables, update data, everything, and then I can just recycle it, or create another thin clone and start over, basically.
B: That is a good question; it's something we are figuring out currently. I think it's an access request, and I don't know yet about the routing, who takes that currently. This is the...
C: I guess, because it's production data, it needs to go through the usual path. I mean, we need to make sure that only people who are allowed production data access get this access.
B: Yes, I think so too, for the SSH key setup. For postgres.ai, anybody can basically start using it with a GitLab email address: you can log into the product on the site and start using it. But then you can only access the UI; you can only use Database Lab, where you get the query plans and all that, but you don't have a way of accessing the data directly like we just saw in the demo. So yeah, for...
B: All right, and then just to show a quick use case we had this week: there was a request for changing a bit of data in the database, and we basically used our Database Lab and a thin clone to prepare the data, because what you can do is create some tables, which is something you can't do on a replica. And then, on the local machine...
B: I have a CSV file there, with some data in it that I want to import, and I can just go in and use \copy for that, which basically allows me to copy the CSV file into that table. That goes from my local machine to the postgres cluster.
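Both directions use psql's client-side \copy; a sketch, with made-up table and file names and the placeholder connection details from before:

```shell
# Import a local CSV into a table on the thin clone, then export query
# results back out; \copy runs client-side, so the files stay local.
psql "host=localhost port=6000 user=dblab_user" \
  -c "\copy my_table FROM 'data.csv' WITH (FORMAT csv, HEADER)"
psql "host=localhost port=6000 user=dblab_user" \
  -c "\copy (SELECT * FROM my_table) TO 'out.csv' WITH (FORMAT csv, HEADER)"
```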
B: I can now work with that, and then you can do all sorts of things. We ended up running a bunch of queries and then exporting the data again using \copy; it works the same way in the other direction, copying from the cluster to your local machine and exporting CSV. That's really useful for preparing those changes, but you can also...
B: I use it basically on a daily basis: whenever I want to interact with the database, I don't connect to the production replica anymore.
B: Cool. Then basically the only thing left is the lifetime of that clone: if you're not using it anymore, it's going to be recycled and destroyed after a couple of hours. I think there is a setting where you can prevent that from happening, much like the termination protection in GCP. Other than that, it's going to destroy itself after a while, or you can...
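A clone can also be removed explicitly instead of waiting for the automatic cleanup; a sketch with a made-up clone ID (the subcommand follows the dblab client's clone commands used above):

```shell
# Tear down a thin clone by its ID when you're done with it
# (the ID is a placeholder).
dblab clone destroy my-clone-id
```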
D: I think I recall, when this was first being discussed as a possibility, that every time you make a clone it takes up more and more space. Are there still any concerns about that type of limit if a lot of people are making these clones?
B: Yes, this is still a concern. The disk space is obviously not unlimited, and the longer you leave a clone around, the more space it takes, roughly. And obviously, when you're making changes to the clone, that also takes more space. But it's not the case that an additional clone is a full copy of the data; it's incremental, in a sense, and at the moment I wouldn't be concerned about it. If we run into those limitations, then yeah, we'll need to do something about it.
B: And then the usual caveats apply as well, just like with Database Lab: this is not a production instance, so it has a couple of differences. In terms of instance type it's much smaller, and it's also based on ZFS, for thin-cloning reasons. That gives you different characteristics, so the performance can be much different compared to production.
B: There is interesting work going on where Nikolai is proposing to implement an estimator: based on the performance that we see on Database Lab, what would we expect to see in production? We would be able to estimate that for timing numbers, for example, but that is still ongoing work right now.
B: I think you can do both. I can only tell from my own usage of this: I've grown very used to just creating a thin clone and working with that. You're more flexible with what you can do, and you're not at risk of running gigantic queries and breaking the production replica.
B: On the flip side, it sometimes takes longer; some queries are just very slow compared to the production replica. But if you can manage that... personally, I use the thin clones very, very...
B: Exactly, and for analytical queries, in situations where you don't expect to change anything and so don't really need a writable cluster, there is always the option to use the archive replica, where we don't have those statement timeouts either, most of the time. I would expect that to be faster than working with Database Lab for those queries, but whenever you need the flexibility of being able to change things, Database Lab is the only option you have.
B: Yeah, and going forward, what we would love to do, and what is a very natural thing to do with those thin clones (we've talked about this before), is connecting your development environment to one and running migrations on a thin clone. This is something that we're driving forward right now. Connecting your development environment is still not recommended, and it's probably never going to be recommended, because of the security concerns associated with that.
B: But what we are going to have is an environment where, for example, database migrations are kicked off automatically. This is a locked-down CI environment that automatically runs those migrations for you, and you get some feedback on the MR.
B: So this is what we're currently working on getting going. There's a very minimal product out there; I linked an example with the feedback that you would be getting back. The workflow is basically: you push a change with a migration, that kicks off CI, which picks up another pipeline in a locked-down environment. That pipeline runs the migrations for you: it grabs a thin clone, like we just did, runs the migrations, and gathers some statistics.
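One common way to wire up that kind of handoff is GitLab's pipeline trigger API; this is only an illustrative sketch (the host, project ID, token, and ref are placeholders, not the actual setup):

```shell
# From the regular CI pipeline, trigger the locked-down testing
# pipeline on the private instance; all values are placeholders.
curl --request POST \
  --form "token=$TRIGGER_TOKEN" \
  --form "ref=master" \
  "https://ops.example.net/api/v4/projects/1234/trigger/pipeline"
```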
B: Right now it's just runtime, so it only reports back the run time for those migrations, but we have a lot of discussions going on about what we can add. You can do query statistics, you can do lock observations (what kind of locks does this migration take, and are those dangerous or not), and stuff like that, and all of it would be reported back on the MR. So you get a comment with all those details, and for database maintainers...
B: Okay, so this is in the database team group; there's this gitlab-com database testing project.
B: We have just renamed it from migration testing to database testing, because migrations are a huge part of what we can do, but we can also do more: we're thinking, for example, about automatically getting query plans for your changes. So this is more than just migration testing, at least in the idea.
B: We can see how we get there. There is a bit of a README, but the basic problem that we're solving... I mean, taking migrations and running them on a thin clone is not very difficult, right? We've just seen how we create the thin clone; you can connect to it using psql and, of course, you can also configure your GDK environment, or GitLab environment, to run those migrations against it. So that's not super difficult to do.
B: What is most concerning in that context is the security aspect, and the fact that we're working with a full copy of the production database. This is considered red data, the most important data for us, the data we have to protect the most. As such, what we can't do is just add a job to our regular CI pipeline.
B: If you, you know, got creative about that, you would be able to inject something into an environment where it runs against the production data, observe the output of that, and potentially also, I don't know, copy that data somewhere.
B: The regular pipeline has no limitation with regards to network isolation and those kinds of things, so we can't have it in that open way, and that's why we need a more locked-down setup. What we're basically working on right now is having a separate project; this is what you can see here. It's being mirrored to the ops instance: we have the ops.gitlab.net instance, which is a private instance that we run, and that has a mirror of the project where those pipelines execute.
B: But basically, the idea is that we have this builder. It's a standard shell executor that builds Docker images for you; it doesn't do anything else. It doesn't run any code; it just builds Docker images and pushes them to the registry that lives on the other runner. The worker runner executes from that registry, and it doesn't have any other network connectivity, so it can't really connect to the outside world except for its own local registry.
B
So
it's
basically
similar
to
the
idea
of
how
do
you
assign
ssl
keys,
where
you
can't
have
any
network
connectivity
used?
You
build
something
on
a
regular
network
and
then
you
inject
that
into
a
locked
down
environment
where
you
do
the
signing
of
the
running
and
the
more
security
related
stuff,
and
this
is
the
environment
that
you
control
more.
This
is
what
we're
doing
here
as
well
internally.
This
is
spinning
up
a
couple
of
services,
so
one
is
redis,
there's
a
standard
radius.
B: It exposes the postgres port, but in fact it just forwards to the Database Lab instance, and that is the only hole in the network here. Other than that, this container basically runs GitLab Rails, and on top of all the isolation that we already have, so on top of limiting the network for this runner, we also have iptables rules going on in this container.
B: So there is no network connectivity except for the local Docker network talking to those services, and this is actually where the migrations execute. It can't talk to anything other than the postgres and the Redis; that's it. It executes the migrations and basically produces an artifact, which is a JSON file with all the statistics. Right now that's just the information about which migrations ran and how long they took, basically, but we would drop in all those query statistics and everything else we want to report on.
B: We can also communicate back to the gitlab.com merge request: using the JSON file, we push a comment, or whatever makes sense, to the original merge request with those statistics and reports that you can see here. Yeah, what we currently have is the runtime of the migrations; this can become much more.
B: Yeah, I think that's basically the idea. And yeah, the biggest concern is really the security side: how do we make sure that nothing escapes, and that we have some controls over what kind of code runs on production data and who can see the output of that code.
B: Cool, really excited about that as well. I think it's a major step for us to run this automatically and get the feedback. We hope it's going to be useful.
B: And we just added that, or we're just about to merge it; I think it is being triggered already. So there's going to be a job on the regular CI pipeline soon that triggers those testing pipelines. It's probably not going to be available for all MRs.
B
So
we're
still,
you
know
early
phase
testing
that,
but
what
I've
already
seen
is
that
this
is
also
very
fast
in
terms
of
how
fast
you
get
feedback
from
that
the
job
kicks
off
very
quickly,
so
pretty
much
in
the
beginning,
beginning
of
the
regular
pipeline
we
triggered
the
other
pipeline
and
given
that
most
of
the
things
are
being
cached,
so
the
docker
cache
is
pretty
good,
there's
not
much.
We
need
to
do
on
that,
and
the
thin
cloning
is
very
fast
as
well.
It
takes
like
10
seconds
to
get
that
clone.
B: Oh yeah, and in addition to that, there is the blueprint for database testing; I also linked it on the agenda. It's basically a summary of what we just talked about. If you want to leave feedback on it, that's also much appreciated.
B: You have a nice shirt, Steve, but I can also only see half of the tanuki. It's the standard GitLab shirt, right? Nice.
B: We just got some swag as well. It's a bit hard to put on.