From YouTube: Database Office Hours - 2019-09-28
A
So the question was: there are two operator classes you can use, one is for the trigram index and the other one is for text patterns, and the difference is basically that one is actually a GIN index. So the trigram index is a GIN index, whereas the pattern operator class is typically used on a B-tree index. And the question was whether we should prefer one or the other, and apparently we have a mix of those.
A
So we have mostly trigram indexes, but a few occasions where we use the pattern ops class, and I think it really depends on what we're trying to do. The trigram index is much more powerful in a sense, because you can actually use regular expressions, and a lot of those searches can be answered by using the trigram index. Whereas the pattern operator class on a B-tree only allows for prefix searches.
A
So when you do ABC-wildcard, that's a good way to use the pattern ops class. But if you do wildcard-ABC, then that's a problem, because you can only go into the B-tree from the top, basically, and that works for any prefix search. So even if you do foo, wildcard, ABC, wildcard, it would basically perform a prefix lookup for foo, so for the first string, and then scan all the records and can't make use of the index anymore.
A
Basically, when we're only interested in prefix searches, then the pattern ops is fine. If we want to do more than that, and we typically do wildcard searches at the beginning and the end (so when you search for something, you do wildcard, search term, wildcard), then I think the trigram index is more useful.
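To make the contrast concrete, here is a minimal Python sketch of the trigram idea (my own illustration, not pg_trgm's actual implementation; pg_trgm's word-boundary padding and index recheck machinery are simplified away):

```python
def trigrams(s: str) -> set[str]:
    """Extract the set of three-character substrings (trigrams).
    (pg_trgm also adds word-boundary padding; that is left out here.)"""
    s = s.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}

def candidate_match(indexed_value: str, search_term: str) -> bool:
    """A row can match '%search_term%' only if its trigram set contains
    every trigram of the search term. This is how a trigram index narrows
    the candidate rows; survivors are rechecked against the real pattern,
    because the containment test alone can produce false positives."""
    return trigrams(search_term) <= trigrams(indexed_value)

# Infix search: answerable via trigrams, unlike a B-tree prefix scan.
assert candidate_match("postgresql", "gres")
assert not candidate_match("hello", "world")
```

Because the index only stores which trigrams occur where, it does not care whether the wildcard is at the start, middle, or end of the pattern.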
A
GIN stands for generalized inverted index, if I'm not mistaken (let me look it up), but basically, you can call it an inverted index, and you can think of it as a hash map where you have a key, and then the key points to all the records that somehow contain this key.
A
For example, text searches in Lucene or Elasticsearch also use inverted indexes, where you basically extract all the terms from a document, so all the words from the documents that you have, and then you create an inverted index that contains each of those words as the key, pointing to a list of documents that contain this word. And then you can do fast lookups by word.
A
So if you're interested in knowing which documents contain a particular word, you just go to that sort of hash map, retrieve that list, and then you're done, basically. But it can also get more complex: let's say you're interested in two words, or a phrase; then you can basically make those two lookups.
A
You get two lists of documents, one for the one word and one for the other word, and if you're interested in knowing all the documents that have both, you basically AND those lists, in a sense. That's sort of how the general index structure works. For a trigram index, it basically means that the keys are trigrams, so consecutive three-letter sequences, and then it is quite interesting to look at how a regular expression gets translated into those lookups.
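The hash-map-with-posting-lists picture above can be sketched in a few lines of Python (a toy illustration of the inverted-index idea, not GIN's actual on-disk structure):

```python
from collections import defaultdict

def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each word (the key) to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

docs = {1: "the quick brown fox", 2: "the lazy dog", 3: "quick dog"}
index = build_inverted_index(docs)

# Single-word lookup: one hash-map access returns the posting list.
assert index["quick"] == {1, 3}

# Two-word query: AND (intersect) the two posting lists.
assert index["quick"] & index["dog"] == {3}
```

For a trigram index the keys would be trigrams instead of whole words, but the lookup-and-intersect mechanics are the same.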
A
I don't think that would work, because the trigram is something you extract from your document, right. It's not that the B-tree... or let's maybe look at how the B-tree works in this case for the prefix search.
C
Hi, hey. So yeah, a B-tree is only for values on one single axis, right, because it's one-dimensional; an R-tree is for two-dimensional values. For trigrams we actually deal with sets, like arrays, so collections of three-letter objects, sets of character objects. It's possible to use trees, but it would be a so-called Russian doll tree, an RD-tree, so we impose this...
C
...had their own operator class, right. Yeah, yeah, I know, I understand what you mean. If we have something like a search term followed by a percent mark, so if the percent mark goes only at the end, it's possible to use a B-tree, but under the hood it will actually be two comparisons, right. So actually we will have two iterations, a greater-than and a less-than, so again it will be single-dimensional.
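That point can be sketched as follows: a prefix search on a one-dimensional ordered structure is just two boundary comparisons. Here a sorted list stands in for the B-tree, and the `"\uffff"` upper bound is a simplification of how a real planner derives the range:

```python
import bisect

def prefix_range_scan(sorted_keys: list[str], prefix: str) -> list[str]:
    """Emulate LIKE 'prefix%' on an ordered structure: descend once to the
    lower bound (>= prefix), then scan until the upper bound (roughly,
    the smallest string greater than every string with that prefix)."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    return sorted_keys[lo:hi]

keys = sorted(["foo", "foobar", "food", "bar", "fop", "fo"])
assert prefix_range_scan(keys, "foo") == ["foo", "foobar", "food"]
```

A trailing wildcard maps cleanly onto this one range; a leading wildcard gives you no lower bound to descend to, which is exactly why the B-tree approach stops working there.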
A
In any case it's going to be a problem. No, but say today you have a default value of ten and a lot of records in there. You change it to 20, and new records are going to get 20, unless, you know, they're using the default. And only if you create a new column and put a default value on it: today, with 9.6, you would basically rewrite the table, but with 11 that's not even necessary.
A
Know,
that's
when
you,
when
you
add
a
new
column
right,
you
can,
you
can
either
add
a
column
and
say
it's
it's
nullable.
So
all
the
records
will
be
null.
That's
a
cheap
operation
to
do,
but
as
soon
as
you
want
those
records
to
have
a
default
value,
the
then
all
those
existing
records
they
would
get
this
default
value
right
and
then
for
for
the
absolute
end
version
for
post
quiz.
A
That doesn't mean that, as soon as we add this column with a default, we're rewriting the whole table; with 11 it's a very cheap operation to do that, even with the default value. I think it just keeps track internally of the new default value, without touching existing records.
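A toy Python model of that PostgreSQL 11 behavior as I understand it (the class and method names here are made up; the real mechanism stores the "missing value" in the catalog rather than rewriting rows):

```python
class Table:
    """Toy model of PostgreSQL 11 'fast defaults': adding a column with a
    default records the default once instead of rewriting every row."""
    def __init__(self):
        self.rows = []               # each row is a dict of column -> value
        self.missing_defaults = {}   # column -> default for pre-existing rows

    def insert(self, row: dict):
        self.rows.append(row)

    def add_column_with_default(self, name, default):
        # O(1): no existing row is touched; the default is remembered
        # in the "catalog" and filled in when old rows are read.
        self.missing_defaults[name] = default

    def read(self, row_index, column):
        row = self.rows[row_index]
        return row.get(column, self.missing_defaults.get(column))

t = Table()
t.insert({"id": 1})                    # row written before the column existed
t.add_column_with_default("retries", 10)
t.insert({"id": 2, "retries": 20})     # new row stores the value directly
assert t.read(0, "retries") == 10      # old row: default supplied at read time
assert t.read(1, "retries") == 20
```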
A
That's a pretty interesting one. It's about creating an index on an expression, and the expression consists of basically figuring out what the greatest timestamp is of things. So we have three different timestamps, some with time zones and some without, and basically we want to index the greatest timestamp of those three. And that's a problem, because the expression isn't immutable, and you have to use immutable expressions for indexes. I haven't figured that one out, so I don't know if you want to take a look, but it's kind of interesting.
C
No, if you have timestamp without time zone, it should work for indexes; it should index easily, because it's kind of like text. It's just a value, and that's it; it doesn't depend on our session variables or on our time zone or anything. But with time zone it will be tricky to index, because the value that goes into the index depends on our time zone setting. So we need to convert it to the same time zone, UTC for example.
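A small Python illustration of that point: a timestamp without time zone is just a value, while rendering a zoned timestamp depends on which zone you pick, so only a conversion pinned to a fixed zone is deterministic (in PostgreSQL the analogous pinning would be an explicit conversion such as `AT TIME ZONE 'UTC'`, which is how one might make the expression index-safe):

```python
from datetime import datetime, timezone, timedelta

ts = datetime(2019, 9, 28, 12, 0, tzinfo=timezone.utc)  # one fixed instant

# Rendering the same instant in a session-dependent zone gives different
# wall-clock values, so an index built on that would be unstable:
berlin = timezone(timedelta(hours=2))   # stand-ins for session time zones
tokyo = timezone(timedelta(hours=9))
assert ts.astimezone(berlin).hour != ts.astimezone(tokyo).hour

# Pinning the conversion to UTC is deterministic: same input, same output.
assert ts.astimezone(timezone.utc).hour == 12
```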
D
So now that we have our single codebase, we need to be very careful of the... well, long story short: on GitLab FOSS we are not executing the tests, the tests specific to GitLab FOSS, which means that we need to catch these errors ourselves, as reviewers, and specifically for background migrations. Background migrations that live in the GitLab background migration directory: we need to ensure these ones are not referencing EE-specific models, and the same for the specs.
D
The specs for these ones should not reference EE-specific factories, because the pipeline does not catch this type of error. They are going to be green, but when these changes are transferred to GitLab FOSS, they are going to fail, because they are referencing a model that does not exist on GitLab FOSS. So we are discussing ways we can automate this, but until that gets done, this is going to be a manual job.
A
All right, I wanted to call out something that recently got a bit of traction: we generally want migrations to be reversible. We already require that for regular migrations, so whenever you write a migration, you want to look at the up method as well as the down method, so we can revert it easily. That isn't always true for heavier migrations, where we have post-deploy migrations that do data migrations, and it's never the case currently for background migrations.
A
So for a background migration, you just kick off the job and get that scheduled, but you don't have a way of stopping it gracefully or reverting it in the end. So whenever we run into issues on the production side with a background migration, we basically manually kill all those jobs, right. We don't have a good interface to say: stop this migration, and now please revert it, because it's causing issues. And we're starting to discuss requiring that reversibility of migrations.
A
All right, and the other topic I wanted to call out: we were discussing the different database roles, or database maintainer responsibilities, and I was hoping we can clarify what the responsibilities of a maintainer are, and maybe also how that relates to the other roles that we have for database. I just wanted to ask if there is still some unclarity around those roles or the responsibilities, and if yes, where we can clarify, because it wasn't totally obvious to me where we can put that information.
D
Like, maybe we can just add a paragraph there explicitly saying that the database maintainer is a person that is knowledgeable about the GitLab codebase, that is not related to an infrastructure role, and that has a good eye for performance improvements; it can be a backend engineer or a database engineer, this new role.
D
Well, I think a backend maintainer and a database maintainer are similar, but, as an example: as a backend maintainer, when I do a backend maintainer review, I don't normally review the migrations very thoroughly, because I trust the database maintainer to do that, and I also don't review the background migrations. So when I am reviewing a merge request as a database reviewer, I do check the background migration, I do check the timing, how many records are going to be updated on GitLab.com, the specs and whatever.
D
Yeah, and well, the database maintainer role is described on the database review page, but it does not mention what a database maintainer is; it just mentions what you are supposed to do. If we explicitly say that the database maintainer is someone that has knowledge of the GitLab codebase, like specific knowledge of the GitLab codebase, I think that would be very helpful.
A
That makes sense; I'll draft something and send it later. Maybe it's worth saying that there are also a lot of changes going on currently for those primary roles. So we've only had the database reliability engineer until recently, and that was... or maybe let's only talk about how that changes. I think it's changing in the direction that this role is focusing on infrastructure concerns: running the site, running the database, owning the database infrastructure. And then, on the other hand, we got that database team going, so that has been approved and we're...
A
...currently writing those role descriptions, and there we talk about a database engineer, where you're basically a backend engineer, but you also bring deep database knowledge, and you're kind of working on, let's say, foundational database code changes, in a sense: not on features, but rather on those aspects and on database performance. And then we have those...
E
You know the GitLab database, there's the Snowplow tracking database, and you know there's Periscope there. So the data team has an issue opened and assigned to me, in which they want to see the historical versions of some tables in GitLab Enterprise Edition or Community Edition, and those tables are subscriptions. Currently, you know, when a user changes a subscription, it always overwrites the previous one; it doesn't keep historical subscriptions.
E
The data team sees this as a blocker, and they want to have historical versions of these tables. Okay, the idea, which seems obvious, is against all the rules of database design that I read in the handbook, which is, you know: I want to keep a JSONB column where I'm going to store, in a polymorphic table, a previous version of the subscription table, just for data purposes. So this is the first approach, the one requested by the data team. The second approach is the one I'm going to use...
E
Enterprise users on their self-hosted instances initially will not see their own historical versions; they will not see their previous subscription levels or members, but our data team is going to be able to make historical analyses. You know, simple cases like: we changed your subscription level, you went from Bronze to Gold. Right now, that history is lost immediately.
A
So I think the event-based implementation is much better. We have different approaches for doing analytics; for example, we do cycle analytics, and what we try to do is basically look at the data we have and figure out different metrics from that. That's already very painful when you only look at the last three months (say we had a migration recently), and it's just that the database itself isn't really built for that.
A
In a sense, it's not an analytical database, it's a transactional one, and it's really painful to run those analytics on top of that and then extract that information. I think a solution like you described, where you basically emit an event that says all this changed, we keep track of that event stream, and later on you can run analytics on that event stream, makes a lot more sense to me than trying to store all this information in our main database, because we are already growing fast.
A
We
need
to
manage
that
in
some
way
and
I
think
if
we,
if
we
had
a,
we
had
this
event
based
system
in
place
where
you
you
can
consume
those
events
later
on
I,
think
that
would
tremendously
help
in
different
areas
as
well,
not
only
for
the
show
subscriptions
but,
like
I
said
for
cycle
analytics
or
anything
like
that.
That
would
be
very
helpful.
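A minimal sketch of that event-stream idea (all names and fields here are hypothetical, chosen just to mirror the subscription example above): instead of overwriting the current row, each change is appended as an event, and both history and the current state are derived from the stream.

```python
# Hypothetical append-only event stream: each subscription change is
# recorded as a new event, so nothing is ever overwritten.
events = [
    {"namespace": 42, "plan": "bronze", "at": "2019-01-01"},
    {"namespace": 42, "plan": "gold",   "at": "2019-06-01"},
    {"namespace": 7,  "plan": "silver", "at": "2019-03-01"},
]

def history(namespace_id: int) -> list[str]:
    """All plans a namespace has had, in order, derived from the stream."""
    return [e["plan"] for e in sorted(events, key=lambda e: e["at"])
            if e["namespace"] == namespace_id]

def current_plan(namespace_id: int) -> str:
    """The transactional 'current state' is just the last event."""
    return history(namespace_id)[-1]

assert history(42) == ["bronze", "gold"]   # the bronze-to-gold change is kept
assert current_plan(42) == "gold"
```

In practice the analytics consumer would read such a stream out of band, keeping the historical workload off the main transactional database.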