From YouTube: Optimising Rails Database Queries: Episode 1
Description
In this video series we will take a look at optimising database queries in Rails applications. We'll be using GitLab as an example, but the techniques can be applied to other Rails applications as well.
The audio quality is unfortunately not the best, but I plan to sort this out before recording the second episode. Make sure to watch it in 1080p, otherwise the text will be too blurry.
The explain visualiser used in this episode can be found at https://explain.depesz.com/.
This is followed by a bunch of queries that seem to repeat themselves quite a few times. These queries are most likely the result of an N+1 query problem: for every snippet we probably fetch the author, but for whatever reason we are not preloading those records. Today we'll specifically take a look at this first query.
This is the rather complex-looking SELECT * FROM snippets query. For the sake of this video I have it already formatted here, so we don't have to go through that procedure, and I've already obtained a query execution plan. In PostgreSQL there's a command you can run for this: EXPLAIN. If we open up a terminal and, for example, connect to our development database, we can normally run a query, say SELECT COUNT(*) FROM users, and get the results. But if we want to know the expected plan, we can run it with EXPLAIN.
The result is PostgreSQL telling you what it expects to do, with emphasis on "expects", because EXPLAIN doesn't actually execute the query. If we want to actually execute it, we have to use EXPLAIN ANALYZE, and now we get some extra data, such as the planning time, the execution time, the cost of each step in the query, and so on.
We can extend that a little further with some additional options. For example, we can run EXPLAIN ANALYZE with the BUFFERS option, which will display the number of shared buffers used. Buffers are memory buffers used for caching results from disk. Every buffer is 8 kilobytes in size, 8,192 bytes. So in this case we can see that it says "Buffers: shared hit=3", meaning it used 3 buffers.
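As a rough sketch of the commands described above (the users table is the one from the video; actual plan output will differ per database):

```sql
-- Plan only: EXPLAIN does not execute the query.
EXPLAIN SELECT COUNT(*) FROM users;

-- Actually execute it, adding planning time, execution time,
-- and per-node costs to the output.
EXPLAIN ANALYZE SELECT COUNT(*) FROM users;

-- Also report buffer usage. Each shared buffer is an 8 KB page,
-- so a line like "Buffers: shared hit=3" means 3 * 8192 bytes
-- were served from the buffer cache.
EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM users;
```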
Unfortunately, this output is not exactly readable when you gather it from the terminal: if we take a query from the left here, execute it with EXPLAIN (ANALYZE, BUFFERS) and then paste it, there's a lot going on. So typically what I do is run this, copy the output, and then put it into the tool at explain.depesz.com.
What this does is visualize the query plan and show you information such as how long it took to execute a specific step, how long it took to execute that step and all its sub-steps, how many rows were produced, and so forth. I'm not going to go too deep into the format of PostgreSQL's explain plans, because that could be discussed in its own video, but very quickly, to give a rough idea: each step here is called a node.
Nodes start with their names (Limit, Sort, etc.) and they're executed, sort of, from the inside out. So this first node is actually the last thing that will run, not the first one. Perhaps the best way to visualize this is as a function call that calls another function, which calls yet another function. In other words, Limit is a function that calls the Sort function, which in this case calls the Index Scan function, etc., and they return results as they unwind.
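To illustrate that inside-out order, a plan for a hypothetical query might have this nesting (node shape only; real plans also include costs, timings, and row counts):

```sql
EXPLAIN SELECT * FROM snippets ORDER BY created_at DESC LIMIT 20;
-- Limit                             -- outermost node, finishes last
--   ->  Sort                        -- called by Limit
--         ->  Seq Scan on snippets  -- innermost node, produces rows first
```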
In this case, and this is why I like this tool so much, we can immediately see that, for example, in node three we spend about 2.3 seconds, excluding all sub-nodes. If we look at the statistics of that node, we see that it uses, I believe, about two and a half million shared buffers, which equals about 20 gigabytes of memory.
That's a lot of memory for just a bunch of snippets, especially considering we are only displaying twenty of them here. If we scroll down, we see that here again we use quite a lot of buffers to get the data. Scrolling down a bit further: again a lot of buffers, but we also perform an index scan over 860,000 rows here, and here below, how many rows do we scan? About four and a half million, with a filter.
A filter is essentially the equivalent of Ruby's Array#select: it basically loads all the data into memory and then filters it. Here we have the number of rows removed, about 1.8 million. So in general this query is doing a lot of work, way too much work, just to get 20 snippets. Let's take a look at the actual query to see if we can quickly figure out why that is.
I'm going to close my terminal, because we're not going to use it anymore today, and look at the query instead. If I look at this query, scroll down a little bit and then go back up, what stands out to me is that certain sections appear to be repeated quite often. Let's see what this is: from line 11 to 16 we have this EXISTS (SELECT ... FROM project_authorizations) sub-query; this table stores all the projects that a particular user has access to.
So, in other words, there appears to be quite a bit of repetition. Going up, another thing that stands out is this first sub-query: here we get all the projects, limit them to the ones that you have access to, and take the ones whose project_features.snippet_access_level column is in this list of values, or where that value is 10 and you have access to the project.
A
This
is
a
little
weird
that
we're
doing
this
again
because
we
already
filtered
the
project
still
once
you
have
access
to
here,
and
so
this
and
exists
as
far
as
I
can
tell,
is
completely
redundant,
and
so,
if
you
remove
this
oops
I
have
to
go
there.
We
go.
We
are
left
with
and
feature
snippet
access
level
in
null
2030
or
10,
which
is
exactly
the
same
as
feature
snippet
access
level
in
nolde,
10,
20,
30,
and
there
you
go.
We
already
got
rid
of
that
now.
Now, I know from some prior testing that this saves about 2 seconds of the execution time, but we're still left with about 12 seconds or so, so it's not the primary source of concern.
This particular AND condition, in its current state, we could remove entirely, because as far as I know these are the only possible values that we actually store in this column. However, I suspect the code that generates this query will use different values here based on your permissions.
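The simplification described above looks roughly like this (column and table names as discussed in the video; the ellipses stand for the parts of GitLab's real query that aren't shown here):

```sql
-- Before: the EXISTS repeats an access check the outer query already makes.
AND (project_features.snippet_access_level IN (NULL, 20, 30)
     OR (project_features.snippet_access_level = 10
         AND EXISTS (SELECT ... FROM project_authorizations ...)))

-- After: dropping the redundant EXISTS lets the OR collapse into one list.
AND project_features.snippet_access_level IN (NULL, 10, 20, 30)
```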
Then we have the value 10, where again you have access to the project. In other words, we're getting all the projects with visibility level 10 or 20, public and internal (with 0, I think, being private), and then we say: take the ones where the snippet access level is the default, or 20 or 30, which I suppose is what most projects use; so that's the default plus public and internal. And then we say: or the snippets are private, but you have access to the project.
If I look at this, what I will probably do is move this into another union condition, or union member, or whatever you'd like to call it. The reason is that, from my experience, the performance of OR in PostgreSQL can vary quite a bit. Sometimes it performs well, but when you use, for example, a WHERE IN with a sub-query and then an OR, it tends to perform rather poorly, whereas a UNION typically performs much better. So let's do that: we'll change this condition into a UNION.
So we just do that: we add a UNION, and I will just copy-paste this and put it here, and then this OR condition becomes an AND condition. We change the indentation, and the parentheses here can go, so now we have this as two separate things. There's some repetition here, the visibility level, but since this is generated by our code that's not a big deal: we're not writing this query manually in our source code.
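The OR-to-UNION rewrite has this general shape (schematic: the ellipses stand for the joins and access checks of the real query):

```sql
-- Before: one query, two access paths combined with OR.
SELECT projects.id FROM projects ...
WHERE project_features.snippet_access_level IN (20, 30)
   OR (project_features.snippet_access_level = 10 AND EXISTS (...));

-- After: each branch is its own UNION member, which PostgreSQL
-- often plans much more efficiently than the OR.
SELECT projects.id FROM projects ...
WHERE project_features.snippet_access_level IN (20, 30)
UNION
SELECT projects.id FROM projects ...
WHERE project_features.snippet_access_level = 10 AND EXISTS (...);
```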
Now let's take a look at this WHERE project_id IN. What we're essentially doing in this FROM block is selecting all columns for all these projects, and then we just select the ID from the result. This is rather wasteful, so we can change it to SELECT projects.id. It simply means there's less data to send over and less data to filter out. I think in this case PostgreSQL might be smart enough to optimize this for us, but I personally prefer to make these things explicit.
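The narrowing described above, sketched (the ellipses stand for the filters in the real query):

```sql
-- Before: the inner query materializes every column just to take the id.
WHERE snippets.project_id IN (
    SELECT id FROM (SELECT projects.* FROM projects ...) p
)

-- After: only fetch the column we actually need.
WHERE snippets.project_id IN (SELECT projects.id FROM projects ...)
```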
The other reason is that WHERE IN typically performs about the same as WHERE EXISTS with a small number of values, but when there are a lot of values, from our experience, WHERE EXISTS typically performs better. The way we have to do that, though, is a little bit more annoying. So what do we have to do here? Do we join snippets anywhere? No, we don't, okay. So what we can do is change this to WHERE EXISTS.
Then we add AND projects.id = snippets.project_id over here. Basically this says: get all the projects, project features, etc. where the project's ID equals the snippets.project_id from the outer query. We do the same here, AND projects.id = snippets.project_id, and the exact same thing here. Then in the outer query we have the OR WHERE the project is not specified, which we use for obtaining personal snippets that are not associated with a project.
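Putting the IN-to-EXISTS change together, the sketch below shows how the sub-query becomes correlated with the outer snippets row (ellipses stand for the remaining joins and filters):

```sql
-- Before: build the whole list of visible project ids, then test membership.
WHERE snippets.project_id IN (SELECT projects.id FROM projects ...)

-- After: a correlated EXISTS; the sub-query references the outer row,
-- so PostgreSQL can stop at the first matching project per snippet.
WHERE EXISTS (
    SELECT 1
    FROM projects ...
    WHERE ...
      AND projects.id = snippets.project_id
)
-- Personal snippets have no project at all:
OR snippets.project_id IS NULL
```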
There are a few more things we could improve, but let's take a look at how this query runs in its current state. To do that, I'm going to open a database connection. Now, this might be a bit scary, because I'm using a production database. Let's check that the font size is correct, yep.
The reason I'm doing that is that, when you want data on how a query performs, you need something that's as close to production as possible. If you do this in, say, a development environment where you have pretty much no data, it's not going to be accurate. If you have a staging environment that's up to date, it's going to be more accurate.
Even then, the lack of traffic can influence the behavior, because traffic volume might result in certain buffers being populated that are otherwise not available. In other words, in a completely unused staging environment the behavior is probably going to be very different. So what we're going to do here is use EXPLAIN (ANALYZE, BUFFERS), paste in the query, and then we just run it, and boom, there we go. This query now takes 2.2 milliseconds to run. That's really fast, considering it took 12 seconds before. Let's make sure it's actually returning the right results.
I know from this page that it's supposed to return about 20 rows, I believe, so let's take a look. To do that we get rid of the EXPLAIN ANALYZE and just keep it a SELECT *, because the LIMIT here only limits it to 20, so that's not a big deal. Let's close that and see if there are actually 20 rows. That's a lot of data; let's just change it to a COUNT query instead, that's maybe a little easier to look at, so we can get rid of the LIMIT.
We also take out the ORDER BY, and let's see: we have 52 snippets. That's the total number of snippets; let's see if that matches what I have here, and yes, 52. So in less than 30 minutes or so we've gone from 14 seconds originally, well, 10 to 12 seconds, down to 2.5 milliseconds, and all we did was remove some redundant clauses and split the query up into an extra union member. And we can probably still do better: we can probably get rid of this AND EXISTS.
We can probably get rid of this one by using a common table expression. In PostgreSQL, the best way to explain it is that you basically run a query, save the results, and can then select from them. That way, if it's a heavy query, you ensure it's only executed once. That feature, unfortunately, is not available in MySQL, so if we want to use it in our code, we have to make sure we can deal with both databases; the query has to support both. We're not going to do that today.
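A common table expression in PostgreSQL looks like this (a minimal sketch; the sub-query contents are hypothetical):

```sql
-- WITH runs the inner query once and names its result set;
-- the outer query can then select from it like a table.
WITH visible_projects AS (
    SELECT projects.id FROM projects ...
)
SELECT *
FROM snippets
WHERE snippets.project_id IN (SELECT id FROM visible_projects);
```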
That's quite a bit of work. The plan is that in the next video we'll take a look at the code and see how it influences this query, in order to better understand whether the optimizations we made can actually be turned back into source code. For example, if certain conditions that we've moved have to be there, we might have to take a slightly different approach.