From YouTube: Create Deep Dive #6: GitLab ElasticSearch integration
Description
In this Deep Dive session, Mario de la Ossa, Backend Engineer on the Plan team at GitLab, shares his knowledge of GitLab's ElasticSearch integration.
Download the slides: https://docs.google.com/presentation/d/1H-pCzI_LNrgrL5pJAIQgvLX8Ji0-jIKOg1QeJQzChug/edit?usp=sharing
Learn more about Deep Dive knowledge sharing sessions: https://about.gitlab.com/handbook/communication/knowledge-sharing/#deep-dive-sessions
Find out when the next Create Deep Dive is taking place: https://gitlab.com/gitlab-org/create-stage/issues/1
---
Read more about our product vision: http://bit.ly/2IyXDOX
Learn about FOSS & GitLab: http://bit.ly/2KegFjx
Get in touch with Sales: http://bit.ly/2IygR7z
A: So, let's get started. We're going to go into what Elasticsearch is and why we want it. We're going to talk about the differences between database search and Elasticsearch search, and we're going to talk about why we're still not using it on GitLab.com — which we're trying very hard to do; by the way, we do want to use it.
A: So, first of all, what is Elasticsearch? It's a search and analytics engine built on Apache Lucene, which is a, you know, full-text search engine. It's open source, it's RESTful and it's distributed: you can have multiple shards that talk to each other, so you distribute the data and the work of finding the data. It's actually the most popular search engine for both log analytics and full-text search, and with good reason — it is very, very good at what it does.
A: It accepts JSON documents via its API or via ingestion tools such as Logstash, which itself feeds data into Elasticsearch. It automatically stores the original document and adds searchable references to that document in the cluster's index. That then permits us to search for and retrieve the document using the Elasticsearch API, and you can also use Kibana to visualize your data and build interactive dashboards on top of it.
A: It's very high performance thanks to being distributed, which enables it to process large volumes of data in parallel, and it's near real-time: reading and writing data usually takes less than a second to complete. By the way, we do not want to use Elasticsearch for all of our data, for the usual reasons: it's not a relational database, so it doesn't hold any sort of relational data between your documents. It's purely document storage, and what I mean by that is that it's closer to, say, a MongoDB than it is to a relational database.
A: So, we talked about what it is; let's talk about the differences between database search and full-text search engines. The main difference, and the main reason why we want to implement Elasticsearch on GitLab.com, is that it allows for global code and commit search. Right now, if you go to GitLab.com and you try to search for code or commits, you cannot do this globally. You have to drill down into a particular project before you can search for code.
A: So, as you can see in the screenshot on the right, we are searching through every group and every project. The group-based and filtered search — which is when you are actually searching for issues inside of a project or a group — does not currently use Elasticsearch. What you see on the right is the global search, the /search page that you get when you type any words into the top search bar anywhere on GitLab.com.
A: Easier indexing, and an easier way of knowing if the index is stale: for example, we don't have a good way to do zero-downtime deploys right now. At a minimum it requires a full reindex if we change the schema itself. The problem is analogous to database migrations, but we actually don't have any tooling around it.
A: So, we now have a way to enable Elasticsearch just for a few groups or just for a few projects, so we'll be using that to enable it only for the gitlab-org group on GitLab.com itself, to try things out. So let's start talking about how to set it up initially. First things first, you have to install Elasticsearch, right? We have the requirements for each version available in our documentation.
A: We then move on to the initial indexing of content after you've installed it. We currently do it via rake tasks, but soon we will be adding this to the admin console — it'll just be a button that you can press to reindex. Currently we're using the gitlab:elastic:index rake task, which runs all the indexing operations, except for repository indexing. Repository indexing is a little bit special, and I'll talk about it a little bit more in the following slides.
A: This is suitable for all but extremely large instances, which must run each indexing operation separately in order to avoid overloading Sidekiq. This is because, when we start our repository indexing, we actually start enqueuing as many Sidekiq jobs as we have Sidekiq workers available. And finally, we enable indexing and searching with Elasticsearch in the admin console. This is what you see when you're in the admin settings area: you can enable Elasticsearch indexing and searching with Elasticsearch as two separate checkboxes.
A: Most of our search results require all of your projects to be indexed — and I'm not talking about the repository, but rather the project metadata itself — because most of our queries rely on the project data in order to check permissions in Elasticsearch itself before we return any results. You can set the URL of the Elasticsearch cluster, the number of shards, the number of replicas: the normal things. We also have indexing restrictions in place for Elasticsearch, where you can limit the namespaces and the projects that can be indexed.
A: So if you don't want your entire instance to be indexed in Elasticsearch, you can set restrictions, and then only when you are inside of that group or project will the code paths use Elasticsearch for search results. This does mean that you lose global search, of course, because not everything would be indexed — so for global search we use the database when this is enabled.
A: You can have multiple indexes, and each of those indexes would hold one document type — and that is usually the way people use it, right? You would have one document type for issues, one document type for projects, one document type for merge requests, and that would keep everything separate and it would, you know, keep the indexes a little bit smaller. We don't do that, though.
A: Every document has a type field, and that's where we keep whether it's a parent, an issue, whether it's a merge request, etc. Now, that does mean that all of our types share all the fields, so we have a lot of sparse fields, which means we could have a lot of wasted storage. But thankfully, since Elasticsearch 6.0 there have been great storage improvements for sparse fields, so we do not get a big storage penalty — it's actually negligible.
A
We
should
probably
move
to
one
and
expert
type,
but,
like
I
said
we
lose
the
ability
to
filter
by
project
attributes
or,
alternatively,
we
are
forced
to
denormalized
project
data
into
every
data
class
type.
So
we
would
balloon
storage
usage
because
then
we
have
to
copy
the
same
project
information
that
we
require,
for
example,
of
access
levels
for
issues
or
access
levels
from
Earth
requests
or
whether
or
not
you
have
permissions
to
see
a
project.
We
would
have
to
put
that
into
every
single
document
story.
Don't
you
think.
A
So
analyzes
are
where
the
search
magic
actually
happens
and
analyze
the
comparison,
data
for
better
searching
and
each
analyser
increases
and
storage
needs,
because
what
an
analyser
does
is
well.
What
we
the
way
we're
doing
it
is
every
analyzer
has
an
option
to
keep
the
original
data
and
we
do
want
to
eat
the
original
data.
So
we
have
that
on
and
then
each
analyzer
molds
the
data
into
different
into
a
different
format,
basically
and
they're,
proposed
to
organizers
and
filters.
For
example,
an
analyzer
that's
used
very
often
is
just
a
plain
English
analyzer.
A: We also turn terms into n-grams, and an edge n-gram would actually turn "fox" into "f", "fo" and "fox". That's so that we can actually match when somebody searches for partial words, because Elasticsearch will not match a partial term unless you actually have tokens that contain that partial term.
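The edge n-gram idea can be sketched in a few lines of Ruby. This is an illustration of the concept only — the real work is done by Elasticsearch's edge n-gram filter on the cluster, not in application code:

```ruby
# Illustrative sketch of an edge n-gram filter: expand a token into
# all of its prefixes between min_gram and max_gram characters, so a
# search for a partial word like "fo" can still match "fox".
def edge_ngrams(token, min_gram: 1, max_gram: 10)
  longest = [token.length, max_gram].min
  (min_gram..longest).map { |len| token[0, len] }
end

edge_ngrams("fox") # => ["f", "fo", "fox"]
```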
A: For most models we use the standard tokenizer, which is what I was talking about, and we have three filters. You have the standard filter, which actually doesn't do much — it's just there for Elasticsearch in case in the future they have to add something; it's just an easy way for them to add an extra filter if they need it. Actually, in the latest Elasticsearch the standard filter has been removed, so I guess they never needed it, and they decided to do it differently.
A
We
had
a
lower
case
filter
which
normalize
it
starts
to
lower
case
and
also
when
researching
we
normal
aspects,
the
lower
case.
So
this
means
that
it's
a
lot
easier
to
match,
search
terms
as
everything
is
normalized
to
lowercase.
We
don't
have
to.
We
do
not
need
sensitive
and
searching,
and
we
have
a
custom
stemmer
filter
that
we
called
my
stomach,
because
it's
the
only
one
we
have
their
uses,
the
light,
English
stammer
and
that's
the
one
that
knows
how
to
separate
English
words.
For
example.
A
Then
also
to
have
my
Engram
analyzer.
Now
again,
you
can
see
on
the
Engram
analysis.
We
need
it,
so
we
never
need
their
proper
name
and
it
creates
two
or
three
grams
or
projects
for
the
project's
name
with
namespace
itself
and
toward
three
grams
I
mean
so
you're,
just
getting
your
getting
partial
strengths
of
two
or
three
characters.
A
We
also
have
way
more
interresting
analyzers
for
repositories
and
commits
we
do
with
one
of
tokenizing
with
a
ski
holding
and
lowercase
cultures.
Ascii
folding
basically
turns
every
utf-8
character
that
could
be
asked
me
into
ASCII
and
Lotus.
Just
you
know,
makes
everything
more
paste.
We
don't
have
special
filter
for
the
code
analyzer,
it
uses
an
edge
and
burn
filter
that
creates
grams
mean
to
120
characters
wide,
and
that
is
also
so.
A
We
used
to
this
for
code
and
basically
means
that
if
you
have
a
very
long
string
with
a
lot
of
periods,
you
know
like
if
you
call
a
very
long
function
and
then
you
add
parentheses,
and
you
add
some
arguments.
Sometimes
we
want
to
just
find
everything
that
has
the
very
line
functioning,
just
a
function.
So
if
we
did
not
have
this
edge
Engram
filter,
we
would
actually
not
be
able
to
find
that
function
or
if
you
want
the
pirate
partial
function-
and
you
know
like
functions
that
have
specific
port
inside
of
them.
A
We
need
this
in
order
for
elasticsearch
to
be
able
to
find
it.
We
also
have
a
filter
with
a
ton
of
reject
patterns,
so
we
are
basically
separating
camelcase
function
names.
So
then
you
can
find
different
words
inside
of
a
concave
function.
We
also
extract
all
the
digits.
We
also
extract
terms
and
some
quotes.
We
separate
the
terms
on
periods
and
we
separate
patterns.
You
know
term
to
do
slash,
and
this
is
again
because
if
we
don't
do
that,
then
you
cannot
search
for
just
that
term.
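As a rough illustration of that kind of token splitting — the real filter is a set of Elasticsearch pattern-capture regexes, so this single Ruby regex is a simplified stand-in, not the actual patterns from the codebase:

```ruby
# Simplified stand-in for the code analyzer's regex patterns: split a
# camelCase identifier into its component words and digit runs so each
# piece becomes an independently searchable token.
def split_identifier(identifier)
  identifier.scan(/[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+/)
end

split_identifier("parseHTTPResponse2") # => ["parse", "HTTP", "Response", "2"]
```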
A: We also have a custom sha_analyzer, which tokenizes using an edge n-gram from 5 to 40 characters, and that directly maps to, you know, how Git uses SHAs to identify specific commits. So if you put a commit SHA into the search bar, we use this analyzer to turn a 40-character SHA into a 39-character one, a 38-character one — every length all the way down to five characters — and that's how we can allow you to search for a specific SHA no matter how many characters you're giving us.
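The SHA analyzer's effect amounts to taking every prefix of the SHA from 5 up to 40 characters. Here is a sketch of that idea in Ruby — again, the real tokenization happens inside Elasticsearch:

```ruby
# Sketch of the sha_analyzer idea: an edge n-gram from 5 to 40
# characters turns a full 40-character commit SHA into one token per
# prefix, so any abbreviation of at least 5 characters can match.
def sha_prefixes(sha, min_gram: 5, max_gram: 40)
  longest = [sha.length, max_gram].min
  (min_gram..longest).map { |len| sha[0, len] }
end

full_sha = "deadbeef" * 5     # stands in for a 40-character SHA
sha_prefixes(full_sha).first  # => "deadb"
sha_prefixes(full_sha).length # => 36
```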
A: So how do we interact with Rails models? Well, we use a customized elasticsearch-rails gem to link up our models with Elasticsearch. We needed to customize it a bit because our way of doing the document type is not normal — you know, we have a single document type for all of our models, and that's not what people usually do. So we had to customize it a little bit, and you can find those customizations under — oops, sorry, I wanted to get out of here.
A
So
we
have
an
application
search
module,
which
is
the
entry
point
that
defines
cold
ice
and
shared
methods
for
everything,
except
for
repositories,
after
that
each
class
defines
their
own
search
module.
So
we
have
project,
search,
issue,
search,
Merguez,
search
and
note
search,
/
temple
and
these
classes
to
find
the
basic
elapsing
search.
Query
structure
and
any
special
indexing
concerns,
so
application
search
defines,
for
example,
the
basic
project
filter.
A
You
we
define
the
basic
project
filter
right
here.
You
can
see
that
we
are
defining
all
of
our
settings,
such
as
the
number
of
charge,
the
filter.
That
was
something
about
the
lady
whose
temer
organizer
for
project
path
and
we
go
down.
These
are
all
the
fields
that
we
have
available
to
us.
This
joint
field
here
is
actually
how
we
do
parent-child
relationships,
so
the
project
can
be
a
parent
and
then
issue.
Merge,
request,
milestone
your
block.
We
can
Bob
and
commit
in
each
child's
multiple
project,
and
we
need
this
sorry.
A
I'll
talk
about
after
commits
in
a
few
seconds,
but
what
I
really
want
to
show
you
here
is
the
basic,
very
catch
here
we
go.
So
this
is
the
absolute
fitness
hash
we
ever
send
to
all
asset
search.
We
have
query
of
boolean
type
that
must
match
certain
fields
with
the
query
and
we
use
the
and
operator
so
the
more
in
terms
you
give
us,
you
know
they
all
have
to
match.
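The shape of that simplest query hash looks roughly like the following. The field names and boost here are placeholders for illustration, not the exact ones used in the codebase:

```ruby
# Rough shape of the simplest query hash sent to Elasticsearch: a bool
# query whose `must` clause matches the user's terms against a set of
# fields, joined with the AND operator so every term has to match.
def base_query(search_term)
  {
    query: {
      bool: {
        must: {
          simple_query_string: {
            fields: %w[title^2 description], # placeholder field list
            query: search_term,
            default_operator: :and           # all terms must match
          }
        }
      }
    }
  }
end
```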
A: And we have a project ID filter, where the parent is any project and the query has to match a project ID query, which is right here. You can see that we check whether the user can read cross-project — which basically means the user is an admin; then we don't send any project IDs, because we want them to be able to read everything. Otherwise, we pick projects by membership, and we pick projects by visibility, because if the user is not a member of a project, the project has to be public.
A: If we are going to limit by membership only, then we need to make sure that the feature is enabled and the user is a member; otherwise, we just need the feature to be enabled for public projects. This is actually used heavily in issues, merge requests, etc., to make sure that we are only returning issue results, for example, for projects that actually have issues enabled. We also have ElasticsearchGitRepository — that's the one that defines how blobs, wiki blobs and commits interact with Elasticsearch.
A: We need a separate module because they're completely different: repositories are not in the database, they're actually on disk, so we need to talk to Gitaly in order to get the actual blobs and the actual commits. We only index the default branch, otherwise the cost would skyrocket — the only branch we ever index is master, or whatever other default you have set. And we currently have two indexers.
A: We have a Ruby script that's actually very slow and is due to be removed soon, and we have gitlab-elasticsearch-indexer, which is written in Go and knows how to talk to Gitaly directly, and it's a lot faster — we measured improved speed by almost 10 times for certain scenarios — and it lowers the resource usage, and that's just because Go is better at memory handling. Though it is still memory-hungry: we're still loading all the blobs into memory.
A: It's a compiled binary in this case, and it also allows us to hide from the Sidekiq memory killer. Before, we had problems with our Rails script because it gets run from Sidekiq — so since it runs from Sidekiq, it gets killed from Sidekiq. Since it's a Ruby script, it shows up as part of how much memory that worker is using, and Sidekiq actually has a limit on how much memory we can use, and we would start getting killed by the memory killer.
A: So, by pushing this off to another process entirely, we can hide from that, and so we can use as much memory as we actually need to. Now, it indexes blobs — which includes wiki blobs — and commits. The wiki blobs part is actually a new thing, because we found out last milestone that we basically had a broken indexer for anybody that was no longer using NFS.
A: That is, anybody that migrated over to Gitaly — because wiki blobs are blobs there, and they are on disk, not in the database, so we actually have to treat them exactly the same as all the other blobs. By the way, whenever I say "blobs" I do mean files — so all of the code files in a repository, or any other file you have in your repository. And it's a good idea to note right now that we will not index binary blobs.
A: We only index things that can be detected as text, so any images, of course, get ignored, as do any binaries, as in executable files. We also do not index very large files — I do not quite remember what the limit is right now in the codebase; I think it's one megabyte, but that could be wrong. That is actually a very, very large text file, so it's a very reasonable limit. So the indexer talks to Gitaly and gets a diff between the last commit it found in the index status and the current SHA. The IndexStatus is a Rails entity, and it really is just there to know what the last commit was that was indexed for that particular project, because we don't want to reindex everything.
A
What's
the
last
commit
that
we've
already
indexed
and
only
index,
all
the
new
files
that
have
come
in
after
that
and
by
new
files,
I
also
mean
updated
files,
so
we
can
catch
all
the
updates
that
have
happened
between
the
last
commit
and
the
current
head
or
the
current
Shalit
can
get
wish
to
us
and
any
deletions
as
well
happen
at
this
point,
and
the
only
problem
here
is
that
we
are
assuming
that
only
humans
are
ever
added.
That
commits
are
only
ever
added
that
they're
not
removed.
A: But coming back to ApplicationSearch: that one defines callbacks, right, for incremental indexing when a model gets updated. So we have on-create, on-update and on-destroy callbacks triggering an Elasticsearch update via the ElasticIndexerWorker — whoops, sorry, I keep forgetting how to use the slides. So, as we had seen a few seconds before, we have hooks on create, on update and on destroy, and this actually gets included in all of our indexed models.
A
In
this
case,
a
project
need
to
live
in
the
same
start
as
the
parent,
which
does
mean
that
we
can
get
a
very
unbalanced
storage
usage
where,
if
you
have
a
project
with
10,000
issues,
then
that
shard
is
going
to
be
full
of
issues
that
were
cleaning
full
of
files
and
osteogenic.
Some
worker
really
is
just
in
queueing
index
reference
service
or
deleting
the
file
directly,
and
this
really
again
just.
A: This just sends a direct operation to Elasticsearch, whether it is index or update, and it can detect if this is the first time we've ever indexed a project. If this is the first time we've ever indexed a project, there are some extra things to do. Usually this would just, for example, get an issue and update the issue description, the issue title, etc., or it gets a project that has already been indexed before, so it gets an update operation.
A: Then again, it just makes sure that all of your project features are updated and that the project's metadata is up to date, and that update is completed. But if this is the first time we've ever seen this project — and the way we detect that is by whether you're sending an index or an update operation — then we have to do an initial index, where we have to grab every single indexed association, which is the issues, the merge requests, etc., and we've got to import them into Elasticsearch.
A: If we don't import them into Elasticsearch, then they're never going to be imported at all. Basically, it is the initial import: when, for example, you import a project into GitLab itself, we would grab all of the new issues and all the merge requests, and otherwise you'd have an empty project — we'd never find anything to index. And the repositories actually get updated via the Git post-receive worker hooks.
A: We check that we are on the default branch and that the project is enabled for Elasticsearch — remember, we can filter which project or which group gets indexed — and we actually have a Redis lock here, because we were having some trouble if somebody created a new project as the indexing was going on. So, if we should index a commit, we just enqueue the ElasticCommitIndexerWorker, and the ElasticCommitIndexerWorker is the simplest of our workers: it just calls the Gitlab::Elastic::Indexer, and this one...
A: The only thing it has to know is whether we're going to use the new experimental indexer or not. The way it chooses whether to use the experimental indexer or not is by checking the application settings: if we have the setting to use the experimental indexer enabled, and if the binary exists in your path — the binary must exist in your path for you to be able to enable that feature — then it'll use the new indexer. We should really rename this, because it's not experimental anymore.
A
It's
basically
a
data
will
to
be
a
release,
but
you
know
we
to
be
change
the
documents,
but
nobody
looks
at
this.
Hopefully,
but
anyhow,
we
set
up
some
environment
variables.
You
know
we
said
from
what
chart
to
watch
we
want
to
index
and
the
only
thing
we
do
here
is
update
the
index
status
and
run
the
industry
so
running
a
mixer.
It
really
is
just
pulling
the
indexer.
If
you
don't
have
the
experiment
when
these
are
enabled
it'll
go
to
the
Ruby
script.
A
That's
really
it.
We
just
have
some
extra
thanks
to
get
the
index
status
and
what
attributes
we
have
to
update,
because
now
that
the
wiki
also
goes
through
the
go
indexer
and
we
want
to
do
incremental,
we
need
to
actually
send
weekly,
commit
and
weekly,
and
it's
that
information
as
well
so
yeah.
The
last
commit
that
was
indexed
escaped
on
the
database
in
the
ending
status.
A: So, how does search work? An Elasticsearch query is a JSON structure, and it can contain multiple filters. We can see an example of it on the right side: we can say, hey, this is a query that must contain the term "kimchi" under the user attribute, and let's filter out anything that has the value "tech" under the tag attribute — it must not contain it.
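That example can be written out as the hash it corresponds to. This is a reconstruction from the description of the slide, not a copy of it:

```ruby
# Reconstruction of the slide's example: a bool query that must match
# the term "kimchi" in the user attribute and must not match documents
# whose tag attribute contains "tech".
example_query = {
  query: {
    bool: {
      must:     [{ term: { user: "kimchi" } }],
      must_not: [{ term: { tag: "tech" } }]
    }
  }
}
```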
A: A "must" has to match; a "should" is basically the same as a "must" match, but it gives it a lower score, so those results would show up later in the results. We implement permissions as boolean filters, and we can filter for projects that a user has access to, or projects with certain features enabled — so if we want to filter, for example, on the issue tracker being enabled, we can do that. And highlighting is given to us by Elasticsearch; if you click there, you'll be able to see the Elasticsearch documentation on how highlighting works.
A
You
just
send
in
highlight
field
and
a
query
with
the
field
to
pilate,
and
all
that's
in
search
will
tell
you
where
exactly
in
that
field,
it
matched
the
the
query
you
gave
it.
So
the
response
comes
back
with
a
highly
element
for
each
search
hit
and
it
has
the
actual
fragment
of
the
document
where
it
matched.
So
if
you
pray
performs-
and
you
get
a
quick
brown
boss,
something
it'll
give
your
fragments
it's
with
Fox
and
it'll
like
it'll,
have
to
match
right
there
and
where
you
should
be
highlighting.
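A request with highlighting enabled looks roughly like this — the `content` field name is an assumption for illustration, and the full set of options is in the Elasticsearch highlighting docs:

```ruby
# Sketch of a search request that asks Elasticsearch to highlight
# matches: the response's hits then carry a `highlight` element with
# fragments of the document, each matched word wrapped in the tags.
highlight_request = {
  query: { match: { content: "fox" } },
  highlight: {
    fields:    { content: {} },   # which field(s) to highlight
    pre_tags:  ["<mark>"],        # wrap matches in <mark>...</mark>
    post_tags: ["</mark>"]
  }
}
```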
A: We expose Elasticsearch's simple query string. We do this because it's a little bit simpler for us, but it also allows the user to use boolean operators, so you can do a little bit more interesting things than our normal database search — we can do exact search matches — and it is complex, but it is very powerful.
A
It
allows
us
to
filter
by
path,
filename
or
extension,
and
the
way
they
are
implemented
is
extremely
simple.
Again
we
have
to
the
repository
I
believe
yep.
So
we
need
to
have
this
search.
Query
class.
Then
you
add
filters
so
file
name
is
the
following:
filter
then
path
allowing
people
to
my
path
and
extension
options
from
the
by
extension,
and
this
is
just
how
to
partisans
or
whether
it
be
input
matches
or
not.
And
if
you
go
into
the
query
class
and
see
how
exactly
these
filters
happen.
A
Sorry
about
that
I
completely
forgot
to
create
a
thought,
so
I
asked
when
restricting
by
group
of
project
and
something
global
search
is
disabled.
What
does
that
mean?
For
the
good
luck
word
uncommon.
This
is
enable.
Can
you
devil
in
the
staging,
so
I
see?
Nick
has
already
been
answering
a
few
of
these,
so
thank
you
so
much
Nick.
You
want
to
take
this.
One
I
think.
B: That's it — other people are going to be curious as well, so it's good to go through them. In this case, it's just asking, I guess, about the limited projects that we saw right at the start. In the admin settings you can say: only index the gitlab-org group, which is what we're going to do on GitLab.com, because we can afford to just index the gitlab-org group — it's about three and a half gigabytes for just one group, so it's not too expensive.
B: If you have just a group or just a project indexed and set up in the Elasticsearch settings, then, if you're on the search page for that group or for that project, you get all the extra Elasticsearch magic. If you're in any other group or any other project, or at the global level, then you just get the regular, ordinary database search with none of the extra features — but that search does still work, which is the important thing. It doesn't disable global search for everybody, which I think is what I was most worried about.
D: The only difference in the UI is, in the search box, once the results are displayed, there's a note underneath that just says you're using advanced global search, I think — or advanced search, whatever we call it. So the indication is there to the user that they're using Elasticsearch, yeah.
A: Basically, right now, the only way to know is whether or not you have the words "Advanced search functionality is enabled". I have an example right over here — you see this "Advanced search functionality is enabled" right below the search query. That's how you know whether Elasticsearch is enabled or not. And that's fair, we should probably give it a badge or something; we're actually thinking about a few different ways to surface whether Elasticsearch is working or not.
A: Yeah — like, the Code and Commits tabs will not show up if Elasticsearch is disabled for global search. So if you have indexing enabled only for a few groups or a few projects, then sure, for the group you'll see Code and Commits and that'll be different, but for any other project it'll look exactly the same.
A
So
sorry,
let's
go
ahead
and
switch
over
to
Cody.
Then
he
says
we
kick
up
as
many
sidekick
jobs
as
we
can
so.
Elasticsearch
innocence
has
taken
down
your
love
before
our
self-managed.
So
is
there
a
way
to
be
nicer,
so
we
were
actually
thank
youing.
Even
more
jobs
nowadays
leave
that
we
used
to
because
yeah
we
just
do
just
do
a
matching
a
couple
of
thousand
projects
and
then
keep
going
from
there,
but
I'm
pretty
sure
that
the
we
fix.
This
is
not
really
us
being
nicer,
but
rather
the
siding
configuration
itself.
A: And we are actually not fixing issues in the old one. So right now wikis, for example, are broken on the old one — the wiki indexing — if you don't have NFS anymore. A lot of our users are moving on to Gitaly, and if you use Gitaly and you're not using the Go indexer, you're going to have a bad time. Okay.
B: I think you showed off some of those permission filters earlier. There is a huge amount of duplication there, and, to a great extent, it's necessary. Just for a start: to support pagination of all things, if Elasticsearch doesn't know which issues, etc. the user can and cannot see, then it can't give you the correct results and you can't paginate them properly. So, yeah, we duplicate all that logic again in Elasticsearch — there's a lot of it.
B: It's very sensitive, and there have been bugs in it before. I don't know of any bugs that expose data right now; if you find one, then it's probably going to have to be a P1, since we're going to start turning this on on GitLab.com at some point — so feel free to look for them. I would love to find them if they do exist.
A: I do have the relevant methods here on screen right now. We have to re-implement all this because we have to send it all as a JSON document in order to be able to query against it — and we query against it instead of, you know, filtering in memory, because it would be extremely prohibitive to do this in memory. I mean, we're talking about — sometimes we return, you know, thousands of results, and if we're going to paginate this, we'd have to re-run the query just to get, like, the first 20, and then we could paginate again to the second 20. And then how do you keep all of those pagination things in sync? Because every single time you go to the next page, you lose all of your state — you don't have state in a RESTful environment.
B: And one thing probably worth calling out explicitly is: if you're not an administrator, but you are a member of, say, 20,000 projects, every Elasticsearch query you do will send those 20,000 project IDs to the Elasticsearch cluster as part of the query. It's horrible, but it does work. I think we only send the project IDs for private and internal projects — that's right, yeah. We don't do a huge amount ever; otherwise we would bring the cluster down, or close to it.
A: So right now we don't actually have any alerts when indexing fails, but, I mean, we do need to figure out how to make this easier for infrastructure, to be able to tell when something's going wrong. Every single time you edit your project, we do trigger indexing updates through an after-commit callback, but, yeah — if that...
B
Leak
and
we
rely
on
sidekick,
not
losing
jobs
and
geo
has
the
same
problem
at
times
him
there
are
times
when
psychic
does
leave
jobs,
we
retry
the
jobs
a
couple
of
times.
If
that
fails,
they
go
into
the
dead
job
queue
and
from
their
last
10,000
of
those,
and
if
the
index
gets
badly
out
of
sync,
your
only
real
option
is
to
V
index
from
scratch,
which
is
part
of
why
making
me
indexing.
Easy
is
so
important
right.
A: So that's a very, very hard problem, because every single time we add another filter, we are adding more storage as well. You're talking about, you know, either adding something that will detect what language each document is in before sending it to Elasticsearch — which I do not know if it's possible, because we then have Elasticsearch analyzers that run on the Elasticsearch cluster itself, with no way of really telling Elasticsearch: hey, don't use that analyzer, use this one instead for this particular document.
H: Are there any case studies, like — because I would imagine that Elasticsearch is being used to index, you know, code, and search for code, other places? Maybe that's not the case, but I wonder if there is — this is just me kind of spitballing — some resource out there that we may be able to tap into.
D: I'll just say, from a product perspective — and I think I got tagged on that issue the other day — I think the concern is super valid, and I think it's something that we need to look into, along with the impacts on, like, index size and what all we'd have to do; and adding more filters spawned a bunch of other discussion. It's one of those things that I think we've just got to keep testing and iterating on to get broader coverage.
D: I think it makes sense to strive for broader coverage on different languages, especially in some of those use cases where the search just flat-out can't return it, because we've already decided it's some other language than it is. So I think it's something that we're gonna have to keep top of mind and keep working towards. All right.
A: Whenever we change our schema, that requires a full reindex, yeah, and, you know, that's because Elasticsearch just doesn't really have a good way to transform data, and we would have to go into the database anyway to know how to transform it. So far we usually have not — I mean, we don't force a reindex, normally. It's just...
A
It meant that nothing would work if you tried to use the old index that you had. And we do try to batch up whenever we make a schema change. So we had a few small ones — there was a typo in one attribute, and we were not actually indexing the internal ID for one of our objects — but they were very mild, only used for display things. So you know, we didn't really say: hey, you need to reindex.
A
C
Could you perhaps share your screen? You know, I've found it hard sometimes to visualize our data without using Kibana, and I don't know how many people actually use it, but I will share what I see. I have a simple Kibana instance that has indexed some data. So I think, to your point earlier about how everything is in one type — I think this is what it's showing, right? Oh.
B
If they don't have it, they'll have to install it if they're installing from source. As DJ said, we're already installing it by default in Omnibus, and that's the path that we really want to make easy for people. If they're running on Debian stretch and they need Go 1.11, they'd just have to install it as well, right?
J
I think it's more of a documentation issue, because if you go to the official Elasticsearch integration docs, which I linked, there is no mention of it being installed in Omnibus, and it does directly instruct the user to git clone, make, and make install — which, if they do not have that dependency installed, will fail. And it's not just Debian stretch: it's CentOS 6, CentOS 7, Ubuntu 16.04, and I believe a default Ubuntu 18.04 would also lack that.
A
Matt, we only index the master branch. You know, we could do some sort of interesting stuff with only indexing things that have changed between master and another branch, but then there would be a lot of trouble keeping it updated. Plus —
A
The specified default branch, actually. So if you go to your project, you see the default branch setting; it defaults to the name master, but you can change that, and we would honor it.
A
Matt wants to know how long we will retain data in the Elasticsearch index. You know, that's for as long as the document itself exists: as long as it still exists in the repository, it will be there; as long as the issues themselves exist in the database, they will be there. Even if you close them, they still exist — they just have the tag closed on them.
A
To be honest, I'm not sure. Nick has some thoughts about that, but it'll be more our infrastructure team that will know more as well. Nick, do you want to jump in? He might have left already — it was getting late for him. I'm sorry, I'm going to have to ask you to follow up with either Nick or our infrastructure team, who have been dealing with the new staging stuff.
A
So Lois wants to go back to Cody's first question about self-managed clients. Well, yeah — if they had a large instance, they'd need a Sidekiq cluster when implementing Elasticsearch; an instance as small as this one doesn't really need it, it'll be super quick. We do need at least, you know, a few Sidekiq workers to be able to keep everything updated, but that's a requirement for the whole GitLab application anyway, right? And I'm not sure who it is that's writing right now — oh, Cody. Thanks, Cody, yeah.
A
And feel free to jump in if you feel like I'm not answering the question completely. So, Matt —
F
That's the one I'm pulling up for just a second — sorry to stop there. No, I think we can just take this async and, like, research it some more. I'm just calling this out for the other people who are curious about it: I think we need to dive in a little bit further and just figure out how to determine whether the Sidekiq queue is causing pain for other sorts of basic actions. So I just wanted to mention that real quick.
A
To go back, then: Matt wants to know, considering the upgrade path for Elasticsearch, would we expect that we'll need to reindex at some point in the future? Yes — and that's why we are trying so hard to decouple the Elasticsearch setup from the codebase as well. We want zero-downtime indexing. We might have to do one —
A
You know, a maintenance downtime in order to update to that in the future. Hopefully we don't — hopefully it's a matter of just, you know, disabling indexing and then doing our thing, with the rolling update being enough. But we currently do not have zero-downtime updating; like, we require a reload of the classes in order to pick up the new changes.
A
And: "You showed some clever use of ngrams to handle both partial search terms and reduce the need for language-specific stemmers." So that question — you know, it's the same as above: are we going to add more stemmers for other natural languages? And the answer is most likely not, sadly, because of the problem of storage. You know, every single thing you add requires more storage; we're copying the data. And tokenizing specific programming languages would be interesting, but we're trying to keep everything general, and that's why we have all of those.
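The ngram idea mentioned here can be sketched in a few lines: index every character trigram of a word, and a partial query term matches whenever its trigrams are a subset — no language-specific stemmer needed. This is just the concept in miniature; Elasticsearch's ngram tokenizer does the real work server-side:

```python
# Minimal sketch of how character n-grams allow partial-term
# matching without stemmers: index every trigram of a word, then
# a query matches if all of its trigrams appear.

def ngrams(text, n=3):
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

indexed = ngrams("searching")
query = ngrams("search")
# Every trigram of "search" appears in "searching", so the
# partial term matches without stemming "searching" -> "search".
print(query <= indexed)  # True
```

The trade-off the speaker describes is visible here too: storing all trigrams of every token multiplies the index size compared with storing whole words.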
A
Let me see if I can find them real quick. There — this is what we have, a little list of patterns. So we're trying to, you know, separate CamelCase terms, separate things in parentheses, separate terms like quotes, separate extension periods and path terms. That tends to work well for most of the cases. There are some edge cases, of course — I think if you go for Objective-C, for example, you've got those square brackets to deal with — so that would be something we are not really capturing well with these different regexes, and I —
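The kind of splitting those patterns aim for can be approximated in a short sketch — break CamelCase, paths, and punctuation into searchable terms. The regexes below are illustrative only, not the actual patterns GitLab ships to Elasticsearch:

```python
import re

# Rough sketch of code-search tokenization: split CamelCase and
# treat path separators, dots, parens, and quotes as term breaks.
# Illustrative regexes, not GitLab's real analyzer patterns.

def code_terms(source):
    # Split CamelCase: "FooBar" -> "Foo Bar"
    source = re.sub(r'(?<=[a-z0-9])(?=[A-Z])', ' ', source)
    # Path separators, dots, parens, quotes, whitespace all split.
    return [t for t in re.split(r'[/\.\(\)"\'\s]+', source) if t]

print(code_terms('lib/gitlab/ElasticIndexer.find_all("foo")'))
# ['lib', 'gitlab', 'Elastic', 'Indexer', 'find_all', 'foo']
```

The Objective-C caveat from the talk shows up directly: square brackets are not in the split set here, so `[obj message]`-style calls would not tokenize cleanly without extending the pattern.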