Description
Incubation Engineering APM weekly issue - https://gitlab.com/gitlab-org/incubation-engineering/apm/apm/-/issues/13
Hello there, Joe Shaw here, incubation engineer in the Incubation Engineering department at GitLab, working on Application Performance Management — monitoring and observability — to try and create our own observability stack within GitLab. This is my weekly demo update.
If you're viewing this from the issue, you can see it in front of you, and you can subscribe to the weekly list of videos in this issue; each time I put a video up, I'll add a new related link. So this week I wanted to mostly focus on the metrics schema work, where I was trying to build a generic schema to capture observability metrics in ClickHouse.
While doing that, I quickly realized I probably wouldn't make enough progress to show anything in this demo, so I switched back to running some more benchmarks, which I was going to do anyway, with MongoDB and CrateDB.
Previously we benchmarked ClickHouse against TimescaleDB, because that was an obvious competitor for us: a time-series database that is multi-modal and flexible, and fits with what GitLab is already doing with Postgres.
Out of the selection that I identified as part of the Time Series Benchmark Suite that we're using, MongoDB and CrateDB would be other candidates that fit that profile as well. So, as tracked in this issue, I tried to run those benchmarks.
As previously with TimescaleDB and ClickHouse, things didn't work out of the box with the Time Series Benchmark Suite quite how I wanted them to, but I managed to fix things up. I have previous experience with MongoDB, so I was able to patch that up and get it running; with CrateDB, unfortunately, I wasn't.
There were parts of the Golang interface implementation that were just missing, and various scripts that were missing too. So I've decided to drop CrateDB, and unless there's any particular objection, I won't be going back to it.
So I carried on with MongoDB, and it runs the same as last time, so I've linked back to the previous benchmarks. We're running on a VM in Google Cloud with 16 CPUs and 64 GB of RAM, using the cpu-only test suite, because while the devops suite does put more stress on the system, it's all relative.
The devops suite takes a lot longer to run, and we found that the cpu-only case was a reasonable subset of the full devops cycle, so to try and speed things up I ran with that.
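For reference, the cpu-only run is driven by the Time Series Benchmark Suite tooling. This is a rough sketch of the kind of invocation involved; the flag values (scale, time range, query type, worker count) are illustrative assumptions rather than the exact parameters of this run, so check the TSBS README for the precise flags:

```shell
# Generate one day of cpu-only data for 100 simulated hosts (illustrative scale)
tsbs_generate_data --use-case=cpu-only --seed=123 --scale=100 \
  --timestamp-start=2023-01-01T00:00:00Z \
  --timestamp-end=2023-01-02T00:00:00Z \
  --log-interval=10s --format=clickhouse | gzip > /tmp/clickhouse-data.gz

# Load it into a local ClickHouse server
gunzip < /tmp/clickhouse-data.gz | tsbs_load_clickhouse --host=localhost --workers=8

# Generate and run a batch of queries (double-groupby-1 as an example query type)
tsbs_generate_queries --use-case=cpu-only --seed=123 --scale=100 \
  --timestamp-start=2023-01-01T00:00:00Z \
  --timestamp-end=2023-01-02T00:00:01Z \
  --queries=1000 --query-type=double-groupby-1 --format=clickhouse \
  | gzip > /tmp/clickhouse-queries.gz

gunzip < /tmp/clickhouse-queries.gz | tsbs_run_queries_clickhouse --host=localhost --workers=8
```

The equivalent `tsbs_load_mongo` / `tsbs_run_queries_mongo` binaries cover the MongoDB side of the comparison.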
So here are the results for ClickHouse against MongoDB. The metric ingest rate when loading the database is much higher; MongoDB might seem pretty fast on its own, but ClickHouse still managed to beat it there.
These are the 95th-percentile latencies for the queries we're running, and you can see ClickHouse performs better initially, then a lot better, and better still as the queries get more and more complex. In this case here — the group-by / order-by / limit-1 query, which is a very complex query — the MongoDB version actually timed out.
I couldn't get any results out of it for that, and you can see the vast difference there. Again, similar there, and similar there. There are a couple of cases — again, single group-bys, very simple queries — where MongoDB actually outperformed ClickHouse, but we're talking sub-10-millisecond latencies in those cases anyway. So for most of the queries, ClickHouse performs a lot better.
We couldn't run the cpu-only 4,000-host test, which generates data for 4,000 simulated hosts. MongoDB would load the data, but querying it came to an absolute standstill, and I think the memory on the host was the issue. Jumping down to CPU and memory usage:
Those are kind of as expected. ClickHouse uses more CPU, but it manages to utilize the server resources a lot better: it's using all the cores to do its work. I found when monitoring it that MongoDB was using far fewer cores, and I don't know if that's a setup issue that I've caused, although having done some reading, I do think it's something to do with the type of queries being used and their complexity.
MongoDB stores the tags and the measurements in one record, in a kind of strange format where there are lots of empty values, which appear to be the gaps between them; I guess the idea is that you can then just index into this events list. It does make the document very large, though, and it's clearly not having the best effect on the queries in use there. So it's not the best solution, and I think we can safely say that we can rule out MongoDB.
For the time being, anyway. You can see that the memory usage of ClickHouse is lower there as well. So that's great, I'm happy with that, and I'm happy to move on now and stop with this benchmarking, unless I see any comments or anything that indicates I've missed something obvious. I'll list some limitations here: it's an old MongoDB driver, and, like I say, it doesn't utilize all the VM cores.
But my assumption has to be that the people who created the MongoDB benchmarks, and the document format for them, were probably better suited to that than I am, and would be better at building a benchmark for it. So I have to assume, and trust, that it's going to be as good as anything I could ever create, and I need to focus on getting my ClickHouse implementation done. So, back to the issue: I'm still working on this metrics schema.
I've got an ongoing merge request for that. I started with a naive schema, and I'm building the implementation for it just so I can see how it performs in the most basic case, before I go on to understand how to refine it and make the queries work better. It might have been that the naive schema worked fine, but my little tests so far have shown that it doesn't, anyway.
So that's fine. On our previous weekly issue we got a very helpful comment where someone was suggesting improvements to the naive schema, which is really useful. So one thing I need to do is double-check the documentation for the various things that have been suggested there, like custom codecs and the cardinality of measurements.
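For context, ClickHouse column codecs are declared inline in the table DDL. A minimal sketch of the kind of thing being suggested — the table and column names here are illustrative assumptions, not the schema from the merge request:

```sql
-- Illustrative only: DoubleDelta tends to suit monotonically increasing
-- timestamps, while Gorilla suits slowly changing float samples; both can
-- be chained with a general-purpose codec such as ZSTD.
CREATE TABLE metrics_codec_sketch (
    ts    DateTime64(9) CODEC(DoubleDelta, ZSTD),
    name  LowCardinality(String),
    value Float64 CODEC(Gorilla, ZSTD)
)
ENGINE = MergeTree
ORDER BY (name, ts);
```

Whether these codecs actually help depends on the shape of the data, which is exactly what the cardinality questions above are getting at.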
I looked through some issues, for example around the primary key, and I figured I should sort out the primary key and ORDER BY clause here, at least for the naive schema. So I went ahead and used that suggestion to try and improve its performance in the most basic case, so that it wasn't just really bad for no good reason, and I'll look into some of the other aspects of the table design.
What I intend to do is, rather than having a record created per field in a measurement, have all the measurements end up in the same record using arrays, which is a design I've seen used in a lot of other systems built on ClickHouse as well. So I'm going to have a go at that.
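To make the per-field versus per-record distinction concrete, here is a minimal ClickHouse DDL sketch of the two shapes. The table and column names are illustrative assumptions, not the schema from the merge request:

```sql
-- Naive shape: one row per individual metric value
CREATE TABLE metrics_naive (
    ts    DateTime64(9),
    name  LowCardinality(String),   -- metric name, e.g. 'cpu.usage_user'
    tags  Map(String, String),      -- label set identifying the series
    value Float64
)
ENGINE = MergeTree
ORDER BY (name, ts);                -- sorting key doubles as the primary key

-- Array shape: one row per measurement, with all of its fields
-- packed into parallel arrays
CREATE TABLE metrics_arrays (
    ts           DateTime64(9),
    measurement  LowCardinality(String),   -- e.g. 'cpu'
    tags         Map(String, String),
    field_names  Array(LowCardinality(String)),
    field_values Array(Float64)
)
ENGINE = MergeTree
ORDER BY (measurement, ts);
```

The array shape trades row count for wider rows: a measurement with ten fields becomes one row instead of ten, at the cost of needing array functions (or ARRAY JOIN) at query time.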
Coming back to the weekly issue: I also stumbled across the ClickHouse YouTube channel, which I didn't know about. It has some quite useful introductory videos, and some useful in-depth videos as well.
So I've been going through a bit of that, which is quite handy: it's more of an introductory approach to learning it, rather than going straight to the documentation, which I have been looking at but which isn't as easy to get into. And the "up next" is basically the same as it was last week.