From YouTube: Marcus, Dennis and Radovan discussing how to approach the PostHog backfilling process
Description
Marcus, Dennis and Radovan discussing how to approach the PostHog backfilling process in order to overcome low performance with the PostHog ingestion.
B
I put some current status in, and we just finished the regular weekly meeting. The main problem, or the main issue, is the low performance. I spoke with the PostHog team and with Jeff Martin from our side. He is able to bump this to 50 pods instead of three pods. That means, theoretically speaking, about 17 times faster. I also want to check with you about cost efficiency; in the Slack thread he explains everything. It's not costly, because under the current setup they will pay the same amount of money for the current cluster anyway.
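The "17 times faster" figure is just the pod ratio. A quick sanity check on that arithmetic (a sketch; the 3-pod and 50-pod figures come from the discussion, and linear scaling with pod count is an idealization):

```python
# Theoretical ingestion speedup from scaling PostHog ingestion pods.
# Assumes throughput scales linearly with pod count (an idealization;
# real speedup is capped by other bottlenecks, e.g. Kafka partitions).
def theoretical_speedup(current_pods: int, target_pods: int) -> float:
    return target_pods / current_pods

speedup = theoretical_speedup(3, 50)
print(f"{speedup:.1f}x faster")  # ~16.7x, i.e. roughly the 17x quoted above
```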
B
Exactly, this is what Jeff Martin nicely explained. He talked about the VM instances and pods, how things are done and how it's executed on the GCP cluster. I was scared and was worrying about cost efficiency; that's always one angle we need to consider. What he said: Kubernetes magic, based on the current pod usage.
C
But it's not... yes, that's true. But let's say even if it cost us 25,000 dollars. If you say, yeah, you also already paid 6 million, then that is indeed small and not material. On the other hand, it's still 25k, right? So, exactly.
B
That's
true
and
anyway,
it
will
be
scaled
automatically
up
to
50
ports.
That's
a
maximum
if,
if
it's
idle,
no
cost
at
all,
but
it's
scale,
I
would
say
on
demand,
so
because
current
what
what
posco
team
explained
to
us,
they
can
put
out
to
scale
automatically.
But
it's
not
the
case
for
our
cluster,
because
it's
scale,
that's
the
reason
why
everything
is
so
humble,
it's
poc
and
you
want
to
save
some
money.
Of
course,
don't
don't
spend
it
very
very
quickly.
B
Correct, roughly. But yeah, as for what you know, I need to check that, of course. But if the PostHog pods scale and there are 50 pods, I can check it tomorrow morning and immediately see: okay, is this done? At least I will check one day of data, or one hour of data, something small, so I can scale up and calculate it more precisely. This is like: where is our sweet spot?
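The "measure something small, then extrapolate" idea can be sketched as follows (hypothetical numbers; the one-day test sample and the extrapolation function are illustrations, not figures from the meeting):

```python
# Extrapolate total backfill duration from a small measured sample,
# assuming the ingestion rate observed on the sample stays constant.
def estimated_total_hours(sample_records: int, sample_hours: float,
                          total_records: int) -> float:
    rate = sample_records / sample_hours  # records ingested per hour
    return total_records / rate

# Hypothetical: a test run moved 1M records in 4 hours; 90M records remain.
print(estimated_total_hours(1_000_000, 4.0, 90_000_000))  # 360.0 hours
```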
B
They said three months of data, so my total calculation is based on that, but I think he could survive without the full range — or maybe that is not possible. I think one month would be sufficient. I don't know; that's what I read between the lines at the last meeting, because I spoke with him. And also, the deadline for this is next Friday.
B
Three months — but in the previous meeting he said, okay... I know he's aware of the performance issue, but he said: okay, if three months is not possible, maybe we can negotiate something else, like a small decrease, or decouple a smaller amount of data. But, as I said, if we can scale this to 50 pods, I can try one hour or one day of data to be sure everything is done.
C
You need to hand it over, and that's not ideal. So if it was one month, then we could get started right now, and by the end of tomorrow we would have all the data in, we could close the issue, and the evaluation could continue. That's why I'm asking. But let's say, if we do need to go for three months, then I think there is indeed no other possibility than to bump the cluster up to 50 pods.
B
Yes. Also, what I want to say, and raise a question mark about: I need to check everything before we move on. Okay, I will check with Jeff Martin about the cost; I just need to test everything before we go live — theoretically it will work. That's one thing. The second thing is what I checked with the PostHog team: when I try to insert one record, there is a problem with the PostHog installation, because there is a bug on their side.
B
So I don't get an error, but the data is not where I expect it; I can't see it in PostHog, so they need to sort this out as well. From my point of view, my action points are: check with Jeff about the cost of 50 pods, just to have the measures for three or four days of usage; check with the PostHog team to fix this issue; and then I need to test everything and say we are ready. My estimation is fine.
B
Later on, if it's okay, I can run the DAG, and if something fails, someone can just rerun it. You can do it, and Marcus can — it's very simple. But we need to ensure the measure is correct, that it's fine end to end, and that the data is there in the test project. Then we should switch to the real project and say: okay, this is the data for you — and of course the schema is okay. For now they have some additional requirements — we're talking about Carolyn and Dave — but that is what it is.
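The "if something fails, someone can just rerun it" property depends on each step being safe to repeat. A minimal rerun wrapper might look like this (a sketch, not the actual DAG code; the function and parameter names are invented):

```python
import time

# Rerun a backfill step a few times before giving up. This is only safe
# because each step is assumed idempotent: rerunning it must not
# duplicate already-ingested data.
def run_with_retries(step, attempts: int = 3, delay_s: float = 1.0):
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the operator
            print(f"attempt {attempt} failed ({exc}); rerunning")
            time.sleep(delay_s)
```

In an orchestrator such as Airflow the same effect comes from the task-level retry settings, but a plain wrapper like this keeps the behavior visible to whoever takes over as backup.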
C
Good. And then the third point: we need to have a backup, because you're on holiday from Friday — let's say from tomorrow. Who can actually step in in case of an emergency? You say: yeah, we can run the DAG. But if we have an error with that DAG, then I cannot solve it, and nor can Marcus. So we need to have someone as a backup.
B
So
I
would
say
from
my
point
of
view,
anyone
of
indeed
engineer
engineering
team
can
do
that.
What
is
my
intention
is
to
fully
documented
everything
it's
already
done.
Code
is,
let's
say
self-descriptive
it
all
specification.
Each
function
has
explanation
usage
input
output.
I
think
it's
in
a
good
shape
if
something
fails
any
moment
or
any
part
can
be
debugged
and
fixed,
because
it's
building
components
so
also.
I
take
that
action
point
for
me
with
me
to
create
a
backup
what
I
can
do
can
organization.
C
Yes, exactly, that was my proposal. Indeed, please have a sync — one hour, or a 30-minute sync, I don't know how much time you need — and please put that in the status. So the status as of now, or let's say tomorrow, is: yes, we have a technical issue that needs to be tackled with PostHog, something we cannot solve ourselves.
C
Lastly,
push
one
record
and
it
doesn't
exist
in
post
hoc
so
that
that's
what
indeed,
then
please
check
indeed
with
jeff
about
the
cost
impact
bump
it
up
to
50
points
if
it
is
not
too
much
and
then
if
this
technical
one
is
solved
by
tomorrow,
start
the
thing
and
then
back
fent
is
a
backup
in
case
of
an
emergency
that
something
breaks,
but
that
really
is
something
great
right.
If
there
is
a
schema
change
needed,
I
think
that's
too
bad.
That
needs
to
wait
up
until
you're
back.
B
Infrastructure issues — any kind of infrastructure issue, any kind of triage issue, nothing more than that. That is what it is. To prove it works I need to test everything, and hope there will not be an additional problem. I also need to sort this out with the PostHog team, sort out the pods with Jeff, and create a backup — everything.
C
Yes. And then my last question — and I look at both of you. Let's say happy flow: this technical issue is solved by tomorrow, we bump the cluster up to 50 pods, we run it tomorrow and it's finished on Sunday. Yeah, ideal situation — let's hope for the best.
C
Sure, other things can go wrong, but this was just for me to get a clear overview: are we now covered with our plan? And apparently we are indeed covered, so let's see — I would not want to be surprised. It was already a surprise for me that you need to do this backfill; I thought it was only installation-wise. So that was my checklist: the installation is done by Kev.
B
Also, for whoever gets picked as the unlucky backup guy, we will consider who the key players are if something is wrong, so they can take over. For instance: something screwed up with the PostHog installation, you know it's buggy, we can't insert the data — what to do? Ping the PostHog team. If you need to scale something, ping Jeff Martin. Just to create a state machine of scenarios: if something goes wrong, what can be wrong.
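That "state machine of scenarios" could start as nothing more than a lookup table. A sketch (the scenarios and contacts come from the discussion above; the structure and names are invented):

```python
# Minimal emergency runbook: map each failure scenario to the next action.
RUNBOOK = {
    "posthog_installation_buggy": "Ping the PostHog team in the Slack channel",
    "cannot_insert_data":         "Ping the PostHog team in the Slack channel",
    "cluster_needs_scaling":      "Ping Jeff Martin",
    "dag_run_failed":             "Rerun the DAG; escalate if it fails again",
}

def next_action(scenario: str) -> str:
    # Unknown scenarios fall through to a safe default.
    return RUNBOOK.get(scenario, "Unknown scenario: escalate to the team")

print(next_action("cluster_needs_scaling"))  # Ping Jeff Martin
```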
C
Maybe not a smart thing to say, but don't make it too difficult — we can't think through a hundred scenarios. But indeed, like you said, if we just have proper documentation and we know who's involved, we will find our way, of course. So of course, if you have some time to do this kind of additional documentation, that would be fantastic. But the key thing, of course, is that we said we need to tackle that technical issue, because that's a blocker. So that has more priority for me: getting that one solved.
B
So: ping them, drop a message in the PostHog channel. And, as I said, I hope I will be able to do the testing first thing in the morning — okay, I have end-to-end testing for one day and I can see it's done — or, I hope, there are no further problems. But anyway, I will do all the additional steps and let you know. On your honest question about the latest state: I have the current status here and will post it for tomorrow as well, so you have the full picture.
B
No, no, just to see what's going on, yeah, so you can easily connect the dots — all the strings, whatever. So the two key things for us for today are: ping Jeff to do the bump and create 50 pods — before that I will ask for the costs, of course, and put it here, since this is a confidential issue — and then ask the PostHog team to fix the issue with Kafka. For now.
B
That is point one. We also discussed this, and he also mentioned — heads-up now — the number of partitions for Kafka: it's only one partition at the moment. He can take that action point, but I don't know when he plans to do this. Because with one partition for Kafka, the question is: you're able to push everything to Kafka, but are you able to push it efficiently through PostHog to ClickHouse? And can we upgrade to the latest version to get rid of some bugs? So what can happen?
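The single-partition concern matters because, within one Kafka consumer group, each partition is consumed by at most one consumer, so consumer parallelism is capped by the partition count — extra ingestion pods beyond that sit idle. A sketch of the arithmetic:

```python
# Within one consumer group, effective parallelism is capped by the
# partition count: at most one consumer per partition is active.
def effective_consumers(num_consumers: int, num_partitions: int) -> int:
    return min(num_consumers, num_partitions)

# With a single partition, even 50 ingestion pods drain Kafka one at a time.
print(effective_consumers(50, 1))   # 1
print(effective_consumers(50, 64))  # 50
```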
B
Everything is perfect — that's the happy flow. But there could be a problem with Kafka, or a problem with this events issue: a pod can't accept the requests. That can be a potential problem, I would say. I hope everything will be fine, but in case of any problem we need to ping them, or drop a message in the PostHog Slack channel.
C
Okay — it's buggy, but then you cannot backfill data?
C
Now, let's do so — right or not? Sorry, let's do so. But I don't know to what extent it can be done today. If it can be done today — and I think we need to push for it a lot; maybe Marcus, you can push that, because most of the folks are in the US — if we can make it happen that we are on 1.37 by the end of the day, we can start fresh tomorrow.
C
Action item number one: yeah.
A
...needs to do it — is that correct? Okay. You know, I'm definitely going to ask. I'm not sure if he will be able to do it today, whatever he currently has on his plate, because we're throwing this in last minute; let's go with the assumption he won't. From the call that we were on before, it's like: well, they recommend us upgrading, but it should still load regardless. Now it might not, but it's a lot to do right before running, and so on.
C
That's, I think, the biggest blocker that we have currently. Of course, the performance is an issue too, but if we cannot even insert one record because of an issue on the PostHog side, that needs to get out of the way first, of course. And if I understand you correctly: regardless of the upgrade, you should still be able to insert one row — or not?
A
They just strongly recommend it because of all the bugs, but it doesn't mean — it's not black and white, that it doesn't work at all on the old version and only works with the new one. I think it's just that some stuff was buggy, but it still somewhat works, because if I log in right now, I see data. Yeah, but...
B
I picked up a zillion microphones because my computer is crashing — sorry for that; let's go on. The problem is that now I can't insert even one record; something crashed on their side in the current version. They claim the upgrade will probably sort this out. If there is no upgrade, I need to ping them and wait for them to fix it.
B
I loaded data two days ago, and then something crashed yesterday morning.
B
Yes, so it wasn't a hard blocker from the beginning; I was able to load slowly — I have one million records there in my test project. But since yesterday morning nothing can be inserted, and it's a black box for me: there is no exception, just silence. It means the PostHog library for Python accepts my record and pushes it to Kafka, where it stays forever; it's not processed into ClickHouse and it's not visible in the PostHog application.
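Because the failure is silent — the client accepts the event, nothing raises, and the event never reaches ClickHouse — an end-to-end canary check is the only reliable signal. A hedged sketch with the send and query steps injected: `send_event` and `count_events` stand in for the real PostHog capture call and a ClickHouse/PostHog query, which are not shown here:

```python
import time
import uuid

# Detect silent ingestion failure: send a uniquely-tagged canary event,
# then poll the destination store until it shows up or we time out.
def canary_ok(send_event, count_events,
              timeout_s: float = 30.0, poll_s: float = 1.0) -> bool:
    canary_id = str(uuid.uuid4())
    send_event({"event": "backfill_canary", "canary_id": canary_id})
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if count_events(canary_id) > 0:
            return True   # event made it all the way to the store
        time.sleep(poll_s)
    return False          # accepted by the client but never ingested
```

Running this once before kicking off a multi-day backfill would have turned "a black box, just silence" into an immediate red flag.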
A
Okay, let me back up and go to the worst-case scenario: we'll have PostHog give the right stuff to Jeff and we'll ask him to do it. Provided... if you can do it now, we're great. If you can't do it before you leave, we cannot start the load; somebody else needs to kick it off, or we say we're not going to make our OKR. Yes.
C
And I don't want to go all-in on an upgrade if it is not needed — then let's not do it; let's stay on the old version, 1.36. But we need to get that loading issue out of the way anyhow, and maybe we need to push this one back to PostHog, saying: hey, this is the issue right now, we cannot insert — how do we solve it? Can you solve it, or do we need to upgrade to the next version?
C
Let them make the call. If we need to upgrade, then they need to provide the Helm chart and Jeff has to execute the upgrade. If they can solve it otherwise, I'm a huge fan of that, because upgrading takes more time, involves more people, and will make it more complex than it already is. So, right: we need to get the loading issue out today.
A
Yeah, I'll follow up — if there's not enough traction, or whatever the response is, I'll follow up where needed. So overnight for you — for you all, overnight — hopefully something will be done. Similarly, if they say it needs to be the upgrade, I'll have them sync with Jeff too. To your point, Dennis: indeed, it feels like a big scary thing to do right before somebody goes on holiday — like, if it doesn't work, it doesn't work, and they're out, and that's not it.
A
So if that really is the only option we have, then yeah, hopefully all this gets sorted before your time tomorrow, which is, I don't know, 12 hours or so. Otherwise — so what is our... I know we're out of time. Do we even have a backup plan? What if it doesn't... Well, if it doesn't get solved in the next 24 hours — but preferably more like 12 hours, really, to kick it off and test it — it basically needs to be done within 12 hours.
B
That's the backup option, Dennis; I think you're right here. Probably that, or anyone, as I said, can follow up on what I did and just try to do the same next Monday or Tuesday, whatever, and then finish before next Friday, if the happy flow failed for some reason — or for any reason. But, as I said, we will do something in the next 24, 22 or 21 hours for sure, and then we will take it over from the last status.
B
I will also stay and try to provide the documentation and this message immediately, just to be prepared for tomorrow. And tomorrow I'll do my best — and hope my laptop will survive, because I don't know what's going on; there's something crazy. You should see kernel_task hitting my memory three times — 3.5 times. But anyway, I will prepare what is possible tonight.
C
Yeah. And since it was working on the existing version, I won't put all my hopes on an upgrade, because to me it seems unrelated to whether the software was working or not. Of course, if there is an issue where you need to upgrade, fine; and if it had never worked, then I would say: yeah, upgrade to another version, maybe that will solve it. But, like I said, I don't want to go all-in on upgrades, especially not where we are right now with the evaluation and the timing that we have.
C
Well, let's rumble.