From YouTube: 2023-08-28 Analytics Section Meeting
A
Hello everyone, and welcome to the August 28th Analytics Section meeting. It's a fairly light agenda, so please feel free to add anything you'd like to discuss, or show off in our little show-and-tell section.
A
But the one topic that I really wanted to talk about is potential solutions for how we could better deal with events that are missing columns. In the past couple of weeks, and again today, or over the weekend, we've had some events come in that are, I guess, missing columns, and as ClickHouse is trying to parse them, it just fails.
A
ClickHouse is saying it's missing tabs, because the events are tab-separated values, but it seems like we might be missing values as well. On one hand, we should be able to handle these better once we don't use the Kafka engine in the in-cluster ClickHouse, because ClickHouse Cloud doesn't support that. But on the other hand, this is still going to be a concern for self-managed installations of the cluster and things like that.
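(To make the failure mode concrete, here is a minimal sketch, not the actual ingestion code: a TSV row with a missing value no longer has one field per schema column, which is roughly what ClickHouse reports as missing tabs. The column names are assumptions.)

```python
# Minimal sketch (not the real pipeline) of the TSV failure mode being
# discussed: a row missing a value has fewer tab-separated fields than the
# schema expects. Column names here are assumptions.
COLUMNS = ["event_id", "collector_tstamp", "stm"]

def parse_tsv_row(line: str) -> dict:
    values = line.rstrip("\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    return dict(zip(COLUMNS, values))

print(parse_tsv_row("abc-123\t2023-08-28 10:00:00\t1693216800000"))  # parses
try:
    parse_tsv_row("abc-123\t2023-08-28 10:00:00")  # one value short
except ValueError as err:
    print(err)  # expected 3 fields, got 2
```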
A
So I was curious if we had any ideas about how we could potentially better handle these. There's a related issue, linked by someone who's not on the call, for the Analytics Instrumentation group to look at, so he invites any input there. And then, Anka, you had a point here as well. Would you like to vocalize that?
B
Yeah, sure. So I already looked at that particular issue where the event failed, and it seems like the stm field, which is a timestamp, is not being passed in the payload. That's the reason it fails: in ClickHouse, for the Snowplow events table, we defined the schema with that column as a DateTime, and the value isn't getting there. So there are three possible solutions. We accept the event and add the current timestamp...
B
...if it's not present; we can discard the event in the enrichment process; or, as the third option, we update the schema for the Snowplow events table to accept nulls. But we also need to keep in mind to check all the fields that are present, because right now we only know that stm is an issue, but it could be other fields as well.
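(To make the trade-off concrete, here is a hedged sketch of how those three options might look at the enrichment step; `stm` comes from the discussion above, but the function name, policy names, and queue handling are hypothetical.)

```python
# Hypothetical sketch of the three proposed policies for an event missing
# `stm` (the Snowplow sent timestamp). Not the real enricher's API.
import time
from typing import Optional

def handle_missing_stm(event: dict, policy: str = "discard") -> Optional[dict]:
    if event.get("stm"):
        return event  # field present, nothing to do
    if policy == "default":
        # Option 1: accept the event and stamp it with the current time.
        event["stm"] = str(int(time.time() * 1000))
        return event
    if policy == "discard":
        # Option 2: drop it during enrichment (ideally into a bad-events queue).
        return None
    if policy == "allow_null":
        # Option 3 is really a ClickHouse schema change, e.g. declaring the
        # column Nullable(DateTime); the enricher then passes the event through.
        return event
    raise ValueError(f"unknown policy: {policy}")
```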
A
Right. I guess it speaks more to how we want to handle fields in general, like null values and all that kind of thing. Because we can address it for stm, like you said, but if someone else leaves out another column, or the schema changes, then this issue could be reproduced again, right?
A
It's interesting how a missing field is causing a parsing issue, though. I'm not clear on it yet; I still haven't looked at the ClickHouse error more in depth, but I'm curious how that works. Is it trying to match against... no, it's not even able to parse it correctly. Because I was going to ask: is there a mismatch between the columns to insert and the missing field?
A
But if it's an optional field as well, then that wouldn't make sense. Yeah, so you're saying that we could potentially deal with fields that come in missing in the enrichment process?
B
So what do you think? Should we discard the event if it doesn't have an stm, or should we try it with some placeholder value there, or...?
A
That's fine, but we need to handle that better, so it's not just halting the processing of other events. So I think discarding it is fine, but it's not just an issue specific to the JavaScript SDK; I think it has to be something more general. I don't know if that's something we can add to the overall enrichment process. It can go in the enrichment process? Okay, okay, yeah, that would make sense then.
A
Yeah, so in that case I would say I'm fine with discarding events, because we know that we should be sending the right event fields if they're using our SDK. I'm less concerned otherwise, and I'm open to everyone else; I'm curious...
A
...what everyone else thinks about this, but I'm less concerned about the case where someone sends us a custom event, it doesn't work, and they come to us and say, hey, I didn't use your SDK, but it's not working. Well, we can work with them to figure out how to get it working for their use case, but we really should only be supporting the SDK. I think adding, not placeholder data, but default values for fields that they haven't supplied doesn't seem like the right fix there.
A
So I would say, basically, if you're not using the SDK and your events get discarded, that's not really on us, so to say. Any other thoughts on that?
C
I share the same opinion on this topic. I also feel like we should discard these events, and probably push them to the bad events queue. And one thing worth noting is that, right now, we have this problem with the stm attribute, but it would probably be best to find a more holistic approach, where we make sure that it will not break again because someone misses some other attribute.
A
That's exactly it. It sounds like it's good that we can do this in the enrichment process, because I wasn't sure if we'd have to introduce another event validation process, which would obviously make things a bit more complicated and complex. But yeah, if we can add something in the enrichment process to just put these events into the bad queue, then we at least have a place where we can recognize that there are people sending events in this way.
A
But is there a way we can handle that with our existing schema, such that it's not just specific to the stm field? That's an open question for the Analytics Instrumentation group, I guess: is there a way to discard events generally? I don't know if this just means we have to validate against our schema every time an event goes through, but how do we deal with it so that it handles any field, not just stm?
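(One hedged sketch of what that more holistic check could look like, assuming the enrichment step knows the table's required fields; all the names here are placeholders.)

```python
# Illustrative only: a schema-driven check that catches any missing required
# field, not just `stm`, and routes the event to a bad-events queue with a
# reason attached so it stays debuggable. Field names are assumptions.
REQUIRED_FIELDS = {"event_id", "collector_tstamp", "stm"}

def route_event(event: dict, good_queue: list, bad_queue: list) -> None:
    present = {k for k, v in event.items() if v not in (None, "")}
    missing = REQUIRED_FIELDS - present
    if missing:
        bad_queue.append({"event": event, "missing_fields": sorted(missing)})
    else:
        good_queue.append(event)
```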
D
C
A
No, but that sounds like a good lead as far as a potential solution goes. So I'll check the related issue that was linked, and then I can update it to say, hey, this is the direction I want to head towards, and then we can continue down that path. I think that's probably the best way to go about it. And, semi-related, you'll see this in the production readiness review merge request.
A
In that merge request, which I've linked in the show and tell, there's a bit of a different architecture for what we're trying to do for .com. For context, for those that don't know yet: for a ClickHouse instance that's part of the cluster, there's a table engine called Kafka, which allows ClickHouse to actually pull the events from Kafka, but ClickHouse Cloud does not support that.
A
So we actually have to push events from Kafka to ClickHouse, and that's why I've introduced another piece into the cluster, called Vector, which could potentially do this validation process.
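(As a rough sketch of that push model: this is hand-written Python using kafka-python and clickhouse-driver, not our actual Vector configuration, and the topic, host, and table names are assumptions.)

```python
# Sketch of pushing events from Kafka into ClickHouse Cloud, since the Kafka
# table engine isn't available there. Vector does this in practice; the names
# and the lack of type conversion and batching here are simplifications.
from kafka import KafkaConsumer          # pip install kafka-python
from clickhouse_driver import Client     # pip install clickhouse-driver

consumer = KafkaConsumer("snowplow_enriched_good", bootstrap_servers="kafka:9092")
clickhouse = Client(host="clickhouse.example.com")

for message in consumer:
    row = tuple(message.value.decode().rstrip("\n").split("\t"))
    # A service like this is also a natural place to validate rows and skip
    # or reroute bad ones instead of halting consumption.
    clickhouse.execute("INSERT INTO snowplow_events VALUES", [row])
```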
A
But of course that is a further downstream process, where the event has already gone through the Snowplow enrichment flow. So if we can have a solution in the enrichment process, then I think that would be the best-case scenario, as opposed to trying to validate events again, or discarding events again, in a further downstream service.
A
So, cool, I'll update the issue there. Thanks for chiming in. And then, semi-related, a little bit of show and tell, if no one has anything else. I just wanted to share the production readiness review merge request. It's quite an old request, but the architecture has changed quite a bit from when we initially used Jitsu and switched over to Snowplow.
A
I just wanted to call that out for people who haven't seen it yet; there's an updated architecture diagram as well. I wanted to call attention to that if you're curious about how everything is working there. I'm open to any feedback or questions, also because it talks about disaster recovery, backups, and monitoring, if you have any ideas about how we could do that. A few of us, as we've been going through these production incidents, have been discussing...
A
...how we could monitor each part of the event pipeline, to have better visibility and be more proactive about these types of issues. I'm open to any ideas there as well. But yeah, cool, thanks for the discussion. That's the agenda, so if there's nothing else... well, let me ask it this way: is there anything that anyone would like to discuss?
A
Cool. There was a follow-up question that I was curious about, related to this potential solution, where I was asking if the service ping event pipeline was susceptible to this. Of course, the architecture is a little bit different, since it's going through Wanda and, I believe, Kinesis, because the data team set that up. Did you want to vocalize that point, or...?
D
Yeah, sure. I think it's more of a question for the data team, because when we enrich Snowplow events, they all land in an S3 data lake, and from there they are picked up to be included in the Snowflake models.
A
It's just a thought, because, of course, it's completely different: service ping is not something that we open up for others to... well, I mean, they can submit it through CustomersDot, but it's a little bit of a different process in terms of how the data gets handled, like you mentioned. I was just curious.
A
You know, another alternative, or another potential solution, is what I mentioned with Vector processing the events. Maybe it's just a good sign for us not to use the Kafka table engine, because it seems to just get hung up on one bad event, unless there's a configuration setting to say, hey, if you can't parse this part of the Kafka log after X amount of tries, you should just move on. But yeah.
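(For what it's worth, the Kafka table engine does have a setting in that spirit: `kafka_skip_broken_messages`, which tolerates up to N unparseable messages per block instead of stalling the consumer. A sketch follows; the table definition, topic, and threshold are assumptions.)

```python
# kafka_skip_broken_messages lets the Kafka engine skip up to N messages it
# cannot parse per block, rather than getting stuck on one bad event.
# The table definition, topic, and threshold of 100 are assumptions.
from clickhouse_driver import Client

Client(host="clickhouse.example.com").execute("""
    CREATE TABLE IF NOT EXISTS snowplow_kafka_source
    (
        event_id String,
        collector_tstamp DateTime,
        stm DateTime
    )
    ENGINE = Kafka
    SETTINGS kafka_broker_list = 'kafka:9092',
             kafka_topic_list = 'snowplow_enriched_good',
             kafka_group_name = 'clickhouse_consumer',
             kafka_format = 'TSV',
             kafka_skip_broken_messages = 100
""")
```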
A
The good thing, though, is that once you clear a bad event, the rest go through. So all the events from the past few days should be showing up now. But yeah, it's all good practice for when customers will be playing with our service. Cool, that's the end of the agenda then, so I'll give it another few seconds if anyone wants to talk about anything. Otherwise, everyone gets 17 minutes back. Thanks for joining the call, and have a good rest of your Monday and rest of your week.