From YouTube: CyberSecurity and Data Management for Connected Vehicles Gil Mazuz (Upstream) OpenShift Commons
Description
CyberSecurity and Data Management for Connected Vehicles
Gil Mazuz (Upstream)
OpenShift Commons on Automotive
April 6 2022
full agenda here:
https://commons.openshift.org/gatherings/OpenShift_Commons_Gathering_on_Automotive.html
Join OpenShift Commons:
https://commons.openshift.org/gatherings/index.html#join
Hello all, my name is Gil Mazuz. I'm VP of Engineering at Upstream, where I lead software development, data engineering, data science, and DevOps. Before Upstream, I spent nine years at NSO, where I founded the real-time data analytics and intelligence domain.
So we can see here that there are so many attack vectors and exploits possible in this domain. You have the backend server: imagine a hacker attacks the telematics server of a specific OEM. He can control the entire fleet; he might send commands that control the entire fleet, say, lock the doors so no one can enter, or even something worse. You also have mobile applications. A hacker can hack into a specific vehicle, or into an application server that receives commands coming from a mobile application.
Any of these can become attack vectors with all of this data, so this technology comes with a risk. That's where Upstream comes to the rescue. At Upstream, one of our mottos is "unlocking the value of mobility data," and that's what we do: Upstream aims to unlock the value in connected vehicle data to help stakeholders secure their assets.
So let's talk about Red Hat OpenShift and the Upstream cloud-agnostic architecture. First of all, let's see how the platform is built and how the data flow goes through it. Data comes from many sources, as we've mentioned before: from telematics servers, from the vehicles themselves, and from applications.
A
All
of
this
data
requires
a
serious
data,
engineering
and
data
ingestion
flow
in
order
to
make
it
to
make
good
use
of
it
in
order
to
protect
it
in
order
to
have
services
upon
it
and
other
applications.
So
the
first
phase
is
this
is
going
through.
Is
data
normalization
and
cleansing,
so
we
are
cleaning
the
data
making
sure
that
unsorted
data,
even
in
real
time,
can
be
sorted
to
sort
to
a
specific
amount
to
a
specific
extent
and
make
sure
that
the
data
is
unified
in
order
for
us
to
make
sense
of
it.
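As a rough illustration of what that normalization step looks like, here is a minimal Python sketch that maps raw records from different sources onto one unified schema and drops corrupted ones. The field names, source aliases, and schema here are illustrative assumptions, not Upstream's actual format.

```python
# Minimal sketch of a normalization/cleansing step (hypothetical schema and fields).
from datetime import datetime, timezone
from typing import Optional

UNIFIED_FIELDS = ("vehicle_id", "timestamp", "signal", "value")

def normalize(raw: dict) -> Optional[dict]:
    """Map one raw telematics record onto a unified schema, dropping junk."""
    try:
        # Different sources name the vehicle differently (assumed aliases).
        vehicle_id = raw.get("vin") or raw.get("vehicleId") or raw["device_id"]
        # Accept either epoch seconds or ISO-8601 strings.
        ts = raw.get("ts") or raw["timestamp"]
        if isinstance(ts, (int, float)):
            ts = datetime.fromtimestamp(ts, tz=timezone.utc)
        else:
            ts = datetime.fromisoformat(ts)
        return {
            "vehicle_id": str(vehicle_id),
            "timestamp": ts.isoformat(),
            "signal": raw["signal"].lower().strip(),
            "value": float(raw["value"]),
        }
    except (KeyError, ValueError, TypeError):
        # Corrupted or incomplete record: filter it out of the clean stream.
        return None
```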
After that, we derive a digital twin from this data. A digital twin is essentially the actual vehicle state: a virtual representation of the vehicle. Now, that has its own challenges, because, for instance, if the velocity of a vehicle was 80 miles per hour five minutes ago, is it still 80 miles per hour?
A
If
that's
the
last
signal
we
received,
so
you
need
to
manage
and
maintain
all
of
this
state
and
to
make
sure
that
it's
still
relevant
and
still
up
to
date
and
after
that,
based
upon
all
those
signals
and
the
digital
tween.
We
have
an
ai,
artificial
intelligence
power
detection,
where
we
detect
anomalies
in
cyber
attacks
and
other
forms
of
models
that
utilize
the
data.
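Here is a minimal sketch of the staleness problem just described (was the vehicle really still doing 80 mph five minutes after the last signal?), assuming a simple per-signal freshness window; the TTL value and data model are hypothetical.

```python
# Toy digital twin with per-signal staleness checks (assumed TTL and structure).
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

SIGNAL_TTL = timedelta(minutes=2)  # assumed freshness window per signal

@dataclass
class SignalValue:
    value: float
    received_at: datetime

class DigitalTwin:
    """Virtual representation of one vehicle, built from its latest signals."""

    def __init__(self, vehicle_id: str):
        self.vehicle_id = vehicle_id
        self.signals: dict[str, SignalValue] = {}

    def update(self, signal: str, value: float, received_at: datetime) -> None:
        self.signals[signal] = SignalValue(value, received_at)

    def current(self, signal: str, now: Optional[datetime] = None) -> Optional[float]:
        """Return the value only if it is still fresh enough to be trusted."""
        now = now or datetime.now(timezone.utc)
        latest = self.signals.get(signal)
        if latest is None or now - latest.received_at > SIGNAL_TTL:
            return None  # stale: the twin no longer claims to know this value
        return latest.value
```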
On top of this platform we build applications. We have a cybersecurity application that, first and foremost, guards and protects the vehicles and the servers. We also have advanced analytics and other applications for the data, such as insurance, predictive maintenance, business intelligence, and data quality validation (there is so much corrupted data that the OEMs and our customers receive), and also the ability to create third-party applications on top of this data.
So let's drill down a little more into the architecture. One of the first and biggest challenges we have is that we are usually deployed in the customer's virtual private cloud, or on-prem. We cannot just say, "okay, we rely on the cloud," because sometimes we are deployed on AWS, sometimes on GCP, sometimes on Azure, and we can even be deployed on-prem.
So how do you handle this kind of challenge? Let's go over our components a little and see how OpenShift helped us with it. We use Kafka, not, you know, a dedicated managed streaming platform, but Kafka, because we can deploy it everywhere. So we use Kafka for ingestion and message brokering.
All the messages coming from all of our sources go through a macroservice that does the parsing, normalization, and so on. By the way, "macroservice" is a term that Uber mentioned: when you have too many microservices it becomes hard to maintain them, so sometimes it's good to unify them and have a clear interface and a clear domain for each macroservice. That's what we did for this normalization and ingestion.
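A minimal sketch of such an ingestion macroservice follows, assuming the kafka-python client, hypothetical topic and broker names, and the normalize() helper sketched earlier: consume raw messages, normalize them, and publish the clean signals to a new topic.

```python
# Sketch of a consume -> normalize -> produce pipeline (assumed topics and broker).
import json
from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

RAW_TOPIC, CLEAN_TOPIC = "raw-telematics", "clean-signals"  # assumed names

consumer = KafkaConsumer(
    RAW_TOPIC,
    bootstrap_servers="kafka:9092",
    group_id="normalizer",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda d: json.dumps(d).encode("utf-8"),
)

for message in consumer:
    clean = normalize(message.value)  # drop corrupted records
    if clean is not None:
        # Key by vehicle so all signals of one vehicle land in one partition.
        producer.send(CLEAN_TOPIC, key=clean["vehicle_id"].encode(), value=clean)
```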
There are then clean, unified signals that go to a new topic, and after that, of course, come the processing, detection, enforcement, and so forth. We use Redis for the state store and PostgreSQL for business entity storage, which makes sense because hierarchical business entities are built like that and it's much easier to query them with SQL.
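As a sketch of the state-store side, here is how per-vehicle twin state could be kept in Redis; the key layout, TTL, and field names are illustrative assumptions, and the hierarchical business entities would live in PostgreSQL and be queried with ordinary SQL.

```python
# Sketch of a Redis-backed state store for the digital twin (assumed key layout).
import json
import redis  # pip install redis

r = redis.Redis(host="redis", port=6379, decode_responses=True)

def save_twin_signal(vehicle_id: str, signal: str, value: float, ts: str) -> None:
    key = f"twin:{vehicle_id}"
    r.hset(key, signal, json.dumps({"value": value, "ts": ts}))
    r.expire(key, 3600)  # let the whole twin age out if the vehicle goes silent

def load_twin(vehicle_id: str) -> dict:
    return {k: json.loads(v) for k, v in r.hgetall(f"twin:{vehicle_id}").items()}
```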
We also, of course, load logs and machine learning models into our data lake, and we use Presto, or to be more accurate Trino, to query this data and also for our machine learning operations. So that's a drill-down into our architecture.
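For example, querying the data lake through Trino from Python might look like the following sketch; the host, catalog, schema, table, and column names are illustrative assumptions.

```python
# Sketch of a data-lake query via the Trino Python client (assumed catalog/schema).
import trino  # pip install trino

conn = trino.dbapi.connect(
    host="trino", port=8080, user="analytics",
    catalog="hive", schema="telematics",
)
cur = conn.cursor()
cur.execute("""
    SELECT vehicle_id, count(*) AS corrupted_records
    FROM raw_events
    WHERE is_valid = false AND event_date = current_date
    GROUP BY vehicle_id
    ORDER BY corrupted_records DESC
    LIMIT 20
""")
for vehicle_id, corrupted in cur.fetchall():
    print(vehicle_id, corrupted)
```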
We run this architecture on OpenShift in order to have it as cloud agnostic as we can. Upstream was designed to run over cloud infrastructure agnostically, as I've mentioned: AWS, Azure, GCP. We use Helm to deploy and install the Upstream platform and to support third-party app components. Upstream utilizes cloud services such as a PostgreSQL database, Redis, and object storage, and here we found the OpenShift operators' equivalents out of the box. What happened is that they actually replaced for us the managed services that we used to get from the cloud.
We have the Operator Marketplace, which is a very large catalog where you can find most, if not all, official releases of cloud-native services and applications, with one-click deployment. You have this operator service, which is again super convenient. For deployment we use Helm: OpenShift is based on Kubernetes, obviously, so Helm works here out of the box and supports deployment best practices for maintaining cloud-native Kubernetes applications.
We also have the built-in metrics in the OpenShift console, which gave us better visibility and an overall view of the running workloads. It was really easy for us to see what was happening in our cluster because of the OpenShift console. Security, I think, is one of the biggest ones; we all know how big an advantage OpenShift gives to security.
The fact that we were able to run in separate projects, which are easy to segregate from each other, and that it is easy to assign user roles and control access, was great for us and great for the customers hosting us, because they could sleep well at night: they knew that we don't have admin permissions, that we have a segregated project, and that we can't hurt their cluster in any way. It gave us peace of mind as well.
There is also the wildcard: the ability to get SSL for free, without managing certificates, by working with the asterisk (wildcard) cluster domain on all routes within the cluster. Again, very simple and very straightforward. And multi-tenancy: the platform can be deployed in a multi-vendor cluster with no cluster-level admin permissions, so we can run multiple projects on the same cluster with different user permissions, as I mentioned before.
A
Furthermore,
eventually,
I
could
tell
you
without,
of
course
mentioning
the
customer
that
we
had
a
very
big
oem,
very
big
customer,
that
we
moved
our
platform
due
to
its
specific
needs.
More
than
once,
we
moved
it
from
azure
to
on-prem
to
gcp
and
all
of
this
and
was
really
straightforward.
That's a great trait, and OpenShift helped us utilize it very well. So that's basically OpenShift and the cybersecurity and data management platform that we built. There is, of course, some time for questions and answers from the audience.
So this actually was a challenge that eventually became an advantage; it's a great question. The fact that at one of our customers we were running as a guest, without any root permissions, made some of our components, such as the machine learning operations, for instance, a little more complicated, because we didn't have all the permissions we needed. But what it pushed us to do was make the configuration changes so that the entire platform, including the MLOps part, runs without root permissions.
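In Kubernetes/OpenShift terms, that kind of hardening largely comes down to declaring workloads that run as non-root and without privilege escalation. Here is a minimal sketch using the official kubernetes Python client; the container name, image, and UID are hypothetical, not Upstream's actual manifests.

```python
# Sketch of a non-root pod spec built with the kubernetes Python client.
from kubernetes import client  # pip install kubernetes

container = client.V1Container(
    name="mlops-worker",                                   # hypothetical name
    image="registry.example.com/upstream/mlops-worker",    # hypothetical image
    security_context=client.V1SecurityContext(
        run_as_non_root=True,            # refuse to start if the image wants UID 0
        run_as_user=1001,
        allow_privilege_escalation=False,
    ),
)
pod_spec = client.V1PodSpec(
    containers=[container],
    security_context=client.V1PodSecurityContext(run_as_non_root=True),
)
```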
So we actually evolved the platform to be able to run completely as a guest, and that takes us to the next level, because it helped us be even more agnostic: not just agnostic to the cloud, but also very lightweight in the permissions required to run the platform. That is very convenient and very important when you deploy yourself at different customers for whom, again, security is very important. So that was a challenge that we eventually turned into an advantage.
So, to be honest, at the beginning, obviously, I think everyone who sets out to develop data platforms, data engineering platforms, in some way, especially these days, wants at first just to rely on a specific cloud, because you get a lot of things for free: you get a lot of managed services, you get a lot of good stuff for free, and you can rely on that.
But very, very soon we saw that this is not feasible, due to the nature of our customers, due to the sensitivity of the data, and due to the fact that we need to be extremely flexible. So I would say that almost from the beginning, when we started designing the platform, we thought: we need to be cloud agnostic to some extent.
We didn't realize how big that extent would eventually be, because in the end, for almost every component, we said: okay, listen, we have to find a replacement for this managed service. And eventually we made it; that's exactly what we did. OpenShift, as I mentioned, helped us with that, because wherever we had an outlier where we said, okay, listen, we need that, we could replace it with an operator in order to have this peace of mind.
A
We
almost
did
the
entire
platform
to
be
cloud
agnostic,
and
that
was
super
important.
It
was,
it
was
crucial
for
our
business.
It
has
many
advantages.
It's
the
flexibility
of
of
the
platform
is
is,
is
amazing
and
that's
that's
one
of
our
biggest
advantages,
so
yeah,
it's
it
was
very
important
and-
and-
and
I
think-
and
I
think
we
we
knew-
that
we
need
to
have
that
again.
Yeah, great. I mean to say, cloud technologies are evolving so fast on a day-to-day basis, and also keeping your platform agile, I think that's one of the USPs of any platform provider. Thanks a lot, Gil. That's it from my side; just checking if there are any other questions. So there is one question that came from JC: can you give us an example of what kind of vulnerability can be reduced or detected by data normalization and cleansing in this architecture?
Of course there are, unfortunately, many, but let's just take, for instance, injection. You can have signal injection into the car, you can have even that, or SQL injection into the telematics server, for instance, or some other kind of attack; all of those things need to be analyzed in order to understand an anomaly. Or, for instance, let's say a brute-force attack: in order for a detection platform to understand whether it was an attack or just, you know, a large number of requests that happened to arrive right now, you need to analyze the pattern.
The machine needs to analyze the pattern to understand the anomaly, and in order to do so, all the data that you have from all the vehicles, and specifically from a particular vehicle, and from the servers and from the applications, needs to speak the same language; it needs to be unified. If it's not unified, you cannot start running machine learning models on it, and it's very hard to identify patterns in it. But once you clean the data, you unify it, you put it in a unified schema.
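To make the brute-force example concrete, here is a toy sketch of the kind of rule (or ML feature) that only becomes possible once every source reports events in one unified schema; the signal name, window, and threshold are illustrative assumptions.

```python
# Toy sliding-window brute-force check over unified signals (assumed thresholds).
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
THRESHOLD = 20  # assumed: this many failures in the window looks like brute force

attempts: dict[str, deque] = defaultdict(deque)

def observe(event: dict) -> bool:
    """Return True if this unified event completes a brute-force-like pattern."""
    if event["signal"] != "remote_unlock_failed":  # hypothetical signal name
        return False
    vid, ts = event["vehicle_id"], datetime.fromisoformat(event["timestamp"])
    window = attempts[vid]
    window.append(ts)
    # Drop attempts that fell out of the sliding window.
    while window and ts - window[0] > WINDOW:
        window.popleft()
    return len(window) >= THRESHOLD
```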