OpenShift Architecture Evolution at Elisa
Antti Seppälä, Service Manager (Elisa)
OpenShift Commons Gathering Helsinki 2018
So today, ladies and gentlemen, my name is Antti, and it's really good to be here. I'm here to talk to you a little bit about the evolution of our OpenShift architecture at Elisa.
Well, it was nice to hear Augusta mention that the company is over 100 years old. Elisa is even older than that: it was founded in 1882. So, talking about legacy... Anyway, a couple of words about myself first.
I work as a service manager at Elisa. Like I said, that means I lead the DevOps team. We are responsible for the cloud offering that we provide to our developers and the infrastructure behind it, and we also maintain some tools and technologies and participate in architecture discussions with the devs. In general, it would be fair to say that we try to make the lives of the developers easier. I also get to give input on strategy, take part in partnership negotiations, do budgeting, and handle the various other duties that managers generally get to do. But that's only my day job.
During the night, I turn into a Linux kernel hacker, so I really do enjoy turning the bugs, or oopses, that you can see on the right side of the screen into upstream commits that look something like the left side of the screen. But that's just the way I'm weird.
You know, coming from a kernel background and being this simple-minded, low-level guy, I'm not ashamed to admit that containers sometimes feel a bit overwhelming to me. The technology is rather new and it's evolving rather rapidly, so it may feel like it's hard to keep up. So in the next part of the presentation I will go through a little bit of the history: how the container stack that we currently run at Elisa became what it is today.
Well, I'm not going to go into the very early beginning; I'm just going to start by mentioning that it all basically started when we introduced Ansible for software deployments. It worked rather well. We had our existing virtualization environment, and we offered Ansible as the tool for the devs to deploy their stuff into production. They were really happy about it, and eventually we got good adoption. But with good adoption came an increasing number of requests for the ability to control the infrastructure with Ansible as well.
Well, at the time we couldn't really do that, so we went and created another set of technologies that could. This one was, and actually still is, based on OpenStack, so you get API availability and you don't need to file tickets and so on and so forth. The idea was to run Heat templates, which described the infrastructure, and then run Ansible on top of that to provision the software, so you could control the whole thing.
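Just to make that flow concrete, here is a minimal sketch of the two steps, assuming the OpenStack and Ansible command-line clients are installed; the stack name, Heat template, playbook, and inventory file names are hypothetical.

```python
#!/usr/bin/env python3
"""Minimal sketch of the Heat-plus-Ansible flow described above.
The stack name, template, playbook and inventory are hypothetical."""
import subprocess

STACK_NAME = "demo-app"        # hypothetical stack name
TEMPLATE = "infra.yaml"        # Heat template describing the infrastructure
INVENTORY = "inventory.ini"    # inventory pointing at the servers Heat created
PLAYBOOK = "site.yml"          # Ansible playbook that deploys the software

# Step 1: create the infrastructure from the Heat template and wait for it.
subprocess.run(
    ["openstack", "stack", "create", "--wait", "--template", TEMPLATE, STACK_NAME],
    check=True,
)

# Step 2: run Ansible on top of the new servers to provision the software.
subprocess.run(
    ["ansible-playbook", "-i", INVENTORY, PLAYBOOK],
    check=True,
)
```

The point is simply that the infrastructure description and the software provisioning live in the same pipeline, so the devs can control both ends of it.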
Well, this became immensely popular; the devs loved it and we were happy about that. But during that time the container revolution also happened, and after a little while we started to investigate what people were actually deploying into production with this beautiful stack. It turns out they were deploying Kubernetes. They were using the stack to deploy Kubernetes and then running the rest of their stuff on top of that, and actually this was fine in our opinion; we liked it.
We even endorsed it to some extent, so we created some installation scripts to help the devs install Kubernetes, because we saw that it would probably become the mainstream container orchestrator at some point in the future. Well, boy did they install it.

Eventually we had multiple Kubernetes clusters sprouting up, with different configurations, different sizes, different features, and different operations. And if you, like us, are in an organization where there is a separate team or department responsible for operations, 24/7 support, and on-call duty, well, this turned out to be more difficult than we would have hoped. At the time, Kubernetes basically didn't do multi-tenancy at all. I don't know if it does now, but at the time, at least, it was really, really hard. So we set our sights on another container orchestration solution, one that would be based on Kubernetes and would be compatible with the multi-tenancy requirements of our multiple teams, because we didn't want teams sharing the environment.
Well, it's pretty easy to tell that OpenShift fit the bill quite nicely: it does multi-tenancy and meets all the requirements, and it's based on Kubernetes, which the devs already loved. So eventually the evolution of the software stack that we offer looks like this. In the next part of this presentation, I'm going to talk to you a little bit about how the OpenShift installation itself has evolved over the years. Our first attempt at installing it looked a little bit like this.
Then we also added a router node, which was used to direct the traffic into the cluster. Well, looking at the picture afterwards, it's pretty easy to spot the single point of failure: we pretty soon discovered that you can't take the router down for maintenance without bringing the whole cluster down. So another version of the cluster was set up where we added a backup router, configured with keepalived to switch over to it.
Still, the adoption rate wasn't at the level we were really hoping for, and again, looking into it and asking the developers what was going on, or why they weren't using this, it became clear that most of the teams actually have some external dependency that they would like to access: an external database, an external API, or an external whatever. They also stated that they did not want to share the access to that external service with the other users of the cluster, and at the time OpenShift couldn't really separate the external traffic between the projects.
We found a tech preview feature that could do that for us. Now, I don't recommend utilizing tech preview features in production, because there were some pains in setting it up, but luckily, with upstream collaboration and collaboration with Red Hat support, we were able to make the feature stable enough for our use. Actually stable enough that, with the usage increasing, we ended up sharding the routers once more to really have the bandwidth available for the increasing traffic. And this is pretty much the architecture of OpenShift as it stands today in our data centers. But when I say data centers, I really do mean it.
We ended up discovering that it's rather nice to have multiple data centers set up, for HA reasons. If you want to do upgrades in one of them, it's pretty nice to have data center number two available for operations at the same time. So we ended up setting up another cluster in a second data center.
We also did it so that the developers had the chance to choose which one they would like to use: if they wanted to use data center one, they could, and if they wanted to use number two, that was fine too. We obviously gave strong recommendations to use both at the same time, and that was kind of what we wanted them to do, because that would have allowed us to handle maintenance rather nicely. But evidently they did not, and when I asked why they didn't use both of them at the same time, it turned out that it's really hard for them to do load balancing between the clusters in a way that exposes the service from both of them at the same time and keeps traffic flowing even if one of them is taken down for maintenance. DNS round-robin didn't really work for that: most browsers, for example, don't really cope with that sort of mechanism for balancing the load.
So we built a small piece of software to solve this. It listens to OpenShift API events related to route creation and updates, and it can provision an external load balancer when needed. If you set up a route in data center one, it will create it in the load balancer, and traffic will start flowing from there. If you then add the same route to data center number two, it will be added to the load balancer pool as another destination for the traffic.
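To give a feel for how a controller like that works, here is a minimal sketch in Python that uses the Kubernetes client to watch OpenShift Route objects. It is only an illustration of the idea, not the open-sourced project itself; the `update_external_load_balancer` function and the `dc1` name are hypothetical placeholders.

```python
"""Minimal sketch of a controller that watches OpenShift Route events.
Illustration only; the load-balancer call is a hypothetical stub."""
from kubernetes import client, config, watch


def update_external_load_balancer(host, datacenter):
    """Hypothetical stub: register `host` from this datacenter as a
    backend in the external load balancer pool."""
    print(f"Would register {host} from {datacenter} in the LB pool")


def main(datacenter="dc1"):
    # Load credentials from the local kubeconfig (inside a cluster you
    # would use config.load_incluster_config() instead).
    config.load_kube_config()
    api = client.CustomObjectsApi()

    # Routes are an OpenShift custom resource in the route.openshift.io group.
    stream = watch.Watch().stream(
        api.list_cluster_custom_object,
        group="route.openshift.io",
        version="v1",
        plural="routes",
    )
    for event in stream:
        route = event["object"]
        host = route.get("spec", {}).get("host")
        if event["type"] in ("ADDED", "MODIFIED") and host:
            update_external_load_balancer(host, datacenter)


if __name__ == "__main__":
    main()
```

Running one watcher like this against each cluster would let the external load balancer end up with the same route registered from both data centers, which is the behaviour described above.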
This piece of software was actually open sourced, or announced to be open sourced, at the last OpenShift Commons Gathering in San Francisco last May. If you want to take a look, there's the GitHub URL where you can find it. Okay, well, this setup is pretty much the one we run nowadays. But going even further, we discovered that there are different kinds of workloads that people want to run.
Basically, the difference we ended up discovering is that there is the production workload, which you expect to be rather highly available and highly responsive, and which is generally quite predictable. But then there is the development workload, where people run compilations, load tests, unit tests, or AI training cycles.
We hope that it will enable automation with smaller iterations, which in turn enables faster learning for the dev teams; that's quite a valuable thing to get. We also aim to shift the responsibility for the end-user experience to the teams themselves with the cloud technologies.
A good example of a team that has been following these guidelines is actually the Elisa Viihde entertainment service's video-on-demand store, or Vuokraamo in Finnish, which at the moment runs on top of OpenShift and is a sizeable operation as it is.
It already merits having several data centers in use to keep that kind of store up and running all the time. But that's not, in my opinion, the most interesting thing that we do on these clusters. So I'm going to talk a little bit about the coolest stuff that we are currently working on, which is the self-optimizing network.
Being a telecom operator, we have these base stations, which may have multiple cells: let's say 2G, 3G, 4G, and 5G is coming up. Each of these cells may have multiple antennas directed in different directions, and each of those may have somewhere from hundreds to thousands of parameters that you can fine-tune to create optimal coverage for your cell phones.
Now, when you change one of those parameters, the neighboring cells need to be adjusted, because their usage changes as a result, and when you adjust those, the next ones need to be adjusted. So you end up with a cascading ripple effect that spreads across your entire network from a small parameter change in one of the base stations, and we figured that this is no longer a task for humans to do.
So, if you are actually interested in learning some more about this, well, in either order: if you want to learn more about the product that we are trying to build around this self-optimizing network work, visit elisaautomate.com, and if you are truly interested in becoming an expert in the field and working with these technologies, go check out elisa.fi/jobs. We have several openings available there. But with that, I think it's time to say thank you all, and it was very nice that you had me.