From YouTube: OCG Berlin 2017 - Big Data on OpenShift at T-Systems
Description
From the March 28th 2017 OpenShift Commons Gathering in Berlin @KubeCon https://Commons.openshift.org/Gathering
A
You know, I was on one of the first Commons calls that Diane always so nicely moderates, and there were like, I don't know, 20 people roughly, and [unclear] of them were Red Hat guys. So we were joining that, and in 2014 we were doing a lot of stuff with, quote unquote, the old version of OpenShift, and yeah,
A
we went through quite some challenges with that version, to be honest. And in 2015, you know, we were basically asking, "Hey, did we make the right decision to go with OpenShift?", because all of our large customers at T-Systems (and for those of you who don't know T-Systems, we're the enterprise arm of Deutsche Telekom) were making the decision to go with Cloud Foundry, right, so the Daimlers, the Boschs of the world. They were saying, "Okay, Cloud Foundry." And then in 2015,
A
you know, Chris Morgan from the Red Hat team and Ashish, they came around and said, "Hey, we're going to do something here with Kubernetes, we completely re-architect it," right? It's like, okay, that looks promising. And thank God we stuck with the version. Now things are really turning into the positive and we see a lot of customers coming in now. Let me introduce... next slide, please. So let me introduce myself real, real quick.
A
You know, carrying a degree in informatics, as we call computer science, and also an MBA, and I like everything that's fast and that has two wheels. So we probably need to put two wheels on OpenShift. I even do that in my off time. Two kids that keep me busy in the off time, and one little baby that keeps you busy.
A
That's called AppAgile. That's our product that we wrapped around OpenShift and that we've been working on for, like, three years now, almost. Next slide, please. You know, Chris Wright already mentioned a lot of the points. Please, it's animated! You know what makes the life of a developer really miserable: the provisioning of environments still, these days, in the enterprise,
A
takes weeks, you know, if physical hardware is involved, and especially around a big data environment, it still takes 12 to 13 weeks till a developer can log on to something and install his stuff on it. In 12 to 13 weeks, I mean, the developer has gone through, I don't know, five, six sprints already, so that shows how things need to change here. The data center folks, Chris also mentioned the data center folks, they think they are the real IT guys, right, and the developers think, "No,"
A
"We are the real IT guys, we write the stuff out there." But the data center folks, they say, "We get up at night at two o'clock, you know, and make things work again when things break." And the next point is, big data seems to break the bank, right. If you think about, you know, terabytes and petabytes, and these days we talk exabytes of data, these things get, you know, expensive from the get-go, and the licenses for commercial products
A
are, you know, breaking the bank too. And I have a slide, or a couple of slides, on cloud performance when it comes to big data. Those couple of slides will, in the end, underline and build the foundation of that: cloud performance really sucks for productive big data workloads. And obviously, in the enterprise, data needs to be protected. You know, we live here in Germany.
A
We have this data protection, data privacy legislation which, you know, kind of prohibits cloud adoption, but things are moving and we need to take care of that. People who are using the cloud are, you know, feeling comfortable with what we can offer to them. And then, when things are developed, you know, the whole fun starts over and over again.
A
So next slide, please. So what we came up with: a year ago my boss asked me, "Hey Thomas, you have been so successful now with OpenShift and with platform-as-a-service. Don't you want to take over the big data team as well?" And I was scratching my head and said, "Let me think about it for one weekend." And then I remembered what had happened to me. In the early days we had a couple of smaller customers that approached us, and they already had, you know, big data use cases in mind.
A
They wanted to develop mobile apps that attract a lot of data, so they were asking us, "Thomas, tell us about the throughput, the performance of your underlying cloud." So we gave them the numbers and they said, "Yeah, we have to walk away. This will not suffice for our needs, ultimately."
A
So that really, you know, made my heart bleed at the time, because if you have to let a customer go, that really sucks. So then I said, okay, I'll put things together, PaaS, platform as a service, and big data, because then we have both things together and I'll have an underlying infrastructure here that supports both. You know, the cloud on the left-hand side for smaller volumes, when we talk about gigabyte sizes, so you can provision that fast,
A
so the developer can start using it, using containerized technology, using technology from Cloudera, from Hortonworks, MapR, and they can try it out and not break the bank, because if it's not working, they have not lost a lot of money. If it's working great, they can stick with the same platform and scale out for big volumes. You know, we say this is for multiple terabytes.
A
We are using the same platform ultimately, and for huge volumes, multi-petabyte, we then say we cut the platform and move you out to a real bare-metal platform.
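The three tiers just described can be sketched as a simple lookup. This is only an illustration: the talk names rough sizes (gigabytes, multiple terabytes, multi-petabyte), so the exact thresholds and the names `choose_tier`, `TB` and `PB` are assumptions, not T-Systems' actual sizing rules.

```python
# Hypothetical sketch: pick a platform tier from the expected data volume.
# The thresholds are assumed for illustration; the talk only gives rough sizes.
TB = 1000**4  # bytes in a terabyte (decimal)
PB = 1000**5  # bytes in a petabyte (decimal)


def choose_tier(data_bytes: int) -> str:
    """Return the tier the talk associates with a given data volume."""
    if data_bytes < 0:
        raise ValueError("data volume cannot be negative")
    if data_bytes < TB:
        # Gigabyte sizes: provision fast on the cloud, in containers.
        return "cloud, containerized (small volumes)"
    if data_bytes < PB:
        # Multiple terabytes: stay on the same platform and scale out.
        return "same platform, scaled out (multiple terabytes)"
    # Multi-petabyte: move off the shared platform to bare metal.
    return "bare metal (multi-petabyte)"


if __name__ == "__main__":
    for size in (50 * 1000**3, 20 * TB, 3 * PB):
        print(size, "->", choose_tier(size))
```

The point of the sketch is only that a developer starts small and cheap, and the move to another tier is driven by volume.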
What does that mean? I'll give you an example: car manufacturers are these days all, you know, thinking about autonomous driving, right. They have that in mind, and for test driving, from a legal perspective, they need to document that they have, you know, done their diligence.
A
So a lot of video sequences are coming in at a high resolution so that they can document what happens with the car in a specific situation, so that needs to be stored somewhere. And there is a car manufacturer here in the southern part of Germany; he's talking to us and we're running a proof of concept with him of, you know, storing exabytes of data. That's something that really...
A
So, as I said, the underlying infrastructure: you know, we started with vCloud from VMware. Why did we do that? Because it was right there at the time. Two and a half, three years ago it was just there. We had that as a T-Systems cloud. Today we are actually offering the same services on Azure and OTC.
A
Those guys are really now coming to us and say, "Okay, we want to be part of the game here," and they provide their applications in a containerized version, so that the developer here on the left-hand side can already use their technology and then see how they can leverage that technology. Next slide, please.
A
Now we ran some proofs of concept and some tests with it, especially around the big data Hadoop ecosystem. What we've done: for this proof of concept we used HDFS, obviously, from Hadoop for the file system, MapReduce 2, Spark in various standard flavors, Scala, Python and R, Hive and Tez, and for the metadata storage
A
we used a MySQL pod, in an extra pod. And while you deploy it, you can say how many slaves you want for the Hadoop cluster, and you can even, you know, say whether you want to have it persistent or ephemeral. And that system that we've tried out is really designed for small data users, for the development users. Next slide, please. So you see up there the TSI AF, which actually stands for... in former times we were called AppFabric.
A
We had to do away with that name because that's a Microsoft server product name, so we had to call it AppAgile. And you're all familiar with this kind of setting, and so we created the Hadoop quick starts on it. If you go to the next slide, this is where you see how Spark is being started on the AppAgile Hadoop master.
A
You can still see where we are on the line here, and please go to the next slide. So this is basically one of the use cases where we implemented the whole Hadoop platform, created some databases, uploaded publicly available data to it and did some research. So this is a nice showcase to tell the developers what they could do. Next slide, please. When I asked my architect, "Could you, in preparation for this meeting here,"
A
"could you please send me some nice, fancy architectural diagrams?", he sent me this thing. I said, "Is this it?" He said, "Yeah, it doesn't get more complicated." So basically, you know, the Hadoop master, a pod, and then a MySQL pod for the metadata, and then the slaves, you scale them out. So pretty simple, but it took us, you know, quite some time until it really worked. It was like, I don't know, two, three months until my guys really figured it out and they had it running on that environment.
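The setup described here (one Hadoop master pod, one MySQL pod for the metadata, a chosen number of slaves, persistent or ephemeral storage) can be modeled in a few lines. This is a sketch under assumptions, not the actual AppAgile quick start: the class name `HadoopQuickstart`, the pod names, and the idea that the storage choice applies to every pod are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HadoopQuickstart:
    """Hypothetical deployment parameters; all names are illustrative only."""

    slaves: int = 2           # number of worker ("slave") pods to scale out to
    persistent: bool = False  # False means ephemeral storage

    def pods(self) -> List[str]:
        """List the pods this deployment would consist of."""
        if self.slaves < 1:
            raise ValueError("need at least one slave pod")
        # Assumption: one storage mode for the whole deployment.
        storage = "persistent" if self.persistent else "ephemeral"
        names = [
            f"hadoop-master ({storage})",   # the single master pod
            f"mysql-metadata ({storage})",  # extra pod for the metadata store
        ]
        names += [
            f"hadoop-slave-{i} ({storage})" for i in range(1, self.slaves + 1)
        ]
        return names


if __name__ == "__main__":
    for pod in HadoopQuickstart(slaves=3, persistent=True).pods():
        print(pod)
```

Scaling out then only changes the `slaves` value; the master and metadata pods stay single.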
A
Now this thing is really, you know, some sort of a break in the presentation. You see this AppAgile logo here, you know, where OpenShift is really eighty percent of it, and you have Azure on the right-hand side. Now, these two arrows between those two mean that, number one, we run on Azure now. Number two, there is this concept of a data trustee with Microsoft, where Deutsche Telekom runs the Azure platform in Germany, right, as their data trustee.
A
So that means the data center is a Deutsche Telekom data center, wherever in Germany it runs. The people who are running it are T-Systems people. So no one other than people who are bound to the data legislation that is valid in Europe and Germany is actually accessing this platform, right. And then, I
A
think at the beginning of this year, Red Hat was announcing the partnership with Microsoft, implementing OpenShift also on Azure, so that now we have all forces [unclear], right: Microsoft Azure is the operator, and OpenShift is the de facto, you know, development engine. So next slide, please. So what we were trying now is to really underline what we said, that clouds for productive big data workloads really, you know, kind of lack, so we were using a Hortonworks deployment that was offered on Azure.
A
So next slide, please. The deployment of the whole thing took 15 to 20 minutes, pretty fast for a whole Hortonworks big data environment. So that's fine. We access the Ambari, basically the environment, through the Ambari URL or through SSH from the internet to the master, right. We took Ubuntu as an OS and, as I said, HDP 2.5, and chose a replication factor of three. Now, those of you who know Hadoop, that's a standard, right. So we had like 266 gigabytes of usable storage.
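How a replication factor turns raw disk into usable storage is simple division: every block is stored that many times. The sketch below is only that arithmetic. The raw capacity of the test cluster is not given in the talk, so the 800 GB used here is an assumed value, picked because it lands near the quoted figure; the function name `usable_storage_gb` is hypothetical, and space reserved for non-HDFS use is ignored.

```python
def usable_storage_gb(raw_gb: float, replication_factor: int) -> float:
    """Usable HDFS capacity when every block is stored replication_factor times."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_gb / replication_factor


if __name__ == "__main__":
    # Assumed raw capacity of 800 GB (not stated in the talk), replication 3.
    print(round(usable_storage_gb(800, 3), 1))
```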
A
So we said, okay, a 50-gigabyte TeraSort should use like 2.8 percent of the time, and, you know, did all the math around that. So in theory, this TeraSort on a small scale should take roughly two minutes of run time until it's done, right.
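To get a feel for what a two-minute run implies, the sketch below computes the sustained rate at which 50 GB would have to be processed in that time. It is back-of-the-envelope arithmetic, not the calculation from the slide: it assumes decimal units (1 GB = 1000 MB), counts the data only once, and the function name `required_throughput_mb_s` is hypothetical.

```python
def required_throughput_mb_s(data_gb: float, seconds: float) -> float:
    """Sustained rate in MB/s needed to process data_gb within the given time."""
    if seconds <= 0:
        raise ValueError("run time must be positive")
    return data_gb * 1000 / seconds  # decimal units: 1 GB = 1000 MB


if __name__ == "__main__":
    # 50 GB in roughly two minutes (120 seconds), as quoted in the talk.
    print(round(required_throughput_mb_s(50, 120), 1), "MB/s")
```

A sort reads and writes the data more than once, so the real I/O demand on the disks is higher than this single-pass figure.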
And, you know, one side note down at the bottom here: we had only one local disk, which usually is not very optimal, so we had some bottlenecks here. Next slide, please. Now, this is really the outcome, right.
A
We had like eighty percent wait I/O on that environment, during the map phase and during the reduce phase. On that machine, really no reduce could run, right. So basically the outcome was, and next slide, yeah, disk busy. Obviously, if you have wait I/O, disk busy is another.
A
So next slide. Now, my presentation is entitled, you know, okay, big data and PaaS as being, you know, siblings tied at the hip, but we also think it's an enabler for DevOps in the enterprise. And over the last two, two and a half years we were really struggling with DevOps, the thought of DevOps in the enterprise. I don't know who of you is working in a company with 500 employees and more? Hands up.
A
Next slide, please. What we came up with is this little propeller, where you have on top the customer. The customer wants to be fast, right. He has the developers inside, waiting till the environments come along. And you have the engineering guys; you know, they're agile or service oriented. And you have the operations; they are following the ITIL processes. And in the beginning, you know, you have people working together, that's nice, but at some point, when the first release is out, you know, things get difficult.
A
They need to be operated at night, as I said, and, you know, the ops guys still go back to their usual behavior from the past and they say, "If you want to come out with a new release, you can, but, you know, six weeks from now is the right time." Now, what we implemented is the concept of what we're calling the DevOps engineer, for lack of a better term, which actually builds the layer between the developers and the ops guys.
A
So those kind of guys speak the language of the ops guys, but they also know how developers think. So if the developers come along and say, "Yeah, we want to deploy something that needs some changes on the platform," those DevOps engineers are talking to the ops guys. That concept we have deployed at Toll Collect. Maybe you guys know that the road charging system here in Germany is going through a large re-architecturing, and they actually have like four people on that DevOps engineering level. And after they had that implemented,
A
the whole system calmed down very much. They tell the guys how to use the platform, and if they have requests, they are filing the requests with the ops guys, they talk to them, so that the screaming got a lot, a lot less noisy in the environment, in the system. Right. So that's about it: big data and DevOps from a T-Systems standpoint, which is called AppAgile. Thanks, guys.