From YouTube: Linux Container Internals by Great Umegbewe
Description
The microservices-driven architecture would not be thriving today if it were not for matching runtime environments: the container runtimes. Containers, however, would not be possible without features that enable the virtualization of resources such as processes, filesystems, and devices. In this talk, we will explore the following: * How containers work. * The features in the Linux kernel that make them possible, specifically cgroups, namespaces, and UnionFS. * Container runtimes. * Container storage and networking.
---
KCD Africa 2022 is the 2nd iteration of the Kubernetes Community Days Africa, a CNCF-powered free community event. Visit https://kcdafrica.com for more information.
A
He's a DevOps and Linux engineer with a passion for open source and communities, so you can talk to Great about Linux, everything DevOps, and everything cloud systems. When he's not into all of this, he's out there boxing and doing really random stuff. So, over to you, Great.
B
Okay, thank you very much for having me. My name is Great, and I'll be speaking about container internals, that is, the features that make containers possible. Let me just share my screen; I'm trying to do that now. Can you see my screen?
B
Okay, so my screen is open. All right, I don't know if [inaudible].
B
Okay, so I'll be speaking about container internals: what really makes containers possible and how they really work. So, meet me. My name is Great, 0xgreat on Twitter, and I'm an SRE/DevOps engineer at TCI. TCI is a company based in the UK, and we basically help organizations move to cloud native and also build products for them. I'm a Linux fanboy, pretty much. As far as I remember, I've been using Linux since about the age of nine.
B
We had Linux computers at home, so I tinkered with the kernel and containers and the rest. I love the internals.
B
I also like [inaudible]. I'm generally funny and [unclear] sometimes, so that's that. Just lightening the mood: this is how I walk around, and as soon as you say hi to me, I smile. So in case you see me anywhere and I'm frowning, don't hesitate to say hi; I'm always smiling. Now, just a disclaimer so that we can set things straight: containers don't run on Docker.
B
That is a misconception, or a misrepresentation, because containers don't run on Docker. Docker is just one of several container engines that interact with container runtimes, like containerd, runc, and the rest of them, which in turn ask the kernel to set up containers. Other container engines you can find are CRI-O and Podman. Those are some examples of container engines you can find around.
B
So the outline is pretty much this: containers, then we'll look at the building blocks (cgroups, namespaces, and copy-on-write). Then we'll talk a little about container runtimes (Docker, runc, and systemd-nspawn), and then we'll do a little demo.
B
So what are containers? I'm sure you have heard what containers are. Containers are a form of operating-system virtualization. This is not full virtualization, because containers depend on the host system's kernel; you don't get a separate kernel to set up. So containers are a form of virtualization and isolation, because containers are isolated from their host system.
B
They have their own PIDs, that is, process IDs, and they have their own network and all that. Containers help you package application code together with its dependencies so that you can run it across environments. So containers come with portability: you can run them in different environments with no need to start installing dependencies. Let's say, for example, you're on Windows or macOS, and I built a Python application, and I give you the code.
B
Okay, I want you to run this application. Instead of you starting to install Python and then installing the libraries, say, for example, pandas and requests, all of that is already done. It's already packaged into a container, so you can just run the container and everything will run perfectly. One thing to take note of here is that containers run as processes on the operating system.
B
This is what we should take note of here, because I won't be talking about containers from a higher level. I'll be talking from a lower level, from the operating-system level: how containers are run on the system. So I'll be talking about containers as processes. First, we have control groups.
B
Control groups (cgroups) are a special mechanism provided by the Linux kernel that allows us to allocate resources such as CPU, memory, devices, and network to a group, or set, of processes. Like I said, containers run as processes, so cgroups enable us to limit the memory available to a container (a container is a process, of course), limit the CPU available to a container, and all that.
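To make the idea of limiting memory, CPU, and process count concrete, here is a minimal Python sketch. It assumes the cgroup v2 interface, where limits are plain text written to control files named `memory.max`, `cpu.max`, and `pids.max`; the talk itself does not say which cgroup version is in use. The function names and the example numbers are hypothetical.

```python
from pathlib import Path


def cgroup_v2_limits(memory_mib=None, cpus=None, pids=None, period_us=100_000):
    """Return {control_file: value} for a cgroup v2 directory.

    Passing None for a resource leaves it unlimited ("max").
    """
    limits = {}
    # memory.max takes a byte count, or the literal string "max".
    limits["memory.max"] = "max" if memory_mib is None else str(memory_mib * 1024 * 1024)
    # cpu.max takes "<quota> <period>" in microseconds; 0.5 CPU is half the period.
    quota = "max" if cpus is None else str(int(cpus * period_us))
    limits["cpu.max"] = f"{quota} {period_us}"
    # pids.max caps how many processes may exist in the group.
    limits["pids.max"] = "max" if pids is None else str(pids)
    return limits


def apply_limits(cgroup_dir, limits):
    """Write each value to its control file.

    On a real host cgroup_dir would be a directory under /sys/fs/cgroup
    and the writes would need root; here they are ordinary file writes.
    """
    for name, value in limits.items():
        (Path(cgroup_dir) / name).write_text(value + "\n")


if __name__ == "__main__":
    # Hypothetical limits: 256 MiB of memory, half a CPU, at most 100 processes.
    for name, value in cgroup_v2_limits(memory_mib=256, cpus=0.5, pids=100).items():
        print(f"{name}: {value}")
```

A container engine does roughly this on your behalf when you pass it memory, CPU, or PID limits, and then moves the container's process into that group.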
As for the reason cgroups exist in the first place, [unclear]. We have several subsystems: memory, pids, cpuset, freezer, and blkio. The subsystems are what you can control, that is, the resources you can limit or assign to a process.
B
I'll just talk about a few of them. blkio sets limits on reads from and writes to block devices. net_cls allows you to mark the network packets coming from a group. cpu uses the scheduler to give the tasks in a cgroup access to processor resources. And pids sets a limit on the number of processes in a group, so we can limit the number of processes that can run in a container. cgroup subsystems like pids make that possible.
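One way to see which of these subsystems apply to a process is the file `/proc/<pid>/cgroup`, where each line has the form `hierarchy-ID:controller-list:cgroup-path`. The sketch below parses such lines in Python; the `CgroupEntry` type and the sample paths in the usage note are made up for illustration.

```python
from typing import NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: tuple
    path: str


def parse_cgroup_line(line):
    """Parse one 'hierarchy-ID:controller-list:cgroup-path' line."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    # On cgroup v2 the controller list is empty, as in "0::/some/path".
    names = tuple(name for name in controllers.split(",") if name)
    return CgroupEntry(int(hierarchy_id), names, path)


def read_cgroups(pid="self"):
    """Read and parse /proc/<pid>/cgroup (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return [parse_cgroup_line(line) for line in handle if line.strip()]
```

For example, `parse_cgroup_line("4:cpu,cpuacct:/docker/abc")` yields hierarchy 4 with the `cpu` and `cpuacct` controllers, while a cgroup v2 line such as `0::/user.slice` yields an empty controller tuple.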
Now, for namespaces. While a cgroup says, "Okay, I'm going to limit what you can use," namespaces limit what you can see. I can give an example using Kubernetes namespaces. Let's say we have a namespace called dev and a namespace called prod.
B
Kubernetes objects like ReplicaSets, Secrets, and the rest of them can be grouped into one namespace, while another namespace, say prod, has its own objects. A pod in prod cannot really access the Secrets in dev, because there is a form of isolation between them. So just take note: cgroups limit what you can use, in quantity, and namespaces limit what you can see.
We have different types of namespaces. We have the mount namespace, mnt, which controls mount points. Upon creation of a container, the mounts of the current mount namespace are copied to the new namespace, but mount points created afterwards do not propagate between those namespaces.
We also have the PID namespace. It provides processes with an independent set of process IDs. The PID namespace is what makes containers think they have this form of isolation. For example, let's say we have an nginx container (I don't want to mention Docker). nginx might be running with a PID of four or five or whatever inside the container, but outside that container, on the host system, it will be running as a different PID. This is what fools containers into thinking that they have their own PIDs and that they are in control of what they can do.
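The two PIDs of such a process can be observed from the host in `/proc/<pid>/status`, whose `NSpid:` line (available on Linux 4.1 and later) lists the process ID in each nested PID namespace, outermost first. A small Python sketch of reading it follows; the status text in the usage note is invented sample data, not output from the talk.

```python
def parse_nspid(status_text):
    """Return the PIDs on the 'NSpid:' line of a /proc/<pid>/status dump.

    The first value is the PID as seen from the procfs being read
    (usually the host); the last is the PID in the innermost namespace.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def read_nspid(pid="self"):
    """Read the NSpid values of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_nspid(handle.read())
```

Given a status file containing `NSpid:	48213	5`, the function returns `[48213, 5]`: the process is PID 48213 on the host and PID 5 inside its container.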
B
This is one of the features that namespaces enable. Next is the network namespace, net, which virtualizes the network stack. Upon creation, a network namespace contains only a loopback interface. So once you create a container, by default it's only a loopback interface that you're going to have. I'm going to show an example of this in the demo.
B
I will give an example of how we can use network namespaces. We can also look at the user namespace, which provides privilege isolation and user identification segregation, that is, for the UID and the GID, so it gives you this kind of security feature. It came into the Linux kernel relatively recently, I think at 3.8. We also have the IPC namespace. I don't even know if anybody cares about IPC, that is, inter-process communication. This was added to the Linux kernel to isolate inter-process communication.
B
clone is a system call in the Linux kernel that enables us to use namespaces. You can actually look at the code in the Linux kernel and try to get an understanding of what clone does. This is how a process calls clone. If you look at the int clone function there, it takes the stack, the flags passed to it, and the parent TID.
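The flags argument is where namespaces come in: each namespace type has a `CLONE_NEW*` bit, and the caller ORs together the ones it wants for the new process. The Python sketch below only models that bit arithmetic, using the constant values from the kernel's `linux/sched.h`; it does not call clone itself, and the helper names are hypothetical.

```python
# CLONE_NEW* flag values as defined in the kernel's linux/sched.h.
CLONE_FLAGS = {
    "mnt": 0x00020000,     # CLONE_NEWNS
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "uts": 0x04000000,     # CLONE_NEWUTS
    "ipc": 0x08000000,     # CLONE_NEWIPC
    "user": 0x10000000,    # CLONE_NEWUSER
    "pid": 0x20000000,     # CLONE_NEWPID
    "net": 0x40000000,     # CLONE_NEWNET
}


def namespace_flags(kinds):
    """OR together the clone flags for the requested namespace kinds."""
    flags = 0
    for kind in kinds:
        flags |= CLONE_FLAGS[kind]
    return flags


def describe(flags):
    """List the namespace kinds whose bit is set in flags."""
    return sorted(kind for kind, bit in CLONE_FLAGS.items() if flags & bit)


if __name__ == "__main__":
    wanted = namespace_flags(["mnt", "uts", "ipc", "pid", "net"])
    print(hex(wanted), describe(wanted))
```

A runtime that wants a fresh PID and network namespace for a container would pass the combined value for `pid` and `net`, alongside whatever other namespaces it needs.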
B
Next we have copy-on-write. Copy-on-write is a really complex topic; it could even have its own separate presentation. Copy-on-write is an optimization strategy. You may have noticed that when you pull a Docker image and then pull another Docker image, sometimes it tells you that a layer already exists, right? Copy-on-write is what makes this smart enough: it allows the sharing of these files and layers.
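As a rough mental model of layer sharing with copy-on-write, Python's `collections.ChainMap` behaves similarly: lookups fall through a stack of mappings, while writes only touch the first one. In this sketch each dictionary stands in for an image layer and the empty dictionary on top for a container's writable layer. It is an analogy under those assumptions, not how any storage driver is implemented, and the file names are invented.

```python
from collections import ChainMap

# Two read-only "image layers", shared by every container built on them.
base_layer = {"/bin/sh": "shell v1", "/etc/os-release": "distro"}
app_layer = {"/app/main.py": "print('hi')"}


def new_container(*image_layers):
    """Stack a fresh, empty writable layer on top of the given layers.

    image_layers are listed topmost first; they are referenced, not copied.
    """
    return ChainMap({}, *image_layers)


container_a = new_container(app_layer, base_layer)
container_b = new_container(app_layer, base_layer)

# The write lands in container_a's own top layer; the shared layers and
# container_b still see the original value.
container_a["/etc/os-release"] = "patched"

if __name__ == "__main__":
    print(container_a["/etc/os-release"])  # value from container_a's top layer
    print(container_b["/etc/os-release"])  # value from the shared base layer
```

Both containers reference the same two layer objects, which mirrors why a second image pull can report that a layer already exists: the shared layer is stored once and only the differences are kept per container.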
We have different filesystems and storage backends that copy-on-write uses: AUFS, Btrfs, VFS, and Device Mapper. You don't really have to worry about this, because Docker will just use the most suitable one; it is intelligent enough to do that.
Besides that, we also have capabilities. Capabilities enable you to say, "Okay, I want this capability, CAP_SYS_TIME, on this container."
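Capabilities are stored per process as a bitmask, which Linux exposes as hex strings such as `CapEff` in `/proc/<pid>/status`. The following Python sketch decodes such a mask and checks whether CAP_SYS_TIME is present. The bit numbers come from `linux/capability.h`; only a handful are listed here, and the sample masks in the usage note are illustrative values rather than output from the talk.

```python
# A few capability bit numbers from linux/capability.h (not the full list).
CAP_BITS = {
    0: "CAP_CHOWN",
    5: "CAP_KILL",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    21: "CAP_SYS_ADMIN",
    25: "CAP_SYS_TIME",
}


def decode_caps(hex_mask):
    """Turn a hex capability mask into a list of names, lowest bit first.

    Bits missing from CAP_BITS are reported as 'cap_<number>'.
    """
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            names.append(CAP_BITS.get(bit, f"cap_{bit}"))
        bit += 1
    return names


def has_cap(hex_mask, name):
    """Report whether the named capability is set in the mask."""
    return name in decode_caps(hex_mask)
```

For instance, `decode_caps("0000000002000000")` returns `["CAP_SYS_TIME"]`, while a restricted mask such as `00000000a80425fb` includes CAP_NET_RAW but not CAP_SYS_TIME, so a process holding it could not set the system clock.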
We also have SELinux, Security-Enhanced Linux. Most times you use Security-Enhanced Linux so that you can have more control over who can access this container. So that's the part you have to look out for. And then we have the container runtimes.
B
Like I said, we're going to come to this. We have Docker Engine, containerd, and OpenVZ (that's duplicated there, sorry about that). Docker Engine actually uses containerd and runc for container creation. What actually made Docker stand out was that they improved the developer experience of using containers. Before, containers were just something you would most likely see with sysadmins. Docker really made this easy with their whole family of tools.
B
So that's why it's called an engine. We also have LXC. LXC has existed for a long time. In fact, there's no record of containers in the Linux kernel; you won't see anything about containers there, because what you might see there is a very different thing from the containers you know. It's a big difference. So LXC was one of the container technologies that were there, and we also have OpenVZ, which was also one of the container technologies that existed.
B
I'm just trying to get this adjusted. So we'll try to look at the namespaces that exist on my PC. Let's run lsns, which comes up with the namespaces that exist on my system. We have the time namespace, the cgroup namespace, the PID namespace, UTS, IPC, and the rest of them. Browsers like Chrome and Brave will most likely have a lot of PID namespaces.
B
They also have the network namespace, for networking, and the rest of them. Let's try to list those [inaudible], so I don't have to type sudo every time. I'll run sudo ls on /proc, so we're going to check a process. We're going to check which namespaces my first process belongs to, that is, the init process. So let's see which namespaces it belongs to. That's pretty simple: just ls.
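The same information that `lsns` and `ls` show can be read programmatically: every entry under `/proc/<pid>/ns` is a symbolic link whose target looks like `pid:[4026531836]`, and two processes share a namespace when those numbers match. Here is a minimal Python sketch of reading those links; the `proc_root` parameter is an assumption added so the function can be pointed at a test directory.

```python
import os
import re

_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a link target such as 'net:[4026531992]' into (kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid="self", proc_root="/proc"):
    """Map each entry under <proc_root>/<pid>/ns to its namespace inode."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result
```

On a Linux host, comparing `namespaces_of(1)` with the result for a container's process shows which namespaces differ from those of init; reading another user's process usually requires root, as in the demo.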
B
So now, let's create a pair of virtual Ethernet devices. So: ip link add (I always forget to add sudo, don't mind me) veth1 type veth peer name veth2.
B
So we just added a virtual Ethernet device pair. If I do ip... yeah, you can see the virtual Ethernet devices are created. So now, let's link each device we created to a namespace, with ip link set.
B
I will link veth2 to namespace two. I'll be pretty quick here. Let's try to bring up the devices and assign IP addresses to them: ip netns exec namespace1 ip link set dev veth1 up.
B
sudo again, sorry. And sudo again; [inaudible]. So now, let's verify the connectivity between the two namespaces, as enabled by the virtual Ethernet pair. From namespace one, since I've linked them, we should be able to ping the second one, and from the second namespace we should be able to ping the first namespace. So this is sudo ip netns exec namespace1 ping, which sends about five pings, to 192.168.2 [rest of the address inaudible].
B
So yeah, it's working: it's reaching the second namespace and pinging it. We should be able to do that from the second one too, and ping this one. So it works.
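For reference, the sequence of `ip` commands in this demo can be laid out as data. The Python sketch below only builds and prints the commands; it does not run them, since they need root on a Linux host. The namespace names, device names, and IP addresses are assumptions chosen for illustration, because the exact values used on screen are not fully audible in the recording.

```python
import shlex


def veth_pair_commands(ns_a="ns1", ns_b="ns2", dev_a="veth1", dev_b="veth2",
                       ip_a="192.168.1.1", ip_b="192.168.1.2", prefix=24):
    """Return the iproute2 commands that join two network namespaces
    with a veth pair and ping across it. All names are placeholders."""
    commands = [
        ["ip", "netns", "add", ns_a],
        ["ip", "netns", "add", ns_b],
        # One veth pair: whatever enters one end comes out of the other.
        ["ip", "link", "add", dev_a, "type", "veth", "peer", "name", dev_b],
    ]
    for ns, dev, addr in ((ns_a, dev_a, ip_a), (ns_b, dev_b, ip_b)):
        # Move one end of the pair into the namespace, then configure it there.
        commands.append(["ip", "link", "set", dev, "netns", ns])
        in_ns = ["ip", "netns", "exec", ns]
        commands.append(in_ns + ["ip", "addr", "add", f"{addr}/{prefix}", "dev", dev])
        commands.append(in_ns + ["ip", "link", "set", "dev", dev, "up"])
    # Verify connectivity in both directions with five pings each.
    commands.append(["ip", "netns", "exec", ns_a, "ping", "-c", "5", ip_b])
    commands.append(["ip", "netns", "exec", ns_b, "ping", "-c", "5", ip_a])
    return commands


if __name__ == "__main__":
    for command in veth_pair_commands():
        print("sudo " + shlex.join(command))
```

Keeping the commands as lists makes it easy to pass each one to `subprocess.run` later if you do want to execute them as root.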
B
I'll just go ahead and delete these namespaces.
B
If I list them, yeah, I no longer have them. So this is something that most of these container runtimes do under the hood. They create network namespaces and the rest of them so that you get interoperability between your containers, providing network access to your containers. Let's also go over the second demo.
B
This won't take time. It's just to show how these container runtimes also do copy-on-write. So let's make the directories: say, mkdir kcd1.
B
[inaudible] So then we'll try to make a union of these two directories. Let's call it kcd union. So: unionfs, with the directories kcd1 and kcd2, into kcd union.
B
You can see that these three files have been unioned together into one place. Under the hood, this is what container runtimes try to implement.
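What a union mount presents can be approximated in a few lines of Python: list every branch directory and let the first branch that has a given file name win. This is a simplified sketch (top-level entries only, no whiteouts or write handling) and not the unionfs implementation; the directory names in the usage note follow the demo.

```python
import os


def union_listing(*branches):
    """Map each file name to the path that a union view would expose.

    Branches are given in priority order: when the same name exists in
    several branches, the earliest branch wins.
    """
    merged = {}
    for branch in branches:
        for name in sorted(os.listdir(branch)):
            merged.setdefault(name, os.path.join(branch, name))
    return merged
```

Calling `union_listing("kcd1", "kcd2")` returns one combined view of both directories, which is the read side of what the union mount in the demo shows.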
You can also look into... okay, sorry. I would also say you should look into Bocker. I don't know if I can type that. Okay, please look into Bocker.
B
So Bocker is actually an implementation of Docker in Bash. You can try to read the source code, and you will see how these container runtimes use these low-level features of the Linux kernel to create a container. So that's it for my talk. If you have any questions, you can just leave them in the chat. I think Sadiq has done that. If you have any questions on these, you can reach out to me, or you can just write them in the chat.
B
So it was nice speaking with you, and I hope you have a great day. Thanks.
C
Awesome, that was a great presentation. We don't have any questions yet in the chat.
C
So here is Great's Twitter handle, @0xgreat. The "great" in his Twitter handle is in leetspeak itself. You can reach out to him on Twitter and share whatever questions you have, and he'll be glad to answer. But if you still have any, you can drop them in the chat, and he will be available on the YouTube channel to answer any of them in real time.
B
Okay, so: how cgroups and namespaces are used. Like I said (I don't know if you came in late), cgroups are what limit the resources that containers can use. There are subsystems for cgroups: you have pids, net_cls, and the rest of them. So what cgroups do is limit the system resources that these containers can use. Like I said, cgroups limit what you can use, and namespaces limit what you can see. Namespaces are what make containers unable to see each other. If, let's say, one container is able to see the processes of another container, that's really bad, because you can have things like race conditions, and it's also a security issue. So namespaces are what bring in that isolation.
B
That is how you can run multiple containers on your system: you can run nginx, you can run BusyBox, you can run a lot of containers without having conflicts. If namespaces didn't exist, things like process IDs would start clashing, and that is also a security issue, which is really bad. So namespaces are a feature of the Linux kernel that tries to convince containers [inaudible].
B
Okay, yeah, the demo... okay, cgroups... okay, yeah. I think I did this.
C
Yeah, he shared it. What we will do later is make the recording of the whole thing available, and we will also split out each of the sessions so that you can access them individually. So yes, you can check our YouTube channel by Monday or Tuesday next week, and you will have his specific video, so you can rewatch the parts where he did the demo.
B
Okay, okay, that's fine. You can also contact me on Twitter if you have any questions. I'm available to answer questions.
C
Thank you very much, Great. We have our next session. All right, bye.