From YouTube: Application Delivery & Life Cycle Management - Francesco Giannoccaro (UK Health Security Agency)
Description
Accelerate Application Delivery and Life-Cycle Management to Support the Growing Complexity of Public Health Science
Francesco Giannoccaro (UK Health Security Agency)
This OpenShift Commons Gathering was held on July 6th, 2022 live in London, England
https://commons.openshift.org
Good afternoon, everyone, and thanks for joining this session. My name is Francesco Giannoccaro, and I will take about 20 minutes of your time to share my experience working with colleagues over these last two years during the pandemic.
I work at the UK Health Security Agency, where I'm Head of High Performance Computing and Hosting Services. The UK Health Security Agency was created in 2021 by bringing together three existing organizations: Public Health England, where I worked previously; the JBC, the Joint Biosecurity Centre; and a division of the NHS called NHS Test and Trace.
The latter two, the JBC and NHS Test and Trace, were created during the pandemic, and since October 2021 all three bodies have been acting as a single agency.
We have about 8,000 people, the majority of them scientists, working in the organization. The mission of the organization is to protect the public from the impact of infectious disease, spanning from any aggressive pathogen to chemical, biological and nuclear incidents. So it's quite a broad mission, and a lot of expertise has come into the organization.
The main scientific services that the organization provides to the country are the pathogen genomics services: screening through DNA analysis of pathogens, both viruses and a number of other pathogens. Don't think just about the coronavirus; think more broadly of tuberculosis, Legionella, monkeypox, all the aggressive pathogens that are potentially the origin of national or global outbreaks. The agency's activity also includes antimicrobial resistance monitoring, understanding how those pathogens develop resistance to antibiotics, and predictive models to understand how those outbreaks evolve.
This is an ever-growing complexity, a scenario that constantly evolves for a number of reasons. We live in a global society where people move and travel from country to country and between continents, and therefore transmissible diseases travel at a different speed in modern days. But there is also a positive side.
Think about the ability to have significantly more data to analyze and to provide to scientists to understand how we can fight those diseases. That large amount of data is at the same time a challenge, of course, in making the best use of an increasing volume of data.
We also need increasing capacity in terms of computational resources, especially in a scenario like this, which fortunately happens perhaps once in a generation. This is not something every generation goes through, but suddenly there is an incredible volume of data coming from different organizations around the world, which share that data within the scientific community, and therefore there is the need to scale at a speed that is difficult to cope with. The infrastructure that the organization has relies partly on on-premise data centres and partly on off-premise capability running on commercial cloud.
We have two main data centres, one in North London and one in Southwest England, at Porton Down near Salisbury, and in those data centres we have both storage and computational capacity. During the pandemic we expanded those on-premise resources but, most importantly, we enabled the ability to analyze and leverage resources available from public cloud environments.
In order to make those resources available, connectivity of course plays an important role when you have a significant amount of data generated by the whole genome sequencing machines every day. Connectivity has also been a challenge: we are talking about petabytes of data produced globally every month.
The data related to DNA sequencing is growing at a speed higher than in any other science domain. Before the pandemic it was already doubling every six months and, of course, now it is growing even faster. So managing that amount of data in an effective way is a challenge in itself, and it is important not only to provide resilience for this data but also to facilitate access to it. Imagine: this is not data that sits in a database.
The data that the scientific machines produce is essentially structured as files, and in order to facilitate sharing that information with the scientific community around the world, it is important to catalogue it: to add metadata to those files that allows browsability of those systems. So technology around cataloguing data and making it easily accessible is pivotal.
The way we have been approaching this increasing demand for computational resources has been to add hardware to the data centres in our infrastructure, the ones I mentioned in North London and Southwest England, but also, I'd say, to build a lasting capability on commercial cloud. In order to leverage those resources, it is important to give workloads the portability they need to run seamlessly in different environments, and that portability is one of the main issues that containers help to resolve.
Containerizing those workloads, those scientific applications, is ultimately what enables easy use of the elastic resources available in commercial cloud. We started using containers in the HPC domain, the high performance computing area, and we started that journey a few years ago with colleagues within the organization, primarily bioinformaticians.
These are colleagues who are tech savvy, who readily approached and understood the benefit of moving away from multiple versions of their software compiled in what is technically referred to as modules in an HPC environment. Moving that number of modules into containers is a process that I started a couple of years ago, and it has enabled the portability I was mentioning before. In this specific case the container engine is one called Singularity.
It follows exactly the same concepts as other container engines like Docker, so the benefits, of course, are simplified portability, the fact that you have the different versions of a runtime all in a single object, and the fact that this specific container engine is already engineered to talk to job schedulers in HPC environments. In HPC, of course, an important role is played by the job scheduler, the component that spreads the workloads across a number of physical nodes. We have then been trying to introduce and present those benefits to our scientists.
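As an illustration of how a module-based workload becomes a portable container job, here is a minimal Python sketch that builds a batch script wrapping a tool in a `singularity exec` call. The talk does not name the scheduler or any paths; Slurm, the bind path, the image and the command below are all illustrative assumptions.

```python
def slurm_singularity_script(image_path, command, cpus=4, mem_gb=8):
    """Build a minimal Slurm batch script that runs an HPC workload
    inside a Singularity container rather than a compiled module.
    The bind path and resource values are illustrative assumptions."""
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        # --bind mounts shared data into the container (hypothetical path)
        f"singularity exec --bind /data:/data {image_path} {command}",
    ])

# Hypothetical bioinformatics job: same workflow as a module-based run,
# except the execution environment is now the container image.
script = slurm_singularity_script("bwa_0.7.17.sif", "bwa mem ref.fa reads.fq")
```

Submitting the generated script with `sbatch` keeps the user-facing workflow identical; only the runtime moves into the container, which is what gives the workload its portability.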
Clearly, the audience is different. When you talk with colleagues who are already working in an HPC environment, they are already more knowledgeable about some of the concepts, whereas scientists such as biologists and epidemiologists need a way to translate their ordinary way of working, which is basically using their laptop and an IDE, often working in R, for instance. Allowing them to use a scalable environment is slightly more complicated.
It is more complicated than working with people who have already been using a command line interface, as in an HPC environment. Kubernetes provides scalability; however, on its own it is not sufficient to deploy or move the applications that scientists use on their laptops in a way that lets them easily use that level of scalability. OpenShift offers a wider ecosystem of components, all integrated in a way that is easy to use for colleagues who don't already have technical familiarity with components like load balancers, shared storage, network isolation and network encapsulation. Using OpenShift, we have significantly eased that journey of moving applications from scientific workstations, desktops or laptops into an environment that can be easily scaled.
The benefits we have seen are not only on the development side, where we've seen many gains and very positive feedback from our scientists. The security aspect is also quite important for them.
On their laptops they feel in control of complying with information governance policy, and to move into a broader environment, once you start mentioning public cloud, you have to explain the whole security envelope you are going to provide in order to offer the same level of security as what they do today. Again, OpenShift pays very strong attention to security. The fact that the image registry, for instance, is constantly maintained and therefore security patched is a big plus.
A
It's
not
like
downloading
an
image
from
you
know,
docker,
hub
or
or
or
git
is
it's
using
a
base
image
that
is
already
been
assessed
in
terms
of
specifically
in
terms
of
security,
then
the
other,
of
course
benefit
that
we
have
seen
from
from
those
colleagues
is
the
fact
that
the
the
old
ci
cd
capability
is
a
really
streamlined
within
their
the
same
environment.
They don't need to learn much about how to monitor the load on their systems, or how to understand and scale some of the components within pods or within an environment.
On logs, the fact that we have an easy way to ship these logs to the more central logging capability the organization has, in this case through Splunk, has been a plus they didn't have before, and it has been much appreciated. But the benefits are also numerous from an operator perspective, from within the team that I like to work more closely with.
As I said, we work in quite a complex environment. You probably remember from the first slides the ecosystem of open source technology: we have bare metal high performance computing environments running Linux.
We have environments that can scale and burst dynamically, for instance on OpenStack, where we run different types of applications in an elastic way. OpenStack, for instance, runs OpenShift, but it also provides bursting capability to the HPC environment. Components like OpenStack and HPC are quite intensive to maintain, because they have very different modules within the same environment, and having a more orchestrated way for these components to talk to each other, to scale automatically and to provide the level of flexibility you see in commercial public cloud solutions is one of the benefits that OpenShift provides from an operator perspective.
Here, for instance, you see that security again is held at a very high level, with SELinux and MCS (multi-category security) capabilities, which mean in essence that every container is treated as a process and is fully isolated from all other processes. So there is not only no need to give high privileges to the processes that run in this environment; beyond the fact that they don't run as the root user,
they are also, at a lower level, additionally isolated from one another. The case studies that I would like to briefly touch on today, the ones that we have seen, are quite broad. In addition to the normal web applications, which is probably what the majority of people using OKD or OpenShift run, web environments with a database, and which are also part of our use cases,
I'm going to touch briefly on, for instance, the batch processes we have been running on OpenShift, which are probably more specific to a scientific organization. We run, of course, a number of web systems that present the results of the analysis for a number of outbreaks, and again, don't think just about Covid, the coronavirus, but also about other, less well known outbreaks that happen around the world, like the collaboration with the WHO on Legionella, measles and rubella, which are all outbreaks that have happened in different countries.
An important pathogen is tuberculosis, which to this day still affects a significant number of people. This is a clear example where we have a technical partner that works on the code: the image is produced by external developers and they push their image into our environment. They, of course, use the same base image that we have approved in terms of security, and then, when we tag a build as production, that environment becomes available to the public.
We also have internal components, for instance, that we run on OpenShift and that are used in collaboration with other scientific organizations around the country. This is basically a consolidated simulation tool used by scientists and academia to produce consensus around the Covid R parameter, the parameter that gives a sense of how the infection spreads from individual to individual, what is called the reproduction number.
We do this in collaboration with the UK Defence Science and Technology Laboratory, and the output is ultimately analyzed by, I think, SPI-M, the Scientific Pandemic Influenza Group on Modelling, to model how the infection spreads. Another application is the UK NSC recommendations website.
That is the National Screening Committee recommendations site. This is a website accessible to the public and to clinicians to look up clinical conditions, to understand whether a specific condition needs to go through a screening programme.
You can find how the screening works, and the site is also used to coordinate consultations about the screening itself. From an OpenShift perspective, this is completely managed by the CI/CD capability that OpenShift itself provides.
In this case, we use OpenShift to run analyses that are managed through a specific pipeline, in this case Apache Airflow. Users were previously running these batch jobs, again, on their scientific workstations.
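The talk does not show the actual pipeline, but conceptually each batch step that used to run on a workstation becomes a container that the pipeline launches on the cluster. As a sketch of the pattern, here is the kind of Kubernetes Job manifest such a step produces; the namespace, image, command and job name are all invented for illustration.

```python
def batch_job_manifest(name, image, command, namespace="science-batch"):
    """Sketch of the Kubernetes Job a pipeline step could launch when a
    batch analysis moves from a workstation onto OpenShift.
    All names in this manifest are illustrative assumptions."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "template": {
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "command": command,
                        # containers run unprivileged, never as root
                        "securityContext": {"runAsNonRoot": True},
                    }],
                    "restartPolicy": "Never",  # batch semantics: run once
                }
            },
            "backoffLimit": 2,  # retry a failed analysis at most twice
        },
    }

job = batch_job_manifest(
    "variant-calls",
    "registry.internal/pipeline:1.4.2",
    ["python", "analyse.py", "--input", "/data/run42"],
)
```

The scheduler (Airflow, in this talk's case) then only has to submit and monitor such Jobs, which is what lets the same analysis scale well beyond a single workstation.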
They didn't have enough compute capacity to run analyses on large data sets in a short time and, as I was mentioning before, the volume of data to analyze has been growing significantly. So the need to avoid a workstation taking hours for an analysis, or a laptop that always had to stay on, started the process of thinking about how we could run that type of batch processing on Kubernetes, in this case on OpenShift.
There is a process for building the container images, and that also allows us to lock down specific versions of a runtime environment, which is important for scientific reproducibility. We want to make sure that when we publish some data, we also keep track of the entire process involved in producing those results.
All the code and libraries involved in producing the scientific data that is ultimately shared with the scientific community need to be kept, and that is a capability that we now have.
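One way to picture that reproducibility requirement: each published result carries a record of the exact container image, library versions and code revision that produced it. This is a minimal sketch of such a record; the field names, image reference and versions are invented for illustration, not taken from the talk.

```python
import json

def provenance_record(image_ref, packages, code_commit):
    """Sketch of a provenance record kept alongside published results:
    the locked-down container image, the library versions inside it,
    and the code revision. Field names are illustrative assumptions."""
    return json.dumps({
        "container_image": image_ref,                 # immutable tag or digest
        "packages": dict(sorted(packages.items())),   # pinned library versions
        "code_commit": code_commit,                   # exact code revision
    }, indent=2)

record = provenance_record(
    "registry.internal/phylo-pipeline@sha256:abc123",
    {"pandas": "1.4.2", "biopython": "1.79"},
    "9f2e1c0",
)
```

Because the image reference is a digest rather than a mutable tag, anyone can later rerun the analysis in exactly the environment that produced the shared data.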
This is one of the systems that accesses a number of different data sources and performs different types of data mash-up, query and analysis, and one of the challenges here was ensuring that we were complying with information governance policies.
So we built a small module that uses a technology called Kerberos to perform authentication at user level: the user who started the batch process, and who is therefore going to query some specific sensitive data, is also logged for information governance requirements.
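A hedged sketch of that pattern: the batch job obtains a Kerberos ticket for the submitting user with a standard keytab-based `kinit`, and the access is written to an audit log. The `kinit -k -t` flags are the standard MIT Kerberos ones; the principal, keytab path and logger name are illustrative assumptions, not details from the talk.

```python
import logging

def kerberos_batch_auth(principal, keytab_path):
    """Sketch of user-level authentication for a batch job: build the
    kinit invocation that obtains a ticket for the submitting user,
    and log who triggered the data access for information governance."""
    # kinit -k -t <keytab> <principal>: authenticate non-interactively
    cmd = ["kinit", "-k", "-t", keytab_path, principal]
    # audit trail: record which user initiated the sensitive query
    logging.getLogger("governance").info(
        "batch auth requested for %s", principal)
    return cmd

cmd = kerberos_batch_auth(
    "a.scientist@EXAMPLE.ORG", "/etc/keytabs/a.scientist.keytab")
```

Running the queries under the submitting user's own principal, rather than a shared service account, is what makes the governance log meaningful.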
Here again a wide number of components are involved throughout the whole pipeline that Airflow manages, and these are ultimately computed in a container environment, specifically on OpenShift. The results of those analyses are then published through dashboards. Coming back to the way colleagues were working before:
To share those dashboards, those reports, they were saving the data into a Git repository, and everyone who wanted to see a report had to download that set of data onto their own system, and only then were they able to access that type of dashboard. With OpenShift we have removed that complexity and all those steps, so here again we have accelerated the ability of scientists to do their daily work.
Python and R are the two development runtimes that we see used most often. Specifically, scientists who run predictive models and data analytics make a lot of use of R, with interactive analysis that they normally do through a component called RStudio on their own laptops.
So they have this IDE, this development environment, where they choose the different data sets and run their analyses.
What we're working on now is a way for them to connect that IDE, that environment, directly with Kubernetes within OpenShift, so that from the code they put together they can deploy the actual computational tasks in parallel across a number of containers in OpenShift. Again, that is a capability that has allowed us to facilitate and accelerate the use of resources at scale.
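The dispatch described here happens from R, but the fan-out pattern itself is language-neutral. As a minimal Python sketch of the idea, one analysis is split into roughly equal chunks, one per container task; the chunking scheme is purely illustrative.

```python
def fan_out(items, n_tasks):
    """Sketch of the fan-out step when an IDE-driven analysis is
    dispatched to a cluster: split the work into roughly equal chunks,
    each chunk destined for one container task."""
    chunks = [[] for _ in range(n_tasks)]
    for i, item in enumerate(items):
        chunks[i % n_tasks].append(item)  # round-robin assignment
    return [c for c in chunks if c]       # drop empty chunks

# e.g. ten samples spread across four container tasks
tasks = fan_out(list(range(10)), 4)
```

Each chunk would then be handed to one container, so the interactive session only coordinates results instead of doing the heavy computation itself.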
I think that's primarily what I wanted to say. Of course, I also want to mention that all of this has been possible primarily through the use of open source technology, so I want to use this opportunity to thank the open source community around the world: not only the people working on OKD and OpenShift, but more broadly everyone who invests their time and resources in the broad ecosystem of open source technology.
A
These
enable
open
science
and
make
possible
for
organizations
to
accelerate
at
speed
a
number
of
activities
are
otherwise
very
complex
and
and
probably
costly
to
run
in
a
different
way.
So
thanks
again
to
radat
for
the
effort
that
he's
putting
on
the
open
source
and
and
to
the
open
source
community
in
general.