Description
Discussion around https://gitlab.com/gitlab-org/gitlab-runner/issues/4119
B: Sure, so basically, the issue we had here in the agenda: when we execute commands inside pods, we wait for the logs of the commands and for the exit codes of the commands — including, say, a long-running command that takes a minute to run, even a simple sleep. If in those sixty seconds, while the command is running, there are some connectivity issues in the Kubernetes cluster, or in the networking stack in general, the connection between the runner and...
B: So basically, this is the place in remotecommand where the connection is handled. This is the place where stdin and stdout are basically copied to and from the pod, and there's a special error stream, which is part of the Kubernetes remote-command protocol, and this error stream can decode a message.
B: This is actually the root of the problem, and it's kind of tricky, because there might be a case where the connection legitimately doesn't have anything to send back to us, but we still get EOF. We don't get an error — we get EOF. The connection basically closes, for example from our end, and this...
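For context on the EOF ambiguity described here: in Go, `io.EOF` is the same value whether the remote side finished cleanly or the transport silently went away, so the reader alone cannot tell the two apart. A minimal illustration (not runner code):

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// drain reads a stream to completion. The crucial point: io.EOF is returned
// both when the writer finished cleanly and when the transport simply went
// away without an error frame, so EOF alone cannot prove the command ended.
func drain(r io.Reader) error {
	buf := make([]byte, 4096)
	for {
		_, err := r.Read(buf)
		if err == io.EOF {
			return nil // clean end? dropped connection? indistinguishable here
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	fmt.Println(drain(strings.NewReader("output of the remote command")))
}
```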
B: This is the last place in our code before we actually execute the remote command, in the code I showed you before. Okay, so, yeah — the problem is here: remotecommand basically creates an HTTP/2 connection, and on this HTTP/2 connection it creates multiple streams for standard input, standard output and the special error stream, which is part of the protocol. And basically, if the HTTP/2 connection dies, we get EOF, and the thing I explained happens.
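The exec path under discussion is client-go's `remotecommand` package. Roughly, the call shape looks like the following sketch — simplified, with placeholder names, not the actual runner code:

```go
package executor

import (
	"net/http"
	"os"
	"strings"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
	restclient "k8s.io/client-go/rest"
	"k8s.io/client-go/tools/remotecommand"
)

// execScript runs a script in a pod's container over the remote-command
// protocol: a single multiplexed connection carrying stdin, stdout, stderr
// and the protocol's dedicated error stream.
func execScript(c kubernetes.Interface, cfg *restclient.Config, ns, pod, container, script string) error {
	req := c.CoreV1().RESTClient().Post().
		Resource("pods").Namespace(ns).Name(pod).
		SubResource("exec").
		VersionedParams(&corev1.PodExecOptions{
			Container: container,
			Command:   []string{"sh"},
			Stdin:     true,
			Stdout:    true,
			Stderr:    true,
		}, scheme.ParameterCodec)

	executor, err := remotecommand.NewSPDYExecutor(cfg, http.MethodPost, req.URL())
	if err != nil {
		return err
	}

	// Stream copies the three streams until the remote side ends them. Per
	// the discussion above, a dropped connection can surface here as a bare
	// EOF rather than as a command failure.
	return executor.Stream(remotecommand.StreamOptions{
		Stdin:  strings.NewReader(script),
		Stdout: os.Stdout,
		Stderr: os.Stderr,
	})
}
```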
B: Yeah, I put that in point two, yeah, yeah. That's one. So this point consists of two possible solutions. We need to modify remotecommand in this case. If we want to try this, it can give us two possible outcomes. We could either be able to reattach the streams — let's say... I don't know, I haven't tried it that deeply.
B: Maybe the HTTP/2 connection is going to reconnect itself — we're basically using Go's HTTP/2 stack, so we create streams on top of the HTTP/2 connection. So if the connection dies, it could reconnect. If it reconnects, we could recreate the streams on top of it and just resume listening to the standard output of the process. We...
B: We could check if we have already gotten a status response. If we don't have a status response yet, that probably means the connection died unexpectedly. That's one possible way. Another possible way is: don't rely on io.EOF. We could implement the buffered reading ourselves, maybe, or we could just handle the case where we get no error and no message, and then we could just retry. It's not going to be bulletproof, don't get me wrong.
B: And yeah, that's the case where we can actually reconnect. If we can't reconnect, we can fall back to checking whether we have this status response, and if we don't, we can just return an error saying the connection died unexpectedly. So we're at least going to solve this problem where the job is marked as successful but actually isn't.
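A rough sketch of that fallback. The `gotStatus` flag is hypothetical: the stock client library does not expose whether a status response arrived, and obtaining that flag is exactly the modification being discussed:

```go
package executor

import "errors"

// streamResult is a hypothetical wrapper around the stream call: err is what
// the call returned; gotStatus records whether the protocol's error stream
// delivered a status response before the connection ended.
type streamResult struct {
	err       error
	gotStatus bool
}

// interpret trusts a nil error only when a status response was actually seen;
// otherwise the connection is assumed to have died unexpectedly, so the job
// is never silently marked successful.
func interpret(res streamResult) error {
	if res.err != nil {
		return res.err // genuine failure: propagate as before
	}
	if !res.gotStatus {
		return errors.New("connection died unexpectedly: no status response received")
	}
	return nil // the command really finished successfully
}
```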
B: So let's say we cannot attach these streams again, and we get to this point, to this case. We could check whether we have already received some status response — and if at this point we haven't gotten a status response, this probably shouldn't happen.
A: The reason I don't like mucking around with the stream is because it's still the network, right? It can still go wrong in multiple directions. If the stream disconnects, does the process still execute? So let's say we pass in the bash script that needs to be executed and the stream disconnects midway — is the bash script still running? In this case...
A: So yeah, that's good. So, what we have, for example: the shell executor for bash is tied directly back to standard output, right? But for PowerShell and batch we save the file and then execute the file. So we could do something similar instead of streaming to standard output: we send the files — there's a way to send files to Kubernetes — we execute the files, and then we can read from the stream. I'm not sure if that makes sense, or if that's a big change. I'm just thinking out loud.
C: First thing is that I wouldn't think right now about this file redirection and problems with reading streams, because reading streams is one of the basic things of Kubernetes' exec. I don't think we are the only user in the world that relies on this. If Kubernetes for some reason starts having problems with attaching to the pods and reading the streams of any command executed this way, it will need to be fixed quickly. Right now we are fighting with some really, really edge-case scenario, and in most cases this just works.
C: So we need to find a way to — basically, the biggest problem is not that something wrong is happening. The biggest problem is that we don't detect that it is happening, and we say everything went okay. Of course, there was at least one client in the issue who claims he sees this every time, on every build; in his case the result would just be that all of his jobs fail. But I think this is a second thing that we should look at.
C: Second thing is: okay, we will try to reproduce this, we'll try to reattach — see how this works on the Kubernetes side. Because copying the client library stack and changing it so that we can try to reattach is easy. The question is: does the Kubernetes server support this, and how do we find what to attach to? I've never seen how to attach to a running exec command. You can attach multiple times to the main command started from the container, but the exec is always an independent, one-time entity. Like, you start an exec and you can see it.
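For reference, exec and attach are two different pod subresources in the Kubernetes API; a simplified sketch of the difference (placeholder names):

```go
package executor

import (
	"net/url"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
)

// execURL targets the "exec" subresource: every call starts a brand-new
// process in the container, an independent one-time entity with nothing to
// re-attach to afterwards.
func execURL(c kubernetes.Interface, ns, pod, container string) *url.URL {
	return c.CoreV1().RESTClient().Post().
		Resource("pods").Namespace(ns).Name(pod).
		SubResource("exec").
		VersionedParams(&corev1.PodExecOptions{
			Container: container,
			Command:   []string{"sh"},
			Stdin:     true, Stdout: true, Stderr: true,
		}, scheme.ParameterCodec).URL()
}

// attachURL targets the "attach" subresource: it connects to the container's
// existing main process, and the client may disconnect and attach again as
// many times as it needs.
func attachURL(c kubernetes.Interface, ns, pod, container string) *url.URL {
	return c.CoreV1().RESTClient().Post().
		Resource("pods").Namespace(ns).Name(pod).
		SubResource("attach").
		VersionedParams(&corev1.PodAttachOptions{
			Container: container,
			Stdin:     true, Stdout: true, Stderr: true,
		}, scheme.ParameterCodec).URL()
}
```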
C: Attaching to the container — I don't remember now which executor it was, but at some point we had two different implementations, for Docker and Kubernetes. One of them was starting the container with the shell detection script as the main command and then executing all of the runner scripts through exec, and the second executor was attaching to the running container and executing the scripts there. At some point we changed one of the executors to make them work the same, and we chose to use the exec one, so that...
A: That was my change, actually. So, the current status right now: the Docker executor basically makes PID 1 of the container the script to be executed, right — so that's the shell detection or the build script, whatever it is. So we can get the exit code from the container status, and that's how the Docker executor works. But for the Kubernetes executor we start the pod and run the Kubernetes exec, passing the scripts over standard input. And the reason we had it like that...
A: ...once the job is finished, there's a configurable timeout where it will wait — for example, 30 minutes — until it stops the pod, right? So if the job is finished and you're connected to the interactive web terminal, you can still look around, change files and things like that. That's why we wanted to change the Docker executor to be the same as Kubernetes, to use...
A: You can still use it for, like, 30 minutes — it depends on the configuration. So that's what would break if we changed it: if the script executes while we stay attached, we can't control the timing — we can't really stop the pod from stopping, or keep the pod alive. Does that make sense? I'm not sure I'm explaining the problem properly, yeah.
C: Yeah, I get the problem. I'm thinking how we could change the execution of the script to get the output but not stop the container in case it fails. Because the biggest problem with exec is that we don't really have a good way to reattach to it and to detect problems. With attach to the container, we can get back to the same executed command multiple times, and this would just make things much easier — if we could attach to the container and not do the exec. Yeah, and I get the problem of...
C: We upload the script, it's in the container, we start it — how will we get the trace, how will we get the final result? This is where the failure happens: not on starting the execution but on watching the execution. We still need to read the output, we still need to look for the exit code, and we still need to make it not fail — or falsely succeed — in case of network problems disconnecting us from the output.
C: I don't think we need to stream this; we need to have it scripted, yeah. Like, let's consider this: we update the execution script — we need to think whether this should be only for Kubernetes or for all the executors. We update the execution script so that the final exit code of the script is saved as a file in a known location. The location would include, for example, the project URL, the job ID and the sequence ID of the script execution by the runner, so we have a unique file for this specific script.
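A minimal sketch of this first proposal, with an illustrative path layout — the real location and naming scheme would need to be designed:

```go
package main

import "fmt"

// wrapScript prefixes the generated script with a shell trap so that,
// whichever way the script ends, its exit code is persisted to a file whose
// name is unique per project / job / script sequence.
func wrapScript(script string, projectID, jobID, seq int64) string {
	exitFile := fmt.Sprintf("/builds/.runner/%d-%d-%d.exit", projectID, jobID, seq)
	return fmt.Sprintf(
		"mkdir -p /builds/.runner\ntrap 'echo $? > %s' EXIT\n%s\n",
		exitFile, script,
	)
}

func main() {
	fmt.Print(wrapScript("echo hello", 42, 1001, 3))
}
```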
C: For this specific script that was executed: now the command exits and we get a non-zero exit code. This means that for some reason the command exited with an error; in this case we don't need to do anything more — we just fail the job and propagate the exit code as we do now. If we got exit code zero, then it may be that the job finished okay, or maybe it's this strange case of a false exit code zero.
C: We can retry it with a script as simple as `cat` and the name of the file, for example. We would have to have really bad luck to hit such a networking problem again within such a short window. Then, having got this output, we just parse it, and we can set this as the final exit code of the script. If it's still a zero, then...
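And the read-back side of that idea, sketched; `runInPod` stands in for whatever hypothetical helper executes a command inside the build container:

```go
package executor

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// readExitCode retries reading the persisted exit-code file with a short
// backoff, so a single transient network failure does not lose the result.
// runInPod is a hypothetical helper that runs a command inside the build
// container and returns its output.
func readExitCode(runInPod func(args ...string) (string, error), exitFile string) (int, error) {
	var lastErr error
	for attempt := 0; attempt < 3; attempt++ {
		out, err := runInPod("cat", exitFile)
		if err == nil {
			return strconv.Atoi(strings.TrimSpace(out))
		}
		lastErr = err
		time.Sleep(time.Second << attempt)
	}
	return 0, fmt.Errorf("could not read exit code file: %w", lastErr)
}
```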
C: However, from the issue it seems that it's not, because we get the positive output of the job, and people say it's still running and, again, it fails in the background. So I would assume that currently it still executes the job — but this needs to be checked. So then we need a way to detect whether the job is finished, and check the exit-code file after it is finished — so, some sort of run file that is created when the job is started and removed...
C: You have 30 minutes for your web terminal to use, and when this is finished you get the normal red failure with "job failed with exit code" blah blah blah. So we have this information doubled. It seems that it would be easier to do something like that, but this requires us to totally rewrite how the Kubernetes executor works.
B: Okay, by the way, if we go back to the solution Tomasz provided: we can read the output of the command we execute in exec. What if we just had — this may sound stupid — something by which we can check whether the process exited successfully? Then we would know whether we should check the status code inside the file.
B: And by the way, in the real long term we're probably going to remove this, because I guess it's kind of a problem for people when their jobs stop in the middle of their work. So at some point we might want to rework the Kubernetes executor the other way, as you suggested.
C: Okay, that was the first one. The second one that we discussed was that we switch from exec to attach: we stop sending the exit code and killing the container. Instead, we'll look for some specific marker in the output, so we can decide what the status of the script execution is. Yes — and then this should also support the web terminal in our case.
A: That's Tomasz's point, because he said we will check the file status. The job starts, fine. Every time the Kubernetes exec returns with status zero — right, to make sure it actually terminated — instead of trusting that, every time we get a zero we check it. We go: okay, it's zero — let's check the job output. Is there something missing in the job log? We will actually check the file status, because if we know the log has missing data, we should check the file for the status.
A: Yeah, I just — even if we had "job success" and "job failed" markers, how would it work? We would check for those logs, and okay, let's say there's a "job success". Now, am I going to check whether the user connected to the web terminal or not? If the user is connected to the web terminal, do nothing; but if the user did not connect to the web terminal, just kill the pod, since it's an infinite script. I...
C: I don't know how that works, but I think we can just reuse what we have now. Right now, if the script fails, it fails in the context of the pod that is still running, so we still need to kill this pod with its infinite `sh` process, right? And this is exactly the same case — we just don't rely on the exit code; we rely on some specific lines, some specific words in the output, that make the runner decide: okay, this job succeeded.
C: This doesn't change the stages flow. We still get the information about the script failing or succeeding, and at the level of common/build, where the stages are executed, we don't even look at the exit code — we look at the error that was returned by the script execution, or the lack of it. This is what the executor is responsible for, so in the case of Kubernetes we can just push the logs through some hook that will read them, check whether the marker we want is there, and push them further.
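A sketch of what such a hook could look like; the marker format here is hypothetical:

```go
package executor

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// exitMarker is a hypothetical marker format; the real choice of marker is
// exactly what would need to be designed.
const exitMarker = "#RUNNER-SCRIPT-EXIT:"

// scanLogs copies log lines to the job trace while watching for the exit
// marker. It returns the script's exit code, or an error if the stream ended
// (for example because of a network failure) before any marker was seen — in
// which case the caller can simply re-attach and call it again.
func scanLogs(logs io.Reader, trace io.Writer) (int, error) {
	scanner := bufio.NewScanner(logs)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, exitMarker) {
			return strconv.Atoi(strings.TrimSpace(strings.TrimPrefix(line, exitMarker)))
		}
		fmt.Fprintln(trace, line)
	}
	if err := scanner.Err(); err != nil {
		return 0, err
	}
	return 0, errors.New("log stream ended without an exit marker; retry the attach")
}
```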
C
Why
why
you
say
we
need
to
have
them
all
connected
right?
We
all
we
can
attach
multiple
times
each
time.
Well,
we
from
common,
build.
We
execute
the
screen.
This
is
the
executor
to
somehow
schedule
the
script
on
the
target
environment
in
case
of
kubernetes.
In
this
moment,
we
attach
to
the
run
container,
send.
C: ...the script through the — yeah, and then we attach to the logs and we start reading from them. We read from them until something breaks — again, the networking stuff — and then we can retry it; or we get the marker that says that the job succeeded, and then we return from this command in the exact same way we do it now, returning without an error. Or we get a log...
C: ...that says that we have some error in the script. Internally it would exit with some exit code, but we just make it not send the exit code, and then we know that we need to return — we get back to common/build, and common/build behaves exactly like it does now: it got an error or not, and depending on that it executes the next stage or not.
C: When this is finished, we go to the cleanup method of the executor, and at this moment the executor already knows whether someone connected to the web terminal or not — we don't need to check it, it's checked already. If it's connected, it just waits for the terminal to exit, or for the timeout, to kill the pod; finally, if it's not connected, it kills the pod. So the only thing that changes is detecting the exit of the script and detecting whether it was a failure.
C: We need to update the script, at least for Kubernetes, so that it will not exit; instead we'll have there a trap that will catch the exit code and write something to the output. And after we attach — after we send the script to the standard input — we don't stay there attached, watching the standard output and standard error of the process; instead we switch to the logs.
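And a sketch of the writer side of that trap idea, matching the hypothetical marker above:

```go
package executor

import "fmt"

// Same hypothetical marker format as in the reader sketch above.
const exitMarker = "#RUNNER-SCRIPT-EXIT:"

// wrapForAttach prefixes the script with a shell trap: whichever way the
// script ends, the trap prints its exit code as a marker line the runner can
// pick out of the pod logs, instead of the code being lost with the
// connection or terminating the pod's main process.
func wrapForAttach(script string) string {
	return fmt.Sprintf("trap 'echo \"%s $?\"' EXIT\n%s\n", exitMarker, script)
}
```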
C: A requirement that we have had for Runner from the beginning is that you need to have a shell, and in the case of Docker you need to make the container start with the shell. So if we have any problems with the entrypoint on the Kubernetes executor, it needs to be fixed, but this doesn't change the requirement: we give you the entrypoint, so you can do anything you want, but finally you need to give us the shell, so we can start executing something in that shell.
C: Looking at this, it looks just like a nasty, hard-to-prepare workaround; it doesn't look like a proper solution, and the second version looks way more proper. I'm still not a big fan of using the output and some specific markers to detect failures, but to my mind it's still better than saving the exit code to a file and then reading the file to get the exit code — as if we didn't have an exit code at all.