From YouTube: Kubernetes SIG Node CI 20230920
Description
SIG Node CI weekly meeting. Agenda and notes: https://docs.google.com/document/d/1fb-ugvgdSVIkkuJ388_nhp2pBTy_4HEVg5848Xy7n5U/edit#heading=h.2v8vzknys4nk
GMT20230920-170330_Recording_2560x1440.mp4
D
Yeah, hi everyone. As part of the KEP I was working on, the split image filesystem one, I noticed that the kubelet already supports a dedicated image filesystem, but we don't have any test coverage of that. I'd like to add eviction tests for the split disk case, but I think I probably want to add them for the dedicated image filesystem as well, and I was just curious what the group thinks about that.
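For context on what "dedicated image filesystem" means here: the CRI exposes an ImageFsInfo RPC that reports which filesystem backs the image store, and a test can use it to tell a split or dedicated setup apart from the default single-disk one. A minimal sketch, assuming containerd's default socket path (CRI-O would use unix:///var/run/crio/crio.sock):

```go
// Minimal sketch: query the CRI ImageFsInfo RPC to see which filesystem
// backs the image store. On a node with a dedicated image filesystem, the
// reported mountpoint differs from the root (nodefs) mount.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
	conn, err := grpc.Dial("unix:///run/containerd/containerd.sock",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dial CRI socket: %v", err)
	}
	defer conn.Close()

	client := runtimeapi.NewImageServiceClient(conn)
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	resp, err := client.ImageFsInfo(ctx, &runtimeapi.ImageFsInfoRequest{})
	if err != nil {
		log.Fatalf("ImageFsInfo: %v", err)
	}
	for _, fs := range resp.ImageFilesystems {
		// Getters are nil-safe for the optional fields.
		fmt.Printf("image fs mountpoint=%s usedBytes=%d\n",
			fs.GetFsId().GetMountpoint(), fs.GetUsedBytes().GetValue())
	}
}
```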
D
I don't know whether that would be allowed or not. I don't know.
A
So, definitely allowed, and it definitely needs to be added. We have a long history of two eviction tests failing, one on CRI-O and one on containerd, and it's different eviction tests. I think the latest I heard is that on containerd the eviction test fails because it fails to fill up the disk fast enough, because the disk became too big on the infra nodes. So if you can look into that as well, it'll be really appreciated.
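For background, disk-pressure eviction tests work by writing data until disk usage crosses the eviction threshold; when the boot disk grows, the same writer takes proportionally longer and can blow the test timeout. A rough sketch of that fill step, with the path and target size made up for illustration (the real node e2e tests do this inside a pod):

```go
// Illustrative sketch of the "fill the disk" step an eviction test performs:
// append large chunks to a scratch file until the target number of bytes
// is written or the write fails (typically ENOSPC once the disk is full).
package main

import (
	"log"
	"os"
)

func fillDisk(path string, totalBytes int64) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	chunk := make([]byte, 64<<20) // 64 MiB per write; bigger chunks fill faster
	var written int64
	for written < totalBytes {
		n, err := f.Write(chunk)
		written += int64(n)
		if err != nil {
			return err // usually ENOSPC: the disk is full
		}
	}
	return f.Sync()
}

func main() {
	// Try to write 50 GiB. On a small disk this hits ENOSPC quickly and
	// triggers disk-pressure eviction; on a large infra node it may simply
	// take too long for the test timeout, which matches the failure above.
	if err := fillDisk("/var/lib/scratch/fill.bin", 50<<30); err != nil {
		log.Printf("stopped filling: %v", err)
	}
}
```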
D
I mean, I think I can get them to fail locally. I'm not sure if it's the same reason, but I will continue looking into the eviction test cases, especially on the CRI-O side, to see if I can find why they're failing. But yeah, and then the other part is also the dedicated image filesystem.
A
So you need a machine with an extra disk specifically for the image filesystem, right?
D
I'll go ahead and ask that. And I guess the question I have, maybe you all can clear something up for me: do I add it for AWS or GCP?
A
Both work, I think. You will have more experience with GCP, but the...
A
I think, I don't know how many tests you want to run on this specific environment with the extra disk. It will probably be just one test, right? Like one area that you want to test.
A
It may be easier to start with a separate job specifically for this extra disk. It will only run one test, and then we will decide how fast it is. Eviction tests are generally slow, and I'm not sure what the extra cost of running with this disk will be.
D
Fair enough, yeah. I'll think of some test cases for that, and then I'll try to keep getting back to investigating the eviction ones. Yeah.
D
Oh, and I finally got the pre-submits running, so I can actually test the CRI-O jobs on PRs. That'll be easier for the eviction tests.
D
No, it's just that they were all periodic jobs. Only a few of the CRI-O periodic jobs could actually be triggered from a PR, so changes would have to be merged and then go through the cron job to test them, which is not great for testing.
A
I thought that all the eviction tests are mostly equivalent logic. What I'm wondering is: is there anything specific to the container runtime, or is it just easier for you because you have a local environment set up?
D
Yeah, I have CRI-O set up locally; that's the one I'm testing with. The main difference is stats, which is why one of them is failing on containerd and not failing on CRI-O. I think David Porter pointed out there's some issue with PIDs around stats; that's the last issue I saw around that. And that's failing for the CRI stats provider, but I don't think it fails for anything that's using the cAdvisor stats provider, which is CRI-O.
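For reference, the stats-provider split mentioned here is controlled by the PodAndContainerStatsFromCRI feature gate: when it's enabled the kubelet asks the runtime for pod and container stats, and when it's disabled it uses cAdvisor. A minimal sketch of toggling it via the kubelet configuration type; the surrounding program is illustrative only:

```go
// Rough sketch of the configuration knob behind the stats difference
// discussed above. The feature-gate name is real; everything else here
// is just scaffolding to show where it lives.
package main

import (
	"fmt"

	kubeletv1beta1 "k8s.io/kubelet/config/v1beta1"
)

func main() {
	cfg := kubeletv1beta1.KubeletConfiguration{
		FeatureGates: map[string]bool{
			// When true, stats summaries come from the CRI runtime
			// rather than cAdvisor, which is where runtime-specific
			// discrepancies (for example around PID stats) can show up.
			"PodAndContainerStatsFromCRI": true,
		},
	}
	fmt.Println(cfg.FeatureGates)
}
```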
D
So there are slight differences between those. And yeah, the other thing on the agenda is actually, this is something that Ryan fixed yesterday, but I...
D
We were noticing that with the Prow jobs there was some issue. I don't fully understand all the stuff in Prow, but there are two required pre-submit jobs from SIG Node, I think. Maybe one was an e2e GCE job, which I don't know who owns, and the other was, I think, a SIG Node end-to-end test. They are using the deprecated bootstrap.py job, and those started failing. I think we probably should consider trying to migrate those jobs. I'm happy to take that on for the required ones, because I've been poking around a lot in the test-infra scripts lately, so I'll just, yeah.
A
Moving to decorated Prow jobs is definitely preferred. You can ping me if needed for quick approval, yeah.
A
And for the previous item: I remember that when we configure images to run on, I think we can specify the machine type and some other characteristics, but we cannot specify the boot disk size. If you will be looking at adding an extra disk, maybe you can also look into configuring the boot disk size, because I believe that was an issue for the eviction tests: the default boot disk size increased and we failed to fill it up fast enough.
A
Anyway, yeah, thank you for looking at that. In general we don't have too many edge cases tested in the kubelet, so more eviction tests and other stress environments will be better for reliability.
B
Okay, then I guess we can move to the Testgrid triage.
B
I'll check it later, then. Confirming the release jobs: they are looking good.
B
I see the one from anthropology... do you think it was, yeah.
B
Yeah, then let's wait for the PR, and hopefully this is... So your PR was addressing the test, right, not the job?
B
This one looks fine, I think. I think all the tests that we have failing already have an issue tracking them, so there's no need to create any new ones. With that in mind, I think we can finish.
B
If you want anything else off my screen, you...
A
The test board, I think it's part of...
B
This hasn't been moving for a while. Okay, yeah, I think we reported this one last week, but nobody has had time to look into it.
B
Erasing important helpers, etc. areas.
B
Yeah, we have a couple of known flakes, but this doesn't look like it should be in this test here. I'm going to move it.
E
Termination grace period zero, force deleted: when a pod with spec terminationGracePeriodSeconds of 0 is deleted without force, it is force deleted from the API server without the kubelet killing its containers and unmounting its volumes first.
E
Can you elaborate, like...
B
I wasn't... what I was talking about, yeah, I think this is intended behavior. Basically, when you delete a pod, you have a termination grace period, yeah, and it specifies how long it waits between sending a SIGTERM and then a SIGKILL to delete the pod. Once it sends the SIGKILL, the pod is deleted completely. So if...
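To make the two deletion paths concrete, here is a minimal client-go sketch; the namespace and pod name are placeholders. A normal delete honors the pod's grace period (SIGTERM, wait, then SIGKILL), while GracePeriodSeconds of 0 removes the API object immediately:

```go
// Minimal client-go sketch of a force delete: with GracePeriodSeconds: 0
// the API object goes away without waiting for the kubelet to kill the
// containers and unmount the volumes first, which is the behavior the
// issue above describes.
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	cs, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	zero := int64(0)
	// Force delete "example-pod" in "default" (placeholder names).
	err = cs.CoreV1().Pods("default").Delete(context.TODO(), "example-pod",
		metav1.DeleteOptions{GracePeriodSeconds: &zero})
	if err != nil {
		log.Fatal(err)
	}
}
```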
E
Right, right. So yeah, because the grace period is set to zero here, it should be force deleted immediately, correct? Right.
E
This can be dangerous for StatefulSets. They should guarantee that only a single replica of a pod runs, but KCM will start a new replica while the old replica is still running.
E
Okay, I couldn't follow what they are trying to say about StatefulSets.
A
If a pod is terminated with grace period zero and the container is still running on the node, like it still has the volume attached, then when you create a new pod it may fail to attach the same volume, or, I mean, it may do something unexpected, because two replicas will be running at the same time.
A
Remove it from the project altogether: if you click on... yeah, or go back to the project board and then remove it.
A
Yeah, just click the three dots and remove it from the project. Thank you.
E
Garbage collection for container images is unpredictable and inconsistent. The garbage collector for the container image lifecycle does not seem to adhere to the documentation and the provided parameters. Per the docs, the configured high threshold percent value triggers garbage collection, which deletes images in order based on the last time they were used, starting with the oldest first; the kubelet deletes images until disk usage reaches the low threshold percent value. From the test, this does not appear to be the case.
E
Often the garbage collection will delete way more images than required, often dropping below the low threshold percentage and deleting three or four at a time. Either create dummy images or identify some appropriate image, deploy a pod, and then delete it; the image will become unused. After a while, disk usage will reach the high threshold percent value and GC will be triggered. The GC should then delete the earliest unused image. This does not happen: priority is sometimes given to other images, and more than one is deleted in a random order.
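For comparison, here is an illustrative sketch (not the kubelet's actual code) of the documented policy the reporter expects: once usage crosses the high threshold, delete least-recently-used images until usage falls below the low threshold:

```go
// Illustrative model of the documented image GC policy being discussed.
// The reporter claims the observed order and count don't match this.
package main

import (
	"fmt"
	"sort"
	"time"
)

type image struct {
	name     string
	sizeByte int64
	lastUsed time.Time
}

// imagesToFree returns the images the documented policy would remove.
func imagesToFree(images []image, used, capacity, highPct, lowPct int64) []image {
	if used*100 < capacity*highPct {
		return nil // below the high threshold: GC is not triggered
	}
	target := capacity * lowPct / 100
	// Oldest "last used" first, per the docs.
	sort.Slice(images, func(i, j int) bool {
		return images[i].lastUsed.Before(images[j].lastUsed)
	})
	var freed []image
	for _, img := range images {
		if used <= target {
			break
		}
		freed = append(freed, img)
		used -= img.sizeByte
	}
	return freed
}

func main() {
	now := time.Now()
	imgs := []image{
		{"old", 2 << 30, now.Add(-48 * time.Hour)},
		{"mid", 1 << 30, now.Add(-24 * time.Hour)},
		{"new", 3 << 30, now.Add(-1 * time.Hour)},
	}
	// 10 GiB disk, 9 GiB used, high=85%, low=80%: only "old" should go.
	for _, img := range imagesToFree(imgs, 9<<30, 10<<30, 85, 80) {
		fmt.Println("delete:", img.name)
	}
}
```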
So
they
are
saying
the
configured
high
threshold
percentage
value
triggers
garbage
collection,
okay,
yeah,
that's
what
we
are
expecting,
which
relates
images
in
order
of
the
last
time
they
were
used
but
I
think
that's
not
happening.
E
So I think this is, like it says here, related to whatever CRI plugin they are using. But then the author is saying that even though this is not a bug in Kubernetes, it can be done in a better way, without dumping this responsibility on the CRI plugin.
D
I mean, it does look like we do something related to that now. Sorry, I mean, I think we are sorting by last used in the list, but at least that's the list that we are...
E
I think if we can try to reproduce this, that would be the right step to take, I think. So again...
E
And then sort of see if it's required to change the documentation.
A
So didn't they comment after that, right?
A
Yeah, just ask the creator of the bug, with what was suggested. Is the...
A
Yeah, but yeah, it will apply a label.
D
I did look into this one, actually. I asked them to reproduce on later versions, and they said they were able to. I have a repro case there for the local-up-cluster, but I don't know if this is really a bug. I think I am unclear.
E
Kubelet doesn't respect resolv.conf when resolv.conf is empty or full of comments. When passing the resolv-conf flag, it did not respect the option and still copied /etc/resolv.conf to the pod. What makes it special is that the resolv.conf provided by myself is empty. After skimming the code, I found the logic is problematic: because the resolv.conf content is empty, an empty array and no error are returned, as if the DNS policy were Default.
E
The pod's DNS config turns out empty, because containerd will get an empty DNS config, so it will copy resolv.conf from the host to the container. The expectation is that the kubelet should refuse to apply the wrong, empty resolv.conf when creating a pod, and complain with errors. So I think they want, like, they are suggesting we have some validation for this parameter.
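A sketch of the kind of validation being suggested, as a hypothetical helper (not the kubelet's actual parser): treat a resolv.conf with no usable entries as an error rather than as an empty but valid DNS config:

```go
// Hypothetical validation helper: when the file the kubelet was pointed
// at yields no usable DNS configuration, surface an error instead of
// silently returning an empty config (which the runtime then "fixes" by
// copying the host's /etc/resolv.conf).
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func parseResolvConf(path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var nameservers []string
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") {
			continue // skip blanks and comments
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[0] == "nameserver" {
			nameservers = append(nameservers, fields[1])
		}
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	// The suggested behavior: an empty or comment-only file is an error,
	// not an empty-but-valid DNS config.
	if len(nameservers) == 0 {
		return nil, fmt.Errorf("%s contains no nameserver entries", path)
	}
	return nameservers, nil
}

func main() {
	ns, err := parseResolvConf("/etc/resolv.conf")
	fmt.Println(ns, err)
}
```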
A
They say that if you pass an empty resolv.conf, the kubelet will disregard it and use its default one, and this causes some problems. I'm not sure what kind of problems it causes for them, but, like Antonio asked, why would you even consider passing an empty resolv.conf file? And the scenario here is that somebody passes a resolv.conf file that is empty originally, but then it will be written by...
A
So they want to bootstrap it empty, but then have it updated later.
A
It's a race condition between somebody passing an empty file and filling it up later, so it's kind of a race in how configuration is applied. Instead of writing the config file first and then starting the kubelet, somebody starts the kubelet and then wants to populate this file, because it will be added later.
A
I don't think from SIG Node there will be any action. I would suggest removing it from SIG Node and saying, yeah, it's for SIG Network to decide how to migrate customers to this.
E
The memory manager: unexpected admission error. A dual-socket server with threads, CPU manager static policy, topology manager policy best-effort, 10 gigs of RAM. If I try to allocate two guaranteed pods, the first one is admitted and the second one fails with an unexpected admission error, even if it would fit using memory of both NUMA nodes.
E
They have some memory reserved for NUMA 0 and launch two identical pods with memory limits really close to the max, so one pod fits on NUMA 1, but the second one doesn't fit on NUMA 0.
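To make the reproducer shape concrete, a sketch of the two identical Guaranteed-QoS pods (requests equal to limits); the names, image, and sizes are illustrative, with the memory limit chosen near a single NUMA node's capacity:

```go
// Sketch of the reproducer shape: two identical Guaranteed-QoS pods whose
// memory request/limit is close to one NUMA node's allocatable memory.
// With the memory manager's Static policy, each pod must be pinned to a
// single NUMA node, so the second pod can fail admission.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func guaranteedPod(name string) *corev1.Pod {
	res := corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("2"),
		corev1.ResourceMemory: resource.MustParse("9Gi"), // near one NUMA node
	}
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "app",
				Image: "registry.k8s.io/pause:3.9",
				// requests == limits => Guaranteed QoS, so the memory
				// manager must find a NUMA node with 9Gi free.
				Resources: corev1.ResourceRequirements{Requests: res, Limits: res},
			}},
		},
	}
}

func main() {
	for _, p := range []*corev1.Pod{guaranteedPod("mm-pod-1"), guaranteedPod("mm-pod-2")} {
		mem := p.Spec.Containers[0].Resources.Limits[corev1.ResourceMemory]
		fmt.Println(p.Name, mem.String())
	}
}
```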
F
Hey, I'm monitoring this one, and this other person is a teammate of mine, so you can even assign it to me and we're going to complete the triage. We're discussing whether the behavior is, I mean, consistent with the KEP and the expected behavior or not, and we're still trying to figure out if it's an actual bug or not.
E
Okay, could you please summarize what the issue is?
F
But the behavior is not that, and they were expecting the pod to be pending, but this is not actually possible. So the only possible behavior is that the pod goes up and consumes memory from both NUMA zones. But this is actually our memory-manager-specific behavior, so we need to deep dive into whether the behavior is legal or not. Okay.
A
I think Tyler is trying to say it is a feature, so yeah.
E
Next one: node status error handling in the kubelet. The node's Ready condition is true, but node status addresses is empty. This indicates an issue populating the addresses field.
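A quick diagnostic sketch of the reported inconsistency using client-go: list nodes whose Ready condition is True but whose status.addresses is empty (the kubeconfig path is the usual default):

```go
// Diagnostic loop for the reported inconsistency: a Ready node should
// normally have at least one entry in status.addresses.
package main

import (
	"context"
	"fmt"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	cs, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}
	nodes, err := cs.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	for _, n := range nodes.Items {
		ready := false
		for _, c := range n.Status.Conditions {
			if c.Type == corev1.NodeReady && c.Status == corev1.ConditionTrue {
				ready = true
			}
		}
		if ready && len(n.Status.Addresses) == 0 {
			fmt.Println("ready node with empty addresses:", n.Name)
		}
	}
}
```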
A
Yeah, I think it's a little bit involved with node addresses, but mostly it's a SIG Network thing.
A
Because of the external provider and how we handle pod IPs: let's keep the note here, but let's not triage it. Let's wait for SIG Network.
A
Something similar, for the same reason as before, so.
A
Unless somebody on this call wants to take a look? No? Okay, yeah. There is a person who is looking into in-place pod updates; I will try to find them and attach the issue. Okay.
D
I looked at this one a little bit. I think what they're trying to say is, they probably are suggesting it's not a bug, it's a feature. I think they want CRI to tell you whether or not the images are compressed, and they already posted an issue on the cri-tools repo, and I guess somebody suggested posting it here. Because at least that's what I think is going on in this issue.
E
Because there's one, I think, that's uncompressed.
E
Sure. Do you want to add the details here to the documentation as well?
E
Yeah, but I mean the behavior of the current API is to show the compressed size of each image. Do we want to document that behavior?
A
Yeah, we need to document what we currently have, and then decide whether we want to change it.
A
Yeah, I think, unless somebody pushes back.