Queue cache - add reverse workload mapping to stop relying on previous workload state during update/delete #8001
base: main
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
Hi @Singularity23x0. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test
olekzabl
left a comment
Nice! Though some more minor comments.
/lgtm
LGTM label has been added. Git tree hash: 2ef83dc41dd58962f464055f949c596915d04c5a
		continue
	}
	wl := cq.Pop()
	m.reportPendingWorkloads(cqName, cq)
Why is this moved? It looks unrelated to the PR.
To summarize the discussion we had on this point in the previous iteration of the PR:
One of the tests (test/integration/singlecluster/controller/jobs/appwrapper/appwrapper_controller_test.go -> "Should schedule AppWrappers as they fit in their ClusterQueue") had an issue with reporting pending workloads: the existence of the "inflight" workload made it think there was one pending.
With the reporting call placed after the nil check, this remained the case, whereas now, if the inflight workload is set back to nil, the metric is corrected accordingly.
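For context, here is a minimal, self-contained Go sketch of the ordering being discussed. The types and names below (clusterQueue, inflight, Pending, a free reportPendingWorkloads function) are simplified stand-ins for illustration only, not Kueue's actual implementation; the point is just that reporting the pending count after the inflight bookkeeping means a later reset of the inflight field gets reflected in the metric.

```go
package main

import "fmt"

// Simplified stand-ins for the real types; names are illustrative only.
type workload struct{ name string }

type clusterQueue struct {
	heap     []*workload
	inflight *workload // workload popped but not yet dispatched
}

// Pop removes the head workload and records it as inflight.
func (cq *clusterQueue) Pop() *workload {
	if len(cq.heap) == 0 {
		return nil
	}
	wl := cq.heap[0]
	cq.heap = cq.heap[1:]
	cq.inflight = wl
	return wl
}

// Pending counts queued workloads plus any inflight one.
func (cq *clusterQueue) Pending() int {
	n := len(cq.heap)
	if cq.inflight != nil {
		n++
	}
	return n
}

// reportPendingWorkloads stands in for updating the pending-workloads metric.
func reportPendingWorkloads(cqName string, cq *clusterQueue) {
	fmt.Printf("metric: pending_workloads{cluster_queue=%q} = %d\n", cqName, cq.Pending())
}

func main() {
	cq := &clusterQueue{heap: []*workload{{name: "wl-1"}}}

	// Reporting after Pop reflects the current state, including the
	// inflight workload that was just taken off the heap.
	wl := cq.Pop()
	_ = wl
	reportPendingWorkloads("cq-a", cq) // -> 1 (wl-1 counted as inflight)

	// If the scheduling attempt is abandoned and inflight is cleared,
	// reporting again corrects the metric.
	cq.inflight = nil
	reportPendingWorkloads("cq-a", cq) // -> 0
}
```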
This makes sense, but is this test failing only after the PR? I'm wondering if we should consider this as a small bug we would cherrypick, wdyt?
My investigations were inconclusive as to what exactly in the changes causes this issue to become visible. My best guess so far is that the changed logic delayed some async operations, so they happened later than planned and did not cache this particular update to the inflight field in time.
I can split it out as a bug fix and put it in a separate PR to be merged beforehand; working on it.
So if we don't change the location of the line in the new version, is the test failing consistently or flaking?
Also, do you have an example run that failed?
Update: un-drafted and ready
Yeah, thank you for extracting it. It would still be great to spend some time trying to understand what the issue here is, and its impact. It would be greatly appreciated if you could push the investigation a bit further.
Added more info on test flakiness: #8037 (comment)
sgtm
mimowo
left a comment
LGTM, just one nit / question
@Singularity23x0 please rebase
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: kshalot, olekzabl, Singularity23x0. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Part 2/3 of updating the queue cache.
The PR introduces changes allowing O(1)-time retrieval of the current state of a workload assumed into the queue cache. This makes the update/delete methods independent of an object representing the workload's state before the update.
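As a rough illustration of the idea only (the data structures and names below, such as workloadToQueue, are hypothetical and not Kueue's actual queue cache), a reverse mapping keyed by workload lets Update/Delete find the currently cached copy in O(1) time without the caller supplying the pre-update object:

```go
package main

import "fmt"

// Illustrative stand-ins only; the real Kueue cache types differ.
type workloadInfo struct {
	key   string
	queue string
}

type queueCache struct {
	// queues maps a queue name to the workloads it currently holds.
	queues map[string]map[string]*workloadInfo
	// workloadToQueue is the reverse mapping: workload key -> queue name,
	// giving O(1) access to the cached state without the old object.
	workloadToQueue map[string]string
}

func newQueueCache() *queueCache {
	return &queueCache{
		queues:          map[string]map[string]*workloadInfo{},
		workloadToQueue: map[string]string{},
	}
}

// Add stores the workload and records the reverse mapping.
func (c *queueCache) Add(wl *workloadInfo) {
	if c.queues[wl.queue] == nil {
		c.queues[wl.queue] = map[string]*workloadInfo{}
	}
	c.queues[wl.queue][wl.key] = wl
	c.workloadToQueue[wl.key] = wl.queue
}

// Delete no longer needs the previous workload object: the reverse mapping
// tells us which queue currently holds the cached copy.
func (c *queueCache) Delete(key string) {
	q, ok := c.workloadToQueue[key]
	if !ok {
		return
	}
	delete(c.queues[q], key)
	delete(c.workloadToQueue, key)
}

// Update replaces the cached copy, moving it between queues if needed.
func (c *queueCache) Update(wl *workloadInfo) {
	c.Delete(wl.key)
	c.Add(wl)
}

func main() {
	c := newQueueCache()
	c.Add(&workloadInfo{key: "ns/wl-1", queue: "cq-a"})
	c.Update(&workloadInfo{key: "ns/wl-1", queue: "cq-b"})
	c.Delete("ns/wl-1")
	fmt.Println(len(c.workloadToQueue)) // 0
}
```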
Which issue(s) this PR fixes:
Part 1.2 of addressing: #5310
Special notes for your reviewer:
Created from #7915
Does this PR introduce a user-facing change?