You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@@ -92,11 +94,11 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
92
94
-[x] (R) KEP approvers have approved the KEP status as `implementable`
93
95
-[x] (R) Design details are appropriately documented
94
96
-[x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
95
-
-[] e2e Tests for all Beta API Operations (endpoints)
96
-
-[] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
97
-
-[] (R) Minimum Two Week Window for GA e2e tests to prove flake free
97
+
-[x] e2e Tests for all Beta API Operations (endpoints)
98
+
-[x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
99
+
-[x] (R) Minimum Two Week Window for GA e2e tests to prove flake free
98
100
-[x] (R) Graduation criteria is in place
99
-
-[] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
101
+
-[x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
100
102
-[x] (R) Production readiness review completed
101
103
-[x] (R) Production readiness review approved
102
104
-[x] "Implementation History" section is up-to-date for milestone
@@ -140,7 +142,7 @@ controller, and delegate the status synchronization to the Kueue controller.
140
142
### Non-Goals
141
143
142
144
- passing custom parameters to the external controller
143
-
- Introduce a new concurrency policy for CronJobs (eg. `ForbidActive` or `SoftForbid`)
145
+
- Introduce a new concurrency policy for CronJobs (e.g. `ForbidActive` or `SoftForbid`)
144
146
to replace a Job that is about to complete, but still has terminating pods.
145
147
146
148
## Proposal
@@ -278,7 +280,7 @@ a typo in the field value, or the cluster administrator may not install the
278
280
custom controller (like MultiKueue) on that cluster.
279
281
280
282
In such cases the user may not observe any progress by the job for a long time
281
-
and my need to debug the Job.
283
+
and may need to debug the Job.
282
284
283
285
In order to allow for debugging of situations like this the Job controller will
284
286
put a log line indicating the synchronization is delegated to another controller
@@ -293,9 +295,15 @@ providing a user readable information if the Job is synchronized by a custom
and [events](#event-indicating-the-job-is-skipped).
298
300
301
+
Based on beta feedback, kubectl extensions are deferred as follow-up enhancements.
302
+
The "managedBy" field is already visible via standard kubectl commands
303
+
(`kubectl get job <name> -o yaml`), and MultiKueue users have not reported
304
+
difficulties debugging Jobs with this field. We will reconsider these enhancements
305
+
based on user feedback post-GA.
306
+
299
307
#### Custom controllers not compatible with API assumptions by CronJob
300
308
301
309
Currently, the validation of the Job status API is rather relaxed, allowing for
@@ -352,9 +360,6 @@ type JobSpec struct {
352
360
// all characters before the first "/" must be a valid subdomain as defined
353
361
// by RFC 1123. All characters trailing the first "/" must be valid HTTP Path
354
362
// characters as defined by RFC 3986. The value cannot exceed 64 characters.
355
-
//
356
-
// This field is alpha-level. The job controller accepts setting the field
357
-
// when the feature gate JobManagedBy is enabled (disabled by default).
358
363
// +optional
359
364
ManagedBy *string
360
365
}
@@ -373,6 +378,11 @@ We will re-evaluate for [GA](#ga) to also skip the reconciliation within the
373
378
`enqueueSyncJobInternal` for optimal performance. See discussion in the
374
379
[Skip reconciliation in the event handler](#skip-reconciliation-in-the-event-handler).
375
380
381
+
Skipping reconciliation in event handlers is deferred as a performance
382
+
optimization for post-GA. This is considered a premature optimization
383
+
without evidence of performance issues from MultiKueue users on the
384
+
management cluster.
385
+
376
386
There is no validation for a value of the field beyond its format as described
377
387
in the [API](#API) comment above.
378
388
@@ -493,9 +503,9 @@ The following scenarios are covered:
493
503
- it does not add the Suspended condition for a Job with `.spec.suspend=true` ([link](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L2059)).
494
504
- the Job controller reconciles jobs with custom "managedBy" field when the feature gate is disabled ([link](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L2030))
495
505
- the Job controller handles correctly re-enablement of the feature gate [link](https://github.com/kubernetes/kubernetes/blob/169a952720ebd75fcbcb4f3f5cc64e82fdd3ec45/test/integration/job/job_test.go#L1691)
496
-
- the `job_by_external_controller_total` metric is incremented when a new Job with custom "managedBy" is created ([link](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L2044-L2058))
497
-
- the `job_by_external_controller_total` metric is not incremented for a new Job without "managedBy" or with default value ([link](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L2000-L2029))
498
-
- the `job_by_external_controller_total` metric is not incremented for Job updates (regardless of the "managedBy") (tested indirectly as [here](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L2000-L2029) the Job controller updates the Job status)
506
+
- the `jobs_by_external_controller_total` metric is incremented when a new Job with custom "managedBy" is created ([link](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L2044-L2058))
507
+
- the `jobs_by_external_controller_total` metric is not incremented for a new Job without "managedBy" or with default value ([link](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L2000-L2029))
508
+
- the `jobs_by_external_controller_total` metric is not incremented for Job updates (regardless of the "managedBy") (tested indirectly as [here](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L2000-L2029) the Job controller updates the Job status)
499
509
500
510
The following scenarios related to [Terminating pods and terminal Job conditions](#terminating-pods-and-terminal-job-conditions) are covered:
501
511
-`Failed` or `Complete` conditions are not added while there are still terminating pods ([link](https://github.com/kubernetes/kubernetes/blob/856475e5fffe3d99c71606d6024f5ed93e37eebc/test/integration/job/job_test.go#L1183))
@@ -513,6 +523,8 @@ We propose a single e2e test for the following scenario:
513
523
- the Job controller does not reconcile a job with any other value of the "managedBy" field. In particular,
514
524
it does not reset the status for an unsuspended Job.
515
525
526
+
[Job should allow to delegate reconciliation to external controller](https://github.com/kubernetes/kubernetes/blob/b393d87d16f225f873f72a79734b3409323b4a05/test/e2e/apps/job.go#L1310-L1333): [SIG Apps](https://testgrid.k8s.io/sig-apps#gce&include-filter-by-regex=external%20controller&include-filter-by-regex=Job%20should%20allow%20to%20delegate%20reconciliation%20to%20external%20controller), [triage search](https://storage.googleapis.com/k8s-triage/index.html?sig=apps&test=Job%20should%20allow%20to%20delegate%20reconciliation%20to%20external%20controller)
527
+
516
528
### Graduation Criteria
517
529
518
530
#### Alpha
@@ -523,7 +535,7 @@ We propose a single e2e test for the following scenario:
523
535
API fields affected by the new validation rules
524
536
- make CronJob more resilient by checking the Job condition is `Complete` when using `CompletionTime` (see [here](#custom-controllers-not-compatible-with-api-assumptions-by-cronjob))
525
537
- The feature flag disabled by default
526
-
- implement the `job_by_external_controller_total` metric
538
+
- implement the `jobs_by_external_controller_total` metric
527
539
528
540
Second Alpha (1.31):
529
541
- preparatory fix to address all known inconsistencies between validation and the
@@ -546,10 +558,60 @@ Second Alpha (1.31):
546
558
#### GA
547
559
548
560
- Address reviews and bug reports from Beta users
549
-
- Re-evaluate the ideas of improving debuggability (like [extended `kubectl`](#debuggability), [dedicated condition](#condition-to-indicated-job-is-skipped), or [events](#event-indicating-the-job-is-skipped))
561
+
- So far, with the adoption in MultiKueue we have not got users requesting extra debuggability. With the "managedBy" field being immutable it seems the feature is not causing issues requiring extensive debug. The metric and immutable field should be sufficient as a source of truth.
562
+
563
+
- Re-evaluate the ideas of improving debuggability (like [extended `kubectl`](#debuggability), [dedicated condition](#condition-to-indicate-job-is-skipped), or [events](#event-indicating-the-job-is-skipped))
564
+
- Keep these ideas as follow-up features and not blockers for the graduation.
565
+
550
566
- Re-evaluate the need to skip reconciliation in the event handlers to optimize performance
551
-
- Assess the fragmentation of the ecosystem. Look for other implementations of a job controller and asses their conformance with k8s.
567
+
- This could be a follow-up performance improvement, but does not seem required. We have no record of MultiKueue users hitting performance issues on the management cluster due to Job controller.
568
+
569
+
- Assess the fragmentation of the ecosystem. Look for other implementations of a job controller and assess their conformance with k8s.
570
+
- See [Ecosystem Assessment](#ecosystem-assessment) section below.
571
+
552
572
- Lock the feature gate
573
+
- This will be done in follow-up implementation PR for 1.35.
574
+
575
+
##### Ecosystem Assessment
576
+
577
+
We already have a couple of implementations of the "managedBy" field in order to integrate with MultiKueue:
578
+
- JobSet
579
+
- Kubeflow Trainer v1
580
+
- Kubeflow Trainer v2
581
+
- KubeRay
582
+
- Tekton Pipelines
583
+
- AppWrapper
584
+
585
+
These implementations follow the k8s core design, particularly making the field immutable.
| Open set of allowed values | y | y | n | n | n | y | y |
592
+
| Max length 63 chars | y | y | y (by closed set) | y (by closed set) | y (by closed set) | n | n |
593
+
| Specific format validation | y | y | y (by closed set) | y (by closed set) | y (by closed set) | n | n |
594
+
| Immutability | y | y | y | y | y | y | y |
595
+
| Status validation | y | n | n | n | n | n | n |
596
+
| Skip inside Reconcile | y | y | y | y | y | y (also filtering) | y |
597
+
598
+
**Key Findings:**
599
+
- There is a bit inconsistency about the "open set of values" vs "closed set of values"
600
+
- Kubeflow Trainer v1 puts the field under `spec.runPolicy.managedBy`. This is because the code for different controllers is deduplicated at the level of the runPolicy field
601
+
- Kubeflow Trainer locks the set to only "known controllers" like MultiKueue: `kueue.x-k8s.io/multikueue`. This was done to be "on the safe side", and avoid complex status validations as in the core
602
+
- The main part is that all of them follow the key aspects: **immutability** and **skip inside Reconcile**
603
+
- All projects consistently wanted to avoid the effort of validating the status which we took in the k8s Job
604
+
605
+
**Validation inconsistencies:**
606
+
- Some controllers (Tekton, AppWrapper) don't enforce the 63-character length limit. If they allow more than 63 characters, then nothing will break for them. The idea of restricting in the core k8s was to make it more constrained
607
+
- Most controllers skip the status validation that k8s Job implements to avoid the extra work, since "managedBy" is immutable
608
+
609
+
##### Follow-up Work
610
+
611
+
We may reconsider dropping the code gracefully handling Jobs with invalid statuses after 1.39 behind a dedicated feature gate. All Jobs created in 1.32+ won't require the graceful handling as the rules prevent entering invalid state. We will wait at least 4 releases before locking such a feature gate.
612
+
613
+
**Metric Stability:**
614
+
The `jobs_by_external_controller_total` metric will be promoted to STABLE in 1.35.
553
615
554
616
#### Deprecation
555
617
@@ -569,7 +631,7 @@ A downgrade to a version which does not support this feature (1.29 and below)
569
631
does not require any additional configuration changes. All jobs, including these
570
632
that specified a custom value for "managedBy", will be handled in the default
571
633
way by the Job controller. However, this introduces the risk of
572
-
[two controllers running at the same time](#two-controllers-running-at-the-same-time-on-old-version).
634
+
[two controllers running at the same time](#two-controllers-running-when-feature-is-disabled).
573
635
574
636
In order to prepare the risk the admins may want to make sure the custom controllers
575
637
using the "managedBy" field are disabled before the downgrade.
@@ -598,7 +660,7 @@ is not supported on the old version.
598
660
In case the version of the kube-controller-manager leader is skewed (old), the
599
661
built-in Job controller would reconcile the Jobs with custom "managedBy" field,
600
662
running into the risk of
601
-
[two controllers running at the same time](#two-controllers-running-at-the-same-time-on-old-version).
663
+
[two controllers running at the same time](#two-controllers-running-when-feature-is-disabled).
602
664
It is recommended the users don't create jobs with custom "managedBy" field
603
665
during an ongoing upgrade.
604
666
@@ -682,7 +744,7 @@ Yes.
682
744
683
745
However, when the feature is disabled and there are Jobs external controllers by
684
746
using "managedBy" field there is a risk of
685
-
[two controller running at the same time](#two-controllers-running-at-the-same-time-on-old-version).
747
+
[two controller running at the same time](#two-controllers-running-when-feature-is-disabled).
686
748
Thus, it is recommended administrators make sure there are no Jobs using external
687
749
controllers before rollback.
688
750
@@ -752,7 +814,7 @@ being flipped between two owners.
752
814
The feature is opt-in so in case of such problems the custom "managedBy" field
753
815
should not be used.
754
816
755
-
Also, an admin could check if the value of the `job_by_external_controller_total`
817
+
Also, an admin could check if the value of the `jobs_by_external_controller_total`
756
818
matches the expectations. For example, if the value of the metric does not increase
757
819
when new jobs are being added with a custom "managedBy" field, it might be
758
820
indicative that the feature is not working correctly.
@@ -831,7 +893,7 @@ checking if there are objects with field X set) may be a last resort. Avoid
831
893
logs or events for this purpose.
832
894
-->
833
895
834
-
Check the `job_by_external_controller_total` metric. If the value is non-zero
896
+
Check the `jobs_by_external_controller_total` metric. If the value is non-zero
835
897
for a field, it means there were Jobs using the custom controller created, so
836
898
the feature is in use.
837
899
@@ -885,22 +947,24 @@ Pick one more of these and delete the rest.
885
947
886
948
-[x] Metrics
887
949
- Metric name:
888
-
-`job_by_external_controller_total` (new), with the `controller_name` label
950
+
-`jobs_by_external_controller_total` (new), with the `controller_name` label
889
951
corresponding to the custom value of the "managedBy" field. The metric is
890
952
incremented by the built-in Job controller on each ADDED Job event,
891
953
corresponding to a Job with custom value of the "managedBy" field.
892
954
This metric can be helpful to determine the health of a job and its controller
893
955
in combination with already existing metrics (see below).
956
+
- Components exposing the metric: kube-controller-manager
substantial increase of this metric, when additionally `job_by_external_controller_total>0`
958
+
substantial increase of this metric, when additionally `jobs_by_external_controller_total>0`
896
959
may be indicative of two controllers stepping onto each-other causing
897
-
conflicts (see [here](#two-controllers-running-at-the-same-time-on-old-version)).
960
+
conflicts (see [here](#two-controllers-running-when-feature-is-disabled)).
961
+
- Components exposing the metric: kube-apiserver
898
962
- `kube_cronjob_status_active` (existing), substantial increase of this
899
963
metric, may suggest that there are accumulating non-progressing jobs controlled
900
-
by `CronJob`. If additionally `job_by_external_controller_total>0` it may suggest
964
+
by `CronJob`. If additionally `jobs_by_external_controller_total>0` it may suggest
901
965
that the Jobs are getting stuck due to not being synchronized by the custom
902
966
controller.
903
-
- Components exposing the metric: kube-apiserver
967
+
- Components exposing the metric: kube-apiserver
904
968
905
969
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
906
970
@@ -1080,7 +1144,7 @@ N/A.
1080
1144
1081
1145
## Implementation History
1082
1146
1083
-
- 2023-12.20 - First version of the KEP
1147
+
- 2023-12-20 - First version of the KEP
1084
1148
- 2024-03-05 - Merged implementation PR [Support for the Job managedBy field (alpha)](https://github.com/kubernetes/kubernetes/pull/123273)
1085
1149
- 2024-03-07 - Merged [Update Job conformance test for job status updates](https://github.com/kubernetes/kubernetes/pull/123751)
1086
1150
- 2024-03-08 - Merged [Follow up fix to the job status update test](https://github.com/kubernetes/kubernetes/pull/123815)
@@ -1090,6 +1154,7 @@ N/A.
1090
1154
- 2024-06-21 - Merged [Update the count of ready pods when deleting pods](https://github.com/kubernetes/kubernetes/pull/125546)
1091
1155
- 2024-07-12 - Merged [Delay setting terminal Job conditions until all pods are terminal](https://github.com/kubernetes/kubernetes/pull/125510)
1092
1156
- 2024-07-30 - Merged [Update the docs for JobManagedBy and JobPodReplacementPolicy related to pod termination](https://github.com/kubernetes/website/pull/46808)
1157
+
- 2024-10-17 - Merged [Graduate JobManagedBy to Beta in 1.32](https://github.com/kubernetes/kubernetes/pull/127402)
1093
1158
1094
1159
<!--
1095
1160
Major milestones in the lifecycle of a KEP should be tracked in this section.
@@ -1369,7 +1434,7 @@ It makes them not that useful to debug situations when the Job didn't make
1369
1434
progress for long time. So, they would not give a reliable signal for debugging
1370
1435
based on playbooks.
1371
1436
1372
-
Renewing the even on every Job update seems excessive from the performance
1437
+
Renewing the event on every Job update seems excessive from the performance
0 commit comments