Capture controller pod logs and status as part of artifacts #1314

caxu-rh · 2025-09-15T16:46:16Z

Add new functions for getting the pods which match a deployment's selector label
Add new functions for getting logs from a pod's container(s)
- This introduces a new dependency, k8s.io/client-go, since controller-runtime basically only covers CRUD operations on resources, not logs
Add logic in DeployableByOlmCheck to use these functions and write as artifacts

coveralls · 2025-09-15T17:01:46Z

coverage: 83.62% (-0.2%) from 83.812%
when pulling d8482d9 on caxu-rh:fix-eet-4864
into c2622b1 on redhat-openshift-ecosystem:main.

dcibot · 2025-09-15T17:20:36Z

from change #1314:

SUCCESS https://www.distributed-ci.io/jobs/60b148c5-e083-4596-9657-1326c5492894/jobStates

acornett21

Just a couple of questions.

internal/openshift/openshift.go

internal/policy/operator/deployable_by_olm.go

dcibot · 2025-09-15T18:16:01Z

from change #1314:

SUCCESS https://www.distributed-ci.io/jobs/330dcc42-4102-4506-b067-27e01a308386/jobStates

github-actions · 2025-09-16T13:43:29Z

Code Review by Gemini

Overall, this is a well-structured and valuable set of changes. The introduction of k8s.io/client-go for log retrieval is a necessary and appropriate step, and the integration with the existing openshiftClient and DeployableByOlmCheck is handled cleanly.

Here's a detailed review:

`internal/openshift/openshift.go`

openshiftClient struct and NewClient function:
- Adherence to best practices / Idiomatic Go: Introducing K8sInterface kubernetes.Interface and passing it via NewClient is excellent dependency injection. It makes the openshiftClient more flexible and testable.
- Suggestion: None, this is well done.
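
The injection pattern described above can be sketched with the standard library alone. This is only an illustration: `LogSource`, `client`, `newClient`, and `fakeLogSource` are hypothetical names, standing in for the `kubernetes.Interface` that the PR passes to `NewClient`.

```go
package main

import (
	"errors"
	"fmt"
)

// LogSource is a hypothetical stand-in for the injected kubernetes.Interface.
type LogSource interface {
	Logs(pod, container string) (string, error)
}

// client receives its dependency through the constructor instead of
// building it internally, so tests can substitute a fake.
type client struct {
	logs LogSource
}

func newClient(ls LogSource) *client {
	return &client{logs: ls}
}

// ContainerLogs delegates to whatever LogSource was injected.
func (c *client) ContainerLogs(pod, container string) (string, error) {
	if c.logs == nil {
		return "", errors.New("no log source configured")
	}
	return c.logs.Logs(pod, container)
}

// fakeLogSource is the kind of implementation a unit test would inject.
type fakeLogSource struct{}

func (fakeLogSource) Logs(pod, container string) (string, error) {
	return fmt.Sprintf("fake logs for %s/%s", pod, container), nil
}

func main() {
	c := newClient(fakeLogSource{})
	out, err := c.ContainerLogs("controller-0", "manager")
	fmt.Println(out, err)
}
```

Because the client depends only on the interface, the same code path runs against a real cluster client or a fake without modification.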

GetDeploymentPods function:

Potential Issue (Security/Performance/Correctness): The current implementation has a subtle but critical flaw if deployment.Spec.Selector.MatchLabels is empty.
```
selectorLabels := deployment.Spec.Selector.MatchLabels
// ...
labelSelector := crclient.MatchingLabels{}
maps.Copy(labelSelector, selectorLabels) // If selectorLabels is empty, labelSelector will be empty.
// ...
err = oe.Client.List(ctx, &podList, labelSelector) // If labelSelector is empty, this lists ALL pods in the namespace.
```
When crclient.MatchingLabels{} is empty, controller-runtime's client.List operation will effectively list all pods in the target namespace (or even cluster-wide if the client is configured for it). This is almost certainly not the intended behavior for "Get pods of a Deployment". A deployment with an empty MatchLabels selector is unusual and typically invalid or not managing any pods.

Recommendation (Correctness/Performance):

If deployment.Spec.Selector.MatchLabels is empty, the function should either return an empty slice of pods or an error, as it's an ambiguous state for "pods of this deployment". Returning an empty slice is generally more graceful for a "get" operation.
Additionally, it's good practice to explicitly specify the namespace for List operations, even if the client might default to it.

```
// GetDeploymentPods can return an ErrNotFound
func (oe *openshiftClient) GetDeploymentPods(ctx context.Context, name string, namespace string) ([]corev1.Pod, error) {
	logger := logr.FromContextOrDiscard(ctx)

	deployment, err := oe.GetDeployment(ctx, name, namespace)
	if err != nil {
		return nil, err
	}

	selectorLabels := deployment.Spec.Selector.MatchLabels
	if len(selectorLabels) == 0 {
		// A deployment without selector labels is unusual and might not manage pods
		// in the way expected. Returning an empty list is safer than
		// potentially listing all pods in the namespace.
		logger.V(log.TRC).Info("deployment has no selector labels defined, returning empty pod list", "namespace", namespace, "name", name)
		return []corev1.Pod{}, nil
	}

	labelSelector := crclient.MatchingLabels{}
	maps.Copy(labelSelector, selectorLabels)

	podList := corev1.PodList{}
	// Explicitly filter by namespace and labels.
	err = oe.Client.List(ctx, &podList, labelSelector, crclient.InNamespace(namespace))
	if err != nil {
		return nil, fmt.Errorf("could not list pods matching label selector: %v", err)
	}

	return podList.Items, nil
}
```
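
Why an empty label set matches everything, and what the guard changes, can be modeled without any Kubernetes dependency. The sketch below is an assumption-laden simplification: `pod`, `matches`, and `podsFor` are hypothetical names, and the matching rule mimics equality-based label selection only.

```go
package main

import "fmt"

// pod is a hypothetical, simplified stand-in for corev1.Pod.
type pod struct {
	Name   string
	Labels map[string]string
}

// matches reports whether every key/value pair in selector is present in labels.
// With an empty selector the loop body never runs, so every pod matches.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// podsFor applies the guard: an empty selector yields no pods rather than all of them.
func podsFor(selector map[string]string, all []pod) []pod {
	if len(selector) == 0 {
		return []pod{}
	}
	var out []pod
	for _, p := range all {
		if matches(selector, p.Labels) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	all := []pod{
		{Name: "controller-0", Labels: map[string]string{"app": "controller"}},
		{Name: "unrelated-0", Labels: map[string]string{"app": "other"}},
	}
	fmt.Println(matches(map[string]string{}, all[1].Labels))              // true: empty selector matches anything
	fmt.Println(len(podsFor(map[string]string{}, all)))                   // 0: guard returns no pods
	fmt.Println(len(podsFor(map[string]string{"app": "controller"}, all))) // 1
}
```

One caveat worth noting: a Deployment selector may also use `MatchExpressions`, in which case `MatchLabels` can be empty on a valid Deployment, and a guard keyed only on `MatchLabels` would return no pods for it. Converting the whole selector (for example with apimachinery's `metav1.LabelSelectorAsSelector`) would cover both forms, though whether that is needed here is a judgment call for the authors.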

GetPod function:
- Adherence to best practices / Idiomatic Go: Standard controller-runtime client usage, good error handling.
- Suggestion: None.

getContainerLogs (private helper function):
- Adherence to best practices / Idiomatic Go: Correctly uses client-go for log streaming. defer logs.Close() ensures resource cleanup. Error handling is robust.
- Performance: Streaming directly to bytes.Buffer is efficient for in-memory collection.
- Security: Accessing logs requires appropriate RBAC permissions for the service account used by the tool. This is an inherent requirement for log collection and not a vulnerability introduced by the code itself. The code correctly uses the provided K8sInterface.
- Suggestion: None.

GetPodLogs function:
- Adherence to best practices / Idiomatic Go: Comprehensive, covering InitContainers, Containers, and EphemeralContainers. The strategy of logging errors and continuing for individual container log failures is appropriate for artifact collection.
- Performance: Collects all logs into bytes.Buffers in memory. For extremely verbose logs across many containers, this could consume significant memory. However, for typical operator controller logs, this should be acceptable. The DeployableByOlmCheck then writes these buffers to disk, which is a reasonable two-step process.
- Suggestion: None.
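
The two behaviors noted for getContainerLogs and GetPodLogs (stream into a buffer with a deferred close, and continue past per-container failures) can be sketched as follows. This is not the PR's code: `openFunc`, `readLogs`, `collectLogs`, and `fakeOpen` are hypothetical names, and the log stream is modeled as a plain `io.ReadCloser` rather than a client-go request.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
)

// openFunc is a hypothetical stand-in for opening a container's log stream.
type openFunc func(container string) (io.ReadCloser, error)

// readLogs copies one stream into a buffer and always closes the stream.
func readLogs(open openFunc, container string) (*bytes.Buffer, error) {
	stream, err := open(container)
	if err != nil {
		return nil, fmt.Errorf("could not open logs for %s: %w", container, err)
	}
	defer stream.Close()

	buf := &bytes.Buffer{}
	if _, err := io.Copy(buf, stream); err != nil {
		return nil, fmt.Errorf("could not read logs for %s: %w", container, err)
	}
	return buf, nil
}

// collectLogs gathers logs per container and records failures instead of aborting.
func collectLogs(open openFunc, containers []string) (map[string]*bytes.Buffer, []error) {
	logs := make(map[string]*bytes.Buffer, len(containers))
	var errs []error
	for _, name := range containers {
		buf, err := readLogs(open, name)
		if err != nil {
			errs = append(errs, err)
			continue
		}
		logs[name] = buf
	}
	return logs, errs
}

// fakeOpen simulates one failing container among healthy ones.
func fakeOpen(container string) (io.ReadCloser, error) {
	if container == "broken" {
		return nil, errors.New("stream unavailable")
	}
	return io.NopCloser(strings.NewReader("log line from " + container + "\n")), nil
}

func main() {
	logs, errs := collectLogs(fakeOpen, []string{"init", "broken", "manager"})
	fmt.Println(len(logs), len(errs)) // 2 1
}
```

Holding each container's output in its own buffer keeps the collection step independent of where the artifacts end up, at the cost of the memory use the review mentions.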

`internal/openshift/openshift_test.go`

Test Setup:
- Adherence to best practices: The switch to fakecr.NewClientBuilder and fakecg.NewClientset() for controller-runtime and client-go clients respectively is good for testing.
- Test Coverage: The tests for GetDeploymentPods, GetPod, and GetPodLogs are well-written and cover various scenarios (existing, non-existent, no matching pods).
- Observation: The fakecg.NewClientset() provides a default "fake logs" string for log streams, which is sufficient for testing the log retrieval mechanism's flow without needing complex mocking.
- Suggestion: None, the tests are good.

`internal/openshift/types.go`

Client interface:
- Adherence to best practices: Correctly extends the interface to expose the new functionality.
- Suggestion: None.

`internal/policy/operator/deployable_by_olm.go`

DeployableByOlmCheck struct and initClient, initOpenShiftEngine functions:
- Adherence to best practices / Idiomatic Go: The k8sClientset field and its initialization in initClient and subsequent passing to openshift.NewClient demonstrate good dependency management.
- Suggestion: None.

cleanUp function:
- Adherence to best practices / Idiomatic Go: This is where the new functionality is put to use. The logic for iterating deployments, then their pods, and then each container's status and logs, is well-structured.
- Error Handling: Logging warnings for failures during artifact collection (e.g., unable to retrieve deployment pods or logs) and continuing is a good strategy, ensuring that as many artifacts as possible are collected.
- Artifact Naming: The filenames [pod.Name]-PodStatus.json and podLogs-[pod.Name]-[container].json are clear and descriptive.
- Security: As mentioned before, logs can contain sensitive information. The act of collecting them is a design choice of the tool, not a vulnerability in the code itself. Ensure that the context in which these artifacts are stored and accessed aligns with security policies.
- Suggestion: None, this part is well implemented.
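
As a rough illustration of the naming scheme and the warn-and-continue strategy described for cleanUp, the sketch below writes artifacts to a temporary directory. The filename patterns are taken from the review text; `podStatusFilename`, `podLogsFilename`, and `writeArtifact` are hypothetical helpers, and the real check presumably uses the tool's own artifact writer rather than `os.WriteFile`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// podStatusFilename follows the "[pod.Name]-PodStatus.json" pattern from the review.
func podStatusFilename(podName string) string {
	return fmt.Sprintf("%s-PodStatus.json", podName)
}

// podLogsFilename follows the "podLogs-[pod.Name]-[container].json" pattern.
func podLogsFilename(podName, container string) string {
	return fmt.Sprintf("podLogs-%s-%s.json", podName, container)
}

// writeArtifact is a hypothetical helper that writes one artifact file.
func writeArtifact(dir, name string, data []byte) (string, error) {
	path := filepath.Join(dir, name)
	if err := os.WriteFile(path, data, 0o600); err != nil {
		return "", fmt.Errorf("could not write artifact %s: %w", name, err)
	}
	return path, nil
}

func main() {
	dir, err := os.MkdirTemp("", "artifacts")
	if err != nil {
		fmt.Println("warning:", err)
		return
	}
	defer os.RemoveAll(dir)

	for _, c := range []string{"manager", "kube-rbac-proxy"} {
		name := podLogsFilename("controller-0", c)
		if _, err := writeArtifact(dir, name, []byte("log data\n")); err != nil {
			// Warn and continue so one failure does not block the other artifacts.
			fmt.Println("warning:", err)
			continue
		}
		fmt.Println("wrote", name)
	}
	fmt.Println(podStatusFilename("controller-0"))
}
```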

Summary of Key Feedback:

The most important suggestion is to address the behavior of GetDeploymentPods when deployment.Spec.Selector.MatchLabels is empty to prevent it from listing all pods in the namespace.

Apart from that, the changes are solid, follow best practices, and are well-tested.

Signed-off-by: Caleb Xu <caxu@redhat.com>

acornett21 · 2025-09-16T18:25:22Z

The new files should be added to the .gitignore

dcibot · 2025-09-16T18:30:18Z

from change #1314:

SUCCESS https://www.distributed-ci.io/jobs/758548bb-eef9-43d8-8837-d45349f3fc8f/jobStates

internal/policy/operator/deployable_by_olm.go

acornett21 · 2025-09-16T18:46:49Z

/test 4.20-e2e

dcibot · 2025-09-16T20:17:54Z

from change #1314:

SUCCESS https://www.distributed-ci.io/jobs/c7d7bae7-35ea-4d91-a8cf-b4c94d17562c/jobStates

Signed-off-by: Caleb Xu <caxu@redhat.com>

dcibot · 2025-09-16T21:57:15Z

from change #1314:

SUCCESS https://www.distributed-ci.io/jobs/2faece92-e27c-40d3-af28-525c2e2707da/jobStates

openshift-ci · 2025-09-16T21:58:12Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: acornett21, caxu-rh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [acornett21]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot requested review from skattoju and tonytcampbell September 15, 2025 16:46

caxu-rh force-pushed the fix-eet-4864 branch 2 times, most recently from 7351d8d to 5863642 Compare September 15, 2025 16:54

caxu-rh force-pushed the fix-eet-4864 branch from 5863642 to 5ca0b34 Compare September 15, 2025 17:32

acornett21 reviewed Sep 15, 2025

internal/openshift/openshift.go (resolved)

internal/policy/operator/deployable_by_olm.go (outdated, resolved)

acornett21 added the gemini-review label Sep 16, 2025

openshift: add GetDeploymentPods method

d5d049f

Signed-off-by: Caleb Xu <caxu@redhat.com>

caxu-rh force-pushed the fix-eet-4864 branch from 5ca0b34 to 186aeed Compare September 16, 2025 18:01

github-actions bot removed the gemini-review label Sep 16, 2025

acornett21 reviewed Sep 16, 2025

internal/policy/operator/deployable_by_olm.go (outdated, resolved)

caxu-rh force-pushed the fix-eet-4864 branch 2 times, most recently from 0f88c46 to affb997 Compare September 16, 2025 19:50

caxu-rh added 2 commits September 16, 2025 17:29

openshift: add functions for getting pod logs

894cc97

Signed-off-by: Caleb Xu <caxu@redhat.com>

DeployableByOlmCheck: collect pod status and logs

d8482d9

Signed-off-by: Caleb Xu <caxu@redhat.com>

caxu-rh force-pushed the fix-eet-4864 branch from affb997 to d8482d9 Compare September 16, 2025 21:30

acornett21 approved these changes Sep 16, 2025

openshift-ci bot assigned acornett21 Sep 16, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 16, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 16, 2025
