Skip to content

CNTRLPLANE-2207: feat(install): self-manage webhook certs instead of relying on service-ca#8071

Closed
enxebre wants to merge 3 commits intoopenshift:mainfrom
enxebre:worktree-webhook-certs
Closed

CNTRLPLANE-2207: feat(install): self-manage webhook certs instead of relying on service-ca#8071
enxebre wants to merge 3 commits intoopenshift:mainfrom
enxebre:worktree-webhook-certs

Conversation

@enxebre
Copy link
Copy Markdown
Member

@enxebre enxebre commented Mar 25, 2026

The service-ca operator is not available on non-OpenShift clusters (e.g. AKS). Instead of relying on its annotations to generate serving certs and inject CA bundles, always self-manage them for the HO SVC/Webhook:

  • At install time: generate a self-signed CA and serving cert using support/certs, set caBundle directly on CRDs and webhook configs.
  • At runtime: a WebhookCertReconciler (following the SharedIngressReconciler pattern) auto-renews the serving cert when < 30 days of validity remain and patches caBundle on CRDs and webhook configurations.

The service-ca annotations (inject-cabundle on CRDs/webhook configs, serving-cert-secret-name on the Service) are removed as they are no longer needed.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

    • Adds a controller that generates and reconciles a self-signed CA and a CA-signed serving TLS certificate, and propagates CA bundles to webhook configurations and CRDs.
  • Improvements

    • Installer now conditionally generates and embeds explicit CA and serving certificates and injects CA bundles into webhook-related resources instead of relying on platform-specific annotations.
    • Main process conditionally wires the certificate controller when enabled.
  • Tests

    • New unit tests for cert generation, reconciliation, idempotency, and CA bundle patching.

@openshift-ci-robot
Copy link
Copy Markdown

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 25, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Mar 25, 2026

@enxebre: This pull request references CNTRLPLANE-1200 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Details

In response to this:

The service-ca operator is not available on non-OpenShift clusters (e.g. AKS). Instead of relying on its annotations to generate serving certs and inject CA bundles, always self-manage them for the HO SVC/Webhook:

  • At install time: generate a self-signed CA and serving cert using support/certs, set caBundle directly on CRDs and webhook configs.
  • At runtime: a WebhookCertReconciler (following the SharedIngressReconciler pattern) auto-renews the serving cert when < 30 days of validity remain and patches caBundle on CRDs and webhook configurations.

The service-ca annotations (inject-cabundle on CRDs/webhook configs, serving-cert-secret-name on the Service) are removed as they are no longer needed.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 25, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 66d1a088-3285-4155-a690-3f3fd7756e09

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The installer now generates initial webhook CA and serving TLS Secrets when any of EnableDefaultingWebhook, EnableConversionWebhook, EnableValidatingWebhook, or EnableAuditLogPersistence is enabled by calling GenerateInitialWebhookCerts and appending the produced Secrets and CA bundle to install manifests; CRD conversion webhook clientConfig.CABundle is set from that CA bundle. OpenShift-specific annotations for serving-cert/CA bundle injection were removed from install assets. A new WebhookCertReconciler controller was added and wired at runtime (when CertDir != ""), which ensures the self-signed CA and serving cert Secrets exist and patches hypershift.openshift.io CRDs and the Mutating/ValidatingWebhookConfiguration clientConfig.CABundle fields.

Sequence Diagram

sequenceDiagram
    participant Installer as Installation
    participant KubeAPI as Kubernetes API
    participant CRDAPI as CRD API
    participant Controller as WebhookCertReconciler
    participant WebhookAPI as Webhook Config API

    Installer->>Controller: GenerateInitialWebhookCerts(namespace, service)
    Controller->>KubeAPI: Create/Update CA Secret & Serving TLS Secret
    KubeAPI-->>Controller: Secrets persisted, return CA bundle
    Controller-->>Installer: Return CA bundle bytes

    Installer->>CRDAPI: setupCRDs(with webhookCABundle)
    CRDAPI-->>Installer: CRDs created/updated

    Note over Controller,KubeAPI: Runtime controller active when CertDir != ""

    KubeAPI->>Controller: Secret change event (webhook-serving-ca / manager-serving-cert)
    Controller->>KubeAPI: Reconcile CA and serving cert Secrets
    KubeAPI-->>Controller: Secrets reconciled

    Controller->>CRDAPI: List hypershift.openshift.io CRDs
    CRDAPI-->>Controller: CRD list
    Controller->>CRDAPI: Patch spec.conversion.webhook.clientConfig.caBundle
    CRDAPI-->>Controller: Patched

    Controller->>WebhookAPI: Get Mutating/ValidatingWebhookConfiguration hypershift.openshift.io
    WebhookAPI-->>Controller: Webhook configs
    Controller->>WebhookAPI: Patch webhook.clientConfig.caBundle
    WebhookAPI-->>Controller: Patched
Loading
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from bryan-cox and devguyio March 25, 2026 17:30
@openshift-ci openshift-ci Bot added area/cli Indicates the PR includes changes for CLI area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release labels Mar 25, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed do-not-merge/needs-area labels Mar 25, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Mar 25, 2026

@enxebre: This pull request references CNTRLPLANE-1200 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Details

In response to this:

The service-ca operator is not available on non-OpenShift clusters (e.g. AKS). Instead of relying on its annotations to generate serving certs and inject CA bundles, always self-manage them for the HO SVC/Webhook:

  • At install time: generate a self-signed CA and serving cert using support/certs, set caBundle directly on CRDs and webhook configs.
  • At runtime: a WebhookCertReconciler (following the SharedIngressReconciler pattern) auto-renews the serving cert when < 30 days of validity remain and patches caBundle on CRDs and webhook configurations.

The service-ca annotations (inject-cabundle on CRDs/webhook configs, serving-cert-secret-name on the Service) are removed as they are no longer needed.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

Release Notes

  • New Features

  • Added self-managed webhook certificate generation for non-OpenShift cluster deployments.

  • Improvements

  • Webhook configurations now use explicit CA bundle injection instead of relying on platform-specific service annotations.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@enxebre enxebre force-pushed the worktree-webhook-certs branch from 82b075e to 933c722 Compare March 25, 2026 17:38
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/install/install.go (1)

856-879: ⚠️ Potential issue | 🟠 Major

Bug: annotation check is inverted, will clear existing annotations.

Line 861-862 has inverted logic: if crd.Annotations != nil { crd.Annotations = map[string]string{} } clears annotations when they exist, instead of initializing when nil. This would remove any existing annotations on the CRD.

🐛 Proposed fix
 				if crd.Spec.Group == "hypershift.openshift.io" {
 					if !opts.EnableConversionWebhook {
 						return
 					}
-					if crd.Annotations != nil {
+					if crd.Annotations == nil {
 						crd.Annotations = map[string]string{}
 					}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/install/install.go` around lines 856 - 879, The current check in the CRD
conversion setup clears existing annotations because it does "if crd.Annotations
!= nil { crd.Annotations = map[string]string{} }"; change this to initialize
annotations only when missing by using "if crd.Annotations == nil {
crd.Annotations = map[string]string{} }" within the block that handles
crd.Spec.Group == "hypershift.openshift.io" (the anonymous func handling
*apiextensionsv1.CustomResourceDefinition), so existing annotations are
preserved and only a nil map is replaced.
🧹 Nitpick comments (1)
hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go (1)

226-266: Consider setting caSecret.Type explicitly for consistency.

In Reconcile (line 94), caSecret.Type is set to corev1.SecretTypeOpaque, but in GenerateWebhookCerts it's not set. While Opaque is the default, setting it explicitly would improve consistency between install-time and runtime behavior.

♻️ Optional fix
 	caSecret := &corev1.Secret{
 		ObjectMeta: metav1.ObjectMeta{
 			Name:      CASecretName,
 			Namespace: namespace,
 		},
+		Type: corev1.SecretTypeOpaque,
 		Data: map[string][]byte{},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`
around lines 226 - 266, The caSecret created in GenerateWebhookCerts is missing
an explicit Type assignment (caSecret.Type) while Reconcile sets caSecret.Type =
corev1.SecretTypeOpaque; to make install-time generation consistent with runtime
behavior, set caSecret.Type = corev1.SecretTypeOpaque on the caSecret
constructed in GenerateWebhookCerts (the secret named by CASecretName) so the
secret type matches what Reconcile expects.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`:
- Around line 58-73: The goroutine in SetupWithManager uses an unbuffered
channel initialSync which can block if the controller isn't yet watching; change
initialSync to a buffered channel with capacity 1 (make(chan event.GenericEvent,
1)) and send the single GenericEvent for CASecretName into it, then close the
channel after sending (or rely on buffer) to ensure the goroutine never blocks;
update references in SetupWithManager and keep the existing
source.Channel(initialSync, ...) usage.

---

Outside diff comments:
In `@cmd/install/install.go`:
- Around line 856-879: The current check in the CRD conversion setup clears
existing annotations because it does "if crd.Annotations != nil {
crd.Annotations = map[string]string{} }"; change this to initialize annotations
only when missing by using "if crd.Annotations == nil { crd.Annotations =
map[string]string{} }" within the block that handles crd.Spec.Group ==
"hypershift.openshift.io" (the anonymous func handling
*apiextensionsv1.CustomResourceDefinition), so existing annotations are
preserved and only a nil map is replaced.

---

Nitpick comments:
In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`:
- Around line 226-266: The caSecret created in GenerateWebhookCerts is missing
an explicit Type assignment (caSecret.Type) while Reconcile sets caSecret.Type =
corev1.SecretTypeOpaque; to make install-time generation consistent with runtime
behavior, set caSecret.Type = corev1.SecretTypeOpaque on the caSecret
constructed in GenerateWebhookCerts (the secret named by CASecretName) so the
secret type matches what Reconcile expects.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 1adb3310-410a-4485-9b8b-29070f674264

📥 Commits

Reviewing files that changed from the base of the PR and between 77fb5d8 and 82b075e.

📒 Files selected for processing (5)
  • cmd/install/assets/hypershift_operator.go
  • cmd/install/install.go
  • cmd/install/install_test.go
  • hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go
  • hypershift-operator/main.go
💤 Files with no reviewable changes (1)
  • cmd/install/assets/hypershift_operator.go

@enxebre enxebre force-pushed the worktree-webhook-certs branch from 933c722 to 0f40585 Compare March 25, 2026 17:41
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Mar 25, 2026

@enxebre: This pull request references CNTRLPLANE-1200 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Details

In response to this:

The service-ca operator is not available on non-OpenShift clusters (e.g. AKS). Instead of relying on its annotations to generate serving certs and inject CA bundles, always self-manage them for the HO SVC/Webhook:

  • At install time: generate a self-signed CA and serving cert using support/certs, set caBundle directly on CRDs and webhook configs.
  • At runtime: a WebhookCertReconciler (following the SharedIngressReconciler pattern) auto-renews the serving cert when < 30 days of validity remain and patches caBundle on CRDs and webhook configurations.

The service-ca annotations (inject-cabundle on CRDs/webhook configs, serving-cert-secret-name on the Service) are removed as they are no longer needed.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

  • Self-managed webhook certificate and CA generation for non-OpenShift deployments, plus a controller to keep webhook TLS material and CRD/webhook CA bundles in sync.

  • Improvements

  • Removed reliance on platform-specific service annotations for CABundle injection; CRDs and webhook configs now receive explicit CA bundles during install and reconciliation.

  • Tests

  • Added unit/integration tests covering certificate generation, reconciliation, and CA bundle updates.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@enxebre enxebre force-pushed the worktree-webhook-certs branch from 0f40585 to 768b924 Compare March 25, 2026 18:02
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Mar 25, 2026

@enxebre: This pull request references CNTRLPLANE-1200 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Details

In response to this:

The service-ca operator is not available on non-OpenShift clusters (e.g. AKS). Instead of relying on its annotations to generate serving certs and inject CA bundles, always self-manage them for the HO SVC/Webhook:

  • At install time: generate a self-signed CA and serving cert using support/certs, set caBundle directly on CRDs and webhook configs.
  • At runtime: a WebhookCertReconciler (following the SharedIngressReconciler pattern) auto-renews the serving cert when < 30 days of validity remain and patches caBundle on CRDs and webhook configurations.

The service-ca annotations (inject-cabundle on CRDs/webhook configs, serving-cert-secret-name on the Service) are removed as they are no longer needed.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

  • Adds a controller to generate and reconcile self-signed CA and serving TLS certificates and keep webhook CA bundles in sync.

  • Improvements

  • Installer now injects explicit CA bundles into webhook-related resources instead of relying on platform-specific service annotations.

  • Main process wires up the certificate controller when enabled.

  • Tests

  • New unit tests for cert generation, reconciliation, idempotency, and CA bundle patching.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/install/install.go (1)

861-862: ⚠️ Potential issue | 🔴 Critical

Logic error: Annotations cleared instead of initialized.

The condition if crd.Annotations != nil followed by crd.Annotations = map[string]string{} clears all existing annotations when they exist. This will wipe annotations like release.openshift.io/feature-set that were checked earlier in the filtering logic (lines 819-831).

If the intent was to ensure an empty annotation map exists, the condition should be inverted:

🐛 Proposed fix
-				if crd.Annotations != nil {
+				if crd.Annotations == nil {
 					crd.Annotations = map[string]string{}
 				}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/install/install.go` around lines 861 - 862, The code is clearing existing
CRD annotations by doing "if crd.Annotations != nil { crd.Annotations =
map[string]string{} }"; change this to only initialize annotations when nil so
existing keys (e.g. release.openshift.io/feature-set checked earlier) are
preserved: replace the condition to check for nil and assign a new map only when
crd.Annotations == nil, leaving crd.Annotations intact otherwise.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@cmd/install/install.go`:
- Around line 861-862: The code is clearing existing CRD annotations by doing
"if crd.Annotations != nil { crd.Annotations = map[string]string{} }"; change
this to only initialize annotations when nil so existing keys (e.g.
release.openshift.io/feature-set checked earlier) are preserved: replace the
condition to check for nil and assign a new map only when crd.Annotations ==
nil, leaving crd.Annotations intact otherwise.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: a021b9ac-57d2-4b01-9652-82d97adb5c79

📥 Commits

Reviewing files that changed from the base of the PR and between 0f40585 and 768b924.

📒 Files selected for processing (6)
  • cmd/install/assets/hypershift_operator.go
  • cmd/install/install.go
  • cmd/install/install_test.go
  • hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go
  • hypershift-operator/controllers/webhookcerts/webhookcerts_controller_test.go
  • hypershift-operator/main.go
💤 Files with no reviewable changes (1)
  • cmd/install/assets/hypershift_operator.go
✅ Files skipped from review due to trivial changes (1)
  • cmd/install/install_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • hypershift-operator/main.go

@bryan-cox
Copy link
Copy Markdown
Member

/test e2e-azure-self-managed


// GenerateWebhookCerts generates the CA and serving cert secrets for use at install time.
// It also returns the CA bundle bytes for injection into CRDs and webhook configs.
func GenerateWebhookCerts(namespace, serviceName string) (*corev1.Secret, *corev1.Secret, []byte, error) {
Copy link
Copy Markdown
Contributor

@muraee muraee Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just to make it more clear that this is only called once at start

Suggested change
func GenerateWebhookCerts(namespace, serviceName string) (*corev1.Secret, *corev1.Secret, []byte, error) {
func GenerateInitialWebhookCerts(namespace, serviceName string) (*corev1.Secret, *corev1.Secret, []byte, error) {

@clebs
Copy link
Copy Markdown
Member

clebs commented Mar 26, 2026

@enxebre CNTRLPLANE-1200 is for the v1beta1 -> v1beta2 migration.

The issue tracking the CAPI bump itself is CNTRLPLANE-2207.

We should link this issue there or create a subtask for this as blocker to CNTRLPLANE-2207.

Otherwise LGTM!

@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Mar 26, 2026

@enxebre: This pull request references CNTRLPLANE-1200 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Details

In response to this:

The service-ca operator is not available on non-OpenShift clusters (e.g. AKS). Instead of relying on its annotations to generate serving certs and inject CA bundles, always self-manage them for the HO SVC/Webhook:

  • At install time: generate a self-signed CA and serving cert using support/certs, set caBundle directly on CRDs and webhook configs.
  • At runtime: a WebhookCertReconciler (following the SharedIngressReconciler pattern) auto-renews the serving cert when < 30 days of validity remain and patches caBundle on CRDs and webhook configurations.

The service-ca annotations (inject-cabundle on CRDs/webhook configs, serving-cert-secret-name on the Service) are removed as they are no longer needed.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

  • Adds a controller that generates/reconciles a self-signed CA and serving TLS certificate and propagates CA bundles to webhook configurations and CRDs.

  • Improvements

  • Installer now embeds explicit CA bundles into webhook-related resources instead of using platform-specific service annotations.

  • Main process conditionally wires up the certificate controller when enabled.

  • Tests

  • New unit tests for cert generation, reconciliation, idempotency, and CA bundle patching.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`:
- Around line 222-259: GenerateInitialWebhookCerts currently always builds empty
Secrets which forces new CA and leaf certs on every run; change it to be
idempotent by reusing existing Secrets when present instead of always recreating
them. Specifically, in GenerateInitialWebhookCerts check for existing Secrets
named CASecretName and ServingCertSecretName in the target namespace (or accept
them as optional parameters), and if found validate and reuse their Data (return
existing caSecret.Data[certs.CASignerCertMapKey] as caBundle); only call
certs.ReconcileSelfSignedCA to create a CA when the CA Secret is missing or
invalid, and only call certs.ReconcileSignedCert when the serving Secret is
missing/expired or the CA has changed — keep using the same symbols
(CASecretName, ServingCertSecretName, certs.ReconcileSelfSignedCA,
certs.ReconcileSignedCert) so the install flow in cmd/install/install.go will no
longer overwrite a healthy trust chain on re-run.
- Around line 61-66: The controller setup currently only watches Secret events
(in the ctrl.NewControllerManagedBy(mgr).For(&corev1.Secret{},
builder.WithPredicates(...)) chain) so if both CASecretName and
ServingCertSecretName are missing at startup Reconcile never runs; modify the
controller builder to ensure an initial bootstrap reconcile — either add
builder.WithInitialReconcile() to the For(...) call or add a stable watch (e.g.,
Watch(&corev1.Namespace{}) or a singleton ConfigMap) so the controller will be
enqueued at manager start and the existing Reconcile method can create missing
secrets.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 7f4a21bb-c5c6-471a-a3e1-c419e699179d

📥 Commits

Reviewing files that changed from the base of the PR and between 768b924 and 34c6f48.

📒 Files selected for processing (2)
  • cmd/install/install.go
  • hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/install/install.go

Comment on lines +61 to +66
return ctrl.NewControllerManagedBy(mgr).
For(&corev1.Secret{}, builder.WithPredicates(predicate.NewPredicateFuncs(func(o client.Object) bool {
return o.GetNamespace() == r.Namespace &&
(o.GetName() == CASecretName || o.GetName() == ServingCertSecretName)
}))).
Complete(r)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, examine the full SetupWithManager method
cat -n hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go | head -100

Repository: openshift/hypershift

Length of output: 4126


🏁 Script executed:

#!/bin/bash
# Get the full file to understand context
wc -l hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 139


🏁 Script executed:

#!/bin/bash
# Look at the entire SetupWithManager and Reconcile methods
sed -n '50,80p' hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 1079


🏁 Script executed:

#!/bin/bash
# Check the Reconcile method
sed -n '40,150p' hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 4054


🏁 Script executed:

#!/antml:max_thinking>

I'll start by looking at the file structure to understand what we're dealing with and then examine the specific methods mentioned.
</thinking>

shell
# Check the size and overall structure
cd hypershift-operator/controllers/webhookcerts && wc -l webhookcerts_controller.go && head -30 webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 273


🏁 Script executed:

#!/bin/bash
# Check the full Reconcile method and see how createOrUpdate works
sed -n '69,120p' hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 1650


🏁 Script executed:

#!/bin/bash
# Check if there's any initial reconcile or bootstrap logic
grep -n "Reconcile\|bootstrap\|initial\|enqueue" hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 998


🏁 Script executed:

#!/bin/bash
# Look at GenerateInitialWebhookCerts which was mentioned in the original review
grep -n "GenerateInitialWebhookCerts" hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 283


🏁 Script executed:

#!/bin/bash
# Find where this controller is instantiated/used
fd -t f "\.go$" -x grep -l "WebhookCertReconciler" {} \; | head -10

Repository: openshift/hypershift

Length of output: 246


🏁 Script executed:

#!/bin/bash
# Check main.go to see how the controller is initialized
grep -B5 -A10 "WebhookCertReconciler" hypershift-operator/main.go

Repository: openshift/hypershift

Length of output: 620


🏁 Script executed:

#!/bin/bash
# Check GenerateInitialWebhookCerts full implementation
sed -n '220,260p' hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 1392


🏁 Script executed:

#!/bin/bash
# Check the install.go mentioned in original review to understand the flow
grep -n "GenerateInitialWebhookCerts\|WebhookCertReconciler" cmd/install/install.go | head -20

Repository: openshift/hypershift

Length of output: 198


🏁 Script executed:

#!/bin/bash
# Check where GenerateInitialWebhookCerts secrets are applied in install.go
sed -n '730,750p' cmd/install/install.go

Repository: openshift/hypershift

Length of output: 895


🏁 Script executed:

#!/bin/bash
# Look at upsert.CreateOrUpdate to understand its behavior
fd -t f "upsert" hypershift-operator/ cmd/install/ -x grep -l "CreateOrUpdateFN\|CreateOrUpdate" {} \; | head -5

Repository: openshift/hypershift

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Check the implementation of createOrUpdate (upsert package)
find . -path "*/support/upsert*" -name "*.go" | head -5

Repository: openshift/hypershift

Length of output: 212


🏁 Script executed:

#!/bin/bash
# Check the upsert implementation to understand CreateOrUpdate behavior
sed -n '1,100p' support/upsert/upsert.go

Repository: openshift/hypershift

Length of output: 3674


🏁 Script executed:

#!/bin/bash
# Look at the CreateOrUpdateFN type definition
grep -n "type CreateOrUpdateFN\|CreateOrUpdate func" support/upsert/upsert.go

Repository: openshift/hypershift

Length of output: 223


🏁 Script executed:

#!/antml:thinking>
Need to understand if createOrUpdate uses server-side apply (which would auto-create on first reconcile) or regular client operations. Also need to verify the bootstrap scenario.
</thinking>
shell
# Check if there's any initial enqueue or bootstrap mechanism in controller-runtime
grep -n "Owns\|Watches\|Initial\|Bootstrap" hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 329


🏁 Script executed:

#!/bin/bash
# Check for any Owns/Watches calls that might enqueue initial reconcile
grep -n "\.Owns\|\.Watches" hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Check if there's any manual initial enqueue or bootstrap logic anywhere in SetupWithManager or nearby
sed -n '57,75p' hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 787


🏁 Script executed:

#!/antml:thinking>
The user's shell results need to be checked. Let me verify the actual behavior of the bootstrap scenario by checking whether secrets might not exist at startup.
</thinking>
shell
# Check the test file to understand bootstrap scenarios
grep -n "TestSetupWithManager\|TestReconcile" hypershift-operator/controllers/webhookcerts/webhookcerts_controller_test.go

Repository: openshift/hypershift

Length of output: 132


🏁 Script executed:

#!/bin/bash
# Check if there's any logic that handles missing secrets during startup
sed -n '69,135p' hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go | grep -A5 "Reconcile\|missing\|absent"

Repository: openshift/hypershift

Length of output: 1064


🏁 Script executed:

#!/an/bin/bash
# Check the test to understand if this bootstrap issue is tested or acknowledged
head -100 hypershift-operator/controllers/webhookcerts/webhookcerts_controller_test.go

Repository: openshift/hypershift

Length of output: 3636


🏁 Script executed:

#!/bin/bash
# Verify the flow: check if install-time secrets are guaranteed to be pre-created
# Look at how the manager is started and if any initialization hooks exist
grep -n "SetupWithManager\|RunWithOptions\|Start" hypershift-operator/main.go | head -20

Repository: openshift/hypershift

Length of output: 1382


🏁 Script executed:

#!/bin/bash
# Check if there's any manual trigger or enqueue mechanism in the main.go around SetupWithManager
sed -n '525,535p' hypershift-operator/main.go

Repository: openshift/hypershift

Length of output: 452


🏁 Script executed:

#!/bin/bash
# Check if there's any other source watching Namespaces or other stable resources
grep -B5 -A15 "webhookCertReconciler.SetupWithManager" hypershift-operator/main.go

Repository: openshift/hypershift

Length of output: 1118


🏁 Script executed:

#!/bin/bash
# Check if the secrets are guaranteed to exist by verifying the install flow more carefully
# Look for where the generated secrets are actually applied
sed -n '730,760p' cmd/install/install.go

Repository: openshift/hypershift

Length of output: 1320


🏁 Script executed:

#!/bin/bash
# Verify the order: are secrets applied before the manager starts?
grep -n "GenerateInitialWebhookCerts\|mgr.Start\|manager.Start" cmd/install/install.go hypershift-operator/main.go | head -15

Repository: openshift/hypershift

Length of output: 276


🏁 Script executed:

#!/antml:thinking>
I need to understand the order of operations:
1. Are the secrets generated and applied (created in the cluster) before the manager starts?
2. Or does the manager start first, and then the secrets are applied?

If the manager starts before secrets are applied, then the bootstrap issue is real. If secrets are applied first, then the controller would just be missing an optimization/resilience pattern.
</thinking>
shell
# Check what objects are actually returned from install and how they're used
grep -n "return nil, nil\|return.*objects" cmd/install/install.go | head -10

Repository: openshift/hypershift

Length of output: 283


🏁 Script executed:

#!/bin/bash
# Check the broader context around install and deployment
# to understand if secrets are guaranteed before manager starts
grep -n "Install\|NewManager\|Start" cmd/install/install.go | head -20

Repository: openshift/hypershift

Length of output: 1521


🏁 Script executed:

#!/bin/bash
# Check if controller-runtime has any bootstrap/initial reconcile mechanisms
# that might be automatically triggered
grep -n "Initial\|Bootstrap\|Enqueue" hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go hypershift-operator/main.go

Repository: openshift/hypershift

Length of output: 427


🏁 Script executed:

#!/antml:thinking>
Let me think about what I've observed:

1. SetupWithManager only registers a `.For(&corev1.Secret{})` watch with a predicate
2. No `.Owns()`, no `.Watches()`, no manual initial enqueue
3. The Reconcile method CAN create the secrets if they're missing (via createOrUpdate)
4. But Reconcile will only run if a Secret event is triggered
5. The secrets are generated in install.go and appended to objects
6. The manager starts after install objects are supposed to be applied

The critical issue is:
- At manager startup, if both watched Secrets are already absent (or never created), there's NO initial request queued
- Without an event on either Secret, Reconcile never runs
- The controller cannot recover from an empty state

This is a REAL issue because:
1. Secrets could be deleted after initial install
2. Namespace could be recreated
3. Any scenario where secrets are missing at startup

The review is CORRECT. The controller needs an initial bootstrap enqueue or a watch on a stable resource (like the Namespace).

Let me confirm the fix would be simple and this is indeed the intended issue.
</thinking>

shell
# Check if there's a pattern in other controllers for initial enqueue
fd -t f "SetupWithManager" hypershift-operator/controllers/ -x grep -l "InitialReconcile\|EnqueueRequest" {} \;

Repository: openshift/hypershift

Length of output: 237


🏁 Script executed:

#!/bin/bash
# Look for any controller that does watch a Namespace or similar stable resource
grep -n "\.Watches" hypershift-operator/controllers/*/\*.go | head -5

Repository: openshift/hypershift

Length of output: 135


🏁 Script executed:

#!/antml:thinking>
Based on all the evidence I've gathered, the original review is correct. The issue is:

1. SetupWithManager only registers `.For(&corev1.Secret{})` with a specific predicate
2. There's NO initial enqueue or watch on a stable resource
3. The Reconcile method CAN handle missing secrets (creates them via createOrUpdate)
4. But Reconcile will NEVER run if both Secrets don't exist at startup

The fix would be to add one of:
- A manual initial reconcile request in SetupWithManager
- A watch on the Namespace (stable singleton)
- A watch on another stable resource that triggers reconcile

This is a valid major reliability issue. Let me finalize my verification.
</thinking>

shell
# Final verification: check if controller-runtime's manager provides any automatic initial enqueue
grep -n "Reconcile\|enqueue" hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go | head -15

Repository: openshift/hypershift

Length of output: 279


🏁 Script executed:

#!/bin/bash
# Final check: verify that controller-runtime doesn't auto-enqueue on SetupWithManager completion
# by checking the builder pattern usage
sed -n '61,66p' hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go

Repository: openshift/hypershift

Length of output: 350


🏁 Script executed:

#!/antml:thinking>
I have enough evidence now. Let me summarize what I've verified:

1. SetupWithManager only uses `.For(&corev1.Secret{})` with a predicate
2. No `.Owns()`, `.Watches()`, or manual enqueue
3. Reconcile CAN create secrets if missing (uses createOrUpdate which will Create if not found)
4. But Reconcile only runs on Secret events
5. If both secrets are absent at startup, no event is triggered
6. Therefore, no reconciliation occurs and secrets can't be created
7. This is a bootstrap/startup reliability issue

The original review is CORRECT. This is a major issue that needs to be addressed.
</thinking>
shell
# Verify the complete absence of initial enqueue or bootstrap pattern
grep -n "EnqueueReconciler\|EnqueueRequest\|Initial" hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go hypershift-operator/main.go

Repository: openshift/hypershift

Length of output: 295


Controller lacks bootstrap reconciliation for webhook cert Secrets.

This controller only enqueues on Secret events. If both watched Secrets are absent when the manager starts, no reconciliation is triggered and the certs cannot be created. The Reconcile method is capable of creating missing secrets from scratch, but has no way to be initially invoked. Add an explicit initial enqueue (e.g., via builder.WithInitialReconcile) or watch a stable singleton resource (like the Namespace) to ensure startup can self-heal missing cert material.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`
around lines 61 - 66, The controller setup currently only watches Secret events
(in the ctrl.NewControllerManagedBy(mgr).For(&corev1.Secret{},
builder.WithPredicates(...)) chain) so if both CASecretName and
ServingCertSecretName are missing at startup Reconcile never runs; modify the
controller builder to ensure an initial bootstrap reconcile — either add
builder.WithInitialReconcile() to the For(...) call or add a stable watch (e.g.,
Watch(&corev1.Namespace{}) or a singleton ConfigMap) so the controller will be
enqueued at manager start and the existing Reconcile method can create missing
secrets.

Comment on lines +222 to +259
func GenerateInitialWebhookCerts(namespace, serviceName string) (*corev1.Secret, *corev1.Secret, []byte, error) {
caSecret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: CASecretName,
Namespace: namespace,
},
Data: map[string][]byte{},
}
if err := certs.ReconcileSelfSignedCA(caSecret, "hypershift-webhook-ca", "openshift"); err != nil {
return nil, nil, nil, fmt.Errorf("failed to generate webhook CA: %w", err)
}

dnsNames := webhookDNSNames(serviceName, namespace)
servingSecret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: ServingCertSecretName,
Namespace: namespace,
},
Type: corev1.SecretTypeTLS,
Data: map[string][]byte{},
}
if err := certs.ReconcileSignedCert(
servingSecret,
caSecret,
"hypershift-operator",
[]string{"openshift"},
[]x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
corev1.TLSCertKey,
corev1.TLSPrivateKeyKey,
"",
dnsNames,
nil,
); err != nil {
return nil, nil, nil, fmt.Errorf("failed to generate webhook serving cert: %w", err)
}

caBundle := caSecret.Data[certs.CASignerCertMapKey]
return caSecret, servingSecret, caBundle, nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

GenerateInitialWebhookCerts makes the install path non-idempotent.

This helper always starts from empty Secrets, so every call generates a new CA and leaf cert. cmd/install/install.go Lines 736-743 then force-apply those returned Secrets, which means re-running the install path overwrites a healthy trust chain even when nothing is expiring. The install flow should reuse existing Secrets or only create them on first bootstrap.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`
around lines 222 - 259, GenerateInitialWebhookCerts currently always builds
empty Secrets which forces new CA and leaf certs on every run; change it to be
idempotent by reusing existing Secrets when present instead of always recreating
them. Specifically, in GenerateInitialWebhookCerts check for existing Secrets
named CASecretName and ServingCertSecretName in the target namespace (or accept
them as optional parameters), and if found validate and reuse their Data (return
existing caSecret.Data[certs.CASignerCertMapKey] as caBundle); only call
certs.ReconcileSelfSignedCA to create a CA when the CA Secret is missing or
invalid, and only call certs.ReconcileSignedCert when the serving Secret is
missing/expired or the CA has changed — keep using the same symbols
(CASecretName, ServingCertSecretName, certs.ReconcileSelfSignedCA,
certs.ReconcileSignedCert) so the install flow in cmd/install/install.go will no
longer overwrite a healthy trust chain on re-run.

@enxebre enxebre force-pushed the worktree-webhook-certs branch from 34c6f48 to 33ee3cc Compare March 26, 2026 13:42
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Mar 26, 2026

@enxebre: This pull request references CNTRLPLANE-1200 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Details

In response to this:

The service-ca operator is not available on non-OpenShift clusters (e.g. AKS). Instead of relying on its annotations to generate serving certs and inject CA bundles, always self-manage them for the HO SVC/Webhook:

  • At install time: generate a self-signed CA and serving cert using support/certs, set caBundle directly on CRDs and webhook configs.
  • At runtime: a WebhookCertReconciler (following the SharedIngressReconciler pattern) auto-renews the serving cert when < 30 days of validity remain and patches caBundle on CRDs and webhook configurations.

The service-ca annotations (inject-cabundle on CRDs/webhook configs, serving-cert-secret-name on the Service) are removed as they are no longer needed.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

  • Adds a controller that generates and reconciles a self-signed CA and a CA-signed serving TLS certificate, and propagates CA bundles to webhook configurations and CRDs.

  • Improvements

  • Installer now conditionally generates and embeds explicit CA and serving certificates and injects CA bundles into webhook-related resources instead of relying on platform-specific annotations.

  • Main process conditionally wires the certificate controller when enabled.

  • Tests

  • New unit tests for cert generation, reconciliation, idempotency, and CA bundle patching.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go (2)

222-259: ⚠️ Potential issue | 🟠 Major

Make initial cert generation idempotent for repeated install executions.

This helper always mints a new CA and serving cert from empty Secret objects. Re-running install rotates trust material even when existing certs are still valid, which can cause avoidable webhook trust churn. Reuse existing valid Secrets and only regenerate when missing/invalid.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`
around lines 222 - 259, GenerateInitialWebhookCerts currently always creates new
empty Secrets (CASecretName and ServingCertSecretName) and calls
certs.ReconcileSelfSignedCA / certs.ReconcileSignedCert which forces rotation;
change it to first attempt to load existing Secrets and validate them (e.g.,
presence and validity of cert PEMs, CA signer cert key in
caSecret.Data[certs.CASignerCertMapKey], and that serving cert is signed by the
CA and not expired) and only call certs.ReconcileSelfSignedCA or
certs.ReconcileSignedCert when the Secret is missing or the validation fails;
keep the same return values (caSecret, servingSecret, caBundle) and reuse
existing Secret objects when valid to make GenerateInitialWebhookCerts
idempotent across repeated installs.

61-66: ⚠️ Potential issue | 🟠 Major

Ensure a bootstrap reconcile path exists when Secrets are absent.

This setup only enqueues from Secret events. If both watched Secrets are missing at manager start, no event is emitted and reconciliation never runs, so cert material cannot self-heal. Add an explicit initial enqueue (or a stable secondary watch) to guarantee first reconcile.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`
around lines 61 - 66, The controller currently only watches Secret events
(NewControllerManagedBy(...).For(&corev1.Secret{}...)) so if both CASecretName
and ServingCertSecretName are absent at startup no reconcile is triggered;
update the controller builder to add a stable secondary watch or an explicit
initial enqueue to guarantee a first reconcile—for example, add a Watch for the
operator Namespace (or a stable ConfigMap) using builder.Watches(...) with a
predicate that filters on r.Namespace (or a Mapper that always enqueues the
reconcile request for this reconciler), or add logic during Setup to enqueue the
reconcile request once at manager start; change the controller setup around
NewControllerManagedBy, For, and Complete(r) to include that extra Watch/initial
enqueue so reconciliation runs even when the secrets are missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go`:
- Around line 222-259: GenerateInitialWebhookCerts currently always creates new
empty Secrets (CASecretName and ServingCertSecretName) and calls
certs.ReconcileSelfSignedCA / certs.ReconcileSignedCert which forces rotation;
change it to first attempt to load existing Secrets and validate them (e.g.,
presence and validity of cert PEMs, CA signer cert key in
caSecret.Data[certs.CASignerCertMapKey], and that serving cert is signed by the
CA and not expired) and only call certs.ReconcileSelfSignedCA or
certs.ReconcileSignedCert when the Secret is missing or the validation fails;
keep the same return values (caSecret, servingSecret, caBundle) and reuse
existing Secret objects when valid to make GenerateInitialWebhookCerts
idempotent across repeated installs.
- Around line 61-66: The controller currently only watches Secret events
(NewControllerManagedBy(...).For(&corev1.Secret{}...)) so if both CASecretName
and ServingCertSecretName are absent at startup no reconcile is triggered;
update the controller builder to add a stable secondary watch or an explicit
initial enqueue to guarantee a first reconcile—for example, add a Watch for the
operator Namespace (or a stable ConfigMap) using builder.Watches(...) with a
predicate that filters on r.Namespace (or a Mapper that always enqueues the
reconcile request for this reconciler), or add logic during Setup to enqueue the
reconcile request once at manager start; change the controller setup around
NewControllerManagedBy, For, and Complete(r) to include that extra Watch/initial
enqueue so reconciliation runs even when the secrets are missing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: aebe52de-8fc0-41dc-a05a-b28cb252f7f0

📥 Commits

Reviewing files that changed from the base of the PR and between 34c6f48 and 33ee3cc.

📒 Files selected for processing (3)
  • cmd/install/install.go
  • hypershift-operator/controllers/webhookcerts/webhookcerts_controller.go
  • hypershift-operator/controllers/webhookcerts/webhookcerts_controller_test.go
✅ Files skipped from review due to trivial changes (1)
  • hypershift-operator/controllers/webhookcerts/webhookcerts_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/install/install.go

log := ctrl.LoggerFrom(ctx)

// Only process our own secrets.
if req.Namespace != r.Namespace {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these 2 conditions not already covered in the predicate func in the setup? Only requests matching the NS and secret name should be reconciled already IIUC.

return fmt.Errorf("failed to list CRDs: %w", err)
}

for i := range crdList.Items {
Copy link
Copy Markdown
Member

@clebs clebs Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have 2 questions about this logic:

  1. Since we are now patching the CRDs, we no longer need the service.beta.openshift.io/inject-cabundle annotation (e.g. here) right?
  2. Is this only needed by hypershift CRDs? IIUC other CRDs with conversion enabled would also need this.

@enxebre enxebre changed the title CNTRLPLANE-1200: feat(install): self-manage webhook certs instead of relying on service-ca CNTRLPLANE-2207: feat(install): self-manage webhook certs instead of relying on service-ca Apr 1, 2026
@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Apr 6, 2026
@openshift-ci-robot
Copy link
Copy Markdown

Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage.

@LiangquanLi930
Copy link
Copy Markdown
Contributor

/pipeline required

@openshift-ci-robot
Copy link
Copy Markdown

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-21
/test e2e-aws-4-21
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws

@LiangquanLi930
Copy link
Copy Markdown
Contributor

/verified by e2e

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Apr 6, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@LiangquanLi930: This PR has been marked as verified by e2e.

Details

In response to this:

/verified by e2e

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

/retest-required

Remaining retests: 0 against base HEAD 6fff85f and 2 for PR HEAD 260c92e in total

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

/retest-required

Remaining retests: 0 against base HEAD 2d72889 and 1 for PR HEAD 260c92e in total

@LiangquanLi930
Copy link
Copy Markdown
Contributor

/verified cancel

@openshift-ci-robot
Copy link
Copy Markdown

@LiangquanLi930: The /verified command must be used with one of the following actions: by, later, remove, or bypass. See https://docs.ci.openshift.org/docs/architecture/jira/#premerge-verification for more information.

Details

In response to this:

/verified cancel

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@LiangquanLi930
Copy link
Copy Markdown
Contributor

/verified remove

@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Apr 7, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@LiangquanLi930: The verified label has been removed.

Details

In response to this:

/verified remove

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@enxebre
Copy link
Copy Markdown
Member Author

enxebre commented Apr 7, 2026

/verified later @enxebre

@openshift-ci-robot openshift-ci-robot added verified-later verified Signifies that the PR passed pre-merge verification criteria labels Apr 7, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@enxebre: This PR has been marked to be verified later by @enxebre.

Details

In response to this:

/verified later @enxebre

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@LiangquanLi930
Copy link
Copy Markdown
Contributor

/test e2e-kubevirt-aws-ovn-reduced

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

/retest-required

Remaining retests: 0 against base HEAD 17cdd12 and 0 for PR HEAD 260c92e in total

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 7, 2026

@enxebre: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-kubevirt-aws-ovn-reduced 260c92e link true /test e2e-kubevirt-aws-ovn-reduced

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

/hold

Revision 260c92e was retested 3 times: holding

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 7, 2026
@clebs
Copy link
Copy Markdown
Member

clebs commented Apr 7, 2026

superseded by #8174

@clebs
Copy link
Copy Markdown
Member

clebs commented Apr 8, 2026

/close

An alternative PR merged #8174

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 8, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 8, 2026

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci Bot closed this Apr 8, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 8, 2026

@clebs: Closed this PR.

Details

In response to this:

/close

An alternative PR merged #8174

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cli Indicates the PR includes changes for CLI area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. verified Signifies that the PR passed pre-merge verification criteria verified-later

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants