[SPARK-56845][K8S] Truncate ConfigMap names that exceed DNS subdomain limit #55874
Open
TongWei1105 wants to merge 1 commit into
Conversation
… limit

When `spark.app.name` is very long (>229 chars), the derived `resourceNamePrefix` plus a fixed suffix (e.g. `-hadoop-config`, `-krb5-file`, `-driver-podspec-conf-map`) can exceed the Kubernetes DNS subdomain 253-char limit. The K8s API then rejects the ConfigMap with `must be no more than 253 characters`, failing driver submission.

Unify ConfigMap name construction through a single helper, `KubernetesClientUtils.configMapName(prefix, suffix)`, that mirrors the fallback strategy already used by `KubernetesConf.driverServiceName`: when the preferred name is too long, fall back to `spark-<uniqueID><suffix>`, which preserves uniqueness across concurrent applications and keeps the name within the limit.

The three call sites (`HadoopConfDriverFeatureStep`, `KerberosConfDriverFeatureStep`, `PodTemplateConfigMapStep`) are migrated to the helper, and the two `def newConfigMapName` fields are converted to `lazy val` so the fallback's `uniqueID()` is captured exactly once; otherwise the ConfigMap would be created with one name while the pod mounted another. `lazy val` (rather than `val`) avoids spuriously computing, and emitting a fallback warning for, a name that is never used (e.g. the step is constructed but no Hadoop/Kerberos conf is set).
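The capture semantics behind the `def` → `lazy val` conversion can be illustrated with a minimal sketch (the `uniqueID` stand-in and the name shape here are simplified illustrations, not the actual Spark code): with `def`, every access recomputes the random fallback name, so the created ConfigMap and the pod's volume mount could end up referencing different names; `lazy val` fixes the name on first access and computes nothing if the name is never needed.

```scala
object FallbackNameCapture {
  // Stand-in for KubernetesUtils.uniqueID(): a short random token.
  private def uniqueID(): String = java.util.UUID.randomUUID().toString.take(8)

  // With `def`, each access recomputes the name: two reads can differ, so
  // the ConfigMap could be created under one name while the pod mounts another.
  def nameViaDef: String = s"spark-${uniqueID()}-hadoop-config"

  // With `lazy val`, the first access fixes the name for all later uses,
  // and nothing is computed (or warned about) if the name is never read.
  lazy val nameViaLazyVal: String = s"spark-${uniqueID()}-hadoop-config"
}
```

Reading `nameViaDef` twice yields two different names, while `nameViaLazyVal` is stable across reads, which is exactly the property the volume-mount consistency depends on.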
What changes were proposed in this pull request?
Add a new overload `KubernetesClientUtils.configMapName(prefix, suffix)` that falls back to `spark-<uniqueID><suffix>` when `prefix + suffix` exceeds `KUBERNETES_DNS_SUBDOMAIN_NAME_MAX_LENGTH` (253), mirroring the fallback strategy already used by `KubernetesConf.driverServiceName`.

Migrate the three driver-side ConfigMap call sites to the new helper:

- `HadoopConfDriverFeatureStep` (suffix `-hadoop-config`)
- `KerberosConfDriverFeatureStep` (suffix `-krb5-file`)
- `PodTemplateConfigMapStep` (suffix `-driver-podspec-conf-map`)

The two `def newConfigMapName` fields are converted to `lazy val` so the fallback's random `uniqueID()` is captured exactly once; otherwise the ConfigMap would be created with one name while the pod's volume references another. `lazy val` (rather than `val`) avoids spuriously computing, and emitting a fallback warning for, a name that is never used (e.g. the step is constructed but no Hadoop/Kerberos conf is set).

Note: this also changes the truncation behavior of the existing single-arg `configMapName(prefix)` (`@Since("3.3.0")`), which now delegates to the new overload. Spark's own callers (`configMapNameDriver` / `configMapNameExecutor`) use short fixed prefixes (~22 chars) and never hit the fallback, so behavior for built-in callers is unchanged. External `@DeveloperApi` consumers passing very long prefixes will see different (but safer, collision-free) names than before.
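A minimal sketch of the fallback logic described above (the object name and the `uniqueID` helper are simplified stand-ins, not the actual Spark implementation):

```scala
object ConfigMapNameSketch {
  // Kubernetes DNS subdomain names (RFC 1123) are capped at 253 characters.
  val KubernetesDnsSubdomainNameMaxLength = 253

  // Stand-in for KubernetesUtils.uniqueID(): a short random token.
  private def uniqueID(): String = java.util.UUID.randomUUID().toString.take(8)

  // Prefer prefix + suffix; if that would exceed the limit, fall back to
  // "spark-<uniqueID><suffix>", which stays well under the limit and is
  // unique across concurrent applications.
  def configMapName(prefix: String, suffix: String): String = {
    val preferred = prefix + suffix
    if (preferred.length <= KubernetesDnsSubdomainNameMaxLength) preferred
    else s"spark-${uniqueID()}$suffix"
  }
}
```

For example, a 300-char prefix with suffix `-hadoop-config` would produce a name shaped like `spark-1a2b3c4d-hadoop-config`, keeping the meaningful suffix while discarding only the over-long prefix.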
Why are the changes needed?
When `spark.app.name` is very long (>229 chars), the derived `resourceNamePrefix` plus a fixed suffix exceeds the Kubernetes DNS subdomain 253-char limit. The K8s API then rejects the ConfigMap with `must be no more than 253 characters`, failing driver submission. This PR makes the three driver-side ConfigMap names robust to long app names.

Does this PR introduce any user-facing change?
Yes: driver submission with very long `spark.app.name` no longer fails. Submissions that previously failed will now succeed; the affected ConfigMaps will be created with names like `spark-<uniqueID>-hadoop-config` instead. A warning is logged when the fallback is used.

For users of the public `KubernetesClientUtils.configMapName(prefix)` API: the truncation strategy for over-long prefixes changed from "take the first N chars of the prefix" to "fall back to `spark-<uniqueID>-conf-map`". This avoids silent name collisions between applications that happened to share the first 244 chars of their prefix. Spark's own callers always use short prefixes and are unaffected.
How was this patch tested?
Added unit tests:

- `KubernetesClientUtilsSuite`: verifies the new helper returns `prefix + suffix` within the limit, falls back to `spark-<id><suffix>` when over the limit, and that the legacy single-arg overload still produces the `-conf-map` suffix.
- `HadoopConfDriverFeatureStepSuite`, `KerberosConfDriverFeatureStepSuite`, `PodTemplateConfigMapStepSuite`: each adds a "very long resourceNamePrefix" case asserting (a) the resulting ConfigMap name is within the limit, and (b) the pod's volume references the exact same name as the created ConfigMap (a regression guard for the `def` → `lazy val` change).

Was this patch authored or co-authored using generative AI tooling?
Yes, `Generated-by: Claude Code (Opus 4.7)`