feat(valkey): add Valkey cluster addon as a sibling to redis #2

Open
mogita wants to merge 3 commits into main from feature/valkey-addon

Conversation


@mogita mogita commented May 6, 2026

Summary

Adds addons/valkey/ as a cluster-mode-only, side-by-side KubeBlocks addon. It replaces the five post-install Helm hooks in stream-infra/kubernetes/codebase/charts/valkey/templates/hooks/ (patch-cache-config, patch-maxmemory, patch-prefer-ip, patch-reshard-cm, patch-valkey-image) by baking the same behaviour into the addon at the template level — no more racing the operator with kubectl patch jobs against KubeBlocks-managed ConfigMaps.

The Valkey addon is independent of addons/redis/. Upstream redis evolution can land cleanly via merge — there are no shared files to conflict on.

What's in the addon

  • Single Valkey major (9.x) — no multi-version range loop, no sentinel, no twemproxy. Slim by design; we can extend valkeyVersions in values.yaml for new patches.
  • cmpv-valkey-cluster.yaml ships docker.io/valkey/valkey:<version>. dbctl and agamotto stay on apecloud (KubeBlocks-side tooling, not the engine).
  • ShardingDefinition with minShards: 1 — provisions 1, 2, 3+ shards. The create_redis_cluster helper branches on primary_count == 1 to use CLUSTER ADDSLOTSRANGE 0 16383 (mirroring AWS ElastiCache's approach), bypassing redis-cli --cluster create which Redis itself rejects below 3 masters.
  • redis.conf tuned for cache workload (config/valkey-cluster-config.tpl): appendonly no, save "", io-threads 1, latency-monitor-threshold 25, maxmemory-policy allkeys-lru, maxmemory at 85% of pod memory limit.
  • valkey-cluster-server-start.sh emits cluster-preferred-endpoint-type ip on the default-network branch (upstream emits hostname), so CLUSTER SLOTS announces VPC-routable IPs.
  • valkey-cluster-manage.sh skips the legacy redis-cli --cluster reshard call on scale-out — slot migration is driven by ASM (CLUSTER MIGRATESLOTS via ape-dts) through the OpsDefinition in stream-infra.
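The single-primary bootstrap described above can be sketched in plain bash. The function names follow the PR (`create_redis_cluster`, `build_single_shard_addslots_command`), but the bodies here are illustrative, not the addon's actual scripts:

```shell
# Hypothetical, simplified sketch of the single-primary branch in
# valkey-cluster-common.sh; the real script also handles auth, announce
# addresses, and replica attachment.
build_single_shard_addslots_command() {
  # One primary owns the whole keyspace: claim every slot in a single
  # command, mirroring how ElastiCache bootstraps a one-shard cluster.
  echo "CLUSTER ADDSLOTSRANGE 0 16383"
}

create_redis_cluster() {
  local primary_count="$1"
  if [ "$primary_count" -eq 1 ]; then
    # redis-cli --cluster create refuses fewer than 3 masters, so bypass it.
    build_single_shard_addslots_command
  else
    # 3+ primaries: the normal cluster-create path still applies.
    echo "redis-cli --cluster create <node-endpoints> --cluster-yes"
  fi
}

create_redis_cluster 1   # prints: CLUSTER ADDSLOTSRANGE 0 16383
```

Branching on `primary_count` at cluster-create time keeps the multi-shard path byte-identical to upstream, which is what makes future bug-porting cheap.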

Internal function names keep their redis_* identifiers to minimise diff vs. upstream redis scripts (easier future bug-porting). Filenames and CR names are valkey-cluster-* for clarity.

What this retires in stream-infra

| Hook | Replaced by |
| --- | --- |
| patch-valkey-image.yaml | cmpv-valkey-cluster.yaml ships valkey/valkey directly |
| patch-cache-config.yaml | valkey-cluster-config.tpl bakes appendonly/save/io-threads/latency-monitor |
| patch-maxmemory.yaml | Same template, 85% of PHY_MEMORY (limit), allkeys-lru policy |
| patch-prefer-ip.yaml | valkey-cluster-server-start.sh line 652 set to ip |
| patch-reshard-cm.yaml | valkey-cluster-manage.sh simply doesn't call scale_out_shard_reshard |
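As a rough illustration of the maxmemory replacement, the arithmetic reduces to integer math over the pod memory limit. `PHY_MEMORY` stands in for however the template resolves the limit in bytes; the snippet is a sketch, not the rendered tpl:

```shell
# Derive maxmemory as 85% of the pod memory limit in bytes, the same
# budget the retired patch-maxmemory hook used to patch in after install.
PHY_MEMORY="${PHY_MEMORY:-1073741824}"   # e.g. a 1Gi pod memory limit
MAXMEMORY=$(( PHY_MEMORY * 85 / 100 ))
echo "maxmemory ${MAXMEMORY}"
echo "maxmemory-policy allkeys-lru"
```

Integer division truncates, so the derived budget always stays at or under 85% of the limit.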

What stays in stream-infra: NLB / TargetGroupBinding, NetworkPolicies, ServiceMonitor, the auto-heal CronJob, the ASM OpsDefinition. Engine-agnostic infra.

Settings global for v1

No per-cluster Helm knobs yet — every Valkey cluster on this addon picks up the same tunings. If divergence is needed later, we can either wire values.yaml overrides into the config tpl or add a real ParametersDefinition / ParamConfigRenderer for per-Cluster overrides.

Verification

  • helm template addons/valkey renders 5 resources cleanly (ShardingDefinition, ComponentDefinition, ComponentVersion, plus 2 ConfigMap templates). All 9 script files mount into the scripts ConfigMap.
  • shellspec for build_single_shard_addslots_command and the create_redis_cluster branch: 4 examples, 0 failures (run on bash 5).
  • Upstream redis spec still passes (no shared files).
  • e2e: switch a Cluster from componentDef: redis-cluster-8 to componentDef: valkey-cluster-9-0.1.0, verify provision, set/get via redis-cli -c.
  • e2e: provision with shards: 1; verify cluster_state ok and slots covered.
  • e2e: scale 1 → 3, then run ASM OpsRequest; verify slot migration and --cluster check passes.
  • e2e: confirm CLUSTER SLOTS announces pod IPs (not FQDNs) so chat-api on EC2 can reach the cluster.
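The shards: 1 health assertion can be sketched as a small parser over CLUSTER INFO output. The helper name is hypothetical; a real e2e run would feed it `redis-cli -c CLUSTER INFO` from a provisioned pod:

```shell
# Verify cluster_state:ok with all 16384 slots assigned — the condition
# the shards:1 e2e case asserts. CLUSTER INFO replies use CRLF line
# endings, hence the tr -d '\r'.
check_cluster_health() {
  local info="$1" state slots
  state=$(printf '%s\n' "$info" | awk -F: '/^cluster_state/{print $2}' | tr -d '\r')
  slots=$(printf '%s\n' "$info" | awk -F: '/^cluster_slots_assigned/{print $2}' | tr -d '\r')
  [ "$state" = "ok" ] && [ "$slots" -eq 16384 ]
}

# Canned sample standing in for `redis-cli -c CLUSTER INFO` output.
sample='cluster_state:ok
cluster_slots_assigned:16384'
check_cluster_health "$sample" && echo "cluster healthy"
```

The same check works unchanged after the 1 → 3 scale-out, since full slot coverage is invariant across shard counts.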

Out of scope

  • Reverse path (3→2→1 scale-in) is untouched in this PR. The manage.sh lifecycle hook for shardRemove calls --pre-terminate; that flow is unchanged. We can revisit if/when we need to scale a Valkey cluster down through 1.
  • No changes to addons/redis/. The redis addon stays exactly as upstream — anyone still on a redis-componentDef cluster keeps the existing 3-shard floor and stock behaviour.

mogita added 3 commits May 6, 2026 12:43
Stand up addons/valkey/ as a cluster-mode-only side-by-side addon, so our
Valkey customizations live in their own file tree and never collide with
upstream redis evolution. This retires the five post-install Helm hooks
in stream-infra (patch-cache-config, patch-maxmemory, patch-prefer-ip,
patch-reshard-cm, patch-valkey-image) by baking the equivalent behaviour
into the addon at template-level.

What's in the addon
-------------------
- Single Valkey major (9.x) — no multi-version range loop, no sentinel,
  no twemproxy. cmpv-valkey-cluster.yaml ships docker.io/valkey/valkey
  images. dbctl/agamotto stay on apecloud.
- ShardingDefinition with `minShards: 1` (provisions 1, 2, 3+ shards,
  matching how AWS ElastiCache exposes the same engine).
- redis.conf tuned for a cache workload at template-level: appendonly no,
  save "" (no scheduled BGSAVE), io-threads 1 (avoids CFS throttling at
  our pod CPU limit), latency-monitor-threshold 25 (observability),
  maxmemory-policy allkeys-lru, maxmemory at 85% of pod memory limit.
- valkey-cluster-server-start.sh: emits `cluster-preferred-endpoint-type
  ip` on the default-network branch (was `hostname`), so CLUSTER SLOTS
  announces VPC-routable IPs for chat-api and other external clients.
- valkey-cluster-manage.sh: skips the legacy `redis-cli --cluster reshard`
  call on shard scale-out — slot migration is driven by ASM
  (CLUSTER MIGRATESLOTS via ape-dts) through the OpsDefinition in
  stream-infra.
- valkey-cluster-common.sh: branches `create_redis_cluster` on a single
  primary to use `CLUSTER ADDSLOTSRANGE 0 16383` (mirroring ElastiCache),
  bypassing `redis-cli --cluster create` which rejects fewer than 3
  masters. Lifts the matching guard in initialize_redis_cluster.

Function names inside the scripts intentionally keep their `redis_*`
identifiers to minimise the diff vs. upstream redis scripts and ease
future bug-porting.

Settings are global for now — no per-cluster Helm knobs. Add
ParametersDefinition / values overrides later if cluster-specific
tunings are needed.

Verification
------------
- `helm template addons/valkey` renders 5 resources cleanly:
  ShardingDefinition, ComponentDefinition, ComponentVersion, plus the
  config + scripts ConfigMap templates. All 9 script files mount.
- shellspec for `build_single_shard_addslots_command` and
  `create_redis_cluster` branch logic: 4 examples, 0 failures.

Add 9.0.0, 9.0.1, 9.0.2, 9.0.4 alongside existing 9.0.3 / 9.1.0 in
ComponentVersion releases. 9.0.4 (released 2026-05-06) becomes the chart
appVersion and the default `serviceVersion` on the ComponentDefinition.

The full 9.0.x range gives operators a pinned set of options for
OpsRequest type=Upgrade rollback / patch-version testing without needing
to redeploy the addon. Same-image-tag mapping; no behavioural change.

9.1.0 is still RC upstream and not yet a tagged release on
docker.io/valkey/valkey. Keep ComponentVersion to the stable 9.0.x line
(9.0.0 - 9.0.4) for now; re-add 9.1.0 once the GA tag ships.
