feat(valkey): add Valkey cluster addon as a sibling to redis #2
Open
Stand up addons/valkey/ as a cluster-mode-only side-by-side addon, so our Valkey customizations live in their own file tree and never collide with upstream redis evolution. This retires the five post-install Helm hooks in stream-infra (patch-cache-config, patch-maxmemory, patch-prefer-ip, patch-reshard-cm, patch-valkey-image) by baking the equivalent behaviour into the addon at template-level.

What's in the addon
-------------------

- Single Valkey major (9.x): no multi-version range loop, no sentinel, no twemproxy. cmpv-valkey-cluster.yaml ships docker.io/valkey/valkey images. dbctl/agamotto stay on apecloud.
- ShardingDefinition with `minShards: 1` (provisions 1, 2, 3+ shards, matching how AWS ElastiCache exposes the same engine).
- redis.conf tuned for a cache workload at template-level: appendonly no, save "" (no scheduled BGSAVE), io-threads 1 (avoids CFS throttling at our pod CPU limit), latency-monitor-threshold 25 (observability), maxmemory-policy allkeys-lru, maxmemory at 85% of pod memory limit.
- valkey-cluster-server-start.sh: emits `cluster-preferred-endpoint-type ip` on the default-network branch (was `hostname`), so CLUSTER SLOTS announces VPC-routable IPs for chat-api and other external clients.
- valkey-cluster-manage.sh: skips the legacy `redis-cli --cluster reshard` call on shard scale-out. Slot migration is driven by ASM (CLUSTER MIGRATESLOTS via ape-dts) through the OpsDefinition in stream-infra.
- valkey-cluster-common.sh: branches `create_redis_cluster` on a single primary to use `CLUSTER ADDSLOTSRANGE 0 16383` (mirroring ElastiCache), bypassing `redis-cli --cluster create`, which rejects fewer than 3 masters. Lifts the matching guard in initialize_redis_cluster.

Function names inside the scripts intentionally keep their `redis_*` identifiers to minimise the diff vs. upstream redis scripts and ease future bug-porting.

Settings are global for now: no per-cluster Helm knobs. Add ParametersDefinition / values overrides later if cluster-specific tunings are needed.

Verification
------------

- `helm template addons/valkey` renders 5 resources cleanly: ShardingDefinition, ComponentDefinition, ComponentVersion, plus the config + scripts ConfigMap templates. All 9 script files mount.
- shellspec for `build_single_shard_addslots_command` and `create_redis_cluster` branch logic: 4 examples, 0 failures.
Add 9.0.0, 9.0.1, 9.0.2, 9.0.4 alongside existing 9.0.3 / 9.1.0 in ComponentVersion releases. 9.0.4 (released 2026-05-06) becomes the chart appVersion and the default `serviceVersion` on the ComponentDefinition. The full 9.0.x range gives operators a pinned set of options for OpsRequest type=Upgrade rollback / patch-version testing without needing to redeploy the addon. Same-image-tag mapping; no behavioural change.
9.1.0 is still RC upstream and not yet a tagged release on docker.io/valkey/valkey. Keep ComponentVersion to the stable 9.0.x line (9.0.0 - 9.0.4) for now; re-add 9.1.0 once the GA tag ships.
Summary
Adds `addons/valkey/` as a cluster-mode-only side-by-side KubeBlocks addon. It replaces the five post-install Helm hooks in `stream-infra/kubernetes/codebase/charts/valkey/templates/hooks/` (patch-cache-config, patch-maxmemory, patch-prefer-ip, patch-reshard-cm, patch-valkey-image) by baking the same behaviour into the addon at template-level, so there is no more racing the operator with `kubectl patch` jobs against KubeBlocks-managed ConfigMaps.

The Valkey addon is independent of `addons/redis/`. Upstream redis evolution can land cleanly via merge, since there are no shared files to conflict on.

What's in the addon
- … `valkeyVersions` in `values.yaml` for new patches. `cmpv-valkey-cluster.yaml` ships `docker.io/valkey/valkey:<version>`. `dbctl` and `agamotto` stay on apecloud (KubeBlocks-side tooling, not the engine).
- `ShardingDefinition` with `minShards: 1`: provisions 1, 2, 3+ shards. The `create_redis_cluster` helper branches on `primary_count == 1` to use `CLUSTER ADDSLOTSRANGE 0 16383` (mirroring AWS ElastiCache's approach), bypassing `redis-cli --cluster create`, which Redis itself rejects below 3 masters.
- … (`config/valkey-cluster-config.tpl`): `appendonly no`, `save ""`, `io-threads 1`, `latency-monitor-threshold 25`, `maxmemory-policy allkeys-lru`, `maxmemory` at 85% of pod memory limit.
- `valkey-cluster-server-start.sh` emits `cluster-preferred-endpoint-type ip` on the default-network branch (upstream emits `hostname`), so `CLUSTER SLOTS` announces VPC-routable IPs.
- `valkey-cluster-manage.sh` skips the legacy `redis-cli --cluster reshard` call on scale-out. Slot migration is driven by ASM (`CLUSTER MIGRATESLOTS` via ape-dts) through the OpsDefinition in stream-infra.

Internal function names keep their `redis_*` identifiers to minimise diff vs. upstream redis scripts (easier future bug-porting). Filenames and CR names are `valkey-cluster-*` for clarity.

What this retires in stream-infra

| Retired hook | Replaced by |
| --- | --- |
| `patch-valkey-image.yaml` | `cmpv-valkey-cluster.yaml` ships valkey/valkey directly |
| `patch-cache-config.yaml` | `valkey-cluster-config.tpl` bakes appendonly/save/io-threads/latency-monitor |
| `patch-maxmemory.yaml` | … `PHY_MEMORY` (limit), `allkeys-lru` policy |
| `patch-prefer-ip.yaml` | `valkey-cluster-server-start.sh` line 652 set to `ip` |
| `patch-reshard-cm.yaml` | `valkey-cluster-manage.sh` simply doesn't call `scale_out_shard_reshard` |

What stays in stream-infra: NLB / TargetGroupBinding, NetworkPolicies, ServiceMonitor, the auto-heal CronJob, the ASM OpsDefinition. Engine-agnostic infra.
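To illustrate the single-primary branch described under "What's in the addon", here is a minimal sketch of how the bootstrap command might be chosen from the primary count. The function name `build_bootstrap_command` and its argument layout are hypothetical and are not the addon's real helpers; endpoints are assumed to be `host:port` strings with an IPv4 address or hostname. The sketch only mirrors the `primary_count == 1` branch and leaves every other count on the stock `redis-cli --cluster create` path.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the addon's implementation.
# Usage: build_bootstrap_command <primary_count> <host:port>...
# Prints the command that would bootstrap the cluster instead of running it.
build_bootstrap_command() {
  local primary_count="$1"
  shift

  if [ "$primary_count" -eq 1 ]; then
    local endpoint="$1"
    local host="${endpoint%:*}"    # everything before the last ':'
    local port="${endpoint##*:}"   # everything after the last ':'
    # A lone primary takes the whole keyspace: hash slots 0..16383.
    echo "redis-cli -h ${host} -p ${port} CLUSTER ADDSLOTSRANGE 0 16383"
  else
    # Stock path: redis-cli distributes the slots across the listed primaries.
    echo "redis-cli --cluster create $* --cluster-yes"
  fi
}
```

Calling `build_bootstrap_command 1 10.0.0.5:6379` prints the `CLUSTER ADDSLOTSRANGE` form. Printing the command rather than executing it keeps the branch logic testable without a running server.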
Settings global for v1
No per-cluster Helm knobs yet: every Valkey cluster on this addon picks up the same tunings. If divergence is needed later we can wire either `values.yaml` overrides into the config tpl or a real `ParametersDefinition` / `ParamConfigRenderer` for per-`Cluster` overrides.

Verification
- `helm template addons/valkey` renders 5 resources cleanly (`ShardingDefinition`, `ComponentDefinition`, `ComponentVersion`, plus 2 ConfigMap templates). All 9 script files mount into the scripts ConfigMap.
- shellspec for `build_single_shard_addslots_command` and the `create_redis_cluster` branch: 4 examples, 0 failures (run on bash 5).
- … `Cluster` from `componentDef: redis-cluster-8` to `componentDef: valkey-cluster-9-0.1.0`, verify provision, set/get via `redis-cli -c`.
- … `shards: 1`; verify `cluster_state:ok` and slots covered.
- … `--cluster check` passes.
- … `CLUSTER SLOTS` announces pod IPs (not FQDNs) so chat-api on EC2 can reach the cluster.

Out of scope
- … `manage.sh` lifecycle hook for `shardRemove` calls `--pre-terminate`; that flow is unchanged. We can revisit if/when we need to scale a Valkey cluster down through 1.
- … `addons/redis/`. The redis addon stays exactly as upstream: anyone still on a redis-componentDef cluster keeps the existing 3-shard floor and stock behaviour.
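For the single-shard check listed under Verification (cluster state ok and slots covered), a small helper along these lines could turn `CLUSTER INFO` output into a pass/fail result. The name `cluster_is_healthy` is hypothetical. The sketch assumes the standard `cluster_state` and `cluster_slots_assigned` fields and treats 16384 assigned slots as full coverage.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the addon.
# Reads CLUSTER INFO text on stdin and returns 0 only when the cluster
# reports state ok and all 16384 hash slots are assigned.
cluster_is_healthy() {
  local info
  # Strip carriage returns in case the reply arrives with CRLF line endings.
  info="$(tr -d '\r')"
  printf '%s\n' "$info" | grep -qx 'cluster_state:ok' &&
    printf '%s\n' "$info" | grep -qx 'cluster_slots_assigned:16384'
}
```

A possible invocation, with `$POD_IP` standing in for any node address: `redis-cli -h "$POD_IP" CLUSTER INFO | cluster_is_healthy && echo "slots covered"`.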