Search before reporting
Read release policy
User environment
Pulsar Version: 4.1.2 (Image: docker.io/apachepulsar/pulsar-all:4.1.2)
Platform: Kubernetes (Azure/AKS based on log headers)
Component: BookKeeper / ZooKeeper
Issue Description
Our pulsar-bookie pods are restarting unexpectedly. Just before each restart, the logs show ZKRegistrationClient invalidating its cache for the affected bookie's address, followed by NetworkTopologyImpl removing that node from /default-rack.
Error messages
INFO org.apache.bookkeeper.discover.ZKRegistrationClient - Invalidate cache for pulsar-bookie-1.pulsar-bookie.pulsar.svc.cluster.local:3181
INFO org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: /default-rack/pulsar-bookie-1.pulsar-bookie.pulsar.svc.cluster.local:3181
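To check how often this pair of events occurs across a larger log capture, a small script can match each cache invalidation with the subsequent topology removal for the same address. This is only a sketch: the regular expressions are derived from the two log lines above, and the function name `find_bookie_drops` is made up for illustration. It assumes one log event per line and ignores timestamps.

```python
import re

# Patterns based solely on the two log lines quoted in this report.
INVALIDATE = re.compile(
    r"ZKRegistrationClient - Invalidate cache for (?P<addr>\S+)"
)
REMOVE = re.compile(
    r"NetworkTopologyImpl - Removing a node: (?P<rack>/[^/\s]+)/(?P<addr>\S+)"
)


def find_bookie_drops(lines):
    """Return (rack, address) pairs where a cache invalidation for an
    address is later followed by a topology removal of the same address."""
    pending = set()   # addresses invalidated but not yet removed
    drops = []
    for line in lines:
        match = INVALIDATE.search(line)
        if match:
            pending.add(match.group("addr"))
            continue
        match = REMOVE.search(line)
        if match and match.group("addr") in pending:
            pending.discard(match.group("addr"))
            drops.append((match.group("rack"), match.group("addr")))
    return drops


sample = [
    "INFO org.apache.bookkeeper.discover.ZKRegistrationClient - Invalidate cache for pulsar-bookie-1.pulsar-bookie.pulsar.svc.cluster.local:3181",
    "INFO org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: /default-rack/pulsar-bookie-1.pulsar-bookie.pulsar.svc.cluster.local:3181",
]
for rack, addr in find_bookie_drops(sample):
    print(rack, addr)
```

Running it over the full log of a client or broker that logged these lines would show whether the drops are limited to pulsar-bookie-1 or affect other bookies as well.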
Reproducing the issue
ZooKeeper logs show standard ruok commands but no explicit session expiration immediately preceding the drop.
The bookie appears to be under normal load (compaction usage buckets are mostly at 100%).
Config: diskUsageWarnThreshold = 0.9, isForceGCAllowWhenNoSpace = true.
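For reference, here is a rough sketch of how a disk usage ratio compares against the warn threshold above. The value 0.95 for `full_threshold` is an assumption (this report does not state diskUsageThreshold), and `classify_disk_usage` is a hypothetical helper, not a BookKeeper API.

```python
def classify_disk_usage(usage, warn_threshold=0.9, full_threshold=0.95):
    """Classify a disk usage ratio (0.0 to 1.0) against two thresholds.

    warn_threshold mirrors diskUsageWarnThreshold from this report.
    full_threshold is an ASSUMED value for diskUsageThreshold.
    """
    if not 0.0 <= usage <= 1.0:
        raise ValueError("usage must be a ratio between 0.0 and 1.0")
    if usage > full_threshold:
        return "full"
    if usage > warn_threshold:
        return "warn"
    return "ok"


for ratio in (0.50, 0.92, 0.97):
    print(ratio, classify_disk_usage(ratio))
```

Comparing the actual ledger disk usage on the restarting pods against these thresholds would help confirm or rule out disk pressure as a factor.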
Additional information
No response
Are you willing to submit a PR?