Leader Election
When running multiple replicas of an operator for high availability, leader election ensures that only one instance actively reconciles resources at a time. JOSDK uses Kubernetes Lease objects for leader election.
Enabling Leader Election
Programmatic Configuration
var operator = new Operator(o -> o.withLeaderElectionConfiguration(
    new LeaderElectionConfiguration("my-operator-lease", "operator-namespace")));
Or using the builder for full control:
import static io.javaoperatorsdk.operator.api.config.LeaderElectionConfigurationBuilder.aLeaderElectionConfiguration;
import java.time.Duration;

var config = aLeaderElectionConfiguration("my-operator-lease")
    .withLeaseNamespace("operator-namespace")
    // Assumes POD_NAME is set on the container, e.g. via the Downward API
    .withIdentity(System.getenv("POD_NAME"))
    .withLeaseDuration(Duration.ofSeconds(15))
    .withRenewDeadline(Duration.ofSeconds(10))
    .withRetryPeriod(Duration.ofSeconds(2))
    .build();
var operator = new Operator(o -> o.withLeaderElectionConfiguration(config));
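The three durations in the builder example depend on each other, so it can help to sanity-check them before starting the operator. The sketch below assumes the ordering commonly used for Kubernetes leader election (lease duration longer than the renew deadline, renew deadline longer than the retry period). This ordering is an assumption here, not something stated in this page, and `LeaseTimingCheck` is a hypothetical helper, not part of JOSDK.

```java
import java.time.Duration;

// Hypothetical helper (not a JOSDK class): checks that the timing values
// follow the assumed ordering leaseDuration > renewDeadline > retryPeriod > 0.
final class LeaseTimingCheck {

    private LeaseTimingCheck() {}

    static boolean isConsistent(Duration leaseDuration,
                                Duration renewDeadline,
                                Duration retryPeriod) {
        return !retryPeriod.isZero()
            && !retryPeriod.isNegative()
            && renewDeadline.compareTo(retryPeriod) > 0
            && leaseDuration.compareTo(renewDeadline) > 0;
    }
}
```

With the values from the example above (15 s, 10 s, 2 s), `isConsistent` returns `true`. Swapping the lease duration and the renew deadline makes it return `false`.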
External Configuration
Leader election can also be configured via properties (e.g. environment variables or a config file).
See the configurations page for details.
How It Works
- When leader election is enabled, the operator starts but does not process events until it acquires the lease.
- Once leadership is acquired, event processing begins normally.
- If leadership is lost (e.g. the leader pod becomes unresponsive), another instance acquires the lease
and takes over reconciliation. The instance that lost leadership is terminated via System.exit().
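The acquire, renew, and take-over steps above can be illustrated with a small in-memory model. This is a minimal sketch for intuition only: `SimulatedLease` is a hypothetical class, it does not talk to Kubernetes, and it is not how JOSDK implements leader election (the real state lives in a Lease object).

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical, in-memory stand-in for a Lease; for illustration only.
final class SimulatedLease {
    private final Duration leaseDuration;
    private String holder;      // identity of the current leader, or null if unheld
    private Instant renewedAt;  // time of the last successful acquire or renew

    SimulatedLease(Duration leaseDuration) {
        this.leaseDuration = leaseDuration;
    }

    // A candidate gets (or keeps) the lease if it is unheld, expired,
    // or already held by that same candidate.
    synchronized boolean tryAcquireOrRenew(String candidate, Instant now) {
        boolean free = holder == null;
        boolean expired = !free && !now.isBefore(renewedAt.plus(leaseDuration));
        if (free || expired || holder.equals(candidate)) {
            holder = candidate;
            renewedAt = now;
            return true;
        }
        return false;
    }

    synchronized String holder() {
        return holder;
    }
}
```

In this model, a second instance calling `tryAcquireOrRenew` keeps getting `false` while the leader renews in time. Once the leader stops renewing for a full lease duration, the next attempt by the other instance succeeds and it becomes the holder.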
Identity and Namespace Inference
If not explicitly set:
- Identity is resolved from the HOSTNAME environment variable, then the pod name, falling back to a random UUID.
- Lease namespace defaults to the namespace the operator pod is running in.
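The identity fallback order described above can be sketched as a simple chain. This is an illustration under assumptions, not JOSDK's actual code: `IdentityFallback` is a hypothetical class, and the pod name is passed in as a plain argument because this page does not say how JOSDK obtains it.

```java
import java.util.Map;
import java.util.UUID;

// Hypothetical helper (not a JOSDK class) mirroring the documented order:
// HOSTNAME environment variable, then pod name, then a random UUID.
final class IdentityFallback {

    private IdentityFallback() {}

    static String resolve(Map<String, String> env, String podName) {
        String hostname = env.get("HOSTNAME");
        if (hostname != null && !hostname.isBlank()) {
            return hostname;
        }
        if (podName != null && !podName.isBlank()) {
            return podName;
        }
        // Last resort: unique per process, but different after every restart.
        return UUID.randomUUID().toString();
    }
}
```

Calling `IdentityFallback.resolve(System.getenv(), null)` inside a pod would typically return the value of HOSTNAME. Setting the identity explicitly, as in the builder example, avoids depending on this inference.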
RBAC Requirements
The operator’s service account needs permissions to manage Lease objects:
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["create", "update", "get"]
JOSDK checks for these permissions at startup and throws a clear error if they are missing.
Sample E2E Test
A complete working example is available in the
leader-election sample operator,
including multi-replica deployment manifests and an E2E test that verifies failover behavior.