Ensuring Leader Election

Ensure that only one instance of the collector-manager pod is active in the cluster.

kubectl get pods -A -l collector-manager/leader=true

Example output:

NAME                                        READY   STATUS    RESTARTS   AGE
plerion-collector-manager-ccbc55c5d-dr27w   1/1     Running   0          10m
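
For scripted checks, the same query can be reduced to a single number. The sketch below is illustrative only: count_leaders is a hypothetical helper, not part of the collector-manager tooling, and it simply counts the non-empty lines it receives on standard input.

```shell
# count_leaders: hypothetical helper (not shipped with collector-manager).
# Counts the non-empty lines on stdin, i.e. one line per matching pod
# when kubectl is run with --no-headers.
count_leaders() {
  awk 'NF { n++ } END { print n + 0 }'
}

# Usage (same label selector as the command above):
#   kubectl get pods -A -l collector-manager/leader=true --no-headers | count_leaders
```

A result of 1 corresponds to the healthy state described here; anything greater than 1 means several pods carry the leader label.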

If the command returns more than one pod, follow these steps:

  1. Validate the configuration and ensure that the LEADER_ELECTION value is set to true in the ConfigMap.
  2. Ensure that only one of the instances is active and that every other instance is either waiting to acquire the lease or shutting down. Check the logs of each pod:
kubectl logs -f plerion-collector-manager-ccbc55c5d-dr27w
  • Example logs for a pod that is waiting to acquire the lease:
{"level":"info","ts":"2023-11-21T03:33:01Z","logger":"setup","msg":"initializing controller"}
{"level":"info","ts":"2023-11-21T03:33:01Z","logger":"setup","msg":"fetching tenant config"}
{"level":"info","ts":"2023-11-21T03:33:08Z","logger":"controller-runtime.metrics","msg":"Metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":"2023-11-21T03:33:08Z","logger":"setup","msg":"starting manager"}
{"level":"info","ts":"2023-11-21T03:33:08Z","logger":"setup","msg":"Starting server","kind":"health probe","addr":"[::]:8081"}
{"level":"info","ts":"2023-11-21T03:33:08Z","logger":"setup","msg":"starting server","path":"/metrics","kind":"metrics","addr":"[::]:8080"}
{"level":"info","ts":"2023-11-21T03:33:08Z","msg":"attempting to acquire leader lease plerion-system/ecaf1259.collector.plerion.com...\n"}
  • Example logs for a pod that currently holds the lease:
{"level":"info","ts":"2023-11-21T03:33:01Z","logger":"setup","msg":"initializing controller"}
{"level":"info","ts":"2023-11-21T03:33:01Z","logger":"setup","msg":"fetching tenant config"}
{"level":"info","ts":"2023-11-21T03:33:08Z","logger":"controller-runtime.metrics","msg":"Metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":"2023-11-21T03:33:08Z","logger":"setup","msg":"starting manager"}
{"level":"info","ts":"2023-11-21T03:33:08Z","logger":"setup","msg":"Starting server","kind":"health probe","addr":"[::]:8081"}
{"level":"info","ts":"2023-11-21T03:33:08Z","logger":"setup","msg":"starting server","path":"/metrics","kind":"metrics","addr":"[::]:8080"}
{"level":"info","ts":"2023-11-21T03:33:08Z","msg":"attempting to acquire leader lease plerion-system/ecaf1259.collector.plerion.com...\n"}
{"level":"info","ts":"2023-11-21T03:33:25Z","msg":"successfully acquired lease plerion-system/ecaf1259.collector.plerion.com\n"}
{"level":"info","ts":"2023-11-21T03:33:25Z","logger":"setup","msg":"Starting EventSource","controller":"resourcecollector","source":"<redacted>"}
{"level":"info","ts":"2023-11-21T03:33:25Z","logger":"setup","msg":"Starting Controller","controller":"resourcecollector"}
{"level":"info","ts":"2023-11-21T03:33:25Z","logger":"setup","msg":"Starting workers","controller":"resourcecollector","worker count":1}
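
The two log patterns above differ only in whether a "successfully acquired lease" message follows the "attempting to acquire leader lease" message. A minimal sketch of automating that comparison is shown below. lease_state is a hypothetical helper, not part of the collector-manager tooling, and it assumes the message text matches the example logs exactly.

```shell
# lease_state: hypothetical helper (not shipped with collector-manager).
# Reads controller logs on stdin and prints one of:
#   leader   - a "successfully acquired lease" message is present
#   waiting  - only "attempting to acquire leader lease" is present
#   unknown  - neither message was found
lease_state() {
  logs=$(cat)
  if printf '%s\n' "$logs" | grep -q '"msg":"successfully acquired lease'; then
    echo leader
  elif printf '%s\n' "$logs" | grep -q '"msg":"attempting to acquire leader lease'; then
    echo waiting
  else
    echo unknown
  fi
}

# Usage (pod name taken from the example output; add -n <namespace> if required):
#   kubectl logs plerion-collector-manager-ccbc55c5d-dr27w | lease_state
```

Running this against each collector-manager pod should report leader for exactly one of them. The check only reflects what is present in the retrieved logs, so a pod that has since lost the lease may still be reported as leader.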

If leader election still cannot be ensured in the current release, consider reinstalling the collector-manager by following the collector-manager uninstall and install guide.