Monitoring for Buildkite agents
Buildkite Prober - Monitoring for Buildkite agents
The AFNix Buildkite Prober checks the health of Buildkite agents - not only through whitebox metrics, but through a blackbox "probing" approach. At a regular interval, synthetic test workloads are injected as a test Buildkite pipeline which runs on an agent pool. The results of these test workloads are then used to determine the health of individual agents. This means that an agent that is technically up but unable to perform any work due to software or hardware issues is correctly reported as down.
See config.yml in the root directory for an example configuration file.
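The probing idea described above can be illustrated with a minimal sketch. This is not the prober's actual implementation: the `Agent` class, the `build_probe_steps` and `agent_health` functions, and the shape of the `results` mapping are all hypothetical names chosen for illustration. It assumes each agent carries a tag set that selects it uniquely, and that a probe job reports the state `"passed"` when it succeeds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical view of an agent: a name plus its targeting tags."""
    name: str
    tags: dict[str, str]  # e.g. {"queue": "default"}


def build_probe_steps(agents: list[Agent]) -> list[dict]:
    """Build one trivial command step per agent.

    Each step is pinned to its agent through the step's `agents`
    targeting rules (assumed here to match exactly one agent).
    """
    return [
        {
            "label": f"probe {agent.name}",
            "command": "true",  # synthetic workload: succeeds if the agent can run anything
            "agents": dict(agent.tags),
        }
        for agent in agents
    ]


def agent_health(agents: list[Agent], results: dict[str, str]) -> dict[str, bool]:
    """Map each agent name to True (healthy) or False (down).

    `results` maps agent name -> final job state (assumed format).
    An agent with no result at all, for example because its probe
    was never picked up, counts as down.
    """
    return {agent.name: results.get(agent.name) == "passed" for agent in agents}
```

Treating a missing result as a failure is what makes the check blackbox: an agent that is connected but never completes its probe is reported as down, exactly like one whose probe failed.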