CircleCI managed plugins for AI Agents
There's a toplevel codex compatible marketplace meant to facilitate local testing in codex.
The actual plugin in plugins/circleci is compatible with existing codex marketplaces.
Skills include:
- general CircleCI build debugging
- usage of the chunk cli and cloud based modes
- circleci cli usage
- circleci config management and optimization
Run from the repository root:
evals/circleci/scripts/run_routing_evals_ci.shThis runs:
- SKILL frontmatter checks for
plugins/circleci/skills/*/SKILL.md quick_validate.pychecks (when available)- routing eval cases from
evals/circleci/cases/skill-routing-cases.json
Routing case purpose (evals/circleci/cases/skill-routing-cases.json):
circleci-buildscases: ensure failed-build, flaky, and root-cause prompts route tocircleci-builds(explicit + implicit).circleci-clicases: ensure CLI/auth/rerun/command-line prompts route tocircleci-cli(explicit + implicit).circleci-configcases: ensure.circleci/config.yml, caching, workspace, and runtime optimization prompts route tocircleci-config(explicit + implicit).chunkcases: ensure Chunk setup andchunk-cliprompts route tochunk(explicit + implicit).- negative-control cases: ensure non-CircleCI prompts route to
null.
evals/circleci/scripts/run_invocation_smoke_evals_local.shDefault local mode is non-strict and still writes artifacts even if preflight or a case fails.
Strict mode (non-zero exit on failures, local-only):
STRICT=1 evals/circleci/scripts/run_invocation_smoke_evals_local.shArtifacts are written to:
evals/circleci/artifacts/invocation-smoke/latest/report.json- per-case JSONL/stderr files under
evals/circleci/artifacts/invocation-smoke/latest/<case-id>/
Invocation smoke case purpose (evals/circleci/cases/skill-invocation-smoke-cases.json):
builds-explicit-smoke: validate explicit$circleci-buildsprompt selectscircleci-builds.chunk-explicit-smoke: validate explicit$chunkprompt selectschunk.cli-implicit-smoke: validate CLI intent prompt selectscircleci-cliwithout explicit skill mention.negative-control-smoke: validate unrelated prompt reportsnone.
evals/circleci/scripts/run_trace_capture_evals_local.shTrace case purpose (evals/circleci/cases/trace-cases.json):
- validate codex capture preflight and JSONL artifact generation for explicit and implicit prompts.
- validate grading behavior for expected skill mention and negative controls.
If codex preflight fails with a network error, verify that codex exec can reach OpenAI endpoints from your environment.