Installation
This page covers the Python, uv, backend, and provider credential requirements for running aec-bench locally, plus the optional Prime setup.
System Requirements
- Python: 3.13 or later
- Package manager: uv
- OS: macOS, Linux, or Windows through WSL
- Memory: 4 GB RAM minimum, 8 GB recommended for larger generated suites
Docker, Modal, Prime, and hosted Harbor execution are optional. You only need those services when you choose the matching execution or training path.
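The two hard requirements above (Python 3.13 or later, and uv on PATH) can be checked before cloning. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper name and is not part of aec-bench.

```python
import shutil
import sys

MIN_PYTHON = (3, 13)  # taken from the requirements list above


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration, not an aec-bench API.
    """
    version = tuple(version_info or sys.version_info)[:2]
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found, "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]} or later required"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("uv") is None:
        problems.append("uv not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"missing prerequisite: {problem}")
```

The `version_info` and `which` parameters exist only so the check can be exercised without changing the interpreter or PATH.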
From Source
Use the source checkout when developing templates, tasks, or library code:
git clone https://github.com/TheodoreGalanos/aec-bench.git
cd aec-bench
uv sync --extra webui --dev
uv run aec-bench --version
The console entry point is aec-bench. In a source checkout, run it through uv run so Python, dependencies, and local package code stay aligned.
Optional Extras
Web UI support is optional:
uv sync --extra webui --dev
uv run aec-bench web
Prime Lab support is optional:
uv sync --extra prime --dev
uv run aec-bench prime doctor
Provider integrations for direct Python agent calls may require the pydantic-ai extra:
uv sync --extra pydantic-ai --dev
Provider Credentials
Set credentials for the provider used by your model or endpoint alias:
| Provider | Environment variables |
|---|---|
| Anthropic | ANTHROPIC_API_KEY |
| OpenAI | OPENAI_API_KEY |
| Azure OpenAI or Azure AI Foundry v1 | AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT; optional AZURE_OPENAI_API_VERSION |
| Together AI | TOGETHER_API_KEY |
| AWS Bedrock | AWS_BEARER_TOKEN_BEDROCK, AWS_REGION or AWS_DEFAULT_REGION |
| Prime hosted eval/training | authenticated prime CLI session |
When you run the CLI from a project checkout, aec-bench loads .env at startup. Existing shell variables take precedence.
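The precedence rule can be illustrated with a small standard-library sketch. This is an assumption about the behaviour described, not aec-bench's actual loader: `load_env_file` is a hypothetical function, the parsing handles only simple KEY=VALUE lines, and the values are placeholders.

```python
import os


def load_env_file(text, environ=None):
    """Apply KEY=VALUE lines without overwriting variables that are already set.

    Hypothetical illustration of "shell wins over .env"; not the aec-bench loader.
    """
    environ = os.environ if environ is None else environ
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault keeps a value exported in the shell and only fills gaps.
        environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))
    return environ


shell_env = {"OPENAI_API_KEY": "from-shell"}
dotenv_text = 'OPENAI_API_KEY=from-dotenv\nTOGETHER_API_KEY="from-dotenv"\n'
merged = load_env_file(dotenv_text, shell_env)
print(merged["OPENAI_API_KEY"])    # from-shell
print(merged["TOGETHER_API_KEY"])  # from-dotenv
```

In practice this means a key exported in the current shell overrides the same key in .env for that session, which is convenient for trying a different account without editing the file.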
For Azure AI Foundry deployments that expose the v1 OpenAI-compatible API, use the /openai/v1/ endpoint and pass the deployment name as --model. For Together AI, use an explicit together: model prefix.
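The credentials table can be turned into a pre-flight check that reports which variables are still unset before a provider-backed run. The sketch below makes two assumptions: the provider keys (`"anthropic"`, `"bedrock"`, and so on) are hypothetical labels chosen for this example, and the Bedrock row is read as "bearer token plus either region variable". Optional variables and the Prime CLI session are not checked.

```python
import os

# Required variables per provider, copied from the table above.
# Each inner tuple lists alternatives; any one of them satisfies the requirement.
# The dictionary keys are hypothetical labels, not aec-bench identifiers.
REQUIRED = {
    "anthropic": [("ANTHROPIC_API_KEY",)],
    "openai": [("OPENAI_API_KEY",)],
    "azure": [("AZURE_OPENAI_API_KEY",), ("AZURE_OPENAI_ENDPOINT",)],
    "together": [("TOGETHER_API_KEY",)],
    "bedrock": [("AWS_BEARER_TOKEN_BEDROCK",), ("AWS_REGION", "AWS_DEFAULT_REGION")],
}


def missing_credentials(provider, environ=None):
    """Return the unmet requirements for a provider; empty means all are set."""
    environ = os.environ if environ is None else environ
    missing = []
    for alternatives in REQUIRED[provider]:
        # A requirement is met when any alternative has a non-empty value.
        if not any(environ.get(name) for name in alternatives):
            missing.append(" or ".join(alternatives))
    return missing


if __name__ == "__main__":
    for provider in REQUIRED:
        gaps = missing_credentials(provider)
        status = "ready" if not gaps else "missing " + ", ".join(gaps)
        print(f"{provider}: {status}")
```

Because the check reads `os.environ`, it reflects only variables exported in the shell; values that exist solely in .env would need to be loaded first.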
Verify Installation
Start with the CLI help and commands that need no provider credentials:
uv run aec-bench --help
uv run aec-bench generate list-templates --discipline ground
uv run aec-bench library export --stdout --pretty
Then run a provider-backed command once credentials are present:
uv run aec-bench run-local tasks/electrical/voltage-drop \
--model gpt-4.1-mini \
--harness direct
Project Layout
The source checkout uses a src/ Python layout:
src/aec_bench/ # Library source
tasks/ # Benchmark task seeds and generated instances
seeds/ # Expert-created seed files
agents/ # Ready-to-use agent implementations
artefacts/ # Local generated artefacts and catalogue exports
docs/ # Architecture and library guides
workspaces/ # Evolution workspaces
Generated artefacts, local runs, Prime packages, and evolution swarm state are intentionally local outputs. Commit curated tasks, templates, source, tests, and docs rather than transient run artefacts.
Next Steps
- Quickstart — Generate and run a first task
- Templates — Understand the built-in template catalogue
- CLI Reference — See the current command surface
- Prime Lab — Export tasks for Prime eval and training