Installation

This page covers the Python, uv, backend, and provider credential requirements for running aec-bench locally, plus the optional Prime setup.

System Requirements

Python: 3.13 or later
Package manager: uv
OS: macOS, Linux, or Windows through WSL
Memory: 4 GB RAM minimum, 8 GB recommended for larger generated suites

Docker, Modal, Prime, and hosted Harbor execution are optional. You only need those services when you choose the matching execution or training path.
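
The Python and uv requirements above can be checked before cloning anything. The following is a minimal sketch, not part of aec-bench: the helper names `python_ok` and `uv_path` are made up for illustration, and the script only reports what it finds.

```python
import shutil
import sys

# Minimum interpreter version, taken from the System Requirements list.
MIN_PYTHON = (3, 13)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when (major, minor) meets the minimum version."""
    return tuple(version_info[:2]) >= minimum


def uv_path():
    """Return the path of the uv executable, or None if it is not on PATH."""
    return shutil.which("uv")


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    status = "ok" if python_ok() else "too old, 3.13 or later required"
    print(f"Python {current}: {status}")
    print(f"uv: {uv_path() or 'not found on PATH'}")
```

The script does not check memory or the optional services, since those depend on which execution path you choose.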

From Source

Use the source checkout when developing templates, tasks, or library code:

git clone https://github.com/TheodoreGalanos/aec-bench.git
cd aec-bench
uv sync --extra webui --dev
uv run aec-bench --version

The console entry point is aec-bench. In a source checkout, run it through uv run so the interpreter, dependencies, and local package code stay aligned.

Optional Extras

Web UI support is optional:

uv sync --extra webui --dev
uv run aec-bench web

Prime Lab support is optional:

uv sync --extra prime --dev
uv run aec-bench prime doctor

Provider integrations for direct Python agent calls may require the pydantic-ai extra:

uv sync --extra pydantic-ai --dev

Provider Credentials

Set credentials for the provider used by your model or endpoint alias:

Provider	Environment variables
Anthropic	`ANTHROPIC_API_KEY`
OpenAI	`OPENAI_API_KEY`
Azure OpenAI or Azure AI Foundry v1	`AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`; optional `AZURE_OPENAI_API_VERSION`
Together AI	`TOGETHER_API_KEY`
AWS Bedrock	`AWS_BEARER_TOKEN_BEDROCK`, plus either `AWS_REGION` or `AWS_DEFAULT_REGION`
Prime hosted eval/training	authenticated `prime` CLI session
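
To see which providers in the table are ready to use, you can check the environment directly. This is a hedged sketch rather than anything aec-bench ships: the provider keys (`"anthropic"`, `"bedrock"`, and so on) and the function `missing_credentials` are hypothetical labels, while the variable names are copied from the table. Prime is left out because it relies on an authenticated CLI session rather than environment variables.

```python
import os

# Required variables per provider, copied from the table above.
# Each inner tuple is a group of alternatives: any one name in it is enough.
# The dictionary keys are illustrative labels, not aec-bench identifiers.
REQUIRED = {
    "anthropic": [("ANTHROPIC_API_KEY",)],
    "openai": [("OPENAI_API_KEY",)],
    "azure": [("AZURE_OPENAI_API_KEY",), ("AZURE_OPENAI_ENDPOINT",)],
    "together": [("TOGETHER_API_KEY",)],
    "bedrock": [("AWS_BEARER_TOKEN_BEDROCK",), ("AWS_REGION", "AWS_DEFAULT_REGION")],
}


def missing_credentials(provider, env=None):
    """Return the variable groups that are unset or empty for a provider."""
    env = os.environ if env is None else env
    missing = []
    for group in REQUIRED[provider]:
        if not any(env.get(name) for name in group):
            missing.append(" or ".join(group))
    return missing


if __name__ == "__main__":
    for provider in REQUIRED:
        missing = missing_credentials(provider)
        status = "ready" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```

The optional `AZURE_OPENAI_API_VERSION` is deliberately not treated as required.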

When you run the CLI from a project checkout, aec-bench loads .env at startup. Existing shell variables take precedence.
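
The precedence rule can be pictured as "set only if not already set". The sketch below illustrates that behaviour and is not the loader aec-bench actually uses: `parse_env_lines` and `apply_env` are hypothetical helpers, and the parser handles only plain KEY=VALUE lines with optional surrounding quotes.

```python
import os


def parse_env_lines(lines):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop one layer of surrounding double or single quotes, if present.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_env(values, env=None):
    """Copy values into env, leaving any variable that already exists untouched."""
    env = os.environ if env is None else env
    for key, value in values.items():
        env.setdefault(key, value)
    return env


if __name__ == "__main__":
    # An in-memory stand-in for the shell environment and a .env file.
    shell = {"OPENAI_API_KEY": "from-shell"}
    dotenv = ["# local credentials", "OPENAI_API_KEY=from-file", "TOGETHER_API_KEY=abc"]
    apply_env(parse_env_lines(dotenv), shell)
    print(shell)
```

In the usage example the shell value for `OPENAI_API_KEY` survives, while `TOGETHER_API_KEY` is filled in from the file.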

For Azure AI Foundry deployments that expose the v1 OpenAI-compatible API, use the /openai/v1/ endpoint and pass the deployment name as --model. For Together AI, use an explicit together: model prefix.
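
If you assemble these values in a script, two small normalisers can help avoid a missing path or a doubled prefix. This is an illustrative sketch under stated assumptions: `azure_v1_endpoint` and `with_together_prefix` are hypothetical helpers, the URL and model name in the usage example are placeholders, and nothing here reflects how aec-bench parses its arguments internally.

```python
AZURE_V1_SUFFIX = "/openai/v1"
TOGETHER_PREFIX = "together:"


def azure_v1_endpoint(base_url):
    """Return base_url ending in exactly one /openai/v1/ path."""
    trimmed = base_url.rstrip("/")
    if not trimmed.endswith(AZURE_V1_SUFFIX):
        trimmed += AZURE_V1_SUFFIX
    return trimmed + "/"


def with_together_prefix(model):
    """Return model with the together: prefix, adding it only when absent."""
    return model if model.startswith(TOGETHER_PREFIX) else TOGETHER_PREFIX + model


if __name__ == "__main__":
    # Placeholder values; substitute your own resource URL and model name.
    print(azure_v1_endpoint("https://example.invalid"))
    print(with_together_prefix("some-org/some-model"))
```

Both helpers are idempotent, so applying them to a value that is already in the right shape leaves it unchanged.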

Verify Installation

Start with CLI help and commands that do not call a provider:

uv run aec-bench --help
uv run aec-bench generate list-templates --discipline ground
uv run aec-bench library export --stdout --pretty
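
The same three commands can be run as a scripted smoke test, for example in a setup script. The sketch below is an assumption-laden wrapper, not an aec-bench feature: `run_check` is a hypothetical helper that treats a zero exit code as success, and the 120 second timeout is an arbitrary choice.

```python
import subprocess
import sys

# The non-provider commands listed above, split into argument lists.
CHECKS = [
    ["uv", "run", "aec-bench", "--help"],
    ["uv", "run", "aec-bench", "generate", "list-templates", "--discipline", "ground"],
    ["uv", "run", "aec-bench", "library", "export", "--stdout", "--pretty"],
]


def run_check(command, timeout=120):
    """Run one command and return (ok, detail) without raising on failure."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return False, f"executable not found: {command[0]}"
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    return result.returncode == 0, f"exit code {result.returncode}"


if __name__ == "__main__":
    for command in CHECKS:
        ok, detail = run_check(command)
        stream = sys.stdout if ok else sys.stderr
        print("PASS" if ok else "FAIL", " ".join(command), f"({detail})", file=stream)
```

Run it from the root of the source checkout so that uv run resolves the project environment.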

Then run a provider-backed command once credentials are present:

uv run aec-bench run-local tasks/electrical/voltage-drop \
  --model gpt-4.1-mini \
  --harness direct

Project Layout

The source checkout uses a src/ Python layout:

src/aec_bench/          # Library source
tasks/                  # Benchmark task seeds and generated instances
seeds/                  # Expert-created seed files
agents/                 # Ready-to-use agent implementations
artefacts/              # Local generated artefacts and catalogue exports
docs/                   # Architecture and library guides
workspaces/             # Evolution workspaces

Generated artefacts, local runs, Prime packages, and evolution swarm state are intentionally local outputs. Commit curated tasks, templates, source, tests, and docs rather than transient run artefacts.

Next Steps

Quickstart — Generate and run a first task
Templates — Understand the built-in template catalogue
CLI Reference — See the current command surface
Prime Lab — Export tasks for Prime eval and training