
Targets

Targets abstract the connection between evaluations and agent implementations. They allow the same EVAL.yaml to run against different agents, models, or environments.

.agentv/targets.yaml
targets:
  - name: default
    provider: anthropic
    model: claude-sonnet-4-20250514
    judge_target: default

  - name: powerful
    provider: anthropic
    model: claude-opus-4-20250514
    judge_target: default

  - name: fast
    provider: anthropic
    model: claude-3-5-haiku-20241022
    judge_target: powerful # Use better model for judging

  - name: azure
    provider: azure
    endpoint: ${{ AZURE_OPENAI_ENDPOINT }}
    api_key: ${{ AZURE_OPENAI_API_KEY }}
    deployment: ${{ AZURE_DEPLOYMENT_NAME }}
    judge_target: azure

Target properties:

Property       Type     Description
name           string   Unique target identifier
provider       string   Provider type
judge_target   string   Target for LLM judges (optional)
model          string   Model name (provider-specific)
endpoint       string   API endpoint (if applicable)
api_key        string   API key (use env vars)

Provider-specific examples:

- name: claude
  provider: anthropic
  model: claude-sonnet-4-20250514
  # Uses ANTHROPIC_API_KEY env var

- name: azure_prod
  provider: azure
  endpoint: ${{ AZURE_OPENAI_ENDPOINT }}
  api_key: ${{ AZURE_OPENAI_API_KEY }}
  deployment: gpt-4o
  api_version: "2024-02-15-preview"

- name: openai
  provider: openai
  model: gpt-4o
  # Uses OPENAI_API_KEY env var

Execute a local CLI command:

- name: local_agent
  provider: cli
  command: ["python", "./my_agent.py"]
  cwd: /path/to/agent

Test VS Code extensions:

- name: vscode_test
  provider: vscode
  workspace_template: ${{ WORKSPACE_PATH }}
  judge_target: azure

For testing without LLM calls:

- name: mock
  provider: mock
  responses:
    - pattern: "hello"
      response: "Hi there!"
    - pattern: ".*"
      response: "I don't understand."

Reference targets from an EVAL.yaml file:

EVAL.yaml

execution:
  target: default # Uses "default" from targets.yaml

evalcases:
  - id: complex-task
    execution:
      target: powerful # Override for this case

Use a different model for evaluation:

execution:
  target: fast # Agent uses fast model

evaluators:
  - name: quality
    type: llm_judge
    prompt: ./prompts/quality.md
    target: powerful # Judge uses powerful model

Endpoints and API keys are referenced with ${{ VAR }} syntax and resolved from the environment:

endpoint: ${{ AZURE_ENDPOINT }} # Required
api_key: ${{ AZURE_API_KEY }}   # From .env or environment

.env

AZURE_ENDPOINT=https://my-resource.openai.azure.com
AZURE_API_KEY=sk-...
AZURE_DEPLOYMENT=gpt-4o
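
Because values can come from either a .env file or the process environment, exporting them in the shell before running agentv also works (placeholder values shown):

export AZURE_ENDPOINT="https://my-resource.openai.azure.com"
export AZURE_API_KEY="sk-..."
export AZURE_DEPLOYMENT="gpt-4o"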

Targets are resolved in the following order:

1. Check the evalcase-level execution.target
2. Fall back to the file-level execution.target
3. Fall back to the default target
4. Error if no matching target is found

# Resolution order
evalcases:
  - id: case-1
    execution:
      target: special # 1. Uses "special"
  - id: case-2        # 2. Uses file-level "fast"

execution:
  target: fast # File-level default

Define separate targets for each environment:

targets.yaml

targets:
  - name: dev
    provider: anthropic
    model: claude-3-5-haiku-20241022

  - name: prod
    provider: anthropic
    model: claude-sonnet-4-20250514

Run with a specific target:

agentv eval ./EVAL.yaml --target prod

Compare model settings by defining variant targets:

targets:
  - name: variant_a
    provider: anthropic
    model: claude-sonnet-4-20250514
    config:
      temperature: 0.0

  - name: variant_b
    provider: anthropic
    model: claude-sonnet-4-20250514
    config:
      temperature: 0.3
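
Because targets are selected at run time with --target, the same EVAL.yaml can be run against each variant and the results compared:

agentv eval ./EVAL.yaml --target variant_a
agentv eval ./EVAL.yaml --target variant_b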

Use descriptive target names:

# Good
- name: production_claude
- name: development_haiku
- name: cost_optimized

# Avoid
- name: t1
- name: model

Never hardcode API keys; reference environment variables instead:

# Good: Use environment variables
api_key: ${{ AZURE_API_KEY }}

# Avoid: Hardcoded keys
api_key: sk-12345...
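
If secrets live in a local .env file, keep that file out of version control as well (a general git convention, not something agentv enforces):

# .gitignore
.env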

Pair a fast agent target with a stronger judge target:

- name: fast_agent
  provider: anthropic
  model: claude-3-5-haiku-20241022
  judge_target: powerful_judge # Better model for judging

- name: powerful_judge
  provider: anthropic
  model: claude-sonnet-4-20250514

Keep targets.yaml in version control with template values:

# targets.yaml (committed)
targets:
  - name: azure
    provider: azure
    endpoint: ${{ AZURE_ENDPOINT }}
    api_key: ${{ AZURE_API_KEY }}

Document the required environment variables for collaborators:

# .agentv/README.md
Required environment variables:
- AZURE_ENDPOINT: Azure OpenAI endpoint
- AZURE_API_KEY: Azure OpenAI API key
- AZURE_DEPLOYMENT: Deployment name
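
As a quick sanity check before running evals, a small shell loop (illustrative only, not part of agentv) can confirm the documented variables are set:

for var in AZURE_ENDPOINT AZURE_API_KEY AZURE_DEPLOYMENT; do
  [ -n "$(printenv "$var")" ] || echo "Missing: $var"
done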