Open source · MIT licensed · v0.1.0

The pytest for LLMs

Test your AI outputs like you test your code.

OpenAI
Claude
Ollama
DeepSeek
Gemini
Mistral
test_capitals.py
from llmtest import expect, llm_test

@llm_test(
    expect.contains("Paris"),
    expect.latency_under(2000),
    expect.cost_under(0.001),
    model="claude-sonnet-4-20250514",
)
def test_capital(llm):
    output = llm("What is the capital of France?")
    assert "Paris" in output.content

Everything you need to test LLMs

No LLM judge. No YAML configs. Just pytest.

Zero LLM Calls

Most assertions are deterministic and instant. No paying an LLM to judge your output.
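To illustrate why these checks need no LLM: assertions like contains, regex, and JSON validity are plain string and parser logic. A minimal stdlib sketch of the idea (the helper names here are illustrative, not llmtest's internals):

```python
import json
import re

def check_contains(output: str, needle: str) -> bool:
    # Plain substring check: no model call, runs in microseconds.
    return needle in output

def check_valid_json(output: str) -> bool:
    # Deterministic: either the text parses as JSON or it does not.
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

def check_matches_regex(output: str, pattern: str) -> bool:
    # Regex search over the raw output text.
    return re.search(pattern, output) is not None
```

Because every check is a pure function of the output string, results are reproducible and free.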

Built on Pydantic

All models use BaseModel — auto-validation, JSON serialization, schema generation.

22+ Assertions

Text, performance, agent, and composable assertions: contains, regex, JSON, cost, latency, tool calls.

Multi-Provider

OpenAI, Anthropic, Ollama out of the box. Install only what you need.

Agent Testing

Tool call validation, loop detection, call ordering. Test your AI agents properly.

Retry Support

Built-in retry at decorator and fixture level. Handle non-deterministic outputs.
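The retry idea can be sketched in plain Python. This loop only illustrates the concept of re-running a flaky check against non-deterministic output; llmtest's actual decorator and fixture API may differ:

```python
def run_with_retries(test_fn, retries: int = 3) -> int:
    """Re-run a flaky check up to `retries` times; pass if any attempt passes.

    Returns the number of attempts used. Purely illustrative,
    not llmtest's real implementation.
    """
    last_error = None
    for attempt in range(retries):
        try:
            test_fn()
            return attempt + 1
        except AssertionError as err:
            last_error = err
    raise last_error
```

A test that fails twice and then passes would succeed with `retries=3` but fail with `retries=2`.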

22+ built-in assertions

All deterministic. All instant. No LLM calls needed for most checks.

Text: contains, regex, JSON, length, similarity, structured output
Performance: latency, cost, token count
Agent: tool calls, loop detection, call ordering
Composable: AND, OR, custom logic with & and | operators
# Text
expect.contains("Paris")
expect.matches_regex(r"\d+")
expect.valid_json()
expect.structured_output(MyModel)
# Performance
expect.latency_under(2000)
expect.cost_under(0.01)
# Agent
expect.tool_called("search")
expect.no_loop()
# Composable
expect.contains("A") & expect.not_contains("B")
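The `&` and `|` composition shown above can be sketched with Python operator overloading. This miniature `Assertion` class is an illustration of the pattern, not llmtest's internals:

```python
class Assertion:
    """Wraps a predicate over the model output; & and | compose predicates."""

    def __init__(self, check):
        self.check = check

    def __call__(self, output: str) -> bool:
        return self.check(output)

    def __and__(self, other):
        # Both assertions must pass.
        return Assertion(lambda out: self(out) and other(out))

    def __or__(self, other):
        # At least one assertion must pass.
        return Assertion(lambda out: self(out) or other(out))

def contains(needle: str) -> Assertion:
    return Assertion(lambda out: needle in out)

def not_contains(needle: str) -> Assertion:
    return Assertion(lambda out: needle not in out)

# expect.contains("A") & expect.not_contains("B"), in miniature:
combined = contains("Paris") & not_contains("London")
```

Because composed assertions are themselves assertions, arbitrary trees of checks can be built and evaluated with one call.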

Start testing your LLMs today

Open source. MIT licensed. Built for developers.