No description
- Python 95.8%
- HTML 3.9%
- Dockerfile 0.3%
19-task TDD plan covering: dependencies, config, database layer, user models, repository CRUD, service logic, auth middleware, auth/user/admin routes, app wiring, pipeline config injection, CLI commands, web templates, Docker updates, and integration tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| docs/superpowers | ||
| profiles | ||
| src/ltl | ||
| tests | ||
| .gitignore | ||
| config.example.yaml | ||
| config.yaml | ||
| docker-compose.yaml | ||
| Dockerfile | ||
| pyproject.toml | ||
| README.md | ||
LLM Translation Layer (LTL)
An Anthropic API proxy for Ollama that makes local models work with Claude Code and other Anthropic SDK clients. Focuses on reliable tool calling — file reads, writes, edits, and bash execution.
How it works
Claude Code → LTL Proxy → Ollama → Local Model (qwen, llama, etc.)
← ← ←
LTL receives Anthropic API requests, translates them to OpenAI format, sends them to Ollama, validates tool calls, repairs malformed ones, and translates responses back. Your client thinks it's talking to Anthropic.
Quick Start
pip install
pip install llm-translation-layer
# Start the proxy
ltl serve
# In another terminal, point Claude Code at it
export ANTHROPIC_BASE_URL=http://localhost:8080
Docker
docker-compose up
From source
git clone <repo>
cd llm-translation-layer
pip install -e .
ltl serve
Configuration
Copy config.example.yaml to ~/.ltl/config.yaml:
server:
host: 0.0.0.0
port: 8080
ollama:
base_url: http://localhost:11434
model_mapping:
claude-sonnet-4-20250514: qwen2.5-coder:32b
claude-opus-4-0-20250514: qwen3-coder-next
default: qwen2.5-coder:latest
context_management:
strategy: adaptive
validation:
max_retries: 3
Features
- Full Anthropic Messages API —
/v1/messageswith streaming support - Auto model discovery — scans Ollama for available models on startup
- Smart tool calling — model-specific prompts, validation, and auto-repair
- Context management — adaptive truncation for models with smaller context windows
- Model profiles — built-in optimization for Qwen, Llama, Mistral, DeepSeek, Gemma, Phi
- User overrides — customize profiles and config per model
- Dashboard — web UI at
/dashboardfor monitoring
CLI
ltl serve # Start the proxy server
ltl serve --port 9000 # Custom port
ltl models # List Ollama models
ltl config # Show current config
API Endpoints
| Endpoint | Description |
|---|---|
POST /v1/messages |
Anthropic Messages API (main endpoint) |
POST /v1/messages/count_tokens |
Token counting |
GET /v1/models |
List available models |
GET /v1/models/{id} |
Model details |
GET /health |
Health check |
GET /dashboard |
Web dashboard |
POST /v1/reload |
Re-scan models |
Model Profiles
LTL ships with profiles for common model families. Each profile tunes:
- System prompt templates for tool calling
- Tool call format (native vs prompted)
- Few-shot examples
- Retry strategies
- Parallel call handling
Add custom profiles in ~/.ltl/profiles/:
# ~/.ltl/profiles/my-model.yaml
family: my-model
tool_format: structured_json
supports_function_calling: native
few_shot_examples: 2
workarounds:
multi_tool_parallel: serialize
Architecture
The request pipeline:
- Ingest — Parse Anthropic request, resolve model
- Transform — Convert to OpenAI format, inject tool prompts
- Context — Trim to fit model's context window
- Generate — Send to Ollama
- Validate — Check tool calls, auto-repair if malformed
- Respond — Convert back to Anthropic format
Requirements
- Python 3.11+
- Ollama running locally (or accessible via network)
- At least one model pulled in Ollama