No description
  • Python 95.8%
  • HTML 3.9%
  • Dockerfile 0.3%
Find a file
Fill84 2ec314069d Add multi-user auth implementation plan
19-task TDD plan covering: dependencies, config, database layer,
user models, repository CRUD, service logic, auth middleware,
auth/user/admin routes, app wiring, pipeline config injection,
CLI commands, web templates, Docker updates, and integration tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 02:59:24 +02:00
docs/superpowers Add multi-user auth implementation plan 2026-04-14 02:59:24 +02:00
profiles Initial implementation of LLM Translation Layer (LTL) 2026-04-12 05:41:55 +02:00
src/ltl Refactor language instruction handling; consolidate language response guidelines into TOOL_CALLING_INSTRUCTIONS for clarity and consistency. 2026-04-14 00:59:15 +02:00
tests Refactor Dockerfile and docker-compose for configuration updates; enhance config loading with environment variable support; improve tool call extraction and validation logic; update tool calling instructions in templates. 2026-04-12 14:03:25 +02:00
.gitignore Initial implementation of LLM Translation Layer (LTL) 2026-04-12 05:41:55 +02:00
config.example.yaml Initial implementation of LLM Translation Layer (LTL) 2026-04-12 05:41:55 +02:00
config.yaml Enhance model management and configuration endpoints; add functionality to get and set active model, update model mapping, and improve dashboard display for active model selection. 2026-04-13 02:03:19 +02:00
docker-compose.yaml Add configuration file and update docker-compose for Ollama integration; enhance context management and logging settings; improve payload handling in generate.py and respond.py; streamline prompt transformation in transform.py. 2026-04-13 00:58:34 +02:00
Dockerfile Refactor Dockerfile and docker-compose for configuration updates; enhance config loading with environment variable support; improve tool call extraction and validation logic; update tool calling instructions in templates. 2026-04-12 14:03:25 +02:00
pyproject.toml Initial implementation of LLM Translation Layer (LTL) 2026-04-12 05:41:55 +02:00
README.md Initial implementation of LLM Translation Layer (LTL) 2026-04-12 05:41:55 +02:00

LLM Translation Layer (LTL)

An Anthropic API proxy for Ollama that makes local models work with Claude Code and other Anthropic SDK clients. Focuses on reliable tool calling — file reads, writes, edits, and bash execution.

How it works

Claude Code  →  LTL Proxy  →  Ollama  →  Local Model (qwen, llama, etc.)
             ←             ←          ←

LTL receives Anthropic API requests, translates them to OpenAI format, sends them to Ollama, validates tool calls, repairs malformed ones, and translates responses back. Your client thinks it's talking to Anthropic.

Quick Start

pip install

pip install llm-translation-layer

# Start the proxy
ltl serve

# In another terminal, point Claude Code at it
export ANTHROPIC_BASE_URL=http://localhost:8080

Docker

docker-compose up

From source

git clone <repo>
cd llm-translation-layer
pip install -e .
ltl serve

Configuration

Copy config.example.yaml to ~/.ltl/config.yaml:

server:
  host: 0.0.0.0
  port: 8080

ollama:
  base_url: http://localhost:11434

model_mapping:
  claude-sonnet-4-20250514: qwen2.5-coder:32b
  claude-opus-4-0-20250514: qwen3-coder-next
  default: qwen2.5-coder:latest

context_management:
  strategy: adaptive

validation:
  max_retries: 3

Features

  • Full Anthropic Messages API/v1/messages with streaming support
  • Auto model discovery — scans Ollama for available models on startup
  • Smart tool calling — model-specific prompts, validation, and auto-repair
  • Context management — adaptive truncation for models with smaller context windows
  • Model profiles — built-in optimization for Qwen, Llama, Mistral, DeepSeek, Gemma, Phi
  • User overrides — customize profiles and config per model
  • Dashboard — web UI at /dashboard for monitoring

CLI

ltl serve                    # Start the proxy server
ltl serve --port 9000        # Custom port
ltl models                   # List Ollama models
ltl config                   # Show current config

API Endpoints

Endpoint Description
POST /v1/messages Anthropic Messages API (main endpoint)
POST /v1/messages/count_tokens Token counting
GET /v1/models List available models
GET /v1/models/{id} Model details
GET /health Health check
GET /dashboard Web dashboard
POST /v1/reload Re-scan models

Model Profiles

LTL ships with profiles for common model families. Each profile tunes:

  • System prompt templates for tool calling
  • Tool call format (native vs prompted)
  • Few-shot examples
  • Retry strategies
  • Parallel call handling

Add custom profiles in ~/.ltl/profiles/:

# ~/.ltl/profiles/my-model.yaml
family: my-model
tool_format: structured_json
supports_function_calling: native
few_shot_examples: 2
workarounds:
  multi_tool_parallel: serialize

Architecture

The request pipeline:

  1. Ingest — Parse Anthropic request, resolve model
  2. Transform — Convert to OpenAI format, inject tool prompts
  3. Context — Trim to fit model's context window
  4. Generate — Send to Ollama
  5. Validate — Check tool calls, auto-repair if malformed
  6. Respond — Convert back to Anthropic format

Requirements

  • Python 3.11+
  • Ollama running locally (or accessible via network)
  • At least one model pulled in Ollama