Watch

No description

Python 95.8%
HTML 3.9%
Dockerfile 0.3%

Find a file

Repository files (latest commit first)
Filename	Latest commit message	Latest commit date
Fill84 2ec314069d Add multi-user auth implementation plan 19-task TDD plan covering: dependencies, config, database layer, user models, repository CRUD, service logic, auth middleware, auth/user/admin routes, app wiring, pipeline config injection, CLI commands, web templates, Docker updates, and integration tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>		2026-04-14 02:59:24 +02:00
docs/superpowers	Add multi-user auth implementation plan	2026-04-14 02:59:24 +02:00
profiles	Initial implementation of LLM Translation Layer (LTL)	2026-04-12 05:41:55 +02:00
src/ltl	Refactor language instruction handling; consolidate language response guidelines into TOOL_CALLING_INSTRUCTIONS for clarity and consistency.	2026-04-14 00:59:15 +02:00
tests	Refactor Dockerfile and docker-compose for configuration updates; enhance config loading with environment variable support; improve tool call extraction and validation logic; update tool calling instructions in templates.	2026-04-12 14:03:25 +02:00
.gitignore	Initial implementation of LLM Translation Layer (LTL)	2026-04-12 05:41:55 +02:00
config.example.yaml	Initial implementation of LLM Translation Layer (LTL)	2026-04-12 05:41:55 +02:00
config.yaml	Enhance model management and configuration endpoints; add functionality to get and set active model, update model mapping, and improve dashboard display for active model selection.	2026-04-13 02:03:19 +02:00
docker-compose.yaml	Add configuration file and update docker-compose for Ollama integration; enhance context management and logging settings; improve payload handling in generate.py and respond.py; streamline prompt transformation in transform.py.	2026-04-13 00:58:34 +02:00
Dockerfile	Refactor Dockerfile and docker-compose for configuration updates; enhance config loading with environment variable support; improve tool call extraction and validation logic; update tool calling instructions in templates.	2026-04-12 14:03:25 +02:00
pyproject.toml	Initial implementation of LLM Translation Layer (LTL)	2026-04-12 05:41:55 +02:00
README.md	Initial implementation of LLM Translation Layer (LTL)	2026-04-12 05:41:55 +02:00

README.md

LLM Translation Layer (LTL)

An Anthropic API proxy for Ollama that makes local models work with Claude Code and other Anthropic SDK clients. Focuses on reliable tool calling — file reads, writes, edits, and bash execution.

How it works

Claude Code  →  LTL Proxy  →  Ollama  →  Local Model (qwen, llama, etc.)
             ←             ←          ←

LTL receives Anthropic API requests, translates them to OpenAI format, sends them to Ollama, validates tool calls, repairs malformed ones, and translates responses back. Your client thinks it's talking to Anthropic.

Quick Start

pip install

pip install llm-translation-layer

# Start the proxy
ltl serve

# In another terminal, point Claude Code at it
export ANTHROPIC_BASE_URL=http://localhost:8080

Docker

docker-compose up

From source

git clone <repo>
cd llm-translation-layer
pip install -e .
ltl serve

Configuration

Copy config.example.yaml to ~/.ltl/config.yaml:

server:
  host: 0.0.0.0
  port: 8080

ollama:
  base_url: http://localhost:11434

model_mapping:
  claude-sonnet-4-20250514: qwen2.5-coder:32b
  claude-opus-4-0-20250514: qwen3-coder-next
  default: qwen2.5-coder:latest

context_management:
  strategy: adaptive

validation:
  max_retries: 3

Features

Full Anthropic Messages API — /v1/messages with streaming support
Auto model discovery — scans Ollama for available models on startup
Smart tool calling — model-specific prompts, validation, and auto-repair
Context management — adaptive truncation for models with smaller context windows
Model profiles — built-in optimization for Qwen, Llama, Mistral, DeepSeek, Gemma, Phi
User overrides — customize profiles and config per model
Dashboard — web UI at /dashboard for monitoring

CLI

ltl serve                    # Start the proxy server
ltl serve --port 9000        # Custom port
ltl models                   # List Ollama models
ltl config                   # Show current config

API Endpoints

Endpoint	Description
`POST /v1/messages`	Anthropic Messages API (main endpoint)
`POST /v1/messages/count_tokens`	Token counting
`GET /v1/models`	List available models
`GET /v1/models/{id}`	Model details
`GET /health`	Health check
`GET /dashboard`	Web dashboard
`POST /v1/reload`	Re-scan models

Model Profiles

LTL ships with profiles for common model families. Each profile tunes:

System prompt templates for tool calling
Tool call format (native vs prompted)
Few-shot examples
Retry strategies
Parallel call handling

Add custom profiles in ~/.ltl/profiles/:

# ~/.ltl/profiles/my-model.yaml
family: my-model
tool_format: structured_json
supports_function_calling: native
few_shot_examples: 2
workarounds:
  multi_tool_parallel: serialize

Architecture

The request pipeline:

Ingest — Parse Anthropic request, resolve model
Transform — Convert to OpenAI format, inject tool prompts
Context — Trim to fit model's context window
Generate — Send to Ollama
Validate — Check tool calls, auto-repair if malformed
Respond — Convert back to Anthropic format

Requirements

Python 3.11+
Ollama running locally (or accessible via network)
At least one model pulled in Ollama