Skip to main content

2 posts tagged with "llm translation"

View All Tags

Krrish Dholakia
Ishaan Jaffer

v1.63.0 fixes Anthropic 'thinking' response on streaming to return the signature block. Github Issue

It also moves the response structure from signature_delta to signature to be the same as Anthropic. Anthropic Docs

Diff​

"message": {
...
"reasoning_content": "The capital of France is Paris.",
"thinking_blocks": [
{
"type": "thinking",
"thinking": "The capital of France is Paris.",
- "signature_delta": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 OLD FORMAT
+ "signature": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 KEY CHANGE
}
]
}

Krrish Dholakia
Ishaan Jaffer
info

v1.61.20-stable will be live on 2025-02-04.

These are the changes since v1.61.13-stable.

This release is primarily focused on:

  • LLM Translation improvements (claude-3-7-sonnet + 'thinking'/'reasoning_content' support)
  • UI improvements (add model flow, user management, etc)

Demo Instance​

Here's a Demo Instance to test changes:

New Models / Updated Models​

  1. Anthropic 3-7 sonnet support + cost tracking (Anthropic API + Bedrock + Vertex AI + OpenRouter)
    1. Anthropic API Start here
    2. Bedrock API Start here
    3. Vertex AI API See here
    4. OpenRouter See here
  2. Gpt-4.5-preview support + cost tracking See here
  3. Azure AI - Phi-4 cost tracking See here
  4. Claude-3.5-sonnet - vision support updated on Anthropic API See here
  5. Bedrock llama vision support See here
  6. Cerebras llama3.3-70b pricing See here

LLM Translation​

  1. Infinity Rerank - support returning documents when return_documents=True Start here
  2. Amazon Deepseek - <think> param extraction into ‘reasoning_content’ Start here
  3. Amazon Titan Embeddings - filter out ‘aws_’ params from request body Start here
  4. Anthropic ‘thinking’ + ‘reasoning_content’ translation support (Anthropic API, Bedrock, Vertex AI) Start here
  5. VLLM - support ‘video_url’ Start here
  6. Call proxy via litellm SDK: Support litellm_proxy/ for embedding, image_generation, transcription, speech, rerank Start here
  7. OpenAI Pass-through - allow using Assistants GET, DELETE on /openai pass through routes Start here
  8. Message Translation - fix openai message for assistant msg if role is missing - openai allows this
  9. O1/O3 - support ‘drop_params’ for o3-mini and o1 parallel_tool_calls param (not supported currently) See here

Spend Tracking Improvements​

  1. Cost tracking for rerank via Bedrock See PR
  2. Anthropic pass-through - fix race condition causing cost to not be tracked See PR
  3. Anthropic pass-through: Ensure accurate token counting See PR

Management Endpoints / UI​

  1. Models Page - Allow sorting models by ‘created at’
  2. Models Page - Edit Model Flow Improvements
  3. Models Page - Fix Adding Azure, Azure AI Studio models on UI
  4. Internal Users Page - Allow Bulk Adding Internal Users on UI
  5. Internal Users Page - Allow sorting users by ‘created at’
  6. Virtual Keys Page - Allow searching for UserIDs on the dropdown when assigning a user to a team See PR
  7. Virtual Keys Page - allow creating a user when assigning keys to users See PR
  8. Model Hub Page - fix text overflow issue See PR
  9. Admin Settings Page - Allow adding MSFT SSO on UI
  10. Backend - don't allow creating duplicate internal users in DB

Helm​

  1. support ttlSecondsAfterFinished on the migration job - See PR
  2. enhance migrations job with additional configurable properties - See PR

Logging / Guardrail Integrations​

  1. Arize Phoenix support
  2. ‘No-log’ - fix ‘no-log’ param support on embedding calls

Performance / Loadbalancing / Reliability improvements​

  1. Single Deployment Cooldown logic - Use allowed_fails or allowed_fail_policy if set Start here

General Proxy Improvements​

  1. Hypercorn - fix reading / parsing request body
  2. Windows - fix running proxy in windows
  3. DD-Trace - fix dd-trace enablement on proxy

Complete Git Diff​

View the complete git diff here.