

Krrish Dholakia
Ishaan Jaffer

v1.61.20-stable will be live on 2025-02-04.

These are the changes since v1.61.13-stable.

This release is primarily focused on:

  • LLM Translation improvements (claude-3-7-sonnet + 'thinking'/'reasoning_content' support)
  • UI improvements (add model flow, user management, etc.)

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  1. Anthropic claude-3-7-sonnet support + cost tracking (Anthropic API + Bedrock + Vertex AI + OpenRouter)
    1. Anthropic API Start here
    2. Bedrock API Start here
    3. Vertex AI API See here
    4. OpenRouter See here
  2. GPT-4.5-preview support + cost tracking See here
  3. Azure AI - Phi-4 cost tracking See here
  4. Claude-3.5-sonnet - vision support updated on Anthropic API See here
  5. Bedrock llama vision support See here
  6. Cerebras llama3.3-70b pricing See here
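Cost tracking for the models above boils down to multiplying token counts by per-token prices. A generic sketch of the idea (the price map below is an illustrative placeholder, not LiteLLM's actual cost map):

```python
# Illustrative per-token prices in USD (placeholders, not real pricing data).
PRICE_MAP = {
    "claude-3-7-sonnet": {"input": 3e-06, "output": 15e-06},
    "gpt-4.5-preview": {"input": 75e-06, "output": 150e-06},
}

def completion_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Compute the USD cost of one call from token counts and the price map."""
    prices = PRICE_MAP[model]
    return prompt_tokens * prices["input"] + completion_tokens * prices["output"]

print(round(completion_cost("claude-3-7-sonnet", 1000, 500), 6))  # -> 0.0105
```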

LLM Translation

  1. Infinity Rerank - support returning documents when return_documents=True Start here
  2. Amazon Deepseek - <think> param extraction into ‘reasoning_content’ Start here
  3. Amazon Titan Embeddings - filter out ‘aws_’ params from request body Start here
  4. Anthropic ‘thinking’ + ‘reasoning_content’ translation support (Anthropic API, Bedrock, Vertex AI) Start here
  5. VLLM - support ‘video_url’ Start here
  6. Call proxy via litellm SDK: Support litellm_proxy/ for embedding, image_generation, transcription, speech, rerank Start here
  7. OpenAI Pass-through - allow using Assistants GET, DELETE on /openai pass through routes Start here
  8. Message Translation - fix OpenAI message format for assistant messages with a missing role (OpenAI allows this)
  9. O1/O3 - support ‘drop_params’ for the parallel_tool_calls param on o3-mini and o1 (not currently supported) See here

Spend Tracking Improvements

  1. Cost tracking for rerank via Bedrock See PR
  2. Anthropic pass-through - fix race condition causing cost to not be tracked See PR
  3. Anthropic pass-through: Ensure accurate token counting See PR

Management Endpoints / UI

  1. Models Page - Allow sorting models by ‘created at’
  2. Models Page - Edit Model Flow Improvements
  3. Models Page - Fix Adding Azure, Azure AI Studio models on UI
  4. Internal Users Page - Allow Bulk Adding Internal Users on UI
  5. Internal Users Page - Allow sorting users by ‘created at’
  6. Virtual Keys Page - Allow searching for UserIDs on the dropdown when assigning a user to a team See PR
  7. Virtual Keys Page - allow creating a user when assigning keys to users See PR
  8. Model Hub Page - fix text overflow issue See PR
  9. Admin Settings Page - Allow adding MSFT SSO on UI
  10. Backend - don't allow creating duplicate internal users in DB
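Item 10's duplicate guard amounts to checking for an existing record before insert (a simplified sketch with an in-memory dict standing in for the DB; a real backend would also enforce this with a unique constraint):

```python
class DuplicateUserError(Exception):
    """Raised when an internal user with the same email already exists."""

def create_internal_user(db: dict, user_email: str, role: str = "internal_user") -> dict:
    """Insert a user keyed by email, rejecting duplicates."""
    if user_email in db:
        raise DuplicateUserError(f"user {user_email} already exists")
    db[user_email] = {"user_email": user_email, "role": role}
    return db[user_email]

users: dict = {}
create_internal_user(users, "a@example.com")
```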

Helm

  1. support ttlSecondsAfterFinished on the migration job - See PR
  2. enhance migrations job with additional configurable properties - See PR

Logging / Guardrail Integrations

  1. Arize Phoenix support
  2. ‘No-log’ - fix ‘no-log’ param support on embedding calls

Performance / Loadbalancing / Reliability improvements

  1. Single Deployment Cooldown logic - Use allowed_fails or allowed_fail_policy if set Start here

General Proxy Improvements

  1. Hypercorn - fix reading / parsing request body
  2. Windows - fix running the proxy on Windows
  3. DD-Trace - fix dd-trace enablement on proxy

Complete Git Diff

View the complete git diff here.


New / Updated Models

  1. Mistral large pricing - https://github.com/BerriAI/litellm/pull/7452
  2. Cohere command-r7b-12-2024 pricing - https://github.com/BerriAI/litellm/pull/7553/files
  3. Voyage - new models, prices and context window information - https://github.com/BerriAI/litellm/pull/7472
  4. Anthropic - bump Bedrock claude-3-5-haiku max_output_tokens to 8192

General Proxy Improvements

  1. Health check support for realtime models
  2. Support calling Azure realtime routes via virtual keys
  3. Support custom tokenizer on /utils/token_counter - useful when checking token count for self-hosted models
  4. Request Prioritization - support on /v1/completion endpoint as well
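Item 3 matters because the token count for a self-hosted model depends on its tokenizer, not on any default. The whitespace tokenizer below is a purely illustrative stand-in for whatever custom tokenizer you would supply; it is not the proxy's API:

```python
from typing import Callable

def count_tokens(text: str, tokenizer: Callable[[str], list[str]]) -> int:
    """Count tokens using a caller-supplied tokenizer."""
    return len(tokenizer(text))

def whitespace_tokenizer(text: str) -> list[str]:
    """Stand-in custom tokenizer for a self-hosted model."""
    return text.split()

print(count_tokens("hello world from litellm", whitespace_tokenizer))  # -> 4
```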

LLM Translation Improvements

  1. Deepgram STT support. Start Here
  2. OpenAI Moderations - omni-moderation-latest support. Start Here
  3. Azure O1 - fake streaming support. This ensures that if stream=true is passed, the response is streamed. Start Here
  4. Anthropic - non-whitespace char stop sequence handling - PR
  5. Azure OpenAI - support Microsoft Entra ID username + password based auth. Start Here
  6. LM Studio - embedding route support. Start Here
  7. WatsonX - ZenAPIKeyAuth support. Start Here

Prompt Management Improvements

  1. Langfuse integration
  2. HumanLoop integration
  3. Support for using load balanced models
  4. Support for loading optional params from prompt manager

Start Here

Finetuning + Batch APIs Improvements

  1. Improved unified endpoint support for Vertex AI finetuning - PR
  2. Add support for retrieving vertex api batch jobs - PR

NEW Alerting Integration

PagerDuty Alerting Integration.

Handles two types of alerts:

  • High LLM API Failure Rate. Configure X fails in Y seconds to trigger an alert.
  • High Number of Hanging LLM Requests. Configure X hangs in Y seconds to trigger an alert.

Start Here

Prometheus Improvements

Added support for tracking latency/spend/tokens based on custom metrics. Start Here

NEW Hashicorp Secret Manager Support

Support for reading credentials + writing LLM API keys. Start Here

Management Endpoints / UI Improvements

  1. Create and view organizations + assign org admins on the Proxy UI
  2. Support deleting keys by key_alias
  3. Allow assigning teams to org on UI
  4. Disable using the UI session token for the 'test key' pane
  5. Show model used in 'test key' pane
  6. Support markdown output in 'test key' pane

Helm Improvements

  1. Prevent istio injection for db migrations cron job
  2. Allow using the migrationJob.enabled variable within the job

Logging Improvements

  1. Braintrust logging - respect project_id, add more metrics - https://github.com/BerriAI/litellm/pull/7613
  2. Athina - support base url - ATHINA_BASE_URL
  3. Lunary - Allow passing custom parent run id to LLM Calls

Git Diff

This is the diff between v1.56.3-stable and v1.57.8-stable.

Use this to see the changes in the codebase.

Git Diff


Langfuse Prompt Management

Langfuse Prompt Management is being labelled as BETA. This allows us to iterate quickly on the feedback we're receiving, and makes the status clearer to users. We expect this feature to be stable by next month (February 2025).

Changes:

  • Include the client message in the LLM API Request. (Previously only the prompt template was sent, and the client message was ignored).
  • Log the prompt template in the logged request (e.g. to s3/langfuse).
  • Log the 'prompt_id' and 'prompt_variables' in the logged request (e.g. to s3/langfuse).
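The first change above, sending the client message alongside the prompt template instead of dropping it, amounts to a simple message merge (a simplified stand-in for the integration's behavior; message shapes follow the OpenAI chat format):

```python
def build_request_messages(template_messages: list, client_messages: list) -> list:
    """Combine prompt-template messages with the client's own messages.

    Previously only template_messages were sent and the client message
    was ignored; now both are included in the LLM API request.
    """
    return template_messages + client_messages

template = [{"role": "system", "content": "You are a helpful assistant."}]
client = [{"role": "user", "content": "What's the weather in SF?"}]
print(build_request_messages(template, client))
```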

Start Here

Team/Organization Management + UI Improvements

Managing teams and organizations on the UI is now easier.

Changes:

  • Support for editing user role within team on UI.
  • Support updating team member role to admin via api - /team/member_update
  • Show team admins all keys for their team.
  • Add organizations with budgets
  • Assign teams to orgs on the UI
  • Auto-assign SSO users to teams

Start Here

Hashicorp Vault Support

We now support writing LiteLLM Virtual API keys to Hashicorp Vault.

Start Here

Custom Prometheus Metrics

Define custom Prometheus metrics, and track usage/latency/number of requests against them.

This allows for more fine-grained tracking - e.g. on a prompt template passed in request metadata.
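In proxy config terms this might look like the fragment below. The exact key names are an assumption for illustration; check the linked docs for the real schema:

```yaml
litellm_settings:
  callbacks: ["prometheus"]
  # Track usage/latency against a custom metadata field, e.g. the prompt
  # template name passed in request metadata (key name assumed).
  custom_prometheus_metadata_labels:
    - metadata.prompt_template
```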

Start Here