Web Search Integration

Enable transparent server-side web search execution for any LLM provider. LiteLLM automatically intercepts web search tool calls and executes them using your configured search provider (Perplexity, Tavily, etc.).

Quick Start

1. Configure Web Search Interception

Add to your config.yaml:

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  callbacks:
    - websearch_interception:
        enabled_providers:
          - openai
          - minimax
          - anthropic
        search_tool_name: perplexity-search  # Optional

search_tools:
  - search_tool_name: perplexity-search
    litellm_params:
      search_provider: perplexity
      api_key: os.environ/PERPLEXITY_API_KEY

2. Use with Any Provider

import litellm

response = await litellm.acompletion(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "What's the weather in San Francisco today?"}
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "litellm_web_search",
                "description": "Search the web for information",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "Search query"}
                    },
                    "required": ["query"]
                }
            }
        }
    ]
)

# Response includes search results automatically!
print(response.choices[0].message.content)

How It Works

When a model makes a web search tool call, LiteLLM:

  1. Detects the litellm_web_search tool call in the response
  2. Executes the search using your configured search provider
  3. Makes a follow-up request with the search results
  4. Returns the final answer to the user

Result: a single API call from the user returns a complete answer grounded in search results. The sketch below shows the loop that the callback automates.
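
Conceptually, the interception is equivalent to running this loop yourself. This is an illustrative sketch, not LiteLLM's internal code; `run_search` is a hypothetical stand-in for your configured search provider.

import json

import litellm

# Hypothetical helper standing in for your configured search provider
# (Perplexity, Tavily, ...); not part of the LiteLLM API.
async def run_search(query: str) -> str:
    return f"<search results for {query!r}>"

async def answer_with_search(messages: list, tools: list) -> str:
    # Step 1: initial request; the model may call litellm_web_search.
    response = await litellm.acompletion(model="gpt-4o", messages=messages, tools=tools)
    message = response.choices[0].message

    # Steps 2-3: execute each search and send a follow-up request, repeating
    # until the model returns a plain answer instead of more tool calls.
    while message.tool_calls:
        messages.append(message)
        for call in message.tool_calls:
            query = json.loads(call.function.arguments)["query"]
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": await run_search(query),
            })
        response = await litellm.acompletion(model="gpt-4o", messages=messages, tools=tools)
        message = response.choices[0].message

    # Step 4: the final answer, grounded in the search results.
    return message.content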

Supported Providers

Web search integration works with all providers that use:

  • Base HTTP Handler (BaseLLMHTTPHandler)
  • OpenAI Completion Handler (OpenAIChatCompletion)

Providers Using Base HTTP Handler

| Provider | Status | Notes |
| --- | --- | --- |
| OpenAI | ✅ Supported | GPT-4, GPT-3.5, etc. |
| Anthropic | ✅ Supported | Claude models via HTTP handler |
| MiniMax | ✅ Supported | All MiniMax models |
| Mistral | ✅ Supported | Mistral AI models |
| Cohere | ✅ Supported | Command models |
| Fireworks AI | ✅ Supported | All Fireworks models |
| Together AI | ✅ Supported | All Together AI models |
| Groq | ✅ Supported | All Groq models |
| Perplexity | ✅ Supported | Perplexity models |
| DeepSeek | ✅ Supported | DeepSeek models |
| xAI | ✅ Supported | Grok models |
| Hugging Face | ✅ Supported | Inference API models |
| OCI | ✅ Supported | Oracle Cloud models |
| Vertex AI | ✅ Supported | Google Vertex AI models |
| Bedrock | ✅ Supported | AWS Bedrock models (converse_like route) |
| Azure OpenAI | ✅ Supported | Azure-hosted OpenAI models |
| Sagemaker | ✅ Supported | AWS Sagemaker models |
| Databricks | ✅ Supported | Databricks models |
| DataRobot | ✅ Supported | DataRobot models |
| Hosted VLLM | ✅ Supported | Self-hosted VLLM |
| Heroku | ✅ Supported | Heroku-hosted models |
| RAGFlow | ✅ Supported | RAGFlow models |
| Compactif | ✅ Supported | Compactif models |
| Cometapi | ✅ Supported | Comet API models |
| A2A | ✅ Supported | Agent-to-Agent models |
| Bytez | ✅ Supported | Bytez models |

Providers Using OpenAI Handler

| Provider | Status | Notes |
| --- | --- | --- |
| OpenAI | ✅ Supported | Native OpenAI API |
| Azure OpenAI | ✅ Supported | Azure-hosted OpenAI |
| OpenAI-Compatible | ✅ Supported | Any OpenAI-compatible API |

Configuration

WebSearch Interception Parameters

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| enabled_providers | List[String] | Yes | List of providers to enable web search for | [openai, minimax, anthropic] |
| search_tool_name | String | No | Specific search tool from the search_tools config. If not set, uses the first available. | perplexity-search |

Provider Values

Use these values in enabled_providers:

| Provider | Value | Provider | Value |
| --- | --- | --- | --- |
| OpenAI | openai | Anthropic | anthropic |
| MiniMax | minimax | Mistral | mistral |
| Cohere | cohere | Fireworks AI | fireworks_ai |
| Together AI | together_ai | Groq | groq |
| Perplexity | perplexity | DeepSeek | deepseek |
| xAI | xai | Hugging Face | huggingface |
| OCI | oci | Vertex AI | vertex_ai |
| Bedrock | bedrock | Azure | azure |
| Sagemaker | sagemaker_chat | Databricks | databricks |
| DataRobot | datarobot | VLLM | hosted_vllm |
| Heroku | heroku | RAGFlow | ragflow |
| Compactif | compactif | Cometapi | cometapi |
| A2A | a2a | Bytez | bytez |

Search Providers

Configure which search provider to use. LiteLLM supports multiple search providers:

| Provider | search_provider Value | Environment Variable |
| --- | --- | --- |
| Perplexity AI | perplexity | PERPLEXITYAI_API_KEY |
| Tavily | tavily | TAVILY_API_KEY |
| Exa AI | exa_ai | EXA_API_KEY |
| Parallel AI | parallel_ai | PARALLEL_AI_API_KEY |
| Google PSE | google_pse | GOOGLE_PSE_API_KEY, GOOGLE_PSE_ENGINE_ID |
| DataForSEO | dataforseo | DATAFORSEO_LOGIN, DATAFORSEO_PASSWORD |
| Firecrawl | firecrawl | FIRECRAWL_API_KEY |
| SearXNG | searxng | SEARXNG_API_BASE (required) |
| Linkup | linkup | LINKUP_API_KEY |

See Search Providers Documentation for detailed setup instructions.

Complete Configuration Example

model_list:
  # OpenAI
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

  # MiniMax
  - model_name: minimax
    litellm_params:
      model: minimax/MiniMax-M2.1
      api_key: os.environ/MINIMAX_API_KEY

  # Anthropic
  - model_name: claude
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      api_key: os.environ/ANTHROPIC_API_KEY

  # Azure OpenAI
  - model_name: azure-gpt4
    litellm_params:
      model: azure/gpt-4
      api_base: https://my-azure.openai.azure.com
      api_key: os.environ/AZURE_API_KEY

litellm_settings:
  callbacks:
    - websearch_interception:
        enabled_providers:
          - openai
          - minimax
          - anthropic
          - azure
        search_tool_name: perplexity-search

search_tools:
  - search_tool_name: perplexity-search
    litellm_params:
      search_provider: perplexity
      api_key: os.environ/PERPLEXITY_API_KEY

  - search_tool_name: tavily-search
    litellm_params:
      search_provider: tavily
      api_key: os.environ/TAVILY_API_KEY

Usage Examples

Python SDK

import litellm

# Configure callbacks
litellm.callbacks = ["websearch_interception"]

# Make a completion with the web search tool
response = await litellm.acompletion(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "What are the latest AI news?"}
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "litellm_web_search",
                "description": "Search the web for current information",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Search query"
                        }
                    },
                    "required": ["query"]
                }
            }
        }
    ]
)

print(response.choices[0].message.content)

Proxy Server

# Start proxy with config
litellm --config config.yaml

# Make request
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "What is the weather in San Francisco?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "litellm_web_search",
          "description": "Search the web",
          "parameters": {
            "type": "object",
            "properties": {
              "query": {"type": "string"}
            },
            "required": ["query"]
          }
        }
      }
    ]
  }'
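
Because the proxy speaks the OpenAI wire format, any OpenAI-compatible client works as well. For example, the official openai Python SDK pointed at the proxy (the sk-1234 key and port match the curl example above):

from openai import OpenAI

# Point the OpenAI SDK at the LiteLLM proxy from the curl example above.
client = OpenAI(api_key="sk-1234", base_url="http://localhost:4000/v1")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather in San Francisco?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "litellm_web_search",
            "description": "Search the web",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }],
)
print(response.choices[0].message.content)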

How Search Tool Selection Works

  1. If search_tool_name is specified → Uses that specific search tool
  2. If search_tool_name is not specified → Uses the first search tool in the search_tools list

search_tools:
  - search_tool_name: perplexity-search  # ← Used when no search_tool_name is specified
    litellm_params:
      search_provider: perplexity
      api_key: os.environ/PERPLEXITY_API_KEY

  - search_tool_name: tavily-search
    litellm_params:
      search_provider: tavily
      api_key: os.environ/TAVILY_API_KEY

Troubleshooting

Web Search Not Working

  1. Check the provider is enabled:

     enabled_providers:
       - openai  # Make sure your provider is in this list

  2. Verify the search tool is configured:

     search_tools:
       - search_tool_name: perplexity-search
         litellm_params:
           search_provider: perplexity
           api_key: os.environ/PERPLEXITY_API_KEY

  3. Check API keys are set:

     export PERPLEXITY_API_KEY=your-key

  4. Enable debug logging:

     litellm.set_verbose = True

Common Issues

Issue: Model returns tool_calls instead of final answer

  • Cause: Provider not in enabled_providers list
  • Solution: Add provider to enabled_providers

Issue: "No search tool configured" error

  • Cause: No search tools in search_tools config
  • Solution: Add at least one search tool configuration

Issue: "Invalid function arguments json string" error (MiniMax)

  • Cause: A bug in older versions where tool-call arguments weren't properly JSON-serialized (fixed in the latest version)
  • Solution: Update to the latest LiteLLM version

Technical Details

Architecture

Web search integration is implemented as a custom callback (WebSearchInterceptionLogger) that:

  1. Pre-request Hook: Converts native web search tools to LiteLLM standard format
  2. Post-response Hook: Detects web search tool calls in responses
  3. Agentic Loop: Executes searches and makes follow-up requests automatically

Supported APIs

  • Chat Completions API (OpenAI format)
  • Anthropic Messages API (Anthropic format)
  • Streaming (automatically converted; see the sketch after this list)
  • Non-streaming
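
For streaming, nothing changes on the caller's side. A hedged sketch, where web_search_tools is assumed to be the litellm_web_search tool definition from the Quick Start:

import litellm

# web_search_tools: the litellm_web_search tool definition from the
# Quick Start example above (assumed to be defined already).
response = await litellm.acompletion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What are the latest AI news?"}],
    tools=web_search_tools,
    stream=True,
)

# The interception callback resolves any web search tool calls before the
# final answer is streamed back chunk by chunk.
async for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")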

Response Format Detection

The handler automatically detects the response format (a simplified check is sketched after this list):

  • OpenAI format: tool_calls in assistant message
  • Anthropic format: tool_use blocks in content
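
A simplified illustration of that check, operating on raw response dicts. This is not LiteLLM's actual detection code, which lives inside the WebSearchInterceptionLogger callback:

import json

def detect_web_search_queries(response: dict) -> list[str]:
    """Collect litellm_web_search queries from either response format."""
    queries = []

    # OpenAI format: tool_calls on the assistant message.
    choices = response.get("choices") or [{}]
    message = choices[0].get("message", {})
    for call in message.get("tool_calls") or []:
        if call["function"]["name"] == "litellm_web_search":
            queries.append(json.loads(call["function"]["arguments"])["query"])

    # Anthropic format: tool_use blocks in the content list.
    for block in response.get("content") or []:
        if block.get("type") == "tool_use" and block.get("name") == "litellm_web_search":
            queries.append(block["input"]["query"])

    return queries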

Performance

  • Latency: Adds one additional LLM call (the follow-up request with search results)
  • Caching: Search results can be cached, depending on the search provider
  • Parallel Searches: Multiple search queries are executed in parallel (sketched below)
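
The parallel execution follows the standard asyncio fan-out pattern. A sketch reusing the hypothetical run_search stand-in from the loop example above:

import asyncio

async def run_search(query: str) -> str:
    # Same hypothetical stand-in for your search provider as above.
    return f"<search results for {query!r}>"

async def run_all_searches(queries: list[str]) -> list[str]:
    # Fan out one concurrent task per query; results come back in order.
    return await asyncio.gather(*(run_search(q) for q in queries))

print(asyncio.run(run_all_searches(["weather in SF", "latest AI news"])))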

Contributing

Found a bug or want to add support for a new provider? See our Contributing Guide.