LLM Provider Configuration Guide
This guide explains how to set up and add LLM providers in Gofannon. Provider configurations determine which models are available and how they interact with the system.
Table of Contents
- Overview
- Why LiteLLM?
- Configuration Structure
- Provider Configuration Files
- Adding a New Provider
- Parameter Types and Features
- Examples
- LiteLLM Mapping Reference
Overview
Gofannon uses a centralized provider configuration system that abstracts LLM provider implementations through LiteLLM. All provider configurations are defined in:
webapp/packages/api/user-service/config/
├── provider_config.py # Main provider registry
├── openai/__init__.py # OpenAI models configuration
├── anthropic/__init__.py # Anthropic/Claude models configuration
├── gemini/__init__.py # Google Gemini models configuration
└── [provider]/__init__.py # Additional provider configurations
The LLM service that consumes these configurations is located at:
webapp/packages/api/user-service/services/llm_service.py
API Key Management
Gofannon supports two ways to configure API keys:
1. Environment Variables (System-wide)
Set by administrators and used as a fallback for all users:
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
PERPLEXITYAI_API_KEY=pplx-...
2. User Profile Keys (User-specific)
Each user can configure their own API keys through the Profile → API Keys page. User-specific keys take precedence over environment variables.
See API Key Management for detailed documentation.
Key Priority Order
When making an LLM API call, keys are resolved in this order (see the sketch below):
1. User's stored API key (if configured in the profile)
2. Environment variable (system-wide fallback)
3. No key available (the provider is treated as unavailable)
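To make the order concrete, here is a minimal sketch of the resolution logic. It is illustrative only: resolve_api_key and the user_keys mapping are hypothetical names, and the small PROVIDER_CONFIG dict is a stand-in for the real registry described later in this guide.
import os
from typing import Optional

# Stand-in for the registry defined in provider_config.py (see Configuration Structure below)
PROVIDER_CONFIG = {
    "openai": {"api_key_env_var": "OPENAI_API_KEY", "models": {}},
    "ollama": {"models": {}},  # local provider: no api_key_env_var
}

def resolve_api_key(provider: str, user_keys: dict) -> Optional[str]:
    """Return the key to use for a provider, or None if the provider is unavailable."""
    # 1. User's stored API key takes precedence
    user_key = user_keys.get(provider)
    if user_key:
        return user_key
    # 2. Fall back to the system-wide environment variable, if one is declared
    env_var = PROVIDER_CONFIG.get(provider, {}).get("api_key_env_var")
    if env_var and os.environ.get(env_var):
        return os.environ[env_var]
    # 3. No key available
    return None

# resolve_api_key("openai", {"openai": "sk-user-key"})  -> the user's key
# resolve_api_key("openai", {})                         -> OPENAI_API_KEY if set, else None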
Why LiteLLM?
Gofannon relies on LiteLLM to abstract multiple LLM providers and manage their dependencies. This architectural decision has important implications:
Advantages
- Unified Interface: Single API interface for all providers (OpenAI, Anthropic, Google, etc.)
- Dependency Management: LiteLLM handles provider-specific SDKs and their dependencies
- Consistency: Standardized request/response formats across providers
- Reduced Maintenance: Updates to provider SDKs are managed by LiteLLM
Important Tradeoff
Do not use provider-specific SDKs directly. Relying solely on LiteLLM keeps the codebase simpler, but it introduces a lag between a provider releasing a new feature and Gofannon being able to use it (we must wait for LiteLLM to add support). We consider this tradeoff acceptable given the significant maintenance and consistency benefits.
Best Practice
When implementing provider features, always reference LiteLLM's documentation to understand:
- How provider-specific options map to LiteLLM parameters
- Which features are currently supported
- Provider-specific limitations or quirks
Configuration Structure
Each provider in provider_config.py follows this structure:
PROVIDER_CONFIG = {
    "provider_name": {
        "api_key_env_var": "PROVIDER_API_KEY",  # Optional: environment variable for API key
        "models": {
            "model-name": {
                "api_style": "responses",  # Optional: "responses" for OpenAI's special APIs
                "returns_thoughts": True,  # Whether model returns reasoning/thoughts
                "parameters": {
                    # Model-specific parameters (see below)
                },
                "built_in_tools": [
                    # Provider-specific built-in tools (see below)
                ]
            }
        }
    }
}
Key Fields
- api_key_env_var: Environment variable name for the provider's API key
- models: Dictionary of model configurations keyed by model name
- api_style: Special handling for certain APIs (e.g., OpenAI's "responses" API for o1/reasoning models)
- returns_thoughts: Boolean indicating if the model returns reasoning traces or internal thoughts
- parameters: Model-specific parameters with validation rules
- built_in_tools: Provider-specific tools (web search, code execution, etc.)
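As a sketch of how these fields are typically consumed, the helpers below look up a model entry and collect its declared defaults. The helper names are hypothetical and the inline PROVIDER_CONFIG is a stand-in; the real consumption logic lives in llm_service.py.
# Stand-in for the registry in provider_config.py
PROVIDER_CONFIG = {
    "openai": {
        "api_key_env_var": "OPENAI_API_KEY",
        "models": {
            "gpt-5.2": {
                "api_style": "responses",
                "returns_thoughts": True,
                "parameters": {
                    "reasoning_effort": {"type": "choice", "default": "disable",
                                         "choices": ["disable", "low", "medium", "high"]},
                },
                "built_in_tools": [],
            }
        },
    }
}

def get_model_config(provider: str, model: str) -> dict:
    """Fetch a model's entry; raises KeyError if the provider/model is not registered."""
    return PROVIDER_CONFIG[provider]["models"][model]

def default_parameters(model_config: dict) -> dict:
    """Collect the declared default value for every parameter the model exposes."""
    return {name: spec["default"] for name, spec in model_config.get("parameters", {}).items()}

# cfg = get_model_config("openai", "gpt-5.2")
# default_parameters(cfg)  -> {"reasoning_effort": "disable"}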
Provider Configuration Files
OpenAI Configuration
Location: webapp/packages/api/user-service/config/openai/__init__.py
Key features:
- API Style: OpenAI's newer models (o1, gpt-5 series) use the "responses" API style
- Reasoning Effort: GPT-5 and o-series models support the reasoning_effort parameter
- Built-in Tools: Many models have built-in web search capabilities
Example configuration:
"gpt-5.2": {
"api_style": "responses",
"returns_thoughts": True,
"parameters": {
"reasoning_effort": {
"type": "choice",
"default": "disable",
"choices": ["disable", "low", "medium", "high"],
"description": "Reasoning Effort: Effort level for reasoning during generation"
},
},
"built_in_tools": [
{
"id": "web_search",
"description": "Performs a web search.",
"tool_config": {"type": "web_search", "search_context_size": "medium"}
},
]
}
LiteLLM Mapping:
- The api_style: "responses" setting maps to LiteLLM's litellm.aresponses() function (see llm_service.py:87-127)
- Standard models use litellm.acompletion() (see llm_service.py:220-240)
- Model string format: "openai/model-name" (see llm_service.py:53)
Anthropic Configuration
Location: webapp/packages/api/user-service/config/anthropic/__init__.py
Key features:
- Mutually Exclusive Parameters: Claude 4.x models cannot have both temperature and top_p set simultaneously
- Max Tokens: Different models have different token limits
Example configuration:
"claude-opus-4-5-20251101": {
"returns_thoughts": False,
"parameters": {
"temperature": {
"type": "float",
"default": 1.0,
"min": 0.0,
"max": 1.0,
"description": "Randomness (0=focused, 1=creative)"
},
"top_p": {
"type": "float",
"default": 0.9,
"min": 0.0,
"max": 1.0,
"description": "Nucleus sampling (0.1=conservative, 0.95=diverse)",
"mutually_exclusive_with": ["temperature"]
},
"max_tokens": {
"type": "integer",
"default": 8192,
"min": 1,
"max": 16384,
"description": "Maximum tokens in response"
},
}
}
LiteLLM Mapping:
- Model string format: "anthropic/model-name" (see llm_service.py:53)
- Anthropic's block-based content format is handled in llm_service.py:250-261
- The mutually_exclusive_with rule is enforced in the frontend; LiteLLM passes through only one parameter
Gemini Configuration
Location: webapp/packages/api/user-service/config/gemini/__init__.py
Key features:
- Built-in Tools: Google Search, URL context, code execution
- Reasoning Effort: Similar to OpenAI's reasoning models
Example configuration:
"gemini-2.5-pro": {
"parameters": {
"temperature": {
"type": "float",
"default": 1.0,
"min": 0.0,
"max": 2.0,
"description": "Temperature - Controls the randomness of the output."
},
"reasoning_effort": {
"type": "choice",
"default": "disable",
"choices": ["disable", "low", "medium", "high"],
"description": "Reasoning Effort: Effort level for reasoning during generation"
},
},
"built_in_tools": [
{
"id": "google_search",
"description": "Performs a Google search.",
"tool_config": {"google_search": {}}
},
{
"id": "code_execution",
"description": "Executes code snippets in a secure environment.",
"tool_config": {"codeExecution": {}}
}
]
}
LiteLLM Mapping:
- Model string format: "gemini/model-name" (see llm_service.py:53)
- Built-in tools are passed through LiteLLM's tools parameter (see the sketch below)
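As an illustration of that pass-through, the sketch below sends the tool_config values from the entries above directly to litellm.acompletion(). This is an assumption about the call shape, not a copy of llm_service.py; confirm against LiteLLM's Gemini documentation whether your LiteLLM version accepts these dicts verbatim.
import asyncio
import litellm

# tool_config values taken from the built_in_tools entries above
GEMINI_BUILT_IN_TOOLS = [
    {"google_search": {}},
    {"codeExecution": {}},
]

async def ask_gemini(prompt: str) -> str:
    response = await litellm.acompletion(
        model="gemini/gemini-2.5-pro",                    # "gemini/model-name" format
        messages=[{"role": "user", "content": prompt}],
        tools=GEMINI_BUILT_IN_TOOLS,                      # built-in tools passed straight through
    )
    return response.choices[0].message.content or ""

# asyncio.run(ask_gemini("What is new in Gemini 2.5?"))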
Ollama Configuration
Location: webapp/packages/api/user-service/config/provider_config.py:17-77
Example for local models:
"ollama": {
"models": {
"llama2": {
"parameters": {
"temperature": {
"type": "float",
"default": 0.7,
"min": 0.0,
"max": 1.0,
"description": "Controls randomness"
},
"num_predict": {
"type": "integer",
"default": 512,
"min": 1,
"max": 2048,
"description": "Maximum tokens to generate"
},
}
}
}
}
LiteLLM Mapping:
- No API key required (local deployment)
- Model string format: "ollama/model-name" (a direct-call sketch follows)
Adding a New Provider
Follow these steps to add a new LLM provider:
Step 1: Research LiteLLM Support
Before adding a provider, check LiteLLM's supported providers documentation:
- Verify the provider is supported
- Note the required authentication method
- Identify any provider-specific parameters
- Check for special features (built-in tools, reasoning, etc.)
Step 2: Create Provider Configuration File
Create a new file: webapp/packages/api/user-service/config/[provider_name]/__init__.py
# [Provider Name] models configuration
# Updated [Date]
models = {
    "model-name": {
        "returns_thoughts": False,  # or True if model supports reasoning
        "parameters": {
            # Define parameters with validation rules
            "temperature": {
                "type": "float",
                "default": 0.7,
                "min": 0.0,
                "max": 1.0,
                "description": "Controls randomness in generation"
            },
            # Add more parameters as needed
        },
        "built_in_tools": []  # Add provider-specific tools if available
    }
}
Step 3: Register Provider in Main Config
Edit webapp/packages/api/user-service/config/provider_config.py:
from .provider_name import models as provider_name_models
PROVIDER_CONFIG = {
    # ... existing providers ...
    "provider_name": {
        "api_key_env_var": "PROVIDER_NAME_API_KEY",  # if API key is required
        "models": provider_name_models,
    },
}
Step 4: Set Environment Variables
Add the API key to your environment or .env file:
PROVIDER_NAME_API_KEY=your-api-key-here
Step 5: Test the Integration
Create a test to verify the provider works correctly (a hedged test sketch follows this list). The LLM service will automatically:
- Format the model string as "provider_name/model-name"
- Pass it to LiteLLM's acompletion() or aresponses() function
- Handle the response according to the configuration
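A hedged smoke-test sketch, pytest-style, that exercises the new provider through LiteLLM directly. It assumes pytest-asyncio is installed; the provider, model, and environment-variable names are placeholders you should replace.
import os

import litellm
import pytest

PROVIDER = "provider_name"                # placeholder for your new provider
MODEL = f"{PROVIDER}/model-name"          # same string format the LLM service builds
API_KEY_ENV = "PROVIDER_NAME_API_KEY"     # must match api_key_env_var in provider_config.py

@pytest.mark.asyncio
@pytest.mark.skipif(not os.environ.get(API_KEY_ENV), reason="provider API key not configured")
async def test_new_provider_basic_completion():
    response = await litellm.acompletion(
        model=MODEL,
        messages=[{"role": "user", "content": "Reply with the single word: pong"}],
        max_tokens=16,
    )
    content = response.choices[0].message.content
    assert isinstance(content, str) and content.strip()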
Parameter Types and Features
Basic Parameter Types
Float Parameter
"temperature": {
"type": "float",
"default": 0.7,
"min": 0.0,
"max": 2.0,
"description": "Controls randomness"
}
Integer Parameter
"max_tokens": {
"type": "integer",
"default": 4096,
"min": 1,
"max": 16384,
"description": "Maximum tokens in response"
}
Choice Parameter
"reasoning_effort": {
"type": "choice",
"default": "medium",
"choices": ["low", "medium", "high"],
"description": "Effort level for reasoning"
}
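These specs lend themselves to a small generic validator. The sketch below is illustrative only and is not the validation code Gofannon actually ships (validation happens in the frontend and/or llm_service.py).
def validate_parameter(spec: dict, value) -> None:
    """Check a single value against a parameter spec; raise ValueError on mismatch."""
    kind = spec["type"]
    if kind == "float":
        if not isinstance(value, (int, float)) or isinstance(value, bool):
            raise ValueError(f"expected a number, got {type(value).__name__}")
        if not (spec["min"] <= float(value) <= spec["max"]):
            raise ValueError(f"{value} is outside [{spec['min']}, {spec['max']}]")
    elif kind == "integer":
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"expected an integer, got {type(value).__name__}")
        if not (spec["min"] <= value <= spec["max"]):
            raise ValueError(f"{value} is outside [{spec['min']}, {spec['max']}]")
    elif kind == "choice":
        if value not in spec["choices"]:
            raise ValueError(f"{value!r} is not one of {spec['choices']}")
    else:
        raise ValueError(f"unknown parameter type {kind!r}")

# validate_parameter({"type": "choice", "choices": ["low", "medium", "high"]}, "medium")  # passes
# validate_parameter({"type": "float", "min": 0.0, "max": 1.0}, 1.5)                      # raises ValueError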
Advanced Features
Mutually Exclusive Parameters
Prevents using two parameters simultaneously (like temperature and top_p):
"temperature": {
"type": "float",
"default": 0.7,
"min": 0.0,
"max": 2.0,
"mutually_exclusive_with": ["top_p"]
}
Implementation: The LLM service filters out None values before passing to LiteLLM (see llm_service.py:58-59):
# Filter out None values from parameters (e.g., top_p with default None)
filtered_params = {k: v for k, v in parameters.items() if v is not None}
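For example, if the frontend sends top_p and leaves temperature unset (None), only top_p reaches LiteLLM:
parameters = {"temperature": None, "top_p": 0.9, "max_tokens": 8192}
filtered_params = {k: v for k, v in parameters.items() if v is not None}
# filtered_params == {"top_p": 0.9, "max_tokens": 8192}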
Built-in Tools
Provider-specific tools that don't require custom implementation:
"built_in_tools": [
{
"id": "web_search",
"description": "Performs a web search.",
"tool_config": {"type": "web_search", "search_context_size": "medium"}
}
]
LiteLLM Mapping: Built-in tools are passed through the tools parameter in llm_service.py:69-70:
if tools:
    kwargs["tools"] = tools
API Styles
For providers with multiple API endpoints (like OpenAI's responses API):
"api_style": "responses" # Uses litellm.aresponses() instead of acompletion()
Implementation: The service checks this flag and routes to the appropriate LiteLLM function (see llm_service.py:82-86):
use_responses_api = (
    api_style == "responses" and
    (tools or reasoning_effort != 'disable')
)
Examples
Example 1: Basic Provider (No Special Features)
# config/cohere/__init__.py
models = {
    "command": {
        "returns_thoughts": False,
        "parameters": {
            "temperature": {
                "type": "float",
                "default": 0.75,
                "min": 0.0,
                "max": 1.0,
                "description": "Controls randomness"
            },
            "max_tokens": {
                "type": "integer",
                "default": 4096,
                "min": 1,
                "max": 4096,
                "description": "Maximum tokens in response"
            }
        },
        "built_in_tools": []
    }
}
# In provider_config.py
from .cohere import models as cohere_models
PROVIDER_CONFIG = {
    "cohere": {
        "api_key_env_var": "COHERE_API_KEY",
        "models": cohere_models
    }
}
Example 2: Provider with Reasoning Support
# config/mistral/__init__.py
models = {
    "mistral-large": {
        "returns_thoughts": True,
        "parameters": {
            "temperature": {
                "type": "float",
                "default": 0.7,
                "min": 0.0,
                "max": 1.0,
                "description": "Controls randomness"
            },
            "reasoning_effort": {
                "type": "choice",
                "default": "disable",
                "choices": ["disable", "low", "medium", "high"],
                "description": "Reasoning effort level"
            }
        },
        "built_in_tools": []
    }
}
Example 3: Local Provider (No API Key)
# In provider_config.py
PROVIDER_CONFIG = {
    "vllm": {
        # No api_key_env_var needed for local deployment
        "models": {
            "llama-3-70b": {
                "returns_thoughts": False,
                "parameters": {
                    "temperature": {
                        "type": "float",
                        "default": 0.7,
                        "min": 0.0,
                        "max": 1.0,
                        "description": "Controls randomness"
                    }
                },
                "built_in_tools": []
            }
        }
    }
}
LiteLLM Mapping Reference
This section explains how Gofannon's configuration maps to LiteLLM function calls.
Model String Format
Gofannon constructs model strings in the format "provider/model":
# From llm_service.py:53
model_string = f"{provider}/{model}"
Examples:
"openai/gpt-4o""anthropic/claude-opus-4-5-20251101""gemini/gemini-2.5-pro"
Standard Completion Flow
For most models (see llm_service.py:220-240):
kwargs = {
    "model": model_string,   # "provider/model"
    "messages": messages,    # Standard messages array
    **filtered_params,       # temperature, max_tokens, etc.
}
if reasoning_effort != 'disable':
    kwargs['reasoning_effort'] = reasoning_effort
response = await litellm.acompletion(**kwargs)
Responses API Flow
For OpenAI's responses API (see llm_service.py:87-127):
# System prompts become 'instructions'
kwargs["instructions"] = "\n\n".join(system_prompts)
# Last user message becomes 'input'
response_obj = await litellm.aresponses(input=input_text, **kwargs)
# Poll for completion
response_status = await litellm.aget_responses(response_id=response_obj.id)
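The polling step can be wrapped in a small loop. This is a hedged sketch: it assumes the object returned by litellm.aget_responses() exposes a status attribute with terminal values such as "completed" or "failed", as in OpenAI's Responses API; confirm the exact shape against LiteLLM's documentation.
import asyncio
import litellm

async def wait_for_response(response_id: str, poll_interval: float = 1.0, timeout: float = 120.0):
    """Poll the Responses API until the run reaches a terminal state or the timeout elapses."""
    elapsed = 0.0
    while elapsed < timeout:
        status_obj = await litellm.aget_responses(response_id=response_id)
        # Assumed status values, mirroring OpenAI's Responses API
        if getattr(status_obj, "status", None) in ("completed", "failed", "cancelled"):
            return status_obj
        await asyncio.sleep(poll_interval)
        elapsed += poll_interval
    raise TimeoutError(f"response {response_id} did not complete within {timeout}s")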
Parameter Filtering
Parameters with None values are filtered out (see llm_service.py:58-59):
filtered_params = {k: v for k, v in parameters.items() if v is not None}
This implements mutual exclusivity without explicit validation.
Tool Handling
Tools are passed directly if provided (see llm_service.py:69-70):
if tools:
    kwargs["tools"] = tools
Response Extraction
Different response formats are handled:
Standard responses (see llm_service.py:239-263):
message = response.choices[0].message
content = message.content if isinstance(message.content, str) else ""
# Extract thoughts (reasoning, tool calls, etc.)
if message.tool_calls:
    thoughts_payload['tool_calls'] = [tc.model_dump() for tc in message.tool_calls]
if hasattr(message, 'reasoning_content') and message.reasoning_content:
    thoughts_payload['reasoning_content'] = message.reasoning_content
Anthropic block-based responses (see llm_service.py:250-261):
if isinstance(message.content, list):  # Anthropic's block-based content
    content_blocks = message.content
    thought_blocks = [block for block in content_blocks if block.get("type") == "thought"]
    tool_use_blocks = [block for block in content_blocks if block.get("type") == "tool_use"]
    text_blocks = [block.get("text", "") for block in content_blocks if block.get("type") == "text"]
Additional Resources
- LiteLLM Documentation
- LiteLLM Supported Providers
- LiteLLM API Reference
- Gofannon LLM Service Implementation
Troubleshooting
Provider Not Working
- Check LiteLLM Support: Verify the provider is supported by LiteLLM
- Verify API Key:
  - Check if the user has configured a personal API key in their profile
  - Ensure the environment variable is set correctly (fallback)
- Check Model Name: Verify the model name matches LiteLLM's expected format
- Review LiteLLM Logs: Check services/litellm_logger.py for error messages
API Key Issues
User-specific keys not working:
- Verify the key is saved in the user's profile (Profile → API Keys)
- Check the provider status shows "Configured"
- Test the key directly with the provider's API
Environment variable not working:
- Ensure the environment variable name matches api_key_env_var in provider_config.py
- Restart the application after setting environment variables
- Check for typos or extra whitespace
Parameter Issues
- Mutually Exclusive Parameters: Ensure only one parameter from a mutually exclusive group is set
- Range Validation: Check that numeric values are within min/max bounds
- Type Mismatches: Verify parameter types match the configuration (float vs int)
Feature Lag
If a provider releases a new feature that isn't working:
- Check if LiteLLM has added support for the feature
- Review LiteLLM's changelog
- Consider updating the LiteLLM dependency
- Temporarily use the provider's SDK directly (not recommended for production)
Contributing
When adding new providers or models:
- Follow the existing configuration patterns
- Add comprehensive parameter descriptions
- Document any provider-specific quirks
- Reference LiteLLM documentation for parameter mappings
- Add example usage in this documentation
- Test thoroughly with the provider's actual API
Last Updated: January 2026
Maintainer: AI Alliance Gofannon Team