API Documentation
Overview
This document provides a comprehensive guide to using the Sahara API. It walks you through discovering available models and compute providers, querying model metadata, and making inference requests using both raw HTTP and OpenAI-compatible Python clients. The API is especially useful for developers integrating multiple model providers into their workflow while maintaining a unified interface. You will learn how to:
Query all available models and compute providers
Filter models or providers using specific criteria
Access model usage details
Send inference requests through Langchain, OpenAI SDK, or direct HTTP
Implement multi-agent logic with routing

Preparation
API Setup
To access the Sahara Model Hub API, you need a valid API key. This key is required to authenticate every API request.
How to Get Your API Key
1. Go to the Developer Portal
Open: https://portal.saharalabs.ai
2. Log In and Access API Keys
Click your profile icon (top-right) → select "API Key".
3. Create a New Key
Click "Create API Key", assign it a name like "dev-client", and generate it.
4. Copy and Store Securely
You can only view the key once. Save it securely in an environment variable, config file, or secret manager.

Note: Never expose your API key in public code or repositories. Treat it as a secret credential.
Once you have your API key, configure it in your script. This will be required in all requests sent to the Sahara Model Hub API.
Configure http header with you API_KEY:
Replace "your-api-key" with the key you obtained from the Developer Portal.
Discover Available Models & Providers
The Sahara API allows you to dynamically explore available models and compute providers.
Get All Models
This command fetches all registered models across providers:
Sample Response
Get All Providers
This API lists all compute providers (e.g., OpenAI, Lepton, Together):
Sample Response
Get Models by Provider
Query models served by a specific provider
Sample Reponse
Get Providers by Model
Find which providers serve a specific model, for example, when we want to find the provider serving deepseek-ai/DeepSeek-V3
Output
Get Model Details
Fetch metadata and detailed usage requirements for a specific model-provider pair:
Sample Response
Model Inference by Raw HTTP Request
Sample Response
Model Inference by OpenAI SDK
If you prefer to use OpenAI's SDK, the Sahara endpoint fully supports OpenAI-compatible APIs.
Non-Streaming Response
Sample Output
Streaming Response
Sample Response
Model Inference Using Langchain
Prerequisites
Ensure the following tools and packages are installed before continuing
langchain_openai is a Python library that provides integration between LangChain and OpenAI’s API.
You can interact with sahara models using the `langchain` interface. This is useful for testing streaming outputs and experimenting with conversational flows. Below is an example using three working models and one invalid one to demonstrate both success and failure:
Sample Response
Multi-Agent Integration (OpenAI Agents SDK)
The Sahara API supports OpenAI's 'agents-python' package. This example sets up three agents:
A Spanish-speaking agent
An English-speaking agent
A triage agent that routes input based on langauge
Prerequisites
Ensure the following tools and packages are installed before continuing
openai-agents is a Python SDK that provides an Agent Framework for building intelligent agents.
nest_asyncio Allows you to run asynchronous code
Sample Response
This example demonstrates complex routing logic using OpenAI-compatible models served from Sahara.
Error Handling and Best Practices
Error Codes
400 Bad Request: Check request formatting.
404 Not Found: Verify pipeline or model IDs.
500 Internal Server Error: Retry or contact support.
Best Practices
Secure Keys:
Use environment variables to store API keys securely.
Monitor Usage:
Regularly review metrics to optimize performance.
Retry Logic:
Implement retry logic for transient errors (e.g., 500 Internal Server Error).