# API Documentation

### Overview

This document provides a comprehensive guide to using the Sahara API. It walks you through discovering available models and compute providers, querying model metadata, and making inference requests using both raw HTTP and OpenAI-compatible Python clients.\
\
The API is especially useful for developers integrating multiple model providers into their workflow while maintaining a unified interface.\
\
You will learn how to:

* Query all available models and compute providers
* Filter models or providers using specific criteria
* Access model usage details
* Send inference requests through LangChain, the OpenAI SDK, or direct HTTP
* Implement multi-agent logic with routing

<figure><img src="/files/izTqjEOXwq09rwAGCkMM" alt=""><figcaption></figcaption></figure>

### Preparation

### API Setup

To access the Sahara Model Hub API, you need a valid API key. This key is required to authenticate every API request.

#### How to Get Your API Key

1. Go to the Developer Portal: open <https://portal.saharalabs.ai>.
2. Log in and access API keys: click your profile icon (top-right) and select "API Key".
3. Create a new key: click "Create API Key", assign it a name such as "dev-client", and generate it.
4. Copy and store the key securely: you can view it only once, so save it in an environment variable, config file, or secret manager.

<figure><img src="/files/MBWzQY36rpwHfUiq0jDR" alt=""><figcaption></figcaption></figure>

Note: Never expose your API key in public code or repositories. Treat it as a secret credential.

Once you have your API key, configure it in your script. This will be required in all requests sent to the Sahara Model Hub API.

Configure the HTTP headers with your API key:

```
API_KEY = "your-api-key"
HEADERS = {
    "Accept": "application/json",
    "x-api-key": API_KEY,
}
```

Replace "your-api-key" with the key you obtained from the Developer Portal.
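Rather than hard-coding the key, you can read it from the environment at startup. The sketch below is one way to do that; the environment variable name `SAHARA_API_KEY` and the helper names `load_api_key` and `build_headers` are assumptions made for illustration, not names defined by Sahara.

```python
import os


def load_api_key(env_var: str = "SAHARA_API_KEY") -> str:
    """Read the key from an environment variable (the name is an assumption)."""
    return os.environ.get(env_var, "")


def build_headers(api_key: str) -> dict:
    """Return the headers used by the discovery endpoints in this guide."""
    if not api_key:
        raise ValueError("API key is empty; export it before calling the API")
    return {
        "Accept": "application/json",
        "x-api-key": api_key,
    }


# Usage (requires the variable to be exported in your shell):
# HEADERS = build_headers(load_api_key())
```

Failing early on an empty key avoids sending requests that can only be rejected.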

### Discover Available Models & Providers

The Sahara API allows you to dynamically explore available models and compute providers.

#### Get All Models

This command fetches all registered models across providers:

```
curl -s 'https://portal.saharalabs.ai/api/compute/models' \
  -H 'Accept: application/json' \
  -H 'x-api-key: your-api-key' | jq
```

#### Sample Response

```
[
  "llama-3-8b",
  "gpt-4o",
  "deepseek-ai/DeepSeek-V3",
  "deepseek-ai/DeepSeek-R1",
  "llama3-3-70b",
  "Qwen/Qwen2.5-72B-Instruct-Turbo",
  "meta-llama/Llama-3.3-70B-Instruct-Turbo",
  "Qwen/Qwen2.5-7B-Instruct-Turbo",
  "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
  "llama3-1-8b",
  "deepseek-ai/DeepSeek-V3-0324"
]
```
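The same discovery calls can be made from Python. The sketch below uses only the standard library; `build_url` and `fetch_json` are illustrative helper names, and the optional keyword filters correspond to the `provider` and `model` query parameters used by the curl commands in this guide. Percent-encoding the filter values is a precaution taken here, not a documented requirement of the API.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://portal.saharalabs.ai/api/compute"


def build_url(resource: str, **filters: str) -> str:
    """Build a discovery URL such as .../models?provider=predibase."""
    url = f"{BASE_URL}/{resource}"
    if filters:
        # urlencode escapes characters such as "/" in model names.
        url += "?" + urllib.parse.urlencode(filters)
    return url


def fetch_json(url: str, api_key: str, timeout: float = 30.0):
    """GET a URL with the x-api-key header and decode the JSON body."""
    request = urllib.request.Request(
        url,
        headers={"Accept": "application/json", "x-api-key": api_key},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real request, so it needs a valid key):
# models = fetch_json(build_url("models"), "your-api-key")
```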

### Get All Providers

This API lists all compute providers (e.g., OpenAI, Lepton, Together):

```
curl -s 'https://portal.saharalabs.ai/api/compute/providers' \
  -H 'Accept: application/json' \
  -H 'x-api-key: your-api-key'
```

#### Sample Response

```
["lepton","predibase","sagemaker","bedrock","openai","together"]
```

### Get Models by Provider

Query the models served by a specific provider:

{% code overflow="wrap" %}

```
curl -s 'https://portal.saharalabs.ai/api/compute/models?provider=predibase'   -H 'Accept: application/json'   -H 'x-api-key: your-api-key' | jq
```

{% endcode %}

#### Sample Response

```
[
  "llama-3-8b"
]
```

### Get Providers by Model

Find which providers serve a specific model. For example, to find the providers serving `deepseek-ai/DeepSeek-V3`:

{% code overflow="wrap" %}

```
curl -s 'https://portal.saharalabs.ai/api/compute/providers?model=deepseek-ai/DeepSeek-V3'   -H 'Accept: application/json'   -H 'x-api-key: your-api-key' | jq
```

{% endcode %}

#### Output

```
[
  "together"
]
```
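Combining the model and provider lookups gives a model-to-provider map, which is useful when deciding where to route a request. The sketch below takes a `fetch` callable (a hypothetical helper that performs the GET request and returns the parsed JSON list) so that the logic can be shown without a network call; `fake_fetch` stands in for it using values from the sample responses in this section.

```python
from typing import Callable, Dict, List
from urllib.parse import urlencode

BASE_URL = "https://portal.saharalabs.ai/api/compute"


def providers_by_model(fetch: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Map every model name to the list of providers that serve it."""
    catalog = {}
    for model in fetch(f"{BASE_URL}/models"):
        query = urlencode({"model": model})
        catalog[model] = fetch(f"{BASE_URL}/providers?{query}")
    return catalog


def fake_fetch(url: str) -> List[str]:
    """Stand-in for a real HTTP call, returning canned sample data."""
    if url.endswith("/models"):
        return ["deepseek-ai/DeepSeek-V3"]
    return ["together"]


catalog = providers_by_model(fake_fetch)
print(catalog)  # {'deepseek-ai/DeepSeek-V3': ['together']}
```

With a real `fetch`, this issues one request per model, so caching the result is worth considering if the catalog is read often.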

### Get Model Details

Fetch metadata and detailed usage requirements for a specific model-provider pair:

{% code overflow="wrap" %}

```
curl -s 'https://portal.saharalabs.ai/api/compute/modelDetail?model=deepseek-ai/DeepSeek-V3&provider=together'   -H 'Accept: application/json'   -H 'x-api-key: your-api-key' | jq
```

{% endcode %}

#### Sample Response

```
{
  "id": "1beec936-672e-4e63-9ef9-af721d0ed3e2",
  "name": "deepseek-ai/DeepSeek-V3",
  "description": "together AI deepseek-ai/DeepSeek-V3",
  "is_public": null,
  "license": null,
  "model_size": 0,
  "tags": null,
  "tensor_type": null
}
```
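Several fields in this response are `null`, so client code should not assume they are populated. Below is a minimal sketch of parsing the payload into a typed object. The `ModelDetail` class is illustrative, and its field types are inferred from the sample response, not from a published schema.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ModelDetail:
    id: str
    name: str
    description: Optional[str] = None
    is_public: Optional[bool] = None
    license: Optional[str] = None
    model_size: Optional[int] = 0
    tags: Optional[List[str]] = None
    tensor_type: Optional[str] = None

    @classmethod
    def from_json(cls, payload: dict) -> "ModelDetail":
        # Keep only the keys this sketch knows about, so extra fields do not break parsing.
        known_names = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known_names})


sample = {
    "id": "1beec936-672e-4e63-9ef9-af721d0ed3e2",
    "name": "deepseek-ai/DeepSeek-V3",
    "description": "together AI deepseek-ai/DeepSeek-V3",
    "is_public": None,
    "license": None,
    "model_size": 0,
    "tags": None,
    "tensor_type": None,
}

detail = ModelDetail.from_json(sample)
print(detail.name, detail.tags or [])  # treat a null tag list as empty
```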

### Model Inference by Raw HTTP Request

```python
import os
import requests

# Prefer an environment variable over a hard-coded key (the variable name is only a suggestion).
SAHARA_DEVPORTAL_API_KEY = os.environ.get("SAHARA_DEVPORTAL_API_KEY", "your-api-key")
MODEL_BASE_URL = "https://portal.saharalabs.ai/api/compute"

model_name = "gpt-4o"
model_provider = "openai"

url = f"{MODEL_BASE_URL}/chat/completions"
headers = {
    "Content-Type": "application/json",
    # Inference requests authenticate with a Bearer token.
    "Authorization": f"Bearer {SAHARA_DEVPORTAL_API_KEY}",
    # The compute provider is passed in the OpenAI-Organization header.
    "OpenAI-Organization": model_provider,
}
data = {
    "model": model_name,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}

response = requests.post(url, headers=headers, json=data, timeout=60)
print(response.json())
```

#### Sample Response

{% code overflow="wrap" %}

```
{'id': 'chatcmpl-BHOQRlHqSrfSMi3wFtOYzxVOZWufb', 'choices': [{'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'content': 'Hello! How can I assist you today?', 'refusal': None, 'role': 'assistant', 'audio': None, 'function_call': None, 'tool_calls': None, 'annotations': []}}], 'created': 1743485167, 'model': 'gpt-4o-2024-08-06', 'object': 'chat.completion', 'service_tier': 'default', 'system_fingerprint': 'fp_898ac29719', 'usage': {'completion_tokens': 10, 'prompt_tokens': 19, 'total_tokens': 29, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}}
```

{% endcode %}
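In practice you usually want only the assistant's text and the token counts, not the whole payload. The sketch below pulls both out of a response shaped like the sample above; `summarize_completion` is an illustrative helper name, and the dictionary is a trimmed copy of that sample.

```python
def summarize_completion(payload: dict) -> dict:
    """Extract the reply text, finish reason, and total token usage."""
    choices = payload.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    first = choices[0]
    usage = payload.get("usage") or {}
    return {
        "text": first["message"]["content"],
        "finish_reason": first.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


sample = {
    "id": "chatcmpl-BHOQRlHqSrfSMi3wFtOYzxVOZWufb",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "Hello! How can I assist you today?", "role": "assistant"},
        }
    ],
    "model": "gpt-4o-2024-08-06",
    "usage": {"completion_tokens": 10, "prompt_tokens": 19, "total_tokens": 29},
}

summary = summarize_completion(sample)
print(summary["text"])          # Hello! How can I assist you today?
print(summary["total_tokens"])  # 29
```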

### Model Inference by OpenAI SDK

If you prefer OpenAI's Python SDK, point it at the Sahara endpoint, which exposes an OpenAI-compatible API. The compute provider is passed as the client's `organization`.

#### Non-Streaming Response

```python
from openai import OpenAI

SAHARA_DEVPORTAL_API_KEY = "your-api-key"
MODEL_BASE_URL = "https://portal.saharalabs.ai/api/compute"

client = OpenAI(
    base_url=MODEL_BASE_URL,
    api_key=SAHARA_DEVPORTAL_API_KEY,
    organization="openai",  # compute provider
)
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello! Who are you man? Are you ok? Hey hey hey"},
    ],
)
print(completion.choices[0].message)
```

#### Sample Output

{% code overflow="wrap" %}

```
ChatCompletionMessage(content="Hello! I'm an AI assistant here to help you with any questions or information you need. How can I assist you today?", refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)
```

{% endcode %}

#### Streaming Response

```python
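import asyncio
import json

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

# This example streams through LangChain's ChatOpenAI wrapper
# (pip install langchain_openai).
SAHARA_DEVPORTAL_API_KEY = "your-api-key"
MODEL_BASE_URL = "https://portal.saharalabs.ai/api/compute"

# Model/provider pairs to test. This list is an assumption added so the script runs;
# its single entry matches the sample response below.
model_provider_combinations = [
    {"model_name": "gpt-4o", "model_provider": "openai"},
]


async def generate(model_name, model_provider):
    print(f"Testing Streaming Output of {model_name} on {model_provider}")
    chat = ChatOpenAI(
        model=model_name,
        api_key=SAHARA_DEVPORTAL_API_KEY,
        openai_api_base=MODEL_BASE_URL,
        organization=model_provider,
        streaming=True,
        extra_body={
            "compute_provider": "lepton"
        }
    )

    messages = [
        HumanMessage(content="Hello! How are you are you are you? Hey hey hey!")
    ]

    try:
        full_content = ""
        async for chunk in chat.astream(messages):
            if chunk.content:
                full_content += chunk.content
                # Print the text accumulated so far after every chunk.
                print(full_content)

        # Print the complete reply once the stream ends.
        print(full_content)

    except Exception as e:
        print(f"Streaming error: {e}")
        error_data = {"type": "error", "message": str(e)}
        print(f"data: {json.dumps(error_data)}\n\n")


async def main():
    # Only the first combination is exercised here.
    for combination in model_provider_combinations[:1]:
        await generate(combination["model_name"], combination["model_provider"])


if __name__ == '__main__':
    asyncio.run(main())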
```

#### Sample Response

```
Testing Streaming Output of gpt-4o on openai
Hello
Hello!
Hello! I'm
Hello! I'm here
Hello! I'm here and
Hello! I'm here and ready
Hello! I'm here and ready to
Hello! I'm here and ready to help
Hello! I'm here and ready to help.
Hello! I'm here and ready to help. What
Hello! I'm here and ready to help. What can
Hello! I'm here and ready to help. What can I
Hello! I'm here and ready to help. What can I do
Hello! I'm here and ready to help. What can I do for
Hello! I'm here and ready to help. What can I do for you
Hello! I'm here and ready to help. What can I do for you today
Hello! I'm here and ready to help. What can I do for you today?
Hello! I'm here and ready to help. What can I do for you today?
```
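Each line in this output is longer than the last because the streaming example prints the accumulated text after every chunk. The sketch below reproduces that behavior with plain strings and contrasts it with printing only the newly received piece. The chunk values are made up for illustration, and no API call is made.

```python
from typing import Iterable, Iterator


def accumulate(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the text received so far after each non-empty chunk."""
    full_content = ""
    for chunk in chunks:
        if chunk:  # skip empty chunks, as the streaming example does
            full_content += chunk
            yield full_content


chunks = ["Hello", "!", "", " I'm", " here"]

# Cumulative printing: one growing line per chunk.
for snapshot in accumulate(chunks):
    print(snapshot)

# Incremental printing: emit only the new text, all on a single line.
for chunk in chunks:
    print(chunk, end="", flush=True)
print()
```

Incremental printing is usually the better fit for a chat UI or terminal, while cumulative snapshots are convenient when each update replaces the previous one.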

### Model Inference Using LangChain

#### Prerequisites

Ensure the following package is installed before continuing:

{% code overflow="wrap" %}

```
pip install langchain_openai
```

{% endcode %}

`langchain_openai` is a Python library that integrates LangChain with OpenAI's API.

You can interact with Sahara models through the `langchain` interface. This is useful for testing streaming outputs and experimenting with conversational flows.\
\
Below is a non-streaming example that calls a single model (`gpt-4o` on the `openai` provider) and prints the error if the request fails:

```python
import json

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

SAHARA_DEVPORTAL_API_KEY = "your-api-key"
MODEL_BASE_URL = "https://portal.saharalabs.ai/api/compute"

model_name = "gpt-4o"
model_provider = "openai"

chat = ChatOpenAI(
    model=model_name,
    api_key=SAHARA_DEVPORTAL_API_KEY,
    openai_api_base=MODEL_BASE_URL,
    organization=model_provider,  # compute provider
    streaming=False,
)

messages = [
    HumanMessage(content="Hello! How are you?")
]


def generate():
    try:
        # invoke() blocks until the full response is available.
        res = chat.invoke(messages)
        print(res)

    except Exception as e:
        print(f"Request error: {e}")
        error_data = {"type": "error", "message": str(e)}
        print(f"data: {json.dumps(error_data)}\n\n")


if __name__ == '__main__':
    generate()
```

#### Sample Response

{% code overflow="wrap" %}

```
content="Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 13, 'total_tokens': 43, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_eb9dce56a8', 'finish_reason': 'stop', 'logprobs': None} id='run-427fd56e-853e-4cb4-9c29-8f48cccab9d6-0' usage_metadata={'input_tokens': 13, 'output_tokens': 30, 'total_tokens': 43, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}

```

{% endcode %}

### Multi-Agent Integration (OpenAI Agents SDK)

The Sahara API works with OpenAI's Agents SDK (the `openai-agents` package). This example sets up three agents:

1. A Spanish-speaking agent
2. An English-speaking agent
3. A triage agent that routes input based on language

#### Prerequisites

Ensure the following packages are installed before continuing:

{% code overflow="wrap" %}

```
pip install nest_asyncio
pip install "openai-agents @ git+https://github.com/openai/openai-agents-python.git"
```

{% endcode %}

* `openai-agents` is a Python SDK that provides a framework for building agents.
* `nest_asyncio` allows nested asyncio event loops, so `asyncio.run()` also works in environments that already run a loop, such as Jupyter notebooks.

```python
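import asyncio
import os

import nest_asyncio
from agents import Agent, OpenAIChatCompletionsModel, Runner
from openai import AsyncOpenAI

# Allow asyncio.run() inside environments that already run an event loop.
nest_asyncio.apply()


SAHARA_DEVPORTAL_API_KEY = 'your-api-key'
MODEL_BASE_URL = "https://portal.saharalabs.ai/api/compute"
# Assumption: the SDK's default OpenAI settings should also point at Sahara.
os.environ["OPENAI_BASE_URL"] = MODEL_BASE_URL
os.environ["OPENAI_API_KEY"] = SAHARA_DEVPORTAL_API_KEY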


client_openai = AsyncOpenAI(
   api_key=SAHARA_DEVPORTAL_API_KEY,
   base_url=MODEL_BASE_URL,
   organization="openai"
)


client_together = AsyncOpenAI(
   api_key=SAHARA_DEVPORTAL_API_KEY,
   base_url=MODEL_BASE_URL,
   organization="together"
)


spanish_agent = Agent(
   name="Spanish agent",
   instructions="You only speak Spanish. Your name is James",
   model=OpenAIChatCompletionsModel(
       model="deepseek-ai/DeepSeek-V3",
       openai_client=client_together,
   )
)


english_agent = Agent(
   name="English agent",
   instructions="You only speak English. Your name is Jesse",
   model=OpenAIChatCompletionsModel(
       model="deepseek-ai/DeepSeek-V3",
       openai_client=client_together
   ),
)

triage_agent = Agent(
   name="Triage agent",
   instructions="Handoff to the appropriate agent based on the language of the request.",
   handoffs=[spanish_agent, english_agent],
   model=OpenAIChatCompletionsModel(
       model="gpt-4o",
       openai_client=client_openai
   ),
)

async def main():
   result = await Runner.run(triage_agent, input="Hola, ¿Cómo te llamas?")
   print(result.final_output)


asyncio.run(main())
```

#### Sample Response

```
¡Hola! Me llamo James. ¿En qué puedo ayudarte hoy?
```

This example demonstrates language-based routing between agents, with each agent backed by an OpenAI-compatible model served from Sahara.

### Error Handling and Best Practices

**Error Codes**

* 400 Bad Request: Check request formatting.
* 404 Not Found: Verify pipeline or model IDs.
* 500 Internal Server Error: Retry or contact support.

**Best Practices**

1. Secure keys: use environment variables to store API keys securely.
2. Monitor usage: regularly review metrics to optimize performance.
3. Retry logic: implement retry logic for transient errors (e.g., 500 Internal Server Error).
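One way to implement the retry advice is exponential backoff around any callable that performs a request. This is a generic sketch, not a Sahara-provided utility: `TransientError`, `call_with_retry`, and the delay values are assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Raised by the caller's request function for retryable failures (e.g. HTTP 500)."""


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

The request function should raise `TransientError` only for failures worth retrying. A 400 or 404 indicates a problem with the request itself, so retrying it unchanged will not help.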

