API Documentation

Overview

This document provides a comprehensive guide to using the Sahara API. It walks you through discovering available models and compute providers, querying model metadata, and making inference requests using both raw HTTP and OpenAI-compatible Python clients. The API is especially useful for developers integrating multiple model providers into their workflow while maintaining a unified interface. You will learn how to:

  • Query all available models and compute providers

  • Filter models or providers using specific criteria

  • Access model usage details

  • Send inference requests through LangChain, the OpenAI SDK, or direct HTTP

  • Implement multi-agent logic with routing

Preparation

API Setup

To access the Sahara Model Hub API, you need a valid API key. This key is required to authenticate every API request.

How to Get Your API Key

1. Go to the Developer Portal

Open: app.saharaai.com/developer/api

2. Log In and Access API Keys

Navigate to the Main Page (top-left), then select "API Key" in the sidebar menu on the left side.

3. Create a New Key

Click "Create API Key", assign it a name like "dev-client", and generate it.

4. Copy and Store Securely

You can view your key anytime by clicking on the "eye" icon in the sidebar. Save it securely in an environment variable, config file, or secret manager.

Note: Never expose your API key in public code or repositories. Treat it as a secret credential.

Once you have your API key, configure it in your script. This will be required in all requests sent to the Sahara API.

Configure the HTTP header with your API_KEY:

Replace "your-api-key" with the key you obtained from the Developer Portal.
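A minimal sketch of this setup in Python. The environment-variable name `SAHARA_API_KEY` is an assumption; the bearer-token scheme is the convention used by OpenAI-compatible APIs.

```python
import os

# Read the key from the environment rather than hard-coding it in source.
API_KEY = os.environ.get("SAHARA_API_KEY", "your-api-key")

# Bearer-token header, the convention used by OpenAI-compatible APIs.
HEADERS = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```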

Discover Available Models & Providers

The Sahara API allows you to dynamically explore available models and compute providers.

Get All Models

This command fetches all registered models across providers:
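A sketch of such a request in Python. The base URL `https://api.saharaai.com/v1` and the `/models` path are illustrative assumptions; confirm the exact endpoint in the Developer Portal.

```python
import os

import requests  # third-party: pip install requests

BASE_URL = "https://api.saharaai.com/v1"  # illustrative; confirm in the Developer Portal
HEADERS = {"Authorization": f"Bearer {os.environ.get('SAHARA_API_KEY', 'your-api-key')}"}

def list_models() -> dict:
    """Fetch every model registered across all compute providers."""
    resp = requests.get(f"{BASE_URL}/models", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    for model in list_models().get("data", []):
        print(model["id"])
```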

Sample Response

Get All Providers

This API lists all compute providers (e.g., OpenAI, Lepton, Together):
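A corresponding Python sketch, assuming a `/providers` path on the same illustrative base URL:

```python
import os

import requests  # third-party: pip install requests

BASE_URL = "https://api.saharaai.com/v1"  # illustrative; confirm in the Developer Portal
HEADERS = {"Authorization": f"Bearer {os.environ.get('SAHARA_API_KEY', 'your-api-key')}"}

def list_providers() -> dict:
    """List all compute providers (e.g., OpenAI, Lepton, Together)."""
    resp = requests.get(f"{BASE_URL}/providers", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()
```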

Sample Response

Get Models by Provider

Query the models served by a specific provider:
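For example, in Python (the `provider` query parameter and the endpoint path are assumptions about how the filter is expressed):

```python
import os

import requests  # third-party: pip install requests

BASE_URL = "https://api.saharaai.com/v1"  # illustrative; confirm in the Developer Portal
HEADERS = {"Authorization": f"Bearer {os.environ.get('SAHARA_API_KEY', 'your-api-key')}"}

def models_by_provider(provider: str) -> dict:
    """Return only the models served by the given provider."""
    resp = requests.get(
        f"{BASE_URL}/models",
        headers=HEADERS,
        params={"provider": provider},  # assumed filter parameter
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```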

Sample Response

Get Providers by Model

Find which providers serve a specific model; for example, the providers currently serving deepseek-ai/DeepSeek-V3:
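A Python sketch of this lookup (the `model` query parameter and endpoint path are assumptions; confirm them in the Developer Portal):

```python
import os

import requests  # third-party: pip install requests

BASE_URL = "https://api.saharaai.com/v1"  # illustrative; confirm in the Developer Portal
HEADERS = {"Authorization": f"Bearer {os.environ.get('SAHARA_API_KEY', 'your-api-key')}"}

def providers_by_model(model_id: str) -> dict:
    """Return the providers currently serving the given model."""
    resp = requests.get(
        f"{BASE_URL}/providers",
        headers=HEADERS,
        params={"model": model_id},  # assumed filter parameter
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(providers_by_model("deepseek-ai/DeepSeek-V3"))
```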

Output

Get Model Details

Fetch metadata and detailed usage requirements for a specific model-provider pair:
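A sketch of such a lookup in Python; the path layout and the `provider` parameter are assumptions about how the model-provider pair is addressed:

```python
import os

import requests  # third-party: pip install requests

BASE_URL = "https://api.saharaai.com/v1"  # illustrative; confirm in the Developer Portal
HEADERS = {"Authorization": f"Bearer {os.environ.get('SAHARA_API_KEY', 'your-api-key')}"}

def model_details(model_id: str, provider: str) -> dict:
    """Fetch metadata and usage requirements for a model-provider pair."""
    resp = requests.get(
        f"{BASE_URL}/models/{model_id}",  # assumed path
        headers=HEADERS,
        params={"provider": provider},    # assumed parameter
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```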

Sample Response

Model Inference by Raw HTTP Request
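A raw-HTTP sketch in Python. Since the endpoint is OpenAI-compatible, a `/chat/completions` path is assumed; the base URL is illustrative.

```python
import os

import requests  # third-party: pip install requests

BASE_URL = "https://api.saharaai.com/v1"  # illustrative; confirm in the Developer Portal
HEADERS = {
    "Authorization": f"Bearer {os.environ.get('SAHARA_API_KEY', 'your-api-key')}",
    "Content-Type": "application/json",
}

def chat(model: str, prompt: str) -> str:
    """Send a single-turn chat completion request and return the reply text."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    resp = requests.post(
        f"{BASE_URL}/chat/completions", headers=HEADERS, json=payload, timeout=60
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("deepseek-ai/DeepSeek-V3", "Hello!"))
```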

Sample Response

Model Inference by OpenAI SDK

If you prefer to use OpenAI's SDK, the Sahara endpoint fully supports OpenAI-compatible APIs.

Non-Streaming Response

Sample Output

Streaming Response

Sample Response

Model Inference Using Langchain

Prerequisites

Ensure the following tools and packages are installed before continuing:

langchain_openai is a Python library that provides integration between LangChain and OpenAI’s API.

You can interact with Sahara models using the LangChain interface. This is useful for testing streaming outputs and experimenting with conversational flows. Below is an example using a working model to demonstrate both success and failure cases:

Sample Response

Multi-Agent Integration (OpenAI Agents SDK)

The Sahara API supports OpenAI's openai-agents package. This example sets up three agents:

  1. A Spanish-speaking agent

  2. An English-speaking agent

  3. A triage agent that routes input based on language

Prerequisites

Ensure the following tools and packages are installed before continuing

  • openai-agents is a Python SDK that provides an Agent Framework for building intelligent agents.

  • nest_asyncio allows you to run asynchronous code inside an environment that already has a running event loop (such as a Jupyter notebook)

Sample Response

This example demonstrates complex routing logic using OpenAI-compatible models served from Sahara.

Error Handling and Best Practices

Error Codes

  • 400 Bad Request: Check request formatting.

  • 404 Not Found: Verify pipeline or model IDs.

  • 500 Internal Server Error: Retry or contact support.

Best Practices

  1. Secure Keys:

  • Use environment variables to store API keys securely.

  2. Monitor Usage:

  • Regularly review metrics to optimize performance.

  3. Retry Logic:

  • Implement retry logic for transient errors (e.g., 500 Internal Server Error).
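One way to sketch such retry logic; the helper name and backoff values are illustrative:

```python
import time

def with_retry(call, retries=3, backoff=0.5):
    """Invoke `call` (a zero-argument function returning `(status, body)`),
    retrying transient 5xx responses with exponential backoff."""
    for attempt in range(retries):
        status, body = call()
        if status < 500:  # success, or a client error not worth retrying
            return status, body
        time.sleep(backoff * (2 ** attempt))
    return status, body  # give up after the final attempt
```

In practice `call` would wrap the actual HTTP request, so transient 500 Internal Server Errors are retried automatically while 4xx errors surface immediately.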
