Together AI provides cloud infrastructure for running open-source AI models at scale. The API is OpenAI-compatible and supports 200+ models, including Llama 3.1 405B, Qwen 2.5, Mistral, DeepSeek, and Stable Diffusion. Features include serverless inference (pay per token), dedicated GPU clusters, fine-tuning with LoRA, vision models, and function calling. It is popular for research, enterprise AI, and teams migrating from proprietary models.
Base URL
https://api.together.xyz/v1
Auth type
Bearer Token
Auth header
Authorization: Bearer YOUR_TOGETHER_API_KEY
Rate limit
60 requests/min (default) · Scales with plan
Pricing
from $0.10/mo
Free quota
$1 free credit on signup
Documentation
https://docs.together.ai
Endpoint status
Server online · HTTP 404 (server is online but the path returned an error; may require auth) · 1.09s
(checked Mar 29, 2026)
Builder score
B
66%
builder-friendly
Create an API key at api.together.ai. The endpoint is OpenAI-compatible — pass your key as a Bearer token in the Authorization header.
Authorization: Bearer YOUR_TOGETHER_API_KEY
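As a minimal sketch of the header above, the snippet below builds it in Python from an environment variable. The variable name `TOGETHER_API_KEY` follows the curl example on this page; the helper name `auth_headers` is hypothetical and not part of any Together AI SDK.

```python
import os


def auth_headers(api_key=None):
    """Return request headers carrying the Bearer token.

    Falls back to the TOGETHER_API_KEY environment variable when no key
    is passed explicitly. (Hypothetical helper, for illustration only.)
    """
    key = api_key or os.environ.get("TOGETHER_API_KEY")
    if not key:
        raise ValueError("Set TOGETHER_API_KEY or pass api_key explicitly")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; the same dictionary can be passed to whichever HTTP client you prefer.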
Prices per million tokens (two figures per model, presumably input · output): Llama 3.1 8B: $0.18/M · $0.18/M. Llama 3.1 70B: $0.88/M · $0.88/M. DeepSeek R1: $1.25/M · $1.25/M. Llama 4 Scout: available. Image generation: $0.008 per image. No free tier; minimum $5 credit purchase.
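To make the per-token pricing concrete, here is a rough cost estimate in Python. It assumes that "/M" means per million tokens and that the two figures listed for each model are the input and output rates, in that order; the listing does not state this explicitly. The function name and the price table are illustrative, so check current rates in the official documentation before relying on them.

```python
# USD per million tokens, copied from the listing above.
# Assumption: first figure = input rate, second figure = output rate.
PRICES_PER_MILLION = {
    "Llama 3.1 8B": (0.18, 0.18),
    "Llama 3.1 70B": (0.88, 0.88),
    "DeepSeek R1": (1.25, 1.25),
}


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimate the USD cost of one request from its token counts."""
    input_rate, output_rate = PRICES_PER_MILLION[model]
    total = prompt_tokens * input_rate + completion_tokens * output_rate
    return total / 1_000_000


# Token counts taken from the sample response on this page (20 in, 68 out).
cost = estimate_cost("Llama 3.1 70B", 20, 68)
print(f"Estimated cost: ${cost:.8f}")
```

At these rates a single short request costs a small fraction of a cent, so monthly spend depends almost entirely on volume.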
| Method | Path | Description |
|---|---|---|
| POST | /chat/completions | Chat completions (OpenAI-compatible) |
| POST | /completions | Text completions |
| POST | /embeddings | Generate text embeddings |
| POST | /images/generations | Image generation (Stable Diffusion, FLUX) |
| GET | /models | List all available models with pricing |
| POST | /fine-tunes | Start a fine-tuning job with LoRA |
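The endpoints listed here share one base URL and one auth scheme, so a single helper can build a request for any of them. The sketch below uses only the Python standard library and constructs the requests without sending them. The helper name `build_request` is hypothetical, and the payload fields mirror the curl example on this page rather than a complete schema.

```python
import json
import urllib.request

BASE_URL = "https://api.together.xyz/v1"  # base URL as listed on this page


def build_request(method, path, api_key, payload=None):
    """Build (but do not send) a request for one of the listed endpoints."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# GET /models takes no body.
models_req = build_request("GET", "/models", "YOUR_TOGETHER_API_KEY")

# POST /chat/completions takes a JSON body.
chat_req = build_request(
    "POST",
    "/chat/completions",
    "YOUR_TOGETHER_API_KEY",
    payload={
        "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 200,
    },
)

print(models_req.get_method(), models_req.full_url)
print(chat_req.get_method(), chat_req.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen` with a real key in place of the placeholder.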
curl "https://api.together.xyz/v1/chat/completions" \
-H "Authorization: Bearer $TOGETHER_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo","messages":[{"role":"user","content":"Write a Python function to parse JSON safely"}],"max_tokens":200}'
{
  "id": "890ab123",
  "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
  "choices": [{
    "message": { "role": "assistant", "content": "Here's a Python function to safely parse JSON:\n```python\nimport json\ndef safe_json_parse(data):\n    try:\n        return json.loads(data)\n    except json.JSONDecodeError:\n        return None\n```" },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 20, "completion_tokens": 68 }
}
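Once a response like the sample above has been decoded into a dictionary, the reply text and token counts can be read from it as sketched below. The function name `summarize_completion` is hypothetical, and the code assumes only the fields shown in the sample; real responses may carry additional fields.

```python
def summarize_completion(response):
    """Extract the assistant text, finish reason, and token total from a
    chat-completion response shaped like the sample on this page."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": prompt_tokens + completion_tokens,
    }


# Trimmed copy of the sample response (message content shortened).
sample = {
    "id": "890ab123",
    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    "choices": [{
        "message": {"role": "assistant", "content": "Here's a Python function..."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 20, "completion_tokens": 68},
}

summary = summarize_completion(sample)
print(summary["finish_reason"], summary["total_tokens"])  # stop 88
```

A `finish_reason` other than `"stop"` usually deserves a check before the text is used, for example when the reply was cut off by `max_tokens`.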
Data sourced from API Map. Always verify pricing and rate limits against the official Together AI documentation.