Cohere provides production-ready language AI for enterprise teams. Command R+ leads on RAG benchmarks, making it ideal for grounded enterprise search. The Embed v3 model produces high-quality text embeddings for semantic search and retrieval. The Rerank endpoint improves search accuracy by reordering candidate results. All models have built-in tool use (function calling) and citation support for RAG pipelines.
https://api.cohere.com/v2
Auth type
Bearer Token
Auth header
Authorization: Bearer YOUR_COHERE_API_KEY
Rate limit
100 API calls/min (trial) · Higher on production
Pricing
Pay per use
Free quota
Trial key with limited free usage
Documentation
https://docs.cohere.com
Endpoint status
Server online — HTTP 401 — server is online but path returned an error (may require auth)1.05s
(checked Mar 29, 2026)
Builder score
C
64%
builder-friendly
Create an API key at dashboard.cohere.com. Pass it as a Bearer token in the Authorization header.
Authorization: Bearer YOUR_COHERE_API_KEY
| Plan | Price/mo | Included |
|---|---|---|
| Trial | Free | Limited trial usage |
| Production | Free | No minimum, pay per token/query |
Command R+: $2.50/M in, $10/M out. Command R: $0.15/M in, $0.60/M out. Command R7B: $0.0375/M in, $0.15/M out. Embed: $0.10/M tokens. Rerank: $2/1k queries.
| Method | Path | Description |
|---|---|---|
| POST | /chat |
Chat with Command R / R+ with optional tool use and RAG |
| POST | /embed |
Generate embeddings for semantic search and classification |
| POST | /rerank |
Rerank search results by relevance to a query |
| POST | /classify |
Classify text into predefined categories |
curl "https://api.cohere.com/v2/chat" \
-H "Authorization: Bearer $COHERE_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"command-r-plus","messages":[{"role":"user","content":"Summarize the benefits of RAG for enterprise AI."}]}'
{
"id": "abc123",
"message": {
"role": "assistant",
"content": [{"type":"text","text":"RAG (Retrieval-Augmented Generation) benefits include..."}]
},
"finish_reason": "COMPLETE",
"usage": { "billed_units": { "input_tokens": 24, "output_tokens": 112 } }
}
Data sourced from API Map. Always verify pricing and rate limits against the official Cohere documentation.