Replicate makes it easy to run machine learning models with a single API call. Access thousands of open-source models for image generation (SDXL, Flux, ControlNet), video generation (Stable Video Diffusion), audio (Whisper, MusicGen), language (Llama 3, Mistral), and specialized tasks like upscaling, background removal, and object detection. Deploy private models and fine-tunes too.
https://api.replicate.com/v1
Auth type
Bearer Token
Auth header
Authorization: Bearer r8_...
Rate limit
No hard limit (scales with your account tier)
Pricing
Pay per use
Free quota
No free tier (pay-as-you-go)
Documentation
https://replicate.com/docs
Endpoint status
Server online (HTTP 401: the server is reachable, but the path returned an error and may require auth). Response time: 650 ms
(checked Mar 29, 2026)
Builder score
B (68%, builder-friendly)
Create an API token at replicate.com/account/api-tokens. Pass it as a Bearer token in the Authorization header.
Authorization: Bearer r8_...
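The header above can be attached from Python roughly as follows. This is a minimal sketch using only the standard library, not an official client: `build_request` and `API_BASE` are hypothetical names chosen for illustration, and the token is assumed to come from the caller (for example from a `REPLICATE_API_TOKEN` environment variable).

```python
import json
import urllib.request

# Base URL as listed on this page.
API_BASE = "https://api.replicate.com/v1"


def build_request(path, token, payload=None):
    """Build an authenticated request object (hypothetical helper).

    Nothing is sent until the result is passed to urllib.request.urlopen().
    A JSON payload turns the request into a POST; without one it is a GET.
    """
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(API_BASE + path, data=data)
    # Bearer token in the Authorization header, as described above.
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request
```

Passing the result of `build_request("/models", token)` to `urllib.request.urlopen` would send the call; keeping construction separate from sending makes the header logic easy to check without network access.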
Billed by the second per hardware tier: CPU $0.000225/sec, Nvidia T4 $0.0012/sec, A40 $0.0023/sec, A100 $0.0115/sec. Image generation typically costs about $0.003–0.012 per image.
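Since billing is per second, a rough cost estimate is just the rate multiplied by the runtime. The sketch below hard-codes the per-second rates quoted on this page; `RATES_PER_SECOND` and `estimate_cost` are illustrative names, and the rates should be checked against current Replicate pricing before relying on them.

```python
# Per-second rates in USD, copied from the pricing line on this page.
# These may be out of date; treat them as assumptions.
RATES_PER_SECOND = {
    "cpu": 0.000225,
    "t4": 0.0012,
    "a40": 0.0023,
    "a100": 0.0115,
}


def estimate_cost(hardware, seconds):
    """Return the estimated charge in USD for `seconds` of runtime."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return RATES_PER_SECOND[hardware.lower()] * seconds
```

For example, `estimate_cost("a40", 4)` gives about $0.0092, which falls inside the per-image range quoted above.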
| Method | Path | Description |
|---|---|---|
| POST | /predictions | Run a model and create a prediction |
| GET | /predictions/{prediction_id} | Get prediction status and output |
| GET | /models | List available public models |
| GET | /models/{owner}/{model_name} | Get model details and latest version |
| POST | /models/{owner}/{model_name}/versions/{id}/predictions | Run a specific model version |
curl "https://api.replicate.com/v1/predictions" \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"version":"db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf","input":{"prompt":"A photo of a cat wearing a beret in Paris"}}'
{
  "id": "xyz789abc",
  "status": "starting",
  "model": "stability-ai/sdxl",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/xyz789abc",
    "cancel": "https://api.replicate.com/v1/predictions/xyz789abc/cancel"
  }
}
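A response with `"status": "starting"` means the output is not ready yet, so a client typically polls the `urls.get` address until the prediction finishes. The following is a minimal polling sketch under stated assumptions: the terminal status names (`succeeded`, `failed`, `canceled`) are not listed on this page and should be verified against the Replicate documentation, and `fetch_json` and `wait_for_prediction` are hypothetical helper names.

```python
import json
import time
import urllib.request

# Assumed terminal statuses; this page only shows "starting".
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def fetch_json(url, token):
    """GET a URL with the Bearer token and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_prediction(prediction, token, fetch=fetch_json,
                        interval=1.0, max_polls=60):
    """Poll prediction["urls"]["get"] until a terminal status is seen.

    `fetch` is injectable so the loop can be exercised without network access.
    Raises TimeoutError if no terminal status appears after `max_polls` fetches.
    """
    for _ in range(max_polls):
        if prediction["status"] in TERMINAL_STATUSES:
            return prediction
        time.sleep(interval)
        prediction = fetch(prediction["urls"]["get"], token)
    if prediction["status"] in TERMINAL_STATUSES:
        return prediction
    raise TimeoutError("prediction did not finish within the polling budget")
```

A fixed interval keeps the sketch short; a longer-running model might warrant a larger interval or exponential backoff to avoid unnecessary requests.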
Data sourced from API Map. Always verify pricing and rate limits against the official Replicate documentation.