## Supported Models
| Model | ID | Context | Max Output | Tools | Notes |
|---|---|---|---|---|---|
| Llama 3.1 70B | llama3.1-70b | 128K | 8K | Yes | Fastest 70B available |
| Llama 3.1 8B | llama3.1-8b | 128K | 8K | Yes | Extreme speed |
## Setup

### Get API access

Sign up at inference.cerebras.ai. Access is currently limited.

### Environment Variables

Your Cerebras API key.
## Configuration Example

- .env
- settings.yml
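A minimal sketch of the `.env` variant. The variable name `CEREBRAS_API_KEY` is an assumption following common SDK conventions and is not confirmed by this page:

```shell
# .env (variable name CEREBRAS_API_KEY is assumed, not confirmed)
CEREBRAS_API_KEY=your-api-key-here
```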
## Model Aliases

| Alias | Model |
|---|---|
| cerebras | llama3.1-70b |
## Usage Examples
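Because the API is OpenAI-compatible, requests use the standard chat-completions shape. A stdlib-only sketch; the `build_request` helper and the `CEREBRAS_API_KEY` variable name are illustrative assumptions, not confirmed by this page:

```python
import json
import os
import urllib.request

API_URL = "https://api.cerebras.ai/v1/chat/completions"

def build_request(prompt: str, model: str = "llama3.1-70b"):
    """Assemble headers and JSON body for the OpenAI-compatible chat endpoint."""
    headers = {
        # CEREBRAS_API_KEY is an assumed variable name, not confirmed here.
        "Authorization": f"Bearer {os.environ.get('CEREBRAS_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return headers, body

def chat(prompt: str, model: str = "llama3.1-70b") -> str:
    """Send the request and return the assistant's reply text."""
    headers, body = build_request(prompt, model)
    req = urllib.request.Request(API_URL, data=body, headers=headers)
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

The same request works against any model ID from the table above, or via a configured alias such as `cerebras`.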
## Notes

- API endpoint: `https://api.cerebras.ai/v1` (OpenAI-compatible)
- Status: Experimental. Availability is hardware-specific and may be capacity-constrained.
- Best use cases: real-time streaming, bulk generation tasks, low-latency chat.
- Cerebras does not support vision or image inputs.
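The notes above call out real-time streaming as a primary use case. With an OpenAI-compatible endpoint, streamed responses typically arrive as server-sent events; a sketch of the client-side parsing, assuming the OpenAI `data: {...}` line convention (not confirmed for Cerebras by this page):

```python
import json

def iter_stream_content(lines):
    """Yield content deltas from OpenAI-style SSE lines ('data: {...}')."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

# Example with synthetic chunks (illustrative, not captured API output):
events = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: [DONE]',
]
print("".join(iter_stream_content(events)))  # prints "Hello"
```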
## Related
- AI Providers Overview - Compare all 37 supported providers
- Groq - LPU-based inference, another ultra-fast hardware provider
- SambaNova - High-throughput RDU-based inference
- profclaw provider - Add and test providers from the CLI