Groq’s Language Processing Units (LPUs) deliver some of the fastest inference available. Llama 3.3 70B runs at hundreds of tokens per second, making it well suited to real-time chat and low-latency agentic workflows.

Supported Models

| Model | ID | Context | Max Output | Tools | Input $/1M | Output $/1M |
|---|---|---|---|---|---|---|
| Llama 3.3 70B | llama-3.3-70b-versatile | 128K | 32K | Yes | $0.59 | $0.79 |
| Llama 3.1 8B Instant | llama-3.1-8b-instant | 128K | 8K | Yes | $0.05 | $0.08 |
| Mixtral 8x7B | mixtral-8x7b-32768 | 32K | 8K | Yes | $0.24 | $0.24 |
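
The per-token prices above make cost estimates a one-liner. As a hypothetical example (the token counts are illustrative, not from the source), a request with 10K input tokens and 2K output tokens on Llama 3.3 70B:

```shell
# Estimated cost of a 10K-input / 2K-output call to Llama 3.3 70B,
# using the table prices: $0.59/1M input, $0.79/1M output
awk 'BEGIN { printf "$%.5f\n", (10000 / 1e6) * 0.59 + (2000 / 1e6) * 0.79 }'
# prints $0.00748
```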

Setup

1. Get an API key

   Sign up at console.groq.com. Free tier available.

2. Set the environment variable

   export GROQ_API_KEY=gsk_...

3. Verify

   profclaw doctor --provider groq

Environment Variables

GROQ_API_KEY (string, required): Your Groq API key. Format: gsk_...

Configuration Example

GROQ_API_KEY=gsk_...

Model Aliases

| Alias | Model |
|---|---|
| groq | llama-3.3-70b-versatile |
| groq-fast | llama-3.1-8b-instant |
| groq-mixtral | mixtral-8x7b-32768 |

Usage Examples

# Fast general purpose
profclaw chat --model groq "Explain this error message"

# Fastest (8B model)
profclaw chat --model groq-fast "One-line summary of this PR"

Notes

  • Groq is ranked 5th in auto-selection priority after Anthropic, OpenAI, Azure, and Google.
  • llama-3.1-8b-instant is one of the cheapest available models at $0.05/1M input tokens.
  • Groq’s free tier is generous but enforces daily rate limits.
  • The API is OpenAI-compatible; base URL: https://api.groq.com/openai/v1
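
Because the API is OpenAI-compatible, any OpenAI-style client can be pointed at the base URL above. A minimal sketch with curl, assuming the standard OpenAI chat-completions path (the model and prompt are illustrative):

```shell
# Direct request to Groq's OpenAI-compatible chat completions endpoint.
# Requires GROQ_API_KEY to be set in the environment.
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b-instant",
    "messages": [{"role": "user", "content": "Say hello in one word."}]
  }'
```

This bypasses profclaw entirely, which can be useful when isolating whether an issue lies in the CLI or in the provider.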