Groq LLM
Overview
The GroqLLMService provides high-speed inference for Llama, Mixtral, and Gemma models using Groq's LPU™ technology. It is compatible with the OpenAI API, making it easy to drop in for low-latency conversational needs.
Installation
To use Groq LLM, install the required dependencies:
pip install "piopiy-ai[groq]"
Prerequisites
- A Groq account and API key.
- Set your API key in your environment:
export GROQ_API_KEY="your_api_key_here"
Configuration
GroqLLMService Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | Required | Your Groq API key. |
| model | str | "llama-3.3-70b-versatile" | Model to use on Groq. |
| base_url | str | "https://api.groq.com/openai/v1" | API endpoint. |
Usage
Basic Setup
import os
from piopiy.services.groq.llm import GroqLLMService
llm = GroqLLMService(
    api_key=os.getenv("GROQ_API_KEY"),
    model="llama-3.3-70b-versatile",
)
Notes
- Speed: Groq is exceptionally fast, often reaching hundreds of tokens per second, which reduces the "time-to-first-word" for your voice agent.
- Model Compatibility: Any model available on Groq can be used by simply updating the model parameter.
- OpenAI Interface: Since it inherits from OpenAILLMService, all standard OpenAI-style parameters and tool calling are supported.
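Because tool calling follows the OpenAI convention, a tool is declared as a JSON-schema function definition. A minimal sketch of one such definition (the weather function is purely illustrative, not part of the library):

```python
# An OpenAI-style tool definition; Groq models that support tool
# calling accept the same schema. The function itself is hypothetical.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'Austin'",
                },
            },
            "required": ["city"],
        },
    },
}
```

Definitions like this are passed in the request's tools list, exactly as with the OpenAI API; the model responds with a tool call naming the function and its arguments.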