Groq LLM

Overview

The GroqLLMService provides high-speed inference for Llama, Mixtral, and Gemma models using Groq's LPU™ technology. Because it exposes an OpenAI-compatible interface, it works as a drop-in option when low-latency conversational responses matter.

Installation

To use Groq LLM, install the required dependencies:

pip install "piopiy-ai[groq]"

Prerequisites

  • A Groq account and API key.
  • Set your API key in your environment:
    export GROQ_API_KEY="your_api_key_here"

Configuration

GroqLLMService Parameters

Parameter   Type   Default                            Description
api_key     str    Required                           Your Groq API key.
model       str    "llama-3.3-70b-versatile"          Model to use on Groq.
base_url    str    "https://api.groq.com/openai/v1"   API endpoint.
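
Only api_key is required; model and base_url fall back to the defaults above. As a rough sketch, base_url can be overridden to route requests through an OpenAI-compatible gateway or proxy (the URL below is a placeholder, not a real endpoint):

import os
from piopiy.services.groq.llm import GroqLLMService

# Point the service at a self-hosted OpenAI-compatible gateway instead of the
# default Groq endpoint. The gateway URL here is a placeholder.
llm = GroqLLMService(
    api_key=os.getenv("GROQ_API_KEY"),
    model="llama-3.3-70b-versatile",
    base_url="https://my-gateway.example.com/openai/v1",
)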

Usage

Basic Setup

import os
from piopiy.services.groq.llm import GroqLLMService

llm = GroqLLMService(
    api_key=os.getenv("GROQ_API_KEY"),
    model="llama-3.3-70b-versatile"
)
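
Under the hood, the service talks to Groq's OpenAI-compatible endpoint at the base_url above. As an illustration only (you do not need this in piopiy code), the same request can be made directly with the official openai Python client by pointing base_url at Groq; the prompt is just an example:

import os
from openai import OpenAI

# Illustrative only: call Groq's OpenAI-compatible endpoint directly.
client = OpenAI(
    api_key=os.getenv("GROQ_API_KEY"),
    base_url="https://api.groq.com/openai/v1",
)

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(response.choices[0].message.content)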

Notes

  • Speed: Groq is exceptionally fast, often reaching hundreds of tokens per second, which reduces the "time-to-first-word" for your voice agent.
  • Model Compatibility: Any model available on Groq can be used by simply updating the model parameter.
  • OpenAI Interface: Since it inherits from OpenAILLMService, all standard OpenAI-style parameters and tool calling are supported.
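
Because the service follows the OpenAI interface, tools are described with the standard OpenAI function-calling schema. A minimal sketch of such a tool definition is below; how the list and its handler are wired into a piopiy pipeline depends on the framework's own API, so treat that wiring as an assumption and check the piopiy reference:

# Standard OpenAI-style tool definition; the tool name and parameters are examples.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. Berlin"},
                },
                "required": ["city"],
            },
        },
    }
]
# How this list is attached to the service (for example via the LLM context or a
# register_function-style hook) is framework-specific; consult the piopiy docs.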