# OpenAI Realtime

## Overview
The OpenAIRealtimeLLMService implements OpenAI's low-latency WebSocket API for bidirectional audio and text. Unlike standard LLM services, it handles audio input and output directly, significantly reducing conversational latency by bypassing the traditional speech-to-text (STT) and text-to-speech (TTS) stages of the pipeline.
## Installation
To use OpenAI Realtime, install the required dependencies:

```shell
pip install "piopiy-ai[openai]"
```
## Prerequisites
- An OpenAI API key with access to Realtime models (available from the OpenAI platform dashboard).
- Set your API key in your environment:

```shell
export OPENAI_API_KEY="your_api_key_here"
```
## Configuration

### OpenAIRealtimeLLMService Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | Required | Your OpenAI API key. |
| model | str | "gpt-4o-realtime-preview" | Realtime model ID. |
| base_url | str | "wss://api.openai.com/v1/realtime" | WebSocket endpoint. |
| session_properties | SessionProperties | None | Initial session configuration. |
### SessionProperties
Bundles settings for voice selection, output modalities, turn detection, and system instructions.
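Under the hood, these settings correspond to OpenAI's Realtime `session.update` event. The sketch below shows what such a payload looks like; the field names follow OpenAI's Realtime API, but the specific values chosen, and exactly how `SessionProperties` maps onto them, are illustrative assumptions.

```python
import json

# Illustrative session configuration payload (OpenAI Realtime "session.update").
# How piopiy-ai's SessionProperties serializes to this event is an assumption.
session_update = {
    "type": "session.update",
    "session": {
        "voice": "alloy",                 # one of the built-in Realtime voices
        "modalities": ["audio", "text"],  # what the model may produce
        "instructions": "You are a concise voice assistant.",
        "turn_detection": {               # server-side voice activity detection
            "type": "server_vad",
            "threshold": 0.5,
            "silence_duration_ms": 500,
        },
    },
}

payload = json.dumps(session_update)
```

A payload like this is typically sent once over the WebSocket right after the session opens, before any audio is streamed.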
## Usage

### Basic Setup
```python
import os

from piopiy.services.openai.realtime.llm import OpenAIRealtimeLLMService

llm = OpenAIRealtimeLLMService(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4o-realtime-preview",
)
```
## Notes
- End-to-End Latency: Because a single persistent WebSocket carries both listening and speaking, the service minimizes "Time to First Word" compared with a chained STT → LLM → TTS pipeline.
- Modalities: You can configure the model to output audio, text, or both.
- Interruption: Built-in support for server-side turn detection and interruption handling.
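For per-turn control of modalities, OpenAI's Realtime API accepts an override on the `response.create` event. The sketch below builds a text-only response request; the event and field names follow OpenAI's Realtime API, but whether piopiy-ai exposes this knob directly is an assumption.

```python
import json

# Illustrative per-response override (OpenAI Realtime "response.create"):
# ask the model to answer this turn in text only, suppressing audio output.
text_only_response = {
    "type": "response.create",
    "response": {"modalities": ["text"]},
}

event = json.dumps(text_only_response)
```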