OpenAI Realtime

Overview

The OpenAIRealtimeLLMService implements OpenAI's low-latency WebSocket API for bidirectional audio and text. Unlike standard LLM services, it handles audio input and output directly over a single connection, significantly reducing conversational latency by bypassing the separate STT → LLM → TTS pipeline.

Installation

To use OpenAI Realtime, install the required dependencies:

pip install "piopiy-ai[openai]"

Prerequisites

  • An OpenAI API key with access to Realtime models.
  • Set your API key in your environment:
    export OPENAI_API_KEY="your_api_key_here"

Configuration

OpenAIRealtimeLLMService Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` | Required | Your OpenAI API key. |
| `model` | `str` | `"gpt-4o-realtime-preview"` | Realtime model ID. |
| `base_url` | `str` | `"wss://api.openai.com/v1/realtime"` | WebSocket endpoint. |
| `session_properties` | `SessionProperties` | `None` | Initial session configuration. |

SessionProperties

Includes settings for voice selection, modalities, turn detection, and instructions.
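These settings ultimately map onto the Realtime API's session configuration payload. As a rough sketch, the dict below mirrors the fields the API accepts (modalities, voice, instructions, server-side turn detection); the exact attribute names on piopiy's SessionProperties class may differ, so treat this as an illustration of the underlying payload rather than the library's API.

```python
# Sketch of the session configuration the Realtime API expects; the
# SessionProperties object serializes into something of this shape.
session_config = {
    "modalities": ["audio", "text"],  # what the model may emit
    "voice": "alloy",                 # one of the built-in Realtime voices
    "instructions": "You are a concise, friendly voice assistant.",
    "turn_detection": {               # server-side VAD settings
        "type": "server_vad",
        "threshold": 0.5,
        "silence_duration_ms": 500,
    },
}
```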

Usage

Basic Setup

import os
from piopiy.services.openai.realtime.llm import OpenAIRealtimeLLMService

llm = OpenAIRealtimeLLMService(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4o-realtime-preview",
)

Notes

  • End-to-End Latency: Because a single persistent WebSocket handles both listening and speaking, this service minimizes "Time to First Word" compared to chained STT/LLM/TTS services.
  • Modalities: You can configure the model to output audio, text, or both.
  • Interruption: Built-in support for server-side turn detection and interruption handling.
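The modality switch described above is performed at the protocol level by sending a session.update event over the WebSocket. A minimal sketch of building that event (the service sends this for you when you change session properties; constructing it by hand is shown only for illustration):

```python
import json

def make_session_update(modalities):
    """Build the JSON "session.update" event the Realtime API uses
    to change which output modalities the model may produce."""
    return json.dumps({
        "type": "session.update",
        "session": {"modalities": modalities},
    })

# Restrict the model to text-only responses:
event = make_session_update(["text"])
```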