Large Language Models (LLMs)
Overview
In Piopiy, the LLM is your agent's reasoning layer. It receives transcribed user speech from STT, combines it with your instructions and the session context, and streams response tokens to TTS for playback.
How Piopiy Handles the LLM
Inside VoiceAgent.Action(...), Piopiy places the LLM between context aggregation and TTS:
- STT provides user text.
- Piopiy appends turns to session context.
- LLM generates streaming response tokens.
- TTS converts those tokens into spoken audio.
The LLM is session-scoped, so each live call keeps isolated conversation state.
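The turn flow above can be sketched as a small loop. This is an illustrative model only, with hypothetical names (`SessionContext`, `handle_turn`); the real SDK wires these steps together internally inside `VoiceAgent.Action(...)`:

```python
from dataclasses import dataclass, field


@dataclass
class SessionContext:
    # Session-scoped history: each live call gets its own instance.
    turns: list = field(default_factory=list)

    def append(self, role: str, text: str) -> None:
        self.turns.append({"role": role, "content": text})


def handle_turn(context: SessionContext, user_text: str, llm, tts) -> str:
    """One conversational turn: STT text in, spoken reply out."""
    context.append("user", user_text)      # 1-2: STT output joins the context
    reply = ""
    for token in llm(context.turns):       # 3: LLM streams response tokens
        tts(token)                         # 4: TTS speaks tokens as they arrive
        reply += token
    context.append("assistant", reply)     # keep the turn in session memory
    return reply
```

Because each call owns its own `SessionContext`, concurrent calls never share conversation state.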
Supported LLM Providers (SDK)
The list below is based on Piopiy SDK LLM integrations.
Implementation Example
```python
import os

from piopiy.voice_agent import VoiceAgent
from piopiy.services.openai.llm import OpenAILLMService


async def on_new_session(agent_id, call_id, from_number, to_number, metadata=None):
    voice_agent = VoiceAgent(
        instructions="You are a concise voice assistant.",
        greeting="Hello! How can I help you?",
    )
    llm = OpenAILLMService(
        api_key=os.getenv("OPENAI_API_KEY"),
        model="gpt-4.1",
    )
    # initialize stt / tts as usual
    await voice_agent.Action(stt=stt, llm=llm, tts=tts)
    await voice_agent.start()
```
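Since the example reads `OPENAI_API_KEY` from the environment, it can help to fail fast when the variable is missing rather than erroring mid-call. This is a generic Python pattern, not a Piopiy-specific API (`require_env` is a hypothetical helper):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, failing fast if unset."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# e.g. api_key = require_env("OPENAI_API_KEY")  # raises before the session starts
```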
Best Practices
- Keep prompts short and voice-friendly.
- Prefer low-latency models for live calls.
- Use tool calling for precise backend actions rather than letting the model invent data.
- Keep a fallback provider for reliability.
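The fallback-provider practice can be sketched as a thin wrapper that retries a failed turn against a backup model. This is a hedged, illustrative pattern (`FallbackLLM` and `complete` are hypothetical names, not part of the Piopiy SDK):

```python
class FallbackLLM:
    """Route each turn to a primary LLM, falling back on provider failure."""

    def __init__(self, primary, fallback):
        self.primary = primary
        self.fallback = fallback

    def complete(self, messages):
        try:
            return self.primary(messages)
        except Exception:
            # Primary provider errored (outage, rate limit, timeout):
            # retry the same turn against the backup model.
            return self.fallback(messages)
```

In practice you would pick a fast primary model for latency and a second provider (or a smaller model) as the backup, accepting slightly degraded quality over a dropped call.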
What's Next
- Function Calling: Connect the LLM to real backend actions using tool schemas.
- Context Management: Learn how conversation memory is maintained per session.
- Telephony: Deploy LLM-driven flows to real phone sessions.