Basic Voice Agent Example

This minimal example builds a complete voice AI agent with the Piopiy Voice AI Orchestrator.

It connects four services into one call pipeline:

  • Listens to user speech via Deepgram STT
  • Processes it with AI via OpenAI LLM
  • Responds with natural voice via Cartesia TTS
  • Supports real-time interruptions via Silero VAD
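To make the interruption behaviour in the list above concrete, here is a conceptual sketch of "barge-in" in plain asyncio: playback runs as a task and is cancelled when a voice-activity event arrives. This is not how the Piopiy SDK or Silero VAD is implemented; `speak`, `run_turn`, and the timer that stands in for a VAD event are hypothetical names used only for illustration.

```python
import asyncio
from typing import List, Optional, Tuple


async def speak(text: str, spoken: List[str], word_delay: float = 0.05) -> None:
    """Stand-in for TTS playback: 'plays' one word per word_delay seconds."""
    for word in text.split():
        spoken.append(word)
        await asyncio.sleep(word_delay)


async def run_turn(
    reply: str, user_speaks_after: Optional[float]
) -> Tuple[List[str], bool]:
    """Play a reply and cancel playback if the simulated VAD fires first."""
    spoken: List[str] = []
    playback = asyncio.create_task(speak(reply, spoken))
    interrupted = False

    if user_speaks_after is not None:
        # Simulated VAD event: the caller starts talking after this delay.
        await asyncio.sleep(user_speaks_after)
        if not playback.done():
            playback.cancel()  # stop speaking so the caller can be heard
            interrupted = True

    try:
        await playback
    except asyncio.CancelledError:
        pass  # expected when playback was interrupted
    return spoken, interrupted
```

Calling `asyncio.run(run_turn("a long reply ...", 0.06))` stops playback partway through, while passing `None` lets the whole reply finish. In the real agent this coordination is handled for you when interruptions are enabled in the script.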

Requirements

Install the SDK with the provider extras this example uses:

pip install "piopiy-ai[cartesia,deepgram,openai,silero]"

Environment Setup

Create a .env file or export these variables in your terminal:

# Piopiy Dashboard
AGENT_ID="your_agent_id"
AGENT_TOKEN="your_agent_token"

# Providers
OPENAI_API_KEY="your_openai_key"
DEEPGRAM_API_KEY="your_deepgram_key"
CARTESIA_API_KEY="your_cartesia_key"

# Optional
AGENT_DEBUG="true"
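A missing key usually surfaces only once a call comes in, so it can help to check the variables above at startup. The following is a minimal sketch using only the standard library; `REQUIRED_VARS`, `missing_env_vars`, and `require_env` are hypothetical helpers, not part of the Piopiy SDK.

```python
import os

# Variable names taken from the Environment Setup listing above.
REQUIRED_VARS = (
    "AGENT_ID",
    "AGENT_TOKEN",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not (env.get(name) or "").strip()]


def require_env(env=None):
    """Raise a RuntimeError that names every missing variable."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

If you use a .env file, call `require_env()` after `load_dotenv()` so the values from the file are already loaded.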

How to Run

Save the script below as basic.py and run it:

python basic.py

With the agent running, place a test call:

  1. Log in to the Piopiy Dashboard.
  2. Make sure you have purchased a Piopiy phone number and mapped it to your new AI Agent. (See the Dashboard Setup Guide for help.)
  3. Dial that number from your own phone to talk to the agent running on your machine.

Full Script

import asyncio
import os
from dotenv import load_dotenv

from piopiy.agent import Agent
from piopiy.voice_agent import VoiceAgent
from piopiy.services.deepgram.stt import DeepgramSTTService
from piopiy.services.openai.llm import OpenAILLMService
from piopiy.services.cartesia.tts import CartesiaTTSService

load_dotenv()

async def create_session(agent_id, call_id, from_number, to_number, metadata=None):
    print(f"📞 New Call Session: {call_id} from {from_number}")

    # 1. VoiceAgent Configuration
    voice_agent = VoiceAgent(
        instructions="You are a helpful AI assistant. The customer is calling for support.",
        greeting="Hello! How can I help you today?",
    )

    # 2. Speech-to-Text (STT) Setup
    stt = DeepgramSTTService(
        api_key=os.getenv("DEEPGRAM_API_KEY"),
        model="nova-2",
        language="en-US",
        smart_format=True,
        interim_results=True,
    )

    # 3. Large Language Model (LLM) Setup
    llm = OpenAILLMService(
        api_key=os.getenv("OPENAI_API_KEY"),
        model="gpt-4o-mini",
        temperature=0.7,
    )

    # 4. Text-to-Speech (TTS) Setup
    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
        voice_id="a0e99841-438c-4a64-b679-ae501e7d6091",  # British Lady
        model="sonic-english",
        sample_rate=24000,
    )

    # 5. Start the voice agent pipeline
    await voice_agent.Action(
        stt=stt,
        llm=llm,
        tts=tts,
        vad=True,
        allow_interruptions=True,
    )

    await voice_agent.start()

async def main():
    agent = Agent(
        agent_id=os.getenv("AGENT_ID"),
        agent_token=os.getenv("AGENT_TOKEN"),
        create_session=create_session,
        debug=True,
    )

    print("🚀 Agent starting...")
    print(" Waiting for calls...")

    await agent.connect()


if __name__ == "__main__":
    asyncio.run(main())
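The script passes debug=True directly, while Environment Setup lists an optional AGENT_DEBUG variable. Whether the SDK reads that variable on its own is not stated here, so the sketch below shows one way to wire the two together yourself. `env_flag` and `TRUTHY` are hypothetical names, not part of the Piopiy SDK.

```python
import os

# Strings treated as "on"; anything else counts as off.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name, default=False, env=None):
    """Read a boolean flag from the environment, ignoring case and whitespace."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY
```

Under that assumption, replacing `debug=True` with `debug=env_flag("AGENT_DEBUG")` in main() would let you toggle debug output without editing the script.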