Basic Voice Agent Example

This is the simplest complete voice AI agent you can build with the Piopiy Voice AI Orchestrator.

It connects four services:

Listens to user speech via Deepgram STT
Processes it with AI via OpenAI LLM
Responds with natural voice via Cartesia TTS
Supports real-time interruptions via Silero VAD
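To make the flow above concrete, here is a toy sketch of one conversation turn: speech is transcribed, a reply is generated, and playback is cancelled if the caller interrupts. It uses only the standard library. Every name in it (fake_stt, fake_llm, fake_tts, handle_turn) is a hypothetical stand-in, and it is not how the Piopiy SDK or the providers are implemented.

```python
import asyncio


async def fake_stt(audio: str) -> str:
    # Stand-in for speech-to-text: "transcribes" by normalizing the input.
    return audio.strip().lower()


async def fake_llm(text: str) -> str:
    # Stand-in for the language model: echoes the transcript.
    return f"You said: {text}"


async def fake_tts(reply: str, spoken: list) -> None:
    # Stand-in for text-to-speech: "speaks" one word at a time,
    # yielding between words so playback can be cancelled midway.
    for word in reply.split():
        spoken.append(word)
        await asyncio.sleep(0.01)


async def handle_turn(audio: str, spoken: list, interrupt_after=None) -> bool:
    """Run one STT -> LLM -> TTS turn; return False if playback was cut off."""
    text = await fake_stt(audio)
    reply = await fake_llm(text)
    playback = asyncio.create_task(fake_tts(reply, spoken))

    if interrupt_after is not None:
        # Simulates a voice-activity detector noticing the caller talking.
        await asyncio.sleep(interrupt_after)
        playback.cancel()

    try:
        await playback
    except asyncio.CancelledError:
        return False
    return True
```

Calling asyncio.run(handle_turn("Hello there", [])) plays the whole reply, while passing interrupt_after=0 stops playback after it has started. In the real agent, this coordination is handled for you by the orchestrator.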

Requirements

Install the SDK with the provider extras this example uses:

pip install "piopiy-ai[cartesia,deepgram,openai,silero]"

Environment Setup

Create a .env file or export these variables in your terminal:

# Piopiy Dashboard
AGENT_ID="your_agent_id"
AGENT_TOKEN="your_agent_token"

# Providers
OPENAI_API_KEY="your_openai_key"
DEEPGRAM_API_KEY="your_deepgram_key"
CARTESIA_API_KEY="your_cartesia_key"

# Optional
AGENT_DEBUG="true"
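Because os.getenv returns None for a variable that is not set, a missing key can surface later as a confusing provider error. A minimal sketch of an early check follows, using the variable names listed above. The helpers missing_env_vars and require_env are hypothetical and are not part of the Piopiy SDK.

```python
import os

# Variables the example script reads; AGENT_DEBUG is optional and not checked.
REQUIRED_VARS = (
    "AGENT_ID",
    "AGENT_TOKEN",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_env(env=None):
    """Raise RuntimeError naming every required variable that is missing."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

One way to use it is to call require_env() right after load_dotenv() in the script, so the process stops with a clear message before any call arrives.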

How to Run

Save the script below as basic.py and run it:

python basic.py

Log in to the Piopiy Dashboard.
Make sure you have purchased a Piopiy phone number and mapped it to your new AI Agent (see the Dashboard Setup Guide for help).
With the script still running, dial that phone number from your own phone to talk to your local agent.

Full Script

import asyncio
import os
from dotenv import load_dotenv

from piopiy.agent import Agent
from piopiy.voice_agent import VoiceAgent
from piopiy.services.deepgram.stt import DeepgramSTTService
from piopiy.services.openai.llm import OpenAILLMService
from piopiy.services.cartesia.tts import CartesiaTTSService

load_dotenv()

async def create_session(agent_id, call_id, from_number, to_number, metadata=None):
    """Build and start the voice pipeline for one incoming call."""
    print(f"📞 New Call Session: {call_id} from {from_number}")

    # 1. VoiceAgent Configuration
    voice_agent = VoiceAgent(
        instructions="You are a helpful AI assistant. The customer is calling for support.",
        greeting="Hello! How can I help you today?",
    )

    # 2. Speech-to-Text (STT) Setup
    stt = DeepgramSTTService(
        api_key=os.getenv("DEEPGRAM_API_KEY"),
        model="nova-2",
        language="en-US",
        smart_format=True,
        interim_results=True,
    )

    # 3. Large Language Model (LLM) Setup
    llm = OpenAILLMService(
        api_key=os.getenv("OPENAI_API_KEY"),
        model="gpt-4o-mini",
        temperature=0.7,
    )

    # 4. Text-to-Speech (TTS) Setup
    tts = CartesiaTTSService(
        api_key=os.getenv("CARTESIA_API_KEY"),
        voice_id="a0e99841-438c-4a64-b679-ae501e7d6091",  # British Lady
        model="sonic-english",
        sample_rate=24000,
    )

    # 5. Start the voice agent pipeline
    await voice_agent.Action(
        stt=stt,
        llm=llm,
        tts=tts,
        vad=True,
        allow_interruptions=True,
    )

    await voice_agent.start()

async def main():
    agent = Agent(
        agent_id=os.getenv("AGENT_ID"),
        agent_token=os.getenv("AGENT_TOKEN"),
        create_session=create_session,
        debug=True,
    )

    print("🚀 Agent starting...")
    print("   Waiting for calls...")

    await agent.connect()

if __name__ == "__main__":
    asyncio.run(main())
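The script above passes debug=True directly, while the environment setup lists AGENT_DEBUG as optional. If you would rather drive the flag from the environment, the sketch below shows one way to parse it. It assumes AGENT_DEBUG is meant to control the debug argument, which the source does not state, and the helper env_flag is hypothetical rather than part of the SDK.

```python
import os

# Values treated as "on"; anything else (or an unset variable) is "off".
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name, default=False, env=None):
    """Read a boolean flag from the environment, case-insensitively."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY
```

With this in place, the Agent call could use debug=env_flag("AGENT_DEBUG") instead of the hard-coded value.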
