Transports
Available Transport Types
In voice AI architecture, the Transport is the layer responsible for managing the real-time ingress and egress of audio between the user and your conversational voice AI.
Unlike frameworks that require you to cobble together third-party media servers (like Daily or LiveKit) or write your own custom WebSocket parsers for Twilio/Telnyx, Piopiy provides a fully integrated, zero-configuration transport layer natively powered by TeleCMI.
1. TeleCMI Telephony (PSTN)
The primary transport used by Piopiy agents is the TeleCMI Telephony network. When you purchase a local or toll-free number on the Piopiy dashboard (available for India, USA, UK, and many others globally), all media routing is handled for you automatically.
- Zero setup: No need to configure STUN/TURN servers or WebSocket endpoints.
- Direct routing: Audio from the PSTN (cellular networks, landlines) is converted to high-fidelity digital streams and routed directly to your VoiceAgent.
- Global reach: Low-latency edge network ensures fast voice AI responses regardless of where the caller is located.
2. TeleCMI WebRTC
If you are building a web or mobile application, Piopiy supports native WebRTC connections through the exact same TeleCMI infrastructure.
- Browser native: Stream audio directly from Chrome, Safari, or mobile web views without installing plugins.
- Unified logic: The identical VoiceAgent code that handles your inbound phone calls will handle your web traffic interchangeably.
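To illustrate what "unified logic" means in practice, here is a minimal sketch of agent code that never branches on the transport type. It is not Piopiy's API: `Transport`, `TelephonyTransport`, `WebRTCTransport`, and `greet` are hypothetical names invented for this example, and the sample rates are illustrative values.

```python
from dataclasses import dataclass
from typing import Protocol


class Transport(Protocol):
    """Hypothetical interface: the only things the agent logic needs to know."""
    name: str
    sample_rate_hz: int


@dataclass
class TelephonyTransport:
    name: str = "telephony"
    sample_rate_hz: int = 8000   # narrowband phone audio (illustrative)


@dataclass
class WebRTCTransport:
    name: str = "webrtc"
    sample_rate_hz: int = 16000  # wideband audio (illustrative)


def greet(transport: Transport) -> str:
    """Agent logic written once; it works for any object matching Transport."""
    return f"Hello! (connected via {transport.name} at {transport.sample_rate_hz} Hz)"


if __name__ == "__main__":
    for t in (TelephonyTransport(), WebRTCTransport()):
        print(greet(t))
```

The point of the sketch is the shape of the design: the handler depends on a small interface rather than on a concrete transport, so the same function serves phone and web callers.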
Transport Input and Output
Under the hood, the transport layer acts as the bridge connecting the outside world to your STT and TTS engines within the Action pipeline.
- Input: The TeleCMI transport continuously ingests raw audio chunks from the caller and streams them to your configured Speech-to-Text (STT) service (e.g., Deepgram).
- Output: When your Text-to-Speech (TTS) service generates synthetic audio frames, they are pushed back to the TeleCMI transport. The transport packetizes this audio and plays it out over the phone line or WebRTC connection to the user.
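The input and output flow described above can be sketched with two queues standing in for the transport. This is a simplified illustration, not Piopiy's implementation: `fake_stt`, `fake_tts`, `run_turn`, and the `None` end-of-turn marker are all assumptions made for the example, and a real STT service would receive chunks continuously rather than one buffered turn at a time.

```python
import asyncio


async def fake_stt(chunks: list[bytes]) -> str:
    """Stand-in for an STT service: treats the audio bytes as UTF-8 text."""
    return b"".join(chunks).decode("utf-8")


async def fake_tts(text: str, frame_size: int = 4) -> list[bytes]:
    """Stand-in for a TTS service: slices the reply into fixed-size frames."""
    data = text.encode("utf-8")
    return [data[i:i + frame_size] for i in range(0, len(data), frame_size)]


async def run_turn(inbound: asyncio.Queue, outbound: asyncio.Queue) -> None:
    # Input side: collect caller audio until the (hypothetical) None marker.
    chunks = []
    while (chunk := await inbound.get()) is not None:
        chunks.append(chunk)
    transcript = await fake_stt(chunks)

    # Output side: push synthesized frames back toward the transport.
    for frame in await fake_tts(f"You said: {transcript}"):
        await outbound.put(frame)
    await outbound.put(None)  # marks the end of the reply


async def demo() -> str:
    inbound, outbound = asyncio.Queue(), asyncio.Queue()
    for chunk in (b"hel", b"lo", None):
        await inbound.put(chunk)
    await run_turn(inbound, outbound)

    frames = []
    while (frame := await outbound.get()) is not None:
        frames.append(frame)
    return b"".join(frames).decode("utf-8")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Queues are used here because they decouple the rate at which audio arrives from the rate at which it is consumed, which is the role the transport plays between the network and the STT and TTS engines.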
Telephony vs WebRTC Considerations
When deciding how users should connect to your voice AI, consider the following:
Telephony (Phone Numbers)
- Extreme Accessibility: Anyone with a phone can call your agent. No app downloads or web links required.
- Legacy Compatibility: Perfect for replacing existing IVR menus, customer support desks, or automated outbound dialing campaigns.
- Network Dependency: Audio quality is constrained by the narrowband audio of the standard telephone network (8 kHz sample rate).
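To put the 8 kHz constraint in numbers, the sketch below computes frame sizes and bitrates for narrowband telephone audio and for the wideband rates discussed under WebRTC. The assumptions are labeled in the code: a 20 ms frame (a common packetization interval), 8-bit G.711 samples for the phone network, and uncompressed 16-bit PCM for the wideband cases. None of these figures come from Piopiy's documentation.

```python
def frame_bytes(sample_rate_hz: int, bits_per_sample: int, frame_ms: int = 20) -> int:
    """Bytes of audio in one frame (frame_ms = 20 is an assumed interval)."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * bits_per_sample // 8


def bitrate_kbps(sample_rate_hz: int, bits_per_sample: int) -> float:
    """Raw bitrate in kilobits per second, before any network overhead."""
    return sample_rate_hz * bits_per_sample / 1000


if __name__ == "__main__":
    # (label, sample rate in Hz, bits per sample): assumed encodings
    formats = [
        ("PSTN, G.711 8-bit", 8000, 8),
        ("Wideband PCM 16-bit", 16000, 16),
        ("Wideband PCM 16-bit", 24000, 16),
    ]
    for label, rate, bits in formats:
        print(f"{label} @ {rate} Hz: "
              f"{frame_bytes(rate, bits)} bytes/frame, "
              f"{bitrate_kbps(rate, bits)} kbit/s")
```

Under these assumptions a phone call carries 160 bytes per 20 ms frame (64 kbit/s), while 16 kHz PCM carries 640 bytes per frame. An 8 kHz sample rate can represent frequencies only up to 4 kHz, which is why narrowband audio gives an STT service less to work with.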
WebRTC (In-App or Browser)
- High Fidelity: Supports wideband audio (16 kHz or 24 kHz), which gives the STT service more signal to work with and makes TTS playback sound more natural.
- Rich Context: WebRTC allows you to easily pass custom app-level metadata (like the user's logged-in profile ID or current shopping cart) directly to the agent upon connection.
- Visual Synchronization: Since the user is in an app, you can trigger UI changes (like showing text transcripts or flashing indicators) synchronously with the audio stream.
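The metadata and UI-synchronization points can be sketched as two small helpers: one that parses app-level context supplied at connection time, and one that builds a UI event tagged with a position in the audio stream. Everything here is hypothetical: the field names (`user_id`, `cart_items`, `audio_offset_ms`), the JSON shape, and the helper names are assumptions for illustration, not a documented Piopiy payload format.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SessionContext:
    """App-level context for one connection (hypothetical fields)."""
    user_id: str
    cart_items: list[str] = field(default_factory=list)


def parse_connect_metadata(raw: str) -> SessionContext:
    """Parse a JSON metadata string sent by the client when it connects."""
    data = json.loads(raw)
    return SessionContext(
        user_id=data["user_id"],
        cart_items=list(data.get("cart_items", [])),
    )


def transcript_event(text: str, audio_offset_ms: int) -> str:
    """Serialize a UI event keyed to an offset in the audio stream,
    so the client can show the text when that audio is played."""
    return json.dumps(
        {"type": "transcript", "text": text, "audio_offset_ms": audio_offset_ms}
    )


if __name__ == "__main__":
    ctx = parse_connect_metadata('{"user_id": "u-42", "cart_items": ["socks"]}')
    print(ctx)
    print(transcript_event("Your cart has one item.", 1200))
```

Tagging each UI event with an audio offset is one way to keep the display aligned with playback; the client decides when to render the event rather than showing it the moment it arrives.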
What's Next
- Speech Input & Turn Detection: Learn how Piopiy's Voice Activity Detection (VAD) handles interruptions in the transport stream.