Cartesia STT

The CartesiaSTTService provides low-latency speech recognition using Cartesia's Sonic intelligence platform. It is suited to conversational AI flows where both speed and accuracy matter.

Installation

To use Cartesia, install the required dependencies:

pip install "piopiy-ai[cartesia]"

Prerequisites

  • A Cartesia account and API key.
  • Set your API key in your environment:
    export CARTESIA_API_KEY="your_api_key_here"
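
Because `os.getenv` returns `None` when a variable is unset, a missing key may only surface later, once the service is in use. A minimal sketch of a fail-fast check follows; `require_api_key` is a hypothetical helper written for illustration and is not part of piopiy.

```python
import os


def require_api_key(name: str = "CARTESIA_API_KEY") -> str:
    """Return the API key stored in the environment variable `name`.

    Hypothetical helper for illustration; it is not part of piopiy.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        # Fail early with a clear message instead of passing None downstream.
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return key
```

The returned string can then be passed as `api_key` when constructing `CartesiaSTTService`.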

Configuration

CartesiaSTTService Parameters

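| Parameter    | Type                | Default           | Description                               |
|--------------|---------------------|-------------------|-------------------------------------------|
| api_key      | str                 | Required          | Your Cartesia API key.                    |
| base_url     | str                 | "api.cartesia.ai" | Custom Cartesia API base URL.             |
| sample_rate  | int                 | 16000             | Audio sample rate in Hz.                  |
| live_options | CartesiaLiveOptions | None              | Cartesia-specific transcription options.  |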

CartesiaLiveOptions Parameters

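| Option   | Type | Default       | Description                        |
|----------|------|---------------|------------------------------------|
| model    | str  | "ink-whisper" | The transcription model to use.    |
| language | str  | "en"          | Target language for transcription. |
| encoding | str  | "pcm_s16le"   | Audio encoding format.             |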
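
The default `encoding`, `pcm_s16le`, denotes raw signed 16-bit little-endian PCM. If an audio source yields floating-point samples in the range [-1.0, 1.0], they need converting before they match that format. A minimal sketch using only the standard library is shown here; `float_to_pcm_s16le` is a hypothetical helper, not part of piopiy.

```python
import struct
from typing import Iterable


def float_to_pcm_s16le(samples: Iterable[float]) -> bytes:
    """Convert floats in [-1.0, 1.0] to signed 16-bit little-endian PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))          # clip out-of-range samples
        ints.append(int(round(s * 32767)))  # scale to the int16 range
    # "<" selects little-endian, "h" is a signed 16-bit integer
    return struct.pack(f"<{len(ints)}h", *ints)
```

Each sample occupies two bytes, so the output length is always twice the number of input samples.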

Usage

Basic Setup

import os
from piopiy.services.cartesia.stt import CartesiaSTTService

stt = CartesiaSTTService(
    api_key=os.getenv("CARTESIA_API_KEY")
)

With Custom Options

import os
from piopiy.services.cartesia.stt import CartesiaSTTService, CartesiaLiveOptions

options = CartesiaLiveOptions(
    model="ink-whisper",
    language="en-US"
)

stt = CartesiaSTTService(
    api_key=os.getenv("CARTESIA_API_KEY"),
    live_options=options
)

Notes

  • Model Choice: ink-whisper is the default model and balances speed with robustness.
  • WebSocket Streaming: This service uses Cartesia's WebSocket API for real-time, bi-directional transcription streaming.
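
Streaming over a WebSocket usually means sending audio in small, fixed-duration chunks. With the documented defaults (16000 Hz and `pcm_s16le`, so 2 bytes per sample) and assuming mono audio, the chunk size follows from simple arithmetic. The sketch below is illustrative only: the 20 ms chunk duration and both helper names are assumptions, and the service may well handle framing internally.

```python
from typing import Iterator


def chunk_size_bytes(sample_rate: int = 16000, chunk_ms: int = 20,
                     bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Number of bytes in one chunk of raw PCM audio."""
    samples_per_chunk = sample_rate * chunk_ms // 1000
    return samples_per_chunk * bytes_per_sample * channels


def iter_chunks(pcm: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive `size`-byte slices of `pcm`; the last may be shorter."""
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


size = chunk_size_bytes()          # 16000 * 20 // 1000 * 2 = 640 bytes
one_second = bytes(16000 * 2)      # one second of silence at 16 kHz, 16-bit mono
chunks = list(iter_chunks(one_second, size))
print(size, len(chunks))           # 640 50
```

Shorter chunks reduce the delay before audio reaches the recognizer at the cost of more messages per second.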