Google STT
The GoogleSTTService provides real-time speech recognition using Google Cloud's Speech-to-Text V2 API. It supports streaming audio, multi-language detection, and automatic reconnection to handle Google's 5-minute streaming limit.
Installation
To use Google STT, install the required dependencies:
pip install "piopiy-ai[google]"
Prerequisites
- A Google Cloud Project with the Speech-to-Text API enabled.
- Google Cloud credentials (a service account JSON key file).
- Set the credentials environment variable:
export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/service-account-file.json"
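A quick preflight check can catch a missing or malformed key file before the service starts. The sketch below uses only the standard library; the helper name `check_service_account_file` and the list of fields it checks are choices made for this example and are not part of Piopiy.

```python
import json
import os

# Fields found in a Google Cloud service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_file(path):
    """Return a list of problems found with the key file (empty if none)."""
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        return [f"could not read JSON: {exc}"]
    if not isinstance(data, dict):
        return ["key file must contain a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if data.get("type") != "service_account":
        problems.append('"type" should be "service_account"')
    return problems


if __name__ == "__main__":
    key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    for problem in check_service_account_file(key_path):
        print("credentials problem:", problem)
```

An empty list means the file exists, parses as JSON, and looks like a service account key. It does not confirm that the key is still valid or that the Speech-to-Text API is enabled for the project.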
Configuration
GoogleSTTService Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| credentials | str | None | JSON string of service account credentials. |
| credentials_path | str | None | Path to service account JSON file. |
| location | str | "global" | Google Cloud location (e.g., "us-central1"). |
| sample_rate | int | None | Audio sample rate in Hz. |
| params | InputParams | InputParams() | Advanced recognition settings. |
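Since credentials can be supplied either as a JSON string or as a file path, a deployment may want to pick one based on its environment. The following is a minimal sketch of that choice. `GOOGLE_STT_CREDENTIALS_JSON` is a hypothetical variable name invented for this example, and the helper `google_stt_credential_kwargs` is not part of Piopiy.

```python
import os


def google_stt_credential_kwargs(env=None):
    """Pick credential keyword arguments for GoogleSTTService.

    GOOGLE_STT_CREDENTIALS_JSON is a hypothetical variable used only in this
    sketch; GOOGLE_APPLICATION_CREDENTIALS is the standard Google variable.
    """
    env = os.environ if env is None else env

    # Prefer an inline JSON string, e.g. one injected by a secrets manager.
    raw_json = env.get("GOOGLE_STT_CREDENTIALS_JSON")
    if raw_json:
        return {"credentials": raw_json}

    # Otherwise fall back to a key file path.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        return {"credentials_path": key_path}

    # Neither is set: pass no credential arguments at all.
    return {}
```

The result can then be unpacked into the constructor, for example `GoogleSTTService(**google_stt_credential_kwargs())`, alongside `location`, `sample_rate`, and `params` as needed.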
InputParams
| Parameter | Type | Default | Description |
|---|---|---|---|
| languages | List[Language] | [EN_US] | Recognition languages. First is primary. |
| model | str | "latest_long" | Speech recognition model to use. |
| enable_automatic_punctuation | bool | True | Add punctuation to transcripts. |
| enable_interim_results | bool | True | Stream partial recognition results. |
| enable_voice_activity_events | bool | False | Detect voice activity in audio. |
Usage
Basic Setup
from piopiy.services.google.stt import GoogleSTTService
from piopiy.transcriptions.language import Language

stt = GoogleSTTService(
    credentials_path="path/to/creds.json",
    params=GoogleSTTService.InputParams(
        # The first language in the list is the primary one.
        languages=[Language.EN_US, Language.ES_ES],
        model="latest_long",
    ),
)
Notes
- Streaming Limit: Google STT has a 5-minute limit per stream. Piopiy automatically handles reconnection to enable "endless streaming."
- Model Choice: "latest_long" is recommended for general conversational use cases.
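To illustrate the idea behind the streaming limit note, the sketch below shows one way a client could decide when to replace a stream before it reaches the 5-minute limit. It is not Piopiy's actual implementation, which this page does not describe. The `StreamClock` class and the 10-second safety margin are assumptions made for the example.

```python
import time

STREAM_LIMIT_SECONDS = 5 * 60  # per-stream limit described in the note above
SAFETY_MARGIN_SECONDS = 10     # hypothetical margin chosen for this sketch


class StreamClock:
    """Tracks the age of the current stream and signals when to reopen it."""

    def __init__(self, limit=STREAM_LIMIT_SECONDS, margin=SAFETY_MARGIN_SECONDS,
                 now=time.monotonic):
        self._max_age = limit - margin
        self._now = now  # injectable clock, which makes the class easy to test
        self._started_at = now()
        self.restarts = 0

    def should_restart(self):
        """Return True once the stream is old enough to be replaced."""
        return self._now() - self._started_at >= self._max_age

    def mark_restarted(self):
        """Record that a fresh stream has just been opened."""
        self._started_at = self._now()
        self.restarts += 1
```

A sender loop would call `should_restart()` before writing each audio chunk, open a new stream when it returns True, and then call `mark_restarted()`. Using a monotonic clock avoids surprises if the system time changes during a long call.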