Google STT

The GoogleSTTService provides real-time speech recognition using Google Cloud's Speech-to-Text V2 API. It supports streaming audio, multi-language detection, and automatic reconnection to handle Google's 5-minute streaming limit.

Installation

To use Google STT, install the required dependencies:

pip install "piopiy-ai[google]"

Prerequisites

  • A Google Cloud Project with the Speech-to-Text API enabled.
  • Google Cloud credentials (a service account JSON key file).
  • Set the credentials environment variable:
    export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/service-account-file.json"
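
Before starting the service, it can help to confirm that the variable is set and points at a readable JSON file. The following is a minimal sketch using only the Python standard library; the helper name check_google_credentials is hypothetical and not part of Piopiy or any Google library.

```python
import json
import os
from pathlib import Path


def check_google_credentials(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> Path:
    """Return the credentials path if env_var points at a parseable JSON file.

    Hypothetical helper for illustration; it only inspects the local file
    and never contacts Google Cloud.
    """
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"{env_var} is not set")

    path = Path(value).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"{env_var} points to a missing file: {path}")

    # json.load raises json.JSONDecodeError (a ValueError) on invalid JSON.
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError(f"{path} does not contain a JSON object")
    return path
```

Calling check_google_credentials() at startup surfaces a missing or malformed key file as a clear local error rather than a failure at the first recognition request.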

Configuration

GoogleSTTService Parameters

Parameter        | Type        | Default       | Description
---------------- | ----------- | ------------- | --------------------------------------------
credentials      | str         | None          | JSON string of service account credentials.
credentials_path | str         | None          | Path to service account JSON file.
location         | str         | "global"      | Google Cloud location (e.g., "us-central1").
sample_rate      | int         | None          | Audio sample rate in Hz.
params           | InputParams | InputParams() | Advanced recognition settings.
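
The parameters above offer two ways to supply credentials: inline as a JSON string (credentials) or as a file path (credentials_path). The sketch below picks one of them and builds the matching keyword arguments. The helper credential_kwargs is hypothetical, and two of its behaviours are assumptions this page does not confirm: that the two parameters should not be combined, and that passing neither leaves the service to rely on GOOGLE_APPLICATION_CREDENTIALS.

```python
import json
from typing import Dict, Optional


def credential_kwargs(
    credentials_json: Optional[str] = None,
    credentials_path: Optional[str] = None,
) -> Dict[str, str]:
    """Build keyword arguments for exactly one credential source.

    Hypothetical helper; the either/or rule is an assumption, not
    documented Piopiy behaviour.
    """
    if credentials_json and credentials_path:
        raise ValueError("pass either credentials or credentials_path, not both")
    if credentials_json:
        json.loads(credentials_json)  # fail early if the string is not valid JSON
        return {"credentials": credentials_json}
    if credentials_path:
        return {"credentials_path": credentials_path}
    # Assumption: with neither set, the environment variable is used instead.
    return {}


# Intended use (requires piopiy-ai[google], so it is left commented out here):
# stt = GoogleSTTService(location="us-central1",
#                        **credential_kwargs(credentials_path="path/to/creds.json"))
```

Passing the JSON string can be convenient when the key comes from a secrets manager rather than a file on disk.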

InputParams

Parameter                    | Type           | Default       | Description
---------------------------- | -------------- | ------------- | ---------------------------------------
languages                    | List[Language] | [EN_US]       | Recognition languages. First is primary.
model                        | str            | "latest_long" | Speech recognition model to use.
enable_automatic_punctuation | bool           | True          | Add punctuation to transcripts.
enable_interim_results       | bool           | True          | Stream partial recognition results.
enable_voice_activity_events | bool           | False         | Detect voice activity in audio.

Usage

Basic Setup

from piopiy.services.google.stt import GoogleSTTService
from piopiy.transcriptions.language import Language

stt = GoogleSTTService(
    # Path to the service account JSON key file.
    credentials_path="path/to/creds.json",
    params=GoogleSTTService.InputParams(
        # The first language is the primary one; the rest are alternatives.
        languages=[Language.EN_US, Language.ES_ES],
        model="latest_long",
    ),
)

Notes

  • Streaming Limit: Google STT has a 5-minute limit per stream. Piopiy automatically handles reconnection to enable "endless streaming."
  • Model Choice: latest_long is recommended for general conversational use cases.
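
To illustrate the idea behind "endless streaming," the sketch below splits a continuous sequence of audio chunks into sessions that each stay under the 5-minute limit, so a new stream could be opened for each session. This is not Piopiy's actual reconnection code; the function split_into_sessions, the 10-second safety margin, and the fixed chunk duration are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List

STREAM_LIMIT_SECONDS = 5 * 60   # per-stream limit stated in the note above
SAFETY_MARGIN_SECONDS = 10      # hypothetical margin: rotate before the hard limit


def split_into_sessions(
    chunks: Iterable[bytes],
    chunk_seconds: float,
    max_session_seconds: float = STREAM_LIMIT_SECONDS - SAFETY_MARGIN_SECONDS,
) -> Iterator[List[bytes]]:
    """Group audio chunks into sessions no longer than max_session_seconds.

    Each yielded list stands for the audio sent over one stream; a real
    client would open a fresh stream at every session boundary.
    """
    session: List[bytes] = []
    elapsed = 0.0
    for chunk in chunks:
        # Start a new session if adding this chunk would exceed the budget.
        if session and elapsed + chunk_seconds > max_session_seconds:
            yield session
            session, elapsed = [], 0.0
        session.append(chunk)
        elapsed += chunk_seconds
    if session:
        yield session
```

With one-second chunks and the default budget of 290 seconds, 700 seconds of audio would be carried over three consecutive streams. A production client also has to decide what to do with a partial utterance at each boundary, which this sketch leaves out.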