Cerebras
Overview
The CerebrasLLMService integrates Cerebras's ultra-fast inference API, powered by their Wafer-Scale Engine (WSE). It is designed for applications that require the lowest possible latency with open-source models.
Installation
pip install piopiy-ai
Prerequisites
- A Cerebras API key.
- Set your API key in your environment:
export CEREBRAS_API_KEY="your_api_key_here"
Configuration
CerebrasLLMService Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | Required | Your Cerebras API key. |
| model | str | "llama3.1-8b" | Model identifier. |
| base_url | str | "https://api.cerebras.ai/v1" | API endpoint. |
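Cerebras exposes an OpenAI-compatible API, so these parameters map directly onto a standard chat-completions request. The helper below is an illustrative sketch (not part of piopiy) showing how the defaults combine into the URL, headers, and body the service would send:

```python
import os

def build_chat_request(model="llama3.1-8b",
                       base_url="https://api.cerebras.ai/v1",
                       api_key=None):
    """Assemble the URL, headers, and JSON body for an
    OpenAI-compatible chat completion against Cerebras."""
    api_key = api_key or os.getenv("CEREBRAS_API_KEY", "")
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": "Hello"}],
            "stream": True,  # stream tokens as they are generated
        },
    }

req = build_chat_request(api_key="sk-demo")
print(req["url"])  # https://api.cerebras.ai/v1/chat/completions
```

Overriding base_url changes only the endpoint prefix; the request shape stays the same.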
Usage
Basic Setup
import os
from piopiy.services.cerebras.llm import CerebrasLLMService
llm = CerebrasLLMService(
    api_key=os.getenv("CEREBRAS_API_KEY"),
    model="llama3.1-70b"
)
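Because os.getenv returns None when the variable is unset, a missing key only surfaces as an authentication error on the first API call. A small fail-fast guard at startup gives a clearer message; require_env is an illustrative helper, not part of piopiy:

```python
import os

def require_env(name):
    """Return the value of an environment variable,
    raising a clear error at startup if it is missing."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the app.")
    return value

# Example: fail fast here instead of on the first request.
# api_key = require_env("CEREBRAS_API_KEY")
```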
Notes
- Extreme Speed: Cerebras provides some of the highest token-per-second rates in the industry, making it exceptional for conversational AI.
- Model Support: Currently supports Llama 3.1 models; additional open-source models will become available as Cerebras adds them to its platform.
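To see why throughput matters for conversational latency, it helps to convert a tokens-per-second rate into generation time for a typical reply. The rate below is purely illustrative, not a measured Cerebras figure:

```python
def generation_time_ms(num_tokens, tokens_per_second):
    """Time (in milliseconds) to generate num_tokens at a given throughput."""
    return 1000 * num_tokens / tokens_per_second

# Illustrative: at 1,000 tokens/s, a 150-token spoken reply
# is generated in 150 ms, well within conversational turn-taking.
print(generation_time_ms(150, 1000))  # 150.0
```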