NVIDIA NIM

Overview

The NvidiaLLMService provides access to NVIDIA Inference Microservices (NIM). It supports a wide variety of state-of-the-art models optimized for NVIDIA hardware.

Installation

pip install piopiy-ai

Prerequisites

  • An NVIDIA API key, available from the NVIDIA API Catalog (build.nvidia.com).
  • Set your API key in your environment:
    export NVIDIA_API_KEY="your_api_key_here"

Configuration

NvidiaLLMService Parameters

Parameter   Type   Default                                     Description
api_key     str    Required                                    Your NVIDIA API key.
model       str    "nvidia/llama-3.1-nemotron-70b-instruct"    Model identifier.
base_url    str    "https://integrate.api.nvidia.com/v1"       API endpoint.
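The table above can be read as "api_key is mandatory, everything else has a documented default that user arguments override." A minimal sketch of that resolution logic (the parameter names come from the table; the merge helper and its behavior are illustrative assumptions, not the library's actual implementation):

```python
# Documented defaults from the parameter table above.
DEFAULTS = {
    "model": "nvidia/llama-3.1-nemotron-70b-instruct",
    "base_url": "https://integrate.api.nvidia.com/v1",
}

def resolve_config(api_key, **overrides):
    """Merge user-supplied overrides with the documented defaults.

    Hypothetical helper for illustration: api_key is required,
    later overrides win over defaults.
    """
    if not api_key:
        raise ValueError("api_key is required (set NVIDIA_API_KEY)")
    return {"api_key": api_key, **DEFAULTS, **overrides}

# Overriding only the model keeps the default base_url.
config = resolve_config("nvapi-example", model="meta/llama-3.1-8b-instruct")
```

Here `config["base_url"]` stays at the default endpoint while `config["model"]` reflects the override.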

Usage

Basic Setup

import os
from piopiy.services.nvidia.llm import NvidiaLLMService

llm = NvidiaLLMService(
    api_key=os.getenv("NVIDIA_API_KEY"),
    model="nvidia/llama-3.1-nemotron-70b-instruct",
)
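The default base_url ("https://integrate.api.nvidia.com/v1") is an OpenAI-compatible endpoint, so under the hood requests follow the OpenAI chat-completions schema. A sketch of what such a request looks like; the helper below only builds the request (it does not send it), and its name and the exact fields the service sends are assumptions:

```python
import json

def build_chat_request(api_key, model, messages,
                       base_url="https://integrate.api.nvidia.com/v1"):
    """Build an OpenAI-style chat-completions request for the NIM endpoint.

    Hypothetical helper for illustration; returns the URL, headers, and
    JSON body without performing any network I/O.
    """
    url = f"{base_url}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",  # NIM uses bearer-token auth
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages, "stream": True})
    return url, headers, body

url, headers, body = build_chat_request(
    "nvapi-example",
    "nvidia/llama-3.1-nemotron-70b-instruct",
    [{"role": "user", "content": "Hello"}],
)
```

Because the endpoint is OpenAI-compatible, the same request shape works for any model listed in the NVIDIA API Catalog.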

Notes

  • Optimization: NIMs are highly optimized for NVIDIA GPUs, offering excellent throughput and latency.
  • Token Usage: Usage metrics are accumulated incrementally as streamed chunks arrive, matching NVIDIA's per-chunk reporting style.
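The incremental token accounting mentioned above can be sketched as an accumulator that sums per-chunk usage deltas; the class name and the chunk field names (`prompt_tokens`, `completion_tokens`) are illustrative assumptions, not the library's internals:

```python
class TokenUsageAccumulator:
    """Accumulate token usage reported incrementally across streamed chunks.

    Hypothetical sketch: each chunk's usage dict carries the tokens
    counted since the previous chunk, so totals are running sums.
    """

    def __init__(self):
        self.prompt_tokens = 0
        self.completion_tokens = 0

    def add_chunk_usage(self, usage):
        # Missing fields are treated as zero-delta for this chunk.
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)

    @property
    def total_tokens(self):
        return self.prompt_tokens + self.completion_tokens

acc = TokenUsageAccumulator()
acc.add_chunk_usage({"prompt_tokens": 12, "completion_tokens": 0})
acc.add_chunk_usage({"completion_tokens": 5})
acc.add_chunk_usage({"completion_tokens": 7})
```

After the three chunks above, the accumulator holds 12 prompt tokens and 12 completion tokens.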