
LLMbo - Large Language model batch operations

AWS Bedrock offers powerful capabilities for running batch inference jobs with large language models. However, orchestrating these jobs, managing inputs and outputs, and ensuring consistent result structures can be arduous. LLMbo aims to solve these problems by providing an intuitive, Pythonic interface for Bedrock batch operations.

Additionally, it provides a way to use batch inference for structured responses, taking inspiration from the likes of instructor, mirascope and PydanticAI. You define the desired output as a Pydantic model and llmbo takes care of the rest.

See the AWS documentation for models that support batch inference.

Currently the library has full support (including StructuredBatchInferer) for Anthropic and Mistral models. Other models may be supported through the default adapter, or you can write and register your own.
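
The exact adapter interface lives in the llmbo source; the sketch below is illustrative only. The class name, the method names, and the register_adapter hook are all assumed for this example, not confirmed llmbo API:

# Hypothetical sketch of a custom adapter. The method names and the
# register_adapter hook are assumed -- check the llmbo source for the
# real interface before relying on this.
from llmbo import ModelInput


class MyProviderAdapter:
    """Translates between llmbo's ModelInput and a provider's native schema."""

    def format_request(self, model_input: ModelInput) -> dict:
        # Map the generic ModelInput onto the provider's request body.
        return {"messages": model_input.messages}

    def parse_response(self, raw: dict) -> str:
        # Extract the completion text from the provider's response body.
        return raw["output"]["text"]


# Registration hook (assumed name): associate the adapter with a model id.
# register_adapter("my-provider.my-model-v1", MyProviderAdapter)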

Prerequisites

  • A .env file with an entry AWS_PROFILE=<your-profile> (see the sample below this list). This profile should have sufficient permissions to create and schedule a batch inference job. See the AWS instructions.
  • A service role with the required permissions to execute the job.
  • An S3 bucket to store the inputs and outputs for the job. The bucket must exist in the same region as you execute the job; this is a limitation of AWS batch inference rather than of this package.
    • Inputs will be written to {s3_bucket}/input/{job_name}.jsonl
    • Outputs will be written to {s3_bucket}/output/{job_id}/{job_name}.jsonl.out and {s3_bucket}/output/{job_id}/manifest.json.out
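
For illustration, a minimal .env might look like this (the profile name is a placeholder; use a profile configured in your ~/.aws/config):

# .env
AWS_PROFILE=my-bedrock-profile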

Install

pip install llmbo-bedrock

Getting started

Here's a quick example of how to use LLMbo:

BatchInferer

from llmbo import BatchInferer, ModelInput

bi = BatchInferer(
    model_name="anthropic.claude-v2",
    bucket_name="my-inference-bucket",  # S3 bucket for job inputs and outputs
    region="us-east-1",  # must match the bucket's region
    job_name="example-batch-job",
    role_arn="arn:aws:iam::123456789012:role/BedrockBatchRole",  # service role for the job
)

# Prepare your inputs using the ModelInput class. Each input needs a unique id:
# inputs is a Dict[str, ModelInput] keyed by that id.

inputs = {
    f"{i:03}": ModelInput(
        messages=[{"role": "user", "content": f"Question {i}"}]
    ) for i in range(100)
}


# Run the batch job. This prepares and uploads the inputs, creates the job,
# monitors its progress, and downloads the results.
results = bi.auto(inputs)

StructuredBatchInferer

For structured inference, simply define a Pydantic model and use StructuredBatchInferer:

from pydantic import BaseModel
from llmbo import StructuredBatchInferer

class ResponseSchema(BaseModel):
    answer: str
    confidence: float

sbi = StructuredBatchInferer(
    output_model=ResponseSchema,
    model_name="anthropic.claude-v2",
    # ... other parameters
)

# The results will be validated against ResponseSchema
structured_results = sbi.auto(inputs)
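
Each validated result is an instance of your Pydantic model. As a sketch of consuming them, assuming (mirroring the inputs dict) that results come back keyed by the input ids — check the return value of auto() for the exact shape:

# Assumes structured_results maps input ids to ResponseSchema instances;
# verify the actual return shape of auto() in the llmbo source.
for job_id, result in structured_results.items():
    print(job_id, result.answer, result.confidence)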

For more detailed examples, see the following:

  • examples/batch_inference_example.py: for an example of free text response
  • examples/structured_batch_inference_example.py: for an example of a structured response à la instructor
