This project provides a complete Terraform infrastructure setup for benchmarking Generative AI models, specifically designed for Oracle Cloud Infrastructure (OCI). It automates the deployment of compute instances with pre-configured benchmarking tools and includes performance comparison capabilities between different AI platforms.
- Automated Infrastructure Deployment: Complete OCI setup with VCN, subnets, security lists, compute instances, and IAM policies
- Pre-configured Benchmarking Environment: Automatically installs GenAI-Bench and required dependencies
- Multiple Platform Support: Benchmarking scripts for both OCI GenAI and vLLM platforms
- Performance Visualization: Python scripts for generating comprehensive performance plots and metrics
- Flexible Instance Configuration: Support for various compute shapes including GPU instances
- Security Best Practices: Proper IAM setup with Instance Principal authentication
Before deploying this infrastructure, ensure you have:
- Oracle Cloud Infrastructure (OCI) Account with appropriate privileges
- Terraform installed (version 1.0+)
- OCI CLI configured with proper credentials
- Valid compartment OCID and tenancy OCID
- SSH access capabilities for instance management
The infrastructure creates:
- Virtual Cloud Network (VCN) with public and private subnets
- Internet Gateway and NAT Gateway for connectivity
- Security Lists with configurable port exposure
- Compute Instance with flexible shape configuration
- Dynamic Groups and IAM Policies for Instance Principal access
- Automated software installation via Ansible playbooks
Create a terraform.tfvars file with the following variables:
region = "sa-saopaulo-1" # Your preferred OCI region
compartment_ocid = "ocid1.compartment.oc1..your-compartment-id"
tenancy_ocid = "ocid1.tenancy.oc1..your-tenancy-id"
You can customize the deployment by modifying these variables in terraform.tfvars:
# Instance configuration
shape = "VM.Standard.E5.Flex" # Instance shape
ocpus = 32 # Number of OCPUs
memory_in_gbs = 64 # Memory allocation
boot_volume_size_in_gbs = 100 # Boot volume size
# Network configuration
exposed_ports = [22] # Ports to expose
# Image configuration
image_id = "ocid1.image.oc1.sa-saopaulo-1.aaaaaaaa7avt4eh5yycvdmzpenw45offnablkjduihvxhtxoesevvu76n2eq"
ssh_user = "opc"
- Clone the repository:
git clone https://github.com/speglich/GenAI-Benchmark-Infrastructure.git
cd GenAI-Benchmark-Infrastructure
- Initialize Terraform:
terraform init
- Review the deployment plan:
terraform plan
- Deploy the infrastructure:
terraform apply
- Access your instance:
ssh -i ./keys/<environment_name>_private_key.pem opc@<public_ip>
The instance comes pre-configured with benchmarking tools. You can run benchmarks using the provided scripts:
cd ~/benchmarks
./oci_benchmark.sh
This script benchmarks OCI's Generative AI service with various concurrency levels and traffic scenarios.
cd ~/benchmarks
./vllm_benchmark.sh
This script benchmarks vLLM deployments for comparison purposes.
After running benchmarks, use the Python plotting script to visualize results:
cd ~/benchmarks
sh generate_plots.sh
The plotting script supports:
- Multiple platform comparisons
- Various performance metrics (latency, throughput, error rates)
- Customizable visualizations
- CSV export for further analysis
- Time to First Token (TTFT)
- End-to-end Latency
- Output Throughput (tokens/second)
- Input Throughput (tokens/second)
- Requests per Second
- Error Rates
- Token Statistics
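To make the per-request metrics concrete, the sketch below derives them from request timestamps. The `RequestTiming` record and its field names are hypothetical, and the formulas are one common set of definitions, not necessarily the exact ones GenAI-Bench applies.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical per-request record; field names are illustrative only."""
    start: float         # time the request was sent, in seconds
    first_token: float   # time the first output token arrived
    end: float           # time the last output token arrived
    input_tokens: int
    output_tokens: int


def request_metrics(r: RequestTiming) -> dict:
    """Derive latency and throughput figures for a single request."""
    ttft = r.first_token - r.start   # Time to First Token
    e2e = r.end - r.start            # end-to-end latency
    decode_time = e2e - ttft         # time spent generating after the first token
    return {
        "ttft_s": ttft,
        "e2e_latency_s": e2e,
        # Assumed definition: prompt tokens processed per second of TTFT.
        "input_throughput_tps": r.input_tokens / ttft if ttft > 0 else 0.0,
        # Assumed definition: output tokens per second of decode time.
        "output_throughput_tps": r.output_tokens / decode_time if decode_time > 0 else 0.0,
    }


example = RequestTiming(start=0.0, first_token=0.5, end=2.5,
                        input_tokens=480, output_tokens=300)
print(request_metrics(example))
```

Requests per second and error rates are aggregate figures, so they would be computed over a whole run rather than per request.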
The benchmarks support various traffic patterns:
- Constant load: N(5000,0)/(50,0)
- Variable load: N(480,240)/(300,150)
- High throughput: N(2200,200)/(200,20)
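These strings appear to follow GenAI-Bench's `N(mean,stddev)/(mean,stddev)` notation, where the first pair describes input tokens and the second pair output tokens, each drawn from a normal distribution. Assuming that reading, a small parser makes the three scenarios easier to compare. `parse_scenario` and `Scenario` are hypothetical helpers, not part of GenAI-Bench.

```python
import re
from typing import NamedTuple


class Scenario(NamedTuple):
    input_mean: float
    input_std: float
    output_mean: float
    output_std: float


# Matches only the normal-distribution form used in this README.
_PATTERN = re.compile(r"^N\((\d+),(\d+)\)/\((\d+),(\d+)\)$")


def parse_scenario(text: str) -> Scenario:
    """Split a scenario string into its four numeric parameters."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized traffic scenario: {text!r}")
    return Scenario(*(float(group) for group in match.groups()))


for name in ("N(5000,0)/(50,0)", "N(480,240)/(300,150)", "N(2200,200)/(200,20)"):
    print(name, "->", parse_scenario(name))
```

Under this interpretation, a standard deviation of 0 (as in the constant-load scenario) means every request uses exactly 5000 input tokens and 50 output tokens.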
Tests are automatically run with multiple concurrency levels:
- 1, 2, 4, 8, 16, 32, 64, 128, 256 concurrent requests
- Create a new shell script in the benchmarks/ directory
- Follow the pattern of existing scripts (oci_benchmark.sh, vllm_benchmark.sh)
- Use the genai-bench command with appropriate parameters
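Before writing a new script, it can help to assemble the command line programmatically and inspect it. The sketch below only builds and prints the argument list; it does not run anything. The subcommand and flag names (`benchmark`, `--task`, `--traffic-scenario`, `--num-concurrency`) are assumptions about the GenAI-Bench CLI and should be checked against `genai-bench benchmark --help` and the existing scripts. `build_command` is a hypothetical helper.

```python
import shlex

# Values taken from the traffic patterns and concurrency levels listed in this README.
TRAFFIC_SCENARIOS = ["N(5000,0)/(50,0)", "N(480,240)/(300,150)", "N(2200,200)/(200,20)"]
CONCURRENCY_LEVELS = [1, 2, 4, 8, 16, 32, 64, 128, 256]


def build_command(extra_args: list[str]) -> list[str]:
    """Return an argv list; flag names are assumed, not confirmed by this README."""
    cmd = ["genai-bench", "benchmark", *extra_args]
    for scenario in TRAFFIC_SCENARIOS:
        cmd += ["--traffic-scenario", scenario]
    for level in CONCURRENCY_LEVELS:
        cmd += ["--num-concurrency", str(level)]
    return cmd


# shlex.join quotes the scenario strings so the line can be pasted into a shell script.
print(shlex.join(build_command(["--task", "text-to-text"])))
```

Backend-specific options (endpoint, model name, authentication) are deliberately left out here; copy those from oci_benchmark.sh or vllm_benchmark.sh.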
- Compute Resources: Adjust shape, ocpus, and memory_in_gbs in variables
- Network Security: Modify exposed_ports list for different service requirements
- Storage: Change boot_volume_size_in_gbs for additional disk space
- Regional Deployment: Update region and image_id for different OCI regions
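Because image OCIDs are region-specific, changing region without changing image_id is an easy mistake to make. The sketch below checks that the two agree by reading the region segment of the OCID, whose layout is `ocid1.<resource type>.<realm>.[region][.future use].<unique id>`. `ocid_region` and `image_matches_region` are hypothetical helpers, not part of this repository.

```python
import re

# Segment layout: ocid1.<type>.<realm>.[region][.future use].<unique id>
_OCID = re.compile(
    r"^ocid1\.(?P<type>[^.]+)\.(?P<realm>[^.]+)\.(?P<region>[^.]*)\.(?P<rest>.+)$"
)


def ocid_region(ocid: str) -> str:
    """Return the region segment of an OCID (empty for non-regional resources)."""
    match = _OCID.match(ocid)
    if match is None:
        raise ValueError(f"not a well-formed OCID: {ocid!r}")
    return match.group("region")


def image_matches_region(image_id: str, region: str) -> bool:
    """True when the image OCID's region segment equals the configured region."""
    return ocid_region(image_id) == region


image_id = "ocid1.image.oc1.sa-saopaulo-1.aaaaaaaa7avt4eh5yycvdmzpenw45offnablkjduihvxhtxoesevvu76n2eq"
print(image_matches_region(image_id, "sa-saopaulo-1"))
print(image_matches_region(image_id, "us-ashburn-1"))
```

Some older OCIDs carry a short region key such as `phx` instead of the full region identifier, so treat a mismatch as a prompt to double-check rather than a definite error.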
Modify ansible/install_genai_bench.yml to:
- Install additional software packages
- Configure custom benchmarking tools
- Set up monitoring or logging solutions
- SSH Keys: Automatically generated and stored in keys/ directory
- Instance Principal: Configured for secure OCI API access
- Network Security: Minimal port exposure with customizable security lists
- IAM Policies: Least-privilege access for required operations
The included plotting tools provide comprehensive performance analysis:
- Multi-platform Comparisons: Compare OCI GenAI vs vLLM performance
- Scalability Analysis: Understand performance characteristics across concurrency levels
- Bottleneck Identification: Identify performance limitations and optimal configurations
- Export Capabilities: CSV export for integration with other analysis tools
- Terraform Apply Fails:
  - Verify OCI credentials and permissions
  - Check compartment and tenancy OCIDs
  - Ensure sufficient quota for chosen instance shape
- SSH Connection Issues:
  - Verify the security list allows SSH (port 22)
  - Check private key permissions (should be 600)
  - Confirm public IP assignment
- Benchmark Failures:
  - Verify Instance Principal configuration
  - Check OCI GenAI service availability in your region
  - Validate API endpoints and model names
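The private key permission check can be scripted. This is a minimal sketch for POSIX systems; `key_mode` and `key_is_private` are hypothetical helpers, and the key path in the usage note is a placeholder to replace with your own file.

```python
import os
import stat


def key_mode(path: str) -> int:
    """Return the permission bits of a file, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)


def key_is_private(path: str) -> bool:
    """True when neither group nor others have any access to the key file."""
    return (key_mode(path) & 0o077) == 0


def report(path: str) -> str:
    """Produce a one-line summary suitable for printing."""
    status = "ok" if key_is_private(path) else "too open, run: chmod 600 " + path
    return f"{path}: mode {key_mode(path):o} ({status})"
```

For example, `print(report("./keys/<environment_name>_private_key.pem"))` with the placeholder filled in shows the current mode and whether SSH is likely to reject the key.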
- Terraform Logs: Use TF_LOG=DEBUG terraform apply for detailed logging
- Ansible Logs: Check /var/log/messages on the instance for Ansible execution details
- Benchmark Logs: Review output files in ~/benchmarks/results/ directory
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add appropriate tests
- Submit a pull request
This project is provided as-is for educational and benchmarking purposes. Please ensure compliance with Oracle Cloud Infrastructure terms of service and applicable software licenses.
For issues and questions:
- Check OCI documentation for service-specific issues
- Consult Terraform and Ansible documentation for infrastructure problems
Note: This infrastructure setup is designed for benchmarking and testing purposes. For production deployments, additional security hardening, monitoring, and backup strategies should be implemented.