
Resolve Security Vulnerability & Implement Cost Optimizations #86

Open

Pravalika-Batchu wants to merge 4 commits into gemma-facet:main from Pravalika-Batchu:main

Conversation

@Pravalika-Batchu

Security Issue Fixed
Problem: Users had to submit API keys (Gemini & Hugging Face) in requests, creating a security vulnerability.

Solution:

Made the Gemini API key and HF token optional in all requests
Services now use GOOGLE_API_KEY and HF_TOKEN environment variables
No more API key exposure in request payloads
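A minimal sketch of the fallback pattern, assuming Pydantic request schemas (the field and helper names here are illustrative, not the exact code in this PR):

```python
import os
from typing import Optional

from pydantic import BaseModel


class PreprocessRequest(BaseModel):
    # Now optional: omitting the key no longer fails validation.
    gemini_api_key: Optional[str] = None


def resolve_gemini_key(request: PreprocessRequest) -> Optional[str]:
    # A request-supplied key still wins (backward compatibility);
    # otherwise fall back to the server-side GOOGLE_API_KEY env var.
    return request.gemini_api_key or os.environ.get("GOOGLE_API_KEY")
```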

Cost Optimization
Problem: Text similarity rewards made 2N API calls for N comparisons.

Solution: Switched to the batch embedding API, which embeds all texts in a single call per batch, for ~50% cost savings.
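For reference, a sketch of the batched pattern with the google-genai SDK; the diff uses `batch_embed_contents`, whose exact name and signature may vary by SDK version, so the SDK's batch-capable `embed_content` call and the `text-embedding-004` model stand in here as assumptions:

```python
import numpy as np
from google import genai

client = genai.Client()  # reads GOOGLE_API_KEY from the environment


def batch_similarity_rewards(completion_texts, reference_values):
    # One batched embedding call for all texts instead of 2N single calls.
    all_texts = completion_texts + reference_values
    result = client.models.embed_content(
        model="text-embedding-004", contents=all_texts
    )
    vectors = np.array([e.values for e in result.embeddings])
    n = len(completion_texts)
    comp, ref = vectors[:n], vectors[n:]
    # Cosine similarity between each completion and its reference.
    comp /= np.linalg.norm(comp, axis=1, keepdims=True)
    ref /= np.linalg.norm(ref, axis=1, keepdims=True)
    return (comp * ref).sum(axis=1).tolist()
```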

Local Development
Added support for running services locally without GCP credentials.

Key Changes
Preprocessing: Made Gemini API key optional, uses env var
Inference/Training: Made HF token optional, uses env var
Rewards: Implemented batch_embed_contents for cost savings
All Services: Added local testing mode
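A hypothetical sketch of what such a local testing mode could look like; `STORAGE_TYPE` is the variable named in this PR, while the storage classes, module paths, and `GCS_BUCKET` variable are invented for illustration:

```python
import os


def get_storage():
    # STORAGE_TYPE=local skips GCP credentials entirely for local testing.
    if os.environ.get("STORAGE_TYPE") == "local":
        from storage.local import LocalStorage  # hypothetical module
        return LocalStorage(base_path="./data")
    from storage.gcs import GCSStorage  # hypothetical module
    return GCSStorage(bucket=os.environ["GCS_BUCKET"])
```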

✅ Testing Verified
✅ Requests work without API keys
✅ Services use environment variables
✅ Batch API reduces costs
✅ Local development works
✅ Backward compatibility maintained
Environment Setup:

```
# Production
GOOGLE_API_KEY=your-gemini-key
HF_TOKEN=your-huggingface-token

# Local dev
STORAGE_TYPE=local
```

Impact
Security: Eliminated API key exposure
Cost: ~50% reduction in embedding costs
Dev Experience: Easy local testing

Closes: #84 - API key security vulnerability

Ready for review! 🚀

Test: `STORAGE_TYPE=local python app.py`, then `curl http://localhost:8080/health`

Security and cost optimization improvements

- Remove user-submitted API keys security vulnerability
- Use server-side GOOGLE_API_KEY environment variable
- Implement batch embedding API for 50% cost reduction
- Add local testing support without GCP credentials
- Update preprocessing service to handle optional API keys
- Modify text similarity rewards to use batch_embed_contents

Extend security fixes to Hugging Face authentication

- Remove user-submitted HF tokens from all request schemas
- Use HF_TOKEN environment variable server-side for authentication
- Make hf_token optional in InferenceRequest, BatchInferenceRequest, EvaluationRequest, TrainRequest
- Update login_hf function to use HF_TOKEN env var as fallback
- Update validation logic to allow server-side token usage
- Maintain backward compatibility while improving security

This eliminates another API key security vulnerability where users could submit Hugging Face tokens in requests.
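A sketch of the `login_hf` fallback described in the commit above; the function name comes from the commit message, while the body is an assumption built on `huggingface_hub.login`:

```python
import os
from typing import Optional

from huggingface_hub import login


def login_hf(hf_token: Optional[str] = None) -> None:
    # Prefer an explicitly provided token; fall back to the HF_TOKEN env var.
    token = hf_token or os.environ.get("HF_TOKEN")
    if token:
        login(token=token)
```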

```python
# Login to Hugging Face using provided token or HF_TOKEN env var
if train_request.hf_token:
    login(token=train_request.hf_token)
```
Contributor:
Hi, thanks for your interest in contributing! However, the proper way to do this would be using OAuth, which is described in #84. Using an env var means users cannot connect their own accounts, which is needed to push to and pull from the HF hub!

```python
completion_texts = [_get_completion_text(comp) for comp in completions]
all_texts = completion_texts + reference_values
# Batch embed all texts
batch_response = _genai_client.models.batch_embed_contents(
```
Contributor:

This does seem valid; we should batch all the calls to the LLM-as-judge as well. That would also improve efficiency, since batched calls are cheaper (by almost 50% IIRC).

