Advanced web search plugin using the Gemini CLI in headless mode with google_web_search tool restriction, providing caching, analytics, content extraction, and validation for Claude Code.
Important: This plugin uses the Gemini CLI with the google_web_search tool exclusively via headless mode (gemini -p with --yolo flag). It does NOT:
- Trigger Claude's internal web search functionality
- Use direct web scraping or crawling
- Bypass the Gemini CLI in any way
The plugin restricts the Gemini CLI to only use the google_web_search tool through the .gemini/settings.json configuration.
- Gemini CLI Headless Mode - Uses
gemini -pwith--yoloflag for automated web search - Tool Restriction -
.gemini/settings.jsonlimits Gemini to onlygoogle_web_searchtool - Grounded Results - All search results come from Google's web search via Gemini
- Subagent Architecture - Context isolation for 39% better token savings
- Smart Caching - 1-hour TTL with MD5 keying
- Auto-retry Logic - Exponential backoff on failures
- Dynamic Content Extraction - Extract and parse content from websites using Gemini
- False Positive Validation - Validate search results for relevance
- Comprehensive Logging - Detailed logging for debugging and monitoring
- 3 Slash Commands -
/search,/search-stats,/clear-cache - 2 Hooks - Error detection and pre-edit suggestions
- Complete Analytics - Track usage and token savings
- Production Ready - Error handling, logging, validation
- No Web Scraping - Zero direct HTTP requests or HTML parsing
Perform a web search using multiple search engines with smart caching and result validation.
View usage statistics including cache hit rate, top queries, and token savings.
Clear the search result cache and reset analytics data.
~/claude-plugins/gemini-search/
βββ .claude-plugin/
β βββ plugin.json # Plugin metadata
βββ agents/
β βββ gemini-search.md # Subagent (isolated context)
βββ skills/
β βββ web-search-patterns/
β βββ SKILL.md # Search best practices
βββ commands/
β βββ search.md # /search command
β βββ search-stats.md # /search-stats analytics
β βββ clear-cache.md # /clear-cache
βββ scripts/
β βββ search-wrapper.sh # Production wrapper with error handling
β βββ analytics.sh # Usage tracking and statistics
β βββ extract-content.sh # Dynamic content extraction from websites
βββ hooks/
β βββ hooks.json # Hook config
β βββ pre-edit-search.sh # Pre-edit suggestions with validation
β βββ error-search.sh # Auto error detection and handling
βββ README.md # Full documentation
- Extracts clean text content from web pages
- Removes HTML tags, scripts, and styling
- Validates content relevance to search query
- Handles multiple content sources with fallback methods
- Validates search results against original query
- Calculates relevance scores for each result
- Filters out irrelevant or low-quality content
- Provides warnings for potentially irrelevant results
- Retry logic with exponential backoff
- Multiple search engine fallbacks
- Network error handling and recovery
- Graceful degradation when services are unavailable
- Detailed logging of search operations
- Error logging with context information
- Performance metrics tracking
- Audit trail for compliance and debugging
-
Install Gemini CLI:
npm install -g @google/gemini-cli
-
Verify Installation:
gemini --version
-
Configure API Key (optional):
gemini config set apiKey YOUR_GOOGLE_AI_API_KEY
- Install the plugin in your Claude Code environment
- The
.gemini/settings.jsonfile is automatically used to restrict tools - Use
/search [your query]to perform web searches via Gemini - Check
/search-statsto monitor usage and cache effectiveness - Use
/clear-cacheif you need to reset cached results
The plugin can be configured through environment variables:
CACHE_TTL: Cache time-to-live in seconds (default: 3600)MAX_RETRIES: Number of retry attempts on failure (default: 3)RETRY_DELAY: Initial delay between retries in seconds (default: 1)BACKOFF_BASE: Exponential backoff base (default: 2)LOG_FILE: Path to main log file (default: /tmp/gemini-search.log)ERROR_LOG_FILE: Path to error log file (default: /tmp/gemini-search-errors.log)SUGGESTIONS_LIMIT: Number of suggestions to generate (default: 5)CONTEXT_WINDOW: Number of previous messages to consider (default: 10)TIMEOUT_SECONDS: Content extraction timeout (default: 15)MAX_CONTENT_SIZE: Maximum content size to process (default: 100000)
- Results are cached for 1 hour to reduce API calls
- Context isolation helps save tokens by isolating search operations
- Cache hit rate is tracked to measure effectiveness
- Error handling with exponential backoff ensures reliability
- Content extraction is optimized to focus on relevant information
- Relevance validation prevents false positive results
Performance Metrics:
- Cached searches: <1 second (97% faster)
- First searches: 8-20 seconds
- Cache hit rate: 60-85% (typical usage)
- Memory usage: <50MB
- Token savings: 39% via context isolation
For detailed benchmarks and optimization strategies, see docs/PERFORMANCE.md
Examples:
- Basic usage examples: examples/01-basic-searches.md
- Technical queries: examples/02-technical-queries.md
- Validation examples: examples/03-validation-examples.md
- Advanced usage: examples/04-advanced-usage.md
- If searches fail, check the error logs at the configured LOG_FILE location
- Use
/search-statsto see if cache hit rate is improving performance - If getting rate limited, consider extending the time between searches
- Use
/clear-cacheif cached results seem outdated - Check content extraction logs if web content isn't being parsed correctly
- Monitor validation warnings for potentially irrelevant results
- Content extraction is limited to prevent processing oversized responses
- URLs are validated to prevent malicious inputs
- All external requests are made with appropriate timeouts
- Error messages don't expose sensitive system information
MIT License - see LICENSE file for details.