
# Pi Web Access

Web search, content extraction, and video understanding for the Pi coding agent. Zero config with Chrome, or bring your own API keys.



## Why Pi Web Access

**Zero Config** — Signed into Google in Chrome? That's it. The extension reads your Chrome session cookies to access Gemini directly. No API keys, no setup, no subscriptions.

**Video Understanding** — Point it at a YouTube video or local screen recording and ask questions about what's on screen. Full transcripts, visual descriptions, and frame extraction at exact timestamps.

**Smart Fallbacks** — Every capability has a fallback chain. Search tries Perplexity, then Gemini API, then Gemini Web. YouTube tries Gemini Web, then API, then Perplexity. Blocked pages retry through Jina Reader and Gemini extraction. Something always works.

**GitHub Cloning** — GitHub URLs are cloned locally instead of scraped. The agent gets real file contents and a local path to explore, not rendered HTML.

## Install

```bash
pi install npm:pi-web-access
```

If you're not signed into Chrome, or prefer a different provider, add API keys to `~/.pi/web-search.json`:

```json
{
  "perplexityApiKey": "pplx-...",
  "geminiApiKey": "AIza..."
}
```

You can configure one or both. In auto mode (default), web_search tries Perplexity first, then Gemini API, then Gemini Web.

Optional dependencies for video frame extraction:

```bash
brew install ffmpeg   # frame extraction, video thumbnails, local video duration
brew install yt-dlp   # YouTube stream URLs for frame extraction
```

Without these, video content analysis (transcripts, visual descriptions via Gemini) still works. The binaries are only needed for extracting individual frames as images.

Requires Pi v0.37.3+.

## Quick Start

```ts
// Search the web
web_search({ query: "TypeScript best practices 2025" })

// Fetch a page
fetch_content({ url: "https://docs.example.com/guide" })

// Clone a GitHub repo
fetch_content({ url: "https://github.com/owner/repo" })

// Understand a YouTube video
fetch_content({ url: "https://youtube.com/watch?v=abc", prompt: "What libraries are shown?" })

// Analyze a screen recording
fetch_content({ url: "/path/to/recording.mp4", prompt: "What error appears on screen?" })
```

## Tools

### web_search

Search the web via Perplexity AI or Gemini. Returns a synthesized answer with source citations.

```ts
web_search({ query: "rust async programming" })
web_search({ queries: ["query 1", "query 2"] })
web_search({ query: "latest news", numResults: 10, recencyFilter: "week" })
web_search({ query: "...", domainFilter: ["github.com"] })
web_search({ query: "...", provider: "gemini" })
web_search({ query: "...", includeContent: true })
```
| Parameter | Description |
| --- | --- |
| `query` / `queries` | Single query or batch of queries |
| `numResults` | Results per query (default: 5, max: 20) |
| `recencyFilter` | `day`, `week`, `month`, or `year` |
| `domainFilter` | Limit to domains (prefix with `-` to exclude) |
| `provider` | `auto` (default), `perplexity`, or `gemini` |
| `includeContent` | Fetch full page content from sources in the background |
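
As a quick illustration, the exclusion form of `domainFilter` looks like this (the query and domain are placeholders, not from the project's docs):

```ts
// Exclude a domain by prefixing it with "-", and keep results recent
web_search({ query: "node worker_threads pitfalls", domainFilter: ["-reddit.com"], recencyFilter: "month" })
```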

### fetch_content

Fetch URL(s) and extract readable content as markdown. Automatically detects and handles GitHub repos, YouTube videos, PDFs, local video files, and regular web pages.

```ts
fetch_content({ url: "https://example.com/article" })
fetch_content({ urls: ["url1", "url2", "url3"] })
fetch_content({ url: "https://github.com/owner/repo" })
fetch_content({ url: "https://youtube.com/watch?v=abc", prompt: "What libraries are shown?" })
fetch_content({ url: "/path/to/recording.mp4", prompt: "What error appears on screen?" })
fetch_content({ url: "https://youtube.com/watch?v=abc", timestamp: "23:41-25:00", frames: 4 })
```
| Parameter | Description |
| --- | --- |
| `url` / `urls` | Single URL/path or multiple URLs |
| `prompt` | Question to ask about a YouTube video or local video file |
| `timestamp` | Extract frame(s) — single (`"23:41"`), range (`"23:41-25:00"`), or seconds (`"85"`) |
| `frames` | Number of frames to extract (max 12) |
| `forceClone` | Clone GitHub repos that exceed the 350MB size threshold |

### get_search_content

Retrieve stored content from previous searches or fetches. Content over 30,000 chars is truncated in tool responses but stored in full for retrieval here.

```ts
get_search_content({ responseId: "abc123", urlIndex: 0 })
get_search_content({ responseId: "abc123", url: "https://..." })
get_search_content({ responseId: "abc123", query: "original query" })
```
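
A typical flow, sketched below: search with `includeContent: true`, then pull the full text of one source once the truncated tool response identifies it (the query and `responseId` value are illustrative):

```ts
// 1. Search and fetch full page content from the sources in the background
web_search({ query: "pnpm workspace protocol", includeContent: true })

// 2. Later, retrieve the complete stored content for the second source
get_search_content({ responseId: "abc123", urlIndex: 1 })
```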

## Capabilities

### GitHub repos

GitHub URLs are cloned locally instead of scraped. The agent gets real file contents and a local path to explore with `read` and `bash`. Root URLs return the repo tree + README, `/tree/` paths return directory listings, `/blob/` paths return file contents.

Repos over 350MB get a lightweight API-based view instead of a full clone (override with `forceClone: true`). Commit SHA URLs are handled via the API. Clones are cached for the session and wiped on session change. Private repos require the `gh` CLI.
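
For example, to force a full clone of a repository above the size threshold, or to fetch a single file via its `/blob/` path (both URLs are placeholders):

```ts
// Full clone despite exceeding 350MB
fetch_content({ url: "https://github.com/owner/huge-repo", forceClone: true })

// Single file contents via a /blob/ path
fetch_content({ url: "https://github.com/owner/repo/blob/main/src/index.ts" })
```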

### YouTube videos

YouTube URLs are processed via Gemini for full video understanding — visual descriptions, transcripts with timestamps, and chapter markers. Pass a prompt to ask specific questions about the video. Results include the video thumbnail so the agent gets visual context alongside the transcript.

Fallback: Gemini Web → Gemini API → Perplexity (text summary only). Handles all URL formats: `/watch?v=`, `youtu.be/`, `/shorts/`, `/live/`, `/embed/`, `/v/`.
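
A short-form URL works the same way; the sketch below asks for a timestamped summary (the video ID is a placeholder):

```ts
fetch_content({ url: "https://youtu.be/abc", prompt: "Summarize the demo and list key moments with timestamps" })
```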

### Local video files

Pass a file path (/, ./, ../, or file:// prefix) to analyze video content via Gemini. Supports MP4, MOV, WebM, AVI, and other common formats up to 50MB. Pass a prompt to ask about specific content. If ffmpeg is installed, a thumbnail frame is included alongside the analysis.

Fallback: Gemini API (Files API upload) → Gemini Web.
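
The `file://` form is equivalent to a plain path; the example below is a sketch with a hypothetical file path:

```ts
fetch_content({ url: "file:///Users/me/Desktop/bug-repro.mov", prompt: "What does the console show when the crash happens?" })
```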

### Video frame extraction

Use timestamp and/or frames on any YouTube URL or local video file to extract visual frames as images.

```ts
fetch_content({ url: "...", timestamp: "23:41" })                       // single frame
fetch_content({ url: "...", timestamp: "23:41-25:00" })                 // range, 6 frames
fetch_content({ url: "...", timestamp: "23:41-25:00", frames: 3 })      // range, custom count
fetch_content({ url: "...", timestamp: "23:41", frames: 5 })            // 5 frames at 5s intervals
fetch_content({ url: "...", frames: 6 })                                // sample whole video
```

Requires ffmpeg (and yt-dlp for YouTube). Timestamps accept H:MM:SS, MM:SS, or bare seconds.
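
The longer timestamp forms look like this (URL and times are placeholders):

```ts
fetch_content({ url: "...", timestamp: "1:02:15" })         // H:MM:SS, single frame
fetch_content({ url: "...", timestamp: "85", frames: 2 })   // bare seconds
```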

### PDFs

PDF URLs are extracted as text and saved to ~/Downloads/ as markdown. The agent can then read specific sections without loading the full document into context. Text-based extraction only — no OCR.
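
For example (placeholder URL), the extracted markdown lands in ~/Downloads/ for the agent to read selectively:

```ts
fetch_content({ url: "https://example.com/specs/whitepaper.pdf" })
```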

### Blocked pages

When Readability fails or returns only a cookie notice, the extension retries via Jina Reader (handles JS rendering server-side, no API key needed), then Gemini URL Context API, then Gemini Web extraction. Handles SPAs, JS-heavy pages, and anti-bot protections transparently. Also parses Next.js RSC flight data when present.

## How It Works

```
fetch_content(url)
  → Video file?  Gemini API (Files API) → Gemini Web
  → GitHub URL?  Clone repo, return file contents + local path
  → YouTube URL? Gemini Web → Gemini API → Perplexity
  → HTTP fetch → PDF? Extract text, save to ~/Downloads/
               → HTML? Readability → RSC parser → Jina Reader → Gemini fallback
               → Text/JSON/Markdown? Return directly
```

## Skills

### librarian

Bundled research workflow for investigating open-source libraries. Combines GitHub cloning, web search, and git operations (blame, log, show) to produce evidence-backed answers with permalinks. Pi loads it automatically based on your prompt. Also available via /skill:librarian with pi-skill-palette.

## Commands

### /search

Browse stored search results interactively. Lists all results from the current session with their response IDs for easy retrieval.

## Activity Monitor

Toggle with Ctrl+Shift+W to see live request/response activity:

```
─── Web Search Activity ────────────────────────────────────
  API  "typescript best practices"     200    2.1s ✓
  GET  docs.example.com/article        200    0.8s ✓
  GET  blog.example.com/post           404    0.3s ✗
────────────────────────────────────────────────────────────
```

## Configuration

All config lives in `~/.pi/web-search.json`. Every field is optional.

```json
{
  "perplexityApiKey": "pplx-...",
  "geminiApiKey": "AIza...",
  "searchProvider": "auto",
  "githubClone": {
    "enabled": true,
    "maxRepoSizeMB": 350,
    "cloneTimeoutSeconds": 30,
    "clonePath": "/tmp/pi-github-repos"
  },
  "youtube": {
    "enabled": true,
    "preferredModel": "gemini-3-flash-preview"
  },
  "video": {
    "enabled": true,
    "preferredModel": "gemini-3-flash-preview",
    "maxSizeMB": 50
  }
}
```

The `GEMINI_API_KEY` and `PERPLEXITY_API_KEY` env vars take precedence over config file values. `searchProvider` sets the `web_search` default: `"auto"`, `"perplexity"`, or `"gemini"`. Set `"enabled": false` under any feature to disable it. Config changes require a Pi restart.

Rate limits: Perplexity is capped at 10 requests/minute (client-side). Content fetches run up to 3 concurrently, with a 30s timeout per URL.

## Limitations

- Chrome cookie extraction is macOS-only — other platforms fall through to API keys. First-time access may trigger a Keychain dialog.
- YouTube private/age-restricted videos may fail on all extraction paths.
- Gemini can process videos up to ~1 hour; longer videos may be truncated.
- PDFs are text-extracted only (no OCR for scanned documents).
- GitHub branch names with slashes may misresolve file paths; the clone still works and the agent can navigate manually.
- Non-code GitHub URLs (issues, PRs, wiki) fall through to normal web extraction.

## Files

| File | Purpose |
| --- | --- |
| `index.ts` | Extension entry, tool definitions, commands, widget |
| `extract.ts` | URL/file path routing, HTTP extraction, fallback orchestration |
| `gemini-search.ts` | Search routing across Perplexity, Gemini API, Gemini Web |
| `gemini-url-context.ts` | Gemini URL Context + Web extraction fallbacks |
| `gemini-web.ts` | Gemini Web client (cookie auth, StreamGenerate) |
| `gemini-api.ts` | Gemini REST API client (generateContent) |
| `chrome-cookies.ts` | macOS Chrome cookie extraction (Keychain + SQLite) |
| `youtube-extract.ts` | YouTube detection, three-tier extraction, frame extraction |
| `video-extract.ts` | Local video detection, Files API upload, Gemini analysis |
| `github-extract.ts` | GitHub URL parsing, clone cache, content generation |
| `github-api.ts` | GitHub API fallback for large repos and commit SHAs |
| `perplexity.ts` | Perplexity API client with rate limiting |
| `pdf-extract.ts` | PDF text extraction, saves to markdown |
| `rsc-extract.ts` | RSC flight data parser for Next.js pages |
| `utils.ts` | Shared formatting and error helpers |
| `storage.ts` | Session-aware result storage |
| `activity.ts` | Activity tracking for the observability widget |
| `skills/librarian/` | Bundled skill for library research |
