diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 758fe37..5983805 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -16,7 +16,7 @@

 ---

-- [ ] PR title follows [conventional commits](https://seanbrar.github.io/pollux/contributing/)
+- [ ] PR title follows [conventional commits](https://polluxlib.dev/contributing/)
 - [ ] `make check` passes
 - [ ] Tests cover the meaningful cases, not just the happy path
 - [ ] Docs updated (if this changes public API or user-facing behavior)
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index a9a7485..1437729 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -37,7 +37,8 @@ jobs:
       python-version: ${{ matrix.python-version }}
       test-command: make test
       enable-api-tests: false
-      upload-coverage: false
+      # Upload coverage from a single matrix entry to avoid duplicate Codecov reports.
+      upload-coverage: ${{ matrix.python-version == '3.13' }}

   api-tests:
     # Run API tests exactly once to avoid exhausting shared free-tier quota.
diff --git a/.github/workflows/reusable-checks.yml b/.github/workflows/reusable-checks.yml
index 887c46f..690f64c 100644
--- a/.github/workflows/reusable-checks.yml
+++ b/.github/workflows/reusable-checks.yml
@@ -50,11 +50,17 @@ jobs:
           GEMINI_API_KEY: ${{ inputs.enable-api-tests && secrets.GEMINI_API_KEY || '' }}
           OPENAI_API_KEY: ${{ inputs.enable-api-tests && secrets.OPENAI_API_KEY || '' }}
           ENABLE_API_TESTS: ${{ inputs.enable-api-tests && '1' || '' }}
-        run: ${{ inputs.test-command }}
+        run: |
+          if [ "${{ inputs.upload-coverage }}" = "true" ]; then
+            make test-cov
+          else
+            ${{ inputs.test-command }}
+          fi

       - name: Upload coverage to Codecov
         if: inputs.upload-coverage == true
         uses: codecov/codecov-action@v5
         with:
           token: ${{ secrets.CODECOV_TOKEN }}
+          files: coverage.xml
           fail_ci_if_error: false
diff --git a/Makefile b/Makefile
index 01c7a17..1657603 100644
--- a/Makefile
+++ b/Makefile
@@ -9,7 +9,7 @@ PYTEST_ARGS = -v
 # ------------------------------------------------------------------------------
 # Main Commands
 # ------------------------------------------------------------------------------
-.PHONY: help install-dev lint format typecheck check test test-api docs-serve docs-build demo-data clean-demo-data clean hooks
+.PHONY: help install-dev lint format typecheck check test test-cov test-api docs-serve docs-build demo-data clean-demo-data clean hooks
 .PHONY: mutmut

 help: ## Show this help message
@@ -49,6 +49,9 @@ check: lint typecheck test ## Run all checks (lint + typecheck + tests)
 test: ## Run all tests
 	$(PYTEST) $(PYTEST_ARGS) -m "not api"

+test-cov: ## Run tests with coverage (CI only)
+	$(PYTEST) $(PYTEST_ARGS) -m "not api" --cov=src/pollux --cov-report=xml
+
 test-api: .check-api-keys ## Run API tests (requires ENABLE_API_TESTS=1 + provider API key)
 	ENABLE_API_TESTS=1 $(PYTEST) $(PYTEST_ARGS) -m "api"

diff --git a/README.md b/README.md
index b2a7f32..76addcc 100644
--- a/README.md
+++ b/README.md
@@ -3,15 +3,13 @@
 Multimodal orchestration for LLM APIs.
 > You describe what to analyze. Pollux handles source patterns, context caching, and multimodal complexity—so you don't.
->
-> Originally built for Gemini during Google Summer of Code 2025. Pollux now
-> supports both Gemini and OpenAI with explicit capability differences.

-[Documentation](https://seanbrar.github.io/pollux/) ·
-[Quickstart](https://seanbrar.github.io/pollux/quickstart/) ·
-[Cookbook](./cookbook/)
+[Documentation](https://polluxlib.dev/) ·
+[Quickstart](https://polluxlib.dev/quickstart/) ·
+[Cookbook](https://polluxlib.dev/cookbook/)

-![CI](https://github.com/seanbrar/pollux/actions/workflows/ci.yml/badge.svg)
+[![PyPI](https://img.shields.io/pypi/v/pollux-ai)](https://pypi.org/project/pollux-ai/)
+[![CI](https://github.com/seanbrar/pollux/actions/workflows/ci.yml/badge.svg)](https://github.com/seanbrar/pollux/actions/workflows/ci.yml)
 [![codecov](https://codecov.io/gh/seanbrar/pollux/graph/badge.svg)](https://codecov.io/gh/seanbrar/pollux)
 [![Testing: MTMT](https://img.shields.io/badge/testing-MTMT_v0.1.0-blue)](https://github.com/seanbrar/minimal-tests-maximum-trust)
 ![Python](https://img.shields.io/badge/Python-3.10%2B-brightgreen)
@@ -34,18 +32,24 @@ result = asyncio.run(
     )
 )
 print(result["answers"][0])
+# "The key findings are: (1) three source patterns (fan-out, fan-in,
+# broadcast) and (2) context caching for token and cost savings."
 ```

-For a full 2-minute walkthrough (install, key setup, success checks), use
-[Quickstart](https://seanbrar.github.io/pollux/quickstart/). For local-file
-analysis, swap to `Source.from_file("paper.pdf")`.
+`run()` returns a `ResultEnvelope` dict — `answers` is a list with one entry per prompt.
+
+To use OpenAI instead: `Config(provider="openai", model="gpt-5-nano")`.
+
+For a full 2-minute walkthrough (install, key setup, success checks), see the
+[Quickstart](https://polluxlib.dev/quickstart/).

 ## Why Pollux?
-- **Multimodal-first**: PDFs, images, videos, YouTube—same API
-- **Source patterns**: Fan-out (one source → many prompts), fan-in, and broadcast
+- **Multimodal-first**: PDFs, images, video, YouTube URLs, and arXiv papers—same API
+- **Source patterns**: Fan-out (one source, many prompts), fan-in (many sources, one prompt), and broadcast (many-to-many)
 - **Context caching**: Upload once, reuse across prompts—save tokens and money
-- **Production-ready core**: async execution, explicit capability checks, clear errors
+- **Structured output**: Get typed responses via `Options(response_schema=YourModel)`
+- **Built for reliability**: Async execution, automatic retries, concurrency control, and clear error messages with actionable hints

 ## Installation

@@ -53,14 +57,16 @@ analysis, swap to `Source.from_file("paper.pdf")`.
 pip install pollux-ai
 ```

-Or download the latest wheel from [Releases](https://github.com/seanbrar/pollux/releases/latest).
+### API Keys

-### API Key
-
-Get a key from [Google AI Studio](https://ai.dev/), then:
+Get a key from [Google AI Studio](https://ai.dev/) or [OpenAI Platform](https://platform.openai.com/api-keys), then:

 ```bash
+# Gemini (recommended starting point — supports context caching)
 export GEMINI_API_KEY="your-key-here"
+
+# OpenAI
+export OPENAI_API_KEY="your-key-here"
 ```

 ## Usage
@@ -87,6 +93,43 @@ async def main() -> None:
 asyncio.run(main())
 ```

+### YouTube and arXiv Sources
+
+```python
+from pollux import Source
+
+lecture = Source.from_youtube("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
+paper = Source.from_arxiv("2301.07041")
+```
+
+Pass these to `run()` or `run_many()` like any other source — Pollux handles the rest.
+
+### Structured Output
+
+```python
+import asyncio
+
+from pydantic import BaseModel
+
+from pollux import Config, Options, Source, run
+
+class Summary(BaseModel):
+    title: str
+    key_points: list[str]
+    sentiment: str
+
+result = asyncio.run(
+    run(
+        "Summarize this document.",
+        source=Source.from_file("report.pdf"),
+        config=Config(provider="gemini", model="gemini-2.5-flash-lite"),
+        options=Options(response_schema=Summary),
+    )
+)
+parsed = result["structured"]  # Summary instance
+print(parsed.key_points)
+```
+
 ### Configuration

 ```python
@@ -95,33 +138,33 @@ from pollux import Config
 config = Config(
     provider="gemini",
     model="gemini-2.5-flash-lite",
-    enable_caching=True,
+    enable_caching=True,  # Gemini-only in v1.0
 )
 ```

-See the [Configuration Guide](https://seanbrar.github.io/pollux/configuration/) for details.
+See the [Configuration Guide](https://polluxlib.dev/configuration/) for details.

 ### Provider Differences

 Pollux does not force strict feature parity across providers in v1.0.
-See the capability matrix: [Provider Capabilities](https://seanbrar.github.io/pollux/reference/provider-capabilities/).
+See the capability matrix: [Provider Capabilities](https://polluxlib.dev/reference/provider-capabilities/).

 ## Documentation

-- [Quickstart](https://seanbrar.github.io/pollux/quickstart/) — First result in 2 minutes
-- [Concepts](https://seanbrar.github.io/pollux/concepts/) — Mental model for source patterns and caching
-- [Sources and Patterns](https://seanbrar.github.io/pollux/sources-and-patterns/) — Source constructors, run/run_many, ResultEnvelope
-- [Configuration](https://seanbrar.github.io/pollux/configuration/) — Providers, models, retries, caching
-- [API Reference](https://seanbrar.github.io/pollux/reference/api/) — Entry points and types
-- [Cookbook](./cookbook/) — Scenario-driven, ready-to-run recipes
-
-## Origins
-
-Pollux was developed as part of Google Summer of Code 2025 with Google DeepMind.
[Learn more →](https://seanbrar.github.io/pollux/#about)
+- [Quickstart](https://polluxlib.dev/quickstart/) — First result in 2 minutes
+- [Concepts](https://polluxlib.dev/concepts/) — Mental model for source patterns and caching
+- [Sources and Patterns](https://polluxlib.dev/sources-and-patterns/) — Source constructors, run/run_many, ResultEnvelope
+- [Configuration](https://polluxlib.dev/configuration/) — Providers, models, retries, caching
+- [Caching and Efficiency](https://polluxlib.dev/caching-and-efficiency/) — TTL management, cache warming, cost savings
+- [Troubleshooting](https://polluxlib.dev/troubleshooting/) — Common issues and solutions
+- [API Reference](https://polluxlib.dev/reference/api/) — Entry points and types
+- [Cookbook](https://polluxlib.dev/cookbook/) — Scenario-driven, ready-to-run recipes

 ## Contributing

-See [CONTRIBUTING](https://seanbrar.github.io/pollux/contributing/) and [TESTING.md](./TESTING.md) for guidelines.
+See [CONTRIBUTING](https://polluxlib.dev/contributing/) and [TESTING.md](./TESTING.md) for guidelines.
+
+Built during [Google Summer of Code 2025](https://summerofcode.withgoogle.com/) with Google DeepMind.
[Learn more](https://polluxlib.dev/#about)

 ## License

diff --git a/docs/CNAME b/docs/CNAME
new file mode 100644
index 0000000..64f793d
--- /dev/null
+++ b/docs/CNAME
@@ -0,0 +1 @@
+polluxlib.dev
diff --git a/docs/cookbook/getting-started/structured-output-extraction.md b/docs/cookbook/getting-started/structured-output-extraction.md
index b08d4ac..c7ed43d 100644
--- a/docs/cookbook/getting-started/structured-output-extraction.md
+++ b/docs/cookbook/getting-started/structured-output-extraction.md
@@ -33,7 +33,7 @@ Real API (returns actual structured data):

 ```bash
 python -m cookbook getting-started/structured-output-extraction \
-  --input path/to/file.pdf --no-mock --provider openai --model gpt-4.1-mini
+  --input path/to/file.pdf --no-mock --provider openai --model gpt-5-nano
 ```

 ## What You'll See
diff --git a/docs/index.md b/docs/index.md
index ea0f6b6..9cee246 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -18,10 +18,11 @@ complexity — so you don't.

 ## Why Pollux?

-- **Multimodal-first** — PDFs, images, videos, YouTube URLs. Same API.
+- **Multimodal-first** — PDFs, images, video, YouTube URLs, and arXiv papers. Same API.
 - **Source patterns** — Fan-out, fan-in, and broadcast execution over your content.
 - **Context caching** — Upload once, reuse across prompts. Save tokens and money.
-- **Production-ready core** — Async pipeline, explicit capability checks, clear errors.
+- **Structured output** — Get typed responses via Pydantic schemas.
+- **Built for reliability** — Async execution, retries, concurrency control, and clear errors.

 ## Install

diff --git a/docs/overrides/home.html b/docs/overrides/home.html
index f0f0790..f8f0880 100644
--- a/docs/overrides/home.html
+++ b/docs/overrides/home.html
@@ -38,7 +38,7 @@

You describe what to analyze. Pollux handles source patterns,
- context caching, rate limits, and retries —
+ context caching, and multimodal complexity —
so you don’t.

@@ -66,7 +66,7 @@

     source=Source.from_file("paper.pdf"),
     config=Config(
         provider="gemini",
-        model="gemini-2.0-flash",
+        model="gemini-2.5-flash-lite",
     ),
 )
 print(result["answers"][0])
@@ -99,7 +99,7 @@

Context caching

-

Production-ready

+

Built for reliability

Async pipeline, retries with backoff, structured output, usage tracking.
diff --git a/mkdocs.yml b/mkdocs.yml
index 9da3e6b..1f1971f 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,7 +1,7 @@
 site_name: Pollux
 repo_url: https://github.com/seanbrar/pollux
 repo_name: seanbrar/pollux
-site_url: https://seanbrar.github.io/pollux/
+site_url: https://polluxlib.dev/
 edit_uri: edit/main/docs/

 theme:
diff --git a/pyproject.toml b/pyproject.toml
index 23bde86..6624fc1 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -21,7 +21,8 @@ dependencies = [
 ]

 [project.urls]
-Homepage = "https://github.com/seanbrar/pollux"
+Homepage = "https://polluxlib.dev"
+Documentation = "https://polluxlib.dev"
 Repository = "https://github.com/seanbrar/pollux"
 Changelog = "https://github.com/seanbrar/pollux/blob/main/CHANGELOG.md"
 Issues = "https://github.com/seanbrar/pollux/issues"
@@ -86,6 +87,9 @@ exclude_lines = [
 [tool.coverage.html]
 directory = "coverage_html_report"

+[tool.coverage.xml]
+output = "coverage.xml"
+
 # --- MyPy ---
 [tool.mypy]
 python_version = "3.10"
diff --git a/src/pollux/__init__.py b/src/pollux/__init__.py
index e0dcc1b..dac6e31 100644
--- a/src/pollux/__init__.py
+++ b/src/pollux/__init__.py
@@ -36,7 +36,12 @@ if TYPE_CHECKING:
     from pollux.providers.base import Provider


-__version__ = "0.9.0"
+try:
+    from importlib.metadata import PackageNotFoundError, version
+
+    __version__ = version("pollux-ai")
+except PackageNotFoundError:
+    __version__ = "0.0.0+unknown"

 # Library-level NullHandler: stay silent unless the consumer configures logging.
 logging.getLogger("pollux").addHandler(logging.NullHandler())
diff --git a/uv.lock b/uv.lock
index cb4926f..58fc052 100644
--- a/uv.lock
+++ b/uv.lock
@@ -1388,7 +1388,7 @@ wheels = [

 [[package]]
 name = "pollux-ai"
-version = "1.0.0rc1"
+version = "1.0.0rc2"
 source = { editable = "." }
 dependencies = [
     { name = "google-genai" },
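
Reviewer note on the `src/pollux/__init__.py` hunk: `__version__` is now read from the installed distribution's metadata, with a placeholder when the package is not installed (for example, when running from a bare source checkout). A minimal standalone sketch of that pattern is below. The helper name `resolve_version` and the made-up distribution name in the usage are illustrative assumptions, not part of this diff; only `importlib.metadata.version` and `PackageNotFoundError` come from the change itself.

```python
from importlib.metadata import PackageNotFoundError, version


def resolve_version(dist_name: str, fallback: str = "0.0.0+unknown") -> str:
    """Return the installed version of dist_name, or fallback if it is not installed.

    Hypothetical helper for illustration; the diff inlines this logic at module level.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # No installed metadata for this distribution: use the placeholder.
        return fallback


if __name__ == "__main__":
    # Made-up distribution name, chosen so the metadata lookup fails.
    print(resolve_version("surely-not-an-installed-distribution-xyz"))
```

One consequence worth checking in review: the version string now follows whatever is installed, so an editable install reports the version recorded in `pyproject.toml` at install time, not a constant in the source tree.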