A Rust WebAssembly plugin for the Noorle platform that provides arXiv paper search and PDF download capabilities.
- Search arXiv: Query the arXiv repository for academic papers with customizable result limits
- Download PDFs: Download paper PDFs directly from arXiv to specified locations
- Structured Data: Returns detailed paper metadata including titles, authors, abstracts, categories, and dates
- Fast & Efficient: Built with Rust for optimal WASM performance
This arXiv plugin demonstrates practical patterns for building research-oriented Noorle plugins:
- Feed Parsing: Shows how to parse Atom/RSS feeds from academic APIs
- PDF Download: Implements binary file download and storage from WASM
- Complex Data Structures: Handles rich metadata with dates, arrays, and nested objects
- API Integration: Interfacing with academic repositories and content providers
WebAssembly Component Model with WASI 0.2 provides secure, portable sandboxing with standardized system interfaces.
Key Libraries:
- feed-rs: Robust Atom/RSS feed parsing for academic content
- waki: WASI-compatible HTTP client for API requests
- chrono: Date/time handling for publication timestamps
- serde: JSON serialization for structured data exchange
# Build the plugin (creates WASM component)
noorle plugin build
# Deploy to Noorle platform
noorle plugin deploy
# Test search function
wasmtime run --wasi http \
--invoke 'search("quantum computing", 5)' dist/plugin.wasm
# Test PDF download (requires filesystem access)
wasmtime run --wasi http --dir /tmp \
--invoke 'download-pdf("2301.08727", "/tmp")' dist/plugin.wasm
arxiv/
├── src/
│   ├── lib.rs          # Main plugin implementation
│   └── types.rs        # Data structures for arXiv papers
├── wit/
│   └── world.wit       # Component interface definition
├── Cargo.toml          # Rust dependencies and metadata
├── noorle.yaml         # Plugin permissions and configuration
├── build.sh            # Build script (used by noorle CLI)
└── dist/               # Build output (created after build)
    └── plugin.wasm     # Compiled WASM component
Search for papers on arXiv matching the given query.
Parameters:
- query: Search terms (e.g., "quantum computing", "machine learning")
- max-results: Maximum number of results to return (1-100, default: 10)
Returns:
- Success: JSON string containing an array of paper objects with:
  - paper_id: arXiv identifier
  - title: Paper title
  - authors: Array of author names
  - abstract_text: Paper abstract
  - url: Web URL to paper page
  - pdf_url: Direct PDF download URL
  - published_date: Publication date (ISO 8601)
  - updated_date: Last update date (ISO 8601)
  - categories: arXiv subject categories
- Error: String describing what went wrong
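The parameter rules above can be sketched in plain Rust. This is a minimal, std-only illustration, not the plugin's actual code: the helper names (`percent_encode`, `effective_max_results`, `build_search_url`) are hypothetical, the choice to clamp out-of-range values rather than reject them is an assumption, and so is the use of the `all:` field prefix on the public arXiv API endpoint. The real plugin depends on the `urlencoding` crate for the encoding step.

```rust
/// Default and upper bound for `max-results`, taken from the parameter list.
const DEFAULT_MAX_RESULTS: u32 = 10;
const MAX_RESULTS_LIMIT: u32 = 100;

/// Percent-encode a string for use in a URL query
/// (std-only stand-in for the `urlencoding` crate; hypothetical helper).
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            // Unreserved characters pass through unchanged.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            // Everything else becomes %XX.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Apply the default and clamp into the documented 1-100 range.
/// Assumption: out-of-range values are clamped, not rejected.
fn effective_max_results(requested: Option<u32>) -> u32 {
    requested
        .unwrap_or(DEFAULT_MAX_RESULTS)
        .clamp(1, MAX_RESULTS_LIMIT)
}

/// Build an arXiv API query URL, or return an error string for an empty query.
/// Assumption: the search covers all fields via the `all:` prefix.
fn build_search_url(query: &str, max_results: Option<u32>) -> Result<String, String> {
    let query = query.trim();
    if query.is_empty() {
        return Err("query must not be empty".to_string());
    }
    Ok(format!(
        "http://export.arxiv.org/api/query?search_query=all:{}&start=0&max_results={}",
        percent_encode(query),
        effective_max_results(max_results)
    ))
}
```

Calling `build_search_url("quantum computing", Some(5))` yields a URL whose Atom response would then be handed to the feed parser. Returning `Result<String, String>` mirrors the documented success/error contract.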
Example Response:
[
  {
    "paper_id": "2509.16200v1",
    "title": "Exploring confinement transitions in Z2 lattice gauge theories...",
    "authors": ["Matjaž Kebrič", "Lin Su", "Alexander Douglas"],
    "abstract_text": "Confinement of particles into bound states is a phenomenon...",
    "url": "http://arxiv.org/abs/2509.16200v1",
    "pdf_url": "http://arxiv.org/pdf/2509.16200v1",
    "published_date": "2025-09-19T17:58:55Z",
    "categories": ["cond-mat.quant-gas", "quant-ph"]
  }
]
Download a PDF paper from arXiv.
Parameters:
- paper-id: arXiv paper ID (e.g., "2301.08727")
- save-path: Directory to save the PDF (e.g., "/tmp")
Returns:
- Success: JSON string with download result: {"success": true, "file_path": "/path/to/file.pdf"}
- Error: String describing what went wrong
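The steps before the actual HTTP request and file write can be sketched as follows. This is a std-only illustration under stated assumptions, not the plugin's implementation: all function names are hypothetical, the `<paper-id>.pdf` file name is an assumed convention, the `https://arxiv.org/pdf/<id>` URL form is assumed, and the identifier check is deliberately loose. The network and filesystem calls are omitted.

```rust
use std::path::{Path, PathBuf};

/// Loose sanity check on an arXiv identifier (hypothetical helper).
/// Rejects empty ids, ".." sequences, and characters outside [A-Za-z0-9.-/].
fn is_plausible_paper_id(id: &str) -> bool {
    !id.is_empty()
        && !id.contains("..")
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '/'))
}

/// PDF URL for a paper id. Assumption: the https form of the arXiv PDF URL.
fn pdf_url(paper_id: &str) -> String {
    format!("https://arxiv.org/pdf/{}", paper_id)
}

/// Target file inside `save_dir`. Assumption: "<paper-id>.pdf", with '/'
/// (used by old-style ids such as "hep-th/9901001") replaced by '_'.
fn target_path(save_dir: &str, paper_id: &str) -> PathBuf {
    let file_name = format!("{}.pdf", paper_id.replace('/', "_"));
    Path::new(save_dir).join(file_name)
}

/// Render the documented success payload, escaping backslashes and quotes.
fn success_json(path: &Path) -> String {
    let escaped = path
        .display()
        .to_string()
        .replace('\\', "\\\\")
        .replace('"', "\\\"");
    format!(r#"{{"success": true, "file_path": "{}"}}"#, escaped)
}

/// Validate the inputs and return (download URL, destination path).
fn plan_download(paper_id: &str, save_dir: &str) -> Result<(String, PathBuf), String> {
    if !is_plausible_paper_id(paper_id) {
        return Err(format!("invalid paper id: {:?}", paper_id));
    }
    Ok((pdf_url(paper_id), target_path(save_dir, paper_id)))
}
```

Replacing `/` in the file name keeps the output inside the directory granted to the component (for example via `--dir /tmp`), which matters because the sandbox only exposes preopened directories.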
[dependencies]
wit-bindgen = "0.46.0" # Component Model bindings generation
anyhow = "1.0" # Error handling
serde = { version = "1.0", features = ["derive"] } # JSON serialization
serde_json = "1.0" # JSON parsing
waki = "0.5" # WASI HTTP client
urlencoding = "2.1" # URL encoding for API parameters
feed-rs = "1.5" # Atom/RSS feed parsing
chrono = { version = "0.4", features = ["serde"] } # Date/time handling
By studying this example, developers learn:
- Feed Parsing: How to process Atom/RSS feeds in WASM components
- Binary File Handling: Downloading and saving PDFs from WASM
- Complex Data Processing: Working with academic metadata structures
- Date/Time Handling: Managing temporal data in WASM environments
- Error Recovery: Graceful handling of API failures and malformed data
This example serves as a foundation for building research tools, academic integrations, and content aggregation plugins.
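As one illustration of the "Complex Data Processing" and "Error Recovery" points, the sketch below derives a few of the response fields from an Atom entry id and skips entries that do not match the expected shape. It is std-only and hypothetical throughout: `PaperRef`, `paper_ref_from_entry_id`, and `collect_papers` are invented names, the struct covers only three of the documented fields, and skip-and-count is an assumed recovery policy. The entry-id and PDF URL shapes follow the example response shown earlier.

```rust
/// Subset of the paper fields from the search response (illustrative only).
#[derive(Debug, Clone, PartialEq)]
struct PaperRef {
    paper_id: String,
    url: String,
    pdf_url: String,
}

/// Derive a paper reference from an entry id such as
/// "http://arxiv.org/abs/2509.16200v1". Returns None for malformed input.
fn paper_ref_from_entry_id(entry_id: &str) -> Option<PaperRef> {
    let entry_id = entry_id.trim();
    // Everything after "/abs/" is treated as the paper id.
    let (_, paper_id) = entry_id.split_once("/abs/")?;
    if paper_id.is_empty() {
        return None;
    }
    Some(PaperRef {
        paper_id: paper_id.to_string(),
        url: entry_id.to_string(),
        pdf_url: format!("http://arxiv.org/pdf/{}", paper_id),
    })
}

/// Keep well-formed entries and report how many were skipped,
/// so one bad entry does not fail the whole search.
fn collect_papers(entry_ids: &[&str]) -> (Vec<PaperRef>, usize) {
    let papers: Vec<PaperRef> = entry_ids
        .iter()
        .filter_map(|id| paper_ref_from_entry_id(id))
        .collect();
    let skipped = entry_ids.len() - papers.len();
    (papers, skipped)
}
```

Returning the skipped count alongside the results lets a caller decide whether a partially malformed feed should surface as a warning or as an error string.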