Check out other spire projects here.
The flexible crawler & scraper framework powered by tokio and tower.
Spire is a modular web scraping and crawling framework for Rust that combines the power of async/await with the composability of tower's middleware ecosystem. It supports both HTTP-based scraping and browser automation through pluggable backends.
- Multiple Backends: HTTP (reqwest) and browser automation (thirtyfour) support
- Tower Integration: Composable middleware using the tower ecosystem
- Type-Safe Routing: Tag-based routing with compile-time guarantees
- Ergonomic Extractors: Clean, type-safe data extraction from requests
- Async/Await: Built on tokio for high-performance concurrent scraping
- Observability: Optional tracing and metrics support
- Graceful Shutdown: Proper resource cleanup and cancellation support
Add spire to your Cargo.toml:
[dependencies]
spire = { version = "0.2.0", features = ["reqwest"] }
Basic HTTP scraping example:
use spire::prelude::*;
use spire::extract::Text;
use spire::context::{RequestQueue, Tag};
use spire::reqwest_backend::HttpClient;
use spire::dataset::InMemDataset;

async fn handler(Text(html): Text) -> Result<(), Box<dyn std::error::Error>> {
    println!("Scraped {} bytes", html.len());
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let router = Router::new()
        .route(Tag::new("main"), handler);

    let backend = HttpClient::default();
    let client = Client::new(backend, router)
        .with_request_queue(InMemDataset::stack())
        .with_dataset(InMemDataset::<String>::new());

    client.queue()
        .push(Tag::new("main"), "https://example.com")
        .await?;

    client.run().await?;
    Ok(())
}
See the main crate documentation for more examples and detailed usage.
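The quick-start example registers a handler under a tag, pushes a tagged URL onto a queue, and lets the client dispatch it. The sketch below illustrates that idea with the standard library only. It is not spire's implementation or API: `Request`, `Router`, `run`, `listing`, `detail`, and `demo` are hypothetical names invented for illustration, handlers are synchronous and do no network I/O, and the last-in-first-out queue order is an assumption suggested by the name `InMemDataset::stack()`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a queued request: a routing tag plus a URL.
struct Request {
    tag: &'static str,
    url: String,
}

/// A handler receives the URL and may return follow-up requests to enqueue.
type Handler = fn(&str) -> Vec<Request>;

/// Maps tags to handlers (illustrative only, not spire's `Router`).
struct Router {
    routes: HashMap<&'static str, Handler>,
}

impl Router {
    fn new() -> Self {
        Router { routes: HashMap::new() }
    }

    /// Registers `handler` for `tag`, builder-style.
    fn route(mut self, tag: &'static str, handler: Handler) -> Self {
        self.routes.insert(tag, handler);
        self
    }
}

/// Drains the queue in LIFO order, dispatching each request by its tag.
/// Returns the URLs in the order they were handled.
fn run(router: &Router, mut queue: Vec<Request>) -> Vec<String> {
    let mut visited = Vec::new();
    while let Some(req) = queue.pop() {
        match router.routes.get(req.tag) {
            Some(handler) => {
                // Follow-up requests go back onto the same queue.
                queue.extend(handler(&req.url));
                visited.push(req.url);
            }
            None => eprintln!("no route for tag {:?}, skipping {}", req.tag, req.url),
        }
    }
    visited
}

/// Pretends a listing page links to two detail pages.
fn listing(url: &str) -> Vec<Request> {
    vec![
        Request { tag: "detail", url: format!("{}/a", url) },
        Request { tag: "detail", url: format!("{}/b", url) },
    ]
}

/// Detail pages produce no follow-up requests.
fn detail(_url: &str) -> Vec<Request> {
    Vec::new()
}

/// Seeds the queue with one "main" request and runs the crawl to completion.
fn demo() -> Vec<String> {
    let router = Router::new()
        .route("main", listing)
        .route("detail", detail);
    let queue = vec![Request { tag: "main", url: "https://example.com".to_string() }];
    run(&router, queue)
}
```

Calling `demo()` handles the seed URL first and then the two detail URLs, most recently queued first. Swapping the `Vec` for a `VecDeque` with `pop_front` would give breadth-first order instead.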
We welcome contributions! Please read our Contributing Guide for details.
This project is licensed under the MIT License - see the LICENSE file for details.