PulsarRPA: The AI-Powered, Lightning-Fast Browser Automation Solution!
PulsarRPA is an AI-enabled, high-performance, distributed, and open-source browser automation platform, built for large-scale automation. It excels at:
- AI integration with LLMs for smarter automation
- Ultra-fast, spider-grade browser automation
- Advanced web content understanding
- Powerful data extraction APIs
PulsarRPA is designed to meet the demands of modern web automation, delivering accurate and comprehensive data extraction, even from the most complex and dynamic websites.
Bilibili: https://www.bilibili.com/video/BV1kM2rYrEFC
Download the latest executable JAR and run it:
# Linux/macOS and Windows (if curl is available)
curl -L -o PulsarRPA.jar https://github.com/platonai/PulsarRPA/releases/download/v3.0.2/PulsarRPA.jar
java -DDEEPSEEK_API_KEY="${DEEPSEEK_API_KEY}" -jar PulsarRPA.jar
You can omit DEEPSEEK_API_KEY if you don't need the AI features.
For Docker users:
docker run -d -p 8182:8182 -e DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY} galaxyeye88/pulsar-rpa:latest
Talk about a webpage using the chat-about API:
curl -X POST "http://localhost:8182/api/ai/chat-about" -H "Content-Type: application/json" -d '{
  "url": "https://www.amazon.com/dp/B0C1H26C46",
  "prompt": "introduce this product"
}'
Extract data from a webpage using the extract API:
curl -X POST "http://localhost:8182/api/ai/extract" -H "Content-Type: application/json" -d '{
  "url": "https://www.amazon.com/dp/B0C1H26C46",
  "prompt": "product name, price, and description"
}'
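Both chat-about and extract take the same JSON body (a url and a prompt), so the two calls can share a small wrapper. The following is a minimal sketch, not part of PulsarRPA: the helper names json_escape, build_payload, and ask_page and the PULSAR_BASE_URL variable are hypothetical, and the escaping assumes single-line values without control characters.

```shell
# Base URL of a locally running PulsarRPA server (port taken from the examples above).
# PULSAR_BASE_URL is a hypothetical variable name used only in this sketch.
PULSAR_BASE_URL="${PULSAR_BASE_URL:-http://localhost:8182}"

# json_escape VALUE: escape backslashes and double quotes so VALUE can sit
# inside a JSON string. Assumes a single-line value with no control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_payload URL PROMPT: print the JSON body used by chat-about and extract.
build_payload() {
  printf '{"url": "%s", "prompt": "%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}

# ask_page ENDPOINT URL PROMPT: POST the payload to /api/ai/ENDPOINT.
ask_page() {
  curl -sS -X POST "${PULSAR_BASE_URL}/api/ai/$1" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$2" "$3")"
}
```

With these helpers, the extract call becomes `ask_page extract "https://www.amazon.com/dp/B0C1H26C46" "product name, price, and description"`, and the chat-about call only changes the first argument.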
Use the chat API to ask any question:
curl "http://localhost:8182/api/ai/chat?prompt=What-is-the-most-fantastical-technology-today"
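The GET example above replaces spaces with hyphens to keep the URL valid. As an alternative, curl can URL-encode the prompt itself with -G and --data-urlencode. This is a small sketch under assumptions: chat_get is a hypothetical helper name, PULSAR_BASE_URL and CURL are hypothetical variables, and it assumes the server decodes the query parameter in the usual way.

```shell
# chat_get PROMPT: send PROMPT to the chat API as a URL-encoded query parameter.
# -G makes curl append the --data-urlencode value to the URL as a query string.
# CURL can be overridden (for example CURL=echo) to print the arguments
# instead of sending the request.
chat_get() {
  ${CURL:-curl} -sS -G "${PULSAR_BASE_URL:-http://localhost:8182}/api/ai/chat" \
    --data-urlencode "prompt=$1"
}
```

For example, `chat_get "What is the most fantastical technology today?"` sends the prompt with its spaces and question mark intact.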
Use the POST method to send a longer prompt:
curl -X POST "http://localhost:8182/api/ai/chat" -H "Content-Type: application/json" -d '
What is the most fantastical technology today?
You should return a list of 5 items.
'
curl -X POST "http://localhost:8182/api/x/e" -H "Content-Type: text/plain" -d "
  select
    llm_extract(dom, 'product name, price, ratings') as llm_extracted_data,
    dom_base_uri(dom) as url,
    dom_first_text(dom, '#productTitle') as title,
    dom_first_slim_html(dom, 'img:expr(width > 400)') as img
  from load_and_select('https://www.amazon.com/dp/B0C1H26C46', 'body');
"
The extracted data:
{
  "llm_extracted_data": {
    "product name": "Apple iPhone 15 Pro Max",
    "price": "$1,199.00",
    "ratings": "4.5 out of 5 stars"
  },
  "url": "https://www.amazon.com/dp/B0C1H26C46",
  "title": "Apple iPhone 15 Pro Max",
  "img": "<img src=\"https://example.com/image.jpg\" />"
}
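The X-SQL request above embeds the query in a double-quoted shell string, which gets fragile once a query contains characters the shell interprets. One way around that, sketched here under assumptions, is to generate the query with a here-document and pipe it to curl on standard input. The names product_sql, run_xsql, and PULSAR_BASE_URL are hypothetical; the endpoint and the X-SQL functions are the ones used in the example above.

```shell
# product_sql URL: print an X-SQL query for the given page URL.
# Only X-SQL functions that appear in the example above are used.
# Assumes URL contains no single quotes.
product_sql() {
  cat <<SQL
select
  dom_base_uri(dom) as url,
  dom_first_text(dom, '#productTitle') as title
from load_and_select('$1', 'body');
SQL
}

# run_xsql: read a query from standard input and POST it as plain text to /api/x/e.
# --data-binary @- sends stdin unchanged, so line breaks in the query are preserved.
run_xsql() {
  curl -sS -X POST "${PULSAR_BASE_URL:-http://localhost:8182}/api/x/e" \
    -H "Content-Type: text/plain" \
    --data-binary @-
}
```

Usage: `product_sql "https://www.amazon.com/dp/B0C1H26C46" | run_xsql`. Keeping query generation separate from the request also makes it easy to inspect the SQL before sending it.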
val prompts = """
move cursor to the element with id 'title' and click it
scroll to middle
scroll to top
get the text of the element with id 'title'
"""
val eventHandlers = DefaultPageEventHandlers()
eventHandlers.browseEventHandlers.onDocumentActuallyReady.addLast { page, driver ->
    val result = session.instruct(prompts, driver)
}
session.open(url, eventHandlers)
val options = session.options(args)
val event = options.eventHandlers.browseEventHandlers
event.onBrowserLaunched.addLast { page, driver ->
    warnUpBrowser(page, driver)
}
event.onWillFetch.addLast { page, driver ->
    waitForReferrer(page, driver)
    waitForPreviousPage(page, driver)
}
event.onWillCheckDocumentState.addLast { page, driver ->
    driver.waitForSelector("body h1[itemprop=name]")
    driver.click(".mask-layer-close-button")
}
session.load(url, options)
select
    llm_extract(dom, 'product name, price, ratings, score') as llm_extracted_data,
    dom_first_text(dom, '#productTitle') as title,
    dom_first_text(dom, '#bylineInfo') as brand,
    dom_first_text(dom, '#price tr td:matches(^Price) ~ td') as price,
    dom_first_text(dom, '#acrCustomerReviewText') as ratings,
    str_first_float(dom_first_text(dom, '#reviewsMedley .AverageCustomerReviews span:contains(out of)'), 0.0) as score
from load_and_select('https://www.amazon.com/dp/B0C1H26C46 -i 1s -njr 3', 'body');
Web Spider
- Scalable crawling
- Browser rendering
- AJAX data extraction
LLM Integration
- Natural language web content analysis
- Intuitive content description
Text-to-Action
- Simple language commands
- Intuitive browser control
RPA Capabilities
- Human-like task automation
- SPA crawling support
- Advanced workflow automation
Developer-Friendly
- One-line data extraction
- SQL-like query interface
- Simple API integration
X-SQL Power
- Extended SQL for web data
- Content mining capabilities
- Web business intelligence
Bot Protection
- Advanced stealth techniques
- IP rotation
- Privacy context management
Performance
- Parallel page rendering
- High-efficiency processing
- Block-resistant design
Cost-Effective
- 100,000+ pages/day
- Minimal hardware requirements
- Resource-efficient operation
Quality Assurance
- Smart retry mechanisms
- Precise scheduling
- Complete lifecycle management
Scalability
- Fully distributed architecture
- Massive-scale capability
- Enterprise-ready
Storage Options
- Local File System
- MongoDB
- HBase
- Gora support
Monitoring
- Comprehensive logging
- Detailed metrics
- Full transparency
AI-Powered
- Automatic field extraction
- Pattern recognition
- Accurate data capture
- WeChat: galaxyeye
- Weibo: galaxyeye
- Email: galaxyeye@live.cn, ivincent.zhang@gmail.com
- Twitter: galaxyeye8
- Website: platon.ai