🤖 PulsarRPA

🌟 Introduction

💖 PulsarRPA: The AI-Powered, Lightning-Fast Browser Automation Solution! 💖

PulsarRPA is an AI-enabled 🤖, high-performance 🚀, distributed 🌐, and open-source 🔓 browser automation platform, built for large-scale automation 🏭. It excels at:

🤖 AI integration with LLMs for smarter automation
⚡ Ultra-fast, spider-grade browser automation
🧠 Advanced web content understanding
📊 Powerful data extraction APIs

PulsarRPA is designed to meet the demands of modern web automation, delivering accurate ✅ and comprehensive 📚 data extraction — even from the most complex 🔄 and dynamic ⚡ websites.

🎥 Demo Videos

🎬 YouTube:

📺 Bilibili: https://www.bilibili.com/video/BV1kM2rYrEFC

🚀 Quick Start Guide

🌟 For Beginners - Just Talk, No Special Skills Required!

Run

Download the latest executable JAR and run it:

# Linux/macOS and Windows (if curl is available)
curl -L -o PulsarRPA.jar https://github.com/platonai/PulsarRPA/releases/download/v3.0.2/PulsarRPA.jar
java -DDEEPSEEK_API_KEY=${DEEPSEEK_API_KEY} -jar PulsarRPA.jar

You can omit DEEPSEEK_API_KEY if you don't need the AI features.
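Since the API key is optional, a launch script can pass it only when it is set. The following is a minimal sketch, not part of PulsarRPA: the function name `pulsar_jvm_opts` is hypothetical. Note that `java` requires `-D` and the property name to be joined without a space.

```shell
# Hypothetical helper: print the JVM options needed for PulsarRPA.
# Prints nothing when DEEPSEEK_API_KEY is unset or empty.
pulsar_jvm_opts() {
  if [ -n "${DEEPSEEK_API_KEY:-}" ]; then
    printf '%s\n' "-DDEEPSEEK_API_KEY=${DEEPSEEK_API_KEY}"
  fi
}

# Usage (API keys contain no whitespace, so plain word splitting is safe here):
#   java $(pulsar_jvm_opts) -jar PulsarRPA.jar
```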

Download links:

For Docker users:

docker run -d -p 8182:8182 -e DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY} galaxyeye88/pulsar-rpa:latest

💬 Chat About a Webpage

Talk about a webpage using the chat-about API:

curl -X POST "http://localhost:8182/api/ai/chat-about" -H "Content-Type: application/json" -d '{
  "url": "https://www.amazon.com/dp/B0C1H26C46",
 "prompt": "introduce this product"
}'
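When the URL or prompt comes from a variable, the JSON body has to be built with escaping, or a stray double quote will break the request. The sketch below is one way to do that in plain shell. The helper names `json_escape`, `build_payload`, and `chat_about`, and the `PULSAR_URL` variable, are assumptions made for illustration and are not part of PulsarRPA. The escaping covers only backslashes and double quotes, so it suits single-line prompts.

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Not a full JSON encoder: newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body used by the chat-about and extract endpoints.
build_payload() {
  printf '{"url": "%s", "prompt": "%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the request; PULSAR_URL defaults to the local server from the Run step.
chat_about() {
  curl -X POST "${PULSAR_URL:-http://localhost:8182}/api/ai/chat-about" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1" "$2")"
}

# Usage:
#   chat_about "https://www.amazon.com/dp/B0C1H26C46" "introduce this product"
```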

📊 Extract Data

Extract data from a webpage using the extract API:

curl -X POST "http://localhost:8182/api/ai/extract" -H "Content-Type: application/json" -d '{
  "url": "https://www.amazon.com/dp/B0C1H26C46",
  "prompt": "product name, price, and description"
}'

💬 Chat with AI

Use the chat API to ask any question:

curl "http://localhost:8182/api/ai/chat?prompt=What-is-the-most-fantastical-technology-today"

Use the POST method to send a longer prompt:

curl -X POST "http://localhost:8182/api/ai/chat" -H "Content-Type: application/json" -d '
What is the most fantastical technology today?
You should return a list of 5 items.
'
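The GET form of the chat API replaces spaces with hyphens so the URL stays valid. A prompt containing spaces or punctuation needs percent-encoding before it goes into the query string. The sketch below does this in POSIX shell for ASCII prompts. The function names `urlencode` and `chat` and the `PULSAR_URL` variable are hypothetical, and it assumes the server decodes percent-encoded query parameters in the usual way, which the source does not confirm.

```shell
# Percent-encode every character except the RFC 3986 unreserved set.
# Intended for ASCII input; multi-byte characters are not handled.
urlencode() {
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character itself
    s="$rest"
    case "$c" in
      [A-Za-z0-9._~-]) out="${out}${c}" ;;
      *) out="${out}$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
}

# Send an encoded prompt to the chat endpoint.
chat() {
  curl "${PULSAR_URL:-http://localhost:8182}/api/ai/chat?prompt=$(urlencode "$1")"
}

# Usage:
#   chat "What is the most fantastical technology today?"
```

curl can also do the encoding itself with `curl -G --data-urlencode "prompt=..." URL`.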

🎓 For Advanced Users - LLM + X-SQL

curl -X POST "http://localhost:8182/api/x/e" -H "Content-Type: text/plain" -d "
select
  llm_extract(dom, 'product name, price, ratings') as llm_extracted_data,
  dom_base_uri(dom) as url,
  dom_first_text(dom, '#productTitle') as title,
  dom_first_slim_html(dom, 'img:expr(width > 400)') as img
from load_and_select('https://www.amazon.com/dp/B0C1H26C46', 'body');
"

The extracted data:

{
  "llm_extracted_data": {
    "product name": "Apple iPhone 15 Pro Max",
    "price": "$1,199.00",
    "ratings": "4.5 out of 5 stars"
  },
  "url": "https://www.amazon.com/dp/B0C1H26C46",
  "title": "Apple iPhone 15 Pro Max",
  "img": "<img src=\"https://example.com/image.jpg\" />"
}
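The request above passes the query inside double quotes, so the shell would expand any `$` or backtick that appears in it. For longer queries, a quoted heredoc keeps the SQL verbatim, and `--data-binary @-` sends it from standard input with newlines intact. This is a sketch with a shortened query. The function names `xsql_query` and `run_xsql` and the `PULSAR_URL` variable are hypothetical, and it assumes the endpoint accepts the same plain-text body as the inline form.

```shell
# Print the X-SQL query. The quoted 'SQL' delimiter disables shell expansion.
xsql_query() {
  cat <<'SQL'
select
  dom_base_uri(dom) as url,
  dom_first_text(dom, '#productTitle') as title
from load_and_select('https://www.amazon.com/dp/B0C1H26C46', 'body');
SQL
}

# Pipe the query to the X-SQL endpoint as the raw request body.
run_xsql() {
  xsql_query | curl -X POST "${PULSAR_URL:-http://localhost:8182}/api/x/e" \
    -H "Content-Type: text/plain" \
    --data-binary @-
}

# Usage:
#   run_xsql
```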

👨‍💻 For Experts - Native API

🎮 Browser Control:

val prompts = """
move cursor to the element with id 'title' and click it
scroll to middle
scroll to top
get the text of the element with id 'title'
"""

val eventHandlers = DefaultPageEventHandlers()
eventHandlers.browseEventHandlers.onDocumentActuallyReady.addLast { page, driver ->
    val result = session.instruct(prompts, driver)
}
session.open(url, eventHandlers)

📝 Example: View Kotlin Code

🤖 RPA Crawling:

val options = session.options(args)
val event = options.eventHandlers.browseEventHandlers
event.onBrowserLaunched.addLast { page, driver ->
    warnUpBrowser(page, driver)
}
event.onWillFetch.addLast { page, driver ->
    waitForReferrer(page, driver)
    waitForPreviousPage(page, driver)
}
event.onWillCheckDocumentState.addLast { page, driver ->
    driver.waitForSelector("body h1[itemprop=name]")
    driver.click(".mask-layer-close-button")
}
session.load(url, options)

📝 Example: View Kotlin Code

🔍 Complex Data Extraction with X-SQL:

select
    llm_extract(dom, 'product name, price, ratings, score') as llm_extracted_data,
    dom_first_text(dom, '#productTitle') as title,
    dom_first_text(dom, '#bylineInfo') as brand,
    dom_first_text(dom, '#price tr td:matches(^Price) ~ td') as price,
    dom_first_text(dom, '#acrCustomerReviewText') as ratings,
    str_first_float(dom_first_text(dom, '#reviewsMedley .AverageCustomerReviews span:contains(out of)'), 0.0) as score
from load_and_select('https://www.amazon.com/dp/B0C1H26C46 -i 1s -njr 3', 'body');

📚 Example Code:

📖 Advanced Guides

✨ Features

🕷️ Web Spider

Scalable crawling
Browser rendering
AJAX data extraction

🧠 LLM Integration

Natural language web content analysis
Intuitive content description

🎯 Text-to-Action

Simple language commands
Intuitive browser control

🤖 RPA Capabilities

Human-like task automation
SPA crawling support
Advanced workflow automation

🛠️ Developer-Friendly

One-line data extraction
SQL-like query interface
Simple API integration

📊 X-SQL Power

Extended SQL for web data
Content mining capabilities
Web business intelligence

🛡️ Bot Protection

Advanced stealth techniques
IP rotation
Privacy context management

⚡ Performance

Parallel page rendering
High-efficiency processing
Block-resistant design

💰 Cost-Effective

100,000+ pages/day
Minimal hardware requirements
Resource-efficient operation

✅ Quality Assurance

Smart retry mechanisms
Precise scheduling
Complete lifecycle management

🌐 Scalability

Fully distributed architecture
Massive-scale capability
Enterprise-ready

📦 Storage Options

Local File System
MongoDB
HBase
Gora support

📊 Monitoring

Comprehensive logging
Detailed metrics
Full transparency

🤖 AI-Powered

Automatic field extraction
Pattern recognition
Accurate data capture

📞 Contact Us

💬 WeChat: galaxyeye
🌐 Weibo: galaxyeye
📧 Email: galaxyeye@live.cn, ivincent.zhang@gmail.com
🐦 Twitter: galaxyeye8
🌍 Website: platon.ai
