An example script that researches topics and generates Twitter threads using AI. This project demonstrates how to build an agentic workflow using the AI SDK, showing how to chain together multiple AI prompts, tools, and operations (search, summarization, content generation) into a cohesive document.
You'll need:
- Node.js (v16 or higher)
- pnpm package manager
- Serper API key (for Google search results)
- OpenAI API key
- Install dependencies:
pnpm install
- Copy the example environment file and edit it with your API keys:
cp .env.example .env
Add your API keys to the .env file:
SERPER_API_KEY=your_serper_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
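Because both keys are required, a startup check can report a missing key before any request is made. The following is a minimal sketch, not the project's actual code: `REQUIRED_KEYS` and `missingEnvKeys` are hypothetical names, and the environment is passed in as a plain object so the helper stays easy to test.

```typescript
// Hypothetical helper; the names here are illustrative and not taken from the project.
const REQUIRED_KEYS = ["SERPER_API_KEY", "OPENAI_API_KEY"] as const;

// Returns the required keys that are unset or blank in the given environment object.
function missingEnvKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js entry point, one option is to call `missingEnvKeys(process.env)` once the .env file has been loaded and exit with a clear message if the result is not empty.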
To get a Serper API key:
- Go to serper.dev
- Sign up for an account
- Navigate to your dashboard to find your API key
To get an OpenAI API key:
- Go to platform.openai.com
- Sign up or log in
- Navigate to the API keys section
- Create a new API key
There are two ways to run the research tool:
pnpm research "your topic here"
pnpm pretty "your topic here"
Both commands will:
- Search Google for relevant articles
- Extract and summarize content
- Generate a Twitter thread based on the research
The pretty version includes progress bars and better formatted output.
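The three steps both commands share can be sketched as a small pipeline. This is an illustration under assumptions, not the project's implementation: `SearchResult`, `PipelineSteps`, and `researchThread` are hypothetical names, and each step is injected as a function so that the real Serper and OpenAI calls (or test stubs) can be plugged in.

```typescript
// All names below are hypothetical. In the real project these steps would be
// backed by Serper search, page extraction, and an OpenAI model via the AI SDK.
interface SearchResult {
  title: string;
  url: string;
}

interface PipelineSteps {
  search: (topic: string) => Promise<SearchResult[]>;
  summarize: (result: SearchResult) => Promise<string>;
  writeThread: (topic: string, summaries: string[]) => Promise<string[]>;
}

// Chains search -> summarize -> thread generation, skipping sources that fail.
async function researchThread(topic: string, steps: PipelineSteps): Promise<string[]> {
  const results = await steps.search(topic);

  const summaries: string[] = [];
  for (const result of results) {
    try {
      summaries.push(await steps.summarize(result));
    } catch {
      // One unreachable or unparsable page should not abort the whole run.
    }
  }

  if (summaries.length === 0) {
    throw new Error(`No usable sources found for "${topic}"`);
  }
  return steps.writeThread(topic, summaries);
}
```

Passing the steps in as functions keeps the chaining logic separate from any particular search provider or model, which also makes it possible to exercise the pipeline without network access.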
pnpm pretty "latest developments in quantum computing"
This will generate a researched Twitter thread about quantum computing developments.
The current implementation uses a naive approach with JSDOM to extract content from web pages, simply pulling all <p> tags. This has several limitations:
- Doesn't account for page structure or content relevance
- May miss important content in other HTML elements
- Doesn't handle dynamic content or JavaScript-rendered pages
- No semantic understanding of content importance
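The limitations above are easy to reproduce. The sketch below is a dependency-free stand-in for the paragraph-only extraction step: it uses a regular expression instead of JSDOM (an assumption made purely to keep the example self-contained), and `extractParagraphs` is a hypothetical name.

```typescript
// Stand-in for the JSDOM-based step: a regex is used here only to keep the
// example self-contained. It is not a real HTML parser.
function extractParagraphs(html: string): string[] {
  const paragraphPattern = /<p\b[^>]*>([\s\S]*?)<\/p>/gi;
  const paragraphs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = paragraphPattern.exec(html)) !== null) {
    // Drop nested tags and collapse whitespace inside the paragraph.
    const text = match[1].replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    if (text.length > 0) {
      paragraphs.push(text);
    }
  }
  return paragraphs;
}

const sample =
  "<h1>Title</h1><p>First <b>point</b>.</p><ul><li>Key detail</li></ul><p>Second point.</p>";
const extracted = extractParagraphs(sample);
// extracted is ["First point.", "Second point."]: the heading and the list item are lost.
```

Any content that lives outside <p> elements, such as the list item in the sample, never reaches the summarization step.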
A more robust approach would:
- Use proper web scraping tools like Puppeteer or Playwright
- Implement text chunking strategies
- Use embeddings and vector similarity to identify relevant content
- Store embeddings in a vector database for efficient similarity search
- Apply semantic analysis to prioritize important content
These improvements would significantly enhance the quality and relevance of the extracted research content.
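As a rough illustration of the chunking and similarity steps listed above, the sketch below splits text into overlapping word windows and ranks chunks by cosine similarity. It rests on several assumptions: all names are hypothetical, embeddings are plain number arrays that would come from an embedding model in practice, and a linear in-memory scan stands in for a vector database.

```typescript
// Splits text into chunks of up to maxWords words; neighbouring chunks share `overlap` words.
function chunkText(text: string, maxWords: number, overlap: number): string[] {
  if (maxWords <= 0 || overlap < 0 || overlap >= maxWords) {
    throw new Error("require maxWords > 0 and 0 <= overlap < maxWords");
  }
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += maxWords - overlap) {
    chunks.push(words.slice(start, start + maxWords).join(" "));
    if (start + maxWords >= words.length) break; // the last window reached the end
  }
  return chunks;
}

// Cosine similarity of two equal-length vectors; returns 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedChunk {
  text: string;
  embedding: number[]; // assumed to be produced by an embedding model
}

// Linear scan over all chunks; a vector database would replace this at scale.
function topKChunks(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

With the topic embedded as the query vector, only the highest-scoring chunks would be passed on for summarization, rather than every paragraph on the page.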