A web application that analyzes Reddit user profiles to generate detailed personas using AI-powered analysis. This tool scrapes Reddit user data (posts and comments) and uses Google's Gemini AI to extract personality traits, motivations, frustrations, and emotional profiles.
- User Data Extraction: Scrapes Reddit posts and comments using PRAW
- AI-Powered Analysis: Uses Google Gemini AI for persona generation
- Comprehensive Personas: Generates detailed profiles including:
  - Basic demographics (age, occupation, location)
  - Personality traits and archetypes
  - Motivations and frustrations
  - Emotional analysis with scoring
  - Goals and behavioral patterns
- Web Interface: Clean, responsive web UI built with FastAPI and Tailwind CSS
- Database Storage: MongoDB integration for data persistence
- Real-time Processing: Live persona generation with loading indicators
Before running this application, make sure you have:
- Python 3.8+
- MongoDB (local or cloud instance)
- Reddit API credentials
- Google Gemini API key
- Clone the repository
    git clone https://github.com/yourusername/reddit-persona-generator.git
    cd reddit-persona-generator
- Create a virtual environment
    python -m venv venv
    # On Windows
    venv\Scripts\activate
    # On macOS/Linux
    source venv/bin/activate
- Install dependencies
    pip install -r requirements.txt
- Set up environment variables
  Create a `.env` file in the project root:
    # Reddit API Credentials
    REDDIT_CLIENT_ID=your_reddit_client_id
    REDDIT_CLIENT_SECRET=your_reddit_client_secret
    REDDIT_USER_AGENT=your_app_name/1.0 by /u/yourusername
    # Google Gemini API
    GEMINI_API_KEY=your_gemini_api_key
    # MongoDB Connection
    MONGODB_URI=mongodb://localhost:27017/
- Go to Reddit Apps
- Click "Create App" or "Create Another App"
- Fill out the form:
  - Name: Your app name
  - App type: Choose "script"
  - Description: Optional
  - About URL: Optional
  - Redirect URI: `http://localhost:8000` (or leave blank for scripts)
- Note down the Client ID (under the app name) and Client Secret
- Go to Google AI Studio
- Create a new API key
- Copy the API key to your `.env` file
Option 1: Local MongoDB
- Install MongoDB locally
- Start the MongoDB service
- Use the default connection string: `mongodb://localhost:27017/`
Option 2: MongoDB Atlas (Cloud)
- Create an account at MongoDB Atlas
- Create a cluster
- Get the connection string and add it to `.env`
- Start the FastAPI server
    uvicorn main:app --reload
- Open your browser and navigate to `http://localhost:8000`
- Analyze a Reddit user
  - Enter a Reddit username or URL (e.g., `https://reddit.com/u/username` or just `username`)
  - Click "Generate Persona"
  - Wait for the analysis to complete (may take 1-2 minutes)
  - View the generated persona dashboard
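Because the form accepts either a profile URL or a bare username, the input has to be reduced to a plain username before Reddit is queried. The project's actual parsing code is not shown here, so the following is only a minimal sketch of one way to do it. The helper name `extract_username` is hypothetical, and the 3-20 character pattern reflects Reddit's username rules as an assumption rather than anything taken from this repository.

```python
import re
from urllib.parse import urlparse

# Assumption: Reddit usernames are 3-20 characters drawn from
# letters, digits, underscores, and hyphens.
_USERNAME_RE = re.compile(r"^[A-Za-z0-9_-]{3,20}$")


def extract_username(user_input: str) -> str:
    """Return the bare username from a profile URL, 'u/name', or a plain name.

    Hypothetical helper for illustration; not part of the project's code.
    """
    text = user_input.strip()
    if "://" in text:
        # Keep only the path, e.g. "/u/username/" or "/user/username/comments"
        text = urlparse(text).path
    parts = [segment for segment in text.split("/") if segment]

    candidate = None
    # Look for a "u" or "user" segment and take whatever follows it.
    for index, segment in enumerate(parts[:-1]):
        if segment.lower() in ("u", "user"):
            candidate = parts[index + 1]
            break
    else:
        # No "u/" prefix: accept the input only if it is a single token.
        if len(parts) == 1:
            candidate = parts[0]

    if candidate is None or not _USERNAME_RE.match(candidate):
        raise ValueError(f"Could not find a valid Reddit username in {user_input!r}")
    return candidate


print(extract_username("https://reddit.com/u/username"))
print(extract_username("Hungry-Move-6603"))
```

Rejecting malformed input early gives the user an immediate error instead of a failed API call after the loading indicator has already started.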
For direct command-line usage:
    python persona_generator.py
Enter the Reddit username when prompted.
reddit-persona-generator/
├── persona_generator.py      # Core analysis logic
├── app/
│   ├── main.py               # FastAPI web application
│   └── templates/
│       ├── index.html        # Home page template
│       └── combined.html     # Persona display template
├── requirements.txt          # Python dependencies
├── .env.example              # Environment variable template
├── .gitignore                # Git ignore rules
├── .python-version           # Python version manager file
├── pyproject.toml            # Project metadata and build system
├── uv.lock                   # Dependency lock file (used by uv)
├── sample_users/             # Sample output persona files
│   ├── kojed                 # Sample analysis output for Reddit user 'kojed'
│   └── Hungry-Move-6603
└── README.md                 # Project documentation
- You can find real examples in the `sample_users/` directory. Each file corresponds to a specific Reddit user's persona analysis.
| Variable | Description | Required |
|---|---|---|
| `REDDIT_CLIENT_ID` | Reddit API client ID | Yes |
| `REDDIT_CLIENT_SECRET` | Reddit API client secret | Yes |
| `REDDIT_USER_AGENT` | Reddit API user agent | Yes |
| `GEMINI_API_KEY` | Google Gemini API key | Yes |
| `MONGODB_URI` | MongoDB connection string | No (defaults to local) |
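The required and optional variables above can be checked once at startup so that a missing key fails fast with a clear message. This is a minimal sketch, not the project's actual startup code: the function name `load_settings` is hypothetical, and it assumes the `.env` values have already been loaded into the process environment (for example with a dotenv loader).

```python
import os
from typing import Mapping, Optional

# Variables marked "Yes" in the Required column.
REQUIRED_VARS = (
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USER_AGENT",
    "GEMINI_API_KEY",
)
# Fallback used when MONGODB_URI is not set ("defaults to local").
DEFAULT_MONGODB_URI = "mongodb://localhost:27017/"


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Validate configuration and return it as a dict.

    Hypothetical helper: pass a mapping for testing, or omit it to
    read from os.environ.
    """
    env = os.environ if env is None else env
    # Treat unset and empty values the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {name: env[name] for name in REQUIRED_VARS}
    settings["MONGODB_URI"] = env.get("MONGODB_URI") or DEFAULT_MONGODB_URI
    return settings
```

Calling `load_settings()` with no arguments reads the real environment. Passing a plain dict keeps the check easy to test without touching any actual credentials.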
- Data Limits: By default, the app fetches the 100 most recent posts and the 100 most recent comments. To change this, edit the `limit` arguments in `persona_generator.py`:
    posts = list(redditor.submissions.new(limit=100))
    comments = list(redditor.comments.new(limit=100))
- Persona Fields: Customize the persona analysis structure in `json_format_definition` within the `analyze_with_llm()` function.
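Instead of editing two hard-coded values, the limit could be passed as a parameter. The sketch below assumes a hypothetical wrapper named `fetch_activity`; it relies only on the `submissions.new(limit=...)` and `comments.new(limit=...)` calls shown above plus the PRAW attributes `title`, `selftext`, and `body`. The `FakeListing` class is an offline stand-in so the example runs without Reddit credentials.

```python
from types import SimpleNamespace


def fetch_activity(redditor, limit=100):
    """Collect a user's recent posts and comments as plain dicts.

    Hypothetical helper. `redditor` is expected to behave like a PRAW
    Redditor, e.g. praw.Reddit(...).redditor("some_name").
    """
    posts = [
        {"title": submission.title, "text": submission.selftext}
        for submission in redditor.submissions.new(limit=limit)
    ]
    comments = [
        {"text": comment.body}
        for comment in redditor.comments.new(limit=limit)
    ]
    return {"posts": posts, "comments": comments}


class FakeListing:
    """Offline stand-in for a PRAW listing, used only for this demo."""

    def __init__(self, items):
        self._items = items

    def new(self, limit=100):
        return iter(self._items[:limit])


demo_user = SimpleNamespace(
    submissions=FakeListing(
        [
            SimpleNamespace(title="First post", selftext="Hello"),
            SimpleNamespace(title="Second post", selftext=""),
        ]
    ),
    comments=FakeListing([SimpleNamespace(body="Nice write-up")]),
)

activity = fetch_activity(demo_user, limit=1)
print(len(activity["posts"]), len(activity["comments"]))
```

Larger limits mean more API requests and a longer prompt for the model, so raising the value trades speed for coverage.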
The application generates personas with the following structure:
- Basic Info: Name, age, occupation, location, status
- Personality Traits: Key characteristics
- Motivations: What drives the user
- Frustrations: Common pain points
- Goals & Needs: Aspirations and requirements
- Emotional Profile: Scored analysis (0-100) for:
  - Happiness
  - Confidence
  - Anxiety
  - Anger
  - Sadness
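Since the persona comes back from a language model, it is worth validating before it is stored or rendered. The following is a rough sketch under stated assumptions: the key `emotional_profile` and the lowercase emotion names are hypothetical, because the real field names are defined in `json_format_definition`, and the function name `parse_persona` is also invented for illustration. It strips an optional Markdown code fence, which models sometimes wrap around JSON, and checks that each score lies in the 0-100 range.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker
EMOTIONS = ("happiness", "confidence", "anxiety", "anger", "sadness")


def strip_code_fence(text: str) -> str:
    """Remove a surrounding Markdown code fence, if there is one."""
    text = text.strip()
    if len(text) >= 6 and text.startswith(FENCE) and text.endswith(FENCE):
        inner = text[3:-3]
        first_line, newline, rest = inner.partition("\n")
        tag = first_line.strip()
        # Drop an optional language tag such as "json" on the opening line.
        if newline and (tag == "" or tag.isalnum()):
            inner = rest
        return inner.strip()
    return text


def parse_persona(raw: str) -> dict:
    """Parse model output and check the emotional scores (assumed schema)."""
    persona = json.loads(strip_code_fence(raw))
    scores = persona.get("emotional_profile", {})
    for name in EMOTIONS:
        value = scores.get(name)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0 <= value <= 100:
            raise ValueError(f"Score for {name!r} must be a number from 0 to 100")
    return persona


sample = FENCE + "json\n" + json.dumps(
    {
        "emotional_profile": {
            "happiness": 62,
            "confidence": 70,
            "anxiety": 35,
            "anger": 10,
            "sadness": 20,
        }
    }
) + "\n" + FENCE
print(parse_persona(sample)["emotional_profile"]["happiness"])
```

Raising an error on out-of-range scores, rather than silently clamping them, makes malformed model output visible during development.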
- Rate Limits: The Reddit API enforces rate limits (60 requests/minute is the figure assumed here; check Reddit's current API documentation for the exact quota)
- Public Data Only: Only analyzes publicly available posts/comments
- Privacy: Respect user privacy and Reddit's terms of service
- AI Accuracy: Persona analysis is AI-generated and may not be 100% accurate
- Data Freshness: Analysis is based on recent activity only (by default, the last 100 posts and the last 100 comments)
- This tool only analyzes publicly available Reddit data
- No personal information is stored beyond what's publicly posted
- Users can request data deletion by contacting the administrator
- Follow Reddit's API terms of service and rate limiting guidelines
- "FATAL: GEMINI_API_KEY not set"
  - Ensure your `.env` file contains the correct `GEMINI_API_KEY`
- MongoDB Connection Error
  - Check if MongoDB is running locally
  - Verify your `MONGODB_URI` is correct
- Reddit API Errors
  - Verify your Reddit API credentials
  - Check if you're hitting rate limits
  - Ensure the username exists and has public posts
- Empty Persona Generated
  - User might have no public posts/comments
  - Account might be too new or inactive
  - Check if the username is spelled correctly
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- PRAW for Reddit API access
- Google Gemini for AI-powered analysis
- FastAPI for the web framework
- Tailwind CSS for styling
If you encounter any issues or have questions, please:
- Check the troubleshooting section above
- Search existing GitHub issues
- Create a new issue with detailed information about the problem
Happy analyzing! 🎉