- Features
- Architecture
- Quick Start
- Installation
- Configuration
- Usage Guide
- Datasets
- Development
- Contributing
- License
- Few-Shot Learning: Learns from existing posts to match unique writing styles
- Multi-language Support: Generate posts in English and Hinglish
- Customizable Length: Short (1-5 lines), Medium (6-10 lines), Long (11-15 lines)
- Tone Variations: Professional, Casual, Humorous, Inspirational, Educational
- Smart Context: AI analyzes your dataset to understand patterns and preferences
- Custom Prompts: Create and save your own prompt templates
- Target Audience: Tailor content for Students, Professionals, Entrepreneurs, Job Seekers
- Writing Styles: Storytelling, List Format, Question-Answer, Tips & Tricks, Personal Reflection
- Smart Features: Hashtag inclusion, emoji integration, call-to-action generation
- Professional Formatting: Structured layouts for business communications
- Engagement Analysis: Track performance metrics and engagement trends
- Content Insights: Word count, readability scores, hashtag analysis, sentiment analysis
- Visual Dashboard: Interactive charts with Plotly visualizations
- Performance Recommendations: AI-driven insights for better engagement
- Dataset Comparison: Compare performance across different datasets
- Multiple Datasets: Switch between different post collections seamlessly
- Auto-Processing: Upload raw JSON files and automatically extract metadata
- Smart Categorization: AI-powered tagging, tone detection, and audience classification
- Real-time Stats: Live statistics and insights about your datasets
- Export/Import: Easy dataset sharing and backup functionality
The system uses a specific naming convention to organize datasets:
Raw Datasets (need processing):
raw_[category].json
Examples: raw_tech_students.json, raw_professionals.json, raw_entrepreneurs.json
Processed Datasets (ready for generation):
processed_raw_[category].json
Examples: processed_raw_tech_students.json, processed_raw_professionals.json
Structure Comparison:
- Raw: Basic posts with text and engagement only
- Processed: Enhanced with AI-extracted tags, tone, target_audience, language, length, line_count
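The naming convention above can be checked mechanically. The following is a minimal sketch, not the project's own code: `classify_dataset` and `processed_name` are hypothetical helper names, and the only assumption is that files follow the `raw_[category].json` and `processed_raw_[category].json` patterns described above.

```python
from pathlib import Path
from typing import Tuple

RAW_PREFIX = "raw_"
PROCESSED_PREFIX = "processed_"


def classify_dataset(filename: str) -> Tuple[str, str]:
    """Return (kind, category), where kind is "raw" or "processed"."""
    stem = Path(filename).stem  # file name without the .json extension
    full_prefix = PROCESSED_PREFIX + RAW_PREFIX
    if stem.startswith(full_prefix):
        return "processed", stem[len(full_prefix):]
    if stem.startswith(RAW_PREFIX):
        return "raw", stem[len(RAW_PREFIX):]
    raise ValueError(f"{filename!r} does not follow the naming convention")


def processed_name(raw_filename: str) -> str:
    """Build the processed file name that corresponds to a raw file name."""
    kind, _ = classify_dataset(raw_filename)
    if kind != "raw":
        raise ValueError(f"{raw_filename!r} is not a raw dataset")
    return PROCESSED_PREFIX + Path(raw_filename).name
```

For example, `processed_name("raw_tech_students.json")` yields `processed_raw_tech_students.json`, matching the examples listed above.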
- Specialized Dataset: 1000+ posts covering the complete college experience
- Year-wise Content: First year to final year journey posts
- Technical Events: Hackathons, coding competitions, tech workshops
- Academic Milestones: Assignments, projects, internships, placements
- Personal Growth: Learning experiences, challenges, achievements
Raw Posts → Preprocessing → Feature Extraction → Dataset Creation
    ↓             ↓                  ↓                    ↓
Text Content → Topic/Language → Tags/Metadata → Structured Data

User Input → Few-Shot Selection → Prompt Engineering → LLM Generation → Post Output
    ↓               ↓                    ↓                    ↓               ↓
Query Params → Similar Posts → Context Prompt → AI Model → Generated Content
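Part of the feature-extraction stage in the first pipeline needs no LLM at all. The sketch below derives `line_count` and `length` for a raw post. It assumes that `line_count` counts non-empty lines and that the length buckets follow the Short/Medium/Long ranges listed under Customizable Length, with anything above 10 lines treated as Long. `length_category` and `add_basic_metadata` are hypothetical names, not functions taken from the project.

```python
from typing import Any, Dict


def length_category(line_count: int) -> str:
    """Map a line count to a length bucket (assumed thresholds)."""
    if line_count <= 5:
        return "Short"
    if line_count <= 10:
        return "Medium"
    return "Long"


def add_basic_metadata(raw_post: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the post with line_count and length added."""
    text = raw_post["text"]
    # Assumption: blank lines used for spacing do not count as lines.
    line_count = sum(1 for line in text.splitlines() if line.strip())
    enriched = dict(raw_post)  # leave the input dictionary untouched
    enriched["line_count"] = line_count
    enriched["length"] = length_category(line_count)
    return enriched
```

Fields such as tags, tone, and target_audience are described as AI-extracted, so they are deliberately left out of this sketch.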
# The system identifies similar posts based on:
- Topic/Tag matching
- Language preference
- Length category
- Writing style patterns
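The first three criteria above can be sketched as a plain filter over processed posts. This is an illustration under stated assumptions, not the project's implementation: `select_few_shot_examples` is a hypothetical name, ranking by engagement is an assumed tie-breaker, the sample posts are made up, and matching on writing style patterns is not covered.

```python
from typing import Any, Dict, List


def select_few_shot_examples(
    posts: List[Dict[str, Any]],
    topic: str,
    length: str,
    language: str,
    limit: int = 2,
) -> List[Dict[str, Any]]:
    """Return up to `limit` posts that match topic, length, and language."""
    matches = [
        post
        for post in posts
        if topic in post.get("tags", [])
        and post.get("length") == length
        and post.get("language") == language
    ]
    # Assumption: prefer the posts that performed best.
    matches.sort(key=lambda post: post.get("engagement", 0), reverse=True)
    return matches[:limit]


# Made-up sample data, for illustration only.
sample_posts = [
    {"text": "Started my internship today.", "tags": ["Internship"],
     "length": "Medium", "language": "English", "engagement": 120},
    {"text": "Internship ka pehla din!", "tags": ["Internship"],
     "length": "Medium", "language": "Hinglish", "engagement": 300},
    {"text": "Three lessons from my internship.", "tags": ["Internship", "Career"],
     "length": "Medium", "language": "English", "engagement": 210},
    {"text": "Won a hackathon.", "tags": ["Hackathon"],
     "length": "Short", "language": "English", "engagement": 90},
]
selected = select_few_shot_examples(sample_posts, "Internship", "Medium", "English")
```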
# Example: For "Internship" + "Medium" + "English"
selected_examples = filter_posts(
    topic="Internship",
    length="Medium",
    language="English"
)

The system creates intelligent prompts by combining:
- User Requirements: Topic, length, language, tone
- Context Examples: 1-2 most relevant posts from dataset
- Style Guidelines: Specific instructions for writing style
- Formatting Rules: Hashtags, emojis, structure preferences
# Prompt Structure:
"""
Generate a LinkedIn post using the below information.
1) Topic: {user_topic}
2) Length: {desired_length}
3) Language: {language_preference}
4) Tone: {selected_tone}
Use the writing style as per the following examples:
Example 1: {similar_post_1}
Example 2: {similar_post_2}
"""

- Model: Llama 3.1 70B Versatile (via Groq API)
- Temperature: Optimized for creative yet consistent output
- Context Window: Efficiently manages prompt length and examples
- Error Handling: Robust validation and fallback mechanisms
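The prompt structure shown above can be assembled with ordinary string formatting before it is sent to the model. The sketch below is one possible way to do it. `build_prompt` is a hypothetical helper, the cap of two examples follows the "1-2 most relevant posts" guideline, and the call to the LLM itself is intentionally left out.

```python
from typing import List

PROMPT_HEADER = (
    "Generate a LinkedIn post using the below information.\n"
    "1) Topic: {topic}\n"
    "2) Length: {length}\n"
    "3) Language: {language}\n"
    "4) Tone: {tone}"
)


def build_prompt(
    topic: str, length: str, language: str, tone: str, examples: List[str]
) -> str:
    """Combine user requirements and up to two example posts into a prompt."""
    prompt = PROMPT_HEADER.format(
        topic=topic, length=length, language=language, tone=tone
    )
    if examples:
        prompt += "\nUse the writing style as per the following examples:"
        # Only the first two examples are used to keep the prompt short.
        for index, example in enumerate(examples[:2], start=1):
            prompt += f"\nExample {index}: {example}"
    return prompt
```

When no similar posts are found, the sketch simply omits the examples section rather than sending an empty one.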
- Python 3.8+
- Groq API key (available from the Groq console)
- Clone the repository
  git clone https://github.com/vacantvectors/project-genai-post-generator.git
  cd project-genai-post-generator
- Install dependencies
  pip install -r requirements.txt
- Configure API Key
  Create a .env file in the project root:
  GROQ_API_KEY=your_groq_api_key_here
- Run the application
  streamlit run main.py
- Access the app
  Open your browser and navigate to http://localhost:8501
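A missing API key is easier to diagnose if it is checked once at startup. The sketch below reads `GROQ_API_KEY` from the process environment and fails with a clear message. It assumes the `.env` file has already been loaded into the environment (for example with `load_dotenv()` from the python-dotenv package), and `get_groq_api_key` is a hypothetical helper rather than a function from the project.

```python
import os


def get_groq_api_key() -> str:
    """Return the Groq API key or raise a descriptive error if it is missing."""
    api_key = os.environ.get("GROQ_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GROQ_API_KEY not found. Add it to the .env file in the "
            "project root or export it in your shell."
        )
    return api_key
```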
- Size: Professional LinkedIn posts with engagement metrics
- Features: Text content, engagement data, topics, language classification
- Use Case: General professional content generation
- Size: 1000+ posts covering 4-year college journey
- Coverage:
- First Year: Orientation, first coding classes, basic programming
- Second Year: OOP concepts, internship applications, technical workshops
- Third Year: Advanced algorithms, first internships, open source contributions
- Fourth Year: Final projects, placement preparation, graduation
{
  "text": "Day 1 of college!\nFeeling excited and nervous at the same time...",
  "tags": ["College Life", "First Year", "Computer Science"],
  "language": "English",
  "engagement": 45
}

Perfect for fast content creation:
- Select topic from dropdown
- Choose length and language
- Pick tone and formatting options
- Click "Generate Post"
For specific requirements:
- Enter custom topic
- Define target audience
- Specify post purpose
- Add context and keywords
- Generate tailored content
Monitor your content performance:
- Engagement metrics
- Content analysis
- Tag performance
- Performance insights
- Upload new posts
- Switch between datasets
- Merge datasets
- Export analytics
# Example generation process:
1. User selects: Topic="Hackathon", Length="Medium", Language="English"
2. System finds similar posts in dataset
3. Creates context-aware prompt with examples
4. LLM generates new post matching style
5. Post validation and formatting
6. Output delivered to user

- Input Validation: Sanitizes user inputs
- API Error Handling: Graceful degradation for API issues
- Data Validation: Ensures dataset integrity
- Logging: Comprehensive error tracking
- Caching: Intelligent caching of API responses
- Batch Processing: Efficient handling of multiple requests
- Memory Management: Optimized data loading and processing
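Graceful degradation for API issues, as listed above, is often done by retrying a failed call a few times and then returning a fallback message. The sketch below shows that pattern in isolation. `generate_with_fallback`, its parameters, and the fallback text are all hypothetical, and `generate` stands in for whatever function actually calls the LLM.

```python
import time
from typing import Callable

DEFAULT_FALLBACK = "Post generation is temporarily unavailable. Please try again."


def generate_with_fallback(
    generate: Callable[[str], str],
    prompt: str,
    retries: int = 2,
    delay_seconds: float = 0.0,
    fallback: str = DEFAULT_FALLBACK,
) -> str:
    """Call generate(prompt), retrying on failure, then fall back to a message."""
    for attempt in range(retries + 1):
        try:
            return generate(prompt)
        except Exception:
            # Broad catch keeps the sketch short; real code would catch
            # specific network or API errors and log them.
            if attempt < retries:
                time.sleep(delay_seconds)
    return fallback
```

Passing the generation function as an argument keeps the retry logic independent of any particular LLM client.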
- Engagement Analytics: Total, average, median engagement
- Content Metrics: Word count, hashtags, emojis, readability
- Performance Insights: Best performing topics, optimal length
- Trend Analysis: Language preferences, tag popularity
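The engagement and content metrics above can be computed with the Python standard library alone. The sketch below is a minimal illustration: `summarize_posts` is a hypothetical name, it assumes every post has `text` and `engagement` fields as in the dataset examples, and it counts hashtags with a simple `#\w+` pattern.

```python
import re
from statistics import mean, median
from typing import Any, Dict, List

HASHTAG_PATTERN = re.compile(r"#\w+")


def summarize_posts(posts: List[Dict[str, Any]]) -> Dict[str, float]:
    """Compute total, average, and median engagement plus basic text metrics."""
    if not posts:
        raise ValueError("at least one post is required")
    engagements = [post["engagement"] for post in posts]
    word_counts = [len(post["text"].split()) for post in posts]
    hashtag_counts = [len(HASHTAG_PATTERN.findall(post["text"])) for post in posts]
    return {
        "total_engagement": sum(engagements),
        "average_engagement": mean(engagements),
        "median_engagement": median(engagements),
        "average_word_count": mean(word_counts),
        "total_hashtags": sum(hashtag_counts),
    }
```

Readability and emoji counts are omitted here because they depend on choices (scoring formula, emoji definition) that the list above does not specify.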
- Interactive charts with Plotly
- Word clouds for content analysis
- Performance comparison graphs
- Exportable analytics reports
Create reusable templates for specific use cases:
template = {
    "name": "Tech Achievement Post",
    "prompt": "Generate a post about {achievement} for {audience}...",
    "variables": ["achievement", "audience"]
}

Add your own posts to improve generation:
{
  "text": "Your post content here...",
  "engagement": 150,
  "language": "English",
  "tags": ["Your", "Tags"],
  "line_count": 5
}

project-genai-post-generator/
├── main.py                          # Main Streamlit application
├── post_generator.py                # Core generation logic
├── few_shot.py                      # Few-shot learning implementation
├── llm_helper.py                    # LLM integration
├── analytics.py                     # Analytics and insights
├── error_handler.py                 # Error handling utilities
├── preprocess.py                    # Data preprocessing
├── requirements.txt                 # Dependencies
├── README.md                        # Documentation
├── data/
│   ├── processed_posts.json         # Main dataset
│   ├── college_student_posts.json   # Student-specific dataset
│   ├── generated_posts_history.json # Generation history
│   └── prompt_templates.json        # Custom templates
└── resources/
    ├── architecture.jpg             # Architecture diagram
    └── tool.jpg                     # Tool screenshot
We welcome contributions! Here's how you can help:
- Fork the repository
- Create a feature branch
  git checkout -b feature/amazing-feature
- Make your changes
- Add tests if applicable
- Commit your changes
  git commit -m 'Add amazing feature'
- Push to the branch
  git push origin feature/amazing-feature
- Open a Pull Request
- Additional datasets (industry-specific posts)
- New generation algorithms
- Enhanced analytics features
- UI/UX improvements
- Performance optimizations
Error: GROQ_API_KEY not found
Solution: Ensure .env file contains valid GROQ_API_KEY
Error: ModuleNotFoundError
Solution: Run 'pip install -r requirements.txt'
Error: Model not responding
Solution: Check internet connection and API key validity
- Clear browser cache
- Restart Streamlit server
- Check system memory usage
- Verify API rate limits
- Multi-platform Support: Twitter, Facebook post generation
- Advanced AI Models: Integration with GPT-4, Claude
- Real-time Analytics: Live performance tracking
- Collaborative Features: Team workspaces
- API Integration: REST API for external applications
- Mobile App: Native mobile application
- Sentiment Analysis: Post emotion tracking
- Competitor Analysis: Benchmark against industry standards
- Optimal Timing: Best posting time recommendations
- A/B Testing: Content variation testing
- Groq for providing excellent LLM API services
- Streamlit for the amazing web app framework
- LangChain for LLM integration capabilities
- Plotly for interactive visualization tools
- Open Source Community for various libraries and tools
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: sarthak@vacantvectors.com
If this project helps you, please consider:
- Starring the repository
- Reporting bugs
- Suggesting new features
- Sharing with others
Made by Sarthak Chakraborty
Empowering creators with AI-driven content generation
