Skip to content

ScreenShare.AI is a context-aware assistant that sees, understands, and interacts with your screen in real time. It combines live screen sharing, OCR analysis, multi-AI integration, and voice interaction to deliver instant, relevant help powered by Next.js, Node.js, WebSockets, and MongoDB with secure authentication and persistent history.

harshaltupe12/ScreenShare

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

17 Commits
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Jerry - AI-Powered Screen Assistant

A personal AI assistant with screen sharing capabilities that can see and understand your screen to provide contextual help.

Current Status

βœ… Completed Features

  • Frontend UI (95% Complete): Modern, responsive design with authentication, dashboard, and AI assistant interface
  • Backend API (95% Complete): Express.js server with WebSocket support, JWT authentication, and AI integration
  • AI Integration (85% Complete): Gemini AI API integration with screen analysis and voice synthesis
  • Authentication System (100% Complete): JWT-based authentication with password hashing and session management
  • Real-time Features (90% Complete): WebSocket connections, screen sharing, and live AI responses
  • Data Persistence (100% Complete): MongoDB Atlas integration with user data, AI sessions, and conversation history

πŸš€ Production Ready

Your Jerry AI Assistant is now production-ready with complete data persistence using MongoDB Atlas, secure authentication, AI session tracking, and scalable cloud database architecture.

Features

πŸ–₯️ Real-Time Screen Sharing

  • Live Screen Sharing: Share your screen with AI for real-time analysis
  • Cross-Platform: Works on desktop browsers with screen sharing permissions
  • Visual Indicators: See when your screen is being shared
  • Automatic Updates: Screen updates every 500ms for smooth experience
  • Image Compression: Automatic compression for optimal performance

πŸ€– AI Assistant

  • Context-Aware: AI can see and analyze your shared screen using OCR
  • Real-Time Responses: Get instant AI assistance for any task
  • Multiple AI Providers: Gemini AI, Hugging Face, and intelligent fallbacks
  • Voice Synthesis: Browser TTS using react-speech-kit
  • Voice Playback: Audio responses with error handling
  • Speech Recognition: Voice input for questions
  • Conversation History: All conversations are saved and accessible

πŸ” Secure Authentication

  • JWT Authentication: Secure token-based authentication
  • Password Hashing: bcrypt with salt rounds of 12
  • Account Security: Login attempt tracking and automatic lockout
  • User Profiles: Complete profile management and preferences
  • Session Management: Automatic token refresh and secure logout

Getting Started

Prerequisites

  • Node.js (v16 or higher)
  • npm or yarn
  • Modern browser with screen sharing support

Installation

  1. Clone the repository
git clone <repository-url>
cd jerry
  1. Install dependencies
# Backend
cd backend
npm install

# Frontend
cd ../frontend
npm install
  1. Set up environment variables
# Backend (.env)
PORT=5000
FRONTEND_URL=http://localhost:3000
MONGODB_URI=your_mongodb_uri

# Authentication (Required)
JWT_SECRET=your_super_secret_jwt_key_change_in_production
JWT_REFRESH_SECRET=your_super_secret_refresh_key_change_in_production

# AI Services (optional but recommended)
GEMINI_API_KEY=your_gemini_api_key
HUGGING_FACE_API_KEY=your_hugging_face_api_key

# Frontend (.env.local)
NEXT_PUBLIC_BACKEND_URL=http://localhost:5000

πŸ’‘ Tip: See backend/AI_SETUP.md for detailed setup instructions for AI services and authentication. πŸ’‘ Tip: See backend/MONGODB_SETUP.md for detailed MongoDB Atlas setup instructions.

  1. Start the servers
# Backend (Terminal 1)
cd backend
npm start

# Frontend (Terminal 2)
cd frontend
npm run dev
  1. Access the application

Usage

Getting AI Assistance

  1. Sign in to your account or create a new one
  2. Click "Start AI Assistant" from the dashboard
  3. Share your screen when prompted (optional but recommended)
  4. Ask questions via text or voice input
  5. Get instant AI responses with context from your screen
  6. Enable voice synthesis to hear AI responses

Screen Sharing

  1. Click "Share Screen" in the AI assistant interface
  2. Select your screen/window when prompted
  3. AI will analyze your screen using OCR for better context
  4. Ask questions about what's on your screen
  5. Get contextual responses based on your screen content

AI Features

  • Context-aware responses based on your screen content
  • Voice input for hands-free interaction
  • Voice synthesis for audio responses
  • Multiple AI providers with automatic fallbacks
  • Conversation history for reference

Technical Details

Architecture

  • Frontend: Next.js 14 with TypeScript
  • Backend: Node.js with Express and Socket.IO
  • Real-time: WebSocket communication for instant updates
  • Authentication: JWT-based secure authentication
  • Database: MongoDB for data persistence

Screen Sharing Implementation

  • MediaDevices API: Uses getDisplayMedia() for screen capture
  • Canvas Capture: Converts video frames to base64 images with compression
  • WebSocket Broadcasting: Sends screen data to AI processing
  • Real-time Updates: 500ms refresh rate for smooth experience
  • Image Optimization: Automatic compression to 1280x720 max resolution

Security

  • HTTPS Required: Screen sharing requires secure context
  • JWT Authentication: Secure token-based user authentication
  • Password Security: bcrypt hashing with salt rounds
  • Account Protection: Automatic lockout after failed attempts
  • Data Privacy: Screen data is not stored permanently

Browser Support

Screen Sharing Support

  • βœ… Chrome/Chromium (v72+)
  • βœ… Firefox (v66+)
  • βœ… Safari (v13+)
  • βœ… Edge (v79+)

Required Permissions

  • Screen sharing permission
  • Microphone (for voice features)
  • Camera (optional)

Development

Project Structure

jerry/
β”œβ”€β”€ backend/          # Node.js server
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ controllers/
β”‚   β”‚   β”œβ”€β”€ routes/
β”‚   β”‚   β”œβ”€β”€ services/
β”‚   β”‚   β”œβ”€β”€ models/
β”‚   β”‚   β”œβ”€β”€ middleware/
β”‚   β”‚   └── utils/
β”‚   └── index.js      # Main server file
β”œβ”€β”€ frontend/         # Next.js application
β”‚   β”œβ”€β”€ app/          # App router pages
β”‚   β”œβ”€β”€ components/   # React components
β”‚   └── lib/          # Utility functions
└── README.md

Key Components

  • ScreenShare.tsx: Handles screen capture and display
  • AIChat.tsx: AI conversation interface
  • WebSocket Service: Real-time communication
  • AI Service: Multi-provider AI integration
  • Auth Service: JWT authentication management
  • OCR Service: Tesseract.js text extraction

Testing

Test the Complete Workflow

  1. Run the test script to verify all components:
cd backend
node test-workflow.js
  1. Manual Testing:
    • Start both servers (backend and frontend)
    • Create an account or sign in
    • Start the AI assistant
    • Share your screen
    • Ask the AI a question
    • Verify voice responses work
    • Test voice input functionality

Expected Workflow

  1. User signs in with secure authentication ➜
  2. User starts AI assistant from dashboard ➜
  3. User shares screen via WebRTC ➜
  4. Frontend captures screen snapshot ➜
  5. Sends image via WebSocket ➜
  6. Backend performs OCR (Tesseract) ➜
  7. Extracted text sent to Gemini AI ➜
  8. Gemini gives intelligent suggestions ➜
  9. Backend sends AI response ➜
  10. Frontend shows chat + plays voice using browser TTS

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly using the test script
  5. Submit a pull request

License

This project is licensed under the MIT License.

Support

For support and questions, please open an issue in the repository.

About

ScreenShare.AI is a context-aware assistant that sees, understands, and interacts with your screen in real time. It combines live screen sharing, OCR analysis, multi-AI integration, and voice interaction to deliver instant, relevant help powered by Next.js, Node.js, WebSockets, and MongoDB with secure authentication and persistent history.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published