This repository is for our 7th-semester project, Advanced Stock Price Forecasting Using a Hybrid Model of Numerical and Textual Analysis. It uses Python, NLP (NLTK, spaCy), machine learning models, Grafana, InfluxDB, and Streamlit for comprehensive data analysis and visualization.
The project predicts stock prices by combining numerical market data with textual analysis of financial news. Its components include:
- Data Collection and Storage: We gathered historical stock data of major companies and stored it in an InfluxDB database to efficiently handle large-scale time-series data.
- Data Visualization: A Grafana dashboard has been set up for real-time visualization of stock prices and analysis results, enhancing data interpretation and decision-making processes.
- Textual Analysis for Enhanced Forecasting: We utilized Natural Language Processing (NLP) libraries, such as NLTK and spaCy, to analyze financial news and reports. This component complements numerical analysis to improve the accuracy of our hybrid forecasting model.
- Machine Learning Models: The project used models including Naive Bayes, MLP (Multi-Layer Perceptron), Logistic Regression, and Random Forest to process both numerical and textual data, creating a robust and comprehensive stock prediction system.
- Collaboration and Project Management: The repository includes contributions from all team members with well-organized tasks, ensuring seamless collaboration and effective version control.
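As a rough illustration of the data storage step, the sketch below formats one daily price row as an InfluxDB line-protocol string, the text format InfluxDB accepts for writes. The measurement name `stock_price`, the `ticker` tag, the field names, and the sample values are all assumptions made for this example; they are not taken from the project's actual schema or data.

```python
from datetime import datetime, timezone


def to_line_protocol(ticker, day, open_, high, low, close, volume,
                     measurement="stock_price"):
    """Format one daily OHLCV row as an InfluxDB line-protocol string.

    The measurement, tag, and field names are illustrative assumptions.
    """
    # Line-protocol timestamps default to nanoseconds since the Unix epoch.
    ts = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    ts_ns = int(ts.timestamp()) * 1_000_000_000
    # Float fields are written as plain numbers; integer fields take an "i" suffix.
    fields = (
        f"open={float(open_)},high={float(high)},"
        f"low={float(low)},close={float(close)},volume={int(volume)}i"
    )
    return f"{measurement},ticker={ticker} {fields} {ts_ns}"


# Made-up values, used only to show the output shape.
line = to_line_protocol("AAPL", "2024-01-02", 100.0, 105.0, 99.0, 104.0, 1000000)
print(line)
```

Storing the ticker as a tag rather than a field is a design choice for this sketch: tags are indexed, so filtering a dashboard query by company stays cheap.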
📁 Stock-Market-Prediction/
├── 📁 Codes/
│ ├── 📁 Historical_Data_Analysis/
│ ├── 📁 Partial_Data_Analysis/
│ ├── 📁 Ticker_Symbols_Stocks/
│ ├── 📁 Flask_App/
│
├── 📁 Conferences/
│
├── 📁 Documents/
│
├── 📁 Reference_Documents/
│
├── 📁 Resources/
│
├── 📄 LICENSE
└── 📄 README.md
- Python: Core programming language used for data analysis, model building, and backend development.
- GitHub: Platform for version control and collaborative development.
- InfluxDB: Database for efficient time-series data storage and retrieval.
- Grafana: Tool for real-time data visualization and dashboard creation.
- Streamlit: Framework for creating interactive web applications.
- Flask: Lightweight framework for developing the project’s backend.
- Pandas: Library for data manipulation and analysis.
- Matplotlib & Plotly: Libraries for data visualization and graphical representation.
- NLP Libraries (NLTK, spaCy): Tools for processing and analyzing text data.
- Machine Learning Libraries: Used for implementing models like Naive Bayes, MLP, Logistic Regression, and Random Forest.
For easy visualization and data management, we are using the following tools:
This section provides an overview of the stock market, project details, and descriptions of the companies used in the project, including MAANG, Nvidia, Microsoft, and TCS.
This section displays numerical data of the stock market, featuring graphs of open, high, low, and close prices along with volume bar plots, RSI, and moving averages.
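For readers unfamiliar with the indicators named above, here is a minimal sketch of a simple moving average and an RSI computed from closing prices. It uses plain averages of gains and losses rather than Wilder's smoothing, and the function names and the default period of 14 are assumptions for illustration; the app's own calculations may differ.

```python
def simple_moving_average(closes, window):
    """Return the mean of each full window of consecutive closing prices."""
    return [
        sum(closes[i - window:i]) / window
        for i in range(window, len(closes) + 1)
    ]


def rsi(closes, period=14):
    """Relative Strength Index over the last `period` price changes.

    Uses plain averages of gains and losses (no Wilder smoothing).
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    # Day-over-day changes for the most recent `period` intervals.
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


# Made-up closing prices, for illustration only.
print(simple_moving_average([1, 2, 3, 4, 5], window=3))
print(rsi([10, 11, 10, 11, 10], period=4))
```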
This section highlights model predictions, including individual and comparative graphs of predicted and actual values for stock prices, as well as predicted RSI and moving averages.
This section visualizes sentiment analysis from news headlines, showcasing positive, negative, and neutral sentiment scores.
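To make the three scores concrete, the sketch below rates a headline with a tiny hand-made word list. It is a toy stand-in, not the project's method: the word lists, the function name, and the sample headline are assumptions, and a library scorer such as NLTK's VADER (`SentimentIntensityAnalyzer.polarity_scores`) returns comparable `pos`, `neg`, and `neu` values from a much larger lexicon.

```python
import re

# Tiny hand-made word lists, for illustration only.
POSITIVE = {"gain", "gains", "surge", "beats", "record", "growth", "up"}
NEGATIVE = {"loss", "losses", "fall", "falls", "misses", "lawsuit", "down"}


def headline_sentiment(headline):
    """Return the share of positive, negative, and neutral words in a headline."""
    words = re.findall(r"[a-z']+", headline.lower())
    if not words:
        return {"pos": 0.0, "neg": 0.0, "neu": 1.0}
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = len(words)
    return {
        "pos": pos / total,
        "neg": neg / total,
        "neu": (total - pos - neg) / total,
    }


# A made-up headline.
scores = headline_sentiment("Shares surge after record growth")
print(scores)
```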
The hybrid model combines numerical and textual data for a comprehensive analysis.
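One plausible way to combine the two data types is to join each trading day's price features with the average sentiment of that day's headlines, producing one feature row per day for a downstream classifier. The sketch below assumes that approach; the function name, the row layout, the neutral default of 0.0, and the sample values are hypothetical and not taken from the project's code.

```python
from collections import defaultdict
from statistics import mean


def build_hybrid_rows(price_rows, headline_scores):
    """Join daily price features with the mean sentiment of that day's headlines.

    price_rows: list of (date, close, volume) tuples.
    headline_scores: list of (date, score) tuples, score in [-1, 1].
    Days without headlines get a neutral sentiment of 0.0.
    """
    by_day = defaultdict(list)
    for day, score in headline_scores:
        by_day[day].append(score)

    rows = []
    for day, close, volume in price_rows:
        day_scores = by_day.get(day)
        sentiment = mean(day_scores) if day_scores else 0.0
        rows.append(
            {"date": day, "close": close, "volume": volume, "sentiment": sentiment}
        )
    return rows


# Made-up prices and sentiment scores, for illustration only.
prices = [("2024-01-02", 104.0, 1000), ("2024-01-03", 103.0, 1200)]
headlines = [("2024-01-02", 0.5), ("2024-01-02", -0.25)]
hybrid = build_hybrid_rows(prices, headlines)
print(hybrid)
```

Each resulting row can then be passed, alongside a label such as next-day price direction, to any of the models listed earlier.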
- Streamlit: The app is deployed at: Stock Market Numerical and Text Hybrid Prediction
Here's an overview of the Streamlit App:
- Flask: The app code is available here: Flask App Codes.
Here's an overview of the Flask App:
| Name | Madhurima Rawat | Geetanshu Dev Meshram | Sneha Jha |
|---|---|---|---|
| Role | Project Planner & Developer | Data Analyst & Backend Developer | Data Analyst |
| Responsibilities | Project planning, managing GitHub repository, code documentation, InfluxDB database setup, Grafana dashboard, Streamlit deployment, Flask backend, data visualization & preprocessing | Model building for numerical data, Flask app design | Text data processing, model building, hybrid model creation |
| Tools | GitHub, InfluxDB, Grafana, Streamlit, Python, Flask, Pandas, Matplotlib, Plotly | Python, Flask, ML libraries | NLP libraries, ML libraries, hybrid modeling tools |
| GitHub | GitHub | GitHub | GitHub |
- Partial Data Analysis:
  - Historical Stock Prices: Yahoo Finance
  - Textual and Hybrid Data:
- Complete Historical Data:
  - Alphabet (Google) (GOOG): Google Stock Price
  - Apple (AAPL): Apple Stock Price
  - Amazon (AMZN): Amazon Stock Price
  - Meta (META): Meta Stock Price
  - Netflix (NFLX): Netflix Stock Price
  - Nvidia (NVDA): Nvidia Stock Price
  - Microsoft (MSFT): Microsoft Stock Price
  - TCS: TCS Stock Price
- Illustration Links: