- GIRIRAJ - 2020UCA1904
- VINEET KUMAR - 2020UCM2312
- SAIBAL PATRA - 2020UCM2348
1) Understand why you would need to be able to predict stock price movements;
2) Download the data - You will be using stock market data gathered from Yahoo Finance;
3) Split the data into training, validation, and test sets, and perform some data normalization;
4) Motivate and briefly discuss an LSTM model, as it allows you to predict more than one step ahead;
5) Predict and visualize future stock prices using current data.
Note: Stock market prices are highly unpredictable and volatile. This means that there are no consistent patterns in the data that allow you to model stock prices over time near-perfectly.
Dataset Link : https://finance.yahoo.com/quote/NTPC.NS/history?p=NTPC.NS
Stock prices come in several different flavors. They are:
Open: Opening stock price of the day
Close: Closing stock price of the day
High: Highest stock price of the day
Low: Lowest stock price of the day
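As a minimal sketch of how these four fields might be read, assume the price history has been exported from Yahoo Finance as a CSV file with the usual `Date,Open,High,Low,Close,Adj Close,Volume` header. The rows below are invented placeholder values, not real NTPC quotes, and `load_prices` is a hypothetical helper name used only for illustration.

```python
import csv
import io

# Placeholder rows in the Yahoo Finance CSV layout.
# The numbers are invented for illustration and are NOT real NTPC quotes.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-01,100.0,102.5,99.0,101.0,101.0,1000
2020-01-02,101.0,103.0,100.5,102.0,102.0,1200
2020-01-03,null,null,null,null,null,null
2020-01-06,102.0,102.5,98.0,99.5,99.5,1500
"""


def load_prices(handle):
    """Return one dict per trading day with the date and the four price fields."""
    rows = []
    for record in csv.DictReader(handle):
        try:
            rows.append({
                "date": record["Date"],
                "open": float(record["Open"]),
                "high": float(record["High"]),
                "low": float(record["Low"]),
                "close": float(record["Close"]),
            })
        except ValueError:
            # Assumption: rows with non-numeric entries (e.g. "null") are skipped.
            continue
    return rows


prices = load_prices(io.StringIO(SAMPLE_CSV))
closes = [row["close"] for row in prices]
print(len(prices), closes)
```

To read a real export, the same function can be given an open file handle instead of the in-memory sample.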
Long Short-Term Memory (LSTM) models are powerful time-series models that can predict an arbitrary number of steps into the future. An LSTM module (or cell) has five essential components that allow it to model both long-term and short-term data.
Cell state (ct) - This represents the internal memory of the cell, which stores both short-term and long-term memories
Hidden state (ht) - This is the output state information calculated w.r.t. the current input, the previous hidden state, and the current cell input, which you eventually use to predict future stock prices. Additionally, the hidden state can retrieve only the short-term memory, only the long-term memory, or both types of memory stored in the cell state to make the next prediction.
Input gate (it) - Decides how much information from the current input flows into the cell state
Forget gate (ft) - Decides how much of the previous cell state is kept in the current cell state, based on the current input and the previous hidden state
Output gate (ot) - Decides how much information from the current cell state flows into the hidden state, so that, if needed, the LSTM can pick only the long-term memories, or both the short-term and long-term memories
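The interaction of the five components above can be sketched for a single LSTM unit with a scalar input and scalar states. This is an illustrative sketch of the standard LSTM update, not the project's code: the function name `lstm_step`, the `params` layout, and the weight values are all assumptions chosen for the example.

```python
import math


def sigmoid(x):
    """Logistic function; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, params):
    """One time step of a single-unit LSTM.

    params maps each of "i", "f", "o", "g" to a (w_x, w_h, b) tuple:
    the weight on the input, the weight on the previous hidden state, and the bias.
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    i_t = sigmoid(pre("i"))    # input gate: how much new information enters the cell
    f_t = sigmoid(pre("f"))    # forget gate: how much of the old cell state is kept
    o_t = sigmoid(pre("o"))    # output gate: how much of the cell state is exposed
    g_t = math.tanh(pre("g"))  # candidate cell input

    c_t = f_t * c_prev + i_t * g_t  # new cell state
    h_t = o_t * math.tanh(c_t)      # new hidden state
    return h_t, c_t


# Arbitrary illustrative weights (a trained model would learn these).
params = {
    "i": (0.5, 0.1, 0.0),
    "f": (0.4, 0.2, 1.0),
    "o": (0.3, 0.1, 0.0),
    "g": (0.8, 0.3, 0.0),
}

h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.1]:  # a short made-up input sequence
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

In a real layer the scalars become vectors and the weights become matrices, but the gating arithmetic is the same.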
For the model, we have partitioned the dataset into three parts: training, validation, and testing.
The training section consists of the first 80% of the data.
The validation section contains the data between the 80% and 90% marks.
For testing, we will use the remaining 10% of the data.
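The 80/10/10 partition and the normalization step could look roughly like the following. This is a sketch under stated assumptions: the split is chronological, min-max scaling is used, and the scaling range is fitted on the training section only so that no information from later data leaks into training. The text does not say which normalization was used, and `split_series` and `fit_min_max` are hypothetical helper names.

```python
def split_series(values, train_pct=80, val_pct=90):
    """Split a chronologically ordered list into train / validation / test parts."""
    n = len(values)
    train_end = n * train_pct // 100
    val_end = n * val_pct // 100
    return values[:train_end], values[train_end:val_end], values[val_end:]


def fit_min_max(train_values):
    """Return a function that scales values using the training min and max."""
    lo, hi = min(train_values), max(train_values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series

    def scale(values):
        return [(v - lo) / span for v in values]

    return scale


# Placeholder series standing in for closing prices (not real data).
closes = [float(v) for v in range(100, 200)]

train, val, test = split_series(closes)
scale = fit_min_max(train)  # assumption: fitted on the training section only
train_s, val_s, test_s = scale(train), scale(val), scale(test)
print(len(train), len(val), len(test))
```

Because the range comes from the training section alone, validation and test values can fall outside [0, 1] when prices move beyond the training range.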
To train the model, we are using an LSTM network.
We are using the Sequential model and the Adam optimizer, with the learning rate fixed at 0.001.
We will train the model for 100 epochs (epochs = 100).
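A Keras setup matching this description might be sketched as follows. Only the Sequential model, the Adam optimizer, the learning rate of 0.001, and the 100 epochs come from the text; the window length, the number of LSTM units, the single Dense output layer, and the mean squared error loss are assumptions made for illustration, and `build_model` and `train` are hypothetical helper names.

```python
LEARNING_RATE = 0.001  # from the text
EPOCHS = 100           # from the text
WINDOW = 3             # assumption: number of past days fed to the model
UNITS = 64             # assumption: size of the LSTM layer


def build_model(window=WINDOW, units=UNITS):
    """Build and compile a small Sequential LSTM regressor."""
    # Imported here so the settings above can be read without TensorFlow installed.
    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Input(shape=(window, 1)),  # (time steps, features per step)
        keras.layers.LSTM(units),
        keras.layers.Dense(1),                  # one predicted price
    ])
    model.compile(
        loss="mse",  # assumption: mean squared error
        optimizer=keras.optimizers.Adam(learning_rate=LEARNING_RATE),
    )
    return model


def train(model, x_train, y_train, x_val, y_val):
    """Fit the model, monitoring the validation section after each epoch."""
    return model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),
        epochs=EPOCHS,
    )
```

Here `x_train` and `x_val` are expected to have shape (samples, window, 1), with `y_train` and `y_val` holding the value that follows each window.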
We will visualize Training Predictions and Observations, Testing Predictions and Observations, and Validation Predictions and Observations one by one.
Training Dataset Visualization