This project implements a Deep Q-Learning approach using PyTorch. Deep Q-Learning combines Q-Learning with deep neural networks to solve complex decision-making tasks. The primary goal of this project is to train an agent capable of learning optimal policies through interactions with its environment.
- Implements a Deep Q-Network (DQN) for reinforcement learning.
- PyTorch-based implementation for easy extensibility.
- Supports replay memory and target networks.
- Configurable hyperparameters for learning and exploration.
- **Environment Interaction:** The agent interacts with the environment to gather experiences in the form of states, actions, rewards, and next states.
- **Replay Memory:** Experiences are stored in a memory buffer to decorrelate data and improve training stability (a minimal buffer sketch follows this list).
- **Deep Q-Network:** A neural network estimates the Q-value for each state-action pair (sketched below).
- **Optimization:** The network is optimized with stochastic gradient descent to minimize the difference between predicted and target Q-values.
- **Target Network:** A separate network computes stable target Q-values during training.
- **Exploration vs. Exploitation:** An epsilon-greedy policy balances exploration of new actions with exploitation of the learned policy (the training-step sketch below illustrates both the target computation and epsilon-greedy selection).
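A replay buffer can be as simple as a bounded deque with uniform random sampling. The sketch below is illustrative; the class and field names are assumptions, not necessarily the notebook's:

```python
import random
from collections import deque, namedtuple

# One stored experience: (state, action, reward, next_state, done)
Transition = namedtuple("Transition", ("state", "action", "reward", "next_state", "done"))

class ReplayMemory:
    """Fixed-size buffer that stores transitions and samples them uniformly."""

    def __init__(self, capacity):
        # The deque silently drops the oldest transition once capacity is reached
        self.buffer = deque(maxlen=capacity)

    def push(self, *args):
        self.buffer.append(Transition(*args))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between consecutive steps
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```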
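The Q-network itself can be a small multilayer perceptron that maps a state vector to one Q-value per action. A minimal sketch, with placeholder layer sizes:

```python
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per discrete action."""

    def __init__(self, state_dim, n_actions, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_actions),  # one output per action
        )

    def forward(self, state):
        return self.net(state)
```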
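Putting the pieces together, one optimization step samples a batch from the buffer, computes the TD target r + γ · max Q_target(s', a') with the frozen target network, and takes a gradient step on the online network. This sketch builds on the `Transition` tuple above; the function names and details are illustrative, not the notebook's exact code:

```python
import random
import torch
import torch.nn.functional as F

def select_action(policy_net, state, epsilon, n_actions):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return policy_net(state.unsqueeze(0)).argmax(dim=1).item()

def optimize(policy_net, target_net, optimizer, batch, gamma=0.99):
    """One SGD step toward the TD target r + gamma * max_a' Q_target(s', a')."""
    states = torch.stack([t.state for t in batch])
    actions = torch.tensor([t.action for t in batch]).unsqueeze(1)
    rewards = torch.tensor([t.reward for t in batch], dtype=torch.float32)
    next_states = torch.stack([t.next_state for t in batch])
    dones = torch.tensor([t.done for t in batch], dtype=torch.float32)

    # Q(s, a) from the online network, for the actions actually taken
    q_pred = policy_net(states).gather(1, actions).squeeze(1)

    # Stable targets come from the frozen target network; terminal states get no bootstrap
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        q_target = rewards + gamma * q_next * (1.0 - dones)

    loss = F.smooth_l1_loss(q_pred, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The target network is then refreshed periodically, e.g. `target_net.load_state_dict(policy_net.state_dict())` every fixed number of steps.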
- Python 3.8+
- PyTorch
- NumPy
- Matplotlib
- OpenAI Gym (if used for environment setup)
- Clone the repository:

  ```bash
  git clone https://github.com/Prasannakumar/Trader-Bot-Deep-Reinforcement-Learning.git
  cd Trader-Bot-Deep-Reinforcement-Learning
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run the Jupyter notebook:

  ```bash
  jupyter notebook deep_q_torch.ipynb
  ```
- Open the notebook and execute each cell sequentially.
- Modify the hyperparameters in the configuration section to experiment with different learning rates, discount factors, and exploration rates.
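For example, a configuration block might look like the following; the names and values are illustrative defaults, not the notebook's exact variables:

```python
# Illustrative hyperparameters -- names and values are examples only
config = {
    "learning_rate": 1e-3,    # optimizer step size
    "gamma": 0.99,            # discount factor for future rewards
    "epsilon_start": 1.0,     # initial exploration rate
    "epsilon_end": 0.05,      # final exploration rate
    "epsilon_decay": 0.995,   # multiplicative decay per episode
    "batch_size": 64,         # transitions sampled per optimization step
    "memory_capacity": 10_000,
    "target_update": 100,     # steps between target-network syncs
}
```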
- Setup Section: Initializes libraries and configures the environment.
- Replay Memory Class: Handles experience storage and sampling.
- Deep Q-Network Class: Implements the neural network for Q-value estimation.
- Training Loop: Iteratively trains the agent using collected experiences.
- Evaluation Section: Tests the trained agent in the environment and visualizes performance.
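The reward visualization in the evaluation section could be as simple as the following Matplotlib sketch, assuming a list of per-episode rewards collected during training:

```python
import matplotlib.pyplot as plt

def plot_rewards(episode_rewards):
    """Plot total reward per episode to show training progress."""
    plt.figure(figsize=(8, 4))
    plt.plot(episode_rewards, label="Episode reward")  # episode_rewards: list of floats
    plt.xlabel("Episode")
    plt.ylabel("Total reward")
    plt.title("Reward progression during training")
    plt.legend()
    plt.show()
```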
- A trained agent capable of solving tasks in the given environment.
- Visualization of reward progression during training.
- Performance evaluation metrics.
Contributions are welcome! Please submit a pull request or open an issue for discussion.
This project is open source and freely available to all. Please be mindful of the terms and conditions of OpenAI and the other technologies used in this application.