- Preserve weights according to their importance for the previous task(s) (a sketch of this as a regulariser follows below)
- Papers:
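A minimal sketch of the "preserve weights by importance" idea, assuming an EWC-style quadratic penalty: the per-parameter importance estimate (e.g. a diagonal Fisher approximation), the stored previous-task parameters, and the `lam` weighting are assumptions for illustration, not details from these notes.

```python
import torch


def snapshot(model):
    """Store a copy of the current parameters after finishing a task."""
    return {name: p.detach().clone() for name, p in model.named_parameters()}


def weight_preservation_penalty(model, old_params, importance, lam=1.0):
    """EWC-style quadratic penalty: pulls each parameter back toward its
    previous-task value, scaled by a per-parameter importance estimate
    (assumed precomputed, e.g. a diagonal Fisher estimate)."""
    penalty = 0.0
    for name, param in model.named_parameters():
        if name in importance:
            penalty = penalty + (importance[name] * (param - old_params[name]) ** 2).sum()
    return lam * penalty


# New-task objective (sketch): loss = task_loss + weight_preservation_penalty(model, old_params, importance)
```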
- Idea: Transferring knowledge of multiple reinforcement learning policies into a single multi-task policy
- Vanilla policy distillation: transfer policies learned by multiple teacher networks into one student network via supervised regression on (state, Q-value) pairs.
- Training the student Q-network by matching the output distributions of the student and teacher Q-networks with a KL-divergence loss is more effective (sketched below, after the paper list).
- Hierarchical Experience Replay: directly combine layers from the teacher networks into the multi-task policy network, plus a new sampling approach for experience replay (ER) (also sketched below).
- Papers:
- Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model
- Learning and Transfer of Modulated Locomotor Controllers
- Learning to do laps with reinforcement learning and neural nets
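A minimal sketch of the KL-based distillation loss described above, assuming PyTorch, a softmax over the teacher's Q-values, and a low temperature that sharpens the teacher's distribution; the temperature value and the commented training step are illustrative assumptions, not details from these notes.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_q, teacher_q, temperature=0.01):
    """KL(teacher || student) between softmax distributions over Q-values.

    A low temperature sharpens the teacher's distribution, pushing the
    student toward the teacher's greedy action."""
    teacher_probs = F.softmax(teacher_q / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_q / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")


# One update step (sketch), with states drawn from a teacher's replay memory:
#   student_q = student_net(states)
#   teacher_q = teacher_net(states).detach()
#   loss = distillation_loss(student_q, teacher_q)
#   loss.backward(); optimizer.step()
```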
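For the hierarchical experience replay point, a rough sketch of two-level sampling, assuming one replay buffer per teacher task: first pick a task, then sample transitions from that task's buffer so every task stays represented during distillation. The uniform task weighting, FIFO eviction, and class/parameter names are assumptions for illustration, not the exact scheme from the notes.

```python
import random


class HierarchicalReplay:
    """Two-level replay: sample a task first, then transitions from that task."""

    def __init__(self, task_names, capacity_per_task=50_000):
        self.buffers = {name: [] for name in task_names}
        self.capacity = capacity_per_task

    def add(self, task, transition):
        buf = self.buffers[task]
        if len(buf) >= self.capacity:
            buf.pop(0)  # drop the oldest transition (FIFO eviction)
        buf.append(transition)

    def sample(self, batch_size):
        # Level 1: pick a task uniformly among non-empty buffers.
        task = random.choice([t for t, b in self.buffers.items() if b])
        # Level 2: sample transitions (with replacement) from that task's buffer.
        buf = self.buffers[task]
        return task, [random.choice(buf) for _ in range(batch_size)]
```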