mnist

MNIST classification task via several different deep learning approaches.

TODO:
- Learning rate scheduler
- Gradient accumulation
- Gradient checkpointing
- Freezing layers
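As a rough illustration of the gradient accumulation item, here is a minimal, framework-free sketch. It is not this repository's implementation. The 1-D linear model, the toy data, and the names `mse_grad` and `accumulated_grad` are all assumptions made for the example. It shows that gradients summed over equal-sized micro-batches, each scaled by one over the number of micro-batches, match the gradient of the full batch for a mean loss.

```python
# Framework-free sketch of gradient accumulation (hypothetical names and data).
# Model: y_hat = w * x, loss: mean squared error over the batch.

def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Accumulate gradients over equal-sized micro-batches.

    Each micro-batch gradient is scaled by 1 / num_micro_batches so that the
    accumulated value equals the gradient of the full batch.
    """
    if len(xs) % micro_batch_size != 0:
        raise ValueError("batch size must be divisible by micro_batch_size")
    num_micro = len(xs) // micro_batch_size
    total = 0.0
    for start in range(0, len(xs), micro_batch_size):
        stop = start + micro_batch_size
        total += mse_grad(w, xs[start:stop], ys[start:stop]) / num_micro
    return total


# Toy data (assumed for illustration): the true relation is y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

full = mse_grad(w, xs, ys)                                # one large batch
accum = accumulated_grad(w, xs, ys, micro_batch_size=2)   # two micro-batches

# A single optimizer step is taken only after all micro-batches are processed.
learning_rate = 0.01
w_new = w - learning_rate * accum
```

In a deep learning framework the same idea usually amounts to dividing each micro-batch loss by the number of accumulation steps, calling the backward pass on every micro-batch, and stepping the optimizer only once per accumulated batch.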