Tweak default model hyperparams for better synthesis results
This is the result of a series of experiments aimed at improving ML synthesis performance, as measured by the recently implemented "rating" of all-experiments synthesis evaluation results. Room for improvement became apparent after the recent update to validation dataset selection (it now consists of completely new synthesis trees not used for training).

A close look at the new training history charts showed signs of serious overfitting hurting the model's *cross-tree* generalization ability: training loss/MAE converged very fast, but that only made the new validation loss/MAE worse. Lowering the learning rate and adding L2 regularization and strong dropout (to simulate model bagging) slowed training convergence considerably, making it possible to observe the behaviour of the new validation loss in more detail. A clear minimum was spotted, overfitting was confirmed, and the training epoch count, steps per epoch, and batch size were adjusted accordingly.

The resulting model achieved a synthesis evaluation rating of 76/100, a marked improvement over the previous model's 60/100. For reference, the default score + SOTA synthesis method has a rating of 77/100, so this is a strong result. Since the new hyperparameters produce models that conduct ML synthesis much better, they should be the default.
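For illustration, here is a minimal sketch of the kind of changes described above, assuming a TensorFlow/Keras training setup. The layer sizes, input width, and exact hyperparameter values are hypothetical placeholders, not the project's actual configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_model(input_dim: int = 32) -> tf.keras.Model:
    # Illustrative architecture; sizes and values are assumptions.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        # L2 regularization penalizes large weights to curb overfitting.
        layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4)),
        # Strong dropout approximates an ensemble of thinned networks
        # (the "model bagging" effect mentioned above).
        layers.Dropout(0.5),
        layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4)),
        layers.Dropout(0.5),
        layers.Dense(1),
    ])
    # A lower learning rate slows convergence, making the validation-loss
    # minimum easier to observe before overfitting sets in.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="mse", metrics=["mae"])
    return model

model = build_model()
# Epoch count, steps per epoch, and batch size are the knobs re-tuned
# around the observed validation-loss minimum (values illustrative):
# model.fit(train_ds, validation_data=val_ds,
#           epochs=30, steps_per_epoch=100, batch_size=64)
```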