I ran experiments with the CoLight method on the hangzhou_4x4 network, and the final results were as follows:
2023-12-25 02:09:05 (INFO): Final Travel Time is 343.5820, mean rewards: -21.5031, queue: 1.2889, delay: 0.0717, throughput: 2732.
My environment is Windows 10 with SUMO version 1.16.0.
My results differ considerably from those reported in the paper. Given the same data and the same hyperparameter settings, should experiments run on CityFlow and on SUMO produce similar results?
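
In case it helps with a side-by-side comparison against the paper's numbers, here is a minimal sketch that pulls the metrics out of the log line above. It assumes the log format is exactly as shown; the helper name `parse_final_metrics` and the dictionary keys are my own and are not part of the library.

```python
import re

# The final log line from my run, copied verbatim.
LOG_LINE = (
    "2023-12-25 02:09:05 (INFO): Final Travel Time is 343.5820, "
    "mean rewards: -21.5031, queue: 1.2889, delay: 0.0717, throughput: 2732."
)

def parse_final_metrics(line):
    """Hypothetical helper: extract the metrics from one 'Final Travel Time' log line."""
    number = r"(-?\d+(?:\.\d+)?)"  # signed integer or decimal
    pattern = (
        r"Final Travel Time is " + number
        + r", mean rewards: " + number
        + r", queue: " + number
        + r", delay: " + number
        + r", throughput: " + number
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("line does not match the expected log format")
    # Key names are assumptions chosen for readability.
    keys = ("travel_time", "mean_reward", "queue", "delay", "throughput")
    return dict(zip(keys, map(float, match.groups())))

metrics = parse_final_metrics(LOG_LINE)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Running the same script on the final log line of a CityFlow run would give two dictionaries that can be compared metric by metric.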