Do profiling of the fixed-ms2 score SSVM #10
So far we have reached a speed-up by a factor of 4 with very simple modifications:
There is still room for improvement:
The most run-time-consuming parts:
We could reach a further improvement by optimizing the kernel computation:
The biggest speed-up so far, though, came from computing the preference values for all candidates associated with one sequence (i.e. for all of its nodes) in one go.
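To illustrate why batching helps, here is a minimal sketch. It assumes that a preference value is a weighted sum of kernel values between the active candidates and the candidate being scored, and it uses a plain linear kernel as a stand-in. The function names, `alphas` and the kernel choice are all hypothetical and not taken from the SSVM code.

```python
import numpy as np


def linear_kernel(A, B):
    # Stand-in kernel (assumption): the actual SSVM kernel may differ.
    return A @ B.T


def preference_values_per_node(X_active, alphas, candidates_per_node):
    # One kernel-matrix computation per node of the sequence.
    return [alphas @ linear_kernel(X_active, X_c) for X_c in candidates_per_node]


def preference_values_batched(X_active, alphas, candidates_per_node):
    # Stack the candidates of all nodes, compute a single kernel matrix,
    # then split the result back into one array per node.
    sizes = [len(X_c) for X_c in candidates_per_node]
    X_all = np.vstack(candidates_per_node)
    prefs = alphas @ linear_kernel(X_active, X_all)
    return np.split(prefs, np.cumsum(sizes)[:-1])
```

Both functions return the same values. The batched variant replaces many small kernel computations with one large one, which usually reduces call overhead and lets the underlying BLAS routine work on a bigger matrix.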
We used 4 cores for the parallel computations ('alg_1__numba' and 'ufunc'). Which algorithm performs best depends on the dimensions of the input data. For example, when the cost per kernel matrix element increases (d=7500), the 'ufunc' implementation performs best.
When the output matrix is rectangular, i.e. n_A < n_B, 'alg_1__numba' performs best. This case occurs when the preference values are calculated: n_A is the number of active candidates and n_B is the number of candidates for which the preference values should be calculated.
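A comparison like this can be set up with a small timing harness. The sketch below is only an illustration: it does not reproduce 'alg_1__numba' or 'ufunc', but compares two hypothetical NumPy implementations of a linear kernel (again a stand-in) for a chosen output shape (n_A, n_B) and feature dimension d.

```python
import time

import numpy as np


def kernel_rowwise(A, B):
    # Hypothetical baseline: fill the kernel matrix one row at a time.
    K = np.empty((A.shape[0], B.shape[0]))
    for i in range(A.shape[0]):
        K[i] = B @ A[i]
    return K


def kernel_vectorized(A, B):
    # Hypothetical alternative: a single matrix product.
    return A @ B.T


def benchmark(impls, n_A, n_B, d, n_rep=3, seed=0):
    # Return the best wall-clock time (in seconds) per implementation.
    rng = np.random.default_rng(seed)
    A = rng.random((n_A, d))
    B = rng.random((n_B, d))
    timings = {}
    for name, fn in impls.items():
        best = float("inf")
        for _ in range(n_rep):
            t0 = time.perf_counter()
            fn(A, B)
            best = min(best, time.perf_counter() - t0)
        timings[name] = best
    return timings
```

Calling, for instance, `benchmark({"rowwise": kernel_rowwise, "vectorized": kernel_vectorized}, n_A=50, n_B=2000, d=7500)` covers the rectangular case; swapping in other implementations or shapes only requires changing the arguments.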
N = 24, n_cand=50, L_max=15 Numpy: 22.856s
N = 48, n_cand=50, L_max=15 Numpy: 97.751s
N = 96, n_cand=50, L_max=15 Numpy: 321.269s
N = 24, n_cand=50, L_max=25 Numpy: 27.407s
N = 24, n_cand=100, L_max=15 Numpy: 37.855s
... where do we spend most of the run-time?
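One way to answer this is Python's built-in `cProfile` module. The helper below is a minimal sketch: `profile_call` is a hypothetical name, and the function to profile (for example the SSVM training routine) has to be passed in by the caller.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, n_top=10, **kwargs):
    # Run fn under the profiler and return its result together with a
    # text report of the n_top entries sorted by cumulative time.
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(n_top)
    return result, stream.getvalue()
```

Sorting by cumulative time shows which high-level calls dominate the run-time. Sorting by `"tottime"` instead points to the individual functions in which the time is actually spent.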