Conversation
f5fbe61 to
83f2297
Compare
|
run-ci: all |
|
The PR was built and ran successfully in standalone mode running on CPU. Here are some of the comparison plots.
The full set of validation and comparison plots can be found here. Here is a timing comparison: |
|
The PR was built and ran successfully with CMSSW running on CPU. Here are some plots. OOTB All Tracks
The full set of validation and comparison plots can be found here. |
|
run-ci: all |
|
The PR was built and ran successfully in standalone mode running on GPU. Here are some of the comparison plots.
The full set of validation and comparison plots can be found here. Here is a timing comparison: |
|
@slava77 I think this PR is good to go. Represents most of the boiler-plate changes of the CPU speedups PR. |
|
The PR was built and ran successfully with CMSSW running on GPU. Here are some plots. OOTB All Tracks
The full set of validation and comparison plots can be found here. |
slava77
left a comment
There was a problem hiding this comment.
nice updates.
I think the comment cleanup in the MiniDoublet code is a bit too aggressive. While some removals may be clean for some tautological docs, quite a bit is going to lose clarity. Please recover
9aad224 to
727bac8
Compare
|
run-ci: all |
727bac8 to
2375562
Compare
|
run-ci: all |
|
run-ci: all |
|
The PR was built and ran successfully in standalone mode running on CPU. Here are some of the comparison plots.
The full set of validation and comparison plots can be found here. Here is a timing comparison: |
|
The PR was built and ran successfully with CMSSW running on GPU. Here are some plots. OOTB All Tracks
The full set of validation and comparison plots can be found here. |
|
The PR was built and ran successfully with CMSSW running on CPU. Here are some plots. OOTB All Tracks
The full set of validation and comparison plots can be found here. |
|
The PR was built and ran successfully in standalone mode running on GPU. Here are some of the comparison plots.
The full set of validation and comparison plots can be found here. Here is a timing comparison: |
|
run-ci: all |
a2305fb to
df0990a
Compare
|
The PR was built and ran successfully in standalone mode running on CPU. Here are some of the comparison plots.
The full set of validation and comparison plots can be found here. Here is a timing comparison: |
|
The PR was built and ran successfully with CMSSW running on CPU. Here are some plots. OOTB All Tracks
The full set of validation and comparison plots can be found here. |
df0990a to
6ba26d8
Compare
|
run-ci: all |
|
run-ci: all |
|
The PR was built and ran successfully in standalone mode running on CPU. Here are some of the comparison plots.
The full set of validation and comparison plots can be found here. Here is a timing comparison: |
|
The PR was built and ran successfully in standalone mode running on GPU. Here are some of the comparison plots.
The full set of validation and comparison plots can be found here. Here is a timing comparison: |
















































This PR Timing (CPU) - commit 2 (pre-checks, exact trig simplifications, and additional early exits)



This PR Timing (CPU) - commit 1 (reducing redundant memory loads)
Master Timing (CPU)