Adding timing metrics to CUDA and host executors (#842)
1 parent 1777afc, commit 2de8514. Showing 5 changed files with 111 additions and 21 deletions.
@@ -0,0 +1,38 @@
.. _profiling:

Profiling
#########

Profiling measures the performance of a program and helps identify bottlenecks in your MatX application. Since
the method for profiling depends on the executor, each executor implements its own profiling mechanism. For example,
the CUDA executor can use events recorded around the kernels it is profiling. Profiling is done through the executor
object rather than the `run` statement so that multiple `run`\s can be profiled together.

Profiling is started by calling the `start_timer()` method of the executor:

.. code-block:: cpp

    exec.start_timer();

To stop the profiler, call `stop_timer()`:

.. code-block:: cpp

    exec.stop_timer();

On an asynchronous executor, `stop_timer()` may need to block until the outstanding work completes.
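
The blocking behavior can be illustrated with a minimal, self-contained sketch. It does not use MatX:
`SketchAsyncExecutor`, its `run()` method, and `time_async_work()` are hypothetical names invented for illustration,
and the use of `std::async` and `std::chrono::steady_clock` is an assumption, not a description of how any MatX
executor is implemented. The sketch only shows why `stop_timer()` waits for outstanding work before recording the end time.

.. code-block:: cpp

    #include <chrono>
    #include <future>
    #include <thread>
    #include <utility>
    #include <vector>

    // Hypothetical asynchronous executor (NOT part of MatX): run() returns
    // immediately and the work finishes later on another thread.
    class SketchAsyncExecutor {
      using clock_type = std::chrono::steady_clock;

    public:
      // Record the start time.
      void start_timer() { start_ = clock_type::now(); }

      // Launch the work asynchronously and remember its future.
      template <typename Work>
      void run(Work work) {
        pending_.push_back(std::async(
            std::launch::async, [w = std::move(work)]() mutable { w(); }));
      }

      // Block until all launched work has finished, then record the end time.
      void stop_timer() {
        for (auto &f : pending_) {
          f.wait();
        }
        pending_.clear();
        stop_ = clock_type::now();
      }

      // Elapsed time between start_timer() and stop_timer(), in milliseconds.
      float get_time_ms() const {
        return std::chrono::duration<float, std::milli>(stop_ - start_).count();
      }

    private:
      clock_type::time_point start_{};
      clock_type::time_point stop_{};
      std::vector<std::future<void>> pending_;
    };

    // Times one asynchronous task that sleeps for 20 ms.
    float time_async_work() {
      SketchAsyncExecutor exec;
      exec.start_timer();
      exec.run([] { std::this_thread::sleep_for(std::chrono::milliseconds(20)); });
      exec.stop_timer();
      return exec.get_time_ms();
    }

Without the wait inside `stop_timer()` in this sketch, the end time could be recorded while the task is still running,
and the reported time would be too short.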

Once `stop_timer()` returns, the elapsed execution time between the two calls can be retrieved by calling `get_time_ms()`:

.. code-block:: cpp

    auto time = exec.get_time_ms();

In the above example, `time` contains the runtime in milliseconds of everything executed between the `start_timer()` and `stop_timer()` calls. For
a CUDA executor, this is the time between the beginning of the first kernel and the end of the last. For a CPU executor, this is the CPU
time between the two calls.
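
The pattern of timing several steps together can be sketched on the host without MatX. In the sketch below,
`SketchHostTimer` and `time_two_steps()` are hypothetical names, and measuring wall-clock time with
`std::chrono::steady_clock` is an assumption; it only mirrors the three method names used above and is not the
MatX host executor's implementation.

.. code-block:: cpp

    #include <chrono>
    #include <thread>

    // Hypothetical stand-in for a single-threaded host timer (NOT part of MatX).
    class SketchHostTimer {
      using clock_type = std::chrono::steady_clock;

    public:
      // Record the start and end points of the timed region.
      void start_timer() { start_ = clock_type::now(); }
      void stop_timer() { stop_ = clock_type::now(); }

      // Elapsed wall-clock time between the two calls, in milliseconds.
      float get_time_ms() const {
        return std::chrono::duration<float, std::milli>(stop_ - start_).count();
      }

    private:
      clock_type::time_point start_{};
      clock_type::time_point stop_{};
    };

    // Times two consecutive steps as one region, analogous to profiling
    // several run statements together.
    float time_two_steps() {
      SketchHostTimer exec;
      exec.start_timer();
      std::this_thread::sleep_for(std::chrono::milliseconds(5));  // first step
      std::this_thread::sleep_for(std::chrono::milliseconds(5));  // second step
      exec.stop_timer();
      return exec.get_time_ms();
    }

Because the timer state lives in the object rather than in each step, a single `get_time_ms()` call reports the
combined time of both steps.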

.. note::

    Profiling does not currently work with a multi-threaded host executor.

For a full example of profiling, see the `spectrogram` example.