Translate beginner_source/hta_trace_diff_tutorial.rst (#941)
uddk6215 authored Oct 15, 2024
1 parent eddcb82 commit 9ce499f
Showing 1 changed file with 26 additions and 33 deletions.
@@ -1,44 +1,39 @@
Trace Diff using Holistic Trace Analysis
========================================

**Author:** `Anupam Bhatnagar <https://github.com/anupambhatnagar>`_

**Translator:** `uddk6215 <https://github.com/uddk6215>`__

Occasionally, users need to identify the changes in PyTorch operators and CUDA
kernels that result from a code change. To support this, HTA provides a trace
comparison feature. It takes two sets of trace files as input: the first can be
thought of as the *control group* and the second as the *test group*, similar
to an A/B test. The ``TraceDiff`` class provides functions to compare the
traces and to visualize the differences. In particular, users can find
operators and kernels that were added to or removed from each group, along with
the frequency of each operator/kernel and its cumulative time.

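The comparison described above can be pictured with a small, library-free
sketch. It is not HTA's implementation: the ``summarize`` and ``compare``
helpers, the sample event tuples, and every column name other than
``diff_duration`` are assumptions made for illustration only.

```python
from collections import defaultdict


def summarize(events):
    """Return {name: (count, total_duration)} for (name, duration) events."""
    totals = defaultdict(lambda: [0, 0])
    for name, duration in events:
        totals[name][0] += 1          # frequency of the operator/kernel
        totals[name][1] += duration   # cumulative time
    return {name: tuple(values) for name, values in totals.items()}


def compare(control_events, test_events):
    """Build one row per operator with test-minus-control differences."""
    control = summarize(control_events)
    test = summarize(test_events)
    rows = {}
    for name in sorted(set(control) | set(test)):
        c_count, c_dur = control.get(name, (0, 0))
        t_count, t_dur = test.get(name, (0, 0))
        rows[name] = {
            "control_counts": c_count,   # hypothetical column name
            "test_counts": t_count,      # hypothetical column name
            "diff_counts": t_count - c_count,
            "diff_duration": t_dur - c_dur,
        }
    return rows


# Hypothetical (operator name, duration) events for each group.
control_events = [("aten::mm", 120), ("aten::mm", 130), ("aten::relu", 15)]
test_events = [("aten::mm", 125), ("aten::gelu", 40)]
rows = compare(control_events, test_events)
```

An operator missing from one group is treated as having a count and duration
of zero there, which is what makes "added" and "removed" operators visible in
the same table as the ones that merely changed.
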
The `TraceDiff <https://hta.readthedocs.io/en/latest/source/api/trace_diff_api.html>`_ class
has the following methods:

* `compare_traces <https://hta.readthedocs.io/en/latest/source/api/trace_diff_api.html#hta.trace_diff.TraceDiff.compare_traces>`_:
  Compare the frequency and total duration of CPU operators and GPU kernels from
  two sets of traces.

* `ops_diff <https://hta.readthedocs.io/en/latest/source/api/trace_diff_api.html#hta.trace_diff.TraceDiff.ops_diff>`_:
  Get the operators and kernels which have been:

  #. **added** to the test trace and are absent in the control trace
  #. **deleted** from the test trace and are present in the control trace
  #. **increased** in frequency in the test trace and exist in the control trace
  #. **decreased** in frequency in the test trace and exist in the control trace
  #. **unchanged** between the two sets of traces

* `visualize_counts_diff <https://hta.readthedocs.io/en/latest/source/api/trace_diff_api.html#hta.trace_diff.TraceDiff.visualize_counts_diff>`_

* `visualize_duration_diff <https://hta.readthedocs.io/en/latest/source/api/trace_diff_api.html#hta.trace_diff.TraceDiff.visualize_duration_diff>`_

The last two methods visualize changes in the frequency and duration of CPU
operators and GPU kernels, using the output of the ``compare_traces`` method.

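The five ``ops_diff`` categories listed above amount to a comparison of
per-operator counts between the two groups. The following is a minimal sketch
of that rule, not the library's code; the ``classify`` helper and the sample
counts are hypothetical.

```python
def classify(control_count, test_count):
    """Map a pair of frequencies to one of the five change categories."""
    if control_count == 0 and test_count > 0:
        return "added"        # present only in the test trace
    if control_count > 0 and test_count == 0:
        return "deleted"      # present only in the control trace
    if test_count > control_count:
        return "increased"
    if test_count < control_count:
        return "decreased"
    return "unchanged"


# Hypothetical operator frequencies for each group.
control_counts = {"aten::mm": 4, "aten::relu": 2, "aten::add": 3, "aten::copy_": 5}
test_counts = {"aten::mm": 6, "aten::gelu": 2, "aten::add": 3, "aten::copy_": 1}

categories = {
    name: classify(control_counts.get(name, 0), test_counts.get(name, 0))
    for name in sorted(set(control_counts) | set(test_counts))
}
```

Checking for a zero count first keeps "added" and "deleted" distinct from
"increased" and "decreased", which only apply to operators found in both groups.
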
For example, the ten operators with the largest increase in frequency can be
computed as follows:

.. code-block:: python
@@ -47,20 +42,18 @@ follows:
.. image:: ../_static/img/hta/counts_diff.png

Similarly, the top ten operators with the largest change in duration can be
computed as follows:

.. code-block:: python

    # compare_traces_output is the DataFrame returned by compare_traces.
    df = compare_traces_output.sort_values(by="diff_duration", ascending=False)
    # The duration difference can be overshadowed by "ProfilerStep",
    # so we filter it out to show the trend of the other operators.
    df = df.loc[~df.index.str.startswith("ProfilerStep")].head(10)
    TraceDiff.visualize_duration_diff(df)

.. image:: ../_static/img/hta/duration_diff.png

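To make the sort-then-filter step above concrete without trace files, here is
a plain-Python sketch of the same ranking. The ``top_by_duration_change``
helper and the sample values are hypothetical. Like the snippet above, it
sorts in descending order, so the largest increases in duration come first.

```python
def top_by_duration_change(diff_duration, n=10, skip_prefix="ProfilerStep"):
    """Return the n entries with the largest diff, ignoring skipped names."""
    kept = {
        name: diff
        for name, diff in diff_duration.items()
        if not name.startswith(skip_prefix)
    }
    # Descending order: the largest increase in duration is ranked first.
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical per-operator duration differences (test minus control).
diff_duration = {
    "ProfilerStep#10": 5000,
    "aten::mm": 300,
    "aten::copy_": -50,
    "aten::add": 120,
}
top_two = top_by_duration_change(diff_duration, n=2)
```

Without the prefix filter, the ``ProfilerStep`` entry would take first place
and compress the scale for every other operator in a plot.
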
For a detailed example of this feature, see the `trace_diff_demo notebook
<https://github.com/facebookresearch/HolisticTraceAnalysis/blob/main/examples/trace_diff_demo.ipynb>`_
in the examples folder of the repository.
