Introduce AIT_USE_FAST_MATH environment flag (facebookincubator#627)
Summary:
Pull Request resolved: facebookincubator#627

Adds the AIT_USE_FAST_MATH environment flag, which controls whether the fast math option is used for device code generation. Fast math replaces some math operations (for example, division) with approximate versions, trading accuracy for speed. The default value is "1" (fast math enabled).
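A minimal sketch of how the flag described above feeds into the compiler options. The `use_fast_math` body mirrors the function added in this commit; `build_compile_options` is a hypothetical, heavily simplified stand-in for the target's `_build_compile_options` method, and the `-O3` level is an assumption used only for illustration.

```python
import os


def use_fast_math() -> bool:
    # Same check as in the commit: only the exact string "1" enables fast math,
    # and an unset variable falls back to the default "1".
    return os.getenv("AIT_USE_FAST_MATH", "1") == "1"


def build_compile_options(ndebug: bool = True) -> str:
    # Hypothetical simplified helper; the real method adds include paths,
    # architecture flags, and more. "-O3" is an assumed optimization level.
    options = ["-O3", "-std=c++17", "--expt-relaxed-constexpr"]
    if ndebug:
        options.append("-DNDEBUG")
    if use_fast_math():
        # The flag is appended conditionally instead of being hard-coded.
        options.append("--use_fast_math")
    return " ".join(options)
```

Under this sketch, setting `AIT_USE_FAST_MATH=0` in the environment before compilation yields an option string without `--use_fast_math`, while leaving the variable unset keeps the previous behavior.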

Reviewed By: aakhundov

Differential Revision: D45321954

fbshipit-source-id: 9df3583eef5f7284176d338100f6702bace07a90
Alexandr Guzhva authored and facebook-github-bot committed Apr 26, 2023
1 parent 1e60f7b commit ad45691
Showing 3 changed files with 16 additions and 2 deletions.
2 changes: 2 additions & 0 deletions docs/source/reference/env.rst
@@ -48,3 +48,5 @@ Miscellaneous
 **LOGLEVEL**: It is used to control the logging level in Python. The default value is "INFO". "DEBUG" is useful for debugging.
 
 **AIT_PLOT_SHORTEN_TENSOR_NAMES**: If set to "1", shorten too long tensor names for a plot of a model graph, thus making a plot much easier to analyze visually. "0" by default.
+
+**AIT_USE_FAST_MATH**: If set to "0", no fast math option will be used for the device code generation. Default value is "1".
6 changes: 4 additions & 2 deletions python/aitemplate/backend/cuda/target_def.py
@@ -120,11 +120,12 @@ def _build_compile_options(self):
             environ.get_compiler_opt_level(),
             "-std=c++17",
             "--expt-relaxed-constexpr",
-            "--use_fast_math",
             f"-I{ait_static_path}",
         ] + ["-I" + path for path in cutlass_path]
         if self._ndebug == 1:
             options.append("-DNDEBUG")
+        if environ.use_fast_math():
+            options.append("--use_fast_math")
         return " ".join(options)
 
     def src_extension(self):
@@ -277,7 +278,6 @@ def _build_compile_options(self):
                     "-DCUTLASS_USE_TANH_FOR_SIGMOID=1",
                     "-w",
                     "--expt-relaxed-constexpr",
-                    "--use_fast_math",
                     f"-gencode=arch=compute_{nvcc_arch},code=[sm_{nvcc_arch},compute_{nvcc_arch}]",
                     "-Xcompiler=-Wconversion",
                     environ.get_compiler_opt_level(),
@@ -286,6 +286,8 @@ def _build_compile_options(self):
             )
             if self._ndebug == 1:
                 options.append("-DNDEBUG")
+            if environ.use_fast_math():
+                options.append("--use_fast_math")
             FBCUDA.compile_options_ = " ".join(options)
         compile_options = FBCUDA.compile_options_
         _LOGGER.info(f"The compile options are: {compile_options}")
10 changes: 10 additions & 0 deletions python/aitemplate/utils/environ.py
@@ -36,6 +36,16 @@ def get_compiler_opt_level() -> str:
     return compiler_opt
 
 
+def use_fast_math() -> bool:
+    """
+    Whether the fast math option should be used for the device code generation.
+    Fast math implies the use of approximate math operations (say,
+    a division operation), trading accuracy for speed.
+    Default value is "1".
+    """
+    return os.getenv("AIT_USE_FAST_MATH", "1") == "1"
+
+
 def force_profiler_cache() -> bool:
     """
     Force the profiler to use the cached results. The profiler will throw
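One consequence of the comparison `== "1"` in `use_fast_math` is that any value other than the exact string "1" disables fast math, including values such as "true". The sketch below illustrates this; `env_override` is a hypothetical helper written for this example and is not part of AITemplate.

```python
import os
from contextlib import contextmanager


def use_fast_math() -> bool:
    # Mirrors the function added in this commit.
    return os.getenv("AIT_USE_FAST_MATH", "1") == "1"


@contextmanager
def env_override(name: str, value: str):
    # Hypothetical helper: temporarily set an environment variable,
    # then restore the previous state on exit.
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


def fast_math_for(value: str) -> bool:
    # Evaluate the flag as if AIT_USE_FAST_MATH were set to `value`.
    with env_override("AIT_USE_FAST_MATH", value):
        return use_fast_math()
```

With this sketch, `fast_math_for("1")` is true, while `fast_math_for("0")` and `fast_math_for("true")` are both false, so the documented values "0" and "1" are the safe ones to use.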
