Commit cfda78c

iofu728, liyucheng09, and Starmys authored
Hotfix(MInference): fix the pip setup issue (#6)
* Hotfix(MInference): fix the pip setup issue
* Hotfix(MInference): fix the torch version

Co-authored-by: Yucheng Li <liyucheng09@gmail.com>
Co-authored-by: Chengruidong Zhang <chengzhang@microsoft.com>
1 parent 038e005 commit cfda78c

File tree

10 files changed: +122 additions, -38 deletions

.github/workflows/release.yml

Lines changed: 6 additions & 0 deletions
```diff
@@ -73,6 +73,12 @@ jobs:
       with:
         python-version: ${{ matrix.python-version }}
 
+      - name: Set CUDA and PyTorch versions
+        run: |
+          echo "MATRIX_CUDA_VERSION=$(echo ${{ matrix.cuda-version }} | awk -F \. {'print $1 $2'})" >> $GITHUB_ENV
+          echo "MATRIX_TORCH_VERSION=$(echo ${{ matrix.pytorch-version }} | awk -F \. {'print $1 "." $2'})" >> $GITHUB_ENV
+          echo "MATRIX_PYTHON_VERSION=$(echo ${{ matrix.python-version }} | awk -F \. {'print $1 $2'})" >> $GITHUB_ENV
+
       - name: Install CUDA ${{ matrix.cuda-version }}
         run: |
           bash -x .github/workflows/scripts/cuda-install.sh ${{ matrix.cuda-version }} ${{ matrix.os }}
```
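The three `awk` one-liners above just mangle dotted version strings before exporting them to `$GITHUB_ENV`. A minimal Python sketch of what they compute; the matrix values below are hypothetical examples, not taken from this workflow:

```python
# Illustrative Python equivalents of the workflow's awk one-liners.

def strip_dots(version: str) -> str:
    """Emulate `awk -F \\. {'print $1 $2'}`: concatenate the first two components."""
    parts = version.split(".")
    return parts[0] + parts[1]

def major_minor(version: str) -> str:
    """Emulate `awk -F \\. {'print $1 "." $2'}`: keep only major.minor."""
    parts = version.split(".")
    return parts[0] + "." + parts[1]

print(strip_dots("12.1"))    # MATRIX_CUDA_VERSION   -> 121
print(major_minor("2.3.0"))  # MATRIX_TORCH_VERSION  -> 2.3
print(strip_dots("3.10"))    # MATRIX_PYTHON_VERSION -> 310
```

Note that `MATRIX_TORCH_VERSION` keeps the dot while the CUDA and Python variables drop it, which matches the `cu121torch2.3`-style local version tag build.sh constructs below.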

.github/workflows/scripts/build.sh

Lines changed: 6 additions & 1 deletion
```diff
@@ -16,4 +16,9 @@ export MAX_JOBS=1
 # Make sure release wheels are built for the following architectures
 export TORCH_CUDA_ARCH_LIST="7.0 7.5 8.0 8.6 8.9 9.0+PTX"
 # Build
-$python_executable setup.py $3 --dist-dir=dist
+if [ "$3" = sdist ];
+then
+  MINFERENCE_SKIP_CUDA_BUILD="TRUE" $python_executable setup.py $3 --dist-dir=dist
+else
+  MINFERENCE_LOCAL_VERSION=cu${MATRIX_CUDA_VERSION}torch${MATRIX_TORCH_VERSION} MINFERENCE_FORCE_BUILD="TRUE" $python_executable setup.py $3 --dist-dir=dist
+fi
```
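The branch above can be read as: an sdist must never trigger CUDA compilation, while a wheel build is forced to compile and is tagged with the CUDA/torch combination. A hedged Python sketch of that dispatch; the `build_env` helper and default version tags are illustrative, only the environment variable names come from the script:

```python
# Hypothetical sketch of the build.sh dispatch above.

def build_env(dist_type: str, cuda: str = "121", torch: str = "2.3") -> dict:
    if dist_type == "sdist":
        # Source distribution: copy raw files, skip CUDA compilation.
        return {"MINFERENCE_SKIP_CUDA_BUILD": "TRUE"}
    # Binary wheel: force compilation and tag it with the toolchain combo.
    return {
        "MINFERENCE_FORCE_BUILD": "TRUE",
        "MINFERENCE_LOCAL_VERSION": f"cu{cuda}torch{torch}",
    }

print(build_env("sdist"))
print(build_env("bdist_wheel"))
```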

.github/workflows/unittest.yml

Lines changed: 4 additions & 1 deletion
```diff
@@ -1,7 +1,10 @@
 name: Unit Test
 
 # see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
-on: [] # Trigger the workflow on pull request or merge
+on:
+  push:
+    branches:
+      - 'test/**'
 # pull_request:
 # merge_group:
 #   types: [checks_requested]
```
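Under the new trigger, only pushes to branches under `test/` run the workflow. A rough approximation of the `test/**` filter; GitHub uses its own glob dialect in which `**` matches across `/`, so this one-liner is an illustration, not the real matcher:

```python
# Hypothetical helper, not GitHub's actual pattern matcher.
def matches_test_glob(branch: str) -> bool:
    # 'test/**' matches branches under test/ at any depth,
    # but not the bare branch name 'test'.
    return branch.startswith("test/")

for b in ["test/ci", "test/a/b", "main", "test"]:
    print(b, matches_test_glob(b))
```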

.gitignore

Lines changed: 1 addition & 0 deletions
```diff
@@ -414,3 +414,4 @@ __pycache__
 build/
 *.egg-info/
 *.so
+dist
```

README.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -8,14 +8,14 @@
 
 <p align="center">
     | <a href="https://aka.ms/MInference"><b>Project Page</b></a> |
-    <a href="https://arxiv.org/abs/2407.02490"><b>Paper</b></a> |
+    <a href="https://export.arxiv.org/pdf/2407.02490"><b>Paper</b></a> |
     <a href="https://huggingface.co/spaces/microsoft/MInference"><b>HF Demo</b></a> |
 </p>
 
 https://github.com/microsoft/MInference/assets/30883354/52613efc-738f-4081-8367-7123c81d6b19
 
 ## News
-- 📃 [24/07/03] Due to an issue with arXiv, the PDF is currently unavailable there. You can find the paper at this [link](https://github.com/microsoft/MInference/blob/main/papers/MInference1_Arxiv.pdf)..
+- 📃 [24/07/03] Due to an issue with arXiv, the PDF is currently unavailable there. You can find the paper at this [link](https://export.arxiv.org/pdf/2407.02490).
 - 🧩 [24/07/03] We will present **MInference 1.0** at the _**Microsoft Booth**_ and _**ES-FoMo**_ at ICML'24. See you in Vienna!
 
 ## TL;DR
```

minference/configs/__init__.py

Whitespace-only changes.

minference/modules/__init__.py

Whitespace-only changes.

minference/ops/__init__.py

Whitespace-only changes.

minference/version.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -5,7 +5,7 @@
 _MINOR = "1"
 # On master and in a nightly release the patch should be one ahead of the last
 # released build.
-_PATCH = "0"
+_PATCH = "1"
 # This is mainly for nightly builds which have the suffix ".dev$DATE". See
 # https://semver.org/#is-v123-a-semantic-version for the semantics.
 _SUFFIX = ""
```
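For context, this patch bump most likely yields version `0.1.1`. A sketch of how such a `version.py` typically assembles the string; `_MAJOR = "0"` and the dot-joining are assumptions, since only `_MINOR`, `_PATCH`, and `_SUFFIX` appear in the diff:

```python
# Assumed assembly of VERSION; only _MINOR/_PATCH/_SUFFIX are in the diff.

_MAJOR = "0"   # assumption: not shown in the diff
_MINOR = "1"
_PATCH = "1"   # bumped from "0" by this commit
_SUFFIX = ""   # nightly builds would use something like ".dev$DATE"

VERSION = f"{_MAJOR}.{_MINOR}.{_PATCH}{_SUFFIX}"
print(VERSION)
```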

setup.py

Lines changed: 102 additions & 33 deletions
```diff
@@ -8,6 +8,7 @@
 from packaging.version import Version, parse
 from setuptools import find_packages, setup
 from torch.utils.cpp_extension import CUDA_HOME, BuildExtension, CUDAExtension
+from wheel.bdist_wheel import bdist_wheel as _bdist_wheel
 
 # PEP0440 compatible formatted version, see:
 # https://www.python.org/dev/peps/pep-0440/
@@ -46,52 +47,117 @@
 ]
 DEV_REQUIRES = INSTALL_REQUIRES + QUANLITY_REQUIRES
 
-MAIN_CUDA_VERSION = "12.1"
+# ninja build does not work unless include_dirs are abs path
+this_dir = os.path.dirname(os.path.abspath(__file__))
 
+PACKAGE_NAME = "minference"
 
-def _is_cuda() -> bool:
-    return torch.version.cuda is not None
+BASE_WHEEL_URL = (
+    "https://github.com/microsoft/MInference/releases/download/{tag_name}/{wheel_name}"
+)
 
+# FORCE_BUILD: Force a fresh build locally, instead of attempting to find prebuilt wheels
+# SKIP_CUDA_BUILD: Intended to allow CI to use a simple `python setup.py sdist` run to copy over raw files, without any cuda compilation
+FORCE_BUILD = os.getenv("MINFERENCE_FORCE_BUILD", "FALSE") == "TRUE"
+SKIP_CUDA_BUILD = os.getenv("MINFERENCE_SKIP_CUDA_BUILD", "FALSE") == "TRUE"
+# For CI, we want the option to build with C++11 ABI since the nvcr images use C++11 ABI
+FORCE_CXX11_ABI = os.getenv("MINFERENCE_FORCE_CXX11_ABI", "FALSE") == "TRUE"
 
-def get_nvcc_cuda_version() -> Version:
-    """Get the CUDA version from nvcc.
 
-    Adapted from https://github.com/NVIDIA/apex/blob/8b7a1ff183741dd8f9b87e7bafd04cfde99cea28/setup.py
-    """
-    assert CUDA_HOME is not None, "CUDA_HOME is not set"
-    nvcc_output = subprocess.check_output(
-        [CUDA_HOME + "/bin/nvcc", "-V"], universal_newlines=True
-    )
-    output = nvcc_output.split()
-    release_idx = output.index("release") + 1
-    nvcc_cuda_version = parse(output[release_idx].split(",")[0])
-    return nvcc_cuda_version
+def check_if_cuda_home_none(global_option: str) -> None:
+    if CUDA_HOME is not None:
+        return
+    # warn instead of error because user could be downloading prebuilt wheels, so nvcc won't be necessary
+    # in that case.
+    warnings.warn(
+        f"{global_option} was requested, but nvcc was not found. Are you sure your environment has nvcc available? "
+        "If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, "
+        "only images whose names contain 'devel' will provide nvcc."
+    )
+
+
+cmdclass = {}
+ext_modules = []
+
+if not SKIP_CUDA_BUILD:
+    print("\n\ntorch.__version__ = {}\n\n".format(torch.__version__))
+    TORCH_MAJOR = int(torch.__version__.split(".")[0])
+    TORCH_MINOR = int(torch.__version__.split(".")[1])
+
+    # Check, if ATen/CUDAGeneratorImpl.h is found, otherwise use ATen/cuda/CUDAGeneratorImpl.h
+    # See https://github.com/pytorch/pytorch/pull/70650
+    generator_flag = []
+    torch_dir = torch.__path__[0]
+    if os.path.exists(
+        os.path.join(torch_dir, "include", "ATen", "CUDAGeneratorImpl.h")
+    ):
+        generator_flag = ["-DOLD_GENERATOR_PATH"]
+
+    check_if_cuda_home_none("minference")
+
+    # HACK: The compiler flag -D_GLIBCXX_USE_CXX11_ABI is set to be the same as
+    # torch._C._GLIBCXX_USE_CXX11_ABI
+    # https://github.com/pytorch/pytorch/blob/8472c24e3b5b60150096486616d98b7bea01500b/torch/utils/cpp_extension.py#L920
+    if FORCE_CXX11_ABI:
+        torch._C._GLIBCXX_USE_CXX11_ABI = True
+    ext_modules.append(
+        CUDAExtension(
+            name="minference.cuda",
+            sources=[
+                os.path.join("csrc", "kernels.cpp"),
+                os.path.join("csrc", "vertical_slash_index.cu"),
+            ],
+            extra_compile_args=["-std=c++17", "-O3"],
+        )
+    )
 
 
 def get_minference_version() -> str:
     version = VERSION["VERSION"]
 
-    if _is_cuda():
-        cuda_version = str(get_nvcc_cuda_version())
-        if cuda_version != MAIN_CUDA_VERSION:
-            cuda_version_str = cuda_version.replace(".", "")[:3]
-            version += f"+cu{cuda_version_str}"
+    local_version = os.environ.get("MINFERENCE_LOCAL_VERSION")
+    if local_version:
+        return f"{version}+{local_version}"
     else:
-        raise RuntimeError("Unknown runtime environment")
+        return str(version)
 
-    return version
 
+class CachedWheelsCommand(_bdist_wheel):
+    """
+    The CachedWheelsCommand plugs into the default bdist wheel, which is ran by pip when it cannot
+    find an existing wheel (which is currently the case for all flash attention installs). We use
+    the environment parameters to detect whether there is already a pre-built version of a compatible
+    wheel available and short-circuits the standard full build pipeline.
+    """
+
+    def run(self):
+        return super().run()
+
+
+class NinjaBuildExtension(BuildExtension):
+    def __init__(self, *args, **kwargs) -> None:
+        # do not override env MAX_JOBS if already exists
+        if not os.environ.get("MAX_JOBS"):
+            import psutil
+
+            # calculate the maximum allowed NUM_JOBS based on cores
+            max_num_jobs_cores = max(1, os.cpu_count() // 2)
+
+            # calculate the maximum allowed NUM_JOBS based on free memory
+            free_memory_gb = psutil.virtual_memory().available / (
+                1024**3
+            )  # free memory in GB
+            max_num_jobs_memory = int(
+                free_memory_gb / 9
+            )  # each JOB peak memory cost is ~8-9GB when threads = 4
+
+            # pick lower value of jobs based on cores vs memory metric to minimize oom and swap usage during compilation
+            max_jobs = max(1, min(max_num_jobs_cores, max_num_jobs_memory))
+            os.environ["MAX_JOBS"] = str(max_jobs)
+
+        super().__init__(*args, **kwargs)
 
-ext_modules = [
-    CUDAExtension(
-        name="minference.cuda",
-        sources=[
-            os.path.join("csrc", "kernels.cpp"),
-            os.path.join("csrc", "vertical_slash_index.cu"),
-        ],
-        extra_compile_args=["-std=c++17", "-O3"],
-    )
-]
 
 setup(
     name="minference",
@@ -110,7 +176,6 @@ def get_minference_version() -> str:
         "Programming Language :: Python :: 3",
         "Topic :: Scientific/Engineering :: Artificial Intelligence",
     ],
-    package_dir={"": "."},
     packages=find_packages(
         exclude=(
             "csrc",
@@ -136,5 +201,9 @@ def get_minference_version() -> str:
     python_requires=">=3.8.0",
     zip_safe=False,
     ext_modules=ext_modules,
-    cmdclass={"build_ext": BuildExtension},
+    cmdclass={"bdist_wheel": CachedWheelsCommand, "build_ext": NinjaBuildExtension}
+    if ext_modules
+    else {
+        "bdist_wheel": CachedWheelsCommand,
+    },
 )
```
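The versioning change is the heart of the fix: instead of probing `nvcc` at install time (and raising on CPU-only machines), the wheel's local version tag is injected via `MINFERENCE_LOCAL_VERSION`. A self-contained sketch of the new `get_minference_version()` behavior; the base version `0.1.1` comes from this commit, and the tag below is an example value:

```python
import os

# Mirrors the new get_minference_version() from the diff: the local
# version tag comes from MINFERENCE_LOCAL_VERSION (set by build.sh)
# rather than from probing nvcc at install time.

def get_minference_version(base: str = "0.1.1") -> str:
    local_version = os.environ.get("MINFERENCE_LOCAL_VERSION")
    if local_version:
        return f"{base}+{local_version}"  # PEP 440 local version
    return base

os.environ.pop("MINFERENCE_LOCAL_VERSION", None)
print(get_minference_version())  # -> 0.1.1
os.environ["MINFERENCE_LOCAL_VERSION"] = "cu121torch2.3"
print(get_minference_version())  # -> 0.1.1+cu121torch2.3
```

This lets CI publish one wheel per CUDA/torch combination, while local source installs fall back to compiling under `NinjaBuildExtension`, whose `MAX_JOBS` heuristic caps parallelism by both `cpu_count() // 2` and available memory at roughly 9 GB per job.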
