diff --git a/DEVELOPING.md b/DEVELOPING.md
index 608cf8cd3..1da391efe 100644
--- a/DEVELOPING.md
+++ b/DEVELOPING.md
@@ -3,19 +3,22 @@
 - [Requirements](#requirements)
 - [Setting up your development environment](#setting-up-your-development-environment)
 - [Building the project from source](#building-the-project-from-source)
+  - [Summary: Building locally with Meson For developers](#summary-building-locally-with-meson-for-developers)
 - [Development Tasks](#development-tasks)
 - [Advanced Updating submodules](#advanced-updating-submodules)
 - [Cython and C++](#cython-and-c)
 - [Making a Release](#making-a-release)
+  - [Releasing on PyPi for pip installs](#releasing-on-pypi-for-pip-installs)
+  - [Releasing documentation](#releasing-documentation)
 # Requirements
 - Python 3.9+
-- numpy>=1.25
-- scipy>=1.11
-- scikit-learn>=1.3.1
+- numpy>=1.25.0
+- scipy>=1.5.0
+- scikit-learn>=1.4.1
 For the other requirements, inspect the ``pyproject.toml`` file.
@@ -70,21 +73,82 @@ For other commands, see
 Note at this stage, you will be unable to run Python commands directly. For example, ``pytest ./treeple`` will not work.
-However, after installing and building the project from source using meson, you can leverage editable installs to make testing code changes much faster. For more information on meson-python's progress supporting editable installs in a better fashion, see .
+However, after installing and building the project from source using meson, you can leverage editable installs to make testing code changes much faster.
-    pip install --no-build-isolation --editable .
+    spin install
-**Note: editable installs for treeple REQUIRE you to have built the project using meson already.** This will now link the meson build to your Python runtime. Now if you run
+This will now link the meson build to your Python runtime. Now if you run
     pytest ./treeple
 the unit-tests should run.
+Summary: Building locally with Meson (For developers)
+-----------------------------------------------------
+Make sure you have the necessary packages installed.
+
+    # install build dependencies
+    pip install -r build_requirements.txt
+
+    # you may need these optional dependencies to build scikit-learn locally
+    conda install -c conda-forge joblib threadpoolctl pytest compilers llvm-openmp
+
+``YOUR_PYTHON_VERSION`` below should be any of the acceptable versions of Python for treeple. We use the ``spin`` CLI to abstract away build details:
+
+    # run the build using Meson/Ninja
+    ./spin build
+
+    # you can run the following command to see what other options there are
+    ./spin --help
+    ./spin build --help
+
+    # For example, you might want to start from a clean build
+    ./spin build --clean
+
+    # or build in parallel for faster builds
+    ./spin build -j 2
+
+    # you will need to double check the build-install has the proper path
+    # this might be different from machine to machine
+    export PYTHONPATH=${PWD}/build-install/usr/lib/python/site-packages
+
+    # run specific unit tests
+    ./spin test -- treeple/tree/tests/test_tree.py
+
+    # you can bring up the CLI menu
+    ./spin --help
+
+You can also do the same thing using Meson/Ninja itself. Run the following to build the local files:
+
+    # generate ninja make files
+    meson build --prefix=$PWD/build
+
+    # compile
+    ninja -C build
+
+    # install treeple package
+    meson install -C build
+
+    export PYTHONPATH=${PWD}/build/lib/python/site-packages
+
+    # to check installation, you need to be in a different directory
+    cd docs;
+    python -c "from treeple import tree"
+    python -c "import sklearn; print(sklearn.__version__);"
+
+After building locally, you can use editable installs (warning: this only registers Python changes locally)
+
+    pip install --no-build-isolation --editable .
+
+Or if you have spin v0.8+ installed, you can just run directly
+
+    spin install
+
 # Development Tasks
 There are a series of top-level tasks available.
-    make run-checks
+    make pre-commit
 This leverage pre-commit to run a series of precommit checks.
@@ -115,6 +179,8 @@ In order to develop new tree models, generally Cython and C++ code will need to
 treeple is in-line with scikit-learn and thus relies on each new version released there. Moreover, treeple relies on compiled code, so releases are a bit more complex than the typical Python package.
+## Releasing on PyPi (for pip installs)
+
 1. Download wheels from GH Actions and put all wheels into a ``dist/`` folder will have all the wheels for common OSes built for each Python version.
@@ -140,3 +206,61 @@ or if you have two-factor authentication enabled:
+Is treeple useful for me?
+=========================
+
+1. If you use decision tree models (random forest, extra trees, isolation forests, etc.) in your work, treeple is a good package to try out. We have a variety of better tree models that are not available in scikit-learn, and we are always looking for new tree models to implement. For example, oblique decision trees are in general better than their axis-aligned counterparts.
+
+2. If you are interested in extending the decision tree API in scikit-learn, treeple is a good package to try out. We have a variety of internal APIs that are not available in scikit-learn, and are able to support new decision tree models easier.
+
 Why oblique trees and why trees beyond those in scikit-learn?
 =============================================================
@@ -48,72 +55,10 @@ Installing with pip on a conda environment is the recommended route.
     pip install treeple
-Building locally with Meson (For developers)
---------------------------------------------
-
-Make sure you have the necessary packages installed
-
-    # install build dependencies
-    pip install -r build_requirements.txt
-
-    # you may need these optional dependencies to build scikit-learn locally
-    conda install -c conda-forge joblib threadpoolctl pytest compilers llvm-openmp
-
-We use the ``spin`` CLI to abstract away build details:
-
-    # run the build using Meson/Ninja
-    ./spin build
-
-    # you can run the following command to see what other options there are
-    ./spin --help
-    ./spin build --help
-
-    # For example, you might want to start from a clean build
-    ./spin build --clean
-
-    # or build in parallel for faster builds
-    ./spin build -j 2
-
-    # you will need to double check the build-install has the proper path
-    # this might be different from machine to machine
-    export PYTHONPATH=${PWD}/build-install/usr/lib/python3.9/site-packages
-
-    # run specific unit tests
-    ./spin test -- treeple/tree/tests/test_tree.py
-
-    # you can bring up the CLI menu
-    ./spin --help
-
-You can also do the same thing using Meson/Ninja itself. Run the following to build the local files:
-
-    # generate ninja make files
-    meson build --prefix=$PWD/build
-
-    # compile
-    ninja -C build
-
-    # install treeple package
-    meson install -C build
-
-    export PYTHONPATH=${PWD}/build/lib/python3.9/site-packages
-
-    # to check installation, you need to be in a different directory
-    cd docs;
-    python -c "from treeple import tree"
-    python -c "import sklearn; print(sklearn.__version__);"
-
-After building locally, you can use editable installs (warning: this only registers Python changes locally)
-
-    pip install --no-build-isolation --editable .
-
-Or if you have spin v0.8+ installed, you can just run directly
-
-    spin install
-
 Development
 ===========
-We welcome contributions for modern tree-based algorithms.
We use Cython to achieve fast C/C++ speeds, while abiding by a scikit-learn compatible (tested) API. Moreover, our Cython internals are easily extensible because they follow the internal Cython API of scikit-learn as well.
+We welcome contributions for modern tree-based algorithms. We use Cython to achieve fast C/C++ speeds, while abiding by a scikit-learn compatible (tested) API. We also will welcome contributions in C/C++ if they improve the extensibility, or runtime performance of the codebase. Our Cython internals are easily extensible because they follow the internal Cython API of scikit-learn as well.
 Due to the current state of scikit-learn's internal Cython code for trees, we have to instead leverage a fork of scikit-learn at when extending the decision tree model API of scikit-learn. Specifically, we extend the Python and Cython API of the tree submodule in scikit-learn in our submodule, so we can introduce the tree models housed in this package. Thus these extend the functionality of decision-tree based models in a way that is not possible yet in scikit-learn itself. As one example, we introduce an abstract API to allow users to implement their own oblique splits. Our plan in the future is to benchmark these functionalities and introduce them upstream to scikit-learn where applicable and inclusion criterion are met.
diff --git a/doc/_static/versions.json b/doc/_static/versions.json
index 50b26d277..8ce5f79a1 100644
--- a/doc/_static/versions.json
+++ b/doc/_static/versions.json
@@ -6,7 +6,7 @@
     },
     {
         "name": "0.8",
-        "version": "dev",
+        "version": "stable",
         "url": "https://docs.neurodata.io/treeple/stable/"
     },
     {
diff --git a/doc/index.rst b/doc/index.rst
index 4f75c94de..ccd9c7ca5 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -5,10 +5,11 @@ learning problems. It extends the robust API of `scikit-learn `, KDD 2020, 513-523, 2020.
diff --git a/treeple/_lib/sklearn_fork b/treeple/_lib/sklearn_fork
index ae2604ba5..d455aa16e 160000
--- a/treeple/_lib/sklearn_fork
+++ b/treeple/_lib/sklearn_fork
@@ -1 +1 @@
-Subproject commit ae2604ba53d092eaaec64eba0136a76460586cb0
+Subproject commit d455aa16ee9cc42ce342dd07d9b94db117783fcc