Releases: jrudar/LANDMark
LANDMark Classifier v2.1.1
- Removed support for Python <3.10
- Updated minimum supported NumPy package
- Minor bug fixes related to dependencies
- 'terminal' is the default proximity method until 'path' is properly implemented
LANDMarkClassifier Version 2.1.0
- 'use_cascade' parameter is now active. This parameter extends X using the results of the decision function at each node. These new features are then used alongside the original features during the training (and prediction) steps within each node in the tree.
- Updated API
- Updated tests
- Added notebooks on simple and advanced usage
- Added notebooks demonstrating the effects of some important parameters
- Changed code in linear models so that 'y_min' is equal to 80% of the minimum count in the bootstrap-resampled data
- 5-Fold Stratified Cross-Validation is now explicitly stated to be used for linear models
- Using a neural network model to split a node now requires that more than 'minority_sz_nnet' samples of the minority class are present in the bootstrap-resampled data.
- 'minority_sz_lm' controls how many samples must be present to split using a linear model
- Autodetection of sparse matrices and conversion to CSR format if sparsity is >= 90%
- New way to create the high-dimensional embedding used to calculate dissimilarities with the proximity() function: "path". With this option, a binary matrix containing all nodes visited by each sample is created rather than just the terminal nodes. The original method is now "terminal".
- Initial layer of neural network model uses 'mish' activation function and other small changes to the architecture of the network
- Replaced the _get_node_ids() function in tree.py with _get_all_nodes(). This is done so that the new embedding approach can be implemented
- Updated version (2.1.0) due to new non-breaking feature
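
The 'use_cascade' behavior described above can be illustrated with a minimal sketch. This is not LANDMark's internal code: the helper name `extend_with_decision_function` is hypothetical, and a scikit-learn `RidgeClassifier` stands in for whatever model a node would actually train. It only shows the general idea of appending decision-function outputs to X as extra columns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import RidgeClassifier


def extend_with_decision_function(model, X):
    """Append a fitted model's decision-function output to X as new columns.

    Hypothetical helper for illustration; not part of the LANDMark API.
    """
    scores = model.decision_function(X)
    # Binary problems return shape (n_samples,), so turn it into one column.
    if scores.ndim == 1:
        scores = scores.reshape(-1, 1)
    return np.hstack([X, scores])


X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# Stand-in for the model trained at a single node (an assumption).
node_model = RidgeClassifier().fit(X, y)

# The original 5 features plus 1 decision-function column.
X_cascade = extend_with_decision_function(node_model, X)
```

The same extension would have to be applied at prediction time so that the model at the next node sees features in the same layout it was trained on.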
What's Changed
Full Changelog: LANDMarkClassifier-v.2.0.7...LANDMarkClassifier-v.2.1.0
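
The difference between the "terminal" and "path" embeddings introduced in this release can be sketched on a toy example. The node ids, the `embed` helper, and the single-tree setup below are all assumptions made for illustration; the actual proximity() function works on the trained LANDMark model and may differ in detail.

```python
import numpy as np

# Hypothetical node ids visited by three samples in one small tree,
# listed from the root (node 0) down to the terminal node.
paths = [[0, 1, 3], [0, 1, 4], [0, 2]]
n_nodes = 5


def embed(paths, n_nodes, method="terminal"):
    """Build a binary sample-by-node matrix (illustrative helper only)."""
    emb = np.zeros((len(paths), n_nodes), dtype=np.uint8)
    for i, path in enumerate(paths):
        # "path" marks every visited node; "terminal" marks only the last one.
        nodes = path if method == "path" else path[-1:]
        emb[i, nodes] = 1
    return emb


terminal = embed(paths, n_nodes, method="terminal")
path = embed(paths, n_nodes, method="path")
```

In this toy tree the first two samples share no columns under "terminal", but under "path" they share the root and node 1, so a dissimilarity computed on the "path" matrix can reflect partial agreement along the route through the tree.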
LANDMarkClassifier Version 2.0.7
- Changed the "decision_function()" behavior for models that return probabilities. Probabilities greater than 0.5 are now mapped to 1 and probabilities less than 0.5 to -1. This does not change the behavior of LANDMark, but it allows the code to be cleaned up considerably because probabilities no longer need to be handled separately. This affects the "get_split()", "_predict()", and "_proximity()" functions.
- Simplified tree traversal in the "_predict()" and "_proximity()" functions, which no longer use nested if statements.
- Added "predict_proba()" function to "ETClassifier()" wrapper.
- Preparing to introduce a new hyper-parameter, "use_cascade". This parameter appends the output of the decision function onto X (inspiration from https://www.tandfonline.com/doi/full/10.1080/15481603.2021.1965399). This parameter is not currently enabled.
- Updated version to 2.0.7 to reflect these changes
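
The probability-to-sign mapping described for "decision_function()" can be sketched as follows. The function name `proba_to_decision` is hypothetical, and the handling of a probability of exactly 0.5 is an assumption, since the note above only covers values greater than or less than 0.5.

```python
import numpy as np


def proba_to_decision(proba):
    """Map positive-class probabilities to -1/+1 decisions (illustrative only)."""
    proba = np.asarray(proba, dtype=float)
    # Assumption: a probability of exactly 0.5 is sent to -1; the release
    # note does not say how ties are handled.
    return np.where(proba > 0.5, 1, -1)


decisions = proba_to_decision([0.9, 0.2, 0.51])
```

With outputs in this form, downstream code can treat probabilistic and margin-based models the same way by checking only the sign of the decision value.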
LANDMarkClassifier Version 2.0.6
July 2023 Update 2
- Updated version to 2.0.6
- LANDMarkClassifier()._check_params() now returns type List[np.ndarray, np.ndarray]
- Removed TransformerMixin from imports in LANDMark.py
- Removed tensorflow dependencies
- Removed unused imports of 'gc' and 'pandas' from lm_linear_clfs.py
- Each type of classification model now has its own module
- Random selection of 'alpha' for RidgeClassifier when there are fewer than 6 samples
- Neural network now uses PyTorch (AMP not yet enabled)
- Test coverage improvement
- Linear models no longer split nodes with few samples. An Extra Trees Classifier with a max_depth of 1 is used instead
LANDMarkClassifier Version 2.0.5
June 2023 Update
- Improved code readability (e.g., 'if some_var == False' changed to 'if some_var is False')
- Fixed formatting using black
- Added a function to validate LANDMark parameters
- Error raised if predict() is called on a model which has not been fit
- Simplified section in Node() that handles stopping criteria
- Updated Line 209 in 'lm_base_clfs.py': the efficient LOOCV sometimes fails, and switching to 5-fold CV solves the issue
LANDMark Classifier version 2.0.4
What's Changed
Full Changelog: LANDMarkClassifier-v.2.0.3...LANDMarkClassifier-v.2.0.4
LANDMark Classifier version 2.0.3
Fixed broken PyPI Release
LANDMark Classifier version 2.0.2
- Enhanced ability to resample by accepting 'imbalanced-learn' approaches.
- Added additional split criteria based on the gain ratio and Tsallis entropy (tsallis, gain, gain-ratio, tsallis-gain-ratio)
- 'q' parameter has been exposed and is now available for hyper-parameter tuning (for Tsallis entropy)
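
For readers unfamiliar with the 'q' parameter, the standard Tsallis entropy of a class distribution p is (1 - sum(p_i ** q)) / (q - 1), which approaches Shannon entropy as q approaches 1. The sketch below implements that textbook definition with natural logarithms; the function name and the default q are assumptions, and LANDMark's own split criteria may normalize or combine the quantity differently.

```python
import numpy as np


def tsallis_entropy(counts, q=1.5):
    """Textbook Tsallis entropy of class counts (illustrative, not LANDMark code)."""
    counts = np.asarray(counts, dtype=float)
    # Convert counts to probabilities and drop empty classes.
    p = counts[counts > 0] / counts.sum()
    if np.isclose(q, 1.0):
        # Limit q -> 1 recovers Shannon entropy (natural log).
        return float(-np.sum(p * np.log(p)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))


pure = tsallis_entropy([10, 0], q=2.0)      # a pure node has zero entropy
balanced = tsallis_entropy([5, 5], q=2.0)   # q = 2 gives 1 - sum(p_i ** 2)
shannon = tsallis_entropy([5, 5], q=1.0)    # equals ln(2) for two equal classes
```

At q = 2 the expression reduces to the Gini impurity, which is one reason tuning 'q' can change which splits look attractive.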
LANDMark Classifier version 2.0.1
- Minor fix to README
LANDMark Classifier version 2.0.0
- Removed dependency on 'shap'. Feature importance can be assessed post hoc using a variety of methods, and removing the dependency improves LANDMark performance. May be re-introduced in the future.
- Added type annotations and parameter checking for LANDMark input and hyper-parameters. Additional annotations will be added in later patches for subsequent modules.
- More informative class names (e.g., BaggingClassifier -> Ensemble)
- Simplified the Ensemble() class
- Updated README, API, CONTRIBUTIONS, ISSUES, BUG_REPORT files
- Added tests
- Considerable reduction in redundant code by combining all linear models into a single base classifier (LMClassifier)
- Removed unused modules
- Bumped version to 2.0