Methods for Multiple-Output Learning in Python
This package provides multi-output methods in Python, using scikit-learn for base classifiers. Classifiers are written in the style of scikit-learn classifiers.
For a maturing Java-based framework for multi-label and multi-output learning, see the MEKA framework. Sometimes, though, it is nice to work in Python, hence this project. The basic problem transformation methods are implemented, as in MEKA, except that scikit-learn provides the base classifiers. I have also come across scikit-multilearn, which has similar goals and also includes a wrapper to MEKA classifiers.
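As an illustration of what a problem transformation looks like, here is a minimal numpy-only sketch of the training sets that a classifier chain is built on: the classifier for label j sees the original features plus the true values of labels 0..j-1. The function name `chain_training_sets` and the toy arrays are hypothetical and are not part of this package; the package's own CC class may organise this differently.

```python
import numpy as np

def chain_training_sets(X, Y):
    """Yield (X_j, y_j) for each label j, in column order.

    X_j is X with the true values of labels 0..j-1 appended as extra
    features; y_j is the j-th label column.
    (Hypothetical helper for illustration, not the package's API.)
    """
    X = np.asarray(X)
    Y = np.asarray(Y)
    for j in range(Y.shape[1]):
        # For j == 0 nothing is appended, so X_j has the same columns as X.
        X_j = np.hstack([X, Y[:, :j]])
        yield X_j, Y[:, j]

# Toy data (assumed): 4 instances, 3 features, 2 binary labels.
X = np.arange(12, dtype=float).reshape(4, 3)
Y = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])

sets = list(chain_training_sets(X, Y))
for j, (X_j, y_j) in enumerate(sets):
    print(j, X_j.shape, y_j.shape)
```

Each (X_j, y_j) pair is an ordinary single-output problem, so any scikit-learn classifier could be fitted to it. Using the same X for every label, with nothing appended, would correspond to the simpler binary relevance transformation.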
Installation requires numpy and scikit-learn. To install:
$ python setup.py install
Or, if you will be developing:
git clone https://github.com/jmread/molearn
cd molearn
python setup.py develop
To install locally, use the --prefix option, e.g.,
python setup.py develop --prefix=$HOME/.local/
To check that it is working, run the demo:
$ python runDemo.py
Data is represented in two-dimensional numpy arrays, similarly to sklearn.
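A small sketch of that layout, assuming the usual scikit-learn convention of one row per instance, with feature columns in X and one column per output in Y. The values below are made up for illustration only.

```python
import numpy as np

# Hypothetical toy data: 4 instances, 3 features, 2 binary labels.
X = np.array([[0.1, 1.2, 3.0],
              [0.4, 0.9, 2.1],
              [1.5, 0.2, 0.3],
              [1.1, 0.1, 0.8]])
Y = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 0]])

n_instances, n_features = X.shape
n_labels = Y.shape[1]

# X and Y must describe the same instances, row for row.
assert Y.shape[0] == n_instances
print(n_instances, n_features, n_labels)
```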
For example, to run Classifier Chains with a Random Forest base classifier:
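from molearn.classifiers.CC import CC
from sklearn.ensemble import RandomForestClassifier
# Note: Exact_match is not imported in this snippet; import it from the package before running.
# X_train, Y_train, X_test and Y_test are two-dimensional numpy arrays.
# Classifier Chains with a Random Forest as the base classifier
h = CC(h=RandomForestClassifier(n_estimators=100))
h.fit(X_train, Y_train)
# Predicted probabilities, thresholded at 0.5 to obtain binary predictions
Y_pred = h.predict_proba(X_test)
print("Exact Match:", Exact_match(Y_test, Y_pred > 0.5))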
For further examples, have a look at runDemo.py.
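The example above thresholds the predicted probabilities at 0.5 and scores them with Exact_match. For readers unfamiliar with the metric, here is a minimal sketch of how exact match is commonly computed: the fraction of instances whose entire label vector is predicted correctly. The function `exact_match_sketch` and the toy arrays are hypothetical stand-ins, not the package's own implementation.

```python
import numpy as np

def exact_match_sketch(Y_true, Y_pred):
    """Fraction of rows where every label matches.

    (Hypothetical stand-in for illustration; the package's Exact_match
    may differ in details.)
    """
    Y_true = np.asarray(Y_true)
    Y_pred = np.asarray(Y_pred)
    return float(np.mean(np.all(Y_true == Y_pred, axis=1)))

# Made-up true labels and predicted probabilities for 4 instances, 2 labels.
Y_true = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
Y_prob = np.array([[0.9, 0.2], [0.8, 0.4], [0.1, 0.7], [0.3, 0.6]])

# Threshold at 0.5, as in the example above.
Y_hat = (Y_prob > 0.5).astype(int)
score = exact_match_sketch(Y_true, Y_hat)
print("Exact match:", score)
```

Exact match is a strict measure: a row with one wrong label out of many counts the same as a row with every label wrong.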