Allow usage of arbitrary regressor and selector objects from CLI #91

Open
dwysocki opened this issue Dec 23, 2015 · 2 comments

Comments

@dwysocki
Member

Currently we have a predefined list of supported regressor and selector objects, which are selected via the --regressor <regressor> and --selector <selector> CLI options. This approach restricts users to the objects we chose to include. It also means that if an object is added or deprecated in a future scikit-learn release, we will have to either break compatibility between releases or perform some gymnastics to support both.

I propose the following: allow users to provide the fully qualified name of any object to use as a regressor or selector, e.g. --regressor sklearn.linear_model.LassoCV. The following self-contained code snippet demonstrates how we might implement this.

from importlib import import_module # that's a mouthful

# this will be the `args` object that already exists in `plotypus.cli.get_args`
from argparse import Namespace
args = Namespace(regressor="sklearn.linear_model.LassoCV")

# separate module and object name
regressor_components = args.regressor.split(".")
regressor_module_name = ".".join(regressor_components[:-1])
regressor_object_name = regressor_components[-1]
# import the module and then the regressor object from it
regressor_module = import_module(regressor_module_name)
regressor_object = getattr(regressor_module, regressor_object_name)

This functionality could of course be wrapped in a utility function (perhaps plotypus.utils.import_object) for reuse on the selector and possibly other objects, but this makes for a nice demonstration.
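As a rough sketch of what such a utility might look like: the name import_object comes from the suggestion above, but the body, the error handling, and the choice of exceptions are assumptions rather than existing plotypus code. The demonstration imports a standard-library class so it runs without scikit-learn installed.

```python
from importlib import import_module


def import_object(dotted_path):
    """Import and return the object named by a dotted path like 'pkg.module.Name'.

    Hypothetical helper; plotypus does not necessarily define it this way.
    """
    # Split on the last dot only: everything before it is the module path.
    module_name, sep, object_name = dotted_path.rpartition(".")
    if not sep or not module_name or not object_name:
        raise ValueError(
            "Expected a dotted path like 'package.module.Object', "
            "got {!r}".format(dotted_path))
    module = import_module(module_name)
    try:
        return getattr(module, object_name)
    except AttributeError:
        # Report a missing attribute the same way as a missing module.
        raise ImportError(
            "Module {!r} has no attribute {!r}".format(module_name, object_name))


# Stand-in for "sklearn.linear_model.LassoCV" so the sketch has no dependencies.
ordered_dict_class = import_object("collections.OrderedDict")
```

The same call would serve both --regressor and --selector, since nothing in the helper is specific to either.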

We could still keep some shortcuts, as we do now, such as having LassoCV be a shortcut for sklearn.linear_model.LassoCV. This could be taken care of by a check like

if "." in args.regressor:
    # a dotted name is treated as a full import path
    regressor = utils.import_object(args.regressor)
elif args.regressor in regressor_shortcuts:
    # a bare name must be one of the predefined shortcuts
    regressor = regressor_shortcuts[args.regressor]
else:
    raise ValueError("Undefined regressor: {}".format(args.regressor))
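One possible way to wire this check into the argument parser is sketched below. The names resolve_object and demo_shortcuts are hypothetical, and the shortcut table here maps to dotted-path strings rather than to already-imported objects, which is an assumption that differs slightly from the check above. Standard-library classes stand in for scikit-learn objects so the sketch runs on its own.

```python
import argparse
from importlib import import_module


def resolve_object(name, shortcuts):
    """Resolve a CLI value to an object (hypothetical helper).

    `shortcuts` maps short names to dotted paths; any other value
    is treated as a dotted path itself.
    """
    dotted_path = shortcuts.get(name, name)
    module_name, sep, object_name = dotted_path.rpartition(".")
    if not sep or not module_name or not object_name:
        raise ValueError("Undefined object: {}".format(name))
    return getattr(import_module(module_name), object_name)


# Placeholder table; a real one might map "LassoCV" to
# "sklearn.linear_model.LassoCV".
demo_shortcuts = {"OrderedDict": "collections.OrderedDict"}

parser = argparse.ArgumentParser()
# The type callable converts the raw string while arguments are parsed.
parser.add_argument(
    "--regressor",
    type=lambda name: resolve_object(name, demo_shortcuts))

args = parser.parse_args(["--regressor", "OrderedDict"])
```

Keeping the shortcut targets as strings means nothing is imported until a user asks for it, so a shortcut whose target has been removed from a given scikit-learn release would only fail when that shortcut is actually requested. argparse reports a ValueError raised by a type callable as a usage error, so an unknown bare name is rejected at parse time.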

I think we should definitely go ahead with the new functionality; the only thing I'm iffy on is keeping the old functionality. Thoughts?

@dwysocki
Member Author

One caveat: our current implementation of Baart's criteria will not work here. It will have to be refactored as a legitimate selector object, as I had originally envisioned it, instead of an option to plotypus.preprocessing.Fourier. This is something I planned on doing eventually, anyway, so it's not a real issue.

EDIT: This now has its own issue, #98.

@dwysocki
Member Author

This issue also depends on #90 being completed before it can work.
