Allow direct passing of arguments to regressor from CLI #90

Open
dwysocki opened this issue Dec 23, 2015 · 1 comment

Comments

@dwysocki

The plotypus CLI currently comes with a number of allowed regressors, and some predefined options for them. Not all regressors accept the same arguments, so only the appropriate arguments are passed to the chosen regressor.

This design is rather limited, however. To expose every option of every regressor, we would have to add a global CLI option for each one. That would not only overcomplicate the accepted arguments, but also make it unclear which options a given regressor supports without checking its API. Alternatively, we can keep supporting a limited number of options, as we do now, but where we draw the line between inclusion and exclusion is rather arbitrary.

I propose a more flexible solution. Instead of passing specific command line arguments to the regressors, we create a single, all-encompassing --regressor-options <key 1> <val 1> ... <key N> <val N> option. This will allow the user to pass any supported option to their regressor of choice. It does require the user to read the documentation for the regressor they use, which likely means reading the scikit-learn documentation. We can of course give some suggestions in our own documentation to ease that learning curve.
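As a rough illustration of the proposed option, here is a minimal sketch of how the key/value pairs might be collected with argparse and turned into keyword arguments. The helper name `parse_regressor_options` is hypothetical, the literal-parsing rule (try a Python literal, fall back to a string) is an assumption rather than a settled design, and the keys in the example are only illustrative.

```python
import argparse
import ast


def parse_regressor_options(pairs):
    """Turn a flat [key1, val1, ..., keyN, valN] list into a kwargs dict.

    Hypothetical helper: not part of plotypus.
    """
    if len(pairs) % 2 != 0:
        raise ValueError("--regressor-options expects key/value pairs")
    options = {}
    for key, raw in zip(pairs[0::2], pairs[1::2]):
        try:
            # Interpret Python literals such as 5, 0.1, True, or None.
            options[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a literal is kept as a plain string.
            options[key] = raw
    return options


parser = argparse.ArgumentParser()
# nargs="*" collects every token following the flag into one list.
parser.add_argument("--regressor-options", nargs="*", default=[])

# Illustrative command line; the keys are assumptions about what a
# scikit-learn regressor might accept.
args = parser.parse_args(
    ["--regressor-options", "max_iter", "5000", "cv", "10",
     "selection", "random"]
)
regressor_kwargs = parse_regressor_options(args.regressor_options)
```

The resulting dictionary could then be unpacked into the chosen regressor's constructor, so that unsupported keys are rejected by the regressor itself rather than by the CLI.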

@dwysocki

Before this can be implemented for regressors, a change must be made to the Fourier preprocessor, and likely to many other places which act on its output. The design matrix produced by this class includes the intercept, and fit_intercept=False is passed to all of the regressors. This is a problem, since we cannot guarantee whether the user will pass this argument, or even that their regressor of choice will support it as an option.

For this reason, we need to consider all of the ramifications of fit_intercept=True/False being passed, and how to handle it in general. I'm thinking there should be some sort of option for the Fourier class, as well as for anything which acts on its output. I need to do some digging through the code before I can make an informed decision, though.
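To make the concern concrete, here is one possible way to reconcile user-supplied options with a design matrix that already contains the intercept. This is only a sketch of one policy, not a decision: `build_regressor`, `design_has_intercept`, and the two stand-in classes are hypothetical names, and the stand-ins merely mimic constructors that do or do not accept fit_intercept.

```python
import inspect


def build_regressor(regressor_class, user_options, design_has_intercept=True):
    """Instantiate a regressor without fitting the intercept twice.

    Hypothetical helper: one possible policy, not plotypus behavior.
    """
    options = dict(user_options)
    params = inspect.signature(regressor_class.__init__).parameters
    accepts_flag = "fit_intercept" in params

    if design_has_intercept:
        # Drop whatever the user passed; True would conflict with the
        # intercept column already present in the design matrix.
        if options.pop("fit_intercept", False):
            raise ValueError(
                "fit_intercept=True conflicts with a design matrix "
                "that already includes the intercept"
            )
        # Only force the flag on regressors that actually accept it.
        if accepts_flag:
            options["fit_intercept"] = False
    return regressor_class(**options)


class WithIntercept:
    """Stand-in for a regressor that accepts fit_intercept."""

    def __init__(self, alpha=1.0, fit_intercept=True):
        self.alpha = alpha
        self.fit_intercept = fit_intercept


class WithoutIntercept:
    """Stand-in for a regressor with no fit_intercept option."""

    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors


linear = build_regressor(WithIntercept, {"alpha": 0.5})
other = build_regressor(WithoutIntercept, {"fit_intercept": False})
```

Under this policy, an explicit fit_intercept=True is rejected outright. An alternative would be to have the Fourier class omit the intercept column in that case, which is the kind of option described above.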
