-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API design #13
Comments
Update: changed train to fit in order to match sklearn API |
You are right, with the current API you have to know the parameters. The idea was, that if you want to change the dimensionality, or the used metric you have to know what you are doing anyways. But, a DR object already has a function druid.parameter("parameter_name", [parameter_value]) - (with two aliases "para" and "p") where you can set a parameter (which is chainable, similar to d3's attr function). But it would be no problem to use getters and setter. Checking the parameters would be probably easier that way. Druid has already some other things implemented, some clustering-, k-NN-, and linear algebra implementations --- some of them doesn't work that well yet ;). Therefore we could maybe change the DR constructor to take a String for the name of the DR method for example I like the values function very much, we should add this :), also the function to set dimensionality and additionally one for changing the metric function. As I mentioned in issue #11 a fit or train method will not work with most of the DR methods. Maybe we could add it for those DR methods where it works? |
Ah I hadn't seen the chainable .parameter method, now I see it!
however it feels a bit strange to parametrize after adding the values. And it seems you can't use .parameter("d", 3) to change the dimensionality?
I disagree :) I love to learn by testing things out, and it's frustrating if they break for no apparent reason. You can see that in the "hello" notebook: it needs quite a bit of code to inject the default values. And if you're trying to go 3D, they are not optional. PS: I admit I haven't paid attention yet to the clustering methods (and others); my comments so far are meant only for the DR methods of the API. But I'm curious about them and waiting for some examples or documentation to appear :) |
I would also very much like all DR methods to return a generator, even if it only yields one final result. Otherwise we have to do something like this in user space:
Ref. https://observablehq.com/@fil/druidjs-worker EDIT: solved in 0.7.3 |
I would find having a uniform way to set the I didn't see a |
With the current API, if one wants to project in d=3, one has to know the exact number n of optional arguments before specifying 3 as the n+1th argument. This feels a bit uneasy, and it means that we can't add a supplementary hyperparameter to any method without it being a breaking change.
It seems to be that it would be nice to rethink the API "à la D3", so that:
I would imagine that this could be structured as:
And for each hyperparameter, for example UMAP/min_dist
With this we could say for example:
I wonder what should be done for NaN, I suppose they should be automatically ignored if the values accessor returns any NaN.
Note also that some methods such as UMAP can accept a distance matrix instead of a data array.
PS: Sorry for spamming your project :) The potential is very exciting.
The text was updated successfully, but these errors were encountered: