frequent/largest/smallest determined from source and compare combined#66
frequent/largest/smallest determined from source and compare combined#66anthonyywangmath wants to merge 1 commit intofbdesignpro:masterfrom
Conversation
|
Hi @anthonyywangmath, thank you so much for the detailed contribution! This is a good find. I agree that the solution is to force source and compared to use the same data types. I am reviewing a few other issues, and in general it feels like coercing users into a bit more strictness is more desirable than trying to do a bunch of conversion and failsafe cases, because things degenerate quickly since there can be so many edge cases. I'm not sure if I will bring this in wholesale as a pull request or integrate it piecemeal, as I am in the middle of a big revision and it might be easier to do. I will look into it further and let you know... Thanks again! |
I want to chime in on this one. Instead of forcing source to have same datatype, an advisory could be issued on top of the report to get more accurate results by having the same data-types. The entry point to the use of library should be low. |
When comparing two numerical series, the detailed tables of freq/max/min values to display was determined based on the source data only. This could make it seem like, for example, that all the largest values come from the source dataset when it is not the case.
For example if the largest values in your source dataset were 20, 18, 16 while in the compare dataset the largest were 21, 19, 17, then only 20, 18, 16 would appear in the table of largest values.
The proposed changes makes it so that the values shown in the table of frequent/largest/smallest is determined by combining both source and compare datasets whenever the compare dataset exists.
Other remarks: