@ben-bitdotio commented Aug 17, 2021

Summary:
Added resolution between float and int types so that a column containing both is no longer treated as having incompatible types.

Tests:
Verified that the following file is correctly predicted to have a header via Detector.has_header().

col1,col2,col3
hello,"hello world", 1.2
world,"hello world", 1.2
test,"hello world 您", 1
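For readers who want to see why this file trips the heuristic, here is a minimal, self-contained sketch of the idea. It is not the CleverCSV implementation: it assumes a voting scheme in the style of the standard library's `csv.Sniffer.has_header`, and the names `infer_type`, `resolve`, and the `resolve_numeric` flag are hypothetical, introduced only for illustration.

```python
import csv
import io

SAMPLE = (
    "col1,col2,col3\n"
    'hello,"hello world", 1.2\n'
    'world,"hello world", 1.2\n'
    'test,"hello world 您", 1\n'
)


def infer_type(cell):
    """Return int or float for numeric cells, else the cell's string length."""
    for caster in (int, float):
        try:
            caster(cell)
            return caster
        except ValueError:
            continue
    return len(cell)


def resolve(current, new, resolve_numeric=True):
    """Merge two inferred types; return None when they are incompatible.

    With resolve_numeric enabled (hypothetical flag), int and float
    resolve to float instead of being treated as a mismatch.
    """
    if current == new:
        return current
    if resolve_numeric and {current, new} == {int, float}:
        return float
    return None


def has_header(sample, resolve_numeric=True):
    """Simplified header vote: does row 0 differ from the column types below it?"""
    rows = list(csv.reader(io.StringIO(sample)))
    header, body = rows[0], rows[1:]
    column_types = {}
    dropped = set()
    for row in body:
        if len(row) != len(header):
            continue  # skip ragged rows
        for col, cell in enumerate(row):
            if col in dropped:
                continue
            cell_type = infer_type(cell)
            if col not in column_types:
                column_types[col] = cell_type
                continue
            merged = resolve(column_types[col], cell_type, resolve_numeric)
            if merged is None:
                # Inconsistent column: it can no longer vote.
                dropped.add(col)
                del column_types[col]
            else:
                column_types[col] = merged
    votes = 0
    for col, col_type in column_types.items():
        if isinstance(col_type, int):
            # Non-numeric column tracked by string length.
            votes += 1 if len(header[col]) != col_type else -1
        else:
            try:
                col_type(header[col])
                votes -= 1  # header cell parses like the data
            except ValueError:
                votes += 1  # header cell differs from the data
    return votes > 0


print(has_header(SAMPLE))                         # True
print(has_header(SAMPLE, resolve_numeric=False))  # False
```

In this sketch the first two columns are dropped because their string lengths vary, so the third column is the only one left to vote. Without int/float resolution it is dropped as well (float, float, int), no votes are cast, and no header is reported.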

Update: I will be unable to contribute to this discussion under this account after today. It appears that I can't modify the assignees list, but @ellie-bitio should be able to follow up if necessary.

@GjjvdBurg
Collaborator

Thanks for opening an issue on this and creating a PR @ben-bitdotio! The header detection code could definitely be improved, but I've been waiting until I have a dataset to evaluate the accuracy of different algorithms. This fix seems pretty harmless though, so I think we can merge it for now.

Would you be able to add a unit test to tests/test_unit/test_detect.py that fails without your fix but passes with your fix? That would be a nice confirmation that it works as expected (the example you give above could work as a test case). Thank you!

(cc-ing @ellie-bitio as suggested)
