Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add __hash__ method to Node #173

Merged
merged 1 commit into from
Nov 12, 2023
Merged

Add __hash__ method to Node #173

merged 1 commit into from
Nov 12, 2023

Conversation

pedrovhb
Copy link
Contributor

Hey all,

Thanks for the library! I'd like to contribute something that I've wanted a few times when working with py-tree-sitter.

As a note, I'm not very well-versed in C or writing Python extensions, so although this seems pretty simple, a critical eye would be appreciated.

This PR adds a __hash__ method to the Node class. This makes it possible to add Node instances to sets or use them as dictionary keys. I've tested and found the hash to work well and be consistently the same for a given node, and consistently different for different nodes as you'd expect.

@amaanq
Copy link
Member

amaanq commented Nov 11, 2023

Do you need nodes hashed? What use case or benefit does this bring? Imo I can't really see any

Copy link
Contributor

@maxbrunsfeld maxbrunsfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to make it possible to put nodes in a set or dictionary. Implementing in terms of tree address plus node id makes sense.

@pedrovhb
Copy link
Contributor Author

Do you need nodes hashed? What use case or benefit does this bring? Imo I can't really see any

So for one, having a way to do O(1) lookup to check if a node is contained within a set without having to explicitly extract node IDs and then possibly have a separate {id: node} mapping is generally handy, I think.

The main thing though is that I'm experimenting with a visitor/transformer pattern in the style of libcst and the standard library ast. In that context, it's useful to allow providing additional metadata for a node. Libcst for instance offers an interface for providing arbitrary metadata and comes with a few metadata provider implementations out of the box such as a type inference and a scope provider. The metadata is generally specific to particular nodes, and having a {node: metadata} mapping is significantly simpler than having an {id: metadata} mapping and a separate {id: node} one, and having to do two lookups everywhere.

In terms of general expectations set by the language, in Python the usual business is that immutable types such as tuples and primitives are hashable, and mutable ones like lists and dictionaries are not. Tree-sitter nodes are immutable as I understand, so there's no real reason I see for nodes to not be hashable, given its identity is well defined.

@amaanq
Copy link
Member

amaanq commented Nov 12, 2023

Awesome, thanks for both of your comments!

@amaanq amaanq merged commit 600438f into tree-sitter:master Nov 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants