pboling commented Jan 4, 2026

This PR updates tree_stump to support the tree-sitter Rust crate 0.26.x, enabling grammars with LANGUAGE_VERSION 15 to be loaded and used. It also removes an unsound unsafe transmute that extended the LanguageRef lifetime beyond its borrowed scope.

Changes

ext/tree_stump/Cargo.toml

[dependencies]
tree-sitter = "0.26.3"
streaming-iterator = "0.1"

ext/tree_stump/src/language.rs

  • Replace self.raw_language_ref.version() with self.raw_language_ref.abi_version()
  • The Ruby Language#version method now calls abi_version() internally

ext/tree_stump/src/parser.rs

  • Remove timeout_micros() and set_timeout_micros() methods
  • Handle Parser::language() returning Option<LanguageRef<'_>> instead of
    Option<Language> - return None to Ruby since we can't convert LanguageRef
    back to Language without cloning

ext/tree_stump/src/tree.rs

  • Convert usize to u32 when calling Node::child(index) using .try_into().ok()?, so an index that does not fit in u32 returns None (see Additional Fixes)
  • Convert usize to u32 when calling Node::named_child(index)
  • Update child_containing_descendant to use child_with_descendant (renamed in 0.26)

ext/tree_stump/src/query.rs

  • Use streaming_iterator::StreamingIterator for QueryMatches iteration:
    use streaming_iterator::StreamingIterator;
    
    while let Some(m) = matches.next() {
        // process match
    }
  • This replaces the previous for m in matches { ... } pattern
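The reason a plain for loop no longer fits can be sketched without the real crates. In the block below, MiniStreamingIterator is a cut-down stand-in for streaming_iterator::StreamingIterator (only advance, get, and next), and FakeMatches is a hypothetical substitute for QueryMatches; neither is the code from this PR. The point it illustrates is that each item is borrowed from the iterator itself, which std::iter::Iterator cannot express, so iteration goes through while let.

```rust
use std::collections::VecDeque;

/// Stand-in for `streaming_iterator::StreamingIterator`, cut down to the
/// three methods used here so the sketch compiles without the crate.
trait MiniStreamingIterator {
    type Item: ?Sized;

    /// Move to the next element.
    fn advance(&mut self);

    /// Borrow the current element, if any.
    fn get(&self) -> Option<&Self::Item>;

    /// Advance, then borrow the new current element.
    fn next(&mut self) -> Option<&Self::Item> {
        self.advance();
        self.get()
    }
}

/// Hypothetical replacement for `QueryMatches`: each item lives inside the
/// iterator and is overwritten by the next call to `advance`.
struct FakeMatches {
    pending: VecDeque<String>,
    current: Option<String>,
}

impl FakeMatches {
    fn new(kinds: &[&str]) -> Self {
        FakeMatches {
            pending: kinds.iter().map(|k| k.to_string()).collect(),
            current: None,
        }
    }
}

impl MiniStreamingIterator for FakeMatches {
    type Item = str;

    fn advance(&mut self) {
        self.current = self.pending.pop_front();
    }

    fn get(&self) -> Option<&str> {
        self.current.as_deref()
    }
}

/// Same loop shape as in query.rs: `while let` instead of `for`.
fn collect_upper(mut matches: FakeMatches) -> Vec<String> {
    let mut out = Vec::new();
    while let Some(m) = matches.next() {
        // `m` borrows from `matches`, so copy out what must outlive this turn.
        out.push(m.to_uppercase());
    }
    out
}
```

Anything that needs to survive past one loop iteration has to be copied out of the match, as collect_upper does with to_uppercase.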

Breaking Changes

  • Ruby API: The following methods are removed from TreeStump::Parser:

    • timeout_micros / timeout_micros=

    These were deprecated in tree-sitter 0.25 and removed in 0.26. Users should use
    alternative cancellation mechanisms (e.g., Ruby's Timeout module).

  • Ruby API: Parser#language now returns nil when called after language=.
    This is because tree-sitter 0.26's Parser::language() returns a borrowed reference
    that cannot be converted back to an owned Language. If you need to track the language,
    store it separately before setting it on the parser.

Backward Compatibility

  • Grammars with LANGUAGE_VERSION 13-15 are supported
  • Older grammars may work but are not guaranteed

Testing

# Load tree_stump
require 'tree_stump'

# Register a LANGUAGE_VERSION 15 grammar
TreeStump.load_language("/path/to/libtree-sitter-toml.so", "tree_sitter_toml")

# Create parser and parse
parser = TreeStump::Parser.new
parser.set_language("toml")
tree = parser.parse('key = "value"')

puts tree.root_node.kind        # => "document"
puts tree.root_node.child_count # => 1
puts tree.root_node.text        # => 'key = "value"'

Closes #16, #17

Additional Fixes

Eliminate unsafe memory transmute and prevent integer truncation.

  • Remove unsound unsafe transmute in Parser::language() that extended LanguageRef lifetime from borrowed scope to 'static. The safety comment claimed languages outlive parsers, but this wasn't enforced by the type system and could lead to use-after-free.
  • Redesign Parser to store language_name and look up the language directly from the global LANG_LANGUAGES map in build_query(), keeping proper lifetime bounds tied to the lock guard.
  • Replace as u32 casts with .try_into().ok()? in Node::child() and Node::named_child() to prevent silent value truncation on 64-bit systems where usize can exceed u32::MAX.
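
The language_name redesign described above can be sketched with the standard library alone. Everything in this block is an assumption made for illustration: Language is a stand-in for tree_sitter::Language, lang_languages() approximates the global LANG_LANGUAGES map (its real type is not shown in this PR description), and with_language is a hypothetical helper rather than the actual build_query(). It shows how a lookup can stay inside the lifetime of the lock guard instead of being transmuted to 'static.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for `tree_sitter::Language`.
struct Language {
    abi_version: usize,
}

/// Global registry in the spirit of LANG_LANGUAGES (exact shape assumed).
fn lang_languages() -> &'static Mutex<HashMap<String, Language>> {
    static REGISTRY: OnceLock<Mutex<HashMap<String, Language>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

/// The parser keeps only the registered name, never a borrowed language.
struct Parser {
    language_name: Option<String>,
}

impl Parser {
    /// Run `f` with the registered language while the lock guard is alive.
    /// The `&Language` cannot escape the closure, so the borrow checker
    /// enforces what the old safety comment only promised.
    fn with_language<R>(&self, f: impl FnOnce(&Language) -> R) -> Option<R> {
        let name = self.language_name.as_ref()?;
        let guard = lang_languages().lock().ok()?;
        let language = guard.get(name)?;
        Some(f(language))
    }
}
```

A caller would register a language once, then pass a closure such as |l| l.abi_version to with_language; a parser with no language set, or an unknown name, yields None.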

Fix build errors and deprecation warnings for tree-sitter 0.26

Rust source changes (ext/tree_stump/src/*.rs):

  • Fix compilation error in parser.rs by changing Query::new to accept
    &tree_sitter::Language instead of &tree_sitter::LanguageRef
  • Replace deprecated magnus::exception::runtime_error() calls with
    build_error() helper throughout lib.rs, query.rs, and tree.rs
  • Replace deprecated magnus::exception::type_error() with
    ruby.exception_type_error() in query.rs
  • Fix deprecated into_symbol() -> into_symbol_with(&ruby) in query.rs
  • Fix deprecated into_value() -> into_value_with(ruby) in query.rs
  • Add #[allow(deprecated)] for unavoidable fallback case in util.rs
  • Fix lifetime elision warnings by adding explicit <'_> lifetime
    parameters to Node and TreeCursor return types in tree.rs

Spec changes:

  • Update spec_helper.rb to use TREE_SITTER_RUBY_PATH env var for
    grammar path, falling back to local path for CI
  • Add is_missing? and missing? method tests for Node class
  • Add helper methods find_missing_nodes and check_all_nodes for
    tree traversal in tests

Update tree_stump to support the tree-sitter Rust crate 0.26.x,
enabling grammars with LANGUAGE_VERSION 15 to be loaded.

Changes:
- Replace Language::version() with Language::abi_version()
- Remove deprecated parser timeout/cancellation APIs
- Handle Parser::language() returning Option<LanguageRef>
- Convert usize to u32 for Node::child() and Node::named_child()
- Use streaming_iterator for QueryMatches iteration
- Update Cargo.toml to tree-sitter = "0.26.3"
- Add streaming-iterator = "0.1" dependency

BREAKING CHANGE: Parser#timeout_micros method is removed (deprecated
in 0.25, removed upstream in 0.26). Parser#language now returns nil
after setting a language due to API changes in the Rust crate.

Closes joker1007#16
Copilot AI review requested due to automatic review settings January 4, 2026 05:49

Copilot AI left a comment

Pull request overview

This PR updates the tree_stump Ruby gem to support tree-sitter Rust crate version 0.26.x, enabling compatibility with grammars using LANGUAGE_VERSION 15. The update involves adapting to breaking API changes in tree-sitter, including removal of timeout methods, API signature changes, and adoption of streaming iterators.

Key changes:

  • Updated tree-sitter dependency from 0.22 to 0.26 with new dependencies (tree-sitter-language, streaming-iterator)
  • Removed deprecated timeout methods from the Parser API
  • Added new Node methods (is_missing) and compatibility aliases for better API consistency
  • Updated CI workflow to test additional Ruby implementations and versions

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 6 comments.

Summary per file (File: Description)

  • ext/tree_stump/Cargo.toml: Updated tree-sitter to 0.26, added tree-sitter-language and streaming-iterator dependencies
  • Cargo.lock: Updated dependency versions and added new transitive dependencies for tree-sitter 0.26
  • ext/tree_stump/src/lib.rs: Updated language loading to use LanguageFn, removed timeout method definitions, added Node method aliases
  • ext/tree_stump/src/language.rs: Updated version() to call abi_version(), added new abi_version() method
  • ext/tree_stump/src/parser.rs: Removed timeout methods, added unsafe lifetime extension for LanguageRef
  • ext/tree_stump/src/tree.rs: Added is_missing() method, converted usize to u32 for child access, renamed method call to child_with_descendant
  • ext/tree_stump/src/query.rs: Updated to use StreamingIterator for query matches iteration
  • .github/workflows/rspec.yml: Enhanced CI with additional Ruby versions, experimental implementations, and improved workflow configuration

pboling force-pushed the feat/upgrade-tree-sitter branch 2 times, most recently from 6343278 to c3960c3, on January 4, 2026 05:56
Consolidate tree-sitter environment setup using the kettle-rb/ts-grammar-action
composite action instead of separate steps for Java, tree-sitter, and Rust.

Before:
- actions/setup-java@v5 (conditional for TruffleRuby/JRuby)
- tree-sitter/setup-action@v2 (library and CLI)
- actions-rust-lang/setup-rust-toolchain@v1 (Rust for tree_stump)

After:
- kettle-rb/ts-grammar-action@v1 (all-in-one)
  - install-cli: true
  - install-lib: true
  - setup-rust: true (required for tree_stump Rust extension)
  - setup-java: conditional (for TruffleRuby/JRuby)

This reduces workflow complexity and ensures consistent tree-sitter
environment setup across kettle-rb projects.

Closes joker1007#17
pboling force-pushed the feat/upgrade-tree-sitter branch from c3960c3 to 9ec6ef7 on January 4, 2026 05:57
- usize can be larger than u32.
- On conversion failure (index > 4,294,967,295)
  - Now returns None
  - Previously truncated the value
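
The difference between the two conversions can be shown in a few lines. This is a minimal sketch assuming a 64-bit target; child_at is a hypothetical stand-in for Node::child that indexes a slice, not the binding code from this PR.

```rust
use std::convert::TryInto; // in the prelude from edition 2021; needed before that

/// What an `as` cast does: keeps only the low 32 bits.
fn truncating_index(index: usize) -> u32 {
    index as u32
}

/// Checked conversion: `None` when the value does not fit in `u32`.
fn checked_index(index: usize) -> Option<u32> {
    index.try_into().ok()
}

/// Hypothetical stand-in for `Node::child`, which takes a u32 index.
fn child_at<'a>(children: &[&'a str], index: usize) -> Option<&'a str> {
    // Bail out with `None` instead of silently wrapping around.
    let index: u32 = index.try_into().ok()?;
    children.get(index as usize).copied()
}
```

With the cast, an index of u32::MAX + 1 wraps to 0 and would silently address the first child; with the checked form the same index produces None.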
pboling force-pushed the feat/upgrade-tree-sitter branch from 1309b56 to 718a127 on January 4, 2026 07:35

pboling commented Jan 6, 2026

The last build failed because the ruby-head tag in setup ruby now points at Ruby 4.1.0. The last time it ran, it worked because it was still using Ruby 4.0.0, and there is a breaking change in the C API. I have fixed it in rb-sys and magnus, but rb-sys-env and magnus both still need to be released:

@joker1007 This can be merged, but the ruby-head build will continue to fail until new versions are released for rb-sys-env and magnus.

pboling commented Jan 8, 2026

Also, I added, and then removed, CI for JRuby and TruffleRuby. I just wanted to see if it was a reasonable lift to add support for them. Turns out it was more than I can handle right now.

pboling commented Jan 14, 2026

@joker1007 The new rb-sys-env has been released, but we are still waiting on magnus.

- Change gemspec extensions from Cargo.toml to extconf.rb to use rb_sys/mkmf
- Add rb_sys ~> 0.9.119 as runtime dependency (required by extconf.rb)
- Add CI build verification steps to ensure extension compiles before tests
- Add Rust version verification and explicit compile step in workflow
- Improve error diagnostics for extension build failures

Fixes issue where extension built successfully but .so file wasn't found
in CI due to RubyGems using native Cargo builder instead of rb_sys.

pboling commented Jan 21, 2026

@joker1007 I think we need to make rb-sys a runtime dependency in the gemspec, because the gem seems to not build without it in some scenarios. I'm not sure why yet. I've added this, but happy to take it back out if it doesn't belong and I've misdiagnosed my build failures.

  spec.add_dependency "rb_sys", "~> 0.9.119"

Oh, and magnus will not be released for ruby-head support. We'll have to target main until there is a new release with the support...

Oh, lol, maybe that's why the build was failing... 😆
