[![License: MIT][📄license-img]][📄license-ref]
if ci_badges.map(&:color).detect { it != "green"} ☝️ let me know, as I may have missed the discord notification.
if ci_badges.map(&:color).all? { it == "green"} 👇️ send money so I can do more of this. FLOSS maintenance is now my full-time job.
👣 How will this project approach the September 2025 hostile takeover of RubyGems? 🚑️
I've summarized my thoughts in this blog post.
Ast::Merge is not typically used directly - instead, use one of the format-specific gems built on top of it.
The *-merge gem family provides intelligent, AST-based merging for various file formats. At the foundation is tree_haver, which provides a unified cross-Ruby parsing API that works seamlessly across MRI, JRuby, and TruffleRuby.
| Gem | Language / Format | Parser Backend(s) | Description |
|---|---|---|---|
| tree_haver | Multi | MRI C, Rust, FFI, Java, Prism, Psych, Commonmarker, Markly, Citrus, Parslet | Foundation: Cross-Ruby adapter for parsing libraries (like Faraday for HTTP) |
| ast-merge | Text | internal | Infrastructure: Shared base classes and merge logic for all *-merge gems |
| bash-merge | Bash | tree-sitter-bash (via tree_haver) | Smart merge for Bash scripts |
| commonmarker-merge | Markdown | Commonmarker (via tree_haver) | Smart merge for Markdown (CommonMark via comrak Rust) |
| dotenv-merge | Dotenv | internal | Smart merge for .env files |
| json-merge | JSON | tree-sitter-json (via tree_haver) | Smart merge for JSON files |
| jsonc-merge | JSONC | tree-sitter-jsonc (via tree_haver) | |
| markdown-merge | Markdown | Commonmarker / Markly (via tree_haver) | Foundation: Shared base for Markdown mergers with inner code block merging |
| markly-merge | Markdown | Markly (via tree_haver) | Smart merge for Markdown (CommonMark via cmark-gfm C) |
| prism-merge | Ruby | Prism (prism std lib gem) | Smart merge for Ruby source files |
| psych-merge | YAML | Psych (psych std lib gem) | Smart merge for YAML files |
| rbs-merge | RBS | tree-sitter-rbs (via tree_haver), RBS (rbs std lib gem) | Smart merge for Ruby type signatures |
| toml-merge | TOML | Parslet + toml, Citrus + toml-rb, tree-sitter-toml (all via tree_haver) | Smart merge for TOML files |

tree_haver supports multiple parsing backends, but not all backends work on all Ruby platforms:
| Platform 👉️ TreeHaver Backend 👇️ | MRI | JRuby | TruffleRuby | Notes |
|---|---|---|---|---|
| MRI (ruby_tree_sitter) | ✅ | ❌ | ❌ | C extension, MRI only |
| Rust (tree_stump) | ✅ | ❌ | ❌ | Rust extension via magnus/rb-sys, MRI only |
| FFI | ✅ | ✅ | ❌ | TruffleRuby's FFI doesn't support STRUCT_BY_VALUE |
| Java (jtreesitter) | ❌ | ✅ | ❌ | JRuby only, requires grammar JARs |
| Prism | ✅ | ✅ | ✅ | Ruby parsing, stdlib in Ruby 3.4+ |
| Psych | ✅ | ✅ | ✅ | YAML parsing, stdlib |
| Citrus | ✅ | ✅ | ✅ | Pure Ruby PEG parser, no native dependencies |
| Parslet | ✅ | ✅ | ✅ | Pure Ruby PEG parser, no native dependencies |
| Commonmarker | ✅ | ❌ | ❓ | Rust extension for Markdown |
| Markly | ✅ | ❌ | ❓ | C extension for Markdown |
Legend: ✅ = Works, ❌ = Does not work, ❓ = Untested
Why some backends don't work on certain platforms:
- JRuby: Runs on the JVM; cannot load native C/Rust extensions (`.so` files)
- TruffleRuby: Has C API emulation via Sulong/LLVM, but it doesn't expose all MRI internals that native extensions require (e.g., `RBasic.flags`, `rb_gc_writebarrier`)
- FFI on TruffleRuby: TruffleRuby's FFI implementation doesn't support returning structs by value, which tree-sitter's C API requires
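
As a rough illustration of the matrix above, the following sketch looks up which tree-sitter backends are marked as working for the running Ruby engine. The constant and method names are hypothetical and are not part of tree_haver's API; the table is simply transcribed from the ✅ entries above.

```ruby
# Hypothetical lookup table transcribed from the compatibility matrix.
# Illustration only; this is not tree_haver's API.
TREE_SITTER_BACKENDS_BY_ENGINE = {
  "ruby" => %i[mri rust ffi], # MRI: C extension, Rust extension, FFI
  "jruby" => %i[ffi java],    # JRuby: cannot load native C/Rust extensions
  "truffleruby" => [],        # no tree-sitter backend is marked as working
}.freeze

# Returns the tree-sitter backends the matrix marks as working for an engine.
def tree_sitter_backends_for(engine = RUBY_ENGINE)
  TREE_SITTER_BACKENDS_BY_ENGINE.fetch(engine, [])
end

puts "#{RUBY_ENGINE}: #{tree_sitter_backends_for.inspect}"
```

The language-specific and pure Ruby backends (Prism, Psych, Citrus, Parslet) are left out because the matrix marks them as working on all three platforms.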
Example implementations for the gem templating use case:
| Gem | Purpose | Description |
|---|---|---|
| kettle-dev | Gem Development | Gem templating tool using *-merge gems |
| kettle-jem | Gem Templating | Gem template library with smart merge support |
The *-merge gem family is built on a two-layer architecture:
tree_haver provides cross-Ruby parsing capabilities:
- Universal Backend Support: Automatically selects the best parsing backend for your Ruby implementation (MRI, JRuby, TruffleRuby)
- 10 Backend Options: MRI C extensions, Rust bindings, FFI, Java (JRuby), language-specific parsers (Prism, Psych, Commonmarker, Markly), and pure Ruby fallbacks (Citrus, Parslet)
- Unified API: Write parsing code once, run on any Ruby implementation
- Grammar Discovery: Built-in `GrammarFinder` for platform-aware grammar library discovery
- Thread-Safe: Language registry with thread-safe caching
Ast::Merge builds on tree_haver to provide:
- Base Classes: `FreezeNode`, `MergeResult` base classes with unified constructors
- Shared Modules: `FileAnalysisBase`, `FileAnalyzable`, `MergerConfig`, `DebugLogger`
- Freeze Block Support: Configurable marker patterns for multiple comment syntaxes (preserve sections during merge)
- Node Typing System: `NodeTyping` for canonical node type identification across different parsers
- Conflict Resolution: `ConflictResolverBase` with pluggable strategies
- Error Classes: `ParseError`, `TemplateParseError`, `DestinationParseError`
- Region Detection: `RegionDetectorBase`, `FencedCodeBlockDetector` for text-based analysis
- RSpec Shared Examples: Test helpers for implementing new merge gems
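
To make the freeze block idea concrete, here is a minimal sketch of locating frozen regions in a file with hash-style comments. The marker syntax (`<token>:freeze` and `<token>:unfreeze`) and the method name `frozen_line_ranges` are assumptions made for this example; they are not taken from ast-merge's API, which may detect markers differently.

```ruby
# Illustration only: marker syntax and method name are assumptions.
def frozen_line_ranges(source, freeze_token:, comment_prefix: "#")
  prefix = Regexp.escape(comment_prefix)
  token = Regexp.escape(freeze_token)
  marker = /\A\s*#{prefix}\s*#{token}:(freeze|unfreeze)\b/

  ranges = []
  start_line = nil
  source.each_line.with_index(1) do |line, number|
    kind = line[marker, 1] # "freeze", "unfreeze", or nil
    if kind == "freeze" && start_line.nil?
      start_line = number
    elsif kind == "unfreeze" && start_line
      ranges << (start_line..number)
      start_line = nil
    end
  end
  ranges
end

source = <<~BASH
  export A=1
  # myformat-merge:freeze
  export B=custom
  # myformat-merge:unfreeze
  export C=3
BASH

p frozen_line_ranges(source, freeze_token: "myformat-merge") # => [2..4]
```

A merger could then treat every line inside those ranges as destination-owned and skip it when applying template changes.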
```ruby
require "ast/merge"

module MyFormat
  module Merge
    # Inherit from base classes and pass **options for forward compatibility
    class SmartMerger < Ast::Merge::SmartMergerBase
      DEFAULT_FREEZE_TOKEN = "myformat-merge"

      def initialize(template, dest, my_custom_option: nil, **options)
        @my_custom_option = my_custom_option
        super(template, dest, **options)
      end

      protected

      def analysis_class
        FileAnalysis
      end

      def default_freeze_token
        DEFAULT_FREEZE_TOKEN
      end

      def perform_merge
        # Implement format-specific merge logic
        # Returns a MergeResult
      end
    end

    class FileAnalysis
      include Ast::Merge::FileAnalyzable

      def initialize(source, freeze_token: nil, signature_generator: nil, **options)
        @source = source
        @freeze_token = freeze_token
        @signature_generator = signature_generator
        # Process source...
      end

      def compute_node_signature(node)
        # Return signature array for node matching
      end
    end

    class ConflictResolver < Ast::Merge::ConflictResolverBase
      def initialize(template_analysis, dest_analysis, preference: :destination,
        add_template_only_nodes: false, match_refiner: nil, **options)
        super(
          strategy: :batch, # or :node, :boundary
          preference: preference,
          template_analysis: template_analysis,
          dest_analysis: dest_analysis,
          add_template_only_nodes: add_template_only_nodes,
          match_refiner: match_refiner,
          **options
        )
      end

      protected

      def resolve_batch(result)
        # Implement batch resolution logic
      end
    end

    class MergeResult < Ast::Merge::MergeResultBase
      def initialize(**options)
        super(**options)
        @statistics = {merged_count: 0}
      end

      def to_my_format
        to_s
      end
    end

    class MatchRefiner < Ast::Merge::MatchRefinerBase
      def initialize(threshold: 0.7, node_types: nil, **options)
        super(threshold: threshold, node_types: node_types, **options)
      end

      def similarity(template_node, dest_node)
        # Return similarity score between 0.0 and 1.0
      end
    end
  end
end
```

| Base Class | Purpose | Key Methods to Implement |
|---|---|---|
| `SmartMergerBase` | Main merge orchestration | `analysis_class`, `perform_merge` |
| `ConflictResolverBase` | Resolve node conflicts | `resolve_batch` or `resolve_node_pair` |
| `MergeResultBase` | Track merge results | `to_s`, format-specific output |
| `MatchRefinerBase` | Fuzzy node matching | `similarity` |
| `ContentMatchRefiner` | Text content fuzzy matching | Ready to use |
| `FileAnalyzable` | File parsing/analysis | `compute_node_signature` |

Ast::Merge::ContentMatchRefiner is a built-in match refiner for fuzzy text content matching using Levenshtein distance. Unlike signature-based matching which requires exact content hashes, this refiner allows matching nodes with similar (but not identical) content.
```ruby
# Basic usage - match nodes with 70% similarity
refiner = Ast::Merge::ContentMatchRefiner.new(threshold: 0.7)

# Only match specific node types
refiner = Ast::Merge::ContentMatchRefiner.new(
  threshold: 0.6,
  node_types: [:paragraph, :heading],
)

# Custom weights for scoring
refiner = Ast::Merge::ContentMatchRefiner.new(
  threshold: 0.7,
  weights: {
    content: 0.8,  # Levenshtein similarity (default: 0.7)
    length: 0.1,   # Length similarity (default: 0.15)
    position: 0.1, # Position in document (default: 0.15)
  },
)

# Custom content extraction
refiner = Ast::Merge::ContentMatchRefiner.new(
  threshold: 0.7,
  content_extractor: ->(node) { node.text_content.downcase.strip },
)

# Use with a merger
merger = MyFormat::Merge::SmartMerger.new(
  template,
  destination,
  preference: :template,
  match_refiner: refiner,
)
```

This is particularly useful for:
- Paragraphs with minor edits (typos, rewording)
- Headings with slight changes
- Comments with updated text
- Any text-based node that may have been slightly modified
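
For readers unfamiliar with Levenshtein-based scoring, the sketch below shows one common way to turn edit distance into a 0.0 to 1.0 similarity that can be compared against a threshold. It is a generic illustration, not `ContentMatchRefiner`'s actual implementation; the method names `levenshtein` and `content_similarity` are invented here, and the length and position weights are deliberately left out.

```ruby
# Classic dynamic-programming Levenshtein distance (two-row variant).
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = (char_a == char_b) ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Normalize the distance by the longer string to get a 0.0..1.0 score.
def content_similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?

  1.0 - levenshtein(a, b).to_f / longest
end

p levenshtein("kitten", "sitting") # => 3
puts content_similarity("Install the gem", "Install this gem") >= 0.7
```

With a threshold of 0.7, two paragraphs that differ by a typo or a reworded phrase would still pair up, while unrelated paragraphs would not.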
Ast::Merge::JaccardSimilarity provides set-based fuzzy matching of text blocks using Jaccard index with bigram and token overlap metrics. This is the foundation for detecting renamed or refactored nodes that share similar content.
```ruby
# Calculate similarity between two text strings
Ast::Merge::JaccardSimilarity.jaccard("def process_users(data)", "def handle_users(data)")
# => 0.75 (high overlap due to shared tokens)

# Extract tokens from text for comparison
tokens = Ast::Merge::JaccardSimilarity.extract_tokens("data.each { |u| validate(u) }")
# => ["data", "each", "validate"]
```

Ast::Merge::TokenMatchRefiner extends MatchRefinerBase for Jaccard-based fuzzy refinement of unmatched node pairs during alignment. It uses greedy best-first matching to pair orphan nodes that have similar body text.

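
The Jaccard scoring and greedy best-first pairing described above can be sketched in plain Ruby. This is not the library's implementation: the tokenization regex, the method names `token_set`, `jaccard_index`, and `greedy_pairs`, and the use of plain strings in place of AST nodes are all assumptions, and the sketch scores token overlap only (no bigrams).

```ruby
require "set"

# Split text into a set of lowercase identifier-like tokens (assumed tokenization).
def token_set(text)
  text.downcase.scan(/[a-z_][a-z0-9_]*/).to_set
end

# Jaccard index: size of the intersection divided by size of the union.
def jaccard_index(a, b)
  set_a = token_set(a)
  set_b = token_set(b)
  union = set_a | set_b
  return 1.0 if union.empty?

  (set_a & set_b).size.to_f / union.size
end

# Greedy best-first pairing: score every candidate pair, then repeatedly take
# the highest-scoring pair at or above the threshold without reusing an item.
def greedy_pairs(template_texts, dest_texts, threshold: 0.6)
  scored = template_texts.product(dest_texts).map { |t, d| [jaccard_index(t, d), t, d] }
  used_t = Set.new
  used_d = Set.new
  scored.sort_by { |score, _, _| -score }.each_with_object([]) do |(score, t, d), pairs|
    next if score < threshold || used_t.include?(t) || used_d.include?(d)

    used_t << t
    used_d << d
    pairs << [t, d]
  end
end

p jaccard_index("a b c", "a b d") # => 0.5
p greedy_pairs(["def run(x)", "class Foo"], ["class Foo", "def run(y)"], threshold: 0.5)
```

Greedy matching is not globally optimal, but it is simple and predictable: the strongest matches are claimed first, and weaker candidates only pair with whatever is left.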
```ruby
refiner = Ast::Merge::TokenMatchRefiner.new(
  threshold: 0.6,             # Minimum Jaccard similarity (default: 0.6)
  node_types: [:def, :class], # Only match these node types
)

merger = MyFormat::Merge::SmartMerger.new(
  template, destination,
  match_refiner: refiner,
)
```

Ast::Merge::CompositeMatchRefiner chains multiple refiners sequentially, enabling multi-strategy matching in a single alignment pass. Each refiner operates on the residual unmatched nodes from the previous refiner.

```ruby
composite = Ast::Merge::CompositeMatchRefiner.new(refiners: [
  Ast::Merge::ContentMatchRefiner.new(threshold: 0.8), # strict text match first
  Ast::Merge::TokenMatchRefiner.new(threshold: 0.5),   # then looser token match
])

merger = MyFormat::Merge::SmartMerger.new(
  template, destination,
  match_refiner: composite,
)
```

The Ast::Merge module is organized into several namespaces, each with detailed documentation:

| Namespace | Purpose | Documentation |
|---|---|---|
| `Ast::Merge::Detector` | Region detection and merging | lib/ast/merge/detector/README.md |
| `Ast::Merge::Recipe` | YAML-based merge recipes | lib/ast/merge/recipe/README.md |
| `Ast::Merge::Comment` | Comment parsing and representation | lib/ast/merge/comment/README.md |
| `Ast::Merge::Text` | Plain text AST parsing | lib/ast/merge/text/README.md |
| `Ast::Merge::RSpec` | Shared RSpec examples | lib/ast/merge/rspec/README.md |

Key Classes by Namespace:
- Detector: `Region`, `Base`, `Mergeable`, `FencedCodeBlock`, `YamlFrontmatter`, `TomlFrontmatter`
- Recipe: `Config`, `Runner`, `ScriptLoader`
- Comment: `Line`, `Block`, `Empty`, `Parser`, `Style`
- Text: `SmartMerger`, `FileAnalysis`, `LineNode`, `WordNode`, `Section`
- RSpec: Shared examples and dependency tags for testing `*-merge` implementations

| Tokens to Remember | |
|---|---|
| Works with JRuby | |
| Works with Truffle Ruby | |
| Works with MRI Ruby 4 | |
| Works with MRI Ruby 3 | |
| Support & Community | |
| Source | |
| Documentation | |
| Compliance | [![License: MIT][📄license-img]][📄license-ref] |
| Style | |
| Maintainer 🎖️ | |
| ... 💖 | |
Compatible with MRI Ruby 3.2.0+ and concordant releases of JRuby and TruffleRuby.
| 🚚 Amazing test matrix was brought to you by | 🔎 appraisal2 🔎 and the color 💚 green 💚 |
|---|---|
| 👟 Check it out! | ✨ github.com/appraisal-rb/appraisal2 ✨ |
Find this repo on federated forges (Coming soon!)
| Federated DVCS Repository | Status | Issues | PRs | Wiki | CI | Discussions |
|---|---|---|---|---|---|---|
| 🧪 kettle-rb/ast-merge on GitLab | The Truth | 💚 | 💚 | 💚 | 🐭 Tiny Matrix | ➖ |
| 🧊 kettle-rb/ast-merge on CodeBerg | An Ethical Mirror (Donate) | 💚 | 💚 | ➖ | ⭕️ No Matrix | ➖ |
| 🐙 kettle-rb/ast-merge on GitHub | Another Mirror | 💚 | 💚 | 💚 | 💯 Full Matrix | 💚 |
| 🎮️ Discord Server | Let's | talk | about | this | library! |
Available as part of the Tidelift Subscription.
Need enterprise-level guarantees?
The maintainers of this and thousands of other packages are working with Tidelift to deliver commercial support and maintenance for the open source packages you use to build your applications. Save time, reduce risk, and improve code health, while paying the maintainers of the exact packages you use.
- 💡Subscribe for support guarantees covering all your FLOSS dependencies
- 💡Tidelift is part of Sonar
- 💡Tidelift pays maintainers to maintain the software you depend on!
📊@Pointy Haired Boss: An enterprise support subscription is "never gonna let you down", and supports open source maintainers
Install the gem and add to the application's Gemfile by executing:

```console
bundle add ast-merge
```

If bundler is not being used to manage dependencies, install the gem by executing:

```console
gem install ast-merge
```

For Medium or High Security Installations

This gem is cryptographically signed and has verifiable SHA-256 and SHA-512 checksums by stone_checksums. Be sure the gem you install hasn’t been tampered with by following the instructions below.

Add my public key (if you haven’t already; key expires 2045-04-29) as a trusted certificate:

```console
gem cert --add <(curl -Ls https://raw.github.com/galtzo-floss/certs/main/pboling.pem)
```

You only need to do that once. Then proceed to install with:

```console
gem install ast-merge -P HighSecurity
```

The `HighSecurity` trust profile will verify signed gems, and not allow the installation of unsigned dependencies.

If you want to up your security game full-time:

```console
bundle config set --global trust-policy MediumSecurity
```

`MediumSecurity` instead of `HighSecurity` is necessary if not all the gems you use are signed.

NOTE: Be prepared to track down certs for signed gems and add them the same way you added mine.
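
If you prefer to check a downloaded `.gem` file against a published SHA-256 checksum yourself, a comparison along these lines should work. This is a generic sketch using Ruby's standard `digest` library; the helper name `checksum_matches?` is hypothetical, and where the published checksum comes from is left to you. The temporary file only stands in for a real gem so the example runs on its own.

```ruby
require "digest/sha2"
require "tempfile"

# Returns true when the file's SHA-256 hex digest matches the expected value.
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end

# Stand-in for a downloaded gem file and its published checksum.
Tempfile.create("example.gem") do |file|
  file.write("stand-in for gem contents")
  file.flush
  published = Digest::SHA256.hexdigest("stand-in for gem contents")
  puts checksum_matches?(file.path, published) # prints "true"
end
```

The same approach works for SHA-512 by swapping in `Digest::SHA512`.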
ast-merge provides base classes and shared interfaces for building format-specific merge tools.
Each implementation (like prism-merge, psych-merge, etc.) has its own SmartMerger with format-specific configuration.
All SmartMerger implementations share these configuration options:
```ruby
merger = SomeFormat::Merge::SmartMerger.new(
  template,
  destination,
  # When conflicts occur, prefer template or destination values
  preference: :template, # or :destination (default), or a Hash for per-node-type
  # Add nodes that only exist in template (Boolean or callable filter)
  add_template_only_nodes: true, # default: false, or ->(node, entry) { ... }
  # Custom node type handling
  node_typing: {}, # optional, for per-node-type preference
)
```

Control which source wins when both files have the same structural element:

- `:template` - Template values replace destination values
- `:destination` (default) - Destination values are preserved
- Hash - Per-node-type preference (see Advanced Configuration)

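
One way to picture how a Symbol-or-Hash preference might be resolved for a single node is the sketch below. The helper name `preference_for` and the `:default` fallback key are assumptions made for illustration; the Hash format that the merge gems actually accept may differ.

```ruby
# Hypothetical helper: decide which side wins for a given node type.
# The :default key for Hash preferences is an assumption of this sketch.
def preference_for(preference, node_type)
  case preference
  when Hash
    preference.fetch(node_type) { preference.fetch(:default, :destination) }
  when :template, :destination
    preference
  else
    raise ArgumentError, "unsupported preference: #{preference.inspect}"
  end
end

per_type = {heading: :template, default: :destination}

p preference_for(:template, :paragraph) # => :template
p preference_for(per_type, :heading)    # => :template
p preference_for(per_type, :paragraph)  # => :destination
```

Falling back to `:destination` mirrors the documented default, so an incomplete Hash never overwrites destination content by accident.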
Control whether to add nodes that only exist in the template:
- `true` - Add all template-only nodes
- `false` (default) - Skip template-only nodes
- Callable - Filter which template-only nodes to add

When you need fine-grained control over which template-only nodes are added, pass a callable (Proc/Lambda) that receives (node, entry) and returns truthy to add or falsey to skip:
```ruby
# Only add nodes with gem_family signatures
merger = SomeFormat::Merge::SmartMerger.new(
  template,
  destination,
  add_template_only_nodes: ->(node, entry) {
    sig = entry[:signature]
    sig.is_a?(Array) && sig.first == :gem_family
  },
)

# Only add link definitions that match a pattern
merger = Markly::Merge::SmartMerger.new(
  template,
  destination,
  add_template_only_nodes: ->(node, entry) {
    entry[:template_node].type == :link_definition &&
      entry[:signature]&.last&.include?("gem")
  },
)
```

The `entry` hash contains:

- `:template_node` - The node being considered for addition
- `:signature` - The node's signature (Array or other value)
- `:template_index` - Index in the template statements
- `:dest_index` - Always `nil` for template-only nodes

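
The Boolean-or-callable contract described above can be exercised without any merge gem installed. In this sketch the helper `add_template_only?` and the `Node` struct are hypothetical stand-ins; only the `(node, entry)` arguments and the entry keys come from the description above.

```ruby
# Hypothetical helper showing how a Boolean-or-callable option can be evaluated.
def add_template_only?(option, node, entry)
  return !!option.call(node, entry) if option.respond_to?(:call)

  option == true
end

# Stand-in for a parser node; real nodes come from the format-specific gem.
Node = Struct.new(:type)

node = Node.new(:link_definition)
entry = {
  template_node: node,
  signature: [:gem_family, "kettle"],
  template_index: 0,
  dest_index: nil, # always nil for template-only nodes
}
only_gem_family = ->(_node, e) { e[:signature].is_a?(Array) && e[:signature].first == :gem_family }

p add_template_only?(true, node, entry)            # => true
p add_template_only?(false, node, entry)           # => false
p add_template_only?(only_gem_family, node, entry) # => true
```

Writing the filter as a standalone lambda like `only_gem_family` also makes it easy to unit test before passing it to a merger.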
```ruby
# spec/spec_helper.rb
require "ast/merge/rspec/shared_examples"

# spec/my_format/merge/freeze_node_spec.rb
RSpec.describe(MyFormat::Merge::FreezeNode) do
  it_behaves_like "Ast::Merge::FreezeNode" do
    let(:freeze_node_class) { described_class }
    let(:default_pattern_type) { :hash_comment }
    let(:build_freeze_node) do
      lambda { |start_line:, end_line:, **opts|
        # Build a freeze node for your format
      }
    end
  end
end
```

- `"Ast::Merge::FreezeNode"` - Tests for FreezeNode implementations
- `"Ast::Merge::MergeResult"` - Tests for MergeResult implementations
- `"Ast::Merge::DebugLogger"` - Tests for DebugLogger implementations
- `"Ast::Merge::FileAnalysisBase"` - Tests for FileAnalysis implementations
- `"Ast::Merge::MergerConfig"` - Tests for SmartMerger implementations

While kettle-rb tools are free software and will always be, the project would benefit immensely from some funding. Raising a monthly budget of... "dollars" would make the project more sustainable.
We welcome both individual and corporate sponsors! We also offer a wide array of funding channels to account for your preferences (although currently Open Collective is our preferred funding platform).
If you're working in a company that's making significant use of kettle-rb tools we'd appreciate it if you suggest to your company to become a kettle-rb sponsor.
You can support the development of kettle-rb tools via GitHub Sponsors, Liberapay, PayPal, Open Collective and Tidelift.
| 📍 NOTE |
|---|
| If doing a sponsorship in the form of donation is problematic for your company from an accounting standpoint, we'd recommend the use of Tidelift, where you can get a support-like subscription instead. |
Support us with a monthly donation and help us continue our activities. [Become a backer]
NOTE: kettle-readme-backers updates this list every day, automatically.
No backers yet. Be the first!
Become a sponsor and get your logo on our README on GitHub with a link to your site. [Become a sponsor]
NOTE: kettle-readme-backers updates this list every day, automatically.
No sponsors yet. Be the first!
I’m driven by a passion to foster a thriving open-source community – a space where people can tackle complex problems, no matter how small. Revitalizing libraries that have fallen into disrepair, and building new libraries focused on solving real-world challenges, are my passions. I was recently affected by layoffs, and the tech jobs market is unwelcoming. I’m reaching out here because your support would significantly aid my efforts to provide for my family, and my farm (11 🐔 chickens, 2 🐶 dogs, 3 🐰 rabbits, 8 🐈 cats).
If you work at a company that uses my work, please encourage them to support me as a corporate sponsor. My work on gems you use might show up in bundle fund.
I’m developing a new library, floss_funding, designed to empower open-source developers like myself to get paid for the work we do, in a sustainable way. Please give it a look.
Floss-Funding.dev: 👉️ No network calls. 👉️ No tracking. 👉️ No oversight. 👉️ Minimal crypto hashing. 💡 Easily disabled nags
See SECURITY.md.
If you need some ideas of where to help, you could work on adding more code coverage, or if it is already 💯 (see below) check reek, issues, or PRs, or use the gem and think about how it could be better.
We keep a changelog, so if you make changes, remember to update it.
See CONTRIBUTING.md for more detailed instructions.
See CONTRIBUTING.md.
Everyone interacting with this project's codebases, issue trackers,
chat rooms and mailing lists agrees to follow the project's code of conduct.
Made with contributors-img.
Also see GitLab Contributors: https://gitlab.com/kettle-rb/ast-merge/-/graphs/main
This library adheres to Semantic Versioning.
Violations of this scheme should be reported as bugs.
Specifically, if a minor or patch version is released that breaks backward compatibility,
a new version should be immediately released that restores compatibility.
Breaking changes to the public API will only be introduced with new major versions.
dropping support for a platform is both obviously and objectively a breaking change
—Jordan Harband (@ljharb, maintainer of SemVer) in SemVer issue 716
I understand that policy doesn't work universally ("exceptions to every rule!"), but it is the policy here. As such, in many cases it is good to specify a dependency on this library using the Pessimistic Version Constraint with two digits of precision.
For example:
```ruby
spec.add_dependency("ast-merge", "~> 5.0")
```

📌 Is "Platform Support" part of the public API? More details inside.

SemVer should, IMO, but doesn't explicitly, say that dropping support for specific Platforms is a breaking change to an API, and for that reason the bike shedding is endless.
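
To see which releases a two-digit pessimistic constraint actually admits, RubyGems' own `Gem::Requirement` and `Gem::Version` classes can be used directly. The version numbers below are arbitrary examples chosen for illustration, not actual releases of this gem.

```ruby
require "rubygems"

requirement = Gem::Requirement.new("~> 5.0")

# "~> 5.0" means >= 5.0 and < 6.0.
%w[4.9.9 5.0.0 5.3.1 6.0.0].each do |version|
  allowed = requirement.satisfied_by?(Gem::Version.new(version))
  puts "#{version}: #{allowed}"
end
# 4.9.9: false
# 5.0.0: true
# 5.3.1: true
# 6.0.0: false
```

The constraint accepts new minor and patch releases but stops at the next major version, which is where breaking changes are allowed under the policy described here.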
To get a better understanding of how SemVer is intended to work over a project's lifetime, read this article from the creator of SemVer:
See CHANGELOG.md for a list of releases.
The gem is available under the following license: AGPL-3.0-only. See LICENSE.md for details.
If none of the available licenses suit your use case, please contact us to discuss a custom commercial license.
See LICENSE.md for the official copyright notice.
Maintainers have teeth and need to pay their dentists. After getting laid off in an RIF in March, and encountering difficulty finding a new job, I began spending most of my time building open source tools. I'm hoping to be able to pay for my kids' health insurance this month, so if you value the work I am doing, I need your support. Please consider sponsoring me or the project.
To join the community or get help 👇️ Join the Discord.
To say "thanks!" ☝️ Join the Discord or 👇️ send money.
Thanks for RTFM.