docs: add reassignZinggId & model diff guides by padam-prakash · Pull Request #1283 · zinggAI/zingg

padam-prakash · 2026-03-05T04:03:05Z

Expand and improve documentation for explainOutput, persistent ZINGG ID reassignments, and model diff. Changes include:

docs/explainoutput/README.md: rename incrementalRun->runIncremental and add an "Important Notes" section clarifying that explainOutput focuses on probabilistic matches (deterministic matches are not included and may produce empty results for some clusters).
docs/persistentZinggId.md: fix title typo and replace placeholder with a full guide for the reassignZinggId phase (motivation, how it works, CLI and Python usage examples, configuration wrappers, example JSON configs, and output/usage notes).
docs/stepbystep/createtrainingdata/modeldiff.md: replace placeholder with a comprehensive model diff guide (purpose, how it works, CLI and Python examples, configuration samples, output format, and use cases).

Copilot

Pull request overview

This PR expands Enterprise documentation for understanding model evolution and operational continuity by adding full guides for model diff, persistent ZINGG ID reassignment, and clarifying explainOutput behavior.

Changes:

Added a comprehensive guide for the diff phase, including CLI/Python usage and config examples.
Replaced the placeholder persistent ID page with a full reassignZinggId guide (motivation, workflow, usage examples, configs).
Updated explainOutput docs to use runIncremental and added notes clarifying its current focus on probabilistic matches.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
docs/stepbystep/createtrainingdata/modeldiff.md	Replaces placeholder with an end-to-end “diff” guide (usage, configs, output).
docs/persistentZinggId.md	Fixes page title and adds a detailed “reassignZinggId” guide (usage, configs, behavior).
docs/explainoutput/README.md	Renames incremental phase reference and documents explainOutput scope/limitations.

docs/stepbystep/createtrainingdata/modeldiff.md

Copilot · 2026-03-05T04:06:55Z

+  "deterministicMatching": [
+    {
+      "matchCondition": [
+        {"fieldName": "fname"},
+        {"fieldName": "stNo"},
+        {"fieldName": "add1"}
+      ]
+    },
+    {
+      "matchCondition": [
+        {"fieldName": "ssn"}
+      ]
+    }
+  ],


The narrative/Python example says deterministic matching rules are added in the new configuration, but in the JSON samples the deterministicMatching block appears under the Original Configuration (configBaseline.json). Please align the examples (either move deterministicMatching to the updated config sample or adjust the explanation) to avoid misleading readers about what changed between the models.

Suggested change

  "deterministicMatching": [
    {
      "matchCondition": [
        {"fieldName": "fname"},
        {"fieldName": "stNo"},
        {"fieldName": "add1"}
      ]
    },
    {
      "matchCondition": [
        {"fieldName": "ssn"}
      ]
    }
  ],
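
The mismatch raised in this comment can be checked mechanically. The following is a minimal Python sketch, not part of the PR: it compares two config objects and lists the deterministic matching rules that exist only in the updated one. The `deterministicMatching`, `matchCondition`, and `fieldName` keys come from the snippet above; the `modelId` values, the trimmed-down config bodies, and the helper names are hypothetical placeholders.

```python
import json

# Hypothetical, trimmed-down stand-ins for the baseline and updated configs.
# Only the deterministicMatching block mirrors the snippet under review.
BASELINE_JSON = """
{
  "modelId": "100"
}
"""

UPDATED_JSON = """
{
  "modelId": "101",
  "deterministicMatching": [
    {"matchCondition": [{"fieldName": "fname"}, {"fieldName": "stNo"}, {"fieldName": "add1"}]},
    {"matchCondition": [{"fieldName": "ssn"}]}
  ]
}
"""


def deterministic_rules(config: dict) -> list:
    """Return each deterministic rule as a list of its field names."""
    return [
        [cond["fieldName"] for cond in rule.get("matchCondition", [])]
        for rule in config.get("deterministicMatching", [])
    ]


def added_rules(baseline: dict, updated: dict) -> list:
    """Return the rules present in the updated config but not in the baseline."""
    before = deterministic_rules(baseline)
    return [rule for rule in deterministic_rules(updated) if rule not in before]


baseline = json.loads(BASELINE_JSON)
updated = json.loads(UPDATED_JSON)

# Rules that the updated config introduces relative to the baseline.
print(added_rules(baseline, updated))
```

If the documentation samples match the narrative, the baseline should yield no deterministic rules and the updated config should yield both of them.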


Update documentation to clarify the required wrapper configuration for CLI operations and update example files/names. Key changes:

Use the sparkIncremental reassign example in persistentZinggId.md.
Explain that --conf must point to a wrapper config containing transformedOutputPath and that its name is an arbitrary internal identifier.
Change transformedOutputPath names to reassignedOutput and diffOutput in examples.
Add explanatory notes about wrapper usage.
Add deterministicMatching to the updated model example and remove the duplicate block from the baseline example.
Clarify wording for ZINGG_ID_UPDATED/ORIGINAL in the diff output description.
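
The wrapper requirement described in this commit can be illustrated with a small pre-flight check. This is only a sketch under stated assumptions: the commit message confirms that the wrapper config contains `transformedOutputPath` and that this entry carries a name such as `reassignedOutput` or `diffOutput`, but the object shape used below and the `check_wrapper_config` helper are hypothetical and not taken from the Zingg documentation.

```python
def check_wrapper_config(wrapper: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed.

    Assumption: transformedOutputPath is an object with a "name" entry.
    The real wrapper format may differ.
    """
    problems = []
    target = wrapper.get("transformedOutputPath")
    if not target:
        problems.append("missing transformedOutputPath")
    elif isinstance(target, dict) and not target.get("name"):
        problems.append("transformedOutputPath has no name")
    return problems


# Hypothetical wrapper for the diff phase; only the key and the name value
# appear in the commit message, the rest of the shape is assumed.
diff_wrapper = {"transformedOutputPath": {"name": "diffOutput"}}

print(check_wrapper_config(diff_wrapper))
print(check_wrapper_config({}))
```

Running a check like this before invoking the CLI would catch the case where --conf points at a plain model config instead of the wrapper.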

padam-prakash requested a review from Copilot March 5, 2026 04:03

Copilot started reviewing on behalf of padam-prakash March 5, 2026 04:03

Copilot AI reviewed Mar 5, 2026

padam-prakash requested a review from sonalgoyal March 5, 2026 09:13

padam-prakash wants to merge 2 commits into zinggAI:main from padam-prakash:Issue#1272DocumentationForDiffandReassign
