Problems 1925 3100 3133 3164 3197 3228 3260 3291 3320 3351 3380 3413 3444 3471 #108

romankurnovskii · 2025-12-08T09:52:58Z

Summary by Sourcery

Adjust JSON normalization formatting to support multi-value-per-line numeric arrays with a configurable print width and propagate this setting through JSON value formatting.

Enhancements:

Change JSON number array formatting to pack multiple values per line based on a target print width instead of one per line.
Add a configurable print_width parameter to the JSON formatter and propagate it through nested formatting calls.

Summary by CodeRabbit

Chores
- Updated internal JSON formatting utilities for improved layout handling.

…er line
- Updated format_json_value to format number arrays with multiple numbers per line
- Numbers wrap at printWidth (150) to match Prettier style
- Matches the formatting style in book-sets.json

sourcery-ai · 2025-12-08T09:53:03Z

Reviewer's Guide

Updates JSON normalization to support configurable line-wrapping for numeric arrays and reapplies formatting to book-sets.json to match the new style.

Flow diagram for updated JSON normalization and numeric array wrapping

flowchart TD
  A["sort_json_by_numeric_keys called with input_file and output_file"] --> B["Read raw JSON from input_file"]
  B --> C["Parse JSON into data structure"]
  C --> D["Sort top level keys numerically into sorted_data"]
  D --> E["Initialize lines list with '{'"]
  E --> F["For each key,value in sorted_data"]

  subgraph G["Formatting each value"]
    direction TB
    F --> G1["Call format_json_value(value, indent_level 1, print_width 150)"]
    G1 --> H{Value type?}
    H --> I["None"]
    H --> J["bool"]
    H --> K["int or float"]
    H --> L["str"]
    H --> M["dict"]
    H --> N["list"]

    M --> M1["For each item in sorted dict: format_json_value(v, indent_level+1, print_width)"]
    M1 --> M2["Join formatted items with commas and newlines, wrap in '{ }'"]

    N --> N1{"List empty?"}
    N1 -->|Yes| N2["Return '[]'"]
    N1 -->|No| N3{"First element is number?"}

    N3 -->|Yes| O["Numeric array formatting with wrapping"]

    subgraph P["Numeric array wrapping algorithm"]
      direction TB
      O --> P1["Compute available_width = print_width - len(next_indent) - 2"]
      P1 --> P2["Initialize lines, current_line, current_length = 0"]
      P2 --> P3["For each item in list"]
      P3 --> P4["Compute item_str and item_length (include comma+space if needed)"]
      P4 --> P5{"current_length + item_length > available_width and current_line not empty?"}
      P5 -->|Yes| P6["Append current_line joined by ', ' to lines and start new line with item"]
      P5 -->|No| P7["Append item to current_line and update current_length"]
      P6 --> P8["After loop, append remaining current_line to lines"]
      P7 --> P8
      P8 --> P9["Indent each line with next_indent and wrap with '[ ]'"]
    end

    N3 -->|No| Q["Non-numeric list formatting"]
    Q --> Q1["formatted_items = format_json_value(item, indent_level+1, print_width) for each item"]
    Q1 --> Q2["Compute total_length of formatted_items"]
    Q2 --> Q3{"total_length < 100 and len(list) <= 5?"}
    Q3 -->|Yes| Q4["Render on single line"]
    Q3 -->|No| Q5["Render one item per line with indentation"]
  end

  G --> R["Append '  key: formatted_value' to lines"]
  R --> S{"More keys?"}
  S -->|Yes| F
  S -->|No| T["Append '}' to lines"]
  T --> U["Join lines with newlines"]
  U --> V["Write normalized JSON to output_file"]
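
The outer flow in the diagram (parse, sort top-level keys numerically, format each value, join the lines) could look roughly like the sketch below. `normalize_top_level` and its `format_value` parameter are hypothetical names, not taken from `scripts/normalize_json.py`; `json.dumps` stands in for `format_json_value`, the commas between entries are an assumption the diagram does not spell out, and the sketch assumes every top-level key is an integer string.

```python
import json


def normalize_top_level(raw_json, format_value=json.dumps):
    """Sort top-level keys numerically and render one entry per line.

    Hypothetical helper: format_value is a placeholder for the real
    format_json_value, which is not reproduced here.
    """
    data = json.loads(raw_json)
    # Numeric sort, so "10" lands after "9" (a plain string sort would not).
    sorted_items = sorted(data.items(), key=lambda pair: int(pair[0]))
    if not sorted_items:
        return "{}\n"
    entries = [
        f"  {json.dumps(key)}: {format_value(value)}"
        for key, value in sorted_items
    ]
    # Assumption: entries are separated by ",\n" so the result stays valid JSON.
    return "{\n" + ",\n".join(entries) + "\n}\n"


if __name__ == "__main__":
    print(normalize_top_level('{"10": [1], "9": [2], "2": [3]}'))
```

Returning the text instead of writing `output_file` keeps the sketch easy to test; the real function reads and writes files.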

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Enhance the JSON formatter to pack multiple numeric array elements per line, wrapping at a configurable print width. | Add a print_width parameter to format_json_value and propagate it through all recursive calls. Change numeric list formatting from one-number-per-line to multi-number-per-line with width-based line breaking. Introduce width-based packing logic that builds comma-separated lines without exceeding the configured print width. Update sort_json_by_numeric_keys to call format_json_value with an explicit print width of 150 characters. | `scripts/normalize_json.py` |
| Regenerate normalized JSON data using the updated formatting rules for numeric arrays. | Reformat JSON structure to use the new multi-number-per-line style for numeric arrays, maintaining key ordering and indentation. Ensure the data file remains semantically equivalent while matching the updated normalization script output. | `data/book-sets.json` |
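
One way to check the "semantically equivalent" claim for a reformatted data file is to parse both versions and compare the resulting objects. This is a minimal sketch; `is_semantically_equal` is a hypothetical helper, not part of the PR.

```python
import json


def is_semantically_equal(original_text, reformatted_text):
    """Return True when both JSON documents parse to equal Python objects.

    Dict comparison ignores key order and whitespace, which is the point here.
    Caveat: Python treats 1 and 1.0 as equal, so an int/float change in the
    file would not be detected by this check.
    """
    return json.loads(original_text) == json.loads(reformatted_text)


if __name__ == "__main__":
    one_per_line = '{"1": [\n  1,\n  2,\n  3\n]}'
    packed = '{"1": [\n  1, 2, 3\n]}'
    print(is_semantically_equal(one_per_line, packed))
```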

coderabbitai · 2025-12-08T09:53:07Z

Caution

Review failed

The pull request is closed.

Walkthrough

A new optional print_width parameter (default 150) was added to the format_json_value() function to control output width during JSON formatting. The parameter is threaded through recursive calls and applied to numeric list formatting for multi-line, width-constrained layouts.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| JSON formatting enhancement `scripts/normalize_json.py` | Added `print_width` parameter to `format_json_value()` function signature; threaded parameter through recursive calls for dictionaries and lists; updated numeric list formatting logic to apply width constraints; updated `sort_json_by_numeric_keys()` to invoke `format_json_value()` with `print_width=150` |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Single-file change with consistent parameter-threading pattern across recursive calls
Review focus: validate that print_width is correctly propagated through all call paths and that list formatting logic properly applies width constraints without unintended side effects

Poem

🐰 A width-aware whisker-twitch of delight!
The JSON now flows with precision just right,
One-fifty characters, perfectly spaced,
Lists are constrained—no more traces misplaced!
Small change, big format—the normalizer's might! ✨

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b1f41ed and 849b266.

📒 Files selected for processing (1)

scripts/normalize_json.py (3 hunks)

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

- In the numeric list branch you switched from `json.dumps` to `str(item)`, which may change how floats, large ints, or non-ASCII values are serialized compared to the rest of the JSON; consider using `json.dumps(item, ensure_ascii=False)` consistently to avoid subtle formatting differences.
- The `print_width` is currently hard-coded as 150 in `sort_json_by_numeric_keys`; if different widths might be useful, consider threading this through as a parameter rather than a constant.
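
To see where `str` and `json.dumps` actually diverge for values a numeric branch might receive, a quick comparison can help. The sample values are illustrative assumptions, not data from `book-sets.json`; whether booleans or non-finite floats can reach the numeric branch depends on the type check in `normalize_json.py`, which is not shown here.

```python
import json

# Values a "numeric" branch might plausibly see. bool is included because
# isinstance(True, int) is True in Python.
samples = [3, 2.5, 1e22, 10**30, True, float("inf")]

# Print the two renderings side by side.
for value in samples:
    print(f"{str(value)!r:>36}  {json.dumps(value)!r:>36}")

# Keep only the values whose two renderings disagree.
differing = [value for value in samples if str(value) != json.dumps(value)]
print(differing)
```

In CPython, ordinary ints and finite floats render identically either way, since both paths use the same `repr`. The divergence shows up for booleans (`True` vs `true`) and non-finite floats (`inf` vs `Infinity`), and `str` would emit invalid JSON in those cases.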

sourcery-ai · 2025-12-08T09:54:21Z

scripts/normalize_json.py

+            for i, item in enumerate(value):
+                item_str = str(item)
+                # Add comma and space length (2) if not first item on line
+                item_length = len(item_str) + (2 if current_line else 0)
+
+                if current_length + item_length > available_width and current_line:
+                    # Start a new line
+                    lines.append(", ".join(str(x) for x in current_line))
+                    current_line = [item]
+                    current_length = len(item_str)


**suggestion (bug_risk):** Consider using `json.dumps` instead of `str` for numeric items to keep JSON formatting consistent.

In this numeric-list branch you're using `str(item)`/`str(x)` for both measuring and emitting, whereas the previous code used `json.dumps(item, ensure_ascii=False)`. That change can alter JSON formatting and make behavior diverge from other branches. Consider computing `item_str = json.dumps(item, ensure_ascii=False)` once and reusing it for both width calculation and output.

Suggested implementation:

            for i, item in enumerate(value):
                item_str = json.dumps(item, ensure_ascii=False)
                # Add comma and space length (2) if not first item on line
                item_length = len(item_str) + (2 if current_line else 0)

                if current_length + item_length > available_width and current_line:
                    # Start a new line
                    lines.append(", ".join(current_line))
                    current_line = [item_str]
                    current_length = len(item_str)
                else:

1. Ensure `import json` is present at the top of `scripts/normalize_json.py` if it's not already imported.
2. In the `else:` branch (not shown), make sure you append `item_str` to `current_line` (not the raw `item`) and keep `current_length` updated with `len(item_str)` plus the comma/space when applicable.
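
Putting the suggestion together with the `else:` branch it describes, a self-contained version of the packing loop might look like this. `pack_numbers` is a hypothetical standalone helper; in the PR this logic lives inside `format_json_value`, and the way lines are joined and bracketed here is an assumption based on the reviewer's guide rather than the actual source.

```python
import json


def pack_numbers(values, indent_level=0, print_width=150):
    """Pack numbers onto as few lines as fit within print_width (sketch)."""
    if not values:
        return "[]"
    indent = "  " * indent_level
    next_indent = "  " * (indent_level + 1)
    # Same budget as the reviewer's guide: width minus indentation minus 2.
    available_width = print_width - len(next_indent) - 2

    lines = []
    current_line = []
    current_length = 0
    for item in values:
        # Serialize once; reuse the string for both measuring and output.
        item_str = json.dumps(item, ensure_ascii=False)
        # ", " costs 2 characters unless this is the first item on the line.
        item_length = len(item_str) + (2 if current_line else 0)
        if current_line and current_length + item_length > available_width:
            lines.append(", ".join(current_line))
            current_line = [item_str]
            current_length = len(item_str)
        else:
            current_line.append(item_str)
            current_length += item_length
    if current_line:
        lines.append(", ".join(current_line))

    body = ",\n".join(next_indent + line for line in lines)
    return "[\n" + body + "\n" + indent + "]"


if __name__ == "__main__":
    print(pack_numbers(list(range(1, 11)), print_width=12))
```

Storing only strings in `current_line` removes the mixed `item`/`item_str` handling seen in the diff, so the final `", ".join(...)` needs no extra `str()` conversion.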

sourcery-ai · 2025-12-08T09:54:21Z

scripts/normalize_json.py

+def format_json_value(value, indent_level=0, print_width=150):
    """Format a JSON value with custom formatting following Prettier style."""
    indent = "  " * indent_level
    next_indent = "  " * (indent_level + 1)


**suggestion:** The `print_width` parameter isn’t applied consistently, e.g. the non-numeric list single-line threshold is still hardcoded.

`print_width` is used for numeric lists, but other arrays still rely on the hardcoded `total_length < 100` check. Consider basing that threshold on `print_width` (optionally with a margin) so the width configuration is applied consistently across all list types.

Suggested implementation:

def format_json_value(value, indent_level=0, print_width=150):

    elif isinstance(value, list):
        if not value:

        total_length = len(indent) + 2 + sum(len(item) + 2 for item in items) - 2
        max_width_for_list = max(print_width - len(indent), 0)
        if total_length <= max_width_for_list:

I assumed the non-numeric list single-line decision currently uses `if total_length < 100:` after building `items`. If the condition appears in multiple branches (e.g. for different list element types), apply the same replacement to each occurrence so all non-numeric array formatting respects `print_width`. Also ensure that all recursive calls to `format_json_value` in list handling pass the `print_width` argument through (as is already done in the dict branch shown). If the `total_length` calculation differs, keep it as-is and only adjust the width check to use `print_width` via `max_width_for_list`.
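
A width check driven by `print_width` for non-numeric lists could be sketched as follows. `render_list` and `max_inline_items` are hypothetical names; the five-item cap mirrors the `len(list) <= 5` condition in the reviewer's guide. Unlike the suggested snippet, this sketch counts the indentation once, and it ignores any key prefix on the same line, which is a simplification.

```python
def render_list(formatted_items, indent_level=0, print_width=150, max_inline_items=5):
    """Render already-formatted items inline if they fit, else one per line."""
    indent = "  " * indent_level
    next_indent = "  " * (indent_level + 1)
    single_line = "[" + ", ".join(formatted_items) + "]"
    fits = (
        len(formatted_items) <= max_inline_items
        # A nested multi-line item can never sit on a single line.
        and not any("\n" in item for item in formatted_items)
        and len(indent) + len(single_line) <= print_width
    )
    if fits:
        return single_line
    body = ",\n".join(next_indent + item for item in formatted_items)
    return "[\n" + body + "\n" + indent + "]"


if __name__ == "__main__":
    print(render_list(['"a"', '"b"'], print_width=150))
    print(render_list(['"a"', '"b"'], print_width=5))
```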

vikahaze added 2 commits December 8, 2025 11:52

Fix normalize_json.py to format number arrays with multiple numbers p…

5481418

…er line
- Updated format_json_value to format number arrays with multiple numbers per line
- Numbers wrap at printWidth (150) to match Prettier style
- Matches the formatting style in book-sets.json

sets

849b266

romankurnovskii merged commit 66c5a52 into main Dec 8, 2025
1 of 4 checks passed

sourcery-ai bot reviewed Dec 8, 2025


romankurnovskii deleted the problems-1925-3100-3133-3164-3197-3228-3260-3291-3320-3351-3380-3413-3444-3471 branch December 8, 2025 09:57
