fix: issue with parsing nested lists/tuples for extra inputs #318

ajcasagrande · 2025-10-01T03:20:50Z

Edit: I think there may be a flow in this with regards to handing strings without colons, and if the user passes in 2 items. need to investigate and test more.

Fixes a bug that was noticed during development of macOS support where --extra-inputs would not serialize and deserialze itself correctly.

 Value error, not enough values to unpack (expected 2, got 1) 
   [type=value_error, input_value=[['temperature', 0.1], ['...9], ['max_tokens', 150]], input_type=list]

Summary by CodeRabbit

New Features
- Configuration parsing now accepts lists/tuples of key-value pairs and normalizes them into a consistent tuple-list format, applying value coercion. Existing behavior for other input types is unchanged.
Tests
- Added tests verifying parsing of list/tuple key-value pairs and idempotent re-parsing.
- Expanded UserConfig serialization tests to cover full JSON round-trips, defaults, custom values, and file-based serialization scenarios.

coderabbitai · 2025-10-01T03:21:00Z

Walkthrough

Enhanced parse_str_or_dict_as_tuple_list to directly convert 2-element sequences inside list/tuple/set inputs into (key, coerced_value) tuples while preserving existing recursive handling for other items. Added tests for list-of-pairs parsing and a new UserConfig JSON round-trip serialization test class.

Changes

Cohort / File(s)	Summary
Config validators: tuple-list parsing `aiperf/common/config/config_validators.py`	Updated parse_str_or_dict_as_tuple_list to detect 2-element sequences within list/tuple/set inputs and convert them to tuples with value coercion; retains prior recursive aggregation for non-pair items and unchanged behavior for non-list inputs.
Config validator tests `tests/config/test_config_validators.py`	Added tests verifying parsing of lists of key-value pairs (list and tuple forms, mixed types), canonical tuple output, and idempotent re-parsing.
User config serialization and defaults tests `tests/config/test_user_config.py`	Introduced TestUserConfig class and test_user_config_serialization_to_json_string to build a populated UserConfig (including ConversationConfig, TurnConfig, TurnDelayConfig, CustomDatasetType) and validate JSON round-trip; preserved existing file-serialization tests and added related imports.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant Validator as parse_str_or_dict_as_tuple_list
  Caller->>Validator: parse(input: list/tuple/set)
  loop for each item in input
    Validator->>Validator: inspect item
    alt item is 2-element sequence
      Note right of Validator #E6F4EA: New direct-pair handling
      Validator->>Validator: coerce value -> tuple(key, coerced_value)
      Validator-->>Caller: append tuple
    else not a 2-element sequence
      Note right of Validator #FFF4E6: Existing recursive flow
      Validator->>Validator: recurse/aggregate
      Validator-->>Caller: append aggregated results
    end
  end
  Validator-->>Caller: return list of tuples

sequenceDiagram
  autonumber
  participant Test as TestUserConfig
  participant Model as UserConfig
  participant JSON as Serializer
  Test->>Model: construct populated instance
  Model-->>Test: instance
  Test->>JSON: model_dump_json(exclude_unset, exclude_defaults)
  JSON-->>Test: json_string
  Test->>Model: model_validate_json(json_string)
  Model-->>Test: reconstructed instance
  Test->>Test: assert fields / equality

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A nibble of tuples, a hop through the keys,
I juggle two-elements with carrot-light ease.
Configs go round-trip, JSON sings true,
Defaults in the burrow, custom paths too.
Thump goes my paw: green checks, nicely signed. 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title succinctly and accurately describes the primary change—fixing the parsing logic for nested lists and tuples passed as extra inputs—using clear, conventional commit phrasing without unnecessary detail.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch ajc/fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d056b9e and 2f83784.

📒 Files selected for processing (3)

aiperf/common/config/config_validators.py (2 hunks)
tests/config/test_config_validators.py (1 hunks)
tests/config/test_user_config.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

tests/config/test_config_validators.py (1)

aiperf/common/config/config_validators.py (1)

parse_str_or_dict_as_tuple_list (113-167)

tests/config/test_user_config.py (6)

aiperf/common/config/conversation_config.py (3)

ConversationConfig (112-140)

TurnConfig (72-109)

TurnDelayConfig (18-69)

aiperf/common/config/endpoint_config.py (1)

EndpointConfig (23-149)

aiperf/common/config/loadgen_config.py (1)

LoadGeneratorConfig (15-150)

aiperf/common/config/output_config.py (1)

OutputConfig (15-34)

aiperf/common/config/user_config.py (3)

UserConfig (34-280)

timing_mode (278-280)

_compute_artifact_directory (225-232)

aiperf/common/enums/dataset_enums.py (1)

CustomDatasetType (17-21)

🪛 GitHub Actions: Run Unit Tests

tests/config/test_user_config.py

[error] 1-1: Make coverage failed. Command 'make coverage' exited with code 1 due to test failures in pytest.

[error] 1-1: Pytest run failed due to TypeError: test functions in TestUserConfig class expected 0 positional arguments but 1 was provided (likely fixture misconfiguration).

[error] 1-1: TestUserConfig.test_user_config_serialization_to_file: TypeError raised during test execution indicating fixture/argument mismatch.

[error] 1-1: TestUserConfig.test_user_config_defaults: TypeError raised during test execution indicating fixture/argument mismatch.

[error] 1-1: TestUserConfig.test_user_config_custom_values: TypeError raised during test execution indicating fixture/argument mismatch.

[error] 1-1: TestUserConfig.test_user_config_exclude_unset_fields: TypeError raised during test execution indicating fixture/argument mismatch.

[error] 1-1: TestUserConfig.test_compute_artifact_directory[model_names0-chat-request_rate-True-/tmp/artifacts/hf_model-openai-chat-concurrency5-request_rate10.0]: AttributeError: 'TestUserConfig' object has no attribute 'setattr' when patching pathlib.Path.is_file.

[error] 1-1: TestUserConfig.test_compute_artifact_directory[model_names1-completions-request_rate-True-/tmp/artifacts/model1_multi-openai-completions-concurrency5-request_rate10.0]: AttributeError: 'TestUserConfig' object has no attribute 'setattr' when patching pathlib.Path.is_file.

[error] 1-1: TestUserConfig.test_compute_artifact_directory[model_names2-embeddings-fixed_schedule-False-/tmp/artifacts/singlemodel-openai-embeddings-fixed_schedule]: AttributeError: 'TestUserConfig' object has no attribute 'setattr' when patching pathlib.Path.is_file.

🪛 Ruff (0.13.1)

tests/config/test_user_config.py

229-229: Probable insecure usage of temporary file or directory: "/tmp/artifacts/hf_model-openai-chat-concurrency5-request_rate10.0"

(S108)

236-236: Probable insecure usage of temporary file or directory: "/tmp/artifacts/model1_multi-openai-completions-concurrency5-request_rate10.0"

(S108)

243-243: Probable insecure usage of temporary file or directory: "/tmp/artifacts/singlemodel-openai-embeddings-fixed_schedule"

(S108)

257-257: Probable insecure usage of temporary file or directory: "/tmp/artifacts"

(S108)

260-260: Unused lambda argument: self

(ARG005)

263-263: Probable insecure usage of temporary file or directory: "/tmp/dummy_input.txt"

(S108)

274-274: Unused lambda argument: self

(ARG005)

🔇 Additional comments (5)

aiperf/common/config/config_validators.py (2)

123-123: LGTM! Clear documentation of the new behavior.

The docstring update accurately describes the new fast path for 2-element sequences.

136-147: Fix correctly handles 2-element sequences and maintains backward compatibility.

The implementation properly detects 2-element sequences and converts them directly to tuples, which resolves the serialization/deserialization bug. The unpacking at line 141 is safe due to the length check, and coerce_value maintains idempotent behavior for already-typed values.

tests/config/test_config_validators.py (1)

250-273: Excellent test coverage for the bug fix.

The test comprehensively validates the fix for nested list/tuple parsing, including the exact scenario from the PR description. The idempotent parsing check (lines 271-273) is particularly valuable, ensuring that serialization round-trips work correctly.
tests/config/test_user_config.py (2)
8-22: LGTM!

The new imports for ConversationConfig, TurnConfig, TurnDelayConfig, and CustomDatasetType are correctly added to support the expanded test coverage.

32-96: Fix missing self parameter and invalid field names.

This test method has multiple critical issues causing pipeline failures:

Missing self parameter: Instance methods in a test class must include self as the first parameter.

Invalid UserConfig fields: The fields conversation_config, verbose, and template_filename do not exist on UserConfig based on the class definition.

Invalid EndpointConfig field: Line 52 uses timeout but the field is named timeout_seconds.

Apply this diff to fix the method signature and field names:
-    def test_user_config_serialization_to_json_string(self):
+    def test_user_config_serialization_to_json_string(self):
         """Test the serialization and deserialization of a UserConfig object to and from a JSON string."""
         config = UserConfig(
             endpoint=EndpointConfig(
                 model_names=["model1", "model2"],
                 type=EndpointType.CHAT,
                 custom_endpoint="custom_endpoint",
                 streaming=True,
                 url="http://custom-url",
                 extra=[
                     ("key1", "value1"),
                     ("key2", "value2"),
                     ("key3", "value3"),
                 ],
                 headers=[
                     ("Authorization", "Bearer token"),
                     ("Content-Type", "application/json"),
                 ],
                 api_key="test_api_key",
                 ssl_options={"verify": False},
-                timeout=10,
+                timeout_seconds=10,
             ),
-            conversation_config=ConversationConfig(
-                num=10,
-                turn=TurnConfig(
-                    mean=10,
-                    stddev=10,
-                    delay=TurnDelayConfig(
-                        mean=10,
-                        stddev=10,
-                    ),
-                ),
-            ),
             input=InputConfig(
                 custom_dataset_type=CustomDatasetType.SINGLE_TURN,
             ),
             output=OutputConfig(
                 artifact_directory="test_artifacts",
             ),
             tokenizer=TokenizerConfig(
                 model_name="test_tokenizer",
             ),
             loadgen=LoadGeneratorConfig(
                 concurrency=10,
                 request_rate=10,
             ),
-            verbose=True,
-            template_filename="test_template.yaml",
             cli_command="test_cli_command",
         )
Note: Remove conversation_config, verbose, and template_filename as they are not valid UserConfig fields. If these are meant to test future fields, add them to the UserConfig class definition first.

Likely an incorrect or invalid review comment.

tests/config/test_user_config.py

codecov · 2025-10-01T03:42:03Z

Codecov Report

❌ Patch coverage is 83.33333% with 1 line in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
aiperf/common/config/config_validators.py	83.33%	0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (5)

tests/config/test_user_config.py (5)
142-142: Fix missing self parameter.

This test method is missing the self parameter, causing a TypeError during test execution.

Apply this diff:
-def test_user_config_defaults():
+    def test_user_config_defaults(self):
If this test is intended to be part of the TestUserConfig class, ensure it's properly indented.

175-175: Fix missing self parameter.

This test method is missing the self parameter, causing a TypeError during test execution.

Apply this diff:
-def test_user_config_custom_values():
+    def test_user_config_custom_values(self):
If this test is intended to be part of the TestUserConfig class, ensure it's properly indented.

207-207: Fix missing self parameter.

This test method is missing the self parameter, causing a TypeError during test execution.

Apply this diff:
-def test_user_config_exclude_unset_fields():
+    def test_user_config_exclude_unset_fields(self):
If this test is intended to be part of the TestUserConfig class, ensure it's properly indented.

252-253: Fix missing self parameter and unused lambda arguments.

This test has multiple critical issues:

Missing self parameter: The method signature is incorrect. For a pytest test method in a class, self must be the first parameter, followed by any fixtures.

Unused lambda parameters: Lines 265 and 278 have lambda expressions with unused self parameters.

Apply this diff:
 def test_compute_artifact_directory(
-    monkeypatch, model_names, endpoint_type, timing_mode, streaming, expected_dir
+    self, monkeypatch, model_names, endpoint_type, timing_mode, streaming, expected_dir
 ):
Also fix the lambda expressions:
-    monkeypatch.setattr("pathlib.Path.is_file", lambda self: True)
+    monkeypatch.setattr("pathlib.Path.is_file", lambda _: True)
-    monkeypatch.setattr(UserConfig, "_timing_mode", property(lambda self: timing_mode))
+    monkeypatch.setattr(UserConfig, "_timing_mode", property(lambda _: timing_mode))
If this test is intended to be part of the TestUserConfig class, ensure it's properly indented.

99-99: Fix missing self parameter.

This test method is missing the self parameter, causing a TypeError during test execution. Instance methods in a test class must include self as the first parameter.

Apply this diff:
-def test_user_config_serialization_to_file():
+    def test_user_config_serialization_to_file(self):
Additionally, this function appears to be at module level rather than inside the TestUserConfig class. If you intended to group it in the test class as indicated by the comment at line 24, ensure it's properly indented.

🧹 Nitpick comments (1)

tests/config/test_user_config.py (1)

24-26: Test organization inconsistency.

The comment at line 24 states "Everything below here was unchanged, just tabbed over to be grouped in the Test class," but the existing tests (lines 99-281) remain at module level rather than being indented inside the TestUserConfig class. Only the new test method test_user_config_serialization_to_json_string is properly grouped in the class.

For consistency and better organization, consider either:

Moving all tests into the TestUserConfig class with proper indentation and self parameters, or

Updating the comment to accurately reflect that only the new test is in the class

Also applies to: 99-281

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f83784 and a693358.

📒 Files selected for processing (1)

tests/config/test_user_config.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/config/test_user_config.py (2)

aiperf/common/config/conversation_config.py (3)

ConversationConfig (112-140)

TurnConfig (72-109)

TurnDelayConfig (18-69)

aiperf/common/enums/dataset_enums.py (1)

CustomDatasetType (17-21)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build (ubuntu-latest, 3.10)

🔇 Additional comments (3)

tests/config/test_user_config.py (3)

9-9: LGTM! New imports are properly used.

The new imports (ConversationConfig, TurnConfig, TurnDelayConfig, CustomDatasetType) are all utilized in the new test method and are properly organized.

Also applies to: 16-17, 21-21

24-30: LGTM! Well-structured test class.

The new test class follows pytest conventions and provides clear documentation.

32-96: LGTM! Comprehensive serialization test that validates the PR fix.

This test directly exercises the fix for nested list/tuple parsing by:

Testing extra field with list of tuples: [("key1", "value1"), ...]

Testing headers field similarly

Validating round-trip JSON serialization/deserialization with both exclude_unset and exclude_defaults

The test ensures that the enhanced parse_str_or_dict_as_tuple_list validator correctly handles these structures during serialization and deserialization.

the-david-oy

Excellent work.

ajcasagrande added 2 commits September 30, 2025 20:06

fix: issue with parsing nested lists/tuples

6232c7d

add more user config tests

2f83784

ajcasagrande requested a review from the-david-oy October 1, 2025 03:20

ajcasagrande self-assigned this Oct 1, 2025

github-actions bot added the fix label Oct 1, 2025

coderabbitai bot reviewed Oct 1, 2025

View reviewed changes

revert tabbing over tests

a693358

coderabbitai bot reviewed Oct 1, 2025

View reviewed changes

ajcasagrande marked this pull request as draft October 1, 2025 19:01

the-david-oy approved these changes Oct 1, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: issue with parsing nested lists/tuples for extra inputs #318

fix: issue with parsing nested lists/tuples for extra inputs #318

Uh oh!

ajcasagrande commented Oct 1, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Oct 1, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Oct 1, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

the-david-oy left a comment

Uh oh!

Uh oh!

fix: issue with parsing nested lists/tuples for extra inputs #318

Are you sure you want to change the base?

fix: issue with parsing nested lists/tuples for extra inputs #318

Uh oh!

Conversation

ajcasagrande commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Oct 1, 2025

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

the-david-oy left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ajcasagrande commented Oct 1, 2025 •

edited

Loading

coderabbitai bot commented Oct 1, 2025 •

edited

Loading