
Conversation

@Dimios45

This PR addresses the issue reported in Keras issue #21700: nested output structures and corresponding nested losses are supported, but nested metrics are not, which caused failures when compiling models with nested metric configurations.

Changes

  • Implemented recursive handling of nested metric configurations within CompileMetrics.
  • Ensured compatibility with both nested and flat metric structures.
  • Updated internal methods to process nested metrics appropriately.

Testing

  • Added tests to verify correct handling of nested metrics during model compilation and training.
  • Tests cover nested dictionaries and lists to ensure the recursive metric construction behaves as expected and raises appropriate errors for invalid configurations.

Example Usage:

model.compile(
    optimizer='adam',
    loss={'a': 'mse', 'b': {'c': 'mse', 'd': 'mse'}},
    metrics={'a': ['mae'], 'b': {'c': 'mse', 'd': 'mae'}},
)
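
A rough sketch of how such a compiled model might then be trained (illustrative only; it assumes model has nested outputs {'a': ..., 'b': {'c': ..., 'd': ...}} with 32-dimensional inputs and 10-dimensional outputs, as in the test added by this PR):

import numpy as np

# Illustrative data; shapes mirror the test added in this PR.
x = np.random.rand(100, 32)
y = {
    'a': np.random.rand(100, 10),
    'b': {'c': np.random.rand(100, 10), 'd': np.random.rand(100, 10)},
}

# Targets are passed in the same nested structure as the model's outputs.
model.fit(x, y, epochs=1, batch_size=32)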

@google-cla

google-cla bot commented Oct 20, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Contributor

Summary of Changes

Hello @Dimios45, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical limitation in Keras where models with nested output structures failed to compile when provided with corresponding nested metric configurations. By introducing a comprehensive recursive processing logic within the CompileMetrics class, the system can now seamlessly interpret and apply metrics to complex, hierarchical model outputs. This enhancement ensures that Keras users can define metrics in a structure that mirrors their model's output, providing greater flexibility and consistency in model definition and training.
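
For illustration only (this is not the PR's actual CompileMetrics code): a minimal, self-contained sketch of walking a nested metrics config and producing underscore-joined, path-prefixed names:

def flatten_nested_metrics(metrics_cfg, path=()):
    # Recursively walk dicts/lists of metric identifiers and yield
    # (flat output path, metric identifier) pairs.
    if isinstance(metrics_cfg, dict):
        for key, sub_cfg in metrics_cfg.items():
            yield from flatten_nested_metrics(sub_cfg, path + (key,))
    elif isinstance(metrics_cfg, (list, tuple)):
        for sub_cfg in metrics_cfg:
            yield from flatten_nested_metrics(sub_cfg, path)
    elif metrics_cfg is not None:
        yield "_".join(path), metrics_cfg

print(list(flatten_nested_metrics({'a': ['mae'], 'b': {'c': 'mse', 'd': 'mae'}})))
# [('a', 'mae'), ('b_c', 'mse'), ('b_d', 'mae')]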

Highlights

  • Nested Metrics Support: The CompileMetrics utility now fully supports nested dictionary and list structures for metrics, aligning with existing support for nested output structures and losses in Keras models.
  • Recursive Metric Processing: A new recursive mechanism (_build_nested_metrics) has been implemented to correctly parse and build metrics for complex, hierarchical output configurations, ensuring proper assignment and initialization.
  • New Helper Methods: Several new internal methods, including _has_nested_structure, _build_single_output_metrics, _build_flat_placeholders, _build_metrics_set_for_nested, and _flatten_dict_keys, were introduced to facilitate the robust handling of nested metric configurations.
  • Comprehensive Testing: New unit tests have been added to validate the correct behavior of nested metrics during model compilation and training, covering various nested dictionary and list scenarios, as well as compatibility with symbolic tensors.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature                Command               Description
Code Review            /gemini review        Performs a code review for the current pull request in its current state.
Pull Request Summary   /gemini summary       Provides a summary of the current pull request in its current state.
Comment                @gemini-code-assist   Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                   /gemini help          Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces support for nested metrics in CompileMetrics, aligning its functionality with the existing support for nested losses. The implementation correctly uses recursion to handle complex nested structures. My review focuses on improving the robustness of the error handling, ensuring consistent behavior for user inputs, and strengthening the new tests to fully validate the feature. Overall, this is a valuable addition that enhances Keras's flexibility.

Comment on lines +610 to +634
def test_nested_dict_metrics():
    import numpy as np
    from keras.src import layers
    from keras.src import Input
    from keras.src import Model

    X = np.random.rand(100, 32)
    Y1 = np.random.rand(100, 10)
    Y2 = np.random.rand(100, 10)
    Y3 = np.random.rand(100, 10)

    def create_model():
        x = Input(shape=(32,))
        y1 = layers.Dense(10)(x)
        y2 = layers.Dense(10)(x)
        y3 = layers.Dense(10)(x)
        return Model(inputs=x, outputs={'a': y1, 'b': {'c': y2, 'd': y3}})

    model = create_model()
    model.compile(
        optimizer='adam',
        loss={'a': 'mse', 'b': {'c': 'mse', 'd': 'mse'}},
        metrics={'a': ['mae'], 'b': {'c': 'mse', 'd': 'mae'}},
    )
    model.train_on_batch(X, {'a': Y1, 'b': {'c': Y2, 'd': Y3}})

Severity: high

This test function test_nested_dict_metrics lacks assertions, which means it only verifies that the code runs without raising an exception. A test should validate the output or behavior. Please add assertions to check the metrics returned by train_on_batch or the model.metrics_names attribute after compilation.

Additionally, having a standalone function with the same name as a class method in the same file is confusing. It would be better to integrate this into a test class and give it a more descriptive name, such as test_nested_metrics_with_model_compile.
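
One possible shape for such assertions (a sketch only; it assumes the test is moved into a TestCase as suggested, that train_on_batch accepts return_dict=True, and that the metric names are path-prefixed as shown):

logs = model.train_on_batch(
    X, {'a': Y1, 'b': {'c': Y2, 'd': Y3}}, return_dict=True
)
# Assumed path-prefixed names; adjust to whatever the implementation emits.
self.assertIn("a_mean_absolute_error", logs)
self.assertIn("b_c_mean_squared_error", logs)
self.assertIn("b_d_mean_absolute_error", logs)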

elif isinstance(metrics_cfg, (list, tuple)) and isinstance(yp, (list, tuple)):
    flat_metrics = []
    for i, (m_cfg, y_t_elem, y_p_elem) in enumerate(zip(metrics_cfg, yt, yp)):

Severity: medium

When handling lists/tuples in build_recursive_metrics, using zip can silently truncate sequences if metrics_cfg, yt, and yp have different lengths. This is inconsistent with _build_metrics_set and _build_metrics_set_for_nested, which explicitly check for length mismatches and raise a ValueError. For a consistent and predictable user experience, you should add a length check here to ensure the structures are compatible and raise an informative error if they are not.
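
A sketch of the kind of guard being suggested (the helper name and error wording here are hypothetical; variable names follow the quoted snippet):

def check_lengths_match(metrics_cfg, yt, yp):
    # Hypothetical guard mirroring the explicit mismatch checks in
    # _build_metrics_set / _build_metrics_set_for_nested.
    if not (len(metrics_cfg) == len(yt) == len(yp)):
        raise ValueError(
            "Mismatched lengths in nested metrics: got "
            f"{len(metrics_cfg)} metric entries for {len(yp)} outputs."
        )

try:
    check_lengths_match(['mae', 'mse'], ['t1', 't2', 't3'], ['p1', 'p2', 'p3'])
except ValueError as e:
    print(e)  # an informative error instead of zip() silently truncating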

    value = compile_loss(y_true, y_pred)
    self.assertAllClose(value, 1.07666, atol=1e-5)

def test_nested_dict_metrics(self):

Severity: medium

The test method test_nested_dict_metrics is defined within the TestCompileLoss class, but it is testing the functionality of CompileMetrics. For better code organization and clarity, this test should be moved to the TestCompileMetrics class.

Comment on lines +585 to +591
expected_metric_names = []
for key in result.keys():
    if 'mean_squared_error' in key or 'mean_absolute_error' in key:
        expected_metric_names.append(key)

# At least some metrics should be computed
self.assertGreater(len(expected_metric_names), 0)

Severity: medium

The assertions in this test are too weak. It only checks that len(expected_metric_names) is greater than zero, which doesn't confirm that the correct metrics are created with the correct names for the nested structure. The test should be more specific by asserting the presence of expected metric names (e.g., a_mean_squared_error, b_c_mean_squared_error). This would provide a much stronger validation of the implementation.

Suggested change
-expected_metric_names = []
-for key in result.keys():
-    if 'mean_squared_error' in key or 'mean_absolute_error' in key:
-        expected_metric_names.append(key)
-# At least some metrics should be computed
-self.assertGreater(len(expected_metric_names), 0)
+self.assertIn("a_mean_squared_error", result)
+self.assertIn("b_c_mean_squared_error", result)
+self.assertIn("b_c_mean_absolute_error", result)
+self.assertIn("b_d_mean_squared_error", result)
+self.assertEqual(len(result), 4)

@codecov-commenter

codecov-commenter commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 6.03448% with 109 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.37%. Comparing base (f78fd8c) to head (017c2e3).

Files with missing lines               Patch %   Lines
keras/src/trainers/compile_utils.py    6.03%     109 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (f78fd8c) and HEAD (017c2e3). Click for more details.

HEAD has 8 uploads less than BASE
Flag               BASE (f78fd8c)   HEAD (017c2e3)
keras              5                1
keras-numpy        1                0
keras-tensorflow   1                0
keras-torch        1                0
keras-jax          1                0
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #21761       +/-   ##
===========================================
- Coverage   82.70%   34.37%   -48.33%     
===========================================
  Files         573      573               
  Lines       58817    58932      +115     
  Branches     9202     9243       +41     
===========================================
- Hits        48643    20257    -28386     
- Misses       7837    37768    +29931     
+ Partials     2337      907     -1430     
Flag               Coverage Δ
keras              34.37% <6.03%> (-48.14%) ⬇️
keras-jax          ?
keras-numpy        ?
keras-openvino     34.37% <6.03%> (-0.06%) ⬇️
keras-tensorflow   ?
keras-torch        ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

