Skip to content

Conversation

@ShiZhongming
Copy link
Collaborator

@ShiZhongming ShiZhongming commented Nov 10, 2025

Summary by CodeRabbit

  • New Features
    • Real-time network layout validation with automatic discovery and sorting by modification time.
    • Support for user-defined network layouts via shapefiles or GeoJSON files.
    • Multi-candidate connection options for network design (k-nearest neighbor connectivity).
    • Enhanced API endpoints for parameter validation and dynamic metadata computation.
    • Support for nullable building parameters and flexible network selection.

✏️ Tip: You can customize this high-level summary in your review settings.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (16)
cea/technologies/network_layout/connectivity_potential.py (4)

61-65: Docstring correctly describes function behavior.

The return description accurately states the function returns a GeoDataFrame with streets split at intersection points, which matches the implementation.


700-700: Remove extraneous f-string prefix.

The string doesn't contain any placeholders, so the f prefix is unnecessary.

Based on static analysis hints.

Apply this diff:

-        print(f"      → Large tolerance may distort short street segments")
+        print("      → Large tolerance may distort short street segments")

709-709: Remove extraneous f-string prefix.

The string doesn't contain any placeholders, so the f prefix is unnecessary.

Based on static analysis hints.

Apply this diff:

-        print(f"      → Large tolerance may connect unrelated street segments")
+        print("      → Large tolerance may connect unrelated street segments")

712-712: Remove extraneous f-string prefix.

The string at line 712 doesn't contain any placeholders, so the f prefix is unnecessary. Note that line 713 correctly uses an f-string with the {SNAP_TOLERANCE} placeholder.

Based on static analysis hints.

Apply this diff:

-        print(f"  INFO: Typical snap_tolerance range is 0.3-2.0m")
+        print("  INFO: Typical snap_tolerance range is 0.3-2.0m")
cea/technologies/thermal_network/thermal_network.py (2)

3376-3398: ** Regression: Broad exception handler still swallows informative error messages.**

This issue was flagged in a previous review. The broad except Exception: on line 3396 catches the informative ValueError instances raised on lines 3387 and 3392 (which list available layouts or provide guidance) and replaces them with the generic message on line 3397, losing exactly the context users need.

Restrict the handler to I/O-related errors only and let the explicit ValueErrors propagate:

     if not network_name:
-        # Get available network layouts to provide helpful error message
-        try:
-            network_folder = locator.get_thermal_network_folder()
-            available_layouts = [name for name in os.listdir(network_folder)
-                                if os.path.isdir(os.path.join(network_folder, name))
-                                and name not in {'DH', 'DC'}]
-            if available_layouts:
-                raise ValueError(
-                    f"Network name is required. Please select a network layout.\n"
-                    f"Available layouts: {', '.join(available_layouts)}"
-                )
-            else:
-                raise ValueError(
-                    "Network name is required, but no network layouts found.\n"
-                    "Please create or import a network layout using 'thermal-network-layout'."
-                )
-        except Exception:
-            raise ValueError("Network name is required. Please select a network layout.")
+        network_folder = locator.get_thermal_network_folder()
+        try:
+            entries = os.listdir(network_folder)
+        except (FileNotFoundError, OSError):
+            raise ValueError(
+                "Network name is required, but no network layouts folder was found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )
+        available_layouts = [
+            name for name in entries
+            if os.path.isdir(os.path.join(network_folder, name))
+            and name not in {"DH", "DC"}
+        ]
+        if available_layouts:
+            raise ValueError(
+                f"Network name is required. Please select a network layout.\n"
+                f"Available layouts: {', '.join(available_layouts)}"
+            )
+        else:
+            raise ValueError(
+                "Network name is required, but no network layouts found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )

3410-3410: ** Type mismatch: check_heating_cooling_demand receives config with list-type network_type but expects scalar string.**

This issue was flagged in a previous review. The call to check_heating_cooling_demand(locator, config) passes the full config object, but inside the function (line 3350), network_type = config.thermal_network.network_type extracts the MultiChoiceParameter value, which is a list (e.g., ["DH", "DC"]). However, the function then performs scalar comparisons on lines 3352 and 3358:

if network_type == "DH":  # List == string will always be False

Since the main function now loops over individual network_type values (line 3401), pass the scalar network_type from the loop instead:

             elif network_model == 'detailed':
-                check_heating_cooling_demand(locator, config)
+                check_heating_cooling_demand(locator, network_type)

Then update the function signature at line 3348:

-def check_heating_cooling_demand(locator, config):
+def check_heating_cooling_demand(locator, network_type):
     # local variables
-    network_type = config.thermal_network.network_type
     total_demand = pd.read_csv(locator.get_total_demand())
cea/technologies/network_layout/main.py (8)

502-577: resolve_plant_buildings: unreachable unmatched-building warning block

You already raise a ValueError when unmatched_buildings is non‑empty:

if unmatched_buildings:
    raise ValueError(...)

The later block:

if unmatched_buildings:
    print(f"  ⚠ Warning: {len(unmatched_buildings)} plant building(s){label} not found in zone:")
    ...

is therefore unreachable dead code.

Either remove the second if unmatched_buildings block or downgrade the earlier raise to a warning (but not both). Given the strict validation semantics, removing the dead warning block seems best.


579-724: auto_layout_network: good orchestration; tweak connected-buildings detection and demand-filter edge case

The DC/DH orchestration, demand filtering, and per‑network‑type Steiner + plant integration look sound. Two non‑critical points:

  1. connected-buildings explicit vs. auto-populated detection

    You currently treat connected_buildings_config as “explicit” only if it differs from all zone buildings:

    is_explicitly_set = (connected_buildings_config and
                         set(connected_buildings_config) != set(all_zone_buildings))

    This misclassifies the case where a user explicitly selects all buildings as “blank”. Functionally the resulting building set is the same, but it also causes you to skip the stricter validation path that enforces the explicit list.

    If you want every non-empty config value to be treated as explicit, you can simplify to:

  •    is_explicitly_set = (connected_buildings_config and
    
  •                       set(connected_buildings_config) != set(all_zone_buildings))
    
  •    is_explicitly_set = bool(connected_buildings_config)
    
    
    (Same comment applies to `process_user_defined_network` below.)
    
    
  1. Demand-filter edge case for empty per-service building sets

    When consider_only_buildings_with_demand is True, it’s possible (especially in small test scenarios) to end up with buildings_for_dc or buildings_for_dh empty for a given service. At the moment you’ll still include that service in network_types_to_generate, which can lead to later failures in calc_building_centroids if no buildings are available.

    Consider dropping a service from list_include_services (and logging) when its filtered list is empty, similar to what you already do in the supply-driven branch.

Neither issue is a blocker, but addressing them would make the behavior more predictable.


944-987: NetworkLayout dataclass: unused fields and potential simplification

NetworkLayout currently only uses network_name, connected_buildings, and algorithm; the other fields:

  • pipe_diameter
  • type_mat
  • consider_only_buildings_with_demand
  • disconnected_buildings

are not referenced anywhere in this module.

Given the TODO, either remove these fields to keep the dataclass minimal, or document their planned usage (and wire them into auto_layout_network/process_user_defined_network) so they don’t appear as dead state in the public API.


1142-1156: validate_network_covers_district_buildings never auto-creates; printing branch is dead

By design, validate_network_covers_district_buildings now returns (nodes_gdf, []) and raises errors for missing/extra buildings; it does not auto-create nodes.

As a result, this block is unreachable:

if auto_created_buildings:
    ...
else:
    print("  ✓ All specified buildings have valid nodes in network")

auto_created_buildings will always be empty; only the else branch executes.

You can simplify to:

print("  ✓ All specified buildings have valid nodes in network")

and update any remaining docstrings that still mention auto‑creation.


30-97: Validate type values in convert_simplified_nodes_to_full_format before treating them as building names

type_original values that are empty, null, or effectively NaN will currently fall into the “building node” branch and create consumers with building names like "nan" / "", which is almost certainly invalid:

type_value = str(row['type_original']).strip()
...
else:
    # It's a building node (consumer)
    nodes_full.loc[idx, 'building'] = type_value
    nodes_full.loc[idx, 'type'] = 'CONSUMER'

Add a simple validation guard so malformed types fail fast instead of silently creating bogus buildings:

-    for idx, row in nodes_full.iterrows():
-        type_value = str(row['type_original']).strip()
+    for idx, row in nodes_full.iterrows():
+        type_value = str(row['type_original']).strip()
+
+        if not type_value or type_value.upper() == 'NAN':
+            raise ValueError(
+                f"Invalid node at index {idx}: 'type' field is empty or null. "
+                "Each node must have a valid type (building name, 'NONE', or plant type)."
+            )

Then keep the existing plant / NONE / building branches.


131-157: Add defensive checks for supply.csv / assemblies CSVs and required columns

get_buildings_from_supply_csv assumes the files and columns ('name', 'code', 'scale', 'supply_type_hs' / 'supply_type_cs') exist. Missing files or columns will currently raise low‑signal FileNotFoundError / KeyError from pandas.

Consider adding minimal guards:

-    supply_df = pd.read_csv(locator.get_building_supply())
+    try:
+        supply_df = pd.read_csv(locator.get_building_supply())
+    except FileNotFoundError as e:
+        raise FileNotFoundError(f"supply.csv not found at {locator.get_building_supply()}") from e
@@
-    if network_type == "DH":
-        assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_heating())
+    if network_type == "DH":
+        try:
+            assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_heating())
+        except FileNotFoundError as e:
+            raise FileNotFoundError("SUPPLY_HEATING.csv assemblies database not found") from e
@@
-    else:  # DC
-        assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_cooling())
+    else:  # DC
+        try:
+            assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_cooling())
+        except FileNotFoundError as e:
+            raise FileNotFoundError("SUPPLY_COOLING.csv assemblies database not found") from e
@@
+    required_supply = {'name', system_type_col}
+    if not required_supply.issubset(supply_df.columns):
+        raise ValueError(f"supply.csv missing required columns: {required_supply - set(supply_df.columns)}")
+    required_assembly = {'code', 'scale'}
+    if not required_assembly.issubset(assemblies_df.columns):
+        raise ValueError(f"Assemblies database missing required columns: {required_assembly - set(assemblies_df.columns)}")

This will give users much clearer errors when inputs are misconfigured.


159-176: Add basic validation around Total_demand.csv and expected columns

Similar to supply, get_buildings_with_demand assumes Total_demand.csv exists and that QH_sys_MWhyr / QC_sys_MWhyr and name columns are present. A missing file or renamed column will currently fail with a generic pandas error.

A small defensive wrapper would help:

-    total_demand = pd.read_csv(locator.get_total_demand())
+    try:
+        total_demand = pd.read_csv(locator.get_total_demand())
+    except FileNotFoundError as e:
+        raise FileNotFoundError(f"Total_demand.csv not found at {locator.get_total_demand()}") from e
@@
-    if network_type == "DH":
-        field = "QH_sys_MWhyr"
-    else:  # DC
-        field = "QC_sys_MWhyr"
+    if network_type == "DH":
+        field = "QH_sys_MWhyr"
+    else:  # DC
+        field = "QC_sys_MWhyr"
+
+    required = {field, 'name'}
+    if not required.issubset(total_demand.columns):
+        raise ValueError(f"Total_demand.csv missing required columns: {required - set(total_demand.columns)}")

This keeps runtime failures clearer and closer to their root cause.


1056-1118: connected-buildings detection in process_user_defined_network has the same subtle semantics as in auto-layout

As in auto_layout_network, you treat connected_buildings_config as explicit only when it differs from all zone buildings:

is_explicitly_set = (connected_buildings_config and
                     set(connected_buildings_config) != set(all_zone_buildings))

Here, the functional difference does matter: for explicit mode you validate against connected_buildings_config, for “blank” you validate only against buildings present in the user layout.

If a user manually sets connected-buildings to exactly the zone building set, this condition will treat it as “blank” and you’ll no longer enforce that the layout covers all zone buildings.

If you intend “any non-empty value means explicit”, it’s safer to use:

-        is_explicitly_set = (connected_buildings_config and
-                           set(connected_buildings_config) != set(all_zone_buildings))
+        is_explicitly_set = bool(connected_buildings_config)

This keeps auto-populated “blank” behavior (empty list) while honoring explicit user lists, even when they match the full zone.

cea/technologies/network_layout/graph_utils.py (2)

81-245: Geometry endpoint replacement in _merge_orphan_nodes_to_nearest can still corrupt edge geometries

When rewiring edges from best_node_B to best_node_A, the endpoint replacement:

if coords[0] == best_node_B or coords[0] == u:
    coords[0] = best_node_A
else:
    coords[-1] = best_node_A

will sometimes replace the wrong endpoint (e.g., when coords[0] equals the non‑orphan node u and the orphan is actually at coords[-1]). This yields a LineString whose endpoints no longer match the edge’s (other_node, best_node_A) tuple, and can even create degenerate self‑loops.

You already track best_node_B; you can make this robust and avoid depending on u by only replacing the endpoint that actually equals the orphan, and otherwise leaving the geometry untouched:

-                        # Replace the orphan endpoint
-                        if coords[0] == best_node_B or coords[0] == u:
-                            coords[0] = best_node_A
-                        else:
-                            coords[-1] = best_node_A
+                        # Replace the endpoint that actually matches the orphan node
+                        if coords[0] == best_node_B:
+                            coords[0] = best_node_A
+                        elif coords[-1] == best_node_B:
+                            coords[-1] = best_node_A
+                        else:
+                            # Endpoint mismatch – keep original geometry to avoid corrupting it
+                            pass

This keeps geometries consistent with the new edge endpoints and avoids subtle topology glitches.


330-345: normalize_coords fallback doesn’t actually support shapely Points as advertised

The fallback branch still uses c[0] / c[1] while the comment claims it handles attribute‑based geometries like shapely Point. For shapely Points (especially in newer versions), c[0] / c[1] may not be valid, so this will raise instead of gracefully handling geometry objects.

To align behavior with the comment and be more robust:

-        else:
-            # Fallback: attempt attribute access (e.g., shapely Point)
-            try:
-                x = float(c[0])  # type: ignore[index]
-                y = float(c[1])  # type: ignore[index]
-            except Exception as e:
-                raise ValueError(f"Invalid coordinate element for normalization: {c!r}") from e
+        else:
+            # Fallback: support simple geometry-like objects (e.g., shapely Point)
+            try:
+                if hasattr(c, "x") and hasattr(c, "y"):
+                    x = float(c.x)
+                    y = float(c.y)
+                else:
+                    x = float(c[0])  # type: ignore[index]
+                    y = float(c[1])  # type: ignore[index]
+            except Exception as e:
+                raise ValueError(f"Invalid coordinate element for normalization: {c!r}") from e

This keeps existing sequence behavior while making the Point support real.

🧹 Nitpick comments (7)
cea/technologies/thermal_network/thermal_network.py (2)

3413-3428: Consider consolidating NetworkConfig creation to avoid duplication and improve maintainability.

The NetworkConfig class defined inside main (lines 3413-3424) solves the per-network configuration issue by overriding network_type with the scalar value from the loop. However, this approach has maintenance concerns:

  1. Duplicate attribute lists: The hardcoded list of attributes (lines 3417-3423) must be kept in sync with copy_config_section (lines 127-132). If the config structure changes, both locations need updates.

  2. Inline class definition: Defining the class inside main is unconventional and makes it harder to test or reuse.

  3. Silent attribute skipping: If config.thermal_network is missing any expected attributes, the hasattr check silently skips them without warning.

Consider one of these alternatives:

Option 1: Add a helper method to create per-network config sections:

def create_network_config_section(base_section, network_type_override):
    """Create a config section for a specific network type."""
    class NetworkSection:
        pass
    
    section = NetworkSection()
    section.network_type = network_type_override
    
    # Copy attributes from the thermal_network_section_fields list
    for field in ['network_names', 'file_type', 'set_diameter',
                  'load_max_edge_flowrate_from_previous_run', 'start_t', 'stop_t',
                  'use_representative_week_per_month', 'minimum_mass_flow_iteration_limit',
                  'minimum_edge_mass_flow', 'diameter_iteration_limit',
                  'substation_cooling_systems', 'substation_heating_systems',
                  'temperature_control', 'plant_supply_temperature', 'equivalent_length_factor']:
        if hasattr(base_section, field):
            setattr(section, field, getattr(base_section, field))
    
    return section

Option 2: Modify ThermalNetwork.__init__ to accept an explicit network_type parameter that overrides the config section value:

def __init__(self, locator, network_name, thermal_network_section=None, network_type_override=None):
    # ... existing code ...
    self.copy_config_section(thermal_network_section)
    if network_type_override is not None:
        self.network_type = network_type_override

Either approach would eliminate the inline class definition and make the code more maintainable.


3399-3444: Consider catching broader exception types to allow graceful degradation across network types.

The error collection logic (lines 3400, 3432-3434, 3436-3444) only catches ValueError exceptions, which means other exception types will abort processing of remaining network types. For example:

  • RuntimeError raised on line 3430 for unknown network models
  • Any unexpected exceptions from thermal_network_simplified, ThermalNetwork, or thermal_network_main

If the goal is to process all network types and report failures at the end (allowing partial success), consider catching a broader exception type:

         try:
             if network_model == 'simplified':
                 thermal_network_simplified(locator, config, network_type, network_name)
             elif network_model == 'detailed':
                 check_heating_cooling_demand(locator, config)
                 # Create a per-network config section with the correct network_type
                 # This is a simple namespace object that mimics the config section interface
                 class NetworkConfig:
                     def __init__(self, base_config, network_type_override):
                         self.network_type = network_type_override
                         # Copy all other attributes from base config
                         for attr in ['network_names', 'file_type', 'set_diameter',
                                    'load_max_edge_flowrate_from_previous_run', 'start_t', 'stop_t',
                                    'use_representative_week_per_month', 'minimum_mass_flow_iteration_limit',
                                    'minimum_edge_mass_flow', 'diameter_iteration_limit',
                                    'substation_cooling_systems', 'substation_heating_systems',
                                    'temperature_control', 'plant_supply_temperature', 'equivalent_length_factor']:
                             if hasattr(base_config, attr):
                                 setattr(self, attr, getattr(base_config, attr))

                 per_network_config = NetworkConfig(config.thermal_network, network_type)
                 thermal_network = ThermalNetwork(locator, network_name, per_network_config)
                 thermal_network_main(locator, thermal_network, processes=config.get_number_of_processes())
             else:
                 raise RuntimeError(f"Unknown network model: {network_model}")
             print(f"{network_type} network processing completed.")
-        except ValueError as e:
+        except Exception as e:
             print(f"An error occurred while processing the {network_type} network")
             errors[network_type] = e

This allows all network types to be attempted even if one fails with an unexpected exception type. If you prefer to only catch known exception types, consider adding RuntimeError and any other specific types that should allow graceful degradation:

except (ValueError, RuntimeError) as e:
cea/technologies/network_layout/main.py (3)

178-499: auto_create_plant_nodes: overall logic is solid; one small robustness improvement suggested

The component detection, missing‑junction auto‑creation, component‑count validation, and the two plant modes (explicit plant buildings vs. anchor‑load one‑per‑component) are well structured and match the documented behavior. A couple of points:

  • The temp_node_added cleanup now only keying on building == plant_building is safe because add_plant_close_to_anchor always creates plant nodes with building == 'NONE', so the created PLANT is not accidentally removed.
  • Final normalize_gdf_geometries on nodes_gdf is a good guard for consistency.

One minor suggestion: inside the “no plant buildings” branch, you already compute component_demand and check demand_field in component_demand.columns. It would be safer to also handle the case where component_demand is empty because none of the buildings appear in Total_demand.csv and log a warning before falling back to alphabetical selection (currently you silently fall back). Not critical, but it would make diagnostics clearer.


957-987: _validate_network_name: verify network_type semantics for duplicate detection

The validation now uses the passed network_type to check for existing outputs:

output_folder = locator.get_output_thermal_network_type_folder(network_type, network_name)
...
edges_path = locator.get_network_layout_edges_shapefile(network_type, network_name)
nodes_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)

This is a big improvement over the previous module-level hack, but it assumes network_type is a concrete type like 'DC' or 'DH'. If the config ever uses aggregate values like 'DC+DH', these paths may not match real output folders, effectively disabling duplicate detection.

Either:

  • Guarantee/configure network_layout.network_type to be a single type ('DC' or 'DH') and document it, or
  • Broaden _validate_network_name to check both 'DC' and 'DH' folders when network_type is something like 'DC+DH'.

Not a blocker, but worth verifying against your current config/schema.


1280-1319: main: routing between user-defined and auto-layout paths plus cleanup looks good

The main entrypoint:

  • Builds NetworkLayout via from_config,
  • Routes to process_user_defined_network when any of edges_shp, nodes_shp, or geojson_path is set, else to auto_layout_network, and
  • On any exception, removes the partially created network folder before re‑raising.

Only minor nit: snap_tolerance computed in main is only used for the user-defined path (auto-layout re-reads from config). That’s fine, just something to be aware of if you later refactor.

cea/technologies/network_layout/steiner_spanning_tree.py (2)

63-99: calc_steiner_spanning_tree signature and Mehlhorn guard: behavior looks good; docstring needs a small refresh

The refactored signature that takes potential_network_graph: nx.Graph plus defaulted type_mat_default / pipe_diameter_default is a sensible API. The Mehlhorn guard:

if connection_candidates > 1 and steiner_algorithm == SteinerAlgorithm.Mehlhorn:
    print("  WARNING: connection_candidates > 1 requires Kou algorithm for optimization.")
    ...
    connection_candidates = 1

is also a reasonable safety net.

Two minor cleanups:

  • The docstring still mentions create_plant and doesn’t describe potential_network_graph / connection_candidates clearly; consider updating the parameter docs to match the new signature and behavior.
  • The method: str = SteinerAlgorithm.Kou annotation is slightly misleading; the default is actually a SteinerAlgorithm. You can either change the type to SteinerAlgorithm or default to SteinerAlgorithm.Kou.value.

Functionally this looks fine.


515-537: _validate_steiner_tree gives useful diagnostics; normalize terminal_coords to a set internally

The new _validate_steiner_tree correctly checks:

  1. Degrees of terminal nodes,
  2. Presence of terminal–terminal edges,
  3. Overall connectivity, and
  4. Basic network statistics.

One small improvement: callers currently pass terminal_nodes_coordinates as a list; you treat terminal_coords as a set in membership checks but don’t normalize it. To avoid accidental O(n²) scans and inconsistent tuple/list comparisons, you could normalize once at the top:

-def _validate_steiner_tree(mst_graph: nx.Graph, mst_nodes: gdf, mst_edges: gdf, terminal_coords: set):
+def _validate_steiner_tree(mst_graph: nx.Graph, mst_nodes: gdf, mst_edges: gdf, terminal_coords):
@@
-    print("\n  Validating Steiner tree properties...")
+    terminal_coords = {tuple(c) for c in terminal_coords}
+    print("\n  Validating Steiner tree properties...")

and then rely on set semantics throughout. Not required for correctness, but slightly safer and faster.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12820c1 and 83955a8.

📒 Files selected for processing (6)
  • cea/technologies/network_layout/connectivity_potential.py (16 hunks)
  • cea/technologies/network_layout/debug.py (1 hunks)
  • cea/technologies/network_layout/graph_utils.py (4 hunks)
  • cea/technologies/network_layout/main.py (2 hunks)
  • cea/technologies/network_layout/steiner_spanning_tree.py (9 hunks)
  • cea/technologies/thermal_network/thermal_network.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cea/technologies/network_layout/debug.py
🧰 Additional context used
🧬 Code graph analysis (4)
cea/technologies/thermal_network/thermal_network.py (3)
cea/config.py (2)
  • locator (578-581)
  • get_number_of_processes (246-257)
cea/inputlocator.py (1)
  • get_thermal_network_folder (1022-1024)
cea/technologies/thermal_network/simplified_thermal_network.py (1)
  • thermal_network_simplified (379-889)
cea/technologies/network_layout/connectivity_potential.py (2)
cea/utilities/standardize_coordinates.py (2)
  • get_projected_coordinate_system (55-65)
  • get_lat_lon_projected_shapefile (75-90)
cea/technologies/network_layout/graph_utils.py (4)
  • gdf_to_nx (16-78)
  • normalize_coords (311-345)
  • normalize_geometry (348-380)
  • _merge_orphan_nodes_to_nearest (81-245)
cea/technologies/network_layout/steiner_spanning_tree.py (2)
cea/technologies/network_layout/graph_utils.py (2)
  • gdf_to_nx (16-78)
  • normalize_coords (311-345)
cea/technologies/network_layout/plant_node_operations.py (2)
  • add_plant_close_to_anchor (67-165)
  • get_next_node_name (29-45)
cea/technologies/network_layout/main.py (7)
cea/technologies/network_layout/connectivity_potential.py (1)
  • calc_connectivity_network_with_geometry (934-1033)
cea/technologies/network_layout/steiner_spanning_tree.py (1)
  • calc_steiner_spanning_tree (63-543)
cea/technologies/network_layout/plant_node_operations.py (3)
  • add_plant_close_to_anchor (67-165)
  • get_next_node_name (29-45)
  • get_next_pipe_name (48-64)
cea/technologies/network_layout/substations_location.py (1)
  • calc_building_centroids (21-58)
cea/technologies/network_layout/graph_utils.py (2)
  • normalize_gdf_geometries (383-414)
  • normalize_geometry (348-380)
cea/optimization_new/user_network_loader.py (2)
  • load_user_defined_network (45-111)
  • validate_network_covers_district_buildings (405-518)
cea/inputlocator.py (9)
  • get_total_demand (1269-1271)
  • get_zone_building_names (897-903)
  • get_street_network (1170-1173)
  • get_zone_geometry (878-882)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1007-1019)
  • get_output_thermal_network_type_folder (983-985)
  • get_network_layout_edges_shapefile (993-1005)
  • get_thermal_network_folder_network_name_folder (979-981)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests (windows-latest)
  • GitHub Check: tests (macos-latest)
  • GitHub Check: tests (ubuntu-latest)
🔇 Additional comments (29)
cea/technologies/network_layout/connectivity_potential.py (8)

32-42: LGTM: Import additions are appropriate.

All newly imported modules and functions are utilized in the updated implementation.


259-375: Good: Consistent coordinate type handling.

The implementation now consistently uses homogeneous (float, float) tuples throughout, which prevents type ambiguity issues when working with shapely geometry coordinates. The added explode() step for MultiLineStrings prevents potential errors when accessing .coords.


432-486: Well-implemented k-nearest neighbor search.

The function correctly implements vectorized k-nearest street edge discovery with proper input validation, empty-input handling, coordinate normalization, and structured return format. The use of nsmallest(k) is efficient for finding the top-k closest edges per building.


488-621: Excellent: K-nearest support with thorough documentation.

The function correctly implements k-nearest terminal connections. The extensive inline comments (lines 593-620) clearly explain the normalization strategy and rationale for avoiding global topology operations on the combined network, which is critical for maintaining correct building-to-street topology.


717-756: LGTM: Proper integration of snap tolerance validation.

The function correctly validates snap_tolerance before applying street network cleaning, which provides early warning of potentially problematic configuration values.


759-786: Well-designed junction identification.

The function correctly identifies original street junctions by checking node degree while excluding terminal nodes. The coordinate normalization ensures consistency with the rest of the system.


789-931: Excellent validation with comprehensive diagnostics.

The function provides thorough connectivity validation with three levels of checks, safe orphan removal, and detailed error diagnostics including actionable suggestions for resolution (e.g., increasing connection_candidates). The approach of safely discarding orphan components without building terminals before raising an error is sound.


934-1033: Well-orchestrated connectivity workflow.

The function correctly integrates all new capabilities: k-nearest terminal connections, snap tolerance validation, orphan merging, junction identification, and comprehensive connectivity validation. The comment at lines 1016-1019 correctly explains the critical decision to compute junctions from the streets-only graph rather than the combined graph.

cea/technologies/network_layout/graph_utils.py (1)

471-479: Empty-graph handling in nx_to_gdf looks good

The new branch that returns an empty GeoDataFrame with a geometry column and CRS when edges_data is empty is reasonable and should prevent failures on empty graphs.

cea/technologies/network_layout/main.py (9)

100-118: determine_network_types helper is clear and sufficient

The logic mapping include_services{'DC','DH'} with a guard for “no thermal services” is straightforward and aligns with the rest of the module.


120-129: print_demand_warning helper is fine as a simple logging utility

The warning formatting and capping to 10 buildings read well and should be adequate for CLI diagnostics.


731-773: auto_layout_network: use of zone/street GeoDataFrames and plant-building resolution looks correct

Loading zone_df / street_network_df, resolving plant buildings against zone_df['name'], and passing CRS‑aware building_centroids_df into calc_connectivity_network_with_geometry fit well with the rest of the pipeline. No issues spotted here.


751-915: Per-network-type Steiner + plant insertion in auto_layout_network looks coherent

The per‑type loop:

  • Builds building centroids for the service,
  • Computes the potential network graph with the configured connection_candidates and snap_tolerance,
  • Calls calc_steiner_spanning_tree once per type (with plant_building_names=None so plants are added only here),
  • Adds plants either at user-specified buildings or at the anchor load building, and
  • Normalizes plant types to plain PLANT before writing {type}/nodes.shp.

This is a well-factored separation between the Steiner core and per-network plant logic. The duplicate-node-name validation after plant addition is also a useful safety net.


916-942: layout.shp edge merging and renumbering are reasonable

Collecting all DC/DH edges, dropping duplicates by geometry, and then renumbering PIPE names via get_next_pipe_name(all_edges_gdf.loc[:idx]) should give you a clean, globally unique PIPE sequence in layout.shp. Behavior for non‑PIPE* names is left unchanged, which is also appropriate.


990-1055: process_user_defined_network: loading, basic validation, and scenario mismatch checks look strong

The use of load_user_defined_network, the zone‑geometry subset check for all network buildings, and the detailed error message when building names don’t exist in zone.shp are all solid. That should catch most “wrong scenario / wrong network” mistakes early.


1119-1141: Supply-based building selection branch in process_user_defined_network mirrors auto-layout correctly

The supply-driven path (when overwrite_supply is False) using get_buildings_from_supply_csv, unioning DC/DH buildings, adjusting list_include_services when one side is empty, and printing counts is consistent with auto_layout_network. No specific issues spotted here.


1168-1174: Expected component count wiring looks reasonable

Deriving expected_num_components from config.network_layout.number_of_components and passing it into auto_create_plant_nodes is a clean way to centralize this validation. No issues here.


1176-1277: Per-network-type plant integration for user-defined layouts and layout.shp writing look correct

For user-defined networks:

  • You validate building coverage first,
  • Then call auto_create_plant_nodes separately for DC and DH (with expected_num_components),
  • Collect all edges including plant-connection edges into all_edges_with_plants,
  • Drop duplicates by geometry,
  • Renumber PIPE names with get_next_pipe_name, and
  • Normalize geometries before writing layout.shp.

This addresses the previous problem where plant edges were not persisted to layout.shp and should keep the edge set coherent across DC/DH. The DC/DH node‑file writes and type normalization (PLANT_DC/PLANT_DHPLANT) also look consistent.

cea/technologies/network_layout/steiner_spanning_tree.py (11)

14-18: New imports for defaults and plant helpers are appropriate

Bringing in TYPE_MAT_DEFAULT / PIPE_DIAMETER_DEFAULT and reusing add_plant_close_to_anchor / get_next_node_name from plant_node_operations centralizes pipe defaults and naming logic nicely.


111-120: Use of potential_network_graph and original_junctions metadata is appropriate

Converting the provided potential network graph to undirected form and reading original_junctions from its .graph metadata matches how calc_connectivity_network_with_geometry populates that field. This keeps Steiner post‑processing aligned with the connectivity pipeline.


145-160: Single-building short-circuit is correct and avoids unnecessary Steiner work

For a single terminal, you now build a trivial graph with a single node and skip steiner_tree, which is exactly what you want. The later empty‑edges handling and diagnostics paths also account for this case.


161-283: Terminal-leaf and no-building-to-building enforcement logic is conceptually sound

_enforce_terminal_leafs_and_no_b2b:

  • Removes direct terminal‑terminal edges and replaces them with shortest street paths using G_no_terminals as a base, and
  • Ensures each terminal ends up with degree 1 by rerouting excess terminal edges via non-terminal paths where possible.

This matches Steiner-tree expectations and should significantly reduce pathological layouts, with defensive fallbacks (warnings, keeping edges when reroute fails) to preserve connectivity.

Given the complexity, it’s worth relying on tests to confirm behavior under larger, messy networks, but the high‑level approach looks reasonable.


285-397: Pruning non-terminal leaves and contracting degree-2 street chains are well-scoped

_prune_nonterminal_leaves and _contract_degree2_street_nodes:

  • Iteratively remove dangling non-terminal leaves, and
  • Contract degree‑2 interior street nodes while preserving terminals and any nodes in original_junctions.

Using _merge_two_edges to preserve geometries and recompute weights keeps edge attributes consistent enough for subsequent processing (you later overwrite type_mat/pipe_DN/names anyway). This looks like a solid clean-up stage after Steiner.


399-429: _report_network_stats helper is useful and low-risk

The reporting of node/edge counts, non-terminal leaves, terminal-degree anomalies, and total length provides good diagnostics for debugging Steiner outputs. Importing LineString lazily for the rare fallback path is fine.


459-475: Empty-edge handling for single-building networks is correct

When there are no edges, creating an empty GeoDataFrame with the expected schema and printing a “Single-building network” message keeps downstream consumers safe, and the later weight recalculation is correctly skipped.


485-505: Node field population and junction typing look coherent

Populating:

  • coordinates via normalize_coords,
  • building from pointer_coordinates_building_names,
  • name as NODE{index},
  • type as CONSUMER/NONE, and
  • Type_node as TERMINAL / ORIGINAL_JUNCTION / NONE

aligns well with how upstream graph metadata is structured. The final consistency check comparing terminal_nodes_names to mst_nodes['building'] (minus 'NONE') is a good sanity guard.


510-534: Plant creation inside calc_steiner_spanning_tree is now clearly controlled by plant_building_names

The logic:

  • Creates explicit PLANT nodes only when plant_building_names is a non-empty list,
  • Emits a warning if requested plant buildings don’t exist in the network, and
  • Uses calc_coord_anchor only when plant_building_names is an empty list (not None) and type_network is 'DC' or 'DH'.

Given callers pass plant_building_names=None when they plan to handle plants externally (as in auto_layout_network), this separation of responsibilities is clear and avoids duplicate plant insertion.


688-702: calc_coord_anchor logic matches previous behavior

Merging nodes_df with Total_demand.csv and selecting the building with maximum demand for the given type_network is consistent with the previous anchor-load selection logic. Using this only when type_network is 'DC' or 'DH' avoids ambiguity for combined networks.


709-813: Legacy add_loops_to_network: deprecation note and get_next_node_name usage look fine

Marking add_loops_to_network as deprecated and wiring it to use get_next_node_name for any new intermediate nodes keeps backwards compatibility while discouraging new use. The updated _add_edge_to_network pipe-numbering scheme fits naturally here as well.

Ensure the network layout value is always converted to a string before stripping whitespace, preventing errors when non-string values are provided.
Improves the orphan node merging logic in graph_utils.py to only merge small, non-terminal orphan components within a threshold, and discards isolated single nodes. Updates main.py to support validating network names across multiple network types and simplifies the logic for detecting explicitly set connected buildings.
Eliminated the unused 'representative_week' variable and its argument from all file output function calls in save_all_results_to_csv. This streamlines the code and ensures consistency with updated function signatures.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (11)
cea/config.py (3)

869-965: Dynamic NetworkLayoutChoiceParameter behavior is good once path issues are fixed

With _get_available_networks and _get_network_file_paths corrected as above, the remaining behavior here is sound:

  • _choices is computed lazily from available layout folders and sorted by most recent modification time.
  • encode enforces that only existing layouts can be written.
  • decode:
    • Returns the stored value when valid.
    • Falls back to the most recently modified layout when the stored value is empty/invalid.
    • Returns '' when no networks exist.

That gives a sensible “pick an existing layout, default to most recent” UX for both CLI and dashboard.


886-891: Guard _get_available_networks against missing thermal-network folder

_get_available_networks calls os.listdir(locator.get_thermal_network_folder()) unconditionally. For a fresh scenario with no outputs/data/thermal-network folder yet, this will raise FileNotFoundError whenever the parameter is accessed (dashboard, CLI help, etc.), which can break config usage before any network exists.

Return an empty list when the folder is missing:

     def _get_available_networks(self) -> List[str]:
-        locator = cea.inputlocator.InputLocator(self.config.scenario)
-        network_folder = locator.get_thermal_network_folder()
-        return [name for name in os.listdir(network_folder)
-                if os.path.isdir(os.path.join(network_folder, name)) and name not in self._network_types]
+        locator = cea.inputlocator.InputLocator(self.config.scenario)
+        network_folder = locator.get_thermal_network_folder()
+        if not os.path.exists(network_folder):
+            return []
+        return [
+            name
+            for name in os.listdir(network_folder)
+            if os.path.isdir(os.path.join(network_folder, name)) and name not in self._network_types
+        ]

This lets NetworkLayoutChoiceParameter gracefully report “no networks” instead of crashing on missing folders.


892-903: Fix _get_network_file_paths: wrong getter for edges and avoid check_cpg side-effects

Two problems here:

  1. edges_path mistakenly uses the nodes getter, so both paths point to nodes.shp.
  2. Both paths are built via get_network_layout_*_shapefile, which run check_cpg and can raise even when the shapefile exists but .cpg is missing, breaking layout discovery.

You only need existence checks here; build the layout paths directly from the network-type folder to avoid side-effects and fix the edges/nodes mixup:

-    def _get_network_file_paths(self, network_type: str, network_name: str) -> Tuple[str, str]:
-        """Get path for network node and edge files for the given network name"""
-        locator = cea.inputlocator.InputLocator(self.config.scenario)
-
-        network_type_folder = locator.get_output_thermal_network_type_folder(network_type, network_name)
-        # Remove trailing slash/separator if present
-        network_type_folder = network_type_folder.rstrip(os.sep)
-
-        edges_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)
-        nodes_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)
-
-        return edges_path, nodes_path
+    def _get_network_file_paths(self, network_type: str, network_name: str) -> Tuple[str, str]:
+        """Get path for network node and edge files for the given network name (no side-effects)."""
+        locator = cea.inputlocator.InputLocator(self.config.scenario)
+        base = locator.get_output_thermal_network_type_folder(network_type, network_name).rstrip(os.sep)
+        edges_path = os.path.join(base, 'layout', 'edges.shp')
+        nodes_path = os.path.join(base, 'layout', 'nodes.shp')
+        return edges_path, nodes_path

This aligns with the locator’s folder structure, avoids check_cpg failures, and correctly distinguishes edges vs nodes.

cea/technologies/thermal_network/thermal_network.py (3)

3341-3355: Demand check still broken: network_type is a list, but comparisons expect a scalar string

check_heating_cooling_demand reads network_type = config.thermal_network.network_type, which is now a MultiChoiceParameter returning a list (e.g. ["DH"] or ["DH", "DC"]). The subsequent checks:

if network_type == "DH":
    ...
if network_type == "DC":
    ...

will never be true for lists, so you silently skip all demand validation. In main, you already iterate per network_type:

network_types = config.thermal_network.network_type
for network_type in network_types:
    ...
    elif network_model == 'detailed':
        check_heating_cooling_demand(locator, config)

but the loop variable isn’t used inside the check.

Make check_heating_cooling_demand operate on a single network_type string and pass the current loop value:

-def check_heating_cooling_demand(locator, config):
-    # local variables
-    network_type = config.thermal_network.network_type
+def check_heating_cooling_demand(locator, network_type: str):
+    """Validate that the selected network_type has non-zero demand."""
     total_demand = pd.read_csv(locator.get_total_demand())
@@
-    if network_type == "DH":
+    if network_type == "DH":
         ...
-    if network_type == "DC":
+    if network_type == "DC":
         ...

and in main:

-            elif network_model == 'detailed':
-                check_heating_cooling_demand(locator, config)
+            elif network_model == 'detailed':
+                check_heating_cooling_demand(locator, network_type)

This restores the intended per-type validation (heating demand for DH, cooling for DC). Please update any other call sites to pass a scalar network_type in the same way.

Also applies to: 3392-3421


3399-3421: Per-network-type NetworkConfig wrapper and detailed pipeline look correct

Using an inner NetworkConfig to:

  • override network_type with the current loop value, and
  • copy the remaining thermal-network attributes from the base config,

before passing it into ThermalNetwork(locator, network_name, per_network_config) cleanly fixes the previous bug where the constructor was given a raw string instead of a section-like object. This ensures each detailed run sees a scalar network_type and the correct per-run settings.

Assuming check_heating_cooling_demand is updated as suggested, the detailed path per type looks coherent.


3369-3390: Broad except Exception still hides the detailed “available layouts” error messages

The network_name check re-raises a generic ValueError("Network name is required. Please select a network layout.") for any exception, including the more informative ValueErrors you raise when listing available layouts or when no layouts exist. That was previously flagged and remains unresolved.

Narrow this to filesystem errors and let your explicit ValueErrors propagate:

     network_name = config.thermal_network.network_name
     if not network_name:
-        # Get available network layouts to provide helpful error message
-        try:
-            network_folder = locator.get_thermal_network_folder()
-            available_layouts = [name for name in os.listdir(network_folder)
-                                if os.path.isdir(os.path.join(network_folder, name))
-                                and name not in {'DH', 'DC'}]
-            if available_layouts:
-                raise ValueError(
-                    f"Network name is required. Please select a network layout.\n"
-                    f"Available layouts: {', '.join(available_layouts)}"
-                )
-            else:
-                raise ValueError(
-                    "Network name is required, but no network layouts found.\n"
-                    "Please create or import a network layout using 'thermal-network-layout'."
-                )
-        except Exception:
-            raise ValueError("Network name is required. Please select a network layout.")
+        network_folder = locator.get_thermal_network_folder()
+        try:
+            entries = os.listdir(network_folder)
+        except FileNotFoundError:
+            raise ValueError(
+                "Network name is required, but no network layouts folder was found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )
+
+        available_layouts = [
+            name
+            for name in entries
+            if os.path.isdir(os.path.join(network_folder, name)) and name not in {"DH", "DC"}
+        ]
+
+        if available_layouts:
+            raise ValueError(
+                "Network name is required. Please select a network layout.\n"
+                f"Available layouts: {', '.join(available_layouts)}"
+            )
+        else:
+            raise ValueError(
+                "Network name is required, but no network layouts found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )

This preserves the helpful guidance while still handling missing network folders cleanly.

cea/technologies/network_layout/graph_utils.py (2)

81-237: Orphan edge geometry rewrite can corrupt endpoints

When snapping best_node_B to best_node_A, the endpoint replacement:

if coords[0] == best_node_B or coords[0] == u:
    coords[0] = best_node_A
else:
    coords[-1] = best_node_A

will rewrite the wrong endpoint whenever coords[0] matches the non-orphan node (u) and coords[-1] matches best_node_B. That creates a LineString whose endpoints no longer match the edge (other_node, best_node_A).

Restrict the replacement to the endpoint that actually equals the orphan node and leave the geometry unchanged if neither endpoint matches:

-                        # Replace the orphan endpoint
-                        if coords[0] == best_node_B or coords[0] == u:
-                            coords[0] = best_node_A
-                        else:
-                            coords[-1] = best_node_A
+                        # Replace the endpoint that actually matches the orphan node
+                        if coords[0] == best_node_B:
+                            coords[0] = best_node_A
+                        elif coords[-1] == best_node_B:
+                            coords[-1] = best_node_A
+                        else:
+                            # Endpoint mismatch – leave geometry unchanged to avoid corrupting it
+                            pass

This keeps edge geometries consistent with their (other_node, best_node_A) endpoints.


322-337: normalize_coords fallback still incompatible with shapely Points

The fallback branch claims to support attribute-based geometries but still indexes c[0] / c[1], which fails for shapely Point objects (not subscriptable). It also never prefers .x / .y when available.

Make attribute-based coords the primary path and only fall back to indexing for sequence/array-like inputs:

-    rounded = []
-    for c in coords:
-        # Accept (x, y) or (x, y, z/other); ignore extra dimensions
-        # Also tolerate sequences like numpy arrays
-        if isinstance(c, (list, tuple)) and len(c) >= 2:
-            x = float(c[0])
-            y = float(c[1])
-        else:
-            # Fallback: attempt attribute access (e.g., shapely Point)
-            try:
-                x = float(c[0])  # type: ignore[index]
-                y = float(c[1])  # type: ignore[index]
-            except Exception as e:
-                raise ValueError(f"Invalid coordinate element for normalization: {c!r}") from e
-        rounded.append((round(x, precision), round(y, precision)))
-    return rounded
+    rounded = []
+    for c in coords:
+        try:
+            # Geometry-like objects (e.g. shapely Point)
+            if hasattr(c, "x") and hasattr(c, "y"):
+                x = float(c.x)  # type: ignore[union-attr]
+                y = float(c.y)  # type: ignore[union-attr]
+            # Sequence / array-like (x, y[, z...])
+            elif isinstance(c, (list, tuple)) and len(c) >= 2:
+                x = float(c[0])
+                y = float(c[1])
+            else:
+                # Fallback for other indexable types (e.g. numpy arrays)
+                x = float(c[0])  # type: ignore[index]
+                y = float(c[1])  # type: ignore[index]
+        except Exception as e:
+            raise ValueError(f"Invalid coordinate element for normalization: {c!r}") from e
+        rounded.append((round(x, precision), round(y, precision)))
+    return rounded

This matches the comment and safely supports shapely Point inputs.

cea/technologies/network_layout/main.py (3)

567-574: Remove unreachable dead code.

Lines 567-574 print warnings for unmatched buildings, but lines 544-550 already raise a ValueError when unmatched_buildings is non-empty. This makes the warning code unreachable.

-    if unmatched_buildings:
-        print(f"  ⚠ Warning: {len(unmatched_buildings)} plant building(s){label} not found in zone:")
-        for building in unmatched_buildings[:5]:
-            print(f"      - {building}")
-        if len(unmatched_buildings) > 5:
-            print(f"      ... and {len(unmatched_buildings) - 5} more")
-        print(f"    Available buildings in zone: {', '.join(sorted(available_buildings)[:5])}" +
-              (f" ... and {len(available_buildings) - 5} more" if len(available_buildings) > 5 else ""))

Based on past review comments.


131-156: Add defensive error handling for missing files and columns.

Past review comments requested error handling, but no defensive checks are present. Without validation:

  • Missing CSV files will raise unhandled FileNotFoundError
  • Missing columns ('code', 'scale', 'name', system_type_col) will cause KeyError
  • Unmapped supply types will produce NaN values and be silently filtered

Add validation:

 def get_buildings_from_supply_csv(locator, network_type):
     """..."""
+    try:
+        supply_df = pd.read_csv(locator.get_building_supply())
+    except FileNotFoundError as e:
+        raise FileNotFoundError(f"supply.csv not found at {locator.get_building_supply()}") from e
-    supply_df = pd.read_csv(locator.get_building_supply())
 
     # Read assemblies database
     if network_type == "DH":
+        try:
+            assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_heating())
+        except FileNotFoundError as e:
+            raise FileNotFoundError("SUPPLY_HEATING.csv not found") from e
-        assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_heating())
         system_type_col = 'supply_type_hs'
     else:
+        try:
+            assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_cooling())
+        except FileNotFoundError as e:
+            raise FileNotFoundError("SUPPLY_COOLING.csv not found") from e
-        assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_cooling())
         system_type_col = 'supply_type_cs'
+    
+    # Validate required columns
+    if 'name' not in supply_df.columns or system_type_col not in supply_df.columns:
+        raise ValueError(f"supply.csv missing required columns: 'name' or '{system_type_col}'")
+    if 'code' not in assemblies_df.columns or 'scale' not in assemblies_df.columns:
+        raise ValueError("Assemblies database missing required columns: 'code' or 'scale'")

Based on past review comments.


69-83: Add validation for empty or invalid type values.

The loop assumes type field contains valid values. If a node has an empty, null, or whitespace-only type value, line 82 will set building='' and type='CONSUMER', creating an invalid consumer node with no building name. Line 70's .strip() converts whitespace to empty string, but doesn't validate it.

Add validation before classification:

     for idx, row in nodes_full.iterrows():
         type_value = str(row['type_original']).strip()
 
+        if not type_value or type_value.upper() == 'NAN':
+            raise ValueError(
+                f"Invalid node at index {idx}: 'type' field is empty or null. "
+                "Each node must have a valid type (building name, 'NONE', or plant type)."
+            )
+
         if type_value.upper() in plant_types:

Based on past review comments.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83955a8 and 991d138.

📒 Files selected for processing (4)
  • cea/config.py (5 hunks)
  • cea/technologies/network_layout/graph_utils.py (4 hunks)
  • cea/technologies/network_layout/main.py (2 hunks)
  • cea/technologies/thermal_network/thermal_network.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
cea/config.py (2)
cea/technologies/network_layout/main.py (1)
  • _validate_network_name (967-992)
cea/inputlocator.py (5)
  • InputLocator (28-1364)
  • get_thermal_network_folder_network_name_folder (979-981)
  • get_thermal_network_folder (1022-1024)
  • get_output_thermal_network_type_folder (983-985)
  • get_network_layout_nodes_shapefile (1007-1019)
cea/technologies/thermal_network/thermal_network.py (3)
cea/config.py (2)
  • locator (578-581)
  • get_number_of_processes (246-257)
cea/inputlocator.py (4)
  • get_thermal_network_layout_massflow_nodes_file (1107-1109)
  • get_thermal_network_substation_ploss_file (1123-1125)
  • get_network_temperature_return_nodes_file (1115-1117)
  • get_thermal_network_folder (1022-1024)
cea/technologies/thermal_network/simplified_thermal_network.py (1)
  • thermal_network_simplified (379-889)
cea/technologies/network_layout/main.py (5)
cea/technologies/network_layout/connectivity_potential.py (1)
  • calc_connectivity_network_with_geometry (934-1033)
cea/technologies/network_layout/plant_node_operations.py (3)
  • add_plant_close_to_anchor (67-165)
  • get_next_node_name (29-45)
  • get_next_pipe_name (48-64)
cea/technologies/network_layout/graph_utils.py (2)
  • normalize_gdf_geometries (375-406)
  • normalize_geometry (340-372)
cea/optimization_new/user_network_loader.py (2)
  • load_user_defined_network (45-111)
  • validate_network_covers_district_buildings (405-518)
cea/inputlocator.py (12)
  • get_building_supply (905-907)
  • get_database_assemblies_supply_heating (793-795)
  • get_database_assemblies_supply_cooling (785-787)
  • get_total_demand (1269-1271)
  • get_zone_building_names (897-903)
  • get_street_network (1170-1173)
  • get_zone_geometry (878-882)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1007-1019)
  • get_output_thermal_network_type_folder (983-985)
  • get_network_layout_edges_shapefile (993-1005)
  • get_thermal_network_folder_network_name_folder (979-981)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests (macos-latest)
  • GitHub Check: tests (windows-latest)
  • GitHub Check: tests (ubuntu-latest)
🔇 Additional comments (5)
cea/technologies/network_layout/graph_utils.py (1)

463-471: Empty-graph handling in nx_to_gdf looks good

Creating an empty GeoDataFrame with an explicit geometry column and CRS when edges_data is empty is a clean way to avoid failures on graphs without edges and keeps callers’ expectations about geometry/CRS intact.

cea/config.py (3)

430-437: Type annotations on base Parameter.encode / decode are fine

Adding return-type hints without changing behavior keeps the API stable and clarifies usage; no issues here.


757-817: NetworkLayoutNameParameter validation and collision checks look sane

The name parameter:

  • Strips whitespace and rejects common filesystem-unsafe characters.
  • Uses InputLocator.get_thermal_network_folder_network_name_folder to detect existing {network_name} folders and prevent accidental overwrite.
  • Only enforces collision checks on encode (when saving), so existing configs with old names can still be read.

This is a reasonable front-end guard for layout creation.


1151-1175: SingleBuildingParameter nullable support looks correct

The nullable handling:

  • Reads a .nullable flag from default.config.
  • decode returns None when nullable and the stored value is '' / None, otherwise defers to ChoiceParameter.decode.
  • encode returns '' for None/empty when nullable, preserving existing behavior (fallback to first building) for invalid non-empty values.

This keeps existing semantics and adds the ability to represent “no building selected” cleanly.

cea/technologies/thermal_network/thermal_network.py (1)

793-909: Result writers now consistently include network_name

All modified calls in save_all_results_to_csv now pass (thermal_network.network_type, thermal_network.network_name) into the various locator methods (massflow edges/nodes, pressures, temperatures, thermal losses, pumping energy, etc.), which matches the new {network_type}/{network_name} result path structure exposed in InputLocator.

This should correctly segregate outputs per (type, layout) pair rather than overwriting a single {network_type} folder.

Comment on lines +949 to +952
# TODO: Remove unused fields
pipe_diameter: float = 0.0
type_mat: str = ""
consider_only_buildings_with_demand: bool = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Remove or document unused fields.

The TODO comment indicates these fields are unused:

  • pipe_diameter
  • type_mat
  • consider_only_buildings_with_demand

If they're not used in the current implementation, remove them. If they're for future use, document their purpose and make them optional with defaults.

Based on past review comments.

🤖 Prompt for AI Agents
In cea/technologies/network_layout/main.py around lines 949-952, the TODO notes
pipe_diameter, type_mat, and consider_only_buildings_with_demand are unused;
either remove these fields from the dataclass/structure if truly unused, or keep
them but add a clear docstring comment explaining their intended future purpose
and make them explicit optional parameters with sensible defaults (e.g., None
for type_mat or keep numeric default for pipe_diameter) and annotate types
accordingly; update any related constructors/factories and unit tests to reflect
the removal or the documented optional fields so the codebase and API remain
consistent.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (11)
cea/technologies/network_layout/main.py (6)

567-574: Remove unreachable dead code.

Lines 567-574 print warnings for unmatched buildings, but lines 544-550 already raise a ValueError if unmatched_buildings is non-empty. This makes the warning code unreachable.

-    if unmatched_buildings:
-        print(f"  ⚠ Warning: {len(unmatched_buildings)} plant building(s){label} not found in zone:")
-        for building in unmatched_buildings[:5]:
-            print(f"      - {building}")
-        if len(unmatched_buildings) > 5:
-            print(f"      ... and {len(unmatched_buildings) - 5} more")
-        print(f"    Available buildings in zone: {', '.join(sorted(available_buildings)[:5])}" +
-              (f" ... and {len(available_buildings) - 5} more" if len(available_buildings) > 5 else ""))

977-980: Remove or document unused fields.

The TODO comment indicates these fields are unused:

  • pipe_diameter
  • type_mat
  • consider_only_buildings_with_demand

If they're not used in the current implementation, remove them. If they're for future use, document their purpose and make them optional with clear defaults.


1190-1196: Remove unreachable dead code.

The function validate_network_covers_district_buildings (from relevant code snippets) never populates auto_created_buildings—it only validates and raises errors for missing buildings, returning (nodes_gdf, []). Therefore, lines 1190-1196 are unreachable.

Additionally, there's a critical docstring–implementation mismatch in that function:

  • Docstring claims: "Auto-create missing nodes if edges reach the building"
  • Implementation reality: Raises errors instead; never auto-creates nodes

Either implement the auto-creation logic described in the docstring, or remove lines 1190-1196.


131-157: Add defensive error handling for missing files and columns.

Past review comments requested error handling, but defensive checks are still missing. Without validation:

  • Missing CSV files will raise unhandled FileNotFoundError
  • Missing columns will cause KeyError
  • Unmapped supply types will produce NaN values and be silently filtered

Add validation:

 def get_buildings_from_supply_csv(locator, network_type):
     """..."""
+    try:
+        supply_df = pd.read_csv(locator.get_building_supply())
+    except FileNotFoundError as e:
+        raise FileNotFoundError(f"supply.csv not found at {locator.get_building_supply()}") from e
-    supply_df = pd.read_csv(locator.get_building_supply())
 
     # Read assemblies database
     if network_type == "DH":
+        try:
+            assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_heating())
+        except FileNotFoundError as e:
+            raise FileNotFoundError("SUPPLY_HEATING.csv not found") from e
-        assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_heating())
         system_type_col = 'supply_type_hs'
     else:
+        try:
+            assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_cooling())
+        except FileNotFoundError as e:
+            raise FileNotFoundError("SUPPLY_COOLING.csv not found") from e
-        assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_cooling())
         system_type_col = 'supply_type_cs'
+    
+    # Validate required columns
+    if 'name' not in supply_df.columns or system_type_col not in supply_df.columns:
+        raise ValueError(f"supply.csv missing required columns: 'name' or '{system_type_col}'")
+    if 'code' not in assemblies_df.columns or 'scale' not in assemblies_df.columns:
+        raise ValueError("Assemblies database missing required columns: 'code' or 'scale'")

159-175: Add defensive error handling for missing files and columns.

Similar to get_buildings_from_supply_csv, this function lacks defensive checks requested in past review comments.

Add validation:

def get_buildings_with_demand(locator, network_type):
    """..."""
    try:
        total_demand = pd.read_csv(locator.get_total_demand())
    except FileNotFoundError as e:
        raise FileNotFoundError(f"Total_demand.csv not found at {locator.get_total_demand()}") from e
    
    if network_type == "DH":
        field = "QH_sys_MWhyr"
    else:
        field = "QC_sys_MWhyr"
    
    if field not in total_demand.columns:
        raise ValueError(f"Total_demand.csv missing required column: {field}")
    if 'name' not in total_demand.columns:
        raise ValueError("Total_demand.csv missing required column: name")
    
    buildings_with_demand = total_demand[total_demand[field] > 0.0]['name'].tolist()
    return buildings_with_demand

69-83: Add validation for empty or invalid type values.

Past review comments requested this validation but it wasn't added. If a node has an empty, null, or whitespace-only type value, line 82 will set building='' and type='CONSUMER', creating an invalid consumer node with no building name.

Add validation before classification:

     for idx, row in nodes_full.iterrows():
         type_value = str(row['type_original']).strip()
 
+        if not type_value or type_value.upper() == 'NAN':
+            raise ValueError(
+                f"Invalid node at index {idx}: 'type' field is empty or null. "
+                "Each node must have a valid type (building name, 'NONE', or plant type)."
+            )
+
         if type_value.upper() in plant_types:
cea/default.config (1)

1636-1636: Help text still references incorrect parameter names.

Past review comments flagged this issue, but it remains unresolved. The help text references parameter names with 'user-' prefix that don't match the actual parameter keys:

  • Line 1636: References 'user-nodes-shp-path' and 'user-network-geojson-path', but actual params are nodes-shp-path and network-geojson-path
  • Line 1643: References 'user-edges-shp-path' and 'user-network-geojson-path', but actual params are edges-shp-path and network-geojson-path
  • Line 1650: References 'user-edges-shp-path' and 'user-nodes-shp-path', but actual params are edges-shp-path and nodes-shp-path

Update the help text to use the correct parameter names (without the 'user-' prefix).

Also applies to: 1643-1643, 1650-1650

cea/technologies/thermal_network/thermal_network.py (2)

3379-3393: Pass scalar network_type into check_heating_cooling_demand instead of the full config

Inside check_heating_cooling_demand, network_type is read from config.thermal_network.network_type, which is now a multi-choice parameter (list-like) while the function compares against scalar strings "DH" / "DC". In main, you already have the scalar network_type from the loop:

network_types = config.thermal_network.network_type
for network_type in network_types:
    ...
    elif network_model == 'detailed':
        check_heating_cooling_demand(locator, config)  # still passes config

As written, network_type == "DH" / "DC" inside check_heating_cooling_demand will not behave as intended once config.thermal_network.network_type is a list, and the detailed path will silently skip the “no demand” checks.

Use the scalar from the loop and simplify the function signature accordingly:

-def check_heating_cooling_demand(locator, config):
-    # local variables
-    network_type = config.thermal_network.network_type
+def check_heating_cooling_demand(locator, network_type):
     total_demand = pd.read_csv(locator.get_total_demand())

and in main:

-            elif network_model == 'detailed':
-                check_heating_cooling_demand(locator, config)
+            elif network_model == 'detailed':
+                check_heating_cooling_demand(locator, network_type)
                 # Create a per-network config section with the correct network_type

This matches the multi-network orchestration and preserves the intended validation semantics.

#!/bin/bash
# Verify there are no other call sites that still use the old signature
rg -n "check_heating_cooling_demand\s*\(" -S

Also applies to: 3430-3460


3407-3428: Broad except Exception still hides the detailed “available layouts” message

The network_name block still wraps the entire layout-discovery logic in a bare except Exception, which replaces the more informative ValueError messages (including the list of available layouts) with the generic “Network name is required” text. This was already flagged earlier and remains a regression in user feedback.

Restrict the handler to filesystem errors and let the explicit ValueErrors propagate. For example:

-    if not network_name:
-        # Get available network layouts to provide helpful error message
-        try:
-            network_folder = locator.get_thermal_network_folder()
-            available_layouts = [name for name in os.listdir(network_folder)
-                                if os.path.isdir(os.path.join(network_folder, name))
-                                and name not in {'DH', 'DC'}]
-            if available_layouts:
-                raise ValueError(
-                    f"Network name is required. Please select a network layout.\n"
-                    f"Available layouts: {', '.join(available_layouts)}"
-                )
-            else:
-                raise ValueError(
-                    "Network name is required, but no network layouts found.\n"
-                    "Please create or import a network layout using 'thermal-network-layout'."
-                )
-        except Exception:
-            raise ValueError("Network name is required. Please select a network layout.")
+    if not network_name:
+        network_folder = locator.get_thermal_network_folder()
+        try:
+            entries = os.listdir(network_folder)
+        except FileNotFoundError:
+            raise ValueError(
+                "Network name is required, but no network layouts folder was found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )
+
+        available_layouts = [
+            name for name in entries
+            if os.path.isdir(os.path.join(network_folder, name))
+            and name not in {"DH", "DC"}
+        ]
+        if available_layouts:
+            raise ValueError(
+                "Network name is required. Please select a network layout.\n"
+                f"Available layouts: {', '.join(available_layouts)}"
+            )
+        else:
+            raise ValueError(
+                "Network name is required, but no network layouts found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )
cea/technologies/thermal_network/simplified_thermal_network.py (2)

99-125: Edges shapefile read path is still asymmetric with where edges are written

Edges are read from the generic get_network_layout_shapefile(network_name) (layout.shp), while later in the pipeline updated edge properties are written to the type-specific get_network_layout_edges_shapefile(network_type, network_name) (edges.shp). This means a re-run will read from the generic layout (possibly stale) instead of the last computed edges.shp.

Consider switching the read to the same type-specific path, with a fallback to the generic layout for backward compatibility:

-    edges_path = locator.get_network_layout_shapefile(network_name)
+    try:
+        edges_path = locator.get_network_layout_edges_shapefile(network_type, network_name)
+    except Exception:
+        # Fallback for legacy projects without per-type edges.shp
+        edges_path = locator.get_network_layout_shapefile(network_name)

This keeps existing layouts working while favouring the most up-to-date edge geometry.


99-138: Fix type hint for network_type in get_thermal_network_from_shapefile

network_type is used as a string (e.g. "DH" / "DC") to resolve node shapefile paths and construct error messages, but the annotation currently says gpd.GeoDataFrame, which is misleading for users and static tooling.

Update the signature to reflect the actual type:

-def get_thermal_network_from_shapefile(locator: cea.inputlocator.InputLocator, network_type: gpd.GeoDataFrame, network_name: str):
+def get_thermal_network_from_shapefile(locator: cea.inputlocator.InputLocator, network_type: str, network_name: str):

The rest of the implementation already assumes a string.

🧹 Nitpick comments (2)
cea/technologies/thermal_network/thermal_network.py (1)

137-165: clone(): constructor arguments now match signature and preserve config

Passing self.network_name plus self as the config section into ThermalNetwork fixes the prior bad constructor usage and ensures cloned instances inherit the same network settings. The extra shapefile read in __init__ is minor overhead and safe since you overwrite the loaded state immediately afterward.

cea/technologies/thermal_network/simplified_thermal_network.py (1)

127-137: Duplicate ID checks are helpful; consider stringifying for clearer messages (optional)

The duplicated node/pipe ID checks are good. If you want more readable error messages (instead of the tuple-style ValueError representation), you can format them into a single string:

-    if duplicated_nodes.size > 0:
-        raise ValueError('There are duplicated NODE IDs:', duplicated_nodes.name.values)
-    if duplicated_edges.size > 0:
-        raise ValueError('There are duplicated PIPE IDs:', duplicated_edges.name.values)
+    if duplicated_nodes.size > 0:
+        raise ValueError(f"There are duplicated NODE IDs: {duplicated_nodes.name.values}")
+    if duplicated_edges.size > 0:
+        raise ValueError(f"There are duplicated PIPE IDs: {duplicated_edges.name.values}")

Not required for correctness, but it usually produces clearer logs.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 991d138 and 298454c.

📒 Files selected for processing (4)
  • cea/default.config (5 hunks)
  • cea/technologies/network_layout/main.py (2 hunks)
  • cea/technologies/thermal_network/simplified_thermal_network.py (12 hunks)
  • cea/technologies/thermal_network/thermal_network.py (13 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
cea/technologies/thermal_network/simplified_thermal_network.py (3)
cea/inputlocator.py (4)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1007-1019)
  • get_output_thermal_network_type_folder (983-985)
  • get_network_layout_edges_shapefile (993-1005)
cea/technologies/thermal_network/utility.py (1)
  • extract_network_from_shapefile (10-89)
cea/utilities/__init__.py (1)
  • pushd (42-54)
cea/technologies/network_layout/main.py (8)
cea/technologies/network_layout/connectivity_potential.py (1)
  • calc_connectivity_network_with_geometry (934-1033)
cea/technologies/network_layout/steiner_spanning_tree.py (1)
  • calc_steiner_spanning_tree (63-543)
cea/technologies/network_layout/plant_node_operations.py (3)
  • add_plant_close_to_anchor (67-165)
  • get_next_node_name (29-45)
  • get_next_pipe_name (48-64)
cea/technologies/network_layout/substations_location.py (1)
  • calc_building_centroids (21-58)
cea/technologies/network_layout/graph_utils.py (2)
  • normalize_gdf_geometries (375-406)
  • normalize_geometry (340-372)
cea/optimization_new/user_network_loader.py (2)
  • load_user_defined_network (45-111)
  • validate_network_covers_district_buildings (405-518)
cea/config.py (5)
  • locator (578-581)
  • get (259-261)
  • get (438-445)
  • get (1111-1119)
  • _validate_network_name (763-789)
cea/inputlocator.py (11)
  • get_building_supply (905-907)
  • get_database_assemblies_supply_heating (793-795)
  • get_database_assemblies_supply_cooling (785-787)
  • get_total_demand (1269-1271)
  • get_zone_building_names (897-903)
  • get_zone_geometry (878-882)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1007-1019)
  • get_output_thermal_network_type_folder (983-985)
  • get_network_layout_edges_shapefile (993-1005)
  • get_thermal_network_folder_network_name_folder (979-981)
cea/technologies/thermal_network/thermal_network.py (3)
cea/config.py (2)
  • locator (578-581)
  • get_number_of_processes (246-257)
cea/inputlocator.py (7)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1007-1019)
  • get_thermal_network_layout_massflow_nodes_file (1107-1109)
  • get_thermal_network_substation_ploss_file (1123-1125)
  • get_network_temperature_return_nodes_file (1115-1117)
  • get_network_layout_edges_shapefile (993-1005)
  • get_thermal_network_folder (1022-1024)
cea/technologies/thermal_network/simplified_thermal_network.py (1)
  • thermal_network_simplified (402-912)
🪛 GitHub Actions: Ruff
cea/technologies/network_layout/main.py

[error] 769-769: Ruff check failed: f-string without placeholders. Remove extraneous 'f' prefix. Command: ruff check ./cea

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests (ubuntu-latest)
  • GitHub Check: tests (windows-latest)
  • GitHub Check: tests (macos-latest)
🔇 Additional comments (9)
cea/technologies/thermal_network/thermal_network.py (3)

194-233: Network layout loading and validation look correct

Using get_network_layout_shapefile(self.network_name) and get_network_layout_nodes_shapefile(self.network_type, self.network_name) with explicit FileNotFoundError messages and filter_edges=True in extract_network_from_shapefile gives clearer failures and correctly filters combined layouts down to the active network type. No functional issues spotted here.


815-925: Output paths are now network-name aware

All CSV outputs in save_all_results_to_csv now pass both thermal_network.network_type and thermal_network.network_name into the locator, aligning with the new per-network folder structure. This avoids different networks overwriting each other’s results.


3430-3476: Multi-network orchestration and per-network config wrapper look consistent

Looping over config.thermal_network.network_type and creating a per-network NetworkConfig that overrides network_type while copying the other thermal-network settings aligns ThermalNetwork with the new multi-choice parameter and prevents the old “list vs scalar” bugs in the detailed model. Aggregating ValueError / FileNotFoundError per network type into a final summary is also a nice UX improvement.

cea/technologies/thermal_network/simplified_thermal_network.py (6)

1-3: Import of warnings is appropriate

Adding warnings is justified by new validation and demand-handling logic that emits user-facing warnings.


167-327: Topology validator before WNTR is robust and well-scoped

validate_network_topology_for_wntr does thorough pre-checks (duplicate IDs, edge endpoints, connectivity, isolated nodes, plant presence, plant reachability) and raises clear, actionable ValueErrors. This should avoid opaque EPANET error codes later. Implementation using NetworkX is straightforward and consistent with the data structures.


330-383: Demand-pattern validation is a good safeguard

validate_demand_patterns enforces consistent time-series lengths and finite values across buildings, warning on mismatched hours and NaNs and erroring on infinities. This is a solid defensive layer before WNTR; the “empty DataFrame” shortcut is also reasonable.


402-507: Simplified model entry point correctly switches to per-network execution with validation

The new thermal_network_simplified(locator, config, network_type, network_name) signature and its usage:

  • Pulls network-type-specific shapefiles via get_thermal_network_from_shapefile.
  • Intersects demand buildings with node_df buildings.
  • Uses pushd into the per-network output folder.
  • Warns and assigns zero patterns for consumers without demand profiles.
  • Ensures at least one PLANT node.
  • Calls validate_demand_patterns and validate_network_topology_for_wntr before building the WNTR model.

This is a clean adaptation to named networks and should significantly improve diagnostics.


472-507: Structured WNTR error handling and result validation are well-designed

The three simulation phases now:

  • Wrap EpanetSimulator.run_sim() in targeted try/except blocks with context-rich ValueErrors.
  • Check for empty / NaN link results and warn or fail early.
  • Reuse the same wn object across iterations while updating diameters and reservoir head patterns.

This preserves the original algorithm but makes hydraulic/thermal failures much easier to diagnose.

Also applies to: 556-711


781-787: Ensuring per-network folders and writing back updated edges shapefile is consistent with new layout model

Creating the per-(network_type, network_name) folder (when network_name is set) and writing edge_df (including geometry) back to get_network_layout_edges_shapefile keeps the shapefile outputs aligned with the new file layout and with the detailed model. This also enables later reuse of pipe DN information from a single, canonical edges file.

Also applies to: 909-912

Comment on lines +1715 to +1720
# read in diameters from shp file (may return None on first run)
diameter_guess = read_in_diameters_from_shapefile(thermal_network)
if diameter_guess is None:
# First run - no previous diameters available, use initial guess
print('No previous pipe diameters found, using initial diameter guess...')
diameter_guess = initial_diameter_guess(thermal_network)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

read_in_diameters_from_shapefile returns string codes where numeric diameters are required

In the no-loop branch of calc_max_edge_flowrate, diameter_guess is populated from read_in_diameters_from_shapefile. That helper currently returns network_edges['pipe_DN'], i.e. string codes like "DN200", but downstream code expects diameters in meters:

  • hourly_mass_flow_calculation passes diameter_guess as pipe_diameter_m into calc_mass_flow_edges.
  • calc_mass_flow_edges calls calc_pressure_loss_pipe, which performs arithmetic and exponentiation on pipe_diameter_m.

On a subsequent (non-loop) run where edges.shp exists, this will hit pipe_diameter_m ** 5 with string data and raise a TypeError.

You either need to map pipe_DN codes back to D_int_m using the thermal grid catalog, or fall back to the existing numeric initial guess. A minimal fix that preserves reuse:

 def read_in_diameters_from_shapefile(thermal_network):
-    edges_path = thermal_network.locator.get_network_layout_edges_shapefile(
-        thermal_network.network_type, thermal_network.network_name)
-
-    # On first run, edges.shp doesn't exist yet (only layout.shp from Network Layout Part 1)
-    # Fall back to layout.shp if edges.shp doesn't exist
-    if not os.path.exists(edges_path):
-        edges_path = thermal_network.locator.get_network_layout_shapefile(thermal_network.network_name)
-
-    network_edges = gpd.read_file(edges_path)
-
-    # pipe_DN may not exist in layout.shp (only added after first thermal network run)
-    if 'pipe_DN' in network_edges.columns:
-        diameter_guess = network_edges['pipe_DN']
-    else:
-        # Return None to indicate no diameter guess available (will use defaults)
-        diameter_guess = None
+    edges_path = thermal_network.locator.get_network_layout_edges_shapefile(
+        thermal_network.network_type, thermal_network.network_name)
+
+    # On first run, edges.shp doesn't exist yet (only layout.shp from Network Layout Part 1)
+    # Fall back to layout.shp if edges.shp doesn't exist
+    if not os.path.exists(edges_path):
+        edges_path = thermal_network.locator.get_network_layout_shapefile(thermal_network.network_name)
+
+    network_edges = gpd.read_file(edges_path)
+
+    # pipe_DN may not exist in layout.shp (only added after first thermal network run)
+    if 'pipe_DN' in network_edges.columns:
+        # Map stored pipe_DN codes back to internal diameters in meters
+        pipe_catalog = pd.read_csv(
+            thermal_network.locator.get_database_components_distribution_thermal_grid('THERMAL_GRID')
+        )
+        code_to_d_int = dict(zip(pipe_catalog['pipe_DN'], pipe_catalog['D_int_m']))
+        diameter_guess = network_edges['pipe_DN'].map(code_to_d_int).values
+        # If any mapping failed, fall back to default behaviour
+        if np.isnan(diameter_guess).any():
+            diameter_guess = None
+    else:
+        # Return None to indicate no diameter guess available (will use defaults)
+        diameter_guess = None

This keeps the existing “first run uses initial_diameter_guess” behaviour, but makes the reuse path type-safe.

Also applies to: 1838-1856

🤖 Prompt for AI Agents
In cea/technologies/thermal_network/thermal_network.py around lines 1715-1720
(also apply same fix to 1838-1856), diameter_guess is set from
read_in_diameters_from_shapefile which returns pipe_DN string codes (e.g.
"DN200") but downstream expects numeric diameters in meters; change the reuse
path to convert/map pipe_DN codes to numeric meter diameters using the thermal
grid catalog (map DN code -> D_int_m) and produce a numeric Series/array aligned
with network_edges, and if a DN code is missing or mapping fails fall back to
initial_diameter_guess(thermal_network) for those edges so diameter_guess
contains only floats (or a numeric pandas Series) suitable for
calc_mass_flow_edges/calc_pressure_loss_pipe.

ShiZhongming and others added 4 commits November 21, 2025 13:06
- Track which networks succeeded during processing
- Show "Completed: DH. Failed: DC. See errors above." when partial success
- Show "All network types failed to process (DC, DH)" when all fail

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
cea/technologies/thermal_network/thermal_network.py (2)

3407-3429: Broad except Exception still discards detailed layout guidance for missing network_name

The if not network_name: block rebuilds a detailed list of available layouts, but then wraps it in a broad:

except Exception:
    raise ValueError("Network name is required. Please select a network layout.")

Any ValueError you raise intentionally for the user (with the list of available layouts or “no layouts found” guidance) is immediately swallowed and replaced with the generic message, which is exactly the regression flagged in the earlier review.

Please narrow the exception handling to true I/O errors and let your explicit ValueErrors propagate. For example:

-    if not network_name:
-        # Get available network layouts to provide helpful error message
-        try:
-            network_folder = locator.get_thermal_network_folder()
-            available_layouts = [name for name in os.listdir(network_folder)
-                                if os.path.isdir(os.path.join(network_folder, name))
-                                and name not in {'DH', 'DC'}]
-            if available_layouts:
-                raise ValueError(
-                    f"Network name is required. Please select a network layout.\n"
-                    f"Available layouts: {', '.join(available_layouts)}"
-                )
-            else:
-                raise ValueError(
-                    "Network name is required, but no network layouts found.\n"
-                    "Please create or import a network layout using 'thermal-network-layout'."
-                )
-        except Exception:
-            raise ValueError("Network name is required. Please select a network layout.")
+    if not network_name:
+        network_folder = locator.get_thermal_network_folder()
+        try:
+            entries = os.listdir(network_folder)
+        except FileNotFoundError:
+            raise ValueError(
+                "Network name is required, but the thermal-network folder was not found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )
+
+        available_layouts = [
+            name for name in entries
+            if os.path.isdir(os.path.join(network_folder, name))
+            and name not in {'DH', 'DC'}
+        ]
+        if available_layouts:
+            raise ValueError(
+                "Network name is required. Please select a network layout.\n"
+                f"Available layouts: {', '.join(available_layouts)}"
+            )
+        else:
+            raise ValueError(
+                "Network name is required, but no network layouts found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )

This preserves your helpful messages while still handling the missing-folder case gracefully.


3430-3461: Multi‑network loop and check_heating_cooling_demand still inconsistent on network_type (list vs scalar)

main now loops over:

network_types = config.thermal_network.network_type
for network_type in network_types:
    ...
    elif network_model == 'detailed':
        check_heating_cooling_demand(locator, config)
        ...
        per_network_config = NetworkConfig(config.thermal_network, network_type)
        thermal_network = ThermalNetwork(locator, network_name, per_network_config)

while check_heating_cooling_demand continues to do:

network_type = config.thermal_network.network_type
if network_type == "DH":
    ...
if network_type == "DC":
    ...

Given config.thermal_network.network_type is now a MultiChoice (list) for this PR, those == "DH"/"DC" tests will never be true, so:

  • Demand checks are effectively skipped entirely.
  • The check is also not per‑network_type; it re‑reads the same list each iteration.

At the same time, ThermalNetwork is correctly instantiated with a scalar network_type via per_network_config, so the inconsistency is localized to the demand check.

You can resolve this cleanly by:

  1. Making check_heating_cooling_demand take a scalar network_type, and
  2. Passing the loop variable instead of the whole config.

For example:

-def check_heating_cooling_demand(locator, config):
-    # local variables
-    network_type = config.thermal_network.network_type
+def check_heating_cooling_demand(locator, network_type):
     total_demand = pd.read_csv(locator.get_total_demand())
     if network_type == "DH":
         ...
     if network_type == "DC":
         ...

and in main:

-            elif network_model == 'detailed':
-                check_heating_cooling_demand(locator, config)
+            elif network_model == 'detailed':
+                check_heating_cooling_demand(locator, network_type)
                 per_network_config = NetworkConfig(config.thermal_network, network_type)
                 thermal_network = ThermalNetwork(locator, network_name, per_network_config)

Optionally, to be robust to legacy configs where network_type might still be a single string, you can normalize at the top of main:

network_types = config.thermal_network.network_type
if isinstance(network_types, str):
    network_types = [network_types]

so the loop always sees an iterable of valid "DH"/"DC" values.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 298454c and 9f4489a.

📒 Files selected for processing (1)
  • cea/technologies/thermal_network/thermal_network.py (13 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cea/technologies/thermal_network/thermal_network.py (4)
cea/config.py (2)
  • locator (578-581)
  • get_number_of_processes (246-257)
cea/inputlocator.py (4)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1007-1019)
  • get_network_layout_edges_shapefile (993-1005)
  • get_thermal_network_folder (1022-1024)
cea/technologies/thermal_network/utility.py (1)
  • extract_network_from_shapefile (10-89)
cea/technologies/thermal_network/simplified_thermal_network.py (1)
  • thermal_network_simplified (402-912)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests (macos-latest)
  • GitHub Check: tests (ubuntu-latest)
  • GitHub Check: tests (windows-latest)
🔇 Additional comments (4)
cea/technologies/thermal_network/thermal_network.py (4)

137-165: Clone now correctly propagates network_name into cloned instances

Passing self.network_name into the ThermalNetwork constructor in clone keeps cloned networks bound to the same layout and paths as the original. This is consistent with the updated __init__ signature and downstream use of network_name. No issues here.


167-233: Per‑network shapefile loading and edge filtering look correct

The new edges_path / nodes_path logic and existence checks provide clear, actionable FileNotFoundError messages and correctly distinguish between “layout not created” and “layout skipped due to missing demand”. Using the combined layout.shp plus per‑type nodes.shp and extract_network_from_shapefile(..., filter_edges=True) is a sound way to derive DC/DH‑specific networks from a shared layout. No functional concerns from this change.


650-931: Output paths now consistently include network_name for non‑representative runs

In the non‑representative (use_representative_week_per_month == False) branch of save_all_results_to_csv, all locator calls now pass both thermal_network.network_type and thermal_network.network_name. This aligns result files with the selected network layout and matches the usage in the representative‑week branch. The change is coherent and should not alter numerical results.


1065-1066: Looped-network initial solve remains consistent

Dropping the first row of edge_node_df and the corresponding column from mass_flow_substation_df to form A and b_init for the least‑squares solve keeps the system well‑determined for looped networks. This change appears to be a minor reshuffle rather than a semantic change; dimensions and alignment remain consistent.

Comment on lines +1715 to 1721
# read in diameters from shp file (may return None on first run)
diameter_guess = read_in_diameters_from_shapefile(thermal_network)
if diameter_guess is None:
# First run - no previous diameters available, use initial guess
print('No previous pipe diameters found, using initial diameter guess...')
diameter_guess = initial_diameter_guess(thermal_network)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

read_in_diameters_from_shapefile still returns string DN codes, causing TypeError in no‑loop runs

In the no‑loop path of calc_max_edge_flowrate, diameter_guess comes from read_in_diameters_from_shapefile(thermal_network) and is passed directly as pipe_diameter_m into hourly_mass_flow_calculationcalc_mass_flow_edgescalc_pressure_loss_pipe. The updated read_in_diameters_from_shapefile still returns:

diameter_guess = network_edges['pipe_DN']  # e.g. "DN200"

not numeric diameters in meters. These string values will reach:

pipe_diameter_m ** 5

in calc_pressure_loss_pipe, which will raise a TypeError on the second and subsequent runs (when edges.shp exists and has pipe_DN), breaking all non‑looped networks.

The new fallback to initial_diameter_guess only protects the first run (when there is no pipe_DN yet); it does not fix the reuse case.

You need to map pipe_DN codes back to internal diameters (D_int_m) using the thermal grid catalog and return a numeric array, or fall back to None so the caller can use initial_diameter_guess.

A minimal fix that keeps your new first‑run fallback:

 def read_in_diameters_from_shapefile(thermal_network):
-    edges_path = thermal_network.locator.get_network_layout_edges_shapefile(
-        thermal_network.network_type, thermal_network.network_name)
-
-    # On first run, edges.shp doesn't exist yet (only layout.shp from Network Layout Part 1)
-    # Fall back to layout.shp if edges.shp doesn't exist
-    if not os.path.exists(edges_path):
-        edges_path = thermal_network.locator.get_network_layout_shapefile(thermal_network.network_name)
-
-    network_edges = gpd.read_file(edges_path)
-
-    # pipe_DN may not exist in layout.shp (only added after first thermal network run)
-    if 'pipe_DN' in network_edges.columns:
-        diameter_guess = network_edges['pipe_DN']
-    else:
-        # Return None to indicate no diameter guess available (will use defaults)
-        diameter_guess = None
-
-    return diameter_guess
+    edges_path = thermal_network.locator.get_network_layout_edges_shapefile(
+        thermal_network.network_type, thermal_network.network_name)
+
+    # On first run, edges.shp doesn't exist yet (only layout.shp from Network Layout Part 1)
+    # Fall back to layout.shp if edges.shp doesn't exist
+    if not os.path.exists(edges_path):
+        edges_path = thermal_network.locator.get_network_layout_shapefile(thermal_network.network_name)
+
+    network_edges = gpd.read_file(edges_path)
+
+    # pipe_DN may not exist in layout.shp (only added after first thermal network run)
+    if 'pipe_DN' in network_edges.columns:
+        # Map stored pipe_DN codes back to internal diameters in meters
+        pipe_catalog = pd.read_csv(
+            thermal_network.locator.get_database_components_distribution_thermal_grid('THERMAL_GRID')
+        )
+        code_to_d_int = dict(zip(pipe_catalog['pipe_DN'], pipe_catalog['D_int_m']))
+        diameter_guess = network_edges['pipe_DN'].map(code_to_d_int).values
+        # If any mapping failed, fall back to default behaviour
+        if np.isnan(diameter_guess).any():
+            diameter_guess = None
+    else:
+        # Return None to indicate no diameter guess available (will use defaults)
+        diameter_guess = None
+
+    return diameter_guess

With this change, the no‑loop branch in calc_max_edge_flowrate will either reuse valid numeric diameters or safely fall back to initial_diameter_guess.

Also applies to: 1838-1856

🤖 Prompt for AI Agents
In cea/technologies/thermal_network/thermal_network.py around lines 1715-1721
(also applies to lines 1838-1856): read_in_diameters_from_shapefile currently
returns pipe DN strings (e.g. "DN200") which later cause a TypeError when
arithmetic expects numeric diameters; change the function to map pipe_DN values
to numeric internal diameters (D_int_m) using the thermal grid catalog lookup
and return a numeric array of diameters; if a DN cannot be resolved, return None
(not the raw strings) so the caller falls back to initial_diameter_guess; ensure
the returned dtype is numeric (floats) and preserve the existing first-run
behavior when no pipe_DN is present.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (4)
cea/technologies/thermal_network/thermal_network.py (3)

1838-1856: 🔴 CRITICAL: read_in_diameters_from_shapefile still returns string DN codes, breaking arithmetic operations.

This was flagged in previous reviews but remains unfixed. Line 1851 returns network_edges['pipe_DN'] which contains string codes like "DN200". However, downstream code treats these as numeric diameters in meters:

  • diameter_guess is passed to hourly_mass_flow_calculation (line 1737) as pipe_diameter_m
  • Eventually reaches calc_pressure_loss_pipe (line 1505) which performs pipe_diameter_m ** 5 (line 1541)
  • This will raise TypeError on the second run when edges.shp exists and contains pipe_DN values

The first run works because line 1720 falls back to initial_diameter_guess() when diameter_guess is None, but subsequent runs fail.

Fix: Map pipe_DN codes to numeric D_int_m values using the thermal grid catalog:

 def read_in_diameters_from_shapefile(thermal_network):
     edges_path = thermal_network.locator.get_network_layout_edges_shapefile(
         thermal_network.network_type, thermal_network.network_name)
 
     # On first run, edges.shp doesn't exist yet (only layout.shp from Network Layout Part 1)
     # Fall back to layout.shp if edges.shp doesn't exist
     if not os.path.exists(edges_path):
         edges_path = thermal_network.locator.get_network_layout_shapefile(thermal_network.network_name)
 
     network_edges = gpd.read_file(edges_path)
 
     # pipe_DN may not exist in layout.shp (only added after first thermal network run)
     if 'pipe_DN' in network_edges.columns:
-        diameter_guess = network_edges['pipe_DN']
+        # Map stored pipe_DN codes back to internal diameters in meters
+        pipe_catalog = pd.read_csv(
+            thermal_network.locator.get_database_components_distribution_thermal_grid('THERMAL_GRID')
+        )
+        code_to_d_int = dict(zip(pipe_catalog['pipe_DN'], pipe_catalog['D_int_m']))
+        diameter_guess = network_edges['pipe_DN'].map(code_to_d_int).values
+        # If any mapping failed, fall back to default behaviour
+        if pd.isna(diameter_guess).any():
+            diameter_guess = None
     else:
         # Return None to indicate no diameter guess available (will use defaults)
         diameter_guess = None
 
     return diameter_guess

3410-3428: 🔴 CRITICAL: Broad exception handler swallows informative error messages.

This issue was flagged in a previous review but remains unfixed. The except Exception: on line 3427 catches the explicit ValueError instances raised on lines 3418 and 3423 (which helpfully list available layouts or guide the user) and replaces them with the generic message on line 3428.

Users will never see the context about which layouts are available or the guidance to run 'thermal-network-layout'.

Fix: Restrict the exception handler to only I/O errors:

     if not network_name:
-        # Get available network layouts to provide helpful error message
-        try:
-            network_folder = locator.get_thermal_network_folder()
-            available_layouts = [name for name in os.listdir(network_folder)
-                                if os.path.isdir(os.path.join(network_folder, name))
-                                and name not in {'DH', 'DC'}]
-            if available_layouts:
-                raise ValueError(
-                    f"Network name is required. Please select a network layout.\n"
-                    f"Available layouts: {', '.join(available_layouts)}"
-                )
-            else:
-                raise ValueError(
-                    "Network name is required, but no network layouts found.\n"
-                    "Please create or import a network layout using 'thermal-network-layout'."
-                )
-        except Exception:
-            raise ValueError("Network name is required. Please select a network layout.")
+        network_folder = locator.get_thermal_network_folder()
+        try:
+            entries = os.listdir(network_folder)
+        except (FileNotFoundError, PermissionError, OSError):
+            raise ValueError(
+                "Network name is required, but the thermal network folder could not be accessed.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )
+        
+        available_layouts = [
+            name for name in entries
+            if os.path.isdir(os.path.join(network_folder, name))
+            and name not in {'DH', 'DC'}
+        ]
+        if available_layouts:
+            raise ValueError(
+                f"Network name is required. Please select a network layout.\n"
+                f"Available layouts: {', '.join(available_layouts)}"
+            )
+        else:
+            raise ValueError(
+                "Network name is required, but no network layouts found.\n"
+                "Please create or import a network layout using 'thermal-network-layout'."
+            )

3442-3442: 🔴 CRITICAL: Type mismatch in check_heating_cooling_demand call.

Line 3442 calls check_heating_cooling_demand(locator, config) within the per-network loop, but the function expects a scalar network_type string. Currently:

  1. config.thermal_network.network_type at line 3430 is a list (e.g., ["DH", "DC"])
  2. Inside check_heating_cooling_demand (line 3379), line 3381 extracts this list: network_type = config.thermal_network.network_type
  3. The function performs scalar comparisons if network_type == "DH": (line 3383)
  4. List == string comparisons always return False, breaking the logic

This was flagged in previous reviews but remains unfixed.

Fix: Pass the scalar network_type from the loop iteration:

             elif network_model == 'detailed':
-                check_heating_cooling_demand(locator, config)
+                check_heating_cooling_demand(locator, network_type)
                 # Create a per-network config section with the correct network_type

And update the function signature (line 3379):

-def check_heating_cooling_demand(locator, config):
+def check_heating_cooling_demand(locator, network_type):
     # local variables
-    network_type = config.thermal_network.network_type
     total_demand = pd.read_csv(locator.get_total_demand())
cea/technologies/network_layout/main.py (1)

567-574: Remove unreachable dead code.

Lines 544-550 raise a ValueError if unmatched_buildings is non-empty, making the warning code at lines 567-574 unreachable. This was flagged in past review comments but remains in the code.

Apply this diff:

-    if unmatched_buildings:
-        print(f"  ⚠ Warning: {len(unmatched_buildings)} plant building(s){label} not found in zone:")
-        for building in unmatched_buildings[:5]:
-            print(f"      - {building}")
-        if len(unmatched_buildings) > 5:
-            print(f"      ... and {len(unmatched_buildings) - 5} more")
-        print(f"    Available buildings in zone: {', '.join(sorted(available_buildings)[:5])}" +
-              (f" ... and {len(available_buildings) - 5} more" if len(available_buildings) > 5 else ""))

Based on past review comments.

🧹 Nitpick comments (1)
cea/technologies/network_layout/main.py (1)

977-980: Remove or document unused fields.

The TODO comment indicates these fields are unused:

  • pipe_diameter
  • type_mat
  • consider_only_buildings_with_demand

Past review comments requested addressing this. Either remove these fields if truly unused, or document their purpose and make them optional with clear defaults.

Based on past review comments.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f4489a and 84cc1dd.

📒 Files selected for processing (2)
  • cea/technologies/network_layout/main.py (2 hunks)
  • cea/technologies/thermal_network/thermal_network.py (13 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
cea/technologies/thermal_network/thermal_network.py (3)
cea/config.py (2)
  • locator (578-581)
  • get_number_of_processes (246-257)
cea/inputlocator.py (7)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1007-1019)
  • get_thermal_network_layout_massflow_nodes_file (1107-1109)
  • get_thermal_network_substation_ploss_file (1123-1125)
  • get_network_temperature_return_nodes_file (1115-1117)
  • get_network_layout_edges_shapefile (993-1005)
  • get_thermal_network_folder (1022-1024)
cea/technologies/thermal_network/utility.py (1)
  • extract_network_from_shapefile (10-89)
cea/technologies/network_layout/main.py (7)
cea/technologies/network_layout/connectivity_potential.py (1)
  • calc_connectivity_network_with_geometry (934-1033)
cea/technologies/network_layout/steiner_spanning_tree.py (1)
  • calc_steiner_spanning_tree (63-543)
cea/technologies/network_layout/plant_node_operations.py (3)
  • add_plant_close_to_anchor (67-165)
  • get_next_node_name (29-45)
  • get_next_pipe_name (48-64)
cea/technologies/network_layout/substations_location.py (1)
  • calc_building_centroids (21-58)
cea/technologies/network_layout/graph_utils.py (2)
  • normalize_gdf_geometries (375-406)
  • normalize_geometry (340-372)
cea/optimization_new/user_network_loader.py (2)
  • load_user_defined_network (45-111)
  • validate_network_covers_district_buildings (405-518)
cea/inputlocator.py (7)
  • get_building_supply (905-907)
  • get_database_assemblies_supply_heating (793-795)
  • get_database_assemblies_supply_cooling (785-787)
  • get_total_demand (1269-1271)
  • get_zone_building_names (897-903)
  • get_street_network (1170-1173)
  • get_zone_geometry (878-882)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests (ubuntu-latest)
  • GitHub Check: tests (windows-latest)
  • GitHub Check: tests (macos-latest)
🔇 Additional comments (2)
cea/technologies/thermal_network/thermal_network.py (2)

139-139: LGTM: Constructor call correctly updated with network_name.

The clone() method now properly passes self.network_name to the ThermalNetwork constructor, which aligns with the updated constructor signature at line 64.


197-232: Good: Enhanced error messages and network-specific path resolution.

The updated path resolution with network_name and the helpful file-not-found error messages (lines 201-216) improve user experience. The filter_edges=True parameter at line 232 correctly handles the case where layout.shp contains edges for all network types while nodes are network-specific.

Comment on lines +69 to +83
for idx, row in nodes_full.iterrows():
type_value = str(row['type_original']).strip()

if type_value.upper() in plant_types:
# It's a plant node
nodes_full.loc[idx, 'type'] = type_value.upper()
nodes_full.loc[idx, 'building'] = 'NONE'
elif type_value.upper() == 'NONE':
# It's a junction node
nodes_full.loc[idx, 'type'] = 'NONE'
nodes_full.loc[idx, 'building'] = 'NONE'
else:
# It's a building node (consumer)
nodes_full.loc[idx, 'building'] = type_value
nodes_full.loc[idx, 'type'] = 'CONSUMER'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add validation for empty or invalid type values.

The loop strips type_value but doesn't validate that the result is non-empty. If a node has an empty, null, or whitespace-only type value after stripping, line 82 will execute, setting building='' and type='CONSUMER', which creates an invalid consumer node with no building name.

Past review comments flagged this issue but it remains unresolved.

Apply this fix:

     for idx, row in nodes_full.iterrows():
         type_value = str(row['type_original']).strip()
 
+        if not type_value or type_value.upper() == 'NAN':
+            raise ValueError(
+                f"Invalid node at index {idx}: 'type' field is empty or null. "
+                "Each node must have a valid type (building name, 'NONE', or plant type)."
+            )
+
         if type_value.upper() in plant_types:

Based on past review comments.

🤖 Prompt for AI Agents
In cea/technologies/network_layout/main.py around lines 69 to 83, validate
type_value before classifying: check for null/NaN (pandas.isna) or
empty/whitespace-only after stripping and treat those as junctions (set
nodes_full.loc[idx,'type']='NONE' and nodes_full.loc[idx,'building']='NONE') or
alternatively log/warn and skip the row; only proceed to the existing
plant/none/consumer branches when type_value is non-empty, ensuring you never
assign an empty string to building for a CONSUMER.

Comment on lines +131 to +156
def get_buildings_from_supply_csv(locator, network_type):
"""
Read supply.csv and return list of buildings configured for district heating/cooling.
:param locator: InputLocator instance
:param network_type: "DH" or "DC"
:return: List of building names
"""
supply_df = pd.read_csv(locator.get_building_supply())

# Read assemblies database to map codes to scale (DISTRICT vs BUILDING)
if network_type == "DH":
assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_heating())
system_type_col = 'supply_type_hs'
else: # DC
assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_cooling())
system_type_col = 'supply_type_cs'

# Create mapping: code -> scale (DISTRICT/BUILDING)
scale_mapping = assemblies_df.set_index('code')['scale'].to_dict()

# Filter buildings with DISTRICT scale
supply_df['scale'] = supply_df[system_type_col].map(scale_mapping)
district_buildings = supply_df[supply_df['scale'] == 'DISTRICT']['name'].tolist()

return district_buildings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add defensive error handling for missing files and columns.

Past review comments requested error handling for this function, marked as addressed, but no defensive checks are visible in the current code. Without validation:

  • Missing CSV files will raise unhandled FileNotFoundError
  • Missing columns ('code', 'scale', system_type_col, 'name') will cause KeyError
  • Unmapped supply types will produce NaN values in the scale column and be silently filtered

Apply this pattern:

def get_buildings_from_supply_csv(locator, network_type):
    """..."""
    try:
        supply_df = pd.read_csv(locator.get_building_supply())
    except FileNotFoundError as e:
        raise FileNotFoundError(f"supply.csv not found at {locator.get_building_supply()}") from e
    
    # Read assemblies database
    if network_type == "DH":
        try:
            assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_heating())
        except FileNotFoundError as e:
            raise FileNotFoundError("SUPPLY_HEATING.csv not found") from e
        system_type_col = 'supply_type_hs'
    else:
        try:
            assemblies_df = pd.read_csv(locator.get_database_assemblies_supply_cooling())
        except FileNotFoundError as e:
            raise FileNotFoundError("SUPPLY_COOLING.csv not found") from e
        system_type_col = 'supply_type_cs'
    
    # Validate required columns
    if 'name' not in supply_df.columns or system_type_col not in supply_df.columns:
        raise ValueError(f"supply.csv missing required columns: 'name' or '{system_type_col}'")
    if 'code' not in assemblies_df.columns or 'scale' not in assemblies_df.columns:
        raise ValueError("Assemblies database missing required columns: 'code' or 'scale'")
    
    # Create mapping and handle unmapped types
    scale_mapping = assemblies_df.set_index('code')['scale'].to_dict()
    supply_df['scale'] = supply_df[system_type_col].map(scale_mapping)
    
    # Warn about unmapped supply types
    unmapped = supply_df[supply_df['scale'].isna()]
    if not unmapped.empty:
        print(f"  ⚠ Warning: {len(unmapped)} building(s) have supply types not in assemblies database")
    
    district_buildings = supply_df[supply_df['scale'] == 'DISTRICT']['name'].tolist()
    return district_buildings

Based on past review comments.

🤖 Prompt for AI Agents
In cea/technologies/network_layout/main.py around lines 131-156, the function
get_buildings_from_supply_csv lacks defensive checks: add try/except around
pd.read_csv calls to raise informative FileNotFoundError messages for missing
supply and assemblies CSVs depending on network_type; validate required columns
exist ('name' and system_type_col in supply_df, 'code' and 'scale' in
assemblies_df) and raise ValueError with clear messages if missing; build the
scale_mapping only after validating assemblies_df and map
supply_df[system_type_col] to a new 'scale' column, detect unmapped supply types
(NaN in 'scale') and log or warn with a count, then filter for scale ==
'DISTRICT' to return names.

Comment on lines +159 to +175
def get_buildings_with_demand(locator, network_type):
"""
Read total_demand.csv and return list of buildings with heating/cooling demand.
:param locator: InputLocator instance
:param network_type: "DH" or "DC"
:return: List of building names
"""
total_demand = pd.read_csv(locator.get_total_demand())

if network_type == "DH":
field = "QH_sys_MWhyr"
else: # DC
field = "QC_sys_MWhyr"

buildings_with_demand = total_demand[total_demand[field] > 0.0]['name'].tolist()
return buildings_with_demand
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add defensive error handling for missing files and columns.

Similar to get_buildings_from_supply_csv, this function lacks defensive checks as requested in past review comments.

Apply this pattern:

def get_buildings_with_demand(locator, network_type):
    """..."""
    try:
        total_demand = pd.read_csv(locator.get_total_demand())
    except FileNotFoundError as e:
        raise FileNotFoundError(f"Total_demand.csv not found at {locator.get_total_demand()}") from e
    
    if network_type == "DH":
        field = "QH_sys_MWhyr"
    else:
        field = "QC_sys_MWhyr"
    
    if field not in total_demand.columns:
        raise ValueError(f"Total_demand.csv missing required column: {field}")
    if 'name' not in total_demand.columns:
        raise ValueError("Total_demand.csv missing required column: name")
    
    buildings_with_demand = total_demand[total_demand[field] > 0.0]['name'].tolist()
    return buildings_with_demand

Based on past review comments.

🤖 Prompt for AI Agents
In cea/technologies/network_layout/main.py around lines 159 to 175, add
defensive error handling: wrap the pd.read_csv(locator.get_total_demand()) call
in a try/except to catch FileNotFoundError and rethrow with a message including
the path; determine field based on network_type as before, then verify that both
the chosen field (QH_sys_MWhyr or QC_sys_MWhyr) and 'name' exist in
total_demand.columns and raise ValueError with clear messages if missing; keep
the final filtering logic unchanged once checks pass.

Comment on lines +386 to +432
# Try to find the building node (it might be in the network or in the zone but not connected)
plant_node = building_nodes[building_nodes['building'] == plant_building]
temp_node_added = False

if plant_node.empty:
# Plant building not in network - try to find it in zone geometry
building_in_zone = zone_gdf[zone_gdf['name'] == plant_building]
if not building_in_zone.empty:
# Add the building temporarily so we can create a plant near it
building_geom = building_in_zone.iloc[0].geometry

# Generate unique node name
node_name = get_next_node_name(nodes_gdf)

# Normalize centroid coordinates for consistency
centroid_normalized = normalize_geometry(building_geom.centroid, SHAPEFILE_TOLERANCE)

new_node = gpd.GeoDataFrame([{
'name': node_name,
'building': plant_building,
'type': 'CONSUMER',
'geometry': centroid_normalized
}], crs=nodes_gdf.crs)
nodes_gdf = pd.concat([nodes_gdf, new_node], ignore_index=True)
plant_node = nodes_gdf[nodes_gdf['building'] == plant_building]
temp_node_added = True
else:
print(f" ⚠ Warning: Plant building '{plant_building}' not found in zone geometry, skipping")
continue

# Create a new PLANT node near this building
building_anchor = plant_node
nodes_gdf, edges_gdf = add_plant_close_to_anchor(
building_anchor,
nodes_gdf,
edges_gdf,
'T1', # Default pipe material
150 # Default pipe diameter
)

# The newly created plant node is the last one
plant_node_idx = nodes_gdf.index[-1]
plant_node_name = nodes_gdf.loc[plant_node_idx, 'name']

# Remove the temporary building node if we added one
if temp_node_added:
nodes_gdf = nodes_gdf[nodes_gdf['building'] != plant_building]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Potential issue: temp building node removal may delete the newly created plant.

When a plant building isn't present in the network, the code:

  1. Adds a temporary CONSUMER node at the building centroid (lines 395-411)
  2. Calls add_plant_close_to_anchor to create a PLANT node near it (lines 418-424)
  3. Removes all nodes with building == plant_building (line 432)

However, if add_plant_close_to_anchor duplicates the temporary node to create the plant (which can happen when there are no NONE nodes available), the newly created plant node will also have building == plant_building, causing step 3 to delete both the temporary consumer node AND the newly created plant node.

A safer approach is to remove only the temporary CONSUMER node by tracking its index:

             temp_node_added = False

             if plant_node.empty:
                 # Plant building not in network - try to find it in zone geometry
                 building_in_zone = zone_gdf[zone_gdf['name'] == plant_building]
                 if not building_in_zone.empty:
                     # Add the building temporarily so we can create a plant near it
                     building_geom = building_in_zone.iloc[0].geometry

                     # Generate unique node name
                     node_name = get_next_node_name(nodes_gdf)

                     # Normalize centroid coordinates for consistency
                     centroid_normalized = normalize_geometry(building_geom.centroid, SHAPEFILE_TOLERANCE)

                     new_node = gpd.GeoDataFrame([{
                         'name': node_name,
                         'building': plant_building,
                         'type': 'CONSUMER',
                         'geometry': centroid_normalized
                     }], crs=nodes_gdf.crs)
                     nodes_gdf = pd.concat([nodes_gdf, new_node], ignore_index=True)
+                    temp_node_idx = nodes_gdf.index[-1]  # Track the temp node index
                     plant_node = nodes_gdf[nodes_gdf['building'] == plant_building]
                     temp_node_added = True
                 else:
                     print(f"  ⚠ Warning: Plant building '{plant_building}' not found in zone geometry, skipping")
                     continue

             # Create a new PLANT node near this building
             building_anchor = plant_node
             nodes_gdf, edges_gdf = add_plant_close_to_anchor(
                 building_anchor,
                 nodes_gdf,
                 edges_gdf,
                 'T1',  # Default pipe material
                 150    # Default pipe diameter
             )

             # The newly created plant node is the last one
             plant_node_idx = nodes_gdf.index[-1]
             plant_node_name = nodes_gdf.loc[plant_node_idx, 'name']

             # Remove the temporary building node if we added one
             if temp_node_added:
-                nodes_gdf = nodes_gdf[nodes_gdf['building'] != plant_building]
+                # Remove only the temporary CONSUMER node by index
+                nodes_gdf = nodes_gdf.drop(temp_node_idx)
+                nodes_gdf = nodes_gdf.reset_index(drop=True)

Based on similar issues flagged in past review comments.

Comment on lines 1190 to 1196
if auto_created_buildings:
print(f" Auto-created {len(auto_created_buildings)} missing building node(s):")
for building_name in auto_created_buildings[:10]:
print(f" - {building_name}")
if len(auto_created_buildings) > 10:
print(f" ... and {len(auto_created_buildings) - 10} more")
print(" Note: Nodes created at edge endpoints closest to building centroids (in-memory only)")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Remove unreachable dead code.

The function validate_network_covers_district_buildings (from relevant code snippets) never populates auto_created_buildings—it only validates and raises errors for missing buildings, returning (nodes_gdf, []). Therefore, lines 1190-1196 are unreachable.

Past review comments flagged this issue multiple times and marked it as addressed in commits, but the code remains.

Additionally, there's a critical docstring–implementation mismatch in validate_network_covers_district_buildings:

  • Docstring claims: "Auto-create missing nodes if edges reach the building"
  • Implementation reality: Raises errors instead; never auto-creates nodes

Either implement the auto-creation logic described in the docstring, or remove lines 1190-1196 and update the function's docstring to accurately reflect that it only validates and raises errors.

Based on past review comments.

🤖 Prompt for AI Agents
In cea/technologies/network_layout/main.py around lines 1190-1196, the block
printing auto_created_buildings is dead/unreachable because
validate_network_covers_district_buildings never populates that list (it returns
(nodes_gdf, []) and raises on missing nodes) and the docstring incorrectly
states it “auto-creates” nodes; remove the unreachable print block (lines
1190-1196) and update the function docstring to state that the function only
validates coverage and raises errors for missing building nodes (or, if you
prefer auto-creation, implement the creation logic and populate
auto_created_buildings consistently across the function). Ensure any references
to auto_created_buildings elsewhere in the function are removed or handled, and
run tests to confirm no remaining references to the removed variable.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (8)
cea/scripts.yml (1)

247-260: Consider declaring layout nodes and edges as thermal-network inputs again

thermal_network.py still depends on the layout nodes/edges shapefiles, but they are no longer listed under input-files for thermal-network. This means CLI/dashboard pre-checks won’t catch missing layouts and failures will occur later at runtime.

Recommend adding both layout files here, wired to thermal-network:network-type and thermal-network:network-name, e.g.:

   - name: thermal-network
@@
-    input-files:
-      - [get_demand_results_file, building_name]
-      - [get_db4_components_conversion_folder]
-      - [get_database_components_distribution_thermal_grid]
-      - [get_weather_file]
+    input-files:
+      - [get_demand_results_file, building_name]
+      - [get_db4_components_conversion_folder]
+      - [get_database_components_distribution_thermal_grid]
+      - [get_weather_file]
+      - [get_network_layout_nodes_shapefile, "thermal-network:network-type", "thermal-network:network-name"]
+      - [get_network_layout_edges_shapefile, "thermal-network:network-type", "thermal-network:network-name"]
cea/technologies/network_layout/main.py (3)

502-577: Remove unreachable unmatched_buildings logging in resolve_plant_buildings

unmatched_buildings is checked twice: first to raise a ValueError, then later to print warnings. The second block (lines 567–574) is unreachable:

if unmatched_buildings:
    raise ValueError(...)
...
if unmatched_buildings:
    print("  ⚠ Warning: ...")  # never executed

Drop the second block to avoid dead code:

-    if unmatched_buildings:
-        print(f"  ⚠ Warning: {len(unmatched_buildings)} plant building(s){label} not found in zone:")
-        ...
-        print(f"    Available buildings in zone: ...")

971-983: Clean up unused fields in NetworkLayout dataclass

NetworkLayout currently defines several fields that are not used anywhere in this module:

  • pipe_diameter
  • type_mat
  • consider_only_buildings_with_demand
  • disconnected_buildings

Since auto_layout_network now reads those flags directly from config.network_layout, consider either removing these dataclass fields or documenting their intended future use and making them optional. Keeping dead fields around can confuse callers about what is actually honoured.


30-97: Guard against empty/NaN type values when converting simplified nodes

convert_simplified_nodes_to_full_format treats any non-NONE / non-PLANT* type_original as a building name. If the source has null, NaN, or whitespace-only types, those become bogus building IDs ("nan", ""), which then propagate into validation (e.g., as “extra buildings” not in the zone).

Add an explicit check before classification, e.g.:

-    for idx, row in nodes_full.iterrows():
-        type_value = str(row['type_original']).strip()
+    for idx, row in nodes_full.iterrows():
+        raw_type = row['type_original']
+        # Reject empty / null type values early
+        if pd.isna(raw_type) or str(raw_type).strip() == '':
+            raise ValueError(
+                f"Invalid node at index {idx}: 'type' field is empty or null. "
+                "Each node must have a valid type (building name, 'NONE', or plant type)."
+            )
+        type_value = str(raw_type).strip()

This prevents accidental creation of consumer nodes with invalid building names.

cea/default.config (2)

1673-1677: connection-candidates range still only documented, not enforced

The help text constrains connection-candidates to 1–5 and explains behavior differences by value and algorithm, but the parameter is a plain IntegerParameter with no min/max. Invalid values (e.g. 0 or 10) would only be caught deeper in the pipeline, if at all.

If the config system supports bounded integers, consider enforcing the range here; otherwise, add explicit validation when reading the config and fail fast or clamp values.

This was raised previously; if you’ve decided against hard validation, consider at least clarifying in documentation that out-of-range values are undefined behavior.


1641-1660: Fix help texts to reference actual parameter keys for user‑defined layout paths

The three help strings still reference the old user-* parameter names, which no longer exist:

  • edges-shp-path.help mentions 'user-nodes-shp-path' and 'user-network-geojson-path'.
  • nodes-shp-path.help mentions 'user-edges-shp-path' and 'user-network-geojson-path'.
  • network-geojson-path.help mentions 'user-edges-shp-path' and 'user-nodes-shp-path'.

Given the current keys are edges-shp-path, nodes-shp-path, and network-geojson-path, the texts should be updated to avoid confusing users.

Suggested minimal diff (keep all other text unchanged):

-edges-shp-path.help = Path to user-defined edge shapefile. If provided, bypasses automatic layout generation. Must be used with 'user-nodes-shp-path'. Do NOT use with 'user-network-geojson-path' (mutually exclusive).
+edges-shp-path.help = Path to user-defined edge shapefile. If provided, bypasses automatic layout generation. Must be used with 'nodes-shp-path'. Do NOT use with 'network-geojson-path' (mutually exclusive).
@@
-nodes-shp-path.help = Path to user-defined node shapefile. Required attributes: 'building' (str) matching building names, 'type' (str) node type (CONSUMER/PLANT/NONE). If provided, bypasses automatic layout generation. Must be used with 'user-edges-shp-path'. Do NOT use with 'user-network-geojson-path' (mutually exclusive).
+nodes-shp-path.help = Path to user-defined node shapefile. Required attributes: 'building' (str) matching building names, 'type' (str) node type (CONSUMER/PLANT/NONE). If provided, bypasses automatic layout generation. Must be used with 'edges-shp-path'. Do NOT use with 'network-geojson-path' (mutually exclusive).
@@
-network-geojson-path.help = Path to user-defined network GeoJSON file with combined nodes (Points) and edges (LineStrings). Node features require 'building' (str) and 'type' (str) attributes. Edge features require 'type_mat' attribute (str). Do NOT use with 'user-edges-shp-path' or 'user-nodes-shp-path' (mutually exclusive).
+network-geojson-path.help = Path to user-defined network GeoJSON file with combined nodes (Points) and edges (LineStrings). Node features require 'building' (str) and 'type' (str) attributes. Edge features require 'type_mat' attribute (str). Do NOT use with 'edges-shp-path' or 'nodes-shp-path' (mutually exclusive).
cea/config.pyi (1)

931-995: Harmonize network_name typing across sections

network_name is annotated inconsistently:

  • ThermalNetworkSection.network_name: str
  • ThermalNetworkOptimizationSection.network_name: Any
  • OptimizationNewSection.network_name: str
  • PlotsSection.network_name: Any
  • NetworkLayoutSection.network_name: str

Given default.config uses string-based parameters for all of these (with some allowing “empty means default”), it would be better to choose a single convention, e.g. str | None where optional, or str where required, and apply it consistently.

For example:

 class ThermalNetworkOptimizationSection(Section):
@@
-    network_name: Any
+    network_name: str | None
@@
-    @overload
-    def __getattr__(self, item: Literal["network_name"]) -> Any: ...
+    @overload
+    def __getattr__(self, item: Literal["network_name"]) -> str | None: ...
@@
 class PlotsSection(Section):
@@
-    network_name: Any
+    network_name: str | None
@@
-    @overload
-    def __getattr__(self, item: Literal["network_name"]) -> Any: ...
+    @overload
+    def __getattr__(self, item: Literal["network_name"]) -> str | None: ...

If some of these really cannot be None at runtime, documenting that and using plain str everywhere would also be fine—just avoid mixing Any with str arbitrarily.

#!/bin/bash
# Grep for network_name usages to confirm expected nullability.
rg -n "network_name" cea/ -C3

Also applies to: 996-1005, 1041-1041, 1131-1166, 1233-1253, 1358-1374

cea/inputlocator.py (1)

979-986: Fix duplicated prefixes and double underscores in thermal-network result filenames

The combination of _get_thermal_network_results_file_path and its callers currently builds malformed filenames:

  • Helper unconditionally does filename = f"{network_type}_{network_name}_{file_suffix}".
  • get_thermal_network_node_types_csv_file and get_thermal_network_edge_list_file pre‑build file_name including the same {network_type}_{network_name}_ prefix and then pass it as file_suffix.

Effects:

  • Named network: "{nt}_{nn}_{nt}_{nn}_metadata_nodes.csv" / ..._metadata_edges.csv.
  • Unnamed network: "DH__DH__metadata_nodes.csv", "DH__DH__metadata_edges.csv" (note double underscores and duplicated DH).

This diverges from the documented convention and makes paths hard to reason about.

Recommend:

  1. Treat the third argument of _get_thermal_network_results_file_path as a pure suffix.
  2. Omit the network_name part from the prefix when it is empty.

Suggested diff:

@@
-    def _get_thermal_network_results_file_path(self, network_type, network_name, file_suffix):
-        """
-        Helper method to resolve the path to thermal network results files.
-        """
-        if network_name:
-            # Named networks: thermal-network/{network_name}/{network_type}/
-            folder = self.get_output_thermal_network_type_folder(network_type, network_name)
-        else:
-            # Default/unnamed networks: thermal-network/
-            folder = self.get_thermal_network_folder()
-        
-        filename = f"{network_type}_{network_name}_{file_suffix}"
-        return os.path.join(folder, filename)
+    def _get_thermal_network_results_file_path(self, network_type, network_name, file_suffix):
+        """
+        Helper to resolve thermal-network result file paths.
+
+        Named:   scenario/outputs/data/thermal-network/{network_name}/{network_type}/
+                 {network_type}_{network_name}_{file_suffix}
+        Default: scenario/outputs/data/thermal-network/
+                 {network_type}_{file_suffix}
+        """
+        if network_name:
+            folder = self.get_output_thermal_network_type_folder(network_type, network_name)
+            filename = f"{network_type}_{network_name}_{file_suffix}"
+        else:
+            folder = self.get_thermal_network_folder()
+            filename = f"{network_type}_{file_suffix}"
+
+        return os.path.join(folder, filename)
@@
-    def get_thermal_network_node_types_csv_file(self, network_type, network_name):
-        """scenario/outputs/data/thermal-network/{network_type}/{network_name}/{network_type}_{network_name}_metadata_nodes.csv"""
-        if not network_name:
-            file_name = network_type + '_' + "_metadata_nodes.csv"
-        else:
-            file_name = network_type + '_' + network_name + '_metadata_nodes.csv'
-
-        return self._get_thermal_network_results_file_path(network_type, network_name, file_name)
+    def get_thermal_network_node_types_csv_file(self, network_type, network_name):
+        """scenario/outputs/data/thermal-network/{network_name}/{network_type}/{network_type}_{network_name}_metadata_nodes.csv"""
+        return self._get_thermal_network_results_file_path(network_type, network_name, "metadata_nodes.csv")
@@
-    def get_thermal_network_edge_list_file(self, network_type, network_name=''):
-        """scenario/outputs/data/thermal-network/{network_type}/{network_name}/{network_type}_{network_name}_metadata_edges.csv"""
-        if not network_name:
-            file_name = network_type + "_" + "_metadata_edges.csv"
-        else:
-            file_name = network_type + "_" + network_name + "_metadata_edges.csv"
-        return self._get_thermal_network_results_file_path(network_type, network_name, file_name)
+    def get_thermal_network_edge_list_file(self, network_type, network_name=''):
+        """scenario/outputs/data/thermal-network/{network_name}/{network_type}/{network_type}_{network_name}_metadata_edges.csv"""
+        return self._get_thermal_network_results_file_path(network_type, network_name, "metadata_edges.csv")

This keeps the helper’s contract simple and matches the intended filename patterns for both named and unnamed networks.

Also applies to: 1041-1053, 1087-1094, 1106-1112

🧹 Nitpick comments (6)
cea/optimization_new/domain.py (2)

111-112: Misleading variable name.

network_name_specified contains the network name value (a string), not a boolean. While this works correctly in the boolean context on Line 112, the variable name suggests it should be a boolean flag.

Apply this diff for clarity:

-                network_name_specified = self.config.optimization_new.network_name
-                is_district = (building.initial_connectivity_state != 'stand_alone') or network_name_specified
+                network_name = self.config.optimization_new.network_name
+                is_district = (building.initial_connectivity_state != 'stand_alone') or network_name

924-930: Redundant connectivity override.

This block overrides initial_connectivity_state to the generic 'network' for all buildings in the base network. However, model_initial_energy_system() (called later within optimize_domain() at line 945) overrides connectivity again with the actual network IDs ('N1001', 'N1002', etc.) at lines 488-495.

Consider removing this block if no code between lines 920 and 945 relies on the connectivity state, or clarify the intent with a comment explaining why the intermediate override is necessary.

-    # If network-name is specified, override connectivity to use network layout
-    # (network layout is the only source of truth, not Supply.csv)
-    if config.optimization_new.network_name and hasattr(current_domain, 'base_building_to_network'):
-        for building in current_domain.buildings:
-            if building.identifier in current_domain.base_building_to_network:
-                building.initial_connectivity_state = 'network'
-                print(f"  Overriding {building.identifier}: initial_connectivity_state = 'network' (from network layout)")
-

If this defensive override is intentional (e.g., for intermediate validation), add a comment explaining why it's needed before model_initial_energy_system() performs the final override.

cea/technologies/network_layout/main.py (2)

131-175: Consider adding defensive checks around CSV structure

Both get_buildings_from_supply_csv and get_buildings_with_demand assume the standard CEA CSV schemas. If users customise inputs, failures will surface as low-level FileNotFoundError/KeyError.

Optional hardening (not strictly required, but improves UX):

  • Wrap pd.read_csv calls in try/except FileNotFoundError and re-raise with paths from the locator.
  • Validate required columns ('name', system_type_col, 'code', 'scale', and the demand field) before indexing, raising a clear ValueError if missing.

You can follow the pattern already used in other parts of the codebase (e.g., user network loader) for consistent diagnostics.


178-499: auto_create_plant_nodes logic is robust after recent fixes

The updated flow correctly:

  • Detects existing PLANT / PLANT_DC / PLANT_DH nodes and prevents conflicting plant-building specs.
  • Repairs missing junction nodes at edge endpoints before building components.
  • Validates component count against expected_num_components with helpful context messages.
  • Creates plants either at user-specified buildings or at anchor-load buildings per component, and only removes the temporary CONSUMER node it added (plant nodes remain since building='NONE').

No functional issues stand out. The O(|edges|·|nodes|) endpoint matching loop is acceptable for typical problem sizes; you can later optimise with spatial indexing if profiling shows it as a bottleneck.

cea/inputlocator.py (1)

979-991: Minor: update docstrings to include /data/ in thermal-network layout paths

The new layout helpers store files under scenario/outputs/data/thermal-network/..., but their docstrings omit the data segment:

  • get_thermal_network_folder_network_name_folder
  • get_network_layout_shapefile
  • get_network_layout_nodes_shapefile_from_part1
  • get_network_layout_edges_shapefile
  • get_network_layout_nodes_shapefile

Consider updating the docstrings so they match the actual paths, e.g.:

-"""scenario/outputs/thermal-network/{network_name}/"""
+"""scenario/outputs/data/thermal-network/{network_name}/"""

and similarly for the others. This keeps the locator documentation aligned with the real folder structure.

Also applies to: 1000-1004, 1008-1032

cea/config.pyi (1)

561-581: Avoid widening many config fields to Any in the type stubs

A number of fields that previously had precise types are now annotated as Any, for example:

  • EmissionsSection.csv_carbon_intensity_column_name
  • DbfToolsSection.output_file_name
  • ShapefileToolsSection.output_file_name
  • ResultSummarySection.folder_name_to_save_exported_results
  • CreateNewScenarioSection.project / scenario
  • RenameBuildingSection.new
  • CreateMixedUseTypeSection.use_type / metadata
  • CreatePolygonSection.filename
  • ServerSection.host
  • SchemasSection.locator_method
  • DevelopmentSection.nsis/conda/gui/yarn/release_version
  • PlotsGeneralSection.plot_title/y_label/x_label

These are all backed by StringParameter (or nullable string) in default.config, so their runtime types are still str or str | None. Using Any here removes most of the benefit of having this .pyi file.

Additionally, network_name typing is inconsistent across sections:

  • ThermalNetworkSection.network_name: str
  • ThermalNetworkOptimizationSection.network_name: Any
  • OptimizationNewSection.network_name: str
  • NetworkLayoutSection.network_name: str

Consider restoring more specific annotations that mirror default.config, e.g.:

 class EmissionsSection(Section):
@@
-    csv_carbon_intensity_column_name: Any
+    csv_carbon_intensity_column_name: str | None
@@
 class DbfToolsSection(Section):
@@
-    output_file_name: Any
+    output_file_name: str
@@
 class ResultSummarySection(Section):
@@
-    folder_name_to_save_exported_results: Any
+    folder_name_to_save_exported_results: str | None

…and similarly for the other fields listed above, ensuring network_name is consistently typed as str across all sections.

This keeps the stubs useful for mypy/pyright users and still matches the underlying config behavior.

Also applies to: 634-646, 691-709, 852-907, 1325-1353, 1442-1466, 1467-1476, 1478-1527, 1529-1568

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 84cc1dd and 5d58d44.

📒 Files selected for processing (8)
  • cea/config.py (5 hunks)
  • cea/config.pyi (22 hunks)
  • cea/default.config (6 hunks)
  • cea/inputlocator.py (3 hunks)
  • cea/optimization_new/domain.py (7 hunks)
  • cea/optimization_new/network.py (2 hunks)
  • cea/scripts.yml (3 hunks)
  • cea/technologies/network_layout/main.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cea/optimization_new/network.py
🧰 Additional context used
🧬 Code graph analysis (4)
cea/optimization_new/domain.py (4)
cea/optimization_new/user_network_loader.py (2)
  • map_buildings_to_networks (855-900)
  • UserNetworkLoaderError (40-42)
cea/optimization_new/building.py (6)
  • demand_flow (44-46)
  • demand_flow (49-54)
  • initial_connectivity_state (83-84)
  • initial_connectivity_state (87-92)
  • check_demand_energy_carrier (219-248)
  • Building (27-309)
cea/inputlocator.py (3)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1022-1034)
  • get_thermal_network_folder (1037-1039)
cea/optimization_new/helperclasses/optimization/connectivity.py (1)
  • Connection (36-86)
cea/config.py (2)
cea/technologies/network_layout/main.py (1)
  • _validate_network_name (995-1020)
cea/inputlocator.py (5)
  • InputLocator (28-1379)
  • get_thermal_network_folder_network_name_folder (979-981)
  • get_thermal_network_folder (1037-1039)
  • get_output_thermal_network_type_folder (983-985)
  • get_network_layout_nodes_shapefile (1022-1034)
cea/technologies/network_layout/main.py (7)
cea/technologies/network_layout/connectivity_potential.py (1)
  • calc_connectivity_network_with_geometry (934-1033)
cea/technologies/network_layout/steiner_spanning_tree.py (1)
  • calc_steiner_spanning_tree (63-543)
cea/technologies/network_layout/plant_node_operations.py (3)
  • add_plant_close_to_anchor (67-165)
  • get_next_node_name (29-45)
  • get_next_pipe_name (48-64)
cea/technologies/network_layout/substations_location.py (1)
  • calc_building_centroids (21-58)
cea/technologies/network_layout/graph_utils.py (2)
  • normalize_gdf_geometries (375-406)
  • normalize_geometry (340-372)
cea/optimization_new/user_network_loader.py (2)
  • load_user_defined_network (45-111)
  • validate_network_covers_district_buildings (405-518)
cea/inputlocator.py (11)
  • get_database_assemblies_supply_heating (793-795)
  • get_database_assemblies_supply_cooling (785-787)
  • get_total_demand (1284-1286)
  • InputLocator (28-1379)
  • get_zone_building_names (897-903)
  • get_zone_geometry (878-882)
  • get_network_layout_shapefile (987-991)
  • get_network_layout_nodes_shapefile (1022-1034)
  • get_output_thermal_network_type_folder (983-985)
  • get_network_layout_edges_shapefile (1008-1020)
  • get_thermal_network_folder_network_name_folder (979-981)
cea/config.pyi (2)
cea/schemas.py (1)
  • locator_method (159-160)
cea/dev/build.py (1)
  • conda (50-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests (ubuntu-latest)
  • GitHub Check: tests (macos-latest)
  • GitHub Check: tests (windows-latest)
🔇 Additional comments (15)
cea/optimization_new/domain.py (4)

18-18: LGTM! Clean imports and attribute declarations.

The new imports and base network attributes are well-typed and clearly documented.

Also applies to: 52-55, 68-71


126-306: LGTM! Comprehensive network loading with excellent error handling.

The load_base_network() method is well-structured with:

  • Clear validation that network_name is required (lines 144-152)
  • Comprehensive error messages for missing networks (lines 154-214)
  • Defensive edge filtering for network-type-specific nodes (lines 220-235)
  • Proper error context wrapping (lines 298-306)

362-374: LGTM! Proper handling of network-supplied buildings.

The filtering logic correctly identifies and skips buildings without supply components (network-supplied buildings), with clear user messaging.


485-522: LGTM! Clean connectivity override with robust parsing.

The base network connectivity override is well-implemented:

  • Helper function _parse_connectivity_id handles multiple formats
  • Good fallback for unexpected formats (line 509)
  • Clear separation of concerns
cea/scripts.yml (2)

233-238: Updated network-layout description looks good

The new wording correctly explains auto-generated vs user-provided layouts and multi-component behavior; no issues from a config/schema perspective.


329-336: The review comment is incorrect—optimization_new accesses its own config section, not thermal-network parameters.

The code in cea/optimization_new/domain.py (lines 97, 140-141, 908, 926, 933) directly accesses config.optimization_new.network_type and config.optimization_new.network_name, not config.thermal_network.*. The current parameter restriction in scripts.yml (parameters: [..., 'optimization-new']) correctly grants access to the optimization-new config section where these parameters belong. Adding thermal-network parameters would be unnecessary and would not address the actual config dependencies.

Likely an incorrect or invalid review comment.

cea/config.py (3)

430-436: Base Parameter.encode / decode signatures are consistent

The typed encode(self, value) -> str / decode(self, value) -> Any keep existing behavior and align with subclass expectations; no functional issues detected.


757-817: NetworkLayoutNameParameter validation is sound

The split between decode (syntax-only, allowing empty) and encode (enforcing non-empty and collision checks under the scenario’s thermal-network folder) is a good fit for interactive editing vs. saving, and the invalid-character guard is appropriate for filesystem safety.


1153-1179: Nullable SingleBuildingParameter behavior

The nullable handling for SingleBuildingParameter (returning None on decode and '' on encode when allowed) is consistent, and dynamically computing _choices from get_zone_building_names remains intact. Just ensure .nullable is only enabled in default.config where an empty building is semantically valid, as the encoder still falls back to self._choices[0] for invalid non-empty values.

cea/technologies/network_layout/main.py (5)

100-129: Network type derivation and demand warnings

determine_network_types cleanly derives the active {'DC','DH'} set and gives a clear error when no thermal services are present, and print_demand_warning centralizes repeated messaging. Both helpers look correct and improve readability.


579-968: auto_layout_network orchestration is consistent with new config model

The auto layout pipeline:

  • Correctly switches between overwrite-supply vs. supply.csv-driven district building sets.
  • Uses bool(connected_buildings_config) to distinguish explicit vs. blank connected-buildings, avoiding the earlier bug where “all buildings” was treated as blank.
  • Applies consider_only_buildings_with_demand per service and tracks separate DC/DH building lists.
  • Skips generation for services with no demand and surfaces a consolidated error if nothing is generated.
  • Persists per-type nodes (DC/ and DH/) and a merged layout.shp with deduplicated geometries and re-numbered pipes.

The control flow is complex but coherent; no obvious correctness issues found.


985-1020: NetworkLayout._validate_network_name collision check

The backend validation that walks include_services and checks for existing {network_name}/{network_type}/layout/{edges,nodes}.shp folders is a good safeguard against accidental overwrites and complements NetworkLayoutNameParameter. The logic and use of locator.get_output_thermal_network_type_folder look consistent with the updated locator API.


1023-1297: User-defined network processing correctly validates and persists plant edges

Key behaviors in process_user_defined_network look correct:

  • Delegates loading/format validation to load_user_defined_network, then enforces that all network buildings exist in zone.shp.
  • Mirrors the overwrite-supply vs. supply.csv logic from auto-layout and validates demand per service.
  • Uses validate_network_covers_district_buildings to ensure district buildings are exactly represented.
  • Calls auto_create_plant_nodes per network type, then:
    • Writes DC/DH-specific nodes shapefiles.
    • Merges DC/DH edges (including plant edges), drops geometric duplicates, renumbers PIPE* names, normalises geometries, and writes layout.shp.

This addresses the earlier issue where plant edges for user-defined networks were not persisted to layout.shp.


1304-1343: main(): error handling and routing are appropriate

The main entry point:

  • Routes cleanly between user-defined and automatic layout flows.
  • Validates and normalises the network name via NetworkLayout.from_config.
  • On any exception, removes the partially created {scenario}/outputs/data/thermal-network/{network_name} folder, preventing stale or half-written layouts.

No further changes suggested here.

cea/default.config (1)

944-950: Thermal-network network-name / network-type refactor looks consistent

Using NetworkLayoutChoiceParameter plus MultiChoiceParameter for network-type aligns with the new multi-network design and the type stubs. No issues from the config side here.

Comment on lines +869 to +905
class NetworkLayoutChoiceParameter(ChoiceParameter):
"""
Parameter for selecting existing network layouts based on network type.
"""

_network_types = {'DH', 'DC'}

def initialize(self, parser):
# Override to dynamically populate choices based on available networks
pass

@property
def _choices(self):
networks = self._get_available_networks()
sorted_networks = self._sort_networks_by_modification_time(networks)
return sorted_networks

def _get_available_networks(self) -> List[str]:
locator = cea.inputlocator.InputLocator(self.config.scenario)
network_folder = locator.get_thermal_network_folder()
if not os.path.exists(network_folder):
return []
return [name for name in os.listdir(network_folder)
if os.path.isdir(os.path.join(network_folder, name)) and name not in self._network_types]

def _get_network_file_paths(self, network_type: str, network_name: str) -> Tuple[str, str]:
"""Get path for network node and edge files for the given network name"""
locator = cea.inputlocator.InputLocator(self.config.scenario)

network_type_folder = locator.get_output_thermal_network_type_folder(network_type, network_name)
# Remove trailing slash/separator if present
network_type_folder = network_type_folder.rstrip(os.sep)

edges_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)
nodes_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)

return edges_path, nodes_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix edges path construction and avoid shapefile getter side-effects

NetworkLayoutChoiceParameter._get_network_file_paths currently has two issues:

  1. edges_path mistakenly calls the nodes locator, so both paths point to the nodes shapefile:
    edges_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)
    nodes_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)
  2. Using get_network_layout_*_shapefile introduces check_cpg side-effects when merely checking for existence, which can raise if .cpg is missing and break network discovery.

Use the network-type folder and join the layout filenames directly instead:

     def _get_network_file_paths(self, network_type: str, network_name: str) -> Tuple[str, str]:
-        """Get path for network node and edge files for the given network name"""
-        locator = cea.inputlocator.InputLocator(self.config.scenario)
-
-        network_type_folder = locator.get_output_thermal_network_type_folder(network_type, network_name)
-        # Remove trailing slash/separator if present
-        network_type_folder = network_type_folder.rstrip(os.sep)
-
-        edges_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)
-        nodes_path = locator.get_network_layout_nodes_shapefile(network_type, network_name)
-
-        return edges_path, nodes_path
+        """Get paths for network edge and node files for the given network name (no locator side-effects)."""
+        locator = cea.inputlocator.InputLocator(self.config.scenario)
+        base = locator.get_output_thermal_network_type_folder(network_type, network_name).rstrip(os.sep)
+        edges_path = os.path.join(base, "layout", "edges.shp")
+        nodes_path = os.path.join(base, "layout", "nodes.shp")
+        return edges_path, nodes_path

This fixes the edges/nodes mix-up and keeps _choices computation robust even when .cpg files are absent.

Comment on lines +932 to +937
# Load base network again if not already loaded
if not config.optimization_new.network_name:
start_time = time.time()
current_domain.load_base_network()
end_time = time.time()
print(f"Time elapsed for loading/validating base network: {end_time - start_time} s")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Logic error: unreachable code path will always fail.

The condition checks if not config.optimization_new.network_name:, but load_base_network() requires network_name to be set—it raises UserNetworkLoaderError if network_name is missing or empty (lines 144-152). This branch will always fail if executed.

If the intent is to optionally load the base network when not already loaded, the logic should be:

-    # Load base network again if not already loaded
-    if not config.optimization_new.network_name:
+    # Load base network if it wasn't loaded earlier (e.g., network_name set but not loaded yet)
+    if config.optimization_new.network_name and not hasattr(current_domain, 'base_building_to_network'):
         start_time = time.time()
         current_domain.load_base_network()
         end_time = time.time()
         print(f"Time elapsed for loading/validating base network: {end_time - start_time} s")

Alternatively, if base network loading should always happen earlier (lines 908-912), remove this block entirely.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Load base network again if not already loaded
if not config.optimization_new.network_name:
start_time = time.time()
current_domain.load_base_network()
end_time = time.time()
print(f"Time elapsed for loading/validating base network: {end_time - start_time} s")
# Load base network if it wasn't loaded earlier (e.g., network_name set but not loaded yet)
if config.optimization_new.network_name and not hasattr(current_domain, 'base_building_to_network'):
start_time = time.time()
current_domain.load_base_network()
end_time = time.time()
print(f"Time elapsed for loading/validating base network: {end_time - start_time} s")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants