Changes in code to aggregate single files together by dyutivartak · Pull Request #2 · Seafood-Globalization-Lab/US_Census_Trade_Data_Requests

dyutivartak · 2025-12-08T23:53:48Z

Description

Updated 02-aggregate_data.py to correctly aggregate CSV files from the nested output/ directory structure. The script now recursively searches through all date-stamped subdirectories (us-census-data-YYYYMMDD/imports/ and us-census-data-YYYYMMDD/exports/) and combines all CSV files into two consolidated datasets.

Key Changes:

Replaced flat directory reading with recursive file discovery via glob.glob()
Updated input paths from Raw_Data/imports_all_files/ and Raw_Data/exports_all_files/ to output/*/imports and output/*/exports
Changed output paths to output/imports_combined.csv and output/exports_combined.csv
Added progress indicators showing file processing status (every 100 files)
Added error handling to skip invalid or empty files gracefully
Added filtering to exclude request_log.csv files from aggregation
Improved console output with formatted sections and record counts

This change ensures the aggregation script correctly processes all data files regardless of their location in the nested directory structure.
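For reviewers, the discovery and aggregation steps listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in 02-aggregate_data.py: the function names `find_csv_files` and `combine_csv_files` are hypothetical, and the sketch assumes every data file shares the same header row (files whose header differs are skipped, which is a choice made for this sketch only).

```python
import csv
import glob
import os


def find_csv_files(root, flow):
    """Return sorted CSV paths under root/*/<flow>/, excluding request_log.csv."""
    # "**" with recursive=True also matches files sitting directly in <flow>/.
    pattern = os.path.join(root, "*", flow, "**", "*.csv")
    paths = glob.glob(pattern, recursive=True)
    return sorted(p for p in paths if os.path.basename(p) != "request_log.csv")


def combine_csv_files(paths, out_path, progress_every=100):
    """Concatenate CSV files into out_path; return (files_used, records_written)."""
    files_used = 0
    records = 0
    header = None
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for i, path in enumerate(paths, start=1):
            try:
                with open(path, newline="", encoding="utf-8") as src:
                    rows = list(csv.reader(src))
            except (OSError, UnicodeDecodeError, csv.Error) as exc:
                print(f"Skipping {path}: {exc}")
                continue
            if not rows:
                print(f"Skipping {path}: empty file")
                continue
            if header is None:
                # The first usable file defines the header of the combined output.
                header = rows[0]
                writer.writerow(header)
            elif rows[0] != header:
                # Assumption for this sketch: mismatched headers are skipped.
                print(f"Skipping {path}: header differs")
                continue
            writer.writerows(rows[1:])
            files_used += 1
            records += len(rows) - 1
            if i % progress_every == 0:
                print(f"Processed {i}/{len(paths)} files")
    return files_used, records
```

Called as `combine_csv_files(find_csv_files("output", "imports"), "output/imports_combined.csv")`, the combined file is written directly under output/, so the output/*/imports pattern does not pick it up on a later run.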

How Has This Been Tested?

Test Results:

Successfully processed 7,769 import CSV files → output/imports_combined.csv (1,470,643 records, 230 MB)
Successfully processed 2,726 export CSV files → output/exports_combined.csv (1,376,889 records, 212 MB)
Verified that request_log.csv files are correctly excluded
Confirmed progress indicators display correctly during processing
Verified output files are created with proper formatting and record counts
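The record counts and file sizes above could be re-checked independently with something like the sketch below. The helper name `summarize_csv` is hypothetical, and it assumes each combined file has exactly one header row and that sizes are reported in decimal megabytes (10^6 bytes); neither detail is stated in this PR.

```python
import csv
import os


def summarize_csv(path):
    """Return (data_rows, size_mb) for a CSV file with a single header row."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row if the file is not empty
        # csv.reader counts logical records, so quoted newlines are handled.
        rows = sum(1 for _ in reader)
    size_mb = os.path.getsize(path) / 1_000_000  # decimal MB (assumption)
    return rows, size_mb
```

For example, `summarize_csv("output/imports_combined.csv")` returns the number of data rows and the file size, which can be compared against the figures reported above.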

Screenshots (if appropriate):

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My code follows the code style of this project.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.

Changes in code to aggregate single files together

0979e64

dyutivartak wants to merge 1 commit into main from dyuti-develop