Open
Labels
enhancement: New feature or request
Description
Overview
Add advanced CSV operations that maintain memory efficiency while providing more complex data transformations.
Motivation
We have 18 CSV functions including token-saving inspection tools. These advanced operations would enable more complex workflows while staying memory-efficient for large files.
Proposed Functions
Medium Priority - Data Transformation
- sort_csv_rows: Sort by column (with sampling strategy for large files)
- aggregate_csv_column: Compute sum/avg/min/max for a column without a full load
- deduplicate_csv_rows: Remove duplicates by column(s) with memory-efficient streaming
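As a rough illustration of "without a full load", aggregate_csv_column could keep only running totals while streaming rows. This is a minimal sketch using the standard library; the parameter names, the return shape, and the choice to skip blank cells are assumptions rather than the agreed design, and the @strands_tool decorator is omitted because its import path is not given here.

```python
import csv


def aggregate_csv_column(file_path: str, column: str, operation: str) -> dict:
    """Stream one column and compute sum/avg/min/max with constant memory."""
    if operation not in ("sum", "avg", "min", "max"):
        raise ValueError(f"Unsupported operation: {operation}")

    count = 0
    total = 0.0
    minimum = None
    maximum = None

    with open(file_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise ValueError(f"Column not found: {column}")
        for row in reader:
            raw = row[column]
            # Assumption: blank or missing cells are skipped, not treated as 0.
            if raw is None or raw.strip() == "":
                continue
            value = float(raw)  # raises ValueError on non-numeric data
            count += 1
            total += value
            minimum = value if minimum is None else min(minimum, value)
            maximum = value if maximum is None else max(maximum, value)

    if count == 0:
        raise ValueError(f"No numeric values found in column: {column}")

    results = {"sum": total, "avg": total / count, "min": minimum, "max": maximum}
    return {
        "column": column,
        "operation": operation,
        "value": results[operation],
        "rows_used": count,
    }
```

Only four scalars are held at any time, so memory use does not grow with file size.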
Medium Priority - File Operations
- merge_csv_files: Combine multiple CSV files (horizontal or vertical merge)
- split_csv_by_column: Split into multiple files based on column value
- transpose_csv: Swap rows and columns (memory-efficient for large files)
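The vertical case of merge_csv_files could be handled by streaming each input row straight to the output, one row in memory at a time. The sketch below assumes all inputs share an identical header. The function name merge_csv_files_vertical, its parameters, and the return shape are hypothetical, and the horizontal merge is not covered.

```python
import csv
import os


def merge_csv_files_vertical(
    input_paths: list[str], output_path: str, skip_confirm: bool
) -> dict:
    """Append the rows of several CSV files that share the same header."""
    if not input_paths:
        raise ValueError("input_paths must not be empty")
    if os.path.exists(output_path) and not skip_confirm:
        raise FileExistsError(f"Output already exists: {output_path}")

    header = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # completely empty file, nothing to copy
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    # Note: a partial output file is left behind in this sketch.
                    raise ValueError(f"Header mismatch in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1

    return {
        "output_path": output_path,
        "files_merged": len(input_paths),
        "rows_written": rows_written,
    }
```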
Lower Priority - Advanced Filtering
- join_csv_files: SQL-style join of two CSV files on key column
- pivot_csv_data: Create pivot table from CSV data
- group_csv_rows: Group rows by column value with aggregations
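One memory-conscious way to approach join_csv_files is a hash join: index the file expected to be smaller by its key, then stream the larger file against that index. The sketch below covers an inner join only. Its signature, the rule that the right-hand file is the one held in memory, and the error raised on clashing column names are all assumptions for illustration.

```python
import csv
import os
from collections import defaultdict


def join_csv_files(
    left_path: str,
    right_path: str,
    key_column: str,
    output_path: str,
    skip_confirm: bool,
) -> dict:
    """Inner join: index the right file in memory, stream the left file."""
    if os.path.exists(output_path) and not skip_confirm:
        raise FileExistsError(f"Output already exists: {output_path}")

    # Build the in-memory index from the (assumed smaller) right file.
    index = defaultdict(list)
    with open(right_path, newline="", encoding="utf-8") as right_file:
        right_reader = csv.DictReader(right_file)
        right_fields = list(right_reader.fieldnames or [])
        if key_column not in right_fields:
            raise ValueError(f"Key column missing in right file: {key_column}")
        for row in right_reader:
            index[row[key_column]].append(row)
    extra_fields = [c for c in right_fields if c != key_column]

    rows_written = 0
    with open(left_path, newline="", encoding="utf-8") as left_file, open(
        output_path, "w", newline="", encoding="utf-8"
    ) as out:
        left_reader = csv.DictReader(left_file)
        left_fields = list(left_reader.fieldnames or [])
        if key_column not in left_fields:
            raise ValueError(f"Key column missing in left file: {key_column}")
        clashes = set(left_fields) & set(extra_fields)
        if clashes:
            raise ValueError(f"Column names clash: {sorted(clashes)}")

        writer = csv.DictWriter(out, fieldnames=left_fields + extra_fields)
        writer.writeheader()
        for left_row in left_reader:
            # .get avoids creating empty entries in the defaultdict.
            for right_row in index.get(left_row[key_column], []):
                merged = dict(left_row)
                merged.update({c: right_row[c] for c in extra_fields})
                writer.writerow(merged)
                rows_written += 1

    return {"output_path": output_path, "rows_written": rows_written}
```

Memory use scales with the indexed file only, so callers would pass the smaller file as right_path.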
Design Principles
- Google ADK compliant (JSON-serializable types, no defaults)
- @strands_tool decorator
- Memory-efficient (streaming/chunking for large files)
- Include skip_confirm for file creation operations
- Consistent with existing CSV tools pattern
- Avoid loading entire files when possible
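To make these principles concrete, here is a hedged sketch of how deduplicate_csv_rows might look under them: every parameter is required (no defaults), the return value is a plain JSON-serializable dict, rows are streamed, and skip_confirm guards the output file. The parameter names and return keys are assumptions, and the @strands_tool decorator is left out because its import path is not stated in this issue.

```python
import csv
import hashlib
import json
import os


def deduplicate_csv_rows(
    file_path: str, key_columns: list[str], output_path: str, skip_confirm: bool
) -> dict:
    """Keep the first row seen for each distinct combination of key columns."""
    if os.path.exists(output_path) and not skip_confirm:
        raise FileExistsError(f"Output already exists: {output_path}")

    seen = set()  # fixed-size digests, not full rows
    rows_read = 0
    rows_written = 0
    with open(file_path, newline="", encoding="utf-8") as src, open(
        output_path, "w", newline="", encoding="utf-8"
    ) as out:
        reader = csv.DictReader(src)
        fields = list(reader.fieldnames or [])
        missing = [c for c in key_columns if c not in fields]
        if missing:
            raise ValueError(f"Columns not found: {missing}")

        writer = csv.DictWriter(out, fieldnames=fields)
        writer.writeheader()
        for row in reader:
            rows_read += 1
            # json.dumps gives an unambiguous encoding of the key values.
            key = json.dumps([row[c] for c in key_columns])
            digest = hashlib.sha256(key.encode("utf-8")).digest()
            if digest in seen:
                continue
            seen.add(digest)
            writer.writerow(row)
            rows_written += 1

    return {
        "output_path": output_path,
        "rows_read": rows_read,
        "rows_written": rows_written,
        "duplicates_removed": rows_read - rows_written,
    }
```

Memory still grows with the number of distinct keys (32 bytes of digest each, plus set overhead), so this is streaming over rows rather than strictly constant memory.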
Related
- Extends existing data/csv_tools.py (18 functions)
- Related to issue #57 (Data: Future Enhancement Features)
- Complements token-saving tools like select_csv_columns, filter_csv_rows
Module
data/csv_tools.py