Expose detailed I/O data via magic variable#48

Draft
Copilot wants to merge 4 commits into main from copilot/expose-iops-detailed-data
Copilot AI commented Jan 29, 2026

Change Description

  • My PR includes a link to the issue that I am addressing

Implements iops_detailed_data magic variable to expose per-operation I/O details. Previously only aggregate metrics were available; users can now analyze file paths, syscalls, and per-operation sizes.

Solution Description

Core Changes

  • Inject iops_detailed_data into the user namespace after each %%iops execution
  • The variable is a pandas.DataFrame when detailed tracing is available (strace/fs_usage modes)
  • The variable is an explanatory string in psutil mode, which provides only aggregate metrics
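
The injection step can be sketched roughly as follows. This is a minimal illustration, not the code in magic.py: `inject_detailed_data` is a hypothetical helper, and a plain dict stands in for `self.shell.user_ns`. The fallback message is the one given in the linked issue.

```python
VARIABLE_NAME = "iops_detailed_data"

# Message text taken from the linked issue.
PSUTIL_MESSAGE = (
    "Detailed I/O data not available: profiling uses psutil mode "
    "which only provides aggregate metrics"
)


def inject_detailed_data(user_ns, frame=None):
    """Store `frame`, or the fallback message when it is None, in `user_ns`.

    Hypothetical helper: `user_ns` plays the role of `self.shell.user_ns`,
    and `frame` is whatever the tracer produced (a DataFrame in
    strace/fs_usage modes, None in psutil mode).
    """
    value = PSUTIL_MESSAGE if frame is None else frame
    user_ns[VARIABLE_NAME] = value  # overwrites any previous value
    return value


# A plain dict standing in for the IPython user namespace.
namespace = {}
inject_detailed_data(namespace)  # psutil mode: no detailed frame available
print(namespace[VARIABLE_NAME])
```

Because the variable is reassigned after every %%iops run, a user variable that happens to share the name iops_detailed_data would be replaced.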

Data Collection

  • Added a collect_detailed parameter to parse_strace_line and parse_fs_usage_line
  • Extract file paths via the strace -y flag (fd-to-path resolution)
  • Extract syscall names from both tracers
  • Fixed a bug where collect_ops (histogram) and collect_detailed were mutually exclusive; both can now be collected in the same run
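
To make the data collection concrete, here is a rough sketch of parsing `strace -y` output and feeding both collectors in one pass. It is not the project's parse_strace_line: `parse_io_line` and `collect` are hypothetical names, the regular expression covers only common read/write variants, and the histogram is assumed to count operations by size.

```python
import re
from collections import Counter

# With `strace -y`, file descriptors are printed as `fd</resolved/path>`.
# An optional pid prefix (as printed with `-f`) is tolerated.
_STRACE_IO = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"
    r"(?P<syscall>p?(?:read|write)(?:v|64)?)"
    r"\(\d+<(?P<path>[^>]*)>,.*\)\s*=\s*(?P<ret>\d+)"
)


def parse_io_line(line):
    """Return a detail record for a successful read/write call, else None."""
    match = _STRACE_IO.match(line)
    if match is None:  # not a read/write call, or the call failed (= -1)
        return None
    syscall = match.group("syscall")
    return {
        "path": match.group("path"),
        "operation": "write" if "write" in syscall else "read",
        "syscall": syscall,
        "size_bytes": int(match.group("ret")),
    }


def collect(lines, collect_ops=False, collect_detailed=False):
    """Collect a size histogram and/or detail records, independently."""
    op_sizes = Counter()  # assumed histogram shape: size in bytes -> count
    detailed = []
    for line in lines:
        record = parse_io_line(line)
        if record is None:
            continue
        # Two separate `if` blocks rather than if/elif, so enabling one
        # collector never disables the other.
        if collect_ops:
            op_sizes[record["size_bytes"]] += 1
        if collect_detailed:
            detailed.append(record)
    return op_sizes, detailed


sample = [
    'write(3</tmp/test.txt>, "data", 4) = 4',
    '[pid 4242] pread64(5</etc/hosts>, "127.0.0.1"..., 4096, 0) = 220',
    "read(3</tmp/test.txt>, 0x7ffd, 10) = -1 EAGAIN (Resource temporarily unavailable)",
    'openat(AT_FDCWD, "/tmp/test.txt", O_RDONLY) = 3</tmp/test.txt>',
]
op_sizes, detailed = collect(sample, collect_ops=True, collect_detailed=True)
```

The sample lines are made up for illustration. Paths containing `>` would confuse this simple pattern, so a production parser would need more care.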

Schema
DataFrame columns when available:

  • path (str): File path accessed
  • operation (str): "read" or "write"
  • syscall (str): Syscall name (read, write, pread64, etc.)
  • size_bytes (int): Bytes transferred
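
One way to pin this schema down in code is sketched below. `IORecord` and `records_to_frame` are hypothetical names, and the `int64` dtype for size_bytes is an assumption; the PR only states that the column holds integers.

```python
from typing import TypedDict

COLUMNS = ("path", "operation", "syscall", "size_bytes")


class IORecord(TypedDict):
    """One row of the detailed data (hypothetical type for illustration)."""

    path: str        # file path accessed
    operation: str   # "read" or "write"
    syscall: str     # e.g. "read", "write", "pread64"
    size_bytes: int  # bytes transferred


def records_to_frame(records):
    """Build a DataFrame with the documented column order."""
    import pandas as pd  # imported lazily; only needed when a frame is built

    frame = pd.DataFrame(records, columns=list(COLUMNS))
    # Assumption: sizes are stored as 64-bit integers.
    return frame.astype({"size_bytes": "int64"})


example: IORecord = {
    "path": "test.txt",
    "operation": "write",
    "syscall": "write",
    "size_bytes": 4000,
}
```

Passing `columns` explicitly keeps the column order stable even when no operations were recorded and the record list is empty.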

Usage Example

%%iops
with open('test.txt', 'w') as f:
    f.write('data' * 1000)

Next cell:

iops_detailed_data  # Access DataFrame
# Analyze: iops_detailed_data.groupby('path')['size_bytes'].sum()
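
Since the variable can be either a DataFrame or a string, analysis code may want to check the type first. A minimal sketch, assuming the schema above; `bytes_per_path` is a hypothetical helper and not part of the package:

```python
def bytes_per_path(detailed):
    """Total bytes per (path, operation), or None in aggregate-only mode.

    `detailed` is the value of iops_detailed_data: a pandas DataFrame in
    strace/fs_usage modes, or an explanatory string in psutil mode.
    """
    if isinstance(detailed, str):
        return None  # psutil mode: no per-operation rows to summarize
    return (
        detailed.groupby(["path", "operation"])["size_bytes"]
        .sum()
        .sort_values(ascending=False)
    )


# In a notebook cell after a %%iops run:
#     summary = bytes_per_path(iops_detailed_data)
```

Returning None rather than raising keeps a notebook cell usable on platforms where only psutil mode is available.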

Code Quality

  • I have read the Contribution Guide
  • My code follows the code style of this project
  • My code builds (or compiles) cleanly without any errors or warnings
  • My code contains relevant comments and necessary documentation

Project-Specific Pull Request Checklists

New Feature Checklist

  • I have added or updated the docstrings associated with my feature using the NumPy docstring format
  • I have updated the tutorial to highlight my new feature (if appropriate)
  • I have added unit/End-to-End (E2E) test cases to cover my new feature
  • My change includes a breaking change
    • My change includes backwards compatibility and deprecation warnings (if possible)

Documentation Change Checklist

Build/CI Change Checklist

  • If required or optional dependencies have changed (including version numbers), I have updated the README to reflect this

Test Coverage: 13 new tests, all 133 tests passing. CodeQL: 0 alerts.

Original prompt

This section details the original issue you should resolve

<issue_title>Expose detailed I/O data via magic variable</issue_title>
<issue_description>## Expose detailed I/O data via magic variable

Implementation

After %%iops execution, inject a variable iops_detailed_data into the user's namespace via:

self.shell.user_ns['iops_detailed_data'] = detailed_data

Variable Content

When detailed data is available (strace/fs_usage modes):

  • Type: pandas.DataFrame
  • Columns:
    • path (str): File path
    • operation (str): "read" or "write"
    • syscall (str, optional): Syscall name if available
    • size_bytes (int): Bytes transferred

When detailed data is NOT available (psutil mode):

  • Type: str
  • Value: "Detailed I/O data not available: profiling uses psutil mode which only provides aggregate metrics"

Documentation

Add a brief note to README.md showing users can access iops_detailed_data after running %%iops.

Example

%%iops
with open('test.txt', 'w') as f:
    f.write('data')

Next cell:

iops_detailed_data  # Access the DataFrame

Also update readthedocs.</issue_description>

Comments on the Issue (you are @copilot in this section)


codecov bot commented Jan 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 45.49%. Comparing base (b04ec2e) to head (00b8a88).

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #48   +/-   ##
=======================================
  Coverage   45.49%   45.49%           
=======================================
  Files           4        4           
  Lines         455      455           
=======================================
  Hits          207      207           
  Misses        248      248           
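
As a quick sanity check, the percentage in the table follows from the hit and miss counts. This is plain arithmetic on the numbers above, not Codecov's own computation:

```python
hits, misses = 207, 248

lines = hits + misses                    # total coverable lines
coverage = round(100 * hits / lines, 2)  # percentage, two decimal places

print(f"{lines} lines, {coverage}% covered")
```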

Copilot AI and others added 3 commits January 29, 2026 21:44
- Added pandas as dependency
- Enhanced parse_strace_line and parse_fs_usage_line to collect detailed data (path, operation, syscall, size_bytes)
- Added -y flag to strace for file path extraction
- Modified measure_macos_osascript and measure_linux_strace to collect detailed data
- Updated magic.py to inject iops_detailed_data variable into user namespace
  - DataFrame when detailed data available
  - String message when using psutil mode

Co-authored-by: mtauraso <31012+mtauraso@users.noreply.github.com>

- Added comprehensive tests for detailed data parsing and collection
- Updated README.md with usage examples for iops_detailed_data
- Updated docs/user_guide.rst with detailed documentation
- Fixed linter issues (line length, import order)

Co-authored-by: mtauraso <31012+mtauraso@users.noreply.github.com>

- Fixed bug where collect_ops and collect_detailed were mutually exclusive
- Both histogram and detailed data can now be collected simultaneously
- Added warning about iops_detailed_data variable name collision
- Improved code structure to handle both flags independently
- All 133 tests still pass

Co-authored-by: mtauraso <31012+mtauraso@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Expose detailed I/O data via magic variable" to "Expose detailed I/O data via magic variable" Jan 29, 2026
Copilot AI requested a review from mtauraso January 29, 2026 21:56
Development

Successfully merging this pull request may close these issues.

Expose detailed I/O data via magic variable
