Open
Labels
bug: Something isn't working
Description
Bug report
I have a 1 GiB file, and I get very different results when I read it with standard Python tooling versus pyarrow: the bytes read reported for pyarrow are unrealistically small (3.63 MB, compared with 1.02 GB for a plain read of the same file).
with open('train-00000-of-00007.parquet', 'rb') as gh:
    %iops data = gh.read()
del data
======================================================================
IOPS Profile Results (strace (per-process))
======================================================================
Execution Time: 18.2150 seconds
Read Operations: 2
Write Operations: 0
Total Operations: 2
Bytes Read: 1.02 GB (1,091,305,162 bytes)
Bytes Written: 0.00 B (0 bytes)
Total Bytes: 1.02 GB (1,091,305,162 bytes)
----------------------------------------------------------------------
IOPS: 0.11 operations/second
Throughput: 57.14 MB/second
======================================================================
import pyarrow.parquet as pq
%iops pq.read_table('train-00000-of-00007.parquet')
======================================================================
IOPS Profile Results (strace (per-process))
======================================================================
Execution Time: 19.7621 seconds
Read Operations: 3
Write Operations: 3
Total Operations: 6
Bytes Read: 3.63 MB (3,808,731 bytes)
Bytes Written: 13.05 KB (13,360 bytes)
Total Bytes: 3.65 MB (3,822,091 bytes)
----------------------------------------------------------------------
IOPS: 0.30 operations/second
Throughput: 188.87 KB/second
======================================================================
I tried dropping the page cache with sync; sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches', but the reported numbers did not change.
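A possible cross-check that does not depend on strace is to compare the kernel's own per-process I/O counters before and after the call. The sketch below assumes Linux with a readable /proc/self/io; the helper names read_proc_io and measure_io are hypothetical and not part of iops_profiler. The counters cover the whole process, including its threads, so a large gap between these numbers and the strace-based ones would suggest that some reads happen where the per-process trace is not looking. That is only a guess at this point, not a confirmed cause.

```python
from pathlib import Path


def read_proc_io() -> dict[str, int]:
    """Parse /proc/self/io (Linux only) into a dict of integer counters."""
    counters = {}
    for line in Path("/proc/self/io").read_text().splitlines():
        key, _, value = line.partition(":")
        counters[key.strip()] = int(value)
    return counters


def measure_io(func):
    """Call func() and return (result, counter deltas) for this process.

    Hypothetical helper for illustration. 'rchar' counts bytes requested
    through read-like syscalls (page-cache hits included); 'read_bytes'
    counts bytes actually fetched from the storage layer.
    """
    before = read_proc_io()
    result = func()
    after = read_proc_io()
    delta = {key: after[key] - before.get(key, 0) for key in after}
    return result, delta
```

For the case above, this could be used as `_, delta = measure_io(lambda: pq.read_table('train-00000-of-00007.parquet'))`, then comparing delta['rchar'] and delta['read_bytes'] with the "Bytes Read" line printed by %iops.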
Environment Information
Linux 6.8.0, x86_64, ext4, Python 3.13, pyarrow 23, iops_profiler 0.2.0, IPython 9.9.0
Before submitting
Please check the following:
- I have described the situation in which the bug arose, including what code was executed, and any applicable data others will need to reproduce the problem.
- I have included information about my environment, including the version of this package (e.g. iops_profiler.__version__)
- I have included available evidence of the unexpected behavior (including error messages, screenshots, and/or plots) as well as a description of what I expected instead.
- If I have a solution in mind, I have provided an explanation and/or pseudocode and/or task list.