Consider adding support for Windows directory size "philosophy" #48
Comments
Interesting! Thank you for reporting this.
Are you sure? Looks like |
Yes, I'm sure, only files and directories. But there's something else I noticed:
4096 bytes is the standard cluster size / default allocation unit size for NTFS (except for very large partitions). I tested this with another directory (small, just 10 files), and I could observe the same phenomenon:
First test is the parent directory ( So, in sum, as also printed with du, the former is 2 directories and 10 files, the latter is 1 directory and 10 files. |
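For illustration, the cluster rounding mentioned above can be sketched in a few lines of Python. This is a simplified model, not how any of the tools in this thread compute their numbers: `allocated_size` is a hypothetical helper, and the 4096-byte default is just the NTFS cluster size quoted above. Real file systems may deviate from it (NTFS, for instance, can keep very small files inside the MFT record itself).

```python
def allocated_size(apparent_size: int, cluster_size: int = 4096) -> int:
    """Round a byte count up to a whole number of clusters.

    Hypothetical helper for illustration only; the 4096-byte default is the
    NTFS cluster size assumed in the comment above.
    """
    if apparent_size < 0 or cluster_size <= 0:
        raise ValueError("sizes must be non-negative, cluster size positive")
    clusters = -(-apparent_size // cluster_size)  # ceiling division
    return clusters * cluster_size


if __name__ == "__main__":
    for size in (0, 1, 4096, 4097):
        print(size, "->", allocated_size(size))
```

Under this model, anything from 1 to 4096 bytes occupies exactly one cluster, which is why differences between tools tend to show up as multiples of 4096.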
Thank you for looking into this further. Your analysis is spot on! I looked at the source code of
On Linux, this is consistent with what
▶ mkdir test-directory
▶ touch test-directory/empty-file
▶ mkdir test-directory/empty-subdirectory
▶ echo -n "123" > test-directory/file-3-bytes
▶ echo -n "1234567" > test-directory/file-7-bytes
▶ du -s --block-size=1 test-directory; diskus test-directory
8192 test-directory
8.19 KB (8,192 bytes)
▶ du -s --apparent-size --block-size=1 test-directory; diskus --apparent-size test-directory
170 test-directory
170 B (170 bytes)
Compare this with
So I am not really sure how to proceed here. Apparently, the "real" Windows tools also seem to disregard the size of directories?! See also: #49 |
Uh, well. I was quickly jotting down my reply, but I probably should have stopped for a moment, because I might have remembered that I tripped over this not that long ago.
Yeah, that's exactly the issue. On Windows, by definition, a directory itself does not have a size. Or, phrased differently, the size of the directory alone is always zero. It's a question of abstraction. Without doing a deep dive on file systems here, the gist of the issue is that, on a technical level, a directory obviously does have a size. But, due to how NTFS works, we have another famous case of my favorite problem class for the entirety of science (and engineering): counting, because counting is hard. Sesame Street didn't teach us that the real crux is what/where/when to count.
In NTFS land, it's called the Master File Table (MFT), and this is where the space for the directories goes. NTFS has a reserved space for this, called the MFT Zone, I think, and while it's a part of the file system, it's not the, well, visible part of the file system.
To be honest, I don't know either what would be the best way to proceed here. Edit: |
I think it is important to distinguish between "disk usage" and "apparent size" here. GNU
struct stat {
    dev_t     st_dev;         /* ID of device containing file */
    ino_t     st_ino;         /* Inode number */
    mode_t    st_mode;        /* File type and mode */
    nlink_t   st_nlink;       /* Number of hard links */
    uid_t     st_uid;         /* User ID of owner */
    gid_t     st_gid;         /* Group ID of owner */
    dev_t     st_rdev;        /* Device ID (if special file) */
    off_t     st_size;        /* Total size, in bytes */
    blksize_t st_blksize;     /* Block size for filesystem I/O */
    blkcnt_t  st_blocks;      /* Number of 512B blocks allocated */
    /* Since Linux 2.6, the kernel supports nanosecond
       precision for the following timestamp fields.
       For the details before Linux 2.6, see NOTES. */
    struct timespec st_atim;  /* Time of last access */
    struct timespec st_mtim;  /* Time of last modification */
    struct timespec st_ctim;  /* Time of last status change */
#define st_atime st_atim.tv_sec      /* Backward compatibility */
#define st_mtime st_mtim.tv_sec
#define st_ctime st_ctim.tv_sec
};
where:
The following table shows the difference between these two quantities (on a filesystem with 4KiB block size):
It looks to me like the number you are getting on Windows is neither disk usage, nor apparent size. It's the "apparent size of all files, excluding directories". The Windows sysinternals
(which is sysinternals
I have no idea where the latter number of "24,576 = 6 × 4,096" bytes comes from. But I agree with you that something should be changed on Windows. Maybe we could introduce a |
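To make the three quantities discussed in this comment concrete, here is a rough Python sketch that walks a directory tree and sums them separately: disk usage (`st_blocks` × 512), apparent size (`st_size` of files and directories), and the apparent size of files only. The function name `measure` and the result keys are made up for illustration; this is not how diskus or `du` are implemented. Unlike `du`, it counts hard-linked files once per link, and `st_blocks` is not available on Windows, where the sketch falls back to 0.

```python
import os
import stat


def measure(root: str) -> dict:
    """Sum three size notions over a directory tree (illustrative only)."""
    totals = {"disk_usage": 0, "apparent_size": 0, "files_only": 0}

    def add(path: str) -> None:
        st = os.lstat(path)  # lstat: do not follow symlinks
        totals["apparent_size"] += st.st_size
        # st_blocks counts 512-byte units; the field does not exist on Windows.
        totals["disk_usage"] += getattr(st, "st_blocks", 0) * 512
        if not stat.S_ISDIR(st.st_mode):
            totals["files_only"] += st.st_size

    add(root)  # the root directory itself is counted too
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            add(os.path.join(dirpath, name))
    return totals


if __name__ == "__main__":
    import sys

    print(measure(sys.argv[1] if len(sys.argv) > 1 else "."))
```

On a tree without hard links or special files, `files_only` should correspond to the "apparent size of all files, excluding directories" figure, while the other two totals depend on how the file system reports directory sizes and block counts.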
Yes, this is true,
Good question,
I was thinking about the parent directory itself, maybe, but it doesn't really add up.
Sounds good to me! |
Windows 7 x64, Diskus 0.7.0
Let's get the folder size in bytes.
$ for /f "tokens=1,2 delims=: " %a in ('robocopy C:\Test . /L /BYTES /S /NJH /NDL /NFL /XJ /R:0 /W:0') do @if /i %a==Bytes echo %b
3012235061
$ pwsh -c "(gci -lp C:\Test -r -force | measure -p length -sum).sum"
3012235061
$ coreutils du -bs C:\Test
3012235061 C:\Test
$ duu -q C:\Test
summary
=======
files : 60 531
directories : 7 136
bytes : 3 012 235 061
kilobytes : 2 941 635,80
megabytes : 2 872,69
gigabytes : 2,81
So far so good, the output matches. But what does Diskus show?
$ diskus.exe --size-format decimal C:\Test
3.04 GB (3,036,213,045 bytes)
$ diskus.exe --size-format binary C:\Test
2.83 GiB (3,036,213,045 bytes)
3 012 235 061 vs 3 036 213 045. Something is wrong, indeed. |
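As a quick sanity check on those two totals, the gap can be worked out with a few lines of Python. The byte counts and the directory count are the ones printed above; the 4096-byte cluster size is an assumption carried over from earlier in this thread, not something the tools report.

```python
CLUSTER = 4096  # assumption: NTFS default cluster size mentioned earlier in the thread

files_only = 3_012_235_061    # robocopy, pwsh, coreutils du -bs, duu
diskus_total = 3_036_213_045  # diskus 0.7.0
directories = 7_136           # directory count reported by duu

difference = diskus_total - files_only
clusters, remainder = divmod(difference, CLUSTER)

print(difference)           # 23977984
print(clusters, remainder)  # 5854 0
print(clusters < directories)  # True
```

The gap is an exact multiple of 4096 bytes, which is consistent with (though it does not prove) the explanation that the extra bytes come from directory sizes. It is not simply one cluster per directory, since 5,854 is fewer than the 7,136 directories reported.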
Err… Hello? |
Yes? |
Bottom line: do not use Diskus, because it misleads and creates confusion. |
Hey, first time trying diskus, and while I can confirm that it's really fast here as well, I get a different result in total bytes on a local directory. 😕
I've noticed the Windows caveat section, but I don't think that this applies in my case here, because there is nothing unusual in this path, no junctions or hardlinks whatsoever.
To be sure, I've tested the same path with some other tools, like the Python-based duu [1], another one implemented in Rust found here on GitHub (dua [2]), as well as the Sysinternals Disk Usage (du [3]) tool for Windows for reference.
diskus
Here's the comparison:
duu
dua
du
And last but not least, the total value in bytes as displayed in Windows Explorer:
94.8 GB (101'895'657'743 bytes)
OS information:
Footnotes
1. https://github.com/jftuga/duu
2. https://github.com/Byron/dua-cli
3. https://docs.microsoft.com/en-us/sysinternals/downloads/du