NTFS on ZPOOL/ZFS with FAST_DEDUP #421
-
Some aspects
The best performance and flexibility would come from a plain ZFS filesystem with fast dedup enabled. The size of the dedup table can be limited with a quota, and a 1M recsize is a good option.
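To get a feel for what a dedup table quota means in practice, the sketch below estimates how much unique data a given table budget can track. It assumes one DDT entry per unique block and an entry size of roughly 320 bytes. That figure is an often-quoted rule of thumb, not something stated in this thread, and the real size with fast dedup may differ. The 10 GiB budget and the helper name `unique_data_tracked` are hypothetical.

```python
# Assumption: ~320 bytes per DDT entry (rule of thumb, not from this thread).
ASSUMED_DDT_ENTRY_BYTES = 320

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
TIB = 1024 * GIB


def unique_data_tracked(quota_bytes, recordsize_bytes,
                        entry_bytes=ASSUMED_DDT_ENTRY_BYTES):
    """Bytes of unique data whose DDT entries fit into quota_bytes."""
    entries = quota_bytes // entry_bytes   # one entry per unique block
    return entries * recordsize_bytes      # each entry covers one record


if __name__ == "__main__":
    budget = 10 * GIB  # hypothetical dedup table budget
    for recordsize in (128 * KIB, MIB):
        tracked = unique_data_tracked(budget, recordsize)
        print(f"recordsize={recordsize // KIB}K: "
              f"about {tracked / TIB:.1f} TiB of unique data")
```

Under these assumptions the same table budget covers eight times more unique data at 1M than at 128K, which is the reason a large recsize pairs well with a quota.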
-
When you create a ZFS filesystem below a ZFS pool, you can mount this filesystem individually. A Windows-specific option is to assign a drive letter either to the pool or to individual ZFS filesystems.
ZFS compression (and dedup) does not operate on files but on ZFS data blocks (of recsize), independent of the data structures on top of ZFS. A larger recsize makes compression and dedup more efficient.
ARC is the RAM-based ZFS read cache for the last and most frequently accessed ZFS data blocks (not files). As long as no other application demands RAM, ZFS uses RAM up to a certain percentage, e.g. 50% of RAM.
A special vdev is a method for building hybrid pools (HDD + NVMe). It must be a mirror, because a lost vdev means a lost pool. It can massively improve performance for small files, e.g. up to 64K or 128K, and for file listings due to faster access to metadata.
You can check whether this is enough for performance; otherwise, WizFile claims support for non-NTFS filesystems.
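As a rough illustration of the special vdev point: in OpenZFS, metadata goes to the special vdev, and data blocks are also routed there when they are no larger than the dataset's `special_small_blocks` property. The reply does not name that property, so treating it as the source of the 64K or 128K figure is my assumption. The function below is a simplified model of the decision, not the actual allocator logic, and `allocation_class` is a made-up name.

```python
KIB = 1024


def allocation_class(block_size, special_small_blocks=0, is_metadata=False):
    """Simplified model: where a block lands in a pool with a special vdev.

    Assumes the special vdev exists and still has free space.
    """
    if is_metadata:
        return "special"
    # special_small_blocks=0 means no data blocks go to the special vdev.
    if special_small_blocks > 0 and block_size <= special_small_blocks:
        return "special"
    return "normal"


if __name__ == "__main__":
    threshold = 128 * KIB  # hypothetical special_small_blocks setting
    for size in (4 * KIB, 64 * KIB, 128 * KIB, 1024 * KIB):
        print(f"{size // KIB}K block -> {allocation_class(size, threshold)}")
```

With recordsize=1M and a 128K threshold, small files land on the NVMe mirror while the large 1M records of big files stay on the hard disks.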
-
There was a race in the mounting, which you detailed, and I believe it is fixed. I have been waiting to hear back on that and other issues before rc11.
-
On Debian sid I compiled and installed ZFS with fast_dedup myself:
https://github.com/openzfs/zfs/releases/tag/zfs-2.3.0-rc3
I created the pool:
zpool create -o autoexpand=on -o autotrim=on -o ashift=12 -O dedup=on -O casesensitivity=insensitive -O compression=zstd -O atime=off -O recordsize=1M -O longname=on zw /dev/sde /dev/sdf /dev/sdg
I created a dataset: zfs create zw/wz
I imported the pool on Windows using a ZFS version with fast_dedup:
https://github.com/openzfsonwindows/openzfs/releases/tag/zfswin-2.2.6rc10
There were big problems: more often than not, the wz folder could not be opened in Windows. It showed up as two kinds of shortcuts that could not be accessed. Perhaps one in ten pool imports worked correctly and allowed me to enter and edit the wz folder.
The solution turned out to be replacing the mountpoint value "/zw/wz" with legacy:
zfs set mountpoint=legacy zw/wz
I set the drive letter to W:\, which was convenient for me:
zfs set driveletter=w zw
I created a 17 TB virtual VHDX drive in the zw/wz directory, then a second similar one.
It took about 12 hours.
I didn't measure how long it took, but it was probably over 2 days.
Mount-DiskImage "W:\wz\K17T.vhdx"
I chose recordsize=1M because I did not want the DDT (deduplication table) to grow too large. At recordsize=1M, a single 1 TB file consists of about a million blocks. If I had chosen recordsize=128K, there would be about 8 million blocks for that one file. That is just one file; 1 TB of small files would have many times that number of blocks. It seems to me that a large recordsize has the disadvantage that, with small files, the block is always padded with zeros up to the recordsize and these zeros have to be compressed. I checked this experimentally and it is not a big problem: writing and reading are still fast. Most importantly, I gain a lot by reducing the write slowdown that comes with a growing DDT.
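The block-count reasoning can be checked with a few lines of arithmetic. This is a minimal sketch, assuming binary units (a 1 TiB file, 1 MiB and 128 KiB records) and a single fully written file with no holes. The function name `blocks_for_file` is made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB
TIB = 1024 * 1024 * MIB


def blocks_for_file(file_size_bytes, recordsize_bytes):
    """Number of records needed to hold a file (ceiling division)."""
    return -(-file_size_bytes // recordsize_bytes)


if __name__ == "__main__":
    blocks_1m = blocks_for_file(TIB, MIB)
    blocks_128k = blocks_for_file(TIB, 128 * KIB)
    print(f"recordsize=1M:   {blocks_1m:,} blocks")
    print(f"recordsize=128K: {blocks_128k:,} blocks")
    print(f"ratio: {blocks_128k // blocks_1m}x")
```

Since every unique block needs its own DDT entry, the 8x difference in block count carries over directly to the size of the dedup table.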
These are the parameters my pool currently has:
What do I gain from all this:
a) Very fast index of all files using the WIZFILE program, which scans all my disks every time Windows starts.
b) DEDUPLICATION on NTFS.
c) SMR disks are no longer such a big problem.
This is my write buffer configuration on disks:
I could set it up like this because this data is backed up from time to time.
https://antibody-software.com/wizfile/
It is very good practice to disable real-time file change monitoring: I have repeatedly seen various programs crash after WizFile had been running for a long time with file system change monitoring enabled.