gen-dataset

A command-line tool to quickly generate a large number of files across a large number of directories. It creates a directory tree shaped like an M-ary tree and randomly places any number of files of any size within it. Files are distributed roughly equally among the directories. If a size is provided, each file is filled with zeros up to that size.

Installation

Precompiled Static Binary

Download Binary

sudo wget https://github.com/joshuaboud/gen-dataset/releases/download/v1.3/gen-dataset -P /usr/local/bin

Mark Executable

sudo chmod +x /usr/local/bin/gen-dataset

From Source

Install Boost Development Libraries

Get Source and Install

git clone https://github.com/joshuaboud/gen-dataset.git
cd gen-dataset
make -j8
sudo make install

Usage

usage:
  gen-dataset  -c [-b -d -s -S -t -w -y] [path]

flags:
  -b, --branches <int>              - number of subdirectories per directory
  -c, --count <int>                 - total number of files to create
  -d, --depth <int>                 - number of directory levels
  -s, --size <float[K..T][i]B>      - file size
  -S, --buff-size <float[K..T][i]B> - write buffer size (default=1M)
  -t, --threads <int>               - number of parallel file creation threads
  -w, --max-wait <float (seconds)>  - max random wait between file creation
  -y, --yes                         - don't prompt before creating files

Examples

Generate ten 1 GiB files in a single subdirectory named 'subdir':

gen-dataset -c 10 -s 1GiB subdir

Generate 10,000 1 MiB files in 3,905 directories:

gen-dataset -d 5 -b 5 -c 10000 -s 1MiB
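Because files are zero-filled up to the requested size, it can help to estimate the space a run will consume before starting it. The following is a minimal sketch, not part of gen-dataset: the helper name `total_bytes` is hypothetical, and it assumes that the tool treats 1 MiB as 1048576 bytes and that directory metadata overhead can be ignored.

```shell
# Hypothetical helper (not part of gen-dataset): bytes written for
# `count` files of `size_mib` MiB each, assuming 1 MiB = 1048576 bytes.
total_bytes() {
  count=$1
  size_mib=$2
  echo $((count * size_mib * 1048576))
}

total_bytes 10 1024     # ten 1 GiB files       -> prints 10737418240
total_bytes 10000 1     # 10,000 files of 1 MiB -> prints 10485760000
```

Comparing the result with the free space reported by `df` for the target path gives a rough check that the dataset will fit.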

Simulate real usage by randomly waiting up to 2.5 seconds between file creations:

gen-dataset -d 4 -b 6 -c 1000 -s 1MiB -w 2.5

Generate 1,000,000 empty files in 55,986 directories, with 16 threads writing the files:

gen-dataset -d 6 -b 6 -c 1000000 -t 16
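The directory counts quoted in the examples above match the number of subdirectories in a tree with `-b` branches per directory and `-d` levels: b + b^2 + ... + b^d. The sketch below reproduces that arithmetic. The helper name `dir_count` is hypothetical, and the assumption that the top-level path itself is not counted is inferred from the quoted figures rather than from the tool's source.

```shell
# Hypothetical helper (not part of gen-dataset): number of subdirectories
# in a tree with `branches` children per directory and `depth` levels.
# Assumes the top-level path itself is not counted.
dir_count() {
  branches=$1
  depth=$2
  level=1   # directories at the current level
  total=0   # running sum over all levels
  i=1
  while [ "$i" -le "$depth" ]; do
    level=$((level * branches))
    total=$((total + level))
    i=$((i + 1))
  done
  echo "$total"
}

dir_count 5 5   # -b 5 -d 5 -> prints 3905
dir_count 6 6   # -b 6 -d 6 -> prints 55986
```

The count grows geometrically with depth, so a small increase in `-d` or `-b` can multiply the number of directories created.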
