This repository was archived by the owner on Jul 18, 2024. It is now read-only.

Commit b778261

committed
Add Vectored IO docs
1 parent bb9f68a commit b778261

2 files changed

+95
-3
lines changed

docs/30-vectored-io.md

Lines changed: 94 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,95 @@
1-
# blobxfer Vectored IO
1+
# blobxfer Vectored I/O
2+
`blobxfer` supports Vectored I/O (scatter/gather), which can help alleviate
3+
problems associated with
4+
[single blob or single fileshare throughput limits](https://docs.microsoft.com/en-us/azure/storage/storage-scalability-targets).
5+
Additionally, `blobxfer` has the ability to replicate a single source to
6+
multiple destinations to allow for increased resiliency or throughput for
7+
consumption later.
28

3-
## TODO
9+
## Distribution Modes
10+
`blobxfer` supports two distribution modes: `replica` and `stripe`. The
11+
following sections describe each.
12+
13+
### Replica
14+
`replica` mode replicates an entire file (or set of files) across all
15+
specified destinations. This allows for multiple backups, resiliency,
16+
and potentially increased download throughput later if the clients understand
17+
how to download from multiple sources.
18+
19+
The mechanism is straightforward: portions of each source file are read
20+
from disk, buffered in memory, and then replicated across multiple
21+
storage accounts.
22+
23+
```
24+
Whole File +---------------------+
25+
Replication | |
26+
+------------------------------> | Destination 0: |
27+
| | Storage Account A |
28+
| | |
29+
| +---------------------+
30+
|
31+
|
32+
+------------+---------------+ Whole File +---------------------+
33+
| | Replication | |
34+
| 10 GiB VHD on Local Disk +--------------> | Destination 1: |
35+
| | | Storage Account B |
36+
+------------+---------------+ | |
37+
| +---------------------+
38+
|
39+
|
40+
| +---------------------+
41+
| Whole File | |
42+
| Replication | Destination 2: |
43+
+------------------------------> | Storage Account C |
44+
| |
45+
+---------------------+
46+
```
47+
48+
To take advantage of `replica` Vectored I/O, you must use a YAML
49+
configuration file to define multiple destinations.
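
The read, buffer, and replicate loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not `blobxfer`'s actual implementation: the function name `replicate`, the `buffer_size` parameter, and the use of in-memory file objects in place of storage accounts are all hypothetical.

```python
import io
from typing import BinaryIO, List


def replicate(source: BinaryIO, destinations: List[BinaryIO],
              buffer_size: int = 4 * 1024 * 1024) -> int:
    """Copy `source` to every destination, one buffered portion at a time.

    Hypothetical helper: the destinations stand in for storage accounts.
    Returns the number of bytes read from the source.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    total = 0
    while True:
        # Read one portion of the file into memory.
        chunk = source.read(buffer_size)
        if not chunk:
            break
        # Write the same buffered portion to every destination, so the
        # source only has to be read once.
        for dest in destinations:
            dest.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    src = io.BytesIO(b"example payload" * 1024)
    dests = [io.BytesIO() for _ in range(3)]  # stand-ins for accounts A, B, C
    copied = replicate(src, dests, buffer_size=4096)
    print(copied, all(d.getvalue() == src.getvalue() for d in dests))
```

Reading each portion once and fanning it out keeps local disk reads independent of the number of destinations, which is the property the diagram above illustrates.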
50+
51+
### Stripe
52+
`stripe` mode will split a file into multiple chunks and scatter these
53+
chunks across the specified destinations. These destinations can be different
54+
containers within the same storage account or even containers distributed
55+
across multiple storage accounts if single storage account bandwidth limits
56+
are insufficient.
57+
58+
`blobxfer` will slice the source file into multiple chunks, where
59+
`stripe_chunk_size_bytes` is the stripe width of each chunk. This parameter
60+
will allow you to effectively control how many blobs/files are created on
61+
Azure. `blobxfer` will then round-robin through all of the destinations
62+
specified to store the slices. Information required to reconstruct the
63+
original file is stored in the blob or file metadata. It is important to
64+
keep this metadata intact or reconstruction will fail.
65+
66+
```
67+
+---------------------+
68+
| | <-----------------------------------+
69+
| Destination 1: | |
70+
| Storage Account B | <---------------------+ |
71+
| | | |
72+
+---------------------+ <-------+ | |
73+
| | |
74+
^ ^ | | |
75+
| | | | |
76+
1 GiB Stripe | | | | |
77+
+-----------------------------+ Width +------+---+--+------+---+--+------+---+--+------+---+--+------+---+--+
78+
| | | | | | | | | | | | |
79+
| 10 GiB File on Local Disk | +-----------> | D0 | D1 | D0 | D1 | D0 | D1 | D0 | D1 | D0 | D1 |
80+
| | | | | | | | | | | | |
81+
+-----------------------------+ 10 Vectored +---+--+------+---+--+------+---+--+------+---+--+------+---+--+------+
82+
Slices | | | | |
83+
| | | | |
84+
| v | | |
85+
| | | |
86+
+> +---------------------+ <+ | |
87+
| | | |
88+
| Destination 0: | <--------------+ |
89+
| Storage Account A | |
90+
| | <----------------------------+
91+
+---------------------+
92+
```
93+
94+
To take advantage of `stripe` Vectored I/O, you must use a YAML
95+
configuration file to define multiple destinations.
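
The slicing and round-robin assignment described in this section can be sketched as follows. This is a minimal illustration, not `blobxfer`'s internal code: the `Slice` type and `plan_stripes` function are hypothetical names, and only `stripe_chunk_size_bytes` comes from the text above. The sketch assumes destinations are numbered from 0 and that the final slice may be shorter than the stripe width.

```python
from typing import List, NamedTuple


class Slice(NamedTuple):
    """One stripe of the source file (hypothetical structure)."""
    index: int        # position of the slice within the source file
    offset: int       # byte offset where the slice starts
    length: int       # slice length in bytes
    destination: int  # index of the destination that stores the slice


def plan_stripes(file_size: int, stripe_chunk_size_bytes: int,
                 num_destinations: int) -> List[Slice]:
    """Cut a file into stripe-width slices and assign them round-robin."""
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    if stripe_chunk_size_bytes <= 0 or num_destinations <= 0:
        raise ValueError("stripe width and destination count must be positive")
    slices = []
    offset = 0
    index = 0
    while offset < file_size:
        # The last slice may be shorter than the stripe width.
        length = min(stripe_chunk_size_bytes, file_size - offset)
        # Round-robin: slice i goes to destination i mod N.
        slices.append(Slice(index, offset, length, index % num_destinations))
        offset += length
        index += 1
    return slices


if __name__ == "__main__":
    GiB = 1024 ** 3
    # The scenario from the diagram: 10 GiB file, 1 GiB stripe width,
    # two destinations (D0 and D1).
    for s in plan_stripes(10 * GiB, 1 * GiB, 2):
        print(f"slice {s.index}: offset={s.offset} -> D{s.destination}")
```

With these inputs the plan contains ten slices alternating between D0 and D1, matching the diagram. A larger `stripe_chunk_size_bytes` produces fewer, larger blobs or files, and a smaller one produces more.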

docs/README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ Please refer to the following documents detailing the usage of `blobxfer`.
99
1. [Installation](01-installation.md)
1010
2. [Command-Line Usage](10-cli-usage.md)
1111
3. [YAML Configuration](20-yaml-configuration.md)
12-
4. [Vectored IO](30-vectored-io.md)
12+
4. [Vectored I/O](30-vectored-io.md)
1313
5. [Client-side Encryption](40-client-side-encryption.md)
1414
6. [blobxfer Data Movement Library](80-blobxfer-python-library.md)
1515
7. [Performance Considerations](98-performance-considerations.md)
