1 | | -# `bmap-tools` |
| 1 | +The code at this location is no longer maintained and will |
| 2 | +likely be removed in the future. |
2 | 3 | |
3 | | -> The better `dd` for embedded projects, based on block maps. |
4 | | - |
5 | | -## Introduction |
6 | | - |
7 | | -`bmaptool` is a generic tool for creating the block map (bmap) for a file and |
8 | | -copying files using the block map. The idea is that large files, like raw |
9 | | -system image files, can be copied or flashed a lot faster and more reliably |
10 | | -with `bmaptool` than with traditional tools, like `dd` or `cp`. |
11 | | - |
12 | | -`bmaptool` was originally created for the "Tizen IVI" project and it was used for |
13 | | -flashing system images to USB sticks and other block devices. `bmaptool` can also |
14 | | -be used for general image flashing purposes, for example, flashing Fedora Linux |
15 | | -OS distribution images to USB sticks. |
16 | | - |
17 | | -Originally Tizen IVI images had been flashed using the `dd` tool, but bmaptool |
18 | | -brought a number of advantages. |
19 | | - |
20 | | -* Faster. Depending on various factors, like write speed, image size, how full |
21 | | - the image is, and so on, `bmaptool` was 5-7 times faster than `dd` in the Tizen |
22 | | - IVI project. |
23 | | -* Integrity. `bmaptool` verifies data integrity while flashing, which means that |
24 | | - possible data corruptions will be noticed immediately. |
25 | | -* Usability. `bmaptool` can read images directly from the remote server, so users |
26 | | - do not have to download images and save them locally. |
27 | | -* Protects user's data. Unlike `dd`, if you make a mistake and specify a wrong |
28 | | - block device name, `bmaptool` is less likely to destroy your data because it has |
29 | | - protection mechanisms which, for example, prevent `bmaptool` from writing to a |
30 | | - mounted block device. |
31 | | - |
32 | | -## Usage |
33 | | - |
34 | | -`bmaptool` supports 2 subcommands: |
35 | | -* `copy` - copy a file to another file using bmap or flash an image to a block |
36 | | - device |
37 | | -* `create` - create a bmap for a file |
38 | | - |
39 | | -You can get usage reference for `bmaptool` and all the supported commands using |
40 | | -the `-h` or `--help` options: |
41 | | - |
42 | | -```bash |
43 | | -$ bmaptool -h # General bmaptool help |
44 | | -$ bmaptool <cmd> -h # Help on the <cmd> sub-command |
45 | | -``` |
46 | | - |
47 | | -You can also refer to the `bmaptool` manual page: |
48 | | -```bash |
49 | | -$ man bmaptool |
50 | | -``` |
51 | | - |
52 | | -## Concept |
53 | | - |
54 | | -This section provides general information about the block map (bmap) necessary |
55 | | -for understanding how `bmaptool` works. The structure of the section is: |
56 | | - |
57 | | -* "Sparse files" - the bmap ideas are based on sparse files, so it is important |
58 | | - to understand what sparse files are. |
59 | | -* "The block map" - explains what bmap is. |
60 | | -* "Raw images" - the main usage scenario for `bmaptool` is flashing raw images, |
61 | | - which this section discusses. |
62 | | -* "Usage scenarios" - describes various possible bmap and `bmaptool` usage |
63 | | - scenarios. |
64 | | - |
65 | | -### Sparse files |
66 | | - |
67 | | -One of the main roles of a filesystem, generally speaking, is to map blocks of |
68 | | -file data to disk sectors. Different file-systems do this mapping differently, |
69 | | -and filesystem performance largely depends on how well the filesystem can do |
70 | | -the mapping. The filesystem block size is usually 4KiB, but may also be 8KiB or |
71 | | -larger. |
72 | | - |
73 | | -Obviously, to implement the mapping, the file-system has to maintain some kind |
74 | | -of on-disk index. For any file on the file-system, and any offset within the |
75 | | -file, the index allows you to find the corresponding disk sector, which stores |
76 | | -the file's data. Whenever we write to a file, the filesystem looks up the index |
77 | | -and writes to the corresponding disk sectors. Sometimes the filesystem has to |
78 | | -allocate new disk sectors and update the index (such as when appending data to |
79 | | -the file). The filesystem index is sometimes referred to as the "filesystem |
80 | | -metadata". |
81 | | - |
82 | | -What happens if a file area is not mapped to any disk sectors? Is this |
83 | | -possible? The answer is yes. It is possible and these unmapped areas are often |
84 | | -called "holes". And those files which have holes are often called "sparse |
85 | | -files". |
86 | | - |
87 | | -All reasonable file-systems like Linux ext[234], btrfs, XFS, or Solaris ZFS, |
88 | | -and even Windows' NTFS, support sparse files. Old and less reasonable |
89 | | -filesystems, like FAT, do not support holes. |
90 | | - |
91 | | -Reading holes returns zeroes. Writing to a hole causes the filesystem to |
92 | | -allocate disk sectors for the corresponding blocks. Here is how you can create |
93 | | -a 4GiB file with all blocks unmapped, which means that the file consists of a |
94 | | -huge 4GiB hole: |
95 | | - |
96 | | -```bash |
97 | | -$ truncate -s 4G image.raw |
98 | | -$ stat image.raw |
99 | | - File: image.raw |
100 | | - Size: 4294967296 Blocks: 0 IO Block: 4096 regular file |
101 | | -``` |
102 | | - |
103 | | -Notice that `image.raw` is a 4GiB file, which occupies 0 blocks on the disk! |
104 | | -So, the entire file's contents are not mapped anywhere. Reading this file would |
105 | | -result in reading 4GiB of zeroes. If you write to the middle of the image.raw |
106 | | -file, you'll end up with 2 holes and a mapped area in the middle. |
107 | | - |
108 | | -Therefore: |
109 | | -* Sparse files are files with holes. |
110 | | -* Sparse files help save disk space, because, roughly speaking, holes do not |
111 | | - occupy disk space. |
112 | | -* A hole is an unmapped area of a file, meaning that it is not mapped anywhere |
113 | | - on the disk. |
114 | | -* Reading data from a hole returns zeroes. |
115 | | -* Writing data to a hole destroys it by forcing the filesystem to map |
116 | | - corresponding file areas to disk sectors. |
117 | | -* Filesystems usually operate with blocks, so sizes and offsets of holes are |
118 | | - aligned to the block boundary. |
119 | | - |
120 | | -It is also useful to know that you should work with sparse files carefully. It |
121 | | -is easy to accidentally expand a sparse file, that is, to map all holes to |
122 | | -zero-filled disk areas. For example, `scp` always expands sparse files, the |
123 | | -`tar` and `rsync` tools do the same, by default, unless you use the `--sparse` |
124 | | -option. Compressing and then decompressing a sparse file usually expands it. |
125 | | - |
126 | | -There are 2 ioctl's in Linux which allow you to find mapped and unmapped areas: |
127 | | -`FIBMAP` and `FIEMAP`. The former is very old and is probably supported by all |
128 | | -Linux systems, but it is rather limited and requires root privileges. The |
129 | | -latter is a lot more advanced and does not require root privileges, but it is |
130 | | -relatively new (added in Linux kernel, version 2.6.28). |
131 | | - |
132 | | -Recent versions of the Linux kernel (starting from 3.1) also support the |
133 | | -`SEEK_HOLE` and `SEEK_DATA` values for the `whence` argument of the standard |
134 | | -`lseek()` system call. They allow positioning to the next hole and the next |
135 | | -mapped area of the file. |
136 | | - |
137 | | -Advanced Linux filesystems, in modern kernels, also allow "punching holes", |
138 | | -meaning that it is possible to unmap any aligned area and turn it into a hole. |
139 | | -This is implemented using the `FALLOC_FL_PUNCH_HOLE` `mode` of the |
140 | | -`fallocate()` system call. |
141 | | - |
142 | | -### The bmap |
143 | | - |
144 | | -The bmap is an XML file, which contains a list of mapped areas, plus some |
145 | | -additional information about the file it was created for, for example: |
146 | | -* SHA256 checksum of the bmap file itself |
147 | | -* SHA256 checksum of the mapped areas |
148 | | -* the original file size |
149 | | -* amount of mapped data |
150 | | - |
151 | | -The bmap file is designed to be both easily machine-readable and |
152 | | -human-readable. All the machine-readable information is provided by XML tags. |
153 | | -The human-oriented information is in XML comments, which explain the meaning of |
154 | | -XML tags and provide useful information like amount of mapped data in percent |
155 | | -and in MiB or GiB. |
156 | | - |
157 | | -So, the best way to understand bmap is just to read it. Here is an |
158 | | -[example of a bmap file](tests/test-data/test.image.bmap.v2.0). |
159 | | - |
160 | | -### Raw images |
161 | | - |
162 | | -Raw images are the simplest type of system images which may be flashed to the |
163 | | -target block device, block-by-block, without any further processing. Raw images |
164 | | -just "mirror" the target block device: they usually start with the MBR sector. |
165 | | -There is a partition table at the beginning of the image and one or more |
166 | | -partitions containing filesystems, like ext4. Usually, no special tools are |
167 | | -required to flash a raw image to the target block device. The standard `dd` |
168 | | -command can do the job: |
169 | | - |
170 | | -```bash |
171 | | -$ dd if=tizen-ivi-image.raw of=/dev/usb_stick |
172 | | -``` |
173 | | - |
174 | | -At first glance, raw images do not look very appealing because they are large |
175 | | -and it takes a lot of time to flash them. However, with bmap, raw images become |
176 | | -a much more attractive type of image. We will demonstrate this, using Tizen IVI |
177 | | -as an example. |
178 | | - |
179 | | -The Tizen IVI project uses raw images which take 3.7GiB in Tizen IVI 2.0 alpha. |
180 | | -The images are created by the MIC tool. Here is a brief description of how MIC |
181 | | -creates them: |
182 | | - |
183 | | -* create a 3.7GiB sparse file, which will become the Tizen IVI image in the end |
184 | | -* partition the file using the `parted` tool |
185 | | -* format the partitions using the `mkfs.ext4` tool |
186 | | -* loop-back mount all the partitions |
187 | | -* install all the required packages to the partitions: copy all the needed |
188 | | - files and do all the tweaks |
189 | | -* unmount all loop-back-mounted image partitions, the image is ready |
190 | | -* generate the block map file for the image |
191 | | -* compress the image using `bzip2`, turning it into a small file, around |
192 | | - 300MiB |
193 | | - |
194 | | -The Tizen IVI raw images are initially sparse files. All the mapped blocks |
195 | | -represent useful data and all the holes represent unused regions, which |
196 | | -"contain" zeroes and do not have to be copied when flashing the image. Although |
197 | | -information about holes is lost once the image gets compressed, the bmap file |
198 | | -still has it and it can be used to reconstruct the uncompressed image or to |
199 | | -flash the image quickly, by copying only the mapped regions. |
200 | | - |
201 | | -Raw images compress extremely well because the holes are essentially zeroes, |
202 | | -which compress perfectly. This is why 3.7GiB Tizen IVI raw images, which |
203 | | -contain about 1.1GiB of mapped blocks, take only 300MiB in a compressed form. |
204 | | -And the important point is that you need to decompress them only while |
205 | | -flashing. The `bmaptool` does this "on-the-fly". |
206 | | - |
207 | | -Therefore: |
208 | | -* raw images are distributed in a compressed form, and they are almost as small |
209 | | - as a tarball (that includes all the data the image would take) |
210 | | -* the bmap file and the `bmaptool` make it possible to quickly flash the |
211 | | - compressed raw image to the target block device |
212 | | -* optionally, the `bmaptool` can reconstruct the original uncompressed sparse raw |
213 | | - image file |
214 | | - |
215 | | -And, what is even more important, is that flashing raw images is extremely fast |
216 | | -because you write directly to the block device, and write sequentially. |
217 | | - |
218 | | -Another great thing about raw images is that they may be 100% ready-to-go and |
219 | | -all you need to do is to put the image on your device "as-is". You do not have |
220 | | -to know the image format, which partitions and filesystems it contains, etc. |
221 | | -This is simple and robust. |
222 | | - |
223 | | -### Usage scenarios |
224 | | - |
225 | | -Flashing or copying large images is the main `bmaptool` use case. The idea is |
226 | | -that if you have a raw image file and its bmap, you can flash it to a device by |
227 | | -writing only the mapped blocks and skipping the unmapped blocks. |
228 | | - |
229 | | -What this basically means is that with bmap it is not necessary to try to |
230 | | -minimize the raw image size by making the partitions small, which would require |
231 | | -resizing them. The image can contain huge multi-gigabyte partitions, just like |
232 | | -the target device requires. The image will then be a huge sparse file, with |
233 | | -little mapped data. And because unmapped areas "contain" zeroes, the huge image |
234 | | -will compress extremely well, so the huge image will be very small in |
235 | | -compressed form. It can then be distributed in compressed form, and flashed |
236 | | -very quickly with `bmaptool` and the bmap file, because `bmaptool` will decompress |
237 | | -the image on-the-fly and write only mapped areas. |
238 | | - |
239 | | -The additional benefit of using bmap for flashing is the checksum verification. |
240 | | -Indeed, the `bmaptool create` command generates SHA256 checksums for all mapped |
241 | | -block ranges, and the `bmaptool copy` command verifies the checksums while |
242 | | -writing. Integrity of the bmap file itself is also protected by a SHA256 |
243 | | -checksum and `bmaptool` verifies it before starting flashing. |
244 | | - |
245 | | -On top of this, the bmap file can be signed using OpenPGP (gpg) and bmaptool |
246 | | -automatically verifies the signature if it is present. This allows for |
247 | | -verifying the bmap file integrity and authorship. And since the bmap file |
248 | | -contains SHA256 checksums for all the mapped image data, the bmap file |
249 | | -signature verification should be enough to guarantee integrity and authorship of |
250 | | -the image file. |
251 | | - |
252 | | -The second usage scenario is reconstructing sparse files. Generally speaking, if |
253 | | -you had a sparse file but then expanded it, there is no way to reconstruct it. |
254 | | -In some cases, something like |
255 | | - |
256 | | -```bash |
257 | | -$ cp --sparse=always expanded.file reconstructed.file |
258 | | -``` |
259 | | - |
260 | | -would be enough. However, a file reconstructed this way will not necessarily be |
261 | | -the same as the original sparse file. The original sparse file could have |
262 | | -contained mapped blocks filled with all zeroes (not holes), and, in the |
263 | | -reconstructed file, these blocks will become holes. In some cases, this does |
264 | | -not matter, for example, if you just want to save disk space. However, when |
265 | | -flashing raw images, it does matter, because it is essential to write zero-filled |
266 | | -blocks and not skip them. Indeed, if you do not write the zero-filled block to |
267 | | -corresponding disk sectors which, presumably, contain garbage, you end up with |
268 | | -garbage in those blocks. In other words, when we are talking about flashing raw |
269 | | -images, the difference between zero-filled blocks and holes in the original |
270 | | -image is essential because zero-filled blocks are the required blocks which are |
271 | | -expected to contain zeroes, while holes are just unneeded blocks with no |
272 | | -expectations regarding the contents. |
273 | | - |
274 | | -`bmaptool` may be helpful for reconstructing sparse files properly. Before the |
275 | | -sparse file is expanded, you should generate its bmap (for example, by using |
276 | | -the `bmaptool create` command). Then you may compress your file or, otherwise, |
277 | | -expand it. Later on, you may reconstruct it using the `bmaptool copy` command. |
278 | | - |
279 | | -## Project structure |
280 | | - |
281 | | -```bash |
282 | | ------------------------------------------------------------------------------------- |
283 | | -| - bmaptool | A tool to create bmap and copy with bmap. Based | |
284 | | -| | on the 'BmapCreate.py' and 'BmapCopy.py' modules. | |
285 | | -| - setup.py | A script to turn the entire bmap-tools project | |
286 | | -| | into a python egg. | |
287 | | -| - setup.cfg | contains a piece of nose tests configuration | |
288 | | -| - .coveragerc | lists files to include into test coverage report | |
289 | | -| - TODO | Just a list of things to be done for the project. | |
290 | | -| - make_a_release.sh | Most people may ignore this script. It is used by | |
291 | | -| | maintainer when creating a new release. | |
292 | | -| - tests/ | Contains the project unit-tests. | |
293 | | -| | - test_api_base.py | Tests the base API modules: 'BmapCreate.py' and | |
294 | | -| | | 'BmapCopy.py'. | |
295 | | -| | - test_filemap.py | Tests the 'Filemap.py' module. | |
296 | | -| | - test_compat.py | Tests that new BmapCopy implementations support old | |
297 | | -| | | bmap formats, and old BmapCopy implementations | |
298 | | -| | | support new compatible bmap formats. | |
299 | | -| | - test_bmap_helpers.py | Tests the 'BmapHelpers.py' module. | |
300 | | -| | - helpers.py | Helper functions shared between the unit-tests. | |
301 | | -| | - test-data/ | Data files for the unit-tests | |
302 | | -| | - oldcodebase/ | Copies of old BmapCopy implementations for bmap | |
303 | | -| | | format forward-compatibility verification. | |
304 | | -| - bmaptools/ | The API modules which implement all the bmap | |
305 | | -| | | functionality. | |
306 | | -| | - BmapCreate.py | Creates a bmap for a given file. | |
307 | | -| | - BmapCopy.py | Implements copying of an image using its bmap. | |
308 | | -| | - Filemap.py | Allows for reading files' block map. | |
309 | | -| | - BmapHelpers.py | Just helper functions used all over the project. | |
310 | | -| | - TransRead.py | Provides a transparent way to read various kinds of | |
311 | | -| | | files (compressed, etc) | |
312 | | -| - debian/* | Debian packaging for the project. | |
313 | | -| - doc/* | Project documentation. | |
314 | | -| - packaging/* | RPM packaging (Fedora & OpenSuse) for the project. | |
315 | | -| - contrib/* | Various contributions that may be useful, but | |
316 | | -| | project maintainers do not really test or maintain. | |
317 | | ------------------------------------------------------------------------------------- |
318 | | -``` |
319 | | - |
320 | | -## How to run unit tests |
321 | | - |
322 | | -Just install the `nose` python test framework and run the `nosetests` command in |
323 | | -the project root directory. If you want to see tests coverage report, run |
324 | | -`nosetests --with-coverage`. |
325 | | - |
326 | | -## Known Issues |
327 | | - |
328 | | -### ZFS File System |
329 | | - |
330 | | -If running on the ZFS file system, the Linux ZFS kernel driver parameters |
331 | | -configuration can cause the finding of mapped and unmapped areas to fail. |
332 | | -This can be fixed temporarily by doing the following: |
333 | | - |
334 | | -```bash |
335 | | -$ echo 1 | sudo tee -a /sys/module/zfs/parameters/zfs_dmu_offset_next_sync |
336 | | -``` |
337 | | - |
338 | | -However, if a permanent solution is required then perform the following: |
339 | | - |
340 | | -```bash |
341 | | -$ echo "options zfs zfs_dmu_offset_next_sync=1" | sudo tee -a /etc/modprobe.d/zfs.conf |
342 | | -``` |
343 | | - |
344 | | -Depending upon your Linux distro, you may also need to do the following to |
345 | | -ensure that the permanent change is updated in all your initramfs images: |
346 | | - |
347 | | -```bash |
348 | | -$ sudo update-initramfs -u -k all |
349 | | -``` |
350 | | - |
351 | | -To verify the temporary or permanent change has worked you can use the following |
352 | | -which should return `1`: |
353 | | - |
354 | | -```bash |
355 | | -$ cat /sys/module/zfs/parameters/zfs_dmu_offset_next_sync |
356 | | -``` |
357 | | - |
358 | | -More details can be found [in the OpenZFS documentation](https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html). |
359 | | - |
360 | | -## Project and maintainer |
361 | | - |
362 | | -The bmap-tools project implements bmap-related tools and API modules. The |
363 | | -entire project is written in python and supports python 2.7 and python 3.x. |
364 | | - |
365 | | -The project author is Artem Bityutskiy (dedekind1@gmail.com). Artem is looking |
366 | | -for a new maintainer for the project. Anyone actively contributing may become a |
367 | | -maintainer. Please, let Artem know if you volunteer to be one. |
368 | | - |
369 | | -Project git repository is here: |
370 | | -https://github.com/intel/bmap-tools.git |
371 | | - |
372 | | -## Credits |
373 | | - |
374 | | -* Ed Bartosh (eduard.bartosh@intel.com) for helping me with learning python |
375 | | - (this is my first python project) and working with the Tizen IVI |
376 | | - infrastructure. Ed also implemented the packaging. |
377 | | -* Alexander Kanevskiy (alexander.kanevskiy@intel.com) and |
378 | | - Kevin Wang (kevin.a.wang@intel.com) for helping with integrating this stuff |
379 | | - to the Tizen IVI infrastructure. |
380 | | -* Simon McVittie (simon.mcvittie@collabora.co.uk) for improving Debian |
381 | | - packaging and fixing bmaptool. |
| 4 | +This project has moved to [https://github.com/yoctoproject/bmaptool](https://github.com/yoctoproject/bmaptool) |