package gptar
GPTar
You know GUID Partition Tables. You love the tape archive format. Now you can have both!
With GPTar you can now create a GPT-formatted disk image which is also a valid tar archive! Put a partition right after the GPT partition table and you can store your real tar archive data there. Create more partitions to store data elsewhere on the disk for Full Performance™.
$ fdisk -l disk.img
Disk disk.img: 512 KiB, 524288 bytes, 1024 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 54A6B56C-9BBA-4AC9-A8B5-B25EA2F4057C
Device Start End Sectors Size Type
disk.img1 34 37 4 2K unknown
Are you somehow unable to mount the disk, but still curious what files are in the tar archive? Well, fret not! Use standard tar utilities to inspect or extract the contents:
$ tar -tvf disk.img
?r-------- 0/0 16896 1970-01-01 01:00 GPTAR unknown file type ‘G’
-r-------- 0/0 14 1970-01-01 01:00 test.txt
How does this black magic work!?
A GPT-formatted disk starts with a so-called "protective" MBR. Thankfully, a tar header only uses the first few hundred bytes of its 512-byte block, and those bytes land in the MBR bootstrap code area when the two are merged. So the protective MBR is modified to carry, as bootstrap code, a dummy tar header for a file GPTAR whose length covers the rest of the first LBA (if the block size is greater than 512 bytes), the GPT header and the partition table entries. The remaining space can then be used as a tar archive, too.
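The trick can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GPTar's own OCaml code: the function names are made up for this example, the ustar magic is a guess, and the mode, owner, timestamp and 'G' type flag simply mirror the tar listing shown earlier. The partition entry follows the usual protective MBR layout (type 0xEE, starting at LBA 1), and the table is assumed to hold the customary 128 entries of 128 bytes each.

```python
import struct

TAR_BLOCK = 512  # a tar header always occupies one 512-byte block


def gptar_file_size(sector_size=512, num_entries=128, entry_size=128):
    """Bytes the dummy GPTAR file must cover (names here are hypothetical)."""
    rest_of_lba0 = sector_size - TAR_BLOCK          # zero for 512-byte sectors
    gpt_header = sector_size                        # the GPT header fills LBA 1
    table_bytes = num_entries * entry_size
    table_sectors = -(-table_bytes // sector_size)  # ceiling division
    return rest_of_lba0 + gpt_header + table_sectors * sector_size


def octal_field(value, width):
    """Zero-padded octal digits plus a trailing NUL, as tar numeric fields use."""
    return f"{value:0{width - 1}o}".encode("ascii") + b"\0"


def protective_mbr_with_tar_header(disk_sectors, sector_size=512):
    """Return 512 bytes that are both a protective MBR and a tar header."""
    block = bytearray(TAR_BLOCK)

    # tar header fields; all of them sit inside the bootstrap area (bytes 0..445)
    block[0:5] = b"GPTAR"                                           # name
    block[100:108] = octal_field(0o400, 8)                          # mode -r--------
    block[108:116] = octal_field(0, 8)                              # uid
    block[116:124] = octal_field(0, 8)                              # gid
    block[124:136] = octal_field(gptar_file_size(sector_size), 12)  # size
    block[136:148] = octal_field(0, 12)                             # mtime (epoch)
    block[156:157] = b"G"                                           # type flag
    block[257:265] = b"ustar\x00" + b"00"         # magic + version (assumption)

    # protective MBR: a single 0xEE entry at offset 446 spanning the disk
    size_lba = min(disk_sectors - 1, 0xFFFFFFFF)
    block[446:462] = struct.pack(
        "<B3sB3sII", 0x00, b"\x00\x02\x00", 0xEE, b"\xff\xff\xff", 1, size_lba
    )
    block[510:512] = b"\x55\xaa"                                    # boot signature

    # The tar checksum sums all 512 bytes with the checksum field read as
    # spaces, so the partition entry and boot signature count towards it.
    block[148:156] = b" " * 8
    checksum = sum(block)
    block[148:156] = f"{checksum:06o}".encode("ascii") + b"\0 "
    return bytes(block)


if __name__ == "__main__":
    sector = protective_mbr_with_tar_header(disk_sectors=1024)
    print(gptar_file_size())  # 16896, the GPTAR size in the tar listing
```

With 512-byte sectors the dummy file spans the GPT header plus 32 sectors of partition entries, which is 33 sectors or 16896 bytes, and that is why the first partition in the fdisk output starts at sector 34. Under the same assumptions, 4096-byte sectors would give 3584 + 4096 + 16384 = 24064 bytes.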
Why though?!
We at Robur implemented an opam mirror that uses the disk mainly as a tar archive, but some data is cached at the end of the disk using mirage-block-partition. This works fine, and we can list the contents of the tar archive on disk using traditional tar utilities. However, the disk partitioning information is not stored on the disk and must be passed on the command line, which could lead to data corruption if the wrong offsets are used. Using a table such as GPT or MBR would work, but then we lose the ability to inspect the tar archive. This ungodly hack is a compromise: it gives us an on-disk partition table while preserving the ability to inspect the archive, at the cost of the GPTAR dummy file (and my soul, allegedly).