diff --git a/doc/format.txt b/doc/format.txt
index df7f1d7..90f714c 100755
--- a/doc/format.txt
+++ b/doc/format.txt
@@ -1,20 +1,20 @@
-╔══════════════════════════════════════════════════════════════════════════════╗
-║               Elastic, Compressed, Content-Addressed Container               ║
-║               ╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍               ║
-║                          File Format Specification                           ║
-╚══════════════════════════════════════════════════════════════════════════════╝
++------------------------------------------------------------------------------+
+|               Elastic, Compressed, Content-Addressed Container               |
+|               ................................................               |
+|                          File Format Specification                           |
++------------------------------------------------------------------------------+
 
 version: 1.0
 
 
 1 Introduction
-══════════════
+==============
 
    This section provides a brief introduction to the goals that EC3 is intended
    to fulfill.
 
 
    1.1 File Format Purpose and Design Goals
-   ────────────────────────────────────────
+   ----------------------------------------
 
    The primary goals of the EC3 image format can be found in its name:
@@ -24,7 +24,7 @@ version: 1.0
    * Compressed: The format should support compression to reduce filesize
      and increase efficiency, without compromising random-access to file
      data
-   
+
    * Content-Addressed: The format should support data de-duplication to
      further increase storage efficiency.
 
@@ -43,7 +43,7 @@ version: 1.0
 
 
    1.2 Document Scope
-   ──────────────────
+   ------------------
 
    This document describes the general layout of an EC3 image, and all of the
    data structures contained within. It provides all of the information required
@@ -55,14 +55,14 @@ version: 1.0
 
 
 2 Overview
-══════════
+==========
 
    This section provides a general overview of what an EC3 image is, how it
    works, and a preview of some of the internal data structures.
 
 
    2.1 What Is An EC3 Image?
-   ─────────────────────────
+   -------------------------
 
    An EC3 image is a data file that can contain, among other things, a set of
    zero or more logical filesystems, called volumes. Each volume has its own
@@ -78,7 +78,7 @@ version: 1.0
    simple as changing the quota of blocks that a particular logical partition is
    allowed to allocate, and doesn't require physically moving any sectors
    around.
-   
+
    EC3 builds upon this concept by employing cross-volume data de-duplication.
    Every file that is stored within an EC3 image is split into a set of fixed-
    size, content-addressed chunks. The size of these chunks is constant within
@@ -111,7 +111,7 @@ version: 1.0
 
 
    2.2 Tags: The Core Unit Of Data
-   ───────────────────────────────
+   -------------------------------
 
    At its most basic level, an EC3 image is just a set of one or more tags. A
    tag is a contiguous segment of binary data with an associated type and
@@ -122,19 +122,19 @@ version: 1.0
 
 
 3 Types & Units
-═══════════════
+===============
 
    This section describes the fundamental data types used within EC3 data
    structures, as well as some of the units used throughout this document.
 
 
    3.1 Integral Types
-   ──────────────────
+   ------------------
 
    All integer values are stored in big-endian format. All signed integer values
    are stored in 2s-complement format. The following integer types are used:
       Name        Size                       Sign
-      ───────────────────────────────────────────────
+      -----------------------------------------------
       uint8       8 bits (1 byte)            Unsigned
       uint16      16 bits (2 bytes)          Unsigned
       uint32      32 bits (4 bytes)          Unsigned
@@ -146,14 +146,14 @@ version: 1.0
 
 
    3.2 String Types
-   ────────────────
+   ----------------
 
    All strings are stored in UTF-8 Unicode format with a trailing null terminator
    byte.
 
 
    3.3 Storage Size Units
-   ──────────────────────
+   ----------------------
 
    Throughout this document, any reference to kilobytes, megabytes, etc refer
    to the base-2 units, rather than the base-10 units. For example, 1 kilobyte
@@ -161,14 +161,14 @@ version: 1.0
 
 
 4 Algorithms
-════════════
+============
 
    EC3 uses a range of algorithms. A selection of hashing algorithms are used
    for fast data lookup and for ensuring data integrity.
 
 
    4.1 Fast Hast
-   ─────────────
+   -------------
 
    The Fast Hash algorithm is optimised for hashing string data. It is intended
    for use in string-based hashmaps. The algorithm used for this purpose is
@@ -182,7 +182,7 @@ version: 1.0
 
 
    4.2 Slow Hash
-   ─────────────
+   -------------
 
    The Slow Hash function is optimised for minimal chance of hash collisions. It
    is intended to generate the content hashes used to uniquely identify data
@@ -191,7 +191,7 @@ version: 1.0
 
 
    4.3 Checksum
-   ────────────
+   ------------
 
    The Checksum algorithm is used to validate the contents of an EC3 image and
    detect any corruption. The algorithm used for this purpose is the CRC32
@@ -203,7 +203,7 @@ version: 1.0
 
 
 5 Image Header
-══════════════
+==============
 
    The Image Header can be found at the beginning of every EC3 image file. It
    provides critical information about the rest of the file, including the
@@ -217,10 +217,10 @@ version: 1.0
 
 
    5.1 Image Header Layout
-   ───────────────────────
+   -----------------------
 
       Offset    Description             Type
-      ────────────────────────────────────────
+      ----------------------------------------
       0x00      Signature               uint32
       0x04      Format Version          uint16
       0x06      Chunk Size              uint16
@@ -242,7 +242,7 @@ version: 1.0
       0              1
       0              6
       XXXXXXXXYYYYYYYY
-   
+
    Where X encodes the major number of the format version, and Y encodes the
    minor version of the format version. For example, version 3.2 would be
    encoded as 0x0302.
@@ -255,7 +255,7 @@ version: 1.0
    The following chunk size values are defined:
 
       Header Value      Chunk Size (bytes)      Chunk Size (kilobytes)
-      ────────────────────────────────────────────────────────────────
+      ----------------------------------------------------------------
       0x00              16,384                  16
       0x01              32,768                  32
       0x02              65,536                  64
@@ -279,7 +279,7 @@ version: 1.0
 
 
 6 Tags
-══════
+======
 
    Tags are the fundamental units of data storage in an EC3 image. Every image
    contains one or more tags. A tag is essentially a contiguous range of data
@@ -290,7 +290,7 @@ version: 1.0
 
 
    6.1 The Tag Table
-   ─────────────────
+   -----------------
 
    The Tag Table describes all of the tags in an image. Its location and size
    can be found by parsing the Image Header. The Tag Table consists of a number
@@ -299,7 +299,7 @@ version: 1.0
    Each entry in the Tag Table has the following layout:
 
       Offset    Description             Type
-      ────────────────────────────────────────
+      ----------------------------------------
       0x00      Tag Type                uint32
       0x04      Flags                   uint32
       0x08      Checksum                uint32
@@ -338,7 +338,7 @@ version: 1.0
 
 
    6.2 Tag Types
-   ─────────────
+   -------------
 
    The type of a tag determines the format of the data contained within it.
 
@@ -388,49 +388,49 @@ version: 1.0
 
 
    6.3 Tag Flags
-   ─────────────
+   -------------
 
 
    6.4 Tag Identifiers
-   ───────────────────
-   
+   -------------------
+
 
 7 Manifest
-══════════
+==========
 
 
 8 Volumes
-═════════
+=========
 
 
    8.1 Filesystem Tree
-   ───────────────────
+   -------------------
 
 
    8.2 Clusters
-   ────────────
+   ------------
 
 
    8.3 String Table
-   ────────────────
+   ----------------
 
 
    8.4 Extended Attributes
-   ───────────────────────
+   -----------------------
 
 
 9 Binary Blobs
-══════════════
+==============
 
 
 10 Embedded Executables
-═══════════════════════
+=======================
 
 
 11 Signature Verification
-═════════════════════════
+=========================
 
 
 12 Encryption
-═════════════
+=============
 vim: shiftwidth=3 expandtab