|  | GIT index format | 
|  | ================ | 
|  |  | 
|  | = The git index file has the following format | 
|  |  | 
|  | All binary numbers are in network byte order. Version 2 is described | 
|  | here unless stated otherwise. | 
|  |  | 
|  | - A 12-byte header consisting of | 
|  |  | 
|  | 4-byte signature: | 
|  | The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache") | 
|  |  | 
|  | 4-byte version number: | 
|  | The current supported versions are 2 and 3. | 
|  |  | 
|  | 32-bit number of index entries. | 
|  |  | 
|  | - A number of sorted index entries (see below). | 
|  |  | 
|  | - Extensions | 
|  |  | 
|  | Extensions are identified by signature. Optional extensions can | 
|  | be ignored if GIT does not understand them. | 
|  |  | 
|  | GIT currently supports cached tree and resolve undo extensions. | 
|  |  | 
|  | 4-byte extension signature. If the first byte is 'A'..'Z' the | 
|  | extension is optional and can be ignored. | 
|  |  | 
|  | 32-bit size of the extension | 
|  |  | 
|  | Extension data | 
|  |  | 
|  | - 160-bit SHA-1 over the content of the index file before this | 
|  | checksum. | 
|  |  | 
|  | == Index entry | 
|  |  | 
|  | Index entries are sorted in ascending order on the name field, | 
|  | interpreted as a string of unsigned bytes (i.e. memcmp() order, no | 
|  | localization, no special casing of directory separator '/'). Entries | 
|  | with the same name are sorted by their stage field. | 
|  |  | 
|  | 32-bit ctime seconds, the last time a file's metadata changed | 
|  | this is stat(2) data | 
|  |  | 
|  | 32-bit ctime nanosecond fractions | 
|  | this is stat(2) data | 
|  |  | 
|  | 32-bit mtime seconds, the last time a file's data changed | 
|  | this is stat(2) data | 
|  |  | 
|  | 32-bit mtime nanosecond fractions | 
|  | this is stat(2) data | 
|  |  | 
|  | 32-bit dev | 
|  | this is stat(2) data | 
|  |  | 
|  | 32-bit ino | 
|  | this is stat(2) data | 
|  |  | 
|  | 32-bit mode, split into (high to low bits) | 
|  |  | 
|  | 4-bit object type | 
|  | valid values in binary are 1000 (regular file), 1010 (symbolic link) | 
|  | and 1110 (gitlink) | 
|  |  | 
|  | 3-bit unused | 
|  |  | 
|  | 9-bit unix permission. Only 0755 and 0644 are valid for regular files. | 
|  | Symbolic links and gitlinks have value 0 in this field. | 
|  |  | 
|  | 32-bit uid | 
|  | this is stat(2) data | 
|  |  | 
|  | 32-bit gid | 
|  | this is stat(2) data | 
|  |  | 
|  | 32-bit file size | 
|  | This is the on-disk size from stat(2), truncated to 32-bit. | 
|  |  | 
|  | 160-bit SHA-1 for the represented object | 
|  |  | 
|  | A 16-bit 'flags' field split into (high to low bits) | 
|  |  | 
|  | 1-bit assume-valid flag | 
|  |  | 
|  | 1-bit extended flag (must be zero in version 2) | 
|  |  | 
|  | 2-bit stage (during merge) | 
|  |  | 
|  | 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF | 
|  | is stored in this field. | 
|  |  | 
|  | (Version 3) A 16-bit field, only applicable if the "extended flag" | 
|  | above is 1, split into (high to low bits). | 
|  |  | 
|  | 1-bit reserved for future | 
|  |  | 
|  | 1-bit skip-worktree flag (used by sparse checkout) | 
|  |  | 
|  | 1-bit intent-to-add flag (used by "git add -N") | 
|  |  | 
|  | 13-bit unused, must be zero | 
|  |  | 
|  | Entry path name (variable length) relative to top level directory | 
|  | (without leading slash). '/' is used as path separator. The special | 
|  | path components ".", ".." and ".git" (without quotes) are disallowed. | 
|  | Trailing slash is also disallowed. | 
|  |  | 
|  | The exact encoding is undefined, but the '.' and '/' characters | 
|  | are encoded in 7-bit ASCII and the encoding cannot contain a NUL | 
|  | byte (iow, this is a UNIX pathname). | 
|  |  | 
|  | 1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes | 
|  | while keeping the name NUL-terminated. | 
|  |  | 
|  | == Extensions | 
|  |  | 
|  | === Cached tree | 
|  |  | 
|  | Cached tree extension contains pre-computed hashes for trees that can | 
|  | be derived from the index. It helps speed up tree object generation | 
|  | from index for a new commit. | 
|  |  | 
|  | When a path is updated in index, the path must be invalidated and | 
|  | removed from tree cache. | 
|  |  | 
|  | The signature for this extension is { 'T', 'R', 'E', 'E' }. | 
|  |  | 
|  | A series of entries fill the entire extension; each of which | 
|  | consists of: | 
|  |  | 
|  | - NUL-terminated path component (relative to its parent directory); | 
|  |  | 
|  | - ASCII decimal number of entries in the index that is covered by the | 
|  | tree this entry represents (entry_count); | 
|  |  | 
|  | - A space (ASCII 32); | 
|  |  | 
|  | - ASCII decimal number that represents the number of subtrees this | 
|  | tree has; | 
|  |  | 
|  | - A newline (ASCII 10); and | 
|  |  | 
|  | - 160-bit object name for the object that would result from writing | 
|  | this span of index as a tree. | 
|  |  | 
|  | An entry can be in an invalidated state and is represented by having -1 | 
|  | in the entry_count field. | 
|  |  | 
|  | The entries are written out in the top-down, depth-first order.  The | 
|  | first entry represents the root level of the repository, followed by the | 
|  | first subtree---let's call this A---of the root level (with its name | 
|  | relative to the root level), followed by the first subtree of A (with | 
|  | its name relative to A), ... | 
|  |  | 
|  | === Resolve undo | 
|  |  | 
|  | A conflict is represented in the index as a set of higher stage entries. | 
|  | When a conflict is resolved (e.g. with "git add path"), these higher | 
|  | stage entries will be removed and a stage-0 entry with proper resoluton | 
|  | is added. | 
|  |  | 
|  | When these higher stage entries are removed, they are saved in the | 
|  | resolve undo extension, so that conflicts can be recreated (e.g. with | 
|  | "git checkout -m"), in case users want to redo a conflict resolution | 
|  | from scratch. | 
|  |  | 
|  | The signature for this extension is { 'R', 'E', 'U', 'C' }. | 
|  |  | 
|  | A series of entries fill the entire extension; each of which | 
|  | consists of: | 
|  |  | 
|  | - NUL-terminated pathname the entry describes (relative to the root of | 
|  | the repository, i.e. full pathname); | 
|  |  | 
|  | - Three NUL-terminated ASCII octal numbers, entry mode of entries in | 
|  | stage 1 to 3 (a missing stage is represented by "0" in this field); | 
|  | and | 
|  |  | 
|  | - At most three 160-bit object names of the entry in stages from 1 to 3 | 
|  | (nothing is written for a missing stage). | 
|  |  |