)]}'
{
  "commit": "37dc6d81042ac41437163264e7a29d3bf50c8d90",
  "tree": "4fa524fad26762eb880b9aea26c4bfb0ab969add",
  "parents": [
    "b5b1f4c0ecf20c069d0301732edfdbfec167aa0c"
  ],
  "author": {
    "name": "Taylor Blau",
    "email": "me@ttaylorr.com",
    "time": "Mon Oct 02 20:44:32 2023 -0400"
  },
  "committer": {
    "name": "Junio C Hamano",
    "email": "gitster@pobox.com",
    "time": "Thu Oct 05 13:26:11 2023 -0700"
  },
  "message": "builtin/repack.c: implement support for `--max-cruft-size`\n\nCruft packs are an alternative mechanism for storing a collection of\nunreachable objects whose mtimes are recent enough to avoid being\npruned out of the repository.\n\nWhen cruft packs were first introduced back in b757353676\n(builtin/pack-objects.c: --cruft without expiration, 2022-05-20) and\na7d493833f (builtin/pack-objects.c: --cruft with expiration,\n2022-05-20), the recommended workflow consisted of:\n\n  - Repacking periodically, either by packing anything loose in the\n    repository (via `git repack -d`) or producing a geometric sequence\n    of packs (via `git repack --geometric\u003d\u003cd\u003e -d`).\n\n  - Every so often, splitting the repository into two packs, one cruft\n    to store the unreachable objects, and another non-cruft pack to\n    store the reachable objects.\n\nRepositories may (out of band with the above) choose periodically to\nprune out some unreachable objects which have aged out of the grace\nperiod by generating a pack with `--cruft-expiration\u003d\u003capproxidate\u003e`.\n\nThis allowed repositories to maintain relatively few packs on average,\nand quarantine unreachable objects together in a cruft pack, avoiding\nthe pitfalls of holding unreachable objects as loose while they age out\n(for more, see some of the details in 3d89a8c118\n(Documentation/technical: add cruft-packs.txt, 2022-05-20)).\n\nThis all works, but can be costly from an I/O-perspective when\nfrequently repacking a repository that has many unreachable objects.\nThis problem is exacerbated when those unreachable objects are rarely\n(if ever) pruned.\n\nSince there is at most one cruft pack in the above scheme, each time we\nupdate the cruft pack it must be rewritten from scratch. Because much of\nthe pack is reused, this is a relatively inexpensive operation from a\nCPU-perspective, but is very costly in terms of I/O since we end up\nrewriting basically the same pack (plus any new unreachable objects that\nhave entered the repository since the last time a cruft pack was\ngenerated).\n\nAt the time, we decided against implementing more robust support for\nmultiple cruft packs. This patch implements that support which we were\nlacking.\n\nIntroduce a new option `--max-cruft-size` which allows repositories to\naccumulate cruft packs up to a given size, after which point a new\ngeneration of cruft packs can accumulate until it reaches the maximum\nsize, and so on. To generate a new cruft pack, the process works like\nso:\n\n  - Sort a list of any existing cruft packs in ascending order of pack\n    size.\n\n  - Starting from the beginning of the list, group cruft packs together\n    while the accumulated size is smaller than the maximum specified\n    pack size.\n\n  - Combine the objects in these cruft packs together into a new cruft\n    pack, along with any other unreachable objects which have since\n    entered the repository.\n\nOnce a cruft pack grows beyond the size specified via `--max-cruft-size`\nthe pack is effectively frozen. This limits the I/O churn up to a\nquadratic function of the value specified by the `--max-cruft-size`\noption, instead of behaving quadratically in the number of total\nunreachable objects.\n\nWhen pruning unreachable objects, we bypass the new code paths which\ncombine small cruft packs together, and instead start from scratch,\npassing in the appropriate `--max-pack-size` down to `pack-objects`,\nputting it in charge of keeping the resulting set of cruft packs sized\ncorrectly.\n\nThis may seem like further I/O churn, but in practice it isn\u0027t so bad.\nWe could prune old cruft packs for whom all or most objects are removed,\nand then generate a new cruft pack with just the remaining set of\nobjects. But this additional complexity buys us relatively little,\nbecause most objects end up being pruned anyway, so the I/O churn is\nwell contained.\n\nSigned-off-by: Taylor Blau \u003cme@ttaylorr.com\u003e\nSigned-off-by: Junio C Hamano \u003cgitster@pobox.com\u003e\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "ca47eb200882a251ddf7ad644151bb6e896b242f",
      "old_mode": 33188,
      "old_path": "Documentation/config/gc.txt",
      "new_id": "83ebc2de557feb86b3b797261e544cdb0b523d93",
      "new_mode": 33188,
      "new_path": "Documentation/config/gc.txt"
    },
    {
      "type": "modify",
      "old_id": "90806fd26aa4ac0d1fef93d25384af3799818896",
      "old_mode": 33188,
      "old_path": "Documentation/git-gc.txt",
      "new_id": "b5561c458a101c32a51ccca8599d4898e166b174",
      "new_mode": 33188,
      "new_path": "Documentation/git-gc.txt"
    },
    {
      "type": "modify",
      "old_id": "4017157949e6d764a619f870c0eed538d3e99f64",
      "old_mode": 33188,
      "old_path": "Documentation/git-repack.txt",
      "new_id": "fbfc72e1b255c9efa997577ee66e8146caf5ce70",
      "new_mode": 33188,
      "new_path": "Documentation/git-repack.txt"
    },
    {
      "type": "modify",
      "old_id": "00192ae5d32162229dee52cde1a77a6846c75443",
      "old_mode": 33188,
      "old_path": "builtin/gc.c",
      "new_id": "c97c9fb04644456866d43c147e7e3e735f1be097",
      "new_mode": 33188,
      "new_path": "builtin/gc.c"
    },
    {
      "type": "modify",
      "old_id": "8a5bbb9cbaf862b8a6279b2f0cdde07163c11662",
      "old_mode": 33188,
      "old_path": "builtin/repack.c",
      "new_id": "04770b15fe7ee21727fdda173bbabdb5f5760d73",
      "new_mode": 33188,
      "new_path": "builtin/repack.c"
    },
    {
      "type": "modify",
      "old_id": "69509d0c11db655a12fc6ed20215ed25976c46eb",
      "old_mode": 33261,
      "old_path": "t/t6500-gc.sh",
      "new_id": "0d6b5c3b27a0c9ebc4303ea481ef82618f4b7407",
      "new_mode": 33261,
      "new_path": "t/t6500-gc.sh"
    },
    {
      "type": "modify",
      "old_id": "d91fcf1af1b5644efddacdc6e8cb6b0aa39e20b7",
      "old_mode": 33261,
      "old_path": "t/t7704-repack-cruft.sh",
      "new_id": "dc86ca8269bb1ac89d9e71f634dfd4f983dec33f",
      "new_mode": 33261,
      "new_path": "t/t7704-repack-cruft.sh"
    }
  ]
}
