)]}'
{
  "commit": "02a7f6ffab9ec7641f88032f30998976bca07820",
  "tree": "dc918724f836b06ad846aaef3f63dab54e46a801",
  "parents": [
    "89219bc0cd09ada8a204e0ace0bd15decaea7d31"
  ],
  "author": {
    "name": "Patrick Steinhardt",
    "email": "ps@pks.im",
    "time": "Thu Oct 30 11:38:41 2025 +0100"
  },
  "committer": {
    "name": "Junio C Hamano",
    "email": "gitster@pobox.com",
    "time": "Thu Oct 30 07:09:52 2025 -0700"
  },
  "message": "packfile: fix approximation of object counts\n\nWhen approximating the number of objects in a repository we only take\ninto account two data sources, the multi-pack index and the packfile\nindices, as both of these data structures allow us to easily figure out\nhow many objects they contain.\n\nBut the way we currently approximate the number of objects is broken in\npresence of a multi-pack index. This is due to two separate reasons:\n\n  - We have recently introduced initial infrastructure for incremental\n    multi-pack indices. Starting with that series, `num_objects` only\n    counts the number of objects of a specific layer of the MIDX chain,\n    so we do not take into account objects from parent layers.\n\n    This issue is fixed by adding `num_objects_in_base`, which contains\n    the sum of all objects in previous layers.\n\n  - When using the multi-pack index we may count objects contained in\n    packfiles twice: once via the multi-pack index, but then we again\n    count them via the packfile itself.\n\n    This issue is fixed by skipping any packfiles that have an MIDX.\n\nOverall, given that we _always_ count the packs, we can only end up\noverestimating the number of objects, and the overestimation is limited\nto a factor of two at most.\n\nThe consequences of those issues are very limited though, as we only\napproximate object counts in a small number of cases:\n\n  - When writing a commit-graph we use the approximate object count to\n    display the upper limit of a progress display.\n\n  - In `repo_find_unique_abbrev_r()` we use it to specify a lower limit\n    of how many hex digits we want to abbreviate to. Given that we use\n    power-of-two here to derive the lower limit we may end up with an\n    abbreviated hash that is one digit longer than required.\n\n  - In `estimate_repack_memory()` we may end up overestimating how much\n    memory a repack needs to pack objects. 
Consequently, we may end up\n    dropping some packfiles from a repack.\n\nNone of these are really game-changing. But it\u0027s nice to fix those\nissues regardless.\n\nWhile at it, convert the code to use `repo_for_each_pack()`.\nFurthermore, use `odb_prepare_alternates()` instead of explicitly\npreparing the packfile store. We really only want to prepare the object\ndatabase sources, and `get_multi_pack_index()` already knows to prepare\nthe packfile store for us.\n\nHelped-by: Taylor Blau \u003cme@ttaylorr.com\u003e\nSigned-off-by: Patrick Steinhardt \u003cps@pks.im\u003e\nSigned-off-by: Junio C Hamano \u003cgitster@pobox.com\u003e\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "6aa2ca8ac9ee43b1e9614cc08f788afab2447620",
      "old_mode": 33188,
      "old_path": "packfile.c",
      "new_id": "b07509b69bd7cc417f534c34a2e0e84d9d4ccc01",
      "new_mode": 33188,
      "new_path": "packfile.c"
    }
  ]
}
