)]}'
{
  "commit": "835e0aaf6f0e07e9f9a393ed0e456db7c1be33ef",
  "tree": "4f1d523bf2416b0e08fdde84a8224403f89fd5a9",
  "parents": [
    "2bf8f36ddb308b084912f8265ad6fd60df34a036"
  ],
  "author": {
    "name": "Patrick Steinhardt",
    "email": "ps@pks.im",
    "time": "Fri Mar 13 07:45:21 2026 +0100"
  },
  "committer": {
    "name": "Junio C Hamano",
    "email": "gitster@pobox.com",
    "time": "Fri Mar 13 08:54:15 2026 -0700"
  },
  "message": "builtin/pack-objects: reduce lock contention when writing packfile data\n\nWhen running `git pack-objects --stdout` we feed the data through\n`hashfd_ext()` with a progress meter and a smaller-than-usual buffer\nlength of 8kB so that we can track throughput more granularly. But as\npackfiles tend to be on the larger side, this small buffer size may\ncause a ton of write(3p) syscalls.\n\nOriginally, the buffer we used in `hashfd()` was 8kB for all use cases.\nThis was changed though in 2ca245f8be (csum-file.h: increase hashfile\nbuffer size, 2021-05-18) because we noticed that the number of writes\ncan have an impact on performance. So the buffer size was increased to\n128kB, which improved performance a bit for some use cases.\n\nBut the commit didn\u0027t touch the buffer size for `hashfd_throughput()`.\nThe reasoning here was that callers expect the progress indicator to\nupdate frequently, and a larger buffer size would of course reduce the\nupdate frequency especially on slow networks.\n\nWhile that is of course true, there was (and still is, even though it\u0027s\nnow a call to `hashfd_ext()`) only a single caller of this function in\ngit-pack-objects(1). This command is responsible for writing packfiles,\nand those packfiles are often on the bigger side. So arguably:\n\n  - The user won\u0027t care about increments of 8kB when packfiles tend to\n    be megabytes or even gigabytes in size.\n\n  - Reducing the number of syscalls would be even more valuable here\n    than it would be for multi-pack indices, which was the benchmark\n    done in the mentioned commit, as MIDXs are typically significantly\n    smaller than packfiles.\n\n  - Nowadays, many internet connections should be able to transfer data\n    at a rate significantly higher than 8kB per second.\n\nUpdate the buffer to instead have a size of `LARGE_PACKET_DATA_MAX - 1`,\nwhich translates to ~64kB. This limit was chosen because `git\npack-objects --stdout` is most often used when sending packfiles via\ngit-upload-pack(1), where packfile data is chunked into pktlines when\nusing the sideband. Furthermore, most internet connections should have a\nbandwidth significantly higher than 64kB/s, so we\u0027d still be able to\nobserve progress updates at a rate of at least once per second.\n\nThis change significantly reduces the number of write(3p) syscalls from\n355,000 to 44,000 when packing the Linux repository. While this results\nin a small performance improvement on an otherwise-unused system, this\nimprovement is mostly negligible. More importantly though, it will\nreduce lock contention in the kernel on an extremely busy system where\nwe have many processes writing data at once.\n\nSuggested-by: Jeff King \u003cpeff@peff.net\u003e\nSigned-off-by: Patrick Steinhardt \u003cps@pks.im\u003e\nSigned-off-by: Junio C Hamano \u003cgitster@pobox.com\u003e\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "ba150a80ada2df880d0bed34fcebfc1248aa2b46",
      "old_mode": 33188,
      "old_path": "builtin/pack-objects.c",
      "new_id": "7301ed8c681dc5e02f1ff4bedef000263a0a774a",
      "new_mode": 33188,
      "new_path": "builtin/pack-objects.c"
    }
  ]
}
