combine-diff: handle --find-object in multitree code path

When doing combined diffs, we have two possible code paths:

  - a slower one which independently diffs against each parent, applies
    any filters, and then intersects the resulting paths

  - a faster one which walks all trees simultaneously

When the diff options specify that we must do certain filters, like
pickaxe, then we always use the slow path, since the pickaxe code only
knows how to handle filepairs, not the n-parent entries generated for
combined diffs.

But there are two problems with the slow path:

 1. It's slow. Running:

      git rev-list HEAD | git diff-tree --stdin -r -c

    in git.git takes ~3s on my machine. But adding "--find-object" to
    that increases it to ~6s, even though find-object itself should
    incur only a few extra oid comparisons. On linux.git, it's even
    worse: 35s versus 215s.

 2. It doesn't catch all cases where a particular path is interesting.
    Consider a merge with parent blobs X and Y for a particular path,
    and end result Z. That should be interesting according to "-c",
    because the result doesn't match either parent. And it should be
    interesting even with "--find-object=X", because "X" went away in
    the merge.

    But because we perform each pairwise diff independently, this
    confuses the intersection code. The change from X to Z is still
    interesting according to --find-object. But in the other parent we
    went from Y to Z, so the diff appears empty! That causes the
    intersection code to think that parent didn't change the path, and
    thus it's not interesting for "-c".
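
To make the X/Y/Z case concrete, here is a toy C model of the pairwise
path. It is only a sketch under simplifying assumptions: single
characters stand in for blob oids, and the names toy_oid,
pair_survives() and slow_path_shows() are invented for illustration
and do not exist in git.git.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object id: one character stands in for a blob oid. */
typedef char toy_oid;

/*
 * Toy pairwise --find-object filter: a parent->result filepair survives
 * only if it is a real change and the wanted object is on either side.
 */
bool pair_survives(toy_oid parent, toy_oid result, toy_oid wanted)
{
    if (parent == result)
        return false; /* no change against this parent */
    return parent == wanted || result == wanted;
}

/*
 * Toy slow path: diff against each parent independently, filter each
 * diff, then keep the path only if every per-parent diff still lists it.
 */
bool slow_path_shows(const toy_oid *parents, size_t nr,
                     toy_oid result, toy_oid wanted)
{
    assert(nr > 0);
    for (size_t i = 0; i < nr; i++) {
        if (!pair_survives(parents[i], result, wanted))
            return false; /* looks "unchanged" for this parent */
    }
    return true;
}
```

With parents X and Y, result Z, and a wanted object of X,
slow_path_shows() returns false: the Y-to-Z pair is dropped by the
filter, so the intersection step discards the path.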

This patch fixes both by implementing --find-object for the multitree
code. It's a bit unfortunate that we have to duplicate some logic from
diffcore-pickaxe, but this is the best we can do for now. In an ideal
world, all of the diffcore code would stop thinking about filepairs and
start thinking about n-parent sets, and we could use the multitree walk
with all of it.
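
As a rough illustration of the n-parent idea (a toy model, not the code
in this patch; toy_oid and multitree_shows() are hypothetical names),
the check can look at the result and all parents at once instead of
filtering each pair separately:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object id: one character stands in for a blob oid. */
typedef char toy_oid;

/*
 * Toy n-way check: the path is shown if the result differs from every
 * parent (the "-c" condition) and the wanted object appears as the
 * result or as any parent's blob (the --find-object condition).
 */
bool multitree_shows(const toy_oid *parents, size_t nr,
                     toy_oid result, toy_oid wanted)
{
    bool differs_from_all = true;
    bool mentions_wanted = (result == wanted);

    assert(nr > 0);
    for (size_t i = 0; i < nr; i++) {
        if (parents[i] == result)
            differs_from_all = false;
        if (parents[i] == wanted)
            mentions_wanted = true;
    }
    return differs_from_all && mentions_wanted;
}
```

For parents X and Y with result Z, looking for X now reports the path,
because the Y parent no longer has to pass the object filter on its own.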

Until then, there are some leftover warts:

  - other pickaxe operations, like -S or -G, still suffer from both
    problems. These would be hard to adapt because they rely on having
    a "struct diff_filespec" for each path to look at content. And we'd need to
    define what an n-way "change" means in each case (probably easy for
    "-S", which can compare counts, but not so clear for -G, which is
    about grepping diffs).

  - other options besides --find-object may cause us to use the slow
    pairwise path, in which case we'll go back to producing a different
    (wrong) answer for the X/Y/Z case above.

We may be able to hack around these, but I think the ultimate solution
will be a larger rewrite of the diffcore code. For now, this patch
improves one specific case but leaves the rest.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2 files changed
tree: 71222615b3cd83c2268a57622b34278a257a5ad8
  1. .github/
  2. block-sha1/
  3. builtin/
  4. ci/
  5. compat/
  6. contrib/
  7. Documentation/
  8. ewah/
  9. git-gui/
  10. gitk-git/
  11. gitweb/
  12. mergetools/
  13. negotiator/
  14. perl/
  15. po/
  16. ppc/
  17. refs/
  18. sha1dc/
  19. sha256/
  20. t/
  21. templates/
  22. trace2/
  23. vcs-svn/
  24. xdiff/
  25. .cirrus.yml
  26. .clang-format
  27. .editorconfig
  28. .gitattributes
  29. .gitignore
  30. .gitmodules
  31. .mailmap
  32. .travis.yml
  33. .tsan-suppressions
  34. abspath.c
  35. aclocal.m4
  36. add-interactive.c
  37. add-interactive.h
  38. add-patch.c
  39. advice.c
  40. advice.h
  41. alias.c
  42. alias.h
  43. alloc.c
  44. alloc.h
  45. apply.c
  46. apply.h
  47. archive-tar.c
  48. archive-zip.c
  49. archive.c
  50. archive.h
  51. argv-array.c
  52. argv-array.h
  53. attr.c
  54. attr.h
  55. banned.h
  56. base85.c
  57. bisect.c
  58. bisect.h
  59. blame.c
  60. blame.h
  61. blob.c
  62. blob.h
  63. bloom.c
  64. bloom.h
  65. branch.c
  66. branch.h
  67. bugreport.c
  68. builtin.h
  69. bulk-checkin.c
  70. bulk-checkin.h
  71. bundle.c
  72. bundle.h
  73. cache-tree.c
  74. cache-tree.h
  75. cache.h
  76. chdir-notify.c
  77. chdir-notify.h
  78. check-builtins.sh
  79. check_bindir
  80. checkout.c
  81. checkout.h
  82. CODE_OF_CONDUCT.md
  83. color.c
  84. color.h
  85. column.c
  86. column.h
  87. combine-diff.c
  88. command-list.txt
  89. commit-graph.c
  90. commit-graph.h
  91. commit-reach.c
  92. commit-reach.h
  93. commit-slab-decl.h
  94. commit-slab-impl.h
  95. commit-slab.h
  96. commit.c
  97. commit.h
  98. common-main.c
  99. config.c
  100. config.h
  101. config.mak.dev
  102. config.mak.in
  103. config.mak.uname
  104. configure.ac
  105. connect.c
  106. connect.h
  107. connected.c
  108. connected.h
  109. convert.c
  110. convert.h
  111. copy.c
  112. COPYING
  113. credential-cache--daemon.c
  114. credential-cache.c
  115. credential-store.c
  116. credential.c
  117. credential.h
  118. csum-file.c
  119. csum-file.h
  120. ctype.c
  121. daemon.c
  122. date.c
  123. decorate.c
  124. decorate.h
  125. delta-islands.c
  126. delta-islands.h
  127. delta.h
  128. detect-compiler
  129. diff-delta.c
  130. diff-lib.c
  131. diff-no-index.c
  132. diff.c
  133. diff.h
  134. diffcore-break.c
  135. diffcore-delta.c
  136. diffcore-order.c
  137. diffcore-pickaxe.c
  138. diffcore-rename.c
  139. diffcore.h
  140. dir-iterator.c
  141. dir-iterator.h
  142. dir.c
  143. dir.h
  144. editor.c
  145. entry.c
  146. environment.c
  147. exec-cmd.c
  148. exec-cmd.h
  149. fast-import.c
  150. fetch-negotiator.c
  151. fetch-negotiator.h
  152. fetch-pack.c
  153. fetch-pack.h
  154. fmt-merge-msg.c
  155. fmt-merge-msg.h
  156. fsck.c
  157. fsck.h
  158. fsmonitor.c
  159. fsmonitor.h
  160. fuzz-commit-graph.c
  161. fuzz-pack-headers.c
  162. fuzz-pack-idx.c
  163. generate-cmdlist.sh
  164. generate-configlist.sh
  165. gettext.c
  166. gettext.h
  167. git-add--interactive.perl
  168. git-archimport.perl
  169. git-bisect.sh
  170. git-compat-util.h
  171. git-cvsexportcommit.perl
  172. git-cvsimport.perl
  173. git-cvsserver.perl
  174. git-difftool--helper.sh
  175. git-filter-branch.sh
  176. git-instaweb.sh
  177. git-merge-octopus.sh
  178. git-merge-one-file.sh
  179. git-merge-resolve.sh
  180. git-mergetool--lib.sh
  181. git-mergetool.sh
  182. git-p4.py
  183. git-parse-remote.sh
  184. git-quiltimport.sh
  185. git-rebase--preserve-merges.sh
  186. git-request-pull.sh
  187. git-send-email.perl
  188. git-sh-i18n.sh
  189. git-sh-setup.sh
  190. git-submodule.sh
  191. git-svn.perl
  192. GIT-VERSION-GEN
  193. git-web--browse.sh
  194. git.c
  195. git.rc
  196. gpg-interface.c
  197. gpg-interface.h
  198. graph.c
  199. graph.h
  200. grep.c
  201. grep.h
  202. hash.h
  203. hashmap.c
  204. hashmap.h
  205. help.c
  206. help.h
  207. hex.c
  208. http-backend.c
  209. http-fetch.c
  210. http-push.c
  211. http-walker.c
  212. http.c
  213. http.h
  214. ident.c
  215. imap-send.c
  216. INSTALL
  217. interdiff.c
  218. interdiff.h
  219. iterator.h
  220. json-writer.c
  221. json-writer.h
  222. khash.h
  223. kwset.c
  224. kwset.h
  225. levenshtein.c
  226. levenshtein.h
  227. LGPL-2.1
  228. line-log.c
  229. line-log.h
  230. line-range.c
  231. line-range.h
  232. linear-assignment.c
  233. linear-assignment.h
  234. list-objects-filter-options.c
  235. list-objects-filter-options.h
  236. list-objects-filter.c
  237. list-objects-filter.h
  238. list-objects.c
  239. list-objects.h
  240. list.h
  241. ll-merge.c
  242. ll-merge.h
  243. lockfile.c
  244. lockfile.h
  245. log-tree.c
  246. log-tree.h
  247. ls-refs.c
  248. ls-refs.h
  249. mailinfo.c
  250. mailinfo.h
  251. mailmap.c
  252. mailmap.h
  253. Makefile
  254. match-trees.c
  255. mem-pool.c
  256. mem-pool.h
  257. merge-blobs.c
  258. merge-blobs.h
  259. merge-recursive.c
  260. merge-recursive.h
  261. merge.c
  262. mergesort.c
  263. mergesort.h
  264. midx.c
  265. midx.h
  266. name-hash.c
  267. notes-cache.c
  268. notes-cache.h
  269. notes-merge.c
  270. notes-merge.h
  271. notes-utils.c
  272. notes-utils.h
  273. notes.c
  274. notes.h
  275. object-store.h
  276. object.c
  277. object.h
  278. oid-array.c
  279. oid-array.h
  280. oidmap.c
  281. oidmap.h
  282. oidset.c
  283. oidset.h
  284. pack-bitmap-write.c
  285. pack-bitmap.c
  286. pack-bitmap.h
  287. pack-check.c
  288. pack-objects.c
  289. pack-objects.h
  290. pack-revindex.c
  291. pack-revindex.h
  292. pack-write.c
  293. pack.h
  294. packfile.c
  295. packfile.h
  296. pager.c
  297. parse-options-cb.c
  298. parse-options.c
  299. parse-options.h
  300. patch-delta.c
  301. patch-ids.c
  302. patch-ids.h
  303. path.c
  304. path.h
  305. pathspec.c
  306. pathspec.h
  307. pkt-line.c
  308. pkt-line.h
  309. preload-index.c
  310. pretty.c
  311. pretty.h
  312. prio-queue.c
  313. prio-queue.h
  314. progress.c
  315. progress.h
  316. promisor-remote.c
  317. promisor-remote.h
  318. prompt.c
  319. prompt.h
  320. protocol.c
  321. protocol.h
  322. prune-packed.c
  323. prune-packed.h
  324. quote.c
  325. quote.h
  326. range-diff.c
  327. range-diff.h
  328. reachable.c
  329. reachable.h
  330. read-cache.c
  331. README.md
  332. rebase-interactive.c
  333. rebase-interactive.h
  334. rebase.c
  335. rebase.h
  336. ref-filter.c
  337. ref-filter.h
  338. reflog-walk.c
  339. reflog-walk.h
  340. refs.c
  341. refs.h
  342. refspec.c
  343. refspec.h
  344. remote-curl.c
  345. remote-testsvn.c
  346. remote.c
  347. remote.h
  348. replace-object.c
  349. replace-object.h
  350. repo-settings.c
  351. repository.c
  352. repository.h
  353. rerere.c
  354. rerere.h
  355. reset.c
  356. reset.h
  357. resolve-undo.c
  358. resolve-undo.h
  359. revision.c
  360. revision.h
  361. run-command.c
  362. run-command.h
  363. send-pack.c
  364. send-pack.h
  365. sequencer.c
  366. sequencer.h
  367. serve.c
  368. serve.h
  369. server-info.c
  370. setup.c
  371. sh-i18n--envsubst.c
  372. sha1-file.c
  373. sha1-lookup.c
  374. sha1-lookup.h
  375. sha1-name.c
  376. sha1dc_git.c
  377. sha1dc_git.h
  378. shallow.c
  379. shallow.h
  380. shell.c
  381. shortlog.h
  382. sideband.c
  383. sideband.h
  384. sigchain.c
  385. sigchain.h
  386. split-index.c
  387. split-index.h
  388. stable-qsort.c
  389. strbuf.c
  390. strbuf.h
  391. streaming.c
  392. streaming.h
  393. string-list.c
  394. string-list.h
  395. sub-process.c
  396. sub-process.h
  397. submodule-config.c
  398. submodule-config.h
  399. submodule.c
  400. submodule.h
  401. symlinks.c
  402. tag.c
  403. tag.h
  404. tar.h
  405. tempfile.c
  406. tempfile.h
  407. thread-utils.c
  408. thread-utils.h
  409. tmp-objdir.c
  410. tmp-objdir.h
  411. trace.c
  412. trace.h
  413. trace2.c
  414. trace2.h
  415. trailer.c
  416. trailer.h
  417. transport-helper.c
  418. transport-internal.h
  419. transport.c
  420. transport.h
  421. tree-diff.c
  422. tree-walk.c
  423. tree-walk.h
  424. tree.c
  425. tree.h
  426. unicode-width.h
  427. unimplemented.sh
  428. unix-socket.c
  429. unix-socket.h
  430. unpack-trees.c
  431. unpack-trees.h
  432. upload-pack.c
  433. upload-pack.h
  434. url.c
  435. url.h
  436. urlmatch.c
  437. urlmatch.h
  438. usage.c
  439. userdiff.c
  440. userdiff.h
  441. utf8.c
  442. utf8.h
  443. varint.c
  444. varint.h
  445. version.c
  446. version.h
  447. versioncmp.c
  448. walker.c
  449. walker.h
  450. wildmatch.c
  451. wildmatch.h
  452. worktree.c
  453. worktree.h
  454. wrap-for-bin.sh
  455. wrapper.c
  456. write-or-die.c
  457. ws.c
  458. wt-status.c
  459. wt-status.h
  460. xdiff-interface.c
  461. xdiff-interface.h
  462. zlib.c
README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just “subscribe git” in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the “What's cooking” reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name “git” was given by Linus Torvalds when he wrote the very first version. He described the tool as “the stupid content tracker” and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of “get” may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • “global information tracker”: you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • “goddamn idiotic truckload of sh*t”: when it breaks.