pack-objects: reuse data from existing packs.

When generating a new pack, notice if the objects we need are
already in existing packs.  If an object is stored deltified,
and its base object is also what we are going to pack, then
reuse the existing deltified representation unconditionally,
bypassing all the expensive find_deltas() and try_deltas()
calls.
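
The reuse rule above can be sketched roughly as follows.  This is
only an illustration, not the code in pack-objects.c: struct entry,
in_pack_list() and plan_reuse() are hypothetical names, and the
linear lookup stands in for whatever index the real code uses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one object we are going to pack. */
struct entry {
    const char *id;          /* object name */
    const char *stored_base; /* base it is deltified against in an
                                existing pack, or NULL if not deltified */
    int reuse_delta;         /* filled in by plan_reuse() */
};

/* Return 1 if an object named id is among the objects to be packed. */
static int in_pack_list(const struct entry *list, size_t n, const char *id)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i].id, id) == 0)
            return 1;
    return 0;
}

/*
 * Mark every entry whose existing delta can be reused because its
 * base is also going into the new pack.  Returns the number of
 * entries that would still need the expensive delta search.
 */
static size_t plan_reuse(struct entry *list, size_t n)
{
    size_t need_search = 0;

    for (size_t i = 0; i < n; i++) {
        struct entry *e = &list[i];

        e->reuse_delta = e->stored_base != NULL &&
                         in_pack_list(list, n, e->stored_base);
        if (!e->reuse_delta)
            need_search++;
    }
    return need_search;
}
```

An entry whose stored base is not part of the new pack is left for
the normal delta search, since the old delta would be useless to a
reader that does not have the base.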

Also, notice if what we are going to write out exactly matches
what is already in an existing pack (either deltified or just
compressed).  In such a case, we can just copy it instead of
going through the usual uncompressing & recompressing cycle.
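
A minimal sketch of the copy-versus-recompress choice, under
assumptions: struct obj, write_one() and store_verbatim() are made
up for illustration, and store_verbatim() is only a stand-in for
the real compression step so that the example stays self-contained.

```c
#include <stddef.h>
#include <string.h>

/* Encoder callback: writes the encoded form of raw into out and
 * returns the number of bytes written. */
typedef size_t (*encode_fn)(const unsigned char *raw, size_t raw_len,
                            unsigned char *out);

/* Hypothetical description of one object to write. */
struct obj {
    const unsigned char *packed; /* bytes as stored in an existing pack,
                                    or NULL if only available loose */
    size_t packed_len;
    int matches_output;          /* stored form is exactly what we
                                    would write out */
    const unsigned char *raw;    /* uncompressed contents */
    size_t raw_len;
};

/* Placeholder encoder: copies raw bytes unchanged.  A real encoder
 * would compress here. */
static size_t store_verbatim(const unsigned char *raw, size_t raw_len,
                             unsigned char *out)
{
    memcpy(out, raw, raw_len);
    return raw_len;
}

/*
 * Write one object into out.  If the existing packed bytes match what
 * we want, copy them as they are and skip the encoder entirely;
 * otherwise go through the encoder.  *reused reports which path ran.
 */
static size_t write_one(const struct obj *o, unsigned char *out,
                        encode_fn encode, int *reused)
{
    if (o->packed != NULL && o->matches_output) {
        memcpy(out, o->packed, o->packed_len);
        *reused = 1;
        return o->packed_len;
    }
    *reused = 0;
    return encode(o->raw, o->raw_len, out);
}
```

Counting how often *reused comes back set gives a figure comparable
to the "reused" number printed in the timing run further down.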

Without this patch, in the linux-2.6 repository with about 1500
loose objects and a single mega pack:

    $ git-rev-list --objects v2.6.16-rc3 >RL
    $ wc -l RL
    184141 RL
    $ time git-pack-objects p <RL
    Generating pack...
    Done counting 184141 objects.
    Packing 184141 objects....................
    a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2

    real    12m4.323s
    user    11m2.560s
    sys     0m55.950s

With this patch, the same input:

    $ time ../git.junio/git-pack-objects q <RL
    Generating pack...
    Done counting 184141 objects.
    Packing 184141 objects.....................
    a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2
    Total 184141, written 184141, reused 182441

    real    1m2.608s
    user    0m55.090s
    sys     0m1.830s

Signed-off-by: Junio C Hamano <junkio@cox.net>
3 files changed
tree: 6a2f4bbf0f88d1f1cbdf8bf1ad873616eec392bc