From: Derrick Stolee <dstolee@microsoft.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, stolee@gmail.com, peff@peff.net,
	ramsay@ramsayjones.plus.com, sbeller@google.com,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v4 0/4] Improve abbreviation disambiguation
Date: Sun, 8 Oct 2017 14:49:38 -0400
Message-ID: <20171008184942.69444-1-dstolee@microsoft.com>
Changes since previous version:
* Fixed an overflow error in the binary search. I sent a separate patch
to fix this error in existing searches; that patch should be applied
before this one.
* Removed test-list-objects and test-abbrev in favor of a new git log
test in p4211-line-log.sh. Limited perf numbers for Linux repo are
given in cover letter and commit 4/4.
* Silently skip packfiles that fail to open with open_pack_index()
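
For readers unfamiliar with the overflow mentioned in the first item, here is a minimal sketch of the usual pattern. It is not the code from the separate patch; `midpoint()` and `sorted_find()` are hypothetical helpers, and the sorted table of 32-bit values is assumed only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: midpoint of [lo, hi)
 * without forming lo + hi, which can wrap around for large 32-bit values.
 */
static uint32_t midpoint(uint32_t lo, uint32_t hi)
{
	return lo + (hi - lo) / 2;
}

/*
 * Hypothetical helper: return the index of `key` in the sorted array
 * `table` of `nr` entries, or -1 if it is not present.
 */
static int64_t sorted_find(const uint32_t *table, uint32_t nr, uint32_t key)
{
	uint32_t lo = 0, hi = nr; /* search the half-open range [lo, hi) */

	while (lo < hi) {
		uint32_t mi = midpoint(lo, hi);

		if (table[mi] == key)
			return (int64_t)mi;
		if (table[mi] < key)
			lo = mi + 1;
		else
			hi = mi;
	}
	return -1;
}
```

Computing `(lo + hi) / 2` directly gives the wrong midpoint once `lo + hi` no longer fits in the type, which is the class of error the item refers to.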
Thanks for all the comments from Jeff, Junio, Ramsay, and Stefan!
Thanks,
Stolee
---
When displaying object ids, we frequently want to see an abbreviation
for easier typing. That abbreviation must be unambiguous among all
object ids.
The current implementation of find_unique_abbrev() performs a loop
checking if each abbreviation length is unambiguous until finding one
that works. This causes multiple round-trips to the disk when starting
with the default abbreviation length (usually 7) but needing up to 12
characters for an unambiguous short-sha. For very large repos, this
effect is pronounced and causes issues with several commands, from
obvious consumers `status` and `log` to less obvious commands such as
`fetch` and `push`.
This patch improves performance by iterating over objects matching the
short abbreviation only once, inspecting each object id, and reporting
the minimum length of an unambiguous abbreviation.
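
To illustrate the idea, here is a simplified sketch of a single-pass computation. It assumes the candidate objects are available as an in-memory array of hex strings, which is not how sha1_name.c stores them (the real code walks pack indexes and loose objects); `common_prefix_len()` and `min_unique_abbrev_len()` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Number of leading hex characters shared by two object ids. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t i = 0;

	while (a[i] && a[i] == b[i])
		i++;
	return i;
}

/*
 * Hypothetical helper: return the shortest prefix length of `target`,
 * no smaller than `min_len`, that matches no other id in `ids`.
 * Every candidate is inspected exactly once.
 */
static size_t min_unique_abbrev_len(const char *target,
				    const char *const *ids, size_t nr,
				    size_t min_len)
{
	size_t len = min_len;
	size_t i;

	for (i = 0; i < nr; i++) {
		size_t shared;

		if (!strcmp(ids[i], target))
			continue; /* the object itself is not a collision */
		shared = common_prefix_len(target, ids[i]);
		if (shared + 1 > len)
			len = shared + 1; /* one more than the longest shared prefix */
	}
	return len;
}
```

Each candidate can only raise the required length, so one pass yields the same answer as retrying lengths 7, 8, 9, ... one at a time.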
Add a new perf test that measures the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time
computing diffs. Below we report performance statistics for perf test
4211.6 from p4211-line-log.sh using three copies of the Linux repo:
| Packs | Loose | Base Time | New Time | Rel% |
|-------|--------|-----------|----------|-------|
| 1 | 0 | 41.27 s | 38.93 s | -4.8% |
| 24 | 0 | 98.04 s | 91.35 s | -5.7% |
| 23 | 323952 | 117.78 s | 112.18 s | -4.8% |
Derrick Stolee (4):
  p4211-line-log.sh: add log --online --raw --parents perf test
  sha1_name: Unroll len loop in find_unique_abbrev_r
  sha1_name: Parse less while finding common prefix
  sha1_name: Minimize OID comparisons during disambiguation

 sha1_name.c              | 129 +++++++++++++++++++++++++++++++++++++++++------
 t/perf/p4211-line-log.sh |   4 ++
 2 files changed, 118 insertions(+), 15 deletions(-)
--
2.14.1.538.g56ec8fc98.dirty