From: Derrick Stolee <dstolee@microsoft.com>
To: git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com, stolee@gmail.com,
Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v5 0/4] Improve abbreviation disambiguation
Date: Thu, 12 Oct 2017 08:02:16 -0400
Message-ID: <20171012120220.226427-1-dstolee@microsoft.com>
In-Reply-To: <61168095-d392-39d2-ba65-823525239b5c@gmail.com>

Changes since previous version:

* Make 'pos' unsigned in get_hex_char_from_oid()
* Check response from open_pack_index()
* Small typos in commit messages

Thanks,
Stolee
---
When displaying object ids, we frequently want to see an abbreviation
for easier typing. That abbreviation must be unambiguous among all
object ids.
The current implementation of find_unique_abbrev() performs a loop
checking if each abbreviation length is unambiguous until finding one
that works. This causes multiple round-trips to the disk when starting
with the default abbreviation length (usually 7) but needing up to 12
characters for an unambiguous short-sha. For very large repos, this
effect is pronounced and causes issues with several commands, from
obvious consumers `status` and `log` to less obvious commands such as
`fetch` and `push`.
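
To make the cost of the retry loop concrete, here is a minimal sketch, not the code in sha1_name.c. The names count_matches() and abbrev_len_by_retry() are hypothetical, and an in-memory array of hex strings stands in for the object store. Each pass through the loop rescans every id, which corresponds to one more lookup against packs and loose objects in the real implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for one object-store lookup: count the ids
 * whose first `len` hex characters equal those of `prefix`.
 */
size_t count_matches(const char *const *ids, size_t nr,
		     const char *prefix, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < nr; i++)
		if (!strncmp(ids[i], prefix, len))
			n++;
	return n;
}

/*
 * Try each length in turn, starting at min_len, until only one id
 * matches. `target` is assumed to be present in `ids`. *lookups
 * reports how many full scans were needed.
 */
size_t abbrev_len_by_retry(const char *target,
			   const char *const *ids, size_t nr,
			   size_t min_len, size_t *lookups)
{
	size_t full = strlen(target);
	size_t len;

	assert(min_len > 0 && min_len <= full);

	*lookups = 0;
	for (len = min_len; len < full; len++) {
		(*lookups)++;
		if (count_matches(ids, nr, target, len) <= 1)
			break; /* unambiguous at this length */
	}
	return len;
}
```

With ids "abcdef1234" and "abcdef9999" present and min_len of 4, finding the abbreviation for "abcdef1234" takes four scans (lengths 4 through 7) before it settles on 7 characters.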
This patch improves performance by iterating over objects matching the
short abbreviation only once, inspecting each object id, and reporting
the minimum length of an unambiguous abbreviation.
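
As a rough illustration of the single-pass idea (a sketch under stated assumptions, not the code from this series): the required length is one more than the longest prefix the target shares with any other object id, so one scan suffices. The helper names common_prefix_len() and min_unique_abbrev_len() are hypothetical, and the ids are plain hex strings in memory, whereas the real code reads pack indexes and loose objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading hex characters that a and b have in common. */
size_t common_prefix_len(const char *a, const char *b)
{
	size_t i = 0;

	while (a[i] && a[i] == b[i])
		i++;
	return i;
}

/*
 * Hypothetical helper: smallest length >= min_len at which `target`
 * is unambiguous among `ids`. Every id is inspected exactly once.
 */
size_t min_unique_abbrev_len(const char *target,
			     const char *const *ids, size_t nr,
			     size_t min_len)
{
	size_t len = min_len;
	size_t i;

	assert(min_len > 0 && min_len <= strlen(target));

	for (i = 0; i < nr; i++) {
		size_t shared;

		if (!strcmp(ids[i], target))
			continue; /* the object itself is not a collision */

		/* One more character than the shared prefix separates the two. */
		shared = common_prefix_len(target, ids[i]);
		if (shared + 1 > len)
			len = shared + 1;
	}
	return len;
}
```

For example, with ids "abcdef1234" and "abcdef9999" present and min_len of 4, the target "abcdef1234" shares six characters with its closest neighbor, so the function returns 7 after a single pass.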
Add a new perf test for testing the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time
computing diffs. Below we report performance statistics for perf test
4211.6 from p4211-line-log.sh using three copies of the Linux repo:
| Packs | Loose | Base Time | New Time | Rel% |
|-------|--------|-----------|----------|-------|
| 1 | 0 | 41.27 s | 38.93 s | -4.8% |
| 24 | 0 | 98.04 s | 91.35 s | -5.7% |
| 23 | 323952 | 117.78 s | 112.18 s | -4.8% |

Derrick Stolee (4):
  p4211-line-log.sh: add log --online --raw --parents perf test
  sha1_name: unroll len loop in find_unique_abbrev_r
  sha1_name: parse less while finding common prefix
  sha1_name: minimize OID comparisons during disambiguation

 sha1_name.c              | 135 +++++++++++++++++++++++++++++++++++++++++------
 t/perf/p4211-line-log.sh |   4 ++
 2 files changed, 123 insertions(+), 16 deletions(-)

--
2.14.1.538.g56ec8fc98.dirty

Thread overview: 22+ messages
2017-10-08 18:49 [PATCH v4 0/4] Improve abbreviation disambiguation Derrick Stolee
2017-10-08 18:49 ` [PATCH v4 1/4] p4211-line-log.sh: add log --online --raw --parents perf test Derrick Stolee
2017-10-08 18:49 ` [PATCH v4 2/4] sha1_name: unroll len loop in find_unique_abbrev_r Derrick Stolee
2017-10-08 18:49 ` [PATCH v4 3/4] sha1_name: parse less while finding common prefix Derrick Stolee
2017-10-09 13:42 ` Jeff King
2017-10-08 18:49 ` [PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation Derrick Stolee
2017-10-09 13:49 ` Jeff King
2017-10-10 12:16 ` Derrick Stolee
2017-10-10 12:36 ` Jeff King
2017-10-10 12:56 ` Junio C Hamano
2017-10-10 13:09 ` Jeff King
2017-10-10 13:11 ` Derrick Stolee
2017-10-10 13:30 ` Jeff King
2017-10-11 13:58 ` Derrick Stolee
2017-10-12 12:02 ` Derrick Stolee [this message]
2017-10-12 12:04 ` [PATCH v5 0/4] Improve abbreviation disambiguation Derrick Stolee
2017-10-12 12:21 ` Junio C Hamano
2017-10-12 14:22 ` Jeff King
2017-10-12 12:02 ` [PATCH v5 1/4] p4211-line-log.sh: add log --online --raw --parents perf test Derrick Stolee
2017-10-12 12:02 ` [PATCH v5 2/4] sha1_name: unroll len loop in find_unique_abbrev_r Derrick Stolee
2017-10-12 12:02 ` [PATCH v5 3/4] sha1_name: parse less while finding common prefix Derrick Stolee
2017-10-12 12:02 ` [PATCH v5 4/4] sha1_name: minimize OID comparisons during disambiguation Derrick Stolee