git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: Derrick Stolee <dstolee@microsoft.com>
To: git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com, stolee@gmail.com,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v5 0/4] Improve abbreviation disambiguation
Date: Thu, 12 Oct 2017 08:02:16 -0400
Message-ID: <20171012120220.226427-1-dstolee@microsoft.com> (raw)
In-Reply-To: <61168095-d392-39d2-ba65-823525239b5c@gmail.com>

Changes since previous version:

* Make 'pos' unsigned in get_hex_char_from_oid()

* Check response from open_pack_index()

* Small typos in commit messages 

Thanks,
 Stolee

---

When displaying object ids, we frequently want to see an abbreviation
for easier typing. That abbreviation must be unambiguous among all
object ids.

The current implementation of find_unique_abbrev() performs a loop
checking if each abbreviation length is unambiguous until finding one
that works. This causes multiple round-trips to the disk when starting
with the default abbreviation length (usually 7) but needing up to 12
characters for an unambiguous short-sha. For very large repos, this
effect is pronounced and causes issues with several commands, from
obvious consumers `status` and `log` to less obvious commands such as
`fetch` and `push`.

This patch improves performance by iterating over objects matching the
short abbreviation only once, inspecting each object id, and reporting
the minimum length of an unambiguous abbreviation.

Add a new perf test for testing the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time
computing diffs. Below we report performance statistics for perf test
4211.6 from p4211-line-log.sh using three copies of the Linux repo:

| Packs | Loose  | Base Time | New Time | Rel%  |
|-------|--------|-----------|----------|-------|
|  1    |      0 |   41.27 s |  38.93 s | -4.8% |
| 24    |      0 |   98.04 s |  91.35 s | -5.7% |
| 23    | 323952 |  117.78 s | 112.18 s | -4.8% |

Derrick Stolee (4):
  p4211-line-log.sh: add log --online --raw --parents perf test
  sha1_name: unroll len loop in find_unique_abbrev_r
  sha1_name: parse less while finding common prefix
  sha1_name: minimize OID comparisons during disambiguation
 sha1_name.c              | 135 +++++++++++++++++++++++++++++++++++++++++------
 t/perf/p4211-line-log.sh |   4 ++
 2 files changed, 123 insertions(+), 16 deletions(-)

-- 
2.14.1.538.g56ec8fc98.dirty


  reply index

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-08 18:49 [PATCH v4 " Derrick Stolee
2017-10-08 18:49 ` [PATCH v4 1/4] p4211-line-log.sh: add log --online --raw --parents perf test Derrick Stolee
2017-10-08 18:49 ` [PATCH v4 2/4] sha1_name: unroll len loop in find_unique_abbrev_r Derrick Stolee
2017-10-08 18:49 ` [PATCH v4 3/4] sha1_name: parse less while finding common prefix Derrick Stolee
2017-10-09 13:42   ` Jeff King
2017-10-08 18:49 ` [PATCH v4 4/4] sha1_name: minimize OID comparisons during disambiguation Derrick Stolee
2017-10-09 13:49   ` Jeff King
2017-10-10 12:16     ` Derrick Stolee
2017-10-10 12:36       ` Jeff King
2017-10-10 12:56         ` Junio C Hamano
2017-10-10 13:09           ` Jeff King
2017-10-10 13:11           ` Derrick Stolee
2017-10-10 13:30             ` Jeff King
2017-10-11 13:58               ` Derrick Stolee
2017-10-12 12:02                 ` Derrick Stolee [this message]
2017-10-12 12:04                   ` [PATCH v5 0/4] Improve abbreviation disambiguation Derrick Stolee
2017-10-12 12:21                     ` Junio C Hamano
2017-10-12 14:22                       ` Jeff King
2017-10-12 12:02                 ` [PATCH v5 1/4] p4211-line-log.sh: add log --online --raw --parents perf test Derrick Stolee
2017-10-12 12:02                 ` [PATCH v5 2/4] sha1_name: unroll len loop in find_unique_abbrev_r Derrick Stolee
2017-10-12 12:02                 ` [PATCH v5 3/4] sha1_name: parse less while finding common prefix Derrick Stolee
2017-10-12 12:02                 ` [PATCH v5 4/4] sha1_name: minimize OID comparisons during disambiguation Derrick Stolee

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171012120220.226427-1-dstolee@microsoft.com \
    --to=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox