From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, dstolee@microsoft.com,
git@jeffhostetler.com, peff@peff.net, gitster@pobox.com,
Johannes.Shindelin@gmx.de, jrnieder@gmail.com
Subject: Re: [RFC PATCH 00/18] Multi-pack index (MIDX)
Date: Sun, 07 Jan 2018 23:42:27 +0100 [thread overview]
Message-ID: <87k1wtb8a4.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <20180107181459.222909-1-dstolee@microsoft.com>
On Sun, Jan 07 2018, Derrick Stolee jotted:
> git log --oneline --raw --parents
>
> Num Packs | Before MIDX | After MIDX | Rel % | 1 pack %
> ----------+-------------+------------+--------+----------
> 1 | 35.64 s | 35.28 s | -1.0% | -1.0%
> 24 | 90.81 s | 40.06 s | -55.9% | +12.4%
> 127 | 257.97 s | 42.25 s | -83.6% | +18.6%
>
> The last column is the relative difference between the MIDX-enabled repo
> and the single-pack repo. The goal of the MIDX feature is to present the
> ODB as if it was fully repacked, so there is still room for improvement.
>
> Changing the command to
>
> git log --oneline --raw --parents --abbrev=40
>
> has no observable difference (sub 1% change in all cases). This is likely
> due to the repack I used putting commits and trees in a small number of
> packfiles so the MRU cache workes very well. On more naturally-created
> lists of packfiles, there can be up to 20% improvement on this command.
>
> We are using a version of this patch with an upcoming release of GVFS.
> This feature is particularly important in that space since GVFS performs
> a "prefetch" step that downloads a pack of commits and trees on a daily
> basis. These packfiles are placed in an alternate that is shared by all
> enlistments. Some users have 150+ packfiles and the MRU misses and
> abbreviation computations are significant. Now, GVFS manages the MIDX file
> after adding new prefetch packfiles using the following command:
>
> git midx --write --update-head --delete-expired --pack-dir=<alt>
(Not a critique of this, just a (stupid) question)
What's the practical use-case for this feature? Since it doesn't help
with --abbrev=40 the speedup is all in the part that ensures we don't
show an ambiguous SHA-1.
The reason we do that at all is because it makes for a prettier UI.
Are there things that both want the pretty SHA-1 and also care about the
throughput? I'd have expected machine parsing to just use
--no-abbrev-commit.
If something cares about both throughput and e.g. is saving the
abbreviated SHA-1s isn't it better off picking some arbitrary size
(e.g. --abbrev=20), after all the default abbreviation is going to show
something as small as possible, which may soon become ambigous after the
next commit.
next prev parent reply other threads:[~2018-01-07 22:42 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-07 18:14 [RFC PATCH 00/18] Multi-pack index (MIDX) Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 01/18] docs: Multi-Pack Index (MIDX) Design Notes Derrick Stolee
2018-01-08 19:32 ` Jonathan Tan
2018-01-08 20:35 ` Derrick Stolee
2018-01-08 22:06 ` Jonathan Tan
2018-01-07 18:14 ` [RFC PATCH 02/18] midx: specify midx file format Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 03/18] midx: create core.midx config setting Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 04/18] midx: write multi-pack indexes for an object list Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 05/18] midx: create midx builtin with --write mode Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 06/18] midx: add t5318-midx.sh test script Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 07/18] midx: teach midx --write to update midx-head Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 08/18] midx: teach git-midx to read midx file details Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 09/18] midx: find details of nth object in midx Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 10/18] midx: use existing midx when writing Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 11/18] midx: teach git-midx to clear midx files Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 12/18] midx: teach git-midx to delete expired files Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 13/18] t5318-midx.h: confirm git actions are stable Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 14/18] midx: load midx files when loading packs Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 15/18] midx: use midx for approximate object count Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 16/18] midx: nth_midxed_object_oid() and bsearch_midx() Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 17/18] sha1_name: use midx for abbreviations Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 18/18] packfile: use midx for object loads Derrick Stolee
2018-01-07 22:42 ` Ævar Arnfjörð Bjarmason [this message]
2018-01-08 0:08 ` [RFC PATCH 00/18] Multi-pack index (MIDX) Derrick Stolee
2018-01-08 10:20 ` Jeff King
2018-01-08 10:27 ` Jeff King
2018-01-08 12:28 ` Ævar Arnfjörð Bjarmason
2018-01-08 13:43 ` Johannes Schindelin
2018-01-09 6:50 ` Jeff King
2018-01-09 13:05 ` Johannes Schindelin
2018-01-09 19:51 ` Stefan Beller
2018-01-09 20:12 ` Junio C Hamano
2018-01-09 20:16 ` Stefan Beller
2018-01-09 21:31 ` Junio C Hamano
2018-01-10 17:05 ` Johannes Schindelin
2018-01-10 10:57 ` Jeff King
2018-01-08 13:43 ` Derrick Stolee
2018-01-09 7:12 ` Jeff King
2018-01-08 11:43 ` Ævar Arnfjörð Bjarmason
2018-06-06 8:13 ` Ævar Arnfjörð Bjarmason
2018-06-06 10:27 ` [RFC PATCH 0/2] unconditional O(1) SHA-1 abbreviation Ævar Arnfjörð Bjarmason
2018-06-06 10:27 ` [RFC PATCH 1/2] config.c: use braces on multiple conditional arms Ævar Arnfjörð Bjarmason
2018-06-06 10:27 ` [RFC PATCH 2/2] sha1-name: add core.validateAbbrev & relative core.abbrev Ævar Arnfjörð Bjarmason
2018-06-06 12:04 ` Christian Couder
2018-06-06 11:24 ` [RFC PATCH 00/18] Multi-pack index (MIDX) Derrick Stolee
2018-01-10 18:25 ` Martin Fick
2018-01-10 19:39 ` Derrick Stolee
2018-01-10 21:01 ` Martin Fick
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87k1wtb8a4.fsf@evledraar.gmail.com \
--to=avarab@gmail.com \
--cc=Johannes.Shindelin@gmx.de \
--cc=dstolee@microsoft.com \
--cc=git@jeffhostetler.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jrnieder@gmail.com \
--cc=peff@peff.net \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).