git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Changing the default for "core.abbrev"?
Date: Mon, 26 Sep 2016 00:34:42 -0400	[thread overview]
Message-ID: <20160926043442.3pz7ccawdcsn2kzb@sigill.intra.peff.net> (raw)
In-Reply-To: <xmqq37knwcf4.fsf@gitster.mtv.corp.google.com>

On Sun, Sep 25, 2016 at 08:46:39PM -0700, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > I can just keep reminding kernel maintainers and developers to update
> > their git config, but maybe it would be a good idea to just admit that
> > the defaults picked in 2005 weren't necessarily the best ones
> > possible, and those could be bumped up a bit?
> >
> > I think I mentioned this some time ago, and it's not a huge deal, but
> > I thought I'd just mention it again because it came up again today for
> > me..
> 
> I am not quite sure how good any new default would be, though.  Just
> like any timeout is not long enough for somebody, growing projects
> will eventually hit whatever abbreviation length they start with.

I actually think "12" might be sane for a long time. That's 48 bits of
sha1, so we'd expect a 50% change of a _single_ collision at 2^24, or 16
million.  The biggest repository I know about (in number of objects) is
the one holding all of the objects for all of the forks of
torvalds/linux on GitHub. It's at about 15 million objects.

Which _seems_ close, but remember that's the size where we expect to see
a single collision. They don't become common until much later (I didn't
compute an exact number, but Linus's 16x sounds about right). I know
that the growth of the kernel isn't really linear, but I think the need
to bump to "13" might not just be decades, but possibly a century or
more.

So 12 seems reasonable, and the only downside for it (or for "13", for
that matter) is a few extra bytes. I dunno, maybe people will really
hate that, but I have a feeling these are mostly cut-and-pasted anyway.

> I am not opposed to bump the default to 12 or whatever, but I
> suspect any lengthening today may need to be accompanied by a tool
> support that finds the set of objects that are reachable from a
> commit whose names begin with non-unique abbreviations that appear
> in the commit log message. Assuming that it is very hard to refer to
> future objects in the log message you write today, such a tool may
> find a single object that used to be the unique instance of that
> abbreviation back then, and with reachability bitmap support, it may
> not be too expensive to run.

I had a similar thought, but I think it's not just reachability. You
might refer to a short sha1 on an alternate branch that isn't reachable
from you. Or you may even use the short sha1 in an email message or a
bug tracker. So I think the extra context you want is probably a
timestamp: at time t, what was a reasonable guess for this sha1?

That's easy to answer for commits and tags (cull the ones that are too
new), but harder for blobs and trees (you'd want to know the earliest
commit which contains them).

An easier (but less automatic) tool would be to improve our error
message for the ambiguous case, and actually report details of the
candidates. I'm working up a patch now.

-Peff

  reply	other threads:[~2016-09-26  4:34 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-26  1:39 Changing the default for "core.abbrev"? Linus Torvalds
2016-09-26  3:46 ` Junio C Hamano
2016-09-26  4:34   ` Jeff King [this message]
2016-09-26  4:45     ` Junio C Hamano
2016-09-26 11:57       ` [PATCH 0/10] helping people resolve ambiguous sha1s Jeff King
2016-09-26 11:59         ` [PATCH 01/10] get_sha1: detect buggy calls with multiple disambiguators Jeff King
2016-09-26 16:37           ` Junio C Hamano
2016-09-26 17:21             ` Jeff King
2016-09-26 17:50               ` Junio C Hamano
2016-09-26 11:59         ` [PATCH 02/10] get_sha1: avoid repeating ourselves via ONLY_TO_DIE Jeff King
2016-09-26 11:59         ` [PATCH 03/10] get_sha1: propagate flags to child functions Jeff King
2016-09-26 11:59         ` [PATCH 04/10] get_short_sha1: peel tags when looking for treeish Jeff King
2016-09-26 12:11           ` Jeff King
2016-09-26 16:55           ` Junio C Hamano
2016-09-26 17:23             ` Jeff King
2016-09-26 12:00         ` [PATCH 05/10] get_short_sha1: refactor init of disambiguation code Jeff King
2016-09-26 12:00         ` [PATCH 06/10] get_short_sha1: NUL-terminate hex prefix Jeff King
2016-09-26 17:10           ` Junio C Hamano
2016-09-26 17:25             ` Jeff King
2016-09-26 17:36               ` Junio C Hamano
2016-09-26 12:00         ` [PATCH 07/10] get_short_sha1: mark ambiguity error for translation Jeff King
2016-09-26 12:00         ` [PATCH 08/10] sha1_array: let callbacks interrupt iteration Jeff King
2016-09-26 12:00         ` [PATCH 09/10] for_each_abbrev: drop duplicate objects Jeff King
2016-09-26 12:00         ` [PATCH 10/10] get_short_sha1: list ambiguous objects on error Jeff King
2016-09-26 16:36           ` Linus Torvalds
2016-09-27  5:42             ` Jacob Keller
2016-09-27 12:38             ` Jeff King
2016-09-29 13:01             ` Kyle J. McKay
2016-09-29 13:24               ` Jeff King
2016-09-29 14:36                 ` Kyle J. McKay
2016-09-29 14:55                   ` Jeff King
2016-09-26 17:30           ` Junio C Hamano
2016-09-26 17:34             ` Jeff King
2016-09-26 17:39               ` Junio C Hamano
2016-09-29 11:46           ` Kyle J. McKay
2016-09-29 13:03             ` Jeff King
2016-09-29 17:19               ` Junio C Hamano
2016-09-30  5:51                 ` Jacob Keller
2019-02-04 16:12     ` [RFC/PATCH] core.abbrev doc: document and test the abbreviation length Ævar Arnfjörð Bjarmason
2019-02-04 19:13       ` Junio C Hamano
2019-02-04 20:04       ` Junio C Hamano
2019-02-04 21:36         ` Ævar Arnfjörð Bjarmason
2019-02-04 23:32         ` Jeff King
2019-02-04 23:50           ` Ævar Arnfjörð Bjarmason
2019-02-06 18:29             ` Jeff King
2019-02-06 18:36               ` Ævar Arnfjörð Bjarmason
2016-09-26  6:33   ` Changing the default for "core.abbrev"? Matthieu Moy
2016-09-26 12:09     ` Jeff King
2016-09-29 13:01   ` Kyle J. McKay
2016-09-26  7:13 ` Christian Couder
2016-09-28 23:30 ` [PATCH 0/4] raising core.abbrev default to 12 hexdigits Junio C Hamano
2016-09-28 23:30   ` [PATCH 1/4] config: allow customizing /etc/gitconfig location Junio C Hamano
2016-09-29  9:53     ` Jakub Narębski
2016-09-29 17:20       ` Junio C Hamano
2016-09-29 17:45         ` Matthieu Moy
2016-09-28 23:30   ` [PATCH 2/4] t13xx: do not assume system config is empty Junio C Hamano
2016-09-29  9:01     ` Jeff King
2016-09-29 18:13       ` Junio C Hamano
2016-09-29 18:26         ` Jeff King
2016-09-29 18:57           ` Junio C Hamano
2016-09-29 19:18             ` Jeff King
2016-09-29 19:57               ` Junio C Hamano
2016-09-29 19:06           ` Junio C Hamano
2016-09-29 19:26             ` Jeff King
2016-09-29 21:03               ` Junio C Hamano
2016-09-29 21:08                 ` Jeff King
2016-09-28 23:30   ` [PATCH 3/4] worktree: honor configuration variables Junio C Hamano
2016-09-28 23:30   ` [PATCH 4/4] core.abbrev: raise the default abbreviation to 12 hexdigits Junio C Hamano
2016-09-29  2:44     ` SZEDER Gábor
2016-09-29  5:27       ` Lukas Fleischer
2016-09-29  9:22         ` Jeff King
2016-09-29  9:15       ` Jeff King
2016-09-29 10:03         ` Matthieu Moy
2016-09-29 12:52         ` SZEDER Gábor
2016-09-29  5:58     ` Johannes Sixt
2016-09-29 18:05       ` Junio C Hamano
2016-09-29 18:37         ` Linus Torvalds
2016-09-29 18:55           ` Linus Torvalds
2016-09-29 19:06             ` Linus Torvalds
2016-09-29 19:42               ` Junio C Hamano
2016-09-30  0:56               ` Mike Hommey
2016-09-30  1:01                 ` Linus Torvalds
2016-09-30 19:41                   ` Ævar Arnfjörð Bjarmason
2016-09-29 19:16             ` Jeff King
2016-09-29 19:40               ` Linus Torvalds
2016-09-29 19:45                 ` Junio C Hamano
2016-09-29 21:53                   ` Linus Torvalds
2016-09-29 23:13                     ` Junio C Hamano
2016-09-29 23:20                       ` Junio C Hamano
2016-09-30  0:20                       ` Linus Torvalds
2016-09-30  0:28                         ` Linus Torvalds
2016-09-30  0:57                           ` Linus Torvalds
2016-09-30  1:18                             ` Linus Torvalds
2016-09-30  3:54                               ` Junio C Hamano
2016-09-30  4:10                                 ` Junio C Hamano
2016-09-30  4:18                                   ` Linus Torvalds
2016-09-30  4:29                                     ` Linus Torvalds
2016-09-30  4:27                                   ` Junio C Hamano
2016-09-30  4:35                                     ` Junio C Hamano
2016-09-30 18:40                                     ` Junio C Hamano
2016-09-30 18:51                                       ` Linus Torvalds
2016-09-30 19:00                                         ` Junio C Hamano
2016-09-30  4:11                                 ` Linus Torvalds
2016-09-30  8:06                               ` Jeff King
2016-09-30 17:54                                 ` Linus Torvalds
2016-09-30 18:05                                   ` Jeff King
2016-09-30 18:21                                   ` Linus Torvalds
2016-09-30 20:01                                     ` Junio C Hamano
2016-09-30 17:56                                 ` Junio C Hamano
2016-09-30  7:47                       ` Jeff King
2016-09-29  9:25     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160926043442.3pz7ccawdcsn2kzb@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).