From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Stefan Beller <sbeller@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Marc Strapetz <marc.strapetz@syntevo.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: topological index field for commit objects
Date: Wed, 29 Jun 2016 16:56:48 -0400
Message-ID: <20160629205647.GA25987@sigill.intra.peff.net>
In-Reply-To: <xmqqk2h73f2i.fsf@gitster.mtv.corp.google.com>

On Wed, Jun 29, 2016 at 01:39:17PM -0700, Junio C Hamano wrote:

> > Would it make sense to refuse creating commits that have a commit date
> > prior to their parents' commit dates (except when the user gives a
> > `--dammit-I-know-I-break-a-widely-used-heuristic`)?
> 
> I think that has also been discussed in the past.  I do not think it
> would help very much in practice, as projects already have up to 10
> years (and the ones migrated from CVS, even more) worth of commits
> they cannot rewrite that may record incorrect committer dates.

Yep, it has been discussed and I agree it runs into a lot of corner
cases.

> If the use of generation number can somehow be limited narrowly, we
> may be able to incrementally introduce it only for new commits, but
> I haven't thought things through, so let me do so aloud here ;-)

I think the problem is that you really _do_ want generation numbers for
old commits. One of the most obvious cases is something like "tag
--contains HEAD", because it has to examine older tags.

So your history looks something like:

  A -- B -- ... Z
        \        \
         v1.0     HEAD

Without generation numbers (or some proxy), you have to walk the history
between B..Z to find the answer. With generation numbers, it is
immediately obvious.

So this is the ideal case for generation numbers (the worst cases are
when the things you are looking for are in branchy, close history where
the generation numbers don't tell you much; but in such cases the
walking is usually not too bad).
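
To make the ideal case concrete, here is a minimal sketch in Python. It is
an illustration only, not code from any patch: the function name
`contains`, the `parents` map, and the `gen` map are all hypothetical, and
it assumes a generation number is 1 for a root commit and one more than the
largest parent's number otherwise.

```python
def contains(tag, target, parents, gen):
    """Return True if `target` is reachable from `tag` via parent links."""
    stack, seen = [tag], {tag}
    while stack:
        commit = stack.pop()
        if commit == target:
            return True
        # Every ancestor has a strictly smaller generation number, so a
        # commit at or below the target's generation cannot reach it.
        if gen[commit] <= gen[target]:
            continue
        for parent in parents[commit]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False


# Hypothetical linear history A -- B -- C -- Z, with v1.0 at B and HEAD at Z.
parents = {"A": [], "B": ["A"], "C": ["B"], "Z": ["C"]}
gen = {"A": 1, "B": 2, "C": 3, "Z": 4}
v1_0, head = "B", "Z"

print(contains(v1_0, head, parents, gen))
```

In this toy history the check for v1.0 stops at the very first commit it
looks at, because v1.0's number is already below HEAD's; no parent is ever
visited.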

So I think you really do want to be able to generate and store
generation numbers after the fact. That has an added bonus that you do
not have to worry about baking incorrect values into your objects; you
do the topological walk once, and you _know_ it is correct (at least as
correct as the parent links, but that is our source of truth).
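
As a rough sketch of what computing the numbers after the fact could look
like, the Python below derives them purely from parent links. The function
name `generation_numbers` and the "root is 1" convention are assumptions
made for illustration, and the history is assumed to be acyclic.

```python
def generation_numbers(parents):
    """Map each commit to 1 + the largest generation among its parents."""
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            # Parents must be numbered before the commit itself.
            pending = [p for p in parents[commit] if p not in gen]
            if pending:
                stack.extend(pending)
            else:
                gen[commit] = 1 + max(
                    (gen[p] for p in parents[commit]), default=0
                )
                stack.pop()
    return gen


# Hypothetical history with a merge: M has parents D and C.
history = {"M": ["D", "C"], "D": ["B"], "C": ["A"], "B": ["A"], "A": []}
print(generation_numbers(history))
```

The walk uses an explicit stack rather than recursion so that a long chain
of commits does not hit Python's recursion limit.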

I have patches that generate and store the numbers at pack time, similar
to the way we do the reachability bitmaps. They're not production ready,
but they could probably be made so without too much effort. You wouldn't
have ready-made generation numbers for commits since the last full
repack, but you can compute them incrementally based on what you do have
at a cost linear in the number of unpacked commits (the same is true for
bitmaps).
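
The incremental part can be sketched the same way. This is not how the
patches work internally, just an illustration under assumptions: `stored`
stands in for whatever numbers were saved at the last repack, and
`fill_missing` is a made-up name. The walk stops as soon as it reaches a
commit with a stored number, so only the unpacked commits are visited.

```python
def fill_missing(tips, parents, stored):
    """Compute generation numbers for commits that `stored` lacks.

    Returns only the newly computed entries; `stored` is left untouched.
    """
    fresh = {}

    def known(commit):
        # Prefer the stored value, then anything computed in this run.
        return stored.get(commit, fresh.get(commit))

    stack = list(tips)
    while stack:
        commit = stack[-1]
        if known(commit) is not None:
            stack.pop()
            continue
        pending = [p for p in parents[commit] if known(p) is None]
        if pending:
            stack.extend(pending)
        else:
            fresh[commit] = 1 + max(
                (known(p) for p in parents[commit]), default=0
            )
            stack.pop()
    return fresh


# Hypothetical history A -- B -- C -- D -- E; A..C were packed, D and E are new.
chain = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"]}
packed = {"A": 1, "B": 2, "C": 3}
print(fill_missing(["E"], chain, packed))
```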

-Peff
