git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Stefan Beller" <sbeller@google.com>,
	"Paweł Marczewski" <pwmarcz@gmail.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Strange behavior of git rev-list
Date: Thu, 7 Sep 2017 23:47:40 -0400
Message-ID: <20170908034739.4op3w4f2ma5s65ku@sigill.intra.peff.net>
In-Reply-To: <xmqqwp59zwqv.fsf@gitster.mtv.corp.google.com>

On Fri, Sep 08, 2017 at 12:37:28PM +0900, Junio C Hamano wrote:

> > I'm still moderately against storing generation numbers inside the
> > objects. They're redundant with the existing parent pointers, which
> > means it's an opportunity for the two sets of data to disagree. And as
> > we've seen, once errors are cemented in history it's very hard to fix
> > them, because you break any history built on top.
> >
> > I'm much more in favor of building a local cache of generation numbers
> > (either on the fly or during repacks, where we can piggy-back on the
> > existing pack .idx for indexing).
> 
> I guess our mails crossed.  Yes, objects that are needlessly broken
> only because they botch computation of derivable values are a real
> problem, as we need to accommodate them forever because histories can
> and will be built on top of them.
> 
> On the other hand, seeing that the world did not stop even though some
> projects have trees with entries whose modes are written with 0-padding
> on the left in the ancient part of their histories, it might not be
> such a big deal.  I dunno.

True, but in counterpoint:

  1. Tree problems generally only affect operations on that tree itself.
     Parent (or generation number) problems hit us any time we walk
     across that part of history, which may be much more frequent.

  2. There's an open question of what to do with existing commits
     without generation numbers.

     Per (1), "git tag --contains" is _always_ going to want to know the
     generation number of v1.0. Some problems are "local" to their block
     of history and as the project history marches forward, the bugs are
     there but you are less likely to make queries that hit them. But
     considering old tags for reachability will happen forever (and is
     the _most_ important use of generation numbers, because it lets us
     throw out that old history immediately).

     So if we assume we can't rewrite those objects, then we end up with
     some kind of local cache anyway.

  3. I think we should be moving more in the direction of keeping
     repo-local caches for optimizations. Reachability bitmaps have been
     a big performance win. I think we should be doing the same with
     other properties of commits. Not just generation numbers, but making it
     cheap to access the graph structure without zlib-inflating whole
     commit objects (i.e., packv4 or something like the "metapacks" I
     proposed a few years ago).
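
To make the cache idea concrete, here is a rough Python sketch (not git's
actual code). It assumes a toy in-memory history, a dict mapping each commit
name to its parent names. The names `compute_generations`, `contains`, and the
example commits are all hypothetical. It derives generation numbers purely
from parent pointers, which is why they can live in a local cache instead of
in the objects, and then uses them to cut off a "--contains"-style walk early.

```python
from typing import Dict, List

# Hypothetical toy history: commit name -> list of parent names.
Graph = Dict[str, List[str]]


def compute_generations(parents: Graph) -> Dict[str, int]:
    """Derive generation numbers from parent pointers alone.

    A root commit gets 1; any other commit gets 1 + the maximum
    generation among its parents. The result is a pure cache: it can
    be thrown away and rebuilt from the graph at any time.
    """
    gen: Dict[str, int] = {}
    for start in parents:
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            missing = [p for p in parents[commit] if p not in gen]
            if missing:
                # Parents must be numbered before the child.
                stack.extend(missing)
            else:
                gen[commit] = 1 + max(
                    (gen[p] for p in parents[commit]), default=0
                )
                stack.pop()
    return gen


def contains(parents: Graph, gen: Dict[str, int], tip: str, target: str) -> bool:
    """Return True if `target` is reachable from `tip`.

    Any commit whose generation is <= the target's (and which is not the
    target itself) cannot have the target as an ancestor, so the walk
    stops there instead of digging through older history.
    """
    cutoff = gen[target]
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit == target:
            return True
        if commit in seen or gen[commit] <= cutoff:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return False


# Example history: B plays the role of an old tag such as v1.0.
history: Graph = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["C", "D"],
    "X": ["A"],
    "Y": ["X"],
}
generations = compute_generations(history)
```

With this history, `contains(history, generations, "Y", "B")` gives up as soon
as it reaches X, because X and B share the same generation; it never has to
visit A. That early cutoff is the property that matters when old tags are
considered for reachability.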

-Peff

