git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Clemens Buchacher <drizzd@aon.at>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	git@vger.kernel.org, "Jakub Narebski" <jnareb@gmail.com>,
	"Ted Ts'o" <tytso@mit.edu>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH 2/5] add object-cache infrastructure
Date: Tue, 12 Jul 2011 17:36:07 -0400	[thread overview]
Message-ID: <20110712213607.GA12447@sigill.intra.peff.net> (raw)
In-Reply-To: <20110712210716.GB17322@toss.lan>

On Tue, Jul 12, 2011 at 11:07:16PM +0200, Clemens Buchacher wrote:

> On Tue, Jul 12, 2011 at 03:45:40PM -0400, Jeff King wrote:
> >
> > It has been a long time since I've looked at darcs, but from my
> > recollection, it will only work with specific patch types. That is, it
> > works if B and C are commutative. For text patches that touch the same
> > area, that is not the case. But if "B" were a token-renaming patch, for
> > example, I think it might work.
> 
> If they were commutative, we would not have a problem in git
> either.

Except that git doesn't support many commutative special forms, like
token-renaming patches.

> > Anyway, that is not really relevant to git. I think we decided long ago
> > that being simple and stupid about the content changes (i.e., blob A
> > became blob B) is better in general, even when there are a few corner
> > cases that might have been better off the other way.
> 
> Yes, but that only applies to git merge. When we talk about
> rebasing we are looking at individual patches rather than a single
> global merge. For rebase I think "patch algebra" is very relevant,
> and we have already implemented a simple patch algebra with
> patch-id's.

I can buy that argument, but I think most of the benefit in darcs comes
from annotating your patch as "this is just renaming 'foo' to 'bar'" and
other special patch types. Because the algebraic properties of those
types is more interesting. And what we really don't want in git is
having to put the burden on users of making those annotations (not just
because it's annoying to do, but because when the annotation doesn't
match what's in the blobs, the results would be extremely confusing).

But if you are proposing that we could do run-time detection on those
sorts of patch properties and use the result in making a better rebase,
I don't think that's a bad idea (though I do wonder if the amount of
code will be worth it).

Again, it has been a long time since I've looked at darcs, and I was
never a serious user of it, so everything I say above may be utterly
wrong.

> > I'd be curious to see an example worked out. In my experience, even if
> > something like patch-ids don't match, it's not a big deal for the hunks
> > that do match, because when we get to the actual content merge, we will
> > realize that both sides made the same change to that hunk.  So it's not
> > like you are getting unrelated conflicts; whatever small part of the
> > diff made the patch-id different will be the part where you get the
> > conflict, and the should merge cleanly.
> 
> I am reading that last part as "they should not merge cleanly".

Sorry, it was supposed to be "...and the rest should merge cleanly". But
I think you got my meaning.

> Exactly. The case I am talking about is where the patch-id's are
> different but there are no conflicts. I have worked out an example
> for git and darcs. Below are two scripts to demonstrate. In the
> example, the patch-id is different because upstream changes the
> patch in a way that does not conflict with the original patch. It
> simply adds another change that goes into a different hunk. Git
> fails to merge cleanly because the patch-id's are different. It
> presents the user with an awkward conflict that looks like a
> revert. Darcs, on the other hand, merges cleanly. It recognizes the
> fact that all changes from the original patch are contained
> upstream and do not conflict with the upstream version. The fact
> that more changes are added on top does not bother darcs.

Ah, OK, I see. Let me try to amend what I said earlier to make sure.

In the normal case of applying patch B on top of patch A, it doesn't
matter if we use per-hunk patch-ids or normal patch-ids. Because even if
we decide to actually go through with the merge of B on top of A, any
hunks that _would have_ had their per-hunk patch-ids match will merge
cleanly.

But in the real world, it is about applying patch Z on top of patches
A..Y, where Z has similar hunks to patch N. And then it _does_ make a
difference, because it is about skipping hunks from Z that are already
in N, but will end up applied on top of Y. And what's in Y and what's in
N may be quite different.

Does that sound right?

> Now, one might argue that this is a corner case. But it's actually
> very common. In the example, the patch-id changes because of an
> extra change in a different text area. That is indeed unlikely.
> However, the same problem will occur in a much more common case.
> Let's say we have a patch with 10 hunks. The patch is applied
> upstream, with only one difference in one of the hunks.
> Subsequently, text areas affected by any of the other hunks change
> upstream. When the original patch is rebased on top of that, it
> will conflict with the one hunk that was changed in the upstream
> version of that patch. And that's ok. Git should not decide which
> version is correct. But in addition to that conflict there will
> also be conflicts for all the other hunks, which the upstream patch
> did _not_ modify. And all of those conflicts will look like
> reverts.

Right, that makes sense to me now.

-Peff

  parent reply	other threads:[~2011-07-12 21:36 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-11 16:13 [RFC/PATCH 0/5] generation numbers for faster traversals Jeff King
2011-07-11 16:16 ` [PATCH 1/5] decorate: allow storing values instead of pointers Jeff King
2011-07-11 16:39   ` Jeff King
2011-07-11 19:06   ` Junio C Hamano
2011-07-11 21:20     ` Jeff King
2011-07-11 16:17 ` [PATCH 2/5] add object-cache infrastructure Jeff King
2011-07-11 16:46   ` Jeff King
2011-07-11 16:58     ` Shawn Pearce
2011-07-11 19:17   ` Junio C Hamano
2011-07-11 22:01     ` Jeff King
2011-07-11 23:21       ` Junio C Hamano
2011-07-11 23:42         ` Junio C Hamano
2011-07-12  0:03         ` Jeff King
2011-07-12 19:38           ` Clemens Buchacher
2011-07-12 19:45             ` Jeff King
2011-07-12 21:07               ` Clemens Buchacher
2011-07-12 21:15                 ` Clemens Buchacher
2011-07-12 21:36                 ` Jeff King [this message]
2011-07-14  8:04                   ` Clemens Buchacher
2011-07-14 16:26                     ` Illia Bobyr
2011-07-13  1:33                 ` John Szakmeister
2011-07-12  0:14         ` Illia Bobyr
2011-07-12  5:35           ` Jeff King
2011-07-12 21:52             ` Illia Bobyr
2011-07-12  6:36       ` Miles Bader
2011-07-12 10:41   ` Jakub Narebski
2011-07-12 17:57     ` Jeff King
2011-07-12 18:41       ` Junio C Hamano
2011-07-13  6:37         ` Jeff King
2011-07-13 17:49           ` Junio C Hamano
2011-07-11 16:18 ` [PATCH 3/5] commit: add commit_generation function Jeff King
2011-07-11 17:57   ` Clemens Buchacher
2011-07-11 21:10     ` Jeff King
2011-07-11 16:18 ` [PATCH 4/5] pretty: support %G to show the generation number of a commit Jeff King
2011-07-11 16:18 ` [PATCH 5/5] limit "contains" traversals based on commit generation Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110712213607.GA12447@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=drizzd@aon.at \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).