git@vger.kernel.org mailing list mirror (one of many)
* rename detection limit checking, cherry picking, and git am -3
@ 2007-09-17  3:32 Mark Levedahl
  2007-09-17  3:47 ` Shawn O. Pearce
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Levedahl @ 2007-09-17  3:32 UTC
  To: Git Mailing List

Linus' recent patch to invoke limiting on rename detection broke my 
ability to use cherry-picking on one project. This project has about 
4300 files on one branch (a), 2500 on a later branch (b), 226 commits in 
total between the two branches, and a convoluted history of how branch a 
morphed into branch b. About 50 files were renamed in the transition, 
and we need to migrate patches from the still maintained branch a onto 
the new branch b.

Prior to Linus' recent patch to limit rename detection (0024a549), 
cherry picking a patch from a to b, where the patch affected just one 
file, often took about 45 seconds on a 3 GHz pentium 4 with the CPU 
pegged at 100% for the duration. The cherry picking always succeeded and 
correctly followed renames, but was very slow.

Following Linus' patch, the cherry picking fails with a merge conflict 
(almost instantly), complaining the file has been deleted on b but 
modified on a, i.e., the rename detection does not work. I tried raising 
diff.renameLimit to 100000, but that seems to have no effect whatsoever on 
cherry-pick (the process still aborts with a conflict almost immediately).

Curiously, using "git format-patch x..y --stdout | git am -3" succeeds 
in this case, and runs in well under a second. This performance 
seems unchanged by the rename detection limit patch.

So, the rename limit patch "broke" git for this usage, though one could 
reasonably argue the previous code was so slow as to be broken anyway.

The curious thing to me is the vast superiority of whatever 
git-format-patch|git-am -3 does, and I wonder if that isn't a 
fundamentally better design for cherry picking than git-cherry-pick 
implements (it obviously is for this case).

Mark

* Re: rename detection limit checking, cherry picking, and git am -3
  2007-09-17  3:32 rename detection limit checking, cherry picking, and git am -3 Mark Levedahl
@ 2007-09-17  3:47 ` Shawn O. Pearce
  2007-09-17  4:27   ` Junio C Hamano
  2007-09-18  0:18   ` Mark Levedahl
  0 siblings, 2 replies; 5+ messages in thread
From: Shawn O. Pearce @ 2007-09-17  3:47 UTC
  To: Mark Levedahl; +Cc: Git Mailing List

Mark Levedahl <mlevedahl@gmail.com> wrote:
> The curious thing to me is the vast superiority of whatever 
> git-format-patch|git-am -3 does, and I wonder if that isn't a 
> fundamentally better design for cherry picking than git-cherry-pick 
> implements (it obviously is for this case).

In this case `git am -3` creates a tree object containing only
the files modified by the patch and then feeds that tree into
git-merge-recursive.  Now if you go study git-revert's code you'll
see it actually just calls git-merge-recursive on three trees,
but these are three complete trees.

So what's probably happening here is that there are fewer candidates
on one side in the `am -3` case, so we spend a lot less time
generating the rename matrix and searching for a match, and we get
better chances of finding a match.
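
To make the cost argument concrete, here is a minimal sketch of a
pairwise rename matrix. It is not git's actual algorithm (git has its
own similarity estimate in its diffcore code); the function names, the
0.5 threshold, and the use of Python's difflib are all assumptions made
for illustration. The point is only that every deleted path is scored
against every added path, so shrinking one side shrinks the work
proportionally.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two file contents."""
    return SequenceMatcher(None, a, b).ratio()


def detect_renames(deleted, added, threshold=0.5):
    """Pair each added path with its most similar deleted path.

    `deleted` and `added` map path -> content.  Every (deleted, added)
    pair is scored, so the work grows with len(deleted) * len(added).
    The threshold is a hypothetical cut-off, not git's default.
    """
    renames = {}
    comparisons = 0
    for new_path, new_content in added.items():
        best_path, best_score = None, threshold
        for old_path, old_content in deleted.items():
            comparisons += 1
            score = similarity(old_content, new_content)
            if score >= best_score:
                best_path, best_score = old_path, score
        if best_path is not None:
            renames[new_path] = best_path
    return renames, comparisons
```

Calling `detect_renames` with a single entry in `added` scores each
deleted path once, which is roughly the situation described for the
`am -3` case.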

I actually don't see why cherry-pick can't be defined in terms
of `format-patch|am -3`.  It probably would be faster in almost
all cases.

-- 
Shawn.

* Re: rename detection limit checking, cherry picking, and git am -3
  2007-09-17  3:47 ` Shawn O. Pearce
@ 2007-09-17  4:27   ` Junio C Hamano
  2007-09-17  9:58     ` Karl Hasselström
  2007-09-18  0:18   ` Mark Levedahl
  1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2007-09-17  4:27 UTC
  To: Shawn O. Pearce; +Cc: Mark Levedahl, Git Mailing List

"Shawn O. Pearce" <spearce@spearce.org> writes:

> I actually don't see why cherry-pick can't be defined in terms
> of `format-patch|am -3`.  It probably would be faster in almost
> all cases.

Heh, people often suggested that rebase should get --merge as
default, and I resisted that.

I think it would make sense to do the consolidated backend for
rebase, revert, cherry-pick and am (I have been tentatively
calling this "git replay") primarily based on the "patch with
fallback to 3-way" like format-patch piped to "am -3", with an
option to do merge-recursive.

* Re: rename detection limit checking, cherry picking, and git am -3
  2007-09-17  4:27   ` Junio C Hamano
@ 2007-09-17  9:58     ` Karl Hasselström
  0 siblings, 0 replies; 5+ messages in thread
From: Karl Hasselström @ 2007-09-17  9:58 UTC
  To: Junio C Hamano
  Cc: Shawn O. Pearce, Mark Levedahl, Git Mailing List, Catalin Marinas,
	David Kågedal

On 2007-09-16 21:27:46 -0700, Junio C Hamano wrote:

> I think it would make sense to do the consolidated backend for
> rebase, revert, cherry-pick and am (I have been tentatively calling
> this "git replay") primarily based on the "patch with fallback to
> 3-way" like format-patch piped to "am -3", with an option to do
> merge-recursive.

I guess such a backend would be useful for StGit as well.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

* Re: rename detection limit checking, cherry picking, and git am -3
  2007-09-17  3:47 ` Shawn O. Pearce
  2007-09-17  4:27   ` Junio C Hamano
@ 2007-09-18  0:18   ` Mark Levedahl
  1 sibling, 0 replies; 5+ messages in thread
From: Mark Levedahl @ 2007-09-18  0:18 UTC
  To: Shawn O. Pearce; +Cc: Git Mailing List

Shawn O. Pearce wrote:
> In this case `git am -3` creates a tree object containing only
> the files modified by the patch and then feeds that tree into
> git-merge-recursive.  Now if you go study git-revert's code you'll
> see it actually just calls git-merge-recursive on three trees,
> but these are three complete trees.
>
> So what's probably happening here is that there are fewer candidates
> on one side in the `am -3` case, so we spend a lot less time
> generating the rename matrix and searching for a match, and we get
> better chances of finding a match.
>
>   
Thanks for the explanation. For my case, there are < 500 files 
(including renamed files) in common between the two branches, leaving 
roughly 2000 unmatched files on one side and 4000 on the other, i.e. 
~2000*4000 candidate pairs for which git can try to find renames. 
Clearly, reducing the one side from 4000 files to 1 file has an 
enormous payoff.
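
As a rough check of that arithmetic (the helper name is made up for
this sketch, and the counts are the approximate figures quoted above,
not measurements):

```python
def candidate_pairs(num_sources: int, num_destinations: int) -> int:
    """Number of (source, destination) pairs a full rename matrix scores."""
    return num_sources * num_destinations


# Approximate unmatched-file counts from this message.
full_trees = candidate_pairs(4000, 2000)   # both sides are complete trees
single_file = candidate_pairs(1, 2000)     # one side reduced to one file

print(full_trees)                 # 8000000
print(single_file)                # 2000
print(full_trees // single_file)  # 4000
```

So under these assumed counts the single-file case has about 4000 times
fewer pairs to score.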

Mark
