* rename detection limit checking, cherry picking, and git am -3
@ 2007-09-17 3:32 Mark Levedahl
2007-09-17 3:47 ` Shawn O. Pearce
0 siblings, 1 reply; 5+ messages in thread
From: Mark Levedahl @ 2007-09-17 3:32 UTC (permalink / raw)
To: Git Mailing List
Linus' recent patch to invoke limiting on rename detection broke my
ability to use cherry-picking on one project. This project has about
4300 files on one branch (a), 2500 on a later branch (b), 226 commits in
total between the two branches, and a convoluted history of how branch a
morphed into branch b. About 50 files were renamed in the transition,
and we need to migrate patches from the still maintained branch a onto
the new branch b.
Prior to Linus' recent patch to limit rename detection (0024a549),
cherry picking a patch from a to b, where the patch affected just one
file, often took about 45 seconds on a 3 GHz pentium 4 with the CPU
pegged at 100% for the duration. The cherry picking always succeeded and
correctly followed renames, but was very slow.
Following Linus' patch, the cherry picking fails with a merge conflict
(almost instantly), complaining the file has been deleted on b but
modified on a, i.e., the rename detection does not work. I tried raising
diff.renameLimit to 100000, but that seems to have no effect whatsoever
on cherry-pick (the process aborts with a conflict almost immediately).
Curiously, using "git format-patch x..y --stdout | git am -3" succeeds
in this case, and runs in well under a second. This performance
seems unchanged by the rename detection limit patch.
So, the rename limit patch "broke" git for this usage, though one could
reasonably argue the previous code was so slow as to be broken anyway.
The curious thing to me is the vast superiority of whatever
git-format-patch|git-am -3 does, and I wonder if that isn't a
fundamentally better design for cherry picking than git-cherry-pick
implements (it obviously is for this case).
Mark
* Re: rename detection limit checking, cherry picking, and git am -3
2007-09-17 3:32 rename detection limit checking, cherry picking, and git am -3 Mark Levedahl
@ 2007-09-17 3:47 ` Shawn O. Pearce
2007-09-17 4:27 ` Junio C Hamano
2007-09-18 0:18 ` Mark Levedahl
0 siblings, 2 replies; 5+ messages in thread
From: Shawn O. Pearce @ 2007-09-17 3:47 UTC (permalink / raw)
To: Mark Levedahl; +Cc: Git Mailing List
Mark Levedahl <mlevedahl@gmail.com> wrote:
> The curious thing to me is the vast superiority of whatever
> git-format-patch|git-am -3 does, and I wonder if that isn't a
> fundamentally better design for cherry picking than git-cherry-pick
> implements (it obviously is for this case).
In this case `git am -3` creates a tree object containing only
the files modified by the patch and then feeds that tree into
git-merge-recursive. Now if you go study git-revert's code you'll
see it actually just calls git-merge-recursive on three trees,
but these are three complete trees.
So what's probably happening here is there are fewer candidates on one
side in the `am -3` case, so we spend a lot less time generating
the rename matrix and searching for a match, and we get better chances
of finding a match.
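
To make the effect of the candidate count concrete, here is a toy sketch
in Python. It is not git's implementation: the function name, the
difflib-based similarity score, the 0.5 threshold, and the exact form of
the limit check (give up when the number of pairs exceeds the limit
squared) are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def detect_renames(deleted, added, rename_limit=100, min_score=0.5):
    """Toy rename detection; `deleted` and `added` map path -> content.

    Returns {old_path: new_path}. Returns {} without comparing anything
    when the candidate matrix has more than rename_limit ** 2 cells
    (an assumed form of the limit check, for illustration only).
    """
    if len(deleted) * len(added) > rename_limit ** 2:
        return {}  # too many pairs: skip detection entirely

    renames = {}
    taken = set()  # destinations already claimed by an earlier source
    for old_path, old_text in deleted.items():
        best_score, best_path = min_score, None
        for new_path, new_text in added.items():
            if new_path in taken:
                continue
            # One cell of the "rename matrix": score this pair.
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= best_score:
                best_score, best_path = score, new_path
        if best_path is not None:
            renames[old_path] = best_path
            taken.add(best_path)
    return renames
```

The work in the inner loop grows with len(deleted) * len(added), so
shrinking one side to the handful of paths a patch touches both cuts
the cost and keeps the matrix under whatever limit applies.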
I actually don't see why cherry-pick can't be defined in terms
of `format-patch|am -3`. It probably would be faster in almost
all cases.
--
Shawn.
* Re: rename detection limit checking, cherry picking, and git am -3
2007-09-17 3:47 ` Shawn O. Pearce
@ 2007-09-17 4:27 ` Junio C Hamano
2007-09-17 9:58 ` Karl Hasselström
2007-09-18 0:18 ` Mark Levedahl
1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2007-09-17 4:27 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Mark Levedahl, Git Mailing List
"Shawn O. Pearce" <spearce@spearce.org> writes:
> I actually don't see why cherry-pick can't be defined in terms
> of `format-patch|am -3`. It probably would be faster in almost
> all cases.
Heh, people often suggested that rebase should get --merge as
default, and I resisted that.
I think it would make sense to do the consolidated backend for
rebase, revert, cherry-pick and am (I have been tentatively
calling this "git replay") primarily based on the "patch with
fallback to 3-way" like format-patch piped to "am -3", with an
option to do merge-recursive.
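
The control flow described above could be sketched roughly as follows
in Python. Every name here is hypothetical: `replay` and the three
callables are stand-ins supplied by the caller, each returning True on
success, and nothing below reflects an existing git interface.

```python
def replay(change, apply_patch, three_way_fallback, full_merge,
           use_merge=False):
    """Replay one change and report which strategy handled it.

    apply_patch, three_way_fallback and full_merge are hypothetical
    callables taking the change and returning True on success.
    """
    if use_merge:
        # Optional mode: go straight to a full-tree merge.
        return "merge" if full_merge(change) else "conflict"
    if apply_patch(change):
        return "patch"  # the plain patch applied cleanly
    if three_way_fallback(change):
        return "3-way"  # patch failed, the 3-way fallback succeeded
    return "conflict"
```

With stand-ins such as `lambda change: False` for `apply_patch`, the
call falls through to the 3-way step, which is the default path
suggested for all four commands.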
* Re: rename detection limit checking, cherry picking, and git am -3
2007-09-17 4:27 ` Junio C Hamano
@ 2007-09-17 9:58 ` Karl Hasselström
0 siblings, 0 replies; 5+ messages in thread
From: Karl Hasselström @ 2007-09-17 9:58 UTC (permalink / raw)
To: Junio C Hamano
Cc: Shawn O. Pearce, Mark Levedahl, Git Mailing List, Catalin Marinas,
David Kågedal
On 2007-09-16 21:27:46 -0700, Junio C Hamano wrote:
> I think it would make sense to do the consolidated backend for
> rebase, revert, cherry-pick and am (I have been tentatively calling
> this "git replay") primarily based on the "patch with fallback to
> 3-way" like format-patch piped to "am -3", with an option to do
> merge-recursive.
I guess such a backend would be useful for StGit as well.
--
Karl Hasselström, kha@treskal.com
www.treskal.com/kalle
* Re: rename detection limit checking, cherry picking, and git am -3
2007-09-17 3:47 ` Shawn O. Pearce
2007-09-17 4:27 ` Junio C Hamano
@ 2007-09-18 0:18 ` Mark Levedahl
1 sibling, 0 replies; 5+ messages in thread
From: Mark Levedahl @ 2007-09-18 0:18 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Git Mailing List
Shawn O. Pearce wrote:
> In this case `git am -3` creates a tree object containing only
> the files modified by the patch and then feeds that tree into
> git-merge-recursive. Now if you go study git-revert's code you'll
> see it actually just calls git-merge-recursive on three trees,
> but these are three complete trees.
>
> So what's probably happening here is there are fewer candidates on one
> side in the `am -3` case, so we spend a lot less time generating
> the rename matrix and searching for a match, and we get better chances
> of finding a match.
>
>
Thanks for the explanation. For my case, there are < 500 files
(including renamed files) in common between the two branches, leaving
roughly 2000 files on one side and 4000 on the other with no
counterpart, i.e. ~2000*4000 candidate pairs for which git can try to
find renames. Clearly, reducing the one side from 4000 files to 1 file
has an enormous payoff.
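
As a rough check of that arithmetic, a few lines of Python, using the
rounded figures above and assuming the cost is simply the number of
(source, destination) pairs to compare (the function name is made up
for this example):

```python
def candidate_pairs(unmatched_sources, unmatched_destinations):
    """Number of pairs a full rename matrix would have to score."""
    return unmatched_sources * unmatched_destinations


full_trees = candidate_pairs(4000, 2000)  # both complete trees
patch_only = candidate_pairs(1, 2000)     # one side reduced to 1 file

print(full_trees)                 # 8000000
print(patch_only)                 # 2000
print(full_trees // patch_only)   # 4000
```

So under that assumption the single-file case has about 4000 times
fewer pairs to examine.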
Mark