* Opinions on changing add/add conflict resolution? @ 2018-03-12 18:32 Elijah Newren 2018-03-12 18:47 ` Jonathan Nieder 2018-03-12 22:19 ` Ævar Arnfjörð Bjarmason 0 siblings, 2 replies; 22+ messages in thread From: Elijah Newren @ 2018-03-12 18:32 UTC (permalink / raw) To: Git Mailing List Hi everyone, I'd like to change add/add conflict resolution. Currently when such a conflict occurs (say at ${path}), git unconditionally does a two-way merge of the two files and sticks the result in the working tree at ${path}. I would like to make it instead first check whether the two files are similar; if they are, then do the two-way merge, but if they're not, then instead write the two files out to separate paths (${path}~HEAD and ${path}~$MERGE, while making sure that ${path} is removed from the working copy). Thoughts? I have a patch series[1] with more details and other changes, but wanted to especially get feedback on this issue even from folks that didn't have enough time to read the patches or even the cover letter. Thanks, Elijah [1] https://public-inbox.org/git/20180305171125.22331-1-newren@gmail.com/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-12 18:32 Opinions on changing add/add conflict resolution? Elijah Newren @ 2018-03-12 18:47 ` Jonathan Nieder 2018-03-12 21:26 ` Elijah Newren 2018-03-12 22:19 ` Ævar Arnfjörð Bjarmason 1 sibling, 1 reply; 22+ messages in thread From: Jonathan Nieder @ 2018-03-12 18:47 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List Hi, Elijah Newren wrote: > Hi everyone, > > I'd like to change add/add conflict resolution. Currently when such a > conflict occurs (say at ${path}), git unconditionally does a two-way > merge of the two files and sticks the result in the working tree at > ${path}. > > I would like to make it instead first check whether the two files are > similar; if they are, then do the two-way merge, but if they're not, > then instead write the two files out to separate paths (${path}~HEAD > and ${path}~$MERGE, while making sure that ${path} is removed from the > working copy). > > Thoughts? My immediate reaction is that it seems inconsistent with the rest of merge behavior. Why would add/add behave this way but edit/edit not behave this way? Would this behavior be configurable or unconditional? I suspect I would want it turned off in my own use. On the other hand, in the case of wild difference between the two files, skipping the two-way merge and just writing one of the versions to the worktree (like we do for binary files) sounds like something I would like in my own use. Can you add something more about the motivation to the commit message? E.g. is this about performance, interaction with some tools, to support some particular workflow, etc? Thanks and hope that helps, Jonathan > I have a patch series[1] with more details and other changes, but > wanted to especially get feedback on this issue even from folks that > didn't have enough time to read the patches or even the cover letter. > > Thanks, > Elijah > > [1] https://public-inbox.org/git/20180305171125.22331-1-newren@gmail.com/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-12 18:47 ` Jonathan Nieder @ 2018-03-12 21:26 ` Elijah Newren 2018-03-12 21:35 ` Jonathan Nieder 2018-03-13 5:30 ` Junio C Hamano 0 siblings, 2 replies; 22+ messages in thread From: Elijah Newren @ 2018-03-12 21:26 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Git Mailing List Hi, Cool, thanks for taking a look! On Mon, Mar 12, 2018 at 11:47 AM, Jonathan Nieder <jrnieder@gmail.com> wrote: > > My immediate reaction is that it seems inconsistent with the rest of > merge behavior. Why would add/add behave this way but edit/edit not > behave this way? Fair enough. I have two separate reasons for believing that edit/edit IS different than add/add; further, the current behavior for add/add is inconsistent with the rest of merge behavior on a different axis. I think it's helpful to get the motivation for my changes before trying to explain why those are different and before delving into the other inconsistency, so I'll add the explanation of this claim at the end of this email. > Would this behavior be configurable or unconditional? I suspect I > would want it turned off in my own use. > > On the other hand, in the case of wild difference between the two > files, skipping the two-way merge and just writing one of the versions > to the worktree (like we do for binary files) sounds like something I > would like in my own use. I think you just said the exact opposite thing in these last two paragraphs; that you wouldn't want my proposed behavior and that you'd want it. I suspect that may mean that I misunderstood something you said here. Could you clarify? > Can you add something more about the motivation to the commit message? > E.g. is this about performance, interaction with some tools, to > support some particular workflow, etc? To be honest, I'm a little unsure how without even more excessive and repetitive wording across commits. Let me attempt here, and maybe you can suggest how to change my commit messages? * When files are wildly dissimilar -- as you mentioned -- it'd be easier for users to resolve conflicts if we wrote files out to separate paths instead of two-way merging them. * There is a weird inconsistency between handling of add/add, rename/add, and rename/rename(2to1). I want this inconsistency fixed. * There is a significant performance optimization possible for rename detection in git merge if we remove the above inconsistency and treat these conflict types the same. (Actually, we only need them to be the same in the special case where a renamed file is unmodified on the un-renamed side of history, but I don't want to special case that situation because it sounds like a recipe for inconsistent results). * If we insist on these conflict types being handled differently because there really is some important distinction between them, then I'm almost certain that build_fake_ancestor() and it's usage in am/rebase is broken/wrong. This is because the usage of build_fake_ancestor(), as currently implemented, will cause mis-detection of one conflict type as another. > Thanks and hope that helps, > Jonathan As promised above, here are my reasons for believing that edit/edit IS fundamentally different than add/add for the behavior considered here, as well as my explanation of the weird inconsistency add/add currently has with the rest of merge behavior: ==Reason 1== edit/edit means that ${path} existed in the merge base as well as both sides. It is more likely than not that ${path} on each side is similar to the merge base and thus similar (though less so) to each other. On the other hand, add/add means ${path} did NOT exist in the merge base. Thus, we cannot use the same reason to believe they are similar. The only reason we have to assume similarity is the filename, which, while a useful indicator, gives us a weird inconsistency within git for handling pathname collisions -- see "Weird inconsistency" below. ==Reason 2== In merges, git does rename detection, but not copy or break detection. In particular, break detection is what would allow us to find out that the edited files are not similar to the original. add/add conflicts automatically come to us "pre-broken" (is there a better term for knowing that two files are not paired?), though it is possible that they just happen to be similar and should be paired/joined. ==Follow on to reason 2== Not doing break detection means we get e.g. rename/add-source cases wrong, so there are valid arguments for saying we should add break detection. However, it would be computationally expensive and would require a fair amount of work re-thinking all the cases in merge-recursive to make sure we get rename-breaks right (there are FIXMEs and comments and testcases I added years ago to help someone along the path if they want to try). Since rename/add-source just isn't a conflict type that occurs much (if it all) in real world repositories, the motivation to implement it is somewhat low. ==Weird inconsistency git currently exhibits== There are three types of conflicts representing two (possibly unrelated) files colliding at the same path: add/add, rename/add, and rename/rename(2to1). add/add does the two-way merge of the colliding files, and the other two conflict types write the two colliding files out to separate paths. I think the motivation behind the current behavior was that add/add was written assuming filename-match implies similarity, and the other two were written assuming filename-match didn't imply anything and the files should be assumed dissimilar. That seems weird to me. I think all three should behave the same (especially when the renamed file was unmodified on the un-renamed side of history, and likely whenever there is no content conflicts for the renamed file). ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-12 21:26 ` Elijah Newren @ 2018-03-12 21:35 ` Jonathan Nieder 2018-03-12 23:08 ` Hilco Wijbenga 2018-03-13 0:38 ` Elijah Newren 2018-03-13 5:30 ` Junio C Hamano 1 sibling, 2 replies; 22+ messages in thread From: Jonathan Nieder @ 2018-03-12 21:35 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List Hi again, Elijah Newren wrote: > On Mon, Mar 12, 2018 at 11:47 AM, Jonathan Nieder <jrnieder@gmail.com> wrote: >> Would this behavior be configurable or unconditional? I suspect I >> would want it turned off in my own use. >> >> On the other hand, in the case of wild difference between the two >> files, skipping the two-way merge and just writing one of the versions >> to the worktree (like we do for binary files) sounds like something I >> would like in my own use. > > I think you just said the exact opposite thing in these last two > paragraphs; that you wouldn't want my proposed behavior and that you'd > want it. I suspect that may mean that I misunderstood something you > said here. Could you clarify? Sorry for the lack of clarity. My understanding was that the proposed behavior was to write two files: ${path}~HEAD ${path}~MERGE My proposal is instead to write one file: ${path} with the content that would have gone to ${path}~HEAD. This is what already happens when trying to merge binary files. [...] >> Can you add something more about the motivation to the commit message? >> E.g. is this about performance, interaction with some tools, to >> support some particular workflow, etc? > > To be honest, I'm a little unsure how without even more excessive and > repetitive wording across commits. Simplest way IMHO is to just put the rationale in patch 5/5. :) In other words, explain the rationale for the end-user facing change in the same patch that changes the end-user facing behavior. > Let me attempt here, and maybe you > can suggest how to change my commit messages? > > * When files are wildly dissimilar -- as you mentioned -- it'd be > easier for users to resolve conflicts if we wrote files out to > separate paths instead of two-way merging them. Today what we do (in both the wildly-dissimilar case and the less-dissimilar case) is write one proposed resolution to the worktree and put the competing versions in the index. Tools like "git mergetool" are then able to pull the competing versions out of the index to allow showing them at the same time. My bias is that I've used VCSes before that wrote multiple competing files to the worktree and I have been happier with my experience resolving conflicts in git. E.g. at any step I can run a build to try out the current proposed resolution, and there's less of a chance of accidentally commiting a ~HEAD file. [...] > There are three types of conflicts representing two (possibly > unrelated) files colliding at the same path: add/add, rename/add, and > rename/rename(2to1). add/add does the two-way merge of the colliding > files, and the other two conflict types write the two colliding files > out to separate paths. Interesting. I would be tempted to resolve this inconsistency the other way: by doing a half-hearted two-way merge (e.g. by picking one of the two versions of the colliding file) and marking the path as conflicted in the index. That way it's more similar to edit/edit, too. Thanks, Jonathan ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-12 21:35 ` Jonathan Nieder @ 2018-03-12 23:08 ` Hilco Wijbenga 2018-03-12 23:14 ` Jonathan Nieder 2018-03-13 0:38 ` Elijah Newren 1 sibling, 1 reply; 22+ messages in thread From: Hilco Wijbenga @ 2018-03-12 23:08 UTC (permalink / raw) To: Git Mailing List Cc: Elijah Newren, Jonathan Nieder, Ævar Arnfjörð Bjarmason On Mon, Mar 12, 2018 at 2:35 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Interesting. I would be tempted to resolve this inconsistency the > other way: by doing a half-hearted two-way merge (e.g. by picking one > of the two versions of the colliding file) and marking the path as > conflicted in the index. That way it's more similar to edit/edit, > too. If work is going to be done in this area, would it be possible to include making auto-merging (in general) optional? Preferably, configurable by file (or glob) but I'd already be happy with a global setting to opt out. I keep running into bugs caused by Git's automatic merging. (And I don't see how this could be improved without teaching Git the specifics of various file types.) It's especially hard when rebasing large numbers of commits. The bug is introduced early on but I will not notice anything is wrong until late in the rebase. (Since I seem to keep asking for things that turn out to already have been implemented ... if this is already possible please just point me to the right setting and consider me a happy camper. :-) ) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-12 23:08 ` Hilco Wijbenga @ 2018-03-12 23:14 ` Jonathan Nieder 0 siblings, 0 replies; 22+ messages in thread From: Jonathan Nieder @ 2018-03-12 23:14 UTC (permalink / raw) To: Hilco Wijbenga Cc: Git Mailing List, Elijah Newren, Ævar Arnfjörð Bjarmason Hi, Hilco Wijbenga wrote: > On Mon, Mar 12, 2018 at 2:35 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: >> Interesting. I would be tempted to resolve this inconsistency the >> other way: by doing a half-hearted two-way merge (e.g. by picking one >> of the two versions of the colliding file) and marking the path as >> conflicted in the index. That way it's more similar to edit/edit, >> too. > > If work is going to be done in this area, would it be possible to > include making auto-merging (in general) optional? Preferably, > configurable by file (or glob) but I'd already be happy with a global > setting to opt out. Have you experimented with the 'merge' attribute (see "git help attributes")? E.g. you can put * -merge in .gitattributes or .git/info/attributes. If that helps, then a patch adding a pointer to the most helpful place (maybe git-merge.txt?) would be very welcome. Thanks and hope that helps, Jonathan ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-12 21:35 ` Jonathan Nieder 2018-03-12 23:08 ` Hilco Wijbenga @ 2018-03-13 0:38 ` Elijah Newren 2018-03-13 17:22 ` Elijah Newren 1 sibling, 1 reply; 22+ messages in thread From: Elijah Newren @ 2018-03-13 0:38 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Git Mailing List I'm worried that my attempt to extract add/add from the rest of the discussion with rename/add and rename/rename resulted in a false sense of simplification. Trying to handle all the edge and corner cases and remain consistent sometimes gets a little hairy. Anyway... On Mon, Mar 12, 2018 at 2:35 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Elijah Newren wrote: >> On Mon, Mar 12, 2018 at 11:47 AM, Jonathan Nieder <jrnieder@gmail.com> wrote: > > Sorry for the lack of clarity. My understanding was that the proposed > behavior was to write two files: > > ${path}~HEAD > ${path}~MERGE (Just to be clear: The proposed behavior was to do that only when the colliding files were dissimilar, and otherwise two-way merging.) > My proposal is instead to write one file: > > ${path} > > with the content that would have gone to ${path}~HEAD. This is what > already happens when trying to merge binary files. Thanks for clarifying. I feel the analogy to merging binary files breaks down here in more than one way: 1) Merging binary files is more complex than you suggest here. In particular, when creating a virtual merge base, the resolution chosen is not the version of the file from HEAD, but the version of the file from the merge base. Nasty problems would result otherwise for the recursive case. If we tried to match how merging binary files behaved, we run into the problem that the colliding file conflict case has no common version of the file from a merge base. So the same strategy just doesn't apply. The closest would be outright deleting both versions of the file for the virtual merge base and punting to the outer merge to deal with it. That might be okay, but seems really odd to me. I feel like it'd unnecessarily increase the number of conflicts users will see, though maybe it wouldn't be horrible if this was only done when the files were considered dissimilar anyway. 2) merging binary files only has 3 versions of the file to store at a single $path in the index, which fit naturally in stages 1-3; rename/add and rename/rename(2to1) have 4 and 6, respectively. Having three versions of a file from a 3-way merge makes fairly intuitive sense. Have 4 or 6 versions of a file at a path from a 3-way merge is inherently more complex. I think that just using the version from HEAD risks users resolving things incorrectly much more likely than in the case of a binary merge. > [...] >>> Can you add something more about the motivation to the commit message? >>> E.g. is this about performance, interaction with some tools, to >>> support some particular workflow, etc? >> >> To be honest, I'm a little unsure how without even more excessive and >> repetitive wording across commits. > > Simplest way IMHO is to just put the rationale in patch 5/5. :) In > other words, explain the rationale for the end-user facing change in the > same patch that changes the end-user facing behavior. Patches 3, 4, and 5 all change end-user facing behavior -- one patch per conflict type to make the three types behave the same (the first two patches add more testcases, and write a common function all three types can use). The rationale for the sets of changes is largely the same, and is somewhat lengthy. Should I repeat the rationale in full in all three places? >> Let me attempt here, and maybe you >> can suggest how to change my commit messages? >> >> * When files are wildly dissimilar -- as you mentioned -- it'd be >> easier for users to resolve conflicts if we wrote files out to >> separate paths instead of two-way merging them. > > Today what we do (in both the wildly-dissimilar case and the > less-dissimilar case) is write one proposed resolution to the worktree > and put the competing versions in the index. Tools like "git > mergetool" are then able to pull the competing versions out of the > index to allow showing them at the same time. > > My bias is that I've used VCSes before that wrote multiple competing > files to the worktree and I have been happier with my experience > resolving conflicts in git. E.g. at any step I can run a build to try > out the current proposed resolution, and there's less of a chance of > accidentally commiting a ~HEAD file. > > [...] >> There are three types of conflicts representing two (possibly >> unrelated) files colliding at the same path: add/add, rename/add, and >> rename/rename(2to1). add/add does the two-way merge of the colliding >> files, and the other two conflict types write the two colliding files >> out to separate paths. > > Interesting. I would be tempted to resolve this inconsistency the > other way: by doing a half-hearted two-way merge (e.g. by picking one > of the two versions of the colliding file) and marking the path as > conflicted in the index. That way it's more similar to edit/edit, > too. Your proposal is underspecified; there are more complicated cases where your wording could have different meanings. I also had a backup plan was to propose making rename/add and rename/rename(2to1) behave more like add/add (instead of making the behavior for all three a blend of their current behaviors), but the problem is I have two backup proposals to choose between. And it sounds like you're adding a third, which follows your ~HEAD strategy from above. Let me see if I can explain why your proposal is underspecified: For rename/rename(2to1) we have up to 3 sets of conflicts. Let's say we renamed A->C in HEAD and B->C on the other side. We first need to do a three-way content merge on the first C (with the other two versions of A), then we need to do a three-way content merge on the second C (with the other two versions of B). This gives us two different C's, both with possible conflict markers, that want to be stored at the same pathname. Let me call these two versions of C, C1 and C2. C1 and C2 may or may not be similar. Resolving the fact that both want to be stored at C is what gives us potentially a third set of conflicts. What's your resolution strategy here, for each combination of cases (has conflict markers or not; is similar to the other C or not; working on a virtual merge base or not)? For reference there are two possible backup strategies I could propose here: Backup Proposal #1: act stricly like add/add: Do a two-way merge of C1 and C2. If we end up with nested conflict markers, oh well. Good luck, user. Backup Proposal #2: act like add/add, but only if C1 and C2 don't have conflict markers. If C1 or C2 have conflict markers, behave as currently -- record C1 and C2 to different paths. If C1 and C2 do not have conflict markers, two-way merge them and record in the worktree at C. My question for your counter-proposal: Do you record C1 or C2 as C? Or do you record the version of C from HEAD as C? And what do you pick when creating a virtual merge base? Problems I see regardless of the choice in your counter-proposal: * If you choose C from HEAD, you are ignoring 3 other versions of "C" (as opposed to just one other version when merging a binary file); will the user recognize that and know how to look at all four versions? * If you choose C1 or C2, the user will see conflict markers; will the user recognize that this is only a _subset_ of the conflicts, and that there's a lot of content that was completely ignored? * There is no "merge base of C1 and C2" to use for the virtual merge base. What will you use? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 0:38 ` Elijah Newren @ 2018-03-13 17:22 ` Elijah Newren 0 siblings, 0 replies; 22+ messages in thread From: Elijah Newren @ 2018-03-13 17:22 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Git Mailing List On Mon, Mar 12, 2018 at 5:38 PM, Elijah Newren <newren@gmail.com> wrote: > I feel the analogy to merging binary files breaks down here in more > than one way: Actually, after mulling this over, I'm going to retract the "more than" piece of this sentence. I'm trying to figure out how to retract more, but have only figured out one piece. In particular... > 1) > > Merging binary files is more complex than you suggest here. In > particular, when creating a virtual merge base, the resolution chosen > is not the version of the file from HEAD, but the version of the file > from the merge base. Nasty problems would result otherwise for the > recursive case. > > If we tried to match how merging binary files behaved, we run into the > problem that the colliding file conflict case has no common version of > the file from a merge base. So the same strategy just doesn't apply. > The closest would be outright deleting both versions of the file for > the virtual merge base and punting to the outer merge to deal with it. > That might be okay, but seems really odd to me. I feel like it'd > unnecessarily increase the number of conflicts users will see, though > maybe it wouldn't be horrible if this was only done when the files > were considered dissimilar anyway. Thinking about this more, it really isn't that weird at all. The code is already set up to use null_oid as the "original" version when creating a virtual merge base, going back to the content from a recursive merge base is a strategy used in multiple places in recursive cases, and null is precisely the version from the recursive merge base. If two added files differ wildly, I don't think using null would increase the number of conflicts appreciably, if at all. So, the analogy to merging binary files holds just fine from this angle. So, if we could figure out how to handle the higher numbers of paths for e.g. the rename/rename cases, then perhaps this is a strategy that could work. >> Interesting. I would be tempted to resolve this inconsistency the >> other way: by doing a half-hearted two-way merge (e.g. by picking one >> of the two versions of the colliding file) and marking the path as >> conflicted in the index. That way it's more similar to edit/edit, >> too. > > Your proposal is underspecified; there are more complicated cases > where your wording could have different meanings. > <snip> > My question for your counter-proposal: > Do you record C1 or C2 as C? Or do you record the version of C from > HEAD as C? And what do you pick when creating a virtual merge base? > > Problems I see regardless of the choice in your counter-proposal: > * If you choose C from HEAD, you are ignoring 3 other versions of > "C" (as opposed to just one other version when merging a binary file); > will the user recognize that and know how to look at all four > versions? > * If you choose C1 or C2, the user will see conflict markers; will > the user recognize that this is only a _subset_ of the conflicts, and > that there's a lot of content that was completely ignored? > * There is no "merge base of C1 and C2" to use for the virtual merge > base. What will you use? For this last question, the answer is "null", and it works just fine. I don't yet have good answers to the other questions, though. If someone else does, I'd love to hear it. Oh, and by the way Jonathan, thanks very much for your feedback and alternative ideas for consideration. You gave me more angles to try to think about this problem from. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-12 21:26 ` Elijah Newren 2018-03-12 21:35 ` Jonathan Nieder @ 2018-03-13 5:30 ` Junio C Hamano 2018-03-13 18:21 ` Elijah Newren 1 sibling, 1 reply; 22+ messages in thread From: Junio C Hamano @ 2018-03-13 5:30 UTC (permalink / raw) To: Elijah Newren; +Cc: Jonathan Nieder, Git Mailing List While I do not think it is a bad idea to add an optional way to write the contents of conflicted stages out to separate files, I do not think it is a good idea to change what happens to add/add conflict by default. Two integrators picking up a same patch that adds a file separately and allowing them to diverge before they are merged should not be all that surprising, just like two integrators picking up a same patch that changes an existing path and letting them evolve independently, so from that point of view, I do not think there is fundamental difference between edit/edit vs add/add and rename/rename conflict. The latter certainly would be much rarer, but that is because edit happens a lot more often than add. There certainly are cases where conflicts is easier to resolve when the merged tip versions are unrelated or diverged too vastly, with or without a common merge base. As I already agreed to, it would be useful in such a case to have an option to write the conflicted stages to separate files to let an external merge tool to examine them and help you resolve them. But that is not limited to resolving vast differece between contents involved in an add/add conflict, is it? The same tool (which can be driven as a mergetool backend) would be useful in reconciling the difference between contents involved in an edit/edit conflict, no? If anything, if rename/rename behaves differently by always writing the result out to separate files, it is that codepath whose behaviour should be fixed to match how add/add conflicts are dealt with. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 5:30 ` Junio C Hamano @ 2018-03-13 18:21 ` Elijah Newren 2018-03-13 22:26 ` Junio C Hamano 0 siblings, 1 reply; 22+ messages in thread From: Elijah Newren @ 2018-03-13 18:21 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jonathan Nieder, Git Mailing List On Mon, Mar 12, 2018 at 10:30 PM, Junio C Hamano <gitster@pobox.com> wrote: > While I do not think it is a bad idea to add an optional way to write the > contents of conflicted stages out to separate files, I do not think it is a > good idea to change what happens to add/add conflict by default. <snip some other good arguments> > If anything, if rename/rename behaves differently by always writing > the result out to separate files, it is that codepath whose behaviour > should be fixed to match how add/add conflicts are dealt with. Cool, thanks for the context. I'm happy to go down this path, but there is one question I'd like your opinion on: what if the intermediate content merges have conflicts themselves? If that question isn't clear, let me be more precise... Let's say A is renamed to C in HEAD, and both sides modified. We thus need to do a 3-way content merge of C from HEAD with the two A's. Let's call the result C1. Further, let's say that B is renamed to C in $MERGE_BRANCH, and both sides modified. Again we need a 3-way content merge for that other C; let's call the result C2. If neither C1 nor C2 have content conflicts, then making rename/rename(2to1) act like add/add is easy: just do a two-way merge of C1 and C2. But what if C1 or C2 have conflicts? Doing a two-way merge in that case could result in nested conflict markers. Is that okay, or would we want to continue the old behavior of writing C1 and C2 out to separate files for that special case only? I'm inclined to special case content conflicts in the intermediate merges and handle them by writing out to separate files, but I was the one who suggested sometimes using separate files for add/add conflicts; I'm curious if perhaps my strong aversion to nested conflicts is also off base. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 18:21 ` Elijah Newren @ 2018-03-13 22:26 ` Junio C Hamano 2018-03-13 22:42 ` Elijah Newren 0 siblings, 1 reply; 22+ messages in thread From: Junio C Hamano @ 2018-03-13 22:26 UTC (permalink / raw) To: Elijah Newren; +Cc: Jonathan Nieder, Git Mailing List Elijah Newren <newren@gmail.com> writes: > Cool, thanks for the context. I'm happy to go down this path, but > there is one question I'd like your opinion on: what if the > intermediate content merges have conflicts themselves? If that > question isn't clear, let me be more precise... I think you answered this yourself after (re)discovering the virtual ancestor merge in the recursive strategy, and no longer need my input here ;-) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 22:26 ` Junio C Hamano @ 2018-03-13 22:42 ` Elijah Newren 2018-03-13 22:52 ` Junio C Hamano 2018-03-13 22:56 ` Jonathan Nieder 0 siblings, 2 replies; 22+ messages in thread From: Elijah Newren @ 2018-03-13 22:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jonathan Nieder, Git Mailing List On Tue, Mar 13, 2018 at 3:26 PM, Junio C Hamano <gitster@pobox.com> wrote: > Elijah Newren <newren@gmail.com> writes: > >> Cool, thanks for the context. I'm happy to go down this path, but >> there is one question I'd like your opinion on: what if the >> intermediate content merges have conflicts themselves? If that >> question isn't clear, let me be more precise... > > I think you answered this yourself after (re)discovering the virtual > ancestor merge in the recursive strategy, and no longer need my > input here ;-) The question about what to put into the index was another issue, and it's good to hear that you seem to approve of my logic on that one. Thanks. :-) However, my question here about what to write to the working tree for a rename/rename(2to1) conflict in one particular corner case still remains. Should a two-way merge be performed even if it may result in nested sets of conflict markers, or is that a sufficiently bad outcome for the user that it's the one case we do want to write colliding files out to different temporary paths? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 22:42 ` Elijah Newren @ 2018-03-13 22:52 ` Junio C Hamano 2018-03-13 23:04 ` Elijah Newren 2018-03-13 22:56 ` Jonathan Nieder 1 sibling, 1 reply; 22+ messages in thread From: Junio C Hamano @ 2018-03-13 22:52 UTC (permalink / raw) To: Elijah Newren; +Cc: Jonathan Nieder, Git Mailing List Elijah Newren <newren@gmail.com> writes: > However, my question here about what to write to the working tree for > a rename/rename(2to1) conflict in one particular corner case still > remains. Hmph, is it a bad idea to model this after what recursive merge strategy does? I think what is written out from that codepath to the working tree has the nested conflict markers (with a bit of tweak to the marker length, IIRC) in it. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 22:52 ` Junio C Hamano @ 2018-03-13 23:04 ` Elijah Newren 0 siblings, 0 replies; 22+ messages in thread From: Elijah Newren @ 2018-03-13 23:04 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jonathan Nieder, Git Mailing List On Tue, Mar 13, 2018 at 3:52 PM, Junio C Hamano <gitster@pobox.com> wrote: > Elijah Newren <newren@gmail.com> writes: > >> However, my question here about what to write to the working tree for >> a rename/rename(2to1) conflict in one particular corner case still >> remains. > > Hmph, is it a bad idea to model this after what recursive merge > strategy does? I think what is written out from that codepath to > the working tree has the nested conflict markers (with a bit of > tweak to the marker length, IIRC) in it. Oh, that's cool; I didn't know that. It looks like that was introduced in commit d694a17986 ("ll-merge: use a longer conflict marker for internal merge", 2016-04-14). That seems like a good idea; I'll go with that. Thanks. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 22:42 ` Elijah Newren 2018-03-13 22:52 ` Junio C Hamano @ 2018-03-13 22:56 ` Jonathan Nieder 2018-03-13 23:14 ` Elijah Newren 1 sibling, 1 reply; 22+ messages in thread From: Jonathan Nieder @ 2018-03-13 22:56 UTC (permalink / raw) To: Elijah Newren; +Cc: Junio C Hamano, Git Mailing List Hi, Elijah Newren wrote: > However, my question here about what to write to the working tree for > a rename/rename(2to1) conflict in one particular corner case still > remains. Should a two-way merge be performed even if it may result in > nested sets of conflict markers, or is that a sufficiently bad outcome > for the user that it's the one case we do want to write colliding > files out to different temporary paths? Nested conflict markers only happen in the conflictstyle=diff3 case, I would think. merge-recursive writes them already. I've often wished that it would use a union merge strategy when building the common ancestor to avoid the nested conflicts that rerere doesn't understand. But anyway, that's an orthogonal issue: in the rename/rename context, it should be fine to write nested conflict markers since that's consistent with what merge-recursive already does. Thanks, Jonathan ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 22:56 ` Jonathan Nieder @ 2018-03-13 23:14 ` Elijah Newren 2018-03-13 23:30 ` Junio C Hamano 0 siblings, 1 reply; 22+ messages in thread From: Elijah Newren @ 2018-03-13 23:14 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Junio C Hamano, Git Mailing List On Tue, Mar 13, 2018 at 3:56 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Elijah Newren wrote: > >> However, my question here about what to write to the working tree for >> a rename/rename(2to1) conflict in one particular corner case still >> remains. Should a two-way merge be performed even if it may result in >> nested sets of conflict markers, or is that a sufficiently bad outcome >> for the user that it's the one case we do want to write colliding >> files out to different temporary paths? > > Nested conflict markers only happen in the conflictstyle=diff3 case, I > would think. Currently, yes. To be clear, though, this change would make it possible even when there is no recursive merge being done and when conflictstyle=merge. > merge-recursive writes them already. I've often wished that it would > use a union merge strategy when building the common ancestor to avoid > the nested conflicts that rerere doesn't understand. But anyway, > that's an orthogonal issue: in the rename/rename context, it should be > fine to write nested conflict markers since that's consistent with > what merge-recursive already does. Cool, sounds like we're now all on the same page. Someone in the future might hate us if they use conflictstyle=diff3, and have a recursive merge, and have a rename/rename(2to1) conflict in the virtual merge base with nested conflicts, and that resulting file is also involved in a separate rename/rename(2to1) conflict in the outer merge that has its own nested conflicts; we'd have up to four levels of nested conflicts. But for now, I'm going to write that off as a crazy thought (or dismiss it as nigh impossible to reason about regardless of what we did to help them) and just proceed ahead. :-) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 23:14 ` Elijah Newren @ 2018-03-13 23:30 ` Junio C Hamano 0 siblings, 0 replies; 22+ messages in thread From: Junio C Hamano @ 2018-03-13 23:30 UTC (permalink / raw) To: Elijah Newren; +Cc: Jonathan Nieder, Git Mailing List On Tue, Mar 13, 2018 at 4:14 PM, Elijah Newren <newren@gmail.com> wrote: > > Someone in the future might hate us if they use conflictstyle=diff3, > and have a recursive merge, FWIW I already always use diff3 style and see the nested markers, and I already hate it, so you are no worse off ;-) ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-12 18:32 Opinions on changing add/add conflict resolution? Elijah Newren 2018-03-12 18:47 ` Jonathan Nieder @ 2018-03-12 22:19 ` Ævar Arnfjörð Bjarmason [not found] ` <CABPp-BHDOimDoLxWxS=BDOBkm6CUTrXTzD16=TSkWGN-HOiU2g@mail.gmail.com> 1 sibling, 1 reply; 22+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2018-03-12 22:19 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List On Mon, Mar 12 2018, Elijah Newren jotted: > Hi everyone, > > I'd like to change add/add conflict resolution. Currently when such a > conflict occurs (say at ${path}), git unconditionally does a two-way > merge of the two files and sticks the result in the working tree at > ${path}. > > I would like to make it instead first check whether the two files are > similar; if they are, then do the two-way merge, but if they're not, > then instead write the two files out to separate paths (${path}~HEAD > and ${path}~$MERGE, while making sure that ${path} is removed from the > working copy). > > Thoughts? > > I have a patch series[1] with more details and other changes, but > wanted to especially get feedback on this issue even from folks that > didn't have enough time to read the patches or even the cover letter. Does this mean that e.g. in this case of merging two files, one containing "foo" and one containing "bar": ( rm -rf /tmp/test.git && git init /tmp/test.git && cd /tmp/test.git && echo foo >README && git add README && git commit -mfoo && git checkout --orphan trunk && git reset --hard && echo bar >README && git add README && git commit -mbar && git merge --allow-unrelated-histories master; cat README ) That instead of getting: <<<<<<< HEAD bar ======= foo >>>>>>> master I'd now get these split into different files? I'm assuming by similarity you're talking about the same heuristic we apply for git diff -M, i.e. if "moving" a file would consider it removed/added instead of moved you'd want two files instead of the two-way merge. I don't mind this being a configurable option if you want it, but I don't think it should be on by default, reasons: 1) There's lots of cases where we totally screw up the "is this similar?" check, in particular with small files. E.g. let's say you have a config file like 'fs-path "/tmp/git"' and in two branches you change that to 'fs-path "/opt/git"' and 'fs-path "/var/git"'. The rename detection will think this these have nothing to do with each other since they share no common lines, but to a human reader they're really similar, and would make sense in the context of resolving a bigger merge where /{opt,var}/git changes are conflicting. This is not some theoretical concern, there's lots of things that e.g. use small 5-10 line config files to configure some app that because of some combo of indentation changes and changing a couple of lines will make git's rename detection totally give up, but to a human reader they're 95% the same. 2) This will play havoc with already established merge tools on top of git which a lot of users use instead of manually resolving these in vi or whatever. If we made this the default they'd need to to deal with this new state, and even if it's not the default we'll have some confused users wondering why Emacs Ediff or whatever isn't showing the right thing because it isn't supporting this yet. So actually, given that last point in #2 I'm slightly negative on the whole thing, but maybe splitting it into some new format existing tools don't understand is compelling enough to justify the downstream breakage. I don't think we've ever documented the format we leave the tree in after a failed merge as equivalent to plumbing, but for the purposes of tools that build on top of git it really is. ^ permalink raw reply [flat|nested] 22+ messages in thread
[parent not found: <CABPp-BHDOimDoLxWxS=BDOBkm6CUTrXTzD16=TSkWGN-HOiU2g@mail.gmail.com>]
* Fwd: Opinions on changing add/add conflict resolution? [not found] ` <CABPp-BHDOimDoLxWxS=BDOBkm6CUTrXTzD16=TSkWGN-HOiU2g@mail.gmail.com> @ 2018-03-13 2:53 ` Elijah Newren 2018-03-13 22:12 ` Junio C Hamano 2018-03-13 9:59 ` Ævar Arnfjörð Bjarmason 1 sibling, 1 reply; 22+ messages in thread From: Elijah Newren @ 2018-03-13 2:53 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List [Re-sending because this computer happened to have plain-text mode turned off for some weird reason, and thus the email bounced] Hi, On Mon, Mar 12, 2018 at 3:19 PM, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > Does this mean that e.g. in this case of merging two files, one > containing "foo" and one containing "bar": > > ( > rm -rf /tmp/test.git && > git init /tmp/test.git && > cd /tmp/test.git && > echo foo >README && > git add README && > git commit -mfoo && > git checkout --orphan trunk && > git reset --hard && > echo bar >README && > git add README && > git commit -mbar && > git merge --allow-unrelated-histories master; > cat README > ) > > That instead of getting: > > <<<<<<< HEAD > bar > ======= > foo > >>>>>>> master > > I'd now get these split into different files? As currently implemented, yes. However, I was more concerned the idea of handling files differently based on whether or not they were similar, rather than on what the precise definition of "similar" is for this context. As far as the definition of similarity goes, estimate_similarity() is currently used by rename detection to compare files recorded at different pathnames. By contrast, in this context, we are comparing two files which were recorded with the same pathname. That suggests the heuristic could be a bit different and use more than just estimate_similarity(). (e.g. "We consider these files similar IF more than 50% of the lines match OR both files are less than 2K.") > I don't mind this being a configurable option if you want it, but I > don't think it should be on by default, reasons: I don't think a configurable option makes sense, at least not for my purposes. Having rename/rename conflicts be "safely" mis-detected as rename/add or add/add, and having rename/add conflicts be "safely" mis-detected as add/add is my overriding concern. Thus, making these three conflict types behave consistently is what I need. Options would make that more difficult for me, and would thus feel like a step backwards. git am/rebase has been doing such mis-detections for years (almost since the "dawn" of git time), but it feels really broken to me because the conflict types aren't handled consistently. (The facts that (a) I'm the only one that has added rename/add testcases to git.git, (b) that I've added all but one of the rename/rename(2to1) testcases to the git.git testsuite, and (c) that rename/add has had multiple bugs for many years, all combine to suggest to me that folks just don't hit those conflict types in practice and thus that they just aren't noticing this breakage -- yet.) I also want to allow such mis-detections for cherry-picks and merges because of the significant (nearly order-of-magnitude in some cases) performance improvements I can get in rename detection if it's allowed. > 1) There's lots of cases where we totally screw up the "is this > similar?" check, in particular with small files. > > E.g. let's say you have a config file like 'fs-path "/tmp/git"' and > in two branches you change that to 'fs-path "/opt/git"' and 'fs-path > "/var/git"'. The rename detection will think this these have nothing > to do with each other since they share no common lines, but to a > human reader they're really similar, and would make sense in the > context of resolving a bigger merge where /{opt,var}/git changes are > conflicting. > > This is not some theoretical concern, there's lots of things that > e.g. use small 5-10 line config files to configure some app that > because of some combo of indentation changes and changing a couple > of lines will make git's rename detection totally give up, but to a > human reader they're 95% the same. Fair enough. The small files case could potentially be handled by just changing the similarity metric for these conflict types, as noted above. If it's a small file, that might be the easiest way for a user to deal with it too. I'm not sure I see the problem with the bigger files, though. If you have bigger files with less than 50% of the lines matching, then you'll essentially end up with a great big conflict block with one file on one side and the other file on the other side, which doesn't seem that different to me than having them be in two separate files. In fact, separate files seems easier to deal with because then the user can run e.g. 'git diff --no-index --color-words FILE1 FILE2', something that they can't do when it's in one file. That has bothered me more than once, and made me wish they were just in separate files. > 2) This will play havoc with already established merge tools on top of > git which a lot of users use instead of manually resolving these in > vi or whatever. > > If we made this the default they'd need to to deal with this new > state, and even if it's not the default we'll have some confused > users wondering why Emacs Ediff or whatever isn't showing the right > thing because it isn't supporting this yet. > > So actually, given that last point in #2 I'm slightly negative on the > whole thing, but maybe splitting it into some new format existing tools > don't understand is compelling enough to justify the downstream breakage. To me, this is a bigger concern. We have changed conflict resolutions in various ways at various times over the years, so I don't think the output should be considered fixed in stone, but I am very sympathetic to arguments that this particular change is too painful. There is a possible way to allow for a transition period, though. Junio has already requested that some of the changes I'm working on be done as a new merge strategy that is a rewrite of merge-recursive (https://public-inbox.org/git/xmqqk1ydkbx0.fsf@gitster.mtv.corp.google.com/). It would be a little unfortunate to not be able to use the new merge strategy as an exact drop-in replacement of the current recursive merge strategy, but it would provide a way for people to get some time to migrate other tools. However, I'm far more concerned with the three collision conflict types having consistent behavior than I am with changing add/add conflict handling. And if your two concerns or Jonathan's concern would prevent you from wanting to use the new merge strategy (and possibly prevent it from becoming the default in the future), I'd much rather just modify rename/add and rename/rename to behave like add/add. Would that be more to your liking? Elijah ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Fwd: Opinions on changing add/add conflict resolution? 2018-03-13 2:53 ` Fwd: " Elijah Newren @ 2018-03-13 22:12 ` Junio C Hamano 0 siblings, 0 replies; 22+ messages in thread From: Junio C Hamano @ 2018-03-13 22:12 UTC (permalink / raw) To: Elijah Newren; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List Elijah Newren <newren@gmail.com> writes: > As currently implemented, yes. However, I was more concerned the idea > of handling files differently based on whether or not they were > similar, rather than on what the precise definition of "similar" is > for this context. > > As far as the definition of similarity goes, estimate_similarity() is > currently used by rename detection to compare files recorded at > different pathnames. By contrast, in this context, we are comparing > two files which were recorded with the same pathname. That suggests > the heuristic could be a bit different and use more than just > estimate_similarity(). (e.g. "We consider these files similar IF more > than 50% of the lines match OR both files are less than 2K.") Yeah, I think there is similar difference between similarity score that is used by diffcore-rename and dissimilarity score that is used by diffcore-break exactly for that reason. If you start from a 100-line original file and grow it to 100-line one by adding 900 lines, as long as you kept the original 100 lines, it is easier to view the change within the same path as a continued development, instead of saying they are so dissimilar. In any case, I think the way the stage #2 and stage #3 (i.e. ours and theirs) contents are externalized during a conflicted mergy operation should be consistent across edit/edit and add/add conflict, so if we are adding a new way to write out extra temporary files out of these higher stage index entries, it should be made applicable not only to add/add conflict (i.e. there shounld't be a code that says "oh, let's see, this lacks stage #1, so do this different thing"). Personally, I think it is best to leave it all outside of the core and make "git mergetool" to be responsible for the job of externalizing higher stage index entries to temporary working tree files. They already need to do so in order to work with external tools that do not read directly from our index file anyway, no? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? [not found] ` <CABPp-BHDOimDoLxWxS=BDOBkm6CUTrXTzD16=TSkWGN-HOiU2g@mail.gmail.com> 2018-03-13 2:53 ` Fwd: " Elijah Newren @ 2018-03-13 9:59 ` Ævar Arnfjörð Bjarmason 2018-03-13 17:09 ` Elijah Newren 1 sibling, 1 reply; 22+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2018-03-13 9:59 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List On Tue, Mar 13 2018, Elijah Newren jotted: > Hi, > > On Mon, Mar 12, 2018 at 3:19 PM, Ævar Arnfjörð Bjarmason <avarab@gmail.com> > wrote: > >> >> Does this mean that e.g. in this case of merging two files, one >> containing "foo" and one containing "bar": >> >> ( >> rm -rf /tmp/test.git && >> git init /tmp/test.git && >> cd /tmp/test.git && >> echo foo >README && >> git add README && >> git commit -mfoo && >> git checkout --orphan trunk && >> git reset --hard && >> echo bar >README && >> git add README && >> git commit -mbar && >> git merge --allow-unrelated-histories master; >> cat README >> ) >> >> That instead of getting: >> >> <<<<<<< HEAD >> bar >> ======= >> foo >> >>>>>>> master >> >> I'd now get these split into different files? >> > > As currently implemented, yes. However, I was more concerned the idea of > handling files differently based on whether or not they were similar, > rather than on what the precise definition of "similar" is for this context. > > As far as the definition of similarity goes, estimate_similarity() is > currently used by rename detection to compare files recorded at different > pathnames. By contrast, in this context, we are comparing two files which > were recorded with the same pathname. That suggests the heuristic could be > a bit different and use more than just estimate_similarity(). (e.g. "We > consider these files similar IF >50% of the lines match OR both files are > less than 2K.") Yeah that's fair enough, we could definitely tweak a dedicated heuristic for just this purpose to do the right thing most of the time, although really having completely unrelated 10-line README files use the old two-way merge would, I guess, also be confusing to users who want this feature. >> I don't mind this being a configurable option if you want it, but I >> don't think it should be on by default, reasons: >> > > I don't think a configurable option makes sense, at least not for my > purposes. Having rename/rename conflicts be "safely" mis-detected as > rename/add or add/add, and having rename/add conflicts be "safely" > mis-detected as add/add is my overriding concern. Thus, making these three > conflict types behave consistently is what I need. Options would make that > more difficult for me, and would thus feel like a step backwards. I think there's three types of users who interact with the current format: 1) The user using no special tool (e.g. editing with nano) and runs into a conflict, maybe this helps these users the most. 2) The user who's also using no special tool (e.g. editing with nano) but just prefers to get in-file conflict markers. 3) The user using a tool to solve the conflict, who doesn't care if we'd spew out our current format, this new formar, or some XML file with both versions or whatevery, they're using some other UI. I don't think we should make some change that breaks existing tools for users of #3 by default. > 1) There's lots of cases where we totally screw up the "is this >> similar?" check, in particular with small files. >> >> E.g. let's say you have a config file like 'fs-path "/tmp/git"' and >> in two branches you change that to 'fs-path "/opt/git"' and 'fs-path >> "/var/git"'. The rename detection will think this these have nothing >> to do with each other since they share no common lines, but to a >> human reader they're really similar, and would make sense in the >> context of resolving a bigger merge where /{opt,var}/git changes are >> conflicting. >> >> This is not some theoretical concern, there's lots of things that >> e.g. use small 5-10 line config files to configure some app that >> because of some combo of indentation changes and changing a couple >> of lines will make git's rename detection totally give up, but to a >> human reader they're 95% the same. >> > > Fair enough. The small files case could potentially be handled by just > changing the similarity metric for these conflict types, as noted above. > If it's a small file, that might be the easiest way for a user to deal with > it too. Or it might be easier for the user to deal with even big conflicts with in-file conflict markers, see use-case #2 & #3 above. > I'm not sure I see the problem with the bigger files, though. If you have > bigger files with less than 50% of the lines matching, then you'll > essentially end up with a great big conflict block with one file on one > side and the other file on the other side, which doesn't seem that > different to me than having them be in two separate files. In fact, > separate files seems easier to deal with because then the user can run e.g. > 'git diff --no-index --color-words FILE1 FILE2', something that they can't > do when it's in one file. That has bothered me more than once, and made me > wish they were just in separate files. Right, but this assumes they user isn't using a merge-tool as noted in #3. >> 2) This will play havoc with already established merge tools on top of >> git which a lot of users use instead of manually resolving these in >> vi or whatever. >> >> If we made this the default they'd need to to deal with this new >> state, and even if it's not the default we'll have some confused >> users wondering why Emacs Ediff or whatever isn't showing the right >> thing because it isn't supporting this yet. >> >> So actually, given that last point in #2 I'm slightly negative on the >> whole thing, but maybe splitting it into some new format existing tools >> don't understand is compelling enough to justify the downstream breakage. >> > > To me, this is a bigger concern. We have changed conflict resolutions in > various ways at various times over the years, so I don't think the output > should be considered fixed in stone, but I am very sympathetic to arguments > that this particular change is too painful. I think we've changed around a lot of the details, and certainly how the conflict is resolved. By "format" I mean that we've been emitting a file with conflict markers, which third-party tools know how to parse (e.g. ediff). That's a relatively stable format that we (with minor changes) also share with cvs, svn, or plain patch(1) etc. > [...] > However, I'm far more concerned with the three collision conflict types > having consistent behavior than I am with changing add/add conflict > handling. And if your two concerns or Jonathan's concern would prevent you > from wanting to use the new merge strategy (and possibly prevent it from > becoming the default in the future), I'd much rather just modify rename/add > and rename/rename to behave like add/add. Would that be more to your > liking? I don't really object to these changes, I don't know enough about this area, I skimmed your patches and 90% of it is over my head (although I could keep digging...). I'm just chiming in because it seems to me from upthread that you're purely focusing on the use-case of the user of git who's using git at the abstraction of using a dumb editor that doesn't do anything to help them with conflict markers to resolve conflicts. I think that in those cases if you have two unrelated 3k line files that collide splitting them out is certainly more helpful. But there's also the use-case of existing tools (e.g. ediff) essentially using this as a plumbing format, and we have to keep those tools in mind, because to them it really doesn't matter what the format is, but it *does* matter that they can understand it, but if we took these patches as-is e.g. ediff wouldn't even show these conflicting files, correct? Also another caveat: Since these are new side-files what happens when you're in a conflict and run `git clean -dxf`, are these new files wiped away, or are they somehow known to the index? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Opinions on changing add/add conflict resolution? 2018-03-13 9:59 ` Ævar Arnfjörð Bjarmason @ 2018-03-13 17:09 ` Elijah Newren 0 siblings, 0 replies; 22+ messages in thread From: Elijah Newren @ 2018-03-13 17:09 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List On Tue, Mar 13, 2018 at 2:59 AM, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > On Tue, Mar 13 2018, Elijah Newren jotted: >> However, I'm far more concerned with the three collision conflict types >> having consistent behavior than I am with changing add/add conflict >> handling. And if your two concerns or Jonathan's concern would prevent you >> from wanting to use the new merge strategy (and possibly prevent it from >> becoming the default in the future), I'd much rather just modify rename/add >> and rename/rename to behave like add/add. Would that be more to your >> liking? > > I don't really object to these changes, I don't know enough about this > area, I skimmed your patches and 90% of it is over my head (although I > could keep digging...). > > I'm just chiming in because it seems to me from upthread that you're > purely focusing on the use-case of the user of git who's using git at > the abstraction of using a dumb editor that doesn't do anything to help > them with conflict markers to resolve conflicts. Yeah, that's totally why I started this separate thread; I was worried the original would push away folks who weren't familiar enough with merge-recursive or were just overwhelmed by all the different changes and rationale, but I really wanted to get feedback on at least the piece that people were likely to hit in practice and would understand. Thanks for doing so; your, and Jonathan's, and Junio's comments have provided some good context. > Also another caveat: Since these are new side-files what happens when > you're in a conflict and run `git clean -dxf`, are these new files wiped > away, or are they somehow known to the index? A git clean would wipe them out...and if that scares you (and I can totally see how it might), then rest assured that the situation is a whole lot worse: this isn't limited to rename/add and rename/rename(2to1) conflicts. There are several paths in merge-recursive.c that call unique_path() to get a temporary filename that is in no way tracked in the index but which is used for storing content relevant to the merge. These include directory/file conflicts, untracked files being in the way of putting a renamed file where it belongs, untracked files being in the way of a modify/delete, untracked files being in the way of simple adds on the other side of history, rename/rename(1to2)/add/add, and maybe others. In all cases, a git clean is going to wipe out the files that were written to different temporary locations. My rewrite I'm trying to find time to work on would get rid of the code paths involving untracked files being in the way of other stuff, but would do nothing for the other cases. I would love to get rid of all of them, but don't have a clue how to do so. ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2018-03-13 23:30 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-03-12 18:32 Opinions on changing add/add conflict resolution? Elijah Newren 2018-03-12 18:47 ` Jonathan Nieder 2018-03-12 21:26 ` Elijah Newren 2018-03-12 21:35 ` Jonathan Nieder 2018-03-12 23:08 ` Hilco Wijbenga 2018-03-12 23:14 ` Jonathan Nieder 2018-03-13 0:38 ` Elijah Newren 2018-03-13 17:22 ` Elijah Newren 2018-03-13 5:30 ` Junio C Hamano 2018-03-13 18:21 ` Elijah Newren 2018-03-13 22:26 ` Junio C Hamano 2018-03-13 22:42 ` Elijah Newren 2018-03-13 22:52 ` Junio C Hamano 2018-03-13 23:04 ` Elijah Newren 2018-03-13 22:56 ` Jonathan Nieder 2018-03-13 23:14 ` Elijah Newren 2018-03-13 23:30 ` Junio C Hamano 2018-03-12 22:19 ` Ævar Arnfjörð Bjarmason [not found] ` <CABPp-BHDOimDoLxWxS=BDOBkm6CUTrXTzD16=TSkWGN-HOiU2g@mail.gmail.com> 2018-03-13 2:53 ` Fwd: " Elijah Newren 2018-03-13 22:12 ` Junio C Hamano 2018-03-13 9:59 ` Ævar Arnfjörð Bjarmason 2018-03-13 17:09 ` Elijah Newren
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).