git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/9] Add a new --remerge-diff capability to show & log
@ 2021-12-21 18:05 Elijah Newren via GitGitGadget
  2021-12-21 18:05 ` [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects Elijah Newren via GitGitGadget
                   ` (11 more replies)
  0 siblings, 12 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren

Here are some patches to add a --remerge-diff capability to show & log,
which works by comparing merge commits to an automatic remerge (note that
the automatic remerge tree can contain files with conflict markers).

Changes since original submission[1]:

 * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
   (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
   an old commit and does not even build because of that).
 * Modify ll-merge API to return a status, instead of printing "Cannot merge
   binary files" on stdout[2] (as suggested by Peff)
 * Make conflict messages and other such warnings into diff headers of the
   subsequent remerge-diff rather than appearing in the diff as file content
   of some funny looking filenames (as suggested by Peff[3] and Junio[4])
 * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
   limited to one patch so not sure where to record that ack.

[1]
https://lore.kernel.org/git/pull.1080.git.git.1630376800.gitgitgadget@gmail.com/;
GitHub wouldn't let me change the target branch for the PR, so I had to
create a new one with the new base and thus the reason for not sending this
as v2 even though it is. [2]
https://lore.kernel.org/git/YVOZRhWttzF18Xql@coredump.intra.peff.net/,
https://lore.kernel.org/git/YVOZty9D7NRbzhE5@coredump.intra.peff.net/ [3]
https://lore.kernel.org/git/YVOXPTjsp9lrxmS6@coredump.intra.peff.net/ [4]
https://lore.kernel.org/git/xmqqr1d7e4ug.fsf@gitster.g/

=== FURTHER BACKGROUND (original cover letter material) ==

Here are some example commits you can try this out on (with git show
--remerge-diff $COMMIT):

 * git.git conflicted merge: 07601b5b36
 * git.git non-conflicted change: bf04590ecd
 * linux.git conflicted merge: eab3540562fb
 * linux.git non-conflicted change: 223cea6a4f05

Many more can be found by just running git log --merges --remerge-diff in
your repository of choice and searching for diffs (most merges tend to be
clean and unmodified and thus produce no diff but a search of '^diff' in the
log output tends to find the examples nicely).

Some basic high level details about this new option:

 * This option is most naturally compared to --cc, though the output seems
   to be much more understandable to most users than --cc output.
 * Since merges are often clean and unmodified, this new option results in
   an empty diff for most merges.
 * This new option shows things like the removal of conflict markers, which
   hunks users picked from the various conflicted sides to keep or remove,
   and shows changes made outside of conflict markers (which might reflect
   changes needed to resolve semantic conflicts or cleanups of e.g.
   compilation warnings or other additional changes an integrator felt
   belonged in the merged result).
 * This new option does not (currently) work for octopus merges, since
   merge-ort is specific to two-parent merges[1].
 * This option will not work on a read-only or full filesystem[2].
 * We discussed this capability at Git Merge 2020, and one of the
   suggestions was doing a periodic git gc --auto during the operation (due
   to potential new blobs and trees created during the operation). I found a
   way to avoid that; see [2].
 * This option is faster than you'd probably expect; it handles 33.5 merge
   commits per second in linux.git on my computer; see below.

In regards to the performance point above, the timing for running the
following command:

time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l


in linux.git (with v5.4 checked out, since my copy of linux is very out of
date) is as follows:

DIFF_FLAG=--cc:            71m 31.536s
DIFF_FLAG=--remerge-diff:  31m  3.170s


Note that there are 62476 merges in this history. Also, output size is:

DIFF_FLAG=--cc:            2169111 lines
DIFF_FLAG=--remerge-diff:  2458020 lines


So roughly the same amount of output as --cc, as you'd expect.

As a side note: git log --remerge-diff, when run in various repositories and
allowed to run all the way back to the beginning(s) of history, is a nice
stress test of sorts for merge-ort. Especially when users run it for you on
their repositories they are working on, whether intentionally or via a bug
in a tool triggering that command to be run unexpectedly. Long story short,
such a bug in an internal tool existed last December and this command was
run on an internal repository and found a platform-specific bug in merge-ort
on some really old merge commit from that repo. I fixed that bug (a
STABLE_QSORT thing) while upstreaming all the merge-ort patches in the mean
time, but it was nice getting extra testing. Having more folks run this on
their repositories might be useful extra testing of the new merge strategy.

Also, I previously mentioned --remerge-diff-only (a flag to show how
cherry-picks or reverts differ from an automatic cherry-pick or revert, in
addition to showing how merges differ from an automatic merge). This series
does not include the patches to introduce that option; I'll submit them
later.

Two other things that might be interesting but are not included and which I
haven't investigated:

 * some mechanism for passing extra merge options through (e.g.
   -Xignore-space-change)
 * a capability to compare the automatic merge to a second automatic merge
   done with different merge options. (Not sure if this would be of interest
   to end users, but might be interesting while developing new a
   --strategy-option, or maybe checking how changing some default in the
   merge algorithm would affect historical merges in various repositories).

[1] I have nebulous ideas of how an Octopus-centric ORT strategy could be
written -- basically, just repeatedly invoking ort and trying to make sure
nested conflicts can be differentiated. For now, though, a simple warning is
printed that octopus merges are not handled and no diff will be shown. [2]
New blobs/trees can be written by the three-way merging step. These are
written to a temporary area (via tmp-objdir.c) under the git object store
that is cleaned up at the end of the operation, with the new loose objects
from the remerge being cleaned up after each individual merge.

Elijah Newren (9):
  tmp_objdir: add a helper function for discarding all contained objects
  ll-merge: make callers responsible for showing warnings
  merge-ort: capture and print ll-merge warnings in our preferred
    fashion
  merge-ort: mark a few more conflict messages as omittable
  merge-ort: make path_messages available to external callers
  diff: add ability to insert additional headers for paths
  merge-ort: format messages slightly different for use in headers
  show, log: provide a --remerge-diff capability
  doc/diff-options: explain the new --remerge-diff option

 Documentation/diff-options.txt |  8 ++++
 apply.c                        |  5 ++-
 builtin/checkout.c             | 12 ++++--
 builtin/log.c                  | 16 ++++++++
 diff-merges.c                  | 12 ++++++
 diff.c                         | 34 ++++++++++++++++-
 diff.h                         |  1 +
 ll-merge.c                     | 40 ++++++++++---------
 ll-merge.h                     |  9 ++++-
 log-tree.c                     | 70 ++++++++++++++++++++++++++++++++++
 merge-blobs.c                  |  5 ++-
 merge-ort.c                    | 49 +++++++++++++++++++++---
 merge-ort.h                    | 10 +++++
 merge-recursive.c              |  8 +++-
 merge-recursive.h              |  1 +
 notes-merge.c                  |  5 ++-
 rerere.c                       | 10 +++--
 revision.h                     |  6 ++-
 t/t6404-recursive-merge.sh     |  9 ++++-
 t/t6406-merge-attr.sh          |  9 ++++-
 tmp-objdir.c                   |  5 +++
 tmp-objdir.h                   |  6 +++
 22 files changed, 288 insertions(+), 42 deletions(-)


base-commit: 4e44121c2d7bced65e25eb7ec5156290132bec94
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1103%2Fnewren%2Fremerge-diff-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1103/newren/remerge-diff-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1103
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 113+ messages in thread

end of thread, other threads:[~2022-02-02 11:18 UTC | newest]

Thread overview: 113+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
2021-12-21 18:05 ` [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects Elijah Newren via GitGitGadget
2021-12-21 23:26   ` Junio C Hamano
2021-12-21 23:51     ` Elijah Newren
2021-12-22  6:23       ` Junio C Hamano
2021-12-25  2:29         ` Elijah Newren
2021-12-21 18:05 ` [PATCH 2/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2021-12-21 21:19   ` Ævar Arnfjörð Bjarmason
2021-12-21 21:57     ` Elijah Newren
2021-12-21 23:02       ` Ævar Arnfjörð Bjarmason
2021-12-21 23:15         ` Elijah Newren
2021-12-21 23:44   ` Junio C Hamano
2021-12-23 18:26     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 3/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2021-12-22  0:00   ` Junio C Hamano
2021-12-23 18:36     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 4/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2021-12-22  0:06   ` Junio C Hamano
2021-12-23 18:38     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 5/9] merge-ort: make path_messages available to external callers Elijah Newren via GitGitGadget
2021-12-21 18:05 ` [PATCH 6/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2021-12-22  0:24   ` Junio C Hamano
2021-12-25  2:35     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 7/9] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2021-12-21 18:05 ` [PATCH 8/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2021-12-21 21:23   ` Ævar Arnfjörð Bjarmason
2021-12-21 22:18     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option Elijah Newren via GitGitGadget
2021-12-21 21:28   ` Ævar Arnfjörð Bjarmason
2021-12-21 22:24     ` Elijah Newren
2021-12-21 23:47       ` Ævar Arnfjörð Bjarmason
2021-12-22 19:05         ` Elijah Newren
2021-12-21 23:20 ` [PATCH 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
2021-12-21 23:43   ` Elijah Newren
2021-12-22  0:33 ` Junio C Hamano
2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
2021-12-25  7:59   ` [PATCH v2 1/8] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2021-12-28 10:56     ` Johannes Altmanninger
2021-12-28 22:34       ` Elijah Newren
2021-12-28 23:01         ` brian m. carlson
2021-12-28 23:45           ` Elijah Newren
2021-12-25  7:59   ` [PATCH v2 2/8] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
2021-12-25  7:59   ` [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2021-12-28 10:56     ` Johannes Altmanninger
2021-12-28 19:37       ` Elijah Newren
2021-12-28 22:05         ` Johannes Altmanninger
2021-12-25  7:59   ` [PATCH v2 4/8] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2021-12-25  7:59   ` [PATCH v2 5/8] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2021-12-25  7:59   ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2021-12-26 18:30     ` In-tree strbuf "in-place" search/replace (was: [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers) Ævar Arnfjörð Bjarmason
2021-12-28 10:56     ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Johannes Altmanninger
2021-12-28 21:48       ` Elijah Newren
2021-12-25  7:59   ` [PATCH v2 7/8] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2021-12-28 10:57     ` Johannes Altmanninger
2021-12-28 21:09       ` Elijah Newren
2021-12-29  0:16         ` Johannes Altmanninger
2021-12-30 22:04           ` Elijah Newren
2021-12-31  3:07             ` Johannes Altmanninger
2021-12-25  7:59   ` [PATCH v2 8/8] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
2021-12-28 10:57     ` Johannes Altmanninger
2021-12-28 23:42       ` Elijah Newren
2021-12-26 21:52   ` [PATCH v2 0/8] Add a new --remerge-diff capability to show & log Ævar Arnfjörð Bjarmason
2021-12-27 21:11     ` Elijah Newren
2022-01-10 15:48       ` Ævar Arnfjörð Bjarmason
2021-12-28 10:55   ` Johannes Altmanninger
2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 1/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2022-01-19 15:49       ` Ævar Arnfjörð Bjarmason
2022-01-20  2:31         ` Elijah Newren
2022-01-20  7:53           ` Elijah Newren
2022-01-19 16:01       ` Ævar Arnfjörð Bjarmason
2022-01-20  2:33         ` Elijah Newren
2021-12-30 23:36     ` [PATCH v3 2/9] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 3/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2022-01-19 16:41       ` Ævar Arnfjörð Bjarmason
2022-01-20  3:29         ` Elijah Newren
2021-12-30 23:36     ` [PATCH v3 4/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 5/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 6/9] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 7/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
2022-01-19 16:19       ` Ævar Arnfjörð Bjarmason
2022-01-21  2:16         ` Elijah Newren
2022-01-21 16:55           ` Elijah Newren
2021-12-30 23:36     ` [PATCH v3 9/9] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
2021-12-31  8:46     ` [PATCH v3 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2022-02-01  9:09         ` Ævar Arnfjörð Bjarmason
2022-02-01 16:40           ` Elijah Newren
2022-01-21 19:12       ` [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
2022-02-01  9:35         ` Ævar Arnfjörð Bjarmason
2022-02-01 16:54           ` Elijah Newren
2022-02-02 11:17             ` Ævar Arnfjörð Bjarmason
2022-01-21 19:12       ` [PATCH v4 03/10] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 05/10] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 06/10] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 07/10] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 08/10] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 10/10] diff-merges: avoid history simplifications when diffing merges Elijah Newren via GitGitGadget
2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 03/10] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 05/10] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 06/10] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 07/10] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 08/10] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 10/10] diff-merges: avoid history simplifications when diffing merges Elijah Newren via GitGitGadget

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).