git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "William Sprent via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: William Sprent <williams@unity3d.com>,
	William Sprent <williams@unity3d.com>
Subject: [PATCH] fast-export: fix surprising behavior with --first-parent
Date: Tue, 23 Nov 2021 11:28:47 +0000	[thread overview]
Message-ID: <pull.1084.git.1637666927224.gitgitgadget@gmail.com> (raw)

From: William Sprent <williams@unity3d.com>

When invoking git-fast-export with the --first-parent flag on a branch
with merges, fast-export would early-out on processing the first merge
on the branch. If combined with --reverse, fast-export would instead
output all single parent commits on the branch.

This commit makes fast-export output the same commits as rev-list
--first-parent, and makes --reverse not have an effect on which commits
are output.

The fix involves removing logic within fast-export which was responsible
for ensuring that parents are processed before their children, which was
what was exiting early due to missing second parents. This is replaced
by setting 'reverse = 1' before revision walking, which, in conjuction
with topo_order, allows for delegating the ordering of commits to
revision.c. The reverse flag is set after parsing rev-list arguments to
avoid having it disabled.

Signed-off-by: William Sprent <williams@unity3d.com>
---
    fast-export: fix surprising behavior with --first-parent
    
    Hi,
    
    This is my first time patching git, so I probably need some guidance on
    my approach. :)
    
    I've noticed that git fast-export exhibits some odd behavior when passed
    the --first-parent flag. In the repository I was working on, it would
    only output the initial commit before exiting. Moreover, adding the
    --reverse flag causes it to behave differently and instead output all
    commits in the first parent line that only have one parent. My
    expectation is more or less that git fast-export should output the same
    set of commits as git rev-list would output given the same arguments,
    which matches how it acts when given revision ranges.
    
    It seems like this behavior comes from the fact that has_unshown_parents
    will never be false for any merge commits encountered when fast-export
    is called with --first-parent. This causes the while loop to follow the
    pattern of pushing all commits into the "commits" queue until the
    initial commit is encountered, at which point it calls handle_tail which
    falls through on the first merge commit, causing most of the commits to
    be unhandled.
    
    My impression is that this logic only serves to ensure that parents are
    processed before their children, so in my patch I've opted to fix the
    issue by delegating that responsibility to revision.c by adding the
    reverse flag before performing the revision walk. From what I can see,
    this should be equivalent to what the previous logic is trying to
    achieve, but I can also see that there could be risk in these changes.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1084%2Fwilliams-unity%2Ffast-export%2Ffirst-parent-issues-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1084/williams-unity/fast-export/first-parent-issues-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1084

 builtin/fast-export.c  | 36 ++----------------------------------
 t/t9350-fast-export.sh | 30 ++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 34 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 8e2caf72819..50f8c224b6e 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -107,18 +107,6 @@ static int parse_opt_reencode_mode(const struct option *opt,
 
 static struct decoration idnums;
 static uint32_t last_idnum;
-
-static int has_unshown_parent(struct commit *commit)
-{
-	struct commit_list *parent;
-
-	for (parent = commit->parents; parent; parent = parent->next)
-		if (!(parent->item->object.flags & SHOWN) &&
-		    !(parent->item->object.flags & UNINTERESTING))
-			return 1;
-	return 0;
-}
-
 struct anonymized_entry {
 	struct hashmap_entry hash;
 	const char *anon;
@@ -752,20 +740,6 @@ static char *anonymize_tag(void *data)
 	return strbuf_detach(&out, NULL);
 }
 
-static void handle_tail(struct object_array *commits, struct rev_info *revs,
-			struct string_list *paths_of_changed_objects)
-{
-	struct commit *commit;
-	while (commits->nr) {
-		commit = (struct commit *)object_array_pop(commits);
-		if (has_unshown_parent(commit)) {
-			/* Queue again, to be handled later */
-			add_object_array(&commit->object, NULL, commits);
-			return;
-		}
-		handle_commit(commit, revs, paths_of_changed_objects);
-	}
-}
 
 static void handle_tag(const char *name, struct tag *tag)
 {
@@ -1185,7 +1159,6 @@ static int parse_opt_anonymize_map(const struct option *opt,
 int cmd_fast_export(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info revs;
-	struct object_array commits = OBJECT_ARRAY_INIT;
 	struct commit *commit;
 	char *export_filename = NULL,
 	     *import_filename = NULL,
@@ -1281,19 +1254,14 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 
 	get_tags_and_duplicates(&revs.cmdline);
 
+	revs.reverse = 1;
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
 	revs.diffopt.format_callback = show_filemodify;
 	revs.diffopt.format_callback_data = &paths_of_changed_objects;
 	revs.diffopt.flags.recursive = 1;
 	while ((commit = get_revision(&revs))) {
-		if (has_unshown_parent(commit)) {
-			add_object_array(&commit->object, NULL, &commits);
-		}
-		else {
-			handle_commit(commit, &revs, &paths_of_changed_objects);
-			handle_tail(&commits, &revs, &paths_of_changed_objects);
-		}
+		handle_commit(commit, &revs, &paths_of_changed_objects);
 	}
 
 	handle_tags_and_duplicates(&extra_refs);
diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 409b48e2442..bd08101b81d 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -750,4 +750,34 @@ test_expect_success 'merge commit gets exported with --import-marks' '
 	)
 '
 
+
+test_expect_success 'fast-export --first-parent outputs all revisions output by revision walk' '
+	git init first-parent &&
+	cd first-parent &&
+	test_commit init &&
+	git checkout -b topic1 &&
+	test_commit file2 file2.txt &&
+	git checkout main &&
+	git merge topic1 --no-ff &&
+
+	git checkout -b topic2 &&
+	test_commit file3 file3.txt &&
+	git checkout main &&
+	git merge topic2 --no-ff &&
+
+	test_commit branch-head &&
+
+	git rev-list --format="%ad%B" --first-parent --topo-order --no-commit-header main > expected &&
+
+	git fast-export main -- --first-parent > first-parent-export &&
+	git fast-export main -- --first-parent --reverse > first-parent-reverse-export &&
+	git init import && cd import &&
+	cat ../first-parent-export | git fast-import &&
+
+	git rev-list --format="%ad%B" --topo-order --all --no-commit-header > actual &&
+	test $(git rev-list --all | wc -l) -eq 4 &&
+	test_cmp ../expected actual &&
+	test_cmp ../first-parent-export ../first-parent-reverse-export
+'
+
 test_done

base-commit: cd3e606211bb1cf8bc57f7d76bab98cc17a150bc
-- 
gitgitgadget

             reply	other threads:[~2021-11-23 11:28 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-23 11:28 William Sprent via GitGitGadget [this message]
2021-11-23 13:07 ` [PATCH] fast-export: fix surprising behavior with --first-parent Ævar Arnfjörð Bjarmason
2021-11-24 11:57   ` William Sprent
2021-11-23 19:14 ` Elijah Newren
2021-11-24 13:05   ` William Sprent
2021-11-24  0:41 ` Junio C Hamano
2021-11-24 13:05   ` William Sprent
2021-12-09  8:13 ` [PATCH v2] " William Sprent via GitGitGadget
2021-12-10  3:48   ` Elijah Newren
2021-12-10 21:55     ` Junio C Hamano
2021-12-10 22:02       ` Elijah Newren
2021-12-13 15:09     ` William Sprent
2021-12-14  0:31       ` Elijah Newren
2021-12-14 13:11         ` William Sprent
2021-12-16 16:23   ` [PATCH v3] " William Sprent via GitGitGadget
2021-12-21 18:47     ` Elijah Newren
2021-12-21 20:50       ` Junio C Hamano
2021-12-22  8:38         ` William Sprent

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1084.git.1637666927224.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=williams@unity3d.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).