git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
* [PATCH 1/2] revision: use repository from rev_info when parsing commits
@ 2020-06-23 20:56 Michael Forney
  2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Michael Forney @ 2020-06-23 20:56 UTC (permalink / raw)
  To: git, Derrick Stolee

This is needed when repo_init_revisions() is called with a repository
that is not the_repository to ensure appropriate repository is used
in repo_parse_commit_internal(). If the wrong repository is used,
a fatal error is the commit-graph machinery occurs:

  fatal: invalid commit position. commit-graph is likely corrupt

Since revision.c was the only user of the parse_commit_gently
compatibility define, remove it from commit.h.

Signed-off-by: Michael Forney <mforney@mforney.org>
---
 commit.h   |  1 -
 revision.c | 18 +++++++++---------
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/commit.h b/commit.h
index 1b2dea5d85..a2e8ca99a2 100644
--- a/commit.h
+++ b/commit.h
@@ -97,7 +97,6 @@ static inline int parse_commit_no_graph(struct commit *commit)
 
 #ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
 #define parse_commit_internal(item, quiet, use) repo_parse_commit_internal(the_repository, item, quiet, use)
-#define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet)
 #define parse_commit(item) repo_parse_commit(the_repository, item)
 #endif
 
diff --git a/revision.c b/revision.c
index ebb4d2a0f2..2b6bf47c81 100644
--- a/revision.c
+++ b/revision.c
@@ -439,7 +439,7 @@ static struct commit *handle_commit(struct rev_info *revs,
 	if (object->type == OBJ_COMMIT) {
 		struct commit *commit = (struct commit *)object;
 
-		if (parse_commit(commit) < 0)
+		if (repo_parse_commit(revs->repo, commit) < 0)
 			die("unable to parse commit %s", name);
 		if (flags & UNINTERESTING) {
 			mark_parents_uninteresting(commit);
@@ -992,7 +992,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 					ts->treesame[0] = 1;
 			}
 		}
-		if (parse_commit(p) < 0)
+		if (repo_parse_commit(revs->repo, p) < 0)
 			die("cannot simplify commit %s (because of %s)",
 			    oid_to_hex(&commit->object.oid),
 			    oid_to_hex(&p->object.oid));
@@ -1037,7 +1037,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 				 * IOW, we pretend this parent is a
 				 * "root" commit.
 				 */
-				if (parse_commit(p) < 0)
+				if (repo_parse_commit(revs->repo, p) < 0)
 					die("cannot simplify commit %s (invalid %s)",
 					    oid_to_hex(&commit->object.oid),
 					    oid_to_hex(&p->object.oid));
@@ -1105,7 +1105,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			parent = parent->next;
 			if (p)
 				p->object.flags |= UNINTERESTING;
-			if (parse_commit_gently(p, 1) < 0)
+			if (repo_parse_commit_gently(revs->repo, p, 1) < 0)
 				continue;
 			if (p->parents)
 				mark_parents_uninteresting(p);
@@ -1136,7 +1136,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 		struct commit *p = parent->item;
 		int gently = revs->ignore_missing_links ||
 			     revs->exclude_promisor_objects;
-		if (parse_commit_gently(p, gently) < 0) {
+		if (repo_parse_commit_gently(revs->repo, p, gently) < 0) {
 			if (revs->exclude_promisor_objects &&
 			    is_promisor_object(&p->object.oid)) {
 				if (revs->first_parent_only)
@@ -3295,7 +3295,7 @@ static void explore_walk_step(struct rev_info *revs)
 	if (!c)
 		return;
 
-	if (parse_commit_gently(c, 1) < 0)
+	if (repo_parse_commit_gently(revs->repo, c, 1) < 0)
 		return;
 
 	if (revs->sort_order == REV_SORT_BY_AUTHOR_DATE)
@@ -3333,7 +3333,7 @@ static void indegree_walk_step(struct rev_info *revs)
 	if (!c)
 		return;
 
-	if (parse_commit_gently(c, 1) < 0)
+	if (repo_parse_commit_gently(revs->repo, c, 1) < 0)
 		return;
 
 	explore_to_depth(revs, c->generation);
@@ -3414,7 +3414,7 @@ static void init_topo_walk(struct rev_info *revs)
 	for (list = revs->commits; list; list = list->next) {
 		struct commit *c = list->item;
 
-		if (parse_commit_gently(c, 1))
+		if (repo_parse_commit_gently(revs->repo, c, 1))
 			continue;
 
 		test_flag_and_insert(&info->explore_queue, c, TOPO_WALK_EXPLORED);
@@ -3476,7 +3476,7 @@ static void expand_topo_walk(struct rev_info *revs, struct commit *commit)
 		if (parent->object.flags & UNINTERESTING)
 			continue;
 
-		if (parse_commit_gently(parent, 1) < 0)
+		if (repo_parse_commit_gently(revs->repo, parent, 1) < 0)
 			continue;
 
 		if (parent->generation < info->min_generation) {
-- 
2.27.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-23 20:56 [PATCH 1/2] revision: use repository from rev_info when parsing commits Michael Forney
@ 2020-06-23 20:56 ` Michael Forney
  2020-06-24 14:35   ` Derrick Stolee
  2020-06-23 21:24 ` [PATCH 1/2] revision: use repository from rev_info when parsing commits Eric Sunshine
  2020-06-24 14:29 ` Derrick Stolee
  2 siblings, 1 reply; 7+ messages in thread
From: Michael Forney @ 2020-06-23 20:56 UTC (permalink / raw)
  To: git, Derrick Stolee

In show_submodule_header(), we gather the left and right commits
of the submodule repository, as well as the merge bases. However,
prepare_submodule_summary() initializes the rev_info with the_repository,
so we end up parsing the commit in the wrong repository.

This results in a fatal error in parse_commit_in_graph(), since the
passed item does not belong to the repository's commit graph.

Signed-off-by: Michael Forney <mforney@mforney.org>
---
 submodule.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/submodule.c b/submodule.c
index e2ef5698c8..785ab47629 100644
--- a/submodule.c
+++ b/submodule.c
@@ -438,13 +438,13 @@ void handle_ignore_submodules_arg(struct diff_options *diffopt,
 	 */
 }
 
-static int prepare_submodule_summary(struct rev_info *rev, const char *path,
-		struct commit *left, struct commit *right,
+static int prepare_submodule_summary(struct repository *r, struct rev_info *rev,
+		const char *path, struct commit *left, struct commit *right,
 		struct commit_list *merge_bases)
 {
 	struct commit_list *list;
 
-	repo_init_revisions(the_repository, rev, NULL);
+	repo_init_revisions(r, rev, NULL);
 	setup_revisions(0, NULL, rev, NULL);
 	rev->left_right = 1;
 	rev->first_parent_only = 1;
@@ -632,7 +632,7 @@ void show_submodule_summary(struct diff_options *o, const char *path,
 		goto out;
 
 	/* Treat revision walker failure the same as missing commits */
-	if (prepare_submodule_summary(&rev, path, left, right, merge_bases)) {
+	if (prepare_submodule_summary(sub, &rev, path, left, right, merge_bases)) {
 		diff_emit_submodule_error(o, "(revision walker failed)\n");
 		goto out;
 	}
-- 
2.27.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] revision: use repository from rev_info when parsing commits
  2020-06-23 20:56 [PATCH 1/2] revision: use repository from rev_info when parsing commits Michael Forney
  2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
@ 2020-06-23 21:24 ` Eric Sunshine
  2020-06-24 14:29 ` Derrick Stolee
  2 siblings, 0 replies; 7+ messages in thread
From: Eric Sunshine @ 2020-06-23 21:24 UTC (permalink / raw)
  To: Michael Forney; +Cc: Git List, Derrick Stolee

On Tue, Jun 23, 2020 at 5:02 PM Michael Forney <mforney@mforney.org> wrote:
> This is needed when repo_init_revisions() is called with a repository
> that is not the_repository to ensure appropriate repository is used
> in repo_parse_commit_internal(). If the wrong repository is used,
> a fatal error is the commit-graph machinery occurs:

s/is/in/

>   fatal: invalid commit position. commit-graph is likely corrupt
>
> Since revision.c was the only user of the parse_commit_gently
> compatibility define, remove it from commit.h.
>
> Signed-off-by: Michael Forney <mforney@mforney.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] revision: use repository from rev_info when parsing commits
  2020-06-23 20:56 [PATCH 1/2] revision: use repository from rev_info when parsing commits Michael Forney
  2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
  2020-06-23 21:24 ` [PATCH 1/2] revision: use repository from rev_info when parsing commits Eric Sunshine
@ 2020-06-24 14:29 ` Derrick Stolee
  2 siblings, 0 replies; 7+ messages in thread
From: Derrick Stolee @ 2020-06-24 14:29 UTC (permalink / raw)
  To: Michael Forney, git

On 6/23/2020 4:56 PM, Michael Forney wrote:
> This is needed when repo_init_revisions() is called with a repository
> that is not the_repository to ensure appropriate repository is used
> in repo_parse_commit_internal(). If the wrong repository is used,
> a fatal error is the commit-graph machinery occurs:
> 
>   fatal: invalid commit position. commit-graph is likely corrupt
> 
> Since revision.c was the only user of the parse_commit_gently
> compatibility define, remove it from commit.h.

Is this demonstrable in a test case, to prevent regressions?

Notably, you are _not_ dropping parse_commit(), and it would be
easy for another call to that shim to slip into revision.c.

> Signed-off-by: Michael Forney <mforney@mforney.org>
> ---
>  commit.h   |  1 -
>  revision.c | 18 +++++++++---------
>  2 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/commit.h b/commit.h
> -#define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet)

I'm happy we can drop this shim!

> diff --git a/revision.c b/revision.c
> index ebb4d2a0f2..2b6bf47c81 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -439,7 +439,7 @@ static struct commit *handle_commit(struct rev_info *revs,
>  	if (object->type == OBJ_COMMIT) {
>  		struct commit *commit = (struct commit *)object;
>  
> -		if (parse_commit(commit) < 0)
> +		if (repo_parse_commit(revs->repo, commit) < 0)

I counted 9 copies of parse_commit[_gently]() in my version
of revision.c, so it looks like you caught them all.

Thanks!
-Stolee


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
@ 2020-06-24 14:35   ` Derrick Stolee
  2020-06-24 16:12     ` Junio C Hamano
  2020-06-30 11:04     ` Michael Forney
  0 siblings, 2 replies; 7+ messages in thread
From: Derrick Stolee @ 2020-06-24 14:35 UTC (permalink / raw)
  To: Michael Forney, git

On 6/23/2020 4:56 PM, Michael Forney wrote:
> In show_submodule_header(), we gather the left and right commits
> of the submodule repository, as well as the merge bases. However,
> prepare_submodule_summary() initializes the rev_info with the_repository,
> so we end up parsing the commit in the wrong repository.
> 
> This results in a fatal error in parse_commit_in_graph(), since the
> passed item does not belong to the repository's commit graph.
> 
> Signed-off-by: Michael Forney <mforney@mforney.org>
> ---
>  submodule.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/submodule.c b/submodule.c
> index e2ef5698c8..785ab47629 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -438,13 +438,13 @@ void handle_ignore_submodules_arg(struct diff_options *diffopt,
>  	 */
>  }
>  
> -static int prepare_submodule_summary(struct rev_info *rev, const char *path,
> -		struct commit *left, struct commit *right,
> +static int prepare_submodule_summary(struct repository *r, struct rev_info *rev,
> +		const char *path, struct commit *left, struct commit *right,
>  		struct commit_list *merge_bases)
>  {
>  	struct commit_list *list;
>  
> -	repo_init_revisions(the_repository, rev, NULL);
> +	repo_init_revisions(r, rev, NULL);

This is how we properly initialize the repository in the rev_info.
It's unfortunate that this use of the_repository was pretty clearly
incorrect. This is submodule.c, so every instance of the_repository
should be examined carefully. Taking a brief look right now, the
rest seem to be correct in that they are finding submodules within
the super-repo. The only issue will arise when recursing into
submodules, which is known to be broken in-process and are handled
with subprocesses instead.

>  	setup_revisions(0, NULL, rev, NULL);
>  	rev->left_right = 1;
>  	rev->first_parent_only = 1;
> @@ -632,7 +632,7 @@ void show_submodule_summary(struct diff_options *o, const char *path,
>  		goto out;
>  
>  	/* Treat revision walker failure the same as missing commits */
> -	if (prepare_submodule_summary(&rev, path, left, right, merge_bases)) {
> +	if (prepare_submodule_summary(sub, &rev, path, left, right, merge_bases)) {
>  		diff_emit_submodule_error(o, "(revision walker failed)\n");
>  		goto out;
>  	}

Perhaps the test I requested in patch 1 is only appropriate
here? Or, maybe the test should be test_expect_failure in the
first patch and switched to test_expect_success here?

Thanks,
-Stolee



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-24 14:35   ` Derrick Stolee
@ 2020-06-24 16:12     ` Junio C Hamano
  2020-06-30 11:04     ` Michael Forney
  1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2020-06-24 16:12 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Michael Forney, git

Derrick Stolee <stolee@gmail.com> writes:

> Perhaps the test I requested in patch 1 is only appropriate
> here? Or, maybe the test should be test_expect_failure in the
> first patch and switched to test_expect_success here?

Either is OK, but it is probably easier to read to have just one
addition in 2/2 to expect succeses.  Temporarily revierting with
"git show ':!t' | git apply -R" before running test when you want
to reallly see how the original crashed is easy and simple enough.

Thanks.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] submodule: use submodule repository when preparing summary
  2020-06-24 14:35   ` Derrick Stolee
  2020-06-24 16:12     ` Junio C Hamano
@ 2020-06-30 11:04     ` Michael Forney
  1 sibling, 0 replies; 7+ messages in thread
From: Michael Forney @ 2020-06-30 11:04 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1967 bytes --]

On 2020-06-24, Derrick Stolee <stolee@gmail.com> wrote:
> Perhaps the test I requested in patch 1 is only appropriate
> here? Or, maybe the test should be test_expect_failure in the
> first patch and switched to test_expect_success here?

I made a good effort to write a test, but I am still unable to
reliably trigger the offending codepath, which is:

submodule.c:prepare_submodule_summary
revision.c:prepare_revision_walk
revision.c:limit_list
revision.c:process_parents
commit.c:repo_parse_commit_gently
commit.c:repo_parse_commit_internal (needs !item->object.parsed and
use_commit_graph)
commit-graph.c:parse_commit_in_graph
commit-graph.c:parse_commit_in_graph_one
commit-graph.c:fill_commit_in_graph (needs pos >= number of commits in
commit-graph in parent repository)

The trick seems to be ensuring that the parent commit of the first
commit in the range of commits changed in a submodule does not get
parsed during show_submodule_header, is not a loose object in the
repository, and has an index in the commit-graph that is larger than
the size of the commit-graph in the parent repository. This seems to
depend on the order of commits in the commit-graph, which seems to be
random (perhaps based on commit hashes?).

I attached my best attempt at a test to trigger the error. The
probability of the test failing correctly (without the fix applied)
seems to depend on how many commits are present in submodule before
the first commit in the range of changed commits. This can be
controlled by adjusting the `seq 1 X` in the for loop. The lowest
number of commits with which I have been able to reproduce the bug is
3, where it occurs around 1% of the time, and if I set it to 200, I
can reproduce the bug around 99% of the time.

I don't really want to spend more time on this than I already have.
Can the bug fix be applied without a test? If not, hopefully someone
can volunteer to craft a reliable test (assuming that this is even
possible).

-Michael

[-- Attachment #2: t7421-submodule-commit-graph.sh --]
[-- Type: application/x-sh, Size: 721 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-23 20:56 [PATCH 1/2] revision: use repository from rev_info when parsing commits Michael Forney
2020-06-23 20:56 ` [PATCH 2/2] submodule: use submodule repository when preparing summary Michael Forney
2020-06-24 14:35   ` Derrick Stolee
2020-06-24 16:12     ` Junio C Hamano
2020-06-30 11:04     ` Michael Forney
2020-06-23 21:24 ` [PATCH 1/2] revision: use repository from rev_info when parsing commits Eric Sunshine
2020-06-24 14:29 ` Derrick Stolee

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git