git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
* [PATCH] revision.c: drop missing objects from cmdline
@ 2018-10-23 21:57 Matthew DeVore
  2018-10-24  4:54 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Matthew DeVore @ 2018-10-23 21:57 UTC (permalink / raw)
  To: git; +Cc: Matthew DeVore, gitster, jonathantanmy, jeffhost, ramsay

No code which reads cmdline in struct rev_info can handle NULL objects
in cmdline.rev[i].item, so stop adding them to the cmdline.rev array.
Objects in cmdline are NULL when the given object is promisor and
--exclude-promisor-objects is enabled.

This new behavior avoids a segmentation fault in the added test case in
t0410.

We could simply die if add_rev_cmdline is called with a NULL item,
(rather than warn if --exclude-promisor-objects is set) but because the
amended test case already expects the command to finish successfully,
difference and show a warning. Note that this command:

	git rev-list --objects --missing=print $missing_hash

Already fails with a "fatal: bad object HASH" message and this patch
does not change that.

Signed-off-by: Matthew DeVore <matvore@google.com>
---
 revision.c               | 12 ++++++++++++
 t/t0410-partial-clone.sh | 11 ++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index a1ddb9e11c..8724dca2e2 100644
--- a/revision.c
+++ b/revision.c
@@ -1148,6 +1148,18 @@ static void add_rev_cmdline(struct rev_info *revs,
 			    int whence,
 			    unsigned flags)
 {
+	if (!item) {
+		/*
+		 * item is likely a promisor object returned from get_reference.
+		 */
+		if (revs->exclude_promisor_objects) {
+			warning(_("ignoring missing object %s"), name);
+			return;
+		} else {
+			die(_("missing object %s"), name);
+		}
+	}
+
 	struct rev_cmdline_info *info = &revs->cmdline;
 	unsigned int nr = info->nr;
 
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index ba3887f178..e02cd3f818 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -366,7 +366,16 @@ test_expect_success 'rev-list accepts missing and promised objects on command li
 
 	git -C repo config core.repositoryformatversion 1 &&
 	git -C repo config extensions.partialclone "arbitrary string" &&
-	git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" "$TREE" "$BLOB"
+
+	printf "warning: ignoring missing object %s\n" \
+	       "$COMMIT" "$TREE" "$BLOB" >expect &&
+	git -C repo rev-list --objects \
+		--exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" 2>actual &&
+	test_cmp expect actual &&
+
+	git -C repo rev-list --objects-edge-aggressive \
+		--exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" 2>actual &&
+	test_cmp expect actual
 '
 
 test_expect_success 'gc repacks promisor objects separately from non-promisor objects' '
-- 
2.19.1.568.g152ad8e336-goog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] revision.c: drop missing objects from cmdline
  2018-10-23 21:57 [PATCH] revision.c: drop missing objects from cmdline Matthew DeVore
@ 2018-10-24  4:54 ` Junio C Hamano
  2018-10-25 23:13   ` Matthew DeVore
  2018-10-25 23:53 ` [PATCH v2] list-objects.c: don't segfault for missing cmdline objects Matthew DeVore
  2018-12-05 21:43 ` [PATCH v3] " Matthew DeVore
  2 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2018-10-24  4:54 UTC (permalink / raw)
  To: Matthew DeVore; +Cc: git, jonathantanmy, jeffhost, ramsay

Matthew DeVore <matvore@google.com> writes:

> No code which reads cmdline in struct rev_info can handle NULL objects
> in cmdline.rev[i].item, so stop adding them to the cmdline.rev array.

"The code is not prepared to have cmdline.rev[].item that is NULL"
is something everybody would understand and agree with, but that
does not automatically lead to "so ignoring or rejecting and dying
is OK", though.  The cmdline thing is used for the commands to learn
the end-user intent that cannot be learned by the resulting objects
in the object array (e.g. the user may have said 'master' but the
pending[] (and later revs.commits) would only have the object names,
and some callers would want to know if it was a branch name, a
refname refs/heads/master, or the hexadecimal object name), so
unless absolutely needed, I'm hesitant to take a change that loses
information (e.g. the user named this object that is not locally
available, we cannot afford to add it to the pending[] and add it to
revs.commits to traverse from there, but we still want to know what
object was given by the user).

> Objects in cmdline are NULL when the given object is promisor and
> --exclude-promisor-objects is enabled.

A "promisor" is a remote repository.  It promises certain objects
that you do not have are later retrievable from it.  The way you can
see if the promisor promised to later give you an object is to see
if that missing object is reachable from an object in a packfile the
promisor gave you earlier.  

"The given object" is never a "promisor", so I am not sure what the
above wants to say.  Is 

    When an object is given on the command line and if it is missing
    from the local repository, add_rev_cmdline() receives NULL in
    its "item" parameter.

what you meant?  Is that the _only_ case in which "item" could be
NULL, or is it also true for any missing object due to repository
corruption?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] revision.c: drop missing objects from cmdline
  2018-10-24  4:54 ` Junio C Hamano
@ 2018-10-25 23:13   ` Matthew DeVore
  0 siblings, 0 replies; 7+ messages in thread
From: Matthew DeVore @ 2018-10-25 23:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan, jeffhost, ramsay

On Tue, Oct 23, 2018 at 9:54 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Matthew DeVore <matvore@google.com> writes:
>
> > No code which reads cmdline in struct rev_info can handle NULL objects
> > in cmdline.rev[i].item, so stop adding them to the cmdline.rev array.
>
> "The code is not prepared to have cmdline.rev[].item that is NULL"
> is something everybody would understand and agree with, but that
> does not automatically lead to "so ignoring or rejecting and dying
> is OK", though.  The cmdline thing is used for the commands to learn
> the end-user intent that cannot be learned by the resulting objects
> in the object array (e.g. the user may have said 'master' but the
> pending[] (and later revs.commits) would only have the object names,
> and some callers would want to know if it was a branch name, a
> refname refs/heads/master, or the hexadecimal object name), so
> unless absolutely needed, I'm hesitant to take a change that loses
> information (e.g. the user named this object that is not locally
> available, we cannot afford to add it to the pending[] and add it to
> revs.commits to traverse from there, but we still want to know what
> object was given by the user).
Hmm, when you explain the purpose of cmdline, it's obvious now that it
doesn't make sense to mechanically drop items from it. I'm sending
another version of this patch which uses a more focused approach and
is a bit simpler.

>
> > Objects in cmdline are NULL when the given object is promisor and
> > --exclude-promisor-objects is enabled.
>
> A "promisor" is a remote repository.  It promises certain objects
> that you do not have are later retrievable from it.  The way you can
> see if the promisor promised to later give you an object is to see
> if that missing object is reachable from an object in a packfile the
> promisor gave you earlier.
>
> "The given object" is never a "promisor", so I am not sure what the
> above wants to say.  Is
>
>     When an object is given on the command line and if it is missing
>     from the local repository, add_rev_cmdline() receives NULL in
>     its "item" parameter.
>
> what you meant?  Is that the _only_ case in which "item" could be
> NULL, or is it also true for any missing object due to repository
> corruption?

Yes, that is what I meant. I believe for corruption there is an actual
error shown with die() or the like, though I am not certain.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] list-objects.c: don't segfault for missing cmdline objects
  2018-10-23 21:57 [PATCH] revision.c: drop missing objects from cmdline Matthew DeVore
  2018-10-24  4:54 ` Junio C Hamano
@ 2018-10-25 23:53 ` Matthew DeVore
  2018-10-29  0:06   ` Junio C Hamano
  2018-12-05 21:43 ` [PATCH v3] " Matthew DeVore
  2 siblings, 1 reply; 7+ messages in thread
From: Matthew DeVore @ 2018-10-25 23:53 UTC (permalink / raw)
  To: git; +Cc: Matthew DeVore, gitster, jonathantanmy, jeffhost, ramsay

When a command is invoked with both --exclude-promisor-objects,
--objects-edge-aggressive, and a missing object on the command line,
the rev_info.cmdline array could get a NULL pointer for the value of
an 'item' field. Prevent dereferencing of a NULL pointer in that
situation.

There are a few other places in the code where rev_info.cmdline is read
and the code doesn't handle NULL objects, but I couldn't prove to myself
that any of them needed to change except this one (since it may not
actually be possible to reach the other code paths with
rev_info.cmdline[] set to NULL).

Signed-off-by: Matthew DeVore <matvore@google.com>
---
 list-objects.c           | 3 ++-
 t/t0410-partial-clone.sh | 6 +++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index c41cc80db5..27ed2c6cab 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -245,7 +245,8 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge)
 		for (i = 0; i < revs->cmdline.nr; i++) {
 			struct object *obj = revs->cmdline.rev[i].item;
 			struct commit *commit = (struct commit *)obj;
-			if (obj->type != OBJ_COMMIT || !(obj->flags & UNINTERESTING))
+			if (!obj || obj->type != OBJ_COMMIT ||
+			    !(obj->flags & UNINTERESTING))
 				continue;
 			mark_tree_uninteresting(revs->repo,
 						get_commit_tree(commit));
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index ba3887f178..e52291e674 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -366,7 +366,11 @@ test_expect_success 'rev-list accepts missing and promised objects on command li
 
 	git -C repo config core.repositoryformatversion 1 &&
 	git -C repo config extensions.partialclone "arbitrary string" &&
-	git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" "$TREE" "$BLOB"
+
+	git -C repo rev-list --objects \
+		--exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" &&
+	git -C repo rev-list --objects-edge-aggressive \
+		--exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB"
 '
 
 test_expect_success 'gc repacks promisor objects separately from non-promisor objects' '
-- 
2.19.1.568.g152ad8e336-goog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] list-objects.c: don't segfault for missing cmdline objects
  2018-10-25 23:53 ` [PATCH v2] list-objects.c: don't segfault for missing cmdline objects Matthew DeVore
@ 2018-10-29  0:06   ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2018-10-29  0:06 UTC (permalink / raw)
  To: Matthew DeVore; +Cc: git, jonathantanmy, jeffhost, ramsay

Matthew DeVore <matvore@google.com> writes:

> When a command is invoked with both --exclude-promisor-objects,
> --objects-edge-aggressive, and a missing object on the command line,
> the rev_info.cmdline array could get a NULL pointer for the value of
> an 'item' field. Prevent dereferencing of a NULL pointer in that
> situation.

Thanks.

> There are a few other places in the code where rev_info.cmdline is read
> and the code doesn't handle NULL objects, but I couldn't prove to myself
> that any of them needed to change except this one (since it may not
> actually be possible to reach the other code paths with
> rev_info.cmdline[] set to NULL).
>
> Signed-off-by: Matthew DeVore <matvore@google.com>
> ---
>  list-objects.c           | 3 ++-
>  t/t0410-partial-clone.sh | 6 +++++-
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/list-objects.c b/list-objects.c
> index c41cc80db5..27ed2c6cab 100644
> --- a/list-objects.c
> +++ b/list-objects.c
> @@ -245,7 +245,8 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge)
>  		for (i = 0; i < revs->cmdline.nr; i++) {
>  			struct object *obj = revs->cmdline.rev[i].item;
>  			struct commit *commit = (struct commit *)obj;
> -			if (obj->type != OBJ_COMMIT || !(obj->flags & UNINTERESTING))
> +			if (!obj || obj->type != OBJ_COMMIT ||
> +			    !(obj->flags & UNINTERESTING))
>  				continue;
>  			mark_tree_uninteresting(revs->repo,
>  						get_commit_tree(commit));
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index ba3887f178..e52291e674 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -366,7 +366,11 @@ test_expect_success 'rev-list accepts missing and promised objects on command li
>  
>  	git -C repo config core.repositoryformatversion 1 &&
>  	git -C repo config extensions.partialclone "arbitrary string" &&
> -	git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" "$TREE" "$BLOB"
> +
> +	git -C repo rev-list --objects \
> +		--exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" &&
> +	git -C repo rev-list --objects-edge-aggressive \
> +		--exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB"
>  '
>  
>  test_expect_success 'gc repacks promisor objects separately from non-promisor objects' '

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v3] list-objects.c: don't segfault for missing cmdline objects
  2018-10-23 21:57 [PATCH] revision.c: drop missing objects from cmdline Matthew DeVore
  2018-10-24  4:54 ` Junio C Hamano
  2018-10-25 23:53 ` [PATCH v2] list-objects.c: don't segfault for missing cmdline objects Matthew DeVore
@ 2018-12-05 21:43 ` " Matthew DeVore
  2018-12-06  1:12   ` Junio C Hamano
  2 siblings, 1 reply; 7+ messages in thread
From: Matthew DeVore @ 2018-12-05 21:43 UTC (permalink / raw)
  To: git; +Cc: Matthew DeVore, gitster, jonathantanmy, jeffhost, ramsay, matvore

When a command is invoked with both --exclude-promisor-objects,
--objects-edge-aggressive, and a missing object on the command line,
the rev_info.cmdline array could get a NULL pointer for the value of
an 'item' field. Prevent dereferencing of a NULL pointer in that
situation.

Properly handle --ignore-missing. If it is not passed, die when an
object is missing. Otherwise, just silently ignore it.

Signed-off-by: Matthew DeVore <matvore@google.com>
---
 revision.c               |  2 ++
 t/t0410-partial-clone.sh | 16 ++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/revision.c b/revision.c
index 13e0519c02..293303b67d 100644
--- a/revision.c
+++ b/revision.c
@@ -1729,6 +1729,8 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi
 	if (!cant_be_filename)
 		verify_non_filename(revs->prefix, arg);
 	object = get_reference(revs, arg, &oid, flags ^ local_flags);
+	if (!object)
+		return revs->ignore_missing ? 0 : -1;
 	add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
 	add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
 	free(oc.path);
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index ba3887f178..169f7f10a7 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -349,7 +349,7 @@ test_expect_success 'rev-list stops traversal at promisor commit, tree, and blob
 	grep $(git -C repo rev-parse bar) out  # sanity check that some walking was done
 '
 
-test_expect_success 'rev-list accepts missing and promised objects on command line' '
+test_expect_success 'rev-list dies for missing objects on cmd line' '
 	rm -rf repo &&
 	test_create_repo repo &&
 	test_commit -C repo foo &&
@@ -366,7 +366,19 @@ test_expect_success 'rev-list accepts missing and promised objects on command li
 
 	git -C repo config core.repositoryformatversion 1 &&
 	git -C repo config extensions.partialclone "arbitrary string" &&
-	git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" "$TREE" "$BLOB"
+
+	for OBJ in "$COMMIT" "$TREE" "$BLOB"; do
+		test_must_fail git -C repo rev-list --objects \
+			--exclude-promisor-objects "$OBJ" &&
+		test_must_fail git -C repo rev-list --objects-edge-aggressive \
+			--exclude-promisor-objects "$OBJ" &&
+
+		# Do not die or crash when --ignore-missing is passed.
+		git -C repo rev-list --ignore-missing --objects \
+			--exclude-promisor-objects "$OBJ" &&
+		git -C repo rev-list --ignore-missing --objects-edge-aggressive \
+			--exclude-promisor-objects "$OBJ"
+	done
 '
 
 test_expect_success 'gc repacks promisor objects separately from non-promisor objects' '
-- 
2.20.0.rc1.387.gf8505762e3-goog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] list-objects.c: don't segfault for missing cmdline objects
  2018-12-05 21:43 ` [PATCH v3] " Matthew DeVore
@ 2018-12-06  1:12   ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2018-12-06  1:12 UTC (permalink / raw)
  To: Matthew DeVore; +Cc: git, jonathantanmy, jeffhost, ramsay, matvore

Matthew DeVore <matvore@google.com> writes:

> When a command is invoked with both --exclude-promisor-objects,
> --objects-edge-aggressive, and a missing object on the command line,
> the rev_info.cmdline array could get a NULL pointer for the value of
> an 'item' field. Prevent dereferencing of a NULL pointer in that
> situation.
>
> Properly handle --ignore-missing. If it is not passed, die when an
> object is missing. Otherwise, just silently ignore it.
>
> Signed-off-by: Matthew DeVore <matvore@google.com>

Thanks for an update.  Will replace.

> ---
>  revision.c               |  2 ++
>  t/t0410-partial-clone.sh | 16 ++++++++++++++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/revision.c b/revision.c
> index 13e0519c02..293303b67d 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1729,6 +1729,8 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi
>  	if (!cant_be_filename)
>  		verify_non_filename(revs->prefix, arg);
>  	object = get_reference(revs, arg, &oid, flags ^ local_flags);
> +	if (!object)
> +		return revs->ignore_missing ? 0 : -1;
>  	add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
>  	add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
>  	free(oc.path);
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index ba3887f178..169f7f10a7 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -349,7 +349,7 @@ test_expect_success 'rev-list stops traversal at promisor commit, tree, and blob
>  	grep $(git -C repo rev-parse bar) out  # sanity check that some walking was done
>  '
>  
> -test_expect_success 'rev-list accepts missing and promised objects on command line' '
> +test_expect_success 'rev-list dies for missing objects on cmd line' '
>  	rm -rf repo &&
>  	test_create_repo repo &&
>  	test_commit -C repo foo &&
> @@ -366,7 +366,19 @@ test_expect_success 'rev-list accepts missing and promised objects on command li
>  
>  	git -C repo config core.repositoryformatversion 1 &&
>  	git -C repo config extensions.partialclone "arbitrary string" &&
> -	git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" "$TREE" "$BLOB"
> +
> +	for OBJ in "$COMMIT" "$TREE" "$BLOB"; do
> +		test_must_fail git -C repo rev-list --objects \
> +			--exclude-promisor-objects "$OBJ" &&
> +		test_must_fail git -C repo rev-list --objects-edge-aggressive \
> +			--exclude-promisor-objects "$OBJ" &&
> +
> +		# Do not die or crash when --ignore-missing is passed.
> +		git -C repo rev-list --ignore-missing --objects \
> +			--exclude-promisor-objects "$OBJ" &&
> +		git -C repo rev-list --ignore-missing --objects-edge-aggressive \
> +			--exclude-promisor-objects "$OBJ"
> +	done
>  '
>  
>  test_expect_success 'gc repacks promisor objects separately from non-promisor objects' '

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-23 21:57 [PATCH] revision.c: drop missing objects from cmdline Matthew DeVore
2018-10-24  4:54 ` Junio C Hamano
2018-10-25 23:13   ` Matthew DeVore
2018-10-25 23:53 ` [PATCH v2] list-objects.c: don't segfault for missing cmdline objects Matthew DeVore
2018-10-29  0:06   ` Junio C Hamano
2018-12-05 21:43 ` [PATCH v3] " Matthew DeVore
2018-12-06  1:12   ` Junio C Hamano

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox