git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: "René Scharfe" <l.s.r@web.de>, "Junio C Hamano" <gitster@pobox.com>
Cc: Robert Leftwich <robert@gitpod.io>,
	git@vger.kernel.org, Derrick Stolee <dstolee@microsoft.com>
Subject: Re: Bug/regression report - 'git stash push -u' fatal errors when sub-repo present
Date: Mon, 4 Oct 2021 16:06:22 -0400	[thread overview]
Message-ID: <6df361a5-8e15-63a7-dbb8-77405c6edf0e@gmail.com> (raw)
In-Reply-To: <1d26a9f3-dcb5-408a-581e-40411e6a2179@web.de>

On 10/1/2021 10:25 AM, René Scharfe wrote:
> Am 30.09.21 um 18:35 schrieb Junio C Hamano:
>> René Scharfe <l.s.r@web.de> writes:
>>
>>> Looping in Stolee as the author of 6e773527b6 (sparse-index: convert
>>> from full to sparse, 2021-03-30).

Thanks for looping me in. I'm still getting caught up from some
time off.

>>> Here's a raw patch for that.  Versions before 6e773527b6 pass the
>>> included test.
>>>
>>> The magic return value of 2 is a bit ugly, but allows adding the
>>> additional check only to the call-site relevant to the bug report.
>>>
>>> I don't know if other callers of verify_path() might also need that
>>> check, or if it is too narrow.
>>
>> The callers inside read-cache.c, which I think were the original
>> intended audiences of the function, always call verify_path() with
>> the pathname suitable to be stored as a cache entry.
>>
>> The callee never has to assume an extra trailing slash might exist
>> at the end.  And verify_path() said "no", if any caller passed an
>> extra trailing slash before the commit in question.
>>
>> I _think_ we defined the new "tree" type entries the sparse index
>> stuff added to be the _only_ cache entries whose path always ends
>> with a slash?  If so, perhaps we should audit the existing callers
>> and move the "paths that come from end-users to be entered to the
>> index must never end with a slash" sanity check this function gave
>> us, which was broken by 6e773527b6, perhaps?
>>
>> "git update-index" should never allow to create a "tree" kind of
>> cache entry (making it sparse again should be the task of systems
>> internal, and should not be done by end-user feeding a pre-shrunk
>> "tree" kind of entry to record a sparsely populated state, if I
>> understand correctly), so its call to verify_path(), if it ends with
>> a directory separator, should say "that's not a good path".
>>
>> Prehaps we can rename verify_path() to verify_path_internal() that
>> is _known_ to be called with names that are already vetted to be
>> stored in ce_name and convert callers inside read-cache.c to call
>> it, and write verify_path() as a thin wrapper of it to fail when the
>> former stops by returning S_ISDIR() at the place your fix touched,
>> or something?
> 
> Restoring the stricter check for the general case and providing a way
> out for the code handling sparse indexes sounds like a good idea.
> 
> I don't know which callers are supposed to need that, but the following
> patch passes all tests, including the new one.

I like this new approach.

> +enum verify_path_result {
> +	PATH_OK,
> +	PATH_INVALID,
> +	PATH_DIR_WITH_SEP,
> +};
> +
> +static enum verify_path_result verify_path_internal(const char *, unsigned);
> +
> +int verify_path(const char *path, unsigned mode)
> +{
> +	return verify_path_internal(path, mode) == PATH_OK;
> +}
> +

I was confused at first as to why the verify_path() implementation is
not near the implementation of verify_path_internal(), but I see how
you use the _internal() version in make_cache_tree(), justifying this
position.

>  struct cache_entry *make_cache_entry(struct index_state *istate,
>  				     unsigned int mode,
>  				     const struct object_id *oid,
...
>  			/*
>  			 * allow terminating directory separators for
>  			 * sparse directory entries.
>  			 */
>  			if (c == '\0')
> -				return S_ISDIR(mode);
> +				return S_ISDIR(mode) ? PATH_DIR_WITH_SEP :
> +						       PATH_INVALID;

The rewrite of this method is mostly mechanical except for this one
bit that is solving the issue at hand.

> @@ -1349,7 +1364,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
> 
>  	if (!ok_to_add)
>  		return -1;
> -	if (!verify_path(ce->name, ce->ce_mode))
> +	if (verify_path_internal(ce->name, ce->ce_mode) == PATH_INVALID)
>  		return error(_("invalid path '%s'"), ce->name);

And yes, I believe that make_cache_entry() and add_index_entry_with_check()
are the only places that need this internal version. If we find others,
then we can add them as necessary. The tests in t1092 are getting rather
robust, although they don't do much for this test case:
> +test_expect_success 'stash -u ignores sub-repository' '
> +	test_when_finished "rm -rf sub-repo" &&
> +	git init sub-repo &&
> +	git stash -u
> +'

Seems like a good test to have, anyway.

I look forward to seeing this as a full patch.

Thanks,
-Stolee

  reply	other threads:[~2021-10-04 20:06 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-30  0:49 Bug/regression report - 'git stash push -u' fatal errors when sub-repo present Robert Leftwich
2021-09-30 10:06 ` René Scharfe
2021-09-30 16:35   ` Junio C Hamano
2021-10-01 14:25     ` René Scharfe
2021-10-04 20:06       ` Derrick Stolee [this message]
2021-10-04 20:52         ` Junio C Hamano
2021-10-04 22:29           ` Derrick Stolee
2021-10-07 20:31 ` [PATCH 1/3] t3905: show failure to ignore sub-repo René Scharfe
2021-10-07 20:31 ` [PATCH 2/3] read-cache: add verify_path_internal() René Scharfe
2021-10-07 20:31 ` [PATCH 3/3] read-cache: let verify_path() reject trailing dir separators again René Scharfe
2021-10-08  9:04   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6df361a5-8e15-63a7-dbb8-77405c6edf0e@gmail.com \
    --to=stolee@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=robert@gitpod.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).