git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/1] sha1-file: split OBJECT_INFO_FOR_PREFETCH
@ 2019-05-28 15:19 Derrick Stolee via GitGitGadget
  2019-05-28 15:19 ` [PATCH 1/1] " Derrick Stolee via GitGitGadget
  0 siblings, 1 reply; 5+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2019-05-28 15:19 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Junio C Hamano

I found this interaction while testing the VFS for Git fork rebasing onto
v2.22.0-rc1 [1]. It seems this new flag meant for partial clone prefetches
interacts poorly with the read-object hook we use in our fork.

The issue is that OBJECT_INFO_FOR_PREFETCH has multiple bits on, so testing
"flag & OBJECT_INFO_FOR_PREFETCH" can be true even if not all bits are on.
My fix simply splits the new bit into a special flag while keeping
OBJECT_INFO_FOR_PREFETCH as a union of flags.

Jury is out if this fixes our problem, but it definitely seems like a bug
waiting to happen in Git, too.

Thanks, -Stolee

[1] https://github.com/microsoft/git/pull/140

Derrick Stolee (1):
  sha1-file: split OBJECT_INFO_FOR_PREFETCH

 object-store.h | 10 +++++++---
 sha1-file.c    |  2 +-
 2 files changed, 8 insertions(+), 4 deletions(-)


base-commit: 0f4a4fb1c4239a2aa46343add84ad6f99f6f3aae
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-228%2Fderrickstolee%2Fobject-info-prefetch-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-228/derrickstolee/object-info-prefetch-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/228
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/1] sha1-file: split OBJECT_INFO_FOR_PREFETCH
  2019-05-28 15:19 [PATCH 0/1] sha1-file: split OBJECT_INFO_FOR_PREFETCH Derrick Stolee via GitGitGadget
@ 2019-05-28 15:19 ` Derrick Stolee via GitGitGadget
  2019-05-28 20:31   ` Junio C Hamano
  2019-05-28 20:54   ` Jeff King
  0 siblings, 2 replies; 5+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2019-05-28 15:19 UTC (permalink / raw)
  To: git; +Cc: jonathantanmy, Junio C Hamano, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The OBJECT_INFO_FOR_PREFETCH bitflag was added to sha1-file.c in 0f4a4fb1
(sha1-file: support OBJECT_INFO_FOR_PREFETCH, 2019-03-29) and is used to
prevent the fetch_objects() method when enabled.

However, there is a problem with the current use. The definition of
OBJECT_INFO_FOR_PREFETCH is given by adding 32 to OBJECT_INFO_QUICK. This is
clearly stated above the definition (in a comment) that this is so
OBJECT_INFO_FOR_PREFETCH implies OBJECT_INFO_QUICK. The problem is that using
"flag & OBJECT_INFO_FOR_PREFETCH" means that OBJECT_INFO_QUICK also implies
OBJECT_INFO_FOR_PREFETCH.

Split out the single bit from OBJECT_INFO_FOR_PREFETCH into a new
OBJECT_INFO_SKIP_FETCH_OBJECT as the single bit and keep
OBJECT_INFO_FOR_PREFETCH as the union of two flags. This allows a clearer use
of flag checking while also keeping the implication of OBJECT_INFO_QUICK.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 object-store.h | 10 +++++++---
 sha1-file.c    |  2 +-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/object-store.h b/object-store.h
index dd3f9b75f0..c90628d839 100644
--- a/object-store.h
+++ b/object-store.h
@@ -282,10 +282,14 @@ struct object_info {
 #define OBJECT_INFO_IGNORE_LOOSE 16
 /*
  * Do not attempt to fetch the object if missing (even if fetch_is_missing is
- * nonzero). This is meant for bulk prefetching of missing blobs in a partial
- * clone. Implies OBJECT_INFO_QUICK.
+ * nonzero).
  */
-#define OBJECT_INFO_FOR_PREFETCH (32 + OBJECT_INFO_QUICK)
+#define OBJECT_INFO_SKIP_FETCH_OBJECT 32
+/*
+ * This is meant for bulk prefetching of missing blobs in a partial
+ * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK
+ */
+#define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
 
 int oid_object_info_extended(struct repository *r,
 			     const struct object_id *,
diff --git a/sha1-file.c b/sha1-file.c
index ad02649124..0299fdd516 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1371,7 +1371,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 		/* Check if it is a missing object */
 		if (fetch_if_missing && repository_format_partial_clone &&
 		    !already_retried && r == the_repository &&
-		    !(flags & OBJECT_INFO_FOR_PREFETCH)) {
+		    !(flags & OBJECT_INFO_SKIP_FETCH_OBJECT)) {
 			/*
 			 * TODO Investigate having fetch_object() return
 			 * TODO error/success and stopping the music here.
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] sha1-file: split OBJECT_INFO_FOR_PREFETCH
  2019-05-28 15:19 ` [PATCH 1/1] " Derrick Stolee via GitGitGadget
@ 2019-05-28 20:31   ` Junio C Hamano
  2019-05-28 20:54   ` Jeff King
  1 sibling, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2019-05-28 20:31 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, jonathantanmy, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> The OBJECT_INFO_FOR_PREFETCH bitflag was added to sha1-file.c in 0f4a4fb1
> (sha1-file: support OBJECT_INFO_FOR_PREFETCH, 2019-03-29) and is used to
> prevent the fetch_objects() method when enabled.
>
> However, there is a problem with the current use. The definition of
> OBJECT_INFO_FOR_PREFETCH is given by adding 32 to OBJECT_INFO_QUICK. This is
> clearly stated above the definition (in a comment) that this is so
> OBJECT_INFO_FOR_PREFETCH implies OBJECT_INFO_QUICK. The problem is that using
> "flag & OBJECT_INFO_FOR_PREFETCH" means that OBJECT_INFO_QUICK also implies
> OBJECT_INFO_FOR_PREFETCH.

So the right test to see prefetch is in effect is not

	if (!!(flag & TWO_BITS))

but to use

	if ((flag & TWO_BITS) == TWO_BITS)

instead?

> Split out the single bit from OBJECT_INFO_FOR_PREFETCH into a new
> OBJECT_INFO_SKIP_FETCH_OBJECT as the single bit and keep
> OBJECT_INFO_FOR_PREFETCH as the union of two flags. This allows a clearer use
> of flag checking while also keeping the implication of OBJECT_INFO_QUICK.

OK, I guess that would work and with far less damage to the existing
code.

>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  object-store.h | 10 +++++++---
>  sha1-file.c    |  2 +-
>  2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/object-store.h b/object-store.h
> index dd3f9b75f0..c90628d839 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -282,10 +282,14 @@ struct object_info {
>  #define OBJECT_INFO_IGNORE_LOOSE 16
>  /*
>   * Do not attempt to fetch the object if missing (even if fetch_is_missing is
> - * nonzero). This is meant for bulk prefetching of missing blobs in a partial
> - * clone. Implies OBJECT_INFO_QUICK.
> + * nonzero).
>   */
> -#define OBJECT_INFO_FOR_PREFETCH (32 + OBJECT_INFO_QUICK)
> +#define OBJECT_INFO_SKIP_FETCH_OBJECT 32
> +/*
> + * This is meant for bulk prefetching of missing blobs in a partial
> + * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK
> + */
> +#define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
>  
>  int oid_object_info_extended(struct repository *r,
>  			     const struct object_id *,
> diff --git a/sha1-file.c b/sha1-file.c
> index ad02649124..0299fdd516 100644
> --- a/sha1-file.c
> +++ b/sha1-file.c
> @@ -1371,7 +1371,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
>  		/* Check if it is a missing object */
>  		if (fetch_if_missing && repository_format_partial_clone &&
>  		    !already_retried && r == the_repository &&
> -		    !(flags & OBJECT_INFO_FOR_PREFETCH)) {
> +		    !(flags & OBJECT_INFO_SKIP_FETCH_OBJECT)) {
>  			/*
>  			 * TODO Investigate having fetch_object() return
>  			 * TODO error/success and stopping the music here.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] sha1-file: split OBJECT_INFO_FOR_PREFETCH
  2019-05-28 15:19 ` [PATCH 1/1] " Derrick Stolee via GitGitGadget
  2019-05-28 20:31   ` Junio C Hamano
@ 2019-05-28 20:54   ` Jeff King
  2019-05-29  0:29     ` Derrick Stolee
  1 sibling, 1 reply; 5+ messages in thread
From: Jeff King @ 2019-05-28 20:54 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, jonathantanmy, Junio C Hamano, Derrick Stolee

On Tue, May 28, 2019 at 08:19:07AM -0700, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The OBJECT_INFO_FOR_PREFETCH bitflag was added to sha1-file.c in 0f4a4fb1
> (sha1-file: support OBJECT_INFO_FOR_PREFETCH, 2019-03-29) and is used to
> prevent the fetch_objects() method when enabled.
> 
> However, there is a problem with the current use. The definition of
> OBJECT_INFO_FOR_PREFETCH is given by adding 32 to OBJECT_INFO_QUICK. This is
> clearly stated above the definition (in a comment) that this is so
> OBJECT_INFO_FOR_PREFETCH implies OBJECT_INFO_QUICK. The problem is that using
> "flag & OBJECT_INFO_FOR_PREFETCH" means that OBJECT_INFO_QUICK also implies
> OBJECT_INFO_FOR_PREFETCH.
> 
> Split out the single bit from OBJECT_INFO_FOR_PREFETCH into a new
> OBJECT_INFO_SKIP_FETCH_OBJECT as the single bit and keep
> OBJECT_INFO_FOR_PREFETCH as the union of two flags. This allows a clearer use
> of flag checking while also keeping the implication of OBJECT_INFO_QUICK.

Oof. I actually suggested splitting these up for review, but thought it
was only a clarity/flexibility issue, and completely missed the
correctness aspect of checking when the bit is set.

I agree with Junio's other response that using "==" would be the right
way for a multi-bit check, in general. But I like the split here,
because I think the result is more clear to read and harder to get
wrong for future checks.

I'd even go so far as to say...

> + * This is meant for bulk prefetching of missing blobs in a partial
> + * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK
> + */
> +#define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)

we could dump this, and callers should just say what they mean (i.e.,
specify both flags).

There are only two of them, and I think both would be more readable with
a helper more like:

  int should_prefetch_object(struct repository *r,
                             const struct object_id *oid) {
	return !oid_object_info_extended(r, oid, NULL,
	                                 OBJECT_INFO_SKIP_FETCH_OBJECT |
					 OBJECT_INFO_QUICK);
  }

but unless everybody is immediately on-board with "yes, that is much
nicer", I don't want bikeshedding to hold up your important and
obviously-correct fix.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/1] sha1-file: split OBJECT_INFO_FOR_PREFETCH
  2019-05-28 20:54   ` Jeff King
@ 2019-05-29  0:29     ` Derrick Stolee
  0 siblings, 0 replies; 5+ messages in thread
From: Derrick Stolee @ 2019-05-29  0:29 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, jonathantanmy, Junio C Hamano, Derrick Stolee

On 5/28/2019 4:54 PM, Jeff King wrote:
> On Tue, May 28, 2019 at 08:19:07AM -0700, Derrick Stolee via GitGitGadget wrote:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> The OBJECT_INFO_FOR_PREFETCH bitflag was added to sha1-file.c in 0f4a4fb1
>> (sha1-file: support OBJECT_INFO_FOR_PREFETCH, 2019-03-29) and is used to
>> prevent the fetch_objects() method when enabled.
>>
>> However, there is a problem with the current use. The definition of
>> OBJECT_INFO_FOR_PREFETCH is given by adding 32 to OBJECT_INFO_QUICK. This is
>> clearly stated above the definition (in a comment) that this is so
>> OBJECT_INFO_FOR_PREFETCH implies OBJECT_INFO_QUICK. The problem is that using
>> "flag & OBJECT_INFO_FOR_PREFETCH" means that OBJECT_INFO_QUICK also implies
>> OBJECT_INFO_FOR_PREFETCH.
>>
>> Split out the single bit from OBJECT_INFO_FOR_PREFETCH into a new
>> OBJECT_INFO_SKIP_FETCH_OBJECT as the single bit and keep
>> OBJECT_INFO_FOR_PREFETCH as the union of two flags. This allows a clearer use
>> of flag checking while also keeping the implication of OBJECT_INFO_QUICK.
> 
> Oof. I actually suggested splitting these up for review, but thought it
> was only a clarity/flexibility issue, and completely missed the
> correctness aspect of checking when the bit is set.
> 
> I agree with Junio's other response that using "==" would be the right
> way for a multi-bit check, in general. But I like the split here,
> because I think the result is more clear to read and harder to get
> wrong for future checks.

Thanks, for the feedback, both of you.

> I'd even go so far as to say...
> 
>> + * This is meant for bulk prefetching of missing blobs in a partial
>> + * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK
>> + */
>> +#define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
> 
> we could dump this, and callers should just say what they mean (i.e.,
> specify both flags).

Dropping the _PREFETCH flag also makes oid_object_info_extended() slightly
less "coupled" to the prefetch feature, and instead describes more explicitly
the way the flag is changing the behavior of the method.
 
> There are only two of them, and I think both would be more readable with
> a helper more like:
> 
>   int should_prefetch_object(struct repository *r,
>                              const struct object_id *oid) {
> 	return !oid_object_info_extended(r, oid, NULL,
> 	                                 OBJECT_INFO_SKIP_FETCH_OBJECT |
> 					 OBJECT_INFO_QUICK);
>   }
> 
> but unless everybody is immediately on-board with "yes, that is much
> nicer", I don't want bikeshedding to hold up your important and
> obviously-correct fix.

I'll come back with another series to drop the _PREFETCH flag after the
release calms down. It can give more time for others to chime in here.

Thanks, Junio for the quick turnaround in taking the patch.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-05-29  0:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-28 15:19 [PATCH 0/1] sha1-file: split OBJECT_INFO_FOR_PREFETCH Derrick Stolee via GitGitGadget
2019-05-28 15:19 ` [PATCH 1/1] " Derrick Stolee via GitGitGadget
2019-05-28 20:31   ` Junio C Hamano
2019-05-28 20:54   ` Jeff King
2019-05-29  0:29     ` Derrick Stolee

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).