git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* t9210-scalar.sh fails with SANITIZE=undefined
@ 2022-09-22 10:04 Jeff King
  2022-09-22 12:35 ` Derrick Stolee
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2022-09-22 10:04 UTC (permalink / raw)
  To: git; +Cc: Victoria Dye

Running t9210 with the tip of master triggers a problem with UBSan:

  $ make SANITIZE=undefined
  [...]
  $ cd t
  $ ./t9210-scalar.sh -v -i
  [...]
  read-cache.c:1886:46: runtime error: member access within misaligned address 0x7f7c09bf7055 for type 'struct ondisk_cache_entry', which requires 4 byte alignment
  0x7f7c09bf7055: note: pointer points here
  33 2e 74 00 63 2c 31  42 17 3f 49 72 63 2c 31  42 17 3f 49 72 00 00 fe  01 02 60 06 4d 00 00 81  a4
              ^

Now here's the interesting part. We do carefully read most of the data
out of the struct with get_be16(), which should handle alignment (we
have to do so because that on_disk_cache_entry is literally just a cast
from an mmap'd buffer). But the line in question is just:

  const uint16_t *flagsp = (const uint16_t *)(ondisk->data + hashsz);

It's not even reading anything, but just computing an offset within the
struct. I don't think that line in particular is to blame. If I use an
older version of Git that predates it on the same repo generated by
t9210, I get a similar error for a different line.

Another thing to note is that the command which fails isn't scalar
itself! It's just vanilla "git add -- loose.t". But it's curious that we
never saw this alignment problem before. I wonder if the scalar-cloned
repository has some index options turned on that trigger the issue.

I didn't dig further. It's obviously new in v2.38.0-rc1, but I'm not
sure it's a show-stopper. It _might_ have been there all along, and is
just now surfacing. Or it might be in an existing experimental feature
that just wasn't exercised enough in the tests. Or if it really is new
in scalar, then it will only hurt people using scalar, which didn't
exist before. So I don't think it's a regression in the strictest sense,
but it might be nice to get a more accurate understanding of the problem
before the release.

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t9210-scalar.sh fails with SANITIZE=undefined
  2022-09-22 10:04 t9210-scalar.sh fails with SANITIZE=undefined Jeff King
@ 2022-09-22 12:35 ` Derrick Stolee
  2022-09-22 19:12   ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Derrick Stolee @ 2022-09-22 12:35 UTC (permalink / raw)
  To: Jeff King, git; +Cc: Victoria Dye

On 9/22/2022 6:04 AM, Jeff King wrote:
> Running t9210 with the tip of master triggers a problem with UBSan:
> 
>   $ make SANITIZE=undefined
>   [...]
>   $ cd t
>   $ ./t9210-scalar.sh -v -i
>   [...]
>   read-cache.c:1886:46: runtime error: member access within misaligned address 0x7f7c09bf7055 for type 'struct ondisk_cache_entry', which requires 4 byte alignment
>   0x7f7c09bf7055: note: pointer points here
>   33 2e 74 00 63 2c 31  42 17 3f 49 72 63 2c 31  42 17 3f 49 72 00 00 fe  01 02 60 06 4d 00 00 81  a4
>               ^
> 
> Now here's the interesting part. We do carefully read most of the data
> out of the struct with get_be16(), which should handle alignment (we
> have to do so because that on_disk_cache_entry is literally just a cast
> from an mmap'd buffer). But the line in question is just:
> 
>   const uint16_t *flagsp = (const uint16_t *)(ondisk->data + hashsz);
> 
> It's not even reading anything, but just computing an offset within the
> struct. I don't think that line in particular is to blame. If I use an
> older version of Git that predates it on the same repo generated by
> t9210, I get a similar error for a different line.
> 
> Another thing to note is that the command which fails isn't scalar
> itself! It's just vanilla "git add -- loose.t". But it's curious that we
> never saw this alignment problem before. I wonder if the scalar-cloned
> repository has some index options turned on that trigger the issue.
> 
> I didn't dig further. It's obviously new in v2.38.0-rc1, but I'm not
> sure it's a show-stopper. It _might_ have been there all along, and is
> just now surfacing. Or it might be in an existing experimental feature
> that just wasn't exercised enough in the tests. Or if it really is new
> in scalar, then it will only hurt people using scalar, which didn't
> exist before. So I don't think it's a regression in the strictest sense,
> but it might be nice to get a more accurate understanding of the problem
> before the release.

Interesting find!

Here are the index-related settings that Scalar sets as of -rc1:

* core.preloadIndex=true
* index.threads=true
* index.version=4

My gut feeling is that index.version=4 might be the culprit. I thought
we tested GIT_TEST_INDEX_VERSION=4 in some CI modes, but apparently we
do not. Do you get the same error in other tests with that environment
variable?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t9210-scalar.sh fails with SANITIZE=undefined
  2022-09-22 12:35 ` Derrick Stolee
@ 2022-09-22 19:12   ` Jeff King
  2022-09-22 22:09     ` Victoria Dye
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2022-09-22 19:12 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, Victoria Dye

On Thu, Sep 22, 2022 at 08:35:22AM -0400, Derrick Stolee wrote:

> > I didn't dig further. It's obviously new in v2.38.0-rc1, but I'm not
> > sure it's a show-stopper. It _might_ have been there all along, and is
> > just now surfacing. Or it might be in an existing experimental feature
> > that just wasn't exercised enough in the tests. Or if it really is new
> > in scalar, then it will only hurt people using scalar, which didn't
> > exist before. So I don't think it's a regression in the strictest sense,
> > but it might be nice to get a more accurate understanding of the problem
> > before the release.
> 
> Interesting find!
> 
> Here are the index-related settings that Scalar sets as of -rc1:
> 
> * core.preloadIndex=true
> * index.threads=true
> * index.version=4
> 
> My gut feeling is that index.version=4 might be the culprit. I thought
> we tested GIT_TEST_INDEX_VERSION=4 in some CI modes, but apparently we
> do not. Do you get the same error in other tests with that environment
> variable?

Yeah, that seems by far the most likely of those three. And indeed,
running with GIT_TEST_INDEX_VERSION=4 causes even t0000 to fail with the
same problem. A minimal reproduction in git.git is just:

  make SANITIZE=undefined
  git clone . tmp
  cd tmp
  rm .git/index
  export GIT_INDEX_VERSION=4
  ../git reset --hard ;# ok, writes v4 index
  ../git reset --hard ;# fails reading unaligned v4 index

So it seems like a problem with the v4 format in general. Which...yikes.

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t9210-scalar.sh fails with SANITIZE=undefined
  2022-09-22 19:12   ` Jeff King
@ 2022-09-22 22:09     ` Victoria Dye
  2022-09-22 22:27       ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Victoria Dye @ 2022-09-22 22:09 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee; +Cc: git

Jeff King wrote:
> On Thu, Sep 22, 2022 at 08:35:22AM -0400, Derrick Stolee wrote:
> 
>>> I didn't dig further. It's obviously new in v2.38.0-rc1, but I'm not
>>> sure it's a show-stopper. It _might_ have been there all along, and is
>>> just now surfacing. Or it might be in an existing experimental feature
>>> that just wasn't exercised enough in the tests. Or if it really is new
>>> in scalar, then it will only hurt people using scalar, which didn't
>>> exist before. So I don't think it's a regression in the strictest sense,
>>> but it might be nice to get a more accurate understanding of the problem
>>> before the release.
>>
>> Interesting find!
>>
>> Here are the index-related settings that Scalar sets as of -rc1:
>>
>> * core.preloadIndex=true
>> * index.threads=true
>> * index.version=4
>>
>> My gut feeling is that index.version=4 might be the culprit. I thought
>> we tested GIT_TEST_INDEX_VERSION=4 in some CI modes, but apparently we
>> do not. Do you get the same error in other tests with that environment
>> variable?
> 
> Yeah, that seems by far the most likely of those three. And indeed,
> running with GIT_TEST_INDEX_VERSION=4 causes even t0000 to fail with the
> same problem. A minimal reproduction in git.git is just:
> 
>   make SANITIZE=undefined
>   git clone . tmp
>   cd tmp
>   rm .git/index
>   export GIT_INDEX_VERSION=4
>   ../git reset --hard ;# ok, writes v4 index
>   ../git reset --hard ;# fails reading unaligned v4 index
> 
> So it seems like a problem with the v4 format in general. Which...yikes.

Other than allowing us to use a (non-packed) 'struct ondisk_cache_entry' to
parse the index entries, is there any reason why the on-disk index entries
should (or need to be) 4-byte aligned? If not, we could update how we read
the 'ondisk' index entry in 'create_from_disk()' to avoid the misalignment.
E.g.:

------------------8<------------------8<------------------8<------------------
diff --git a/read-cache.c b/read-cache.c
index b09128b188..f132a3f256 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1875,7 +1875,7 @@ static int read_index_extension(struct index_state *istate,
 
 static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 					    unsigned int version,
-					    struct ondisk_cache_entry *ondisk,
+					    const char *ondisk,
 					    unsigned long *ent_size,
 					    const struct cache_entry *previous_ce)
 {
@@ -1883,7 +1883,7 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 	size_t len;
 	const char *name;
 	const unsigned hashsz = the_hash_algo->rawsz;
-	const uint16_t *flagsp = (const uint16_t *)(ondisk->data + hashsz);
+	const char *flagsp = ondisk + offsetof(struct ondisk_cache_entry, data) + hashsz;
 	unsigned int flags;
 	size_t copy_len = 0;
 	/*
------------------>8------------------>8------------------>8------------------

the do the same sort of conversion with the rest of the function. It's
certainly uglier than just using the 'struct ondisk_index_entry *' pointer,
but it should avoid the misaligned addressing error.

> 
> -Peff


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: t9210-scalar.sh fails with SANITIZE=undefined
  2022-09-22 22:09     ` Victoria Dye
@ 2022-09-22 22:27       ` Jeff King
  2022-09-22 22:56         ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff King @ 2022-09-22 22:27 UTC (permalink / raw)
  To: Victoria Dye; +Cc: Derrick Stolee, git

On Thu, Sep 22, 2022 at 03:09:52PM -0700, Victoria Dye wrote:

> Other than allowing us to use a (non-packed) 'struct ondisk_cache_entry' to
> parse the index entries, is there any reason why the on-disk index entries
> should (or need to be) 4-byte aligned? If not, we could update how we read
> the 'ondisk' index entry in 'create_from_disk()' to avoid the misalignment.

I don't think so. And indeed, we already use get_be16(), etc, for most
of the access (which is mostly there for endian-fixing, but also
resolves alignment problems).

> ------------------8<------------------8<------------------8<------------------
> diff --git a/read-cache.c b/read-cache.c
> index b09128b188..f132a3f256 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1875,7 +1875,7 @@ static int read_index_extension(struct index_state *istate,
>  
>  static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
>  					    unsigned int version,
> -					    struct ondisk_cache_entry *ondisk,
> +					    const char *ondisk,
>  					    unsigned long *ent_size,
>  					    const struct cache_entry *previous_ce)
>  {
> @@ -1883,7 +1883,7 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
>  	size_t len;
>  	const char *name;
>  	const unsigned hashsz = the_hash_algo->rawsz;
> -	const uint16_t *flagsp = (const uint16_t *)(ondisk->data + hashsz);
> +	const char *flagsp = ondisk + offsetof(struct ondisk_cache_entry, data) + hashsz;
>  	unsigned int flags;
>  	size_t copy_len = 0;
>  	/*
> ------------------>8------------------>8------------------>8------------------
> 
> the do the same sort of conversion with the rest of the function. It's
> certainly uglier than just using the 'struct ondisk_index_entry *' pointer,
> but it should avoid the misaligned addressing error.

Yeah, I think that's probably the only reasonable solution. I thought
ditching ondisk_cache_entry entirely (which is basically what this is
doing) would be a tough sell, but a quick "grep" shows it really isn't
used in all that many spots.

I also wondered why other versions do not have a similar problem. After
all, cache entries contain pathnames which are going to be of varying
lengths. But this seems telling:

  $ git grep -m1 -B1 -A2 align_padding_size
  read-cache.c-/* These are only used for v3 or lower */
  read-cache.c:#define align_padding_size(size, len) ((size + (len) + 8) & ~7) - (size + len)
  read-cache.c-#define align_flex_name(STRUCT,len) ((offsetof(struct STRUCT,data) + (len) + 8) & ~7)
  read-cache.c-#define ondisk_cache_entry_size(len) align_flex_name(ondisk_cache_entry,len)

So we actually pad the entries in earlier versions to align them, but
don't in v4. I'm not sure if that was a conscious choice to save space,
or an unintended consequence (though it is mentioned in the docs, I
think that came after the code).

That's probably all obvious to people who work with the index a lot.
It's the one part of Git I've mostly managed to remain oblivious to. :)

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: t9210-scalar.sh fails with SANITIZE=undefined
  2022-09-22 22:27       ` Jeff King
@ 2022-09-22 22:56         ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2022-09-22 22:56 UTC (permalink / raw)
  To: Jeff King; +Cc: Victoria Dye, Derrick Stolee, git

Jeff King <peff@peff.net> writes:

> I also wondered why other versions do not have a similar problem. After
> all, cache entries contain pathnames which are going to be of varying
> lengths. But this seems telling:
>
>   $ git grep -m1 -B1 -A2 align_padding_size
>   read-cache.c-/* These are only used for v3 or lower */
>   read-cache.c:#define align_padding_size(size, len) ((size + (len) + 8) & ~7) - (size + len)
>   read-cache.c-#define align_flex_name(STRUCT,len) ((offsetof(struct STRUCT,data) + (len) + 8) & ~7)
>   read-cache.c-#define ondisk_cache_entry_size(len) align_flex_name(ondisk_cache_entry,len)
>
> So we actually pad the entries in earlier versions to align them, but
> don't in v4. I'm not sure if that was a conscious choice to save space,
> or an unintended consequence (though it is mentioned in the docs, I
> think that came after the code).

I think we didn't even have on-disk vs in-core distinction in the
early index code.  The active_cache[] array used to be an array of
pointers into the (read-only) mmapped memory, peeking into on-disk
index we just "read".  Back when v4 was introduced, that arrangement
was (thought to be) long gone---we iterated over the mmapped memory
and used create_from_disk() to munge the on-disk representation into
a machine native form.  At that point, there was no point in having
the padding---we are supposed to be using get_beXX() and stuff
without having to worry about alignment requirements.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-09-22 22:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-22 10:04 t9210-scalar.sh fails with SANITIZE=undefined Jeff King
2022-09-22 12:35 ` Derrick Stolee
2022-09-22 19:12   ` Jeff King
2022-09-22 22:09     ` Victoria Dye
2022-09-22 22:27       ` Jeff King
2022-09-22 22:56         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).