git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] reftable: avoid undefined behaviour breaking t0032
@ 2022-04-15  7:02 Carlo Marcelo Arenas Belón
  2022-04-15  7:10 ` Junio C Hamano
  2022-04-15  8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón
  0 siblings, 2 replies; 16+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2022-04-15  7:02 UTC (permalink / raw)
  To: git; +Cc: hanwen, Carlo Marcelo Arenas Belón

At least in glibc based systems, memset with a NULL first parameter
will cause a runtime exception.

Avoid doing so by adding a conditional to check for NULL in all three
identically looking functions that were affected.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
Bug was introduced with the original code in 1214aa841bc (reftable: add
blocksource, an abstraction for random access reads, 2021-10-07), so not
to be considered a regression for this release.

 reftable/blocksource.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index 0044eecd9aa..984bf07fc17 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -15,7 +15,8 @@ license that can be found in the LICENSE file or at
 
 static void strbuf_return_block(void *b, struct reftable_block *dest)
 {
-	memset(dest->data, 0xff, dest->len);
+	if (dest->data)
+		memset(dest->data, 0xff, dest->len);
 	reftable_free(dest->data);
 }
 
@@ -56,7 +57,8 @@ void block_source_from_strbuf(struct reftable_block_source *bs,
 
 static void malloc_return_block(void *b, struct reftable_block *dest)
 {
-	memset(dest->data, 0xff, dest->len);
+	if (dest->data)
+		memset(dest->data, 0xff, dest->len);
 	reftable_free(dest->data);
 }
 
@@ -85,7 +87,8 @@ static uint64_t file_size(void *b)
 
 static void file_return_block(void *b, struct reftable_block *dest)
 {
-	memset(dest->data, 0xff, dest->len);
+	if (dest->data)
+		memset(dest->data, 0xff, dest->len);
 	reftable_free(dest->data);
 }
 
-- 
2.36.0.rc2.283.gbef64175c85


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] reftable: avoid undefined behaviour breaking t0032
  2022-04-15  7:02 [PATCH] reftable: avoid undefined behaviour breaking t0032 Carlo Marcelo Arenas Belón
@ 2022-04-15  7:10 ` Junio C Hamano
  2022-04-15  8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón
  1 sibling, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2022-04-15  7:10 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, hanwen

Carlo Marcelo Arenas Belón  <carenas@gmail.com> writes:

> At least in glibc based systems, memset with a NULL first parameter
> will cause a runtime exception.

I take it to mean that the code assumes that it is OK to pass NULL
as long as length is 0 (i.e. filling the range of memory whose size
is 0 with the specified byte can happen safely no matter what the
starting address of that range is, as size==0 by definition should
mean a no-op).  That would mean we have a rule on how members of
dest must be set: .data is allowed to be NULL only when .len is 0.

If so, I wonder if we want to guard with dest->len instead, i.e.

	if (dest->len)
		memset(dest->data, 0xff, dest->len);

With the form in this patch, i.e.

> -	memset(dest->data, 0xff, dest->len);
> +	if (dest->data)
> +		memset(dest->data, 0xff, dest->len);

we will fail to catch a bogus caller that violates the rule above
that we have on <data, len>.  But if we guard with dest->len, then
a violator of <data, len> rule will be caught by memset().

Thanks.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2] reftable: avoid undefined behaviour breaking t0032
  2022-04-15  7:02 [PATCH] reftable: avoid undefined behaviour breaking t0032 Carlo Marcelo Arenas Belón
  2022-04-15  7:10 ` Junio C Hamano
@ 2022-04-15  8:30 ` Carlo Marcelo Arenas Belón
  2022-04-15 10:21   ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason
  2022-04-25 10:18   ` [PATCH v2] reftable: avoid undefined behaviour breaking t0032 Han-Wen Nienhuys
  1 sibling, 2 replies; 16+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2022-04-15  8:30 UTC (permalink / raw)
  To: git; +Cc: hanwen, Carlo Marcelo Arenas Belón

1214aa841bc (reftable: add blocksource, an abstraction for random
access reads, 2021-10-07), makes the assumption that it is ok to
free a reftable_block pointing to NULL if the size is also set to
0, but implements that using a memset call that at least in glibc
based system will trigger a runtime exception if called with a
NULL pointer as its first parameter.

Avoid doing so by adding a conditional to check for the size in all
three identically looking functions that were affected, and therefore,
still allow memset to help catch callers that might incorrectly pass
a NULL pointer with a non zero size, but avoiding the exception for
the valid cases.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
Changes since v1:
- Improved logic as suggested by Junio
- Hopefully also improved commit message

 reftable/blocksource.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index 0044eecd9aa..db1b7dc966f 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -15,7 +15,8 @@ license that can be found in the LICENSE file or at
 
 static void strbuf_return_block(void *b, struct reftable_block *dest)
 {
-	memset(dest->data, 0xff, dest->len);
+	if (dest->len)
+		memset(dest->data, 0xff, dest->len);
 	reftable_free(dest->data);
 }
 
@@ -56,7 +57,8 @@ void block_source_from_strbuf(struct reftable_block_source *bs,
 
 static void malloc_return_block(void *b, struct reftable_block *dest)
 {
-	memset(dest->data, 0xff, dest->len);
+	if (dest->len)
+		memset(dest->data, 0xff, dest->len);
 	reftable_free(dest->data);
 }
 
@@ -85,7 +87,8 @@ static uint64_t file_size(void *b)
 
 static void file_return_block(void *b, struct reftable_block *dest)
 {
-	memset(dest->data, 0xff, dest->len);
+	if (dest->len)
+		memset(dest->data, 0xff, dest->len);
 	reftable_free(dest->data);
 }
 
-- 
2.36.0.rc2.283.gbef64175c85


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug
  2022-04-15  8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón
@ 2022-04-15 10:21   ` Ævar Arnfjörð Bjarmason
  2022-04-15 10:21     ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason
  2022-04-15 10:21     ` [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason
  2022-04-25 10:18   ` [PATCH v2] reftable: avoid undefined behaviour breaking t0032 Han-Wen Nienhuys
  1 sibling, 2 replies; 16+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-15 10:21 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys,
	Ævar Arnfjörð Bjarmason

On Fri, Apr 15 2022, Carlo Marcelo Arenas Belón wrote:

> 1214aa841bc (reftable: add blocksource, an abstraction for random
> access reads, 2021-10-07), makes the assumption that it is ok to
> free a reftable_block pointing to NULL if the size is also set to
> 0, but implements that using a memset call that at least in glibc
> based system will trigger a runtime exception if called with a
> NULL pointer as its first parameter.

FWIW I've been carrying 1/2 here for a while in my local tree,
i.e. reftable/* has various abstractions and indirections that aren't
really needed. In this case we can just get rid of that & free them,
so the memset()s you fixed can just be removed.

The 2/2 is then another memset() issue I spotted when looking at this
again, -fanalyzer notes the bug related to it.

Ævar Arnfjörð Bjarmason (2):
  reftable: remove the "return_block" abstraction
  reftable: don't memset() a NULL from failed malloc()

 reftable/block.c                |  4 +---
 reftable/blocksource.c          | 28 +---------------------------
 reftable/publicbasics.c         |  2 ++
 reftable/reftable-blocksource.h |  2 --
 4 files changed, 4 insertions(+), 32 deletions(-)

-- 
2.36.0.rc2.863.gfc2c14e3b91


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH 1/2] reftable: remove the "return_block" abstraction
  2022-04-15 10:21   ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason
@ 2022-04-15 10:21     ` Ævar Arnfjörð Bjarmason
  2022-04-15 13:37       ` René Scharfe
  2022-04-25  9:57       ` Han-Wen Nienhuys
  2022-04-15 10:21     ` [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 16+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-15 10:21 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys,
	Ævar Arnfjörð Bjarmason

This abstraction added in 1214aa841bc (reftable: add blocksource, an
abstraction for random access reads, 2021-10-07) has the caller
provide a "blockp->data", so there's not point in having the vtable
have a custom free() function.

In addition this had what looked like a poor man's SANITIZE=address
doing a memset() to 0xff just before the data was free'd.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 reftable/block.c                |  4 +---
 reftable/blocksource.c          | 28 +---------------------------
 reftable/reftable-blocksource.h |  2 --
 3 files changed, 2 insertions(+), 32 deletions(-)

diff --git a/reftable/block.c b/reftable/block.c
index 34d4d073692..bb17cc32372 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -442,9 +442,7 @@ void block_writer_release(struct block_writer *bw)
 
 void reftable_block_done(struct reftable_block *blockp)
 {
-	struct reftable_block_source source = blockp->source;
-	if (blockp && source.ops)
-		source.ops->return_block(source.arg, blockp);
+	FREE_AND_NULL(blockp->data);
 	blockp->data = NULL;
 	blockp->len = 0;
 	blockp->source.ops = NULL;
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index 2605371c28d..d9e47cc316b 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -13,12 +13,6 @@ license that can be found in the LICENSE file or at
 #include "reftable-blocksource.h"
 #include "reftable-error.h"
 
-static void strbuf_return_block(void *b, struct reftable_block *dest)
-{
-	memset(dest->data, 0xff, dest->len);
-	reftable_free(dest->data);
-}
-
 static void strbuf_close(void *b)
 {
 }
@@ -42,7 +36,6 @@ static uint64_t strbuf_size(void *b)
 static struct reftable_block_source_vtable strbuf_vtable = {
 	.size = &strbuf_size,
 	.read_block = &strbuf_read_block,
-	.return_block = &strbuf_return_block,
 	.close = &strbuf_close,
 };
 
@@ -54,19 +47,7 @@ void block_source_from_strbuf(struct reftable_block_source *bs,
 	bs->arg = buf;
 }
 
-static void malloc_return_block(void *b, struct reftable_block *dest)
-{
-	memset(dest->data, 0xff, dest->len);
-	reftable_free(dest->data);
-}
-
-static struct reftable_block_source_vtable malloc_vtable = {
-	.return_block = &malloc_return_block,
-};
-
-static struct reftable_block_source malloc_block_source_instance = {
-	.ops = &malloc_vtable,
-};
+static struct reftable_block_source malloc_block_source_instance = { 0 };
 
 struct reftable_block_source malloc_block_source(void)
 {
@@ -83,12 +64,6 @@ static uint64_t file_size(void *b)
 	return ((struct file_block_source *)b)->size;
 }
 
-static void file_return_block(void *b, struct reftable_block *dest)
-{
-	memset(dest->data, 0xff, dest->len);
-	reftable_free(dest->data);
-}
-
 static void file_close(void *b)
 {
 	int fd = ((struct file_block_source *)b)->fd;
@@ -115,7 +90,6 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
 static struct reftable_block_source_vtable file_vtable = {
 	.size = &file_size,
 	.read_block = &file_read_block,
-	.return_block = &file_return_block,
 	.close = &file_close,
 };
 
diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h
index 5aa3990a573..7b7cb280b73 100644
--- a/reftable/reftable-blocksource.h
+++ b/reftable/reftable-blocksource.h
@@ -35,8 +35,6 @@ struct reftable_block_source_vtable {
 	   beyond the end of the block */
 	int (*read_block)(void *source, struct reftable_block *dest,
 			  uint64_t off, uint32_t size);
-	/* mark the block as read; may return the data back to malloc */
-	void (*return_block)(void *source, struct reftable_block *blockp);
 
 	/* release all resources associated with the block source */
 	void (*close)(void *source);
-- 
2.36.0.rc2.863.gfc2c14e3b91


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc()
  2022-04-15 10:21   ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason
  2022-04-15 10:21     ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason
@ 2022-04-15 10:21     ` Ævar Arnfjörð Bjarmason
  2022-04-15 13:37       ` René Scharfe
  1 sibling, 1 reply; 16+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-15 10:21 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys,
	Ævar Arnfjörð Bjarmason

Return NULL instead of possibly feeding a NULL to memset() in
reftable_calloc(). This issue was noted by GCC 12's -fanalyzer:

	reftable/publicbasics.c: In function ‘reftable_calloc’:
	reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument]
	   43 |         memset(p, 0, sz);
	      |         ^~~~~~~~~~~~~~~~
	[...]

This bug has been with us ever since this code was added in
ef8a6c62687 (reftable: utility functions, 2021-10-07).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 reftable/publicbasics.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
index 0ad7d5c0ff2..a18167f5ab7 100644
--- a/reftable/publicbasics.c
+++ b/reftable/publicbasics.c
@@ -40,6 +40,8 @@ void reftable_free(void *p)
 void *reftable_calloc(size_t sz)
 {
 	void *p = reftable_malloc(sz);
+	if (!p)
+		return NULL;
 	memset(p, 0, sz);
 	return p;
 }
-- 
2.36.0.rc2.863.gfc2c14e3b91


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc()
  2022-04-15 10:21     ` [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason
@ 2022-04-15 13:37       ` René Scharfe
  2022-04-15 13:53         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 16+ messages in thread
From: René Scharfe @ 2022-04-15 13:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys

Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason:
> Return NULL instead of possibly feeding a NULL to memset() in
> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer:
>
> 	reftable/publicbasics.c: In function ‘reftable_calloc’:
> 	reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument]
> 	   43 |         memset(p, 0, sz);
> 	      |         ^~~~~~~~~~~~~~~~
> 	[...]
>
> This bug has been with us ever since this code was added in
> ef8a6c62687 (reftable: utility functions, 2021-10-07).

Not sure it's a bug, or limited to this specific location.  AFAICS it's
more a general lack of handling of allocation failures in the reftable
code.  Perhaps it was a conscious decision to let the system deal with
them via segfaults?

When the code is used by Git eventually it should use xmalloc and
xrealloc instead of calling malloc(3) and realloc(3) directly, to
handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT.
This will calm down the analyzer and extend the safety to the rest of
the reftable code, not just this memset call.

>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  reftable/publicbasics.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
> index 0ad7d5c0ff2..a18167f5ab7 100644
> --- a/reftable/publicbasics.c
> +++ b/reftable/publicbasics.c
> @@ -40,6 +40,8 @@ void reftable_free(void *p)
>  void *reftable_calloc(size_t sz)
>  {
>  	void *p = reftable_malloc(sz);
> +	if (!p)
> +		return NULL;
>  	memset(p, 0, sz);

This function is calloc(3) reimplemented, except it can't make use of
pre-zero'd blocks of memory and doesn't multiply element size and count
for the caller while checking for overflow, making it slower and harder
to use securely. :-/

>  	return p;
>  }

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 1/2] reftable: remove the "return_block" abstraction
  2022-04-15 10:21     ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason
@ 2022-04-15 13:37       ` René Scharfe
  2022-04-25  9:57       ` Han-Wen Nienhuys
  1 sibling, 0 replies; 16+ messages in thread
From: René Scharfe @ 2022-04-15 13:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys

Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason:
> This abstraction added in 1214aa841bc (reftable: add blocksource, an
> abstraction for random access reads, 2021-10-07) has the caller
> provide a "blockp->data", so there's not point in having the vtable
> have a custom free() function.
>
> In addition this had what looked like a poor man's SANITIZE=address
> doing a memset() to 0xff just before the data was free'd.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  reftable/block.c                |  4 +---
>  reftable/blocksource.c          | 28 +---------------------------
>  reftable/reftable-blocksource.h |  2 --
>  3 files changed, 2 insertions(+), 32 deletions(-)
>
> diff --git a/reftable/block.c b/reftable/block.c
> index 34d4d073692..bb17cc32372 100644
> --- a/reftable/block.c
> +++ b/reftable/block.c
> @@ -442,9 +442,7 @@ void block_writer_release(struct block_writer *bw)
>
>  void reftable_block_done(struct reftable_block *blockp)
>  {
> -	struct reftable_block_source source = blockp->source;
> -	if (blockp && source.ops)
> -		source.ops->return_block(source.arg, blockp);

I don't understand this interface.  Why is source.arg passed to the
function?  Is this perhaps done in preparation for some kind of backend
that is planned to be added later which makes use of it?  I suspect we'd
better postpone cleanups until the full functionality is integrated.

> +	FREE_AND_NULL(blockp->data);

If you do that...

>  	blockp->data = NULL;

... then this line becomes redundant.

>  	blockp->len = 0;
>  	blockp->source.ops = NULL;
> diff --git a/reftable/blocksource.c b/reftable/blocksource.c
> index 2605371c28d..d9e47cc316b 100644
> --- a/reftable/blocksource.c
> +++ b/reftable/blocksource.c
> @@ -13,12 +13,6 @@ license that can be found in the LICENSE file or at
>  #include "reftable-blocksource.h"
>  #include "reftable-error.h"
>
> -static void strbuf_return_block(void *b, struct reftable_block *dest)
> -{
> -	memset(dest->data, 0xff, dest->len);
> -	reftable_free(dest->data);
> -}
> -
>  static void strbuf_close(void *b)
>  {
>  }
> @@ -42,7 +36,6 @@ static uint64_t strbuf_size(void *b)
>  static struct reftable_block_source_vtable strbuf_vtable = {
>  	.size = &strbuf_size,
>  	.read_block = &strbuf_read_block,
> -	.return_block = &strbuf_return_block,
>  	.close = &strbuf_close,
>  };
>
> @@ -54,19 +47,7 @@ void block_source_from_strbuf(struct reftable_block_source *bs,
>  	bs->arg = buf;
>  }
>
> -static void malloc_return_block(void *b, struct reftable_block *dest)
> -{
> -	memset(dest->data, 0xff, dest->len);
> -	reftable_free(dest->data);
> -}
> -
> -static struct reftable_block_source_vtable malloc_vtable = {
> -	.return_block = &malloc_return_block,
> -};
> -
> -static struct reftable_block_source malloc_block_source_instance = {
> -	.ops = &malloc_vtable,
> -};
> +static struct reftable_block_source malloc_block_source_instance = { 0 };
>
>  struct reftable_block_source malloc_block_source(void)
>  {
> @@ -83,12 +64,6 @@ static uint64_t file_size(void *b)
>  	return ((struct file_block_source *)b)->size;
>  }
>
> -static void file_return_block(void *b, struct reftable_block *dest)
> -{
> -	memset(dest->data, 0xff, dest->len);
> -	reftable_free(dest->data);
> -}
> -
>  static void file_close(void *b)
>  {
>  	int fd = ((struct file_block_source *)b)->fd;
> @@ -115,7 +90,6 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
>  static struct reftable_block_source_vtable file_vtable = {
>  	.size = &file_size,
>  	.read_block = &file_read_block,
> -	.return_block = &file_return_block,
>  	.close = &file_close,
>  };
>
> diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h
> index 5aa3990a573..7b7cb280b73 100644
> --- a/reftable/reftable-blocksource.h
> +++ b/reftable/reftable-blocksource.h
> @@ -35,8 +35,6 @@ struct reftable_block_source_vtable {
>  	   beyond the end of the block */
>  	int (*read_block)(void *source, struct reftable_block *dest,
>  			  uint64_t off, uint32_t size);
> -	/* mark the block as read; may return the data back to malloc */
> -	void (*return_block)(void *source, struct reftable_block *blockp);
>
>  	/* release all resources associated with the block source */
>  	void (*close)(void *source);


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc()
  2022-04-15 13:37       ` René Scharfe
@ 2022-04-15 13:53         ` Ævar Arnfjörð Bjarmason
  2022-04-15 14:30           ` Phillip Wood
  0 siblings, 1 reply; 16+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-15 13:53 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Junio C Hamano, Carlo Marcelo Arenas Belón,
	Han-Wen Nienhuys


On Fri, Apr 15 2022, René Scharfe wrote:

> Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason:
>> Return NULL instead of possibly feeding a NULL to memset() in
>> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer:
>>
>> 	reftable/publicbasics.c: In function ‘reftable_calloc’:
>> 	reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument]
>> 	   43 |         memset(p, 0, sz);
>> 	      |         ^~~~~~~~~~~~~~~~
>> 	[...]
>>
>> This bug has been with us ever since this code was added in
>> ef8a6c62687 (reftable: utility functions, 2021-10-07).
>
> Not sure it's a bug, or limited to this specific location.  AFAICS it's
> more a general lack of handling of allocation failures in the reftable
> code.  Perhaps it was a conscious decision to let the system deal with
> them via segfaults?

I think it's just buggy, it tries to deal with malloc returning NULL in
other contexts.

> When the code is used by Git eventually it should use xmalloc and
> xrealloc instead of calling malloc(3) and realloc(3) directly, to
> handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT.
> This will calm down the analyzer and extend the safety to the rest of
> the reftable code, not just this memset call.

Yeah, its whole custom malloc handling should go away.

>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  reftable/publicbasics.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
>> index 0ad7d5c0ff2..a18167f5ab7 100644
>> --- a/reftable/publicbasics.c
>> +++ b/reftable/publicbasics.c
>> @@ -40,6 +40,8 @@ void reftable_free(void *p)
>>  void *reftable_calloc(size_t sz)
>>  {
>>  	void *p = reftable_malloc(sz);
>> +	if (!p)
>> +		return NULL;
>>  	memset(p, 0, sz);
>
> This function is calloc(3) reimplemented, except it can't make use of
> pre-zero'd blocks of memory and doesn't multiply element size and count
> for the caller while checking for overflow, making it slower and harder
> to use securely. :-/

*nod*, this is really just re-arranging the deck chairs.

Maybe Han-Wen had something in mind, but I really don't see the point of
having it use malloc wrappers at all, as opposed to just having the
linker point it to the right "malloc".

So if it needs to be used outside of git.git it would just need the
trivial xmalloc() etc. wrappers.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc()
  2022-04-15 13:53         ` Ævar Arnfjörð Bjarmason
@ 2022-04-15 14:30           ` Phillip Wood
  2022-04-15 15:20             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 16+ messages in thread
From: Phillip Wood @ 2022-04-15 14:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, René Scharfe
  Cc: git, Junio C Hamano, Carlo Marcelo Arenas Belón,
	Han-Wen Nienhuys

Hi Ævar and René

On 15/04/2022 14:53, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Apr 15 2022, René Scharfe wrote:
> 
>> Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason:
>>> Return NULL instead of possibly feeding a NULL to memset() in
>>> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer:
>>>
>>> 	reftable/publicbasics.c: In function ‘reftable_calloc’:
>>> 	reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument]
>>> 	   43 |         memset(p, 0, sz);
>>> 	      |         ^~~~~~~~~~~~~~~~
>>> 	[...]
>>>
>>> This bug has been with us ever since this code was added in
>>> ef8a6c62687 (reftable: utility functions, 2021-10-07).
>>
>> Not sure it's a bug, or limited to this specific location.  AFAICS it's
>> more a general lack of handling of allocation failures in the reftable
>> code.  Perhaps it was a conscious decision to let the system deal with
>> them via segfaults?
> 
> I think it's just buggy, it tries to deal with malloc returning NULL in
> other contexts.

Does it? I just quickly scanned the output of

git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master

and I didn't see a single NULL check

>> When the code is used by Git eventually it should use xmalloc and
>> xrealloc instead of calling malloc(3) and realloc(3) directly, to
>> handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT.
>> This will calm down the analyzer and extend the safety to the rest of
>> the reftable code, not just this memset call.
> 
> Yeah, its whole custom malloc handling should go away.

Isn't it there so the same code can be used by libgit2 and other things 
that let the caller specify a custom allocator?

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc()
  2022-04-15 14:30           ` Phillip Wood
@ 2022-04-15 15:20             ` Ævar Arnfjörð Bjarmason
  2022-04-15 16:23               ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-15 15:20 UTC (permalink / raw)
  To: Phillip Wood
  Cc: René Scharfe, git, Junio C Hamano,
	Carlo Marcelo Arenas Belón, Han-Wen Nienhuys, Edward Thomson,
	Patrick Steinhardt


On Fri, Apr 15 2022, Phillip Wood wrote:

> Hi Ævar and René
>
> On 15/04/2022 14:53, Ævar Arnfjörð Bjarmason wrote:
>> On Fri, Apr 15 2022, René Scharfe wrote:
>> 
>>> Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason:
>>>> Return NULL instead of possibly feeding a NULL to memset() in
>>>> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer:
>>>>
>>>> 	reftable/publicbasics.c: In function ‘reftable_calloc’:
>>>> 	reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument]
>>>> 	   43 |         memset(p, 0, sz);
>>>> 	      |         ^~~~~~~~~~~~~~~~
>>>> 	[...]
>>>>
>>>> This bug has been with us ever since this code was added in
>>>> ef8a6c62687 (reftable: utility functions, 2021-10-07).
>>>
>>> Not sure it's a bug, or limited to this specific location.  AFAICS it's
>>> more a general lack of handling of allocation failures in the reftable
>>> code.  Perhaps it was a conscious decision to let the system deal with
>>> them via segfaults?
>> I think it's just buggy, it tries to deal with malloc returning NULL
>> in
>> other contexts.
>
> Does it? I just quickly scanned the output of
>
> git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master
>
> and I didn't see a single NULL check

I think you're right, I retrieved that information from wetware. Looking
again I think I was confusing it with the if (x) free(x) fixes in
34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16).

>>> When the code is used by Git eventually it should use xmalloc and
>>> xrealloc instead of calling malloc(3) and realloc(3) directly, to
>>> handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT.
>>> This will calm down the analyzer and extend the safety to the rest of
>>> the reftable code, not just this memset call.
>> Yeah, its whole custom malloc handling should go away.
>
> Isn't it there so the same code can be used by libgit2 and other
> things that let the caller specify a custom allocator?

I don't think so, but I may be wrong. I think it's a case of
over-generalization where we'd be better off with something simpler,
i.e. just using our normal allocation functions.

That still allows for taking this code and plugging it into any custom
allocator. If you simply want to compile some code such as this to use
another allocator you can easily do that via the linker, as
e.g. git.git's build process allows you to do with nedmalloc.

So presumably if libgit2 would want to use this they'd just do that via
their build process.

I.e. they'd have "malloc" point to a symbol that would resolve to a
shimmy layer that would dispatch to functions using this:
https://github.com/libgit2/libgit2/blob/main/src/util/alloc.c

The reason you'd use these sorts of pluggable malloc functions is, I
think, a few:

 1. You are a library like libgit2, as opposed to libreftable itself, so
    you want to provide this sort of "set the alloc pointers to this"
    now. But in this case we either always want malloc/free (git.git) or
    the "parent" can provide these wrappers (libgit2).

 2. You want to support switching dynamically (or not-globally), or have
    N instances with different malloc, as e.g. the PCREv2 API does:
    cbe81e653fa (grep/pcre2: move back to thread-only PCREv2 structures,
    2021-02-18).

    For this code I think that's thoroughly in YAGNI territory.

 3. Your distribution model can't assume the user can adjust this via
    linking, e.g. C source that's intended to be #included as-is
    (although there you could use defines...) etc.

But maybe I'm missing something.

But a really good reason not to do this and just rely on the link-time
overriding is:

 A. Things look the same, and benefit from more general fixes (although
    to be fair the x*alloc() wrappers would work either way..)

 B. You don't get subtle bugs where you forget some_custom_malloc() and
    then use a generic free(). Having looked just now reftable/ has at
    least one such submarine-segfault waiting to happen.

 C. If you don't actually need the hyper-customize pluggable model of
    say PCREv2 where the library is trying to support having two
    patterns in memory managed by different allocators, or other such
    advanced use-cases it's just extra complexity.

Some of this was discussed for xdiff/* in this thread:
https://lore.kernel.org/git/220216.86leybszht.gmgdl@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc()
  2022-04-15 15:20             ` Ævar Arnfjörð Bjarmason
@ 2022-04-15 16:23               ` Junio C Hamano
  2022-04-25 10:30                 ` Han-Wen Nienhuys
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2022-04-15 16:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Phillip Wood, René Scharfe, git,
	Carlo Marcelo Arenas Belón, Han-Wen Nienhuys, Edward Thomson,
	Patrick Steinhardt

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> Does it? I just quickly scanned the output of
>>
>> git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master
>>
>> and I didn't see a single NULL check
>
> I think you're right, I retrieved that information from wetware. Looking
> again I think I was confusing it with the if (x) free(x) fixes in
> 34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16).

True.  Initially I hoped that these RFC changes gives us a better
solution that comes from stepping back and rethinking the issues
around the original "why are we calling memset() with NULL?", and
if it were only "well, in _return_block() functions, we clear the
block before calling _free()---that shouldn't be necessary, if the
calling custom malloc-free pair cares, they can do the clearing, and
plain vanilla free() certainly doesn't, so let's stop calling
memset()", it certainly would have fallen into that category, but
everything these RFC patches do beyond that seems "oh, it does not
look important to me, so let's rip it out to simplify", without
knowing enough to say if that is sensible.

But ...

>> Isn't it there so the same code can be used by libgit2 and other
>> things that let the caller specify a custom allocator?
>
> I don't think so, but I may be wrong. I think it's a case of
> over-generalization where we'd be better off with something simpler,
> i.e. just using our normal allocation functions.

... many points that was raised on the reftable code in this thread
may deserve response to explain the rationale behind the current
code and design, as nobody can improve, without breaking the
intended way to be used, without knowing what it is.  Thanks for
marking the patches with RFC.  I think the "patch" is a bit too
dense a form to point out the issues in the existing code, but the
discussion that follows by quoting the points in the original code
and suggested changes is a good way to think about the code before
these RFC patches.

Ideally, it would have been much better if these points were raised
during the review before the code was accepted to the code base, but
it is better to have one now, rather than never ;-)

THanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 1/2] reftable: remove the "return_block" abstraction
  2022-04-15 10:21     ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason
  2022-04-15 13:37       ` René Scharfe
@ 2022-04-25  9:57       ` Han-Wen Nienhuys
  2022-04-25 17:30         ` Junio C Hamano
  1 sibling, 1 reply; 16+ messages in thread
From: Han-Wen Nienhuys @ 2022-04-25  9:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Carlo Marcelo Arenas Belón

On Fri, Apr 15, 2022 at 12:21 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> This abstraction added in 1214aa841bc (reftable: add blocksource, an
> abstraction for random access reads, 2021-10-07) has the caller
> provide a "blockp->data", so there's not point in having the vtable
> have a custom free() function.
>
> In addition this had what looked like a poor man's SANITIZE=address
> doing a memset() to 0xff just before the data was free'd.

>  void reftable_block_done(struct reftable_block *blockp)
>  {
> -       struct reftable_block_source source = blockp->source;
> -       if (blockp && source.ops)
> -               source.ops->return_block(source.arg, blockp);
> +       FREE_AND_NULL(blockp->data);


My thinking here is that we could mmap the reftable file to do reads.
In that case, discarding the block would imply decreasing a refcount
somewhere, rather than deallocating memory.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] reftable: avoid undefined behaviour breaking t0032
  2022-04-15  8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón
  2022-04-15 10:21   ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason
@ 2022-04-25 10:18   ` Han-Wen Nienhuys
  1 sibling, 0 replies; 16+ messages in thread
From: Han-Wen Nienhuys @ 2022-04-25 10:18 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git

On Fri, Apr 15, 2022 at 10:34 AM Carlo Marcelo Arenas Belón
<carenas@gmail.com> wrote:
> 1214aa841bc (reftable: add blocksource, an abstraction for random
> access reads, 2021-10-07), makes the assumption that it is ok to
> free a reftable_block pointing to NULL if the size is also set to
>..

either patch (data or len) LGTM

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc()
  2022-04-15 16:23               ` Junio C Hamano
@ 2022-04-25 10:30                 ` Han-Wen Nienhuys
  0 siblings, 0 replies; 16+ messages in thread
From: Han-Wen Nienhuys @ 2022-04-25 10:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Phillip Wood,
	René Scharfe, git, Carlo Marcelo Arenas Belón,
	Edward Thomson, Patrick Steinhardt

On Fri, Apr 15, 2022 at 6:23 PM Junio C Hamano <gitster@pobox.com> wrote:
> >> git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master
> >>
> >> and I didn't see a single NULL check
> >
> > I think you're right, I retrieved that information from wetware. Looking
> > again I think I was confusing it with the if (x) free(x) fixes in
> > 34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16).
>
> True.  Initially I hoped that these RFC changes gives us a better
> solution that comes from stepping back and rethinking the issues
> around the original "why are we calling memset() with NULL?", and

memset with NULL is an oversight.

The malloc routines are pluggable so the code could be reused for
libgit2. However, as use within Git itself is still not imminent, they
could just as well be removed as they are just a premature
generalization right now.

> if it were only "well, in _return_block() functions, we clear the
> block before calling _free()---that shouldn't be necessary, if the
> calling custom malloc-free pair cares, they can do the clearing, and
> plain vanilla free() certainly doesn't, so let's stop calling

The memset() calls on free() (eg. in  are there to tease out
use-after-free bugs in the unittests, but they should probably be
removed from file_return_block() as that is production code.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 1/2] reftable: remove the "return_block" abstraction
  2022-04-25  9:57       ` Han-Wen Nienhuys
@ 2022-04-25 17:30         ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2022-04-25 17:30 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Ævar Arnfjörð Bjarmason, git,
	Carlo Marcelo Arenas Belón

Han-Wen Nienhuys <hanwen@google.com> writes:

> On Fri, Apr 15, 2022 at 12:21 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> This abstraction added in 1214aa841bc (reftable: add blocksource, an
>> abstraction for random access reads, 2021-10-07) has the caller
>> provide a "blockp->data", so there's not point in having the vtable
>> have a custom free() function.
>>
>> In addition this had what looked like a poor man's SANITIZE=address
>> doing a memset() to 0xff just before the data was free'd.
>
>>  void reftable_block_done(struct reftable_block *blockp)
>>  {
>> -       struct reftable_block_source source = blockp->source;
>> -       if (blockp && source.ops)
>> -               source.ops->return_block(source.arg, blockp);
>> +       FREE_AND_NULL(blockp->data);
>
>
> My thinking here is that we could mmap the reftable file to do reads.
> In that case, discarding the block would imply decreasing a refcount
> somewhere, rather than deallocating memory.

Sounds like a plan.  As a solution to the memset() thing, ripping
out this abstraction layer is indeed not just overkill but also
doing too much of "while we are at it".

Let's take what we've queued on cm/reftable-0-length-memset and
merge it down.

Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-04-25 17:30 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-15  7:02 [PATCH] reftable: avoid undefined behaviour breaking t0032 Carlo Marcelo Arenas Belón
2022-04-15  7:10 ` Junio C Hamano
2022-04-15  8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón
2022-04-15 10:21   ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason
2022-04-15 10:21     ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason
2022-04-15 13:37       ` René Scharfe
2022-04-25  9:57       ` Han-Wen Nienhuys
2022-04-25 17:30         ` Junio C Hamano
2022-04-15 10:21     ` [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason
2022-04-15 13:37       ` René Scharfe
2022-04-15 13:53         ` Ævar Arnfjörð Bjarmason
2022-04-15 14:30           ` Phillip Wood
2022-04-15 15:20             ` Ævar Arnfjörð Bjarmason
2022-04-15 16:23               ` Junio C Hamano
2022-04-25 10:30                 ` Han-Wen Nienhuys
2022-04-25 10:18   ` [PATCH v2] reftable: avoid undefined behaviour breaking t0032 Han-Wen Nienhuys

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).