git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 1/4] shallow.c: make paint_alloc slightly more robust
@ 2016-12-02 20:31 Rasmus Villemoes
  2016-12-02 20:31 ` [PATCH 2/4] shallow.c: avoid theoretical pointer wrap-around Rasmus Villemoes
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Rasmus Villemoes @ 2016-12-02 20:31 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy, Jeff King,
	Rasmus Villemoes

I have no idea if this is a real issue, but it's not obvious to me that
paint_alloc cannot be called with info->nr_bits greater than about
4M (\approx 8*COMMIT_SLAB_SIZE). In that case the new slab would be too
small. So just round up the allocation to the maximum of
COMMIT_SLAB_SIZE and size.

Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
---
 shallow.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/shallow.c b/shallow.c
index 4d0b005..e21534a 100644
--- a/shallow.c
+++ b/shallow.c
@@ -445,11 +445,13 @@ static uint32_t *paint_alloc(struct paint_info *info)
 	unsigned size = nr * sizeof(uint32_t);
 	void *p;
 	if (!info->slab_count || info->free + size > info->end) {
+		unsigned alloc_size = size < COMMIT_SLAB_SIZE ?
+			COMMIT_SLAB_SIZE : size;
 		info->slab_count++;
 		REALLOC_ARRAY(info->slab, info->slab_count);
-		info->free = xmalloc(COMMIT_SLAB_SIZE);
+		info->free = xmalloc(alloc_size);
 		info->slab[info->slab_count - 1] = info->free;
-		info->end = info->free + COMMIT_SLAB_SIZE;
+		info->end = info->free + alloc_size;
 	}
 	p = info->free;
 	info->free += size;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/4] shallow.c: avoid theoretical pointer wrap-around
  2016-12-02 20:31 [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Rasmus Villemoes
@ 2016-12-02 20:31 ` Rasmus Villemoes
  2016-12-03  5:17   ` Jeff King
  2016-12-02 20:31 ` [PATCH 3/4] shallow.c: bit manipulation tweaks Rasmus Villemoes
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Rasmus Villemoes @ 2016-12-02 20:31 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy, Jeff King,
	Rasmus Villemoes

The expression info->free+size is technically undefined behaviour in
exactly the case we want to test for. Moreover, the compiler is likely
to translate the expression to

  (unsigned long)info->free + size > (unsigned long)info->end

where there's at least a theoretical chance that the LHS could wrap
around 0, giving a false negative.

This might as well be written using pointer subtraction avoiding these
issues.

Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
---
 shallow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/shallow.c b/shallow.c
index e21534a..8b1c35d 100644
--- a/shallow.c
+++ b/shallow.c
@@ -444,7 +444,7 @@ static uint32_t *paint_alloc(struct paint_info *info)
 	unsigned nr = (info->nr_bits + 31) / 32;
 	unsigned size = nr * sizeof(uint32_t);
 	void *p;
-	if (!info->slab_count || info->free + size > info->end) {
+	if (!info->slab_count || size > info->end - info->free) {
 		unsigned alloc_size = size < COMMIT_SLAB_SIZE ?
 			COMMIT_SLAB_SIZE : size;
 		info->slab_count++;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/4] shallow.c: bit manipulation tweaks
  2016-12-02 20:31 [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Rasmus Villemoes
  2016-12-02 20:31 ` [PATCH 2/4] shallow.c: avoid theoretical pointer wrap-around Rasmus Villemoes
@ 2016-12-02 20:31 ` Rasmus Villemoes
  2016-12-03  5:21   ` Jeff King
  2016-12-02 20:31 ` [PATCH 4/4] shallow.c: remove useless test Rasmus Villemoes
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Rasmus Villemoes @ 2016-12-02 20:31 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy, Jeff King,
	Rasmus Villemoes

First of all, 1 << 31 is technically undefined behaviour, so let's just
use an unsigned literal.

If i is 'signed int' and gcc doesn't know that i is positive, gcc
generates code to compute the C99-mandated values of "i / 32" and "i %
32", which is a lot more complicated than simple a simple shifts/mask.

The only caller of paint_down actually passes an "unsigned int" value,
but the prototype of paint_down causes (completely well-defined)
conversion to signed int, and gcc has no way of knowing that the
converted value is non-negative. Just make the id parameter unsigned.

In update_refstatus, the change in generated code is much smaller,
presumably because gcc is smart enough to see that i starts as 0 and is
only incremented, so it is allowed (per the UD of signed overflow) to
assume that i is always non-negative. But let's just help less smart
compilers generate good code anyway.

Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
---
 shallow.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/shallow.c b/shallow.c
index 8b1c35d..5aec5a5 100644
--- a/shallow.c
+++ b/shallow.c
@@ -464,7 +464,7 @@ static uint32_t *paint_alloc(struct paint_info *info)
  * all walked commits.
  */
 static void paint_down(struct paint_info *info, const unsigned char *sha1,
-		       int id)
+		       unsigned int id)
 {
 	unsigned int i, nr;
 	struct commit_list *head = NULL;
@@ -476,7 +476,7 @@ static void paint_down(struct paint_info *info, const unsigned char *sha1,
 	if (!c)
 		return;
 	memset(bitmap, 0, bitmap_size);
-	bitmap[id / 32] |= (1 << (id % 32));
+	bitmap[id / 32] |= (1U << (id % 32));
 	commit_list_insert(c, &head);
 	while (head) {
 		struct commit_list *p;
@@ -650,11 +650,11 @@ static int add_ref(const char *refname, const struct object_id *oid,
 
 static void update_refstatus(int *ref_status, int nr, uint32_t *bitmap)
 {
-	int i;
+	unsigned int i;
 	if (!ref_status)
 		return;
 	for (i = 0; i < nr; i++)
-		if (bitmap[i / 32] & (1 << (i % 32)))
+		if (bitmap[i / 32] & (1U << (i % 32)))
 			ref_status[i]++;
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/4] shallow.c: remove useless test
  2016-12-02 20:31 [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Rasmus Villemoes
  2016-12-02 20:31 ` [PATCH 2/4] shallow.c: avoid theoretical pointer wrap-around Rasmus Villemoes
  2016-12-02 20:31 ` [PATCH 3/4] shallow.c: bit manipulation tweaks Rasmus Villemoes
@ 2016-12-02 20:31 ` Rasmus Villemoes
  2016-12-03  5:24   ` Jeff King
  2016-12-03  5:14 ` [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Jeff King
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
  4 siblings, 1 reply; 20+ messages in thread
From: Rasmus Villemoes @ 2016-12-02 20:31 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy, Jeff King,
	Rasmus Villemoes

It seems to be odd to do x=y if x==y. Maybe there's a bug somewhere near
this, but as is this is somewhat confusing.

Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
---
 shallow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/shallow.c b/shallow.c
index 5aec5a5..7c28239 100644
--- a/shallow.c
+++ b/shallow.c
@@ -513,7 +513,7 @@ static void paint_down(struct paint_info *info, const unsigned char *sha1,
 							  p->item);
 			if (p->item->object.flags & SEEN)
 				continue;
-			if (*p_refs == NULL || *p_refs == *refs)
+			if (*p_refs == NULL)
 				*p_refs = *refs;
 			commit_list_insert(p->item, &head);
 		}
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] shallow.c: make paint_alloc slightly more robust
  2016-12-02 20:31 [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Rasmus Villemoes
                   ` (2 preceding siblings ...)
  2016-12-02 20:31 ` [PATCH 4/4] shallow.c: remove useless test Rasmus Villemoes
@ 2016-12-03  5:14 ` Jeff King
  2016-12-05 10:02   ` Duy Nguyen
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
  4 siblings, 1 reply; 20+ messages in thread
From: Jeff King @ 2016-12-03  5:14 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git, Nguyễn Thái Ngọc Duy

On Fri, Dec 02, 2016 at 09:31:01PM +0100, Rasmus Villemoes wrote:

> I have no idea if this is a real issue, but it's not obvious to me that
> paint_alloc cannot be called with info->nr_bits greater than about
> 4M (\approx 8*COMMIT_SLAB_SIZE). In that case the new slab would be too
> small. So just round up the allocation to the maximum of
> COMMIT_SLAB_SIZE and size.

I had trouble understanding what the problem is from this description,
but I think i figured it out from the code.

Let me try to restate it to make sure I understand.

The paint_alloc() may be asked to allocate a certain number of bits,
which it does across a series of independently allocated slabs. Each
slab holds a fixed size, but we only allocate a single slab. If the
number we need to allocate is larger than fits in a single slab, then at
the end we'll have under-allocated.

Your solution is to make the slab we allocate bigger. But that seems
odd to me. Usually when we are using COMMIT_SLAB_SIZE, we are allocating
a series of slabs that make up a virtual array, and we know that each
slab has the same size. So if you need to find the k-th item, and each
slab has length n, then you'd look at slab (k / n), and then at item (k
% n) within that slab.

In other words, I think the solution isn't to make the one slab bigger,
but to allocate slabs until we have enough of them to meet the request.

But I don't really know how this code is used, or why it is using
COMMIT_SLAB_SIZE in the first place. That's generally supposed to be an
internal detail of the commit-slab.h infrastructure. Why is it being
used directly, instead of just using the functions that commit-slab
defines?

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] shallow.c: avoid theoretical pointer wrap-around
  2016-12-02 20:31 ` [PATCH 2/4] shallow.c: avoid theoretical pointer wrap-around Rasmus Villemoes
@ 2016-12-03  5:17   ` Jeff King
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff King @ 2016-12-03  5:17 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git, Nguyễn Thái Ngọc Duy

On Fri, Dec 02, 2016 at 09:31:02PM +0100, Rasmus Villemoes wrote:

> The expression info->free+size is technically undefined behaviour in
> exactly the case we want to test for. Moreover, the compiler is likely
> to translate the expression to
> 
>   (unsigned long)info->free + size > (unsigned long)info->end
> 
> where there's at least a theoretical chance that the LHS could wrap
> around 0, giving a false negative.
> 
> This might as well be written using pointer subtraction avoiding these
> issues.
> [...]
>
> -	if (!info->slab_count || info->free + size > info->end) {
> +	if (!info->slab_count || size > info->end - info->free) {

Yeah, I agree the correct way to write this is to compare the sizes
directly. That is how overflow checks _must_ be written. This one is
less likely to overflow, but even computing the value more than one past
the end of the array is technically undefined.

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] shallow.c: bit manipulation tweaks
  2016-12-02 20:31 ` [PATCH 3/4] shallow.c: bit manipulation tweaks Rasmus Villemoes
@ 2016-12-03  5:21   ` Jeff King
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff King @ 2016-12-03  5:21 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git, Nguyễn Thái Ngọc Duy

On Fri, Dec 02, 2016 at 09:31:03PM +0100, Rasmus Villemoes wrote:

> First of all, 1 << 31 is technically undefined behaviour, so let's just
> use an unsigned literal.

It took me a second to realize that you weren't talking about the
unsigned parameter here. You mean using "1U". It might be worth saying:

   ...use an unsigned literal, "1U".

to make it more obvious.

> If i is 'signed int' and gcc doesn't know that i is positive, gcc
> generates code to compute the C99-mandated values of "i / 32" and "i %
> 32", which is a lot more complicated than simple a simple shifts/mask.

Right, that makes sense (though it is a separate issue).

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] shallow.c: remove useless test
  2016-12-02 20:31 ` [PATCH 4/4] shallow.c: remove useless test Rasmus Villemoes
@ 2016-12-03  5:24   ` Jeff King
  2016-12-05 12:00     ` Duy Nguyen
  0 siblings, 1 reply; 20+ messages in thread
From: Jeff King @ 2016-12-03  5:24 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git, Nguyễn Thái Ngọc Duy

On Fri, Dec 02, 2016 at 09:31:04PM +0100, Rasmus Villemoes wrote:

> It seems to be odd to do x=y if x==y. Maybe there's a bug somewhere near
> this, but as is this is somewhat confusing.

Yeah, this code is definitely wrong, but I'm not sure what it's trying
to do. This is the first time I've looked at it.

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] shallow.c: make paint_alloc slightly more robust
  2016-12-03  5:14 ` [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Jeff King
@ 2016-12-05 10:02   ` Duy Nguyen
  0 siblings, 0 replies; 20+ messages in thread
From: Duy Nguyen @ 2016-12-05 10:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Rasmus Villemoes, Git Mailing List

On Sat, Dec 3, 2016 at 12:14 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Dec 02, 2016 at 09:31:01PM +0100, Rasmus Villemoes wrote:
>
>> I have no idea if this is a real issue, but it's not obvious to me that
>> paint_alloc cannot be called with info->nr_bits greater than about
>> 4M (\approx 8*COMMIT_SLAB_SIZE). In that case the new slab would be too
>> small. So just round up the allocation to the maximum of
>> COMMIT_SLAB_SIZE and size.
>
> I had trouble understanding what the problem is from this description,
> but I think i figured it out from the code.
>
> Let me try to restate it to make sure I understand.
>
> The paint_alloc() may be asked to allocate a certain number of bits,
> which it does across a series of independently allocated slabs. Each
> slab holds a fixed size, but we only allocate a single slab. If the
> number we need to allocate is larger than fits in a single slab, then at
> the end we'll have under-allocated.

Each bit here represents a ref. This code walks the commit graph and
"paints" all commits reachable by the n-th ref with the n-th bit,
stored in the commit slab. But because the majority of commits will
have the same bitmap (e.g. when you exclude tag ABC and nothing else,
then all commits from ABC will have the same bitmap "1"), it's a waste
to allocate the same bitmap per commit (and it's also inefficient to
let malloc allocate 1 bit). I tried to reduce the memory usage: if the
a commit and its parent has the same bitmap, and the slab pointer of
the child commit points to the memory of the parent's, no extra
allocation is done. This manual memory management is pretty much like
alloc.c

The COMMIT_SLAB_SIZE here is really an arbitrary big number so that we
don't have to allocate often. It's basically allocating a new memory
pool. When we use all of that pool, we allocate a new one.. Yeah I
probably should define a new one instead of reusing COMMIT_SLAB_SIZE.
Tthe chances of under-allocation is super low, but still possible: you
need to send more than 4M "exclude" (or "shallow") requests to
upload-pack, to create a bitmap of over 512KiB. That's a lot of
traffic in git protocol.

> Your solution is to make the slab we allocate bigger. But that seems
> odd to me. Usually when we are using COMMIT_SLAB_SIZE, we are allocating
> a series of slabs that make up a virtual array, and we know that each
> slab has the same size. So if you need to find the k-th item, and each
> slab has length n, then you'd look at slab (k / n), and then at item (k
> % n) within that slab.
>
> In other words, I think the solution isn't to make the one slab bigger,
> but to allocate slabs until we have enough of them to meet the request.

If I still understand my code (it's been a long time since I wrote
this thing), then I think we just need to catch the problem and die().
Normal users should never ask the server to allocate this much.
-- 
Duy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] shallow.c: remove useless test
  2016-12-03  5:24   ` Jeff King
@ 2016-12-05 12:00     ` Duy Nguyen
  0 siblings, 0 replies; 20+ messages in thread
From: Duy Nguyen @ 2016-12-05 12:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Rasmus Villemoes, Git Mailing List

On Sat, Dec 3, 2016 at 12:24 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Dec 02, 2016 at 09:31:04PM +0100, Rasmus Villemoes wrote:
>
>> It seems to be odd to do x=y if x==y. Maybe there's a bug somewhere near
>> this, but as is this is somewhat confusing.
>
> Yeah, this code is definitely wrong, but I'm not sure what it's trying
> to do. This is the first time I've looked at it.

I'm sorry I don't know why it's there either :( The first version that
has this was v3 [1] which still uses "util" pointdf instead of the
commit slab but the logic does not differ much.

This is the place when we "paint" the parent commit with the same
bitmap as the child I mentioned earlier (though I think I mentioned it
backward). You see similar code in the same loop just a bit earlier:
if the commit has not been painted, it gets a new bitmap, otherwise
new refs are OR'd to its bitmap. It's the OR part (when the bitmaps
pointed by *p_refs and *refs differ, it's not just about pointer
comparison) that's probably missing here.

But it looks like we can safely delete the " || *p_refs == *refs" part
because the commit in question is inserted back to the commit list
"head" and revisited in the next iteration. If its bitmap is different
from the child's, then in the next iteration it should hit the "if
(memcmp(tmp, *refs, bitmap_size))" line above, in the same loop, then
the new bit will be added. If it's marked UNINTERESTING though, that
won't happen. I'll need more time to stare at this code...

[1] http://public-inbox.org/git/1385351754-9954-9-git-send-email-pclouds@gmail.com/
-- 
Duy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 0/6] shallow.c improvements
  2016-12-02 20:31 [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Rasmus Villemoes
                   ` (3 preceding siblings ...)
  2016-12-03  5:14 ` [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Jeff King
@ 2016-12-06 12:53 ` Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 1/6] shallow.c: rename fields in paint_info to better express their purposes Nguyễn Thái Ngọc Duy
                     ` (6 more replies)
  4 siblings, 7 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-12-06 12:53 UTC (permalink / raw)
  To: git; +Cc: rv, Jeff King, Nguyễn Thái Ngọc Duy

After staring not-so-hard and not-for-so-long at the code. This is
what I come up with. Rasmus I replaced two of your commits with my
own (and thank you for giving me an opportunity to refresh my memory
with this stuff). The first two commits are new and the result of
Jeff's observation on COMMIT_SLAB_SIZE.

You may find the description here a bit different from my explanation
previously (about "exclude/shallow requests"). Well.. I was wrong.
I had the recent --exclude-tag and friends in mind, but this is about
clone/fetch/push from/to a shallow repository since 2013, no wonder I
don't remember much about it :-D

Nguyễn Thái Ngọc Duy (4):
  shallow.c: rename fields in paint_info to better express their purposes
  shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools
  shallow.c: make paint_alloc slightly more robust
  shallow.c: remove useless code

Rasmus Villemoes (2):
  shallow.c: avoid theoretical pointer wrap-around
  shallow.c: bit manipulation tweaks

 shallow.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)

-- 
2.8.2.524.g6ff3d78


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/6] shallow.c: rename fields in paint_info to better express their purposes
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
@ 2016-12-06 12:53   ` Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 2/6] shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools Nguyễn Thái Ngọc Duy
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-12-06 12:53 UTC (permalink / raw)
  To: git; +Cc: rv, Jeff King, Nguyễn Thái Ngọc Duy

paint_alloc() is basically malloc(), tuned for allocating a fixed number
of bits on every call without worrying about freeing any individual
allocation since all will be freed at the end. It does it by allocating
a big block of memory every time it runs out of "free memory". "slab" is
a poor choice of name, at least poorer than "pool".

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 shallow.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/shallow.c b/shallow.c
index 4d0b005..8100dfd 100644
--- a/shallow.c
+++ b/shallow.c
@@ -434,9 +434,9 @@ define_commit_slab(ref_bitmap, uint32_t *);
 struct paint_info {
 	struct ref_bitmap ref_bitmap;
 	unsigned nr_bits;
-	char **slab;
+	char **pools;
 	char *free, *end;
-	unsigned slab_count;
+	unsigned pool_count;
 };
 
 static uint32_t *paint_alloc(struct paint_info *info)
@@ -444,11 +444,11 @@ static uint32_t *paint_alloc(struct paint_info *info)
 	unsigned nr = (info->nr_bits + 31) / 32;
 	unsigned size = nr * sizeof(uint32_t);
 	void *p;
-	if (!info->slab_count || info->free + size > info->end) {
-		info->slab_count++;
-		REALLOC_ARRAY(info->slab, info->slab_count);
+	if (!info->pool_count || info->free + size > info->end) {
+		info->pool_count++;
+		REALLOC_ARRAY(info->pools, info->pool_count);
 		info->free = xmalloc(COMMIT_SLAB_SIZE);
-		info->slab[info->slab_count - 1] = info->free;
+		info->pools[info->pool_count - 1] = info->free;
 		info->end = info->free + COMMIT_SLAB_SIZE;
 	}
 	p = info->free;
@@ -624,9 +624,9 @@ void assign_shallow_commits_to_refs(struct shallow_info *info,
 		post_assign_shallow(info, &pi.ref_bitmap, ref_status);
 
 	clear_ref_bitmap(&pi.ref_bitmap);
-	for (i = 0; i < pi.slab_count; i++)
-		free(pi.slab[i]);
-	free(pi.slab);
+	for (i = 0; i < pi.pool_count; i++)
+		free(pi.pools[i]);
+	free(pi.pools);
 	free(shallow);
 }
 
-- 
2.8.2.524.g6ff3d78


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/6] shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 1/6] shallow.c: rename fields in paint_info to better express their purposes Nguyễn Thái Ngọc Duy
@ 2016-12-06 12:53   ` Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 3/6] shallow.c: make paint_alloc slightly more robust Nguyễn Thái Ngọc Duy
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-12-06 12:53 UTC (permalink / raw)
  To: git; +Cc: rv, Jeff King, Nguyễn Thái Ngọc Duy

We need to allocate a "big" block of memory in paint_alloc(). The exact
size does not really matter. But the pool size has no relation with
commit-slab. Stop using that macro here.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 shallow.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/shallow.c b/shallow.c
index 8100dfd..2512ed3 100644
--- a/shallow.c
+++ b/shallow.c
@@ -431,6 +431,8 @@ void remove_nonexistent_theirs_shallow(struct shallow_info *info)
 
 define_commit_slab(ref_bitmap, uint32_t *);
 
+#define POOL_SIZE (512 * 1024)
+
 struct paint_info {
 	struct ref_bitmap ref_bitmap;
 	unsigned nr_bits;
@@ -447,9 +449,9 @@ static uint32_t *paint_alloc(struct paint_info *info)
 	if (!info->pool_count || info->free + size > info->end) {
 		info->pool_count++;
 		REALLOC_ARRAY(info->pools, info->pool_count);
-		info->free = xmalloc(COMMIT_SLAB_SIZE);
+		info->free = xmalloc(POOL_SIZE);
 		info->pools[info->pool_count - 1] = info->free;
-		info->end = info->free + COMMIT_SLAB_SIZE;
+		info->end = info->free + POOL_SIZE;
 	}
 	p = info->free;
 	info->free += size;
-- 
2.8.2.524.g6ff3d78


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 3/6] shallow.c: make paint_alloc slightly more robust
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 1/6] shallow.c: rename fields in paint_info to better express their purposes Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 2/6] shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools Nguyễn Thái Ngọc Duy
@ 2016-12-06 12:53   ` Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 4/6] shallow.c: avoid theoretical pointer wrap-around Nguyễn Thái Ngọc Duy
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-12-06 12:53 UTC (permalink / raw)
  To: git; +Cc: rv, Jeff King, Nguyễn Thái Ngọc Duy

paint_alloc() allocates a big block of memory and splits it into
smaller, fixed size, chunks of memory whenever it's called. Each chunk
contains enough bits to present all "new refs" [1] in a fetch from a
shallow repository.

We do not check if the new "big block" is smaller than the requested
memory chunk though. If it happens, we'll happily pass back a memory
region smaller than expected. Which will lead to problems eventually.

A normal fetch may add/update a dozen new refs. Let's stay on the
"reasonably extreme" side and say we need 16k refs (or bits from
paint_alloc's perspective). Each chunk of memory would be 2k, much
smaller than the memory pool (512k).

So, normally, the under-allocation situation should never happen. A bad
guy, however, could make a fetch that adds more than 4m new/updated refs
to this code which results in a memory chunk larger than pool size.
Check this case and abort.

Noticed-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>

[1] Details are in commit message of 58babff (shallow.c: the 8 steps to
    select new commits for .git/shallow - 2013-12-05), step 6.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 shallow.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/shallow.c b/shallow.c
index 2512ed3..75e1702 100644
--- a/shallow.c
+++ b/shallow.c
@@ -447,6 +447,9 @@ static uint32_t *paint_alloc(struct paint_info *info)
 	unsigned size = nr * sizeof(uint32_t);
 	void *p;
 	if (!info->pool_count || info->free + size > info->end) {
+		if (size > POOL_SIZE)
+			die("BUG: pool size too small for %d in paint_alloc()",
+			    size);
 		info->pool_count++;
 		REALLOC_ARRAY(info->pools, info->pool_count);
 		info->free = xmalloc(POOL_SIZE);
-- 
2.8.2.524.g6ff3d78


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 4/6] shallow.c: avoid theoretical pointer wrap-around
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
                     ` (2 preceding siblings ...)
  2016-12-06 12:53   ` [PATCH v2 3/6] shallow.c: make paint_alloc slightly more robust Nguyễn Thái Ngọc Duy
@ 2016-12-06 12:53   ` Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 5/6] shallow.c: bit manipulation tweaks Nguyễn Thái Ngọc Duy
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-12-06 12:53 UTC (permalink / raw)
  To: git; +Cc: rv, Jeff King, Nguyễn Thái Ngọc Duy

From: Rasmus Villemoes <rv@rasmusvillemoes.dk>

The expression info->free+size is technically undefined behaviour in
exactly the case we want to test for. Moreover, the compiler is likely
to translate the expression to

  (unsigned long)info->free + size > (unsigned long)info->end

where there's at least a theoretical chance that the LHS could wrap
around 0, giving a false negative.

This might as well be written using pointer subtraction avoiding these
issues.

Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 shallow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/shallow.c b/shallow.c
index 75e1702..719f699 100644
--- a/shallow.c
+++ b/shallow.c
@@ -446,7 +446,7 @@ static uint32_t *paint_alloc(struct paint_info *info)
 	unsigned nr = (info->nr_bits + 31) / 32;
 	unsigned size = nr * sizeof(uint32_t);
 	void *p;
-	if (!info->pool_count || info->free + size > info->end) {
+	if (!info->pool_count || size > info->end - info->free) {
 		if (size > POOL_SIZE)
 			die("BUG: pool size too small for %d in paint_alloc()",
 			    size);
-- 
2.8.2.524.g6ff3d78


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 5/6] shallow.c: bit manipulation tweaks
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
                     ` (3 preceding siblings ...)
  2016-12-06 12:53   ` [PATCH v2 4/6] shallow.c: avoid theoretical pointer wrap-around Nguyễn Thái Ngọc Duy
@ 2016-12-06 12:53   ` Nguyễn Thái Ngọc Duy
  2016-12-06 12:53   ` [PATCH v2 6/6] shallow.c: remove useless code Nguyễn Thái Ngọc Duy
  2016-12-06 13:42   ` [PATCH v2 0/6] shallow.c improvements Jeff King
  6 siblings, 0 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-12-06 12:53 UTC (permalink / raw)
  To: git; +Cc: rv, Jeff King, Nguyễn Thái Ngọc Duy

From: Rasmus Villemoes <rv@rasmusvillemoes.dk>

First of all, 1 << 31 is technically undefined behaviour, so let's just
use an unsigned literal.

If i is 'signed int' and gcc doesn't know that i is positive, gcc
generates code to compute the C99-mandated values of "i / 32" and "i %
32", which is a lot more complicated than simple a simple shifts/mask.

The only caller of paint_down actually passes an "unsigned int" value,
but the prototype of paint_down causes (completely well-defined)
conversion to signed int, and gcc has no way of knowing that the
converted value is non-negative. Just make the id parameter unsigned.

In update_refstatus, the change in generated code is much smaller,
presumably because gcc is smart enough to see that i starts as 0 and is
only incremented, so it is allowed (per the UD of signed overflow) to
assume that i is always non-negative. But let's just help less smart
compilers generate good code anyway.

Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 shallow.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/shallow.c b/shallow.c
index 719f699..beb967e 100644
--- a/shallow.c
+++ b/shallow.c
@@ -467,7 +467,7 @@ static uint32_t *paint_alloc(struct paint_info *info)
  * all walked commits.
  */
 static void paint_down(struct paint_info *info, const unsigned char *sha1,
-		       int id)
+		       unsigned int id)
 {
 	unsigned int i, nr;
 	struct commit_list *head = NULL;
@@ -479,7 +479,7 @@ static void paint_down(struct paint_info *info, const unsigned char *sha1,
 	if (!c)
 		return;
 	memset(bitmap, 0, bitmap_size);
-	bitmap[id / 32] |= (1 << (id % 32));
+	bitmap[id / 32] |= (1U << (id % 32));
 	commit_list_insert(c, &head);
 	while (head) {
 		struct commit_list *p;
@@ -653,11 +653,11 @@ static int add_ref(const char *refname, const struct object_id *oid,
 
 static void update_refstatus(int *ref_status, int nr, uint32_t *bitmap)
 {
-	int i;
+	unsigned int i;
 	if (!ref_status)
 		return;
 	for (i = 0; i < nr; i++)
-		if (bitmap[i / 32] & (1 << (i % 32)))
+		if (bitmap[i / 32] & (1U << (i % 32)))
 			ref_status[i]++;
 }
 
-- 
2.8.2.524.g6ff3d78


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 6/6] shallow.c: remove useless code
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
                     ` (4 preceding siblings ...)
  2016-12-06 12:53   ` [PATCH v2 5/6] shallow.c: bit manipulation tweaks Nguyễn Thái Ngọc Duy
@ 2016-12-06 12:53   ` Nguyễn Thái Ngọc Duy
  2016-12-06 13:42   ` [PATCH v2 0/6] shallow.c improvements Jeff King
  6 siblings, 0 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-12-06 12:53 UTC (permalink / raw)
  To: git; +Cc: rv, Jeff King, Nguyễn Thái Ngọc Duy

Some context before we talk about the removed code.

This paint_down() is part of step 6 of 58babff (shallow.c: the 8 steps
to select new commits for .git/shallow - 2013-12-05). When we fetch from
a shallow repository, we need to know if one of the new/updated refs
needs new "shallow commits" in .git/shallow (because we don't have
enough history of those refs) and which one.

The question at step 6 is, what (new) shallow commits are required in
other to maintain reachability throughout the repository _without_
cutting our history short? To answer, we mark all commits reachable from
existing refs with UNINTERESTING ("rev-list --not --all"), mark shallow
commits with BOTTOM, then for each new/updated refs, walk through the
commit graph until we either hit UNINTERESTING or BOTTOM, marking the
ref on the commit as we walk.

After all the walking is done, we check the new shallow commits. If we
have not seen any new ref marked on a new shallow commit, we know all
new/updated refs are reachable using just our history and .git/shallow.
The shallow commit in question is not needed and can be thrown away.

So, the code.

The loop here (to walk through commits) is basically

1.  get one commit from the queue
2.  ignore if it's SEEN or UNINTERESTING
3.  mark it
4.  go through all the parents and..
5a. mark it if it's never marked before
5b. put it back in the queue

What we do in this patch is drop step 5a because it is not
necessary. The commit being marked at 5a is put back on the queue, and
will be marked at step 3 at the next iteration. The only case it will
not be marked is when the commit is already marked UNINTERESTING (5a
does not check this), which will be ignored at step 2.

But we don't care about refs marking on UNINTERESTING. We care about the
marking on _shallow commits_ that are not reachable from our current
history (and having UNINTERESTING on it means it's reachable). So it's
ok for an UNINTERESTING not to be ref-marked.

Reported-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 shallow.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/shallow.c b/shallow.c
index beb967e..11f7dde 100644
--- a/shallow.c
+++ b/shallow.c
@@ -512,12 +512,8 @@ static void paint_down(struct paint_info *info, const unsigned char *sha1,
 			    oid_to_hex(&c->object.oid));
 
 		for (p = c->parents; p; p = p->next) {
-			uint32_t **p_refs = ref_bitmap_at(&info->ref_bitmap,
-							  p->item);
 			if (p->item->object.flags & SEEN)
 				continue;
-			if (*p_refs == NULL || *p_refs == *refs)
-				*p_refs = *refs;
 			commit_list_insert(p->item, &head);
 		}
 	}
-- 
2.8.2.524.g6ff3d78


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/6] shallow.c improvements
  2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
                     ` (5 preceding siblings ...)
  2016-12-06 12:53   ` [PATCH v2 6/6] shallow.c: remove useless code Nguyễn Thái Ngọc Duy
@ 2016-12-06 13:42   ` Jeff King
  2016-12-06 13:47     ` Duy Nguyen
  6 siblings, 1 reply; 20+ messages in thread
From: Jeff King @ 2016-12-06 13:42 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, rv

On Tue, Dec 06, 2016 at 07:53:33PM +0700, Nguyễn Thái Ngọc Duy wrote:

> Nguyễn Thái Ngọc Duy (4):
>   shallow.c: rename fields in paint_info to better express their purposes
>   shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools
>   shallow.c: make paint_alloc slightly more robust
>   shallow.c: remove useless code
> 
> Rasmus Villemoes (2):
>   shallow.c: avoid theoretical pointer wrap-around
>   shallow.c: bit manipulation tweaks

The first 5 patches look obviously good to me. The naming changes in
paint_alloc() make things much clearer, and the fixes retained from
Rasmus are all obvious improvements.

The final one _seems_ reasonable after reading your explanation, but I
lack enough context to know whether or not there might be a corner case
that you're missing. I'm inclined to trust your assessment on it.

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/6] shallow.c improvements
  2016-12-06 13:42   ` [PATCH v2 0/6] shallow.c improvements Jeff King
@ 2016-12-06 13:47     ` Duy Nguyen
  2016-12-07 23:42       ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Duy Nguyen @ 2016-12-06 13:47 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Rasmus Villemoes

On Tue, Dec 6, 2016 at 8:42 PM, Jeff King <peff@peff.net> wrote:
> The final one _seems_ reasonable after reading your explanation, but I
> lack enough context to know whether or not there might be a corner case
> that you're missing. I'm inclined to trust your assessment on it.

Yeah I basically just wrote down my thoughts so somebody could maybe
spot something wrong. I'm going to think about it some more in the
next few days.
-- 
Duy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/6] shallow.c improvements
  2016-12-06 13:47     ` Duy Nguyen
@ 2016-12-07 23:42       ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2016-12-07 23:42 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Jeff King, Git Mailing List, Rasmus Villemoes

Duy Nguyen <pclouds@gmail.com> writes:

> On Tue, Dec 6, 2016 at 8:42 PM, Jeff King <peff@peff.net> wrote:
>> The final one _seems_ reasonable after reading your explanation, but I
>> lack enough context to know whether or not there might be a corner case
>> that you're missing. I'm inclined to trust your assessment on it.
>
> Yeah I basically just wrote down my thoughts so somebody could maybe
> spot something wrong. I'm going to think about it some more in the
> next few days.

In the meantime let me queue them as-is.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-12-07 23:43 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-02 20:31 [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Rasmus Villemoes
2016-12-02 20:31 ` [PATCH 2/4] shallow.c: avoid theoretical pointer wrap-around Rasmus Villemoes
2016-12-03  5:17   ` Jeff King
2016-12-02 20:31 ` [PATCH 3/4] shallow.c: bit manipulation tweaks Rasmus Villemoes
2016-12-03  5:21   ` Jeff King
2016-12-02 20:31 ` [PATCH 4/4] shallow.c: remove useless test Rasmus Villemoes
2016-12-03  5:24   ` Jeff King
2016-12-05 12:00     ` Duy Nguyen
2016-12-03  5:14 ` [PATCH 1/4] shallow.c: make paint_alloc slightly more robust Jeff King
2016-12-05 10:02   ` Duy Nguyen
2016-12-06 12:53 ` [PATCH v2 0/6] shallow.c improvements Nguyễn Thái Ngọc Duy
2016-12-06 12:53   ` [PATCH v2 1/6] shallow.c: rename fields in paint_info to better express their purposes Nguyễn Thái Ngọc Duy
2016-12-06 12:53   ` [PATCH v2 2/6] shallow.c: stop abusing COMMIT_SLAB_SIZE for paint_info's memory pools Nguyễn Thái Ngọc Duy
2016-12-06 12:53   ` [PATCH v2 3/6] shallow.c: make paint_alloc slightly more robust Nguyễn Thái Ngọc Duy
2016-12-06 12:53   ` [PATCH v2 4/6] shallow.c: avoid theoretical pointer wrap-around Nguyễn Thái Ngọc Duy
2016-12-06 12:53   ` [PATCH v2 5/6] shallow.c: bit manipulation tweaks Nguyễn Thái Ngọc Duy
2016-12-06 12:53   ` [PATCH v2 6/6] shallow.c: remove useless code Nguyễn Thái Ngọc Duy
2016-12-06 13:42   ` [PATCH v2 0/6] shallow.c improvements Jeff King
2016-12-06 13:47     ` Duy Nguyen
2016-12-07 23:42       ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).