git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] cache-tree: do not cache empty trees
@ 2011-02-05  8:30 Nguyễn Thái Ngọc Duy
  2011-02-05  9:50 ` Nguyễn Thái Ngọc Duy
  0 siblings, 1 reply; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-02-05  8:30 UTC (permalink / raw)
  To: git, Ilari Liusvaara
  Cc: Jakub Narebski, Jonathan Nieder, Dmitry S. Kravtsov, Shawn Pearce,
	Nguyễn Thái Ngọc Duy

Current index does not support empty trees. But users can construct
empty trees directly using plumbing. When empty trees are checked out,
things become inconsistent:

 - If cache-tree somehow is invalidated, when a tree is read to index,
   empty trees disappear. When we write trees back, empty trees will
   be gone.

 - If cache-tree is generated by read-tree and remains valid by the
   time trees are written back, empty trees remain.

Let's do it in a consistent way, always disregard empty trees in
index. If users choose to create empty trees their own way, they
should not use index at all.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 On Wed, Feb 2, 2011 at 2:09 AM, Ilari Liusvaara <ilari.liusvaara@elisanet.fi> wrote:
 > Yes, writing to index/working area. IIRC, having such entry in tree causes
 > a "ghost directory". I don't exactly recall what such thing broke, but I
 > remember that it broke something (merging?)...
 >
 > Those ghosts also had annoying tendency to persist between commits. Commits
 > didn't kill them. Rm didn't work. You had to create something on top/inside to
 > get rid of them.

 That's probably because of cache-tree. Empty trees can't exist in
 index so an operation "trees -> index -> trees" will remove empty
 trees.

 But read-tree can preserve the exact structure of original trees
 (including empty trees) so if that particular path is untouched,
 empty trees will remain.

 Perhaps a patch like this for pre-1.8.0?

 cache-tree.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index f755590..f717793 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -621,6 +621,8 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
 			struct tree *subtree = lookup_tree(entry.sha1);
 			if (!subtree->object.parsed)
 				parse_tree(subtree);
+			if (!hashcmp(entry.sha1, (unsigned char *)EMPTY_TREE_SHA1_BIN))
+				continue;
 			sub = cache_tree_sub(it, entry.path);
 			sub->cache_tree = cache_tree();
 			prime_cache_tree_rec(sub->cache_tree, subtree);
-- 
1.7.3.4.878.g439c7

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH] cache-tree: do not cache empty trees
  2011-02-05  8:30 [PATCH] cache-tree: do not cache empty trees Nguyễn Thái Ngọc Duy
@ 2011-02-05  9:50 ` Nguyễn Thái Ngọc Duy
  2011-02-05 10:14   ` Jonathan Nieder
  2011-02-05 14:07   ` Nguyễn Thái Ngọc Duy
  0 siblings, 2 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-02-05  9:50 UTC (permalink / raw)
  To: git, Ilari Liusvaara
  Cc: Jakub Narebski, Jonathan Nieder, Dmitry S. Kravtsov, Shawn Pearce,
	Nguyễn Thái Ngọc Duy

Current index does not support empty trees. But users can construct
empty trees directly using plumbing. When empty trees are checked out,
things become inconsistent:

 - If cache-tree somehow is invalidated, when a tree is read to index,
   empty trees disappear. When we write trees back, empty trees will
   be gone.

 - If cache-tree is generated by read-tree and remains valid by the
   time trees are written back, empty trees remain.

Let's do it in a consistent way, always disregard empty trees in
index. If users choose to create empty trees their own way, they
should not use index at all.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Scratch the first version. This one actually works.

 cache-tree.c               |    9 +++++++++
 t/t1013-read-tree-empty.sh |   38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 0 deletions(-)
 create mode 100755 t/t1013-read-tree-empty.sh

diff --git a/cache-tree.c b/cache-tree.c
index f755590..03732ad 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -621,9 +621,18 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
 			struct tree *subtree = lookup_tree(entry.sha1);
 			if (!subtree->object.parsed)
 				parse_tree(subtree);
+			if (!hashcmp(entry.sha1, (unsigned char *)EMPTY_TREE_SHA1_BIN)) {
+				warning("empty tree detected! Will be removed from new commits");
+				cnt = -1;
+				break;
+			}
 			sub = cache_tree_sub(it, entry.path);
 			sub->cache_tree = cache_tree();
 			prime_cache_tree_rec(sub->cache_tree, subtree);
+			if (sub->cache_tree->entry_count == -1) {
+				cnt = -1;
+				break;
+			}
 			cnt += sub->cache_tree->entry_count;
 		}
 	}
diff --git a/t/t1013-read-tree-empty.sh b/t/t1013-read-tree-empty.sh
new file mode 100755
index 0000000..a9279f0
--- /dev/null
+++ b/t/t1013-read-tree-empty.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+test_description='read-tree with empty trees'
+
+. ./test-lib.sh
+
+T1=f4ec99e8174c01eab488469b4c2680500bbb18da
+T2=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+
+test_expect_success 'setup' '
+	printf "40000 empty\0\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04" >newtree &&
+	git hash-object -w -t tree newtree >actual &&
+	echo $T1 >expected
+	test_cmp expected actual
+'
+
+test_expect_success 'ls-tree T1 (with empty tree)' '
+	git ls-tree $T1 >actual &&
+	cat <<EOF >expected &&
+040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904	empty
+EOF
+	test_cmp expected actual
+'
+
+test_expect_success 'write-tree removes empty tree' '
+	git read-tree "$T1" &&
+	git write-tree >actual
+	echo $T2 >expected
+	test_cmp expected actual
+'
+
+test_expect_success 'ls-tree T2 (without empty tree)' '
+	git ls-tree $T2 >actual &&
+	: >expected &&
+	test_cmp expected actual
+'
+
+test_done
-- 
1.7.3.4.878.g439c7

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-05  9:50 ` Nguyễn Thái Ngọc Duy
@ 2011-02-05 10:14   ` Jonathan Nieder
  2011-02-05 10:32     ` Nguyen Thai Ngoc Duy
  2011-02-05 14:07   ` Nguyễn Thái Ngọc Duy
  1 sibling, 1 reply; 20+ messages in thread
From: Jonathan Nieder @ 2011-02-05 10:14 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Ilari Liusvaara, Jakub Narebski, Dmitry S. Kravtsov,
	Shawn Pearce

Hi,

Some quick nits to save myself time.  The basic idea of the patch
seems sound.

Nguyễn Thái Ngọc Duy wrote:

> --- /dev/null
> +++ b/t/t1013-read-tree-empty.sh
> @@ -0,0 +1,38 @@
> +#!/bin/sh
> +
> +test_description='read-tree with empty trees'
> +
> +. ./test-lib.sh
> +
> +T1=f4ec99e8174c01eab488469b4c2680500bbb18da
> +T2=4b825dc642cb6eb9a060e54bf8d69288fbee4904

What are these trees?  Do they need to be hardcoded?

> +
> +test_expect_success 'setup' '
> +	printf "40000 empty\0\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04" >newtree &&

printf '\xab' is unfortunately unportable.  I suppose
this should rather say something like

 test_unequal () {
	printf '%s\n' "$1" >bad &&
	printf '%s\n' "$2" >actual &&
	! test_cmp bad actual
 }

	empty_tree=$(git mktree </dev/null) &&
	tree_with_empty_subtree=$(
		echo "040000 tree $empty_tree	empty" |
		git mktree
	) &&
	test_unequal "$empty_tree" "$tree_with_empty_subtree"

> +test_expect_success 'ls-tree T1 (with empty tree)' '
> +	git ls-tree $T1 >actual &&
> +	cat <<EOF >expected &&
> +040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904	empty
> +EOF
> +	test_cmp expected actual
> +'

	echo "040000 tree $empty_tree	empty" >expect &&
	git ls-tree "$tree_with_empty_subtree" >actual &&
	test_cmp expect actual

> +
> +test_expect_success 'write-tree removes empty tree' '
> +	git read-tree "$T1" &&
> +	git write-tree >actual
> +	echo $T2 >expected
> +	test_cmp expected actual
> +'

	git read-tree "$tree_with_empty_subtree" &&
	...

Sane?
Jonathan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-05 10:14   ` Jonathan Nieder
@ 2011-02-05 10:32     ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 20+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-05 10:32 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ilari Liusvaara, Jakub Narebski, Dmitry S. Kravtsov,
	Shawn Pearce

2011/2/5 Jonathan Nieder <jrnieder@gmail.com>:
> Nguyễn Thái Ngọc Duy wrote:
>> +
>> +T1=f4ec99e8174c01eab488469b4c2680500bbb18da
>> +T2=4b825dc642cb6eb9a060e54bf8d69288fbee4904
>
> What are these trees?  Do they need to be hardcoded?

I have uneasy feeling constructing a tree manually. Hardcoding it (or
less automatic check) makes me feel better. I'm cooking a patch to
make hash-object reject malformed trees/commits/tags. Then I'll remove
these SHA-1.

>> +
>> +test_expect_success 'setup' '
>> +     printf "40000 empty\0\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04" >newtree &&
>
> printf '\xab' is unfortunately unportable.  I suppose
> this should rather say something like
>
>  test_unequal () {
>        printf '%s\n' "$1" >bad &&
>        printf '%s\n' "$2" >actual &&
>        ! test_cmp bad actual
>  }
>
>        empty_tree=$(git mktree </dev/null) &&
>        tree_with_empty_subtree=$(
>                echo "040000 tree $empty_tree   empty" |
>                git mktree
>        ) &&
>        test_unequal "$empty_tree" "$tree_with_empty_subtree"

Hah! Was wondering how the heck they could create trees manually. So
it's mktree. Thanks!

>        ...
>
> Sane?

Don't know. But it's definitely saner than my version.
-- 
Duy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] cache-tree: do not cache empty trees
  2011-02-05  9:50 ` Nguyễn Thái Ngọc Duy
  2011-02-05 10:14   ` Jonathan Nieder
@ 2011-02-05 14:07   ` Nguyễn Thái Ngọc Duy
  2011-02-07  2:09     ` Junio C Hamano
  2011-02-07  9:17     ` [PATCH] cache-tree: do not cache empty trees Jonathan Nieder
  1 sibling, 2 replies; 20+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2011-02-05 14:07 UTC (permalink / raw)
  To: git, Ilari Liusvaara
  Cc: Jakub Narebski, Jonathan Nieder, Dmitry S. Kravtsov, Shawn Pearce,
	Nguyễn Thái Ngọc Duy

Current index does not support empty trees. But users can construct
empty trees directly using plumbing. When empty trees are checked out,
things become inconsistent:

 - If cache-tree somehow is invalidated, when a tree is read to index,
   empty trees disappear. When we write trees back, empty trees will
   be gone.

 - If cache-tree is generated by read-tree and remains valid by the
   time trees are written back, empty trees remain.

Let's do it in a consistent way, always disregard empty trees in
index. If users choose to create empty trees their own way, they
should not use index at all.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Third version. Cleaned up t1013. Also needs [1] for mktree to work
 with empty trees.

 [1] http://mid.gmane.org/1296914582-619-1-git-send-email-pclouds@gmail.com

 cache-tree.c               |    9 +++++++++
 t/t1013-read-tree-empty.sh |   20 ++++++++++++++++++++
 2 files changed, 29 insertions(+), 0 deletions(-)
 create mode 100755 t/t1013-read-tree-empty.sh

diff --git a/cache-tree.c b/cache-tree.c
index f755590..03732ad 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -621,9 +621,18 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
 			struct tree *subtree = lookup_tree(entry.sha1);
 			if (!subtree->object.parsed)
 				parse_tree(subtree);
+			if (!hashcmp(entry.sha1, (unsigned char *)EMPTY_TREE_SHA1_BIN)) {
+				warning("empty tree detected! Will be removed in new commits");
+				cnt = -1;
+				break;
+			}
 			sub = cache_tree_sub(it, entry.path);
 			sub->cache_tree = cache_tree();
 			prime_cache_tree_rec(sub->cache_tree, subtree);
+			if (sub->cache_tree->entry_count == -1) {
+				cnt = -1;
+				break;
+			}
 			cnt += sub->cache_tree->entry_count;
 		}
 	}
diff --git a/t/t1013-read-tree-empty.sh b/t/t1013-read-tree-empty.sh
new file mode 100755
index 0000000..8d2ab97
--- /dev/null
+++ b/t/t1013-read-tree-empty.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+
+test_description='read-tree with empty trees'
+
+. ./test-lib.sh
+
+EMPTY_TREE=4b825dc642cb6eb9a060e54bf8d69288fbee4904
+
+test_expect_success 'setup' '
+	echo "040000 tree $EMPTY_TREE	empty" | git mktree >tree
+'
+
+test_expect_success 'write-tree removes empty tree' '
+	git read-tree `cat tree` &&
+	git write-tree >actual
+	echo $EMPTY_TREE >expected
+	test_cmp expected actual
+'
+
+test_done
-- 
1.7.3.4.878.g439c7

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-05 14:07   ` Nguyễn Thái Ngọc Duy
@ 2011-02-07  2:09     ` Junio C Hamano
  2011-02-07  2:36       ` Nguyen Thai Ngoc Duy
  2011-02-07  8:17       ` [PATCH] correct type of EMPTY_TREE_SHA1_BIN Jonathan Nieder
  2011-02-07  9:17     ` [PATCH] cache-tree: do not cache empty trees Jonathan Nieder
  1 sibling, 2 replies; 20+ messages in thread
From: Junio C Hamano @ 2011-02-07  2:09 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Ilari Liusvaara, Jakub Narebski, Jonathan Nieder,
	Dmitry S. Kravtsov, Shawn Pearce

> diff --git a/cache-tree.c b/cache-tree.c
> index f755590..03732ad 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -621,9 +621,18 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
>  			struct tree *subtree = lookup_tree(entry.sha1);
>  			if (!subtree->object.parsed)
>  				parse_tree(subtree);
> +			if (!hashcmp(entry.sha1, (unsigned char *)EMPTY_TREE_SHA1_BIN)) {
> +				warning("empty tree detected! Will be removed in new commits");
> +				cnt = -1;
> +				break;
> +			}

You shouldn't need the cast (if you did, then hashcmp() macro should be
fixed so that you don't need to).

I don't think warning() is warranted for an operation you introduced to
keep the internal data structure consistent.

Should this comparison done after we parsed the subtree, or should we be
doing that before it?

If you are adding this new check to a point where we have already parsed
the subtree object, don't you have a better and cheaper way to detect if
the subtree is empty than the 20-byte comparision, namely, perhaps by
looking at subtree->size?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-07  2:09     ` Junio C Hamano
@ 2011-02-07  2:36       ` Nguyen Thai Ngoc Duy
  2011-02-07  8:17       ` [PATCH] correct type of EMPTY_TREE_SHA1_BIN Jonathan Nieder
  1 sibling, 0 replies; 20+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-07  2:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ilari Liusvaara, Jakub Narebski, Jonathan Nieder,
	Dmitry S. Kravtsov, Shawn Pearce

2011/2/7 Junio C Hamano <gitster@pobox.com>:
>> diff --git a/cache-tree.c b/cache-tree.c
>> index f755590..03732ad 100644
>> --- a/cache-tree.c
>> +++ b/cache-tree.c
>> @@ -621,9 +621,18 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
>>                       struct tree *subtree = lookup_tree(entry.sha1);
>>                       if (!subtree->object.parsed)
>>                               parse_tree(subtree);
>> +                     if (!hashcmp(entry.sha1, (unsigned char *)EMPTY_TREE_SHA1_BIN)) {
>> +                             warning("empty tree detected! Will be removed in new commits");
>> +                             cnt = -1;
>> +                             break;
>> +                     }
>
> You shouldn't need the cast (if you did, then hashcmp() macro should be
> fixed so that you don't need to).

Or perhaps EMPTY_TREE_SHA1_BIN should be casted to const unsigned char
*. That would eliminate 4 typecastings elsewhere.

> I don't think warning() is warranted for an operation you introduced to
> keep the internal data structure consistent.

Worse. I don't think users know an empty tree is added or removed.
diff-tree does not show it (or should not, I haven't tested).

> Should this comparison done after we parsed the subtree, or should we be
> doing that before it?
>
> If you are adding this new check to a point where we have already parsed
> the subtree object, don't you have a better and cheaper way to detect if
> the subtree is empty than the 20-byte comparision, namely, perhaps by
> looking at subtree->size?

OK before is better.
-- 
Duy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] correct type of EMPTY_TREE_SHA1_BIN
  2011-02-07  2:09     ` Junio C Hamano
  2011-02-07  2:36       ` Nguyen Thai Ngoc Duy
@ 2011-02-07  8:17       ` Jonathan Nieder
  2011-02-09 23:33         ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Jonathan Nieder @ 2011-02-07  8:17 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, git, Ilari Liusvaara,
	Jakub Narebski, Dmitry S. Kravtsov, Shawn Pearce

Junio C Hamano wrote:

>> --- a/cache-tree.c
>> +++ b/cache-tree.c
>> @@ -621,9 +621,18 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
>>  			struct tree *subtree = lookup_tree(entry.sha1);
>>  			if (!subtree->object.parsed)
>>  				parse_tree(subtree);
>> +			if (!hashcmp(entry.sha1, (unsigned char *)EMPTY_TREE_SHA1_BIN)) {
>> +				warning("empty tree detected! Will be removed in new commits");
>> +				cnt = -1;
>> +				break;
>> +			}
>
> You shouldn't need the cast (if you did, then hashcmp() macro should be
> fixed so that you don't need to).

Isn't this a bug in the definition of EMPTY_TREE_SHA1_BIN rather than
the signature of hashcmp?

-- 8< --
Subject: correct type of EMPTY_TREE_SHA1_BIN

Functions such as hashcmp that expect a binary SHA-1 value take
parameters of type "unsigned char *" to avoid accepting a textual
SHA-1 passed by mistake.  Unfortunately, this means passing the string
literal EMPTY_TREE_SHA1_BIN requires an ugly cast.  Tweak the
definition of EMPTY_TREE_SHA1_BIN to produce a value of more
convenient type.

In the future the definition might change to

	extern const unsigned char empty_tree_sha1_bin[20];
	#define EMPTY_TREE_SHA1_BIN empty_tree_sha1_bin

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 builtin/checkout.c |    2 +-
 cache.h            |    4 +++-
 notes-merge.c      |    2 +-
 sha1_file.c        |    2 +-
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 757f9a0..cd7f56e 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -404,7 +404,7 @@ static int merge_working_tree(struct checkout_opts *opts,
 		topts.dir->exclude_per_dir = ".gitignore";
 		tree = parse_tree_indirect(old->commit ?
 					   old->commit->object.sha1 :
-					   (unsigned char *)EMPTY_TREE_SHA1_BIN);
+					   EMPTY_TREE_SHA1_BIN);
 		init_tree_desc(&trees[0], tree->buffer, tree->size);
 		tree = parse_tree_indirect(new->commit->object.sha1);
 		init_tree_desc(&trees[1], tree->buffer, tree->size);
diff --git a/cache.h b/cache.h
index d83d68c..3abf895 100644
--- a/cache.h
+++ b/cache.h
@@ -676,9 +676,11 @@ static inline void hashclr(unsigned char *hash)
 
 #define EMPTY_TREE_SHA1_HEX \
 	"4b825dc642cb6eb9a060e54bf8d69288fbee4904"
-#define EMPTY_TREE_SHA1_BIN \
+#define EMPTY_TREE_SHA1_BIN_LITERAL \
 	 "\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60" \
 	 "\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04"
+#define EMPTY_TREE_SHA1_BIN \
+	 ((const unsigned char *) EMPTY_TREE_SHA1_BIN_LITERAL)
 
 int git_mkstemp(char *path, size_t n, const char *template);
 
diff --git a/notes-merge.c b/notes-merge.c
index 71c4d45..1467ad3 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -615,7 +615,7 @@ int notes_merge(struct notes_merge_options *o,
 	bases = get_merge_bases(local, remote, 1);
 	if (!bases) {
 		base_sha1 = null_sha1;
-		base_tree_sha1 = (unsigned char *)EMPTY_TREE_SHA1_BIN;
+		base_tree_sha1 = EMPTY_TREE_SHA1_BIN;
 		OUTPUT(o, 4, "No merge base found; doing history-less merge");
 	} else if (!bases->next) {
 		base_sha1 = bases->item->object.sha1;
diff --git a/sha1_file.c b/sha1_file.c
index d86a8db..b44fc95 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2048,7 +2048,7 @@ static struct cached_object {
 static int cached_object_nr, cached_object_alloc;
 
 static struct cached_object empty_tree = {
-	EMPTY_TREE_SHA1_BIN,
+	EMPTY_TREE_SHA1_BIN_LITERAL,
 	OBJ_TREE,
 	"",
 	0
-- 
1.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-05 14:07   ` Nguyễn Thái Ngọc Duy
  2011-02-07  2:09     ` Junio C Hamano
@ 2011-02-07  9:17     ` Jonathan Nieder
  2011-02-07  9:57       ` Nguyen Thai Ngoc Duy
  1 sibling, 1 reply; 20+ messages in thread
From: Jonathan Nieder @ 2011-02-07  9:17 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Ilari Liusvaara, Jakub Narebski, Dmitry S. Kravtsov,
	Shawn Pearce

Nguyễn Thái Ngọc Duy wrote:

> Let's do it in a consistent way, always disregard empty trees in
> index. If users choose to create empty trees their own way, they
> should not use index at all.

While this violates some seeming invariants, like

1.
	git reset --hard
	git commit --allow-empty
	git rev-parse HEAD^^{tree} >expect
	git rev-parse HEAD^{tree} >actual
	test_cmp expect actual

2.
	git reset --hard
	git revert HEAD
	if git rev-parse HEAD~2
	then
		git rev-parse HEAD~2^{tree} >expect
		git rev-parse HEAD^{tree} >actual
		test_cmp expect actual
	fi

, I think it's a good change.  Malformed modes in trees already break
those false invariants iiuc.

Thanks.

> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -621,9 +621,18 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
>  			struct tree *subtree = lookup_tree(entry.sha1);
>  			if (!subtree->object.parsed)
>  				parse_tree(subtree);
> +			if (!hashcmp(entry.sha1, (unsigned char *)EMPTY_TREE_SHA1_BIN)) {
> +				warning("empty tree detected! Will be removed in new commits");
> +				cnt = -1;
> +				break;
> +			}

Aside from the warning, this part is an optimization, right?

>  			sub = cache_tree_sub(it, entry.path);
>  			sub->cache_tree = cache_tree();
>  			prime_cache_tree_rec(sub->cache_tree, subtree);
> +			if (sub->cache_tree->entry_count == -1) {
> +				cnt = -1;
> +				break;
> +			}

Would be nice to include a test for this, like so:

	subdir/
		empty1/
		subsubdir/
			empty2/
			empty3/

> --- /dev/null
> +++ b/t/t1013-read-tree-empty.sh
> @@ -0,0 +1,20 @@
> +#!/bin/sh
> +
> +test_description='read-tree with empty trees'
> +
> +. ./test-lib.sh
> +
> +EMPTY_TREE=4b825dc642cb6eb9a060e54bf8d69288fbee4904

If we _have_ to hard-code this (why?) then I'd prefer to do so
in test-lib.sh.

[...]
> +test_expect_success 'write-tree removes empty tree' '
> +	git read-tree `cat tree` &&
> +	git write-tree >actual
> +	echo $EMPTY_TREE >expected
> +	test_cmp expected actual

Broken &&-chain.  Not sure why this test relies on virtual objects
and another (independently nice) patch instead of adding the empty
tree to the object db --- is there something subtle I am missing?

The test does not distinguish between success due to git read-tree
omitting empty trees and success due to git mktree omitting empty
trees.

Hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-07  9:17     ` [PATCH] cache-tree: do not cache empty trees Jonathan Nieder
@ 2011-02-07  9:57       ` Nguyen Thai Ngoc Duy
  2011-02-07 12:18         ` Ilari Liusvaara
  2011-02-07 20:48         ` Junio C Hamano
  0 siblings, 2 replies; 20+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-07  9:57 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ilari Liusvaara, Jakub Narebski, Dmitry S. Kravtsov,
	Shawn Pearce

On Mon, Feb 07, 2011 at 03:17:40AM -0600, Jonathan Nieder wrote:
> While this violates some seeming invariants, like
> 
> 1.
> 	git reset --hard
> 	git commit --allow-empty
> 	git rev-parse HEAD^^{tree} >expect
> 	git rev-parse HEAD^{tree} >actual
> 	test_cmp expect actual
> 
> 2.
> 	git reset --hard
> 	git revert HEAD
> 	if git rev-parse HEAD~2
> 	then
> 		git rev-parse HEAD~2^{tree} >expect
> 		git rev-parse HEAD^{tree} >actual
> 		test_cmp expect actual
> 	fi
> 
> , I think it's a good change.  Malformed modes in trees already break
> those false invariants iiuc.

Perhaps it's not a good approach after all. What I wanted was to make
pre-1.8.0 tolerate empty trees created by 1.8.0. Perhaps it's better
to just let pre-1.8.0 refuse to work with empty trees, forcing users
to upgrade to 1.8.0?

The (untested) patch below would make git refuse to create an index
from a tree that contains empty trees. Hmm?

diff --git a/cache-tree.c b/cache-tree.c
index f755590..e33998a 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -619,6 +619,8 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
 		else {
 			struct cache_tree_sub *sub;
 			struct tree *subtree = lookup_tree(entry.sha1);
+			if (!hashcmp(entry.sha1, EMPTY_TREE_SHA1_BIN))
+				die("empty tree .../%s detected!", entry.path);
 			if (!subtree->object.parsed)
 				parse_tree(subtree);
 			sub = cache_tree_sub(it, entry.path);
diff --git a/unpack-trees.c b/unpack-trees.c
index 1ca41b1..0e6738e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -434,6 +434,7 @@ static int traverse_trees_recursive(int n, unsigned long dirmask, unsigned long
 	void *buf[MAX_UNPACK_TREES];
 	struct traverse_info newinfo;
 	struct name_entry *p;
+	struct unpack_trees_options *o = info->data;
 
 	p = names;
 	while (!p->mode)
@@ -447,8 +448,11 @@ static int traverse_trees_recursive(int n, unsigned long dirmask, unsigned long
 
 	for (i = 0; i < n; i++, dirmask >>= 1) {
 		const unsigned char *sha1 = NULL;
-		if (dirmask & 1)
+		if (dirmask & 1) {
 			sha1 = names[i].sha1;
+			if (o->merge && !hashcmp(sha1, EMPTY_TREE_SHA1_BIN))
+				return error("empty tree .../%s detected!", p->path);
+		}
 		buf[i] = fill_tree_descriptor(t+i, sha1);
 	}
 

-- 
Duy

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-07  9:57       ` Nguyen Thai Ngoc Duy
@ 2011-02-07 12:18         ` Ilari Liusvaara
  2011-02-07 12:29           ` Nguyen Thai Ngoc Duy
  2011-02-07 12:32           ` Jonathan Nieder
  2011-02-07 20:48         ` Junio C Hamano
  1 sibling, 2 replies; 20+ messages in thread
From: Ilari Liusvaara @ 2011-02-07 12:18 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Jonathan Nieder, git, Jakub Narebski, Dmitry S. Kravtsov,
	Shawn Pearce

On Mon, Feb 07, 2011 at 04:57:13PM +0700, Nguyen Thai Ngoc Duy wrote:
> 
> Perhaps it's not a good approach after all. What I wanted was to make
> pre-1.8.0 tolerate empty trees created by 1.8.0. Perhaps it's better
> to just let pre-1.8.0 refuse to work with empty trees, forcing users
> to upgrade to 1.8.0?
> 
> The (untested) patch below would make git refuse to create an index
> from a tree that contains empty trees. Hmm?
 
Remember, many distros ship with old versions of Git. Debian stable
is now at 1.7.2.3 (Squeeze became Debian 6.0) and it'll take years
before next release. What about these?

Making previous versions refuse to work with empty trees isn't exactly
trivial, as the tree parser seems to be written to be extremely
liberal on what it accepts.

-Ilari

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-07 12:18         ` Ilari Liusvaara
@ 2011-02-07 12:29           ` Nguyen Thai Ngoc Duy
  2011-02-07 12:32           ` Jonathan Nieder
  1 sibling, 0 replies; 20+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-07 12:29 UTC (permalink / raw)
  To: Ilari Liusvaara
  Cc: Jonathan Nieder, git, Jakub Narebski, Dmitry S. Kravtsov,
	Shawn Pearce

On Mon, Feb 7, 2011 at 7:18 PM, Ilari Liusvaara
<ilari.liusvaara@elisanet.fi> wrote:
> On Mon, Feb 07, 2011 at 04:57:13PM +0700, Nguyen Thai Ngoc Duy wrote:
>>
>> Perhaps it's not a good approach after all. What I wanted was to make
>> pre-1.8.0 tolerate empty trees created by 1.8.0. Perhaps it's better
>> to just let pre-1.8.0 refuse to work with empty trees, forcing users
>> to upgrade to 1.8.0?
>>
>> The (untested) patch below would make git refuse to create an index
>> from a tree that contains empty trees. Hmm?
>
> Remember, many distros ship with old versions of Git. Debian stable
> is now at 1.7.2.3 (Squeeze became Debian 6.0) and it'll take years
> before next release. What about these?

Waiting a few years is my best bet :P Really I don't figure out any
other way for migration. New empty trees would end up in repository
and affect all connected clients regardless version.

> Making previous versions refuse to work with empty trees isn't exactly
> trivial, as the tree parser seems to be written to be extremely
> liberal on what it accepts.

The repository _can_ contain empty trees. The main problem is index
not supporting empty trees. I only prevent index being used. If users
create new commits (including empty trees) with mktree and
commit-tree, they can also checkout without index using ls-tree and
cat-file.
-- 
Duy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-07 12:18         ` Ilari Liusvaara
  2011-02-07 12:29           ` Nguyen Thai Ngoc Duy
@ 2011-02-07 12:32           ` Jonathan Nieder
  1 sibling, 0 replies; 20+ messages in thread
From: Jonathan Nieder @ 2011-02-07 12:32 UTC (permalink / raw)
  To: Ilari Liusvaara
  Cc: Nguyen Thai Ngoc Duy, git, Jakub Narebski, Dmitry S. Kravtsov,
	Shawn Pearce

Ilari Liusvaara wrote:

> Remember, many distros ship with old versions of Git. Debian stable
> is now at 1.7.2.3 (Squeeze became Debian 6.0) and it'll take years
> before next release. What about these?

The next Debian point release is scheduled for a month from now. :)
There will probably be more point releases after that.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-07  9:57       ` Nguyen Thai Ngoc Duy
  2011-02-07 12:18         ` Ilari Liusvaara
@ 2011-02-07 20:48         ` Junio C Hamano
  2011-02-08  4:11           ` Nguyen Thai Ngoc Duy
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2011-02-07 20:48 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Jonathan Nieder, git, Ilari Liusvaara, Jakub Narebski,
	Dmitry S. Kravtsov, Shawn Pearce

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> Perhaps it's not a good approach after all. What I wanted was to make
> pre-1.8.0 tolerate empty trees created by 1.8.0. Perhaps it's better
> to just let pre-1.8.0 refuse to work with empty trees, forcing users
> to upgrade to 1.8.0?

I don't think we saw even something remotely resembles an agreement that
empty tree is a good thing to have yet.  Why are you rushing things?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-07 20:48         ` Junio C Hamano
@ 2011-02-08  4:11           ` Nguyen Thai Ngoc Duy
  2011-02-08  4:30             ` Jonathan Nieder
  2011-02-08 10:40             ` Ilari Liusvaara
  0 siblings, 2 replies; 20+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-08  4:11 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, git, Ilari Liusvaara, Jakub Narebski,
	Dmitry S. Kravtsov, Shawn Pearce

On Tue, Feb 8, 2011 at 3:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
>> Perhaps it's not a good approach after all. What I wanted was to make
>> pre-1.8.0 tolerate empty trees created by 1.8.0. Perhaps it's better
>> to just let pre-1.8.0 refuse to work with empty trees, forcing users
>> to upgrade to 1.8.0?
>
> I don't think we saw even something remotely resembles an agreement that
> empty tree is a good thing to have yet.  Why are you rushing things?

But empty trees are allowed in repo since 79b1138 (fsck.c: fix bogus
"empty tree" check). Index can't handle empty trees, so it's a bug to
me that index still accepts them as input and silently discard them. A
bug regardless directory tracking support in 1.8.0. A corner case that
nobody would likely encounter (except Ilari and his "ghost directory"
problem).
-- 
Duy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-08  4:11           ` Nguyen Thai Ngoc Duy
@ 2011-02-08  4:30             ` Jonathan Nieder
  2011-02-15 10:19               ` Yann Dirson
  2011-02-08 10:40             ` Ilari Liusvaara
  1 sibling, 1 reply; 20+ messages in thread
From: Jonathan Nieder @ 2011-02-08  4:30 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, git, Ilari Liusvaara, Jakub Narebski,
	Dmitry S. Kravtsov, Shawn Pearce, Yann Dirson

Nguyen Thai Ngoc Duy wrote:

> But empty trees are allowed in repo since 79b1138 (fsck.c: fix bogus
> "empty tree" check). Index can't handle empty trees, so it's a bug to
> me that index still accepts them as input and silently discard them.

FWIW my instinct points in the opposite direction.  I wouldn't mind
seeing fsck warn about trees containing empty subtrees[1].  As for
cache-tree, while it is not obvious what the right thing to do is,
discarding empty subtrees sounds accepatable.

For storage of empty subtrees in repos imported from svn, Yann's idea
of using .gitattributes somehow (maybe in the parent directory or
maybe in the subdir itself) seems oddly appealing.

Just my unproductive two cents,
Jonathan

[1] I suspect 79b1138 was only meant to deal with the "git commit
--allow-empty from newly initialized repo" case.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-08  4:11           ` Nguyen Thai Ngoc Duy
  2011-02-08  4:30             ` Jonathan Nieder
@ 2011-02-08 10:40             ` Ilari Liusvaara
  1 sibling, 0 replies; 20+ messages in thread
From: Ilari Liusvaara @ 2011-02-08 10:40 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, Jonathan Nieder, git, Jakub Narebski,
	Dmitry S. Kravtsov, Shawn Pearce

On Tue, Feb 08, 2011 at 11:11:19AM +0700, Nguyen Thai Ngoc Duy wrote:
> On Tue, Feb 8, 2011 at 3:48 AM, Junio C Hamano <gitster@pobox.com> wrote:

> bug regardless directory tracking support in 1.8.0. A corner case that
> nobody would likely encounter (except Ilari and his "ghost directory"
> problem).

Except it wasn't me who directly saw those problems. It was somebody else
on #git (back when I was there). That was a while ago. That's the reason
I don't exactly remember what exactly it did (possibly false modifications
causing some commands (merge/rebase) to abort?).

Hmm. I should test what exactly happens (including with older git versions
to see if those problems have (possibly unintentionally) got fixed).

-Ilari

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] correct type of EMPTY_TREE_SHA1_BIN
  2011-02-07  8:17       ` [PATCH] correct type of EMPTY_TREE_SHA1_BIN Jonathan Nieder
@ 2011-02-09 23:33         ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2011-02-09 23:33 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Nguyễn Thái Ngọc Duy, git, Ilari Liusvaara,
	Jakub Narebski, Dmitry S. Kravtsov, Shawn Pearce

Jonathan Nieder <jrnieder@gmail.com> writes:

> Junio C Hamano wrote:
>
>>> --- a/cache-tree.c
>>> +++ b/cache-tree.c
>>> @@ -621,9 +621,18 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
>>>  			struct tree *subtree = lookup_tree(entry.sha1);
>>>  			if (!subtree->object.parsed)
>>>  				parse_tree(subtree);
>>> +			if (!hashcmp(entry.sha1, (unsigned char *)EMPTY_TREE_SHA1_BIN)) {
>>> +				warning("empty tree detected! Will be removed in new commits");
>>> +				cnt = -1;
>>> +				break;
>>> +			}
>>
>> You shouldn't need the cast (if you did, then hashcmp() macro should be
>> fixed so that you don't need to).
>
> Isn't this a bug in the definition of EMPTY_TREE_SHA1_BIN rather than
> the signature of hashcmp?

Yeah, you are right.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-08  4:30             ` Jonathan Nieder
@ 2011-02-15 10:19               ` Yann Dirson
  2011-02-16 14:29                 ` Jakub Narebski
  0 siblings, 1 reply; 20+ messages in thread
From: Yann Dirson @ 2011-02-15 10:19 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Nguyen Thai Ngoc Duy, Junio C Hamano, git, Ilari Liusvaara,
	Jakub Narebski, Dmitry S. Kravtsov, Shawn Pearce

On Mon, 07 Feb 2011 22:30:00 -0600
Jonathan Nieder <jrnieder@gmail.com> wrote:
> For storage of empty subtrees in repos imported from svn, Yann's idea
> of using .gitattributes somehow (maybe in the parent directory or
> maybe in the subdir itself) seems oddly appealing.

In fact, I was mostly thinking about only using the
toplevel .gitattributes, if only to avoid the same sort
of clutter caused by the .gitignore-trick.

-- 
Yann Dirson - Bertin Technologies

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] cache-tree: do not cache empty trees
  2011-02-15 10:19               ` Yann Dirson
@ 2011-02-16 14:29                 ` Jakub Narebski
  0 siblings, 0 replies; 20+ messages in thread
From: Jakub Narebski @ 2011-02-16 14:29 UTC (permalink / raw)
  To: Yann Dirson
  Cc: Jonathan Nieder, Nguyen Thai Ngoc Duy, Junio C Hamano, git,
	Ilari Liusvaara, Dmitry S. Kravtsov, Shawn Pearce

On Tue, 15 Feb 2011, Yann Dirson wrote:
> On Mon, 07 Feb 2011 22:30:00 -0600
> Jonathan Nieder <jrnieder@gmail.com> wrote:

> > For storage of empty subtrees in repos imported from svn, Yann's idea
> > of using .gitattributes somehow (maybe in the parent directory or
> > maybe in the subdir itself) seems oddly appealing.
> 
> In fact, I was mostly thinking about only using the
> toplevel .gitattributes, if only to avoid the same sort
> of clutter caused by the .gitignore-trick.

I was thinking about configuration variable (to keep or automatically
delete empty directories), overridable per-pathname basis with 
.gitattributes, similarly to 'eof' situation.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2011-02-16 14:30 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-05  8:30 [PATCH] cache-tree: do not cache empty trees Nguyễn Thái Ngọc Duy
2011-02-05  9:50 ` Nguyễn Thái Ngọc Duy
2011-02-05 10:14   ` Jonathan Nieder
2011-02-05 10:32     ` Nguyen Thai Ngoc Duy
2011-02-05 14:07   ` Nguyễn Thái Ngọc Duy
2011-02-07  2:09     ` Junio C Hamano
2011-02-07  2:36       ` Nguyen Thai Ngoc Duy
2011-02-07  8:17       ` [PATCH] correct type of EMPTY_TREE_SHA1_BIN Jonathan Nieder
2011-02-09 23:33         ` Junio C Hamano
2011-02-07  9:17     ` [PATCH] cache-tree: do not cache empty trees Jonathan Nieder
2011-02-07  9:57       ` Nguyen Thai Ngoc Duy
2011-02-07 12:18         ` Ilari Liusvaara
2011-02-07 12:29           ` Nguyen Thai Ngoc Duy
2011-02-07 12:32           ` Jonathan Nieder
2011-02-07 20:48         ` Junio C Hamano
2011-02-08  4:11           ` Nguyen Thai Ngoc Duy
2011-02-08  4:30             ` Jonathan Nieder
2011-02-15 10:19               ` Yann Dirson
2011-02-16 14:29                 ` Jakub Narebski
2011-02-08 10:40             ` Ilari Liusvaara

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).