git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v4 0/4] nd/invalidate-i-t-a-cache-tree
@ 2012-12-16  4:15 Nguyễn Thái Ngọc Duy
  2012-12-16  4:15 ` [PATCH v4 1/4] cache-tree: remove dead i-t-a code in verify_cache() Nguyễn Thái Ngọc Duy
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-12-16  4:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This version also fixes the CE_REMOVE bug I mentioned. As a side
effect, the bug fix makes the i-t-a fix cleaner.

Nguyễn Thái Ngọc Duy (4):
  cache-tree: remove dead i-t-a code in verify_cache()
  cache-tree: replace "for" loops in update_one with "while" loops
  cache-tree: fix writing cache-tree when CE_REMOVE is present
  cache-tree: invalidate i-t-a paths after generating trees

 cache-tree.c          | 61 +++++++++++++++++++++++++++++++++++++--------------
 cache-tree.h          |  1 +
 t/t2203-add-intent.sh | 20 +++++++++++++++++
 3 files changed, 65 insertions(+), 17 deletions(-)

-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v4 1/4] cache-tree: remove dead i-t-a code in verify_cache()
  2012-12-16  4:15 [PATCH v4 0/4] nd/invalidate-i-t-a-cache-tree Nguyễn Thái Ngọc Duy
@ 2012-12-16  4:15 ` Nguyễn Thái Ngọc Duy
  2012-12-16  4:15 ` [PATCH v4 2/4] cache-tree: replace "for" loops in update_one with "while" loops Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-12-16  4:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This code is added in 331fcb5 (git add --intent-to-add: do not let an
empty blob be committed by accident - 2008-11-28) to forbid committing
when i-t-a entries are present. When we allow that, we forgot to
remove this.

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache-tree.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 28ed657..e2beab5 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -166,12 +166,8 @@ static int verify_cache(struct cache_entry **cache,
 				fprintf(stderr, "...\n");
 				break;
 			}
-			if (ce_stage(ce))
-				fprintf(stderr, "%s: unmerged (%s)\n",
-					ce->name, sha1_to_hex(ce->sha1));
-			else
-				fprintf(stderr, "%s: not added yet\n",
-					ce->name);
+			fprintf(stderr, "%s: unmerged (%s)\n",
+				ce->name, sha1_to_hex(ce->sha1));
 		}
 	}
 	if (funny)
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 2/4] cache-tree: replace "for" loops in update_one with "while" loops
  2012-12-16  4:15 [PATCH v4 0/4] nd/invalidate-i-t-a-cache-tree Nguyễn Thái Ngọc Duy
  2012-12-16  4:15 ` [PATCH v4 1/4] cache-tree: remove dead i-t-a code in verify_cache() Nguyễn Thái Ngọc Duy
@ 2012-12-16  4:15 ` Nguyễn Thái Ngọc Duy
  2012-12-16  4:15 ` [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present Nguyễn Thái Ngọc Duy
  2012-12-16  4:15 ` [PATCH v4 4/4] cache-tree: invalidate i-t-a paths after generating trees Nguyễn Thái Ngọc Duy
  3 siblings, 0 replies; 8+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-12-16  4:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

The loops in update_one can be increased in two different ways: step
by one for files and by <n> for directories. "for" loop is not
suitable for this as it always steps by one and special handling is
required for directories. Replace them with "while" loops for clarity.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache-tree.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index e2beab5..44eed28 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -259,7 +259,8 @@ static int update_one(struct cache_tree *it,
 	/*
 	 * Find the subtrees and update them.
 	 */
-	for (i = 0; i < entries; i++) {
+	i = 0;
+	while (i < entries) {
 		struct cache_entry *ce = cache[i];
 		struct cache_tree_sub *sub;
 		const char *path, *slash;
@@ -271,8 +272,10 @@ static int update_one(struct cache_tree *it,
 			break; /* at the end of this level */
 
 		slash = strchr(path + baselen, '/');
-		if (!slash)
+		if (!slash) {
+			i++;
 			continue;
+		}
 		/*
 		 * a/bbb/c (base = a/, slash = /c)
 		 * ==>
@@ -289,7 +292,7 @@ static int update_one(struct cache_tree *it,
 				    flags);
 		if (subcnt < 0)
 			return subcnt;
-		i += subcnt - 1;
+		i += subcnt;
 		sub->used = 1;
 	}
 
@@ -300,7 +303,8 @@ static int update_one(struct cache_tree *it,
 	 */
 	strbuf_init(&buffer, 8192);
 
-	for (i = 0; i < entries; i++) {
+	i = 0;
+	while (i < entries) {
 		struct cache_entry *ce = cache[i];
 		struct cache_tree_sub *sub;
 		const char *path, *slash;
@@ -320,7 +324,7 @@ static int update_one(struct cache_tree *it,
 			if (!sub)
 				die("cache-tree.c: '%.*s' in '%s' not found",
 				    entlen, path + baselen, path);
-			i += sub->cache_tree->entry_count - 1;
+			i += sub->cache_tree->entry_count;
 			sha1 = sub->cache_tree->sha1;
 			mode = S_IFDIR;
 		}
@@ -328,6 +332,7 @@ static int update_one(struct cache_tree *it,
 			sha1 = ce->sha1;
 			mode = ce->ce_mode;
 			entlen = pathlen - baselen;
+			i++;
 		}
 		if (mode != S_IFGITLINK && !missing_ok && !has_sha1_file(sha1)) {
 			strbuf_release(&buffer);
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present
  2012-12-16  4:15 [PATCH v4 0/4] nd/invalidate-i-t-a-cache-tree Nguyễn Thái Ngọc Duy
  2012-12-16  4:15 ` [PATCH v4 1/4] cache-tree: remove dead i-t-a code in verify_cache() Nguyễn Thái Ngọc Duy
  2012-12-16  4:15 ` [PATCH v4 2/4] cache-tree: replace "for" loops in update_one with "while" loops Nguyễn Thái Ngọc Duy
@ 2012-12-16  4:15 ` Nguyễn Thái Ngọc Duy
  2012-12-16  7:20   ` Junio C Hamano
  2012-12-16  4:15 ` [PATCH v4 4/4] cache-tree: invalidate i-t-a paths after generating trees Nguyễn Thái Ngọc Duy
  3 siblings, 1 reply; 8+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-12-16  4:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

entry_count is used in update_one() for two purposes:

1. to skip through the number of processed entries in in-memory index
2. to record the number of entries this cache-tree covers on disk

Unfortunately when CE_REMOVE is present these numbers are not the same
because CE_REMOVE entries are automatically removed before writing to
disk but entry_count is not adjusted and still counts CE_REMOVE
entries.

Separate the two use cases into two different variables. #1 is taken
care by the new field count in struct cache_tree_sub and entry_count
is prepared for #2.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache-tree.c | 30 +++++++++++++++++++++++-------
 cache-tree.h |  1 +
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 44eed28..2c10b2e 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -238,6 +238,7 @@ static int update_one(struct cache_tree *it,
 		      int entries,
 		      const char *base,
 		      int baselen,
+		      int *skip_count,
 		      int flags)
 {
 	struct strbuf buffer;
@@ -245,6 +246,8 @@ static int update_one(struct cache_tree *it,
 	int dryrun = flags & WRITE_TREE_DRY_RUN;
 	int i;
 
+	*skip_count = 0;
+
 	if (0 <= it->entry_count && has_sha1_file(it->sha1))
 		return it->entry_count;
 
@@ -264,7 +267,7 @@ static int update_one(struct cache_tree *it,
 		struct cache_entry *ce = cache[i];
 		struct cache_tree_sub *sub;
 		const char *path, *slash;
-		int pathlen, sublen, subcnt;
+		int pathlen, sublen, subcnt, subskip;
 
 		path = ce->name;
 		pathlen = ce_namelen(ce);
@@ -289,10 +292,13 @@ static int update_one(struct cache_tree *it,
 				    cache + i, entries - i,
 				    path,
 				    baselen + sublen + 1,
+				    &subskip,
 				    flags);
 		if (subcnt < 0)
 			return subcnt;
 		i += subcnt;
+		sub->count = subcnt; /* to be used in the next loop */
+		*skip_count += subskip;
 		sub->used = 1;
 	}
 
@@ -324,7 +330,7 @@ static int update_one(struct cache_tree *it,
 			if (!sub)
 				die("cache-tree.c: '%.*s' in '%s' not found",
 				    entlen, path + baselen, path);
-			i += sub->cache_tree->entry_count;
+			i += sub->count;
 			sha1 = sub->cache_tree->sha1;
 			mode = S_IFDIR;
 		}
@@ -340,8 +346,18 @@ static int update_one(struct cache_tree *it,
 				mode, sha1_to_hex(sha1), entlen+baselen, path);
 		}
 
-		if (ce->ce_flags & (CE_REMOVE | CE_INTENT_TO_ADD))
-			continue; /* entry being removed or placeholder */
+		/*
+		 * CE_REMOVE entries are removed before the index is
+		 * written to disk. Skip them to remain consistent
+		 * with the future on-disk index.
+		 */
+		if (ce->ce_flags & CE_REMOVE) {
+			*skip_count = *skip_count + 1;
+			continue;
+		}
+
+		if (ce->ce_flags & CE_INTENT_TO_ADD)
+			continue;
 
 		strbuf_grow(&buffer, entlen + 100);
 		strbuf_addf(&buffer, "%o %.*s%c", mode, entlen, path + baselen, '\0');
@@ -361,7 +377,7 @@ static int update_one(struct cache_tree *it,
 	}
 
 	strbuf_release(&buffer);
-	it->entry_count = i;
+	it->entry_count = i - *skip_count;
 #if DEBUG
 	fprintf(stderr, "cache-tree update-one (%d ent, %d subtree) %s\n",
 		it->entry_count, it->subtree_nr,
@@ -375,11 +391,11 @@ int cache_tree_update(struct cache_tree *it,
 		      int entries,
 		      int flags)
 {
-	int i;
+	int i, skip;
 	i = verify_cache(cache, entries, flags);
 	if (i)
 		return i;
-	i = update_one(it, cache, entries, "", 0, flags);
+	i = update_one(it, cache, entries, "", 0, &skip, flags);
 	if (i < 0)
 		return i;
 	return 0;
diff --git a/cache-tree.h b/cache-tree.h
index d8cb2e9..55d0f59 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -7,6 +7,7 @@
 struct cache_tree;
 struct cache_tree_sub {
 	struct cache_tree *cache_tree;
+	int count;		/* internally used by update_one() */
 	int namelen;
 	int used;
 	char name[FLEX_ARRAY];
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 4/4] cache-tree: invalidate i-t-a paths after generating trees
  2012-12-16  4:15 [PATCH v4 0/4] nd/invalidate-i-t-a-cache-tree Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2012-12-16  4:15 ` [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present Nguyễn Thái Ngọc Duy
@ 2012-12-16  4:15 ` Nguyễn Thái Ngọc Duy
  3 siblings, 0 replies; 8+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-12-16  4:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Intent-to-add entries used to forbid writing trees so it was not a
problem. After commit 3f6d56d (commit: ignore intent-to-add entries
instead of refusing - 2012-02-07), we can generate trees from an index
with i-t-a entries.

However, the commit forgets to invalidate all paths leading to i-t-a
entries. With fully valid cache-tree (e.g. after commit or
write-tree), diff operations may prefer cache-tree to index and not
see i-t-a entries in the index, because cache-tree does not have them.

Reported-by: Jonathon Mah <me@JonathonMah.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache-tree.c          | 14 ++++++++++++--
 t/t2203-add-intent.sh | 20 ++++++++++++++++++++
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 2c10b2e..37e4d00 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -244,6 +244,7 @@ static int update_one(struct cache_tree *it,
 	struct strbuf buffer;
 	int missing_ok = flags & WRITE_TREE_MISSING_OK;
 	int dryrun = flags & WRITE_TREE_DRY_RUN;
+	int to_invalidate = 0;
 	int i;
 
 	*skip_count = 0;
@@ -333,6 +334,8 @@ static int update_one(struct cache_tree *it,
 			i += sub->count;
 			sha1 = sub->cache_tree->sha1;
 			mode = S_IFDIR;
+			if (sub->cache_tree->entry_count < 0)
+				to_invalidate = 1;
 		}
 		else {
 			sha1 = ce->sha1;
@@ -356,8 +359,15 @@ static int update_one(struct cache_tree *it,
 			continue;
 		}
 
-		if (ce->ce_flags & CE_INTENT_TO_ADD)
+		/*
+		 * CE_INTENT_TO_ADD entries exist on on-disk index but
+		 * they are not part of generated trees. Invalidate up
+		 * to root to force cache-tree users to read elsewhere.
+		 */
+		if (ce->ce_flags & CE_INTENT_TO_ADD) {
+			to_invalidate = 1;
 			continue;
+		}
 
 		strbuf_grow(&buffer, entlen + 100);
 		strbuf_addf(&buffer, "%o %.*s%c", mode, entlen, path + baselen, '\0');
@@ -377,7 +387,7 @@ static int update_one(struct cache_tree *it,
 	}
 
 	strbuf_release(&buffer);
-	it->entry_count = i - *skip_count;
+	it->entry_count = to_invalidate ? -1 : i - *skip_count;
 #if DEBUG
 	fprintf(stderr, "cache-tree update-one (%d ent, %d subtree) %s\n",
 		it->entry_count, it->subtree_nr,
diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
index ec35409..2a4a749 100755
--- a/t/t2203-add-intent.sh
+++ b/t/t2203-add-intent.sh
@@ -62,5 +62,25 @@ test_expect_success 'can "commit -a" with an i-t-a entry' '
 	git commit -a -m all
 '
 
+test_expect_success 'cache-tree invalidates i-t-a paths' '
+	git reset --hard &&
+	mkdir dir &&
+	: >dir/foo &&
+	git add dir/foo &&
+	git commit -m foo &&
+
+	: >dir/bar &&
+	git add -N dir/bar &&
+	git diff --cached --name-only >actual &&
+	echo dir/bar >expect &&
+	test_cmp expect actual &&
+
+	git write-tree >/dev/null &&
+
+	git diff --cached --name-only >actual &&
+	echo dir/bar >expect &&
+	test_cmp expect actual
+'
+
 test_done
 
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present
  2012-12-16  4:15 ` [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present Nguyễn Thái Ngọc Duy
@ 2012-12-16  7:20   ` Junio C Hamano
  2012-12-16 10:15     ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2012-12-16  7:20 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> entry_count is used in update_one() for two purposes:
>
> 1. to skip through the number of processed entries in in-memory index
> 2. to record the number of entries this cache-tree covers on disk
>
> Unfortunately when CE_REMOVE is present these numbers are not the same
> because CE_REMOVE entries are automatically removed before writing to
> disk but entry_count is not adjusted and still counts CE_REMOVE
> entries.

Nicely explained.  I wonder if we can also add a piece of test to
the patch 4/4 to demonstrate the issue with CE_REMOVE entries,
though.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present
  2012-12-16  7:20   ` Junio C Hamano
@ 2012-12-16 10:15     ` Nguyen Thai Ngoc Duy
  2012-12-16 10:38       ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-12-16 10:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sun, Dec 16, 2012 at 2:20 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> entry_count is used in update_one() for two purposes:
>>
>> 1. to skip through the number of processed entries in in-memory index
>> 2. to record the number of entries this cache-tree covers on disk
>>
>> Unfortunately when CE_REMOVE is present these numbers are not the same
>> because CE_REMOVE entries are automatically removed before writing to
>> disk but entry_count is not adjusted and still counts CE_REMOVE
>> entries.
>
> Nicely explained.  I wonder if we can also add a piece of test to
> the patch 4/4 to demonstrate the issue with CE_REMOVE entries,
> though.

A hand crafted one, maybe. I did not attempt to recreate it with git
commands (and I don't think we update cache-tree after unpack_trees).
So I wrote something like this instead:

int main(int ac, char **av)
{
	unsigned char sha1[20];
	setup_git_directory();
	read_cache();
	active_cache[1]->ce_flags |= CE_REMOVE;
	write_cache_as_tree(sha1, 0, NULL);
	return 0;
}

I can polish it a bit and write new tests based on it and
test-dump-cache-tree if you want.
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present
  2012-12-16 10:15     ` Nguyen Thai Ngoc Duy
@ 2012-12-16 10:38       ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2012-12-16 10:38 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git



Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:

>On Sun, Dec 16, 2012 at 2:20 PM, Junio C Hamano <gitster@pobox.com>
>wrote:
>> Nicely explained.  I wonder if we can also add a piece of test to
>> the patch 4/4 to demonstrate the issue with CE_REMOVE entries,
>> though.
>
>A hand crafted one, maybe. I did not attempt to recreate it with git
>commands (and I don't think we update cache-tree after unpack_trees).

Yeah, that's what I thought. No need to bother creating a bug that won't appear in the real life :-) 

Tusks for sanity checking.
-- 
Pardon terseness, typo and HTML from a tablet.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-12-16 10:39 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-16  4:15 [PATCH v4 0/4] nd/invalidate-i-t-a-cache-tree Nguyễn Thái Ngọc Duy
2012-12-16  4:15 ` [PATCH v4 1/4] cache-tree: remove dead i-t-a code in verify_cache() Nguyễn Thái Ngọc Duy
2012-12-16  4:15 ` [PATCH v4 2/4] cache-tree: replace "for" loops in update_one with "while" loops Nguyễn Thái Ngọc Duy
2012-12-16  4:15 ` [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present Nguyễn Thái Ngọc Duy
2012-12-16  7:20   ` Junio C Hamano
2012-12-16 10:15     ` Nguyen Thai Ngoc Duy
2012-12-16 10:38       ` Junio C Hamano
2012-12-16  4:15 ` [PATCH v4 4/4] cache-tree: invalidate i-t-a paths after generating trees Nguyễn Thái Ngọc Duy

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).