[PATCH 0/2] Reorganize read-tree

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH 0/2] Reorganize read-tree
@ 2005-08-31  3:48 Daniel Barkalow
  2005-08-31  3:49 ` [PATCH 1/2] Object model additions for read-tree Daniel Barkalow
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-08-31  3:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

I got mostly done with this before Linus mentioned the possibility of
having multiple index entries in the same stage for a single path. I
finished it anyway, but I'm not sure that we won't want to know which of
the common ancestors contributed which, and, if some of them don't have a
path, we wouldn't be able to tell. The other advantages I see to this
approach are:

 - it uses the more common parser of tree objects, moving toward having
   only one (diff-cache still uses read_tree(), however).
 - it doesn't need to do very complicated things with the index; the
   original read-tree does a bunch of stuff with an index with a gap in
   the middle containing obsolete entries.
 - it uses a much simpler method of finding directory/file conflicts,
   which is possible because the struct trees represent directories as
   well as files.
 - it deals with each path completely before going on to the next one,
   instead of first dealing with each input tree and then dealing with
   each path.
 - it removes a lot of intimate knowledge of the index structure from the
   program.

The general idea is that it figures out what trees you want, and then
iterates through the entry lists together, recursing into directories, and
calls the merge function with an array of the index entries (not yet
added) for the path in each tree; the merge function adds the appropriate
things to the index.

Note that this set doesn't include calling merge functions with multiple
ancestors or remotes; that can be done when we've decided on whether my
version of read-tree is worth using.

There are various potential refinements, plus removing a bunch of memory
leaks, still to do, but I think this is sufficiently close to review.

(Refinements: it ought to have two indices in memory, the old and the new,
and never modify the old and only append to the new, to simplify things
further; it ought to use a sentinal value for the index entry to indicate
that there is something in the tree to conflict with there being a file at
the given path; the --emu23 logic could be clearer)

The first patch adds a few functions to the object library.
The second patch changes read-tree around; It is essentially a rewrite,
except for the merge functions and main().

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] Object model additions for read-tree
  2005-08-31  3:48 [PATCH 0/2] Reorganize read-tree Daniel Barkalow
@ 2005-08-31  3:49 ` Daniel Barkalow
  2005-08-31  3:49 ` [PATCH] Change read-tree to merge before using the index Daniel Barkalow
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-08-31  3:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

Adds object_list_append() and a function to get the struct tree from an ent.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
---

 object.c |   11 +++++++++++
 object.h |    3 +++
 tree.c   |   19 +++++++++++++++++++
 tree.h   |    3 +++
 4 files changed, 36 insertions(+), 0 deletions(-)

49d33c385aa69d17c991300f73e77c6718a2b4a6
diff --git a/object.c b/object.c
--- a/object.c
+++ b/object.c
@@ -184,6 +184,17 @@ struct object_list *object_list_insert(s
         return new_list;
 }

+void object_list_append(struct object *item,
+			struct object_list **list_p)
+{
+	while (*list_p) {
+		list_p = &((*list_p)->next);
+	}
+	*list_p = xmalloc(sizeof(struct object_list));
+	(*list_p)->next = NULL;
+	(*list_p)->item = item;
+}
+
 unsigned object_list_length(struct object_list *list)
 {
 	unsigned ret = 0;
diff --git a/object.h b/object.h
--- a/object.h
+++ b/object.h
@@ -41,6 +41,9 @@ void mark_reachable(struct object *obj,
 struct object_list *object_list_insert(struct object *item,
 				       struct object_list **list_p);

+void object_list_append(struct object *item,
+			struct object_list **list_p);
+
 unsigned object_list_length(struct object_list *list);

 int object_list_contains(struct object_list *list, struct object *obj);
diff --git a/tree.c b/tree.c
--- a/tree.c
+++ b/tree.c
@@ -1,5 +1,7 @@
 #include "tree.h"
 #include "blob.h"
+#include "commit.h"
+#include "tag.h"
 #include "cache.h"
 #include <stdlib.h>

@@ -212,3 +214,20 @@ int parse_tree(struct tree *item)
 	free(buffer);
 	return ret;
 }
+
+struct tree *parse_tree_indirect(const unsigned char *sha1)
+{
+	struct object *obj = parse_object(sha1);
+	do {
+		if (!obj)
+			return NULL;
+		if (obj->type == tree_type)
+			return (struct tree *) obj;
+		else if (obj->type == commit_type)
+			return ((struct commit *) obj)->tree;
+		else if (obj->type == tag_type)
+			obj = ((struct tag *) obj)->tagged;
+		else
+			return NULL;
+	} while (1);
+}
diff --git a/tree.h b/tree.h
--- a/tree.h
+++ b/tree.h
@@ -32,4 +32,7 @@ int parse_tree_buffer(struct tree *item,

 int parse_tree(struct tree *tree);

+/* Parses and returns the tree in the given ent, chasing tags and commits. */
+struct tree *parse_tree_indirect(const unsigned char *sha1);
+
 #endif /* TREE_H */

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] Change read-tree to merge before using the index.
  2005-08-31  3:48 [PATCH 0/2] Reorganize read-tree Daniel Barkalow
  2005-08-31  3:49 ` [PATCH 1/2] Object model additions for read-tree Daniel Barkalow
@ 2005-08-31  3:49 ` Daniel Barkalow
  2005-08-31  6:28 ` [PATCH 0/2] Reorganize read-tree Junio C Hamano
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-08-31  3:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
---

 read-tree.c |  522 ++++++++++++++++++++++++++++++++++-------------------------
 1 files changed, 297 insertions(+), 225 deletions(-)

d0f45ad81db2e133c49c23bd09c5615da344bb5c
diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -5,28 +5,280 @@
  */
 #include "cache.h"

-static int stage = 0;
+#include "object.h"
+#include "tree.h"
+
+static int merge = 0;
+static int emu23 = 0;
 static int update = 0;

-static int unpack_tree(unsigned char *sha1)
+static struct object_list *trees = NULL;
+
+typedef int (*merge_fn_t)(struct cache_entry **src,
+			  struct cache_entry **dest,
+			  int df_conflicts_2,
+			  int df_conflicts_3);
+
+static int unpack_trees_rec(struct tree_entry_list **posns, int len,
+			    const char *base, merge_fn_t fn,
+			    int file2, int file3, int *indpos)
+{
+	int baselen = strlen(base);
+	int src_size = len + 1;
+	if (emu23)
+		src_size++;
+	if (src_size > 4)
+		src_size = 4;
+	do {
+		int i;
+		char *first = NULL;
+		int pathlen;
+		unsigned ce_size;
+		int dir2 = 0;
+		int dir3 = 0;
+		int subfile2 = file2;
+		int subfile3 = file3;
+		struct tree_entry_list **subposns = NULL;
+		struct cache_entry **src = NULL;
+		char *cache_name = NULL;
+
+		/* Find the first name in the input. */
+
+		/* Check the cache */
+		if (merge && *indpos < active_nr) {
+			/* This is a bit tricky: */
+			/* If the index has a subdirectory (with
+			 * contents) as the first name, it'll get a
+			 * filename like "foo/bar". But that's after
+			 * "foo", so the entry in trees will get
+			 * handled first, at which point we'll go into
+			 * "foo", and deal with "bar" from the index,
+			 * because the base will be "foo/". The only
+			 * way we can actually have "foo/bar" first of
+			 * all the things is if the trees don't
+			 * contain "foo" at all, in which case we'll
+			 * handle "foo/bar" without going into the
+			 * directory, but that's fine (and will return
+			 * an error anyway, with the added unknown
+			 * file case.
+			 */
+
+			cache_name = active_cache[*indpos]->name;
+			if (strlen(cache_name) > baselen &&
+			    !memcmp(cache_name, base, baselen)) {
+				cache_name += baselen;
+				first = cache_name;
+			} else {
+				cache_name = NULL;
+			}
+		}
+
+		for (i = 0; i < len; i++) {
+			if (!posns[i])
+				continue;
+			if (!first || strcmp(first, posns[i]->name) > 0)
+				first = posns[i]->name;
+		}
+		/* No name means we're done */
+		if (!first)
+			return 0;
+
+		pathlen = strlen(first);
+		ce_size = cache_entry_size(baselen + pathlen);
+
+		if (cache_name && !strcmp(cache_name, first)) {
+			src = xmalloc(sizeof(struct cache_entry *) *
+				      src_size);
+			memset(src, 0,
+			       sizeof(struct cache_entry *) *
+			       src_size);
+			src[0] = active_cache[*indpos];
+			remove_cache_entry_at(*indpos);
+			if (emu23) {
+				// we need this in stage 2 as well as stage 0
+				struct cache_entry *copy =
+					xmalloc(ce_size);
+				memcpy(copy, src[0], ce_size);
+				copy->ce_flags =
+					create_ce_flags(baselen + pathlen, 2);
+				if (dir2 || file2) {
+					die("cannot merge index and our head tree");
+				}
+				src[2] = copy;
+				subfile2 = 1;
+			}
+		}
+
+		for (i = 0; i < len; i++) {
+			struct cache_entry *ce;
+			int ce_stage;
+			if (!posns[i] ||
+			    strcmp(first, posns[i]->name)) {
+				continue;
+			}
+
+			if (len > 3) {
+				if (i - len + 3 >= 0)
+					ce_stage = i - len + 3 + merge;
+				else
+					ce_stage = merge;
+			} else {
+				ce_stage = i + merge;
+			}
+			if (emu23 && ce_stage == 2)
+				ce_stage = 3;
+
+			if (posns[i]->directory) {
+				if (!subposns) {
+					int posnlen = sizeof(struct tree_list_entry *) * len;
+					subposns = xmalloc(posnlen);
+					memset(subposns, 0, posnlen);
+				}
+				parse_tree(posns[i]->item.tree);
+				subposns[i] = posns[i]->item.tree->entries;
+				posns[i] = posns[i]->next;
+				if (emu23 && ce_stage == 1)
+					dir2 = 1;
+				if (ce_stage == 2)
+					dir2 = 1;
+				if (ce_stage == 3)
+					dir3 = 1;
+				continue;
+			}
+
+			if (ce_stage == 2)
+				subfile2 = 1;
+			if (ce_stage == 3)
+				subfile3 = 1;
+
+			ce = xmalloc(ce_size);
+			memset(ce, 0, ce_size);
+			ce->ce_mode = create_ce_mode(posns[i]->mode);
+			ce->ce_flags = create_ce_flags(baselen + pathlen,
+						       ce_stage);
+			memcpy(ce->name, base, baselen);
+			memcpy(ce->name + baselen, first, pathlen + 1);
+
+			if (!src) {
+				src = xmalloc(sizeof(struct cache_entry *) *
+					      src_size);
+				memset(src, 0,
+				       sizeof(struct cache_entry *) *
+				       src_size);
+			}
+
+			memcpy(ce->sha1, posns[i]->item.any->sha1, 20);
+			if (ce_stage == 1 && emu23 && !src[2]) {
+				// we need this in stage 2 as well as stage 1
+				struct cache_entry *copy =
+					xmalloc(ce_size);
+				memcpy(copy, ce, ce_size);
+				copy->ce_flags =
+					create_ce_flags(baselen + pathlen, 2);
+				if (dir2 || file2) {
+					die("cannot merge index and our head tree");
+				}
+				src[2] = copy;
+				subfile2 = 1;
+			}
+			src[ce_stage] = ce;
+			posns[i] = posns[i]->next;
+		}
+		if (src) {
+			if (fn) {
+				int ret;
+				/*
+				printf("%s:\n", first);
+				for (i = 0; i < src_size; i++) {
+					printf(" %d ", i);
+					if (src[i])
+						printf("%s\n", sha1_to_hex(src[i]->sha1));
+					else
+						printf("\n");
+				}
+				*/
+				ret = fn(src, NULL,
+					 (file2 || dir2),
+					 (file3 || dir3));
+				/*printf("Added %d entries\n", ret);*/
+				*indpos += ret;
+			} else {
+				for (i = 0; i < src_size; i++) {
+					if (src[i]) {
+						add_cache_entry(src[i], ADD_CACHE_OK_TO_ADD|ADD_CACHE_SKIP_DFCHECK);
+					}
+				}
+			}
+		}
+		if (subposns) {
+			char *newbase = xmalloc(baselen + 2 + pathlen);
+			memcpy(newbase, base, baselen);
+			memcpy(newbase + baselen, first, pathlen);
+			newbase[baselen + pathlen] = '/';
+			newbase[baselen + pathlen + 1] = '\0';
+			if (unpack_trees_rec(subposns, len, newbase, fn,
+					     subfile2, subfile3, indpos))
+				return -1;
+		}
+	} while (1);
+}
+
+static void reject_merge(struct cache_entry *ce)
+{
+	die("Entry '%s' would be overwritten by merge. Cannot merge.",
+	    ce->name);
+}
+
+static void check_updates(struct cache_entry **src, int nr)
 {
-	void *buffer;
-	unsigned long size;
-	int ret;
+	static struct checkout state = {
+		.base_dir = "",
+		.force = 1,
+		.quiet = 1,
+		.refresh_cache = 1,
+	};
+	unsigned short mask = htons(CE_UPDATE);
+	while (nr--) {
+		struct cache_entry *ce = *src++;
+		if (!ce->ce_mode) {
+			if (update)
+				unlink(ce->name);
+			continue;
+		}
+		if (ce->ce_flags & mask) {
+			ce->ce_flags &= ~mask;
+			if (update)
+				checkout_entry(ce, &state);
+		}
+	}
+}

-	buffer = read_object_with_reference(sha1, "tree", &size, NULL);
-	if (!buffer)
+static int unpack_trees(merge_fn_t fn)
+{
+	int indpos = 0;
+	unsigned len = object_list_length(trees);
+	struct tree_entry_list **posns =
+		xmalloc(len * sizeof(struct tree_entry_list *));
+	int i;
+	struct object_list *posn = trees;
+	for (i = 0; i < len; i++) {
+		posns[i] = ((struct tree *) posn->item)->entries;
+		posn = posn->next;
+	}
+	if (unpack_trees_rec(posns, len, "", fn, 0, 0, &indpos))
 		return -1;
-	ret = read_tree(buffer, size, stage, NULL);
-	free(buffer);
-	return ret;
+
+	check_updates(active_cache, active_nr);
+	return 0;
 }

-static int path_matches(struct cache_entry *a, struct cache_entry *b)
+static int list_tree(unsigned char *sha1)
 {
-	int len = ce_namelen(a);
-	return ce_namelen(b) == len &&
-		!memcmp(a->name, b->name, len);
+	struct tree *tree = parse_tree_indirect(sha1);
+	if (!tree)
+		return -1;
+	object_list_append(&tree->object, &trees);
+	return 0;
 }

 static int same(struct cache_entry *a, struct cache_entry *b)
@@ -91,17 +343,6 @@ static void verify_uptodate(struct cache
 	die("Entry '%s' not uptodate. Cannot merge.", ce->name);
 }

-/*
- * If the old tree contained a CE that isn't even in the
- * result, that's always a problem, regardless of whether
- * it's up-to-date or not (ie it can be a file that we
- * have updated but not committed yet).
- */
-static void reject_merge(struct cache_entry *ce)
-{
-	die("Entry '%s' would be overwritten by merge. Cannot merge.", ce->name);
-}
-
 static int merged_entry_internal(struct cache_entry *merge, struct cache_entry *old, struct cache_entry **dst, int allow_dirty)
 {
 	merge->ce_flags |= htons(CE_UPDATE);
@@ -120,7 +361,7 @@ static int merged_entry_internal(struct
 		}
 	}
 	merge->ce_flags &= ~htons(CE_STAGEMASK);
-	*dst++ = merge;
+	add_cache_entry(merge, ADD_CACHE_OK_TO_ADD);
 	return 1;
 }

@@ -139,76 +380,19 @@ static int deleted_entry(struct cache_en
 	if (old)
 		verify_uptodate(old);
 	ce->ce_mode = 0;
-	*dst++ = ce;
+	add_cache_entry(ce, ADD_CACHE_OK_TO_ADD);
 	return 1;
 }

-static int causes_df_conflict(struct cache_entry *ce, int stage,
-			      struct cache_entry **dst_,
-			      struct cache_entry **next_,
-			      int tail)
-{
-	/* This is called during the merge operation and walking
-	 * the active_cache[] array is messy, because it is in the
-	 * middle of overlapping copy operation.  The invariants
-	 * are:
-	 * (1) active_cache points at the first (zeroth) entry.
-	 * (2) up to dst pointer are resolved entries.
-	 * (3) from the next pointer (head-inclusive) to the tail
-	 *     of the active_cache array have the remaining paths
-	 *     to be processed.  There can be a gap between dst
-	 *     and next.  Note that next is called "src" in the
-	 *     merge_cache() function, and tail is the original
-	 *     end of active_cache array when merge_cache() started.
-	 * (4) the path corresponding to *ce is not found in (2)
-	 *     or (3).  It is in the gap.
-	 *
-	 *  active_cache -----......+++++++++++++.
-	 *                    ^dst  ^next        ^tail
-	 */
-	int i, next, dst;
-	const char *path = ce->name;
-	int namelen = ce_namelen(ce);
-
-	next = next_ - active_cache;
-	dst = dst_ - active_cache;
-
-	for (i = 0; i < tail; i++) {
-		int entlen, len;
-		const char *one, *two;
-		if (dst <= i && i < next)
-			continue;
-		ce = active_cache[i];
-		if (ce_stage(ce) != stage)
-			continue;
-		/* If ce->name is a prefix of path, then path is a file
-		 * that hangs underneath ce->name, which is bad.
-		 * If path is a prefix of ce->name, then it is the
-		 * other way around which also is bad.
-		 */
-		entlen = ce_namelen(ce);
-		if (namelen == entlen)
-			continue;
-		if (namelen < entlen) {
-			len = namelen;
-			one = path;
-			two = ce->name;
-		} else {
-			len = entlen;
-			one = ce->name;
-			two = path;
-		}
-		if (memcmp(one, two, len))
-			continue;
-		if (two[len] == '/')
-			return 1;
-	}
-	return 0;
+static int keep_entry(struct cache_entry *ce, struct cache_entry **dst)
+{
+	add_cache_entry(ce, ADD_CACHE_OK_TO_ADD);
+	return 1;
 }

 static int threeway_merge(struct cache_entry *stages[4],
 			  struct cache_entry **dst,
-			  struct cache_entry **next, int tail)
+			  int df_conflict_2, int df_conflict_3)
 {
 	struct cache_entry *old = stages[0];
 	struct cache_entry *a = stages[1], *b = stages[2], *c = stages[3];
@@ -218,7 +402,7 @@ static int threeway_merge(struct cache_e
 	/* #5ALT */
 	if (!a && b && c && same(b, c)) {
 		if (old && !same(b, old))
-			return -1;
+			reject_merge(old);
 		return merged_entry_allow_dirty(b, old, dst);
 	}
 	/* #2ALT and #3ALT */
@@ -269,12 +453,12 @@ static int threeway_merge(struct cache_e
 		 *     and vice versa.
 		 */
 		if (c) { /* #2ALT */
-			if (!causes_df_conflict(c, 2, dst, next, tail) &&
+			if (!df_conflict_2 &&
 			    (!old || same(c, old)))
 				return merged_entry_allow_dirty(c, old, dst);
 		}
 		else { /* #3ALT */
-			if (!causes_df_conflict(b, 3, dst, next, tail) &&
+			if (!df_conflict_3 &&
 			    (!old || same(b, old)))
 				return merged_entry_allow_dirty(b, old, dst);
 		}
@@ -293,7 +477,7 @@ static int threeway_merge(struct cache_e
 	 */
 	if (old) {
 		if (!b || !same(old, b))
-			return -1;
+			reject_merge(old);
 	}
 	merge = merge_entries(a, b, c);
 	if (merge)
@@ -301,9 +485,9 @@ static int threeway_merge(struct cache_e
 	if (old)
 		verify_uptodate(old);
 	count = 0;
-	if (a) { *dst++ = a; count++; }
-	if (b) { *dst++ = b; count++; }
-	if (c) { *dst++ = c; count++; }
+	if (a) { count += keep_entry(a, dst + count); }
+	if (b) { count += keep_entry(b, dst + count); }
+	if (c) { count += keep_entry(c, dst + count); }
 	return count;
 }

@@ -317,14 +501,11 @@ static int threeway_merge(struct cache_e
  *
  */
 static int twoway_merge(struct cache_entry **src, struct cache_entry **dst,
-			struct cache_entry **next, int tail)
+			int df23, int df32)
 {
 	struct cache_entry *current = src[0];
 	struct cache_entry *oldtree = src[1], *newtree = src[2];

-	if (src[3])
-		return -1;
-
 	if (current) {
 		if ((!oldtree && !newtree) || /* 4 and 5 */
 		    (!oldtree && newtree &&
@@ -334,8 +515,7 @@ static int twoway_merge(struct cache_ent
 		    (oldtree && newtree &&
 		     !same(oldtree, newtree) && /* 18 and 19*/
 		     same(current, newtree))) {
-			*dst++ = current;
-			return 1;
+			return keep_entry(current, dst);
 		}
 		else if (oldtree && !newtree && same(current, oldtree)) {
 			/* 10 or 11 */
@@ -346,9 +526,16 @@ static int twoway_merge(struct cache_ent
 			/* 20 or 21 */
 			return merged_entry(newtree, current, dst);
 		}
-		else
+		else {
 			/* all other failures */
+			if (oldtree)
+				reject_merge(oldtree);
+			if (current)
+				reject_merge(current);
+			if (newtree)
+				reject_merge(newtree);
 			return -1;
+		}
 	}
 	else if (newtree)
 		return merged_entry(newtree, current, dst);
@@ -357,137 +544,25 @@ static int twoway_merge(struct cache_ent
 }

 /*
- * Two-way merge emulated with three-way merge.
- *
- * This treats "read-tree -m H M" by transforming it internally
- * into "read-tree -m H I+H M", where I+H is a tree that would
- * contain the contents of the current index file, overlayed on
- * top of H.  Unlike the traditional two-way merge, this leaves
- * the stages in the resulting index file and lets the user resolve
- * the merge conflicts using standard tools for three-way merge.
- *
- * This function is just to set-up such an arrangement, and the
- * actual merge uses threeway_merge() function.
- */
-static void setup_emu23(void)
-{
-	/* stage0 contains I, stage1 H, stage2 M.
-	 * move stage2 to stage3, and create stage2 entries
-	 * by scanning stage0 and stage1 entries.
-	 */
-	int i, namelen, size;
-	struct cache_entry *ce, *stage2;
-
-	for (i = 0; i < active_nr; i++) {
-		ce = active_cache[i];
-		if (ce_stage(ce) != 2)
-			continue;
-		/* hoist them up to stage 3 */
-		namelen = ce_namelen(ce);
-		ce->ce_flags = create_ce_flags(namelen, 3);
-	}
-
-	for (i = 0; i < active_nr; i++) {
-		ce = active_cache[i];
-		if (ce_stage(ce) > 1)
-			continue;
-		namelen = ce_namelen(ce);
-		size = cache_entry_size(namelen);
-		stage2 = xmalloc(size);
-		memcpy(stage2, ce, size);
-		stage2->ce_flags = create_ce_flags(namelen, 2);
-		if (add_cache_entry(stage2, ADD_CACHE_OK_TO_ADD) < 0)
-			die("cannot merge index and our head tree");
-
-		/* We are done with this name, so skip to next name */
-		while (i < active_nr &&
-		       ce_namelen(active_cache[i]) == namelen &&
-		       !memcmp(active_cache[i]->name, ce->name, namelen))
-			i++;
-		i--; /* compensate for the loop control */
-	}
-}
-
-/*
  * One-way merge.
  *
  * The rule is:
  * - take the stat information from stage0, take the data from stage1
  */
 static int oneway_merge(struct cache_entry **src, struct cache_entry **dst,
-			struct cache_entry **next, int tail)
+			int df23, int df32)
 {
 	struct cache_entry *old = src[0];
 	struct cache_entry *a = src[1];

-	if (src[2] || src[3])
-		return -1;
-
 	if (!a)
 		return 0;
 	if (old && same(old, a)) {
-		*dst++ = old;
-		return 1;
+		return keep_entry(old, dst);
 	}
 	return merged_entry(a, NULL, dst);
 }

-static void check_updates(struct cache_entry **src, int nr)
-{
-	static struct checkout state = {
-		.base_dir = "",
-		.force = 1,
-		.quiet = 1,
-		.refresh_cache = 1,
-	};
-	unsigned short mask = htons(CE_UPDATE);
-	while (nr--) {
-		struct cache_entry *ce = *src++;
-		if (!ce->ce_mode) {
-			if (update)
-				unlink(ce->name);
-			continue;
-		}
-		if (ce->ce_flags & mask) {
-			ce->ce_flags &= ~mask;
-			if (update)
-				checkout_entry(ce, &state);
-		}
-	}
-}
-
-typedef int (*merge_fn_t)(struct cache_entry **, struct cache_entry **, struct cache_entry **, int);
-
-static void merge_cache(struct cache_entry **src, int nr, merge_fn_t fn)
-{
-	struct cache_entry **dst = src;
-	int tail = nr;
-
-	while (nr) {
-		int entries;
-		struct cache_entry *name, *ce, *stages[4] = { NULL, };
-
-		name = ce = *src;
-		for (;;) {
-			int stage = ce_stage(ce);
-			stages[stage] = ce;
-			ce = *++src;
-			active_nr--;
-			if (!--nr)
-				break;
-			if (!path_matches(ce, name))
-				break;
-		}
-
-		entries = fn(stages, dst, src, tail);
-		if (entries < 0)
-			reject_merge(name);
-		dst += entries;
-		active_nr += entries;
-	}
-	check_updates(active_cache, active_nr);
-}
-
 static int read_cache_unmerged(void)
 {
 	int i, deleted;
@@ -516,8 +591,9 @@ static struct cache_file cache_file;

 int main(int argc, char **argv)
 {
-	int i, newfd, merge, reset, emu23;
+	int i, newfd, reset, stage = 0;
 	unsigned char sha1[20];
+	merge_fn_t fn = NULL;

 	newfd = hold_index_file_for_update(&cache_file, get_index_file());
 	if (newfd < 0)
@@ -569,9 +645,7 @@ int main(int argc, char **argv)

 		if (get_sha1(arg, sha1) < 0)
 			usage(read_tree_usage);
-		if (stage > 3)
-			usage(read_tree_usage);
-		if (unpack_tree(sha1) < 0)
+		if (list_tree(sha1) < 0)
 			die("failed to unpack tree object %s", arg);
 		stage++;
 	}
@@ -583,19 +657,17 @@ int main(int argc, char **argv)
 			[2] = twoway_merge,
 			[3] = threeway_merge,
 		};
-		merge_fn_t fn;

-		if (stage < 2 || stage > 4)
+		if (stage < 2)
 			die("just how do you expect me to merge %d trees?", stage-1);
 		if (emu23 && stage != 3)
 			die("--emu23 takes only two trees");
 		fn = merge_function[stage-1];
 		if (stage == 3 && emu23) {
-			setup_emu23();
 			fn = merge_function[3];
 		}
-		merge_cache(active_cache, active_nr, fn);
 	}
+	unpack_trees(fn);
 	if (write_cache(newfd, active_cache, active_nr) ||
 	    commit_index_file(&cache_file))
 		die("unable to write new index file");

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Reorganize read-tree
  2005-08-31  3:48 [PATCH 0/2] Reorganize read-tree Daniel Barkalow
  2005-08-31  3:49 ` [PATCH 1/2] Object model additions for read-tree Daniel Barkalow
  2005-08-31  3:49 ` [PATCH] Change read-tree to merge before using the index Daniel Barkalow
@ 2005-08-31  6:28 ` Junio C Hamano
  2005-08-31 16:56   ` Daniel Barkalow
  2005-08-31 14:15 ` Catalin Marinas
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2005-08-31  6:28 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Linus Torvalds

Dan, I really really *REALLY* wanted to try this out in "pu"
branch and even was about to rig some torture chamber for
testing before applying the patch, but you got the shiny blue
bat X-<.

I've checked with both marc and gmane [*], and believe the
problem is not on my end.  The MUA seems to have stripped all
the trailing whitespaces, so you do not have a single SP on
empty context lines, and lack excess trailing whitespace on line
41 of object.h (object_list_insert declaration), for example.

A patch to SubmittingPatches, MUA specific help section for
users of Pine 4.63 would be very much appreciated.

[References]

http://marc.theaimsgroup.com/?l=git&m=112545993929060&q=raw
http://www.mail-archive.com/git@vger.kernel.org/msg03590.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Reorganize read-tree
  2005-08-31  3:48 [PATCH 0/2] Reorganize read-tree Daniel Barkalow
                   ` (2 preceding siblings ...)
  2005-08-31  6:28 ` [PATCH 0/2] Reorganize read-tree Junio C Hamano
@ 2005-08-31 14:15 ` Catalin Marinas
  2005-08-31 17:04   ` Daniel Barkalow
  2005-08-31 16:57 ` [PATCH 1/2 (resend)] Object model additions for read-tree Daniel Barkalow
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Catalin Marinas @ 2005-08-31 14:15 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, git, Linus Torvalds

Daniel Barkalow <barkalow@iabervon.org> wrote:
> I got mostly done with this before Linus mentioned the possibility of
> having multiple index entries in the same stage for a single path. I
> finished it anyway, but I'm not sure that we won't want to know which of
> the common ancestors contributed which, and, if some of them don't have a
> path, we wouldn't be able to tell.

I don't have time to look at the patch and I don't have a good
knowledge of the GIT internals, so I will just ask. Does this patch
changes the call convention for git-merge-one-file-script? I have my
own script for StGIT and I would need to know whether it is affected
or not.

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Reorganize read-tree
  2005-08-31  6:28 ` [PATCH 0/2] Reorganize read-tree Junio C Hamano
@ 2005-08-31 16:56   ` Daniel Barkalow
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-08-31 16:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

On Tue, 30 Aug 2005, Junio C Hamano wrote:

> Dan, I really really *REALLY* wanted to try this out in "pu"
> branch and even was about to rig some torture chamber for
> testing before applying the patch, but you got the shiny blue
> bat X-<.

I'll send a replacement with the settings correct.

> A patch to SubmittingPatches, MUA specific help section for
> users of Pine 4.63 would be very much appreciated.

Ah, it looks like a recent version changed the default behavior to do the 
right thing, and inverted the sense of the configuration option. (Either 
that or Gentoo did it.) So you need to set the 
"no-strip-whitespace-before-send" option, unless the option you have is 
"strip-whitespace-before-send", in which case you should avoid checking 
it.

I don't actually have things set up for preparing patches from work, 
although I can resend the patches I prepared earlier.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2 (resend)] Object model additions for read-tree
  2005-08-31  3:48 [PATCH 0/2] Reorganize read-tree Daniel Barkalow
                   ` (3 preceding siblings ...)
  2005-08-31 14:15 ` Catalin Marinas
@ 2005-08-31 16:57 ` Daniel Barkalow
  2005-08-31 16:57 ` [PATCH 2/2 (resend)] Change read-tree to merge before using the index Daniel Barkalow
  2005-09-05  2:04 ` [PATCH 0/2] Reorganize read-tree Junio C Hamano
  6 siblings, 0 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-08-31 16:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

Adds object_list_append() and a function to get the struct tree from an ent.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>

---

 object.c |   11 +++++++++++
 object.h |    3 +++
 tree.c   |   19 +++++++++++++++++++
 tree.h   |    3 +++
 4 files changed, 36 insertions(+), 0 deletions(-)

49d33c385aa69d17c991300f73e77c6718a2b4a6
diff --git a/object.c b/object.c
--- a/object.c
+++ b/object.c
@@ -184,6 +184,17 @@ struct object_list *object_list_insert(s
         return new_list;
 }
 
+void object_list_append(struct object *item,
+			struct object_list **list_p)
+{
+	while (*list_p) {
+		list_p = &((*list_p)->next);
+	}
+	*list_p = xmalloc(sizeof(struct object_list));
+	(*list_p)->next = NULL;
+	(*list_p)->item = item;
+}
+
 unsigned object_list_length(struct object_list *list)
 {
 	unsigned ret = 0;
diff --git a/object.h b/object.h
--- a/object.h
+++ b/object.h
@@ -41,6 +41,9 @@ void mark_reachable(struct object *obj, 
 struct object_list *object_list_insert(struct object *item, 
 				       struct object_list **list_p);
 
+void object_list_append(struct object *item,
+			struct object_list **list_p);
+
 unsigned object_list_length(struct object_list *list);
 
 int object_list_contains(struct object_list *list, struct object *obj);
diff --git a/tree.c b/tree.c
--- a/tree.c
+++ b/tree.c
@@ -1,5 +1,7 @@
 #include "tree.h"
 #include "blob.h"
+#include "commit.h"
+#include "tag.h"
 #include "cache.h"
 #include <stdlib.h>
 
@@ -212,3 +214,20 @@ int parse_tree(struct tree *item)
 	free(buffer);
 	return ret;
 }
+
+struct tree *parse_tree_indirect(const unsigned char *sha1)
+{
+	struct object *obj = parse_object(sha1);
+	do {
+		if (!obj)
+			return NULL;
+		if (obj->type == tree_type)
+			return (struct tree *) obj;
+		else if (obj->type == commit_type)
+			return ((struct commit *) obj)->tree;
+		else if (obj->type == tag_type)
+			obj = ((struct tag *) obj)->tagged;
+		else
+			return NULL;
+	} while (1);
+}
diff --git a/tree.h b/tree.h
--- a/tree.h
+++ b/tree.h
@@ -32,4 +32,7 @@ int parse_tree_buffer(struct tree *item,
 
 int parse_tree(struct tree *tree);
 
+/* Parses and returns the tree in the given ent, chasing tags and commits. */
+struct tree *parse_tree_indirect(const unsigned char *sha1);
+
 #endif /* TREE_H */

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/2 (resend)] Change read-tree to merge before using the index.
  2005-08-31  3:48 [PATCH 0/2] Reorganize read-tree Daniel Barkalow
                   ` (4 preceding siblings ...)
  2005-08-31 16:57 ` [PATCH 1/2 (resend)] Object model additions for read-tree Daniel Barkalow
@ 2005-08-31 16:57 ` Daniel Barkalow
  2005-09-05  2:04 ` [PATCH 0/2] Reorganize read-tree Junio C Hamano
  6 siblings, 0 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-08-31 16:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>

---

 read-tree.c |  522 ++++++++++++++++++++++++++++++++++-------------------------
 1 files changed, 297 insertions(+), 225 deletions(-)

d0f45ad81db2e133c49c23bd09c5615da344bb5c
diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -5,28 +5,280 @@
  */
 #include "cache.h"
 
-static int stage = 0;
+#include "object.h"
+#include "tree.h"
+
+static int merge = 0;
+static int emu23 = 0;
 static int update = 0;
 
-static int unpack_tree(unsigned char *sha1)
+static struct object_list *trees = NULL;
+
+typedef int (*merge_fn_t)(struct cache_entry **src, 
+			  struct cache_entry **dest, 
+			  int df_conflicts_2,
+			  int df_conflicts_3);
+
+static int unpack_trees_rec(struct tree_entry_list **posns, int len,
+			    const char *base, merge_fn_t fn, 
+			    int file2, int file3, int *indpos)
+{
+	int baselen = strlen(base);
+	int src_size = len + 1;
+	if (emu23)
+		src_size++;
+	if (src_size > 4)
+		src_size = 4;
+	do {
+		int i;
+		char *first = NULL;
+		int pathlen;
+		unsigned ce_size;
+		int dir2 = 0;
+		int dir3 = 0;
+		int subfile2 = file2;
+		int subfile3 = file3;
+		struct tree_entry_list **subposns = NULL;
+		struct cache_entry **src = NULL;
+		char *cache_name = NULL;
+
+		/* Find the first name in the input. */
+
+		/* Check the cache */
+		if (merge && *indpos < active_nr) {
+			/* This is a bit tricky: */
+			/* If the index has a subdirectory (with
+			 * contents) as the first name, it'll get a
+			 * filename like "foo/bar". But that's after
+			 * "foo", so the entry in trees will get
+			 * handled first, at which point we'll go into
+			 * "foo", and deal with "bar" from the index,
+			 * because the base will be "foo/". The only
+			 * way we can actually have "foo/bar" first of
+			 * all the things is if the trees don't
+			 * contain "foo" at all, in which case we'll
+			 * handle "foo/bar" without going into the
+			 * directory, but that's fine (and will return
+			 * an error anyway, with the added unknown
+			 * file case.
+			 */
+
+			cache_name = active_cache[*indpos]->name;
+			if (strlen(cache_name) > baselen &&
+			    !memcmp(cache_name, base, baselen)) {
+				cache_name += baselen;
+				first = cache_name;
+			} else {
+				cache_name = NULL;
+			}
+		}
+
+		for (i = 0; i < len; i++) {
+			if (!posns[i])
+				continue;
+			if (!first || strcmp(first, posns[i]->name) > 0)
+				first = posns[i]->name;
+		}
+		/* No name means we're done */
+		if (!first)
+			return 0;
+
+		pathlen = strlen(first);
+		ce_size = cache_entry_size(baselen + pathlen);
+
+		if (cache_name && !strcmp(cache_name, first)) {
+			src = xmalloc(sizeof(struct cache_entry *) * 
+				      src_size);
+			memset(src, 0,
+			       sizeof(struct cache_entry *) * 
+			       src_size);
+			src[0] = active_cache[*indpos];
+			remove_cache_entry_at(*indpos);
+			if (emu23) {
+				// we need this in stage 2 as well as stage 0
+				struct cache_entry *copy =
+					xmalloc(ce_size);
+				memcpy(copy, src[0], ce_size);
+				copy->ce_flags = 
+					create_ce_flags(baselen + pathlen, 2);
+				if (dir2 || file2) {
+					die("cannot merge index and our head tree");
+				}
+				src[2] = copy;
+				subfile2 = 1;
+			}
+		}
+
+		for (i = 0; i < len; i++) {
+			struct cache_entry *ce;
+			int ce_stage;
+			if (!posns[i] ||
+			    strcmp(first, posns[i]->name)) {
+				continue;
+			}
+
+			if (len > 3) {
+				if (i - len + 3 >= 0)
+					ce_stage = i - len + 3 + merge;
+				else
+					ce_stage = merge;
+			} else {
+				ce_stage = i + merge;
+			}
+			if (emu23 && ce_stage == 2)
+				ce_stage = 3;
+
+			if (posns[i]->directory) {
+				if (!subposns) {
+					int posnlen = sizeof(struct tree_list_entry *) * len;
+					subposns = xmalloc(posnlen);
+					memset(subposns, 0, posnlen);
+				}
+				parse_tree(posns[i]->item.tree);
+				subposns[i] = posns[i]->item.tree->entries;
+				posns[i] = posns[i]->next;
+				if (emu23 && ce_stage == 1)
+					dir2 = 1;
+				if (ce_stage == 2)
+					dir2 = 1;
+				if (ce_stage == 3)
+					dir3 = 1;
+				continue;
+			}
+
+			if (ce_stage == 2)
+				subfile2 = 1;
+			if (ce_stage == 3)
+				subfile3 = 1;
+
+			ce = xmalloc(ce_size);
+			memset(ce, 0, ce_size);
+			ce->ce_mode = create_ce_mode(posns[i]->mode);
+			ce->ce_flags = create_ce_flags(baselen + pathlen,
+						       ce_stage);
+			memcpy(ce->name, base, baselen);
+			memcpy(ce->name + baselen, first, pathlen + 1);
+
+			if (!src) {
+				src = xmalloc(sizeof(struct cache_entry *) * 
+					      src_size);
+				memset(src, 0,
+				       sizeof(struct cache_entry *) * 
+				       src_size);
+			}
+
+			memcpy(ce->sha1, posns[i]->item.any->sha1, 20);
+			if (ce_stage == 1 && emu23 && !src[2]) {
+				// we need this in stage 2 as well as stage 1
+				struct cache_entry *copy =
+					xmalloc(ce_size);
+				memcpy(copy, ce, ce_size);
+				copy->ce_flags = 
+					create_ce_flags(baselen + pathlen, 2);
+				if (dir2 || file2) {
+					die("cannot merge index and our head tree");
+				}
+				src[2] = copy;
+				subfile2 = 1;
+			}
+			src[ce_stage] = ce;
+			posns[i] = posns[i]->next;
+		}
+		if (src) {
+			if (fn) {
+				int ret;
+				/*
+				printf("%s:\n", first);
+				for (i = 0; i < src_size; i++) {
+					printf(" %d ", i);
+					if (src[i])
+						printf("%s\n", sha1_to_hex(src[i]->sha1));
+					else
+						printf("\n");
+				}
+				*/
+				ret = fn(src, NULL,
+					 (file2 || dir2),
+					 (file3 || dir3));
+				/*printf("Added %d entries\n", ret);*/
+				*indpos += ret;
+			} else {
+				for (i = 0; i < src_size; i++) {
+					if (src[i]) {
+						add_cache_entry(src[i], ADD_CACHE_OK_TO_ADD|ADD_CACHE_SKIP_DFCHECK);
+					}
+				}
+			}
+		}
+		if (subposns) {
+			char *newbase = xmalloc(baselen + 2 + pathlen);
+			memcpy(newbase, base, baselen);
+			memcpy(newbase + baselen, first, pathlen);
+			newbase[baselen + pathlen] = '/';
+			newbase[baselen + pathlen + 1] = '\0';
+			if (unpack_trees_rec(subposns, len, newbase, fn,
+					     subfile2, subfile3, indpos))
+				return -1;
+		}
+	} while (1);
+}
+
+static void reject_merge(struct cache_entry *ce)
+{
+	die("Entry '%s' would be overwritten by merge. Cannot merge.", 
+	    ce->name);
+}
+
+static void check_updates(struct cache_entry **src, int nr)
 {
-	void *buffer;
-	unsigned long size;
-	int ret;
+	static struct checkout state = {
+		.base_dir = "",
+		.force = 1,
+		.quiet = 1,
+		.refresh_cache = 1,
+	};
+	unsigned short mask = htons(CE_UPDATE);
+	while (nr--) {
+		struct cache_entry *ce = *src++;
+		if (!ce->ce_mode) {
+			if (update)
+				unlink(ce->name);
+			continue;
+		}
+		if (ce->ce_flags & mask) {
+			ce->ce_flags &= ~mask;
+			if (update)
+				checkout_entry(ce, &state);
+		}
+	}
+}
 
-	buffer = read_object_with_reference(sha1, "tree", &size, NULL);
-	if (!buffer)
+static int unpack_trees(merge_fn_t fn)
+{
+	int indpos = 0;
+	unsigned len = object_list_length(trees);
+	struct tree_entry_list **posns = 
+		xmalloc(len * sizeof(struct tree_entry_list *));
+	int i;
+	struct object_list *posn = trees;
+	for (i = 0; i < len; i++) {
+		posns[i] = ((struct tree *) posn->item)->entries;
+		posn = posn->next;
+	}
+	if (unpack_trees_rec(posns, len, "", fn, 0, 0, &indpos))
 		return -1;
-	ret = read_tree(buffer, size, stage, NULL);
-	free(buffer);
-	return ret;
+
+	check_updates(active_cache, active_nr);
+	return 0;
 }
 
-static int path_matches(struct cache_entry *a, struct cache_entry *b)
+static int list_tree(unsigned char *sha1)
 {
-	int len = ce_namelen(a);
-	return ce_namelen(b) == len &&
-		!memcmp(a->name, b->name, len);
+	struct tree *tree = parse_tree_indirect(sha1);
+	if (!tree)
+		return -1;
+	object_list_append(&tree->object, &trees);
+	return 0;
 }
 
 static int same(struct cache_entry *a, struct cache_entry *b)
@@ -91,17 +343,6 @@ static void verify_uptodate(struct cache
 	die("Entry '%s' not uptodate. Cannot merge.", ce->name);
 }
 
-/*
- * If the old tree contained a CE that isn't even in the
- * result, that's always a problem, regardless of whether
- * it's up-to-date or not (ie it can be a file that we
- * have updated but not committed yet).
- */
-static void reject_merge(struct cache_entry *ce)
-{
-	die("Entry '%s' would be overwritten by merge. Cannot merge.", ce->name);
-}
-
 static int merged_entry_internal(struct cache_entry *merge, struct cache_entry *old, struct cache_entry **dst, int allow_dirty)
 {
 	merge->ce_flags |= htons(CE_UPDATE);
@@ -120,7 +361,7 @@ static int merged_entry_internal(struct 
 		}
 	}
 	merge->ce_flags &= ~htons(CE_STAGEMASK);
-	*dst++ = merge;
+	add_cache_entry(merge, ADD_CACHE_OK_TO_ADD);
 	return 1;
 }
 
@@ -139,76 +380,19 @@ static int deleted_entry(struct cache_en
 	if (old)
 		verify_uptodate(old);
 	ce->ce_mode = 0;
-	*dst++ = ce;
+	add_cache_entry(ce, ADD_CACHE_OK_TO_ADD);
 	return 1;
 }
 
-static int causes_df_conflict(struct cache_entry *ce, int stage,
-			      struct cache_entry **dst_,
-			      struct cache_entry **next_,
-			      int tail)
-{
-	/* This is called during the merge operation and walking
-	 * the active_cache[] array is messy, because it is in the
-	 * middle of overlapping copy operation.  The invariants
-	 * are:
-	 * (1) active_cache points at the first (zeroth) entry.
-	 * (2) up to dst pointer are resolved entries.
-	 * (3) from the next pointer (head-inclusive) to the tail
-	 *     of the active_cache array have the remaining paths
-	 *     to be processed.  There can be a gap between dst
-	 *     and next.  Note that next is called "src" in the
-	 *     merge_cache() function, and tail is the original
-	 *     end of active_cache array when merge_cache() started.
-	 * (4) the path corresponding to *ce is not found in (2)
-	 *     or (3).  It is in the gap.
-	 *
-	 *  active_cache -----......+++++++++++++.
-	 *                    ^dst  ^next        ^tail
-	 */
-	int i, next, dst;
-	const char *path = ce->name;
-	int namelen = ce_namelen(ce);
-
-	next = next_ - active_cache;
-	dst = dst_ - active_cache;
-
-	for (i = 0; i < tail; i++) {
-		int entlen, len;
-		const char *one, *two;
-		if (dst <= i && i < next)
-			continue;
-		ce = active_cache[i];
-		if (ce_stage(ce) != stage)
-			continue;
-		/* If ce->name is a prefix of path, then path is a file
-		 * that hangs underneath ce->name, which is bad.
-		 * If path is a prefix of ce->name, then it is the
-		 * other way around which also is bad.
-		 */
-		entlen = ce_namelen(ce);
-		if (namelen == entlen)
-			continue;
-		if (namelen < entlen) {
-			len = namelen;
-			one = path;
-			two = ce->name;
-		} else {
-			len = entlen;
-			one = ce->name;
-			two = path;
-		}
-		if (memcmp(one, two, len))
-			continue;
-		if (two[len] == '/')
-			return 1;
-	}
-	return 0;
+static int keep_entry(struct cache_entry *ce, struct cache_entry **dst)
+{
+	add_cache_entry(ce, ADD_CACHE_OK_TO_ADD);
+	return 1;
 }
 
 static int threeway_merge(struct cache_entry *stages[4],
 			  struct cache_entry **dst,
-			  struct cache_entry **next, int tail)
+			  int df_conflict_2, int df_conflict_3)
 {
 	struct cache_entry *old = stages[0];
 	struct cache_entry *a = stages[1], *b = stages[2], *c = stages[3];
@@ -218,7 +402,7 @@ static int threeway_merge(struct cache_e
 	/* #5ALT */
 	if (!a && b && c && same(b, c)) {
 		if (old && !same(b, old))
-			return -1;
+			reject_merge(old);
 		return merged_entry_allow_dirty(b, old, dst);
 	}
 	/* #2ALT and #3ALT */
@@ -269,12 +453,12 @@ static int threeway_merge(struct cache_e
 		 *     and vice versa.
 		 */
 		if (c) { /* #2ALT */
-			if (!causes_df_conflict(c, 2, dst, next, tail) &&
+			if (!df_conflict_2 &&
 			    (!old || same(c, old)))
 				return merged_entry_allow_dirty(c, old, dst);
 		}
 		else { /* #3ALT */
-			if (!causes_df_conflict(b, 3, dst, next, tail) &&
+			if (!df_conflict_3 &&
 			    (!old || same(b, old)))
 				return merged_entry_allow_dirty(b, old, dst);
 		}
@@ -293,7 +477,7 @@ static int threeway_merge(struct cache_e
 	 */
 	if (old) {
 		if (!b || !same(old, b))
-			return -1;
+			reject_merge(old);
 	}
 	merge = merge_entries(a, b, c);
 	if (merge)
@@ -301,9 +485,9 @@ static int threeway_merge(struct cache_e
 	if (old)
 		verify_uptodate(old);
 	count = 0;
-	if (a) { *dst++ = a; count++; }
-	if (b) { *dst++ = b; count++; }
-	if (c) { *dst++ = c; count++; }
+	if (a) { count += keep_entry(a, dst + count); }
+	if (b) { count += keep_entry(b, dst + count); }
+	if (c) { count += keep_entry(c, dst + count); }
 	return count;
 }
 
@@ -317,14 +501,11 @@ static int threeway_merge(struct cache_e
  *
  */
 static int twoway_merge(struct cache_entry **src, struct cache_entry **dst,
-			struct cache_entry **next, int tail)
+			int df23, int df32)
 {
 	struct cache_entry *current = src[0];
 	struct cache_entry *oldtree = src[1], *newtree = src[2];
 
-	if (src[3])
-		return -1;
-
 	if (current) {
 		if ((!oldtree && !newtree) || /* 4 and 5 */
 		    (!oldtree && newtree &&
@@ -334,8 +515,7 @@ static int twoway_merge(struct cache_ent
 		    (oldtree && newtree &&
 		     !same(oldtree, newtree) && /* 18 and 19*/
 		     same(current, newtree))) {
-			*dst++ = current;
-			return 1;
+			return keep_entry(current, dst);
 		}
 		else if (oldtree && !newtree && same(current, oldtree)) {
 			/* 10 or 11 */
@@ -346,9 +526,16 @@ static int twoway_merge(struct cache_ent
 			/* 20 or 21 */
 			return merged_entry(newtree, current, dst);
 		}
-		else
+		else {
 			/* all other failures */
+			if (oldtree)
+				reject_merge(oldtree);
+			if (current)
+				reject_merge(current);
+			if (newtree)
+				reject_merge(newtree);
 			return -1;
+		}
 	}
 	else if (newtree)
 		return merged_entry(newtree, current, dst);
@@ -357,137 +544,25 @@ static int twoway_merge(struct cache_ent
 }
 
 /*
- * Two-way merge emulated with three-way merge.
- *
- * This treats "read-tree -m H M" by transforming it internally
- * into "read-tree -m H I+H M", where I+H is a tree that would
- * contain the contents of the current index file, overlayed on
- * top of H.  Unlike the traditional two-way merge, this leaves
- * the stages in the resulting index file and lets the user resolve
- * the merge conflicts using standard tools for three-way merge.
- *
- * This function is just to set-up such an arrangement, and the
- * actual merge uses threeway_merge() function.
- */
-static void setup_emu23(void)
-{
-	/* stage0 contains I, stage1 H, stage2 M.
-	 * move stage2 to stage3, and create stage2 entries
-	 * by scanning stage0 and stage1 entries.
-	 */
-	int i, namelen, size;
-	struct cache_entry *ce, *stage2;
-
-	for (i = 0; i < active_nr; i++) {
-		ce = active_cache[i];
-		if (ce_stage(ce) != 2)
-			continue;
-		/* hoist them up to stage 3 */
-		namelen = ce_namelen(ce);
-		ce->ce_flags = create_ce_flags(namelen, 3);
-	}
-
-	for (i = 0; i < active_nr; i++) {
-		ce = active_cache[i];
-		if (ce_stage(ce) > 1)
-			continue;
-		namelen = ce_namelen(ce);
-		size = cache_entry_size(namelen);
-		stage2 = xmalloc(size);
-		memcpy(stage2, ce, size);
-		stage2->ce_flags = create_ce_flags(namelen, 2);
-		if (add_cache_entry(stage2, ADD_CACHE_OK_TO_ADD) < 0)
-			die("cannot merge index and our head tree");
-
-		/* We are done with this name, so skip to next name */
-		while (i < active_nr &&
-		       ce_namelen(active_cache[i]) == namelen &&
-		       !memcmp(active_cache[i]->name, ce->name, namelen))
-			i++;
-		i--; /* compensate for the loop control */
-	}
-}
-
-/*
  * One-way merge.
  *
  * The rule is:
  * - take the stat information from stage0, take the data from stage1
  */
 static int oneway_merge(struct cache_entry **src, struct cache_entry **dst,
-			struct cache_entry **next, int tail)
+			int df23, int df32)
 {
 	struct cache_entry *old = src[0];
 	struct cache_entry *a = src[1];
 
-	if (src[2] || src[3])
-		return -1;
-
 	if (!a)
 		return 0;
 	if (old && same(old, a)) {
-		*dst++ = old;
-		return 1;
+		return keep_entry(old, dst);
 	}
 	return merged_entry(a, NULL, dst);
 }
 
-static void check_updates(struct cache_entry **src, int nr)
-{
-	static struct checkout state = {
-		.base_dir = "",
-		.force = 1,
-		.quiet = 1,
-		.refresh_cache = 1,
-	};
-	unsigned short mask = htons(CE_UPDATE);
-	while (nr--) {
-		struct cache_entry *ce = *src++;
-		if (!ce->ce_mode) {
-			if (update)
-				unlink(ce->name);
-			continue;
-		}
-		if (ce->ce_flags & mask) {
-			ce->ce_flags &= ~mask;
-			if (update)
-				checkout_entry(ce, &state);
-		}
-	}
-}
-
-typedef int (*merge_fn_t)(struct cache_entry **, struct cache_entry **, struct cache_entry **, int);
-
-static void merge_cache(struct cache_entry **src, int nr, merge_fn_t fn)
-{
-	struct cache_entry **dst = src;
-	int tail = nr;
-
-	while (nr) {
-		int entries;
-		struct cache_entry *name, *ce, *stages[4] = { NULL, };
-
-		name = ce = *src;
-		for (;;) {
-			int stage = ce_stage(ce);
-			stages[stage] = ce;
-			ce = *++src;
-			active_nr--;
-			if (!--nr)
-				break;
-			if (!path_matches(ce, name))
-				break;
-		}
-
-		entries = fn(stages, dst, src, tail);
-		if (entries < 0)
-			reject_merge(name);
-		dst += entries;
-		active_nr += entries;
-	}
-	check_updates(active_cache, active_nr);
-}
-
 static int read_cache_unmerged(void)
 {
 	int i, deleted;
@@ -516,8 +591,9 @@ static struct cache_file cache_file;
 
 int main(int argc, char **argv)
 {
-	int i, newfd, merge, reset, emu23;
+	int i, newfd, reset, stage = 0;
 	unsigned char sha1[20];
+	merge_fn_t fn = NULL;
 
 	newfd = hold_index_file_for_update(&cache_file, get_index_file());
 	if (newfd < 0)
@@ -569,9 +645,7 @@ int main(int argc, char **argv)
 
 		if (get_sha1(arg, sha1) < 0)
 			usage(read_tree_usage);
-		if (stage > 3)
-			usage(read_tree_usage);
-		if (unpack_tree(sha1) < 0)
+		if (list_tree(sha1) < 0)
 			die("failed to unpack tree object %s", arg);
 		stage++;
 	}
@@ -583,19 +657,17 @@ int main(int argc, char **argv)
 			[2] = twoway_merge,
 			[3] = threeway_merge,
 		};
-		merge_fn_t fn;
 
-		if (stage < 2 || stage > 4)
+		if (stage < 2)
 			die("just how do you expect me to merge %d trees?", stage-1);
 		if (emu23 && stage != 3)
 			die("--emu23 takes only two trees");
 		fn = merge_function[stage-1];
 		if (stage == 3 && emu23) { 
-			setup_emu23();
 			fn = merge_function[3];
 		}
-		merge_cache(active_cache, active_nr, fn);
 	}
+	unpack_trees(fn);
 	if (write_cache(newfd, active_cache, active_nr) ||
 	    commit_index_file(&cache_file))
 		die("unable to write new index file");

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Reorganize read-tree
  2005-08-31 14:15 ` Catalin Marinas
@ 2005-08-31 17:04   ` Daniel Barkalow
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-08-31 17:04 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: Junio C Hamano, git, Linus Torvalds

On Wed, 31 Aug 2005, Catalin Marinas wrote:

> Daniel Barkalow <barkalow@iabervon.org> wrote:
> > I got mostly done with this before Linus mentioned the possibility of
> > having multiple index entries in the same stage for a single path. I
> > finished it anyway, but I'm not sure that we won't want to know which of
> > the common ancestors contributed which, and, if some of them don't have a
> > path, we wouldn't be able to tell.
> 
> I don't have time to look at the patch and I don't have a good
> knowledge of the GIT internals, so I will just ask. Does this patch
> changes the call convention for git-merge-one-file-script? I have my
> own script for StGIT and I would need to know whether it is affected
> or not.

Nope, it only changes the trivial merge calling convention within 
read-tree.c; I think it's plausible that we might like to add information 
at some point, but the short-term goal is just to prevent a few bad cases 
in trivial merges.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Reorganize read-tree
  2005-08-31  3:48 [PATCH 0/2] Reorganize read-tree Daniel Barkalow
                   ` (5 preceding siblings ...)
  2005-08-31 16:57 ` [PATCH 2/2 (resend)] Change read-tree to merge before using the index Daniel Barkalow
@ 2005-09-05  2:04 ` Junio C Hamano
  2005-09-05  2:55   ` Daniel Barkalow
  6 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2005-09-05  2:04 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Linus Torvalds

Daniel Barkalow <barkalow@iabervon.org> writes:

> I got mostly done with this before Linus mentioned the possibility of
> having multiple index entries in the same stage for a single path. I
> finished it anyway, but I'm not sure that we won't want to know which of
> the common ancestors contributed which, and, if some of them don't have a
> path, we wouldn't be able to tell. The other advantages I see to this
> approach are:

I've finished reading your patch, after beating it reasonably
heavily by feeding combinations of nonsense trees to make sure
it produces the same result as the original implementation.  I
have not found any regression from the read-tree in "master"
branch, after you fixed the path ordering issues.

> There are various potential refinements, plus removing a bunch of memory
> leaks, still to do, but I think this is sufficiently close to review.

I am not so worried about the leaks right now; they are
something that could be fixed before it hits the "master"
branch.

I like your approach of reading the input trees, along with the
existing index contents, and re-populating the index one path at
a time.  It probably is more readable.

I further think that you can get the best of both worlds, by
inventing a convention that mode=0 entry means 'this path does
not exist in this tree'. This would allow you to have multiple
entries at the same stage and still tell which one came from
which tree.  Instead of calling fn in unpack_trees(), you could
make it only unpack the tree into the index, and then after
unpacking is done, call fn() repeatedly to resolve the resulting
index.  Of course the semantics of merge_fn_t needs to change
but if you feed N trees, the caller of a merge_fn_t function
needs to pick up the first N entries (because you use mode=0
entry to mean 'missing from this tree', each path will always
have N entries) and feed them to the merge function in one call,
then pick up the next N entries and feed them in the next call,
and so on.  I think that would simplify that part of the code
even further.

So if you are making an octopus of 4 trees (one being our
current branch) using 2 merge bases, your intermediate index
would look like:

    mode   SHA1   stage path
    100644 XXXXX  0     foo	from original index
    000000 0{40}  1     foo	merge base #1 did not have foo
    100644 ZZZZZ  1     foo	merge base #2 has it
    100644 XXXXX  2     foo	our current head
    100644 ZZZZZ  3     foo	other head #1 being merged into us
    100644 YYYYY  3     foo	other head #2 being merged into us
    100644 ZZZZZ  3     foo	other head #3 being merged into us

We cannot write something like this out without breaking
backward compatibility, but I personally think this breakage is
OK, because what is being broken is the index file format, not
tree object format, and the index file is by definition local to
a repository.  IOW, it is not too much to ask people not to use
old tools to read new index file they created using new tools.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] Reorganize read-tree
  2005-09-05  2:04 ` [PATCH 0/2] Reorganize read-tree Junio C Hamano
@ 2005-09-05  2:55   ` Daniel Barkalow
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Barkalow @ 2005-09-05  2:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

On Sun, 4 Sep 2005, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > I got mostly done with this before Linus mentioned the possibility of
> > having multiple index entries in the same stage for a single path. I
> > finished it anyway, but I'm not sure that we won't want to know which of
> > the common ancestors contributed which, and, if some of them don't have a
> > path, we wouldn't be able to tell. The other advantages I see to this
> > approach are:
> 
> I've finished reading your patch, after beating it reasonably
> heavily by feeding combinations of nonsense trees to make sure
> it produces the same result as the original implementation.  I
> have not found any regression from the read-tree in "master"
> branch, after you fixed the path ordering issues.

Good.

> > There are various potential refinements, plus removing a bunch of memory
> > leaks, still to do, but I think this is sufficiently close to review.
> 
> I am not so worried about the leaks right now; they are
> something that could be fixed before it hits the "master"
> branch.

Right.

> I like your approach of reading the input trees, along with the
> existing index contents, and re-populating the index one path at
> a time.  It probably is more readable.
> 
> I further think that you can get the best of both worlds, by
> inventing a convention that mode=0 entry means 'this path does
> not exist in this tree'. This would allow you to have multiple
> entries at the same stage and still tell which one came from
> which tree.  Instead of calling fn in unpack_trees(), you could
> make it only unpack the tree into the index, and then after
> unpacking is done, call fn() repeatedly to resolve the resulting
> index. 

I think that almost all of the benefit actually comes from calling fn() in 
unpack_trees() and not putting anything in the index before merging. 
Without that, you need the complex index management and the complicated 
search for DF conflicts. The main point of not reading everything into the 
index before calling fn() on stuff is that the index is actually really 
difficult to deal with in this situation (because you are simultaneously 
moving through it, removing and modifying entries, and searching it for 
conflicts). The improvement in readability comes from not doing this.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-09-05  2:52 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-08-31  3:48 [PATCH 0/2] Reorganize read-tree Daniel Barkalow
2005-08-31  3:49 ` [PATCH 1/2] Object model additions for read-tree Daniel Barkalow
2005-08-31  3:49 ` [PATCH] Change read-tree to merge before using the index Daniel Barkalow
2005-08-31  6:28 ` [PATCH 0/2] Reorganize read-tree Junio C Hamano
2005-08-31 16:56   ` Daniel Barkalow
2005-08-31 14:15 ` Catalin Marinas
2005-08-31 17:04   ` Daniel Barkalow
2005-08-31 16:57 ` [PATCH 1/2 (resend)] Object model additions for read-tree Daniel Barkalow
2005-08-31 16:57 ` [PATCH 2/2 (resend)] Change read-tree to merge before using the index Daniel Barkalow
2005-09-05  2:04 ` [PATCH 0/2] Reorganize read-tree Junio C Hamano
2005-09-05  2:55   ` Daniel Barkalow

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).