git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes
@ 2013-05-25  9:07 Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 01/25] describe: make own copy of refname Michael Haggerty
                   ` (25 more replies)
  0 siblings, 26 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:07 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

This is version two of the patch series.  Aside from addressing
Junio's comments about the first version, it goes significantly
further than v1:

I did a manual audit of the 50 (!) functions that are used as an
each_ref_fn callback to the for_each_ref()-style functions.  (I hope I
haven't missed any.)  I checked that they do not make the assumption
that the lifetimes of the refname and sha1 arguments extend past the
duration of the callback invocation.  There were a number of callers
that got this wrong; I believe I have fixed them all.

I also changed how object_array_entry manages its name field.  Like
the RFC in the first version of this patch series, I change
add_object_array_with_mode() to make a copy of the name before storing
it in its field.  But (at Peff's suggestion) I also make an
optimization like that of strbuf of not copying the name if it is the
empty string, but rather using a pointer to a static empty string.

After this patch series, the test suite runs without errors under
valgrind even when I change refs.c:do_one_ref() to pass temporary
copies of refname and sha1 to the callback functions then free them
immediately.  (Before this patch series, such a test failed in many
places.)  But it is still not enough to make Peff's
jk/packed-refs-race series work correctly under the "hyperactive
repository" stress-test in which the packed-refs file is made to
always look stale,

    refs.c:get_packed_refs():
    -	if (refs->packed &&
    -	    !stat_validity_check(&refs->packed_validity, packed_refs_file))
    +	if (refs->packed /* &&
    +	    !stat_validity_check(&refs->packed_validity, packed_refs_file) */)


Michael Haggerty (25):
  describe: make own copy of refname
  fetch: make own copies of refnames
  add_rev_cmdline(): make a copy of the name argument
  builtin_diff_tree(): make it obvious that function wants two entries
  cmd_diff(): use an object_array for holding trees
  cmd_diff(): rename local variable "list" -> "entry"
  cmd_diff(): make it obvious which cases are exclusive of each other
  revision: split some overly-long lines
  object_array: add function object_array_filter()
  revision: use object_array_filter() in implementation of gc_boundary()
  object_array_remove_duplicates(): rewrite to reduce copying
  fsck: don't put a void*-shaped peg in a char*-shaped hole
  find_first_merges(): initialize merges variable using initializer
  find_first_merges(): remove unnecessary code
  object_array_entry: fix memory handling of the name field
  do_fetch(): reduce scope of peer_item
  do_fetch(): clean up existing_refs before exiting
  add_existing(): do not retain a reference to sha1
  show_head_ref(): do not shadow name of argument
  show_head_ref(): rename first parameter to "refname"
  string_list_add_one_ref(): rename first parameter to "refname"
  string_list_add_refs_by_glob(): add a comment about memory management
  exclude_existing(): set existing_refs.strdup_strings
  register_ref(): make a copy of the bad reference SHA-1
  refs: document the lifetime of the args passed to each_ref_fn

 bisect.c           |  5 ++--
 builtin/describe.c |  6 +++--
 builtin/diff.c     | 69 ++++++++++++++++++++++++++---------------------------
 builtin/fetch.c    | 29 ++++++++++++----------
 builtin/fsck.c     |  2 +-
 builtin/show-ref.c |  2 +-
 bundle.c           |  2 +-
 http-backend.c     |  6 ++---
 notes.c            |  9 ++++---
 object.c           | 70 ++++++++++++++++++++++++++++++++++++++++++++----------
 object.h           | 25 +++++++++++++++++--
 refs.h             | 22 ++++++++++++-----
 revision.c         | 64 +++++++++++++++++++++++++++++--------------------
 revision.h         | 32 ++++++++++++++++---------
 submodule.c        |  6 ++---
 15 files changed, 228 insertions(+), 121 deletions(-)

-- 
1.8.2.3

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v2 01/25] describe: make own copy of refname
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 02/25] fetch: make own copies of refnames Michael Haggerty
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Do not retain a reference to the refname passed to the each_ref_fn
callback get_name(), because there is no guarantee of the lifetimes of
these names.  Instead, make a local copy when needed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/describe.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 6636a68..3dc09eb 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -42,7 +42,7 @@ struct commit_name {
 	unsigned prio:2; /* annotated tag = 2, tag = 1, head = 0 */
 	unsigned name_checked:1;
 	unsigned char sha1[20];
-	const char *path;
+	char *path;
 };
 static const char *prio_names[] = {
 	"head", "lightweight", "annotated",
@@ -126,12 +126,14 @@ static void add_to_known_names(const char *path,
 			} else {
 				e->next = NULL;
 			}
+			e->path = NULL;
 		}
 		e->tag = tag;
 		e->prio = prio;
 		e->name_checked = 0;
 		hashcpy(e->sha1, sha1);
-		e->path = path;
+		free(e->path);
+		e->path = xstrdup(path);
 	}
 }
 
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 02/25] fetch: make own copies of refnames
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 01/25] describe: make own copy of refname Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 03/25] add_rev_cmdline(): make a copy of the name argument Michael Haggerty
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Do not retain references to refnames passed to the each_ref_fn
callback add_existing(), because their lifetime is not guaranteed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/fetch.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 4b6b1df..f949115 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -590,7 +590,7 @@ static void find_non_local_tags(struct transport *transport,
 			struct ref **head,
 			struct ref ***tail)
 {
-	struct string_list existing_refs = STRING_LIST_INIT_NODUP;
+	struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	struct string_list remote_refs = STRING_LIST_INIT_NODUP;
 	const struct ref *ref;
 	struct string_list_item *item = NULL;
@@ -693,7 +693,7 @@ static int truncate_fetch_head(void)
 static int do_fetch(struct transport *transport,
 		    struct refspec *refs, int ref_count)
 {
-	struct string_list existing_refs = STRING_LIST_INIT_NODUP;
+	struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	struct string_list_item *peer_item = NULL;
 	struct ref *ref_map;
 	struct ref *rm;
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 03/25] add_rev_cmdline(): make a copy of the name argument
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 01/25] describe: make own copy of refname Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 02/25] fetch: make own copies of refnames Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 04/25] builtin_diff_tree(): make it obvious that function wants two entries Michael Haggerty
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Instead of assuming that the memory pointed to by the name argument
will live forever, make a local copy of it before storing it in the
ref_cmdline_info.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 revision.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index a67b615..25e424c 100644
--- a/revision.c
+++ b/revision.c
@@ -898,6 +898,10 @@ static int limit_list(struct rev_info *revs)
 	return 0;
 }
 
+/*
+ * Add an entry to refs->cmdline with the specified information.
+ * *name is copied.
+ */
 static void add_rev_cmdline(struct rev_info *revs,
 			    struct object *item,
 			    const char *name,
@@ -909,7 +913,7 @@ static void add_rev_cmdline(struct rev_info *revs,
 
 	ALLOC_GROW(info->rev, nr + 1, info->alloc);
 	info->rev[nr].item = item;
-	info->rev[nr].name = name;
+	info->rev[nr].name = xstrdup(name);
 	info->rev[nr].whence = whence;
 	info->rev[nr].flags = flags;
 	info->nr++;
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 04/25] builtin_diff_tree(): make it obvious that function wants two entries
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (2 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 03/25] add_rev_cmdline(): make a copy of the name argument Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 05/25] cmd_diff(): use an object_array for holding trees Michael Haggerty
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Instead of accepting an array and using exactly two elements from the
array, take two single (struct object_array_entry *) arguments.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/diff.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index 8c2af6c..abdd613 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -153,7 +153,8 @@ static int builtin_diff_index(struct rev_info *revs,
 
 static int builtin_diff_tree(struct rev_info *revs,
 			     int argc, const char **argv,
-			     struct object_array_entry *ent)
+			     struct object_array_entry *ent0,
+			     struct object_array_entry *ent1)
 {
 	const unsigned char *(sha1[2]);
 	int swap = 0;
@@ -161,13 +162,14 @@ static int builtin_diff_tree(struct rev_info *revs,
 	if (argc > 1)
 		usage(builtin_diff_usage);
 
-	/* We saw two trees, ent[0] and ent[1].
-	 * if ent[1] is uninteresting, they are swapped
+	/*
+	 * We saw two trees, ent0 and ent1.  If ent1 is uninteresting,
+	 * swap them.
 	 */
-	if (ent[1].item->flags & UNINTERESTING)
+	if (ent1->item->flags & UNINTERESTING)
 		swap = 1;
-	sha1[swap] = ent[0].item->sha1;
-	sha1[1-swap] = ent[1].item->sha1;
+	sha1[swap] = ent0->item->sha1;
+	sha1[1-swap] = ent1->item->sha1;
 	diff_tree_sha1(sha1[0], sha1[1], "", &revs->diffopt);
 	log_tree_diff_flush(revs);
 	return 0;
@@ -403,7 +405,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 	else if (ents == 1)
 		result = builtin_diff_index(&rev, argc, argv);
 	else if (ents == 2)
-		result = builtin_diff_tree(&rev, argc, argv, ent);
+		result = builtin_diff_tree(&rev, argc, argv, &ent[0], &ent[1]);
 	else if (ent[0].item->flags & UNINTERESTING) {
 		/*
 		 * diff A...B where there is at least one merge base
@@ -412,8 +414,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 		 * between the base and B.  Note that we pick one
 		 * merge base at random if there are more than one.
 		 */
-		ent[1] = ent[ents-1];
-		result = builtin_diff_tree(&rev, argc, argv, ent);
+		result = builtin_diff_tree(&rev, argc, argv, &ent[0], &ent[ents-1]);
 	} else
 		result = builtin_diff_combined(&rev, argc, argv,
 					       ent, ents);
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 05/25] cmd_diff(): use an object_array for holding trees
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (3 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 04/25] builtin_diff_tree(): make it obvious that function wants two entries Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 06/25] cmd_diff(): rename local variable "list" -> "entry" Michael Haggerty
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Change cmd_diff() to use a (struct object_array) for holding the trees
that it accumulates, rather than rolling its own equivalent.

Incidentally, this change removes a hard-coded limit of 100 trees in
combined diff, not that it matters in practice.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/diff.c | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index abdd613..661fdde 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -253,8 +253,8 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 {
 	int i;
 	struct rev_info rev;
-	struct object_array_entry ent[100];
-	int ents = 0, blobs = 0, paths = 0;
+	struct object_array ent = OBJECT_ARRAY_INIT;
+	int blobs = 0, paths = 0;
 	const char *path = NULL;
 	struct blobinfo blob[2];
 	int nongit;
@@ -351,13 +351,8 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 		if (obj->type == OBJ_COMMIT)
 			obj = &((struct commit *)obj)->tree->object;
 		if (obj->type == OBJ_TREE) {
-			if (ARRAY_SIZE(ent) <= ents)
-				die(_("more than %d trees given: '%s'"),
-				    (int) ARRAY_SIZE(ent), name);
 			obj->flags |= flags;
-			ent[ents].item = obj;
-			ent[ents].name = name;
-			ents++;
+			add_object_array(obj, name, &ent);
 			continue;
 		}
 		if (obj->type == OBJ_BLOB) {
@@ -381,7 +376,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 	/*
 	 * Now, do the arguments look reasonable?
 	 */
-	if (!ents) {
+	if (!ent.nr) {
 		switch (blobs) {
 		case 0:
 			result = builtin_diff_files(&rev, argc, argv);
@@ -402,22 +397,26 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 	}
 	else if (blobs)
 		usage(builtin_diff_usage);
-	else if (ents == 1)
+	else if (ent.nr == 1)
 		result = builtin_diff_index(&rev, argc, argv);
-	else if (ents == 2)
-		result = builtin_diff_tree(&rev, argc, argv, &ent[0], &ent[1]);
-	else if (ent[0].item->flags & UNINTERESTING) {
+	else if (ent.nr == 2)
+		result = builtin_diff_tree(&rev, argc, argv,
+					   &ent.objects[0], &ent.objects[1]);
+	else if (ent.objects[0].item->flags & UNINTERESTING) {
 		/*
 		 * diff A...B where there is at least one merge base
-		 * between A and B.  We have ent[0] == merge-base,
-		 * ent[ents-2] == A, and ent[ents-1] == B.  Show diff
-		 * between the base and B.  Note that we pick one
-		 * merge base at random if there are more than one.
+		 * between A and B.  We have ent.objects[0] ==
+		 * merge-base, ent.objects[ents-2] == A, and
+		 * ent.objects[ents-1] == B.  Show diff between the
+		 * base and B.  Note that we pick one merge base at
+		 * random if there are more than one.
 		 */
-		result = builtin_diff_tree(&rev, argc, argv, &ent[0], &ent[ents-1]);
+		result = builtin_diff_tree(&rev, argc, argv,
+					   &ent.objects[0],
+					   &ent.objects[ent.nr-1]);
 	} else
 		result = builtin_diff_combined(&rev, argc, argv,
-					       ent, ents);
+					       ent.objects, ent.nr);
 	result = diff_result_code(&rev.diffopt, result);
 	if (1 < rev.diffopt.skip_stat_unmatch)
 		refresh_index_quietly();
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 06/25] cmd_diff(): rename local variable "list" -> "entry"
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (4 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 05/25] cmd_diff(): use an object_array for holding trees Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 07/25] cmd_diff(): make it obvious which cases are exclusive of each other Michael Haggerty
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

It's not a list, it's an array entry.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/diff.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index 661fdde..84243d9 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -339,9 +339,9 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 	}
 
 	for (i = 0; i < rev.pending.nr; i++) {
-		struct object_array_entry *list = rev.pending.objects+i;
-		struct object *obj = list->item;
-		const char *name = list->name;
+		struct object_array_entry *entry = &rev.pending.objects[i];
+		struct object *obj = entry->item;
+		const char *name = entry->name;
 		int flags = (obj->flags & UNINTERESTING);
 		if (!obj->parsed)
 			obj = parse_object(obj->sha1);
@@ -360,7 +360,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 				die(_("more than two blobs given: '%s'"), name);
 			hashcpy(blob[blobs].sha1, obj->sha1);
 			blob[blobs].name = name;
-			blob[blobs].mode = list->mode;
+			blob[blobs].mode = entry->mode;
 			blobs++;
 			continue;
 
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 07/25] cmd_diff(): make it obvious which cases are exclusive of each other
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (5 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 06/25] cmd_diff(): rename local variable "list" -> "entry" Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 08/25] revision: split some overly-long lines Michael Haggerty
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

At first glance the OBJ_COMMIT, OBJ_TREE, and OBJ_BLOB cases look like
they might be mutually exclusive.  But the OBJ_COMMIT case doesn't end
the loop iteration with "continue" like the other two cases, but
rather falls through.  So use if...else if...else construct to make it
more obvious that only the last two cases are mutually exclusive.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/diff.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index 84243d9..9fc273d 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -350,22 +350,21 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 			die(_("invalid object '%s' given."), name);
 		if (obj->type == OBJ_COMMIT)
 			obj = &((struct commit *)obj)->tree->object;
+
 		if (obj->type == OBJ_TREE) {
 			obj->flags |= flags;
 			add_object_array(obj, name, &ent);
-			continue;
-		}
-		if (obj->type == OBJ_BLOB) {
+		} else if (obj->type == OBJ_BLOB) {
 			if (2 <= blobs)
 				die(_("more than two blobs given: '%s'"), name);
 			hashcpy(blob[blobs].sha1, obj->sha1);
 			blob[blobs].name = name;
 			blob[blobs].mode = entry->mode;
 			blobs++;
-			continue;
 
+		} else {
+			die(_("unhandled object '%s' given."), name);
 		}
-		die(_("unhandled object '%s' given."), name);
 	}
 	if (rev.prune_data.nr) {
 		if (!path)
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 08/25] revision: split some overly-long lines
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (6 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 07/25] cmd_diff(): make it obvious which cases are exclusive of each other Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 09/25] object_array: add function object_array_filter() Michael Haggerty
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 revision.c | 20 ++++++++++++++------
 revision.h | 32 +++++++++++++++++++++-----------
 2 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/revision.c b/revision.c
index 25e424c..8ac88d6 100644
--- a/revision.c
+++ b/revision.c
@@ -70,7 +70,8 @@ static int show_path_truncated(FILE *out, const struct name_path *path)
 	return ours || emitted;
 }
 
-void show_object_with_name(FILE *out, struct object *obj, const struct name_path *path, const char *component)
+void show_object_with_name(FILE *out, struct object *obj,
+			   const struct name_path *path, const char *component)
 {
 	struct name_path leaf;
 	leaf.up = (struct name_path *)path;
@@ -186,7 +187,9 @@ void mark_parents_uninteresting(struct commit *commit)
 	}
 }
 
-static void add_pending_object_with_mode(struct rev_info *revs, struct object *obj, const char *name, unsigned mode)
+static void add_pending_object_with_mode(struct rev_info *revs,
+					 struct object *obj,
+					 const char *name, unsigned mode)
 {
 	if (!obj)
 		return;
@@ -209,7 +212,8 @@ static void add_pending_object_with_mode(struct rev_info *revs, struct object *o
 	add_object_array_with_mode(obj, name, &revs->pending, mode);
 }
 
-void add_pending_object(struct rev_info *revs, struct object *obj, const char *name)
+void add_pending_object(struct rev_info *revs,
+			struct object *obj, const char *name)
 {
 	add_pending_object_with_mode(revs, obj, name, S_IFINVALID);
 }
@@ -226,7 +230,9 @@ void add_head_to_pending(struct rev_info *revs)
 	add_pending_object(revs, obj, "HEAD");
 }
 
-static struct object *get_reference(struct rev_info *revs, const char *name, const unsigned char *sha1, unsigned int flags)
+static struct object *get_reference(struct rev_info *revs, const char *name,
+				    const unsigned char *sha1,
+				    unsigned int flags)
 {
 	struct object *object;
 
@@ -247,7 +253,8 @@ void add_pending_sha1(struct rev_info *revs, const char *name,
 	add_pending_object(revs, object, name);
 }
 
-static struct commit *handle_commit(struct rev_info *revs, struct object *object, const char *name)
+static struct commit *handle_commit(struct rev_info *revs,
+				    struct object *object, const char *name)
 {
 	unsigned long flags = object->flags;
 
@@ -368,7 +375,8 @@ static void file_change(struct diff_options *options,
 	DIFF_OPT_SET(options, HAS_CHANGES);
 }
 
-static int rev_compare_tree(struct rev_info *revs, struct commit *parent, struct commit *commit)
+static int rev_compare_tree(struct rev_info *revs,
+			    struct commit *parent, struct commit *commit)
 {
 	struct tree *t1 = parent->tree;
 	struct tree *t2 = commit->tree;
diff --git a/revision.h b/revision.h
index 01bd2b7..9628465 100644
--- a/revision.h
+++ b/revision.h
@@ -195,19 +195,23 @@ struct setup_revision_opt {
 };
 
 extern void init_revisions(struct rev_info *revs, const char *prefix);
-extern int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct setup_revision_opt *);
+extern int setup_revisions(int argc, const char **argv, struct rev_info *revs,
+			   struct setup_revision_opt *);
 extern void parse_revision_opt(struct rev_info *revs, struct parse_opt_ctx_t *ctx,
-				 const struct option *options,
-				 const char * const usagestr[]);
+			       const struct option *options,
+			       const char * const usagestr[]);
 #define REVARG_CANNOT_BE_FILENAME 01
 #define REVARG_COMMITTISH 02
-extern int handle_revision_arg(const char *arg, struct rev_info *revs, int flags, unsigned revarg_opt);
+extern int handle_revision_arg(const char *arg, struct rev_info *revs,
+			       int flags, unsigned revarg_opt);
 
 extern void reset_revision_walk(void);
 extern int prepare_revision_walk(struct rev_info *revs);
 extern struct commit *get_revision(struct rev_info *revs);
-extern char *get_revision_mark(const struct rev_info *revs, const struct commit *commit);
-extern void put_revision_mark(const struct rev_info *revs, const struct commit *commit);
+extern char *get_revision_mark(const struct rev_info *revs,
+			       const struct commit *commit);
+extern void put_revision_mark(const struct rev_info *revs,
+			      const struct commit *commit);
 
 extern void mark_parents_uninteresting(struct commit *commit);
 extern void mark_tree_uninteresting(struct tree *tree);
@@ -220,15 +224,19 @@ struct name_path {
 
 char *path_name(const struct name_path *path, const char *name);
 
-extern void show_object_with_name(FILE *, struct object *, const struct name_path *, const char *);
+extern void show_object_with_name(FILE *, struct object *,
+				  const struct name_path *, const char *);
 
 extern void add_object(struct object *obj,
 		       struct object_array *p,
 		       struct name_path *path,
 		       const char *name);
 
-extern void add_pending_object(struct rev_info *revs, struct object *obj, const char *name);
-extern void add_pending_sha1(struct rev_info *revs, const char *name, const unsigned char *sha1, unsigned int flags);
+extern void add_pending_object(struct rev_info *revs,
+			       struct object *obj, const char *name);
+extern void add_pending_sha1(struct rev_info *revs,
+			     const char *name, const unsigned char *sha1,
+			     unsigned int flags);
 
 extern void add_head_to_pending(struct rev_info *);
 
@@ -238,7 +246,9 @@ enum commit_action {
 	commit_error
 };
 
-extern enum commit_action get_commit_action(struct rev_info *revs, struct commit *commit);
-extern enum commit_action simplify_commit(struct rev_info *revs, struct commit *commit);
+extern enum commit_action get_commit_action(struct rev_info *revs,
+					    struct commit *commit);
+extern enum commit_action simplify_commit(struct rev_info *revs,
+					  struct commit *commit);
 
 #endif
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 09/25] object_array: add function object_array_filter()
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (7 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 08/25] revision: split some overly-long lines Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 10/25] revision: use object_array_filter() in implementation of gc_boundary() Michael Haggerty
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Add a function that allows unwanted entries in an object_array to be
removed.  This encapsulation is a step towards giving object_array
ownership of its entries' name memory.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 object.c | 16 ++++++++++++++++
 object.h | 11 +++++++++++
 2 files changed, 27 insertions(+)

diff --git a/object.c b/object.c
index 20703f5..fcd4a82 100644
--- a/object.c
+++ b/object.c
@@ -278,6 +278,22 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj
 	array->nr = ++nr;
 }
 
+void object_array_filter(struct object_array *array,
+			 object_array_each_func_t want, void *cb_data)
+{
+	unsigned nr = array->nr, src, dst;
+	struct object_array_entry *objects = array->objects;
+
+	for (src = dst = 0; src < nr; src++) {
+		if (want(&objects[src], cb_data)) {
+			if (src != dst)
+				objects[dst] = objects[src];
+			dst++;
+		}
+	}
+	array->nr = dst;
+}
+
 void object_array_remove_duplicates(struct object_array *array)
 {
 	unsigned int ref, src, dst;
diff --git a/object.h b/object.h
index 97d384b..0d39ff4 100644
--- a/object.h
+++ b/object.h
@@ -85,6 +85,17 @@ int object_list_contains(struct object_list *list, struct object *obj);
 /* Object array handling .. */
 void add_object_array(struct object *obj, const char *name, struct object_array *array);
 void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode);
+
+typedef int (*object_array_each_func_t)(struct object_array_entry *, void *);
+
+/*
+ * Apply want to each entry in array, retaining only the entries for
+ * which the function returns true.  Preserve the order of the entries
+ * that are retained.
+ */
+void object_array_filter(struct object_array *array,
+			 object_array_each_func_t want, void *cb_data);
+
 void object_array_remove_duplicates(struct object_array *);
 
 void clear_object_flags(unsigned flags);
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 10/25] revision: use object_array_filter() in implementation of gc_boundary()
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (8 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 09/25] object_array: add function object_array_filter() Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying Michael Haggerty
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Use object_array_filter(), which will soon be made smarter about
cleaning up discarded entries properly.  Also add a function comment.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---

This version changes the test to "nr == alloc" for clarity, but
doesn't move the test to the caller as did v1 of the patch series.

 revision.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/revision.c b/revision.c
index 8ac88d6..be73cb4 100644
--- a/revision.c
+++ b/revision.c
@@ -2435,25 +2435,23 @@ static struct commit *get_revision_1(struct rev_info *revs)
 	return NULL;
 }
 
-static void gc_boundary(struct object_array *array)
+/*
+ * Return true for entries that have not yet been shown.  (This is an
+ * object_array_each_func_t.)
+ */
+static int entry_unshown(struct object_array_entry *entry, void *cb_data_unused)
 {
-	unsigned nr = array->nr;
-	unsigned alloc = array->alloc;
-	struct object_array_entry *objects = array->objects;
+	return !(entry->item->flags & SHOWN);
+}
 
-	if (alloc <= nr) {
-		unsigned i, j;
-		for (i = j = 0; i < nr; i++) {
-			if (objects[i].item->flags & SHOWN)
-				continue;
-			if (i != j)
-				objects[j] = objects[i];
-			j++;
-		}
-		for (i = j; i < nr; i++)
-			objects[i].item = NULL;
-		array->nr = j;
-	}
+/*
+ * If array is on the verge of a realloc, garbage-collect any entries
+ * that have already been shown to try to free up some space.
+ */
+static void gc_boundary(struct object_array *array)
+{
+	if (array->nr == array->alloc)
+		object_array_filter(array, entry_unshown, NULL);
 }
 
 static void create_boundary_commit_list(struct rev_info *revs)
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (9 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 10/25] revision: use object_array_filter() in implementation of gc_boundary() Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-29 16:18   ` Junio C Hamano
  2013-05-25  9:08 ` [PATCH v2 12/25] fsck: don't put a void*-shaped peg in a char*-shaped hole Michael Haggerty
                   ` (14 subsequent siblings)
  25 siblings, 1 reply; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

The old version copied one entry to its destination position, then
deleted any matching entries from the tail of the array.  This
required the tail of the array to be copied multiple times.  It didn't
affect the complexity of the algorithm because the whole tail has to
be searched through anyway.  But all the copying was unnecessary.

Instead, check for the existence of an entry with the same name in the
*head* of the list before copying an entry to its final position.
This way each entry has to be copied at most one time.

Extract a helper function contains_name() to do a bit of the work.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 object.c | 32 +++++++++++++++++++++-----------
 object.h |  6 +++++-
 2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/object.c b/object.c
index fcd4a82..10b5349 100644
--- a/object.c
+++ b/object.c
@@ -294,22 +294,32 @@ void object_array_filter(struct object_array *array,
 	array->nr = dst;
 }
 
+/*
+ * Return true iff array already contains an entry with name.
+ */
+static int contains_name(struct object_array *array, const char *name)
+{
+	unsigned nr = array->nr, i;
+	struct object_array_entry *object = array->objects;
+
+	for (i = 0; i < nr; i++, object++)
+		if (!strcmp(object->name, name))
+			return 1;
+	return 0;
+}
+
 void object_array_remove_duplicates(struct object_array *array)
 {
-	unsigned int ref, src, dst;
+	unsigned nr = array->nr, src;
 	struct object_array_entry *objects = array->objects;
 
-	for (ref = 0; ref + 1 < array->nr; ref++) {
-		for (src = ref + 1, dst = src;
-		     src < array->nr;
-		     src++) {
-			if (!strcmp(objects[ref].name, objects[src].name))
-				continue;
-			if (src != dst)
-				objects[dst] = objects[src];
-			dst++;
+	array->nr = 0;
+	for (src = 0; src < nr; src++) {
+		if (!contains_name(array, objects[src].name)) {
+			if (src != array->nr)
+				objects[array->nr] = objects[src];
+			array->nr++;
 		}
-		array->nr = dst;
 	}
 }
 
diff --git a/object.h b/object.h
index 0d39ff4..6c1c27f 100644
--- a/object.h
+++ b/object.h
@@ -96,7 +96,11 @@ typedef int (*object_array_each_func_t)(struct object_array_entry *, void *);
 void object_array_filter(struct object_array *array,
 			 object_array_each_func_t want, void *cb_data);
 
-void object_array_remove_duplicates(struct object_array *);
+/*
+ * Remove from array all but the first entry with a given name.
+ * Warning: this function uses an O(N^2) algorithm.
+ */
+void object_array_remove_duplicates(struct object_array *array);
 
 void clear_object_flags(unsigned flags);
 
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 12/25] fsck: don't put a void*-shaped peg in a char*-shaped hole
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (10 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 13/25] find_first_merges(): initialize merges variable using initializer Michael Haggerty
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

The source of this nonsense was

    04d3975937 fsck: reduce stack footprint

, which wedged a pointer to parent into the object_array_entry's name
field.  The parent pointer was passed to traverse_one_object(), even
though that function *didn't use it*.

The useless code has been deleted over time.  Commit

    a1cdc25172 fsck: drop unused parameter from traverse_one_object()

removed the parent pointer from traverse_one_object()'s
signature. Commit

    c0aa335c95 Remove unused variables

removed the code that read the parent pointer back out of the name
field.

This commit takes the last step: don't write the parent pointer into
the name field in the first place.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/fsck.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index bb9a2cd..9909b6d 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -112,7 +112,7 @@ static int mark_object(struct object *obj, int type, void *data)
 		return 1;
 	}
 
-	add_object_array(obj, (void *) parent, &pending);
+	add_object_array(obj, NULL, &pending);
 	return 0;
 }
 
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 13/25] find_first_merges(): initialize merges variable using initializer
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (11 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 12/25] fsck: don't put a void*-shaped peg in a char*-shaped hole Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 14/25] find_first_merges(): remove unnecessary code Michael Haggerty
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 submodule.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/submodule.c b/submodule.c
index e728025..b837c04 100644
--- a/submodule.c
+++ b/submodule.c
@@ -846,7 +846,7 @@ static int find_first_merges(struct object_array *result, const char *path,
 		struct commit *a, struct commit *b)
 {
 	int i, j;
-	struct object_array merges;
+	struct object_array merges = OBJECT_ARRAY_INIT;
 	struct commit *commit;
 	int contains_another;
 
@@ -856,7 +856,6 @@ static int find_first_merges(struct object_array *result, const char *path,
 	struct rev_info revs;
 	struct setup_revision_opt rev_opts;
 
-	memset(&merges, 0, sizeof(merges));
 	memset(result, 0, sizeof(struct object_array));
 	memset(&rev_opts, 0, sizeof(rev_opts));
 
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 14/25] find_first_merges(): remove unnecessary code
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (12 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 13/25] find_first_merges(): initialize merges variable using initializer Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 15/25] object_array_entry: fix memory handling of the name field Michael Haggerty
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

No names are ever set for the object_array_entries in merges, so there
is no need to pretend to copy them to the result array.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 submodule.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/submodule.c b/submodule.c
index b837c04..ad476ce 100644
--- a/submodule.c
+++ b/submodule.c
@@ -893,8 +893,7 @@ static int find_first_merges(struct object_array *result, const char *path,
 		}
 
 		if (!contains_another)
-			add_object_array(merges.objects[i].item,
-					 merges.objects[i].name, result);
+			add_object_array(merges.objects[i].item, NULL, result);
 	}
 
 	free(merges.objects);
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 15/25] object_array_entry: fix memory handling of the name field
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (13 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 14/25] find_first_merges(): remove unnecessary code Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-29 16:24   ` Junio C Hamano
  2013-05-25  9:08 ` [PATCH v2 16/25] do_fetch(): reduce scope of peer_item Michael Haggerty
                   ` (10 subsequent siblings)
  25 siblings, 1 reply; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Previously, the memory management of the object_array_entry::name
field was inconsistent and undocumented.  object_array_entries are
ultimately created by a single function, add_object_array_with_mode(),
which has an argument "const char *name".  This function used to
simply set the name field to reference the string pointed to by the
name parameter, and nobody on the object_array side ever freed the
memory.  Thus, it assumed that the memory for the name field would be
managed by the caller, and that the lifetime of that string would be
at least as long as the lifetime of the object_array_entry.  But
callers were inconsistent:

* Some passed pointers to constant strings or argv entries, which was
  OK.

* Some passed pointers to newly-allocated memory, but didn't arrange
  for the memory ever to be freed.

* Some passed the return value of sha1_to_hex(), which is a pointer to
  a statically-allocated buffer that can be overwritten at any time.

* Some passed pointers to refnames that they received from a
  for_each_ref()-type iteration, but the lifetimes of such refnames is
  not guaranteed by the refs API.

Bring consistency to this mess by changing object_array to make its
own copy for the object_array_entry::name field and free this memory
when an object_array_entry is deleted from the array.

Many callers were passing the empty string as the name parameter, so
as a performance optimization, treat the empty string specially.
Instead of making a copy, store a pointer to a statically-allocated
empty string to object_array_entry::name.  When deleting such an
entry, skip the free().

Change the callers that were already passing copies to
add_object_array_with_mode() to either skip the copy, or (if the
memory needed to be allocated anyway) freeing the memory itself.

A part of this commit effectively reverts

    70d26c6e76 read_revisions_from_stdin: make copies for handle_revision_arg

because the copying introduced by that commit (which is still
necessary) is now done at a deeper level.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 bundle.c   |  2 +-
 object.c   | 26 +++++++++++++++++++++++---
 object.h   |  8 +++++++-
 revision.c |  6 ++++--
 4 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/bundle.c b/bundle.c
index 4b0e5cd..3d64311 100644
--- a/bundle.c
+++ b/bundle.c
@@ -281,7 +281,7 @@ int create_bundle(struct bundle_header *header, const char *path,
 			if (!get_sha1_hex(buf.buf + 1, sha1)) {
 				struct object *object = parse_object_or_die(sha1, buf.buf);
 				object->flags |= UNINTERESTING;
-				add_pending_object(&revs, object, xstrdup(buf.buf));
+				add_pending_object(&revs, object, buf.buf);
 			}
 		} else if (!get_sha1_hex(buf.buf, sha1)) {
 			struct object *object = parse_object_or_die(sha1, buf.buf);
diff --git a/object.c b/object.c
index 10b5349..e4ff714 100644
--- a/object.c
+++ b/object.c
@@ -260,11 +260,18 @@ void add_object_array(struct object *obj, const char *name, struct object_array
 	add_object_array_with_mode(obj, name, array, S_IFINVALID);
 }
 
+/*
+ * A zero-length string to which object_array_entry::name can be
+ * initialized without requiring a malloc/free.
+ */
+char object_array_slopbuf[1];
+
 void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
 {
 	unsigned nr = array->nr;
 	unsigned alloc = array->alloc;
 	struct object_array_entry *objects = array->objects;
+	struct object_array_entry *entry;
 
 	if (nr >= alloc) {
 		alloc = (alloc + 32) * 2;
@@ -272,9 +279,16 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj
 		array->alloc = alloc;
 		array->objects = objects;
 	}
-	objects[nr].item = obj;
-	objects[nr].name = name;
-	objects[nr].mode = mode;
+	entry = &objects[nr];
+	entry->item = obj;
+	if (!name)
+		entry->name = NULL;
+	else if (!*name)
+		/* Use our own empty string instead of allocating one: */
+		entry->name = object_array_slopbuf;
+	else
+		entry->name = xstrdup(name);
+	entry->mode = mode;
 	array->nr = ++nr;
 }
 
@@ -289,6 +303,9 @@ void object_array_filter(struct object_array *array,
 			if (src != dst)
 				objects[dst] = objects[src];
 			dst++;
+		} else {
+			if (objects[src].name != object_array_slopbuf)
+				free(objects[src].name);
 		}
 	}
 	array->nr = dst;
@@ -319,6 +336,9 @@ void object_array_remove_duplicates(struct object_array *array)
 			if (src != array->nr)
 				objects[array->nr] = objects[src];
 			array->nr++;
+		} else {
+			if (objects[src].name != object_array_slopbuf)
+				free(objects[src].name);
 		}
 	}
 }
diff --git a/object.h b/object.h
index 6c1c27f..2ff68c5 100644
--- a/object.h
+++ b/object.h
@@ -11,7 +11,13 @@ struct object_array {
 	unsigned int alloc;
 	struct object_array_entry {
 		struct object *item;
-		const char *name;
+		/*
+		 * name or NULL.  If non-NULL, the memory pointed to
+		 * is owned by this object *except* if it points at
+		 * object_array_slopbuf, which is a static copy of the
+		 * empty string.
+		 */
+		char *name;
 		unsigned mode;
 	} *objects;
 };
diff --git a/revision.c b/revision.c
index be73cb4..4aeda33 100644
--- a/revision.c
+++ b/revision.c
@@ -88,7 +88,9 @@ void add_object(struct object *obj,
 		struct name_path *path,
 		const char *name)
 {
-	add_object_array(obj, path_name(path, name), p);
+	char *pn = path_name(path, name);
+	add_object_array(obj, pn, p);
+	free(pn);
 }
 
 static void mark_blob_uninteresting(struct blob *blob)
@@ -1288,7 +1290,7 @@ static void read_revisions_from_stdin(struct rev_info *revs,
 			}
 			die("options not supported in --stdin mode");
 		}
-		if (handle_revision_arg(xstrdup(sb.buf), revs, 0,
+		if (handle_revision_arg(sb.buf, revs, 0,
 					REVARG_CANNOT_BE_FILENAME))
 			die("bad revision '%s'", sb.buf);
 	}
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 16/25] do_fetch(): reduce scope of peer_item
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (14 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 15/25] object_array_entry: fix memory handling of the name field Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 17/25] do_fetch(): clean up existing_refs before exiting Michael Haggerty
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/fetch.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f949115..80c6e37 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -694,7 +694,6 @@ static int do_fetch(struct transport *transport,
 		    struct refspec *refs, int ref_count)
 {
 	struct string_list existing_refs = STRING_LIST_INIT_DUP;
-	struct string_list_item *peer_item = NULL;
 	struct ref *ref_map;
 	struct ref *rm;
 	int autotags = (transport->remote->fetch_tags == 1);
@@ -724,8 +723,9 @@ static int do_fetch(struct transport *transport,
 
 	for (rm = ref_map; rm; rm = rm->next) {
 		if (rm->peer_ref) {
-			peer_item = string_list_lookup(&existing_refs,
-						       rm->peer_ref->name);
+			struct string_list_item *peer_item =
+				string_list_lookup(&existing_refs,
+						   rm->peer_ref->name);
 			if (peer_item)
 				hashcpy(rm->peer_ref->old_sha1,
 					peer_item->util);
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 17/25] do_fetch(): clean up existing_refs before exiting
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (15 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 16/25] do_fetch(): reduce scope of peer_item Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 18/25] add_existing(): do not retain a reference to sha1 Michael Haggerty
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/fetch.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 80c6e37..48df5fa 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -697,6 +697,7 @@ static int do_fetch(struct transport *transport,
 	struct ref *ref_map;
 	struct ref *rm;
 	int autotags = (transport->remote->fetch_tags == 1);
+	int retcode = 0;
 
 	for_each_ref(add_existing, &existing_refs);
 
@@ -712,9 +713,9 @@ static int do_fetch(struct transport *transport,
 
 	/* if not appending, truncate FETCH_HEAD */
 	if (!append && !dry_run) {
-		int errcode = truncate_fetch_head();
-		if (errcode)
-			return errcode;
+		retcode = truncate_fetch_head();
+		if (retcode)
+			goto cleanup;
 	}
 
 	ref_map = get_ref_map(transport, refs, ref_count, tags, &autotags);
@@ -736,7 +737,8 @@ static int do_fetch(struct transport *transport,
 		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
 	if (fetch_refs(transport, ref_map)) {
 		free_refs(ref_map);
-		return 1;
+		retcode = 1;
+		goto cleanup;
 	}
 	if (prune) {
 		/* If --tags was specified, pretend the user gave us the canonical tags refspec */
@@ -779,7 +781,9 @@ static int do_fetch(struct transport *transport,
 		free_refs(ref_map);
 	}
 
-	return 0;
+ cleanup:
+	string_list_clear(&existing_refs, 0);
+	return retcode;
 }
 
 static void set_option(const char *name, const char *value)
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 18/25] add_existing(): do not retain a reference to sha1
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (16 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 17/25] do_fetch(): clean up existing_refs before exiting Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 19/25] show_head_ref(): do not shadow name of argument Michael Haggerty
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Its lifetime is not guaranteed, so make a copy.  Free the memory when
the string_list is cleared.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/fetch.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 48df5fa..fa6fe44 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -571,7 +571,8 @@ static int add_existing(const char *refname, const unsigned char *sha1,
 {
 	struct string_list *list = (struct string_list *)cbdata;
 	struct string_list_item *item = string_list_insert(list, refname);
-	item->util = (void *)sha1;
+	item->util = xmalloc(20);
+	hashcpy(item->util, sha1);
 	return 0;
 }
 
@@ -636,7 +637,7 @@ static void find_non_local_tags(struct transport *transport,
 		item = string_list_insert(&remote_refs, ref->name);
 		item->util = (void *)ref->old_sha1;
 	}
-	string_list_clear(&existing_refs, 0);
+	string_list_clear(&existing_refs, 1);
 
 	/*
 	 * We may have a final lightweight tag that needs to be
@@ -782,7 +783,7 @@ static int do_fetch(struct transport *transport,
 	}
 
  cleanup:
-	string_list_clear(&existing_refs, 0);
+	string_list_clear(&existing_refs, 1);
 	return retcode;
 }
 
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 19/25] show_head_ref(): do not shadow name of argument
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (17 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 18/25] add_existing(): do not retain a reference to sha1 Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 20/25] show_head_ref(): rename first parameter to "refname" Michael Haggerty
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 http-backend.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index 6b85ffa..3135835 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -416,8 +416,8 @@ static int show_head_ref(const char *name, const unsigned char *sha1,
 	struct strbuf *buf = cb_data;
 
 	if (flag & REF_ISSYMREF) {
-		unsigned char sha1[20];
-		const char *target = resolve_ref_unsafe(name, sha1, 1, NULL);
+		unsigned char unused[20];
+		const char *target = resolve_ref_unsafe(name, unused, 1, NULL);
 		const char *target_nons = strip_namespace(target);
 
 		strbuf_addf(buf, "ref: %s\n", target_nons);
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 20/25] show_head_ref(): rename first parameter to "refname"
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (18 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 19/25] show_head_ref(): do not shadow name of argument Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 21/25] string_list_add_one_ref(): " Michael Haggerty
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

This is the usual convention.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 http-backend.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index 3135835..0324417 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -410,14 +410,14 @@ static void get_info_refs(char *arg)
 	strbuf_release(&buf);
 }
 
-static int show_head_ref(const char *name, const unsigned char *sha1,
+static int show_head_ref(const char *refname, const unsigned char *sha1,
 	int flag, void *cb_data)
 {
 	struct strbuf *buf = cb_data;
 
 	if (flag & REF_ISSYMREF) {
 		unsigned char unused[20];
-		const char *target = resolve_ref_unsafe(name, unused, 1, NULL);
+		const char *target = resolve_ref_unsafe(refname, unused, 1, NULL);
 		const char *target_nons = strip_namespace(target);
 
 		strbuf_addf(buf, "ref: %s\n", target_nons);
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 21/25] string_list_add_one_ref(): rename first parameter to "refname"
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (19 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 20/25] show_head_ref(): rename first parameter to "refname" Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 22/25] string_list_add_refs_by_glob(): add a comment about memory management Michael Haggerty
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

This is the usual convention.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 notes.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/notes.c b/notes.c
index f63fd57..fa7cdf7 100644
--- a/notes.c
+++ b/notes.c
@@ -918,12 +918,12 @@ out:
 	return ret;
 }
 
-static int string_list_add_one_ref(const char *path, const unsigned char *sha1,
+static int string_list_add_one_ref(const char *refname, const unsigned char *sha1,
 				   int flag, void *cb)
 {
 	struct string_list *refs = cb;
-	if (!unsorted_string_list_has_string(refs, path))
-		string_list_append(refs, path);
+	if (!unsorted_string_list_has_string(refs, refname))
+		string_list_append(refs, refname);
 	return 0;
 }
 
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 22/25] string_list_add_refs_by_glob(): add a comment about memory management
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (20 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 21/25] string_list_add_one_ref(): " Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-29  8:21   ` Thomas Rast
  2013-05-25  9:08 ` [PATCH v2 23/25] exclude_existing(): set existing_refs.strdup_strings Michael Haggerty
                   ` (3 subsequent siblings)
  25 siblings, 1 reply; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Since string_list_add_one_ref() adds refname to the string list, but
the lifetime of refname is limited, it is important that the
string_list passed to string_list_add_one_ref() has strdup_strings
set.  Document this fact.

All current callers do the right thing.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 notes.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/notes.c b/notes.c
index fa7cdf7..602d956 100644
--- a/notes.c
+++ b/notes.c
@@ -927,6 +927,9 @@ static int string_list_add_one_ref(const char *refname, const unsigned char *sha
 	return 0;
 }
 
+/*
+ * The list argument must have strdup_strings set on it.
+ */
 void string_list_add_refs_by_glob(struct string_list *list, const char *glob)
 {
 	if (has_glob_specials(glob)) {
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 23/25] exclude_existing(): set existing_refs.strdup_strings
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (21 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 22/25] string_list_add_refs_by_glob(): add a comment about memory management Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-25  9:08 ` [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1 Michael Haggerty
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

The each_ref_fn add_existing() adds refnames to the existing_refs
list.  But the lifetimes of these refnames is not guaranteed by the
refs API, so configure the string_list to make copies as it adds them.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 builtin/show-ref.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index 8d9b76a..4a0310d 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -103,7 +103,7 @@ static int add_existing(const char *refname, const unsigned char *sha1, int flag
  */
 static int exclude_existing(const char *match)
 {
-	static struct string_list existing_refs = STRING_LIST_INIT_NODUP;
+	static struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	char buf[1024];
 	int matchlen = match ? strlen(match) : 0;
 
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (22 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 23/25] exclude_existing(): set existing_refs.strdup_strings Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-29 16:53   ` Junio C Hamano
  2013-05-25  9:08 ` [PATCH v2 25/25] refs: document the lifetime of the args passed to each_ref_fn Michael Haggerty
  2013-05-29  8:25 ` [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Thomas Rast
  25 siblings, 1 reply; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

The lifetime of the sha1 parameter passed to an each_ref_fn callback
is not guaranteed, so make a copy for later use.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 bisect.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/bisect.c b/bisect.c
index 374d9e2..71c1958 100644
--- a/bisect.c
+++ b/bisect.c
@@ -15,7 +15,7 @@
 static struct sha1_array good_revs;
 static struct sha1_array skipped_revs;
 
-static const unsigned char *current_bad_sha1;
+static unsigned char *current_bad_sha1;
 
 static const char *argv_checkout[] = {"checkout", "-q", NULL, "--", NULL};
 static const char *argv_show_branch[] = {"show-branch", NULL, NULL};
@@ -404,7 +404,8 @@ static int register_ref(const char *refname, const unsigned char *sha1,
 			int flags, void *cb_data)
 {
 	if (!strcmp(refname, "bad")) {
-		current_bad_sha1 = sha1;
+		current_bad_sha1 = xmalloc(20);
+		hashcpy(current_bad_sha1, sha1);
 	} else if (!prefixcmp(refname, "good-")) {
 		sha1_array_append(&good_revs, sha1);
 	} else if (!prefixcmp(refname, "skip-")) {
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 25/25] refs: document the lifetime of the args passed to each_ref_fn
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (23 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1 Michael Haggerty
@ 2013-05-25  9:08 ` Michael Haggerty
  2013-05-29 16:54   ` Junio C Hamano
  2013-05-29  8:25 ` [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Thomas Rast
  25 siblings, 1 reply; 40+ messages in thread
From: Michael Haggerty @ 2013-05-25  9:08 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

The lifetime of the memory pointed to by the refname and sha1
arguments to each_ref_fn was never documented, but some callers used
to assume that it was essentially permanent.  In fact the API does
*not* guarantee that these objects live beyond a single callback
invocation.

In the current code, the lifetimes are bound together with the
lifetimes of the ref_caches.  Since these are usually long, the
callers usually got away with their sloppiness.  But even today, if a
ref_cache is invalidated the memory can be freed.  And planned changes
to reference caching, needed to eliminate race conditions, will
probably need to shorten the lifetimes of these objects.

The commits leading up to this have (hopefully) fixed all of the
callers of the for_each_ref()-like functions.  This commit does the
last step: documents what each_ref_fn callbacks can assume about
object lifetimes.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.h | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/refs.h b/refs.h
index a35eafc..122ec03 100644
--- a/refs.h
+++ b/refs.h
@@ -15,13 +15,23 @@ struct ref_lock {
 #define REF_ISBROKEN 0x04
 
 /*
- * Calls the specified function for each ref file until it returns
- * nonzero, and returns the value.  Please note that it is not safe to
- * modify references while an iteration is in progress, unless the
- * same callback function invocation that modifies the reference also
- * returns a nonzero value to immediately stop the iteration.
+ * The signature for the callback function for the for_each_*()
+ * functions below.  The memory pointed to by the refname and sha1
+ * arguments is only guaranteed to be valid for the duration of a
+ * single callback invocation.
+ */
+typedef int each_ref_fn(const char *refname,
+			const unsigned char *sha1, int flags, void *cb_data);
+
+/*
+ * The following functions invoke the specified callback function for
+ * each reference indicated.  If the function ever returns a nonzero
+ * value, stop the iteration and return that value.  Please note that
+ * it is not safe to modify references while an iteration is in
+ * progress, unless the same callback function invocation that
+ * modifies the reference also returns a nonzero value to immediately
+ * stop the iteration.
  */
-typedef int each_ref_fn(const char *refname, const unsigned char *sha1, int flags, void *cb_data);
 extern int head_ref(each_ref_fn, void *);
 extern int for_each_ref(each_ref_fn, void *);
 extern int for_each_ref_in(const char *, each_ref_fn, void *);
-- 
1.8.2.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 22/25] string_list_add_refs_by_glob(): add a comment about memory management
  2013-05-25  9:08 ` [PATCH v2 22/25] string_list_add_refs_by_glob(): add a comment about memory management Michael Haggerty
@ 2013-05-29  8:21   ` Thomas Rast
  2013-05-30 19:29     ` Michael Haggerty
  2013-05-30 22:05     ` [PATCH v2 FIXUP 22/25] fixup! " Michael Haggerty
  0 siblings, 2 replies; 40+ messages in thread
From: Thomas Rast @ 2013-05-29  8:21 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Jeff King, Johan Herland, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Since string_list_add_one_ref() adds refname to the string list, but
> the lifetime of refname is limited, it is important that the
> string_list passed to string_list_add_one_ref() has strdup_strings
> set.  Document this fact.
>
> All current callers do the right thing.
[...]
> +/*
> + * The list argument must have strdup_strings set on it.
> + */
>  void string_list_add_refs_by_glob(struct string_list *list, const char *glob)
>  {
>  	if (has_glob_specials(glob)) {

Maybe add an assert() so that this is bulletproof?

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes
  2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
                   ` (24 preceding siblings ...)
  2013-05-25  9:08 ` [PATCH v2 25/25] refs: document the lifetime of the args passed to each_ref_fn Michael Haggerty
@ 2013-05-29  8:25 ` Thomas Rast
  2013-05-30 19:55   ` Michael Haggerty
  25 siblings, 1 reply; 40+ messages in thread
From: Thomas Rast @ 2013-05-29  8:25 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Jeff King, Johan Herland, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

>

I read the entire series on Monday, and give it an Ack at maybe 90%
confidence level -- sorry, I was short on caffeine and sleep ;-)

I meant to verify this assertion:

> I did a manual audit of the 50 (!) functions that are used as an
> each_ref_fn callback to the for_each_ref()-style functions.  (I hope I
> haven't missed any.)  I checked that they do not make the assumption
> that the lifetimes of the refname and sha1 arguments extend past the
> duration of the callback invocation.  There were a number of callers
> that got this wrong; I believe I have fixed them all.

But time ran out, and I wouldn't want you to hold your breath.  The
series is a big improvement even if another caller slipped through the
cracks.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying
  2013-05-25  9:08 ` [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying Michael Haggerty
@ 2013-05-29 16:18   ` Junio C Hamano
  2013-05-30 21:14     ` Michael Haggerty
  0 siblings, 1 reply; 40+ messages in thread
From: Junio C Hamano @ 2013-05-29 16:18 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Thomas Rast, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> The old version copied one entry to its destination position, then
> deleted any matching entries from the tail of the array.  This
> required the tail of the array to be copied multiple times.  It didn't
> affect the complexity of the algorithm because the whole tail has to
> be searched through anyway.  But all the copying was unnecessary.
>
> Instead, check for the existence of an entry with the same name in the
> *head* of the list before copying an entry to its final position.
> This way each entry has to be copied at most one time.
>
> Extract a helper function contains_name() to do a bit of the work.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  object.c | 32 +++++++++++++++++++++-----------
>  object.h |  6 +++++-
>  2 files changed, 26 insertions(+), 12 deletions(-)
>
> diff --git a/object.c b/object.c
> index fcd4a82..10b5349 100644
> --- a/object.c
> +++ b/object.c
> @@ -294,22 +294,32 @@ void object_array_filter(struct object_array *array,
>  	array->nr = dst;
>  }
>  
> +/*
> + * Return true iff array already contains an entry with name.
> + */
> +static int contains_name(struct object_array *array, const char *name)
> +{
> +	unsigned nr = array->nr, i;
> +	struct object_array_entry *object = array->objects;
> +
> +	for (i = 0; i < nr; i++, object++)
> +		if (!strcmp(object->name, name))
> +			return 1;
> +	return 0;
> +}

Because some codepaths (e.g. patch 14/25) stuff NULL in the name
field, we may want to be more careful with this.

This is not a new problem, and I think the longer term solution is
to get rid of object_array_remove_duplicates(), so it is perfectly
fine to leave this function broken with respect to NULL input as-is.

The only caller of remove-duplicates is bundle.c, which gets many
starting points and end points from the command line and tries to be
nice by removing obvious duplicates, e.g.

	git bundle create t.bundle master master

but I think its logic of deduping is wrong.  It runs dwim_ref() on
the incoming refs after the remove-duplicates call, so

	git bundle create t.bundle master heads/mater

will end up with two copies of refs/heads/master.  To fix it, the
code must dedup the result of running dwim_ref(), and at that point,
there is no reason to call object_array_remove_duplicates().

> +
>  void object_array_remove_duplicates(struct object_array *array)
>  {
> -	unsigned int ref, src, dst;
> +	unsigned nr = array->nr, src;
>  	struct object_array_entry *objects = array->objects;
>  
> -	for (ref = 0; ref + 1 < array->nr; ref++) {
> -		for (src = ref + 1, dst = src;
> -		     src < array->nr;
> -		     src++) {
> -			if (!strcmp(objects[ref].name, objects[src].name))
> -				continue;
> -			if (src != dst)
> -				objects[dst] = objects[src];
> -			dst++;
> +	array->nr = 0;
> +	for (src = 0; src < nr; src++) {
> +		if (!contains_name(array, objects[src].name)) {
> +			if (src != array->nr)
> +				objects[array->nr] = objects[src];
> +			array->nr++;
>  		}
> -		array->nr = dst;
>  	}
>  }

>  
> diff --git a/object.h b/object.h
> index 0d39ff4..6c1c27f 100644
> --- a/object.h
> +++ b/object.h
> @@ -96,7 +96,11 @@ typedef int (*object_array_each_func_t)(struct object_array_entry *, void *);
>  void object_array_filter(struct object_array *array,
>  			 object_array_each_func_t want, void *cb_data);
>  
> -void object_array_remove_duplicates(struct object_array *);
> +/*
> + * Remove from array all but the first entry with a given name.
> + * Warning: this function uses an O(N^2) algorithm.
> + */
> +void object_array_remove_duplicates(struct object_array *array);
>  
>  void clear_object_flags(unsigned flags);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 15/25] object_array_entry: fix memory handling of the name field
  2013-05-25  9:08 ` [PATCH v2 15/25] object_array_entry: fix memory handling of the name field Michael Haggerty
@ 2013-05-29 16:24   ` Junio C Hamano
  0 siblings, 0 replies; 40+ messages in thread
From: Junio C Hamano @ 2013-05-29 16:24 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Thomas Rast, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Change the callers that were already passing copies to
> add_object_array_with_mode() to either skip the copy, or (if the
> memory needed to be allocated anyway) freeing the memory itself.
>
> A part of this commit effectively reverts
>
>     70d26c6e76 read_revisions_from_stdin: make copies for handle_revision_arg
>
> because the copying introduced by that commit (which is still
> necessary) is now done at a deeper level.

Good.  Thanks for working on this.

>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  bundle.c   |  2 +-
>  object.c   | 26 +++++++++++++++++++++++---
>  object.h   |  8 +++++++-
>  revision.c |  6 ++++--
>  4 files changed, 35 insertions(+), 7 deletions(-)
>
> diff --git a/bundle.c b/bundle.c
> index 4b0e5cd..3d64311 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -281,7 +281,7 @@ int create_bundle(struct bundle_header *header, const char *path,
>  			if (!get_sha1_hex(buf.buf + 1, sha1)) {
>  				struct object *object = parse_object_or_die(sha1, buf.buf);
>  				object->flags |= UNINTERESTING;
> -				add_pending_object(&revs, object, xstrdup(buf.buf));
> +				add_pending_object(&revs, object, buf.buf);
>  			}
>  		} else if (!get_sha1_hex(buf.buf, sha1)) {
>  			struct object *object = parse_object_or_die(sha1, buf.buf);
> diff --git a/object.c b/object.c
> index 10b5349..e4ff714 100644
> --- a/object.c
> +++ b/object.c
> @@ -260,11 +260,18 @@ void add_object_array(struct object *obj, const char *name, struct object_array
>  	add_object_array_with_mode(obj, name, array, S_IFINVALID);
>  }
>  
> +/*
> + * A zero-length string to which object_array_entry::name can be
> + * initialized without requiring a malloc/free.
> + */
> +char object_array_slopbuf[1];
> +
>  void add_object_array_with_mode(struct object *obj, const char *name, struct object_array *array, unsigned mode)
>  {
>  	unsigned nr = array->nr;
>  	unsigned alloc = array->alloc;
>  	struct object_array_entry *objects = array->objects;
> +	struct object_array_entry *entry;
>  
>  	if (nr >= alloc) {
>  		alloc = (alloc + 32) * 2;
> @@ -272,9 +279,16 @@ void add_object_array_with_mode(struct object *obj, const char *name, struct obj
>  		array->alloc = alloc;
>  		array->objects = objects;
>  	}
> -	objects[nr].item = obj;
> -	objects[nr].name = name;
> -	objects[nr].mode = mode;
> +	entry = &objects[nr];
> +	entry->item = obj;
> +	if (!name)
> +		entry->name = NULL;
> +	else if (!*name)
> +		/* Use our own empty string instead of allocating one: */
> +		entry->name = object_array_slopbuf;
> +	else
> +		entry->name = xstrdup(name);
> +	entry->mode = mode;
>  	array->nr = ++nr;
>  }
>  
> @@ -289,6 +303,9 @@ void object_array_filter(struct object_array *array,
>  			if (src != dst)
>  				objects[dst] = objects[src];
>  			dst++;
> +		} else {
> +			if (objects[src].name != object_array_slopbuf)
> +				free(objects[src].name);
>  		}
>  	}
>  	array->nr = dst;
> @@ -319,6 +336,9 @@ void object_array_remove_duplicates(struct object_array *array)
>  			if (src != array->nr)
>  				objects[array->nr] = objects[src];
>  			array->nr++;
> +		} else {
> +			if (objects[src].name != object_array_slopbuf)
> +				free(objects[src].name);
>  		}
>  	}
>  }
> diff --git a/object.h b/object.h
> index 6c1c27f..2ff68c5 100644
> --- a/object.h
> +++ b/object.h
> @@ -11,7 +11,13 @@ struct object_array {
>  	unsigned int alloc;
>  	struct object_array_entry {
>  		struct object *item;
> -		const char *name;
> +		/*
> +		 * name or NULL.  If non-NULL, the memory pointed to
> +		 * is owned by this object *except* if it points at
> +		 * object_array_slopbuf, which is a static copy of the
> +		 * empty string.
> +		 */
> +		char *name;
>  		unsigned mode;
>  	} *objects;
>  };
> diff --git a/revision.c b/revision.c
> index be73cb4..4aeda33 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -88,7 +88,9 @@ void add_object(struct object *obj,
>  		struct name_path *path,
>  		const char *name)
>  {
> -	add_object_array(obj, path_name(path, name), p);
> +	char *pn = path_name(path, name);
> +	add_object_array(obj, pn, p);
> +	free(pn);
>  }
>  
>  static void mark_blob_uninteresting(struct blob *blob)
> @@ -1288,7 +1290,7 @@ static void read_revisions_from_stdin(struct rev_info *revs,
>  			}
>  			die("options not supported in --stdin mode");
>  		}
> -		if (handle_revision_arg(xstrdup(sb.buf), revs, 0,
> +		if (handle_revision_arg(sb.buf, revs, 0,
>  					REVARG_CANNOT_BE_FILENAME))
>  			die("bad revision '%s'", sb.buf);
>  	}

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1
  2013-05-25  9:08 ` [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1 Michael Haggerty
@ 2013-05-29 16:53   ` Junio C Hamano
  2013-05-30 21:51     ` Michael Haggerty
  0 siblings, 1 reply; 40+ messages in thread
From: Junio C Hamano @ 2013-05-29 16:53 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Thomas Rast, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> The lifetime of the sha1 parameter passed to an each_ref_fn callback
> is not guaranteed, so make a copy for later use.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  bisect.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/bisect.c b/bisect.c
> index 374d9e2..71c1958 100644
> --- a/bisect.c
> +++ b/bisect.c
> @@ -15,7 +15,7 @@
>  static struct sha1_array good_revs;
>  static struct sha1_array skipped_revs;
>  
> -static const unsigned char *current_bad_sha1;
> +static unsigned char *current_bad_sha1;
>  
>  static const char *argv_checkout[] = {"checkout", "-q", NULL, "--", NULL};
>  static const char *argv_show_branch[] = {"show-branch", NULL, NULL};
> @@ -404,7 +404,8 @@ static int register_ref(const char *refname, const unsigned char *sha1,
>  			int flags, void *cb_data)
>  {
>  	if (!strcmp(refname, "bad")) {
> -		current_bad_sha1 = sha1;
> +		current_bad_sha1 = xmalloc(20);
> +		hashcpy(current_bad_sha1, sha1);

This, together with 18/25, may hint that we want hashdup()?

>  	} else if (!prefixcmp(refname, "good-")) {
>  		sha1_array_append(&good_revs, sha1);
>  	} else if (!prefixcmp(refname, "skip-")) {

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 25/25] refs: document the lifetime of the args passed to each_ref_fn
  2013-05-25  9:08 ` [PATCH v2 25/25] refs: document the lifetime of the args passed to each_ref_fn Michael Haggerty
@ 2013-05-29 16:54   ` Junio C Hamano
  0 siblings, 0 replies; 40+ messages in thread
From: Junio C Hamano @ 2013-05-29 16:54 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Thomas Rast, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> The commits leading up to this have (hopefully) fixed all of the
> callers of the for_each_ref()-like functions.  This commit does the
> last step: documents what each_ref_fn callbacks can assume about
> object lifetimes.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>

I looked at all the patches in the series and they all looked
sensible.

I have a few comments (sent as individual review) but all of the
suggestions in them are "by the way I noticed this issue that is not
new to this series while I was reading the code, and we may want to
fix it elsewhere", not "this is broken in the patch, we need to fix
it before queuing".

Thanks for a pleasant read.

> ---
>  refs.h | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/refs.h b/refs.h
> index a35eafc..122ec03 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -15,13 +15,23 @@ struct ref_lock {
>  #define REF_ISBROKEN 0x04
>  
>  /*
> - * Calls the specified function for each ref file until it returns
> - * nonzero, and returns the value.  Please note that it is not safe to
> - * modify references while an iteration is in progress, unless the
> - * same callback function invocation that modifies the reference also
> - * returns a nonzero value to immediately stop the iteration.
> + * The signature for the callback function for the for_each_*()
> + * functions below.  The memory pointed to by the refname and sha1
> + * arguments is only guaranteed to be valid for the duration of a
> + * single callback invocation.
> + */
> +typedef int each_ref_fn(const char *refname,
> +			const unsigned char *sha1, int flags, void *cb_data);
> +
> +/*
> + * The following functions invoke the specified callback function for
> + * each reference indicated.  If the function ever returns a nonzero
> + * value, stop the iteration and return that value.  Please note that
> + * it is not safe to modify references while an iteration is in
> + * progress, unless the same callback function invocation that
> + * modifies the reference also returns a nonzero value to immediately
> + * stop the iteration.
>   */
> -typedef int each_ref_fn(const char *refname, const unsigned char *sha1, int flags, void *cb_data);
>  extern int head_ref(each_ref_fn, void *);
>  extern int for_each_ref(each_ref_fn, void *);
>  extern int for_each_ref_in(const char *, each_ref_fn, void *);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 22/25] string_list_add_refs_by_glob(): add a comment about memory management
  2013-05-29  8:21   ` Thomas Rast
@ 2013-05-30 19:29     ` Michael Haggerty
  2013-05-30 22:05     ` [PATCH v2 FIXUP 22/25] fixup! " Michael Haggerty
  1 sibling, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-30 19:29 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, Jeff King, Johan Herland, git

On 05/29/2013 10:21 AM, Thomas Rast wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Since string_list_add_one_ref() adds refname to the string list, but
>> the lifetime of refname is limited, it is important that the
>> string_list passed to string_list_add_one_ref() has strdup_strings
>> set.  Document this fact.
>>
>> All current callers do the right thing.
> [...]
>> +/*
>> + * The list argument must have strdup_strings set on it.
>> + */
>>  void string_list_add_refs_by_glob(struct string_list *list, const char *glob)
>>  {
>>  	if (has_glob_specials(glob)) {
> 
> Maybe add an assert() so that this is bulletproof?

Good idea.  Will be added in v3.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes
  2013-05-29  8:25 ` [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Thomas Rast
@ 2013-05-30 19:55   ` Michael Haggerty
  0 siblings, 0 replies; 40+ messages in thread
From: Michael Haggerty @ 2013-05-30 19:55 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Junio C Hamano, Jeff King, Johan Herland, git

On 05/29/2013 10:25 AM, Thomas Rast wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> I read the entire series on Monday, and give it an Ack at maybe 90%
> confidence level -- sorry, I was short on caffeine and sleep ;-)

Thanks very much.  I'll buy you a coffee the next time I see you :-)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying
  2013-05-29 16:18   ` Junio C Hamano
@ 2013-05-30 21:14     ` Michael Haggerty
  2013-06-02 21:02       ` Junio C Hamano
  0 siblings, 1 reply; 40+ messages in thread
From: Michael Haggerty @ 2013-05-30 21:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Johan Herland, Thomas Rast, git

On 05/29/2013 06:18 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> The old version copied one entry to its destination position, then
>> deleted any matching entries from the tail of the array.  This
>> required the tail of the array to be copied multiple times.  It didn't
>> affect the complexity of the algorithm because the whole tail has to
>> be searched through anyway.  But all the copying was unnecessary.
>>
>> Instead, check for the existence of an entry with the same name in the
>> *head* of the list before copying an entry to its final position.
>> This way each entry has to be copied at most one time.
>>
>> Extract a helper function contains_name() to do a bit of the work.
>>
>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
>> ---
>>  object.c | 32 +++++++++++++++++++++-----------
>>  object.h |  6 +++++-
>>  2 files changed, 26 insertions(+), 12 deletions(-)
>>
>> diff --git a/object.c b/object.c
>> index fcd4a82..10b5349 100644
>> --- a/object.c
>> +++ b/object.c
>> @@ -294,22 +294,32 @@ void object_array_filter(struct object_array *array,
>>  	array->nr = dst;
>>  }
>>  
>> +/*
>> + * Return true iff array already contains an entry with name.
>> + */
>> +static int contains_name(struct object_array *array, const char *name)
>> +{
>> +	unsigned nr = array->nr, i;
>> +	struct object_array_entry *object = array->objects;
>> +
>> +	for (i = 0; i < nr; i++, object++)
>> +		if (!strcmp(object->name, name))
>> +			return 1;
>> +	return 0;
>> +}
> 
> Because some codepaths (e.g. patch 14/25) stuff NULL in the name
> field, we may want to be more careful with this.
> 
> This is not a new problem, and I think the longer term solution is
> to get rid of object_array_remove_duplicates(), so it is perfectly
> fine to leave this function broken with respect to NULL input as-is.

You make a good point about NULL names, but I agree to leave it for now
since it needs work anyway.

> The only caller of remove-duplicates is bundle.c, which gets many
> starting points and end points from the command line and tries to be
> nice by removing obvious duplicates, e.g.
> 
> 	git bundle create t.bundle master master
> 
> but I think its logic of deduping is wrong.  It runs dwim_ref() on
> the incoming refs after the remove-duplicates call, so
> 
> 	git bundle create t.bundle master heads/mater
> 
> will end up with two copies of refs/heads/master.  To fix it, the
> code must dedup the result of running dwim_ref(), and at that point,
> there is no reason to call object_array_remove_duplicates().

That sounds reasonable.

I poked around this code a bit to understand what is going on, and it
occurred to me that the object_array can include both positive and
negative references, right?  And yet object_array_remove_duplicates()
only considers names, not flags.  So it seems to me that if the deduping
code would see the same reference twice, once positive and once
negative, then it would throw an arbitrary one of them out, which would
be wrong.

But I couldn't provoke this situation, so perhaps setup_revisions()
already specially treats combinations like "master ^master"?  (If that's
true then why? and wouldn't it get confused by "master ^heads/master"?)

I suppose someday I will have to dig into these functions and maybe even
document them.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1
  2013-05-29 16:53   ` Junio C Hamano
@ 2013-05-30 21:51     ` Michael Haggerty
  2013-05-30 22:09       ` Philip Oakley
  0 siblings, 1 reply; 40+ messages in thread
From: Michael Haggerty @ 2013-05-30 21:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Johan Herland, Thomas Rast, git

On 05/29/2013 06:53 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>> [...]
>> +		current_bad_sha1 = xmalloc(20);
>> +		hashcpy(current_bad_sha1, sha1);
> 
> This, together with 18/25, may hint that we want hashdup()?

I thought about it, but was surprised that I could only find one or two
other places in the existing code that would use such a function.  But
sure, I would be happy to submit a patch.

I think hashdup() needn't be inline, so the definition can't go with its
cousins in cache.h.  There is no cache.c.  So where would be the best
place to define hashdup()?  object.c?  sha1_name.c?

While I'm at it, I think it would be nice to have constants like

#define SHA1_LEN 20
#define SHA1_HEX_LEN 40

and start using those instead of magic numbers.  Any objections (or
suggestions for better names)?

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v2 FIXUP 22/25] fixup! string_list_add_refs_by_glob(): add a comment about memory management
  2013-05-29  8:21   ` Thomas Rast
  2013-05-30 19:29     ` Michael Haggerty
@ 2013-05-30 22:05     ` Michael Haggerty
  2013-06-03 15:31       ` Junio C Hamano
  1 sibling, 1 reply; 40+ messages in thread
From: Michael Haggerty @ 2013-05-30 22:05 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Johan Herland, Thomas Rast, git, Michael Haggerty

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
Junio, would you mind squashing this patch onto mh/reflife 22/25?

 notes.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/notes.c b/notes.c
index 602d956..b69c0b8 100644
--- a/notes.c
+++ b/notes.c
@@ -932,6 +932,7 @@ static int string_list_add_one_ref(const char *refname, const unsigned char *sha
  */
 void string_list_add_refs_by_glob(struct string_list *list, const char *glob)
 {
+	assert(list->strdup_strings);
 	if (has_glob_specials(glob)) {
 		for_each_glob_ref(string_list_add_one_ref, glob, list);
 	} else {
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1
  2013-05-30 21:51     ` Michael Haggerty
@ 2013-05-30 22:09       ` Philip Oakley
  0 siblings, 0 replies; 40+ messages in thread
From: Philip Oakley @ 2013-05-30 22:09 UTC (permalink / raw)
  To: Michael Haggerty, Junio C Hamano
  Cc: Jeff King, Johan Herland, Thomas Rast, git

From: "Michael Haggerty" <mhagger@alum.mit.edu>
Sent: Thursday, May 30, 2013 10:51 PM
> On 05/29/2013 06:53 PM, Junio C Hamano wrote:
>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>> [...]
>>> + current_bad_sha1 = xmalloc(20);
>>> + hashcpy(current_bad_sha1, sha1);
>>
>> This, together with 18/25, may hint that we want hashdup()?
>
> I thought about it, but was surprised that I could only find one or 
> two
> other places in the existing code that would use such a function.  But
> sure, I would be happy to submit a patch.
>
> I think hashdup() needn't be inline, so the definition can't go with 
> its
> cousins in cache.h.  There is no cache.c.  So where would be the best
> place to define hashdup()?  object.c?  sha1_name.c?
>
> While I'm at it, I think it would be nice to have constants like
>
> #define SHA1_LEN 20
> #define SHA1_HEX_LEN 40
>
> and start using those instead of magic numbers.  Any objections (or
> suggestions for better names)?
>

The first named constant should be fully qualified to the same extent as 
the second, perhaps:
    #define SHA1_BYTE_LEN 20

and perhaps with an alternate of (though HEX is just as good):
    #define SHA1_CHAR_LEN 40


> Michael
>
> -- 
> Michael Haggerty
> mhagger@alum.mit.edu
> http://softwareswirl.blogspot.com/
> --
Philip 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying
  2013-05-30 21:14     ` Michael Haggerty
@ 2013-06-02 21:02       ` Junio C Hamano
  0 siblings, 0 replies; 40+ messages in thread
From: Junio C Hamano @ 2013-06-02 21:02 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Thomas Rast, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

>> The only caller of remove-duplicates is bundle.c, which gets many
>> starting points and end points from the command line and tries to be
>> nice by removing obvious duplicates, e.g.
>> 
>> 	git bundle create t.bundle master master
>> 
>> but I think its logic of deduping is wrong.  It runs dwim_ref() on
>> the incoming refs after the remove-duplicates call, so
>> 
>> 	git bundle create t.bundle master heads/mater
>> 
>> will end up with two copies of refs/heads/master.  To fix it, the
>> code must dedup the result of running dwim_ref(), and at that point,
>> there is no reason to call object_array_remove_duplicates().
>
> That sounds reasonable.
>
> I poked around this code a bit to understand what is going on, and it
> occurred to me that the object_array can include both positive and
> negative references, right?  And yet object_array_remove_duplicates()
> only considers names, not flags.  So it seems to me that if the deduping
> code would see the same reference twice, once positive and once
> negative, then it would throw an arbitrary one of them out, which would
> be wrong.
>
> But I couldn't provoke this situation, so perhaps setup_revisions()
> already specially treats combinations like "master ^master"?  (If that's
> true then why? and wouldn't it get confused by "master ^heads/master"?)

With "git bundle create t.bundle ^master master", you see two
entries in revs.pending.objects[] but they are the same object and
is already marked as uninteresting, so you will not see 'master' in
the result.

This parsing loop predates the more recent revs->cmdline mechanism,
that treats these two command line arguments as separate entities,
so that we can more reliably tell what the real end-user input is.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 FIXUP 22/25] fixup! string_list_add_refs_by_glob(): add a comment about memory management
  2013-05-30 22:05     ` [PATCH v2 FIXUP 22/25] fixup! " Michael Haggerty
@ 2013-06-03 15:31       ` Junio C Hamano
  0 siblings, 0 replies; 40+ messages in thread
From: Junio C Hamano @ 2013-06-03 15:31 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, Johan Herland, Thomas Rast, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
> Junio, would you mind squashing this patch onto mh/reflife 22/25?

Done.  Thanks.

>
>  notes.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/notes.c b/notes.c
> index 602d956..b69c0b8 100644
> --- a/notes.c
> +++ b/notes.c
> @@ -932,6 +932,7 @@ static int string_list_add_one_ref(const char *refname, const unsigned char *sha
>   */
>  void string_list_add_refs_by_glob(struct string_list *list, const char *glob)
>  {
> +	assert(list->strdup_strings);
>  	if (has_glob_specials(glob)) {
>  		for_each_glob_ref(string_list_add_one_ref, glob, list);
>  	} else {

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2013-06-03 15:31 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-25  9:07 [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 01/25] describe: make own copy of refname Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 02/25] fetch: make own copies of refnames Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 03/25] add_rev_cmdline(): make a copy of the name argument Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 04/25] builtin_diff_tree(): make it obvious that function wants two entries Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 05/25] cmd_diff(): use an object_array for holding trees Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 06/25] cmd_diff(): rename local variable "list" -> "entry" Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 07/25] cmd_diff(): make it obvious which cases are exclusive of each other Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 08/25] revision: split some overly-long lines Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 09/25] object_array: add function object_array_filter() Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 10/25] revision: use object_array_filter() in implementation of gc_boundary() Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 11/25] object_array_remove_duplicates(): rewrite to reduce copying Michael Haggerty
2013-05-29 16:18   ` Junio C Hamano
2013-05-30 21:14     ` Michael Haggerty
2013-06-02 21:02       ` Junio C Hamano
2013-05-25  9:08 ` [PATCH v2 12/25] fsck: don't put a void*-shaped peg in a char*-shaped hole Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 13/25] find_first_merges(): initialize merges variable using initializer Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 14/25] find_first_merges(): remove unnecessary code Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 15/25] object_array_entry: fix memory handling of the name field Michael Haggerty
2013-05-29 16:24   ` Junio C Hamano
2013-05-25  9:08 ` [PATCH v2 16/25] do_fetch(): reduce scope of peer_item Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 17/25] do_fetch(): clean up existing_refs before exiting Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 18/25] add_existing(): do not retain a reference to sha1 Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 19/25] show_head_ref(): do not shadow name of argument Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 20/25] show_head_ref(): rename first parameter to "refname" Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 21/25] string_list_add_one_ref(): " Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 22/25] string_list_add_refs_by_glob(): add a comment about memory management Michael Haggerty
2013-05-29  8:21   ` Thomas Rast
2013-05-30 19:29     ` Michael Haggerty
2013-05-30 22:05     ` [PATCH v2 FIXUP 22/25] fixup! " Michael Haggerty
2013-06-03 15:31       ` Junio C Hamano
2013-05-25  9:08 ` [PATCH v2 23/25] exclude_existing(): set existing_refs.strdup_strings Michael Haggerty
2013-05-25  9:08 ` [PATCH v2 24/25] register_ref(): make a copy of the bad reference SHA-1 Michael Haggerty
2013-05-29 16:53   ` Junio C Hamano
2013-05-30 21:51     ` Michael Haggerty
2013-05-30 22:09       ` Philip Oakley
2013-05-25  9:08 ` [PATCH v2 25/25] refs: document the lifetime of the args passed to each_ref_fn Michael Haggerty
2013-05-29 16:54   ` Junio C Hamano
2013-05-29  8:25 ` [PATCH v2 00/25] Remove assumptions about each_ref_fn arg lifetimes Thomas Rast
2013-05-30 19:55   ` Michael Haggerty

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).