git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH/RFC] add --authorship-order flag to git log / rev-list
@ 2013-06-04 18:08 elliottcable
  2013-06-04 18:08 ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering elliottcable
  2013-06-04 18:53 ` [PATCH/RFC] add --authorship-order flag to git log / rev-list Junio C Hamano
  0 siblings, 2 replies; 51+ messages in thread
From: elliottcable @ 2013-06-04 18:08 UTC (permalink / raw)
  To: git; +Cc: elliottcable

This is my first time submitting a patch to this list, so please, let me know if
I'm doing any of this the wrong way! I've striven to follow
`Documentation/SubmittingPatches`. I hope I've succeeded. For that matter, it's
my first time diving into git's sources, so I obviously would love some
commentary on the patch itself, as well. ;)

I've tried herein to add an `--authorship-order` flag to complement git-log's
`--topo-order` and `--date-order` flags; it should operate the same as
`--date-order`, but using the `AUTHOR_DATE` instead of the `COMMITTER_DATE`.

I've sent an e-mail to this list, previously, on this subject; I'd make this
patchset a reply to that, except I have no idea what the in-reply-to should be:
http://www.spinics.net/lists/git/msg208542.html

The original work is all on GitHub:
https://github.com/git/git/pull/40
https://github.com/ELLIOTTCABLE/git/compare/master...author-order+

elliottcable (1):
  rev-list: add --authorship-order alternative ordering

 builtin/log.c                          |  2 +-
 builtin/rev-list.c                     |  1 +
 builtin/rev-parse.c                    |  1 +
 builtin/show-branch.c                  | 12 ++++-
 commit.c                               | 83 ++++++++++++++++++++++++++++++----
 commit.h                               |  3 +-
 contrib/completion/git-completion.bash |  4 +-
 po/de.po                               |  4 +-
 po/git.pot                             |  2 +-
 po/sv.po                               |  4 +-
 po/vi.po                               |  4 +-
 po/zh_CN.po                            |  4 +-
 revision.c                             | 11 ++++-
 revision.h                             |  1 +
 14 files changed, 110 insertions(+), 26 deletions(-)

-- 
1.8.1.3

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-04 18:08 [PATCH/RFC] add --authorship-order flag to git log / rev-list elliottcable
@ 2013-06-04 18:08 ` elliottcable
  2013-06-04 19:14   ` Junio C Hamano
  2013-06-04 18:53 ` [PATCH/RFC] add --authorship-order flag to git log / rev-list Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: elliottcable @ 2013-06-04 18:08 UTC (permalink / raw)
  To: git; +Cc: elliottcable

--date-order is an excellent alternative to --topo-order if you want a feel for
the *actual history*, chronologically, of your project. I use it often, with
--graph as well; it's a great way to get an overview of a project's recent
development history.

However, in a project that rebases various in-development topic-branches often,
it gets hard to demonstrate a *chronological history* of changes to the
codebase, as this always “resets” the COMMITTER_DATE (which --date-order uses)
to the time the rebase happened; which often means ‘last time all of the
topic-branches were rebased on the latest fixes in master.’

Thus, I've added an --authorship-order version of --date-order, which relies
upon the AUTHOR_DATE instead of the COMMITTER_DATE; this means that old commits
will continue to show up chronologically in-order despite rebasing.
---
 builtin/log.c                          |  2 +-
 builtin/rev-list.c                     |  1 +
 builtin/rev-parse.c                    |  1 +
 builtin/show-branch.c                  | 12 ++++-
 commit.c                               | 83 ++++++++++++++++++++++++++++++----
 commit.h                               |  3 +-
 contrib/completion/git-completion.bash |  4 +-
 po/de.po                               |  4 +-
 po/git.pot                             |  2 +-
 po/sv.po                               |  4 +-
 po/vi.po                               |  4 +-
 po/zh_CN.po                            |  4 +-
 revision.c                             | 11 ++++-
 revision.h                             |  1 +
 14 files changed, 110 insertions(+), 26 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 9e21232..54d4d7f 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -237,7 +237,7 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
 	int i = revs->early_output;
 	int show_header = 1;
 
-	sort_in_topological_order(&list, revs->lifo);
+	sort_in_topological_order(&list, revs->lifo, revs->use_author);
 	while (list && i) {
 		struct commit *commit = list->item;
 		switch (simplify_commit(revs, commit)) {
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 67701be..cfa5d1f 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -30,6 +30,7 @@ static const char rev_list_usage[] =
 "  ordering output:\n"
 "    --topo-order\n"
 "    --date-order\n"
+"    --authorship-order\n"
 "    --reverse\n"
 "  formatting output:\n"
 "    --parents\n"
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index f267a1d..d08aebd 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -65,6 +65,7 @@ static int is_rev_argument(const char *arg)
 		"--tags",
 		"--topo-order",
 		"--date-order",
+		"--authorship-order",
 		"--unpacked",
 		NULL
 	};
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 90fc6b1..ac06ac3 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -6,7 +6,7 @@
 #include "parse-options.h"
 
 static const char* show_branch_usage[] = {
-    N_("git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--current] [--color[=<when>] | --no-color] [--sparse] [--more=<n> | --list | --independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | <glob>)...]"),
+    N_("git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--current] [--color[=<when>] | --no-color] [--sparse] [--more=<n> | --list | --independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | <glob>)...]"),
     N_("git show-branch (-g|--reflog)[=<n>[,<base>]] [--list] [<ref>]"),
     NULL
 };
@@ -631,6 +631,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	int all_heads = 0, all_remotes = 0;
 	int all_mask, all_revs;
 	int lifo = 1;
+	int use_author = 0;
 	char head[128];
 	const char *head_p;
 	int head_len;
@@ -667,6 +668,8 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			    N_("show refs unreachable from any other ref")),
 		OPT_BOOLEAN(0, "topo-order", &lifo,
 			    N_("show commits in topological order")),
+		OPT_BOOLEAN(0, "authorship-order", &use_author,
+			    N_("like --date-order, but with the *author* date")),
 		OPT_BOOLEAN(0, "topics", &topics,
 			    N_("show only commits not on the first branch")),
 		OPT_SET_INT(0, "sparse", &dense,
@@ -694,6 +697,11 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			   show_branch_usage, PARSE_OPT_STOP_AT_NON_OPTION);
 	if (all_heads)
 		all_remotes = 1;
+	/* I'm having trouble figuring out exactly what `lifo` stores. Why do both 'date-order' and
+	 * 'topo-order' set the same variable!? Aren't they mutually exclusive? Since *both* set it, for
+	 * the moment, I'm going to set it for '--authorship-order'; but that seems counterintuitive. */
+	if (use_author)
+		lifo = 1;
 
 	if (extra || reflog) {
 		/* "listing" mode is incompatible with
@@ -900,7 +908,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 		exit(0);
 
 	/* Sort topologically */
-	sort_in_topological_order(&seen, lifo);
+	sort_in_topological_order(&seen, lifo, use_author);
 
 	/* Give names to commits */
 	if (!sha1_name && !no_name)
diff --git a/commit.c b/commit.c
index 888e02a..b8a0f60 100644
--- a/commit.c
+++ b/commit.c
@@ -78,7 +78,34 @@ struct commit *lookup_commit_reference_by_name(const char *name)
 	return commit;
 }
 
-static unsigned long parse_commit_date(const char *buf, const char *tail)
+static unsigned long parse_commit_author_date(const char *buf, const char *tail)
+{
+	const char *dateptr;
+
+	if (buf + 6 >= tail)
+		return 0;
+	if (memcmp(buf, "author", 6))
+		return 0;
+	while (buf < tail && *buf++ != '>')
+		/* nada */;
+	if (buf >= tail)
+		return 0;
+	dateptr = buf;
+	while (buf < tail && *buf++ != '\n')
+		/* nada */;
+	if (buf + 9 >= tail)
+		return 0;
+	if (memcmp(buf, "committer", 9))
+		return 0;
+	while (buf < tail && *buf++ != '\n')
+		/* nada */;
+	if (buf >= tail)
+		return 0;
+	/* dateptr < buf && buf[-1] == '\n', so strtoul will stop at buf-1 */
+	return strtoul(dateptr, NULL, 10);
+}
+
+static unsigned long parse_commit_committer_date(const char *buf, const char *tail)
 {
 	const char *dateptr;
 
@@ -301,7 +328,8 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s
 			pptr = &commit_list_insert(new_parent, pptr)->next;
 		}
 	}
-	item->date = parse_commit_date(bufptr, tail);
+	item->date = parse_commit_committer_date(bufptr, tail);
+	item->author_date = parse_commit_author_date(bufptr, tail);
 
 	return 0;
 }
@@ -380,6 +408,19 @@ void free_commit_list(struct commit_list *list)
 	}
 }
 
+struct commit_list * commit_list_insert_by_author_date(struct commit *item, struct commit_list **list)
+{
+	struct commit_list **pp = list;
+	struct commit_list *p;
+	while ((p = *pp) != NULL) {
+		if (p->item->author_date < item->author_date) {
+			break;
+		}
+		pp = &p->next;
+	}
+	return commit_list_insert(item, pp);
+}
+
 struct commit_list * commit_list_insert_by_date(struct commit *item, struct commit_list **list)
 {
 	struct commit_list **pp = list;
@@ -393,6 +434,17 @@ struct commit_list * commit_list_insert_by_date(struct commit *item, struct comm
 	return commit_list_insert(item, pp);
 }
 
+static int commit_list_compare_by_author_date(const void *a, const void *b)
+{
+	unsigned long a_date = ((const struct commit_list *)a)->item->author_date;
+	unsigned long b_date = ((const struct commit_list *)b)->item->author_date;
+	if (a_date < b_date)
+		return 1;
+	if (a_date > b_date)
+		return -1;
+	return 0;
+}
+
 static int commit_list_compare_by_date(const void *a, const void *b)
 {
 	unsigned long a_date = ((const struct commit_list *)a)->item->date;
@@ -414,6 +466,12 @@ static void commit_list_set_next(void *a, void *next)
 	((struct commit_list *)a)->next = next;
 }
 
+void commit_list_sort_by_author_date(struct commit_list **list)
+{
+	*list = llist_mergesort(*list, commit_list_get_next, commit_list_set_next,
+				commit_list_compare_by_author_date);
+}
+
 void commit_list_sort_by_date(struct commit_list **list)
 {
 	*list = llist_mergesort(*list, commit_list_get_next, commit_list_set_next,
@@ -509,7 +567,7 @@ struct commit *pop_commit(struct commit_list **stack)
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo)
+void sort_in_topological_order(struct commit_list ** list, int lifo, int use_author)
 {
 	struct commit_list *next, *orig = *list;
 	struct commit_list *work, **insert;
@@ -554,8 +612,12 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 	}
 
 	/* process the list in topological order */
-	if (!lifo)
-		commit_list_sort_by_date(&work);
+	if (!lifo) {
+		if (use_author)
+			commit_list_sort_by_author_date(&work);
+		else
+			commit_list_sort_by_date(&work);
+	}
 
 	pptr = list;
 	*list = NULL;
@@ -580,10 +642,13 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 			 * guaranteeing topological order.
 			 */
 			if (--parent->indegree == 1) {
-				if (!lifo)
-					commit_list_insert_by_date(parent, &work);
-				else
-					commit_list_insert(parent, &work);
+				if (!lifo) {
+					if (use_author)
+						commit_list_insert_by_author_date(parent, &work);
+					else
+						commit_list_insert_by_date(parent, &work);
+				} else {
+					commit_list_insert(parent, &work); }
 			}
 		}
 		/*
diff --git a/commit.h b/commit.h
index 67bd509..de07525 100644
--- a/commit.h
+++ b/commit.h
@@ -17,6 +17,7 @@ struct commit {
 	void *util;
 	unsigned int indegree;
 	unsigned long date;
+	unsigned long author_date;
 	struct commit_list *parents;
 	struct tree *tree;
 	char *buffer;
@@ -150,7 +151,7 @@ void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark);
  *   in addition, when lifo == 0, commits on parallel tracks are
  *   sorted in the dates order.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo);
+void sort_in_topological_order(struct commit_list ** list, int lifo, int use_author);
 
 struct commit_graft {
 	unsigned char sha1[20];
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 91234d4..f051e53 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1445,7 +1445,7 @@ _git_log ()
 			$__git_log_common_options
 			$__git_log_shortlog_options
 			$__git_log_gitk_options
-			--root --topo-order --date-order --reverse
+			--root --topo-order --date-order --authorship-order --reverse
 			--follow --full-diff
 			--abbrev-commit --abbrev=
 			--relative-date --date=
@@ -2291,7 +2291,7 @@ _git_show_branch ()
 	case "$cur" in
 	--*)
 		__gitcomp "
-			--all --remotes --topo-order --current --more=
+			--all --remotes --topo-order --authorship-order --current --more=
 			--list --independent --merge-base --no-name
 			--color --no-color
 			--sha1-name --sparse --topics --reflog
diff --git a/po/de.po b/po/de.po
index 4901488..0dc184f 100644
--- a/po/de.po
+++ b/po/de.po
@@ -8716,12 +8716,12 @@ msgstr "Ausgabe mit Zeilenumbrüchen"
 
 #: builtin/show-branch.c:9
 msgid ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<when>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | "
 "<glob>)...]"
 msgstr ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<Wann>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] "
 "[(<Revision> | <glob>)...]"
diff --git a/po/git.pot b/po/git.pot
index 4a9d4ef..325348d 100644
--- a/po/git.pot
+++ b/po/git.pot
@@ -8123,7 +8123,7 @@ msgstr ""
 
 #: builtin/show-branch.c:9
 msgid ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<when>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | "
 "<glob>)...]"
diff --git a/po/sv.po b/po/sv.po
index a5c88c9..5091224 100644
--- a/po/sv.po
+++ b/po/sv.po
@@ -8478,12 +8478,12 @@ msgstr "Radbryt utdata"
 
 #: builtin/show-branch.c:9
 msgid ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<when>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | "
 "<glob>)...]"
 msgstr ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<när>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | "
 "<mönster>)...]"
diff --git a/po/vi.po b/po/vi.po
index c6af8d5..ec41ff8 100644
--- a/po/vi.po
+++ b/po/vi.po
@@ -8622,12 +8622,12 @@ msgstr "Ngắt dòng khi quá dài"
 
 #: builtin/show-branch.c:9
 msgid ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<when>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | "
 "<glob>)...]"
 msgstr ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<khi>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | "
 "<glob>)...]"
diff --git a/po/zh_CN.po b/po/zh_CN.po
index ba757d9..a666aed 100644
--- a/po/zh_CN.po
+++ b/po/zh_CN.po
@@ -8446,12 +8446,12 @@ msgstr "折行输出"
 
 #: builtin/show-branch.c:9
 msgid ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<when>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | "
 "<glob>)...]"
 msgstr ""
-"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order] [--"
+"git show-branch [-a|--all] [-r|--remotes] [--topo-order | --date-order | --authorship-order] [--"
 "current] [--color[=<when>] | --no-color] [--sparse] [--more=<n> | --list | --"
 "independent | --merge-base] [--no-name | --sha1-name] [--topics] [(<rev> | "
 "<glob>)...]"
diff --git a/revision.c b/revision.c
index 518cd08..2d077ce 100644
--- a/revision.c
+++ b/revision.c
@@ -1053,6 +1053,7 @@ void init_revisions(struct rev_info *revs, const char *prefix)
 	revs->pruning.add_remove = file_add_remove;
 	revs->pruning.change = file_change;
 	revs->lifo = 1;
+	revs->use_author = 0;
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
@@ -1394,6 +1395,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--topo-order")) {
 		revs->lifo = 1;
 		revs->topo_order = 1;
+		revs->use_author = 0;
 	} else if (!strcmp(arg, "--simplify-merges")) {
 		revs->simplify_merges = 1;
 		revs->topo_order = 1;
@@ -1412,6 +1414,11 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--date-order")) {
 		revs->lifo = 0;
 		revs->topo_order = 1;
+		revs->use_author = 0;
+	} else if (!strcmp(arg, "--authorship-order")) {
+		revs->lifo = 0;
+		revs->topo_order = 1;
+		revs->use_author = 1;
 	} else if (!prefixcmp(arg, "--early-output")) {
 		int count = 100;
 		switch (arg[14]) {
@@ -2191,7 +2198,7 @@ int prepare_revision_walk(struct rev_info *revs)
 		if (limit_list(revs) < 0)
 			return -1;
 	if (revs->topo_order)
-		sort_in_topological_order(&revs->commits, revs->lifo);
+		sort_in_topological_order(&revs->commits, revs->lifo, revs->use_author);
 	if (revs->line_level_traverse)
 		line_log_filter(revs);
 	if (revs->simplify_merges)
@@ -2503,7 +2510,7 @@ static void create_boundary_commit_list(struct rev_info *revs)
 	 * If revs->topo_order is set, sort the boundary commits
 	 * in topological order
 	 */
-	sort_in_topological_order(&revs->commits, revs->lifo);
+	sort_in_topological_order(&revs->commits, revs->lifo, revs->use_author);
 }
 
 static struct commit *get_revision_internal(struct rev_info *revs)
diff --git a/revision.h b/revision.h
index a313a13..09effab 100644
--- a/revision.h
+++ b/revision.h
@@ -73,6 +73,7 @@ struct rev_info {
 			simplify_history:1,
 			lifo:1,
 			topo_order:1,
+			use_author:1,
 			simplify_merges:1,
 			simplify_by_decoration:1,
 			tag_objects:1,
-- 
1.8.1.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] add --authorship-order flag to git log / rev-list
  2013-06-04 18:08 [PATCH/RFC] add --authorship-order flag to git log / rev-list elliottcable
  2013-06-04 18:08 ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering elliottcable
@ 2013-06-04 18:53 ` Junio C Hamano
  2013-06-06 18:06   ` Elliott Cable
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-04 18:53 UTC (permalink / raw)
  To: elliottcable; +Cc: git

elliottcable <me@ell.io> writes:

> This is my first time submitting a patch to this list, so please, let me know if
> I'm doing any of this the wrong way! I've striven to follow
> `Documentation/SubmittingPatches`. I hope I've succeeded. For that matter, it's
> my first time diving into git's sources, so I obviously would love some
> commentary on the patch itself, as well. ;)
>
> I've tried herein to add an `--authorship-order` flag to complement git-log's
> `--topo-order` and `--date-order` flags; it should operate the same as
> `--date-order`, but using the `AUTHOR_DATE` instead of the `COMMITTER_DATE`.

After reading the subject alone, my reaction was "is this sorting
commits by the name of the author"?

That is one of the expected natural reactions when people hear about
this option, which is not what you want.

Perhaps naming it --authordate-order (or enhance the command line
parsing to allow --date-order=author|committer) would give us a
better UI.

(the above comment is before reading any of the code in the patch).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-04 18:08 ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering elliottcable
@ 2013-06-04 19:14   ` Junio C Hamano
  2013-06-04 21:20     ` Junio C Hamano
  2013-06-04 21:22     ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering Jeff King
  0 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-04 19:14 UTC (permalink / raw)
  To: elliottcable; +Cc: git

elliottcable <me@ell.io> writes:

> --date-order is an excellent alternative to --topo-order if you want a feel for
> the *actual history*, chronologically, of your project. I use it often, with
> --graph as well; it's a great way to get an overview of a project's recent
> development history.
>
> However, in a project that rebases various in-development topic-branches often,
> it gets hard to demonstrate a *chronological history* of changes to the
> codebase, as this always “resets” the COMMITTER_DATE (which --date-order uses)
> to the time the rebase happened; which often means ‘last time all of the
> topic-branches were rebased on the latest fixes in master.’
>
> Thus, I've added an --authorship-order version of --date-order, which relies
> upon the AUTHOR_DATE instead of the COMMITTER_DATE; this means that old commits
> will continue to show up chronologically in-order despite rebasing.
> ---

Missing sign-off.  Please see Documentation/SubmittingPatches.

>  builtin/log.c                          |  2 +-
>  builtin/rev-list.c                     |  1 +
>  builtin/rev-parse.c                    |  1 +
>  builtin/show-branch.c                  | 12 ++++-
>  commit.c                               | 83 ++++++++++++++++++++++++++++++----
>  commit.h                               |  3 +-
>  contrib/completion/git-completion.bash |  4 +-
>  po/de.po                               |  4 +-
>  po/git.pot                             |  2 +-
>  po/sv.po                               |  4 +-
>  po/vi.po                               |  4 +-
>  po/zh_CN.po                            |  4 +-

Please drop all the changes to po/ area; it is managed by the i18n
coordinator and generated by an automated tool that extracts these
strings from the code.

People who code should not (and do not have to) touch these files.

>  revision.c                             | 11 ++++-
>  revision.h                             |  1 +
>  14 files changed, 110 insertions(+), 26 deletions(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index 9e21232..54d4d7f 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -237,7 +237,7 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
>  	int i = revs->early_output;
>  	int show_header = 1;
>  
> -	sort_in_topological_order(&list, revs->lifo);
> +	sort_in_topological_order(&list, revs->lifo, revs->use_author);

The name "use-author" is a clear sign that the person who added this
code were too narrowly focused to think "author" automatically would
mean "author date" ;-).

It probably makes sense to revamp sort_in_topological_order(), so
that its second parameter is not a boolean 'lifo' that tells too
much about its implementation without telling what it actually
means.  Instead, we can make it an enum sort_order, that tells it to
emit the commits in committer-date order, author-date order, or
graph-traversal order.

And update revs->lifo to use that same enum, without adding
use_author_date bit to rev_info.

> @@ -694,6 +697,11 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
>  			   show_branch_usage, PARSE_OPT_STOP_AT_NON_OPTION);
>  	if (all_heads)
>  		all_remotes = 1;
> +	/* I'm having trouble figuring out exactly what `lifo` stores. Why do both 'date-order' and
> +	 * 'topo-order' set the same variable!? Aren't they mutually exclusive? Since *both* set it, for
> +	 * the moment, I'm going to set it for '--authorship-order'; but that seems counterintuitive. */

Lines that are too wide.

	/*
         * Also please format multi-line comments
         * like this, nothing other than slash-asterisk
         * on the first and the last lines.
         */

> @@ -301,7 +328,8 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s
>  			pptr = &commit_list_insert(new_parent, pptr)->next;
>  		}
>  	}
> -	item->date = parse_commit_date(bufptr, tail);
> +	item->date = parse_commit_committer_date(bufptr, tail);
> +	item->author_date = parse_commit_author_date(bufptr, tail);
> ...
> diff --git a/commit.h b/commit.h
> index 67bd509..de07525 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -17,6 +17,7 @@ struct commit {
>  	void *util;
>  	unsigned int indegree;
>  	unsigned long date;
> +	unsigned long author_date;

While walking we keep many of them in-core, and 8-byte each for each
commit objects add up.  We do not want to make "struct commit" any
larger than it already is.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-04 19:14   ` Junio C Hamano
@ 2013-06-04 21:20     ` Junio C Hamano
  2013-06-06 19:03       ` Elliott Cable
  2013-06-04 21:22     ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering Jeff King
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-04 21:20 UTC (permalink / raw)
  To: elliottcable; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

>> @@ -301,7 +328,8 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s
>>  			pptr = &commit_list_insert(new_parent, pptr)->next;
>>  		}
>>  	}
>> -	item->date = parse_commit_date(bufptr, tail);
>> +	item->date = parse_commit_committer_date(bufptr, tail);
>> +	item->author_date = parse_commit_author_date(bufptr, tail);
>> ...
>> diff --git a/commit.h b/commit.h
>> index 67bd509..de07525 100644
>> --- a/commit.h
>> +++ b/commit.h
>> @@ -17,6 +17,7 @@ struct commit {
>>  	void *util;
>>  	unsigned int indegree;
>>  	unsigned long date;
>> +	unsigned long author_date;
>
> While walking we keep many of them in-core, and 8-byte each for each
> commit objects add up.  We do not want to make "struct commit" any
> larger than it already is.

Having said that, I do not see a reasonable alternative
implementation than adding an author-date field to struct commit
without major restructuring if we were to add this feature.

So please do not take this part of the response as a "patch rejected
because we do not want to add anything to this structure".  We'll
think of something down the road, but as an independent topic after
this gets in (or doesn't).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-04 19:14   ` Junio C Hamano
  2013-06-04 21:20     ` Junio C Hamano
@ 2013-06-04 21:22     ` Jeff King
  1 sibling, 0 replies; 51+ messages in thread
From: Jeff King @ 2013-06-04 21:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: elliottcable, git

On Tue, Jun 04, 2013 at 12:14:21PM -0700, Junio C Hamano wrote:

> > diff --git a/commit.h b/commit.h
> > index 67bd509..de07525 100644
> > --- a/commit.h
> > +++ b/commit.h
> > @@ -17,6 +17,7 @@ struct commit {
> >  	void *util;
> >  	unsigned int indegree;
> >  	unsigned long date;
> > +	unsigned long author_date;
> 
> While walking we keep many of them in-core, and 8-byte each for each
> commit objects add up.  We do not want to make "struct commit" any
> larger than it already is.

Yeah, I had the same thought. Maybe this is a good candidate to build on
top of the jk/commit-info slab experiment. The topo-sort could allocate
an extra slab for author-date (or even expand the indegree slab to hold
both indegree and author date), use it during the sort, and then free it
afterwards.

Elliott: you can see the relevant changes to the topo-sort in commit
96c4f4a (commit: allow associating auxiliary info on-demand,
2013-04-09).

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] add --authorship-order flag to git log / rev-list
  2013-06-04 18:53 ` [PATCH/RFC] add --authorship-order flag to git log / rev-list Junio C Hamano
@ 2013-06-06 18:06   ` Elliott Cable
  0 siblings, 0 replies; 51+ messages in thread
From: Elliott Cable @ 2013-06-06 18:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jun 4, 2013 at 2:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
> After reading the subject alone, my reaction was "is this sorting
> commits by the name of the author"?
>
> That is one of the expected natural reactions when people hear about
> this option, which is not what you want.
>
> Perhaps naming it --authordate-order (or enhance the command line
> parsing to allow --date-order=author|committer) would give us a
> better UI.

The same comment was raised by someone in IRC when I submitted an RFC
on this. The conclusion we'd arrived at, IIRC, was that the only
remotely-not-ugly solutions were either --authorship-order or
--author-date-order.

I really like the idea of [--date-order[=author|committer]], but
that's getting beyond my knowledge of the code-base. Perhaps I should
just implement the changes to the implementation in *my* revision of
the patch, and leave it up to a future patcher with the requisite
knowledge of the argumentation features to throw in the changes to
that flag quickly? Either that, or implement it as --author-date-order
right *now*, and change it later before it hits Master (so we don't
end up with a no-longer-supported feature?)

(It'd take me many hours to track down the details of how git's
codebase goes around doing that, and then attempting to replicate it,
whereas someone familiar could probably do it in fifteen minutes,
hence the thought-process. Commentary welcome.)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-04 21:20     ` Junio C Hamano
@ 2013-06-06 19:03       ` Elliott Cable
  2013-06-06 19:29         ` Junio C Hamano
  2013-06-06 19:40         ` Junio C Hamano
  0 siblings, 2 replies; 51+ messages in thread
From: Elliott Cable @ 2013-06-06 19:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

On Tue, Jun 4, 2013 at 3:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
> elliottcable <me@ell.io> writes:
>> Thus, I've added an --authorship-order version of --date-order, which relies
>> upon the AUTHOR_DATE instead of the COMMITTER_DATE; this means that old commits
>> will continue to show up chronologically in-order despite rebasing.
>> ---
>
> Missing sign-off.  Please see Documentation/SubmittingPatches.

Will-do.

I read that part, and was rather confused. At no point did I get the
idea that I should sign-off *my own initial commit*. Perhaps that part
of the documentation needs to be slightly re-written? Would that be a
welcome change?

On Tue, Jun 4, 2013 at 3:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
> elliottcable <me@ell.io> writes:
>> diff --git a/builtin/log.c b/builtin/log.c
>> index 9e21232..54d4d7f 100644
>> --- a/builtin/log.c
>> +++ b/builtin/log.c
>> @@ -237,7 +237,7 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
>>       int i = revs->early_output;
>>       int show_header = 1;
>>
>> -     sort_in_topological_order(&list, revs->lifo);
>> +     sort_in_topological_order(&list, revs->lifo, revs->use_author);
>
> The name "use-author" is a clear sign that the person who added this
> code were too narrowly focused to think "author" automatically would
> mean "author date" ;-).
>
> It probably makes sense to revamp sort_in_topological_order(), so
> that its second parameter is not a boolean 'lifo' that tells too
> much about its implementation without telling what it actually
> means.  Instead, we can make it an enum sort_order, that tells it to
> emit the commits in committer-date order, author-date order, or
> graph-traversal order.
>
> And update revs->lifo to use that same enum, without adding
> use_author_date bit to rev_info.

I'll look into replacing lifo with an enum as soon as I can sit back
down to update this patch. For the moment, nothing more than
committer_date_sort and author_date_sort, I suppose?

Overview being, I suppose, that `lifo` will no longer exist (since it
effectively determines, when truthy, that we operate in a
*non*-date-ordered topological method); then have commiter_date_order
and author_date_order bits in an enum, with zero being
lifo/straightforward-topological-order. Sound about right?

I'll try and make this a separate patch. First commit, to replace lifo
with an enum; second commit, to *actually implement* the code obeying
that enum when it is set to author_date_order.

On Tue, Jun 4, 2013 at 5:22 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Jun 04, 2013 at 12:14:21PM -0700, Junio C Hamano wrote:
>
>> > diff --git a/commit.h b/commit.h
>> > index 67bd509..de07525 100644
>> > --- a/commit.h
>> > +++ b/commit.h
>> > @@ -17,6 +17,7 @@ struct commit {
>> >     void *util;
>> >     unsigned int indegree;
>> >     unsigned long date;
>> > +   unsigned long author_date;
>>
>> While walking we keep many of them in-core, and 8-byte each for each
>> commit objects add up.  We do not want to make "struct commit" any
>> larger than it already is.
>>
>> Having said that, I do not see a reasonable alternative
>> implementation than adding an author-date field to struct commit
>> without major restructuring if we were to add this feature.
>>
>> So please do not take this part of the response as a "patch rejected
>> because we do not want to add anything to this structure".  We'll
>> think of something down the road, but as an independent topic after
>> this gets in (or doesn't).
>
> Yeah, I had the same thought. Maybe this is a good candidate to build on
> top of the jk/commit-info slab experiment. The topo-sort could allocate
> an extra slab for author-date (or even expand the indegree slab to hold
> both indegree and author date), use it during the sort, and then free it
> afterwards.
>
> Elliott: you can see the relevant changes to the topo-sort in commit
> 96c4f4a (commit: allow associating auxiliary info on-demand,
> 2013-04-09).
>
> -Peff

Again, might be a little over my head. If you really think it's best
that I look into that branch, I will try. :)

Meantime, is there any other, more-immediate approach you can think
of? I thought, for a moment, of only storing *either* the committer
*or* the author date in the commit-struct at a given time, and
flagging with a single bit ... but I'm not sure how widely-spread the
nead for committer-date currently is. Maybe I can go back and
parse-out the author date *when I need it*, instead, though that
sounds slow …

Epilogue: I'll make the *obvious* changes mentioned above sometime
within the next week (I'm more than a little swamped; in the middle of
a big breakup, and a big move, simultaneously! :( ), especially the
enum instead of the use_author bit … and submit another patch RFC. It
won't be finalized until we can decide what to do about the extra
8bytes in the commit struct, though. More input welcome.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-06 19:03       ` Elliott Cable
@ 2013-06-06 19:29         ` Junio C Hamano
  2013-06-06 19:32           ` Elliott Cable
  2013-06-06 19:40         ` Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-06 19:29 UTC (permalink / raw)
  To: Elliott Cable; +Cc: git, Jeff King

Elliott Cable <me@ell.io> writes:

> On Tue, Jun 4, 2013 at 3:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> elliottcable <me@ell.io> writes:
>>> Thus, I've added an --authorship-order version of --date-order, which relies
>>> upon the AUTHOR_DATE instead of the COMMITTER_DATE; this means that old commits
>>> will continue to show up chronologically in-order despite rebasing.
>>> ---
>>
>> Missing sign-off.  Please see Documentation/SubmittingPatches.
>
> Will-do.
>
> I read that part, and was rather confused. At no point did I get the
> idea that I should sign-off *my own initial commit*. Perhaps that part
> of the documentation needs to be slightly re-written? Would that be a
> welcome change?

I fail to see what more needs to be clarified on top of what we
already have; please re-read "(5) Sign your work" section, paying
with special attention to:

 - "YOU WROTE IT or otherwise have the right to pass it on".

 - "the contribution was created in whole or in part BY ME and I
   HAVE THE RIGHT TO SUBMIT".

But perhaps you meant something else by "*my own initial commit*"???

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-06 19:29         ` Junio C Hamano
@ 2013-06-06 19:32           ` Elliott Cable
  0 siblings, 0 replies; 51+ messages in thread
From: Elliott Cable @ 2013-06-06 19:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

Wow. That's my bad entirely. I apparently hallucinated a section
suggesting that you “sign-off” commits that you'd reviewed, or
something; and I'd completely skipped the section on certifying that
you have legal rights to the work, because I'd *written* it, and
didn't think it'd be relevant.

I feel like an idiot. Forgive me. I'll --signoff my next version of
the patch. o7

On Thu, Jun 6, 2013 at 3:29 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Elliott Cable <me@ell.io> writes:
>
>> On Tue, Jun 4, 2013 at 3:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> elliottcable <me@ell.io> writes:
>>>> Thus, I've added an --authorship-order version of --date-order, which relies
>>>> upon the AUTHOR_DATE instead of the COMMITTER_DATE; this means that old commits
>>>> will continue to show up chronologically in-order despite rebasing.
>>>> ---
>>>
>>> Missing sign-off.  Please see Documentation/SubmittingPatches.
>>
>> Will-do.
>>
>> I read that part, and was rather confused. At no point did I get the
>> idea that I should sign-off *my own initial commit*. Perhaps that part
>> of the documentation needs to be slightly re-written? Would that be a
>> welcome change?
>
> I fail to see what more needs to be clarified on top of what we
> already have; please re-read "(5) Sign your work" section, paying
> with special attention to:
>
>  - "YOU WROTE IT or otherwise have the right to pass it on".
>
>  - "the contribution was created in whole or in part BY ME and I
>    HAVE THE RIGHT TO SUBMIT".
>
> But perhaps you meant something else by "*my own initial commit*"???

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-06 19:03       ` Elliott Cable
  2013-06-06 19:29         ` Junio C Hamano
@ 2013-06-06 19:40         ` Junio C Hamano
  2013-06-06 22:48           ` Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-06 19:40 UTC (permalink / raw)
  To: Elliott Cable; +Cc: git, Jeff King

Elliott Cable <me@ell.io> writes:

>> And update revs->lifo to use that same enum, without adding
>> use_author_date bit to rev_info.
>
> I'll look into replacing lifo with an enum as soon as I can sit back
> down to update this patch. For the moment, nothing more than
> committer_date_sort and author_date_sort, I suppose?

> I'll try and make this a separate patch. First commit, to replace lifo
> with an enum; second commit, to *actually implement* the code obeying
> that enum when it is set to author_date_order.

If you want to do this in a multi-step series (which may not be a
bad idea), I would imagine that the enum starts as a choice between
the two: traversal-order vs committer-date-order.  The first patch
would change nothing else.

And then you would add the third choice, author-date-order, and
implement the logic to sort them using author instead of committer
date in the same patch.

>> Elliott: you can see the relevant changes to the topo-sort in commit
>> 96c4f4a (commit: allow associating auxiliary info on-demand,
>> 2013-04-09).
>>
>> -Peff
>
> Again, might be a little over my head. If you really think it's best
> that I look into that branch, I will try. :)
>
> Meantime, is there any other, more-immediate approach you can think
> of? I thought, for a moment, of only storing *either* the committer
> *or* the author date in the commit-struct at a given time, and
> flagging with a single bit ... but I'm not sure how widely-spread the
> nead for committer-date currently is. Maybe I can go back and
> parse-out the author date *when I need it*, instead, though that
> sounds slow …

You would parse all of them at the beginning of topo-sort function
once and store these dates in the commit-info-slab (alongside with
indegree).  Once you are done sorting, you can discard the slab.

This could be done as a follow-up patch, but the tons of helper
functions you added to compare by author date to revision.c will
have to be removed in such a transition, because the whole point of
using commit-info-slab is not to have commit->author_date field,
which these new helpers work on.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH/RFC] rev-list: add --authorship-order alternative ordering
  2013-06-06 19:40         ` Junio C Hamano
@ 2013-06-06 22:48           ` Junio C Hamano
  2013-06-06 23:25             ` [PATCH] toposort: rename "lifo" field Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-06 22:48 UTC (permalink / raw)
  To: Elliott Cable; +Cc: git, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> If you want to do this in a multi-step series (which may not be a
> bad idea), I would imagine that the enum starts as a choice between
> the two: traversal-order vs committer-date-order.  The first patch
> would change nothing else.
>
> And then you would add the third choice, author-date-order, and
> implement the logic to sort them using author instead of committer
> date in the same patch.
> ...
> You would parse all of them at the beginning of topo-sort function
> once and store these dates in the commit-info-slab (alongside with
> indegree).  Once you are done sorting, you can discard the slab.
>
> This could be done as a follow-up patch, but the tons of helper
> functions you added to compare by author date to revision.c will
> have to be removed in such a transition, because the whole point of
> using commit-info-slab is not to have commit->author_date field,
> which these new helpers work on.

As I needed to have an excuse to push jk/commit-info-slab topic
further (I have an unpublished show-branch rewrite on top of it),
I may take a look at doing this myself if/when I find some time.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] toposort: rename "lifo" field
  2013-06-06 22:48           ` Junio C Hamano
@ 2013-06-06 23:25             ` Junio C Hamano
  2013-06-07  1:25               ` Junio C Hamano
  2013-06-07  5:09               ` [PATCH] toposort: rename "lifo" field Eric Sunshine
  0 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-06 23:25 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

When sorting commits topologically, the primary invariant is to emit
all children before its parent is emitted.  When traversing a forked
history like this with "git log C E":

    A----B----C
     \
      D----E

we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.

In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.  Most of the time, we would want to see C and B emitted
together, and then E and D, and finally A.  This is the default
behaviour for --topo-order output.

The "lifo" parameter of the sort_in_topological_order() function is
used to control this.  After inspecting C, we notice and record that
B needs to be inspected, and by structuring the "work to be done"
set as a LIFO stack, we ensure that B is inspected next, before
other in-flight commits we had known that we will need to inspect,
e.g. E, that have already been in the "work to be done" set.

When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, mixing commits from two parallel histories together.  When "lifo"
is set to false, the function keeps the "work to be done" set sorted
in the date order to realize this sematics.

But the name "lifo" is too tied to the way how the function implements
its behaviour, and does not describe _what_ is the desired semantcs.

Replace the "lifo" field with an enum rev_sort_order, with two
possible values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE.

The mechanical replacement rule is:

  "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
  "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

> As I needed to have an excuse to push jk/commit-info-slab topic
> further (I have an unpublished show-branch rewrite on top of it),
> I may take a look at doing this myself if/when I find some time.

  So this is the first step, applies on top of jk/commit-info-slab.

 builtin/log.c         |  2 +-
 builtin/show-branch.c | 14 ++++++++------
 commit.c              | 12 ++++++++----
 commit.h              | 14 +++++++++++---
 revision.c            | 10 +++++-----
 revision.h            |  6 +++++-
 6 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 8f0b2e8..8d26042 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -205,7 +205,7 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
 	int i = revs->early_output;
 	int show_header = 1;
 
-	sort_in_topological_order(&list, revs->lifo);
+	sort_in_topological_order(&list, revs->sort_order);
 	while (list && i) {
 		struct commit *commit = list->item;
 		switch (simplify_commit(revs, commit)) {
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index d208fd6..7c57985 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -631,7 +631,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	int num_rev, i, extra = 0;
 	int all_heads = 0, all_remotes = 0;
 	int all_mask, all_revs;
-	int lifo = 1;
+	enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
 	char head[128];
 	const char *head_p;
 	int head_len;
@@ -666,15 +666,17 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			    N_("show possible merge bases")),
 		OPT_BOOLEAN(0, "independent", &independent,
 			    N_("show refs unreachable from any other ref")),
-		OPT_BOOLEAN(0, "topo-order", &lifo,
-			    N_("show commits in topological order")),
+		OPT_SET_INT(0, "topo-order", &sort_order,
+			    N_("show commits in topological order"),
+			    REV_SORT_IN_GRAPH_ORDER),
 		OPT_BOOLEAN(0, "topics", &topics,
 			    N_("show only commits not on the first branch")),
 		OPT_SET_INT(0, "sparse", &dense,
 			    N_("show merges reachable from only one tip"), 0),
-		OPT_SET_INT(0, "date-order", &lifo,
+		OPT_SET_INT(0, "date-order", &sort_order,
 			    N_("show commits where no parent comes before its "
-			       "children"), 0),
+			       "children"),
+			    REV_SORT_BY_COMMIT_DATE),
 		{ OPTION_CALLBACK, 'g', "reflog", &reflog_base, N_("<n>[,<base>]"),
 			    N_("show <n> most recent ref-log entries starting at "
 			       "base"),
@@ -901,7 +903,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 		exit(0);
 
 	/* Sort topologically */
-	sort_in_topological_order(&seen, lifo);
+	sort_in_topological_order(&seen, sort_order);
 
 	/* Give names to commits */
 	if (!sha1_name && !no_name)
diff --git a/commit.c b/commit.c
index 66a6c00..fc1734b 100644
--- a/commit.c
+++ b/commit.c
@@ -507,7 +507,7 @@ define_commit_slab(indegree_slab, int);
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo)
+void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order sort_order)
 {
 	struct commit_list *next, *orig = *list;
 	struct commit_list *work, **insert;
@@ -556,7 +556,7 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 	}
 
 	/* process the list in topological order */
-	if (!lifo)
+	if (sort_order != REV_SORT_IN_GRAPH_ORDER)
 		commit_list_sort_by_date(&work);
 
 	pptr = list;
@@ -583,10 +583,14 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 			 * guaranteeing topological order.
 			 */
 			if (--(*pi) == 1) {
-				if (!lifo)
+				switch (sort_order) {
+				case REV_SORT_BY_COMMIT_DATE:
 					commit_list_insert_by_date(parent, &work);
-				else
+					break;
+				default: /* REV_SORT_IN_GRAPH_ORDER */
 					commit_list_insert(parent, &work);
+					break;
+				}
 			}
 		}
 		/*
diff --git a/commit.h b/commit.h
index 70e749d..247e474 100644
--- a/commit.h
+++ b/commit.h
@@ -139,15 +139,23 @@ struct commit *pop_commit(struct commit_list **stack);
 void clear_commit_marks(struct commit *commit, unsigned int mark);
 void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark);
 
+
+enum rev_sort_order {
+	REV_SORT_IN_GRAPH_ORDER = 0,
+	REV_SORT_BY_COMMIT_DATE
+};
+
 /*
  * Performs an in-place topological sort of list supplied.
  *
  *   invariant of resulting list is:
  *      a reachable from b => ord(b) < ord(a)
- *   in addition, when lifo == 0, commits on parallel tracks are
- *   sorted in the dates order.
+ *   sort_order further specifies:
+ *   REV_SORT_IN_GRAPH_ORDER: try to show a commit on a single-parent
+ *                            chain together.
+ *   REV_SORT_BY_COMMIT_DATE: show eligible commits in committer-date order.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo);
+void sort_in_topological_order(struct commit_list **, enum rev_sort_order);
 
 struct commit_graft {
 	unsigned char sha1[20];
diff --git a/revision.c b/revision.c
index cf620c6..966ebbc 100644
--- a/revision.c
+++ b/revision.c
@@ -1038,7 +1038,7 @@ void init_revisions(struct rev_info *revs, const char *prefix)
 	DIFF_OPT_SET(&revs->pruning, QUICK);
 	revs->pruning.add_remove = file_add_remove;
 	revs->pruning.change = file_change;
-	revs->lifo = 1;
+	revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
@@ -1373,7 +1373,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--merge")) {
 		revs->show_merge = 1;
 	} else if (!strcmp(arg, "--topo-order")) {
-		revs->lifo = 1;
+		revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 		revs->topo_order = 1;
 	} else if (!strcmp(arg, "--simplify-merges")) {
 		revs->simplify_merges = 1;
@@ -1391,7 +1391,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->prune = 1;
 		load_ref_decorations(DECORATE_SHORT_REFS);
 	} else if (!strcmp(arg, "--date-order")) {
-		revs->lifo = 0;
+		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
 	} else if (!prefixcmp(arg, "--early-output")) {
 		int count = 100;
@@ -2165,7 +2165,7 @@ int prepare_revision_walk(struct rev_info *revs)
 		if (limit_list(revs) < 0)
 			return -1;
 	if (revs->topo_order)
-		sort_in_topological_order(&revs->commits, revs->lifo);
+		sort_in_topological_order(&revs->commits, revs->sort_order);
 	if (revs->simplify_merges)
 		simplify_merges(revs);
 	if (revs->children.name)
@@ -2480,7 +2480,7 @@ static void create_boundary_commit_list(struct rev_info *revs)
 	 * If revs->topo_order is set, sort the boundary commits
 	 * in topological order
 	 */
-	sort_in_topological_order(&revs->commits, revs->lifo);
+	sort_in_topological_order(&revs->commits, revs->sort_order);
 }
 
 static struct commit *get_revision_internal(struct rev_info *revs)
diff --git a/revision.h b/revision.h
index 5da09ee..2a5e325 100644
--- a/revision.h
+++ b/revision.h
@@ -4,6 +4,7 @@
 #include "parse-options.h"
 #include "grep.h"
 #include "notes.h"
+#include "commit.h"
 
 #define SEEN		(1u<<0)
 #define UNINTERESTING   (1u<<1)
@@ -60,6 +61,10 @@ struct rev_info {
 	const char *prefix;
 	const char *def;
 	struct pathspec prune_data;
+
+	/* topo-sort */
+	enum rev_sort_order sort_order;
+
 	unsigned int	early_output:1,
 			ignore_missing:1;
 
@@ -70,7 +75,6 @@ struct rev_info {
 			show_all:1,
 			remove_empty_trees:1,
 			simplify_history:1,
-			lifo:1,
 			topo_order:1,
 			simplify_merges:1,
 			simplify_by_decoration:1,
-- 
1.8.3-451-gb703ddf

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] toposort: rename "lifo" field
  2013-06-06 23:25             ` [PATCH] toposort: rename "lifo" field Junio C Hamano
@ 2013-06-07  1:25               ` Junio C Hamano
  2013-06-07  5:11                 ` [PATCH 0/3] Preparing for --date-order=author Junio C Hamano
  2013-06-07  5:09               ` [PATCH] toposort: rename "lifo" field Eric Sunshine
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-07  1:25 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> When sorting commits topologically, the primary invariant is to emit
> all children before its parent is emitted.  When traversing a forked

s/its/their/;

>> As I needed to have an excuse to push jk/commit-info-slab topic
>> further (I have an unpublished show-branch rewrite on top of it),
>> I may take a look at doing this myself if/when I find some time.
>
>   So this is the first step, applies on top of jk/commit-info-slab.

The next step will be to replace the use of commit_list in this
function with a priority queue, whose API may look like what is at
the end of this message.

Then write a compare function that looks at commit->date field to
compare committer timestamp, and set it to commit_queue->compare
when REV_SORT_BY_COMMIT_DATE is asked for.  When doing the graph
traversal order, set compare function to NULL when initializing the
commit_queue and use it as a LIFO stack.

And the step after that will be to add an author-date field to the
commit-info-slab we currently use to keep track of indegree, grab
author timestamp from commits as we encounter them, and write
another comparison function to use that information (using the
cb_data field of commit_queue to point at the info slab) to
implement REV_SORT_BY_AUTHOR_DATE.  That step can also implement the
command line option parsing for the new --author-date-order option
(or alternatively, --date-order={author,committer}).


#ifndef COMMIT_QUEUE_H
#define COMMIT_QUEUE_H

/*
 * Compare two commits; the third parameter is cb_data in the
 * commit_queue structure.
 */
typedef int (*commit_compare_fn)(struct commit *, struct commit *, void *);

struct commit_queue {
	commit_compare_fn compare;
	void *cb_data;
	int alloc, nr;
	struct commit **array;
};

/*
 * Add the commit to the queue
 */
struct commit *commit_queue_put(struct commit_queue *, struct commit *);

/*
 * Extract the commit that compares the smallest out of the queue,
 * or NULL.  If compare function is NULL, the queue acts as a LIFO
 * stack.
 */
struct commit *commit_queue_get(struct commit_queue *);

#endif /* COMMIT_QUEUE_H */

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] toposort: rename "lifo" field
  2013-06-06 23:25             ` [PATCH] toposort: rename "lifo" field Junio C Hamano
  2013-06-07  1:25               ` Junio C Hamano
@ 2013-06-07  5:09               ` Eric Sunshine
  1 sibling, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2013-06-07  5:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Elliott Cable, Jeff King

On Thu, Jun 6, 2013 at 7:25 PM, Junio C Hamano <gitster@pobox.com> wrote:
> When sorting commits topologically, the primary invariant is to emit
> all children before its parent is emitted.  When traversing a forked
> history like this with "git log C E":
>
>     A----B----C
>      \
>       D----E
>
> we ensure that A is emitted after all of B, C, D, and E are done, B
> has to wait until C is done, and D has to wait until E is done.
>
> In some applications, however, we would further want to control how
> these child commits B, C, D and E on two parallel ancestry chains
> are shown.  Most of the time, we would want to see C and B emitted
> together, and then E and D, and finally A.  This is the default
> behaviour for --topo-order output.
>
> The "lifo" parameter of the sort_in_topological_order() function is
> used to control this.  After inspecting C, we notice and record that
> B needs to be inspected, and by structuring the "work to be done"
> set as a LIFO stack, we ensure that B is inspected next, before
> other in-flight commits we had known that we will need to inspect,
> e.g. E, that have already been in the "work to be done" set.
>
> When showing in --date-order, we would want to see commits ordered
> by timestamps, i.e. show C, E, B and D in this order before showing
> A, mixing commits from two parallel histories together.  When "lifo"
> is set to false, the function keeps the "work to be done" set sorted
> in the date order to realize this sematics.

s/sematics/semantics/ (or perhaps s/.../semantic/ ?)

> But the name "lifo" is too tied to the way how the function implements
> its behaviour, and does not describe _what_ is the desired semantcs.

s/semantcs/semantics/

> Replace the "lifo" field with an enum rev_sort_order, with two
> possible values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE.
>
> The mechanical replacement rule is:
>
>   "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
>   "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 0/3] Preparing for --date-order=author
  2013-06-07  1:25               ` Junio C Hamano
@ 2013-06-07  5:11                 ` Junio C Hamano
  2013-06-07  5:11                   ` [PATCH 1/3] toposort: rename "lifo" field Junio C Hamano
                                     ` (3 more replies)
  0 siblings, 4 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-07  5:11 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

These three patches introduce a commit-queue API to manage a set of
commits in a priority queue, with a caller-specified comparison
function.  The priority queue replaces the singly-listed commit_list
in the topological sort function.

The series applies on top of the commit-info-slab API sesries Peff
and I did two months ago.  These three patches do not use the slab
API yet, but a follow-on patch to introduce REV_SORT_BY_AUTHOR_DATE
needs to use commit-slab to record author date for the commits being
sorted, and consult it in its comparison function when comparing the
author dates of commits.

Junio C Hamano (3):
  toposort: rename "lifo" field
  commit-queue: LIFO or priority queue of commits
  sort-in-topological-order: use commit-queue

 Makefile              |  2 ++
 builtin/log.c         |  2 +-
 builtin/show-branch.c | 14 +++++----
 commit-queue.c        | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++
 commit-queue.h        | 34 +++++++++++++++++++++
 commit.c              | 70 +++++++++++++++++++++++++-----------------
 commit.h              | 14 +++++++--
 revision.c            | 10 +++---
 revision.h            |  6 +++-
 9 files changed, 193 insertions(+), 43 deletions(-)
 create mode 100644 commit-queue.c
 create mode 100644 commit-queue.h

-- 
1.8.3-451-gb703ddf

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/3] toposort: rename "lifo" field
  2013-06-07  5:11                 ` [PATCH 0/3] Preparing for --date-order=author Junio C Hamano
@ 2013-06-07  5:11                   ` Junio C Hamano
  2013-06-07  5:18                     ` Eric Sunshine
  2013-06-07  5:11                   ` [PATCH 2/3] commit-queue: LIFO or priority queue of commits Junio C Hamano
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-07  5:11 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

The primary invariant of sort_in_topological_order() is to emit all
children before their parent is emitted.  When traversing a forked
history like this with "git log C E":

    A----B----C
     \
      D----E

we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.

In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.  Most of the time, we would want to see C and B emitted
together, and then E and D, and finally A, which is the default
behaviour for --topo-order output.

The "lifo" parameter of the sort_in_topological_order() function is
used to implement this behaviour.  After inspecting C, we notice and
record that B needs to be inspected, and by structuring the "work to
be done" set as a LIFO stack, we ensure that B is inspected next,
before other in-flight commits we had known that we will need to
inspect, e.g. E, that may have already been sitting in the "work to
be done" set.

When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, possibly mixing commits from two parallel histories together.
When "lifo" parameter is set to false, the function keeps the "work
to be done" set sorted in the date order to realize this semantics.

But the name "lifo" is too tied to the way how the function implements
its behaviour, and does not describe _what_ the desired semantics is.

Replace the "lifo" field with an enum rev_sort_order, with two
possible values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE.

The mechanical replacement rule is:

  "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
  "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/log.c         |  2 +-
 builtin/show-branch.c | 14 ++++++++------
 commit.c              | 12 ++++++++----
 commit.h              | 14 +++++++++++---
 revision.c            | 10 +++++-----
 revision.h            |  6 +++++-
 6 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 8f0b2e8..8d26042 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -205,7 +205,7 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
 	int i = revs->early_output;
 	int show_header = 1;
 
-	sort_in_topological_order(&list, revs->lifo);
+	sort_in_topological_order(&list, revs->sort_order);
 	while (list && i) {
 		struct commit *commit = list->item;
 		switch (simplify_commit(revs, commit)) {
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index d208fd6..7c57985 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -631,7 +631,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	int num_rev, i, extra = 0;
 	int all_heads = 0, all_remotes = 0;
 	int all_mask, all_revs;
-	int lifo = 1;
+	enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
 	char head[128];
 	const char *head_p;
 	int head_len;
@@ -666,15 +666,17 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			    N_("show possible merge bases")),
 		OPT_BOOLEAN(0, "independent", &independent,
 			    N_("show refs unreachable from any other ref")),
-		OPT_BOOLEAN(0, "topo-order", &lifo,
-			    N_("show commits in topological order")),
+		OPT_SET_INT(0, "topo-order", &sort_order,
+			    N_("show commits in topological order"),
+			    REV_SORT_IN_GRAPH_ORDER),
 		OPT_BOOLEAN(0, "topics", &topics,
 			    N_("show only commits not on the first branch")),
 		OPT_SET_INT(0, "sparse", &dense,
 			    N_("show merges reachable from only one tip"), 0),
-		OPT_SET_INT(0, "date-order", &lifo,
+		OPT_SET_INT(0, "date-order", &sort_order,
 			    N_("show commits where no parent comes before its "
-			       "children"), 0),
+			       "children"),
+			    REV_SORT_BY_COMMIT_DATE),
 		{ OPTION_CALLBACK, 'g', "reflog", &reflog_base, N_("<n>[,<base>]"),
 			    N_("show <n> most recent ref-log entries starting at "
 			       "base"),
@@ -901,7 +903,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 		exit(0);
 
 	/* Sort topologically */
-	sort_in_topological_order(&seen, lifo);
+	sort_in_topological_order(&seen, sort_order);
 
 	/* Give names to commits */
 	if (!sha1_name && !no_name)
diff --git a/commit.c b/commit.c
index 66a6c00..fc1734b 100644
--- a/commit.c
+++ b/commit.c
@@ -507,7 +507,7 @@ define_commit_slab(indegree_slab, int);
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo)
+void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order sort_order)
 {
 	struct commit_list *next, *orig = *list;
 	struct commit_list *work, **insert;
@@ -556,7 +556,7 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 	}
 
 	/* process the list in topological order */
-	if (!lifo)
+	if (sort_order != REV_SORT_IN_GRAPH_ORDER)
 		commit_list_sort_by_date(&work);
 
 	pptr = list;
@@ -583,10 +583,14 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 			 * guaranteeing topological order.
 			 */
 			if (--(*pi) == 1) {
-				if (!lifo)
+				switch (sort_order) {
+				case REV_SORT_BY_COMMIT_DATE:
 					commit_list_insert_by_date(parent, &work);
-				else
+					break;
+				default: /* REV_SORT_IN_GRAPH_ORDER */
 					commit_list_insert(parent, &work);
+					break;
+				}
 			}
 		}
 		/*
diff --git a/commit.h b/commit.h
index 70e749d..247e474 100644
--- a/commit.h
+++ b/commit.h
@@ -139,15 +139,23 @@ struct commit *pop_commit(struct commit_list **stack);
 void clear_commit_marks(struct commit *commit, unsigned int mark);
 void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark);
 
+
+enum rev_sort_order {
+	REV_SORT_IN_GRAPH_ORDER = 0,
+	REV_SORT_BY_COMMIT_DATE
+};
+
 /*
  * Performs an in-place topological sort of list supplied.
  *
  *   invariant of resulting list is:
  *      a reachable from b => ord(b) < ord(a)
- *   in addition, when lifo == 0, commits on parallel tracks are
- *   sorted in the dates order.
+ *   sort_order further specifies:
+ *   REV_SORT_IN_GRAPH_ORDER: try to show a commit on a single-parent
+ *                            chain together.
+ *   REV_SORT_BY_COMMIT_DATE: show eligible commits in committer-date order.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo);
+void sort_in_topological_order(struct commit_list **, enum rev_sort_order);
 
 struct commit_graft {
 	unsigned char sha1[20];
diff --git a/revision.c b/revision.c
index cf620c6..966ebbc 100644
--- a/revision.c
+++ b/revision.c
@@ -1038,7 +1038,7 @@ void init_revisions(struct rev_info *revs, const char *prefix)
 	DIFF_OPT_SET(&revs->pruning, QUICK);
 	revs->pruning.add_remove = file_add_remove;
 	revs->pruning.change = file_change;
-	revs->lifo = 1;
+	revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
@@ -1373,7 +1373,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--merge")) {
 		revs->show_merge = 1;
 	} else if (!strcmp(arg, "--topo-order")) {
-		revs->lifo = 1;
+		revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 		revs->topo_order = 1;
 	} else if (!strcmp(arg, "--simplify-merges")) {
 		revs->simplify_merges = 1;
@@ -1391,7 +1391,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->prune = 1;
 		load_ref_decorations(DECORATE_SHORT_REFS);
 	} else if (!strcmp(arg, "--date-order")) {
-		revs->lifo = 0;
+		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
 	} else if (!prefixcmp(arg, "--early-output")) {
 		int count = 100;
@@ -2165,7 +2165,7 @@ int prepare_revision_walk(struct rev_info *revs)
 		if (limit_list(revs) < 0)
 			return -1;
 	if (revs->topo_order)
-		sort_in_topological_order(&revs->commits, revs->lifo);
+		sort_in_topological_order(&revs->commits, revs->sort_order);
 	if (revs->simplify_merges)
 		simplify_merges(revs);
 	if (revs->children.name)
@@ -2480,7 +2480,7 @@ static void create_boundary_commit_list(struct rev_info *revs)
 	 * If revs->topo_order is set, sort the boundary commits
 	 * in topological order
 	 */
-	sort_in_topological_order(&revs->commits, revs->lifo);
+	sort_in_topological_order(&revs->commits, revs->sort_order);
 }
 
 static struct commit *get_revision_internal(struct rev_info *revs)
diff --git a/revision.h b/revision.h
index 5da09ee..2a5e325 100644
--- a/revision.h
+++ b/revision.h
@@ -4,6 +4,7 @@
 #include "parse-options.h"
 #include "grep.h"
 #include "notes.h"
+#include "commit.h"
 
 #define SEEN		(1u<<0)
 #define UNINTERESTING   (1u<<1)
@@ -60,6 +61,10 @@ struct rev_info {
 	const char *prefix;
 	const char *def;
 	struct pathspec prune_data;
+
+	/* topo-sort */
+	enum rev_sort_order sort_order;
+
 	unsigned int	early_output:1,
 			ignore_missing:1;
 
@@ -70,7 +75,6 @@ struct rev_info {
 			show_all:1,
 			remove_empty_trees:1,
 			simplify_history:1,
-			lifo:1,
 			topo_order:1,
 			simplify_merges:1,
 			simplify_by_decoration:1,
-- 
1.8.3-451-gb703ddf

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/3] commit-queue: LIFO or priority queue of commits
  2013-06-07  5:11                 ` [PATCH 0/3] Preparing for --date-order=author Junio C Hamano
  2013-06-07  5:11                   ` [PATCH 1/3] toposort: rename "lifo" field Junio C Hamano
@ 2013-06-07  5:11                   ` Junio C Hamano
  2013-06-07  5:29                     ` Eric Sunshine
  2013-06-07  5:11                   ` [PATCH 3/3] sort-in-topological-order: use commit-queue Junio C Hamano
  2013-06-09 23:24                   ` [PATCH v2 0/4] log --author-date-order Junio C Hamano
  3 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-07  5:11 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Traditionally we used a singly linked list of commits to hold a set
of in-flight commits while traversing history.  The most typical use
of the list is to insert commit that is newly discovered in it, keep
it sorted by commit timestamp, pick up the newest one from the list,
and keep digging.  The cost of keeping the singly linked list sorted
is nontrivial, and this typical use pattern better matches a priority
queue.

Introduce a commit-queue structure, that can be used either as a
LIFO stack, or a priority queue.  This will be used in the next
patch to hold in-flight commits during sort-in-topological-order.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile       |  2 ++
 commit-queue.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 commit-queue.h | 31 +++++++++++++++++++++++++
 3 files changed, 104 insertions(+)
 create mode 100644 commit-queue.c
 create mode 100644 commit-queue.h

diff --git a/Makefile b/Makefile
index 598d631..3cf55e9 100644
--- a/Makefile
+++ b/Makefile
@@ -634,6 +634,7 @@ LIB_H += cache.h
 LIB_H += color.h
 LIB_H += column.h
 LIB_H += commit.h
+LIB_H += commit-queue.h
 LIB_H += compat/bswap.h
 LIB_H += compat/cygwin.h
 LIB_H += compat/mingw.h
@@ -757,6 +758,7 @@ LIB_OBJS += color.o
 LIB_OBJS += column.o
 LIB_OBJS += combine-diff.o
 LIB_OBJS += commit.o
+LIB_OBJS += commit-queue.o
 LIB_OBJS += compat/obstack.o
 LIB_OBJS += compat/terminal.o
 LIB_OBJS += config.o
diff --git a/commit-queue.c b/commit-queue.c
new file mode 100644
index 0000000..77d4b02
--- /dev/null
+++ b/commit-queue.c
@@ -0,0 +1,71 @@
+#include "cache.h"
+#include "commit.h"
+#include "commit-queue.h"
+
+void clear_commit_queue(struct commit_queue *queue)
+{
+	free(queue->array);
+	queue->nr = 0;
+	queue->alloc = 0;
+	queue->array = NULL;
+}
+
+void commit_queue_put(struct commit_queue *queue, struct commit *commit)
+{
+	commit_compare_fn compare = queue->compare;
+	int ix, parent;
+
+	/* Append at the end */
+	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
+	queue->array[queue->nr++] = commit;
+	if (!compare)
+		return; /* LIFO */
+
+	/* Bubble up the new one */
+	for (ix = queue->nr - 1; ix; ix = parent) {
+		parent = (ix - 1) / 2;
+		if (compare(queue->array[parent], queue->array[ix],
+			    queue->cb_data) < 0)
+			break;
+
+		commit = queue->array[parent];
+		queue->array[parent] = queue->array[ix];
+		queue->array[ix] = commit;
+	}
+}
+
+struct commit *commit_queue_get(struct commit_queue *queue)
+{
+	struct commit *result, *swap;
+	int ix, child;
+	commit_compare_fn compare = queue->compare;
+
+	if (!queue->nr)
+		return NULL;
+	if (!compare)
+		return queue->array[--queue->nr]; /* LIFO */
+
+	result = queue->array[0];
+	if (!--queue->nr)
+		return result;
+
+	queue->array[0] = queue->array[queue->nr];
+
+	/* Push down the one at the root */
+	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
+		child = ix * 2 + 1; /* left */
+		if ((child + 1 < queue->nr) &&
+		    (compare(queue->array[child], queue->array[child + 1],
+			     queue->cb_data) >= 0))
+			child++; /* use right child */
+
+		if (compare(queue->array[ix], queue->array[child],
+			    queue->cb_data) < 0)
+			break;
+
+		swap = queue->array[child];
+		queue->array[child] = queue->array[ix];
+		queue->array[ix] = swap;
+	}
+	return result;
+}
diff --git a/commit-queue.h b/commit-queue.h
new file mode 100644
index 0000000..7c5dc4c
--- /dev/null
+++ b/commit-queue.h
@@ -0,0 +1,31 @@
+#ifndef COMMIT_QUEUE_H
+#define COMMIT_QUEUE_H
+
+/*
+ * Compare two commits; the third parameter is cb_data in the
+ * commit_queue structure.
+ */
+typedef int (*commit_compare_fn)(struct commit *, struct commit *, void *);
+
+struct commit_queue {
+	commit_compare_fn compare;
+	void *cb_data;
+	int alloc, nr;
+	struct commit **array;
+};
+
+/*
+ * Add the commit to the queue
+ */
+extern void commit_queue_put(struct commit_queue *, struct commit *);
+
+/*
+ * Extract the commit that compares the smallest out of the queue,
+ * or NULL.  If compare function is NULL, the queue acts as a LIFO
+ * stack.
+ */
+extern struct commit *commit_queue_get(struct commit_queue *);
+
+extern void clear_commit_queue(struct commit_queue *);
+
+#endif /* COMMIT_QUEUE_H */
-- 
1.8.3-451-gb703ddf

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 3/3] sort-in-topological-order: use commit-queue
  2013-06-07  5:11                 ` [PATCH 0/3] Preparing for --date-order=author Junio C Hamano
  2013-06-07  5:11                   ` [PATCH 1/3] toposort: rename "lifo" field Junio C Hamano
  2013-06-07  5:11                   ` [PATCH 2/3] commit-queue: LIFO or priority queue of commits Junio C Hamano
@ 2013-06-07  5:11                   ` Junio C Hamano
  2013-06-09 23:24                   ` [PATCH v2 0/4] log --author-date-order Junio C Hamano
  3 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-07  5:11 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Use the commit-queue data structure to implement a priority queue
of commits sorted by committer date, when handling --date-order.
The commit-queue structure can also be used as a simple LIFO stack,
which is a good match for --topo-order processing.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit-queue.c | 13 +++++++++++
 commit-queue.h |  3 +++
 commit.c       | 74 ++++++++++++++++++++++++++++++++++------------------------
 3 files changed, 59 insertions(+), 31 deletions(-)

diff --git a/commit-queue.c b/commit-queue.c
index 77d4b02..ffffc4e 100644
--- a/commit-queue.c
+++ b/commit-queue.c
@@ -2,6 +2,19 @@
 #include "commit.h"
 #include "commit-queue.h"
 
+void commit_queue_reverse(struct commit_queue *queue)
+{
+	int i, j;
+
+	if (queue->compare != NULL)
+		die("BUG: commit_queue_reverse() on non-LIFO queue");
+	for (i = 0; i <= (j = (queue->nr - 1) - i); i++) {
+		struct commit *swap = queue->array[i];
+		queue->array[i] = queue->array[j];
+		queue->array[j] = swap;
+	}
+}
+
 void clear_commit_queue(struct commit_queue *queue)
 {
 	free(queue->array);
diff --git a/commit-queue.h b/commit-queue.h
index 7c5dc4c..d3c92e5 100644
--- a/commit-queue.h
+++ b/commit-queue.h
@@ -28,4 +28,7 @@ extern struct commit *commit_queue_get(struct commit_queue *);
 
 extern void clear_commit_queue(struct commit_queue *);
 
+/* Reverse the LIFO elements */
+extern void commit_queue_reverse(struct commit_queue *);
+
 #endif /* COMMIT_QUEUE_H */
diff --git a/commit.c b/commit.c
index fc1734b..46cd150 100644
--- a/commit.c
+++ b/commit.c
@@ -9,6 +9,7 @@
 #include "gpg-interface.h"
 #include "mergesort.h"
 #include "commit-slab.h"
+#include "commit-queue.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -504,21 +505,41 @@ struct commit *pop_commit(struct commit_list **stack)
 
 define_commit_slab(indegree_slab, int);
 
+static int compare_commits_by_commit_date(struct commit *a, struct commit *b, void *unused)
+{
+	/* newer commits with larger date first */
+	if (a->date < b->date)
+		return 1;
+	else if (a->date > b->date)
+		return -1;
+	return 0;
+}
+
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order sort_order)
+void sort_in_topological_order(struct commit_list **list, enum rev_sort_order sort_order)
 {
 	struct commit_list *next, *orig = *list;
-	struct commit_list *work, **insert;
 	struct commit_list **pptr;
 	struct indegree_slab indegree;
+	struct commit_queue queue;
+	struct commit *commit;
 
 	if (!orig)
 		return;
 	*list = NULL;
 
 	init_indegree_slab(&indegree);
+	memset(&queue, '\0', sizeof(queue));
+	switch (sort_order) {
+	default: /* REV_SORT_IN_GRAPH_ORDER */
+		queue.compare = NULL;
+		break;
+	case REV_SORT_BY_COMMIT_DATE:
+		queue.compare = compare_commits_by_commit_date;
+		break;
+	}
 
 	/* Mark them and clear the indegree */
 	for (next = orig; next; next = next->next) {
@@ -528,7 +549,7 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 
 	/* update the indegree */
 	for (next = orig; next; next = next->next) {
-		struct commit_list * parents = next->item->parents;
+		struct commit_list *parents = next->item->parents;
 		while (parents) {
 			struct commit *parent = parents->item;
 			int *pi = indegree_slab_at(&indegree, parent);
@@ -546,30 +567,28 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 	 *
 	 * the tips serve as a starting set for the work queue.
 	 */
-	work = NULL;
-	insert = &work;
 	for (next = orig; next; next = next->next) {
 		struct commit *commit = next->item;
 
 		if (*(indegree_slab_at(&indegree, commit)) == 1)
-			insert = &commit_list_insert(commit, insert)->next;
+			commit_queue_put(&queue, commit);
 	}
 
-	/* process the list in topological order */
-	if (sort_order != REV_SORT_IN_GRAPH_ORDER)
-		commit_list_sort_by_date(&work);
+	/*
+	 * This is unfortunate; the initial tips need to be shown
+	 * in the order given from the revision traversal machinery.
+	 */
+	if (sort_order == REV_SORT_IN_GRAPH_ORDER)
+		commit_queue_reverse(&queue);
+
+	/* We no longer need the commit list */
+	free_commit_list(orig);
 
 	pptr = list;
 	*list = NULL;
-	while (work) {
-		struct commit *commit;
-		struct commit_list *parents, *work_item;
-
-		work_item = work;
-		work = work_item->next;
-		work_item->next = NULL;
+	while ((commit = commit_queue_get(&queue)) != NULL) {
+		struct commit_list *parents;
 
-		commit = work_item->item;
 		for (parents = commit->parents; parents ; parents = parents->next) {
 			struct commit *parent = parents->item;
 			int *pi = indegree_slab_at(&indegree, parent);
@@ -582,27 +601,20 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 			 * when all their children have been emitted thereby
 			 * guaranteeing topological order.
 			 */
-			if (--(*pi) == 1) {
-				switch (sort_order) {
-				case REV_SORT_BY_COMMIT_DATE:
-					commit_list_insert_by_date(parent, &work);
-					break;
-				default: /* REV_SORT_IN_GRAPH_ORDER */
-					commit_list_insert(parent, &work);
-					break;
-				}
-			}
+			if (--(*pi) == 1)
+				commit_queue_put(&queue, parent);
 		}
 		/*
-		 * work_item is a commit all of whose children
-		 * have already been emitted. we can emit it now.
+		 * all children of commit have already been
+		 * emitted. we can emit it now.
 		 */
 		*(indegree_slab_at(&indegree, commit)) = 0;
-		*pptr = work_item;
-		pptr = &work_item->next;
+
+		pptr = &commit_list_insert(commit, pptr)->next;
 	}
 
 	clear_indegree_slab(&indegree);
+	clear_commit_queue(&queue);
 }
 
 /* merge-base stuff */
-- 
1.8.3-451-gb703ddf

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/3] toposort: rename "lifo" field
  2013-06-07  5:11                   ` [PATCH 1/3] toposort: rename "lifo" field Junio C Hamano
@ 2013-06-07  5:18                     ` Eric Sunshine
  2013-06-07  5:21                       ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine @ 2013-06-07  5:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Elliott Cable, Jeff King

On Fri, Jun 7, 2013 at 1:11 AM, Junio C Hamano <gitster@pobox.com> wrote:
> The primary invariant of sort_in_topological_order() is to emit all
> children before their parent is emitted.  When traversing a forked

s/parent is/parents are/

> history like this with "git log C E":
>
>     A----B----C
>      \
>       D----E
>
> we ensure that A is emitted after all of B, C, D, and E are done, B
> has to wait until C is done, and D has to wait until E is done.
>
> In some applications, however, we would further want to control how
> these child commits B, C, D and E on two parallel ancestry chains
> are shown.  Most of the time, we would want to see C and B emitted
> together, and then E and D, and finally A, which is the default
> behaviour for --topo-order output.
>
> The "lifo" parameter of the sort_in_topological_order() function is
> used to implement this behaviour.  After inspecting C, we notice and
> record that B needs to be inspected, and by structuring the "work to
> be done" set as a LIFO stack, we ensure that B is inspected next,
> before other in-flight commits we had known that we will need to
> inspect, e.g. E, that may have already been sitting in the "work to
> be done" set.
>
> When showing in --date-order, we would want to see commits ordered
> by timestamps, i.e. show C, E, B and D in this order before showing
> A, possibly mixing commits from two parallel histories together.
> When "lifo" parameter is set to false, the function keeps the "work
> to be done" set sorted in the date order to realize this semantics.
>
> But the name "lifo" is too tied to the way how the function implements
> its behaviour, and does not describe _what_ the desired semantics is.
>
> Replace the "lifo" field with an enum rev_sort_order, with two
> possible values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE.
>
> The mechanical replacement rule is:
>
>   "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
>   "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/3] toposort: rename "lifo" field
  2013-06-07  5:18                     ` Eric Sunshine
@ 2013-06-07  5:21                       ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-07  5:21 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Elliott Cable, Jeff King

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Fri, Jun 7, 2013 at 1:11 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> The primary invariant of sort_in_topological_order() is to emit all
>> children before their parent is emitted.  When traversing a forked
>
> s/parent is/parents are/

Hmm, not quite.  The above refers to:

      B
     /
    A---C
     \
      D

where A is the parent, B, C and D are all its children.  We want to
emit all children (B, C and D) before their parent A _is_ emitted.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/3] commit-queue: LIFO or priority queue of commits
  2013-06-07  5:11                   ` [PATCH 2/3] commit-queue: LIFO or priority queue of commits Junio C Hamano
@ 2013-06-07  5:29                     ` Eric Sunshine
  0 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2013-06-07  5:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Elliott Cable, Jeff King

On Fri, Jun 7, 2013 at 1:11 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Traditionally we used a singly linked list of commits to hold a set
> of in-flight commits while traversing history.  The most typical use
> of the list is to insert commit that is newly discovered in it, keep

s/commit/a commit/

Also, "in it" is perhaps implied by "insert", so s/in it// may be appropriate.

> it sorted by commit timestamp, pick up the newest one from the list,
> and keep digging.  The cost of keeping the singly linked list sorted
> is nontrivial, and this typical use pattern better matches a priority
> queue.
>
> Introduce a commit-queue structure, that can be used either as a
> LIFO stack, or a priority queue.  This will be used in the next
> patch to hold in-flight commits during sort-in-topological-order.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 0/4] log --author-date-order
  2013-06-07  5:11                 ` [PATCH 0/3] Preparing for --date-order=author Junio C Hamano
                                     ` (2 preceding siblings ...)
  2013-06-07  5:11                   ` [PATCH 3/3] sort-in-topological-order: use commit-queue Junio C Hamano
@ 2013-06-09 23:24                   ` Junio C Hamano
  2013-06-09 23:24                     ` [PATCH v2 1/4] toposort: rename "lifo" field Junio C Hamano
                                       ` (3 more replies)
  3 siblings, 4 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-09 23:24 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Not much changed in the first three patches since the edition from
last week.  A clean-up to clarify the toposort API, introduction of
priority queue API, and then its use in topological sort logic.

The final patch adds "log --author-date-order" to build on top of
them.

Adding tests to t4202 and/or t6012 is left as an exercise to readers.

Junio C Hamano (4):
  toposort: rename "lifo" field
  commit-queue: LIFO or priority queue of commits
  sort-in-topological-order: use commit-queue
  log: --author-date-order

 Documentation/rev-list-options.txt |   4 ++
 Makefile                           |   2 +
 builtin/log.c                      |   2 +-
 builtin/show-branch.c              |  14 ++--
 commit-queue.c                     |  84 ++++++++++++++++++++++++
 commit-queue.h                     |  34 ++++++++++
 commit.c                           | 129 +++++++++++++++++++++++++++++--------
 commit.h                           |  15 ++++-
 revision.c                         |  13 ++--
 revision.h                         |   6 +-
 10 files changed, 260 insertions(+), 43 deletions(-)
 create mode 100644 commit-queue.c
 create mode 100644 commit-queue.h

-- 
1.8.3-451-gb703ddf

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v2 1/4] toposort: rename "lifo" field
  2013-06-09 23:24                   ` [PATCH v2 0/4] log --author-date-order Junio C Hamano
@ 2013-06-09 23:24                     ` Junio C Hamano
  2013-06-10  2:12                       ` Eric Sunshine
  2013-06-10  5:05                       ` Jeff King
  2013-06-09 23:24                     ` [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits Junio C Hamano
                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-09 23:24 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

The primary invariant of sort_in_topological_order() is that a
parent commit is not emitted untile all children of it are.  When
traversing a forked history like this with "git log C E":

    A----B----C
     \
      D----E

we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.

In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.

Most of the time, we would want to see C and B emitted together, and
then E and D, and finally A (i.e. the --topo-order output).  The
"lifo" parameter of the sort_in_topological_order() function is used
to control this behaviour.  We start the traversal by knowing two
commits, C and E.  While keeping in mind that we also need to
inspect E later, we pick C first to inspect, and we notice and
record that B needs to be inspected.  By structuring the "work to be
done" set as a LIFO stack, we ensure that B is inspected next,
before other in-flight commits we had known that we will need to
inspect, e.g. E.

When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, possibly mixing commits from two parallel histories together.
When "lifo" parameter is set to false, the function keeps the "work
to be done" set sorted in the date order to realize this semantics.
After inspecting C, we add B to the "work to be done" set, but the
next commit we inspect from the set is E which is newer than B.

The name "lifo", however, is too strongly tied to the way how the
function implements its behaviour, and does not describe what the
behaviour _means_.

Replace this field with an enum rev_sort_order, with two possible
values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
update the existing code.  The mechanical replacement rule is:

  "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
  "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/log.c         |  2 +-
 builtin/show-branch.c | 14 ++++++++------
 commit.c              | 12 ++++++++----
 commit.h              | 14 +++++++++++---
 revision.c            | 10 +++++-----
 revision.h            |  6 +++++-
 6 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 8f0b2e8..8d26042 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -205,7 +205,7 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
 	int i = revs->early_output;
 	int show_header = 1;
 
-	sort_in_topological_order(&list, revs->lifo);
+	sort_in_topological_order(&list, revs->sort_order);
 	while (list && i) {
 		struct commit *commit = list->item;
 		switch (simplify_commit(revs, commit)) {
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index d208fd6..7c57985 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -631,7 +631,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	int num_rev, i, extra = 0;
 	int all_heads = 0, all_remotes = 0;
 	int all_mask, all_revs;
-	int lifo = 1;
+	enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
 	char head[128];
 	const char *head_p;
 	int head_len;
@@ -666,15 +666,17 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			    N_("show possible merge bases")),
 		OPT_BOOLEAN(0, "independent", &independent,
 			    N_("show refs unreachable from any other ref")),
-		OPT_BOOLEAN(0, "topo-order", &lifo,
-			    N_("show commits in topological order")),
+		OPT_SET_INT(0, "topo-order", &sort_order,
+			    N_("show commits in topological order"),
+			    REV_SORT_IN_GRAPH_ORDER),
 		OPT_BOOLEAN(0, "topics", &topics,
 			    N_("show only commits not on the first branch")),
 		OPT_SET_INT(0, "sparse", &dense,
 			    N_("show merges reachable from only one tip"), 0),
-		OPT_SET_INT(0, "date-order", &lifo,
+		OPT_SET_INT(0, "date-order", &sort_order,
 			    N_("show commits where no parent comes before its "
-			       "children"), 0),
+			       "children"),
+			    REV_SORT_BY_COMMIT_DATE),
 		{ OPTION_CALLBACK, 'g', "reflog", &reflog_base, N_("<n>[,<base>]"),
 			    N_("show <n> most recent ref-log entries starting at "
 			       "base"),
@@ -901,7 +903,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 		exit(0);
 
 	/* Sort topologically */
-	sort_in_topological_order(&seen, lifo);
+	sort_in_topological_order(&seen, sort_order);
 
 	/* Give names to commits */
 	if (!sha1_name && !no_name)
diff --git a/commit.c b/commit.c
index f97456d..11b9635 100644
--- a/commit.c
+++ b/commit.c
@@ -512,7 +512,7 @@ define_commit_slab(indegree_slab, int);
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo)
+void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order sort_order)
 {
 	struct commit_list *next, *orig = *list;
 	struct commit_list *work, **insert;
@@ -561,7 +561,7 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 	}
 
 	/* process the list in topological order */
-	if (!lifo)
+	if (sort_order != REV_SORT_IN_GRAPH_ORDER)
 		commit_list_sort_by_date(&work);
 
 	pptr = list;
@@ -588,10 +588,14 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 			 * guaranteeing topological order.
 			 */
 			if (--(*pi) == 1) {
-				if (!lifo)
+				switch (sort_order) {
+				case REV_SORT_BY_COMMIT_DATE:
 					commit_list_insert_by_date(parent, &work);
-				else
+					break;
+				default: /* REV_SORT_IN_GRAPH_ORDER */
 					commit_list_insert(parent, &work);
+					break;
+				}
 			}
 		}
 		/*
diff --git a/commit.h b/commit.h
index 70e749d..247e474 100644
--- a/commit.h
+++ b/commit.h
@@ -139,15 +139,23 @@ struct commit *pop_commit(struct commit_list **stack);
 void clear_commit_marks(struct commit *commit, unsigned int mark);
 void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark);
 
+
+enum rev_sort_order {
+	REV_SORT_IN_GRAPH_ORDER = 0,
+	REV_SORT_BY_COMMIT_DATE
+};
+
 /*
  * Performs an in-place topological sort of list supplied.
  *
  *   invariant of resulting list is:
  *      a reachable from b => ord(b) < ord(a)
- *   in addition, when lifo == 0, commits on parallel tracks are
- *   sorted in the dates order.
+ *   sort_order further specifies:
+ *   REV_SORT_IN_GRAPH_ORDER: try to show a commit on a single-parent
+ *                            chain together.
+ *   REV_SORT_BY_COMMIT_DATE: show eligible commits in committer-date order.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo);
+void sort_in_topological_order(struct commit_list **, enum rev_sort_order);
 
 struct commit_graft {
 	unsigned char sha1[20];
diff --git a/revision.c b/revision.c
index cf620c6..966ebbc 100644
--- a/revision.c
+++ b/revision.c
@@ -1038,7 +1038,7 @@ void init_revisions(struct rev_info *revs, const char *prefix)
 	DIFF_OPT_SET(&revs->pruning, QUICK);
 	revs->pruning.add_remove = file_add_remove;
 	revs->pruning.change = file_change;
-	revs->lifo = 1;
+	revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
@@ -1373,7 +1373,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--merge")) {
 		revs->show_merge = 1;
 	} else if (!strcmp(arg, "--topo-order")) {
-		revs->lifo = 1;
+		revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 		revs->topo_order = 1;
 	} else if (!strcmp(arg, "--simplify-merges")) {
 		revs->simplify_merges = 1;
@@ -1391,7 +1391,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->prune = 1;
 		load_ref_decorations(DECORATE_SHORT_REFS);
 	} else if (!strcmp(arg, "--date-order")) {
-		revs->lifo = 0;
+		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
 	} else if (!prefixcmp(arg, "--early-output")) {
 		int count = 100;
@@ -2165,7 +2165,7 @@ int prepare_revision_walk(struct rev_info *revs)
 		if (limit_list(revs) < 0)
 			return -1;
 	if (revs->topo_order)
-		sort_in_topological_order(&revs->commits, revs->lifo);
+		sort_in_topological_order(&revs->commits, revs->sort_order);
 	if (revs->simplify_merges)
 		simplify_merges(revs);
 	if (revs->children.name)
@@ -2480,7 +2480,7 @@ static void create_boundary_commit_list(struct rev_info *revs)
 	 * If revs->topo_order is set, sort the boundary commits
 	 * in topological order
 	 */
-	sort_in_topological_order(&revs->commits, revs->lifo);
+	sort_in_topological_order(&revs->commits, revs->sort_order);
 }
 
 static struct commit *get_revision_internal(struct rev_info *revs)
diff --git a/revision.h b/revision.h
index 5da09ee..2a5e325 100644
--- a/revision.h
+++ b/revision.h
@@ -4,6 +4,7 @@
 #include "parse-options.h"
 #include "grep.h"
 #include "notes.h"
+#include "commit.h"
 
 #define SEEN		(1u<<0)
 #define UNINTERESTING   (1u<<1)
@@ -60,6 +61,10 @@ struct rev_info {
 	const char *prefix;
 	const char *def;
 	struct pathspec prune_data;
+
+	/* topo-sort */
+	enum rev_sort_order sort_order;
+
 	unsigned int	early_output:1,
 			ignore_missing:1;
 
@@ -70,7 +75,6 @@ struct rev_info {
 			show_all:1,
 			remove_empty_trees:1,
 			simplify_history:1,
-			lifo:1,
 			topo_order:1,
 			simplify_merges:1,
 			simplify_by_decoration:1,
-- 
1.8.3-451-gb703ddf

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-09 23:24                   ` [PATCH v2 0/4] log --author-date-order Junio C Hamano
  2013-06-09 23:24                     ` [PATCH v2 1/4] toposort: rename "lifo" field Junio C Hamano
@ 2013-06-09 23:24                     ` Junio C Hamano
  2013-06-10  5:25                       ` Jeff King
  2013-06-09 23:24                     ` [PATCH v2 3/4] sort-in-topological-order: use commit-queue Junio C Hamano
  2013-06-09 23:24                     ` [PATCH v2 4/4] log: --author-date-order Junio C Hamano
  3 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-09 23:24 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Traditionally we used a singly linked list of commits to hold a set
of in-flight commits while traversing history.  The most typical use
of the list is to add commits that are newly discovered to it, keep
the list sorted by commit timestamp, pick up the newest one from the
list, and keep digging.  The cost of keeping the singly linked list
sorted is nontrivial, and this typical use pattern better matches a
priority queue.

Introduce a commit-queue structure, that can be used either as a
LIFO stack, or a priority queue.  This will be used in the next
patch to hold in-flight commits during sort-in-topological-order.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Makefile       |  2 ++
 commit-queue.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 commit-queue.h | 31 +++++++++++++++++++++++++
 3 files changed, 104 insertions(+)
 create mode 100644 commit-queue.c
 create mode 100644 commit-queue.h

diff --git a/Makefile b/Makefile
index 598d631..3cf55e9 100644
--- a/Makefile
+++ b/Makefile
@@ -634,6 +634,7 @@ LIB_H += cache.h
 LIB_H += color.h
 LIB_H += column.h
 LIB_H += commit.h
+LIB_H += commit-queue.h
 LIB_H += compat/bswap.h
 LIB_H += compat/cygwin.h
 LIB_H += compat/mingw.h
@@ -757,6 +758,7 @@ LIB_OBJS += color.o
 LIB_OBJS += column.o
 LIB_OBJS += combine-diff.o
 LIB_OBJS += commit.o
+LIB_OBJS += commit-queue.o
 LIB_OBJS += compat/obstack.o
 LIB_OBJS += compat/terminal.o
 LIB_OBJS += config.o
diff --git a/commit-queue.c b/commit-queue.c
new file mode 100644
index 0000000..77d4b02
--- /dev/null
+++ b/commit-queue.c
@@ -0,0 +1,71 @@
+#include "cache.h"
+#include "commit.h"
+#include "commit-queue.h"
+
+void clear_commit_queue(struct commit_queue *queue)
+{
+	free(queue->array);
+	queue->nr = 0;
+	queue->alloc = 0;
+	queue->array = NULL;
+}
+
+void commit_queue_put(struct commit_queue *queue, struct commit *commit)
+{
+	commit_compare_fn compare = queue->compare;
+	int ix, parent;
+
+	/* Append at the end */
+	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
+	queue->array[queue->nr++] = commit;
+	if (!compare)
+		return; /* LIFO */
+
+	/* Bubble up the new one */
+	for (ix = queue->nr - 1; ix; ix = parent) {
+		parent = (ix - 1) / 2;
+		if (compare(queue->array[parent], queue->array[ix],
+			    queue->cb_data) < 0)
+			break;
+
+		commit = queue->array[parent];
+		queue->array[parent] = queue->array[ix];
+		queue->array[ix] = commit;
+	}
+}
+
+struct commit *commit_queue_get(struct commit_queue *queue)
+{
+	struct commit *result, *swap;
+	int ix, child;
+	commit_compare_fn compare = queue->compare;
+
+	if (!queue->nr)
+		return NULL;
+	if (!compare)
+		return queue->array[--queue->nr]; /* LIFO */
+
+	result = queue->array[0];
+	if (!--queue->nr)
+		return result;
+
+	queue->array[0] = queue->array[queue->nr];
+
+	/* Push down the one at the root */
+	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
+		child = ix * 2 + 1; /* left */
+		if ((child + 1 < queue->nr) &&
+		    (compare(queue->array[child], queue->array[child + 1],
+			     queue->cb_data) >= 0))
+			child++; /* use right child */
+
+		if (compare(queue->array[ix], queue->array[child],
+			    queue->cb_data) < 0)
+			break;
+
+		swap = queue->array[child];
+		queue->array[child] = queue->array[ix];
+		queue->array[ix] = swap;
+	}
+	return result;
+}
diff --git a/commit-queue.h b/commit-queue.h
new file mode 100644
index 0000000..7c5dc4c
--- /dev/null
+++ b/commit-queue.h
@@ -0,0 +1,31 @@
+#ifndef COMMIT_QUEUE_H
+#define COMMIT_QUEUE_H
+
+/*
+ * Compare two commits; the third parameter is cb_data in the
+ * commit_queue structure.
+ */
+typedef int (*commit_compare_fn)(struct commit *, struct commit *, void *);
+
+struct commit_queue {
+	commit_compare_fn compare;
+	void *cb_data;
+	int alloc, nr;
+	struct commit **array;
+};
+
+/*
+ * Add the commit to the queue
+ */
+extern void commit_queue_put(struct commit_queue *, struct commit *);
+
+/*
+ * Extract the commit that compares the smallest out of the queue,
+ * or NULL.  If compare function is NULL, the queue acts as a LIFO
+ * stack.
+ */
+extern struct commit *commit_queue_get(struct commit_queue *);
+
+extern void clear_commit_queue(struct commit_queue *);
+
+#endif /* COMMIT_QUEUE_H */
-- 
1.8.3-451-gb703ddf

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 3/4] sort-in-topological-order: use commit-queue
  2013-06-09 23:24                   ` [PATCH v2 0/4] log --author-date-order Junio C Hamano
  2013-06-09 23:24                     ` [PATCH v2 1/4] toposort: rename "lifo" field Junio C Hamano
  2013-06-09 23:24                     ` [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits Junio C Hamano
@ 2013-06-09 23:24                     ` Junio C Hamano
  2013-06-09 23:37                       ` Junio C Hamano
  2013-06-09 23:24                     ` [PATCH v2 4/4] log: --author-date-order Junio C Hamano
  3 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-09 23:24 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Use the commit-queue data structure to implement a priority queue
of commits sorted by committer date, when handling --date-order.
The commit-queue structure can also be used as a simple LIFO stack,
which is a good match for --topo-order processing.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit-queue.c | 13 +++++++++++
 commit-queue.h |  3 +++
 commit.c       | 74 ++++++++++++++++++++++++++++++++++------------------------
 3 files changed, 59 insertions(+), 31 deletions(-)

diff --git a/commit-queue.c b/commit-queue.c
index 77d4b02..ffffc4e 100644
--- a/commit-queue.c
+++ b/commit-queue.c
@@ -2,6 +2,19 @@
 #include "commit.h"
 #include "commit-queue.h"
 
+void commit_queue_reverse(struct commit_queue *queue)
+{
+	int i, j;
+
+	if (queue->compare != NULL)
+		die("BUG: commit_queue_reverse() on non-LIFO queue");
+	for (i = 0; i <= (j = (queue->nr - 1) - i); i++) {
+		struct commit *swap = queue->array[i];
+		queue->array[i] = queue->array[j];
+		queue->array[j] = swap;
+	}
+}
+
 void clear_commit_queue(struct commit_queue *queue)
 {
 	free(queue->array);
diff --git a/commit-queue.h b/commit-queue.h
index 7c5dc4c..d3c92e5 100644
--- a/commit-queue.h
+++ b/commit-queue.h
@@ -28,4 +28,7 @@ extern struct commit *commit_queue_get(struct commit_queue *);
 
 extern void clear_commit_queue(struct commit_queue *);
 
+/* Reverse the LIFO elements */
+extern void commit_queue_reverse(struct commit_queue *);
+
 #endif /* COMMIT_QUEUE_H */
diff --git a/commit.c b/commit.c
index 11b9635..cc6d385 100644
--- a/commit.c
+++ b/commit.c
@@ -9,6 +9,7 @@
 #include "gpg-interface.h"
 #include "mergesort.h"
 #include "commit-slab.h"
+#include "commit-queue.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -509,21 +510,41 @@ struct commit *pop_commit(struct commit_list **stack)
 /* count number of children that have not been emitted */
 define_commit_slab(indegree_slab, int);
 
+static int compare_commits_by_commit_date(struct commit *a, struct commit *b, void *unused)
+{
+	/* newer commits with larger date first */
+	if (a->date < b->date)
+		return 1;
+	else if (a->date > b->date)
+		return -1;
+	return 0;
+}
+
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order sort_order)
+void sort_in_topological_order(struct commit_list **list, enum rev_sort_order sort_order)
 {
 	struct commit_list *next, *orig = *list;
-	struct commit_list *work, **insert;
 	struct commit_list **pptr;
 	struct indegree_slab indegree;
+	struct commit_queue queue;
+	struct commit *commit;
 
 	if (!orig)
 		return;
 	*list = NULL;
 
 	init_indegree_slab(&indegree);
+	memset(&queue, '\0', sizeof(queue));
+	switch (sort_order) {
+	default: /* REV_SORT_IN_GRAPH_ORDER */
+		queue.compare = NULL;
+		break;
+	case REV_SORT_BY_COMMIT_DATE:
+		queue.compare = compare_commits_by_commit_date;
+		break;
+	}
 
 	/* Mark them and clear the indegree */
 	for (next = orig; next; next = next->next) {
@@ -533,7 +554,7 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 
 	/* update the indegree */
 	for (next = orig; next; next = next->next) {
-		struct commit_list * parents = next->item->parents;
+		struct commit_list *parents = next->item->parents;
 		while (parents) {
 			struct commit *parent = parents->item;
 			int *pi = indegree_slab_at(&indegree, parent);
@@ -551,30 +572,28 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 	 *
 	 * the tips serve as a starting set for the work queue.
 	 */
-	work = NULL;
-	insert = &work;
 	for (next = orig; next; next = next->next) {
 		struct commit *commit = next->item;
 
 		if (*(indegree_slab_at(&indegree, commit)) == 1)
-			insert = &commit_list_insert(commit, insert)->next;
+			commit_queue_put(&queue, commit);
 	}
 
-	/* process the list in topological order */
-	if (sort_order != REV_SORT_IN_GRAPH_ORDER)
-		commit_list_sort_by_date(&work);
+	/*
+	 * This is unfortunate; the initial tips need to be shown
+	 * in the order given from the revision traversal machinery.
+	 */
+	if (sort_order == REV_SORT_IN_GRAPH_ORDER)
+		commit_queue_reverse(&queue);
+
+	/* We no longer need the commit list */
+	free_commit_list(orig);
 
 	pptr = list;
 	*list = NULL;
-	while (work) {
-		struct commit *commit;
-		struct commit_list *parents, *work_item;
-
-		work_item = work;
-		work = work_item->next;
-		work_item->next = NULL;
+	while ((commit = commit_queue_get(&queue)) != NULL) {
+		struct commit_list *parents;
 
-		commit = work_item->item;
 		for (parents = commit->parents; parents ; parents = parents->next) {
 			struct commit *parent = parents->item;
 			int *pi = indegree_slab_at(&indegree, parent);
@@ -587,27 +606,20 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 			 * when all their children have been emitted thereby
 			 * guaranteeing topological order.
 			 */
-			if (--(*pi) == 1) {
-				switch (sort_order) {
-				case REV_SORT_BY_COMMIT_DATE:
-					commit_list_insert_by_date(parent, &work);
-					break;
-				default: /* REV_SORT_IN_GRAPH_ORDER */
-					commit_list_insert(parent, &work);
-					break;
-				}
-			}
+			if (--(*pi) == 1)
+				commit_queue_put(&queue, parent);
 		}
 		/*
-		 * work_item is a commit all of whose children
-		 * have already been emitted. we can emit it now.
+		 * all children of commit have already been
+		 * emitted. we can emit it now.
 		 */
 		*(indegree_slab_at(&indegree, commit)) = 0;
-		*pptr = work_item;
-		pptr = &work_item->next;
+
+		pptr = &commit_list_insert(commit, pptr)->next;
 	}
 
 	clear_indegree_slab(&indegree);
+	clear_commit_queue(&queue);
 }
 
 /* merge-base stuff */
-- 
1.8.3-451-gb703ddf

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v2 4/4] log: --author-date-order
  2013-06-09 23:24                   ` [PATCH v2 0/4] log --author-date-order Junio C Hamano
                                       ` (2 preceding siblings ...)
  2013-06-09 23:24                     ` [PATCH v2 3/4] sort-in-topological-order: use commit-queue Junio C Hamano
@ 2013-06-09 23:24                     ` Junio C Hamano
  2013-06-10  5:50                       ` Jeff King
  3 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-09 23:24 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Sometimes people would want to view the commits in parallel
histories in the order of author dates, not committer dates.

Teach "topo-order" sort machinery to do so, using a commit-info slab
to record the author dates of each commit, and commit-queue to sort
them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/rev-list-options.txt |  4 +++
 commit.c                           | 59 ++++++++++++++++++++++++++++++++++++++
 commit.h                           |  3 +-
 revision.c                         |  3 ++
 4 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 3bdbf5e..8302402 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -617,6 +617,10 @@ By default, the commits are shown in reverse chronological order.
 	Show no parents before all of its children are shown, but
 	otherwise show commits in the commit timestamp order.
 
+--author-date-order::
+	Show no parents before all of its children are shown, but
+	otherwise show commits in the author timestamp order.
+
 --topo-order::
 	Show no parents before all of its children are shown, and
 	avoid showing commits on multiple lines of history
diff --git a/commit.c b/commit.c
index cc6d385..f3a2f09 100644
--- a/commit.c
+++ b/commit.c
@@ -510,6 +510,53 @@ struct commit *pop_commit(struct commit_list **stack)
 /* count number of children that have not been emitted */
 define_commit_slab(indegree_slab, int);
 
+/* record author-date for each commit object */
+define_commit_slab(author_date_slab, unsigned long);
+
+static void record_author_date(struct author_date_slab *author_date,
+			       struct commit *commit)
+{
+	const char *buf, *line_end;
+	struct ident_split ident;
+	char *date_end;
+	unsigned long date;
+
+	for (buf = commit->buffer; buf; buf = line_end + 1) {
+		line_end = strchrnul(buf, '\n');
+		if (prefixcmp(buf, "author ")) {
+			if (!line_end[0] || line_end[1] == '\n')
+				return; /* end of header */
+			continue;
+		}
+		if (split_ident_line(&ident,
+				     buf + strlen("author "),
+				     line_end - (buf + strlen("author "))) ||
+		    !ident.date_begin || !ident.date_end)
+			return; /* malformed "author" line */
+		break;
+	}
+
+	date = strtoul(ident.date_begin, &date_end, 10);
+	if (date_end != ident.date_end)
+		return; /* malformed date */
+	*(author_date_slab_at(author_date, commit)) = date;
+}
+
+static int compare_commits_by_author_date(struct commit *a, struct commit *b,
+					  void *cb_data)
+{
+	struct author_date_slab *author_date = cb_data;
+	unsigned long a_date = *(author_date_slab_at(author_date, a));
+	unsigned long b_date = *(author_date_slab_at(author_date, b));
+
+	/* newer commits with larger date first */
+	if (a_date < b_date)
+		return 1;
+	else if (a_date > b_date)
+		return -1;
+	return 0;
+}
+
 static int compare_commits_by_commit_date(struct commit *a, struct commit *b, void *unused)
 {
 	/* newer commits with larger date first */
@@ -530,6 +577,7 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 	struct indegree_slab indegree;
 	struct commit_queue queue;
 	struct commit *commit;
+	struct author_date_slab author_date;
 
 	if (!orig)
 		return;
@@ -537,6 +585,7 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 
 	init_indegree_slab(&indegree);
 	memset(&queue, '\0', sizeof(queue));
+
 	switch (sort_order) {
 	default: /* REV_SORT_IN_GRAPH_ORDER */
 		queue.compare = NULL;
@@ -544,12 +593,20 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 	case REV_SORT_BY_COMMIT_DATE:
 		queue.compare = compare_commits_by_commit_date;
 		break;
+	case REV_SORT_BY_AUTHOR_DATE:
+		init_author_date_slab(&author_date);
+		queue.compare = compare_commits_by_author_date;
+		queue.cb_data = &author_date;
+		break;
 	}
 
 	/* Mark them and clear the indegree */
 	for (next = orig; next; next = next->next) {
 		struct commit *commit = next->item;
 		*(indegree_slab_at(&indegree, commit)) = 1;
+		/* also record the author dates, if needed */
+		if (sort_order == REV_SORT_BY_AUTHOR_DATE)
+			record_author_date(&author_date, commit);
 	}
 
 	/* update the indegree */
@@ -620,6 +677,8 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 
 	clear_indegree_slab(&indegree);
 	clear_commit_queue(&queue);
+	if (sort_order == REV_SORT_BY_AUTHOR_DATE)
+		clear_author_date_slab(&author_date);
 }
 
 /* merge-base stuff */
diff --git a/commit.h b/commit.h
index 247e474..e43dfd0 100644
--- a/commit.h
+++ b/commit.h
@@ -142,7 +142,8 @@ void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark);
 
 enum rev_sort_order {
 	REV_SORT_IN_GRAPH_ORDER = 0,
-	REV_SORT_BY_COMMIT_DATE
+	REV_SORT_BY_COMMIT_DATE,
+	REV_SORT_BY_AUTHOR_DATE
 };
 
 /*
diff --git a/revision.c b/revision.c
index 966ebbc..12d9b64 100644
--- a/revision.c
+++ b/revision.c
@@ -1393,6 +1393,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--date-order")) {
 		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
+	} else if (!strcmp(arg, "--author-date-order")) {
+		revs->sort_order = REV_SORT_BY_AUTHOR_DATE;
+		revs->topo_order = 1;
 	} else if (!prefixcmp(arg, "--early-output")) {
 		int count = 100;
 		switch (arg[14]) {
-- 
1.8.3-451-gb703ddf

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 3/4] sort-in-topological-order: use commit-queue
  2013-06-09 23:24                     ` [PATCH v2 3/4] sort-in-topological-order: use commit-queue Junio C Hamano
@ 2013-06-09 23:37                       ` Junio C Hamano
  2013-06-10  5:31                         ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-09 23:37 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> Use the commit-queue data structure to implement a priority queue
> of commits sorted by committer date, when handling --date-order.
> The commit-queue structure can also be used as a simple LIFO stack,
> which is a good match for --topo-order processing.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  commit-queue.c | 13 +++++++++++
>  commit-queue.h |  3 +++
>  commit.c       | 74 ++++++++++++++++++++++++++++++++++------------------------
>  3 files changed, 59 insertions(+), 31 deletions(-)

Peff, I think you were the one who did a priority queue previously,
primarily for performance.  The primary reason for this round was so
that I didn't have to touch the revision.c and struct commit in
order to sort by keys in commit-info-slabs and I was not aiming for
performance but a quick and rough benchmarking seems to indicate
that

 - for a small repository like git.git, there is not much difference
   in runtime;

 - but it does seem to cut down the memory pressure (less minor
   faults).

Representative runs of "rev-list --date-order v0.99..v1.8.3" on my
box with 'master' and with these patches spend 0.47user/0.04system
with 0.50elapsed (no time change), with 13450 vs 13108 minor faults
(smaller memory use).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 1/4] toposort: rename "lifo" field
  2013-06-09 23:24                     ` [PATCH v2 1/4] toposort: rename "lifo" field Junio C Hamano
@ 2013-06-10  2:12                       ` Eric Sunshine
  2013-06-10  5:05                       ` Jeff King
  1 sibling, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2013-06-10  2:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Elliott Cable, Jeff King

On Sun, Jun 9, 2013 at 7:24 PM, Junio C Hamano <gitster@pobox.com> wrote:
> The primary invariant of sort_in_topological_order() is that a
> parent commit is not emitted untile all children of it are.  When

s/untile/until/

> traversing a forked history like this with "git log C E":
>
>     A----B----C
>      \
>       D----E
>
> we ensure that A is emitted after all of B, C, D, and E are done, B
> has to wait until C is done, and D has to wait until E is done.
>
> In some applications, however, we would further want to control how
> these child commits B, C, D and E on two parallel ancestry chains
> are shown.
>
> Most of the time, we would want to see C and B emitted together, and
> then E and D, and finally A (i.e. the --topo-order output).  The
> "lifo" parameter of the sort_in_topological_order() function is used
> to control this behaviour.  We start the traversal by knowing two
> commits, C and E.  While keeping in mind that we also need to
> inspect E later, we pick C first to inspect, and we notice and
> record that B needs to be inspected.  By structuring the "work to be
> done" set as a LIFO stack, we ensure that B is inspected next,
> before other in-flight commits we had known that we will need to
> inspect, e.g. E.
>
> When showing in --date-order, we would want to see commits ordered
> by timestamps, i.e. show C, E, B and D in this order before showing
> A, possibly mixing commits from two parallel histories together.
> When "lifo" parameter is set to false, the function keeps the "work
> to be done" set sorted in the date order to realize this semantics.
> After inspecting C, we add B to the "work to be done" set, but the
> next commit we inspect from the set is E which is newer than B.
>
> The name "lifo", however, is too strongly tied to the way how the

s/the way//

> function implements its behaviour, and does not describe what the
> behaviour _means_.
>
> Replace this field with an enum rev_sort_order, with two possible
> values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
> update the existing code.  The mechanical replacement rule is:
>
>   "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
>   "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 1/4] toposort: rename "lifo" field
  2013-06-09 23:24                     ` [PATCH v2 1/4] toposort: rename "lifo" field Junio C Hamano
  2013-06-10  2:12                       ` Eric Sunshine
@ 2013-06-10  5:05                       ` Jeff King
  1 sibling, 0 replies; 51+ messages in thread
From: Jeff King @ 2013-06-10  5:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Sun, Jun 09, 2013 at 04:24:34PM -0700, Junio C Hamano wrote:

> The name "lifo", however, is too strongly tied to the way how the
> function implements its behaviour, and does not describe what the
> behaviour _means_.
> 
> Replace this field with an enum rev_sort_order, with two possible
> values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
> update the existing code.  The mechanical replacement rule is:
> 
>   "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
>   "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"

Thanks. Having looked at this code for the first time in a long time
recently, I was very confused by the purpose of the "lifo" flag; this
patch would have made it much clearer.

Patch itself looks fine to me.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-09 23:24                     ` [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits Junio C Hamano
@ 2013-06-10  5:25                       ` Jeff King
  2013-06-10  7:21                         ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2013-06-10  5:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Sun, Jun 09, 2013 at 04:24:35PM -0700, Junio C Hamano wrote:

> Traditionally we used a singly linked list of commits to hold a set
> of in-flight commits while traversing history.  The most typical use
> of the list is to add commits that are newly discovered to it, keep
> the list sorted by commit timestamp, pick up the newest one from the
> list, and keep digging.  The cost of keeping the singly linked list
> sorted is nontrivial, and this typical use pattern better matches a
> priority queue.
> 
> Introduce a commit-queue structure, that can be used either as a
> LIFO stack, or a priority queue.  This will be used in the next
> patch to hold in-flight commits during sort-in-topological-order.

Great. You may recall I had a similar patch or year or two back, in an
attempt to fix some of the O(n^2) places (e.g., in fetch-pack's
mark_complete). We ended up dropping it because duplicate removal kept
"n" small enough for common cases, and most of the commit_list users
depend on doing cheap splicing and other linked-list operations.

It may be worth looking again for other places to use this over
commit_list, but even the caller you are introducing here justifies its
presence.

Also, I wrote some basic tests to cover the priority queue as a unit. I
can rebase them on your commit if you are interested.

A few comments on the code itself:

> +void commit_queue_put(struct commit_queue *queue, struct commit *commit)

Is it worth making this "struct commit *" a void pointer, and handling
arbitrary items in our priority queue? The compare function should be
the only thing that dereferences them.

I do not have any non-commit priority queue use in mind, but I do not
think it adds any complexity in this case.

> +	/* Bubble up the new one */
> +	for (ix = queue->nr - 1; ix; ix = parent) {
> +		parent = (ix - 1) / 2;
> +		if (compare(queue->array[parent], queue->array[ix],
> +			    queue->cb_data) < 0)
> +			break;

In my implementation, I stopped on "compare() <= 0". It is late and my
mind is fuzzy, but I recall that heaps are never stable with respect to
insertion order, so I don't think it would matter.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 3/4] sort-in-topological-order: use commit-queue
  2013-06-09 23:37                       ` Junio C Hamano
@ 2013-06-10  5:31                         ` Jeff King
  2013-06-10  7:27                           ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2013-06-10  5:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Sun, Jun 09, 2013 at 04:37:27PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Use the commit-queue data structure to implement a priority queue
> > of commits sorted by committer date, when handling --date-order.
> > The commit-queue structure can also be used as a simple LIFO stack,
> > which is a good match for --topo-order processing.
> >
> > Signed-off-by: Junio C Hamano <gitster@pobox.com>
> > ---
> >  commit-queue.c | 13 +++++++++++
> >  commit-queue.h |  3 +++
> >  commit.c       | 74 ++++++++++++++++++++++++++++++++++------------------------
> >  3 files changed, 59 insertions(+), 31 deletions(-)
> 
> Peff, I think you were the one who did a priority queue previously,
> primarily for performance.  The primary reason for this round was so
> that I didn't have to touch the revision.c and struct commit in
> order to sort by keys in commit-info-slabs and I was not aiming for
> performance but a quick and rough benchmarking seems to indicate
> that
> 
>  - for a small repository like git.git, there is not much difference
>    in runtime;
> 
>  - but it does seem to cut down the memory pressure (less minor
>    faults).
> 
> Representative runs of "rev-list --date-order v0.99..v1.8.3" on my
> box with 'master' and with these patches spend 0.47user/0.04system
> with 0.50elapsed (no time change), with 13450 vs 13108 minor faults
> (smaller memory use).

The performance enhancement of the priority queue came from replacing
"commit_list_insert_by_date" calls with insertion into a queue. That
drops O(n^2) behavior on the linked-list down to O(n log n), as we have
"n" insertions, each causing an O(log n) heapify operation.

Around the same time, though, René wrote the linked-list merge sort that
powers commit_list_sort_by_date. And topo-sort learned to do O(1)
insertions into the unsorted list, and then one O(n log n) sort.

So your results are exactly what I would expect: the time should be
about the same (due to the same complexity), but the memory is used more
compactly (array of pointers instead of linked list of pointers).

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 4/4] log: --author-date-order
  2013-06-09 23:24                     ` [PATCH v2 4/4] log: --author-date-order Junio C Hamano
@ 2013-06-10  5:50                       ` Jeff King
  2013-06-10  7:39                         ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2013-06-10  5:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Sun, Jun 09, 2013 at 04:24:37PM -0700, Junio C Hamano wrote:

> Sometimes people would want to view the commits in parallel
> histories in the order of author dates, not committer dates.
> 
> Teach "topo-order" sort machinery to do so, using a commit-info slab
> to record the author dates of each commit, and commit-queue to sort
> them.

Nice, this is basically what I was envisioning when I mentioned the
slabs. However, I don't think the code works. :(

> +static void record_author_date(struct author_date_slab *author_date,
> +			       struct commit *commit)
> +{
> +	const char *buf, *line_end;
> +	struct ident_split ident;
> +	char *date_end;
> +	unsigned long date;
> +
> +	for (buf = commit->buffer; buf; buf = line_end + 1) {
> +		line_end = strchrnul(buf, '\n');
> +		if (prefixcmp(buf, "author ")) {
> +			if (!line_end[0] || line_end[1] == '\n')
> +				return; /* end of header */
> +			continue;
> +		}
> +		if (split_ident_line(&ident,
> +				     buf + strlen("author "),
> +				     line_end - (buf + strlen("author "))) ||
> +		    !ident.date_begin || !ident.date_end)
> +			return; /* malformed "author" line */
> +		break;
> +	}
> +
> +	date = strtoul(ident.date_begin, &date_end, 10);
> +	if (date_end != ident.date_end)
> +		return; /* malformed date */
> +	*(author_date_slab_at(author_date, commit)) = date;
> +}

I'm not excited about introducing yet another place that parses commit
objects (mostly not for correctness, but because we have had
inconsistency in how malformed objects are treated). It is at least
using split_ident_line which covers the hard bits. I wonder how much
slower it would be to simply call format_commit_message to do the
parsing.

>  	/* Mark them and clear the indegree */
>  	for (next = orig; next; next = next->next) {
>  		struct commit *commit = next->item;
>  		*(indegree_slab_at(&indegree, commit)) = 1;
> +		/* also record the author dates, if needed */
> +		if (sort_order == REV_SORT_BY_AUTHOR_DATE)
> +			record_author_date(&author_date, commit);

The record_author_date function assumes that commit->buffer is valid
(i.e., not NULL).  We seem to assume that the commits are parsed already
(for looking at parents, and at the committer date).  But if
"save_commit_buffer" is set to 0 (as it is for rev-list), we would not
have a buffer at all.

It's hard to notice the problem because a NULL buffer will cause
record_author_date to simply leave the slab entry at 0. That would give
the same output as regular "--topo-order" (because everybody has the
same timestamp), except that the priority queue heap is not stable.
With this patch:

diff --git a/commit.c b/commit.c
index f3a2f09..5e62ae8 100644
--- a/commit.c
+++ b/commit.c
@@ -521,6 +521,9 @@ static void record_author_date(struct author_date_slab *author_date,
 	char *date_end;
 	unsigned long date;
 
+	if (!commit->buffer)
+		die("whooops!");
+
 	for (buf = commit->buffer; buf; buf = line_end + 1) {
 		line_end = strchrnul(buf, '\n');
 		if (prefixcmp(buf, "author ")) {

you can see the problem more clearly with "git rev-list
--author-date-order HEAD".

-Peff

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-10  5:25                       ` Jeff King
@ 2013-06-10  7:21                         ` Junio C Hamano
  2013-06-10 18:15                           ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-10  7:21 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Elliott Cable

Jeff King <peff@peff.net> writes:

> It may be worth looking again for other places to use this over
> commit_list, but even the caller you are introducing here justifies its
> presence.

The next candidate is paint-down-to-common, probably.

> Also, I wrote some basic tests to cover the priority queue as a unit. I
> can rebase them on your commit if you are interested.

It would be great.

> A few comments on the code itself:
>
>> +void commit_queue_put(struct commit_queue *queue, struct commit *commit)
>
> Is it worth making this "struct commit *" a void pointer, and handling
> arbitrary items in our priority queue? The compare function should be
> the only thing that dereferences them.
>  
> I do not have any non-commit priority queue use in mind, but I do not
> think it adds any complexity in this case.

I didn't either (and still I don't think of one), but I agree that
the implementation can be reused for pq of any type, as long as it
is a pointer to struct.

>> +	/* Bubble up the new one */
>> +	for (ix = queue->nr - 1; ix; ix = parent) {
>> +		parent = (ix - 1) / 2;
>> +		if (compare(queue->array[parent], queue->array[ix],
>> +			    queue->cb_data) < 0)
>> +			break;
>
> In my implementation, I stopped on "compare() <= 0". It is late and my
> mind is fuzzy, but I recall that heaps are never stable with respect to
> insertion order, so I don't think it would matter.

It would matter in the sense that we cannot replace linked-list, if
the caller wants stability.  It is more like "we cannot do anything
about it" than "it would not matter".

We can make each queue element a pair of <pointer to payload,
insertion counter>, and tiebreak using the insertion order, if the
callers want the same stability as linked-list implementation, but
I tend to think it really matters.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 3/4] sort-in-topological-order: use commit-queue
  2013-06-10  5:31                         ` Jeff King
@ 2013-06-10  7:27                           ` Junio C Hamano
  2013-06-10 18:24                             ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-10  7:27 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Elliott Cable

Jeff King <peff@peff.net> writes:

> The performance enhancement of the priority queue came from replacing
> "commit_list_insert_by_date" calls with insertion into a queue. That
> drops O(n^2) behavior on the linked-list down to O(n log n), as we have
> "n" insertions, each causing an O(log n) heapify operation.

Yes.

> Around the same time, though, René wrote the linked-list merge sort that
> powers commit_list_sort_by_date. And topo-sort learned to do O(1)
> insertions into the unsorted list, and then one O(n log n) sort.

Yes, but that only affects the "sort the work queue in date order"
before entering the main loop, and maintenance of work queue as we
dig along still is "find the place to put this in the date-order
sorted linked list", no?

> So your results are exactly what I would expect: the time should be
> about the same (due to the same complexity), but the memory is used more
> compactly (array of pointers instead of linked list of pointers).

I've been disturbed every time I saw the commit_list insertion
function that does a small allocation which will be freed fairly
often and have been wondering if we can rewrite it with custom slab
allocator, but not using linked list where we do not have to feels
like a better solution to that issue, and use of pqueue may be a
right direction to go in.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 4/4] log: --author-date-order
  2013-06-10  5:50                       ` Jeff King
@ 2013-06-10  7:39                         ` Junio C Hamano
  2013-06-10 18:49                           ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-10  7:39 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Elliott Cable

Jeff King <peff@peff.net> writes:

> I'm not excited about introducing yet another place that parses commit
> objects (mostly not for correctness, but because we have had
> inconsistency in how malformed objects are treated). It is at least
> using split_ident_line which covers the hard bits. I wonder how much
> slower it would be to simply call format_commit_message to do the
> parsing.

The thought certainly crossed my mind, not exactly in that form but
more about splitting the machinery used in pretty.c into a more
reusable form.

The result of my attempt however did not become all that reusable
(admittedly I didn't spend too much brain cycles on it), so I punted
;-).

> The record_author_date function assumes that commit->buffer is valid
> (i.e., not NULL).  We seem to assume that the commits are parsed already
> (for looking at parents, and at the committer date).  

I thought that the latter is warranted, as the function worked on
the output of limit_list(), and by the time limit_list() finishes,
everything relevant must have been parsed already.

But you are right.  The commit->buffer may no longer be there, and
the --author-date-order option needs to read the object again
in this codepath.  That would be in line with what --pretty/format
would do, I guess.

Or we could extend parse_commit() API to take an optional commit
info slab to store not just author date but other non-essential
stuff like people's names, and we arrange that extended API to be
triggered when we know --author-date-order is in effect?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-10  7:21                         ` Junio C Hamano
@ 2013-06-10 18:15                           ` Jeff King
  2013-06-10 18:56                             ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2013-06-10 18:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Mon, Jun 10, 2013 at 12:21:00AM -0700, Junio C Hamano wrote:

> > It may be worth looking again for other places to use this over
> > commit_list, but even the caller you are introducing here justifies its
> > presence.
> 
> The next candidate is paint-down-to-common, probably.

Yeah, I don't think I looked at that at all last time (mostly because it
only large as the graph gets wide, which is typically acceptable for
us). But it should be easy to do.

> > Also, I wrote some basic tests to cover the priority queue as a unit. I
> > can rebase them on your commit if you are interested.
> 
> It would be great.

Squashable patch is below.

> > Is it worth making this "struct commit *" a void pointer, and handling
> > arbitrary items in our priority queue? The compare function should be
> > the only thing that dereferences them.
> >  
> > I do not have any non-commit priority queue use in mind, but I do not
> > think it adds any complexity in this case.
> 
> I didn't either (and still I don't think of one), but I agree that
> the implementation can be reused for pq of any type, as long as it
> is a pointer to struct.

I converted this to a void pointer in my patch below, simply because it
makes it easier to write a test-queue that operates on ints. Due to
implicit casting, it should work for the most part without changing the
calling code unless you have a caller that does something like:

  commit_queue_get(&q)->date

or similar. I didn't change the name, either. It may be silly to call it
"commit_queue" still since it is now more general. I simply called mine
"queue" (I wanted "pqueue", but that conflicted with globals defined by
OpenSSL; yours is a more general queue anyway, so maybe that is a good
name).

> >> +	/* Bubble up the new one */
> >> +	for (ix = queue->nr - 1; ix; ix = parent) {
> >> +		parent = (ix - 1) / 2;
> >> +		if (compare(queue->array[parent], queue->array[ix],
> >> +			    queue->cb_data) < 0)
> >> +			break;
> >
> > In my implementation, I stopped on "compare() <= 0". It is late and my
> > mind is fuzzy, but I recall that heaps are never stable with respect to
> > insertion order, so I don't think it would matter.
> 
> It would matter in the sense that we cannot replace linked-list, if
> the caller wants stability.  It is more like "we cannot do anything
> about it" than "it would not matter".

Right. I meant "I do not think it matters if you do <= or < here, as we
are not stable anyway". Doing "<= 0" stops the heapify operation sooner,
though I doubt it matters in practice (it is not an algorithmic
complexity change, but just that you can sometimes quit early).

I think it is the same situation in your "push down", too, where you can
quit when the parent is equal to the largest child.

> We can make each queue element a pair of <pointer to payload,
> insertion counter>, and tiebreak using the insertion order, if the
> callers want the same stability as linked-list implementation, but
> I tend to think it really matters.

Yes, I think that is the usual solution.

Here's the patch with the tests, meant to be squashed into your 2/4. As
I mentioned above, you may want to further tweak the name, which would
require fixing up the rebase patches on top.

If you don't want to do the "s/struct commit/void/" change now, we can
probably just have test-queue stuff the ints into commit pointers.

The tests themselves are not extremely extensive, but at least let you
check that you implemented the heap correctly. :)

---
 .gitignore       |  1 +
 Makefile         |  1 +
 commit-queue.c   |  6 ++---
 commit-queue.h   |  8 +++---
 t/t0009-queue.sh | 50 +++++++++++++++++++++++++++++++++++
 test-queue.c     | 39 +++++++++++++++++++++++++++
 6 files changed, 98 insertions(+), 7 deletions(-)

diff --git a/.gitignore b/.gitignore
index 6669bf0..8670e6d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -193,6 +193,7 @@
 /test-regex
 /test-revision-walking
 /test-run-command
+/test-queue
 /test-sha1
 /test-sigchain
 /test-string-list
diff --git a/Makefile b/Makefile
index 3cf55e9..c957637 100644
--- a/Makefile
+++ b/Makefile
@@ -552,6 +552,7 @@ TEST_PROGRAMS_NEED_X += test-path-utils
 TEST_PROGRAMS_NEED_X += test-mktemp
 TEST_PROGRAMS_NEED_X += test-parse-options
 TEST_PROGRAMS_NEED_X += test-path-utils
+TEST_PROGRAMS_NEED_X += test-queue
 TEST_PROGRAMS_NEED_X += test-regex
 TEST_PROGRAMS_NEED_X += test-revision-walking
 TEST_PROGRAMS_NEED_X += test-run-command
diff --git a/commit-queue.c b/commit-queue.c
index 77d4b02..04acf23 100644
--- a/commit-queue.c
+++ b/commit-queue.c
@@ -10,7 +10,7 @@ void clear_commit_queue(struct commit_queue *queue)
 	queue->array = NULL;
 }
 
-void commit_queue_put(struct commit_queue *queue, struct commit *commit)
+void commit_queue_put(struct commit_queue *queue, void *commit)
 {
 	commit_compare_fn compare = queue->compare;
 	int ix, parent;
@@ -34,9 +34,9 @@ struct commit *commit_queue_get(struct commit_queue *queue)
 	}
 }
 
-struct commit *commit_queue_get(struct commit_queue *queue)
+void *commit_queue_get(struct commit_queue *queue)
 {
-	struct commit *result, *swap;
+	void *result, *swap;
 	int ix, child;
 	commit_compare_fn compare = queue->compare;
 
diff --git a/commit-queue.h b/commit-queue.h
index 7c5dc4c..ef8fb87 100644
--- a/commit-queue.h
+++ b/commit-queue.h
@@ -5,26 +5,26 @@ extern void commit_queue_put(struct commit_queue *, struct commit *);
  * Compare two commits; the third parameter is cb_data in the
  * commit_queue structure.
  */
-typedef int (*commit_compare_fn)(struct commit *, struct commit *, void *);
+typedef int (*commit_compare_fn)(void *, void *, void *);
 
 struct commit_queue {
 	commit_compare_fn compare;
 	void *cb_data;
 	int alloc, nr;
-	struct commit **array;
+	void **array;
 };
 
 /*
  * Add the commit to the queue
  */
-extern void commit_queue_put(struct commit_queue *, struct commit *);
+extern void commit_queue_put(struct commit_queue *, void *);
 
 /*
  * Extract the commit that compares the smallest out of the queue,
  * or NULL.  If compare function is NULL, the queue acts as a LIFO
  * stack.
  */
-extern struct commit *commit_queue_get(struct commit_queue *);
+extern void *commit_queue_get(struct commit_queue *);
 
 extern void clear_commit_queue(struct commit_queue *);
 
diff --git a/t/t0009-queue.sh b/t/t0009-queue.sh
new file mode 100755
index 0000000..186df01
--- /dev/null
+++ b/t/t0009-queue.sh
@@ -0,0 +1,50 @@
+#!/bin/sh
+
+test_description='basic tests for priority queue implementation'
+. ./test-lib.sh
+
+cat >expect <<'EOF'
+1
+2
+3
+4
+5
+5
+6
+7
+8
+9
+10
+EOF
+test_expect_success 'basic ordering' '
+	test-queue 2 6 3 10 9 5 7 4 5 8 1 dump >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<'EOF'
+2
+3
+4
+1
+5
+6
+EOF
+test_expect_success 'mixed put and get' '
+	test-queue 6 2 4 get 5 3 get get 1 dump >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<'EOF'
+1
+2
+NULL
+1
+2
+NULL
+EOF
+test_expect_success 'notice empty queue' '
+	test-queue 1 2 get get get 1 2 get get get >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/test-queue.c b/test-queue.c
new file mode 100644
index 0000000..7743775
--- /dev/null
+++ b/test-queue.c
@@ -0,0 +1,39 @@
+#include "cache.h"
+#include "commit-queue.h"
+
+static int intcmp(void *va, void *vb, void *data)
+{
+	const int *a = va, *b = vb;
+	return *a - *b;
+}
+
+static void show(int *v)
+{
+	if (!v)
+		printf("NULL\n");
+	else
+		printf("%d\n", *v);
+	free(v);
+}
+
+int main(int argc, char **argv)
+{
+	struct commit_queue pq = { intcmp };
+
+	while (*++argv) {
+		if (!strcmp(*argv, "get"))
+			show(commit_queue_get(&pq));
+		else if (!strcmp(*argv, "dump")) {
+			int *v;
+			while ((v = commit_queue_get(&pq)))
+			       show(v);
+		}
+		else {
+			int *v = malloc(sizeof(*v));
+			*v = atoi(*argv);
+			commit_queue_put(&pq, v);
+		}
+	}
+
+	return 0;
+}

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 3/4] sort-in-topological-order: use commit-queue
  2013-06-10  7:27                           ` Junio C Hamano
@ 2013-06-10 18:24                             ` Jeff King
  0 siblings, 0 replies; 51+ messages in thread
From: Jeff King @ 2013-06-10 18:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Mon, Jun 10, 2013 at 12:27:31AM -0700, Junio C Hamano wrote:

> > Around the same time, though, René wrote the linked-list merge sort that
> > powers commit_list_sort_by_date. And topo-sort learned to do O(1)
> > insertions into the unsorted list, and then one O(n log n) sort.
> 
> Yes, but that only affects the "sort the work queue in date order"
> before entering the main loop, and maintenance of work queue as we
> dig along still is "find the place to put this in the date-order
> sorted linked list", no?

Ah, you're right. I was thinking that we saw all of the commits up
front and then sorted. And we do, but we still keep a separate list in
the work queue.

So I think it may just be the case that "N" does not get very big here
(the width of the graph), so log(N) versus (N) does not make a big
difference.

> I've been disturbed every time I saw the commit_list insertion
> function that does a small allocation which will be freed fairly
> often and have been wondering if we can rewrite it with custom slab
> allocator, but not using linked list where we do not have to feels
> like a better solution to that issue, and use of pqueue may be a
> right direction to go in.

Agreed. The only thing I'd worry about is that somebody cares about the
order stability of same-time commits in the output. But I cannot think
of a case where it is important (especially because the timestamps are
subject to minor skew anyway, so it is not like you could even count on
particular commits having equivalent timestamps).

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 4/4] log: --author-date-order
  2013-06-10  7:39                         ` Junio C Hamano
@ 2013-06-10 18:49                           ` Jeff King
  2013-06-20 19:36                             ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2013-06-10 18:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Mon, Jun 10, 2013 at 12:39:24AM -0700, Junio C Hamano wrote:

> > I'm not excited about introducing yet another place that parses commit
> > objects (mostly not for correctness, but because we have had
> > inconsistency in how malformed objects are treated). It is at least
> > using split_ident_line which covers the hard bits. I wonder how much
> > slower it would be to simply call format_commit_message to do the
> > parsing.
> 
> The thought certainly crossed my mind, not exactly in that form but
> more about splitting the machinery used in pretty.c into a more
> reusable form.
> 
> The result of my attempt however did not become all that reusable
> (admittedly I didn't spend too much brain cycles on it), so I punted
> ;-).

Yes, I feel like it has been tried before. The problem is that a clean
interface would let you get individual pieces of information with a
single call. But an efficient interface will utilize the same parsing
pass to get multiple items out, and stop parsing when we have gotten all
required items (but leave the parser in a consistent state so that we
can pick it up later).

The format_commit_one parser does that, but the "format_commit_context"
it holds is a bit bulky. I think it might be possible to pull out the
parsing bits into a separate struct, and you could call it something
like:

  struct commit_parser parser;
  unsigned long authordate;
  const char *authorname;
  int authorlen;

  commit_parser_init(&parser, commit);
  authordate = commit_parse_authordate(&parser);
  authorname = commit_parse_authorname(&parser, &authorlen);

where the second parse call is basically "free", because we've already
done (and cached) the hard work in the first call.

So they might look like:

  static void parse_author_ident(struct commit_parser *parser)
  {
          if (!parser->author.name_begin) {
                  if (!parser->authorline.start)
                          parse_commit_header(parser);
                  split_ident_line(&parser->author,
                                   parser->authorline.start,
                                   parser->authorline.len);
          }
  }

  unsigned long commit_parse_authordate(struct commit_parser *parser)
  {
          parse_author_ident(parser);
          /* XXX should check for malformedness here */
          return strtoul(ident.date_begin, NULL, 10);
  }

  const char *commit_parse_authorname(struct commit_parser *parser,
                                      unsigned long *len)
  {
          parse_author_ident(parser);
          *len = parser.author.name_end - parser.author.name_begin;
          return parser.author.name_begin;
  }

and so forth. It would be easy (and have the same efficiency) for
format_commit_message to build on that, and it calling it from regular
code is not too bad.

> But you are right.  The commit->buffer may no longer be there, and
> the --author-date-order option needs to read the object again
> in this codepath.  That would be in line with what --pretty/format
> would do, I guess.
> 
> Or we could extend parse_commit() API to take an optional commit
> info slab to store not just author date but other non-essential
> stuff like people's names, and we arrange that extended API to be
> triggered when we know --author-date-order is in effect?

I like the latter option. It takes a non-trivial amount of time to load
the commits from disk, and now we are potentially doing it 2 or 3 times
for a run (once to parse, once to get the author info for topo-sort, and
possibly later to show it if --pretty is given; though I did not check
and maybe we turn off save_commit_buffer with --pretty). It would be
nice to have an extended parse_object that handled that. I'm not sure of
the interface. Maybe variadic with pairs of type/slab, like:

  parse_commit_extended(commit,
                        PARSE_COMMIT_AUTHORDATE, &authordate_slab,
                        PARSE_COMMIT_DONE);

?

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-10 18:15                           ` Jeff King
@ 2013-06-10 18:56                             ` Junio C Hamano
  2013-06-10 18:59                               ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-10 18:56 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Elliott Cable

Jeff King <peff@peff.net> writes:

> On Mon, Jun 10, 2013 at 12:21:00AM -0700, Junio C Hamano wrote:
>
>> > It may be worth looking again for other places to use this over
>> > commit_list, but even the caller you are introducing here justifies its
>> > presence.
>> 
>> The next candidate is paint-down-to-common, probably.
>
> Yeah, I don't think I looked at that at all last time (mostly because it
> only large as the graph gets wide, which is typically acceptable for
> us). But it should be easy to do.
>
>> > Also, I wrote some basic tests to cover the priority queue as a unit. I
>> > can rebase them on your commit if you are interested.
>> 
>> It would be great.
>
> Squashable patch is below.
>
>> > Is it worth making this "struct commit *" a void pointer, and handling
>> > arbitrary items in our priority queue? The compare function should be
>> > the only thing that dereferences them.
>> >  
>> > I do not have any non-commit priority queue use in mind, but I do not
>> > think it adds any complexity in this case.
>> 
>> I didn't either (and still I don't think of one), but I agree that
>> the implementation can be reused for pq of any type, as long as it
>> is a pointer to struct.
>
> I converted this to a void pointer in my patch below, simply because it
> makes it easier to write a test-queue that operates on ints. Due to
> implicit casting, it should work for the most part without changing the
> calling code unless you have a caller that does something like:
>
>   commit_queue_get(&q)->date
>
> or similar. I didn't change the name, either. It may be silly to call it
> "commit_queue" still since it is now more general. I simply called mine
> "queue" (I wanted "pqueue", but that conflicted with globals defined by
> OpenSSL; yours is a more general queue anyway, so maybe that is a good
> name).

I agree that it makes sense not to call it either commit-queue or
pqueue.  While at it, the filenames should probably be moved as
well, no?

> Here's the patch with the tests, meant to be squashed into your 2/4. As
> I mentioned above, you may want to further tweak the name, which would
> require fixing up the rebase patches on top.
>
> If you don't want to do the "s/struct commit/void/" change now, we can
> probably just have test-queue stuff the ints into commit pointers.
>
> The tests themselves are not extremely extensive, but at least let you
> check that you implemented the heap correctly. :)

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-10 18:56                             ` Junio C Hamano
@ 2013-06-10 18:59                               ` Jeff King
  2013-06-10 23:23                                 ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2013-06-10 18:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Mon, Jun 10, 2013 at 11:56:33AM -0700, Junio C Hamano wrote:

> > or similar. I didn't change the name, either. It may be silly to call it
> > "commit_queue" still since it is now more general. I simply called mine
> > "queue" (I wanted "pqueue", but that conflicted with globals defined by
> > OpenSSL; yours is a more general queue anyway, so maybe that is a good
> > name).
> 
> I agree that it makes sense not to call it either commit-queue or
> pqueue.  While at it, the filenames should probably be moved as
> well, no?

Yeah, definitely. I left all of that as an exercise for you, since the
name change will involve a lot of fallout in the other patches.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-10 18:59                               ` Jeff King
@ 2013-06-10 23:23                                 ` Junio C Hamano
  2013-06-11  6:36                                   ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-10 23:23 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Elliott Cable

Jeff King <peff@peff.net> writes:

> On Mon, Jun 10, 2013 at 11:56:33AM -0700, Junio C Hamano wrote:
>
>> > or similar. I didn't change the name, either. It may be silly to call it
>> > "commit_queue" still since it is now more general. I simply called mine
>> > "queue" (I wanted "pqueue", but that conflicted with globals defined by
>> > OpenSSL; yours is a more general queue anyway, so maybe that is a good
>> > name).
>> 
>> I agree that it makes sense not to call it either commit-queue or
>> pqueue.  While at it, the filenames should probably be moved as
>> well, no?
>
> Yeah, definitely. I left all of that as an exercise for you, since the
> name change will involve a lot of fallout in the other patches.

OK, I pushed out a result of some renaming and rebasing.  Notable
changes are:

 - The data and API is called prio-queue and they live in prio-queue.[ch];

 - The test script is also named test-prio-queue.c, to leave the
   door open for other kinds of queue;

 - For now, record_author_date() does the obvious read-sha1-file and
   free; and

 - The comparison callback's function signature had three "void *",
   so they are named in the header file now.  Also two "thing"
   pointers are marked as "const void *".

I may have flipped the comparison < vs <= as well.

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-10 23:23                                 ` Junio C Hamano
@ 2013-06-11  6:36                                   ` Jeff King
  2013-06-11 17:02                                     ` Junio C Hamano
  2013-06-11 22:19                                     ` [PATCH v3 0/4] log --author-date-order Junio C Hamano
  0 siblings, 2 replies; 51+ messages in thread
From: Jeff King @ 2013-06-11  6:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Mon, Jun 10, 2013 at 04:23:31PM -0700, Junio C Hamano wrote:

> OK, I pushed out a result of some renaming and rebasing.  Notable
> changes are:
> 
>  - The data and API is called prio-queue and they live in prio-queue.[ch];
> 
>  - The test script is also named test-prio-queue.c, to leave the
>    door open for other kinds of queue;

Sounds reasonable, though you may want to update the commit message of
jc/topo-author-date-sort~2.

>  - For now, record_author_date() does the obvious read-sha1-file and
>    free; and

I think that is a good place to leave it in this series. It does not
hurt performance in any existing cases, and any parsing refactoring can
come later if somebody wants to work on it.

>  - The comparison callback's function signature had three "void *",
>    so they are named in the header file now.  Also two "thing"
>    pointers are marked as "const void *".

Yeah, I noticed both when porting my tests, but didn't want to add too
many distracting details. Thanks for fixing.

Overall, it looks good for me except for the commit message tweaks I
mentioned above.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits
  2013-06-11  6:36                                   ` Jeff King
@ 2013-06-11 17:02                                     ` Junio C Hamano
  2013-06-11 22:19                                     ` [PATCH v3 0/4] log --author-date-order Junio C Hamano
  1 sibling, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-11 17:02 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Elliott Cable

Jeff King <peff@peff.net> writes:

> Overall, it looks good for me except for the commit message tweaks I
> mentioned above.

Thanks.  Rerolled; will resend when I have time (and if I do not
forget).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 0/4] log --author-date-order
  2013-06-11  6:36                                   ` Jeff King
  2013-06-11 17:02                                     ` Junio C Hamano
@ 2013-06-11 22:19                                     ` Junio C Hamano
  2013-06-11 22:19                                       ` [PATCH v3 1/4] toposort: rename "lifo" field Junio C Hamano
                                                         ` (3 more replies)
  1 sibling, 4 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-11 22:19 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

The first one is unchanged.  The second one was redone with Peff's
help, and the other two patches have been adjusted for it.

Adding tests to t4202 and/or t6012 is left as an exercise to readers.

Junio C Hamano (4):
  toposort: rename "lifo" field
  prio-queue: priority queue of pointers to structs
  sort-in-topological-order: use prio-queue
  log: --author-date-order

 .gitignore                         |   1 +
 Documentation/rev-list-options.txt |   4 +
 Makefile                           |   3 +
 builtin/log.c                      |   2 +-
 builtin/show-branch.c              |  14 ++--
 commit.c                           | 145 ++++++++++++++++++++++++++++++-------
 commit.h                           |  15 +++-
 prio-queue.c                       |  84 +++++++++++++++++++++
 prio-queue.h                       |  48 ++++++++++++
 revision.c                         |  13 ++--
 revision.h                         |   6 +-
 t/t0009-prio-queue.sh              |  50 +++++++++++++
 test-prio-queue.c                  |  39 ++++++++++
 13 files changed, 381 insertions(+), 43 deletions(-)
 create mode 100644 prio-queue.c
 create mode 100644 prio-queue.h
 create mode 100755 t/t0009-prio-queue.sh
 create mode 100644 test-prio-queue.c

-- 
1.8.3.1-494-g51b8af5

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v3 1/4] toposort: rename "lifo" field
  2013-06-11 22:19                                     ` [PATCH v3 0/4] log --author-date-order Junio C Hamano
@ 2013-06-11 22:19                                       ` Junio C Hamano
  2013-06-11 22:19                                       ` [PATCH v3 2/4] prio-queue: priority queue of pointers to structs Junio C Hamano
                                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-11 22:19 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

The primary invariant of sort_in_topological_order() is that a
parent commit is not emitted until all children of it are.  When
traversing a forked history like this with "git log C E":

    A----B----C
     \
      D----E

we ensure that A is emitted after all of B, C, D, and E are done, B
has to wait until C is done, and D has to wait until E is done.

In some applications, however, we would further want to control how
these child commits B, C, D and E on two parallel ancestry chains
are shown.

Most of the time, we would want to see C and B emitted together, and
then E and D, and finally A (i.e. the --topo-order output).  The
"lifo" parameter of the sort_in_topological_order() function is used
to control this behaviour.  We start the traversal by knowing two
commits, C and E.  While keeping in mind that we also need to
inspect E later, we pick C first to inspect, and we notice and
record that B needs to be inspected.  By structuring the "work to be
done" set as a LIFO stack, we ensure that B is inspected next,
before other in-flight commits we had known that we will need to
inspect, e.g. E.

When showing in --date-order, we would want to see commits ordered
by timestamps, i.e. show C, E, B and D in this order before showing
A, possibly mixing commits from two parallel histories together.
When "lifo" parameter is set to false, the function keeps the "work
to be done" set sorted in the date order to realize this semantics.
After inspecting C, we add B to the "work to be done" set, but the
next commit we inspect from the set is E which is newer than B.

The name "lifo", however, is too strongly tied to the way how the
function implements its behaviour, and does not describe what the
behaviour _means_.

Replace this field with an enum rev_sort_order, with two possible
values: REV_SORT_IN_GRAPH_ORDER and REV_SORT_BY_COMMIT_DATE, and
update the existing code.  The mechanical replacement rule is:

  "lifo == 0" is equivalent to "sort_order == REV_SORT_BY_COMMIT_DATE"
  "lifo == 1" is equivalent to "sort_order == REV_SORT_IN_GRAPH_ORDER"

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/log.c         |  2 +-
 builtin/show-branch.c | 14 ++++++++------
 commit.c              | 12 ++++++++----
 commit.h              | 14 +++++++++++---
 revision.c            | 10 +++++-----
 revision.h            |  6 +++++-
 6 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 8f0b2e8..8d26042 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -205,7 +205,7 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
 	int i = revs->early_output;
 	int show_header = 1;
 
-	sort_in_topological_order(&list, revs->lifo);
+	sort_in_topological_order(&list, revs->sort_order);
 	while (list && i) {
 		struct commit *commit = list->item;
 		switch (simplify_commit(revs, commit)) {
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index d208fd6..7c57985 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -631,7 +631,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 	int num_rev, i, extra = 0;
 	int all_heads = 0, all_remotes = 0;
 	int all_mask, all_revs;
-	int lifo = 1;
+	enum rev_sort_order sort_order = REV_SORT_IN_GRAPH_ORDER;
 	char head[128];
 	const char *head_p;
 	int head_len;
@@ -666,15 +666,17 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			    N_("show possible merge bases")),
 		OPT_BOOLEAN(0, "independent", &independent,
 			    N_("show refs unreachable from any other ref")),
-		OPT_BOOLEAN(0, "topo-order", &lifo,
-			    N_("show commits in topological order")),
+		OPT_SET_INT(0, "topo-order", &sort_order,
+			    N_("show commits in topological order"),
+			    REV_SORT_IN_GRAPH_ORDER),
 		OPT_BOOLEAN(0, "topics", &topics,
 			    N_("show only commits not on the first branch")),
 		OPT_SET_INT(0, "sparse", &dense,
 			    N_("show merges reachable from only one tip"), 0),
-		OPT_SET_INT(0, "date-order", &lifo,
+		OPT_SET_INT(0, "date-order", &sort_order,
 			    N_("show commits where no parent comes before its "
-			       "children"), 0),
+			       "children"),
+			    REV_SORT_BY_COMMIT_DATE),
 		{ OPTION_CALLBACK, 'g', "reflog", &reflog_base, N_("<n>[,<base>]"),
 			    N_("show <n> most recent ref-log entries starting at "
 			       "base"),
@@ -901,7 +903,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 		exit(0);
 
 	/* Sort topologically */
-	sort_in_topological_order(&seen, lifo);
+	sort_in_topological_order(&seen, sort_order);
 
 	/* Give names to commits */
 	if (!sha1_name && !no_name)
diff --git a/commit.c b/commit.c
index f97456d..11b9635 100644
--- a/commit.c
+++ b/commit.c
@@ -512,7 +512,7 @@ define_commit_slab(indegree_slab, int);
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo)
+void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order sort_order)
 {
 	struct commit_list *next, *orig = *list;
 	struct commit_list *work, **insert;
@@ -561,7 +561,7 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 	}
 
 	/* process the list in topological order */
-	if (!lifo)
+	if (sort_order != REV_SORT_IN_GRAPH_ORDER)
 		commit_list_sort_by_date(&work);
 
 	pptr = list;
@@ -588,10 +588,14 @@ void sort_in_topological_order(struct commit_list ** list, int lifo)
 			 * guaranteeing topological order.
 			 */
 			if (--(*pi) == 1) {
-				if (!lifo)
+				switch (sort_order) {
+				case REV_SORT_BY_COMMIT_DATE:
 					commit_list_insert_by_date(parent, &work);
-				else
+					break;
+				default: /* REV_SORT_IN_GRAPH_ORDER */
 					commit_list_insert(parent, &work);
+					break;
+				}
 			}
 		}
 		/*
diff --git a/commit.h b/commit.h
index 70e749d..247e474 100644
--- a/commit.h
+++ b/commit.h
@@ -139,15 +139,23 @@ struct commit *pop_commit(struct commit_list **stack);
 void clear_commit_marks(struct commit *commit, unsigned int mark);
 void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark);
 
+
+enum rev_sort_order {
+	REV_SORT_IN_GRAPH_ORDER = 0,
+	REV_SORT_BY_COMMIT_DATE
+};
+
 /*
  * Performs an in-place topological sort of list supplied.
  *
  *   invariant of resulting list is:
  *      a reachable from b => ord(b) < ord(a)
- *   in addition, when lifo == 0, commits on parallel tracks are
- *   sorted in the dates order.
+ *   sort_order further specifies:
+ *   REV_SORT_IN_GRAPH_ORDER: try to show a commit on a single-parent
+ *                            chain together.
+ *   REV_SORT_BY_COMMIT_DATE: show eligible commits in committer-date order.
  */
-void sort_in_topological_order(struct commit_list ** list, int lifo);
+void sort_in_topological_order(struct commit_list **, enum rev_sort_order);
 
 struct commit_graft {
 	unsigned char sha1[20];
diff --git a/revision.c b/revision.c
index cf620c6..966ebbc 100644
--- a/revision.c
+++ b/revision.c
@@ -1038,7 +1038,7 @@ void init_revisions(struct rev_info *revs, const char *prefix)
 	DIFF_OPT_SET(&revs->pruning, QUICK);
 	revs->pruning.add_remove = file_add_remove;
 	revs->pruning.change = file_change;
-	revs->lifo = 1;
+	revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 	revs->dense = 1;
 	revs->prefix = prefix;
 	revs->max_age = -1;
@@ -1373,7 +1373,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--merge")) {
 		revs->show_merge = 1;
 	} else if (!strcmp(arg, "--topo-order")) {
-		revs->lifo = 1;
+		revs->sort_order = REV_SORT_IN_GRAPH_ORDER;
 		revs->topo_order = 1;
 	} else if (!strcmp(arg, "--simplify-merges")) {
 		revs->simplify_merges = 1;
@@ -1391,7 +1391,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->prune = 1;
 		load_ref_decorations(DECORATE_SHORT_REFS);
 	} else if (!strcmp(arg, "--date-order")) {
-		revs->lifo = 0;
+		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
 	} else if (!prefixcmp(arg, "--early-output")) {
 		int count = 100;
@@ -2165,7 +2165,7 @@ int prepare_revision_walk(struct rev_info *revs)
 		if (limit_list(revs) < 0)
 			return -1;
 	if (revs->topo_order)
-		sort_in_topological_order(&revs->commits, revs->lifo);
+		sort_in_topological_order(&revs->commits, revs->sort_order);
 	if (revs->simplify_merges)
 		simplify_merges(revs);
 	if (revs->children.name)
@@ -2480,7 +2480,7 @@ static void create_boundary_commit_list(struct rev_info *revs)
 	 * If revs->topo_order is set, sort the boundary commits
 	 * in topological order
 	 */
-	sort_in_topological_order(&revs->commits, revs->lifo);
+	sort_in_topological_order(&revs->commits, revs->sort_order);
 }
 
 static struct commit *get_revision_internal(struct rev_info *revs)
diff --git a/revision.h b/revision.h
index 5da09ee..2a5e325 100644
--- a/revision.h
+++ b/revision.h
@@ -4,6 +4,7 @@
 #include "parse-options.h"
 #include "grep.h"
 #include "notes.h"
+#include "commit.h"
 
 #define SEEN		(1u<<0)
 #define UNINTERESTING   (1u<<1)
@@ -60,6 +61,10 @@ struct rev_info {
 	const char *prefix;
 	const char *def;
 	struct pathspec prune_data;
+
+	/* topo-sort */
+	enum rev_sort_order sort_order;
+
 	unsigned int	early_output:1,
 			ignore_missing:1;
 
@@ -70,7 +75,6 @@ struct rev_info {
 			show_all:1,
 			remove_empty_trees:1,
 			simplify_history:1,
-			lifo:1,
 			topo_order:1,
 			simplify_merges:1,
 			simplify_by_decoration:1,
-- 
1.8.3.1-494-g51b8af5

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v3 2/4] prio-queue: priority queue of pointers to structs
  2013-06-11 22:19                                     ` [PATCH v3 0/4] log --author-date-order Junio C Hamano
  2013-06-11 22:19                                       ` [PATCH v3 1/4] toposort: rename "lifo" field Junio C Hamano
@ 2013-06-11 22:19                                       ` Junio C Hamano
  2013-06-11 22:19                                       ` [PATCH v3 3/4] sort-in-topological-order: use prio-queue Junio C Hamano
  2013-06-11 22:19                                       ` [PATCH v3 4/4] log: --author-date-order Junio C Hamano
  3 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-11 22:19 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Traditionally we used a singly linked list of commits to hold a set
of in-flight commits while traversing history.  The most typical use
of the list is to add commits that are newly discovered to it, keep
the list sorted by commit timestamp, pick up the newest one from the
list, and keep digging.  The cost of keeping the singly linked list
sorted is nontrivial, and this typical use pattern better matches a
priority queue.

Introduce a prio-queue structure, that can be used either as a LIFO
stack, or a priority queue.  This will be used in the next patch to
hold in-flight commits during sort-in-topological-order.

Tests and the idea to make it usable for any "void *" pointers to
"things" are by Jeff King.  Bugs are mine.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .gitignore            |  1 +
 Makefile              |  3 +++
 prio-queue.c          | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++
 prio-queue.h          | 45 ++++++++++++++++++++++++++++++++
 t/t0009-prio-queue.sh | 50 ++++++++++++++++++++++++++++++++++++
 test-prio-queue.c     | 39 ++++++++++++++++++++++++++++
 6 files changed, 209 insertions(+)
 create mode 100644 prio-queue.c
 create mode 100644 prio-queue.h
 create mode 100755 t/t0009-prio-queue.sh
 create mode 100644 test-prio-queue.c

diff --git a/.gitignore b/.gitignore
index 6669bf0..b753817 100644
--- a/.gitignore
+++ b/.gitignore
@@ -190,6 +190,7 @@
 /test-mktemp
 /test-parse-options
 /test-path-utils
+/test-prio-queue
 /test-regex
 /test-revision-walking
 /test-run-command
diff --git a/Makefile b/Makefile
index 598d631..0246194 100644
--- a/Makefile
+++ b/Makefile
@@ -552,6 +552,7 @@ TEST_PROGRAMS_NEED_X += test-mergesort
 TEST_PROGRAMS_NEED_X += test-mktemp
 TEST_PROGRAMS_NEED_X += test-parse-options
 TEST_PROGRAMS_NEED_X += test-path-utils
+TEST_PROGRAMS_NEED_X += test-prio-queue
 TEST_PROGRAMS_NEED_X += test-regex
 TEST_PROGRAMS_NEED_X += test-revision-walking
 TEST_PROGRAMS_NEED_X += test-run-command
@@ -685,6 +686,7 @@ LIB_H += parse-options.h
 LIB_H += patch-ids.h
 LIB_H += pathspec.h
 LIB_H += pkt-line.h
+LIB_H += prio-queue.h
 LIB_H += progress.h
 LIB_H += prompt.h
 LIB_H += quote.h
@@ -824,6 +826,7 @@ LIB_OBJS += pathspec.o
 LIB_OBJS += pkt-line.o
 LIB_OBJS += preload-index.o
 LIB_OBJS += pretty.o
+LIB_OBJS += prio-queue.o
 LIB_OBJS += progress.o
 LIB_OBJS += prompt.o
 LIB_OBJS += quote.o
diff --git a/prio-queue.c b/prio-queue.c
new file mode 100644
index 0000000..f2a4973
--- /dev/null
+++ b/prio-queue.c
@@ -0,0 +1,71 @@
+#include "cache.h"
+#include "commit.h"
+#include "prio-queue.h"
+
+void clear_prio_queue(struct prio_queue *queue)
+{
+	free(queue->array);
+	queue->nr = 0;
+	queue->alloc = 0;
+	queue->array = NULL;
+}
+
+void prio_queue_put(struct prio_queue *queue, void *thing)
+{
+	prio_queue_compare_fn compare = queue->compare;
+	int ix, parent;
+
+	/* Append at the end */
+	ALLOC_GROW(queue->array, queue->nr + 1, queue->alloc);
+	queue->array[queue->nr++] = thing;
+	if (!compare)
+		return; /* LIFO */
+
+	/* Bubble up the new one */
+	for (ix = queue->nr - 1; ix; ix = parent) {
+		parent = (ix - 1) / 2;
+		if (compare(queue->array[parent], queue->array[ix],
+			    queue->cb_data) <= 0)
+			break;
+
+		thing = queue->array[parent];
+		queue->array[parent] = queue->array[ix];
+		queue->array[ix] = thing;
+	}
+}
+
+void *prio_queue_get(struct prio_queue *queue)
+{
+	void *result, *swap;
+	int ix, child;
+	prio_queue_compare_fn compare = queue->compare;
+
+	if (!queue->nr)
+		return NULL;
+	if (!compare)
+		return queue->array[--queue->nr]; /* LIFO */
+
+	result = queue->array[0];
+	if (!--queue->nr)
+		return result;
+
+	queue->array[0] = queue->array[queue->nr];
+
+	/* Push down the one at the root */
+	for (ix = 0; ix * 2 + 1 < queue->nr; ix = child) {
+		child = ix * 2 + 1; /* left */
+		if ((child + 1 < queue->nr) &&
+		    (compare(queue->array[child], queue->array[child + 1],
+			     queue->cb_data) >= 0))
+			child++; /* use right child */
+
+		if (compare(queue->array[ix], queue->array[child],
+			    queue->cb_data) <= 0)
+			break;
+
+		swap = queue->array[child];
+		queue->array[child] = queue->array[ix];
+		queue->array[ix] = swap;
+	}
+	return result;
+}
diff --git a/prio-queue.h b/prio-queue.h
new file mode 100644
index 0000000..ed354a5
--- /dev/null
+++ b/prio-queue.h
@@ -0,0 +1,45 @@
+#ifndef PRIO_QUEUE_H
+#define PRIO_QUEUE_H
+
+/*
+ * A priority queue implementation, primarily for keeping track of
+ * commits in the 'date-order' so that we process them from new to old
+ * as they are discovered, but can be used to hold any pointer to
+ * struct.  The caller is responsible for supplying a function to
+ * compare two "things".
+ *
+ * Alternatively, this data structure can also be used as a LIFO stack
+ * by specifying NULL as the comparison function.
+ */
+
+/*
+ * Compare two "things", one and two; the third parameter is cb_data
+ * in the prio_queue structure.  The result is returned as a sign of
+ * the return value, being the same as the sign of the result of
+ * subtracting "two" from "one" (i.e. negative if "one" sorts earlier
+ * than "two").
+ */
+typedef int (*prio_queue_compare_fn)(const void *one, const void *two, void *cb_data);
+
+struct prio_queue {
+	prio_queue_compare_fn compare;
+	void *cb_data;
+	int alloc, nr;
+	void **array;
+};
+
+/*
+ * Add the "thing" to the queue.
+ */
+extern void prio_queue_put(struct prio_queue *, void *thing);
+
+/*
+ * Extract the "thing" that compares the smallest out of the queue,
+ * or NULL.  If compare function is NULL, the queue acts as a LIFO
+ * stack.
+ */
+extern void *prio_queue_get(struct prio_queue *);
+
+extern void clear_prio_queue(struct prio_queue *);
+
+#endif /* PRIO_QUEUE_H */
diff --git a/t/t0009-prio-queue.sh b/t/t0009-prio-queue.sh
new file mode 100755
index 0000000..94045c3
--- /dev/null
+++ b/t/t0009-prio-queue.sh
@@ -0,0 +1,50 @@
+#!/bin/sh
+
+test_description='basic tests for priority queue implementation'
+. ./test-lib.sh
+
+cat >expect <<'EOF'
+1
+2
+3
+4
+5
+5
+6
+7
+8
+9
+10
+EOF
+test_expect_success 'basic ordering' '
+	test-prio-queue 2 6 3 10 9 5 7 4 5 8 1 dump >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<'EOF'
+2
+3
+4
+1
+5
+6
+EOF
+test_expect_success 'mixed put and get' '
+	test-prio-queue 6 2 4 get 5 3 get get 1 dump >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<'EOF'
+1
+2
+NULL
+1
+2
+NULL
+EOF
+test_expect_success 'notice empty queue' '
+	test-prio-queue 1 2 get get get 1 2 get get get >actual &&
+	test_cmp expect actual
+'
+
+test_done
diff --git a/test-prio-queue.c b/test-prio-queue.c
new file mode 100644
index 0000000..7be72f0
--- /dev/null
+++ b/test-prio-queue.c
@@ -0,0 +1,39 @@
+#include "cache.h"
+#include "prio-queue.h"
+
+static int intcmp(const void *va, const void *vb, void *data)
+{
+	const int *a = va, *b = vb;
+	return *a - *b;
+}
+
+static void show(int *v)
+{
+	if (!v)
+		printf("NULL\n");
+	else
+		printf("%d\n", *v);
+	free(v);
+}
+
+int main(int argc, char **argv)
+{
+	struct prio_queue pq = { intcmp };
+
+	while (*++argv) {
+		if (!strcmp(*argv, "get"))
+			show(prio_queue_get(&pq));
+		else if (!strcmp(*argv, "dump")) {
+			int *v;
+			while ((v = prio_queue_get(&pq)))
+			       show(v);
+		}
+		else {
+			int *v = malloc(sizeof(*v));
+			*v = atoi(*argv);
+			prio_queue_put(&pq, v);
+		}
+	}
+
+	return 0;
+}
-- 
1.8.3.1-494-g51b8af5

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v3 3/4] sort-in-topological-order: use prio-queue
  2013-06-11 22:19                                     ` [PATCH v3 0/4] log --author-date-order Junio C Hamano
  2013-06-11 22:19                                       ` [PATCH v3 1/4] toposort: rename "lifo" field Junio C Hamano
  2013-06-11 22:19                                       ` [PATCH v3 2/4] prio-queue: priority queue of pointers to structs Junio C Hamano
@ 2013-06-11 22:19                                       ` Junio C Hamano
  2013-06-11 22:19                                       ` [PATCH v3 4/4] log: --author-date-order Junio C Hamano
  3 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-11 22:19 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Use the prio-queue data structure to implement a priority queue of
commits sorted by committer date, when handling --date-order.  The
structure can also be used as a simple LIFO stack, which is a good
match for --topo-order processing.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.c     | 75 +++++++++++++++++++++++++++++++++++-------------------------
 prio-queue.c | 13 +++++++++++
 prio-queue.h |  3 +++
 3 files changed, 60 insertions(+), 31 deletions(-)

diff --git a/commit.c b/commit.c
index 11b9635..8b84ebf 100644
--- a/commit.c
+++ b/commit.c
@@ -9,6 +9,7 @@
 #include "gpg-interface.h"
 #include "mergesort.h"
 #include "commit-slab.h"
+#include "prio-queue.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -509,21 +510,42 @@ struct commit *pop_commit(struct commit_list **stack)
 /* count number of children that have not been emitted */
 define_commit_slab(indegree_slab, int);
 
+static int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused)
+{
+	const struct commit *a = a_, *b = b_;
+	/* newer commits with larger date first */
+	if (a->date < b->date)
+		return 1;
+	else if (a->date > b->date)
+		return -1;
+	return 0;
+}
+
 /*
  * Performs an in-place topological sort on the list supplied.
  */
-void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order sort_order)
+void sort_in_topological_order(struct commit_list **list, enum rev_sort_order sort_order)
 {
 	struct commit_list *next, *orig = *list;
-	struct commit_list *work, **insert;
 	struct commit_list **pptr;
 	struct indegree_slab indegree;
+	struct prio_queue queue;
+	struct commit *commit;
 
 	if (!orig)
 		return;
 	*list = NULL;
 
 	init_indegree_slab(&indegree);
+	memset(&queue, '\0', sizeof(queue));
+	switch (sort_order) {
+	default: /* REV_SORT_IN_GRAPH_ORDER */
+		queue.compare = NULL;
+		break;
+	case REV_SORT_BY_COMMIT_DATE:
+		queue.compare = compare_commits_by_commit_date;
+		break;
+	}
 
 	/* Mark them and clear the indegree */
 	for (next = orig; next; next = next->next) {
@@ -533,7 +555,7 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 
 	/* update the indegree */
 	for (next = orig; next; next = next->next) {
-		struct commit_list * parents = next->item->parents;
+		struct commit_list *parents = next->item->parents;
 		while (parents) {
 			struct commit *parent = parents->item;
 			int *pi = indegree_slab_at(&indegree, parent);
@@ -551,30 +573,28 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 	 *
 	 * the tips serve as a starting set for the work queue.
 	 */
-	work = NULL;
-	insert = &work;
 	for (next = orig; next; next = next->next) {
 		struct commit *commit = next->item;
 
 		if (*(indegree_slab_at(&indegree, commit)) == 1)
-			insert = &commit_list_insert(commit, insert)->next;
+			prio_queue_put(&queue, commit);
 	}
 
-	/* process the list in topological order */
-	if (sort_order != REV_SORT_IN_GRAPH_ORDER)
-		commit_list_sort_by_date(&work);
+	/*
+	 * This is unfortunate; the initial tips need to be shown
+	 * in the order given from the revision traversal machinery.
+	 */
+	if (sort_order == REV_SORT_IN_GRAPH_ORDER)
+		prio_queue_reverse(&queue);
+
+	/* We no longer need the commit list */
+	free_commit_list(orig);
 
 	pptr = list;
 	*list = NULL;
-	while (work) {
-		struct commit *commit;
-		struct commit_list *parents, *work_item;
-
-		work_item = work;
-		work = work_item->next;
-		work_item->next = NULL;
+	while ((commit = prio_queue_get(&queue)) != NULL) {
+		struct commit_list *parents;
 
-		commit = work_item->item;
 		for (parents = commit->parents; parents ; parents = parents->next) {
 			struct commit *parent = parents->item;
 			int *pi = indegree_slab_at(&indegree, parent);
@@ -587,27 +607,20 @@ void sort_in_topological_order(struct commit_list ** list, enum rev_sort_order s
 			 * when all their children have been emitted thereby
 			 * guaranteeing topological order.
 			 */
-			if (--(*pi) == 1) {
-				switch (sort_order) {
-				case REV_SORT_BY_COMMIT_DATE:
-					commit_list_insert_by_date(parent, &work);
-					break;
-				default: /* REV_SORT_IN_GRAPH_ORDER */
-					commit_list_insert(parent, &work);
-					break;
-				}
-			}
+			if (--(*pi) == 1)
+				prio_queue_put(&queue, parent);
 		}
 		/*
-		 * work_item is a commit all of whose children
-		 * have already been emitted. we can emit it now.
+		 * all children of commit have already been
+		 * emitted. we can emit it now.
 		 */
 		*(indegree_slab_at(&indegree, commit)) = 0;
-		*pptr = work_item;
-		pptr = &work_item->next;
+
+		pptr = &commit_list_insert(commit, pptr)->next;
 	}
 
 	clear_indegree_slab(&indegree);
+	clear_prio_queue(&queue);
 }
 
 /* merge-base stuff */
diff --git a/prio-queue.c b/prio-queue.c
index f2a4973..c9f8c6d 100644
--- a/prio-queue.c
+++ b/prio-queue.c
@@ -2,6 +2,19 @@
 #include "commit.h"
 #include "prio-queue.h"
 
+void prio_queue_reverse(struct prio_queue *queue)
+{
+	int i, j;
+
+	if (queue->compare != NULL)
+		die("BUG: prio_queue_reverse() on non-LIFO queue");
+	for (i = 0; i <= (j = (queue->nr - 1) - i); i++) {
+		struct commit *swap = queue->array[i];
+		queue->array[i] = queue->array[j];
+		queue->array[j] = swap;
+	}
+}
+
 void clear_prio_queue(struct prio_queue *queue)
 {
 	free(queue->array);
diff --git a/prio-queue.h b/prio-queue.h
index ed354a5..e8b81e2 100644
--- a/prio-queue.h
+++ b/prio-queue.h
@@ -42,4 +42,7 @@ extern void *prio_queue_get(struct prio_queue *);
 
 extern void clear_prio_queue(struct prio_queue *);
 
+/* Reverse the LIFO elements */
+extern void prio_queue_reverse(struct prio_queue *);
+
 #endif /* PRIO_QUEUE_H */
-- 
1.8.3.1-494-g51b8af5

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v3 4/4] log: --author-date-order
  2013-06-11 22:19                                     ` [PATCH v3 0/4] log --author-date-order Junio C Hamano
                                                         ` (2 preceding siblings ...)
  2013-06-11 22:19                                       ` [PATCH v3 3/4] sort-in-topological-order: use prio-queue Junio C Hamano
@ 2013-06-11 22:19                                       ` Junio C Hamano
  3 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2013-06-11 22:19 UTC (permalink / raw)
  To: git; +Cc: Elliott Cable, Jeff King

Sometimes people would want to view the commits in parallel
histories in the order of author dates, not committer dates.

Teach "topo-order" sort machinery to do so, using a commit-info slab
to record the author dates of each commit, and prio-queue to sort
them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This re-reads the commit object when commit->buf has already been
   freed, which is necessary to sort by the author date.

 Documentation/rev-list-options.txt |  4 +++
 commit.c                           | 74 ++++++++++++++++++++++++++++++++++++++
 commit.h                           |  3 +-
 revision.c                         |  3 ++
 4 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 3bdbf5e..8302402 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -617,6 +617,10 @@ By default, the commits are shown in reverse chronological order.
 	Show no parents before all of its children are shown, but
 	otherwise show commits in the commit timestamp order.
 
+--author-date-order::
+	Show no parents before all of its children are shown, but
+	otherwise show commits in the author timestamp order.
+
 --topo-order::
 	Show no parents before all of its children are shown, and
 	avoid showing commits on multiple lines of history
diff --git a/commit.c b/commit.c
index 8b84ebf..076c1fa 100644
--- a/commit.c
+++ b/commit.c
@@ -510,6 +510,68 @@ struct commit *pop_commit(struct commit_list **stack)
 /* count number of children that have not been emitted */
 define_commit_slab(indegree_slab, int);
 
+/* record author-date for each commit object */
+define_commit_slab(author_date_slab, unsigned long);
+
+static void record_author_date(struct author_date_slab *author_date,
+			       struct commit *commit)
+{
+	const char *buf, *line_end;
+	char *buffer = NULL;
+	struct ident_split ident;
+	char *date_end;
+	unsigned long date;
+
+	if (!commit->buffer) {
+		unsigned long size;
+		enum object_type type;
+		buffer = read_sha1_file(commit->object.sha1, &type, &size);
+		if (!buffer)
+			return;
+	}
+
+	for (buf = commit->buffer ? commit->buffer : buffer;
+	     buf;
+	     buf = line_end + 1) {
+		line_end = strchrnul(buf, '\n');
+		if (prefixcmp(buf, "author ")) {
+			if (!line_end[0] || line_end[1] == '\n')
+				return; /* end of header */
+			continue;
+		}
+		if (split_ident_line(&ident,
+				     buf + strlen("author "),
+				     line_end - (buf + strlen("author "))) ||
+		    !ident.date_begin || !ident.date_end)
+			goto fail_exit; /* malformed "author" line */
+		break;
+	}
+
+	date = strtoul(ident.date_begin, &date_end, 10);
+	if (date_end != ident.date_end)
+		goto fail_exit; /* malformed date */
+	*(author_date_slab_at(author_date, commit)) = date;
+
+fail_exit:
+	free(buffer);
+}
+
+static int compare_commits_by_author_date(const void *a_, const void *b_,
+					  void *cb_data)
+{
+	const struct commit *a = a_, *b = b_;
+	struct author_date_slab *author_date = cb_data;
+	unsigned long a_date = *(author_date_slab_at(author_date, a));
+	unsigned long b_date = *(author_date_slab_at(author_date, b));
+
+	/* newer commits with larger date first */
+	if (a_date < b_date)
+		return 1;
+	else if (a_date > b_date)
+		return -1;
+	return 0;
+}
+
 static int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused)
 {
 	const struct commit *a = a_, *b = b_;
@@ -531,6 +593,7 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 	struct indegree_slab indegree;
 	struct prio_queue queue;
 	struct commit *commit;
+	struct author_date_slab author_date;
 
 	if (!orig)
 		return;
@@ -538,6 +601,7 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 
 	init_indegree_slab(&indegree);
 	memset(&queue, '\0', sizeof(queue));
+
 	switch (sort_order) {
 	default: /* REV_SORT_IN_GRAPH_ORDER */
 		queue.compare = NULL;
@@ -545,12 +609,20 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 	case REV_SORT_BY_COMMIT_DATE:
 		queue.compare = compare_commits_by_commit_date;
 		break;
+	case REV_SORT_BY_AUTHOR_DATE:
+		init_author_date_slab(&author_date);
+		queue.compare = compare_commits_by_author_date;
+		queue.cb_data = &author_date;
+		break;
 	}
 
 	/* Mark them and clear the indegree */
 	for (next = orig; next; next = next->next) {
 		struct commit *commit = next->item;
 		*(indegree_slab_at(&indegree, commit)) = 1;
+		/* also record the author dates, if needed */
+		if (sort_order == REV_SORT_BY_AUTHOR_DATE)
+			record_author_date(&author_date, commit);
 	}
 
 	/* update the indegree */
@@ -621,6 +693,8 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 
 	clear_indegree_slab(&indegree);
 	clear_prio_queue(&queue);
+	if (sort_order == REV_SORT_BY_AUTHOR_DATE)
+		clear_author_date_slab(&author_date);
 }
 
 /* merge-base stuff */
diff --git a/commit.h b/commit.h
index 247e474..e43dfd0 100644
--- a/commit.h
+++ b/commit.h
@@ -142,7 +142,8 @@ void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark);
 
 enum rev_sort_order {
 	REV_SORT_IN_GRAPH_ORDER = 0,
-	REV_SORT_BY_COMMIT_DATE
+	REV_SORT_BY_COMMIT_DATE,
+	REV_SORT_BY_AUTHOR_DATE
 };
 
 /*
diff --git a/revision.c b/revision.c
index 966ebbc..12d9b64 100644
--- a/revision.c
+++ b/revision.c
@@ -1393,6 +1393,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--date-order")) {
 		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
+	} else if (!strcmp(arg, "--author-date-order")) {
+		revs->sort_order = REV_SORT_BY_AUTHOR_DATE;
+		revs->topo_order = 1;
 	} else if (!prefixcmp(arg, "--early-output")) {
 		int count = 100;
 		switch (arg[14]) {
-- 
1.8.3.1-494-g51b8af5

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 4/4] log: --author-date-order
  2013-06-10 18:49                           ` Jeff King
@ 2013-06-20 19:36                             ` Junio C Hamano
  2013-06-20 20:16                               ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2013-06-20 19:36 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Elliott Cable

Jeff King <peff@peff.net> writes:

>> Or we could extend parse_commit() API to take an optional commit
>> info slab to store not just author date but other non-essential
>> stuff like people's names, and we arrange that extended API to be
>> triggered when we know --author-date-order is in effect?
>
> I like the latter option. It takes a non-trivial amount of time to load
> the commits from disk, and now we are potentially doing it 2 or 3 times
> for a run (once to parse, once to get the author info for topo-sort, and
> possibly later to show it if --pretty is given; though I did not check
> and maybe we turn off save_commit_buffer with --pretty). It would be
> nice to have an extended parse_object that handled that. I'm not sure of
> the interface. Maybe variadic with pairs of type/slab, like:
>
>   parse_commit_extended(commit,
>                         PARSE_COMMIT_AUTHORDATE, &authordate_slab,
>                         PARSE_COMMIT_DONE);
>
> ?

What I had in mind actually was a custom slab tailored for each
caller that is an array of struct.  If the caller is interested in
authordate and authorname, instead of populating two separate
authordate_slab and authorname_slab, the caller declares a

	struct {
        	unsigned long date;
                char name[FLEX_ARRAY];
	} author_info;

prepares author_info_slab, and use your commit_parser API to fill
them.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v2 4/4] log: --author-date-order
  2013-06-20 19:36                             ` Junio C Hamano
@ 2013-06-20 20:16                               ` Jeff King
  0 siblings, 0 replies; 51+ messages in thread
From: Jeff King @ 2013-06-20 20:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elliott Cable

On Thu, Jun 20, 2013 at 12:36:21PM -0700, Junio C Hamano wrote:

> > I like the latter option. It takes a non-trivial amount of time to load
> > the commits from disk, and now we are potentially doing it 2 or 3 times
> > for a run (once to parse, once to get the author info for topo-sort, and
> > possibly later to show it if --pretty is given; though I did not check
> > and maybe we turn off save_commit_buffer with --pretty). It would be
> > nice to have an extended parse_object that handled that. I'm not sure of
> > the interface. Maybe variadic with pairs of type/slab, like:
> >
> >   parse_commit_extended(commit,
> >                         PARSE_COMMIT_AUTHORDATE, &authordate_slab,
> >                         PARSE_COMMIT_DONE);
> >
> > ?
> 
> What I had in mind actually was a custom slab tailored for each
> caller that is an array of struct.  If the caller is interested in
> authordate and authorname, instead of populating two separate
> authordate_slab and authorname_slab, the caller declares a
> 
> 	struct {
>         	unsigned long date;
>                 char name[FLEX_ARRAY];
> 	} author_info;
> 
> prepares author_info_slab, and use your commit_parser API to fill
> them.

Yes, I think it is nicer to stay in one slab if you have multiple
values, but it means more custom code for the caller. If the
commit_parser API is nice, it should not be that much code, though.

It does make it harder to support arbitrary combinations directly in
parse_commit. If a caller wants to also parse_commit and use the same
buffer to pick out its custom information, I think we'd need to do one
of:

  1. Give parse_commit a callback, so that the callback can pick out the
     data it wants while parse_commit has the commit buffer in memory.
     E.g.:

       void grab_author_info(const char *buf, unsigned long len, void *data)
       {
              struct author_info *ai = data;
              /* fill fields from buffer */
       }

       ...
       parse_commit_extra(commit, grab_author_info,
                          slab_at(&author_slab, commit));

  2. Teach parse_commit to operate not only on a raw commit object, but
     also on the commit_parser API. Like:

       struct commit_parser parser = {0};

       /* actually open the object and start our incremental parser */
       init_commit_parser(&parser, commit);

       /* fill in parents, date, etc, as parse_commit does now */
       parse_commit_from_parser(commit, &parser);

       /* fill in whatever extra data we are interested in */
       *slab_at(&slab, commit) = get_author_date(&parser);

       /* done, drop the buffer */
       close_commit_parser(&parser);

The latter would need to handle transferring ownership of the buffer to
"struct commit" from "struct commit_parser" when save_commit_buffer is
turned off.

I think we're a bit high-level now to be making such decisions, though,
as we do not even have such a commit_parser API.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2013-06-20 20:16 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-04 18:08 [PATCH/RFC] add --authorship-order flag to git log / rev-list elliottcable
2013-06-04 18:08 ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering elliottcable
2013-06-04 19:14   ` Junio C Hamano
2013-06-04 21:20     ` Junio C Hamano
2013-06-06 19:03       ` Elliott Cable
2013-06-06 19:29         ` Junio C Hamano
2013-06-06 19:32           ` Elliott Cable
2013-06-06 19:40         ` Junio C Hamano
2013-06-06 22:48           ` Junio C Hamano
2013-06-06 23:25             ` [PATCH] toposort: rename "lifo" field Junio C Hamano
2013-06-07  1:25               ` Junio C Hamano
2013-06-07  5:11                 ` [PATCH 0/3] Preparing for --date-order=author Junio C Hamano
2013-06-07  5:11                   ` [PATCH 1/3] toposort: rename "lifo" field Junio C Hamano
2013-06-07  5:18                     ` Eric Sunshine
2013-06-07  5:21                       ` Junio C Hamano
2013-06-07  5:11                   ` [PATCH 2/3] commit-queue: LIFO or priority queue of commits Junio C Hamano
2013-06-07  5:29                     ` Eric Sunshine
2013-06-07  5:11                   ` [PATCH 3/3] sort-in-topological-order: use commit-queue Junio C Hamano
2013-06-09 23:24                   ` [PATCH v2 0/4] log --author-date-order Junio C Hamano
2013-06-09 23:24                     ` [PATCH v2 1/4] toposort: rename "lifo" field Junio C Hamano
2013-06-10  2:12                       ` Eric Sunshine
2013-06-10  5:05                       ` Jeff King
2013-06-09 23:24                     ` [PATCH v2 2/4] commit-queue: LIFO or priority queue of commits Junio C Hamano
2013-06-10  5:25                       ` Jeff King
2013-06-10  7:21                         ` Junio C Hamano
2013-06-10 18:15                           ` Jeff King
2013-06-10 18:56                             ` Junio C Hamano
2013-06-10 18:59                               ` Jeff King
2013-06-10 23:23                                 ` Junio C Hamano
2013-06-11  6:36                                   ` Jeff King
2013-06-11 17:02                                     ` Junio C Hamano
2013-06-11 22:19                                     ` [PATCH v3 0/4] log --author-date-order Junio C Hamano
2013-06-11 22:19                                       ` [PATCH v3 1/4] toposort: rename "lifo" field Junio C Hamano
2013-06-11 22:19                                       ` [PATCH v3 2/4] prio-queue: priority queue of pointers to structs Junio C Hamano
2013-06-11 22:19                                       ` [PATCH v3 3/4] sort-in-topological-order: use prio-queue Junio C Hamano
2013-06-11 22:19                                       ` [PATCH v3 4/4] log: --author-date-order Junio C Hamano
2013-06-09 23:24                     ` [PATCH v2 3/4] sort-in-topological-order: use commit-queue Junio C Hamano
2013-06-09 23:37                       ` Junio C Hamano
2013-06-10  5:31                         ` Jeff King
2013-06-10  7:27                           ` Junio C Hamano
2013-06-10 18:24                             ` Jeff King
2013-06-09 23:24                     ` [PATCH v2 4/4] log: --author-date-order Junio C Hamano
2013-06-10  5:50                       ` Jeff King
2013-06-10  7:39                         ` Junio C Hamano
2013-06-10 18:49                           ` Jeff King
2013-06-20 19:36                             ` Junio C Hamano
2013-06-20 20:16                               ` Jeff King
2013-06-07  5:09               ` [PATCH] toposort: rename "lifo" field Eric Sunshine
2013-06-04 21:22     ` [PATCH/RFC] rev-list: add --authorship-order alternative ordering Jeff King
2013-06-04 18:53 ` [PATCH/RFC] add --authorship-order flag to git log / rev-list Junio C Hamano
2013-06-06 18:06   ` Elliott Cable

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).