git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Truncating HEAD reflog on branch move
@ 2017-06-21 21:40 brian m. carlson
  2017-06-21 21:46 ` Junio C Hamano
  0 siblings, 1 reply; 69+ messages in thread
From: brian m. carlson @ 2017-06-21 21:40 UTC (permalink / raw)
  To: git; +Cc: Sahil Dua, Kyle Meyer

[-- Attachment #1: Type: text/plain, Size: 1348 bytes --]

In Git 2.11.0, using git branch -m caused an empty entry to be logged to
the reflog, but it preserved the previous history of the default (HEAD)
reflog.  In Git 2.13.1 (and in next and pu), we write a valid entry into
the reflog, but truncate it.

So for the following commands:

  git clone https://git.crustytoothpaste.net/git/bmc/homedir.git
  cd homedir
  git checkout cPanel
  git checkout master
  git branch -m master new

I get the following on 2.11.0:

2cbfbd5 HEAD@{0}:
2cbfbd5 HEAD@{1}: checkout: moving from cPanel to master
eaf8db2 HEAD@{2}: checkout: moving from master to cPanel
2cbfbd5 HEAD@{3}: clone: from https://bmc@git.crustytoothpaste.net/git/bmc/homedir.git

and this on a recent next:

2cbfbd5 HEAD@{0}: Branch: renamed refs/heads/master to refs/heads/new

For this test, any repo will work; I just picked this one because it had
two branches I could use to generate dummy reflog entries.

A colleague reported this to me as a bug.  I don't see anything in the
release notes about this as a desired behavior change, and it does seem
undesirable to truncate the reflog this way.  If this isn't intentional,
I'm happy to work up a patch.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-21 21:40 Truncating HEAD reflog on branch move brian m. carlson
@ 2017-06-21 21:46 ` Junio C Hamano
  2017-06-21 23:04   ` Kyle Meyer
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-06-21 21:46 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Sahil Dua, Kyle Meyer

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> I get the following on 2.11.0:
>
> 2cbfbd5 HEAD@{0}:
> 2cbfbd5 HEAD@{1}: checkout: moving from cPanel to master
> eaf8db2 HEAD@{2}: checkout: moving from master to cPanel
> 2cbfbd5 HEAD@{3}: clone: from https://bmc@git.crustytoothpaste.net/git/bmc/homedir.git
>
> and this on a recent next:
>
> 2cbfbd5 HEAD@{0}: Branch: renamed refs/heads/master to refs/heads/new
>
> For this test, any repo will work; I just picked this one because it had
> two branches I could use to generate dummy reflog entries.
>
> A colleague reported this to me as a bug.  I don't see anything in the
> release notes about this as a desired behavior change, and it does seem
> undesirable to truncate the reflog this way.  If this isn't intentional,
> I'm happy to work up a patch.

I do not think either behaviour is intentional (old one that gives a
meaningless empty entry probably is probably not what we want, the
new one that truncates is not what we want, either).


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-21 21:46 ` Junio C Hamano
@ 2017-06-21 23:04   ` Kyle Meyer
  2017-06-21 23:12     ` Kyle Meyer
  2017-06-22 15:16     ` Jeff King
  0 siblings, 2 replies; 69+ messages in thread
From: Kyle Meyer @ 2017-06-21 23:04 UTC (permalink / raw)
  To: Junio C Hamano, brian m. carlson; +Cc: git, Sahil Dua

Junio C Hamano <gitster@pobox.com> writes:

> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>
>> I get the following on 2.11.0:
>>
>> 2cbfbd5 HEAD@{0}:
>> 2cbfbd5 HEAD@{1}: checkout: moving from cPanel to master
>> eaf8db2 HEAD@{2}: checkout: moving from master to cPanel
>> 2cbfbd5 HEAD@{3}: clone: from https://bmc@git.crustytoothpaste.net/git/bmc/homedir.git
>>
>> and this on a recent next:
>>
>> 2cbfbd5 HEAD@{0}: Branch: renamed refs/heads/master to refs/heads/new
>>
>> For this test, any repo will work; I just picked this one because it had
>> two branches I could use to generate dummy reflog entries.
>>
>> A colleague reported this to me as a bug.  I don't see anything in the
>> release notes about this as a desired behavior change, and it does seem
>> undesirable to truncate the reflog this way.  If this isn't intentional,
>> I'm happy to work up a patch.
>
> I do not think either behaviour is intentional (old one that gives a
> meaningless empty entry probably is probably not what we want, the
> new one that truncates is not what we want, either).

Eh, sorry about that.  I haven't dug very deeply, but it seems like the
entry is still in .git/logs/HEAD, while "git reflog" is only showing
they latest entry.  (Maybe because we stop consuming the entries when we
hit a null sha1 in the old id field?)

-- 
Kyle

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-21 23:04   ` Kyle Meyer
@ 2017-06-21 23:12     ` Kyle Meyer
  2017-06-22 15:16     ` Jeff King
  1 sibling, 0 replies; 69+ messages in thread
From: Kyle Meyer @ 2017-06-21 23:12 UTC (permalink / raw)
  To: Junio C Hamano, brian m. carlson; +Cc: git, Sahil Dua

Kyle Meyer <kyle@kyleam.com> writes:

> Eh, sorry about that.  I haven't dug very deeply, but it seems like the
> entry is still in .git/logs/HEAD, while "git reflog" is only showing
> they latest entry.

s/entry is still/missing entries are still/


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-21 23:04   ` Kyle Meyer
  2017-06-21 23:12     ` Kyle Meyer
@ 2017-06-22 15:16     ` Jeff King
  2017-06-22 18:32       ` Junio C Hamano
  1 sibling, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-06-22 15:16 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: Junio C Hamano, brian m. carlson, git, Sahil Dua

On Wed, Jun 21, 2017 at 07:04:22PM -0400, Kyle Meyer wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> >
> >> I get the following on 2.11.0:
> >>
> >> 2cbfbd5 HEAD@{0}:
> >> 2cbfbd5 HEAD@{1}: checkout: moving from cPanel to master
> >> eaf8db2 HEAD@{2}: checkout: moving from master to cPanel
> >> 2cbfbd5 HEAD@{3}: clone: from https://bmc@git.crustytoothpaste.net/git/bmc/homedir.git
> >>
> >> and this on a recent next:
> >>
> >> 2cbfbd5 HEAD@{0}: Branch: renamed refs/heads/master to refs/heads/new
> >>
> >> For this test, any repo will work; I just picked this one because it had
> >> two branches I could use to generate dummy reflog entries.
> >>
> >> A colleague reported this to me as a bug.  I don't see anything in the
> >> release notes about this as a desired behavior change, and it does seem
> >> undesirable to truncate the reflog this way.  If this isn't intentional,
> >> I'm happy to work up a patch.
> >
> > I do not think either behaviour is intentional (old one that gives a
> > meaningless empty entry probably is probably not what we want, the
> > new one that truncates is not what we want, either).
> 
> Eh, sorry about that.  I haven't dug very deeply, but it seems like the
> entry is still in .git/logs/HEAD, while "git reflog" is only showing
> they latest entry.  (Maybe because we stop consuming the entries when we
> hit a null sha1 in the old id field?)

Yeah, I think that's it.

I had trouble digging to find the place where this happens, but I
_think_ it's unintentional, and is an artifact of the way the reflog
iteration is bolted onto the regular revision machinery. In
get_revision_1(), we fake the commit parents as if the earlier reflog
entries formed an actual commit history.

There's some magic in fake_reflog_parent() when we hit an entry with a
null sha1 in the old-oid field. But it depends on looking at the new-oid
field of the entry before. And because our entries look like this:

  real_sha1 null_sha1 ...
  null_sha1 real_sha1 ...

Looking at the prior entry's new-oid field doesn't help at all. And we
end up claiming that there are no parents, and this is the end of the
traversal. The patch below makes it work for me by falling back further
to the previous entry's old oid. This feels like a terrible hack to make
this case work. I'd think we'd actually want some kind of loop to keep
looking. After all, what should:

  real_sha1 real_sha1 ...
  null_sha1 null_sha1 ...
  null_sha1 real_sha1 ...

show? Surely we should not stop traversing when we hit the double
null_sha1. We should either display it in some form, or skip it and
issue a warning that there's a funky entry in your reflog.

This whole fake-parents things does just feel like a gigantic hack,
though. It seems like we should be able to just walk backwards down the
reflog list and show the entries. The revision machinery already
special-cases a bunch of reflog-walk bits; I don't know that adding one
or two more would be the end of the world.

Another approach entirely would be to stop generating two entries in the
reflog, and generate a single one:

  real_sha1 real_sha1 ... Rename from "foo" to "bar"

I think we talked about that earlier, but I think the resolution was
just that doing so was hard with the existing ref-update calls.

diff --git a/reflog-walk.c b/reflog-walk.c
index ed99437ad..b7e489ad3 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -259,6 +259,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 		/* a root commit, but there are still more entries to show */
 		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 		logobj = parse_object(&reflog->noid);
+		if (!logobj)
+			logobj = parse_object(&reflog->ooid);
 	}
 
 	if (!logobj || logobj->type != OBJ_COMMIT) {


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-22 15:16     ` Jeff King
@ 2017-06-22 18:32       ` Junio C Hamano
  2017-06-22 18:45         ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-06-22 18:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Kyle Meyer, brian m. carlson, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> There's some magic in fake_reflog_parent() when we hit an entry with a
> ...
> This whole fake-parents things does just feel like a gigantic hack,
> though.
> ...
> It seems like we should be able to just walk backwards down the
> reflog list and show the entries. The revision machinery already
> special-cases a bunch of reflog-walk bits; I don't know that adding one
> or two more would be the end of the world.

Unfortunate but I tend to agree that at least such an addition would
make the "gigantic hack" a bit more complete one ;-)

> diff --git a/reflog-walk.c b/reflog-walk.c
> index ed99437ad..b7e489ad3 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -259,6 +259,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  		/* a root commit, but there are still more entries to show */
>  		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>  		logobj = parse_object(&reflog->noid);
> +		if (!logobj)
> +			logobj = parse_object(&reflog->ooid);
>  	}
>  
>  	if (!logobj || logobj->type != OBJ_COMMIT) {

We already have a loop to find an entry that is a commit that
discards any non-commit object before the pre-context of this hunk.
This "oops, old side is NULL so let's cover it up by using the new
side" kicks in after that.  I wonder if we can roll that cover-up
logic into the loop, perhaps like

	do {
		reflog = &commit_reflog->...[recno];
		commit_reflog->recno--;
-		logobj = parse_object(&reflog->ooid);
+		logobj = parse_object(is_null_oid(&reflog->ooid) 
+				? &reflog->noid : &reflog->ooid);
-	} while (commit_reflog->recno && (logobj && logobj->type != OBJ_COMMIT));
+	} while (commit_reflog->recno && (!logobj || logobj->type != OBJ_COMMIT));
-
-	if (!logobj && commit_reflog->recno >= 0 && is_null_oid(&reflog->ooid)) {
-		/* a root commit ... */
-		reflog = &commit_reflog->...[recno];
-		logobj = parse_object(&reflog->noid);
-	}

which may deal with your "both old and new sides were  NULL" case
better.



^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-22 18:32       ` Junio C Hamano
@ 2017-06-22 18:45         ` Jeff King
  2017-06-22 19:03           ` Junio C Hamano
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-06-22 18:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kyle Meyer, brian m. carlson, git, Sahil Dua

On Thu, Jun 22, 2017 at 11:32:43AM -0700, Junio C Hamano wrote:

> > diff --git a/reflog-walk.c b/reflog-walk.c
> > index ed99437ad..b7e489ad3 100644
> > --- a/reflog-walk.c
> > +++ b/reflog-walk.c
> > @@ -259,6 +259,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
> >  		/* a root commit, but there are still more entries to show */
> >  		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
> >  		logobj = parse_object(&reflog->noid);
> > +		if (!logobj)
> > +			logobj = parse_object(&reflog->ooid);
> >  	}
> >  
> >  	if (!logobj || logobj->type != OBJ_COMMIT) {
> 
> We already have a loop to find an entry that is a commit that
> discards any non-commit object before the pre-context of this hunk.
> This "oops, old side is NULL so let's cover it up by using the new
> side" kicks in after that.  I wonder if we can roll that cover-up
> logic into the loop, perhaps like

Yeah, I tried making the second conditional its own loop, but I agree it
probably ought to be part of a single loop looking for a sane parent
entry.

> 	do {
> 		reflog = &commit_reflog->...[recno];
> 		commit_reflog->recno--;
> -		logobj = parse_object(&reflog->ooid);
> +		logobj = parse_object(is_null_oid(&reflog->ooid) 
> +				? &reflog->noid : &reflog->ooid);
> -	} while (commit_reflog->recno && (logobj && logobj->type != OBJ_COMMIT));
> +	} while (commit_reflog->recno && (!logobj || logobj->type != OBJ_COMMIT));
> -
> -	if (!logobj && commit_reflog->recno >= 0 && is_null_oid(&reflog->ooid)) {
> -		/* a root commit ... */
> -		reflog = &commit_reflog->...[recno];
> -		logobj = parse_object(&reflog->noid);
> -	}

I don't think this is quite right, though. We've decremented "recno"
after assigning the old pointer to "reflog". So in the existing code,
"reflog" in that second conditional pointing to the _next_ entry (or
previous, really, since we are going in reverse order).

So I think you'd need to look at commit->reflog again (after checking
that we didn't go past the start of the array).

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-22 18:45         ` Jeff King
@ 2017-06-22 19:03           ` Junio C Hamano
  2017-06-22 20:21             ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-06-22 19:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Kyle Meyer, brian m. carlson, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> I don't think this is quite right, though. We've decremented "recno"
> after assigning the old pointer to "reflog". So in the existing code,
> "reflog" in that second conditional pointing to the _next_ entry (or
> previous, really, since we are going in reverse order).
>
> So I think you'd need to look at commit->reflog again (after checking
> that we didn't go past the start of the array).

Perhaps.  I did the illustration that way simply because I was not
sure if the current "the entry was NULL from something new, so skip
and look at the previous entry's new" was correct to begin with.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-22 19:03           ` Junio C Hamano
@ 2017-06-22 20:21             ` Jeff King
  2017-06-22 20:32               ` Junio C Hamano
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-06-22 20:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kyle Meyer, brian m. carlson, git, Sahil Dua

On Thu, Jun 22, 2017 at 12:03:31PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I don't think this is quite right, though. We've decremented "recno"
> > after assigning the old pointer to "reflog". So in the existing code,
> > "reflog" in that second conditional pointing to the _next_ entry (or
> > previous, really, since we are going in reverse order).
> >
> > So I think you'd need to look at commit->reflog again (after checking
> > that we didn't go past the start of the array).
> 
> Perhaps.  I did the illustration that way simply because I was not
> sure if the current "the entry was NULL from something new, so skip
> and look at the previous entry's new" was correct to begin with.

I'm not sure it makes sense for an entry to have its "after" state as
its own parent. On the other hand, I'm not sure it makes sense to
consider this fake parentage in the first place.

I think in the old days we used truly rewrite the parents, and traversal
like "git log -g -p" would show the diff between reflog entries. But
that changed in 838f9a156 (log: use true parents for diff when walking
reflogs, 2013-08-03). Given that we disable reflog-walking with things
like "--graph" that show the relationship, are the fake parents actually
accomplishing anything anymore? The parents we show via "log -g
--parents" seem like nonsense, and I'm not sure how we would do any
history simplification based no the munged values.

So I'd be tempted to just ditch the whole thing and teach
get_revision_1() to just walk through the list of logs, rather than this
weird "add a pending commit and then try to figure out which reflog it
referred to". For instance, right now:

  git log -g HEAD $(git symbolic-ref HEAD)

only shows _one_ reflog. The patch below is the direction I'm thinking.
It fails two tests, but haven't dug yet.

---
 reflog-walk.c | 112 +++++++++--------------------------
 reflog-walk.h |   4 +-
 revision.c    |  24 ++++----
 3 files changed, 43 insertions(+), 97 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index ed99437ad..77170a3cb 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -78,45 +78,6 @@ static int get_reflog_recno_by_time(struct complete_reflogs *array,
 	return -1;
 }
 
-struct commit_info_lifo {
-	struct commit_info {
-		struct commit *commit;
-		void *util;
-	} *items;
-	int nr, alloc;
-};
-
-static struct commit_info *get_commit_info(struct commit *commit,
-		struct commit_info_lifo *lifo, int pop)
-{
-	int i;
-	for (i = 0; i < lifo->nr; i++)
-		if (lifo->items[i].commit == commit) {
-			struct commit_info *result = &lifo->items[i];
-			if (pop) {
-				if (i + 1 < lifo->nr)
-					memmove(lifo->items + i,
-						lifo->items + i + 1,
-						(lifo->nr - i) *
-						sizeof(struct commit_info));
-				lifo->nr--;
-			}
-			return result;
-		}
-	return NULL;
-}
-
-static void add_commit_info(struct commit *commit, void *util,
-		struct commit_info_lifo *lifo)
-{
-	struct commit_info *info;
-	ALLOC_GROW(lifo->items, lifo->nr + 1, lifo->alloc);
-	info = lifo->items + lifo->nr;
-	info->commit = commit;
-	info->util = util;
-	lifo->nr++;
-}
-
 struct commit_reflog {
 	int recno;
 	enum selector_type {
@@ -128,7 +89,8 @@ struct commit_reflog {
 };
 
 struct reflog_walk_info {
-	struct commit_info_lifo reflogs;
+	struct commit_reflog **logs;
+	size_t nr, alloc, cur;
 	struct string_list complete_reflogs;
 	struct commit_reflog *last_commit_reflog;
 };
@@ -226,50 +188,10 @@ int add_reflog_for_walk(struct reflog_walk_info *info,
 	commit_reflog->selector = selector;
 	commit_reflog->reflogs = reflogs;
 
-	add_commit_info(commit, commit_reflog, &info->reflogs);
-	return 0;
-}
-
-void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
-{
-	struct commit_info *commit_info =
-		get_commit_info(commit, &info->reflogs, 0);
-	struct commit_reflog *commit_reflog;
-	struct object *logobj;
-	struct reflog_info *reflog;
-
-	info->last_commit_reflog = NULL;
-	if (!commit_info)
-		return;
+	ALLOC_GROW(info->logs, info->nr + 1, info->alloc);
+	info->logs[info->nr++] = commit_reflog;
 
-	commit_reflog = commit_info->util;
-	if (commit_reflog->recno < 0) {
-		commit->parents = NULL;
-		return;
-	}
-	info->last_commit_reflog = commit_reflog;
-
-	do {
-		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
-		commit_reflog->recno--;
-		logobj = parse_object(&reflog->ooid);
-	} while (commit_reflog->recno && (logobj && logobj->type != OBJ_COMMIT));
-
-	if (!logobj && commit_reflog->recno >= 0 && is_null_oid(&reflog->ooid)) {
-		/* a root commit, but there are still more entries to show */
-		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
-		logobj = parse_object(&reflog->noid);
-	}
-
-	if (!logobj || logobj->type != OBJ_COMMIT) {
-		commit_info->commit = NULL;
-		commit->parents = NULL;
-		return;
-	}
-	commit_info->commit = (struct commit *)logobj;
-
-	commit->parents = xcalloc(1, sizeof(struct commit_list));
-	commit->parents->item = commit_info->commit;
+	return 0;
 }
 
 void get_reflog_selector(struct strbuf *sb,
@@ -356,3 +278,27 @@ void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 		strbuf_release(&selector);
 	}
 }
+
+struct commit *next_entry_in_log(struct commit_reflog *log)
+{
+	while (log->recno >= 0) {
+		struct reflog_info *entry = &log->reflogs->items[log->recno--];
+		struct object *obj = parse_object(&entry->noid);
+
+		if (obj && obj->type == OBJ_COMMIT)
+			return (struct commit *)obj;
+	}
+	return NULL;
+}
+
+struct commit *next_reflog_entry(struct reflog_walk_info *walk)
+{
+	for (; walk->cur < walk->nr; walk->cur++) {
+		struct commit *ret = next_entry_in_log(walk->logs[walk->cur]);
+		if (ret) {
+			walk->last_commit_reflog = walk->logs[walk->cur];
+			return ret;
+		}
+	}
+	return NULL;
+}
diff --git a/reflog-walk.h b/reflog-walk.h
index 27886f793..5cc6b7bd4 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -8,8 +8,6 @@ struct reflog_walk_info;
 extern void init_reflog_walk(struct reflog_walk_info **info);
 extern int add_reflog_for_walk(struct reflog_walk_info *info,
 		struct commit *commit, const char *name);
-extern void fake_reflog_parent(struct reflog_walk_info *info,
-		struct commit *commit);
 extern void show_reflog_message(struct reflog_walk_info *info, int,
 				const struct date_mode *, int force_date);
 extern void get_reflog_message(struct strbuf *sb,
@@ -20,4 +18,6 @@ extern void get_reflog_selector(struct strbuf *sb,
 		const struct date_mode *dmode, int force_date,
 		int shorten);
 
+struct commit *next_reflog_entry(struct reflog_walk_info *reflog_info);
+
 #endif
diff --git a/revision.c b/revision.c
index 12eb33235..675247cd9 100644
--- a/revision.c
+++ b/revision.c
@@ -148,16 +148,14 @@ static void add_pending_object_with_path(struct rev_info *revs,
 	if (revs->reflog_info && obj->type == OBJ_COMMIT) {
 		struct strbuf buf = STRBUF_INIT;
 		int len = interpret_branch_name(name, 0, &buf, 0);
-		int st;
 
 		if (0 < len && name[len] && buf.len)
 			strbuf_addstr(&buf, name + len);
-		st = add_reflog_for_walk(revs->reflog_info,
-					 (struct commit *)obj,
-					 buf.buf[0] ? buf.buf: name);
+		add_reflog_for_walk(revs->reflog_info,
+				    (struct commit *)obj,
+				    buf.buf[0] ? buf.buf: name);
 		strbuf_release(&buf);
-		if (st)
-			return;
+		return; /* do not add the commit itself */
 	}
 	add_object_array_with_path(obj, name, &revs->pending, mode, path);
 }
@@ -3114,18 +3112,20 @@ static void track_linear(struct rev_info *revs, struct commit *commit)
 
 static struct commit *get_revision_1(struct rev_info *revs)
 {
+	if (revs->reflog_info) {
+		struct commit *commit = next_reflog_entry(revs->reflog_info);
+		if (commit) {
+			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
+			return commit;
+		}
+	}
+
 	if (!revs->commits)
 		return NULL;
 
 	do {
 		struct commit *commit = pop_commit(&revs->commits);
 
-		if (revs->reflog_info) {
-			save_parents(revs, commit);
-			fake_reflog_parent(revs->reflog_info, commit);
-			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
-		}
-
 		/*
 		 * If we haven't done the list limiting, we need to look at
 		 * the parents here. We also need to do the date-based limiting

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-22 20:21             ` Jeff King
@ 2017-06-22 20:32               ` Junio C Hamano
  2017-06-22 21:52                 ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-06-22 20:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Kyle Meyer, brian m. carlson, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> So I'd be tempted to just ditch the whole thing and teach
> get_revision_1() to just walk through the list of logs, rather than this
> weird "add a pending commit and then try to figure out which reflog it
> referred to". For instance, right now:
>
>   git log -g HEAD $(git symbolic-ref HEAD)
>
> only shows _one_ reflog. The patch below is the direction I'm thinking.
> It fails two tests, but haven't dug yet.
>
> ---
>  reflog-walk.c | 112 +++++++++--------------------------
>  reflog-walk.h |   4 +-
>  revision.c    |  24 ++++----
>  3 files changed, 43 insertions(+), 97 deletions(-)

Yeah, I agree with the "we now show diffs with true parents"
reasoning, and I like the above code reduction, obviously ;-)

> @@ -3114,18 +3112,20 @@ static void track_linear(struct rev_info *revs, struct commit *commit)
>  
>  static struct commit *get_revision_1(struct rev_info *revs)
>  {
> +	if (revs->reflog_info) {
> +		struct commit *commit = next_reflog_entry(revs->reflog_info);
> +		if (commit) {
> +			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
> +			return commit;
> +		}
> +	}
> +
>  	if (!revs->commits)
>  		return NULL;
>  
>  	do {
>  		struct commit *commit = pop_commit(&revs->commits);
>  
> -		if (revs->reflog_info) {
> -			save_parents(revs, commit);
> -			fake_reflog_parent(revs->reflog_info, commit);
> -			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
> -		}
> -
>  		/*
>  		 * If we haven't done the list limiting, we need to look at
>  		 * the parents here. We also need to do the date-based limiting

This part of the patch I can 100% agree with ;-)  

I do not think command line parser does not allow "log -g
maint..master" so all the "limited" processing the remainder of
get_revision_1() does shouldn't matter.

I however think pathspec will affect simplify_commit() and suspect
that "git log -g -20 HEAD path" will behave differently.  Perhaps
the difference is "it used to use path in an unexplainable way, now
it ignores", in which case this is an improvement.

Thanks.






^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-22 20:32               ` Junio C Hamano
@ 2017-06-22 21:52                 ` Jeff King
  2017-06-22 22:25                   ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-06-22 21:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kyle Meyer, brian m. carlson, git, Sahil Dua

On Thu, Jun 22, 2017 at 01:32:44PM -0700, Junio C Hamano wrote:

> I do not think command line parser does not allow "log -g
> maint..master" so all the "limited" processing the remainder of
> get_revision_1() does shouldn't matter.

Yeah, I don't think negative endpoints work at all, and "foo...bar"
seems to also break (though with a confusing message). It seems clear to
me that multiple positive endpoints don't work well either, if they have
overlapping commits.

> I however think pathspec will affect simplify_commit() and suspect
> that "git log -g -20 HEAD path" will behave differently.  Perhaps
> the difference is "it used to use path in an unexplainable way, now
> it ignores", in which case this is an improvement.

The current behavior there does seem like nonsense, because it's based
on the fake parents. For instance, if I set up a simple two-branch case:

  commit() {
	echo "$1" >"$1" && git add "$1" && git commit -m "$1"
  }

  git init repo
  cd repo
  commit base
  commit master
  git checkout -b side HEAD^
  commit side
  git merge --no-edit master
  commit combined

Then I get:

  $ git log -g --oneline --name-status -- master
  f06c3cd HEAD@{1}: merge master: Merge made by the 'recursive' strategy.
  5bf12c4 HEAD@{3}: checkout: moving from master to side
  dfa408b HEAD@{4}: commit: master
  A       master

Even though only one of those commits touched master. But with my patch,
it's also somewhat confusing. We ignore the pathspec when picking which
commits to show, but still apply it for diffing. So:

  03cf1ad HEAD@{0}: commit: combined
  f06c3cd HEAD@{1}: merge master: Merge made by the 'recursive' strategy.
  277042b HEAD@{2}: commit: side
  5bf12c4 HEAD@{3}: checkout: moving from master to side
  dfa408b HEAD@{4}: commit: master
  A       master
  5bf12c4 HEAD@{5}: commit (initial): base

I think we'd want to just omit any entries that are TREESAME to their
parents. We don't actually care about true parent rewriting (since we're
not walking the parents), but if it happened as a side effect that would
probably be OK.

It looks like simplify_commit() is also where we apply bits like
--no-merges, which doesn't work with my patch. It _also_ behaves
nonsensically in the current code, because of course the fake reflog
parents are never merges.

Which really makes me feel like this patch is going in the right
direction, as it makes all of this behave conceptually like:

  while read old new etc ...
  do
    git show $new
  done <.git/logs/$ref

which is simple to explain and is what I'd expect (and is certainly the
direction we went with the "diff uses real parents" commit).

We just need to hit the simplify_commit() code path here.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-22 21:52                 ` Jeff King
@ 2017-06-22 22:25                   ` Jeff King
  2017-06-23  3:13                     ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-06-22 22:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kyle Meyer, brian m. carlson, git, Sahil Dua

On Thu, Jun 22, 2017 at 05:52:35PM -0400, Jeff King wrote:

> Which really makes me feel like this patch is going in the right
> direction, as it makes all of this behave conceptually like:
> 
>   while read old new etc ...
>   do
>     git show $new
>   done <.git/logs/$ref
> 
> which is simple to explain and is what I'd expect (and is certainly the
> direction we went with the "diff uses real parents" commit).
> 
> We just need to hit the simplify_commit() code path here.

So here's a patch on top of what I posted before that pushes the reflog
check into the loop (it just decides whether to pull from the reflogs or
from the commit queue at the top of the loop).

I was surprised to find, though, that simplify_commit() does not
actually do the pathspec limiting! It's done by
try_to_simplify_commit(), which is part of add_parents_to_list(). I
hacked around it in the later part of the loop by calling
try_to_simplify myself and checking the TREESAME flag. But I have a
feeling I'm missing something about how the traversal is supposed to
work.

This does behave sensibly with "--no-merges" and "--merges", as well as
pathspec limiting.

diff --git a/revision.c b/revision.c
index 675247cd9..203468ddf 100644
--- a/revision.c
+++ b/revision.c
@@ -3112,19 +3112,19 @@ static void track_linear(struct rev_info *revs, struct commit *commit)
 
 static struct commit *get_revision_1(struct rev_info *revs)
 {
-	if (revs->reflog_info) {
-		struct commit *commit = next_reflog_entry(revs->reflog_info);
-		if (commit) {
-			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
-			return commit;
-		}
-	}
+	while (1) {
+		struct commit *commit;
 
-	if (!revs->commits)
-		return NULL;
+		if (revs->reflog_info)
+			commit = next_reflog_entry(revs->reflog_info);
+		else
+			commit = pop_commit(&revs->commits);
 
-	do {
-		struct commit *commit = pop_commit(&revs->commits);
+		if (!commit)
+			return NULL;
+
+		if (revs->reflog_info)
+			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
 
 		/*
 		 * If we haven't done the list limiting, we need to look at
@@ -3135,7 +3135,8 @@ static struct commit *get_revision_1(struct rev_info *revs)
 			if (revs->max_age != -1 &&
 			    (commit->date < revs->max_age))
 				continue;
-			if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
+			if (!revs->reflog_info &&
+			    add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
 				if (!revs->ignore_missing_links)
 					die("Failed to traverse parents of commit %s",
 						oid_to_hex(&commit->object.oid));
@@ -3151,10 +3152,14 @@ static struct commit *get_revision_1(struct rev_info *revs)
 		default:
 			if (revs->track_linear)
 				track_linear(revs, commit);
+			if (revs->reflog_info) {
+				try_to_simplify_commit(revs, commit);
+				if (commit->object.flags & TREESAME)
+					continue;
+			}
 			return commit;
 		}
-	} while (revs->commits);
-	return NULL;
+	}
 }
 
 /*

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-22 22:25                   ` Jeff King
@ 2017-06-23  3:13                     ` Jeff King
  2017-07-04 19:58                       ` brian m. carlson
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-06-23  3:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kyle Meyer, brian m. carlson, git, Sahil Dua

On Thu, Jun 22, 2017 at 06:25:46PM -0400, Jeff King wrote:

> So here's a patch on top of what I posted before that pushes the reflog
> check into the loop (it just decides whether to pull from the reflogs or
> from the commit queue at the top of the loop).
> 
> I was surprised to find, though, that simplify_commit() does not
> actually do the pathspec limiting! It's done by
> try_to_simplify_commit(), which is part of add_parents_to_list(). I
> hacked around it in the later part of the loop by calling
> try_to_simplify myself and checking the TREESAME flag. But I have a
> feeling I'm missing something about how the traversal is supposed to
> work.
> 
> This does behave sensibly with "--no-merges" and "--merges", as well as
> pathspec limiting.

And here's one more patch on top of those that's necessary to get the
tests to pass (I don't expect anybody to necessarily be applying this
slow string of patches; it's just to show the direction I'm looking in).

---
 builtin/log.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 998437b23..512538479 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -373,9 +373,9 @@ static int cmd_log_walk(struct rev_info *rev)
 		if (!rev->reflog_info) {
 			/* we allow cycles in reflog ancestry */
 			free_commit_buffer(commit);
+			free_commit_list(commit->parents);
+			commit->parents = NULL;
 		}
-		free_commit_list(commit->parents);
-		commit->parents = NULL;
 		if (saved_nrl < rev->diffopt.needed_rename_limit)
 			saved_nrl = rev->diffopt.needed_rename_limit;
 		if (rev->diffopt.degraded_cc_to_c)

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-06-23  3:13                     ` Jeff King
@ 2017-07-04 19:58                       ` brian m. carlson
  2017-07-04 21:24                         ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: brian m. carlson @ 2017-07-04 19:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

[-- Attachment #1: Type: text/plain, Size: 1560 bytes --]

On Thu, Jun 22, 2017 at 11:13:15PM -0400, Jeff King wrote:
> On Thu, Jun 22, 2017 at 06:25:46PM -0400, Jeff King wrote:
> 
> > So here's a patch on top of what I posted before that pushes the reflog
> > check into the loop (it just decides whether to pull from the reflogs or
> > from the commit queue at the top of the loop).
> > 
> > I was surprised to find, though, that simplify_commit() does not
> > actually do the pathspec limiting! It's done by
> > try_to_simplify_commit(), which is part of add_parents_to_list(). I
> > hacked around it in the later part of the loop by calling
> > try_to_simplify myself and checking the TREESAME flag. But I have a
> > feeling I'm missing something about how the traversal is supposed to
> > work.
> > 
> > This does behave sensibly with "--no-merges" and "--merges", as well as
> > pathspec limiting.
> 
> And here's one more patch on top of those that's necessary to get the
> tests to pass (I don't expect anybody to necessarily be applying this
> slow string of patches; it's just to show the direction I'm looking in).

I've looked at your original patch, which modified reflog-walk.c, and it
does fix the issue.  I'm happy to send in a patch with that and a test
(provided you're okay with me adding your sign-off), or if you wanted to
send in something a bit more complete, like the series of patches here,
that's fine, too.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-07-04 19:58                       ` brian m. carlson
@ 2017-07-04 21:24                         ` Jeff King
  2017-07-04 21:27                           ` brian m. carlson
                                             ` (2 more replies)
  0 siblings, 3 replies; 69+ messages in thread
From: Jeff King @ 2017-07-04 21:24 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano, Kyle Meyer, git, Sahil Dua

On Tue, Jul 04, 2017 at 07:58:06PM +0000, brian m. carlson wrote:

> > And here's one more patch on top of those that's necessary to get the
> > tests to pass (I don't expect anybody to necessarily be applying this
> > slow string of patches; it's just to show the direction I'm looking in).
> 
> I've looked at your original patch, which modified reflog-walk.c, and it
> does fix the issue.  I'm happy to send in a patch with that and a test
> (provided you're okay with me adding your sign-off), or if you wanted to
> send in something a bit more complete, like the series of patches here,
> that's fine, too.

I've been on vacation for the past week, but wrapping this up is on my
todo. I'll try to get to it tonight.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-07-04 21:24                         ` Jeff King
@ 2017-07-04 21:27                           ` brian m. carlson
  2017-07-04 21:28                             ` Jeff King
  2017-07-05  1:49                           ` Junio C Hamano
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
  2 siblings, 1 reply; 69+ messages in thread
From: brian m. carlson @ 2017-07-04 21:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

[-- Attachment #1: Type: text/plain, Size: 882 bytes --]

On Tue, Jul 04, 2017 at 05:24:08PM -0400, Jeff King wrote:
> On Tue, Jul 04, 2017 at 07:58:06PM +0000, brian m. carlson wrote:
> > I've looked at your original patch, which modified reflog-walk.c, and it
> > does fix the issue.  I'm happy to send in a patch with that and a test
> > (provided you're okay with me adding your sign-off), or if you wanted to
> > send in something a bit more complete, like the series of patches here,
> > that's fine, too.
> 
> I've been on vacation for the past week, but wrapping this up is on my
> todo. I'll try to get to it tonight.

Okay, if it's on your radar, that's fine.  Certainly you should enjoy
your vacation and not feel obligated to work on Git unless you want to.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-07-04 21:27                           ` brian m. carlson
@ 2017-07-04 21:28                             ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-04 21:28 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano, Kyle Meyer, git, Sahil Dua

On Tue, Jul 04, 2017 at 09:27:09PM +0000, brian m. carlson wrote:

> On Tue, Jul 04, 2017 at 05:24:08PM -0400, Jeff King wrote:
> > On Tue, Jul 04, 2017 at 07:58:06PM +0000, brian m. carlson wrote:
> > > I've looked at your original patch, which modified reflog-walk.c, and it
> > > does fix the issue.  I'm happy to send in a patch with that and a test
> > > (provided you're okay with me adding your sign-off), or if you wanted to
> > > send in something a bit more complete, like the series of patches here,
> > > that's fine, too.
> > 
> > I've been on vacation for the past week, but wrapping this up is on my
> > todo. I'll try to get to it tonight.
> 
> Okay, if it's on your radar, that's fine.  Certainly you should enjoy
> your vacation and not feel obligated to work on Git unless you want to.

Heh, no, "tonight" is the first work day after the vacation ends. Once
the fireworks are done, it's back to work. ;)

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: Truncating HEAD reflog on branch move
  2017-07-04 21:24                         ` Jeff King
  2017-07-04 21:27                           ` brian m. carlson
@ 2017-07-05  1:49                           ` Junio C Hamano
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
  2 siblings, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2017-07-05  1:49 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> On Tue, Jul 04, 2017 at 07:58:06PM +0000, brian m. carlson wrote:
>
>> > And here's one more patch on top of those that's necessary to get the
>> > tests to pass (I don't expect anybody to necessarily be applying this
>> > slow string of patches; it's just to show the direction I'm looking in).
>> 
>> I've looked at your original patch, which modified reflog-walk.c, and it
>> does fix the issue.  I'm happy to send in a patch with that and a test
>> (provided you're okay with me adding your sign-off), or if you wanted to
>> send in something a bit more complete, like the series of patches here,
>> that's fine, too.
>
> I've been on vacation for the past week, but wrapping this up is on my
> todo. I'll try to get to it tonight.

;-)  Welcome back.  

Hopefully I haven't advanced topics to 'next' that are sub-par
during your absence.



^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH 0/6] fixing reflog-walk oddities
  2017-07-04 21:24                         ` Jeff King
  2017-07-04 21:27                           ` brian m. carlson
  2017-07-05  1:49                           ` Junio C Hamano
@ 2017-07-05  7:55                           ` Jeff King
  2017-07-05  7:57                             ` [PATCH 1/6] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
                                               ` (6 more replies)
  2 siblings, 7 replies; 69+ messages in thread
From: Jeff King @ 2017-07-05  7:55 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

On Tue, Jul 04, 2017 at 05:24:08PM -0400, Jeff King wrote:

> On Tue, Jul 04, 2017 at 07:58:06PM +0000, brian m. carlson wrote:
> 
> > > And here's one more patch on top of those that's necessary to get the
> > > tests to pass (I don't expect anybody to necessarily be applying this
> > > slow string of patches; it's just to show the direction I'm looking in).
> > 
> > I've looked at your original patch, which modified reflog-walk.c, and it
> > does fix the issue.  I'm happy to send in a patch with that and a test
> > (provided you're okay with me adding your sign-off), or if you wanted to
> > send in something a bit more complete, like the series of patches here,
> > that's fine, too.
> 
> I've been on vacation for the past week, but wrapping this up is on my
> todo. I'll try to get to it tonight.

OK, so here's what I came up with.

The first patch is my original small fix with an extra test. I think
that would be appropriate for 'maint'. Its behavior still has some
quirks, but it avoids the confusion that you experienced and has a low
risk of breaking anything else.

The rest of it replaces the fake-parent thing with a more
straight-forward iteration over the reflogs (i.e., a cleanup of the
further patches I've been posting). After digging into it and especially
after writing the new tests, I think I've convinced myself that this is
the right way forward.

I tried to anticipate the behavior changes and I think all of them are
improvements. I won't be surprised if there's some hidden gotcha,
though, so this is definitely not for 'maint'. The patches do textually
depend on the fix from 1/6; my intent was that they'd be applied in
sequence and only merge up the first one to maint.

  [1/6]: reflog-walk: skip over double-null oid due to HEAD rename
  [2/6]: t1414: document some reflog-walk oddities
  [3/6]: log: do not free parents when walking reflog
  [4/6]: get_revision_1(): replace do-while with an early return
  [5/6]: rev-list: check reflog_info before showing usage
  [6/6]: reflog-walk: stop using fake parents

 builtin/log.c          |   4 +-
 builtin/rev-list.c     |   3 +-
 reflog-walk.c          | 117 ++++++++++++++-----------------------------------
 reflog-walk.h          |   6 ++-
 revision.c             |  39 ++++++++++-------
 t/t1414-reflog-walk.sh |  86 ++++++++++++++++++++++++++++++++++++
 t/t3200-branch.sh      |  10 +++++
 7 files changed, 160 insertions(+), 105 deletions(-)
 create mode 100755 t/t1414-reflog-walk.sh

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH 1/6] reflog-walk: skip over double-null oid due to HEAD rename
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
@ 2017-07-05  7:57                             ` Jeff King
  2017-07-05 17:34                               ` Junio C Hamano
  2017-07-05  8:00                             ` [PATCH 2/6] t1414: document some reflog-walk oddities Jeff King
                                               ` (5 subsequent siblings)
  6 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-07-05  7:57 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

Since 39ee4c6c2f (branch: record creation of renamed branch
in HEAD's log, 2017-02-20), a rename on the currently
checked out branch will create two entries in the HEAD
reflog: one where the branch goes away (switching to the
null oid), and one where it comes back (switching away from
the null oid).

This confuses the reflog-walk code. When walking backwards,
it first sees the null oid in the "old" field of the second
entry. Thanks to the "root commit" logic added by 71abeb753f
(reflog: continue walking the reflog past root commits,
2016-06-03), we keep looking for the next entry by scanning
the "new" field from the previous entry. But that field is
also null! We need to go just a tiny bit further, and look
at its "old" field. But with the current code, we decide the
reflog has nothing else to show and just give up. To the
user this looks like the reflog was truncated by the rename
operation, when in fact those entries are still there.

This patch does the absolute minimal fix, which is to look
back that one extra level and keep traversing.

The resulting behavior may not be the _best_ thing to do in
the long run (for example, we show both reflog entries each
with the same commit id), but it's a simple way to fix the
problem without risking further regressions.

Signed-off-by: Jeff King <peff@peff.net>
---
I do still think it would be worth looking into making this rename
create a single reflog entry, but that's largely orthogonal to making
the display code sane(r).

 reflog-walk.c     |  2 ++
 t/t3200-branch.sh | 11 +++++++++++
 2 files changed, 13 insertions(+)

diff --git a/reflog-walk.c b/reflog-walk.c
index ed99437ad2..b7e489ad32 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -259,6 +259,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 		/* a root commit, but there are still more entries to show */
 		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 		logobj = parse_object(&reflog->noid);
+		if (!logobj)
+			logobj = parse_object(&reflog->ooid);
 	}
 
 	if (!logobj || logobj->type != OBJ_COMMIT) {
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 48d152b9a9..dd37ac47c5 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -162,6 +162,17 @@ test_expect_success 'git branch -M baz bam should add entries to .git/logs/HEAD'
 	grep "^0\{40\}.*$msg$" .git/logs/HEAD
 '
 
+test_expect_success 'resulting reflog can be shown by log -g' '
+	oid=$(git rev-parse HEAD) &&
+	cat >expect <<-EOF &&
+	HEAD@{0} $oid $msg
+	HEAD@{1} $oid $msg
+	HEAD@{2} $oid checkout: moving from foo to baz
+	EOF
+	git log -g --format="%gd %H %gs" -3 HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'git branch -M baz bam should succeed when baz is checked out as linked working tree' '
 	git checkout master &&
 	git worktree add -b baz bazdir &&
-- 
2.13.2.892.g25f9b59978


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
  2017-07-05  7:57                             ` [PATCH 1/6] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
@ 2017-07-05  8:00                             ` Jeff King
  2017-07-05 17:56                               ` Junio C Hamano
  2017-07-05 20:05                               ` Ramsay Jones
  2017-07-05  8:02                             ` [PATCH 3/6] log: do not free parents when walking reflog Jeff King
                                               ` (4 subsequent siblings)
  6 siblings, 2 replies; 69+ messages in thread
From: Jeff King @ 2017-07-05  8:00 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

Since its inception, the general strategy of the reflog-walk
code has been to start with the tip commit for the ref, and
as we traverse replace each commit's parent pointers with
fake parents pointing to the previous reflog entry.

This lets us traverse the reflog as if it were a real
history, but it has some user-visible oddities. Namely:

  1. The fake parents are used for commit selection and
     display. So for example, "--merges" or "--no-merges"
     are useful, because the history appears as a linear
     string. Likewise, pathspec limiting is based on the
     diff between adjacent entries, not the changes actually
     introduced by a commit.

     These are often the same (e.g., because the entry was
     just running "git commit" and the adjacent entry _is_
     the true parent), but it may not be in several common
     cases. For instance, using "git reset" to jump around
     history, or "git checkout" to move HEAD.

  2. We reverse-map each commit back to its reflog. So when
     it comes time to show commit X, we say "a-ha, we added
     X because it was at the tip of the 'foo' reflog, so
     let's show the foo reflog". But this leads to nonsense
     results when you ask to traverse multiple reflogs: if
     two reflogs have the same tip commit, we only map back
     to one of them.

     Instead, we should show each reflog individually, in
     the order the user gave us on the command line.

  2. If the tip of the reflog and the ref tip disagree on
     the current value, we show the ref tip but give no
     indication of the value in the reflog.  This situation
     isn't supposed to happen (since any ref update should
     touch the reflog). But if it does, given that the
     requested operation is to show the reflog, it makes
     sense to prefer that.

This commit adds several expect_failure tests, to show how
the tool ought to behave.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t1414-reflog-walk.sh | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)
 create mode 100755 t/t1414-reflog-walk.sh

diff --git a/t/t1414-reflog-walk.sh b/t/t1414-reflog-walk.sh
new file mode 100755
index 0000000000..bb847f797d
--- /dev/null
+++ b/t/t1414-reflog-walk.sh
@@ -0,0 +1,83 @@
+#!/bin/sh
+
+test_description='various tests of reflog walk (log -g) behavior'
+. ./test-lib.sh
+
+test_expect_success 'set up some reflog entries' '
+	test_commit one &&
+	test_commit two &&
+	git checkout -b side HEAD^ &&
+	test_commit three &&
+	git merge --no-commit master &&
+	echo evil-merge-content >>one.t &&
+	test_tick &&
+	git commit --no-edit -a
+'
+
+do_walk () {
+	git log -g --format="%gd %gs" "$@"
+}
+
+sq="'"
+test_expect_success 'set up expected reflog' '
+	cat >expect.all <<-EOF
+	HEAD@{0} commit (merge): Merge branch ${sq}master${sq} into side
+	HEAD@{1} commit: three
+	HEAD@{2} checkout: moving from master to side
+	HEAD@{3} commit: two
+	HEAD@{4} commit (initial): one
+	EOF
+'
+
+test_expect_success 'reflog walk shows expected logs' '
+	do_walk >actual &&
+	test_cmp expect.all actual
+'
+
+test_expect_failure 'reflog can limit with --no-merges' '
+	grep -v merge expect.all >expect &&
+	do_walk --no-merges >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'reflog can limit with pathspecs' '
+	grep two expect.all >expect &&
+	do_walk -- two.t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'pathspec limiting handles merges' '
+	sed -n "1p;3p;5p" expect.all >expect &&
+	do_walk -- one.t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure '--parents shows true parents' '
+	# convert newlines to spaces
+	echo $(git rev-parse HEAD HEAD^1 HEAD^2) >expect &&
+	git rev-list -g --parents -1 HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'walking multiple reflogs shows both' '
+	{
+		do_walk HEAD &&
+		do_walk side
+	} >expect &&
+	do_walk HEAD side >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'walk prefers reflog to ref tip' '
+	head=$(git rev-parse HEAD) &&
+	one=$(git rev-parse one) &&
+	ident="$GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE" &&
+	echo "$head $one $ident	broken reflog entry" >>.git/logs/HEAD &&
+
+	echo $one >expect &&
+	git log -g --format=%H -1 >actual &&
+	test_cmp expect actual
+'
+
+
+test_done
-- 
2.13.2.892.g25f9b59978


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 3/6] log: do not free parents when walking reflog
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
  2017-07-05  7:57                             ` [PATCH 1/6] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
  2017-07-05  8:00                             ` [PATCH 2/6] t1414: document some reflog-walk oddities Jeff King
@ 2017-07-05  8:02                             ` Jeff King
  2017-07-05  8:04                             ` [PATCH 4/6] get_revision_1(): replace do-while with an early return Jeff King
                                               ` (3 subsequent siblings)
  6 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-05  8:02 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

When we're doing a reflog walk (instead of walking the
actual parent pointers), we may see commits multiple times.
For this reason, we hold on to the commit buffer for each
commit rather than freeing it after we've showed the commit.

We should do the same for the parent list. Right now this is
just a minor optimization. But once we refactor how reflog
walks are performed, keeping the parents will avoid
confusing us the second time we see the commit.

Signed-off-by: Jeff King <peff@peff.net>
---
I didn't dig deep into the details of "confusing", so there may be other
ways to solve it. But this seems like the right thing to do regardless.

 builtin/log.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 8ca1de9894..9c8bb3b5c3 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -374,9 +374,9 @@ static int cmd_log_walk(struct rev_info *rev)
 		if (!rev->reflog_info) {
 			/* we allow cycles in reflog ancestry */
 			free_commit_buffer(commit);
+			free_commit_list(commit->parents);
+			commit->parents = NULL;
 		}
-		free_commit_list(commit->parents);
-		commit->parents = NULL;
 		if (saved_nrl < rev->diffopt.needed_rename_limit)
 			saved_nrl = rev->diffopt.needed_rename_limit;
 		if (rev->diffopt.degraded_cc_to_c)
-- 
2.13.2.892.g25f9b59978


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 4/6] get_revision_1(): replace do-while with an early return
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
                                               ` (2 preceding siblings ...)
  2017-07-05  8:02                             ` [PATCH 3/6] log: do not free parents when walking reflog Jeff King
@ 2017-07-05  8:04                             ` Jeff King
  2017-07-05  8:06                             ` [PATCH 5/6] rev-list: check reflog_info before showing usage Jeff King
                                               ` (2 subsequent siblings)
  6 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-05  8:04 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

The get_revision_1() function tries to avoid entering its
main loop at all when there are no commits to look at. But
it's perfectly safe to call pop_commit() on an empty list
(in which case it will return NULL). Switching to an early
return from the loop lets us skip repeating the loop
condition before we enter the do-while. That will get more
important when we start pulling reflog-walk commits from a
source besides the revs->commits queue, as that condition
will get much more complicated.

Signed-off-by: Jeff King <peff@peff.net>
---
 revision.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/revision.c b/revision.c
index e181ad1b70..4dc7c63654 100644
--- a/revision.c
+++ b/revision.c
@@ -3109,12 +3109,12 @@ static void track_linear(struct rev_info *revs, struct commit *commit)
 
 static struct commit *get_revision_1(struct rev_info *revs)
 {
-	if (!revs->commits)
-		return NULL;
-
-	do {
+	while (1) {
 		struct commit *commit = pop_commit(&revs->commits);
 
+		if (!commit)
+			return NULL;
+
 		if (revs->reflog_info) {
 			save_parents(revs, commit);
 			fake_reflog_parent(revs->reflog_info, commit);
@@ -3148,8 +3148,7 @@ static struct commit *get_revision_1(struct rev_info *revs)
 				track_linear(revs, commit);
 			return commit;
 		}
-	} while (revs->commits);
-	return NULL;
+	}
 }
 
 /*
-- 
2.13.2.892.g25f9b59978


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 5/6] rev-list: check reflog_info before showing usage
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
                                               ` (3 preceding siblings ...)
  2017-07-05  8:04                             ` [PATCH 4/6] get_revision_1(): replace do-while with an early return Jeff King
@ 2017-07-05  8:06                             ` Jeff King
  2017-07-05 18:07                               ` Junio C Hamano
  2017-07-05  8:09                             ` [PATCH 6/6] reflog-walk: stop using fake parents Jeff King
  2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
  6 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-07-05  8:06 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

When git-rev-list sees no pending commits, it shows a usage
message. This works even when reflog-walking is requested,
because the reflog-walk code currently puts the reflog tips
into the pending queue.

In preparation for refactoring the reflog-walk code, let's
explicitly check whether we have any reflogs to walk. For
now this is a noop, but the existing reflog tests will make
sure that it kicks in after the refactoring. Likewise, we'll
add a test that "rev-list -g" without specifying any reflogs
continues to fail (so that we know our check does not kick
in too aggressively).

Note that the implementation needs to go into its own
sub-function, as the walk code does not expose its innards
outside of reflog-walk.c.

Signed-off-by: Jeff King <peff@peff.net>
---
This is actually the main "gotcha" I'm worried about with this series.
I'm not sure if any other code would care about seeing the pending items
in revs->commits. I still think the series is the right direction; if
there is such a place, we'd want to teach it to handle reflog walking in
a similar way, too.

 builtin/rev-list.c     | 3 ++-
 reflog-walk.c          | 5 +++++
 reflog-walk.h          | 2 ++
 t/t1414-reflog-walk.sh | 3 +++
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 95d84d5cda..53a746dd89 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -11,6 +11,7 @@
 #include "graph.h"
 #include "bisect.h"
 #include "progress.h"
+#include "reflog-walk.h"
 
 static const char rev_list_usage[] =
 "git rev-list [OPTION] <commit-id>... [ -- paths... ]\n"
@@ -348,7 +349,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 		/* Only --header was specified */
 		revs.commit_format = CMIT_FMT_RAW;
 
-	if ((!revs.commits &&
+	if ((!revs.commits && reflog_walk_empty(revs.reflog_info) &&
 	     (!(revs.tag_objects || revs.tree_objects || revs.blob_objects) &&
 	      !revs.pending.nr)) ||
 	    revs.diff)
diff --git a/reflog-walk.c b/reflog-walk.c
index b7e489ad32..89e719c459 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -358,3 +358,8 @@ void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 		strbuf_release(&selector);
 	}
 }
+
+int reflog_walk_empty(struct reflog_walk_info *info)
+{
+	return !info || !info->reflogs.nr;
+}
diff --git a/reflog-walk.h b/reflog-walk.h
index 27886f793e..af32361072 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -20,4 +20,6 @@ extern void get_reflog_selector(struct strbuf *sb,
 		const struct date_mode *dmode, int force_date,
 		int shorten);
 
+extern int reflog_walk_empty(struct reflog_walk_info *walk);
+
 #endif
diff --git a/t/t1414-reflog-walk.sh b/t/t1414-reflog-walk.sh
index bb847f797d..fba6788e94 100755
--- a/t/t1414-reflog-walk.sh
+++ b/t/t1414-reflog-walk.sh
@@ -79,5 +79,8 @@ test_expect_failure 'walk prefers reflog to ref tip' '
 	test_cmp expect actual
 '
 
+test_expect_success 'rev-list -g complains when there are no reflogs' '
+	test_must_fail git rev-list -g
+'
 
 test_done
-- 
2.13.2.892.g25f9b59978


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 6/6] reflog-walk: stop using fake parents
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
                                               ` (4 preceding siblings ...)
  2017-07-05  8:06                             ` [PATCH 5/6] rev-list: check reflog_info before showing usage Jeff King
@ 2017-07-05  8:09                             ` Jeff King
  2017-07-07  0:32                               ` Eric Wong
  2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
  6 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-07-05  8:09 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua

The reflog-walk system works by putting a ref's tip into the
pending queue, and then "traversing" the reflog by
pretending that the parent of each commit is the previous
reflog entry.

This causes a number of user-visible oddities, as documented
in t1414 (and the commit message which introduced it). We
can fix all of them in one go by replacing the fake-reflog
system with a much simpler one: just keeping a list of
reflogs to show, and walking through them entry by entry.

The implementation is fairly straight-forward, but there are
a few items to note:

  1. We obviously must skip calling add_parents_to_list()
     when we are traversing reflogs, since we do not want to
     walk the original parents at all.

     As a result, we must call try_to_simplify_commit()
     ourselves and skip any TREESAME commits.

     There are other parts of add_parents_to_list() we skip,
     as well, but none of them should matter for a reflog
     traversal:

       -  We do not allow UNINTERESTING commits, nor
          symmetric ranges (and we bail when these are used
          with "-g").

       - Using --source makes no sense, since we aren't
         traversing. The reflog selector shows the same
         information with more detail.

       - Using --first-parent is still sensible, since you
         may want to see the first-parent diff for each
         entry. But since we're not traversing, we don't
         need to cull the parent list here.

  2. Since we now just walk the reflog entries themselves,
     rather than starting with the ref tip, we now look at
     the "new" field of each entry rather than the "old"
     (i.e., we are showing entries, not faking parents).
     This removes all of the tricky logic around skipping
     past root commits.

     But note that we have no way to show an entry with the
     null sha1 in its "new" field (because such a commit
     obviously does not exist). Normally this would not
     happen, since we delete reflogs along with refs, but
     there is one special case. When we rename the currently
     checked out branch, we write two reflog entries into
     the HEAD log: one where the commit goes away, and
     another where it comes back.

     Prior to this commit, we show both entries with
     identical reflog messages. After this commit, we show
     only the "comes back" entry. See the update in t3200
     which demonstrates this.

     Arguably either is fine, as the whole double-entry
     thing is a bit hacky in the first place. And until a
     recent fix, we truncated the traversal in such a case
     anyway, which was _definitely_ wrong.

Signed-off-by: Jeff King <peff@peff.net>
---
 reflog-walk.c          | 116 +++++++++++++------------------------------------
 reflog-walk.h          |   4 +-
 revision.c             |  30 ++++++++-----
 t/t1414-reflog-walk.sh |  12 ++---
 t/t3200-branch.sh      |   3 +-
 5 files changed, 57 insertions(+), 108 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 89e719c459..a7644d944e 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -78,45 +78,6 @@ static int get_reflog_recno_by_time(struct complete_reflogs *array,
 	return -1;
 }
 
-struct commit_info_lifo {
-	struct commit_info {
-		struct commit *commit;
-		void *util;
-	} *items;
-	int nr, alloc;
-};
-
-static struct commit_info *get_commit_info(struct commit *commit,
-		struct commit_info_lifo *lifo, int pop)
-{
-	int i;
-	for (i = 0; i < lifo->nr; i++)
-		if (lifo->items[i].commit == commit) {
-			struct commit_info *result = &lifo->items[i];
-			if (pop) {
-				if (i + 1 < lifo->nr)
-					memmove(lifo->items + i,
-						lifo->items + i + 1,
-						(lifo->nr - i) *
-						sizeof(struct commit_info));
-				lifo->nr--;
-			}
-			return result;
-		}
-	return NULL;
-}
-
-static void add_commit_info(struct commit *commit, void *util,
-		struct commit_info_lifo *lifo)
-{
-	struct commit_info *info;
-	ALLOC_GROW(lifo->items, lifo->nr + 1, lifo->alloc);
-	info = lifo->items + lifo->nr;
-	info->commit = commit;
-	info->util = util;
-	lifo->nr++;
-}
-
 struct commit_reflog {
 	int recno;
 	enum selector_type {
@@ -128,7 +89,8 @@ struct commit_reflog {
 };
 
 struct reflog_walk_info {
-	struct commit_info_lifo reflogs;
+	struct commit_reflog **logs;
+	size_t nr, alloc, cur;
 	struct string_list complete_reflogs;
 	struct commit_reflog *last_commit_reflog;
 };
@@ -226,52 +188,10 @@ int add_reflog_for_walk(struct reflog_walk_info *info,
 	commit_reflog->selector = selector;
 	commit_reflog->reflogs = reflogs;
 
-	add_commit_info(commit, commit_reflog, &info->reflogs);
-	return 0;
-}
-
-void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
-{
-	struct commit_info *commit_info =
-		get_commit_info(commit, &info->reflogs, 0);
-	struct commit_reflog *commit_reflog;
-	struct object *logobj;
-	struct reflog_info *reflog;
-
-	info->last_commit_reflog = NULL;
-	if (!commit_info)
-		return;
+	ALLOC_GROW(info->logs, info->nr + 1, info->alloc);
+	info->logs[info->nr++] = commit_reflog;
 
-	commit_reflog = commit_info->util;
-	if (commit_reflog->recno < 0) {
-		commit->parents = NULL;
-		return;
-	}
-	info->last_commit_reflog = commit_reflog;
-
-	do {
-		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
-		commit_reflog->recno--;
-		logobj = parse_object(&reflog->ooid);
-	} while (commit_reflog->recno && (logobj && logobj->type != OBJ_COMMIT));
-
-	if (!logobj && commit_reflog->recno >= 0 && is_null_oid(&reflog->ooid)) {
-		/* a root commit, but there are still more entries to show */
-		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
-		logobj = parse_object(&reflog->noid);
-		if (!logobj)
-			logobj = parse_object(&reflog->ooid);
-	}
-
-	if (!logobj || logobj->type != OBJ_COMMIT) {
-		commit_info->commit = NULL;
-		commit->parents = NULL;
-		return;
-	}
-	commit_info->commit = (struct commit *)logobj;
-
-	commit->parents = xcalloc(1, sizeof(struct commit_list));
-	commit->parents->item = commit_info->commit;
+	return 0;
 }
 
 void get_reflog_selector(struct strbuf *sb,
@@ -361,5 +281,29 @@ void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 
 int reflog_walk_empty(struct reflog_walk_info *info)
 {
-	return !info || !info->reflogs.nr;
+	return !info || !info->nr;
+}
+
+struct commit *next_entry_in_log(struct commit_reflog *log)
+{
+	while (log->recno >= 0) {
+		struct reflog_info *entry = &log->reflogs->items[log->recno--];
+		struct object *obj = parse_object(&entry->noid);
+
+		if (obj && obj->type == OBJ_COMMIT)
+			return (struct commit *)obj;
+	}
+	return NULL;
+}
+
+struct commit *next_reflog_entry(struct reflog_walk_info *walk)
+{
+	for (; walk->cur < walk->nr; walk->cur++) {
+		struct commit *ret = next_entry_in_log(walk->logs[walk->cur]);
+		if (ret) {
+			walk->last_commit_reflog = walk->logs[walk->cur];
+			return ret;
+		}
+	}
+	return NULL;
 }
diff --git a/reflog-walk.h b/reflog-walk.h
index af32361072..373388cd14 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -8,8 +8,6 @@ struct reflog_walk_info;
 extern void init_reflog_walk(struct reflog_walk_info **info);
 extern int add_reflog_for_walk(struct reflog_walk_info *info,
 		struct commit *commit, const char *name);
-extern void fake_reflog_parent(struct reflog_walk_info *info,
-		struct commit *commit);
 extern void show_reflog_message(struct reflog_walk_info *info, int,
 				const struct date_mode *, int force_date);
 extern void get_reflog_message(struct strbuf *sb,
@@ -22,4 +20,6 @@ extern void get_reflog_selector(struct strbuf *sb,
 
 extern int reflog_walk_empty(struct reflog_walk_info *walk);
 
+struct commit *next_reflog_entry(struct reflog_walk_info *reflog_info);
+
 #endif
diff --git a/revision.c b/revision.c
index 4dc7c63654..5fc01f2d26 100644
--- a/revision.c
+++ b/revision.c
@@ -148,16 +148,14 @@ static void add_pending_object_with_path(struct rev_info *revs,
 	if (revs->reflog_info && obj->type == OBJ_COMMIT) {
 		struct strbuf buf = STRBUF_INIT;
 		int len = interpret_branch_name(name, 0, &buf, 0);
-		int st;
 
 		if (0 < len && name[len] && buf.len)
 			strbuf_addstr(&buf, name + len);
-		st = add_reflog_for_walk(revs->reflog_info,
-					 (struct commit *)obj,
-					 buf.buf[0] ? buf.buf: name);
+		add_reflog_for_walk(revs->reflog_info,
+				    (struct commit *)obj,
+				    buf.buf[0] ? buf.buf: name);
 		strbuf_release(&buf);
-		if (st)
-			return;
+		return; /* do not add the commit itself */
 	}
 	add_object_array_with_path(obj, name, &revs->pending, mode, path);
 }
@@ -3110,16 +3108,18 @@ static void track_linear(struct rev_info *revs, struct commit *commit)
 static struct commit *get_revision_1(struct rev_info *revs)
 {
 	while (1) {
-		struct commit *commit = pop_commit(&revs->commits);
+		struct commit *commit;
+
+		if (revs->reflog_info)
+			commit = next_reflog_entry(revs->reflog_info);
+		else
+			commit = pop_commit(&revs->commits);
 
 		if (!commit)
 			return NULL;
 
-		if (revs->reflog_info) {
-			save_parents(revs, commit);
-			fake_reflog_parent(revs->reflog_info, commit);
+		if (revs->reflog_info)
 			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
-		}
 
 		/*
 		 * If we haven't done the list limiting, we need to look at
@@ -3130,7 +3130,8 @@ static struct commit *get_revision_1(struct rev_info *revs)
 			if (revs->max_age != -1 &&
 			    (commit->date < revs->max_age))
 				continue;
-			if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
+			if (!revs->reflog_info &&
+			    add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
 				if (!revs->ignore_missing_links)
 					die("Failed to traverse parents of commit %s",
 						oid_to_hex(&commit->object.oid));
@@ -3146,6 +3147,11 @@ static struct commit *get_revision_1(struct rev_info *revs)
 		default:
 			if (revs->track_linear)
 				track_linear(revs, commit);
+			if (revs->reflog_info) {
+				try_to_simplify_commit(revs, commit);
+				if (commit->object.flags & TREESAME)
+					continue;
+			}
 			return commit;
 		}
 	}
diff --git a/t/t1414-reflog-walk.sh b/t/t1414-reflog-walk.sh
index fba6788e94..7cc52d946d 100755
--- a/t/t1414-reflog-walk.sh
+++ b/t/t1414-reflog-walk.sh
@@ -34,32 +34,32 @@ test_expect_success 'reflog walk shows expected logs' '
 	test_cmp expect.all actual
 '
 
-test_expect_failure 'reflog can limit with --no-merges' '
+test_expect_success 'reflog can limit with --no-merges' '
 	grep -v merge expect.all >expect &&
 	do_walk --no-merges >actual &&
 	test_cmp expect actual
 '
 
-test_expect_failure 'reflog can limit with pathspecs' '
+test_expect_success 'reflog can limit with pathspecs' '
 	grep two expect.all >expect &&
 	do_walk -- two.t >actual &&
 	test_cmp expect actual
 '
 
-test_expect_failure 'pathspec limiting handles merges' '
+test_expect_success 'pathspec limiting handles merges' '
 	sed -n "1p;3p;5p" expect.all >expect &&
 	do_walk -- one.t >actual &&
 	test_cmp expect actual
 '
 
-test_expect_failure '--parents shows true parents' '
+test_expect_success '--parents shows true parents' '
 	# convert newlines to spaces
 	echo $(git rev-parse HEAD HEAD^1 HEAD^2) >expect &&
 	git rev-list -g --parents -1 HEAD >actual &&
 	test_cmp expect actual
 '
 
-test_expect_failure 'walking multiple reflogs shows both' '
+test_expect_success 'walking multiple reflogs shows both' '
 	{
 		do_walk HEAD &&
 		do_walk side
@@ -68,7 +68,7 @@ test_expect_failure 'walking multiple reflogs shows both' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'walk prefers reflog to ref tip' '
+test_expect_success 'walk prefers reflog to ref tip' '
 	head=$(git rev-parse HEAD) &&
 	one=$(git rev-parse one) &&
 	ident="$GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE" &&
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index dd37ac47c5..9d707d2a40 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -166,10 +166,9 @@ test_expect_success 'resulting reflog can be shown by log -g' '
 	oid=$(git rev-parse HEAD) &&
 	cat >expect <<-EOF &&
 	HEAD@{0} $oid $msg
-	HEAD@{1} $oid $msg
 	HEAD@{2} $oid checkout: moving from foo to baz
 	EOF
-	git log -g --format="%gd %H %gs" -3 HEAD >actual &&
+	git log -g --format="%gd %H %gs" -2 HEAD >actual &&
 	test_cmp expect actual
 '
 
-- 
2.13.2.892.g25f9b59978

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 1/6] reflog-walk: skip over double-null oid due to HEAD rename
  2017-07-05  7:57                             ` [PATCH 1/6] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
@ 2017-07-05 17:34                               ` Junio C Hamano
  2017-07-05 21:21                                 ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-07-05 17:34 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> Since 39ee4c6c2f (branch: record creation of renamed branch
> in HEAD's log, 2017-02-20), a rename on the currently
> checked out branch will create two entries in the HEAD
> reflog: one where the branch goes away (switching to the
> null oid), and one where it comes back (switching away from
> the null oid).
> ...
> The resulting behavior may not be the _best_ thing to do in
> the long run (for example, we show both reflog entries each
> with the same commit id), but it's a simple way to fix the
> problem without risking further regressions.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I do still think it would be worth looking into making this rename
> create a single reflog entry, but that's largely orthogonal to making
> the display code sane(r).

I agree with this assessment.

>  reflog-walk.c     |  2 ++
>  t/t3200-branch.sh | 11 +++++++++++
>  2 files changed, 13 insertions(+)
>
> diff --git a/reflog-walk.c b/reflog-walk.c
> index ed99437ad2..b7e489ad32 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -259,6 +259,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  		/* a root commit, but there are still more entries to show */
>  		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>  		logobj = parse_object(&reflog->noid);
> +		if (!logobj)
> +			logobj = parse_object(&reflog->ooid);
>  	}

For the current 'maint', this would need to be backported to the
uchar[20] interface (which is trivial to do, and merging it upwards
while adjusting it back to "struct object_id" is also trivial; there
is no need to resend).

Thanks.  Will queue.

>  	if (!logobj || logobj->type != OBJ_COMMIT) {
> diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
> index 48d152b9a9..dd37ac47c5 100755
> --- a/t/t3200-branch.sh
> +++ b/t/t3200-branch.sh
> @@ -162,6 +162,17 @@ test_expect_success 'git branch -M baz bam should add entries to .git/logs/HEAD'
>  	grep "^0\{40\}.*$msg$" .git/logs/HEAD
>  '
>  
> +test_expect_success 'resulting reflog can be shown by log -g' '
> +	oid=$(git rev-parse HEAD) &&
> +	cat >expect <<-EOF &&
> +	HEAD@{0} $oid $msg
> +	HEAD@{1} $oid $msg
> +	HEAD@{2} $oid checkout: moving from foo to baz
> +	EOF
> +	git log -g --format="%gd %H %gs" -3 HEAD >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'git branch -M baz bam should succeed when baz is checked out as linked working tree' '
>  	git checkout master &&
>  	git worktree add -b baz bazdir &&

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-05  8:00                             ` [PATCH 2/6] t1414: document some reflog-walk oddities Jeff King
@ 2017-07-05 17:56                               ` Junio C Hamano
  2017-07-05 21:27                                 ` Jeff King
  2017-07-05 20:05                               ` Ramsay Jones
  1 sibling, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-07-05 17:56 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> Since its inception, the general strategy of the reflog-walk
> code has been to start with the tip commit for the ref, and
> as we traverse replace each commit's parent pointers with
> fake parents pointing to the previous reflog entry.
>
> This lets us traverse the reflog as if it were a real
> history, but it has some user-visible oddities. Namely:
>
>   1. The fake parents are used for commit selection and
>      display. So for example, "--merges" or "--no-merges"
>      are useful, because the history appears as a linear

s/useful/useless/ perhaps?

>      string. Likewise, pathspec limiting is based on the
>      diff between adjacent entries, not the changes actually
>      introduced by a commit.
> ...
> This commit adds several expect_failure tests, to show how
> the tool ought to behave.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---

> diff --git a/t/t1414-reflog-walk.sh b/t/t1414-reflog-walk.sh
> new file mode 100755
> index 0000000000..bb847f797d
> --- /dev/null
> +++ b/t/t1414-reflog-walk.sh
> @@ -0,0 +1,83 @@
> +#!/bin/sh
> +
> +test_description='various tests of reflog walk (log -g) behavior'
> +. ./test-lib.sh
> +
> +test_expect_success 'set up some reflog entries' '
> +	test_commit one &&
> +	test_commit two &&
> +	git checkout -b side HEAD^ &&
> +	test_commit three &&
> +	git merge --no-commit master &&
> +	echo evil-merge-content >>one.t &&
> +	test_tick &&
> +	git commit --no-edit -a
> +'

Mental note: logically, what we want to see in the log are:

    master:  one-->two
    side:                one-->three-->(evil)
    HEAD:    one-->two-->one-->three-->(evil)

where the middle one in HEAD is "switching from master to side".

> +test_expect_failure 'pathspec limiting handles merges' '
> +	sed -n "1p;3p;5p" expect.all >expect &&
> +	do_walk -- one.t >actual &&
> +	test_cmp expect actual
> +'

OK (it was a bit tricky to see why the topmost one should, but the
evilness of the merge makes it eligible).

> +test_expect_failure '--parents shows true parents' '
> +	# convert newlines to spaces
> +	echo $(git rev-parse HEAD HEAD^1 HEAD^2) >expect &&

I saw something related to this nearby this morning.  Good thinking
to add a comment on this 'echo' ;-).

> +	git rev-list -g --parents -1 HEAD >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_failure 'walking multiple reflogs shows both' '
> +	{
> +		do_walk HEAD &&
> +		do_walk side
> +	} >expect &&
> +	do_walk HEAD side >actual &&
> +	test_cmp expect actual
> +'

I somehow find this one a bit iffy.  

The order that commits appear in the "walk from HEAD and side at the
same time" may want to be different from what this test expects.
"rev-list --since=3.days -g master next", for example, would want to
refrain from reading the reflog of 'master' for all 90 days before
switching to the reflog of 'next', no?

All others I did not comment on and omitted from quoting make sense
to me.

Thanks.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] rev-list: check reflog_info before showing usage
  2017-07-05  8:06                             ` [PATCH 5/6] rev-list: check reflog_info before showing usage Jeff King
@ 2017-07-05 18:07                               ` Junio C Hamano
  2017-07-05 21:31                                 ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-07-05 18:07 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> When git-rev-list sees no pending commits, it shows a usage
> message. This works even when reflog-walking is requested,
> because the reflog-walk code currently puts the reflog tips
> into the pending queue.
>
> In preparation for refactoring the reflog-walk code, let's
> explicitly check whether we have any reflogs to walk. For
> now this is a noop, but the existing reflog tests will make
> sure that it kicks in after the refactoring. Likewise, we'll
> add a test that "rev-list -g" without specifying any reflogs
> continues to fail (so that we know our check does not kick
> in too aggressively).
>
> Note that the implementation needs to go into its own
> sub-function, as the walk code does not expose its innards
> outside of reflog-walk.c.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> This is actually the main "gotcha" I'm worried about with this series.
> I'm not sure if any other code would care about seeing the pending items
> in revs->commits. I still think the series is the right direction; if
> there is such a place, we'd want to teach it to handle reflog walking in
> a similar way, too.

Ah, very good thinking---when I was following along your drafts to
bypass the revs.commits queue for reflog walking, I didn't think of
this one.  

Perhaps "!revs.commits && reflog_walk_empty(revs.reflog_info)" may
want to become a macro "walk_finished(&revs)" or something to
replace existing !revs.commits that is not accompanied by the check
on .reflog_info field?


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-05  8:00                             ` [PATCH 2/6] t1414: document some reflog-walk oddities Jeff King
  2017-07-05 17:56                               ` Junio C Hamano
@ 2017-07-05 20:05                               ` Ramsay Jones
  2017-07-05 21:30                                 ` Jeff King
  1 sibling, 1 reply; 69+ messages in thread
From: Ramsay Jones @ 2017-07-05 20:05 UTC (permalink / raw)
  To: Jeff King, brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua



On 05/07/17 09:00, Jeff King wrote:
> Since its inception, the general strategy of the reflog-walk
> code has been to start with the tip commit for the ref, and
> as we traverse replace each commit's parent pointers with
> fake parents pointing to the previous reflog entry.
> 
> This lets us traverse the reflog as if it were a real
> history, but it has some user-visible oddities. Namely:
> 
>   1. The fake parents are used for commit selection and
>      display. So for example, "--merges" or "--no-merges"
>      are useful, because the history appears as a linear
>      string. Likewise, pathspec limiting is based on the
>      diff between adjacent entries, not the changes actually
>      introduced by a commit.
> 
>      These are often the same (e.g., because the entry was
>      just running "git commit" and the adjacent entry _is_
>      the true parent), but it may not be in several common
>      cases. For instance, using "git reset" to jump around
>      history, or "git checkout" to move HEAD.
> 
>   2. We reverse-map each commit back to its reflog. So when
>      it comes time to show commit X, we say "a-ha, we added
>      X because it was at the tip of the 'foo' reflog, so
>      let's show the foo reflog". But this leads to nonsense
>      results when you ask to traverse multiple reflogs: if
>      two reflogs have the same tip commit, we only map back
>      to one of them.
> 
>      Instead, we should show each reflog individually, in
>      the order the user gave us on the command line.
> 
>   2. If the tip of the reflog and the ref tip disagree on
    ^^
It seems hard to get off the second point! ;-)

ATB,
Ramsay Jones


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 1/6] reflog-walk: skip over double-null oid due to HEAD rename
  2017-07-05 17:34                               ` Junio C Hamano
@ 2017-07-05 21:21                                 ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-05 21:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

On Wed, Jul 05, 2017 at 10:34:03AM -0700, Junio C Hamano wrote:

> > +		if (!logobj)
> > +			logobj = parse_object(&reflog->ooid);
> >  	}
> 
> For the current 'maint', this would need to be backported to the
> uchar[20] interface (which is trivial to do, and merging it upwards
> while adjusting it back to "struct object_id" is also trivial; there
> is no need to resend).
> 
> Thanks.  Will queue.

Ah, right. I should probably actually apply the patches on "maint"
before suggesting you do so. ;)

In theory it should be applied directly on 39ee4c6c2 and then merged up
(but I think the same wiggling would apply either way).

The rest of the patches would need to be adjusted on top, but it should
be easy to resolve; all of that code just goes away.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-05 17:56                               ` Junio C Hamano
@ 2017-07-05 21:27                                 ` Jeff King
  2017-07-05 22:45                                   ` Junio C Hamano
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-07-05 21:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

On Wed, Jul 05, 2017 at 10:56:42AM -0700, Junio C Hamano wrote:

> >   1. The fake parents are used for commit selection and
> >      display. So for example, "--merges" or "--no-merges"
> >      are useful, because the history appears as a linear
> 
> s/useful/useless/ perhaps?

Oops, yes ("not useful" is probably what I was going for).

> > +test_expect_success 'set up some reflog entries' '
> > +	test_commit one &&
> > +	test_commit two &&
> > +	git checkout -b side HEAD^ &&
> > +	test_commit three &&
> > +	git merge --no-commit master &&
> > +	echo evil-merge-content >>one.t &&
> > +	test_tick &&
> > +	git commit --no-edit -a
> > +'
> 
> Mental note: logically, what we want to see in the log are:
> 
>     master:  one-->two
>     side:                one-->three-->(evil)
>     HEAD:    one-->two-->one-->three-->(evil)
> 
> where the middle one in HEAD is "switching from master to side".

Yeah. I was tempted to document that, but I think the expect.all mostly
does that for HEAD (and I don't really check the others). The grossest
thing is this numeric selection:

> > +test_expect_failure 'pathspec limiting handles merges' '
> > +	sed -n "1p;3p;5p" expect.all >expect &&

I tried to think of an easy way to pick them out by name but couldn't
come up with one. I don't know if that sed invocation deserves a comment
or not.

> > +	do_walk -- one.t >actual &&
> > +	test_cmp expect actual
> > +'
> 
> OK (it was a bit tricky to see why the topmost one should, but the
> evilness of the merge makes it eligible).

Yeah, that was why I added the evilness. A more real-world test would
perhaps be a file with an actual conflict that got resolved (but not
matching either parent exactly). I actually started by adding the evil
content to "three", which is a bit closer. But it passes even without
the patch, because the diff to the first parent still matches.

So I dunno. I think it's OK as is. My main goal was just to make sure
that my TREESAME hackery catches both normal and merge commits.

> > +test_expect_failure 'walking multiple reflogs shows both' '
> > +	{
> > +		do_walk HEAD &&
> > +		do_walk side
> > +	} >expect &&
> > +	do_walk HEAD side >actual &&
> > +	test_cmp expect actual
> > +'
> 
> I somehow find this one a bit iffy.  
> 
> The order that commits appear in the "walk from HEAD and side at the
> same time" may want to be different from what this test expects.
> "rev-list --since=3.days -g master next", for example, would want to
> refrain from reading the reflog of 'master' for all 90 days before
> switching to the reflog of 'next', no?

I did make the ordering intentional. We process each reflog sequentially
in turn. I don't think it would be wrong to interleave them by date, but
I actually think it's useful for somebody who switches that behavior to
consciously update the test. Maybe it's worth calling out in the commit
message.

I stopped short of actually documenting and guaranteeing that behavior
to the user. I don't know how comfortable we would be changing it later.

(As an aside, I also prowled through the documentation looking for any
guarantees we make to the user about the fake-parent thing, but I
couldn't find any. So I considered its user-visible effects an unwanted
side effect of the implementation).

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-05 20:05                               ` Ramsay Jones
@ 2017-07-05 21:30                                 ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-05 21:30 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: brian m. carlson, Junio C Hamano, Kyle Meyer, git, Sahil Dua

On Wed, Jul 05, 2017 at 09:05:14PM +0100, Ramsay Jones wrote:

> On 05/07/17 09:00, Jeff King wrote:
> > Since its inception, the general strategy of the reflog-walk
> > code has been to start with the tip commit for the ref, and
> > as we traverse replace each commit's parent pointers with
> > fake parents pointing to the previous reflog entry.
> > 
> > This lets us traverse the reflog as if it were a real
> > history, but it has some user-visible oddities. Namely:
> > 
> >   1. The fake parents are used for commit selection and
> >      display. So for example, "--merges" or "--no-merges"
> >      are useful, because the history appears as a linear
> >      string. Likewise, pathspec limiting is based on the
> >      diff between adjacent entries, not the changes actually
> >      introduced by a commit.
> > 
> >      These are often the same (e.g., because the entry was
> >      just running "git commit" and the adjacent entry _is_
> >      the true parent), but it may not be in several common
> >      cases. For instance, using "git reset" to jump around
> >      history, or "git checkout" to move HEAD.
> > 
> >   2. We reverse-map each commit back to its reflog. So when
> >      it comes time to show commit X, we say "a-ha, we added
> >      X because it was at the tip of the 'foo' reflog, so
> >      let's show the foo reflog". But this leads to nonsense
> >      results when you ask to traverse multiple reflogs: if
> >      two reflogs have the same tip commit, we only map back
> >      to one of them.
> > 
> >      Instead, we should show each reflog individually, in
> >      the order the user gave us on the command line.
> > 
> >   2. If the tip of the reflog and the ref tip disagree on
>     ^^
> It seems hard to get off the second point! ;-)

Heh. You know, I even remember checking that I didn't screw that up
(because I originally wrote the first and third, and later went back to
add in the second point). But somehow I botched the proofreading, too.

I'll plan to re-roll this to update the little bits you and Junio have
pointed out. Junio, I'll probably just do the whole thing on "maint" and
then the merge-up to "master" should be straight-forward, I'd think.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] rev-list: check reflog_info before showing usage
  2017-07-05 18:07                               ` Junio C Hamano
@ 2017-07-05 21:31                                 ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-05 21:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

On Wed, Jul 05, 2017 at 11:07:35AM -0700, Junio C Hamano wrote:

> > This is actually the main "gotcha" I'm worried about with this series.
> > I'm not sure if any other code would care about seeing the pending items
> > in revs->commits. I still think the series is the right direction; if
> > there is such a place, we'd want to teach it to handle reflog walking in
> > a similar way, too.
> 
> Ah, very good thinking---when I was following along your drafts to
> bypass the revs.commits queue for reflog walking, I didn't think of
> this one.  
> 
> Perhaps "!revs.commits && reflog_walk_empty(revs.reflog_info)" may
> want to become a macro "walk_finished(&revs)" or something to
> replace existing !revs.commits that is not accompanied by the check
> on .reflog_info field?

Yeah, I had the same thought. But I don't think it's really worth adding
a helper if we don't end up with another call-site. So I was stalling to
see if that happened.

(And just for reference in case we do end up adding it, it's not so much
walk_finished() as walk_empty_in_the_first_place()).

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-05 21:27                                 ` Jeff King
@ 2017-07-05 22:45                                   ` Junio C Hamano
  2017-07-06  7:16                                     ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-07-05 22:45 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

Jeff King <peff@peff.net> writes:

>> > +test_expect_failure 'walking multiple reflogs shows both' '
>> > +	{
>> > +		do_walk HEAD &&
>> > +		do_walk side
>> > +	} >expect &&
>> > +	do_walk HEAD side >actual &&
>> > +	test_cmp expect actual
>> > +'
>> 
>> I somehow find this one a bit iffy.  
>> 
>> The order that commits appear in the "walk from HEAD and side at the
>> same time" may want to be different from what this test expects.
>> "rev-list --since=3.days -g master next", for example, would want to
>> refrain from reading the reflog of 'master' for all 90 days before
>> switching to the reflog of 'next', no?
>
> I did make the ordering intentional. We process each reflog sequentially
> in turn. I don't think it would be wrong to interleave them by date, but
> I actually think it's useful for somebody who switches that behavior to
> consciously update the test. Maybe it's worth calling out in the commit
> message.
>
> I stopped short of actually documenting and guaranteeing that behavior
> to the user. I don't know how comfortable we would be changing it later.

I somehow feel that the "showing all entries from HEAD and then
showing all from side" is simply a buggy behaviour.  I do not think
our users would terribly mind if we changed it later.  But I may be
missing the reason why (sometimes?) the sequential behaviour may be
useful.

> (As an aside, I also prowled through the documentation looking for any
> guarantees we make to the user about the fake-parent thing, but I
> couldn't find any. So I considered its user-visible effects an unwanted
> side effect of the implementation).

Yes, I think the corner cases you documented here in these tests are
not something we desired to have in the first place.  Rather they
are merely fallouts from the hackyness of the implementation.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-05 22:45                                   ` Junio C Hamano
@ 2017-07-06  7:16                                     ` Jeff King
  2017-07-06  7:55                                       ` Jeff King
  2017-07-06 15:42                                       ` Junio C Hamano
  0 siblings, 2 replies; 69+ messages in thread
From: Jeff King @ 2017-07-06  7:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

On Wed, Jul 05, 2017 at 03:45:03PM -0700, Junio C Hamano wrote:

> > I did make the ordering intentional. We process each reflog sequentially
> > in turn. I don't think it would be wrong to interleave them by date, but
> > I actually think it's useful for somebody who switches that behavior to
> > consciously update the test. Maybe it's worth calling out in the commit
> > message.
> >
> > I stopped short of actually documenting and guaranteeing that behavior
> > to the user. I don't know how comfortable we would be changing it later.
> 
> I somehow feel that the "showing all entries from HEAD and then
> showing all from side" is simply a buggy behaviour.  I do not think
> our users would terribly mind if we changed it later.  But I may be
> missing the reason why (sometimes?) the sequential behaviour may be
> useful.

If we think it's buggy, we can fix it now. But I'm not convinced that
sequential iteration is that bad. It's at least _simple_ and easy to
explain. The only other thing that would make sense to me is sorting the
entries by date (reflog date, not commit date) and showing them in that
order. But that gives rise to some funny corner cases.

What do we do if there's clock skew within the reflog (e.g., somebody
fixed their skewed clock such that HEAD@{5} is less recent than
HEAD@{6}). Do we show them out of order then (even if only a single
reflog is being shown?).

Or do we try some complicated sort where entries compare sequentially
within a given reflog, but use date comparisons between different logs?
I don't think that can work with a normal comparison sort, as it makes
the comparison non-transitive.

But maybe we could do a pseudo-list-merge, where we treat the reflogs as
queues and always pick from the queue with the most recent timestamp.

That would probably be pretty easy to retrofit on the iteration from the
current series.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-06  7:16                                     ` Jeff King
@ 2017-07-06  7:55                                       ` Jeff King
  2017-07-06 15:42                                       ` Junio C Hamano
  1 sibling, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-06  7:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

On Thu, Jul 06, 2017 at 03:16:06AM -0400, Jeff King wrote:

> If we think it's buggy, we can fix it now. But I'm not convinced that
> sequential iteration is that bad. It's at least _simple_ and easy to
> explain. The only other thing that would make sense to me is sorting the
> entries by date (reflog date, not commit date) and showing them in that
> order. But that gives rise to some funny corner cases.
> 
> What do we do if there's clock skew within the reflog (e.g., somebody
> fixed their skewed clock such that HEAD@{5} is less recent than
> HEAD@{6}). Do we show them out of order then (even if only a single
> reflog is being shown?).
> 
> Or do we try some complicated sort where entries compare sequentially
> within a given reflog, but use date comparisons between different logs?
> I don't think that can work with a normal comparison sort, as it makes
> the comparison non-transitive.
> 
> But maybe we could do a pseudo-list-merge, where we treat the reflogs as
> queues and always pick from the queue with the most recent timestamp.
> 
> That would probably be pretty easy to retrofit on the iteration from the
> current series.

Something like the patch below. Looking over the output, the
interleaving makes it look confusing (especially if you don't use
--date, since then the interleaved numbers have no relation to each
other).

It would help if we actually had a use case where somebody really did
want to show two reflogs from a single command. Then we'd know what they
were expecting and which output would be most useful, at least for that
case. But I can't really think of why you'd want to do so (which is
probably why nobody noticed for years that the current behavior is so
broken).

---
diff --git a/reflog-walk.c b/reflog-walk.c
index a7644d944e..6cdf7bfb0f 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -90,7 +90,7 @@ struct commit_reflog {
 
 struct reflog_walk_info {
 	struct commit_reflog **logs;
-	size_t nr, alloc, cur;
+	size_t nr, alloc;
 	struct string_list complete_reflogs;
 	struct commit_reflog *last_commit_reflog;
 };
@@ -284,26 +284,43 @@ int reflog_walk_empty(struct reflog_walk_info *info)
 	return !info || !info->nr;
 }
 
-struct commit *next_entry_in_log(struct commit_reflog *log)
+static void skip_unusable_entries(struct commit_reflog *log)
 {
 	while (log->recno >= 0) {
-		struct reflog_info *entry = &log->reflogs->items[log->recno--];
+		struct reflog_info *entry = &log->reflogs->items[log->recno];
 		struct object *obj = parse_object(&entry->noid);
 
 		if (obj && obj->type == OBJ_COMMIT)
-			return (struct commit *)obj;
+			return;
+		log->recno--;
 	}
-	return NULL;
+}
+
+static timestamp_t next_timestamp(struct commit_reflog *log)
+{
+	return log->reflogs->items[log->recno].timestamp;
 }
 
 struct commit *next_reflog_entry(struct reflog_walk_info *walk)
 {
-	for (; walk->cur < walk->nr; walk->cur++) {
-		struct commit *ret = next_entry_in_log(walk->logs[walk->cur]);
-		if (ret) {
-			walk->last_commit_reflog = walk->logs[walk->cur];
-			return ret;
-		}
+	struct commit_reflog *best = NULL;
+	size_t i;
+
+	for (i = 0; i < walk->nr; i++) {
+		struct commit_reflog *log = walk->logs[i];
+
+		skip_unusable_entries(log);
+		if (log->recno < 0)
+			continue;
+
+		if (!best || next_timestamp(log) > next_timestamp(best))
+			best = log;
 	}
+
+	if (best) {
+		walk->last_commit_reflog = best;
+		return lookup_commit(&best->reflogs->items[best->recno--].noid);
+	}
+
 	return NULL;
 }

The implementation is fairly naive. It would be O(nr_reflogs *
nr_entries) in the worst case. I wouldn't expect nr_reflogs to get
large. But if we cared, we could fix it by keeping the reflogs in a
heap.

-Peff

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-06  7:16                                     ` Jeff King
  2017-07-06  7:55                                       ` Jeff King
@ 2017-07-06 15:42                                       ` Junio C Hamano
  2017-07-07  5:19                                         ` Jeff King
  1 sibling, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-07-06 15:42 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> On Wed, Jul 05, 2017 at 03:45:03PM -0700, Junio C Hamano wrote:
>
>> > I did make the ordering intentional. We process each reflog sequentially
>> > in turn. I don't think it would be wrong to interleave them by date, but
>> > I actually think it's useful for somebody who switches that behavior to
>> > consciously update the test. Maybe it's worth calling out in the commit
>> > message.
>> >
>> > I stopped short of actually documenting and guaranteeing that behavior
>> > to the user. I don't know how comfortable we would be changing it later.
>> 
>> I somehow feel that the "showing all entries from HEAD and then
>> showing all from side" is simply a buggy behaviour.  I do not think
>> our users would terribly mind if we changed it later.  But I may be
>> missing the reason why (sometimes?) the sequential behaviour may be
>> useful.
>
> If we think it's buggy, we can fix it now. But I'm not convinced that
> sequential iteration is that bad. It's at least _simple_ and easy to
> explain.

Yes, I agree that sequential is easy to explain, but only when I
consider use of "log" family without "-n 30" or "--since 3.days".
It still is easy to explain---we show from one and then from the
other, but because we stop after showing 30 of them, and the first
one has more than that, you do not see any from the latter.  

It's just the easy-to-explian behaviour is not very useful and very
hard to defend.

> The only other thing that would make sense to me is sorting the
> entries by date (reflog date, not commit date) and showing them in that
> order. But that gives rise to some funny corner cases.
>
> What do we do if there's clock skew within the reflog (e.g., somebody
> fixed their skewed clock such that HEAD@{5} is less recent than
> HEAD@{6}). Do we show them out of order then (even if only a single
> reflog is being shown?).

I would think the easiest-to-explain thing is handle clock skew just
like how clock skew affects the traversal with commit date.  At
least inside a single reflog, there is no need to sort---its
append-only nature gives a reliable "newer to older" order.

> But maybe we could do a pseudo-list-merge, where we treat the reflogs as
> queues and always pick from the queue with the most recent timestamp.

Yes, that was what I had in mind.  An entry with an artificially old
timestamp due to clock skew would clog the output and prevent the
entries behind it on the same reflog from getting shown before
entries from other reflog, but that is the same as what happens in
the normal traversal to sane parents that can only be reached from a
child commit that has an artificially old timestamp.

Thanks.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 6/6] reflog-walk: stop using fake parents
  2017-07-05  8:09                             ` [PATCH 6/6] reflog-walk: stop using fake parents Jeff King
@ 2017-07-07  0:32                               ` Eric Wong
  2017-07-07  3:02                                 ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Eric Wong @ 2017-07-07  0:32 UTC (permalink / raw)
  To: Jeff King
  Cc: brian m. carlson, Junio C Hamano, Kyle Meyer, git, Sahil Dua,
	Ramsay Jones

I'm not sure why, but this is causing t1414.8 failures on 32-bit
x86 with the latest pu with Debian jessie (oldstable).

Reverting this (beafb2c62947a6d4a97b9c3baf99fe62ec8e830f) in pu
seems to fix the test for me.

+Cc: Ramsay since he also had a 32-bit environment.

--8<--
ok 7 - --parents shows true parents

expecting success: 
	{
		do_walk HEAD &&
		do_walk side
	} >expect &&
	do_walk HEAD side >actual &&
	test_cmp expect actual

--- expect	2017-07-07 00:30:57.796325232 +0000
+++ actual	2017-07-07 00:30:57.796325232 +0000
@@ -3,6 +3,8 @@
 HEAD@{2} checkout: moving from master to side
 HEAD@{3} commit: two
 HEAD@{4} commit (initial): one
-side@{0} commit (merge): Merge branch 'master' into side
-side@{1} commit: three
-side@{2} branch: Created from HEAD^
+HEAD@{0} commit (merge): Merge branch 'master' into side
+HEAD@{1} commit: three
+HEAD@{2} checkout: moving from master to side
+HEAD@{3} commit: two
+HEAD@{4} commit (initial): one
not ok 8 - walking multiple reflogs shows both
#	
#		{
#			do_walk HEAD &&
#			do_walk side
#		} >expect &&
#		do_walk HEAD side >actual &&
#		test_cmp expect actual
#	

expecting success: 
	head=$(git rev-parse HEAD) &&
	one=$(git rev-parse one) &&
	ident="$GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE" &&
	echo "$head $one $ident	broken reflog entry" >>.git/logs/HEAD &&

	echo $one >expect &&
	git log -g --format=%H -1 >actual &&
	test_cmp expect actual

ok 9 - walk prefers reflog to ref tip

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 6/6] reflog-walk: stop using fake parents
  2017-07-07  0:32                               ` Eric Wong
@ 2017-07-07  3:02                                 ` Jeff King
  2017-07-07  3:15                                   ` Jeff King
  2017-07-10  9:42                                   ` Eric Wong
  0 siblings, 2 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  3:02 UTC (permalink / raw)
  To: Eric Wong
  Cc: brian m. carlson, Junio C Hamano, Kyle Meyer, git, Sahil Dua,
	Ramsay Jones

On Fri, Jul 07, 2017 at 12:32:39AM +0000, Eric Wong wrote:

> I'm not sure why, but this is causing t1414.8 failures on 32-bit
> x86 with the latest pu with Debian jessie (oldstable).
> 
> Reverting this (beafb2c62947a6d4a97b9c3baf99fe62ec8e830f) in pu
> seems to fix the test for me.
> 
> +Cc: Ramsay since he also had a 32-bit environment.

Thanks, I was able to reproduce with CFLAGS=-m32.

> --- expect	2017-07-07 00:30:57.796325232 +0000
> +++ actual	2017-07-07 00:30:57.796325232 +0000
> @@ -3,6 +3,8 @@
>  HEAD@{2} checkout: moving from master to side
>  HEAD@{3} commit: two
>  HEAD@{4} commit (initial): one
> -side@{0} commit (merge): Merge branch 'master' into side
> -side@{1} commit: three
> -side@{2} branch: Created from HEAD^
> +HEAD@{0} commit (merge): Merge branch 'master' into side
> +HEAD@{1} commit: three
> +HEAD@{2} checkout: moving from master to side
> +HEAD@{3} commit: two
> +HEAD@{4} commit (initial): one

That's quite an unexpected error (to show the HEAD reflog twice). Given
that it triggers with 32-bit builds, it's like there's some funny memory
error. And indeed, running it under valgrind shows a problem in
add_reflog_for_walk. I'll try to dig into it. Thanks for reporting.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 6/6] reflog-walk: stop using fake parents
  2017-07-07  3:02                                 ` Jeff King
@ 2017-07-07  3:15                                   ` Jeff King
  2017-07-10  9:42                                   ` Eric Wong
  1 sibling, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  3:15 UTC (permalink / raw)
  To: Eric Wong
  Cc: brian m. carlson, Junio C Hamano, Kyle Meyer, git, Sahil Dua,
	Ramsay Jones

On Thu, Jul 06, 2017 at 11:02:24PM -0400, Jeff King wrote:

> On Fri, Jul 07, 2017 at 12:32:39AM +0000, Eric Wong wrote:
> 
> > I'm not sure why, but this is causing t1414.8 failures on 32-bit
> > x86 with the latest pu with Debian jessie (oldstable).
> > 
> > Reverting this (beafb2c62947a6d4a97b9c3baf99fe62ec8e830f) in pu
> > seems to fix the test for me.
> > 
> > +Cc: Ramsay since he also had a 32-bit environment.
> 
> Thanks, I was able to reproduce with CFLAGS=-m32.
> 
> > --- expect	2017-07-07 00:30:57.796325232 +0000
> > +++ actual	2017-07-07 00:30:57.796325232 +0000
> > @@ -3,6 +3,8 @@
> >  HEAD@{2} checkout: moving from master to side
> >  HEAD@{3} commit: two
> >  HEAD@{4} commit (initial): one
> > -side@{0} commit (merge): Merge branch 'master' into side
> > -side@{1} commit: three
> > -side@{2} branch: Created from HEAD^
> > +HEAD@{0} commit (merge): Merge branch 'master' into side
> > +HEAD@{1} commit: three
> > +HEAD@{2} checkout: moving from master to side
> > +HEAD@{3} commit: two
> > +HEAD@{4} commit (initial): one
> 
> That's quite an unexpected error (to show the HEAD reflog twice). Given
> that it triggers with 32-bit builds, it's like there's some funny memory
> error. And indeed, running it under valgrind shows a problem in
> add_reflog_for_walk. I'll try to dig into it. Thanks for reporting.

Ah, it's a bug in the existing code. It inserts an entry into a string
list and then frees the memory, but the string list wasn't asked to make
a copy of the string. You can see the bug by running t1414 with
--valgrind even before any of the code changes (though note that you'll
have to look at the "-v" output, since it's already marked as
expect-failure).

This fixes it:

diff --git a/reflog-walk.c b/reflog-walk.c
index b7e489ad32..c1f61c524a 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -136,6 +136,7 @@ struct reflog_walk_info {
 void init_reflog_walk(struct reflog_walk_info **info)
 {
 	*info = xcalloc(1, sizeof(struct reflog_walk_info));
+	(*info)->complete_reflogs.strdup_strings = 1;
 }
 
 int add_reflog_for_walk(struct reflog_walk_info *info,

I'll add that to the 'maint' portion of the series.

-Peff

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-06 15:42                                       ` Junio C Hamano
@ 2017-07-07  5:19                                         ` Jeff King
  2017-07-07  6:00                                           ` Junio C Hamano
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-07-07  5:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

On Thu, Jul 06, 2017 at 08:42:45AM -0700, Junio C Hamano wrote:

> >> I somehow feel that the "showing all entries from HEAD and then
> >> showing all from side" is simply a buggy behaviour.  I do not think
> >> our users would terribly mind if we changed it later.  But I may be
> >> missing the reason why (sometimes?) the sequential behaviour may be
> >> useful.
> >
> > If we think it's buggy, we can fix it now. But I'm not convinced that
> > sequential iteration is that bad. It's at least _simple_ and easy to
> > explain.
> 
> Yes, I agree that sequential is easy to explain, but only when I
> consider use of "log" family without "-n 30" or "--since 3.days".
> It still is easy to explain---we show from one and then from the
> other, but because we stop after showing 30 of them, and the first
> one has more than that, you do not see any from the latter.

Ah, right, I didn't think of limiting like that. I agree that makes a
strong argument in favor of the date-ordered queue.

I suspect that "--since 3.days" is still quite buggy (even with a single
reflog) because it checks commit timestamps and stops traversing when we
go too bar back. But in a reflog, the commits may be totally out of
order. I'm not sure what it should do. Either:

  1. During a reflog walk, --since and --until should respect reflog
     timestamps instead of commit timestamps. You can already do
     "--until" there by simply starting the traversal later, but there's
     no way to cut it off with --since.

  2. Limit commits shown by --since/--until as usual, but skip the "stop
     traversing" optimization when we see too many "old" commits. I.e.,
     omit a 4.days.ago commit, but keep walking to find other recent
     commits.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-07  5:19                                         ` Jeff King
@ 2017-07-07  6:00                                           ` Junio C Hamano
  2017-07-07  6:22                                             ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-07-07  6:00 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

Jeff King <peff@peff.net> writes:

> I suspect that "--since 3.days" is still quite buggy (even with a single
> reflog) because it checks commit timestamps and stops traversing when we
> go too bar back. But in a reflog, the commits may be totally out of
> order. I'm not sure what it should do. Either:
>
>   1. During a reflog walk, --since and --until should respect reflog
>      timestamps instead of commit timestamps. You can already do
>      "--until" there by simply starting the traversal later, but there's
>      no way to cut it off with --since.
>
>   2. Limit commits shown by --since/--until as usual, but skip the "stop
>      traversing" optimization when we see too many "old" commits. I.e.,
>      omit a 4.days.ago commit, but keep walking to find other recent
>      commits.

I think 1. is more logical, and I was imagining that it should be
doable, now we are not constrained by (ab)using the commit_list with
the fake-parent thing, but are pulling the entries directly from the
reflog iterator and the timestamp would be already available to the
iterator.

But I recall that the max_age and min_age cutoff is done long after
a commit is pulled out of the "iterator mechanism" (be it the
commit_list or your direct reflog iterator) by comparing
commit->date with the cut-off, so it may be a bit more involved to
arrange than I imagined X-<.  Hmph...



^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] t1414: document some reflog-walk oddities
  2017-07-07  6:00                                           ` Junio C Hamano
@ 2017-07-07  6:22                                             ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  6:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua

On Thu, Jul 06, 2017 at 11:00:13PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I suspect that "--since 3.days" is still quite buggy (even with a single
> > reflog) because it checks commit timestamps and stops traversing when we
> > go too bar back. But in a reflog, the commits may be totally out of
> > order. I'm not sure what it should do. Either:
> >
> >   1. During a reflog walk, --since and --until should respect reflog
> >      timestamps instead of commit timestamps. You can already do
> >      "--until" there by simply starting the traversal later, but there's
> >      no way to cut it off with --since.
> >
> >   2. Limit commits shown by --since/--until as usual, but skip the "stop
> >      traversing" optimization when we see too many "old" commits. I.e.,
> >      omit a 4.days.ago commit, but keep walking to find other recent
> >      commits.
> 
> I think 1. is more logical, and I was imagining that it should be
> doable, now we are not constrained by (ab)using the commit_list with
> the fake-parent thing, but are pulling the entries directly from the
> reflog iterator and the timestamp would be already available to the
> iterator.
> 
> But I recall that the max_age and min_age cutoff is done long after
> a commit is pulled out of the "iterator mechanism" (be it the
> commit_list or your direct reflog iterator) by comparing
> commit->date with the cut-off, so it may be a bit more involved to
> arrange than I imagined X-<.  Hmph...

It's probably not too bad.

We do some of the limiting in limit_list(), which tries to mark commits
as UNINTERESTING. But I think in general that limit_list is incompatible
with reflog traversals (though I wouldn't be surprised if we fail to
flag all the options that set revs->limited as incompatible).

We handle max_age in get_revision_1() itself, which should be pretty
straightforward. For min_age, we do it in get_commit_action(), which
would need access to the reflog timestamp. But we do have the rev_info
there.

So something like the patch below would work, I think.

diff --git a/reflog-walk.c b/reflog-walk.c
index fbee9e0126..74ebe5148f 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -264,6 +264,18 @@ const char *get_reflog_ident(struct reflog_walk_info *reflog_info)
 	return info->email;
 }
 
+timestamp_t get_reflog_timestamp(struct reflog_walk_info *reflog_info)
+{
+	struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
+	struct reflog_info *info;
+
+	if (!commit_reflog)
+		return 0;
+
+	info = &commit_reflog->reflogs->items[commit_reflog->recno+1];
+	return info->timestamp;
+}
+
 void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 			 const struct date_mode *dmode, int force_date)
 {
diff --git a/reflog-walk.h b/reflog-walk.h
index 373388cd14..7553c448fe 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -13,6 +13,7 @@ extern void show_reflog_message(struct reflog_walk_info *info, int,
 extern void get_reflog_message(struct strbuf *sb,
 		struct reflog_walk_info *reflog_info);
 extern const char *get_reflog_ident(struct reflog_walk_info *reflog_info);
+extern timestamp_t get_reflog_timestamp(struct reflog_walk_info *reflog_info);
 extern void get_reflog_selector(struct strbuf *sb,
 		struct reflog_walk_info *reflog_info,
 		const struct date_mode *dmode, int force_date,
diff --git a/revision.c b/revision.c
index 5fc01f2d26..c248a16974 100644
--- a/revision.c
+++ b/revision.c
@@ -2973,8 +2973,13 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 		return commit_show;
 	if (commit->object.flags & UNINTERESTING)
 		return commit_ignore;
-	if (revs->min_age != -1 && (commit->date > revs->min_age))
-		return commit_ignore;
+	if (revs->min_age != -1) {
+		timestamp_t date = revs->reflog_info ?
+				   get_reflog_timestamp(revs->reflog_info) :
+				   commit->date;
+		if (date > revs->min_age)
+			return commit_ignore;
+	}
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
@@ -3127,9 +3132,13 @@ static struct commit *get_revision_1(struct rev_info *revs)
 		 * that we'd otherwise have done in limit_list().
 		 */
 		if (!revs->limited) {
-			if (revs->max_age != -1 &&
-			    (commit->date < revs->max_age))
-				continue;
+			if (revs->max_age != -1) {
+				timestamp_t date = revs->reflog_info ?
+						   get_reflog_timestamp(revs->reflog_info) :
+						   commit->date;
+				if (date < revs->max_age)
+					continue;
+			}
 			if (!revs->reflog_info &&
 			    add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
 				if (!revs->ignore_missing_links)

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 0/4] reflog-walk fixes for maint
  2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
                                               ` (5 preceding siblings ...)
  2017-07-05  8:09                             ` [PATCH 6/6] reflog-walk: stop using fake parents Jeff King
@ 2017-07-07  8:36                             ` Jeff King
  2017-07-07  8:37                               ` [PATCH v2 1/4] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
                                                 ` (5 more replies)
  6 siblings, 6 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  8:36 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

On Wed, Jul 05, 2017 at 03:55:08AM -0400, Jeff King wrote:

> The first patch is my original small fix with an extra test. I think
> that would be appropriate for 'maint'. Its behavior still has some
> quirks, but it avoids the confusion that you experienced and has a low
> risk of breaking anything else.
> 
> The rest of it replaces the fake-parent thing with a more
> straight-forward iteration over the reflogs (i.e., a cleanup of the
> further patches I've been posting). After digging into it and especially
> after writing the new tests, I think I've convinced myself that this is
> the right way forward.

Here's an updated version of the bug-fix patch, along with the fix for
the problem that Eric noticed, and some other problems I noticed while
fixing that one. So I've split these immediate fixes for maint off into
their own series.

These are based on maint itself, rather than Kyle's original commit that
introduces the double-null reflog, since some of the bugs came later in
the v2.13 cycle (if we really wanted to, we could split it again into
two more series, but I don't think it's worth the trouble).

  [1/4]: reflog-walk: skip over double-null oid due to HEAD rename

    This is the fix for the pseudo-truncation in v2.13, and is the same
    as the previous round.

  [2/4]: reflog-walk: duplicate strings in complete_reflogs list

    This fixes Eric's bug, and is the same as what I showed earlier.
    It's a triggerable use-after-free, which is why I think it's
    important to get it into maint.

  [3/4]: reflog-walk: don't free reflogs added to cache

    This is another use-after-free, though it's slightly harder to
    trigger.

  [4/4]: reflog-walk: include all fields when freeing complete_reflogs

    This one is an optional cleanup, but worth doing, I think.


 reflog-walk.c          | 33 +++++++++++++++++++++------------
 t/t1411-reflog-show.sh | 10 ++++++++++
 t/t3200-branch.sh      | 11 +++++++++++
 3 files changed, 42 insertions(+), 12 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v2 1/4] reflog-walk: skip over double-null oid due to HEAD rename
  2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
@ 2017-07-07  8:37                               ` Jeff King
  2017-07-07  8:39                               ` [PATCH v2 2/4] reflog-walk: duplicate strings in complete_reflogs list Jeff King
                                                 ` (4 subsequent siblings)
  5 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  8:37 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

Since 39ee4c6c2f (branch: record creation of renamed branch
in HEAD's log, 2017-02-20), a rename on the currently
checked out branch will create two entries in the HEAD
reflog: one where the branch goes away (switching to the
null oid), and one where it comes back (switching away from
the null oid).

This confuses the reflog-walk code. When walking backwards,
it first sees the null oid in the "old" field of the second
entry. Thanks to the "root commit" logic added by 71abeb753f
(reflog: continue walking the reflog past root commits,
2016-06-03), we keep looking for the next entry by scanning
the "new" field from the previous entry. But that field is
also null! We need to go just a tiny bit further, and look
at its "old" field. But with the current code, we decide the
reflog has nothing else to show and just give up. To the
user this looks like the reflog was truncated by the rename
operation, when in fact those entries are still there.

This patch does the absolute minimal fix, which is to look
back that one extra level and keep traversing.

The resulting behavior may not be the _best_ thing to do in
the long run (for example, we show both reflog entries each
with the same commit id), but it's a simple way to fix the
problem without risking further regressions.

Signed-off-by: Jeff King <peff@peff.net>
---
 reflog-walk.c     |  2 ++
 t/t3200-branch.sh | 11 +++++++++++
 2 files changed, 13 insertions(+)

diff --git a/reflog-walk.c b/reflog-walk.c
index c63eb1a3fd..f7ffd1caa3 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -259,6 +259,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 		/* a root commit, but there are still more entries to show */
 		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 		logobj = parse_object(reflog->noid.hash);
+		if (!logobj)
+			logobj = parse_object(reflog->ooid.hash);
 	}
 
 	if (!logobj || logobj->type != OBJ_COMMIT) {
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 48d152b9a9..dd37ac47c5 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -162,6 +162,17 @@ test_expect_success 'git branch -M baz bam should add entries to .git/logs/HEAD'
 	grep "^0\{40\}.*$msg$" .git/logs/HEAD
 '
 
+test_expect_success 'resulting reflog can be shown by log -g' '
+	oid=$(git rev-parse HEAD) &&
+	cat >expect <<-EOF &&
+	HEAD@{0} $oid $msg
+	HEAD@{1} $oid $msg
+	HEAD@{2} $oid checkout: moving from foo to baz
+	EOF
+	git log -g --format="%gd %H %gs" -3 HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'git branch -M baz bam should succeed when baz is checked out as linked working tree' '
 	git checkout master &&
 	git worktree add -b baz bazdir &&
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 2/4] reflog-walk: duplicate strings in complete_reflogs list
  2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
  2017-07-07  8:37                               ` [PATCH v2 1/4] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
@ 2017-07-07  8:39                               ` Jeff King
  2017-07-07  8:41                               ` [PATCH v2 3/4] reflog-walk: don't free reflogs added to cache Jeff King
                                                 ` (3 subsequent siblings)
  5 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  8:39 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

As part of the add_reflog_to_walk() function, we keep a
string_list mapping refnames to their reflog contents. This
serves as a cache so that accessing the same reflog twice
requires only a single copy of the log in memory.

The string_list is initialized via xcalloc, meaning its
strdup_strings field is set to 0. But after inserting a
string into the list, we unconditionally call free() on the
string, leaving the list pointing to freed memory. If
another reflog is added (e.g., "git log -g HEAD HEAD"), then
the second one may have unpredictable results.

The extra free was added by 5026b47175 (add_reflog_for_walk:
avoid memory leak, 2017-05-04). Though if you look
carefully, you can see that the code was buggy even before
then. If we tried to read the reflogs by time but came up
with no entries, we exited with an error, freeing the string
in that code path. So the bug was harder to trigger, but
still there.

We can fix it by just asking the string list to make a copy
of the string. Technically we could fix the problem by not
calling free() on our string (and just handing over
ownership to the string list), but there are enough
conditionals that it's quite hard to figure out which code
paths need the free and which do not. Simpler is better
here.

The new test reliably shows the problem when run with
--valgrind or ASAN.

Signed-off-by: Jeff King <peff@peff.net>
---
 reflog-walk.c          | 1 +
 t/t1411-reflog-show.sh | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/reflog-walk.c b/reflog-walk.c
index f7ffd1caa3..aec718beba 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -136,6 +136,7 @@ struct reflog_walk_info {
 void init_reflog_walk(struct reflog_walk_info **info)
 {
 	*info = xcalloc(1, sizeof(struct reflog_walk_info));
+	(*info)->complete_reflogs.strdup_strings = 1;
 }
 
 int add_reflog_for_walk(struct reflog_walk_info *info,
diff --git a/t/t1411-reflog-show.sh b/t/t1411-reflog-show.sh
index 6ac7734d79..ae96eeb66a 100755
--- a/t/t1411-reflog-show.sh
+++ b/t/t1411-reflog-show.sh
@@ -171,4 +171,10 @@ test_expect_success 'reflog exists works' '
 	! git reflog exists refs/heads/nonexistent
 '
 
+# The behavior with two reflogs is buggy and the output is in flux; for now
+# we're just checking that the program works at all without segfaulting.
+test_expect_success 'showing multiple reflogs works' '
+	git log -g HEAD HEAD >actual
+'
+
 test_done
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 3/4] reflog-walk: don't free reflogs added to cache
  2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
  2017-07-07  8:37                               ` [PATCH v2 1/4] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
  2017-07-07  8:39                               ` [PATCH v2 2/4] reflog-walk: duplicate strings in complete_reflogs list Jeff King
@ 2017-07-07  8:41                               ` Jeff King
  2017-07-07  8:43                               ` [PATCH v2 4/4] reflog-walk: include all fields when freeing complete_reflogs Jeff King
                                                 ` (2 subsequent siblings)
  5 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  8:41 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

The add_reflog_for_walk() function keeps a cache mapping
refnames to their reflog contents. We use a cached reflog
entry if available, and otherwise allocate and store a new
one.

Since 5026b47175 (add_reflog_for_walk: avoid memory leak,
2017-05-04), when we hit an error parsing a date-based
reflog spec, we free the reflog memory but leave the cache
entry pointing to the now-freed memory.

We can fix this by just leaving the memory intact once it
has made it into the cache. This may leave an unused entry
in the cache, but that's OK. And it means we also catch a
similar situation: we may not have allocated at all in this
invocation, but simply be pointing to a cached entry from a
previous invocation (which is relying on that entry being
present).

The new test in t1411 exercises this case and fails when run
with --valgrind or ASan.

Signed-off-by: Jeff King <peff@peff.net>
---
 reflog-walk.c          | 4 ----
 t/t1411-reflog-show.sh | 4 ++++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index aec718beba..2a43537326 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -215,10 +215,6 @@ int add_reflog_for_walk(struct reflog_walk_info *info,
 	if (recno < 0) {
 		commit_reflog->recno = get_reflog_recno_by_time(reflogs, timestamp);
 		if (commit_reflog->recno < 0) {
-			if (reflogs) {
-				free(reflogs->ref);
-				free(reflogs);
-			}
 			free(commit_reflog);
 			return -1;
 		}
diff --git a/t/t1411-reflog-show.sh b/t/t1411-reflog-show.sh
index ae96eeb66a..b9cb76654b 100755
--- a/t/t1411-reflog-show.sh
+++ b/t/t1411-reflog-show.sh
@@ -177,4 +177,8 @@ test_expect_success 'showing multiple reflogs works' '
 	git log -g HEAD HEAD >actual
 '
 
+test_expect_success 'showing multiple reflogs with an old date' '
+	git log -g HEAD@{1979-01-01} HEAD >actual
+'
+
 test_done
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 4/4] reflog-walk: include all fields when freeing complete_reflogs
  2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
                                                 ` (2 preceding siblings ...)
  2017-07-07  8:41                               ` [PATCH v2 3/4] reflog-walk: don't free reflogs added to cache Jeff King
@ 2017-07-07  8:43                               ` Jeff King
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
  2017-07-07 20:15                               ` [PATCH v2 0/4] reflog-walk fixes for maint Junio C Hamano
  5 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  8:43 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

When we encounter an error adding reflogs for a walk, we try
to free any logs we have read. But we didn't free all
fields, meaning that we could in theory leak all of the
"items" array (which would consitute the bulk of the
allocated memory).

This patch adds a helper which frees all of the entries and
uses it as appropriate.

As it turns out, the leak seems impossible to trigger with
the current code. Of the three error paths that free the
complete_reflogs struct, two only kick in when the items
array is empty, and the third was removed entirely in the
previous commit.

So this patch should be a noop in terms of behavior, but it
fixes a potential maintenance headache should anybody add a
new error path and copy the partial-free code. Which is
what happened in 5026b47175 (add_reflog_for_walk: avoid
memory leak, 2017-05-04), though its leaky call was the
third one that was recently removed.

Signed-off-by: Jeff King <peff@peff.net>
---
 reflog-walk.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 2a43537326..ba72020fc3 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -38,6 +38,22 @@ static int read_one_reflog(struct object_id *ooid, struct object_id *noid,
 	return 0;
 }
 
+static void free_complete_reflog(struct complete_reflogs *array)
+{
+	int i;
+
+	if (!array)
+		return;
+
+	for (i = 0; i < array->nr; i++) {
+		free(array->items[i].email);
+		free(array->items[i].message);
+	}
+	free(array->items);
+	free(array->ref);
+	free(array);
+}
+
 static struct complete_reflogs *read_complete_reflog(const char *ref)
 {
 	struct complete_reflogs *reflogs =
@@ -189,20 +205,14 @@ int add_reflog_for_walk(struct reflog_walk_info *info,
 			if (ret > 1)
 				free(b);
 			else if (ret == 1) {
-				if (reflogs) {
-					free(reflogs->ref);
-					free(reflogs);
-				}
+				free_complete_reflog(reflogs);
 				free(branch);
 				branch = b;
 				reflogs = read_complete_reflog(branch);
 			}
 		}
 		if (!reflogs || reflogs->nr == 0) {
-			if (reflogs) {
-				free(reflogs->ref);
-				free(reflogs);
-			}
+			free_complete_reflog(reflogs);
 			free(branch);
 			return -1;
 		}
-- 
2.13.2.1000.g8590c1af5d

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 0/7] fixing reflog-walk oddities
  2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
                                                 ` (3 preceding siblings ...)
  2017-07-07  8:43                               ` [PATCH v2 4/4] reflog-walk: include all fields when freeing complete_reflogs Jeff King
@ 2017-07-07  9:05                               ` Jeff King
  2017-07-07  9:06                                 ` [PATCH v2 1/7] t1414: document some " Jeff King
                                                   ` (6 more replies)
  2017-07-07 20:15                               ` [PATCH v2 0/4] reflog-walk fixes for maint Junio C Hamano
  5 siblings, 7 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:05 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

On Fri, Jul 07, 2017 at 04:36:36AM -0400, Jeff King wrote:

> Here's an updated version of the bug-fix patch, along with the fix for
> the problem that Eric noticed, and some other problems I noticed while
> fixing that one. So I've split these immediate fixes for maint off into
> their own series.

And here's the parent-less walk for master. This must be applied on a
merge of the bug-fix series to the current master. I didn't do it all on
"maint" because a lot of it depends on the timestamp_t work which is
only in master. And trying to use "unsigned long" and merge that up to
master correctly is error-prone. It not only doesn't create textual
conflicts, but the semantic conflicts it generates are really subtle and
only break on certain systems.

This should address all of the comments on v1, including the
multi-reflog iteration order and the --since/--until bits. There are a
few new fixes, too.

  [v2 1/7]: t1414: document some reflog-walk oddities

    The big change is that this expects the interleaved order in the
    multi-reflog test. There are a few updated comments and some
    adaptations to cover some bits from the bug-fix series.

  [v2 2/7]: revision: disallow reflog walking with revs->limited

    This is new, and just cleanly disallows some already-broken cases.

  [v2 3/7]: log: do not free parents when walking reflog
  [v2 4/7]: get_revision_1(): replace do-while with an early return
  [v2 5/7]: rev-list: check reflog_info before showing usage

    These ones are the same as before.

  [v2 6/7]: reflog-walk: stop using fake parents

    The big change here is the interleaved output. This is largely what
    I showed earlier, but with a few cleanups.

  [v2 7/7]: reflog-walk: apply --since/--until to reflog dates

    This is new, and is a cleaned-up version of what I showed earlier.
    See the commit message for some discussion. These options _do_
    actually work sanely after 6/7, but I think the semantics given here
    are probably more useful and what people would expect.

 builtin/log.c          |   4 +-
 builtin/rev-list.c     |   3 +-
 reflog-walk.c          | 152 ++++++++++++++++++++++---------------------------
 reflog-walk.h          |   7 ++-
 revision.c             |  57 ++++++++++++-------
 t/t1411-reflog-show.sh |  10 ----
 t/t1414-reflog-walk.sh | 135 +++++++++++++++++++++++++++++++++++++++++++
 t/t3200-branch.sh      |   3 +-
 8 files changed, 249 insertions(+), 122 deletions(-)
 create mode 100755 t/t1414-reflog-walk.sh

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v2 1/7] t1414: document some reflog-walk oddities
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
@ 2017-07-07  9:06                                 ` Jeff King
  2017-07-07  9:21                                   ` Jeff King
  2017-07-07 15:54                                   ` Kyle Meyer
  2017-07-07  9:07                                 ` [PATCH v2 2/7] revision: disallow reflog walking with revs->limited Jeff King
                                                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:06 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

Since its inception, the general strategy of the reflog-walk
code has been to start with the tip commit for the ref, and
as we traverse replace each commit's parent pointers with
fake parents pointing to the previous reflog entry.

This lets us traverse the reflog as if it were a real
history, but it has some user-visible oddities. Namely:

  1. The fake parents are used for commit selection and
     display. So for example, "--merges" or "--no-merges"
     are not useful, because the history appears as a linear
     string of commits. Likewise, pathspec limiting is based
     on the diff between adjacent entries, not the changes
     actually introduced by a commit.

     These are often the same (e.g., because the entry was
     just running "git commit" and the adjacent entry _is_
     the true parent), but it may not be in several common
     cases. For instance, using "git reset" to jump around
     history, or "git checkout" to move HEAD.

  2. We reverse-map each commit back to its reflog. So when
     it comes time to show commit X, we say "a-ha, we added
     X because it was at the tip of the 'foo' reflog, so
     let's show the foo reflog". But this leads to nonsense
     results when you ask to traverse multiple reflogs: if
     two reflogs have the same tip commit, we only map back
     to one of them.  Instead, we should show both.

  3. If the tip of the reflog and the ref tip disagree on
     the current value, we show the ref tip but give no
     indication of the value in the reflog.  This situation
     isn't supposed to happen (since any ref update should
     touch the reflog). But if it does, given that the
     requested operation is to show the reflog, it makes
     sense to prefer that.

This commit adds a new script with several expect_failure
tests to demonstrate the problems.  This could be part of
the existing t1411, but it's a bit easier to start from a
fresh state, where we know exactly what will be in the log.

Since the new multiple-reflog tests are checking the actual
output, we can drop the "make sure we don't segfault" tests
from t1411, which are a strict subset of what we're doing
here.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t1411-reflog-show.sh |  10 -----
 t/t1414-reflog-walk.sh | 105 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+), 10 deletions(-)
 create mode 100755 t/t1414-reflog-walk.sh

diff --git a/t/t1411-reflog-show.sh b/t/t1411-reflog-show.sh
index b9cb76654b..6ac7734d79 100755
--- a/t/t1411-reflog-show.sh
+++ b/t/t1411-reflog-show.sh
@@ -171,14 +171,4 @@ test_expect_success 'reflog exists works' '
 	! git reflog exists refs/heads/nonexistent
 '
 
-# The behavior with two reflogs is buggy and the output is in flux; for now
-# we're just checking that the program works at all without segfaulting.
-test_expect_success 'showing multiple reflogs works' '
-	git log -g HEAD HEAD >actual
-'
-
-test_expect_success 'showing multiple reflogs with an old date' '
-	git log -g HEAD@{1979-01-01} HEAD >actual
-'
-
 test_done
diff --git a/t/t1414-reflog-walk.sh b/t/t1414-reflog-walk.sh
new file mode 100755
index 0000000000..6426829352
--- /dev/null
+++ b/t/t1414-reflog-walk.sh
@@ -0,0 +1,105 @@
+#!/bin/sh
+
+test_description='various tests of reflog walk (log -g) behavior'
+. ./test-lib.sh
+
+test_expect_success 'set up some reflog entries' '
+	test_commit one &&
+	test_commit two &&
+	git checkout -b side HEAD^ &&
+	test_commit three &&
+	git merge --no-commit master &&
+	echo evil-merge-content >>one.t &&
+	test_tick &&
+	git commit --no-edit -a
+'
+
+do_walk () {
+	git log -g --format="%gd %gs" "$@"
+}
+
+sq="'"
+test_expect_success 'set up expected reflog' '
+	cat >expect.all <<-EOF
+	HEAD@{0} commit (merge): Merge branch ${sq}master${sq} into side
+	HEAD@{1} commit: three
+	HEAD@{2} checkout: moving from master to side
+	HEAD@{3} commit: two
+	HEAD@{4} commit (initial): one
+	EOF
+'
+
+test_expect_success 'reflog walk shows expected logs' '
+	do_walk >actual &&
+	test_cmp expect.all actual
+'
+
+test_expect_failure 'reflog can limit with --no-merges' '
+	grep -v merge expect.all >expect &&
+	do_walk --no-merges >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'reflog can limit with pathspecs' '
+	grep two expect.all >expect &&
+	do_walk -- two.t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'pathspec limiting handles merges' '
+	# we pick up:
+	#   - the initial commit of one
+	#   - the checkout back to commit one
+	#   - the evil merge which touched one
+	sed -n "1p;3p;5p" expect.all >expect &&
+	do_walk -- one.t >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure '--parents shows true parents' '
+	# convert newlines to spaces
+	echo $(git rev-parse HEAD HEAD^1 HEAD^2) >expect &&
+	git rev-list -g --parents -1 HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'walking multiple reflogs shows all' '
+	# We expect to see all entries for all reflogs, but interleaved by
+	# date, with order no the command line breaking ties. We
+	# can use "sort" on the separate lists to generate this,
+	# but note two tricks:
+	#
+	#   1. We use "{" as the delimiter, which lets us skip to the reflog
+	#      date specifier as our second field, and then our "-n" numeric
+	#      sort ignores the bits after the timestamp.
+	#
+	#   2. POSIX leaves undefined whether this is a stable sort or not. So
+	#      we use "-k 1" to ensure that we see HEAD before master before
+	#      side when breaking ties.
+	{
+		do_walk --date=unix HEAD &&
+		do_walk --date=unix side &&
+		do_walk --date=unix master
+	} >expect.raw &&
+	sort -t "{" -k 2nr -k 1 <expect.raw >expect &&
+	do_walk --date=unix HEAD master side >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'date-limiting does not interfere with other logs' '
+	do_walk HEAD@{1979-01-01} HEAD >actual &&
+	test_cmp expect.all actual
+'
+
+test_expect_failure 'walk prefers reflog to ref tip' '
+	head=$(git rev-parse HEAD) &&
+	one=$(git rev-parse one) &&
+	ident="$GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE" &&
+	echo "$head $one $ident	broken reflog entry" >>.git/logs/HEAD &&
+
+	echo $one >expect &&
+	git log -g --format=%H -1 >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 2/7] revision: disallow reflog walking with revs->limited
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
  2017-07-07  9:06                                 ` [PATCH v2 1/7] t1414: document some " Jeff King
@ 2017-07-07  9:07                                 ` Jeff King
  2017-07-07  9:07                                 ` [PATCH v2 3/7] log: do not free parents when walking reflog Jeff King
                                                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:07 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

The reflog-walk code doesn't work with limit_list().  That
function traverses down the real history graph, not the fake
reflog history that get_revision() returns. So it's not
going to actually examine all of the commits we're going to
show, because we'd add them to the pending list only during
the actual traversal.

In practice this limitation doesn't really matter, because
the options that require list-limiting generally need
UNINTERESTING endpoints or symmetric ranges, which already
are forbidden for reflog walks. Still, there are likely some
corner cases that would behave oddly. We're better off to
warn the user that we can't fulfill their request than to
generate potentially wrong output.

This will also make it easier to refactor the reflog-walking
code, because it eliminates a whole area of corner cases
we'd have to consider (that already don't work anyway).

Signed-off-by: Jeff King <peff@peff.net>
---
 revision.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/revision.c b/revision.c
index e181ad1b70..6678de04d9 100644
--- a/revision.c
+++ b/revision.c
@@ -2366,6 +2366,8 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	if (revs->reverse && revs->reflog_info)
 		die("cannot combine --reverse with --walk-reflogs");
+	if (revs->reflog_info && revs->limited)
+		die("cannot combine --walk-reflogs with history-limiting options");
 	if (revs->rewrite_parents && revs->children.name)
 		die("cannot combine --parents and --children");
 
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 3/7] log: do not free parents when walking reflog
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
  2017-07-07  9:06                                 ` [PATCH v2 1/7] t1414: document some " Jeff King
  2017-07-07  9:07                                 ` [PATCH v2 2/7] revision: disallow reflog walking with revs->limited Jeff King
@ 2017-07-07  9:07                                 ` Jeff King
  2017-07-07 17:10                                   ` Junio C Hamano
  2017-07-07  9:07                                 ` [PATCH v2 4/7] get_revision_1(): replace do-while with an early return Jeff King
                                                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:07 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

When we're doing a reflog walk (instead of walking the
actual parent pointers), we may see commits multiple times.
For this reason, we hold on to the commit buffer for each
commit rather than freeing it after we've showed the commit.

We should do the same for the parent list. Right now this is
just a minor optimization. But once we refactor how reflog
walks are performed, keeping the parents will avoid
confusing us the second time we see the commit.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/log.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 8ca1de9894..9c8bb3b5c3 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -374,9 +374,9 @@ static int cmd_log_walk(struct rev_info *rev)
 		if (!rev->reflog_info) {
 			/* we allow cycles in reflog ancestry */
 			free_commit_buffer(commit);
+			free_commit_list(commit->parents);
+			commit->parents = NULL;
 		}
-		free_commit_list(commit->parents);
-		commit->parents = NULL;
 		if (saved_nrl < rev->diffopt.needed_rename_limit)
 			saved_nrl = rev->diffopt.needed_rename_limit;
 		if (rev->diffopt.degraded_cc_to_c)
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 4/7] get_revision_1(): replace do-while with an early return
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
                                                   ` (2 preceding siblings ...)
  2017-07-07  9:07                                 ` [PATCH v2 3/7] log: do not free parents when walking reflog Jeff King
@ 2017-07-07  9:07                                 ` Jeff King
  2017-07-07  9:08                                 ` [PATCH v2 5/7] rev-list: check reflog_info before showing usage Jeff King
                                                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:07 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

The get_revision_1() function tries to avoid entering its
main loop at all when there are no commits to look at. But
it's perfectly safe to call pop_commit() on an empty list
(in which case it will return NULL). Switching to an early
return from the loop lets us skip repeating the loop
condition before we enter the do-while. That will get more
important when we start pulling reflog-walk commits from a
source besides the revs->commits queue, as that condition
will get much more complicated.

Signed-off-by: Jeff King <peff@peff.net>
---
 revision.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/revision.c b/revision.c
index 6678de04d9..4019e8cf23 100644
--- a/revision.c
+++ b/revision.c
@@ -3111,12 +3111,12 @@ static void track_linear(struct rev_info *revs, struct commit *commit)
 
 static struct commit *get_revision_1(struct rev_info *revs)
 {
-	if (!revs->commits)
-		return NULL;
-
-	do {
+	while (1) {
 		struct commit *commit = pop_commit(&revs->commits);
 
+		if (!commit)
+			return NULL;
+
 		if (revs->reflog_info) {
 			save_parents(revs, commit);
 			fake_reflog_parent(revs->reflog_info, commit);
@@ -3150,8 +3150,7 @@ static struct commit *get_revision_1(struct rev_info *revs)
 				track_linear(revs, commit);
 			return commit;
 		}
-	} while (revs->commits);
-	return NULL;
+	}
 }
 
 /*
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 5/7] rev-list: check reflog_info before showing usage
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
                                                   ` (3 preceding siblings ...)
  2017-07-07  9:07                                 ` [PATCH v2 4/7] get_revision_1(): replace do-while with an early return Jeff King
@ 2017-07-07  9:08                                 ` Jeff King
  2017-07-07  9:14                                 ` [PATCH v2 6/7] reflog-walk: stop using fake parents Jeff King
  2017-07-07  9:16                                 ` [PATCH v2 7/7] reflog-walk: apply --since/--until to reflog dates Jeff King
  6 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:08 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

When git-rev-list sees no pending commits, it shows a usage
message. This works even when reflog-walking is requested,
because the reflog-walk code currently puts the reflog tips
into the pending queue.

In preparation for refactoring the reflog-walk code, let's
explicitly check whether we have any reflogs to walk. For
now this is a noop, but the existing reflog tests will make
sure that it kicks in after the refactoring. Likewise, we'll
add a test that "rev-list -g" without specifying any reflogs
continues to fail (so that we know our check does not kick
in too aggressively).

Note that the implementation needs to go into its own
sub-function, as the walk code does not expose its innards
outside of reflog-walk.c.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/rev-list.c     | 3 ++-
 reflog-walk.c          | 5 +++++
 reflog-walk.h          | 2 ++
 t/t1414-reflog-walk.sh | 4 ++++
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 95d84d5cda..53a746dd89 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -11,6 +11,7 @@
 #include "graph.h"
 #include "bisect.h"
 #include "progress.h"
+#include "reflog-walk.h"
 
 static const char rev_list_usage[] =
 "git rev-list [OPTION] <commit-id>... [ -- paths... ]\n"
@@ -348,7 +349,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 		/* Only --header was specified */
 		revs.commit_format = CMIT_FMT_RAW;
 
-	if ((!revs.commits &&
+	if ((!revs.commits && reflog_walk_empty(revs.reflog_info) &&
 	     (!(revs.tag_objects || revs.tree_objects || revs.blob_objects) &&
 	      !revs.pending.nr)) ||
 	    revs.diff)
diff --git a/reflog-walk.c b/reflog-walk.c
index 081f89b70d..98c2f42de9 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -365,3 +365,8 @@ void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 		strbuf_release(&selector);
 	}
 }
+
+int reflog_walk_empty(struct reflog_walk_info *info)
+{
+	return !info || !info->reflogs.nr;
+}
diff --git a/reflog-walk.h b/reflog-walk.h
index 27886f793e..af32361072 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -20,4 +20,6 @@ extern void get_reflog_selector(struct strbuf *sb,
 		const struct date_mode *dmode, int force_date,
 		int shorten);
 
+extern int reflog_walk_empty(struct reflog_walk_info *walk);
+
 #endif
diff --git a/t/t1414-reflog-walk.sh b/t/t1414-reflog-walk.sh
index 6426829352..289c502725 100755
--- a/t/t1414-reflog-walk.sh
+++ b/t/t1414-reflog-walk.sh
@@ -102,4 +102,8 @@ test_expect_failure 'walk prefers reflog to ref tip' '
 	test_cmp expect actual
 '
 
+test_expect_success 'rev-list -g complains when there are no reflogs' '
+	test_must_fail git rev-list -g
+'
+
 test_done
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 6/7] reflog-walk: stop using fake parents
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
                                                   ` (4 preceding siblings ...)
  2017-07-07  9:08                                 ` [PATCH v2 5/7] rev-list: check reflog_info before showing usage Jeff King
@ 2017-07-07  9:14                                 ` Jeff King
  2017-07-07  9:24                                   ` Jeff King
  2017-07-07 15:54                                   ` Kyle Meyer
  2017-07-07  9:16                                 ` [PATCH v2 7/7] reflog-walk: apply --since/--until to reflog dates Jeff King
  6 siblings, 2 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:14 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

The reflog-walk system works by putting a ref's tip into the
pending queue, and then "traversing" the reflog by
pretending that the parent of each commit is the previous
reflog entry.

This causes a number of user-visible oddities, as documented
in t1414 (and the commit message which introduced it). We
can fix all of them in one go by replacing the fake-reflog
system with a much simpler one: just keeping a list of
reflogs to show, and walking through them entry by entry.

The implementation is fairly straight-forward, but there are
a few items to note:

  1. We obviously must skip calling add_parents_to_list()
     when we are traversing reflogs, since we do not want to
     walk the original parents at all.  As a result, we must call
     try_to_simplify_commit() ourselves.

     There are other parts of add_parents_to_list() we skip,
     as well, but none of them should matter for a reflog
     traversal:

       -  We do not allow UNINTERESTING commits, nor
          symmetric ranges (and we bail when these are used
          with "-g").

       - Using --source makes no sense, since we aren't
         traversing. The reflog selector shows the same
         information with more detail.

       - Using --first-parent is still sensible, since you
         may want to see the first-parent diff for each
         entry. But since we're not traversing, we don't
         need to cull the parent list here.

  2. Since we now just walk the reflog entries themselves,
     rather than starting with the ref tip, we now look at
     the "new" field of each entry rather than the "old"
     (i.e., we are showing entries, not faking parents).
     This removes all of the tricky logic around skipping
     past root commits.

     But note that we have no way to show an entry with the
     null sha1 in its "new" field (because such a commit
     obviously does not exist). Normally this would not
     happen, since we delete reflogs along with refs, but
     there is one special case. When we rename the currently
     checked out branch, we write two reflog entries into
     the HEAD log: one where the commit goes away, and
     another where it comes back.

     Prior to this commit, we show both entries with
     identical reflog messages. After this commit, we show
     only the "comes back" entry. See the update in t3200
     which demonstrates this.

     Arguably either is fine, as the whole double-entry
     thing is a bit hacky in the first place. And until a
     recent fix, we truncated the traversal in such a case
     anyway, which was _definitely_ wrong.

  3. We show individual reflogs in order, but choose which
     reflog to show at each stage based on which has the
     most recent timestamp.  This interleaves the output
     from multiple reflogs based on date order, which is
     probably what you'd want with limiting like "-n 30".

     Note that the implementation aims for simplicity. It
     does a linear walk over the reflog queue for each
     commit it pulls, which may perform badly if you
     interleave an enormous number of reflogs. That seems
     like an unlikely use case; if we did want to handle it,
     we could probably keep a priority queue of reflogs,
     ordered by the timestamp of their current tip entry.

Signed-off-by: Jeff King <peff@peff.net>
---
 reflog-walk.c          | 137 ++++++++++++++++++-------------------------------
 reflog-walk.h          |   4 +-
 revision.c             |  27 +++++-----
 t/t1414-reflog-walk.sh |  12 ++---
 t/t3200-branch.sh      |   3 +-
 5 files changed, 75 insertions(+), 108 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 98c2f42de9..fbee9e0126 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -94,45 +94,6 @@ static int get_reflog_recno_by_time(struct complete_reflogs *array,
 	return -1;
 }
 
-struct commit_info_lifo {
-	struct commit_info {
-		struct commit *commit;
-		void *util;
-	} *items;
-	int nr, alloc;
-};
-
-static struct commit_info *get_commit_info(struct commit *commit,
-		struct commit_info_lifo *lifo, int pop)
-{
-	int i;
-	for (i = 0; i < lifo->nr; i++)
-		if (lifo->items[i].commit == commit) {
-			struct commit_info *result = &lifo->items[i];
-			if (pop) {
-				if (i + 1 < lifo->nr)
-					memmove(lifo->items + i,
-						lifo->items + i + 1,
-						(lifo->nr - i) *
-						sizeof(struct commit_info));
-				lifo->nr--;
-			}
-			return result;
-		}
-	return NULL;
-}
-
-static void add_commit_info(struct commit *commit, void *util,
-		struct commit_info_lifo *lifo)
-{
-	struct commit_info *info;
-	ALLOC_GROW(lifo->items, lifo->nr + 1, lifo->alloc);
-	info = lifo->items + lifo->nr;
-	info->commit = commit;
-	info->util = util;
-	lifo->nr++;
-}
-
 struct commit_reflog {
 	int recno;
 	enum selector_type {
@@ -144,7 +105,8 @@ struct commit_reflog {
 };
 
 struct reflog_walk_info {
-	struct commit_info_lifo reflogs;
+	struct commit_reflog **logs;
+	size_t nr, alloc;
 	struct string_list complete_reflogs;
 	struct commit_reflog *last_commit_reflog;
 };
@@ -233,52 +195,10 @@ int add_reflog_for_walk(struct reflog_walk_info *info,
 	commit_reflog->selector = selector;
 	commit_reflog->reflogs = reflogs;
 
-	add_commit_info(commit, commit_reflog, &info->reflogs);
-	return 0;
-}
-
-void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
-{
-	struct commit_info *commit_info =
-		get_commit_info(commit, &info->reflogs, 0);
-	struct commit_reflog *commit_reflog;
-	struct object *logobj;
-	struct reflog_info *reflog;
-
-	info->last_commit_reflog = NULL;
-	if (!commit_info)
-		return;
-
-	commit_reflog = commit_info->util;
-	if (commit_reflog->recno < 0) {
-		commit->parents = NULL;
-		return;
-	}
-	info->last_commit_reflog = commit_reflog;
-
-	do {
-		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
-		commit_reflog->recno--;
-		logobj = parse_object(&reflog->ooid);
-	} while (commit_reflog->recno && (logobj && logobj->type != OBJ_COMMIT));
-
-	if (!logobj && commit_reflog->recno >= 0 && is_null_oid(&reflog->ooid)) {
-		/* a root commit, but there are still more entries to show */
-		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
-		logobj = parse_object(&reflog->noid);
-		if (!logobj)
-			logobj = parse_object(&reflog->ooid);
-	}
-
-	if (!logobj || logobj->type != OBJ_COMMIT) {
-		commit_info->commit = NULL;
-		commit->parents = NULL;
-		return;
-	}
-	commit_info->commit = (struct commit *)logobj;
+	ALLOC_GROW(info->logs, info->nr + 1, info->alloc);
+	info->logs[info->nr++] = commit_reflog;
 
-	commit->parents = xcalloc(1, sizeof(struct commit_list));
-	commit->parents->item = commit_info->commit;
+	return 0;
 }
 
 void get_reflog_selector(struct strbuf *sb,
@@ -368,5 +288,50 @@ void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 
 int reflog_walk_empty(struct reflog_walk_info *info)
 {
-	return !info || !info->reflogs.nr;
+	return !info || !info->nr;
+}
+
+static struct commit *next_reflog_commit(struct commit_reflog *log)
+{
+	for (; log->recno >= 0; log->recno--) {
+		struct reflog_info *entry = &log->reflogs->items[log->recno];
+		struct object *obj = parse_object(&entry->noid);
+
+		if (obj && obj->type == OBJ_COMMIT)
+			return (struct commit *)obj;
+	}
+	return NULL;
+}
+
+static timestamp_t log_timestamp(struct commit_reflog *log)
+{
+	return log->reflogs->items[log->recno].timestamp;
+}
+
+struct commit *next_reflog_entry(struct reflog_walk_info *walk)
+{
+	struct commit_reflog *best = NULL;
+	struct commit *best_commit = NULL;
+	size_t i;
+
+	for (i = 0; i < walk->nr; i++) {
+		struct commit_reflog *log = walk->logs[i];
+		struct commit *commit = next_reflog_commit(log);
+
+		if (!commit)
+			continue;
+
+		if (!best || log_timestamp(log) > log_timestamp(best)) {
+			best = log;
+			best_commit = commit;
+		}
+	}
+
+	if (best) {
+		best->recno--;
+		walk->last_commit_reflog = best;
+		return best_commit;
+	}
+
+	return NULL;
 }
diff --git a/reflog-walk.h b/reflog-walk.h
index af32361072..373388cd14 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -8,8 +8,6 @@ struct reflog_walk_info;
 extern void init_reflog_walk(struct reflog_walk_info **info);
 extern int add_reflog_for_walk(struct reflog_walk_info *info,
 		struct commit *commit, const char *name);
-extern void fake_reflog_parent(struct reflog_walk_info *info,
-		struct commit *commit);
 extern void show_reflog_message(struct reflog_walk_info *info, int,
 				const struct date_mode *, int force_date);
 extern void get_reflog_message(struct strbuf *sb,
@@ -22,4 +20,6 @@ extern void get_reflog_selector(struct strbuf *sb,
 
 extern int reflog_walk_empty(struct reflog_walk_info *walk);
 
+struct commit *next_reflog_entry(struct reflog_walk_info *reflog_info);
+
 #endif
diff --git a/revision.c b/revision.c
index 4019e8cf23..587199739a 100644
--- a/revision.c
+++ b/revision.c
@@ -148,16 +148,14 @@ static void add_pending_object_with_path(struct rev_info *revs,
 	if (revs->reflog_info && obj->type == OBJ_COMMIT) {
 		struct strbuf buf = STRBUF_INIT;
 		int len = interpret_branch_name(name, 0, &buf, 0);
-		int st;
 
 		if (0 < len && name[len] && buf.len)
 			strbuf_addstr(&buf, name + len);
-		st = add_reflog_for_walk(revs->reflog_info,
-					 (struct commit *)obj,
-					 buf.buf[0] ? buf.buf: name);
+		add_reflog_for_walk(revs->reflog_info,
+				    (struct commit *)obj,
+				    buf.buf[0] ? buf.buf: name);
 		strbuf_release(&buf);
-		if (st)
-			return;
+		return; /* do not add the commit itself */
 	}
 	add_object_array_with_path(obj, name, &revs->pending, mode, path);
 }
@@ -3112,16 +3110,18 @@ static void track_linear(struct rev_info *revs, struct commit *commit)
 static struct commit *get_revision_1(struct rev_info *revs)
 {
 	while (1) {
-		struct commit *commit = pop_commit(&revs->commits);
+		struct commit *commit;
+
+		if (revs->reflog_info)
+			commit = next_reflog_entry(revs->reflog_info);
+		else
+			commit = pop_commit(&revs->commits);
 
 		if (!commit)
 			return NULL;
 
-		if (revs->reflog_info) {
-			save_parents(revs, commit);
-			fake_reflog_parent(revs->reflog_info, commit);
+		if (revs->reflog_info)
 			commit->object.flags &= ~(ADDED | SEEN | SHOWN);
-		}
 
 		/*
 		 * If we haven't done the list limiting, we need to look at
@@ -3132,7 +3132,10 @@ static struct commit *get_revision_1(struct rev_info *revs)
 			if (revs->max_age != -1 &&
 			    (commit->date < revs->max_age))
 				continue;
-			if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
+
+			if (revs->reflog_info)
+				try_to_simplify_commit(revs, commit);
+			else if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
 				if (!revs->ignore_missing_links)
 					die("Failed to traverse parents of commit %s",
 						oid_to_hex(&commit->object.oid));
diff --git a/t/t1414-reflog-walk.sh b/t/t1414-reflog-walk.sh
index 289c502725..c4c53bd209 100755
--- a/t/t1414-reflog-walk.sh
+++ b/t/t1414-reflog-walk.sh
@@ -34,19 +34,19 @@ test_expect_success 'reflog walk shows expected logs' '
 	test_cmp expect.all actual
 '
 
-test_expect_failure 'reflog can limit with --no-merges' '
+test_expect_success 'reflog can limit with --no-merges' '
 	grep -v merge expect.all >expect &&
 	do_walk --no-merges >actual &&
 	test_cmp expect actual
 '
 
-test_expect_failure 'reflog can limit with pathspecs' '
+test_expect_success 'reflog can limit with pathspecs' '
 	grep two expect.all >expect &&
 	do_walk -- two.t >actual &&
 	test_cmp expect actual
 '
 
-test_expect_failure 'pathspec limiting handles merges' '
+test_expect_success 'pathspec limiting handles merges' '
 	# we pick up:
 	#   - the initial commit of one
 	#   - the checkout back to commit one
@@ -56,14 +56,14 @@ test_expect_failure 'pathspec limiting handles merges' '
 	test_cmp expect actual
 '
 
-test_expect_failure '--parents shows true parents' '
+test_expect_success '--parents shows true parents' '
 	# convert newlines to spaces
 	echo $(git rev-parse HEAD HEAD^1 HEAD^2) >expect &&
 	git rev-list -g --parents -1 HEAD >actual &&
 	test_cmp expect actual
 '
 
-test_expect_failure 'walking multiple reflogs shows all' '
+test_expect_success 'walking multiple reflogs shows all' '
 	# We expect to see all entries for all reflogs, but interleaved by
 	# date, with order no the command line breaking ties. We
 	# can use "sort" on the separate lists to generate this,
@@ -91,7 +91,7 @@ test_expect_success 'date-limiting does not interfere with other logs' '
 	test_cmp expect.all actual
 '
 
-test_expect_failure 'walk prefers reflog to ref tip' '
+test_expect_success 'walk prefers reflog to ref tip' '
 	head=$(git rev-parse HEAD) &&
 	one=$(git rev-parse one) &&
 	ident="$GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE" &&
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index dd37ac47c5..9d707d2a40 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -166,10 +166,9 @@ test_expect_success 'resulting reflog can be shown by log -g' '
 	oid=$(git rev-parse HEAD) &&
 	cat >expect <<-EOF &&
 	HEAD@{0} $oid $msg
-	HEAD@{1} $oid $msg
 	HEAD@{2} $oid checkout: moving from foo to baz
 	EOF
-	git log -g --format="%gd %H %gs" -3 HEAD >actual &&
+	git log -g --format="%gd %H %gs" -2 HEAD >actual &&
 	test_cmp expect actual
 '
 
-- 
2.13.2.1000.g8590c1af5d


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH v2 7/7] reflog-walk: apply --since/--until to reflog dates
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
                                                   ` (5 preceding siblings ...)
  2017-07-07  9:14                                 ` [PATCH v2 6/7] reflog-walk: stop using fake parents Jeff King
@ 2017-07-07  9:16                                 ` Jeff King
  6 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:16 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

When doing a reflog walk, we use the commit's date to
do any date limiting. In earlier versions of Git, this could
lead to nonsense results, since a skipped commit would
truncate the traversal. So a sequence like:

  git commit ...
  git checkout week-old-branch
  git checkout -
  git log -g --since=1.day.ago

would stop at the week-old-branch, even though the "git
commit" entry further back is still interesting.

As of the prior commit, which uses a parent-less traversal
of the reflog, you get the whole reflog minus any commits
whose dates do not match the specified options. This is
arguably useful, as you could scan the reflogs for commits
that originated in a certain range.

But more likely a user doing a reflog walk wants to limit
based on the reflog entries themselves. You can simulate
--until with:

  git log -g @{1.day.ago}

but there's no way to ask Git to traverse only back to a
certain date. E.g.:

  # show me reflog entries from the past day
  git log -g --since=1.day.ago

This patch teaches the revision machinery to prefer the
reflog entry dates to the commit dates when doing a reflog
walk. Technically this is a change in behavior that affects
plumbing, but the previous behavior was so buggy that it's
unlikely anyone was relying on it.

Signed-off-by: Jeff King <peff@peff.net>
---
 reflog-walk.c          | 12 ++++++++++++
 reflog-walk.h          |  1 +
 revision.c             | 19 ++++++++++++++++---
 t/t1414-reflog-walk.sh | 26 ++++++++++++++++++++++++++
 4 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index fbee9e0126..74ebe5148f 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -264,6 +264,18 @@ const char *get_reflog_ident(struct reflog_walk_info *reflog_info)
 	return info->email;
 }
 
+timestamp_t get_reflog_timestamp(struct reflog_walk_info *reflog_info)
+{
+	struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
+	struct reflog_info *info;
+
+	if (!commit_reflog)
+		return 0;
+
+	info = &commit_reflog->reflogs->items[commit_reflog->recno+1];
+	return info->timestamp;
+}
+
 void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 			 const struct date_mode *dmode, int force_date)
 {
diff --git a/reflog-walk.h b/reflog-walk.h
index 373388cd14..7553c448fe 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -13,6 +13,7 @@ extern void show_reflog_message(struct reflog_walk_info *info, int,
 extern void get_reflog_message(struct strbuf *sb,
 		struct reflog_walk_info *reflog_info);
 extern const char *get_reflog_ident(struct reflog_walk_info *reflog_info);
+extern timestamp_t get_reflog_timestamp(struct reflog_walk_info *reflog_info);
 extern void get_reflog_selector(struct strbuf *sb,
 		struct reflog_walk_info *reflog_info,
 		const struct date_mode *dmode, int force_date,
diff --git a/revision.c b/revision.c
index 587199739a..41b4375c3c 100644
--- a/revision.c
+++ b/revision.c
@@ -2965,6 +2965,18 @@ static inline int want_ancestry(const struct rev_info *revs)
 	return (revs->rewrite_parents || revs->children.name);
 }
 
+/*
+ * Return a timestamp to be used for --since/--until comparisons for this
+ * commit, based on the revision options.
+ */
+static timestamp_t comparison_date(const struct rev_info *revs,
+				   struct commit *commit)
+{
+	return revs->reflog_info ?
+		get_reflog_timestamp(revs->reflog_info) :
+		commit->date;
+}
+
 enum commit_action get_commit_action(struct rev_info *revs, struct commit *commit)
 {
 	if (commit->object.flags & SHOWN)
@@ -2975,8 +2987,9 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 		return commit_show;
 	if (commit->object.flags & UNINTERESTING)
 		return commit_ignore;
-	if (revs->min_age != -1 && (commit->date > revs->min_age))
-		return commit_ignore;
+	if (revs->min_age != -1 &&
+	    comparison_date(revs, commit) > revs->min_age)
+			return commit_ignore;
 	if (revs->min_parents || (revs->max_parents >= 0)) {
 		int n = commit_list_count(commit->parents);
 		if ((n < revs->min_parents) ||
@@ -3130,7 +3143,7 @@ static struct commit *get_revision_1(struct rev_info *revs)
 		 */
 		if (!revs->limited) {
 			if (revs->max_age != -1 &&
-			    (commit->date < revs->max_age))
+			    comparison_date(revs, commit) < revs->max_age)
 				continue;
 
 			if (revs->reflog_info)
diff --git a/t/t1414-reflog-walk.sh b/t/t1414-reflog-walk.sh
index c4c53bd209..360754959e 100755
--- a/t/t1414-reflog-walk.sh
+++ b/t/t1414-reflog-walk.sh
@@ -91,6 +91,32 @@ test_expect_success 'date-limiting does not interfere with other logs' '
 	test_cmp expect.all actual
 '
 
+test_expect_success 'min/max age uses entry date to limit' '
+	# Flip between commits one and two so each ref update actually
+	# does something (and does not get optimized out). We know
+	# that the timestamps of those commits will be before our "min".
+
+	git update-ref -m before refs/heads/minmax one &&
+
+	test_tick &&
+	min=$test_tick &&
+	git update-ref -m min refs/heads/minmax two &&
+
+	test_tick &&
+	max=$test_tick &&
+	git update-ref -m max refs/heads/minmax one &&
+
+	test_tick &&
+	git update-ref -m after refs/heads/minmax two &&
+
+	cat >expect <<-\EOF &&
+	max
+	min
+	EOF
+	git log -g --since=$min --until=$max --format=%gs minmax >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'walk prefers reflog to ref tip' '
 	head=$(git rev-parse HEAD) &&
 	one=$(git rev-parse one) &&
-- 
2.13.2.1000.g8590c1af5d

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 1/7] t1414: document some reflog-walk oddities
  2017-07-07  9:06                                 ` [PATCH v2 1/7] t1414: document some " Jeff King
@ 2017-07-07  9:21                                   ` Jeff King
  2017-07-07 15:54                                   ` Kyle Meyer
  1 sibling, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:21 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

On Fri, Jul 07, 2017 at 05:06:10AM -0400, Jeff King wrote:

> +test_expect_failure 'walking multiple reflogs shows all' '
> +	# We expect to see all entries for all reflogs, but interleaved by
> +	# date, with order no the command line breaking ties. We
> +	# can use "sort" on the separate lists to generate this,
> +	# but note two tricks:
> +	#
> +	#   1. We use "{" as the delimiter, which lets us skip to the reflog
> +	#      date specifier as our second field, and then our "-n" numeric
> +	#      sort ignores the bits after the timestamp.
> +	#
> +	#   2. POSIX leaves undefined whether this is a stable sort or not. So
> +	#      we use "-k 1" to ensure that we see HEAD before master before
> +	#      side when breaking ties.
> +	{
> +		do_walk --date=unix HEAD &&
> +		do_walk --date=unix side &&
> +		do_walk --date=unix master
> +	} >expect.raw &&
> +	sort -t "{" -k 2nr -k 1 <expect.raw >expect &&

I won't lie; the contortions needed for this "sort" made me a little
sick to my stomach.

It would be much easier if we could do something like:

  git log -g --format="%gt %gd"

and just get the reflog timestamp separately. But we never added %gt, so
we'd have to munge it with sed or perl.

One thing that perl would buy is that we could rely on a stable sort,
and ask for an order besides alphabetical. I.e., we could ask for:

  do_walk side master HEAD

and confirm that "side" comes before "HEAD", even though it doesn't
byte-wise sort.

I didn't think writing a perl snippet was worth the trouble for that
minor benefit, though probably it would be shorter than the comment I
needed to explain the "sort" invocation. ;)

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 6/7] reflog-walk: stop using fake parents
  2017-07-07  9:14                                 ` [PATCH v2 6/7] reflog-walk: stop using fake parents Jeff King
@ 2017-07-07  9:24                                   ` Jeff King
  2017-07-07 15:54                                   ` Kyle Meyer
  1 sibling, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-07  9:24 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Junio C Hamano, Kyle Meyer, git, Sahil Dua, Eric Wong

On Fri, Jul 07, 2017 at 05:14:07AM -0400, Jeff King wrote:

> @@ -3132,7 +3132,10 @@ static struct commit *get_revision_1(struct rev_info *revs)
>  			if (revs->max_age != -1 &&
>  			    (commit->date < revs->max_age))
>  				continue;
> -			if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
> +
> +			if (revs->reflog_info)
> +				try_to_simplify_commit(revs, commit);
> +			else if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) {
>  				if (!revs->ignore_missing_links)
>  					die("Failed to traverse parents of commit %s",
>  						oid_to_hex(&commit->object.oid));

There's one other subtle change from v1 here. In my original, we called
try_to_simplify_commit() much later, and then we had to check whether it
marked the commit as TREESAME. That felt really hacky to me, and the
reason is because I was using the function wrong. The intent is for
try_to_simplify to be called earlier (i.e., here), when we start looking
at the commit's parents. And then the TREESAME flag it sets is later
picked up by get_commit_action(), which is called by simplify_commit(),
which is used later in get_revision_1().

So by calling try_to_simplify here, where it would normally be called
for a non-reflog walk, that's all we have to do. I don't think the
original patch had any visible bugs, but this way is much cleaner and
more future-proof.

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 6/7] reflog-walk: stop using fake parents
  2017-07-07  9:14                                 ` [PATCH v2 6/7] reflog-walk: stop using fake parents Jeff King
  2017-07-07  9:24                                   ` Jeff King
@ 2017-07-07 15:54                                   ` Kyle Meyer
  2017-07-09 10:08                                     ` Jeff King
  1 sibling, 1 reply; 69+ messages in thread
From: Kyle Meyer @ 2017-07-07 15:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Sahil Dua, Eric Wong, brian m. carlson

Jeff King <peff@peff.net> writes:

>      Prior to this commit, we show both entries with
>      identical reflog messages. After this commit, we show
>      only the "comes back" entry. See the update in t3200
>      which demonstrates this.
>
>      Arguably either is fine, as the whole double-entry
>      thing is a bit hacky in the first place. And until a
>      recent fix, we truncated the traversal in such a case
>      anyway, which was _definitely_ wrong.

Yeah, I agree that the double-entry thing is a bit hacky and only
showing the "comes back" entry makes sense.

And with this change, I believe that the display of a rename event will
be the same for HEAD's log and the renamed branch's log, despite the
underlying entries having a different representation.

-- 
Kyle

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 1/7] t1414: document some reflog-walk oddities
  2017-07-07  9:06                                 ` [PATCH v2 1/7] t1414: document some " Jeff King
  2017-07-07  9:21                                   ` Jeff King
@ 2017-07-07 15:54                                   ` Kyle Meyer
  1 sibling, 0 replies; 69+ messages in thread
From: Kyle Meyer @ 2017-07-07 15:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Sahil Dua, Eric Wong, brian m. carlson

Jeff King <peff@peff.net> writes:

> +test_expect_failure 'walking multiple reflogs shows all' '
> +	# We expect to see all entries for all reflogs, but interleaved by
> +	# date, with order no the command line breaking ties. We

s/order no/order on/

-- 
Kyle

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 3/7] log: do not free parents when walking reflog
  2017-07-07  9:07                                 ` [PATCH v2 3/7] log: do not free parents when walking reflog Jeff King
@ 2017-07-07 17:10                                   ` Junio C Hamano
  2017-07-09 10:13                                     ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-07-07 17:10 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua, Eric Wong

Jeff King <peff@peff.net> writes:

> When we're doing a reflog walk (instead of walking the
> actual parent pointers), we may see commits multiple times.
> For this reason, we hold on to the commit buffer for each
> commit rather than freeing it after we've showed the commit.
>
> We should do the same for the parent list. Right now this is
> just a minor optimization. But once we refactor how reflog
> walks are performed, keeping the parents will avoid
> confusing us the second time we see the commit.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  builtin/log.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index 8ca1de9894..9c8bb3b5c3 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -374,9 +374,9 @@ static int cmd_log_walk(struct rev_info *rev)
>  		if (!rev->reflog_info) {
>  			/* we allow cycles in reflog ancestry */
>  			free_commit_buffer(commit);
> +			free_commit_list(commit->parents);
> +			commit->parents = NULL;

After step 6/7, we no longer "allow cycles in reflog ancestry", as
there will be no reflog ancestry to speak of ;-), so it would be
nice to remove the comment above in that step.  But alternatively,
we can rephrase the comment here, to say something like "the same
commit can be shown multiple times while showing entries from the
reflog" instead.

>  		}
> -		free_commit_list(commit->parents);
> -		commit->parents = NULL;
>  		if (saved_nrl < rev->diffopt.needed_rename_limit)
>  			saved_nrl = rev->diffopt.needed_rename_limit;
>  		if (rev->diffopt.degraded_cc_to_c)

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 0/4] reflog-walk fixes for maint
  2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
                                                 ` (4 preceding siblings ...)
  2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
@ 2017-07-07 20:15                               ` Junio C Hamano
  5 siblings, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2017-07-07 20:15 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua, Eric Wong

Thanks.  All made sense to me after reading them twice.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 6/7] reflog-walk: stop using fake parents
  2017-07-07 15:54                                   ` Kyle Meyer
@ 2017-07-09 10:08                                     ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-09 10:08 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: Junio C Hamano, git, Sahil Dua, Eric Wong, brian m. carlson

On Fri, Jul 07, 2017 at 11:54:05AM -0400, Kyle Meyer wrote:

> Jeff King <peff@peff.net> writes:
> 
> >      Prior to this commit, we show both entries with
> >      identical reflog messages. After this commit, we show
> >      only the "comes back" entry. See the update in t3200
> >      which demonstrates this.
> >
> >      Arguably either is fine, as the whole double-entry
> >      thing is a bit hacky in the first place. And until a
> >      recent fix, we truncated the traversal in such a case
> >      anyway, which was _definitely_ wrong.
> 
> Yeah, I agree that the double-entry thing is a bit hacky and only
> showing the "comes back" entry makes sense.
> 
> And with this change, I believe that the display of a rename event will
> be the same for HEAD's log and the renamed branch's log, despite the
> underlying entries having a different representation.

There's one minor difference: the numbering for HEAD will show a "hole"
in the reflog. So on the branch you might see something like:

  $ git log -g --oneline other
  269000b other@{0}: Branch: renamed refs/heads/master to refs/heads/other
  269000b other@{1}: commit (initial): foo

but the HEAD reflog will show:

  $ git log -g --oneline HEAD
  269000b HEAD@{0}: Branch: renamed refs/heads/master to refs/heads/other
  269000b HEAD@{2}: commit (initial): foo

This is pretty minor. I do wonder if there are other bits that might be
confused, though. Looking at the hole is odd:

  $ git show --oneline HEAD@{1}
  warning: Log for ref HEAD unexpectedly ended on Sun, 9 Jul 2017 06:07:05 -0400.
  269000b foo

Despite the warning, we do seem to correctly walk past it, though:

  $ git show --oneline HEAD@{2}
  269000b foo

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 3/7] log: do not free parents when walking reflog
  2017-07-07 17:10                                   ` Junio C Hamano
@ 2017-07-09 10:13                                     ` Jeff King
  2017-07-09 16:59                                       ` Junio C Hamano
  0 siblings, 1 reply; 69+ messages in thread
From: Jeff King @ 2017-07-09 10:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua, Eric Wong

On Fri, Jul 07, 2017 at 10:10:54AM -0700, Junio C Hamano wrote:

> > diff --git a/builtin/log.c b/builtin/log.c
> > index 8ca1de9894..9c8bb3b5c3 100644
> > --- a/builtin/log.c
> > +++ b/builtin/log.c
> > @@ -374,9 +374,9 @@ static int cmd_log_walk(struct rev_info *rev)
> >  		if (!rev->reflog_info) {
> >  			/* we allow cycles in reflog ancestry */
> >  			free_commit_buffer(commit);
> > +			free_commit_list(commit->parents);
> > +			commit->parents = NULL;
> 
> After step 6/7, we no longer "allow cycles in reflog ancestry", as
> there will be no reflog ancestry to speak of ;-), so it would be
> nice to remove the comment above in that step.  But alternatively,
> we can rephrase the comment here, to say something like "the same
> commit can be shown multiple times while showing entries from the
> reflog" instead.

I actually think the comment is a bit obtuse in the first place. The
real issue is that we show commits multiple times. That's caused by
cycles, yes, but also by us clearing the SEEN flag. ;)

Maybe this on top?

-- >8 --
Subject: [PATCH] log: clarify comment about reflog cycles

When we're walking reflogs, we leave the commit buffer and
parents in place. A comment explains that this is due to
"cycles". But the interesting thing is the unsaid
implication: that the cycles (plus our clearing of the SEEN
flag) will cause us to show commits multiple times. Let's
spell it out.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/log.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/builtin/log.c b/builtin/log.c
index 9c8bb3b5c..630d6cff2 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -372,7 +372,10 @@ static int cmd_log_walk(struct rev_info *rev)
 			 */
 			rev->max_count++;
 		if (!rev->reflog_info) {
-			/* we allow cycles in reflog ancestry */
+			/*
+			 * We may show a given commit multiple times when
+			 * walking the reflogs.
+			 */
 			free_commit_buffer(commit);
 			free_commit_list(commit->parents);
 			commit->parents = NULL;
-- 
2.13.2.1066.gabaed60bd


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 3/7] log: do not free parents when walking reflog
  2017-07-09 10:13                                     ` Jeff King
@ 2017-07-09 16:59                                       ` Junio C Hamano
  2017-07-10 13:27                                         ` Jeff King
  0 siblings, 1 reply; 69+ messages in thread
From: Junio C Hamano @ 2017-07-09 16:59 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua, Eric Wong

Jeff King <peff@peff.net> writes:

> On Fri, Jul 07, 2017 at 10:10:54AM -0700, Junio C Hamano wrote:
>
>> > diff --git a/builtin/log.c b/builtin/log.c
>> > index 8ca1de9894..9c8bb3b5c3 100644
>> > --- a/builtin/log.c
>> > +++ b/builtin/log.c
>> > @@ -374,9 +374,9 @@ static int cmd_log_walk(struct rev_info *rev)
>> >  		if (!rev->reflog_info) {
>> >  			/* we allow cycles in reflog ancestry */
>> >  			free_commit_buffer(commit);
>> > +			free_commit_list(commit->parents);
>> > +			commit->parents = NULL;
>> 
>> After step 6/7, we no longer "allow cycles in reflog ancestry", as
>> there will be no reflog ancestry to speak of ;-), so it would be
>> nice to remove the comment above in that step.  But alternatively,
>> we can rephrase the comment here, to say something like "the same
>> commit can be shown multiple times while showing entries from the
>> reflog" instead.
>
> I actually think the comment is a bit obtuse in the first place. The
> real issue is that we show commits multiple times. That's caused by
> cycles, yes, but also by us clearing the SEEN flag. ;)
>
> Maybe this on top?

Yup, that is a much better version of what I had in mind that can go
either before this step as a preparatory cleanup, squashed into this
as "while at it", or after the series as a finishing touches.  The
last one will let the codebase lie for a short while, though, so I am
likely to squash it in or wiggle it under.

Thanks.



>
> -- >8 --
> Subject: [PATCH] log: clarify comment about reflog cycles
>
> When we're walking reflogs, we leave the commit buffer and
> parents in place. A comment explains that this is due to
> "cycles". But the interesting thing is the unsaid
> implication: that the cycles (plus our clearing of the SEEN
> flag) will cause us to show commits multiple times. Let's
> spell it out.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  builtin/log.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index 9c8bb3b5c..630d6cff2 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -372,7 +372,10 @@ static int cmd_log_walk(struct rev_info *rev)
>  			 */
>  			rev->max_count++;
>  		if (!rev->reflog_info) {
> -			/* we allow cycles in reflog ancestry */
> +			/*
> +			 * We may show a given commit multiple times when
> +			 * walking the reflogs.
> +			 */
>  			free_commit_buffer(commit);
>  			free_commit_list(commit->parents);
>  			commit->parents = NULL;

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 6/6] reflog-walk: stop using fake parents
  2017-07-07  3:02                                 ` Jeff King
  2017-07-07  3:15                                   ` Jeff King
@ 2017-07-10  9:42                                   ` Eric Wong
  2017-07-10 11:20                                     ` Jeff King
  2017-07-10 16:09                                     ` Junio C Hamano
  1 sibling, 2 replies; 69+ messages in thread
From: Eric Wong @ 2017-07-10  9:42 UTC (permalink / raw)
  To: Jeff King
  Cc: brian m. carlson, Junio C Hamano, Kyle Meyer, git, Sahil Dua,
	Ramsay Jones

Jeff King <peff@peff.net> wrote:
> On Fri, Jul 07, 2017 at 12:32:39AM +0000, Eric Wong wrote:
> 
> > I'm not sure why, but this is causing t1414.8 failures on 32-bit
> > x86 with the latest pu with Debian jessie (oldstable).
> > 
> > Reverting this (beafb2c62947a6d4a97b9c3baf99fe62ec8e830f) in pu
> > seems to fix the test for me.
> > 
> > +Cc: Ramsay since he also had a 32-bit environment.
> 
> Thanks, I was able to reproduce with CFLAGS=-m32.

No problem and thanks for the quick response!  Latest pu is
fine for me.

To notice similar errors sooner, I wonder if we should have a
test target for 64-bit users to test with -m32 enabled, somehow.

Fwiw, I started having a cronjob which runs "make test-pu" on my
32-bit machine, where config.mak has this:

==> config.mak <==
DEVELOPER := 1
pulog := /tmp/git-test-pu.log

cur_head = $(shell git rev-parse HEAD^0)
cur_pu = $(shell git rev-parse refs/remotes/origin/pu^0)

test-pu:
	git fetch -q origin
	test x"$(cur_pu)" = x"$(cur_head)" || { \
		$(MAKE) clean && \
		git reset --hard origin/pu && \
		{ $(MAKE) -k test >$(pulog) 2>&1 || \
		 mutt -s 'pufail' e@80x24.org <$(pulog); }; }
-- 
EW

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 6/6] reflog-walk: stop using fake parents
  2017-07-10  9:42                                   ` Eric Wong
@ 2017-07-10 11:20                                     ` Jeff King
  2017-07-10 16:09                                     ` Junio C Hamano
  1 sibling, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-10 11:20 UTC (permalink / raw)
  To: Eric Wong
  Cc: brian m. carlson, Junio C Hamano, Kyle Meyer, git, Sahil Dua,
	Ramsay Jones

On Mon, Jul 10, 2017 at 09:42:55AM +0000, Eric Wong wrote:

> > Thanks, I was able to reproduce with CFLAGS=-m32.
> 
> No problem and thanks for the quick response!  Latest pu is
> fine for me.
> 
> To notice similar errors sooner, I wonder if we should have a
> test target for 64-bit users to test with -m32 enabled, somehow.

It's "just"

  make CFLAGS=-m32 test

once you are setup for building 32-bit binaries. But that setup can be a
bit tricky.  On Debian, I have multi-arch set up, and then I have to
"apt-get install zlib1g-dev:i386" etc for all the dependencies. Which
isn't _too_ bad except that libcurl4-openssl-dev conflicts between the
amd64 and i386 versions due to /usr/bin/curl-config. So I have to
install the i386 package, do the -m32 build, and then reinstall the
amd64 one to go back to my regular builds.

At any rate, the 32-bit thing in this instance[1] mostly just tickled
the bug in a different way. Testing with either ASan or valgrind also
detected the problem on 64-bit machines, and people (including me) do
run those periodically. In this case the bug was years old. The reason
it wasn't caught earlier is that there wasn't test coverage, and my
series added it for other reasons. So you really _did_ catch it as soon
as it hit pu, which seems pretty reasonable.

-Peff

[1] There are bugs that really are 32-bit-only, of course, and
    instrumenting tools wouldn't catch those. I think we had a bug a
    while back where the code mistook sizeof(uint64_t*) for
    sizeof(uint64_t). They're the same on 64-bit platforms, but not on
    32-bit ones. I do think such bugs are relatively rare, though.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 3/7] log: do not free parents when walking reflog
  2017-07-09 16:59                                       ` Junio C Hamano
@ 2017-07-10 13:27                                         ` Jeff King
  0 siblings, 0 replies; 69+ messages in thread
From: Jeff King @ 2017-07-10 13:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: brian m. carlson, Kyle Meyer, git, Sahil Dua, Eric Wong

On Sun, Jul 09, 2017 at 09:59:33AM -0700, Junio C Hamano wrote:

> >> After step 6/7, we no longer "allow cycles in reflog ancestry", as
> >> there will be no reflog ancestry to speak of ;-), so it would be
> >> nice to remove the comment above in that step.  But alternatively,
> >> we can rephrase the comment here, to say something like "the same
> >> commit can be shown multiple times while showing entries from the
> >> reflog" instead.
> >
> > I actually think the comment is a bit obtuse in the first place. The
> > real issue is that we show commits multiple times. That's caused by
> > cycles, yes, but also by us clearing the SEEN flag. ;)
> >
> > Maybe this on top?
> 
> Yup, that is a much better version of what I had in mind that can go
> either before this step as a preparatory cleanup, squashed into this
> as "while at it", or after the series as a finishing touches.  The
> last one will let the codebase lie for a short while, though, so I am
> likely to squash it in or wiggle it under.

Where you put it on jk/reflog-walk in your tree is fine, though note the
commit message is slightly inaccurate there (it says "commit buffer and
parents" but at that point it is really just "commit buffer").

-Peff

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 6/6] reflog-walk: stop using fake parents
  2017-07-10  9:42                                   ` Eric Wong
  2017-07-10 11:20                                     ` Jeff King
@ 2017-07-10 16:09                                     ` Junio C Hamano
  1 sibling, 0 replies; 69+ messages in thread
From: Junio C Hamano @ 2017-07-10 16:09 UTC (permalink / raw)
  To: Eric Wong
  Cc: Jeff King, brian m. carlson, Kyle Meyer, git, Sahil Dua,
	Ramsay Jones

Eric Wong <e@80x24.org> writes:

> To notice similar errors sooner, I wonder if we should have a
> test target for 64-bit users to test with -m32 enabled, somehow.

Thanks.  Hopefully https://travis-ci.org/git/git/jobs/251772744 and
its later incarnations should be sufficient?

^ permalink raw reply	[flat|nested] 69+ messages in thread

end of thread, other threads:[~2017-07-10 16:09 UTC | newest]

Thread overview: 69+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-21 21:40 Truncating HEAD reflog on branch move brian m. carlson
2017-06-21 21:46 ` Junio C Hamano
2017-06-21 23:04   ` Kyle Meyer
2017-06-21 23:12     ` Kyle Meyer
2017-06-22 15:16     ` Jeff King
2017-06-22 18:32       ` Junio C Hamano
2017-06-22 18:45         ` Jeff King
2017-06-22 19:03           ` Junio C Hamano
2017-06-22 20:21             ` Jeff King
2017-06-22 20:32               ` Junio C Hamano
2017-06-22 21:52                 ` Jeff King
2017-06-22 22:25                   ` Jeff King
2017-06-23  3:13                     ` Jeff King
2017-07-04 19:58                       ` brian m. carlson
2017-07-04 21:24                         ` Jeff King
2017-07-04 21:27                           ` brian m. carlson
2017-07-04 21:28                             ` Jeff King
2017-07-05  1:49                           ` Junio C Hamano
2017-07-05  7:55                           ` [PATCH 0/6] fixing reflog-walk oddities Jeff King
2017-07-05  7:57                             ` [PATCH 1/6] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
2017-07-05 17:34                               ` Junio C Hamano
2017-07-05 21:21                                 ` Jeff King
2017-07-05  8:00                             ` [PATCH 2/6] t1414: document some reflog-walk oddities Jeff King
2017-07-05 17:56                               ` Junio C Hamano
2017-07-05 21:27                                 ` Jeff King
2017-07-05 22:45                                   ` Junio C Hamano
2017-07-06  7:16                                     ` Jeff King
2017-07-06  7:55                                       ` Jeff King
2017-07-06 15:42                                       ` Junio C Hamano
2017-07-07  5:19                                         ` Jeff King
2017-07-07  6:00                                           ` Junio C Hamano
2017-07-07  6:22                                             ` Jeff King
2017-07-05 20:05                               ` Ramsay Jones
2017-07-05 21:30                                 ` Jeff King
2017-07-05  8:02                             ` [PATCH 3/6] log: do not free parents when walking reflog Jeff King
2017-07-05  8:04                             ` [PATCH 4/6] get_revision_1(): replace do-while with an early return Jeff King
2017-07-05  8:06                             ` [PATCH 5/6] rev-list: check reflog_info before showing usage Jeff King
2017-07-05 18:07                               ` Junio C Hamano
2017-07-05 21:31                                 ` Jeff King
2017-07-05  8:09                             ` [PATCH 6/6] reflog-walk: stop using fake parents Jeff King
2017-07-07  0:32                               ` Eric Wong
2017-07-07  3:02                                 ` Jeff King
2017-07-07  3:15                                   ` Jeff King
2017-07-10  9:42                                   ` Eric Wong
2017-07-10 11:20                                     ` Jeff King
2017-07-10 16:09                                     ` Junio C Hamano
2017-07-07  8:36                             ` [PATCH v2 0/4] reflog-walk fixes for maint Jeff King
2017-07-07  8:37                               ` [PATCH v2 1/4] reflog-walk: skip over double-null oid due to HEAD rename Jeff King
2017-07-07  8:39                               ` [PATCH v2 2/4] reflog-walk: duplicate strings in complete_reflogs list Jeff King
2017-07-07  8:41                               ` [PATCH v2 3/4] reflog-walk: don't free reflogs added to cache Jeff King
2017-07-07  8:43                               ` [PATCH v2 4/4] reflog-walk: include all fields when freeing complete_reflogs Jeff King
2017-07-07  9:05                               ` [PATCH v2 0/7] fixing reflog-walk oddities Jeff King
2017-07-07  9:06                                 ` [PATCH v2 1/7] t1414: document some " Jeff King
2017-07-07  9:21                                   ` Jeff King
2017-07-07 15:54                                   ` Kyle Meyer
2017-07-07  9:07                                 ` [PATCH v2 2/7] revision: disallow reflog walking with revs->limited Jeff King
2017-07-07  9:07                                 ` [PATCH v2 3/7] log: do not free parents when walking reflog Jeff King
2017-07-07 17:10                                   ` Junio C Hamano
2017-07-09 10:13                                     ` Jeff King
2017-07-09 16:59                                       ` Junio C Hamano
2017-07-10 13:27                                         ` Jeff King
2017-07-07  9:07                                 ` [PATCH v2 4/7] get_revision_1(): replace do-while with an early return Jeff King
2017-07-07  9:08                                 ` [PATCH v2 5/7] rev-list: check reflog_info before showing usage Jeff King
2017-07-07  9:14                                 ` [PATCH v2 6/7] reflog-walk: stop using fake parents Jeff King
2017-07-07  9:24                                   ` Jeff King
2017-07-07 15:54                                   ` Kyle Meyer
2017-07-09 10:08                                     ` Jeff King
2017-07-07  9:16                                 ` [PATCH v2 7/7] reflog-walk: apply --since/--until to reflog dates Jeff King
2017-07-07 20:15                               ` [PATCH v2 0/4] reflog-walk fixes for maint Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).