git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] name-rev: change ULONG_MAX to TIME_MAX
@ 2017-08-30  9:46 Michael J Gruber
  2017-08-30 20:23 ` Johannes Schindelin
  2017-09-06  3:35 ` Junio C Hamano
  0 siblings, 2 replies; 14+ messages in thread
From: Michael J Gruber @ 2017-08-30  9:46 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

Earlier, dddbad728c ("timestamp_t: a new data type for timestamps",
2017-04-26) changed several types to timestamp_t.

5589e87fd8 ("name-rev: change a "long" variable to timestamp_t",
2017-05-20) cleaned up a missed variable, but both missed a _MAX
constant.

Change the remaining constant to the one appropriate for the current
type

Signed-off-by: Michael J Gruber <git@grubix.eu>
---
 builtin/name-rev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index c41ea7c2a6..598da6c8bc 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -253,7 +253,7 @@ static int name_ref(const char *path, const struct object_id *oid, int flags, vo
 		struct commit *commit = (struct commit *)o;
 		int from_tag = starts_with(path, "refs/tags/");
 
-		if (taggerdate == ULONG_MAX)
+		if (taggerdate == TIME_MAX)
 			taggerdate = ((struct commit *)o)->date;
 		path = name_ref_abbrev(path, can_abbreviate_output);
 		name_rev(commit, xstrdup(path), taggerdate, 0, 0,
-- 
2.14.1.603.gf58147c36e


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] name-rev: change ULONG_MAX to TIME_MAX
  2017-08-30  9:46 [PATCH] name-rev: change ULONG_MAX to TIME_MAX Michael J Gruber
@ 2017-08-30 20:23 ` Johannes Schindelin
  2017-09-06  3:35 ` Junio C Hamano
  1 sibling, 0 replies; 14+ messages in thread
From: Johannes Schindelin @ 2017-08-30 20:23 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Junio C Hamano

Hi Michael,

On Wed, 30 Aug 2017, Michael J Gruber wrote:

> Earlier, dddbad728c ("timestamp_t: a new data type for timestamps",
> 2017-04-26) changed several types to timestamp_t.
> 
> 5589e87fd8 ("name-rev: change a "long" variable to timestamp_t",
> 2017-05-20) cleaned up a missed variable, but both missed a _MAX
> constant.
> 
> Change the remaining constant to the one appropriate for the current
> type
> 
> Signed-off-by: Michael J Gruber <git@grubix.eu>

Good catch!

ACK,
Dscho

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] name-rev: change ULONG_MAX to TIME_MAX
  2017-08-30  9:46 [PATCH] name-rev: change ULONG_MAX to TIME_MAX Michael J Gruber
  2017-08-30 20:23 ` Johannes Schindelin
@ 2017-09-06  3:35 ` Junio C Hamano
  2017-09-06 11:59   ` Michael J Gruber
  1 sibling, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2017-09-06  3:35 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Johannes Schindelin

Michael J Gruber <git@grubix.eu> writes:

> Earlier, dddbad728c ("timestamp_t: a new data type for timestamps",
> 2017-04-26) changed several types to timestamp_t.
>
> 5589e87fd8 ("name-rev: change a "long" variable to timestamp_t",
> 2017-05-20) cleaned up a missed variable, but both missed a _MAX
> constant.
>
> Change the remaining constant to the one appropriate for the current
> type
>
> Signed-off-by: Michael J Gruber <git@grubix.eu>
> ---

Thanks.

I think this (and the earlier 5589e8) was caused by an unnoticed
semantic conflict at 78089b71 ("Merge branch 'jc/name-rev-lw-tag'",
2017-05-30).  Merging is sometimes hard ;-)

Will queue.

>  builtin/name-rev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index c41ea7c2a6..598da6c8bc 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -253,7 +253,7 @@ static int name_ref(const char *path, const struct object_id *oid, int flags, vo
>  		struct commit *commit = (struct commit *)o;
>  		int from_tag = starts_with(path, "refs/tags/");
>  
> -		if (taggerdate == ULONG_MAX)
> +		if (taggerdate == TIME_MAX)
>  			taggerdate = ((struct commit *)o)->date;
>  		path = name_ref_abbrev(path, can_abbreviate_output);
>  		name_rev(commit, xstrdup(path), taggerdate, 0, 0,

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] name-rev: change ULONG_MAX to TIME_MAX
  2017-09-06  3:35 ` Junio C Hamano
@ 2017-09-06 11:59   ` Michael J Gruber
  2017-09-06 13:35     ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Michael J Gruber @ 2017-09-06 11:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Johannes Schindelin

Junio C Hamano venit, vidit, dixit 06.09.2017 05:35:
> Michael J Gruber <git@grubix.eu> writes:
> 
>> Earlier, dddbad728c ("timestamp_t: a new data type for timestamps",
>> 2017-04-26) changed several types to timestamp_t.
>>
>> 5589e87fd8 ("name-rev: change a "long" variable to timestamp_t",
>> 2017-05-20) cleaned up a missed variable, but both missed a _MAX
>> constant.
>>
>> Change the remaining constant to the one appropriate for the current
>> type
>>
>> Signed-off-by: Michael J Gruber <git@grubix.eu>
>> ---
> 
> Thanks.
> 
> I think this (and the earlier 5589e8) was caused by an unnoticed
> semantic conflict at 78089b71 ("Merge branch 'jc/name-rev-lw-tag'",
> 2017-05-30).  Merging is sometimes hard ;-)

Simple merges and semi-simple merges...

BTW, there's more fallout from those name-rev changes: In connection
with that other thread about surprising describe results for emacs.git I
noticed that I can easily get "git name-rev --stdin" to segfault there.
As easy as

echo bc5d96a0b2a1dccf7eeeec459e40d21b54c977f4 | git name-rev --stdin

for example.

That's unfortunate for the use-case of name-rev to amend git log output.

The reason seems to be that with "--stdin" or "--all", "name-rev" walks
and names all commits before beginning to use that those names for even
a single commit as above.

That segfault bisects to the logic changing commit in
jc/name-rev-lw-tag, but I think the changed logic simply leads to more
xmallocs() the segfault sooner now. Or something that I dind't spot even
after a few hours.

On the other hand, nearly every time that I try to understand describe
or name-rev I want get rid of insert_commit_by_date() and the like and
replace this by generations, and maybe a simple rev-walk (per ref)...

> Will queue.
> 
>>  builtin/name-rev.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
>> index c41ea7c2a6..598da6c8bc 100644
>> --- a/builtin/name-rev.c
>> +++ b/builtin/name-rev.c
>> @@ -253,7 +253,7 @@ static int name_ref(const char *path, const struct object_id *oid, int flags, vo
>>  		struct commit *commit = (struct commit *)o;
>>  		int from_tag = starts_with(path, "refs/tags/");
>>  
>> -		if (taggerdate == ULONG_MAX)
>> +		if (taggerdate == TIME_MAX)
>>  			taggerdate = ((struct commit *)o)->date;
>>  		path = name_ref_abbrev(path, can_abbreviate_output);
>>  		name_rev(commit, xstrdup(path), taggerdate, 0, 0,

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] name-rev: change ULONG_MAX to TIME_MAX
  2017-09-06 11:59   ` Michael J Gruber
@ 2017-09-06 13:35     ` Jeff King
  2017-09-07 12:17       ` Michael J Gruber
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2017-09-06 13:35 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Junio C Hamano, git, Johannes Schindelin

On Wed, Sep 06, 2017 at 01:59:31PM +0200, Michael J Gruber wrote:

> BTW, there's more fallout from those name-rev changes: In connection
> with that other thread about surprising describe results for emacs.git I
> noticed that I can easily get "git name-rev --stdin" to segfault there.
> As easy as
> 
> echo bc5d96a0b2a1dccf7eeeec459e40d21b54c977f4 | git name-rev --stdin
> 
> for example.
> 
> That's unfortunate for the use-case of name-rev to amend git log output.
> 
> The reason seems to be that with "--stdin" or "--all", "name-rev" walks
> and names all commits before beginning to use that those names for even
> a single commit as above.
> 
> That segfault bisects to the logic changing commit in
> jc/name-rev-lw-tag, but I think the changed logic simply leads to more
> xmallocs() the segfault sooner now. Or something that I dind't spot even
> after a few hours.

The segfault seems to be due to running out of stack space. The problem
is that name_rev() is recursive over the history graph.  That topic
added a parameter to the function, which increased the memory used for
each level of the recursion. But the fundamental problem has always been
there. The right solution is to switch to iteration (with our own stack
structure if necessary).

We had similar problems with the recursive --contains traversal in tag,
and ended up with cbc60b6720 (git tag --contains: avoid stack overflow,
2014-04-24).

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] name-rev: change ULONG_MAX to TIME_MAX
  2017-09-06 13:35     ` Jeff King
@ 2017-09-07 12:17       ` Michael J Gruber
  2017-09-07 14:02         ` [PATCH 0/4] Test name-rev with small stack Michael J Gruber
  0 siblings, 1 reply; 14+ messages in thread
From: Michael J Gruber @ 2017-09-07 12:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Johannes Schindelin

Jeff King venit, vidit, dixit 06.09.2017 15:35:
> On Wed, Sep 06, 2017 at 01:59:31PM +0200, Michael J Gruber wrote:
> 
>> BTW, there's more fallout from those name-rev changes: In connection
>> with that other thread about surprising describe results for emacs.git I
>> noticed that I can easily get "git name-rev --stdin" to segfault there.
>> As easy as
>>
>> echo bc5d96a0b2a1dccf7eeeec459e40d21b54c977f4 | git name-rev --stdin
>>
>> for example.
>>
>> That's unfortunate for the use-case of name-rev to amend git log output.
>>
>> The reason seems to be that with "--stdin" or "--all", "name-rev" walks
>> and names all commits before beginning to use that those names for even
>> a single commit as above.
>>
>> That segfault bisects to the logic changing commit in
>> jc/name-rev-lw-tag, but I think the changed logic simply leads to more
>> xmallocs() the segfault sooner now. Or something that I dind't spot even
>> after a few hours.
> 
> The segfault seems to be due to running out of stack space. The problem
> is that name_rev() is recursive over the history graph.  That topic
> added a parameter to the function, which increased the memory used for
> each level of the recursion. But the fundamental problem has always been
> there. The right solution is to switch to iteration (with our own stack
> structure if necessary).
> 
> We had similar problems with the recursive --contains traversal in tag,
> and ended up with cbc60b6720 (git tag --contains: avoid stack overflow,
> 2014-04-24).

Cool, thanks for the pointer. ulimit -s is a great way to test this.

Michael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 0/4] Test name-rev with small stack
  2017-09-07 12:17       ` Michael J Gruber
@ 2017-09-07 14:02         ` Michael J Gruber
  2017-09-07 14:02           ` [PATCH 1/4] t7004: move limited stack prereq to test-lib Michael J Gruber
                             ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Michael J Gruber @ 2017-09-07 14:02 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

name-rev segfaults for me in emacs.git with the typical 8102 stack size.
The reason is the recursive walk that name-rev uses.

This series adds a test to mark this as known failure, after some
clean-ups.

Michael J Gruber (4):
  t7004: move limited stack prereq to test-lib
  t6120: test name-rev --all and --stdin
  t6120: clean up state after breaking repo
  t6120: test describe and name-rev with deep repos

 t/t6120-describe.sh | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 t/t7004-tag.sh      |  6 ------
 t/test-lib.sh       |  6 ++++++
 3 files changed, 63 insertions(+), 6 deletions(-)

-- 
2.14.1.603.gf58147c36e


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] t7004: move limited stack prereq to test-lib
  2017-09-07 14:02         ` [PATCH 0/4] Test name-rev with small stack Michael J Gruber
@ 2017-09-07 14:02           ` Michael J Gruber
  2017-09-07 14:02           ` [PATCH 2/4] t6120: test name-rev --all and --stdin Michael J Gruber
                             ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Michael J Gruber @ 2017-09-07 14:02 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

The lazy prerequisite  ULIMIT_STACK_SIZE is used only in t7004 so far.

Move it to test-lib.sh so that it can be used in other tests (which it will
be in a follow-up commit).

Signed-off-by: Michael J Gruber <git@grubix.eu>
---
 t/t7004-tag.sh | 6 ------
 t/test-lib.sh  | 6 ++++++
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index dbcd6f623c..5bf5ace56b 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1863,12 +1863,6 @@ test_expect_success 'version sort with very long prerelease suffix' '
 	git tag -l --sort=version:refname
 '
 
-run_with_limited_stack () {
-	(ulimit -s 128 && "$@")
-}
-
-test_lazy_prereq ULIMIT_STACK_SIZE 'run_with_limited_stack true'
-
 # we require ulimit, this excludes Windows
 test_expect_success ULIMIT_STACK_SIZE '--contains and --no-contains work in a deep repo' '
 	>expect &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 5fbd8d4a90..f22c1b260a 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1167,6 +1167,12 @@ run_with_limited_cmdline () {
 
 test_lazy_prereq CMDLINE_LIMIT 'run_with_limited_cmdline true'
 
+run_with_limited_stack () {
+	(ulimit -s 128 && "$@")
+}
+
+test_lazy_prereq ULIMIT_STACK_SIZE 'run_with_limited_stack true'
+
 build_option () {
 	git version --build-options |
 	sed -ne "s/^$1: //p"
-- 
2.14.1.603.gf58147c36e


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] t6120: test name-rev --all and --stdin
  2017-09-07 14:02         ` [PATCH 0/4] Test name-rev with small stack Michael J Gruber
  2017-09-07 14:02           ` [PATCH 1/4] t7004: move limited stack prereq to test-lib Michael J Gruber
@ 2017-09-07 14:02           ` Michael J Gruber
  2017-09-07 14:02           ` [PATCH 3/4] t6120: clean up state after breaking repo Michael J Gruber
                             ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Michael J Gruber @ 2017-09-07 14:02 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

name-rev is used in a few tests, but tested only in t6120 along with
describe so far.

Add tests for name-rev with --all and --stdin.

Signed-off-by: Michael J Gruber <git@grubix.eu>
---
 t/t6120-describe.sh | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index aa74eb8f0d..7c5728ebd5 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -198,6 +198,31 @@ test_expect_success 'name-rev with exact tags' '
 	test_cmp expect actual
 '
 
+test_expect_success 'name-rev --all' '
+	>expect.unsorted &&
+	for rev in $(git rev-list --all)
+	do
+		git name-rev $rev >>expect.unsorted
+	done &&
+	sort <expect.unsorted >expect &&
+	git name-rev --all >actual.unsorted &&
+	sort <actual.unsorted >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'name-rev --stdin' '
+	>expect.unsorted &&
+	for rev in $(git rev-list --all)
+	do
+		name=$(git name-rev --name-only $rev) &&
+		echo "$rev ($name)" >>expect.unsorted
+	done &&
+	sort <expect.unsorted >expect &&
+	git rev-list --all | git name-rev --stdin >actual.unsorted &&
+	sort <actual.unsorted >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'describe --contains with the exact tags' '
 	echo "A^0" >expect &&
 	tag_object=$(git rev-parse refs/tags/A) &&
-- 
2.14.1.603.gf58147c36e


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] t6120: clean up state after breaking repo
  2017-09-07 14:02         ` [PATCH 0/4] Test name-rev with small stack Michael J Gruber
  2017-09-07 14:02           ` [PATCH 1/4] t7004: move limited stack prereq to test-lib Michael J Gruber
  2017-09-07 14:02           ` [PATCH 2/4] t6120: test name-rev --all and --stdin Michael J Gruber
@ 2017-09-07 14:02           ` Michael J Gruber
  2017-09-07 14:02           ` [PATCH 4/4] t6120: test describe and name-rev with deep repos Michael J Gruber
  2017-09-07 14:54           ` [PATCH 0/4] Test name-rev with small stack Jeff King
  4 siblings, 0 replies; 14+ messages in thread
From: Michael J Gruber @ 2017-09-07 14:02 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

t6120 breaks the repo state intentionally in the last tests.

Clean up the breakage afterwards (and before adding more tests).

Signed-off-by: Michael J Gruber <git@grubix.eu>
---
 t/t6120-describe.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 7c5728ebd5..1997ccde56 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -275,6 +275,7 @@ test_expect_success 'describe chokes on severely broken submodules' '
 '
 test_expect_success 'describe ignoring a borken submodule' '
 	git describe --broken >out &&
+	test_when_finished "mv .git/modules/sub_moved .git/modules/sub1" &&
 	grep broken out
 '
 
-- 
2.14.1.603.gf58147c36e


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] t6120: test describe and name-rev with deep repos
  2017-09-07 14:02         ` [PATCH 0/4] Test name-rev with small stack Michael J Gruber
                             ` (2 preceding siblings ...)
  2017-09-07 14:02           ` [PATCH 3/4] t6120: clean up state after breaking repo Michael J Gruber
@ 2017-09-07 14:02           ` Michael J Gruber
  2017-09-07 14:54           ` [PATCH 0/4] Test name-rev with small stack Jeff King
  4 siblings, 0 replies; 14+ messages in thread
From: Michael J Gruber @ 2017-09-07 14:02 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

Depending on the implementation of walks, limitted stack size may lead
to problems (for recursion).

Test name-rev and describe with deep repos and limitted stack size and
mark the former with known failure.

We add these tests (which add gazillions of commits) last so as to keep
the runtime of other subtests the same.

Signed-off-by: Michael J Gruber <git@grubix.eu>
---
 t/t6120-describe.sh | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 1997ccde56..dd6dd9df9b 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -279,4 +279,35 @@ test_expect_success 'describe ignoring a borken submodule' '
 	grep broken out
 '
 
+# we require ulimit, this excludes Windows
+test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
+	i=1 &&
+	while test $i -lt 8000
+	do
+		echo "commit refs/heads/master
+committer A U Thor <author@example.com> $((1000000000 + $i * 100)) +0200
+data <<EOF
+commit #$i
+EOF"
+		test $i = 1 && echo "from refs/heads/master^0"
+		i=$(($i + 1))
+	done | git fast-import &&
+	git checkout master &&
+	git tag far-far-away HEAD^ &&
+	echo "HEAD~4000 tags/far-far-away~3999" >expect &&
+	git name-rev HEAD~4000 >actual &&
+	test_cmp expect actual &&
+	run_with_limited_stack git name-rev HEAD~4000 >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success ULIMIT_STACK_SIZE 'describe works in a deep repo' '
+	git tag -f far-far-away HEAD~7999 &&
+	echo "far-far-away" >expect &&
+	git describe --tags --abbrev=0 HEAD~4000 >actual &&
+	test_cmp expect actual &&
+	run_with_limited_stack git describe --tags --abbrev=0 HEAD~4000 >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.14.1.603.gf58147c36e


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Test name-rev with small stack
  2017-09-07 14:02         ` [PATCH 0/4] Test name-rev with small stack Michael J Gruber
                             ` (3 preceding siblings ...)
  2017-09-07 14:02           ` [PATCH 4/4] t6120: test describe and name-rev with deep repos Michael J Gruber
@ 2017-09-07 14:54           ` Jeff King
  2017-09-08 12:33             ` Michael J Gruber
  4 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2017-09-07 14:54 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Junio C Hamano

On Thu, Sep 07, 2017 at 04:02:19PM +0200, Michael J Gruber wrote:

> name-rev segfaults for me in emacs.git with the typical 8102 stack size.
> The reason is the recursive walk that name-rev uses.
> 
> This series adds a test to mark this as known failure, after some
> clean-ups.

These all look reasonable to me. The size of the test case in the final
one is presumably arbitrary and just copied from t7004. I don't know if
it's worth trying to shrink it. It could shorten a rather expensive
test. OTOH, if we shorten it too much then we might get a false pass
(e.g., if the algorithm remains recursive but has a smaller stack
footprint).

> Michael J Gruber (4):
>   t7004: move limited stack prereq to test-lib
>   t6120: test name-rev --all and --stdin
>   t6120: clean up state after breaking repo
>   t6120: test describe and name-rev with deep repos

Now comes the hard part: rewriting the C code. :)

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Test name-rev with small stack
  2017-09-07 14:54           ` [PATCH 0/4] Test name-rev with small stack Jeff King
@ 2017-09-08 12:33             ` Michael J Gruber
  2017-09-11 18:08               ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Michael J Gruber @ 2017-09-08 12:33 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano

Jeff King venit, vidit, dixit 07.09.2017 16:54:
> On Thu, Sep 07, 2017 at 04:02:19PM +0200, Michael J Gruber wrote:
> 
>> name-rev segfaults for me in emacs.git with the typical 8102 stack size.
>> The reason is the recursive walk that name-rev uses.
>>
>> This series adds a test to mark this as known failure, after some
>> clean-ups.
> 
> These all look reasonable to me. The size of the test case in the final
> one is presumably arbitrary and just copied from t7004. I don't know if
> it's worth trying to shrink it. It could shorten a rather expensive
> test. OTOH, if we shorten it too much then we might get a false pass
> (e.g., if the algorithm remains recursive but has a smaller stack
> footprint).
> 
>> Michael J Gruber (4):
>>   t7004: move limited stack prereq to test-lib
>>   t6120: test name-rev --all and --stdin
>>   t6120: clean up state after breaking repo
>>   t6120: test describe and name-rev with deep repos
> 
> Now comes the hard part: rewriting the C code. :)

Looking at it more closely, the solution in cbc60b6720 ("git tag
--contains: avoid stack overflow", 2014-04-24) seems to be a bit "ad
hoc" to me:

First of all, there is more than "tag --contains" that may exceed the
stack by recursion. So I would expect the solution to be more general,
and not localised and specialised to builtin/tag.c

Second, this is a walk, so I'm wondering whether our revision walk
machinery should be the place to add missing functionality (if any).
That way, everything would benefit from possible or existing
improvements there. For example, I think some of our "extra walkers"
don't heed object replacements. (I need to test more.)

Michael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/4] Test name-rev with small stack
  2017-09-08 12:33             ` Michael J Gruber
@ 2017-09-11 18:08               ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2017-09-11 18:08 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git, Junio C Hamano

On Fri, Sep 08, 2017 at 02:33:35PM +0200, Michael J Gruber wrote:

> Looking at it more closely, the solution in cbc60b6720 ("git tag
> --contains: avoid stack overflow", 2014-04-24) seems to be a bit "ad
> hoc" to me:
> 
> First of all, there is more than "tag --contains" that may exceed the
> stack by recursion. So I would expect the solution to be more general,
> and not localised and specialised to builtin/tag.c

At the time, tag was the only one using this depth-first contains
algorithm. It's since been adapted to ref-filter.c, but of course the
stack handling went with it.

Most traversals have a date-sorted queue, so are effectively doing a
breadth-first iteration with no recursion.

> Second, this is a walk, so I'm wondering whether our revision walk
> machinery should be the place to add missing functionality (if any).
> That way, everything would benefit from possible or existing
> improvements there. For example, I think some of our "extra walkers"
> don't heed object replacements. (I need to test more.)

It's possible that name-rev could make better use of the existing
traversal machinery. It's often tough to do so, though, because that
machinery gives you a linearized ordering of the commits. Whereas
something like name-rev really cares about the order that it visits the
commits, because it's building up the names.

It's the same for this "tag --contains" traversal. It _used_ to be a
series of merge-base computations. But by doing a custom traversal, we
can cache incremental results through the graph and avoid walking over
the same bits multiple times. There actually is a way to do it with the
regular breadth-first traversal, but you have to store one bit per ref
you're checking for each commit.

I played around with that a bit a while ago, and it did seem to work. I
can dig up the patches if you're interested. But one downside is that
one bit per ref per commit adds up if you have a lot of refs. A large
number of those bitfields will be the same, so you could probably get by
with a copy-on-write scheme, but I never implemented that.

Of course somebody may have a more clever algorithm, too. I don't claim
the above is a proof. ;)

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-09-11 18:08 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-30  9:46 [PATCH] name-rev: change ULONG_MAX to TIME_MAX Michael J Gruber
2017-08-30 20:23 ` Johannes Schindelin
2017-09-06  3:35 ` Junio C Hamano
2017-09-06 11:59   ` Michael J Gruber
2017-09-06 13:35     ` Jeff King
2017-09-07 12:17       ` Michael J Gruber
2017-09-07 14:02         ` [PATCH 0/4] Test name-rev with small stack Michael J Gruber
2017-09-07 14:02           ` [PATCH 1/4] t7004: move limited stack prereq to test-lib Michael J Gruber
2017-09-07 14:02           ` [PATCH 2/4] t6120: test name-rev --all and --stdin Michael J Gruber
2017-09-07 14:02           ` [PATCH 3/4] t6120: clean up state after breaking repo Michael J Gruber
2017-09-07 14:02           ` [PATCH 4/4] t6120: test describe and name-rev with deep repos Michael J Gruber
2017-09-07 14:54           ` [PATCH 0/4] Test name-rev with small stack Jeff King
2017-09-08 12:33             ` Michael J Gruber
2017-09-11 18:08               ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).