git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* git name-rev looks at refs/notes, refs/svn/map: stack overflow
@ 2019-11-09 11:31 Sebastiaan Dammann
  2019-11-11  4:21 ` Jeff King
  0 siblings, 1 reply; 4+ messages in thread
From: Sebastiaan Dammann @ 2019-11-09 11:31 UTC (permalink / raw)
  To: git

Hello git users,

I'm running into git crashing with an stack overflow and hope to hear
your views on this.

I have a from SVN converted repository (using Subgit). If I execute
"git name-rev 92f054ba5c71b8e689aed529ff31370984089b64" on this bare
repository, git crashes with a stack overflow.

One property of the git conversion process is that at every commit a
note is attached which lists the SVN version number. This repository
has over 50000 commits.
In addition, refs exist at refs/svn/map which also contains many
entries. It appears the name-rev looks at the notes and other refs as
well.

The backtrace of the stack overflow is this:

[repeats many times]
#12976 0x000000000046d178 in name_rev (commit=0x37241b0,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101, generation=3,
    distance=3, from_tag=0, deref=0) at builtin/name-rev.c:139
#12977 0x000000000046d178 in name_rev (commit=0x3724160,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101, generation=2,
    distance=2, from_tag=0, deref=0) at builtin/name-rev.c:139
#12978 0x000000000046d178 in name_rev (commit=0x3724110,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101, generation=1,
    distance=1, from_tag=0, deref=0) at builtin/name-rev.c:139
#12979 0x000000000046d178 in name_rev (commit=0x37240c0,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101, generation=0,
    distance=0, from_tag=0, deref=0) at builtin/name-rev.c:139
#12980 0x000000000046d69d in name_ref (path=0x37232e5 "notes/commits",
    oid=0x37c6030, flags=66, cb_data=0x155ef20) at builtin/name-rev.c:277
#12981 0x00000000005c6645 in do_for_each_ref_helper (r=0x7acee0 <the_repo>,
    refname=0x37232e0 "refs/notes/commits", oid=0x37c6030, flags=66,
    cb_data=0x155ede0) at refs.c:1552
#12982 0x00000000005cfcd1 in do_for_each_repo_ref_iterator (
    r=0x7acee0 <the_repo>, iter=0x37c6190,
    fn=0x5c65ff <do_for_each_ref_helper>, cb_data=0x155ede0)
    at refs/iterator.c:418
#12983 0x00000000005c66de in do_for_each_ref (refs=0x3716aa8,
    prefix=0x71d34f <__ac_HASH_UPPER+271> "", fn=0x46d3ff <name_ref>, trim=0,
    flags=0, cb_data=0x155ef20) at refs.c:1566
#12984 0x00000000005c673f in refs_for_each_ref (refs=0x3716aa8,
    fn=0x46d3ff <name_ref>, cb_data=0x155ef20) at refs.c:1572
#12985 0x00000000005c677a in for_each_ref (fn=0x46d3ff <name_ref>,
   cb_data=0x155ef20) at refs.c:1577
#12986 0x000000000046e2c1 in cmd_name_rev (argc=0, argv=0x37106b8, prefix=0x0)
    at builtin/name-rev.c:495
#12987 0x000000000040305c in run_builtin (p=0x6b56e8 <commands+1704>, argc=2,
    argv=0x37106b0) at git.c:444
#12988 0x0000000000403446 in handle_builtin (argc=2, argv=0x37106b0)
    at git.c:673
#12989 0x00000000004036fc in run_argv (argcp=0x155fdb0, argv=0x155fd58)
    at git.c:740
#12990 0x0000000000403b62 in cmd_main (argc=2, argv=0x37106b0) at git.c:871
#12991 0x00000000004c8647 in main (argc=3, argv=0x37106a8) at common-main.c:52

If I trace it back to the source we get this:
(gdb)
#16 0x000000000046d178 in name_rev (commit=0x3b948d30,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101,
    generation=12963, distance=12963, from_tag=0, deref=0)
    at builtin/name-rev.c:139
139                             name_rev(parents->item, tip_name, taggerdate,
(gdb)
#17 0x000000000046d178 in name_rev (commit=0x3b948ce0,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101,
    generation=12962, distance=12962, from_tag=0, deref=0)
    at builtin/name-rev.c:139
139                             name_rev(parents->item, tip_name, taggerdate,
(gdb)
#18 0x000000000046d178 in name_rev (commit=0x3b948c90,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101,
    generation=12961, distance=12961, from_tag=0, deref=0)
    at builtin/name-rev.c:139
139                             name_rev(parents->item, tip_name, taggerdate,
(gdb)
#19 0x000000000046d178 in name_rev (commit=0x3b948c40,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101,
    generation=12960, distance=12960, from_tag=0, deref=0)
    at builtin/name-rev.c:139
139                             name_rev(parents->item, tip_name, taggerdate,
(gdb)
#20 0x000000000046d178 in name_rev (commit=0x3b948bf0,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101,
    generation=12959, distance=12959, from_tag=0, deref=0)
    at builtin/name-rev.c:139
139                             name_rev(parents->item, tip_name, taggerdate,
(gdb)
#21 0x000000000046d178 in name_rev (commit=0x3b948ba0,
    tip_name=0x37ccb70 "notes/commits", taggerdate=1572995101,
    generation=12958, distance=12958, from_tag=0, deref=0)
    at builtin/name-rev.c:139
139

Essentially too much, but not endless recursion, causing a stack overflow.

The destructive workaround at the moment is to delete the refs, then
run an aggressive gc:
git update-ref -d refs/notes/commits
git update-ref -d refs/svn/map
git gc --prune=all --aggressive

I was redirected here by the git-for-windows team (issue link if
you're interested:
https://github.com/git-for-windows/git/issues/2393), so I'm going to
repeat what was said there:

> But yes, the problem seems to be the many, many notes. Which points to a deeper issue: name_rev should not even look at anything byt refs/heads/*, refs/remotes/* and refs/tags/. Certainly it should exclude refs/notes/*...
> This, however, is not Windows-specific. Maybe I can ask you to report this issue to the Git mailing list (git@vger.kernel.org, please suppress all HTML lest the mail will be bounced)?

I hope to hear your view on this. Is this an (confirmed) issue with
git? Are there beside the workaround I mentioned, any other
workarounds?

Groeten, Kind regards,
Sebastiaan Dammann

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: git name-rev looks at refs/notes, refs/svn/map: stack overflow
  2019-11-09 11:31 git name-rev looks at refs/notes, refs/svn/map: stack overflow Sebastiaan Dammann
@ 2019-11-11  4:21 ` Jeff King
  2019-11-11 14:09   ` SZEDER Gábor
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2019-11-11  4:21 UTC (permalink / raw)
  To: Sebastiaan Dammann; +Cc: SZEDER Gábor, git

On Sat, Nov 09, 2019 at 12:31:31PM +0100, Sebastiaan Dammann wrote:

> The destructive workaround at the moment is to delete the refs, then
> run an aggressive gc:
> git update-ref -d refs/notes/commits
> git update-ref -d refs/svn/map
> git gc --prune=all --aggressive

If the issue is just traversing from those refs, you shouldn't need to
do a "gc". In fact, you should be able to use the "--refs" option to
name-rev to avoid them without deleting them.

All of that's obviously a workaround, though. The real issue is the
stack exhaustion.

> I hope to hear your view on this. Is this an (confirmed) issue with
> git? Are there beside the workaround I mentioned, any other
> workarounds?

Yes, this is well known. It's covered in the test suite (unfortunately
still failing, of course) since 31625b34c0 (t6120: test describe and
name-rev with deep repos, 2017-09-07).

There was a proposed fix recently in:

  https://public-inbox.org/git/20190919214712.7348-1-szeder.dev@gmail.com/

but it doesn't seem to have been picked up. I'm not sure what the
current status is.

-Peff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: git name-rev looks at refs/notes, refs/svn/map: stack overflow
  2019-11-11  4:21 ` Jeff King
@ 2019-11-11 14:09   ` SZEDER Gábor
  2019-11-12 18:33     ` Sebastiaan Dammann
  0 siblings, 1 reply; 4+ messages in thread
From: SZEDER Gábor @ 2019-11-11 14:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Sebastiaan Dammann, git

On Sun, Nov 10, 2019 at 11:21:49PM -0500, Jeff King wrote:
> On Sat, Nov 09, 2019 at 12:31:31PM +0100, Sebastiaan Dammann wrote:

> All of that's obviously a workaround, though. The real issue is the
> stack exhaustion.
> 
> > I hope to hear your view on this. Is this an (confirmed) issue with
> > git? Are there beside the workaround I mentioned, any other
> > workarounds?
> 
> Yes, this is well known. It's covered in the test suite (unfortunately
> still failing, of course) since 31625b34c0 (t6120: test describe and
> name-rev with deep repos, 2017-09-07).
> 
> There was a proposed fix recently in:
> 
>   https://public-inbox.org/git/20190919214712.7348-1-szeder.dev@gmail.com/
> 
> but it doesn't seem to have been picked up. I'm not sure what the
> current status is.

It's getting along, being polished, clarified and fine-tuned ever so
slowly.

E.g. it turned out that the performance penalty from eliminating the
recursion is basically entirely caused by using a 'commit-list' to
store interesting parents (the overhead of a malloc() on each insert
and a free() on each pop).  Switching to a LIFO 'prio-queue' doesn't
cause any measurable slowdown.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: git name-rev looks at refs/notes, refs/svn/map: stack overflow
  2019-11-11 14:09   ` SZEDER Gábor
@ 2019-11-12 18:33     ` Sebastiaan Dammann
  0 siblings, 0 replies; 4+ messages in thread
From: Sebastiaan Dammann @ 2019-11-12 18:33 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Jeff King, git

Is there anything I can do to help? I have real-life (though
proprietary) repositories to test the fix on.

On Mon, 11 Nov 2019 at 15:09, SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Sun, Nov 10, 2019 at 11:21:49PM -0500, Jeff King wrote:
> > On Sat, Nov 09, 2019 at 12:31:31PM +0100, Sebastiaan Dammann wrote:
>
> > All of that's obviously a workaround, though. The real issue is the
> > stack exhaustion.
> >
> > > I hope to hear your view on this. Is this an (confirmed) issue with
> > > git? Are there beside the workaround I mentioned, any other
> > > workarounds?
> >
> > Yes, this is well known. It's covered in the test suite (unfortunately
> > still failing, of course) since 31625b34c0 (t6120: test describe and
> > name-rev with deep repos, 2017-09-07).
> >
> > There was a proposed fix recently in:
> >
> >   https://public-inbox.org/git/20190919214712.7348-1-szeder.dev@gmail.com/
> >
> > but it doesn't seem to have been picked up. I'm not sure what the
> > current status is.
>
> It's getting along, being polished, clarified and fine-tuned ever so
> slowly.
>
> E.g. it turned out that the performance penalty from eliminating the
> recursion is basically entirely caused by using a 'commit-list' to
> store interesting parents (the overhead of a malloc() on each insert
> and a free() on each pop).  Switching to a LIFO 'prio-queue' doesn't
> cause any measurable slowdown.
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-11-12 18:34 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-09 11:31 git name-rev looks at refs/notes, refs/svn/map: stack overflow Sebastiaan Dammann
2019-11-11  4:21 ` Jeff King
2019-11-11 14:09   ` SZEDER Gábor
2019-11-12 18:33     ` Sebastiaan Dammann

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).