git@vger.kernel.org mailing list mirror (one of many)
From: Ulrich Spoerlein <uqs@FreeBSD.org>
To: git@vger.kernel.org, Jeff King <peff@peff.net>
Cc: Ed Maste <emaste@freebsd.org>, Junio C Hamano <gitster@pobox.com>
Subject: Re: git fast-import crashing on big imports
Date: Wed, 18 Jan 2017 15:01:17 +0100
Message-ID: <20170118140117.GK4426@acme.spoerlein.net>
In-Reply-To: <20170112082138.GJ4426@acme.spoerlein.net>

Yo Jeff, your commit 8261e1f139db3f8aa6f9fd7d98c876cbeb0f927c from Aug
22nd, which changes delta_base_cache to use hashmap.h, is the culprit for
git fast-import crashing on large imports.

Please read below; there you can find a 55G SVN dump that should show the
problem within a couple of minutes to an hour. Please also see the
similar issues from 2009 and 2011. This seems to be a rather fragile
part of the code; could you add unit tests to make sure this
regression is not reintroduced once you fix it? Thanks!

I'm happy to test any patches that you can provide.

Cheers,
Uli

On Do., 2017-01-12 at 09:21:38 +0100, Ulrich Spörlein wrote:
> Hey,
> 
> the FreeBSD svn2git conversion is crashing somewhat
> non-deterministically during its long conversion process. From memory,
> this was not as bad as it is with more recent versions of git (but I
> can't be sure, really).
> 
> I have a dump file that you can grab at
> http://scan.spoerlein.net/pub/freebsd-base.dump.xz (19G, 55G uncompressed)
> that shows this problem after a couple of minutes of runtime. The caveat is
> that for another member of the team on a different machine the crashes are on
> different revisions.
> 
> Googling around I found two previous threads that were discussing
> problems just like this (memory corruption, bad caching, etc.):
> 
> https://www.spinics.net/lists/git/msg93598.html  from 2009
> and
> http://git.661346.n2.nabble.com/long-fast-import-errors-out-quot-failed-to-apply-delta-quot-td6557884.html
> from 2011
> 
> % git fast-import --stats < ../freebsd-base.dump
> ...
> progress SVN r49318 branch master = :49869
> progress SVN r49319 branch stable/3 = :49870
> progress SVN r49320 branch master = :49871
> error: failed to apply delta
> error: bad offset for revindex
> error: bad offset for revindex
> error: bad offset for revindex
> error: bad offset for revindex
> error: bad offset for revindex
> fatal: Can't load tree b35ae4e9c2a41677e84a3f14bed09f584c3ff25e
> fast-import: dumping crash report to fast_import_crash_29613
> 
> 
> fast-import crash report:
>     fast-import process: 29613
>     parent process     : 29612
>     at 2017-01-11 19:33:37 +0000
> 
> fatal: Can't load tree b35ae4e9c2a41677e84a3f14bed09f584c3ff25e
> 
> 
> git fsck shows a somewhat incomplete pack file (I guess that's expected if the
> process dies mid-stream?)
> 
> % git fsck
> Checking object directories: 100% (256/256), done.
> error: failed to apply delta
> error: cannot unpack d1d7ee1f81c6767c5e0f75d14d400d7512a85a0f from ./objects/pack/pack-e28fcea43fc221d2ebe92857b484da58bb888237.pack at offset 122654805
> error: failed to apply delta
> error: failed to read delta base object d1d7ee1f81c6767c5e0f75d14d400d7512a85a0f at offset 122654805 from ./objects/pack/pack-e28fcea43fc221d2ebe92857b484da58bb888237.pack
> error: cannot unpack 8523bde63ef34bef725347994fdaec996d756510 from ./objects/pack/pack-e28fcea43fc221d2ebe92857b484da58bb888237.pack at offset 122671596
> error: failed to apply delta
> error: failed to read delta base object d1d7ee1f81c6767c5e0f75d14d400d7512a85a0f at offset 122654805 from ./objects/pack/pack-e28fcea43fc221d2ebe92857b484da58bb888237.pack
> ...
> 
> 
> Any comments on whether the original problems from 2009 and 2011 were ever
> fixed and committed?
> 
> Some more facts:
> - git version 2.11.0
> - I don't recall these sorts of crashes with a git from 2-3 years ago
> - adding more checkpoints helps but does not fix the problem; it merely
>   shifts the crashes around to different revisions
> - incremental runs of the conversion *will* complete most of the time, but
>   depending on how often checkpoints are used, I've seen it croak on specific
>   commits and be unable to progress further :(
> 
> Thanks for any pointers or things to try!
> Cheers
> Uli
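
For readers unfamiliar with the checkpoints mentioned above: in the
fast-import stream format, a `checkpoint` command makes fast-import close
the current packfile and save out refs and marks before it continues. The
following is a minimal sketch of a stream that contains such commands. It
is not the svn2git converter's output; the function names, committer
identity, timestamps, and file contents are all made up for illustration.

```python
def data_block(payload: bytes) -> bytes:
    """Encode a fast-import 'data' command with an exact byte count."""
    return b"data %d\n%s\n" % (len(payload), payload)


def build_stream(n_commits: int, checkpoint_every: int) -> bytes:
    """Build a toy fast-import stream with periodic 'checkpoint' commands.

    Every commit rewrites a single file on refs/heads/master. The branch
    tip is used as the parent implicitly, so no 'from' line is emitted.
    """
    out = []
    for i in range(1, n_commits + 1):
        out.append(b"commit refs/heads/master\n")
        out.append(b"mark :%d\n" % i)
        # Hypothetical committer identity and timestamp (assumption).
        out.append(
            b"committer Example <example@example.invalid> %d +0000\n"
            % (1000000000 + i)
        )
        out.append(data_block(b"toy commit %d" % i))
        out.append(b"M 100644 inline file.txt\n")
        out.append(data_block(b"revision %d\n" % i))
        if i % checkpoint_every == 0:
            # Ask fast-import to finish the current pack and save refs.
            out.append(b"checkpoint\n")
    return b"".join(out)
```

Writing `build_stream(10, 3)` to a file and feeding that file to
`git fast-import --stats` in a scratch repository should import ten commits
with three checkpoints in between. As the message above notes, checkpoints
only moved the crashes around; they are not a fix.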


Thread overview: 9+ messages
2017-01-12  8:21 git fast-import crashing on big imports Ulrich Spörlein
2017-01-18 14:01 ` Ulrich Spoerlein [this message]
2017-01-18 14:38   ` Jeff King
2017-01-18 20:06     ` Jeff King
     [not found]       ` <CAJ9axoSzZJXD4RKvVx+D60dw4sakMJWgNmOP-cREWA53Ae3C3w@mail.gmail.com>
2017-01-18 20:27         ` Jeff King
2017-01-18 21:51           ` Jeff King
2017-01-19 14:03             ` Ulrich Spörlein
2017-01-19 16:33               ` [PATCH] clear_delta_base_cache(): don't modify hashmap while iterating Jeff King
2017-01-19 19:16                 ` Junio C Hamano
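
The subject of the patch at the end of this thread names the bug class:
modifying a hashmap while iterating over it. The following is a toy sketch
of that hazard and one common way around it. It is not git's code and not
the actual patch; `ToyCache` and its methods are invented for illustration.

```python
class ToyCache:
    """Toy stand-in for a hashmap-backed cache; not git's delta_base_cache."""

    def __init__(self):
        self.entries = {}

    def add(self, key, value):
        self.entries[key] = value

    def clear_unsafe(self):
        # Removes entries from the dict while iterating over that same dict.
        # Python notices this and raises RuntimeError partway through.
        for key in self.entries:
            del self.entries[key]

    def clear_safe(self):
        # Snapshot the keys first, then remove. The loop walks the snapshot,
        # never the container that is being modified.
        for key in list(self.entries):
            del self.entries[key]
```

Python detects the unsafe variant and fails loudly, leaving the cache only
partly cleared. A C hashmap has no such check, so the same mistake can skip
entries or corrupt state silently.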
