git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Peter Oberndorfer <kumbayo84@arcor.de>, git@vger.kernel.org
Subject: Re: crash on git diff-tree -Ganything <tree> for new files with textconv filter
Date: Tue, 30 Oct 2012 09:12:22 -0400	[thread overview]
Message-ID: <20121030131221.GA19571@sigill.intra.peff.net> (raw)
In-Reply-To: <da24b6ea-ac9b-46dd-b591-25fd4e8e6504@email.android.com>

On Tue, Oct 30, 2012 at 09:46:01PM +0900, Junio C Hamano wrote:

> (1) sounds attractive for more than one reason. In addition to
> avoidance of this issue, it would bring bug-to-bug compatibility
> across platforms.

Yeah. I mentioned breaking the build for people who would now need to
turn on NO_REGEX. But the only reason to do that is to let people on
glibc systems use the system version of the tools. A much saner approach
would be to just always build with our compat regex, and turn NO_REGEX
into a no-op. We already do the same thing for kwset.

> (4), if we can run grep on streaming data (tweak interface we have for
> checking out a large blob to the working tree), would let us work on
> dataset larger than fit in core. Even though it would be much more
> work, it might turn out to be a better option in the longer run.

Agreed, that would be nice. It's potentially a lot of work, but we could
probably get by with a special streaming version of diff_populate_filespec.

The tricky thing is that we have to run the regex matcher progressively
as we stream data in (since your match might fall in the middle of a
read boundary). Which is certainly going to require switching off of
regular regexec. I don't think glibc regex will handle it either,
though. It looks like pcre can report a partial match at the end of the
string, and you can either continue with the next chunk (if using
pcre_dfa) or append and re-start the pattern match (for regular
pcre_exec).

Which means we'd probably have to make streaming matches an optional
feature, and still do (1) first to fix the correctness issue.

-Peff

  reply	other threads:[~2012-10-30 13:12 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-27 18:37 crash on git diff-tree -Ganything <tree> for new files with textconv filter Peter Oberndorfer
2012-10-28 12:01 ` Jeff King
2012-10-28 12:45   ` [PATCH 0/2] textconv support for "log -S" Jeff King
2012-10-28 12:46     ` [PATCH 1/2] pickaxe: hoist empty needle check Jeff King
2012-10-28 12:47     ` [PATCH 2/2] pickaxe: use textconv for -S counting Jeff King
2012-11-13 23:13       ` Junio C Hamano
2012-11-15  1:21         ` Jeff King
2012-11-20  0:31           ` Junio C Hamano
2012-11-20  0:48             ` Junio C Hamano
2012-11-21 20:27               ` Jeff King
2012-10-28 19:56   ` crash on git diff-tree -Ganything <tree> for new files with textconv filter Peter Oberndorfer
2012-10-29  6:05     ` Jeff King
2012-10-29  6:18       ` Jeff King
2012-10-29 20:19       ` Peter Oberndorfer
2012-10-29 22:35         ` Jeff King
2012-10-29 22:47           ` Jeff King
2012-10-30 12:17             ` Jeff King
2012-10-30 12:46               ` Junio C Hamano
2012-10-30 13:12                 ` Jeff King [this message]
2012-11-01 19:19               ` Ramsay Jones
2012-11-07 21:10           ` Peter Oberndorfer
2012-11-07 21:13             ` Jeff King
2013-06-03 17:25               ` Peter Oberndorfer
2013-06-03 22:17                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121030131221.GA19571@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=kumbayo84@arcor.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).