git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* log -S/-G (aka pickaxe) searches binary files by default
@ 2017-03-03  0:52 Thomas Braun
  2017-03-03  1:36 ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Braun @ 2017-03-03  0:52 UTC (permalink / raw)
  To: GIT Mailing-list

Hi,

I happen to have quite large binary files in my repos.

Today I realized that a line like
git log -G a
searches also files found to be binary (or explicitly marked as binary).

Is that on purpose?
The documentation of "-G<regex>" states

"Look for differences whose patch text contains added/removed lines that
match <regex>."

which contradicts the current behaviour. At least for me text != binary.

To reproduce:
$ git init
$ echo -e "a\0b" > data.bin
$ git add data.bin
$ git commit -m "Add new data"
$ git log -p
[...]
diff --git a/data.bin b/data.bin
new file mode 100644
index 0000000..1a23e4b
Binary files /dev/null and b/data.bin differ
$ git log -G a
[...]

    Add new data

I've verified the behaviour with git version 2.12.0.windows.1 and git
version 2.12.0.189.g3bc5322 on debian.

If it is on purpose is there a config option to disable that?

Thanks for reading,
Thomas

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: log -S/-G (aka pickaxe) searches binary files by default
  2017-03-03  0:52 log -S/-G (aka pickaxe) searches binary files by default Thomas Braun
@ 2017-03-03  1:36 ` Junio C Hamano
  2017-03-03  5:17   ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2017-03-03  1:36 UTC (permalink / raw)
  To: Thomas Braun; +Cc: GIT Mailing-list

On Thu, Mar 2, 2017 at 4:52 PM, Thomas Braun
<thomas.braun@virtuell-zuhause.de> wrote:
>
> I happen to have quite large binary files in my repos.
>
> Today I realized that a line like
> git log -G a
> searches also files found to be binary (or explicitly marked as binary).
>
> Is that on purpose?

No, it's a mere oversight (as I do not think I never even thought
about special casing binary
files from day one, it is unlikely that you would find _any_ old
version of Git that behaves
differently).

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: log -S/-G (aka pickaxe) searches binary files by default
  2017-03-03  1:36 ` Junio C Hamano
@ 2017-03-03  5:17   ` Jeff King
  2017-03-03 16:07     ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2017-03-03  5:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Braun, GIT Mailing-list

On Thu, Mar 02, 2017 at 05:36:17PM -0800, Junio C Hamano wrote:

> On Thu, Mar 2, 2017 at 4:52 PM, Thomas Braun
> <thomas.braun@virtuell-zuhause.de> wrote:
> >
> > I happen to have quite large binary files in my repos.
> >
> > Today I realized that a line like
> > git log -G a
> > searches also files found to be binary (or explicitly marked as binary).
> >
> > Is that on purpose?
> 
> No, it's a mere oversight (as I do not think I never even thought
> about special casing binary
> files from day one, it is unlikely that you would find _any_ old
> version of Git that behaves
> differently).

The email focuses on "-G", and I think it is wrong to look in binary
files there, as "grep in diff" does not make sense for a binary file
that we would refuse to diff.

But the subject also mentions "-S". I always assumed it was intentional
to look in binary files there, as it is searching for a pure byte
sequence. I would not mind an option to disable that, but I think the
default should remain on.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: log -S/-G (aka pickaxe) searches binary files by default
  2017-03-03  5:17   ` Jeff King
@ 2017-03-03 16:07     ` Junio C Hamano
  2017-03-03 18:05       ` Thomas Braun
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2017-03-03 16:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Thomas Braun, GIT Mailing-list

Jeff King <peff@peff.net> writes:

> On Thu, Mar 02, 2017 at 05:36:17PM -0800, Junio C Hamano wrote:
> ...
>> > Is that on purpose?
>> 
>> No, it's a mere oversight (as I do not think I never even thought
>> about special casing binary
>> files from day one, it is unlikely that you would find _any_ old
>> version of Git that behaves
>> differently).
>
> The email focuses on "-G", and I think it is wrong to look in binary
> files there, as "grep in diff" does not make sense for a binary file
> that we would refuse to diff.

Yeah, I agree.

> But the subject also mentions "-S". I always assumed it was intentional
> to look in binary files there, as it is searching for a pure byte
> sequence. I would not mind an option to disable that, but I think the
> default should remain on.

As the feature was built to be one of the core ingredients necessary
towards the 'ideal SCM' envisioned in

  <http://public-inbox.org/git/Pine.LNX.4.58.0504150753440.7211@ppc970.osdl.org>

"-S" is about finding "a block of text". It was merely an oversight
that we didn't add explicit code to ignore binary when we introduced
the concept of "is this text?  is it worth finding things in and
diffing binary files?".

I do agree that it may be too late and/or disruptive to change its
behaviour now, as people may grew expectations different from the
original motivation and design, though.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: log -S/-G (aka pickaxe) searches binary files by default
  2017-03-03 16:07     ` Junio C Hamano
@ 2017-03-03 18:05       ` Thomas Braun
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Braun @ 2017-03-03 18:05 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: GIT Mailing-list

Am 03.03.2017 um 17:07 schrieb Junio C Hamano:
> Jeff King <peff@peff.net> writes:
> 
>> On Thu, Mar 02, 2017 at 05:36:17PM -0800, Junio C Hamano wrote:
>> ...
>>>> Is that on purpose?
>>>
>>> No, it's a mere oversight (as I do not think I never even thought
>>> about special casing binary
>>> files from day one, it is unlikely that you would find _any_ old
>>> version of Git that behaves
>>> differently).
>>
>> The email focuses on "-G", and I think it is wrong to look in binary
>> files there, as "grep in diff" does not make sense for a binary file
>> that we would refuse to diff.
> 
> Yeah, I agree.
> 
>> But the subject also mentions "-S". I always assumed it was intentional
>> to look in binary files there, as it is searching for a pure byte
>> sequence. I would not mind an option to disable that, but I think the
>> default should remain on.
> 
> As the feature was built to be one of the core ingredients necessary
> towards the 'ideal SCM' envisioned in
> 
>   <http://public-inbox.org/git/Pine.LNX.4.58.0504150753440.7211@ppc970.osdl.org>
> 
> "-S" is about finding "a block of text". It was merely an oversight
> that we didn't add explicit code to ignore binary when we introduced
> the concept of "is this text?  is it worth finding things in and
> diffing binary files?".
> 
> I do agree that it may be too late and/or disruptive to change its
> behaviour now, as people may grew expectations different from the
> original motivation and design, though.

Thanks both for the encouraging answers.

I'll try to come up with patches in the next couple of weeks for the
following changes:
"log -G": disable looking in binaries
"log -S": add option to switch looking into binaries, defaults to true

Thomas


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-03-03 18:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-03  0:52 log -S/-G (aka pickaxe) searches binary files by default Thomas Braun
2017-03-03  1:36 ` Junio C Hamano
2017-03-03  5:17   ` Jeff King
2017-03-03 16:07     ` Junio C Hamano
2017-03-03 18:05       ` Thomas Braun

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).