git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Surprising interaction of "binary" and "eol" gitattributes
@ 2015-03-05 16:38 Michael Haggerty
  2015-03-05 20:49 ` Torsten Bögershausen
  2015-03-05 22:08 ` Junio C Hamano
  0 siblings, 2 replies; 17+ messages in thread
From: Michael Haggerty @ 2015-03-05 16:38 UTC (permalink / raw)
  To: git discussion list

I would expect that the following .gitattributes file

    *       eol=crlf
    *.png   -text

would leave EOL translation turned off for PNG files. In other words, I
would expect that explicitly setting "-text" would take precedence over
the fact that setting "eol" implies that a file should be considered to
be "text".

I would even more strongly expect

    *       eol=crlf
    *.png   binary

to turn off EOL translation for PNG files.

But in fact, in both of the above cases, EOL translation is turned *on*
for PNG files.

I propose that "-text" should override any setting for "eol" (which
would of course fix both problems, since "binary" is equivalent to
"-diff -merge -text"). What do people think?

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-05 16:38 Surprising interaction of "binary" and "eol" gitattributes Michael Haggerty
@ 2015-03-05 20:49 ` Torsten Bögershausen
  2015-03-05 22:08 ` Junio C Hamano
  1 sibling, 0 replies; 17+ messages in thread
From: Torsten Bögershausen @ 2015-03-05 20:49 UTC (permalink / raw)
  To: Michael Haggerty, git discussion list

On 2015-03-05 17.38, Michael Haggerty wrote:
> I would expect that the following .gitattributes file
> 
>     *       eol=crlf
>     *.png   -text
> 
> would leave EOL translation turned off for PNG files. In other words, I
> would expect that explicitly setting "-text" would take precedence over
> the fact that setting "eol" implies that a file should be considered to
> be "text".
> 
> I would even more strongly expect
> 
>     *       eol=crlf
>     *.png   binary
> 
> to turn off EOL translation for PNG files.
> 
> But in fact, in both of the above cases, EOL translation is turned *on*
> for PNG files.
> 
> I propose that "-text" should override any setting for "eol" (which
> would of course fix both problems, since "binary" is equivalent to
> "-diff -merge -text"). What do people think?
> 
> Michael
> 

("binary" is not supported, we need "-text")
Beside that,  

>     *       eol=crlf
>     *.png   -text
should work as you describe.

Do you think you make a test case for this ?
In best case as a real patch :-)

(I know that attributes should take precedence over eol settings in the
config file, and this is not always the case)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-05 16:38 Surprising interaction of "binary" and "eol" gitattributes Michael Haggerty
  2015-03-05 20:49 ` Torsten Bögershausen
@ 2015-03-05 22:08 ` Junio C Hamano
  2015-03-06  5:59   ` Torsten Bögershausen
  1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-03-05 22:08 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git discussion list

Michael Haggerty <mhagger@alum.mit.edu> writes:

> I would expect that the following .gitattributes file
>
>     *       eol=crlf
>     *.png   -text
>
> would leave EOL translation turned off for PNG files. In other words, I
> would expect that explicitly setting "-text" would take precedence over
> the fact that setting "eol" implies that a file should be considered to
> be "text".
>
> I would even more strongly expect
>
>     *       eol=crlf
>     *.png   binary
>
> to turn off EOL translation for PNG files.
>
> But in fact, in both of the above cases, EOL translation is turned *on*
> for PNG files.
>
> I propose that "-text" should override any setting for "eol" (which
> would of course fix both problems, since "binary" is equivalent to
> "-diff -merge -text"). What do people think?

Hmm, is there really something that needs a new proposal and
opinions?

The way I read the flow in convert.c is:

    convert_to_git() uses input_crlf_action() to figure out what
    crlf_to_git() conversion is necessary.

    input_crlf_action() looks at text_attr and says CRLF_BINARY when
    it is CRLF_BINARY without looking at eol_attr at all.

    text_attr above is ca.crlf_action in convert_to_git().

    The whole ca.* comes from convert_attrs() inspecting attributes
    on the incoming path.

    convert_attrs() inspects "eol" and "text" attributes, among
    others, and sets crlf_action by calling git_path_check_crlf().

    git_path_check_crlf() looks at the 'text' attribute; if it is
    set to false, it returns CRLF_BINARY.

    crlf_to_git() when given crlf_action==CRLF_BINARY is a no-op.

So, with the above attributes where anything is eol=crlf by default
and in addition *.png is binary (which contains -text), we shouldn't
get any crlf munging.  Am I reading/following the code incorrectly?

Puzzled....

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-05 22:08 ` Junio C Hamano
@ 2015-03-06  5:59   ` Torsten Bögershausen
  2015-03-06 17:51     ` Michael Haggerty
  0 siblings, 1 reply; 17+ messages in thread
From: Torsten Bögershausen @ 2015-03-06  5:59 UTC (permalink / raw)
  To: Junio C Hamano, Michael Haggerty; +Cc: git discussion list

On 03/05/2015 11:08 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> I would expect that the following .gitattributes file
>>
>>      *       eol=crlf
>>      *.png   -text
>>
>> would leave EOL translation turned off for PNG files. In other words, I
>> would expect that explicitly setting "-text" would take precedence over
>> the fact that setting "eol" implies that a file should be considered to
>> be "text".
>>
>> I would even more strongly expect
>>
>>      *       eol=crlf
>>      *.png   binary
>>
>> to turn off EOL translation for PNG files.
>>
>> But in fact, in both of the above cases, EOL translation is turned *on*
>> for PNG files.
>>
>> I propose that "-text" should override any setting for "eol" (which
>> would of course fix both problems, since "binary" is equivalent to
>> "-diff -merge -text"). What do people think?
> Hmm, is there really something that needs a new proposal and
> opinions?
>
> The way I read the flow in convert.c is:
>
>      convert_to_git() uses input_crlf_action() to figure out what
>      crlf_to_git() conversion is necessary.
>
>      input_crlf_action() looks at text_attr and says CRLF_BINARY when
>      it is CRLF_BINARY without looking at eol_attr at all.
>
>      text_attr above is ca.crlf_action in convert_to_git().
>
>      The whole ca.* comes from convert_attrs() inspecting attributes
>      on the incoming path.
>
>      convert_attrs() inspects "eol" and "text" attributes, among
>      others, and sets crlf_action by calling git_path_check_crlf().
>
>      git_path_check_crlf() looks at the 'text' attribute; if it is
>      set to false, it returns CRLF_BINARY.
>
>      crlf_to_git() when given crlf_action==CRLF_BINARY is a no-op.
>
> So, with the above attributes where anything is eol=crlf by default
> and in addition *.png is binary (which contains -text), we shouldn't
> get any crlf munging.  Am I reading/following the code incorrectly?
>
> Puzzled....
> --

I need to admit that I can't reproduce it here,
the following should trigger it, but all test cases pass


diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh

index 452320d..22f031d 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -37,7 +37,8 @@ create_gitattributes () {
                 echo "*.txt text" >.gitattributes
                 ;;
                 -text)
-               echo "*.txt -text" >.gitattributes
+               echo "* eol=crlf" >.gitattributes
+               echo "*.txt -text" >>.gitattributes
                 ;;
                 crlf)
                 echo "*.txt eol=crlf" >.gitattributes

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-06  5:59   ` Torsten Bögershausen
@ 2015-03-06 17:51     ` Michael Haggerty
  2015-03-06 21:30       ` Torsten Bögershausen
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Haggerty @ 2015-03-06 17:51 UTC (permalink / raw)
  To: Torsten Bögershausen, Junio C Hamano; +Cc: git discussion list

On 03/06/2015 06:59 AM, Torsten Bögershausen wrote:
> On 03/05/2015 11:08 PM, Junio C Hamano wrote:
>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>
>>> I would expect that the following .gitattributes file
>>>
>>>      *       eol=crlf
>>>      *.png   -text
>>>
>>> would leave EOL translation turned off for PNG files. In other words, I
>>> would expect that explicitly setting "-text" would take precedence over
>>> the fact that setting "eol" implies that a file should be considered to
>>> be "text".
>>>
>>> I would even more strongly expect
>>>
>>>      *       eol=crlf
>>>      *.png   binary
>>>
>>> to turn off EOL translation for PNG files.
>>>
>>> But in fact, in both of the above cases, EOL translation is turned *on*
>>> for PNG files.
>>>
>>> I propose that "-text" should override any setting for "eol" (which
>>> would of course fix both problems, since "binary" is equivalent to
>>> "-diff -merge -text"). What do people think?
>> Hmm, is there really something that needs a new proposal and
>> opinions?
>>
>> The way I read the flow in convert.c is:
>>
>>      convert_to_git() uses input_crlf_action() to figure out what
>>      crlf_to_git() conversion is necessary.
>>
>>      input_crlf_action() looks at text_attr and says CRLF_BINARY when
>>      it is CRLF_BINARY without looking at eol_attr at all.
>>
>>      text_attr above is ca.crlf_action in convert_to_git().
>>
>>      The whole ca.* comes from convert_attrs() inspecting attributes
>>      on the incoming path.
>>
>>      convert_attrs() inspects "eol" and "text" attributes, among
>>      others, and sets crlf_action by calling git_path_check_crlf().
>>
>>      git_path_check_crlf() looks at the 'text' attribute; if it is
>>      set to false, it returns CRLF_BINARY.
>>
>>      crlf_to_git() when given crlf_action==CRLF_BINARY is a no-op.
>>
>> So, with the above attributes where anything is eol=crlf by default
>> and in addition *.png is binary (which contains -text), we shouldn't
>> get any crlf munging.  Am I reading/following the code incorrectly?
>>
>> Puzzled....
>> -- 
> 
> I need to admit that I can't reproduce it here,
> the following should trigger it, but all test cases pass
> 
> 
> diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
> 
> index 452320d..22f031d 100755
> --- a/t/t0027-auto-crlf.sh
> +++ b/t/t0027-auto-crlf.sh
> @@ -37,7 +37,8 @@ create_gitattributes () {
>                 echo "*.txt text" >.gitattributes
>                 ;;
>                 -text)
> -               echo "*.txt -text" >.gitattributes
> +               echo "* eol=crlf" >.gitattributes
> +               echo "*.txt -text" >>.gitattributes
>                 ;;
>                 crlf)
>                 echo "*.txt eol=crlf" >.gitattributes
> 

Oops, I misunderstood an internal bug report. In seems that it is the
following scenario that is incorrect:

    *.png text=auto eol=crlf

when applied to a binary file.

I'm currently studying documentation and writing tests to figure out the
exact current behavior. But honestly, this end-of-line conversion stuff
is bewildering, so it might take a while.

Sorry for the half-cocked bug report. I'll report back when I know more.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-06 17:51     ` Michael Haggerty
@ 2015-03-06 21:30       ` Torsten Bögershausen
  2015-03-10 19:25         ` Michael Haggerty
  0 siblings, 1 reply; 17+ messages in thread
From: Torsten Bögershausen @ 2015-03-06 21:30 UTC (permalink / raw)
  To: Michael Haggerty, Torsten Bögershausen, Junio C Hamano
  Cc: git discussion list


> Oops, I misunderstood an internal bug report. In seems that it is the
> following scenario that is incorrect:
>
>     *.png text=auto eol=crlf
Hm, I don't know if we support this combination at all.

The current logic supports auto-detection of text/binary,
* text=auto
(the files will get the line ending from core.eol or core.autocrlf)

or the  the setting of a line ending:
*.sh eol=lf
*.bat eol=crlf


Is there a special use-case, which needs the combination of both ?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-06 21:30       ` Torsten Bögershausen
@ 2015-03-10 19:25         ` Michael Haggerty
  2015-03-10 20:01           ` Junio C Hamano
  2015-03-10 20:26           ` Torsten Bögershausen
  0 siblings, 2 replies; 17+ messages in thread
From: Michael Haggerty @ 2015-03-10 19:25 UTC (permalink / raw)
  To: Torsten Bögershausen, Junio C Hamano; +Cc: git discussion list

On 03/06/2015 10:30 PM, Torsten Bögershausen wrote:
> 
>> Oops, I misunderstood an internal bug report. In seems that it is the
>> following scenario that is incorrect:
>>
>>     *.png text=auto eol=crlf
> Hm, I don't know if we support this combination at all.

The user can specify this combination in a .gitattributes file and we
have to react to it *some way*. Theoretically we could document that
this combination is undefined and/or emit an error if we see this
combination, but we don't do so.

> The current logic supports auto-detection of text/binary,
> * text=auto
> (the files will get the line ending from core.eol or core.autocrlf)
> 
> or the  the setting of a line ending:
> *.sh eol=lf
> *.bat eol=crlf
> 
> 
> Is there a special use-case, which needs the combination of both ?

I'm still trying to infer the spirit of the current behavior, so caveats
here.

This comes from a real-life scenario where a user, somewhere early in
.gitattributes, had

    * text
    * eol=crlf

and then later (this could be in a subdirectory) tried to carve out
exceptions to this rule by using

    *.png binary
    * text=auto

Intuitively it *feels* like either of the later lines should suppress
EOL translation in PNG files (assuming the PNG file has a NUL byte in
the first 8k, which this one did).

It seems to me that setting "text=auto" should mean that Git uses its
heuristic to guess whether a particular file is text or not, and then
treats the file as if it had "text" or "-text" set. If the latter, then
EOL translation should be suppressed.

It also seems to me that "binary" should imply "-eol".

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-10 19:25         ` Michael Haggerty
@ 2015-03-10 20:01           ` Junio C Hamano
  2015-03-10 22:16             ` Michael Haggerty
  2015-03-10 20:26           ` Torsten Bögershausen
  1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-03-10 20:01 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Torsten Bögershausen, git discussion list

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 03/06/2015 10:30 PM, Torsten Bögershausen wrote:
>> 
>>> Oops, I misunderstood an internal bug report. In seems that it is the
>>> following scenario that is incorrect:
>>>
>>>     *.png text=auto eol=crlf
>> Hm, I don't know if we support this combination at all.
>
> The user can specify this combination in a .gitattributes file and we
> have to react to it *some way*.
>
> Theoretically we could document that
> this combination is undefined and/or emit an error if we see this
> combination, but we don't do so.
>
>> The current logic supports auto-detection of text/binary,
>> * text=auto
>> (the files will get the line ending from core.eol or core.autocrlf)
>> 
>> or the  the setting of a line ending:
>> *.sh eol=lf
>> *.bat eol=crlf
>> 
>> 
>> Is there a special use-case, which needs the combination of both ?
>
> I'm still trying to infer the spirit of the current behavior, so caveats
> here.
>
> This comes from a real-life scenario where a user, somewhere early in
> .gitattributes, had
>
>     * text
>     * eol=crlf
>
> and then later (this could be in a subdirectory) tried to carve out
> exceptions to this rule by using
>
>     *.png binary
>     * text=auto
>
> Intuitively it *feels* like either of the later lines should suppress
> EOL translation in PNG files (assuming the PNG file has a NUL byte in
> the first 8k, which this one did).

The way I read the description of "eol" was that that was a more
specific way to do what used to be done with "text" (meaning "OK,
that may be a text file, but how exactly is the end-of-line
handled?"), so I would say if the above behaved the same way as

    *.png eol=crlf text

that would be the least surprising to me.  But perhaps that is only
because I know which one came first and which one came later for
what purpose.

But ...

> It seems to me that setting "text=auto" should mean that Git uses its
> heuristic to guess whether a particular file is text or not, and then
> treats the file as if it had "text" or "-text" set. If the latter, then
> EOL translation should be suppressed.

... I think this makes even more sense. I do not think the code is
set up to do so.  To be honest, eol_attr thing introduced in
fd6cce9e (Add per-repository eol normalization, 2010-05-19) always
confuses me whenever I follow this codepath.

> It also seems to me that "binary" should imply "-eol".

I thought that "eol" attribute is not even looked at when you say
"binary"; that is what I recall finding out when I dug into this
earlier in the thread.

http://thread.gmane.org/gmane.comp.version-control.git/264850/focus=264872

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-10 19:25         ` Michael Haggerty
  2015-03-10 20:01           ` Junio C Hamano
@ 2015-03-10 20:26           ` Torsten Bögershausen
  2015-03-10 22:24             ` Michael Haggerty
  1 sibling, 1 reply; 17+ messages in thread
From: Torsten Bögershausen @ 2015-03-10 20:26 UTC (permalink / raw)
  To: Michael Haggerty, Torsten Bögershausen, Junio C Hamano
  Cc: git discussion list

On 10.03.15 20:25, Michael Haggerty wrote:
> On 03/06/2015 10:30 PM, Torsten Bögershausen wrote:
>>
>>> Oops, I misunderstood an internal bug report. In seems that it is the
>>> following scenario that is incorrect:
>>>
>>>     *.png text=auto eol=crlf
>> Hm, I don't know if we support this combination at all.
> 
> The user can specify this combination in a .gitattributes file and we
> have to react to it *some way*. Theoretically we could document that
> this combination is undefined and/or emit an error if we see this
> combination, but we don't do so.
> 
>> The current logic supports auto-detection of text/binary,
>> * text=auto
>> (the files will get the line ending from core.eol or core.autocrlf)
>>
>> or the  the setting of a line ending:
>> *.sh eol=lf
>> *.bat eol=crlf
>>
>>
>> Is there a special use-case, which needs the combination of both ?
> 
> I'm still trying to infer the spirit of the current behavior, so caveats
> here.
> 
> This comes from a real-life scenario where a user, somewhere early in
> .gitattributes, had
> 
>     * text
>     * eol=crlf
> 
> and then later (this could be in a subdirectory) tried to carve out
> exceptions to this rule by using
> 
>     *.png binary
>     * text=auto
Hm,
I can see 2 problems here:
the "binary" attribute does not exist at all.

I sometimes which we had it, but we don't.
There is "text" and "-text", and that is it.

The other problem is the order of the lines, which is fully
intuitive for each person who has ever written a "matching parser".

The parser matches each file namr on it's own, depending on the matching:

*.png -text
* text=auto
means that all png files are binary, and ALL files are "auto".

Guess what happens to the png's ?

The second rule wins, as it is the last rule processed.

git check-attr text *
A.png: text: auto
B.txt: text: auto
---
If we reverse the order in .gitattributes, things look better:
* text=auto
*.png -text

git check-attr text *
A.png: text: unset
B.txt: text: auto


--------------------
This is not very intuitive or even surprising, at least for many people.

Unless I mis-understand the problem, it may be that the documentation may be updated ?

The general rule for writing .gitattributes 
is to specify the "wider" rules first, and then the more specific rules after that.

It could be that 
Documentation/gitattributes.txt
should mention this instead:
------------------------
*		-text
*.txt		text
*.vcproj	eol=crlf
*.sh		eol=lf
------------------------


The other thing is to promote/mention the command 
"git  check-attr text *"

at a prominent place.

> Intuitively it *feels* like either of the later lines should suppress
> EOL translation in PNG files (assuming the PNG file has a NUL byte in
> the first 8k, which this one did).
> 
> It seems to me that setting "text=auto" should mean that Git uses its
> heuristic to guess whether a particular file is text or not, and then
> treats the file as if it had "text" or "-text" set. If the latter, then
> EOL translation should be suppressed.
> 
> It also seems to me that "binary" should imply "-eol".
> 
> Michael
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-10 20:01           ` Junio C Hamano
@ 2015-03-10 22:16             ` Michael Haggerty
  2015-03-10 22:54               ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Haggerty @ 2015-03-10 22:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Torsten Bögershausen, git discussion list

On 03/10/2015 09:01 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>> [...]
>> It seems to me that setting "text=auto" should mean that Git uses its
>> heuristic to guess whether a particular file is text or not, and then
>> treats the file as if it had "text" or "-text" set. If the latter, then
>> EOL translation should be suppressed.
> 
> ... I think this makes even more sense. I do not think the code is
> set up to do so.  To be honest, eol_attr thing introduced in
> fd6cce9e (Add per-repository eol normalization, 2010-05-19) always
> confuses me whenever I follow this codepath.

Would this change be "backwards-compatible enough" that it can be made
without waiting for Git 3.0?

>> It also seems to me that "binary" should imply "-eol".
> 
> I thought that "eol" attribute is not even looked at when you say
> "binary"; that is what I recall finding out when I dug into this
> earlier in the thread.

Well, that's true, but the "eol" attribute can regain its effect if
"binary" is followed by "text" or "text=auto". So I guess the simplest
question is as follows. Suppose I have the following .gitattributes:

    a.foo eol=crlf
    a.foo binary
    a.foo text

It is obvious in this case that a.foo should be treated as a text file.
Should it be processed with "eol=crlf", or should the intervening
"binary" imply "-eol"?

I guess it would be more natural to process it with "eol=crlf". So I
withdraw my proposal that "binary" should imply "-eol", provided the
first change (that "text=auto" is treated the same as "-text" for binary
files) is implemented.

So I guess the proposed new behavior WRT these attributes is:

* "text" determines whether a file should be subject to EOL
  translation.
* "text=auto" has the same effect as "text" or "-text", depending
  on the outcome of the binary detection heuristic; in particular,
  it causes EOL translation to be suppressed for files determined
  to be binary.
* "eol" determines what EOLs should be translated to *if* the
  file is determined to be a text file.
* If "text" is unspecified but "eol" is specified, then do EOL
  translation without a heuristic check.

But I still have to work out how core.autocrlf and the "crlf" attribute
fit into all this.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-10 20:26           ` Torsten Bögershausen
@ 2015-03-10 22:24             ` Michael Haggerty
  0 siblings, 0 replies; 17+ messages in thread
From: Michael Haggerty @ 2015-03-10 22:24 UTC (permalink / raw)
  To: Torsten Bögershausen, Junio C Hamano; +Cc: git discussion list

On 03/10/2015 09:26 PM, Torsten Bögershausen wrote:
> On 10.03.15 20:25, Michael Haggerty wrote:
>> [...]
>> I'm still trying to infer the spirit of the current behavior, so caveats
>> here.
>>
>> This comes from a real-life scenario where a user, somewhere early in
>> .gitattributes, had
>>
>>     * text
>>     * eol=crlf
>>
>> and then later (this could be in a subdirectory) tried to carve out
>> exceptions to this rule by using
>>
>>     *.png binary
>>     * text=auto
> Hm,
> I can see 2 problems here:
> the "binary" attribute does not exist at all.
>
> I sometimes which we had it, but we don't.
> There is "text" and "-text", and that is it.

There is a "binary" macro that is equivalent to "-diff -merge -text". It
is documented in gitattributes(5) under "USING MACRO ATTRIBUTES" and
"DEFINING MACRO ATTRIBUTES".

> The other problem is the order of the lines, which is fully
> intuitive for each person who has ever written a "matching parser".
> 
> The parser matches each file namr on it's own, depending on the matching:
> 
> *.png -text
> * text=auto
> means that all png files are binary, and ALL files are "auto".
> 
> Guess what happens to the png's ?
> 
> The second rule wins, as it is the last rule processed.
> 
> git check-attr text *
> A.png: text: auto
> B.txt: text: auto

That much is perfectly clear. The question is, what should happen when
the contents-based heuristic, whose use was requested by "text=auto",
determines that A.png is in fact a binary file? Should it be subjected
to EOL translation anyway? I think not.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-10 22:16             ` Michael Haggerty
@ 2015-03-10 22:54               ` Junio C Hamano
  2015-03-11  5:54                 ` Torsten Bögershausen
  2015-03-11 20:30                 ` Johannes Sixt
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-03-10 22:54 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Torsten Bögershausen, git discussion list

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Well, that's true, but the "eol" attribute can regain its effect if
> "binary" is followed by "text" or "text=auto". So I guess the simplest
> question is as follows. Suppose I have the following .gitattributes:
>
>     a.foo eol=crlf
>     a.foo binary
>     a.foo text
>
> It is obvious in this case that a.foo should be treated as a text file.
> Should it be processed with "eol=crlf", or should the intervening
> "binary" imply "-eol"?

I would say former.  You find out what attributes apply to a path
and then consider the collective effect of these attributes that
survived.

So the second "No it is not text" which is overruled by the "oops,
no that is text" later should not get in the picture, I would say.

As binary is not just -text and turns other things off, those other
things will be off after these three.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-10 22:54               ` Junio C Hamano
@ 2015-03-11  5:54                 ` Torsten Bögershausen
  2015-03-11 17:59                   ` Junio C Hamano
  2015-03-11 20:30                 ` Johannes Sixt
  1 sibling, 1 reply; 17+ messages in thread
From: Torsten Bögershausen @ 2015-03-11  5:54 UTC (permalink / raw)
  To: Junio C Hamano, Michael Haggerty
  Cc: Torsten Bögershausen, git discussion list

On 03/10/2015 11:54 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> Well, that's true, but the "eol" attribute can regain its effect if
>> "binary" is followed by "text" or "text=auto". So I guess the simplest
>> question is as follows. Suppose I have the following .gitattributes:
>>
>>      a.foo eol=crlf
>>      a.foo binary
>>      a.foo text
>>
>> It is obvious in this case that a.foo should be treated as a text file.
>> Should it be processed with "eol=crlf", or should the intervening
>> "binary" imply "-eol"?
> I would say former.  You find out what attributes apply to a path
> and then consider the collective effect of these attributes that
> survived.
>
> So the second "No it is not text" which is overruled by the "oops,
> no that is text" later should not get in the picture, I would say.
>
> As binary is not just -text and turns other things off, those other
> things will be off after these three.
Not sure if I follow:
Whenever you specify -text, the eol doesn't matter, or what do I miss ?

Specifying "*.txt eol=crlf" includes "*.txt text",
but with the following it should be possible to turn on "text=auto"

cat .gitattributes
* eol=crlf
*.sh eol=lf
* text=auto
*.png -text

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-11  5:54                 ` Torsten Bögershausen
@ 2015-03-11 17:59                   ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-03-11 17:59 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Michael Haggerty, git discussion list

Torsten Bögershausen <tboegi@web.de> writes:

> On 03/10/2015 11:54 PM, Junio C Hamano wrote:
>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>
>>> Well, that's true, but the "eol" attribute can regain its effect if
>>> "binary" is followed by "text" or "text=auto". So I guess the simplest
>>> question is as follows. Suppose I have the following .gitattributes:
>>>
>>>      a.foo eol=crlf
>>>      a.foo binary
>>>      a.foo text
>>>
>> As binary is not just -text and turns other things off, those other
>> things will be off after these three.
> Not sure if I follow:
> Whenever you specify -text, the eol doesn't matter, or what do I miss ?

Something unrelated to the main theme of this topic ;-).

I was just saying that saying "a.foo text" is not a way to take your
earlier mistake of saying "a.foo binary" back, if that "binary" was
placed on the path by mistake or an over-eager globbing.  The 'text'
attribute will be reset, but -diff you placed on the path by saying
"binary" is still there after these three attribute lines and running
"git diff a.foo" would sill show the effect from it.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-10 22:54               ` Junio C Hamano
  2015-03-11  5:54                 ` Torsten Bögershausen
@ 2015-03-11 20:30                 ` Johannes Sixt
  2015-03-11 21:31                   ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Johannes Sixt @ 2015-03-11 20:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael Haggerty, Torsten Bögershausen, git discussion list

Am 10.03.2015 um 23:54 schrieb Junio C Hamano:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> Well, that's true, but the "eol" attribute can regain its effect if
>> "binary" is followed by "text" or "text=auto". So I guess the simplest
>> question is as follows. Suppose I have the following .gitattributes:
>>
>>      a.foo eol=crlf
>>      a.foo binary
>>      a.foo text
>>
>> It is obvious in this case that a.foo should be treated as a text file.
>> Should it be processed with "eol=crlf", or should the intervening
>> "binary" imply "-eol"?
>
> I would say former.  You find out what attributes apply to a path
> and then consider the collective effect of these attributes that
> survived.
>
> So the second "No it is not text" which is overruled by the "oops,
> no that is text" later should not get in the picture, I would say.
>
> As binary is not just -text and turns other things off, those other
> things will be off after these three.

Is that how attribute lookup works? I.e., given a path, all attributes 
are collected?

Isn't it more like: Here we are interested in the "eol" attribute of 
this file named "a.foo". And the lookup would find the first line that 
says "eol=crlf". Elsewhere, we are interested in the "binary" attribute 
of the file named "a.foo", and lookup would find the second line that 
sets the "binary" attribute. And again elsewhere, we ask for the "text" 
attribute, and we find the last line that sets the "text" property.

Am I totally off track?

-- Hannes

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-11 20:30                 ` Johannes Sixt
@ 2015-03-11 21:31                   ` Junio C Hamano
  2015-03-11 21:43                     ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-03-11 21:31 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Michael Haggerty, Torsten Bögershausen, git discussion list

Johannes Sixt <j6t@kdbg.org> writes:

>> I would say former.  You find out what attributes apply to a path
>> and then consider the collective effect of these attributes that
>> survived.
>>
>> So the second "No it is not text" which is overruled by the "oops,
>> no that is text" later should not get in the picture, I would say.
>>
>> As binary is not just -text and turns other things off, those other
>> things will be off after these three.
>
> Is that how attribute lookup works? I.e., given a path, all attributes
> are collected?
>
> Isn't it more like: Here we are interested in the "eol" attribute of
> this file named "a.foo". And the lookup would find the first line that
> says "eol=crlf". Elsewhere, we are interested in the "binary"
> attribute of the file named "a.foo", and lookup would find the second
> line that sets the "binary" attribute. And again elsewhere, we ask for
> the "text" attribute, and we find the last line that sets the "text"
> property.
>
> Am I totally off track?

In the codepath in question, we say "we are interested in text and
eol attributes", grab the values (set/unset/set-to-value/unspecified)
for these two for the path we are interested in from all the
applicable gitattributes file and then act on the result.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Surprising interaction of "binary" and "eol" gitattributes
  2015-03-11 21:31                   ` Junio C Hamano
@ 2015-03-11 21:43                     ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-03-11 21:43 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Michael Haggerty, Torsten Bögershausen, git discussion list

Junio C Hamano <gitster@pobox.com> writes:

> Johannes Sixt <j6t@kdbg.org> writes:
>
>> Isn't it more like: Here we are interested in the "eol" attribute of
>> this file named "a.foo". And the lookup would find the first line that
>> says "eol=crlf". Elsewhere, we are interested in the "binary"
>> attribute of the file named "a.foo", and lookup would find the second
>> line that sets the "binary" attribute. And again elsewhere, we ask for
>> the "text" attribute, and we find the last line that sets the "text"
>> property.
>>
>> Am I totally off track?
>
> In the codepath in question, we say "we are interested in text and
> eol attributes", grab the values (set/unset/set-to-value/unspecified)
> for these two for the path we are interested in from all the
> applicable gitattributes file and then act on the result.

Technically the above is only a half-answer; I know you are
intelligent enough to derive the other untold half from it, but for
the benefit of those reading from sidelines....

The calling convention to (1) prepare the list of attributes you are
interested in upfront and then (2) to ask the set of attributes that
apply among that set for a path is designed to force programmers
avoid doing attribute lookups (which is rather expensive) at random
places in their code.  But in the end, this let us implement the
functionality as you alluded to in the quote paragraph.

If a proposed change wants to make 'text=auto' stronger in the sense
that we decide if we pay attention to 'eol' only after checking if
the contents look non-text, it can do so, just like setting '-text'
to the current code makes settings of 'eol' irrelevant.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2015-03-11 21:43 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-05 16:38 Surprising interaction of "binary" and "eol" gitattributes Michael Haggerty
2015-03-05 20:49 ` Torsten Bögershausen
2015-03-05 22:08 ` Junio C Hamano
2015-03-06  5:59   ` Torsten Bögershausen
2015-03-06 17:51     ` Michael Haggerty
2015-03-06 21:30       ` Torsten Bögershausen
2015-03-10 19:25         ` Michael Haggerty
2015-03-10 20:01           ` Junio C Hamano
2015-03-10 22:16             ` Michael Haggerty
2015-03-10 22:54               ` Junio C Hamano
2015-03-11  5:54                 ` Torsten Bögershausen
2015-03-11 17:59                   ` Junio C Hamano
2015-03-11 20:30                 ` Johannes Sixt
2015-03-11 21:31                   ` Junio C Hamano
2015-03-11 21:43                     ` Junio C Hamano
2015-03-10 20:26           ` Torsten Bögershausen
2015-03-10 22:24             ` Michael Haggerty

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).