git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* .gitattributes override behavior (possible bug, or documentation bug)
@ 2018-03-20  1:49 Dakota Hawkins
  2018-03-20  2:34 ` Jeff King
  0 siblings, 1 reply; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-20  1:49 UTC (permalink / raw)
  To: Git

According to the gitattributes docs:

"When more than one pattern matches the path, a later line overrides
an earlier line. This overriding is done per attribute."

I had a need to do this recently, and while the attributes are git-lfs
specific, I think the issue may be generic.

Summary: Trying to apply attributes to file extensions everywhere
except in one directory.

.gitattributes:

    *.[Pp][Nn][Gg] filter=lfs diff=lfs merge=lfs -text
    /.readme-docs/ -filter=lfs -diff=lfs -merge=lfs

Make some data:

    echo "asldkjfa;sldkjf;alsdjf" > ./.readme-docs/test.png
    git add -A

Result:

    "git check-attr -a --cached -- ./readme-docs/test.png" still shows
"filter=lfs diff=lfs merge=lfs" attributes.

Full details: https://github.com/git-lfs/git-lfs/issues/2120#issuecomment-374432354

Is this me misunderstanding something in the documentation? I would
expect "./.readme-docs/" to match "./.readme-docs/test.png" and
override the earlier "*.[Pp][Nn][Gg]" attributes.

I have found the following overrides to work in lieu of the directory match:

    /.readme-docs/* -filter=lfs -diff=lfs -merge=lfs
    /.readme-docs/**/* -filter=lfs -diff=lfs -merge=lfs

...but I don't see a justification in the documentation for this
working and the original directory filter not working.

Thanks for any clarification or help you can provide,

- Dakota

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  1:49 .gitattributes override behavior (possible bug, or documentation bug) Dakota Hawkins
@ 2018-03-20  2:34 ` Jeff King
  2018-03-20  3:10   ` Dakota Hawkins
  2018-03-20  3:33   ` Junio C Hamano
  0 siblings, 2 replies; 28+ messages in thread
From: Jeff King @ 2018-03-20  2:34 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Git

On Mon, Mar 19, 2018 at 09:49:28PM -0400, Dakota Hawkins wrote:

> Summary: Trying to apply attributes to file extensions everywhere
> except in one directory.
> 
> .gitattributes:
> 
>     *.[Pp][Nn][Gg] filter=lfs diff=lfs merge=lfs -text
>     /.readme-docs/ -filter=lfs -diff=lfs -merge=lfs
> 
> Make some data:
> 
>     echo "asldkjfa;sldkjf;alsdjf" > ./.readme-docs/test.png
>     git add -A

As you noted below, that second line does not match your path, because
attributes on a directory aren't applied recursively. And it has nothing
to do with overriding. If you remove the png line entirely, you can see
that we still do not match it. You need to use "*" to match the paths.

You may also find that "-diff=lfs" does not do quite what you want.
There is no way to say "cancel any previous attribute", which I think is
what you're trying for here. You can only override it with a new value.
So:

  /.readme-docs/* -diff

says "do not diff this". And:

  /.readme-docs/* diff

says "diff this as text, even if it looks binary".

The best you can probably do is:

  /.readme-docs/* diff=foo

Since you have no diff.foo.* config, that will behave in the default way
(including respecting the usual "is it binary" checks). So a bit hacky,
but I think it would work as "ignore prior diff".

And I think filter and merge drivers should work the same.

> Is this me misunderstanding something in the documentation? I would
> expect "./.readme-docs/" to match "./.readme-docs/test.png" and
> override the earlier "*.[Pp][Nn][Gg]" attributes.
>
> I have found the following overrides to work in lieu of the directory match:
> 
>     /.readme-docs/* -filter=lfs -diff=lfs -merge=lfs
>     /.readme-docs/**/* -filter=lfs -diff=lfs -merge=lfs
> 
> ...but I don't see a justification in the documentation for this
> working and the original directory filter not working.

I could not find anything useful in gitattributes(5). There's some old
discussion here:

  https://public-inbox.org/git/slrnkldd3g.1l4.jan@majutsushi.net/

which makes it clear that attributes aren't recursive, but it's probably
worth calling out in the documentation. In fact, I think the current
documentation is a bit misleading in that it says "patterns are matched
as in .gitignore", which is clearly not the case here.

I think just "/.readme-docs/**" should be sufficient for your case. You
could also probably write "*" inside ".readme-docs/.gitattributes",
which may be simpler (you don't need "**" there because patterns without
a slash are just matched directly against the basename).

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  2:34 ` Jeff King
@ 2018-03-20  3:10   ` Dakota Hawkins
  2018-03-20  3:17     ` Dakota Hawkins
  2018-03-20  4:04     ` Jeff King
  2018-03-20  3:33   ` Junio C Hamano
  1 sibling, 2 replies; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-20  3:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Git

Thanks for the quick reply!

On Mon, Mar 19, 2018 at 10:34 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Mar 19, 2018 at 09:49:28PM -0400, Dakota Hawkins wrote:
>
>> Summary: Trying to apply attributes to file extensions everywhere
>> except in one directory.
>>
>> .gitattributes:
>>
>>     *.[Pp][Nn][Gg] filter=lfs diff=lfs merge=lfs -text
>>     /.readme-docs/ -filter=lfs -diff=lfs -merge=lfs
>>
>> Make some data:
>>
>>     echo "asldkjfa;sldkjf;alsdjf" > ./.readme-docs/test.png
>>     git add -A
>
> As you noted below, that second line does not match your path, because
> attributes on a directory aren't applied recursively. And it has nothing
> to do with overriding. If you remove the png line entirely, you can see
> that we still do not match it. You need to use "*" to match the paths.

Ah, yes, I see that. Inconsistent with .gitignore (more below), right?

> You may also find that "-diff=lfs" does not do quite what you want.
> There is no way to say "cancel any previous attribute", which I think is
> what you're trying for here. You can only override it with a new value.
> So:
>
>   /.readme-docs/* -diff
>
> says "do not diff this". And:
>
>   /.readme-docs/* diff
>
> says "diff this as text, even if it looks binary".
>
> The best you can probably do is:
>
>   /.readme-docs/* diff=foo
>
> Since you have no diff.foo.* config, that will behave in the default way
> (including respecting the usual "is it binary" checks). So a bit hacky,
> but I think it would work as "ignore prior diff".
>
> And I think filter and merge drivers should work the same.

That's interesting... in this case I was taking my advice on how this
should work from the git-lfs folks. I have promised to share what I
find here with them, so that will help at least :)

I think that makes sense to me -- there would be no good way to tell
it what the default should have been without explicitly telling it
what to use instead.

>> Is this me misunderstanding something in the documentation? I would
>> expect "./.readme-docs/" to match "./.readme-docs/test.png" and
>> override the earlier "*.[Pp][Nn][Gg]" attributes.
>>
>> I have found the following overrides to work in lieu of the directory match:
>>
>>     /.readme-docs/* -filter=lfs -diff=lfs -merge=lfs
>>     /.readme-docs/**/* -filter=lfs -diff=lfs -merge=lfs
>>
>> ...but I don't see a justification in the documentation for this
>> working and the original directory filter not working.
>
> I could not find anything useful in gitattributes(5). There's some old
> discussion here:
>
>   https://public-inbox.org/git/slrnkldd3g.1l4.jan@majutsushi.net/

If I follow that correctly: There's some initial speculation that it
would be OK to apply the attributes recursively, which is then shot
down because it wasn't designed to be recursive (though I don't see a
different, technical reason for that), followed by finding a (this
same?) solution/workaround for the original problem. Is that about
right?

> which makes it clear that attributes aren't recursive, but it's probably
> worth calling out in the documentation. In fact, I think the current
> documentation is a bit misleading in that it says "patterns are matched
> as in .gitignore", which is clearly not the case here.

I was indeed going off of the suggestion to consult the .gitignore
pattern matching documentation.

> I think just "/.readme-docs/**" should be sufficient for your case. You
> could also probably write "*" inside ".readme-docs/.gitattributes",
> which may be simpler (you don't need "**" there because patterns without
> a slash are just matched directly against the basename).

Wouldn't that make the "*" inside ".readme-docs/.gitattributes",
technically recursive when "*" matches a directory? It's always seemed
to me that both were necessary to explicitly match things in a
directory and its subdirectories (example, IIRC: "git ls-files --
.gitattributes" vs "git ls-files -- .gitattributes
**/.gitattributes"). Maybe that example is peculiar in that its a
dotfile and can't have a wildcard before the dot?

I guess my takeaway is that it would be _good_ if the gitattributes
documentation contained the caveat about not matching directories
recursively, but _great_ if gitattributes and gitignore (and whatever
else there is) were consistent.

At any rate, thanks for the great, quick help!

-Dakota

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  3:10   ` Dakota Hawkins
@ 2018-03-20  3:17     ` Dakota Hawkins
  2018-03-20  4:12       ` Jeff King
  2018-03-20  4:04     ` Jeff King
  1 sibling, 1 reply; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-20  3:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Git

Sorry to tack on to my previous email, but I just thought of this:

If something like "-diff=lfs" won't do what I (and git-lfs) thought it
would, do you think it would be prudent/reasonable to suggest git-lfs
add a "no-lfs" filter for exactly this case? That way I could have
explicit exclusions without any "diff=foo" shenanigans.

Thanks again,

- Dakota

On Mon, Mar 19, 2018 at 11:10 PM, Dakota Hawkins
<dakota@dakotahawkins.com> wrote:
> Thanks for the quick reply!
>
> On Mon, Mar 19, 2018 at 10:34 PM, Jeff King <peff@peff.net> wrote:
>> On Mon, Mar 19, 2018 at 09:49:28PM -0400, Dakota Hawkins wrote:
>>
>>> Summary: Trying to apply attributes to file extensions everywhere
>>> except in one directory.
>>>
>>> .gitattributes:
>>>
>>>     *.[Pp][Nn][Gg] filter=lfs diff=lfs merge=lfs -text
>>>     /.readme-docs/ -filter=lfs -diff=lfs -merge=lfs
>>>
>>> Make some data:
>>>
>>>     echo "asldkjfa;sldkjf;alsdjf" > ./.readme-docs/test.png
>>>     git add -A
>>
>> As you noted below, that second line does not match your path, because
>> attributes on a directory aren't applied recursively. And it has nothing
>> to do with overriding. If you remove the png line entirely, you can see
>> that we still do not match it. You need to use "*" to match the paths.
>
> Ah, yes, I see that. Inconsistent with .gitignore (more below), right?
>
>> You may also find that "-diff=lfs" does not do quite what you want.
>> There is no way to say "cancel any previous attribute", which I think is
>> what you're trying for here. You can only override it with a new value.
>> So:
>>
>>   /.readme-docs/* -diff
>>
>> says "do not diff this". And:
>>
>>   /.readme-docs/* diff
>>
>> says "diff this as text, even if it looks binary".
>>
>> The best you can probably do is:
>>
>>   /.readme-docs/* diff=foo
>>
>> Since you have no diff.foo.* config, that will behave in the default way
>> (including respecting the usual "is it binary" checks). So a bit hacky,
>> but I think it would work as "ignore prior diff".
>>
>> And I think filter and merge drivers should work the same.
>
> That's interesting... in this case I was taking my advice on how this
> should work from the git-lfs folks. I have promised to share what I
> find here with them, so that will help at least :)
>
> I think that makes sense to me -- there would be no good way to tell
> it what the default should have been without explicitly telling it
> what to use instead.
>
>>> Is this me misunderstanding something in the documentation? I would
>>> expect "./.readme-docs/" to match "./.readme-docs/test.png" and
>>> override the earlier "*.[Pp][Nn][Gg]" attributes.
>>>
>>> I have found the following overrides to work in lieu of the directory match:
>>>
>>>     /.readme-docs/* -filter=lfs -diff=lfs -merge=lfs
>>>     /.readme-docs/**/* -filter=lfs -diff=lfs -merge=lfs
>>>
>>> ...but I don't see a justification in the documentation for this
>>> working and the original directory filter not working.
>>
>> I could not find anything useful in gitattributes(5). There's some old
>> discussion here:
>>
>>   https://public-inbox.org/git/slrnkldd3g.1l4.jan@majutsushi.net/
>
> If I follow that correctly: There's some initial speculation that it
> would be OK to apply the attributes recursively, which is then shot
> down because it wasn't designed to be recursive (though I don't see a
> different, technical reason for that), followed by finding a (this
> same?) solution/workaround for the original problem. Is that about
> right?
>
>> which makes it clear that attributes aren't recursive, but it's probably
>> worth calling out in the documentation. In fact, I think the current
>> documentation is a bit misleading in that it says "patterns are matched
>> as in .gitignore", which is clearly not the case here.
>
> I was indeed going off of the suggestion to consult the .gitignore
> pattern matching documentation.
>
>> I think just "/.readme-docs/**" should be sufficient for your case. You
>> could also probably write "*" inside ".readme-docs/.gitattributes",
>> which may be simpler (you don't need "**" there because patterns without
>> a slash are just matched directly against the basename).
>
> Wouldn't that make the "*" inside ".readme-docs/.gitattributes",
> technically recursive when "*" matches a directory? It's always seemed
> to me that both were necessary to explicitly match things in a
> directory and its subdirectories (example, IIRC: "git ls-files --
> .gitattributes" vs "git ls-files -- .gitattributes
> **/.gitattributes"). Maybe that example is peculiar in that its a
> dotfile and can't have a wildcard before the dot?
>
> I guess my takeaway is that it would be _good_ if the gitattributes
> documentation contained the caveat about not matching directories
> recursively, but _great_ if gitattributes and gitignore (and whatever
> else there is) were consistent.
>
> At any rate, thanks for the great, quick help!
>
> -Dakota

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  2:34 ` Jeff King
  2018-03-20  3:10   ` Dakota Hawkins
@ 2018-03-20  3:33   ` Junio C Hamano
  2018-03-20  3:40     ` Dakota Hawkins
  2018-03-20  3:45     ` Jeff King
  1 sibling, 2 replies; 28+ messages in thread
From: Junio C Hamano @ 2018-03-20  3:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Dakota Hawkins, Git

Jeff King <peff@peff.net> writes:

> The best you can probably do is:
>
>   /.readme-docs/* diff=foo
>
> Since you have no diff.foo.* config, that will behave in the default way
> (including respecting the usual "is it binary" checks). So a bit hacky,
> but I think it would work as "ignore prior diff".

You can say

	/.readme-docs/* !diff

I think.  Thre is a difference between setting it to "false"
(i.e. Unset) and reverting it to unspecified state.

Isn't that what you want here?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  3:33   ` Junio C Hamano
@ 2018-03-20  3:40     ` Dakota Hawkins
  2018-03-20  3:45     ` Jeff King
  1 sibling, 0 replies; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-20  3:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git

On Mon, Mar 19, 2018 at 11:33 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> The best you can probably do is:
>>
>>   /.readme-docs/* diff=foo
>>
>> Since you have no diff.foo.* config, that will behave in the default way
>> (including respecting the usual "is it binary" checks). So a bit hacky,
>> but I think it would work as "ignore prior diff".
>
> You can say
>
>         /.readme-docs/* !diff
>
> I think.  Thre is a difference between setting it to "false"
> (i.e. Unset) and reverting it to unspecified state.
>
> Isn't that what you want here?

In this case, I think so? In this context I don't necessarily know the
files in /.readme-docs/ are binary (though that's its intent).
Ideally, I just want it to do whatever it did before the match gave it
"diff=lfs". I realize that that's not possible/feasible, so I asked in
a separate reply whether it might be a good idea to ask git-lfs for a
"no-lfs" filter for exactly this situation.

The actual "lfs" section of my .gitattributes file is about 65 lines
for 65 different extensions, so I'd prefer to handle the exclusion of
a directory without having to repeat that whole block with different
options, if that makes sense :)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  3:33   ` Junio C Hamano
  2018-03-20  3:40     ` Dakota Hawkins
@ 2018-03-20  3:45     ` Jeff King
  1 sibling, 0 replies; 28+ messages in thread
From: Jeff King @ 2018-03-20  3:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dakota Hawkins, Git

On Mon, Mar 19, 2018 at 08:33:15PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > The best you can probably do is:
> >
> >   /.readme-docs/* diff=foo
> >
> > Since you have no diff.foo.* config, that will behave in the default way
> > (including respecting the usual "is it binary" checks). So a bit hacky,
> > but I think it would work as "ignore prior diff".
> 
> You can say
> 
> 	/.readme-docs/* !diff
> 
> I think.  Thre is a difference between setting it to "false"
> (i.e. Unset) and reverting it to unspecified state.
> 
> Isn't that what you want here?

Ah, yes, I totally forgot that existed. That's much better than the
hackery I proposed.

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  3:10   ` Dakota Hawkins
  2018-03-20  3:17     ` Dakota Hawkins
@ 2018-03-20  4:04     ` Jeff King
  2018-03-20  4:14       ` [PATCH] doc/gitattributes: mention non-recursive behavior Jeff King
  2018-03-20  4:25       ` .gitattributes override behavior (possible bug, or documentation bug) Dakota Hawkins
  1 sibling, 2 replies; 28+ messages in thread
From: Jeff King @ 2018-03-20  4:04 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Git

On Mon, Mar 19, 2018 at 11:10:26PM -0400, Dakota Hawkins wrote:

> > As you noted below, that second line does not match your path, because
> > attributes on a directory aren't applied recursively. And it has nothing
> > to do with overriding. If you remove the png line entirely, you can see
> > that we still do not match it. You need to use "*" to match the paths.
> 
> Ah, yes, I see that. Inconsistent with .gitignore (more below), right?

Yes, it is.

> > I could not find anything useful in gitattributes(5). There's some old
> > discussion here:
> >
> >   https://public-inbox.org/git/slrnkldd3g.1l4.jan@majutsushi.net/
> 
> If I follow that correctly: There's some initial speculation that it
> would be OK to apply the attributes recursively, which is then shot
> down because it wasn't designed to be recursive (though I don't see a
> different, technical reason for that), followed by finding a (this
> same?) solution/workaround for the original problem. Is that about
> right?

Right. The technical reason is mostly "that is not how it was designed,
and it would possibly break some corner cases if we switched it now".

> > I think just "/.readme-docs/**" should be sufficient for your case. You
> > could also probably write "*" inside ".readme-docs/.gitattributes",
> > which may be simpler (you don't need "**" there because patterns without
> > a slash are just matched directly against the basename).
> 
> Wouldn't that make the "*" inside ".readme-docs/.gitattributes",
> technically recursive when "*" matches a directory?

Sort of. The pattern is applied recursively to each basename, but the
pattern itself does not match recursively. That's probably splitting
hairs, though. :)

> It's always seemed to me that both were necessary to explicitly match
> things in a directory and its subdirectories (example, IIRC: "git
> ls-files -- .gitattributes" vs "git ls-files -- .gitattributes
> **/.gitattributes"). Maybe that example is peculiar in that its a
> dotfile and can't have a wildcard before the dot?

Those are pathspecs, which are not quite the same as gitattributes. They
don't do the magic basename matching.

For pathspecs a "*" matches across slashes, which is what allows "git
log -- '*.h'" to work (but not a suffix wildcard like 'foo*').

> I guess my takeaway is that it would be _good_ if the gitattributes
> documentation contained the caveat about not matching directories
> recursively, but _great_ if gitattributes and gitignore (and whatever
> else there is) were consistent.

I agree it would be nice if they were consistent (and pathspecs, too).
But unfortunately at this point there's a maze of backwards
compatibility to deal with.

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  3:17     ` Dakota Hawkins
@ 2018-03-20  4:12       ` Jeff King
  0 siblings, 0 replies; 28+ messages in thread
From: Jeff King @ 2018-03-20  4:12 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Git

On Mon, Mar 19, 2018 at 11:17:04PM -0400, Dakota Hawkins wrote:

> Sorry to tack on to my previous email, but I just thought of this:
> 
> If something like "-diff=lfs" won't do what I (and git-lfs) thought it
> would, do you think it would be prudent/reasonable to suggest git-lfs
> add a "no-lfs" filter for exactly this case? That way I could have
> explicit exclusions without any "diff=foo" shenanigans.

It might be if my earlier email weren't totally wrong. ;) I think the
"!filter" syntax that Junio mentioned would do what you want.

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH] doc/gitattributes: mention non-recursive behavior
  2018-03-20  4:04     ` Jeff King
@ 2018-03-20  4:14       ` Jeff King
  2018-03-20  4:28         ` Dakota Hawkins
  2018-03-20 16:41         ` Duy Nguyen
  2018-03-20  4:25       ` .gitattributes override behavior (possible bug, or documentation bug) Dakota Hawkins
  1 sibling, 2 replies; 28+ messages in thread
From: Jeff King @ 2018-03-20  4:14 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Junio C Hamano, Git

On Tue, Mar 20, 2018 at 12:04:11AM -0400, Jeff King wrote:

> > I guess my takeaway is that it would be _good_ if the gitattributes
> > documentation contained the caveat about not matching directories
> > recursively, but _great_ if gitattributes and gitignore (and whatever
> > else there is) were consistent.
> 
> I agree it would be nice if they were consistent (and pathspecs, too).
> But unfortunately at this point there's a maze of backwards
> compatibility to deal with.

So let's not forget to do the easy half there. Here's a patch.

-- >8 --
Subject: [PATCH] doc/gitattributes: mention non-recursive behavior

The gitattributes documentation claims that the pattern
rules are largely the same as for gitignore. However, the
rules for recursion are different.

In an ideal world, we would make them the same (if for
nothing else than consistency and simplicity), but that
would create backwards compatibility issues. For some
discussion, see this thread:

  https://public-inbox.org/git/slrnkldd3g.1l4.jan@majutsushi.net/

But let's at least document the differences instead of
actively misleading the user by claiming that they're the
same.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/gitattributes.txt | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index d52b254a22..1094fe2b5b 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -56,9 +56,16 @@ Unspecified::
 
 When more than one pattern matches the path, a later line
 overrides an earlier line.  This overriding is done per
-attribute.  The rules how the pattern matches paths are the
-same as in `.gitignore` files; see linkgit:gitignore[5].
-Unlike `.gitignore`, negative patterns are forbidden.
+attribute.
+
+The rules by which the pattern matches paths are the same as in
+`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:
+
+  - negative patterns are forbidden
+
+  - patterns that match a directory do not recursively match paths
+    inside that directory (so using the trailing-slash `path/` syntax is
+    pointless in an attributes file; use `path/**` instead)
 
 When deciding what attributes are assigned to a path, Git
 consults `$GIT_DIR/info/attributes` file (which has the highest
-- 
2.17.0.rc0.402.ged0b3fd1ee


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  4:04     ` Jeff King
  2018-03-20  4:14       ` [PATCH] doc/gitattributes: mention non-recursive behavior Jeff King
@ 2018-03-20  4:25       ` Dakota Hawkins
  2018-03-20  4:40         ` Jeff King
  1 sibling, 1 reply; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-20  4:25 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: Git

> Right. The technical reason is mostly "that is not how it was designed,
> and it would possibly break some corner cases if we switched it now".

I'm just spitballing here, but do you guys think there's any subset of
the combined .gitignore and .gitattributes matching functionality that
could at least serve as a good "best-practices, going forward"
(because of consistency) for both? I will say every time I do this for
a new repo and have to do something even slightly complicated or
different from what I've done before with .gitattributes/.gitignore
that it takes me a long-ish time to figure it out. It's like I'm
vaguely aware of pitfalls I've encountered in the past in certain
areas but don't remember exactly what they are, so I consult the docs,
which are (in sum) confusing and lead to more time spent
trying/failing/trying/works/fails-later/etc.

One "this subset of rules will work for both this way" would be
awesome even if the matching capabilities are technically divergent,
but on the other hand that might paint both into a corner in terms of
functionality.

>> > I think just "/.readme-docs/**" should be sufficient for your case. You
>> > could also probably write "*" inside ".readme-docs/.gitattributes",
>> > which may be simpler (you don't need "**" there because patterns without
>> > a slash are just matched directly against the basename).
>>
>> Wouldn't that make the "*" inside ".readme-docs/.gitattributes",
>> technically recursive when "*" matches a directory?
>
> Sort of. The pattern is applied recursively to each basename, but the
> pattern itself does not match recursively. That's probably splitting
> hairs, though. :)

I understand, but if I think about it too much I feel the overwhelming
urge to stop thinking about it :)

>> It's always seemed to me that both were necessary to explicitly match
>> things in a directory and its subdirectories (example, IIRC: "git
>> ls-files -- .gitattributes" vs "git ls-files -- .gitattributes
>> **/.gitattributes"). Maybe that example is peculiar in that its a
>> dotfile and can't have a wildcard before the dot?
>
> Those are pathspecs, which are not quite the same as gitattributes. They
> don't do the magic basename matching.
>
> For pathspecs a "*" matches across slashes, which is what allows "git
> log -- '*.h'" to work (but not a suffix wildcard like 'foo*').

Same comment -- makes sense but hard to want to think too much about.

>> I guess my takeaway is that it would be _good_ if the gitattributes
>> documentation contained the caveat about not matching directories
>> recursively, but _great_ if gitattributes and gitignore (and whatever
>> else there is) were consistent.
>
> I agree it would be nice if they were consistent (and pathspecs, too).
> But unfortunately at this point there's a maze of backwards
> compatibility to deal with.
>
> -Peff

Again, as above, what if there were a subset of functionality git
could recommend to avoid inconsistencies?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] doc/gitattributes: mention non-recursive behavior
  2018-03-20  4:14       ` [PATCH] doc/gitattributes: mention non-recursive behavior Jeff King
@ 2018-03-20  4:28         ` Dakota Hawkins
  2018-03-20 16:41         ` Duy Nguyen
  1 sibling, 0 replies; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-20  4:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git

That's perfect, thank you so much!

On Tue, Mar 20, 2018 at 12:14 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Mar 20, 2018 at 12:04:11AM -0400, Jeff King wrote:
>
>> > I guess my takeaway is that it would be _good_ if the gitattributes
>> > documentation contained the caveat about not matching directories
>> > recursively, but _great_ if gitattributes and gitignore (and whatever
>> > else there is) were consistent.
>>
>> I agree it would be nice if they were consistent (and pathspecs, too).
>> But unfortunately at this point there's a maze of backwards
>> compatibility to deal with.
>
> So let's not forget to do the easy half there. Here's a patch.
>
> -- >8 --
> Subject: [PATCH] doc/gitattributes: mention non-recursive behavior
>
> The gitattributes documentation claims that the pattern
> rules are largely the same as for gitignore. However, the
> rules for recursion are different.
>
> In an ideal world, we would make them the same (if for
> nothing else than consistency and simplicity), but that
> would create backwards compatibility issues. For some
> discussion, see this thread:
>
>   https://public-inbox.org/git/slrnkldd3g.1l4.jan@majutsushi.net/
>
> But let's at least document the differences instead of
> actively misleading the user by claiming that they're the
> same.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  Documentation/gitattributes.txt | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index d52b254a22..1094fe2b5b 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -56,9 +56,16 @@ Unspecified::
>
>  When more than one pattern matches the path, a later line
>  overrides an earlier line.  This overriding is done per
> -attribute.  The rules how the pattern matches paths are the
> -same as in `.gitignore` files; see linkgit:gitignore[5].
> -Unlike `.gitignore`, negative patterns are forbidden.
> +attribute.
> +
> +The rules by which the pattern matches paths are the same as in
> +`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:
> +
> +  - negative patterns are forbidden
> +
> +  - patterns that match a directory do not recursively match paths
> +    inside that directory (so using the trailing-slash `path/` syntax is
> +    pointless in an attributes file; use `path/**` instead)
>
>  When deciding what attributes are assigned to a path, Git
>  consults `$GIT_DIR/info/attributes` file (which has the highest
> --
> 2.17.0.rc0.402.ged0b3fd1ee
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  4:25       ` .gitattributes override behavior (possible bug, or documentation bug) Dakota Hawkins
@ 2018-03-20  4:40         ` Jeff King
  2018-03-20  4:49           ` Dakota Hawkins
  2018-03-20 16:28           ` Duy Nguyen
  0 siblings, 2 replies; 28+ messages in thread
From: Jeff King @ 2018-03-20  4:40 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Junio C Hamano, Git, Nguyễn Thái Ngọc Duy

On Tue, Mar 20, 2018 at 12:25:27AM -0400, Dakota Hawkins wrote:

> > Right. The technical reason is mostly "that is not how it was designed,
> > and it would possibly break some corner cases if we switched it now".
> 
> I'm just spitballing here, but do you guys think there's any subset of
> the combined .gitignore and .gitattributes matching functionality that
> could at least serve as a good "best-practices, going forward"
> (because of consistency) for both? I will say every time I do this for
> a new repo and have to do something even slightly complicated or
> different from what I've done before with .gitattributes/.gitignore
> that it takes me a long-ish time to figure it out. It's like I'm
> vaguely aware of pitfalls I've encountered in the past in certain
> areas but don't remember exactly what they are, so I consult the docs,
> which are (in sum) confusing and lead to more time spent
> trying/failing/trying/works/fails-later/etc.
> 
> One "this subset of rules will work for both this way" would be
> awesome even if the matching capabilities are technically divergent,
> but on the other hand that might paint both into a corner in terms of
> functionality.

As far as I know, they should be the same with the exception of this
recursion, and the negative-pattern thing. But I'm cc-ing Duy, who is
the resident expert on ignore and attributes matching (whether he wants
to be or not ;) ). I wouldn't be surprised if there's something I don't
know about.

So I think the "recommended subset" is basically "everything but these
few constructs". We just need to document them. ;)

I probably should cc'd Duy on the documentation patch, too:

  https://public-inbox.org/git/20180320041454.GA15213@sigill.intra.peff.net/

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  4:40         ` Jeff King
@ 2018-03-20  4:49           ` Dakota Hawkins
  2018-03-20 16:28           ` Duy Nguyen
  1 sibling, 0 replies; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-20  4:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git, Nguyễn Thái Ngọc Duy

> So I think the "recommended subset" is basically "everything but these
> few constructs". We just need to document them. ;)

Mentioned so-far/running list?

- Matching directories recursively, or at all I guess (e.g. "<dir>/")
  - (???) Instead: "<dir>/*"
- Negative matches
  - (???) Instead: Is there any longer-form attributes-OK equivalent?
Just positive-match with "!<thing>s"?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20  4:40         ` Jeff King
  2018-03-20  4:49           ` Dakota Hawkins
@ 2018-03-20 16:28           ` Duy Nguyen
  2018-03-21  3:22             ` Dakota Hawkins
  1 sibling, 1 reply; 28+ messages in thread
From: Duy Nguyen @ 2018-03-20 16:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Dakota Hawkins, Junio C Hamano, Git

On Tue, Mar 20, 2018 at 5:40 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Mar 20, 2018 at 12:25:27AM -0400, Dakota Hawkins wrote:
>
>> > Right. The technical reason is mostly "that is not how it was designed,
>> > and it would possibly break some corner cases if we switched it now".
>>
>> I'm just spitballing here, but do you guys think there's any subset of
>> the combined .gitignore and .gitattributes matching functionality that
>> could at least serve as a good "best-practices, going forward"
>> (because of consistency) for both? I will say every time I do this for
>> a new repo and have to do something even slightly complicated or
>> different from what I've done before with .gitattributes/.gitignore
>> that it takes me a long-ish time to figure it out. It's like I'm
>> vaguely aware of pitfalls I've encountered in the past in certain
>> areas but don't remember exactly what they are, so I consult the docs,
>> which are (in sum) confusing and lead to more time spent
>> trying/failing/trying/works/fails-later/etc.
>>
>> One "this subset of rules will work for both this way" would be

You know, you (Dakota) could implement the new "exclude" attribute in
.gitattributes and ignore .gitignore files completely. That makes it
works "for both" ;-) The effort is probably not small though.

>> awesome even if the matching capabilities are technically divergent,
>> but on the other hand that might paint both into a corner in terms of
>> functionality.
>
> As far as I know, they should be the same with the exception of this
> recursion, and the negative-pattern thing. But I'm cc-ing Duy, who is
> the resident expert on ignore and attributes matching (whether he wants
> to be or not ;) ).

Ha ha ha.

> I wouldn't be surprised if there's something I don't know about.

The only thing from the top of my head is what made me fail to unify
the implementation of the two. It's basically different order of
evaluation [1] when your patterns are spread out in multiple files. I
think it makes gitattr and gitignore behavior different too (but I
didn't try to verify).

Apart from that, the two should behave the same way besides the
exceptions you pointed out.

[1] https://public-inbox.org/git/%3CCACsJy8B8kYU7bkD8SiK354z4u=sY3hHbe4JVwNT_1pxod1cqUw@mail.gmail.com%3E/

> So I think the "recommended subset" is basically "everything but these
> few constructs". We just need to document them. ;)
>
> I probably should cc'd Duy on the documentation patch, too:
>
>   https://public-inbox.org/git/20180320041454.GA15213@sigill.intra.peff.net/
>
> -Peff
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] doc/gitattributes: mention non-recursive behavior
  2018-03-20  4:14       ` [PATCH] doc/gitattributes: mention non-recursive behavior Jeff King
  2018-03-20  4:28         ` Dakota Hawkins
@ 2018-03-20 16:41         ` Duy Nguyen
  2018-03-21  6:50           ` Jeff King
  1 sibling, 1 reply; 28+ messages in thread
From: Duy Nguyen @ 2018-03-20 16:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Dakota Hawkins, Junio C Hamano, Git

On Tue, Mar 20, 2018 at 5:14 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Mar 20, 2018 at 12:04:11AM -0400, Jeff King wrote:
>
>> > I guess my takeaway is that it would be _good_ if the gitattributes
>> > documentation contained the caveat about not matching directories
>> > recursively, but _great_ if gitattributes and gitignore (and whatever
>> > else there is) were consistent.
>>
>> I agree it would be nice if they were consistent (and pathspecs, too).
>> But unfortunately at this point there's a maze of backwards
>> compatibility to deal with.
>
> So let's not forget to do the easy half there. Here's a patch.
>
> -- >8 --
> Subject: [PATCH] doc/gitattributes: mention non-recursive behavior
>
> The gitattributes documentation claims that the pattern
> rules are largely the same as for gitignore. However, the
> rules for recursion are different.
>
> In an ideal world, we would make them the same (if for
> nothing else than consistency and simplicity), but that
> would create backwards compatibility issues. For some
> discussion, see this thread:
>
>   https://public-inbox.org/git/slrnkldd3g.1l4.jan@majutsushi.net/
>
> But let's at least document the differences instead of
> actively misleading the user by claiming that they're the
> same.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  Documentation/gitattributes.txt | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index d52b254a22..1094fe2b5b 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -56,9 +56,16 @@ Unspecified::
>
>  When more than one pattern matches the path, a later line
>  overrides an earlier line.  This overriding is done per
> -attribute.  The rules how the pattern matches paths are the
> -same as in `.gitignore` files; see linkgit:gitignore[5].
> -Unlike `.gitignore`, negative patterns are forbidden.
> +attribute.
> +
> +The rules by which the pattern matches paths are the same as in
> +`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:
> +
> +  - negative patterns are forbidden

After 8b1bd02415 (Make !pattern in .gitattributes non-fatal -
2013-03-01) maybe we could use the verb "ignored" too instead of
"forbidden"

> +
> +  - patterns that match a directory do not recursively match paths
> +    inside that directory (so using the trailing-slash `path/` syntax is

Technically gitignore does not match paths inside either. It simply
ignores the whole dir and not traverse in (which is more of an
optimization than anything). That is coincidentally perceived as
recursively ignoring. Anyway yes it's good to spell out the
differences here for gitattributes.

> +    pointless in an attributes file; use `path/**` instead)

We probably could do this internally too (converting "path/" to
"path/**") but we need to deal with corner cases (e.g. "path" without
the trailing slash, but is a directory). So yes, suggesting the user
to do it instead may be easier.

>
>  When deciding what attributes are assigned to a path, Git
>  consults `$GIT_DIR/info/attributes` file (which has the highest
> --
> 2.17.0.rc0.402.ged0b3fd1ee
>



-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-20 16:28           ` Duy Nguyen
@ 2018-03-21  3:22             ` Dakota Hawkins
  2018-03-21  6:52               ` Jeff King
  2018-03-21 16:07               ` Duy Nguyen
  0 siblings, 2 replies; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-21  3:22 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Jeff King, Junio C Hamano, Git

Thinking about this a little more, I'm now attracted to the idea that
its .gitignore that's weird.

As I understand it, .gitignore stops recursion when there's a
directory match (`somedir/`) but also explicitly allows nested
.gitnore file _as well as_ exclusion (`!*.txt`).

So, in the following (contrived) example, the user doesn't get what they want:

    repo/
    |- .git/
    |- .gitignore               # /ignore-most/
    |- ignore-most/
    |  |- .gitignore            # !*.txt
    |  |- please_ignore.png
    |  |- dont_ignore_me.txt

`repo/ignore-most/dont_ignore_me.txt` is still ignored, despite what
seems like the obvious intention of the user.

Maybe a unified "best-practices" would first-and-foremost recommend
against matching directories at all (makes sense, git doesn't manage
directories). In the above example, changing `/ignore-most/` to
`/ignore-most/*` has the "desired" effect.

What do you think?

On Tue, Mar 20, 2018 at 12:28 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Tue, Mar 20, 2018 at 5:40 AM, Jeff King <peff@peff.net> wrote:
>> On Tue, Mar 20, 2018 at 12:25:27AM -0400, Dakota Hawkins wrote:
>>
>>> > Right. The technical reason is mostly "that is not how it was designed,
>>> > and it would possibly break some corner cases if we switched it now".
>>>
>>> I'm just spitballing here, but do you guys think there's any subset of
>>> the combined .gitignore and .gitattributes matching functionality that
>>> could at least serve as a good "best-practices, going forward"
>>> (because of consistency) for both? I will say every time I do this for
>>> a new repo and have to do something even slightly complicated or
>>> different from what I've done before with .gitattributes/.gitignore
>>> that it takes me a long-ish time to figure it out. It's like I'm
>>> vaguely aware of pitfalls I've encountered in the past in certain
>>> areas but don't remember exactly what they are, so I consult the docs,
>>> which are (in sum) confusing and lead to more time spent
>>> trying/failing/trying/works/fails-later/etc.
>>>
>>> One "this subset of rules will work for both this way" would be
>
> You know, you (Dakota) could implement the new "exclude" attribute in
> .gitattributes and ignore .gitignore files completely. That makes it
> works "for both" ;-) The effort is probably not small though.
>
>>> awesome even if the matching capabilities are technically divergent,
>>> but on the other hand that might paint both into a corner in terms of
>>> functionality.
>>
>> As far as I know, they should be the same with the exception of this
>> recursion, and the negative-pattern thing. But I'm cc-ing Duy, who is
>> the resident expert on ignore and attributes matching (whether he wants
>> to be or not ;) ).
>
> Ha ha ha.
>
>> I wouldn't be surprised if there's something I don't know about.
>
> The only thing from the top of my head is what made me fail to unify
> the implementation of the two. It's basically different order of
> evaluation [1] when your patterns are spread out in multiple files. I
> think it makes gitattr and gitignore behavior different too (but I
> didn't try to verify).
>
> Apart from that, the two should behave the same way besides the
> exceptions you pointed out.
>
> [1] https://public-inbox.org/git/%3CCACsJy8B8kYU7bkD8SiK354z4u=sY3hHbe4JVwNT_1pxod1cqUw@mail.gmail.com%3E/
>
>> So I think the "recommended subset" is basically "everything but these
>> few constructs". We just need to document them. ;)
>>
>> I probably should cc'd Duy on the documentation patch, too:
>>
>>   https://public-inbox.org/git/20180320041454.GA15213@sigill.intra.peff.net/
>>
>> -Peff
> --
> Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] doc/gitattributes: mention non-recursive behavior
  2018-03-20 16:41         ` Duy Nguyen
@ 2018-03-21  6:50           ` Jeff King
  2018-03-21 16:16             ` Duy Nguyen
  0 siblings, 1 reply; 28+ messages in thread
From: Jeff King @ 2018-03-21  6:50 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Dakota Hawkins, Junio C Hamano, Git

On Tue, Mar 20, 2018 at 05:41:52PM +0100, Duy Nguyen wrote:

> > +The rules by which the pattern matches paths are the same as in
> > +`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:
> > +
> > +  - negative patterns are forbidden
> 
> After 8b1bd02415 (Make !pattern in .gitattributes non-fatal -
> 2013-03-01) maybe we could use the verb "ignored" too instead of
> "forbidden"

Makes sense. The original is already in 'next', so do you want to send
an incremental patch?

> > +    pointless in an attributes file; use `path/**` instead)
> 
> We probably could do this internally too (converting "path/" to
> "path/**") but we need to deal with corner cases (e.g. "path" without
> the trailing slash, but is a directory). So yes, suggesting the user
> to do it instead may be easier.

Yeah, I almost suggested that, but I worried about those corner cases.
It seems like documenting the current behavior is the right first step
in any case.

One other maybe-difference I came across coincidentally today: you have
to quote the pattern in .gitattributes if it contains spaces, but not in
.gitignore. But that's more an artifact of the rest of the file syntax
than the pattern syntax (.gitignore has no other fields to confuse it
with).

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-21  3:22             ` Dakota Hawkins
@ 2018-03-21  6:52               ` Jeff King
  2018-03-21  7:36                 ` Dakota Hawkins
  2018-03-21 16:07               ` Duy Nguyen
  1 sibling, 1 reply; 28+ messages in thread
From: Jeff King @ 2018-03-21  6:52 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Duy Nguyen, Junio C Hamano, Git

On Tue, Mar 20, 2018 at 11:22:02PM -0400, Dakota Hawkins wrote:

> Thinking about this a little more, I'm now attracted to the idea that
> its .gitignore that's weird.
> 
> As I understand it, .gitignore stops recursion when there's a
> directory match (`somedir/`) but also explicitly allows nested
> .gitnore file _as well as_ exclusion (`!*.txt`).
> 
> So, in the following (contrived) example, the user doesn't get what they want:
> 
>     repo/
>     |- .git/
>     |- .gitignore               # /ignore-most/
>     |- ignore-most/
>     |  |- .gitignore            # !*.txt
>     |  |- please_ignore.png
>     |  |- dont_ignore_me.txt
> 
> `repo/ignore-most/dont_ignore_me.txt` is still ignored, despite what
> seems like the obvious intention of the user.

Right, I think this gets back to Duy's other email. We're not really
recursively applying the pattern, it's just that we don't bother
descending into ignore areas at all.

> Maybe a unified "best-practices" would first-and-foremost recommend
> against matching directories at all (makes sense, git doesn't manage
> directories). In the above example, changing `/ignore-most/` to
> `/ignore-most/*` has the "desired" effect.
> 
> What do you think?

I think that ignoring all of /ignore-most/ is much more efficient, since
we don't have to enumerate the paths inside it at all (which is why the
current behavior works as it does).

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-21  6:52               ` Jeff King
@ 2018-03-21  7:36                 ` Dakota Hawkins
  2018-03-21  7:44                   ` Dakota Hawkins
                                     ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-21  7:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Duy Nguyen, Junio C Hamano, Git

> I think that ignoring all of /ignore-most/ is much more efficient, since
> we don't have to enumerate the paths inside it at all (which is why the
> current behavior works as it does).

That's definitely true, but I wonder what the impact would be for most
cases (even for most cases with large repos and larges sets of ignore
files).

Most of my .gitignore patterns weren't hand-written
(https://www.gitignore.io/ is pretty neat), but there are a ton of
patterns like `dir/`...

I think if I were designing it from scratch and knew what I know now
I'd probably argue that behavior should be declarative (`dir/*
recurse=false` or something), but we can't really get there from here.

At any rate, would it at least be a good idea to make the "trailing
slash halts recursion, won't consider nested .gitignore files"
explicit in the `.gitignore` doc? Unless I'm missing it, I don't think
that behavior is called out (or at least not called out concisely/in
one place). It looks like this is all there is:

    "If the pattern ends with a slash, it is removed for the purpose
of the following description, but it would only find a match with a
directory. In other words, foo/ will match a directory foo and paths
underneath it, but will not match a regular file or a symbolic link
foo (this is consistent with the way how pathspec works in general in
Git)."

Also, I'm not sure what the "following description" is in "it is
removed for the purpose of the following description". Is that trying
to imply "excluded from the rest of the doc"?

- Dakota

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-21  7:36                 ` Dakota Hawkins
@ 2018-03-21  7:44                   ` Dakota Hawkins
  2018-03-21  7:50                   ` Jeff King
  2018-03-21 16:18                   ` Junio C Hamano
  2 siblings, 0 replies; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-21  7:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Duy Nguyen, Junio C Hamano, Git

One extra note:

I was burned by this just a few hours ago in a new repo (but because
of this discussion I realized what the problem was pretty quickly).

In the top-level .gitignore I had

    build/
    ...
    !alpine/build/

where `build/` was a stock ignore line among hundreds that I blindly
pasted in there, and the exclusion was an attempt to exclude some
things that shouldn't have been ignored.

Even in the same file, the exclusion doesn't work. I had to change it to:

    build/*
    ...
    !alpine/build/

Good times :)

- Dakota

On Wed, Mar 21, 2018 at 3:36 AM, Dakota Hawkins
<dakota@dakotahawkins.com> wrote:
>> I think that ignoring all of /ignore-most/ is much more efficient, since
>> we don't have to enumerate the paths inside it at all (which is why the
>> current behavior works as it does).
>
> That's definitely true, but I wonder what the impact would be for most
> cases (even for most cases with large repos and larges sets of ignore
> files).
>
> Most of my .gitignore patterns weren't hand-written
> (https://www.gitignore.io/ is pretty neat), but there are a ton of
> patterns like `dir/`...
>
> I think if I were designing it from scratch and knew what I know now
> I'd probably argue that behavior should be declarative (`dir/*
> recurse=false` or something), but we can't really get there from here.
>
> At any rate, would it at least be a good idea to make the "trailing
> slash halts recursion, won't consider nested .gitignore files"
> explicit in the `.gitignore` doc? Unless I'm missing it, I don't think
> that behavior is called out (or at least not called out concisely/in
> one place). It looks like this is all there is:
>
>     "If the pattern ends with a slash, it is removed for the purpose
> of the following description, but it would only find a match with a
> directory. In other words, foo/ will match a directory foo and paths
> underneath it, but will not match a regular file or a symbolic link
> foo (this is consistent with the way how pathspec works in general in
> Git)."
>
> Also, I'm not sure what the "following description" is in "it is
> removed for the purpose of the following description". Is that trying
> to imply "excluded from the rest of the doc"?
>
> - Dakota

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-21  7:36                 ` Dakota Hawkins
  2018-03-21  7:44                   ` Dakota Hawkins
@ 2018-03-21  7:50                   ` Jeff King
  2018-03-21  8:35                     ` Dakota Hawkins
  2018-03-21 16:18                   ` Junio C Hamano
  2 siblings, 1 reply; 28+ messages in thread
From: Jeff King @ 2018-03-21  7:50 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Duy Nguyen, Junio C Hamano, Git

On Wed, Mar 21, 2018 at 03:36:09AM -0400, Dakota Hawkins wrote:

> > I think that ignoring all of /ignore-most/ is much more efficient, since
> > we don't have to enumerate the paths inside it at all (which is why the
> > current behavior works as it does).
> 
> That's definitely true, but I wonder what the impact would be for most
> cases (even for most cases with large repos and larges sets of ignore
> files).

Probably not that much CPU, but it used to be quite annoying to fault in
the dcache for an infrequently used hierarchy. These days I generally
have an SSD and lots of memory for disk caching, so I don't know if it
would still be.

On systems where readdir/stat are slower than Linux it might be
noticeable for a big hierarchy.

> At any rate, would it at least be a good idea to make the "trailing
> slash halts recursion, won't consider nested .gitignore files"
> explicit in the `.gitignore` doc? Unless I'm missing it, I don't think
> that behavior is called out (or at least not called out concisely/in
> one place). It looks like this is all there is:

Yeah, it's definitely come up multiple times over the years. I don't
know what all is in gitignore(5), but if it's not mentioned it probably
should be.

>     "If the pattern ends with a slash, it is removed for the purpose
> of the following description, but it would only find a match with a
> directory. In other words, foo/ will match a directory foo and paths
> underneath it, but will not match a regular file or a symbolic link
> foo (this is consistent with the way how pathspec works in general in
> Git)."
> 
> Also, I'm not sure what the "following description" is in "it is
> removed for the purpose of the following description". Is that trying
> to imply "excluded from the rest of the doc"?

I think it means "for the rest of the description of how the patterns
work". I.e., "foo/" matches as "foo" when the rest of the matching rules
are applied. I agree it's a bit awkward. Patches welcome. :)

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-21  7:50                   ` Jeff King
@ 2018-03-21  8:35                     ` Dakota Hawkins
  2018-03-21  8:36                       ` Jeff King
  0 siblings, 1 reply; 28+ messages in thread
From: Dakota Hawkins @ 2018-03-21  8:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Duy Nguyen, Junio C Hamano, Git

> I think it means "for the rest of the description of how the patterns
> work". I.e., "foo/" matches as "foo" when the rest of the matching rules
> are applied. I agree it's a bit awkward. Patches welcome. :)

I'd be more than happy to do that!

It will take me a while, this (email and text-patches) is foreign to
me and will take me some extra time, but I think I can figure it out.

Is "consistent with the way how pathspec works in general in Git"
absolutely still true (and relevant?)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-21  8:35                     ` Dakota Hawkins
@ 2018-03-21  8:36                       ` Jeff King
  0 siblings, 0 replies; 28+ messages in thread
From: Jeff King @ 2018-03-21  8:36 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Duy Nguyen, Junio C Hamano, Git

On Wed, Mar 21, 2018 at 04:35:26AM -0400, Dakota Hawkins wrote:

> > I think it means "for the rest of the description of how the patterns
> > work". I.e., "foo/" matches as "foo" when the rest of the matching rules
> > are applied. I agree it's a bit awkward. Patches welcome. :)
> 
> I'd be more than happy to do that!
> 
> It will take me a while, this (email and text-patches) is foreign to
> me and will take me some extra time, but I think I can figure it out.
> 
> Is "consistent with the way how pathspec works in general in Git"
> absolutely still true (and relevant?)

I think so. "git log Makefile/" will not match anything, for example.

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-21  3:22             ` Dakota Hawkins
  2018-03-21  6:52               ` Jeff King
@ 2018-03-21 16:07               ` Duy Nguyen
  1 sibling, 0 replies; 28+ messages in thread
From: Duy Nguyen @ 2018-03-21 16:07 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Jeff King, Junio C Hamano, Git

On Wed, Mar 21, 2018 at 4:22 AM, Dakota Hawkins
<dakota@dakotahawkins.com> wrote:
> Thinking about this a little more, I'm now attracted to the idea that
> its .gitignore that's weird.
>
> As I understand it, .gitignore stops recursion when there's a
> directory match (`somedir/`) but also explicitly allows nested
> .gitnore file _as well as_ exclusion (`!*.txt`).
>
> So, in the following (contrived) example, the user doesn't get what they want:
>
>     repo/
>     |- .git/
>     |- .gitignore               # /ignore-most/
>     |- ignore-most/
>     |  |- .gitignore            # !*.txt
>     |  |- please_ignore.png
>     |  |- dont_ignore_me.txt
>
> `repo/ignore-most/dont_ignore_me.txt` is still ignored, despite what
> seems like the obvious intention of the user.

Don't get me started on this. I voiced this problem a couple times.
Attempted to fix it once which made it to rc cycles and caused lots of
regressions. I haven't taken another stab since.

> Maybe a unified "best-practices" would first-and-foremost recommend
> against matching directories at all (makes sense, git doesn't manage
> directories). In the above example, changing `/ignore-most/` to
> `/ignore-most/*` has the "desired" effect.

I think it's actually more intuitive to think "ignore recursively"
(with the trailing slash) These things make sense to you once you know
exactly _how_ the matching machinery works. But normal users don't
know that. It's probably better that we add recursive matching support
in gitattributes. Then both gitignore/gitattr are more or less
consistent again and also easy to use (most of the times)
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] doc/gitattributes: mention non-recursive behavior
  2018-03-21  6:50           ` Jeff King
@ 2018-03-21 16:16             ` Duy Nguyen
  2018-03-23  9:12               ` Jeff King
  0 siblings, 1 reply; 28+ messages in thread
From: Duy Nguyen @ 2018-03-21 16:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Dakota Hawkins, Junio C Hamano, Git

On Wed, Mar 21, 2018 at 7:50 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Mar 20, 2018 at 05:41:52PM +0100, Duy Nguyen wrote:
>
>> > +The rules by which the pattern matches paths are the same as in
>> > +`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:
>> > +
>> > +  - negative patterns are forbidden
>>
>> After 8b1bd02415 (Make !pattern in .gitattributes non-fatal -
>> 2013-03-01) maybe we could use the verb "ignored" too instead of
>> "forbidden"
>
> Makes sense. The original is already in 'next', so do you want to send
> an incremental patch?

It's up to you. After all it's you who's doing all the work :)

>> > +    pointless in an attributes file; use `path/**` instead)
>>
>> We probably could do this internally too (converting "path/" to
>> "path/**") but we need to deal with corner cases (e.g. "path" without
>> the trailing slash, but is a directory). So yes, suggesting the user
>> to do it instead may be easier.
>
> Yeah, I almost suggested that, but I worried about those corner cases.
> It seems like documenting the current behavior is the right first step
> in any case.

Agreed.

> One other maybe-difference I came across coincidentally today: you have
> to quote the pattern in .gitattributes if it contains spaces, but not in
> .gitignore. But that's more an artifact of the rest of the file syntax
> than the pattern syntax (.gitignore has no other fields to confuse it
> with).

Yeah I forgot about that (and I was the one who started it). The
document was updated in 860a74d9d9 (attr: support quoting pathname
patterns in C style - 2017-01-27) though.
-- 
Duy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: .gitattributes override behavior (possible bug, or documentation bug)
  2018-03-21  7:36                 ` Dakota Hawkins
  2018-03-21  7:44                   ` Dakota Hawkins
  2018-03-21  7:50                   ` Jeff King
@ 2018-03-21 16:18                   ` Junio C Hamano
  2 siblings, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2018-03-21 16:18 UTC (permalink / raw)
  To: Dakota Hawkins; +Cc: Jeff King, Duy Nguyen, Git

Dakota Hawkins <dakota@dakotahawkins.com> writes:

> At any rate, would it at least be a good idea to make the "trailing
> slash halts recursion, won't consider nested .gitignore files"
> explicit in the `.gitignore` doc? Unless I'm missing it, I don't think
> that behavior is called out (or at least not called out concisely/in
> one place). It looks like this is all there is:
>
>     "If the pattern ends with a slash, it is removed for the purpose
> of the following description, but it would only find a match with a
> directory. In other words, foo/ will match a directory foo and paths
> underneath it, but will not match a regular file or a symbolic link
> foo (this is consistent with the way how pathspec works in general in
> Git)."

This is not saying "trailing slash halts" (and there is no need to
say "trailing slash halts"---the rule is "directory is not recursed
into").  The mention of '/' in the above paragraph is merely saying
that we chose to use a trailing slash as a syntax to say "I want the
path to match this pattern, but only when the path is a directory".
In other words, if "D" (nothing else on the line) is on a line, it
would match a file whose path is "D", but if you write "D/", it
would not.  It only would match a directory whose path is "D".

What "removed for the purpose of the following description" wants to
say is about where in the path "D" in the above example can appear.
Another rule after the paragraph would say that a pattern with a
slash in it will match only at the current level, and a pattern with
no slash would match in any level down below.  Imagine an entry "D"
and another entry "A/B" in .gitignore at the top-level of the
project.  The former has slash in it, the latter does not.  These
patterns match paths "D", "X/D", "A/B" but not "X/A/B".  The first
two match because the pattern "D" is not anchored with any slash,
the last one does not match because the pattern "A/B" has a slash in
it.

If the top-level .gitignore had "D/" and "A/B" in the above example,
and paths "D" and "X/D" were both directories, the pattern "D/"
still matches the directory "X/D", even though it is spelled with a
slash in it.  That slash is there merely for the purpose of marking
the pattern to only match directories, and does not trigger "slash
in a pattern anchors the pattern" rule, because it is "removed for
the purpose of the following description".

> Also, I'm not sure what the "following description" is in "it is
> removed for the purpose of the following description". Is that trying
> to imply "excluded from the rest of the doc"?

See above.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] doc/gitattributes: mention non-recursive behavior
  2018-03-21 16:16             ` Duy Nguyen
@ 2018-03-23  9:12               ` Jeff King
  0 siblings, 0 replies; 28+ messages in thread
From: Jeff King @ 2018-03-23  9:12 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Dakota Hawkins, Junio C Hamano, Git

On Wed, Mar 21, 2018 at 05:16:16PM +0100, Duy Nguyen wrote:

> >> After 8b1bd02415 (Make !pattern in .gitattributes non-fatal -
> >> 2013-03-01) maybe we could use the verb "ignored" too instead of
> >> "forbidden"
> >
> > Makes sense. The original is already in 'next', so do you want to send
> > an incremental patch?
> 
> It's up to you. After all it's you who's doing all the work :)

I was trying to trick you into doing it. ;)

As I tried to write up the commit message, though, I had second
thoughts. It's true that they are ignored in the current code. But I do
not think they are something we want to encourage, and certainly we do
not want to promise that the "ignored" behavior will last forever.

So I think it is actually best to declare them forbidden, and we just
happen to treat it as a non-fatal error in the current code.

> > One other maybe-difference I came across coincidentally today: you have
> > to quote the pattern in .gitattributes if it contains spaces, but not in
> > .gitignore. But that's more an artifact of the rest of the file syntax
> > than the pattern syntax (.gitignore has no other fields to confuse it
> > with).
> 
> Yeah I forgot about that (and I was the one who started it). The
> document was updated in 860a74d9d9 (attr: support quoting pathname
> patterns in C style - 2017-01-27) though.

OK. I was considering adding a special note to the list of differences,
but I think the existing text is probably fine.

-Peff

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-03-23  9:12 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-20  1:49 .gitattributes override behavior (possible bug, or documentation bug) Dakota Hawkins
2018-03-20  2:34 ` Jeff King
2018-03-20  3:10   ` Dakota Hawkins
2018-03-20  3:17     ` Dakota Hawkins
2018-03-20  4:12       ` Jeff King
2018-03-20  4:04     ` Jeff King
2018-03-20  4:14       ` [PATCH] doc/gitattributes: mention non-recursive behavior Jeff King
2018-03-20  4:28         ` Dakota Hawkins
2018-03-20 16:41         ` Duy Nguyen
2018-03-21  6:50           ` Jeff King
2018-03-21 16:16             ` Duy Nguyen
2018-03-23  9:12               ` Jeff King
2018-03-20  4:25       ` .gitattributes override behavior (possible bug, or documentation bug) Dakota Hawkins
2018-03-20  4:40         ` Jeff King
2018-03-20  4:49           ` Dakota Hawkins
2018-03-20 16:28           ` Duy Nguyen
2018-03-21  3:22             ` Dakota Hawkins
2018-03-21  6:52               ` Jeff King
2018-03-21  7:36                 ` Dakota Hawkins
2018-03-21  7:44                   ` Dakota Hawkins
2018-03-21  7:50                   ` Jeff King
2018-03-21  8:35                     ` Dakota Hawkins
2018-03-21  8:36                       ` Jeff King
2018-03-21 16:18                   ` Junio C Hamano
2018-03-21 16:07               ` Duy Nguyen
2018-03-20  3:33   ` Junio C Hamano
2018-03-20  3:40     ` Dakota Hawkins
2018-03-20  3:45     ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).