From: Kevin Ballard <kevin@sb.org>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PROPOSAL] .gitignore syntax modification
Date: Fri, 15 Oct 2010 13:15:35 -0700
Message-ID: <40CE50FF-C2F0-4403-9248-1D8872BF567E@sb.org>
In-Reply-To: <AANLkTim8wTQiX5L1gcXWNR9xuTPATxY6_+0Q=KdoxpPL@mail.gmail.com>

On Oct 15, 2010, at 5:57 AM, Nguyen Thai Ngoc Duy wrote:

> On Fri, Oct 15, 2010 at 6:01 PM, Kevin Ballard <kevin@sb.org> wrote:
>> Got around to glancing at your patch. Looks pretty good, and it does build if you simply define EXC_FLAG_STARSTAR, though there are a few changes that are definitely necessary (a path of "*" will cause this to run off the end of the string while trying to detect "**/"). I'll have some more time next week to take a much closer look though. As for performance, I'm not particularly worried. The only performance change is if EXC_FLAG_STARSTAR is checked, in the worst-case it'll try to apply the pattern once per level of directory nesting. As this is just string twiddling, it's bound to be pretty fast, and I don't think there's any viable alternative to doing this kind of loop anyway. That said, I'd still like to support putting **/ anywhere in the pattern instead of just at the beginning, and possibly even support ** (without the trailing /).
> 
> Well that would mean reimplementing fnmatch(). I don't know, maybe
> it's not hard to do that. '*' can already match '/' if FNM_PATHNAME is
> not given. So one just needs to tell fnmatch() '**' is '*' without
> FNM_PATHNAME.
> 
> "**/" optimization can be extended to support "path/to/**/" quite
> easily as long as no wildcards are used in "path/to/" part.

I think both cases can be handled while still using fnmatch(). We can split the pattern at every instance of "**" and then match each pattern segment with fnmatch() against successive parts of the path. Once a given segment matches part of the path, we assume that match is correct and move on (i.e. we never backtrack to the ** before it). The only special case is that the very last pattern segment has to match at the end of the path. We can count the slashes in each pattern segment to figure out how to slice up the path to pass to fnmatch().
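
To make the idea concrete, here is a rough sketch in C. It is not taken from Duy's patch or from the gitignore code. The names starstar_match(), match_at(), count_components() and skip_components() are made up for illustration. It also assumes more than the paragraph above does: "**" only ever appears as a whole path component ("**/", "/**/" or "/**"), it stands for zero or more directory levels, and the path has no empty components or trailing slash. Each group of pattern components between two "**" is handed to fnmatch() with FNM_PATHNAME, the first group is anchored at the start of the path, later groups take the leftmost match with no backtracking, and the last group is lined up with the end of the path by counting components.

```c
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/* Number of '/'-separated components in s ("" has none). */
static size_t count_components(const char *s)
{
	size_t n;

	if (!*s)
		return 0;
	for (n = 1; *s; s++)
		if (*s == '/')
			n++;
	return n;
}

/* Pointer just past the first n components of s, or NULL if s is too short. */
static const char *skip_components(const char *s, size_t n)
{
	while (n--) {
		const char *slash;

		if (!*s)
			return NULL;
		slash = strchr(s, '/');
		s = slash ? slash + 1 : s + strlen(s);
	}
	return s;
}

/* Does group (k components, no "**") match the first k components of p? */
static int match_at(const char *group, size_t k, const char *p)
{
	const char *end = skip_components(p, k);
	size_t len;
	char *slice;
	int ok;

	if (!end)
		return 0;
	len = (size_t)(end - p);
	if (len && p[len - 1] == '/')
		len--;	/* drop the separator that follows the slice */
	slice = malloc(len + 1);
	if (!slice)
		return 0;
	memcpy(slice, p, len);
	slice[len] = '\0';
	ok = !fnmatch(group, slice, FNM_PATHNAME);
	free(slice);
	return ok;
}

/* Hypothetical helper: returns 1 if path matches pattern, 0 otherwise. */
int starstar_match(const char *pattern, const char *path)
{
	size_t plen = strlen(pattern);
	char *copy = malloc(plen + 1);
	char *group = copy, *star;
	const char *p = path;
	int first = 1, ok = 1;

	if (!copy)
		return 0;
	memcpy(copy, pattern, plen + 1);
	while (ok && (star = strstr(group, "**"))) {
		char *next = star[2] == '/' ? star + 3 : star + 2;
		size_t k;

		if (star > group)
			star--;	/* also drop the '/' in front of "**" */
		*star = '\0';
		k = count_components(group);
		/* Only the first group is anchored; later ones slide right. */
		while (!(ok = match_at(group, k, p)) && !first) {
			const char *q = skip_components(p, 1);

			if (!q)
				break;
			p = q;
		}
		if (ok)
			p = skip_components(p, k);
		first = 0;
		group = next;
	}
	if (ok && !first) {
		/* The last group must line up with the end of the path. */
		size_t n = count_components(p), k = count_components(group);

		ok = n >= k && (p = skip_components(p, n - k)) != NULL;
	}
	ok = ok && !fnmatch(group, p, FNM_PATHNAME);
	free(copy);
	return ok;
}
```

With this sketch, starstar_match("foo/**/bar.c", "foo/a/b/bar.c") returns 1 and starstar_match("foo/**/bar.c", "foo/a/baz.c") returns 0. A pattern without any "**" falls straight through to a single fnmatch() call, so existing patterns would behave as they do with FNM_PATHNAME today. Supporting a bare "**" inside a component (such as "foo/**.d") would need a finer-grained way of slicing the path than whole components.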

>> If we do support ** by itself, I wonder if we should special-case having ** as the last path component of the pattern. The possible behavior change we could have is making this only match files and not directories. The use-case here is putting something like "foo/**" in the top-level .gitignore and then a few levels into foo we could put another .gitignore with an inverse pattern in order to un-ignore some deep file (or just "!foo/*/*/bar.c" inside that top-level .gitignore as well). The only way I can think of to achieve this behavior with the current gitignore is something along the lines of
>> 
>> foo/*
>> !foo/bar/
>> foo/bar/*
>> !foo/bar/baz/
>> foo/bar/baz/*
>> !foo/bar/baz/bar.c
>> 
>> And even this will only work if you know all the intermediate directories. I cannot think of any way at all right now to ignore everything in a single directory except for one file at least 1 level of nesting deeper if you don't know the names of the intermediate directories. With the proposed special-case we can say
>> 
>> foo/**
>> !foo/*/*/bar.c
>> 
>> and it will behave exactly as specified.
>> 
>> It occurs to me that we could actually tweak this slightly, to say that if a ** is encountered and there are zero slashes in the pattern after it, then it will only match files (with zero or more leading directories). This way you can have a pattern "foo/**.d" which only ignores files with the extension ".d" but will still avoid ignoring directories that end in ".d".
> 
> No idea. Seems overkill to me. But I don't use .gitignore heavily. For
> really complex ignore rules, how about allowing an external process to
> do the job? It would keep .gitignore syntax simple, yet powerful when
> needed.
> 
> A leading '|' marks an external process and can be used intermixed
> with normal patterns in .gitignore. When excluded_from_list examines a
> '|' pattern, it sends all information to the associated process' stdin
> and expects a result code on stdout. The process is started when it
> is examined the first time and is kept alive until the git process
> terminates.

That would certainly be powerful, but I don't know how much work it would take to implement; I still haven't really looked at the gitignore code. I think it's a good suggestion, but I'd still like to handle ** natively if possible.

-Kevin Ballard
