* What's cooking in git.git (Oct 2012, #01; Tue, 2) @ 2012-10-02 23:20 Junio C Hamano 2012-10-03 15:23 ` Nguyen Thai Ngoc Duy ` (2 more replies) 0 siblings, 3 replies; 31+ messages in thread From: Junio C Hamano @ 2012-10-02 23:20 UTC (permalink / raw) To: git Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The tip of 'master' is a bit past v1.8.0-rc0; we are more or less feature complete, except for a few topics marked as "Will merge to 'master' in the list below. I do not expect the topics in the stalled category will be ready for the upcoming release; they may appear in the release after v1.8.0 if they are rerolled. Also some of the topics that have been cooking in 'next' have to be deferred to the next cycle. I'm planning to keep this cycle reasonably short and aim for tagging the result as 1.8.0 at the end of 9th week, on October 21st, after which I'd disappear for a few weeks. http://tinyurl.com/gitCal is where you can always find my rough tagging schedule at. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -------------------------------------------------- [New Topics] * bw/config-lift-variable-name-length-limit (2012-10-01) 1 commit - Remove the hard coded length limit on variable names in config files The configuration parser had an unnecessary hardcoded limit on variable names that was not checked consistently. Lift the limit. Will merge to 'next'. * jc/maint-t1450-fsck-order-fix (2012-10-02) 1 commit - t1450: the order the objects are checked is undefined The fsck test assumed too much on what kind of error it will detect. The only important thing is the inconsistency is detected as an error. Will merge to 'next'. -------------------------------------------------- [Graduated to "master"] * da/mergetool-custom (2012-09-25) 1 commit (merged to 'next' on 2012-09-27 at f910851) + mergetool--lib: Allow custom commands to override built-ins The actual external command to run for mergetool backend can be specified with difftool/mergetool.$name.cmd configuration variables, but this mechanism was ignored for the backends we natively support. * ep/malloc-check-perturb (2012-09-26) 1 commit (merged to 'next' on 2012-09-27 at f115b8b) + MALLOC_CHECK: enable it, unless disabled explicitly Fixes a brown-paper bag bug. * jc/blame-follows-renames (2012-09-21) 1 commit (merged to 'next' on 2012-09-27 at 12634c1) + git blame: document that it always follows origin across whole-file renames (this branch is used by jc/blame-no-follow.) Clarify the "blame" documentation to tell the users that there is no need to ask for "--follow". * jk/completion-tests (2012-09-27) 2 commits (merged to 'next' on 2012-09-27 at 5cd6968) + t9902: add completion tests for "odd" filenames + t9902: add a few basic completion tests * jk/receive-pack-unpack-error-to-pusher (2012-09-21) 3 commits (merged to 'next' on 2012-09-27 at 90f0c6f) + receive-pack: drop "n/a" on unpacker errors + receive-pack: send pack-processing stderr over sideband + receive-pack: redirect unpack-objects stdout to /dev/null Send errors from "unpack-objects" and "index-pack" back to the "git push" over the git and smart-http protocols, just like it is done for a push over the ssh protocol. * os/commit-submodule-ignore (2012-09-24) 1 commit (merged to 'next' on 2012-09-27 at 9cd2cfd) + commit: pay attention to submodule.$name.ignore in .gitmodules "git status" honored the ignore=dirty settings in .gitmodules but "git commit" didn't. * rt/maint-clone-single (2012-09-20) 1 commit (merged to 'next' on 2012-09-27 at a47d54d) + clone --single: limit the fetch refspec to fetched branch Running "git fetch" in a repository made with "git clone --single" slurps all the branches, defeating the point of "--single". -------------------------------------------------- [Stalled] * rc/maint-complete-git-p4 (2012-09-24) 1 commit (merged to 'next' on 2012-09-25 at 116e58f) + Teach git-completion about git p4 Comment from Pete will need to be addressed in a follow-up patch. * as/check-ignore (2012-09-27) 17 commits - [SQUASH-FIX] 283d072 (Add git-check-ignore sub-command, 2012-09-20) - [SQUASH-FIX] 283d072 (Add git-check-ignore sub-command, 2012-09-20) - [REROLL NEEDED] minimum compilation fix - Add git-check-ignore sub-command - dir.c: provide free_directory() for reclaiming dir_struct memory - pathspec.c: move reusable code from builtin/add.c - dir.c: refactor treat_gitlinks() - dir.c: keep track of where patterns came from - dir.c: refactor is_path_excluded() - dir.c: refactor is_excluded() - dir.c: refactor is_excluded_from_list() - dir.c: rename excluded() to is_excluded() - dir.c: rename excluded_from_list() to is_excluded_from_list() - dir.c: rename path_excluded() to is_path_excluded() - dir.c: rename cryptic 'which' variable to more consistent name - Improve documentation and comments regarding directory traversal API - Update directory listing API doc to match code Expecting a reroll. * as/test-tweaks (2012-09-20) 7 commits - tests: paint unexpectedly fixed known breakages in bold red - tests: test the test framework more thoroughly - [SQUASH] t/t0000-basic.sh: quoting of TEST_DIRECTORY is screwed up - tests: refactor mechanics of testing in a sub test-lib - tests: paint skipped tests in bold blue - tests: test number comes first in 'not ok $count - $message' - tests: paint known breakages in bold yellow Various minor tweaks to the test framework to paint its output lines in colors that match what they mean better. Has the "is this really blue?" issue Peff raised resolved??? * fa/remote-svn (2012-09-19) 16 commits - Add a test script for remote-svn - remote-svn: add marks-file regeneration - Add a svnrdump-simulator replaying a dump file for testing - remote-svn: add incremental import - remote-svn: Activate import/export-marks for fast-import - Create a note for every imported commit containing svn metadata - vcs-svn: add fast_export_note to create notes - Allow reading svn dumps from files via file:// urls - remote-svn, vcs-svn: Enable fetching to private refs - When debug==1, start fast-import with "--stats" instead of "--quiet" - Add documentation for the 'bidi-import' capability of remote-helpers - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' capability - Add argv_array_detach and argv_array_free_detached - Add svndump_init_fd to allow reading dumps from arbitrary FDs - Add git-remote-testsvn to Makefile - Implement a remote helper for svn in C (this branch is used by fa/vcs-svn.) A GSoC project. Waiting for comments from mentors and stakeholders. * fa/vcs-svn (2012-09-19) 4 commits - vcs-svn: remove repo_tree - vcs-svn/svndump: rewrite handle_node(), begin|end_revision() - vcs-svn/svndump: restructure node_ctx, rev_ctx handling - svndump: move struct definitions to .h (this branch uses fa/remote-svn.) A GSoC project. Waiting for comments from mentors and stakeholders. * jc/maint-name-rev (2012-09-17) 7 commits - describe --contains: use "name-rev --algorithm=weight" - name-rev --algorithm=weight: tests and documentation - name-rev --algorithm=weight: cache the computed weight in notes - name-rev --algorithm=weight: trivial optimization - name-rev: --algorithm option - name_rev: clarify the logic to assign a new tip-name to a commit - name-rev: lose unnecessary typedef "git name-rev" names the given revision based on a ref that can be reached in the smallest number of steps from the rev, but that is not useful when the caller wants to know which tag is the oldest one that contains the rev. This teaches a new mode to the command that uses the oldest ref among those which contain the rev. I am not sure if this is worth it; for one thing, even with the help from notes-cache, it seems to make the "describe --contains" even slower. Also the command will be unusably slow for a user who does not have a write access (hence unable to create or update the notes-cache). Stalled mostly due to lack of responses. * jc/xprm-generation (2012-09-14) 1 commit - test-generation: compute generation numbers and clock skews A toy to analyze how bad the clock skews are in histories of real world projects. Stalled mostly due to lack of responses. * jc/blame-no-follow (2012-09-21) 2 commits - blame: pay attention to --no-follow - diff: accept --no-follow option Teaches "--no-follow" option to "git blame" to disable its whole-file rename detection. Stalled mostly due to lack of responses. * mk/maint-graph-infinity-loop (2012-09-25) 1 commit - graph.c: infinite loop in git whatchanged --graph -m The --graph code fell into infinite loop when asked to do what the code did not expect ;-) Anybody who worked on "--graph" wants to comment? Stalled mostly due to lack of responses. * ph/credential-refactor (2012-09-02) 5 commits - wincred: port to generic credential helper - Merge branch 'ef/win32-cred-helper' into ph/credential-refactor - osxkeychain: port to generic credential helper implementation - gnome-keyring: port to generic helper implementation - contrib: add generic credential helper Attempts to refactor to share code among OSX keychain, Gnome keyring and Win32 credential helpers. * ms/contrib-thunderbird-updates (2012-08-31) 2 commits - [SQUASH] minimum fixup - Thunderbird: fix appp.sh format problems Update helper to send out format-patch output using Thunderbird. Seems to have design regression for silent users. * jx/test-real-path (2012-08-27) 1 commit - test: set the realpath of CWD as TRASH_DIRECTORY Running tests with the "trash" directory elsewhere with the "--root" option did not work well if the directory was specified by a symbolic link pointing at it. Seems broken as it makes $(pwd) and TRASH_DIRECTORY inconsistent. Will discard. * jc/maint-push-refs-all (2012-08-27) 2 commits - get_fetch_map(): tighten checks on dest refs - [BROKEN] fetch/push: allow refs/*:refs/* Allows pushing and fetching everything including refs/stash. This is broken (see the log message there). Not ready. * jc/add-delete-default (2012-08-13) 1 commit - git add: notice removal of tracked paths by default "git add dir/" updated modified files and added new files, but does not notice removed files, which may be "Huh?" to some users. They can of course use "git add -A dir/", but why should they? Resurrected from graveyard, as I thought it was a worthwhile thing to do in the longer term. Waiting for comments. * tx/relative-in-the-future (2012-08-16) 2 commits - date: show relative dates in the future - date: refactor the relative date logic from presentation Not my itch; rewritten an earlier submission by Tom Xue into somewhat more maintainable form, though it breaks existing i18n. Waiting for a voluteer to fix it up. Otherwise may discard. * mb/remote-default-nn-origin (2012-07-11) 6 commits - Teach get_default_remote to respect remote.default. - Test that plain "git fetch" uses remote.default when on a detached HEAD. - Teach clone to set remote.default. - Teach "git remote" about remote.default. - Teach remote.c about the remote.default configuration setting. - Rename remote.c's default_remote_name static variables. When the user does not specify what remote to interact with, we often attempt to use 'origin'. This can now be customized via a configuration variable. Expecting a reroll. "The first remote becomes the default" bit is better done as a separate step. * jc/split-blob (2012-04-03) 6 commits - chunked-object: streaming checkout - chunked-object: fallback checkout codepaths - bulk-checkin: support chunked-object encoding - bulk-checkin: allow the same data to be multiply hashed - new representation types in the packstream - packfile: use varint functions Not ready. I finished the streaming checkout codepath, but as explained in 127b177 (bulk-checkin: support chunked-object encoding, 2011-11-30), these are still early steps of a long and painful journey. At least pack-objects and fsck need to learn the new encoding for the series to be usable locally, and then index-pack/unpack-objects needs to learn it to be used remotely. Given that I heard a lot of noise that people want large files, and that I was asked by somebody at GitTogether'11 privately for an advice on how to pay developers (not me) to help adding necessary support, I am somewhat dissapointed that the original patch series that was sent long time ago still remains here without much comments and updates from the developer community. I even made the interface to the logic that decides where to split chunks easily replaceable, and I deliberately made the logic in the original patch extremely stupid to entice others, especially the "bup" fanbois, to come up with a better logic, thinking that giving people an easy target to shoot for, they may be encouraged to help out. The plan is not working :-<. -------------------------------------------------- [Cooking] * jl/submodule-add-by-name (2012-09-30) 2 commits - submodule add: Fail when .git/modules/<name> already exists unless forced - Teach "git submodule add" the --name option If you remove a submodule, in order to keep the repository so that "git checkout" to an older commit in the superproject history can resurrect the submodule, the real repository will stay in $GIT_DIR of the superproject. A later "git submodule add $path" to add a different submodule at the same path will fail. Diagnose this case a bit better, and if the user really wants to add an unrelated submodule at the same path, give the "--name" option to give it a place in $GIT_DIR of the superproject that does not conflict with the original submodule. Will merge to 'next'. * nd/grep-reflog (2012-09-29) 4 commits (merged to 'next' on 2012-10-01 at 57773a6) + revision: make --grep search in notes too if shown + log --grep-reflog: reject the option without -g + revision: add --grep-reflog to filter commits by reflog messages + grep: prepare for new header field filter Teach the commands from the "log" family the "--grep-reflog" option to limit output by string that appears in the reflog entry when the "--walk-reflogs" option is in effect. Will merge to 'master'. * lt/mailinfo-handle-attachment-more-sanely (2012-09-30) 1 commit (merged to 'next' on 2012-10-01 at 2a1cecc) + mailinfo: don't require "text" mime type for attachments A patch attached as application/octet-stream (e.g. not text/*) were mishandled, not correctly honoring Content-Transfer-Encoding (e.g. base64). Will merge to 'master'. * tu/gc-auto-quiet (2012-09-27) 1 commit (merged to 'next' on 2012-10-01 at ad8b91b) + silence git gc --auto --quiet output "gc --auto" notified the user that auto-packing has triggered even under the "--quiet" option. Will merge to 'master'. * jm/diff-context-config (2012-10-02) 2 commits (merged to 'next' on 2012-10-02 at e57700a) + t4055: avoid use of sed 'a' command (merged to 'next' on 2012-10-01 at 509a558) + diff: diff.context configuration gives default to -U Teaches a new configuration variable to "git diff" Porcelain and its friends. Will defer to the next cycle. * mh/ceiling (2012-09-29) 9 commits - t1504: stop resolving symlinks in GIT_CEILING_DIRECTORIES - longest_ancestor_length(): resolve symlinks before comparing paths - longest_ancestor_length(): use string_list_longest_prefix() - longest_ancestor_length(): always add a slash to the end of prefixes - longest_ancestor_length(): explicitly filter list before loop - longest_ancestor_length(): use string_list_split() - Introduce new function real_path_if_valid() - real_path_internal(): add comment explaining use of cwd - Introduce new static function real_path_internal() Elements of GIT_CEILING_DIRECTORIES list may not match the real pathname we obtain from getcwd(), leading the GIT_DIR discovery logic to escape the ceilings the user thought to have specified. The solution felt a bit unnecessarily convoluted to me. Expecting a reroll. * jl/submodule-rm (2012-09-29) 1 commit (merged to 'next' on 2012-10-01 at 4e5c4fc) + submodule: teach rm to remove submodules unless they contain a git directory "git rm submodule" cannot blindly remove a submodule directory as its working tree may have local changes, and worse yet, it may even have its repository embedded in it. Teach it some special cases where it is safe to remove a submodule, specifically, when there is no local changes in the submodule working tree, and its repository is not embedded in its working tree but is elsewhere and uses the gitfile mechanism to point at it. Will defer to the next cycle. * nd/wildmatch (2012-09-27) 5 commits - Support "**" in .gitignore and .gitattributes patterns using wildmatch() - Integrate wildmatch to git - compat/wildmatch: fix case-insensitive matching - compat/wildmatch: remove static variable force_lower_case - Import wildmatch from rsync Allows pathname patterns in .gitignore and .gitattributes files with double-asterisks "foo/**/bar" to match any number of directory hiearchies. It was pointed out that some symbols that do not have to be global are left global. I think this reroll fixed most of them. Will merge to 'next'. * nd/pretty-placeholder-with-color-option (2012-09-30) 9 commits - pretty: support %>> that steal trailing spaces - pretty: support truncating in %>, %< and %>< - pretty: support padding placeholders, %< %> and %>< - pretty: two phase conversion for non utf-8 commits - utf8.c: add utf8_strnwidth() with the ability to skip ansi sequences - utf8.c: move display_mode_esc_sequence_len() for use by other functions - pretty: support %C(auto[,N]) to turn on coloring on next placeholder(s) - pretty: split parsing %C into a separate function - pretty: share code between format_decoration and show_decorations * jk/no-more-pre-exec-callback (2012-06-05) 1 commit - pager: drop "wait for output to run less" hack (Originally merged to 'next' on 2012-07-23) Will defer to the next cycle. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-02 23:20 What's cooking in git.git (Oct 2012, #01; Tue, 2) Junio C Hamano @ 2012-10-03 15:23 ` Nguyen Thai Ngoc Duy 2012-10-03 18:17 ` Junio C Hamano 2012-10-04 8:17 ` David Michael Barr 2012-10-30 12:15 ` Florian Achleitner 2 siblings, 1 reply; 31+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-03 15:23 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Wed, Oct 3, 2012 at 6:20 AM, Junio C Hamano <gitster@pobox.com> wrote: > * nd/wildmatch (2012-09-27) 5 commits > - Support "**" in .gitignore and .gitattributes patterns using wildmatch() > - Integrate wildmatch to git > - compat/wildmatch: fix case-insensitive matching > - compat/wildmatch: remove static variable force_lower_case > - Import wildmatch from rsync > > Allows pathname patterns in .gitignore and .gitattributes files > with double-asterisks "foo/**/bar" to match any number of directory > hiearchies. > > It was pointed out that some symbols that do not have to be global > are left global. I think this reroll fixed most of them. > > Will merge to 'next'. Just a bit of finding lately, in case you want to postpone the merge. There's an interesting case: "**foo". According to our rules, that pattern does not contain slashes therefore is basename match. But some might find that confusing because "**" can match slashes, as opposed to ordinary wildcards which cannot. So we could either go with our rules and consider "**" just like "*" in this case (do we need document clarification?), or redefine it that the presence of "**" implies FNM_PATHNAME. I think the latter makes more sense. When users put "**" they expect to match some slashes. But that may call for a refactoring in path_matches() in attr.c. Putting strstr(pattern, "**") in that matching function may increase overhead unnecessarily. The third option is just die() and let users decide either "*foo", "**/foo" or "/**foo", never "**foo". -- Duy ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-03 15:23 ` Nguyen Thai Ngoc Duy @ 2012-10-03 18:17 ` Junio C Hamano 2012-10-04 1:56 ` Nguyen Thai Ngoc Duy 2012-10-04 9:34 ` What's cooking in git.git (Oct 2012, #01; Tue, 2) Michael Haggerty 0 siblings, 2 replies; 31+ messages in thread From: Junio C Hamano @ 2012-10-03 18:17 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: > There's an interesting case: "**foo". According to our rules, that > pattern does not contain slashes therefore is basename match. But some > might find that confusing because "**" can match slashes,... By "our rules", if you mean "if a pattern has slash, it is anchored", that obviously need to be updated with this series, if "**" is meant to match multiple hierarchies. > I think the latter makes more sense. When users put "**" they expect > to match some slashes. But that may call for a refactoring in > path_matches() in attr.c. Putting strstr(pattern, "**") in that > matching function may increase overhead unnecessarily. > > The third option is just die() and let users decide either "*foo", > "**/foo" or "/**foo", never "**foo". For the double-star at the beginning, you should just turn it into "**/" if it is not followed by a slash internally, I think. What is the semantics of ** in the first place? Is it described to a reasonable level of detail in the documentation updates? For example does "**foo" match "afoo", "a/b/foo", "a/bfoo", "a/foo/b", "a/bfoo/c"? Does "x**y" match "xy", "xay", "xa/by", "x/a/y"? I am guessing that the only sensible definition is that "**" requires anything that comes before it (if exists) is at a proper hierarchy boundary, and anything matches it is also at a proper hierarchy boundary, so "x**y" matches "x/a/y" and not "xy", "xay", nor "xa/by" in the above example. If "x**y" can match "xy" or "xay" (or "**foo" can match "afoo"), it would be unreasonable to say it implies the pattern is anchored at any level, no? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-03 18:17 ` Junio C Hamano @ 2012-10-04 1:56 ` Nguyen Thai Ngoc Duy 2012-10-04 6:01 ` Junio C Hamano 2012-10-04 9:34 ` What's cooking in git.git (Oct 2012, #01; Tue, 2) Michael Haggerty 1 sibling, 1 reply; 31+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-04 1:56 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Thu, Oct 4, 2012 at 1:17 AM, Junio C Hamano <gitster@pobox.com> wrote: > For the double-star at the beginning, you should just turn it into "**/" > if it is not followed by a slash internally, I think. > > What is the semantics of ** in the first place? Is it described to > a reasonable level of detail in the documentation updates? For > example does "**foo" match "afoo", "a/b/foo", "a/bfoo", "a/foo/b", > "a/bfoo/c"? Does "x**y" match "xy", "xay", "xa/by", "x/a/y"? It's basically what rsync describes: use ’**’ to match anything, including slashes. Reading rsync's man page again, I notice I missed two other rules related to **: - If the pattern contains a / (not counting a trailing /) or a "**", then it is matched against the full pathname, including any leading directories. If the pattern doesn't contain a / or a "**", then it is matched only against the final component of the filename. (Remember that the algorithm is applied recursively so "full filename" can actually be any portion of a path from the starting directory on down.) - A trailing "dir_name/***" will match both the directory (as if "dir_name/" had been specified) and everything in the directory (as if "dir_name/**" had been specified). This behavior was added in version 2.6.7. From what you wrote, I think we'll go with the first rule. The second rule looks irrelevant to what git's doing. > I am guessing that the only sensible definition is that "**" > requires anything that comes before it (if exists) is at a proper > hierarchy boundary, and anything matches it is also at a proper > hierarchy boundary, so "x**y" matches "x/a/y" and "x/y" too? (As opposed to "x/**/y" which does not) > and not "xy", "xay", > nor "xa/by" in the above example. If "x**y" can match "xy" or "xay" > (or "**foo" can match "afoo"), it would be unreasonable to say it > implies the pattern is anchored at any level, no? Yeah. That makes things easier to reason, though not exactly what we're having. -- Duy ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-04 1:56 ` Nguyen Thai Ngoc Duy @ 2012-10-04 6:01 ` Junio C Hamano 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy 0 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2012-10-04 6:01 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: git Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: >> I am guessing that the only sensible definition is that "**" >> requires anything that comes before it (if exists) is at a proper >> hierarchy boundary, and anything matches it is also at a proper >> hierarchy boundary, so "x**y" matches "x/a/y" > > and "x/y" too? (As opposed to "x/**/y" which does not) Yeah, x**y would match x/y under that "sensible" semantics. >> and not "xy", "xay", >> nor "xa/by" in the above example. If "x**y" can match "xy" or "xay" >> (or "**foo" can match "afoo"), it would be unreasonable to say it >> implies the pattern is anchored at any level, no? > > Yeah. That makes things easier to reason, though not exactly what we're having. It sounds like that "x**y" with the code you imported would match "xy" and "xa/b/cy", and I do not think of a concise and good way to describe what it does to the end users. "matches anything including '/'" is not a useful description for the purpose of allowing the user to intuitively understand why "x**y" is anchored at the level (or is not anchored and can appear anywhere). Perhaps the wildmatch code may not be what we want X-<. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 0/6] wildmatch part 2 2012-10-04 6:01 ` Junio C Hamano @ 2012-10-04 7:39 ` Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 1/6] attr: remove the union in struct match_attr Nguyễn Thái Ngọc Duy ` (6 more replies) 0 siblings, 7 replies; 31+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy On Thu, Oct 4, 2012 at 1:01 PM, Junio C Hamano <gitster@pobox.com> wrote: > Perhaps the wildmatch code may not be what we want X-<. When I imported wildmatch I was hoping to make minimum changes to it. But wildmatch is probably the only practical way to support "**" even if we later need to change it the way we want. Other options are base our work on top of compat/fnmatch.c, which is an #ifdef spaghetti mess, or write a new fnmatch()-compatible function. Both unattractive to me. Anyway, this is on top of nd/wildmatch, which makes "ab**cd" match full pathname. attr patches port .gitignore optimizations over. In long term, we should probably have a shared matching implementation instead. I tried that road once and failed so I won't attempt again any time soon. If we drop wildmatch, I can split these attr patches out as a separate series. It's a good thing to do anyway. The last patch just reflects that current "**" is not exactly what we want. I'm not sure if I could look into wildmatch.c and change it. Anybody is welcome to step up, of course. Nguyễn Thái Ngọc Duy (6): attr: remove the union in struct match_attr attr: avoid strlen() on every match attr: avoid searching for basename on every match attr: more matching optimizations from .gitignore gitignore: do not do basename match with patterns that have '**' t3001: note about expected "**" behavior Documentation/gitignore.txt | 10 ++-- attr.c | 101 +++++++++++++++++++++++++++---------- dir.c | 6 +-- dir.h | 2 + t/t0003-attributes.sh | 16 ++++++ t/t3001-ls-files-others-exclude.sh | 18 +++++++ 6 files changed, 118 insertions(+), 35 deletions(-) -- 1.7.12.1.405.gb727dc9 ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 1/6] attr: remove the union in struct match_attr 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 ` Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 2/6] attr: avoid strlen() on every match Nguyễn Thái Ngọc Duy ` (5 subsequent siblings) 6 siblings, 0 replies; 31+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy We're going to add more attributes to u.pattern so it'll become bigger in size than a pointer. There's no point in sharing the same room with u.attr. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- attr.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/attr.c b/attr.c index 15ebaa1..48df800 100644 --- a/attr.c +++ b/attr.c @@ -119,10 +119,10 @@ struct attr_state { /* * One rule, as from a .gitattributes file. * - * If is_macro is true, then u.attr is a pointer to the git_attr being + * If is_macro is true, then attr is a pointer to the git_attr being * defined. * - * If is_macro is false, then u.pattern points at the filename pattern + * If is_macro is false, then pattern points at the filename pattern * to which the rule applies. (The memory pointed to is part of the * memory block allocated for the match_attr instance.) * @@ -131,10 +131,8 @@ struct attr_state { * listed as they appear in the file (macros unexpanded). */ struct match_attr { - union { - char *pattern; - struct git_attr *attr; - } u; + const char *pattern; + struct git_attr *attr; char is_macro; unsigned num_attr; struct attr_state state[FLEX_ARRAY]; @@ -240,11 +238,12 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, sizeof(struct attr_state) * num_attr + (is_macro ? 0 : namelen + 1)); if (is_macro) - res->u.attr = git_attr_internal(name, namelen); + res->attr = git_attr_internal(name, namelen); else { - res->u.pattern = (char *)&(res->state[num_attr]); - memcpy(res->u.pattern, name, namelen); - res->u.pattern[namelen] = 0; + char *p = (char *)&(res->state[num_attr]); + memcpy(p, name, namelen); + p[namelen] = 0; + res->pattern = p; } res->is_macro = is_macro; res->num_attr = num_attr; @@ -682,7 +681,7 @@ static int fill_one(const char *what, struct match_attr *a, int rem) if (*n == ATTR__UNKNOWN) { debug_set(what, - a->is_macro ? a->u.attr->name : a->u.pattern, + a->is_macro ? a->attr->name : a->pattern, attr, v); *n = v; rem--; @@ -702,7 +701,7 @@ static int fill(const char *path, int pathlen, struct attr_stack *stk, int rem) if (a->is_macro) continue; if (path_matches(path, pathlen, - a->u.pattern, base, strlen(base))) + a->pattern, base, strlen(base))) rem = fill_one("fill", a, rem); } return rem; @@ -722,7 +721,7 @@ static int macroexpand_one(int attr_nr, int rem) struct match_attr *ma = stk->attrs[i]; if (!ma->is_macro) continue; - if (ma->u.attr->attr_nr == attr_nr) + if (ma->attr->attr_nr == attr_nr) a = ma; } -- 1.7.12.1.405.gb727dc9 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 2/6] attr: avoid strlen() on every match 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 1/6] attr: remove the union in struct match_attr Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 ` Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 3/6] attr: avoid searching for basename " Nguyễn Thái Ngọc Duy ` (4 subsequent siblings) 6 siblings, 0 replies; 31+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- attr.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/attr.c b/attr.c index 48df800..66b96d9 100644 --- a/attr.c +++ b/attr.c @@ -277,6 +277,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, static struct attr_stack { struct attr_stack *prev; char *origin; + size_t originlen; unsigned num_matches; unsigned alloc; struct match_attr **attrs; @@ -532,6 +533,7 @@ static void bootstrap_attr_stack(void) if (!is_bare_repository() || direction == GIT_ATTR_INDEX) { elem = read_attr(GITATTRIBUTES_FILE, 1); elem->origin = xstrdup(""); + elem->originlen = 0; elem->prev = attr_stack; attr_stack = elem; debug_push(elem); @@ -625,7 +627,7 @@ static void prepare_attr_stack(const char *path) strbuf_addstr(&pathbuf, GITATTRIBUTES_FILE); elem = read_attr(pathbuf.buf, 0); strbuf_setlen(&pathbuf, cp - path); - elem->origin = strbuf_detach(&pathbuf, NULL); + elem->origin = strbuf_detach(&pathbuf, &elem->originlen); elem->prev = attr_stack; attr_stack = elem; debug_push(elem); @@ -701,7 +703,7 @@ static int fill(const char *path, int pathlen, struct attr_stack *stk, int rem) if (a->is_macro) continue; if (path_matches(path, pathlen, - a->pattern, base, strlen(base))) + a->pattern, base, stk->originlen)) rem = fill_one("fill", a, rem); } return rem; -- 1.7.12.1.405.gb727dc9 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 3/6] attr: avoid searching for basename on every match 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 1/6] attr: remove the union in struct match_attr Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 2/6] attr: avoid strlen() on every match Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 ` Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 4/6] attr: more matching optimizations from .gitignore Nguyễn Thái Ngọc Duy ` (3 subsequent siblings) 6 siblings, 0 replies; 31+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- attr.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/attr.c b/attr.c index 66b96d9..eb576ac 100644 --- a/attr.c +++ b/attr.c @@ -644,13 +644,11 @@ static void prepare_attr_stack(const char *path) } static int path_matches(const char *pathname, int pathlen, + const char *basename, const char *pattern, const char *base, int baselen) { if (!strchr(pattern, '/')) { - /* match basename */ - const char *basename = strrchr(pathname, '/'); - basename = basename ? basename + 1 : pathname; return (fnmatch_icase(pattern, basename, 0) == 0); } /* @@ -693,7 +691,8 @@ static int fill_one(const char *what, struct match_attr *a, int rem) return rem; } -static int fill(const char *path, int pathlen, struct attr_stack *stk, int rem) +static int fill(const char *path, int pathlen, const char *basename, + struct attr_stack *stk, int rem) { int i; const char *base = stk->origin ? stk->origin : ""; @@ -702,7 +701,7 @@ static int fill(const char *path, int pathlen, struct attr_stack *stk, int rem) struct match_attr *a = stk->attrs[i]; if (a->is_macro) continue; - if (path_matches(path, pathlen, + if (path_matches(path, pathlen, basename, a->pattern, base, stk->originlen)) rem = fill_one("fill", a, rem); } @@ -741,15 +740,19 @@ static void collect_all_attrs(const char *path) { struct attr_stack *stk; int i, pathlen, rem; + const char *basename; prepare_attr_stack(path); for (i = 0; i < attr_nr; i++) check_all_attr[i].value = ATTR__UNKNOWN; + basename = strrchr(path, '/'); + basename = basename ? basename + 1 : path; + pathlen = strlen(path); rem = attr_nr; for (stk = attr_stack; 0 < rem && stk; stk = stk->prev) - rem = fill(path, pathlen, stk, rem); + rem = fill(path, pathlen, basename, stk, rem); } int git_check_attr(const char *path, int num, struct git_attr_check *check) -- 1.7.12.1.405.gb727dc9 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 4/6] attr: more matching optimizations from .gitignore 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy ` (2 preceding siblings ...) 2012-10-04 7:39 ` [PATCH 3/6] attr: avoid searching for basename " Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 ` Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' Nguyễn Thái Ngọc Duy ` (2 subsequent siblings) 6 siblings, 0 replies; 31+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy .gitattributes and .gitignore share the same pattern syntax but has separate matching implementation. Over the years, ignore's implementation accumulates more optimizations while attr's stays the same. This patch adds those optimizations to .gitattributes. Basically it tries to avoid fnmatch/wildmatch in favor of strncmp as much as possible. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- attr.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++---------- dir.c | 4 ++-- dir.h | 2 ++ 3 files changed, 57 insertions(+), 12 deletions(-) diff --git a/attr.c b/attr.c index eb576ac..3fde9fa 100644 --- a/attr.c +++ b/attr.c @@ -116,6 +116,13 @@ struct attr_state { const char *setto; }; +struct pattern { + const char *pattern; + int patternlen; + int nowildcardlen; + int flags; /* EXC_FLAG_* */ +}; + /* * One rule, as from a .gitattributes file. * @@ -131,7 +138,7 @@ struct attr_state { * listed as they appear in the file (macros unexpanded). */ struct match_attr { - const char *pattern; + struct pattern pat; struct git_attr *attr; char is_macro; unsigned num_attr; @@ -243,7 +250,13 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, char *p = (char *)&(res->state[num_attr]); memcpy(p, name, namelen); p[namelen] = 0; - res->pattern = p; + res->pat.pattern = p; + res->pat.patternlen = strlen(p); + res->pat.nowildcardlen = simple_length(p); + if (!strchr(p, '/')) + res->pat.flags |= EXC_FLAG_NODIR; + if (*p == '*' && no_wildcard(p+1)) + res->pat.flags |= EXC_FLAG_ENDSWITH; } res->is_macro = is_macro; res->num_attr = num_attr; @@ -645,26 +658,56 @@ static void prepare_attr_stack(const char *path) static int path_matches(const char *pathname, int pathlen, const char *basename, - const char *pattern, + const struct pattern *pat, const char *base, int baselen) { - if (!strchr(pattern, '/')) { + const char *pattern = pat->pattern; + int prefix = pat->nowildcardlen; + const char *name; + int namelen; + + if (pat->flags & EXC_FLAG_NODIR) { + if (prefix == pat->patternlen && + !strcmp_icase(pattern, basename)) + return 1; + + if (pat->flags & EXC_FLAG_ENDSWITH && + pat->patternlen - 1 <= pathlen && + !strcmp_icase(pattern + 1, pathname + + pathlen - pat->patternlen + 1)) + return 1; + return (fnmatch_icase(pattern, basename, 0) == 0); } /* * match with FNM_PATHNAME; the pattern has base implicitly * in front of it. */ - if (*pattern == '/') + if (*pattern == '/') { pattern++; + prefix--; + } + + /* + * note: unlike excluded_from_list, baselen here does not + * contain the trailing slash + */ + if (pathlen < baselen || (baselen && pathname[baselen] != '/') || strncmp(pathname, base, baselen)) return 0; - if (baselen != 0) - baselen++; - return (ignore_case && iwildmatch(pattern, pathname + baselen)) || - (!ignore_case && wildmatch(pattern, pathname + baselen)); + + namelen = baselen ? pathlen - baselen - 1 : pathlen; + name = pathname + pathlen - namelen; + + /* if the non-wildcard part is longer than the remaining + pathname, surely it cannot match */ + if (!namelen || prefix > namelen) + return 0; + + return (ignore_case && iwildmatch(pattern, name)) || + (!ignore_case && wildmatch(pattern, name)); } static int macroexpand_one(int attr_nr, int rem); @@ -702,7 +745,7 @@ static int fill(const char *path, int pathlen, const char *basename, if (a->is_macro) continue; if (path_matches(path, pathlen, basename, - a->pattern, base, stk->originlen)) + &a->pat, base, stk->originlen)) rem = fill_one("fill", a, rem); } return rem; diff --git a/dir.c b/dir.c index 92cda82..fd49336 100644 --- a/dir.c +++ b/dir.c @@ -292,7 +292,7 @@ int match_pathspec_depth(const struct pathspec *ps, /* * Return the length of the "simple" part of a path match limiter. */ -static int simple_length(const char *match) +int simple_length(const char *match) { int len = -1; @@ -304,7 +304,7 @@ static int simple_length(const char *match) } } -static int no_wildcard(const char *string) +int no_wildcard(const char *string) { return string[simple_length(string)] == '\0'; } diff --git a/dir.h b/dir.h index 893465a..7ea8678 100644 --- a/dir.h +++ b/dir.h @@ -101,6 +101,8 @@ extern void add_exclude(const char *string, const char *base, int baselen, struct exclude_list *which); extern void free_excludes(struct exclude_list *el); extern int file_exists(const char *); +extern int simple_length(const char *match); +extern int no_wildcard(const char *string); extern int is_inside_dir(const char *dir); extern int dir_inside_of(const char *subdir, const char *dir); -- 1.7.12.1.405.gb727dc9 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy ` (3 preceding siblings ...) 2012-10-04 7:39 ` [PATCH 4/6] attr: more matching optimizations from .gitignore Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 ` Nguyễn Thái Ngọc Duy 2012-10-04 17:59 ` Junio C Hamano 2012-10-05 7:01 ` Johannes Sixt 2012-10-04 7:39 ` [PATCH 6/6] t3001: note about expected "**" behavior Nguyễn Thái Ngọc Duy 2012-10-04 17:43 ` [PATCH 0/6] wildmatch part 2 Junio C Hamano 6 siblings, 2 replies; 31+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy "**" can match slashes, not like "*". "ab**ef" should be able to match "ab/cd/ef", or "ab/c/d/ef" and so on. Turn off the EXC_FLAG_NODIR in this case otherwise the pattern is only checked against the base name. This behavior is in sync with rsync. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- Documentation/gitignore.txt | 10 +++++----- attr.c | 2 +- dir.c | 2 +- t/t0003-attributes.sh | 16 ++++++++++++++++ t/t3001-ls-files-others-exclude.sh | 10 ++++++++++ 5 files changed, 33 insertions(+), 7 deletions(-) diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt index eb81d31..4dfe8bd 100644 --- a/Documentation/gitignore.txt +++ b/Documentation/gitignore.txt @@ -81,11 +81,11 @@ PATTERN FORMAT regular file or a symbolic link `foo` (this is consistent with the way how pathspec works in general in git). - - If the pattern does not contain a slash '/', git treats it as - a shell glob pattern and checks for a match against the - pathname relative to the location of the `.gitignore` file - (relative to the toplevel of the work tree if not from a - `.gitignore` file). + - If the pattern does not contain a slash '/' nor '**', git + treats it as a shell glob pattern and checks for a match + against the pathname relative to the location of the + `.gitignore` file (relative to the toplevel of the work tree + if not from a `.gitignore` file). - Otherwise, git treats the pattern as a shell glob suitable for consumption by fnmatch(3) with the FNM_PATHNAME flag: diff --git a/attr.c b/attr.c index 3fde9fa..634b39c 100644 --- a/attr.c +++ b/attr.c @@ -253,7 +253,7 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, res->pat.pattern = p; res->pat.patternlen = strlen(p); res->pat.nowildcardlen = simple_length(p); - if (!strchr(p, '/')) + if (!strchr(p, '/') && !strstr(p, "**")) res->pat.flags |= EXC_FLAG_NODIR; if (*p == '*' && no_wildcard(p+1)) res->pat.flags |= EXC_FLAG_ENDSWITH; diff --git a/dir.c b/dir.c index fd49336..6a5de98 100644 --- a/dir.c +++ b/dir.c @@ -340,7 +340,7 @@ void add_exclude(const char *string, const char *base, x->base = base; x->baselen = baselen; x->flags = flags; - if (!strchr(string, '/')) + if (!strchr(string, '/') && !strstr(string, "**")) x->flags |= EXC_FLAG_NODIR; x->nowildcardlen = simple_length(string); if (*string == '*' && no_wildcard(string+1)) diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh index 6c3c554..9b534a0 100755 --- a/t/t0003-attributes.sh +++ b/t/t0003-attributes.sh @@ -249,4 +249,20 @@ EOF test_line_count = 0 err ' +test_expect_success '"**" with no slashes test' ' + echo "a**f foo=bar" >.gitattributes && + cat <<\EOF >expect && +f: foo: unspecified +a/f: foo: bar +a/b/f: foo: bar +a/b/c/f: foo: bar +EOF + git check-attr foo -- "f" >actual 2>err && + git check-attr foo -- "a/f" >>actual 2>>err && + git check-attr foo -- "a/b/f" >>actual 2>>err && + git check-attr foo -- "a/b/c/f" >>actual 2>>err && + test_cmp expect actual && + test_line_count = 0 err +' + test_done diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh index 67c8bcf..6a5a4ab 100755 --- a/t/t3001-ls-files-others-exclude.sh +++ b/t/t3001-ls-files-others-exclude.sh @@ -225,4 +225,14 @@ EOF test_cmp expect actual ' + +test_expect_success 'ls-files with "**" patterns and no slashes' ' + cat <<\EOF >expect && +one/a.1 +one/two/a.1 +EOF + git ls-files -o -i --exclude "one**a.1" >actual + test_cmp expect actual +' + test_done -- 1.7.12.1.405.gb727dc9 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' 2012-10-04 7:39 ` [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' Nguyễn Thái Ngọc Duy @ 2012-10-04 17:59 ` Junio C Hamano 2012-10-05 7:01 ` Johannes Sixt 1 sibling, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2012-10-04 17:59 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > - if (!strchr(p, '/')) > + if (!strchr(p, '/') && !strstr(p, "**")) Doesn't wildmatch allow these to be quoted, similar to the way usual glob works, e.g. $ >ff $ >\?f $ echo ?? ?f ff $ echo \?f ?f Even If wildmatch out-of-the-box doesn't, I would assume that we would want to fix it so that it does. And if that is the case, we would want to be careful about "two/asterisks\**in path" to avoid triggering this logic, no? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' 2012-10-04 7:39 ` [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' Nguyễn Thái Ngọc Duy 2012-10-04 17:59 ` Junio C Hamano @ 2012-10-05 7:01 ` Johannes Sixt 2012-10-05 11:23 ` Nguyen Thai Ngoc Duy 1 sibling, 1 reply; 31+ messages in thread From: Johannes Sixt @ 2012-10-05 7:01 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano Am 10/4/2012 9:39, schrieb Nguyễn Thái Ngọc Duy: > - - If the pattern does not contain a slash '/', git treats it as > - a shell glob pattern and checks for a match against the > - pathname relative to the location of the `.gitignore` file > - (relative to the toplevel of the work tree if not from a > - `.gitignore` file). > + - If the pattern does not contain a slash '/' nor '**', git > + treats it as a shell glob pattern and checks for a match > + against the pathname relative to the location of the > + `.gitignore` file (relative to the toplevel of the work tree > + if not from a `.gitignore` file). > +test_expect_success '"**" with no slashes test' ' > + echo "a**f foo=bar" >.gitattributes && > + cat <<\EOF >expect && > +f: foo: unspecified > +a/f: foo: bar > +a/b/f: foo: bar > +a/b/c/f: foo: bar > +EOF Should the above .gitattributes match nested paths, such as b/a/c/f? I think it should, because the user can easily say "/a**f" that nested paths should not be matched. But if it does not match, as your documentation update implies, which options does the user have to match nested paths? Only to add more patterns for each nested directory, such as "b/a**f". -- Hannes ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' 2012-10-05 7:01 ` Johannes Sixt @ 2012-10-05 11:23 ` Nguyen Thai Ngoc Duy 0 siblings, 0 replies; 31+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-05 11:23 UTC (permalink / raw) To: Johannes Sixt; +Cc: git, Junio C Hamano On Fri, Oct 5, 2012 at 2:01 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: > Am 10/4/2012 9:39, schrieb Nguyễn Thái Ngọc Duy: >> - - If the pattern does not contain a slash '/', git treats it as >> - a shell glob pattern and checks for a match against the >> - pathname relative to the location of the `.gitignore` file >> - (relative to the toplevel of the work tree if not from a >> - `.gitignore` file). >> + - If the pattern does not contain a slash '/' nor '**', git >> + treats it as a shell glob pattern and checks for a match >> + against the pathname relative to the location of the >> + `.gitignore` file (relative to the toplevel of the work tree >> + if not from a `.gitignore` file). I think in the latest round, we forbid this case (i.e. a/**, **/b and a/**/b are ok, but a**b is not), exactly because it's hard to define how it should do. Thanks for another example. >> +test_expect_success '"**" with no slashes test' ' >> + echo "a**f foo=bar" >.gitattributes && >> + cat <<\EOF >expect && >> +f: foo: unspecified >> +a/f: foo: bar >> +a/b/f: foo: bar >> +a/b/c/f: foo: bar >> +EOF > > Should the above .gitattributes match nested paths, such as b/a/c/f? > > I think it should, because the user can easily say "/a**f" that nested > paths should not be matched. The user can also say **/a/**f to match b/a/c/f. -- Duy ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 6/6] t3001: note about expected "**" behavior 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy ` (4 preceding siblings ...) 2012-10-04 7:39 ` [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 ` Nguyễn Thái Ngọc Duy 2012-10-04 18:04 ` Junio C Hamano 2012-10-04 17:43 ` [PATCH 0/6] wildmatch part 2 Junio C Hamano 6 siblings, 1 reply; 31+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2012-10-04 7:39 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy "**" currently matches any characters including slashes. It's probably too powerful. A more sensible definition may be match any characters that the but the whole match must be wrapped by slashes. So "**" can match none, "/", "/aaa/", "/aa/bb/" and so on but not "aa/bb". Note it in the test suite. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- t/t3001-ls-files-others-exclude.sh | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh index 6a5a4ab..99b5f5c 100755 --- a/t/t3001-ls-files-others-exclude.sh +++ b/t/t3001-ls-files-others-exclude.sh @@ -235,4 +235,12 @@ EOF test_cmp expect actual ' +# We might want ** to match at directory boundary, e.g. a**b matches +# a/b, a/x/b, a/x/x/b... but not ax/xb. +test_expect_failure 'ls-files with "**" patterns and no slashes' ' + : >expect && + git ls-files -o -i --exclude "o**a.1" >actual + test_cmp expect actual +' + test_done -- 1.7.12.1.405.gb727dc9 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 6/6] t3001: note about expected "**" behavior 2012-10-04 7:39 ` [PATCH 6/6] t3001: note about expected "**" behavior Nguyễn Thái Ngọc Duy @ 2012-10-04 18:04 ` Junio C Hamano 0 siblings, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2012-10-04 18:04 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > "**" currently matches any characters including slashes. It's probably > too powerful. A more sensible definition may be match any characters > that the but the whole match must be wrapped by slashes. So "**" can > match none, "/", "/aaa/", "/aa/bb/" and so on but not "aa/bb". I do not think this is something we want to retroactively change after releasing it to the public, especially when we _know_ it is a problem from the get-go (unlike the case we did not notice it had a problem, release it to the public and then realize it and have to scramble to devise a fix to bring more sanity in a backward compatible way). We must either declare that "**" that matches any characters is the sane semantics and promise we will never change it, or have "**" that matches \(/[^/]*\)*/ (sorry for a line noise^W^W^Wregexp) from the beginning. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 0/6] wildmatch part 2 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy ` (5 preceding siblings ...) 2012-10-04 7:39 ` [PATCH 6/6] t3001: note about expected "**" behavior Nguyễn Thái Ngọc Duy @ 2012-10-04 17:43 ` Junio C Hamano 6 siblings, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2012-10-04 17:43 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: git Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > On Thu, Oct 4, 2012 at 1:01 PM, Junio C Hamano <gitster@pobox.com> wrote: >> Perhaps the wildmatch code may not be what we want X-<. > > When I imported wildmatch I was hoping to make minimum changes to it. > But wildmatch is probably the only practical way to support "**" even > if we later need to change it the way we want. Other options are base > our work on top of compat/fnmatch.c, which is an #ifdef spaghetti > mess, or write a new fnmatch()-compatible function. Both unattractive > to me. > > Anyway, this is on top of nd/wildmatch, which makes "ab**cd" match > full pathname. I do not think we are in a hurry to push "**" support in before we know what semantics we want to get out of it. Pushing half-baked "this is good enough at least to me for now" topics before they are ready will cost the users in the longer term. On the other hand, the three patches (2/3/4) in this series look like a good improvement regardless of what kind of matching engine we use. I would have preferred to see them _before_ nd/wildmatch. I do not agree with the reasoning behind [1/6] that changes union { char *pattern; strict git_attr *attr; } u; char is_macro; to char *pattern; strict git_attr *attr; char is_macro; by the way. The union is much less about space saving but is more about the nature of the usage; we use pattern or attr but not both at the same time. Even if the pattern becomes richer with later patches, that does not change the fundamental premise that depending on the value of "is_macro", either "attr" is used or "pattern" is used but they won't be in effect at the same time. The evolution of this should go more like this, I think: union { struct { const char *string; int pattern_length; int prefix_literal_length; int flags; } pattern; struct git_attr *attr; } u; char is_macro; Thanks. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-03 18:17 ` Junio C Hamano 2012-10-04 1:56 ` Nguyen Thai Ngoc Duy @ 2012-10-04 9:34 ` Michael Haggerty 2012-10-04 11:46 ` Nguyen Thai Ngoc Duy 2012-10-04 16:39 ` Junio C Hamano 1 sibling, 2 replies; 31+ messages in thread From: Michael Haggerty @ 2012-10-04 9:34 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nguyen Thai Ngoc Duy, git On 10/03/2012 08:17 PM, Junio C Hamano wrote: > Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes: > >> There's an interesting case: "**foo". According to our rules, that >> pattern does not contain slashes therefore is basename match. But some >> might find that confusing because "**" can match slashes,... > > By "our rules", if you mean "if a pattern has slash, it is anchored", > that obviously need to be updated with this series, if "**" is meant > to match multiple hierarchies. >> I think the latter makes more sense. When users put "**" they expect >> to match some slashes. But that may call for a refactoring in >> path_matches() in attr.c. Putting strstr(pattern, "**") in that >> matching function may increase overhead unnecessarily. >> >> The third option is just die() and let users decide either "*foo", >> "**/foo" or "/**foo", never "**foo". > > For the double-star at the beginning, you should just turn it into "**/" > if it is not followed by a slash internally, I think. > > What is the semantics of ** in the first place? Is it described to > a reasonable level of detail in the documentation updates? For > example does "**foo" match "afoo", "a/b/foo", "a/bfoo", "a/foo/b", > "a/bfoo/c"? Does "x**y" match "xy", "xay", "xa/by", "x/a/y"? > > I am guessing that the only sensible definition is that "**" > requires anything that comes before it (if exists) is at a proper > hierarchy boundary, and anything matches it is also at a proper > hierarchy boundary, so "x**y" matches "x/a/y" and not "xy", "xay", > nor "xa/by" in the above example. If "x**y" can match "xy" or "xay" > (or "**foo" can match "afoo"), it would be unreasonable to say it > implies the pattern is anchored at any level, no? Given that there is no obvious interpretation for what a construct like "x**y" would mean, and many plausible guesses (most of which sound rather useless), I suggest that we forbid it. This will make the feature easier to explain and make .gitignore files that use it easier to understand. I think that 98% of the usefulness of "**" would be in constructs where it replaces a proper part of the pathname, like "**/SOMETHING" or "SOMETHING/**/SOMETHING"; in other words, where its use matches the regexp "(^|/)\*\*/". In these constructs the only ambiguity is whether "**/" matches regexp "([^/]+/)+" or "([^/]+/)*" (e.g., whether "foo/**/bar" matches "foo/bar"). I personally prefer the second, because the first behavior can be had using the second interpretation by using "SOMETHING/*/**/SOMETHING", whereas the second behavior cannot be implemented in terms of the first in a single line of the .gitignore file. Optionally, one might also like to support "SOMETHING/**" or "**" alone in the obvious ways. As for the implementation, it is quite easy to textually convert a glob pattern, including "**" parts, into a regexp. I happen to have written some Python code that does this for another project (see below). An obvious optimization would be to read any literal parts of the path off the beginning of the glob pattern and only use regexps for the tail part. Would a regexp-based implementation be too slow? Michael _filename_char_pattern = r'[^/]' _glob_patterns = [ ('?', _filename_char_pattern), ('/**', r'(/.+)?'), ('**/', r'(.+/)?'), ('*', _filename_char_pattern + r'*'), ] def glob_to_regexp(pattern): pattern = os.path.normpath(pattern) # remove trivial redundancies if pattern == '**': # This case has to be handled separately because it doesn't # involve a '/' character adjacent to the '**' pattern. (Such # slashes otherwise have to be considered part of the pattern # to handle the matching of zero path components.) return re.compile( r'^' + _filename_char_pattern + r'(.+' + _filename_char_pattern + r')?$' ) regexp = [r'^'] i = 0 while i < len(pattern): for (s, r) in _glob_patterns: if pattern.startswith(s, i): regexp.append(r) i += len(s) break else: # AFAIK it's a normal character. Escape it and add it to # pattern. regexp.append(re.escape(pattern[i])) i += 1 regexp.append(r'$') return re.compile(''.join(regexp)) -- Michael Haggerty mhagger@alum.mit.edu http://softwareswirl.blogspot.com/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-04 9:34 ` What's cooking in git.git (Oct 2012, #01; Tue, 2) Michael Haggerty @ 2012-10-04 11:46 ` Nguyen Thai Ngoc Duy 2012-10-04 15:17 ` Michael Haggerty 2012-10-04 16:39 ` Junio C Hamano 1 sibling, 1 reply; 31+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-04 11:46 UTC (permalink / raw) To: Michael Haggerty; +Cc: Junio C Hamano, git On Thu, Oct 4, 2012 at 4:34 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote: On Thu, Oct 4, 2012 at 4:34 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote: > Given that there is no obvious interpretation for what a construct like > "x**y" would mean, and many plausible guesses (most of which sound > rather useless), I suggest that we forbid it. This will make the > feature easier to explain and make .gitignore files that use it easier > to understand. Yep, sounds like a good short term plan. > As for the implementation, it is quite easy to textually convert a glob > pattern, including "**" parts, into a regexp. Or we could introduce regexp syntax as an alternative and let users choose (and pay associated price). Patterns starting with // are never matched (we don't normalize paths in .gitignore). Any patterns started with "//regex:" is followed by regex. Reject all other // patterns for future use. > _filename_char_pattern = r'[^/]' > _glob_patterns = [ > ('?', _filename_char_pattern), > ('/**', r'(/.+)?'), > ('**/', r'(.+/)?'), > ('*', _filename_char_pattern + r'*'), > ] I don't fully understand the rest (never been a big fan of python) but what about bracket expressions like [!abc] and [:alnum:]? -- Duy ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-04 11:46 ` Nguyen Thai Ngoc Duy @ 2012-10-04 15:17 ` Michael Haggerty 0 siblings, 0 replies; 31+ messages in thread From: Michael Haggerty @ 2012-10-04 15:17 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: Junio C Hamano, git On 10/04/2012 01:46 PM, Nguyen Thai Ngoc Duy wrote: > On Thu, Oct 4, 2012 at 4:34 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote: >> As for the implementation, it is quite easy to textually convert a glob >> pattern, including "**" parts, into a regexp. > > Or we could introduce regexp syntax as an alternative and let users > choose (and pay associated price). It seems like overkill to me. For filenames, globs are usually adequate. >> _filename_char_pattern = r'[^/]' >> _glob_patterns = [ >> ('?', _filename_char_pattern), >> ('/**', r'(/.+)?'), >> ('**/', r'(.+/)?'), >> ('*', _filename_char_pattern + r'*'), >> ] > > I don't fully understand the rest (never been a big fan of python) but > what about bracket expressions like [!abc] and [:alnum:]? You're right; I forgot that the code that I posted doesn't support brackets. Michael -- Michael Haggerty mhagger@alum.mit.edu http://softwareswirl.blogspot.com/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-04 9:34 ` What's cooking in git.git (Oct 2012, #01; Tue, 2) Michael Haggerty 2012-10-04 11:46 ` Nguyen Thai Ngoc Duy @ 2012-10-04 16:39 ` Junio C Hamano 2012-10-05 12:19 ` Andreas Schwab 2012-10-05 13:21 ` Nguyen Thai Ngoc Duy 1 sibling, 2 replies; 31+ messages in thread From: Junio C Hamano @ 2012-10-04 16:39 UTC (permalink / raw) To: Michael Haggerty; +Cc: Nguyen Thai Ngoc Duy, git Michael Haggerty <mhagger@alum.mit.edu> writes: > On 10/03/2012 08:17 PM, Junio C Hamano wrote: >> >> What is the semantics of ** in the first place? Is it described to >> a reasonable level of detail in the documentation updates? For >> example does "**foo" match "afoo", "a/b/foo", "a/bfoo", "a/foo/b", >> "a/bfoo/c"? Does "x**y" match "xy", "xay", "xa/by", "x/a/y"? >> >> I am guessing that the only sensible definition is that "**" >> requires anything that comes before it (if exists) is at a proper >> hierarchy boundary, and anything matches it is also at a proper >> hierarchy boundary, so "x**y" matches "x/a/y" and not "xy", "xay", >> nor "xa/by" in the above example. If "x**y" can match "xy" or "xay" >> (or "**foo" can match "afoo"), it would be unreasonable to say it >> implies the pattern is anchored at any level, no? > > Given that there is no obvious interpretation for what a construct like > "x**y" would mean, and many plausible guesses (most of which sound > rather useless), I suggest that we forbid it. This will make the > feature easier to explain and make .gitignore files that use it easier > to understand. > > I think that 98% of the usefulness of "**" would be in constructs where > it replaces a proper part of the pathname, like "**/SOMETHING" or > "SOMETHING/**/SOMETHING"... I think it is a good way to go in the longer term, if we all agree that "**" matching anything does not give us a useful semantics [*1*]. Is it something we can easily get by simple patch into the wildmatch code? I'd hate to see us parsing the input and validating it before passing it to the library, as we will surely botch the quoting or something while doing so. When we require "x/**/y", I think we still want it to match "x/y". Do people agree, or are there good reasons to require at least one level between x and y for such a pattern? Assuming that we do want to match "x/y" with "x/**/y", I suspect that "'**' matches anything including a slash" would not give us that semantics. Is it something we can easily fix in the wildmatch code? [Footnote] *1* The message you are responding to was written in a somewhat provocative way on purpose so that people who like the way rsync matches "**" can vocally object. I would like to see arguments from the both sides to see if it makes sense. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-04 16:39 ` Junio C Hamano @ 2012-10-05 12:19 ` Andreas Schwab 2012-10-05 12:30 ` Matthieu Moy 2012-10-05 13:21 ` Nguyen Thai Ngoc Duy 1 sibling, 1 reply; 31+ messages in thread From: Andreas Schwab @ 2012-10-05 12:19 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael Haggerty, Nguyen Thai Ngoc Duy, git Junio C Hamano <gitster@pobox.com> writes: > When we require "x/**/y", I think we still want it to match "x/y". FWIW, in bash (+extglob), ksh and zsh it doesn't. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-05 12:19 ` Andreas Schwab @ 2012-10-05 12:30 ` Matthieu Moy 2012-10-05 14:15 ` Andreas Schwab 0 siblings, 1 reply; 31+ messages in thread From: Matthieu Moy @ 2012-10-05 12:30 UTC (permalink / raw) To: Andreas Schwab Cc: Junio C Hamano, Michael Haggerty, Nguyen Thai Ngoc Duy, git Andreas Schwab <schwab@linux-m68k.org> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> When we require "x/**/y", I think we still want it to match "x/y". > > FWIW, in bash (+extglob), ksh and zsh it doesn't. You're right about bash, but I see the opposite for zsh and ksh: zsh$ echo x/**/y x/y x/z/y ksh$ echo x/**/y x/y x/z/y (didn't check the doc so see whether this was configurable, but I've set HOME=/ when launching the shell to disable my own configuration) -- Matthieu Moy http://www-verimag.imag.fr/~moy/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-05 12:30 ` Matthieu Moy @ 2012-10-05 14:15 ` Andreas Schwab 0 siblings, 0 replies; 31+ messages in thread From: Andreas Schwab @ 2012-10-05 14:15 UTC (permalink / raw) To: Matthieu Moy; +Cc: Junio C Hamano, Michael Haggerty, Nguyen Thai Ngoc Duy, git Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes: > Andreas Schwab <schwab@linux-m68k.org> writes: > >> Junio C Hamano <gitster@pobox.com> writes: >> >>> When we require "x/**/y", I think we still want it to match "x/y". >> >> FWIW, in bash (+extglob), ksh and zsh it doesn't. > > You're right about bash, but I see the opposite for zsh and ksh: > > zsh$ echo x/**/y > x/y x/z/y > > ksh$ echo x/**/y > x/y x/z/y Looks like this is different between filename expansion and case pattern matching (I only tested the latter). Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-04 16:39 ` Junio C Hamano 2012-10-05 12:19 ` Andreas Schwab @ 2012-10-05 13:21 ` Nguyen Thai Ngoc Duy 1 sibling, 0 replies; 31+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-10-05 13:21 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael Haggerty, git On Thu, Oct 04, 2012 at 09:39:02AM -0700, Junio C Hamano wrote: > Assuming that we do want to match "x/y" with "x/**/y", I suspect > that "'**' matches anything including a slash" would not give us > that semantics. Is it something we can easily fix in the wildmatch > code? Something like this may suffice. Lightly tested with "git add -n". Reading the code, I think we can even distinguish "match zero or more directories" and "match one or more directories" with "/**/" and maybe "/***/". Right now **, ***, ****... are the same. So are /**/, /***/, /****/... -- 8< -- diff --git a/wildmatch.c b/wildmatch.c index f153f8a..81eadc8 100644 --- a/wildmatch.c +++ b/wildmatch.c @@ -98,8 +98,12 @@ static int dowild(const uchar *p, const uchar *text, continue; case '*': if (*++p == '*') { + int slashstarstar = p[-2] == '/'; while (*++p == '*') {} special = TRUE; + if (slashstarstar && *p == '/' && + dowild(p + 1, text, a, force_lower_case) == TRUE) + return TRUE; } else special = FALSE; if (*p == '\0') { -- 8< -- -- Duy ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-02 23:20 What's cooking in git.git (Oct 2012, #01; Tue, 2) Junio C Hamano 2012-10-03 15:23 ` Nguyen Thai Ngoc Duy @ 2012-10-04 8:17 ` David Michael Barr 2012-10-04 8:30 ` fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)) Jonathan Nieder 2012-10-04 16:27 ` What's cooking in git.git (Oct 2012, #01; Tue, 2) Junio C Hamano 2012-10-30 12:15 ` Florian Achleitner 2 siblings, 2 replies; 31+ messages in thread From: David Michael Barr @ 2012-10-04 8:17 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Florian Achleitner, Dmitry Ivankov, Jonathan Nieder On Wednesday, 3 October 2012 at 9:20 AM, Junio C Hamano wrote: > > * fa/remote-svn (2012-09-19) 16 commits > - Add a test script for remote-svn > - remote-svn: add marks-file regeneration > - Add a svnrdump-simulator replaying a dump file for testing > - remote-svn: add incremental import > - remote-svn: Activate import/export-marks for fast-import > - Create a note for every imported commit containing svn metadata > - vcs-svn: add fast_export_note to create notes > - Allow reading svn dumps from files via file:// urls > - remote-svn, vcs-svn: Enable fetching to private refs > - When debug==1, start fast-import with "--stats" instead of "--quiet" > - Add documentation for the 'bidi-import' capability of remote-helpers > - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' capability > - Add argv_array_detach and argv_array_free_detached > - Add svndump_init_fd to allow reading dumps from arbitrary FDs > - Add git-remote-testsvn to Makefile > - Implement a remote helper for svn in C > (this branch is used by fa/vcs-svn.) > > A GSoC project. > Waiting for comments from mentors and stakeholders. I have reviewed this topic and am happy with the design and implementation. I support this topic for inclusion. Acked-by: David Michael Barr <b@rr-dav.id.au> > > * fa/vcs-svn (2012-09-19) 4 commits > - vcs-svn: remove repo_tree > - vcs-svn/svndump: rewrite handle_node(), begin|end_revision() > - vcs-svn/svndump: restructure node_ctx, rev_ctx handling > - svndump: move struct definitions to .h > (this branch uses fa/remote-svn.) > > A GSoC project. > Waiting for comments from mentors and stakeholders. This follow-on topic I'm not so sure on, some of the design decisions make me uncomfortable and I need some convincing before I can get behind this topic. -- David Michael Barr ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)) 2012-10-04 8:17 ` David Michael Barr @ 2012-10-04 8:30 ` Jonathan Nieder 2012-10-04 13:16 ` Stephen Bash 2012-10-04 16:27 ` What's cooking in git.git (Oct 2012, #01; Tue, 2) Junio C Hamano 1 sibling, 1 reply; 31+ messages in thread From: Jonathan Nieder @ 2012-10-04 8:30 UTC (permalink / raw) To: David Michael Barr Cc: Junio C Hamano, git, Florian Achleitner, Dmitry Ivankov David Michael Barr wrote: > On Wednesday, 3 October 2012 at 9:20 AM, Junio C Hamano wrote: >> * fa/remote-svn (2012-09-19) 16 commits >> - Add a test script for remote-svn >> - remote-svn: add marks-file regeneration >> - Add a svnrdump-simulator replaying a dump file for testing >> - remote-svn: add incremental import >> - remote-svn: Activate import/export-marks for fast-import >> - Create a note for every imported commit containing svn metadata >> - vcs-svn: add fast_export_note to create notes >> - Allow reading svn dumps from files via file:// urls >> - remote-svn, vcs-svn: Enable fetching to private refs >> - When debug==1, start fast-import with "--stats" instead of "--quiet" >> - Add documentation for the 'bidi-import' capability of remote-helpers >> - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' capability >> - Add argv_array_detach and argv_array_free_detached >> - Add svndump_init_fd to allow reading dumps from arbitrary FDs >> - Add git-remote-testsvn to Makefile >> - Implement a remote helper for svn in C >> (this branch is used by fa/vcs-svn.) >> >> A GSoC project. >> Waiting for comments from mentors and stakeholders. > > I have reviewed this topic and am happy with the design and implementation. > I support this topic for inclusion. Thanks! I'll try moving the tests to the first patch and trying it and hopefully send out a branch to pull tomorrow. If I don't send anything tomorrow, that's probably a sign that I never will, so since I like the goal of the series I guess it would be a kind of implied ack. Jonathan ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)) 2012-10-04 8:30 ` fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)) Jonathan Nieder @ 2012-10-04 13:16 ` Stephen Bash 2012-10-04 16:30 ` Junio C Hamano 0 siblings, 1 reply; 31+ messages in thread From: Stephen Bash @ 2012-10-04 13:16 UTC (permalink / raw) To: Jonathan Nieder Cc: Junio C Hamano, git, Florian Achleitner, Dmitry Ivankov, David Michael Barr ----- Original Message ----- > From: "Jonathan Nieder" <jrnieder@gmail.com> > Sent: Thursday, October 4, 2012 4:30:01 AM > Subject: Re: fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)) > > > > * fa/remote-svn (2012-09-19) 16 commits > > > - Add a test script for remote-svn > > > - remote-svn: add marks-file regeneration > > > - Add a svnrdump-simulator replaying a dump file for testing > > > - remote-svn: add incremental import > > > - remote-svn: Activate import/export-marks for fast-import > > > - Create a note for every imported commit containing svn metadata > > > - vcs-svn: add fast_export_note to create notes > > > - Allow reading svn dumps from files via file:// urls > > > - remote-svn, vcs-svn: Enable fetching to private refs > > > - When debug==1, start fast-import with "--stats" instead of > > > "--quiet" > > > - Add documentation for the 'bidi-import' capability of > > > remote-helpers > > > - Connect fast-import to the remote-helper via pipe, adding > > > 'bidi-import' capability > > > - Add argv_array_detach and argv_array_free_detached > > > - Add svndump_init_fd to allow reading dumps from arbitrary FDs > > > - Add git-remote-testsvn to Makefile > > > - Implement a remote helper for svn in C > > > (this branch is used by fa/vcs-svn.) > > > > > > A GSoC project. > > > Waiting for comments from mentors and stakeholders. > > > > I have reviewed this topic and am happy with the design and > > implementation. I support this topic for inclusion. > > Thanks! I'll try moving the tests to the first patch and trying it > and hopefully send out a branch to pull tomorrow. > > If I don't send anything tomorrow, that's probably a sign that I never > will, so since I like the goal of the series I guess it would be a > kind of implied ack. I seemed to have missed the GSoC wrap up conversation... (links happily accepted) Looking at the big picture (as much as I can remember) it seems to me the missing pieces now are branch mapping (lots of hard work), and possibly parts (all?) of the "push to SVN" functionality? Thoughts? Thanks, Stephen ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)) 2012-10-04 13:16 ` Stephen Bash @ 2012-10-04 16:30 ` Junio C Hamano 0 siblings, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2012-10-04 16:30 UTC (permalink / raw) To: Stephen Bash Cc: Jonathan Nieder, git, Florian Achleitner, Dmitry Ivankov, David Michael Barr Stephen Bash <bash@genarts.com> writes: > I seemed to have missed the GSoC wrap up conversation... (links happily > accepted). I also seem to have missed such conversation, if anything like that happened. It certainly would have been nice for the mentors and the student for each project to give us a two-to-three-paragraphs summary. As GSoC is a Google event and not the Git community one, I wouldn't *demand* a concluding write-up, but it still took considerable community resource, so... ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-04 8:17 ` David Michael Barr 2012-10-04 8:30 ` fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)) Jonathan Nieder @ 2012-10-04 16:27 ` Junio C Hamano 1 sibling, 0 replies; 31+ messages in thread From: Junio C Hamano @ 2012-10-04 16:27 UTC (permalink / raw) To: David Michael Barr Cc: git, Florian Achleitner, Dmitry Ivankov, Jonathan Nieder David Michael Barr <b@rr-dav.id.au> writes: > On Wednesday, 3 October 2012 at 9:20 AM, Junio C Hamano wrote: >> >> * fa/remote-svn (2012-09-19) 16 commits >> ... >> >> A GSoC project. >> Waiting for comments from mentors and stakeholders. > > I have reviewed this topic and am happy with the design and implementation. > I support this topic for inclusion. > > Acked-by: David Michael Barr <b@rr-dav.id.au> >> >> * fa/vcs-svn (2012-09-19) 4 commits >> ... > > This follow-on topic I'm not so sure on, some of the design > decisions make me uncomfortable and I need some convincing before > I can get behind this topic. Thanks for a feedback. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: What's cooking in git.git (Oct 2012, #01; Tue, 2) 2012-10-02 23:20 What's cooking in git.git (Oct 2012, #01; Tue, 2) Junio C Hamano 2012-10-03 15:23 ` Nguyen Thai Ngoc Duy 2012-10-04 8:17 ` David Michael Barr @ 2012-10-30 12:15 ` Florian Achleitner 2 siblings, 0 replies; 31+ messages in thread From: Florian Achleitner @ 2012-10-30 12:15 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, David Michael Barr, Dmitry Ivankov, Jonathan Nieder Sorry for reacting so late, I didn't read the list carefully in the last weeks and my gmail filter somehow didn't trigger on that. On Tuesday 02 October 2012 16:20:22 Junio C Hamano wrote: > * fa/remote-svn (2012-09-19) 16 commits > - Add a test script for remote-svn > - remote-svn: add marks-file regeneration > - Add a svnrdump-simulator replaying a dump file for testing > - remote-svn: add incremental import > - remote-svn: Activate import/export-marks for fast-import > - Create a note for every imported commit containing svn metadata > - vcs-svn: add fast_export_note to create notes > - Allow reading svn dumps from files via file:// urls > - remote-svn, vcs-svn: Enable fetching to private refs > - When debug==1, start fast-import with "--stats" instead of "--quiet" > - Add documentation for the 'bidi-import' capability of remote-helpers > - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' > capability - Add argv_array_detach and argv_array_free_detached > - Add svndump_init_fd to allow reading dumps from arbitrary FDs > - Add git-remote-testsvn to Makefile > - Implement a remote helper for svn in C > (this branch is used by fa/vcs-svn.) > > A GSoC project. > Waiting for comments from mentors and stakeholders. >From my point of view, this is rather complete. It got eight review cycles on the list. Note that the remote helper can only fetch, pushing is not possible at all. > > > * fa/vcs-svn (2012-09-19) 4 commits > - vcs-svn: remove repo_tree > - vcs-svn/svndump: rewrite handle_node(), begin|end_revision() > - vcs-svn/svndump: restructure node_ctx, rev_ctx handling > - svndump: move struct definitions to .h > (this branch uses fa/remote-svn.) > > A GSoC project. > Waiting for comments from mentors and stakeholders. This is the result of what I did when I wanted to start implementing branch detection. I found that the existing code is not suitable and restructured it. The main goal is to seperate svn revision parsing from git commit creation. Because for creating commits, you need to know on which branch to create the commit. While for finding out which branch is the right one, you need to read the complete svn revision first to see what dirs are changed and how. It is rather invasive and it doesn't make sense without using it later on. So I'm not surprised that you may not like it. Anyways it passes all existing tests (that doesn't mean it's good of course ;)) Florian ^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2012-10-30 13:17 UTC | newest] Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-10-02 23:20 What's cooking in git.git (Oct 2012, #01; Tue, 2) Junio C Hamano 2012-10-03 15:23 ` Nguyen Thai Ngoc Duy 2012-10-03 18:17 ` Junio C Hamano 2012-10-04 1:56 ` Nguyen Thai Ngoc Duy 2012-10-04 6:01 ` Junio C Hamano 2012-10-04 7:39 ` [PATCH 0/6] wildmatch part 2 Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 1/6] attr: remove the union in struct match_attr Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 2/6] attr: avoid strlen() on every match Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 3/6] attr: avoid searching for basename " Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 4/6] attr: more matching optimizations from .gitignore Nguyễn Thái Ngọc Duy 2012-10-04 7:39 ` [PATCH 5/6] gitignore: do not do basename match with patterns that have '**' Nguyễn Thái Ngọc Duy 2012-10-04 17:59 ` Junio C Hamano 2012-10-05 7:01 ` Johannes Sixt 2012-10-05 11:23 ` Nguyen Thai Ngoc Duy 2012-10-04 7:39 ` [PATCH 6/6] t3001: note about expected "**" behavior Nguyễn Thái Ngọc Duy 2012-10-04 18:04 ` Junio C Hamano 2012-10-04 17:43 ` [PATCH 0/6] wildmatch part 2 Junio C Hamano 2012-10-04 9:34 ` What's cooking in git.git (Oct 2012, #01; Tue, 2) Michael Haggerty 2012-10-04 11:46 ` Nguyen Thai Ngoc Duy 2012-10-04 15:17 ` Michael Haggerty 2012-10-04 16:39 ` Junio C Hamano 2012-10-05 12:19 ` Andreas Schwab 2012-10-05 12:30 ` Matthieu Moy 2012-10-05 14:15 ` Andreas Schwab 2012-10-05 13:21 ` Nguyen Thai Ngoc Duy 2012-10-04 8:17 ` David Michael Barr 2012-10-04 8:30 ` fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)) Jonathan Nieder 2012-10-04 13:16 ` Stephen Bash 2012-10-04 16:30 ` Junio C Hamano 2012-10-04 16:27 ` What's cooking in git.git (Oct 2012, #01; Tue, 2) Junio C Hamano 2012-10-30 12:15 ` Florian Achleitner
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).