git@vger.kernel.org mailing list mirror (one of many)
From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 00/19] nd/struct-pathspec (or pathspec unification [1])
Date: Mon, 13 Dec 2010 16:46:37 +0700
Message-ID: <1292233616-27692-1-git-send-email-pclouds@gmail.com>

Background:

Pathspecs in git are handled differently in three places:

 1. log family uses tree_entry_interesting() and ce_path_match()
 2. most index-related operations use match_pathspec()
 3. grep uses its own pathspec_matches()

Of the three, #3 provides the most advanced functionality, while #1
has a few good optimizations but is not as powerful as #3. #2 is sort
of a trade-off between the other two.

This series brings all the #3 goodness to #1 and #2, then kills #3. I
don't want to kill #2 because it takes a list as input, while #1 takes
trees (ce_path_match() takes a list though). There could be different
optimizations based on the different input types.

Summary of patches:

  Add struct pathspec
  diff-no-index: use diff_tree_setup_paths()
  pathspec: cache string length when initializing pathspec
  Convert struct diff_options to use struct pathspec
  tree_entry_interesting(): remove dependency on struct diff_options
  Move tree_entry_interesting() to tree-walk.c and export it

This is unchanged from nd/struct-pathspec in pu. There is one patch
from pu replaced later.

  glossary: define pathspec

This is what I am aiming for. If I make mistakes, blame Jonathan
because he mis-specified it ;-)

  pathspec: mark wildcard pathspecs from the beginning

From the old nd/struct-pathspec, to recognize potential wildcard
pathspecs early.
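
A minimal sketch of what "marking early" could look like, not the code
from the patch: each pathspec string is scanned once when it is set up,
and the result is stored next to a cached length so later matching can
skip the glob path for plain strings. The struct, its field names, the
function name, and the set of characters treated as wildcards are all
assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-item record; field names are illustrative only. */
struct pathspec_item_sketch {
	const char *match;	/* the pathspec string as given */
	size_t len;		/* cached strlen(match) */
	int has_wildcard;	/* nonzero if match contains a glob character */
};

/* Fill in one item: measure the string once and note any glob character. */
static void init_item_sketch(struct pathspec_item_sketch *item, const char *s)
{
	item->match = s;
	item->len = strlen(s);
	/* strpbrk() returns NULL when none of '*', '?', '[' occurs in s. */
	item->has_wildcard = strpbrk(s, "*?[") != NULL;
}
```

With this, a matcher can check has_wildcard first and fall back to a
plain prefix comparison when it is zero.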

  tree-diff.c: reserve space in "base" for pathname concatenation

Probably the most used operation in traversing trees is concatenating
dirname and basename into a full path (especially for wildcard
matching). This requires a new buffer every time. This patch ensures
that the caller prepares a writable buffer with dirname already filled
in. If the callee wants the full path, it does not have to allocate
another buffer (and does a shorter memcpy).

This patch is not strictly needed though.
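
To illustrate the buffer idea, here is a rough sketch under my own
assumptions (struct base_buf and full_path() are hypothetical names,
not what tree-diff.c uses): the caller owns a buffer that already holds
the directory prefix, and the callee only copies the entry name into
the reserved tail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical writable buffer: holds "dir/" plus room for an entry name. */
struct base_buf {
	char *buf;	/* caller-owned storage */
	size_t len;	/* length of the directory prefix already in buf */
	size_t alloc;	/* total bytes available in buf */
};

/*
 * Append name after the prefix and NUL-terminate it.
 * Returns the full path (pointing into b->buf), or NULL if it does not fit.
 * b->len is left unchanged, so the next entry overwrites the same tail.
 */
static const char *full_path(struct base_buf *b, const char *name, size_t namelen)
{
	if (b->len + namelen + 1 > b->alloc)
		return NULL;
	/* Only the basename is copied; the dirname is already in place. */
	memcpy(b->buf + b->len, name, namelen);
	b->buf[b->len + namelen] = '\0';
	return b->buf;
}
```

Because the prefix is never rewritten, walking all entries of one
directory costs one short memcpy per entry and no allocation.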

  tree_entry_interesting(): factor out most matching logic

For readability of the next patches.

  tree_entry_interesting: support depth limit

Goodness from #3.
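
As a rough illustration of a depth limit (a sketch, not the patch's
code), a check could count directory separators in the part of the
path below the matched pathspec and reject entries that go deeper than
a maximum. The function name and its parameters are assumptions.

```c
#include <stddef.h>

/*
 * Hypothetical check: name is the path relative to the matched pathspec,
 * e.g. "sub/dir/file.c". Returns 1 if it passes through at most max_depth
 * directory levels, 0 otherwise.
 */
static int within_depth_sketch(const char *name, size_t namelen, int max_depth)
{
	int depth = 0;
	size_t i;

	for (i = 0; i < namelen; i++) {
		if (name[i] != '/')
			continue;
		/* Each '/' means one more directory level was entered. */
		if (++depth > max_depth)
			return 0;
	}
	return 1;
}
```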

  tree_entry_interesting(): support wildcard matching
  tree_entry_interesting(): optimize fnmatch when base is matched

This is something t_e_i() has lacked for so long. However, in order to
make log family commands work properly, ce_path_match() also needs to
learn wildcards.
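
A sketch of the two ideas in these patches, under my own assumptions
and not taken from the series: wildcard matching is done with POSIX
fnmatch() on the full path, and when the pattern literally starts with
the already-matched base directory, only the remainder is passed to
fnmatch(). The helper name, the fixed buffer size, and the flags value
of 0 are illustrative choices.

```c
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: does pattern match the path formed by base + name?
 * base is a directory prefix such as "t/" (or "" at the top level).
 * Returns 1 on match, 0 otherwise.
 */
static int match_wild_sketch(const char *pattern, const char *base, const char *name)
{
	char full[4096];
	size_t baselen = strlen(base);

	/*
	 * Shortcut: if base contains no glob special and the pattern starts
	 * with it literally, the prefix is known to match, so only the rest
	 * of the pattern has to be compared against the entry name.
	 */
	if (baselen && !strpbrk(base, "*?[\\") && !strncmp(pattern, base, baselen))
		return fnmatch(pattern + baselen, name, 0) == 0;

	/* General case: build the full path and match it as a whole. */
	if (baselen + strlen(name) >= sizeof(full))
		return 0;
	memcpy(full, base, baselen);
	strcpy(full + baselen, name);
	return fnmatch(pattern, full, 0) == 0;
}
```

With flags set to 0, '*' also matches '/', so a pattern like "*.c"
matches files in subdirectories in this sketch.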

This changes the tree_entry_interesting() interface and therefore
breaks en/object-list-with-pathspec. I'll send fixes shortly.

  Convert ce_path_match() use to match_pathspec()

So that log family now works with wildcards.

  pathspec: add match_pathspec_depth()

This is the new match_pathspec(). I don't want to replace the old one
because that would change more places. But once it works, another
patch to kill match_pathspec() should be easy.

  grep: convert to use struct pathspec
  grep: use match_pathspec_depth() for cache grepping
  grep: use preallocated buffer for grep_tree()
  grep: drop pathspec_matches() in favor of tree_entry_interesting()

grep (especially t7810) is how I test all these. I need to write more
tests to make sure things work. But for now t7810 passes.

Hopefully I did not lose any optimizations in pathspec_matches().

It's time to rebase negative pathspec patches on top and get back to
my narrow clone.

[1] https://git.wiki.kernel.org/index.php/SoC2010Ideas#Unify_Pathspec_Semantics

 Documentation/glossary-content.txt |   23 ++++
 builtin/diff-files.c               |    2 +-
 builtin/diff.c                     |    4 +-
 builtin/grep.c                     |  200 ++++++++---------------------
 builtin/log.c                      |    2 +-
 cache.h                            |   14 ++
 diff-lib.c                         |    2 +-
 diff-no-index.c                    |   13 +-
 diff.h                             |    4 +-
 dir.c                              |   98 ++++++++++++++
 dir.h                              |    4 +
 read-cache.c                       |   20 +---
 revision.c                         |    6 +-
 t/t4010-diff-pathspec.sh           |   14 ++
 tree-diff.c                        |  246 ++++++++----------------------------
 tree-walk.c                        |  186 +++++++++++++++++++++++++++
 tree-walk.h                        |    2 +
 17 files changed, 461 insertions(+), 379 deletions(-)

-- 
1.7.3.3.476.g10a82
