From: Elijah Newren <newren@gmail.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 5/5] unpack-trees: rename 'is_excluded_from_list()'
Date: Wed, 4 Sep 2019 13:25:54 -0700 [thread overview]
Message-ID: <CABPp-BFsOynr0skVAgxuhjrdXoZknU86+m2ecnN3BN+fFEiDoA@mail.gmail.com> (raw)
In-Reply-To: <d9a62c76339b64325a13d5f483f673450aa90c2b.1567533893.git.gitgitgadget@gmail.com>
On Tue, Sep 3, 2019 at 11:04 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The first consumer of pattern-matching filenames was the
> .gitignore feature. In that context, storing a list of patterns
> as a 'struct exclude_list' makes sense. However, the
> sparse-checkout feature then adopted these structures and methods,
> but with the opposite meaning: these patterns match the files
> that should be included!
>
> Now that this library is renamed to use 'struct pattern_list'
> and 'struct pattern', we can now rename the method used by
> the sparse-checkout feature to determine which paths should
> appear in the working directory.
>
> The method is_excluded_from_list() is only used by the
> sparse-checkout logic in unpack-trees and list-objects-filter.
> The confusing part is that it returned 1 for "excluded" (i.e.
> it matches the list of exclusions) but that really manes that
s/manes/means/
> the path matched the list of patterns for _inclusion_ in the
> working directory.
>
> Rename the method to be path_matches_pattern_list() and have
> it return an explicit 'enum pattern_match_result'. Here, the
> values MATCHED = 1, UNMATCHED = 0, and UNDECIDED = -1 agree
> with the previous integer values. This shift allows future
> consumers to better understand what the retur values mean,
s/retur/return/
> and provides more type checking for handling those values.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
> dir.c | 25 +++++++++++++++++--------
> dir.h | 21 +++++++++++++++++----
> list-objects-filter.c | 29 +++++++++++++++--------------
> unpack-trees.c | 39 +++++++++++++++++++++++----------------
> 4 files changed, 72 insertions(+), 42 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index b057bd3d95..34972abdaf 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1072,19 +1072,28 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
> }
>
> /*
> - * Scan the list and let the last match determine the fate.
> - * Return 1 for exclude, 0 for include and -1 for undecided.
> + * Scan the list of patterns to determine if the ordered list
> + * of patterns matches on 'pathname'.
> + *
> + * Return 1 for a match, 0 for not matched and -1 for undecided.
Maybe drop this last sentence since it's misleading, and since the
function signature already specifies that it returns an enum and the
enums have obvious meanings already?
> */
> -int is_excluded_from_list(const char *pathname,
> - int pathlen, const char *basename, int *dtype,
> - struct pattern_list *pl, struct index_state *istate)
> +enum pattern_match_result path_matches_pattern_list(
> + const char *pathname, int pathlen,
> + const char *basename, int *dtype,
> + struct pattern_list *pl,
> + struct index_state *istate)
> {
> struct path_pattern *pattern;
> pattern = last_matching_pattern_from_list(pathname, pathlen, basename,
> dtype, pl, istate);
> - if (pattern)
> - return pattern->flags & PATTERN_FLAG_NEGATIVE ? 0 : 1;
> - return -1; /* undecided */
> + if (pattern) {
> + if (pattern->flags & PATTERN_FLAG_NEGATIVE)
> + return NOT_MATCHED;
> + else
> + return MATCHED;
> + }
> +
> + return UNDECIDED;
> }
>
> static struct path_pattern *last_matching_pattern_from_lists(
> diff --git a/dir.h b/dir.h
> index 7babf31d88..608696c958 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -230,10 +230,23 @@ int read_directory(struct dir_struct *, struct index_state *istate,
> const char *path, int len,
> const struct pathspec *pathspec);
>
> -int is_excluded_from_list(const char *pathname, int pathlen,
> - const char *basename, int *dtype,
> - struct pattern_list *pl,
> - struct index_state *istate);
> +enum pattern_match_result {
> + UNDECIDED = -1,
> + NOT_MATCHED = 0,
> + MATCHED = 1,
Are we worried about preventing e.g. grep from using these names or
clashing with another lib (block-sha1?) that might choose to define
these?
> +};
> +
> +/*
> + * Scan the list of patterns to determine if the ordered list
> + * of patterns matches on 'pathname'.
> + *
> + * Return 1 for a match, 0 for not matched and -1 for undecided.
Again, I'd drop the last sentence.
> + */
> +enum pattern_match_result path_matches_pattern_list(const char *pathname,
> + int pathlen,
> + const char *basename, int *dtype,
> + struct pattern_list *pl,
> + struct index_state *istate);
> struct dir_entry *dir_add_ignored(struct dir_struct *dir,
> struct index_state *istate,
> const char *pathname, int len);
> diff --git a/list-objects-filter.c b/list-objects-filter.c
> index ccd58e61c3..d624f1c898 100644
> --- a/list-objects-filter.c
> +++ b/list-objects-filter.c
> @@ -328,12 +328,12 @@ static void filter_blobs_limit__init(
> */
> struct frame {
> /*
> - * defval is the usual default include/exclude value that
> + * default_match is the usual default include/exclude value that
> * should be inherited as we recurse into directories based
> * upon pattern matching of the directory itself or of a
> * containing directory.
> */
> - int defval;
> + enum pattern_match_result default_match;
>
> /*
> * 1 if the directory (recursively) contains any provisionally
> @@ -363,8 +363,9 @@ static enum list_objects_filter_result filter_sparse(
> void *filter_data_)
> {
> struct filter_sparse_data *filter_data = filter_data_;
> - int val, dtype;
> + int dtype;
> struct frame *frame;
> + enum pattern_match_result match;
>
> switch (filter_situation) {
> default:
> @@ -373,15 +374,15 @@ static enum list_objects_filter_result filter_sparse(
> case LOFS_BEGIN_TREE:
> assert(obj->type == OBJ_TREE);
> dtype = DT_DIR;
> - val = is_excluded_from_list(pathname, strlen(pathname),
> - filename, &dtype, &filter_data->pl,
> - r->index);
> - if (val < 0)
> - val = filter_data->array_frame[filter_data->nr - 1].defval;
> + match = path_matches_pattern_list(pathname, strlen(pathname),
> + filename, &dtype, &filter_data->pl,
> + r->index);
> + if (match == UNDECIDED)
> + match = filter_data->array_frame[filter_data->nr - 1].default_match;
>
> ALLOC_GROW(filter_data->array_frame, filter_data->nr + 1,
> filter_data->alloc);
> - filter_data->array_frame[filter_data->nr].defval = val;
> + filter_data->array_frame[filter_data->nr].default_match = match;
> filter_data->array_frame[filter_data->nr].child_prov_omit = 0;
> filter_data->nr++;
>
> @@ -435,12 +436,12 @@ static enum list_objects_filter_result filter_sparse(
> frame = &filter_data->array_frame[filter_data->nr - 1];
>
> dtype = DT_REG;
> - val = is_excluded_from_list(pathname, strlen(pathname),
> + match = path_matches_pattern_list(pathname, strlen(pathname),
> filename, &dtype, &filter_data->pl,
> r->index);
> - if (val < 0)
> - val = frame->defval;
> - if (val > 0) {
> + if (match == UNDECIDED)
> + match = frame->default_match;
> + if (match == MATCHED) {
> if (omits)
> oidset_remove(omits, &obj->oid);
> return LOFR_MARK_SEEN | LOFR_DO_SHOW;
> @@ -487,7 +488,7 @@ static void filter_sparse_oid__init(
> die("could not load filter specification");
>
> ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc);
> - d->array_frame[d->nr].defval = 0; /* default to include */
> + d->array_frame[d->nr].default_match = 0; /* default to include */
s/0/MATCHED/ ?
> d->array_frame[d->nr].child_prov_omit = 0;
> d->nr++;
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 902a799aeb..cd548f4fa2 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1265,7 +1265,8 @@ static int clear_ce_flags_1(struct index_state *istate,
> struct cache_entry **cache, int nr,
> struct strbuf *prefix,
> int select_mask, int clear_mask,
> - struct pattern_list *pl, int defval);
> + struct pattern_list *pl,
> + enum pattern_match_result default_match);
>
> /* Whole directory matching */
> static int clear_ce_flags_dir(struct index_state *istate,
> @@ -1273,19 +1274,21 @@ static int clear_ce_flags_dir(struct index_state *istate,
> struct strbuf *prefix,
> char *basename,
> int select_mask, int clear_mask,
> - struct pattern_list *pl, int defval)
> + struct pattern_list *pl,
> + enum pattern_match_result default_match)
> {
> struct cache_entry **cache_end;
> int dtype = DT_DIR;
> - int ret = is_excluded_from_list(prefix->buf, prefix->len,
> - basename, &dtype, pl, istate);
> int rc;
> + enum pattern_match_result ret;
> + ret = path_matches_pattern_list(prefix->buf, prefix->len,
> + basename, &dtype, pl, istate);
>
> strbuf_addch(prefix, '/');
>
> /* If undecided, use matching result of parent dir in defval */
> - if (ret < 0)
> - ret = defval;
> + if (ret == UNDECIDED)
> + ret = default_match;
>
> for (cache_end = cache; cache_end != cache + nr; cache_end++) {
> struct cache_entry *ce = *cache_end;
> @@ -1298,7 +1301,7 @@ static int clear_ce_flags_dir(struct index_state *istate,
> * with ret (iow, we know in advance the incl/excl
> * decision for the entire directory), clear flag here without
> * calling clear_ce_flags_1(). That function will call
> - * the expensive is_excluded_from_list() on every entry.
> + * the expensive path_matches_pattern_list() on every entry.
> */
> rc = clear_ce_flags_1(istate, cache, cache_end - cache,
> prefix,
> @@ -1327,7 +1330,8 @@ static int clear_ce_flags_1(struct index_state *istate,
> struct cache_entry **cache, int nr,
> struct strbuf *prefix,
> int select_mask, int clear_mask,
> - struct pattern_list *pl, int defval)
> + struct pattern_list *pl,
> + enum pattern_match_result default_match)
> {
> struct cache_entry **cache_end = cache + nr;
>
> @@ -1338,7 +1342,8 @@ static int clear_ce_flags_1(struct index_state *istate,
> while(cache != cache_end) {
> struct cache_entry *ce = *cache;
> const char *name, *slash;
> - int len, dtype, ret;
> + int len, dtype;
> + enum pattern_match_result ret;
>
> if (select_mask && !(ce->ce_flags & select_mask)) {
> cache++;
> @@ -1362,7 +1367,7 @@ static int clear_ce_flags_1(struct index_state *istate,
> prefix,
> prefix->buf + prefix->len - len,
> select_mask, clear_mask,
> - pl, defval);
> + pl, default_match);
>
> /* clear_c_f_dir eats a whole dir already? */
> if (processed) {
> @@ -1374,18 +1379,20 @@ static int clear_ce_flags_1(struct index_state *istate,
> strbuf_addch(prefix, '/');
> cache += clear_ce_flags_1(istate, cache, cache_end - cache,
> prefix,
> - select_mask, clear_mask, pl, defval);
> + select_mask, clear_mask, pl,
> + default_match);
> strbuf_setlen(prefix, prefix->len - len - 1);
> continue;
> }
>
> /* Non-directory */
> dtype = ce_to_dtype(ce);
> - ret = is_excluded_from_list(ce->name, ce_namelen(ce),
> - name, &dtype, pl, istate);
> - if (ret < 0)
> - ret = defval;
> - if (ret > 0)
> + ret = path_matches_pattern_list(ce->name,
> + ce_namelen(ce),
> + name, &dtype, pl, istate);
> + if (ret == UNDECIDED)
> + ret = default_match;
> + if (ret == MATCHED)
> ce->ce_flags &= ~clear_mask;
> cache++;
> }
> --
> gitgitgadget
The rest looks good.
next prev parent reply other threads:[~2019-09-04 20:26 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-03 18:04 [PATCH 0/5] Refactor excludes library Derrick Stolee via GitGitGadget
2019-09-03 18:04 ` [PATCH 1/5] treewide: rename 'struct exclude' to 'struct path_pattern' Derrick Stolee via GitGitGadget
2019-09-05 6:55 ` Jeff King
2019-09-05 21:03 ` Junio C Hamano
2019-09-03 18:04 ` [PATCH 3/5] treewide: rename 'EXCL_FLAG_' to 'PATTERN_FLAG_' Derrick Stolee via GitGitGadget
2019-09-03 18:04 ` [PATCH 2/5] treewide: rename 'struct exclude_list' to 'struct pattern_list' Derrick Stolee via GitGitGadget
2019-09-03 18:04 ` [PATCH 4/5] treewide: rename 'exclude' methods to 'pattern' Derrick Stolee via GitGitGadget
2019-09-03 18:04 ` [PATCH 5/5] unpack-trees: rename 'is_excluded_from_list()' Derrick Stolee via GitGitGadget
2019-09-04 20:25 ` Elijah Newren [this message]
2019-09-04 20:28 ` [PATCH 0/5] Refactor excludes library Elijah Newren
2019-09-06 20:34 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CABPp-BFsOynr0skVAgxuhjrdXoZknU86+m2ecnN3BN+fFEiDoA@mail.gmail.com \
--to=newren@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).