git@vger.kernel.org mailing list mirror (one of many)
From: Jeff Hostetler <git@jeffhostetler.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH v3 4/6] list-objects: filter objects in traverse_commit_list
Date: Thu, 16 Nov 2017 12:23:56 -0500
Message-ID: <61855872-221b-0e97-abaa-24a011ad899e@jeffhostetler.com>
In-Reply-To: <20171107152034.47686f6ece72ea3d43005b12@google.com>



On 11/7/2017 6:20 PM, Jonathan Tan wrote:
> On Tue,  7 Nov 2017 19:35:44 +0000
> Jeff Hostetler <git@jeffhostetler.com> wrote:
> 
>> +/*
>> + * Reject the arg if it contains any characters that might
>> + * require quoting or escaping when handing to a sub-command.
>> + */
>> +static int reject_injection_chars(const char *arg)
>> +{
> [snip]
>> +}
> 
> Someone pointed me to quote.{c,h}, which is probably sufficient to
> ensure shell safety if we do invoke subcommands through the shell. If
> that is so, we probably don't need a blacklist.
> 
> Having said that, though, it might be safer to still introduce one, and
> relax it later if necessary - it is much easier to relax a constraint
> than to increase one.

I couldn't use quote.[ch] because it is mostly concerned with
quoting pathnames that contain LF and CR characters, rather
than the semicolons, quotes, and the like that I was
concerned about.

Anyway, in my next patch series I've replaced all of the
injection code from my last series with something a little
stronger and less restrictive.
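
To illustrate the kind of blacklist check under discussion, here is a
minimal sketch. It is not the code from the patch (that body is
snipped in the quote above), and it is not the replacement mentioned
for the next series. The function name and the character set are
assumptions chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the code from the patch: return 1 if arg
 * contains a character that a shell might treat specially, else 0.
 * The character set below is an illustrative assumption.
 */
static int has_shell_special_chars(const char *arg)
{
	static const char special[] = ";&|<>`$'\"\\ \t\n\r*?()[]{}!#~";

	/* strpbrk() returns NULL when none of the listed bytes occur. */
	return strpbrk(arg, special) != NULL;
}
```

A caller would refuse the argument when this returns 1. A list like
this is easy to relax later, which is the trade-off raised in the
quoted review.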

> 
>> +	} else if (skip_prefix(arg, "sparse:", &v0)) {
>> +
>> +		if (skip_prefix(v0, "oid=", &v1)) {
>> +			struct object_context oc;
>> +			struct object_id sparse_oid;
>> +			filter_options->choice = LOFC_SPARSE_OID;
>> +			if (!get_oid_with_context(v1, GET_OID_BLOB,
>> +						  &sparse_oid, &oc))
>> +				filter_options->sparse_oid_value =
>> +					oiddup(&sparse_oid);
>> +			return 0;
>> +		}
> 
> In your recent e-mail [1], you said that you will change it to always pass
> the original expression - is that still the plan?
> 
> [1] https://public-inbox.org/git/f698d5a8-bf31-cea1-a8da-88b755b0b7af@jeffhostetler.com/

Yes.  I always pass filter_options.raw_value over the wire.
The code above tries to parse it and store the result as an
OID for private use by the current process -- just like the
size limit value in the blob:limit filter.
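
The split between the raw expression and the privately parsed value
could be sketched roughly as follows. This is a simplified,
self-contained illustration, not the patch: the struct, enum, and
function names are hypothetical, the limit parser accepts only plain
decimal digits, and the sparse expression is kept as text rather than
resolved to an OID.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the structures in the patch. */
enum filter_choice { FILTER_NONE, FILTER_BLOB_LIMIT, FILTER_SPARSE_OID };

struct filter_spec {
	char *raw_value;            /* original text; what would go over the wire */
	enum filter_choice choice;  /* parsed form, private to this process */
	unsigned long blob_limit;   /* valid when choice == FILTER_BLOB_LIMIT */
	const char *sparse_expr;    /* points into raw_value; not resolved here */
};

/* If str starts with prefix, store the remainder in *out and return 1. */
static int strip_prefix(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Returns 0 on success, -1 on an unrecognized or malformed spec. */
static int parse_filter_spec(struct filter_spec *spec, const char *arg)
{
	size_t len = strlen(arg) + 1;
	const char *v;
	char *end;

	memset(spec, 0, sizeof(*spec));
	spec->raw_value = malloc(len);
	if (!spec->raw_value)
		return -1;
	memcpy(spec->raw_value, arg, len);  /* always keep the original text */

	if (strip_prefix(spec->raw_value, "blob:limit=", &v) &&
	    isdigit((unsigned char)*v)) {
		spec->blob_limit = strtoul(v, &end, 10);
		if (!*end) {
			spec->choice = FILTER_BLOB_LIMIT;
			return 0;
		}
	} else if (strip_prefix(spec->raw_value, "sparse:oid=", &v)) {
		spec->choice = FILTER_SPARSE_OID;
		spec->sparse_expr = v;
		return 0;
	}

	/* Unrecognized or malformed: release the copy and reset the struct. */
	free(spec->raw_value);
	memset(spec, 0, sizeof(*spec));
	return -1;
}
```

On success the caller owns raw_value and frees it when done. Only
raw_value would be sent to the other side; the parsed fields stay
local.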

>> +/* Remember to update object flag allocation in object.h */
> 
> You probably can delete this line.

Every other place that defined flag bits included this comment,
so I did too.  (It really made it easier to find the other
random places that define bits, actually.)

> 
>> +/*
>> + * FILTER_SHOWN_BUT_REVISIT -- we set this bit on tree objects
>> + * that have been shown, but should be revisited if they appear
>> + * in the traversal (until we mark it SEEN).  This is a way to
>> + * let us silently de-dup calls to show() in the caller.
> 
> This is unclear to me at first reading. Maybe something like:
> 
>    FILTER_SHOWN_BUT_REVISIT -- we set this bit on tree objects that have
>    been shown, but should not be skipped over if they reappear in the
>    traversal. This ensures that the tree's descendants are re-processed
>    if the tree reappears subsequently, and that the tree is not shown
>    twice.
> 
>> + * This
>> + * is subtly different from the "revision.h:SHOWN" and the
>> + * "sha1_name.c:ONELINE_SEEN" bits.  And also different from
>> + * the non-de-dup usage in pack-bitmap.c
>> + */
> 
> Optional: I'm not sure if this comparison is useful. (Maybe it is useful
> to others, though.)

I expected the first question about my FILTER_SHOWN_BUT_REVISIT
bit would be why I wasn't just using the existing SHOWN bit.
There are subtle differences between the bits, and I wanted to
point out that I was not just duplicating the usage of an
existing bit.
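
As a rough illustration of the behavior the quoted comment describes
(show a tree once, keep revisiting it until it is finally marked
SEEN), here is a small sketch. It is based only on that comment, not
on the patch itself: the struct, the bit positions, the shown_count
counter, and the provisional parameter are all assumptions.

```c
#include <stddef.h>

/* Bit positions are assumptions for this sketch only. */
#define SEEN                     (1u << 0)
#define FILTER_SHOWN_BUT_REVISIT (1u << 1)

/* Hypothetical stand-in for a tree object. */
struct tree_node {
	unsigned flags;
	int shown_count;  /* how many times show() would have been called */
};

/*
 * Visit a tree during traversal. provisional != 0 means the filter
 * still wants the tree to be revisited if it reappears.
 */
static void visit_tree(struct tree_node *t, int provisional)
{
	if (t->flags & SEEN)
		return;  /* fully handled: skip it and its descendants */

	/* Show the tree only on the first visit (silent de-dup). */
	if (!(t->flags & FILTER_SHOWN_BUT_REVISIT))
		t->shown_count++;

	if (provisional)
		t->flags |= FILTER_SHOWN_BUT_REVISIT;
	else
		t->flags |= SEEN;

	/* Descendants would be (re)processed here on every non-SEEN visit. */
}
```

However many times the tree reappears, it is shown at most once, and
its children are re-examined until the tree is marked SEEN.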
  
> 
>> +/*
>> + * A filter driven by a sparse-checkout specification to only
>> + * include blobs that a sparse checkout would populate.
>> + *
>> + * The sparse-checkout spec can be loaded from a blob with the
>> + * given OID or from a local pathname.  We allow an OID because
>> + * the repo may be bare or we may be doing the filtering on the
>> + * server.
>> + */
>> +struct frame {
>> +	/*
>> +	 * defval is the usual default include/exclude value that
>> +	 * should be inherited as we recurse into directories based
>> +	 * upon pattern matching of the directory itself or of a
>> +	 * containing directory.
>> +	 */
>> +	int defval;
> 
> Can this be an "unsigned defval : 1" as well? In the function below, I
> see that you assign to an "int val" first (which can take -1, 0, and 1)
> before assigning to this, so that is fine.
> 
> Also, maybe a better name would be "exclude", with the documentation:
> 
>    1 if the directory is excluded, 0 otherwise. Excluded directories will
>    still be recursed through, because an "include" rule for an object
>    might override an "exclude" rule for one of its ancestors.
> 

The name "defval" is used in unpack-trees.c during the
clear_ce_flags() recursion while looking at the exclusion list.
I was just trying to match that behavior.
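
The inheritance described in the quoted struct comment can be sketched
as below, assuming (as the review notes) that a pattern match yields
-1 when nothing matched and 0 or 1 when a pattern decided. The
function name and the array-of-matches interface are hypothetical and
are not taken from unpack-trees.c or from the patch.

```c
#include <stddef.h>

/*
 * Walk from the root toward a leaf directory. matches[i] is the
 * pattern-match result for the directory at depth i: -1 when no
 * pattern matched, otherwise an explicit 0 or 1. A directory with
 * no match inherits the default of its containing directory.
 */
static int effective_value(const int *matches, size_t depth, int root_defval)
{
	int defval = root_defval;
	size_t i;

	for (i = 0; i < depth; i++) {
		if (matches[i] >= 0)
			defval = matches[i];  /* explicit match overrides */
		/* otherwise keep the value inherited from the parent */
	}
	return defval;
}
```

An explicit match at any level replaces the inherited default for
everything beneath it, until a deeper match overrides it again.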

Thanks
Jeff

