git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Matthew DeVore <matvore@comcast.net>
To: Jonathan Tan <jonathantanmy@google.com>, matvore@google.com
Cc: git@vger.kernel.org, sbeller@google.com, git@jeffhostetler.com,
	jeffhost@microsoft.com, peff@peff.net, stefanbeller@gmail.com,
	pclouds@gmail.com
Subject: Re: [PATCH v2 1/2] list-objects-filter: teach tree:# how to handle >0
Date: Tue, 8 Jan 2019 11:22:55 -0800	[thread overview]
Message-ID: <54fba0d3-4b8e-1faf-4b2d-e67c1f5fbf02@comcast.net> (raw)
In-Reply-To: <20190108015631.22727-1-jonathantanmy@google.com>


On 2019/01/07 17:56, Jonathan Tan wrote:
>>   	case LOFS_END_TREE:
>>   		assert(obj->type == OBJ_TREE);
>> +		filter_data->current_depth--;
>>   		return LOFR_ZERO;
>>   
>> +	case LOFS_BLOB:
>> +		filter_trees_update_omits(obj, filter_data, include_it);
>> +		return include_it ? LOFR_MARK_SEEN | LOFR_DO_SHOW : LOFR_ZERO;
> Any reason for moving "case LOFS_BLOB" (and "case LOFS_BEGIN_TREE"
> below) after LOFS_END_TREE?

I put LOFS_BLOB and after LOFS_END_TREE since that is the order in all 
the other filter logic functions. I put LOFS_BEGIN_TREE at the end 
(which is different from the other filter logic functions) because it's 
usually better to put simpler things before longer or more complex 
things. LOFS_BEGIN_TREE is much more complex and if it were not the last 
switch section, it would tend to hide the sections that come after it.

FWIW, I consider this the coding corollary of the end-weight problem in 
linguistics - see https://www.thoughtco.com/end-weight-grammar-1690594 - 
this is not my original idea, but something from the book Perl Best 
Practices, although that book only mentioned it in the context of 
ordering clauses in single statements rather than ordering entire blocks.

>
> This is drastically different from the previous case, but this makes
> sense - previously, all blobs accessed through traversal were not shown,
> but now they are sometimes shown.
Yes.
> Here, filter_trees_update_omits() is
> only ever used to remove a blob from the omits set, since once this blob
> is encountered with include_it == true, it is marked as LOFR_MARK_SEEN
> and will not be traversed again.
It is possible that include_it can be false and then in a later 
invocation it can be true. In that case, the blob will be added to the 
set and then removed from it.
>
>> +	case LOFS_BEGIN_TREE:
>> +		seen_info = oidmap_get(
>> +			&filter_data->seen_at_depth, &obj->oid);
>> +		if (!seen_info) {
>> +			seen_info = xcalloc(1, sizeof(struct seen_map_entry));
> Use sizeof(*seen_info).
Done.
>
>> +			seen_info->base.oid = obj->oid;
> We have been using oidcpy, but come to think of it, I'm not sure why...
Because the hash algorithm in use may not use the entire structure, 
apparently. Or there are future improvements planned to the function and 
they need to be picked up by all current hash-copying operations. Fixed.
>
>> +			seen_info->depth = filter_data->current_depth;
>> +			oidmap_put(&filter_data->seen_at_depth, seen_info);
>> +			already_seen = 0;
>> +		} else
>> +			already_seen =
>> +				filter_data->current_depth >= seen_info->depth;
> There has been recently some clarification that if one branch of an
> if/else construct requires braces, braces should be put on all of them:
> 1797dc5176 ("CodingGuidelines: clarify multi-line brace style",
> 2017-01-17). Likewise below.
Done, thank you - that's good to know.
>
>> +		if (already_seen)
>> +			filter_res = LOFR_SKIP_TREE;
>> +		else {
>> +			seen_info->depth = filter_data->current_depth;
>> +			filter_trees_update_omits(obj, filter_data, include_it);
>> +
>> +			if (include_it)
>> +				filter_res = LOFR_DO_SHOW;
>> +			else if (filter_data->omits)
>> +				filter_res = LOFR_ZERO;
>> +			else
>> +				filter_res = LOFR_SKIP_TREE;
> Looks straightforward. If we have already seen it at a shallower or
> equal depth, we can skip it (since we have already done the appropriate
> processing). Otherwise, we need to ensure that its "omit" is correctly
> set, and:
>   - show it if include_it
>   - don't do anything special if not include_it and we need the omit set
>   - skip the tree if not include_it and we don't need the omit set
Right.
>
>> +static void filter_trees_free(void *filter_data) {
>> +	struct filter_trees_depth_data* d = filter_data;
>> +	oidmap_free(&d->seen_at_depth, 1);
>> +	free(d);
>> +}
> Check for NULL-ness of filter_data too, to match the usual behavior of
> free functions.
>
Done.
>> diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
>> index eb32505a6e..54e7096d40 100755
>> --- a/t/t6112-rev-list-filters-objects.sh
>> +++ b/t/t6112-rev-list-filters-objects.sh
> [snip]
>
> Thanks for the tests that cover quite a wide range of cases. Can you
> also demonstrate the case where a blob would normally be omitted
> (because it is too deep) but it is directly specified, so it is
> included.

I didn't exactly use TDD, but I did try to cover every line of code as 
well as both branches of each ternary operator.

Added such a test.

>
>> +expect_has_with_different_name () {
>> +	repo=$1 &&
>> +	name=$2 &&
>> +
>> +	hash=$(git -C $repo rev-parse HEAD:$name) &&
>> +	! grep "^$hash $name$" actual &&
>> +	grep "^$hash " actual &&
>> +	! grep "~$hash" actual
>> +}
> Should we also check that a "~" entry appears with $name?

I don't believe there is a way to get the object names to appear next to 
~ entries (note that the names are not saved in the omits oidset).

For your reference, here is an interdiff for this particular patch after 
applying your comments:

--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -158,18 +158,19 @@ static enum list_objects_filter_result 
filter_trees_depth(
          seen_info = oidmap_get(
              &filter_data->seen_at_depth, &obj->oid);
          if (!seen_info) {
-            seen_info = xcalloc(1, sizeof(struct seen_map_entry));
-            seen_info->base.oid = obj->oid;
+            seen_info = xcalloc(1, sizeof(*seen_info));
+            oidcpy(&seen_info->base.oid, &obj->oid);
              seen_info->depth = filter_data->current_depth;
              oidmap_put(&filter_data->seen_at_depth, seen_info);
              already_seen = 0;
-        } else
+        } else {
              already_seen =
                  filter_data->current_depth >= seen_info->depth;
+        }

-        if (already_seen)
+        if (already_seen) {
              filter_res = LOFR_SKIP_TREE;
-        else {
+        } else {
              int been_omitted = filter_trees_update_omits(
                  obj, filter_data, include_it);
              seen_info->depth = filter_data->current_depth;
@@ -193,6 +194,8 @@ static enum list_objects_filter_result 
filter_trees_depth(

  static void filter_trees_free(void *filter_data) {
      struct filter_trees_depth_data* d = filter_data;
+    if (!d)
+        return;
      oidmap_free(&d->seen_at_depth, 1);
      free(d);
  }
diff --git a/t/'t6112-rev-list-filters-objects.sh 
b/t/'t6112-rev-list-filters-objects.sh
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/t/t6112-rev-list-filters-objects.sh 
b/t/t6112-rev-list-filters-objects.sh
index 18b0b14d5a..d6edad6a01 100755
--- a/t/t6112-rev-list-filters-objects.sh
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -407,6 +407,13 @@ test_expect_success 'tree:<depth> where we iterate 
over tree at two levels' '
      expect_has_with_different_name r5 a/subdir/b/foo
  '

+test_expect_success 'tree:<depth> which filters out blob but given as 
arg' '
+    export blob_hash=$(git -C r4 rev-parse HEAD:subdir/bar) &&
+
+    git -C r4 rev-list --objects --filter=tree:1 HEAD $blob_hash >actual &&
+    grep ^$blob_hash actual
+'
+
  # Delete some loose objects and use rev-list, but WITHOUT any filtering.
  # This models previously omitted objects that we did not receive.





  reply	other threads:[~2019-01-08 19:23 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-10 23:40 [PATCH v2 0/2] support for filtering trees and blobs based on depth Matthew DeVore
2018-12-10 23:40 ` [PATCH v2 1/2] list-objects-filter: teach tree:# how to handle >0 Matthew DeVore
2019-01-08  1:56   ` Jonathan Tan
2019-01-08 19:22     ` Matthew DeVore [this message]
2019-01-08 23:19       ` Jonathan Tan
2019-01-08 23:36         ` Junio C Hamano
2019-01-08 23:41           ` Jonathan Tan
2019-01-08 23:39     ` Junio C Hamano
2019-01-09  2:43       ` MATTHEW DEVORE
2018-12-10 23:40 ` [PATCH v2 2/2] tree:<depth>: skip some trees even when collecting omits Matthew DeVore
2019-01-08  2:00   ` Jonathan Tan
2019-01-08 23:22     ` Jonathan Tan
2019-01-09  2:47       ` MATTHEW DEVORE
2019-01-09  0:29     ` Matthew DeVore
2018-12-11  8:45 ` [PATCH v2 0/2] support for filtering trees and blobs based on depth Junio C Hamano
2019-01-08  0:56   ` Matthew DeVore
2019-01-09  2:59 ` [PATCH v3 " Matthew DeVore
2019-01-09  2:59   ` [PATCH v3 1/2] list-objects-filter: teach tree:# how to handle >0 Matthew DeVore
2019-01-09  2:59   ` [PATCH v3 2/2] tree:<depth>: skip some trees even when collecting omits Matthew DeVore
2019-01-09 18:06   ` [PATCH v3 0/2] support for filtering trees and blobs based on depth Jonathan Tan
2019-01-15 23:35     ` Junio C Hamano
2019-01-15 23:41       ` Junio C Hamano
2019-01-17  0:14         ` Matthew DeVore
2019-01-17 18:44           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54fba0d3-4b8e-1faf-4b2d-e67c1f5fbf02@comcast.net \
    --to=matvore@comcast.net \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=jeffhost@microsoft.com \
    --cc=jonathantanmy@google.com \
    --cc=matvore@google.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).