git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] fetch-pack: write effective filter to trace2
@ 2022-07-15 17:29 Jonathan Tan
  2022-07-15 17:38 ` Jeff Hostetler
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jonathan Tan @ 2022-07-15 17:29 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Administrators of a managed Git environment (like the one at $DAYJOB)
might want to quantify the performance change of fetches with and
without partial clone from the client's point of view. Therefore, log
the effective filter being sent to the server whenever a fetch (or
clone) occurs. Note that this is not necessarily the same as what's
specified on the CLI, because during a fetch, the configured filter is
used whenever a filter is not specified on the CLI.

This is implemented for protocol v0, v1, and v2.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index cb6647d657..dec8743bec 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -392,7 +392,10 @@ static int find_common(struct fetch_negotiator *negotiator,
 	if (server_supports_filtering && args->filter_options.choice) {
 		const char *spec =
 			expand_list_objects_filter_spec(&args->filter_options);
+		trace2_data_string("fetch", the_repository, "fetch/effective-filter", spec);
 		packet_buf_write(&req_buf, "filter %s", spec);
+	} else {
+		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");
 	}
 	packet_buf_flush(&req_buf);
 	state_len = req_buf.len;
@@ -1328,9 +1331,12 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
 		const char *spec =
 			expand_list_objects_filter_spec(&args->filter_options);
 		print_verbose(args, _("Server supports filter"));
+		trace2_data_string("fetch", the_repository, "fetch/effective-filter", spec);
 		packet_buf_write(&req_buf, "filter %s", spec);
-	} else if (args->filter_options.choice) {
-		warning("filtering not recognized by server, ignoring");
+	} else {
+		if (args->filter_options.choice)
+			warning("filtering not recognized by server, ignoring");
+		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");
 	}
 
 	if (server_supports_feature("fetch", "packfile-uris", 0)) {
-- 
2.37.0.170.g444d1eabd0-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-15 17:29 [PATCH] fetch-pack: write effective filter to trace2 Jonathan Tan
@ 2022-07-15 17:38 ` Jeff Hostetler
  2022-07-15 18:28   ` Junio C Hamano
  2022-07-18 17:00 ` [PATCH v2] " Jonathan Tan
  2022-07-26 16:27 ` [PATCH v3] " Jonathan Tan
  2 siblings, 1 reply; 15+ messages in thread
From: Jeff Hostetler @ 2022-07-15 17:38 UTC (permalink / raw)
  To: Jonathan Tan, git



On 7/15/22 1:29 PM, Jonathan Tan wrote:
> Administrators of a managed Git environment (like the one at $DAYJOB)
> might want to quantify the performance change of fetches with and
> without partial clone from the client's point of view. Therefore, log
> the effective filter being sent to the server whenever a fetch (or
> clone) occurs. Note that this is not necessarily the same as what's
> specified on the CLI, because during a fetch, the configured filter is
> used whenever a filter is not specified on the CLI.
> 
> This is implemented for protocol v0, v1, and v2.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>   fetch-pack.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/fetch-pack.c b/fetch-pack.c
> index cb6647d657..dec8743bec 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -392,7 +392,10 @@ static int find_common(struct fetch_negotiator *negotiator,
>   	if (server_supports_filtering && args->filter_options.choice) {
>   		const char *spec =
>   			expand_list_objects_filter_spec(&args->filter_options);
> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", spec);
>   		packet_buf_write(&req_buf, "filter %s", spec);
> +	} else {
> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");
>   	}
>   	packet_buf_flush(&req_buf);
>   	state_len = req_buf.len;
> @@ -1328,9 +1331,12 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
>   		const char *spec =
>   			expand_list_objects_filter_spec(&args->filter_options);
>   		print_verbose(args, _("Server supports filter"));
> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", spec);
>   		packet_buf_write(&req_buf, "filter %s", spec);
> -	} else if (args->filter_options.choice) {
> -		warning("filtering not recognized by server, ignoring");
> +	} else {
> +		if (args->filter_options.choice)
> +			warning("filtering not recognized by server, ignoring");
> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");
>   	}
>   
>   	if (server_supports_feature("fetch", "packfile-uris", 0)) {
> 

This looks nice.  Thanks!
Jeff

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-15 17:38 ` Jeff Hostetler
@ 2022-07-15 18:28   ` Junio C Hamano
  2022-07-15 19:09     ` Jonathan Tan
  2022-07-18 14:08     ` Jeff Hostetler
  0 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2022-07-15 18:28 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: Jonathan Tan, git

Jeff Hostetler <git@jeffhostetler.com> writes:

> On 7/15/22 1:29 PM, Jonathan Tan wrote:
>> Administrators of a managed Git environment (like the one at $DAYJOB)
>> might want to quantify the performance change of fetches with and
>> without partial clone from the client's point of view. Therefore, log
>> the effective filter being sent to the server whenever a fetch (or
>> clone) occurs. Note that this is not necessarily the same as what's
>> specified on the CLI, because during a fetch, the configured filter is
>> used whenever a filter is not specified on the CLI.
>> This is implemented for protocol v0, v1, and v2.

Is that different to say "for all protocols"?  I am wondering if it
is worth saying (unlike in a hypothetical case where we do not
support v0 and v1 we may want to state why we only support v2).

>> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
>> ---
>>   fetch-pack.c | 10 ++++++++--
>>   1 file changed, 8 insertions(+), 2 deletions(-)
>> diff --git a/fetch-pack.c b/fetch-pack.c
>> index cb6647d657..dec8743bec 100644
>> --- a/fetch-pack.c
>> +++ b/fetch-pack.c
>> @@ -392,7 +392,10 @@ static int find_common(struct fetch_negotiator *negotiator,
>>   	if (server_supports_filtering && args->filter_options.choice) {
>>   		const char *spec =
>>   			expand_list_objects_filter_spec(&args->filter_options);
>> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", spec);
>>   		packet_buf_write(&req_buf, "filter %s", spec);
>> +	} else {
>> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");

Do we show "none" anywhere else where an expanded list objects
filter spec is expected?

I am wondering two things: 

 - The lack of this line would be a cleaner implementation of a
   signal to say "this client did not ask any filtering".

 - It would be good if we keep what report here more-or-less the
   same as what we can pass "--filter=" on the command line of
   "git pack-objects".

If "--filter=none" meant "this --filter passes everything", then
saying "none" here makes perfect sense wrt the latter, but I doubt
it is the case.

>>   	}
>>   	packet_buf_flush(&req_buf);
>>   	state_len = req_buf.len;
>> @@ -1328,9 +1331,12 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
>>   		const char *spec =
>>   			expand_list_objects_filter_spec(&args->filter_options);
>>   		print_verbose(args, _("Server supports filter"));
>> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", spec);
>>   		packet_buf_write(&req_buf, "filter %s", spec);
>> -	} else if (args->filter_options.choice) {
>> -		warning("filtering not recognized by server, ignoring");
>> +	} else {
>> +		if (args->filter_options.choice)
>> +			warning("filtering not recognized by server, ignoring");
>> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");

At the first glance, this seems to lose data, because you should be
able to use expand_list_objects_filter_spec() to report the filter
spec.  But this is reporting the filter that was actually in effect,
so it is OK.

But the same question about "none" remains.

>>   	}
>>     	if (server_supports_feature("fetch", "packfile-uris", 0)) {
>> 
>
> This looks nice.  Thanks!
> Jeff

Will queue with your Acked-by, then?

Thanks, both.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-15 18:28   ` Junio C Hamano
@ 2022-07-15 19:09     ` Jonathan Tan
  2022-07-15 20:10       ` Junio C Hamano
  2022-07-18 14:08     ` Jeff Hostetler
  1 sibling, 1 reply; 15+ messages in thread
From: Jonathan Tan @ 2022-07-15 19:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, Jeff Hostetler, git

Junio C Hamano <gitster@pobox.com> writes:
> Jeff Hostetler <git@jeffhostetler.com> writes:
> 
> > On 7/15/22 1:29 PM, Jonathan Tan wrote:
> >> Administrators of a managed Git environment (like the one at $DAYJOB)
> >> might want to quantify the performance change of fetches with and
> >> without partial clone from the client's point of view. Therefore, log
> >> the effective filter being sent to the server whenever a fetch (or
> >> clone) occurs. Note that this is not necessarily the same as what's
> >> specified on the CLI, because during a fetch, the configured filter is
> >> used whenever a filter is not specified on the CLI.
> >> This is implemented for protocol v0, v1, and v2.
> 
> Is that different to say "for all protocols"?  I am wondering if it
> is worth saying (unlike in a hypothetical case where we do not
> support v0 and v1 we may want to state why we only support v2).

I wrote it that way to avoid confusion with things like HTTP (which is a
protocol, at least in its name). Maybe better would be "This is
implemented for all protocols (v0, v1, and v2)". I'll make that change
in the commit message (after the other questions are discussed).

> >> diff --git a/fetch-pack.c b/fetch-pack.c
> >> index cb6647d657..dec8743bec 100644
> >> --- a/fetch-pack.c
> >> +++ b/fetch-pack.c
> >> @@ -392,7 +392,10 @@ static int find_common(struct fetch_negotiator *negotiator,
> >>   	if (server_supports_filtering && args->filter_options.choice) {
> >>   		const char *spec =
> >>   			expand_list_objects_filter_spec(&args->filter_options);
> >> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", spec);
> >>   		packet_buf_write(&req_buf, "filter %s", spec);
> >> +	} else {
> >> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");
> 
> Do we show "none" anywhere else where an expanded list objects
> filter spec is expected?

Hmm...no, we don't.

> I am wondering two things: 
> 
>  - The lack of this line would be a cleaner implementation of a
>    signal to say "this client did not ask any filtering".

I think the presence is important to distinguish "no filtering" versus
someone using an old Git version that does not emit such traces, but I'm
open to changing the "none" to "none-specified" or something like that.

>  - It would be good if we keep what report here more-or-less the
>    same as what we can pass "--filter=" on the command line of
>    "git pack-objects".

My intent is to report what is being sent to the server in the fetch
request.

> If "--filter=none" meant "this --filter passes everything", then
> saying "none" here makes perfect sense wrt the latter, but I doubt
> it is the case.

--filter=none does not work (a user would have to specify --no-filter),
although it conceptually makes sense. I just wanted to have something
show up to indicate that we are using a Git version that would emit a
trace.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-15 19:09     ` Jonathan Tan
@ 2022-07-15 20:10       ` Junio C Hamano
  2022-07-15 20:49         ` Jonathan Tan
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2022-07-15 20:10 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Jeff Hostetler, git

Jonathan Tan <jonathantanmy@google.com> writes:

>> >> This is implemented for protocol v0, v1, and v2.
>> 
>> Is that different to say "for all protocols"?  I am wondering if it
>> is worth saying (unlike in a hypothetical case where we do not
>> support v0 and v1 we may want to state why we only support v2).
>
> I wrote it that way to avoid confusion with things like HTTP (which is a
> protocol, at least in its name). Maybe better would be "This is
> implemented for all protocols (v0, v1, and v2)".

I still wonder if it is better left unsaid.  When adding a new
thing, it would not be noteworthy if the new thing is available no
matter what protocol is being used, no?

>> >> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", spec);
>> >>   		packet_buf_write(&req_buf, "filter %s", spec);
>> >> +	} else {
>> >> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");
>> 
>> Do we show "none" anywhere else where an expanded list objects
>> filter spec is expected?
>
> Hmm...no, we don't.
>
>> I am wondering two things: 
>> 
>>  - The lack of this line would be a cleaner implementation of a
>>    signal to say "this client did not ask any filtering".
>
> I think the presence is important to distinguish "no filtering" versus
> someone using an old Git version that does not emit such traces, but I'm
> open to changing the "none" to "none-specified" or something like that.

OK.

>>  - It would be good if we keep what report here more-or-less the
>>    same as what we can pass "--filter=" on the command line of
>>    "git pack-objects".
>
> My intent is to report what is being sent to the server in the fetch
> request.

Then I'd be OK with a design that reports "none", if we send "none"
to the server in this case.  If not, then I do not think we should.

Perhaps report an empty string or not reporting at all?  After all,
the reader knows the client version and capability in the log so it
is easy to tell between 'no filter was used' and 'too old to report
the filter', no?  I dunno.




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-15 20:10       ` Junio C Hamano
@ 2022-07-15 20:49         ` Jonathan Tan
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Tan @ 2022-07-15 20:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, Jeff Hostetler, git

Junio C Hamano <gitster@pobox.com> writes:
> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> >> >> This is implemented for protocol v0, v1, and v2.
> >> 
> >> Is that different to say "for all protocols"?  I am wondering if it
> >> is worth saying (unlike in a hypothetical case where we do not
> >> support v0 and v1 we may want to state why we only support v2).
> >
> > I wrote it that way to avoid confusion with things like HTTP (which is a
> > protocol, at least in its name). Maybe better would be "This is
> > implemented for all protocols (v0, v1, and v2)".
> 
> I still wonder if it is better left unsaid.  When adding a new
> thing, it would not be noteworthy if the new thing is available no
> matter what protocol is being used, no?

I was thinking that a reviewer would think "why is this done twice" and
wrote this comment to answer this issue in advance, but it makes sense
to just remove it. OK - I'll do that.

> > My intent is to report what is being sent to the server in the fetch
> > request.
> 
> Then I'd be OK with a design that reports "none", if we send "none"
> to the server in this case.  If not, then I do not think we should.
> 
> Perhaps report an empty string or not reporting at all?  After all,
> the reader knows the client version and capability in the log so it
> is easy to tell between 'no filter was used' and 'too old to report
> the filter', no?  I dunno.

We don't send "none" to the server. If an empty string is OK then I'll
use that, since it will make it slightly easier for analysis since we do
not have to check against the version (and also to know which version
has support for this feature). But I'm OK either way.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-15 18:28   ` Junio C Hamano
  2022-07-15 19:09     ` Jonathan Tan
@ 2022-07-18 14:08     ` Jeff Hostetler
  2022-07-18 15:53       ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Jeff Hostetler @ 2022-07-18 14:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git



On 7/15/22 2:28 PM, Junio C Hamano wrote:
> Jeff Hostetler <git@jeffhostetler.com> writes:
> 
>> On 7/15/22 1:29 PM, Jonathan Tan wrote:
...
>>> +		trace2_data_string("fetch", the_repository, "fetch/effective-filter", "none");
> 
> At the first glance, this seems to lose data, because you should be
> able to use expand_list_objects_filter_spec() to report the filter
> spec.  But this is reporting the filter that was actually in effect,
> so it is OK.
> 
> But the same question about "none" remains.

Yeah, the use of "none" gave me pause, but I didn't have a better idea
at the time.  I guess we have:
(a) requested, supported, used.
(b) "none used because the server doesn't support it" and
(c) "none used because the user didn't request it" cases,
right?

Perhaps it would be better to do:
     if (server_supports_filtering && args->filter_options.choice)
         trace2_data_string("fetch", r, "filter/effective", spec);
     else
         trace2_data_string("fetch", r, "filter/unsupported", spec);

Using two different keywords.

So that the log only contains "filter/effective" when it was actually
used.  And there is no "filter/effective" event when (for whatever
reason) it was not in effect.

Then the "filter/unsupported" event helps you with debugging.  Did they
hit a server that doesn't support filtering or did they have a typo in
their filter spec?

Then don't emit a message at all for the "not requested" case.  And you
can use the Git version number to know how to interpret it.  There are
other places where we don't bother sending messages where the value is
a zero or empty.

Jeff

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-18 14:08     ` Jeff Hostetler
@ 2022-07-18 15:53       ` Junio C Hamano
  2022-07-18 16:18         ` Jonathan Tan
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2022-07-18 15:53 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: Jonathan Tan, git

Jeff Hostetler <git@jeffhostetler.com> writes:

> Yeah, the use of "none" gave me pause, but I didn't have a better idea
> at the time.  I guess we have:
> (a) requested, supported, used.
> (b) "none used because the server doesn't support it" and
> (c) "none used because the user didn't request it" cases,
> right?

At these sites where the new traces are added, we cannot detect the
remaining case (d) "requested by the user, asked for the server, but
the server dropped the ball", I think the above covers the possible
cases that are interesting to us entirely.

> Perhaps it would be better to do:
>     if (server_supports_filtering && args->filter_options.choice)
>         trace2_data_string("fetch", r, "filter/effective", spec);
>     else
>         trace2_data_string("fetch", r, "filter/unsupported", spec);
>
> Using two different keywords.
>
> So that the log only contains "filter/effective" when it was actually
> used.  And there is no "filter/effective" event when (for whatever
> reason) it was not in effect.
>
> Then the "filter/unsupported" event helps you with debugging.  Did they
> hit a server that doesn't support filtering or did they have a typo in
> their filter spec?
>
> Then don't emit a message at all for the "not requested" case.  And you
> can use the Git version number to know how to interpret it.  There are
> other places where we don't bother sending messages where the value is
> a zero or empty.

Sounds alright.  We could standardize the other way, which might
make the interpretation of individual trace entries independent of
the context easier, though.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-18 15:53       ` Junio C Hamano
@ 2022-07-18 16:18         ` Jonathan Tan
  2022-07-18 17:56           ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Jonathan Tan @ 2022-07-18 16:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, Jeff Hostetler, git

Junio C Hamano <gitster@pobox.com> writes:
> > Using two different keywords.
> >
> > So that the log only contains "filter/effective" when it was actually
> > used.  And there is no "filter/effective" event when (for whatever
> > reason) it was not in effect.
> >
> > Then the "filter/unsupported" event helps you with debugging.  Did they
> > hit a server that doesn't support filtering or did they have a typo in
> > their filter spec?
> >
> > Then don't emit a message at all for the "not requested" case.  And you
> > can use the Git version number to know how to interpret it.  There are
> > other places where we don't bother sending messages where the value is
> > a zero or empty.
> 
> Sounds alright.  We could standardize the other way, which might
> make the interpretation of individual trace entries independent of
> the context easier, though.
> 
> Thanks.

Thanks for bringing up the use case of debugging a server that we
expected to support filtering but doesn't. As for whether we should not
send a message when the value is empty, I can see at least one reason
for not sending it - to not waste I/O and clutter the trace because of a
feature that the user is not using, but fetch is I/O-intensive enough
and having the empty message is useful enough (not only do we not need
to know which versions have this feature, but we also can be sure that
the message wasn't excluded because of some unexpected log filtering or
something like that) that I think we should have the empty message. I'll
put it in v2 but we can easily remove it if we decide that we don't want
it.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2] fetch-pack: write effective filter to trace2
  2022-07-15 17:29 [PATCH] fetch-pack: write effective filter to trace2 Jonathan Tan
  2022-07-15 17:38 ` Jeff Hostetler
@ 2022-07-18 17:00 ` Jonathan Tan
  2022-07-18 19:47   ` Junio C Hamano
  2022-07-26 16:27 ` [PATCH v3] " Jonathan Tan
  2 siblings, 1 reply; 15+ messages in thread
From: Jonathan Tan @ 2022-07-18 17:00 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, git

Administrators of a managed Git environment (like the one at $DAYJOB)
might want to quantify the performance change of fetches with and
without filters from the client's point of view, and also detect if a
server does not support it. Therefore, log the filter information being
sent to the server whenever a fetch (or clone) occurs. Note that this is
not necessarily the same as what's specified on the CLI, because during
a fetch, the configured filter is used whenever a filter is not
specified on the CLI.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Thanks everyone for your comments. I've renamed the keys and
distinguished the case in which the server does not support filters
following Jeff's suggestions.

A few small discussion points:
 - whether we need the print_verbose now that we have traces
 - should we log if no filter is specified
 - if yes, what key should it be logged under (I used "filter/none" for
   now)
---
 fetch-pack.c | 40 ++++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index cb6647d657..68820f9a1a 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -292,6 +292,29 @@ static void mark_tips(struct fetch_negotiator *negotiator,
 	return;
 }
 
+static void write_and_trace_filter(struct fetch_pack_args *args,
+				   struct strbuf *req_buf,
+				   int server_supports_filter)
+{
+	if (args->filter_options.choice) {
+		const char *spec =
+			expand_list_objects_filter_spec(&args->filter_options);
+		if (server_supports_filter) {
+			print_verbose(args, _("Server supports filter"));
+			packet_buf_write(req_buf, "filter %s", spec);
+			trace2_data_string("fetch", the_repository,
+					   "filter/effective", spec);
+		} else {
+			warning("filtering not recognized by server, ignoring");
+			trace2_data_string("fetch", the_repository,
+					   "filter/unsupported", spec);
+		}
+	} else {
+		trace2_data_string("fetch", the_repository,
+				   "filter/none", "");
+	}
+}
+
 static int find_common(struct fetch_negotiator *negotiator,
 		       struct fetch_pack_args *args,
 		       int fd[2], struct object_id *result_oid,
@@ -389,11 +412,7 @@ static int find_common(struct fetch_negotiator *negotiator,
 			packet_buf_write(&req_buf, "deepen-not %s", s->string);
 		}
 	}
-	if (server_supports_filtering && args->filter_options.choice) {
-		const char *spec =
-			expand_list_objects_filter_spec(&args->filter_options);
-		packet_buf_write(&req_buf, "filter %s", spec);
-	}
+	write_and_trace_filter(args, &req_buf, server_supports_filtering);
 	packet_buf_flush(&req_buf);
 	state_len = req_buf.len;
 
@@ -1323,15 +1342,8 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
 		die(_("Server does not support shallow requests"));
 
 	/* Add filter */
-	if (server_supports_feature("fetch", "filter", 0) &&
-	    args->filter_options.choice) {
-		const char *spec =
-			expand_list_objects_filter_spec(&args->filter_options);
-		print_verbose(args, _("Server supports filter"));
-		packet_buf_write(&req_buf, "filter %s", spec);
-	} else if (args->filter_options.choice) {
-		warning("filtering not recognized by server, ignoring");
-	}
+	write_and_trace_filter(args, &req_buf,
+			       server_supports_feature("fetch", "filter", 0));
 
 	if (server_supports_feature("fetch", "packfile-uris", 0)) {
 		int i;
-- 
2.37.0.170.g444d1eabd0-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] fetch-pack: write effective filter to trace2
  2022-07-18 16:18         ` Jonathan Tan
@ 2022-07-18 17:56           ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2022-07-18 17:56 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Jeff Hostetler, git

Jonathan Tan <jonathantanmy@google.com> writes:

> ... that the user is not using, but fetch is I/O-intensive enough
> and having the empty message is useful enough (not only do we not need
> to know which versions have this feature, but we also can be sure that
> the message wasn't excluded because of some unexpected log filtering or
> something like that) that I think we should have the empty message. I'll
> put it in v2 but we can easily remove it if we decide that we don't want
> it.

As long as it is clearly documented that "none" is given explicitly
when no filtering is requested (Jeff's point of making it possible
to tell the reason why we are not requesting is still valid), I
think it would be OK.  If we are giving enough log entries for
"fetch" operation, another "empty" message would not hurt (if the
trace for "fetch" is otherwise very quiet, then an unused "empty"
may become distracting, but I somehow do not think it is the case).

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] fetch-pack: write effective filter to trace2
  2022-07-18 17:00 ` [PATCH v2] " Jonathan Tan
@ 2022-07-18 19:47   ` Junio C Hamano
  2022-07-25 18:56     ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2022-07-18 19:47 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, git

Jonathan Tan <jonathantanmy@google.com> writes:

> +static void write_and_trace_filter(struct fetch_pack_args *args,
> +				   struct strbuf *req_buf,
> +				   int server_supports_filter)
> +{
> +	if (args->filter_options.choice) {
> +		const char *spec =
> +			expand_list_objects_filter_spec(&args->filter_options);
> +		if (server_supports_filter) {
> +			print_verbose(args, _("Server supports filter"));
> +			packet_buf_write(req_buf, "filter %s", spec);
> +			trace2_data_string("fetch", the_repository,
> +					   "filter/effective", spec);
> +		} else {
> +			warning("filtering not recognized by server, ignoring");
> +			trace2_data_string("fetch", the_repository,
> +					   "filter/unsupported", spec);
> +		}
> +	} else {
> +		trace2_data_string("fetch", the_repository,
> +				   "filter/none", "");
> +	}
> +}

The previous round already had the same issue, but this makes it
even worse by introducing a function that makes it clear that it
mixes quite unrelated two features (i.e. write the filter to the
other end, which MUST be done for correct operation of the protocol,
and write a log to trace2, which may not be even necessary when we
are not logging at all).

Can we make the code a bit better structured, I have to wonder, by
having two separate phases of processing, one for the operation
proper, and the other for the logging/debugging?

In a sense, we can say that the only purpose this helper function is
to tell the server end what the filter we use is by renaming it; it
is OK to have debugging statements and logging code as part of the
implementation of such a function.

I actually like that direction better.  A helper function may exist
*ONLY* to trace, in which case, having "trace" in its name would
make perfect sense.  A helper function may exist to perform the real
work, but it may log what it did to perform the real work as well.
I think the latter shouldn't have "trace" in its name at all, or our
helpers will all be called do_FOO_and_trace(), do_BAR_and_debug(),
etc., which is nonsense.  Just calling do_FOO() and do_BAR(), and
then having them log or trace as needed, is fine.

Will queue, anyway.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] fetch-pack: write effective filter to trace2
  2022-07-18 19:47   ` Junio C Hamano
@ 2022-07-25 18:56     ` Junio C Hamano
  2022-07-26 16:28       ` Jonathan Tan
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2022-07-25 18:56 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, git

Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> +static void write_and_trace_filter(struct fetch_pack_args *args,
>> +				   struct strbuf *req_buf,
>> +				   int server_supports_filter)
>> +{
>> +...
>> +}
>
> The previous round already had the same issue, but this makes it
> even worse by introducing a function that makes it clear that it
> mixes quite unrelated two features (i.e. write the filter to the
> other end, which MUST be done for correct operation of the protocol,
> and write a log to trace2, which may not be even necessary when we
> are not logging at all).
> ...
> In a sense, we can say that the only purpose this helper function is
> to tell the server end what the filter we use is by renaming it; it
> is OK to have debugging statements and logging code as part of the
> implementation of such a function.
>
> I actually like that direction better.  A helper function may exist
> *ONLY* to trace, in which case, having "trace" in its name would
> make perfect sense.  A helper function may exist to perform the real
> work, but it may log what it did to perform the real work as well.
> I think the latter shouldn't have "trace" in its name at all, or our
> helpers will all be called do_FOO_and_trace(), do_BAR_and_debug(),
> etc., which is nonsense.  Just calling do_FOO() and do_BAR(), and
> then having them log or trace as needed, is fine.

After waiting for a week, I still haven't seen any correction to
this patch, but do you want to give the helper function a bit more
sensible name in an updated patch, perhaps say "send_filter()" or
something?

Otherwise the topic looked good.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3] fetch-pack: write effective filter to trace2
  2022-07-15 17:29 [PATCH] fetch-pack: write effective filter to trace2 Jonathan Tan
  2022-07-15 17:38 ` Jeff Hostetler
  2022-07-18 17:00 ` [PATCH v2] " Jonathan Tan
@ 2022-07-26 16:27 ` Jonathan Tan
  2 siblings, 0 replies; 15+ messages in thread
From: Jonathan Tan @ 2022-07-26 16:27 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Administrators of a managed Git environment (like the one at $DAYJOB)
might want to quantify the performance change of fetches with and
without filters from the client's point of view, and also detect if a
server does not support it. Therefore, log the filter information being
sent to the server whenever a fetch (or clone) occurs. Note that this is
not necessarily the same as what's specified on the CLI, because during
a fetch, the configured filter is used whenever a filter is not
specified on the CLI.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Here's v3, with an updated function name.
---
Range-diff against v2:
1:  d3ea7d3b95 ! 1:  17cef49fcc fetch-pack: write effective filter to trace2
    @@ Commit message
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
         ---
    -    Thanks everyone for your comments. I've renamed the keys and
    -    distinguished the case in which the server does not support filters
    -    following Jeff's suggestions.
    -
    -    A few small discussion points:
    -     - whether we need the print_verbose now that we have traces
    -     - should we log if no filter is specified
    -     - if yes, what key should it be logged under (I used "filter/none" for
    -       now)
    +    Here's v3, with an updated function name.
     
      ## fetch-pack.c ##
     @@ fetch-pack.c: static void mark_tips(struct fetch_negotiator *negotiator,
      	return;
      }
      
    -+static void write_and_trace_filter(struct fetch_pack_args *args,
    -+				   struct strbuf *req_buf,
    -+				   int server_supports_filter)
    ++static void send_filter(struct fetch_pack_args *args,
    ++			struct strbuf *req_buf,
    ++			int server_supports_filter)
     +{
     +	if (args->filter_options.choice) {
     +		const char *spec =
    @@ fetch-pack.c: static int find_common(struct fetch_negotiator *negotiator,
     -			expand_list_objects_filter_spec(&args->filter_options);
     -		packet_buf_write(&req_buf, "filter %s", spec);
     -	}
    -+	write_and_trace_filter(args, &req_buf, server_supports_filtering);
    ++	send_filter(args, &req_buf, server_supports_filtering);
      	packet_buf_flush(&req_buf);
      	state_len = req_buf.len;
      
    @@ fetch-pack.c: static int send_fetch_request(struct fetch_negotiator *negotiator,
     -	} else if (args->filter_options.choice) {
     -		warning("filtering not recognized by server, ignoring");
     -	}
    -+	write_and_trace_filter(args, &req_buf,
    -+			       server_supports_feature("fetch", "filter", 0));
    ++	send_filter(args, &req_buf,
    ++		    server_supports_feature("fetch", "filter", 0));
      
      	if (server_supports_feature("fetch", "packfile-uris", 0)) {
      		int i;

 fetch-pack.c | 40 ++++++++++++++++++++++++++--------------
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index cb6647d657..a44eb32a71 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -292,6 +292,29 @@ static void mark_tips(struct fetch_negotiator *negotiator,
 	return;
 }
 
+static void send_filter(struct fetch_pack_args *args,
+			struct strbuf *req_buf,
+			int server_supports_filter)
+{
+	if (args->filter_options.choice) {
+		const char *spec =
+			expand_list_objects_filter_spec(&args->filter_options);
+		if (server_supports_filter) {
+			print_verbose(args, _("Server supports filter"));
+			packet_buf_write(req_buf, "filter %s", spec);
+			trace2_data_string("fetch", the_repository,
+					   "filter/effective", spec);
+		} else {
+			warning("filtering not recognized by server, ignoring");
+			trace2_data_string("fetch", the_repository,
+					   "filter/unsupported", spec);
+		}
+	} else {
+		trace2_data_string("fetch", the_repository,
+				   "filter/none", "");
+	}
+}
+
 static int find_common(struct fetch_negotiator *negotiator,
 		       struct fetch_pack_args *args,
 		       int fd[2], struct object_id *result_oid,
@@ -389,11 +412,7 @@ static int find_common(struct fetch_negotiator *negotiator,
 			packet_buf_write(&req_buf, "deepen-not %s", s->string);
 		}
 	}
-	if (server_supports_filtering && args->filter_options.choice) {
-		const char *spec =
-			expand_list_objects_filter_spec(&args->filter_options);
-		packet_buf_write(&req_buf, "filter %s", spec);
-	}
+	send_filter(args, &req_buf, server_supports_filtering);
 	packet_buf_flush(&req_buf);
 	state_len = req_buf.len;
 
@@ -1323,15 +1342,8 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
 		die(_("Server does not support shallow requests"));
 
 	/* Add filter */
-	if (server_supports_feature("fetch", "filter", 0) &&
-	    args->filter_options.choice) {
-		const char *spec =
-			expand_list_objects_filter_spec(&args->filter_options);
-		print_verbose(args, _("Server supports filter"));
-		packet_buf_write(&req_buf, "filter %s", spec);
-	} else if (args->filter_options.choice) {
-		warning("filtering not recognized by server, ignoring");
-	}
+	send_filter(args, &req_buf,
+		    server_supports_feature("fetch", "filter", 0));
 
 	if (server_supports_feature("fetch", "packfile-uris", 0)) {
 		int i;
-- 
2.37.1.359.gd136c6c3e2-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] fetch-pack: write effective filter to trace2
  2022-07-25 18:56     ` Junio C Hamano
@ 2022-07-26 16:28       ` Jonathan Tan
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Tan @ 2022-07-26 16:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git, git

Junio C Hamano <gitster@pobox.com> writes:
> After waiting for a week, I still haven't seen any correction to
> this patch, but do you want to give the helper function a bit more
> sensible name in an updated patch, perhaps say "send_filter()" or
> something?
> 
> Otherwise the topic looked good.
> 
> Thanks.

Thanks. Your proposed name looks good. I've sent out a v3.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-07-26 16:28 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-15 17:29 [PATCH] fetch-pack: write effective filter to trace2 Jonathan Tan
2022-07-15 17:38 ` Jeff Hostetler
2022-07-15 18:28   ` Junio C Hamano
2022-07-15 19:09     ` Jonathan Tan
2022-07-15 20:10       ` Junio C Hamano
2022-07-15 20:49         ` Jonathan Tan
2022-07-18 14:08     ` Jeff Hostetler
2022-07-18 15:53       ` Junio C Hamano
2022-07-18 16:18         ` Jonathan Tan
2022-07-18 17:56           ` Junio C Hamano
2022-07-18 17:00 ` [PATCH v2] " Jonathan Tan
2022-07-18 19:47   ` Junio C Hamano
2022-07-25 18:56     ` Junio C Hamano
2022-07-26 16:28       ` Jonathan Tan
2022-07-26 16:27 ` [PATCH v3] " Jonathan Tan

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).