git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: "John Cai via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, John Cai <johncai86@gmail.com>
Subject: Re: [PATCH] cat-file: skip expanding default format
Date: Sun, 06 Mar 2022 22:11:03 -0800
Message-ID: <xmqqilsquwaw.fsf@gitster.g>
In-Reply-To: <xmqqmti2uwzr.fsf@gitster.g> (Junio C. Hamano's message of "Sun, 06 Mar 2022 21:56:08 -0800")

Junio C Hamano <gitster@pobox.com> writes:

> "John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: John Cai <johncai86@gmail.com>
>>
>> When format is passed into --batch, --batch-check, --batch-command,
>> the format gets expanded. When nothing is passed in, the default format
>> is set and the expand_format() gets called.
>>
>> We can save on these cycles by hardcoding how to print the
>> information when nothing is passed as the format, or when the default
>> format is passed. There is no need for the fully expanded format with
>> the default. Since batch_object_write() happens on every object provided
>> in batch mode, we get a nice performance improvement.
>
> That is OK in principle, but ...
>
>> +	if (!opt->format && !opt->print_contents) {
>> +		char buf[1024];
>> +
>> +		print_default_format(buf, 1024, data);
>> +		batch_write(opt, buf, strlen(buf));
>> +		goto cleanup;
>> +	}
>> +
>> +	fmt = opt->format ? opt->format : default_format;
>
> ... instead of doing this, wouldn't it be nicer to base the decision
> to call print_default_format() on purely the contents of the format,
> i.e.
>
> 	fmt = opt->format ? opt->format : default_format;
> 	if (!strcmp(fmt, DEFAULT_FORMAT) && !opt->print_contents) {
> 		... the above print_default_format() call block here ...
> 		goto cleanup;
> 	}
>
> where DEFAULT_FORMAT is 
>
> #define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
>
> and
>
>> @@ -515,9 +543,7 @@ static int batch_objects(struct batch_options *opt)
>>  	struct expand_data data;
>>  	int save_warning;
>>  	int retval = 0;
>> -
>> -	if (!opt->format)
>> -		opt->format = "%(objectname) %(objecttype) %(objectsize)";
>
> retain the defaulting with
>
> 	if (!opt->format)
> 		opt->format = DEFAULT_FORMAT;
>
> instead of making opt->format == NULL to mean something special?
>
> That way, even if the user-input happens to name the format that is
> identical to DEFAULT_FORMAT, because we only care what the format
> is, and not where the format comes from, we will get the same
> optimization.  Wouldn't it make more sense?

Actually, doing that literally and naively would not be a good idea.
The special-case code is inside batch_object_write(), which is
called once for each object, and because the format does not change
between calls, doing strcmp() every time is wasteful.  The same is
true for

	fmt = opt->format ? opt->format : default_format;

as opt->format will not change across calls to this function.

So, if we were to do this optimization:

 * we key on the fact that opt->format is NULL to trigger the
   optimization inside batch_object_write(), so that we do not have
   to strcmp(DEFAULT_FORMAT, fmt) for each and every object.

 * a while loop in batch_objects() or for_each_*_object() calls is
   what calls batch_object_write() for each object.  So somewhere
   early in that function (or before we enter the function), we can
   check opt->format and

    - if it is NULL, we can leave it NULL.
    - if it is the same as DEFAULT_FORMAT, clear it to NULL.

   so that the optimization in batch_object_write() can cheaply kick
   in.

would be a good way to go, perhaps?
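
To illustrate the two bullet points above, here is a minimal,
standalone sketch.  It is not the actual cat-file code: the struct
batch_options_sketch type and the normalize_format(),
uses_fast_path() and print_default_line() helpers are hypothetical
names made up for this example, and the "<name> <type> <size>" output
line is assumed from the default format string.

```c
#include <stdio.h>
#include <string.h>

#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"

/*
 * Hypothetical stand-in for the real options struct; only the two
 * fields discussed in this thread are modeled.
 */
struct batch_options_sketch {
	const char *format;	/* NULL means "default format" */
	int print_contents;
};

/*
 * Called once, before the per-object loop.  A user-supplied format
 * that is identical to the default is cleared to NULL, so that the
 * per-object code never needs to call strcmp().
 */
static void normalize_format(struct batch_options_sketch *opt)
{
	if (opt->format && !strcmp(opt->format, DEFAULT_FORMAT))
		opt->format = NULL;
}

/* Per-object check: a pointer test and an int test, no strcmp(). */
static int uses_fast_path(const struct batch_options_sketch *opt)
{
	return !opt->format && !opt->print_contents;
}

/*
 * Hardcoded rendering of the default format (assumed layout).
 * Returns what snprintf() returns.
 */
static int print_default_line(char *buf, size_t len, const char *oid_hex,
			      const char *type, unsigned long size)
{
	return snprintf(buf, len, "%s %s %lu\n", oid_hex, type, size);
}
```

With something like this, the caller would run normalize_format()
once before entering the loop, and the per-object function would only
test uses_fast_path() before deciding whether to expand the format.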
