From: John Cai <johncai86@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: John Cai via GitGitGadget <gitgitgadget@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] cat-file: skip expanding default format
Date: Mon, 07 Mar 2022 12:41:52 -0500
Message-ID: <68505E7E-AEED-4DA7-A70F-8B4FE214C05D@gmail.com>
In-Reply-To: <xmqqilsquwaw.fsf@gitster.g>

Hi Junio,

On 7 Mar 2022, at 1:11, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> "John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> From: John Cai <johncai86@gmail.com>
>>>
>>> When format is passed into --batch, --batch-check, --batch-command,
>>> the format gets expanded. When nothing is passed in, the default format
>>> is set and the expand_format() gets called.
>>>
>>> We can save on these cycles by hardcoding how to print the
>>> information when nothing is passed as the format, or when the default
>>> format is passed. There is no need for the fully expanded format with
>>> the default. Since batch_object_write() happens on every object provided
>>> in batch mode, we get a nice performance improvement.
>>
>> That is OK in principle, but ...
>>
>>> +	if (!opt->format && !opt->print_contents) {
>>> +		char buf[1024];
>>> +
>>> +		print_default_format(buf, 1024, data);
>>> +		batch_write(opt, buf, strlen(buf));
>>> +		goto cleanup;
>>> +	}
>>> +
>>> +	fmt = opt->format ? opt->format : default_format;
>>
>> ... instead of doing this, wouldn't it be nicer to base the decision
>> to call print_default_format() on purely the contents of the format,
>> i.e.
>>
>>	fmt = opt->format ? opt->format : default_format;
>>	if (!strcmp(fmt, DEFAULT_FORMAT) && !opt->print_contents) {
>>		... the above print_default_format() call block here ...
>>		goto cleanup;
>>	}
>>
>> where DEFAULT_FORMAT is
>>
>>	#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
>>
>> and
>>
>>> @@ -515,9 +543,7 @@ static int batch_objects(struct batch_options *opt)
>>>  	struct expand_data data;
>>>  	int save_warning;
>>>  	int retval = 0;
>>> -
>>> -	if (!opt->format)
>>> -		opt->format = "%(objectname) %(objecttype) %(objectsize)";
>>
>> retain the defaulting with
>>
>>	if (!opt->format)
>>		opt->format = DEFAULT_FORMAT;
>>
>> instead of making opt->format == NULL to mean something special?
>>
>> That way, even if the user-input happens to name the format that is
>> identical to DEFAULT_FORMAT, because we only care what the format
>> is, and not where the format comes from, we will get the same
>> optimization. Wouldn't it make more sense?
>
> Actually, doing that literally and naively would not be a good idea,
> as the special case code is inside batch_object_write() that is
> called once per each object, and because the format used will not
> change for each call, doing strcmp() every time is wasteful. The
> same is true for
>
> fmt = opt->format ? opt->format : default_format;
>
> as opt->format will not change across calls to this function.
>
> So, if we were to do this optimization:
>
>  * we key on the fact that opt->format is NULL to trigger the
>    optimization inside batch_object_write(), so that we do not have
>    to strcmp(DEFAULT_FORMAT, fmt) for each and every object.
>
>  * a while loop in batch_objects() or for_each_*_object() calls is
>    what calls batch_object_write() for each object. So somewhere
>    early in that function (or before we enter the function), we can
>    check opt->format and
>
>    - if it is NULL, we can leave it NULL.
>    - if it is the same as DEFAULT_FORMAT, clear it to NULL.
>
>    so that the optimization in batch_object_write() can cheaply kick
>    in.
>
> would be a good way to go, perhaps?
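Thanks for looking into this. Yeah, I think the approach you outlined makes
sense: normalizing opt->format once up front keeps the per-object check in
batch_object_write() down to a NULL test instead of a strcmp().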
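
For concreteness, here is a rough, standalone sketch of how that split
could look. It is not the actual cat-file code: the struct and function
names ending in _sketch, and normalize_format(), are stand-ins made up for
illustration, and the "name type size" output line is only my assumption of
what the hardcoded default would print.

```c
#include <stdio.h>
#include <string.h>

#define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"

/* Hypothetical stand-in for struct batch_options. */
struct batch_options_sketch {
	const char *format;	/* NULL means "take the hardcoded default path" */
	int print_contents;
};

/* Hypothetical stand-in for the per-object data. */
struct object_info_sketch {
	const char *oid_hex;
	const char *type_name;
	unsigned long size;
};

/*
 * Run once, before the per-object loop: a format that was spelled out
 * but equals the default is collapsed to NULL, so the strcmp() is paid
 * a single time rather than once per object.
 */
void normalize_format(struct batch_options_sketch *opt)
{
	if (opt->format && !strcmp(opt->format, DEFAULT_FORMAT))
		opt->format = NULL;
}

/*
 * Per-object step. Returns 1 if the hardcoded default was written into
 * buf, or 0 if the caller still has to expand opt->format itself.
 */
int write_object_sketch(const struct batch_options_sketch *opt,
			const struct object_info_sketch *data,
			char *buf, size_t len)
{
	if (!opt->format && !opt->print_contents) {
		snprintf(buf, len, "%s %s %lu\n",
			 data->oid_hex, data->type_name, data->size);
		return 1;
	}
	return 0;
}
```

With this shape, an explicitly passed default format and an omitted format
both end up on the same fast path, and any other format falls through to
the usual expansion.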