From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>, Eric Sunshine <sunshine@sunshineco.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH v2] strbuf: add and use strbuf_insertstr()
Date: Tue, 11 Feb 2020 17:18:02 +0100 [thread overview]
Message-ID: <a5622196-1978-2fe5-144f-99edc5516dd6@web.de> (raw)
In-Reply-To: <20200210234427.GA632160@coredump.intra.peff.net>
Am 11.02.20 um 00:44 schrieb Jeff King:
> On Sun, Feb 09, 2020 at 12:36:22PM -0500, Eric Sunshine wrote:
>
>> On Sun, Feb 9, 2020 at 8:45 AM René Scharfe <l.s.r@web.de> wrote:
>>> Add a function for inserting a C string into a strbuf. Use it
>>> throughout the source to get rid of magic string length constants and
>>> explicit strlen() calls.
>>>
>>> Like strbuf_addstr(), implement it as an inline function to avoid the
>>> implicit strlen() calls to cause runtime overhead.
>>>
>>> Signed-off-by: René Scharfe <l.s.r@web.de>
>>> ---
>>> diff --git a/mailinfo.c b/mailinfo.c
>>> @@ -570,7 +570,7 @@ static int check_header(struct mailinfo *mi,
>>> len = strlen("Content-Type: ");
>>> strbuf_add(&sb, line->buf + len, line->len - len);
>>> decode_header(mi, &sb);
>>> - strbuf_insert(&sb, 0, "Content-Type: ", len);
>>> + strbuf_insertstr(&sb, 0, "Content-Type: ");
>>> handle_content_type(mi, &sb);
>>
>> Meh. We've already computed the length of "Content-Type: " a few lines
>> earlier, so taking advantage of that value when inserting the string
>> literal is perfectly sensible. Thus, I'm not convinced that this
>> change is an improvement.
>
> I had a similar thought. I kind of wonder if all of these "len" bits
> (and their repeated strings) could go away if cmp_header() just used
> skip_iprefix() under the hood and had a pointer out-parameter.
>
> Something like the (largely untested) patch below. That would also make
> it easy to support arbitrary amounts of whitespace after the header,
> which I think are allowed by the standard (whereas now we'd parse
> "Content-type: text/plain" as " text/plain", which is silly).
>
> Worth doing?
Sure.
> ---
> diff --git a/mailinfo.c b/mailinfo.c
> index b395adbdf2..bbb5b429f8 100644
> --- a/mailinfo.c
> +++ b/mailinfo.c
> @@ -346,11 +346,16 @@ static const char *header[MAX_HDR_PARSED] = {
> "From","Subject","Date",
> };
>
> -static inline int cmp_header(const struct strbuf *line, const char *hdr)
> +static inline int cmp_header(const struct strbuf *line, const char *hdr,
> + const char **outval)
> {
> - int len = strlen(hdr);
> - return !strncasecmp(line->buf, hdr, len) && line->len > len &&
> - line->buf[len] == ':' && isspace(line->buf[len + 1]);
> + const char *val;
> + if (!skip_iprefix(line->buf, hdr, &val) ||
> + *val++ != ':' ||
> + !isspace(*val++))
> + return 0;
> + *outval = val;
> + return 1;
> }
And you could rename it to skip_header() to fix the issue that its name
starts with cmp but its return value is the inverse of a cmp-style
function.
And it could take a char pointer instead of a strbuf, to reduce its
dependencies and make it more widely useful -- but that might also be
a case of YAGNI.
>
> static int is_format_patch_separator(const char *line, int len)
> @@ -547,17 +552,17 @@ static int check_header(struct mailinfo *mi,
> const struct strbuf *line,
> struct strbuf *hdr_data[], int overwrite)
> {
> - int i, ret = 0, len;
> + int i, ret = 0;
> struct strbuf sb = STRBUF_INIT;
> + const char *val;
>
> /* search for the interesting parts */
> for (i = 0; header[i]; i++) {
> - int len = strlen(header[i]);
> - if ((!hdr_data[i] || overwrite) && cmp_header(line, header[i])) {
> + if ((!hdr_data[i] || overwrite) && cmp_header(line, header[i], &val)) {
> /* Unwrap inline B and Q encoding, and optionally
> * normalize the meta information to utf8.
> */
> - strbuf_add(&sb, line->buf + len + 2, line->len - len - 2);
> + strbuf_addstr(&sb, val);
That assumes the header value never contains NULs. Valid?
> decode_header(mi, &sb);
> handle_header(&hdr_data[i], &sb);
> ret = 1;
> @@ -566,26 +571,22 @@ static int check_header(struct mailinfo *mi,
> }
>
> /* Content stuff */
> - if (cmp_header(line, "Content-Type")) {
> - len = strlen("Content-Type: ");
> - strbuf_add(&sb, line->buf + len, line->len - len);
> + if (cmp_header(line, "Content-Type", &val)) {
> + strbuf_addstr(&sb, val);
> decode_header(mi, &sb);
> - strbuf_insert(&sb, 0, "Content-Type: ", len);
> handle_content_type(mi, &sb);
> ret = 1;
> goto check_header_out;
> }
> - if (cmp_header(line, "Content-Transfer-Encoding")) {
> - len = strlen("Content-Transfer-Encoding: ");
> - strbuf_add(&sb, line->buf + len, line->len - len);
> + if (cmp_header(line, "Content-Transfer-Encoding", &val)) {
> + strbuf_addstr(&sb, val);
> decode_header(mi, &sb);
> handle_content_transfer_encoding(mi, &sb);
> ret = 1;
> goto check_header_out;
> }
> - if (cmp_header(line, "Message-Id")) {
> - len = strlen("Message-Id: ");
> - strbuf_add(&sb, line->buf + len, line->len - len);
> + if (cmp_header(line, "Message-Id", &val)) {
> + strbuf_addstr(&sb, val);
> decode_header(mi, &sb);
> if (mi->add_message_id)
> mi->message_id = strbuf_detach(&sb, NULL);
The repeated sequence cmp_header()+strbuf_add{,str}()+decode_header()
makes me itchy.
> @@ -607,8 +608,9 @@ static int is_inbody_header(const struct mailinfo *mi,
> const struct strbuf *line)
> {
> int i;
> + const char *val;
> for (i = 0; header[i]; i++)
> - if (!mi->s_hdr_data[i] && cmp_header(line, header[i]))
> + if (!mi->s_hdr_data[i] && cmp_header(line, header[i], &val))
> return 1;
> return 0;
> }
>
next prev parent reply other threads:[~2020-02-11 16:18 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-08 19:56 [PATCH] strbuf: add and use strbuf_insertstr() René Scharfe
2020-02-08 23:08 ` Taylor Blau
2020-02-09 10:23 ` René Scharfe
2020-02-09 0:53 ` Eric Sunshine
2020-02-09 10:23 ` René Scharfe
2020-02-09 13:44 ` [PATCH v2] " René Scharfe
2020-02-09 17:36 ` Eric Sunshine
2020-02-09 18:28 ` René Scharfe
2020-02-09 21:09 ` Eric Sunshine
2020-02-09 23:10 ` Taylor Blau
2020-02-10 23:44 ` Jeff King
2020-02-11 16:17 ` Junio C Hamano
2020-02-11 17:16 ` [PATCH 0/4] some more mailinfo cleanups Jeff King
2020-02-11 17:18 ` [PATCH 1/4] mailinfo: treat header values as C strings Jeff King
2020-02-11 17:26 ` Eric Sunshine
2020-02-11 17:19 ` [PATCH 2/4] mailinfo: simplify parsing of header values Jeff King
2020-02-11 17:19 ` [PATCH 3/4] mailinfo: be more liberal with header whitespace Jeff King
2020-02-11 17:20 ` [PATCH 4/4] mailinfo: factor out some repeated header handling Jeff King
2020-02-11 16:18 ` René Scharfe [this message]
2020-02-11 17:13 ` [PATCH v2] strbuf: add and use strbuf_insertstr() Jeff King
2020-02-10 7:15 ` [PATCH 2/1] mailinfo: don't insert header prefix for handle_content_type() René Scharfe
2020-02-10 17:27 ` Junio C Hamano
2020-02-10 19:55 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a5622196-1978-2fe5-144f-99edc5516dd6@web.de \
--to=l.s.r@web.de \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).