git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Phillip Wood <phillip.wood123@gmail.com>
Cc: "Michał Kępień" <michal@isc.org>, git@vger.kernel.org
Subject: Re: [PATCH v3 2/2] diff: add -I<regex> that ignores matching changes
Date: Fri, 16 Oct 2020 11:04:02 -0700	[thread overview]
Message-ID: <xmqqeelycajx.fsf@gitster.c.googlers.com> (raw)
In-Reply-To: <afd3b1cf-b883-6df5-bea5-28f8e06d8702@gmail.com> (Phillip Wood's message of "Fri, 16 Oct 2020 16:32:47 +0100")

Phillip Wood <phillip.wood123@gmail.com> writes:

>> @@ -6405,6 +6428,11 @@ void diff_flush(struct diff_options *options)
>>   	DIFF_QUEUE_CLEAR(q);
>>   	if (options->close_file)
>>   		fclose(options->file);
>> +	for (i = 0; i < options->ignore_regex_nr; i++) {
>> +		regfree(options->ignore_regex[i]);
>> +		free(options->ignore_regex[i]);
>> +	}
>> +	free(options->ignore_regex);
>
> If I run `git log -p -I foo` then the address sanitizer reports
>
> AddressSanitizer: heap-use-after-free xdiff/xdiffi.c:1027 in
> record_matches_regex
>
> after it has printed the diff for the first commit. I think freeing
> the regex here is the cause of the problem.

Yes, this is a good location to clear the diff_queue, which is
per-invocation.  But diff_options->ignore_regex[] can and should be
shared across multiple invocations of the diff machinery, as we are
parsing upfront in the command line option parser just once to be
used throughout the life of the program.  It is wrong to free them
early like this.

I really hate to suggest this, one common API call made into the
diff machinery from various consumers, when they are absolutely done
with diff_options, is probably diff_result_code().  Its purpose is
to collect bits computed to participate in the overall result into
the final result code and anybody who ran one or more diff and wants
to learn the outcome must call it, so it is unlikely that callers
that matter are missing a call to finalize their uses of the diff
machinery.  So if freeing .ignore_regex[] is really necessary, it
could be used to tell us where to do so [*1*].

Unlike per-invocation resources that MUST be freed for repeated
invocations of the diff machinery made in "git log" family,
resources held for and repeatedly used during the entire session
like this may not have to be freed, to be honest, though.

Thanks.


[Footnote]

*1* Adding calls to free() in diff_result_code() directly is
    probably not a good idea.  Some callers may not even need result
    code (e.g.  "blame") and do not call it at all, but we may want
    to free the resource in diff_options that is not per-invocation.
    So if we were to do this, we probably need a new API function,
    perhaps diff_options_clear() or something, and call that from
    diff_result_code(), and then find callers that do not care about
    the result code and call diff_options_clear() directly from
    them.

    If a caller does a single diff_setup_done(), makes repeated
    calls to the diff machinery, and calls diff_result_code() once
    per each iteration (e.g. imagine "git log -p" that detects the
    status for each commit), then having the diff_options_clear()
    call in diff_result_code() will break the machinery, so if we
    were to follow the outline given in the previous paragraph, we
    need to make sure there is no such caller.  But I see that
    diff_result_code() does not clear the fields it looks at in the
    diff_options fields after it uses them, so the second and
    subsequent calls into the diff machinery by such a caller may
    already be broken (not in the "resource leakage" sense, but in
    the correctness sense).
    

  reply	other threads:[~2020-10-16 18:04 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-01 12:06 [PATCH 0/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-01 12:06 ` [PATCH 1/2] " Michał Kępień
2020-10-01 18:21   ` Junio C Hamano
2020-10-07 19:48     ` Michał Kępień
2020-10-07 20:08       ` Junio C Hamano
2020-10-01 12:06 ` [PATCH 2/2] t: add -I<regex> tests Michał Kępień
2020-10-01 17:02 ` [PATCH 0/2] diff: add -I<regex> that ignores matching changes Junio C Hamano
2020-10-12  9:17 ` [PATCH v2 0/3] " Michał Kępień
2020-10-12  9:17   ` [PATCH v2 1/3] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-12 11:14     ` Johannes Schindelin
2020-10-12 17:09       ` Junio C Hamano
2020-10-12 19:52     ` Junio C Hamano
2020-10-13  6:35       ` Michał Kępień
2020-10-12  9:17   ` [PATCH v2 2/3] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-12 11:20     ` Johannes Schindelin
2020-10-12 20:00       ` Junio C Hamano
2020-10-12 20:39         ` Johannes Schindelin
2020-10-12 21:43           ` Junio C Hamano
2020-10-13  6:37             ` Michał Kępień
2020-10-13 15:49               ` Junio C Hamano
2020-10-13  6:36       ` Michał Kępień
2020-10-13 12:02         ` Johannes Schindelin
2020-10-13 15:53           ` Junio C Hamano
2020-10-13 18:45           ` Michał Kępień
2020-10-12 18:01     ` Junio C Hamano
2020-10-13  6:38       ` Michał Kępień
2020-10-12 20:04     ` Junio C Hamano
2020-10-13  6:38       ` Michał Kępień
2020-10-12  9:17   ` [PATCH v2 3/3] t: add -I<regex> tests Michał Kępień
2020-10-12 11:49     ` Johannes Schindelin
2020-10-13  6:38       ` Michał Kępień
2020-10-13 12:00         ` Johannes Schindelin
2020-10-13 16:00           ` Junio C Hamano
2020-10-13 19:01           ` Michał Kępień
2020-10-15 11:45             ` Johannes Schindelin
2020-10-15  7:24   ` [PATCH v3 0/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-15  7:24     ` [PATCH v3 1/2] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-15  7:24     ` [PATCH v3 2/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-16 15:32       ` Phillip Wood
2020-10-16 18:04         ` Junio C Hamano [this message]
2020-10-19  9:48           ` Michał Kępień
2020-10-16 18:16       ` Junio C Hamano
2020-10-19  9:55         ` Michał Kępień
2020-10-19 17:29           ` Junio C Hamano
2020-10-16 10:00     ` [PATCH v3 0/2] " Johannes Schindelin
2020-10-20  6:48     ` [PATCH v4 " Michał Kępień
2020-10-20  6:48       ` [PATCH v4 1/2] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-20  6:48       ` [PATCH v4 2/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2021-02-05 14:13       ` [PATCH 1/2] diff: add an API for deferred freeing Ævar Arnfjörð Bjarmason
2021-02-10 16:00         ` Johannes Schindelin
2021-02-11  3:00           ` Ævar Arnfjörð Bjarmason
2021-02-11  9:40             ` Johannes Schindelin
2021-02-11 10:21               ` Jeff King
2021-02-11 10:45                 ` [PATCH v2 0/2] " Ævar Arnfjörð Bjarmason
2021-02-11 10:45                 ` [PATCH v2 1/2] " Ævar Arnfjörð Bjarmason
2021-02-11 10:45                 ` [PATCH v2 2/2] diff: plug memory leak from regcomp() on {log,diff} -I Ævar Arnfjörð Bjarmason
2021-02-05 14:13       ` [PATCH " Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqeelycajx.fsf@gitster.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=michal@isc.org \
    --cc=phillip.wood123@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).