git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Michał Kępień" <michal@isc.org>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v2 0/2] diff: add an API for deferred freeing
Date: Thu, 11 Feb 2021 11:45:33 +0100
Message-ID: <20210211104535.16076-1-avarab@gmail.com>
In-Reply-To: <YCUFNVj7qlt9wzlX@coredump.intra.peff.net>

This version drops the rename of the "close_file" flag, and updates
the code and commit messages in response to Johannes's feedback on
v1. Hopefully this is all better now.

Ævar Arnfjörð Bjarmason (2):
  diff: add an API for deferred freeing
  diff: plug memory leak from regcomp() on {log,diff} -I

 builtin/log.c | 23 ++++++++++++-----------
 diff.c        | 32 ++++++++++++++++++++++++++++----
 diff.h        | 15 ++++++++++++++-
 log-tree.c    | 10 ++++++----
 4 files changed, 60 insertions(+), 20 deletions(-)

Range-diff:
1:  531fed77f4c ! 1:  045d3f72d15 diff: add an API for deferred freeing
    @@ Commit message
         by setting "no_free" in "diff_options".
     
         This is required because when e.g. "git diff" is run we'll allocate
    -    things in that struct, use the diff machinery once, and then exit, but
    -    if we run e.g. "git log -p" we're going to re-use what we allocated
    -    across multiple diff_flush() calls, and only want to free things at
    -    the end.
    +    things in that struct, use the diff machinery once, and then exit.
    +
    +    But if we run e.g. "git log -p" we're going to re-use what we
    +    allocated across multiple diff_flush() calls, and only want to free
    +    things at the end.
     
         We've thus ended up with features like the recently added "diff -I"[1]
         where we'll leak memory. As it turns out it could have simply used the
    @@ Commit message
         the diffopt.close_file attribute, 2016-06-22).
     
         Manually adding more such flags to things log_tree_commit() every time
    -    we need to allocate something would be tedious.
    -
    -    Let's instead move that fclose() code it to a new diff_free(), in
    -    anticipation of freeing more things in that function in follow-up
    -    commits. I'm renaming the "close_file" struct member to "fclose_file"
    -    for the ease of validating this, we can be certain that these are all
    -    the relevant callsites.
    +    we need to allocate something would be tedious. Let's instead move
    +    that fclose() code it to a new diff_free(), in anticipation of freeing
    +    more things in that function in follow-up commits.
     
         Some functions such as log_tree_commit() need an idiom of optionally
         retaining a previous "no_free", as they may either free the memory
         themselves, or their caller may do so. I'm keeping that idiom in
    -    log_show_early() even though I don't think it's currently called in
    -    this manner, since it also gets passed an existing "struct rev_info"..
    +    log_show_early() for good measure, even though I don't think it's
    +    currently called in this manner. It also gets passed an existing
    +    "struct rev_info", so future callers may want to set the "no_free"
    +    flag.
    +
    +    This change is a bit hard to read because while the freeing pattern
    +    we're introducing isn't unusual, the "file" member is a special
    +    snowflake. We usually don't want to fclose() it. This is because
    +    "file" is usually stdout, in which case we don't want to fclose()
    +    it. We only want to opt-in to closing it when we e.g. open a file on
    +    the filesystem. Thus the opt-in "close_file" flag.
    +
    +    So the API in general just needs a "no_free" flag to defer freeing,
    +    but the "file" member still needs its "close_file" flag. This is made
    +    more confusing because while refactoring this code we could replace
    +    some "close_file=0" with "no_free=1", whereas others need to set both
    +    flags.
    +
    +    This is because there were some cases where an existing "close_file=0"
    +    meant "let's defer deallocation", and others where it meant "we don't
    +    want to close this file handle at all".
     
         1. 296d4a94e7 (diff: add -I<regex> that ignores matching changes,
            2020-10-20)
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    - ## builtin/add.c ##
    -@@ builtin/add.c: static int edit_patch(int argc, const char **argv, const char *prefix)
    - 	if (out < 0)
    - 		die(_("Could not open '%s' for writing."), file);
    - 	rev.diffopt.file = xfdopen(out, "w");
    --	rev.diffopt.close_file = 1;
    -+	rev.diffopt.fclose_file = 1;
    - 	if (run_diff_files(&rev, 0))
    - 		die(_("Could not write patch"));
    - 
    -
    - ## builtin/am.c ##
    -@@ builtin/am.c: static void write_commit_patch(const struct am_state *state, struct commit *comm
    - 	rev_info.diffopt.flags.full_index = 1;
    - 	rev_info.diffopt.use_color = 0;
    - 	rev_info.diffopt.file = fp;
    --	rev_info.diffopt.close_file = 1;
    -+	rev_info.diffopt.fclose_file = 1; /* log_tree_commit() sets .no_free=1 */
    - 	add_pending_object(&rev_info, &commit->object, "");
    - 	diff_setup_done(&rev_info.diffopt);
    - 	log_tree_commit(&rev_info, commit);
    -@@ builtin/am.c: static void write_index_patch(const struct am_state *state)
    - 	rev_info.diffopt.output_format = DIFF_FORMAT_PATCH;
    - 	rev_info.diffopt.use_color = 0;
    - 	rev_info.diffopt.file = fp;
    --	rev_info.diffopt.close_file = 1;
    -+	rev_info.diffopt.fclose_file = 1;
    -+	rev_info.diffopt.no_free = 1;
    - 	add_pending_object(&rev_info, &tree->object, "");
    - 	diff_setup_done(&rev_info.diffopt);
    - 	run_diff_index(&rev_info, 1);
    -+	diff_free(&rev_info.diffopt);
    - }
    - 
    - /**
    -
      ## builtin/log.c ##
     @@ builtin/log.c: static struct itimerval early_output_timer;
      
    @@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
      	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
      	    rev->diffopt.flags.check_failed) {
     @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *prefix)
    - 	if (rev.show_notes)
    - 		load_display_notes(&rev.notes_opt);
    - 
    --	if (use_stdout + rev.diffopt.close_file + !!output_directory > 1)
    -+	if (use_stdout + rev.diffopt.fclose_file + !!output_directory > 1)
    - 		die(_("--stdout, --output, and --output-directory are mutually exclusive"));
    - 
    - 	if (use_stdout) {
    - 		setup_pager();
    --	} else if (rev.diffopt.close_file) {
    -+	} else if (rev.diffopt.fclose_file) {
    - 		/*
    - 		 * The diff code parsed --output; it has already opened the
      		 * file, but but we must instruct it not to close after each
      		 * diff.
      		 */
    @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *pre
      
     
      ## diff.c ##
    -@@ diff.c: static enum parse_opt_result diff_opt_output(struct parse_opt_ctx_t *ctx,
    - 	BUG_ON_OPT_NEG(unset);
    - 	path = prefix_filename(ctx->prefix, arg);
    - 	options->file = xfopen(path, "w");
    --	options->close_file = 1;
    -+	options->fclose_file = 1;
    - 	if (options->use_color != GIT_COLOR_ALWAYS)
    - 		options->use_color = GIT_COLOR_NEVER;
    - 	free(path);
    +@@ diff.c: static void diff_flush_patch_all_file_pairs(struct diff_options *o)
    + 	}
    + }
    + 
    ++static void diff_free_file(struct diff_options *options)
    ++{
    ++	if (options->close_file)
    ++		fclose(options->file);
    ++}
    ++
    ++void diff_free(struct diff_options *options)
    ++{
    ++	if (options->no_free)
    ++		return;
    ++
    ++	diff_free_file(options);
    ++}
    ++
    + void diff_flush(struct diff_options *options)
    + {
    + 	struct diff_queue_struct *q = &diff_queued_diff;
     @@ diff.c: void diff_flush(struct diff_options *options)
      		 * options->file to /dev/null should be safe, because we
      		 * aren't supposed to produce any output anyway.
      		 */
     -		if (options->close_file)
    -+		if (options->fclose_file)
    - 			fclose(options->file);
    +-			fclose(options->file);
    ++		diff_free_file(options);
      		options->file = xfopen("/dev/null", "w");
    --		options->close_file = 1;
    -+		options->fclose_file = 1;
    + 		options->close_file = 1;
      		options->color_moved = 0;
    - 		for (i = 0; i < q->nr; i++) {
    - 			struct diff_filepair *p = q->queue[i];
     @@ diff.c: void diff_flush(struct diff_options *options)
      free_queue:
      	free(q->queue);
    @@ diff.c: void diff_flush(struct diff_options *options)
      
      	/*
      	 * Report the content-level differences with HAS_CHANGES;
    -@@ diff.c: void diff_flush(struct diff_options *options)
    - 	}
    - }
    - 
    -+void diff_free(struct diff_options *options)
    -+{
    -+	if (options->no_free)
    -+		return;
    -+	if (options->fclose_file)
    -+		fclose(options->file);
    -+}
    -+	
    -+
    - static int match_filter(const struct diff_options *options, const struct diff_filepair *p)
    - {
    - 	return (((p->status == DIFF_STATUS_MODIFIED) &&
     
      ## diff.h ##
     @@
       * - Once you finish feeding the pairs of files, call `diffcore_std()`.
       * This will tell the diffcore library to go ahead and do its work.
       *
    -+ * - The `diff_opt_parse()` etc. functions might allocate memory in
    -+ *  `struct diff_options`. When running the API `N > 1` set `.no_free
    -+ *  = 1` to make the `diff_free()` invoked by `diff_flush()` below a
    -+ *  noop.
    -+ *
    -  * - Calling `diff_flush()` will produce the output.
    +- * - Calling `diff_flush()` will produce the output.
    ++ * - Calling `diff_flush()` will produce the output, it will call
    ++ *   `diff_free()` to free any resources, e.g. those allocated in
    ++ *   `diff_opt_parse()`.
     + *
    -+ * - If you set `.no_free = 1` before set it to `0` and call
    -+ *   `diff_free()` again. If `.no_free = 1` was not set there's no
    -+ *   need to call `diff_free()`, `diff_flush()` will call it.
    ++ * - Set `.no_free = 1` before calling `diff_flush()` to defer the
    ++ *   freeing of allocated memory in diff_options. This is useful when
    ++ *   `diff_flush()` is being called in a loop, rather than as a
    ++ *   one-off. When setting `.no_free = 1` you must ensure that
    ++ *   `diff_free()` is called at the end, either by flipping the flag
    ++ *   before the last `diff_flush()` call, or by flipping it before
    ++ *   calling `diff_free()` yourself.
       */
      
      struct combine_diff_path;
     @@ diff.h: struct diff_options {
    - 	void (*set_default)(struct diff_options *);
    - 
    - 	FILE *file;
    --	int close_file;
    -+	int fclose_file;
    - 
    - #define OUTPUT_INDICATOR_NEW 0
    - #define OUTPUT_INDICATOR_OLD 1
    -@@ diff.h: struct diff_options {
      
      	struct repository *repo;
      	struct option *parseopts;
    @@ log-tree.c: int log_tree_commit(struct rev_info *opt, struct commit *commit)
     +	diff_free(&opt->diffopt);
      	return shown;
      }
    -
    - ## wt-status.c ##
    -@@ wt-status.c: static void wt_longstatus_print_verbose(struct wt_status *s)
    - 	rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : rev.diffopt.rename_limit;
    - 	rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score;
    - 	rev.diffopt.file = s->fp;
    --	rev.diffopt.close_file = 0;
    -+	rev.diffopt.fclose_file = 0; /* wt_status owns the s->fp */
    - 	/*
    - 	 * If we're not going to stdout, then we definitely don't
    - 	 * want color, since we are going to the commit message
2:  7192cf01e71 ! 2:  f571524e6d8 diff: plug memory leak from regcomp() on {log,diff} -I
    @@ Commit message
         At that time freeing the memory was somewhat tedious, but since it
         isn't anymore with the newly introduced diff_free() let's use it.
     
    +    Let's retain the pattern for diff_free_file() and add a
    +    diff_free_ignore_regex(), even though (unlike "diff_free_file") we
    +    don't need to call it elsewhere. I think this'll make for more
    +    readable code than gradually accumulating a giant diff_free()
    +    function, sharing "int i" across unrelated code etc.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## diff.c ##
    -@@ diff.c: void diff_flush(struct diff_options *options)
    +@@ diff.c: static void diff_free_file(struct diff_options *options)
    + 		fclose(options->file);
    + }
      
    - void diff_free(struct diff_options *options)
    - {
    ++static void diff_free_ignore_regex(struct diff_options *options)
    ++{
     +	int i;
    - 	if (options->no_free)
    - 		return;
    - 	if (options->fclose_file)
    - 		fclose(options->file);
     +
     +	for (i = 0; i < options->ignore_regex_nr; i++) {
     +		regfree(options->ignore_regex[i]);
     +		free(options->ignore_regex[i]);
     +	}
     +	free(options->ignore_regex);
    ++}
    ++
    + void diff_free(struct diff_options *options)
    + {
    + 	if (options->no_free)
    + 		return;
    + 
    + 	diff_free_file(options);
    ++	diff_free_ignore_regex(options);
      }
    - 	
      
    + void diff_flush(struct diff_options *options)
-- 
2.30.0.284.gd98b1dd5eaa7
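
For readers skimming the range-diff, the deferred-freeing idiom described
in the diff.h comment above can be sketched in isolation as follows. This
is only an illustration under stated assumptions, not the code from the
series: "struct sketch_options" and all sketch_*() names are hypothetical
stand-ins for "struct diff_options", diff_free_file(), diff_free() and
diff_flush(), the "scratch" member stands in for whatever diff_opt_parse()
may have allocated (e.g. the -I regexes), and "free_calls" exists purely
so the behaviour can be observed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for "struct diff_options"; only the members
 * discussed in this series are modelled. */
struct sketch_options {
	FILE *file;      /* output stream, usually stdout */
	int close_file;  /* opt-in: fclose() "file" when freeing */
	int no_free;     /* defer freeing across repeated flushes */
	char *scratch;   /* stands in for memory allocated while parsing */
	int free_calls;  /* instrumentation for this sketch only */
};

/* Close "file" only if the caller opted in; stdout is never closed. */
static void sketch_free_file(struct sketch_options *o)
{
	if (o->close_file && o->file) {
		fclose(o->file);
		o->file = NULL;
	}
}

/* Release everything, unless the caller asked to defer via "no_free". */
static void sketch_free(struct sketch_options *o)
{
	if (o->no_free)
		return;

	sketch_free_file(o);
	free(o->scratch);
	o->scratch = NULL;
	o->free_calls++;
}

/* Produce output once, then attempt to free (a no-op when deferred). */
static void sketch_flush(struct sketch_options *o, int n)
{
	assert(o->file != NULL);
	fprintf(o->file, "diff %d\n", n);
	sketch_free(o);
}

/*
 * A "git log -p"-like loop: flush many times, free once at the end.
 * The caller's "no_free" is saved and restored, so that a caller who
 * set it keeps ownership of the final free.
 */
static int sketch_walk(struct sketch_options *o, int commits)
{
	int i;
	int saved_no_free = o->no_free;

	o->no_free = 1;
	for (i = 0; i < commits; i++)
		sketch_flush(o, i);
	o->no_free = saved_no_free;

	sketch_free(o);
	return o->free_calls;
}
```

In this sketch, calling sketch_walk() on options with "no_free" unset
frees exactly once, after the last flush. If the caller had already set
"no_free", the function frees nothing and the caller remains responsible
for clearing the flag and calling sketch_free() itself.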
