From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
"Michał Kępień" <michal@isc.org>,
"Phillip Wood" <phillip.wood123@gmail.com>
Subject: Re: [PATCH 1/2] diff: add an API for deferred freeing
Date: Wed, 10 Feb 2021 17:00:45 +0100 (CET) [thread overview]
Message-ID: <nycvar.QRO.7.76.6.2102101557160.29765@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <20210205141320.18076-1-avarab@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 12871 bytes --]
Hi Ævar,
On Fri, 5 Feb 2021, Ævar Arnfjörð Bjarmason wrote:
> Add a diff_free() function to free anything we may have allocated in
> the "diff_options" struct, and the ability to make calling it a noop
> by setting "no_free" in "diff_options".
Why do we need a `no_free` flag? Why not simply set the `free()`d (or
`fclose()`d) attributes to `NULL`?
> This is required because when e.g. "git diff" is run we'll allocate
> things in that struct, use the diff machinery once, and then exit, but
> if we run e.g. "git log -p" we're going to re-use what we allocated
> across multiple diff_flush() calls, and only want to free things at
> the end.
>
> We've thus ended up with features like the recently added "diff -I"[1]
> where we'll leak memory. As it turns out it could have simply used the
> pattern established in 6ea57703f6 (log: prepare log/log-tree to reuse
> the diffopt.close_file attribute, 2016-06-22).
>
> Manually adding more such flags to things log_tree_commit() every time
> we need to allocate something would be tedious.
>
> Let's instead move that fclose() code it to a new diff_free(), in
> anticipation of freeing more things in that function in follow-up
> commits. I'm renaming the "close_file" struct member to "fclose_file"
> for the ease of validating this, we can be certain that these are all
> the relevant callsites.
I like that idea a lot.
> Some functions such as log_tree_commit() need an idiom of optionally
> retaining a previous "no_free", as they may either free the memory
> themselves, or their caller may do so. I'm keeping that idiom in
> log_show_early() even though I don't think it's currently called in
> this manner, since it also gets passed an existing "struct rev_info"..
>
> 1. 296d4a94e7 (diff: add -I<regex> that ignores matching changes,
> 2020-10-20)
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
> builtin/add.c | 2 +-
> builtin/am.c | 6 ++++--
> builtin/log.c | 27 ++++++++++++++-------------
> diff.c | 18 +++++++++++++-----
> diff.h | 14 +++++++++++++-
> log-tree.c | 10 ++++++----
> wt-status.c | 2 +-
> 7 files changed, 52 insertions(+), 27 deletions(-)
>
> diff --git a/builtin/add.c b/builtin/add.c
> index a825887c50..6319710186 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -282,7 +282,7 @@ static int edit_patch(int argc, const char **argv, const char *prefix)
> if (out < 0)
> die(_("Could not open '%s' for writing."), file);
> rev.diffopt.file = xfdopen(out, "w");
> - rev.diffopt.close_file = 1;
> + rev.diffopt.fclose_file = 1;
This rename makes the patch unnecessarily tedious to review, and I do not
even agree with leaking the implementation detail that we need to
`fclose()` the file.
Let's just not?
> if (run_diff_files(&rev, 0))
> die(_("Could not write patch"));
>
> diff --git a/builtin/am.c b/builtin/am.c
> index 8355e3566f..157d264583 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1315,7 +1315,7 @@ static void write_commit_patch(const struct am_state *state, struct commit *comm
> rev_info.diffopt.flags.full_index = 1;
> rev_info.diffopt.use_color = 0;
> rev_info.diffopt.file = fp;
> - rev_info.diffopt.close_file = 1;
> + rev_info.diffopt.fclose_file = 1; /* log_tree_commit() sets .no_free=1 */
> add_pending_object(&rev_info, &commit->object, "");
> diff_setup_done(&rev_info.diffopt);
> log_tree_commit(&rev_info, commit);
> @@ -1347,10 +1347,12 @@ static void write_index_patch(const struct am_state *state)
> rev_info.diffopt.output_format = DIFF_FORMAT_PATCH;
> rev_info.diffopt.use_color = 0;
> rev_info.diffopt.file = fp;
> - rev_info.diffopt.close_file = 1;
> + rev_info.diffopt.fclose_file = 1;
> + rev_info.diffopt.no_free = 1;
> add_pending_object(&rev_info, &tree->object, "");
> diff_setup_done(&rev_info.diffopt);
> run_diff_index(&rev_info, 1);
> + diff_free(&rev_info.diffopt);
> }
>
> /**
> diff --git a/builtin/log.c b/builtin/log.c
> index fd282def43..604ee29ec0 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -306,10 +306,11 @@ static struct itimerval early_output_timer;
>
> static void log_show_early(struct rev_info *revs, struct commit_list *list)
> {
> - int i = revs->early_output, close_file = revs->diffopt.close_file;
> + int i = revs->early_output;
> int show_header = 1;
> + int no_free = revs->diffopt.no_free;
>
> - revs->diffopt.close_file = 0;
> + revs->diffopt.no_free = 0;
> sort_in_topological_order(&list, revs->sort_order);
> while (list && i) {
> struct commit *commit = list->item;
> @@ -326,8 +327,8 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
> case commit_ignore:
> break;
> case commit_error:
> - if (close_file)
> - fclose(revs->diffopt.file);
> + revs->diffopt.no_free = no_free;
> + diff_free(&revs->diffopt);
> return;
> }
> list = list->next;
> @@ -335,8 +336,8 @@ static void log_show_early(struct rev_info *revs, struct commit_list *list)
>
> /* Did we already get enough commits for the early output? */
> if (!i) {
> - if (close_file)
> - fclose(revs->diffopt.file);
> + revs->diffopt.no_free = 0;
> + diff_free(&revs->diffopt);
> return;
> }
>
> @@ -400,7 +401,7 @@ static int cmd_log_walk(struct rev_info *rev)
> {
> struct commit *commit;
> int saved_nrl = 0;
> - int saved_dcctc = 0, close_file = rev->diffopt.close_file;
> + int saved_dcctc = 0;
>
> if (rev->early_output)
> setup_early_output();
> @@ -416,7 +417,7 @@ static int cmd_log_walk(struct rev_info *rev)
> * and HAS_CHANGES being accumulated in rev->diffopt, so be careful to
> * retain that state information if replacing rev->diffopt in this loop
> */
> - rev->diffopt.close_file = 0;
> + rev->diffopt.no_free = 1;
> while ((commit = get_revision(rev)) != NULL) {
> if (!log_tree_commit(rev, commit) && rev->max_count >= 0)
> /*
> @@ -441,8 +442,8 @@ static int cmd_log_walk(struct rev_info *rev)
> }
> rev->diffopt.degraded_cc_to_c = saved_dcctc;
> rev->diffopt.needed_rename_limit = saved_nrl;
> - if (close_file)
> - fclose(rev->diffopt.file);
> + rev->diffopt.no_free = 0;
> + diff_free(&rev->diffopt);
>
> if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
> rev->diffopt.flags.check_failed) {
> @@ -1952,18 +1953,18 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> if (rev.show_notes)
> load_display_notes(&rev.notes_opt);
>
> - if (use_stdout + rev.diffopt.close_file + !!output_directory > 1)
> + if (use_stdout + rev.diffopt.fclose_file + !!output_directory > 1)
> die(_("--stdout, --output, and --output-directory are mutually exclusive"));
>
> if (use_stdout) {
> setup_pager();
> - } else if (rev.diffopt.close_file) {
> + } else if (rev.diffopt.fclose_file) {
> /*
> * The diff code parsed --output; it has already opened the
> * file, but but we must instruct it not to close after each
> * diff.
> */
> - rev.diffopt.close_file = 0;
> + rev.diffopt.no_free = 1;
> } else {
> int saved;
>
> diff --git a/diff.c b/diff.c
> index 69e3bc00ed..3e6f8f0a71 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -5187,7 +5187,7 @@ static enum parse_opt_result diff_opt_output(struct parse_opt_ctx_t *ctx,
> BUG_ON_OPT_NEG(unset);
> path = prefix_filename(ctx->prefix, arg);
> options->file = xfopen(path, "w");
> - options->close_file = 1;
> + options->fclose_file = 1;
> if (options->use_color != GIT_COLOR_ALWAYS)
> options->use_color = GIT_COLOR_NEVER;
> free(path);
> @@ -6399,10 +6399,10 @@ void diff_flush(struct diff_options *options)
> * options->file to /dev/null should be safe, because we
> * aren't supposed to produce any output anyway.
> */
> - if (options->close_file)
> + if (options->fclose_file)
> fclose(options->file);
And at this stage, we should set `options->file = NULL`.
> options->file = xfopen("/dev/null", "w");
> - options->close_file = 1;
> + options->fclose_file = 1;
> options->color_moved = 0;
> for (i = 0; i < q->nr; i++) {
> struct diff_filepair *p = q->queue[i];
> @@ -6433,8 +6433,7 @@ void diff_flush(struct diff_options *options)
> free_queue:
> free(q->queue);
> DIFF_QUEUE_CLEAR(q);
> - if (options->close_file)
> - fclose(options->file);
> + diff_free(options);
>
> /*
> * Report the content-level differences with HAS_CHANGES;
> @@ -6449,6 +6448,15 @@ void diff_flush(struct diff_options *options)
> }
> }
>
> +void diff_free(struct diff_options *options)
> +{
> + if (options->no_free)
> + return;
> + if (options->fclose_file)
> + fclose(options->file);
And at this stage, we should set `options->file = NULL`.
> +}
> +
> +
> static int match_filter(const struct diff_options *options, const struct diff_filepair *p)
> {
> return (((p->status == DIFF_STATUS_MODIFIED) &&
> diff --git a/diff.h b/diff.h
> index 2ff2b1c7f2..d1d74c3a9e 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -49,7 +49,16 @@
> * - Once you finish feeding the pairs of files, call `diffcore_std()`.
> * This will tell the diffcore library to go ahead and do its work.
> *
> + * - The `diff_opt_parse()` etc. functions might allocate memory in
> + * `struct diff_options`. When running the API `N > 1` set `.no_free
> + * = 1` to make the `diff_free()` invoked by `diff_flush()` below a
> + * noop.
I have serious trouble parsing that last sentence. Would you mind
rephrasing it?
> + *
> * - Calling `diff_flush()` will produce the output.
> + *
> + * - If you set `.no_free = 1` before set it to `0` and call
> + * `diff_free()` again. If `.no_free = 1` was not set there's no
> + * need to call `diff_free()`, `diff_flush()` will call it.
Again, as I mentioned before, the indicator whether things need to be
released should be whether the attribute is `NULL` or not. And once
released, ot should be set to `NULL`.
Other than that, it looks fine to me. And I definitely like the idea of
introducing a function to release all of the stuff in `struct diffopt`.
Thanks,
Dscho
> */
>
> struct combine_diff_path;
> @@ -328,7 +337,7 @@ struct diff_options {
> void (*set_default)(struct diff_options *);
>
> FILE *file;
> - int close_file;
> + int fclose_file;
>
> #define OUTPUT_INDICATOR_NEW 0
> #define OUTPUT_INDICATOR_OLD 1
> @@ -365,6 +374,8 @@ struct diff_options {
>
> struct repository *repo;
> struct option *parseopts;
> +
> + int no_free;
> };
>
> unsigned diff_filter_bit(char status);
> @@ -559,6 +570,7 @@ void diffcore_fix_diff_index(void);
>
> int diff_queue_is_empty(void);
> void diff_flush(struct diff_options*);
> +void diff_free(struct diff_options*);
> void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
>
> /* diff-raw status letters */
> diff --git a/log-tree.c b/log-tree.c
> index fd0dde97ec..bb946f15f1 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -952,12 +952,14 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
> int log_tree_commit(struct rev_info *opt, struct commit *commit)
> {
> struct log_info log;
> - int shown, close_file = opt->diffopt.close_file;
> + int shown;
> + /* maybe called by e.g. cmd_log_walk(), maybe stand-alone */
> + int no_free = opt->diffopt.no_free;
>
> log.commit = commit;
> log.parent = NULL;
> opt->loginfo = &log;
> - opt->diffopt.close_file = 0;
> + opt->diffopt.no_free = 1;
>
> if (opt->line_level_traverse)
> return line_log_print(opt, commit);
> @@ -974,7 +976,7 @@ int log_tree_commit(struct rev_info *opt, struct commit *commit)
> fprintf(opt->diffopt.file, "\n%s\n", opt->break_bar);
> opt->loginfo = NULL;
> maybe_flush_or_die(opt->diffopt.file, "stdout");
> - if (close_file)
> - fclose(opt->diffopt.file);
> + opt->diffopt.no_free = no_free;
> + diff_free(&opt->diffopt);
> return shown;
> }
> diff --git a/wt-status.c b/wt-status.c
> index 0c8287a023..41b08474e5 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -1052,7 +1052,7 @@ static void wt_longstatus_print_verbose(struct wt_status *s)
> rev.diffopt.rename_limit = s->rename_limit >= 0 ? s->rename_limit : rev.diffopt.rename_limit;
> rev.diffopt.rename_score = s->rename_score >= 0 ? s->rename_score : rev.diffopt.rename_score;
> rev.diffopt.file = s->fp;
> - rev.diffopt.close_file = 0;
> + rev.diffopt.fclose_file = 0; /* wt_status owns the s->fp */
> /*
> * If we're not going to stdout, then we definitely don't
> * want color, since we are going to the commit message
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>
next prev parent reply other threads:[~2021-02-10 16:03 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-01 12:06 [PATCH 0/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-01 12:06 ` [PATCH 1/2] " Michał Kępień
2020-10-01 18:21 ` Junio C Hamano
2020-10-07 19:48 ` Michał Kępień
2020-10-07 20:08 ` Junio C Hamano
2020-10-01 12:06 ` [PATCH 2/2] t: add -I<regex> tests Michał Kępień
2020-10-01 17:02 ` [PATCH 0/2] diff: add -I<regex> that ignores matching changes Junio C Hamano
2020-10-12 9:17 ` [PATCH v2 0/3] " Michał Kępień
2020-10-12 9:17 ` [PATCH v2 1/3] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-12 11:14 ` Johannes Schindelin
2020-10-12 17:09 ` Junio C Hamano
2020-10-12 19:52 ` Junio C Hamano
2020-10-13 6:35 ` Michał Kępień
2020-10-12 9:17 ` [PATCH v2 2/3] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-12 11:20 ` Johannes Schindelin
2020-10-12 20:00 ` Junio C Hamano
2020-10-12 20:39 ` Johannes Schindelin
2020-10-12 21:43 ` Junio C Hamano
2020-10-13 6:37 ` Michał Kępień
2020-10-13 15:49 ` Junio C Hamano
2020-10-13 6:36 ` Michał Kępień
2020-10-13 12:02 ` Johannes Schindelin
2020-10-13 15:53 ` Junio C Hamano
2020-10-13 18:45 ` Michał Kępień
2020-10-12 18:01 ` Junio C Hamano
2020-10-13 6:38 ` Michał Kępień
2020-10-12 20:04 ` Junio C Hamano
2020-10-13 6:38 ` Michał Kępień
2020-10-12 9:17 ` [PATCH v2 3/3] t: add -I<regex> tests Michał Kępień
2020-10-12 11:49 ` Johannes Schindelin
2020-10-13 6:38 ` Michał Kępień
2020-10-13 12:00 ` Johannes Schindelin
2020-10-13 16:00 ` Junio C Hamano
2020-10-13 19:01 ` Michał Kępień
2020-10-15 11:45 ` Johannes Schindelin
2020-10-15 7:24 ` [PATCH v3 0/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-15 7:24 ` [PATCH v3 1/2] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-15 7:24 ` [PATCH v3 2/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2020-10-16 15:32 ` Phillip Wood
2020-10-16 18:04 ` Junio C Hamano
2020-10-19 9:48 ` Michał Kępień
2020-10-16 18:16 ` Junio C Hamano
2020-10-19 9:55 ` Michał Kępień
2020-10-19 17:29 ` Junio C Hamano
2020-10-16 10:00 ` [PATCH v3 0/2] " Johannes Schindelin
2020-10-20 6:48 ` [PATCH v4 " Michał Kępień
2020-10-20 6:48 ` [PATCH v4 1/2] merge-base, xdiff: zero out xpparam_t structures Michał Kępień
2020-10-20 6:48 ` [PATCH v4 2/2] diff: add -I<regex> that ignores matching changes Michał Kępień
2021-02-05 14:13 ` [PATCH 1/2] diff: add an API for deferred freeing Ævar Arnfjörð Bjarmason
2021-02-10 16:00 ` Johannes Schindelin [this message]
2021-02-11 3:00 ` Ævar Arnfjörð Bjarmason
2021-02-11 9:40 ` Johannes Schindelin
2021-02-11 10:21 ` Jeff King
2021-02-11 10:45 ` [PATCH v2 0/2] " Ævar Arnfjörð Bjarmason
2021-02-11 10:45 ` [PATCH v2 1/2] " Ævar Arnfjörð Bjarmason
2021-02-11 10:45 ` [PATCH v2 2/2] diff: plug memory leak from regcomp() on {log,diff} -I Ævar Arnfjörð Bjarmason
2021-02-05 14:13 ` [PATCH " Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=nycvar.QRO.7.76.6.2102101557160.29765@tvgsbejvaqbjf.bet \
--to=johannes.schindelin@gmx.de \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=michal@isc.org \
--cc=phillip.wood123@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).