git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: git@vger.kernel.org
Cc: "Taylor Blau" <me@ttaylorr.com>,
	"man dog" <dogman888888@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: [PATCH 1/3] line-log: free diff queue when processing non-merge commits
Date: Wed,  2 Nov 2022 23:01:40 +0100	[thread overview]
Message-ID: <20221102220142.574890-2-szeder.dev@gmail.com> (raw)
In-Reply-To: <20221102220142.574890-1-szeder.dev@gmail.com>

When processing a non-merge commit, the line-level log first asks the
tree-diff machinery whether any of the files in the given line ranges
were modified between the current commit and its parent, and if some
of them were, then it loads the contents of those files from both
commits to see whether their line ranges were modified and/or need to
be adjusted.  Alas, it doesn't free() the diff queue holding the
results of that query and the contents of those files once its done.
This can add up to a substantial amount of leaked memory, especially
when the file in question is big and is frequently modified: a user
reported "Out of memory, malloc failed" errors with a 2MB text file
that was modified ~2800 times [1] (I estimate the leak would use up
almost 11GB memory in that case).

Free that diff queue to plug this memory leak.  However, instead of
simply open-coding the necessary three lines, add them as a helper
function to the diff API, because it will be useful elsewhere as well.

[1] https://public-inbox.org/git/CAFOPqVXz2XwzX8vGU7wLuqb2ZuwTuOFAzBLRM_QPk+NJa=eC-g@mail.gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 diff.c     | 7 +++++++
 diffcore.h | 1 +
 line-log.c | 1 +
 3 files changed, 9 insertions(+)

diff --git a/diff.c b/diff.c
index 35e46dd968..ef94175163 100644
--- a/diff.c
+++ b/diff.c
@@ -5773,6 +5773,13 @@ void diff_free_filepair(struct diff_filepair *p)
 	free(p);
 }
 
+void diff_free_queue(struct diff_queue_struct *q)
+{
+	for (int i = 0; i < q->nr; i++)
+		diff_free_filepair(q->queue[i]);
+	free(q->queue);
+}
+
 const char *diff_aligned_abbrev(const struct object_id *oid, int len)
 {
 	int abblen;
diff --git a/diffcore.h b/diffcore.h
index badc2261c2..9b588a1ee1 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -162,6 +162,7 @@ struct diff_filepair *diff_queue(struct diff_queue_struct *,
 				 struct diff_filespec *,
 				 struct diff_filespec *);
 void diff_q(struct diff_queue_struct *, struct diff_filepair *);
+void diff_free_queue(struct diff_queue_struct *q);
 
 /* dir_rename_relevance: the reason we want rename information for a dir */
 enum dir_rename_relevance {
diff --git a/line-log.c b/line-log.c
index 51d93310a4..7a74daf2e8 100644
--- a/line-log.c
+++ b/line-log.c
@@ -1195,6 +1195,7 @@ static int process_ranges_ordinary_commit(struct rev_info *rev, struct commit *c
 	if (parent)
 		add_line_range(rev, parent, parent_range);
 	free_line_log_data(parent_range);
+	diff_free_queue(&queue);
 	return changed;
 }
 
-- 
2.38.1.564.g99c012faba


  reply	other threads:[~2022-11-02 22:01 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-29 16:59 Bug report: git -L requires excessive memory man dog
2022-10-31 21:45 ` SZEDER Gábor
2022-10-31 21:56   ` Taylor Blau
2022-11-02 22:01     ` [PATCH 0/3] line-log: plug some memory leaks SZEDER Gábor
2022-11-02 22:01       ` SZEDER Gábor [this message]
2022-11-03  0:20         ` [PATCH 1/3] line-log: free diff queue when processing non-merge commits Taylor Blau
2022-11-07 15:11           ` SZEDER Gábor
2022-11-07 15:29             ` Ævar Arnfjörð Bjarmason
2022-11-07 15:57               ` SZEDER Gábor
2022-11-08  2:14                 ` Taylor Blau
2022-11-02 22:01       ` [PATCH 2/3] line-log: free the diff queues' arrays when processing merge commits SZEDER Gábor
2022-11-03  0:21         ` Taylor Blau
2022-11-02 22:01       ` [PATCH 3/3] diff.c: use diff_free_queue() SZEDER Gábor
2022-11-03  0:24         ` Taylor Blau
2022-11-07 16:13           ` SZEDER Gábor
2022-11-08  2:14             ` Taylor Blau
2022-11-03  9:05       ` [PATCH 0/3] line-log: plug some memory leaks Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221102220142.574890-2-szeder.dev@gmail.com \
    --to=szeder.dev@gmail.com \
    --cc=dogman888888@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).