From: Phillip Wood <phillip.wood123@gmail.com>
To: git@vger.kernel.org
Cc: Ezekiel Newren <ezekielnewren@gmail.com>,
Phillip Wood <phillip.wood123@gmail.com>
Subject: [PATCH 1/4] xdiff: reduce size of action arrays
Date: Thu, 2 Apr 2026 15:57:41 +0100 [thread overview]
Message-ID: <447b8c0af1746d61bfa26e7908a784583ab5dc2e.1775141855.git.phillip.wood@dunelm.org.uk> (raw)
In-Reply-To: <cover.1775141855.git.phillip.wood@dunelm.org.uk>
From: Phillip Wood <phillip.wood@dunelm.org.uk>
When the myers algorithm is selected the input files are pre-processed
to remove any common prefix and suffix. Then any lines that appear
only in one side of the diff are marked as changed and frequently
occurring lines are marked as changed if they are adjacent to a
changed line. This step requires a couple of temporary arrays. As as
the common prefix and suffix have already been removed, the arrays
only need to be big enough to hold the lines between them, not the
whole file. Reduce the size of the arrays and adjust the loops that
use them accordingly while taking care to keep indexing the arrays
in xdfile_t with absolute line numbers.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
xdiff/xprepare.c | 31 +++++++++++++++++--------------
1 file changed, 17 insertions(+), 14 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 1f2e8c6b4b9..4bb3a8ef41c 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -273,16 +273,19 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
uint8_t *action1 = NULL, *action2 = NULL;
bool need_min = !!(cf->flags & XDF_NEED_MINIMAL);
int ret = 0;
+ ptrdiff_t off = xdf1->dstart;
+ ptrdiff_t len1 = xdf1->dend - off + 1;
+ ptrdiff_t len2 = xdf2->dend - off + 1;
/*
* Create temporary arrays that will help us decide if
* changed[i] should remain false, or become true.
*/
- if (!XDL_CALLOC_ARRAY(action1, xdf1->nrec + 1)) {
+ if (!XDL_CALLOC_ARRAY(action1, len1)) {
ret = -1;
goto cleanup;
}
- if (!XDL_CALLOC_ARRAY(action2, xdf2->nrec + 1)) {
+ if (!XDL_CALLOC_ARRAY(action2, len2)) {
ret = -1;
goto cleanup;
}
@@ -299,8 +302,8 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
/*
* Initialize temporary arrays with DISCARD, KEEP, or INVESTIGATE.
*/
- for (i = xdf1->dstart; i <= xdf1->dend; i++) {
- size_t mph1 = xdf1->recs[i].minimal_perfect_hash;
+ for (i = 0; i < len1; i++) {
+ size_t mph1 = xdf1->recs[i + off].minimal_perfect_hash;
rcrec = cf->rcrecs[mph1];
nm = rcrec ? rcrec->len2 : 0;
if (nm == 0)
@@ -311,8 +314,8 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
action1[i] = INVESTIGATE;
}
- for (i = xdf2->dstart; i <= xdf2->dend; i++) {
- size_t mph2 = xdf2->recs[i].minimal_perfect_hash;
+ for (i = 0; i < len2; i++) {
+ size_t mph2 = xdf2->recs[i + off].minimal_perfect_hash;
rcrec = cf->rcrecs[mph2];
nm = rcrec ? rcrec->len1 : 0;
if (nm == 0)
@@ -328,37 +331,37 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
* false, or become true.
*/
xdf1->nreff = 0;
- for (i = xdf1->dstart; i <= xdf1->dend; i++) {
+ for (i = 0; i < len1; i++) {
if (action1[i] == INVESTIGATE) {
- if (!xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend))
+ if (!xdl_clean_mmatch(action1, i, 0, len1 - 1))
action1[i] = KEEP;
else
action1[i] = DISCARD;
}
if (action1[i] == KEEP) {
- xdf1->reference_index[xdf1->nreff++] = i;
+ xdf1->reference_index[xdf1->nreff++] = i + off;
/* changed[i] remains false */
} else if (action1[i] == DISCARD)
- xdf1->changed[i] = true;
+ xdf1->changed[i + off] = true;
else
BUG("Illegal state for action1[i]");
}
xdf2->nreff = 0;
- for (i = xdf2->dstart; i <= xdf2->dend; i++) {
+ for (i = 0; i < len2; i++) {
if (action2[i] == INVESTIGATE) {
- if (!xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend))
+ if (!xdl_clean_mmatch(action2, i, 0, len2 - 1))
action2[i] = KEEP;
else
action2[i] = DISCARD;
}
if (action2[i] == KEEP) {
- xdf2->reference_index[xdf2->nreff++] = i;
+ xdf2->reference_index[xdf2->nreff++] = i + off;
/* changed[i] remains false */
} else if (action2[i] == DISCARD)
- xdf2->changed[i] = true;
+ xdf2->changed[i + off] = true;
else
BUG("Illegal state for action2[i]");
}
--
2.52.0.362.g884e03848a9.dirty
next prev parent reply other threads:[~2026-04-02 14:58 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-02 14:57 [PATCH 0/4] xdiff: reduce the size of a couple of arrays Phillip Wood
2026-04-02 14:57 ` Phillip Wood [this message]
2026-04-02 19:19 ` [PATCH 1/4] xdiff: reduce size of action arrays Junio C Hamano
2026-04-02 14:57 ` [PATCH 2/4] xdiff: cleanup xdl_clean_mmatch() Phillip Wood
2026-04-02 19:20 ` Junio C Hamano
2026-04-02 14:57 ` [PATCH 3/4] xprepare: simplify error handling Phillip Wood
2026-04-02 19:24 ` Junio C Hamano
2026-04-02 14:57 ` [PATCH 4/4] xdiff: reduce the size of array Phillip Wood
2026-04-02 19:44 ` Junio C Hamano
2026-05-04 14:06 ` [PATCH v2 0/4] xdiff: reduce the size of a couple of arrays Phillip Wood
2026-05-04 14:06 ` [PATCH v2 1/4] xdiff: reduce size of action arrays Phillip Wood
2026-05-04 14:06 ` [PATCH v2 2/4] xdiff: cleanup xdl_clean_mmatch() Phillip Wood
2026-05-04 14:06 ` [PATCH v2 3/4] xprepare: simplify error handling Phillip Wood
2026-05-04 14:06 ` [PATCH v2 4/4] xdiff: reduce the size of array Phillip Wood
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=447b8c0af1746d61bfa26e7908a784583ab5dc2e.1775141855.git.phillip.wood@dunelm.org.uk \
--to=phillip.wood123@gmail.com \
--cc=ezekielnewren@gmail.com \
--cc=git@vger.kernel.org \
--cc=phillip.wood@dunelm.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).