git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Yann Dirson <ydirson@altern.org>
To: git@vger.kernel.org
Cc: Yann Dirson <ydirson@altern.org>, Yann Dirson <ydirson@free.fr>
Subject: [PATCH v6.1] [RFC] Consider all parents of a file as candidates for bulk rename.
Date: Sun, 17 Oct 2010 21:24:03 +0200	[thread overview]
Message-ID: <1287343443-23098-1-git-send-email-ydirson@altern.org> (raw)
In-Reply-To: <1287098999-9244-5-git-send-email-ydirson@altern.org>

We previously only looked at the immediate parents of a file involved
in a move.  Now consider upper ones as candidates too.

The way it is done here is somewhat inefficient, but at least it
handles all previously-known cases of incorrectness.

Signed-off-by: Yann Dirson <ydirson@free.fr>
---
 diffcore-rename.c |  120 ++++++++++++++++++++++++++++-------------------------
 1 files changed, 63 insertions(+), 57 deletions(-)

There was a major flaw in the first iteration of this patch (never
looking again at discarded renames).  This iteration also adds more
explicit comments about why this piece of code should work, despite
having been written late at night and having already shown one problem
which demonstrated that a testsuite is no replacement for a good night
of sleep :)

I'm still not happy with the thing, though - the loops can surely be
made shorter by using the discarded flags previously set, and more
importantly we need to consider moves to parents of the file's
destination too, which will add a loop of its own which will have to
be made as short as possible.  All of this seems to require to sort
bulkmove_candidates so we can binary-search into it and rapidly access
related paths.

Other opinions on the spirit of this patch still gladly welcome :)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 1be1af1..a8f3a8b 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -542,7 +542,7 @@ static int maybe_disqualify_bulkmove(const char* one_parent_path,
 		if (strncmp(rename_dst[j].two->path,
 			    one_parent_path, onep_len))
 			break; /* exhausted directory in this direction */
-		fprintf(stderr, "[DBG] leftover file %s in %s\n",
+		fprintf(stderr, "[DBG] leftover file %s in '%s'\n",
 			rename_dst[j].two->path, one_parent_path);
 		if (rename_dst[j].i_am_not_single || /* those were already here */
 		    (rename_dst[j].pair &&
@@ -596,66 +596,72 @@ static void check_one_bulk_move(struct diff_filepair *dstpair)
 	    maybe_disqualify_bulkmove(one_parent_path, one_leftover))
 		return;
 
-	fprintf(stderr, "[] %s -> %s ?\n",dstpair->one->path, dstpair->two->path);
-
-	// FIXME: loop over successive prefixes
-	unsigned needs_adding = 1;
+	fprintf(stderr, "[] %s -> %s ?\n", dstpair->one->path, dstpair->two->path);
 
-	/* already considered ? */
-	for (seen=bulkmove_candidates; seen; seen = seen->next) {
-		if (seen->discarded) {
-			/* already seen a rename from seen->one to some than ->two */
-			needs_adding = 0;
-			continue;
-		}
-		/* check exact dir */
-		if (!strcmp(seen->one->path, one_parent_path)) {
-			/* already added */
-			needs_adding = 0;
-			/* check that seen entry matches this rename */
-			if (strcmp(two_parent_path, seen->two->path)) {
-				fprintf(stderr, "[DBG] discarding dir %s from bulk moves (split into %s and %s)\n",
-					seen->one->path, two_parent_path, seen->two->path);
-				// FIXME: may be worth to free it instead
-				seen->discarded = 1;
+	/* loop up one_parent_path over successive parents */
+	// FIXME: also loop over two_parent_path prefixes ?
+	// FIXME: use a more informative prefixcmp to avoid strcmp calls
+	do {
+		unsigned needs_adding = 1;
+
+		fprintf(stderr, "[[]] %s ...\n", one_parent_path);
+
+		/* already considered ? */
+		for (seen=bulkmove_candidates; seen; seen = seen->next) {
+			fprintf(stderr, "[DBG]  ? %s -> %s\n", seen->one->path, seen->two->path);
+			/* subdir of "seen" source dir ? */
+			if (!prefixcmp(one_parent_path, seen->one->path)) {
+				/* subdir of "seen" dest dir ? */
+				if (!prefixcmp(two_parent_path, seen->two->path)) {
+					if (!strcmp(seen->one->path, one_parent_path) &&
+					    !strcmp(seen->two->path, two_parent_path)) {
+						/* already added */
+						fprintf(stderr, "[DBG]  already added\n");
+						needs_adding = 0;
+					} else /* is in a subdir: confirms but still
+						* may need adding */
+						fprintf(stderr, "[DBG]  'dstpair' conforts 'seen'\n");
+				} else {
+					fprintf(stderr, "[DBG] discarding %s -> %s from bulk moves (split into %s and %s)\n",
+						seen->one->path, seen->two->path,
+						two_parent_path, seen->two->path);
+					seen->discarded = 1;
+					/* Need to discard dstpair as well, unless moving
+					 * from a strict subdir of seen->one or to a
+					 * strict subdir of seen->two */
+					if (!strcmp(seen->one->path, one_parent_path) &&
+					    prefixcmp(seen->two->path, two_parent_path)) {
+						fprintf(stderr, "[DBG] ... and not adding self\n");
+						needs_adding = 0;
+					}
+					continue;
+				}
 			}
-			continue;
+			else fprintf(stderr, "[DBG]  %s outside of %s\n",
+				     one_parent_path, seen->one->path);
+
+			/* dstpair confirms seen, or does not infirm */
+			fprintf(stderr, "[DBG] %s -> %s DOES NOT cause discard of %s -> %s\n",
+				dstpair->one->path, dstpair->two->path,
+				seen->one->path, seen->two->path);
 		}
-		if (!prefixcmp(one_parent_path, seen->one->path)) {
-			if (prefixcmp(two_parent_path, seen->two->path)) {
-				fprintf(stderr, "[DBG] discarding dir %s from bulk moves (split into %s and %s)\n",
-					seen->one->path, two_parent_path, seen->two->path);
-				// FIXME: may be worth to free it instead
-				seen->discarded = 1;
-				continue;
-			}
-		} else {
-			fprintf(stderr, "[DBG]  %s considered irrelevant for %s -> %s\n",
-				dstpair->one->path, seen->one->path, seen->two->path);
-			continue;
+		if (needs_adding) { /* record potential dir rename */
+			seen = xmalloc(sizeof(*seen));
+			seen->one = alloc_filespec(one_parent_path);
+			fill_filespec(seen->one, null_sha1, S_IFDIR);
+			seen->two = alloc_filespec(two_parent_path);
+			fill_filespec(seen->two, null_sha1, S_IFDIR);
+			seen->discarded = 0;
+			seen->next = bulkmove_candidates;
+			bulkmove_candidates = seen;
+			fprintf(stderr, "[DBG] %s -> %s suggests possible bulk move from %s to %s\n",
+				dstpair->one->path,
+				dstpair->two->path,
+				one_parent_path, two_parent_path);
 		}
-
-		/* dstpair confirms seen */
-		fprintf(stderr, "[DBG] %s -> %s DOES NOT cause discard of %s -> %s\n",
-			dstpair->one->path, dstpair->two->path,
-			seen->one->path, seen->two->path);
-	}
-	if (needs_adding) { /* record potential dir rename */
-		/* all checks ok, we keep that entry */
-		seen = xmalloc(sizeof(*seen));
-		seen->one = alloc_filespec(one_parent_path);
-		fill_filespec(seen->one, null_sha1, S_IFDIR);
-		seen->two = alloc_filespec(two_parent_path);
-		fill_filespec(seen->two, null_sha1, S_IFDIR);
-		seen->discarded = 0;
-		seen->next = bulkmove_candidates;
-		bulkmove_candidates = seen;
-		fprintf(stderr, "[DBG] %s -> %s suggests possible bulk move from %s to %s\n",
-			dstpair->one->path,
-			dstpair->two->path,
-			one_parent_path, two_parent_path);
-		return;
-	}
+		/* next parent if any */
+		copy_dirname(one_parent_path, one_parent_path);
+	} while (*one_parent_path);
 }
 
 /*
-- 
1.7.2.3

  parent reply	other threads:[~2010-10-17 19:24 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-14 23:29 [PATCH v6 0/5] Detection of directory renames Yann Dirson
2010-10-14 23:29 ` [PATCH v6 1/5] Introduce bulk-move detection in diffcore Yann Dirson
2010-10-14 23:29   ` [PATCH v6 2/5] Add testcases for the --detect-bulk-moves diffcore flag Yann Dirson
2010-10-14 23:29     ` [PATCH v6 3/5] [RFC] Handle the simpler case of a subdir invalidating bulk move Yann Dirson
2010-10-14 23:29       ` [PATCH v6 4/5] [RFC] Consider all parents of a file as candidates for bulk rename Yann Dirson
2010-10-14 23:29         ` [PATCH v6 5/5] [WIP] Allow hiding renames of individual files involved in a directory rename Yann Dirson
2010-10-17 19:24         ` Yann Dirson [this message]
2010-10-15  5:17 ` [PATCH] compat: add memrchr() Jonathan Nieder
2010-10-15  5:31   ` Ævar Arnfjörð Bjarmason
2010-10-15  6:06     ` Jonathan Nieder
2010-10-15 10:49       ` Ævar Arnfjörð Bjarmason
2010-10-15 22:27         ` Junio C Hamano
2010-10-15  6:57     ` Johannes Sixt
2010-10-15  8:56   ` Ludvig Strigeus
2010-10-15 15:26     ` [PATCH v2] " Jonathan Nieder
2010-10-15  8:56   ` [PATCH] " Erik Faye-Lund
2010-10-15  9:35     ` Johannes Sixt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1287343443-23098-1-git-send-email-ydirson@altern.org \
    --to=ydirson@altern.org \
    --cc=git@vger.kernel.org \
    --cc=ydirson@free.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).