From: Junio C Hamano <gitster@pobox.com>
To: "Torsten Bögershausen" <tboegi@web.de>
Cc: Robert Dailey <rcdailey.lists@gmail.com>, Git <git@vger.kernel.org>
Subject: Re: Line ending normalization doesn't work as expected
Date: Thu, 05 Oct 2017 12:31:06 +0900
Message-ID: <xmqqo9pm8e4l.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <xmqq7ewa9xw6.fsf@gitster.mtv.corp.google.com> (Junio C. Hamano's message of "Thu, 05 Oct 2017 10:38:49 +0900")

Junio C Hamano <gitster@pobox.com> writes:

> Both this and its "git read-tree --empty" cousin share a grave
> issue.  The "git add ." step would mean that before doing these
> commands, your working tree must be truly clean, i.e. the paths
> in the filesystem known to the index must match what is in the
> index (modulo the line-ending gotcha you are trying to correct), 
> *AND* there must be *NO* untracked paths you do not want to add
> in the working tree.
>
> That is a reason why we should solve it differently.  Perhaps adding
> a new option "git add --rehash" to tell Git "Hey, you may think some
> paths in the index and in the working tree are identical and no need
> to re-register, but you are WRONG.  For each path in the index,
> remove it and then register the object by hashing the contents from
> the filesystem afresh!" would be the best way to go.

Here is a patch just to illustrate the direction I was heading in
above.  It is not even compile-tested, and I make no guarantees
about corner cases.

In true production code, we shouldn't be using a string-list with
two loops, but I just didn't want to spend more brain cycles worrying
about removing from the list and then adding to it, both inside a
single loop that iterates over it, in a mere illustration patch.
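
To make the two-loop shape concrete outside of Git, here is a
minimal standalone sketch.  It is not Git code: the toy_index
structure and the toy_add(), toy_remove() and toy_rehash() helpers
are hypothetical names invented for this illustration, and selecting
paths by suffix merely stands in for pathspec matching.  The first
pass snapshots the selected names without touching the array; the
second pass does the "remove then add", so the array is never
modified while it is being scanned.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX 16
#define TOY_NAME_LEN 64

/* Toy stand-in for the index: an unsorted array of path names. */
struct toy_index {
	char names[TOY_MAX][TOY_NAME_LEN];
	int nr;
};

/* Append a path; returns 0 on success, -1 if full or name too long. */
static int toy_add(struct toy_index *idx, const char *name)
{
	assert(idx && name);
	if (idx->nr >= TOY_MAX || strlen(name) >= TOY_NAME_LEN)
		return -1;
	strcpy(idx->names[idx->nr++], name);
	return 0;
}

/* Remove a path by name, shifting later entries down; 0 if found. */
static int toy_remove(struct toy_index *idx, const char *name)
{
	int i;

	for (i = 0; i < idx->nr; i++)
		if (!strcmp(idx->names[i], name))
			break;
	if (i == idx->nr)
		return -1;
	for (; i + 1 < idx->nr; i++)
		strcpy(idx->names[i], idx->names[i + 1]);
	idx->nr--;
	return 0;
}

/*
 * Two-pass "rehash".  Pass 1 only reads the index and snapshots the
 * names ending in `suffix`.  Pass 2 removes and re-adds each name.
 * Returns the number of paths re-registered, or -1 on error.
 */
static int toy_rehash(struct toy_index *idx, const char *suffix)
{
	char snapshot[TOY_MAX][TOY_NAME_LEN];
	size_t slen = strlen(suffix);
	int i, nr = 0;

	for (i = 0; i < idx->nr; i++) {
		size_t len = strlen(idx->names[i]);

		if (len < slen || strcmp(idx->names[i] + len - slen, suffix))
			continue; /* not selected */
		strcpy(snapshot[nr++], idx->names[i]);
	}

	for (i = 0; i < nr; i++)
		if (toy_remove(idx, snapshot[i]) || toy_add(idx, snapshot[i]))
			return -1;
	return nr;
}
```

Starting from "a.txt", "b.c", "c.txt" and calling
toy_rehash(&idx, ".txt") re-registers two paths and leaves the array
as "b.c", "a.txt", "c.txt".  Doing the removal inside the scanning
loop instead would shift entries under the loop counter, which is
the bookkeeping the snapshot avoids.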

The second loop uses a simple "remove then add", but I think it
should rather be a "mark ce that it will _never_ match anything on
the working tree" followed by "add_file_to_cache()".  Currently we
do not have a "mark ce that it never matches" operation that lets
us bypass the comparison with the current cache entry (along with
the safecrlf thing that interferes), but we can afford to use an
in-core-only bit in the ce flags word to represent this and plumb it
through.
That way, we will still preserve the executable bit from the
original entry, hopefully ;-)
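
As a rough standalone model of that "never matches" bit (again not
Git code), the sketch below uses a hypothetical TOY_CE_NEVER_MATCH
flag and hypothetical toy_entry helpers.  Comparing the stored
content with the working-tree content stands in for Git's real
"is this entry up to date?" check, and the CRLF-to-LF step stands in
for the conversion that would apply when the path is registered
again.  Because the entry is updated in place rather than removed
and re-added, its mode is carried over.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical in-core-only flag; not an existing Git flag name. */
#define TOY_CE_NEVER_MATCH (1u << 0)

struct toy_entry {
	unsigned int mode;   /* e.g. 0100644 or 0100755 */
	unsigned int flags;  /* in-core bits only */
	char content[64];    /* stand-in for the registered blob */
};

/* Copy src to dst, dropping the CR of every CRLF pair. */
static void toy_crlf_to_lf(char *dst, const char *src)
{
	for (; *src; src++) {
		if (src[0] == '\r' && src[1] == '\n')
			continue;
		*dst++ = *src;
	}
	*dst = '\0';
}

/* Nonzero if the entry is considered identical to the worktree. */
static int toy_entry_matches(const struct toy_entry *ce, const char *worktree)
{
	if (ce->flags & TOY_CE_NEVER_MATCH)
		return 0; /* forced mismatch: skip the comparison */
	return !strcmp(ce->content, worktree);
}

/*
 * Re-register worktree content for an existing entry.  Returns 1 if
 * the entry was updated and 0 if it was skipped as unchanged.  The
 * mode is left alone, so a recorded executable bit survives.
 */
static int toy_update_entry(struct toy_entry *ce, const char *worktree)
{
	assert(strlen(worktree) < sizeof(ce->content));
	if (toy_entry_matches(ce, worktree))
		return 0;
	toy_crlf_to_lf(ce->content, worktree);
	ce->flags &= ~TOY_CE_NEVER_MATCH; /* one-shot marker */
	return 1;
}
```

With an entry holding "a\r\n" at mode 0100755 and a working-tree
file that also reads "a\r\n", toy_update_entry() skips the path
until the flag is set; once it is set, the stored content becomes
"a\n" and the mode is still 0100755.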


 builtin/add.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5d5773d5cd..264f84dbe7 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -26,6 +26,7 @@ static const char * const builtin_add_usage[] = {
 };
 static int patch_interactive, add_interactive, edit_interactive;
 static int take_worktree_changes;
+static int rehash;
 
 struct update_callback_data {
 	int flags;
@@ -121,6 +122,41 @@ int add_files_to_cache(const char *prefix,
 	return !!data.add_errors;
 }
 
+static int rehash_tracked_files(const char *prefix, const struct pathspec *pathspec,
+				int flags)
+{
+	struct string_list paths = STRING_LIST_INIT_DUP;
+	struct string_list_item *path;
+	int i, retval = 0;
+
+	for (i = 0; i < active_nr; i++) {
+		struct cache_entry *ce = active_cache[i];
+
+		if (ce_stage(ce))
+			continue; /* do not touch unmerged paths */
+		if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
+			continue; /* do not touch non blobs */
+		if (pathspec && !ce_path_match(ce, pathspec, NULL))
+			continue;
+		string_list_append(&paths, ce->name);
+	}
+
+	for_each_string_list_item(path, &paths) {
+		/*
+		 * Having a blob contaminated with CR will trigger the
+		 * safe-crlf kludge, avoidance of which is the primary
+		 * reason this helper function exists.  Remove it and
+		 * then re-add it.  Note that this may lose the executable
+		 * bit on a filesystem that lacks it.
+		 */
+		remove_file_from_cache(path->string);
+		retval |= !!add_file_to_cache(path->string, flags);
+	}
+
+	string_list_clear(&paths, 0);
+	return retval;
+}
+
 static char *prune_directory(struct dir_struct *dir, struct pathspec *pathspec, int prefix)
 {
 	char *seen;
@@ -274,6 +310,7 @@ static struct option builtin_add_options[] = {
 	OPT_BOOL('e', "edit", &edit_interactive, N_("edit current diff and apply")),
 	OPT__FORCE(&ignored_too, N_("allow adding otherwise ignored files")),
 	OPT_BOOL('u', "update", &take_worktree_changes, N_("update tracked files")),
+	OPT_BOOL(0, "rehash", &rehash, N_("really update tracked files")),
 	OPT_BOOL('N', "intent-to-add", &intent_to_add, N_("record only the fact that the path will be added later")),
 	OPT_BOOL('A', "all", &addremove_explicit, N_("add changes from all tracked and untracked files")),
 	{ OPTION_CALLBACK, 0, "ignore-removal", &addremove_explicit,
@@ -498,7 +535,10 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	plug_bulk_checkin();
 
-	exit_status |= add_files_to_cache(prefix, &pathspec, flags);
+	if (rehash)
+		exit_status |= rehash_tracked_files(prefix, &pathspec, flags);
+	else
+		exit_status |= add_files_to_cache(prefix, &pathspec, flags);
 
 	if (add_new_files)
 		exit_status |= add_files(&dir, flags);

Thread overview: 34+ messages
2017-10-03 15:00 Line ending normalization doesn't work as expected Robert Dailey
2017-10-03 16:26 ` Torsten Bögershausen
2017-10-03 17:23   ` Robert Dailey
2017-10-03 19:19     ` Torsten Bögershausen
2017-10-04  2:00       ` Junio C Hamano
2017-10-04 16:26         ` Robert Dailey
2017-10-04 16:59           ` Jonathan Nieder
2017-10-04 18:03             ` Robert Dailey
2017-10-05  1:31             ` Junio C Hamano
2017-10-05  1:46               ` Jonathan Nieder
2017-10-04 21:17           ` Torsten Bögershausen
2017-10-05  1:38             ` Junio C Hamano
2017-10-05  3:31               ` Junio C Hamano [this message]
2017-10-05 21:42                 ` Torsten Bögershausen
2017-10-06  0:33                   ` Junio C Hamano
2017-10-06 17:58                     ` Torsten Bögershausen
2017-10-16 16:49                 ` [PATCH v1 1/1] Introduce git add --renormalize tboegi
2017-10-16 17:34                   ` Junio C Hamano
2017-10-30 16:29                     ` [PATCH v2 " tboegi
2017-11-07  5:50                       ` Junio C Hamano
2017-11-07 17:26                         ` Torsten Bögershausen
2017-11-08  0:37                           ` Junio C Hamano
2017-11-09 18:47                             ` Torsten Bögershausen
2017-11-10  0:22                               ` Junio C Hamano
2017-11-12 20:08                                 ` Torsten Bögershausen
2017-11-16 16:38                     ` [PATCH v3 " tboegi
2017-11-17  1:24                       ` Junio C Hamano
2017-11-17 20:44                       ` Eric Sunshine
2017-11-18  1:47                         ` Junio C Hamano
2018-02-15 15:24         ` Line ending normalization doesn't work as expected Robert Dailey
2018-02-15 19:16           ` Junio C Hamano
2018-02-15 21:47             ` Robert Dailey
2018-02-16 16:34           ` Torsten Bögershausen
2018-02-16 17:19             ` Robert Dailey
