From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Subject: [RFC] git checkout $tree -- $path always rewrites files
Date: Fri, 7 Nov 2014 03:13:24 -0500 [thread overview]
Message-ID: <20141107081324.GA19845@peff.net> (raw)
I noticed that "git checkout $tree -- $path" will _always_ unlink and
write a new copy of each matching path, even if they are up-to-date with
the index and the content in $tree is the same.
Here's a simple reproduction:
# some repo with a file in it
git init
echo content >foo && git add foo && git commit -m foo
# give the file a known timestamp (it doesn't matter what)
perl -e 'utime(123, 123, @ARGV)' foo
git update-index --refresh
# now check out an identical copy
git checkout HEAD -- foo
# and check whether it was updated
perl -le 'print ((stat)[9]) for @ARGV' foo
which yields a modern timestamp, showing that the file was rewritten.
If you use
git checkout -- foo
instead to checkout from the index, that works fine (the final line
prints 123). Similarly, if you load the new index and checkout
separately, like:
git read-tree -m $tree
git checkout -- foo
that also works. Of course it is not quite the same; read-tree loads the
whole index from $tree, whereas checkout would only touch a subset of
the entries. And that's the culprit; checkout does not use unpack-trees
to only touch the entries that need updating. It uses read_tree_recursive
with a pathspec, and just replaces any index entries that match.
Below is a patch which makes it do what I expect by silently copying
flags and stat-data from the old entry, but I feel like it is going in
the wrong direction. Besides causing a few tests to fail (which I
suspect is because I implemented it at the wrong layer), it really feels
like this should be using unpack-trees or something similar.
After all, that is what "git reset $tree -- foo" does, and it gets this
case right (in fact, you can replace the read-tree above with it). It
looks like it actually does a pathspec-limited index-diff rather than
unpack-trees, but the end effect is the same (and while checkout could
just pass "update" to unpack-trees, we need to handle checking out
directly from the index anyway, so I do not think it is a big deal to
keep the index update as a separate step).
Is there a reason that we don't use this diff technique for checkout?
Anyway, here is the (wrong) patch I came up with initially.
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 29b0f6d..802e556 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -273,19 +273,13 @@ static int checkout_paths(const struct checkout_opts *opts,
ce->ce_flags &= ~CE_MATCHED;
if (!opts->ignore_skipworktree && ce_skip_worktree(ce))
continue;
- if (opts->source_tree && !(ce->ce_flags & CE_UPDATE))
- /*
- * "git checkout tree-ish -- path", but this entry
- * is in the original index; it will not be checked
- * out to the working tree and it does not matter
- * if pathspec matched this entry. We will not do
- * anything to this entry at all.
- */
- continue;
+
/*
- * Either this entry came from the tree-ish we are
- * checking the paths out of, or we are checking out
- * of the index.
+ * If we are checking out from a tree-ish, we already
+ * did our pathspec matching. However, since we then
+ * drop the CE_UPDATE flag from any paths that do not
+ * need updating, we don't know which ones were not
+ * mentioned and which ones were simply up-to-date.
*
* If it comes from the tree-ish, we already know it
* matches the pathspec and could just stamp
diff --git a/read-cache.c b/read-cache.c
index 8f3e9eb..fb2240b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -962,8 +962,13 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
/* existing match? Just replace it. */
if (pos >= 0) {
- if (!new_only)
+ if (!new_only) {
+ struct cache_entry *old = istate->cache[pos];
+ if (!is_null_sha1(ce->sha1) &&
+ !hashcmp(old->sha1, ce->sha1))
+ copy_cache_entry(ce, old);
replace_index_entry(istate, pos, ce);
+ }
return 0;
}
pos = -pos-1;
next reply other threads:[~2014-11-07 8:13 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-07 8:13 Jeff King [this message]
2014-11-07 8:38 ` [RFC] git checkout $tree -- $path always rewrites files Jeff King
2014-11-07 10:13 ` Duy Nguyen
2014-11-07 16:51 ` Junio C Hamano
2014-11-07 19:15 ` Jeff King
2014-11-07 17:14 ` Junio C Hamano
2014-11-07 19:17 ` Jeff King
[not found] ` <CANiSa6hufp=80TaesNpo1CxCbwVq3LPXvYaUSbcmzPE5pj_GGw@mail.gmail.com>
2014-11-08 7:10 ` Martin von Zweigbergk
[not found] ` <CAPc5daWdzrHr8Rdksr3HycMRQu0=Ji7h=BPYjzZj7MH6Ko0VgQ@mail.gmail.com>
2014-11-08 8:03 ` Martin von Zweigbergk
2014-11-08 8:30 ` Jeff King
2014-11-08 8:45 ` Jeff King
2014-11-09 18:37 ` Junio C Hamano
2014-11-08 16:19 ` Martin von Zweigbergk
2014-11-09 9:42 ` Jeff King
2014-11-09 17:21 ` Junio C Hamano
2014-11-13 18:30 ` Jeff King
2014-11-13 19:15 ` Junio C Hamano
2014-11-13 19:26 ` Jeff King
2014-11-13 20:03 ` Jeff King
2014-11-13 21:18 ` Junio C Hamano
2014-11-13 21:37 ` Jeff King
2014-11-14 5:44 ` David Aguilar
2014-11-14 19:27 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141107081324.GA19845@peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).