git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Michał Górny" <mgorny@gentoo.org>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Lars Schneider" <larsxschneider@gmail.com>,
	"Michał Górny" <mgorny@gentoo.org>
Subject: [RFC PATCH] checkout: Force matching mtime between files
Date: Fri, 13 Apr 2018 19:01:29 +0200	[thread overview]
Message-ID: <20180413170129.15310-1-mgorny@gentoo.org> (raw)

Currently git does not control mtimes of files being checked out.  This
means that the only assumption you could make is that all files created
or modified within a single checkout action will have mtime between
start time and end time of this checkout.  The relations between mtimes
of different files depend on the order in which they are checked out,
filesystem speed and timestamp precision.

Git repositories sometimes contain both generated and respective source
files.  For example, the Gentoo 'user syncing' repository combines
source ebuild files with generated metadata cache for user convenience.
Ideally, the 'git checkout' would be fast enough that (combined with low
timestamp precision) all files created or modified within a single
checkout would have matching timestamp.  However, in reality the cache
files may end up being wrongly 'older' than source file, requiring
unnecessary recheck.

The opposite problem (cache files not being regenerated when they should
have been) might also occur.  However, it could not be solved without
preserving timestamps, therefore it is out of scope of this change.

In order to avoid unnecessary cache mismatches, force a matching mtime
between all files created by a single checkout action.  This seems to be
the best course of action.  Matching mtimes do not trigger cache
updates.  They also match the concept of 'checkout' being an atomic
action.  Finally, this change does not break backwards compatibility
as the new result is a subset of the possible previous results.

For example, let's say that 'git checkout' creates two files in order,
with respective timestamps T1 and T2.  Before this patch, T1 <= T2.
After this patch, T1 == T2 which also satisfies T1 <= T2.

A similar problem was previously being addressed via git-restore-mtime
tool.  However, that solution is unnecessarily complex for Gentoo's use
case and does not seem to be suitable for 'seamless' integration.

The patch integrates mtime forcing via adding an additional member of
'struct checkout'.  This seemed the simplest way of adding it without
having to modify prototypes and adjust multiple call sites.  The member
is set to the current time in check_updates() function and is afterwards
enforced by checkout_entry().

The patch uses utime() rather than futimes() as that seems to be
the function used everywhere else in git and provided some MinGW
compatibility code.

Signed-off-by: Michał Górny <mgorny@gentoo.org>
---
 cache.h        |  1 +
 entry.c        | 12 +++++++++++-
 unpack-trees.c |  1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index bbaf5c349..9f0a7c867 100644
--- a/cache.h
+++ b/cache.h
@@ -1526,6 +1526,7 @@ struct checkout {
 	const char *base_dir;
 	int base_dir_len;
 	struct delayed_checkout *delayed_checkout;
+	time_t checkout_mtime;
 	unsigned force:1,
 		 quiet:1,
 		 not_new:1,
diff --git a/entry.c b/entry.c
index 2101201a1..7ee5a6d28 100644
--- a/entry.c
+++ b/entry.c
@@ -411,6 +411,7 @@ int checkout_entry(struct cache_entry *ce,
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
+	int ret;
 
 	if (topath)
 		return write_entry(ce, topath, state, 1);
@@ -473,5 +474,14 @@ int checkout_entry(struct cache_entry *ce,
 		return 0;
 
 	create_directories(path.buf, path.len, state);
-	return write_entry(ce, path.buf, state, 0);
+	ret = write_entry(ce, path.buf, state, 0);
+
+	if (ret == 0 && state->checkout_mtime != 0) {
+		struct utimbuf t;
+		t.modtime = state->checkout_mtime;
+		if (utime(path.buf, &t) < 0)
+			warning_errno("failed utime() on %s", path.buf);
+	}
+
+	return ret;
 }
diff --git a/unpack-trees.c b/unpack-trees.c
index e73745051..e1efefb68 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -346,6 +346,7 @@ static int check_updates(struct unpack_trees_options *o)
 	state.quiet = 1;
 	state.refresh_cache = 1;
 	state.istate = index;
+	state.checkout_mtime = time(NULL);
 
 	progress = get_progress(o);
 
-- 
2.17.0


             reply	other threads:[~2018-04-13 17:01 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-13 17:01 Michał Górny [this message]
2018-04-23 20:07 ` [RFC PATCH] checkout: Force matching mtime between files Robin H. Johnson
2018-04-23 23:41   ` Junio C Hamano
2018-04-25  7:13     ` Robin H. Johnson
2018-04-25  8:48       ` Junio C Hamano
2018-04-25 15:18         ` Marc Branchaud
2018-04-25 20:07           ` Robin H. Johnson
2018-04-26  1:25           ` Junio C Hamano
2018-04-26 14:12             ` Marc Branchaud
2018-04-26 14:46             ` Michał Górny
2018-04-28 14:23               ` Duy Nguyen
2018-04-28 19:35                 ` Michał Górny
2018-04-26 16:43           ` Duy Nguyen
2018-04-26 17:48             ` Robin H. Johnson
2018-04-26 18:44               ` Duy Nguyen
2018-04-29 23:56                 ` Junio C Hamano
2018-04-30 15:10                   ` Duy Nguyen
2018-04-27 17:03           ` Duy Nguyen
2018-04-27 21:08             ` Elijah Newren
2018-04-28  6:08               ` Duy Nguyen
2018-04-29 23:47               ` Junio C Hamano
2018-04-27 21:08             ` Marc Branchaud
2018-04-28  6:16               ` Duy Nguyen
2018-04-27 17:18           ` Michał Górny
2018-04-27 19:53             ` Ævar Arnfjörð Bjarmason
2018-04-25  8:41     ` Ævar Arnfjörð Bjarmason
2018-04-26 17:15       ` Duy Nguyen
2018-04-26 17:51         ` Robin H. Johnson
2018-04-26 17:53         ` SZEDER Gábor
2018-04-26 18:45           ` Duy Nguyen
2018-04-24 14:41 ` Marc Branchaud
2018-04-25  6:58 ` Robin H. Johnson
2018-04-25  7:13   ` Michał Górny
2018-05-05 18:44 ` Jeff King
2018-05-06  3:37   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180413170129.15310-1-mgorny@gentoo.org \
    --to=mgorny@gentoo.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=larsxschneider@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).