git@vger.kernel.org mailing list mirror (one of many)
From: Zygo Blaxell <zblaxell@gibbs.hungrycats.org>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Nicolas Pitre <nico@fluxnic.net>,
	Junio C Hamano <gitster@pobox.com>,
	Thomas Rast <trast@student.ethz.ch>,
	Dmitry Potapov <dpotapov@gmail.com>,
	Ilari Liusvaara <ilari.liusvaara@elisanet.fi>,
	git@vger.kernel.org
Subject: Re: [PATCH] Teach "git add" and friends to be paranoid
Date: Fri, 19 Feb 2010 10:26:09 -0500
Message-ID: <20100219152609.GC11733@gibbs.hungrycats.org>
In-Reply-To: <20100219010456.GA1789@progeny.tock>

On Thu, Feb 18, 2010 at 07:04:56PM -0600, Jonathan Nieder wrote:
> Nicolas Pitre wrote:
> > On Thu, 18 Feb 2010, Junio C Hamano wrote:
> >> I suspect that opening to mmap(2), hashing once to compute the object
> >> name, and deflating it to write it out, will all happen within the same
> >> second, unless you are talking about a really huge file, or you started at
> >> very near a second boundary.
> >
> > How is the index dealing with this?  Surely if a file is added to the 
> > index and modified within the same second then 'git status' will fail to 
> > notice the changes.  I'm not familiar enough with that part of Git.
> 
> See Documentation/technical/racy-git.txt and t/t0010-racy-git.sh.
> 
> Short version: in the awful case, the timestamp of the index is the
> same as (or before) the timestamp of the file.  Git will notice this
> and re-hash the tracked file.

As far as I can tell, the index doesn't handle this case at all.

Suppose the file is modified near its beginning while git add is running,
after git add has already read that part of the file, and the modification
finishes before git add does.  Now the mtime of the file is earlier
than the index timestamp, but the file contents don't match the index.
This holds even if the objects git adds to the index aren't corrupted.
Actually right now you can have all four combinations:  index up to date
or not, and object matching its sha1 hash or not, depending on where and
when you modify data during an index update.
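
To make the scenario concrete, here is a minimal sketch (Python, not git
code) that simulates it deterministically: a reader hashes the first part
of a file, a writer then overwrites bytes the reader has already consumed,
and the reader finishes.  The helper names are hypothetical, and the hash
is a plain SHA-1 of the contents rather than git's blob object name.

```python
import hashlib
import os
import tempfile


def hash_with_midway_write(path, split, patch):
    """Hash `path` in two reads, overwriting its first bytes in between.

    Hypothetical helper: the nested open() stands in for another process
    modifying the file while it is being hashed.
    """
    h = hashlib.sha1()
    with open(path, "rb") as reader:
        h.update(reader.read(split))        # reader has consumed the head
        with open(path, "r+b") as writer:   # simulated concurrent writer
            writer.write(patch)             # changes bytes already read
        h.update(reader.read())             # reader finishes the tail
    return h.hexdigest()


def demo():
    """Return (recorded hash, hash of final file, hash of original data)."""
    original_data = b"A" * 64
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "tracked.txt")
        with open(path, "wb") as f:
            f.write(original_data)
        recorded = hash_with_midway_write(path, split=32, patch=b"B" * 8)
        with open(path, "rb") as f:
            on_disk = hashlib.sha1(f.read()).hexdigest()
    return recorded, on_disk, hashlib.sha1(original_data).hexdigest()


if __name__ == "__main__":
    recorded, on_disk, original = demo()
    print("recorded hash matches pre-write data:", recorded == original)
    print("recorded hash matches file on disk:  ", recorded == on_disk)
```

In this run the recorded hash is internally consistent (it matches the
data the reader saw), yet it no longer describes the file on disk, and the
write finished before the hashing did.  That is the "object intact, index
stale" combination; moving the write to a region the reader has not yet
reached gives the other outcomes.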

racy-git.txt doesn't discuss concurrent modification of files with the
index.  It only discusses low-resolution file timestamps and modifications
at times that are close to, but not concurrent with, index modifications.

Git probably also doesn't handle things like NTP time corrections
(especially those where time moves backward by sub-second intervals) or
mismatched server/client clocks on remote filesystems (mind you,
I know of no SCM that currently handles that case, and CVS in particular
is unusually bad at it).

Personally, I find the combination of nanosecond-precision timestamps
and network file systems amusing.  At nanosecond precision, relativistic
effects start to matter across a volume of space the size of my laptop.
I'm not sure how timestamps at any resolution could be a reliable metric
for detecting changes to file contents in the general case.  A valuable
hint in many cases, but not authoritative (unless they all come from a
single monotonic high-resolution clock guaranteed to increment faster than
git--but they don't).

rsync solves this sort of problem with a 'modification window' parameter,
which is a time interval that is "close enough" to consider two timestamps
to be equal.  Some of rsync's use cases set that window to six months.
Git would use a modification window for the opposite reason rsync
does--rsync uses the window to avoid unnecessarily examining files that
have different timestamps, while git would use it to re-examine files
even when it appears to be unnecessary.

Git probably wants the modification window to be the maximum clock
offset between a network filesystem client and server plus the minimum
representable interval in the filesystem's timestamp data type--which
is a value git couldn't possibly know for some cases, so it needs input
from the user.
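
As a rough sketch of that rule (Python, not git code), assuming timestamps
in nanoseconds and two user-supplied inputs: the function names and the
parameters max_clock_offset_ns and timestamp_resolution_ns are
hypothetical, and nothing here corresponds to an existing git setting.

```python
def modification_window_ns(max_clock_offset_ns, timestamp_resolution_ns):
    """Window = maximum client/server clock offset + smallest timestamp step.

    Both inputs are assumed to come from the user, since they cannot be
    discovered reliably for every filesystem.
    """
    return max_clock_offset_ns + timestamp_resolution_ns


def must_rehash(file_mtime_ns, index_mtime_ns, window_ns):
    """True when the file's mtime is too close to the index's to trust.

    Without a window, only files at or after the index timestamp would be
    re-examined; the window extends that check backwards in time.
    """
    return file_mtime_ns >= index_mtime_ns - window_ns


if __name__ == "__main__":
    # Example: up to 2 s of clock offset on a filesystem with 1 s timestamps.
    window = modification_window_ns(2_000_000_000, 1_000_000_000)
    index_mtime = 12_000_000_000
    print(must_rehash(10_000_000_000, index_mtime, window))  # inside window
    print(must_rehash(5_000_000_000, index_mtime, window))   # well before
```

A window of zero reduces this to a plain "file is not older than the
index" comparison, and a larger window trades extra re-hashing for fewer
missed changes.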
