git@vger.kernel.org mailing list mirror (one of many)
From: "Nigel Magnay" <nigel.magnay@gmail.com>
To: "Dmitry Potapov" <dpotapov@gmail.com>
Cc: git <git@vger.kernel.org>
Subject: Re: crlf with git-svn driving me nuts...
Date: Thu, 17 Apr 2008 08:07:27 +0100
Message-ID: <320075ff0804170007k5171eb72n68882679f62fa238@mail.gmail.com>
In-Reply-To: <20080417004645.GK3133@dpotapov.dyndns.org>

On Thu, Apr 17, 2008 at 1:46 AM, Dmitry Potapov <dpotapov@gmail.com> wrote:
> On Thu, Apr 17, 2008 at 12:07:27AM +0100, Nigel Magnay wrote:
>  > >  > The bit I really don't understand is why git thinks a file that has
>  > >  > just been touched has changed when it hasn't,
>  > >
>  > >  Actually, it did change in the sense that if you try to commit this
>  > >  file now into the repository, you will have a different file in Git!
>  > >  So, it is more correct to say that Git did not notice this change until
>  > >  you touch this file, because this change is indirect (autocrlf causes
>  > >  a different interpretation of the file).
>  > >
>  >
>  > Okay - at the very least this behaviour is really, really confusing.
>  > And I think there's actually a bug (it should *always* report that the
>  > file is different), not magically after it's been touched.
>
>  I don't think there is a simple way to correct that without penalizing
>  normal use cases. Usually, people do not change autocrlf during their
>  normal work. Besides, you can have your own input filters and they may
>  cause the same effect. So, Git works on the assumption that input filters
>  always produce the same results...

This has nothing to do with changing core.autocrlf after checkout -
it's a problem with *any* repo containing CRLF files that is checked
out on a core.autocrlf=true machine, which is basically any Windows
machine.
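
As a rough illustration of why such a checkout looks modified, here is a
toy Python model. The function clean_autocrlf is a hypothetical stand-in
for the CRLF->LF conversion applied on the way into the repository; the
real conversion has extra heuristics (binary detection, for instance) that
this sketch ignores.

```python
def clean_autocrlf(data: bytes) -> bytes:
    """Toy stand-in (assumption) for the CRLF->LF conversion on commit."""
    return data.replace(b"\r\n", b"\n")


# The blob was stored with CRLF, e.g. imported as-is through git-svn.
repo_blob = b"line one\r\nline two\r\n"

# On checkout the LF->CRLF direction has nothing to convert, so the
# working copy ends up byte-identical to the stored blob.
work_file = repo_blob

# What would be written to the repository if the file were committed now.
would_commit = clean_autocrlf(work_file)

print(would_commit == repo_blob)  # False: the content to commit differs
```

Nobody edited the file, yet the content that would be committed no longer
matches what is stored.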

The current 'isDirty' check seems to be something like

isDirty = ( wc.file.mtime > someValue )
       && ( repository.file != filter(wc.file) )

I'm saying it ought to be something like

isDirty = ( wc.file.mtime > someValue )
       && ( sha1(repository.file) != sha1(wc.file) )
       && ( repository.file != filter(wc.file) )
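
The two checks can be modelled in Python roughly as follows. This is only a
sketch of the pseudocode above, not Git's actual implementation: the
mtime test is reduced to a boolean flag, to_repo_form is a hypothetical
CRLF->LF filter, and sha1 here hashes the raw bytes rather than a Git blob
object.

```python
import hashlib


def to_repo_form(data: bytes) -> bytes:
    """Hypothetical input filter: CRLF->LF, as autocrlf would do on commit."""
    return data.replace(b"\r\n", b"\n")


def sha1(data: bytes) -> str:
    # Plain SHA-1 over the bytes; stands in for "are these byte-identical?"
    return hashlib.sha1(data).hexdigest()


def is_dirty_current(repo_blob: bytes, wc_bytes: bytes, mtime_changed: bool) -> bool:
    """Model of the check as it seems to behave today."""
    return mtime_changed and repo_blob != to_repo_form(wc_bytes)


def is_dirty_proposed(repo_blob: bytes, wc_bytes: bytes, mtime_changed: bool) -> bool:
    """Model of the suggested check: a byte-identical file is never dirty."""
    return (
        mtime_changed
        and sha1(repo_blob) != sha1(wc_bytes)
        and repo_blob != to_repo_form(wc_bytes)
    )


crlf_blob = b"a\r\nb\r\n"

# Fresh checkout, then the same file after a plain `touch`.
print(is_dirty_current(crlf_blob, crlf_blob, mtime_changed=False))  # False
print(is_dirty_current(crlf_blob, crlf_blob, mtime_changed=True))   # True
print(is_dirty_proposed(crlf_blob, crlf_blob, mtime_changed=True))  # False

# A genuine edit is still reported by the proposed check.
print(is_dirty_proposed(crlf_blob, b"a\r\nedited\r\n", mtime_changed=True))  # True
```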


>
>
>  >
>  > But fixing that minor bug still leads to badness for the user. Doing
>  > (on a core.autocrlf=true machine) a checkout of any revision
>  > containing a file that is (currently) CRLF in the repository, and your
>  > WC is *immediately* dirty. However technically correct that is, it
>  > doesn't fit most people's user model of an SCM, because they haven't
>  > made any modification.
>
>  IMHO, the only sane way is to never store CRLF in the Git repository.
>  You can have whatever ending you like in your work tree, but inside
>  of Git, LF is the actual end-of-line marker.
>

Great. I'll go and argue with the team using svn, who don't even
*notice* this issue, and try to get them to adjust the metadata on
every single file in the repository.

Then, for a bonus, I'll try the same with every OSS project that I'm
tracking with git-svn. :-(

I get that things are horribly broken if you get CRLF in your
repository. But it's unreasonable to expect the ability to bend the
rest of the world to what's convenient for me! Some of our Windows
coders probably even *like* svn:eol-style=CRLF!

>
>  > And if 1 person makes a change along with their
>  > conversion, and the other 'just' does a CRLF->LF conversion,
>
>  If you imported correctly into Git, it should not have CRLF for text
>  files. So, there is no conversion that a user does explicitly.
>
>
>  > And because the svn is
>  > mastered crlf (well, strictly speaking, it's ignorant of line endings)
>  > this is gonna happen a lot.
>
>  Not really. SVN has its own setting for EOL conversion. If you have
>  'svn:eol-style' set to 'native' for any text file, then SVN will
>  check out text files according to your native EOL (you can specify
>  your native EOL using the --native-eol option when necessary).
>

Can I set this personally, without affecting the svn repo? If so, why
isn't git-svn doing this anyway, and can I tell it to do so?

>
>  > Can't git be taught that if the WC is byte-identical to the revision
>  > in the repository (regardless of autocrlf) then that ought not to be
>  > regarded as a change?
>
>  Why should it not? If a file is different as far as the Git repository
>  is concerned, then it *is* a change. Git does a binary comparison of
>  files _after_ applying all specified filters (and you can have your own
>  filters, not only autocrlf).
>

See above. Unchanged (on disk, byte identical) files, if touched, get
(sometimes) marked as dirty.

>
>  > Is there a way I can persuade the diff / merge mechanisms to normalise
>  > before they operate? (e.g if core.autocrlf does lf->crlf/crlf->lf,
>  > then an equivalent that does crlf->lf/crlf->lf before doing the merge
>  > )?
>
>  I am not sure if there is a standard option for that, but it is
>  certainly possible to define your own merge strategy.
>
Ok - I'll have a look into this - just a filter on each file before
merging would be sufficient. Presumably people that do things like
$Id$ expansion need something similar to avoid constant merge
conflicts..
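
A minimal Python sketch of what "filter each file before merging" would buy,
assuming a hypothetical normalise_eol helper that maps CRLF to LF. It does
not perform a merge and is not a Git merge driver; it only counts which
lines differ between versions before and after normalising.

```python
def normalise_eol(data: bytes) -> bytes:
    """Hypothetical pre-merge filter: map CRLF to LF."""
    return data.replace(b"\r\n", b"\n")


def changed_lines(a: bytes, b: bytes) -> list[int]:
    """Indexes of lines that differ (inputs here have equal line counts)."""
    pairs = zip(a.splitlines(keepends=True), b.splitlines(keepends=True))
    return [i for i, (x, y) in enumerate(pairs) if x != y]


base = b"a\r\nb\r\n"      # common ancestor, stored with CRLF
ours = b"a\nb\n"          # one side only converted CRLF->LF
theirs = b"a\r\nB\r\n"    # other side made a real change, still CRLF

# Raw comparison: the conversion touches every line, so it overlaps
# with the real change on the other side.
print(changed_lines(base, ours))    # [0, 1]
print(changed_lines(base, theirs))  # [1]

# After normalising all three versions, only the real change remains.
nbase, nours, ntheirs = (normalise_eol(v) for v in (base, ours, theirs))
print(changed_lines(nbase, nours))    # []
print(changed_lines(nbase, ntheirs))  # [1]
```

With the inputs normalised, the side that only converted line endings
contributes no changes, so there is nothing left to conflict with.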

>
>  >
>  > In a perfect world I'd be able to switch all files in the repo to LF,
>  > but that's not going to happen any time soon because the majority
>  > of developers are still on svn, still on Windows.
>
>  Well, I don't see any problem here if everything is configured properly.
>  How files are stored inside and what you have in your work tree does
>  not have to be the same. So, storing everything inside with LF is
>  certainly possible. Actually, I believe it is exactly what CVS does
>  (unless you added a file with '-kb'), and people use CVS on Windows.
>  Importing files with CRLF into Git is like adding files as _binary_
>  in CVS.
>
>  Dmitry
>
