git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] Respect crlf attribute even if core.autocrlf has not been set
@ 2008-07-22 21:56 Johannes Schindelin
  2008-07-22 23:11 ` Dmitry Potapov
  2008-07-23  0:12 ` Junio C Hamano
  0 siblings, 2 replies; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-22 21:56 UTC (permalink / raw
  To: git, gitster


When a file's crlf attribute is explicitely set, it does not make sense
to ignore it, just because the config variable core.autocrlf has not
been set.

This patch does not affect the case when the crlf attribute is unset.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	There have been no comments on this patch so far, however I think
	that this is a valid fix which should be in 1.6.0.

 convert.c       |    2 +-
 t/t0020-crlf.sh |   10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/convert.c b/convert.c
index 78efed8..d038d2f 100644
--- a/convert.c
+++ b/convert.c
@@ -126,7 +126,7 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 	struct text_stat stats;
 	char *dst;
 
-	if ((action == CRLF_BINARY) || !auto_crlf || !len)
+	if ((action == CRLF_BINARY) || (!auto_crlf && action < 0) || !len)
 		return 0;
 
 	gather_stats(src, len, &stats);
diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
index 1be7446..0bb3e6f 100755
--- a/t/t0020-crlf.sh
+++ b/t/t0020-crlf.sh
@@ -436,4 +436,14 @@ test_expect_success 'invalid .gitattributes (must not crash)' '
 
 '
 
+test_expect_success 'attribute crlf is heeded even without core.autocrlf' '
+
+	echo "allcrlf crlf=input" > .gitattributes &&
+	git config --unset core.autocrlf &&
+	git add allcrlf &&
+	git show :allcrlf | append_cr > expect &&
+	test_cmp allcrlf expect
+
+'
+
 test_done
-- 
1.6.0.rc0.22.gf2096d.dirty

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-22 21:56 [PATCH] Respect crlf attribute even if core.autocrlf has not been set Johannes Schindelin
@ 2008-07-22 23:11 ` Dmitry Potapov
  2008-07-22 23:23   ` Johannes Schindelin
  2008-07-23  0:12 ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-22 23:11 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git, gitster

On Tue, Jul 22, 2008 at 10:56:04PM +0100, Johannes Schindelin wrote:
> 
> When a file's crlf attribute is explicitely set, it does not make sense
> to ignore it, just because the config variable core.autocrlf has not
> been set.

Hmm... About a week ago, I was about to propose the same change, but
after reading documentation and some thinking I was not able to convince
myself that this change would be the right thing to do.

First, let's look at what Git's documentation says:

===
`crlf`
^^^^^^

This attribute controls the line-ending convention.

Set::

	Setting the `crlf` attribute on a path is meant to mark
	the path as a "text" file.  'core.autocrlf' conversion
	takes place without guessing the content type by
	inspection.

<snip>

The `core.autocrlf` conversion
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the configuration variable `core.autocrlf` is false, no
conversion is done.
===

So, my reading is that if I set the `crlf` attribute on some path, but
I have core.autocrlf=false, there will be no conversion.

And this can be used to mark text files in .gitattribute, which is
stored in the repository and thus it is shared among users with
different end-of-line ending, i.e. you can have something like this
in .gitattribute:

*.[ch] crlf
*.txt crlf

but on Unix, you have core.autocrlf=false, so no conversion is done,
while, on Windows, you set core.autocrlf=true, so you will have crlf
conversion without any guessing.

Now, I can agree with you that using the 'crlf' attribute to mark text
files may appear not very intuitive (you may expect that crlf means that
those files always need crlf conversion), but right now we do not have
any better way to mark text files and the using crlf in this role is
explicitly suggested by documentation. See above.


> diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
> index 1be7446..0bb3e6f 100755
> --- a/t/t0020-crlf.sh
> +++ b/t/t0020-crlf.sh
> @@ -436,4 +436,14 @@ test_expect_success 'invalid .gitattributes (must not crash)' '
>  
>  '
>  
> +test_expect_success 'attribute crlf is heeded even without core.autocrlf' '

s/heeded/needed/


Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-22 23:11 ` Dmitry Potapov
@ 2008-07-22 23:23   ` Johannes Schindelin
  2008-07-23  0:14     ` Dmitry Potapov
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-22 23:23 UTC (permalink / raw
  To: Dmitry Potapov; +Cc: git, gitster

Hi,

On Wed, 23 Jul 2008, Dmitry Potapov wrote:

> On Tue, Jul 22, 2008 at 10:56:04PM +0100, Johannes Schindelin wrote:
> > 
> > When a file's crlf attribute is explicitely set, it does not make 
> > sense to ignore it, just because the config variable core.autocrlf has 
> > not been set.
> 
> Hmm... About a week ago, I was about to propose the same change, but 
> after reading documentation and some thinking I was not able to convince 
> myself that this change would be the right thing to do.

Well, I have a shared repository, where I set the attribute.  Now, every 
once in a while, people check in text _with_ CR/LF.  Yes, that is right, I 
marked it explicitely as crlf, yet I am on the whim of the people choosing 
to set the config variable or not.

And I could not care less what the documentation says: if it does not make 
sense, it does not make sense.

> > +test_expect_success 'attribute crlf is heeded even without core.autocrlf' '
> 
> s/heeded/needed/

Nope.  "heeded" is what I meant.  I am not a native speaker, so this could 
be wrong.  But "needed" is not what I meant (the sentence would not make 
sense with "needed").

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-22 21:56 [PATCH] Respect crlf attribute even if core.autocrlf has not been set Johannes Schindelin
  2008-07-22 23:11 ` Dmitry Potapov
@ 2008-07-23  0:12 ` Junio C Hamano
  2008-07-23  1:10   ` Johannes Schindelin
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2008-07-23  0:12 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> When a file's crlf attribute is explicitely set, it does not make sense
> to ignore it, just because the config variable core.autocrlf has not
> been set.

I am not sure if I agree with that reasoning.

Attribute defines what each path is.  Is it a text file, is it a binary?
The nature of the contents does not change between people on POSIX and
Windows, and that is why it is described in .gitattributes and cloned
across repositories.

On the other hand, the configuration defines what to do with contents with
various attributes in this particular repository.  Do I want to see a text
file checked out with CRLF endings, or LF?

So it is perfectly valid and normal for a cross-platform minded project to
use the crlf atttribute to say "These files are text" and expect them to
be checked out with LF endings on POSIX while making sure they are checked
out with CRLF on Windows.  Adding CR at the end of line for such files on
POSIX systems is positively a wrong thing to do in such a case.

Projects like the kernel that originate from LF side of the world may not
bother marking things as such, though.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-22 23:23   ` Johannes Schindelin
@ 2008-07-23  0:14     ` Dmitry Potapov
  0 siblings, 0 replies; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-23  0:14 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git, gitster

On Wed, Jul 23, 2008 at 12:23:27AM +0100, Johannes Schindelin wrote:
> Hi,
> 
> On Wed, 23 Jul 2008, Dmitry Potapov wrote:
> 
> > On Tue, Jul 22, 2008 at 10:56:04PM +0100, Johannes Schindelin wrote:
> > > 
> > > When a file's crlf attribute is explicitely set, it does not make 
> > > sense to ignore it, just because the config variable core.autocrlf has 
> > > not been set.
> > 
> > Hmm... About a week ago, I was about to propose the same change, but 
> > after reading documentation and some thinking I was not able to convince 
> > myself that this change would be the right thing to do.
> 
> Well, I have a shared repository, where I set the attribute.  Now, every 
> once in a while, people check in text _with_ CR/LF.  Yes, that is right, I 
> marked it explicitely as crlf, yet I am on the whim of the people choosing 
> to set the config variable or not.
> 
> And I could not care less what the documentation says: if it does not make 
> sense, it does not make sense.

If you think that the current documentation does not make sense, why
don't you write something that will make sense? Really, the current
behavior may not be the best one, but at least it is consistent with
documentation, while your change may confuse users, because they may
expect one behavior, but git will act differently.

If I understand your change correctly, you take into account the crlf
attribute unconditionally only in worktree-to-git conversion, while you
still ignore it if core.autocrlf=false on checkout. Is it correct?  If
so, I think your patch does make sense, and it should not break anything
too badly, because you still respect core.autocrlf on checkout, but you
should have said that in your commit message.

Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23  0:12 ` Junio C Hamano
@ 2008-07-23  1:10   ` Johannes Schindelin
  2008-07-23  1:31     ` [PATCH] Respect crlf attribute in "git add" " Johannes Schindelin
  2008-07-23 17:07     ` [PATCH] Respect crlf attribute " Junio C Hamano
  0 siblings, 2 replies; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-23  1:10 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

Hi,

On Tue, 22 Jul 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > When a file's crlf attribute is explicitely set, it does not make 
> > sense to ignore it, just because the config variable core.autocrlf has 
> > not been set.
> 
> I am not sure if I agree with that reasoning.
> 
> Attribute defines what each path is.  Is it a text file, is it a binary? 
> The nature of the contents does not change between people on POSIX and 
> Windows, and that is why it is described in .gitattributes and cloned 
> across repositories.
> 
> On the other hand, the configuration defines what to do with contents with
> various attributes in this particular repository.  Do I want to see a text
> file checked out with CRLF endings, or LF?

Actually, I now see that I expressed myself badly.  Extremely badly.

The whole issue is about _check in_, as can be seen from the test case I 
provided.

And I think it is even a bug in crlf handling, as gitattributes.txt has 
this to say about crlf=input:

        This is similar to setting the attribute to `true`, but
        also forces git to act as if `core.autocrlf` is set to
        `input` for the path.

It suggests to this coder that core.autocrlf is not even looked at when 
crlf=input.

> So it is perfectly valid and normal for a cross-platform minded project 
> to use the crlf atttribute to say "These files are text" and expect them 
> to be checked out with LF endings on POSIX while making sure they are 
> checked out with CRLF on Windows.  Adding CR at the end of line for such 
> files on POSIX systems is positively a wrong thing to do in such a case.
> 
> Projects like the kernel that originate from LF side of the world may not
> bother marking things as such, though.

I fully agree.

However, if you want to avoid CRs to _enter_ the repository, when you have 
a lot of binary files tracked, you _do_ want to force all repositories to 
crlf=input.

Now, if you look at the patch, you will see that it _only_ touches 
crlf_to_git(), but not only for crlf=input, but also for crlf=true.

I maintain that this respects the law of least surprise, namely that if 
you set the attribute crlf=true, and some person forgets to set 
core.autocrlf=true, at _check in_ CRs are stripped, but at _check out_, no 
CR is added (as the person did not ask for core.autocrlf, but that should 
not punish all the others who do not want to have CRs in the repository).

But yes, the commit message, and the oneline in particular are severely 
lacking.

Tomorrow,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-23  1:10   ` Johannes Schindelin
@ 2008-07-23  1:31     ` Johannes Schindelin
  2008-07-23  5:49       ` Steffen Prohaska
  2008-07-23 17:07     ` [PATCH] Respect crlf attribute " Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-23  1:31 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Dmitry Potapov


When a file's crlf attribute is explicitely set, it does not make sense
to ignore it when adding the file, just because the config variable
core.autocrlf has not been set.

The alternative would be risking to get CR/LF files into the repository
just because one user forgot to set core.autocrlf.

This patch does not affect the case when the crlf attribute is unset,
or when checking files out.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	Okay, so I lied and did not go to sleep (but that part comes 
	now).  Only the wording in the commit message has changed.

 convert.c       |    2 +-
 t/t0020-crlf.sh |   10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/convert.c b/convert.c
index 78efed8..d038d2f 100644
--- a/convert.c
+++ b/convert.c
@@ -126,7 +126,7 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 	struct text_stat stats;
 	char *dst;
 
-	if ((action == CRLF_BINARY) || !auto_crlf || !len)
+	if ((action == CRLF_BINARY) || (!auto_crlf && action < 0) || !len)
 		return 0;
 
 	gather_stats(src, len, &stats);
diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
index 1be7446..0bb3e6f 100755
--- a/t/t0020-crlf.sh
+++ b/t/t0020-crlf.sh
@@ -436,4 +436,14 @@ test_expect_success 'invalid .gitattributes (must not crash)' '
 
 '
 
+test_expect_success 'attribute crlf is heeded even without core.autocrlf' '
+
+	echo "allcrlf crlf=input" > .gitattributes &&
+	git config --unset core.autocrlf &&
+	git add allcrlf &&
+	git show :allcrlf | append_cr > expect &&
+	test_cmp allcrlf expect
+
+'
+
 test_done
-- 
1.6.0.rc0.22.gf2096d.dirty

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-23  1:31     ` [PATCH] Respect crlf attribute in "git add" " Johannes Schindelin
@ 2008-07-23  5:49       ` Steffen Prohaska
  2008-07-23  9:02         ` Johannes Schindelin
  2008-07-23 11:40         ` Dmitry Potapov
  0 siblings, 2 replies; 58+ messages in thread
From: Steffen Prohaska @ 2008-07-23  5:49 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Junio C Hamano, git, Dmitry Potapov


On Jul 23, 2008, at 3:31 AM, Johannes Schindelin wrote:

> When a file's crlf attribute is explicitely set, it does not make  
> sense
> to ignore it when adding the file, just because the config variable
> core.autocrlf has not been set.

Your patch is not about core.autocrlf unset, but about  
core.autocrlf=false.
On Unix, the two cases seem to be identical, but on Windows they are  
not.
(see below).


> The alternative would be risking to get CR/LF files into the  
> repository
> just because one user forgot to set core.autocrlf.

Git could help the user *and* the maintainer of the repository if we
chose core.autocrlf=input as the default on Unix.  We would never
let CR/LF enter the repository unless explicitly requested to do so
by core.autocrlf=false.  This setting however would not be recommended
to the average user.

But with Unix' default core.autocrlf=false, it makes sense to let the
maintainer of a repository enforce stripping CR from all files if not
explicitly configured otherwise for specific paths.  Setting  
"crlf=input"
in .gitattribute seems to be the documented way to do so --- although
the documentation in gitattributes.txt is a bit complex to read.


> This patch does not affect the case when the crlf attribute is unset,
> or when checking files out.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>
> 	Okay, so I lied and did not go to sleep (but that part comes
> 	now).  Only the wording in the commit message has changed.
>
> convert.c       |    2 +-
> t/t0020-crlf.sh |   10 ++++++++++
> 2 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/convert.c b/convert.c
> index 78efed8..d038d2f 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -126,7 +126,7 @@ static int crlf_to_git(const char *path, const  
> char *src, size_t len,
> 	struct text_stat stats;
> 	char *dst;
>
> -	if ((action == CRLF_BINARY) || !auto_crlf || !len)
> +	if ((action == CRLF_BINARY) || (!auto_crlf && action < 0) || !len)

I think we should strictly follow the documentation, so this should read

+       if ((action == CRLF_BINARY) || (!auto_crlf && action !=  
CRLF_INPUT) || !len)



>
> 		return 0;
>
> 	gather_stats(src, len, &stats);
> diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
> index 1be7446..0bb3e6f 100755
> --- a/t/t0020-crlf.sh
> +++ b/t/t0020-crlf.sh
> @@ -436,4 +436,14 @@ test_expect_success 'invalid .gitattributes  
> (must not crash)' '
>
> '
>
> +test_expect_success 'attribute crlf is heeded even without  
> core.autocrlf' '
> +
> +	echo "allcrlf crlf=input" > .gitattributes &&
> +	git config --unset core.autocrlf &&

You should set core.autocrlf explicitly to false:

    git config core.autocrlf false

Otherwise the test would pick up the user's global default.


>
> +	git add allcrlf &&
> +	git show :allcrlf | append_cr > expect &&
> +	test_cmp allcrlf expect
> +
> +'
> +
> test_done
> -- 
> 1.6.0.rc0.22.gf2096d.dirty


... and we should verify that we only treat crlf=input specially, but  
not "crlf".
The following changes could be applied on top of yours. (Apologies if  
lines are
wrapped.  I am composing this mail with the wrong email client.)


diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
index b37059b..e51e810 100755
--- a/t/t0020-crlf.sh
+++ b/t/t0020-crlf.sh
@@ -436,7 +436,7 @@ test_expect_success 'invalid .gitattributes (must  
not crash)' '

  '

-test_expect_success 'attribute crlf is heeded even without  
core.autocrlf' '
+test_expect_success 'attribute crlf=input is heeded even with  
core.autocrlf=false' '

         echo "allcrlf crlf=input" > .gitattributes &&
         git config core.autocrlf false &&
@@ -446,4 +446,15 @@ test_expect_success 'attribute crlf is heeded  
even without core.autocrlf' '

  '

+test_expect_success 'attribute crlf is ignored with  
core.autocrlf=false' '
+
+       echo "allcrlf crlf" > .gitattributes &&
+       git config core.autocrlf false &&
+       git read-tree --reset HEAD &&
+       git add allcrlf &&
+       git show :allcrlf > expect &&
+       test_cmp allcrlf expect
+
+'
+
  test_done

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-23  5:49       ` Steffen Prohaska
@ 2008-07-23  9:02         ` Johannes Schindelin
  2008-07-23 11:40         ` Dmitry Potapov
  1 sibling, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-23  9:02 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: Junio C Hamano, git, Dmitry Potapov

Hi,

On Wed, 23 Jul 2008, Steffen Prohaska wrote:

> On Jul 23, 2008, at 3:31 AM, Johannes Schindelin wrote:
> 
> >When a file's crlf attribute is explicitely set, it does not make sense 
> >to ignore it when adding the file, just because the config variable 
> >core.autocrlf has not been set.
> 
> Your patch is not about core.autocrlf unset, but about 
> core.autocrlf=false.

No, no, no!

It is about autocrlf _unset_, i.e. when the user was _not_ explicit!

> >The alternative would be risking to get CR/LF files into the repository 
> >just because one user forgot to set core.autocrlf.
> 
> Git could help the user *and* the maintainer of the repository if we
> chose core.autocrlf=input as the default on Unix.

NO!

The machinery behind core.autocrlf _does_ slow down Git, even if only by a 
few percent.  But with patch queues as maintainers of busy projects 
encounter frequently, this makes a difference.

So no, I am _totally_ opposed to punishing Unix users for Windows' faults.  
Totally.

> >diff --git a/convert.c b/convert.c
> >index 78efed8..d038d2f 100644
> >--- a/convert.c
> >+++ b/convert.c
> >@@ -126,7 +126,7 @@ static int crlf_to_git(const char *path, const char *src,
> >size_t len,
> > struct text_stat stats;
> > char *dst;
> >
> >-	if ((action == CRLF_BINARY) || !auto_crlf || !len)
> >+	if ((action == CRLF_BINARY) || (!auto_crlf && action < 0) || !len)
> 
> I think we should strictly follow the documentation, so this should read
> 
> +       if ((action == CRLF_BINARY) || (!auto_crlf && action != CRLF_INPUT) ||
> !len)

NO!

Your version would break action == CRLF_TEXT.

> >diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh
> >index 1be7446..0bb3e6f 100755
> >--- a/t/t0020-crlf.sh
> >+++ b/t/t0020-crlf.sh
> >@@ -436,4 +436,14 @@ test_expect_success 'invalid .gitattributes (must not
> >crash)' '
> >
> >'
> >
> >+test_expect_success 'attribute crlf is heeded even without core.autocrlf' '
> >+
> >+	echo "allcrlf crlf=input" > .gitattributes &&
> >+	git config --unset core.autocrlf &&
> 
> You should set core.autocrlf explicitly to false:
> 
>   git config core.autocrlf false

But this is _not_ what I want!  I want it to be _unset_.  Yes, at the 
moment, auto_crlf is initialized to 0.  But I do not _care_ what it is 
set.  I want this test to succeed when the user did _not_ set autocrlf.

> Otherwise the test would pick up the user's global default.

It will not.  We went to great pains not to pick up global defaults in the 
test suite.

> ... and we should verify that we only treat crlf=input specially, but not
> "crlf".

Umm... Why?  Why should a file marked "crlf=true" be allowed to be checked 
in _with_ CRs?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-23  5:49       ` Steffen Prohaska
  2008-07-23  9:02         ` Johannes Schindelin
@ 2008-07-23 11:40         ` Dmitry Potapov
  2008-07-24  6:06           ` Steffen Prohaska
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-23 11:40 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: Johannes Schindelin, Junio C Hamano, git

On Wed, Jul 23, 2008 at 07:49:20AM +0200, Steffen Prohaska wrote:
> 
> On Jul 23, 2008, at 3:31 AM, Johannes Schindelin wrote:
> 
> >
> >-	if ((action == CRLF_BINARY) || !auto_crlf || !len)
> >+	if ((action == CRLF_BINARY) || (!auto_crlf && action < 0) || !len)
> 
> I think we should strictly follow the documentation, so this should read
> 
> +       if ((action == CRLF_BINARY) || (!auto_crlf && action !=  
> CRLF_INPUT) || !len)

Well, your expression is correct if we choose to strictly follow to what
the current documentation says, but it is not well written in this place,
and, more importantly, I don't see any use case where you would want to
set crlf=input in .gitattributes, because it is shared among users on
different platforms. What you want to specify in it is whether a file is
text or binary. If crlf is set, it means a text file; if unset, it is a
binary file. Regardless of what autocrlf value, I don't see why a text
file should be checked in with CRs. So, Dscho's patch makes more sense
to me.  This requires correction to the documentation though:

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index d7b4114..448857b 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -99,9 +99,9 @@ This attribute controls the line-ending convention.
 Set::
 
 	Setting the `crlf` attribute on a path is meant to mark
-	the path as a "text" file.  'core.autocrlf' conversion
-	takes place without guessing the content type by
-	inspection.
+	the path as a "text" file. Line endings in a text file
+	are converted to LF upon checkin, and if 'core.autocrlf'
+	is true then to CRLF upon checkout.
 
 Unset::
 

Dmitry

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23  1:10   ` Johannes Schindelin
  2008-07-23  1:31     ` [PATCH] Respect crlf attribute in "git add" " Johannes Schindelin
@ 2008-07-23 17:07     ` Junio C Hamano
  2008-07-23 17:22       ` Johannes Schindelin
  2008-07-24 16:53       ` Dmitry Potapov
  1 sibling, 2 replies; 58+ messages in thread
From: Junio C Hamano @ 2008-07-23 17:07 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> However, if you want to avoid CRs to _enter_ the repository, when you have 
> a lot of binary files tracked, you _do_ want to force all repositories to 
> crlf=input.

If you are on a sane system, you do not even want to pay the price of
conversion.  Only people on systems with CRLF line endings should pay the
price (because your aim is to convert on such systems).  Are we throwing
that out of the window when the project decides to use gitattributes?

How about setting autocrlf automatically on mingw/msys/cygwin versions,
perhaps via templates or a patch to init-db?  Would that, combined with
user education, be a viable alternative?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 17:07     ` [PATCH] Respect crlf attribute " Junio C Hamano
@ 2008-07-23 17:22       ` Johannes Schindelin
  2008-07-23 18:04         ` Joshua Jensen
  2008-07-23 19:23         ` Junio C Hamano
  2008-07-24 16:53       ` Dmitry Potapov
  1 sibling, 2 replies; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-23 17:22 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

Hi,

On Wed, 23 Jul 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > However, if you want to avoid CRs to _enter_ the repository, when you 
> > have a lot of binary files tracked, you _do_ want to force all 
> > repositories to crlf=input.
> 
> If you are on a sane system, you do not even want to pay the price of 
> conversion.  Only people on systems with CRLF line endings should pay 
> the price (because your aim is to convert on such systems).  Are we 
> throwing that out of the window when the project decides to use 
> gitattributes?

Well, if you do not want that, why do you set crlf in the gitattributes to 
begin with?

> How about setting autocrlf automatically on mingw/msys/cygwin versions, 
> perhaps via templates or a patch to init-db?  Would that, combined with 
> user education, be a viable alternative?

On msys we do that.  A few users decided they know better and switched it 
off.  Me for example.  But I wouldn't do something as stupid as editing a 
file with an editor that tries to be helpful and adds CR/LFs.

However, Cygwin?  No, they don't.  I don't see them change it, either.

The _only_ way we could do that is by setting auto_crlf in environment, 
depending on the platform.

HOWEVER: as you can see from the git-svn post this morning, there are 
known issues with autocrlf.

We also had much fun (not!) with a lot of users who were really happy to 
insult us for the change to set autocrlf to true by default (via 
/etc/gitconfig) in msysGit.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 17:22       ` Johannes Schindelin
@ 2008-07-23 18:04         ` Joshua Jensen
  2008-07-23 18:33           ` Avery Pennarun
                             ` (2 more replies)
  2008-07-23 19:23         ` Junio C Hamano
  1 sibling, 3 replies; 58+ messages in thread
From: Joshua Jensen @ 2008-07-23 18:04 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Junio C Hamano, git

----- Original Message -----
From: Johannes Schindelin
Date: 7/23/2008 11:22 AM
> On msys we do that.  A few users decided they know better and switched it 
> off.  Me for example.  But I wouldn't do something as stupid as editing a 
> file with an editor that tries to be helpful and adds CR/LFs.
>   
There are certain file formats, such as a Visual Studio .sln file, that 
MUST be CRLF.  When a .sln file is not CRLF, Visual Studio refuses to 
read it.  I want to be able to set into the committed .gitattributes 
file the list of files that must be translated to the proper format 
regardless of the autocrlf setting.  An example is below:

*.bat crlf
*.def crlf
*.dsp crlf
*.dsw crlf
*.rc crlf
*.sln crlf
*.vcproj crlf
*.vcp crlf
*.vcw crlf

I'm not sure your patch was intended to accomplish the forced crlf 
settings above, but certainly, this functionality would be useful, if 
not critical.

Thanks.

Josh

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 18:04         ` Joshua Jensen
@ 2008-07-23 18:33           ` Avery Pennarun
  2008-07-23 18:57             ` Johannes Schindelin
  2008-07-23 19:22           ` Robin Rosenberg
  2008-07-23 19:33           ` Dmitry Potapov
  2 siblings, 1 reply; 58+ messages in thread
From: Avery Pennarun @ 2008-07-23 18:33 UTC (permalink / raw
  To: Joshua Jensen; +Cc: Johannes Schindelin, Junio C Hamano, git

On 7/23/08, Joshua Jensen <jjensen@workspacewhiz.com> wrote:
>  There are certain file formats, such as a Visual Studio .sln file, that
> MUST be CRLF.  [...]

It seems like what people really want is some additional file attributes:

1. always CRLF on all platforms (eg. for .bat files)
2. always LF on all platforms (eg. for shell scripts and perl scripts)
3. just leave it alone no matter what (eg. for binary files)
4. convert line endings to LF on checkin, native on checkout (eg. for
most source files)

Where "native" is defined by some config option, but the choice of #1
through #4 is defined by .gitattributes.  Thus, the config option
affects only mode #4 (and perhaps the default mode, as it does now).

The current system works for #3.  With Dscho's patch, #4 works too.  I
think more kinds of per-file attributes are needed in order to get #1
and #2.

Avery

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 18:33           ` Avery Pennarun
@ 2008-07-23 18:57             ` Johannes Schindelin
  2008-07-23 19:20               ` Eyvind Bernhardsen
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-23 18:57 UTC (permalink / raw
  To: Avery Pennarun; +Cc: Joshua Jensen, Junio C Hamano, git

Hi,

On Wed, 23 Jul 2008, Avery Pennarun wrote:

> 1. always CRLF on all platforms (eg. for .bat files)
> 2. always LF on all platforms (eg. for shell scripts and perl scripts)
> 3. just leave it alone no matter what (eg. for binary files)

These are not different, but equal.  "Do no harm to the contents of this 
file".

Hth,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 18:57             ` Johannes Schindelin
@ 2008-07-23 19:20               ` Eyvind Bernhardsen
  2008-07-23 19:44                 ` Johannes Schindelin
  0 siblings, 1 reply; 58+ messages in thread
From: Eyvind Bernhardsen @ 2008-07-23 19:20 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Avery Pennarun, Joshua Jensen, Junio C Hamano, git

On 23. juli. 2008, at 20.57, Johannes Schindelin wrote:

> Hi,
>
> On Wed, 23 Jul 2008, Avery Pennarun wrote:
>
>> 1. always CRLF on all platforms (eg. for .bat files)
>> 2. always LF on all platforms (eg. for shell scripts and perl  
>> scripts)
>> 3. just leave it alone no matter what (eg. for binary files)
>
> These are not different, but equal.  "Do no harm to the contents of  
> this
> file".

That is only true until someone edits the file in an editor which  
prefers the wrong end-of-line marker, and converts to it when saving.   
It will be obvious that this has happened if the user does a "git  
diff" before committing, but I think the intent of nos. 1 and 2 is for  
git to automatically convert the line endings back instead of kicking  
up a fuss.

Might be too magical, though.

Eyvind Bernhardsen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 18:04         ` Joshua Jensen
  2008-07-23 18:33           ` Avery Pennarun
@ 2008-07-23 19:22           ` Robin Rosenberg
  2008-07-23 19:35             ` Junio C Hamano
  2008-07-23 19:33           ` Dmitry Potapov
  2 siblings, 1 reply; 58+ messages in thread
From: Robin Rosenberg @ 2008-07-23 19:22 UTC (permalink / raw
  To: Joshua Jensen; +Cc: Johannes Schindelin, Junio C Hamano, git

onsdagen den 23 juli 2008 20.04.44 skrev Joshua Jensen:
> ----- Original Message -----
> From: Johannes Schindelin
> Date: 7/23/2008 11:22 AM
> > On msys we do that.  A few users decided they know better and switched it 
> > off.  Me for example.  But I wouldn't do something as stupid as editing a 
> > file with an editor that tries to be helpful and adds CR/LFs.
> >   
> There are certain file formats, such as a Visual Studio .sln file, that 
> MUST be CRLF.  When a .sln file is not CRLF, Visual Studio refuses to 
> read it.  I want to be able to set into the committed .gitattributes 
> file the list of files that must be translated to the proper format 
> regardless of the autocrlf setting.  An example is below:
> 
> *.bat crlf
> *.def crlf
> *.dsp crlf
> *.dsw crlf
> *.rc crlf
> *.sln crlf
> *.vcproj crlf
> *.vcp crlf
> *.vcw crlf

Wouldn't  "*bat -crlf " etc be good for these, and thus store CRLF in the repo.

-- robin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 17:22       ` Johannes Schindelin
  2008-07-23 18:04         ` Joshua Jensen
@ 2008-07-23 19:23         ` Junio C Hamano
  2008-07-23 20:07           ` Johannes Schindelin
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2008-07-23 19:23 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Wed, 23 Jul 2008, Junio C Hamano wrote:
> ...
>> If you are on a sane system, you do not even want to pay the price of 
>> conversion.  Only people on systems with CRLF line endings should pay 
>> the price (because your aim is to convert on such systems).  Are we 
>> throwing that out of the window when the project decides to use 
>> gitattributes?
>
> Well, if you do not want that, why do you set crlf in the gitattributes to 
> begin with?

It is not _me_ but the project upstream that needs to interact also with
Windows people who manages gitattributes.  And me personally knows my
editors are not helpful to add CR at the end of lines, so I do not need
the conversion.

That was the situation I had in mind.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 18:04         ` Joshua Jensen
  2008-07-23 18:33           ` Avery Pennarun
  2008-07-23 19:22           ` Robin Rosenberg
@ 2008-07-23 19:33           ` Dmitry Potapov
  2 siblings, 0 replies; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-23 19:33 UTC (permalink / raw
  To: Joshua Jensen; +Cc: Johannes Schindelin, Junio C Hamano, git

On Wed, Jul 23, 2008 at 10:04 PM, Joshua Jensen
<jjensen@workspacewhiz.com> wrote:
>
> There are certain file formats, such as a Visual Studio .sln file, that MUST
> be CRLF.  When a .sln file is not CRLF, Visual Studio refuses to read it.

I wonder what exactly version of Visual Studio you use? All version that
I have seen had no problem reading .sln with LF ending, though they always
wrote it back with CRLF. In any case, you can define as binary, so Git will
not do anything about their EOLs.

Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 19:22           ` Robin Rosenberg
@ 2008-07-23 19:35             ` Junio C Hamano
  2008-07-23 19:41               ` Johannes Schindelin
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2008-07-23 19:35 UTC (permalink / raw
  To: Robin Rosenberg; +Cc: Joshua Jensen, Johannes Schindelin, git

Robin Rosenberg <robin.rosenberg.lists@dewire.com> writes:

>> There are certain file formats, such as a Visual Studio .sln file, that 
>> MUST be CRLF.  When a .sln file is not CRLF, Visual Studio refuses to 
>> read it.  I want to be able to set into the committed .gitattributes 
>> file the list of files that must be translated to the proper format 
>> regardless of the autocrlf setting.  An example is below:
>> 
>> *.bat crlf
>...
>> *.vcw crlf
>
> Wouldn't  "*bat -crlf " etc be good for these, and thus store CRLF in the repo.

I'd agree.  And I do not think we would want to introduce "crlf=force"
that converts working tree files that could be LF terminated to CRLF upon
checking in.  That is as bad as some helpful editors that adds CR at the
end of line without being asked.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 19:35             ` Junio C Hamano
@ 2008-07-23 19:41               ` Johannes Schindelin
  0 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-23 19:41 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Robin Rosenberg, Joshua Jensen, git

Hi,

On Wed, 23 Jul 2008, Junio C Hamano wrote:

> Robin Rosenberg <robin.rosenberg.lists@dewire.com> writes:
> 
> >> There are certain file formats, such as a Visual Studio .sln file, 
> >> that MUST be CRLF.  When a .sln file is not CRLF, Visual Studio 
> >> refuses to read it.  I want to be able to set into the committed 
> >> .gitattributes file the list of files that must be translated to the 
> >> proper format regardless of the autocrlf setting.  An example is 
> >> below:
> >> 
> >> *.bat crlf
> >...
> >> *.vcw crlf
> >
> > Wouldn't "*bat -crlf " etc be good for these, and thus store CRLF in 
> > the repo.
> 
> I'd agree.  And I do not think we would want to introduce "crlf=force" 
> that converts working tree files that could be LF terminated to CRLF 
> upon checking in.  That is as bad as some helpful editors that adds CR 
> at the end of line without being asked.

You can say that only because you are not in the situation having a lot of 
CR/LF files in your _public_ repository _in spite of_ setting the 
gitattributes correctly.

That seriously sucks.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 19:20               ` Eyvind Bernhardsen
@ 2008-07-23 19:44                 ` Johannes Schindelin
  2008-07-24 21:30                   ` Eyvind Bernhardsen
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-23 19:44 UTC (permalink / raw
  To: Eyvind Bernhardsen; +Cc: Avery Pennarun, Joshua Jensen, Junio C Hamano, git

Hi,

On Wed, 23 Jul 2008, Eyvind Bernhardsen wrote:

> On 23. juli. 2008, at 20.57, Johannes Schindelin wrote:
> 
> >On Wed, 23 Jul 2008, Avery Pennarun wrote:
> >
> > >1. always CRLF on all platforms (eg. for .bat files)
> > >2. always LF on all platforms (eg. for shell scripts and perl scripts)
> > >3. just leave it alone no matter what (eg. for binary files)
> >
> >These are not different, but equal.  "Do no harm to the contents of this
> >file".
> 
> That is only true until someone edits the file in an editor which 
> prefers the wrong end-of-line marker, and converts to it when saving.  
> It will be obvious that this has happened if the user does a "git diff" 
> before committing, but I think the intent of nos. 1 and 2 is for git to 
> automatically convert the line endings back instead of kicking up a 
> fuss.
> 
> Might be too magical, though.

I deem it not, uhm, magical.  By your reasoning there should be a way for 
Git to convert a file to UTF-8 when some entertaining person converted the 
working directory file to ISO-8859-15.

Really, either it is CR/LF on all platforms (and then the project members 
have to live by it), or it is not.  You cannot have both.

If it is CR/LF on all platforms, you just _commit_ it as CR/LF.  No 
conversion, not even a brain required.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 19:23         ` Junio C Hamano
@ 2008-07-23 20:07           ` Johannes Schindelin
  0 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-23 20:07 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git

Hi,

On Wed, 23 Jul 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Wed, 23 Jul 2008, Junio C Hamano wrote:
> > ...
> >> If you are on a sane system, you do not even want to pay the price of 
> >> conversion.  Only people on systems with CRLF line endings should pay 
> >> the price (because your aim is to convert on such systems).  Are we 
> >> throwing that out of the window when the project decides to use 
> >> gitattributes?
> >
> > Well, if you do not want that, why do you set crlf in the 
> > gitattributes to begin with?
> 
> It is not _me_ but the project upstream that needs to interact also with 
> Windows people who manages gitattributes.  And me personally knows my 
> editors are not helpful to add CR at the end of lines, so I do not need 
> the conversion.

I know you do.  And I know those users don't.  They do not even know that 
they should set autocrlf = input in their cygwin Git.

Or at least, now they do.  After a few hundred commits that have been 
published _after_ their broken checkins.

Sigh,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-23 11:40         ` Dmitry Potapov
@ 2008-07-24  6:06           ` Steffen Prohaska
  2008-07-24 12:39             ` Johannes Schindelin
  2008-07-24 14:09             ` Dmitry Potapov
  0 siblings, 2 replies; 58+ messages in thread
From: Steffen Prohaska @ 2008-07-24  6:06 UTC (permalink / raw
  To: Dmitry Potapov, Johannes Schindelin; +Cc: Junio C Hamano, Git Mailing List


On Jul 23, 2008, at 1:40 PM, Dmitry Potapov wrote:

> On Wed, Jul 23, 2008 at 07:49:20AM +0200, Steffen Prohaska wrote:
>>
>> On Jul 23, 2008, at 3:31 AM, Johannes Schindelin wrote:
>>
>>>
>>> -	if ((action == CRLF_BINARY) || !auto_crlf || !len)
>>> +	if ((action == CRLF_BINARY) || (!auto_crlf && action < 0) || !len)
>>
>> I think we should strictly follow the documentation, so this should  
>> read
>>
>> +       if ((action == CRLF_BINARY) || (!auto_crlf && action !=
>> CRLF_INPUT) || !len)
>
> Well, your expression is correct if we choose to strictly follow to  
> what
> the current documentation says, but it is not well written in this  
> place,
> and, more importantly, I don't see any use case where you would want  
> to
> set crlf=input in .gitattributes, because it is shared among users on
> different platforms. What you want to specify in it is whether a  
> file is
> text or binary. If crlf is set, it means a text file; if unset, it  
> is a
> binary file. Regardless of what autocrlf value, I don't see why a text
> file should be checked in with CRs. So, Dscho's patch makes more sense
> to me.  This requires correction to the documentation though:

Ok, this and Dscho's earlier mail convinced me.  My patch does not
make sense at all.  I should have had thought a bit harder ;-)

I have however a related question:

Dscho,
Is the following your use case?

  "I am the maintainer of this project.  I know that this project needs
   crlf conversion, because it is a cross-platform project.  Therefore,
   I want to force crlf conversion for this specific project, even if
   the user did not configure core.autocrlf=input on Unix."

Your patch provides a solution, though not a very comfortable one.  With
your patch applied, you could explicitly list all files (or filetypes)
that are text and mark them with 'crlf'.  Git would not let CRLFs enter
the repository for these files.  You could also specify 'crlf=input',
but I don't understand why you want to specify this.  Anyway, you would
need to explicitly list all text filetypes and explicitly *not list* the
binary filetypes because they must not be converted.  This is no very
comfortable, no?

I think what you really want to do is to tell git that the *automatic*
crlf detection shall be used for all files of your specific project,
even if the user did not configure core.autocrlf=input.  This would
avoid listing each filetype separately.

Maybe we could allow 'crlf=guess' in .gitattributes with the following
documentation:

-- 8< --
Set to string value "guess"::

	Setting `guess` tells git to apply conversion upon checkin
         if the file content looks like text.  On checkout, however,
         git applies the conversion specified in `core.autocrlf`.
         Setting `guess` can be useful to force automatic file type
         detection for a specific project.
-- >8 --


> diff --git a/Documentation/gitattributes.txt b/Documentation/ 
> gitattributes.txt
> index d7b4114..448857b 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -99,9 +99,9 @@ This attribute controls the line-ending convention.
> Set::
>
> 	Setting the `crlf` attribute on a path is meant to mark
> -	the path as a "text" file.  'core.autocrlf' conversion
> -	takes place without guessing the content type by
> -	inspection.
> +	the path as a "text" file. Line endings in a text file
> +	are converted to LF upon checkin, and if 'core.autocrlf'
> +	is true then to CRLF upon checkout.
>
> Unset::

Makes sense.

	Steffen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24  6:06           ` Steffen Prohaska
@ 2008-07-24 12:39             ` Johannes Schindelin
  2008-07-24 17:05               ` Dmitry Potapov
  2008-07-24 14:09             ` Dmitry Potapov
  1 sibling, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-24 12:39 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: Dmitry Potapov, Junio C Hamano, Git Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1510 bytes --]

Hi,

On Thu, 24 Jul 2008, Steffen Prohaska wrote:

> Dscho,
> Is the following your use case?
> 
> "I am the maintainer of this project.  I know that this project needs
>  crlf conversion, because it is a cross-platform project.  Therefore,
>  I want to force crlf conversion for this specific project, even if
>  the user did not configure core.autocrlf=input on Unix."

No.

My use case is this: a few users work on Windows (Cygwin), others on 
MacOSX, yet others on Linux.

We often integrate files from somewhere else, and them Windows guys love 
to edit their files with their anachronistic proprietary crap tools that 
add CRs where CRs no belongee.

Also, the Windows guys (being Windows guys) cannot be bothered to read 
documentation, so they did not set autocrlf.

> Your patch provides a solution, though not a very comfortable one.  With
> your patch applied, you could explicitly list all files (or filetypes)
> that are text and mark them with 'crlf'.

Why should I want to do that?  I do not _want_ CRs.  And them Windows guys 
do not need them either; they are often not even aware that their crap 
tools introduce them.

> You could also specify 'crlf=input', but I don't understand why you want 
> to specify this.

That's what I do.  And _I_ need to understand why ;-)

Well, seems I will just continue shouting "why did you commit that 
CR-ridden file?" and get shouted back with "why does this §&%&%&!° Git 
tool not fix it for me automatically?  Even _Subversion_ can do it."

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24  6:06           ` Steffen Prohaska
  2008-07-24 12:39             ` Johannes Schindelin
@ 2008-07-24 14:09             ` Dmitry Potapov
  2008-07-24 14:38               ` Johannes Schindelin
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-24 14:09 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: Johannes Schindelin, Junio C Hamano, Git Mailing List

On Thu, Jul 24, 2008 at 08:06:29AM +0200, Steffen Prohaska wrote:
> 
> I have however a related question:
> 
> Dscho,
> Is the following your use case?
> 
>  "I am the maintainer of this project.  I know that this project needs
>   crlf conversion, because it is a cross-platform project.  Therefore,
>   I want to force crlf conversion for this specific project, even if
>   the user did not configure core.autocrlf=input on Unix."

I suspect that most problems with crlf is caused by Windows users who
have core.autocrlf=false for whatever reason (I suspect without a good
reason in most cases). But perhaps having core.autocrlf=input as default
on Unix and Mac will be better than the current core.autocrlf=false.

> Maybe we could allow 'crlf=guess' in .gitattributes with the following
> documentation:

Please, no. It feels wrong to complicate crlf settings to workaround
incorrect user configuration. I don't think that many people really will
be able to benefit from that. In most cases, people end up with the
incorrect value autocrlf just because the documentation is complicated
and it is difficult to understand from it what autocrlf does.

So I believe the real solution is providing more reasonable defaults
for autocrlf and more clear and simple explanation, i.e. to have
something like in the introduction section describing how to set up
your own repository:

===
If you work on Windows, you are most likely use text files with CRLF
endings. To ensure that your line endings will be correctly represented
on other platforms, you should set core.autocrlf=true.

If you prefer to have LF endings in your text files, you may want to set
core.autocrlf=input to ensure that files with inconsistent end-of-lines
will not enter into your repository.

autocrlf conversion will automatically detect text files and convert
line endings accordingly to the specified settings. In very rare cases,
heuristic may be wrong[*], so you may want to specify what files should
bed considered as text and what as binary explicitly. You can do that by
setting the crlf attribute in .gitattributes, like this:

*.exe -crlf
*.c crlf

The above says that all files with the .exe extension should be treated
as binary and all files with the .c extension should treated as text.

[*] You do not have to worry too much about heuristic being wrong,
because you will be warned if conversion for a file is irreversible.
See core.safecrlf in linkgit:gitattributes[5] for more information.
===

We should not expect every users to read everything written in
gitattributes just to get autocrlf correctly.

Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24 14:09             ` Dmitry Potapov
@ 2008-07-24 14:38               ` Johannes Schindelin
  2008-07-24 14:52                 ` Steffen Prohaska
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-24 14:38 UTC (permalink / raw
  To: Dmitry Potapov; +Cc: Steffen Prohaska, Junio C Hamano, Git Mailing List

Hi,

On Thu, 24 Jul 2008, Dmitry Potapov wrote:

> On Thu, Jul 24, 2008 at 08:06:29AM +0200, Steffen Prohaska wrote:
> > 
> >   I am the maintainer of this project.  I know that this project needs 
> >   crlf conversion, because it is a cross-platform project.  
> >   Therefore, I want to force crlf conversion for this specific 
> >   project, even if the user did not configure core.autocrlf=input on 
> >   Unix.
> 
> I suspect that most problems with crlf is caused by Windows users who 
> have core.autocrlf=false for whatever reason (I suspect without a good 
> reason in most cases).

Almost correct.  It is _unset_!  And it should be perfectly valid for 
users not having to set anything there.

> But perhaps having core.autocrlf=input as default on Unix and Mac will 
> be better than the current core.autocrlf=false.

Have you read Junio's reply?  If so, how can you suggest any change for 
Unix and Mac?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24 14:38               ` Johannes Schindelin
@ 2008-07-24 14:52                 ` Steffen Prohaska
  2008-07-24 16:44                   ` Avery Pennarun
  2008-07-24 16:45                   ` Johannes Schindelin
  0 siblings, 2 replies; 58+ messages in thread
From: Steffen Prohaska @ 2008-07-24 14:52 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Dmitry Potapov, Junio C Hamano, Git Mailing List


On Jul 24, 2008, at 4:38 PM, Johannes Schindelin wrote:

> On Thu, 24 Jul 2008, Dmitry Potapov wrote:
>
>> On Thu, Jul 24, 2008 at 08:06:29AM +0200, Steffen Prohaska wrote:
>>>
>>>  I am the maintainer of this project.  I know that this project  
>>> needs
>>>  crlf conversion, because it is a cross-platform project.
>>>  Therefore, I want to force crlf conversion for this specific
>>>  project, even if the user did not configure core.autocrlf=input on
>>>  Unix.
>>
>> I suspect that most problems with crlf is caused by Windows users who
>> have core.autocrlf=false for whatever reason (I suspect without a  
>> good
>> reason in most cases).
>
> Almost correct.  It is _unset_!  And it should be perfectly valid for
> users not having to set anything there.

Why is it unset?  Since 1.5.5, our installer sets core.autocrlf=true
on Windows.  So we *do* set a sane default for Windows users.

In the projects I am using git, the problematic platform is *Unix*.
To work around the default on Unix (core.autocrlf=false), I deliver
a custom script to our developers that verifies the setup on Unix and
complains if core.autocrlf is not set to "input".  Since I do this,
I haven't seen any further problems.

	Steffen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24 14:52                 ` Steffen Prohaska
@ 2008-07-24 16:44                   ` Avery Pennarun
  2008-07-24 16:45                   ` Johannes Schindelin
  1 sibling, 0 replies; 58+ messages in thread
From: Avery Pennarun @ 2008-07-24 16:44 UTC (permalink / raw
  To: Steffen Prohaska
  Cc: Johannes Schindelin, Dmitry Potapov, Junio C Hamano,
	Git Mailing List

On 7/24/08, Steffen Prohaska <prohaska@zib.de> wrote:
>  Why is it unset?  Since 1.5.5, our installer sets core.autocrlf=true
>  on Windows.  So we *do* set a sane default for Windows users.

Well, if you've been following the thread, not everyone uses msysGit,
and cygwin Git doesn't set autocrlf by default, and changing such
things behind people's backs can be really nasty.

So I can't decide if I mean the following as a joke or not ;)

diff --git a/builtin-init-db.c b/builtin-init-db.c
index 38b4fcb..91b79dd 100644
--- a/builtin-init-db.c
+++ b/builtin-init-db.c
@@ -314,10 +314,19 @@ int init_db(const char *template_dir, unsigned int flags)
 	}

 	if (!(flags & INIT_DB_QUIET))
+	{
 		printf("%s%s Git repository in %s/\n",
 		       reinit ? "Reinitialized existing" : "Initialized empty",
 		       shared_repository ? " shared" : "",
 		       get_git_dir());
+#ifdef _WIN32
+		if (!auto_crlf)
+			printf("WARNING: You're using Windows but core.autocrlf is not set!\n");
+#else
+		if (auto_crlf)
+			printf("WARNING: You're not using Windows but core.autocrlf is set!\n");
+#endif
+	}

 	return 0;
 }

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24 14:52                 ` Steffen Prohaska
  2008-07-24 16:44                   ` Avery Pennarun
@ 2008-07-24 16:45                   ` Johannes Schindelin
  2008-07-24 20:44                     ` Robin Rosenberg
  1 sibling, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-24 16:45 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: Dmitry Potapov, Junio C Hamano, Git Mailing List

Hi,

On Thu, 24 Jul 2008, Steffen Prohaska wrote:

> On Jul 24, 2008, at 4:38 PM, Johannes Schindelin wrote:
> 
> >On Thu, 24 Jul 2008, Dmitry Potapov wrote:
> >
> > >On Thu, Jul 24, 2008 at 08:06:29AM +0200, Steffen Prohaska wrote:
> > > >
> > > > I am the maintainer of this project.  I know that this project 
> > > > needs crlf conversion, because it is a cross-platform project. 
> > > > Therefore, I want to force crlf conversion for this specific 
> > > > project, even if the user did not configure core.autocrlf=input on 
> > > > Unix.
> > >
> > >I suspect that most problems with crlf is caused by Windows users who 
> > >have core.autocrlf=false for whatever reason (I suspect without a 
> > >good reason in most cases).
> >
> >Almost correct.  It is _unset_!  And it should be perfectly valid for 
> >users not having to set anything there.
> 
> Why is it unset?  Since 1.5.5, our installer sets core.autocrlf=true
> on Windows.  So we *do* set a sane default for Windows users.

As I mentioned earlier, these users use Cygwin's Git.  And they do not 
update frequently.

> In the projects I am using git, the problematic platform is *Unix*. To 
> work around the default on Unix (core.autocrlf=false), I deliver a 
> custom script to our developers that verifies the setup on Unix and 
> complains if core.autocrlf is not set to "input".  Since I do this, I 
> haven't seen any further problems.

That is exactly the use case I had in mind for my patch.  If we already 
bother to mark the files with crlf=input in .gitattributes, then Git could 
be so nice to respect it.

However, I see Junio's point: on sane platforms you have to work hard to 
get CR-damaged files.  And the regular sane platform user should not be 
punished for the brain/CR damage a certain monopolist has brought over 
this planet.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 17:07     ` [PATCH] Respect crlf attribute " Junio C Hamano
  2008-07-23 17:22       ` Johannes Schindelin
@ 2008-07-24 16:53       ` Dmitry Potapov
  2008-07-24 17:14         ` Johannes Schindelin
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-24 16:53 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Johannes Schindelin, git

On Wed, Jul 23, 2008 at 10:07:43AM -0700, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > However, if you want to avoid CRs to _enter_ the repository, when you have 
> > a lot of binary files tracked, you _do_ want to force all repositories to 
> > crlf=input.
> 
> If you are on a sane system, you do not even want to pay the price of
> conversion. 

If you do not have CRLF then there will be no conversion, so you pay only
the price of check. The relative cost of that check obviously depends on
what other operations are. I believe that the worst case for this check
will be when there is no other operations, just re-reading files, so I
did:

# touch all files to force git re-read them...
git ls-files | xargs touch
time git status

Here is the average based on 3 attempts:
0.232s with autocrlf=false
0.265s with autocrlf=input (14% increase)

Of course, without touching files, git-status will take the same time in
both cases (about 0.026s in my testing), and touching all files not very
interesting from the practical point of view.

So, I decided to do something more practical, like applying 100 patches.
Here are my results:

         false     input
         ------   -------
         7.409s   7.603s
         7.592s   7.239s
         7.617s   7.402s
         ------   -------
Average: 7.539s   7.415s

autocrlf=input turned out to be even slightly faster, but the difference
is not statistically significant.

So, though it is obvious that autocrlf=input clearly adds some overhead,
I believe it is negligable in most practical cases as there are much
more expensive operations. And you can always turn autocrlf off, if you
are sure that you never have files with the wrong line ending.

> Only people on systems with CRLF line endings should pay the
> price (because your aim is to convert on such systems).

Unfortunately, it is not so clearly cut. Some people may work on Unix
but share files with Windows using network shares or emails.

> Are we throwing
> that out of the window when the project decides to use gitattributes?

I am not sure that I understand your question. In any case, Dscho's
patch did not change the default, so it did not penalize anyone except
explicitly asked about that by setting crlf on some files. I would like
also to notice that setting crlf on some files (in contrast to unsetting
it!) is rarely necessary in practice if all users have sane settings for
autocrlf.

> 
> How about setting autocrlf automatically on mingw/msys/cygwin versions,
> perhaps via templates or a patch to init-db?  Would that, combined with
> user education, be a viable alternative?

Perhaps, it is good compromise for now, considering that Windows users
have most troubles with that... I really don't mind seeing 'input' as
default on other platforms, but I won't insist on that.


Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24 12:39             ` Johannes Schindelin
@ 2008-07-24 17:05               ` Dmitry Potapov
  0 siblings, 0 replies; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-24 17:05 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Steffen Prohaska, Junio C Hamano, Git Mailing List

On Thu, Jul 24, 2008 at 01:39:34PM +0100, Johannes Schindelin wrote:
> 
> Well, seems I will just continue shouting "why did you commit that 
> CR-ridden file?" and get shouted back with "why does this §&%&%&!° Git 
> tool not fix it for me automatically?  Even _Subversion_ can do it."

Actually, no... With Subversion, if you do not have correct auto-props
then every newly added file will have incorrect end-of-lines. So, each
user has to manually add all possible text extensions in the auto-props
section of his/her subversion config file. In practice, it means that
with SVN you have far less problems with EOLs in first few days when
users just existing files, but then the problem reemerge every time when
someone adds a new file... So, it is not much better...

In constrast, Git provides automatic text detection and EOL conversion,
but it is not enabled where it should... at least, on Windows, and for
those who copies file directly from Windows through network shares or by
other means.

Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-24 16:53       ` Dmitry Potapov
@ 2008-07-24 17:14         ` Johannes Schindelin
  2008-07-24 17:55           ` Dmitry Potapov
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-24 17:14 UTC (permalink / raw
  To: Dmitry Potapov; +Cc: Junio C Hamano, git

Hi,

On Thu, 24 Jul 2008, Dmitry Potapov wrote:

> If you do not have CRLF then there will be no conversion, so you pay 
> only the price of check. The relative cost of that check obviously 
> depends on what other operations are. I believe that the worst case for 
> this check will be when there is no other operations, just re-reading 
> files, so I did:
> 
> # touch all files to force git re-read them...
> git ls-files | xargs touch
> time git status
> 
> Here is the average based on 3 attempts:
> 0.232s with autocrlf=false
> 0.265s with autocrlf=input (14% increase)

And you tested... on git.git?  One of the smallest serious Git users?

If you want to come close to the hard cases (that you would punish with 
setting autocrlf = input, not yourself of course), test _at least_ the 
Linux kernel.

To come closer you should test on OpenOffice.org or KDE.

Hth,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-24 17:14         ` Johannes Schindelin
@ 2008-07-24 17:55           ` Dmitry Potapov
  0 siblings, 0 replies; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-24 17:55 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Junio C Hamano, git

On Thu, Jul 24, 2008 at 06:14:09PM +0100, Johannes Schindelin wrote:
> On Thu, 24 Jul 2008, Dmitry Potapov wrote:
> > 
> > Here is the average based on 3 attempts:
> > 0.232s with autocrlf=false
> > 0.265s with autocrlf=input (14% increase)
> 
> And you tested... on git.git?  One of the smallest serious Git users?

If you take a huge repo, you are going to be bound by disk IO unless
huge amount memory, so all files reside in the filesystem cache.

> 
> If you want to come close to the hard cases (that you would punish with 
> setting autocrlf = input, not yourself of course), test _at least_ the 
> Linux kernel.

false   input
6.913s  8.426s
6.922s  8.337s
6.969s  8.335s
6.975s  8.386s
6.939s  8.435s
6.954s  8.320s

autocrlf=input takes 20% more than autocrlf=false, which is rather
strange, because I clearly saw some disk activity... so I expected
it to be less...

In any case, this test is rather artificial, and the real one should
be is of course applying patches... Maybe I will try that later...

Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24 16:45                   ` Johannes Schindelin
@ 2008-07-24 20:44                     ` Robin Rosenberg
  2008-07-24 23:58                       ` Johannes Schindelin
  0 siblings, 1 reply; 58+ messages in thread
From: Robin Rosenberg @ 2008-07-24 20:44 UTC (permalink / raw
  To: Johannes Schindelin
  Cc: Steffen Prohaska, Dmitry Potapov, Junio C Hamano,
	Git Mailing List

torsdagen den 24 juli 2008 18.45.07 skrev Johannes Schindelin:
> However, I see Junio's point: on sane platforms you have to work hard to 
> get CR-damaged files.  And the regular sane platform user should not be 
> punished for the brain/CR damage a certain monopolist has brought over 
> this planet.

That wasn't hard at all :) Eclipse did it all by itself for JEE MANIFEST.MF files.
We got into some strange problems on windows with msysgit after that where git
(status) detected the files as changed, but no actual content change and refused
to change it's mind. I think we had actually fixed the files on Windows prior to that.
Only after converting the checkout to autocrlf=input did msysgit enter a sane state
again. I was not able to repeat the problem at the time.

Maybe I'll try again in a few weeks time.

-- robin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-23 19:44                 ` Johannes Schindelin
@ 2008-07-24 21:30                   ` Eyvind Bernhardsen
  2008-07-25  0:01                     ` Johannes Schindelin
  0 siblings, 1 reply; 58+ messages in thread
From: Eyvind Bernhardsen @ 2008-07-24 21:30 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Avery Pennarun, Joshua Jensen, Junio C Hamano, git

On 23. juli. 2008, at 21.44, Johannes Schindelin wrote:

[...]

> I deem it not, uhm, magical.  By your reasoning there should be a  
> way for
> Git to convert a file to UTF-8 when some entertaining person  
> converted the
> working directory file to ISO-8859-15.

Um, what?  This discussion is about figuring out when Git should mess  
with the line endings a user is trying to commit.  Why are you  
bringing character encodings into it?  Git does not (to my knowledge)  
have any "utf8" setting that converts all files to UTF-8, and  
certainly not one that is enabled by default on Windows.

Being able to mark a file with an "enforce crlf line endings on this  
file" flag is useful _given that Git messes with line endings_, and a  
"do not mess with line endings in this file" flag does not achieve the  
same purpose.  Okay?

If you want my personal opinion on autocrlf, I was happier when Git  
left line endings alone so I didn't have to worry about the files I  
commit being mangled (however reversibly).

I think "autocrlf=input" can make sense on a per-repository basis, but  
I would have it somewhere versioned like .gitattributes, so that (a)  
it is propagated when the repository is cloned, and (b) it is possible  
to take an existing repository with checked-in CRLFs and do a one-time  
conversion that also adds the "autocrlf=input" setting  
to .gitattributes (or wherever the setting ends up going).

I don't use "autocrlf=true", even on Windows, so I don't have any  
opinion on it other than thinking it should be a per-user setting  
rather than per-repository (but with a per-platform default :).  I  
don't see how it could work unless the repository has normalised line  
endings, though.

My background: I have a ton of repositories imported from CVS with a  
sordid mix of CR and CRLF (occasionally in the same file!).  Using  
these repositories with "autocrlf=true" is a pain in the arse since  
CRLFs in the repository cause spurious changes after a clean checkout,  
merge troubles, etc.  The code is supposed to build on Windows as well  
as Unix, so I got bit by this when msysgit changed the default  
autocrlf setting. [1]

My workaround is simply to set autocrlf=false; another solution might  
be to filter-branch all CRLFs out of the repository, but then I'd be  
messing with a lot of history, possibly breaking the files that  
actually need CRLFs, and for no real upside.  No thanks.

Okay, this turned into a longer rant than I intended; what I'm trying  
to say is that I think autocrlf(=true) as it is currently implemented  
is pretty useless, and that a setting that is versioned and cloneable  
would be better.  Phew!
-- 
Eyvind Bernhardsen


Footnote:
1. Your impression that people were rude when "autocrlf=true" was made  
the default on msysgit might be related to the fact that nobody had  
considered what would happen to existing repositories before making  
the change?  Steffen Prohaska wrote in response to one such complaint:

> Unfortunately, existing repositories that contain the wrong line  
> endings
> suffer from the problems you described above.  This can be handled
> either by switching off autocrlf, as you propose, or by cleaning up  
> the
> line endings.

(http://groups.google.com/group/msysgit/browse_thread/thread/978a3435c1cb0c81/70bac514a8b438ef?lnk=gst#70bac514a8b438ef 
)

Steffen went on to propose a patch to make autocrlf a sticky per- 
repository setting, but I think you shot it down.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute in "git add" even if core.autocrlf has not been set
  2008-07-24 20:44                     ` Robin Rosenberg
@ 2008-07-24 23:58                       ` Johannes Schindelin
  0 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-24 23:58 UTC (permalink / raw
  To: Robin Rosenberg
  Cc: Steffen Prohaska, Dmitry Potapov, Junio C Hamano,
	Git Mailing List

Hi,

On Thu, 24 Jul 2008, Robin Rosenberg wrote:

> torsdagen den 24 juli 2008 18.45.07 skrev Johannes Schindelin:
> > However, I see Junio's point: on sane platforms you have to work hard 
> > to get CR-damaged files.  And the regular sane platform user should 
> > not be punished for the brain/CR damage a certain monopolist has 
> > brought over this planet.
> 
> That wasn't hard at all :) Eclipse did it all by itself for JEE 
> MANIFEST.MF files. We got into some strange problems on windows with 
> msysgit after that where git (status) detected the files as changed, but 
> no actual content change and refused to change it's mind. I think we had 
> actually fixed the files on Windows prior to that. Only after converting 
> the checkout to autocrlf=input did msysgit enter a sane state again. I 
> was not able to repeat the problem at the time.
> 
> Maybe I'll try again in a few weeks time.

That would be rather useful, because without a way to reproduce, you leave 
me with a rather frustrating situation.

Hth,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-24 21:30                   ` Eyvind Bernhardsen
@ 2008-07-25  0:01                     ` Johannes Schindelin
  2008-07-25 12:30                       ` Eyvind Bernhardsen
  0 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-25  0:01 UTC (permalink / raw
  To: Eyvind Bernhardsen; +Cc: Avery Pennarun, Joshua Jensen, Junio C Hamano, git

Hi,

On Thu, 24 Jul 2008, Eyvind Bernhardsen wrote:

> On 23. juli. 2008, at 21.44, Johannes Schindelin wrote:
> 
> [...]
> 
> >I deem it not, uhm, magical.  By your reasoning there should be a way 
> >for Git to convert a file to UTF-8 when some entertaining person 
> >converted the working directory file to ISO-8859-15.
> 
> Um, what?  This discussion is about figuring out when Git should mess 
> with the line endings a user is trying to commit.

This discusion is about when Git should mess with _content_ at all.

It is not limited to line endings.

Ciao,
Dscho "who personally could not care less about CR if it was not for M$' 
stupidity"

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-25  0:01                     ` Johannes Schindelin
@ 2008-07-25 12:30                       ` Eyvind Bernhardsen
  2008-07-25 14:01                         ` Dmitry Potapov
  0 siblings, 1 reply; 58+ messages in thread
From: Eyvind Bernhardsen @ 2008-07-25 12:30 UTC (permalink / raw
  To: Johannes Schindelin; +Cc: Avery Pennarun, Joshua Jensen, Junio C Hamano, git

On 25. juli. 2008, at 02.01, Johannes Schindelin wrote:

>> Um, what?  This discussion is about figuring out when Git should mess
>> with the line endings a user is trying to commit.
>
> This discusion is about when Git should mess with _content_ at all.
>
> It is not limited to line endings.

Fair enough.  Did you read the rest of my email to see when I think  
Git should mess with content?  I've thought about it, and being able  
to do stuff like this in .gitattributes would work for me:

* eol=auto
*.bat eol=crlf
Makefile eol=lf
bin/magic-binary eol=none

I.e. "detect line endings and do CRLF->LF conversion on all files  
except *.bat (*->CRLF), Makefile (*->LF) and bin/magic-binary (do  
nothing)".

The closest you can get today is by setting core.autocrlf to "input"  
and having this in .gitattributes:

*.bat -crlf
bin/magic-binary -crlf

...but "core.autocrlf" is not versioned and applies to _all_  
repositories, and anyone who doesn't have the correct setting can mess  
the repository up.  There's also no way of saying "this file should  
have LF line endings, even with autocrlf=true".

One problem is that the autocrlf setting mixes "I want LF only in my  
repositories" and "I like to have CRLFs in my working directory" into  
one config variable.  Instead, I'd like to have a config setting that  
specifies which line ending form I prefer: "when a text file is marked  
eol=auto, convert LFs to CRLFs on checkout".

Does this make sense to anyone but me?

> Ciao,
> Dscho "who personally could not care less about CR if it was not for  
> M$'
> stupidity"

Well, we agree on that.
-- 
Eyvind Bernhardsen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-25 12:30                       ` Eyvind Bernhardsen
@ 2008-07-25 14:01                         ` Dmitry Potapov
  2008-07-25 21:05                           ` Eyvind Bernhardsen
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-25 14:01 UTC (permalink / raw
  To: Eyvind Bernhardsen
  Cc: Johannes Schindelin, Avery Pennarun, Joshua Jensen,
	Junio C Hamano, git

On Fri, Jul 25, 2008 at 02:30:16PM +0200, Eyvind Bernhardsen wrote:
> 
> Fair enough.  Did you read the rest of my email to see when I think  
> Git should mess with content?  I've thought about it, and being able  
> to do stuff like this in .gitattributes would work for me:
> 
> * eol=auto
> *.bat eol=crlf
> Makefile eol=lf
> bin/magic-binary eol=none
> 
> I.e. "detect line endings and do CRLF->LF conversion on all files  
> except *.bat (*->CRLF), Makefile (*->LF) and bin/magic-binary (do  
> nothing)".

I suppose "* eol=auto" means to convert CRLF->LF on checkin and
LF->native on checkout?

Also, perhaps, it should be also possible to explicitly specify:
*.txt eol=native
which is the same as 'auto' but without guessing whether it is text
or not.

> ...but "core.autocrlf" is not versioned and applies to _all_  
> repositories, and anyone who doesn't have the correct setting can mess  
> the repository up.

I think the real issue here is not as much about being or not being
versioned, but about forcing and not forcing anything on users.

If we had core.autocrlf=input as default then clueless users will not
checkin files with the incorrect ending. But there is an objection to
that -- you penalize those who always have good endings. And even the
fact that is merely default value that you can easily change to false
does not convince everyone.

The same can be said about your
* eol=auto
It forces conversion on everyone, even on those who do not need it.
Of course, you can say those projects that do not have the problem with
clueless users putting text files with incorrect end-of-lines will not
have lines like that in their .gitattribute. Yet, if I participate in
that project, why do I have to pay the price for this conversion just
because someone stupid can mess up line-endings?

> There's also no way of saying "this file should  
> have LF line endings, even with autocrlf=true".

Actually, there is

*.sh crlf=input

i.e. I want my shell files to have LF even I normally use CRLF for
all other files (on Windows).

> 
> One problem is that the autocrlf setting mixes "I want LF only in my  
> repositories" and "I like to have CRLFs in my working directory" into  
> one config variable.  Instead, I'd like to have a config setting that  
> specifies which line ending form I prefer: "when a text file is marked  
> eol=auto, convert LFs to CRLFs on checkout".

Following your style above, I believe it should be defined as
native-eol=crlf

but there are people who do not want to pay any price for conversion.
Currently, "core.autocrlf=false" means to do nothing about end-of-lines,
and even to ignore setting in .gitattributes. Should it be possible to
disable *any* conversion on checkin and checkout? Should this be that
value be the default, which most users use?

Dmitry
--
If you write a program that any idiot can use, only idiots will use it.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-25 14:01                         ` Dmitry Potapov
@ 2008-07-25 21:05                           ` Eyvind Bernhardsen
  2008-07-26  2:09                             ` Johannes Schindelin
  2008-07-29 13:46                             ` Dmitry Potapov
  0 siblings, 2 replies; 58+ messages in thread
From: Eyvind Bernhardsen @ 2008-07-25 21:05 UTC (permalink / raw
  To: Dmitry Potapov
  Cc: Johannes Schindelin, Avery Pennarun, Joshua Jensen,
	Junio C Hamano, git

On 25. juli. 2008, at 16.01, Dmitry Potapov wrote:

> On Fri, Jul 25, 2008 at 02:30:16PM +0200, Eyvind Bernhardsen wrote:
>>
>> Fair enough.  Did you read the rest of my email to see when I think
>> Git should mess with content?  I've thought about it, and being able
>> to do stuff like this in .gitattributes would work for me:
>>
>> * eol=auto
>> *.bat eol=crlf
>> Makefile eol=lf
>> bin/magic-binary eol=none
>>
>> I.e. "detect line endings and do CRLF->LF conversion on all files
>> except *.bat (*->CRLF), Makefile (*->LF) and bin/magic-binary (do
>> nothing)".
>
> I suppose "* eol=auto" means to convert CRLF->LF on checkin and
> LF->native on checkout?

That's the idea, yes, with "native" being configurable.

> Also, perhaps, it should be also possible to explicitly specify:
> *.txt eol=native
> which is the same as 'auto' but without guessing whether it is text
> or not.

Yes!  Good catch, that needs to be possible.

>> ...but "core.autocrlf" is not versioned and applies to _all_
>> repositories, and anyone who doesn't have the correct setting can  
>> mess
>> the repository up.
>
> I think the real issue here is not as much about being or not being
> versioned, but about forcing and not forcing anything on users.

The reason I want versioning is to fix the problem of enforcing  
normalised line endings in repositories with checked-in CRLFs; I'm not  
sure how to solve it any other way, but I'm open to suggestions.

> If we had core.autocrlf=input as default then clueless users will not
> checkin files with the incorrect ending. But there is an objection to
> that -- you penalize those who always have good endings. And even the
> fact that is merely default value that you can easily change to false
> does not convince everyone.

That is an excellent argument for why setting "autocrlf=true" by  
default on Windows was a bad idea.  Thanks! :)

> The same can be said about your
> * eol=auto
> It forces conversion on everyone, even on those who do not need it.
> Of course, you can say those projects that do not have the problem  
> with
> clueless users putting text files with incorrect end-of-lines will not
> have lines like that in their .gitattribute. Yet, if I participate in
> that project, why do I have to pay the price for this conversion just
> because someone stupid can mess up line-endings?

It's about correctness: if the repository isn't supposed to have CRs  
in any text files, that needs to be enforced.  You might not be  
stupid, but that doesn't mean you won't ever take a file that was  
created on Windows and commit it to the repository on Linux.  If the  
tool used to create the file was CR-damaged, there goes the  
repository's LF-only policy.

As you say, the reason I want the setting to be per-repository is that  
I don't think the cost is worthwhile for every repository.  The case  
where it _is_ worthwhile is when the repository will be shared between  
Windows users and Linux users, and the Windows users want CRLFs in  
their working directories.  I think it's worthwhile to actually make  
Git work right in that case.

As a side note, there's a lot of complaining about the cost of  
enforcing LF-only input, but I can't remember seeing any actual  
numbers.  Is it really that bad?

>> There's also no way of saying "this file should
>> have LF line endings, even with autocrlf=true".
>
> Actually, there is
>
> *.sh crlf=input
>
> i.e. I want my shell files to have LF even I normally use CRLF for
> all other files (on Windows).

Won't they still be converted to CRLF on checkout when autocrlf=true  
on Windows?

>> One problem is that the autocrlf setting mixes "I want LF only in my
>> repositories" and "I like to have CRLFs in my working directory" into
>> one config variable.  Instead, I'd like to have a config setting that
>> specifies which line ending form I prefer: "when a text file is  
>> marked
>> eol=auto, convert LFs to CRLFs on checkout".
>
> Following your style above, I believe it should be defined as
> native-eol=crlf

Yes, that sounds right.

> but there are people who do not want to pay any price for conversion.
> Currently, "core.autocrlf=false" means to do nothing about end-of- 
> lines,
> and even to ignore setting in .gitattributes. Should it be possible to
> disable *any* conversion on checkin and checkout? Should this be that
> value be the default, which most users use?

Well, there's no reason why Git repositories used only on Windows  
machines shouldn't store CRLFs, so why should all msysgit users pay  
the cost on every checkin _and_ checkout?

This is the reason the setting needs to be a per-repository policy and  
not a user configuration; users should not be able to configure away  
correctness, but they shouldn't be penalised unnecessarily for it,  
either.
-- 
Eyvind Bernhardsen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-25 21:05                           ` Eyvind Bernhardsen
@ 2008-07-26  2:09                             ` Johannes Schindelin
  2008-07-29 19:11                               ` Eyvind Bernhardsen
  2008-07-29 13:46                             ` Dmitry Potapov
  1 sibling, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2008-07-26  2:09 UTC (permalink / raw
  To: Eyvind Bernhardsen
  Cc: Dmitry Potapov, Avery Pennarun, Joshua Jensen, Junio C Hamano,
	git

Hi,

On Fri, 25 Jul 2008, Eyvind Bernhardsen wrote:

> That is an excellent argument for why setting "autocrlf=true" by default 
> on Windows was a bad idea.  Thanks! :)

Well, these days, I tend to give a flying nothing to opinions that are not 
backed up by any effort toward the project.

In other words, if you have not participated in the community process to 
find what is best for Git, you could just as well say that you want the 
moon to be green, and I could not care less (in both cases).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-25 21:05                           ` Eyvind Bernhardsen
  2008-07-26  2:09                             ` Johannes Schindelin
@ 2008-07-29 13:46                             ` Dmitry Potapov
  2008-07-29 21:17                               ` Eyvind Bernhardsen
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-29 13:46 UTC (permalink / raw
  To: Eyvind Bernhardsen
  Cc: Johannes Schindelin, Avery Pennarun, Joshua Jensen,
	Junio C Hamano, git

On Fri, Jul 25, 2008 at 11:05:33PM +0200, Eyvind Bernhardsen wrote:
> 
> The reason I want versioning is to fix the problem of enforcing  
> normalised line endings in repositories with checked-in CRLFs; I'm not  
> sure how to solve it any other way, but I'm open to suggestions.

You clearly want *enforcing* and versioning for you is just means to do
that. Regardless, what is the best way to achieve that, the *main*
question is whether we want this enforcing and if yet then in what form
and to what degree...

> 
> >If we had core.autocrlf=input as default then clueless users will not
> >checkin files with the incorrect ending. But there is an objection to
> >that -- you penalize those who always have good endings. And even the
> >fact that is merely default value that you can easily change to false
> >does not convince everyone.
> 
> That is an excellent argument for why setting "autocrlf=true" by  
> default on Windows was a bad idea.  Thanks! :)

I am sorry, but I don't see connection here.

> 
> >The same can be said about your
> >* eol=auto
> >It forces conversion on everyone, even on those who do not need it.
> >Of course, you can say those projects that do not have the problem  
> >with
> >clueless users putting text files with incorrect end-of-lines will not
> >have lines like that in their .gitattribute. Yet, if I participate in
> >that project, why do I have to pay the price for this conversion just
> >because someone stupid can mess up line-endings?
> 
> It's about correctness: if the repository isn't supposed to have CRs  
> in any text files, that needs to be enforced.  You might not be  
> stupid, but that doesn't mean you won't ever take a file that was  
> created on Windows and commit it to the repository on Linux.  If the  
> tool used to create the file was CR-damaged, there goes the  
> repository's LF-only policy.

Again, why should people who do not use Windows and other CR-damaged
tools be penalized? No one can prevent me from putting in *my* own 
repository whatever I want anyway. Thus, if we agreed having this
conversion/check is useful, remaining questions are:
1. whether this check should be possible to turn off
2. whether this check should be turned off by default for people
   who use Git on other system than Windows?
The current status is:
1. Yes, it is possible to turn of this conversion
2. It is turned off by default for anyone but MSYS/GIT users.

And the main argument for having that in this way is that people with LF
text files should be unnecessary penalized for Windows being different.

> 
> As you say, the reason I want the setting to be per-repository is that  
> I don't think the cost is worthwhile for every repository.

Side note: Personally, I am not very concerned about this cost, but some
people are...

> The case  
> where it _is_ worthwhile is when the repository will be shared between  
> Windows users and Linux users, and the Windows users want CRLFs in  
> their working directories.  I think it's worthwhile to actually make  
> Git work right in that case.

Windows users who want CRLF should set autocrlf=true
Windows users who prefer LF should set autocrlf=input
Non-Windows users who copy file directly from Windows should also
set autocrlf=input
All other users who do not touch Windows may have autocrlf=false.

Files that do not need conversion (such as *.bat) should be marked as
"-crlf" in .gitattributes.

Of course, those who are very careful and have good editors can set
autocrlf=false even on Windows...

> 
> As a side note, there's a lot of complaining about the cost of  
> enforcing LF-only input, but I can't remember seeing any actual  
> numbers.  Is it really that bad?

No, it is not bad. In most practical cases, you will not notice any
difference. I've posted some numbers in this thread. You can find
them here:
http://article.gmane.org/gmane.comp.version-control.git/89908

> 
> >>There's also no way of saying "this file should
> >>have LF line endings, even with autocrlf=true".
> >
> >Actually, there is
> >
> >*.sh crlf=input
> >
> >i.e. I want my shell files to have LF even I normally use CRLF for
> >all other files (on Windows).
> 
> Won't they still be converted to CRLF on checkout when autocrlf=true  
> on Windows?

autocrlf=true works in the same way on Linux and Windows, and I have
tested it right now, and it works as I said above.

> >but there are people who do not want to pay any price for conversion.
> >Currently, "core.autocrlf=false" means to do nothing about end-of- 
> >lines,
> >and even to ignore setting in .gitattributes. Should it be possible to
> >disable *any* conversion on checkin and checkout? Should this be that
> >value be the default, which most users use?
> 
> Well, there's no reason why Git repositories used only on Windows  
> machines shouldn't store CRLFs, so why should all msysgit users pay  
> the cost on every checkin _and_ checkout?

1. You may want to use on other platforms later. In any case, it makes
   much sense to have the default that compatible with other systems.
2. Conversion cost is not significant (I suppose much less than gzip
   compression used for all objects) and it also saves some space.
3. Internally, Git considers only LF as true end of line. CR is just
   like an extra space before end-of-line. Is any good reason to have
   it inside of your repo anyway?
4. This is what other VCSes do on Windows. CVS converts all text files
   on checkin. SVN does the same for files having svn:eol-style=native.

In fact, if you read the discussion we had here some time ago, you may
find some other arguments too.

> 
> This is the reason the setting needs to be a per-repository policy and  
> not a user configuration;

What exactly?

> users should not be able to configure away  
> correctness,

Why not? *Every* user has him/her own repository, so he/she can do
*anything* with it, in principle.

> but they shouldn't be penalised unnecessarily for it,  
> either.

The problem is how to determine when it is necessary and when it is
not. If I never commit with wrong EOLs, I don't think it is necessary
for me to have this conversion... On the other hand, I don't mind having
this check as default.  It does not really bother me much, and if I can
turn it off, it is fine with me. But I suppose other people may feel
differently about this issue.


Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-26  2:09                             ` Johannes Schindelin
@ 2008-07-29 19:11                               ` Eyvind Bernhardsen
  0 siblings, 0 replies; 58+ messages in thread
From: Eyvind Bernhardsen @ 2008-07-29 19:11 UTC (permalink / raw
  To: Johannes Schindelin
  Cc: Dmitry Potapov, Avery Pennarun, Joshua Jensen, Junio C Hamano,
	git


On 26. juli. 2008, at 04.09, Johannes Schindelin wrote:

> Hi,
>
> On Fri, 25 Jul 2008, Eyvind Bernhardsen wrote:
>
>> That is an excellent argument for why setting "autocrlf=true" by  
>> default
>> on Windows was a bad idea.  Thanks! :)
>
> Well, these days, I tend to give a flying nothing to opinions that  
> are not
> backed up by any effort toward the project.

Yeah, sorry, I'll stop complaining about this until I have actual code  
to back me up.
-- 
Eyvind Bernhardsen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-29 13:46                             ` Dmitry Potapov
@ 2008-07-29 21:17                               ` Eyvind Bernhardsen
  2008-07-30  5:35                                 ` Steffen Prohaska
  2008-07-30 21:45                                 ` Dmitry Potapov
  0 siblings, 2 replies; 58+ messages in thread
From: Eyvind Bernhardsen @ 2008-07-29 21:17 UTC (permalink / raw
  To: Dmitry Potapov
  Cc: Johannes Schindelin, Avery Pennarun, Joshua Jensen,
	Junio C Hamano, git

On 29. juli. 2008, at 15.46, Dmitry Potapov wrote:

> On Fri, Jul 25, 2008 at 11:05:33PM +0200, Eyvind Bernhardsen wrote:
>
> You clearly want *enforcing* and versioning for you is just means to  
> do
> that. Regardless, what is the best way to achieve that, the *main*
> question is whether we want this enforcing and if yet then in what  
> form
> and to what degree...

The versioning has nothing to do with the enforcing per se, it just  
makes it possible to convert a repository with CRLFs to one that's LF- 
only without having to rewrite history.  See below.

>>> If we had core.autocrlf=input as default then clueless users will  
>>> not
>>> checkin files with the incorrect ending. But there is an objection  
>>> to
>>> that -- you penalize those who always have good endings. And even  
>>> the
>>> fact that is merely default value that you can easily change to  
>>> false
>>> does not convince everyone.
>>
>> That is an excellent argument for why setting "autocrlf=true" by
>> default on Windows was a bad idea.  Thanks! :)
>
> I am sorry, but I don't see connection here.

My point was that autocrlf penalises Windows users just as much as it  
does Linux users, so why should it be turned on by default on  
Windows?  But I've promised not to complain any more about this until  
I have cold, hard code to back me up.

> Again, why should people who do not use Windows and other CR-damaged
> tools be penalized? No one can prevent me from putting in *my* own
> repository whatever I want anyway. Thus, if we agreed having this
> conversion/check is useful, remaining questions are:
> 1. whether this check should be possible to turn off
> 2. whether this check should be turned off by default for people
>   who use Git on other system than Windows?
> The current status is:
> 1. Yes, it is possible to turn of this conversion
> 2. It is turned off by default for anyone but MSYS/GIT users.
>
> And the main argument for having that in this way is that people  
> with LF
> text files should be unnecessary penalized for Windows being  
> different.

I know, but my point is that I don't like to be unnecessarily  
penalised any more when I am using Windows than when I'm using Linux  
or OS X.  I would like the default to be "no conversion", and for  
conversion to be enabled not based on platform, but as a policy  
decision on the repositories where it actually matters.

You can have anything you like in _your_ repository, of course, but if  
you're not publishing it anywhere, who cares what your line endings  
are?  Your line endings only matter when you publish.  That's why I  
want a setting that is propagated: so that when you clone a repository  
with a LF-only policy, Git knows what to do.

>> As you say, the reason I want the setting to be per-repository is  
>> that
>> I don't think the cost is worthwhile for every repository.
>
> Side note: Personally, I am not very concerned about this cost, but  
> some
> people are...

Yeah :)

I think the real penalty is that with autocrlf enabled, Git no longer  
stores exactly what I committed.

>> The case
>> where it _is_ worthwhile is when the repository will be shared  
>> between
>> Windows users and Linux users, and the Windows users want CRLFs in
>> their working directories.  I think it's worthwhile to actually make
>> Git work right in that case.
>
> Windows users who want CRLF should set autocrlf=true
> Windows users who prefer LF should set autocrlf=input
> Non-Windows users who copy file directly from Windows should also
> set autocrlf=input
> All other users who do not touch Windows may have autocrlf=false.
>
> Files that do not need conversion (such as *.bat) should be marked as
> "-crlf" in .gitattributes.

Yes, and I see you checked that "crlf=input" does actually work when  
you want LF-only, sorry about that.  The syntax is _horrible_, though.

> Of course, those who are very careful and have good editors can set
> autocrlf=false even on Windows...

Right, or who know that the repository they're using will only be  
shared with other Windows users.  But with "autocrlf=false" in their  
config, they would have to remember to set "autocrlf=true" in .git/ 
config in any new repositories they want to share with Linux users.   
I'd like to make that per-repository configuration automatic.

>> As a side note, there's a lot of complaining about the cost of
>> enforcing LF-only input, but I can't remember seeing any actual
>> numbers.  Is it really that bad?
>
> No, it is not bad. In most practical cases, you will not notice any
> difference. I've posted some numbers in this thread. You can find
> them here:
> http://article.gmane.org/gmane.comp.version-control.git/89908

Oh, thanks!  I missed that post, and I agree that those numbers don't  
look particularly worrying.

[...]

>>> but there are people who do not want to pay any price for  
>>> conversion.
>>> Currently, "core.autocrlf=false" means to do nothing about end-of-
>>> lines,
>>> and even to ignore setting in .gitattributes. Should it be  
>>> possible to
>>> disable *any* conversion on checkin and checkout? Should this be  
>>> that
>>> value be the default, which most users use?
>>
>> Well, there's no reason why Git repositories used only on Windows
>> machines shouldn't store CRLFs, so why should all msysgit users pay
>> the cost on every checkin _and_ checkout?
>
> 1. You may want to use on other platforms later. In any case, it makes
>   much sense to have the default that compatible with other systems.

This is why I want the setting to be versioned.  If you decide to  
share with other platforms later, you just enable the setting and "git  
commit -a" (untested!  Recursive touching is probably required  
first).  No history has to be rewritten, and new checkouts will work  
properly (as will old checkouts, but for a slightly different value of  
"properly").

> 2. Conversion cost is not significant (I suppose much less than gzip
>   compression used for all objects) and it also saves some space.
> 3. Internally, Git considers only LF as true end of line. CR is just
>   like an extra space before end-of-line. Is any good reason to have
>   it inside of your repo anyway?

Internally, Git doesn't really care, does it?  The only reason to keep  
the line endings I committed is "because that's what I committed", but  
I think it's quite a good reason.

> 4. This is what other VCSes do on Windows. CVS converts all text files
>   on checkin. SVN does the same for files having svn:eol-style=native.

Heh.  Where I work, we hacked CVS for Windows to get away from that  
behaviour :)

> In fact, if you read the discussion we had here some time ago, you may
> find some other arguments too.

Yep, but I think I'm doing this backwards.  Instead of rehashing old  
arguments, I need to implement something that does what I want and let  
the list decide if it's useful or not.

>> This is the reason the setting needs to be a per-repository policy  
>> and
>> not a user configuration;
>
> What exactly?
>
>
>> users should not be able to configure away
>> correctness,
>
> Why not? *Every* user has him/her own repository, so he/she can do
> *anything* with it, in principle.
>
>> but they shouldn't be penalised unnecessarily for it,
>> either.
>
> The problem is how to determine when it is necessary and when it is
> not. If I never commit with wrong EOLs, I don't think it is necessary
> for me to have this conversion... On the other hand, I don't mind  
> having
> this check as default.  It does not really bother me much, and if I  
> can
> turn it off, it is fine with me. But I suppose other people may feel
> differently about this issue.

Well, what I want is to be able to say "it is necessary to do eol  
conversion in this repository", allowing the default to be "don't do  
eol conversion" on Windows, too.  For a setting like that to be  
useful, it has to be propagated when the repository is cloned.

I want it to be versioned because you might want to change it without  
messing with the content that's already in the repository.  This is  
actually my main motivation, since I have lots of CRLF-infused CVS  
history to deal with.

It should apply on Linux as well as Windows because there is always  
the chance that a user will manage to commit a CRLF on Linux (a  
colleague of mine once complained that CVS on Linux doesn't do eol  
conversion; he edited files on Windows, but checked them in on  
Linux).  It would probably be okay to have a setting that turns all  
conversion off, but wouldn't that be kind of rude?

Thanks, all, for your help in getting my ideas straight; now to find  
out if I'm coder enough to implement them.  If not, I'll just stick  
with setting "autocrlf=false" on Windows and stop whining.
-- 
Eyvind Bernhardsen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-29 21:17                               ` Eyvind Bernhardsen
@ 2008-07-30  5:35                                 ` Steffen Prohaska
  2008-07-30 18:33                                   ` Avery Pennarun
  2008-07-30 21:45                                 ` Dmitry Potapov
  1 sibling, 1 reply; 58+ messages in thread
From: Steffen Prohaska @ 2008-07-30  5:35 UTC (permalink / raw
  To: Eyvind Bernhardsen
  Cc: Dmitry Potapov, Johannes Schindelin, Avery Pennarun,
	Joshua Jensen, Junio C Hamano, git


On Jul 29, 2008, at 11:17 PM, Eyvind Bernhardsen wrote:

>>> As you say, the reason I want the setting to be per-repository is  
>>> that
>>> I don't think the cost is worthwhile for every repository.
>>
>> Side note: Personally, I am not very concerned about this cost, but  
>> some
>> people are...
>
> Yeah :)
>
> I think the real penalty is that with autocrlf enabled, Git no  
> longer stores exactly what I committed.

Git does *never* exactly store what you committed.  Git compresses
your data and creates packs containing many of your individual
files in a single pack.

What matters is that git gives you exactly back what you committed.  It
does so with core.autocrlf=true, unless you check out with a different
setting for autocrlf.  There is a small chance that git decides that
a file is text even though it should be binary and that the content of
this file does not allow for reversible CRLF-conversion.  In this case
git warns about the irreversible conversion and the user gets a chance
to correct git's choice.

We accept this slight chance of irreversible conversion because we do
want to handle line-endings of text files for cross-platform use.  For
this, the goal of "giving you *exactly* back what you committed" is
modified.  Instead, we want to give you exactly back what you committed,
except for line-endings (in text files), which should be converted to
the platform-dependent line-endings (LF or CRLF), depending on the  
user's
setting.

Because of a design choice we made, CRLF must be converted on Windows.
We decided that the token that git uses *internally* to represent
a line-ending in a text file is LF.  We made this choice because git
originally supported only Unix and so we chose the Unix line-ending for
representing line-endings internally.  Now, Windows uses CRLF to
indicate line-endings but git internally uses LF, so we must convert
them.  Note that if we had users that completely ignored their native
Windows environment and only used well-selected tools, all configured to
*never* write native Windows line-ending, for these users we could set
autocrlf=false and the repository would nonetheless only contain LFs.
Those exceptional super-expert users could manually modify their
settings.  The average user (including me) will not be able to guarantee
that he will never create CRLF in text files on Windows.  Those users
simply accept that they work on Windows and use the native line-endings
(CRLF) and because we care about these average users we set  
autocrlf=true.

In contrast, setting autocrlf=input on Unix is only a safety valve.  The
average user who is only working on Unix will most likely *never* create
CRLF line-endings.  In a Unix-only environment it is actually very hard
to create CRLF line-endings.  Thus, the current default (autocrlf unset)
assumes that all text files on Unix contain only LF, and git wants LF
internally, which means we do not need to convert the line-endings.  In
cross-platform environments however, our assumption that all files on
Unix contain only LFs probably no longer holds.  In a cross-platform
environment you can easily copy files from Windows to Unix and thus
*easily* create files on Unix that contains CRLF.  In this case
autocrlf=input can save you, by correcting the line-endings for you.  In
this case, git *does not* give you exactly back what you committed, but
gives you back the very same text you committed however with the native
LF line-endings.

Personally I believe that our assumption that it is virtually impossible
to unintentionally create CRLF line-endings on Unix is wrong; but the
prevailing opinion on the list is different.  Personally, I believe that
autocrlf=input should be the default on Unix to shield the repository
from CRLFs.  I am using autocrlf=input for some time now and it has
already saved me several times.  Note that I am not working in a
Unix-only environment, but in a mixed Unix/Mac/Windows environment, so
unintentionally creating CRLFs is quite easy.

Another valid concern is speed.  But the timings that Dmitry presented
indicate that the overhead of autocrlf is so small that it is hard to
measure in practice.  I think we should stop raising this concern unless
someone comes up with timings that indicate a larger overhead than
measured by Dmitry.

	Steffen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-30  5:35                                 ` Steffen Prohaska
@ 2008-07-30 18:33                                   ` Avery Pennarun
  2008-07-30 19:25                                     ` Steffen Prohaska
  0 siblings, 1 reply; 58+ messages in thread
From: Avery Pennarun @ 2008-07-30 18:33 UTC (permalink / raw
  To: Steffen Prohaska
  Cc: Eyvind Bernhardsen, Dmitry Potapov, Johannes Schindelin,
	Joshua Jensen, Junio C Hamano, git

On 7/30/08, Steffen Prohaska <prohaska@zib.de> wrote:
>  On Jul 29, 2008, at 11:17 PM, Eyvind Bernhardsen wrote:
> > I think the real penalty is that with autocrlf enabled, Git no longer
> > stores exactly what I committed.
>
>  Git does *never* exactly store what you committed.  Git compresses
>  your data and creates packs containing many of your individual
>  files in a single pack.

This is not really an argument; making perfectly reversible changes as
part of the storage process, then reversing them as part of the
retrieval process, doesn't count as not storing what I committed.

>  What matters is that git gives you exactly back what you committed.  It
>  does so with core.autocrlf=true, unless you check out with a different
>  setting for autocrlf.

You can tell that this statement isn't quite true because if you have
a file with mixed LF and CRLF line endings, which I do (thanks,
Windows!) then CRLF->LF conversion is not a reversible operation.
Interestingly LF->CRLF still is (because an LF->CRLF'd file will never
have a bare LF, and on such a subset of files, CRLF->LF is
reversible).

Also note that core.autocrlf=input is *definitely* not a perfectly
reversible operation.

And so here's the problem: svn hands you a file.  It may or may not
have CRLFs in it, and the line endings may actually be a random mix of
LF and CRLF, as I am actually experiencing at the moment in a
particular repository at work.  If core.autocrlf is anything other
than "false", git will modify the file, and git-svn won't be apply the
diff on the next revision.

It's conceivable that core.autocrlf=true will work if your svn
repository is pure and svn hands you files only with CRLF endings.
It's somewhat unlikely that most svn repositories are in that state
(remember: it has to be perfect in *every revision* for git-svn to
work).

So anyway, I can't see how git-svn can possibly work in the general
case if core.autocrlf is anything other than false *at git-svn fetch
time*.  And that's what I do, and it works great, modulo a bunch of
stupid CRLFs that sneak into the repo via svn, but those can be fixed.
 Someday we'll stop using svn, and git-filter-branch can fix them all
retroactively so that "blame" will work right.

Perhaps git-svn needs to actually ignore the core.autocrlf setting?

Have fun,

Avery

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-30 18:33                                   ` Avery Pennarun
@ 2008-07-30 19:25                                     ` Steffen Prohaska
  2008-07-30 21:07                                       ` Avery Pennarun
  2008-08-03 16:54                                       ` Tarmigan
  0 siblings, 2 replies; 58+ messages in thread
From: Steffen Prohaska @ 2008-07-30 19:25 UTC (permalink / raw
  To: Avery Pennarun
  Cc: Eyvind Bernhardsen, Dmitry Potapov, Johannes Schindelin,
	Joshua Jensen, Junio C Hamano, git


On Jul 30, 2008, at 8:33 PM, Avery Pennarun wrote:

> On 7/30/08, Steffen Prohaska <prohaska@zib.de> wrote:
>
>> What matters is that git gives you exactly back what you  
>> committed.  It
>> does so with core.autocrlf=true, unless you check out with a  
>> different
>> setting for autocrlf.
>
> You can tell that this statement isn't quite true because if you have
> a file with mixed LF and CRLF line endings, which I do (thanks,
> Windows!) then CRLF->LF conversion is not a reversible operation.
> Interestingly LF->CRLF still is (because an LF->CRLF'd file will never
> have a bare LF, and on such a subset of files, CRLF->LF is
> reversible).
>
> Also note that core.autocrlf=input is *definitely* not a perfectly
> reversible operation.

You are absolutely right.  The files your describe are modified by git,
because they are "invalid" text files, as git defines them.  For git's
autocrlf mechanism to work, a text file is only allowed to have a
*single* type of line endings.  Otherwise it is broken and git tries to
help you fixing it.


> And so here's the problem: svn hands you a file.  It may or may not
> have CRLFs in it, and the line endings may actually be a random mix of
> LF and CRLF, as I am actually experiencing at the moment in a
> particular repository at work.  If core.autocrlf is anything other
> than "false", git will modify the file, and git-svn won't be apply the
> diff on the next revision.

This sound like a specific problem with svn, not a general problem
of git's autocrlf concept.  I work with a git-only workflow and I
never see the problems you describe.


> It's conceivable that core.autocrlf=true will work if your svn
> repository is pure and svn hands you files only with CRLF endings.
> It's somewhat unlikely that most svn repositories are in that state
> (remember: it has to be perfect in *every revision* for git-svn to
> work).
>
> So anyway, I can't see how git-svn can possibly work in the general
> case if core.autocrlf is anything other than false *at git-svn fetch
> time*.

That might be the case.  Note though that I cannot contribute
much to this discussion because I never use git-svn.


> And that's what I do, and it works great, modulo a bunch of
> stupid CRLFs that sneak into the repo via svn, but those can be fixed.
> Someday we'll stop using svn, and git-filter-branch can fix them all
> retroactively so that "blame" will work right.
>
> Perhaps git-svn needs to actually ignore the core.autocrlf setting?

 From what you describe, I get the impression that git-svn's handling
of line endings could certainly be improved.

	Steffen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-30 19:25                                     ` Steffen Prohaska
@ 2008-07-30 21:07                                       ` Avery Pennarun
  2008-07-30 22:02                                         ` Dmitry Potapov
  2008-08-03 16:54                                       ` Tarmigan
  1 sibling, 1 reply; 58+ messages in thread
From: Avery Pennarun @ 2008-07-30 21:07 UTC (permalink / raw
  To: Steffen Prohaska
  Cc: Eyvind Bernhardsen, Dmitry Potapov, Johannes Schindelin,
	Joshua Jensen, Junio C Hamano, git

On 7/30/08, Steffen Prohaska <prohaska@zib.de> wrote:
> > And so here's the problem: svn hands you a file.  It may or may not
> > have CRLFs in it, and the line endings may actually be a random mix of
> > LF and CRLF, as I am actually experiencing at the moment in a
> > particular repository at work.  If core.autocrlf is anything other
> > than "false", git will modify the file, and git-svn won't be apply the
> > diff on the next revision.
>
>  This sound like a specific problem with svn, not a general problem
>  of git's autocrlf concept.  I work with a git-only workflow and I
>  never see the problems you describe.

My apologies, I think I got this thread mixed up with a different
thread about the fact that git-svn doesn't work with autocrlf.

FWIW, this problem would apply to any system that incrementally
imports into git from another system using binary deltas.

Have fun,

Avery

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-29 21:17                               ` Eyvind Bernhardsen
  2008-07-30  5:35                                 ` Steffen Prohaska
@ 2008-07-30 21:45                                 ` Dmitry Potapov
  2008-08-02 12:51                                   ` Eyvind Bernhardsen
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-30 21:45 UTC (permalink / raw
  To: Eyvind Bernhardsen
  Cc: Johannes Schindelin, Avery Pennarun, Joshua Jensen,
	Junio C Hamano, git

On Tue, Jul 29, 2008 at 11:17:23PM +0200, Eyvind Bernhardsen wrote:
> 
> My point was that autocrlf penalises Windows users just as much as it  
> does Linux users, so why should it be turned on by default on  
> Windows? 

Because it does make sense on Windows (I mentioned some reasons for that
in my previous email), and it does NOT make any sense on Linux unless
you copy files from Windows.

> >
> >And the main argument for having that in this way is that people  
> >with LF
> >text files should be unnecessary penalized for Windows being  
> >different.
> 
> I know, but my point is that I don't like to be unnecessarily  
> penalised any more when I am using Windows than when I'm using Linux  
> or OS X.

It seems you still have not realized that from the very day when you
started to use Windows, you are penalized for that day-in, day-out.
Let's take something simple. For instance, the hello-world program:
cat <<=== > hello.c
#include <stdio.h>
int main() {
	printf ("Hello world!\n");
	return 0;
}
===

You can compile it on Linux and Windows (using Microsoft Visual C).
Now, if you run it on Linux, it will print 13 symbols (exactly
as many as there are symbols in the printf string) while on Windows
you will get 14 bytes. It means that printf and as many other C
function on Windows does conversion and penalize you already!

> I would like the default to be "no conversion", and for  
> conversion to be enabled not based on platform, but as a policy  
> decision on the repositories where it actually matters.

If Microsoft C library did not do any conversion, I suspect we would
have this CRLF conversion at all!

> You can have anything you like in _your_ repository, of course, but if  
> you're not publishing it anywhere, who cares what your line endings  
> are?  Your line endings only matter when you publish. 

You can publish only what you have. So, it must decided before.

> That's why I  
> want a setting that is propagated: so that when you clone a repository  
> with a LF-only policy, Git knows what to do.

LF-only policy is the only sane policy for any text files. It has nothing
to do with clone.

> 
> I think the real penalty is that with autocrlf enabled, Git no longer  
> stores exactly what I committed.

And what is wrong with that? In any case, Git deltifies and compress
your files. Why does it not bother you? So, what matters here is
whether you are able to get back exactly what you put or not. That's
why Git has this safecrlf option! Of course, it is guarantee to get
exactly the same back only if you have the same autocrlf setting,
but if you have different autocrlf settings, it means that you want
*different* representation of text files!

> >Files that do not need conversion (such as *.bat) should be marked as
> >"-crlf" in .gitattributes.
> 
> Yes, and I see you checked that "crlf=input" does actually work when  
> you want LF-only, sorry about that.  The syntax is _horrible_, though.
> 
> >Of course, those who are very careful and have good editors can set
> >autocrlf=false even on Windows...
> 
> Right, or who know that the repository they're using will only be  
> shared with other Windows users.

WRONG! Using storing CRLF in text files is a completely idiotic idea.
Those who do so asked for troubles, so they should not complain!

> 
> Internally, Git doesn't really care, does it?

It DOES!!! Such things like merges and diffs and many other every day
commands do care about end-of-lines and the ONLY end-of-line they
recognize is '\n'. In fact, there are more than 400 places in Git where
'\n' is used. Of course, not all of them may be qualified as internals
but many of them do.

So, as long as Git internally consider only LF as the end-of-line.

> 
> Heh.  Where I work, we hacked CVS for Windows to get away from that  
> behaviour :)

Maybe, you used Cygwin version of CVS, which had LF, but those Windows
versions of CVS I used produced CRLF on Windows.

> >
> >The problem is how to determine when it is necessary and when it is
> >not. If I never commit with wrong EOLs, I don't think it is necessary
> >for me to have this conversion... On the other hand, I don't mind  
> >having
> >this check as default.  It does not really bother me much, and if I  
> >can
> >turn it off, it is fine with me. But I suppose other people may feel
> >differently about this issue.
> 
> Well, what I want is to be able to say "it is necessary to do eol  
> conversion in this repository",

You always can do that in _your_ repository:
git config core.autocrlf ...

> allowing the default to be "don't do  
> eol conversion" on Windows, too.  For a setting like that to be  
> useful, it has to be propagated when the repository is cloned.

Why? Other people may have other preferences.

> I want it to be versioned because you might want to change it without  
> messing with the content that's already in the repository.  This is  
> actually my main motivation, since I have lots of CRLF-infused CVS  
> history to deal with.

Well, you can try to use .gitattributes, but I believe it would be far
more reasonable to clean this mess with line-ending than finding a
way to continue insanity with different ending in different text files.

> 
> It should apply on Linux as well as Windows because there is always  
> the chance that a user will manage to commit a CRLF on Linux (a  
> colleague of mine once complained that CVS on Linux doesn't do eol  
> conversion; he edited files on Windows, but checked them in on  
> Linux).

For those who need it, Git provides autocrlf=input. But the question
here is what should be default. I don't think that those who copies
files directly from Windows to Linux are majority. OTOH, I don't
mind autocrlf=input as default. If it can be turned off, it is fine
with me :)

> It would probably be okay to have a setting that turns all  
> conversion off, but wouldn't that be kind of rude?

How so?


Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-30 21:07                                       ` Avery Pennarun
@ 2008-07-30 22:02                                         ` Dmitry Potapov
  2008-07-30 22:14                                           ` Avery Pennarun
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-07-30 22:02 UTC (permalink / raw
  To: Avery Pennarun
  Cc: Steffen Prohaska, Eyvind Bernhardsen, Johannes Schindelin,
	Joshua Jensen, Junio C Hamano, git

On Wed, Jul 30, 2008 at 05:07:49PM -0400, Avery Pennarun wrote:
> 
> FWIW, this problem would apply to any system that incrementally
> imports into git from another system using binary deltas.

Not necessary. It depends on how import is done. You should not apply
this binary deltas to your working tree, but to files in the repo. And
obviously those files that are imported should be stored as is without
any conversion. In the same way as if you clone some git repo, you do
not apply any commit conversion to any existing commit. This conversion
should be done for *new* commits that you create locally.

So, the problem with git-svn is mostly due to how the import is done. Of
course, the other part of that problem is that conversion setting in Git
and SVN for text files specified very differently.

Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-30 22:02                                         ` Dmitry Potapov
@ 2008-07-30 22:14                                           ` Avery Pennarun
  0 siblings, 0 replies; 58+ messages in thread
From: Avery Pennarun @ 2008-07-30 22:14 UTC (permalink / raw
  To: Dmitry Potapov
  Cc: Steffen Prohaska, Eyvind Bernhardsen, Johannes Schindelin,
	Joshua Jensen, Junio C Hamano, git

On 7/30/08, Dmitry Potapov <dpotapov@gmail.com> wrote:
> On Wed, Jul 30, 2008 at 05:07:49PM -0400, Avery Pennarun wrote:
>  > FWIW, this problem would apply to any system that incrementally
>  > imports into git from another system using binary deltas.
>
> Not necessary. It depends on how import is done. You should not apply
>  this binary deltas to your working tree, but to files in the repo. And
>  obviously those files that are imported should be stored as is without
>  any conversion. In the same way as if you clone some git repo, you do
>  not apply any commit conversion to any existing commit. This conversion
>  should be done for *new* commits that you create locally.
>
>  So, the problem with git-svn is mostly due to how the import is done. Of
>  course, the other part of that problem is that conversion setting in Git
>  and SVN for text files specified very differently.

Hmm, if your first paragraphis true (do not apply any commit
conversion to any existing commit), then in fact SVN's method for
specifying text files doesn't really matter, right?  It's just a bug
in git-svn.

Is it possible to have git-svn override core.autocrlf temporarily?  If
someone can give me a hint how to do that, I don't mind working up a
patch for that. I can't see any situation where *not* doing so would
even work (unless core.autocrlf=false already, of course).

Have fun,

Avery

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-30 21:45                                 ` Dmitry Potapov
@ 2008-08-02 12:51                                   ` Eyvind Bernhardsen
  2008-08-03 16:21                                     ` Dmitry Potapov
  0 siblings, 1 reply; 58+ messages in thread
From: Eyvind Bernhardsen @ 2008-08-02 12:51 UTC (permalink / raw
  To: Dmitry Potapov
  Cc: Johannes Schindelin, Avery Pennarun, Joshua Jensen,
	Junio C Hamano, git

On 30. juli. 2008, at 23.45, Dmitry Potapov wrote:

> On Tue, Jul 29, 2008 at 11:17:23PM +0200, Eyvind Bernhardsen wrote:
>>
>> My point was that autocrlf penalises Windows users just as much as it
>> does Linux users, so why should it be turned on by default on
>> Windows?
>
> Because it does make sense on Windows (I mentioned some reasons for  
> that
> in my previous email), and it does NOT make any sense on Linux unless
> you copy files from Windows.

It doesn't make any more sense on Windows than it does on Linux,  
unless the user is careless about CRLFs and the repository will be  
shared with Linux users. If you have to work with undisciplined  
Windows users but don't actually use Windows yourself, it might _seem_  
to make sense, but that is an illusion.

>>> And the main argument for having that in this way is that people
>>> with LF
>>> text files should be unnecessary penalized for Windows being
>>> different.
>>
>> I know, but my point is that I don't like to be unnecessarily
>> penalised any more when I am using Windows than when I'm using Linux
>> or OS X.
>
> It seems you still have not realized that from the very day when you
> started to use Windows, you are penalized for that day-in, day-out.
> Let's take something simple. For instance, the hello-world program:
> cat <<=== > hello.c
> #include <stdio.h>
> int main() {
> 	printf ("Hello world!\n");
> 	return 0;
> }
> ===
>
> You can compile it on Linux and Windows (using Microsoft Visual C).
> Now, if you run it on Linux, it will print 13 symbols (exactly
> as many as there are symbols in the printf string) while on Windows
> you will get 14 bytes. It means that printf and as many other C
> function on Windows does conversion and penalize you already!

But I don't use printf to store files.  What does printf have to do  
with anything?

Btw, I'm not actually a "natural" Windows user; I have a Mac on my  
desktop, and I prefer Linux to Windows on my own PCs.  I often use  
Windows at work, though, and I'm trying to make Git work as well as  
possible when I do.

>> I would like the default to be "no conversion", and for
>> conversion to be enabled not based on platform, but as a policy
>> decision on the repositories where it actually matters.
>
> If Microsoft C library did not do any conversion, I suspect we would
> have this CRLF conversion at all!

Surely, but unfortunately, I don't live in happy fairy land.

>> You can have anything you like in _your_ repository, of course, but  
>> if
>> you're not publishing it anywhere, who cares what your line endings
>> are?  Your line endings only matter when you publish.
>
> You can publish only what you have. So, it must decided before.

Yes!  Someone should decide that the repository should only have LFs  
in it, then flag it as such so that Git can respect that decision.

>> That's why I
>> want a setting that is propagated: so that when you clone a  
>> repository
>> with a LF-only policy, Git knows what to do.
>
> LF-only policy is the only sane policy for any text files. It has  
> nothing
> to do with clone.

Gah!  So close, but so far.  What about a repository which already has  
CRLFs in it?  You're telling me that it shouldn't have CRLFs in it,  
which is almost entirely unhelpful.

>> I think the real penalty is that with autocrlf enabled, Git no longer
>> stores exactly what I committed.
>
> And what is wrong with that? In any case, Git deltifies and compress
> your files. Why does it not bother you? So, what matters here is
> whether you are able to get back exactly what you put or not. That's
> why Git has this safecrlf option! Of course, it is guarantee to get
> exactly the same back only if you have the same autocrlf setting,
> but if you have different autocrlf settings, it means that you want
> *different* representation of text files!

My point is that if I commit two files with different line endings, I  
want Git to store them that way.  Works on Linux by default, doesn't  
work on Windows by default.  Deltification and compression are  
irrelevant since they don't change the content.  autocrlf does.

You only want to normalise text files if you (a) have undisciplined  
committers, and (b) need to share the repository between Windows and  
Unix.  Right now, the default configuration of Git assumes that all  
Windows users are undisciplined, and that they will always be sharing  
repositories with Unix users.

>>> Of course, those who are very careful and have good editors can set
>>> autocrlf=false even on Windows...
>>
>> Right, or who know that the repository they're using will only be
>> shared with other Windows users.
>
> WRONG! Using storing CRLF in text files is a completely idiotic idea.
> Those who do so asked for troubles, so they should not complain!

But some text files need CRLFs, the autocrlf mechanism even caters for  
them.  And it's not like the extra CR actually hurts anything; I use  
Git on Windows with "autocrlf=false", and I've yet to see any problems  
with it.  The reason I've changed from the default "autocrlf=true" is  
that I've seen plenty of problems with that.

>> Internally, Git doesn't really care, does it?
>
> It DOES!!! Such things like merges and diffs and many other every day
> commands do care about end-of-lines and the ONLY end-of-line they
> recognize is '\n'. In fact, there are more than 400 places in Git  
> where
> '\n' is used. Of course, not all of them may be qualified as internals
> but many of them do.

Buh?  That's crazy talk.  The only time CRLFs cause trouble with git  
is when autocrlf is enabled.  Try this:


git init
echo -n "testing\r\ntesting 2\r\n" > testing
git add testing
git commit
git config --bool --add core.autocrlf true
touch testing
git diff


File with CRLF in the repository, file with CRLF in the working  
directory, so where does the difference come from?  The same thing  
happens with "autocrlf=input", and the problem is obviously a lot  
bigger when you're trying to merge.

Now I've shown you when autocrlf is a problem, you show me a case  
where CRLFs cause trouble with autocrlf=false.

> So, as long as Git internally consider only LF as the end-of-line.

Bah.  To the extent it "cares", Git just sees CR as an extra white  
space character before the end-of-line LF, and you know it.

>> Heh.  Where I work, we hacked CVS for Windows to get away from that
>> behaviour :)
>
> Maybe, you used Cygwin version of CVS, which had LF, but those Windows
> versions of CVS I used produced CRLF on Windows.

No, we just compiled our own, precisely because we didn't want CVS to  
mangle our data, even on Windows.  The CVS repositories are used by  
Linux and Windows clients, and some developers had their working  
directories on cross-platform network shares, so eol conversion just  
caused unnecessary problems.

>> Well, what I want is to be able to say "it is necessary to do eol
>> conversion in this repository",
>
> You always can do that in _your_ repository:
> git config core.autocrlf ...

I can, except that (a) any CRLFs already in the repository will cause  
trouble, and (b) everyone who clones from me has to make the same  
setting manually, or they won't follow the same convention.

>> allowing the default to be "don't do
>> eol conversion" on Windows, too.  For a setting like that to be
>> useful, it has to be propagated when the repository is cloned.
>
> Why? Other people may have other preferences.

Yes.  Other people may have other preferences which will lead them to  
push CRLFs to my repository.  That is why the setting should be  
propagated.

>> I want it to be versioned because you might want to change it without
>> messing with the content that's already in the repository.  This is
>> actually my main motivation, since I have lots of CRLF-infused CVS
>> history to deal with.
>
> Well, you can try to use .gitattributes, but I believe it would be far
> more reasonable to clean this mess with line-ending than finding a
> way to continue insanity with different ending in different text  
> files.

But I'd rather not change the historical data in every repository.  So  
yes, I think it's a good idea to clean up the line endings, but I  
would like to do it in a way that allows me to check out an old  
version with mixed line endings while still ensuring that newly  
committed files are normalised.

>> It should apply on Linux as well as Windows because there is always
>> the chance that a user will manage to commit a CRLF on Linux (a
>> colleague of mine once complained that CVS on Linux doesn't do eol
>> conversion; he edited files on Windows, but checked them in on
>> Linux).
>
> For those who need it, Git provides autocrlf=input. But the question
> here is what should be default. I don't think that those who copies
> files directly from Windows to Linux are majority. OTOH, I don't
> mind autocrlf=input as default. If it can be turned off, it is fine
> with me :)

Yeah, if autocrlf is set, it probably makes sense to use it, i.e.  
anyone who doesn't want line ending conversion at all can set  
"autocrlf=false".

>> It would probably be okay to have a setting that turns all
>> conversion off, but wouldn't that be kind of rude?
>
> How so?

You're explicitly saying "I don't care if I commit CRLFs into a  
repository that requests only LFs".  So you're probably only saying it  
because you know you won't commit CRLFs anyway, but since the cost is  
so low, why not make sure?
-- 
Eyvind Bernhardsen

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-08-02 12:51                                   ` Eyvind Bernhardsen
@ 2008-08-03 16:21                                     ` Dmitry Potapov
  0 siblings, 0 replies; 58+ messages in thread
From: Dmitry Potapov @ 2008-08-03 16:21 UTC (permalink / raw
  To: Eyvind Bernhardsen
  Cc: Johannes Schindelin, Avery Pennarun, Joshua Jensen,
	Junio C Hamano, git

On Sat, Aug 02, 2008 at 02:51:28PM +0200, Eyvind Bernhardsen wrote:
> On 30. juli. 2008, at 23.45, Dmitry Potapov wrote:
> 
> >On Tue, Jul 29, 2008 at 11:17:23PM +0200, Eyvind Bernhardsen wrote:
> >>
> >>My point was that autocrlf penalises Windows users just as much as it
> >>does Linux users, so why should it be turned on by default on
> >>Windows?
> >
> >Because it does make sense on Windows (I mentioned some reasons for  
> >that
> >in my previous email), and it does NOT make any sense on Linux unless
> >you copy files from Windows.
> 
> It doesn't make any more sense on Windows than it does on Linux,  
> unless the user is careless about CRLFs and the repository will be  
> shared with Linux users.

It does, because LF is the only EOL that Git recognize internally,
and if CRLF sometimes works, but it is an artifact. With the same
success you can rely on a value of uninitialized variable or any
other kind of undefined behavior.

> If you have to work with undisciplined  
> Windows users but don't actually use Windows yourself, it might _seem_  
> to make sense, but that is an illusion.

Illusion? Is the most logical objection that you capable to produce
to all my explanation? I think it is very convenient approach, but
shall we follow it a bit further and declare all your problem with
line endings just an illusion...

> >You can compile it on Linux and Windows (using Microsoft Visual C).
> >Now, if you run it on Linux, it will print 13 symbols (exactly
> >as many as there are symbols in the printf string) while on Windows
> >you will get 14 bytes. It means that printf and as many other C
> >function on Windows does conversion and penalize you already!
> 
> But I don't use printf to store files.  What does printf have to do  
> with anything?

You missed the point entirely, did you? LF is the standard EOL in C (as
well as in many other languages that is used by Git developers). Thus,
naturally Git internally considers only LF as the only true EOL, and if
it happened that you store files with different EOLs, they should be
converted during import to the Git repository. And, yes, C programs
do use printf and other C library functions to write text files...
Why do you think Microsoft C library does convert CRLF on reading and
writing for text files? Or that is another illusion?

> >>You can have anything you like in _your_ repository, of course, but  
> >>if
> >>you're not publishing it anywhere, who cares what your line endings
> >>are?  Your line endings only matter when you publish.
> >
> >You can publish only what you have. So, it must decided before.
> 
> Yes!  Someone should decide that the repository should only have LFs  
> in it, then flag it as such so that Git can respect that decision.

What is your point?! Git *does* respect your autocrlf choice. But it
also respects *other* people choice to have what they want in *their*
repositories.

> 
> >>That's why I
> >>want a setting that is propagated: so that when you clone a  
> >>repository
> >>with a LF-only policy, Git knows what to do.
> >
> >LF-only policy is the only sane policy for any text files. It has  
> >nothing
> >to do with clone.
> 
> Gah!  So close, but so far.  What about a repository which already has  
> CRLFs in it?  You're telling me that it shouldn't have CRLFs in it,  
> which is almost entirely unhelpful.

What I am saying is that having CRLF for text files is a mistake, and it
should be dealt as that. Perhaps, something can be added to help people
to deal with their past mistakes, but by no means, we should encourage
to produce more crap... So far, I have not heard any rational suggestion,
let alone seeing your code that implements it.

> 
> My point is that if I commit two files with different line endings, I  
> want Git to store them that way.

If you want something insane, you can have that, but the insane mode is
not for normal users. So, it should not be default.

> Works on Linux by default, doesn't  
> work on Windows by default.

So what? Sane people do NOT store text files with different endings,
especially on Linux...

> >>>Of course, those who are very careful and have good editors can set
> >>>autocrlf=false even on Windows...
> >>
> >>Right, or who know that the repository they're using will only be
> >>shared with other Windows users.
> >
> >WRONG! Using storing CRLF in text files is a completely idiotic idea.
> >Those who do so asked for troubles, so they should not complain!
> 
> But some text files need CRLFs, the autocrlf mechanism even caters for  
> them.  And it's not like the extra CR actually hurts anything; I use  
> Git on Windows with "autocrlf=false", and I've yet to see any problems  
> with it.  The reason I've changed from the default "autocrlf=true" is  
> that I've seen plenty of problems with that.

What problems? Isn't true that these problems were caused in the first
place by having autocrlf=false? If you prefer CRLF for your text files
what exactly is your problem with having crlf=true? Your old history?
But why should other users suffer because you have messed up file
endings in your repository?

Perhaps, it would be nicer if Git saved autocrlf when created a new
repository, so if the global setting is changed, it will not affect
already existing repositories. I believe something like that can be
added. In fact, Junio mentioned about this possibility before, but
you do not seem to care to do _anything_ to improve the situation,
and when I say to do something, I don't mean just empty flame.

> 
> >>Internally, Git doesn't really care, does it?
> >
> >It DOES!!! Such things like merges and diffs and many other every day
> >commands do care about end-of-lines and the ONLY end-of-line they
> >recognize is '\n'. In fact, there are more than 400 places in Git  
> >where
> >'\n' is used. Of course, not all of them may be qualified as internals
> >but many of them do.
> 
> Buh?  That's crazy talk.  The only time CRLFs cause trouble with git  
> is when autocrlf is enabled.  Try this:
> 
> 
> git init
> echo -n "testing\r\ntesting 2\r\n" > testing
> git add testing
> git commit
> git config --bool --add core.autocrlf true
> touch testing
> git diff

That is *exactly* why autocrlf=true should be default on Windows, as you
want to have at the very moment when a new repository is created! Thanks!

And, yes, the file is shown changed. You can have the same effect with
SVN if you did not set svn:eol-style=native when added a file and then
change it to svn:eol-style=native later. The issue here is that you have
wrong ending for text files in the first place.

> >So, as long as Git internally consider only LF as the end-of-line.
> 
> Bah.  To the extent it "cares", Git just sees CR as an extra white  
> space character before the end-of-line LF, and you know it.

I know, and I also know that I don't like crap in my repository and
I don't like any tool that encourages users to put every crap in it.
And, no, I don't think that that check-in-all-crap-I-have should be
the default mode for Git. If you like that mode, you probably can set
autocrlf=false and also create some nice alias for git commit -a, so
you check-in all crap crap at once, but I can't care less about all
problems that you'll have after that.

> 
> >>Heh.  Where I work, we hacked CVS for Windows to get away from that
> >>behaviour :)
> >
> >Maybe, you used Cygwin version of CVS, which had LF, but those Windows
> >versions of CVS I used produced CRLF on Windows.
> 
> No, we just compiled our own, precisely because we didn't want CVS to  
> mangle our data, even on Windows.

*Shrug* All I can say I don't remember any problem like this with CVS,
and it did conversion automatically for us. Unfortunately, it converted
binary files too, so new users had to be taught about using 'cvs -kb'
or their binary files would be mutilated. So, I am pretty sure about
this. Perhaps, the difference was that we has always had our CVS server
on Linux, but we have not recompiled anything.

> The CVS repositories are used by  
> Linux and Windows clients, and some developers had their working  
> directories on cross-platform network shares, so eol conversion just  
> caused unnecessary problems.

Sharing working directories is a real issue. I hit it myself once with
CVS. Fortunately, git provides a good solution: autocrlf=input.

> 
> >>Well, what I want is to be able to say "it is necessary to do eol
> >>conversion in this repository",
> >
> >You always can do that in _your_ repository:
> >git config core.autocrlf ...
> 
> I can, except that (a) any CRLFs already in the repository will cause  
> trouble,

Sure, they will. It is insane ending to store in your repo. It was
exactly my point all way along, but somehow you missed that...

> and (b) everyone who clones from me has to make the same  
> setting manually, or they won't follow the same convention.

If people follow sane settings, they do not have to change anything
manually, and they can edit file using *their* preferable EOLs. However,
if you publish a repo with CRLF for text files, *you* create problems
for other people, and they have every right to blame you for that.

> 
> Yes.  Other people may have other preferences which will lead them to  
> push CRLFs to my repository.  That is why the setting should be  
> propagated.

Add the update hook, which will prevent pushes with the wrong ending.

> 
> >>I want it to be versioned because you might want to change it without
> >>messing with the content that's already in the repository.  This is
> >>actually my main motivation, since I have lots of CRLF-infused CVS
> >>history to deal with.
> >
> >Well, you can try to use .gitattributes, but I believe it would be far
> >more reasonable to clean this mess with line-ending than finding a
> >way to continue insanity with different ending in different text  
> >files.
> 
> But I'd rather not change the historical data in every repository.  So  
> yes, I think it's a good idea to clean up the line endings, but I  
> would like to do it in a way that allows me to check out an old  
> version with mixed line endings while still ensuring that newly  
> committed files are normalised.

Well, I don't see much historical value in seeing what files had
what ending and at what time. So, usually, it is a good idea to
clean up them when you convert your repo into Git. But as I said
above, you can try to use .gitattributes.

Anyway, autocrlf=true makes much more sense as default, because
newly created repositories and files in them will have the correct
ending.

> >>It would probably be okay to have a setting that turns all
> >>conversion off, but wouldn't that be kind of rude?
> >
> >How so?
> 
> You're explicitly saying "I don't care if I commit CRLFs into a  
> repository that requests only LFs".

Naw! autocrlf=false means you take responsibility for EOL in files that
you check in.  In fact, no version control system that can verify that
all what you check-in makes sense. So it can only _facilitate_ you in
creating good patches, but it cannot force you to do things in the right
way. Thus, any option that helps people to do their job is good, but you
cannot foresee all use cases, so it should be always choice to turn this
automatically thing off when it is necessary.

> So you're probably only saying it  
> because you know you won't commit CRLFs anyway, but since the cost is  
> so low, why not make sure?

Who are you to tell what cost is high and low? Why do you think that
you should dictate your idea of cost to *everyone*?


Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-07-30 19:25                                     ` Steffen Prohaska
  2008-07-30 21:07                                       ` Avery Pennarun
@ 2008-08-03 16:54                                       ` Tarmigan
  2008-08-03 17:33                                         ` Dmitry Potapov
  1 sibling, 1 reply; 58+ messages in thread
From: Tarmigan @ 2008-08-03 16:54 UTC (permalink / raw
  To: Steffen Prohaska
  Cc: Avery Pennarun, Eyvind Bernhardsen, Dmitry Potapov,
	Johannes Schindelin, Joshua Jensen, Junio C Hamano, git

On Wed, Jul 30, 2008 at 12:25 PM, Steffen Prohaska <prohaska@zib.de> wrote:
>
> On Jul 30, 2008, at 8:33 PM, Avery Pennarun wrote:
>
>> On 7/30/08, Steffen Prohaska <prohaska@zib.de> wrote:
>>
>>> What matters is that git gives you exactly back what you committed.  It
>>> does so with core.autocrlf=true, unless you check out with a different
>>> setting for autocrlf.
>>
>> You can tell that this statement isn't quite true because if you have
>> a file with mixed LF and CRLF line endings, which I do (thanks,
>> Windows!) then CRLF->LF conversion is not a reversible operation.
>> Interestingly LF->CRLF still is (because an LF->CRLF'd file will never
>> have a bare LF, and on such a subset of files, CRLF->LF is
>> reversible).
>>
>> Also note that core.autocrlf=input is *definitely* not a perfectly
>> reversible operation.
>
> You are absolutely right.  The files your describe are modified by git,
> because they are "invalid" text files, as git defines them.

For all I care, git can consider the files as binary, but by *default*
I should get back the same as I put in.

[For the rest of my rant, I am referring to the default configuration
of autocrlf on windows]

> For git's
> autocrlf mechanism to work, a text file is only allowed to have a
> *single* type of line endings.

Git's autocrlf mechanism can be a nice feature.  But by default it
should not be on (even on windows) because it can modify screw up my
files.

To be clear: when I say "git checkout" I want to get EXACTLY the same
bits as went in when I did "git add" and "git commit".  Any other
default is broken.

> Otherwise it is broken and git tries to
> help you fixing it.

My files were NOT broken when I put them into git.  I committed them
known good state.  If msysgit changes them by *default* , then msysgit
is broken.

IF you are working on a cross platform project, then setting autocrlf
on windows might be nice.  But having it on by *default* is broken.

Thanks,
Tarmigan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-08-03 16:54                                       ` Tarmigan
@ 2008-08-03 17:33                                         ` Dmitry Potapov
  2008-08-03 18:54                                           ` Tarmigan
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Potapov @ 2008-08-03 17:33 UTC (permalink / raw
  To: Tarmigan
  Cc: Steffen Prohaska, Avery Pennarun, Eyvind Bernhardsen,
	Johannes Schindelin, Joshua Jensen, Junio C Hamano, git

On Sun, Aug 03, 2008 at 09:54:42AM -0700, Tarmigan wrote:
> 
> For all I care, git can consider the files as binary, but by *default*
> I should get back the same as I put in.

Sorry, but Git is a source control system, and by definition the
main focus is on *text* files. Storying binary files is *exception*
not the rule. And the default settings should respect exactly that
fact.

> > For git's
> > autocrlf mechanism to work, a text file is only allowed to have a
> > *single* type of line endings.
> 
> Git's autocrlf mechanism can be a nice feature.  But by default it
> should not be on (even on windows) because it can modify screw up my
> files.
> 
> To be clear: when I say "git checkout" I want to get EXACTLY the same
> bits as went in when I did "git add" and "git commit".  Any other
> default is broken.

You will get exactly the same unless you change your autocrlf settings.
There are very rare situation where automatic heuristic can be wrong,
but even then you will be warned about that the file you are going to
checkout different. See core.safecrlf.

> 
> > Otherwise it is broken and git tries to
> > help you fixing it.
> 
> My files were NOT broken when I put them into git.  I committed them
> known good state.  If msysgit changes them by *default* , then msysgit
> is broken.

Text files committed with CRLF inside of your repository is BROKEN by
definition! So you had BROKEN state in the first place.

Of course, changing the global autocrlf setting should not change
autocrlf settings in already existing repositories, as it currently
does.  Care to provide patch?


Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-08-03 17:33                                         ` Dmitry Potapov
@ 2008-08-03 18:54                                           ` Tarmigan
  2008-08-04 16:06                                             ` Dmitry Potapov
  0 siblings, 1 reply; 58+ messages in thread
From: Tarmigan @ 2008-08-03 18:54 UTC (permalink / raw
  To: Dmitry Potapov
  Cc: Steffen Prohaska, Avery Pennarun, Eyvind Bernhardsen,
	Johannes Schindelin, Joshua Jensen, Junio C Hamano, git

On Sun, Aug 3, 2008 at 10:33 AM, Dmitry Potapov <dpotapov@gmail.com> wrote:
> On Sun, Aug 03, 2008 at 09:54:42AM -0700, Tarmigan wrote:
>>
>> For all I care, git can consider the files as binary, but by *default*
>> I should get back the same as I put in.
>
> Sorry, but Git is a source control system,

I think this is the heart of the disagreement.  What I love about git
is that git trusts me, the user, and it trusts my files.  It doesn't
change the encoding of my filenames by default.  It doesn't do keyword
expansion by default.  It doesn't change the case of filenames by
default.

If git is not willing to change the encoding/case of file*names* by
default, how is it acceptable to change the *content* of the files
themselves?

Yes, some systems that define themselves as "source control
management" systems make these changes for you.  But that sometimes
leads to frustrating and hard to understand (to the user) behavior.
Git has a very simple and transparent mental model, which is one of
it's greatest strengths.  In my humble opinion, autocrlf breaks this
simple "content tracker" model.

Breaking this mental model bothers me much more than the practical
issues involved with autocrlf, so I'm just going to drop that line of
argument.

Thanks,
Tarmigan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] Respect crlf attribute even if core.autocrlf has not been set
  2008-08-03 18:54                                           ` Tarmigan
@ 2008-08-04 16:06                                             ` Dmitry Potapov
  0 siblings, 0 replies; 58+ messages in thread
From: Dmitry Potapov @ 2008-08-04 16:06 UTC (permalink / raw
  To: Tarmigan
  Cc: Steffen Prohaska, Avery Pennarun, Eyvind Bernhardsen,
	Johannes Schindelin, Joshua Jensen, Junio C Hamano, git

On Sun, Aug 03, 2008 at 11:54:34AM -0700, Tarmigan wrote:
> On Sun, Aug 3, 2008 at 10:33 AM, Dmitry Potapov <dpotapov@gmail.com> wrote:
> > On Sun, Aug 03, 2008 at 09:54:42AM -0700, Tarmigan wrote:
> >>
> >> For all I care, git can consider the files as binary, but by *default*
> >> I should get back the same as I put in.
> >
> > Sorry, but Git is a source control system,
> 
> I think this is the heart of the disagreement.

I guess so... Because for me it is very important that most of files I
store are text files. In fact, such operations as diff and merge would
not make much sense for binary files, would they? IOW, Git is a revision
control system, not a versioning file system.

> What I love about git
> is that git trusts me, the user, and it trusts my files.

Sure, git trusts you. You can always turn off some features if you don't
like them; but the issue having the right defaults. autocrlf=true is the
right default if you want to have CRLF in your work directory. And as
long as text heuristic is right (and it works pretty damn good), you get
exactly what you put in it. In very very rare cases where the heuristic
is wrong, it will warn you about that you are not going to have back what
you put. So, you can tell Git explicitly that you want this file stored
as binary. But this situation is very unlikely unless you store in it
something like svndump files, but storying such files in not the main
purpose of Git.

So, I don't think that this CRLF conversion is a real issue, except that
the fact that changing the global autocrlf value should not have changed
autocrlf in already existing repositories. Because autocrlf is not just
a preference, but also determines in what format your files in the working
directory are stored. So, I believe it should be corrected. But even in
this case, you do not really lose anything.

> It doesn't change the encoding of my filenames by default.

And not by default? Currently, does not support encoding filenames
from your local encoding to UTF-8. And that will be a problem at
some point if you store file names in non UTF-8 encoding. But it is
a separate issue connected to i18n support. I don't think it makes
sense to go into it right now as it is completely irrelevant to the
problem we discuss.

What is relevant is that Git *does* change filename representation.
For instance, if you try to add a file with a name 'foo\bar', Git
will actually add 'foo/bar'. You see, on check-in, Git converts the
directory separator from its Windows form ('\') to the format it uses
internally ('/').

So, it is logical to do the same text files, because text files are
sequences of lines separated by line-separator, which is CRLF on
Windows, but its internal representation in Git is LF.

> It doesn't do keyword expansion by default.

Because keyword expansion is almost always meaningless thing to do
in your working directory. It just makes things slower and you do
not win *anything*. Arguably, there are some cases when you may
want some expansion during export your sources to archive, but even
that is very uncommon.

> It doesn't change the case of filenames by default.

Change case? Would it not be a completely insane thing to do? People put
some meaning in what registry of letters when wrote names, why would you
want Git (or any source control system) to mess up with that?

> If git is not willing to change the encoding/case of file*names* by
> default, how is it acceptable to change the *content* of the files
> themselves?

It does not change the content, it changes the EOL marker from its
external to internal representation. It does the same thing as what C
library on Windows does when you read or write files in the text mode.
This should be completely transparent to users as long as autocrlf is
not changed.


Dmitry

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2008-08-04 16:08 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-22 21:56 [PATCH] Respect crlf attribute even if core.autocrlf has not been set Johannes Schindelin
2008-07-22 23:11 ` Dmitry Potapov
2008-07-22 23:23   ` Johannes Schindelin
2008-07-23  0:14     ` Dmitry Potapov
2008-07-23  0:12 ` Junio C Hamano
2008-07-23  1:10   ` Johannes Schindelin
2008-07-23  1:31     ` [PATCH] Respect crlf attribute in "git add" " Johannes Schindelin
2008-07-23  5:49       ` Steffen Prohaska
2008-07-23  9:02         ` Johannes Schindelin
2008-07-23 11:40         ` Dmitry Potapov
2008-07-24  6:06           ` Steffen Prohaska
2008-07-24 12:39             ` Johannes Schindelin
2008-07-24 17:05               ` Dmitry Potapov
2008-07-24 14:09             ` Dmitry Potapov
2008-07-24 14:38               ` Johannes Schindelin
2008-07-24 14:52                 ` Steffen Prohaska
2008-07-24 16:44                   ` Avery Pennarun
2008-07-24 16:45                   ` Johannes Schindelin
2008-07-24 20:44                     ` Robin Rosenberg
2008-07-24 23:58                       ` Johannes Schindelin
2008-07-23 17:07     ` [PATCH] Respect crlf attribute " Junio C Hamano
2008-07-23 17:22       ` Johannes Schindelin
2008-07-23 18:04         ` Joshua Jensen
2008-07-23 18:33           ` Avery Pennarun
2008-07-23 18:57             ` Johannes Schindelin
2008-07-23 19:20               ` Eyvind Bernhardsen
2008-07-23 19:44                 ` Johannes Schindelin
2008-07-24 21:30                   ` Eyvind Bernhardsen
2008-07-25  0:01                     ` Johannes Schindelin
2008-07-25 12:30                       ` Eyvind Bernhardsen
2008-07-25 14:01                         ` Dmitry Potapov
2008-07-25 21:05                           ` Eyvind Bernhardsen
2008-07-26  2:09                             ` Johannes Schindelin
2008-07-29 19:11                               ` Eyvind Bernhardsen
2008-07-29 13:46                             ` Dmitry Potapov
2008-07-29 21:17                               ` Eyvind Bernhardsen
2008-07-30  5:35                                 ` Steffen Prohaska
2008-07-30 18:33                                   ` Avery Pennarun
2008-07-30 19:25                                     ` Steffen Prohaska
2008-07-30 21:07                                       ` Avery Pennarun
2008-07-30 22:02                                         ` Dmitry Potapov
2008-07-30 22:14                                           ` Avery Pennarun
2008-08-03 16:54                                       ` Tarmigan
2008-08-03 17:33                                         ` Dmitry Potapov
2008-08-03 18:54                                           ` Tarmigan
2008-08-04 16:06                                             ` Dmitry Potapov
2008-07-30 21:45                                 ` Dmitry Potapov
2008-08-02 12:51                                   ` Eyvind Bernhardsen
2008-08-03 16:21                                     ` Dmitry Potapov
2008-07-23 19:22           ` Robin Rosenberg
2008-07-23 19:35             ` Junio C Hamano
2008-07-23 19:41               ` Johannes Schindelin
2008-07-23 19:33           ` Dmitry Potapov
2008-07-23 19:23         ` Junio C Hamano
2008-07-23 20:07           ` Johannes Schindelin
2008-07-24 16:53       ` Dmitry Potapov
2008-07-24 17:14         ` Johannes Schindelin
2008-07-24 17:55           ` Dmitry Potapov

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).