* Changing encoding of a file : What should happen to CRLF in file ?
@ 2017-11-14 12:31 Ashish Negi
2017-11-14 15:20 ` Torsten Bögershausen
` (4 more replies)
0 siblings, 5 replies; 23+ messages in thread
From: Ashish Negi @ 2017-11-14 12:31 UTC (permalink / raw)
To: git
Hello
I have a cross platform project. I have a utf-16 file in it.
I changed its encoding to urf-8 and committed. When i pulled the file
in Linux, it shows that file is modified. This means that the commit
which changed the encoding does not convert crlf to lf, when new
format is text based (utf-8).
Steps to reproduce:
In windows :
Change encoding of file from utf-16 to utf-8.
Commit the change.
In linux:
Pull your branch.
You will see the issue of file being shown as modified even though
you have not used it.
If i change the file encoding in linux and commit. Then if i do git
pull in windows, i don't see the issue.
In linux, during committing i get warning : warning: CRLF will be
replaced by LF in …file_name..
Here are my configuration :
> git config --global --get core.autocrlf
false
> git config --get core.autocrlf
false
> cat E:\work\WindowsFabric\.gitattributes
# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto
*.vcxproj eol=crlf
*.sh eol=lf
# Denote all files that are truly binary and should not be modified.
*.exe binary
*.dll binary
*.pdb binary
*.ico binary
*.png binary
*.jpg binary
> git --version
git version 2.14.2.windows.2
I played around with core.autocrlf values like true and native, but
that didn't help.
The behavior is inconsistent across platforms, and windows version is
giving me problem.
Can someone suggest right settings for this ? Or is this a bug in Git.
Thanks
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-14 12:31 Changing encoding of a file : What should happen to CRLF in file ? Ashish Negi
@ 2017-11-14 15:20 ` Torsten Bögershausen
2017-11-14 16:13 ` Ashish Negi
2017-11-24 16:14 ` [PATCH 1/1] convert: tighten the safe autocrlf handling tboegi
` (3 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: Torsten Bögershausen @ 2017-11-14 15:20 UTC (permalink / raw)
To: Ashish Negi, git
On 2017-11-14 13:31, Ashish Negi wrote:
> Hello
>
> I have a cross platform project. I have a utf-16 file in it.
> I changed its encoding to urf-8 and committed. When i pulled the file
> in Linux, it shows that file is modified. This means that the commit
> which changed the encoding does not convert crlf to lf, when new
> format is text based (utf-8).
>
> Steps to reproduce:
>
> In windows :
>
> Change encoding of file from utf-16 to utf-8.
> Commit the change.
>
> In linux:
>
> Pull your branch.
> You will see the issue of file being shown as modified even though
> you have not used it.
>
>
> If i change the file encoding in linux and commit. Then if i do git
> pull in windows, i don't see the issue.
> In linux, during committing i get warning : warning: CRLF will be
> replaced by LF in …file_name..
>
> Here are my configuration :
>
>
>> git config --global --get core.autocrlf
>
> false
>
>
>> git config --get core.autocrlf
>
> false
>
>
>
>> cat E:\work\WindowsFabric\.gitattributes
>
>
> # Set the default behavior, in case people don't have core.autocrlf set.
>
> * text=auto
> *.vcxproj eol=crlf
> *.sh eol=lf
>
> # Denote all files that are truly binary and should not be modified.
> *.exe binary
> *.dll binary
> *.pdb binary
> *.ico binary
> *.png binary
> *.jpg binary
>
>
>> git --version
> git version 2.14.2.windows.2
>
>
> I played around with core.autocrlf values like true and native, but
> that didn't help.
>
> The behavior is inconsistent across platforms, and windows version is
> giving me problem.
>
> Can someone suggest right settings for this ? Or is this a bug in Git.
>
I don't think it is a bug :-)
What does
git ls-files --eol …file_name
give you under Windows ?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-14 15:20 ` Torsten Bögershausen
@ 2017-11-14 16:13 ` Ashish Negi
2017-11-14 16:15 ` Ashish Negi
2017-11-14 16:45 ` Torsten Bögershausen
0 siblings, 2 replies; 23+ messages in thread
From: Ashish Negi @ 2017-11-14 16:13 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
Running the command gives me :
git ls-files --eol file_name
i/-text w/-text attr/text=auto file_name
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-14 16:13 ` Ashish Negi
@ 2017-11-14 16:15 ` Ashish Negi
2017-11-14 17:09 ` Torsten Bögershausen
2017-11-14 16:45 ` Torsten Bögershausen
1 sibling, 1 reply; 23+ messages in thread
From: Ashish Negi @ 2017-11-14 16:15 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
After changing the encoding of file to utf-8, same command gives:
git ls-files --eol file_name
i/lf w/crlf attr/text=auto ApplicationManifest.xml
On Tue, Nov 14, 2017 at 9:43 PM, Ashish Negi <ashishnegi33@gmail.com> wrote:
> Running the command gives me :
>
> git ls-files --eol file_name
> i/-text w/-text attr/text=auto file_name
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-14 16:13 ` Ashish Negi
2017-11-14 16:15 ` Ashish Negi
@ 2017-11-14 16:45 ` Torsten Bögershausen
1 sibling, 0 replies; 23+ messages in thread
From: Torsten Bögershausen @ 2017-11-14 16:45 UTC (permalink / raw)
To: Ashish Negi; +Cc: git
On 2017-11-14 17:13, Ashish Negi wrote:
> Running the command gives me :
>
> git ls-files --eol file_name
> i/-text w/-text attr/text=auto file_name
>
That is strange to me:
According to that, Git would treat the file as text=auto.
And the content is "not next", so there is no need to convert.
Do you have configured any filters ?
Is this a public repo ?
if not: Is there any chance that you provide a public example,
so that we can have a look ?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-14 16:15 ` Ashish Negi
@ 2017-11-14 17:09 ` Torsten Bögershausen
2017-11-15 8:11 ` Ashish Negi
0 siblings, 1 reply; 23+ messages in thread
From: Torsten Bögershausen @ 2017-11-14 17:09 UTC (permalink / raw)
To: Ashish Negi; +Cc: git
(Back to the beginning)
You have a file ApplicationManifest.xml
It is encoded in UTF-16 (and has CRLF)
You convert it into UTF-8
The file has still CRLF (in the worktree)
Now you add it and make a commit.
Under both Linux and Windows you have "text=auto".
I assume that you have efficiently core.eol=lf under Linux
and core.eol=crlf on Windows.
(That is the default, when you don't change anything)
Now, what happens to the CRLF?
If you commit the file, it will be stored with LF in the index,
on both systems.
On checkout, Windows will convert them into CRLF, but Linux will not.
That why you see
>On linux, during committing i get warning : warning: CRLF will be
>replaced by LF in …file_name..
All in all there is nothing wrong, at least as I see it.
The question remains:
Do you need CRLF in Linux ?
Probably not, but if yes, plase add a line
*.xml text eol=crlf
to your
.gitattributes
Otherwise your .gitconfig looks good to me.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-14 17:09 ` Torsten Bögershausen
@ 2017-11-15 8:11 ` Ashish Negi
2017-11-15 17:12 ` Torsten Bögershausen
0 siblings, 1 reply; 23+ messages in thread
From: Ashish Negi @ 2017-11-15 8:11 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 2145 bytes --]
> If you commit the file, it will be stored with LF in the index,
This is what i believe is not happening.
Lets do this with a public repository and steps which are reproducible.
I have created a repo : https://github.com/ashishnegi/git_encoding
If you clone this repo in linux and run `git status`, you will find
that file is modified.
About repo :
Repo have 2 commits, done on windows machine.
First one check in a utf-16le encoded file which has crlf. crlf will
not be converted to lf in index as git treats it as binary file.
2nd commit changes encoding to utf-8 and commits.
This commit does not change crlf to lf in index, even though new
format is utf-8 which is text based for git. This is the crux of
problem.
I have attached all commands i ran on windows while creating the repo.
I tried to capture all information that i could give.
Please have a look. It might be useful.
Finally, thank you Torsten for giving your time to the problem. Really
appreciate it.
On Tue, Nov 14, 2017 at 10:39 PM, Torsten Bögershausen <tboegi@web.de> wrote:
> (Back to the beginning)
>
> You have a file ApplicationManifest.xml
> It is encoded in UTF-16 (and has CRLF)
>
> You convert it into UTF-8
> The file has still CRLF (in the worktree)
>
> Now you add it and make a commit.
> Under both Linux and Windows you have "text=auto".
>
> I assume that you have efficiently core.eol=lf under Linux
> and core.eol=crlf on Windows.
>
> (That is the default, when you don't change anything)
>
> Now, what happens to the CRLF?
> If you commit the file, it will be stored with LF in the index,
> on both systems.
> On checkout, Windows will convert them into CRLF, but Linux will not.
>
> That why you see
>>On linux, during committing i get warning : warning: CRLF will be
>>replaced by LF in …file_name..
>
> All in all there is nothing wrong, at least as I see it.
>
> The question remains:
> Do you need CRLF in Linux ?
> Probably not, but if yes, plase add a line
>
> *.xml text eol=crlf
>
> to your
> .gitattributes
>
> Otherwise your .gitconfig looks good to me.
>
>
>
>
>
>
[-- Attachment #2: all_git_commands_output.txt --]
[-- Type: text/plain, Size: 2916 bytes --]
git_encoding_repo>git config --global core.safecrlf
true
git_encoding_repo>git config core.autocrlf
false
git_encoding_repo>git config --get core.autocrlf
false
git_encoding_repo>cat .gitattributes
# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto
*.vcxproj eol=crlf
*.sh eol=lf
# Denote all files that are truly binary and should not be modified.
*.exe binary
*.dll binary
*.pdb binary
*.ico binary
*.png binary
*.jpg binary
git_encoding_repo>git status
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
.gitattributes
file_name.txt
nothing added to commit but untracked files present (use "git add" to track)
git_encoding_repo>git ls-files --eol file_name.txt
git_encoding_repo>git add .
git_encoding_repo>git status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: .gitattributes
new file: file_name.txt
git_encoding_repo>git ls-files --eol file_name.txt
i/-text w/-text attr/text=auto file_name.txt
git_encoding_repo>git commit -m "Commit utf-16le encoded file which has crlf."
[master (root-commit) 91fe3bd] Commit utf-16le encoded file which has crlf.
2 files changed, 13 insertions(+)
create mode 100644 .gitattributes
create mode 100644 file_name.txt
## At this time, i changed file encoding to utf-8.
git_encoding_repo>git status
On branch master
nothing to commit, working tree clean
git_encoding_repo>git add -p
Only binary files changed.
git_encoding_repo>git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: file_name.txt
no changes added to commit (use "git add" and/or "git commit -a")
git_encoding_repo>git add file_name.txt
git_encoding_repo>git status
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: file_name.txt
git_encoding_repo>git commit -m "Change encoding of file to utf-8"
[master 179c27b] Change encoding of file to utf-8
1 file changed, 0 insertions(+), 0 deletions(-)
rewrite file_name.txt (100%)
git_encoding_repo>git remote add origin https://github.com/ashishnegi/git_encoding.git
git_encoding_repo>git push -u origin master
Counting objects: 7, done.
Delta compression using up to 12 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 837 bytes | 837.00 KiB/s, done.
Total 7 (delta 0), reused 0 (delta 0)
To https://github.com/ashishnegi/git_encoding.git
* [new branch] master -> master
Branch master set up to track remote branch master from origin.
git_encoding_repo>git ls-files --eol file_name.txt
i/crlf w/crlf attr/text=auto file_name.txt
git_encoding_repo>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-15 8:11 ` Ashish Negi
@ 2017-11-15 17:12 ` Torsten Bögershausen
2017-11-15 19:05 ` Ashish Negi
0 siblings, 1 reply; 23+ messages in thread
From: Torsten Bögershausen @ 2017-11-15 17:12 UTC (permalink / raw)
To: Ashish Negi; +Cc: git
On Wed, Nov 15, 2017 at 01:41:42PM +0530, Ashish Negi wrote:
> > If you commit the file, it will be stored with LF in the index,
> This is what i believe is not happening.
>
> Lets do this with a public repository and steps which are reproducible.
> I have created a repo : https://github.com/ashishnegi/git_encoding
>
> If you clone this repo in linux and run `git status`, you will find
> that file is modified.
This is what what I get:
git ls-files --eol
i/lf w/lf attr/text=auto .gitattributes
i/crlf w/crlf attr/text=auto file_name.txt
(And you get the same, I think)
Newer versions of Git (>=2.10) will -not- treat file_name.txt as changed,
older versions do.
What does "git --version" say on your Linux machine ?
However, if you want to fix it, you want to either end up with
A)
git ls-files --eol
i/lf w/lf attr/text=auto .gitattributes
i/lf w/crlf attr/text=auto file_name.txt
or
B)
git ls-files --eol
i/lf w/lf attr/text=auto .gitattributes
i/crlf w/crlf attr/-text file_name.txt
(The "attr/-text" means "don't change the line endings")
Both A) or B) will work, typically A) is preferred.
You should be able to achive A) :
git rm --cached file_name.txt && git add file_name.txt
rm 'file_name.txt'
warning: CRLF will be replaced by LF in file_name.txt.
The file will have its original line endings in your working directory.
git ls-files --eol
i/lf w/lf attr/text=auto .gitattributes
i/lf w/crlf attr/text=auto file_name.txt
[snip the rest]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-15 17:12 ` Torsten Bögershausen
@ 2017-11-15 19:05 ` Ashish Negi
2017-11-16 16:15 ` Torsten Bögershausen
0 siblings, 1 reply; 23+ messages in thread
From: Ashish Negi @ 2017-11-15 19:05 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
On windows :
> git --version
git version 2.14.2.windows.2
On linux :
> git --version
git version 2.7.4
I would like to understand the solution :
If i understood it correctly : it removes file_name.txt from index, so
git forgets about it.
we then add the file again after changing encoding. This time, git
takes it as utf-8 file and converts crlf to lf when storing it
internally.
Right ?
Thank you for the support.
On Wed, Nov 15, 2017 at 10:42 PM, Torsten Bögershausen <tboegi@web.de> wrote:
> On Wed, Nov 15, 2017 at 01:41:42PM +0530, Ashish Negi wrote:
>> > If you commit the file, it will be stored with LF in the index,
>> This is what i believe is not happening.
>>
>> Lets do this with a public repository and steps which are reproducible.
>> I have created a repo : https://github.com/ashishnegi/git_encoding
>>
>> If you clone this repo in linux and run `git status`, you will find
>> that file is modified.
>
> This is what what I get:
>
> git ls-files --eol
> i/lf w/lf attr/text=auto .gitattributes
> i/crlf w/crlf attr/text=auto file_name.txt
>
> (And you get the same, I think)
> Newer versions of Git (>=2.10) will -not- treat file_name.txt as changed,
> older versions do.
> What does "git --version" say on your Linux machine ?
>
> However, if you want to fix it, you want to either end up with
>
> A)
> git ls-files --eol
> i/lf w/lf attr/text=auto .gitattributes
> i/lf w/crlf attr/text=auto file_name.txt
>
> or
> B)
> git ls-files --eol
> i/lf w/lf attr/text=auto .gitattributes
> i/crlf w/crlf attr/-text file_name.txt
>
> (The "attr/-text" means "don't change the line endings")
>
> Both A) or B) will work, typically A) is preferred.
>
> You should be able to achive A) :
> git rm --cached file_name.txt && git add file_name.txt
> rm 'file_name.txt'
> warning: CRLF will be replaced by LF in file_name.txt.
> The file will have its original line endings in your working directory.
>
> git ls-files --eol
> i/lf w/lf attr/text=auto .gitattributes
> i/lf w/crlf attr/text=auto file_name.txt
>
> [snip the rest]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-15 19:05 ` Ashish Negi
@ 2017-11-16 16:15 ` Torsten Bögershausen
2017-11-23 16:31 ` Ashish Negi
0 siblings, 1 reply; 23+ messages in thread
From: Torsten Bögershausen @ 2017-11-16 16:15 UTC (permalink / raw)
To: Ashish Negi; +Cc: git
On Thu, Nov 16, 2017 at 12:35:33AM +0530, Ashish Negi wrote:
> On windows :
> > git --version
> git version 2.14.2.windows.2
>
> On linux :
> > git --version
> git version 2.7.4
>
> I would like to understand the solution :
> If i understood it correctly : it removes file_name.txt from index, so
> git forgets about it.
> we then add the file again after changing encoding. This time, git
> takes it as utf-8 file and converts crlf to lf when storing it
> internally.
> Right ?
Yes, exactly.
(In a coming release of Git there will be a "git add --renormalize <pathspec>" )
>
> Thank you for the support.
>
Thanks for a clean bug report.
Actually it is a bug, I put it on my to do list
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-16 16:15 ` Torsten Bögershausen
@ 2017-11-23 16:31 ` Ashish Negi
2017-11-23 20:25 ` Torsten Bögershausen
0 siblings, 1 reply; 23+ messages in thread
From: Ashish Negi @ 2017-11-23 16:31 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
Thanks for confirming.
Is it possible to track this via a bug number ?
It will help me to try out the fix when its available.
On Thu, Nov 16, 2017 at 9:45 PM, Torsten Bögershausen <tboegi@web.de> wrote:
> On Thu, Nov 16, 2017 at 12:35:33AM +0530, Ashish Negi wrote:
>> On windows :
>> > git --version
>> git version 2.14.2.windows.2
>>
>> On linux :
>> > git --version
>> git version 2.7.4
>>
>> I would like to understand the solution :
>> If i understood it correctly : it removes file_name.txt from index, so
>> git forgets about it.
>> we then add the file again after changing encoding. This time, git
>> takes it as utf-8 file and converts crlf to lf when storing it
>> internally.
>> Right ?
>
> Yes, exactly.
> (In a coming release of Git there will be a "git add --renormalize <pathspec>" )
>
>>
>> Thank you for the support.
>>
>
> Thanks for a clean bug report.
> Actually it is a bug, I put it on my to do list
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-23 16:31 ` Ashish Negi
@ 2017-11-23 20:25 ` Torsten Bögershausen
2017-11-24 6:37 ` Ashish Negi
0 siblings, 1 reply; 23+ messages in thread
From: Torsten Bögershausen @ 2017-11-23 20:25 UTC (permalink / raw)
To: Ashish Negi; +Cc: git
On Thu, Nov 23, 2017 at 10:01:40PM +0530, Ashish Negi wrote:
> Thanks for confirming.
> Is it possible to track this via a bug number ?
> It will help me to try out the fix when its available.
>
No worry, the fix is nearly complete and will come out in a couple of days.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Changing encoding of a file : What should happen to CRLF in file ?
2017-11-23 20:25 ` Torsten Bögershausen
@ 2017-11-24 6:37 ` Ashish Negi
0 siblings, 0 replies; 23+ messages in thread
From: Ashish Negi @ 2017-11-24 6:37 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
Great work !!!
Thanks
On Fri, Nov 24, 2017 at 1:55 AM, Torsten Bögershausen <tboegi@web.de> wrote:
> On Thu, Nov 23, 2017 at 10:01:40PM +0530, Ashish Negi wrote:
>> Thanks for confirming.
>> Is it possible to track this via a bug number ?
>> It will help me to try out the fix when its available.
>>
>
> No worry, the fix is nearly complete and will come out in a couple of days.
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/1] convert: tighten the safe autocrlf handling
2017-11-14 12:31 Changing encoding of a file : What should happen to CRLF in file ? Ashish Negi
2017-11-14 15:20 ` Torsten Bögershausen
@ 2017-11-24 16:14 ` tboegi
2017-11-24 17:24 ` Eric Sunshine
2017-11-25 3:16 ` Junio C Hamano
2017-11-26 12:20 ` [PATCH v2 " tboegi
` (2 subsequent siblings)
4 siblings, 2 replies; 23+ messages in thread
From: tboegi @ 2017-11-24 16:14 UTC (permalink / raw)
To: git, ashishnegi33; +Cc: Torsten Bögershausen
From: Torsten Bögershausen <tboegi@web.de>
When a text file had been commited with CRLF and the file is commited
again, the CRLF are kept if .gitattributs has "text=auto".
This is done by analyzing the content of the blob stored in the index:
If a '\r' is found, Git assumes that the blob was commited with CRLF.
The simple search for a '\r' does not always work as expected:
A file is encoded in UTF-16 with CRLF and commited. Git treats it as binary.
Now the content is converted into UTF-8. At the next commit Git treats the
file as text, the CRLF should be converted into LF, but isn't.
Solution:
Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found,
0 is returned directly, this is the most common case.
If a '\r' is found, the content is analyzed more deeply.
Reported-By: Ashish Negi <ashishnegi33@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
convert.c | 19 +++++++++----
t/t0027-auto-crlf.sh | 76 ++++++++++++++++++++++++++++++++++++++++++++++++----
2 files changed, 85 insertions(+), 10 deletions(-)
diff --git a/convert.c b/convert.c
index 20d7ab67bd..63ef799239 100644
--- a/convert.c
+++ b/convert.c
@@ -220,18 +220,27 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
}
}
-static int has_cr_in_index(const struct index_state *istate, const char *path)
+static int has_crlf_in_index(const struct index_state *istate, const char *path)
{
unsigned long sz;
void *data;
- int has_cr;
+ const char *crp;
+ int has_crlf = 0;
data = read_blob_data_from_index(istate, path, &sz);
if (!data)
return 0;
- has_cr = memchr(data, '\r', sz) != NULL;
+
+ crp = memchr(data, '\r', sz);
+ if (crp && (crp[1] == '\n')) {
+ unsigned int ret_stats;
+ ret_stats = gather_convert_stats(data, sz);
+ if (!(ret_stats & CONVERT_STAT_BITS_BIN) &&
+ (ret_stats & CONVERT_STAT_BITS_TXT_CRLF))
+ has_crlf = 1;
+ }
free(data);
- return has_cr;
+ return has_crlf;
}
static int will_convert_lf_to_crlf(size_t len, struct text_stat *stats,
@@ -290,7 +299,7 @@ static int crlf_to_git(const struct index_state *istate,
* cherry-pick.
*/
if ((checksafe != SAFE_CRLF_RENORMALIZE) &&
- has_cr_in_index(istate, path))
+ has_crlf_in_index(istate, path))
convert_crlf_into_lf = 0;
}
if ((checksafe == SAFE_CRLF_WARN ||
diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index 68108d956a..0af35cfb1f 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -43,19 +43,31 @@ create_gitattributes () {
} >.gitattributes
}
-create_NNO_files () {
+# Create 2 sets of files:
+# The NNO files are "Not NOrmalized in the repo. We use CRLF_mix_LF and store
+# it under different names for the different test cases, see ${pfx}
+# Depending on .gitattributes they are normalized at the next commit (or not)
+# The MIX files have different contents in the repo.
+# Depending on its contents, the "new safer autocrlf" may kick in.
+create_NNO_MIX_files () {
for crlf in false true input
do
for attr in "" auto text -text
do
for aeol in "" lf crlf
do
- pfx=NNO_attr_${attr}_aeol_${aeol}_${crlf}
+ pfx=NNO_attr_${attr}_aeol_${aeol}_${crlf} &&
cp CRLF_mix_LF ${pfx}_LF.txt &&
cp CRLF_mix_LF ${pfx}_CRLF.txt &&
cp CRLF_mix_LF ${pfx}_CRLF_mix_LF.txt &&
cp CRLF_mix_LF ${pfx}_LF_mix_CR.txt &&
- cp CRLF_mix_LF ${pfx}_CRLF_nul.txt
+ cp CRLF_mix_LF ${pfx}_CRLF_nul.txt &&
+ pfx=MIX_attr_${attr}_aeol_${aeol}_${crlf} &&
+ cp LF ${pfx}_LF.txt &&
+ cp CRLF ${pfx}_CRLF.txt &&
+ cp CRLF_mix_LF ${pfx}_CRLF_mix_LF.txt &&
+ cp LF_mix_CR ${pfx}_LF_mix_CR.txt &&
+ cp CRLF_nul ${pfx}_CRLF_nul.txt
done
done
done
@@ -136,6 +148,49 @@ commit_chk_wrnNNO () {
'
}
+# Commit a file with mixed line endings on top of different files
+# in the index. Check for warnings
+commit_MIX_chkwrn () {
+ attr=$1 ; shift
+ aeol=$1 ; shift
+ crlf=$1 ; shift
+ lfwarn=$1 ; shift
+ crlfwarn=$1 ; shift
+ lfmixcrlf=$1 ; shift
+ lfmixcr=$1 ; shift
+ crlfnul=$1 ; shift
+ pfx=MIX_attr_${attr}_aeol_${aeol}_${crlf}
+ #Commit file with CLRF_mix_LF on top of existing file
+ create_gitattributes "$attr" $aeol &&
+ for f in LF CRLF CRLF_mix_LF LF_mix_CR CRLF_nul
+ do
+ fname=${pfx}_$f.txt &&
+ cp CRLF_mix_LF $fname &&
+ printf Z >>"$fname" &&
+ git -c core.autocrlf=$crlf add $fname 2>"${pfx}_$f.err"
+ done
+
+ test_expect_success "commit file with mixed EOL crlf=$crlf attr=$attr LF" '
+ check_warning "$lfwarn" ${pfx}_LF.err
+ '
+ test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF" '
+ check_warning "$crlfwarn" ${pfx}_CRLF.err
+ '
+
+ test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_mix_LF" '
+ check_warning "$lfmixcrlf" ${pfx}_CRLF_mix_LF.err
+ '
+
+ test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf LF_mix_cr" '
+ check_warning "$lfmixcr" ${pfx}_LF_mix_CR.err
+ '
+
+ test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_nul" '
+ check_warning "$crlfnul" ${pfx}_CRLF_nul.err
+ '
+}
+
+
stats_ascii () {
case "$1" in
LF)
@@ -323,8 +378,8 @@ test_expect_success 'setup master' '
printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\rLINETHREE" >CRLF_mix_CR &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul &&
- create_NNO_files CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF &&
- git -c core.autocrlf=false add NNO_*.txt &&
+ create_NNO_MIX_files CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF &&
+ git -c core.autocrlf=false add NNO_*.txt MIX_*.txt &&
git commit -m "mixed line endings" &&
test_tick
'
@@ -385,6 +440,17 @@ test_expect_success 'commit files attr=crlf' '
commit_check_warn input "crlf" "LF_CRLF" "" "LF_CRLF" "LF_CRLF" ""
'
+# Commit "CRLFmixLF" on top of these files already in the repo:
+# LF, CRLF, CRLFmixLF LF_mix_CR CRLFNULL
+# attr LF CRLF CRLFmixLF LF_mix_CR CRLFNUL
+commit_MIX_chkwrn "" "" false "" "" "" "" ""
+commit_MIX_chkwrn "" "" true "LF_CRLF" "" "" "LF_CRLF" "LF_CRLF"
+commit_MIX_chkwrn "" "" input "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
+
+commit_MIX_chkwrn "auto" "" false "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
+commit_MIX_chkwrn "auto" "" true "LF_CRLF" "" "" "LF_CRLF" "LF_CRLF"
+commit_MIX_chkwrn "auto" "" input "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
+
# attr LF CRLF CRLFmixLF LF_mix_CR CRLFNUL
commit_chk_wrnNNO "" "" false "" "" "" "" ""
commit_chk_wrnNNO "" "" true LF_CRLF "" "" "" ""
--
2.15.0.278.gbd97f72a1b.dirty
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 1/1] convert: tighten the safe autocrlf handling
2017-11-24 16:14 ` [PATCH 1/1] convert: tighten the safe autocrlf handling tboegi
@ 2017-11-24 17:24 ` Eric Sunshine
2017-11-24 18:59 ` Torsten Bögershausen
2017-11-25 3:16 ` Junio C Hamano
1 sibling, 1 reply; 23+ messages in thread
From: Eric Sunshine @ 2017-11-24 17:24 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: Git List, ashishnegi33
On Fri, Nov 24, 2017 at 11:14 AM, <tboegi@web.de> wrote:
> When a text file had been commited with CRLF and the file is commited
> again, the CRLF are kept if .gitattributs has "text=auto".
> This is done by analyzing the content of the blob stored in the index:
> If a '\r' is found, Git assumes that the blob was commited with CRLF.
>
> The simple search for a '\r' does not always work as expected:
> A file is encoded in UTF-16 with CRLF and commited. Git treats it as binary.
> Now the content is converted into UTF-8. At the next commit Git treats the
> file as text, the CRLF should be converted into LF, but isn't.
>
> Solution:
> Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found,
> 0 is returned directly, this is the most common case.
> If a '\r' is found, the content is analyzed more deeply.
>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
> diff --git a/convert.c b/convert.c
> @@ -220,18 +220,27 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
> -static int has_cr_in_index(const struct index_state *istate, const char *path)
> +static int has_crlf_in_index(const struct index_state *istate, const char *path)
> {
> unsigned long sz;
> void *data;
> - int has_cr;
> + const char *crp;
> + int has_crlf = 0;
>
> data = read_blob_data_from_index(istate, path, &sz);
> if (!data)
> return 0;
> - has_cr = memchr(data, '\r', sz) != NULL;
> +
> + crp = memchr(data, '\r', sz);
> + if (crp && (crp[1] == '\n')) {
If I understand correctly, this isn't a NUL-terminated string and it
might be a binary blob, so if the lone CR in a file resides at the end
of the file, won't this try looking for LF out-of-bounds? I would have
expected the conditional to be:
if (crp && crp - data + 1 < sz && crp[1] == '\n') {
or any equivalent variation.
> + unsigned int ret_stats;
> + ret_stats = gather_convert_stats(data, sz);
> + if (!(ret_stats & CONVERT_STAT_BITS_BIN) &&
> + (ret_stats & CONVERT_STAT_BITS_TXT_CRLF))
> + has_crlf = 1;
> + }
> free(data);
> - return has_cr;
> + return has_crlf;
> }
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/1] convert: tighten the safe autocrlf handling
2017-11-24 17:24 ` Eric Sunshine
@ 2017-11-24 18:59 ` Torsten Bögershausen
0 siblings, 0 replies; 23+ messages in thread
From: Torsten Bögershausen @ 2017-11-24 18:59 UTC (permalink / raw)
To: Eric Sunshine; +Cc: Git List, ashishnegi33
On Fri, Nov 24, 2017 at 12:24:48PM -0500, Eric Sunshine wrote:
> On Fri, Nov 24, 2017 at 11:14 AM, <tboegi@web.de> wrote:
> > When a text file had been commited with CRLF and the file is commited
> > again, the CRLF are kept if .gitattributs has "text=auto".
> > This is done by analyzing the content of the blob stored in the index:
> > If a '\r' is found, Git assumes that the blob was commited with CRLF.
> >
> > The simple search for a '\r' does not always work as expected:
> > A file is encoded in UTF-16 with CRLF and commited. Git treats it as binary.
> > Now the content is converted into UTF-8. At the next commit Git treats the
> > file as text, the CRLF should be converted into LF, but isn't.
> >
> > Solution:
> > Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found,
> > 0 is returned directly, this is the most common case.
> > If a '\r' is found, the content is analyzed more deeply.
> >
> > Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> > ---
> > diff --git a/convert.c b/convert.c
> > @@ -220,18 +220,27 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
> > -static int has_cr_in_index(const struct index_state *istate, const char *path)
> > +static int has_crlf_in_index(const struct index_state *istate, const char *path)
> > {
> > unsigned long sz;
> > void *data;
> > - int has_cr;
> > + const char *crp;
> > + int has_crlf = 0;
> >
> > data = read_blob_data_from_index(istate, path, &sz);
> > if (!data)
> > return 0;
> > - has_cr = memchr(data, '\r', sz) != NULL;
> > +
> > + crp = memchr(data, '\r', sz);
> > + if (crp && (crp[1] == '\n')) {
>
> If I understand correctly, this isn't a NUL-terminated string and it
> might be a binary blob, so if the lone CR in a file resides at the end
> of the file, won't this try looking for LF out-of-bounds? I would have
> expected the conditional to be:
>
> if (crp && crp - data + 1 < sz && crp[1] == '\n') {
>
> or any equivalent variation.
>
The read_blob_data_from_index() function should always append a '\0',
regardless if the blob is binary or not.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/1] convert: tighten the safe autocrlf handling
2017-11-24 16:14 ` [PATCH 1/1] convert: tighten the safe autocrlf handling tboegi
2017-11-24 17:24 ` Eric Sunshine
@ 2017-11-25 3:16 ` Junio C Hamano
1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2017-11-25 3:16 UTC (permalink / raw)
To: tboegi; +Cc: git, ashishnegi33
tboegi@web.de writes:
> From: Torsten Bögershausen <tboegi@web.de>
>
> When a text file had been commited with CRLF and the file is commited
> again, the CRLF are kept if .gitattributs has "text=auto".
> This is done by analyzing the content of the blob stored in the index:
> If a '\r' is found, Git assumes that the blob was commited with CRLF.
>
> The simple search for a '\r' does not always work as expected:
> A file is encoded in UTF-16 with CRLF and commited. Git treats it as binary.
> Now the content is converted into UTF-8. At the next commit Git treats the
> file as text, the CRLF should be converted into LF, but isn't.
>
> Solution:
Remove this line.
> Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found,
> 0 is returned directly, this is the most common case.
> If a '\r' is found, the content is analyzed more deeply.
I may be recalling things incorrectly, but didn't an old version of
the code check CRLF explicitly, unlike the current implementation
that only check CRs?
In any case, I think we have accumulated enough cruft only to work
around the issues caused by "safe" crlf. I moderately strongly
wonder if we should go back and think if that "feature" is adding
much value, and remove it if it is not.
In the meantime, let's queue this fix on top of the "safe crlf
workaround" pile.
Thanks.
>
> Reported-By: Ashish Negi <ashishnegi33@gmail.com>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
> convert.c | 19 +++++++++----
> t/t0027-auto-crlf.sh | 76 ++++++++++++++++++++++++++++++++++++++++++++++++----
> 2 files changed, 85 insertions(+), 10 deletions(-)
>
> diff --git a/convert.c b/convert.c
> index 20d7ab67bd..63ef799239 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -220,18 +220,27 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
> }
> }
>
> -static int has_cr_in_index(const struct index_state *istate, const char *path)
> +static int has_crlf_in_index(const struct index_state *istate, const char *path)
> {
> unsigned long sz;
> void *data;
> - int has_cr;
> + const char *crp;
> + int has_crlf = 0;
>
> data = read_blob_data_from_index(istate, path, &sz);
> if (!data)
> return 0;
> - has_cr = memchr(data, '\r', sz) != NULL;
> +
> + crp = memchr(data, '\r', sz);
> + if (crp && (crp[1] == '\n')) {
> + unsigned int ret_stats;
> + ret_stats = gather_convert_stats(data, sz);
> + if (!(ret_stats & CONVERT_STAT_BITS_BIN) &&
> + (ret_stats & CONVERT_STAT_BITS_TXT_CRLF))
> + has_crlf = 1;
> + }
> free(data);
> - return has_cr;
> + return has_crlf;
> }
>
> static int will_convert_lf_to_crlf(size_t len, struct text_stat *stats,
> @@ -290,7 +299,7 @@ static int crlf_to_git(const struct index_state *istate,
> * cherry-pick.
> */
> if ((checksafe != SAFE_CRLF_RENORMALIZE) &&
> - has_cr_in_index(istate, path))
> + has_crlf_in_index(istate, path))
> convert_crlf_into_lf = 0;
> }
> if ((checksafe == SAFE_CRLF_WARN ||
> diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
> index 68108d956a..0af35cfb1f 100755
> --- a/t/t0027-auto-crlf.sh
> +++ b/t/t0027-auto-crlf.sh
> @@ -43,19 +43,31 @@ create_gitattributes () {
> } >.gitattributes
> }
>
> -create_NNO_files () {
> +# Create 2 sets of files:
> +# The NNO files are "Not NOrmalized in the repo. We use CRLF_mix_LF and store
> +# it under different names for the different test cases, see ${pfx}
> +# Depending on .gitattributes they are normalized at the next commit (or not)
> +# The MIX files have different contents in the repo.
> +# Depending on its contents, the "new safer autocrlf" may kick in.
> +create_NNO_MIX_files () {
> for crlf in false true input
> do
> for attr in "" auto text -text
> do
> for aeol in "" lf crlf
> do
> - pfx=NNO_attr_${attr}_aeol_${aeol}_${crlf}
> + pfx=NNO_attr_${attr}_aeol_${aeol}_${crlf} &&
> cp CRLF_mix_LF ${pfx}_LF.txt &&
> cp CRLF_mix_LF ${pfx}_CRLF.txt &&
> cp CRLF_mix_LF ${pfx}_CRLF_mix_LF.txt &&
> cp CRLF_mix_LF ${pfx}_LF_mix_CR.txt &&
> - cp CRLF_mix_LF ${pfx}_CRLF_nul.txt
> + cp CRLF_mix_LF ${pfx}_CRLF_nul.txt &&
> + pfx=MIX_attr_${attr}_aeol_${aeol}_${crlf} &&
> + cp LF ${pfx}_LF.txt &&
> + cp CRLF ${pfx}_CRLF.txt &&
> + cp CRLF_mix_LF ${pfx}_CRLF_mix_LF.txt &&
> + cp LF_mix_CR ${pfx}_LF_mix_CR.txt &&
> + cp CRLF_nul ${pfx}_CRLF_nul.txt
> done
> done
> done
> @@ -136,6 +148,49 @@ commit_chk_wrnNNO () {
> '
> }
>
> +# Commit a file with mixed line endings on top of different files
> +# in the index. Check for warnings
> +commit_MIX_chkwrn () {
> + attr=$1 ; shift
> + aeol=$1 ; shift
> + crlf=$1 ; shift
> + lfwarn=$1 ; shift
> + crlfwarn=$1 ; shift
> + lfmixcrlf=$1 ; shift
> + lfmixcr=$1 ; shift
> + crlfnul=$1 ; shift
> + pfx=MIX_attr_${attr}_aeol_${aeol}_${crlf}
> + #Commit file with CLRF_mix_LF on top of existing file
> + create_gitattributes "$attr" $aeol &&
> + for f in LF CRLF CRLF_mix_LF LF_mix_CR CRLF_nul
> + do
> + fname=${pfx}_$f.txt &&
> + cp CRLF_mix_LF $fname &&
> + printf Z >>"$fname" &&
> + git -c core.autocrlf=$crlf add $fname 2>"${pfx}_$f.err"
> + done
> +
> + test_expect_success "commit file with mixed EOL crlf=$crlf attr=$attr LF" '
> + check_warning "$lfwarn" ${pfx}_LF.err
> + '
> + test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF" '
> + check_warning "$crlfwarn" ${pfx}_CRLF.err
> + '
> +
> + test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_mix_LF" '
> + check_warning "$lfmixcrlf" ${pfx}_CRLF_mix_LF.err
> + '
> +
> + test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf LF_mix_cr" '
> + check_warning "$lfmixcr" ${pfx}_LF_mix_CR.err
> + '
> +
> + test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_nul" '
> + check_warning "$crlfnul" ${pfx}_CRLF_nul.err
> + '
> +}
> +
> +
> stats_ascii () {
> case "$1" in
> LF)
> @@ -323,8 +378,8 @@ test_expect_success 'setup master' '
> printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\rLINETHREE" >CRLF_mix_CR &&
> printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul &&
> printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul &&
> - create_NNO_files CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF &&
> - git -c core.autocrlf=false add NNO_*.txt &&
> + create_NNO_MIX_files CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF &&
> + git -c core.autocrlf=false add NNO_*.txt MIX_*.txt &&
> git commit -m "mixed line endings" &&
> test_tick
> '
> @@ -385,6 +440,17 @@ test_expect_success 'commit files attr=crlf' '
> commit_check_warn input "crlf" "LF_CRLF" "" "LF_CRLF" "LF_CRLF" ""
> '
>
> +# Commit "CRLFmixLF" on top of these files already in the repo:
> +# LF, CRLF, CRLFmixLF LF_mix_CR CRLFNULL
> +# attr LF CRLF CRLFmixLF LF_mix_CR CRLFNUL
> +commit_MIX_chkwrn "" "" false "" "" "" "" ""
> +commit_MIX_chkwrn "" "" true "LF_CRLF" "" "" "LF_CRLF" "LF_CRLF"
> +commit_MIX_chkwrn "" "" input "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
> +
> +commit_MIX_chkwrn "auto" "" false "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
> +commit_MIX_chkwrn "auto" "" true "LF_CRLF" "" "" "LF_CRLF" "LF_CRLF"
> +commit_MIX_chkwrn "auto" "" input "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
> +
> # attr LF CRLF CRLFmixLF LF_mix_CR CRLFNUL
> commit_chk_wrnNNO "" "" false "" "" "" "" ""
> commit_chk_wrnNNO "" "" true LF_CRLF "" "" "" ""
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 1/1] convert: tighten the safe autocrlf handling
2017-11-14 12:31 Changing encoding of a file : What should happen to CRLF in file ? Ashish Negi
2017-11-14 15:20 ` Torsten Bögershausen
2017-11-24 16:14 ` [PATCH 1/1] convert: tighten the safe autocrlf handling tboegi
@ 2017-11-26 12:20 ` tboegi
2017-12-08 17:46 ` [PATCH v1 1/2] t0027: Don't use git commit <empty-pathspec> tboegi
2017-12-08 17:46 ` [PATCH v1 2/2] t0027: Adapt the new MIX tests to Windows tboegi
4 siblings, 0 replies; 23+ messages in thread
From: tboegi @ 2017-11-26 12:20 UTC (permalink / raw)
To: git, ashishnegi33; +Cc: Torsten Bögershausen
From: Torsten Bögershausen <tboegi@web.de>
When a text file had been commited with CRLF and the file is commited
again, the CRLF are kept if .gitattributs has "text=auto".
This is done by analyzing the content of the blob stored in the index:
If a '\r' is found, Git assumes that the blob was commited with CRLF.
The simple search for a '\r' does not always work as expected:
A file is encoded in UTF-16 with CRLF and commited. Git treats it as binary.
Now the content is converted into UTF-8. At the next commit Git treats the
file as text, the CRLF should be converted into LF, but isn't.
Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found,
0 is returned directly, this is the most common case.
If a '\r' is found, the content is analyzed more deeply.
Reported-By: Ashish Negi <ashishnegi33@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
Changes against v1:
- change "if (crp && (crp[1] == '\n'))" to "if (crp)"
(Thanks Eric. The new patch is more straightforward, and no risk to
read out of data)
- Remove the "Solution:" in the commit message
Note:
The original function has_cr_in_index() is from this commit:
c4805393d73 (Finn Arne Gangstad 2010-05-12 00:37:57 +0200 225)
And has this info:
>Change autocrlf to not do any conversions to files that in the
>repository already contain a CR. git with autocrlf set will never
>create such a file, or change a LF only file to contain CRs, so the
>(new) assumption is that if a file contains a CR, it is intentional,
>and autocrlf should not change that.
So the original assumption was slightly optimistic (but did work in 7 years)
convert.c | 19 +++++++++----
t/t0027-auto-crlf.sh | 76 ++++++++++++++++++++++++++++++++++++++++++++++++----
2 files changed, 85 insertions(+), 10 deletions(-)
diff --git a/convert.c b/convert.c
index 20d7ab67bd..1a41a48e15 100644
--- a/convert.c
+++ b/convert.c
@@ -220,18 +220,27 @@ static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
}
}
-static int has_cr_in_index(const struct index_state *istate, const char *path)
+static int has_crlf_in_index(const struct index_state *istate, const char *path)
{
unsigned long sz;
void *data;
- int has_cr;
+ const char *crp;
+ int has_crlf = 0;
data = read_blob_data_from_index(istate, path, &sz);
if (!data)
return 0;
- has_cr = memchr(data, '\r', sz) != NULL;
+
+ crp = memchr(data, '\r', sz);
+ if (crp) {
+ unsigned int ret_stats;
+ ret_stats = gather_convert_stats(data, sz);
+ if (!(ret_stats & CONVERT_STAT_BITS_BIN) &&
+ (ret_stats & CONVERT_STAT_BITS_TXT_CRLF))
+ has_crlf = 1;
+ }
free(data);
- return has_cr;
+ return has_crlf;
}
static int will_convert_lf_to_crlf(size_t len, struct text_stat *stats,
@@ -290,7 +299,7 @@ static int crlf_to_git(const struct index_state *istate,
* cherry-pick.
*/
if ((checksafe != SAFE_CRLF_RENORMALIZE) &&
- has_cr_in_index(istate, path))
+ has_crlf_in_index(istate, path))
convert_crlf_into_lf = 0;
}
if ((checksafe == SAFE_CRLF_WARN ||
diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index 68108d956a..0af35cfb1f 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -43,19 +43,31 @@ create_gitattributes () {
} >.gitattributes
}
-create_NNO_files () {
+# Create 2 sets of files:
+# The NNO files are "Not NOrmalized in the repo. We use CRLF_mix_LF and store
+# it under different names for the different test cases, see ${pfx}
+# Depending on .gitattributes they are normalized at the next commit (or not)
+# The MIX files have different contents in the repo.
+# Depending on its contents, the "new safer autocrlf" may kick in.
+create_NNO_MIX_files () {
for crlf in false true input
do
for attr in "" auto text -text
do
for aeol in "" lf crlf
do
- pfx=NNO_attr_${attr}_aeol_${aeol}_${crlf}
+ pfx=NNO_attr_${attr}_aeol_${aeol}_${crlf} &&
cp CRLF_mix_LF ${pfx}_LF.txt &&
cp CRLF_mix_LF ${pfx}_CRLF.txt &&
cp CRLF_mix_LF ${pfx}_CRLF_mix_LF.txt &&
cp CRLF_mix_LF ${pfx}_LF_mix_CR.txt &&
- cp CRLF_mix_LF ${pfx}_CRLF_nul.txt
+ cp CRLF_mix_LF ${pfx}_CRLF_nul.txt &&
+ pfx=MIX_attr_${attr}_aeol_${aeol}_${crlf} &&
+ cp LF ${pfx}_LF.txt &&
+ cp CRLF ${pfx}_CRLF.txt &&
+ cp CRLF_mix_LF ${pfx}_CRLF_mix_LF.txt &&
+ cp LF_mix_CR ${pfx}_LF_mix_CR.txt &&
+ cp CRLF_nul ${pfx}_CRLF_nul.txt
done
done
done
@@ -136,6 +148,49 @@ commit_chk_wrnNNO () {
'
}
+# Commit a file with mixed line endings on top of different files
+# in the index. Check for warnings
+commit_MIX_chkwrn () {
+ attr=$1 ; shift
+ aeol=$1 ; shift
+ crlf=$1 ; shift
+ lfwarn=$1 ; shift
+ crlfwarn=$1 ; shift
+ lfmixcrlf=$1 ; shift
+ lfmixcr=$1 ; shift
+ crlfnul=$1 ; shift
+ pfx=MIX_attr_${attr}_aeol_${aeol}_${crlf}
+ #Commit file with CLRF_mix_LF on top of existing file
+ create_gitattributes "$attr" $aeol &&
+ for f in LF CRLF CRLF_mix_LF LF_mix_CR CRLF_nul
+ do
+ fname=${pfx}_$f.txt &&
+ cp CRLF_mix_LF $fname &&
+ printf Z >>"$fname" &&
+ git -c core.autocrlf=$crlf add $fname 2>"${pfx}_$f.err"
+ done
+
+ test_expect_success "commit file with mixed EOL crlf=$crlf attr=$attr LF" '
+ check_warning "$lfwarn" ${pfx}_LF.err
+ '
+ test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF" '
+ check_warning "$crlfwarn" ${pfx}_CRLF.err
+ '
+
+ test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_mix_LF" '
+ check_warning "$lfmixcrlf" ${pfx}_CRLF_mix_LF.err
+ '
+
+ test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf LF_mix_cr" '
+ check_warning "$lfmixcr" ${pfx}_LF_mix_CR.err
+ '
+
+ test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_nul" '
+ check_warning "$crlfnul" ${pfx}_CRLF_nul.err
+ '
+}
+
+
stats_ascii () {
case "$1" in
LF)
@@ -323,8 +378,8 @@ test_expect_success 'setup master' '
printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\rLINETHREE" >CRLF_mix_CR &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul &&
- create_NNO_files CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF &&
- git -c core.autocrlf=false add NNO_*.txt &&
+ create_NNO_MIX_files CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF &&
+ git -c core.autocrlf=false add NNO_*.txt MIX_*.txt &&
git commit -m "mixed line endings" &&
test_tick
'
@@ -385,6 +440,17 @@ test_expect_success 'commit files attr=crlf' '
commit_check_warn input "crlf" "LF_CRLF" "" "LF_CRLF" "LF_CRLF" ""
'
+# Commit "CRLFmixLF" on top of these files already in the repo:
+# LF, CRLF, CRLFmixLF LF_mix_CR CRLFNULL
+# attr LF CRLF CRLFmixLF LF_mix_CR CRLFNUL
+commit_MIX_chkwrn "" "" false "" "" "" "" ""
+commit_MIX_chkwrn "" "" true "LF_CRLF" "" "" "LF_CRLF" "LF_CRLF"
+commit_MIX_chkwrn "" "" input "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
+
+commit_MIX_chkwrn "auto" "" false "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
+commit_MIX_chkwrn "auto" "" true "LF_CRLF" "" "" "LF_CRLF" "LF_CRLF"
+commit_MIX_chkwrn "auto" "" input "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
+
# attr LF CRLF CRLFmixLF LF_mix_CR CRLFNUL
commit_chk_wrnNNO "" "" false "" "" "" "" ""
commit_chk_wrnNNO "" "" true LF_CRLF "" "" "" ""
--
2.15.0.278.gbd97f72a1b.dirty
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v1 1/2] t0027: Don't use git commit <empty-pathspec>
2017-11-14 12:31 Changing encoding of a file : What should happen to CRLF in file ? Ashish Negi
` (2 preceding siblings ...)
2017-11-26 12:20 ` [PATCH v2 " tboegi
@ 2017-12-08 17:46 ` tboegi
2017-12-08 18:13 ` Junio C Hamano
2017-12-08 17:46 ` [PATCH v1 2/2] t0027: Adapt the new MIX tests to Windows tboegi
4 siblings, 1 reply; 23+ messages in thread
From: tboegi @ 2017-12-08 17:46 UTC (permalink / raw)
To: git, ashishnegi33; +Cc: Torsten Bögershausen
From: Torsten Bögershausen <tboegi@web.de>
Replace `git commit -m "comment" ""` with `git commit -m "comment"` to
remove the empty path spec.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
t/t0027-auto-crlf.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index c2c208fdcd..97154f5c79 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -370,7 +370,7 @@ test_expect_success 'setup master' '
echo >.gitattributes &&
git checkout -b master &&
git add .gitattributes &&
- git commit -m "add .gitattributes" "" &&
+ git commit -m "add .gitattributes" &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\nLINETWO\nLINETHREE" >LF &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\r\nLINETHREE" >CRLF &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\r\nLINETWO\nLINETHREE" >CRLF_mix_LF &&
--
2.15.1.271.g1a4e40aa5d
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v1 2/2] t0027: Adapt the new MIX tests to Windows
2017-11-14 12:31 Changing encoding of a file : What should happen to CRLF in file ? Ashish Negi
` (3 preceding siblings ...)
2017-12-08 17:46 ` [PATCH v1 1/2] t0027: Don't use git commit <empty-pathspec> tboegi
@ 2017-12-08 17:46 ` tboegi
4 siblings, 0 replies; 23+ messages in thread
From: tboegi @ 2017-12-08 17:46 UTC (permalink / raw)
To: git, ashishnegi33; +Cc: Torsten Bögershausen
From: Torsten Bögershausen <tboegi@web.de>
The new MIX tests don't pass under Windows, adapt them
to use the correct native line ending.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
Sorry for the breakage.
This needs to go on top of tb/check-crlf-for-safe-crlf
t/t0027-auto-crlf.sh | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index 97154f5c79..8d929b76dc 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -170,22 +170,22 @@ commit_MIX_chkwrn () {
git -c core.autocrlf=$crlf add $fname 2>"${pfx}_$f.err"
done
- test_expect_success "commit file with mixed EOL crlf=$crlf attr=$attr LF" '
+ test_expect_success "commit file with mixed EOL onto LF crlf=$crlf attr=$attr" '
check_warning "$lfwarn" ${pfx}_LF.err
'
- test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF" '
+ test_expect_success "commit file with mixed EOL onto CLRF attr=$attr aeol=$aeol crlf=$crlf" '
check_warning "$crlfwarn" ${pfx}_CRLF.err
'
- test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_mix_LF" '
+ test_expect_success "commit file with mixed EOL onto CRLF_mix_LF attr=$attr aeol=$aeol crlf=$crlf" '
check_warning "$lfmixcrlf" ${pfx}_CRLF_mix_LF.err
'
- test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf LF_mix_cr" '
+ test_expect_success "commit file with mixed EOL onto LF_mix_cr attr=$attr aeol=$aeol crlf=$crlf " '
check_warning "$lfmixcr" ${pfx}_LF_mix_CR.err
'
- test_expect_success "commit file with mixed EOL attr=$attr aeol=$aeol crlf=$crlf CRLF_nul" '
+ test_expect_success "commit file with mixed EOL onto CRLF_nul attr=$attr aeol=$aeol crlf=$crlf" '
check_warning "$crlfnul" ${pfx}_CRLF_nul.err
'
}
@@ -378,7 +378,7 @@ test_expect_success 'setup master' '
printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\rLINETHREE" >CRLF_mix_CR &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul &&
printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul &&
- create_NNO_MIX_files CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF CRLF_mix_LF &&
+ create_NNO_MIX_files &&
git -c core.autocrlf=false add NNO_*.txt MIX_*.txt &&
git commit -m "mixed line endings" &&
test_tick
@@ -441,13 +441,14 @@ test_expect_success 'commit files attr=crlf' '
'
# Commit "CRLFmixLF" on top of these files already in the repo:
-# LF, CRLF, CRLFmixLF LF_mix_CR CRLFNULL
+# mixed mixed mixed mixed mixed
+# onto onto onto onto onto
# attr LF CRLF CRLFmixLF LF_mix_CR CRLFNUL
commit_MIX_chkwrn "" "" false "" "" "" "" ""
commit_MIX_chkwrn "" "" true "LF_CRLF" "" "" "LF_CRLF" "LF_CRLF"
commit_MIX_chkwrn "" "" input "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
-commit_MIX_chkwrn "auto" "" false "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
+commit_MIX_chkwrn "auto" "" false "$WAMIX" "" "" "$WAMIX" "$WAMIX"
commit_MIX_chkwrn "auto" "" true "LF_CRLF" "" "" "LF_CRLF" "LF_CRLF"
commit_MIX_chkwrn "auto" "" input "CRLF_LF" "" "" "CRLF_LF" "CRLF_LF"
--
2.15.1.271.g1a4e40aa5d
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v1 1/2] t0027: Don't use git commit <empty-pathspec>
2017-12-08 17:46 ` [PATCH v1 1/2] t0027: Don't use git commit <empty-pathspec> tboegi
@ 2017-12-08 18:13 ` Junio C Hamano
2017-12-08 18:21 ` Junio C Hamano
0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2017-12-08 18:13 UTC (permalink / raw)
To: tboegi; +Cc: git, ashishnegi33
tboegi@web.de writes:
> From: Torsten Bögershausen <tboegi@web.de>
>
> Replace `git commit -m "comment" ""` with `git commit -m "comment"` to
> remove the empty path spec.
>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
> t/t0027-auto-crlf.sh | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
This looks a bit strange. The intent seems to commit all changes
made in the working tree, so I'd understand it if it replaced the
empty string with a single dot.
I also thought that we deprecated use of an empty string as "match
all" pathspec recently, and expected that this to break.
... goes and looks ...
Indeed, 229a95aa ("t0027: do not use an empty string as a pathspec
element", 2017-06-23) does exactly that.
> diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
> index c2c208fdcd..97154f5c79 100755
> --- a/t/t0027-auto-crlf.sh
> +++ b/t/t0027-auto-crlf.sh
> @@ -370,7 +370,7 @@ test_expect_success 'setup master' '
> echo >.gitattributes &&
> git checkout -b master &&
> git add .gitattributes &&
> - git commit -m "add .gitattributes" "" &&
> + git commit -m "add .gitattributes" &&
> printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\nLINETWO\nLINETHREE" >LF &&
> printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\r\nLINETHREE" >CRLF &&
> printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\r\nLINETWO\nLINETHREE" >CRLF_mix_LF &&
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v1 1/2] t0027: Don't use git commit <empty-pathspec>
2017-12-08 18:13 ` Junio C Hamano
@ 2017-12-08 18:21 ` Junio C Hamano
2017-12-08 18:50 ` Torsten Bögershausen
0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2017-12-08 18:21 UTC (permalink / raw)
To: tboegi; +Cc: git, ashishnegi33
Junio C Hamano <gitster@pobox.com> writes:
> tboegi@web.de writes:
>
>> From: Torsten Bögershausen <tboegi@web.de>
>>
>> Replace `git commit -m "comment" ""` with `git commit -m "comment"` to
>> remove the empty path spec.
>>
>> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
>> ---
>> t/t0027-auto-crlf.sh | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> This looks a bit strange. The intent seems to commit all changes
> made in the working tree, so I'd understand it if it replaced the
> empty string with a single dot.
>
> I also thought that we deprecated use of an empty string as "match
> all" pathspec recently, and expected that this to break.
>
> ... goes and looks ...
>
> Indeed, 229a95aa ("t0027: do not use an empty string as a pathspec
> element", 2017-06-23) does exactly that.
OK, I think I can safely omit this patch, because
(1) when 2/2 is queued on top of tb/check-crlf-for-safe-crlf,
because ex/deprecate-empty-pathspec-as-match-all is not yet in
the topic, an empty pathspec still means "match all" and
nothing breaks; and
(2) when tb/check-crlf-for-safe-crlf plus 2/2 is merged to any
branch with ex/deprecate-empty-pathspec-as-match-all, the topic
already fixes what this 1/2 tries to
Thanks for being thorough, though.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v1 1/2] t0027: Don't use git commit <empty-pathspec>
2017-12-08 18:21 ` Junio C Hamano
@ 2017-12-08 18:50 ` Torsten Bögershausen
0 siblings, 0 replies; 23+ messages in thread
From: Torsten Bögershausen @ 2017-12-08 18:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, ashishnegi33
On Fri, Dec 08, 2017 at 10:21:19AM -0800, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > tboegi@web.de writes:
> >
> >> From: Torsten Bögershausen <tboegi@web.de>
> >>
> >> Replace `git commit -m "comment" ""` with `git commit -m "comment"` to
> >> remove the empty path spec.
> >>
> >> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> >> ---
> >> t/t0027-auto-crlf.sh | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > This looks a bit strange. The intent seems to commit all changes
> > made in the working tree, so I'd understand it if it replaced the
> > empty string with a single dot.
> >
> > I also thought that we deprecated use of an empty string as "match
> > all" pathspec recently, and expected that this to break.
> >
> > ... goes and looks ...
> >
> > Indeed, 229a95aa ("t0027: do not use an empty string as a pathspec
> > element", 2017-06-23) does exactly that.
>
> OK, I think I can safely omit this patch, because
>
> (1) when 2/2 is queued on top of tb/check-crlf-for-safe-crlf,
> because ex/deprecate-empty-pathspec-as-match-all is not yet in
> the topic, an empty pathspec still means "match all" and
> nothing breaks; and
>
> (2) when tb/check-crlf-for-safe-crlf plus 2/2 is merged to any
> branch with ex/deprecate-empty-pathspec-as-match-all, the topic
> already fixes what this 1/2 tries to
>
> Thanks for being thorough, though.
>
Sure, the credit goes 100% to you:
commit 229a95aafa77b583b46a3156b4fad469c264ddfd
Author: Junio C Hamano <gitster@pobox.com>
Date: Fri Jun 23 11:04:14 2017 -0700
t0027: do not use an empty string as a pathspec element
My brain did just assume that this had mad it to master, sorry for the noise
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2017-12-08 18:51 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-14 12:31 Changing encoding of a file : What should happen to CRLF in file ? Ashish Negi
2017-11-14 15:20 ` Torsten Bögershausen
2017-11-14 16:13 ` Ashish Negi
2017-11-14 16:15 ` Ashish Negi
2017-11-14 17:09 ` Torsten Bögershausen
2017-11-15 8:11 ` Ashish Negi
2017-11-15 17:12 ` Torsten Bögershausen
2017-11-15 19:05 ` Ashish Negi
2017-11-16 16:15 ` Torsten Bögershausen
2017-11-23 16:31 ` Ashish Negi
2017-11-23 20:25 ` Torsten Bögershausen
2017-11-24 6:37 ` Ashish Negi
2017-11-14 16:45 ` Torsten Bögershausen
2017-11-24 16:14 ` [PATCH 1/1] convert: tighten the safe autocrlf handling tboegi
2017-11-24 17:24 ` Eric Sunshine
2017-11-24 18:59 ` Torsten Bögershausen
2017-11-25 3:16 ` Junio C Hamano
2017-11-26 12:20 ` [PATCH v2 " tboegi
2017-12-08 17:46 ` [PATCH v1 1/2] t0027: Don't use git commit <empty-pathspec> tboegi
2017-12-08 18:13 ` Junio C Hamano
2017-12-08 18:21 ` Junio C Hamano
2017-12-08 18:50 ` Torsten Bögershausen
2017-12-08 17:46 ` [PATCH v1 2/2] t0027: Adapt the new MIX tests to Windows tboegi
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).