git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 1/4] convert: Safer handling of $Id$ contraction.
@ 2010-03-01 16:16 Henrik Grubbström
  2010-03-03  1:10 ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Henrik Grubbström @ 2010-03-01 16:16 UTC (permalink / raw
  To: git; +Cc: Henrik Grubbström  

From: Henrik Grubbström (Grubba) <grubba@grubba.org>

The code to contract $Id:xxxxx$ strings could eat an arbitrary amount
of source text if the terminating $ was lost. It now refuses to
contract $Id:xxxxx$ strings spanning multiple lines.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
 convert.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/convert.c b/convert.c
index 4f8fcb7..91207ab 100644
--- a/convert.c
+++ b/convert.c
@@ -425,6 +425,7 @@ static int count_ident(const char *cp, unsigned long size)
 				cnt++;
 				break;
 			}
+			if (ch == '\n') break;
 		}
 	}
 	return cnt;
@@ -433,7 +434,7 @@ static int count_ident(const char *cp, unsigned long size)
 static int ident_to_git(const char *path, const char *src, size_t len,
                         struct strbuf *buf, int ident)
 {
-	char *dst, *dollar;
+	char *dst, *dollar, *nl;
 
 	if (!ident || !count_ident(src, len))
 		return 0;
@@ -455,6 +456,12 @@ static int ident_to_git(const char *path, const char *src, size_t len,
 			dollar = memchr(src + 3, '$', len - 3);
 			if (!dollar)
 				break;
+			nl = memchr(src + 3, '\n', len - 3);
+			if (nl && nl < dollar) {
+				/* Line break before the next dollar. */
+				continue;
+			}
+
 			memcpy(dst, "Id$", 3);
 			dst += 3;
 			len -= dollar + 1 - src;
@@ -470,7 +477,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
                              struct strbuf *buf, int ident)
 {
 	unsigned char sha1[20];
-	char *to_free = NULL, *dollar;
+	char *to_free = NULL, *dollar, *nl;
 	int cnt;
 
 	if (!ident)
@@ -514,6 +521,12 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 				break;
 			}
 
+			nl = memchr(src + 3, '\n', len - 3);
+			if (nl && nl < dollar) {
+				/* Line break before the next dollar. */
+				continue;
+			}
+
 			len -= dollar + 1 - src;
 			src  = dollar + 1;
 		} else {
-- 
1.6.4.122.g6ffd7

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] convert: Safer handling of $Id$ contraction.
  2010-03-01 16:16 Henrik Grubbström
@ 2010-03-03  1:10 ` Junio C Hamano
  2010-03-03 10:40   ` Henrik Grubbström
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2010-03-03  1:10 UTC (permalink / raw
  To: Henrik Grubbström; +Cc: git

Henrik Grubbström <grubba@grubba.org> writes:

> From: Henrik Grubbström (Grubba) <grubba@grubba.org>

You can omit this line, as it is the same as your From: header.

> The code to contract $Id:xxxxx$ strings could eat an arbitrary amount
> of source text if the terminating $ was lost. It now refuses to
> contract $Id:xxxxx$ strings spanning multiple lines.

Hmm, at least when going from working tree to the index, shouldn't the
code refuse _and_ die(), instead of silently pass the garbage through?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] convert: Safer handling of $Id$ contraction.
  2010-03-03  1:10 ` Junio C Hamano
@ 2010-03-03 10:40   ` Henrik Grubbström
  2010-03-09  9:22     ` Henrik Grubbström
  0 siblings, 1 reply; 14+ messages in thread
From: Henrik Grubbström @ 2010-03-03 10:40 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Henrik Grubbström

[-- Attachment #1: Type: TEXT/PLAIN, Size: 791 bytes --]

On Tue, 2 Mar 2010, Junio C Hamano wrote:

> Henrik Grubbström <grubba@grubba.org> writes:
>
>> From: Henrik Grubbström (Grubba) <grubba@grubba.org>
>
> You can omit this line, as it is the same as your From: header.

I blame git-send-email(1).

>> The code to contract $Id:xxxxx$ strings could eat an arbitrary amount
>> of source text if the terminating $ was lost. It now refuses to
>> contract $Id:xxxxx$ strings spanning multiple lines.
>
> Hmm, at least when going from working tree to the index, shouldn't the
> code refuse _and_ die(), instead of silently pass the garbage through?

It depends; it could be part of some code that scans for the $Id: tag.
A warning might be appropriate though.

--
Henrik Grubbström					grubba@grubba.org
Roxen Internet Software AB				grubba@roxen.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] convert: Safer handling of $Id$ contraction.
  2010-03-03 10:40   ` Henrik Grubbström
@ 2010-03-09  9:22     ` Henrik Grubbström
  0 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström @ 2010-03-09  9:22 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, Henrik Grubbström

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1104 bytes --]

On Wed, 3 Mar 2010, Henrik Grubbström wrote:

> On Tue, 2 Mar 2010, Junio C Hamano wrote:
>
>> Henrik Grubbström <grubba@grubba.org> writes:
>> 
>>> The code to contract $Id:xxxxx$ strings could eat an arbitrary amount
>>> of source text if the terminating $ was lost. It now refuses to
>>> contract $Id:xxxxx$ strings spanning multiple lines.
>> 
>> Hmm, at least when going from working tree to the index, shouldn't the
>> code refuse _and_ die(), instead of silently pass the garbage through?
>
> It depends; it could be part of some code that scans for the $Id: tag.
> A warning might be appropriate though.

A nonscientific survey of some version control systems gives:

VCS		Id-keyword	Eats linefeed
-------------------------------------------------
bzr		no		-
cvs		yes		no, silent accept
git(1.7.0)	yes		yes, silent
hg		yes(hgext)	no, silent accept
monotone	no		-
rcs		yes		no, silent accept
svn		yes		no, silent accept

So it seems my original patch is in line with what other version control 
systems do.

--
Henrik Grubbström					grubba@grubba.org
Roxen Internet Software AB				grubba@roxen.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 0/4] ident attribute related patches (resend w/ testsuite)
@ 2010-03-15 15:30 Henrik Grubbström (Grubba)
  2010-03-15 15:30 ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
  0 siblings, 1 reply; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-03-15 15:30 UTC (permalink / raw
  To: git; +Cc: Henrik Grubbström  

These are some patches that are needed to get the support for the
'ident' attribute to a useable state.

Since last time, some issues that caused the the existing testsuite to
fail have been fixed, and the testsuite has been extended to test the
issues fixed by the patches.

Henrik Grubbström (Grubba) (4):
  convert: Safer handling of $Id$ contraction.
  convert: Keep foreign $Id$ on checkout.
  convert: Inhibit contraction of foreign $Id$ during stats.
  convert: Added core.refilteronadd feature.

 Documentation/config.txt |    6 ++++
 builtin/apply.c          |    2 +-
 builtin/blame.c          |    2 +-
 cache.h                  |    9 +++++-
 combine-diff.c           |    2 +-
 config.c                 |    5 +++
 convert.c                |   53 ++++++++++++++++++++++++++++++----
 diff.c                   |    2 +-
 environment.c            |    1 +
 sha1_file.c              |   60 +++++++++++++++++++++++++++++++++++++-
 t/t0021-conversion.sh    |   72 ++++++++++++++++++++++++++++++++++++++++++----
 11 files changed, 196 insertions(+), 18 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] convert: Safer handling of $Id$ contraction.
  2010-03-15 15:30 [PATCH 0/4] ident attribute related patches (resend w/ testsuite) Henrik Grubbström (Grubba)
@ 2010-03-15 15:30 ` Henrik Grubbström (Grubba)
  2010-03-15 15:30   ` [PATCH 2/4] convert: Keep foreign $Id$ on checkout Henrik Grubbström (Grubba)
                     ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-03-15 15:30 UTC (permalink / raw
  To: git; +Cc: Henrik Grubbström  

The code to contract $Id:xxxxx$ strings could eat an arbitrary amount
of source text if the terminating $ was lost. It now refuses to
contract $Id:xxxxx$ strings spanning multiple lines.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
The behaviour implemented by the patch is in line with what other
VCSes that implement $Id$ do.

 convert.c             |   17 +++++++++++++++--
 t/t0021-conversion.sh |   16 ++++++++++------
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/convert.c b/convert.c
index 4f8fcb7..91207ab 100644
--- a/convert.c
+++ b/convert.c
@@ -425,6 +425,7 @@ static int count_ident(const char *cp, unsigned long size)
 				cnt++;
 				break;
 			}
+			if (ch == '\n') break;
 		}
 	}
 	return cnt;
@@ -433,7 +434,7 @@ static int count_ident(const char *cp, unsigned long size)
 static int ident_to_git(const char *path, const char *src, size_t len,
                         struct strbuf *buf, int ident)
 {
-	char *dst, *dollar;
+	char *dst, *dollar, *nl;
 
 	if (!ident || !count_ident(src, len))
 		return 0;
@@ -455,6 +456,12 @@ static int ident_to_git(const char *path, const char *src, size_t len,
 			dollar = memchr(src + 3, '$', len - 3);
 			if (!dollar)
 				break;
+			nl = memchr(src + 3, '\n', len - 3);
+			if (nl && nl < dollar) {
+				/* Line break before the next dollar. */
+				continue;
+			}
+
 			memcpy(dst, "Id$", 3);
 			dst += 3;
 			len -= dollar + 1 - src;
@@ -470,7 +477,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
                              struct strbuf *buf, int ident)
 {
 	unsigned char sha1[20];
-	char *to_free = NULL, *dollar;
+	char *to_free = NULL, *dollar, *nl;
 	int cnt;
 
 	if (!ident)
@@ -514,6 +521,12 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 				break;
 			}
 
+			nl = memchr(src + 3, '\n', len - 3);
+			if (nl && nl < dollar) {
+				/* Line break before the next dollar. */
+				continue;
+			}
+
 			len -= dollar + 1 - src;
 			src  = dollar + 1;
 		} else {
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 6cb8d60..a21a8d2 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -65,17 +65,21 @@ test_expect_success expanded_in_repo '
 		echo "\$Id:NoSpaceAtFront \$"
 		echo "\$Id:NoSpaceAtEitherEnd\$"
 		echo "\$Id: NoTerminatingSymbol"
+		echo "\$Id: Foreign Commit With Spaces $"
+		echo "\$Id: NoTerminatingSymbolAtEOF"
 	} > expanded-keywords &&
 
 	{
 		echo "File with expanded keywords"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
-		echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
 		echo "\$Id: NoTerminatingSymbol"
+		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 $"
+		echo "\$Id: NoTerminatingSymbolAtEOF"
 	} > expected-output &&
 
 	git add expanded-keywords &&
-- 
1.6.4.122.g6ffd7

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] convert: Keep foreign $Id$ on checkout.
  2010-03-15 15:30 ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
@ 2010-03-15 15:30   ` Henrik Grubbström (Grubba)
  2010-03-15 15:30     ` [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats Henrik Grubbström (Grubba)
  2010-03-15 15:39   ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Bert Wesarg
  2010-03-15 18:16   ` René Scharfe
  2 siblings, 1 reply; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-03-15 15:30 UTC (permalink / raw
  To: git; +Cc: Henrik Grubbström  

If there are foreign $Id$ keywords in the repository, they are most
likely there for a reason. Let's keep them on checkout (which is also
what the documentation indicates). Foreign $Id$ keywords are now
recognized by there being multiple space separated fields in $Id:xxxxx$.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
The typical use case is for repositories that have been converted
from some other VCS, where it is desirable to keep the old identifiers
around until there's some other reason to alter the file.

 convert.c             |   16 ++++++++++++++--
 t/t0021-conversion.sh |    2 +-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/convert.c b/convert.c
index 91207ab..4165385 100644
--- a/convert.c
+++ b/convert.c
@@ -477,7 +477,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
                              struct strbuf *buf, int ident)
 {
 	unsigned char sha1[20];
-	char *to_free = NULL, *dollar, *nl;
+	char *to_free = NULL, *dollar, *nl, *spc;
 	int cnt;
 
 	if (!ident)
@@ -513,7 +513,10 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 		} else if (src[2] == ':') {
 			/*
 			 * It's possible that an expanded Id has crept its way into the
-			 * repository, we cope with that by stripping the expansion out
+			 * repository, we cope with that by stripping the expansion out.
+			 * This is probably not a good idea, since it will cause changes
+			 * on checkout, which won't go away by stash, but let's keep it
+			 * for git-style ids.
 			 */
 			dollar = memchr(src + 3, '$', len - 3);
 			if (!dollar) {
@@ -527,6 +530,15 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 				continue;
 			}
 
+			spc = memchr(src + 4, ' ', len - 4);
+			if (spc && spc < dollar-1) {
+				/* There are spaces in unexpected places.
+				 * This is probably an id from some other
+				 * versioning system. Keep it for now.
+				 */
+				continue;
+			}
+
 			len -= dollar + 1 - src;
 			src  = dollar + 1;
 		} else {
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index a21a8d2..248efcc 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -78,7 +78,7 @@ test_expect_success expanded_in_repo '
 		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
 		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
 		echo "\$Id: NoTerminatingSymbol"
-		echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 $"
+		echo "\$Id: Foreign Commit With Spaces $"
 		echo "\$Id: NoTerminatingSymbolAtEOF"
 	} > expected-output &&
 
-- 
1.6.4.122.g6ffd7

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats.
  2010-03-15 15:30   ` [PATCH 2/4] convert: Keep foreign $Id$ on checkout Henrik Grubbström (Grubba)
@ 2010-03-15 15:30     ` Henrik Grubbström (Grubba)
  2010-03-15 15:30       ` [PATCH 4/4] convert: Added core.refilteronadd feature Henrik Grubbström (Grubba)
  2010-03-15 15:42       ` [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats Bert Wesarg
  0 siblings, 2 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-03-15 15:30 UTC (permalink / raw
  To: git; +Cc: Henrik Grubbström  

Files containing foreign $Id$'s were reported as modified directly
on checkout, which ment that it was difficult to keep a clean working
tree when handling commits with files containing such. convert_to_git()
now takes one more mode parameter for controlling this.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
 builtin/apply.c       |    2 +-
 builtin/blame.c       |    2 +-
 cache.h               |    8 +++++++-
 combine-diff.c        |    2 +-
 convert.c             |   24 ++++++++++++++++++++----
 diff.c                |    2 +-
 sha1_file.c           |    3 ++-
 t/t0021-conversion.sh |   21 +++++++++++++++++++++
 8 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 3af4ae0..25adef8 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -1759,7 +1759,7 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 	case S_IFREG:
 		if (strbuf_read_file(buf, path, st->st_size) != st->st_size)
 			return error("unable to open or read %s", path);
-		convert_to_git(path, buf->buf, buf->len, buf, 0);
+		convert_to_git(path, buf->buf, buf->len, buf, 0, 0);
 		return 0;
 	default:
 		return -1;
diff --git a/builtin/blame.c b/builtin/blame.c
index fc15863..16f7f00 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2050,7 +2050,7 @@ static struct commit *fake_working_tree_commit(const char *path, const char *con
 		if (strbuf_read(&buf, 0, 0) < 0)
 			die_errno("failed to read from stdin");
 	}
-	convert_to_git(path, buf.buf, buf.len, &buf, 0);
+	convert_to_git(path, buf.buf, buf.len, &buf, 0, 0);
 	origin->file.ptr = buf.buf;
 	origin->file.size = buf.len;
 	pretend_sha1_file(buf.buf, buf.len, OBJ_BLOB, origin->blob_sha1);
diff --git a/cache.h b/cache.h
index 89f6a40..b62c462 100644
--- a/cache.h
+++ b/cache.h
@@ -556,6 +556,11 @@ enum safe_crlf {
 	SAFE_CRLF_WARN = 2,
 };
 
+enum ident_mode {
+	IDENT_MODE_FALSE = 0,
+	IDENT_MODE_KEEP_FOREIGN = 1,
+};
+
 extern enum safe_crlf safe_crlf;
 
 enum branch_track {
@@ -1010,7 +1015,8 @@ extern void trace_argv_printf(const char **argv, const char *format, ...);
 /* convert.c */
 /* returns 1 if *dst was used */
 extern int convert_to_git(const char *path, const char *src, size_t len,
-                          struct strbuf *dst, enum safe_crlf checksafe);
+			  struct strbuf *dst, enum safe_crlf checksafe,
+			  enum ident_mode identmode);
 extern int convert_to_working_tree(const char *path, const char *src, size_t len, struct strbuf *dst);
 
 /* add */
diff --git a/combine-diff.c b/combine-diff.c
index 6162691..8c9320a 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -758,7 +758,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 			if (is_file) {
 				struct strbuf buf = STRBUF_INIT;
 
-				if (convert_to_git(elem->path, result, len, &buf, safe_crlf)) {
+				if (convert_to_git(elem->path, result, len, &buf, safe_crlf, 0)) {
 					free(result);
 					result = strbuf_detach(&buf, &len);
 					result_size = len;
diff --git a/convert.c b/convert.c
index 4165385..ab2e98e 100644
--- a/convert.c
+++ b/convert.c
@@ -432,9 +432,10 @@ static int count_ident(const char *cp, unsigned long size)
 }
 
 static int ident_to_git(const char *path, const char *src, size_t len,
-                        struct strbuf *buf, int ident)
+			struct strbuf *buf, int ident,
+			enum ident_mode identmode)
 {
-	char *dst, *dollar, *nl;
+	char *dst, *dollar, *nl, *spc;
 
 	if (!ident || !count_ident(src, len))
 		return 0;
@@ -462,6 +463,20 @@ static int ident_to_git(const char *path, const char *src, size_t len,
 				continue;
 			}
 
+			if ((identmode == IDENT_MODE_KEEP_FOREIGN) && len > 5) {
+				spc = memchr(src + 4, ' ', len - 4);
+				if (spc && spc < dollar-1) {
+					/* Foreign id.
+					 * Contraction of these is inhibited
+					 * during status operations to avoid
+					 * all files containing such being
+					 * marked as modified on checkout.
+					 * cf sha1_file.c:index_mem().
+					 */
+					continue;
+				}
+			}
+
 			memcpy(dst, "Id$", 3);
 			dst += 3;
 			len -= dollar + 1 - src;
@@ -594,7 +609,8 @@ static int git_path_check_ident(const char *path, struct git_attr_check *check)
 }
 
 int convert_to_git(const char *path, const char *src, size_t len,
-                   struct strbuf *dst, enum safe_crlf checksafe)
+		   struct strbuf *dst, enum safe_crlf checksafe,
+		   enum ident_mode identmode)
 {
 	struct git_attr_check check[3];
 	int crlf = CRLF_GUESS;
@@ -621,7 +637,7 @@ int convert_to_git(const char *path, const char *src, size_t len,
 		src = dst->buf;
 		len = dst->len;
 	}
-	return ret | ident_to_git(path, src, len, dst, ident);
+	return ret | ident_to_git(path, src, len, dst, ident, identmode);
 }
 
 int convert_to_working_tree(const char *path, const char *src, size_t len, struct strbuf *dst)
diff --git a/diff.c b/diff.c
index dfdfa1a..dd464bc 100644
--- a/diff.c
+++ b/diff.c
@@ -2113,7 +2113,7 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only)
 		/*
 		 * Convert from working tree format to canonical git format
 		 */
-		if (convert_to_git(s->path, s->data, s->size, &buf, safe_crlf)) {
+		if (convert_to_git(s->path, s->data, s->size, &buf, safe_crlf, 0)) {
 			size_t size = 0;
 			munmap(s->data, s->size);
 			s->should_munmap = 0;
diff --git a/sha1_file.c b/sha1_file.c
index a08a9d0..992b624 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2417,7 +2417,8 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
 	if ((type == OBJ_BLOB) && path) {
 		struct strbuf nbuf = STRBUF_INIT;
 		if (convert_to_git(path, buf, size, &nbuf,
-		                   write_object ? safe_crlf : 0)) {
+				   write_object ? safe_crlf : 0,
+				   write_object ? 0 : IDENT_MODE_KEEP_FOREIGN)) {
 			buf = strbuf_detach(&nbuf, &size);
 			re_allocated = 1;
 		}
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 248efcc..57812b6 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -93,4 +93,25 @@ test_expect_success expanded_in_repo '
 	cmp expanded-keywords expected-output
 '
 
+# Check that files containing keywords with proper markup aren't marked
+# as modified on checkout.
+test_expect_success keywords_not_modified '
+	{
+		echo "File with foreign keywords"
+		echo "\$Id\$"
+		echo "\$Id: NoTerminatingSymbol"
+		echo "\$Id: Foreign Commit With Spaces $"
+		echo "\$Id: NoTerminatingSymbolAtEOF"
+	} > expanded-keywords2 &&
+
+	git add expanded-keywords2 &&
+	git commit -m "File with keywords expanded" &&
+
+	echo "expanded-keywords2 ident" >> .gitattributes &&
+
+	rm -f expanded-keywords2 &&
+	git checkout -- expanded-keywords2 &&
+	test "x`git status --porcelain -- expanded-keywords2`" = x
+'
+
 test_done
-- 
1.6.4.122.g6ffd7

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] convert: Added core.refilteronadd feature.
  2010-03-15 15:30     ` [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats Henrik Grubbström (Grubba)
@ 2010-03-15 15:30       ` Henrik Grubbström (Grubba)
  2010-03-15 15:42       ` [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats Bert Wesarg
  1 sibling, 0 replies; 14+ messages in thread
From: Henrik Grubbström (Grubba) @ 2010-03-15 15:30 UTC (permalink / raw
  To: git; +Cc: Henrik Grubbström  

When having $Id$ tags in other versioning systems, it is customary
to recalculate the tags in the source on commit or equvivalent.
This commit adds a configuration option to git that causes source
files to pass through a conversion roundtrip when added to the index.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
---
 Documentation/config.txt |    6 +++++
 cache.h                  |    1 +
 config.c                 |    5 ++++
 environment.c            |    1 +
 sha1_file.c              |   57 ++++++++++++++++++++++++++++++++++++++++++++++
 t/t0021-conversion.sh    |   35 ++++++++++++++++++++++++++++
 6 files changed, 105 insertions(+), 0 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index c80262b..a38b23e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -533,6 +533,12 @@ core.sparseCheckout::
 	Enable "sparse checkout" feature. See section "Sparse checkout" in
 	linkgit:git-read-tree[1] for more information.
 
+core.refilterOnAdd::
+	Enable "refilter on add" feature. This causes source files to be
+	behave as if they were checked out after a linkgit:git-add[1].
+	This is typically usefull if eg the `ident` attribute is active,
+	in which case the $Id$ tags will be updated.
+
 add.ignore-errors::
 	Tells 'git add' to continue adding files when some files cannot be
 	added due to indexing errors. Equivalent to the '--ignore-errors'
diff --git a/cache.h b/cache.h
index b62c462..8cff256 100644
--- a/cache.h
+++ b/cache.h
@@ -549,6 +549,7 @@ extern int read_replace_refs;
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
+extern int core_refilter_on_add;
 
 enum safe_crlf {
 	SAFE_CRLF_FALSE = 0,
diff --git a/config.c b/config.c
index 6963fbe..b1db505 100644
--- a/config.c
+++ b/config.c
@@ -523,6 +523,11 @@ static int git_default_core_config(const char *var, const char *value)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.refilteronadd")) {
+		core_refilter_on_add = git_config_bool(var, value);
+		return 0;
+	}
+
 	/* Add other config variables here and to Documentation/config.txt. */
 	return 0;
 }
diff --git a/environment.c b/environment.c
index 876c5e5..25e1e47 100644
--- a/environment.c
+++ b/environment.c
@@ -52,6 +52,7 @@ enum object_creation_mode object_creation_mode = OBJECT_CREATION_MODE;
 char *notes_ref_name;
 int grafts_replace_parents = 1;
 int core_apply_sparse_checkout;
+int core_refilter_on_add;
 
 /* Parallel index stat data preload? */
 int core_preload_index = 0;
diff --git a/sha1_file.c b/sha1_file.c
index 992b624..d7cc3ad 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2467,6 +2467,54 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 	return ret;
 }
 
+static int refilter_fd(int fd, struct stat *st, const char *path)
+{
+	int ret = -1;
+	size_t size = xsize_t(st->st_size);
+	struct strbuf gbuf = STRBUF_INIT;
+
+	if (!S_ISREG(st->st_mode)) {
+		struct strbuf sbuf = STRBUF_INIT;
+		if (strbuf_read(&sbuf, fd, 4096) >= 0)
+			ret = convert_to_git(path, sbuf.buf, sbuf.len, &gbuf, safe_crlf, 0);
+		else
+			ret = -1;
+		strbuf_release(&sbuf);
+	} else if (size) {
+		void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
+		ret = convert_to_git(path, buf, size, &gbuf, safe_crlf, 0);
+		munmap(buf, size);
+	} else
+		ret = -1;
+
+	if (ret > 0) {
+		/* Something happened during conversion to git.
+		 * Now convert it back, and save the result.
+		 */
+		struct strbuf obuf = STRBUF_INIT;
+
+		lseek(fd, 0, SEEK_SET);
+
+		if (convert_to_working_tree(path, gbuf.buf, gbuf.len, &obuf)) {
+			if (write_or_whine(fd, obuf.buf, obuf.len, path))
+				ftruncate(fd, obuf.len);
+			else
+				ret = -1;
+		} else {
+			if (write_or_whine(fd, gbuf.buf, gbuf.len, path))
+				ftruncate(fd, gbuf.len);
+			else
+				ret = -1;
+		}
+
+		strbuf_release(&obuf);
+	}
+	strbuf_release(&gbuf);
+
+	close(fd);
+	return ret;
+}
+
 int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object)
 {
 	int fd;
@@ -2481,6 +2529,15 @@ int index_path(unsigned char *sha1, const char *path, struct stat *st, int write
 		if (index_fd(sha1, fd, st, write_object, OBJ_BLOB, path) < 0)
 			return error("%s: failed to insert into database",
 				     path);
+		if (write_object && core_refilter_on_add) {
+			fd = open(path, O_RDWR);
+			if (fd < 0)
+				return error("open(\"%s\"): %s", path,
+					     strerror(errno));
+			if (refilter_fd(fd, st, path) < 0)
+				return error("%s: failed to refilter file",
+					     path);
+		}
 		break;
 	case S_IFLNK:
 		if (strbuf_readlink(&sb, path, st->st_size)) {
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 57812b6..9d98072 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -114,4 +114,39 @@ test_expect_success keywords_not_modified '
 	test "x`git status --porcelain -- expanded-keywords2`" = x
 '
 
+# Check that keywords are expanded on add when refilter is enabled.
+test_expect_success expanded_on_add '
+	git config --add core.refilteronadd true
+
+	echo "expanded-keywords3 ident" >> .gitattributes &&
+
+	{
+		echo "File with keyword"
+		echo "\$Id\$"
+	} > expanded-keywords3 &&
+
+	{
+		echo "File with keyword"
+		echo "\$Id: 0661580d6f976fe7cc1e4512f8db3f2ddb8d93cc \$"
+	} > expected-output3 &&
+
+	git add expanded-keywords3 &&
+
+	cat expanded-keywords3 &&
+	cmp expanded-keywords3 expected-output3
+'
+
+# Check that keywords are expanded on commit when refilter is enabled.
+test_expect_success expanded_on_commit '
+	{
+		echo "File with keyword"
+		echo "\$Id\$"
+	} > expanded-keywords3 &&
+
+	git commit -m "File with keyword" expanded-keywords3 &&
+
+	cat expanded-keywords3 &&
+	cmp expanded-keywords3 expected-output3
+'
+
 test_done
-- 
1.6.4.122.g6ffd7

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] convert: Safer handling of $Id$ contraction.
  2010-03-15 15:30 ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
  2010-03-15 15:30   ` [PATCH 2/4] convert: Keep foreign $Id$ on checkout Henrik Grubbström (Grubba)
@ 2010-03-15 15:39   ` Bert Wesarg
  2010-03-15 18:21     ` Henrik Grubbström
  2010-03-15 18:16   ` René Scharfe
  2 siblings, 1 reply; 14+ messages in thread
From: Bert Wesarg @ 2010-03-15 15:39 UTC (permalink / raw
  To: Henrik Grubbström (Grubba); +Cc: git

2010/3/15 Henrik Grubbström (Grubba) <grubba@grubba.org>:
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index 6cb8d60..a21a8d2 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -65,17 +65,21 @@ test_expect_success expanded_in_repo '
>                echo "\$Id:NoSpaceAtFront \$"
>                echo "\$Id:NoSpaceAtEitherEnd\$"
>                echo "\$Id: NoTerminatingSymbol"
> +               echo "\$Id: Foreign Commit With Spaces $"

Missing \ for second $.

> +               echo "\$Id: NoTerminatingSymbolAtEOF"
>        } > expanded-keywords &&
>
>        {
>                echo "File with expanded keywords"
> -               echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
> -               echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
> -               echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
> -               echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
> -               echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
> -               echo "\$Id: 4f21723e7b15065df7de95bd46c8ba6fb1818f4c \$"
> +               echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
> +               echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
> +               echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
> +               echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
> +               echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
> +               echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 \$"
>                echo "\$Id: NoTerminatingSymbol"
> +               echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 $"

Ditto.

> +               echo "\$Id: NoTerminatingSymbolAtEOF"
>        } > expected-output &&
>
>        git add expanded-keywords &&

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during  stats.
  2010-03-15 15:30     ` [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats Henrik Grubbström (Grubba)
  2010-03-15 15:30       ` [PATCH 4/4] convert: Added core.refilteronadd feature Henrik Grubbström (Grubba)
@ 2010-03-15 15:42       ` Bert Wesarg
  2010-03-15 18:32         ` Henrik Grubbström
  1 sibling, 1 reply; 14+ messages in thread
From: Bert Wesarg @ 2010-03-15 15:42 UTC (permalink / raw
  To: Henrik Grubbström (Grubba); +Cc: git

2010/3/15 Henrik Grubbström (Grubba) <grubba@grubba.org>:
> diff --git a/builtin/apply.c b/builtin/apply.c
> index 3af4ae0..25adef8 100644
> --- a/builtin/apply.c
> +++ b/builtin/apply.c
> @@ -1759,7 +1759,7 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
>        case S_IFREG:
>                if (strbuf_read_file(buf, path, st->st_size) != st->st_size)
>                        return error("unable to open or read %s", path);
> -               convert_to_git(path, buf->buf, buf->len, buf, 0);
> +               convert_to_git(path, buf->buf, buf->len, buf, 0, 0);

So this new 0 should be ...

>                return 0;
>        default:
>                return -1;
> diff --git a/builtin/blame.c b/builtin/blame.c
> index fc15863..16f7f00 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -2050,7 +2050,7 @@ static struct commit *fake_working_tree_commit(const char *path, const char *con
>                if (strbuf_read(&buf, 0, 0) < 0)
>                        die_errno("failed to read from stdin");
>        }
> -       convert_to_git(path, buf.buf, buf.len, &buf, 0);
> +       convert_to_git(path, buf.buf, buf.len, &buf, 0, 0);

... (and this one too) ...

>        origin->file.ptr = buf.buf;
>        origin->file.size = buf.len;
>        pretend_sha1_file(buf.buf, buf.len, OBJ_BLOB, origin->blob_sha1);
> diff --git a/cache.h b/cache.h
> index 89f6a40..b62c462 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -556,6 +556,11 @@ enum safe_crlf {
>        SAFE_CRLF_WARN = 2,
>  };
>
> +enum ident_mode {
> +       IDENT_MODE_FALSE = 0,

... this?

> +       IDENT_MODE_KEEP_FOREIGN = 1,
> +};
> +
>  extern enum safe_crlf safe_crlf;
>
>  enum branch_track {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] convert: Safer handling of $Id$ contraction.
  2010-03-15 15:30 ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
  2010-03-15 15:30   ` [PATCH 2/4] convert: Keep foreign $Id$ on checkout Henrik Grubbström (Grubba)
  2010-03-15 15:39   ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Bert Wesarg
@ 2010-03-15 18:16   ` René Scharfe
  2 siblings, 0 replies; 14+ messages in thread
From: René Scharfe @ 2010-03-15 18:16 UTC (permalink / raw
  To: "Henrik Grubbström (Grubba)"; +Cc: git

Am 15.03.2010 16:30, schrieb Henrik Grubbström (Grubba):
> The code to contract $Id:xxxxx$ strings could eat an arbitrary amount
> of source text if the terminating $ was lost. It now refuses to
> contract $Id:xxxxx$ strings spanning multiple lines.
> 
> Signed-off-by: Henrik Grubbström <grubba@grubba.org>
> ---
> The behaviour implemented by the patch is in line with what other
> VCSes that implement $Id$ do.
> 
>  convert.c             |   17 +++++++++++++++--
>  t/t0021-conversion.sh |   16 ++++++++++------
>  2 files changed, 25 insertions(+), 8 deletions(-)
> 
> diff --git a/convert.c b/convert.c
> index 4f8fcb7..91207ab 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -425,6 +425,7 @@ static int count_ident(const char *cp, unsigned long size)
>  				cnt++;
>  				break;
>  			}
> +			if (ch == '\n') break;

Style:

			if (ch == '\n')
				break;

>  		}
>  	}
>  	return cnt;
> @@ -433,7 +434,7 @@ static int count_ident(const char *cp, unsigned long size)
>  static int ident_to_git(const char *path, const char *src, size_t len,
>                          struct strbuf *buf, int ident)
>  {
> -	char *dst, *dollar;
> +	char *dst, *dollar, *nl;
>  
>  	if (!ident || !count_ident(src, len))
>  		return 0;
> @@ -455,6 +456,12 @@ static int ident_to_git(const char *path, const char *src, size_t len,
>  			dollar = memchr(src + 3, '$', len - 3);
>  			if (!dollar)
>  				break;
> +			nl = memchr(src + 3, '\n', len - 3);
> +			if (nl && nl < dollar) {
> +				/* Line break before the next dollar. */
> +				continue;
> +			}
> +

You only need to search up to the previously found dollar sign here:

			if (memchr(src + 3, '\n', dollar - src - 3))
				continue;

>  			memcpy(dst, "Id$", 3);
>  			dst += 3;
>  			len -= dollar + 1 - src;
> @@ -470,7 +477,7 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
>                               struct strbuf *buf, int ident)
>  {
>  	unsigned char sha1[20];
> -	char *to_free = NULL, *dollar;
> +	char *to_free = NULL, *dollar, *nl;
>  	int cnt;
>  
>  	if (!ident)
> @@ -514,6 +521,12 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
>  				break;
>  			}
>  
> +			nl = memchr(src + 3, '\n', len - 3);
> +			if (nl && nl < dollar) {
> +				/* Line break before the next dollar. */
> +				continue;
> +			}
> +

Ditto.

René

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] convert: Safer handling of $Id$ contraction.
  2010-03-15 15:39   ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Bert Wesarg
@ 2010-03-15 18:21     ` Henrik Grubbström
  0 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström @ 2010-03-15 18:21 UTC (permalink / raw
  To: Bert Wesarg; +Cc: git, Henrik Grubbström

[-- Attachment #1: Type: TEXT/PLAIN, Size: 548 bytes --]

On Mon, 15 Mar 2010, Bert Wesarg wrote:

> 2010/3/15 Henrik Grubbström (Grubba) <grubba@grubba.org>:
>> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
[...]
>> +               echo "\$Id: Foreign Commit With Spaces $"
>
> Missing \ for second $.
[...]
>> +               echo "\$Id: fd0478f5f1486f3d5177d4c3f6eb2765e8fc56b9 $"
>
> Ditto.

Thanks. It seems like bash and /bin/sh on Solaris don't care, but it's 
probably not a portable behaviour.

--
Henrik Grubbström					grubba@grubba.org
Roxen Internet Software AB				grubba@roxen.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats.
  2010-03-15 15:42       ` [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats Bert Wesarg
@ 2010-03-15 18:32         ` Henrik Grubbström
  0 siblings, 0 replies; 14+ messages in thread
From: Henrik Grubbström @ 2010-03-15 18:32 UTC (permalink / raw
  To: Bert Wesarg; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 930 bytes --]

On Mon, 15 Mar 2010, Bert Wesarg wrote:

> 2010/3/15 Henrik Grubbström (Grubba) <grubba@grubba.org>:
>> diff --git a/builtin/apply.c b/builtin/apply.c
>> index 3af4ae0..25adef8 100644
>> --- a/builtin/apply.c
>> +++ b/builtin/apply.c
[...]
>> -               convert_to_git(path, buf->buf, buf->len, buf, 0);
>> +               convert_to_git(path, buf->buf, buf->len, buf, 0, 0);
>
> So this new 0 should be ...
>
[...]
>> @@ -556,6 +556,11 @@ enum safe_crlf {
>>        SAFE_CRLF_WARN = 2,
>>  };
>>
>> +enum ident_mode {
>> +       IDENT_MODE_FALSE = 0,
>
> ... this?

Correct. I followed the style for the preceeding parameter (ie the other 
zero), which maps to enum safe_crlf SAFE_CRLF_FALSE. I agree that using 
the enum constants here instead of the plain zeros could perhaps be more 
readable, but that should be a separate patch.

--
Henrik Grubbström					grubba@grubba.org
Roxen Internet Software AB				grubba@roxen.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-03-15 18:32 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-15 15:30 [PATCH 0/4] ident attribute related patches (resend w/ testsuite) Henrik Grubbström (Grubba)
2010-03-15 15:30 ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Henrik Grubbström (Grubba)
2010-03-15 15:30   ` [PATCH 2/4] convert: Keep foreign $Id$ on checkout Henrik Grubbström (Grubba)
2010-03-15 15:30     ` [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats Henrik Grubbström (Grubba)
2010-03-15 15:30       ` [PATCH 4/4] convert: Added core.refilteronadd feature Henrik Grubbström (Grubba)
2010-03-15 15:42       ` [PATCH 3/4] convert: Inhibit contraction of foreign $Id$ during stats Bert Wesarg
2010-03-15 18:32         ` Henrik Grubbström
2010-03-15 15:39   ` [PATCH 1/4] convert: Safer handling of $Id$ contraction Bert Wesarg
2010-03-15 18:21     ` Henrik Grubbström
2010-03-15 18:16   ` René Scharfe
  -- strict thread matches above, loose matches on Subject: below --
2010-03-01 16:16 Henrik Grubbström
2010-03-03  1:10 ` Junio C Hamano
2010-03-03 10:40   ` Henrik Grubbström
2010-03-09  9:22     ` Henrik Grubbström

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).