git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com>
@ 2012-05-10 19:06 Angus Hammond
  2012-05-10 19:06 ` [PATCH 2/2] Remove diagnostics section from commit-tree and var man pages New error messages shouldn't need explaining like the old ones did so just delete the diagnostics section of the man pages. " Angus Hammond
                   ` (3 more replies)
  0 siblings, 4 replies; 68+ messages in thread
From: Angus Hammond @ 2012-05-10 19:06 UTC (permalink / raw)
  To: git; +Cc: Angus Hammond

---
 ident.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/ident.c b/ident.c
index 87c697c..51a7a73 100644
--- a/ident.c
+++ b/ident.c
@@ -46,7 +46,7 @@ static void copy_gecos(const struct passwd *w, char *name, size_t sz)
 	if (len < sz)
 		name[len] = 0;
 	else
-		die("Your parents must have hated you!");
+		die("Your GECOS field is too long.");
 
 }
 
@@ -106,7 +106,7 @@ static void copy_email(const struct passwd *pw)
 	 */
 	size_t len = strlen(pw->pw_name);
 	if (len > sizeof(git_default_email)/2)
-		die("Your sysadmin must hate you!");
+		die("Your name field in is too long.");
 	memcpy(git_default_email, pw->pw_name, len);
 	git_default_email[len++] = '@';
 
@@ -125,7 +125,7 @@ static void setup_ident(const char **name, const char **emailp)
 	if (!*name && !git_default_name[0]) {
 		pw = getpwuid(getuid());
 		if (!pw)
-			die("You don't exist. Go away!");
+			die("Could not read your GECOS field.");
 		copy_gecos(pw, git_default_name, sizeof(git_default_name));
 	}
 	if (!*name)
@@ -142,7 +142,7 @@ static void setup_ident(const char **name, const char **emailp)
 			if (!pw)
 				pw = getpwuid(getuid());
 			if (!pw)
-				die("You don't exist. Go away!");
+				die("Could not read your GECOS field.");
 			copy_email(pw);
 		}
 	}
@@ -325,7 +325,7 @@ const char *fmt_ident(const char *name, const char *email,
 			die("empty ident %s <%s> not allowed", name, email);
 		pw = getpwuid(getuid());
 		if (!pw)
-			die("You don't exist. Go away!");
+			die("Could not read your GECOS field.");
 		strlcpy(git_default_name, pw->pw_name,
 			sizeof(git_default_name));
 		name = git_default_name;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 2/2] Remove diagnostics section from commit-tree and var man pages New error messages shouldn't need explaining like the old ones did so just delete the diagnostics section of the man pages. Signed-off-by: Angus Hammond <angusgh@gmail.com>
  2012-05-10 19:06 [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Angus Hammond
@ 2012-05-10 19:06 ` Angus Hammond
  2012-05-10 19:21   ` Angus Hammond
  2012-05-10 19:23 ` [PATCH 1/2] Change error messages in ident.c Jeff King
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 68+ messages in thread
From: Angus Hammond @ 2012-05-10 19:06 UTC (permalink / raw)
  To: git; +Cc: Angus Hammond

---
 Documentation/git-commit-tree.txt |    9 ---------
 Documentation/git-var.txt         |    9 ---------
 2 files changed, 18 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index cfb9906..eb8ee99 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -88,15 +88,6 @@ for one to be entered and terminated with ^D.
 
 include::date-formats.txt[]
 
-Diagnostics
------------
-You don't exist. Go away!::
-    The passwd(5) gecos field couldn't be read
-Your parents must have hated you!::
-    The passwd(5) gecos field is longer than a giant static buffer.
-Your sysadmin must hate you!::
-    The passwd(5) name field is longer than a giant static buffer.
-
 Discussion
 ----------
 
diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 988a323..67edf58 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -59,15 +59,6 @@ ifdef::git-default-pager[]
     The build you are using chose '{git-default-pager}' as the default.
 endif::git-default-pager[]
 
-Diagnostics
------------
-You don't exist. Go away!::
-    The passwd(5) gecos field couldn't be read
-Your parents must have hated you!::
-    The passwd(5) gecos field is longer than a giant static buffer.
-Your sysadmin must hate you!::
-    The passwd(5) name field is longer than a giant static buffer.
-
 SEE ALSO
 --------
 linkgit:git-commit-tree[1]
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/2] Remove diagnostics section from commit-tree and var man pages New error messages shouldn't need explaining like the old ones did so just delete the diagnostics section of the man pages. Signed-off-by: Angus Hammond <angusgh@gmail.com>
  2012-05-10 19:06 ` [PATCH 2/2] Remove diagnostics section from commit-tree and var man pages New error messages shouldn't need explaining like the old ones did so just delete the diagnostics section of the man pages. " Angus Hammond
@ 2012-05-10 19:21   ` Angus Hammond
  0 siblings, 0 replies; 68+ messages in thread
From: Angus Hammond @ 2012-05-10 19:21 UTC (permalink / raw)
  To: git; +Cc: Angus Hammond

I have no idea how I managed to send this to myself and see it as one,
properly formatted email despite it being 2 formatted diabolically with
entire commits messages in the subjects. Sorry about that.
These were meant to offer an alternative solution to the current issue over
unusual error messages from commit-tree by bypassing the whole unix humour
issue. They look it'll still be possible to apply them even if they are
badly sent. If not and people think they're worth while I'd be happy (try
and) send them again without this mess.
Thanks
Angus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c...
  2012-05-10 19:06 [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Angus Hammond
  2012-05-10 19:06 ` [PATCH 2/2] Remove diagnostics section from commit-tree and var man pages New error messages shouldn't need explaining like the old ones did so just delete the diagnostics section of the man pages. " Angus Hammond
@ 2012-05-10 19:23 ` Jeff King
  2012-05-10 19:56   ` Jeff King
  2012-05-10 20:04   ` [PATCH 1/2] Change error messages in ident.c Junio C Hamano
  2012-05-10 19:43 ` [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Junio C Hamano
  2012-05-11 11:35 ` Nguyen Thai Ngoc Duy
  3 siblings, 2 replies; 68+ messages in thread
From: Jeff King @ 2012-05-10 19:23 UTC (permalink / raw)
  To: Angus Hammond; +Cc: git

On Thu, May 10, 2012 at 08:06:09PM +0100, Angus Hammond wrote:

> Subject: Re: [PATCH 1/2] Change error messages in ident.c Make error messages
>  caused by failed reads of the /etc/passwd file easier to understand.
>  Signed-off-by: Angus Hammond <angusgh@gmail.com>

Holy line-breaks, Batman!

As amusing as I find the existing messages, this is probably a good
direction (although I find it unlikely that most people would see the
messages under normal use).

I am also tempted to suggest that we simply replace the static buffers
with dynamic strbufs. I guess that may open up new vectors for an
attacker to convince git to allocate arbitrary amounts of memory, but
that is already pretty easy to do, so I doubt it's a big deal.

> @@ -46,7 +46,7 @@ static void copy_gecos(const struct passwd *w, char *name, size_t sz)
>  	if (len < sz)
>  		name[len] = 0;
>  	else
> -		die("Your parents must have hated you!");
> +		die("Your GECOS field is too long.");

I know that "GECOS" is the standard name for the field, but I wonder if
it is a bit unnecessarily jargon-y. Wouldn't something like:

  die("unable to get real name from system password file: name too long");

be a little more friendly? It tells what operation we were actually
performing, and it doesn't use any jargon.

> @@ -106,7 +106,7 @@ static void copy_email(const struct passwd *pw)
>  	 */
>  	size_t len = strlen(pw->pw_name);
>  	if (len > sizeof(git_default_email)/2)
> -		die("Your sysadmin must hate you!");
> +		die("Your name field in is too long.");

s/in is/is/. Also, similar complaints to above (if you see this message
unexpectedly, you might ask "which name field? One inside a commit
object?").

> [...]

And similar comments for the rest of the messages.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com>
  2012-05-10 19:06 [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Angus Hammond
  2012-05-10 19:06 ` [PATCH 2/2] Remove diagnostics section from commit-tree and var man pages New error messages shouldn't need explaining like the old ones did so just delete the diagnostics section of the man pages. " Angus Hammond
  2012-05-10 19:23 ` [PATCH 1/2] Change error messages in ident.c Jeff King
@ 2012-05-10 19:43 ` Junio C Hamano
  2012-05-10 19:57   ` Angus Hammond
  2012-05-11 11:35 ` Nguyen Thai Ngoc Duy
  3 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-10 19:43 UTC (permalink / raw)
  To: Angus Hammond; +Cc: git

They are one of the oldest and humorous messages we have in the system,
and more importantly, users will see them only once on a badly configured
system.  If there is no real-life reason (e.g. "if we do not change this
message, Nuclear reactors will start misbehaving"), I would rather keep
them as they are for hysterical raisins.

But that is just my preference to keep Linus's twisted sense of humor.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c...
  2012-05-10 19:23 ` [PATCH 1/2] Change error messages in ident.c Jeff King
@ 2012-05-10 19:56   ` Jeff King
  2012-05-11 22:53     ` Junio C Hamano
  2012-05-10 20:04   ` [PATCH 1/2] Change error messages in ident.c Junio C Hamano
  1 sibling, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-10 19:56 UTC (permalink / raw)
  To: Angus Hammond; +Cc: git

On Thu, May 10, 2012 at 03:23:39PM -0400, Jeff King wrote:

> I am also tempted to suggest that we simply replace the static buffers
> with dynamic strbufs. I guess that may open up new vectors for an
> attacker to convince git to allocate arbitrary amounts of memory, but
> that is already pretty easy to do, so I doubt it's a big deal.

For reference, that patch would look like something like this:

---
 builtin/fmt-merge-msg.c | 14 ++++----
 cache.h                 |  5 ++-
 config.c                |  4 +--
 environment.c           |  4 +--
 http-push.c             |  2 +-
 ident.c                 | 94 ++++++++++++++++++-------------------------------
 6 files changed, 50 insertions(+), 73 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index a517f17..bb716c8 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -230,7 +230,8 @@ static void add_branch_desc(struct strbuf *out, const char *name)
 static void record_person(int which, struct string_list *people,
 			  struct commit *commit)
 {
-	char name_buf[MAX_GITNAME], *name, *name_end;
+	struct strbuf name_buf = STRBUF_INIT;
+	char *name, *name_end;
 	struct string_list_item *elem;
 	const char *field = (which == 'a') ? "\nauthor " : "\ncommitter ";
 
@@ -243,17 +244,18 @@ static void record_person(int which, struct string_list *people,
 		name_end--;
 	while (isspace(*name_end) && name <= name_end)
 		name_end--;
-	if (name_end < name || name + MAX_GITNAME <= name_end)
+	if (name_end < name)
 		return;
-	memcpy(name_buf, name, name_end - name + 1);
-	name_buf[name_end - name + 1] = '\0';
+	strbuf_add(&name_buf, name, name_end - name + 1);
 
-	elem = string_list_lookup(people, name_buf);
+	elem = string_list_lookup(people, name_buf.buf);
 	if (!elem) {
-		elem = string_list_insert(people, name_buf);
+		elem = string_list_insert(people, name_buf.buf);
 		elem->util = (void *)0;
 	}
 	elem->util = (void*)(util_as_integral(elem) + 1);
+
+	strbuf_release(&name_buf);
 }
 
 static int cmp_string_list_util_as_integral(const void *a_, const void *b_)
diff --git a/cache.h b/cache.h
index e14ffcd..0c1a332 100644
--- a/cache.h
+++ b/cache.h
@@ -1138,9 +1138,8 @@ struct config_include_data {
 #define CONFIG_INCLUDE_INIT { 0 }
 extern int git_config_include(const char *name, const char *value, void *data);
 
-#define MAX_GITNAME (1000)
-extern char git_default_email[MAX_GITNAME];
-extern char git_default_name[MAX_GITNAME];
+extern struct strbuf git_default_email;
+extern struct strbuf git_default_name;
 #define IDENT_NAME_GIVEN 01
 #define IDENT_MAIL_GIVEN 02
 #define IDENT_ALL_GIVEN (IDENT_NAME_GIVEN|IDENT_MAIL_GIVEN)
diff --git a/config.c b/config.c
index eeee986..69cb08c 100644
--- a/config.c
+++ b/config.c
@@ -767,7 +767,7 @@ static int git_default_user_config(const char *var, const char *value)
 	if (!strcmp(var, "user.name")) {
 		if (!value)
 			return config_error_nonbool(var);
-		strlcpy(git_default_name, value, sizeof(git_default_name));
+		strbuf_addstr(&git_default_name, value);
 		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
 		return 0;
 	}
@@ -775,7 +775,7 @@ static int git_default_user_config(const char *var, const char *value)
 	if (!strcmp(var, "user.email")) {
 		if (!value)
 			return config_error_nonbool(var);
-		strlcpy(git_default_email, value, sizeof(git_default_email));
+		strbuf_addstr(&git_default_email, value);
 		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		return 0;
 	}
diff --git a/environment.c b/environment.c
index d7e6c65..f4e3b53 100644
--- a/environment.c
+++ b/environment.c
@@ -11,8 +11,8 @@
 #include "refs.h"
 #include "fmt-merge-msg.h"
 
-char git_default_email[MAX_GITNAME];
-char git_default_name[MAX_GITNAME];
+struct strbuf git_default_email = STRBUF_INIT;
+struct strbuf git_default_name = STRBUF_INIT;
 int user_ident_explicitly_given;
 int trust_executable_bit = 1;
 int trust_ctime = 1;
diff --git a/http-push.c b/http-push.c
index 1df7ab5..2362ffd 100644
--- a/http-push.c
+++ b/http-push.c
@@ -904,7 +904,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 		ep = strchr(ep + 1, '/');
 	}
 
-	escaped = xml_entities(git_default_email);
+	escaped = xml_entities(git_default_email.buf);
 	strbuf_addf(&out_buffer.buf, LOCK_REQUEST, escaped);
 	free(escaped);
 
diff --git a/ident.c b/ident.c
index 87c697c..c7bdb3f 100644
--- a/ident.c
+++ b/ident.c
@@ -15,42 +15,27 @@ static char git_default_date[50];
 #define get_gecos(struct_passwd) ((struct_passwd)->pw_gecos)
 #endif
 
-static void copy_gecos(const struct passwd *w, char *name, size_t sz)
+static void copy_gecos(const struct passwd *w, struct strbuf *name)
 {
-	char *src, *dst;
-	size_t len, nlen;
-
-	nlen = strlen(w->pw_name);
+	char *src;
 
 	/* Traditionally GECOS field had office phone numbers etc, separated
 	 * with commas.  Also & stands for capitalized form of the login name.
 	 */
 
-	for (len = 0, dst = name, src = get_gecos(w); len < sz; src++) {
+	for (src = get_gecos(w); *src && *src != ','; src++) {
 		int ch = *src;
-		if (ch != '&') {
-			*dst++ = ch;
-			if (ch == 0 || ch == ',')
-				break;
-			len++;
-			continue;
-		}
-		if (len + nlen < sz) {
+		if (ch != '&')
+			strbuf_addch(name, ch);
+		else {
 			/* Sorry, Mr. McDonald... */
-			*dst++ = toupper(*w->pw_name);
-			memcpy(dst, w->pw_name + 1, nlen - 1);
-			dst += nlen - 1;
-			len += nlen;
+			strbuf_addch(name, toupper(*w->pw_name));
+			strbuf_addstr(name, w->pw_name + 1);
 		}
 	}
-	if (len < sz)
-		name[len] = 0;
-	else
-		die("Your parents must have hated you!");
-
 }
 
-static int add_mailname_host(char *buf, size_t len)
+static int add_mailname_host(struct strbuf *buf)
 {
 	FILE *mailname;
 
@@ -61,7 +46,7 @@ static int add_mailname_host(char *buf, size_t len)
 				strerror(errno));
 		return -1;
 	}
-	if (!fgets(buf, len, mailname)) {
+	if (strbuf_getline(buf, mailname, '\n') == EOF) {
 		if (ferror(mailname))
 			warning("cannot read /etc/mailname: %s",
 				strerror(errno));
@@ -73,48 +58,41 @@ static int add_mailname_host(char *buf, size_t len)
 	return 0;
 }
 
-static void add_domainname(char *buf, size_t len)
+static void add_domainname(struct strbuf *out)
 {
+	char buf[1024];
 	struct hostent *he;
-	size_t namelen;
 	const char *domainname;
 
-	if (gethostname(buf, len)) {
+	if (gethostname(buf, sizeof(buf))) {
 		warning("cannot get host name: %s", strerror(errno));
-		strlcpy(buf, "(none)", len);
+		strbuf_addstr(out, "(none)");
 		return;
 	}
-	namelen = strlen(buf);
-	if (memchr(buf, '.', namelen))
+	strbuf_addstr(out, buf);
+	if (strchr(buf, '.'))
 		return;
 
 	he = gethostbyname(buf);
-	buf[namelen++] = '.';
-	buf += namelen;
-	len -= namelen;
+	strbuf_addch(out, '.');
 	if (he && (domainname = strchr(he->h_name, '.')))
-		strlcpy(buf, domainname + 1, len);
+		strbuf_addstr(out, domainname + 1);
 	else
-		strlcpy(buf, "(none)", len);
+		strbuf_addstr(out, "(none)");
 }
 
-static void copy_email(const struct passwd *pw)
+static void copy_email(const struct passwd *pw, struct strbuf *email)
 {
 	/*
 	 * Make up a fake email address
 	 * (name + '@' + hostname [+ '.' + domainname])
 	 */
-	size_t len = strlen(pw->pw_name);
-	if (len > sizeof(git_default_email)/2)
-		die("Your sysadmin must hate you!");
-	memcpy(git_default_email, pw->pw_name, len);
-	git_default_email[len++] = '@';
-
-	if (!add_mailname_host(git_default_email + len,
-				sizeof(git_default_email) - len))
+	strbuf_addstr(email, pw->pw_name);
+	strbuf_addch(email, '@');
+
+	if (!add_mailname_host(email))
 		return;	/* read from "/etc/mailname" (Debian) */
-	add_domainname(git_default_email + len,
-			sizeof(git_default_email) - len);
+	add_domainname(email);
 }
 
 static void setup_ident(const char **name, const char **emailp)
@@ -122,32 +100,31 @@ static void setup_ident(const char **name, const char **emailp)
 	struct passwd *pw = NULL;
 
 	/* Get the name ("gecos") */
-	if (!*name && !git_default_name[0]) {
+	if (!*name && !git_default_name.len) {
 		pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-		copy_gecos(pw, git_default_name, sizeof(git_default_name));
+		copy_gecos(pw, &git_default_name);
 	}
 	if (!*name)
-		*name = git_default_name;
+		*name = git_default_name.buf;
 
-	if (!*emailp && !git_default_email[0]) {
+	if (!*emailp && !git_default_email.len) {
 		const char *email = getenv("EMAIL");
 
 		if (email && email[0]) {
-			strlcpy(git_default_email, email,
-				sizeof(git_default_email));
+			strbuf_addstr(&git_default_email, email);
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		} else {
 			if (!pw)
 				pw = getpwuid(getuid());
 			if (!pw)
 				die("You don't exist. Go away!");
-			copy_email(pw);
+			copy_email(pw, &git_default_email);
 		}
 	}
 	if (!*emailp)
-		*emailp = git_default_email;
+		*emailp = git_default_email.buf;
 
 	/* And set the default date */
 	if (!git_default_date[0])
@@ -317,7 +294,7 @@ const char *fmt_ident(const char *name, const char *email,
 		struct passwd *pw;
 
 		if ((warn_on_no_name || error_on_no_name) &&
-		    name == git_default_name && env_hint) {
+		    name == git_default_name.buf && env_hint) {
 			fputs(env_hint, stderr);
 			env_hint = NULL; /* warn only once */
 		}
@@ -326,9 +303,8 @@ const char *fmt_ident(const char *name, const char *email,
 		pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-		strlcpy(git_default_name, pw->pw_name,
-			sizeof(git_default_name));
-		name = git_default_name;
+		strbuf_addstr(&git_default_name, pw->pw_name);
+		name = git_default_name.buf;
 	}
 
 	strcpy(date, git_default_date);

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com>
  2012-05-10 19:43 ` [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Junio C Hamano
@ 2012-05-10 19:57   ` Angus Hammond
  0 siblings, 0 replies; 68+ messages in thread
From: Angus Hammond @ 2012-05-10 19:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 10 May 2012 20:43, Junio C Hamano <gitster@pobox.com> wrote:
> They are one of the oldest and humorous messages we have in the system,
> and more importantly, users will see them only once on a badly configured
> system.  If there is no real-life reason (e.g. "if we do not change this
> message, Nuclear reactors will start misbehaving"), I would rather keep
> them as they are for hysterical raisins.
I'm not too worried either way, just tried to knock the patch out
quickly because it came up and this seemed like the logical solution.
In all honesty though, whilst I don't have a problem with unix humour
being in git, I do have a bit of a problem with it being in error
messages since when these are displayed it means that a users system
is preventing them from using git for whatever reason, and at those
times there's a good chance you're worried about fixing that problem,
not laughing at a joke made by Linus several years ago.

Just my 2 cents. It's probably not worth too much bother since it'll
only ever show up very rarely.
Thanks
Angus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c...
  2012-05-10 19:23 ` [PATCH 1/2] Change error messages in ident.c Jeff King
  2012-05-10 19:56   ` Jeff King
@ 2012-05-10 20:04   ` Junio C Hamano
  2012-05-10 20:22     ` Jeff King
  1 sibling, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-10 20:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> I am also tempted to suggest that we simply replace the static buffers
> with dynamic strbufs.

Yeah, I think that is a proper approach for this issue, as it will make
two of these messages unnecessary (or all?  I couldn't think of a way
to deal with missing getpwent case myself, though).

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c...
  2012-05-10 20:04   ` [PATCH 1/2] Change error messages in ident.c Junio C Hamano
@ 2012-05-10 20:22     ` Jeff King
  2012-05-10 20:28       ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-10 20:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Thu, May 10, 2012 at 01:04:13PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I am also tempted to suggest that we simply replace the static buffers
> > with dynamic strbufs.
> 
> Yeah, I think that is a proper approach for this issue, as it will make
> two of these messages unnecessary (or all?  I couldn't think of a way
> to deal with missing getpwent case myself, though).

It doesn't get rid of the "you don't exist" message, and I think just
dying there makes sense.  But that is actually the one that I consider
the most likely to happen in practice, and should probably have a more
useful error message.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c...
  2012-05-10 20:22     ` Jeff King
@ 2012-05-10 20:28       ` Junio C Hamano
  0 siblings, 0 replies; 68+ messages in thread
From: Junio C Hamano @ 2012-05-10 20:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> It doesn't get rid of the "you don't exist" message, and I think just
> dying there makes sense.  But that is actually the one that I consider
> the most likely to happen in practice, and should probably have a more
> useful error message.

Yeah, I do not think anybody minds losing that phrasing from that message
(the "parents" and "sysadmin" were the humorous ones), and we certainly
can phrase it differently, e.g.

    Your system didn't tell me your real name; hint: git help config
    and look for user.name

or something.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com>
  2012-05-10 19:06 [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Angus Hammond
                   ` (2 preceding siblings ...)
  2012-05-10 19:43 ` [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Junio C Hamano
@ 2012-05-11 11:35 ` Nguyen Thai Ngoc Duy
  3 siblings, 0 replies; 68+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-05-11 11:35 UTC (permalink / raw)
  To: Angus Hammond; +Cc: git

On Fri, May 11, 2012 at 2:06 AM, Angus Hammond <angusgh@gmail.com> wrote:
> ---
>  ident.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)

While you are touching this, perhaps you can also turn all die(xxx) in
this file to die(_(xxx)), same for warning()? You touch 5 out of 11
already. And it helps make sure all the new strings are in the same
humor level (aka none). _() allows the messages to be translated in
another language, by the way.

Also this on top so we get nice advice

diff --git a/ident.c b/ident.c
index 87c697c..b5a631f 100644
--- a/ident.c
+++ b/ident.c
@@ -289,7 +289,7 @@ person_only:
 }

 static const char *env_hint =
-"\n"
+N_("\n"
 "*** Please tell me who you are.\n"
 "\n"
 "Run\n"
@@ -299,7 +299,7 @@ static const char *env_hint =
 "\n"
 "to set your account\'s default identity.\n"
 "Omit --global to set the identity only in this repository.\n"
-"\n";
+"\n");

 const char *fmt_ident(const char *name, const char *email,
 		      const char *date_str, int flag)
@@ -318,7 +318,7 @@ const char *fmt_ident(const char *name, const char *email,

 		if ((warn_on_no_name || error_on_no_name) &&
 		    name == git_default_name && env_hint) {
-			fputs(env_hint, stderr);
+			fputs(_(env_hint), stderr);
 			env_hint = NULL; /* warn only once */
 		}
 		if (error_on_no_name)
-- 
Duy

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c...
  2012-05-10 19:56   ` Jeff King
@ 2012-05-11 22:53     ` Junio C Hamano
  2012-05-11 23:13       ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-11 22:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> On Thu, May 10, 2012 at 03:23:39PM -0400, Jeff King wrote:
>
>> I am also tempted to suggest that we simply replace the static buffers
>> with dynamic strbufs. I guess that may open up new vectors for an
>> attacker to convince git to allocate arbitrary amounts of memory, but
>> that is already pretty easy to do, so I doubt it's a big deal.
>
> For reference, that patch would look like something like this:

Looks quite straight-forward and readable, I would say.  Not only you gave
us a legitimate excuse to get rid of the humourous messages, you lifted
most of the artificial limitations ('domainname' limit is still there but
that is not anything new) and use of strlcpy(), the last of which is a
huge win from my point of view ;-)

>
> ---
>  builtin/fmt-merge-msg.c | 14 ++++----
>  cache.h                 |  5 ++-
>  config.c                |  4 +--
>  environment.c           |  4 +--
>  http-push.c             |  2 +-
>  ident.c                 | 94 ++++++++++++++++++-------------------------------
>  6 files changed, 50 insertions(+), 73 deletions(-)
>
> diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
> index a517f17..bb716c8 100644
> --- a/builtin/fmt-merge-msg.c
> +++ b/builtin/fmt-merge-msg.c
> @@ -230,7 +230,8 @@ static void add_branch_desc(struct strbuf *out, const char *name)
>  static void record_person(int which, struct string_list *people,
>  			  struct commit *commit)
>  {
> -	char name_buf[MAX_GITNAME], *name, *name_end;
> +	struct strbuf name_buf = STRBUF_INIT;
> +	char *name, *name_end;
>  	struct string_list_item *elem;
>  	const char *field = (which == 'a') ? "\nauthor " : "\ncommitter ";
>  
> @@ -243,17 +244,18 @@ static void record_person(int which, struct string_list *people,
>  		name_end--;
>  	while (isspace(*name_end) && name <= name_end)
>  		name_end--;
> -	if (name_end < name || name + MAX_GITNAME <= name_end)
> +	if (name_end < name)
>  		return;
> -	memcpy(name_buf, name, name_end - name + 1);
> -	name_buf[name_end - name + 1] = '\0';
> +	strbuf_add(&name_buf, name, name_end - name + 1);
>  
> -	elem = string_list_lookup(people, name_buf);
> +	elem = string_list_lookup(people, name_buf.buf);
>  	if (!elem) {
> -		elem = string_list_insert(people, name_buf);
> +		elem = string_list_insert(people, name_buf.buf);
>  		elem->util = (void *)0;
>  	}
>  	elem->util = (void*)(util_as_integral(elem) + 1);
> +
> +	strbuf_release(&name_buf);
>  }
>  
>  static int cmp_string_list_util_as_integral(const void *a_, const void *b_)
> diff --git a/cache.h b/cache.h
> index e14ffcd..0c1a332 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1138,9 +1138,8 @@ struct config_include_data {
>  #define CONFIG_INCLUDE_INIT { 0 }
>  extern int git_config_include(const char *name, const char *value, void *data);
>  
> -#define MAX_GITNAME (1000)
> -extern char git_default_email[MAX_GITNAME];
> -extern char git_default_name[MAX_GITNAME];
> +extern struct strbuf git_default_email;
> +extern struct strbuf git_default_name;
>  #define IDENT_NAME_GIVEN 01
>  #define IDENT_MAIL_GIVEN 02
>  #define IDENT_ALL_GIVEN (IDENT_NAME_GIVEN|IDENT_MAIL_GIVEN)
> diff --git a/config.c b/config.c
> index eeee986..69cb08c 100644
> --- a/config.c
> +++ b/config.c
> @@ -767,7 +767,7 @@ static int git_default_user_config(const char *var, const char *value)
>  	if (!strcmp(var, "user.name")) {
>  		if (!value)
>  			return config_error_nonbool(var);
> -		strlcpy(git_default_name, value, sizeof(git_default_name));
> +		strbuf_addstr(&git_default_name, value);
>  		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
>  		return 0;
>  	}
> @@ -775,7 +775,7 @@ static int git_default_user_config(const char *var, const char *value)
>  	if (!strcmp(var, "user.email")) {
>  		if (!value)
>  			return config_error_nonbool(var);
> -		strlcpy(git_default_email, value, sizeof(git_default_email));
> +		strbuf_addstr(&git_default_email, value);
>  		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
>  		return 0;
>  	}
> diff --git a/environment.c b/environment.c
> index d7e6c65..f4e3b53 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -11,8 +11,8 @@
>  #include "refs.h"
>  #include "fmt-merge-msg.h"
>  
> -char git_default_email[MAX_GITNAME];
> -char git_default_name[MAX_GITNAME];
> +struct strbuf git_default_email = STRBUF_INIT;
> +struct strbuf git_default_name = STRBUF_INIT;
>  int user_ident_explicitly_given;
>  int trust_executable_bit = 1;
>  int trust_ctime = 1;
> diff --git a/http-push.c b/http-push.c
> index 1df7ab5..2362ffd 100644
> --- a/http-push.c
> +++ b/http-push.c
> @@ -904,7 +904,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
>  		ep = strchr(ep + 1, '/');
>  	}
>  
> -	escaped = xml_entities(git_default_email);
> +	escaped = xml_entities(git_default_email.buf);
>  	strbuf_addf(&out_buffer.buf, LOCK_REQUEST, escaped);
>  	free(escaped);
>  
> diff --git a/ident.c b/ident.c
> index 87c697c..c7bdb3f 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -15,42 +15,27 @@ static char git_default_date[50];
>  #define get_gecos(struct_passwd) ((struct_passwd)->pw_gecos)
>  #endif
>  
> -static void copy_gecos(const struct passwd *w, char *name, size_t sz)
> +static void copy_gecos(const struct passwd *w, struct strbuf *name)
>  {
> -	char *src, *dst;
> -	size_t len, nlen;
> -
> -	nlen = strlen(w->pw_name);
> +	char *src;
>  
>  	/* Traditionally GECOS field had office phone numbers etc, separated
>  	 * with commas.  Also & stands for capitalized form of the login name.
>  	 */
>  
> -	for (len = 0, dst = name, src = get_gecos(w); len < sz; src++) {
> +	for (src = get_gecos(w); *src && *src != ','; src++) {
>  		int ch = *src;
> -		if (ch != '&') {
> -			*dst++ = ch;
> -			if (ch == 0 || ch == ',')
> -				break;
> -			len++;
> -			continue;
> -		}
> -		if (len + nlen < sz) {
> +		if (ch != '&')
> +			strbuf_addch(name, ch);
> +		else {
>  			/* Sorry, Mr. McDonald... */
> -			*dst++ = toupper(*w->pw_name);
> -			memcpy(dst, w->pw_name + 1, nlen - 1);
> -			dst += nlen - 1;
> -			len += nlen;
> +			strbuf_addch(name, toupper(*w->pw_name));
> +			strbuf_addstr(name, w->pw_name + 1);
>  		}
>  	}
> -	if (len < sz)
> -		name[len] = 0;
> -	else
> -		die("Your parents must have hated you!");
> -
>  }
>  
> -static int add_mailname_host(char *buf, size_t len)
> +static int add_mailname_host(struct strbuf *buf)
>  {
>  	FILE *mailname;
>  
> @@ -61,7 +46,7 @@ static int add_mailname_host(char *buf, size_t len)
>  				strerror(errno));
>  		return -1;
>  	}
> -	if (!fgets(buf, len, mailname)) {
> +	if (strbuf_getline(buf, mailname, '\n') == EOF) {
>  		if (ferror(mailname))
>  			warning("cannot read /etc/mailname: %s",
>  				strerror(errno));
> @@ -73,48 +58,41 @@ static int add_mailname_host(char *buf, size_t len)
>  	return 0;
>  }
>  
> -static void add_domainname(char *buf, size_t len)
> +static void add_domainname(struct strbuf *out)
>  {
> +	char buf[1024];
>  	struct hostent *he;
> -	size_t namelen;
>  	const char *domainname;
>  
> -	if (gethostname(buf, len)) {
> +	if (gethostname(buf, sizeof(buf))) {
>  		warning("cannot get host name: %s", strerror(errno));
> -		strlcpy(buf, "(none)", len);
> +		strbuf_addstr(out, "(none)");
>  		return;
>  	}
> -	namelen = strlen(buf);
> -	if (memchr(buf, '.', namelen))
> +	strbuf_addstr(out, buf);
> +	if (strchr(buf, '.'))
>  		return;
>  
>  	he = gethostbyname(buf);
> -	buf[namelen++] = '.';
> -	buf += namelen;
> -	len -= namelen;
> +	strbuf_addch(out, '.');
>  	if (he && (domainname = strchr(he->h_name, '.')))
> -		strlcpy(buf, domainname + 1, len);
> +		strbuf_addstr(out, domainname + 1);
>  	else
> -		strlcpy(buf, "(none)", len);
> +		strbuf_addstr(out, "(none)");
>  }
>  
> -static void copy_email(const struct passwd *pw)
> +static void copy_email(const struct passwd *pw, struct strbuf *email)
>  {
>  	/*
>  	 * Make up a fake email address
>  	 * (name + '@' + hostname [+ '.' + domainname])
>  	 */
> -	size_t len = strlen(pw->pw_name);
> -	if (len > sizeof(git_default_email)/2)
> -		die("Your sysadmin must hate you!");
> -	memcpy(git_default_email, pw->pw_name, len);
> -	git_default_email[len++] = '@';
> -
> -	if (!add_mailname_host(git_default_email + len,
> -				sizeof(git_default_email) - len))
> +	strbuf_addstr(email, pw->pw_name);
> +	strbuf_addch(email, '@');
> +
> +	if (!add_mailname_host(email))
>  		return;	/* read from "/etc/mailname" (Debian) */
> -	add_domainname(git_default_email + len,
> -			sizeof(git_default_email) - len);
> +	add_domainname(email);
>  }
>  
>  static void setup_ident(const char **name, const char **emailp)
> @@ -122,32 +100,31 @@ static void setup_ident(const char **name, const char **emailp)
>  	struct passwd *pw = NULL;
>  
>  	/* Get the name ("gecos") */
> -	if (!*name && !git_default_name[0]) {
> +	if (!*name && !git_default_name.len) {
>  		pw = getpwuid(getuid());
>  		if (!pw)
>  			die("You don't exist. Go away!");
> -		copy_gecos(pw, git_default_name, sizeof(git_default_name));
> +		copy_gecos(pw, &git_default_name);
>  	}
>  	if (!*name)
> -		*name = git_default_name;
> +		*name = git_default_name.buf;
>  
> -	if (!*emailp && !git_default_email[0]) {
> +	if (!*emailp && !git_default_email.len) {
>  		const char *email = getenv("EMAIL");
>  
>  		if (email && email[0]) {
> -			strlcpy(git_default_email, email,
> -				sizeof(git_default_email));
> +			strbuf_addstr(&git_default_email, email);
>  			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
>  		} else {
>  			if (!pw)
>  				pw = getpwuid(getuid());
>  			if (!pw)
>  				die("You don't exist. Go away!");
> -			copy_email(pw);
> +			copy_email(pw, &git_default_email);
>  		}
>  	}
>  	if (!*emailp)
> -		*emailp = git_default_email;
> +		*emailp = git_default_email.buf;
>  
>  	/* And set the default date */
>  	if (!git_default_date[0])
> @@ -317,7 +294,7 @@ const char *fmt_ident(const char *name, const char *email,
>  		struct passwd *pw;
>  
>  		if ((warn_on_no_name || error_on_no_name) &&
> -		    name == git_default_name && env_hint) {
> +		    name == git_default_name.buf && env_hint) {
>  			fputs(env_hint, stderr);
>  			env_hint = NULL; /* warn only once */
>  		}
> @@ -326,9 +303,8 @@ const char *fmt_ident(const char *name, const char *email,
>  		pw = getpwuid(getuid());
>  		if (!pw)
>  			die("You don't exist. Go away!");
> -		strlcpy(git_default_name, pw->pw_name,
> -			sizeof(git_default_name));
> -		name = git_default_name;
> +		strbuf_addstr(&git_default_name, pw->pw_name);
> +		name = git_default_name.buf;
>  	}
>  
>  	strcpy(date, git_default_date);

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] Change error messages in ident.c...
  2012-05-11 22:53     ` Junio C Hamano
@ 2012-05-11 23:13       ` Jeff King
  2012-05-14 16:28         ` [PATCH 1/2] drop length limitations on gecos-derived names and emails Jeff King
  2012-05-14 16:36         ` [PATCH 2/2] ident: report passwd errors with a more friendly message Jeff King
  0 siblings, 2 replies; 68+ messages in thread
From: Jeff King @ 2012-05-11 23:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Fri, May 11, 2012 at 03:53:43PM -0700, Junio C Hamano wrote:

> >> I am also tempted to suggest that we simply replace the static buffers
> >> with dynamic strbufs. I guess that may open up new vectors for an
> >> attacker to convince git to allocate arbitrary amounts of memory, but
> >> that is already pretty easy to do, so I doubt it's a big deal.
> >
> > For reference, that patch would look like something like this:
> 
> Looks quite straight-forward and readable, I would say.  Not only you gave
> us a legitimate excuse to get rid of the humourous messages, you lifted
> most of the artificial limitations ('domainname' limit is still there but
> that is not anything new) and use of strlcpy(), the last of which is a
> huge win from my point of view ;-)

Thanks. I'll re-roll with a commit message, and a follow-on patch to fix
the "you don't exist" message. But probably tomorrow, as I am just
finishing gitting for the day.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-11 23:13       ` Jeff King
@ 2012-05-14 16:28         ` Jeff King
  2012-05-14 17:05           ` Jeff King
  2012-05-14 21:02           ` Jeff King
  2012-05-14 16:36         ` [PATCH 2/2] ident: report passwd errors with a more friendly message Jeff King
  1 sibling, 2 replies; 68+ messages in thread
From: Jeff King @ 2012-05-14 16:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

When we pull the user's name from the GECOS field of the
passwd file (or generate an email address based on their
username and hostname), we put the result into a
static buffer. While it's extremely unlikely that anybody
ever hit these limits (after all, in such a case their
parents must have hated them), we still had to deal with the
error cases in our code.

Converting these static buffers to strbufs lets us simplify
the code and drop some error messages from the documentation
that have confused some users.

Note that there is still one length limitation: the
gethostname interface requires us to provide a static
buffer, so we arbitrarily choose 1024 bytes for the
hostname.

Signed-off-by: Jeff King <peff@peff.net>
---
I noticed in add_domainname that we look up the host via gethostname,
and then if it is not fully qualified, call gethostbyname and steal the
domain portion of the result, tacking it onto the hostname we got.

That seems oddly complex to me, and like it could result in a bogus
hostname if the unqualified name does not match the first part of the
returned qualified name. E.g., if the /etc/hosts file contains something
like:

  192.168.1.1 foo.example.com bar.example.com bar

(and your hostname is "bar"). I doubt it matters much in practice, and
it is outside the scope of this patch, so I left it for now.

 Documentation/git-commit-tree.txt |  4 --
 Documentation/git-var.txt         |  4 --
 builtin/fmt-merge-msg.c           | 14 +++---
 cache.h                           |  5 +--
 config.c                          |  4 +-
 environment.c                     |  4 +-
 http-push.c                       |  2 +-
 ident.c                           | 94 +++++++++++++++------------------------
 8 files changed, 50 insertions(+), 81 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index cfb9906..eb12b2d 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -92,10 +92,6 @@ Diagnostics
 -----------
 You don't exist. Go away!::
     The passwd(5) gecos field couldn't be read
-Your parents must have hated you!::
-    The passwd(5) gecos field is longer than a giant static buffer.
-Your sysadmin must hate you!::
-    The passwd(5) name field is longer than a giant static buffer.
 
 Discussion
 ----------
diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 988a323..3f703e3 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -63,10 +63,6 @@ Diagnostics
 -----------
 You don't exist. Go away!::
     The passwd(5) gecos field couldn't be read
-Your parents must have hated you!::
-    The passwd(5) gecos field is longer than a giant static buffer.
-Your sysadmin must hate you!::
-    The passwd(5) name field is longer than a giant static buffer.
 
 SEE ALSO
 --------
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index a517f17..bb716c8 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -230,7 +230,8 @@ static void add_branch_desc(struct strbuf *out, const char *name)
 static void record_person(int which, struct string_list *people,
 			  struct commit *commit)
 {
-	char name_buf[MAX_GITNAME], *name, *name_end;
+	struct strbuf name_buf = STRBUF_INIT;
+	char *name, *name_end;
 	struct string_list_item *elem;
 	const char *field = (which == 'a') ? "\nauthor " : "\ncommitter ";
 
@@ -243,17 +244,18 @@ static void record_person(int which, struct string_list *people,
 		name_end--;
 	while (isspace(*name_end) && name <= name_end)
 		name_end--;
-	if (name_end < name || name + MAX_GITNAME <= name_end)
+	if (name_end < name)
 		return;
-	memcpy(name_buf, name, name_end - name + 1);
-	name_buf[name_end - name + 1] = '\0';
+	strbuf_add(&name_buf, name, name_end - name + 1);
 
-	elem = string_list_lookup(people, name_buf);
+	elem = string_list_lookup(people, name_buf.buf);
 	if (!elem) {
-		elem = string_list_insert(people, name_buf);
+		elem = string_list_insert(people, name_buf.buf);
 		elem->util = (void *)0;
 	}
 	elem->util = (void*)(util_as_integral(elem) + 1);
+
+	strbuf_release(&name_buf);
 }
 
 static int cmp_string_list_util_as_integral(const void *a_, const void *b_)
diff --git a/cache.h b/cache.h
index e14ffcd..0c1a332 100644
--- a/cache.h
+++ b/cache.h
@@ -1138,9 +1138,8 @@ struct config_include_data {
 #define CONFIG_INCLUDE_INIT { 0 }
 extern int git_config_include(const char *name, const char *value, void *data);
 
-#define MAX_GITNAME (1000)
-extern char git_default_email[MAX_GITNAME];
-extern char git_default_name[MAX_GITNAME];
+extern struct strbuf git_default_email;
+extern struct strbuf git_default_name;
 #define IDENT_NAME_GIVEN 01
 #define IDENT_MAIL_GIVEN 02
 #define IDENT_ALL_GIVEN (IDENT_NAME_GIVEN|IDENT_MAIL_GIVEN)
diff --git a/config.c b/config.c
index eeee986..69cb08c 100644
--- a/config.c
+++ b/config.c
@@ -767,7 +767,7 @@ static int git_default_user_config(const char *var, const char *value)
 	if (!strcmp(var, "user.name")) {
 		if (!value)
 			return config_error_nonbool(var);
-		strlcpy(git_default_name, value, sizeof(git_default_name));
+		strbuf_addstr(&git_default_name, value);
 		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
 		return 0;
 	}
@@ -775,7 +775,7 @@ static int git_default_user_config(const char *var, const char *value)
 	if (!strcmp(var, "user.email")) {
 		if (!value)
 			return config_error_nonbool(var);
-		strlcpy(git_default_email, value, sizeof(git_default_email));
+		strbuf_addstr(&git_default_email, value);
 		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		return 0;
 	}
diff --git a/environment.c b/environment.c
index d7e6c65..f4e3b53 100644
--- a/environment.c
+++ b/environment.c
@@ -11,8 +11,8 @@
 #include "refs.h"
 #include "fmt-merge-msg.h"
 
-char git_default_email[MAX_GITNAME];
-char git_default_name[MAX_GITNAME];
+struct strbuf git_default_email = STRBUF_INIT;
+struct strbuf git_default_name = STRBUF_INIT;
 int user_ident_explicitly_given;
 int trust_executable_bit = 1;
 int trust_ctime = 1;
diff --git a/http-push.c b/http-push.c
index 1df7ab5..2362ffd 100644
--- a/http-push.c
+++ b/http-push.c
@@ -904,7 +904,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 		ep = strchr(ep + 1, '/');
 	}
 
-	escaped = xml_entities(git_default_email);
+	escaped = xml_entities(git_default_email.buf);
 	strbuf_addf(&out_buffer.buf, LOCK_REQUEST, escaped);
 	free(escaped);
 
diff --git a/ident.c b/ident.c
index 87c697c..c7bdb3f 100644
--- a/ident.c
+++ b/ident.c
@@ -15,42 +15,27 @@ static char git_default_date[50];
 #define get_gecos(struct_passwd) ((struct_passwd)->pw_gecos)
 #endif
 
-static void copy_gecos(const struct passwd *w, char *name, size_t sz)
+static void copy_gecos(const struct passwd *w, struct strbuf *name)
 {
-	char *src, *dst;
-	size_t len, nlen;
-
-	nlen = strlen(w->pw_name);
+	char *src;
 
 	/* Traditionally GECOS field had office phone numbers etc, separated
 	 * with commas.  Also & stands for capitalized form of the login name.
 	 */
 
-	for (len = 0, dst = name, src = get_gecos(w); len < sz; src++) {
+	for (src = get_gecos(w); *src && *src != ','; src++) {
 		int ch = *src;
-		if (ch != '&') {
-			*dst++ = ch;
-			if (ch == 0 || ch == ',')
-				break;
-			len++;
-			continue;
-		}
-		if (len + nlen < sz) {
+		if (ch != '&')
+			strbuf_addch(name, ch);
+		else {
 			/* Sorry, Mr. McDonald... */
-			*dst++ = toupper(*w->pw_name);
-			memcpy(dst, w->pw_name + 1, nlen - 1);
-			dst += nlen - 1;
-			len += nlen;
+			strbuf_addch(name, toupper(*w->pw_name));
+			strbuf_addstr(name, w->pw_name + 1);
 		}
 	}
-	if (len < sz)
-		name[len] = 0;
-	else
-		die("Your parents must have hated you!");
-
 }
 
-static int add_mailname_host(char *buf, size_t len)
+static int add_mailname_host(struct strbuf *buf)
 {
 	FILE *mailname;
 
@@ -61,7 +46,7 @@ static int add_mailname_host(char *buf, size_t len)
 				strerror(errno));
 		return -1;
 	}
-	if (!fgets(buf, len, mailname)) {
+	if (strbuf_getline(buf, mailname, '\n') == EOF) {
 		if (ferror(mailname))
 			warning("cannot read /etc/mailname: %s",
 				strerror(errno));
@@ -73,48 +58,41 @@ static int add_mailname_host(char *buf, size_t len)
 	return 0;
 }
 
-static void add_domainname(char *buf, size_t len)
+static void add_domainname(struct strbuf *out)
 {
+	char buf[1024];
 	struct hostent *he;
-	size_t namelen;
 	const char *domainname;
 
-	if (gethostname(buf, len)) {
+	if (gethostname(buf, sizeof(buf))) {
 		warning("cannot get host name: %s", strerror(errno));
-		strlcpy(buf, "(none)", len);
+		strbuf_addstr(out, "(none)");
 		return;
 	}
-	namelen = strlen(buf);
-	if (memchr(buf, '.', namelen))
+	strbuf_addstr(out, buf);
+	if (strchr(buf, '.'))
 		return;
 
 	he = gethostbyname(buf);
-	buf[namelen++] = '.';
-	buf += namelen;
-	len -= namelen;
+	strbuf_addch(out, '.');
 	if (he && (domainname = strchr(he->h_name, '.')))
-		strlcpy(buf, domainname + 1, len);
+		strbuf_addstr(out, domainname + 1);
 	else
-		strlcpy(buf, "(none)", len);
+		strbuf_addstr(out, "(none)");
 }
 
-static void copy_email(const struct passwd *pw)
+static void copy_email(const struct passwd *pw, struct strbuf *email)
 {
 	/*
 	 * Make up a fake email address
 	 * (name + '@' + hostname [+ '.' + domainname])
 	 */
-	size_t len = strlen(pw->pw_name);
-	if (len > sizeof(git_default_email)/2)
-		die("Your sysadmin must hate you!");
-	memcpy(git_default_email, pw->pw_name, len);
-	git_default_email[len++] = '@';
-
-	if (!add_mailname_host(git_default_email + len,
-				sizeof(git_default_email) - len))
+	strbuf_addstr(email, pw->pw_name);
+	strbuf_addch(email, '@');
+
+	if (!add_mailname_host(email))
 		return;	/* read from "/etc/mailname" (Debian) */
-	add_domainname(git_default_email + len,
-			sizeof(git_default_email) - len);
+	add_domainname(email);
 }
 
 static void setup_ident(const char **name, const char **emailp)
@@ -122,32 +100,31 @@ static void setup_ident(const char **name, const char **emailp)
 	struct passwd *pw = NULL;
 
 	/* Get the name ("gecos") */
-	if (!*name && !git_default_name[0]) {
+	if (!*name && !git_default_name.len) {
 		pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-		copy_gecos(pw, git_default_name, sizeof(git_default_name));
+		copy_gecos(pw, &git_default_name);
 	}
 	if (!*name)
-		*name = git_default_name;
+		*name = git_default_name.buf;
 
-	if (!*emailp && !git_default_email[0]) {
+	if (!*emailp && !git_default_email.len) {
 		const char *email = getenv("EMAIL");
 
 		if (email && email[0]) {
-			strlcpy(git_default_email, email,
-				sizeof(git_default_email));
+			strbuf_addstr(&git_default_email, email);
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		} else {
 			if (!pw)
 				pw = getpwuid(getuid());
 			if (!pw)
 				die("You don't exist. Go away!");
-			copy_email(pw);
+			copy_email(pw, &git_default_email);
 		}
 	}
 	if (!*emailp)
-		*emailp = git_default_email;
+		*emailp = git_default_email.buf;
 
 	/* And set the default date */
 	if (!git_default_date[0])
@@ -317,7 +294,7 @@ const char *fmt_ident(const char *name, const char *email,
 		struct passwd *pw;
 
 		if ((warn_on_no_name || error_on_no_name) &&
-		    name == git_default_name && env_hint) {
+		    name == git_default_name.buf && env_hint) {
 			fputs(env_hint, stderr);
 			env_hint = NULL; /* warn only once */
 		}
@@ -326,9 +303,8 @@ const char *fmt_ident(const char *name, const char *email,
 		pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-		strlcpy(git_default_name, pw->pw_name,
-			sizeof(git_default_name));
-		name = git_default_name;
+		strbuf_addstr(&git_default_name, pw->pw_name);
+		name = git_default_name.buf;
 	}
 
 	strcpy(date, git_default_date);
-- 
1.7.10.2.8.g1101eed

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 2/2] ident: report passwd errors with a more friendly message
  2012-05-11 23:13       ` Jeff King
  2012-05-14 16:28         ` [PATCH 1/2] drop length limitations on gecos-derived names and emails Jeff King
@ 2012-05-14 16:36         ` Jeff King
  1 sibling, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-14 16:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git, Johannes Sixt

When getpwuid fails, we give a cute but cryptic message.
While it makes sense if you know that getpwuid or identity
functions are being called, this code is triggered behind
the scenes by quite a few git commands these days (e.g.,
receive-pack on a remote server might use it for a reflog;
the current message is hard to distinguish from an
authentication error).  Let's switch to something that gives
a little more context.

While we're at it, we can factor out all of the
cut-and-pastes of the "you don't exist" message into a
wrapper function. Rather than provide xgetpwuid, let's make
it even more specific to just getting the passwd entry for
the current uid. That's the only way we use getpwuid anyway,
and it lets us make an even more specific error message.

The current message also fails to mention errno. While the
usual cause for getpwuid failing is that the user does not
exist, mentioning errno makes it easier to diagnose these
problems.  Note that POSIX specifies that errno remain
untouched if the passwd entry does not exist (but will be
set on actual errors), whereas some systems will return
ENOENT or similar for a missing entry. We handle both cases
in our wrapper.

Signed-off-by: Jeff King <peff@peff.net>
---
You earlier suggested to show a hint to set "user.name". That might be
complicated by the fact that this message can come from a remote server.
Or maybe since that is by far the minority case, we should disregard it
and show the hint. I left it out of this patch, as it can be trivially
added on top due to the refactoring.

I also noticed that the version of getpwuid in compat/mingw.c completely
disregards its uid argument. This isn't a problem in the current
codebase, since we always feed getuid(). But since the new wrapper is
explicitly about getting our _own_ pw entry, it might make more sense to
convert our getpwuid() replacement into an xgetpwuid_self() replacement,
which is slightly more accurate. I'll leave that cleanup to Johannes if
he cares to do it.

 Documentation/git-commit-tree.txt |  5 -----
 Documentation/git-var.txt         |  5 -----
 git-compat-util.h                 |  3 +++
 ident.c                           | 12 +++---------
 wrapper.c                         | 12 ++++++++++++
 5 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index eb12b2d..eb8ee99 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -88,11 +88,6 @@ for one to be entered and terminated with ^D.
 
 include::date-formats.txt[]
 
-Diagnostics
------------
-You don't exist. Go away!::
-    The passwd(5) gecos field couldn't be read
-
 Discussion
 ----------
 
diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 3f703e3..67edf58 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -59,11 +59,6 @@ ifdef::git-default-pager[]
     The build you are using chose '{git-default-pager}' as the default.
 endif::git-default-pager[]
 
-Diagnostics
------------
-You don't exist. Go away!::
-    The passwd(5) gecos field couldn't be read
-
 SEE ALSO
 --------
 linkgit:git-commit-tree[1]
diff --git a/git-compat-util.h b/git-compat-util.h
index ed11ad8..5bd9ad7 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -595,4 +595,7 @@ int rmdir_or_warn(const char *path);
  */
 int remove_or_warn(unsigned int mode, const char *path);
 
+/* Get the passwd entry for the UID of the current process. */
+struct passwd *xgetpwuid_self(void);
+
 #endif
diff --git a/ident.c b/ident.c
index c7bdb3f..72944ba 100644
--- a/ident.c
+++ b/ident.c
@@ -101,9 +101,7 @@ static void setup_ident(const char **name, const char **emailp)
 
 	/* Get the name ("gecos") */
 	if (!*name && !git_default_name.len) {
-		pw = getpwuid(getuid());
-		if (!pw)
-			die("You don't exist. Go away!");
+		pw = xgetpwuid_self();
 		copy_gecos(pw, &git_default_name);
 	}
 	if (!*name)
@@ -117,9 +115,7 @@ static void setup_ident(const char **name, const char **emailp)
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		} else {
 			if (!pw)
-				pw = getpwuid(getuid());
-			if (!pw)
-				die("You don't exist. Go away!");
+				pw = xgetpwuid_self();
 			copy_email(pw, &git_default_email);
 		}
 	}
@@ -300,9 +296,7 @@ const char *fmt_ident(const char *name, const char *email,
 		}
 		if (error_on_no_name)
 			die("empty ident %s <%s> not allowed", name, email);
-		pw = getpwuid(getuid());
-		if (!pw)
-			die("You don't exist. Go away!");
+		pw = xgetpwuid_self();
 		strbuf_addstr(&git_default_name, pw->pw_name);
 		name = git_default_name.buf;
 	}
diff --git a/wrapper.c b/wrapper.c
index 6ccd059..b5e33e4 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -402,3 +402,15 @@ int remove_or_warn(unsigned int mode, const char *file)
 {
 	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
 }
+
+struct passwd *xgetpwuid_self(void)
+{
+	struct passwd *pw;
+
+	errno = 0;
+	pw = getpwuid(getuid());
+	if (!pw)
+		die(_("unable to look up current user in the passwd file: %s"),
+		    errno ? strerror(errno) : _("no such user"));
+	return pw;
+}
-- 
1.7.10.2.8.g1101eed

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-14 16:28         ` [PATCH 1/2] drop length limitations on gecos-derived names and emails Jeff King
@ 2012-05-14 17:05           ` Jeff King
  2012-05-14 21:02           ` Jeff King
  1 sibling, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-14 17:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Mon, May 14, 2012 at 12:28:24PM -0400, Jeff King wrote:

> I noticed in add_domainname that we look up the host via gethostname,
> and then if it is not fully qualified, call gethostbyname and steal the
> domain portion of the result, tacking it onto the hostname we got.
> 
> That seems oddly complex to me, and like it could result in a bogus
> hostname if the unqualified name does not match the first part of the
> returned qualified name. E.g., if the /etc/hosts file contains something
> like:
> 
>   192.168.1.1 foo.example.com bar.example.com bar
> 
> (and your hostname is "bar"). I doubt it matters much in practice, and
> it is outside the scope of this patch, so I left it for now.

It looks like a bug in adc3dbc (Use sensible domain name (the DNS one)
when guessing ident information, 2005-10-21). Before that we used
getdomainname, where that procedure made more sense.

The patch below fixes it. I doubt it matters much in practice, but I
think the resulting code is way less confusing to read.

-- >8 --
Subject: [PATCH] ident: use full dns names to generate email addresses

When we construct an email address from the username and
hostname, we generate the host part of the email with this
procedure:

  1. add the result of gethostname

  2. if it has a dot, ok, it's fully qualified

  3. if not, then look up the unqualified hostname via
     gethostbyname; take the domain name of the result and
     append it to the hostname

Step 3 can actually produce a bogus result, as the name
returned by gethostbyname may not be related to the hostname
we fed it (e.g., consider a machine "foo" with names
"foo.one.example.com" and "bar.two.example.com"; we may have
the latter returned and generate the bogus name
"foo.two.example.com").

This patch simply uses the full hostname returned by
gethostbyname. In the common case that the first part is the
same as the unqualified hostname, the behavior is identical.
And in the case that it is not the same, we are much more
likely to be generating a valid name.

Signed-off-by: Jeff King <peff@peff.net>
---
 ident.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/ident.c b/ident.c
index 72944ba..e552e7f 100644
--- a/ident.c
+++ b/ident.c
@@ -62,23 +62,18 @@ static void add_domainname(struct strbuf *out)
 {
 	char buf[1024];
 	struct hostent *he;
-	const char *domainname;
 
 	if (gethostname(buf, sizeof(buf))) {
 		warning("cannot get host name: %s", strerror(errno));
 		strbuf_addstr(out, "(none)");
 		return;
 	}
-	strbuf_addstr(out, buf);
 	if (strchr(buf, '.'))
-		return;
-
-	he = gethostbyname(buf);
-	strbuf_addch(out, '.');
-	if (he && (domainname = strchr(he->h_name, '.')))
-		strbuf_addstr(out, domainname + 1);
+		strbuf_addstr(out, buf);
+	else if ((he = gethostbyname(buf)) && strchr(he->h_name, '.'))
+		strbuf_addstr(out, he->h_name);
 	else
-		strbuf_addstr(out, "(none)");
+		strbuf_addf(out, "%s.(none)", buf);
 }
 
 static void copy_email(const struct passwd *pw, struct strbuf *email)
-- 
1.7.10.2.8.g1101eed

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-14 16:28         ` [PATCH 1/2] drop length limitations on gecos-derived names and emails Jeff King
  2012-05-14 17:05           ` Jeff King
@ 2012-05-14 21:02           ` Jeff King
  2012-05-14 21:13             ` Jeff King
  1 sibling, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-14 21:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Mon, May 14, 2012 at 12:28:24PM -0400, Jeff King wrote:

> When we pull the user's name from the GECOS field of the
> passwd file (or generate an email address based on their
> username and hostname), we put the result into a
> static buffer. While it's extremely unlikely that anybody
> ever hit these limits (after all, in such a case their
> parents must have hated them), we still had to deal with the
> error cases in our code.
> 
> Converting these static buffers to strbufs lets us simplify
> the code and drop some error messages from the documentation
> that have confused some users.
> 
> Note that there is still one length limitation: the
> gethostname interface requires us to provide a static
> buffer, so we arbitrarily choose 1024 bytes for the
> hostname.
> 
> Signed-off-by: Jeff King <peff@peff.net>

Ick, there is something very wrong with this patch. While testing a
completely unrelated bug, I noticed that it set my name to "Jeff
KingJeff KingJeff King". Which, while a wonderful ego massage, is
probably excessive.

I'm sure the problem is the switch to strbuf's appending semantics
rather than strlcpy's overwriting semantics. I thought we were careful
not to bother re-run the gecos code if we had already gotten a name, but
obviously that is not the case in some code paths. I'll investigate.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-14 21:02           ` Jeff King
@ 2012-05-14 21:13             ` Jeff King
  2012-05-15  1:54               ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-14 21:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Mon, May 14, 2012 at 05:02:25PM -0400, Jeff King wrote:

> Ick, there is something very wrong with this patch. While testing a
> completely unrelated bug, I noticed that it set my name to "Jeff
> KingJeff KingJeff King". Which, while a wonderful ego massage, is
> probably excessive.
> 
> I'm sure the problem is the switch to strbuf's appending semantics
> rather than strlcpy's overwriting semantics. I thought we were careful
> not to bother re-run the gecos code if we had already gotten a name, but
> obviously that is not the case in some code paths. I'll investigate.

Ah, I see. The problem is here:

> --- a/config.c
> +++ b/config.c
> @@ -767,7 +767,7 @@ static int git_default_user_config(const char *var, const char *value)
>  	if (!strcmp(var, "user.name")) {
>  		if (!value)
>  			return config_error_nonbool(var);
> -		strlcpy(git_default_name, value, sizeof(git_default_name));
> +		strbuf_addstr(&git_default_name, value);
>  		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
>  		return 0;
>  	}
> @@ -775,7 +775,7 @@ static int git_default_user_config(const char *var, const char *value)
>  	if (!strcmp(var, "user.email")) {
>  		if (!value)
>  			return config_error_nonbool(var);
> -		strlcpy(git_default_email, value, sizeof(git_default_email));
> +		strbuf_addstr(&git_default_email, value);
>  		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;

where we are not careful. The fix is trivial. However, while examining
fmt_ident, I notice there is another potential spot there that needs
further investigation (I think it may actually be unreachable code, but
I need to look closer).

I'll re-roll the series with the fixes after investigating fmt_ident.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-14 21:13             ` Jeff King
@ 2012-05-15  1:54               ` Jeff King
  2012-05-15  2:32                 ` Jeff King
  2012-05-15 15:03                 ` Junio C Hamano
  0 siblings, 2 replies; 68+ messages in thread
From: Jeff King @ 2012-05-15  1:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Mon, May 14, 2012 at 05:13:24PM -0400, Jeff King wrote:

> where we are not careful. The fix is trivial. However, while examining
> fmt_ident, I notice there is another potential spot there that needs
> further investigation (I think it may actually be unreachable code, but
> I need to look closer).
> 
> I'll re-roll the series with the fixes after investigating fmt_ident.

Hmm. This code from fmt_ident is very odd:

> const char *fmt_ident(const char *name, const char *email,
> 		      const char *date_str, int flag)
> {
> [...]
> 	setup_ident(&name, &email);
> 
> 	if (!*name) {
> 		struct passwd *pw;
> 
> 		if ((warn_on_no_name || error_on_no_name) &&
> 		    name == git_default_name && env_hint) {
> 			fputs(env_hint, stderr);
> 			env_hint = NULL; /* warn only once */
> 		}
> 		if (error_on_no_name)
> 			die("empty ident %s <%s> not allowed", name, email);
> 		pw = getpwuid(getuid());
> 		if (!pw)
> 			die("You don't exist. Go away!");
> 		strlcpy(git_default_name, pw->pw_name,
> 			sizeof(git_default_name));
> 		name = git_default_name;
> 	}

We call setup_ident with our name pointer, which usually comes from
getenv("GIT_*_NAME"), although could also come from something like "git
commit -c $commit". We feed that to setup_ident. If name is NULL, then
setup_ident will use git_default_name (filling it in from gecos or
config). If it's not NULL, then we use it literally. And then we check
_that_ result to see if it's empty. If it is, we either die or warn,
depending on the flags. In the latter case, we fallback to using the
username as the name.

And that's what confuses me. Depending on what was passed in, we may
have checked that GIT_COMMITTER_NAME is an empty string, or we may have
checked that the config or gecos field yielded an empty string. In the
latter case, it makes sense to fall back to the username. But in the
former case, it doesn't; we should fall back to the config name or the
gecos name. And worse, we've polluted git_default_name for the rest of
the program run.

Instead of falling back to getpwuid(), should it fall back to:

   /* If this wasn't our default name already, then fall back to that. */
   if (name != git_default_name) {
           name = NULL;
           setup_ident(&name, &email);
   }

   /* If we _still_ don't have a non-empty name, then fall back to
    * username. */
   if (!*name) {
          pw = getpwuid(getuid());
          if (!pw)
                  die("You don't exist. Go away!");
          strlcpy(git_default_name, pw->pw_name, sizeof(git_default_name));
          nae = git_default_name;
   }

Of course we've still polluted this crappy fake name into
git_default_name, so that later calls with error_on_no_name will see it
and not error. I think so far it hasn't mattered because the only user
of this "warn" code is format-patch, which otherwise does not care about
ident (and doesn't even end up using the name at all!). And I doubt this
code path gets triggered much anyway; do people really run
"GIT_COMMITTER_NAME= git format-patch"?

I can just leave it as it's not really hurting anybody, I think. But I
was refactoring in the area and it just seemed flaky and questionable. I
wonder if we can simply get rid of the IDENT_WARN_ON_NO_NAME code path
entirely. The use here is grabbing the email address to use as part of a
message id. Could we just call setup_ident and then read from
git_default_email directly? There's no need to respect
GIT_COMMITTER_EMAIL here at all.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-15  1:54               ` Jeff King
@ 2012-05-15  2:32                 ` Jeff King
  2012-05-15 15:03                 ` Junio C Hamano
  1 sibling, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-15  2:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Mon, May 14, 2012 at 09:54:37PM -0400, Jeff King wrote:

> Of course we've still polluted this crappy fake name into
> git_default_name, so that later calls with error_on_no_name will see it
> and not error. I think so far it hasn't mattered because the only user
> of this "warn" code is format-patch, which otherwise does not care about
> ident (and doesn't even end up using the name at all!). And I doubt this
> code path gets triggered much anyway; do people really run
> "GIT_COMMITTER_NAME= git format-patch"?
> 
> I can just leave it as it's not really hurting anybody, I think. But I
> was refactoring in the area and it just seemed flaky and questionable. I
> wonder if we can simply get rid of the IDENT_WARN_ON_NO_NAME code path
> entirely. The use here is grabbing the email address to use as part of a
> message id. Could we just call setup_ident and then read from
> git_default_email directly? There's no need to respect
> GIT_COMMITTER_EMAIL here at all.

Hmm, I was mistaken. This code path also gets followed whenever
IDENT_ERROR_ON_NO_NAME is not set (regardless of IDENT_WARN_ON_NO_NAME).
So other programs may accidentally get this pollution of
git_default_name and show a username when we _could_ have shown the name
from config. I can see the pollution in a debugger in "git commit", but
I don't think you can actually trigger a commit with it, because later
calls to fmt_ident use ERROR_ON_NO_NAME.

I really wonder if we can just get rid of all of the calls which do not
use ERROR_ON_NO_NAME. As far as I can tell, they are all part of
programs which later end up using ERROR_ON_NO_NAME anyway.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-15  1:54               ` Jeff King
  2012-05-15  2:32                 ` Jeff King
@ 2012-05-15 15:03                 ` Junio C Hamano
  2012-05-15 17:47                   ` Jeff King
  1 sibling, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-15 15:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> On Mon, May 14, 2012 at 05:13:24PM -0400, Jeff King wrote:
>
> We call setup_ident with our name pointer, which usually comes from
> getenv("GIT_*_NAME"), although could also come from something like "git
> commit -c $commit". We feed that to setup_ident. If name is NULL, then
> setup_ident will use git_default_name (filling it in from gecos or
> config). If it's not NULL, then we use it literally. And then we check
> _that_ result to see if it's empty. If it is, we either die or warn,
> depending on the flags. In the latter case, we fallback to using the
> username as the name.
>
> And that's what confuses me. Depending on what was passed in, we may
> have checked that GIT_COMMITTER_NAME is an empty string, or we may have
> checked that the config or gecos field yielded an empty string. 

Sounds quite sensible to me, though.

> In the
> latter case, it makes sense to fall back to the username.

I agree that we should use something like "Sorry, Mr. McDonald" codepath
when the GECOS field returns an empty string---after all that is what we
do when we are built with NO_GECOS_IN_PWENT.

> But in the
> former case, it doesn't; we should fall back to the config name or the
> gecos name.

If the user said GIT_COMMITTER_NAME is empty with "GIT_COMMITTER_NAME=",
that is different from saying with "unset GIT_COMMITTER_NAME" that the
user does not want the environment to take effect, no?  So I do not think
falling back to configured or gecos in the former case is the right thing
to do, even though that would mean explicitly giving an empty string in
that configuration variable is asking only for an error without any
recourse, which is not useful at all.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-15 15:03                 ` Junio C Hamano
@ 2012-05-15 17:47                   ` Jeff King
  2012-05-15 18:10                     ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-15 17:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Tue, May 15, 2012 at 08:03:38AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Mon, May 14, 2012 at 05:13:24PM -0400, Jeff King wrote:
> >
> > We call setup_ident with our name pointer, which usually comes from
> > getenv("GIT_*_NAME"), although could also come from something like "git
> > commit -c $commit". We feed that to setup_ident. If name is NULL, then
> > setup_ident will use git_default_name (filling it in from gecos or
> > config). If it's not NULL, then we use it literally. And then we check
> > _that_ result to see if it's empty. If it is, we either die or warn,
> > depending on the flags. In the latter case, we fallback to using the
> > username as the name.
> >
> > And that's what confuses me. Depending on what was passed in, we may
> > have checked that GIT_COMMITTER_NAME is an empty string, or we may have
> > checked that the config or gecos field yielded an empty string. 
> 
> Sounds quite sensible to me, though.

Yes, I think it is OK to check what was given to us (or our fallback).
But using that check to decide which next step to take doesn't seem
right.

> > In the
> > latter case, it makes sense to fall back to the username.
> 
> I agree that we should use something like "Sorry, Mr. McDonald" codepath
> when the GECOS field returns an empty string---after all that is what we
> do when we are built with NO_GECOS_IN_PWENT.

Right, and that is more or less what we do (just without the
capitalization).

> > But in the
> > former case, it doesn't; we should fall back to the config name or the
> > gecos name.
> 
> If the user said GIT_COMMITTER_NAME is empty with "GIT_COMMITTER_NAME=",
> that is different from saying with "unset GIT_COMMITTER_NAME" that the
> user does not want the environment to take effect, no?

I agree two the cases are different. And for the most part, you are
insane to pass an empty GIT_COMMITTER_NAME. But if you do, why would the
right behavior be to fall back to sticking the username into the name
field, and not the gecos name?

Part of me is wondering why we should fall back at all in that case. If
a caller does not pass ERROR_ON_NO_NAME, then they don't really care
what the name is, do they? The current callers that do not pass it are:

  - blame.c:fake_working_tree_commit, which is passing in a fake name
    buffer anyway (so will never trigger this code path)

  - log.c:gen_message_id, which only cares about the email
    portion anyway

  - fmt-merge-msg.c:credit_people; this caller compares the name field
    to what's in the commits, checking for differences. So it could just
    as easily be "(none)" or some other token

  - commit.c:prepare_to_commit; this compares and shows author and
    commiter ids, and does not care about a blank name for the committer
    (but does for the author). The commit can't go through anyway with a
    blank committer name, so should it not just use ERROR_ON_NO_NAME?

  - log.c:make_cover_letter; this uses the committer information to make
    a fake commit that we ultimately use just to get the "%f" pretty
    userformat from it. In other words, we don't care about the
    committer at all, and this is really just working around an
    absolutely horrific interface.

  - refs.c:log_ref_write; finally, a caller who actually cares about the
    name, but doesn't want to die if we don't have a good name. We are
    happy enough with the username, though if somebody passes
    GIT_COMMITTER_NAME=, wouldn't it be OK to fail?

So it seems to me like a much simpler set of rules would be:

  1. When reading gecos, always fall back to the username if the gecos
     field is unavailable or blank.

  2. Always die when the name field is blank. That means we will die
     when you pass in a bogus empty GIT_COMMITTER_NAME (or an empty
     config name), which makes a lot more sense to me than falling back;
     those are bogus requests, not system config problems.  And we won't
     ever have a blank gecos name, because we'll always fall back on the
     username.

Again, I'm sorry to belabor this, and we can just drop it; I don't think
there's currently a bug. It's just that I'm cleaning up in the area, and
the current behavior seems overly complex; in particular, I'm worried
that writing the username into the git_default_name field (overwriting
the _real_ name the user gave us!) is a maintenance time-bomb that will
bite us later.

If I'm not being clear, I can express it in the form of patches, which
might be more obvious.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/2] drop length limitations on gecos-derived names and emails
  2012-05-15 17:47                   ` Jeff King
@ 2012-05-15 18:10                     ` Junio C Hamano
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-15 18:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> So it seems to me like a much simpler set of rules would be:
>
>   1. When reading gecos, always fall back to the username if the gecos
>      field is unavailable or blank.
>
>   2. Always die when the name field is blank. That means we will die
>      when you pass in a bogus empty GIT_COMMITTER_NAME (or an empty
>      config name), which makes a lot more sense to me than falling back;
>      those are bogus requests, not system config problems.  And we won't
>      ever have a blank gecos name, because we'll always fall back on the
>      username.

That certainly sounds very simple to explain and understand, and I do not
offhand think of anything *sane* that would break ;-)

Thanks.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 0/13] ident cleanups and bugfixes
  2012-05-15 18:10                     ` Junio C Hamano
@ 2012-05-18 23:05                       ` Jeff King
  2012-05-18 23:07                         ` [PATCH 01/13] ident: split setup_ident into separate functions Jeff King
                                           ` (12 more replies)
  0 siblings, 13 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Tue, May 15, 2012 at 11:10:36AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > So it seems to me like a much simpler set of rules would be:
> >
> >   1. When reading gecos, always fall back to the username if the gecos
> >      field is unavailable or blank.
> >
> >   2. Always die when the name field is blank. That means we will die
> >      when you pass in a bogus empty GIT_COMMITTER_NAME (or an empty
> >      config name), which makes a lot more sense to me than falling back;
> >      those are bogus requests, not system config problems.  And we won't
> >      ever have a blank gecos name, because we'll always fall back on the
> >      username.
> 
> That certainly sounds very simple to explain and understand, and I do not
> offhand think of anything *sane* that would break ;-)

Actually, it does end up breaking. The "error_on_no_name" check wants to
see the value _before_ the fallback, because it is not just about "do
not hand out an ident with an empty name", but rather about "do not hand
out this crappy fallback name when the gecos field is empty".  So we
cannot do an "always fallback" behavior without breaking that check.

I ended up avoiding the issue in a different way, as you'll see in patch
8 below. Here's the series:

  [01/13]: ident: split setup_ident into separate functions
  [02/13]: http-push: do not access git_default_email directly
  [03/13]: fmt-merge-msg: don't use static buffer in record_person
  [04/13]: move identity config parsing to ident.c
  [05/13]: move git_default_* variables to ident.c
  [06/13]: format-patch: use default email for generating message ids
  [07/13]: fmt_ident: drop IDENT_WARN_ON_NO_NAME code
  [08/13]: ident: don't write fallback username into git_default_name
  [09/13]: drop length limitations on gecos-derived names and emails
  [10/13]: ident: report passwd errors with a more friendly message
  [11/13]: ident: use full dns names to generate email addresses
  [12/13]: ident: use a dynamic strbuf in fmt_ident
  [13/13]: format-patch: refactor get_patch_filename

Patches 2, 8, and 11 fix actual bugs. The rest of it is refactoring and
cleanup; here's the diffstat:

 Documentation/git-commit-tree.txt |    9 --
 Documentation/git-var.txt         |    9 --
 builtin/fmt-merge-msg.c           |    8 +-
 builtin/log.c                     |   45 ++------
 cache.h                           |   12 +-
 config.c                          |   24 +----
 environment.c                     |    3 -
 git-compat-util.h                 |    3 +
 http-push.c                       |    2 +-
 ident.c                           |  213 ++++++++++++++++---------------------
 log-tree.c                        |   19 ++--
 log-tree.h                        |    4 +-
 wrapper.c                         |   12 ++
 13 files changed, 142 insertions(+), 221 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH 01/13] ident: split setup_ident into separate functions
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
@ 2012-05-18 23:07                         ` Jeff King
  2012-05-18 23:09                         ` [PATCH 02/13] http-push: do not access git_default_email directly Jeff King
                                           ` (11 subsequent siblings)
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

This function sets up the default name, email, and date, and
is not publicly available. Let's split it into three public
functions so that callers can get just the parts they need.

While we're at it, let's change the interface to simple
accessors. The original function was called only by fmt_ident,
and contained logic for "if we already have some other
value, don't load the default" which properly belongs in
fmt_ident.

Signed-off-by: Jeff King <peff@peff.net>
---
The patch is a pain to read because it's splitting an existing function
in 3; just reading the result is much easier.

 cache.h |  3 +++
 ident.c | 35 +++++++++++++++++++----------------
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/cache.h b/cache.h
index e14ffcd..0c095d4 100644
--- a/cache.h
+++ b/cache.h
@@ -894,6 +894,9 @@ extern const char *git_author_info(int);
 extern const char *git_committer_info(int);
 extern const char *fmt_ident(const char *name, const char *email, const char *date_str, int);
 extern const char *fmt_name(const char *name, const char *email);
+extern const char *ident_default_name(void);
+extern const char *ident_default_email(void);
+extern const char *ident_default_date(void);
 extern const char *git_editor(void);
 extern const char *git_pager(int stdout_is_tty);
 
diff --git a/ident.c b/ident.c
index 87c697c..0f7dcae 100644
--- a/ident.c
+++ b/ident.c
@@ -117,21 +117,20 @@ static void copy_email(const struct passwd *pw)
 			sizeof(git_default_email) - len);
 }
 
-static void setup_ident(const char **name, const char **emailp)
+const char *ident_default_name(void)
 {
-	struct passwd *pw = NULL;
-
-	/* Get the name ("gecos") */
-	if (!*name && !git_default_name[0]) {
-		pw = getpwuid(getuid());
+	if (!git_default_name[0]) {
+		struct passwd *pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
 		copy_gecos(pw, git_default_name, sizeof(git_default_name));
 	}
-	if (!*name)
-		*name = git_default_name;
+	return git_default_name;
+}
 
-	if (!*emailp && !git_default_email[0]) {
+const char *ident_default_email(void)
+{
+	if (!git_default_email[0]) {
 		const char *email = getenv("EMAIL");
 
 		if (email && email[0]) {
@@ -139,19 +138,20 @@ static void setup_ident(const char **name, const char **emailp)
 				sizeof(git_default_email));
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		} else {
-			if (!pw)
-				pw = getpwuid(getuid());
+			struct passwd *pw = getpwuid(getuid());
 			if (!pw)
 				die("You don't exist. Go away!");
 			copy_email(pw);
 		}
 	}
-	if (!*emailp)
-		*emailp = git_default_email;
+	return git_default_email;
+}
 
-	/* And set the default date */
+const char *ident_default_date(void)
+{
 	if (!git_default_date[0])
 		datestamp(git_default_date, sizeof(git_default_date));
+	return git_default_date;
 }
 
 static int add_raw(char *buf, size_t size, int offset, const char *str)
@@ -311,7 +311,10 @@ const char *fmt_ident(const char *name, const char *email,
 	int warn_on_no_name = (flag & IDENT_WARN_ON_NO_NAME);
 	int name_addr_only = (flag & IDENT_NO_DATE);
 
-	setup_ident(&name, &email);
+	if (!name)
+		name = ident_default_name();
+	if (!email)
+		email = ident_default_email();
 
 	if (!*name) {
 		struct passwd *pw;
@@ -331,7 +334,7 @@ const char *fmt_ident(const char *name, const char *email,
 		name = git_default_name;
 	}
 
-	strcpy(date, git_default_date);
+	strcpy(date, ident_default_date());
 	if (!name_addr_only && date_str && date_str[0]) {
 		if (parse_date(date_str, date, sizeof(date)) < 0)
 			die("invalid date format: %s", date_str);
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 02/13] http-push: do not access git_default_email directly
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
  2012-05-18 23:07                         ` [PATCH 01/13] ident: split setup_ident into separate functions Jeff King
@ 2012-05-18 23:09                         ` Jeff King
  2012-05-18 23:10                         ` [PATCH 03/13] fmt-merge-msg: don't use static buffer in record_person Jeff King
                                           ` (10 subsequent siblings)
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

By calling the ident_default_email accessor, we can be sure
that the default value is actually filled-in.

Signed-off-by: Jeff King <peff@peff.net>
---
I believe this is a real, triggerable bug if you don't have your
user.email config set, but I didn't actually test it (and I have no idea
if DAV would actually care about an empty email here anyway).

 http-push.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/http-push.c b/http-push.c
index 1df7ab5..a832ca7 100644
--- a/http-push.c
+++ b/http-push.c
@@ -904,7 +904,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 		ep = strchr(ep + 1, '/');
 	}
 
-	escaped = xml_entities(git_default_email);
+	escaped = xml_entities(ident_default_email());
 	strbuf_addf(&out_buffer.buf, LOCK_REQUEST, escaped);
 	free(escaped);
 
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 03/13] fmt-merge-msg: don't use static buffer in record_person
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
  2012-05-18 23:07                         ` [PATCH 01/13] ident: split setup_ident into separate functions Jeff King
  2012-05-18 23:09                         ` [PATCH 02/13] http-push: do not access git_default_email directly Jeff King
@ 2012-05-18 23:10                         ` Jeff King
  2012-05-18 23:11                         ` [PATCH 04/13] move identity config parsing to ident.c Jeff King
                                           ` (9 subsequent siblings)
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

The record_person function just parses out the "name" field
of the person line in a commit and adds it to a string_list.
The only reason we need an extra buffer is that the
string_list functions require a NUL-terminated string.

Instead of the static buffer, we can just allocate a
temporary NUL-terminated copy. In addition to removing a
useless limit, this removes the only user of MAX_GITNAME
outside of ident.c.

Signed-off-by: Jeff King <peff@peff.net>
---
This actually introduces an extra malloc() per commit that we analyze. I
doubt we care, but if we do, the right solution IMHO is to introduce
string-list functions which can lookup or add a (buf, len) pair and
avoid the copy altogether.

 builtin/fmt-merge-msg.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index a517f17..4ee6a88 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -230,7 +230,7 @@ static void add_branch_desc(struct strbuf *out, const char *name)
 static void record_person(int which, struct string_list *people,
 			  struct commit *commit)
 {
-	char name_buf[MAX_GITNAME], *name, *name_end;
+	char *name_buf, *name, *name_end;
 	struct string_list_item *elem;
 	const char *field = (which == 'a') ? "\nauthor " : "\ncommitter ";
 
@@ -243,10 +243,9 @@ static void record_person(int which, struct string_list *people,
 		name_end--;
 	while (isspace(*name_end) && name <= name_end)
 		name_end--;
-	if (name_end < name || name + MAX_GITNAME <= name_end)
+	if (name_end < name)
 		return;
-	memcpy(name_buf, name, name_end - name + 1);
-	name_buf[name_end - name + 1] = '\0';
+	name_buf = xmemdupz(name, name_end - name + 1);
 
 	elem = string_list_lookup(people, name_buf);
 	if (!elem) {
@@ -254,6 +253,7 @@ static void record_person(int which, struct string_list *people,
 		elem->util = (void *)0;
 	}
 	elem->util = (void*)(util_as_integral(elem) + 1);
+	free(name_buf);
 }
 
 static int cmp_string_list_util_as_integral(const void *a_, const void *b_)
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 04/13] move identity config parsing to ident.c
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (2 preceding siblings ...)
  2012-05-18 23:10                         ` [PATCH 03/13] fmt-merge-msg: don't use static buffer in record_person Jeff King
@ 2012-05-18 23:11                         ` Jeff King
  2012-05-18 23:11                         ` [PATCH 05/13] move git_default_* variables " Jeff King
                                           ` (8 subsequent siblings)
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

There's no reason for this to be in config, except that once
upon a time all of the config parsing was there. It makes
more sense to keep the ident code together.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h  |  1 +
 config.c | 24 +-----------------------
 ident.c  | 21 +++++++++++++++++++++
 3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/cache.h b/cache.h
index 0c095d4..86224c8 100644
--- a/cache.h
+++ b/cache.h
@@ -899,6 +899,7 @@ extern const char *ident_default_email(void);
 extern const char *ident_default_date(void);
 extern const char *git_editor(void);
 extern const char *git_pager(int stdout_is_tty);
+extern int git_ident_config(const char *, const char *, void *);
 
 struct ident_split {
 	const char *name_begin;
diff --git a/config.c b/config.c
index eeee986..71ef171 100644
--- a/config.c
+++ b/config.c
@@ -762,28 +762,6 @@ static int git_default_core_config(const char *var, const char *value)
 	return 0;
 }
 
-static int git_default_user_config(const char *var, const char *value)
-{
-	if (!strcmp(var, "user.name")) {
-		if (!value)
-			return config_error_nonbool(var);
-		strlcpy(git_default_name, value, sizeof(git_default_name));
-		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
-		return 0;
-	}
-
-	if (!strcmp(var, "user.email")) {
-		if (!value)
-			return config_error_nonbool(var);
-		strlcpy(git_default_email, value, sizeof(git_default_email));
-		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
-		return 0;
-	}
-
-	/* Add other config variables here and to Documentation/config.txt. */
-	return 0;
-}
-
 static int git_default_i18n_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "i18n.commitencoding"))
@@ -870,7 +848,7 @@ int git_default_config(const char *var, const char *value, void *dummy)
 		return git_default_core_config(var, value);
 
 	if (!prefixcmp(var, "user."))
-		return git_default_user_config(var, value);
+		return git_ident_config(var, value, dummy);
 
 	if (!prefixcmp(var, "i18n."))
 		return git_default_i18n_config(var, value);
diff --git a/ident.c b/ident.c
index 0f7dcae..bb1158f 100644
--- a/ident.c
+++ b/ident.c
@@ -388,3 +388,24 @@ int user_ident_sufficiently_given(void)
 	return (user_ident_explicitly_given == IDENT_ALL_GIVEN);
 #endif
 }
+
+int git_ident_config(const char *var, const char *value, void *data)
+{
+	if (!strcmp(var, "user.name")) {
+		if (!value)
+			return config_error_nonbool(var);
+		strlcpy(git_default_name, value, sizeof(git_default_name));
+		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
+		return 0;
+	}
+
+	if (!strcmp(var, "user.email")) {
+		if (!value)
+			return config_error_nonbool(var);
+		strlcpy(git_default_email, value, sizeof(git_default_email));
+		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
+		return 0;
+	}
+
+	return 0;
+}
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 05/13] move git_default_* variables to ident.c
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (3 preceding siblings ...)
  2012-05-18 23:11                         ` [PATCH 04/13] move identity config parsing to ident.c Jeff King
@ 2012-05-18 23:11                         ` Jeff King
  2012-05-21  4:07                           ` Junio C Hamano
  2012-05-18 23:13                         ` [PATCH 06/13] format-patch: use default email for generating message ids Jeff King
                                           ` (7 subsequent siblings)
  12 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

There's no reason anybody outside of ident.c should access
these directly (they should use the new accessors which make
sure the variables are initialized), so we can make them
file-scope statics.

While we're at it, move user_ident_explicitly_given into
ident.c; while still globally visible, it makes more sense
to reside with the ident code.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h       | 3 ---
 environment.c | 3 ---
 ident.c       | 4 ++++
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/cache.h b/cache.h
index 86224c8..f63b71f 100644
--- a/cache.h
+++ b/cache.h
@@ -1142,9 +1142,6 @@ struct config_include_data {
 #define CONFIG_INCLUDE_INIT { 0 }
 extern int git_config_include(const char *name, const char *value, void *data);
 
-#define MAX_GITNAME (1000)
-extern char git_default_email[MAX_GITNAME];
-extern char git_default_name[MAX_GITNAME];
 #define IDENT_NAME_GIVEN 01
 #define IDENT_MAIL_GIVEN 02
 #define IDENT_ALL_GIVEN (IDENT_NAME_GIVEN|IDENT_MAIL_GIVEN)
diff --git a/environment.c b/environment.c
index d7e6c65..669e498 100644
--- a/environment.c
+++ b/environment.c
@@ -11,9 +11,6 @@
 #include "refs.h"
 #include "fmt-merge-msg.h"
 
-char git_default_email[MAX_GITNAME];
-char git_default_name[MAX_GITNAME];
-int user_ident_explicitly_given;
 int trust_executable_bit = 1;
 int trust_ctime = 1;
 int has_symlinks = 1;
diff --git a/ident.c b/ident.c
index bb1158f..af92b2c 100644
--- a/ident.c
+++ b/ident.c
@@ -7,7 +7,11 @@
  */
 #include "cache.h"
 
+#define MAX_GITNAME (1000)
+static char git_default_name[MAX_GITNAME];
+static char git_default_email[MAX_GITNAME];
 static char git_default_date[50];
+int user_ident_explicitly_given;
 
 #ifdef NO_GECOS_IN_PWENT
 #define get_gecos(ignored) "&"
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 06/13] format-patch: use default email for generating message ids
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (4 preceding siblings ...)
  2012-05-18 23:11                         ` [PATCH 05/13] move git_default_* variables " Jeff King
@ 2012-05-18 23:13                         ` Jeff King
  2012-05-21  2:58                           ` Junio C Hamano
  2012-05-18 23:14                         ` [PATCH 07/13] fmt_ident: drop IDENT_WARN_ON_NO_NAME code Jeff King
                                           ` (6 subsequent siblings)
  12 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

We try to generate a sane message id for cover letters and
threading by appending some changing bits to the front of
the user's email address. The current code parses the email
out of the results of git_committer_info, but we can do this
much more easily by just calling ident_default_email
ourselves.

Signed-off-by: Jeff King <peff@peff.net>
---
Technically this is a regression if you really wanted:

  GIT_COMMITTER_EMAIL=some.addr@example.com \
  git format-patch --thread=deep

to make your environment variable part of the message-ids. I don't think
it matters, but I can adjust it if we care.

 builtin/log.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 690caa7..656bddf 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -737,15 +737,9 @@ static void get_patch_ids(struct rev_info *rev, struct patch_ids *ids, const cha
 
 static void gen_message_id(struct rev_info *info, char *base)
 {
-	const char *committer = git_committer_info(IDENT_WARN_ON_NO_NAME);
-	const char *email_start = strrchr(committer, '<');
-	const char *email_end = strrchr(committer, '>');
 	struct strbuf buf = STRBUF_INIT;
-	if (!email_start || !email_end || email_start > email_end - 1)
-		die(_("Could not extract email from committer identity."));
-	strbuf_addf(&buf, "%s.%lu.git.%.*s", base,
-		    (unsigned long) time(NULL),
-		    (int)(email_end - email_start - 1), email_start + 1);
+	strbuf_addf(&buf, "%s.%lu.git.%s", base,
+		    (unsigned long) time(NULL), ident_default_email());
 	info->message_id = strbuf_detach(&buf, NULL);
 }
 
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 07/13] fmt_ident: drop IDENT_WARN_ON_NO_NAME code
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (5 preceding siblings ...)
  2012-05-18 23:13                         ` [PATCH 06/13] format-patch: use default email for generating message ids Jeff King
@ 2012-05-18 23:14                         ` Jeff King
  2012-05-18 23:19                         ` [PATCH 08/13] ident: don't write fallback username into git_default_name Jeff King
                                           ` (5 subsequent siblings)
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

There are no more callers who want this, so we can drop it.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h |  5 ++---
 ident.c | 11 ++++-------
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index f63b71f..65cbab5 100644
--- a/cache.h
+++ b/cache.h
@@ -887,9 +887,8 @@ unsigned long approxidate_careful(const char *, int *);
 unsigned long approxidate_relative(const char *date, const struct timeval *now);
 enum date_mode parse_date_format(const char *format);
 
-#define IDENT_WARN_ON_NO_NAME  1
-#define IDENT_ERROR_ON_NO_NAME 2
-#define IDENT_NO_DATE	       4
+#define IDENT_ERROR_ON_NO_NAME 1
+#define IDENT_NO_DATE	       2
 extern const char *git_author_info(int);
 extern const char *git_committer_info(int);
 extern const char *fmt_ident(const char *name, const char *email, const char *date_str, int);
diff --git a/ident.c b/ident.c
index af92b2c..d2fa271 100644
--- a/ident.c
+++ b/ident.c
@@ -312,7 +312,6 @@ const char *fmt_ident(const char *name, const char *email,
 	char date[50];
 	int i;
 	int error_on_no_name = (flag & IDENT_ERROR_ON_NO_NAME);
-	int warn_on_no_name = (flag & IDENT_WARN_ON_NO_NAME);
 	int name_addr_only = (flag & IDENT_NO_DATE);
 
 	if (!name)
@@ -323,13 +322,11 @@ const char *fmt_ident(const char *name, const char *email,
 	if (!*name) {
 		struct passwd *pw;
 
-		if ((warn_on_no_name || error_on_no_name) &&
-		    name == git_default_name && env_hint) {
-			fputs(env_hint, stderr);
-			env_hint = NULL; /* warn only once */
-		}
-		if (error_on_no_name)
+		if (error_on_no_name) {
+			if (name == git_default_name)
+				fputs(env_hint, stderr);
 			die("empty ident %s <%s> not allowed", name, email);
+		}
 		pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 08/13] ident: don't write fallback username into git_default_name
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (6 preceding siblings ...)
  2012-05-18 23:14                         ` [PATCH 07/13] fmt_ident: drop IDENT_WARN_ON_NO_NAME code Jeff King
@ 2012-05-18 23:19                         ` Jeff King
  2012-05-21  2:54                           ` Junio C Hamano
  2012-05-18 23:20                         ` [PATCH 09/13] drop length limitations on gecos-derived names and emails Jeff King
                                           ` (4 subsequent siblings)
  12 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

The fmt_ident function gets a flag that tells us whether to
die if the name field is blank. If it is blank and we don't
die, then we fall back to the username from the passwd file.

The current code writes the value into git_default_name.
However, that's not necessarily correct, as the empty value
might have come from git_default_name, or it might have been
passed in.  This leads to two potential problems:

  1. If we are overriding an empty name in the passed-in
     value, then we may be overwriting a perfectly good name
     (from gitconfig or gecos) in the git_default_name
     buffer. Later calls to fmt_ident will end up using the
     fallback name, even though a better name was available.

  2. If we override an empty gecos name, we end up with the
     fallback name in git_default_name. A later call that
     uses IDENT_ERROR_ON_NO_NAME will see the fallback name
     and think that it is a good name, instead of producing
     an error. In other words, a blank gecos name would
     cause an error with this code:

       git_committer_info(IDENT_ERROR_ON_NO_NAME);

     but not this:

       git_committer_info(0);
       git_committer_info(IDENT_ERROR_ON_NO_NAME);

     because in the latter case, the first call has polluted
     the name buffer.

Instead, let's make the fallback a per-invocation variable.
We can just use the pw->pw_name string directly, since it
only needs to persist through the rest of the function (and
we don't do any other getpwent calls).

Note that while this solves (1) for future invocations of
fmt_indent, the current invocation might use the fallback
when it could in theory load a better value from
git_default_name. However, by not passing
IDENT_ERROR_ON_NO_NAME, the caller is indicating that it
does not care too much about the name, anyway, so we don't
bother; this is primarily about protecting future callers
who do care.

Signed-off-by: Jeff King <peff@peff.net>
---
You can trigger bug (2) by having an empty gecos field, no user.name,
and running a merge. The recent credit_person code path will call
git_committer_info(0), then the commit-generating code path will call
git_committer_info(IDENT_ERROR_ON_NO_NAME), and you will end
up with a commit with your username in the "name" field (even though the
second call would want to error out).

I don't think (1) is triggerable in the current code, mainly because we
typically are feeding getenv("GIT_*_NAME") to fmt_ident, and a blank
there will end up causing an error later on in the program.

By the way, I left it so that:

  GIT_COMMITTER_NAME= git ...

will continue to fallback to the passwd username if a caller doesn't
request to die on a blank name. We could pretty easily die in that case,
but people with flaky scripts might be affected. I figured that callers
who don't give ERROR_ON_NO_NAME don't really care too much what's in the
name anyway, so there's not much point in tightening the rules.

 ident.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/ident.c b/ident.c
index d2fa271..f44bcb3 100644
--- a/ident.c
+++ b/ident.c
@@ -330,9 +330,7 @@ const char *fmt_ident(const char *name, const char *email,
 		pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-		strlcpy(git_default_name, pw->pw_name,
-			sizeof(git_default_name));
-		name = git_default_name;
+		name = pw->pw_name;
 	}
 
 	strcpy(date, ident_default_date());
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 09/13] drop length limitations on gecos-derived names and emails
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (7 preceding siblings ...)
  2012-05-18 23:19                         ` [PATCH 08/13] ident: don't write fallback username into git_default_name Jeff King
@ 2012-05-18 23:20                         ` Jeff King
  2012-05-18 23:21                         ` [PATCH 10/13] ident: report passwd errors with a more friendly message Jeff King
                                           ` (3 subsequent siblings)
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

When we pull the user's name from the GECOS field of the
passwd file (or generate an email address based on their
username and hostname), we put the result into a
static buffer. While it's extremely unlikely that anybody
ever hit these limits (after all, in such a case their
parents must have hated them), we still had to deal with the
error cases in our code.

Converting these static buffers to strbufs lets us simplify
the code and drop some error messages from the documentation
that have confused some users.

The conversion is mostly mechanical: replace string copies
with strbuf equivalents, and access the strbuf.buf directly.
There are a few exceptions:

  - copy_gecos and copy_email are the big winners in code
    reduction (since they no longer have to manage the
    string length manually)

  - git_ident_config wants to replace old versions of
    the default name (e.g., if we read the config multiple
    times), so it must reset+add to the strbuf instead of
    just adding

Note that there is still one length limitation: the
gethostname interface requires us to provide a static
buffer, so we arbitrarily choose 1024 bytes for the
hostname.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-commit-tree.txt |   4 --
 Documentation/git-var.txt         |   4 --
 ident.c                           | 100 +++++++++++++++-----------------------
 3 files changed, 39 insertions(+), 69 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index cfb9906..eb12b2d 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -92,10 +92,6 @@ Diagnostics
 -----------
 You don't exist. Go away!::
     The passwd(5) gecos field couldn't be read
-Your parents must have hated you!::
-    The passwd(5) gecos field is longer than a giant static buffer.
-Your sysadmin must hate you!::
-    The passwd(5) name field is longer than a giant static buffer.
 
 Discussion
 ----------
diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 988a323..3f703e3 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -63,10 +63,6 @@ Diagnostics
 -----------
 You don't exist. Go away!::
     The passwd(5) gecos field couldn't be read
-Your parents must have hated you!::
-    The passwd(5) gecos field is longer than a giant static buffer.
-Your sysadmin must hate you!::
-    The passwd(5) name field is longer than a giant static buffer.
 
 SEE ALSO
 --------
diff --git a/ident.c b/ident.c
index f44bcb3..73a06a1 100644
--- a/ident.c
+++ b/ident.c
@@ -7,9 +7,8 @@
  */
 #include "cache.h"
 
-#define MAX_GITNAME (1000)
-static char git_default_name[MAX_GITNAME];
-static char git_default_email[MAX_GITNAME];
+static struct strbuf git_default_name = STRBUF_INIT;
+static struct strbuf git_default_email = STRBUF_INIT;
 static char git_default_date[50];
 int user_ident_explicitly_given;
 
@@ -19,42 +18,27 @@ int user_ident_explicitly_given;
 #define get_gecos(struct_passwd) ((struct_passwd)->pw_gecos)
 #endif
 
-static void copy_gecos(const struct passwd *w, char *name, size_t sz)
+static void copy_gecos(const struct passwd *w, struct strbuf *name)
 {
-	char *src, *dst;
-	size_t len, nlen;
-
-	nlen = strlen(w->pw_name);
+	char *src;
 
 	/* Traditionally GECOS field had office phone numbers etc, separated
 	 * with commas.  Also & stands for capitalized form of the login name.
 	 */
 
-	for (len = 0, dst = name, src = get_gecos(w); len < sz; src++) {
+	for (src = get_gecos(w); *src && *src != ','; src++) {
 		int ch = *src;
-		if (ch != '&') {
-			*dst++ = ch;
-			if (ch == 0 || ch == ',')
-				break;
-			len++;
-			continue;
-		}
-		if (len + nlen < sz) {
+		if (ch != '&')
+			strbuf_addch(name, ch);
+		else {
 			/* Sorry, Mr. McDonald... */
-			*dst++ = toupper(*w->pw_name);
-			memcpy(dst, w->pw_name + 1, nlen - 1);
-			dst += nlen - 1;
-			len += nlen;
+			strbuf_addch(name, toupper(*w->pw_name));
+			strbuf_addstr(name, w->pw_name + 1);
 		}
 	}
-	if (len < sz)
-		name[len] = 0;
-	else
-		die("Your parents must have hated you!");
-
 }
 
-static int add_mailname_host(char *buf, size_t len)
+static int add_mailname_host(struct strbuf *buf)
 {
 	FILE *mailname;
 
@@ -65,7 +49,7 @@ static int add_mailname_host(char *buf, size_t len)
 				strerror(errno));
 		return -1;
 	}
-	if (!fgets(buf, len, mailname)) {
+	if (strbuf_getline(buf, mailname, '\n') == EOF) {
 		if (ferror(mailname))
 			warning("cannot read /etc/mailname: %s",
 				strerror(errno));
@@ -77,78 +61,70 @@ static int add_mailname_host(char *buf, size_t len)
 	return 0;
 }
 
-static void add_domainname(char *buf, size_t len)
+static void add_domainname(struct strbuf *out)
 {
+	char buf[1024];
 	struct hostent *he;
-	size_t namelen;
 	const char *domainname;
 
-	if (gethostname(buf, len)) {
+	if (gethostname(buf, sizeof(buf))) {
 		warning("cannot get host name: %s", strerror(errno));
-		strlcpy(buf, "(none)", len);
+		strbuf_addstr(out, "(none)");
 		return;
 	}
-	namelen = strlen(buf);
-	if (memchr(buf, '.', namelen))
+	strbuf_addstr(out, buf);
+	if (strchr(buf, '.'))
 		return;
 
 	he = gethostbyname(buf);
-	buf[namelen++] = '.';
-	buf += namelen;
-	len -= namelen;
+	strbuf_addch(out, '.');
 	if (he && (domainname = strchr(he->h_name, '.')))
-		strlcpy(buf, domainname + 1, len);
+		strbuf_addstr(out, domainname + 1);
 	else
-		strlcpy(buf, "(none)", len);
+		strbuf_addstr(out, "(none)");
 }
 
-static void copy_email(const struct passwd *pw)
+static void copy_email(const struct passwd *pw, struct strbuf *email)
 {
 	/*
 	 * Make up a fake email address
 	 * (name + '@' + hostname [+ '.' + domainname])
 	 */
-	size_t len = strlen(pw->pw_name);
-	if (len > sizeof(git_default_email)/2)
-		die("Your sysadmin must hate you!");
-	memcpy(git_default_email, pw->pw_name, len);
-	git_default_email[len++] = '@';
-
-	if (!add_mailname_host(git_default_email + len,
-				sizeof(git_default_email) - len))
+	strbuf_addstr(email, pw->pw_name);
+	strbuf_addch(email, '@');
+
+	if (!add_mailname_host(email))
 		return;	/* read from "/etc/mailname" (Debian) */
-	add_domainname(git_default_email + len,
-			sizeof(git_default_email) - len);
+	add_domainname(email);
 }
 
 const char *ident_default_name(void)
 {
-	if (!git_default_name[0]) {
+	if (!git_default_name.len) {
 		struct passwd *pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-		copy_gecos(pw, git_default_name, sizeof(git_default_name));
+		copy_gecos(pw, &git_default_name);
 	}
-	return git_default_name;
+	return git_default_name.buf;
 }
 
 const char *ident_default_email(void)
 {
-	if (!git_default_email[0]) {
+	if (!git_default_email.len) {
 		const char *email = getenv("EMAIL");
 
 		if (email && email[0]) {
-			strlcpy(git_default_email, email,
-				sizeof(git_default_email));
+			strbuf_addstr(&git_default_email, email);
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		} else {
 			struct passwd *pw = getpwuid(getuid());
 			if (!pw)
 				die("You don't exist. Go away!");
-			copy_email(pw);
+			copy_email(pw, &git_default_email);
 		}
 	}
-	return git_default_email;
+	return git_default_email.buf;
 }
 
 const char *ident_default_date(void)
@@ -323,7 +299,7 @@ const char *fmt_ident(const char *name, const char *email,
 		struct passwd *pw;
 
 		if (error_on_no_name) {
-			if (name == git_default_name)
+			if (name == git_default_name.buf)
 				fputs(env_hint, stderr);
 			die("empty ident %s <%s> not allowed", name, email);
 		}
@@ -393,7 +369,8 @@ int git_ident_config(const char *var, const char *value, void *data)
 	if (!strcmp(var, "user.name")) {
 		if (!value)
 			return config_error_nonbool(var);
-		strlcpy(git_default_name, value, sizeof(git_default_name));
+		strbuf_reset(&git_default_name);
+		strbuf_addstr(&git_default_name, value);
 		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
 		return 0;
 	}
@@ -401,7 +378,8 @@ int git_ident_config(const char *var, const char *value, void *data)
 	if (!strcmp(var, "user.email")) {
 		if (!value)
 			return config_error_nonbool(var);
-		strlcpy(git_default_email, value, sizeof(git_default_email));
+		strbuf_reset(&git_default_email);
+		strbuf_addstr(&git_default_email, value);
 		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		return 0;
 	}
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 10/13] ident: report passwd errors with a more friendly message
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (8 preceding siblings ...)
  2012-05-18 23:20                         ` [PATCH 09/13] drop length limitations on gecos-derived names and emails Jeff King
@ 2012-05-18 23:21                         ` Jeff King
  2012-05-18 23:22                         ` [PATCH 11/13] ident: use full dns names to generate email addresses Jeff King
                                           ` (2 subsequent siblings)
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

When getpwuid fails, we give a cute but cryptic message.
While it makes sense if you know that getpwuid or identity
functions are being called, this code is triggered behind
the scenes by quite a few git commands these days (e.g.,
receive-pack on a remote server might use it for a reflog;
the current message is hard to distinguish from an
authentication error).  Let's switch to something that gives
a little more context.

While we're at it, we can factor out all of the
cut-and-pastes of the "you don't exist" message into a
wrapper function. Rather than provide xgetpwuid, let's make
it even more specific to just getting the passwd entry for
the current uid. That's the only way we use getpwuid anyway,
and it lets us make an even more specific error message.

The current message also fails to mention errno. While the
usual cause for getpwuid failing is that the user does not
exist, mentioning errno makes it easier to diagnose these
problems.  Note that POSIX specifies that errno remain
untouched if the passwd entry does not exist (but will be
set on actual errors), whereas some systems will return
ENOENT or similar for a missing entry. We handle both cases
in our wrapper.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-commit-tree.txt |  5 -----
 Documentation/git-var.txt         |  5 -----
 git-compat-util.h                 |  3 +++
 ident.c                           | 20 +++++---------------
 wrapper.c                         | 12 ++++++++++++
 5 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index eb12b2d..eb8ee99 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -88,11 +88,6 @@ for one to be entered and terminated with ^D.
 
 include::date-formats.txt[]
 
-Diagnostics
------------
-You don't exist. Go away!::
-    The passwd(5) gecos field couldn't be read
-
 Discussion
 ----------
 
diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 3f703e3..67edf58 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -59,11 +59,6 @@ ifdef::git-default-pager[]
     The build you are using chose '{git-default-pager}' as the default.
 endif::git-default-pager[]
 
-Diagnostics
------------
-You don't exist. Go away!::
-    The passwd(5) gecos field couldn't be read
-
 SEE ALSO
 --------
 linkgit:git-commit-tree[1]
diff --git a/git-compat-util.h b/git-compat-util.h
index ed11ad8..5bd9ad7 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -595,4 +595,7 @@ int rmdir_or_warn(const char *path);
  */
 int remove_or_warn(unsigned int mode, const char *path);
 
+/* Get the passwd entry for the UID of the current process. */
+struct passwd *xgetpwuid_self(void);
+
 #endif
diff --git a/ident.c b/ident.c
index 73a06a1..5aec073 100644
--- a/ident.c
+++ b/ident.c
@@ -100,12 +100,8 @@ static void copy_email(const struct passwd *pw, struct strbuf *email)
 
 const char *ident_default_name(void)
 {
-	if (!git_default_name.len) {
-		struct passwd *pw = getpwuid(getuid());
-		if (!pw)
-			die("You don't exist. Go away!");
-		copy_gecos(pw, &git_default_name);
-	}
+	if (!git_default_name.len)
+		copy_gecos(xgetpwuid_self(), &git_default_name);
 	return git_default_name.buf;
 }
 
@@ -117,12 +113,8 @@ const char *ident_default_email(void)
 		if (email && email[0]) {
 			strbuf_addstr(&git_default_email, email);
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
-		} else {
-			struct passwd *pw = getpwuid(getuid());
-			if (!pw)
-				die("You don't exist. Go away!");
-			copy_email(pw, &git_default_email);
-		}
+		} else
+			copy_email(xgetpwuid_self(), &git_default_email);
 	}
 	return git_default_email.buf;
 }
@@ -303,9 +295,7 @@ const char *fmt_ident(const char *name, const char *email,
 				fputs(env_hint, stderr);
 			die("empty ident %s <%s> not allowed", name, email);
 		}
-		pw = getpwuid(getuid());
-		if (!pw)
-			die("You don't exist. Go away!");
+		pw = xgetpwuid_self();
 		name = pw->pw_name;
 	}
 
diff --git a/wrapper.c b/wrapper.c
index 6ccd059..b5e33e4 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -402,3 +402,15 @@ int remove_or_warn(unsigned int mode, const char *file)
 {
 	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
 }
+
+struct passwd *xgetpwuid_self(void)
+{
+	struct passwd *pw;
+
+	errno = 0;
+	pw = getpwuid(getuid());
+	if (!pw)
+		die(_("unable to look up current user in the passwd file: %s"),
+		    errno ? strerror(errno) : _("no such user"));
+	return pw;
+}
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 11/13] ident: use full dns names to generate email addresses
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (9 preceding siblings ...)
  2012-05-18 23:21                         ` [PATCH 10/13] ident: report passwd errors with a more friendly message Jeff King
@ 2012-05-18 23:22                         ` Jeff King
  2012-05-18 23:23                         ` [PATCH 12/13] ident: use a dynamic strbuf in fmt_ident Jeff King
  2012-05-18 23:24                         ` [PATCH 13/13] format-patch: refactor get_patch_filename Jeff King
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

When we construct an email address from the username and
hostname, we generate the host part of the email with this
procedure:

  1. add the result of gethostname

  2. if it has a dot, ok, it's fully qualified

  3. if not, then look up the unqualified hostname via
     gethostbyname; take the domain name of the result and
     append it to the hostname

Step 3 can actually produce a bogus result, as the name
returned by gethostbyname may not be related to the hostname
we fed it (e.g., consider a machine "foo" with names
"foo.one.example.com" and "bar.two.example.com"; we may have
the latter returned and generate the bogus name
"foo.two.example.com").

This patch simply uses the full hostname returned by
gethostbyname. In the common case that the first part is the
same as the unqualified hostname, the behavior is identical.
And in the case that it is not the same, we are much more
likely to be generating a valid name.

Signed-off-by: Jeff King <peff@peff.net>
---
It takes a pretty odd /etc/hosts setup, but I was able to trigger this
bug. I doubt anyone is affected in practice, but hey, the right code is
5 lines shorter (and easier to understand).

 ident.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/ident.c b/ident.c
index 5aec073..b111e34 100644
--- a/ident.c
+++ b/ident.c
@@ -65,23 +65,18 @@ static void add_domainname(struct strbuf *out)
 {
 	char buf[1024];
 	struct hostent *he;
-	const char *domainname;
 
 	if (gethostname(buf, sizeof(buf))) {
 		warning("cannot get host name: %s", strerror(errno));
 		strbuf_addstr(out, "(none)");
 		return;
 	}
-	strbuf_addstr(out, buf);
 	if (strchr(buf, '.'))
-		return;
-
-	he = gethostbyname(buf);
-	strbuf_addch(out, '.');
-	if (he && (domainname = strchr(he->h_name, '.')))
-		strbuf_addstr(out, domainname + 1);
+		strbuf_addstr(out, buf);
+	else if ((he = gethostbyname(buf)) && strchr(he->h_name, '.'))
+		strbuf_addstr(out, he->h_name);
 	else
-		strbuf_addstr(out, "(none)");
+		strbuf_addf(out, "%s.(none)", buf);
 }
 
 static void copy_email(const struct passwd *pw, struct strbuf *email)
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 12/13] ident: use a dynamic strbuf in fmt_ident
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (10 preceding siblings ...)
  2012-05-18 23:22                         ` [PATCH 11/13] ident: use full dns names to generate email addresses Jeff King
@ 2012-05-18 23:23                         ` Jeff King
  2012-05-18 23:24                         ` [PATCH 13/13] format-patch: refactor get_patch_filename Jeff King
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

Now that we accept arbitrary-sized names and email
addresses, the only remaining limit is in the actual
formatting of the names into a buffer. The current limit is
1000 characters, which is not likely to be reached, but
using a strbuf is one less error condition we have to worry
about.

Signed-off-by: Jeff King <peff@peff.net>
---
 ident.c | 43 +++++++++++++++----------------------------
 1 file changed, 15 insertions(+), 28 deletions(-)

diff --git a/ident.c b/ident.c
index b111e34..cefb829 100644
--- a/ident.c
+++ b/ident.c
@@ -121,15 +121,6 @@ const char *ident_default_date(void)
 	return git_default_date;
 }
 
-static int add_raw(char *buf, size_t size, int offset, const char *str)
-{
-	size_t len = strlen(str);
-	if (offset + len > size)
-		return size;
-	memcpy(buf + offset, str, len);
-	return offset + len;
-}
-
 static int crud(unsigned char c)
 {
 	return  c <= 32  ||
@@ -148,7 +139,7 @@ static int crud(unsigned char c)
  * Copy over a string to the destination, but avoid special
  * characters ('\n', '<' and '>') and remove crud at the end
  */
-static int copy(char *buf, size_t size, int offset, const char *src)
+static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src)
 {
 	size_t i, len;
 	unsigned char c;
@@ -172,19 +163,19 @@ static int copy(char *buf, size_t size, int offset, const char *src)
 	/*
 	 * Copy the rest to the buffer, but avoid the special
 	 * characters '\n' '<' and '>' that act as delimiters on
-	 * an identification line
+	 * an identification line. We can only remove crud, never add it,
+	 * so 'len' is our maximum.
 	 */
+	strbuf_grow(sb, len);
 	for (i = 0; i < len; i++) {
 		c = *src++;
 		switch (c) {
 		case '\n': case '<': case '>':
 			continue;
 		}
-		if (offset >= size)
-			return size;
-		buf[offset++] = c;
+		sb->buf[sb->len++] = c;
 	}
-	return offset;
+	sb->buf[sb->len] = '\0';
 }
 
 /*
@@ -271,9 +262,8 @@ static const char *env_hint =
 const char *fmt_ident(const char *name, const char *email,
 		      const char *date_str, int flag)
 {
-	static char buffer[1000];
+	static struct strbuf ident = STRBUF_INIT;
 	char date[50];
-	int i;
 	int error_on_no_name = (flag & IDENT_ERROR_ON_NO_NAME);
 	int name_addr_only = (flag & IDENT_NO_DATE);
 
@@ -300,19 +290,16 @@ const char *fmt_ident(const char *name, const char *email,
 			die("invalid date format: %s", date_str);
 	}
 
-	i = copy(buffer, sizeof(buffer), 0, name);
-	i = add_raw(buffer, sizeof(buffer), i, " <");
-	i = copy(buffer, sizeof(buffer), i, email);
+	strbuf_reset(&ident);
+	strbuf_addstr_without_crud(&ident, name);
+	strbuf_addstr(&ident, " <");
+	strbuf_addstr_without_crud(&ident, email);
+	strbuf_addch(&ident, '>');
 	if (!name_addr_only) {
-		i = add_raw(buffer, sizeof(buffer), i,  "> ");
-		i = copy(buffer, sizeof(buffer), i, date);
-	} else {
-		i = add_raw(buffer, sizeof(buffer), i, ">");
+		strbuf_addch(&ident, ' ');
+		strbuf_addstr_without_crud(&ident, date);
 	}
-	if (i >= sizeof(buffer))
-		die("Impossibly long personal identifier");
-	buffer[i] = 0;
-	return buffer;
+	return ident.buf;
 }
 
 const char *fmt_name(const char *name, const char *email)
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 13/13] format-patch: refactor get_patch_filename
  2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
                                           ` (11 preceding siblings ...)
  2012-05-18 23:23                         ` [PATCH 12/13] ident: use a dynamic strbuf in fmt_ident Jeff King
@ 2012-05-18 23:24                         ` Jeff King
  12 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-18 23:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

The get_patch_filename function expects a commit argument
and uses it to get the sanitized subject line when making a
patch filename. However, we also want to use this same
function for the cover letter, which does not have a commit
object. The current solution is to create a fake commit with
the subject "cover letter". Instead, let's make the
get_patch_filename interface more flexibile, and allow
passing a direct subject.

Signed-off-by: Jeff King <peff@peff.net>
---
This one is a bonus. It doesn't really have anything to do with ident,
except that I came across it while trying to figure out how the
committer info is being used in make_cover_letter.

 builtin/log.c | 35 +++++++----------------------------
 log-tree.c    | 19 +++++++++++--------
 log-tree.h    |  4 ++--
 3 files changed, 20 insertions(+), 38 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 656bddf..8010a40 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -663,7 +663,8 @@ static FILE *realstdout = NULL;
 static const char *output_directory = NULL;
 static int outdir_offset;
 
-static int reopen_stdout(struct commit *commit, struct rev_info *rev, int quiet)
+static int reopen_stdout(struct commit *commit, const char *subject,
+			 struct rev_info *rev, int quiet)
 {
 	struct strbuf filename = STRBUF_INIT;
 	int suffix_len = strlen(fmt_patch_suffix) + 1;
@@ -677,7 +678,7 @@ static int reopen_stdout(struct commit *commit, struct rev_info *rev, int quiet)
 			strbuf_addch(&filename, '/');
 	}
 
-	get_patch_filename(commit, rev->nr, fmt_patch_suffix, &filename);
+	get_patch_filename(commit, subject, rev->nr, fmt_patch_suffix, &filename);
 
 	if (!quiet)
 		fprintf(realstdout, "%s\n", filename.buf + outdir_offset);
@@ -778,7 +779,6 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
 	const char *encoding = "UTF-8";
 	struct diff_options opts;
 	int need_8bit_cte = 0;
-	struct commit *commit = NULL;
 	struct pretty_print_context pp = {0};
 
 	if (rev->commit_format != CMIT_FMT_EMAIL)
@@ -786,31 +786,10 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
 
 	committer = git_committer_info(0);
 
-	if (!numbered_files) {
-		/*
-		 * We fake a commit for the cover letter so we get the filename
-		 * desired.
-		 */
-		commit = xcalloc(1, sizeof(*commit));
-		commit->buffer = xmalloc(400);
-		snprintf(commit->buffer, 400,
-			"tree 0000000000000000000000000000000000000000\n"
-			"parent %s\n"
-			"author %s\n"
-			"committer %s\n\n"
-			"cover letter\n",
-			sha1_to_hex(head->object.sha1), committer, committer);
-	}
-
-	if (!use_stdout && reopen_stdout(commit, rev, quiet))
+	if (!use_stdout &&
+	    reopen_stdout(NULL, numbered_files ? NULL : "cover-letter", rev, quiet))
 		return;
 
-	if (commit) {
-
-		free(commit->buffer);
-		free(commit);
-	}
-
 	log_write_email_headers(rev, head, &pp.subject, &pp.after_subject,
 				&need_8bit_cte);
 
@@ -1405,8 +1384,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			gen_message_id(&rev, sha1_to_hex(commit->object.sha1));
 		}
 
-		if (!use_stdout && reopen_stdout(numbered_files ? NULL : commit,
-						 &rev, quiet))
+		if (!use_stdout &&
+		    reopen_stdout(numbered_files ? NULL : commit, NULL, &rev, quiet))
 			die(_("Failed to create output files"));
 		shown = log_tree_commit(&rev, commit);
 		free(commit->buffer);
diff --git a/log-tree.c b/log-tree.c
index 376d973..c894930 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -299,19 +299,22 @@ static unsigned int digits_in_number(unsigned int number)
 	return result;
 }
 
-void get_patch_filename(struct commit *commit, int nr, const char *suffix,
-			struct strbuf *buf)
+void get_patch_filename(struct commit *commit, const char *subject, int nr,
+			const char *suffix, struct strbuf *buf)
 {
 	int suffix_len = strlen(suffix) + 1;
 	int start_len = buf->len;
 
-	strbuf_addf(buf, commit ? "%04d-" : "%d", nr);
-	if (commit) {
+	strbuf_addf(buf, commit || subject ? "%04d-" : "%d", nr);
+	if (commit || subject) {
 		int max_len = start_len + FORMAT_PATCH_NAME_MAX - suffix_len;
 		struct pretty_print_context ctx = {0};
-		ctx.date_mode = DATE_NORMAL;
 
-		format_commit_message(commit, "%f", buf, &ctx);
+		if (subject)
+			strbuf_addstr(buf, subject);
+		else if (commit)
+			format_commit_message(commit, "%f", buf, &ctx);
+
 		if (max_len < buf->len)
 			strbuf_setlen(buf, max_len);
 		strbuf_addstr(buf, suffix);
@@ -384,8 +387,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 mime_boundary_leader, opt->mime_boundary);
 		extra_headers = subject_buffer;
 
-		get_patch_filename(opt->numbered_files ? NULL : commit, opt->nr,
-				    opt->patch_suffix, &filename);
+		get_patch_filename(opt->numbered_files ? NULL : commit, NULL,
+				   opt->nr, opt->patch_suffix, &filename);
 		snprintf(buffer, sizeof(buffer) - 1,
 			 "\n--%s%s\n"
 			 "Content-Type: text/x-patch;"
diff --git a/log-tree.h b/log-tree.h
index 5c4cf7c..f5ac238 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -21,7 +21,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 void load_ref_decorations(int flags);
 
 #define FORMAT_PATCH_NAME_MAX 64
-void get_patch_filename(struct commit *commit, int nr, const char *suffix,
-			struct strbuf *buf);
+void get_patch_filename(struct commit *commit, const char *subject, int nr,
+			const char *suffix, struct strbuf *buf);
 
 #endif
-- 
1.7.10.1.16.g53a707b

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 08/13] ident: don't write fallback username into git_default_name
  2012-05-18 23:19                         ` [PATCH 08/13] ident: don't write fallback username into git_default_name Jeff King
@ 2012-05-21  2:54                           ` Junio C Hamano
  2012-05-21  6:31                             ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-21  2:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

This step seems to break t/t4014-format-patch.sh; the last output when the
test is run with "-i -v" indicates that Message-Id: and In-Reply-To: lines
have "<long-hexdigits.long-decimal.git.name@host" without matching closing
">" on it.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/13] format-patch: use default email for generating message ids
  2012-05-18 23:13                         ` [PATCH 06/13] format-patch: use default email for generating message ids Jeff King
@ 2012-05-21  2:58                           ` Junio C Hamano
  2012-05-21  6:36                             ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-21  2:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> We try to generate a sane message id for cover letters and
> threading by appending some changing bits to the front of
> the user's email address. The current code parses the email
> out of the results of git_committer_info, but we can do this
> much more easily by just calling ident_default_email
> ourselves.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> Technically this is a regression if you really wanted:
>
>   GIT_COMMITTER_EMAIL=some.addr@example.com \
>   git format-patch --thread=deep
>
> to make your environment variable part of the message-ids. I don't think
> it matters, but I can adjust it if we care.

Is it because you no longer explicitly ask for "committer" and get generic
"who am I" bit from the ident infrastructure?

I wouldn't be surprised if some automated "commit email notification
reacting to push" bot in post-receive hook is using the environment
variable to affect the message ID.  I would doubt that would break the
message as long as the message ID generated from this codepath stays
valid, though, so I wouldn't worry about complaints along the lines of
"you started using names different from what you used to use".  As long as
we don't die due to "Hey bot, you do not seem to have a valid e-mail
address!", I don't think we need to worry about it.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/13] move git_default_* variables to ident.c
  2012-05-18 23:11                         ` [PATCH 05/13] move git_default_* variables " Jeff King
@ 2012-05-21  4:07                           ` Junio C Hamano
  2012-05-21  5:41                             ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-21  4:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> diff --git a/cache.h b/cache.h
> index 86224c8..f63b71f 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1142,9 +1142,6 @@ struct config_include_data {
>  #define CONFIG_INCLUDE_INIT { 0 }
>  extern int git_config_include(const char *name, const char *value, void *data);
>  
> -#define MAX_GITNAME (1000)
> -extern char git_default_email[MAX_GITNAME];
> -extern char git_default_name[MAX_GITNAME];
>  #define IDENT_NAME_GIVEN 01
>  #define IDENT_MAIL_GIVEN 02
>  #define IDENT_ALL_GIVEN (IDENT_NAME_GIVEN|IDENT_MAIL_GIVEN)

During this step, builtin/fmt-merge-msg.c ceases to compile, even though
in the end it is fixed...

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/13] move git_default_* variables to ident.c
  2012-05-21  4:07                           ` Junio C Hamano
@ 2012-05-21  5:41                             ` Jeff King
  2012-05-21  6:41                               ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-21  5:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Sun, May 20, 2012 at 09:07:42PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > diff --git a/cache.h b/cache.h
> > index 86224c8..f63b71f 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1142,9 +1142,6 @@ struct config_include_data {
> >  #define CONFIG_INCLUDE_INIT { 0 }
> >  extern int git_config_include(const char *name, const char *value, void *data);
> >  
> > -#define MAX_GITNAME (1000)
> > -extern char git_default_email[MAX_GITNAME];
> > -extern char git_default_name[MAX_GITNAME];
> >  #define IDENT_NAME_GIVEN 01
> >  #define IDENT_MAIL_GIVEN 02
> >  #define IDENT_ALL_GIVEN (IDENT_NAME_GIVEN|IDENT_MAIL_GIVEN)
> 
> During this step, builtin/fmt-merge-msg.c ceases to compile, even though
> in the end it is fixed...

Hrm. I ran the series through "make" on each commit before submitting,
and it compiled fine. I just did it again to double-check, and it's
still fine.

What was the error you got? Are you sure you had 03/13 in place
beforehand, which removes the use of MAX_GITNAME from
builtin/fmt-merge-msg.c?

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 08/13] ident: don't write fallback username into git_default_name
  2012-05-21  2:54                           ` Junio C Hamano
@ 2012-05-21  6:31                             ` Jeff King
  2012-05-21  9:11                               ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-21  6:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Sun, May 20, 2012 at 07:54:24PM -0700, Junio C Hamano wrote:

> This step seems to break t/t4014-format-patch.sh; the last output when the
> test is run with "-i -v" indicates that Message-Id: and In-Reply-To: lines
> have "<long-hexdigits.long-decimal.git.name@host" without matching closing
> ">" on it.

I found the issue. The test failure is actually introduced by patch
06/13 (format-patch: use default email for generating message ids). But
then it later gets fixed in 09/13 (drop length limitations on
gecos-derived names and emails).

The problems turns out to be a latent bug hidden in add_mailname_host.
It uses fgets to read from /etc/mailname, but never trims the trailing
newline. So git_default_email ends up set to something like
"peff@peff.net\n". Most of the time, this bug gets covered up by the
normalization that happens in fmt_ident (where we remove cruft
characters like newline). But in patch 6, we start using it directly
instead of re-parsing the output of fmt_ident, introducing a noticeable
bug.

I suspect there is also a bug in http-push.c, which directly uses
git_default_email as part of the xml in the dav request. I don't know if
it actually matters there or not.  And of course this triggers only on
systems with /etc/mailname, and only when you don't have user.email set.
So it may just be that the combination of that and the fact that hardly
anybody uses dumb http push has caused it to remain hidden.

The bug later ends up fixed in patch 9, because we replace fgets with
strbuf_getline, which trims the newline for us. Placing the patch below
anywhere before patch 6 fixes the issue (and then patch 9 would need to
later remove the code).

It does raise a subtle issue, though: should we be trimming whitespace
and other undesirable characters from git_default_email (or
git_default_name)? We don't currently, and it ends up OK because the
result typically is fed through fmt_ident, which cleans it up. But:

  1. We do look at the git_default_* variables for things like deciding
     whether the name is blank. So if your gecos field was " ", I think
     that would fool git into thinking it had something useful, and skip
     the IDENT_ERROR_ON_NO_NAME check, even though fmt_ident would
     produce an empty name.

  2. We don't always feed it through fmt_ident (the http-push.c callsite
     I mentioned above, and now patch 6 adds another one).

So I think my preference would be:

  - apply the patch below as 5.5/13

  - tweak patch 9 to remove the extra trimming

  - add a patch 14 to call strbuf_trim on the name and email buffers
    after reading them from system files.

-- >8 --
Subject: [PATCH] ident: trim trailing whitespace from /etc/mailname

Otherwise we will typically end up with an extra newline in
our git_default_email. Most of the time this doesn't matter,
as fmt_ident will skip it as cruft, but there is one code
path that accesses it directly (in http-push.c:lock_remote).

Let's trim any trailing space or line termination when we
read from /etc/mailname.

Signed-off-by: Jeff King <peff@peff.net>
---
 ident.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/ident.c b/ident.c
index af92b2c..e797e36 100644
--- a/ident.c
+++ b/ident.c
@@ -57,6 +57,7 @@ static void copy_gecos(const struct passwd *w, char *name, size_t sz)
 static int add_mailname_host(char *buf, size_t len)
 {
 	FILE *mailname;
+	char *p;
 
 	mailname = fopen("/etc/mailname", "r");
 	if (!mailname) {
@@ -74,6 +75,8 @@ static int add_mailname_host(char *buf, size_t len)
 	}
 	/* success! */
 	fclose(mailname);
+	for (p = buf + strlen(buf) - 1; p >= buf && isspace(*p); p--)
+		*p = '\0';
 	return 0;
 }
 
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/13] format-patch: use default email for generating message ids
  2012-05-21  2:58                           ` Junio C Hamano
@ 2012-05-21  6:36                             ` Jeff King
  0 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21  6:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Sun, May 20, 2012 at 07:58:04PM -0700, Junio C Hamano wrote:

> > Technically this is a regression if you really wanted:
> >
> >   GIT_COMMITTER_EMAIL=some.addr@example.com \
> >   git format-patch --thread=deep
> >
> > to make your environment variable part of the message-ids. I don't think
> > it matters, but I can adjust it if we care.
> 
> Is it because you no longer explicitly ask for "committer" and get generic
> "who am I" bit from the ident infrastructure?

Right. Calling git_committer_info will also check
getenv("GIT_COMMITTER_EMAIL"), but we don't bother to do that here. We
could change the code to:

  const char *email = getenv("GIT_COMMITTER_EMAIL");
  if (!email)
          email = ident_default_email();

if it matters. I don't think it's worth building an alternate version of
git_committer_info that doesn't do the "name" half of the info.

> I wouldn't be surprised if some automated "commit email notification
> reacting to push" bot in post-receive hook is using the environment
> variable to affect the message ID.  I would doubt that would break the
> message as long as the message ID generated from this codepath stays
> valid, though, so I wouldn't worry about complaints along the lines of
> "you started using names different from what you used to use".  As long as
> we don't die due to "Hey bot, you do not seem to have a valid e-mail
> address!", I don't think we need to worry about it.

No, we'll never die as a result. But if you have a poorly configured
machine, you might get "user@host.(none)". Or I suppose you could leak
information about the username of the post-receive process. In both
cases, we do this already, so the only change is if you are trying to
override that with GIT_COMMITTER_EMAIL.

So it's a little far-fetched, which is why I didn't bother. But the code
above is quite simple, so maybe it is better to be conservative.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/13] move git_default_* variables to ident.c
  2012-05-21  5:41                             ` Jeff King
@ 2012-05-21  6:41                               ` Jeff King
  0 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21  6:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Mon, May 21, 2012 at 01:41:18AM -0400, Jeff King wrote:

> > > -#define MAX_GITNAME (1000)
> > > -extern char git_default_email[MAX_GITNAME];
> > > -extern char git_default_name[MAX_GITNAME];
> > >  #define IDENT_NAME_GIVEN 01
> > >  #define IDENT_MAIL_GIVEN 02
> > >  #define IDENT_ALL_GIVEN (IDENT_NAME_GIVEN|IDENT_MAIL_GIVEN)
> > 
> > During this step, builtin/fmt-merge-msg.c ceases to compile, even though
> > in the end it is fixed...
> 
> Hrm. I ran the series through "make" on each commit before submitting,
> and it compiled fine. I just did it again to double-check, and it's
> still fine.
> 
> What was the error you got? Are you sure you had 03/13 in place
> beforehand, which removes the use of MAX_GITNAME from
> builtin/fmt-merge-msg.c?

Ah, yeah. I just looked at what you wrote in the "What's Cooking"
message. For some reason you applied patch 03/13 out of order (in slot
5). I'm not sure how you order the messages in a series when you apply
it, but AFAICT, the numbers in the subjects and the date stamps are
in the right order. Operator error? :)

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 08/13] ident: don't write fallback username into git_default_name
  2012-05-21  6:31                             ` Jeff King
@ 2012-05-21  9:11                               ` Junio C Hamano
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-21  9:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> It does raise a subtle issue, though: should we be trimming whitespace
> and other undesirable characters from git_default_email (or
> git_default_name)? We don't currently, and it ends up OK because the
> result typically is fed through fmt_ident, which cleans it up. But:
>
>   1. We do look at the git_default_* variables for things like deciding
>      whether the name is blank. So if your gecos field was " ", I think
>      that would fool git into thinking it had something useful, and skip
>      the IDENT_ERROR_ON_NO_NAME check, even though fmt_ident would
>      produce an empty name.
>
>   2. We don't always feed it through fmt_ident (the http-push.c callsite
>      I mentioned above, and now patch 6 adds another one).
>
> So I think my preference would be:
>
>   - apply the patch below as 5.5/13
>
>   - tweak patch 9 to remove the extra trimming
>
>   - add a patch 14 to call strbuf_trim on the name and email buffers
>     after reading them from system files.

Sounds sensible; thanks.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCHv2 0/15] ident cleanups git_default_name
  2012-05-21  9:11                               ` Junio C Hamano
@ 2012-05-21 23:09                                 ` Jeff King
  2012-05-21 23:09                                   ` [PATCHv2 01/15] ident: split setup_ident into separate functions Jeff King
                                                     ` (14 more replies)
  0 siblings, 15 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Mon, May 21, 2012 at 02:11:58AM -0700, Junio C Hamano wrote:

> > So I think my preference would be:
> >
> >   - apply the patch below as 5.5/13
> >
> >   - tweak patch 9 to remove the extra trimming
> >
> >   - add a patch 14 to call strbuf_trim on the name and email buffers
> >     after reading them from system files.
> 
> Sounds sensible; thanks.

Here's the whole series again. Most of it is unchanged, but given the
ordering problem earlier, it seems reasonable to just post it all again.

  [01/15]: ident: split setup_ident into separate functions
  [02/15]: http-push: do not access git_default_email directly
  [03/15]: fmt-merge-msg: don't use static buffer in record_person
  [04/15]: move identity config parsing to ident.c
  [05/15]: move git_default_* variables to ident.c
  [06/15]: ident: trim trailing newline from /etc/mailname

  This one is new and fixes the t4014 failure. The earlier version I
  posted trimmed all trailing whitespace, but there's not much point in
  being thorough, since this code gets removed later, anyway.

  [07/15]: format-patch: use default email for generating message ids
  [08/15]: fmt_ident: drop IDENT_WARN_ON_NO_NAME code
  [09/15]: ident: don't write fallback username into git_default_name
  [10/15]: drop length limitations on gecos-derived names and emails

  This one drops the new code from patch 6, as it is pointless with
  strbufs. I considered reordering the patches to avoid patch 6
  altogether, but there is a circular dependency with getting rid of the
  WARN case in patch 8.

  [11/15]: ident: report passwd errors with a more friendly message
  [12/15]: ident: use full dns names to generate email addresses
  [13/15]: ident: use a dynamic strbuf in fmt_ident
  [14/15]: ident: trim whitespace from default name/email

  This one is new, and actually does a good job of trimming unnecessary
  whitespace (from both the front and back, and from both name and
  email).

  [15/15]: format-patch: refactor get_patch_filename

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCHv2 01/15] ident: split setup_ident into separate functions
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
@ 2012-05-21 23:09                                   ` Jeff King
  2012-05-21 23:09                                   ` [PATCHv2 02/15] http-push: do not access git_default_email directly Jeff King
                                                     ` (13 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

This function sets up the default name, email, and date, and
is not publicly available. Let's split it into three public
functions so that callers can get just the parts they need.

While we're at it, let's change the interface to simple
accessors. The original function was called only by fmt_ident,
and contained logic for "if we already have some other
value, don't load the default" which properly belongs in
fmt_ident.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h |  3 +++
 ident.c | 35 +++++++++++++++++++----------------
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/cache.h b/cache.h
index e14ffcd..0c095d4 100644
--- a/cache.h
+++ b/cache.h
@@ -894,6 +894,9 @@ extern const char *git_author_info(int);
 extern const char *git_committer_info(int);
 extern const char *fmt_ident(const char *name, const char *email, const char *date_str, int);
 extern const char *fmt_name(const char *name, const char *email);
+extern const char *ident_default_name(void);
+extern const char *ident_default_email(void);
+extern const char *ident_default_date(void);
 extern const char *git_editor(void);
 extern const char *git_pager(int stdout_is_tty);
 
diff --git a/ident.c b/ident.c
index 87c697c..0f7dcae 100644
--- a/ident.c
+++ b/ident.c
@@ -117,21 +117,20 @@ static void copy_email(const struct passwd *pw)
 			sizeof(git_default_email) - len);
 }
 
-static void setup_ident(const char **name, const char **emailp)
+const char *ident_default_name(void)
 {
-	struct passwd *pw = NULL;
-
-	/* Get the name ("gecos") */
-	if (!*name && !git_default_name[0]) {
-		pw = getpwuid(getuid());
+	if (!git_default_name[0]) {
+		struct passwd *pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
 		copy_gecos(pw, git_default_name, sizeof(git_default_name));
 	}
-	if (!*name)
-		*name = git_default_name;
+	return git_default_name;
+}
 
-	if (!*emailp && !git_default_email[0]) {
+const char *ident_default_email(void)
+{
+	if (!git_default_email[0]) {
 		const char *email = getenv("EMAIL");
 
 		if (email && email[0]) {
@@ -139,19 +138,20 @@ static void setup_ident(const char **name, const char **emailp)
 				sizeof(git_default_email));
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		} else {
-			if (!pw)
-				pw = getpwuid(getuid());
+			struct passwd *pw = getpwuid(getuid());
 			if (!pw)
 				die("You don't exist. Go away!");
 			copy_email(pw);
 		}
 	}
-	if (!*emailp)
-		*emailp = git_default_email;
+	return git_default_email;
+}
 
-	/* And set the default date */
+const char *ident_default_date(void)
+{
 	if (!git_default_date[0])
 		datestamp(git_default_date, sizeof(git_default_date));
+	return git_default_date;
 }
 
 static int add_raw(char *buf, size_t size, int offset, const char *str)
@@ -311,7 +311,10 @@ const char *fmt_ident(const char *name, const char *email,
 	int warn_on_no_name = (flag & IDENT_WARN_ON_NO_NAME);
 	int name_addr_only = (flag & IDENT_NO_DATE);
 
-	setup_ident(&name, &email);
+	if (!name)
+		name = ident_default_name();
+	if (!email)
+		email = ident_default_email();
 
 	if (!*name) {
 		struct passwd *pw;
@@ -331,7 +334,7 @@ const char *fmt_ident(const char *name, const char *email,
 		name = git_default_name;
 	}
 
-	strcpy(date, git_default_date);
+	strcpy(date, ident_default_date());
 	if (!name_addr_only && date_str && date_str[0]) {
 		if (parse_date(date_str, date, sizeof(date)) < 0)
 			die("invalid date format: %s", date_str);
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 02/15] http-push: do not access git_default_email directly
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
  2012-05-21 23:09                                   ` [PATCHv2 01/15] ident: split setup_ident into separate functions Jeff King
@ 2012-05-21 23:09                                   ` Jeff King
  2012-05-21 23:09                                   ` [PATCHv2 03/15] fmt-merge-msg: don't use static buffer in record_person Jeff King
                                                     ` (12 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

By calling the ident_default_email accessor, we can be sure
that the default value is actually filled-in.

Signed-off-by: Jeff King <peff@peff.net>
---
 http-push.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/http-push.c b/http-push.c
index 1df7ab5..a832ca7 100644
--- a/http-push.c
+++ b/http-push.c
@@ -904,7 +904,7 @@ static struct remote_lock *lock_remote(const char *path, long timeout)
 		ep = strchr(ep + 1, '/');
 	}
 
-	escaped = xml_entities(git_default_email);
+	escaped = xml_entities(ident_default_email());
 	strbuf_addf(&out_buffer.buf, LOCK_REQUEST, escaped);
 	free(escaped);
 
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 03/15] fmt-merge-msg: don't use static buffer in record_person
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
  2012-05-21 23:09                                   ` [PATCHv2 01/15] ident: split setup_ident into separate functions Jeff King
  2012-05-21 23:09                                   ` [PATCHv2 02/15] http-push: do not access git_default_email directly Jeff King
@ 2012-05-21 23:09                                   ` Jeff King
  2012-05-21 23:09                                   ` [PATCHv2 04/15] move identity config parsing to ident.c Jeff King
                                                     ` (11 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

The record_person function just parses out the "name" field
of the person line in a commit and adds it to a string_list.
The only reason we need an extra buffer is that the
string_list functions require a NUL-terminated string.

Instead of the static buffer, we can just allocate a
temporary NUL-terminated copy. In addition to removing a
useless limit, this removes the only user of MAX_GITNAME
outside of ident.c.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/fmt-merge-msg.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index a517f17..4ee6a88 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -230,7 +230,7 @@ static void add_branch_desc(struct strbuf *out, const char *name)
 static void record_person(int which, struct string_list *people,
 			  struct commit *commit)
 {
-	char name_buf[MAX_GITNAME], *name, *name_end;
+	char *name_buf, *name, *name_end;
 	struct string_list_item *elem;
 	const char *field = (which == 'a') ? "\nauthor " : "\ncommitter ";
 
@@ -243,10 +243,9 @@ static void record_person(int which, struct string_list *people,
 		name_end--;
 	while (isspace(*name_end) && name <= name_end)
 		name_end--;
-	if (name_end < name || name + MAX_GITNAME <= name_end)
+	if (name_end < name)
 		return;
-	memcpy(name_buf, name, name_end - name + 1);
-	name_buf[name_end - name + 1] = '\0';
+	name_buf = xmemdupz(name, name_end - name + 1);
 
 	elem = string_list_lookup(people, name_buf);
 	if (!elem) {
@@ -254,6 +253,7 @@ static void record_person(int which, struct string_list *people,
 		elem->util = (void *)0;
 	}
 	elem->util = (void*)(util_as_integral(elem) + 1);
+	free(name_buf);
 }
 
 static int cmp_string_list_util_as_integral(const void *a_, const void *b_)
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 04/15] move identity config parsing to ident.c
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (2 preceding siblings ...)
  2012-05-21 23:09                                   ` [PATCHv2 03/15] fmt-merge-msg: don't use static buffer in record_person Jeff King
@ 2012-05-21 23:09                                   ` Jeff King
  2012-05-21 23:09                                   ` [PATCHv2 05/15] move git_default_* variables " Jeff King
                                                     ` (10 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

There's no reason for this to be in config, except that once
upon a time all of the config parsing was there. It makes
more sense to keep the ident code together.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h  |  1 +
 config.c | 24 +-----------------------
 ident.c  | 21 +++++++++++++++++++++
 3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/cache.h b/cache.h
index 0c095d4..86224c8 100644
--- a/cache.h
+++ b/cache.h
@@ -899,6 +899,7 @@ extern const char *ident_default_email(void);
 extern const char *ident_default_date(void);
 extern const char *git_editor(void);
 extern const char *git_pager(int stdout_is_tty);
+extern int git_ident_config(const char *, const char *, void *);
 
 struct ident_split {
 	const char *name_begin;
diff --git a/config.c b/config.c
index eeee986..71ef171 100644
--- a/config.c
+++ b/config.c
@@ -762,28 +762,6 @@ static int git_default_core_config(const char *var, const char *value)
 	return 0;
 }
 
-static int git_default_user_config(const char *var, const char *value)
-{
-	if (!strcmp(var, "user.name")) {
-		if (!value)
-			return config_error_nonbool(var);
-		strlcpy(git_default_name, value, sizeof(git_default_name));
-		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
-		return 0;
-	}
-
-	if (!strcmp(var, "user.email")) {
-		if (!value)
-			return config_error_nonbool(var);
-		strlcpy(git_default_email, value, sizeof(git_default_email));
-		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
-		return 0;
-	}
-
-	/* Add other config variables here and to Documentation/config.txt. */
-	return 0;
-}
-
 static int git_default_i18n_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "i18n.commitencoding"))
@@ -870,7 +848,7 @@ int git_default_config(const char *var, const char *value, void *dummy)
 		return git_default_core_config(var, value);
 
 	if (!prefixcmp(var, "user."))
-		return git_default_user_config(var, value);
+		return git_ident_config(var, value, dummy);
 
 	if (!prefixcmp(var, "i18n."))
 		return git_default_i18n_config(var, value);
diff --git a/ident.c b/ident.c
index 0f7dcae..bb1158f 100644
--- a/ident.c
+++ b/ident.c
@@ -388,3 +388,24 @@ int user_ident_sufficiently_given(void)
 	return (user_ident_explicitly_given == IDENT_ALL_GIVEN);
 #endif
 }
+
+int git_ident_config(const char *var, const char *value, void *data)
+{
+	if (!strcmp(var, "user.name")) {
+		if (!value)
+			return config_error_nonbool(var);
+		strlcpy(git_default_name, value, sizeof(git_default_name));
+		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
+		return 0;
+	}
+
+	if (!strcmp(var, "user.email")) {
+		if (!value)
+			return config_error_nonbool(var);
+		strlcpy(git_default_email, value, sizeof(git_default_email));
+		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
+		return 0;
+	}
+
+	return 0;
+}
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 05/15] move git_default_* variables to ident.c
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (3 preceding siblings ...)
  2012-05-21 23:09                                   ` [PATCHv2 04/15] move identity config parsing to ident.c Jeff King
@ 2012-05-21 23:09                                   ` Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 06/15] ident: trim trailing newline from /etc/mailname Jeff King
                                                     ` (9 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

There's no reason anybody outside of ident.c should access
these directly (they should use the new accessors which make
sure the variables are initialized), so we can make them
file-scope statics.

While we're at it, move user_ident_explicitly_given into
ident.c; while still globally visible, it makes more sense
to reside with the ident code.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h       | 3 ---
 environment.c | 3 ---
 ident.c       | 4 ++++
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/cache.h b/cache.h
index 86224c8..f63b71f 100644
--- a/cache.h
+++ b/cache.h
@@ -1142,9 +1142,6 @@ struct config_include_data {
 #define CONFIG_INCLUDE_INIT { 0 }
 extern int git_config_include(const char *name, const char *value, void *data);
 
-#define MAX_GITNAME (1000)
-extern char git_default_email[MAX_GITNAME];
-extern char git_default_name[MAX_GITNAME];
 #define IDENT_NAME_GIVEN 01
 #define IDENT_MAIL_GIVEN 02
 #define IDENT_ALL_GIVEN (IDENT_NAME_GIVEN|IDENT_MAIL_GIVEN)
diff --git a/environment.c b/environment.c
index d7e6c65..669e498 100644
--- a/environment.c
+++ b/environment.c
@@ -11,9 +11,6 @@
 #include "refs.h"
 #include "fmt-merge-msg.h"
 
-char git_default_email[MAX_GITNAME];
-char git_default_name[MAX_GITNAME];
-int user_ident_explicitly_given;
 int trust_executable_bit = 1;
 int trust_ctime = 1;
 int has_symlinks = 1;
diff --git a/ident.c b/ident.c
index bb1158f..af92b2c 100644
--- a/ident.c
+++ b/ident.c
@@ -7,7 +7,11 @@
  */
 #include "cache.h"
 
+#define MAX_GITNAME (1000)
+static char git_default_name[MAX_GITNAME];
+static char git_default_email[MAX_GITNAME];
 static char git_default_date[50];
+int user_ident_explicitly_given;
 
 #ifdef NO_GECOS_IN_PWENT
 #define get_gecos(ignored) "&"
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 06/15] ident: trim trailing newline from /etc/mailname
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (4 preceding siblings ...)
  2012-05-21 23:09                                   ` [PATCHv2 05/15] move git_default_* variables " Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 07/15] format-patch: use default email for generating message ids Jeff King
                                                     ` (8 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

We use fgets to read the /etc/mailname file, which means we
will typically end up with an extra newline in our
git_default_email. Most of the time this doesn't matter, as
fmt_ident will skip it as cruft, but there is one code path
that accesses it directly (in http-push.c:lock_remote).

Signed-off-by: Jeff King <peff@peff.net>
---
 ident.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/ident.c b/ident.c
index af92b2c..acb3a08 100644
--- a/ident.c
+++ b/ident.c
@@ -74,6 +74,10 @@ static int add_mailname_host(char *buf, size_t len)
 	}
 	/* success! */
 	fclose(mailname);
+
+	len = strlen(buf);
+	if (len && buf[len-1] == '\n')
+		buf[len-1] = '\0';
 	return 0;
 }
 
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 07/15] format-patch: use default email for generating message ids
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (5 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 06/15] ident: trim trailing newline from /etc/mailname Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 08/15] fmt_ident: drop IDENT_WARN_ON_NO_NAME code Jeff King
                                                     ` (7 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

We try to generate a sane message id for cover letters and
threading by appending some changing bits to the front of
the user's email address. The current code parses the email
out of the results of git_committer_info, but we can do this
much more easily by just calling ident_default_email
ourselves.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/log.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 690caa7..656bddf 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -737,15 +737,9 @@ static void get_patch_ids(struct rev_info *rev, struct patch_ids *ids, const cha
 
 static void gen_message_id(struct rev_info *info, char *base)
 {
-	const char *committer = git_committer_info(IDENT_WARN_ON_NO_NAME);
-	const char *email_start = strrchr(committer, '<');
-	const char *email_end = strrchr(committer, '>');
 	struct strbuf buf = STRBUF_INIT;
-	if (!email_start || !email_end || email_start > email_end - 1)
-		die(_("Could not extract email from committer identity."));
-	strbuf_addf(&buf, "%s.%lu.git.%.*s", base,
-		    (unsigned long) time(NULL),
-		    (int)(email_end - email_start - 1), email_start + 1);
+	strbuf_addf(&buf, "%s.%lu.git.%s", base,
+		    (unsigned long) time(NULL), ident_default_email());
 	info->message_id = strbuf_detach(&buf, NULL);
 }
 
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 08/15] fmt_ident: drop IDENT_WARN_ON_NO_NAME code
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (6 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 07/15] format-patch: use default email for generating message ids Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 09/15] ident: don't write fallback username into git_default_name Jeff King
                                                     ` (6 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

There are no more callers who want this, so we can drop it.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h |  5 ++---
 ident.c | 11 ++++-------
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index f63b71f..65cbab5 100644
--- a/cache.h
+++ b/cache.h
@@ -887,9 +887,8 @@ unsigned long approxidate_careful(const char *, int *);
 unsigned long approxidate_relative(const char *date, const struct timeval *now);
 enum date_mode parse_date_format(const char *format);
 
-#define IDENT_WARN_ON_NO_NAME  1
-#define IDENT_ERROR_ON_NO_NAME 2
-#define IDENT_NO_DATE	       4
+#define IDENT_ERROR_ON_NO_NAME 1
+#define IDENT_NO_DATE	       2
 extern const char *git_author_info(int);
 extern const char *git_committer_info(int);
 extern const char *fmt_ident(const char *name, const char *email, const char *date_str, int);
diff --git a/ident.c b/ident.c
index acb3a08..3b92c44 100644
--- a/ident.c
+++ b/ident.c
@@ -316,7 +316,6 @@ const char *fmt_ident(const char *name, const char *email,
 	char date[50];
 	int i;
 	int error_on_no_name = (flag & IDENT_ERROR_ON_NO_NAME);
-	int warn_on_no_name = (flag & IDENT_WARN_ON_NO_NAME);
 	int name_addr_only = (flag & IDENT_NO_DATE);
 
 	if (!name)
@@ -327,13 +326,11 @@ const char *fmt_ident(const char *name, const char *email,
 	if (!*name) {
 		struct passwd *pw;
 
-		if ((warn_on_no_name || error_on_no_name) &&
-		    name == git_default_name && env_hint) {
-			fputs(env_hint, stderr);
-			env_hint = NULL; /* warn only once */
-		}
-		if (error_on_no_name)
+		if (error_on_no_name) {
+			if (name == git_default_name)
+				fputs(env_hint, stderr);
 			die("empty ident %s <%s> not allowed", name, email);
+		}
 		pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 09/15] ident: don't write fallback username into git_default_name
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (7 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 08/15] fmt_ident: drop IDENT_WARN_ON_NO_NAME code Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 10/15] drop length limitations on gecos-derived names and emails Jeff King
                                                     ` (5 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

The fmt_ident function gets a flag that tells us whether to
die if the name field is blank. If it is blank and we don't
die, then we fall back to the username from the passwd file.

The current code writes the value into git_default_name.
However, that's not necessarily correct, as the empty value
might have come from git_default_name, or it might have been
passed in.  This leads to two potential problems:

  1. If we are overriding an empty name in the passed-in
     value, then we may be overwriting a perfectly good name
     (from gitconfig or gecos) in the git_default_name
     buffer. Later calls to fmt_ident will end up using the
     fallback name, even though a better name was available.

  2. If we override an empty gecos name, we end up with the
     fallback name in git_default_name. A later call that
     uses IDENT_ERROR_ON_NO_NAME will see the fallback name
     and think that it is a good name, instead of producing
     an error. In other words, a blank gecos name would
     cause an error with this code:

       git_committer_info(IDENT_ERROR_ON_NO_NAME);

     but not this:

       git_committer_info(0);
       git_committer_info(IDENT_ERROR_ON_NO_NAME);

     because in the latter case, the first call has polluted
     the name buffer.

Instead, let's make the fallback a per-invocation variable.
We can just use the pw->pw_name string directly, since it
only needs to persist through the rest of the function (and
we don't do any other getpwent calls).

Note that while this solves (1) for future invocations of
fmt_indent, the current invocation might use the fallback
when it could in theory load a better value from
git_default_name. However, by not passing
IDENT_ERROR_ON_NO_NAME, the caller is indicating that it
does not care too much about the name, anyway, so we don't
bother; this is primarily about protecting future callers
who do care.

Signed-off-by: Jeff King <peff@peff.net>
---
 ident.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/ident.c b/ident.c
index 3b92c44..87e3cbe 100644
--- a/ident.c
+++ b/ident.c
@@ -334,9 +334,7 @@ const char *fmt_ident(const char *name, const char *email,
 		pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-		strlcpy(git_default_name, pw->pw_name,
-			sizeof(git_default_name));
-		name = git_default_name;
+		name = pw->pw_name;
 	}
 
 	strcpy(date, ident_default_date());
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 10/15] drop length limitations on gecos-derived names and emails
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (8 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 09/15] ident: don't write fallback username into git_default_name Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2013-01-24 23:21                                     ` [regression] " Jonathan Nieder
  2012-05-21 23:10                                   ` [PATCHv2 11/15] ident: report passwd errors with a more friendly message Jeff King
                                                     ` (4 subsequent siblings)
  14 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

When we pull the user's name from the GECOS field of the
passwd file (or generate an email address based on their
username and hostname), we put the result into a
static buffer. While it's extremely unlikely that anybody
ever hit these limits (after all, in such a case their
parents must have hated them), we still had to deal with the
error cases in our code.

Converting these static buffers to strbufs lets us simplify
the code and drop some error messages from the documentation
that have confused some users.

The conversion is mostly mechanical: replace string copies
with strbuf equivalents, and access the strbuf.buf directly.
There are a few exceptions:

  - copy_gecos and copy_email are the big winners in code
    reduction (since they no longer have to manage the
    string length manually)

  - git_ident_config wants to replace old versions of
    the default name (e.g., if we read the config multiple
    times), so it must reset+add to the strbuf instead of
    just adding

Note that there is still one length limitation: the
gethostname interface requires us to provide a static
buffer, so we arbitrarily choose 1024 bytes for the
hostname.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-commit-tree.txt |   4 --
 Documentation/git-var.txt         |   4 --
 ident.c                           | 104 ++++++++++++++------------------------
 3 files changed, 39 insertions(+), 73 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index cfb9906..eb12b2d 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -92,10 +92,6 @@ Diagnostics
 -----------
 You don't exist. Go away!::
     The passwd(5) gecos field couldn't be read
-Your parents must have hated you!::
-    The passwd(5) gecos field is longer than a giant static buffer.
-Your sysadmin must hate you!::
-    The passwd(5) name field is longer than a giant static buffer.
 
 Discussion
 ----------
diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 988a323..3f703e3 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -63,10 +63,6 @@ Diagnostics
 -----------
 You don't exist. Go away!::
     The passwd(5) gecos field couldn't be read
-Your parents must have hated you!::
-    The passwd(5) gecos field is longer than a giant static buffer.
-Your sysadmin must hate you!::
-    The passwd(5) name field is longer than a giant static buffer.
 
 SEE ALSO
 --------
diff --git a/ident.c b/ident.c
index 87e3cbe..73a06a1 100644
--- a/ident.c
+++ b/ident.c
@@ -7,9 +7,8 @@
  */
 #include "cache.h"
 
-#define MAX_GITNAME (1000)
-static char git_default_name[MAX_GITNAME];
-static char git_default_email[MAX_GITNAME];
+static struct strbuf git_default_name = STRBUF_INIT;
+static struct strbuf git_default_email = STRBUF_INIT;
 static char git_default_date[50];
 int user_ident_explicitly_given;
 
@@ -19,42 +18,27 @@ int user_ident_explicitly_given;
 #define get_gecos(struct_passwd) ((struct_passwd)->pw_gecos)
 #endif
 
-static void copy_gecos(const struct passwd *w, char *name, size_t sz)
+static void copy_gecos(const struct passwd *w, struct strbuf *name)
 {
-	char *src, *dst;
-	size_t len, nlen;
-
-	nlen = strlen(w->pw_name);
+	char *src;
 
 	/* Traditionally GECOS field had office phone numbers etc, separated
 	 * with commas.  Also & stands for capitalized form of the login name.
 	 */
 
-	for (len = 0, dst = name, src = get_gecos(w); len < sz; src++) {
+	for (src = get_gecos(w); *src && *src != ','; src++) {
 		int ch = *src;
-		if (ch != '&') {
-			*dst++ = ch;
-			if (ch == 0 || ch == ',')
-				break;
-			len++;
-			continue;
-		}
-		if (len + nlen < sz) {
+		if (ch != '&')
+			strbuf_addch(name, ch);
+		else {
 			/* Sorry, Mr. McDonald... */
-			*dst++ = toupper(*w->pw_name);
-			memcpy(dst, w->pw_name + 1, nlen - 1);
-			dst += nlen - 1;
-			len += nlen;
+			strbuf_addch(name, toupper(*w->pw_name));
+			strbuf_addstr(name, w->pw_name + 1);
 		}
 	}
-	if (len < sz)
-		name[len] = 0;
-	else
-		die("Your parents must have hated you!");
-
 }
 
-static int add_mailname_host(char *buf, size_t len)
+static int add_mailname_host(struct strbuf *buf)
 {
 	FILE *mailname;
 
@@ -65,7 +49,7 @@ static int add_mailname_host(char *buf, size_t len)
 				strerror(errno));
 		return -1;
 	}
-	if (!fgets(buf, len, mailname)) {
+	if (strbuf_getline(buf, mailname, '\n') == EOF) {
 		if (ferror(mailname))
 			warning("cannot read /etc/mailname: %s",
 				strerror(errno));
@@ -74,85 +58,73 @@ static int add_mailname_host(char *buf, size_t len)
 	}
 	/* success! */
 	fclose(mailname);
-
-	len = strlen(buf);
-	if (len && buf[len-1] == '\n')
-		buf[len-1] = '\0';
 	return 0;
 }
 
-static void add_domainname(char *buf, size_t len)
+static void add_domainname(struct strbuf *out)
 {
+	char buf[1024];
 	struct hostent *he;
-	size_t namelen;
 	const char *domainname;
 
-	if (gethostname(buf, len)) {
+	if (gethostname(buf, sizeof(buf))) {
 		warning("cannot get host name: %s", strerror(errno));
-		strlcpy(buf, "(none)", len);
+		strbuf_addstr(out, "(none)");
 		return;
 	}
-	namelen = strlen(buf);
-	if (memchr(buf, '.', namelen))
+	strbuf_addstr(out, buf);
+	if (strchr(buf, '.'))
 		return;
 
 	he = gethostbyname(buf);
-	buf[namelen++] = '.';
-	buf += namelen;
-	len -= namelen;
+	strbuf_addch(out, '.');
 	if (he && (domainname = strchr(he->h_name, '.')))
-		strlcpy(buf, domainname + 1, len);
+		strbuf_addstr(out, domainname + 1);
 	else
-		strlcpy(buf, "(none)", len);
+		strbuf_addstr(out, "(none)");
 }
 
-static void copy_email(const struct passwd *pw)
+static void copy_email(const struct passwd *pw, struct strbuf *email)
 {
 	/*
 	 * Make up a fake email address
 	 * (name + '@' + hostname [+ '.' + domainname])
 	 */
-	size_t len = strlen(pw->pw_name);
-	if (len > sizeof(git_default_email)/2)
-		die("Your sysadmin must hate you!");
-	memcpy(git_default_email, pw->pw_name, len);
-	git_default_email[len++] = '@';
-
-	if (!add_mailname_host(git_default_email + len,
-				sizeof(git_default_email) - len))
+	strbuf_addstr(email, pw->pw_name);
+	strbuf_addch(email, '@');
+
+	if (!add_mailname_host(email))
 		return;	/* read from "/etc/mailname" (Debian) */
-	add_domainname(git_default_email + len,
-			sizeof(git_default_email) - len);
+	add_domainname(email);
 }
 
 const char *ident_default_name(void)
 {
-	if (!git_default_name[0]) {
+	if (!git_default_name.len) {
 		struct passwd *pw = getpwuid(getuid());
 		if (!pw)
 			die("You don't exist. Go away!");
-		copy_gecos(pw, git_default_name, sizeof(git_default_name));
+		copy_gecos(pw, &git_default_name);
 	}
-	return git_default_name;
+	return git_default_name.buf;
 }
 
 const char *ident_default_email(void)
 {
-	if (!git_default_email[0]) {
+	if (!git_default_email.len) {
 		const char *email = getenv("EMAIL");
 
 		if (email && email[0]) {
-			strlcpy(git_default_email, email,
-				sizeof(git_default_email));
+			strbuf_addstr(&git_default_email, email);
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		} else {
 			struct passwd *pw = getpwuid(getuid());
 			if (!pw)
 				die("You don't exist. Go away!");
-			copy_email(pw);
+			copy_email(pw, &git_default_email);
 		}
 	}
-	return git_default_email;
+	return git_default_email.buf;
 }
 
 const char *ident_default_date(void)
@@ -327,7 +299,7 @@ const char *fmt_ident(const char *name, const char *email,
 		struct passwd *pw;
 
 		if (error_on_no_name) {
-			if (name == git_default_name)
+			if (name == git_default_name.buf)
 				fputs(env_hint, stderr);
 			die("empty ident %s <%s> not allowed", name, email);
 		}
@@ -397,7 +369,8 @@ int git_ident_config(const char *var, const char *value, void *data)
 	if (!strcmp(var, "user.name")) {
 		if (!value)
 			return config_error_nonbool(var);
-		strlcpy(git_default_name, value, sizeof(git_default_name));
+		strbuf_reset(&git_default_name);
+		strbuf_addstr(&git_default_name, value);
 		user_ident_explicitly_given |= IDENT_NAME_GIVEN;
 		return 0;
 	}
@@ -405,7 +378,8 @@ int git_ident_config(const char *var, const char *value, void *data)
 	if (!strcmp(var, "user.email")) {
 		if (!value)
 			return config_error_nonbool(var);
-		strlcpy(git_default_email, value, sizeof(git_default_email));
+		strbuf_reset(&git_default_email);
+		strbuf_addstr(&git_default_email, value);
 		user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		return 0;
 	}
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 11/15] ident: report passwd errors with a more friendly message
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (9 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 10/15] drop length limitations on gecos-derived names and emails Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 12/15] ident: use full dns names to generate email addresses Jeff King
                                                     ` (3 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

When getpwuid fails, we give a cute but cryptic message.
While it makes sense if you know that getpwuid or identity
functions are being called, this code is triggered behind
the scenes by quite a few git commands these days (e.g.,
receive-pack on a remote server might use it for a reflog;
the current message is hard to distinguish from an
authentication error).  Let's switch to something that gives
a little more context.

While we're at it, we can factor out all of the
cut-and-pastes of the "you don't exist" message into a
wrapper function. Rather than provide xgetpwuid, let's make
it even more specific to just getting the passwd entry for
the current uid. That's the only way we use getpwuid anyway,
and it lets us make an even more specific error message.

The current message also fails to mention errno. While the
usual cause for getpwuid failing is that the user does not
exist, mentioning errno makes it easier to diagnose these
problems.  Note that POSIX specifies that errno remain
untouched if the passwd entry does not exist (but will be
set on actual errors), whereas some systems will return
ENOENT or similar for a missing entry. We handle both cases
in our wrapper.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-commit-tree.txt |  5 -----
 Documentation/git-var.txt         |  5 -----
 git-compat-util.h                 |  3 +++
 ident.c                           | 20 +++++---------------
 wrapper.c                         | 12 ++++++++++++
 5 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/Documentation/git-commit-tree.txt b/Documentation/git-commit-tree.txt
index eb12b2d..eb8ee99 100644
--- a/Documentation/git-commit-tree.txt
+++ b/Documentation/git-commit-tree.txt
@@ -88,11 +88,6 @@ for one to be entered and terminated with ^D.
 
 include::date-formats.txt[]
 
-Diagnostics
------------
-You don't exist. Go away!::
-    The passwd(5) gecos field couldn't be read
-
 Discussion
 ----------
 
diff --git a/Documentation/git-var.txt b/Documentation/git-var.txt
index 3f703e3..67edf58 100644
--- a/Documentation/git-var.txt
+++ b/Documentation/git-var.txt
@@ -59,11 +59,6 @@ ifdef::git-default-pager[]
     The build you are using chose '{git-default-pager}' as the default.
 endif::git-default-pager[]
 
-Diagnostics
------------
-You don't exist. Go away!::
-    The passwd(5) gecos field couldn't be read
-
 SEE ALSO
 --------
 linkgit:git-commit-tree[1]
diff --git a/git-compat-util.h b/git-compat-util.h
index ed11ad8..5bd9ad7 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -595,4 +595,7 @@ int rmdir_or_warn(const char *path);
  */
 int remove_or_warn(unsigned int mode, const char *path);
 
+/* Get the passwd entry for the UID of the current process. */
+struct passwd *xgetpwuid_self(void);
+
 #endif
diff --git a/ident.c b/ident.c
index 73a06a1..5aec073 100644
--- a/ident.c
+++ b/ident.c
@@ -100,12 +100,8 @@ static void copy_email(const struct passwd *pw, struct strbuf *email)
 
 const char *ident_default_name(void)
 {
-	if (!git_default_name.len) {
-		struct passwd *pw = getpwuid(getuid());
-		if (!pw)
-			die("You don't exist. Go away!");
-		copy_gecos(pw, &git_default_name);
-	}
+	if (!git_default_name.len)
+		copy_gecos(xgetpwuid_self(), &git_default_name);
 	return git_default_name.buf;
 }
 
@@ -117,12 +113,8 @@ const char *ident_default_email(void)
 		if (email && email[0]) {
 			strbuf_addstr(&git_default_email, email);
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
-		} else {
-			struct passwd *pw = getpwuid(getuid());
-			if (!pw)
-				die("You don't exist. Go away!");
-			copy_email(pw, &git_default_email);
-		}
+		} else
+			copy_email(xgetpwuid_self(), &git_default_email);
 	}
 	return git_default_email.buf;
 }
@@ -303,9 +295,7 @@ const char *fmt_ident(const char *name, const char *email,
 				fputs(env_hint, stderr);
 			die("empty ident %s <%s> not allowed", name, email);
 		}
-		pw = getpwuid(getuid());
-		if (!pw)
-			die("You don't exist. Go away!");
+		pw = xgetpwuid_self();
 		name = pw->pw_name;
 	}
 
diff --git a/wrapper.c b/wrapper.c
index 6ccd059..b5e33e4 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -402,3 +402,15 @@ int remove_or_warn(unsigned int mode, const char *file)
 {
 	return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
 }
+
+struct passwd *xgetpwuid_self(void)
+{
+	struct passwd *pw;
+
+	errno = 0;
+	pw = getpwuid(getuid());
+	if (!pw)
+		die(_("unable to look up current user in the passwd file: %s"),
+		    errno ? strerror(errno) : _("no such user"));
+	return pw;
+}
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 12/15] ident: use full dns names to generate email addresses
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (10 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 11/15] ident: report passwd errors with a more friendly message Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 13/15] ident: use a dynamic strbuf in fmt_ident Jeff King
                                                     ` (2 subsequent siblings)
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

When we construct an email address from the username and
hostname, we generate the host part of the email with this
procedure:

  1. add the result of gethostname

  2. if it has a dot, ok, it's fully qualified

  3. if not, then look up the unqualified hostname via
     gethostbyname; take the domain name of the result and
     append it to the hostname

Step 3 can actually produce a bogus result, as the name
returned by gethostbyname may not be related to the hostname
we fed it (e.g., consider a machine "foo" with names
"foo.one.example.com" and "bar.two.example.com"; we may have
the latter returned and generate the bogus name
"foo.two.example.com").

This patch simply uses the full hostname returned by
gethostbyname. In the common case that the first part is the
same as the unqualified hostname, the behavior is identical.
And in the case that it is not the same, we are much more
likely to be generating a valid name.

Signed-off-by: Jeff King <peff@peff.net>
---
 ident.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/ident.c b/ident.c
index 5aec073..b111e34 100644
--- a/ident.c
+++ b/ident.c
@@ -65,23 +65,18 @@ static void add_domainname(struct strbuf *out)
 {
 	char buf[1024];
 	struct hostent *he;
-	const char *domainname;
 
 	if (gethostname(buf, sizeof(buf))) {
 		warning("cannot get host name: %s", strerror(errno));
 		strbuf_addstr(out, "(none)");
 		return;
 	}
-	strbuf_addstr(out, buf);
 	if (strchr(buf, '.'))
-		return;
-
-	he = gethostbyname(buf);
-	strbuf_addch(out, '.');
-	if (he && (domainname = strchr(he->h_name, '.')))
-		strbuf_addstr(out, domainname + 1);
+		strbuf_addstr(out, buf);
+	else if ((he = gethostbyname(buf)) && strchr(he->h_name, '.'))
+		strbuf_addstr(out, he->h_name);
 	else
-		strbuf_addstr(out, "(none)");
+		strbuf_addf(out, "%s.(none)", buf);
 }
 
 static void copy_email(const struct passwd *pw, struct strbuf *email)
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 13/15] ident: use a dynamic strbuf in fmt_ident
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (11 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 12/15] ident: use full dns names to generate email addresses Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 14/15] ident: trim whitespace from default name/email Jeff King
  2012-05-21 23:10                                   ` [PATCHv2 15/15] format-patch: refactor get_patch_filename Jeff King
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

Now that we accept arbitrary-sized names and email
addresses, the only remaining limit is in the actual
formatting of the names into a buffer. The current limit is
1000 characters, which is not likely to be reached, but
using a strbuf is one less error condition we have to worry
about.

Signed-off-by: Jeff King <peff@peff.net>
---
 ident.c | 43 +++++++++++++++----------------------------
 1 file changed, 15 insertions(+), 28 deletions(-)

diff --git a/ident.c b/ident.c
index b111e34..cefb829 100644
--- a/ident.c
+++ b/ident.c
@@ -121,15 +121,6 @@ const char *ident_default_date(void)
 	return git_default_date;
 }
 
-static int add_raw(char *buf, size_t size, int offset, const char *str)
-{
-	size_t len = strlen(str);
-	if (offset + len > size)
-		return size;
-	memcpy(buf + offset, str, len);
-	return offset + len;
-}
-
 static int crud(unsigned char c)
 {
 	return  c <= 32  ||
@@ -148,7 +139,7 @@ static int crud(unsigned char c)
  * Copy over a string to the destination, but avoid special
  * characters ('\n', '<' and '>') and remove crud at the end
  */
-static int copy(char *buf, size_t size, int offset, const char *src)
+static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src)
 {
 	size_t i, len;
 	unsigned char c;
@@ -172,19 +163,19 @@ static int copy(char *buf, size_t size, int offset, const char *src)
 	/*
 	 * Copy the rest to the buffer, but avoid the special
 	 * characters '\n' '<' and '>' that act as delimiters on
-	 * an identification line
+	 * an identification line. We can only remove crud, never add it,
+	 * so 'len' is our maximum.
 	 */
+	strbuf_grow(sb, len);
 	for (i = 0; i < len; i++) {
 		c = *src++;
 		switch (c) {
 		case '\n': case '<': case '>':
 			continue;
 		}
-		if (offset >= size)
-			return size;
-		buf[offset++] = c;
+		sb->buf[sb->len++] = c;
 	}
-	return offset;
+	sb->buf[sb->len] = '\0';
 }
 
 /*
@@ -271,9 +262,8 @@ static const char *env_hint =
 const char *fmt_ident(const char *name, const char *email,
 		      const char *date_str, int flag)
 {
-	static char buffer[1000];
+	static struct strbuf ident = STRBUF_INIT;
 	char date[50];
-	int i;
 	int error_on_no_name = (flag & IDENT_ERROR_ON_NO_NAME);
 	int name_addr_only = (flag & IDENT_NO_DATE);
 
@@ -300,19 +290,16 @@ const char *fmt_ident(const char *name, const char *email,
 			die("invalid date format: %s", date_str);
 	}
 
-	i = copy(buffer, sizeof(buffer), 0, name);
-	i = add_raw(buffer, sizeof(buffer), i, " <");
-	i = copy(buffer, sizeof(buffer), i, email);
+	strbuf_reset(&ident);
+	strbuf_addstr_without_crud(&ident, name);
+	strbuf_addstr(&ident, " <");
+	strbuf_addstr_without_crud(&ident, email);
+	strbuf_addch(&ident, '>');
 	if (!name_addr_only) {
-		i = add_raw(buffer, sizeof(buffer), i,  "> ");
-		i = copy(buffer, sizeof(buffer), i, date);
-	} else {
-		i = add_raw(buffer, sizeof(buffer), i, ">");
+		strbuf_addch(&ident, ' ');
+		strbuf_addstr_without_crud(&ident, date);
 	}
-	if (i >= sizeof(buffer))
-		die("Impossibly long personal identifier");
-	buffer[i] = 0;
-	return buffer;
+	return ident.buf;
 }
 
 const char *fmt_name(const char *name, const char *email)
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 14/15] ident: trim whitespace from default name/email
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (12 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 13/15] ident: use a dynamic strbuf in fmt_ident Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  2012-05-22 16:55                                     ` Junio C Hamano
  2012-05-21 23:10                                   ` [PATCHv2 15/15] format-patch: refactor get_patch_filename Jeff King
  14 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

Usually these values get fed to fmt_ident, which will trim
any cruft anyway, but there are a few code paths which use
them directly. Let's clean them up for the benefit of those
callers. Furthermore, fmt_ident will look at the pre-trimmed
value and decide whether to invoke ERROR_ON_NO_NAME; this
check can be fooled by a name consisting only of spaces.

Note that we only bother to clean up when we are pulling the
information from gecos or from system files. Any other value
comes from a config file, where we will have cleaned up
accidental whitespace already.

Signed-off-by: Jeff King <peff@peff.net>
---
 ident.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/ident.c b/ident.c
index cefb829..e279039 100644
--- a/ident.c
+++ b/ident.c
@@ -95,8 +95,10 @@ static void copy_email(const struct passwd *pw, struct strbuf *email)
 
 const char *ident_default_name(void)
 {
-	if (!git_default_name.len)
+	if (!git_default_name.len) {
 		copy_gecos(xgetpwuid_self(), &git_default_name);
+		strbuf_trim(&git_default_name);
+	}
 	return git_default_name.buf;
 }
 
@@ -110,6 +112,7 @@ const char *ident_default_email(void)
 			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
 		} else
 			copy_email(xgetpwuid_self(), &git_default_email);
+		strbuf_trim(&git_default_email);
 	}
 	return git_default_email.buf;
 }
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCHv2 15/15] format-patch: refactor get_patch_filename
  2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
                                                     ` (13 preceding siblings ...)
  2012-05-21 23:10                                   ` [PATCHv2 14/15] ident: trim whitespace from default name/email Jeff King
@ 2012-05-21 23:10                                   ` Jeff King
  14 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2012-05-21 23:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

The get_patch_filename function expects a commit argument
and uses it to get the sanitized subject line when making a
patch filename. However, we also want to use this same
function for the cover letter, which does not have a commit
object. The current solution is to create a fake commit with
the subject "cover letter". Instead, let's make the
get_patch_filename interface more flexibile, and allow
passing a direct subject.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/log.c | 35 +++++++----------------------------
 log-tree.c    | 19 +++++++++++--------
 log-tree.h    |  4 ++--
 3 files changed, 20 insertions(+), 38 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 656bddf..8010a40 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -663,7 +663,8 @@ static FILE *realstdout = NULL;
 static const char *output_directory = NULL;
 static int outdir_offset;
 
-static int reopen_stdout(struct commit *commit, struct rev_info *rev, int quiet)
+static int reopen_stdout(struct commit *commit, const char *subject,
+			 struct rev_info *rev, int quiet)
 {
 	struct strbuf filename = STRBUF_INIT;
 	int suffix_len = strlen(fmt_patch_suffix) + 1;
@@ -677,7 +678,7 @@ static int reopen_stdout(struct commit *commit, struct rev_info *rev, int quiet)
 			strbuf_addch(&filename, '/');
 	}
 
-	get_patch_filename(commit, rev->nr, fmt_patch_suffix, &filename);
+	get_patch_filename(commit, subject, rev->nr, fmt_patch_suffix, &filename);
 
 	if (!quiet)
 		fprintf(realstdout, "%s\n", filename.buf + outdir_offset);
@@ -778,7 +779,6 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
 	const char *encoding = "UTF-8";
 	struct diff_options opts;
 	int need_8bit_cte = 0;
-	struct commit *commit = NULL;
 	struct pretty_print_context pp = {0};
 
 	if (rev->commit_format != CMIT_FMT_EMAIL)
@@ -786,31 +786,10 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
 
 	committer = git_committer_info(0);
 
-	if (!numbered_files) {
-		/*
-		 * We fake a commit for the cover letter so we get the filename
-		 * desired.
-		 */
-		commit = xcalloc(1, sizeof(*commit));
-		commit->buffer = xmalloc(400);
-		snprintf(commit->buffer, 400,
-			"tree 0000000000000000000000000000000000000000\n"
-			"parent %s\n"
-			"author %s\n"
-			"committer %s\n\n"
-			"cover letter\n",
-			sha1_to_hex(head->object.sha1), committer, committer);
-	}
-
-	if (!use_stdout && reopen_stdout(commit, rev, quiet))
+	if (!use_stdout &&
+	    reopen_stdout(NULL, numbered_files ? NULL : "cover-letter", rev, quiet))
 		return;
 
-	if (commit) {
-
-		free(commit->buffer);
-		free(commit);
-	}
-
 	log_write_email_headers(rev, head, &pp.subject, &pp.after_subject,
 				&need_8bit_cte);
 
@@ -1405,8 +1384,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 			gen_message_id(&rev, sha1_to_hex(commit->object.sha1));
 		}
 
-		if (!use_stdout && reopen_stdout(numbered_files ? NULL : commit,
-						 &rev, quiet))
+		if (!use_stdout &&
+		    reopen_stdout(numbered_files ? NULL : commit, NULL, &rev, quiet))
 			die(_("Failed to create output files"));
 		shown = log_tree_commit(&rev, commit);
 		free(commit->buffer);
diff --git a/log-tree.c b/log-tree.c
index 376d973..c894930 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -299,19 +299,22 @@ static unsigned int digits_in_number(unsigned int number)
 	return result;
 }
 
-void get_patch_filename(struct commit *commit, int nr, const char *suffix,
-			struct strbuf *buf)
+void get_patch_filename(struct commit *commit, const char *subject, int nr,
+			const char *suffix, struct strbuf *buf)
 {
 	int suffix_len = strlen(suffix) + 1;
 	int start_len = buf->len;
 
-	strbuf_addf(buf, commit ? "%04d-" : "%d", nr);
-	if (commit) {
+	strbuf_addf(buf, commit || subject ? "%04d-" : "%d", nr);
+	if (commit || subject) {
 		int max_len = start_len + FORMAT_PATCH_NAME_MAX - suffix_len;
 		struct pretty_print_context ctx = {0};
-		ctx.date_mode = DATE_NORMAL;
 
-		format_commit_message(commit, "%f", buf, &ctx);
+		if (subject)
+			strbuf_addstr(buf, subject);
+		else if (commit)
+			format_commit_message(commit, "%f", buf, &ctx);
+
 		if (max_len < buf->len)
 			strbuf_setlen(buf, max_len);
 		strbuf_addstr(buf, suffix);
@@ -384,8 +387,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 mime_boundary_leader, opt->mime_boundary);
 		extra_headers = subject_buffer;
 
-		get_patch_filename(opt->numbered_files ? NULL : commit, opt->nr,
-				    opt->patch_suffix, &filename);
+		get_patch_filename(opt->numbered_files ? NULL : commit, NULL,
+				   opt->nr, opt->patch_suffix, &filename);
 		snprintf(buffer, sizeof(buffer) - 1,
 			 "\n--%s%s\n"
 			 "Content-Type: text/x-patch;"
diff --git a/log-tree.h b/log-tree.h
index 5c4cf7c..f5ac238 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -21,7 +21,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 void load_ref_decorations(int flags);
 
 #define FORMAT_PATCH_NAME_MAX 64
-void get_patch_filename(struct commit *commit, int nr, const char *suffix,
-			struct strbuf *buf);
+void get_patch_filename(struct commit *commit, const char *subject, int nr,
+			const char *suffix, struct strbuf *buf);
 
 #endif
-- 
1.7.10.1.19.g711d603

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCHv2 14/15] ident: trim whitespace from default name/email
  2012-05-21 23:10                                   ` [PATCHv2 14/15] ident: trim whitespace from default name/email Jeff King
@ 2012-05-22 16:55                                     ` Junio C Hamano
  2012-05-22 17:12                                       ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2012-05-22 16:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> ... Any other value
> comes from a config file, where we will have cleaned up
> accidental whitespace already.

Are you referring to the behaviour of the config parser that removes
leading and trailing whitespaces when reading an unquoted string value?

If the user really wanted to have trailing whitespaces by quoting, we
would let it pass, in other words; it probably is a reasonable behaviour.

The same can be said about the environment variable GIT_COMMITTER_NAME and
friends, but "accidental whitespaces are cleaned up already" does not
apply to them.

So, isn't the real rationale behind this choice to allow users who give
leading or trailing whitespaces in the configuration and environment
variables on purpose use whatever value they specified?

I agree with the placement of trimming in this patch, but I do not quite
get (I do not mean "I do not agree with") what the quoted sentence wanted
to say.

Other than that single small "hrm...", the entire series was a pleasant
read.  Thanks.

>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  ident.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/ident.c b/ident.c
> index cefb829..e279039 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -95,8 +95,10 @@ static void copy_email(const struct passwd *pw, struct strbuf *email)
>  
>  const char *ident_default_name(void)
>  {
> -	if (!git_default_name.len)
> +	if (!git_default_name.len) {
>  		copy_gecos(xgetpwuid_self(), &git_default_name);
> +		strbuf_trim(&git_default_name);
> +	}
>  	return git_default_name.buf;
>  }
>  
> @@ -110,6 +112,7 @@ const char *ident_default_email(void)
>  			user_ident_explicitly_given |= IDENT_MAIL_GIVEN;
>  		} else
>  			copy_email(xgetpwuid_self(), &git_default_email);
> +		strbuf_trim(&git_default_email);
>  	}
>  	return git_default_email.buf;
>  }

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCHv2 14/15] ident: trim whitespace from default name/email
  2012-05-22 16:55                                     ` Junio C Hamano
@ 2012-05-22 17:12                                       ` Jeff King
  2012-05-22 17:21                                         ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2012-05-22 17:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Angus Hammond, git

On Tue, May 22, 2012 at 09:55:19AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > ... Any other value
> > comes from a config file, where we will have cleaned up
> > accidental whitespace already.
> 
> Are you referring to the behaviour of the config parser that removes
> leading and trailing whitespaces when reading an unquoted string value?

Yes, exactly.

> If the user really wanted to have trailing whitespaces by quoting, we
> would let it pass, in other words; it probably is a reasonable behaviour.

Right, that's why I said "accidental" above; you can still do it, but I
think you'd have to be really trying. But I really wonder if we should
be blocking that, too. fmt_ident will drop it anyway, so this is really
only for code paths that want to use the email directly.

> The same can be said about the environment variable GIT_COMMITTER_NAME and
> friends, but "accidental whitespaces are cleaned up already" does not
> apply to them.

Yes, but those variables don't hit this code path at all. They go
straight to fmt_ident, which does the cleanup.  If you have code like
this:

  email = getenv("GIT_COMMITTER_EMAIL");
  if (!email)
          email = ident_default_email();

you might have trim-able spaces in the getenv result, and nobody is
handling that (fmt_ident does, but accessing it directly does not).
We could go a step further and replace ident_default_email with a
git_committer_email() which does the above, plus any whitespace
trimming.

> So, isn't the real rationale behind this choice to allow users who give
> leading or trailing whitespaces in the configuration and environment
> variables on purpose use whatever value they specified?

To be honest, the real rationale was laziness and a desire to keep the
interfaces simple. The risky code path (in my opinion) for getting extra
whitespace is pulling data from gecos, or from a file like
/etc/mailname. IOW, my intent was to block common accidents, but not
worry about somebody intentionally trying to push whitespace through.

> I agree with the placement of trimming in this patch, but I do not quite
> get (I do not mean "I do not agree with") what the quoted sentence wanted
> to say.
> 
> Other than that single small "hrm...", the entire series was a pleasant
> read.  Thanks.

Thanks. I'm happy to update this patch if we want it to be more
paranoid, or I can update the commit message if it just needs explained
better.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCHv2 14/15] ident: trim whitespace from default name/email
  2012-05-22 17:12                                       ` Jeff King
@ 2012-05-22 17:21                                         ` Junio C Hamano
  0 siblings, 0 replies; 68+ messages in thread
From: Junio C Hamano @ 2012-05-22 17:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Angus Hammond, git

Jeff King <peff@peff.net> writes:

> On Tue, May 22, 2012 at 09:55:19AM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > ... Any other value
>> > comes from a config file, where we will have cleaned up
>> > accidental whitespace already.
>> 
>> Are you referring to the behaviour of the config parser that removes
>> leading and trailing whitespaces when reading an unquoted string value?
>
> Yes, exactly.
>
>> If the user really wanted to have trailing whitespaces by quoting, we
>> would let it pass, in other words; it probably is a reasonable behaviour.
>
> Right, that's why I said "accidental" above; you can still do it, but I
> think you'd have to be really trying. But I really wonder if we should
> be blocking that, too. fmt_ident will drop it anyway, so this is really
> only for code paths that want to use the email directly.

Thanks.

>> Other than that single small "hrm...", the entire series was a pleasant
>> read.  Thanks.
>
> Thanks. I'm happy to update this patch if we want it to be more
> paranoid, or I can update the commit message if it just needs explained
> better.

Neither is necessary. I was just puzzled a bit and was worried if I was
missing something.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [regression] Re: [PATCHv2 10/15] drop length limitations on gecos-derived names and emails
  2012-05-21 23:10                                   ` [PATCHv2 10/15] drop length limitations on gecos-derived names and emails Jeff King
@ 2013-01-24 23:21                                     ` Jonathan Nieder
  2013-01-25  1:05                                       ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Jonathan Nieder @ 2013-01-24 23:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Angus Hammond, git, Mihai Rusu

Hi Jeff,

Jeff King wrote:

> When we pull the user's name from the GECOS field of the
> passwd file (or generate an email address based on their
> username and hostname), we put the result into a
> static buffer.
[...]
> The conversion is mostly mechanical: replace string copies
> with strbuf equivalents, and access the strbuf.buf directly.
> There are a few exceptions:

This broke /etc/mailname handling.  Before:

	$ git var GIT_COMMITTER_IDENT
	Jonathan Nieder <jrn@mailname.example.com> 1359069165 -0800

After:

	$ git var GIT_COMMITTER_IDENT
	Jonathan Nieder <mailname.example.com> 1359069192 -0800

The cause:

[...]
> --- a/ident.c
> +++ b/ident.c
> @@ -19,42 +18,27 @@ int user_ident_explicitly_given;
[...]
> -static int add_mailname_host(char *buf, size_t len)
> +static int add_mailname_host(struct strbuf *buf)
>  {
>  	FILE *mailname;
>  
> @@ -65,7 +49,7 @@ static int add_mailname_host(char *buf, size_t len)
>  				strerror(errno));
>  		return -1;
>  	}
> -	if (!fgets(buf, len, mailname)) {
> +	if (strbuf_getline(buf, mailname, '\n') == EOF) {

This clears the strbuf.

How about something like this as a quick fix?

Reported-by: Mihai Rusu <dizzy@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

diff --git a/ident.c b/ident.c
index 73a06a1..cabd73f 100644
--- a/ident.c
+++ b/ident.c
@@ -41,6 +41,7 @@ static void copy_gecos(const struct passwd *w, struct strbuf *name)
 static int add_mailname_host(struct strbuf *buf)
 {
 	FILE *mailname;
+	struct strbuf mailnamebuf = STRBUF_INIT;
 
 	mailname = fopen("/etc/mailname", "r");
 	if (!mailname) {
@@ -49,14 +50,17 @@ static int add_mailname_host(struct strbuf *buf)
 				strerror(errno));
 		return -1;
 	}
-	if (strbuf_getline(buf, mailname, '\n') == EOF) {
+	if (strbuf_getline(&mailnamebuf, mailname, '\n') == EOF) {
 		if (ferror(mailname))
 			warning("cannot read /etc/mailname: %s",
 				strerror(errno));
+		strbuf_release(&mailnamebuf);
 		fclose(mailname);
 		return -1;
 	}
 	/* success! */
+	strbuf_addbuf(buf, &mailnamebuf);
+	strbuf_release(&mailnamebuf);
 	fclose(mailname);
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [regression] Re: [PATCHv2 10/15] drop length limitations on gecos-derived names and emails
  2013-01-24 23:21                                     ` [regression] " Jonathan Nieder
@ 2013-01-25  1:05                                       ` Jeff King
  2013-01-25 18:46                                         ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2013-01-25  1:05 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, Angus Hammond, git, Mihai Rusu

On Thu, Jan 24, 2013 at 03:21:46PM -0800, Jonathan Nieder wrote:

> This broke /etc/mailname handling.  Before:
> 
> 	$ git var GIT_COMMITTER_IDENT
> 	Jonathan Nieder <jrn@mailname.example.com> 1359069165 -0800
> 
> After:
> 
> 	$ git var GIT_COMMITTER_IDENT
> 	Jonathan Nieder <mailname.example.com> 1359069192 -0800

Ick. I wonder how that slipped through...I know I was testing with
/etc/mailname when developing the series, because I'm on a Debian
system. We do even check this code path in t7502 (if you have the
AUTOIDENT prereq), but of course we can't verify the actual value
automatically, because it could be anything. So I guess I just missed it
during my manual testing, and the automated testing is insufficient to
catch this particular breakage.

> > -	if (!fgets(buf, len, mailname)) {
> > +	if (strbuf_getline(buf, mailname, '\n') == EOF) {
> 
> This clears the strbuf.

Right. Definitely the problem.

> How about something like this as a quick fix?
> 
> Reported-by: Mihai Rusu <dizzy@google.com>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> 
> diff --git a/ident.c b/ident.c
> index 73a06a1..cabd73f 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -41,6 +41,7 @@ static void copy_gecos(const struct passwd *w, struct strbuf *name)
>  static int add_mailname_host(struct strbuf *buf)
>  {
>  	FILE *mailname;
> +	struct strbuf mailnamebuf = STRBUF_INIT;
>  
>  	mailname = fopen("/etc/mailname", "r");
>  	if (!mailname) {
> @@ -49,14 +50,17 @@ static int add_mailname_host(struct strbuf *buf)
>  				strerror(errno));
>  		return -1;
>  	}
> -	if (strbuf_getline(buf, mailname, '\n') == EOF) {
> +	if (strbuf_getline(&mailnamebuf, mailname, '\n') == EOF) {
>  		if (ferror(mailname))
>  			warning("cannot read /etc/mailname: %s",
>  				strerror(errno));
> +		strbuf_release(&mailnamebuf);
>  		fclose(mailname);
>  		return -1;
>  	}
>  	/* success! */
> +	strbuf_addbuf(buf, &mailnamebuf);
> +	strbuf_release(&mailnamebuf);
>  	fclose(mailname);
>  	return 0;
>  }

I think that is the only reasonable fix. Thanks for figuring it out.

We could expand the test in t7502 to check for "@" in the email, but it
feels weirdly specific to this bug. Either way,

Acked-by: Jeff King <peff@peff.net>

(with a proper commit message, of course).

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [regression] Re: [PATCHv2 10/15] drop length limitations on gecos-derived names and emails
  2013-01-25  1:05                                       ` Jeff King
@ 2013-01-25 18:46                                         ` Junio C Hamano
  2013-01-25 22:10                                           ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2013-01-25 18:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, Angus Hammond, git, Mihai Rusu

Jeff King <peff@peff.net> writes:

> (with a proper commit message, of course).

Will queue this one, to be merged to 'maint' and 'master'.

-- >8 --
From: Jonathan Nieder <jrnieder@gmail.com>
Date: Thu, 24 Jan 2013 15:21:46 -0800
Subject: [PATCH] ident: do not drop username when reading from /etc/mailname

An earlier conversion from fgets() to strbuf_getline() in the
codepath to read from /etc/mailname to learn the default host-part
of the ident e-mail address forgot that strbuf_getline() stores the
line at the beginning of the buffer just like fgets().

The "username@" the caller has prepared in the strbuf, expecting the
function to append the host-part to it, was lost because of this.

Reported-by: Mihai Rusu <dizzy@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 ident.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/ident.c b/ident.c
index 484e0a9..fb4cc72 100644
--- a/ident.c
+++ b/ident.c
@@ -41,6 +41,7 @@ static void copy_gecos(const struct passwd *w, struct strbuf *name)
 static int add_mailname_host(struct strbuf *buf)
 {
 	FILE *mailname;
+	struct strbuf mailnamebuf = STRBUF_INIT;
 
 	mailname = fopen("/etc/mailname", "r");
 	if (!mailname) {
@@ -49,14 +50,17 @@ static int add_mailname_host(struct strbuf *buf)
 				strerror(errno));
 		return -1;
 	}
-	if (strbuf_getline(buf, mailname, '\n') == EOF) {
+	if (strbuf_getline(&mailnamebuf, mailname, '\n') == EOF) {
 		if (ferror(mailname))
 			warning("cannot read /etc/mailname: %s",
 				strerror(errno));
+		strbuf_release(&mailnamebuf);
 		fclose(mailname);
 		return -1;
 	}
 	/* success! */
+	strbuf_addbuf(buf, &mailnamebuf);
+	strbuf_release(&mailnamebuf);
 	fclose(mailname);
 	return 0;
 }
-- 
1.8.1.1.525.gdace530

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [regression] Re: [PATCHv2 10/15] drop length limitations on gecos-derived names and emails
  2013-01-25 18:46                                         ` Junio C Hamano
@ 2013-01-25 22:10                                           ` Jeff King
  0 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2013-01-25 22:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Angus Hammond, git, Mihai Rusu

On Fri, Jan 25, 2013 at 10:46:48AM -0800, Junio C Hamano wrote:

> Will queue this one, to be merged to 'maint' and 'master'.
> 
> -- >8 --
> From: Jonathan Nieder <jrnieder@gmail.com>
> Date: Thu, 24 Jan 2013 15:21:46 -0800
> Subject: [PATCH] ident: do not drop username when reading from /etc/mailname

Thanks, looks fine to me (and thanks to Jonathan). One nit:

> -	if (strbuf_getline(buf, mailname, '\n') == EOF) {
> +	if (strbuf_getline(&mailnamebuf, mailname, '\n') == EOF) {
>  		if (ferror(mailname))
>  			warning("cannot read /etc/mailname: %s",
>  				strerror(errno));
> +		strbuf_release(&mailnamebuf);
>  		fclose(mailname);
>  		return -1;
>  	}

This strbuf_release is unnecessary, as an EOF return by definition means
we did not read anything. I don't mind it as a defensive measure,
though, in case the strbuf implementation changes to pre-allocate.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2013-01-25 22:10 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-10 19:06 [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Angus Hammond
2012-05-10 19:06 ` [PATCH 2/2] Remove diagnostics section from commit-tree and var man pages New error messages shouldn't need explaining like the old ones did so just delete the diagnostics section of the man pages. " Angus Hammond
2012-05-10 19:21   ` Angus Hammond
2012-05-10 19:23 ` [PATCH 1/2] Change error messages in ident.c Jeff King
2012-05-10 19:56   ` Jeff King
2012-05-11 22:53     ` Junio C Hamano
2012-05-11 23:13       ` Jeff King
2012-05-14 16:28         ` [PATCH 1/2] drop length limitations on gecos-derived names and emails Jeff King
2012-05-14 17:05           ` Jeff King
2012-05-14 21:02           ` Jeff King
2012-05-14 21:13             ` Jeff King
2012-05-15  1:54               ` Jeff King
2012-05-15  2:32                 ` Jeff King
2012-05-15 15:03                 ` Junio C Hamano
2012-05-15 17:47                   ` Jeff King
2012-05-15 18:10                     ` Junio C Hamano
2012-05-18 23:05                       ` [PATCH 0/13] ident cleanups and bugfixes Jeff King
2012-05-18 23:07                         ` [PATCH 01/13] ident: split setup_ident into separate functions Jeff King
2012-05-18 23:09                         ` [PATCH 02/13] http-push: do not access git_default_email directly Jeff King
2012-05-18 23:10                         ` [PATCH 03/13] fmt-merge-msg: don't use static buffer in record_person Jeff King
2012-05-18 23:11                         ` [PATCH 04/13] move identity config parsing to ident.c Jeff King
2012-05-18 23:11                         ` [PATCH 05/13] move git_default_* variables " Jeff King
2012-05-21  4:07                           ` Junio C Hamano
2012-05-21  5:41                             ` Jeff King
2012-05-21  6:41                               ` Jeff King
2012-05-18 23:13                         ` [PATCH 06/13] format-patch: use default email for generating message ids Jeff King
2012-05-21  2:58                           ` Junio C Hamano
2012-05-21  6:36                             ` Jeff King
2012-05-18 23:14                         ` [PATCH 07/13] fmt_ident: drop IDENT_WARN_ON_NO_NAME code Jeff King
2012-05-18 23:19                         ` [PATCH 08/13] ident: don't write fallback username into git_default_name Jeff King
2012-05-21  2:54                           ` Junio C Hamano
2012-05-21  6:31                             ` Jeff King
2012-05-21  9:11                               ` Junio C Hamano
2012-05-21 23:09                                 ` [PATCHv2 0/15] ident cleanups git_default_name Jeff King
2012-05-21 23:09                                   ` [PATCHv2 01/15] ident: split setup_ident into separate functions Jeff King
2012-05-21 23:09                                   ` [PATCHv2 02/15] http-push: do not access git_default_email directly Jeff King
2012-05-21 23:09                                   ` [PATCHv2 03/15] fmt-merge-msg: don't use static buffer in record_person Jeff King
2012-05-21 23:09                                   ` [PATCHv2 04/15] move identity config parsing to ident.c Jeff King
2012-05-21 23:09                                   ` [PATCHv2 05/15] move git_default_* variables " Jeff King
2012-05-21 23:10                                   ` [PATCHv2 06/15] ident: trim trailing newline from /etc/mailname Jeff King
2012-05-21 23:10                                   ` [PATCHv2 07/15] format-patch: use default email for generating message ids Jeff King
2012-05-21 23:10                                   ` [PATCHv2 08/15] fmt_ident: drop IDENT_WARN_ON_NO_NAME code Jeff King
2012-05-21 23:10                                   ` [PATCHv2 09/15] ident: don't write fallback username into git_default_name Jeff King
2012-05-21 23:10                                   ` [PATCHv2 10/15] drop length limitations on gecos-derived names and emails Jeff King
2013-01-24 23:21                                     ` [regression] " Jonathan Nieder
2013-01-25  1:05                                       ` Jeff King
2013-01-25 18:46                                         ` Junio C Hamano
2013-01-25 22:10                                           ` Jeff King
2012-05-21 23:10                                   ` [PATCHv2 11/15] ident: report passwd errors with a more friendly message Jeff King
2012-05-21 23:10                                   ` [PATCHv2 12/15] ident: use full dns names to generate email addresses Jeff King
2012-05-21 23:10                                   ` [PATCHv2 13/15] ident: use a dynamic strbuf in fmt_ident Jeff King
2012-05-21 23:10                                   ` [PATCHv2 14/15] ident: trim whitespace from default name/email Jeff King
2012-05-22 16:55                                     ` Junio C Hamano
2012-05-22 17:12                                       ` Jeff King
2012-05-22 17:21                                         ` Junio C Hamano
2012-05-21 23:10                                   ` [PATCHv2 15/15] format-patch: refactor get_patch_filename Jeff King
2012-05-18 23:20                         ` [PATCH 09/13] drop length limitations on gecos-derived names and emails Jeff King
2012-05-18 23:21                         ` [PATCH 10/13] ident: report passwd errors with a more friendly message Jeff King
2012-05-18 23:22                         ` [PATCH 11/13] ident: use full dns names to generate email addresses Jeff King
2012-05-18 23:23                         ` [PATCH 12/13] ident: use a dynamic strbuf in fmt_ident Jeff King
2012-05-18 23:24                         ` [PATCH 13/13] format-patch: refactor get_patch_filename Jeff King
2012-05-14 16:36         ` [PATCH 2/2] ident: report passwd errors with a more friendly message Jeff King
2012-05-10 20:04   ` [PATCH 1/2] Change error messages in ident.c Junio C Hamano
2012-05-10 20:22     ` Jeff King
2012-05-10 20:28       ` Junio C Hamano
2012-05-10 19:43 ` [PATCH 1/2] Change error messages in ident.c Make error messages caused by failed reads of the /etc/passwd file easier to understand. Signed-off-by: Angus Hammond <angusgh@gmail.com> Junio C Hamano
2012-05-10 19:57   ` Angus Hammond
2012-05-11 11:35 ` Nguyen Thai Ngoc Duy

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).