git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* several date related issues
@ 2015-06-25 11:19 H.Merijn Brand
  2015-06-25 12:44 ` Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: H.Merijn Brand @ 2015-06-25 11:19 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 2567 bytes --]

Running 2.4.4

*** Dates do not respect LC_TIME

$ date
Thu 25 Jun 2015 13:02:48 CEST
$ git log --pretty=fuller --date=local | head -6 | grep -i date
AuthorDate: Mon Feb 16 16:47:08 2015
CommitDate: Mon Feb 16 16:47:08 2015
$ locale -ck LC_TIME | grep fmt
d_t_fmt="%a %d %b %Y %r %Z"
d_fmt="%d-%m-%Y"
t_fmt="%r"
t_fmt_ampm="%H:%M:%S"
era_d_fmt=""
era_d_t_fmt=""
era_t_fmt=""
date_fmt="%a %e %b %Y %H:%M:%S %Z"

*** git log --date-order and --author-date-order do not order by date

$ mkdir git-test
$ cd git-test
$ git init
Initialized empty Git repository in /home/merijn/git-test/.git/
$ touch foo
$ git add foo
$ git commit -m boo
[master (root-commit) 09483e5] boo
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 foo
$ git log | cat
commit 09483e527d6a4d4a9e49f82a11871ab55133cf72
Author: H.Merijn Brand <h.m.brand@xs4all.nl>
Date:   Thu Jun 25 13:14:37 2015 +0200

    boo
$ touch bar
$ env GIT_AUTHOR_DATE="2015-01-15 12:13:14" GIT_COMMITTER_DATE="2015-01-15 12:13:14" git add bar
$ env GIT_AUTHOR_DATE="2015-01-15 12:13:14" GIT_COMMITTER_DATE="2015-01-15 12:13:14" git commit -m "Commit in past"
[master 6acc7f3] Commit in past
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 bar
$ git log | cat
commit 6acc7f3d2cbaca7176b63ffac51005ce07b90b91
Author: H.Merijn Brand <h.m.brand@xs4all.nl>
Date:   Thu Jan 15 12:13:14 2015 +0100

    Commit in past

commit 09483e527d6a4d4a9e49f82a11871ab55133cf72
Author: H.Merijn Brand <h.m.brand@xs4all.nl>
Date:   Thu Jun 25 13:14:37 2015 +0200

    boo
$ git log --date-order | cat
commit 6acc7f3d2cbaca7176b63ffac51005ce07b90b91
Author: H.Merijn Brand <h.m.brand@xs4all.nl>
Date:   Thu Jan 15 12:13:14 2015 +0100

    Commit in past

commit 09483e527d6a4d4a9e49f82a11871ab55133cf72
Author: H.Merijn Brand <h.m.brand@xs4all.nl>
Date:   Thu Jun 25 13:14:37 2015 +0200

    boo
$ git log --author-date-order | cat
commit 6acc7f3d2cbaca7176b63ffac51005ce07b90b91
Author: H.Merijn Brand <h.m.brand@xs4all.nl>
Date:   Thu Jan 15 12:13:14 2015 +0100

    Commit in past

commit 09483e527d6a4d4a9e49f82a11871ab55133cf72
Author: H.Merijn Brand <h.m.brand@xs4all.nl>
Date:   Thu Jun 25 13:14:37 2015 +0200

    boo
$


-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.21   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/        http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: several date related issues
  2015-06-25 11:19 several date related issues H.Merijn Brand
@ 2015-06-25 12:44 ` Jeff King
  2015-06-25 12:56   ` H.Merijn Brand
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2015-06-25 12:44 UTC (permalink / raw)
  To: H.Merijn Brand; +Cc: git

On Thu, Jun 25, 2015 at 01:19:01PM +0200, H.Merijn Brand wrote:

> *** Dates do not respect LC_TIME

Right, we use our own routines for formatting the dates, and not
strftime. And it probably should stay that way in general, as git's
output is often meant to be parsed.

That being said, I do not think it would be wrong to have a date-mode
that just showed strftime("%c"), or some other arbitrary strftime
string.  You could then set log.date as appropriate for human
consumption.

> *** git log --date-order and --author-date-order do not order by date

Correct. The documentation says:

   --date-order
     Show no parents before all of its children are shown, but otherwise
     show commits in the commit timestamp order.

In your example, one commit is the parent of the other, so it hits the
first part of the sentence, and the dates are never even compared.

There is not a simple way to show commits in arbitrary order without
respect to parentage. I think you'd have to do something like:

  git log --format='%at %H' |
  sort -rn |
  cut -d' ' -f2 |
  git log --stdin --no-walk

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: several date related issues
  2015-06-25 12:44 ` Jeff King
@ 2015-06-25 12:56   ` H.Merijn Brand
  2015-06-25 16:53     ` [PATCH 0/3] localized date format Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: H.Merijn Brand @ 2015-06-25 12:56 UTC (permalink / raw)
  To: Jeff King; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1756 bytes --]

On Thu, 25 Jun 2015 08:44:45 -0400, Jeff King <peff@peff.net> wrote:

> On Thu, Jun 25, 2015 at 01:19:01PM +0200, H.Merijn Brand wrote:
> 
> > *** Dates do not respect LC_TIME
> 
> Right, we use our own routines for formatting the dates, and not
> strftime. And it probably should stay that way in general, as git's
> output is often meant to be parsed.
> 
> That being said, I do not think it would be wrong to have a date-mode
> that just showed strftime("%c"), or some other arbitrary strftime
> string.  You could then set log.date as appropriate for human
> consumption.

Yes please :)

 --date=lc
 --date=lc_time
 --date=locale

all spring to mind as valid options

> > *** git log --date-order and --author-date-order do not order by date
> 
> Correct. The documentation says:
> 
>    --date-order
>      Show no parents before all of its children are shown, but otherwise
>      show commits in the commit timestamp order.
> 
> In your example, one commit is the parent of the other, so it hits the
> first part of the sentence, and the dates are never even compared.

That is what I gathered, and concluded that the option name is
misleading

> There is not a simple way to show commits in arbitrary order without
> respect to parentage. I think you'd have to do something like:
> 
>   git log --format='%at %H' |
>   sort -rn |
>   cut -d' ' -f2 |
>   git log --stdin --no-walk

I'd like that as gitk option!

-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.21   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/        http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 0/3] localized date format
  2015-06-25 12:56   ` H.Merijn Brand
@ 2015-06-25 16:53     ` Jeff King
  2015-06-25 16:54       ` [PATCH 1/3] show-branch: use DATE_RELATIVE instead of magic number Jeff King
                         ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Jeff King @ 2015-06-25 16:53 UTC (permalink / raw)
  To: H.Merijn Brand; +Cc: git

On Thu, Jun 25, 2015 at 02:56:59PM +0200, H.Merijn Brand wrote:

> > That being said, I do not think it would be wrong to have a date-mode
> > that just showed strftime("%c"), or some other arbitrary strftime
> > string.  You could then set log.date as appropriate for human
> > consumption.
> 
> Yes please :)
> 
>  --date=lc
>  --date=lc_time
>  --date=locale
> 
> all spring to mind as valid options

Here's a patch series that allows `--date=format:%c`, which is the same
thing. We could possibly do `--date=locale` as syntactic sugar, I guess.

  [1/3]: show-branch: use DATE_RELATIVE instead of magic number
  [2/3]: convert "enum date_mode" into a struct
  [3/3]: introduce "format" date-mode

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/3] show-branch: use DATE_RELATIVE instead of magic number
  2015-06-25 16:53     ` [PATCH 0/3] localized date format Jeff King
@ 2015-06-25 16:54       ` Jeff King
  2015-06-25 16:55       ` [PATCH 2/3] convert "enum date_mode" into a struct Jeff King
  2015-06-25 16:55       ` [PATCH 3/3] introduce "format" date-mode Jeff King
  2 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2015-06-25 16:54 UTC (permalink / raw)
  To: H.Merijn Brand; +Cc: git

This is more readable, and won't break if we ever change the
order of the date_mode enum.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/show-branch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 323f857..d784613 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -784,7 +784,7 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			else
 				msg++;
 			reflog_msg[i] = xstrfmt("(%s) %s",
-						show_date(timestamp, tz, 1),
+						show_date(timestamp, tz, DATE_RELATIVE),
 						msg);
 			free(logmsg);
 			sprintf(nth_desc, "%s@{%d}", *av, base+i);
-- 
2.4.4.742.g862750c

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/3] convert "enum date_mode" into a struct
  2015-06-25 16:53     ` [PATCH 0/3] localized date format Jeff King
  2015-06-25 16:54       ` [PATCH 1/3] show-branch: use DATE_RELATIVE instead of magic number Jeff King
@ 2015-06-25 16:55       ` Jeff King
  2015-06-25 17:03         ` John Keeping
  2015-07-07 20:37         ` Junio C Hamano
  2015-06-25 16:55       ` [PATCH 3/3] introduce "format" date-mode Jeff King
  2 siblings, 2 replies; 31+ messages in thread
From: Jeff King @ 2015-06-25 16:55 UTC (permalink / raw)
  To: H.Merijn Brand; +Cc: git

In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.

Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor.  However, the tricky case is where we use the
enum labels as constants, like:

  show_date(t, tz, DATE_NORMAL);

Ideally we could say:

  show_date(t, tz, &{ DATE_NORMAL });

but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:

  1. Manually add a "struct date_mode d = { DATE_NORMAL }"
     definition to each caller, and pass "&d". This makes
     the callers uglier, because they sometimes do not even
     have their own scope (e.g., they are inside a switch
     statement).

  2. Provide a pre-made global "date_normal" struct that can
     be passed by address. We'd also need "date_rfc2822",
     "date_iso8601", and so forth. But at least the ugliness
     is defined in one place.

  3. Provide a wrapper that generates the correct struct on
     the fly. The big downside is that we end up pointing to
     a single global, which makes our wrapper non-reentrant.
     But show_date is already not reentrant, so it does not
     matter.

This patch implements 3, along with a minor macro to keep
the size of the callers sane.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive.c               |  2 +-
 builtin/blame.c         | 10 +++++-----
 builtin/commit.c        |  4 ++--
 builtin/for-each-ref.c  |  6 +++---
 builtin/log.c           |  4 ++--
 builtin/shortlog.c      |  2 +-
 builtin/show-branch.c   |  3 ++-
 cache.h                 | 35 +++++++++++++++++++++++------------
 commit.h                |  2 +-
 date.c                  | 43 +++++++++++++++++++++++++------------------
 fast-import.c           |  2 +-
 http-backend.c          |  2 +-
 log-tree.c              |  2 +-
 pretty.c                | 29 +++++++++++++++--------------
 reflog-walk.c           |  4 ++--
 reflog-walk.h           |  4 ++--
 refs.c                  |  4 ++--
 revision.c              |  4 ++--
 revision.h              |  2 +-
 sha1_name.c             |  2 +-
 submodule.c             |  2 +-
 test-date.c             |  4 ++--
 test-revision-walking.c |  2 +-
 23 files changed, 97 insertions(+), 77 deletions(-)

diff --git a/archive.c b/archive.c
index d37c41d..30bda56 100644
--- a/archive.c
+++ b/archive.c
@@ -33,7 +33,7 @@ static void format_subst(const struct commit *commit,
 	char *to_free = NULL;
 	struct strbuf fmt = STRBUF_INIT;
 	struct pretty_print_context ctx = {0};
-	ctx.date_mode = DATE_NORMAL;
+	ctx.date_mode.type = DATE_NORMAL;
 	ctx.abbrev = DEFAULT_ABBREV;
 
 	if (src == buf->buf)
diff --git a/builtin/blame.c b/builtin/blame.c
index b3e948e..43e8473 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -50,7 +50,7 @@ static int xdl_opts;
 static int abbrev = -1;
 static int no_whole_file_rename;
 
-static enum date_mode blame_date_mode = DATE_ISO8601;
+static struct date_mode blame_date_mode = { DATE_ISO8601 };
 static size_t blame_date_width;
 
 static struct string_list mailmap;
@@ -1827,7 +1827,7 @@ static const char *format_time(unsigned long time, const char *tz_str,
 		size_t time_width;
 		int tz;
 		tz = atoi(tz_str);
-		time_str = show_date(time, tz, blame_date_mode);
+		time_str = show_date(time, tz, &blame_date_mode);
 		strbuf_addstr(&time_buf, time_str);
 		/*
 		 * Add space paddings to time_buf to display a fixed width
@@ -2179,7 +2179,7 @@ static int git_blame_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "blame.date")) {
 		if (!value)
 			return config_error_nonbool(var);
-		blame_date_mode = parse_date_format(value);
+		parse_date_format(value, &blame_date_mode);
 		return 0;
 	}
 
@@ -2561,13 +2561,13 @@ parse_done:
 
 	if (cmd_is_annotate) {
 		output_option |= OUTPUT_ANNOTATE_COMPAT;
-		blame_date_mode = DATE_ISO8601;
+		blame_date_mode.type = DATE_ISO8601;
 	} else {
 		blame_date_mode = revs.date_mode;
 	}
 
 	/* The maximum width used to show the dates */
-	switch (blame_date_mode) {
+	switch (blame_date_mode.type) {
 	case DATE_RFC2822:
 		blame_date_width = sizeof("Thu, 19 Oct 2006 16:00:04 -0700");
 		break;
diff --git a/builtin/commit.c b/builtin/commit.c
index 254477f..7314f33 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -856,7 +856,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 				_("%s"
 				"Date:      %s"),
 				ident_shown++ ? "" : "\n",
-				show_ident_date(&ai, DATE_NORMAL));
+				show_ident_date(&ai, DATE_MODE(NORMAL)));
 
 		if (!committer_ident_sufficiently_given())
 			status_printf_ln(s, GIT_COLOR_NORMAL,
@@ -1046,7 +1046,7 @@ static const char *find_author_by_nickname(const char *name)
 	commit = get_revision(&revs);
 	if (commit) {
 		struct pretty_print_context ctx = {0};
-		ctx.date_mode = DATE_NORMAL;
+		ctx.date_mode.type = DATE_NORMAL;
 		strbuf_release(&buf);
 		format_commit_message(commit, "%aN <%aE>", &buf, &ctx);
 		clear_mailmap(&mailmap);
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index f7e51a7..691daf1 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -389,7 +389,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 	char *zone;
 	unsigned long timestamp;
 	long tz;
-	enum date_mode date_mode = DATE_NORMAL;
+	struct date_mode date_mode = { DATE_NORMAL };
 	const char *formatp;
 
 	/*
@@ -401,7 +401,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 	formatp = strchr(atomname, ':');
 	if (formatp != NULL) {
 		formatp++;
-		date_mode = parse_date_format(formatp);
+		parse_date_format(formatp, &date_mode);
 	}
 
 	if (!eoemail)
@@ -412,7 +412,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 	tz = strtol(zone, NULL, 10);
 	if ((tz == LONG_MIN || tz == LONG_MAX) && errno == ERANGE)
 		goto bad;
-	v->s = xstrdup(show_date(timestamp, tz, date_mode));
+	v->s = xstrdup(show_date(timestamp, tz, &date_mode));
 	v->ul = timestamp;
 	return;
  bad:
diff --git a/builtin/log.c b/builtin/log.c
index e67671e..ce1581e 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -112,7 +112,7 @@ static void cmd_log_init_defaults(struct rev_info *rev)
 	DIFF_OPT_SET(&rev->diffopt, ALLOW_TEXTCONV);
 
 	if (default_date_mode)
-		rev->date_mode = parse_date_format(default_date_mode);
+		parse_date_format(default_date_mode, &rev->date_mode);
 	rev->diffopt.touched_flags = 0;
 }
 
@@ -939,7 +939,7 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout,
 
 	msg = body;
 	pp.fmt = CMIT_FMT_EMAIL;
-	pp.date_mode = DATE_RFC2822;
+	pp.date_mode.type = DATE_RFC2822;
 	pp_user_info(&pp, NULL, &sb, committer, encoding);
 	pp_title_line(&pp, &msg, &sb, encoding, need_8bit_cte);
 	pp_remainder(&pp, &msg, &sb, 0);
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index c0bab6a..007cc66 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -138,7 +138,7 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 		ctx.abbrev = log->abbrev;
 		ctx.subject = "";
 		ctx.after_subject = "";
-		ctx.date_mode = DATE_NORMAL;
+		ctx.date_mode.type = DATE_NORMAL;
 		ctx.output_encoding = get_log_output_encoding();
 		pretty_print_commit(&ctx, commit, &ufbuf);
 		buffer = ufbuf.buf;
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index d784613..c87c46e 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -784,7 +784,8 @@ int cmd_show_branch(int ac, const char **av, const char *prefix)
 			else
 				msg++;
 			reflog_msg[i] = xstrfmt("(%s) %s",
-						show_date(timestamp, tz, DATE_RELATIVE),
+						show_date(timestamp, tz,
+							  DATE_MODE(RELATIVE)),
 						msg);
 			free(logmsg);
 			sprintf(nth_desc, "%s@{%d}", *av, base+i);
diff --git a/cache.h b/cache.h
index 571c98f..1759011 100644
--- a/cache.h
+++ b/cache.h
@@ -1096,18 +1096,28 @@ extern void *read_object_with_reference(const unsigned char *sha1,
 extern struct object *peel_to_type(const char *name, int namelen,
 				   struct object *o, enum object_type);
 
-enum date_mode {
-	DATE_NORMAL = 0,
-	DATE_RELATIVE,
-	DATE_SHORT,
-	DATE_LOCAL,
-	DATE_ISO8601,
-	DATE_ISO8601_STRICT,
-	DATE_RFC2822,
-	DATE_RAW
+struct date_mode {
+	enum date_mode_type {
+		DATE_NORMAL = 0,
+		DATE_RELATIVE,
+		DATE_SHORT,
+		DATE_LOCAL,
+		DATE_ISO8601,
+		DATE_ISO8601_STRICT,
+		DATE_RFC2822,
+		DATE_RAW
+	} type;
 };
 
-const char *show_date(unsigned long time, int timezone, enum date_mode mode);
+/*
+ * Convenience helper for passing a constant type, like:
+ *
+ *   show_date(t, tz, DATE_MODE(NORMAL));
+ */
+#define DATE_MODE(t) date_mode_from_type(DATE_##t)
+struct date_mode *date_mode_from_type(enum date_mode_type type);
+
+const char *show_date(unsigned long time, int timezone, const struct date_mode *mode);
 void show_date_relative(unsigned long time, int tz, const struct timeval *now,
 			struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
@@ -1117,7 +1127,7 @@ void datestamp(struct strbuf *out);
 #define approxidate(s) approxidate_careful((s), NULL)
 unsigned long approxidate_careful(const char *, int *);
 unsigned long approxidate_relative(const char *date, const struct timeval *now);
-enum date_mode parse_date_format(const char *format);
+void parse_date_format(const char *format, struct date_mode *mode);
 int date_overflows(unsigned long date);
 
 #define IDENT_STRICT	       1
@@ -1154,7 +1164,8 @@ extern int split_ident_line(struct ident_split *, const char *, int);
  * the ident_split. It will also sanity-check the values and produce
  * a well-known sentinel date if they appear bogus.
  */
-const char *show_ident_date(const struct ident_split *id, enum date_mode mode);
+const char *show_ident_date(const struct ident_split *id,
+			    const struct date_mode *mode);
 
 /*
  * Compare split idents for equality or strict ordering. Note that we
diff --git a/commit.h b/commit.h
index 9a1fa96..1496882 100644
--- a/commit.h
+++ b/commit.h
@@ -145,7 +145,7 @@ struct pretty_print_context {
 	const char *subject;
 	const char *after_subject;
 	int preserve_subject;
-	enum date_mode date_mode;
+	struct date_mode date_mode;
 	unsigned date_mode_explicit:1;
 	int need_8bit_cte;
 	char *notes_message;
diff --git a/date.c b/date.c
index 733d1b2..cdad4db 100644
--- a/date.c
+++ b/date.c
@@ -160,18 +160,25 @@ void show_date_relative(unsigned long time, int tz,
 		 (diff + 183) / 365);
 }
 
-const char *show_date(unsigned long time, int tz, enum date_mode mode)
+struct date_mode *date_mode_from_type(enum date_mode_type type)
+{
+	static struct date_mode mode;
+	mode.type = type;
+	return &mode;
+}
+
+const char *show_date(unsigned long time, int tz, const struct date_mode *mode)
 {
 	struct tm *tm;
 	static struct strbuf timebuf = STRBUF_INIT;
 
-	if (mode == DATE_RAW) {
+	if (mode->type == DATE_RAW) {
 		strbuf_reset(&timebuf);
 		strbuf_addf(&timebuf, "%lu %+05d", time, tz);
 		return timebuf.buf;
 	}
 
-	if (mode == DATE_RELATIVE) {
+	if (mode->type == DATE_RELATIVE) {
 		struct timeval now;
 
 		strbuf_reset(&timebuf);
@@ -180,7 +187,7 @@ const char *show_date(unsigned long time, int tz, enum date_mode mode)
 		return timebuf.buf;
 	}
 
-	if (mode == DATE_LOCAL)
+	if (mode->type == DATE_LOCAL)
 		tz = local_tzoffset(time);
 
 	tm = time_to_tm(time, tz);
@@ -190,17 +197,17 @@ const char *show_date(unsigned long time, int tz, enum date_mode mode)
 	}
 
 	strbuf_reset(&timebuf);
-	if (mode == DATE_SHORT)
+	if (mode->type == DATE_SHORT)
 		strbuf_addf(&timebuf, "%04d-%02d-%02d", tm->tm_year + 1900,
 				tm->tm_mon + 1, tm->tm_mday);
-	else if (mode == DATE_ISO8601)
+	else if (mode->type == DATE_ISO8601)
 		strbuf_addf(&timebuf, "%04d-%02d-%02d %02d:%02d:%02d %+05d",
 				tm->tm_year + 1900,
 				tm->tm_mon + 1,
 				tm->tm_mday,
 				tm->tm_hour, tm->tm_min, tm->tm_sec,
 				tz);
-	else if (mode == DATE_ISO8601_STRICT) {
+	else if (mode->type == DATE_ISO8601_STRICT) {
 		char sign = (tz >= 0) ? '+' : '-';
 		tz = abs(tz);
 		strbuf_addf(&timebuf, "%04d-%02d-%02dT%02d:%02d:%02d%c%02d:%02d",
@@ -209,7 +216,7 @@ const char *show_date(unsigned long time, int tz, enum date_mode mode)
 				tm->tm_mday,
 				tm->tm_hour, tm->tm_min, tm->tm_sec,
 				sign, tz / 100, tz % 100);
-	} else if (mode == DATE_RFC2822)
+	} else if (mode->type == DATE_RFC2822)
 		strbuf_addf(&timebuf, "%.3s, %d %.3s %d %02d:%02d:%02d %+05d",
 			weekday_names[tm->tm_wday], tm->tm_mday,
 			month_names[tm->tm_mon], tm->tm_year + 1900,
@@ -221,7 +228,7 @@ const char *show_date(unsigned long time, int tz, enum date_mode mode)
 				tm->tm_mday,
 				tm->tm_hour, tm->tm_min, tm->tm_sec,
 				tm->tm_year + 1900,
-				(mode == DATE_LOCAL) ? 0 : ' ',
+				(mode->type == DATE_LOCAL) ? 0 : ' ',
 				tz);
 	return timebuf.buf;
 }
@@ -759,27 +766,27 @@ int parse_date(const char *date, struct strbuf *result)
 	return 0;
 }
 
-enum date_mode parse_date_format(const char *format)
+void parse_date_format(const char *format, struct date_mode *mode)
 {
 	if (!strcmp(format, "relative"))
-		return DATE_RELATIVE;
+		mode->type = DATE_RELATIVE;
 	else if (!strcmp(format, "iso8601") ||
 		 !strcmp(format, "iso"))
-		return DATE_ISO8601;
+		mode->type = DATE_ISO8601;
 	else if (!strcmp(format, "iso8601-strict") ||
 		 !strcmp(format, "iso-strict"))
-		return DATE_ISO8601_STRICT;
+		mode->type = DATE_ISO8601_STRICT;
 	else if (!strcmp(format, "rfc2822") ||
 		 !strcmp(format, "rfc"))
-		return DATE_RFC2822;
+		mode->type = DATE_RFC2822;
 	else if (!strcmp(format, "short"))
-		return DATE_SHORT;
+		mode->type = DATE_SHORT;
 	else if (!strcmp(format, "local"))
-		return DATE_LOCAL;
+		mode->type = DATE_LOCAL;
 	else if (!strcmp(format, "default"))
-		return DATE_NORMAL;
+		mode->type = DATE_NORMAL;
 	else if (!strcmp(format, "raw"))
-		return DATE_RAW;
+		mode->type = DATE_RAW;
 	else
 		die("unknown date format %s", format);
 }
diff --git a/fast-import.c b/fast-import.c
index 6378726..9363cc7 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -421,7 +421,7 @@ static void write_crash_report(const char *err)
 	fprintf(rpt, "fast-import crash report:\n");
 	fprintf(rpt, "    fast-import process: %"PRIuMAX"\n", (uintmax_t) getpid());
 	fprintf(rpt, "    parent process     : %"PRIuMAX"\n", (uintmax_t) getppid());
-	fprintf(rpt, "    at %s\n", show_date(time(NULL), 0, DATE_LOCAL));
+	fprintf(rpt, "    at %s\n", show_date(time(NULL), 0, DATE_MODE(LOCAL)));
 	fputc('\n', rpt);
 
 	fputs("fatal: ", rpt);
diff --git a/http-backend.c b/http-backend.c
index 501bf79..b977c00 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -92,7 +92,7 @@ static void hdr_int(const char *name, uintmax_t value)
 
 static void hdr_date(const char *name, unsigned long when)
 {
-	const char *value = show_date(when, 0, DATE_RFC2822);
+	const char *value = show_date(when, 0, DATE_MODE(RFC2822));
 	hdr_str(name, value);
 }
 
diff --git a/log-tree.c b/log-tree.c
index 01beb11..e57e6b6 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -639,7 +639,7 @@ void show_log(struct rev_info *opt)
 			 */
 			show_reflog_message(opt->reflog_info,
 					    opt->commit_format == CMIT_FMT_ONELINE,
-					    opt->date_mode,
+					    &opt->date_mode,
 					    opt->date_mode_explicit);
 			if (opt->commit_format == CMIT_FMT_ONELINE)
 				return;
diff --git a/pretty.c b/pretty.c
index 7b49304..151c2ae 100644
--- a/pretty.c
+++ b/pretty.c
@@ -399,7 +399,7 @@ static void add_rfc2047(struct strbuf *sb, const char *line, size_t len,
 }
 
 const char *show_ident_date(const struct ident_split *ident,
-			    enum date_mode mode)
+			    const struct date_mode *mode)
 {
 	unsigned long date = 0;
 	long tz = 0;
@@ -489,15 +489,15 @@ void pp_user_info(struct pretty_print_context *pp,
 	switch (pp->fmt) {
 	case CMIT_FMT_MEDIUM:
 		strbuf_addf(sb, "Date:   %s\n",
-			    show_ident_date(&ident, pp->date_mode));
+			    show_ident_date(&ident, &pp->date_mode));
 		break;
 	case CMIT_FMT_EMAIL:
 		strbuf_addf(sb, "Date: %s\n",
-			    show_ident_date(&ident, DATE_RFC2822));
+			    show_ident_date(&ident, DATE_MODE(RFC2822)));
 		break;
 	case CMIT_FMT_FULLER:
 		strbuf_addf(sb, "%sDate: %s\n", what,
-			    show_ident_date(&ident, pp->date_mode));
+			    show_ident_date(&ident, &pp->date_mode));
 		break;
 	default:
 		/* notin' */
@@ -671,7 +671,8 @@ static int mailmap_name(const char **email, size_t *email_len,
 }
 
 static size_t format_person_part(struct strbuf *sb, char part,
-				 const char *msg, int len, enum date_mode dmode)
+				 const char *msg, int len,
+				 const struct date_mode *dmode)
 {
 	/* currently all placeholders have same length */
 	const int placeholder_len = 2;
@@ -711,16 +712,16 @@ static size_t format_person_part(struct strbuf *sb, char part,
 		strbuf_addstr(sb, show_ident_date(&s, dmode));
 		return placeholder_len;
 	case 'D':	/* date, RFC2822 style */
-		strbuf_addstr(sb, show_ident_date(&s, DATE_RFC2822));
+		strbuf_addstr(sb, show_ident_date(&s, DATE_MODE(RFC2822)));
 		return placeholder_len;
 	case 'r':	/* date, relative */
-		strbuf_addstr(sb, show_ident_date(&s, DATE_RELATIVE));
+		strbuf_addstr(sb, show_ident_date(&s, DATE_MODE(RELATIVE)));
 		return placeholder_len;
 	case 'i':	/* date, ISO 8601-like */
-		strbuf_addstr(sb, show_ident_date(&s, DATE_ISO8601));
+		strbuf_addstr(sb, show_ident_date(&s, DATE_MODE(ISO8601)));
 		return placeholder_len;
 	case 'I':	/* date, ISO 8601 strict */
-		strbuf_addstr(sb, show_ident_date(&s, DATE_ISO8601_STRICT));
+		strbuf_addstr(sb, show_ident_date(&s, DATE_MODE(ISO8601_STRICT)));
 		return placeholder_len;
 	}
 
@@ -933,7 +934,7 @@ static void rewrap_message_tail(struct strbuf *sb,
 static int format_reflog_person(struct strbuf *sb,
 				char part,
 				struct reflog_walk_info *log,
-				enum date_mode dmode)
+				const struct date_mode *dmode)
 {
 	const char *ident;
 
@@ -1185,7 +1186,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 			if (c->pretty_ctx->reflog_info)
 				get_reflog_selector(sb,
 						    c->pretty_ctx->reflog_info,
-						    c->pretty_ctx->date_mode,
+						    &c->pretty_ctx->date_mode,
 						    c->pretty_ctx->date_mode_explicit,
 						    (placeholder[1] == 'd'));
 			return 2;
@@ -1200,7 +1201,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 			return format_reflog_person(sb,
 						    placeholder[1],
 						    c->pretty_ctx->reflog_info,
-						    c->pretty_ctx->date_mode);
+						    &c->pretty_ctx->date_mode);
 		}
 		return 0;	/* unknown %g placeholder */
 	case 'N':
@@ -1251,11 +1252,11 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 	case 'a':	/* author ... */
 		return format_person_part(sb, placeholder[1],
 				   msg + c->author.off, c->author.len,
-				   c->pretty_ctx->date_mode);
+				   &c->pretty_ctx->date_mode);
 	case 'c':	/* committer ... */
 		return format_person_part(sb, placeholder[1],
 				   msg + c->committer.off, c->committer.len,
-				   c->pretty_ctx->date_mode);
+				   &c->pretty_ctx->date_mode);
 	case 'e':	/* encoding */
 		if (c->commit_encoding)
 			strbuf_addstr(sb, c->commit_encoding);
diff --git a/reflog-walk.c b/reflog-walk.c
index 222de76..f8e743a 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -249,7 +249,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 
 void get_reflog_selector(struct strbuf *sb,
 			 struct reflog_walk_info *reflog_info,
-			 enum date_mode dmode, int force_date,
+			 const struct date_mode *dmode, int force_date,
 			 int shorten)
 {
 	struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
@@ -311,7 +311,7 @@ const char *get_reflog_ident(struct reflog_walk_info *reflog_info)
 }
 
 void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
-			 enum date_mode dmode, int force_date)
+			 const struct date_mode *dmode, int force_date)
 {
 	if (reflog_info && reflog_info->last_commit_reflog) {
 		struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
diff --git a/reflog-walk.h b/reflog-walk.h
index a9bd60e..27886f7 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -11,13 +11,13 @@ extern int add_reflog_for_walk(struct reflog_walk_info *info,
 extern void fake_reflog_parent(struct reflog_walk_info *info,
 		struct commit *commit);
 extern void show_reflog_message(struct reflog_walk_info *info, int,
-				enum date_mode, int force_date);
+				const struct date_mode *, int force_date);
 extern void get_reflog_message(struct strbuf *sb,
 		struct reflog_walk_info *reflog_info);
 extern const char *get_reflog_ident(struct reflog_walk_info *reflog_info);
 extern void get_reflog_selector(struct strbuf *sb,
 		struct reflog_walk_info *reflog_info,
-		enum date_mode dmode, int force_date,
+		const struct date_mode *dmode, int force_date,
 		int shorten);
 
 #endif
diff --git a/refs.c b/refs.c
index 26d1ac1..f9b828c 100644
--- a/refs.c
+++ b/refs.c
@@ -3356,14 +3356,14 @@ static int read_ref_at_ent(unsigned char *osha1, unsigned char *nsha1,
 			hashcpy(cb->sha1, nsha1);
 			if (hashcmp(cb->osha1, nsha1))
 				warning("Log for ref %s has gap after %s.",
-					cb->refname, show_date(cb->date, cb->tz, DATE_RFC2822));
+					cb->refname, show_date(cb->date, cb->tz, DATE_MODE(RFC2822)));
 		}
 		else if (cb->date == cb->at_time)
 			hashcpy(cb->sha1, nsha1);
 		else if (hashcmp(nsha1, cb->sha1))
 			warning("Log for ref %s unexpectedly ended on %s.",
 				cb->refname, show_date(cb->date, cb->tz,
-						   DATE_RFC2822));
+						       DATE_MODE(RFC2822)));
 		hashcpy(cb->osha1, osha1);
 		hashcpy(cb->nsha1, nsha1);
 		cb->found_it = 1;
diff --git a/revision.c b/revision.c
index 3ff8723..887e65e 100644
--- a/revision.c
+++ b/revision.c
@@ -1996,10 +1996,10 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--full-history")) {
 		revs->simplify_history = 0;
 	} else if (!strcmp(arg, "--relative-date")) {
-		revs->date_mode = DATE_RELATIVE;
+		revs->date_mode.type = DATE_RELATIVE;
 		revs->date_mode_explicit = 1;
 	} else if ((argcount = parse_long_opt("date", argv, &optarg))) {
-		revs->date_mode = parse_date_format(optarg);
+		parse_date_format(optarg, &revs->date_mode);
 		revs->date_mode_explicit = 1;
 		return argcount;
 	} else if (!strcmp(arg, "--log-size")) {
diff --git a/revision.h b/revision.h
index 0ea8b4e..5bc9686 100644
--- a/revision.h
+++ b/revision.h
@@ -146,7 +146,7 @@ struct rev_info {
 			track_first_time:1,
 			linear:1;
 
-	enum date_mode date_mode;
+	struct date_mode date_mode;
 
 	unsigned int	abbrev;
 	enum cmit_fmt	commit_format;
diff --git a/sha1_name.c b/sha1_name.c
index e57513e..da6874c 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -576,7 +576,7 @@ static int get_sha1_basic(const char *str, int len, unsigned char *sha1,
 				if (!(flags & GET_SHA1_QUIETLY)) {
 					warning("Log for '%.*s' only goes "
 						"back to %s.", len, str,
-						show_date(co_time, co_tz, DATE_RFC2822));
+						show_date(co_time, co_tz, DATE_MODE(RFC2822)));
 				}
 			} else {
 				if (flags & GET_SHA1_QUIETLY) {
diff --git a/submodule.c b/submodule.c
index 15e90d1..700bbf4 100644
--- a/submodule.c
+++ b/submodule.c
@@ -993,7 +993,7 @@ static void print_commit(struct commit *commit)
 {
 	struct strbuf sb = STRBUF_INIT;
 	struct pretty_print_context ctx = {0};
-	ctx.date_mode = DATE_NORMAL;
+	ctx.date_mode.type = DATE_NORMAL;
 	format_commit_message(commit, " %h: %m %s", &sb, &ctx);
 	fprintf(stderr, "%s\n", sb.buf);
 	strbuf_release(&sb);
diff --git a/test-date.c b/test-date.c
index 94a6997..63f3735 100644
--- a/test-date.c
+++ b/test-date.c
@@ -29,7 +29,7 @@ static void parse_dates(char **argv, struct timeval *now)
 		parse_date(*argv, &result);
 		if (sscanf(result.buf, "%lu %d", &t, &tz) == 2)
 			printf("%s -> %s\n",
-			       *argv, show_date(t, tz, DATE_ISO8601));
+			       *argv, show_date(t, tz, DATE_MODE(ISO8601)));
 		else
 			printf("%s -> bad\n", *argv);
 	}
@@ -41,7 +41,7 @@ static void parse_approxidate(char **argv, struct timeval *now)
 	for (; *argv; argv++) {
 		time_t t;
 		t = approxidate_relative(*argv, now);
-		printf("%s -> %s\n", *argv, show_date(t, 0, DATE_ISO8601));
+		printf("%s -> %s\n", *argv, show_date(t, 0, DATE_MODE(ISO8601)));
 	}
 }
 
diff --git a/test-revision-walking.c b/test-revision-walking.c
index 3ade02c..285f06b 100644
--- a/test-revision-walking.c
+++ b/test-revision-walking.c
@@ -17,7 +17,7 @@ static void print_commit(struct commit *commit)
 {
 	struct strbuf sb = STRBUF_INIT;
 	struct pretty_print_context ctx = {0};
-	ctx.date_mode = DATE_NORMAL;
+	ctx.date_mode.type = DATE_NORMAL;
 	format_commit_message(commit, " %m %s", &sb, &ctx);
 	printf("%s\n", sb.buf);
 	strbuf_release(&sb);
-- 
2.4.4.742.g862750c

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/3] introduce "format" date-mode
  2015-06-25 16:53     ` [PATCH 0/3] localized date format Jeff King
  2015-06-25 16:54       ` [PATCH 1/3] show-branch: use DATE_RELATIVE instead of magic number Jeff King
  2015-06-25 16:55       ` [PATCH 2/3] convert "enum date_mode" into a struct Jeff King
@ 2015-06-25 16:55       ` Jeff King
  2015-06-29 22:22         ` Eric Sunshine
  2 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2015-06-25 16:55 UTC (permalink / raw)
  To: H.Merijn Brand; +Cc: git

This feeds the format directly to strftime. Besides being a
little more flexible, the main advantage is that your system
strftime may know more about your locale's preferred format
(e.g., how to spell the days of the week).

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/rev-list-options.txt |  5 +++++
 builtin/blame.c                    |  3 +++
 cache.h                            |  2 ++
 date.c                             |  9 ++++++++-
 gettext.c                          |  1 +
 strbuf.c                           | 29 +++++++++++++++++++++++++++++
 strbuf.h                           |  5 +++++
 t/t6300-for-each-ref.sh            |  8 ++++++++
 8 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 77ac439..a9b808f 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -727,6 +727,11 @@ format, often found in email messages.
 +
 `--date=raw` shows the date in the internal raw Git format `%s %z` format.
 +
+`--date=format:...` feeds the format `...` to your system `strftime`.
+Use `--date=format:%c` to show the date in your system locale's
+preferred format.  See the `strftime` manual for a complete list of
+format placeholders.
++
 `--date=default` shows timestamps in the original time zone
 (either committer's or author's).
 
diff --git a/builtin/blame.c b/builtin/blame.c
index 43e8473..e2e3e75 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2596,6 +2596,9 @@ parse_done:
 	case DATE_NORMAL:
 		blame_date_width = sizeof("Thu Oct 19 16:00:04 2006 -0700");
 		break;
+	case DATE_STRFTIME:
+		blame_date_width = strlen(show_date(0, 0, &blame_date_mode)) + 1; /* add the null */
+		break;
 	}
 	blame_date_width -= 1; /* strip the null */
 
diff --git a/cache.h b/cache.h
index 1759011..bb63a58 100644
--- a/cache.h
+++ b/cache.h
@@ -1105,8 +1105,10 @@ struct date_mode {
 		DATE_ISO8601,
 		DATE_ISO8601_STRICT,
 		DATE_RFC2822,
+		DATE_STRFTIME,
 		DATE_RAW
 	} type;
+	const char *strftime_fmt;
 };
 
 /*
diff --git a/date.c b/date.c
index cdad4db..8f91569 100644
--- a/date.c
+++ b/date.c
@@ -163,6 +163,8 @@ void show_date_relative(unsigned long time, int tz,
 struct date_mode *date_mode_from_type(enum date_mode_type type)
 {
 	static struct date_mode mode;
+	if (type == DATE_STRFTIME)
+		die("BUG: cannot create anonymous strftime date_mode struct");
 	mode.type = type;
 	return &mode;
 }
@@ -221,6 +223,8 @@ const char *show_date(unsigned long time, int tz, const struct date_mode *mode)
 			weekday_names[tm->tm_wday], tm->tm_mday,
 			month_names[tm->tm_mon], tm->tm_year + 1900,
 			tm->tm_hour, tm->tm_min, tm->tm_sec, tz);
+	else if (mode->type == DATE_STRFTIME)
+		strbuf_addftime(&timebuf, mode->strftime_fmt, tm);
 	else
 		strbuf_addf(&timebuf, "%.3s %.3s %d %02d:%02d:%02d %d%c%+05d",
 				weekday_names[tm->tm_wday],
@@ -787,7 +791,10 @@ void parse_date_format(const char *format, struct date_mode *mode)
 		mode->type = DATE_NORMAL;
 	else if (!strcmp(format, "raw"))
 		mode->type = DATE_RAW;
-	else
+	else if (skip_prefix(format, "format:", &format)) {
+		mode->type = DATE_STRFTIME;
+		mode->strftime_fmt = xstrdup(format);
+	} else
 		die("unknown date format %s", format);
 }
 
diff --git a/gettext.c b/gettext.c
index 7378ba2..a268a2c 100644
--- a/gettext.c
+++ b/gettext.c
@@ -162,6 +162,7 @@ void git_setup_gettext(void)
 		podir = GIT_LOCALE_PATH;
 	bindtextdomain("git", podir);
 	setlocale(LC_MESSAGES, "");
+	setlocale(LC_TIME, "");
 	init_gettext_charset("git");
 	textdomain("git");
 }
diff --git a/strbuf.c b/strbuf.c
index 0d4f4e5..a7ba028 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -709,3 +709,32 @@ char *xstrfmt(const char *fmt, ...)
 
 	return ret;
 }
+
+void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
+{
+	size_t len;
+
+	/*
+	 * strftime reports "0" if it could not fit the result in the buffer.
+	 * Unfortunately, it also reports "0" if the requested time string
+	 * takes 0 bytes. So if we were to probe and grow, we have to choose
+	 * some arbitrary cap beyond which we guess that the format probably
+	 * just results in a 0-length output. Since we have to choose some
+	 * reasonable cap anyway, and since it is not that big, we may
+	 * as well just grow to their in the first place.
+	 */
+	strbuf_grow(sb, 128);
+	len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
+
+	if (!len) {
+		/*
+		 * Either we failed, or the format actually produces a 0-length
+		 * output. There's not much we can do, so we leave it blank.
+		 * However, the output array is left in an undefined state, so
+		 * we must re-assert our NUL terminator.
+		 */
+		sb->buf[sb->len] = '\0';
+	} else {
+		sb->len += len;
+	}
+}
diff --git a/strbuf.h b/strbuf.h
index 01c5c63..8c8f8d5 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -345,6 +345,11 @@ __attribute__((format (printf,2,0)))
 extern void strbuf_vaddf(struct strbuf *sb, const char *fmt, va_list ap);
 
 /**
+ * Add the time specified by `tm`, as formatted by `strftime`.
+ */
+extern void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm);
+
+/**
  * Read a given size of data from a FILE* pointer to the buffer.
  *
  * NOTE: The buffer is rewound if the read fails. If -1 is returned,
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 24fc2ba..c7f368c 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -227,6 +227,14 @@ test_expect_success 'Check format "rfc2822" date fields output' '
 	test_cmp expected actual
 '
 
+test_expect_success 'Check format of strftime date fields' '
+	echo "my date is 2006-07-03" >expected &&
+	git for-each-ref \
+	  --format="%(authordate:format:my date is %Y-%m-%d)" \
+	  refs/heads >actual &&
+	test_cmp expected actual
+'
+
 cat >expected <<\EOF
 refs/heads/master
 refs/remotes/origin/master
-- 
2.4.4.742.g862750c

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] convert "enum date_mode" into a struct
  2015-06-25 16:55       ` [PATCH 2/3] convert "enum date_mode" into a struct Jeff King
@ 2015-06-25 17:03         ` John Keeping
  2015-06-25 17:22           ` Jeff King
  2015-07-07 20:37         ` Junio C Hamano
  1 sibling, 1 reply; 31+ messages in thread
From: John Keeping @ 2015-06-25 17:03 UTC (permalink / raw)
  To: Jeff King; +Cc: H.Merijn Brand, git

On Thu, Jun 25, 2015 at 12:55:02PM -0400, Jeff King wrote:
> In preparation for adding date modes that may carry extra
> information beyond the mode itself, this patch converts the
> date_mode enum into a struct.
> 
> Most of the conversion is fairly straightforward; we pass
> the struct as a pointer and dereference the type field where
> necessary. Locations that declare a date_mode can use a "{}"
> constructor.  However, the tricky case is where we use the
> enum labels as constants, like:
> 
>   show_date(t, tz, DATE_NORMAL);
> 
> Ideally we could say:
> 
>   show_date(t, tz, &{ DATE_NORMAL });
> 
> but of course C does not allow that.

Yes it does, e.g. in 6.5.2.5 of C11, example 3 shows:

	drawline(&(struct point){.x=1, .y=1},
		&(struct point){.x=3, .y=4});

The cast is required, but if the argument is pointer-to-const you can
construct a temporary in the function call.

Of course, whether all of the compilers we target support it is a
different question.  If they do, perhaps something like:

#define SIMPLE_DATE(f)		&(struct date_mode) { DATE_NORMAL }

would allow the callers to remain reasonably sane.

>                                      Likewise, we cannot
> cast the constant to a struct, because we need to pass an
> actual address. Our options are basically:
> 
>   1. Manually add a "struct date_mode d = { DATE_NORMAL }"
>      definition to each caller, and pass "&d". This makes
>      the callers uglier, because they sometimes do not even
>      have their own scope (e.g., they are inside a switch
>      statement).
> 
>   2. Provide a pre-made global "date_normal" struct that can
>      be passed by address. We'd also need "date_rfc2822",
>      "date_iso8601", and so forth. But at least the ugliness
>      is defined in one place.
> 
>   3. Provide a wrapper that generates the correct struct on
>      the fly. The big downside is that we end up pointing to
>      a single global, which makes our wrapper non-reentrant.
>      But show_date is already not reentrant, so it does not
>      matter.
> 
> This patch implements 3, along with a minor macro to keep
> the size of the callers sane.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] convert "enum date_mode" into a struct
  2015-06-25 17:03         ` John Keeping
@ 2015-06-25 17:22           ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2015-06-25 17:22 UTC (permalink / raw)
  To: John Keeping; +Cc: H.Merijn Brand, git

On Thu, Jun 25, 2015 at 06:03:28PM +0100, John Keeping wrote:

> > Ideally we could say:
> > 
> >   show_date(t, tz, &{ DATE_NORMAL });
> > 
> > but of course C does not allow that.
> 
> Yes it does, e.g. in 6.5.2.5 of C11, example 3 shows:

Well, yes. But we generally restrict ourselves to C89 here, so we are
not even close.

> Of course, whether all of the compilers we target support it is a
> different question.  If they do, perhaps something like:
> 
> #define SIMPLE_DATE(f)		&(struct date_mode) { DATE_NORMAL }
> 
> would allow the callers to remain reasonably sane.

My patch already introduces DATE_MODE, so you could conditionally hide
it there, and fall back to date_mode_from_type when the compiler is too
old for this. But then, what is the advantage over the existing
solution? It's reentrant, but I don't think that is a problem here.

And in patch 3, you'll see that I add an extra assertion to
date_mode_from_type that this cannot support (to make sure that we do
not create a DATE_STRFTIME mode with no matching format string). The
syntax above would at least give us NULL for the string, which is better
than random garbage, but I think the assert is better still.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-25 16:55       ` [PATCH 3/3] introduce "format" date-mode Jeff King
@ 2015-06-29 22:22         ` Eric Sunshine
  2015-06-30 10:20           ` Jeff King
  2015-06-30 13:26           ` Jeff King
  0 siblings, 2 replies; 31+ messages in thread
From: Eric Sunshine @ 2015-06-29 22:22 UTC (permalink / raw)
  To: Jeff King; +Cc: H.Merijn Brand, git

On Thu, Jun 25, 2015 at 12:55:45PM -0400, Jeff King wrote:
> This feeds the format directly to strftime. Besides being a
> little more flexible, the main advantage is that your system
> strftime may know more about your locale's preferred format
> (e.g., how to spell the days of the week).
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/strbuf.c b/strbuf.c
> index 0d4f4e5..a7ba028 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -709,3 +709,32 @@ char *xstrfmt(const char *fmt, ...)
> +void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
> +{
> +	size_t len;
> +
> +	/*
> +	 * strftime reports "0" if it could not fit the result in the buffer.
> +	 * Unfortunately, it also reports "0" if the requested time string
> +	 * takes 0 bytes. So if we were to probe and grow, we have to choose
> +	 * some arbitrary cap beyond which we guess that the format probably
> +	 * just results in a 0-length output. Since we have to choose some
> +	 * reasonable cap anyway, and since it is not that big, we may
> +	 * as well just grow to their in the first place.
> +	 */
> +	strbuf_grow(sb, 128);
> +	len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
> +
> +	if (!len) {
> +		/*
> +		 * Either we failed, or the format actually produces a 0-length
> +		 * output. There's not much we can do, so we leave it blank.
> +		 * However, the output array is left in an undefined state, so
> +		 * we must re-assert our NUL terminator.
> +		 */
> +		sb->buf[sb->len] = '\0';
> +	} else {
> +		sb->len += len;
> +	}
> +}

Clients of strbuf rightly expect the buffer to grow as needed in
order to complete the requested operation. It is, therefore, both
weird and expectation-breaking for strbuf_addftime() to lack this
behavior. Worse, it doesn't even signal when the format has failed
due to insufficient buffer space.

How about taking this approach (or something similar), instead, which
grows the strbuf as needed?

--- 8< ---
void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
{
	size_t len;
	struct strbuf f = STRBUF_INIT;

	/*
	 * This is a bit tricky since strftime returns 0 if the result did not
	 * fit in the supplied buffer, as well as when the formatted time has
	 * zero length. In the former case, we need to grow the buffer and try
	 * again. To distinguish between the two cases, we supply strftime with
	 * a format string one character longer than what the client supplied,
	 * which ensures that a successful format will have non-zero length,
	 * and then drop the extra character from the formatted time before
	 * returning.
	 */
	strbuf_addf(&f, "%s ", fmt);

	do {
		strbuf_grow(sb, 128);
		len = strftime(sb->buf + sb->len, sb->alloc - sb->len,
			       f.buf, tm);
	} while (!len);
	strbuf_setlen(sb, sb->len + len - 1);

	strbuf_release(&f);
}
--- 8< ---

If this is performance critical code, then the augmented format
string can be constructed with less expensive functions than
strbuf_addf().

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-29 22:22         ` Eric Sunshine
@ 2015-06-30 10:20           ` Jeff King
  2015-06-30 16:22             ` Junio C Hamano
                               ` (2 more replies)
  2015-06-30 13:26           ` Jeff King
  1 sibling, 3 replies; 31+ messages in thread
From: Jeff King @ 2015-06-30 10:20 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: H.Merijn Brand, git

On Mon, Jun 29, 2015 at 06:22:47PM -0400, Eric Sunshine wrote:

> Clients of strbuf rightly expect the buffer to grow as needed in
> order to complete the requested operation. It is, therefore, both
> weird and expectation-breaking for strbuf_addftime() to lack this
> behavior. Worse, it doesn't even signal when the format has failed
> due to insufficient buffer space.

Agreed on all points.

> --- 8< ---
> void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
> {
> 	size_t len;
> 	struct strbuf f = STRBUF_INIT;
> 
> 	/*
> 	 * This is a bit tricky since strftime returns 0 if the result did not
> 	 * fit in the supplied buffer, as well as when the formatted time has
> 	 * zero length. In the former case, we need to grow the buffer and try
> 	 * again. To distinguish between the two cases, we supply strftime with
> 	 * a format string one character longer than what the client supplied,
> 	 * which ensures that a successful format will have non-zero length,
> 	 * and then drop the extra character from the formatted time before
> 	 * returning.
> 	 */
> 	strbuf_addf(&f, "%s ", fmt);

Basically I was trying to avoid making any assumptions about exactly how
strftime works. But presumably "stick a space in the format" is a
universally reasonable thing to do. It's a hack, but it's contained to
the function.

> 	do {
> 		strbuf_grow(sb, 128);
> 		len = strftime(sb->buf + sb->len, sb->alloc - sb->len,
> 			       f.buf, tm);
> 	} while (!len);

I think we need to keep growing this 128 ourselves, or else each loop
iteration will just say "yup, we have 128 bytes available; no need to
grow".


> [...]
> 
> If this is performance critical code, then the augmented format
> string can be constructed with less expensive functions than
> strbuf_addf().

This does get called a lot (e.g., once per commit). One extra allocation
would probably not kill us there, but I think we could fairly trivially
put this on the unlikely path:

  size_t hint = 128;
  size_t len;

  /* optimize out obvious 0-length case */
  if (!*fmt)
	return;

  strbuf_grow(sb, hint);
  len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);

  /* maybe not enough room, or maybe 0-length output */
  if (!len) {
	struct strbuf f = STRBUF_INIT;
	strbuf_addf(&f, "%s ", fmt);
	while (!len) {
		hint *= 2;
		strbuf_grow(sb, hint);
		len = strftime(sb->buf + sb->len, sb->alloc - sb->len, f.buf, tm);
	}
  }

I'd guess most cases will fit in 128 bytes and never even hit this code
path. You could also get fancier and start the buffer smaller, but only
do the fmt hack when we cross a threshold.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-29 22:22         ` Eric Sunshine
  2015-06-30 10:20           ` Jeff King
@ 2015-06-30 13:26           ` Jeff King
  2015-06-30 17:05             ` Eric Sunshine
  2015-07-21  0:41             ` Eric Sunshine
  1 sibling, 2 replies; 31+ messages in thread
From: Jeff King @ 2015-06-30 13:26 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, H.Merijn Brand, git

On Mon, Jun 29, 2015 at 06:22:47PM -0400, Eric Sunshine wrote:

> Clients of strbuf rightly expect the buffer to grow as needed in
> order to complete the requested operation. It is, therefore, both
> weird and expectation-breaking for strbuf_addftime() to lack this
> behavior. Worse, it doesn't even signal when the format has failed
> due to insufficient buffer space.
> 
> How about taking this approach (or something similar), instead, which
> grows the strbuf as needed?

Here's a patch, on top of jk/date-mode-format (I think it would also be
fine to just squash into the tip commit; the explanation in the commit
message is sufficiently mirrored in the code comment).

-- >8 --
Subject: [PATCH] strbuf: make strbuf_addftime more robust

The return value of strftime is poorly designed; when it
returns 0, the caller cannot tell if the buffer was not
large enough, or if the output was actually 0 bytes. In the
original implementation of strbuf_addftime, we simply punted
and guessed that our 128-byte hint would be large enough.

We can do better, though, if we're willing to treat strftime
like less of a black box. We can munge the incoming format
to make sure that it never produces 0-length output, and
then "fix" the resulting output.  That lets us reliably grow
the buffer based on strftime's return value.

Clever-idea-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 strbuf.c                | 38 +++++++++++++++++++++-----------------
 t/t6300-for-each-ref.sh | 10 ++++++++++
 2 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index a7ba028..e5e7370 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -712,29 +712,33 @@ char *xstrfmt(const char *fmt, ...)
 
 void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
 {
+	size_t hint = 128;
 	size_t len;
 
-	/*
-	 * strftime reports "0" if it could not fit the result in the buffer.
-	 * Unfortunately, it also reports "0" if the requested time string
-	 * takes 0 bytes. So if we were to probe and grow, we have to choose
-	 * some arbitrary cap beyond which we guess that the format probably
-	 * just results in a 0-length output. Since we have to choose some
-	 * reasonable cap anyway, and since it is not that big, we may
-	 * as well just grow to their in the first place.
-	 */
-	strbuf_grow(sb, 128);
+	if (!*fmt)
+		return;
+
+	strbuf_grow(sb, hint);
 	len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
 
 	if (!len) {
 		/*
-		 * Either we failed, or the format actually produces a 0-length
-		 * output. There's not much we can do, so we leave it blank.
-		 * However, the output array is left in an undefined state, so
-		 * we must re-assert our NUL terminator.
+		 * strftime reports "0" if it could not fit the result in the buffer.
+		 * Unfortunately, it also reports "0" if the requested time string
+		 * takes 0 bytes. So our strategy is to munge the format so that the
+		 * output contains at least one character, and then drop the extra
+		 * character before returning.
 		 */
-		sb->buf[sb->len] = '\0';
-	} else {
-		sb->len += len;
+		struct strbuf munged_fmt = STRBUF_INIT;
+		strbuf_addf(&munged_fmt, "%s ", fmt);
+		while (!len) {
+			hint *= 2;
+			strbuf_grow(sb, hint);
+			len = strftime(sb->buf + sb->len, sb->alloc - sb->len,
+				       munged_fmt.buf, tm);
+		}
+		strbuf_release(&munged_fmt);
+		len--; /* drop munged space */
 	}
+	strbuf_setlen(sb, sb->len + len);
 }
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index c7f368c..7c9bec7 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -235,6 +235,16 @@ test_expect_success 'Check format of strftime date fields' '
 	test_cmp expected actual
 '
 
+test_expect_success 'exercise strftime with odd fields' '
+	echo >expected &&
+	git for-each-ref --format="%(authordate:format:)" refs/heads >actual &&
+	test_cmp expected actual &&
+	long="long format -- $_z40$_z40$_z40$_z40$_z40$_z40$_z40" &&
+	echo $long >expected &&
+	git for-each-ref --format="%(authordate:format:$long)" refs/heads >actual &&
+	test_cmp expected actual
+'
+
 cat >expected <<\EOF
 refs/heads/master
 refs/remotes/origin/master
-- 
2.5.0.rc0.336.g8460790

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 10:20           ` Jeff King
@ 2015-06-30 16:22             ` Junio C Hamano
  2015-06-30 17:50               ` Jeff King
  2015-06-30 16:58             ` Eric Sunshine
  2015-06-30 17:05             ` Junio C Hamano
  2 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2015-06-30 16:22 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Sunshine, H.Merijn Brand, git

Jeff King <peff@peff.net> writes:

>> 	strbuf_addf(&f, "%s ", fmt);
>
> Basically I was trying to avoid making any assumptions about exactly how
> strftime works. But presumably "stick a space in the format" is a
> universally reasonable thing to do. It's a hack, but it's contained to
> the function.

Why can't I shake this feeling that (" %s", fmt), i.e. prepend not
append, is the safer thing to do than to append?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 10:20           ` Jeff King
  2015-06-30 16:22             ` Junio C Hamano
@ 2015-06-30 16:58             ` Eric Sunshine
  2015-06-30 17:58               ` Jeff King
  2015-06-30 17:05             ` Junio C Hamano
  2 siblings, 1 reply; 31+ messages in thread
From: Eric Sunshine @ 2015-06-30 16:58 UTC (permalink / raw)
  To: Jeff King; +Cc: H.Merijn Brand, Git List

On Tue, Jun 30, 2015 at 6:20 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Jun 29, 2015 at 06:22:47PM -0400, Eric Sunshine wrote:
>> void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
>> {
>>       size_t len;
>>       struct strbuf f = STRBUF_INIT;
>>
>>       /*
>>        * This is a bit tricky since strftime returns 0 if the result did not
>>        * fit in the supplied buffer, as well as when the formatted time has
>>        * zero length. In the former case, we need to grow the buffer and try
>>        * again. To distinguish between the two cases, we supply strftime with
>>        * a format string one character longer than what the client supplied,
>>        * which ensures that a successful format will have non-zero length,
>>        * and then drop the extra character from the formatted time before
>>        * returning.
>>        */
>>       strbuf_addf(&f, "%s ", fmt);
>
> Basically I was trying to avoid making any assumptions about exactly how
> strftime works. But presumably "stick a space in the format" is a
> universally reasonable thing to do. It's a hack, but it's contained to
> the function.

I don't think we're making any assumptions about strftime(). POSIX states:

    The format string consists of zero or more conversion
    specifications and ordinary characters. [...] All ordinary
    characters (including the terminating NUL character) are copied
    unchanged into the array.

So, we seem to be on solid footing with this approach (even though
it's a localized hack).

>>       do {
>>               strbuf_grow(sb, 128);
>>               len = strftime(sb->buf + sb->len, sb->alloc - sb->len,
>>                              f.buf, tm);
>>       } while (!len);
>
> I think we need to keep growing this 128 ourselves, or else each loop
> iteration will just say "yup, we have 128 bytes available; no need to
> grow".

Yeah, I toyed with the idea of increasing the requested amount each
iteration but wanted to keep the example simple, thus left it out.
However, for some reason, I was thinking that strbuf_grow() was
unconditionally expanding the buffer by the requested amount rather
than merely ensuring that that amount was availabile, so the amount
clearly needs to be increased on each iteration. Thanks for pointing
that out.

>> If this is performance critical code, then the augmented format
>> string can be constructed with less expensive functions than
>> strbuf_addf().
>
> This does get called a lot (e.g., once per commit). One extra allocation
> would probably not kill us there [...]

Beyond the extra allocation, I was also concerned about the
sledgehammer approach of "%s " to append a single character when there
are much less expensive ways to do so.

> [...] but I think we could fairly trivially put this on the unlikely path:
>
>   size_t hint = 128;
>   size_t len;
>
>   /* optimize out obvious 0-length case */
>   if (!*fmt)
>         return;
>
>   strbuf_grow(sb, hint);
>   len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
>
>   /* maybe not enough room, or maybe 0-length output */
>   if (!len) {
>         struct strbuf f = STRBUF_INIT;
>         strbuf_addf(&f, "%s ", fmt);
>         while (!len) {
>                 hint *= 2;
>                 strbuf_grow(sb, hint);
>                 len = strftime(sb->buf + sb->len, sb->alloc - sb->len, f.buf, tm);
>         }
>   }
>
> I'd guess most cases will fit in 128 bytes and never even hit this code
> path. You could also get fancier and start the buffer smaller, but only
> do the fmt hack when we cross a threshold.

Yep, I toyed with other looping constructs as well before settling
upon do-while in the example for its simplicity. If called often
enough, this sort of optimization seems reasonable enough.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 10:20           ` Jeff King
  2015-06-30 16:22             ` Junio C Hamano
  2015-06-30 16:58             ` Eric Sunshine
@ 2015-06-30 17:05             ` Junio C Hamano
  2015-06-30 17:48               ` Eric Sunshine
  2015-06-30 19:17               ` Jeff King
  2 siblings, 2 replies; 31+ messages in thread
From: Junio C Hamano @ 2015-06-30 17:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Sunshine, H.Merijn Brand, git

Jeff King <peff@peff.net> writes:

> This does get called a lot (e.g., once per commit). One extra allocation
> would probably not kill us there, but I think we could fairly trivially
> put this on the unlikely path:
>
>   size_t hint = 128;
>   size_t len;
>
>   /* optimize out obvious 0-length case */
>   if (!*fmt)
> 	return;
>
>   strbuf_grow(sb, hint);
>   len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
>
>   /* maybe not enough room, or maybe 0-length output */
>   if (!len) {
> 	struct strbuf f = STRBUF_INIT;
> 	strbuf_addf(&f, "%s ", fmt);
> 	while (!len) {
> 		hint *= 2;
> 		strbuf_grow(sb, hint);
> 		len = strftime(sb->buf + sb->len, sb->alloc - sb->len, f.buf, tm);
> 	}
>   }
>
> I'd guess most cases will fit in 128 bytes and never even hit this code
> path. You could also get fancier and start the buffer smaller, but only
> do the fmt hack when we cross a threshold.

I'd assume that the "hint" thing will persist across calls somehow?
In an invocation of "git log --date=format:<some format>" for
millions of commits, it is likely that the length of the formatted
date string will stay the same or close to the same (yeah, I know
"Wednesday" would be longer than "Monday").

Answering myself to my earlier question, the reason is because I was
worried what happens when given fmt is a malformed strftime format
specifier.  Perhaps it ends with a lone % and "% " may format to
something unexpected, or something.

Are we checking an error from strftime(3)?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 13:26           ` Jeff King
@ 2015-06-30 17:05             ` Eric Sunshine
  2015-07-21  0:41             ` Eric Sunshine
  1 sibling, 0 replies; 31+ messages in thread
From: Eric Sunshine @ 2015-06-30 17:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, H.Merijn Brand, Git List

On Tue, Jun 30, 2015 at 9:26 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Jun 29, 2015 at 06:22:47PM -0400, Eric Sunshine wrote:
>
>> Clients of strbuf rightly expect the buffer to grow as needed in
>> order to complete the requested operation. It is, therefore, both
>> weird and expectation-breaking for strbuf_addftime() to lack this
>> behavior. Worse, it doesn't even signal when the format has failed
>> due to insufficient buffer space.
>>
>> How about taking this approach (or something similar), instead, which
>> grows the strbuf as needed?
>
> Here's a patch, on top of jk/date-mode-format (I think it would also be
> fine to just squash into the tip commit; the explanation in the commit
> message is sufficiently mirrored in the code comment).

As a logical change in itself, I could also see introduction of
strbuf_addftime() split out into its own patch (with this patch
squashed in). Either way, it's a welcome improvement over the
original.

> -- >8 --
> Subject: [PATCH] strbuf: make strbuf_addftime more robust
>
> The return value of strftime is poorly designed; when it
> returns 0, the caller cannot tell if the buffer was not
> large enough, or if the output was actually 0 bytes. In the
> original implementation of strbuf_addftime, we simply punted
> and guessed that our 128-byte hint would be large enough.
>
> We can do better, though, if we're willing to treat strftime
> like less of a black box. We can munge the incoming format
> to make sure that it never produces 0-length output, and
> then "fix" the resulting output.  That lets us reliably grow
> the buffer based on strftime's return value.
>
> Clever-idea-by: Eric Sunshine <sunshine@sunshineco.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  strbuf.c                | 38 +++++++++++++++++++++-----------------
>  t/t6300-for-each-ref.sh | 10 ++++++++++
>  2 files changed, 31 insertions(+), 17 deletions(-)
>
> diff --git a/strbuf.c b/strbuf.c
> index a7ba028..e5e7370 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -712,29 +712,33 @@ char *xstrfmt(const char *fmt, ...)
>
>  void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
>  {
> +       size_t hint = 128;
>         size_t len;
>
> -       /*
> -        * strftime reports "0" if it could not fit the result in the buffer.
> -        * Unfortunately, it also reports "0" if the requested time string
> -        * takes 0 bytes. So if we were to probe and grow, we have to choose
> -        * some arbitrary cap beyond which we guess that the format probably
> -        * just results in a 0-length output. Since we have to choose some
> -        * reasonable cap anyway, and since it is not that big, we may
> -        * as well just grow to their in the first place.
> -        */
> -       strbuf_grow(sb, 128);
> +       if (!*fmt)
> +               return;
> +
> +       strbuf_grow(sb, hint);
>         len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
>
>         if (!len) {
>                 /*
> -                * Either we failed, or the format actually produces a 0-length
> -                * output. There's not much we can do, so we leave it blank.
> -                * However, the output array is left in an undefined state, so
> -                * we must re-assert our NUL terminator.
> +                * strftime reports "0" if it could not fit the result in the buffer.
> +                * Unfortunately, it also reports "0" if the requested time string
> +                * takes 0 bytes. So our strategy is to munge the format so that the
> +                * output contains at least one character, and then drop the extra
> +                * character before returning.
>                  */
> -               sb->buf[sb->len] = '\0';
> -       } else {
> -               sb->len += len;
> +               struct strbuf munged_fmt = STRBUF_INIT;
> +               strbuf_addf(&munged_fmt, "%s ", fmt);
> +               while (!len) {
> +                       hint *= 2;
> +                       strbuf_grow(sb, hint);
> +                       len = strftime(sb->buf + sb->len, sb->alloc - sb->len,
> +                                      munged_fmt.buf, tm);
> +               }
> +               strbuf_release(&munged_fmt);
> +               len--; /* drop munged space */
>         }
> +       strbuf_setlen(sb, sb->len + len);
>  }
> diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
> index c7f368c..7c9bec7 100755
> --- a/t/t6300-for-each-ref.sh
> +++ b/t/t6300-for-each-ref.sh
> @@ -235,6 +235,16 @@ test_expect_success 'Check format of strftime date fields' '
>         test_cmp expected actual
>  '
>
> +test_expect_success 'exercise strftime with odd fields' '
> +       echo >expected &&
> +       git for-each-ref --format="%(authordate:format:)" refs/heads >actual &&
> +       test_cmp expected actual &&
> +       long="long format -- $_z40$_z40$_z40$_z40$_z40$_z40$_z40" &&
> +       echo $long >expected &&
> +       git for-each-ref --format="%(authordate:format:$long)" refs/heads >actual &&
> +       test_cmp expected actual
> +'
> +
>  cat >expected <<\EOF
>  refs/heads/master
>  refs/remotes/origin/master
> --
> 2.5.0.rc0.336.g8460790

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 17:05             ` Junio C Hamano
@ 2015-06-30 17:48               ` Eric Sunshine
  2015-06-30 19:17               ` Jeff King
  1 sibling, 0 replies; 31+ messages in thread
From: Eric Sunshine @ 2015-06-30 17:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, H.Merijn Brand, Git List

On Tue, Jun 30, 2015 at 1:05 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Answering myself to my earlier question, the reason is because I was
> worried what happens when given fmt is a malformed strftime format
> specifier.  Perhaps it ends with a lone % and "% " may format to
> something unexpected, or something.

That's a good point. I had considered prepending the extra character
(space) rather than appending it but eventually rejected it to avoid
the expense of shifting the characters down by one before returning
the formatted string.

However, is it our responsibility to guard against a malformed format?
POSIX doesn't state the behavior of "% ", so I don't think we are any
worse off by appending space to a malformed format ending with "%"
since the malformed format could wreak havoc even without our
transformation.

> Are we checking an error from strftime(3)?

According to the "BUGS" section in POSIX:

    If the output string would exceed max bytes, errno is not set.
    This makes it impossible to distinguish this error case from
    cases where the format string legitimately produces a zero-length
    output string. POSIX.1-2001 does not specify any errno settings
    for strftime().

So, there does not seem to be a point in checking 'errno'.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 16:22             ` Junio C Hamano
@ 2015-06-30 17:50               ` Jeff King
  2015-06-30 19:23                 ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2015-06-30 17:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Sunshine, H.Merijn Brand, git

On Tue, Jun 30, 2015 at 09:22:18AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> 	strbuf_addf(&f, "%s ", fmt);
> >
> > Basically I was trying to avoid making any assumptions about exactly how
> > strftime works. But presumably "stick a space in the format" is a
> > universally reasonable thing to do. It's a hack, but it's contained to
> > the function.
> 
> Why can't I shake this feeling that (" %s", fmt), i.e. prepend not
> append, is the safer thing to do than to append?

Because then removing the extra space involves `memmove` of the buffer,
rather than just shortening the length by one.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 16:58             ` Eric Sunshine
@ 2015-06-30 17:58               ` Jeff King
  2015-06-30 18:13                 ` Eric Sunshine
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2015-06-30 17:58 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: H.Merijn Brand, Git List

On Tue, Jun 30, 2015 at 12:58:33PM -0400, Eric Sunshine wrote:

> > Basically I was trying to avoid making any assumptions about exactly how
> > strftime works. But presumably "stick a space in the format" is a
> > universally reasonable thing to do. It's a hack, but it's contained to
> > the function.
> 
> I don't think we're making any assumptions about strftime(). POSIX states:
> 
>     The format string consists of zero or more conversion
>     specifications and ordinary characters. [...] All ordinary
>     characters (including the terminating NUL character) are copied
>     unchanged into the array.
> 
> So, we seem to be on solid footing with this approach (even though
> it's a localized hack).

Yeah, sorry I wasn't more clear. I had originally been thinking of
making assumptions like "well, %c cannot ever be blank". But your
solution does not suffer from that level of knowledge. I think it is
reasonably clever.

> Yeah, I toyed with the idea of increasing the requested amount each
> iteration but wanted to keep the example simple, thus left it out.
> However, for some reason, I was thinking that strbuf_grow() was
> unconditionally expanding the buffer by the requested amount rather
> than merely ensuring that that amount was availabile, so the amount
> clearly needs to be increased on each iteration. Thanks for pointing
> that out.

FWIW, I had to look at it to double-check. I've often made the same
mistake.

> Beyond the extra allocation, I was also concerned about the
> sledgehammer approach of "%s " to append a single character when there
> are much less expensive ways to do so.

I don't think there's any other way. We have to feed a contiguous buffer
to strftime, and we don't own the buffer, so we have to make a new copy.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 17:58               ` Jeff King
@ 2015-06-30 18:13                 ` Eric Sunshine
  2015-06-30 19:22                   ` Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Sunshine @ 2015-06-30 18:13 UTC (permalink / raw)
  To: Jeff King; +Cc: H.Merijn Brand, Git List

On Tue, Jun 30, 2015 at 1:58 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Jun 30, 2015 at 12:58:33PM -0400, Eric Sunshine wrote:
>> Beyond the extra allocation, I was also concerned about the
>> sledgehammer approach of "%s " to append a single character when there
>> are much less expensive ways to do so.
>
> I don't think there's any other way. We have to feed a contiguous buffer
> to strftime, and we don't own the buffer, so we have to make a new copy.

Sorry, I meant that the interpolation expense of "%s ". A cheaper (but
more verbose) alternative might be:

    size_t n = strlen(fmt);
    const char *f = xmalloc(n + 2);
    strcpy(f, fmt);
    f[n] = ' ';
    f[n + 1] = '\0';
    ...
    free(f);

or something similar.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 17:05             ` Junio C Hamano
  2015-06-30 17:48               ` Eric Sunshine
@ 2015-06-30 19:17               ` Jeff King
  1 sibling, 0 replies; 31+ messages in thread
From: Jeff King @ 2015-06-30 19:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Sunshine, H.Merijn Brand, git

On Tue, Jun 30, 2015 at 10:05:33AM -0700, Junio C Hamano wrote:

> > I'd guess most cases will fit in 128 bytes and never even hit this code
> > path. You could also get fancier and start the buffer smaller, but only
> > do the fmt hack when we cross a threshold.
> 
> I'd assume that the "hint" thing will persist across calls somehow?
> In an invocation of "git log --date=format:<some format>" for
> millions of commits, it is likely that the length of the formatted
> date string will stay the same or close to the same (yeah, I know
> "Wednesday" would be longer than "Monday").

I hadn't thought about that. It could persist, but I don't think this is
necessarily the right place to do it. For two reasons:

  1. You have no idea in strbuf_addftime if it's the same fmt being
     added over and over. This is the wrong place to make that
     optimization.

  2. If you are interested in efficiency in a loop, then you should be
     reusing the same strbuf over and over, and avoiding the extra
     allocation in the first place. And that is indeed what we do for
     "git log --date", as we will always use the same static-local
     buffer in show_date().

> Answering myself to my earlier question, the reason is because I was
> worried what happens when given fmt is a malformed strftime format
> specifier.  Perhaps it ends with a lone % and "% " may format to
> something unexpected, or something.
> 
> Are we checking an error from strftime(3)?

POSIX doesn't define any errno values for strftime (and in fact says "No
errors are defined"). The return value description for POSIX (and the
glibc manpage) talk about only whether or not the output fits. However,
POSIX does say "If a conversion specifier is not one of the above, the
behavior is undefined".

So certainly I could imagine an implementation that returns "0" when you
feed it a bogus value. If you (as a user) feed us crap to give to
strftime, I am not particularly concerned with whether you get crap out.
My main concern is that it would return "0" and we would loop forever.
OTOH, I think any sane implementation would simply copy unknown
placeholders out (certainly glibc does that). So I think we could simply
consider it a quality of implementation issue, and deal with any
particular crappy implementations if and when they get reported. We
could add something tricky (like "--date=format:%") to the test suite to
make it likelier to catch such a thing.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 18:13                 ` Eric Sunshine
@ 2015-06-30 19:22                   ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2015-06-30 19:22 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: H.Merijn Brand, Git List

On Tue, Jun 30, 2015 at 02:13:53PM -0400, Eric Sunshine wrote:

> Sorry, I meant that the interpolation expense of "%s ". A cheaper (but
> more verbose) alternative might be:
> 
>     size_t n = strlen(fmt);
>     const char *f = xmalloc(n + 2);
>     strcpy(f, fmt);
>     f[n] = ' ';
>     f[n + 1] = '\0';
>     ...
>     free(f);
> 
> or something similar.

I think you're probably getting into premature optimization here. Using
strbuf_vaddf should be within the same order of magnitude of
instructions (and I think we should almost never hit this code path
anyway, because we'll be reading into a static strbuf generally which
will come pre-sized to hold a date).

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 17:50               ` Jeff King
@ 2015-06-30 19:23                 ` Junio C Hamano
  2015-06-30 19:33                   ` Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2015-06-30 19:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Sunshine, H.Merijn Brand, git

Jeff King <peff@peff.net> writes:

> On Tue, Jun 30, 2015 at 09:22:18AM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> >> 	strbuf_addf(&f, "%s ", fmt);
>> >
>> > Basically I was trying to avoid making any assumptions about exactly how
>> > strftime works. But presumably "stick a space in the format" is a
>> > universally reasonable thing to do. It's a hack, but it's contained to
>> > the function.
>> 
>> Why can't I shake this feeling that (" %s", fmt), i.e. prepend not
>> append, is the safer thing to do than to append?
>
> Because then removing the extra space involves `memmove` of the buffer,
> rather than just shortening the length by one.

That does not explain why I feel the other way is safer, though ;-)

In any case, responding myself to my other question, strftime(3)
does not define any useful error information even for a case where
you feed nonsense format to it, so it is even not possible to
protect ourselves by checking errno or doing something similar X-<.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 19:23                 ` Junio C Hamano
@ 2015-06-30 19:33                   ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2015-06-30 19:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Sunshine, H.Merijn Brand, git

On Tue, Jun 30, 2015 at 12:23:08PM -0700, Junio C Hamano wrote:

> >> Why can't I shake this feeling that (" %s", fmt), i.e. prepend not
> >> append, is the safer thing to do than to append?
> >
> > Because then removing the extra space involves `memmove` of the buffer,
> > rather than just shortening the length by one.
> 
> That does not explain why I feel the other way is safer, though ;-)

Sorry, I think I read it as "saner". ;)

> In any case, responding myself to my other question, strftime(3)
> does not define any useful error information even for a case where
> you feed nonsense format to it, so it is even not possible to
> protect ourselves by checking errno or doing something similar X-<.

I do think " %" is probably safer than "% " in that it is less likely to
cause weird output from strftime (e.g., I could see an implementation
treat "%X", where "X" is unknown, as just "X", but treat a trailing "%"
as a raw "%"). But I doubt there is an implementation that would return
"0" for one but not the other (anything's possible of course, but we are
deep in the realm of guessing how systems might implement the
standard, so it's really our best guess until somebody can present a
particular data point; I've looked only at glibc and it is sane).

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] convert "enum date_mode" into a struct
  2015-06-25 16:55       ` [PATCH 2/3] convert "enum date_mode" into a struct Jeff King
  2015-06-25 17:03         ` John Keeping
@ 2015-07-07 20:37         ` Junio C Hamano
  2015-07-07 20:48           ` Jeff King
  1 sibling, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2015-07-07 20:37 UTC (permalink / raw)
  To: Jeff King; +Cc: H.Merijn Brand, git

Jeff King <peff@peff.net> writes:

> ...  However, the tricky case is where we use the
> enum labels as constants, like:
>
>   show_date(t, tz, DATE_NORMAL);
>
> Ideally we could say:
>
>   show_date(t, tz, &{ DATE_NORMAL });
>
> but of course C does not allow that.
> ...
>   3. Provide a wrapper that generates the correct struct on
>      the fly. The big downside is that we end up pointing to
>      a single global, which makes our wrapper non-reentrant.
>      But show_date is already not reentrant, so it does not
>      matter.
>
> This patch implements 3, along with a minor macro to keep
> the size of the callers sane.

Another big downside is that DATE_NORMAL is defined to be "0".

This makes it very cumbersome to merge a side branch that uses an
outdated definition of show_date() and its friends and tell them
to show date normally.  The compiler does not help detecting
places that need to be adjusted during merge and instead just pass
a NULL pointer as a pointer to the new struct.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] convert "enum date_mode" into a struct
  2015-07-07 20:37         ` Junio C Hamano
@ 2015-07-07 20:48           ` Jeff King
  2015-07-07 21:05             ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2015-07-07 20:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: H.Merijn Brand, git

On Tue, Jul 07, 2015 at 01:37:08PM -0700, Junio C Hamano wrote:

> >   3. Provide a wrapper that generates the correct struct on
> >      the fly. The big downside is that we end up pointing to
> >      a single global, which makes our wrapper non-reentrant.
> >      But show_date is already not reentrant, so it does not
> >      matter.
> >
> > This patch implements 3, along with a minor macro to keep
> > the size of the callers sane.
> 
> Another big downside is that DATE_NORMAL is defined to be "0".
> 
> This makes it very cumbersome to merge a side branch that uses an
> outdated definition of show_date() and its friends and tell them
> to show date normally.  The compiler does not help detecting
> places that need to be adjusted during merge and instead just pass
> a NULL pointer as a pointer to the new struct.

My assumption was that using the raw "0" is something we would frowned
upon in new code. There was a single historical instance that I fixed in
the series, but I wouldn't expect new ones (and actually, that instance
was "1", which would be caught by the compiler).

However, if you're concerned, I think we could have show_date massage a
NULL date, like:

diff --git a/date.c b/date.c
index 8f91569..a04d089 100644
--- a/date.c
+++ b/date.c
@@ -173,6 +173,10 @@ const char *show_date(unsigned long time, int tz, const struct date_mode *mode)
 {
 	struct tm *tm;
 	static struct strbuf timebuf = STRBUF_INIT;
+	static const struct fallback_mode = { DATE_NORMAL };
+
+	if (!mode)
+		mode = &fallback_mode;
 
 	if (mode->type == DATE_RAW) {
 		strbuf_reset(&timebuf);


That would also allow people to explicitly call:

  show_date(t, tz, NULL);

to get the default format, though I personally prefer spelling it out.

I guess we _could_ introduce:

  #define DATE_MODE(x) ((struct date_mode *)(x))

and then take any numeric value, under the assumption that the first
page of memory will never be a valid pointer:

diff --git a/date.c b/date.c
index 8f91569..f388fee 100644
--- a/date.c
+++ b/date.c
@@ -173,6 +173,18 @@ const char *show_date(unsigned long time, int tz, const struct date_mode *mode)
 {
 	struct tm *tm;
 	static struct strbuf timebuf = STRBUF_INIT;
+	struct date_mode fallback;
+
+	/* hysterical compatibility */
+	if (mode < 1024) {
+		if (mode == DATE_STRFTIME)
+			die("BUG: nice try");
+		fallback.type = mode;
+		mode = &fallback;
+	}
+
+	if (!mode)
+		mode = &fallback_mode;
 
 	if (mode->type == DATE_RAW) {
 		strbuf_reset(&timebuf);

That's kind of nasty, but at least it's hidden from the callers.

-Peff

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] convert "enum date_mode" into a struct
  2015-07-07 20:48           ` Jeff King
@ 2015-07-07 21:05             ` Junio C Hamano
  2015-07-07 21:13               ` Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2015-07-07 21:05 UTC (permalink / raw)
  To: Jeff King; +Cc: H.Merijn Brand, git

Jeff King <peff@peff.net> writes:

> My assumption was that using the raw "0" is something we would frowned
> upon in new code. There was a single historical instance that I fixed in
> the series, but I wouldn't expect new ones (and actually, that instance
> was "1", which would be caught by the compiler).

That is not the problem.

The code on the side branch may add a new callsite, something like
this:

	show_ident_date(&ident_split, DATE_NORMAL);

based on the current codebase (e.g. 'master' as of today).

The merge goes cleanly, it compiles, even though the new function
signature of show_ident_date(), similar to the updated show_date(),
takes a pointer to a struct where they used to take DATE_$format
constants.

And that is because DATE_NORMAL is defined to be 0; we can claim
that the compiler is being stupid to take one of the enum
date_mode_type values that happens to be 0 and misinterpret it as
the program wanted to pass a NULL pointer to a structure, but that
is not what happened.

> However, if you're concerned, I think we could have show_date massage a
> NULL date, like:
>
> diff --git a/date.c b/date.c
> index 8f91569..a04d089 100644
> --- a/date.c
> +++ b/date.c
> @@ -173,6 +173,10 @@ const char *show_date(unsigned long time, int tz, const struct date_mode *mode)
>  {
>  	struct tm *tm;
>  	static struct strbuf timebuf = STRBUF_INIT;
> +	static const struct fallback_mode = { DATE_NORMAL };

Yes, that is nasty.  Renumbering the enum to begin with 1 may be a
much saner solution, unless somebody does

	if (!mode->type)
        	/* we know DATE_NORMAL is zero, he he */
	        do the normal thing;

In any case, I did another evil merge to fix it.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] convert "enum date_mode" into a struct
  2015-07-07 21:05             ` Junio C Hamano
@ 2015-07-07 21:13               ` Jeff King
  2015-07-07 21:19                 ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2015-07-07 21:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: H.Merijn Brand, git

On Tue, Jul 07, 2015 at 02:05:52PM -0700, Junio C Hamano wrote:

> And that is because DATE_NORMAL is defined to be 0; we can claim
> that the compiler is being stupid to take one of the enum
> date_mode_type values that happens to be 0 and misinterpret it as
> the program wanted to pass a NULL pointer to a structure, but that
> is not what happened.

Ah, I didn't think the compiler would coerce an enum into a pointer
constant. That seems kind of insane. But it is indeed what gcc does.

In that case, we can indeed do the NULL-pointer thing I mentioned. Which
is not even _that_ ugly; it follows the standard.

The "cast DATE_RELATIVE to a pointer and uncast it on the other side"
thing _does_ violate the standard. It is not needed for this, but it
would make the DATE_MODE() macro reentrant.

> > +	static const struct fallback_mode = { DATE_NORMAL };
> 
> Yes, that is nasty.  Renumbering the enum to begin with 1 may be a
> much saner solution, unless somebody does

I am worried more about somebody who does memset(0) on a struct, and
expects that to default to DATE_NORMAL.

> In any case, I did another evil merge to fix it.

OK. Do you want to leave it be, then, or would you prefer me to do the
NULL fallback? Or we could bump the enum to start with 1, and then
explicitly treat "0" as a synonym for DATE_NORMAL (in case it comes in
through a memset or similar).

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/3] convert "enum date_mode" into a struct
  2015-07-07 21:13               ` Jeff King
@ 2015-07-07 21:19                 ` Junio C Hamano
  0 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2015-07-07 21:19 UTC (permalink / raw)
  To: Jeff King; +Cc: H.Merijn Brand, git

Jeff King <peff@peff.net> writes:

> OK. Do you want to leave it be, then, or would you prefer me to do the
> NULL fallback? Or we could bump the enum to start with 1, and then
> explicitly treat "0" as a synonym for DATE_NORMAL (in case it comes in
> through a memset or similar).

I didn't think about the memset() initialization, and keeping the
normal case to be 0 makes tons of sense.

I'd prefer to see the stale code dump core rather than leaving it
stale without anybody noticing.  Hopefully the date-mode change can
hit 'master' fairly soon after the upcoming release, so unless a fix
that involves show_date() need to be written and applied to 2.4
codebase and upwards at the same time, I do not think it is a huge
issue.

Thanks.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-06-30 13:26           ` Jeff King
  2015-06-30 17:05             ` Eric Sunshine
@ 2015-07-21  0:41             ` Eric Sunshine
  2015-07-21  1:19               ` Jeff King
  1 sibling, 1 reply; 31+ messages in thread
From: Eric Sunshine @ 2015-07-21  0:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git List, H.Merijn Brand

On Tue, Jun 30, 2015 at 9:26 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Jun 29, 2015 at 06:22:47PM -0400, Eric Sunshine wrote:
>> Clients of strbuf rightly expect the buffer to grow as needed in
>> order to complete the requested operation. It is, therefore, both
>> weird and expectation-breaking for strbuf_addftime() to lack this
>> behavior. Worse, it doesn't even signal when the format has failed
>> due to insufficient buffer space.
>>
>> How about taking this approach (or something similar), instead, which
>> grows the strbuf as needed?
>
> Here's a patch, on top of jk/date-mode-format (I think it would also be
> fine to just squash into the tip commit; the explanation in the commit
> message is sufficiently mirrored in the code comment).

While cleaning up old local branches, I noticed that, although the
jk/date-mode-format topic[1] made it into 'next' (and will be merged
to 'master' according to "What's cooking"[2]), the below follow-on
patch[3] which improves strbuf_addftime() never got picked up. Was
this omission intentional? Based upon the discussion[4], I was under
the impression that the patch was considered reasonably acceptable
(and did not worsen problems with bogus format strings -- which are
bogus anyway).

[1]: http://thread.gmane.org/gmane.comp.version-control.git/272658/focus=272695
[2]: http://news.gmane.org/gmane.comp.version-control.git
[3]: http://article.gmane.org/gmane.comp.version-control.git/273061
[4]: http://thread.gmane.org/gmane.comp.version-control.git/272658/focus=273026

> -- >8 --
> Subject: [PATCH] strbuf: make strbuf_addftime more robust
>
> The return value of strftime is poorly designed; when it
> returns 0, the caller cannot tell if the buffer was not
> large enough, or if the output was actually 0 bytes. In the
> original implementation of strbuf_addftime, we simply punted
> and guessed that our 128-byte hint would be large enough.
>
> We can do better, though, if we're willing to treat strftime
> like less of a black box. We can munge the incoming format
> to make sure that it never produces 0-length output, and
> then "fix" the resulting output.  That lets us reliably grow
> the buffer based on strftime's return value.
>
> Clever-idea-by: Eric Sunshine <sunshine@sunshineco.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  strbuf.c                | 38 +++++++++++++++++++++-----------------
>  t/t6300-for-each-ref.sh | 10 ++++++++++
>  2 files changed, 31 insertions(+), 17 deletions(-)
>
> diff --git a/strbuf.c b/strbuf.c
> index a7ba028..e5e7370 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -712,29 +712,33 @@ char *xstrfmt(const char *fmt, ...)
>
>  void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm)
>  {
> +       size_t hint = 128;
>         size_t len;
>
> -       /*
> -        * strftime reports "0" if it could not fit the result in the buffer.
> -        * Unfortunately, it also reports "0" if the requested time string
> -        * takes 0 bytes. So if we were to probe and grow, we have to choose
> -        * some arbitrary cap beyond which we guess that the format probably
> -        * just results in a 0-length output. Since we have to choose some
> -        * reasonable cap anyway, and since it is not that big, we may
> -        * as well just grow to their in the first place.
> -        */
> -       strbuf_grow(sb, 128);
> +       if (!*fmt)
> +               return;
> +
> +       strbuf_grow(sb, hint);
>         len = strftime(sb->buf + sb->len, sb->alloc - sb->len, fmt, tm);
>
>         if (!len) {
>                 /*
> -                * Either we failed, or the format actually produces a 0-length
> -                * output. There's not much we can do, so we leave it blank.
> -                * However, the output array is left in an undefined state, so
> -                * we must re-assert our NUL terminator.
> +                * strftime reports "0" if it could not fit the result in the buffer.
> +                * Unfortunately, it also reports "0" if the requested time string
> +                * takes 0 bytes. So our strategy is to munge the format so that the
> +                * output contains at least one character, and then drop the extra
> +                * character before returning.
>                  */
> -               sb->buf[sb->len] = '\0';
> -       } else {
> -               sb->len += len;
> +               struct strbuf munged_fmt = STRBUF_INIT;
> +               strbuf_addf(&munged_fmt, "%s ", fmt);
> +               while (!len) {
> +                       hint *= 2;
> +                       strbuf_grow(sb, hint);
> +                       len = strftime(sb->buf + sb->len, sb->alloc - sb->len,
> +                                      munged_fmt.buf, tm);
> +               }
> +               strbuf_release(&munged_fmt);
> +               len--; /* drop munged space */
>         }
> +       strbuf_setlen(sb, sb->len + len);
>  }
> diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
> index c7f368c..7c9bec7 100755
> --- a/t/t6300-for-each-ref.sh
> +++ b/t/t6300-for-each-ref.sh
> @@ -235,6 +235,16 @@ test_expect_success 'Check format of strftime date fields' '
>         test_cmp expected actual
>  '
>
> +test_expect_success 'exercise strftime with odd fields' '
> +       echo >expected &&
> +       git for-each-ref --format="%(authordate:format:)" refs/heads >actual &&
> +       test_cmp expected actual &&
> +       long="long format -- $_z40$_z40$_z40$_z40$_z40$_z40$_z40" &&
> +       echo $long >expected &&
> +       git for-each-ref --format="%(authordate:format:$long)" refs/heads >actual &&
> +       test_cmp expected actual
> +'
> +
>  cat >expected <<\EOF
>  refs/heads/master
>  refs/remotes/origin/master
> --
> 2.5.0.rc0.336.g8460790
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/3] introduce "format" date-mode
  2015-07-21  0:41             ` Eric Sunshine
@ 2015-07-21  1:19               ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2015-07-21  1:19 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, Git List, H.Merijn Brand

On Mon, Jul 20, 2015 at 08:41:08PM -0400, Eric Sunshine wrote:

> > Here's a patch, on top of jk/date-mode-format (I think it would also be
> > fine to just squash into the tip commit; the explanation in the commit
> > message is sufficiently mirrored in the code comment).
> 
> While cleaning up old local branches, I noticed that, although the
> jk/date-mode-format topic[1] made it into 'next' (and will be merged
> to 'master' according to "What's cooking"[2]), the below follow-on
> patch[3] which improves strbuf_addftime() never got picked up. Was
> this omission intentional? Based upon the discussion[4], I was under
> the impression that the patch was considered reasonably acceptable
> (and did not worsen problems with bogus format strings -- which are
> bogus anyway).

Thanks for noticing. I do think the patch you quoted (to loop and grow
the strbuf) is a good change. The original code would easily bite
somebody with a really large date format, whereas this should work
sanely everywhere, short of malformed inputs. And even then, I'd expect
reasonable behavior on most systems. The obvious thing to worry about is
a system where feeding a malformed "% " causes strftime to return 0, no
matter what, and we reallocated and loop forever. But:

  1. I don't even know if such a system exists.

  2. We probably would blow up on malloc() eventually, so it wouldn't
     even be a "real" infinite loop.

So I think the worst case is probably that we get a report later on from
somebody on an arcane system that says "I fed crap to --date=format, and
my git died with an out-of-memory error", and then we figure out exactly
_how_ their system is weird and deal with it then.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2015-07-21  1:19 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-25 11:19 several date related issues H.Merijn Brand
2015-06-25 12:44 ` Jeff King
2015-06-25 12:56   ` H.Merijn Brand
2015-06-25 16:53     ` [PATCH 0/3] localized date format Jeff King
2015-06-25 16:54       ` [PATCH 1/3] show-branch: use DATE_RELATIVE instead of magic number Jeff King
2015-06-25 16:55       ` [PATCH 2/3] convert "enum date_mode" into a struct Jeff King
2015-06-25 17:03         ` John Keeping
2015-06-25 17:22           ` Jeff King
2015-07-07 20:37         ` Junio C Hamano
2015-07-07 20:48           ` Jeff King
2015-07-07 21:05             ` Junio C Hamano
2015-07-07 21:13               ` Jeff King
2015-07-07 21:19                 ` Junio C Hamano
2015-06-25 16:55       ` [PATCH 3/3] introduce "format" date-mode Jeff King
2015-06-29 22:22         ` Eric Sunshine
2015-06-30 10:20           ` Jeff King
2015-06-30 16:22             ` Junio C Hamano
2015-06-30 17:50               ` Jeff King
2015-06-30 19:23                 ` Junio C Hamano
2015-06-30 19:33                   ` Jeff King
2015-06-30 16:58             ` Eric Sunshine
2015-06-30 17:58               ` Jeff King
2015-06-30 18:13                 ` Eric Sunshine
2015-06-30 19:22                   ` Jeff King
2015-06-30 17:05             ` Junio C Hamano
2015-06-30 17:48               ` Eric Sunshine
2015-06-30 19:17               ` Jeff King
2015-06-30 13:26           ` Jeff King
2015-06-30 17:05             ` Eric Sunshine
2015-07-21  0:41             ` Eric Sunshine
2015-07-21  1:19               ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).