git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more
@ 2012-11-25 11:08 Michael Haggerty
  2012-11-25 11:08 ` [PATCH 1/8] Add new function strbuf_add_xml_quoted() Michael Haggerty
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-11-25 11:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Michael Haggerty

There were two functions doing almost the same XML quoting of
character entities, so implement a library function
strbuf_addstr_xml_quoted() and use that in both places.

Along the way, do a lot of simplification within imap-send.c, which
was doing a lot of its own string management instead of using strbuf.

Please note that "git imap-send" is utterly absent from the test
suite, probably due to the difficulty of testing without a real IMAP
server.  I ran some manual tests after my changes and didn't find any
problems.

The bug that I reported on 2012-11-12, namely that

    git format-patch --signoff --stdout --attach origin | git imap-send

is broken, is not addressed by these patches.

Michael Haggerty (8):
  Add new function strbuf_add_xml_quoted()
  xml_entities(): use function strbuf_addstr_xml_quoted()
  lf_to_crlf(): NUL-terminate msg_data::data
  imap-send: store all_msgs as a strbuf
  imap-send: correctly report errors reading from stdin
  imap-send: change msg_data from storing (char *, len) to storing
    strbuf
  wrap_in_html(): use strbuf_addstr_xml_quoted()
  wrap_in_html(): process message in bulk rather than line-by-line

 http-push.c |  23 +--------
 imap-send.c | 157 +++++++++++++++++++++++++++---------------------------------
 strbuf.c    |  26 ++++++++++
 strbuf.h    |   6 +++
 4 files changed, 104 insertions(+), 108 deletions(-)

-- 
1.8.0

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/8] Add new function strbuf_add_xml_quoted()
  2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
@ 2012-11-25 11:08 ` Michael Haggerty
  2012-11-25 11:08 ` [PATCH 2/8] xml_entities(): use function strbuf_addstr_xml_quoted() Michael Haggerty
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-11-25 11:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Michael Haggerty

Substantially the same code is present in http-push.c and imap-send.c,
so make a library function out of it.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 strbuf.c | 26 ++++++++++++++++++++++++++
 strbuf.h |  6 ++++++
 2 files changed, 32 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 05d0693..9a373be 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -425,6 +425,32 @@ void strbuf_add_lines(struct strbuf *out, const char *prefix,
 	strbuf_complete_line(out);
 }
 
+void strbuf_addstr_xml_quoted(struct strbuf *buf, const char *s)
+{
+	while (*s) {
+		size_t len = strcspn(s, "\"<>&");
+		strbuf_add(buf, s, len);
+		s += len;
+		switch (*s) {
+		case '"':
+			strbuf_addstr(buf, "&quot;");
+			break;
+		case '<':
+			strbuf_addstr(buf, "&lt;");
+			break;
+		case '>':
+			strbuf_addstr(buf, "&gt;");
+			break;
+		case '&':
+			strbuf_addstr(buf, "&amp;");
+			break;
+		case 0:
+			return;
+		}
+		s++;
+	}
+}
+
 static int is_rfc3986_reserved(char ch)
 {
 	switch (ch) {
diff --git a/strbuf.h b/strbuf.h
index aa386c6..ecae4e2 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -136,6 +136,12 @@ extern void strbuf_vaddf(struct strbuf *sb, const char *fmt, va_list ap);
 
 extern void strbuf_add_lines(struct strbuf *sb, const char *prefix, const char *buf, size_t size);
 
+/*
+ * Append s to sb, with the characters '<', '>', '&' and '"' converted
+ * into XML entities.
+ */
+extern void strbuf_addstr_xml_quoted(struct strbuf *sb, const char *s);
+
 static inline void strbuf_complete_line(struct strbuf *sb)
 {
 	if (sb->len && sb->buf[sb->len - 1] != '\n')
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/8] xml_entities(): use function strbuf_addstr_xml_quoted()
  2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
  2012-11-25 11:08 ` [PATCH 1/8] Add new function strbuf_add_xml_quoted() Michael Haggerty
@ 2012-11-25 11:08 ` Michael Haggerty
  2012-11-25 11:08 ` [PATCH 3/8] lf_to_crlf(): NUL-terminate msg_data::data Michael Haggerty
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-11-25 11:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 http-push.c | 23 +----------------------
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/http-push.c b/http-push.c
index 8701c12..9923441 100644
--- a/http-push.c
+++ b/http-push.c
@@ -172,28 +172,7 @@ enum dav_header_flag {
 static char *xml_entities(const char *s)
 {
 	struct strbuf buf = STRBUF_INIT;
-	while (*s) {
-		size_t len = strcspn(s, "\"<>&");
-		strbuf_add(&buf, s, len);
-		s += len;
-		switch (*s) {
-		case '"':
-			strbuf_addstr(&buf, "&quot;");
-			break;
-		case '<':
-			strbuf_addstr(&buf, "&lt;");
-			break;
-		case '>':
-			strbuf_addstr(&buf, "&gt;");
-			break;
-		case '&':
-			strbuf_addstr(&buf, "&amp;");
-			break;
-		case 0:
-			return strbuf_detach(&buf, NULL);
-		}
-		s++;
-	}
+	strbuf_addstr_xml_quoted(&buf, s);
 	return strbuf_detach(&buf, NULL);
 }
 
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/8] lf_to_crlf(): NUL-terminate msg_data::data
  2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
  2012-11-25 11:08 ` [PATCH 1/8] Add new function strbuf_add_xml_quoted() Michael Haggerty
  2012-11-25 11:08 ` [PATCH 2/8] xml_entities(): use function strbuf_addstr_xml_quoted() Michael Haggerty
@ 2012-11-25 11:08 ` Michael Haggerty
  2012-11-25 11:08 ` [PATCH 4/8] imap-send: store all_msgs as a strbuf Michael Haggerty
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-11-25 11:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Michael Haggerty

Through the rest of the file, the data member of struct msg_data is
kept NUL-terminated, and that fact is relied upon in a couple of
places.  Change lf_to_crlf() to preserve this invariant.

In fact, there are no execution paths in which lf_to_crlf() is called
and then its data member is required to be NUL-terminated, but it is
better to be consistent to prevent future confusion.

Document the invariant in the struct msg_data definition.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 imap-send.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/imap-send.c b/imap-send.c
index d42e471..c818b0c 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -69,8 +69,12 @@ struct store {
 };
 
 struct msg_data {
+	/* NUL-terminated data: */
 	char *data;
+
+	/* length of data (not including NUL): */
 	int len;
+
 	unsigned char flags;
 };
 
@@ -1276,7 +1280,7 @@ static void lf_to_crlf(struct msg_data *msg)
 			lfnum++;
 	}
 
-	new = xmalloc(msg->len + lfnum);
+	new = xmalloc(msg->len + lfnum + 1);
 	if (msg->data[0] == '\n') {
 		new[0] = '\r';
 		new[1] = '\n';
@@ -1297,6 +1301,7 @@ static void lf_to_crlf(struct msg_data *msg)
 		/* otherwise it already had CR before */
 		new[j++] = '\n';
 	}
+	new[j] = '\0';
 	msg->len += lfnum;
 	free(msg->data);
 	msg->data = new;
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/8] imap-send: store all_msgs as a strbuf
  2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
                   ` (2 preceding siblings ...)
  2012-11-25 11:08 ` [PATCH 3/8] lf_to_crlf(): NUL-terminate msg_data::data Michael Haggerty
@ 2012-11-25 11:08 ` Michael Haggerty
  2012-11-25 11:08 ` [PATCH 5/8] imap-send: correctly report errors reading from stdin Michael Haggerty
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-11-25 11:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Michael Haggerty

all_msgs is only used as a glorified string, therefore there is no
reason to declare it as a struct msg_data.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 imap-send.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index c818b0c..50e223a 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1391,26 +1391,20 @@ static void wrap_in_html(struct msg_data *msg)
 
 #define CHUNKSIZE 0x1000
 
-static int read_message(FILE *f, struct msg_data *msg)
+static int read_message(FILE *f, struct strbuf *all_msgs)
 {
-	struct strbuf buf = STRBUF_INIT;
-
-	memset(msg, 0, sizeof(*msg));
-
 	do {
-		if (strbuf_fread(&buf, CHUNKSIZE, f) <= 0)
+		if (strbuf_fread(all_msgs, CHUNKSIZE, f) <= 0)
 			break;
 	} while (!feof(f));
 
-	msg->len  = buf.len;
-	msg->data = strbuf_detach(&buf, NULL);
-	return msg->len;
+	return all_msgs->len;
 }
 
-static int count_messages(struct msg_data *msg)
+static int count_messages(struct strbuf *all_msgs)
 {
 	int count = 0;
-	char *p = msg->data;
+	char *p = all_msgs->buf;
 
 	while (1) {
 		if (!prefixcmp(p, "From ")) {
@@ -1431,7 +1425,7 @@ static int count_messages(struct msg_data *msg)
 	return count;
 }
 
-static int split_msg(struct msg_data *all_msgs, struct msg_data *msg, int *ofs)
+static int split_msg(struct strbuf *all_msgs, struct msg_data *msg, int *ofs)
 {
 	char *p, *data;
 
@@ -1439,7 +1433,7 @@ static int split_msg(struct msg_data *all_msgs, struct msg_data *msg, int *ofs)
 	if (*ofs >= all_msgs->len)
 		return 0;
 
-	data = &all_msgs->data[*ofs];
+	data = &all_msgs->buf[*ofs];
 	msg->len = all_msgs->len - *ofs;
 
 	if (msg->len < 5 || prefixcmp(data, "From "))
@@ -1509,7 +1503,8 @@ static int git_imap_config(const char *key, const char *val, void *cb)
 
 int main(int argc, char **argv)
 {
-	struct msg_data all_msgs, msg;
+	struct strbuf all_msgs = STRBUF_INIT;
+	struct msg_data msg;
 	struct store *ctx = NULL;
 	int ofs = 0;
 	int r;
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/8] imap-send: correctly report errors reading from stdin
  2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
                   ` (3 preceding siblings ...)
  2012-11-25 11:08 ` [PATCH 4/8] imap-send: store all_msgs as a strbuf Michael Haggerty
@ 2012-11-25 11:08 ` Michael Haggerty
  2012-11-25 11:08 ` [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf Michael Haggerty
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-11-25 11:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Michael Haggerty

Previously, read_message() didn't distinguish between an error and eof
when reading its input.  This could have resulted in incorrect
behavior if there was an error: (1) reporting "nothing to send" if no
bytes were read or (2) sending an incomplete message if some bytes
were read before the error.

Change read_message() to return -1 on ferror()s and 0 on success, so
that the caller can recognize that an error occurred.  (The return
value used to be the length of the input read, which was redundant
because that is already available as the strbuf length.

Change the caller to report errors correctly.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 imap-send.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index 50e223a..86cf603 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1398,7 +1398,7 @@ static int read_message(FILE *f, struct strbuf *all_msgs)
 			break;
 	} while (!feof(f));
 
-	return all_msgs->len;
+	return ferror(f) ? -1 : 0;
 }
 
 static int count_messages(struct strbuf *all_msgs)
@@ -1537,7 +1537,12 @@ int main(int argc, char **argv)
 	}
 
 	/* read the messages */
-	if (!read_message(stdin, &all_msgs)) {
+	if (read_message(stdin, &all_msgs)) {
+		fprintf(stderr, "error reading input\n");
+		return 1;
+	}
+
+	if (all_msgs.len == 0) {
 		fprintf(stderr, "nothing to send\n");
 		return 1;
 	}
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf
  2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
                   ` (4 preceding siblings ...)
  2012-11-25 11:08 ` [PATCH 5/8] imap-send: correctly report errors reading from stdin Michael Haggerty
@ 2012-11-25 11:08 ` Michael Haggerty
  2012-11-29 21:30   ` Junio C Hamano
  2012-11-25 11:08 ` [PATCH 7/8] wrap_in_html(): use strbuf_addstr_xml_quoted() Michael Haggerty
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Michael Haggerty @ 2012-11-25 11:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Michael Haggerty

struct msg_data stored (char *, len) of the data to be included in a
message, kept the character data NUL-terminated, etc., much like a
strbuf would do.  So change it to use a struct strbuf.  This makes the
code clearer and reduces copying a little bit.

A side effect of this change is that the memory for each message is
freed after it is used rather than leaked, though that detail is
unimportant given that imap-send is a top-level command.

--

For some reason, there is a bunch of infrastructure in this file for
dealing with IMAP flags, although there is nothing in the code that
actually allows any flags to be set.  If there is no plan to add
support for flags in the future, a bunch of code could be ripped out
and "struct msg_data" could be completely replaced with strbuf.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 imap-send.c | 92 +++++++++++++++++++++++++++++++------------------------------
 1 file changed, 47 insertions(+), 45 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index 86cf603..a5e0e33 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -69,12 +69,7 @@ struct store {
 };
 
 struct msg_data {
-	/* NUL-terminated data: */
-	char *data;
-
-	/* length of data (not including NUL): */
-	int len;
-
+	struct strbuf data;
 	unsigned char flags;
 };
 
@@ -1268,46 +1263,49 @@ static int imap_make_flags(int flags, char *buf)
 	return d;
 }
 
-static void lf_to_crlf(struct msg_data *msg)
+static void lf_to_crlf(struct strbuf *msg)
 {
+	size_t new_len;
 	char *new;
 	int i, j, lfnum = 0;
 
-	if (msg->data[0] == '\n')
+	if (msg->buf[0] == '\n')
 		lfnum++;
 	for (i = 1; i < msg->len; i++) {
-		if (msg->data[i - 1] != '\r' && msg->data[i] == '\n')
+		if (msg->buf[i - 1] != '\r' && msg->buf[i] == '\n')
 			lfnum++;
 	}
 
-	new = xmalloc(msg->len + lfnum + 1);
-	if (msg->data[0] == '\n') {
+	new_len = msg->len + lfnum;
+	new = xmalloc(new_len + 1);
+	if (msg->buf[0] == '\n') {
 		new[0] = '\r';
 		new[1] = '\n';
 		i = 1;
 		j = 2;
 	} else {
-		new[0] = msg->data[0];
+		new[0] = msg->buf[0];
 		i = 1;
 		j = 1;
 	}
 	for ( ; i < msg->len; i++) {
-		if (msg->data[i] != '\n') {
-			new[j++] = msg->data[i];
+		if (msg->buf[i] != '\n') {
+			new[j++] = msg->buf[i];
 			continue;
 		}
-		if (msg->data[i - 1] != '\r')
+		if (msg->buf[i - 1] != '\r')
 			new[j++] = '\r';
 		/* otherwise it already had CR before */
 		new[j++] = '\n';
 	}
-	new[j] = '\0';
-	msg->len += lfnum;
-	free(msg->data);
-	msg->data = new;
+	strbuf_attach(msg, new, new_len, new_len + 1);
 }
 
-static int imap_store_msg(struct store *gctx, struct msg_data *data)
+/*
+ * Store msg to IMAP.  Also detach and free the data from msg->data,
+ * leaving msg->data empty.
+ */
+static int imap_store_msg(struct store *gctx, struct msg_data *msg)
 {
 	struct imap_store *ctx = (struct imap_store *)gctx;
 	struct imap *imap = ctx->imap;
@@ -1316,16 +1314,15 @@ static int imap_store_msg(struct store *gctx, struct msg_data *data)
 	int ret, d;
 	char flagstr[128];
 
-	lf_to_crlf(data);
+	lf_to_crlf(&msg->data);
 	memset(&cb, 0, sizeof(cb));
 
-	cb.dlen = data->len;
-	cb.data = xmalloc(cb.dlen);
-	memcpy(cb.data, data->data, data->len);
+	cb.dlen = msg->data.len;
+	cb.data = strbuf_detach(&msg->data, NULL);
 
 	d = 0;
-	if (data->flags) {
-		d = imap_make_flags(data->flags, flagstr);
+	if (msg->flags) {
+		d = imap_make_flags(msg->flags, flagstr);
 		flagstr[d++] = ' ';
 	}
 	flagstr[d] = 0;
@@ -1356,7 +1353,8 @@ static void encode_html_chars(struct strbuf *p)
 			strbuf_splice(p, i, 1, "&quot;", 6);
 	}
 }
-static void wrap_in_html(struct msg_data *msg)
+
+static void wrap_in_html(struct strbuf *msg)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct strbuf **lines;
@@ -1366,9 +1364,7 @@ static void wrap_in_html(struct msg_data *msg)
 	static char *pre_close = "</pre>\n";
 	int added_header = 0;
 
-	strbuf_attach(&buf, msg->data, msg->len, msg->len);
-	lines = strbuf_split(&buf, '\n');
-	strbuf_release(&buf);
+	lines = strbuf_split(msg, '\n');
 	for (p = lines; *p; p++) {
 		if (! added_header) {
 			if ((*p)->len == 1 && *((*p)->buf) == '\n') {
@@ -1385,8 +1381,8 @@ static void wrap_in_html(struct msg_data *msg)
 	}
 	strbuf_addstr(&buf, pre_close);
 	strbuf_list_free(lines);
-	msg->len  = buf.len;
-	msg->data = strbuf_detach(&buf, NULL);
+	strbuf_release(msg);
+	*msg = buf;
 }
 
 #define CHUNKSIZE 0x1000
@@ -1425,34 +1421,39 @@ static int count_messages(struct strbuf *all_msgs)
 	return count;
 }
 
-static int split_msg(struct strbuf *all_msgs, struct msg_data *msg, int *ofs)
+/*
+ * Copy the next message from all_msgs, starting at offset *ofs, to
+ * msg.  Update *ofs to the start of the following message.  Return
+ * true iff a message was successfully copied.
+ */
+static int split_msg(struct strbuf *all_msgs, struct strbuf *msg, int *ofs)
 {
 	char *p, *data;
+	size_t len;
 
-	memset(msg, 0, sizeof *msg);
 	if (*ofs >= all_msgs->len)
 		return 0;
 
 	data = &all_msgs->buf[*ofs];
-	msg->len = all_msgs->len - *ofs;
+	len = all_msgs->len - *ofs;
 
-	if (msg->len < 5 || prefixcmp(data, "From "))
+	if (len < 5 || prefixcmp(data, "From "))
 		return 0;
 
 	p = strchr(data, '\n');
 	if (p) {
-		p = &p[1];
-		msg->len -= p-data;
-		*ofs += p-data;
+		p++;
+		len -= p - data;
+		*ofs += p - data;
 		data = p;
 	}
 
 	p = strstr(data, "\nFrom ");
 	if (p)
-		msg->len = &p[1] - data;
+		len = &p[1] - data;
 
-	msg->data = xmemdupz(data, msg->len);
-	*ofs += msg->len;
+	strbuf_add(msg, data, len);
+	*ofs += len;
 	return 1;
 }
 
@@ -1504,7 +1505,7 @@ static int git_imap_config(const char *key, const char *val, void *cb)
 int main(int argc, char **argv)
 {
 	struct strbuf all_msgs = STRBUF_INIT;
-	struct msg_data msg;
+	struct msg_data msg = {STRBUF_INIT, 0};
 	struct store *ctx = NULL;
 	int ofs = 0;
 	int r;
@@ -1564,11 +1565,12 @@ int main(int argc, char **argv)
 	ctx->name = imap_folder;
 	while (1) {
 		unsigned percent = n * 100 / total;
+
 		fprintf(stderr, "%4u%% (%d/%d) done\r", percent, n, total);
-		if (!split_msg(&all_msgs, &msg, &ofs))
+		if (!split_msg(&all_msgs, &msg.data, &ofs))
 			break;
 		if (server.use_html)
-			wrap_in_html(&msg);
+			wrap_in_html(&msg.data);
 		r = imap_store_msg(ctx, &msg);
 		if (r != DRV_OK)
 			break;
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 7/8] wrap_in_html(): use strbuf_addstr_xml_quoted()
  2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
                   ` (5 preceding siblings ...)
  2012-11-25 11:08 ` [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf Michael Haggerty
@ 2012-11-25 11:08 ` Michael Haggerty
  2012-11-29 21:41 ` [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Junio C Hamano
       [not found] ` <1353841721-16269-9-git-send-email-mhagger@alum.mit.edu>
  8 siblings, 0 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-11-25 11:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Michael Haggerty

Use the new function to quote characters as they are being added to
buf, rather than quoting them in *p and then copying them into buf.
This increases code sharing, and changes the algorithm from O(N^2) to
O(N) in the number of characters in a line.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 imap-send.c | 23 ++++-------------------
 1 file changed, 4 insertions(+), 19 deletions(-)

diff --git a/imap-send.c b/imap-send.c
index a5e0e33..b73c913 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1339,21 +1339,6 @@ static int imap_store_msg(struct store *gctx, struct msg_data *msg)
 	return DRV_OK;
 }
 
-static void encode_html_chars(struct strbuf *p)
-{
-	int i;
-	for (i = 0; i < p->len; i++) {
-		if (p->buf[i] == '&')
-			strbuf_splice(p, i, 1, "&amp;", 5);
-		if (p->buf[i] == '<')
-			strbuf_splice(p, i, 1, "&lt;", 4);
-		if (p->buf[i] == '>')
-			strbuf_splice(p, i, 1, "&gt;", 4);
-		if (p->buf[i] == '"')
-			strbuf_splice(p, i, 1, "&quot;", 6);
-	}
-}
-
 static void wrap_in_html(struct strbuf *msg)
 {
 	struct strbuf buf = STRBUF_INIT;
@@ -1372,12 +1357,12 @@ static void wrap_in_html(struct strbuf *msg)
 				strbuf_addbuf(&buf, *p);
 				strbuf_addstr(&buf, pre_open);
 				added_header = 1;
-				continue;
+			} else {
+				strbuf_addbuf(&buf, *p);
 			}
+		} else {
+			strbuf_addstr_xml_quoted(&buf, (*p)->buf);
 		}
-		else
-			encode_html_chars(*p);
-		strbuf_addbuf(&buf, *p);
 	}
 	strbuf_addstr(&buf, pre_close);
 	strbuf_list_free(lines);
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf
  2012-11-25 11:08 ` [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf Michael Haggerty
@ 2012-11-29 21:30   ` Junio C Hamano
  2012-11-29 23:43     ` Jeff King
  2012-11-30 13:36     ` Michael Haggerty
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2012-11-29 21:30 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Jeremy White, Johannes Schindelin

Michael Haggerty <mhagger@alum.mit.edu> writes:

> struct msg_data stored (char *, len) of the data to be included in a

That (<type>, <varname>) is a bit funny notation, even though it is
understandable.

> message, kept the character data NUL-terminated, etc., much like a
> strbuf would do.  So change it to use a struct strbuf.  This makes the
> code clearer and reduces copying a little bit.
>
> A side effect of this change is that the memory for each message is
> freed after it is used rather than leaked, though that detail is
> unimportant given that imap-send is a top-level command.
>
> --

?

> For some reason, there is a bunch of infrastructure in this file for
> dealing with IMAP flags, although there is nothing in the code that
> actually allows any flags to be set.  If there is no plan to add
> support for flags in the future, a bunch of code could be ripped out
> and "struct msg_data" could be completely replaced with strbuf.

Yeah, after all these years we have kept the unused flags field
there and nobody needed anything out of it.  I am OK with a removal
if it is done at the very end of the series.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more
  2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
                   ` (6 preceding siblings ...)
  2012-11-25 11:08 ` [PATCH 7/8] wrap_in_html(): use strbuf_addstr_xml_quoted() Michael Haggerty
@ 2012-11-29 21:41 ` Junio C Hamano
       [not found] ` <1353841721-16269-9-git-send-email-mhagger@alum.mit.edu>
  8 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2012-11-29 21:41 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Jeremy White, Johannes Schindelin

Michael Haggerty <mhagger@alum.mit.edu> writes:

> There were two functions doing almost the same XML quoting of
> character entities, so implement a library function
> strbuf_addstr_xml_quoted() and use that in both places.
>
> Along the way, do a lot of simplification within imap-send.c, which
> was doing a lot of its own string management instead of using strbuf.

Overall the series looked good to me.  Thanks; will queue.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf
  2012-11-29 21:30   ` Junio C Hamano
@ 2012-11-29 23:43     ` Jeff King
  2012-11-30 13:36     ` Michael Haggerty
  1 sibling, 0 replies; 18+ messages in thread
From: Jeff King @ 2012-11-29 23:43 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git, Jeremy White, Johannes Schindelin

On Thu, Nov 29, 2012 at 01:30:54PM -0800, Junio C Hamano wrote:

> > For some reason, there is a bunch of infrastructure in this file for
> > dealing with IMAP flags, although there is nothing in the code that
> > actually allows any flags to be set.  If there is no plan to add
> > support for flags in the future, a bunch of code could be ripped out
> > and "struct msg_data" could be completely replaced with strbuf.
> 
> Yeah, after all these years we have kept the unused flags field
> there and nobody needed anything out of it.  I am OK with a removal
> if it is done at the very end of the series.

There's a bunch of unused junk in imap-send. The original implementation
copied a bunch of code from isync, a much more full-featured imap
client, and the result ended up way more complex than it needed to be. I
have ripped a few things out over the years when they cause a problem
(e.g., portability of /dev/urandom, conflict over the name "struct
string_list"), but have mostly let it be out of a vague sense that we
might one day want to pull bugfixes from isync upstream.

That has not happened once in the last six years, though, and I would
doubt that a straightforward merge would work after so many years. So
ripping out and refactoring the code in the name of maintainability is
probably a good thing at this point.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf
  2012-11-29 21:30   ` Junio C Hamano
  2012-11-29 23:43     ` Jeff King
@ 2012-11-30 13:36     ` Michael Haggerty
  2012-12-02  1:48       ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Michael Haggerty @ 2012-11-30 13:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Jeff King

On 11/29/2012 10:30 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> struct msg_data stored (char *, len) of the data to be included in a
> 
> That (<type>, <varname>) is a bit funny notation, even though it is
> understandable.

I understand that it is funny, but it seems like the clearest way to
express what is meant in a way that fits in the summary line.  Feel free
to change it if you like.

>> message, kept the character data NUL-terminated, etc., much like a
>> strbuf would do.  So change it to use a struct strbuf.  This makes the
>> code clearer and reduces copying a little bit.
>>
>> A side effect of this change is that the memory for each message is
>> freed after it is used rather than leaked, though that detail is
>> unimportant given that imap-send is a top-level command.
>>
>> --
> 
> ?

If by "?" you are wondering where the memory leak was, it was:

* The while loop in main() called split_msg()

  * split_msg() cleared the msg_data structure using
    memset(msg, 0, sizeof *msg)

  * split_msg() copied the first message out of all_msgs using
    xmemdupz() and stored the result to msg->data

* The msg_data was passed to imap_store_msg().  Its contents were
  copied to cb.data (which will be freed in the imap functions) but
  the original was left unfreed.

* The next time through the loop, split_msg() zeroed the msg_data
  structure again, thus discarding the pointer to the xmemdupz()ed
  memory.

The leak caused more memory than necessary to be allocated (worst case:
nearly the total size of all_msgs).  But (a) all_msgs is already stored
in memory, so the wastage is at most a factor of 2; and (b) this all
happens in main() shortly before program exit erases all sins.

I didn't bother documenting this in the commit message because the patch
changes the code anyway, but feel free to add the above explanation to
the commit message if you think it is useful.

>> For some reason, there is a bunch of infrastructure in this file for
>> dealing with IMAP flags, although there is nothing in the code that
>> actually allows any flags to be set.  If there is no plan to add
>> support for flags in the future, a bunch of code could be ripped out
>> and "struct msg_data" could be completely replaced with strbuf.
> 
> Yeah, after all these years we have kept the unused flags field
> there and nobody needed anything out of it.  I am OK with a removal
> if it is done at the very end of the series.

I don't think the removal of flags needs to be part of the same series.
 I suggest a separate patch series dedicated to deleting *all* the extra
imap infrastructure at once.  That being said, I'm not committing to do
so.  (We could add it to an "straightforward projects for aspiring git
developers" list, if we had such a thing.)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 8/8] wrap_in_html(): process message in bulk rather than line-by-line
       [not found]   ` <7v7gp4p00u.fsf@alter.siamese.dyndns.org>
@ 2012-11-30 13:40     ` Michael Haggerty
  2012-12-02  9:25       ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Haggerty @ 2012-11-30 13:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Jeff King

[-- Attachment #1: Type: text/plain, Size: 1456 bytes --]

On 11/29/2012 10:33 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Now that we can xml-quote an arbitrary string in O(N), there is no
>> reason to process the message line by line.  This change saves lots of
>> memory allocations and copying.
>>
>> The old code would have created invalid output for a malformed input
>> message (one that does not contain a blank line separating the header
>> from the body).  The new code die()s in this situation.
> 
> Given that imap-send is about sending a patch the distinction would
> not matter in practice, but isn't the difference between the two
> that the new version would not allow sending a header-only message
> without a body, while the old one allowed it?

I was thinking that the end-of-header line is a required part of an
RFC2282 email message, but I was wrong.  If you squash the attached
patch onto this commit, it will handle emails without bodies correctly.

Nevertheless, the old code was even *more* broken because it added a
"</pre>" regardless of whether the separator line had been seen, and
therefore a message without an end-of-header line would come out like

    Header1: foo
    Header2: bar
    </pre>

with no content_type line, no pre_open, and </pre> appended to the
header without a blank line in between.  This is the "invalid output"
that I was referring to.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

[-- Attachment #2: addstr-xml-quoted-addendum.diff --]
[-- Type: text/x-patch, Size: 337 bytes --]

diff --git a/imap-send.c b/imap-send.c
index eec9e35..e521e2f 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1348,7 +1348,7 @@ static void wrap_in_html(struct strbuf *msg)
 	const char *body = strstr(msg->buf, "\n\n");
 
 	if (!body)
-		die("malformed message");
+		return; /* Headers but no body; no wrapping needed */
 
 	body += 2;
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf
  2012-11-30 13:36     ` Michael Haggerty
@ 2012-12-02  1:48       ` Junio C Hamano
  2012-12-02  6:03         ` Michael Haggerty
  2012-12-03 15:06         ` Thiago Farina
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2012-12-02  1:48 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Jeremy White, Johannes Schindelin, Jeff King

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 11/29/2012 10:30 PM, Junio C Hamano wrote:
> 
>>> A side effect of this change is that the memory for each message is
>>> freed after it is used rather than leaked, though that detail is
>>> unimportant given that imap-send is a top-level command.
>>>
>>> --
>> 
>> ?
>
> If by "?" you are wondering where the memory leak was, it was:

No, I was wondering if you meant to say "---" to mark te remainder
of what you wrote does not exactly belong to the log message.

>>> For some reason, there is a bunch of infrastructure in this file for
>>> dealing with IMAP flags, although there is nothing in the code that
>>> actually allows any flags to be set.  If there is no plan to add
>>> support for flags in the future, a bunch of code could be ripped out
>>> and "struct msg_data" could be completely replaced with strbuf.
>> 
>> Yeah, after all these years we have kept the unused flags field
>> there and nobody needed anything out of it.  I am OK with a removal
>> if it is done at the very end of the series.
>
> I don't think the removal of flags needs to be part of the same series.

Oh, I did not think so, either.

> I suggest a separate patch series dedicated to deleting *all* the extra
> imap infrastructure at once.  That being said, I'm not committing to do
> so.  (We could add it to an "straightforward projects for aspiring git
> developers" list, if we had such a thing.)

A "low-hanging fruit and/or janitorial work" stack may be worth
having.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf
  2012-12-02  1:48       ` Junio C Hamano
@ 2012-12-02  6:03         ` Michael Haggerty
  2012-12-03 15:06         ` Thiago Farina
  1 sibling, 0 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-12-02  6:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Jeff King

On 12/02/2012 02:48 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> On 11/29/2012 10:30 PM, Junio C Hamano wrote:
>>
>>>> A side effect of this change is that the memory for each message is
>>>> freed after it is used rather than leaked, though that detail is
>>>> unimportant given that imap-send is a top-level command.
>>>>
>>>> --
>>>
>>> ?
>>
>> If by "?" you are wondering where the memory leak was, it was:
> 
> No, I was wondering if you meant to say "---" to mark te remainder
> of what you wrote does not exactly belong to the log message.

Oh.  Yes, that was my intention.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 8/8] wrap_in_html(): process message in bulk rather than line-by-line
  2012-11-30 13:40     ` [PATCH 8/8] wrap_in_html(): process message in bulk rather than line-by-line Michael Haggerty
@ 2012-12-02  9:25       ` Junio C Hamano
  2012-12-02 10:35         ` Michael Haggerty
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2012-12-02  9:25 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Jeremy White, Johannes Schindelin, Jeff King

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Nevertheless, the old code was even *more* broken because it added a
> "</pre>" regardless of whether the separator line had been seen,...

OK. I'll rewrite the tail-end of the original log message to read:

    The old code would have created invalid output when there was no
    body, emitting a closing </pre> without a blank line nor an opening
    <pre> after the header.  The new code simply returns in this
    situation without doing harm (even though either would not make much
    sense in the context of imap-send that is meant to send out patches).

and squash this in.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 8/8] wrap_in_html(): process message in bulk rather than line-by-line
  2012-12-02  9:25       ` Junio C Hamano
@ 2012-12-02 10:35         ` Michael Haggerty
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Haggerty @ 2012-12-02 10:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeremy White, Johannes Schindelin, Jeff King

On 12/02/2012 10:25 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Nevertheless, the old code was even *more* broken because it added a
>> "</pre>" regardless of whether the separator line had been seen,...
> 
> OK. I'll rewrite the tail-end of the original log message to read:
> 
>     The old code would have created invalid output when there was no
>     body, emitting a closing </pre> without a blank line nor an opening
>     <pre> after the header.  The new code simply returns in this
>     situation without doing harm (even though either would not make much
>     sense in the context of imap-send that is meant to send out patches).
> 
> and squash this in.

ACK.  Thanks.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf
  2012-12-02  1:48       ` Junio C Hamano
  2012-12-02  6:03         ` Michael Haggerty
@ 2012-12-03 15:06         ` Thiago Farina
  1 sibling, 0 replies; 18+ messages in thread
From: Thiago Farina @ 2012-12-03 15:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael Haggerty, git, Jeremy White, Johannes Schindelin,
	Jeff King

On Sat, Dec 1, 2012 at 11:48 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> I suggest a separate patch series dedicated to deleting *all* the extra
>> imap infrastructure at once.  That being said, I'm not committing to do
>> so.  (We could add it to an "straightforward projects for aspiring git
>> developers" list, if we had such a thing.)
>
> A "low-hanging fruit and/or janitorial work" stack may be worth
> having.

That would be good for not so versed developers, I think. Do we have a
place for listing janitor projects?

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-12-03 15:06 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-25 11:08 [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Michael Haggerty
2012-11-25 11:08 ` [PATCH 1/8] Add new function strbuf_add_xml_quoted() Michael Haggerty
2012-11-25 11:08 ` [PATCH 2/8] xml_entities(): use function strbuf_addstr_xml_quoted() Michael Haggerty
2012-11-25 11:08 ` [PATCH 3/8] lf_to_crlf(): NUL-terminate msg_data::data Michael Haggerty
2012-11-25 11:08 ` [PATCH 4/8] imap-send: store all_msgs as a strbuf Michael Haggerty
2012-11-25 11:08 ` [PATCH 5/8] imap-send: correctly report errors reading from stdin Michael Haggerty
2012-11-25 11:08 ` [PATCH 6/8] imap-send: change msg_data from storing (char *, len) to storing strbuf Michael Haggerty
2012-11-29 21:30   ` Junio C Hamano
2012-11-29 23:43     ` Jeff King
2012-11-30 13:36     ` Michael Haggerty
2012-12-02  1:48       ` Junio C Hamano
2012-12-02  6:03         ` Michael Haggerty
2012-12-03 15:06         ` Thiago Farina
2012-11-25 11:08 ` [PATCH 7/8] wrap_in_html(): use strbuf_addstr_xml_quoted() Michael Haggerty
2012-11-29 21:41 ` [PATCH 0/8] Add function strbuf_addstr_xml_quoted() and more Junio C Hamano
     [not found] ` <1353841721-16269-9-git-send-email-mhagger@alum.mit.edu>
     [not found]   ` <7v7gp4p00u.fsf@alter.siamese.dyndns.org>
2012-11-30 13:40     ` [PATCH 8/8] wrap_in_html(): process message in bulk rather than line-by-line Michael Haggerty
2012-12-02  9:25       ` Junio C Hamano
2012-12-02 10:35         ` Michael Haggerty

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).