git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v7 0/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875
@ 2018-06-02 21:27 Max Kirillov
  2018-06-02 21:27 ` [PATCH v7 1/2] " Max Kirillov
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Max Kirillov @ 2018-06-02 21:27 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine, Jeff King
  Cc: Max Kirillov, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

It's been time. Thank you for parience.

Changes:

* did most of the changes proposed
* rebase to newer master (latest conflicting change is addition of combined test helper)
* make tests which cover, hopefully, all cases.
* handle incorectly truncated input also in receive-pack. Considering the complications
  pointed out by Jeff, it just filters the input in the frontend process. I hope it
  is acceptable thing to do.

Max Kirillov (2):
  http-backend: respect CONTENT_LENGTH as specified by rfc3875
  http-backend: respect CONTENT_LENGTH for receive-pack

 Makefile                                |   1 +
 config.c                                |   2 +-
 config.h                                |   1 +
 http-backend.c                          |  86 +++++++++++--
 t/helper/test-print-larger-than-ssize.c |  11 ++
 t/helper/test-tool.c                    |   1 +
 t/helper/test-tool.h                    |   1 +
 t/t5560-http-backend-noserver.sh        |  13 ++
 t/t5562-http-backend-content-length.sh  | 155 ++++++++++++++++++++++++
 t/t5562/invoke-with-content-length.pl   |  30 +++++
 10 files changed, 291 insertions(+), 10 deletions(-)
 create mode 100644 t/helper/test-print-larger-than-ssize.c
 create mode 100755 t/t5562-http-backend-content-length.sh
 create mode 100755 t/t5562/invoke-with-content-length.pl

-- 
2.17.0.1185.g782057d875


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v7 1/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-06-02 21:27 [PATCH v7 0/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
@ 2018-06-02 21:27 ` Max Kirillov
  2018-06-04  3:44   ` Jeff King
  2018-06-02 21:27 ` [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
  2018-06-10 15:05 ` [PATCH v8 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  2 siblings, 1 reply; 31+ messages in thread
From: Max Kirillov @ 2018-06-02 21:27 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine, Jeff King
  Cc: Max Kirillov, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

http-backend reads whole input until EOF. However, the RFC 3875 specifies
that a script must read only as many bytes as specified by CONTENT_LENGTH
environment variable. Web server may exercise the specification by not closing
the script's standard input after writing content. In that case http-backend
would hang waiting for the input. The issue is known to happen with
IIS/Windows, for example.

Make http-backend read only CONTENT_LENGTH bytes, if it's defined, rather than
the whole input until EOF. If the variable is not defined, keep older behavior
of reading until EOF because it is used to support chunked transfer-encoding.

Signed-off-by: Florian Manschwetus <manschwetus@cs-software-gmbh.de>
[mk: fixed trivial build failures and polished style issues]
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
---
 config.c       |  2 +-
 config.h       |  1 +
 http-backend.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/config.c b/config.c
index c698988f5e..4148a3529d 100644
--- a/config.c
+++ b/config.c
@@ -853,7 +853,7 @@ int git_parse_ulong(const char *value, unsigned long *ret)
 	return 1;
 }
 
-static int git_parse_ssize_t(const char *value, ssize_t *ret)
+int git_parse_ssize_t(const char *value, ssize_t *ret)
 {
 	intmax_t tmp;
 	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
diff --git a/config.h b/config.h
index ef70a9cac1..c143a1b634 100644
--- a/config.h
+++ b/config.h
@@ -48,6 +48,7 @@ extern void git_config(config_fn_t fn, void *);
 extern int config_with_options(config_fn_t fn, void *,
 			       struct git_config_source *config_source,
 			       const struct config_options *opts);
+extern int git_parse_ssize_t(const char *, ssize_t *);
 extern int git_parse_ulong(const char *, unsigned long *);
 extern int git_parse_maybe_bool(const char *);
 extern int git_config_int(const char *, const char *);
diff --git a/http-backend.c b/http-backend.c
index 88d2a9bc40..3066697a24 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -283,7 +283,7 @@ static struct rpc_service *select_service(struct strbuf *hdr, const char *name)
  * hit max_request_buffer we die (we'd rather reject a
  * maliciously large request than chew up infinite memory).
  */
-static ssize_t read_request(int fd, unsigned char **out)
+static ssize_t read_request_eof(int fd, unsigned char **out)
 {
 	size_t len = 0, alloc = 8192;
 	unsigned char *buf = xmalloc(alloc);
@@ -320,6 +320,47 @@ static ssize_t read_request(int fd, unsigned char **out)
 	}
 }
 
+static ssize_t read_request_fixed_len(int fd, ssize_t req_len, unsigned char **out)
+{
+	unsigned char *buf = NULL;
+	ssize_t cnt = 0;
+
+	if (max_request_buffer < req_len) {
+		die("request was larger than our maximum size (%lu): "
+		    "%" PRIuMAX "; try setting GIT_HTTP_MAX_REQUEST_BUFFER",
+		    max_request_buffer, (uintmax_t)req_len);
+	}
+
+	buf = xmalloc(req_len);
+	cnt = read_in_full(fd, buf, req_len);
+	if (cnt < 0) {
+		free(buf);
+		return -1;
+	}
+	*out = buf;
+	return cnt;
+}
+
+static ssize_t get_content_length(void)
+{
+	ssize_t val = -1;
+	const char *str = getenv("CONTENT_LENGTH");
+
+	if (str && !git_parse_ssize_t(str, &val))
+		die("failed to parse CONTENT_LENGTH: %s", str);
+	return val;
+}
+
+static ssize_t read_request(int fd, unsigned char **out)
+{
+	ssize_t req_len = get_content_length();
+
+	if (req_len < 0)
+		return read_request_eof(fd, out);
+	else
+		return read_request_fixed_len(fd, req_len, out);
+}
+
 static void inflate_request(const char *prog_name, int out, int buffer_input)
 {
 	git_zstream stream;
-- 
2.17.0.1185.g782057d875


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-02 21:27 [PATCH v7 0/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  2018-06-02 21:27 ` [PATCH v7 1/2] " Max Kirillov
@ 2018-06-02 21:27 ` Max Kirillov
  2018-06-04  4:31   ` Junio C Hamano
  2018-06-04  4:44   ` Jeff King
  2018-06-10 15:05 ` [PATCH v8 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  2 siblings, 2 replies; 31+ messages in thread
From: Max Kirillov @ 2018-06-02 21:27 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine, Jeff King
  Cc: Max Kirillov, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

Push passes to another commands, as described in
https://public-inbox.org/git/20171129032214.GB32345@sigill.intra.peff.net/

As it gets complicated to correctly track the data length, instead transfer
the data through parent process and cut the pipe as the specified length is
reached. Do it only when CONTENT_LENGTH is set, otherwise pass the input
directly to the forked commands.

Add tests for cases:

* CONTENT_LENGTH is set, script's stdin has more data, with all combinations
  of variations: fetch or push, plain or compressed body, correct or truncated
  input.

* CONTENT_LENGTH is specified to a value which does not fit into ssize_t.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
---
 Makefile                                |   1 +
 http-backend.c                          |  49 ++++++--
 t/helper/test-print-larger-than-ssize.c |  11 ++
 t/helper/test-tool.c                    |   1 +
 t/helper/test-tool.h                    |   1 +
 t/t5560-http-backend-noserver.sh        |  13 ++
 t/t5562-http-backend-content-length.sh  | 155 ++++++++++++++++++++++++
 t/t5562/invoke-with-content-length.pl   |  30 +++++
 8 files changed, 250 insertions(+), 11 deletions(-)
 create mode 100644 t/helper/test-print-larger-than-ssize.c
 create mode 100755 t/t5562-http-backend-content-length.sh
 create mode 100755 t/t5562/invoke-with-content-length.pl

diff --git a/Makefile b/Makefile
index f181687250..93dc4bc23b 100644
--- a/Makefile
+++ b/Makefile
@@ -678,6 +678,7 @@ TEST_BUILTINS_OBJS += test-mergesort.o
 TEST_BUILTINS_OBJS += test-mktemp.o
 TEST_BUILTINS_OBJS += test-online-cpus.o
 TEST_BUILTINS_OBJS += test-path-utils.o
+TEST_BUILTINS_OBJS += test-print-larger-than-ssize.o
 TEST_BUILTINS_OBJS += test-prio-queue.o
 TEST_BUILTINS_OBJS += test-read-cache.o
 TEST_BUILTINS_OBJS += test-ref-store.o
diff --git a/http-backend.c b/http-backend.c
index 3066697a24..78a588c551 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -351,23 +351,22 @@ static ssize_t get_content_length(void)
 	return val;
 }
 
-static ssize_t read_request(int fd, unsigned char **out)
+static ssize_t read_request(int fd, unsigned char **out, ssize_t req_len)
 {
-	ssize_t req_len = get_content_length();
-
 	if (req_len < 0)
 		return read_request_eof(fd, out);
 	else
 		return read_request_fixed_len(fd, req_len, out);
 }
 
-static void inflate_request(const char *prog_name, int out, int buffer_input)
+static void inflate_request(const char *prog_name, int out, int buffer_input, ssize_t req_len)
 {
 	git_zstream stream;
 	unsigned char *full_request = NULL;
 	unsigned char in_buf[8192];
 	unsigned char out_buf[8192];
 	unsigned long cnt = 0;
+	ssize_t req_remaining_len = req_len;
 
 	memset(&stream, 0, sizeof(stream));
 	git_inflate_init_gzip_only(&stream);
@@ -379,11 +378,18 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
 			if (full_request)
 				n = 0; /* nothing left to read */
 			else
-				n = read_request(0, &full_request);
+				n = read_request(0, &full_request, req_len);
 			stream.next_in = full_request;
 		} else {
-			n = xread(0, in_buf, sizeof(in_buf));
+			ssize_t buffer_len;
+			if (req_remaining_len < 0 || req_remaining_len > sizeof(in_buf))
+			    buffer_len = sizeof(in_buf);
+			else
+			    buffer_len = req_remaining_len;
+			n = xread(0, in_buf, buffer_len);
 			stream.next_in = in_buf;
+			if (req_remaining_len >= 0)
+				req_remaining_len -= n;
 		}
 
 		if (n <= 0)
@@ -416,10 +422,10 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
 	free(full_request);
 }
 
-static void copy_request(const char *prog_name, int out)
+static void copy_request(const char *prog_name, int out, ssize_t req_len)
 {
 	unsigned char *buf;
-	ssize_t n = read_request(0, &buf);
+	ssize_t n = read_request(0, &buf, req_len);
 	if (n < 0)
 		die_errno("error reading request body");
 	if (write_in_full(out, buf, n) < 0)
@@ -428,6 +434,24 @@ static void copy_request(const char *prog_name, int out)
 	free(buf);
 }
 
+static void pipe_fixed_length(const char *prog_name, int out, size_t req_len)
+{
+	unsigned char buf[8192];
+	size_t remaining_len = req_len;
+
+	while (remaining_len > 0) {
+		size_t chunk_length = remaining_len > sizeof(buf) ? sizeof(buf) : remaining_len;
+		size_t n = xread(0, buf, chunk_length);
+		if (n < 0)
+			die_errno("Reading request failed");
+		if (write_in_full(out, buf, n) < 0)
+			die_errno("%s aborted reading request", prog_name);
+		remaining_len -= n;
+	}
+
+	close(out);
+}
+
 static void run_service(const char **argv, int buffer_input)
 {
 	const char *encoding = getenv("HTTP_CONTENT_ENCODING");
@@ -435,6 +459,7 @@ static void run_service(const char **argv, int buffer_input)
 	const char *host = getenv("REMOTE_ADDR");
 	int gzipped_request = 0;
 	struct child_process cld = CHILD_PROCESS_INIT;
+	ssize_t req_len = get_content_length();
 
 	if (encoding && !strcmp(encoding, "gzip"))
 		gzipped_request = 1;
@@ -453,7 +478,7 @@ static void run_service(const char **argv, int buffer_input)
 				 "GIT_COMMITTER_EMAIL=%s@http.%s", user, host);
 
 	cld.argv = argv;
-	if (buffer_input || gzipped_request)
+	if (buffer_input || gzipped_request || req_len >= 0)
 		cld.in = -1;
 	cld.git_cmd = 1;
 	if (start_command(&cld))
@@ -461,9 +486,11 @@ static void run_service(const char **argv, int buffer_input)
 
 	close(1);
 	if (gzipped_request)
-		inflate_request(argv[0], cld.in, buffer_input);
+		inflate_request(argv[0], cld.in, buffer_input, req_len);
 	else if (buffer_input)
-		copy_request(argv[0], cld.in);
+		copy_request(argv[0], cld.in, req_len);
+	else if (req_len >= 0)
+		pipe_fixed_length(argv[0], cld.in, req_len);
 	else
 		close(0);
 
diff --git a/t/helper/test-print-larger-than-ssize.c b/t/helper/test-print-larger-than-ssize.c
new file mode 100644
index 0000000000..83472a32f1
--- /dev/null
+++ b/t/helper/test-print-larger-than-ssize.c
@@ -0,0 +1,11 @@
+#include "test-tool.h"
+#include "cache.h"
+
+int cmd__print_larger_than_ssize(int argc, const char **argv)
+{
+	size_t large = ~0;
+
+	large = ~(large & ~(large >> 1)) + 1;
+	printf("%" PRIuMAX "\n", (uintmax_t) large);
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 87066ced62..edcfe5df63 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -25,6 +25,7 @@ static struct test_cmd cmds[] = {
 	{ "mktemp", cmd__mktemp },
 	{ "online-cpus", cmd__online_cpus },
 	{ "path-utils", cmd__path_utils },
+        { "print-larger-than-ssize", cmd__print_larger_than_ssize },
 	{ "prio-queue", cmd__prio_queue },
 	{ "read-cache", cmd__read_cache },
 	{ "ref-store", cmd__ref_store },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 7116ddfb94..c2aa0803d2 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -19,6 +19,7 @@ int cmd__mergesort(int argc, const char **argv);
 int cmd__mktemp(int argc, const char **argv);
 int cmd__online_cpus(int argc, const char **argv);
 int cmd__path_utils(int argc, const char **argv);
+int cmd__print_larger_than_ssize(int argc, const char **argv);
 int cmd__prio_queue(int argc, const char **argv);
 int cmd__read_cache(int argc, const char **argv);
 int cmd__ref_store(int argc, const char **argv);
diff --git a/t/t5560-http-backend-noserver.sh b/t/t5560-http-backend-noserver.sh
index 9fafcf1945..8c212393b4 100755
--- a/t/t5560-http-backend-noserver.sh
+++ b/t/t5560-http-backend-noserver.sh
@@ -71,4 +71,17 @@ test_expect_success 'http-backend blocks bad PATH_INFO' '
 	expect_aliased 1 //domain/data.txt
 '
 
+test_expect_success 'CONTENT_LENGTH overflow ssite_t' '
+	NOT_FIT_IN_SSIZE=$("$GIT_BUILD_DIR/t/helper/test-tool" print-larger-than-ssize) &&
+	env \
+		CONTENT_TYPE=application/x-git-upload-pack-request \
+		QUERY_STRING=/repo.git/git-upload-pack \
+		PATH_TRANSLATED="$PWD"/.git/git-upload-pack \
+		GIT_HTTP_EXPORT_ALL=TRUE \
+		REQUEST_METHOD=POST \
+		CONTENT_LENGTH="$NOT_FIT_IN_SSIZE" \
+		git http-backend </dev/zero >/dev/null 2>err &&
+	grep -q "fatal:.*CONTENT_LENGTH" err
+'
+
 test_done
diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh
new file mode 100755
index 0000000000..6b0c005db0
--- /dev/null
+++ b/t/t5562-http-backend-content-length.sh
@@ -0,0 +1,155 @@
+#!/bin/sh
+
+test_description='test git-http-backend respects CONTENT_LENGTH'
+. ./test-lib.sh
+
+verify_http_result() {
+	# sometimes there is fatal error buit the result is still 200
+	if grep 'fatal:' act.err
+	then
+		return 1
+	fi
+
+	if ! grep "Status" act.out >act
+	then
+		printf "Status: 200 OK\r\n" >act
+	fi
+	printf "Status: $1\r\n" >exp &&
+	test_cmp exp act
+}
+
+test_http_env() {
+	handler_type="$1"
+	shift
+	env \
+		CONTENT_TYPE="application/x-git-$handler_type-pack-request" \
+		QUERY_STRING="/repo.git/git-$handler_type-pack" \
+		PATH_TRANSLATED="$PWD/.git/git-$handler_type-pack" \
+		GIT_HTTP_EXPORT_ALL=TRUE \
+		REQUEST_METHOD=POST \
+		"$@"
+}
+
+test_expect_success 'setup repository' '
+	test_commit c0 &&
+	test_commit c1
+'
+
+hash_head=$(git rev-parse HEAD)
+hash_prev=$(git rev-parse HEAD~1)
+
+cat >fetch_body <<EOF
+0032want $hash_head
+00000032have $hash_prev
+0009done
+EOF
+
+gzip -k fetch_body
+
+head -c -10 <fetch_body.gz >fetch_body.gz.trunc
+
+head -c -10 <fetch_body >fetch_body.trunc
+
+hash_next=$(git commit-tree -p HEAD -m next HEAD^{tree})
+printf '00790000000000000000000000000000000000000000 %s refs/heads/newbranch\0report-status\n0000' "$hash_next" >push_body
+echo "$hash_next" | git pack-objects --stdout >>push_body
+
+gzip -k push_body
+
+head -c -10 <push_body.gz >push_body.gz.trunc
+
+head -c -10 <push_body >push_body.trunc
+
+touch empty_body
+
+test_expect_success 'fetch plain' '
+	test_http_env upload \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body git http-backend >act.out 2>act.err &&
+	verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch plain truncated' '
+	test_http_env upload \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.trunc git http-backend >act.out 2>act.err &&
+	test_must_fail verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch plain empty' '
+	test_http_env upload \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
+	test_must_fail verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch gzipped' '
+	test_http_env upload \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.gz git http-backend >act.out 2>act.err &&
+	verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch gzipped truncated' '
+	test_http_env upload \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.gz.trunc git http-backend >act.out 2>act.err &&
+	test_must_fail verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch gzipped empty' '
+	test_http_env upload \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
+	test_must_fail verify_http_result "200 OK"
+'
+
+test_expect_success 'push plain' '
+	git config http.receivepack true &&
+	test_http_env receive \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body git http-backend >act.out 2>act.err &&
+	verify_http_result "200 OK" &&
+	git rev-parse newbranch >act.head &&
+	echo "$hash_next" >exp.head &&
+	test_cmp act.head exp.head &&
+	git branch -D newbranch
+'
+
+
+test_expect_success 'push plain truncated' '
+	git config http.receivepack true &&
+	test_http_env receive \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.trunc git http-backend >act.out 2>act.err &&
+	test_must_fail verify_http_result "200 OK"
+'
+
+test_expect_success 'push plain empty' '
+	git config http.receivepack true &&
+	test_http_env receive \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
+	test_must_fail verify_http_result "200 OK"
+'
+
+test_expect_success 'push gzipped' '
+	test_http_env receive \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.gz git http-backend >act.out 2>act.err &&
+	verify_http_result "200 OK" &&
+	git rev-parse newbranch >act.head &&
+	echo "$hash_next" >exp.head &&
+	test_cmp act.head exp.head &&
+	git branch -D newbranch
+'
+
+test_expect_success 'push gzipped truncated' '
+	test_http_env receive \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.gz.trunc git http-backend >act.out 2>act.err &&
+	test_must_fail verify_http_result "200 OK"
+'
+
+test_expect_success 'push gzipped empty' '
+	test_http_env receive \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
+	test_must_fail verify_http_result "200 OK"
+'
+
+test_done
diff --git a/t/t5562/invoke-with-content-length.pl b/t/t5562/invoke-with-content-length.pl
new file mode 100755
index 0000000000..7f84242e77
--- /dev/null
+++ b/t/t5562/invoke-with-content-length.pl
@@ -0,0 +1,30 @@
+#!/usr/bin/perl
+use 5.008;
+use strict;
+use warnings;
+
+my $body_filename = $ARGV[0];
+my @command = @ARGV[1 .. $#ARGV];
+
+# read data
+my $body_size = -s $body_filename;
+$ENV{"CONTENT_LENGTH"} = $body_size;
+open(my $body_fh, "<", $body_filename) or die "Cannot open $body_filename: $!";
+my $body_data;
+defined read($body_fh, $body_data, $body_size) or die "Cannot read $body_filename: $!";
+close($body_fh);
+
+my $exited = 0;
+$SIG{"CHLD"} = sub {
+        $exited = 1;
+};
+
+# write data
+my $pid = open(my $out, "|-", @command);
+defined syswrite($out, $body_data) or die "Cannot write data: $!";
+
+sleep 1; # is interrupted by SIGCHLD
+if (!$exited) {
+        close($out);
+        die "Command did not exit after reading whole body";
+}
-- 
2.17.0.1185.g782057d875


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 1/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-06-02 21:27 ` [PATCH v7 1/2] " Max Kirillov
@ 2018-06-04  3:44   ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2018-06-04  3:44 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Junio C Hamano, Eric Sunshine, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

On Sun, Jun 03, 2018 at 12:27:48AM +0300, Max Kirillov wrote:

> http-backend reads whole input until EOF. However, the RFC 3875 specifies
> that a script must read only as many bytes as specified by CONTENT_LENGTH
> environment variable. Web server may exercise the specification by not closing
> the script's standard input after writing content. In that case http-backend
> would hang waiting for the input. The issue is known to happen with
> IIS/Windows, for example.
> 
> Make http-backend read only CONTENT_LENGTH bytes, if it's defined, rather than
> the whole input until EOF. If the variable is not defined, keep older behavior
> of reading until EOF because it is used to support chunked transfer-encoding.
> 
> Signed-off-by: Florian Manschwetus <manschwetus@cs-software-gmbh.de>
> [mk: fixed trivial build failures and polished style issues]
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Max Kirillov <max@max630.net>
> ---
>  config.c       |  2 +-
>  config.h       |  1 +
>  http-backend.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 44 insertions(+), 2 deletions(-)

This first patch looks good to me, though it may be worth mentioning in
the commit message that we're only handling the buffered-input side here
(that is obvious to anybody reading this whole series now, but it may
help out people digging in the history later).

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-02 21:27 ` [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
@ 2018-06-04  4:31   ` Junio C Hamano
  2018-06-04 17:06     ` Max Kirillov
  2018-06-04  4:44   ` Jeff King
  1 sibling, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2018-06-04  4:31 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Eric Sunshine, Jeff King, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

Max Kirillov <max@max630.net> writes:

> +static void pipe_fixed_length(const char *prog_name, int out, size_t req_len)
> +{
> +	unsigned char buf[8192];
> +	size_t remaining_len = req_len;
> +
> +	while (remaining_len > 0) {
> +		size_t chunk_length = remaining_len > sizeof(buf) ? sizeof(buf) : remaining_len;
> +		size_t n = xread(0, buf, chunk_length);
> +		if (n < 0)
> +			die_errno("Reading request failed");

n that is of type size_t is unsigned and cannot be negative here.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-02 21:27 ` [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
  2018-06-04  4:31   ` Junio C Hamano
@ 2018-06-04  4:44   ` Jeff King
  2018-06-04 22:18     ` Max Kirillov
  2018-06-10 15:07     ` Max Kirillov
  1 sibling, 2 replies; 31+ messages in thread
From: Jeff King @ 2018-06-04  4:44 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Junio C Hamano, Eric Sunshine, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

On Sun, Jun 03, 2018 at 12:27:49AM +0300, Max Kirillov wrote:

> Push passes to another commands, as described in
> https://public-inbox.org/git/20171129032214.GB32345@sigill.intra.peff.net/
> 
> As it gets complicated to correctly track the data length, instead transfer
> the data through parent process and cut the pipe as the specified length is
> reached. Do it only when CONTENT_LENGTH is set, otherwise pass the input
> directly to the forked commands.

I think this approach is reasonable. It's basically converting the
known-length case to a read-to-eof case for the sub-program, which
should paper over any problems of this type. And it's what we really
_want_ the web server to be doing in the first place.

Since this is slightly less efficient, and because it only matters if
the web server does not already close the pipe, should this have a
run-time configuration knob, even if it defaults to
safe-but-slightly-slower?

I admit I don't overly care that much myself (the only large-scale Git
server deployment I am personally familiar with does not use
git-http-backend at all), but it might be nice to leave an escape hatch.

There are a few things in the patch worth fixing, but overall I think it
looks like a pretty good direction. Comments inline.

> diff --git a/http-backend.c b/http-backend.c
> index 3066697a24..78a588c551 100644
> --- a/http-backend.c
> +++ b/http-backend.c
> @@ -351,23 +351,22 @@ static ssize_t get_content_length(void)
>  	return val;
>  }
>  
> -static ssize_t read_request(int fd, unsigned char **out)
> +static ssize_t read_request(int fd, unsigned char **out, ssize_t req_len)
>  {
> -	ssize_t req_len = get_content_length();
> -
>  	if (req_len < 0)
>  		return read_request_eof(fd, out);
>  	else
>  		return read_request_fixed_len(fd, req_len, out);
>  }

Minor nit, but it might have been nice to build in this infrastructure
in the first patch, rather than refactoring it here. It would also make
it much more obvious that the first one is not handling some cases,
since we'd have "req_len" but not pass it to all of the code paths. ;)

> @@ -379,11 +378,18 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
>  			if (full_request)
>  				n = 0; /* nothing left to read */
>  			else
> -				n = read_request(0, &full_request);
> +				n = read_request(0, &full_request, req_len);
>  			stream.next_in = full_request;
>  		} else {
> -			n = xread(0, in_buf, sizeof(in_buf));
> +			ssize_t buffer_len;
> +			if (req_remaining_len < 0 || req_remaining_len > sizeof(in_buf))
> +			    buffer_len = sizeof(in_buf);
> +			else
> +			    buffer_len = req_remaining_len;
> +			n = xread(0, in_buf, buffer_len);
>  			stream.next_in = in_buf;
> +			if (req_remaining_len >= 0)
> +				req_remaining_len -= n;
>  		}

What happens here if xread() returns an error? We probably don't want to
modify req_remaining_len (it probably doesn't matter since we'd report
the errot after this, but it feels funny not to check here).

> +static void pipe_fixed_length(const char *prog_name, int out, size_t req_len)
> +{
> +	unsigned char buf[8192];
> +	size_t remaining_len = req_len;
> +
> +	while (remaining_len > 0) {
> +		size_t chunk_length = remaining_len > sizeof(buf) ? sizeof(buf) : remaining_len;
> +		size_t n = xread(0, buf, chunk_length);
> +		if (n < 0)
> +			die_errno("Reading request failed");

I was going to complain that we usually start our error messages with a
lowercase, but this program seems to be an exception. So here you've
followed the local custom, which is OK.

> +		if (write_in_full(out, buf, n) < 0)
> +			die_errno("%s aborted reading request", prog_name);

We don't necessarily know why the write failed. If it's EPIPE, then yes,
the program probably did abort. But all we know is that write() failed.
We should probably say something more generic like:

  die_errno("unable to write to '%s'");

or similar.

> diff --git a/t/helper/test-print-larger-than-ssize.c b/t/helper/test-print-larger-than-ssize.c
> new file mode 100644
> index 0000000000..83472a32f1
> --- /dev/null
> +++ b/t/helper/test-print-larger-than-ssize.c
> @@ -0,0 +1,11 @@
> +#include "test-tool.h"
> +#include "cache.h"
> +
> +int cmd__print_larger_than_ssize(int argc, const char **argv)
> +{
> +	size_t large = ~0;
> +
> +	large = ~(large & ~(large >> 1)) + 1;
> +	printf("%" PRIuMAX "\n", (uintmax_t) large);
> +	return 0;
> +}

I think this might be nicer as part of "git version --build-options".
Either as a byte-size as I showed in [1], or directly showing this
value.

[1] https://public-inbox.org/git/20171129032632.GC32345@sigill.intra.peff.net/

> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index 87066ced62..edcfe5df63 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -25,6 +25,7 @@ static struct test_cmd cmds[] = {
>  	{ "mktemp", cmd__mktemp },
>  	{ "online-cpus", cmd__online_cpus },
>  	{ "path-utils", cmd__path_utils },
> +        { "print-larger-than-ssize", cmd__print_larger_than_ssize },

Indent with spaces?

> diff --git a/t/t5560-http-backend-noserver.sh b/t/t5560-http-backend-noserver.sh
> index 9fafcf1945..8c212393b4 100755
> --- a/t/t5560-http-backend-noserver.sh
> +++ b/t/t5560-http-backend-noserver.sh
> @@ -71,4 +71,17 @@ test_expect_success 'http-backend blocks bad PATH_INFO' '
>  	expect_aliased 1 //domain/data.txt
>  '
>  
> +test_expect_success 'CONTENT_LENGTH overflow ssite_t' '
> +	NOT_FIT_IN_SSIZE=$("$GIT_BUILD_DIR/t/helper/test-tool" print-larger-than-ssize) &&

We put helpers in the PATH, so this could just be "test-tool
print-larger-than-ssize" (though I still prefer the --build-options
variant).

> +	env \
> +		CONTENT_TYPE=application/x-git-upload-pack-request \
> +		QUERY_STRING=/repo.git/git-upload-pack \
> +		PATH_TRANSLATED="$PWD"/.git/git-upload-pack \
> +		GIT_HTTP_EXPORT_ALL=TRUE \
> +		REQUEST_METHOD=POST \
> +		CONTENT_LENGTH="$NOT_FIT_IN_SSIZE" \
> +		git http-backend </dev/zero >/dev/null 2>err &&
> +	grep -q "fatal:.*CONTENT_LENGTH" err

I'm not sure if these messages should be marked for translation. If so,
you'd want test_i18ngrep here.

We also generally avoid "-q" to grep. If the script is in non-verbose
mode it will go to /dev/null anyway, and in verbose mode it's useful to
see (possibly ditto for the /dev/null redirection of stdout above, but I
think that might actually spew a binary packfile if the test fails,
which we'd rather avoid).

> diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh
> new file mode 100755
> index 0000000000..6b0c005db0
> --- /dev/null
> +++ b/t/t5562-http-backend-content-length.sh
> @@ -0,0 +1,155 @@
> +#!/bin/sh
> +
> +test_description='test git-http-backend respects CONTENT_LENGTH'
> +. ./test-lib.sh

Why is the too-large CONTENT_LENGTH test in another file? I'd have
thought it would go well here, based on the description.

> +verify_http_result() {
> +	# sometimes there is fatal error buit the result is still 200
> +	if grep 'fatal:' act.err
> +	then
> +		return 1
> +	fi
> +
> +	if ! grep "Status" act.out >act
> +	then
> +		printf "Status: 200 OK\r\n" >act
> +	fi
> +	printf "Status: $1\r\n" >exp &&
> +	test_cmp exp act
> +}

200 with a fatal error sounds non-ideal. But I think it's unavoidable in
some cases where we see write failures, etc.

> +test_http_env() {
> +	handler_type="$1"
> +	shift
> +	env \
> +		CONTENT_TYPE="application/x-git-$handler_type-pack-request" \
> +		QUERY_STRING="/repo.git/git-$handler_type-pack" \
> +		PATH_TRANSLATED="$PWD/.git/git-$handler_type-pack" \
> +		GIT_HTTP_EXPORT_ALL=TRUE \
> +		REQUEST_METHOD=POST \
> +		"$@"
> +}

I think this env (and the earlier one) are not strictly necessary, as
you could just use shell one-shot variables. But I'm OK with them as an
abundance of caution, since in theory a caller could use a shell
function rather than a real command here (in which case one-shot
variables do the wrong thing).

> +test_expect_success 'setup repository' '
> +	test_commit c0 &&
> +	test_commit c1
> +'
> +
> +hash_head=$(git rev-parse HEAD)
> +hash_prev=$(git rev-parse HEAD~1)

We generally prefer to have all commands, even ones we don't expect to
fail, inside test_expect blocks (e.g., with a "setup" description).

> +cat >fetch_body <<EOF
> +0032want $hash_head
> +00000032have $hash_prev
> +0009done
> +EOF

This depends on the size of the hash. That's always 40 for now, but is
something that may change soon.

We already have a packetize() helper; could we use it here?

(Looking at the definition of that helper, it's actually kind of
expensive in terms of number of processes. We could perhaps convert it
to perl and do it all in a single process, but that's orthogonal to your
series).

> +gzip -k fetch_body

We don't unconditionally rely on gzip elsewhere. The test blocks using
it (and the ones that depend on them) should be marked with the GZIP
prerequisite.

> +head -c -10 <fetch_body.gz >fetch_body.gz.trunc
> +
> +head -c -10 <fetch_body >fetch_body.trunc

We can into portability problems with "head -c", but I think they were
mostly with different buffering behavior (i.e., reading more than 10
bytes). And that would be OK in this setting, since nobody is going to
read the rest of the input after us.

So it's probably OK, but we could use "test_copy_bytes 10" here if it
isn't.

> +touch empty_body

We usually prefer ">empty_body" to using "touch".

> +test_expect_success 'fetch plain truncated' '
> +	test_http_env upload \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.trunc git http-backend >act.out 2>act.err &&
> +	test_must_fail verify_http_result "200 OK"
> +'

Usually test_must_fail on a checking function like this is a sign that
the check is not as robust as we'd like. If the function checks two
things "A && B", then checking test_must_fail will only let us know
"!A || !B", but you probably want to check both.

The usual solution is for verify_http_result to take an optional "!" in
the first parameter and invert its sense. Or to just split it into two
separate functions.

(We'd also generally not use test_must_fail with a non-git command, and
just use a simple "! verify_http_result"; that would apply equally if
gets split into two commands).

> +test_expect_success 'push plain' '
> +	git config http.receivepack true &&

This will persist after the test finishes. Try:

  test_config http.receivepack true

which will clean up after the test finishes. Alternatively, since I
think you'd want this whole script to run with http.receivepack set,
this could be part of the repository setup in the earlier steps (and
then _don't_ use test_config, because the whole point is for it to
persist).

> +	test_http_env receive \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body git http-backend >act.out 2>act.err &&
> +	verify_http_result "200 OK" &&
> +	git rev-parse newbranch >act.head &&
> +	echo "$hash_next" >exp.head &&
> +	test_cmp act.head exp.head &&
> +	git branch -D newbranch
> +'

Should this "git branch -D" go into a "test_when_finished" block closer
to when it is created?

> +# write data
> +my $pid = open(my $out, "|-", @command);
> +defined syswrite($out, $body_data) or die "Cannot write data: $!";

I assume perl's syswrite() has the usual write() pitfalls, like
sometimes returning without writing all of the bytes. Could this just
be:

  print $out $body_date;

?

> +sleep 1; # is interrupted by SIGCHLD
> +if (!$exited) {
> +        close($out);
> +        die "Command did not exit after reading whole body";
> +}

A sleep like this is a recipe for having the test fail when the system
is under heavy load and it takes the sub-process more than a second to
return (and the SIGCHLD to get delivered).

Normally I'd suggest wait() or pause(), but I think the intent is to
sleep because in the failure case we'd never see the signal, and just
hang? If so, then perhaps we should give a much higher sleep, like 60
seconds. That will mean the test eventually does report failure, but
should be much less likely to cause a false negative. And if we do get
the signal (which we'd usually expect), then we exit immediately.

Also, do we need to protect ourselves against other signals being
delivered? E.g., if I resize my xterm and this process gets SIGWINCH, is
it going to erroneously end the sleep and say "nope, no exited signal"?


My read through the tests was mostly looking for mechanical problems. I
didn't give much though to whether we were getting full coverage, and
now it's my bed-time here. So I'll leave that for later (or somebody
else).

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-04  4:31   ` Junio C Hamano
@ 2018-06-04 17:06     ` Max Kirillov
  2018-06-05  2:30       ` Ramsay Jones
  0 siblings, 1 reply; 31+ messages in thread
From: Max Kirillov @ 2018-06-04 17:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Max Kirillov, Eric Sunshine, Jeff King, Florian Manschwetus,
	Chris Packham, Konstantin Khomoutov, git@vger.kernel.org

On Mon, Jun 04, 2018 at 01:31:58PM +0900, Junio C Hamano wrote:
> Max Kirillov <max@max630.net> writes:
>> +		size_t n = xread(0, buf, chunk_length);
>> +		if (n < 0)
>> +			die_errno("Reading request failed");
> 
> n that is of type size_t is unsigned and cannot be negative here.

Thanks, fixing it
Do you think sanity check for n <= chunk_length (the code
will go mad in this case) is needed? As far as I can see n
returns straight from system's read()

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-04  4:44   ` Jeff King
@ 2018-06-04 22:18     ` Max Kirillov
  2018-06-10 15:06       ` Max Kirillov
  2018-06-11  9:18       ` Jeff King
  2018-06-10 15:07     ` Max Kirillov
  1 sibling, 2 replies; 31+ messages in thread
From: Max Kirillov @ 2018-06-04 22:18 UTC (permalink / raw)
  To: Jeff King
  Cc: Max Kirillov, Junio C Hamano, Eric Sunshine, Florian Manschwetus,
	Chris Packham, Konstantin Khomoutov, git@vger.kernel.org

On Mon, Jun 04, 2018 at 12:44:09AM -0400, Jeff King wrote:

Thanks for the comments, I will do the things you proposed,
or try to and get back later if there are any issues. Some
notes below.

> On Sun, Jun 03, 2018 at 12:27:49AM +0300, Max Kirillov wrote:
> Since this is slightly less efficient, and because it only matters if
> the web server does not already close the pipe, should this have a
> run-time configuration knob, even if it defaults to
> safe-but-slightly-slower?

Personally, I of course don't want this. Also, I don't think
the difference is much noticeable. But you can never be sure
without trying. I'll try to measure some numbers.

>> +		if (write_in_full(out, buf, n) < 0)
>> +			die_errno("%s aborted reading request", prog_name);
> 
> We don't necessarily know why the write failed. If it's EPIPE, then yes,
> the program probably did abort. But all we know is that write() failed.
> We should probably say something more generic like:
> 
>   die_errno("unable to write to '%s'");
> 
> or similar.

Actually, it is already 3rd same error in this file. Maybe
deserve some refactoring. I will change the message also.

>> +test_expect_success 'setup repository' '
>> +	test_commit c0 &&
>> +	test_commit c1
>> +'
>> +
>> +hash_head=$(git rev-parse HEAD)
>> +hash_prev=$(git rev-parse HEAD~1)
> 
> We generally prefer to have all commands, even ones we don't expect to
> fail, inside test_expect blocks (e.g., with a "setup" description).

Will the defined variables get to the next test? I'll try to
do as you describe.

>> +cat >fetch_body <<EOF
>> +0032want $hash_head
>> +00000032have $hash_prev
>> +0009done
>> +EOF
> 
> This depends on the size of the hash. That's always 40 for now, but is
> something that may change soon.
> 
> We already have a packetize() helper; could we use it here?

Could you point me to it? I cannot find it.

My understanfing is that the current protocol assumes
40 symbols hash, so another hash length would be another
protocol, and since it's manually forged here it would
anyway has to be changeda.

>> +test_expect_success 'fetch plain truncated' '
>> +	test_http_env upload \
>> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.trunc git http-backend >act.out 2>act.err &&
>> +	test_must_fail verify_http_result "200 OK"
>> +'
> 
> Usually test_must_fail on a checking function like this is a sign that
> the check is not as robust as we'd like. If the function checks two
> things "A && B", then checking test_must_fail will only let us know
> "!A || !B", but you probably want to check both.

Well here I just want to know that the request has failed,
and we already know that it can fail in different ways,
but the test is not going to differentiate those ways.

> (We'd also generally not use test_must_fail with a non-git command, and
> just use a simple "! verify_http_result"; that would apply equally if
> gets split into two commands).

Will use ! there.

>> +sleep 1; # is interrupted by SIGCHLD
>> +if (!$exited) {
>> +        close($out);
>> +        die "Command did not exit after reading whole body";
>> +}

...

> Also, do we need to protect ourselves against other signals being
> delivered? E.g., if I resize my xterm and this process gets SIGWINCH, is
> it going to erroneously end the sleep and say "nope, no exited signal"?

I'll check, but what could I do? Should I add blocking other
signals there?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-04 17:06     ` Max Kirillov
@ 2018-06-05  2:30       ` Ramsay Jones
  0 siblings, 0 replies; 31+ messages in thread
From: Ramsay Jones @ 2018-06-05  2:30 UTC (permalink / raw)
  To: Max Kirillov, Junio C Hamano
  Cc: Eric Sunshine, Jeff King, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org



On 04/06/18 18:06, Max Kirillov wrote:
> On Mon, Jun 04, 2018 at 01:31:58PM +0900, Junio C Hamano wrote:
>> Max Kirillov <max@max630.net> writes:
>>> +		size_t n = xread(0, buf, chunk_length);
>>> +		if (n < 0)
>>> +			die_errno("Reading request failed");
>>
>> n that is of type size_t is unsigned and cannot be negative here.
> 

Hmm, xread() returns an ssize_t, which is a signed type ...

> Thanks, fixing it
> Do you think sanity check for n <= chunk_length (the code
> will go mad in this case) is needed? As far as I can see n
> returns straight from system's read()

ATB,
Ramsay Jones


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v8 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-06-02 21:27 [PATCH v7 0/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  2018-06-02 21:27 ` [PATCH v7 1/2] " Max Kirillov
  2018-06-02 21:27 ` [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
@ 2018-06-10 15:05 ` Max Kirillov
  2018-06-10 15:05   ` [PATCH v8 1/3] http-backend: cleanup writing to child process Max Kirillov
                     ` (3 more replies)
  2 siblings, 4 replies; 31+ messages in thread
From: Max Kirillov @ 2018-06-10 15:05 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine, Jeff King
  Cc: Max Kirillov, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

Did the fixes proposed for v7

Max Kirillov (3):
  http-backend: cleanup writing to child process
  http-backend: respect CONTENT_LENGTH as specified by rfc3875
  http-backend: respect CONTENT_LENGTH for receive-pack

 config.c                               |   2 +-
 config.h                               |   1 +
 help.c                                 |   1 +
 http-backend.c                         | 100 +++++++++++++--
 t/t5562-http-backend-content-length.sh | 169 +++++++++++++++++++++++++
 t/t5562/invoke-with-content-length.pl  |  37 ++++++
 6 files changed, 295 insertions(+), 15 deletions(-)
 create mode 100755 t/t5562-http-backend-content-length.sh
 create mode 100755 t/t5562/invoke-with-content-length.pl

-- 
2.17.0.1185.g782057d875


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v8 1/3] http-backend: cleanup writing to child process
  2018-06-10 15:05 ` [PATCH v8 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
@ 2018-06-10 15:05   ` Max Kirillov
  2018-06-10 15:05   ` [PATCH v8 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Max Kirillov @ 2018-06-10 15:05 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine, Jeff King
  Cc: Max Kirillov, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

As explained in [1], we should not assume the reason why the writing has
failed, and even if the reason is that child has existed not the reason
why it have done so. So instead just say that writing has failed.

[1] https://public-inbox.org/git/20180604044408.GD14451@sigill.intra.peff.net/

Signed-off-by: Max Kirillov <max@max630.net>
---
 http-backend.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index 88d2a9bc40..206dc28e07 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -278,6 +278,12 @@ static struct rpc_service *select_service(struct strbuf *hdr, const char *name)
 	return svc;
 }
 
+static void write_to_child(int out, const unsigned char *buf, ssize_t len, const char *prog_name)
+{
+	if (write_in_full(out, buf, len) < 0)
+		die("unable to write to '%s'", prog_name);
+}
+
 /*
  * This is basically strbuf_read(), except that if we
  * hit max_request_buffer we die (we'd rather reject a
@@ -360,9 +366,8 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
 				die("zlib error inflating request, result %d", ret);
 
 			n = stream.total_out - cnt;
-			if (write_in_full(out, out_buf, n) < 0)
-				die("%s aborted reading request", prog_name);
-			cnt += n;
+			write_to_child(out, out_buf, stream.total_out - cnt, prog_name);
+			cnt = stream.total_out;
 
 			if (ret == Z_STREAM_END)
 				goto done;
@@ -381,8 +386,7 @@ static void copy_request(const char *prog_name, int out)
 	ssize_t n = read_request(0, &buf);
 	if (n < 0)
 		die_errno("error reading request body");
-	if (write_in_full(out, buf, n) < 0)
-		die("%s aborted reading request", prog_name);
+	write_to_child(out, buf, n, prog_name);
 	close(out);
 	free(buf);
 }
-- 
2.17.0.1185.g782057d875


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-06-10 15:05 ` [PATCH v8 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  2018-06-10 15:05   ` [PATCH v8 1/3] http-backend: cleanup writing to child process Max Kirillov
@ 2018-06-10 15:05   ` Max Kirillov
  2018-06-10 15:05   ` [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
  2018-07-27  3:48   ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  3 siblings, 0 replies; 31+ messages in thread
From: Max Kirillov @ 2018-06-10 15:05 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine, Jeff King
  Cc: Max Kirillov, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

http-backend reads whole input until EOF. However, the RFC 3875 specifies
that a script must read only as many bytes as specified by CONTENT_LENGTH
environment variable. Web server may exercise the specification by not closing
the script's standard input after writing content. In that case http-backend
would hang waiting for the input. The issue is known to happen with
IIS/Windows, for example.

Make http-backend read only CONTENT_LENGTH bytes, if it's defined, rather than
the whole input until EOF. If the variable is not defined, keep older behavior
of reading until EOF because it is used to support chunked transfer-encoding.

This commit only fixes buffered input, whcih reads whole body before
processign it. Non-buffered input is going to be fixed in subsequent commit.

Signed-off-by: Florian Manschwetus <manschwetus@cs-software-gmbh.de>
[mk: fixed trivial build failures and polished style issues]
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
---
 config.c       |  2 +-
 config.h       |  1 +
 http-backend.c | 54 +++++++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/config.c b/config.c
index c698988f5e..4148a3529d 100644
--- a/config.c
+++ b/config.c
@@ -853,7 +853,7 @@ int git_parse_ulong(const char *value, unsigned long *ret)
 	return 1;
 }
 
-static int git_parse_ssize_t(const char *value, ssize_t *ret)
+int git_parse_ssize_t(const char *value, ssize_t *ret)
 {
 	intmax_t tmp;
 	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
diff --git a/config.h b/config.h
index ef70a9cac1..c143a1b634 100644
--- a/config.h
+++ b/config.h
@@ -48,6 +48,7 @@ extern void git_config(config_fn_t fn, void *);
 extern int config_with_options(config_fn_t fn, void *,
 			       struct git_config_source *config_source,
 			       const struct config_options *opts);
+extern int git_parse_ssize_t(const char *, ssize_t *);
 extern int git_parse_ulong(const char *, unsigned long *);
 extern int git_parse_maybe_bool(const char *);
 extern int git_config_int(const char *, const char *);
diff --git a/http-backend.c b/http-backend.c
index 206dc28e07..0c9e9be2b7 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -289,7 +289,7 @@ static void write_to_child(int out, const unsigned char *buf, ssize_t len, const
  * hit max_request_buffer we die (we'd rather reject a
  * maliciously large request than chew up infinite memory).
  */
-static ssize_t read_request(int fd, unsigned char **out)
+static ssize_t read_request_eof(int fd, unsigned char **out)
 {
 	size_t len = 0, alloc = 8192;
 	unsigned char *buf = xmalloc(alloc);
@@ -326,7 +326,46 @@ static ssize_t read_request(int fd, unsigned char **out)
 	}
 }
 
-static void inflate_request(const char *prog_name, int out, int buffer_input)
+static ssize_t read_request_fixed_len(int fd, ssize_t req_len, unsigned char **out)
+{
+	unsigned char *buf = NULL;
+	ssize_t cnt = 0;
+
+	if (max_request_buffer < req_len) {
+		die("request was larger than our maximum size (%lu): "
+		    "%" PRIuMAX "; try setting GIT_HTTP_MAX_REQUEST_BUFFER",
+		    max_request_buffer, (uintmax_t)req_len);
+	}
+
+	buf = xmalloc(req_len);
+	cnt = read_in_full(fd, buf, req_len);
+	if (cnt < 0) {
+		free(buf);
+		return -1;
+	}
+	*out = buf;
+	return cnt;
+}
+
+static ssize_t get_content_length(void)
+{
+	ssize_t val = -1;
+	const char *str = getenv("CONTENT_LENGTH");
+
+	if (str && !git_parse_ssize_t(str, &val))
+		die("failed to parse CONTENT_LENGTH: %s", str);
+	return val;
+}
+
+static ssize_t read_request(int fd, unsigned char **out, ssize_t req_len)
+{
+	if (req_len < 0)
+		return read_request_eof(fd, out);
+	else
+		return read_request_fixed_len(fd, req_len, out);
+}
+
+static void inflate_request(const char *prog_name, int out, int buffer_input, ssize_t req_len)
 {
 	git_zstream stream;
 	unsigned char *full_request = NULL;
@@ -344,7 +383,7 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
 			if (full_request)
 				n = 0; /* nothing left to read */
 			else
-				n = read_request(0, &full_request);
+				n = read_request(0, &full_request, req_len);
 			stream.next_in = full_request;
 		} else {
 			n = xread(0, in_buf, sizeof(in_buf));
@@ -380,10 +419,10 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
 	free(full_request);
 }
 
-static void copy_request(const char *prog_name, int out)
+static void copy_request(const char *prog_name, int out, ssize_t req_len)
 {
 	unsigned char *buf;
-	ssize_t n = read_request(0, &buf);
+	ssize_t n = read_request(0, &buf, req_len);
 	if (n < 0)
 		die_errno("error reading request body");
 	write_to_child(out, buf, n, prog_name);
@@ -398,6 +437,7 @@ static void run_service(const char **argv, int buffer_input)
 	const char *host = getenv("REMOTE_ADDR");
 	int gzipped_request = 0;
 	struct child_process cld = CHILD_PROCESS_INIT;
+	ssize_t req_len = get_content_length();
 
 	if (encoding && !strcmp(encoding, "gzip"))
 		gzipped_request = 1;
@@ -424,9 +464,9 @@ static void run_service(const char **argv, int buffer_input)
 
 	close(1);
 	if (gzipped_request)
-		inflate_request(argv[0], cld.in, buffer_input);
+		inflate_request(argv[0], cld.in, buffer_input, req_len);
 	else if (buffer_input)
-		copy_request(argv[0], cld.in);
+		copy_request(argv[0], cld.in, req_len);
 	else
 		close(0);
 
-- 
2.17.0.1185.g782057d875


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-10 15:05 ` [PATCH v8 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  2018-06-10 15:05   ` [PATCH v8 1/3] http-backend: cleanup writing to child process Max Kirillov
  2018-06-10 15:05   ` [PATCH v8 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
@ 2018-06-10 15:05   ` Max Kirillov
  2018-07-25 12:14     ` SZEDER Gábor
  2018-07-27  3:48   ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  3 siblings, 1 reply; 31+ messages in thread
From: Max Kirillov @ 2018-06-10 15:05 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine, Jeff King
  Cc: Max Kirillov, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

Push passes to another commands, as described in
https://public-inbox.org/git/20171129032214.GB32345@sigill.intra.peff.net/

As it gets complicated to correctly track the data length, instead transfer
the data through parent process and cut the pipe as the specified length is
reached. Do it only when CONTENT_LENGTH is set, otherwise pass the input
directly to the forked commands.

Add tests for cases:

* CONTENT_LENGTH is set, script's stdin has more data, with all combinations
  of variations: fetch or push, plain or compressed body, correct or truncated
  input.

* CONTENT_LENGTH is specified to a value which does not fit into ssize_t.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
---
 help.c                                 |   1 +
 http-backend.c                         |  32 ++++-
 t/t5562-http-backend-content-length.sh | 169 +++++++++++++++++++++++++
 t/t5562/invoke-with-content-length.pl  |  37 ++++++
 4 files changed, 237 insertions(+), 2 deletions(-)
 create mode 100755 t/t5562-http-backend-content-length.sh
 create mode 100755 t/t5562/invoke-with-content-length.pl

diff --git a/help.c b/help.c
index 60071a9bea..42600ca026 100644
--- a/help.c
+++ b/help.c
@@ -419,6 +419,7 @@ int cmd_version(int argc, const char **argv, const char *prefix)
 		else
 			printf("no commit associated with this build\n");
 		printf("sizeof-long: %d\n", (int)sizeof(long));
+		printf("sizeof-size_t: %d\n", (int)sizeof(size_t));
 		/* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
 	}
 	return 0;
diff --git a/http-backend.c b/http-backend.c
index 0c9e9be2b7..28c07e7c2a 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -372,6 +372,8 @@ static void inflate_request(const char *prog_name, int out, int buffer_input, ss
 	unsigned char in_buf[8192];
 	unsigned char out_buf[8192];
 	unsigned long cnt = 0;
+	int req_len_defined = req_len >= 0;
+	size_t req_remaining_len = req_len;
 
 	memset(&stream, 0, sizeof(stream));
 	git_inflate_init_gzip_only(&stream);
@@ -386,8 +388,15 @@ static void inflate_request(const char *prog_name, int out, int buffer_input, ss
 				n = read_request(0, &full_request, req_len);
 			stream.next_in = full_request;
 		} else {
-			n = xread(0, in_buf, sizeof(in_buf));
+			ssize_t buffer_len;
+			if (req_len_defined && req_remaining_len <= sizeof(in_buf))
+				buffer_len = req_remaining_len;
+			else
+				buffer_len = sizeof(in_buf);
+			n = xread(0, in_buf, buffer_len);
 			stream.next_in = in_buf;
+			if (req_len_defined && n > 0)
+				req_remaining_len -= n;
 		}
 
 		if (n <= 0)
@@ -430,6 +439,23 @@ static void copy_request(const char *prog_name, int out, ssize_t req_len)
 	free(buf);
 }
 
+static void pipe_fixed_length(const char *prog_name, int out, size_t req_len)
+{
+	unsigned char buf[8192];
+	size_t remaining_len = req_len;
+
+	while (remaining_len > 0) {
+		size_t chunk_length = remaining_len > sizeof(buf) ? sizeof(buf) : remaining_len;
+		ssize_t n = xread(0, buf, chunk_length);
+		if (n < 0)
+			die_errno("Reading request failed");
+		write_to_child(out, buf, n, prog_name);
+		remaining_len -= n;
+	}
+
+	close(out);
+}
+
 static void run_service(const char **argv, int buffer_input)
 {
 	const char *encoding = getenv("HTTP_CONTENT_ENCODING");
@@ -456,7 +482,7 @@ static void run_service(const char **argv, int buffer_input)
 				 "GIT_COMMITTER_EMAIL=%s@http.%s", user, host);
 
 	cld.argv = argv;
-	if (buffer_input || gzipped_request)
+	if (buffer_input || gzipped_request || req_len >= 0)
 		cld.in = -1;
 	cld.git_cmd = 1;
 	if (start_command(&cld))
@@ -467,6 +493,8 @@ static void run_service(const char **argv, int buffer_input)
 		inflate_request(argv[0], cld.in, buffer_input, req_len);
 	else if (buffer_input)
 		copy_request(argv[0], cld.in, req_len);
+	else if (req_len >= 0)
+		pipe_fixed_length(argv[0], cld.in, req_len);
 	else
 		close(0);
 
diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh
new file mode 100755
index 0000000000..8040d80e04
--- /dev/null
+++ b/t/t5562-http-backend-content-length.sh
@@ -0,0 +1,169 @@
+#!/bin/sh
+
+test_description='test git-http-backend respects CONTENT_LENGTH'
+. ./test-lib.sh
+
+test_lazy_prereq GZIP 'gzip --version'
+
+verify_http_result() {
+	# sometimes there is fatal error buit the result is still 200
+	if grep 'fatal:' act.err
+	then
+		return 1
+	fi
+
+	if ! grep "Status" act.out >act
+	then
+		printf "Status: 200 OK\r\n" >act
+	fi
+	printf "Status: $1\r\n" >exp &&
+	test_cmp exp act
+}
+
+test_http_env() {
+	handler_type="$1"
+	shift
+	env \
+		CONTENT_TYPE="application/x-git-$handler_type-pack-request" \
+		QUERY_STRING="/repo.git/git-$handler_type-pack" \
+		PATH_TRANSLATED="$PWD/.git/git-$handler_type-pack" \
+		GIT_HTTP_EXPORT_ALL=TRUE \
+		REQUEST_METHOD=POST \
+		"$@"
+}
+
+ssize_b100dots() {
+	# hardcoded ((size_t) SSIZE_MAX) + 1
+	case "$(build_option sizeof-size_t)" in
+	8) echo 9223372036854775808;;
+	4) echo 2147483648;;
+	*) die "Unexpected ssize_t size: $(build_option sizeof-size_t)";;
+	esac
+}
+
+test_expect_success 'setup' '
+	git config http.receivepack true &&
+	test_commit c0 &&
+	test_commit c1 &&
+	hash_head=$(git rev-parse HEAD) &&
+	hash_prev=$(git rev-parse HEAD~1) &&
+	printf "want %s" "$hash_head" | packetize >fetch_body &&
+	printf 0000 >>fetch_body &&
+	printf "have %s" "$hash_prev" | packetize >>fetch_body &&
+	printf done | packetize >>fetch_body &&
+	test_copy_bytes 10 <fetch_body >fetch_body.trunc &&
+	hash_next=$(git commit-tree -p HEAD -m next HEAD^{tree}) &&
+	printf "%s %s refs/heads/newbranch\\0report-status\\n" "$_z40" "$hash_next" | packetize >push_body &&
+	printf 0000 >>push_body &&
+	echo "$hash_next" | git pack-objects --stdout >>push_body &&
+	test_copy_bytes 10 <push_body >push_body.trunc &&
+	: >empty_body
+'
+
+test_expect_success GZIP 'setup, compression related' '
+	gzip -k fetch_body &&
+	test_copy_bytes 10 <fetch_body.gz >fetch_body.gz.trunc &&
+	gzip -k push_body &&
+	test_copy_bytes 10 <push_body.gz >push_body.gz.trunc
+'
+
+test_expect_success 'fetch plain' '
+	test_http_env upload \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body git http-backend >act.out 2>act.err &&
+	verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch plain truncated' '
+	test_http_env upload \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.trunc git http-backend >act.out 2>act.err &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch plain empty' '
+	test_http_env upload \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'fetch gzipped' '
+	test_http_env upload \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.gz git http-backend >act.out 2>act.err &&
+	verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'fetch gzipped truncated' '
+	test_http_env upload \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.gz.trunc git http-backend >act.out 2>act.err &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'fetch gzipped empty' '
+	test_http_env upload \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'push plain' '
+	test_when_finished "git branch -D newbranch" &&
+	test_http_env receive \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body git http-backend >act.out 2>act.err &&
+	verify_http_result "200 OK" &&
+	git rev-parse newbranch >act.head &&
+	echo "$hash_next" >exp.head &&
+	test_cmp act.head exp.head
+'
+
+test_expect_success 'push plain truncated' '
+	test_http_env receive \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.trunc git http-backend >act.out 2>act.err &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success 'push plain empty' '
+	test_http_env receive \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'push gzipped' '
+	test_when_finished "git branch -D newbranch" &&
+	test_http_env receive \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.gz git http-backend >act.out 2>act.err &&
+	verify_http_result "200 OK" &&
+	git rev-parse newbranch >act.head &&
+	echo "$hash_next" >exp.head &&
+	test_cmp act.head exp.head
+'
+
+test_expect_success GZIP 'push gzipped truncated' '
+	test_http_env receive \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.gz.trunc git http-backend >act.out 2>act.err &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'push gzipped empty' '
+	test_http_env receive \
+		HTTP_CONTENT_ENCODING="gzip" \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success 'CONTENT_LENGTH overflow ssite_t' '
+	NOT_FIT_IN_SSIZE=$(ssize_b100dots) &&
+	env \
+		CONTENT_TYPE=application/x-git-upload-pack-request \
+		QUERY_STRING=/repo.git/git-upload-pack \
+		PATH_TRANSLATED="$PWD"/.git/git-upload-pack \
+		GIT_HTTP_EXPORT_ALL=TRUE \
+		REQUEST_METHOD=POST \
+		CONTENT_LENGTH="$NOT_FIT_IN_SSIZE" \
+		git http-backend </dev/zero >/dev/null 2>err &&
+	grep "fatal:.*CONTENT_LENGTH" err
+'
+
+test_done
diff --git a/t/t5562/invoke-with-content-length.pl b/t/t5562/invoke-with-content-length.pl
new file mode 100755
index 0000000000..6c2aae7692
--- /dev/null
+++ b/t/t5562/invoke-with-content-length.pl
@@ -0,0 +1,37 @@
+#!/usr/bin/perl
+use 5.008;
+use strict;
+use warnings;
+
+my $body_filename = $ARGV[0];
+my @command = @ARGV[1 .. $#ARGV];
+
+# read data
+my $body_size = -s $body_filename;
+$ENV{"CONTENT_LENGTH"} = $body_size;
+open(my $body_fh, "<", $body_filename) or die "Cannot open $body_filename: $!";
+my $body_data;
+defined read($body_fh, $body_data, $body_size) or die "Cannot read $body_filename: $!";
+close($body_fh);
+
+my $exited = 0;
+$SIG{"CHLD"} = sub {
+        $exited = 1;
+};
+
+# write data
+my $pid = open(my $out, "|-", @command);
+{
+        # disable buffering at $out
+        my $old_selected = select;
+        select $out;
+        $| = 1;
+        select $old_selected;
+}
+print $out $body_data or die "Cannot write data: $!";
+
+sleep 60; # is interrupted by SIGCHLD
+if (!$exited) {
+        close($out);
+        die "Command did not exit after reading whole body";
+}
-- 
2.17.0.1185.g782057d875


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-04 22:18     ` Max Kirillov
@ 2018-06-10 15:06       ` Max Kirillov
  2018-06-11  9:18       ` Jeff King
  1 sibling, 0 replies; 31+ messages in thread
From: Max Kirillov @ 2018-06-10 15:06 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Eric Sunshine, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

On Tue, Jun 05, 2018 at 01:18:08AM +0300, Max Kirillov wrote:
> On Mon, Jun 04, 2018 at 12:44:09AM -0400, Jeff King wrote:
>> Since this is slightly less efficient, and because it only matters if
>> the web server does not already close the pipe, should this have a
>> run-time configuration knob, even if it defaults to
>> safe-but-slightly-slower?
> 
> Personally, I of course don't want this. Also, I don't think
> the difference is much noticeable. But you can never be sure
> without trying. I'll try to measure some numbers.

It seems to be challenging to see any effect at my system.
At least not with any real operation because changing
references needs IO and index-pack needs CPU so. I'll try
it some more.

>> We should probably say something more generic like:
>> 
>>   die_errno("unable to write to '%s'");
>> 
>> or similar.
> 
> Actually, it is already 3rd same error in this file. Maybe
> deserve some refactoring. I will change the message also.

Extracted the writing and refactoring to a single function,
also fixed the message.

>>> +cat >fetch_body <<EOF
>>> +0032want $hash_head
>>> +00000032have $hash_prev
>>> +0009done
>>> +EOF
>> 
>> This depends on the size of the hash. That's always 40 for now, but is
>> something that may change soon.
>> 
>> We already have a packetize() helper; could we use it here?

> Could you point me to it? I cannot find it.

Sorry, misread it as packetSize. Found and used.

>> Also, do we need to protect ourselves against other signals being
>> delivered? E.g., if I resize my xterm and this process gets SIGWINCH, is
>> it going to erroneously end the sleep and say "nope, no exited signal"?

> I'll check, but what could I do? Should I add blocking other
> signals there?

In my Linux I don't see the signal. Except that, there seem to
be not that many ignored signals. Anyway, I don't see what
could be done bout it.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-04  4:44   ` Jeff King
  2018-06-04 22:18     ` Max Kirillov
@ 2018-06-10 15:07     ` Max Kirillov
  2018-06-11  8:59       ` Jeff King
  1 sibling, 1 reply; 31+ messages in thread
From: Max Kirillov @ 2018-06-10 15:07 UTC (permalink / raw)
  To: Jeff King
  Cc: Max Kirillov, Junio C Hamano, Eric Sunshine, Florian Manschwetus,
	Chris Packham, Konstantin Khomoutov, git@vger.kernel.org

On Mon, Jun 04, 2018 at 12:44:09AM -0400, Jeff King wrote:
> On Sun, Jun 03, 2018 at 12:27:49AM +0300, Max Kirillov wrote:
>> +	env \
>> +		CONTENT_TYPE=application/x-git-upload-pack-request \
>> +		QUERY_STRING=/repo.git/git-upload-pack \
>> +		PATH_TRANSLATED="$PWD"/.git/git-upload-pack \
>> +		GIT_HTTP_EXPORT_ALL=TRUE \
>> +		REQUEST_METHOD=POST \
>> +		CONTENT_LENGTH="$NOT_FIT_IN_SSIZE" \
>> +		git http-backend </dev/zero >/dev/null 2>err &&
>> +	grep -q "fatal:.*CONTENT_LENGTH" err
> 
> I'm not sure if these messages should be marked for translation. If so,
> you'd want test_i18ngrep here.

Message localization does not seem to be used in
http-backend at all. It makes sense - server-side software
probably does not know who is the user on the other side, if
the message gets to the user at all. So, I think the
message should not be translated.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-10 15:07     ` Max Kirillov
@ 2018-06-11  8:59       ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2018-06-11  8:59 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Junio C Hamano, Eric Sunshine, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

On Sun, Jun 10, 2018 at 06:07:27PM +0300, Max Kirillov wrote:

> On Mon, Jun 04, 2018 at 12:44:09AM -0400, Jeff King wrote:
> > On Sun, Jun 03, 2018 at 12:27:49AM +0300, Max Kirillov wrote:
> >> +	env \
> >> +		CONTENT_TYPE=application/x-git-upload-pack-request \
> >> +		QUERY_STRING=/repo.git/git-upload-pack \
> >> +		PATH_TRANSLATED="$PWD"/.git/git-upload-pack \
> >> +		GIT_HTTP_EXPORT_ALL=TRUE \
> >> +		REQUEST_METHOD=POST \
> >> +		CONTENT_LENGTH="$NOT_FIT_IN_SSIZE" \
> >> +		git http-backend </dev/zero >/dev/null 2>err &&
> >> +	grep -q "fatal:.*CONTENT_LENGTH" err
> > 
> > I'm not sure if these messages should be marked for translation. If so,
> > you'd want test_i18ngrep here.
> 
> Message localization does not seem to be used in
> http-backend at all. It makes sense - server-side software
> probably does not know who is the user on the other side, if
> the message gets to the user at all. So, I think the
> message should not be translated.

OK. I think there's been talk of localizing "fatal:", but whoever does
that patch would have to deal with fallout all over the test-suite. I
don't think we need to worry about it yet.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-04 22:18     ` Max Kirillov
  2018-06-10 15:06       ` Max Kirillov
@ 2018-06-11  9:18       ` Jeff King
  2018-06-11  9:24         ` Jeff King
  1 sibling, 1 reply; 31+ messages in thread
From: Jeff King @ 2018-06-11  9:18 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Junio C Hamano, Eric Sunshine, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

On Tue, Jun 05, 2018 at 01:18:08AM +0300, Max Kirillov wrote:

> > On Sun, Jun 03, 2018 at 12:27:49AM +0300, Max Kirillov wrote:
> > Since this is slightly less efficient, and because it only matters if
> > the web server does not already close the pipe, should this have a
> > run-time configuration knob, even if it defaults to
> > safe-but-slightly-slower?
> 
> Personally, I of course don't want this. Also, I don't think
> the difference is much noticeable. But you can never be sure
> without trying. I'll try to measure some numbers.

I don't know if it will matter or not. I just wonder if we want to leave
an escape hatch for people who might. I could take or leave it.

> Actually, it is already 3rd same error in this file. Maybe
> deserve some refactoring. I will change the message also.

Thanks, that kind of related cleanup is very welcome.

> > We generally prefer to have all commands, even ones we don't expect to
> > fail, inside test_expect blocks (e.g., with a "setup" description).
> 
> Will the defined variables get to the next test? I'll try to
> do as you describe.

Yes, the tests are all run as evals. So as long as you don't open a
subshell yourself, any changes you make to process state will persist.

> >> +test_expect_success 'fetch plain truncated' '
> >> +	test_http_env upload \
> >> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.trunc git http-backend >act.out 2>act.err &&
> >> +	test_must_fail verify_http_result "200 OK"
> >> +'
> > 
> > Usually test_must_fail on a checking function like this is a sign that
> > the check is not as robust as we'd like. If the function checks two
> > things "A && B", then checking test_must_fail will only let us know
> > "!A || !B", but you probably want to check both.
> 
> Well here I just want to know that the request has failed,
> and we already know that it can fail in different ways,
> but the test is not going to differentiate those ways.

OK, looking over your verify_http_result function, I _think_ we are OK
here, because the only && is against a printf, which we wouldn't really
expect to fail.

> >> +sleep 1; # is interrupted by SIGCHLD
> >> +if (!$exited) {
> >> +        close($out);
> >> +        die "Command did not exit after reading whole body";
> >> +}
> 
> > Also, do we need to protect ourselves against other signals being
> > delivered? E.g., if I resize my xterm and this process gets SIGWINCH, is
> > it going to erroneously end the sleep and say "nope, no exited signal"?
> 
> I'll check, but what could I do? Should I add blocking other
> signals there?

I think a more robust check may be to waitpid() on the child for up to N
seconds. Something like this:

  $SIG{ALRM} = sub {
	  kill(9, $pid);
	  die "command did not exit after reading whole body"
  };
  alarm(60);
  waitpid($pid, 0);
  alarm(0);

That should exit immediately if $pid does, and otherwise die after
exactly 60 seconds. Perl's waitpid implementation will restart
automatically if it gets another signal.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-11  9:18       ` Jeff King
@ 2018-06-11  9:24         ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2018-06-11  9:24 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Junio C Hamano, Eric Sunshine, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

On Mon, Jun 11, 2018 at 05:18:13AM -0400, Jeff King wrote:

> > >> +sleep 1; # is interrupted by SIGCHLD
> > >> +if (!$exited) {
> > >> +        close($out);
> > >> +        die "Command did not exit after reading whole body";
> > >> +}
> > 
> > > Also, do we need to protect ourselves against other signals being
> > > delivered? E.g., if I resize my xterm and this process gets SIGWINCH, is
> > > it going to erroneously end the sleep and say "nope, no exited signal"?
> > 
> > I'll check, but what could I do? Should I add blocking other
> > signals there?
> 
> I think a more robust check may be to waitpid() on the child for up to N
> seconds. Something like this:
> 
>   $SIG{ALRM} = sub {
> 	  kill(9, $pid);
> 	  die "command did not exit after reading whole body"
>   };
>   alarm(60);
>   waitpid($pid, 0);
>   alarm(0);
> 
> That should exit immediately if $pid does, and otherwise die after
> exactly 60 seconds. Perl's waitpid implementation will restart
> automatically if it gets another signal.

I tried your original, delivering some signals to it. I think it
actually is OK, too, because perl's sleep() implementation will also
restart for something like SIGWINCH.

E.g., stracing looks like this:

  nanosleep({tv_sec=60, tv_nsec=0}, {tv_sec=57, tv_nsec=791891377}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
  --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
  restart_syscall(<... resuming interrupted nanosleep ...>

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-06-10 15:05   ` [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
@ 2018-07-25 12:14     ` SZEDER Gábor
  2018-07-25 14:51       ` Max Kirillov
  0 siblings, 1 reply; 31+ messages in thread
From: SZEDER Gábor @ 2018-07-25 12:14 UTC (permalink / raw)
  To: Max Kirillov
  Cc: SZEDER Gábor, Junio C Hamano, Eric Sunshine, Jeff King,
	Florian Manschwetus, Chris Packham, Konstantin Khomoutov,
	git@vger.kernel.org


[Hrm, this time with hopefully proper In-Reply-To: header.
 Sorry for the double post.]


> Push passes to another commands, as described in
> https://public-inbox.org/git/20171129032214.GB32345@sigill.intra.peff.net/
> 
> As it gets complicated to correctly track the data length, instead transfer
> the data through parent process and cut the pipe as the specified length is
> reached. Do it only when CONTENT_LENGTH is set, otherwise pass the input
> directly to the forked commands.
> 
> Add tests for cases:
> 
> * CONTENT_LENGTH is set, script's stdin has more data, with all combinations
>   of variations: fetch or push, plain or compressed body, correct or truncated
>   input.
> 
> * CONTENT_LENGTH is specified to a value which does not fit into ssize_t.
> 
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Max Kirillov <max@max630.net>
> ---
>  help.c                                 |   1 +
>  http-backend.c                         |  32 ++++-
>  t/t5562-http-backend-content-length.sh | 169 +++++++++++++++++++++++++
>  t/t5562/invoke-with-content-length.pl  |  37 ++++++
>  4 files changed, 237 insertions(+), 2 deletions(-)
>  create mode 100755 t/t5562-http-backend-content-length.sh
>  create mode 100755 t/t5562/invoke-with-content-length.pl


> diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh
> new file mode 100755
> index 0000000000..8040d80e04
> --- /dev/null
> +++ b/t/t5562-http-backend-content-length.sh
> @@ -0,0 +1,169 @@
> +#!/bin/sh
> +
> +test_description='test git-http-backend respects CONTENT_LENGTH'
> +. ./test-lib.sh
> +
> +test_lazy_prereq GZIP 'gzip --version'
> +
> +verify_http_result() {
> +	# sometimes there is fatal error buit the result is still 200

s/buit/but/

> +	if grep 'fatal:' act.err
> +	then
> +		return 1
> +	fi

I just happened to stumble upon a failure because of 'fatal: the
remote end hung up unexpectedly' in the test 'push plain'.

What does that "sometimes" in the above comment mean, and how often
does such a failure happen?  I see these patches are in 'pu' for over
a month now, so based on the number of reflog entries since then it
happened once from about 30-35 builds on Travis CI so far.

I don't really like the idea of adding a bunch of flaky test cases...
we have enough of them already, unfortunately.

> +
> +	if ! grep "Status" act.out >act
> +	then
> +		printf "Status: 200 OK\r\n" >act
> +	fi
> +	printf "Status: $1\r\n" >exp &&
> +	test_cmp exp act
> +}
> +
> +test_http_env() {
> +	handler_type="$1"
> +	shift
> +	env \
> +		CONTENT_TYPE="application/x-git-$handler_type-pack-request" \
> +		QUERY_STRING="/repo.git/git-$handler_type-pack" \
> +		PATH_TRANSLATED="$PWD/.git/git-$handler_type-pack" \
> +		GIT_HTTP_EXPORT_ALL=TRUE \
> +		REQUEST_METHOD=POST \
> +		"$@"
> +}
> +
> +ssize_b100dots() {
> +	# hardcoded ((size_t) SSIZE_MAX) + 1
> +	case "$(build_option sizeof-size_t)" in
> +	8) echo 9223372036854775808;;
> +	4) echo 2147483648;;
> +	*) die "Unexpected ssize_t size: $(build_option sizeof-size_t)";;
> +	esac
> +}
> +
> +test_expect_success 'setup' '
> +	git config http.receivepack true &&
> +	test_commit c0 &&
> +	test_commit c1 &&
> +	hash_head=$(git rev-parse HEAD) &&
> +	hash_prev=$(git rev-parse HEAD~1) &&
> +	printf "want %s" "$hash_head" | packetize >fetch_body &&
> +	printf 0000 >>fetch_body &&
> +	printf "have %s" "$hash_prev" | packetize >>fetch_body &&
> +	printf done | packetize >>fetch_body &&
> +	test_copy_bytes 10 <fetch_body >fetch_body.trunc &&
> +	hash_next=$(git commit-tree -p HEAD -m next HEAD^{tree}) &&
> +	printf "%s %s refs/heads/newbranch\\0report-status\\n" "$_z40" "$hash_next" | packetize >push_body &&
> +	printf 0000 >>push_body &&
> +	echo "$hash_next" | git pack-objects --stdout >>push_body &&
> +	test_copy_bytes 10 <push_body >push_body.trunc &&
> +	: >empty_body
> +'
> +
> +test_expect_success GZIP 'setup, compression related' '
> +	gzip -k fetch_body &&
> +	test_copy_bytes 10 <fetch_body.gz >fetch_body.gz.trunc &&
> +	gzip -k push_body &&
> +	test_copy_bytes 10 <push_body.gz >push_body.gz.trunc
> +'
> +
> +test_expect_success 'fetch plain' '
> +	test_http_env upload \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body git http-backend >act.out 2>act.err &&

Don't save the standard error of the whole shell function.
When running the test with /bin/sh and '-x' tracing, then the trace of
commands executed in the function will be included in the standard
error as well, which may interfere with later verification (though in
this case it doesn't seem like it would cause any issues).

Please limit the redirections to the relevant command's output.  AFAICT
all invocations of 'test_http_env' in these tests have their stdout and
stderr redirected to the same pair of files, so perhaps you could
simply move all these redirections inside the function.

> +	verify_http_result "200 OK"
> +'
> +
> +test_expect_success 'fetch plain truncated' '
> +	test_http_env upload \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.trunc git http-backend >act.out 2>act.err &&

If this command were to print a "fatal: ..." message to its standard
error, then ...

> +	! verify_http_result "200 OK"

... this function would return error (because of that 'if grep fatal:
...' statement) without even looking at the status, but the test would
still succeed.  Is that really the desired behavior here?

> +'
> +
> +test_expect_success 'fetch plain empty' '
> +	test_http_env upload \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
> +	! verify_http_result "200 OK"
> +'
> +
> +test_expect_success GZIP 'fetch gzipped' '
> +	test_http_env upload \
> +		HTTP_CONTENT_ENCODING="gzip" \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.gz git http-backend >act.out 2>act.err &&
> +	verify_http_result "200 OK"
> +'
> +
> +test_expect_success GZIP 'fetch gzipped truncated' '
> +	test_http_env upload \
> +		HTTP_CONTENT_ENCODING="gzip" \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body.gz.trunc git http-backend >act.out 2>act.err &&
> +	! verify_http_result "200 OK"
> +'
> +
> +test_expect_success GZIP 'fetch gzipped empty' '
> +	test_http_env upload \
> +		HTTP_CONTENT_ENCODING="gzip" \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
> +	! verify_http_result "200 OK"
> +'
> +
> +test_expect_success GZIP 'push plain' '
> +	test_when_finished "git branch -D newbranch" &&
> +	test_http_env receive \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body git http-backend >act.out 2>act.err &&
> +	verify_http_result "200 OK" &&
> +	git rev-parse newbranch >act.head &&
> +	echo "$hash_next" >exp.head &&
> +	test_cmp act.head exp.head
> +'
> +
> +test_expect_success 'push plain truncated' '
> +	test_http_env receive \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.trunc git http-backend >act.out 2>act.err &&
> +	! verify_http_result "200 OK"
> +'
> +
> +test_expect_success 'push plain empty' '
> +	test_http_env receive \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
> +	! verify_http_result "200 OK"
> +'
> +
> +test_expect_success GZIP 'push gzipped' '
> +	test_when_finished "git branch -D newbranch" &&
> +	test_http_env receive \
> +		HTTP_CONTENT_ENCODING="gzip" \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.gz git http-backend >act.out 2>act.err &&
> +	verify_http_result "200 OK" &&
> +	git rev-parse newbranch >act.head &&
> +	echo "$hash_next" >exp.head &&
> +	test_cmp act.head exp.head
> +'
> +
> +test_expect_success GZIP 'push gzipped truncated' '
> +	test_http_env receive \
> +		HTTP_CONTENT_ENCODING="gzip" \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl push_body.gz.trunc git http-backend >act.out 2>act.err &&
> +	! verify_http_result "200 OK"
> +'
> +
> +test_expect_success GZIP 'push gzipped empty' '
> +	test_http_env receive \
> +		HTTP_CONTENT_ENCODING="gzip" \
> +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl empty_body git http-backend >act.out 2>act.err &&
> +	! verify_http_result "200 OK"
> +'
> +
> +test_expect_success 'CONTENT_LENGTH overflow ssite_t' '
> +	NOT_FIT_IN_SSIZE=$(ssize_b100dots) &&
> +	env \
> +		CONTENT_TYPE=application/x-git-upload-pack-request \
> +		QUERY_STRING=/repo.git/git-upload-pack \
> +		PATH_TRANSLATED="$PWD"/.git/git-upload-pack \
> +		GIT_HTTP_EXPORT_ALL=TRUE \
> +		REQUEST_METHOD=POST \
> +		CONTENT_LENGTH="$NOT_FIT_IN_SSIZE" \
> +		git http-backend </dev/zero >/dev/null 2>err &&
> +	grep "fatal:.*CONTENT_LENGTH" err
> +'
> +
> +test_done
> diff --git a/t/t5562/invoke-with-content-length.pl b/t/t5562/invoke-with-content-length.pl
> new file mode 100755
> index 0000000000..6c2aae7692
> --- /dev/null
> +++ b/t/t5562/invoke-with-content-length.pl
> @@ -0,0 +1,37 @@
> +#!/usr/bin/perl
> +use 5.008;
> +use strict;
> +use warnings;
> +
> +my $body_filename = $ARGV[0];
> +my @command = @ARGV[1 .. $#ARGV];
> +
> +# read data
> +my $body_size = -s $body_filename;
> +$ENV{"CONTENT_LENGTH"} = $body_size;
> +open(my $body_fh, "<", $body_filename) or die "Cannot open $body_filename: $!";
> +my $body_data;
> +defined read($body_fh, $body_data, $body_size) or die "Cannot read $body_filename: $!";
> +close($body_fh);
> +
> +my $exited = 0;
> +$SIG{"CHLD"} = sub {
> +        $exited = 1;
> +};
> +
> +# write data
> +my $pid = open(my $out, "|-", @command);
> +{
> +        # disable buffering at $out
> +        my $old_selected = select;
> +        select $out;
> +        $| = 1;
> +        select $old_selected;
> +}
> +print $out $body_data or die "Cannot write data: $!";
> +
> +sleep 60; # is interrupted by SIGCHLD
> +if (!$exited) {
> +        close($out);
> +        die "Command did not exit after reading whole body";
> +}
> -- 
> 2.17.0.1185.g782057d875
> 
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-07-25 12:14     ` SZEDER Gábor
@ 2018-07-25 14:51       ` Max Kirillov
  2018-07-25 18:41         ` SZEDER Gábor
  0 siblings, 1 reply; 31+ messages in thread
From: Max Kirillov @ 2018-07-25 14:51 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Max Kirillov, Junio C Hamano, Eric Sunshine, Jeff King,
	Florian Manschwetus, Chris Packham, Konstantin Khomoutov,
	git@vger.kernel.org

On Wed, Jul 25, 2018 at 02:14:35PM +0200, SZEDER Gábor wrote:
>> +	# sometimes there is fatal error buit the result is still 200
> 
> s/buit/but/

Thanks, will fix

>> +	if grep 'fatal:' act.err
>> +	then
>> +		return 1
>> +	fi
> 
> I just happened to stumble upon a failure because of 'fatal: the
> remote end hung up unexpectedly' in the test 'push plain'.

Did it happen once or repeated? It is rather strange, that
one shoud not fail. Which OS it was?

There have been doubds that a random incoming signal can
trigger such a failure.

> What does that "sometimes" in the above comment mean, and how often
> does such a failure happen?  I see these patches are in 'pu' for over
> a month now, so based on the number of reflog entries since then it
> happened once from about 30-35 builds on Travis CI so far.

"sometimes" here means "for some kinds of fatal error
failure", there is nothing random in it.

> > +		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl fetch_body git http-backend >act.out 2>act.err &&
> 
> Don't save the standard error of the whole shell function.
> When running the test with /bin/sh and '-x' tracing, then the trace of
> commands executed in the function will be included in the standard
> error as well, which may interfere with later verification (though in
> this case it doesn't seem like it would cause any issues).
> 
> Please limit the redirections to the relevant command's output.  AFAICT
> all invocations of 'test_http_env' in these tests have their stdout and
> stderr redirected to the same pair of files, so perhaps you could
> simply move all these redirections inside the function.

Thanks, I'll try to fix it 

>> +	! verify_http_result "200 OK"
> 
> ... this function would return error (because of that 'if grep fatal:
> ...' statement) without even looking at the status, but the test would
> still succeed.  Is that really the desired behavior here?

Yes, it is a desired behavior. A failure is expected here,
and the failure does not show up as non-200 status, as
described above.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-07-25 14:51       ` Max Kirillov
@ 2018-07-25 18:41         ` SZEDER Gábor
  2018-07-26  4:37           ` Max Kirillov
  0 siblings, 1 reply; 31+ messages in thread
From: SZEDER Gábor @ 2018-07-25 18:41 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Junio C Hamano, Eric Sunshine, Jeff King, Florian Manschwetus,
	Chris Packham, Konstantin Khomoutov, Git mailing list

On Wed, Jul 25, 2018 at 4:51 PM Max Kirillov <max@max630.net> wrote:
>
> On Wed, Jul 25, 2018 at 02:14:35PM +0200, SZEDER Gábor wrote:
> >> +    # sometimes there is fatal error buit the result is still 200

> >> +    if grep 'fatal:' act.err
> >> +    then
> >> +            return 1
> >> +    fi
> >
> > I just happened to stumble upon a failure because of 'fatal: the
> > remote end hung up unexpectedly' in the test 'push plain'.
>
> Did it happen once or repeated? It is rather strange, that
> one shoud not fail. Which OS it was?

Only once, so far.  It was one of my OSX build jobs on Travis CI, but
I don't know what OSX version is used.

'act.err' contained this (which will get line-wrapped, I'm afraid):

++handler_type=receive
++shift
++env CONTENT_TYPE=application/x-git-receive-pack-request
QUERY_STRING=/repo.git/git-receive-pack
'PATH_TRANSLATED=/Users/travis/t/trash
dir.t5562/.git/git-receive-pack' GIT_HTTP_EXPORT_ALL=TRUE
REQUEST_METHOD=POST
/Users/travis/build/szeder/git-cooking-topics-for-travis-ci/t/t5562/invoke-with-content-length.pl
push_body git http-backend
<...128 zero bytes...>fatal: the remote end hung up unexpectedly

I couldn't reproduce it on my Linux box.

> There have been doubds that a random incoming signal can
> trigger such a failure.
>
> > What does that "sometimes" in the above comment mean, and how often
> > does such a failure happen?  I see these patches are in 'pu' for over
> > a month now, so based on the number of reflog entries since then it
> > happened once from about 30-35 builds on Travis CI so far.
>
> "sometimes" here means "for some kinds of fatal error
> failure", there is nothing random in it.

> >> +    ! verify_http_result "200 OK"
> >
> > ... this function would return error (because of that 'if grep fatal:
> > ...' statement) without even looking at the status, but the test would
> > still succeed.  Is that really the desired behavior here?
>
> Yes, it is a desired behavior. A failure is expected here,
> and the failure does not show up as non-200 status, as
> described above.

OK, then I misunderstood that comment.

Perhaps a different wording could make it slightly better?  E.g. "In
some of these tests ..." instead of that "sometimes".  Dunno.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-07-25 18:41         ` SZEDER Gábor
@ 2018-07-26  4:37           ` Max Kirillov
  0 siblings, 0 replies; 31+ messages in thread
From: Max Kirillov @ 2018-07-26  4:37 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Max Kirillov, Junio C Hamano, Eric Sunshine, Jeff King,
	Florian Manschwetus, Chris Packham, Konstantin Khomoutov,
	Git mailing list

On Wed, Jul 25, 2018 at 08:41:31PM +0200, SZEDER Gábor wrote:
> On Wed, Jul 25, 2018 at 4:51 PM Max Kirillov <max@max630.net> wrote:
>>> I just happened to stumble upon a failure because of 'fatal: the
>>> remote end hung up unexpectedly' in the test 'push plain'.
>>
>> Did it happen once or repeated? It is rather strange, that
>> one shoud not fail. Which OS it was?
> 
> Only once, so far.  It was one of my OSX build jobs on Travis CI, but
> I don't know what OSX version is used.
> 
> 'act.err' contained this (which will get line-wrapped, I'm afraid):
> 
> ++handler_type=receive
> ++shift
> ++env CONTENT_TYPE=application/x-git-receive-pack-request
> QUERY_STRING=/repo.git/git-receive-pack
> 'PATH_TRANSLATED=/Users/travis/t/trash
> dir.t5562/.git/git-receive-pack' GIT_HTTP_EXPORT_ALL=TRUE
> REQUEST_METHOD=POST
> /Users/travis/build/szeder/git-cooking-topics-for-travis-ci/t/t5562/invoke-with-content-length.pl
> push_body git http-backend
> <...128 zero bytes...>fatal: the remote end hung up unexpectedly
> 
> I couldn't reproduce it on my Linux box.

The only reason for this I could imagine is some perl
utility failure to feed the body to git http-backend.
I could not reproduce it either, but if such things happen
often again maybe should concider C helper instead. Though
I'm afraid I easily can make more mistakes in it than perl
interpreter authors.

I'll make the other changes, and sofar just hope it would
not happen again.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-06-10 15:05 ` [PATCH v8 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
                     ` (2 preceding siblings ...)
  2018-06-10 15:05   ` [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
@ 2018-07-27  3:48   ` Max Kirillov
  2018-07-27  3:48     ` [PATCH v9 1/3] http-backend: cleanup writing to child process Max Kirillov
                       ` (3 more replies)
  3 siblings, 4 replies; 31+ messages in thread
From: Max Kirillov @ 2018-07-27  3:48 UTC (permalink / raw)
  To: Junio C Hamano, SZEDER Gábor, Torsten Bögershausen
  Cc: Max Kirillov, Eric Sunshine, Jeff King, Florian Manschwetus,
	Chris Packham, Konstantin Khomoutov, git@vger.kernel.org

* fix the gzip usage as suggested in https://public-inbox.org/git/xmqqk1quvegh.fsf@gitster-ct.c.googlers.com/
* better explanation of why status check is needed
* redirect only the helper call, not the whole shell function, also move more into the shell function

Max Kirillov (3):
  http-backend: cleanup writing to child process
  http-backend: respect CONTENT_LENGTH as specified by rfc3875
  http-backend: respect CONTENT_LENGTH for receive-pack

 config.c                               |   2 +-
 config.h                               |   1 +
 help.c                                 |   1 +
 http-backend.c                         | 100 +++++++++++++---
 t/t5562-http-backend-content-length.sh | 155 +++++++++++++++++++++++++
 t/t5562/invoke-with-content-length.pl  |  37 ++++++
 6 files changed, 281 insertions(+), 15 deletions(-)
 create mode 100755 t/t5562-http-backend-content-length.sh
 create mode 100755 t/t5562/invoke-with-content-length.pl

-- 
2.17.0.1185.g782057d875


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v9 1/3] http-backend: cleanup writing to child process
  2018-07-27  3:48   ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
@ 2018-07-27  3:48     ` Max Kirillov
  2018-07-27  3:48     ` [PATCH v9 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Max Kirillov @ 2018-07-27  3:48 UTC (permalink / raw)
  To: Junio C Hamano, SZEDER Gábor, Torsten Bögershausen
  Cc: Max Kirillov, Eric Sunshine, Jeff King, Florian Manschwetus,
	Chris Packham, Konstantin Khomoutov, git@vger.kernel.org

As explained in [1], we should not assume the reason why the writing has
failed, and even if the reason is that child has existed not the reason
why it have done so. So instead just say that writing has failed.

[1] https://public-inbox.org/git/20180604044408.GD14451@sigill.intra.peff.net/

Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 http-backend.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index adaef16fad..cefdfd6fc6 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -279,6 +279,12 @@ static struct rpc_service *select_service(struct strbuf *hdr, const char *name)
 	return svc;
 }
 
+static void write_to_child(int out, const unsigned char *buf, ssize_t len, const char *prog_name)
+{
+	if (write_in_full(out, buf, len) < 0)
+		die("unable to write to '%s'", prog_name);
+}
+
 /*
  * This is basically strbuf_read(), except that if we
  * hit max_request_buffer we die (we'd rather reject a
@@ -361,9 +367,8 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
 				die("zlib error inflating request, result %d", ret);
 
 			n = stream.total_out - cnt;
-			if (write_in_full(out, out_buf, n) < 0)
-				die("%s aborted reading request", prog_name);
-			cnt += n;
+			write_to_child(out, out_buf, stream.total_out - cnt, prog_name);
+			cnt = stream.total_out;
 
 			if (ret == Z_STREAM_END)
 				goto done;
@@ -382,8 +387,7 @@ static void copy_request(const char *prog_name, int out)
 	ssize_t n = read_request(0, &buf);
 	if (n < 0)
 		die_errno("error reading request body");
-	if (write_in_full(out, buf, n) < 0)
-		die("%s aborted reading request", prog_name);
+	write_to_child(out, buf, n, prog_name);
 	close(out);
 	free(buf);
 }
-- 
2.17.0.1185.g782057d875


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v9 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-07-27  3:48   ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  2018-07-27  3:48     ` [PATCH v9 1/3] http-backend: cleanup writing to child process Max Kirillov
@ 2018-07-27  3:48     ` Max Kirillov
  2018-08-04  6:34       ` Duy Nguyen
  2018-07-27  3:48     ` [PATCH v9 3/3] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
  2018-07-27  3:50     ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  3 siblings, 1 reply; 31+ messages in thread
From: Max Kirillov @ 2018-07-27  3:48 UTC (permalink / raw)
  To: Junio C Hamano, SZEDER Gábor, Torsten Bögershausen
  Cc: Max Kirillov, Eric Sunshine, Jeff King, Florian Manschwetus,
	Chris Packham, Konstantin Khomoutov, git@vger.kernel.org

http-backend reads whole input until EOF. However, the RFC 3875 specifies
that a script must read only as many bytes as specified by CONTENT_LENGTH
environment variable. Web server may exercise the specification by not closing
the script's standard input after writing content. In that case http-backend
would hang waiting for the input. The issue is known to happen with
IIS/Windows, for example.

Make http-backend read only CONTENT_LENGTH bytes, if it's defined, rather than
the whole input until EOF. If the variable is not defined, keep older behavior
of reading until EOF because it is used to support chunked transfer-encoding.

This commit only fixes buffered input, whcih reads whole body before
processign it. Non-buffered input is going to be fixed in subsequent commit.

Signed-off-by: Florian Manschwetus <manschwetus@cs-software-gmbh.de>
[mk: fixed trivial build failures and polished style issues]
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 config.c       |  2 +-
 config.h       |  1 +
 http-backend.c | 54 +++++++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/config.c b/config.c
index fbbf0f8e9f..158afa858b 100644
--- a/config.c
+++ b/config.c
@@ -921,7 +921,7 @@ int git_parse_ulong(const char *value, unsigned long *ret)
 	return 1;
 }
 
-static int git_parse_ssize_t(const char *value, ssize_t *ret)
+int git_parse_ssize_t(const char *value, ssize_t *ret)
 {
 	intmax_t tmp;
 	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
diff --git a/config.h b/config.h
index cdac2fc73e..7808413bd0 100644
--- a/config.h
+++ b/config.h
@@ -73,6 +73,7 @@ extern void git_config(config_fn_t fn, void *);
 extern int config_with_options(config_fn_t fn, void *,
 			       struct git_config_source *config_source,
 			       const struct config_options *opts);
+extern int git_parse_ssize_t(const char *, ssize_t *);
 extern int git_parse_ulong(const char *, unsigned long *);
 extern int git_parse_maybe_bool(const char *);
 extern int git_config_int(const char *, const char *);
diff --git a/http-backend.c b/http-backend.c
index cefdfd6fc6..d0b6cb1b09 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -290,7 +290,7 @@ static void write_to_child(int out, const unsigned char *buf, ssize_t len, const
  * hit max_request_buffer we die (we'd rather reject a
  * maliciously large request than chew up infinite memory).
  */
-static ssize_t read_request(int fd, unsigned char **out)
+static ssize_t read_request_eof(int fd, unsigned char **out)
 {
 	size_t len = 0, alloc = 8192;
 	unsigned char *buf = xmalloc(alloc);
@@ -327,7 +327,46 @@ static ssize_t read_request(int fd, unsigned char **out)
 	}
 }
 
-static void inflate_request(const char *prog_name, int out, int buffer_input)
+static ssize_t read_request_fixed_len(int fd, ssize_t req_len, unsigned char **out)
+{
+	unsigned char *buf = NULL;
+	ssize_t cnt = 0;
+
+	if (max_request_buffer < req_len) {
+		die("request was larger than our maximum size (%lu): "
+		    "%" PRIuMAX "; try setting GIT_HTTP_MAX_REQUEST_BUFFER",
+		    max_request_buffer, (uintmax_t)req_len);
+	}
+
+	buf = xmalloc(req_len);
+	cnt = read_in_full(fd, buf, req_len);
+	if (cnt < 0) {
+		free(buf);
+		return -1;
+	}
+	*out = buf;
+	return cnt;
+}
+
+static ssize_t get_content_length(void)
+{
+	ssize_t val = -1;
+	const char *str = getenv("CONTENT_LENGTH");
+
+	if (str && !git_parse_ssize_t(str, &val))
+		die("failed to parse CONTENT_LENGTH: %s", str);
+	return val;
+}
+
+static ssize_t read_request(int fd, unsigned char **out, ssize_t req_len)
+{
+	if (req_len < 0)
+		return read_request_eof(fd, out);
+	else
+		return read_request_fixed_len(fd, req_len, out);
+}
+
+static void inflate_request(const char *prog_name, int out, int buffer_input, ssize_t req_len)
 {
 	git_zstream stream;
 	unsigned char *full_request = NULL;
@@ -345,7 +384,7 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
 			if (full_request)
 				n = 0; /* nothing left to read */
 			else
-				n = read_request(0, &full_request);
+				n = read_request(0, &full_request, req_len);
 			stream.next_in = full_request;
 		} else {
 			n = xread(0, in_buf, sizeof(in_buf));
@@ -381,10 +420,10 @@ static void inflate_request(const char *prog_name, int out, int buffer_input)
 	free(full_request);
 }
 
-static void copy_request(const char *prog_name, int out)
+static void copy_request(const char *prog_name, int out, ssize_t req_len)
 {
 	unsigned char *buf;
-	ssize_t n = read_request(0, &buf);
+	ssize_t n = read_request(0, &buf, req_len);
 	if (n < 0)
 		die_errno("error reading request body");
 	write_to_child(out, buf, n, prog_name);
@@ -399,6 +438,7 @@ static void run_service(const char **argv, int buffer_input)
 	const char *host = getenv("REMOTE_ADDR");
 	int gzipped_request = 0;
 	struct child_process cld = CHILD_PROCESS_INIT;
+	ssize_t req_len = get_content_length();
 
 	if (encoding && !strcmp(encoding, "gzip"))
 		gzipped_request = 1;
@@ -425,9 +465,9 @@ static void run_service(const char **argv, int buffer_input)
 
 	close(1);
 	if (gzipped_request)
-		inflate_request(argv[0], cld.in, buffer_input);
+		inflate_request(argv[0], cld.in, buffer_input, req_len);
 	else if (buffer_input)
-		copy_request(argv[0], cld.in);
+		copy_request(argv[0], cld.in, req_len);
 	else
 		close(0);
 
-- 
2.17.0.1185.g782057d875


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v9 3/3] http-backend: respect CONTENT_LENGTH for receive-pack
  2018-07-27  3:48   ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  2018-07-27  3:48     ` [PATCH v9 1/3] http-backend: cleanup writing to child process Max Kirillov
  2018-07-27  3:48     ` [PATCH v9 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
@ 2018-07-27  3:48     ` Max Kirillov
  2018-07-27  3:50     ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
  3 siblings, 0 replies; 31+ messages in thread
From: Max Kirillov @ 2018-07-27  3:48 UTC (permalink / raw)
  To: Junio C Hamano, SZEDER Gábor, Torsten Bögershausen
  Cc: Max Kirillov, Eric Sunshine, Jeff King, Florian Manschwetus,
	Chris Packham, Konstantin Khomoutov, git@vger.kernel.org

Push passes to another commands, as described in
https://public-inbox.org/git/20171129032214.GB32345@sigill.intra.peff.net/

As it gets complicated to correctly track the data length, instead transfer
the data through parent process and cut the pipe as the specified length is
reached. Do it only when CONTENT_LENGTH is set, otherwise pass the input
directly to the forked commands.

Add tests for cases:

* CONTENT_LENGTH is set, script's stdin has more data, with all combinations
  of variations: fetch or push, plain or compressed body, correct or truncated
  input.

* CONTENT_LENGTH is specified to a value which does not fit into ssize_t.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
---
 help.c                                 |   1 +
 http-backend.c                         |  32 ++++-
 t/t5562-http-backend-content-length.sh | 155 +++++++++++++++++++++++++
 t/t5562/invoke-with-content-length.pl  |  37 ++++++
 4 files changed, 223 insertions(+), 2 deletions(-)
 create mode 100755 t/t5562-http-backend-content-length.sh
 create mode 100755 t/t5562/invoke-with-content-length.pl

diff --git a/help.c b/help.c
index dd35fcc133..e469f5731c 100644
--- a/help.c
+++ b/help.c
@@ -609,6 +609,7 @@ int cmd_version(int argc, const char **argv, const char *prefix)
 		else
 			printf("no commit associated with this build\n");
 		printf("sizeof-long: %d\n", (int)sizeof(long));
+		printf("sizeof-size_t: %d\n", (int)sizeof(size_t));
 		/* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
 	}
 	return 0;
diff --git a/http-backend.c b/http-backend.c
index d0b6cb1b09..e88d29f62b 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -373,6 +373,8 @@ static void inflate_request(const char *prog_name, int out, int buffer_input, ss
 	unsigned char in_buf[8192];
 	unsigned char out_buf[8192];
 	unsigned long cnt = 0;
+	int req_len_defined = req_len >= 0;
+	size_t req_remaining_len = req_len;
 
 	memset(&stream, 0, sizeof(stream));
 	git_inflate_init_gzip_only(&stream);
@@ -387,8 +389,15 @@ static void inflate_request(const char *prog_name, int out, int buffer_input, ss
 				n = read_request(0, &full_request, req_len);
 			stream.next_in = full_request;
 		} else {
-			n = xread(0, in_buf, sizeof(in_buf));
+			ssize_t buffer_len;
+			if (req_len_defined && req_remaining_len <= sizeof(in_buf))
+				buffer_len = req_remaining_len;
+			else
+				buffer_len = sizeof(in_buf);
+			n = xread(0, in_buf, buffer_len);
 			stream.next_in = in_buf;
+			if (req_len_defined && n > 0)
+				req_remaining_len -= n;
 		}
 
 		if (n <= 0)
@@ -431,6 +440,23 @@ static void copy_request(const char *prog_name, int out, ssize_t req_len)
 	free(buf);
 }
 
+static void pipe_fixed_length(const char *prog_name, int out, size_t req_len)
+{
+	unsigned char buf[8192];
+	size_t remaining_len = req_len;
+
+	while (remaining_len > 0) {
+		size_t chunk_length = remaining_len > sizeof(buf) ? sizeof(buf) : remaining_len;
+		ssize_t n = xread(0, buf, chunk_length);
+		if (n < 0)
+			die_errno("Reading request failed");
+		write_to_child(out, buf, n, prog_name);
+		remaining_len -= n;
+	}
+
+	close(out);
+}
+
 static void run_service(const char **argv, int buffer_input)
 {
 	const char *encoding = getenv("HTTP_CONTENT_ENCODING");
@@ -457,7 +483,7 @@ static void run_service(const char **argv, int buffer_input)
 				 "GIT_COMMITTER_EMAIL=%s@http.%s", user, host);
 
 	cld.argv = argv;
-	if (buffer_input || gzipped_request)
+	if (buffer_input || gzipped_request || req_len >= 0)
 		cld.in = -1;
 	cld.git_cmd = 1;
 	if (start_command(&cld))
@@ -468,6 +494,8 @@ static void run_service(const char **argv, int buffer_input)
 		inflate_request(argv[0], cld.in, buffer_input, req_len);
 	else if (buffer_input)
 		copy_request(argv[0], cld.in, req_len);
+	else if (req_len >= 0)
+		pipe_fixed_length(argv[0], cld.in, req_len);
 	else
 		close(0);
 
diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh
new file mode 100755
index 0000000000..057dcb85d6
--- /dev/null
+++ b/t/t5562-http-backend-content-length.sh
@@ -0,0 +1,155 @@
+#!/bin/sh
+
+test_description='test git-http-backend respects CONTENT_LENGTH'
+. ./test-lib.sh
+
+test_lazy_prereq GZIP 'gzip --version'
+
+verify_http_result() {
+	# some fatal errors still produce status 200
+	# so check if there is the error message
+	if grep 'fatal:' act.err
+	then
+		return 1
+	fi
+
+	if ! grep "Status" act.out >act
+	then
+		printf "Status: 200 OK\r\n" >act
+	fi
+	printf "Status: $1\r\n" >exp &&
+	test_cmp exp act
+}
+
+test_http_env() {
+	handler_type="$1"
+	request_body="$2"
+	shift
+	env \
+		CONTENT_TYPE="application/x-git-$handler_type-pack-request" \
+		QUERY_STRING="/repo.git/git-$handler_type-pack" \
+		PATH_TRANSLATED="$PWD/.git/git-$handler_type-pack" \
+		GIT_HTTP_EXPORT_ALL=TRUE \
+		REQUEST_METHOD=POST \
+		"$TEST_DIRECTORY"/t5562/invoke-with-content-length.pl \
+		    "$request_body" git http-backend >act.out 2>act.err
+}
+
+ssize_b100dots() {
+	# hardcoded ((size_t) SSIZE_MAX) + 1
+	case "$(build_option sizeof-size_t)" in
+	8) echo 9223372036854775808;;
+	4) echo 2147483648;;
+	*) die "Unexpected ssize_t size: $(build_option sizeof-size_t)";;
+	esac
+}
+
+test_expect_success 'setup' '
+	export HTTP_CONTENT_ENCODING="identity" &&
+	git config http.receivepack true &&
+	test_commit c0 &&
+	test_commit c1 &&
+	hash_head=$(git rev-parse HEAD) &&
+	hash_prev=$(git rev-parse HEAD~1) &&
+	printf "want %s" "$hash_head" | packetize >fetch_body &&
+	printf 0000 >>fetch_body &&
+	printf "have %s" "$hash_prev" | packetize >>fetch_body &&
+	printf done | packetize >>fetch_body &&
+	test_copy_bytes 10 <fetch_body >fetch_body.trunc &&
+	hash_next=$(git commit-tree -p HEAD -m next HEAD^{tree}) &&
+	printf "%s %s refs/heads/newbranch\\0report-status\\n" "$_z40" "$hash_next" | packetize >push_body &&
+	printf 0000 >>push_body &&
+	echo "$hash_next" | git pack-objects --stdout >>push_body &&
+	test_copy_bytes 10 <push_body >push_body.trunc &&
+	: >empty_body
+'
+
+test_expect_success GZIP 'setup, compression related' '
+	gzip -c fetch_body >fetch_body.gz &&
+	test_copy_bytes 10 <fetch_body.gz >fetch_body.gz.trunc &&
+	gzip -c push_body >push_body.gz &&
+	test_copy_bytes 10 <push_body.gz >push_body.gz.trunc
+'
+
+test_expect_success 'fetch plain' '
+	test_http_env upload fetch_body &&
+	verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch plain truncated' '
+	test_http_env upload fetch_body.trunc &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success 'fetch plain empty' '
+	test_http_env upload empty_body &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'fetch gzipped' '
+	test_env HTTP_CONTENT_ENCODING="gzip" test_http_env upload fetch_body.gz &&
+	verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'fetch gzipped truncated' '
+	test_env HTTP_CONTENT_ENCODING="gzip" test_http_env upload fetch_body.gz.trunc &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'fetch gzipped empty' '
+	test_env HTTP_CONTENT_ENCODING="gzip" test_http_env upload empty_body &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'push plain' '
+	test_when_finished "git branch -D newbranch" &&
+	test_http_env receive push_body &&
+	verify_http_result "200 OK" &&
+	git rev-parse newbranch >act.head &&
+	echo "$hash_next" >exp.head &&
+	test_cmp act.head exp.head
+'
+
+test_expect_success 'push plain truncated' '
+	test_http_env receive push_body.trunc &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success 'push plain empty' '
+	test_http_env receive empty_body &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'push gzipped' '
+	test_when_finished "git branch -D newbranch" &&
+	test_env HTTP_CONTENT_ENCODING="gzip" test_http_env receive push_body.gz &&
+	verify_http_result "200 OK" &&
+	git rev-parse newbranch >act.head &&
+	echo "$hash_next" >exp.head &&
+	test_cmp act.head exp.head
+'
+
+test_expect_success GZIP 'push gzipped truncated' '
+	test_env HTTP_CONTENT_ENCODING="gzip" test_http_env receive push_body.gz.trunc &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success GZIP 'push gzipped empty' '
+	test_env HTTP_CONTENT_ENCODING="gzip" test_http_env receive empty_body &&
+	! verify_http_result "200 OK"
+'
+
+test_expect_success 'CONTENT_LENGTH overflow ssite_t' '
+	NOT_FIT_IN_SSIZE=$(ssize_b100dots) &&
+	env \
+		CONTENT_TYPE=application/x-git-upload-pack-request \
+		QUERY_STRING=/repo.git/git-upload-pack \
+		PATH_TRANSLATED="$PWD"/.git/git-upload-pack \
+		GIT_HTTP_EXPORT_ALL=TRUE \
+		REQUEST_METHOD=POST \
+		CONTENT_LENGTH="$NOT_FIT_IN_SSIZE" \
+		git http-backend </dev/zero >/dev/null 2>err &&
+	grep "fatal:.*CONTENT_LENGTH" err
+'
+
+test_done
diff --git a/t/t5562/invoke-with-content-length.pl b/t/t5562/invoke-with-content-length.pl
new file mode 100755
index 0000000000..6c2aae7692
--- /dev/null
+++ b/t/t5562/invoke-with-content-length.pl
@@ -0,0 +1,37 @@
+#!/usr/bin/perl
+use 5.008;
+use strict;
+use warnings;
+
+my $body_filename = $ARGV[0];
+my @command = @ARGV[1 .. $#ARGV];
+
+# read data
+my $body_size = -s $body_filename;
+$ENV{"CONTENT_LENGTH"} = $body_size;
+open(my $body_fh, "<", $body_filename) or die "Cannot open $body_filename: $!";
+my $body_data;
+defined read($body_fh, $body_data, $body_size) or die "Cannot read $body_filename: $!";
+close($body_fh);
+
+my $exited = 0;
+$SIG{"CHLD"} = sub {
+        $exited = 1;
+};
+
+# write data
+my $pid = open(my $out, "|-", @command);
+{
+        # disable buffering at $out
+        my $old_selected = select;
+        select $out;
+        $| = 1;
+        select $old_selected;
+}
+print $out $body_data or die "Cannot write data: $!";
+
+sleep 60; # is interrupted by SIGCHLD
+if (!$exited) {
+        close($out);
+        die "Command did not exit after reading whole body";
+}
-- 
2.17.0.1185.g782057d875


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-07-27  3:48   ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
                       ` (2 preceding siblings ...)
  2018-07-27  3:48     ` [PATCH v9 3/3] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
@ 2018-07-27  3:50     ` Max Kirillov
  2018-07-27 17:49       ` Junio C Hamano
  3 siblings, 1 reply; 31+ messages in thread
From: Max Kirillov @ 2018-07-27  3:50 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Junio C Hamano, SZEDER Gábor, Torsten Bögershausen,
	Eric Sunshine, Jeff King, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

Only the 3rd patch has changed

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-07-27  3:50     ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
@ 2018-07-27 17:49       ` Junio C Hamano
  0 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2018-07-27 17:49 UTC (permalink / raw)
  To: Max Kirillov
  Cc: SZEDER Gábor, Torsten Bögershausen, Eric Sunshine,
	Jeff King, Florian Manschwetus, Chris Packham,
	Konstantin Khomoutov, git@vger.kernel.org

Max Kirillov <max@max630.net> writes:

> Only the 3rd patch has changed

Thanks.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v9 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-07-27  3:48     ` [PATCH v9 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
@ 2018-08-04  6:34       ` Duy Nguyen
  2018-08-04 11:28         ` Max Kirillov
  0 siblings, 1 reply; 31+ messages in thread
From: Duy Nguyen @ 2018-08-04  6:34 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Junio C Hamano, SZEDER Gábor, Torsten Bögershausen,
	Eric Sunshine, Jeff King, manschwetus, Chris Packham,
	Konstantin Khomoutov, Git Mailing List

On Fri, Jul 27, 2018 at 5:50 AM Max Kirillov <max@max630.net> wrote:
> -static void inflate_request(const char *prog_name, int out, int buffer_input)
> +static ssize_t read_request_fixed_len(int fd, ssize_t req_len, unsigned char **out)
> +{
> +       unsigned char *buf = NULL;
> +       ssize_t cnt = 0;
> +
> +       if (max_request_buffer < req_len) {
> +               die("request was larger than our maximum size (%lu): "
> +                   "%" PRIuMAX "; try setting GIT_HTTP_MAX_REQUEST_BUFFER",
> +                   max_request_buffer, (uintmax_t)req_len);

Please mark these strings for translation with _().
-- 
Duy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v9 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-08-04  6:34       ` Duy Nguyen
@ 2018-08-04 11:28         ` Max Kirillov
  2018-08-04 17:20           ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Max Kirillov @ 2018-08-04 11:28 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Max Kirillov, Junio C Hamano, SZEDER Gábor,
	Torsten Bögershausen, Eric Sunshine, Jeff King, manschwetus,
	Chris Packham, Konstantin Khomoutov, Git Mailing List

On Sat, Aug 04, 2018 at 08:34:08AM +0200, Duy Nguyen wrote:
> On Fri, Jul 27, 2018 at 5:50 AM Max Kirillov <max@max630.net> wrote:
>> +       if (max_request_buffer < req_len) {
>> +               die("request was larger than our maximum size (%lu): "
>> +                   "%" PRIuMAX "; try setting GIT_HTTP_MAX_REQUEST_BUFFER",
>> +                   max_request_buffer, (uintmax_t)req_len);
> 
> Please mark these strings for translation with _().

It has been discussed in [1]. Since it is not a local user
facing part, probably should not be translated.

[1] https://public-inbox.org/git/20180610150727.GE27650@jessie.local/

-- 
Max

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v9 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875
  2018-08-04 11:28         ` Max Kirillov
@ 2018-08-04 17:20           ` Junio C Hamano
  0 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2018-08-04 17:20 UTC (permalink / raw)
  To: Max Kirillov
  Cc: Duy Nguyen, SZEDER Gábor, Torsten Bögershausen,
	Eric Sunshine, Jeff King, manschwetus, Chris Packham,
	Konstantin Khomoutov, Git Mailing List

Max Kirillov <max@max630.net> writes:

> On Sat, Aug 04, 2018 at 08:34:08AM +0200, Duy Nguyen wrote:
>> On Fri, Jul 27, 2018 at 5:50 AM Max Kirillov <max@max630.net> wrote:
>>> +       if (max_request_buffer < req_len) {
>>> +               die("request was larger than our maximum size (%lu): "
>>> +                   "%" PRIuMAX "; try setting GIT_HTTP_MAX_REQUEST_BUFFER",
>>> +                   max_request_buffer, (uintmax_t)req_len);
>> 
>> Please mark these strings for translation with _().
>
> It has been discussed in [1]. Since it is not a local user
> facing part, probably should not be translated.
>
> [1] https://public-inbox.org/git/20180610150727.GE27650@jessie.local/

I'd support that design decision, FWIW.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2018-08-04 17:20 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-02 21:27 [PATCH v7 0/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
2018-06-02 21:27 ` [PATCH v7 1/2] " Max Kirillov
2018-06-04  3:44   ` Jeff King
2018-06-02 21:27 ` [PATCH v7 2/2] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
2018-06-04  4:31   ` Junio C Hamano
2018-06-04 17:06     ` Max Kirillov
2018-06-05  2:30       ` Ramsay Jones
2018-06-04  4:44   ` Jeff King
2018-06-04 22:18     ` Max Kirillov
2018-06-10 15:06       ` Max Kirillov
2018-06-11  9:18       ` Jeff King
2018-06-11  9:24         ` Jeff King
2018-06-10 15:07     ` Max Kirillov
2018-06-11  8:59       ` Jeff King
2018-06-10 15:05 ` [PATCH v8 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
2018-06-10 15:05   ` [PATCH v8 1/3] http-backend: cleanup writing to child process Max Kirillov
2018-06-10 15:05   ` [PATCH v8 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
2018-06-10 15:05   ` [PATCH v8 3/3] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
2018-07-25 12:14     ` SZEDER Gábor
2018-07-25 14:51       ` Max Kirillov
2018-07-25 18:41         ` SZEDER Gábor
2018-07-26  4:37           ` Max Kirillov
2018-07-27  3:48   ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
2018-07-27  3:48     ` [PATCH v9 1/3] http-backend: cleanup writing to child process Max Kirillov
2018-07-27  3:48     ` [PATCH v9 2/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
2018-08-04  6:34       ` Duy Nguyen
2018-08-04 11:28         ` Max Kirillov
2018-08-04 17:20           ` Junio C Hamano
2018-07-27  3:48     ` [PATCH v9 3/3] http-backend: respect CONTENT_LENGTH for receive-pack Max Kirillov
2018-07-27  3:50     ` [PATCH v9 0/3] http-backend: respect CONTENT_LENGTH as specified by rfc3875 Max Kirillov
2018-07-27 17:49       ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).