git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v6 0/6] Stream fd to clean filter; GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT with git_env_ulong()
@ 2014-08-26 15:23 Steffen Prohaska
  2014-08-26 15:23 ` [PATCH v6 1/6] convert: drop arguments other than 'path' from would_convert_to_git() Steffen Prohaska
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-26 15:23 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, peff, pclouds, john, schacon, Steffen Prohaska

Changes since v5:

 - Introduce and use git_env_ulong().
 - Change copy_fd() to not close input fd, which simplified changes to convert.
 - More detailed explanation why filter must be marked "required" in commit
   message of 6/6.
 - Style fixes.

Steffen Prohaska (6):
  convert: drop arguments other than 'path' from would_convert_to_git()
  Add git_env_ulong() to parse environment variable
  Change GIT_ALLOC_LIMIT check to use git_env_ulong()
  Introduce GIT_MMAP_LIMIT to allow testing expected mmap size
  Change copy_fd() to not close input fd
  convert: stream from fd to required clean filter to reduce used
    address space

 cache.h               |  1 +
 config.c              | 11 +++++++++++
 convert.c             | 55 +++++++++++++++++++++++++++++++++++++++++++++------
 convert.h             | 10 +++++++---
 copy.c                |  5 +----
 lockfile.c            |  3 +++
 sha1_file.c           | 47 ++++++++++++++++++++++++++++++++++++++++---
 t/t0021-conversion.sh | 24 +++++++++++++++++-----
 t/t1050-large.sh      |  2 +-
 wrapper.c             | 15 +++++++-------
 10 files changed, 144 insertions(+), 29 deletions(-)

-- 
2.1.0.8.gf3a29c8

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v6 1/6] convert: drop arguments other than 'path' from would_convert_to_git()
  2014-08-26 15:23 [PATCH v6 0/6] Stream fd to clean filter; GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT with git_env_ulong() Steffen Prohaska
@ 2014-08-26 15:23 ` Steffen Prohaska
  2014-08-26 15:23 ` [PATCH v6 2/6] Add git_env_ulong() to parse environment variable Steffen Prohaska
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-26 15:23 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, peff, pclouds, john, schacon, Steffen Prohaska

It is only the path that matters in the decision whether to filter or
not.  Clarify this by making path the single argument of
would_convert_to_git().

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
 convert.h   | 5 ++---
 sha1_file.c | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/convert.h b/convert.h
index 0c2143c..c638b33 100644
--- a/convert.h
+++ b/convert.h
@@ -40,10 +40,9 @@ extern int convert_to_working_tree(const char *path, const char *src,
 				   size_t len, struct strbuf *dst);
 extern int renormalize_buffer(const char *path, const char *src, size_t len,
 			      struct strbuf *dst);
-static inline int would_convert_to_git(const char *path, const char *src,
-				       size_t len, enum safe_crlf checksafe)
+static inline int would_convert_to_git(const char *path)
 {
-	return convert_to_git(path, src, len, NULL, checksafe);
+	return convert_to_git(path, NULL, 0, NULL, 0);
 }
 
 /*****************************************************************
diff --git a/sha1_file.c b/sha1_file.c
index 3f70b1d..00c07f2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3144,7 +3144,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
 	if (!S_ISREG(st->st_mode))
 		ret = index_pipe(sha1, fd, type, path, flags);
 	else if (size <= big_file_threshold || type != OBJ_BLOB ||
-		 (path && would_convert_to_git(path, NULL, 0, 0)))
+		 (path && would_convert_to_git(path)))
 		ret = index_core(sha1, fd, size, type, path, flags);
 	else
 		ret = index_stream(sha1, fd, size, type, path, flags);
-- 
2.1.0.8.gf3a29c8

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-26 15:23 [PATCH v6 0/6] Stream fd to clean filter; GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT with git_env_ulong() Steffen Prohaska
  2014-08-26 15:23 ` [PATCH v6 1/6] convert: drop arguments other than 'path' from would_convert_to_git() Steffen Prohaska
@ 2014-08-26 15:23 ` Steffen Prohaska
  2014-08-26 18:21   ` Jeff King
  2014-08-26 15:23 ` [PATCH v6 3/6] Change GIT_ALLOC_LIMIT check to use git_env_ulong() Steffen Prohaska
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-26 15:23 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, peff, pclouds, john, schacon, Steffen Prohaska

The new function will be used to parse GIT_MMAP_LIMIT and
GIT_ALLOC_LIMIT.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
 cache.h  |  1 +
 config.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/cache.h b/cache.h
index fcb511d..b820b6a 100644
--- a/cache.h
+++ b/cache.h
@@ -1318,6 +1318,7 @@ extern int git_config_rename_section_in_file(const char *, const char *, const c
 extern const char *git_etc_gitconfig(void);
 extern int check_repository_format_version(const char *var, const char *value, void *cb);
 extern int git_env_bool(const char *, int);
+extern unsigned long git_env_ulong(const char *, unsigned long);
 extern int git_config_system(void);
 extern int config_error_nonbool(const char *);
 #if defined(__GNUC__)
diff --git a/config.c b/config.c
index 058505c..178721f 100644
--- a/config.c
+++ b/config.c
@@ -1122,6 +1122,17 @@ int git_env_bool(const char *k, int def)
 	return v ? git_config_bool(k, v) : def;
 }
 
+/*
+ * Use default if environment variable is unset or empty string.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+	const char *v = getenv(k);
+	if (v && *v && !git_parse_ulong(v, &val))
+		die("failed to parse %s", k);
+	return val;
+}
+
 int git_config_system(void)
 {
 	return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
-- 
2.1.0.8.gf3a29c8

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v6 3/6] Change GIT_ALLOC_LIMIT check to use git_env_ulong()
  2014-08-26 15:23 [PATCH v6 0/6] Stream fd to clean filter; GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT with git_env_ulong() Steffen Prohaska
  2014-08-26 15:23 ` [PATCH v6 1/6] convert: drop arguments other than 'path' from would_convert_to_git() Steffen Prohaska
  2014-08-26 15:23 ` [PATCH v6 2/6] Add git_env_ulong() to parse environment variable Steffen Prohaska
@ 2014-08-26 15:23 ` Steffen Prohaska
  2014-08-26 15:23 ` [PATCH v6 4/6] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size Steffen Prohaska
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-26 15:23 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, peff, pclouds, john, schacon, Steffen Prohaska

GIT_ALLOC_LIMIT limits xmalloc()'s size, which is of type size_t.
Better use git_env_ulong() to parse the environment variable, so that
the postfixes 'k', 'm', and 'g' can be used; and use size_t to store the
limit for consistency.  The change to size_t has no direct practical
impact, because we use GIT_ALLOC_LIMIT to test small sizes.

The cast of size in the call to die() is changed to uintmax_t to match
the format string PRIuMAX.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
 t/t1050-large.sh |  2 +-
 wrapper.c        | 15 ++++++++-------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/t/t1050-large.sh b/t/t1050-large.sh
index aea4936..e7657ab 100755
--- a/t/t1050-large.sh
+++ b/t/t1050-large.sh
@@ -13,7 +13,7 @@ test_expect_success setup '
 	echo X | dd of=large2 bs=1k seek=2000 &&
 	echo X | dd of=large3 bs=1k seek=2000 &&
 	echo Y | dd of=huge bs=1k seek=2500 &&
-	GIT_ALLOC_LIMIT=1500 &&
+	GIT_ALLOC_LIMIT=1500k &&
 	export GIT_ALLOC_LIMIT
 '
 
diff --git a/wrapper.c b/wrapper.c
index bc1bfb8..c5204f7 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -11,14 +11,15 @@ static void (*try_to_free_routine)(size_t size) = do_nothing;
 
 static void memory_limit_check(size_t size)
 {
-	static int limit = -1;
-	if (limit == -1) {
-		const char *env = getenv("GIT_ALLOC_LIMIT");
-		limit = env ? atoi(env) * 1024 : 0;
+	static size_t limit = 0;
+	if (!limit) {
+		limit = git_env_ulong("GIT_ALLOC_LIMIT", 0);
+		if (!limit)
+			limit = SIZE_MAX;
 	}
-	if (limit && size > limit)
-		die("attempting to allocate %"PRIuMAX" over limit %d",
-		    (intmax_t)size, limit);
+	if (size > limit)
+		die("attempting to allocate %"PRIuMAX" over limit %"PRIuMAX,
+		    (uintmax_t)size, (uintmax_t)limit);
 }
 
 try_to_free_t set_try_to_free_routine(try_to_free_t routine)
-- 
2.1.0.8.gf3a29c8

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v6 4/6] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size
  2014-08-26 15:23 [PATCH v6 0/6] Stream fd to clean filter; GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT with git_env_ulong() Steffen Prohaska
                   ` (2 preceding siblings ...)
  2014-08-26 15:23 ` [PATCH v6 3/6] Change GIT_ALLOC_LIMIT check to use git_env_ulong() Steffen Prohaska
@ 2014-08-26 15:23 ` Steffen Prohaska
  2014-08-26 15:23 ` [PATCH v6 5/6] Change copy_fd() to not close input fd Steffen Prohaska
  2014-08-26 15:23 ` [PATCH v6 6/6] convert: stream from fd to required clean filter to reduce used address space Steffen Prohaska
  5 siblings, 0 replies; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-26 15:23 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, peff, pclouds, john, schacon, Steffen Prohaska

Similar to testing expectations about malloc with GIT_ALLOC_LIMIT (see
commit d41489 and previous commit), it can be useful to test
expectations about mmap.

This introduces a new environment variable GIT_MMAP_LIMIT to limit the
largest allowed mmap length.  xmmap() is modified to check the limit.
Together with GIT_ALLOC_LIMIT tests can now easily confirm expectations
about memory consumption.

GIT_MMAP_LIMIT will be used in the next commit to test that data will be
streamed to an external filter without mmaping the entire file.

[commit d41489]: d41489a6424308dc9a0409bc2f6845aa08bd4f7d Add more large
    blob test cases

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
 sha1_file.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 00c07f2..d9b5157 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -663,10 +663,26 @@ void release_pack_memory(size_t need)
 		; /* nothing */
 }
 
+static void mmap_limit_check(size_t length)
+{
+	static size_t limit = 0;
+	if (!limit) {
+		limit = git_env_ulong("GIT_MMAP_LIMIT", 0);
+		if (!limit)
+			limit = SIZE_MAX;
+	}
+	if (length > limit)
+		die("attempting to mmap %"PRIuMAX" over limit %"PRIuMAX,
+		    (uintmax_t)length, (uintmax_t)limit);
+}
+
 void *xmmap(void *start, size_t length,
 	int prot, int flags, int fd, off_t offset)
 {
-	void *ret = mmap(start, length, prot, flags, fd, offset);
+	void *ret;
+
+	mmap_limit_check(length);
+	ret = mmap(start, length, prot, flags, fd, offset);
 	if (ret == MAP_FAILED) {
 		if (!length)
 			return NULL;
-- 
2.1.0.8.gf3a29c8

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v6 5/6] Change copy_fd() to not close input fd
  2014-08-26 15:23 [PATCH v6 0/6] Stream fd to clean filter; GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT with git_env_ulong() Steffen Prohaska
                   ` (3 preceding siblings ...)
  2014-08-26 15:23 ` [PATCH v6 4/6] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size Steffen Prohaska
@ 2014-08-26 15:23 ` Steffen Prohaska
  2014-08-26 17:10   ` Junio C Hamano
  2014-08-26 18:29   ` Jeff King
  2014-08-26 15:23 ` [PATCH v6 6/6] convert: stream from fd to required clean filter to reduce used address space Steffen Prohaska
  5 siblings, 2 replies; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-26 15:23 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, peff, pclouds, john, schacon, Steffen Prohaska

The caller opened the fd, so it should be responsible for closing it.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
 copy.c     | 5 +----
 lockfile.c | 3 +++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/copy.c b/copy.c
index a7f58fd..d0a1d82 100644
--- a/copy.c
+++ b/copy.c
@@ -10,7 +10,6 @@ int copy_fd(int ifd, int ofd)
 			break;
 		if (len < 0) {
 			int read_error = errno;
-			close(ifd);
 			return error("copy-fd: read returned %s",
 				     strerror(read_error));
 		}
@@ -21,17 +20,14 @@ int copy_fd(int ifd, int ofd)
 				len -= written;
 			}
 			else if (!written) {
-				close(ifd);
 				return error("copy-fd: write returned 0");
 			} else {
 				int write_error = errno;
-				close(ifd);
 				return error("copy-fd: write returned %s",
 					     strerror(write_error));
 			}
 		}
 	}
-	close(ifd);
 	return 0;
 }
 
@@ -60,6 +56,7 @@ int copy_file(const char *dst, const char *src, int mode)
 		return fdo;
 	}
 	status = copy_fd(fdi, fdo);
+	close(fdi);
 	if (close(fdo) != 0)
 		return error("%s: close error: %s", dst, strerror(errno));
 
diff --git a/lockfile.c b/lockfile.c
index 2564a7f..2448d30 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -224,8 +224,11 @@ int hold_lock_file_for_append(struct lock_file *lk, const char *path, int flags)
 	} else if (copy_fd(orig_fd, fd)) {
 		if (flags & LOCK_DIE_ON_ERROR)
 			exit(128);
+		close(orig_fd);
 		close(fd);
 		return -1;
+	} else {
+		close(orig_fd);
 	}
 	return fd;
 }
-- 
2.1.0.8.gf3a29c8

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v6 6/6] convert: stream from fd to required clean filter to reduce used address space
  2014-08-26 15:23 [PATCH v6 0/6] Stream fd to clean filter; GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT with git_env_ulong() Steffen Prohaska
                   ` (4 preceding siblings ...)
  2014-08-26 15:23 ` [PATCH v6 5/6] Change copy_fd() to not close input fd Steffen Prohaska
@ 2014-08-26 15:23 ` Steffen Prohaska
  5 siblings, 0 replies; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-26 15:23 UTC (permalink / raw
  To: Junio C Hamano; +Cc: git, peff, pclouds, john, schacon, Steffen Prohaska

The data is streamed to the filter process anyway.  Better avoid mapping
the file if possible.  This is especially useful if a clean filter
reduces the size, for example if it computes a sha1 for binary data,
like git media.  The file size that the previous implementation could
handle was limited by the available address space; large files for
example could not be handled with (32-bit) msysgit.  The new
implementation can filter files of any size as long as the filter output
is small enough.

The new code path is only taken if the filter is required.  The filter
consumes data directly from the fd.  If it fails, the original data is
not immediately available.  The condition can easily be handled as
a fatal error, which is expected for a required filter anyway.

If the filter was not required, the condition would need to be handled
in a different way, like seeking to 0 and reading the data.  But this
would require more restructuring of the code and is probably not worth
it.  The obvious approach of falling back to reading all data would not
help achieving the main purpose of this patch, which is to handle large
files with limited address space.  If reading all data is an option, we
can simply take the old code path right away and mmap the entire file.

The environment variable GIT_MMAP_LIMIT, which has been introduced in
a previous commit is used to test that the expected code path is taken.
A related test that exercises required filters is modified to verify
that the data actually has been modified on its way from the file system
to the object store.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
 convert.c             | 55 +++++++++++++++++++++++++++++++++++++++++++++------
 convert.h             |  5 +++++
 sha1_file.c           | 27 ++++++++++++++++++++++++-
 t/t0021-conversion.sh | 24 +++++++++++++++++-----
 4 files changed, 99 insertions(+), 12 deletions(-)

diff --git a/convert.c b/convert.c
index cb5fbb4..677d339 100644
--- a/convert.c
+++ b/convert.c
@@ -312,11 +312,12 @@ static int crlf_to_worktree(const char *path, const char *src, size_t len,
 struct filter_params {
 	const char *src;
 	unsigned long size;
+	int fd;
 	const char *cmd;
 	const char *path;
 };
 
-static int filter_buffer(int in, int out, void *data)
+static int filter_buffer_or_fd(int in, int out, void *data)
 {
 	/*
 	 * Spawn cmd and feed the buffer contents through its stdin.
@@ -355,7 +356,12 @@ static int filter_buffer(int in, int out, void *data)
 
 	sigchain_push(SIGPIPE, SIG_IGN);
 
-	write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
+	if (params->src) {
+		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
+	} else {
+		write_err = copy_fd(params->fd, child_process.in);
+	}
+
 	if (close(child_process.in))
 		write_err = 1;
 	if (write_err)
@@ -371,7 +377,7 @@ static int filter_buffer(int in, int out, void *data)
 	return (write_err || status);
 }
 
-static int apply_filter(const char *path, const char *src, size_t len,
+static int apply_filter(const char *path, const char *src, size_t len, int fd,
                         struct strbuf *dst, const char *cmd)
 {
 	/*
@@ -392,11 +398,12 @@ static int apply_filter(const char *path, const char *src, size_t len,
 		return 1;
 
 	memset(&async, 0, sizeof(async));
-	async.proc = filter_buffer;
+	async.proc = filter_buffer_or_fd;
 	async.data = &params;
 	async.out = -1;
 	params.src = src;
 	params.size = len;
+	params.fd = fd;
 	params.cmd = cmd;
 	params.path = path;
 
@@ -747,6 +754,25 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 	}
 }
 
+int would_convert_to_git_filter_fd(const char *path)
+{
+	struct conv_attrs ca;
+
+	convert_attrs(&ca, path);
+	if (!ca.drv)
+		return 0;
+
+	/*
+	 * Apply a filter to an fd only if the filter is required to succeed.
+	 * We must die if the filter fails, because the original data before
+	 * filtering is not available.
+	 */
+	if (!ca.drv->required)
+		return 0;
+
+	return apply_filter(path, NULL, 0, -1, NULL, ca.drv->clean);
+}
+
 int convert_to_git(const char *path, const char *src, size_t len,
                    struct strbuf *dst, enum safe_crlf checksafe)
 {
@@ -761,7 +787,7 @@ int convert_to_git(const char *path, const char *src, size_t len,
 		required = ca.drv->required;
 	}
 
-	ret |= apply_filter(path, src, len, dst, filter);
+	ret |= apply_filter(path, src, len, -1, dst, filter);
 	if (!ret && required)
 		die("%s: clean filter '%s' failed", path, ca.drv->name);
 
@@ -778,6 +804,23 @@ int convert_to_git(const char *path, const char *src, size_t len,
 	return ret | ident_to_git(path, src, len, dst, ca.ident);
 }
 
+void convert_to_git_filter_fd(const char *path, int fd, struct strbuf *dst,
+			      enum safe_crlf checksafe)
+{
+	struct conv_attrs ca;
+	convert_attrs(&ca, path);
+
+	assert(ca.drv);
+	assert(ca.drv->clean);
+
+	if (!apply_filter(path, NULL, 0, fd, dst, ca.drv->clean))
+		die("%s: clean filter '%s' failed", path, ca.drv->name);
+
+	ca.crlf_action = input_crlf_action(ca.crlf_action, ca.eol_attr);
+	crlf_to_git(path, dst->buf, dst->len, dst, ca.crlf_action, checksafe);
+	ident_to_git(path, dst->buf, dst->len, dst, ca.ident);
+}
+
 static int convert_to_working_tree_internal(const char *path, const char *src,
 					    size_t len, struct strbuf *dst,
 					    int normalizing)
@@ -811,7 +854,7 @@ static int convert_to_working_tree_internal(const char *path, const char *src,
 		}
 	}
 
-	ret_filter = apply_filter(path, src, len, dst, filter);
+	ret_filter = apply_filter(path, src, len, -1, dst, filter);
 	if (!ret_filter && required)
 		die("%s: smudge filter %s failed", path, ca.drv->name);
 
diff --git a/convert.h b/convert.h
index c638b33..d9d853c 100644
--- a/convert.h
+++ b/convert.h
@@ -44,6 +44,11 @@ static inline int would_convert_to_git(const char *path)
 {
 	return convert_to_git(path, NULL, 0, NULL, 0);
 }
+/* Precondition: would_convert_to_git_filter_fd(path) == true */
+extern void convert_to_git_filter_fd(const char *path, int fd,
+				     struct strbuf *dst,
+				     enum safe_crlf checksafe);
+extern int would_convert_to_git_filter_fd(const char *path);
 
 /*****************************************************************
  *
diff --git a/sha1_file.c b/sha1_file.c
index d9b5157..423ec64 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3090,6 +3090,29 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
 	return ret;
 }
 
+static int index_stream_convert_blob(unsigned char *sha1, int fd,
+				     const char *path, unsigned flags)
+{
+	int ret;
+	const int write_object = flags & HASH_WRITE_OBJECT;
+	struct strbuf sbuf = STRBUF_INIT;
+
+	assert(path);
+	assert(would_convert_to_git_filter_fd(path));
+
+	convert_to_git_filter_fd(path, fd, &sbuf,
+				 write_object ? safe_crlf : SAFE_CRLF_FALSE);
+
+	if (write_object)
+		ret = write_sha1_file(sbuf.buf, sbuf.len, typename(OBJ_BLOB),
+				      sha1);
+	else
+		ret = hash_sha1_file(sbuf.buf, sbuf.len, typename(OBJ_BLOB),
+				     sha1);
+	strbuf_release(&sbuf);
+	return ret;
+}
+
 static int index_pipe(unsigned char *sha1, int fd, enum object_type type,
 		      const char *path, unsigned flags)
 {
@@ -3157,7 +3180,9 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
 	int ret;
 	size_t size = xsize_t(st->st_size);
 
-	if (!S_ISREG(st->st_mode))
+	if (type == OBJ_BLOB && path && would_convert_to_git_filter_fd(path))
+		ret = index_stream_convert_blob(sha1, fd, path, flags);
+	else if (!S_ISREG(st->st_mode))
 		ret = index_pipe(sha1, fd, type, path, flags);
 	else if (size <= big_file_threshold || type != OBJ_BLOB ||
 		 (path && would_convert_to_git(path)))
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index f890c54..82ebbff 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -153,17 +153,23 @@ test_expect_success 'filter shell-escaped filenames' '
 	:
 '
 
-test_expect_success 'required filter success' '
-	git config filter.required.smudge cat &&
-	git config filter.required.clean cat &&
+test_expect_success 'required filter should filter data' '
+	git config filter.required.smudge ./rot13.sh &&
+	git config filter.required.clean ./rot13.sh &&
 	git config filter.required.required true &&
 
 	echo "*.r filter=required" >.gitattributes &&
 
-	echo test >test.r &&
+	cat test.o >test.r &&
 	git add test.r &&
+
 	rm -f test.r &&
-	git checkout -- test.r
+	git checkout -- test.r &&
+	cmp test.o test.r &&
+
+	./rot13.sh <test.o >expected &&
+	git cat-file blob :test.r >actual &&
+	cmp expected actual
 '
 
 test_expect_success 'required filter smudge failure' '
@@ -189,6 +195,14 @@ test_expect_success 'required filter clean failure' '
 	echo test >test.fc &&
 	test_must_fail git add test.fc
 '
+test_expect_success \
+'filtering large input to small output should use little memory' '
+	git config filter.devnull.clean "cat >/dev/null" &&
+	git config filter.devnull.required true &&
+	for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&
+	echo "30MB filter=devnull" >.gitattributes &&
+	GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
+'
 
 test_expect_success EXPENSIVE 'filter large file' '
 	git config filter.largefile.smudge cat &&
-- 
2.1.0.8.gf3a29c8

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 5/6] Change copy_fd() to not close input fd
  2014-08-26 15:23 ` [PATCH v6 5/6] Change copy_fd() to not close input fd Steffen Prohaska
@ 2014-08-26 17:10   ` Junio C Hamano
  2014-08-26 18:29   ` Jeff King
  1 sibling, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2014-08-26 17:10 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: git, peff, pclouds, john, schacon

Steffen Prohaska <prohaska@zib.de> writes:

> The caller opened the fd, so it should be responsible for closing it.

Hmmmm, I am not sure what the benefit of such a dogmatism.  This
function consumes all that is useful in the fd, and there is nothing
interesting that can be do further on it, no?

Ah, wait.  The caller could choose to rewind and reuse the contents,
and it is very selfish of this function to unilaterally declare that
once you give an fd to it you cannot do anything with it laster.

So I think this is a good change, but the justification is not
quite.  It is not "the caller should be responsible"; it is more
about "the callee did not open it, does not own it, and should allow
the caller, if it chooses, reuse it by seeking after the callee is
done."

>> Subject: Re: [PATCH v6 5/6] Change copy_fd() to not close input fd

Let's follow the "<area>: <description>" convention here, too, e.g.

    copy_fd(): allow callers to work on input fd after it returns

or something.

Thanks.

>  copy.c     | 5 +----
>  lockfile.c | 3 +++
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/copy.c b/copy.c
> index a7f58fd..d0a1d82 100644
> --- a/copy.c
> +++ b/copy.c
> @@ -10,7 +10,6 @@ int copy_fd(int ifd, int ofd)
>  			break;
>  		if (len < 0) {
>  			int read_error = errno;
> -			close(ifd);
>  			return error("copy-fd: read returned %s",
>  				     strerror(read_error));
>  		}
> @@ -21,17 +20,14 @@ int copy_fd(int ifd, int ofd)
>  				len -= written;
>  			}
>  			else if (!written) {
> -				close(ifd);
>  				return error("copy-fd: write returned 0");
>  			} else {
>  				int write_error = errno;
> -				close(ifd);
>  				return error("copy-fd: write returned %s",
>  					     strerror(write_error));
>  			}
>  		}
>  	}
> -	close(ifd);
>  	return 0;
>  }
>  
> @@ -60,6 +56,7 @@ int copy_file(const char *dst, const char *src, int mode)
>  		return fdo;
>  	}
>  	status = copy_fd(fdi, fdo);
> +	close(fdi);
>  	if (close(fdo) != 0)
>  		return error("%s: close error: %s", dst, strerror(errno));
>  
> diff --git a/lockfile.c b/lockfile.c
> index 2564a7f..2448d30 100644
> --- a/lockfile.c
> +++ b/lockfile.c
> @@ -224,8 +224,11 @@ int hold_lock_file_for_append(struct lock_file *lk, const char *path, int flags)
>  	} else if (copy_fd(orig_fd, fd)) {
>  		if (flags & LOCK_DIE_ON_ERROR)
>  			exit(128);
> +		close(orig_fd);
>  		close(fd);
>  		return -1;
> +	} else {
> +		close(orig_fd);
>  	}
>  	return fd;
>  }

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-26 15:23 ` [PATCH v6 2/6] Add git_env_ulong() to parse environment variable Steffen Prohaska
@ 2014-08-26 18:21   ` Jeff King
  2014-08-26 20:20     ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff King @ 2014-08-26 18:21 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: Junio C Hamano, git, pclouds, john, schacon

On Tue, Aug 26, 2014 at 05:23:21PM +0200, Steffen Prohaska wrote:

> +/*
> + * Use default if environment variable is unset or empty string.
> + */
> +unsigned long git_env_ulong(const char *k, unsigned long val)
> +{
> +	const char *v = getenv(k);
> +	if (v && *v && !git_parse_ulong(v, &val))
> +		die("failed to parse %s", k);
> +	return val;
> +}

I think the "empty string" behavior here is sensible. I notice that
git_env_bool is not so careful. I think we should probably do this
(independent of your series):

-- >8 --
Subject: git_env_bool: treat empty string as "not set"

If an environment variable we treat as a boolean is not set,
we use some default value provided by the caller. If it is
set but is the empty string, however, we treat it as
"false". This can be rather confusing, as it is easy to set
the variable to the empty string in the shell (e.g., by
calling GIT_SMART_HTTP= instead of "unset").

Instead, let's treat unset and empty variables the same.
This should not have any negative backwards-compatibility
consequences, because:

  1. The existing behavior was confusing and undocumented in
     the first place.

  2. For most variables, the default _is_ false, and so this
     change is a noop. The only affected variables are
     GIT_IMPLICIT_WORK_TREE (which is undocumented and
     internally always set to "0" or "1") and GIT_SMART_HTTP
     (which is also undocumented, and we use only for
     testing).

Since there won't be any fallout with the current variables,
this is a good time to make the switch (before any other
variables are added).

Signed-off-by: Jeff King <peff@peff.net>
---
 config.c                    | 2 +-
 t/t5551-http-fetch-smart.sh | 7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index 058505c..7bf0704 100644
--- a/config.c
+++ b/config.c
@@ -1119,7 +1119,7 @@ const char *git_etc_gitconfig(void)
 int git_env_bool(const char *k, int def)
 {
 	const char *v = getenv(k);
-	return v ? git_config_bool(k, v) : def;
+	return v && *v ? git_config_bool(k, v) : def;
 }
 
 int git_config_system(void)
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 6cbc12d..831f9e4 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -168,6 +168,13 @@ test_expect_success 'GIT_SMART_HTTP can disable smart http' '
 	 test_must_fail git fetch)
 '
 
+test_expect_success 'empty GIT_SMART_HTTP leaves smart http enabled' '
+	(GIT_SMART_HTTP= &&
+	 export GIT_SMART_HTTP &&
+	 cd clone &&
+	 git fetch)
+'
+
 test_expect_success 'invalid Content-Type rejected' '
 	test_must_fail git clone $HTTPD_URL/broken_smart/repo.git 2>actual
 	grep "not valid:" actual
-- 
2.1.0.346.ga0367b9

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 5/6] Change copy_fd() to not close input fd
  2014-08-26 15:23 ` [PATCH v6 5/6] Change copy_fd() to not close input fd Steffen Prohaska
  2014-08-26 17:10   ` Junio C Hamano
@ 2014-08-26 18:29   ` Jeff King
  2014-08-28 15:37     ` Steffen Prohaska
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff King @ 2014-08-26 18:29 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: Junio C Hamano, git, pclouds, john, schacon

On Tue, Aug 26, 2014 at 05:23:24PM +0200, Steffen Prohaska wrote:

> The caller opened the fd, so it should be responsible for closing it.
> 
> Signed-off-by: Steffen Prohaska <prohaska@zib.de>
> ---
>  copy.c     | 5 +----
>  lockfile.c | 3 +++
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/copy.c b/copy.c
> index a7f58fd..d0a1d82 100644
> --- a/copy.c
> +++ b/copy.c
> @@ -10,7 +10,6 @@ int copy_fd(int ifd, int ofd)
>  			break;
>  		if (len < 0) {
>  			int read_error = errno;
> -			close(ifd);
>  			return error("copy-fd: read returned %s",
>  				     strerror(read_error));
>  		}

This saved errno is not necessary anymore (the problem was that close()
clobbered the error in the original code). It can go away, and we can
even drop the curly braces.

> @@ -21,17 +20,14 @@ int copy_fd(int ifd, int ofd)
>  				len -= written;
>  			}
>  			else if (!written) {
> -				close(ifd);
>  				return error("copy-fd: write returned 0");
>  			} else {
>  				int write_error = errno;
> -				close(ifd);
>  				return error("copy-fd: write returned %s",
>  					     strerror(write_error));
>  			}
>  		}

Ditto here. Actually, isn't this whole write just a reimplementation of
write_in_full? The latter treats a return of 0 as ENOSPC rather than
using a custom message, but I think that is sane.

All together:

---
 copy.c | 28 +++++-----------------------
 1 file changed, 5 insertions(+), 23 deletions(-)

diff --git a/copy.c b/copy.c
index a7f58fd..53a9ece 100644
--- a/copy.c
+++ b/copy.c
@@ -4,34 +4,16 @@ int copy_fd(int ifd, int ofd)
 {
 	while (1) {
 		char buffer[8192];
-		char *buf = buffer;
 		ssize_t len = xread(ifd, buffer, sizeof(buffer));
 		if (!len)
 			break;
-		if (len < 0) {
-			int read_error = errno;
-			close(ifd);
+		if (len < 0)
 			return error("copy-fd: read returned %s",
-				     strerror(read_error));
-		}
-		while (len) {
-			int written = xwrite(ofd, buf, len);
-			if (written > 0) {
-				buf += written;
-				len -= written;
-			}
-			else if (!written) {
-				close(ifd);
-				return error("copy-fd: write returned 0");
-			} else {
-				int write_error = errno;
-				close(ifd);
-				return error("copy-fd: write returned %s",
-					     strerror(write_error));
-			}
-		}
+				     strerror(errno));
+		if (write_in_full(ofd, buffer, len) < 0)
+			return error("copy-fd: write returned %s",
+				     strerror(errno));
 	}
-	close(ifd);
 	return 0;
 }
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-26 18:21   ` Jeff King
@ 2014-08-26 20:20     ` Junio C Hamano
  2014-08-26 20:31       ` Jeff King
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2014-08-26 20:20 UTC (permalink / raw
  To: Jeff King; +Cc: Steffen Prohaska, git, pclouds, john, schacon

Jeff King <peff@peff.net> writes:

> On Tue, Aug 26, 2014 at 05:23:21PM +0200, Steffen Prohaska wrote:
>
>> +/*
>> + * Use default if environment variable is unset or empty string.
>> + */
>> +unsigned long git_env_ulong(const char *k, unsigned long val)
>> +{
>> +	const char *v = getenv(k);
>> +	if (v && *v && !git_parse_ulong(v, &val))
>> +		die("failed to parse %s", k);
>> +	return val;
>> +}
>
> I think the "empty string" behavior here is sensible. I notice that
> git_env_bool is not so careful. I think we should probably do this
> (independent of your series):
>
> -- >8 --
> Subject: git_env_bool: treat empty string as "not set"
>
> If an environment variable we treat as a boolean is not set,
> we use some default value provided by the caller. If it is
> set but is the empty string, however, we treat it as
> "false". This can be rather confusing, as it is easy to set
> the variable to the empty string in the shell (e.g., by
> calling GIT_SMART_HTTP= instead of "unset").

I think different people have different confusion criteria.
To me, these two are very different operations:

    $ VAR=
    $ unset VAR

I think it boils down to that I see that the distance between "unset
vs set to empty" is far larger than the distance between "empty vs
false".  You probably see these two distances the other way,
i.e. "set to empty is almost like unset" and "empty is not a valid
way to say false".

Due to this difference, the new test confused me and had me read it
three times.

> +test_expect_success 'empty GIT_SMART_HTTP leaves smart http enabled' '
> +	(GIT_SMART_HTTP= &&
> +	 export GIT_SMART_HTTP &&
> +	 cd clone &&
> +	 git fetch)
> +'

The test before this one explicitly sets GIT_SMART_HTTP to "0" and
expects the fetch to fail.  It is sensible to you because "0" is a
lot more explicit "false" than an empty string to you, and you
equate an empty and unset, hence the new one should succeed.

But it looks nonsensical to me that the new one expects to succeed,
because "0" and an empty string are both valid way to say "false"
to me, and it should behave the same way as the "0" one.

I view the *v check before git_parse_ulong() being unnecessarily
defensive to avoid triggering "die()".  An empty string is obviously
not a number (somebody could argue that it is the same as zero,
though), but nevertheless the user _is_ telling us to use that value
by setting and exporting the variable.  If we cannot parse it,
barfing is what the user would appreciate.

So, I am not sure the patch in the message I am responding to, and I
am not sure about that *v check in Steffen's patch, either.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-26 20:20     ` Junio C Hamano
@ 2014-08-26 20:31       ` Jeff King
  2014-08-26 21:54         ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff King @ 2014-08-26 20:31 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Steffen Prohaska, git, pclouds, john, schacon

On Tue, Aug 26, 2014 at 01:20:53PM -0700, Junio C Hamano wrote:

> I think different people have different confusion criteria.
> To me, these two are very different operations:
> 
>     $ VAR=
>     $ unset VAR
> 
> I think it boils down to that I see that the distance between "unset
> vs set to empty" is far larger than the distance between "empty vs
> false".  You probably see these two distances the other way,
> i.e. "set to empty is almost like unset" and "empty is not a valid
> way to say false".
> 
> Due to this difference, the new test confused me and had me read it
> three times.

I agree that it is rather a subjective decision.

> So, I am not sure the patch in the message I am responding to, and I
> am not sure about that *v check in Steffen's patch, either.

If it is truly "some people prefer it one way and some the other", I am
not sure if we should leave it as-is (that is preferring one way). The
middle ground would be to die(). That does not seem super-friendly, but
then we would also die with GIT_SMART_HTTP=foobar, so perhaps it is not
unreasonable to just consider it a syntax error.

I dunno. I can live with leaving it as-is. Certainly the existing
behavior is not what I expected, but it is not like it came up in the
real world (and I would not expect it to do so often). And it is
consistent with the config, which treats:

  [foo]
  bar =

as boolean false. That _also_ seems weird to me, but that is not
something I think we can easily change or outlaw at this point anyway.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-26 20:31       ` Jeff King
@ 2014-08-26 21:54         ` Junio C Hamano
  2014-08-27  4:46           ` Jeff King
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2014-08-26 21:54 UTC (permalink / raw
  To: Jeff King; +Cc: Steffen Prohaska, git, pclouds, john, schacon

Jeff King <peff@peff.net> writes:

> If it is truly "some people prefer it one way and some the other", I am
> not sure if we should leave it as-is (that is preferring one way). 

A worse position is to have git_env_bool() that says "empty is
false" and add a new git_env_ulong() that says "empty is unset".

We should pick one or the other and use it for both.

As you pointed out in the later part of the message, an empty string
is a valid way to spell "false" in the config subsystem since the
beginning at 17712991 (Add ".git/config" file parser, 2005-10-10);
consistency with it probably is sensible.

But perhaps my brain is rotten with too much Perl and Python X-<.
I do not know where Linus picked up that, though ;-)

> The middle ground would be to die(). That does not seem super-friendly, but
> then we would also die with GIT_SMART_HTTP=foobar, so perhaps it is not
> unreasonable to just consider it a syntax error.

Hmm, I am not sure if dying is better.  Unless we decide to make
empty string no longer false everywhere and warn now and then later
die as part of a 3.0 transition plan or something, that is.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-26 21:54         ` Junio C Hamano
@ 2014-08-27  4:46           ` Jeff King
  2014-08-27 14:47             ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff King @ 2014-08-27  4:46 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Steffen Prohaska, git, pclouds, john, schacon

On Tue, Aug 26, 2014 at 02:54:11PM -0700, Junio C Hamano wrote:

> A worse position is to have git_env_bool() that says "empty is
> false" and add a new git_env_ulong() that says "empty is unset".
> 
> We should pick one or the other and use it for both.

Yeah, I agree they should probably behave the same.

> > The middle ground would be to die(). That does not seem super-friendly, but
> > then we would also die with GIT_SMART_HTTP=foobar, so perhaps it is not
> > unreasonable to just consider it a syntax error.
> 
> Hmm, I am not sure if dying is better.  Unless we decide to make
> empty string no longer false everywhere and warn now and then later
> die as part of a 3.0 transition plan or something, that is.

I think it is better in the sense that while it may be unexpected, it
does not unexpectedly do something that the user cannot easily undo.

I really do not think this topic is worth the effort of a long-term
deprecation scheme (which I agree _is_ required for a change to the
config behavior). Let's just leave it as-is. We've seen zero real-world
complaints, only my own surprise after reading the code (and Steffen's
patch should be tweaked to match).

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-27  4:46           ` Jeff King
@ 2014-08-27 14:47             ` Junio C Hamano
  2014-08-28 15:21               ` Steffen Prohaska
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2014-08-27 14:47 UTC (permalink / raw
  To: Jeff King; +Cc: Steffen Prohaska, git, pclouds, john, schacon

Jeff King <peff@peff.net> writes:

> On Tue, Aug 26, 2014 at 02:54:11PM -0700, Junio C Hamano wrote:
>
>> A worse position is to have git_env_bool() that says "empty is
>> false" and add a new git_env_ulong() that says "empty is unset".
>> 
>> We should pick one or the other and use it for both.
>
> Yeah, I agree they should probably behave the same.
>
>> > The middle ground would be to die(). That does not seem super-friendly, but
>> > then we would also die with GIT_SMART_HTTP=foobar, so perhaps it is not
>> > unreasonable to just consider it a syntax error.
>> 
>> Hmm, I am not sure if dying is better.  Unless we decide to make
>> empty string no longer false everywhere and warn now and then later
>> die as part of a 3.0 transition plan or something, that is.
>
> I think it is better in the sense that while it may be unexpected, it
> does not unexpectedly do something that the user cannot easily undo.
>
> I really do not think this topic is worth the effort of a long-term
> deprecation scheme (which I agree _is_ required for a change to the
> config behavior). Let's just leave it as-is. We've seen zero real-world
> complaints, only my own surprise after reading the code (and Steffen's
> patch should be tweaked to match).

OK, then let's do that at least for now and move on.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-27 14:47             ` Junio C Hamano
@ 2014-08-28 15:21               ` Steffen Prohaska
  2014-08-28 17:13                 ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-28 15:21 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List, pclouds, john, schacon


On Aug 27, 2014, at 4:47 PM, Junio C Hamano <gitster@pobox.com> wrote:

> Jeff King <peff@peff.net> writes:
> 
>> On Tue, Aug 26, 2014 at 02:54:11PM -0700, Junio C Hamano wrote:
>> 
>>> A worse position is to have git_env_bool() that says "empty is
>>> false" and add a new git_env_ulong() that says "empty is unset".
>>> 
>>> We should pick one or the other and use it for both.
>> 
>> Yeah, I agree they should probably behave the same.
>> 
>>>> The middle ground would be to die(). That does not seem super-friendly, but
>>>> then we would also die with GIT_SMART_HTTP=foobar, so perhaps it is not
>>>> unreasonable to just consider it a syntax error.
>>> 
>>> Hmm, I am not sure if dying is better.  Unless we decide to make
>>> empty string no longer false everywhere and warn now and then later
>>> die as part of a 3.0 transition plan or something, that is.
>> 
>> I think it is better in the sense that while it may be unexpected, it
>> does not unexpectedly do something that the user cannot easily undo.
>> 
>> I really do not think this topic is worth the effort of a long-term
>> deprecation scheme (which I agree _is_ required for a change to the
>> config behavior). Let's just leave it as-is. We've seen zero real-world
>> complaints, only my own surprise after reading the code (and Steffen's
>> patch should be tweaked to match).
> 
> OK, then let's do that at least for now and move on.

Ok.  I saw that you tweaked my patch on pu.  Maybe remove the outdated
comment above the function completely:

diff --git a/config.c b/config.c
index 87db755..010bcd0 100644
--- a/config.c
+++ b/config.c
@@ -1122,9 +1122,6 @@ int git_env_bool(const char *k, int def)
        return v ? git_config_bool(k, v) : def;
 }

-/*
- * Use default if environment variable is unset or empty string.
- */
 unsigned long git_env_ulong(const char *k, unsigned long val)
 {
        const char *v = getenv(k);

	Steffen

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 5/6] Change copy_fd() to not close input fd
  2014-08-26 18:29   ` Jeff King
@ 2014-08-28 15:37     ` Steffen Prohaska
  2014-08-28 17:15       ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Steffen Prohaska @ 2014-08-28 15:37 UTC (permalink / raw
  To: Jeff King, Junio C Hamano
  Cc: Git Mailing List, Duy Nguyen, John Keeping, Scott Chacon


On Aug 26, 2014, at 8:29 PM, Jeff King <peff@peff.net> wrote:

> On Tue, Aug 26, 2014 at 05:23:24PM +0200, Steffen Prohaska wrote:
> 
>> The caller opened the fd, so it should be responsible for closing it.
>> 
>> Signed-off-by: Steffen Prohaska <prohaska@zib.de>
>> ---
>> copy.c     | 5 +----
>> lockfile.c | 3 +++
>> 2 files changed, 4 insertions(+), 4 deletions(-)
>> 
>> diff --git a/copy.c b/copy.c
>> index a7f58fd..d0a1d82 100644
>> --- a/copy.c
>> +++ b/copy.c
>> @@ -10,7 +10,6 @@ int copy_fd(int ifd, int ofd)
>> 			break;
>> 		if (len < 0) {
>> 			int read_error = errno;
>> -			close(ifd);
>> 			return error("copy-fd: read returned %s",
>> 				     strerror(read_error));
>> 		}
> 
> This saved errno is not necessary anymore (the problem was that close()
> clobbered the error in the original code). It can go away, and we can
> even drop the curly braces.
> 
>> @@ -21,17 +20,14 @@ int copy_fd(int ifd, int ofd)
>> 				len -= written;
>> 			}
>> 			else if (!written) {
>> -				close(ifd);
>> 				return error("copy-fd: write returned 0");
>> 			} else {
>> 				int write_error = errno;
>> -				close(ifd);
>> 				return error("copy-fd: write returned %s",
>> 					     strerror(write_error));
>> 			}
>> 		}
> 
> Ditto here. Actually, isn't this whole write just a reimplementation of
> write_in_full? The latter treats a return of 0 as ENOSPC rather than
> using a custom message, but I think that is sane.
> 
> All together:

Makes all sense, and seems sane to me, too.

Junio, I saw that you have the changes on pu with 'SQUASH???...'.  Will you
squash it, or shall I send another complete update of the patch series?

	Steffen



> ---
> copy.c | 28 +++++-----------------------
> 1 file changed, 5 insertions(+), 23 deletions(-)
> 
> diff --git a/copy.c b/copy.c
> index a7f58fd..53a9ece 100644
> --- a/copy.c
> +++ b/copy.c
> @@ -4,34 +4,16 @@ int copy_fd(int ifd, int ofd)
> {
> 	while (1) {
> 		char buffer[8192];
> -		char *buf = buffer;
> 		ssize_t len = xread(ifd, buffer, sizeof(buffer));
> 		if (!len)
> 			break;
> -		if (len < 0) {
> -			int read_error = errno;
> -			close(ifd);
> +		if (len < 0)
> 			return error("copy-fd: read returned %s",
> -				     strerror(read_error));
> -		}
> -		while (len) {
> -			int written = xwrite(ofd, buf, len);
> -			if (written > 0) {
> -				buf += written;
> -				len -= written;
> -			}
> -			else if (!written) {
> -				close(ifd);
> -				return error("copy-fd: write returned 0");
> -			} else {
> -				int write_error = errno;
> -				close(ifd);
> -				return error("copy-fd: write returned %s",
> -					     strerror(write_error));
> -			}
> -		}
> +				     strerror(errno));
> +		if (write_in_full(ofd, buffer, len) < 0)
> +			return error("copy-fd: write returned %s",
> +				     strerror(errno));
> 	}
> -	close(ifd);
> 	return 0;
> }
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 2/6] Add git_env_ulong() to parse environment variable
  2014-08-28 15:21               ` Steffen Prohaska
@ 2014-08-28 17:13                 ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2014-08-28 17:13 UTC (permalink / raw
  To: Steffen Prohaska; +Cc: Jeff King, Git Mailing List, pclouds, john, schacon

Steffen Prohaska <prohaska@zib.de> writes:

>> OK, then let's do that at least for now and move on.
>
> Ok.  I saw that you tweaked my patch on pu.  Maybe remove the outdated
> comment above the function completely:
>
> diff --git a/config.c b/config.c
> index 87db755..010bcd0 100644
> --- a/config.c
> +++ b/config.c
> @@ -1122,9 +1122,6 @@ int git_env_bool(const char *k, int def)
>         return v ? git_config_bool(k, v) : def;
>  }
>
> -/*
> - * Use default if environment variable is unset or empty string.
> - */

Thanks, will do.

>  unsigned long git_env_ulong(const char *k, unsigned long val)
>  {
>         const char *v = getenv(k);
>
> 	Steffen--
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v6 5/6] Change copy_fd() to not close input fd
  2014-08-28 15:37     ` Steffen Prohaska
@ 2014-08-28 17:15       ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2014-08-28 17:15 UTC (permalink / raw
  To: Steffen Prohaska
  Cc: Jeff King, Git Mailing List, Duy Nguyen, John Keeping,
	Scott Chacon

Steffen Prohaska <prohaska@zib.de> writes:

> On Aug 26, 2014, at 8:29 PM, Jeff King <peff@peff.net> wrote:
> ...
> Makes all sense, and seems sane to me, too.
>
> Junio, I saw that you have the changes on pu with 'SQUASH???...'.  Will you
> squash it, or shall I send another complete update of the patch series?

If I let you reroll, I will risk that you will forget to propagate
the log message fixes I did here while queuing the v6 iteration in
it, so let me try doing the squashing and push out the result first.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2014-08-28 17:15 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-08-26 15:23 [PATCH v6 0/6] Stream fd to clean filter; GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT with git_env_ulong() Steffen Prohaska
2014-08-26 15:23 ` [PATCH v6 1/6] convert: drop arguments other than 'path' from would_convert_to_git() Steffen Prohaska
2014-08-26 15:23 ` [PATCH v6 2/6] Add git_env_ulong() to parse environment variable Steffen Prohaska
2014-08-26 18:21   ` Jeff King
2014-08-26 20:20     ` Junio C Hamano
2014-08-26 20:31       ` Jeff King
2014-08-26 21:54         ` Junio C Hamano
2014-08-27  4:46           ` Jeff King
2014-08-27 14:47             ` Junio C Hamano
2014-08-28 15:21               ` Steffen Prohaska
2014-08-28 17:13                 ` Junio C Hamano
2014-08-26 15:23 ` [PATCH v6 3/6] Change GIT_ALLOC_LIMIT check to use git_env_ulong() Steffen Prohaska
2014-08-26 15:23 ` [PATCH v6 4/6] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size Steffen Prohaska
2014-08-26 15:23 ` [PATCH v6 5/6] Change copy_fd() to not close input fd Steffen Prohaska
2014-08-26 17:10   ` Junio C Hamano
2014-08-26 18:29   ` Jeff King
2014-08-28 15:37     ` Steffen Prohaska
2014-08-28 17:15       ` Junio C Hamano
2014-08-26 15:23 ` [PATCH v6 6/6] convert: stream from fd to required clean filter to reduce used address space Steffen Prohaska

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).