git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] hash-object: fix descriptor leak with --literally
@ 2023-01-19  1:57 Jeff King
  2023-01-19  6:26 ` Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2023-01-19  1:57 UTC (permalink / raw)
  To: git

In hash_object(), we open a descriptor for each file to hash (whether we
got the filename from the command line or --stdin-paths), but never
close it. For the traditional code path which feeds the result to
index_fd(), this is OK; it closes the descriptor for us.

But 5ba9a93b39 (hash-object: add --literally option, 2014-09-11) a
second code path which does not close the descriptor. There we need to
do so ourselves.

You can see the problem in a clone of git.git like this:

  $ git ls-files -s | grep ^100644 | cut -f2 |
    git hash-object --stdin-paths --literally >/dev/null
  fatal: could not open 'builtin/var.c' for reading: Too many open files

After this patch, it completes successfully. I didn't bother with a
test, as it's a pain to deal with descriptor limits portably, and the
fix is so trivial.

Signed-off-by: Jeff King <peff@peff.net>
---
Something I ran into while testing my hash-object fsck series, but I
broke it off here because it's really an independent bug-fix.

I do think the world would be less confusing if index_fd() didn't close
the descriptor we pass it, and then hash_file() could just do:

  fd = open();
  hash_fd(fd);
  close(fd);

which is much more readable. But it has many other callers. So even if
we wanted to untangle all that, I think it makes sense to do this
obvious fix in the meantime.

 builtin/hash-object.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/builtin/hash-object.c b/builtin/hash-object.c
index b506381502..44db83f07f 100644
--- a/builtin/hash-object.c
+++ b/builtin/hash-object.c
@@ -27,6 +27,7 @@ static int hash_literally(struct object_id *oid, int fd, const char *type, unsig
 	else
 		ret = write_object_file_literally(buf.buf, buf.len, type, oid,
 						 flags);
+	close(fd);
 	strbuf_release(&buf);
 	return ret;
 }
-- 
2.39.1.616.gd06fca9e99
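
The ownership question raised in the message above (who closes the
descriptor: the caller or the helper it is passed to) can be illustrated
outside of Git with a minimal standalone sketch. Everything here is
hypothetical and not taken from git.git: consume_fd(), open_many() and
lowest_free_fd() are made-up names, and /dev/null is assumed to exist.
The sketch relies only on the POSIX rule that open() returns the
lowest-numbered free descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a helper that consumes a descriptor.
 * Reads fd to EOF and returns the number of bytes read, or -1 on a
 * read error. When close_when_done is non-zero the helper takes
 * ownership and closes fd; otherwise fd is left open.
 */
long consume_fd(int fd, int close_when_done)
{
	char buf[4096];
	long total = 0;
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;
	if (close_when_done)
		close(fd);
	return n < 0 ? -1 : total;
}

/*
 * Open path `rounds` times and feed each descriptor to consume_fd().
 * Returns how many opens succeeded; the loop stops at the first
 * failure (for example EMFILE once the descriptor limit is reached).
 */
int open_many(const char *path, int rounds, int close_when_done)
{
	int ok = 0;

	for (int i = 0; i < rounds; i++) {
		int fd = open(path, O_RDONLY);
		if (fd < 0)
			break;
		consume_fd(fd, close_when_done);
		ok++;
	}
	return ok;
}

/*
 * Probe for the lowest free descriptor number by opening and
 * immediately closing /dev/null. Returns -1 if the open fails.
 */
int lowest_free_fd(void)
{
	int fd = open("/dev/null", O_RDONLY);

	if (fd >= 0)
		close(fd);
	return fd;
}
```

Calling open_many() with close_when_done set to 0 leaves one descriptor
open per iteration, so lowest_free_fd() reports a higher number
afterwards; that is the pattern that eventually runs into the
descriptor limit. With close_when_done set to 1 the probe returns the
same number before and after the loop.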


* Re: [PATCH] hash-object: fix descriptor leak with --literally
  2023-01-19  1:57 [PATCH] hash-object: fix descriptor leak with --literally Jeff King
@ 2023-01-19  6:26 ` Junio C Hamano
  2023-01-19  8:20   ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: Junio C Hamano @ 2023-01-19  6:26 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> In hash_object(), we open a descriptor for each file to hash (whether we
> got the filename from the command line or --stdin-paths), but never
> close it. For the traditional code path which feeds the result to
> index_fd(), this is OK; it closes the descriptor for us.
>
> But 5ba9a93b39 (hash-object: add --literally option, 2014-09-11) a
> second code path which does not close the descriptor.

A sentence without verb?  "5ba9 (hash-...) added a second code path,
which does not close the descriptor." or something?

> After this patch, it completes successfully. I didn't bother with a
> test, as it's a pain to deal with descriptor limits portably, and the
> fix is so trivial.

True.  Will queue.  Thanks.

> I do think the world would be less confusing if index_fd() didn't close
> the descriptor we pass it, and then hash_file() could just do:
>
>   fd = open();
>   hash_fd(fd);
>   close(fd);
>
> which is much more readable. But it has many other callers. So even if
> we wanted to untangle all that, I think it makes sense to do this
> obvious fix in the meantime.

Indeed, thanks.

>  builtin/hash-object.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/builtin/hash-object.c b/builtin/hash-object.c
> index b506381502..44db83f07f 100644
> --- a/builtin/hash-object.c
> +++ b/builtin/hash-object.c
> @@ -27,6 +27,7 @@ static int hash_literally(struct object_id *oid, int fd, const char *type, unsig
>  	else
>  		ret = write_object_file_literally(buf.buf, buf.len, type, oid,
>  						 flags);
> +	close(fd);
>  	strbuf_release(&buf);
>  	return ret;
>  }


* Re: [PATCH] hash-object: fix descriptor leak with --literally
  2023-01-19  6:26 ` Junio C Hamano
@ 2023-01-19  8:20   ` Jeff King
  0 siblings, 0 replies; 3+ messages in thread
From: Jeff King @ 2023-01-19  8:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jan 18, 2023 at 10:26:40PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > In hash_object(), we open a descriptor for each file to hash (whether we
> > got the filename from the command line or --stdin-paths), but never
> > close it. For the traditional code path which feeds the result to
> > index_fd(), this is OK; it closes the descriptor for us.
> >
> > But 5ba9a93b39 (hash-object: add --literally option, 2014-09-11) a
> > second code path which does not close the descriptor.
> 
> A sentence without verb?  "5ba9 (hash-...) added a second code path,
> which does not close the descriptor." or something?

Yes, the missing word was "added". Thanks.

-Peff

