git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] fix read past end of array in alternates files
@ 2017-09-18 15:51 Jeff King
  2017-09-18 15:54 ` [PATCH 1/2] read_info_alternates: read contents into strbuf Jeff King
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Jeff King @ 2017-09-18 15:51 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

This series fixes a regression in v2.11.1 where we might read past the
end of an mmap'd buffer. It was introduced in cf3c635210, but I didn't
base the patch on there, for a few reasons:

  1. There's a trivial conflict when merging up (because of
     git_open_noatime() becoming just git_open() in the inerim).

  2. The reproduction advice relies on our SANITIZE Makefile knob, which
     didn't exist back then.

  3. The second patch does not apply there because we don't have
     warn_on_fopen_errors(). Though admittedly it could be applied
     separately after merging up; it's just a clean-up on top.

It does apply on the current "maint".

  [1/2]: read_info_alternates: read contents into strbuf
  [2/2]: read_info_alternates: warn on non-trivial errors

 sha1_file.c | 30 ++++++++++--------------------
 1 file changed, 10 insertions(+), 20 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] read_info_alternates: read contents into strbuf
  2017-09-18 15:51 [PATCH 0/2] fix read past end of array in alternates files Jeff King
@ 2017-09-18 15:54 ` Jeff King
  2017-09-19  0:08   ` Junio C Hamano
  2017-09-19  2:42   ` Jonathan Nieder
  2017-09-18 15:55 ` [PATCH 2/2] read_info_alternates: warn on non-trivial errors Jeff King
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Jeff King @ 2017-09-18 15:54 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

The link_alt_odb_entries() function has always taken a
ptr/len pair as input. Until cf3c635210 (alternates: accept
double-quoted paths, 2016-12-12), we made a copy of those
bytes in a string. But after that commit, we switched to
parsing the input left-to-right, and we ignore "len"
totally, instead reading until we hit a NUL.

This has mostly gone unnoticed for a few reasons:

  1. All but one caller passes a NUL-terminated string, with
     "len" pointing to the NUL.

  2. The remaining caller, read_info_alternates(), passes in
     an mmap'd file. Unless the file is an exact multiple of
     the page size, it will generally be followed by NUL
     padding to the end of the page, which just works.

The easiest way to demonstrate the problem is to build with:

  make SANITIZE=address NO_MMAP=Nope test

Any test which involves $GIT_DIR/info/alternates will fail,
as the mmap emulation (correctly) does not add an extra NUL,
and ASAN complains about reading past the end of the buffer.

One solution would be to teach link_alt_odb_entries() to
respect "len". But it's actually a bit tricky, since we
depend on unquote_c_style() under the hood, and it has no
ptr/len variant.

We could also just make a NUL-terminated copy of the input
bytes and operate on that. But since all but one caller
already is passing a string, instead let's just fix that
caller to provide NUL-terminated input in the first place,
by swapping out mmap for strbuf_read_file().

There's no advantage to using mmap on the alternates file.
It's not expected to be large (and anyway, we're copying its
contents into an in-memory linked list). Nor is using
git_open() buying us anything here, since we don't keep the
descriptor open for a long period of time.

Let's also drop the "len" parameter entirely from
link_alt_odb_entries(), since it's completely ignored. That
will avoid any new callers re-introducing a similar bug.

Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
---
I didn't reproduce with the mmap even-page-size thing, but I
think it's the same reason we didn't notice the "git log -G"
read-past-mmap issues for a long time. Which makes testing
with ASAN and NO_MMAP an interesting combination, as it
should find out any similar cases (and after this, the whole
test suite passes with that configuration).

 sha1_file.c | 29 +++++++++--------------------
 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 5f71bbac3e..b1e4193679 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -398,7 +398,7 @@ static const char *parse_alt_odb_entry(const char *string,
 	return end;
 }
 
-static void link_alt_odb_entries(const char *alt, int len, int sep,
+static void link_alt_odb_entries(const char *alt, int sep,
 				 const char *relative_base, int depth)
 {
 	struct strbuf objdirbuf = STRBUF_INIT;
@@ -427,28 +427,17 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
 
 static void read_info_alternates(const char * relative_base, int depth)
 {
-	char *map;
-	size_t mapsz;
-	struct stat st;
 	char *path;
-	int fd;
+	struct strbuf buf = STRBUF_INIT;
 
 	path = xstrfmt("%s/info/alternates", relative_base);
-	fd = git_open(path);
-	free(path);
-	if (fd < 0)
-		return;
-	if (fstat(fd, &st) || (st.st_size == 0)) {
-		close(fd);
+	if (strbuf_read_file(&buf, path, 1024) < 0) {
+		free(path);
 		return;
 	}
-	mapsz = xsize_t(st.st_size);
-	map = xmmap(NULL, mapsz, PROT_READ, MAP_PRIVATE, fd, 0);
-	close(fd);
 
-	link_alt_odb_entries(map, mapsz, '\n', relative_base, depth);
-
-	munmap(map, mapsz);
+	link_alt_odb_entries(buf.buf, '\n', relative_base, depth);
+	strbuf_release(&buf);
 }
 
 struct alternate_object_database *alloc_alt_odb(const char *dir)
@@ -503,7 +492,7 @@ void add_to_alternates_file(const char *reference)
 		if (commit_lock_file(lock))
 			die_errno("unable to move new alternates file into place");
 		if (alt_odb_tail)
-			link_alt_odb_entries(reference, strlen(reference), '\n', NULL, 0);
+			link_alt_odb_entries(reference, '\n', NULL, 0);
 	}
 	free(alts);
 }
@@ -516,7 +505,7 @@ void add_to_alternates_memory(const char *reference)
 	 */
 	prepare_alt_odb();
 
-	link_alt_odb_entries(reference, strlen(reference), '\n', NULL, 0);
+	link_alt_odb_entries(reference, '\n', NULL, 0);
 }
 
 /*
@@ -619,7 +608,7 @@ void prepare_alt_odb(void)
 	if (!alt) alt = "";
 
 	alt_odb_tail = &alt_odb_list;
-	link_alt_odb_entries(alt, strlen(alt), PATH_SEP, NULL, 0);
+	link_alt_odb_entries(alt, PATH_SEP, NULL, 0);
 
 	read_info_alternates(get_object_directory(), 0);
 }
-- 
2.14.1.1014.g252e627ae0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] read_info_alternates: warn on non-trivial errors
  2017-09-18 15:51 [PATCH 0/2] fix read past end of array in alternates files Jeff King
  2017-09-18 15:54 ` [PATCH 1/2] read_info_alternates: read contents into strbuf Jeff King
@ 2017-09-18 15:55 ` Jeff King
  2017-09-19  2:46   ` Jonathan Nieder
  2017-09-19  2:36 ` [PATCH 0/2] fix read past end of array in alternates files Jonathan Nieder
  2017-09-19 19:40 ` [PATCH v2 " Jeff King
  3 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2017-09-18 15:55 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

When we fail to open $GIT_DIR/info/alternates, we silently
assume there are no alternates. This is the right thing to
do for ENOENT, but not for other errors.

A hard error is probably overkill here. If we fail to read
an alternates file then either we'll complete our operation
anyway, or we'll fail to find some needed object. Either
way, a warning is good idea. And we already have a helper
function to handle this pattern; let's just call
warn_on_fopen_error().

Note that technically the errno from strbuf_read_file()
might be from a read() error, not open(). But since read()
would never return ENOENT or ENOTDIR, and since it produces
a generic "unable to access" error, it's suitable for
handling errors from either.

Signed-off-by: Jeff King <peff@peff.net>
---
I'm pretty comfortable with the rationale in the final paragraph. But if
we're concerned that warn_on_fopen_errors() might change, we can swap
out strbuf_read_file() for fopen()+strbuf_read().

 sha1_file.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sha1_file.c b/sha1_file.c
index b1e4193679..9cec326298 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -432,6 +432,7 @@ static void read_info_alternates(const char * relative_base, int depth)
 
 	path = xstrfmt("%s/info/alternates", relative_base);
 	if (strbuf_read_file(&buf, path, 1024) < 0) {
+		warn_on_fopen_errors(path);
 		free(path);
 		return;
 	}
-- 
2.14.1.1014.g252e627ae0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] read_info_alternates: read contents into strbuf
  2017-09-18 15:54 ` [PATCH 1/2] read_info_alternates: read contents into strbuf Jeff King
@ 2017-09-19  0:08   ` Junio C Hamano
  2017-09-19  2:42   ` Jonathan Nieder
  1 sibling, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2017-09-19  0:08 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> We could also just make a NUL-terminated copy of the input
> bytes and operate on that. But since all but one caller
> already is passing a string, instead let's just fix that
> caller to provide NUL-terminated input in the first place,
> by swapping out mmap for strbuf_read_file().
> ...
> Let's also drop the "len" parameter entirely from
> link_alt_odb_entries(), since it's completely ignored. That
> will avoid any new callers re-introducing a similar bug.

Both design decisions make perfect sense to me.

>  sha1_file.c | 29 +++++++++--------------------
>  1 file changed, 9 insertions(+), 20 deletions(-)

And diffstat also agrees that it is a good change ;-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] fix read past end of array in alternates files
  2017-09-18 15:51 [PATCH 0/2] fix read past end of array in alternates files Jeff King
  2017-09-18 15:54 ` [PATCH 1/2] read_info_alternates: read contents into strbuf Jeff King
  2017-09-18 15:55 ` [PATCH 2/2] read_info_alternates: warn on non-trivial errors Jeff King
@ 2017-09-19  2:36 ` Jonathan Nieder
  2017-09-19  4:57   ` Jeff King
  2017-09-19 19:40 ` [PATCH v2 " Jeff King
  3 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2017-09-19  2:36 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King wrote:

> This series fixes a regression in v2.11.1 where we might read past the
> end of an mmap'd buffer. It was introduced in cf3c635210,

The above information is super helpful.  Can it go in one of the commit
messages?

>                                                           but I didn't
> base the patch on there, for a few reasons:
>
>   1. There's a trivial conflict when merging up (because of
>      git_open_noatime() becoming just git_open() in the inerim).
>
>   2. The reproduction advice relies on our SANITIZE Makefile knob, which
>      didn't exist back then.
>
>   3. The second patch does not apply there because we don't have
>      warn_on_fopen_errors(). Though admittedly it could be applied
>      separately after merging up; it's just a clean-up on top.

Even this part could go in a commit message, but it's fine for it not
to.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] read_info_alternates: read contents into strbuf
  2017-09-18 15:54 ` [PATCH 1/2] read_info_alternates: read contents into strbuf Jeff King
  2017-09-19  0:08   ` Junio C Hamano
@ 2017-09-19  2:42   ` Jonathan Nieder
  2017-09-19  5:03     ` Jeff King
  1 sibling, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2017-09-19  2:42 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King wrote:

> Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  sha1_file.c | 29 +++++++++--------------------
>  1 file changed, 9 insertions(+), 20 deletions(-)

Thanks for tracking it down.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

[...]
> +++ b/sha1_file.c
[...]
> @@ -427,28 +427,17 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
>  
>  static void read_info_alternates(const char * relative_base, int depth)
>  {
> -	char *map;
> -	size_t mapsz;
> -	struct stat st;
>  	char *path;
> -	int fd;
> +	struct strbuf buf = STRBUF_INIT;
>  
>  	path = xstrfmt("%s/info/alternates", relative_base);
> -	fd = git_open(path);
> -	free(path);
> -	if (fd < 0)
> -		return;
> -	if (fstat(fd, &st) || (st.st_size == 0)) {
> -		close(fd);
> +	if (strbuf_read_file(&buf, path, 1024) < 0) {
> +		free(path);
>  		return;

strbuf_read_file is careful to release buf on failure, so this doesn't
leak (but it's a bit subtle).

What happened to the !st.st_size case?  Is it eliminated for
simplicity?

The rest looks good.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] read_info_alternates: warn on non-trivial errors
  2017-09-18 15:55 ` [PATCH 2/2] read_info_alternates: warn on non-trivial errors Jeff King
@ 2017-09-19  2:46   ` Jonathan Nieder
  2017-09-19  5:15     ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2017-09-19  2:46 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King wrote:

> When we fail to open $GIT_DIR/info/alternates, we silently
> assume there are no alternates. This is the right thing to
> do for ENOENT, but not for other errors.
>
> A hard error is probably overkill here. If we fail to read
> an alternates file then either we'll complete our operation
> anyway, or we'll fail to find some needed object. Either
> way, a warning is good idea. And we already have a helper
> function to handle this pattern; let's just call
> warn_on_fopen_error().

I think I prefer a hard error.  What kind of cases are you imagining
where it would be better to warn?

E.g. for EIO, erroring out so that the user can try again seems better
than hoping that the application will be able to cope with the more
subtle error that comes from discovering some objects are missing.

For EACCES, I can see how it makes sense to warn and move on, but no
other errors like that are occuring to me.

Thoughts?

Thanks,
Jonathan

> Note that technically the errno from strbuf_read_file()
> might be from a read() error, not open(). But since read()
> would never return ENOENT or ENOTDIR, and since it produces
> a generic "unable to access" error, it's suitable for
> handling errors from either.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] fix read past end of array in alternates files
  2017-09-19  2:36 ` [PATCH 0/2] fix read past end of array in alternates files Jonathan Nieder
@ 2017-09-19  4:57   ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2017-09-19  4:57 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Michael Haggerty

On Mon, Sep 18, 2017 at 07:36:03PM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > This series fixes a regression in v2.11.1 where we might read past the
> > end of an mmap'd buffer. It was introduced in cf3c635210,
> 
> The above information is super helpful.  Can it go in one of the commit
> messages?

Er, didn't I?

> > base the patch on there, for a few reasons:
> >
> >   1. There's a trivial conflict when merging up (because of
> >      git_open_noatime() becoming just git_open() in the inerim).
> >
> >   2. The reproduction advice relies on our SANITIZE Makefile knob, which
> >      didn't exist back then.
> >
> >   3. The second patch does not apply there because we don't have
> >      warn_on_fopen_errors(). Though admittedly it could be applied
> >      separately after merging up; it's just a clean-up on top.
> 
> Even this part could go in a commit message, but it's fine for it not
> to.

IMHO this kind of meta information doesn't belong in the commit message.
It's useful to the maintainer to know where to apply the patch, but I
don't think it helps somebody who is reading "git log" output.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] read_info_alternates: read contents into strbuf
  2017-09-19  2:42   ` Jonathan Nieder
@ 2017-09-19  5:03     ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2017-09-19  5:03 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Michael Haggerty

On Mon, Sep 18, 2017 at 07:42:53PM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> >  sha1_file.c | 29 +++++++++--------------------
> >  1 file changed, 9 insertions(+), 20 deletions(-)
> 
> Thanks for tracking it down.

To be fair, Michael did most of the work in identifying and bisecting
the bug. He even wrote a very similar patch in parallel; I just swooped
in at the end.

> >  	path = xstrfmt("%s/info/alternates", relative_base);
> > -	fd = git_open(path);
> > -	free(path);
> > -	if (fd < 0)
> > -		return;
> > -	if (fstat(fd, &st) || (st.st_size == 0)) {
> > -		close(fd);
> > +	if (strbuf_read_file(&buf, path, 1024) < 0) {
> > +		free(path);
> >  		return;
> 
> strbuf_read_file is careful to release buf on failure, so this doesn't
> leak (but it's a bit subtle).

Yep. I didn't think it was worth calling out with a comment since the
"don't allocate on failure" pattern is common to most of the strbuf
functions.

> What happened to the !st.st_size case?  Is it eliminated for
> simplicity?

Good question, and the answer is yes. Obviously we can bail early on an
empty file, but I don't think there's any reason to complicate the code
with it (the original seems to come from d5a63b9983 (Alternate object
pool mechanism updates., 2005-08-14), where it avoided a corner case
that has long since been deleted.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] read_info_alternates: warn on non-trivial errors
  2017-09-19  2:46   ` Jonathan Nieder
@ 2017-09-19  5:15     ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2017-09-19  5:15 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Michael Haggerty

On Mon, Sep 18, 2017 at 07:46:24PM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > When we fail to open $GIT_DIR/info/alternates, we silently
> > assume there are no alternates. This is the right thing to
> > do for ENOENT, but not for other errors.
> >
> > A hard error is probably overkill here. If we fail to read
> > an alternates file then either we'll complete our operation
> > anyway, or we'll fail to find some needed object. Either
> > way, a warning is good idea. And we already have a helper
> > function to handle this pattern; let's just call
> > warn_on_fopen_error().
> 
> I think I prefer a hard error.  What kind of cases are you imagining
> where it would be better to warn?
> 
> E.g. for EIO, erroring out so that the user can try again seems better
> than hoping that the application will be able to cope with the more
> subtle error that comes from discovering some objects are missing.
> 
> For EACCES, I can see how it makes sense to warn and move on, but no
> other errors like that are occuring to me.
> 
> Thoughts?

Yes, EACCES is exactly the example that came to mind.  But most
importantly, we don't know at this point whether the alternate is even
relevant to the current operation. So calling die() here would make some
cases fail that would otherwise succeed. That's usually not a big deal,
but we've had cases where being lazy about dying helps people use git
itself to help repair the case.

Admittedly most of those chicken-and-egg problems are centered around
git-config (e.g., using "git config --edit" to repair broken config), so
it's perhaps less important here. But it seems like a reasonable
principle to follow in general.

If there's a compelling reason to die hard, I'd reconsider it. But I
can't really think of one (if we were _writing_ a new alternates file
and encountered a read error of the old data we're copying, then I think
it would be very important to bail before claiming a successful write.
But that's not what's happening here).

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 0/2] fix read past end of array in alternates files
  2017-09-18 15:51 [PATCH 0/2] fix read past end of array in alternates files Jeff King
                   ` (2 preceding siblings ...)
  2017-09-19  2:36 ` [PATCH 0/2] fix read past end of array in alternates files Jonathan Nieder
@ 2017-09-19 19:40 ` Jeff King
  2017-09-19 19:41   ` [PATCH v2 1/2] read_info_alternates: read contents into strbuf Jeff King
                     ` (2 more replies)
  3 siblings, 3 replies; 14+ messages in thread
From: Jeff King @ 2017-09-19 19:40 UTC (permalink / raw)
  To: git; +Cc: Jonathan Nieder, Junio C Hamano, Michael Haggerty

On Mon, Sep 18, 2017 at 11:51:00AM -0400, Jeff King wrote:

> This series fixes a regression in v2.11.1 where we might read past the
> end of an mmap'd buffer. It was introduced in cf3c635210, but I didn't
> base the patch on there, for a few reasons:

Here's a v2 that fixes a minor leak in the first patch (I carefully
remembered to free() the path buffer on the error path, but forgot to do
it in the normal one. Oops).

I also tried to address Jonathan's "should this be in the commit
message" comment. The information above _is_ in there, but maybe putting
it at the top as a sort of tl;dr makes it easier to find?

The second patch is unchanged.

Junio, I see you ended up back-porting it to v2.11. Would you prefer me
to have done it that way in the first place? I was trying to reduce your
work by basing it on "maint" (figuring that we wouldn't bother making a
v2.11.x release anyway, and that skips you having to apply the second
patch separately after the merge).

  [1/2]: read_info_alternates: read contents into strbuf
  [2/2]: read_info_alternates: warn on non-trivial errors

 sha1_file.c | 31 +++++++++++--------------------
 1 file changed, 11 insertions(+), 20 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/2] read_info_alternates: read contents into strbuf
  2017-09-19 19:40 ` [PATCH v2 " Jeff King
@ 2017-09-19 19:41   ` Jeff King
  2017-09-19 19:41   ` [PATCH v2 2/2] read_info_alternates: warn on non-trivial errors Jeff King
  2017-09-20  2:44   ` [PATCH v2 0/2] fix read past end of array in alternates files Junio C Hamano
  2 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2017-09-19 19:41 UTC (permalink / raw)
  To: git; +Cc: Jonathan Nieder, Junio C Hamano, Michael Haggerty

This patch fixes a regression in v2.11.1 where we might read
past the end of an mmap'd buffer. It was introduced in
cf3c635210.

The link_alt_odb_entries() function has always taken a
ptr/len pair as input. Until cf3c635210 (alternates: accept
double-quoted paths, 2016-12-12), we made a copy of those
bytes in a string. But after that commit, we switched to
parsing the input left-to-right, and we ignore "len"
totally, instead reading until we hit a NUL.

This has mostly gone unnoticed for a few reasons:

  1. All but one caller passes a NUL-terminated string, with
     "len" pointing to the NUL.

  2. The remaining caller, read_info_alternates(), passes in
     an mmap'd file. Unless the file is an exact multiple of
     the page size, it will generally be followed by NUL
     padding to the end of the page, which just works.

The easiest way to demonstrate the problem is to build with:

  make SANITIZE=address NO_MMAP=Nope test

Any test which involves $GIT_DIR/info/alternates will fail,
as the mmap emulation (correctly) does not add an extra NUL,
and ASAN complains about reading past the end of the buffer.

One solution would be to teach link_alt_odb_entries() to
respect "len". But it's actually a bit tricky, since we
depend on unquote_c_style() under the hood, and it has no
ptr/len variant.

We could also just make a NUL-terminated copy of the input
bytes and operate on that. But since all but one caller
already is passing a string, instead let's just fix that
caller to provide NUL-terminated input in the first place,
by swapping out mmap for strbuf_read_file().

There's no advantage to using mmap on the alternates file.
It's not expected to be large (and anyway, we're copying its
contents into an in-memory linked list). Nor is using
git_open() buying us anything here, since we don't keep the
descriptor open for a long period of time.

Let's also drop the "len" parameter entirely from
link_alt_odb_entries(), since it's completely ignored. That
will avoid any new callers re-introducing a similar bug.

Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
---
 sha1_file.c | 30 ++++++++++--------------------
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index b4a67bb838..b682b7ec06 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -398,7 +398,7 @@ static const char *parse_alt_odb_entry(const char *string,
 	return end;
 }
 
-static void link_alt_odb_entries(const char *alt, int len, int sep,
+static void link_alt_odb_entries(const char *alt, int sep,
 				 const char *relative_base, int depth)
 {
 	struct strbuf objdirbuf = STRBUF_INIT;
@@ -427,28 +427,18 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
 
 static void read_info_alternates(const char * relative_base, int depth)
 {
-	char *map;
-	size_t mapsz;
-	struct stat st;
 	char *path;
-	int fd;
+	struct strbuf buf = STRBUF_INIT;
 
 	path = xstrfmt("%s/info/alternates", relative_base);
-	fd = git_open(path);
-	free(path);
-	if (fd < 0)
-		return;
-	if (fstat(fd, &st) || (st.st_size == 0)) {
-		close(fd);
+	if (strbuf_read_file(&buf, path, 1024) < 0) {
+		free(path);
 		return;
 	}
-	mapsz = xsize_t(st.st_size);
-	map = xmmap(NULL, mapsz, PROT_READ, MAP_PRIVATE, fd, 0);
-	close(fd);
-
-	link_alt_odb_entries(map, mapsz, '\n', relative_base, depth);
 
-	munmap(map, mapsz);
+	link_alt_odb_entries(buf.buf, '\n', relative_base, depth);
+	strbuf_release(&buf);
+	free(path);
 }
 
 struct alternate_object_database *alloc_alt_odb(const char *dir)
@@ -503,7 +493,7 @@ void add_to_alternates_file(const char *reference)
 		if (commit_lock_file(lock))
 			die_errno("unable to move new alternates file into place");
 		if (alt_odb_tail)
-			link_alt_odb_entries(reference, strlen(reference), '\n', NULL, 0);
+			link_alt_odb_entries(reference, '\n', NULL, 0);
 	}
 	free(alts);
 }
@@ -516,7 +506,7 @@ void add_to_alternates_memory(const char *reference)
 	 */
 	prepare_alt_odb();
 
-	link_alt_odb_entries(reference, strlen(reference), '\n', NULL, 0);
+	link_alt_odb_entries(reference, '\n', NULL, 0);
 }
 
 /*
@@ -619,7 +609,7 @@ void prepare_alt_odb(void)
 	if (!alt) alt = "";
 
 	alt_odb_tail = &alt_odb_list;
-	link_alt_odb_entries(alt, strlen(alt), PATH_SEP, NULL, 0);
+	link_alt_odb_entries(alt, PATH_SEP, NULL, 0);
 
 	read_info_alternates(get_object_directory(), 0);
 }
-- 
2.14.1.1014.g252e627ae0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/2] read_info_alternates: warn on non-trivial errors
  2017-09-19 19:40 ` [PATCH v2 " Jeff King
  2017-09-19 19:41   ` [PATCH v2 1/2] read_info_alternates: read contents into strbuf Jeff King
@ 2017-09-19 19:41   ` Jeff King
  2017-09-20  2:44   ` [PATCH v2 0/2] fix read past end of array in alternates files Junio C Hamano
  2 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2017-09-19 19:41 UTC (permalink / raw)
  To: git; +Cc: Jonathan Nieder, Junio C Hamano, Michael Haggerty

When we fail to open $GIT_DIR/info/alternates, we silently
assume there are no alternates. This is the right thing to
do for ENOENT, but not for other errors.

A hard error is probably overkill here. If we fail to read
an alternates file then either we'll complete our operation
anyway, or we'll fail to find some needed object. Either
way, a warning is good idea. And we already have a helper
function to handle this pattern; let's just call
warn_on_fopen_error().

Note that technically the errno from strbuf_read_file()
might be from a read() error, not open(). But since read()
would never return ENOENT or ENOTDIR, and since it produces
a generic "unable to access" error, it's suitable for
handling errors from either.

Signed-off-by: Jeff King <peff@peff.net>
---
 sha1_file.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sha1_file.c b/sha1_file.c
index b682b7ec06..1477ea7b50 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -432,6 +432,7 @@ static void read_info_alternates(const char * relative_base, int depth)
 
 	path = xstrfmt("%s/info/alternates", relative_base);
 	if (strbuf_read_file(&buf, path, 1024) < 0) {
+		warn_on_fopen_errors(path);
 		free(path);
 		return;
 	}
-- 
2.14.1.1014.g252e627ae0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] fix read past end of array in alternates files
  2017-09-19 19:40 ` [PATCH v2 " Jeff King
  2017-09-19 19:41   ` [PATCH v2 1/2] read_info_alternates: read contents into strbuf Jeff King
  2017-09-19 19:41   ` [PATCH v2 2/2] read_info_alternates: warn on non-trivial errors Jeff King
@ 2017-09-20  2:44   ` Junio C Hamano
  2 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2017-09-20  2:44 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jonathan Nieder, Michael Haggerty

Jeff King <peff@peff.net> writes:

> On Mon, Sep 18, 2017 at 11:51:00AM -0400, Jeff King wrote:
>
>> This series fixes a regression in v2.11.1 where we might read past the
>> end of an mmap'd buffer. It was introduced in cf3c635210, but I didn't
>> base the patch on there, for a few reasons:
>
> Here's a v2 that fixes a minor leak in the first patch (I carefully
> remembered to free() the path buffer on the error path, but forgot to do
> it in the normal one. Oops).

Thanks.

> I also tried to address Jonathan's "should this be in the commit
> message" comment. The information above _is_ in there, but maybe putting
> it at the top as a sort of tl;dr makes it easier to find?

Probably.

> The second patch is unchanged.
>
> Junio, I see you ended up back-porting it to v2.11. Would you prefer me
> to have done it that way in the first place? I was trying to reduce your
> work by basing it on "maint" (figuring that we wouldn't bother making a
> v2.11.x release anyway, and that skips you having to apply the second
> patch separately after the merge).

Upon seeing that this dated back to 2.11, because I am lazy and do
not assess how much the fix needs to go to older tracks when I am
queuing (remember: my attention span during patch queueing is
measured in minutes, as people send changes to different areas), I
tend to first try to see what's the oldest maintenance track we can
practically apply the patch to.  It turned out that the conflict
resolution to apply on maint-2.11 wasn't that painful, so I took the
lazy route all the way---the real fix on the oldest, even though I
do not know (because I refused to think and decide due to laziness)
if a next v2.11.x release is necessary, followed by a nice-to-have
warning that uses newer features on the maintenance track.  That
way, when we decide that the fix won't be a big deal to require a
new v2.11.x, but it is nice to have in v2.13.x, we could merge the
first one, without having to cherry-pick.

All of the above is part of how the daily maintainer workflow goes,
and there is no strong preference on my side, if the original is on
the theoretically oldest (i.e. maint-2.11) or on the oldest
practical (i.e. maint), as long as the conflicts are not too
painful.


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-09-20  2:44 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-18 15:51 [PATCH 0/2] fix read past end of array in alternates files Jeff King
2017-09-18 15:54 ` [PATCH 1/2] read_info_alternates: read contents into strbuf Jeff King
2017-09-19  0:08   ` Junio C Hamano
2017-09-19  2:42   ` Jonathan Nieder
2017-09-19  5:03     ` Jeff King
2017-09-18 15:55 ` [PATCH 2/2] read_info_alternates: warn on non-trivial errors Jeff King
2017-09-19  2:46   ` Jonathan Nieder
2017-09-19  5:15     ` Jeff King
2017-09-19  2:36 ` [PATCH 0/2] fix read past end of array in alternates files Jonathan Nieder
2017-09-19  4:57   ` Jeff King
2017-09-19 19:40 ` [PATCH v2 " Jeff King
2017-09-19 19:41   ` [PATCH v2 1/2] read_info_alternates: read contents into strbuf Jeff King
2017-09-19 19:41   ` [PATCH v2 2/2] read_info_alternates: warn on non-trivial errors Jeff King
2017-09-20  2:44   ` [PATCH v2 0/2] fix read past end of array in alternates files Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).