From: Bryan Turner <bturner@atlassian.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
Git Users <git@vger.kernel.org>, Jeff King <peff@peff.net>,
Junio C Hamano <gitster@pobox.com>,
"Miriam R." <mirucam@gmail.com>
Subject: Re: [PATCH] run-command: avoid undefined behavior in exists_in_PATH
Date: Mon, 6 Jan 2020 19:41:30 -0800 [thread overview]
Message-ID: <CAGyf7-G_YKtyKgAA+LLqbvh2Tuq+qmVVH=yz9=kfC5uWXHMr_Q@mail.gmail.com> (raw)
In-Reply-To: <CAGyf7-GpCSaBauuKunDxBcZbcHU2sYkrt3+BpdF20dN8ZuPZgw@mail.gmail.com>
On Mon, Jan 6, 2020 at 7:40 PM Bryan Turner <bturner@atlassian.com> wrote:
>
> On Mon, Jan 6, 2020 at 6:04 PM Jonathan Nieder <jrnieder@gmail.com> wrote:
> >
> > brian m. carlson wrote:
> >
> > > In this function, we free the pointer we get from locate_in_PATH and
> > > then check whether it's NULL. However, this is undefined behavior if
> > > the pointer is non-NULL, since the C standard no longer permits us to
> > > use a valid pointer after freeing it.
> > [...]
> > > Noticed-by: Miriam R. <mirucam@gmail.com>
> > > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> > > ---
> > > run-command.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
> >
> > This API that forces the caller to deal with the allocated result when
> > they never asked for it seems a bit inconvenient. Should we clean it up
> > a little? Something like this (on top):
> >
> > -- >8 --
> > Subject: run-command: let caller pass in buffer to locate_in_PATH
> >
> > Instead of returning a buffer that the caller is responsible for
> > freeing, use a strbuf output parameter to record the path to the
> > searched-for program.
> >
> > This makes ownership a little easier to reason about, since the owning
> > code declares the buffer. It's a good habit to follow because it
> > allows buffer reuse when calling such a function in a loop.
> >
> > It also allows the caller exists_in_PATH that does not care about the
> > path to the command to be slightly simplified, by allowing a NULL
> > output parameter that means that locate_in_PATH should take care of
> > allocating and freeing its temporary buffer.
> >
> > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> > ---
> > run-command.c | 51 +++++++++++++++++++++++++++++----------------------
> > 1 file changed, 29 insertions(+), 22 deletions(-)
> >
> > diff --git i/run-command.c w/run-command.c
> > index f5e1149f9b..a6dc38396a 100644
> > --- i/run-command.c
> > +++ w/run-command.c
> > @@ -170,52 +170,57 @@ int is_executable(const char *name)
> > * The caller should ensure that file contains no directory
> > * separators.
> > *
> > - * Returns the path to the command, as found in $PATH or NULL if the
> > - * command could not be found. The caller inherits ownership of the memory
> > - * used to store the resultant path.
> > + * Returns a boolean indicating whether the command was found in $PATH.
> > + * The path to the command is recorded in the strbuf 'out', if supplied.
> > *
> > * This should not be used on Windows, where the $PATH search rules
> > * are more complicated (e.g., a search for "foo" should find
> > * "foo.exe").
> > */
> > -static char *locate_in_PATH(const char *file)
> > +static int locate_in_PATH(const char *file, struct strbuf *out)
> > {
> > const char *p = getenv("PATH");
> > struct strbuf buf = STRBUF_INIT;
> >
> > - if (!p || !*p)
> > - return NULL;
> > + if (!out)
> > + out = &buf;
> > +
> > + if (!p || !*p) {
> > + strbuf_reset(out);
>
> Since the while loop and this block both call strbuf_reset(out);, is
> there a reason not to do:
> if (out)
> strbuf_reset(out);
> } else {
> out = &buf;
> }
> above? The loop below would still need its own reset, but it could do
> that when it decides to loop.
Ignore my incorrect braces; force of habit. I meant:
if (out)
strbuf_reset(out);
else
out = &buf;
>
> > + strbuf_release(&buf);
> > + return 0;
> > + }
> >
> > while (1) {
> > const char *end = strchrnul(p, ':');
> >
> > - strbuf_reset(&buf);
> > + strbuf_reset(out);
>
> This reset would be removed
>
> >
> > /* POSIX specifies an empty entry as the current directory. */
> > if (end != p) {
> > - strbuf_add(&buf, p, end - p);
> > - strbuf_addch(&buf, '/');
> > + strbuf_add(out, p, end - p);
> > + strbuf_addch(out, '/');
> > }
> > - strbuf_addstr(&buf, file);
> > + strbuf_addstr(out, file);
> >
> > - if (is_executable(buf.buf))
> > - return strbuf_detach(&buf, NULL);
> > + if (is_executable(out->buf)) {
> > + strbuf_release(&buf);
> > + return 1;
> > + }
> >
>
> We'd call strbuf_reset(out); here instead, before we break or loop.
>
> > if (!*end)
> > break;
> > p = end + 1
> > }
> >
> > + strbuf_reset(out);
>
> And we could leave this one out; if we make it here, we'd know the
> buffer was already reset.
>
> > strbuf_release(&buf);
> > - return NULL;
> > + return 0;
> > }
> >
> > static int exists_in_PATH(const char *file)
> > {
> > - char *r = locate_in_PATH(file);
> > - int found = r != NULL;
> > - free(r);
> > - return found;
> > + return locate_in_PATH(file, NULL);
> > }
> >
> > int sane_execvp(const char *file, char * const argv[])
> > @@ -427,15 +432,17 @@ static int prepare_cmd(struct argv_array *out, const struct child_process *cmd)
> > * directly.
> > */
> > if (!strchr(out->argv[1], '/')) {
> > - char *program = locate_in_PATH(out->argv[1]);
> > - if (program) {
> > - free((char *)out->argv[1]);
> > - out->argv[1] = program;
> > - } else {
> > + struct strbuf program = STRBUF_INIT;
> > +
> > + if (!locate_in_PATH(out->argv[1], &program)) {
> > argv_array_clear(out);
> > + strbuf_release(&program);
> > errno = ENOENT;
> > return -1;
> > }
> > +
> > + free((char *)out->argv[1]);
> > + out->argv[1] = strbuf_detach(&program, NULL);
> > }
> >
> > return 0;
>
> Just a thought. (Pardon the noise from the peanut gallery!)
>
> Best regards,
> Bryan Turner
next prev parent reply other threads:[~2020-01-07 3:41 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-07 1:36 [PATCH] run-command: avoid undefined behavior in exists_in_PATH brian m. carlson
2020-01-07 2:04 ` Jonathan Nieder
2020-01-07 2:16 ` brian m. carlson
2020-01-07 3:40 ` Bryan Turner
2020-01-07 3:41 ` Bryan Turner [this message]
2020-01-07 11:08 ` Jeff King
2020-01-07 11:01 ` Jeff King
2020-01-07 16:58 ` Junio C Hamano
2020-01-08 2:47 ` brian m. carlson
2020-01-08 9:15 ` Miriam R.
2020-01-08 10:28 ` Christian Couder
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAGyf7-G_YKtyKgAA+LLqbvh2Tuq+qmVVH=yz9=kfC5uWXHMr_Q@mail.gmail.com' \
--to=bturner@atlassian.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jrnieder@gmail.com \
--cc=mirucam@gmail.com \
--cc=peff@peff.net \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).