git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Bryan Turner <bturner@atlassian.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
	Git Users <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	Junio C Hamano <gitster@pobox.com>,
	"Miriam R." <mirucam@gmail.com>
Subject: Re: [PATCH] run-command: avoid undefined behavior in exists_in_PATH
Date: Mon, 6 Jan 2020 19:41:30 -0800	[thread overview]
Message-ID: <CAGyf7-G_YKtyKgAA+LLqbvh2Tuq+qmVVH=yz9=kfC5uWXHMr_Q@mail.gmail.com> (raw)
In-Reply-To: <CAGyf7-GpCSaBauuKunDxBcZbcHU2sYkrt3+BpdF20dN8ZuPZgw@mail.gmail.com>

On Mon, Jan 6, 2020 at 7:40 PM Bryan Turner <bturner@atlassian.com> wrote:
>
> On Mon, Jan 6, 2020 at 6:04 PM Jonathan Nieder <jrnieder@gmail.com> wrote:
> >
> > brian m. carlson wrote:
> >
> > > In this function, we free the pointer we get from locate_in_PATH and
> > > then check whether it's NULL.  However, this is undefined behavior if
> > > the pointer is non-NULL, since the C standard no longer permits us to
> > > use a valid pointer after freeing it.
> > [...]
> > > Noticed-by: Miriam R. <mirucam@gmail.com>
> > > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> > > ---
> > >  run-command.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
> >
> > This API that forces the caller to deal with the allocated result when
> > they never asked for it seems a bit inconvenient.  Should we clean it up
> > a little?  Something like this (on top):
> >
> > -- >8 --
> > Subject: run-command: let caller pass in buffer to locate_in_PATH
> >
> > Instead of returning a buffer that the caller is responsible for
> > freeing, use a strbuf output parameter to record the path to the
> > searched-for program.
> >
> > This makes ownership a little easier to reason about, since the owning
> > code declares the buffer.  It's a good habit to follow because it
> > allows buffer reuse when calling such a function in a loop.
> >
> > It also allows the caller exists_in_PATH that does not care about the
> > path to the command to be slightly simplified, by allowing a NULL
> > output parameter that means that locate_in_PATH should take care of
> > allocating and freeing its temporary buffer.
> >
> > Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> > ---
> >  run-command.c | 51 +++++++++++++++++++++++++++++----------------------
> >  1 file changed, 29 insertions(+), 22 deletions(-)
> >
> > diff --git i/run-command.c w/run-command.c
> > index f5e1149f9b..a6dc38396a 100644
> > --- i/run-command.c
> > +++ w/run-command.c
> > @@ -170,52 +170,57 @@ int is_executable(const char *name)
> >   * The caller should ensure that file contains no directory
> >   * separators.
> >   *
> > - * Returns the path to the command, as found in $PATH or NULL if the
> > - * command could not be found.  The caller inherits ownership of the memory
> > - * used to store the resultant path.
> > + * Returns a boolean indicating whether the command was found in $PATH.
> > + * The path to the command is recorded in the strbuf 'out', if supplied.
> >   *
> >   * This should not be used on Windows, where the $PATH search rules
> >   * are more complicated (e.g., a search for "foo" should find
> >   * "foo.exe").
> >   */
> > -static char *locate_in_PATH(const char *file)
> > +static int locate_in_PATH(const char *file, struct strbuf *out)
> >  {
> >         const char *p = getenv("PATH");
> >         struct strbuf buf = STRBUF_INIT;
> >
> > -       if (!p || !*p)
> > -               return NULL;
> > +       if (!out)
> > +               out = &buf;
> > +
> > +       if (!p || !*p) {
> > +               strbuf_reset(out);
>
> Since the while loop and this block both call strbuf_reset(out);, is
> there a reason not to do:
> if (out)
>         strbuf_reset(out);
> } else {
>         out = &buf;
> }
> above? The loop below would still need its own reset, but it could do
> that when it decides to loop.

Ignore my incorrect braces; force of habit. I meant:
if (out)
        strbuf_reset(out);
else
        out = &buf;

>
> > +               strbuf_release(&buf);
> > +               return 0;
> > +       }
> >
> >         while (1) {
> >                 const char *end = strchrnul(p, ':');
> >
> > -               strbuf_reset(&buf);
> > +               strbuf_reset(out);
>
> This reset would be removed
>
> >
> >                 /* POSIX specifies an empty entry as the current directory. */
> >                 if (end != p) {
> > -                       strbuf_add(&buf, p, end - p);
> > -                       strbuf_addch(&buf, '/');
> > +                       strbuf_add(out, p, end - p);
> > +                       strbuf_addch(out, '/');
> >                 }
> > -               strbuf_addstr(&buf, file);
> > +               strbuf_addstr(out, file);
> >
> > -               if (is_executable(buf.buf))
> > -                       return strbuf_detach(&buf, NULL);
> > +               if (is_executable(out->buf)) {
> > +                       strbuf_release(&buf);
> > +                       return 1;
> > +               }
> >
>
> We'd call strbuf_reset(out); here instead, before we break or loop.
>
> >                 if (!*end)
> >                         break;
> >                 p = end + 1
> >         }
> >
> > +       strbuf_reset(out);
>
> And we could leave this one out; if we make it here, we'd know the
> buffer was already reset.
>
> >         strbuf_release(&buf);
> > -       return NULL;
> > +       return 0;
> >  }
> >
> >  static int exists_in_PATH(const char *file)
> >  {
> > -       char *r = locate_in_PATH(file);
> > -       int found = r != NULL;
> > -       free(r);
> > -       return found;
> > +       return locate_in_PATH(file, NULL);
> >  }
> >
> >  int sane_execvp(const char *file, char * const argv[])
> > @@ -427,15 +432,17 @@ static int prepare_cmd(struct argv_array *out, const struct child_process *cmd)
> >          * directly.
> >          */
> >         if (!strchr(out->argv[1], '/')) {
> > -               char *program = locate_in_PATH(out->argv[1]);
> > -               if (program) {
> > -                       free((char *)out->argv[1]);
> > -                       out->argv[1] = program;
> > -               } else {
> > +               struct strbuf program = STRBUF_INIT;
> > +
> > +               if (!locate_in_PATH(out->argv[1], &program)) {
> >                         argv_array_clear(out);
> > +                       strbuf_release(&program);
> >                         errno = ENOENT;
> >                         return -1;
> >                 }
> > +
> > +               free((char *)out->argv[1]);
> > +               out->argv[1] = strbuf_detach(&program, NULL);
> >         }
> >
> >         return 0;
>
> Just a thought. (Pardon the noise from the peanut gallery!)
>
> Best regards,
> Bryan Turner

  reply	other threads:[~2020-01-07  3:41 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-07  1:36 [PATCH] run-command: avoid undefined behavior in exists_in_PATH brian m. carlson
2020-01-07  2:04 ` Jonathan Nieder
2020-01-07  2:16   ` brian m. carlson
2020-01-07  3:40   ` Bryan Turner
2020-01-07  3:41     ` Bryan Turner [this message]
2020-01-07 11:08   ` Jeff King
2020-01-07 11:01 ` Jeff King
2020-01-07 16:58   ` Junio C Hamano
2020-01-08  2:47   ` brian m. carlson
2020-01-08  9:15     ` Miriam R.
2020-01-08 10:28       ` Christian Couder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGyf7-G_YKtyKgAA+LLqbvh2Tuq+qmVVH=yz9=kfC5uWXHMr_Q@mail.gmail.com' \
    --to=bturner@atlassian.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=mirucam@gmail.com \
    --cc=peff@peff.net \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).