git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Enzo Matsumiya <ematsumiya@suse.de>
Subject: Re: [PATCH 5/5] run-command API: remove "argv" member, always use "args"
Date: Mon, 22 Nov 2021 12:32:39 -0500	[thread overview]
Message-ID: <YZvUN0kuTpmf9Q7P@coredump.intra.peff.net> (raw)
In-Reply-To: <patch-5.5-ea1011f7473-20211122T153605Z-avarab@gmail.com>

On Mon, Nov 22, 2021 at 05:04:07PM +0100, Ævar Arnfjörð Bjarmason wrote:

> This change is larger than I'd like, but there's no easy way to avoid
> it that wouldn't involve even more verbose intermediate steps. We use
> the "argv" as the source of truth over the "args", so we need to
> change all parts of run-command.[ch] itself, as well as the trace2
> logging at the same time.

Yeah, agreed most of this needs to come in a bundle. But at least few spots
could be split out into the earlier patches:

>  builtin/worktree.c          |  2 --
>  t/helper/test-run-command.c | 10 +++++---

as they are just users of the API.

> diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
> index 3c4fb862234..b5519f92a19 100644
> --- a/t/helper/test-run-command.c
> +++ b/t/helper/test-run-command.c
> [...]
> @@ -396,7 +396,8 @@ int cmd__run_command(int argc, const char **argv)
>  	}
>  	if (argc < 3)
>  		return 1;
> -	proc.argv = (const char **)argv + 2;
> +	strvec_clear(&proc.args);
> +	strvec_pushv(&proc.args, (const char **)argv + 2);
>  
>  	if (!strcmp(argv[1], "start-command-ENOENT")) {
>  		if (start_command(&proc) < 0 && errno == ENOENT)
> @@ -408,7 +409,8 @@ int cmd__run_command(int argc, const char **argv)
>  		exit(run_command(&proc));
>  
>  	jobs = atoi(argv[2]);
> -	proc.argv = (const char **)argv + 3;
> +	strvec_clear(&proc.args);
> +	strvec_pushv(&proc.args, (const char **)argv + 3);
>  
>  	if (!strcmp(argv[1], "run-command-parallel"))
>  		exit(run_processes_parallel(jobs, parallel_next,

These two in particular are tricky. This second strvec_clear() is
necessary because we are overwriting what was put into proc.args by the
earlier strvec_pushv(). I think this is one of two interesting ways
these kinds of trivial conversions can fail:

  - somebody is relying on the fact that "argv = whatever" is an
    assignment, not an append (which is the case here)

  - somebody is relying on the fact that assignment of the pointer is
    shallow, whereas strvec_push() is doing a deep copy. So converting:

      cp.argv = argv;
      argv[1] = "foo";

    to:

      strvec_pushv(&cp.args, argv);
      argv[1] = "foo";

    is not correct. We wouldn't use the updated "foo". I didn't notice
    any cases like this during my rough own rough conversion, but they
    could be lurking.

The strvec_clear() in the first hunk above is also not necessary (nobody
has written anything before it), but I don't mind it for consistency /
being defensive.

> @@ -902,8 +900,9 @@ int start_command(struct child_process *cmd)
>  #else
>  {
>  	int fhin = 0, fhout = 1, fherr = 2;
> -	const char **sargv = cmd->argv;
> +	const char **sargv = strvec_detach(&cmd->args);
>  	struct strvec nargv = STRVEC_INIT;
> +	const char **temp_argv = NULL;

I wondered about detaching here, because other parts of the code
(finish_command(), tracing, etc) will be looking at cmd->args.v[0] for
the command name.

But we eventually overwrite it with munged results:

> @@ -929,20 +928,26 @@ int start_command(struct child_process *cmd)
>  		fhout = dup(cmd->out);
>  
>  	if (cmd->git_cmd)
> -		cmd->argv = prepare_git_cmd(&nargv, cmd->argv);
> +		temp_argv = prepare_git_cmd(&nargv, sargv);
>  	else if (cmd->use_shell)
> -		cmd->argv = prepare_shell_cmd(&nargv, cmd->argv);
> +		temp_argv = prepare_shell_cmd(&nargv, sargv);
> +	else
> +		temp_argv = sargv;
> +	if (!temp_argv)
> +		BUG("should have some cmd->args to set");
> +	strvec_pushv(&cmd->args, temp_argv);
> +	strvec_clear(&nargv);

So we have to do some replacement. I think the memory management gets
confusing here, though.

At this point "temp_argv" might point to nargv.v (in which case it is a
dangling pointer after we clear nargv) or to the original sargv (in
which case it is memory owned by us that must be freed). The former is
OK, because we never look at it again. And the latter we eventually
reattach into &cmd->args, but:

> -	strvec_clear(&nargv);
> -	cmd->argv = sargv;
> +	strvec_pushv(&cmd->args, sargv);;
> +	free(sargv);

I think we still have all the entries we pushed into cmd->args from
temp_argv earlier. So we'd need to strvec_clear() it before pushing
sargv again.

And then the free(sargv) is too shallow. It's just freeing the outer
array, but sargv[0], etc, are leaked.

I think what you really want is the equivalent of strvec_attach(). We
don't have that, but it's basically just:

  void strvec_attach(struct strvec *s, const char **v)
  {
	/*
	 * this is convenient for our caller, but if we wanted to find
	 * potential misuses, we could also BUG() if s.nr > 0
	 */
	strvec_clear(&s);

        /* take over ownership */
	s->v = v;

	/* v must be NULL-terminated; count up to set number */
	s->nr = 0;
	for (; *v; v++)
		s->nr++;
	/*
	 * we don't know how big the original allocation was, so we can
	 * assume only nr. But add 1 to account for the NULL.
	 */
	s->alloc = s->nr + 1;
  }

I do think this whole thing would be easier to read if we left cmd->argv
intact, and just operated on a separate argv strvec. That's what the
non-Windows side does. But that distinction is nothing new in your
patch, and I'm not sure if there is a reason that the Windows code does
it differently.

-Peff

  reply	other threads:[~2021-11-22 17:32 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-20 19:40 [PATCH v2] pager: fix crash when pager program doesn't exist Enzo Matsumiya
2021-11-21 18:37 ` Jeff King
2021-11-22  2:10   ` Junio C Hamano
2021-11-22  4:35     ` Jeff King
2021-11-22 14:52       ` Enzo Matsumiya
2021-11-22 17:05         ` Junio C Hamano
2021-11-23 16:40           ` Enzo Matsumiya
2021-11-24  1:55             ` Ævar Arnfjörð Bjarmason
2021-11-24 15:51               ` Jeff King
2021-11-22 16:04       ` [PATCH 0/5] run-command API: get rid of "argv" Ævar Arnfjörð Bjarmason
2021-11-22 16:04         ` [PATCH 1/5] archive-tar: use our own cmd.buf in error message Ævar Arnfjörð Bjarmason
2021-11-22 21:04           ` Junio C Hamano
2021-11-22 16:04         ` [PATCH 2/5] upload-archive: use regular "struct child_process" pattern Ævar Arnfjörð Bjarmason
2021-11-22 17:02           ` Jeff King
2021-11-22 20:53           ` Ævar Arnfjörð Bjarmason
2021-11-22 21:10             ` Jeff King
2021-11-22 21:36               ` Ævar Arnfjörð Bjarmason
2021-11-22 16:04         ` [PATCH 3/5] run-command API users: use strvec_pushv(), not argv assignment Ævar Arnfjörð Bjarmason
2021-11-22 21:19           ` Junio C Hamano
2021-11-22 21:30             ` Ævar Arnfjörð Bjarmason
2021-11-22 16:04         ` [PATCH 4/5] run-command API users: use strvec_pushl(), not argv construction Ævar Arnfjörð Bjarmason
2021-11-22 16:04         ` [PATCH 5/5] run-command API: remove "argv" member, always use "args" Ævar Arnfjörð Bjarmason
2021-11-22 17:32           ` Jeff King [this message]
2021-11-22 18:19             ` Ævar Arnfjörð Bjarmason
2021-11-22 18:47               ` Jeff King
2021-11-22 17:52         ` [PATCH 0/5] run-command API: get rid of "argv" Jeff King
2021-11-22 18:11           ` Junio C Hamano
2021-11-22 18:33             ` Ævar Arnfjörð Bjarmason
2021-11-22 18:49               ` Jeff King
2021-11-22 18:26           ` Ævar Arnfjörð Bjarmason
2021-11-23 12:06         ` [PATCH v2 0/9] run-command API: get rid of "argv" and "env" Ævar Arnfjörð Bjarmason
2021-11-23 12:06           ` [PATCH v2 1/9] worktree: remove redundant NULL-ing of "cp.argv Ævar Arnfjörð Bjarmason
2021-11-23 15:26             ` Eric Sunshine
2021-11-24  1:54               ` Junio C Hamano
2021-11-24  6:00                 ` Eric Sunshine
2021-11-24  6:12                   ` Eric Sunshine
2021-11-24  5:44               ` Eric Sunshine
2021-11-23 12:06           ` [PATCH v2 2/9] upload-archive: use regular "struct child_process" pattern Ævar Arnfjörð Bjarmason
2021-11-23 12:06           ` [PATCH v2 3/9] run-command API users: use strvec_pushv(), not argv assignment Ævar Arnfjörð Bjarmason
2021-11-23 12:06           ` [PATCH v2 4/9] run-command tests: " Ævar Arnfjörð Bjarmason
2021-11-24  1:33             ` Eric Sunshine
2021-11-23 12:06           ` [PATCH v2 5/9] run-command API users: use strvec_pushl(), not argv construction Ævar Arnfjörð Bjarmason
2021-11-23 12:06           ` [PATCH v2 6/9] run-command API users: use strvec_push(), " Ævar Arnfjörð Bjarmason
2021-11-23 12:06           ` [PATCH v2 7/9] run-command API: remove "argv" member, always use "args" Ævar Arnfjörð Bjarmason
2021-11-23 12:06           ` [PATCH v2 8/9] difftool: use "env_array" to simplify memory management Ævar Arnfjörð Bjarmason
2021-11-23 12:06           ` [PATCH v2 9/9] run-command API: remove "env" member, always use "env_array" Ævar Arnfjörð Bjarmason
2021-11-25 22:52           ` [PATCH v3 0/9] run-command API: get rid of "argv" and "env" Ævar Arnfjörð Bjarmason
2021-11-25 22:52             ` [PATCH v3 1/9] worktree: stop being overly intimate with run_command() internals Ævar Arnfjörð Bjarmason
2021-11-26  9:48               ` Eric Sunshine
2021-11-25 22:52             ` [PATCH v3 2/9] upload-archive: use regular "struct child_process" pattern Ævar Arnfjörð Bjarmason
2021-11-25 22:52             ` [PATCH v3 3/9] run-command API users: use strvec_pushv(), not argv assignment Ævar Arnfjörð Bjarmason
2021-11-25 22:52             ` [PATCH v3 4/9] run-command tests: " Ævar Arnfjörð Bjarmason
2021-11-25 22:52             ` [PATCH v3 5/9] run-command API users: use strvec_pushl(), not argv construction Ævar Arnfjörð Bjarmason
2021-11-25 22:52             ` [PATCH v3 6/9] run-command API users: use strvec_push(), " Ævar Arnfjörð Bjarmason
2021-11-25 22:52             ` [PATCH v3 7/9] run-command API: remove "argv" member, always use "args" Ævar Arnfjörð Bjarmason
2021-11-25 22:52             ` [PATCH v3 8/9] difftool: use "env_array" to simplify memory management Ævar Arnfjörð Bjarmason
2021-11-25 22:52             ` [PATCH v3 9/9] run-command API: remove "env" member, always use "env_array" Ævar Arnfjörð Bjarmason
2021-11-22 15:31     ` [PATCH v2] pager: fix crash when pager program doesn't exist Enzo Matsumiya
2021-11-22 16:22       ` Ævar Arnfjörð Bjarmason
2021-11-22 16:46         ` Enzo Matsumiya
2021-11-22 17:10           ` Ævar Arnfjörð Bjarmason
2021-11-22 17:41             ` Jeff King
2021-11-22 18:00             ` Junio C Hamano
2021-11-22 18:26               ` Jeff King
2021-11-22 17:55           ` Junio C Hamano
2021-11-22 18:19             ` Junio C Hamano
2021-11-22 18:37               ` Jeff King
2021-11-22 20:39                 ` Junio C Hamano
2021-11-22 17:08         ` Junio C Hamano
2021-11-22 18:35           ` Ævar Arnfjörð Bjarmason
2021-11-22 16:30       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YZvUN0kuTpmf9Q7P@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=ematsumiya@suse.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).