git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Patrick Steinhardt <ps@pks.im>,
	 git@vger.kernel.org,  Eric Sunshine <sunshine@sunshineco.com>,
	 Karthik Nayak <karthik.188@gmail.com>
Subject: Re: [PATCH 5/5] mv: replace src_dir with a strvec
Date: Thu, 30 May 2024 08:36:25 -0700
Message-ID: <xmqqo78nqpl2.fsf@gitster.g>
In-Reply-To: <20240530064638.GE1949704@coredump.intra.peff.net> (Jeff King's message of "Thu, 30 May 2024 02:46:38 -0400")

Jeff King <peff@peff.net> writes:

> We manually manage the src_dir array with ALLOC_GROW. Using a strvec is
> a little more ergonomic, and makes the memory ownership more clear. It
> does mean that we copy the strings (which were otherwise just pointers
> into the "sources" strvec), but using the same rationale as 9fcd9e4e72
> (builtin/mv duplicate string list memory, 2024-05-27), it's just not
> enough to be worth worrying about here.

Hmph, the rationale given by 9fcd9e4e (builtin/mv duplicate string
list memory, 2024-05-27) essentially is "the number of elements is
the same as the number of command line parameters", but I do not
think that is quite correct.

When you do "mv srcA srcB ... dst", you'd inspect the command line
arguments from left to right, notice that srcA is a directory, find
the cache entries for paths that are inside srcA, append the paths
in that directory to the source[] and destination[] arrays, and
extend argc.  The "for (i = 0; i < argc; i++)" loop that appends one
element to src_for_dst per iteration ends up running once per path
being moved, which can be orders of magnitude more than the number
of command line parameters.
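
To put rough numbers on that, here is a toy model of the expansion.
It is only a sketch, not the code in builtin/mv.c: the function names
and the entries_in_dir[] input are made up for illustration, and it
assumes every directory argument has at least one tracked path in it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: entries_in_dir[i] is the number of tracked paths
 * under the i-th command line source, or 0 if that source is a plain
 * file.  Returns how many times a "for (i = 0; i < argc; i++)" loop
 * runs once argc has been extended for every directory.
 */
static size_t count_loop_iterations(const size_t *entries_in_dir,
				    size_t nr_args)
{
	size_t argc = nr_args;	/* one iteration per original source */
	size_t i;

	assert(entries_in_dir || !nr_args);
	for (i = 0; i < nr_args; i++)
		argc += entries_in_dir[i];	/* "extend argc" */
	return argc;
}

/* By contrast, at most one element per directory argument. */
static size_t count_src_dirs(const size_t *entries_in_dir, size_t nr_args)
{
	size_t i, nr = 0;

	assert(entries_in_dir || !nr_args);
	for (i = 0; i < nr_args; i++)
		if (entries_in_dir[i])
			nr++;
	return nr;
}
```

With three sources, two of which are directories holding 1000 and 250
tracked paths, the first function returns 1253 while the second
returns 2.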

Of course, if we needed to make copies for correctness reasons (or
to clarify memory ownership semantics), that alone may be a good
justification and we do not need an excuse "it's just a handful of
elements anyway" to begin with.

Anyway, that is about somebody else's patch, not this one ;-).

The rationale *does* apply to this change; src_dir is a list of
directories we found on the command line, so the number of elements
in it is reasonably bounded.
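
For readers less familiar with the ownership trade-off being
discussed, a minimal stand-in for a copying string vector might look
like the sketch below.  It is not the real strvec API; "mini_strvec"
and its helpers are hypothetical names, and unlike Git's helpers they
report allocation failure instead of dying.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a vector that owns copies of its strings. */
struct mini_strvec {
	char **v;
	size_t nr, alloc;
};

/* Append a private copy of "s"; 0 on success, -1 if allocation fails. */
static int mini_strvec_push(struct mini_strvec *vec, const char *s)
{
	char *copy;

	assert(vec && s);
	if (vec->nr == vec->alloc) {
		size_t new_alloc = vec->alloc ? 2 * vec->alloc : 8;
		char **grown = realloc(vec->v, new_alloc * sizeof(*grown));

		if (!grown)
			return -1;
		vec->v = grown;
		vec->alloc = new_alloc;
	}
	copy = malloc(strlen(s) + 1);
	if (!copy)
		return -1;
	strcpy(copy, s);
	vec->v[vec->nr++] = copy;	/* the vector owns "copy" from here on */
	return 0;
}

/* Free every copied string and the array itself, then reset to empty. */
static void mini_strvec_clear(struct mini_strvec *vec)
{
	size_t i;

	assert(vec);
	for (i = 0; i < vec->nr; i++)
		free(vec->v[i]);
	free(vec->v);
	vec->v = NULL;
	vec->nr = vec->alloc = 0;
}
```

The cost is one extra allocation per pushed string; in exchange, the
caller may free or overwrite the original buffer without leaving
dangling pointers behind, and a single clear call releases everything.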

> As a bonus, this gets rid of some "int"s used for allocation management
> (though in practice these were limited to command-line sizes and thus
> not overflowable).

Again, correct.

> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  builtin/mv.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 01725e4a20..6c69033c5f 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -197,8 +197,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  	struct strvec submodule_gitfiles_to_free = STRVEC_INIT;
>  	const char **submodule_gitfiles;
>  	char *dst_w_slash = NULL;
> -	const char **src_dir = NULL;
> -	int src_dir_nr = 0, src_dir_alloc = 0;
> +	struct strvec src_dir = STRVEC_INIT;
>  	enum update_mode *modes, dst_mode = 0;
>  	struct stat st, dest_st;
>  	struct string_list src_for_dst = STRING_LIST_INIT_DUP;
> @@ -344,8 +343,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			/* last - first >= 1 */
>  			modes[i] |= WORKING_DIRECTORY;
>  
> -			ALLOC_GROW(src_dir, src_dir_nr + 1, src_dir_alloc);
> -			src_dir[src_dir_nr++] = src;
> +			strvec_push(&src_dir, src);
>  
>  			n = argc + last - first;
>  			REALLOC_ARRAY(modes, n);
> @@ -559,7 +557,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		}
>  	}
>  
> -	remove_empty_src_dirs(src_dir, src_dir_nr);
> +	remove_empty_src_dirs(src_dir.v, src_dir.nr);
>  
>  	if (dirty_paths.nr)
>  		advise_on_moving_dirty_path(&dirty_paths);
> @@ -574,7 +572,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  	ret = 0;
>  
>  out:
> -	free(src_dir);
> +	strvec_clear(&src_dir);
>  	free(dst_w_slash);
>  	string_list_clear(&src_for_dst, 0);
>  	string_list_clear(&dirty_paths, 0);

