From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Tom Clarkson via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Avery Pennarun <apenwarr@gmail.com>,
	Ed Maste <emaste@freebsd.org>, Tom Clarkson <tom@tqclarkson.com>,
	Tom Clarkson <tom@tqclarkson.com>
Subject: Re: [PATCH v2 3/7] subtree: persist cache between split runs
Date: Wed, 7 Oct 2020 18:06:26 +0200 (CEST)
Message-ID: <nycvar.QRO.7.76.6.2010071750310.50@tvgsbejvaqbjf.bet>
In-Reply-To: <8eec18388c86071db47512b84118e3b9111bd34d.1602021913.git.gitgitgadget@gmail.com>

Hi Tom,

On Tue, 6 Oct 2020, Tom Clarkson via GitGitGadget wrote:

> @@ -48,6 +49,7 @@ annotate=
>  squash=
>  message=
>  prefix=
> +clearcache=

It might be more consistent to call it `clear_cache` (i.e. with an
underscore), just like `ignore_joins`.

>
>  debug () {
>  	if test -n "$debug"
> @@ -131,6 +133,9 @@ do
>  	--no-rejoin)
>  		rejoin=
>  		;;
> +	--clear-cache)
> +		clearcache=1
> +		;;
>  	--ignore-joins)
>  		ignore_joins=1
>  		;;
> @@ -206,9 +211,13 @@ debug "opts: {$*}"
>  debug
>
>  cache_setup () {
> -	cachedir="$GIT_DIR/subtree-cache/$$"
> -	rm -rf "$cachedir" ||
> -		die "Can't delete old cachedir: $cachedir"
> +	cachedir="$GIT_DIR/subtree-cache/$prefix"

Excellent, the `prefix` should be "unique enough".

> +	if test -n "$clearcache"
> +	then
> +		debug "Clearing cache"
> +		rm -rf "$cachedir" ||
> +			die "Can't delete old cachedir: $cachedir"
> +	fi
>  	mkdir -p "$cachedir" ||
>  		die "Can't create new cachedir: $cachedir"
>  	mkdir -p "$cachedir/notree" ||
> @@ -266,6 +275,16 @@ cache_set () {
>  	echo "$newrev" >"$cachedir/$oldrev"
>  }
>
> +cache_set_if_unset () {
> +	oldrev="$1"
> +	newrev="$2"

`local`? ;-)
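
To spell out the suggestion, here is a minimal sketch of the function with its
variables declared `local`. It assumes a shell that supports `local` (not part
of POSIX, though dash, bash, and busybox ash accept it), and it uses a scratch
`cachedir` from `mktemp -d` rather than the one `cache_setup` creates.

```shell
#!/bin/sh
# Assumption: in the real script cachedir is set by cache_setup;
# a temporary directory stands in for it here.
cachedir=$(mktemp -d)

cache_set_if_unset () {
	# `local` keeps these names from overwriting same-named
	# variables in the caller's scope.
	local oldrev="$1"
	local newrev="$2"
	if test -e "$cachedir/$oldrev"
	then
		return
	fi
	echo "$newrev" >"$cachedir/$oldrev"
}

oldrev=caller-value
cache_set_if_unset abc123 def456
echo "$oldrev"	# the caller's variable is untouched
```

Without `local`, the final `echo` would show `abc123`, because the function
would have overwritten the caller's `oldrev`.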

> +	if test -e "$cachedir/$oldrev"
> +	then
> +		return
> +	fi
> +	echo "$newrev" >"$cachedir/$oldrev"

So that directory contains commit mappings, one file per mapped
revision.

Thinking back to patch 2/7, I am no longer sure that it makes sense to
fill it with every commit in that commit range: performance suffers when
a directory contains too many files.

For example, I had a case in the past where it took a minute just to
enumerate a directory, and even checking whether a file existed in that
directory was not exactly fun.
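
One possible mitigation, offered only as a sketch and not something this patch
does, is to fan the cache out the way Git stores loose objects: use the first
two characters of the revision as a subdirectory. `cache_path` is a
hypothetical helper, the scratch `cachedir` is an assumption, and the sketch
assumes full hex object names.

```shell
#!/bin/sh
# Assumption: a scratch directory stands in for the real cachedir.
cachedir=$(mktemp -d)

# Hypothetical helper: map "abcdef..." to "$cachedir/ab/cdef...".
cache_path () {
	rev="$1"
	rest="${rev#??}"	# everything after the first two characters
	echo "$cachedir/${rev%"$rest"}/$rest"
}

cache_set_if_unset () {
	path=$(cache_path "$1")
	mkdir -p "${path%/*}" || return
	test -e "$path" ||
	echo "$2" >"$path"
}

cache_set_if_unset abcdef0123 1111
cache_set_if_unset abcdef0123 2222	# no-op: the entry already exists
cat "$cachedir/ab/cdef0123"
```

This caps the top level at 256 subdirectories for hex names, at the cost of
one extra `mkdir -p` per write.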

In any case, I would write it slightly shorter:

	test -e "$cachedir/$oldrev" ||
	echo "$newrev" >"$cachedir/$oldrev"
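
As a quick check that the shorter form behaves like the `if`/`return` version,
here is a self-contained sketch. The scratch `cachedir` is an assumption
standing in for the one `cache_setup` creates.

```shell
#!/bin/sh
# Assumption: a scratch directory stands in for the real cachedir.
cachedir=$(mktemp -d)

cache_set_if_unset () {
	oldrev="$1"
	newrev="$2"
	# Write only when no mapping exists yet; an existing entry wins.
	test -e "$cachedir/$oldrev" ||
	echo "$newrev" >"$cachedir/$oldrev"
}

cache_set_if_unset abc123 first
cache_set_if_unset abc123 second	# no-op: the entry already exists
cat "$cachedir/abc123"
```

The second call leaves the file alone, so `cat` prints `first`.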

> +}
> +
>  rev_exists () {
>  	if git rev-parse "$1" >/dev/null 2>&1
>  	then
> @@ -375,13 +394,13 @@ find_existing_splits () {
>  			then
>  				# squash commits refer to a subtree
>  				debug "  Squash: $sq from $sub"
> -				cache_set "$sq" "$sub"
> +				cache_set_if_unset "$sq" "$sub"
>  			fi
>  			if test -n "$main" -a -n "$sub"
>  			then
>  				debug "  Prior: $main -> $sub"
> -				cache_set $main $sub
> -				cache_set $sub $sub
> +				cache_set_if_unset $main $sub
> +				cache_set_if_unset $sub $sub
>  				try_remove_previous "$main"
>  				try_remove_previous "$sub"
>  			fi
> @@ -688,6 +707,8 @@ process_split_commit () {
>  		if test -n "$newparents"
>  		then
>  			cache_set "$rev" "$rev"
> +		else
> +			cache_set "$rev" ""

Was this hunk intended to be snuck in here? I can understand the
s/cache_set/cache_set_if_unset/ changes, of course, but not this hunk.

>  		fi
>  		return
>  	fi
> @@ -785,7 +806,7 @@ cmd_split () {
>  			# the 'onto' history is already just the subdir, so
>  			# any parent we find there can be used verbatim
>  			debug "  cache: $rev"
> -			cache_set "$rev" "$rev"
> +			cache_set_if_unset "$rev" "$rev"
>  		done
>  	fi
>
> @@ -798,7 +819,7 @@ cmd_split () {
>  		git rev-list --topo-order --skip=1 $mainline |
>  		while read rev
>  		do
> -			cache_set "$rev" ""
> +			cache_set_if_unset "$rev" ""

Okay. A quite interesting question now would be: are there any callers of
`cache_set` left? If so, why?
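
One way to answer that is a whole-word grep. The sketch below runs it on a
scratch file holding four calls copied from the hunks quoted in this mail; the
scratch file is only there to keep the example self-contained.

```shell
#!/bin/sh
# Sample calls copied from the quoted hunks, saved to a scratch file.
script=$(mktemp)
cat >"$script" <<'EOF'
cache_set_if_unset "$sq" "$sub"
cache_set "$rev" "$rev"
cache_set "$rev" ""
cache_set_if_unset "$rev" ""
EOF

# -w matches whole words only; "_" counts as a word character, so
# the cache_set_if_unset lines are not reported.
callers=$(grep -n -w 'cache_set' "$script")
echo "$callers"
```

In a git.git checkout, the equivalent would be
`git grep -n -w cache_set -- contrib/subtree/git-subtree.sh`.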

Thanks,
Dscho

>  		done || exit $?
>  	fi
>
> --
> gitgitgadget
>
>
