From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Tom Clarkson via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Avery Pennarun <apenwarr@gmail.com>,
Ed Maste <emaste@freebsd.org>, Tom Clarkson <tom@tqclarkson.com>
Subject: Re: [PATCH v2 3/7] subtree: persist cache between split runs
Date: Wed, 7 Oct 2020 18:06:26 +0200 (CEST)
Message-ID: <nycvar.QRO.7.76.6.2010071750310.50@tvgsbejvaqbjf.bet>
In-Reply-To: <8eec18388c86071db47512b84118e3b9111bd34d.1602021913.git.gitgitgadget@gmail.com>
Hi Tom,
On Tue, 6 Oct 2020, Tom Clarkson via GitGitGadget wrote:
> @@ -48,6 +49,7 @@ annotate=
> squash=
> message=
> prefix=
> +clearcache=
It might be more consistent to call it `clear_cache` (i.e. with an
underscore), just like `ignore_joins`.
>
> debug () {
> if test -n "$debug"
> @@ -131,6 +133,9 @@ do
> --no-rejoin)
> rejoin=
> ;;
> + --clear-cache)
> + clearcache=1
> + ;;
> --ignore-joins)
> ignore_joins=1
> ;;
> @@ -206,9 +211,13 @@ debug "opts: {$*}"
> debug
>
> cache_setup () {
> - cachedir="$GIT_DIR/subtree-cache/$$"
> - rm -rf "$cachedir" ||
> - die "Can't delete old cachedir: $cachedir"
> + cachedir="$GIT_DIR/subtree-cache/$prefix"
Excellent, the `prefix` should be "unique enough".
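To make the effect of keying the cache on the prefix concrete, here is a minimal sketch that simulates consecutive runs. It assumes a throwaway directory standing in for `$GIT_DIR` and a made-up prefix `lib/foo`; the `cache_setup` below is a simplified copy of the quoted hunk, not the real function.

```shell
#!/bin/sh
# Sketch only: GIT_DIR is a temporary stand-in, and "lib/foo" is a
# hypothetical --prefix value chosen for illustration.
GIT_DIR=$(mktemp -d)
prefix=lib/foo
clearcache=

# Simplified version of the patched cache_setup: the directory name
# depends on the prefix rather than on the process ID ($$).
cache_setup () {
	cachedir="$GIT_DIR/subtree-cache/$prefix"
	if test -n "$clearcache"
	then
		rm -rf "$cachedir" || return 1
	fi
	mkdir -p "$cachedir"
}

# "Run" 1 records a mapping.
cache_setup
echo newrev >"$cachedir/oldrev"

# "Run" 2 uses the same directory, so the mapping is still there.
cache_setup
kept=$(cat "$cachedir/oldrev")

# A run with --clear-cache starts from an empty directory again.
clearcache=1
cache_setup
if test -e "$cachedir/oldrev"
then
	cleared=no
else
	cleared=yes
fi

echo "kept=$kept cleared=$cleared"
rm -rf "$GIT_DIR"
```

With the old `$$`-based name, the second run would almost always have looked at a different, empty directory.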
> + if test -n "$clearcache"
> + then
> + debug "Clearing cache"
> + rm -rf "$cachedir" ||
> + die "Can't delete old cachedir: $cachedir"
> + fi
> mkdir -p "$cachedir" ||
> die "Can't create new cachedir: $cachedir"
> mkdir -p "$cachedir/notree" ||
> @@ -266,6 +275,16 @@ cache_set () {
> echo "$newrev" >"$cachedir/$oldrev"
> }
>
> +cache_set_if_unset () {
> + oldrev="$1"
> + newrev="$2"
`local`? ;-)
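To spell out why that matters, here is a minimal sketch. The names `set_leaky` and `set_scoped` are made up for illustration, and it assumes a shell that supports `local` (it is not in POSIX, but dash, bash and busybox ash all accept it).

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical stand-ins for
# cache_set_if_unset, writing into a temporary cache directory.
cachedir=$(mktemp -d)

# Without `local`, the assignments overwrite the caller's variables.
set_leaky () {
	oldrev="$1"
	newrev="$2"
	test -e "$cachedir/$oldrev" ||
	echo "$newrev" >"$cachedir/$oldrev"
}

# With `local`, the caller's variables are left alone.
set_scoped () {
	local oldrev newrev
	oldrev="$1"
	newrev="$2"
	test -e "$cachedir/$oldrev" ||
	echo "$newrev" >"$cachedir/$oldrev"
}

oldrev=caller-value
set_scoped aaa bbb
after_scoped=$oldrev

set_leaky ccc ddd
after_leaky=$oldrev

echo "with local: $after_scoped, without: $after_leaky"
rm -rf "$cachedir"
```

In a script that passes `$oldrev` and `$newrev` around as much as this one does, the leaky variant is an easy way to introduce a hard-to-find bug.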
> + if test -e "$cachedir/$oldrev"
> + then
> + return
> + fi
> + echo "$newrev" >"$cachedir/$oldrev"
So that directory contains commit mappings, a file for each mapped
revision.
Thinking back to patch 2/7, I am no longer so sure that it makes
sense to fill it up with every commit in that commit range: performance
suffers when directories contain too many files.
For example, I had a case in the past where it took a minute just to
enumerate a directory, and even looking whether a file existed in that
directory was not exactly fun.
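One possible way around that, purely as a sketch and not something the patch does, would be to fan the files out into subdirectories named after the first two characters of the revision, similar to how loose objects are laid out under `.git/objects`. The helper names `cache_path` and `cache_get` are hypothetical, and the sketch assumes that every revision is a full object ID (so that it is longer than two characters).

```shell
#!/bin/sh
# Sketch only: a hypothetical fan-out layout for the cache directory.
cachedir=$(mktemp -d)
trap 'rm -rf "$cachedir"' EXIT

# Map "abcdef..." to "$cachedir/ab/cdef...".
cache_path () {
	local rev rest
	rev="$1"
	rest="${rev#??}"	# everything after the first two characters
	echo "$cachedir/${rev%"$rest"}/$rest"
}

cache_set_if_unset () {
	local path
	path=$(cache_path "$1")
	test -e "$path" && return 0
	mkdir -p "${path%/*}" &&
	echo "$2" >"$path"
}

# Print the cached value; fails if there is no entry.
cache_get () {
	local path
	path=$(cache_path "$1")
	test -e "$path" && cat "$path"
}

rev=8eec18388c86071db47512b84118e3b9111bd34d
cache_set_if_unset "$rev" first
cache_set_if_unset "$rev" second	# no-op, an entry already exists
cache_get "$rev"
```

This keeps each directory at roughly 1/256th of the total number of entries, at the cost of an extra `mkdir -p` per new entry.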
In any case, I would write it slightly shorter:
	test -e "$cachedir/$oldrev" ||
	echo "$newrev" >"$cachedir/$oldrev"
> +}
> +
> rev_exists () {
> if git rev-parse "$1" >/dev/null 2>&1
> then
> @@ -375,13 +394,13 @@ find_existing_splits () {
> then
> # squash commits refer to a subtree
> debug " Squash: $sq from $sub"
> - cache_set "$sq" "$sub"
> + cache_set_if_unset "$sq" "$sub"
> fi
> if test -n "$main" -a -n "$sub"
> then
> debug " Prior: $main -> $sub"
> - cache_set $main $sub
> - cache_set $sub $sub
> + cache_set_if_unset $main $sub
> + cache_set_if_unset $sub $sub
> try_remove_previous "$main"
> try_remove_previous "$sub"
> fi
> @@ -688,6 +707,8 @@ process_split_commit () {
> if test -n "$newparents"
> then
> cache_set "$rev" "$rev"
> + else
> + cache_set "$rev" ""
Was this hunk intended to be snuck in here? I can understand the
s/cache_set/cache_set_if_unset/ changes, of course, but not this hunk.
> fi
> return
> fi
> @@ -785,7 +806,7 @@ cmd_split () {
> # the 'onto' history is already just the subdir, so
> # any parent we find there can be used verbatim
> debug " cache: $rev"
> - cache_set "$rev" "$rev"
> + cache_set_if_unset "$rev" "$rev"
> done
> fi
>
> @@ -798,7 +819,7 @@ cmd_split () {
> git rev-list --topo-order --skip=1 $mainline |
> while read rev
> do
> - cache_set "$rev" ""
> + cache_set_if_unset "$rev" ""
Okay. A quite interesting question now would be: are there any callers of
`cache_set` left? If so, why?
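A quick way to check is a word-boundary search, which skips `cache_set_if_unset` because `_` counts as a word character. The sketch below runs against a small excerpt assembled from the lines quoted in this mail, not against the real script, and it assumes a `grep` that supports `-w` (GNU and BSD grep both do).

```shell
#!/bin/sh
# Sketch only: the sample file is an excerpt made up of lines quoted
# above, standing in for the real script.
sample=$(mktemp)
cat >"$sample" <<'EOF'
cache_set () {
cache_set_if_unset () {
	cache_set_if_unset "$sq" "$sub"
		cache_set "$rev" "$rev"
		cache_set "$rev" ""
	cache_set_if_unset "$rev" ""
EOF

# -w only matches whole words, so "cache_set_if_unset" is not counted.
remaining=$(grep -c -w 'cache_set' "$sample")
echo "$remaining"
rm -f "$sample"
```

In a git.git checkout, the equivalent would be something like `git grep -n -w cache_set -- contrib/subtree/git-subtree.sh`.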
Thanks,
Dscho
> done || exit $?
> fi
>
> --
> gitgitgadget
>
>
Thread overview: 28+ messages
2020-05-11 5:49 [PATCH 0/7] subtree: Fix handling of complex history Tom Clarkson via GitGitGadget
2020-05-11 5:49 ` [PATCH 1/7] subtree: handle multiple parents passed to cache_miss Tom Clarkson via GitGitGadget
2020-05-11 5:49 ` [PATCH 2/7] subtree: exclude commits predating add from recursive processing Tom Clarkson via GitGitGadget
2020-05-11 5:49 ` [PATCH 3/7] subtree: persist cache between split runs Tom Clarkson via GitGitGadget
2020-05-11 5:49 ` [PATCH 4/7] subtree: add git subtree map command Tom Clarkson via GitGitGadget
2020-05-11 5:49 ` [PATCH 5/7] subtree: add git subtree use and ignore commands Tom Clarkson via GitGitGadget
2020-05-11 5:50 ` [PATCH 6/7] subtree: more robustly distinguish subtree and mainline commits Tom Clarkson via GitGitGadget
2020-05-11 5:50 ` [PATCH 7/7] subtree: document new subtree commands Tom Clarkson via GitGitGadget
2020-10-04 17:52 ` [PATCH 0/7] subtree: Fix handling of complex history Ed Maste
2020-10-04 19:27 ` Johannes Schindelin
2020-10-05 16:47 ` Junio C Hamano
2020-10-05 21:37 ` Ed Maste
2020-10-07 16:31 ` Johannes Schindelin
2020-10-06 22:05 ` [PATCH v2 " Tom Clarkson via GitGitGadget
2020-10-06 22:05 ` [PATCH v2 1/7] subtree: handle multiple parents passed to cache_miss Tom Clarkson via GitGitGadget
2020-10-07 13:12 ` Ed Maste
2020-10-06 22:05 ` [PATCH v2 2/7] subtree: exclude commits predating add from recursive processing Tom Clarkson via GitGitGadget
2020-10-07 15:36 ` Johannes Schindelin
2020-10-06 22:05 ` [PATCH v2 3/7] subtree: persist cache between split runs Tom Clarkson via GitGitGadget
2020-10-07 16:06 ` Johannes Schindelin [this message]
2020-10-06 22:05 ` [PATCH v2 4/7] subtree: add git subtree map command Tom Clarkson via GitGitGadget
2020-10-06 22:05 ` [PATCH v2 5/7] subtree: add git subtree use and ignore commands Tom Clarkson via GitGitGadget
2020-10-07 16:29 ` Johannes Schindelin
2020-10-06 22:05 ` [PATCH v2 6/7] subtree: more robustly distinguish subtree and mainline commits Tom Clarkson via GitGitGadget
2020-10-07 19:42 ` Johannes Schindelin
2020-10-06 22:05 ` [PATCH v2 7/7] subtree: document new subtree commands Tom Clarkson via GitGitGadget
2020-10-07 19:43 ` Johannes Schindelin
2020-10-07 19:46 ` [PATCH v2 0/7] subtree: Fix handling of complex history Johannes Schindelin