git@vger.kernel.org mailing list mirror (one of many)
* [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format
@ 2017-09-21  7:49 Ian Campbell
  2017-09-21  7:49 ` [PATCH v3 1/4] filter-branch: reset $GIT_* before cleaning up Ian Campbell
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Ian Campbell @ 2017-09-21  7:49 UTC
  To: Junio C Hamano; +Cc: git

This is the third version of my patches to add incremental support to
git-filter-branch. Since the last time I have replaced
`git mktag --allow-missing-tagger` with `git hash-object -t tag -w --stdin`.

I've force pushed to [1] (Travis is still running) and have set off the
process of re-rewriting the devicetree tree from scratch (a multi-day
affair) to validate (it's looking good).

Ian.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/devicetree/devicetree-rebasing.git/
[1] https://github.com/ijc/git/tree/git-filter-branch


* [PATCH v3 1/4] filter-branch: reset $GIT_* before cleaning up
  2017-09-21  7:49 [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format Ian Campbell
@ 2017-09-21  7:49 ` Ian Campbell
  2017-09-21  7:49 ` [PATCH v3 2/4] filter-branch: preserve and restore $GIT_AUTHOR_* and $GIT_COMMITTER_* Ian Campbell
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2017-09-21  7:49 UTC
  To: gitster; +Cc: git, Ian Campbell

This is pure code motion to enable a subsequent patch to add code which needs
to happen with the reset $GIT_* but before the temporary directory has been
cleaned up.
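
The ordering this patch sets up can be sketched outside of filter-branch. The following is a minimal illustration, not code from the series: `fake.git`, `map.txt` and `saved_state` are made-up names, and only `GIT_DIR` is handled. It shows a step that runs with the caller's environment restored while the scratch directory still exists.

```shell
#!/bin/sh
# Hypothetical stand-ins for filter-branch's $orig_dir and $tempdir.
orig_dir=$(pwd)
tempdir=$(mktemp -d)
trap 'cd "$orig_dir"; rm -rf "$tempdir"' 0

ORIG_GIT_DIR="${GIT_DIR-}"	# remember the caller's value (may be empty)
cd "$tempdir"
GIT_DIR="$tempdir/fake.git"; export GIT_DIR	# temporary value used while "filtering"
echo "abc:def" >map.txt				# scratch data a later step may need

# 1. Reset the environment first ...
unset GIT_DIR
test -z "$ORIG_GIT_DIR" || {
	GIT_DIR="$ORIG_GIT_DIR" && export GIT_DIR
}

# 2. ... so this step sees the caller's environment while the scratch data
#    is still on disk.
saved_state=$(cat "$tempdir/map.txt")

# 3. Only now clean up and drop the exit trap.
cd "$orig_dir"
rm -rf "$tempdir"
trap - 0

echo "saved: $saved_state"
```

With the old ordering, step 2 would have to run either with the temporary `GIT_DIR` still exported or after `map.txt` had already been removed.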

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
---
 git-filter-branch.sh | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 3a74602ef..3da281f8a 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -544,11 +544,6 @@ if [ "$filter_tag_name" ]; then
 	done
 fi
 
-cd "$orig_dir"
-rm -rf "$tempdir"
-
-trap - 0
-
 unset GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE
 test -z "$ORIG_GIT_DIR" || {
 	GIT_DIR="$ORIG_GIT_DIR" && export GIT_DIR
@@ -562,6 +557,11 @@ test -z "$ORIG_GIT_INDEX_FILE" || {
 	export GIT_INDEX_FILE
 }
 
+cd "$orig_dir"
+rm -rf "$tempdir"
+
+trap - 0
+
 if [ "$(is_bare_repository)" = false ]; then
 	git read-tree -u -m HEAD || exit
 fi
-- 
2.11.0



* [PATCH v3 2/4] filter-branch: preserve and restore $GIT_AUTHOR_* and $GIT_COMMITTER_*
  2017-09-21  7:49 [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format Ian Campbell
  2017-09-21  7:49 ` [PATCH v3 1/4] filter-branch: reset $GIT_* before cleaning up Ian Campbell
@ 2017-09-21  7:49 ` Ian Campbell
  2017-09-21  7:49 ` [PATCH v3 3/4] filter-branch: stash away ref map in a branch Ian Campbell
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2017-09-21  7:49 UTC
  To: gitster; +Cc: git, Ian Campbell

These are modified by set_ident() but a subsequent patch would like to operate
on their original values.
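
As a rough illustration of the save/unset/restore pattern used below, here is a standalone sketch for two of the six variables. The clobbering step merely stands in for set_ident(); the names and addresses are invented.

```shell
#!/bin/sh
# Caller's environment: an author name is set, an author email is not.
GIT_AUTHOR_NAME="A U Thor"; export GIT_AUTHOR_NAME
unset GIT_AUTHOR_EMAIL

# Save the originals before anything overwrites them.
ORIG_GIT_AUTHOR_NAME="$GIT_AUTHOR_NAME"
ORIG_GIT_AUTHOR_EMAIL="$GIT_AUTHOR_EMAIL"

# Stand-in for set_ident(): clobber both with per-commit values.
GIT_AUTHOR_NAME="Someone Else"; export GIT_AUTHOR_NAME
GIT_AUTHOR_EMAIL="else@example.com"; export GIT_AUTHOR_EMAIL

# Restore: unset everything, then re-export only what was non-empty.
unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
test -z "$ORIG_GIT_AUTHOR_NAME" || {
	GIT_AUTHOR_NAME="$ORIG_GIT_AUTHOR_NAME" &&
	export GIT_AUTHOR_NAME
}
test -z "$ORIG_GIT_AUTHOR_EMAIL" || {
	GIT_AUTHOR_EMAIL="$ORIG_GIT_AUTHOR_EMAIL" &&
	export GIT_AUTHOR_EMAIL
}

echo "name=${GIT_AUTHOR_NAME-<unset>} email=${GIT_AUTHOR_EMAIL-<unset>}"
```

One consequence of testing with `test -z` is that a variable which was set but empty on entry comes back unset rather than empty.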

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
---
 git-filter-branch.sh | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 3da281f8a..9edb94206 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -219,6 +219,13 @@ trap 'cd "$orig_dir"; rm -rf "$tempdir"' 0
 ORIG_GIT_DIR="$GIT_DIR"
 ORIG_GIT_WORK_TREE="$GIT_WORK_TREE"
 ORIG_GIT_INDEX_FILE="$GIT_INDEX_FILE"
+ORIG_GIT_AUTHOR_NAME="$GIT_AUTHOR_NAME"
+ORIG_GIT_AUTHOR_EMAIL="$GIT_AUTHOR_EMAIL"
+ORIG_GIT_AUTHOR_DATE="$GIT_AUTHOR_DATE"
+ORIG_GIT_COMMITTER_NAME="$GIT_COMMITTER_NAME"
+ORIG_GIT_COMMITTER_EMAIL="$GIT_COMMITTER_EMAIL"
+ORIG_GIT_COMMITTER_DATE="$GIT_COMMITTER_DATE"
+
 GIT_WORK_TREE=.
 export GIT_DIR GIT_WORK_TREE
 
@@ -545,6 +552,8 @@ if [ "$filter_tag_name" ]; then
 fi
 
 unset GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE
+unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE
+unset GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL GIT_COMMITTER_DATE
 test -z "$ORIG_GIT_DIR" || {
 	GIT_DIR="$ORIG_GIT_DIR" && export GIT_DIR
 }
@@ -556,6 +565,30 @@ test -z "$ORIG_GIT_INDEX_FILE" || {
 	GIT_INDEX_FILE="$ORIG_GIT_INDEX_FILE" &&
 	export GIT_INDEX_FILE
 }
+test -z "$ORIG_GIT_AUTHOR_NAME" || {
+	GIT_AUTHOR_NAME="$ORIG_GIT_AUTHOR_NAME" &&
+	export GIT_AUTHOR_NAME
+}
+test -z "$ORIG_GIT_AUTHOR_EMAIL" || {
+	GIT_AUTHOR_EMAIL="$ORIG_GIT_AUTHOR_EMAIL" &&
+	export GIT_AUTHOR_EMAIL
+}
+test -z "$ORIG_GIT_AUTHOR_DATE" || {
+	GIT_AUTHOR_DATE="$ORIG_GIT_AUTHOR_DATE" &&
+	export GIT_AUTHOR_DATE
+}
+test -z "$ORIG_GIT_COMMITTER_NAME" || {
+	GIT_COMMITTER_NAME="$ORIG_GIT_COMMITTER_NAME" &&
+	export GIT_COMMITTER_NAME
+}
+test -z "$ORIG_GIT_COMMITTER_EMAIL" || {
+	GIT_COMMITTER_EMAIL="$ORIG_GIT_COMMITTER_EMAIL" &&
+	export GIT_COMMITTER_EMAIL
+}
+test -z "$ORIG_GIT_COMMITTER_DATE" || {
+	GIT_COMMITTER_DATE="$ORIG_GIT_COMMITTER_DATE" &&
+	export GIT_COMMITTER_DATE
+}
 
 cd "$orig_dir"
 rm -rf "$tempdir"
-- 
2.11.0



* [PATCH v3 3/4] filter-branch: stash away ref map in a branch
  2017-09-21  7:49 [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format Ian Campbell
  2017-09-21  7:49 ` [PATCH v3 1/4] filter-branch: reset $GIT_* before cleaning up Ian Campbell
  2017-09-21  7:49 ` [PATCH v3 2/4] filter-branch: preserve and restore $GIT_AUTHOR_* and $GIT_COMMITTER_* Ian Campbell
@ 2017-09-21  7:49 ` Ian Campbell
  2017-09-21  7:49 ` [PATCH v3 4/4] filter-branch: use hash-object instead of mktag Ian Campbell
  2017-09-22  4:42 ` [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format Junio C Hamano
  4 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2017-09-21  7:49 UTC
  To: gitster; +Cc: git, Ian Campbell

With the "--state-branch <branch>" option, the mapping from old object names
to filtered ones in the ./map/ directory is stashed away in the object database,
and the mapping from the previous run is read to populate the ./map/ directory,
allowing for incremental updates of large trees.
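
The stashed file is a flat list of "old:new" lines, one per file in ./map/. The patch does the conversion in Perl and stores the result as a blob; the sketch below shows only the round trip, in plain shell, against a throwaway directory. The paths and the shortened object names are invented for the example.

```shell
#!/bin/sh
# Hypothetical scratch area standing in for filter-branch's map/ directory.
work=$(mktemp -d)
mkdir "$work/map" "$work/map2"

# Pretend two commits were rewritten (object names shortened for readability).
printf '%s\n' 1111aaaa >"$work/map/aaaa0001"
printf '%s\n' 2222bbbb >"$work/map/bbbb0002"

# Save: one "old:new" line per file; the glob expands in sorted order.
for f in "$work"/map/*
do
	old=${f##*/}
	read -r new <"$f"
	printf '%s:%s\n' "$old" "$new"
done >"$work/filter.map"

# Load: split each line at the first colon and recreate the files.
while IFS=: read -r old new
do
	printf '%s' "$new" >"$work/map2/$old"
done <"$work/filter.map"

cat "$work/filter.map"
```

Because object names are hexadecimal, the colon separator cannot collide with either field.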

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
---
I have been using this as part of the device tree extraction from the Linux
kernel source since 2013, so it is about time I sent the patch upstream!

v2:
- added several preceding cleanup patches, including:
  - new: use of mktag --allow-missing-tagger.
  - split-out: preserving $GIT_*.
- use git rev-parse rather than git show-ref.
- improved error handling for Perl sub-processes.
- collapsed some shell pipelines involving piping output of git and ls into
  Perl into the Perl scripts.
- style fixes for conditionals and sub-shells.
- fixup indentation.
- added documentation.
- improved commit message.
---
 Documentation/git-filter-branch.txt |  8 +++++-
 git-filter-branch.sh                | 49 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-filter-branch.txt b/Documentation/git-filter-branch.txt
index 9e5169aa6..bebdcdec5 100644
--- a/Documentation/git-filter-branch.txt
+++ b/Documentation/git-filter-branch.txt
@@ -14,7 +14,7 @@ SYNOPSIS
 	[--commit-filter <command>] [--tag-name-filter <command>]
 	[--subdirectory-filter <directory>] [--prune-empty]
 	[--original <namespace>] [-d <directory>] [-f | --force]
-	[--] [<rev-list options>...]
+	[--state-branch <branch>] [--] [<rev-list options>...]
 
 DESCRIPTION
 -----------
@@ -198,6 +198,12 @@ to other tags will be rewritten to point to the underlying commit.
 	directory or when there are already refs starting with
 	'refs/original/', unless forced.
 
+--state-branch <branch>::
+	This option will cause the mapping from old to new objects to
+	be loaded from named branch upon startup and saved as a new
+	commit to that branch upon exit, enabling incremental updates of
+	large trees. If '<branch>' does not exist it will be created.
+
 <rev-list options>...::
 	Arguments for 'git rev-list'.  All positive refs included by
 	these options are rewritten.  You may also specify options
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 9edb94206..956869b8e 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -86,7 +86,7 @@ USAGE="[--setup <command>] [--env-filter <command>]
 	[--parent-filter <command>] [--msg-filter <command>]
 	[--commit-filter <command>] [--tag-name-filter <command>]
 	[--subdirectory-filter <directory>] [--original <namespace>]
-	[-d <directory>] [-f | --force]
+	[-d <directory>] [-f | --force] [--state-branch <branch>]
 	[--] [<rev-list options>...]"
 
 OPTIONS_SPEC=
@@ -106,6 +106,7 @@ filter_msg=cat
 filter_commit=
 filter_tag_name=
 filter_subdir=
+state_branch=
 orig_namespace=refs/original/
 force=
 prune_empty=
@@ -181,6 +182,9 @@ do
 	--original)
 		orig_namespace=$(expr "$OPTARG/" : '\(.*[^/]\)/*$')/
 		;;
+	--state-branch)
+		state_branch="$OPTARG"
+		;;
 	*)
 		usage
 		;;
@@ -259,6 +263,26 @@ export GIT_INDEX_FILE
 # map old->new commit ids for rewriting parents
 mkdir ../map || die "Could not create map/ directory"
 
+if test -n "$state_branch"
+then
+	state_commit=$(git rev-parse --no-flags --revs-only "$state_branch")
+	if test -n "$state_commit"
+	then
+		echo "Populating map from $state_branch ($state_commit)" 1>&2
+		perl -e'open(MAP, "-|", "git show $ARGV[0]:filter.map") or die;
+			while (<MAP>) {
+				m/(.*):(.*)/ or die;
+				open F, ">../map/$1" or die;
+				print F "$2" or die;
+				close(F) or die;
+			}
+			close(MAP) or die;' "$state_commit" \
+				|| die "Unable to load state from $state_branch:filter.map"
+	else
+		echo "Branch $state_branch does not exist. Will create" 1>&2
+	fi
+fi
+
 # we need "--" only if there are no path arguments in $@
 nonrevs=$(git rev-parse --no-revs "$@") || exit
 if test -z "$nonrevs"
@@ -590,6 +614,29 @@ test -z "$ORIG_GIT_COMMITTER_DATE" || {
 	export GIT_COMMITTER_DATE
 }
 
+if test -n "$state_branch"
+then
+	echo "Saving rewrite state to $state_branch" 1>&2
+	state_blob=$(
+		perl -e'opendir D, "../map" or die;
+			open H, "|-", "git hash-object -w --stdin" or die;
+			foreach (sort readdir(D)) {
+				next if m/^\.\.?$/;
+				open F, "<../map/$_" or die;
+				chomp($f = <F>);
+				print H "$_:$f\n" or die;
+			}
+			close(H) or die;' || die "Unable to save state")
+	state_tree=$(/bin/echo -e "100644 blob $state_blob\tfilter.map" | git mktree)
+	if test -n "$state_commit"
+	then
+		state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" -p "$state_commit")
+	else
+		state_commit=$(/bin/echo "Sync" | git commit-tree "$state_tree" )
+	fi
+	git update-ref "$state_branch" "$state_commit"
+fi
+
 cd "$orig_dir"
 rm -rf "$tempdir"
 
-- 
2.11.0
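
A note on the state-saving hunk in this patch: it uses `/bin/echo -e` to emit the tab that `git mktree` expects between the object name and the path. Whether `echo` honours `-e` or backslash escapes differs between implementations, so a more portable way to build the same line is `printf`. This is only a sketch of that alternative, not what the patch does; the object name is a placeholder.

```shell
#!/bin/sh
# Placeholder object name; a real run would use the output of
# "git hash-object -w --stdin".
state_blob=0123456789abcdef0123456789abcdef01234567

# mktree reads ls-tree style lines: "<mode> SP <type> SP <object> TAB <file>".
entry=$(printf '100644 blob %s\tfilter.map' "$state_blob")
printf '%s\n' "$entry"
```

In the script, the `printf` output would be piped straight into `git mktree` in place of the `/bin/echo -e` call.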



* [PATCH v3 4/4] filter-branch: use hash-object instead of mktag
  2017-09-21  7:49 [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format Ian Campbell
                   ` (2 preceding siblings ...)
  2017-09-21  7:49 ` [PATCH v3 3/4] filter-branch: stash away ref map in a branch Ian Campbell
@ 2017-09-21  7:49 ` Ian Campbell
  2017-09-22  4:42 ` [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format Junio C Hamano
  4 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2017-09-21  7:49 UTC
  To: gitster; +Cc: git, Ian Campbell

This allows us to recreate even historical tags which would now be considered
invalid, such as v2.6.12-rc2..v2.6.13-rc3 in the Linux kernel source tree, which
lack the `tagger` header.

    $ git rev-parse v2.6.12-rc2
    9e734775f7c22d2f89943ad6c745571f1930105f
    $ git cat-file tag v2.6.12-rc2 | git mktag
    error: char76: could not find "tagger "
    fatal: invalid tag signature file
    $ git cat-file tag v2.6.12-rc2 | git hash-object -t tag -w --stdin
    9e734775f7c22d2f89943ad6c745571f1930105f
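
To make "lack the `tagger` header" concrete, here is a small sketch that looks only at the header of a tag payload (everything before the first blank line). It does not reproduce mktag's validation; `has_tagger` is a hypothetical helper, and both payloads are invented rather than taken from the kernel tree.

```shell
#!/bin/sh
# has_tagger: succeed if the header of the tag object text on stdin
# (everything before the first blank line) contains a "tagger " line.
has_tagger () {
	sed -e '/^$/q' | grep '^tagger ' >/dev/null
}

# Hypothetical old-style tag payload: no "tagger" header.
old_tag='object 0000000000000000000000000000000000000000
type commit
tag v0.1-example

Example tag message'

# The same payload with a made-up tagger line added.
new_tag='object 0000000000000000000000000000000000000000
type commit
tag v0.1-example
tagger A U Thor <author@example.com> 1112911993 -0700

Example tag message'

for t in "$old_tag" "$new_tag"
do
	if printf '%s\n' "$t" | has_tagger
	then
		echo "has tagger"
	else
		echo "no tagger"
	fi
done
```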

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
---
 git-filter-branch.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 956869b8e..3365a3b86 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -561,7 +561,7 @@ if [ "$filter_tag_name" ]; then
 					}' \
 				    -e '/^-----BEGIN PGP SIGNATURE-----/q' \
 				    -e 'p' ) |
-				git mktag) ||
+				git hash-object -t tag -w --stdin) ||
 				die "Could not create new tag object for $ref"
 			if git cat-file tag "$ref" | \
 			   sane_grep '^-----BEGIN PGP SIGNATURE-----' >/dev/null 2>&1
-- 
2.11.0



* Re: [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format
  2017-09-21  7:49 [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format Ian Campbell
                   ` (3 preceding siblings ...)
  2017-09-21  7:49 ` [PATCH v3 4/4] filter-branch: use hash-object instead of mktag Ian Campbell
@ 2017-09-22  4:42 ` Junio C Hamano
  2017-09-22  8:38   ` Ian Campbell
  4 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2017-09-22  4:42 UTC
  To: Ian Campbell; +Cc: git

Ian Campbell <ijc@hellion.org.uk> writes:

> This is the third version of my patches to add incremental support to
> git-filter-branch. Since the last time I have replaced
> `git mktag --allow-missing-tagger` with `git hash-object -t tag -w --stdin`.
>
> I've force pushed to [1] (Travis is still running) and have set off the
> process of re-rewriting the devicetree tree from scratch (a multi-day
> affair) to validate (it's looking good).

Thanks.


* Re: [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format
  2017-09-22  4:42 ` [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format Junio C Hamano
@ 2017-09-22  8:38   ` Ian Campbell
  2017-09-22  8:58     ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2017-09-22  8:38 UTC
  To: Junio C Hamano; +Cc: git

On Fri, 2017-09-22 at 13:42 +0900, Junio C Hamano wrote:
> Ian Campbell <ijc@hellion.org.uk> writes:
> 
> > This is the third version of my patches to add incremental support to
> > git-filter-branch. Since the last time I have replaced
> > `git mktag --allow-missing-tagger` with `git hash-object -t tag -w --stdin`.
> > 
> > I've force pushed to [1] (Travis is still running) and have set off the
> > process of re-rewriting the devicetree tree from scratch (a multi-day
> > affair) to validate (it's looking good).
> 
> Thanks.

Travis is happy and the dt reconvert looks sensible (only took 60 hours
;-)).

Don't know if this is useful to your workflow but:

The following changes since commit 4384e3cde2ce8ecd194202e171ae16333d241326:

  Git 2.14 (2017-08-04 09:31:12 -0700)

are available in the git repository at:

  https://github.com/ijc/git git-filter-branch

for you to fetch changes up to e31c74f709fbf2827d57b4abf826bb836f120329:

  filter-branch: use hash-object instead of mktag (2017-09-21 08:44:59 +0100)

----------------------------------------------------------------
Ian Campbell (4):
      filter-branch: reset $GIT_* before cleaning up
      filter-branch: preserve and restore $GIT_AUTHOR_* and $GIT_COMMITTER_*
      filter-branch: stash away ref map in a branch
      filter-branch: use hash-object instead of mktag

 Documentation/git-filter-branch.txt |  8 +++++++-
 git-filter-branch.sh                | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 94 insertions(+), 8 deletions(-)


* Re: [PATCH v3 0/4] filter-branch: support for incremental update + fix for ancient tag format
  2017-09-22  8:38   ` Ian Campbell
@ 2017-09-22  8:58     ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2017-09-22  8:58 UTC
  To: Ian Campbell; +Cc: git

Ian Campbell <ijc@hellion.org.uk> writes:

> Travis is happy and the dt reconvert looks sensible (only took 60 hours
> ;-)).

Good.

> Don't know if this is useful to your workflow but:
>
> The following changes since commit 4384e3cde2ce8ecd194202e171ae16333d241326:
>
>   Git 2.14 (2017-08-04 09:31:12 -0700)
>
> are available...

That should match (modulo that they lack my sign-off, for obvious
reasons) what I have in the 'pu' branch, four commits on a single
strand of pearls ending at b2c1ca6b ("filter-branch: use hash-object
instead of mktag", 2017-09-21).

Thanks.

