git@vger.kernel.org mailing list mirror (one of many)
* How to use filter-branch with --state-branch?
From: Michele Locati @ 2018-03-06 15:17 UTC
  To: git

Recent versions of the git filter-branch command introduced the --state-branch
option, but I can't find any information about how it is actually meant to be
used.

We have this repository on github:
https://github.com/concrete5/concrete5

When someone pushes to that repo, we clone it and execute
`git filter-branch --subdirectory-filter concrete`
to extract the concrete directory, and we push the result to
https://github.com/concrete5/concrete5-core
(including all the branches and tags)

The script at the moment is this one:
https://github.com/concrete5/core_splitter/blob/70879e676b95160f7fc5d0ffc22b8f7420b0580b/bin/splitcore

I tried to use the --state-branch option on a local mirror, so that we could
filter incrementally. Here's the script:

# Executed just one time
git clone --no-checkout --mirror \
   https://github.com/concrete5/concrete5.git work
cd work
git filter-branch \
   --subdirectory-filter concrete \
   --tag-name-filter cat \
   --prune-empty \
   --state-branch FILTERBRANCH_STATE \
   -- --all
# Executed every time the repo is updated
git remote update --prune
git filter-branch \
   --subdirectory-filter concrete \
   --tag-name-filter cat \
   --prune-empty \
   --state-branch FILTERBRANCH_STATE \
   -- --all

The first filter-branch call required 7168 steps, and so did the second call.
I also tried without the --prune option of remote update (I had to add
--force to the second filter-branch), but nothing changed.


* Re: How to use filter-branch with --state-branch?
From: Ævar Arnfjörð Bjarmason @ 2018-03-08  9:25 UTC
  To: Michele Locati; +Cc: git, Ian Campbell


CC-ing the author of that feature. Usually I'd just look at how the
tests for it work to answer your question, but I see this new feature
made it in recently with no tests for it, which doesn't make me very
happy :(


* Re: How to use filter-branch with --state-branch?
From: Ian Campbell @ 2018-03-08  9:40 UTC
  To: Ævar Arnfjörð Bjarmason, Michele Locati; +Cc: git

On Thu, 2018-03-08 at 10:25 +0100, Ævar Arnfjörð Bjarmason wrote:

> > The first filter-branch call required 7168 steps, so did the second call...
> > I also tried without the --prune option of remote update (I had to add
> > --force to the second filter-branch), but nothing changed.

You can see an example of the usage in:
    https://git.kernel.org/pub/scm/linux/kernel/git/devicetree/devicetree-rebasing.git/

in the `scripts/` subdirectory (the flow is `cronjob` → `filter.sh` →
`git filter-branch...`).

I think the big difference is that, rather than `--all`, you need to give it
the `previous..now` range, since that is the update you wish to process
(the first time around you just give it `now`).

The devicetree-rebasing scripts arrange that by keeping the previously
processed commit in a separate branch.
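
The range idea above can be sketched roughly as follows. This is only an
illustration, not the code from `filter.sh`: the function names are made up,
the filter options and the `FILTERBRANCH_STATE` name are borrowed from the
script in the original question, and it assumes the branch being filtered
points at unfiltered upstream history each time it runs.

```shell
#!/bin/sh
# Hypothetical helper names; only the git options themselves are real.

# Print the revision argument for filter-branch: just "$now" on the first
# run, "previous..now" once a previously processed commit is known.
rev_range() {
    previous=$1
    now=$2
    if [ -n "$previous" ]; then
        printf '%s..%s\n' "$previous" "$now"
    else
        printf '%s\n' "$now"
    fi
}

# Filter one branch incrementally.
#   $1 = branch to rewrite (assumed to point at unfiltered upstream history)
#   $2 = last upstream commit already processed (empty on the first run)
# Note: a leftover refs/original/ backup from an earlier run may make
# filter-branch refuse to start; that case is not handled here.
filter_incrementally() {
    branch=$1
    previous=$2
    git filter-branch \
        --subdirectory-filter concrete \
        --prune-empty \
        --state-branch FILTERBRANCH_STATE \
        -- "$(rev_range "$previous" "$branch")"
}
```

On the first run one would call `filter_incrementally master ""`, and on
later runs pass the commit id recorded after the previous run as the second
argument.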

Ian.


* Re: How to use filter-branch with --state-branch?
From: Michele Locati @ 2018-03-09 13:04 UTC
  To: Ian Campbell; +Cc: Ævar Arnfjörð Bjarmason, git

Thank you for your quick reply, Ian.

Just a couple of questions:

1. It seems to me that it's not possible to process all the branches in one
go. Am I right?

2. Why do you have this line in filter.sh?
`rm -f .git/refs/original/refs/heads/${UPSTREAM_REWRITTEN}`

Thank you again,
Michele


* Re: How to use filter-branch with --state-branch?
From: Ian Campbell @ 2018-03-09 13:23 UTC
  To: Michele Locati; +Cc: Ævar Arnfjörð Bjarmason, git

On Fri, 2018-03-09 at 14:04 +0100, Michele Locati wrote:
> Just a couple of questions:
> 
> 1. it seems to me it's not possible to process all the branches in one
> go. Am I right?

I'm not sure, I've never done such a thing, in fact I didn't know you
could.

Really all this feature does is record the `.git/rewrite-map` (or
whatever the correct name is) at the end of the rewrite and reinstate
it again the next time, so it shouldn't really interact with many of
the other options.
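
To make the saved mapping a little more concrete, here is a small sketch for
looking up one commit in it. It rests on an assumption that should be checked
against the installed `git-filter-branch` script: that the state branch holds
a single file named `filter.map` with one `original:rewritten` pair of commit
ids per line. The function name is hypothetical.

```shell
#!/bin/sh
# Assumption: the map is a list of "original:rewritten" lines.

# Read map lines on stdin and print the rewritten id recorded for the
# original commit id given as $1 (prints nothing if it is not in the map).
lookup_rewritten() {
    awk -F: -v want="$1" '$1 == want { print $2; exit }'
}
```

Under the same assumption, something like
`git show FILTERBRANCH_STATE:filter.map | lookup_rewritten <commit-id>`
would show what a given upstream commit was rewritten to.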

My method for storing the "last version processed" in a branch does
conflict, I suppose (since that branch would be rewritten), but that's an
artefact of the surrounding scaffolding -- you could equally well keep
the record in some file on the local system or in a non-branch-ish ref
(I guess).
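
The "some file on the local system" variant could look something like the
sketch below. The function names and the choice of a plain one-line file are
assumptions made for illustration; nothing here is taken from the
devicetree-rebasing scripts.

```shell
#!/bin/sh
# Hypothetical helpers for remembering the last processed upstream commit.

# Record commit id $2 in file $1, replacing any earlier record.
save_last_processed() {
    printf '%s\n' "$2" > "$1"
}

# Print the recorded commit id from file $1, or nothing if there is no
# record yet (i.e. on the very first run).
load_last_processed() {
    if [ -f "$1" ]; then
        cat "$1"
    fi
}

# The ref-based alternative would store the same id with, for example,
#   git update-ref refs/filter-state/last-processed "$commit"
# where the ref name is again only an illustrative choice.
```

The value printed by `load_last_processed` can then serve as the `previous`
end of the `previous..now` range, with an empty result meaning "first run".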

> 2. Why do you have this line in filter.sh?
> `rm -f .git/refs/original/refs/heads/${UPSTREAM_REWRITTEN}`

TBH I'm not really sure. I originally wrote this patch many years ago,
and it's only recently that I got around to upstreaming it, so my memory
is more fuzzy than might be expected.

I think perhaps I was trying to avoid this error:

    A previous backup already exists in $orig_namespace
    Force overwriting the backup with -f

which comes if there is an existing backup (a safety feature in the
non-incremental case).

Not quite sure why I didn't use `-f` as the message says, but I guess
it's because it forces other things too, which I didn't want to do?

Perhaps what I should have done is make that check conditional on the
use of --state-branch.
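
As an aside, the backup can also be dropped through git itself instead of
with `rm`, which has the advantage of still working when refs have been
packed. The sketch below is a hedged alternative, not what `filter.sh` does;
the function names are hypothetical, and it only assumes the
`refs/original/` layout visible in the path quoted above.

```shell
#!/bin/sh
# Hypothetical helpers; git rev-parse and git update-ref are the real tools.

# Map a full ref name (e.g. refs/heads/master) to the backup ref that
# filter-branch keeps under refs/original/.
backup_ref() {
    printf 'refs/original/%s\n' "$1"
}

# Delete the backup of ref $1 if one exists; do nothing otherwise.
drop_backup() {
    ref=$(backup_ref "$1")
    if git rev-parse --verify --quiet "$ref" >/dev/null; then
        git update-ref -d "$ref"
    fi
}
```

Calling `drop_backup refs/heads/master` before each incremental run would
avoid the "previous backup already exists" error for that one branch without
passing `-f`.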

I wonder if you could use the `original/refs/...` as the "last version
processed"? Would be a lot less manual messing around than what I do!

Ian.


* Re: How to use filter-branch with --state-branch?
From: Michele Locati @ 2018-03-09 17:17 UTC
  To: git

I managed to get a general script that seems to work: see
https://github.com/mlocati/incremental-git-filter-branch

Thanks again, Ian.

--
Michele
