git@vger.kernel.org mailing list mirror (one of many)
From: Seaders Oloinsigh <seaders69@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: git filter-branch --subdirectory-filter not working as expected, history of other folders is preserved
Date: Tue, 11 Oct 2016 14:56:39 +0100
Message-ID: <CAN40BKoHGHWrNyTg3h=VJdBtzpQbWXbc1F+2j+mrPgPC7uUmMQ@mail.gmail.com>
In-Reply-To: <20161010181907.bqekupn6npuimbir@sigill.intra.peff.net>

On Mon, Oct 10, 2016 at 7:19 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Oct 10, 2016 at 05:12:25PM +0100, Seaders Oloinsigh wrote:
>
>> Due to the structure of this repo, it looks like there are some
>> branches that never had anything to do with the android/ subdirectory,
>> so they're not getting wiped out.  My branch is in a better state to
>> how I want it, but still, if I run your suggestion,
>> [...]
>
> Hmm. Yeah, I think this is an artifact of the way that filter-branch
> works with pathspec limiting. It keeps a mapping of commits that it has
> rewritten (including ones that were rewritten only because their
> ancestors were), and realizes that a branch ref needs to be updated when the
> commit it points to was rewritten.
>
> But if we don't touch _any_ commits in the history reachable from a
> branch (because they didn't even show up in our pathspec-limited
> rev-list), then it doesn't realize we touched the branch's history at
> all.
>
> I agree that the right outcome is for it to delete those branches
> entirely. I suspect the fix would be pretty tricky, though.
>
> In the meantime, I think you can work around it by either:
>
>   1. Make a pass beforehand for refs that do not touch your desired
>      paths at all, like:
>
>        path=android ;# or whatever
>        git for-each-ref --format='%(refname)' |
>        while read ref; do
>          if test "$(git rev-list --count "$ref" -- "$path")" = 0; then
>            echo "delete $ref"
>          fi
>        done |
>        git update-ref --stdin
>
>      and then filter what's left:
>
>        git filter-branch --subdirectory-filter $path -- --all

This is the perfect solution for me.  Deleting the unrelated branches
first also sped up the filter-branch command, and I'm left with a much
more complete version of what I was after.

I would still contend that either filter-branch doesn't work as
expected, or the docs need updating to include extra steps like the
ones you've given.  With a repo as large as ours, running filter-branch
repeatedly and trying different combinations is quite a time sink when
you're left with the same incorrect result again and again.
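
For anyone wanting to check before deleting anything, here is a minimal
dry-run sketch of workaround 1 above.  It only lists the refs whose
history never touches the subdirectory.  The function name
refs_not_touching and the variable subdir are made up for illustration;
the git commands are the same ones used in the quoted snippet.

```shell
#!/bin/sh
# Dry run of workaround 1: print every ref whose history contains no
# commit touching the given path.  Nothing is deleted here.
# (refs_not_touching and subdir are illustrative names, not part of git.)
refs_not_touching() {
  subdir=$1
  git for-each-ref --format='%(refname)' |
  while read -r ref; do
    # A count of 0 means the pathspec-limited rev-list found no commits,
    # which is the case filter-branch silently leaves alone.
    if test "$(git rev-list --count "$ref" -- "$subdir")" = 0; then
      printf '%s\n' "$ref"
    fi
  done
}
```

Once the list looks right, the output can be fed to update-ref as in
the quoted snippet, for example:
refs_not_touching android | sed 's/^/delete /' | git update-ref --stdin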

>
> or
>
>   2. Do the filter-branch, and because you know you specified --all and
>      that your filters would touch all histories, any ref which _wasn't_
>      touched can be deleted. That list is anything which didn't get a
>      backup entry in refs/original. So something like:
>
>        git for-each-ref --format='%(refname)' |
>        perl -lne 'print $1 if m{^refs/original/(.*)}' >backups
>
>        git for-each-ref --format='%(refname)' |
>        grep -v ^refs/original >refs
>
>        comm -23 refs backups |
>        sed "s/^/delete /" |
>        git update-ref --stdin
>
> -Peff
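
A hedged sketch of workaround 2 as a reusable filter, assuming
filter-branch has already been run with --all so that every rewritten
ref has a backup under refs/original/.  The function name
untouched_refs is made up for illustration.  It sorts both lists
explicitly under LC_ALL=C, since comm expects its inputs sorted in the
collation order it is run with; that is a precaution added here, not
something the quoted commands do.

```shell
#!/bin/sh
# Read refnames on stdin (as printed by
# `git for-each-ref --format='%(refname)'`) and print those that have
# no backup under refs/original/, i.e. refs filter-branch did not rewrite.
# (untouched_refs is an illustrative name, not part of git.)
untouched_refs() {
  all=$(mktemp)
  backups=$(mktemp)
  refs=$(mktemp)
  cat >"$all"
  # Rewritten refs: each backup name with the refs/original/ prefix removed.
  sed -n 's|^refs/original/||p' "$all" | LC_ALL=C sort >"$backups"
  # Every ordinary ref, excluding the backups themselves.
  grep -v '^refs/original/' "$all" | LC_ALL=C sort >"$refs"
  # Lines present only in the first file: refs without a backup.
  LC_ALL=C comm -23 "$refs" "$backups"
  rm -f "$all" "$backups" "$refs"
}
```

Possible usage, after inspecting the output by eye first:
git for-each-ref --format='%(refname)' | untouched_refs |
sed 's/^/delete /' | git update-ref --stdin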


Thread overview: 5+ messages
2016-10-10 13:42 git filter-branch --subdirectory-filter not working as expected, history of other folders is preserved Seaders Oloinsigh
2016-10-10 15:30 ` Jeff King
2016-10-10 16:12   ` Seaders Oloinsigh
2016-10-10 18:19     ` Jeff King
2016-10-11 13:56       ` Seaders Oloinsigh [this message]
