git@vger.kernel.org mailing list mirror (one of many)
From: Elijah Newren <newren@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: "Sergey Organov" <sorganov@gmail.com>, "Eric Wong" <e@80x24.org>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Derrick Stolee" <stolee@gmail.com>, "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Lars Schneider" <larsxschneider@gmail.com>,
	"Jonathan Nieder" <jrnieder@gmail.com>
Subject: Re: [RFC PATCH 0/5] Remove git-filter-branch from git.git; host it elsewhere
Date: Fri, 30 Aug 2019 16:22:10 -0700
Message-ID: <CABPp-BHMXAQGPaBYyg2dtVeN5h8fW8G4YdhddCeAjY5r74BAzw@mail.gmail.com>
In-Reply-To: <nycvar.QRO.7.76.6.1908302221210.46@tvgsbejvaqbjf.bet>

Hi Dscho,

On Fri, Aug 30, 2019 at 1:40 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
>
> On Wed, 28 Aug 2019, Elijah Newren wrote:
>
> > Hi Sergey,
> >
> > On Wed, Aug 28, 2019 at 1:52 AM Sergey Organov <sorganov@gmail.com> wrote:
> > >
> > > Elijah Newren <newren@gmail.com> writes:
> > >
> > > > On Tue, Aug 27, 2019 at 1:43 AM Sergey Organov <sorganov@gmail.com> wrote:
> > > >>
> > > >> Eric Wong <e@80x24.org> writes:
> > > >>
> > > >>
> > > >> [...]
> > > >>
> > > >> > AFAIK, filter-branch is not causing support headaches for any
> > > >> > git developers today.  With so many commands in git, it's
> > > >> > unlikely newbies will ever get around to discover it :)
> > > >> > So I don't think we should be in any rush to remove it.
> > > >>
> > > >> Nah, discovering it is simple. Just Google for "git change author". That
> > > >> eventually leads to a script that uses "git filter-branch --env-filter"
> > > >> to get the job done, and I'm afraid it is spread all over the world.
> > > >>
> > > >> See, e.g.:
> > > >>
> > > >> https://help.github.com/en/articles/changing-author-info
> > > >
> > > > Side note: Is the goal to "fix names and email addresses in this
> > > > repository"?  If so, this guide fails: it doesn't update tagger names
> > > > or email addresses.  Indeed, filter-branch doesn't provide a way to do
> > > > that.  (Not to mention other problems like not updating references to
> > > > commit hashes in commit messages when it is busy rewriting everything.)
> > >
> > > No. Maybe the original goal was like that, but I, personally, use a
> > > modified version of this to change my "Author" credentials from
> > > "internal" to "public" in branches that I'm going to send upstream, so
> > > the actual aim is to change e-mail of particular Author from a@b to c@d
> > > in all the commits in a (feature) branch.
> >
> > There's an interesting usecase I hadn't heard of or thought of before.
>
> I'll throw in another use case that's kinda related: extracting the
> history of one file (or subdirectory).

Thanks for sending these along!  I do have some comments, and a bunch
of questions...

> In my most recent instance of this, I wanted to publish the script I
> used to use for submitting patch series to the Git mailing list,
> maintaining tags for iterations and generating cover letters from branch
> descriptions and interdiffs (this script eventually became GitGitGadget,
> https://github.com/gitgitgadget/gitgitgadget/commits?after=6fb0ede48f86e729292ee1542729bc0f5a30cfa6+0
> demonstrates this).
>
> To do that, I ran a `git filter-branch` in the repository where I track
> all the scripts I deem unsuitable for public consumption, to remove all
> files but `mail-patch-series.sh`, then pushed it to
> https://github.com/dscho/mail-patch-series
>
> Please note that most crucially, I wanted to rewrite a newly-created
> branch, and only that branch.
>
> Could I have done the same using `git fast-export`, filtering the output
> with a Perl script, then passing it to `git fast-import`? Sure, I was
> really tempted to do that. In the end, it took less of _my_ time to just
> let `git filter-branch` do its work with a not-too-complicated index
> filter.

Why a perl script?  Shouldn't
    git fast-export [--no-data] HEAD -- $PATH | git fast-import --force --quiet
do the trick?  And it's probably simpler and shorter than the index
filter you used.
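
A minimal sketch of that pipeline on a throwaway repository, using only
core git.  The file names, the `extract` branch, and the temporary
directory are hypothetical and chosen purely for illustration.  The
sketch spells out a literal file name where the command above uses $PATH
as a placeholder, since PATH is also the shell's executable search path.
It rewrites a branch that is not checked out, so the work tree is left
alone.

```shell
set -e
# Throwaway repository; every name below is an illustrative assumption.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Author"
git config user.email "author@example.com"

echo 'public script' > mail-patch-series.sh
echo 'private script' > private.sh
git add mail-patch-series.sh private.sh
git commit --quiet -m "Add both scripts"

echo 'private change' >> private.sh
git commit --quiet -am "Touch only the private script"

echo 'public change' >> mail-patch-series.sh
git commit --quiet -am "Touch the public script"

# Work on a separate branch that is not checked out.
git branch extract

# Export only the history of one path, then overwrite the branch with it.
# --no-data refers to existing blobs by hash, which works here because the
# stream is imported back into the same repository.
git fast-export --no-data refs/heads/extract -- mail-patch-series.sh |
    git fast-import --force --quiet

files=$(git ls-tree -r --name-only extract)
commits=$(git rev-list --count extract)
echo "files on extract: $files"
echo "commits on extract: $commits"
```

In this sketch the commit that touched only private.sh should disappear
from `extract`, because path limiting simplifies history as well as
trees.  --force is needed because the rewritten tip is not a descendant
of the old one.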

That said, yeah it'd be nice to get automatic rewriting of commit
hashes in commit messages and other niceties from filter-repo (e.g.
future automatic reattaching of notes to the rewritten commits).  Some
questions:

  * What's the backup strategy in case you specify the wrong filters
(e.g. you have a typo in the pathnames)?  filter-repo encourages folks
to make a clone and then filter the fresh clone, because if anything
goes awry, you can just delete and restart.  (I am heavily opposed to
the refs/original/ backup mechanism used by filter-branch, for
multiple reasons.)  Is your safety stance just "If I mess up, it's my
own fault; do the rewrite"?  Or are you okay with cloning before
filtering?
  * If you're okay with cloning before filtering...then is there an
issue with rewriting all branches, and just pushing the one you need?
(Is there an issue with "this branch is small, the others are huge,
and filter-branch is slow -- so rewriting one branch saves me lots of
time"?  Or are there other issues at play too?)
  * What if the user has auxiliary information for the branch in other
refs?  For example, git-notes pointing at any of the commits, or tags
in the history of the branch that might be relevant, or perhaps even
replace refs in combination with GIT_NO_REPLACE_OBJECTS=1?  Is this an
"I don't care, toss that stuff and rewrite just this branch"?
  * filter-repo by default creates new replace references so that you
can refer to new commit IDs using old (unabbreviated) commit IDs.
Would that be considered helpful for this usecase?  unhelpful?
irrelevant, since you'll just push the branch you want somewhere and
nuke the temporary clone?
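
To make the clone-before-filtering stance concrete, here is a rough
sketch of that workflow.  It is an assumption-laden illustration, not
filter-repo itself: the rewrite step is a plain fast-export/fast-import
pipeline standing in for whatever filter is actually used, and the
`source` and `scratch` directory names and the file names are
hypothetical.

```shell
set -e
work=$(mktemp -d)

# Hypothetical source repository holding one public and one private script.
git init --quiet "$work/source"
cd "$work/source"
git config user.name "Example Author"
git config user.email "author@example.com"
echo 'public script' > mail-patch-series.sh
echo 'private script' > secret.sh
git add mail-patch-series.sh secret.sh
git commit --quiet -m "Initial scripts"
original_tip=$(git rev-parse HEAD)
branch=$(git symbolic-ref --short HEAD)

# Do the rewrite in a fresh clone, never in the repository you care about.
git clone --quiet "$work/source" "$work/scratch"
cd "$work/scratch"
git fast-export --no-data "refs/heads/$branch" -- mail-patch-series.sh |
    git fast-import --force --quiet
# The rewritten branch is checked out, so resync the index and work tree.
git reset --quiet --hard

scratch_files=$(git ls-tree -r --name-only HEAD)
source_tip_after=$(git -C "$work/source" rev-parse HEAD)
echo "scratch now tracks: $scratch_files"

# If the filter was wrong (say, a typo in the path), recovery is simply:
#   rm -rf "$work/scratch"   and clone again.
```

The source repository is never written to, so no refs/original/ style
backup is needed; the clone is the backup boundary.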


I'm not by any means ruling out the possibility of documenting --refs
and adjusting the defaults when it is used so the user can just run
something like
   git filter-repo --path $PATH --refs $MYBRANCH
but I feel like I need to understand answers to questions like the
above ones so that I can know how to phrase warnings and adjust
defaults and update the documentation.

> In another instance, a long, long time ago, I needed to restart a
> repository which had included way too many files for its own good, then
> rename the old repository and start with a fresh `master` that contained
> but a single commit whose tree was identical to the previous `master`'s
> tip commit. I simply grafted that commit, ran `git filter-branch` and
> had precisely what I needed.

filter-repo supports grafts and replace objects, the same as
filter-branch.  (Although, technically, I didn't have to do a thing to
support it; fast-export does the special handling of rewriting based
on grafts and replace objects.)  So, I'd say this is fully supported.
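
A small sketch of the general graft-then-rewrite pattern, using a
replace ref and the fast-export handling mentioned above.  This is not
Dscho's exact procedure; the three-commit history and file name are
made up for illustration, and a plain fast-export/fast-import pipeline
stands in for the filtering tool.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical history of three commits.
for n in 1 2 3; do
    echo "revision $n" > file.txt
    git add file.txt
    git commit --quiet -m "Revision $n"
done

branch=$(git symbolic-ref --short HEAD)
old_tip=$(git rev-parse HEAD)
old_tree=$(git rev-parse 'HEAD^{tree}')

# Graft: make the tip commit look as if it had no parents.
git replace --graft "$old_tip"

# fast-export walks the grafted history, so re-importing its output
# turns the graft into real, permanent history.
git fast-export "refs/heads/$branch" | git fast-import --force --quiet

# The replace ref is no longer needed once the branch has been rewritten.
git replace -d "$old_tip" >/dev/null

count=$(git rev-list --count "refs/heads/$branch")
new_tree=$(git rev-parse "refs/heads/$branch^{tree}")
echo "commits after rewrite: $count"
```

The branch ends up with a single root commit whose tree matches the old
tip, which is the shape described in the quoted use case.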

Side question: the git-replace documentation suggests that the graft file
is deprecated.  Are there any timeframes or plans for phasing out
beyond the git-replace manpage existing?  Should I avoid documenting
the graft file support in filter-repo?  Should I include examples
using not just git-replace but also using the graft file?

> I would be _delighted_ if these kinds of use case (rewriting a branch,
> or even just a commit range) became more of a first-class citizen with
> `git filter-repo`.

I've got all the pieces for supporting a single branch or a commit
range (e.g. 'git filter-repo --path foo --refs ^master~4 ^stable~23
mybranch'), but the defaults (error out unless in a bare repo, move
refs/remotes/origin/* to refs/heads/*, disconnect origin remote,
expire reflogs & repack & prune, create new replace references so
folks can access new commits using old commit IDs) may be somewhat
friction-filled for this usecase.  Those defaults other than the new
replace refs happen to all be turned off with the combination of
--force and --target, so, assuming turning them off is what you need,
you could cheat and just specify 'git filter-repo --force --target .
--refs $MYBRANCH' today and perhaps get what you want, but that's a
really non-intuitive command line that is way too ugly to recommend.
And I don't want to tie myself to '--target .' being the magic sauce
in the future either.
