git@vger.kernel.org mailing list mirror (one of many)
From: Elijah Newren <newren@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Bug report: git branch behaves as if --no-replace-objects is passed
Date: Tue, 30 Mar 2021 14:30:56 -0700
Message-ID: <CABPp-BFewHGOx-RCVtDKhn3=0QC9YWdA--Wtbb_MTHQbu3FQcw@mail.gmail.com>
In-Reply-To: <CABPp-BEE0eksCJSAviDh5GyqsOu=i_mW3VY6SEULa4kx0NsUMg@mail.gmail.com>

On Tue, Mar 30, 2021 at 2:19 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Mar 30, 2021 at 11:58 AM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > Jeff King <peff@peff.net> writes:
> >
> > > ... though if we go that route, I suspect we ought to be adding both the
> > > original _and_ the replacement.
> >
> > So "branch --contains X" would ask "which of these branches reach X
> > or its replacement?" and "branch --no-contains X" would ask "which
> > of these do not reach X nor its replacement?" --- I guess the result
> > is still internally consistent (meaning: any and all branches fall
> > into either "--contains X" or "--no-contains X" camp).
>
> I'm not so sure about this interpretation.  Based on the documentation
> in git-replace(1):
>
>        Replacement references will be used by default by all Git commands
>        except those doing reachability traversal (prune, pack transfer and
>        fsck).
>
> I would have thought that
>
> * "branch --contains X" would ask "which of these branches reach X's
> replacement?"
> * "git --no-replace-objects branch --contains X" would ask "which of
> these branches reach X?"
>
> and if folks really wanted to check whether either X or its
> replacement were reachable then they'd need to run both commands.
>
> The only place outside of reachability traversal where I think it
> makes sense for a command to distinguish between X being a replace ref
> for Y and Y itself is in `git log` where it can show the "replaced"
> moniker.
>
> > > I'm not entirely sure this is a good direction, though.
> > >
> > >> and possibly worse, if I create a new branch based on it and use it:
> > >>
> > >>     $ git branch foobar deadbeefdeadbeefdeadbeefdeadbeefdeadbeef
> > >>     $ git checkout foobar
> > >>     $ echo stuff >empty
> > >>     $ git add empty
> > >>     $ git commit -m more
> > >>
> > >> then it's clear that branch created foobar pointing to the replaced
> > >> object rather than the replacement object -- despite the fact that the
> > >> replaced object doesn't even exist within this repo:
> > >>
> > >>     $ git cat-file -p HEAD
> > >>     tree 18108bae26dc91af2055bc66cc9fea278012dbd3
> > >>     parent deadbeefdeadbeefdeadbeefdeadbeefdeadbeef
> > >>     author Elijah Newren <newren@gmail.com> 1617083739 -0700
> > >>     committer Elijah Newren <newren@gmail.com> 1617083739 -0700
> > >>
> > >>     more
> > >
> > > Yeah, that's pretty horrible.
> >
> > I am not sure.  As you analyze below, the replace mechanism is about
> > telling Git: when anybody refers to deadbeef, use its replacement if
> > defined instead.
> >
> > And one of the points of the mechanism is to allow doing so even
> > retroactively, so it does not matter that the HEAD object there may
> > be referring to a deadbeef that may not exist, as long as the object
> > that is to replace deadbeef is available.  If not, that is a repository
> > corruption.  After all, the commit object you cat-file'ed may have
> > been created by somebody else in a separate repository that had
> > deadbeef before they were told by Elijah that the object is obsolete
> > and to be replaced by something else (Git supports distributed
> > development) and then pulled into Elijah's repository, and we should
> > be prepared to see "parent deadbeef" in such a commit.  As long as
> > replacement happens when accessing the contents, that would be OK.
> >
> > So, I do not see it as "pretty horrible", but I may be missing
> > something.
>
> I think you're focusing on git commit, or perhaps on git checkout.
> I'm focusing on git branch; what it did does not seem fine to me.
> Using your own words:
>
> "the replace mechanism is about telling Git: when anybody refers to
> deadbeef, use its replacement if defined instead."
>
> git branch didn't do that; it put deadbeef into refs/heads/foobar.

Perhaps I should also explain not only that it breaks expectations,
but why that broken expectation causes problems:

* People tend to have commit hashes stored in lots of weird places --
bug trackers, reports, emails, etc.  These tend to be important for a
short time period, but the number of these references makes it harder
for folks who want to rewrite history to fix various past issues (very
large binary blobs and other misdeeds).

* filter-repo uses replace refs to provide users with a way to access
new commits using old commit hashes, to help them through this
transition period.

* Additional refs (especially one for every commit) will cause some
slowness.  So it's nice to be able to provide these replace refs for
short term transition, but tell users they can simply delete the
replace refs when they no longer need them without consequence.
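
For anyone who hasn't played with replace refs, here is a minimal
sketch of the behavior those bullets rely on.  It is not how
filter-repo sets things up; the scratch repository, the commit
messages, and the shell variable names are all made up for
illustration, and both commits exist locally (unlike the deadbeef
case above).

```shell
#!/bin/sh
# Sketch only: a throwaway repo with one "old" commit and one "new" commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo one >file && git add file && git commit -q -m "old commit"
old=$(git rev-parse HEAD)
git commit -q --amend -m "new commit"
new=$(git rev-parse HEAD)

# Tell Git: whenever $old is read, use $new instead.
git replace "$old" "$new"

# Reading the old hash yields the replacement's contents by default...
with_replace=$(git cat-file -p "$old" | tail -n 1)
# ...unless replacement is explicitly disabled.
without_replace=$(git --no-replace-objects cat-file -p "$old" | tail -n 1)

# Dropping the replace ref restores the original view of the old hash.
git replace -d "$old" >/dev/null
after_delete=$(git cat-file -p "$old" | tail -n 1)

echo "with replace ref:      $with_replace"
echo "--no-replace-objects:  $without_replace"
echo "after deleting ref:    $after_delete"
```

Deleting the replace ref is harmless here only because nothing that
was created afterwards depends on the old hash.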


The fact that git branch puts deadbeef into refs/heads/foobar leads
to a chain where new commits now rely on replacement refs.  In the
best case, others will not be able to pull from this user and the user
will not be able to push the new commits anywhere -- and that user
will have some work to do to rewrite (rebase?) the commits
appropriately.  In the worst case, the users do succeed in
distributing this new history, and now all users everywhere will be
mandated to keep all replace refs for all time (or at least until the
next major repository rewrite)...
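
One rough way to spot the problem early is to look for branches whose
tip is itself a replaced hash.  The following is only a sketch under
assumptions: it simulates the reported outcome with update-ref rather
than relying on what git branch does, it checks branch tips only (not
parents deeper in history), and every name in it is hypothetical.

```shell
#!/bin/sh
# Sketch only: flag branches whose tip hash has a refs/replace/ entry.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo one >file && git add file && git commit -q -m "old commit"
old=$(git rev-parse HEAD)
git commit -q --amend -m "new commit"
new=$(git rev-parse HEAD)
git replace "$old" "$new"

# Simulate the reported outcome: a branch ref that stores the old hash.
git update-ref refs/heads/foobar "$old"

suspect=""
for ref in $(git for-each-ref --format='%(refname)' refs/heads); do
    # rev-parse reports the hash stored in the ref, not its replacement.
    oid=$(git rev-parse --verify "$ref")
    if git show-ref --verify --quiet "refs/replace/$oid"; then
        suspect="$suspect $ref"
    fi
done

echo "branches pointing at replaced hashes:$suspect"
```

Any branch this reports would need to be moved to the replacement
commit before the replace refs can be deleted safely.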

