git@vger.kernel.org mailing list mirror (one of many)
From: Elijah Newren <newren@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Git Mailing List <git@vger.kernel.org>
Subject: Re: Bug report: git branch behaves as if --no-replace-objects is passed
Date: Tue, 30 Mar 2021 17:32:18 -0700
Message-ID: <CABPp-BG+xKr10BpziijMB4j9+F=PCozAFQRdJ1DVBuGGjir40Q@mail.gmail.com>
In-Reply-To: <xmqqlfa4b5we.fsf@gitster.g>

On Tue, Mar 30, 2021 at 4:04 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > Your "as long as" is I think the assumption that's violated in the
> > workflow in question.  You may have the replace ref defined, but
> > others don't[1].  Neither party has the actual original deadbeef
> > commit[2].  Having deadbeef in refs/heads/foobar leads eventually to
> > creating commits with deadbeef as an explicit parent, as we discussed
> > above.  While that's internally consistent, as you point out, can you
> > push your new commit elsewhere without pushing the replace refs too?
>
> I think the change to "branch --contains" would be an improvement
> whether you actually have deadbeef or not, but in any case, defining

Great, we agree that branch --contains can be improved.

> (eh, rather, being able to define) a replacement for something you
> do not have is the ultimate source of the problem.  And that "bug"
> has not very much specific to how "branch --contains" should behave.

Okay, perhaps you consider the ability to create a replacement for an
object that doesn't exist to be a bug.  How do we handle this bug,
though?  That's not at all clear to me.  Is it documentation updates?
More error messages from various commands?  Code changes to handle
these cases better?  Something else?  This behavior has been allowed
for over a decade, the refs can be created outside of "git replace",
and the replacement mechanism comes with:
* documentation claiming there are only two restrictions on
  replaced and replacement objects, both bypassable[1]
* documentation claiming robustness for the replace mechanism[2]
* documentation claiming that all commands unrelated to reachability
  traversal will translate replacement refs to the real commit IDs[3]
* user-facing UI to support creating/updating/deleting replace refs

Based on the above, filter-repo has been creating replacement refs for
years now, one for every commit in the repository that wasn't filtered
out.  And I thought it all worked great without any problems, until
the recent report.  I guess I only ever used the old commit IDs to
pass to things like diff, log, etc., and I guess most users either
didn't use the replace refs at all or only used them that way
too...until this week. However, I'm planning to do a very big testcase
of history rewriting (ancient and huge binary blobs) at $DAYJOB later
this year including replace refs for some of the release-team folks,
so I'm kind of concerned about what I need to fix in git or at least
what I need to document.

[1] See the part up to "...There is no other restriction on the
replaced and replacement objects.", from git-replace(1)
[2] "Note that the grafts mechanism is outdated and can lead to
problems...see git-replace(1) for a more flexible and robust system",
from gitrepository-layout(1)
[3] "Replacement references will be used by default by all Git
commands except those doing reachability traversal", from
git-replace(1)
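
To make the situation above concrete, here is a toy Python model of a
replace ref whose replaced object does not exist.  It is only a
sketch, not git's implementation: the dicts, the made-up IDs and the
read_object() helper are all assumptions for illustration.

```python
# Toy model only: plain dicts stand in for the object store and refs/replace/.
OLD = "deadbeef" * 5        # pre-rewrite commit ID; the object itself is absent
NEW = "0123456789" * 4      # hypothetical ID of the rewritten commit

objects = {NEW: "commit: rewritten history"}   # objects present locally
replace_refs = {OLD: NEW}                      # models refs/replace/<OLD> -> NEW


def read_object(oid, use_replace=True):
    """Return the object content, honoring replace refs unless disabled."""
    if use_replace:
        oid = replace_refs.get(oid, oid)
    if oid not in objects:
        raise LookupError(f"missing object {oid}")
    return objects[oid]


print(read_object(OLD))     # served from NEW through the replace ref
try:
    read_object(OLD, use_replace=False)
except LookupError as err:
    print("without replacements:", err)
```

In this model the old ID is usable for lookups only while the replace
ref is present and honored, which is the case the quoted
documentation describes.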

> > Why does `git branch` (in conjunction with one user deciding to fetch
> > replace refs) make it so easy to create a branch that cannot readily
> > be shared with others?
>
> In other words, I do not think it is "git branch" or "git checkout -b"
> that brought your repository into a broken state.  The "replace"
> mechanism may have room for improvement to avoid such a corruption.
>
> IIRC, the original "graft" mechanism did not even have any UI, so it
> was pretty much "you can graft any parent to any child, and if you
> break the repository you can keep both halves".  Now "replace" has a
> dedicated UI component in the form of "git replace" command, we should
> be able to teach it how to record replacement more safely.

I would like to avoid corruption, and I'm happy to have alternate
solutions.  I don't understand how this could be fixed in "git
replace", though, especially since "git replace" was not even invoked
in the real use case under consideration (git filter-repo uses "git
update-ref --stdin" to create the replace refs).
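
For readers unfamiliar with that route, the following Python sketch
builds input for "git update-ref --stdin" from an old-to-new commit
map.  It is not filter-repo's actual code: the function names, the
shape of the map and the rule for skipping unchanged commits are
assumptions, and the IDs are made up.

```python
import subprocess


def replace_ref_commands(commit_map):
    """Build `git update-ref --stdin` input: one replace ref per rewritten commit."""
    lines = []
    for old in sorted(commit_map):
        new = commit_map[old]
        if new is None or new == old:
            continue  # assumption: skip filtered-out and unchanged commits
        lines.append(f"update refs/replace/{old} {new}\n")
    return "".join(lines)


def apply_replace_refs(commit_map, repo_dir="."):
    """Feed the commands to git; needs a repository containing the new commits."""
    subprocess.run(
        ["git", "update-ref", "--stdin"],
        input=replace_ref_commands(commit_map),
        text=True, check=True, cwd=repo_dir,
    )


example = {
    "deadbeef" * 5: "0123456789" * 4,   # rewritten commit (made-up IDs)
    "feedface" * 5: None,               # commit filtered out of history
}
print(replace_ref_commands(example), end="")
```

Nothing here goes through "git replace", so any safety check added
only to that command would not see refs created this way.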

Are there other alternative ways to fix this outside of "git
branch"/"git checkout -b"?

If not, are there technical reasons that "git branch <branchname>
<replace-hash>" and other commands like it could not be adjusted to
write the replacement hash rather than the replaced hash to the new
branch?
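
The difference between the two behaviors can be sketched with a toy
Python model.  This is not git's code: the repository dicts, the
helper names and the "translate" switch are hypothetical, and the
model deliberately ignores how a real push negotiates objects.

```python
# Toy model only: dicts and sets stand in for each repository's objects and refs.
OLD = "deadbeef" * 5      # replaced (pre-rewrite) ID; nobody has this object
NEW = "0123456789" * 4    # hypothetical replacement commit that does exist


def make_repo(with_replace_refs):
    return {
        "objects": {NEW},
        "replace": {OLD: NEW} if with_replace_refs else {},
        "heads": {},
    }


def create_branch(repo, name, oid, translate=False):
    """Store oid in refs/heads/<name>; optionally translate it first."""
    if translate:
        oid = repo["replace"].get(oid, oid)
    repo["heads"][name] = oid


def branch_is_usable(repo, name):
    """A branch tip is usable if it resolves to an object the repo has."""
    oid = repo["heads"][name]
    return repo["replace"].get(oid, oid) in repo["objects"]


def share_branch(src, dst, name):
    """Copy only the branch ref, as a push without replace refs would."""
    dst["heads"][name] = src["heads"][name]


mine = make_repo(with_replace_refs=True)
theirs = make_repo(with_replace_refs=False)

create_branch(mine, "foobar", OLD)                  # behavior reported in this thread
create_branch(mine, "fixed", OLD, translate=True)   # the adjustment asked about
for name in ("foobar", "fixed"):
    share_branch(mine, theirs, name)
    print(name, branch_is_usable(mine, name), branch_is_usable(theirs, name))
# prints:
# foobar True False
# fixed True True
```

In the model, the branch that stores the replaced hash works only in
the repository that also has the replace ref, while the translated
branch works in both.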

Even if there are no alternative fixes and there are technical reasons
that "git branch" and other commands like it cannot be adjusted, I
still feel that the translate-old-commit-hashes via
replacement-of-non-existent-commits feature is valuable enough to keep
in filter-repo anyway.  It'd make me a little uneasy, but I'd continue
creating the replace refs and just document the drawbacks in that
scenario.
