* Bug report: git branch behaves as if --no-replace-objects is passed @ 2021-03-30 6:05 Elijah Newren 2021-03-30 7:02 ` Jeff King 0 siblings, 1 reply; 10+ messages in thread From: Elijah Newren @ 2021-03-30 6:05 UTC (permalink / raw) To: Git Mailing List Hi, Forwarding a report from https://github.com/newren/git-filter-repo/issues/229, with a few more details... log, diff, etc. seem to all support replace objects nicely: git init -b main whatever cd whatever/ >empty git add empty git commit -m initial git replace -f deadbeefdeadbeefdeadbeefdeadbeefdeadbeef $(git rev-parse main) echo stuff >>empty Now, If I try to use this replace object: $ git diff --stat deadbeefdeadbeefdeadbeefdeadbeefdeadbeef empty | 1 + 1 file changed, 1 insertion(+) $ git log --oneline deadbeefdeadbeefdeadbeefdeadbeefdeadbeef deadbee (replaced) initial it all works well. BUT, if I try to use it with branch it doesn't work: $ git branch --contains deadbeefdeadbeefdeadbeefdeadbeefdeadbeef $ and possibly worse, if I create a new branch based on it and use it: $ git branch foobar deadbeefdeadbeefdeadbeefdeadbeefdeadbeef $ git checkout foobar $ echo stuff >empty $ git add empty $ git commit -m more then it's clear that branch created foobar pointing to the replaced object rather than the replacement object -- despite the fact that the replaced object doesn't even exist within this repo: $ git cat-file -p HEAD tree 18108bae26dc91af2055bc66cc9fea278012dbd3 parent deadbeefdeadbeefdeadbeefdeadbeefdeadbeef author Elijah Newren <newren@gmail.com> 1617083739 -0700 committer Elijah Newren <newren@gmail.com> 1617083739 -0700 more I poked around in the code a little but it is not at all clear to me why some parts of the code (log, diff) translate replace refs correctly, while others (branch) don't. It is clear from the output that log is aware that the refs are replaced, which makes me wonder if every caller needs to be aware of replace refs for them to work correctly everywhere, because I couldn't find a missing environment setup for "branch". ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 6:05 Bug report: git branch behaves as if --no-replace-objects is passed Elijah Newren @ 2021-03-30 7:02 ` Jeff King 2021-03-30 18:58 ` Junio C Hamano 0 siblings, 1 reply; 10+ messages in thread From: Jeff King @ 2021-03-30 7:02 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List On Mon, Mar 29, 2021 at 11:05:05PM -0700, Elijah Newren wrote: > it all works well. BUT, if I try to use it with branch it doesn't work: > > $ git branch --contains deadbeefdeadbeefdeadbeefdeadbeefdeadbeef > $ I'm not surprised here. The replace mechanism is usually "if you are trying to access object X, then access the contents of Y instead". But I don't think we generally rewrite references from other objects. So in that sense, no ref "contains" deadbeef, because nobody points to it as an ancestor. I guess "branch --contains" could convert the mention of the replaced object on the command-line, something like this: diff --git a/parse-options-cb.c b/parse-options-cb.c index 4542d4d3f9..200be4e3d2 100644 --- a/parse-options-cb.c +++ b/parse-options-cb.c @@ -6,6 +6,7 @@ #include "string-list.h" #include "strvec.h" #include "oid-array.h" +#include "replace-object.h" /*----- some often used options -----*/ @@ -84,6 +85,7 @@ int parse_opt_verbosity_cb(const struct option *opt, const char *arg, int parse_opt_commits(const struct option *opt, const char *arg, int unset) { struct object_id oid; + const struct object_id *replace; struct commit *commit; BUG_ON_OPT_NEG(unset); @@ -92,7 +94,9 @@ int parse_opt_commits(const struct option *opt, const char *arg, int unset) return -1; if (get_oid(arg, &oid)) return error("malformed object name %s", arg); - commit = lookup_commit_reference(the_repository, &oid); + + replace = lookup_replace_object(the_repository, &oid); + commit = lookup_commit_reference(the_repository, replace); if (!commit) return error("no such commit %s", arg); commit_list_insert(commit, opt->value); though if we go that route, I suspect we ought to be adding both the original _and_ the replacement. I'm not entirely sure this is a good direction, though. > and possibly worse, if I create a new branch based on it and use it: > > $ git branch foobar deadbeefdeadbeefdeadbeefdeadbeefdeadbeef > $ git checkout foobar > $ echo stuff >empty > $ git add empty > $ git commit -m more > > then it's clear that branch created foobar pointing to the replaced > object rather than the replacement object -- despite the fact that the > replaced object doesn't even exist within this repo: > > $ git cat-file -p HEAD > tree 18108bae26dc91af2055bc66cc9fea278012dbd3 > parent deadbeefdeadbeefdeadbeefdeadbeefdeadbeef > author Elijah Newren <newren@gmail.com> 1617083739 -0700 > committer Elijah Newren <newren@gmail.com> 1617083739 -0700 > > more Yeah, that's pretty horrible. I do think you're using replace objects in a way they weren't really intended for, in two ways: - usually you'd actually have deadbeef, and you just want to rewrite it to something else - you wouldn't usually work directly with the replace object on the command line like this. Usually the intent is to patch some old section of history to get a different view. None of which is an excuse exactly. But just to say that I'm not too surprised that the "replace" mechanism is a bit half-baked, and there are probably a lot of weird implications nobody has thought through. Patches welcome, of course, though I suspect it may be a rabbit hole that isn't worth your time. :) > I poked around in the code a little but it is not at all clear to me > why some parts of the code (log, diff) translate replace refs > correctly, while others (branch) don't. It is clear from the output > that log is aware that the refs are replaced, which makes me wonder if > every caller needs to be aware of replace refs for them to work > correctly everywhere, because I couldn't find a missing environment > setup for "branch". Right. To work as you expect above, basically everything that is going to look at an oid would need to consider whether to use a replace ref. Shoving the lookup into get_oid() would "fix" that, but I suspect would confuse some other callers. Right now the rule is "when you access the object contents, pretend as if it had X instead of Y", so we can enforce that at the layer of accessing the object database. -Peff ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 7:02 ` Jeff King @ 2021-03-30 18:58 ` Junio C Hamano 2021-03-30 21:19 ` Elijah Newren 0 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2021-03-30 18:58 UTC (permalink / raw) To: Jeff King; +Cc: Elijah Newren, Git Mailing List Jeff King <peff@peff.net> writes: > ... though if we go that route, I suspect we ought to be adding both the > original _and_ the replacement. So "branch --contains X" would ask "which of these branches reach X or its replacement?" and "branch --no-contains X" would ask "which of these do not reach X nor its replacement?" --- I guess the result is still internally consistent (meaning: any and all branches fall into either "--contains X" or "--no-contains X" camp). > I'm not entirely sure this is a good direction, though. > >> and possibly worse, if I create a new branch based on it and use it: >> >> $ git branch foobar deadbeefdeadbeefdeadbeefdeadbeefdeadbeef >> $ git checkout foobar >> $ echo stuff >empty >> $ git add empty >> $ git commit -m more >> >> then it's clear that branch created foobar pointing to the replaced >> object rather than the replacement object -- despite the fact that the >> replaced object doesn't even exist within this repo: >> >> $ git cat-file -p HEAD >> tree 18108bae26dc91af2055bc66cc9fea278012dbd3 >> parent deadbeefdeadbeefdeadbeefdeadbeefdeadbeef >> author Elijah Newren <newren@gmail.com> 1617083739 -0700 >> committer Elijah Newren <newren@gmail.com> 1617083739 -0700 >> >> more > > Yeah, that's pretty horrible. I am not sure. As you analize below, the replace mechanism is about telling Git: when anybody refers to deadbeef, use its replacement if defined instead. And one of the points in the mechanism is to allow to do so even retroactively, so the HEAD object there may be referring to deadbeef that may not exist does not matter, as long as the object that is to replace deadbeef is available. If not, that is a repository corruption. After all, the commit object you cat-file'ed may have been created by somebody else in a separate repository that had deadbeef before they were told by Elijah that the object is obsolete and to be replaced by something else (Git supports distributed development) and then pulled into Elijah's repository, and we should be prepared to seeing "parent deadbeef" in such a commit. As long as replacement happens when accessing the contents, that would be OK. So, I do not see it as "pretty horrible", but I may be missing something. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 18:58 ` Junio C Hamano @ 2021-03-30 21:19 ` Elijah Newren 2021-03-30 21:30 ` Elijah Newren 2021-03-30 21:53 ` Junio C Hamano 0 siblings, 2 replies; 10+ messages in thread From: Elijah Newren @ 2021-03-30 21:19 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Git Mailing List On Tue, Mar 30, 2021 at 11:58 AM Junio C Hamano <gitster@pobox.com> wrote: > > Jeff King <peff@peff.net> writes: > > > ... though if we go that route, I suspect we ought to be adding both the > > original _and_ the replacement. > > So "branch --contains X" would ask "which of these branches reach X > or its replacement?" and "branch --no-contains X" would ask "which > of these do not reach X nor its replacement?" --- I guess the result > is still internally consistent (meaning: any and all branches fall > into either "--contains X" or "--no-contains X" camp). I'm not so sure about this interpretation. Based on the documentation in git-replace(1): Replacement references will be used by default by all Git commands except those doing reachability traversal (prune, pack transfer and fsck). I would have thought that * "branch --contains X" would ask "which of these branches reach X's replacement?" * "git --no-replace-objects branch --contains X" would ask "which of these branches reach X?" and if folks really wanted to check whether either X or its replacement were reachable then they'd need to run both commands. The only place outside of reachability traversal where I think it makes sense for a command to distinguish between X being a replace ref for Y and Y itself is in `git log` where it can show the "replaced" moniker. > > I'm not entirely sure this is a good direction, though. > > > >> and possibly worse, if I create a new branch based on it and use it: > >> > >> $ git branch foobar deadbeefdeadbeefdeadbeefdeadbeefdeadbeef > >> $ git checkout foobar > >> $ echo stuff >empty > >> $ git add empty > >> $ git commit -m more > >> > >> then it's clear that branch created foobar pointing to the replaced > >> object rather than the replacement object -- despite the fact that the > >> replaced object doesn't even exist within this repo: > >> > >> $ git cat-file -p HEAD > >> tree 18108bae26dc91af2055bc66cc9fea278012dbd3 > >> parent deadbeefdeadbeefdeadbeefdeadbeefdeadbeef > >> author Elijah Newren <newren@gmail.com> 1617083739 -0700 > >> committer Elijah Newren <newren@gmail.com> 1617083739 -0700 > >> > >> more > > > > Yeah, that's pretty horrible. > > I am not sure. As you analize below, the replace mechanism is about > telling Git: when anybody refers to deadbeef, use its replacement if > defined instead. > > And one of the points in the mechanism is to allow to do so even > retroactively, so the HEAD object there may be referring to deadbeef > that may not exist does not matter, as long as the object that is to > replace deadbeef is available. If not, that is a repository > corruption. After all, the commit object you cat-file'ed may have > been created by somebody else in a separate repository that had > deadbeef before they were told by Elijah that the object is obsolete > and to be replaced by something else (Git supports distributed > development) and then pulled into Elijah's repository, and we should > be prepared to seeing "parent deadbeef" in such a commit. As long as > replacement happens when accessing the contents, that would be OK. > > So, I do not see it as "pretty horrible", but I may be missing > something. I think you're focusing on git commit, or perhaps on git checkout. I'm focusing on git branch; what it did does not seem fine to me. Using your own words: "the replace mechanism is about telling Git: when anybody refers to deadbeef, use its replacement if defined instead." git branch didn't do that; it put deadbeef into refs/heads/foobar. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 21:19 ` Elijah Newren @ 2021-03-30 21:30 ` Elijah Newren 2021-03-30 21:59 ` Junio C Hamano 2021-03-30 21:53 ` Junio C Hamano 1 sibling, 1 reply; 10+ messages in thread From: Elijah Newren @ 2021-03-30 21:30 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Git Mailing List On Tue, Mar 30, 2021 at 2:19 PM Elijah Newren <newren@gmail.com> wrote: > > On Tue, Mar 30, 2021 at 11:58 AM Junio C Hamano <gitster@pobox.com> wrote: > > > > Jeff King <peff@peff.net> writes: > > > > > ... though if we go that route, I suspect we ought to be adding both the > > > original _and_ the replacement. > > > > So "branch --contains X" would ask "which of these branches reach X > > or its replacement?" and "branch --no-contains X" would ask "which > > of these do not reach X nor its replacement?" --- I guess the result > > is still internally consistent (meaning: any and all branches fall > > into either "--contains X" or "--no-contains X" camp). > > I'm not so sure about this interpretation. Based on the documentation > in git-replace(1): > > Replacement references will be used by default by all Git commands > except those doing reachability traversal (prune, pack transfer and > fsck). > > I would have thought that > > * "branch --contains X" would ask "which of these branches reach X's > replacement?" > * "git --no-replace-objects branch --contains X" would ask "which of > these branches reach X?" > > and if folks really wanted to check whether either X or its > replacement were reachable then they'd need to run both commands. > > The only place outside of reachability traversal where I think it > makes sense for a command to distinguish between X being a replace ref > for Y and Y itself is in `git log` where it can show the "replaced" > moniker. > > > > I'm not entirely sure this is a good direction, though. > > > > > >> and possibly worse, if I create a new branch based on it and use it: > > >> > > >> $ git branch foobar deadbeefdeadbeefdeadbeefdeadbeefdeadbeef > > >> $ git checkout foobar > > >> $ echo stuff >empty > > >> $ git add empty > > >> $ git commit -m more > > >> > > >> then it's clear that branch created foobar pointing to the replaced > > >> object rather than the replacement object -- despite the fact that the > > >> replaced object doesn't even exist within this repo: > > >> > > >> $ git cat-file -p HEAD > > >> tree 18108bae26dc91af2055bc66cc9fea278012dbd3 > > >> parent deadbeefdeadbeefdeadbeefdeadbeefdeadbeef > > >> author Elijah Newren <newren@gmail.com> 1617083739 -0700 > > >> committer Elijah Newren <newren@gmail.com> 1617083739 -0700 > > >> > > >> more > > > > > > Yeah, that's pretty horrible. > > > > I am not sure. As you analize below, the replace mechanism is about > > telling Git: when anybody refers to deadbeef, use its replacement if > > defined instead. > > > > And one of the points in the mechanism is to allow to do so even > > retroactively, so the HEAD object there may be referring to deadbeef > > that may not exist does not matter, as long as the object that is to > > replace deadbeef is available. If not, that is a repository > > corruption. After all, the commit object you cat-file'ed may have > > been created by somebody else in a separate repository that had > > deadbeef before they were told by Elijah that the object is obsolete > > and to be replaced by something else (Git supports distributed > > development) and then pulled into Elijah's repository, and we should > > be prepared to seeing "parent deadbeef" in such a commit. As long as > > replacement happens when accessing the contents, that would be OK. > > > > So, I do not see it as "pretty horrible", but I may be missing > > something. > > I think you're focusing on git commit, or perhaps on git checkout. > I'm focusing on git branch; what it did does not seem fine to me. > Using your own words: > > "the replace mechanism is about telling Git: when anybody refers to > deadbeef, use its replacement if defined instead." > > git branch didn't do that; it put deadbeef into refs/heads/foobar. Perhaps I should also add why it not only breaks expectations, but why that broken expectation causes problems: * People tend to have commit hashes stored in lots of weird placed -- bug trackers, reports, emails, etc. These tend to be important for a short time period, but the number of these references make it harder for folks who want to rewrite history to fix various past issues (very large binary blobs and other misdeeds). * filter-repo uses replace refs to provide users with a way to access new commits using old commit hashes, to help them through this transition period. * Additional refs (especially one for every commit) will cause some slowness. So it's nice to be able to provide these replace refs for short term transition, but tell users they can simply delete the replace refs when they no longer need them without consequence. The fact that git branch puts deadbeef into refs/heads/foobar, leads to a chain where new commits now rely on replacement refs. In the best case, others will not be able to pull from this user and the user will not be able to push the new commits anywhere -- and that user will have some work to do to rewrite (rebase?) the commits appropriately. In the worst case, the users do succeed in distributing this new history, and now all users everywhere will be mandated to keep all replace refs for all time (or at least until the next major repository rewrite)... ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 21:30 ` Elijah Newren @ 2021-03-30 21:59 ` Junio C Hamano 0 siblings, 0 replies; 10+ messages in thread From: Junio C Hamano @ 2021-03-30 21:59 UTC (permalink / raw) To: Elijah Newren; +Cc: Jeff King, Git Mailing List Elijah Newren <newren@gmail.com> writes: > On Tue, Mar 30, 2021 at 2:19 PM Elijah Newren <newren@gmail.com> wrote: >> >> On Tue, Mar 30, 2021 at 11:58 AM Junio C Hamano <gitster@pobox.com> wrote: >> > >> > Jeff King <peff@peff.net> writes: >> > >> > > ... though if we go that route, I suspect we ought to be adding both the >> > > original _and_ the replacement. >> > >> > So "branch --contains X" would ask "which of these branches reach X >> > or its replacement?" and "branch --no-contains X" would ask "which >> > of these do not reach X nor its replacement?" --- I guess the result >> > is still internally consistent (meaning: any and all branches fall >> > into either "--contains X" or "--no-contains X" camp). >> >> I'm not so sure about this interpretation. Based on the documentation >> in git-replace(1): >> >> Replacement references will be used by default by all Git commands >> except those doing reachability traversal (prune, pack transfer and >> fsck). This "rechability" sidenote is primarily so that we won't result in a corrupt repository when replacement is lifted. When object A is replaced by object B, and somebody makes A reachable (e.g. a ref points at deadbeef), we mark both A (and the objects A refers to) and B reachable, so that "prune" won't lose A. It would allow the replacement lifted after "prune". Tweaking "branch --contains X" to list branches that reach either X or its replacement would probably have the same effect, so I would think it would be a good change (and fix to your original issue). The "the result is still internally consistent" comment was the result of my attempt to make sure such a change would not introduce gross incoherency to the resulting system. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 21:19 ` Elijah Newren 2021-03-30 21:30 ` Elijah Newren @ 2021-03-30 21:53 ` Junio C Hamano 2021-03-30 22:43 ` Elijah Newren 1 sibling, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2021-03-30 21:53 UTC (permalink / raw) To: Elijah Newren; +Cc: Jeff King, Git Mailing List Elijah Newren <newren@gmail.com> writes: > Using your own words: > > "the replace mechanism is about telling Git: when anybody refers to > deadbeef, use its replacement if defined instead." "When anybody wants the contents of deadbeef, give the contents of its replacement" is what the replace (graft, too) is. > git branch didn't do that; it put deadbeef into refs/heads/foobar. Yes, but refs/heads/foobar having deadbeef does not have a problem, as long as you get the contents of the replacement object when you ask for the contents of deadbeef (or "the object referred to by the refs/heads/foobar ref"). ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 21:53 ` Junio C Hamano @ 2021-03-30 22:43 ` Elijah Newren 2021-03-30 23:04 ` Junio C Hamano 0 siblings, 1 reply; 10+ messages in thread From: Elijah Newren @ 2021-03-30 22:43 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Git Mailing List On Tue, Mar 30, 2021 at 2:53 PM Junio C Hamano <gitster@pobox.com> wrote: > > Elijah Newren <newren@gmail.com> writes: > > > Using your own words: > > > > "the replace mechanism is about telling Git: when anybody refers to > > deadbeef, use its replacement if defined instead." > > "When anybody wants the contents of deadbeef, give the contents of > its replacement" is what the replace (graft, too) is. > > > git branch didn't do that; it put deadbeef into refs/heads/foobar. > > Yes, but refs/heads/foobar having deadbeef does not have a problem, > as long as you get the contents of the replacement object when you > ask for the contents of deadbeef (or "the object referred to by > the refs/heads/foobar ref"). Your "as long as" is I think the assumption that's violated in the workflow in question. You may have the replace ref defined, but others don't[1]. Neither party has the actual original deadbeef commit[2]. Having deadbeef in refs/heads/foobar leads eventually to creating commits with deadbeef as an explicit parent, as we discussed above. While that's internally consistent, as you point out, can you push your new commit elsewhere without pushing the replace refs too? * If you can, isn't that repository corruption? * If you can't, how are users supposed to understand, debug, and correct the fact that their history is unshareable? Why does `git branch` (in conjunction with one user deciding to fetch replace refs) make it so easy to create a branch that cannot readily be shared with others? [1] Everyone can get the replace refs, but fetch/clone does not grab them by default, and push does not include them by default, so it's multiple manual steps. [2] For the rewrite at $DAYJOB, we'll make it as hard as possible for users to access the original deadbeef commit(s). People who cloned the old unrewritten repo in the past might still have it somewhere if they didn't follow instructions to delete old clones and reclone, but new hires won't have deadbeef and won't have a way to get it. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 22:43 ` Elijah Newren @ 2021-03-30 23:04 ` Junio C Hamano 2021-03-31 0:32 ` Elijah Newren 0 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2021-03-30 23:04 UTC (permalink / raw) To: Elijah Newren; +Cc: Jeff King, Git Mailing List Elijah Newren <newren@gmail.com> writes: > Your "as long as" is I think the assumption that's violated in the > workflow in question. You may have the replace ref defined, but > others don't[1]. Neither party has the actual original deadbeef > commit[2]. Having deadbeef in refs/heads/foobar leads eventually to > creating commits with deadbeef as an explicit parent, as we discussed > above. While that's internally consistent, as you point out, can you > push your new commit elsewhere without pushing the replace refs too? I think the change to "branch --contains" would be an improvement whether you actually have deadbeef or not, but in any case, defining (eh, rather, being able to define) a replacement for something you do not have is the ultimate source of the problem. And that "bug" has not very much specific to how "branch --contains" should behave. > Why does `git branch` (in conjunction with one user deciding to fetch > replace refs) make it so easy to create a branch that cannot readily > be shared with others? In other words, I do not think it is "git branch" or "git checkout -b" that brought your repository into a broken state. The "replace" mechanism may have room for improvement to avoid such a corruption. IIRC, the original "graft" mechanism did not even have any UI, so it was pretty much "you can graft any parent to any child, and if you break the repository you can keep both halves". Now "replace" has a dedicated UI component in the form of "git replace" command, we should be able to teach it how to record replacement more safely. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bug report: git branch behaves as if --no-replace-objects is passed 2021-03-30 23:04 ` Junio C Hamano @ 2021-03-31 0:32 ` Elijah Newren 0 siblings, 0 replies; 10+ messages in thread From: Elijah Newren @ 2021-03-31 0:32 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Git Mailing List On Tue, Mar 30, 2021 at 4:04 PM Junio C Hamano <gitster@pobox.com> wrote: > > Elijah Newren <newren@gmail.com> writes: > > > Your "as long as" is I think the assumption that's violated in the > > workflow in question. You may have the replace ref defined, but > > others don't[1]. Neither party has the actual original deadbeef > > commit[2]. Having deadbeef in refs/heads/foobar leads eventually to > > creating commits with deadbeef as an explicit parent, as we discussed > > above. While that's internally consistent, as you point out, can you > > push your new commit elsewhere without pushing the replace refs too? > > I think the change to "branch --contains" would be an improvement > whether you actually have deadbeef or not, but in any case, defining Great, we agree that branch --contains can be improved. > (eh, rather, being able to define) a replacement for something you > do not have is the ultimate source of the problem. And that "bug" > has not very much specific to how "branch --contains" should behave. Okay, perhaps you consider the ability to create a replacement for an object that doesn't exist as a bug. How do we handle this bug, though? That's not at all clear to me. Is it documentation updates? More error messages from various commands? Code changes to handle these cases better? Something else? This behavior has been allowed for over a decade, the refs can be created outside of "git replace" and the replacement mechanism comes with * documentation claiming there are only two restrictions on replacement or replaced refs, both bypassable[1] * documentation claims of robustness for the replace mechanism[2] * documentation claiming that all non-reachability traversal related commands will translate replacement refs to the real commit IDs[3] * user-facing UI to support creating/updating/deleting replace refs Based on the above, filter-repo has been creating replacement refs for years now, one for every commit in the repository that wasn't filtered out. And I thought it all worked great without any problems, until the recent report. I guess I only ever used the old commit IDs to pass to things like diff, log, etc., and I guess most users either didn't use the replace refs at all or only used them that way too...until this week. However, I'm planning to do a very big testcase of history rewriting (ancient and huge binary blobs) at $DAYJOB later this year including replace refs for some of the release-team folks, so I'm kind of concerned about what I need to fix in git or at least what I need to document. [1] See the part up to "...There is no other restriction on the replaced and replacement objects.", from git-replace(1) [2] "Note that the grafts mechanism is outdated and can lead to problems...see git-replace(1) for a more flexible and robust system", from gitrepository-layout(1) [3] "Replacement references will be used by default by all Git commands except those doing reachability traversal", from git-replace(1) > > Why does `git branch` (in conjunction with one user deciding to fetch > > replace refs) make it so easy to create a branch that cannot readily > > be shared with others? > > In other words, I do not think it is "git branch" or "git checkout -b" > that brought your repository into a broken state. The "replace" > mechanism may have room for improvement to avoid such a corruption. > > IIRC, the original "graft" mechanism did not even have any UI, so it > was pretty much "you can graft any parent to any child, and if you > break the repository you can keep both halves". Now "replace" has a > dedicated UI component in the form of "git replace" command, we should > be able to teach it how to record replacement more safely. I would like to avoid corruption, and I'm happy to have alternate solutions. I don't understand how this could be fixed in "git replace", though, especially since "git replace" was not even invoked in the real use case under consideration (git filter-repo uses "git update-ref --stdin" to create the replace objects). Are there other alternative ways to fix this outside of "git branch"/"git checkout -b"? If not, are there technical reasons that "git branch <branchname> <replace-hash>" and other commands like it could not be adjusted to write the replacement hash rather than the replaced hash to the new branch? Even if there are no alternative fixes and there are technical reasons that "git branch" and other commands like it cannot be adjusted, I still feel that translate-old-commit-hashes via replacement-of-non-existent-commits feature is valuable enough to keep in filter-repo anyway. It'd make me a little uneasy, but I'd continue creating the replace refs and just document the drawbacks in that scenario. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-03-31 0:33 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-03-30 6:05 Bug report: git branch behaves as if --no-replace-objects is passed Elijah Newren 2021-03-30 7:02 ` Jeff King 2021-03-30 18:58 ` Junio C Hamano 2021-03-30 21:19 ` Elijah Newren 2021-03-30 21:30 ` Elijah Newren 2021-03-30 21:59 ` Junio C Hamano 2021-03-30 21:53 ` Junio C Hamano 2021-03-30 22:43 ` Elijah Newren 2021-03-30 23:04 ` Junio C Hamano 2021-03-31 0:32 ` Elijah Newren
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).