On Wed, Nov 29, 2023 at 04:59:35PM -0500, Taylor Blau wrote:
> On Wed, Nov 29, 2023 at 09:14:20AM +0100, Patrick Steinhardt wrote:
> > We have some references that are more special than others. The reason
> > for them being special is that they either do not follow the usual
> > format of references, or that they are written to the filesystem
> > directly by the respective owning subsystem and thus circumvent the
> > reference backend.
> >
> > This works perfectly fine right now because the reffiles backend will
> > know how to read those refs just fine. But with the prospect of gaining
> > a new reference backend implementation we need to be a lot more careful
> > here:
> >
> >   - We need to make sure that we are consistent about how those refs are
> >     written. They must either always be written via the filesystem, or
> >     they must always be written via the reference backend. Any mixture
> >     will lead to inconsistent state.
> >
> >   - We need to make sure that such special refs are always handled
> >     specially when reading them.
> >
> > We're already mostly good with regard to the first item, except for
> > `BISECT_EXPECTED_REV` which will be addressed in a subsequent commit.
> > But the current list of special refs is missing a lot of refs that
> > really should be treated specially. Right now, we only treat
> > `FETCH_HEAD` and `MERGE_HEAD` specially here.
> >
> > Introduce a new function `is_special_ref()` that contains all current
> > instances of special refs to fix the reading path.
> >
> > Based-on-patch-by: Han-Wen Nienhuys
> > Signed-off-by: Patrick Steinhardt
> > ---
> >  refs.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 56 insertions(+), 2 deletions(-)
> >
> > diff --git a/refs.c b/refs.c
> > index 7d4a057f36..2d39d3fe80 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -1822,15 +1822,69 @@ static int refs_read_special_head(struct ref_store *ref_store,
> >  	return result;
> >  }
> >
> > +static int is_special_ref(const char *refname)
> > +{
> > +	/*
> > +	 * Special references get written and read directly via the filesystem
> > +	 * by the subsystems that create them. Thus, they must not go through
> > +	 * the reference backend but must instead be read directly. It is
> > +	 * arguable whether this behaviour is sensible, or whether it's simply
> > +	 * a leaky abstraction enabled by us only having a single reference
> > +	 * backend implementation. But at least for a subset of references it
> > +	 * indeed does make sense to treat them specially:
> > +	 *
> > +	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
> > +	 *   carries additional metadata like where it came from.
> > +	 *
> > +	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
> > +	 *   heads.
> > +	 *
> > +	 * - "rebase-apply/" and "rebase-merge/" contain all of the state for
> > +	 *   rebases, where keeping it closely together feels sensible.
> > +	 *
> > +	 * There are some exceptions that you might expect to see on this list
> > +	 * but which are handled exclusively via the reference backend:
> > +	 *
> > +	 * - CHERRY_PICK_HEAD
> > +	 * - HEAD
> > +	 * - ORIG_HEAD
> > +	 *
> > +	 * Writing or deleting references must consistently go either through
> > +	 * the filesystem (special refs) or through the reference backend
> > +	 * (normal ones).
> > +	 */
> > +	const char * const special_refs[] = {
> > +		"AUTO_MERGE",
> > +		"BISECT_EXPECTED_REV",
> > +		"FETCH_HEAD",
> > +		"MERGE_AUTOSTASH",
> > +		"MERGE_HEAD",
> > +	};
>
> Is there a reason that we don't want to declare this statically? If we
> did, I think we could drop one const, since the strings would instead
> reside in the .rodata section.

Not really, no.

> > +	int i;
>
> Not that it matters for this case, but it may be worth declaring i to be
> an unsigned type, since it's used as an index into an array. size_t
> seems like an appropriate choice there.

Hm. We do use `int` almost everywhere when iterating through an array
via `ARRAY_SIZE`, but ultimately I don't mind whether it's `int`,
`unsigned` or `size_t`.

> > +	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
> > +		if (!strcmp(refname, special_refs[i]))
> > +			return 1;
> > +
> > +	/*
> > +	 * git-rebase(1) stores its state in `rebase-apply/` or
> > +	 * `rebase-merge/`, including various reference-like bits.
> > +	 */
> > +	if (starts_with(refname, "rebase-apply/") ||
> > +	    starts_with(refname, "rebase-merge/"))
>
> Do we care about case sensitivity here? Definitely not on case-sensitive
> filesystems, but I'm not sure about case-insensitive ones. For instance,
> on macOS, I can do:
>
>     $ git rev-parse hEAd
>
> and get the same value as "git rev-parse HEAD" (on my Linux workstation,
> this fails as expected).
>
> I doubt that there are many users in the wild asking to resolve
> reBASe-APPLY/xyz, but I think that after this patch that would no longer
> work as-is, so we may want to replace this with istarts_with() instead.

In practice I'd argue that nobody is ever going to ask for something in
`rebase-apply/` outside of Git internals or scripts, and I'd expect
these to always use proper casing. So I rather lean towards a "no, we
don't care about case sensitivity".

Patrick
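
For reference, the two review suggestions above (a `static` table and a
`size_t` index) could be sketched standalone roughly as follows. This is
only an illustration, not the patch as applied: `ARRAY_SIZE()` and
`starts_with()` are local stand-ins for Git's own helpers so that the
snippet compiles outside the Git tree, both `const` qualifiers are kept
as in the quoted hunk, and the `return 1` / `return 0` tail is an
assumption because the quoted hunk stops at the `if` condition.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for Git's ARRAY_SIZE() macro (assumption: same semantics). */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Stand-in for Git's starts_with(): non-zero if str begins with prefix. */
static int starts_with(const char *str, const char *prefix)
{
	return !strncmp(str, prefix, strlen(prefix));
}

static int is_special_ref(const char *refname)
{
	/* Static storage: the table is not rebuilt on every call. */
	static const char * const special_refs[] = {
		"AUTO_MERGE",
		"BISECT_EXPECTED_REV",
		"FETCH_HEAD",
		"MERGE_AUTOSTASH",
		"MERGE_HEAD",
	};
	size_t i; /* unsigned index type, matching what ARRAY_SIZE() yields */

	/* Exact, case-sensitive match against the fixed names. */
	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
		if (!strcmp(refname, special_refs[i]))
			return 1;

	/* Case-sensitive prefix match for the rebase state directories. */
	if (starts_with(refname, "rebase-apply/") ||
	    starts_with(refname, "rebase-merge/"))
		return 1; /* assumed tail, not part of the quoted hunk */

	return 0; /* assumed tail, not part of the quoted hunk */
}
```

With this sketch, "FETCH_HEAD" and "rebase-apply/xyz" are reported as
special, while "HEAD" and "reBASe-APPLY/xyz" are not. The last case is
the case-sensitive behaviour discussed above.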