git@vger.kernel.org mailing list mirror (one of many)
From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 5/6] refs: stop resolving ref corresponding to reflogs
Date: Tue, 20 Feb 2024 09:34:33 +0100
Message-ID: <ZdRkGWhUrHQgWbxy@tanuki>
In-Reply-To: <xmqqjzn0yote.fsf@gitster.g>

On Mon, Feb 19, 2024 at 04:14:21PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > Refactor the code to call `check_refname_format()` directly instead of
> > trying to resolve the ref. This is significantly more efficient given
> > that we don't have to hit the object database anymore to list reflogs.
> > And second, it ensures that we end up showing reflogs of broken refs,
> > which will help to make the reflog more useful.
> 
> And the user would notice corrupt ones among those reflogs listed
> when using "rev-list -g" on the reflog anyway?  Which sounds like a
> sensible thing to do.

Yeah. Overall the user experience is still quite lacking when you have
such "funny" reflogs. Corrupted ones would result in errors as you
mentioned, and that's to be expected in my opinion.

The more dubious behaviour is that `git reflog show $REFLOG` refuses to
show the reflog when the corresponding ref is missing. This is something
I plan to address in a follow-up patch series.

> > Note that this really only impacts the case where the corresponding ref
> > is corrupt. Reflogs for nonexistent refs would have been returned to the
> > caller beforehand already as we did not pass `RESOLVE_REF_READING` to
> > the function, and thus `refs_resolve_ref_unsafe()` would have returned
> > successfully in that case.
> 
> What do "Reflogs for nonexistent refs" really mean?  With the files
> backend, if "git branch -d main" that removed the "main" branch
> somehow forgot to remove the ".git/logs/refs/heads/main" file, the
> reflog entries in such a file is for nonexistent ref.  Is that what
> you meant?

Yes. Would "Reflogs which do not have a corresponding ref with the same
name" be clearer?

> As a tool to help diagnosing and correcting minor repo
> breakages, finding such a leftover file that should not exist is a
> good idea, I would think.
> 
> Would we see missing reflog for a ref that exists in the iteration?
> I guess we shouldn't, as the reflog iterator that recursively
> enumerates files under "$GIT_DIR/logs/" would not see such a missing
> reflog by definition.

No, and I'd claim we shouldn't. The reflog mechanism gives the user
control over which reflogs should and which shouldn't exist. For one,
`core.logAllRefUpdates` allows the user to either enable or disable the
reflog mechanism. If set to "false" then no reflogs are created
automatically, with "true" they are created for HEAD and for refs under
`refs/heads/`, `refs/remotes/` and `refs/notes/`, and with "always" we
end up creating reflogs for any ref under `refs/`. So depending on this
setting it's expected that a subset of reflogs do not exist.

But that's also not the whole story yet. Theoretically speaking, reflogs
have a subtle opt-in mechanism: once a reflog is created, we will
continue writing to it no matter what `core.logAllRefUpdates` says. So
it's feasible to have `core.logAllRefUpdates=false`, but then explicitly
create a specific reflog so that you log changes to a specific ref.

With this behaviour in mind I'd say that we shouldn't report missing
reflogs during iteration.
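
To illustrate the opt-in behaviour described above, here is a minimal
sketch using the git command line. The throwaway repository, the
identity settings and the ref names `refs/heads/plain` and
`refs/heads/logged` are assumptions made up for the example. It relies on
`git update-ref --create-reflog` to create one reflog explicitly and on
`git reflog exists` to check which reflogs are present afterwards.

```shell
# Scratch repository with automatic reflog creation disabled.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
git config core.logAllRefUpdates false

# Two commits so that the refs below can be moved from one to the other.
git commit -q --allow-empty -m first
git commit -q --allow-empty -m second
first=$(git rev-parse HEAD~1)
second=$(git rev-parse HEAD)

# With logAllRefUpdates=false, a plain update does not create a reflog.
git update-ref refs/heads/plain "$first"
# Explicit opt-in: ask for a reflog to be created for this one ref.
git update-ref --create-reflog refs/heads/logged "$first"

# Later updates without the flag: only the existing reflog is appended to.
git update-ref refs/heads/plain "$second"
git update-ref refs/heads/logged "$second"

if git reflog exists refs/heads/plain; then plain=yes; else plain=no; fi
if git reflog exists refs/heads/logged; then logged=yes; else logged=no; fi
logged_entries=$(git reflog show refs/heads/logged | wc -l | tr -d ' ')

echo "plain has reflog:  $plain"
echo "logged has reflog: $logged ($logged_entries entries)"
```

In such a setup `refs/heads/plain` exists without a reflog, which is the
kind of "missing" reflog that is expected and thus not worth reporting.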

> > diff --git a/refs/files-backend.c b/refs/files-backend.c
> > index 2b3c99b00d..741148087d 100644
> > --- a/refs/files-backend.c
> > +++ b/refs/files-backend.c
> > @@ -2130,17 +2130,9 @@ static int files_reflog_iterator_advance(struct ref_iterator *ref_iterator)
> >  	while ((ok = dir_iterator_advance(diter)) == ITER_OK) {
> >  		if (!S_ISREG(diter->st.st_mode))
> >  			continue;
> > -		if (diter->basename[0] == '.')
> > +		if (check_refname_format(diter->basename,
> > +					 REFNAME_ALLOW_ONELEVEL))
> >  			continue;
> 
> A tangent.
> 
> I've never liked the code arrangement in the check_refname_format()
> that assumes that each level can be separately checked with exactly
> the same logic, and the only thing ALLOW_ONELEVEL does is to include
> pseudorefs and HEAD; this makes such assumption even more ingrained.
> I am not sure what to think about it, but let's keep reading.

Yeah. This code here is basically just copied over from
`refs_resolve_ref_unsafe()` to ensure that it remains compatible. In a
future patch series we might include a new option `--include-broken`
that would also surface broken-but-safe reflog names.

But going down the tangent even more: one thing I've noticed is that the
way `check_refname_format()` is structured is also wildly inefficient.
It's quite astonishing that when iterating over refs, we spend _more_
time in `check_refname_format()` than reading the refs from disk,
parsing them and massaging them into their final representation.

Overall, the whole infra to check refnames could use some improvement.
But this has already been discussed in other threads recently.
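
As a rough way to see which names the new check in the reflog iterators
would skip, the same rules can be exercised from the command line via
`git check-ref-format`, whose `--allow-onelevel` option corresponds to
`REFNAME_ALLOW_ONELEVEL`. This is only a sketch: the helper name
`is_valid_refname` and the sample names are made up for illustration.

```shell
# Hypothetical helper: succeeds if git accepts the name as a refname,
# also allowing single-level names such as "HEAD" or "main".
is_valid_refname() {
    git check-ref-format --allow-onelevel "$1"
}

for name in main refs/heads/main .hidden main.lock \
            refs/heads/.hidden refs/heads/main.lock
do
    if is_valid_refname "$name"; then
        echo "accepted: $name"
    else
        echo "rejected: $name"
    fi
done
```

Names with a component that starts with a dot or ends in ".lock" are
rejected, which is why the explicit checks for those two cases become
redundant once the refname format check is in place.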

Patrick

> > -		if (ends_with(diter->basename, ".lock"))
> > -			continue;
> 
> This can safely go, as it is rejected by check_refname_format().
> 
> > -		if (!refs_resolve_ref_unsafe(iter->ref_store,
> > -					     diter->relative_path, 0,
> > -					     NULL, NULL)) {
> > -			error("bad ref for %s", diter->path.buf);
> > -			continue;
> > -		}
> 
> This is the focus of this step in the series.  We did not abort the
> iteration before, but now we no longer issue any error message.
> 
> >  		iter->base.refname = diter->relative_path;
> >  		return ITER_OK;
> > diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> > index 889bb1f1ba..efbbf23c72 100644
> > --- a/refs/reftable-backend.c
> > +++ b/refs/reftable-backend.c
> > @@ -1659,11 +1659,9 @@ static int reftable_reflog_iterator_advance(struct ref_iterator *ref_iterator)
> >  		if (iter->last_name && !strcmp(iter->log.refname, iter->last_name))
> >  			continue;
> >  
> > -		if (!refs_resolve_ref_unsafe(&iter->refs->base, iter->log.refname,
> > -					     0, NULL, NULL)) {
> > -			error(_("bad ref for %s"), iter->log.refname);
> > +		if (check_refname_format(iter->log.refname,
> > +					 REFNAME_ALLOW_ONELEVEL))
> >  			continue;
> > -		}
> 
> This side is much more straight-forward.  Looking good.
> 
> >  
> >  		free(iter->last_name);
> >  		iter->last_name = xstrdup(iter->log.refname);


