git@vger.kernel.org mailing list mirror (one of many)
* reflog existence & reftable
From: Han-Wen Nienhuys @ 2021-04-21 10:02 UTC
  To: git

Hi there,

(splitting off from a code review of my test cleanups.)

Currently, reflogs are stored in .git/logs/*. Git adds entries to the
reflog only if the reflog already exists (see the log_ref_setup()
function).

The current iteration of the reftable design has a unified key space
of {refname,index-number} for reflog entries. This causes there to be
no distinction between

  1) reflog is empty (.git/logs/blah is a 0-byte file)
  2) reflog does not exist (.git/logs/blah does not exist)

This trips up some current tests that make assumptions on reflog existence.
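
A minimal sketch of the distinction above, in Python rather than Git's C. It is a toy model under stated assumptions, not the real implementation: the files backend is modeled as one list per refname (an empty list stands for a 0-byte file), and the reftable log as a single map keyed by (refname, index). All class and method names are invented for illustration.

```python
class FilesLogs:
    """Toy model of .git/logs/: one log (a list of entries) per refname."""

    def __init__(self):
        self.logs = {}  # refname -> entries; [] models a 0-byte file

    def create(self, refname):
        self.logs.setdefault(refname, [])

    def exists(self, refname):
        return refname in self.logs


class UnifiedKeyLogs:
    """Toy model of a unified {refname, index} key space."""

    def __init__(self):
        self.records = {}  # (refname, index) -> entry

    def create(self, refname):
        # Nothing to store: a log without entries has no keys at all.
        pass

    def exists(self, refname):
        return any(name == refname for name, _ in self.records)


files, unified = FilesLogs(), UnifiedKeyLogs()
for backend in (files, unified):
    backend.create("refs/heads/blah")

print(files.exists("refs/heads/blah"))    # True: the log is empty but present
print(unified.exists("refs/heads/blah"))  # False: empty and missing look alike
```

In this model the second backend has no place to remember that a log was created, which is the loss of information described above.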

I don't know why one can tweak reflog to be written or not, but the
current functionality will cause a change in operation with reftable.
I see two ways forward:

1) Have different functionality in case of reftable: you cannot query
for the existence of reflogs, and writing reflogs doesn't depend on
the existence of a reflog.

2) Add a reflog existence feature to reftable. We could introduce a
magical reflog entry, which indicates that the reflog exists (but
might be empty). This adds some complexity to the C code, but lets us
maintain backward compatibility.
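
One way option 2 could look, sketched in Python as a toy model. The marker value, the reserved index and every name below are hypothetical and not part of the reftable format; the sketch only shows that a sentinel record is enough to tell "exists but empty" apart from "does not exist".

```python
MARKER_INDEX = -1              # hypothetical reserved index for the marker
EXISTENCE_MARKER = "<exists>"  # hypothetical sentinel payload


class MarkedLogs:
    """Toy {refname, index} log store with an existence marker record."""

    def __init__(self):
        self.records = {}  # (refname, index) -> entry

    def create(self, refname):
        self.records[(refname, MARKER_INDEX)] = EXISTENCE_MARKER

    def exists(self, refname):
        return any(name == refname for name, _ in self.records)

    def entries(self, refname):
        # The marker is bookkeeping only, so hide it from readers.
        keys = sorted(k for k in self.records
                      if k[0] == refname and k[1] != MARKER_INDEX)
        return [self.records[k] for k in keys]

    def append(self, refname, entry):
        """Append only if the log already exists, as the files backend does."""
        if not self.exists(refname):
            return False
        self.records[(refname, len(self.entries(refname)))] = entry
        return True


logs = MarkedLogs()
skipped = logs.append("refs/heads/next", "commit: first")   # no log yet
logs.create("refs/heads/next")
empty_but_present = logs.exists("refs/heads/next") and not logs.entries("refs/heads/next")
written = logs.append("refs/heads/next", "commit: second")
print(skipped, empty_but_present, written)  # False True True
```

The extra complexity shows up in entries(): every reader has to know about the marker and filter it out.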

What do you think?


-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado


* Re: reflog existence & reftable
From: Ævar Arnfjörð Bjarmason @ 2021-04-21 11:57 UTC
  To: Han-Wen Nienhuys; +Cc: git


On Wed, Apr 21 2021, Han-Wen Nienhuys wrote:

> Hi there,
>
> (splitting off from a code review of my test cleanups.)

Just changing the "Subject" wouldn't have broken In-Reply-To or pruned the
CC list. FWIW, the missing In-Reply-To is (probably):
https://lore.kernel.org/git/87pmyo3zvw.fsf@evledraar.gmail.com/

> Currently, reflogs are stored in .git/logs/*. Git adds entries to the
> reflog only if the reflog already exists (see the log_ref_setup()
> function).
>
> The current iteration of the reftable design has a unified key space
> of {refname,index-number} for reflog entries. This causes there to be
> no distinction between
>
>   1) reflog is empty (.git/logs/blah is a 0-byte file)
>   2) reflog does not exist (.git/logs/blah does not exist)
>
> This trips up some current tests that make assumptions on reflog existence.
>
> I don't know why one can tweak reflog to be written or not, but the
> current functionality will cause a change in operation with reftable.
> I see two ways forward:
>
> 1) Have different functionality in case of reftable: you cannot query
> for the existence of reflogs, and writing reflogs doesn't depend on
> the existence of a reflog.
>
> 2) Add a reflog existence feature to reftable. We could introduce a
> magical reflog entry, which indicates that the reflog exists (but
> might be empty). This adds some complexity to the C code, but lets us
> maintain backward compatibility.
>
> What do you think?

I think we should fix the tests first, per [1] :)

Because there's a third case revealed by the test case, which would be
teased out by not entirely skipping the test with REFFILES, but
incrementally splitting it up:

Which is that the current reftable implementation is failing a test
(well, probably a lot more, but the one under discussion) that:

 A. Sets core.logAllRefUpdates=false
 B. Checks out an orphan branch
 C. Checks that it has no existing reflog
 D. Makes a commit there
 E. Checks that it has no reflog

Only the "E" case is covered by your summary above.
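
Steps A to E can be walked through with a small Python model of the files backend's behavior. It is a simplified sketch, not Git's code: the prefixes follow the documented meaning of core.logAllRefUpdates, the dict stands in for .git/logs/, and the function names are invented.

```python
def should_autocreate(refname, log_all_ref_updates):
    """Simplified rule for creating a missing reflog automatically."""
    if log_all_ref_updates == "always":
        return refname == "HEAD" or refname.startswith("refs/")
    if log_all_ref_updates == "true":
        return refname == "HEAD" or refname.startswith(
            ("refs/heads/", "refs/remotes/", "refs/notes/"))
    return False  # "false": never create a reflog automatically


def update_ref(logs, refname, entry, log_all_ref_updates):
    """Append `entry` if the log exists or may be auto-created."""
    if refname not in logs and not should_autocreate(refname, log_all_ref_updates):
        return
    logs.setdefault(refname, []).append(entry)


setting = "false"                  # A: core.logAllRefUpdates=false
logs = {}                          #    toy stand-in for .git/logs/
branch = "refs/heads/orphan"       # B: check out an orphan branch
exists_before = branch in logs     # C: it has no existing reflog
update_ref(logs, branch, "commit (initial)", setting)  # D: make a commit
exists_after = branch in logs      # E: it still has no reflog
print(exists_before, exists_after)  # False False
```

With this model, both checks report that no reflog exists, which is what the test expects from the files backend.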

But no matter how reftable's behavior is under
core.logAllRefUpdates=false it should surely not be returning true in
the "C" case, because there's no log entry to serve up, and indeed the
branch being asked for has no commits.

So for that case somewhere in the guts of the reftable integration we're
losing the distinction between asking for a log that can't exist
vs. one that's empty. Maybe the reftable code is returning "yes I have
logging on" or "yes I have some entries somewhere" in that case?

And in "E", related to "C", isn't it unambiguous to not write it if
there's no existing entry for the branch in question and
core.logAllRefUpdates=false is in effect?

For the rest of this, is the behavior under reftable indistinguishable
from having core.logAllRefUpdates=always set?

In any case, I don't think the emergent behavior of the files backend is
worth emulating, but maybe if some feel that way it might be better to
transition the setting in general to core.logAllRefUpdates being a
global on/off boolean, and having a branch.<name>.logRefUpdates, but I
suspect that there's not going to be any/many users of this selective
logging feature.

1. https://lore.kernel.org/git/87lf9b3mth.fsf@evledraar.gmail.com/


* Re: reflog existence & reftable
From: Junio C Hamano @ 2021-04-21 16:55 UTC
  To: Ævar Arnfjörð Bjarmason; +Cc: Han-Wen Nienhuys, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> So for that case somewhere in the guts of the reftable integration we're
> losing the distinction between asking for a log that can't exist
> vs. one that's empty. Maybe the reftable code is returning "yes I have
> logging on" or "yes I have some entries somewhere" in that case?
>
> And in "E", related to "C", isn't it unambiguous to not write it if
> there's no existing entry for the branch in question and
> core.logAllRefUpdates=false is in effect?

So, in short, we can rely on the fact that if a reflog exists it will
have at least one element?

But the original "turn logallrefupdates to false and touch the
reflog files for refs you care about" (which I recall I did for my
own use case) allowed an empty reflog in preparation to have new
entries to be appended, so in its initial state an existing reflog
can have zero elements.

Is there a documented way to just "enable" a single reflog via any
Git command?  That "if a file exists, append" code dates back
more than 15 years and I do not remember if in the target use case
I was happy enough to tell people to just "touch" the reflog file
of interest, or if I bothered to add command support (e.g. "git
reflog create 'refs/heads/next'").

If there isn't, then we could do either one of these two things.

 (1) we could add "git reflog create <ref>" and the reftable can
     record the fact that "reflog exists for the ref, but no ref
     movement recorded yet".  Then the condition C can be checked.

 (2) we could declare that there is no way to create an empty reflog
     supported across ref backends, and make the tests that rely on
     the "feature" conditional on REF_FILES prerequisite.

I have no strong preference.  In the early days I found the ability
to limit which branches get logged convenient, so if reftable
backend can learn the similar trick, we would want to go route (1)
(the convenience largely came from the fact that there was no need
to add one configuration item per branch, so I do not think we would
want to bother with branch.<name>.reflog=bool configuration---that
won't be an easy-to-use substitute).  On the other hand, logs are
useful, and dormant logs are not costing anything (other than holding
onto stale objects we may no longer want), so it could be that it
may not be as convenient as it used to be to be able to turn logs on
only on selected refs, in which case approach (2) is fine.






* Re: reflog existence & reftable
From: Han-Wen Nienhuys @ 2021-04-23  9:20 UTC
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git

On Wed, Apr 21, 2021 at 6:55 PM Junio C Hamano <gitster@pobox.com> wrote:
> If there isn't, then we could do either one of these two things.
>
>  (1) we could add "git reflog create <ref>" and the reftable can
>      record the fact that "reflog exists for the ref, but no ref
>      movement recorded yet".  Then the condition C can be checked.
>
>  (2) we could declare that there is no way to create an empty reflog
>      supported across ref backends, and make the tests that rely on
>      the "feature" conditional on REF_FILES prerequisite.
>
> I have no strong preference.  In the early days I found the ability
> to limit which branches get logged convenient, so if reftable
> backend can learn the similar trick, we would want to go route (1)
> (the convenience largely came from the fact that there was no need
> to add one configuration item per branch, so I do not think we would
> want to bother with branch.<name>.reflog=bool configuration---that
> won't be an easy-to-use substitute).  On the other hand, logs are
> useful, and dormant logs are not costing anything (other than holding
> onto stale objects we may no longer want), so it could be that it
> may not be as convenient as it used to be to be able to turn logs on
> only on selected refs, in which case approach (2) is fine.

Exactly, these are the two options I outlined in my original message.
Both can be made to work. I slightly prefer 2 (empty reflogs don't
exist, and make logging a global switch), because it is simpler to
understand and document. The divergence with the files backend itself
is extra complexity, though. Maybe we could deprecate the behavior and
always write reflogs in the files backend too.


* Re: reflog existence & reftable
From: Jeff King @ 2021-04-23 14:07 UTC
  To: Han-Wen Nienhuys
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, git

On Fri, Apr 23, 2021 at 11:20:52AM +0200, Han-Wen Nienhuys wrote:

> On Wed, Apr 21, 2021 at 6:55 PM Junio C Hamano <gitster@pobox.com> wrote:
> > If there isn't, then we could do either one of these two things.
> >
> >  (1) we could add "git reflog create <ref>" and the reftable can
> >      record the fact that "reflog exists for the ref, but no ref
> >      movement recorded yet".  Then the condition C can be checked.
> >
> >  (2) we could declare that there is no way to create an empty reflog
> >      supported across ref backends, and make the tests that rely on
> >      the "feature" conditional on REF_FILES prerequisite.
> >
> > I have no strong preference.  In the early days I found the ability
> > to limit which branches get logged convenient, so if reftable
> > backend can learn the similar trick, we would want to go route (1)
> > (the convenience largely came from the fact that there was no need
> > to add one configuration item per branch, so I do not think we would
> > want to bother with branch.<name>.reflog=bool configuration---that
> > won't be an easy-to-use substitute).  On the other hand, logs are
> > useful, and dormant logs are not costing anything (other than holding
> > onto stale objects we may no longer want), so it could be that it
> > may not be as convenient as it used to be to be able to turn logs on
> > only on selected refs, in which case approach (2) is fine.
> 
> Exactly, these are the two options I outlined in my original message.
> Both can be made to work. I slightly prefer 2 (empty reflogs don't
> exist, and make logging a global switch), because it is simpler to
> understand and document. The divergence with the files backend itself
> is extra complexity, though. Maybe we could deprecate the behavior and
> always write reflogs in the files backend too.

Yeah, I like (2) as well. This "write a reflog if it already exists"
behavior has always seemed hacky, and like a leftover from early days
when we didn't just turn reflogs on by default. Given that it was
documented as "touch the file", I don't see any need to pretend that it
makes any sense at all in a reftables world.

I'd also be perfectly happy with removing the feature on the files
backend (and perhaps replacing it with a simple globbing config value,
in case anybody really wants to log only some refs). I find it hard to
imagine that anybody would really care, but it _is_ a
backwards-incompatible change. So possibly we should do the usual
deprecation thing, or wait for a major version bump. I dunno.
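
A rough idea of what such a globbing setting could mean, sketched in Python. The setting is purely hypothetical (Git has no such option) and the pattern list and function name are invented. Note also that Python's fnmatch lets "*" match "/", which may differ from how Git matches ref patterns.

```python
from fnmatch import fnmatchcase

# Hypothetical list of ref patterns whose updates should be logged.
LOG_REF_PATTERNS = ["HEAD", "refs/heads/*", "refs/remotes/origin/*"]


def should_log(refname, patterns=LOG_REF_PATTERNS):
    """Return True if `refname` matches any configured pattern."""
    return any(fnmatchcase(refname, pattern) for pattern in patterns)


for ref in ("refs/heads/next", "refs/tags/v1.0", "refs/remotes/origin/main"):
    print(ref, should_log(ref))
```

Under this scheme the decision depends only on configuration, so no backend would need to remember whether an empty reflog exists.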

-Peff


* Re: reflog existence & reftable
From: Han-Wen Nienhuys @ 2021-04-26 17:33 UTC
  To: Jeff King; +Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, git

On Fri, Apr 23, 2021 at 4:07 PM Jeff King <peff@peff.net> wrote:
>
> On Fri, Apr 23, 2021 at 11:20:52AM +0200, Han-Wen Nienhuys wrote:
>
> > On Wed, Apr 21, 2021 at 6:55 PM Junio C Hamano <gitster@pobox.com> wrote:

> > >  (2) we could declare that there is no way to create an empty reflog
> > >      supported across ref backends, and make the tests that rely on
> > >      the "feature" conditional on REF_FILES prerequisite.
> > >
> > > I have no strong preference.  In the early days I found the ability
> > > to limit which branches get logged convenient, so if reftable
> > > backend can learn the similar trick, we would want to go route (1)
> > > (the convenience largely came from the fact that there was no need
> > > to add one configuration item per branch, so I do not think we would
> > > want to bother with branch.<name>.reflog=bool configuration---that
> > > won't be an easy-to-use substitute).  On the other hand, logs are
> > > useful, and dormant logs are not costing anything (other than holding
> > > onto stale objects we may no longer want), so it could be that it
> > > may not be as convenient as it used to be to be able to turn logs on
> > > only on selected refs, in which case approach (2) is fine.
> >
> > Exactly, these are the two options I outlined in my original message.
> > Both can be made to work. I slightly prefer 2 (empty reflogs don't
> > exist, and make logging a global switch), because it is simpler to
> > understand and document. The divergence with the files backend itself
> > is extra complexity, though. Maybe we could deprecate the behavior and
> > > always write reflogs in the files backend too.
>
> Yeah, I like (2) as well. This "write a reflog if it already exists"
> behavior has always seemed hacky, and like a leftover from early days
> when we didn't just turn reflogs on by default. Given that it was
> documented as "touch the file", I don't see any need to pretend that it
> makes any sense at all in a reftables world.

Thanks. Does that count as consensus? Junio?


* Re: reflog existence & reftable
From: Junio C Hamano @ 2021-04-27  6:52 UTC
  To: Han-Wen Nienhuys; +Cc: Jeff King, Ævar Arnfjörð Bjarmason, git

Han-Wen Nienhuys <hanwen@google.com> writes:

> Thanks. Does that count as consensus? Junio?

Sounds like it, even though I have been absent for most of the
period ;-)
