git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Han-Wen Nienhuys <hanwen@google.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>,
	Han-Wen Nienhuys via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, lucamilanesio <luca.milanesio@gmail.com>
Subject: Re: [PATCH 2/3] t1405: mark test that checks existence as REFFILES
Date: Thu, 3 Feb 2022 19:10:43 +0100	[thread overview]
Message-ID: <CAFQ2z_MkZBtjViTsDNuKLYUXzFXJM6sPLOvXdiRAZrs84pggUw@mail.gmail.com> (raw)
In-Reply-To: <220203.867dab6dmp.gmgdl@evledraar.gmail.com>

On Thu, Feb 3, 2022 at 6:53 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> >> Yes.  It is exactly the point of the question I asked.  If it is
> >> simple and easy to add such a new type that is ignored/skipped by
> >> existing clients, then we can go that route.  If it is simple and
> >> easy to add a new bit per ref that existing clients would not barf,
> >> we can use that as an alternative implementation strategy.
> >
> > I'm not sure that there are any JGit clients: I committed reftable
> > support at the end of 2019. Before that time, we were running it
> > internally at Google, but only ref storage, and without the posix
> > part. Reflogs were never stored in refable, and I actually found a
> > couple of bugs in Shawn's Java code.
> >
> > Gerrit has increasingly started using Git as a database, and the
> > packed/loose system is just not a very good database, so that
> > motivates the work reftable in general. But the folks who run Gerrit
> > on a POSIX filesystem want to be sure that isn't a fringe feature, so
> > they only want to start using it once Git itself supports it. So there
> > is a chicken & egg problem.
> >
> > It's sad that we have to introduce an existence bit to make things
> > work, but overall it is probably easier for me to do than trying to
> > make sense of sequencer.c and how it uses refs/stash@{0}.
> >
> > Technically, the only obstacle I see is that we'd need to treat an
> > existence entry especially for the purpose of compaction/gc: we can
> > discard older entries, but we shouldn't discard the existence bit, no
> > matter how old it is.
>
> Ah, that's very informative. I had been assuming (or misremembered) that
> reftable was already seeing production use at Google. Perhaps I
> remembering the now-dead Google Code (or whatever it was called). Maybe
> not.

We use the format (the JGit code) at Google, but we only use it to
store refs, that is, the refname => {SHA1, symref, tag} mapping. We
currently don't store reflog data in reftable, and the bugs I found
were just in the reflog parts.

We store the tables in bigtable (among others), so the part that does
the POSIX file locking is new (basically, everything in
reftable/stack.c and its equivalent in JGit).

So we are locked into the format to some degree.

For the existence bit, I think I could simply record a $zeroid =>
$zeroid ref update in ref log and treat that specially.

> In any case, not being locked into the format as specified is very
> nice. So is it basically seeing no (production) use anywhere as far as
> you know? Whether that's in production at Google, or some third parties
> via JGit-something (maybe as editor libraries?).

I think our friends at Gerritforge have been experimenting with it,
but not in a production setting, AFAIK. Luca might confirm.

> Taking a bit of a step back.
>
> I do think that generally speaking parts of this series are putting the
> cart before the horse in seemingly trying to get the test suite clean
> before we have the integration in-tree.
>
> Not everything you have here, but some of it.
>
> I know I'm the one who started encouraging you to work towards getting
> the test mode passing, but I think that while it's good to mark some
> obviously file-only tests beforehand, anything where we have different
> behavior under reftable should really come after.

(I can't parse your last sentence)

> Of course that will mean we'll have some interim period where our test
> suite is a dumpster fire under GIT_TEST_REFTABLE=true, and I think

It's actually not that bad. By my last count, there were 38 files with
test failures.

> that's fine, as long as we work towards getting it passing, and as long
> as the non-stability of the nascent backend is very prominently
> advertised in the interim.
>
> I.e. I think *the* issue with the original series you had in this regard
> was that git-init.txt (or whatever it was) basically just discussed
> enabling reftable matter-of-factly, when we were still failing
> tens/hundreds of tests, which is just setting up a big bear trap for
> users to step into.

I read that comment, and I removed that long ago. Right now the only
way to get a reftable is to say GIT_TEST_REFTABLE=1 on init.

> But if we just changed those docs a bit to note "!!WARNING WARNING!!
> EXPERIMENTAL AND UNSTABLE !!WARNING WARNING!!" or whatever we could
> merge the API integration parts sooner than later, even with a lot of
> known-broken tests.
>
> We could then whitelist the broken parts, and work on narrowing that set
> down. Similar what the SANITIZE=leak mode is currently doing for memory
> leaks.
>
> I think that would make things a lot easier when reviewing submissions
> like these, in that we have reftable/* in-tree already, but with the
> "real" integration we could check how files/reftable backends behave,
> add the diverging behavior to tests etc.
>
> What do you think?

Sounds more fun than the current model :-)

The latest version of the code is here:
https://github.com/hanwen/git/tree/merged-seen-20220117

What do you need besides the "RFC: reftable backend" commit?

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

  reply	other threads:[~2022-02-03 18:11 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-31 17:50 [PATCH 0/3] reftable related test tweaks Han-Wen Nienhuys via GitGitGadget
2022-01-31 17:50 ` [PATCH 1/3] t1405: explictly delete reflogs for reftable Han-Wen Nienhuys via GitGitGadget
2022-01-31 17:50 ` [PATCH 2/3] t1405: mark test that checks existence as REFFILES Han-Wen Nienhuys via GitGitGadget
2022-01-31 21:26   ` Taylor Blau
2022-01-31 22:15     ` Junio C Hamano
2022-02-01 20:06       ` Han-Wen Nienhuys
2022-02-01 21:03         ` Junio C Hamano
2022-02-01 21:22           ` Ævar Arnfjörð Bjarmason
2022-02-01 22:11             ` Junio C Hamano
2022-02-03 16:02               ` Han-Wen Nienhuys
2022-02-03 17:39                 ` Ævar Arnfjörð Bjarmason
2022-02-03 18:10                   ` Han-Wen Nienhuys [this message]
2022-02-03 23:06                 ` Junio C Hamano
2022-02-07  9:48                   ` Han-Wen Nienhuys
2022-02-07 16:52                     ` Han-Wen Nienhuys
2022-02-07 23:40                       ` Junio C Hamano
2022-02-08 14:58                         ` Han-Wen Nienhuys
2022-01-31 17:50 ` [PATCH 3/3] t5312: prepare for reftable Han-Wen Nienhuys via GitGitGadget
2022-02-01 21:17   ` Ævar Arnfjörð Bjarmason
2022-02-03 14:24     ` Han-Wen Nienhuys
2022-02-03 18:31       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFQ2z_MkZBtjViTsDNuKLYUXzFXJM6sPLOvXdiRAZrs84pggUw@mail.gmail.com \
    --to=hanwen@google.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=luca.milanesio@gmail.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).