Re: [gitgitgadget/git] Reftable support git-core (#539)

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* Re: [gitgitgadget/git] Reftable support git-core (#539)
       [not found] ` <gitgitgadget/git/pull/539/c603037045@github.com>
@ 2020-03-27 11:20   ` Han-Wen Nienhuys
  0 siblings, 0 replies; 3+ messages in thread
From: Han-Wen Nienhuys @ 2020-03-27 11:20 UTC (permalink / raw)
  To: gitgitgadget/git, git, jrn; +Cc: gitgitgadget/git, Mention

On Tue, Mar 24, 2020 at 7:09 AM gitgitgadget[bot]
<notifications@github.com> wrote:
> Contribution format
> ~~~~~~~~~~~~~~~~~~~
> This is a bit of an unusual contribution to Git:
>
> - the heart of the series is in a single commit
> - it is under a different license than most of Git
> - the code doesn't use the same dialect as and "look like" the rest of
>   Git
>
> All three of these are because the code is meant to be a standalone
> library that can be used by Git, libgit2, and any other project that
> wants to understand reftables.  But they create obstacles to reviewing
> (one reviewer said he had spent a two-hour block looking at the patch
> and not made much headway).
>
> On the other hand, people mostly agreed that having the ability to
> share this code with libgit2 is beneficial.  So how can we get a
> substantial review without losing that benefit?
>
>  1. Most importantly, people would like a series of multiple patches
>     that tell a story.  If review of an early patch in the series
>     reveals significant changes that should happen in reftable
>     upstream, that's fine --- upstream, that can happen in a patch on
>     top, whereas in the series being contributed, the patch would be
>     amended.
>
>     In the end, there would be two different series of commits: the
>     contributed series, and the upstream history.  The upstream
>     history might be messier, and that's fine.  The contributed series
>     would be applied to git.git.

This not very practical. I currently sync the reftable library into
git by means of the reftable/update.sh script,
and creating a layered commits is a lot of busywork.  Also, getting
the entire code in one go lets reviewers see how the whole thing fits
together.

I do understand the complaint, though. The good news is that there
already is a story in the source code; you just have to read it in the
right order. That order is (bottom-up):

* record.{c,h} - (de)serialized single records
* block.{c,h} - read and write blocks
* writer.c - write a reftable
* reader.c - read a reftable
* merged.{c,h}, pq.{c,h} - reading a stack of reftables
* stack.{c,h} - writing and compacting stacks of reftable on the filesystem.

before anyone attempts to review the code, they should read the
specification of the format very carefully.

>  2. Both Git and libgit2 have abstractions like strbuf.  We'd like
>     reftable to make use of similar abstractions where they make the
>     code cleaner.

Have you looked at slice.{c,h} ? Is there any specific code that
anyone had an issue with?

>  3. libgit2 has a git_malloc allocator.  reftable doesn't necessarily
>     have to use it or make it pluggable --- at worst, we can #define
>     malloc to use it.  But it's something to be aware of.

I added pluggable malloc routines.

> Maintenance
> ~~~~~~~~~~~
> There's some interest in the maintenance story: if we run into an
> issue with the reftable library, do we report and fix it on the git
> mailing list or does the reftable library have its own upstream forums
> for that?

There is no specific forum. The most practical solution would be
report to the git mailing list with cc to hanwen@google.com.

Alternatively, an issue on the github project would also work.

I don't foresee much need for maintenance, though. (Or do you foresee
a need to store more types of data besides refs and reflogs?)

> Portability
> ~~~~~~~~~~~
> There was some confusion about the scope of what the reftable library
> provides.  It provides a reader and a writer and the caller is
> responsible for connecting those to files.

No, it also does transactional updates to a stack of reftables on the
filesystem.

> There were some questions about mmap usage here (there is none) and

The format is suitable for mmap, but it seems premature optimization,
given the (lack of) efficiency of loose+packed refs storage that it
indends to replace.

> whether the library needs some kind of seeking reader interface for
> abstracting the OS interface to files.

Look at reftable_block_source in reftable.h

> I think the upshot is
>
>  4. Some summary documentation would be helpful e.g. in the README.md,
>     to point people to what header file a person should start with to
>     understand the library

reftable.h is a good start, as it declares the public interface, and
is extensively documented. It also addresses many of the points that
this email raised.

>  5. People are also interested in the file-oriented part of reftable,
>     even though this library doesn't implement it (that's patch 6,
>     which is Git-specific).

No, this is incorrect.

> Testing
> ~~~~~~~
> Git's testsuite has various GIT_TEST_* knobs.  A GIT_TEST_REF_BACKEND
> knob would be helpful, to allow running the full testsuite with
> reftable.  That's a good way to suss out edge cases.

I agree, but this outside my personal area of expertise. I think this
is best tackled by someone else.

> The testsuite should *also* include some specific tests designed for
> reftable.  They can demonstrate edge cases and demonstrate some of the
> benefits

There are unittests in the upstream project. Edge cases in the library
itself should be covered by unittests. Is there interest in adding
these to the git project itself?

I can add some Git tests for directory/file conflicts and case sensitivity.

> We also discussed table-driven tests that can be shared between
> implementations but didn't end up saying anything too concrete about
> that.

My plan is to add a cross-language tests, that passes data between
JGit, Go and C implementations to verify that they interoperate.

--
Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen

^ permalink raw reply	[flat|nested] 3+ messages in thread

[parent not found: <gitgitgadget/git/pull/539/c615547763@github.com>]

[parent not found: <CAOw_e7azo1Wb9RO=7kH8kXp4RxTzD6SW4a9w_2ifiGxUmt2YKw@mail.gmail.com>]

* Re: [gitgitgadget/git] Reftable support git-core (#539)
       [not found]   ` <CAOw_e7azo1Wb9RO=7kH8kXp4RxTzD6SW4a9w_2ifiGxUmt2YKw@mail.gmail.com>
@ 2020-04-22 11:27     ` Johannes Schindelin
  2020-04-22 15:18       ` Han-Wen Nienhuys
  0 siblings, 1 reply; 3+ messages in thread
From: Johannes Schindelin @ 2020-04-22 11:27 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: git

Hi Han-Wen,

On Mon, 20 Apr 2020, Han-Wen Nienhuys wrote:

> On Sat, Apr 18, 2020 at 5:26 AM gitgitgadget[bot] <notifications@github.com>
> wrote:
>
> > On the Git mailing list
> > <https://lore.kernel.org/git/20200418032252.GA9169@danh.dev>, Danh Doan
> > wrote (reply to this
> > <https://github.com/gitgitgadget/gitgitgadget/wiki/ReplyToThis>):
> >
> > On 2020-04-15 16:29:33-0700, Junio C Hamano <gitster@pobox.com> wrote:
> > > "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > >
> > > > This adds the reftable library, and hooks it up as a ref backend.
> > >
> > > Since I queued the topic to 'pu', I needed to apply these fix-up
> > > patches on top.  The change to the Makefile is about "make clean"
> > > which forgot to clean the reftable library.  All the rest are ws
> > > clean-ups.
> > >
> > > There are build breakages reported on Windows, with possible fixes
> > > (which IIRC were reported to segfault X-<), but I do not have URL or
> > > message-IDs handy.
> >
> > FYI, it's also segfault in Linux even if only the test is applied.
> > IOW, the code to fix breakages on Windows doesn't affect the segfault.
> >
> > Here is the message: <nycvar.QRO.7.76.6.2004101604210.46@tvgsbejvaqbjf.bet>
> >
> >
> Thanks for the reference. I'll look into it.
>
> Johannes, if you have specific comments about the reftable patch, please
> include me (hanwen@google.com) explicitly on your messages.

Sorry, I forgot. Will try to do better next time.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [gitgitgadget/git] Reftable support git-core (#539)
  2020-04-22 11:27     ` Johannes Schindelin
@ 2020-04-22 15:18       ` Han-Wen Nienhuys
  0 siblings, 0 replies; 3+ messages in thread
From: Han-Wen Nienhuys @ 2020-04-22 15:18 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Han-Wen Nienhuys, git

x

On Wed, Apr 22, 2020 at 1:27 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> > > Here is the message: <nycvar.QRO.7.76.6.2004101604210.46@tvgsbejvaqbjf.bet>
> > >
> > >
> > Thanks for the reference. I'll look into it.
> >
> > Johannes, if you have specific comments about the reftable patch, please
> > include me (hanwen@google.com) explicitly on your messages.
>
> Sorry, I forgot. Will try to do better next time.

I fixed all the outstanding issues in the last push to GGG. For the
record, the following caused segfaults:

https://github.com/google/reftable/commit/31078a87067ec9c33f496afb6b478eed6e9c3d12
https://github.com/google/reftable/commit/9107ddd6ed73844cb9092dc18ba92091a1132a9e

The message about "bad replace ref name" was a missing copy of the
prefix filter.

The C reftable library was based on the code I wrote in Go first, and
I keep both versions in sync, hence the null/zero initialization of
data throughout.

By now I'm an expert on reftable, but not so much about Git's test
infrastructure, and unfortunately, I lack the time to become an expert
without your guidance, so please give specific feedback.

The most pressing thing right now is that the windows port on GGG says

 ++ git tag file
 fatal: reftable: transaction failure general error

which suggests that renames don't work like I think they do on windows.

> Ciao,
> Dscho

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-04-22 15:19 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <gitgitgadget/git/pull/539@github.com>
     [not found] ` <gitgitgadget/git/pull/539/c603037045@github.com>
2020-03-27 11:20   ` [gitgitgadget/git] Reftable support git-core (#539) Han-Wen Nienhuys
     [not found] ` <gitgitgadget/git/pull/539/c615547763@github.com>
     [not found]   ` <CAOw_e7azo1Wb9RO=7kH8kXp4RxTzD6SW4a9w_2ifiGxUmt2YKw@mail.gmail.com>
2020-04-22 11:27     ` Johannes Schindelin
2020-04-22 15:18       ` Han-Wen Nienhuys

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).