* Re: [gitgitgadget/git] Reftable support git-core (#539)
[not found] ` <gitgitgadget/git/pull/539/c603037045@github.com>
@ 2020-03-27 11:20 ` Han-Wen Nienhuys
0 siblings, 0 replies; 3+ messages in thread
From: Han-Wen Nienhuys @ 2020-03-27 11:20 UTC (permalink / raw)
To: gitgitgadget/git, git, jrn; +Cc: gitgitgadget/git, Mention
On Tue, Mar 24, 2020 at 7:09 AM gitgitgadget[bot]
<notifications@github.com> wrote:
> Contribution format
> ~~~~~~~~~~~~~~~~~~~
> This is a bit of an unusual contribution to Git:
>
> - the heart of the series is in a single commit
> - it is under a different license than most of Git
> - the code doesn't use the same dialect as and "look like" the rest of
> Git
>
> All three of these are because the code is meant to be a standalone
> library that can be used by Git, libgit2, and any other project that
> wants to understand reftables. But they create obstacles to reviewing
> (one reviewer said he had spent a two-hour block looking at the patch
> and not made much headway).
>
> On the other hand, people mostly agreed that having the ability to
> share this code with libgit2 is beneficial. So how can we get a
> substantial review without losing that benefit?
>
> 1. Most importantly, people would like a series of multiple patches
> that tell a story. If review of an early patch in the series
> reveals significant changes that should happen in reftable
> upstream, that's fine --- upstream, that can happen in a patch on
> top, whereas in the series being contributed, the patch would be
> amended.
>
> In the end, there would be two different series of commits: the
> contributed series, and the upstream history. The upstream
> history might be messier, and that's fine. The contributed series
> would be applied to git.git.
This not very practical. I currently sync the reftable library into
git by means of the reftable/update.sh script,
and creating a layered commits is a lot of busywork. Also, getting
the entire code in one go lets reviewers see how the whole thing fits
together.
I do understand the complaint, though. The good news is that there
already is a story in the source code; you just have to read it in the
right order. That order is (bottom-up):
* record.{c,h} - (de)serialized single records
* block.{c,h} - read and write blocks
* writer.c - write a reftable
* reader.c - read a reftable
* merged.{c,h}, pq.{c,h} - reading a stack of reftables
* stack.{c,h} - writing and compacting stacks of reftable on the filesystem.
before anyone attempts to review the code, they should read the
specification of the format very carefully.
> 2. Both Git and libgit2 have abstractions like strbuf. We'd like
> reftable to make use of similar abstractions where they make the
> code cleaner.
Have you looked at slice.{c,h} ? Is there any specific code that
anyone had an issue with?
> 3. libgit2 has a git_malloc allocator. reftable doesn't necessarily
> have to use it or make it pluggable --- at worst, we can #define
> malloc to use it. But it's something to be aware of.
I added pluggable malloc routines.
> Maintenance
> ~~~~~~~~~~~
> There's some interest in the maintenance story: if we run into an
> issue with the reftable library, do we report and fix it on the git
> mailing list or does the reftable library have its own upstream forums
> for that?
There is no specific forum. The most practical solution would be
report to the git mailing list with cc to hanwen@google.com.
Alternatively, an issue on the github project would also work.
I don't foresee much need for maintenance, though. (Or do you foresee
a need to store more types of data besides refs and reflogs?)
> Portability
> ~~~~~~~~~~~
> There was some confusion about the scope of what the reftable library
> provides. It provides a reader and a writer and the caller is
> responsible for connecting those to files.
No, it also does transactional updates to a stack of reftables on the
filesystem.
> There were some questions about mmap usage here (there is none) and
The format is suitable for mmap, but it seems premature optimization,
given the (lack of) efficiency of loose+packed refs storage that it
indends to replace.
> whether the library needs some kind of seeking reader interface for
> abstracting the OS interface to files.
Look at reftable_block_source in reftable.h
> I think the upshot is
>
> 4. Some summary documentation would be helpful e.g. in the README.md,
> to point people to what header file a person should start with to
> understand the library
reftable.h is a good start, as it declares the public interface, and
is extensively documented. It also addresses many of the points that
this email raised.
> 5. People are also interested in the file-oriented part of reftable,
> even though this library doesn't implement it (that's patch 6,
> which is Git-specific).
No, this is incorrect.
> Testing
> ~~~~~~~
> Git's testsuite has various GIT_TEST_* knobs. A GIT_TEST_REF_BACKEND
> knob would be helpful, to allow running the full testsuite with
> reftable. That's a good way to suss out edge cases.
I agree, but this outside my personal area of expertise. I think this
is best tackled by someone else.
> The testsuite should *also* include some specific tests designed for
> reftable. They can demonstrate edge cases and demonstrate some of the
> benefits
There are unittests in the upstream project. Edge cases in the library
itself should be covered by unittests. Is there interest in adding
these to the git project itself?
I can add some Git tests for directory/file conflicts and case sensitivity.
> We also discussed table-driven tests that can be shared between
> implementations but didn't end up saying anything too concrete about
> that.
My plan is to add a cross-language tests, that passes data between
JGit, Go and C implementations to verify that they interoperate.
--
Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen
^ permalink raw reply [flat|nested] 3+ messages in thread