git@vger.kernel.org mailing list mirror (one of many)
* reftable & jgit compatibility
@ 2024-04-03 10:36 Han-Wen Nienhuys
  2024-04-03 10:47 ` Patrick Steinhardt
  2024-04-03 15:51 ` Luca Milanesio
  0 siblings, 2 replies; 11+ messages in thread
From: Han-Wen Nienhuys @ 2024-04-03 10:36 UTC
  To: Patrick Steinhardt, git, Josh Steadmon

Thanks again for taking up this work.

As I'm browsing over your patches (and realizing how much of the
arcana of the format I've forgotten), I hope that I did not make any
errors in implementing the spec (and/or that Shawn didn't deviate his
implementation from the spec). It would be extremely unfortunate if an
incompatibility between CGit and JGit were discovered after it is
released.

So far I have always been able to read JGit reftables using the C / Go
code, but it would be good to systematically test this, i.e. generate a
bunch of tables using JGit and check that passing them through the C
code (read & write) leaves them unchanged. Or perhaps check in some
tables as golden reference data.
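
The golden-data idea could be sketched roughly as follows, in Python for brevity: walk a directory of checked-in tables, pass each one through a read-and-write step, and report any whose bytes change. The `rewrite` callable and the `*.ref` glob are placeholders (assumptions, not an existing API); a real harness would plug the implementation under test in at that point.

```python
from pathlib import Path
from typing import Callable, List


def find_roundtrip_mismatches(
    golden_dir: Path, rewrite: Callable[[bytes], bytes]
) -> List[str]:
    """Return the names of golden tables whose bytes change under `rewrite`.

    `rewrite` is a hypothetical hook: it should parse the table and
    serialize it again with the implementation under test.
    """
    mismatches = []
    for table in sorted(golden_dir.glob("*.ref")):
        original = table.read_bytes()
        if rewrite(original) != original:
            mismatches.append(table.name)
    return mismatches
```

Byte equality can only hold if the writer runs with the same options (block size, restart interval, and so on) that produced the golden file, so comparing the decoded records may turn out to be the more practical check.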

Josh can probably connect you to the right folks to help with this on
the JGit side.

-- 
Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen



* Re: reftable & jgit compatibility
  2024-04-03 10:36 reftable & jgit compatibility Han-Wen Nienhuys
@ 2024-04-03 10:47 ` Patrick Steinhardt
  2024-04-03 15:57   ` Han-Wen Nienhuys
  2024-04-03 20:54   ` Jeff King
  2024-04-03 15:51 ` Luca Milanesio
  1 sibling, 2 replies; 11+ messages in thread
From: Patrick Steinhardt @ 2024-04-03 10:47 UTC
  To: Han-Wen Nienhuys; +Cc: git, Josh Steadmon


On Wed, Apr 03, 2024 at 12:36:04PM +0200, Han-Wen Nienhuys wrote:
> Thanks again for taking up this work.
> 
> As I'm browsing over your patches (and realizing how much of the
> arcana of the format I've forgotten), I hope that I did not make any
> errors in implementing the spec (and/or that Shawn didn't deviate his
> implementation from the spec). It would be extremely unfortunate if an
> incompatibility between CGit and JGit were discovered after it is
> released.
> 
> So far I have always been able to read JGit reftables using the C / Go
> code, but it would be good to systematically test this, ie. generate a
>  bunch of tables using JGit and check that passing them through the C
> code (read & write) leaves them unchanged. Or perhaps check in some
> tables as golden reference data.
> 
> Josh can probably connect you to the right folks to help with this on
> the JGit side.

I very much agree, this thought has crossed my mind multiple times while
working on the whole reftable saga. Ideally, we would have integration
tests that write reftables with one of the implementations and then read
them with the respective other implementation. I wouldn't really know
where to put those though. CGit is very unlikely to pull in JGit as a
test dependency. Does JGit have any tests that already use CGit?

Adding a bunch of reftables pre-generated by JGit might be an okayish
tradeoff, I guess. I also don't really expect the format to evolve
significantly, so these should be reasonably static over the long term.

Patrick


* Re: reftable & jgit compatibility
  2024-04-03 10:36 reftable & jgit compatibility Han-Wen Nienhuys
  2024-04-03 10:47 ` Patrick Steinhardt
@ 2024-04-03 15:51 ` Luca Milanesio
  2024-04-03 16:42   ` Junio C Hamano
  1 sibling, 1 reply; 11+ messages in thread
From: Luca Milanesio @ 2024-04-03 15:51 UTC
  To: git, JGit Developers list
  Cc: Luca Milanesio, Patrick Steinhardt, Josh Steadmon,
	Han-Wen Nienhuys

Hi Han-Wen,
Thanks for completing the reftable support in JGit and kicking off the work on CGit.

> On 3 Apr 2024, at 11:36, Han-Wen Nienhuys <hanwenn@gmail.com> wrote:
> 
> Thanks again for taking up this work.
> 
> As I'm browsing over your patches (and realizing how much of the
> arcana of the format I've forgotten), I hope that I did not make any
> errors in implementing the spec (and/or that Shawn didn't deviate his
> implementation from the spec). It would be extremely unfortunate if an
> incompatibility between CGit and JGit were discovered after it is
> released.
> 
> So far I have always been able to read JGit reftables using the C / Go
> code, but it would be good to systematically test this, ie. generate a
> bunch of tables using JGit and check that passing them through the C
> code (read & write) leaves them unchanged. Or perhaps check in some
> tables as golden reference data.
> Josh can probably connect you to the right folks to help with this on
> the JGit side.

I am happy to experiment with the support on GerritHub.io; we have over 40k repositories!

Luca.





* Re: reftable & jgit compatibility
  2024-04-03 10:47 ` Patrick Steinhardt
@ 2024-04-03 15:57   ` Han-Wen Nienhuys
  2024-04-04  6:23     ` Patrick Steinhardt
  2024-04-03 20:54   ` Jeff King
  1 sibling, 1 reply; 11+ messages in thread
From: Han-Wen Nienhuys @ 2024-04-03 15:57 UTC
  To: Patrick Steinhardt; +Cc: git, Josh Steadmon

On Wed, Apr 3, 2024 at 4:41 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Wed, Apr 03, 2024 at 12:36:04PM +0200, Han-Wen Nienhuys wrote:
> > Thanks again for taking up this work.
> >
> > As I'm browsing over your patches (and realizing how much of the
> > arcana of the format I've forgotten), I hope that I did not make any
> > errors in implementing the spec (and/or that Shawn didn't deviate his
> > implementation from the spec). It would be extremely unfortunate if an
> > incompatibility between CGit and JGit were discovered after it is
> > released.
> >
> > So far I have always been able to read JGit reftables using the C / Go
> > code, but it would be good to systematically test this, ie. generate a
> >  bunch of tables using JGit and check that passing them through the C
> > code (read & write) leaves them unchanged. Or perhaps check in some
> > tables as golden reference data.
> >
> > Josh can probably connect you to the right folks to help with this on
> > the JGit side.
>
> I very much agree, this thought has crossed my mind multiple times while
> working on the whole reftable saga. Ideally, we would have integration
> tests that write reftables with one of the implementations and then read
> them with the respective other implementation. I wouldn't really know
> where to put those though. CGit is very unlikely to pull in JGit as a
> test dependency. Does JGit have any tests that already use CGit?

Yes, but not many (e.g. CGitIgnoreTest.java).

I think the easiest way to make this happen is if CGit would ship a
command to dump a raw reftable in a release soonish. Then JGit could
use that command to cross-check that a JGit-written reftable can be
read correctly by the CGit code.  By shipping just the dumper you
avoid having to wait for proper reftable support to land in git.

Probably the dumper should be extended to also support seeks, so you
can also exercise the indexing/searching code.

-- 
Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen



* Re: reftable & jgit compatibility
  2024-04-03 15:51 ` Luca Milanesio
@ 2024-04-03 16:42   ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2024-04-03 16:42 UTC
  To: Luca Milanesio
  Cc: git, JGit Developers list, Patrick Steinhardt, Josh Steadmon,
	Han-Wen Nienhuys

Luca Milanesio <luca.milanesio@gmail.com> writes:

> Hi Han-Wen,
> Thanks for completing the ref-table on JGit and kicking off the work on CGit.
> ...
>> So far I have always been able to read JGit reftables using the C / Go
>> code, but it would be good to systematically test this, ie. generate a
>> bunch of tables using JGit and check that passing them through the C
>> code (read & write) leaves them unchanged. Or perhaps check in some
>> tables as golden reference data.
> ...
> I am happy to experiment the support on GerritHub.io, we have over 40k repositories !

Thanks.



* Re: reftable & jgit compatibility
  2024-04-03 10:47 ` Patrick Steinhardt
  2024-04-03 15:57   ` Han-Wen Nienhuys
@ 2024-04-03 20:54   ` Jeff King
  2024-04-04  6:29     ` Patrick Steinhardt
  1 sibling, 1 reply; 11+ messages in thread
From: Jeff King @ 2024-04-03 20:54 UTC
  To: Patrick Steinhardt; +Cc: Han-Wen Nienhuys, git, Josh Steadmon

On Wed, Apr 03, 2024 at 12:47:15PM +0200, Patrick Steinhardt wrote:

> I very much agree, this thought has crossed my mind multiple times while
> working on the whole reftable saga. Ideally, we would have integration
> tests that write reftables with one of the implementations and then read
> them with the respective other implementation. I wouldn't really know
> where to put those though. CGit is very unlikely to pull in JGit as a
> test dependency. Does JGit have any tests that already use CGit?

We do have some tests that use jgit to check bitmap interoperability.
But obviously they're optional, and I suspect they are not run very
often (I do have jgit in my path these days, so I run them, but I assume
most people don't). It probably wouldn't be too hard to include it in
one of the CI runs, though. You can grep for the JGIT prereq in t/.
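
For readers outside Git's shell test framework, the optional-prerequisite pattern can be illustrated with a Python analogue: the test below is skipped when no `jgit` executable is on the PATH. The executable name, helper, and test class are illustrative assumptions; Git's own tests use the JGIT prereq rather than anything like this.

```python
import shutil
import unittest


def tool_available(name: str) -> bool:
    """True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


class JGitInteropTest(unittest.TestCase):
    # Skipped (not failed) on machines without JGit, which is why such
    # tests can rot unnoticed unless at least one CI job installs it.
    @unittest.skipUnless(tool_available("jgit"), "jgit not found in PATH")
    def test_smoke(self):
        # Placeholder body: a real test would exercise JGit here.
        self.assertTrue(tool_available("jgit"))
```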

We had another test that used jgit to check for some protocol
interoperability. But it was broken with sha256 and nobody noticed. ;)
There I replaced it with a hard-coded input. See 13e67aa39b (v0
protocol: fix sha1/sha256 confusion for capabilities^{}, 2023-04-14) for
some discussion.

I think using actual jgit (versus a hard-coded input) is a good basic
smoke test: it tells us if the two can interoperate generally. But for
testing specific inputs like the case in 13e67aa39b, we are depending on
jgit producing that specific behavior (which in this case, it probably
wasn't any more). And there we are better off just with a manual test
vector.

-Peff



* Re: reftable & jgit compatibility
  2024-04-03 15:57   ` Han-Wen Nienhuys
@ 2024-04-04  6:23     ` Patrick Steinhardt
  2024-04-04  6:44       ` Han-Wen Nienhuys
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick Steinhardt @ 2024-04-04  6:23 UTC
  To: Han-Wen Nienhuys; +Cc: git, Josh Steadmon


On Wed, Apr 03, 2024 at 05:57:19PM +0200, Han-Wen Nienhuys wrote:
> On Wed, Apr 3, 2024 at 4:41 PM Patrick Steinhardt <ps@pks.im> wrote:
> >
> > On Wed, Apr 03, 2024 at 12:36:04PM +0200, Han-Wen Nienhuys wrote:
> > > Thanks again for taking up this work.
> > >
> > > As I'm browsing over your patches (and realizing how much of the
> > > arcana of the format I've forgotten), I hope that I did not make any
> > > errors in implementing the spec (and/or that Shawn didn't deviate his
> > > implementation from the spec). It would be extremely unfortunate if an
> > > incompatibility between CGit and JGit were discovered after it is
> > > released.
> > >
> > > So far I have always been able to read JGit reftables using the C / Go
> > > code, but it would be good to systematically test this, ie. generate a
> > >  bunch of tables using JGit and check that passing them through the C
> > > code (read & write) leaves them unchanged. Or perhaps check in some
> > > tables as golden reference data.
> > >
> > > Josh can probably connect you to the right folks to help with this on
> > > the JGit side.
> >
> > I very much agree, this thought has crossed my mind multiple times while
> > working on the whole reftable saga. Ideally, we would have integration
> > tests that write reftables with one of the implementations and then read
> > them with the respective other implementation. I wouldn't really know
> > where to put those though. CGit is very unlikely to pull in JGit as a
> > test dependency. Does JGit have any tests that already use CGit?
> 
> Yes, but not many (eg. CGitIgnoreTest.java).
> 
> I think the easiest way to make this happen is if CGit would ship a
> command to dump a raw reftable in a release soonish. Then JGit could
> use that command to cross-check that a JGit-written reftable can be
> read correctly by the CGit code.  By shipping just the dumper you
> avoid having to wait for proper reftable support to land in git.

You do realize that "proper reftable support" has already landed, right?
So you can just use Git to create a reftable-enabled repository, write
commits and then use JGit to access the whole repository instead of only
checking a single table.
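
A repository-level check along these lines might look like the sketch below: list the refs with both tools and compare the results. It assumes a `jgit` launcher on the PATH whose `show-ref` prints `<oid> <refname>` lines like Git's does, and a repository created by a Git version that has the reftable backend (for example with `git init --ref-format=reftable`). The helper names are made up for illustration.

```python
import subprocess
from typing import Dict, List


def parse_show_ref(output: str) -> Dict[str, str]:
    """Map ref name -> object id from '<oid> <refname>' lines."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        oid, name = line.split(None, 1)
        refs[name.strip()] = oid
    return refs


def show_ref(tool: str, repo: str) -> Dict[str, str]:
    """Run '<tool> show-ref' inside `repo` and parse the result.

    Note: 'git show-ref' exits non-zero when there are no refs at all,
    so the repository needs at least one commit.
    """
    result = subprocess.run(
        [tool, "show-ref"], cwd=repo, check=True, capture_output=True, text=True
    )
    return parse_show_ref(result.stdout)


def differing_refs(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Ref names missing on one side or pointing at different objects."""
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

With both tools installed, `differing_refs(show_ref("git", repo), show_ref("jgit", repo))` would be expected to return an empty list if the two implementations agree.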

Might be I'm missing your point though, not sure.

Patrick

> Probably the dumper should be extended to also support seeks, so you
> can also exercise the indexing/searching code.




* Re: reftable & jgit compatibility
  2024-04-03 20:54   ` Jeff King
@ 2024-04-04  6:29     ` Patrick Steinhardt
  0 siblings, 0 replies; 11+ messages in thread
From: Patrick Steinhardt @ 2024-04-04  6:29 UTC
  To: Jeff King; +Cc: Han-Wen Nienhuys, git, Josh Steadmon


On Wed, Apr 03, 2024 at 04:54:51PM -0400, Jeff King wrote:
> On Wed, Apr 03, 2024 at 12:47:15PM +0200, Patrick Steinhardt wrote:
> 
> > I very much agree, this thought has crossed my mind multiple times while
> > working on the whole reftable saga. Ideally, we would have integration
> > tests that write reftables with one of the implementations and then read
> > them with the respective other implementation. I wouldn't really know
> > where to put those though. CGit is very unlikely to pull in JGit as a
> > test dependency. Does JGit have any tests that already use CGit?
> 
> We do have some tests that use jgit to check bitmap interoperability.
> But obviously they're optional, and I suspect they are not run very
> often (I do have jgit in my path these days, so I run them, but I assume
> most people don't). It probably wouldn't be too hard to include it in
> one of the CI runs, though. You can grep for the JGIT prereq in t/.

Oh, that's great, I didn't know about that! I will take a look at
updating our CI systems to include JGit...

> We had another test that used jgit to check for some protocol
> interoperability. But it was broken with sha256 and nobody noticed. ;)
> There I replaced it with a hard-coded input. See 13e67aa39b (v0
> protocol: fix sha1/sha256 confusion for capabilities^{}, 2023-04-14) for
> some discussion.

... also to avoid rotting tests like this.

> I think using actual jgit (versus a hard-coded input) is a good basic
> smoke test: it tells us if the two can interoperate generally. But for
> testing specific inputs like the case in 13e67aa39b, we are depending on
> jgit producing that specific behavior (which in this case, it probably
> wasn't any more). And there we are better off just with a manual test
> vector.

Agreed. I will add some basic interop tests that ensure that JGit and
CGit can read their respective formats. I don't want it to be too fancy
initially, but it's good to have a baseline which we can iterate from in
the future.

Patrick


* Re: reftable & jgit compatibility
  2024-04-04  6:23     ` Patrick Steinhardt
@ 2024-04-04  6:44       ` Han-Wen Nienhuys
  2024-04-04  7:20         ` Patrick Steinhardt
  0 siblings, 1 reply; 11+ messages in thread
From: Han-Wen Nienhuys @ 2024-04-04  6:44 UTC
  To: Patrick Steinhardt; +Cc: git, Josh Steadmon

On Thu, Apr 4, 2024 at 8:23 AM Patrick Steinhardt <ps@pks.im> wrote:

> > I think the easiest way to make this happen is if CGit would ship a
> > command to dump a raw reftable in a release soonish. Then JGit could
> > use that command to cross-check that a JGit-written reftable can be
> > read correctly by the CGit code.  By shipping just the dumper you
> > avoid having to wait for proper reftable support to land in git.
>
> You do realize that "proper reftable support" has already landed, right?

I had not realized this, and that's great news!

> So you can just use Git to create a reftable-enabled repository, write
> commits and then use JGit to access the whole repository instead of only
> checking a single table.

For testing, it's probably easier if you can work in terms of
individual tables (because that is where the complexity lies:
different blocksizes, restart frequencies, with index, without index,
with reflog, without reflog etc.), but one can create controlled
individual tables by creating a whole repo and then compacting it.
OTOH, this would necessitate exposing all writer options to the git
CLI, which is maybe a bit much.
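
To give a feel for how quickly those per-table dimensions multiply, here is a small sketch that enumerates the combinations a table-level test might want to cover. The concrete block sizes and restart intervals are arbitrary example values, and `TableVariant` is a hypothetical name, not part of any reftable library.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Sequence


@dataclass(frozen=True)
class TableVariant:
    block_size: int
    restart_interval: int
    with_index: bool
    with_reflog: bool


def table_variants(
    block_sizes: Sequence[int] = (256, 1024, 4096),
    restart_intervals: Sequence[int] = (1, 16),
) -> Iterator[TableVariant]:
    """Yield every combination of the writer options listed above."""
    for bs, ri, idx, log in product(
        block_sizes, restart_intervals, (False, True), (False, True)
    ):
        yield TableVariant(bs, ri, idx, log)
```

Even these few example values give 3 * 2 * 2 * 2 = 24 variants, which hints at why exposing every writer option through the CLI may be more than is wanted.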

-- 
Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen



* Re: reftable & jgit compatibility
  2024-04-04  6:44       ` Han-Wen Nienhuys
@ 2024-04-04  7:20         ` Patrick Steinhardt
  2024-04-04  8:36           ` Han-Wen Nienhuys
  0 siblings, 1 reply; 11+ messages in thread
From: Patrick Steinhardt @ 2024-04-04  7:20 UTC
  To: Han-Wen Nienhuys; +Cc: git, Josh Steadmon


On Thu, Apr 04, 2024 at 08:44:44AM +0200, Han-Wen Nienhuys wrote:
> On Thu, Apr 4, 2024 at 8:23 AM Patrick Steinhardt <ps@pks.im> wrote:
> 
> > > I think the easiest way to make this happen is if CGit would ship a
> > > command to dump a raw reftable in a release soonish. Then JGit could
> > > use that command to cross-check that a JGit-written reftable can be
> > > read correctly by the CGit code.  By shipping just the dumper you
> > > avoid having to wait for proper reftable support to land in git.
> >
> > You do realize that "proper reftable support" has already landed, right?
> 
> I had not realized this, and that's great news!
> 
> > So you can just use Git to create a reftable-enabled repository, write
> > commits and then use JGit to access the whole repository instead of only
> > checking a single table.
> 
> For testing, it's probably easier if you can work in terms of
> individual tables (because that is where the complexity lies:
> different blocksizes, restart frequencies, with index, without index,
> with reflog, without reflog etc.), but one can create controlled
> individual tables by creating a whole repo and then compacting it.
> OTOH, this would necessitate exposing all writer options to the git
> CLI, which is maybe a bit much.

Potentially, yeah. But as you say, it's likely quite some complexity to
expose this via the CLI directly. So for now, I'm going to focus on some
basic interoperability tests in Git that act on the repository level. We
can build on that and expand them as required when the need arises.

Different blocksizes is definitely a bit of a sore spot right now. I do
plan to expose write options via Git config options in the future, e.g.
something like "reftable.blockSize" or "reftable.restartCount". But for
all I know the CGit reftable library doesn't yet play nice with block
sizes other than 4k.
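
For context, the block size of an existing table can be read straight from its file header. The sketch below assumes the version 1 header layout from the reftable format documentation (the magic `REFT`, a one-byte version, a 24-bit block size, then two 64-bit update indices, all big-endian); the type and function names are made up for illustration.

```python
import struct
from typing import NamedTuple


class ReftableHeader(NamedTuple):
    version: int
    block_size: int
    min_update_index: int
    max_update_index: int


def parse_header(data: bytes) -> ReftableHeader:
    """Decode the first 24 bytes of a reftable file."""
    if len(data) < 24 or data[:4] != b"REFT":
        raise ValueError("not a reftable file")
    version = data[4]
    # The block size is stored as a 3-byte big-endian integer.
    block_size = int.from_bytes(data[5:8], "big")
    min_idx, max_idx = struct.unpack(">QQ", data[8:24])
    return ReftableHeader(version, block_size, min_idx, max_idx)
```

Running this over tables written by each implementation would be one cheap way to see which block sizes actually occur in practice.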

I didn't yet want to introduce configs which are specific to reftables
in the first release of Git with the reftable backend, so I pushed this
issue further down. I do plan to work on that in the next release cycle
though.

Patrick


* Re: reftable & jgit compatibility
  2024-04-04  7:20         ` Patrick Steinhardt
@ 2024-04-04  8:36           ` Han-Wen Nienhuys
  0 siblings, 0 replies; 11+ messages in thread
From: Han-Wen Nienhuys @ 2024-04-04  8:36 UTC
  To: Patrick Steinhardt; +Cc: git, Josh Steadmon

On Thu, Apr 4, 2024 at 9:20 AM Patrick Steinhardt <ps@pks.im> wrote:
> Different blocksizes is definitely a bit of a sore spot right now. I do
> plan to expose write options via Git config options in the future, e.g.
> something like "reftable.blockSize" or "reftable.restartCount". But for
> all I know the CGit reftable library doesn't yet play nice with block
> sizes other than 4k.

It shouldn't be too bad. Many unittests use blocksizes other than 4k
simply because populating a multi-block table takes less space at
smaller blocksizes.

-- 
Han-Wen Nienhuys - hanwenn@gmail.com - http://www.xs4all.nl/~hanwen

