git@vger.kernel.org mailing list mirror (one of many)
From: Shawn Pearce <spearce@spearce.org>
To: Ben Alex <ben.alex@acegi.com.au>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	git <git@vger.kernel.org>, "Jeff King" <peff@peff.net>,
	"Michael Haggerty" <mhagger@alum.mit.edu>,
	"Junio C Hamano" <gitster@pobox.com>,
	"David Borowitz" <dborowitz@google.com>,
	"Stefan Beller" <sbeller@google.com>,
	"David Turner" <David.Turner@twosigma.com>,
	"Kristoffer Sjogren" <stoffe@gmail.com>
Subject: Re: reftable [v5]: new ref storage format
Date: Mon, 7 Aug 2017 07:41:43 -0700
Message-ID: <CAJo=hJsEaKH40WnhxqvkASpiXnV8ipc+b1zrZ9VEjqRjpJ17Qg@mail.gmail.com>
In-Reply-To: <CAOhB0ruYhGAyNn84ZjS7TH7QdwxNi2bPN8KFxEEBd58B9qVrmg@mail.gmail.com>

On Sun, Aug 6, 2017 at 4:37 PM, Ben Alex <ben.alex@acegi.com.au> wrote:
> Just on the LmdbJava specific pieces:
>
> On Mon, Aug 7, 2017 at 8:56 AM, Shawn Pearce <spearce@spearce.org> wrote:
>>
>> Looks pretty complete. It's a Java wrapper around the C implementation
>> of LMDB, which may be sufficient for reference storage. Keys are
>> limited to 511 bytes, so insanely long reference names would have to
>> be rejected. Reftable allows reference names up to the file's
>> `page_size`, minus overhead (~15 bytes) and value (20 bytes).
>
>
> For clarification LmdbJava code doesn't enforce a particular key size limit.
> For puts the caller nominates the size in the buffer they present for
> storage, and for get-style operations (cursors etc) the LMDB database stores
> the key size and LmdbJava adjusts the Java-visible buffer accordingly.
>
> A 511 byte key limit is specified at compile time for the native LMDB
> library. For convenience the native library is compiled for 64-bit Windows,
> Linux and Mac OS and included in the LmdbJava JAR, and this compilation is
> performed using default values (including the 511 key limit) by the
> https://github.com/lmdbjava/native project. Users can specify a different
> native library to use (eg one packaged by their OS or separately compiled
> using an LmdbJava Native-like automatic build) with a larger key size if
> they wish.
>
> As such if JGit wanted to use a longer key size, it is possible to implement
> similar automatic builds and packaging into JGit.

I don't know if we need a larger key size. $DAY_JOB limits ref names
to ~200 bytes in a hook. I think GitHub does similar. But I'm worried
about the general masses who might be using our software and who expect,
as has been true thus far, ref names to be as long as PATH_MAX on their
system. Most systems run PATH_MAX around 1024.

The need for native JARs, combined with such a low compile-time
constant, may be annoying to some.
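
As a rough illustration of the two limits discussed above, here is a
minimal sketch that checks a ref name against LMDB's default 511-byte
key limit and against the reftable bound of `page_size` minus overhead
and value. The function names, the 4096-byte page size, and the choice
to treat the "~15 bytes" of overhead as exactly 15 are assumptions made
for the example, not part of either format.

```python
# Sketch only: constants come from the figures quoted in this thread.
LMDB_DEFAULT_MAX_KEY = 511   # default compile-time key limit of native LMDB
REFTABLE_OVERHEAD = 15       # "~15 bytes" in the thread; treated as exact here (assumption)
REFTABLE_VALUE_LEN = 20      # 20-byte value, as quoted in the thread


def ref_name_len(name: str) -> int:
    """Length of the ref name in bytes, assuming UTF-8 encoding."""
    return len(name.encode("utf-8"))


def fits_lmdb(name: str, max_key: int = LMDB_DEFAULT_MAX_KEY) -> bool:
    """True if the ref name could be stored as a single LMDB key."""
    return ref_name_len(name) <= max_key


def fits_reftable(name: str, page_size: int) -> bool:
    """True if the ref name fits within one page after overhead and value."""
    return ref_name_len(name) <= page_size - REFTABLE_OVERHEAD - REFTABLE_VALUE_LEN


if __name__ == "__main__":
    # A hypothetical ref name of 1024 bytes, roughly the PATH_MAX figure above.
    long_ref = "refs/heads/" + "x" * 1013
    for ref in ("refs/heads/master", long_ref):
        print(ref_name_len(ref), fits_lmdb(ref), fits_reftable(ref, 4096))
```

With these assumed numbers, a ~200 byte name passes both checks, while a
name near 1024 bytes is rejected by the 511-byte key limit but still fits
in a 4096-byte page.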

>> A downside for JGit is getting these two open source projects cleared.
>> We would have to get approval from our sponsor (Eclipse Foundation) to
>> use both lmdbjava (Apache License) and LMDB (LMDB license).
>
>
> I can't speak for the other contributors, but I'm happy to review LmdbJava's
> license if that would help. For example changing to the OpenLDAP License would
> seem a reasonable variation given users of LmdbJava already need to accept
> the OpenLDAP License to use it. Kristoffer, do you have thoughts on this?

Thanks for considering it, but please don't change your licensing just
because of JGit. It's unlikely we can use LMDB for a lot of technical
reasons.

>> Plus it
>> looks like lmdbjava still relies on local disk and isn't giving us a
>> way to patch in a virtual filesystem the way I need to at $DAY_JOB.
>
>
> LMDB's mdb_env_open requires a const char* path, so we can pass through any
> char array desired. But I think you'll find LMDB native can't map to a
> virtual file system implemented by JVM code (the LMDB caveats section has
> further local file system considerations).

Mostly at $DAY_JOB it's because we can't virtualize the filesystem
calls the C library is doing.

In git-core, I'm worried about the caveats related to locking. Git
tries to work nicely on NFS, and it seems LMDB wouldn't. Git also runs
fine on a read-only filesystem, and LMDB gets a little weird about
that. Finally, Git doesn't have nearly the risks LMDB has about a
crashed reader or writer locking out future operations until the locks
have been resolved. This is especially true with shared user
repositories, where another user might set up and own the semaphore.
