git@vger.kernel.org mailing list mirror (one of many)
From: Howard Chu <hyc@symas.com>
To: spearce@spearce.org
Cc: David.Turner@twosigma.com, avarab@gmail.com,
	ben.alex@acegi.com.au, dborowitz@google.com, git@vger.kernel.org,
	gitster@pobox.com, mhagger@alum.mit.edu, peff@peff.net,
	sbeller@google.com, stoffe@gmail.com
Subject: Re: reftable [v5]: new ref storage format
Date: Wed, 9 Aug 2017 12:18:45 +0100
Message-ID: <ee8f70bd-6f9e-3fb6-67be-ba26b6d5bf16@symas.com>
In-Reply-To: <CAJo=hJsEaKH40WnhxqvkASpiXnV8ipc+b1zrZ9VEjqRjpJ17Qg@mail.gmail.com>

Shawn Pearce wrote:
> On Sun, Aug 6, 2017 at 4:37 PM, Ben Alex <ben.alex@acegi.com.au> wrote:
>> > Just on the LmdbJava specific pieces:
>> >
>> > On Mon, Aug 7, 2017 at 8:56 AM, Shawn Pearce <spearce@spearce.org> wrote:

> I don't know if we need a larger key size. $DAY_JOB limits ref names
> to ~200 bytes in a hook. I think GitHub does similar. But I'm worried
> about the general masses who might be using our software and expect
> ref names thus far to be as long as PATH_MAX on their system. Most
> systems run PATH_MAX around 1024.

The key size limit in LMDB can be safely raised to around 2KB or so without 
any issues. There's also work underway in LMDB 1.0 to raise the limit to 2GB, 
but in general it would be silly to use such large keys.
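
As an illustration of what such a limit means for ref names, here is a
minimal sketch in standalone C. It assumes the 511-byte default key limit of
an LMDB 0.9 build; ref_name_fits and ASSUMED_MAX_KEY are hypothetical names,
not part of LMDB or git. With a real environment the limit would be read
from mdb_env_get_maxkeysize() rather than hard-coded.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 511 bytes, the default key limit of an LMDB 0.9 build. */
#define ASSUMED_MAX_KEY 511

/* Hypothetical helper: true if the ref name could be stored as a key
 * of at most max_key bytes. An empty name is never a valid ref. */
bool ref_name_fits(const char *name, size_t max_key)
{
    size_t len = strlen(name);
    return len > 0 && len <= max_key;
}
```

A PATH_MAX-sized name of 1024 bytes fails this check at 511 but passes once
the limit is raised to around 2KB.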

> Mostly at $DAY_JOB it's because we can't virtualize the filesystem
> calls the C library is doing.
> 
> In git-core, I'm worried about the caveats related to locking. Git
> tries to work nicely on NFS,

That may be a problem in current LMDB 0.9, but needs further clarification.

> and it seems LMDB wouldn't. Git also runs
> fine on a read-only filesystem, and LMDB gets a little weird about
> that.

Not sure what you're talking about. LMDB works perfectly fine on read-only
filesystems; it just requires that the environment be opened in read-only mode.

> Finally, Git doesn't have nearly the risks LMDB has about a
> crashed reader or writer locking out future operations until the locks
> have been resolved. This is especially true with shared user
> repositories, where another user might set up and own the semaphore.

All locks disappear when the last process using the DB environment exits.
If only a single process is using the DB environment, there's no issue. If 
multiple processes are sharing the DB environment concurrently, the write lock 
cleans up automatically when the writer terminates; stale reader locks would 
require a call to mdb_reader_check() to clean them up.
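
The real call is mdb_reader_check(env, &dead), which reports the number of
stale slots it cleared. To show the idea without depending on LMDB, the
sketch below models a reader table in plain C and frees slots whose owning
process is gone. Everything in it is hypothetical (struct reader_slot,
register_reader, pid_is_alive, clear_stale_readers), and it uses kill(pid, 0)
as a stand-in liveness test; it does not claim to mirror how LMDB checks
liveness internally.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical reader-table entry; pid 0 marks a free slot. */
struct reader_slot {
    pid_t pid;
    unsigned long txnid;
};

/* Claims the first free slot for the calling process.
 * Returns the slot index, or -1 if the table is full. */
int register_reader(struct reader_slot *slots, size_t n, unsigned long txnid)
{
    for (size_t i = 0; i < n; i++) {
        if (slots[i].pid == 0) {
            slots[i].pid = getpid();
            slots[i].txnid = txnid;
            return (int)i;
        }
    }
    return -1;
}

/* Returns 1 unless the kernel reports that no such process exists.
 * A process we may not signal (EPERM) still counts as alive. */
int pid_is_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    return errno != ESRCH;
}

/* Frees every slot whose owner no longer exists.
 * Returns the number of slots freed. */
int clear_stale_readers(struct reader_slot *slots, size_t n)
{
    int dead = 0;
    for (size_t i = 0; i < n; i++) {
        if (slots[i].pid != 0 && !pid_is_alive(slots[i].pid)) {
            slots[i].pid = 0;
            slots[i].txnid = 0;
            dead++;
        }
    }
    return dead;
}
```

A long-running process sharing the environment could run such a check
periodically, or before a write, so a crashed reader does not pin old pages
indefinitely.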

The primary issue with using LMDB over NFS is performance. All reads are
performed through accesses of mapped memory, and in general, NFS implementations
don't cache mmap'd pages. I believe this is a consequence of the fact that 
they also can't guarantee cache coherence, so the only way for an NFS client 
to see a write from another NFS client is by always refetching pages whenever 
they're accessed.
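
To make "reads through mapped memory" concrete, here is a rough POSIX sketch
of the access pattern, not LMDB code. map_readonly is a hypothetical helper.
Once the mapping exists, reading a byte is a plain memory load, so whether
that load is served from a local page cache or refetched over the network is
entirely up to the filesystem.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps a whole file read-only and stores its size in *len.
 * Returns NULL on failure or if the file is empty. */
const unsigned char *map_readonly(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}
```

Nothing in this path writes to the file, which is also why a read-only
mapping of this kind works on a read-only filesystem.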

This is also why LMDB doesn't provide user-level VFS hooks - it's generally 
impractical to emulate mmap at the application level. You could always write a
FUSE driver if that's really what you need to do, but again, the performance 
of such a solution is pretty horrible.

LMDB's read lock management wouldn't perform well over NFS either; it too uses
an mmap'd file. On a local filesystem LMDB read locks are zero cost since they 
just atomically update a word in the mmap. Over NFS, each update to the mmap 
would also require an msync() to propagate the change back to the server. This 
would seriously limit the speed with which read transactions may be opened and 
closed. (Ordinarily opening and closing a read txn can be done with zero 
system calls.)
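
A minimal C11 sketch of that idea, under stated assumptions: an anonymous
shared mapping stands in for the mmap'd lock file, and all names
(reader_table_create, read_txn_begin, read_txn_end, read_txn_begin_synced,
oldest_reader, SLOT_FREE) are hypothetical rather than LMDB's. It shows that
claiming or releasing a reader slot is one atomic store, and what the extra
msync() per update would look like in the NFS case.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS on glibc */
#endif
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical sentinel: the slot holds no read transaction. */
#define SLOT_FREE UINT64_MAX

/* Creates a shared mapping of n reader slots, all marked free.
 * Returns NULL if the mapping cannot be created. */
_Atomic uint64_t *reader_table_create(size_t n)
{
    void *p = mmap(NULL, n * sizeof(_Atomic uint64_t),
                   PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    _Atomic uint64_t *slots = p;
    for (size_t i = 0; i < n; i++)
        atomic_init(&slots[i], SLOT_FREE);
    return slots;
}

/* Opening a read txn locally: one atomic store, no system call. */
void read_txn_begin(_Atomic uint64_t *slot, uint64_t snapshot_txnid)
{
    atomic_store_explicit(slot, snapshot_txnid, memory_order_release);
}

/* Closing a read txn: mark the slot free again. */
void read_txn_end(_Atomic uint64_t *slot)
{
    atomic_store_explicit(slot, SLOT_FREE, memory_order_release);
}

/* NFS-style variant: the same store, followed by an msync() of the
 * whole table so the update is pushed back to the backing store. */
int read_txn_begin_synced(_Atomic uint64_t *table, size_t n,
                          size_t idx, uint64_t snapshot_txnid)
{
    atomic_store_explicit(&table[idx], snapshot_txnid, memory_order_release);
    return msync(table, n * sizeof(_Atomic uint64_t), MS_SYNC);
}

/* Smallest snapshot id held by any reader, or SLOT_FREE if none. */
uint64_t oldest_reader(_Atomic uint64_t *slots, size_t n)
{
    uint64_t oldest = SLOT_FREE;
    for (size_t i = 0; i < n; i++) {
        uint64_t v = atomic_load_explicit(&slots[i], memory_order_acquire);
        if (v < oldest)
            oldest = v;
    }
    return oldest;
}
```

The only difference between the two begin functions is the msync() call,
which is a system call on every transaction start and, over NFS, a round
trip to the server as well.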

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
