git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: David Turner <dturner@twopensource.com>
To: Michael Haggerty <mhagger@alum.mit.edu>, git@vger.kernel.org
Subject: Re: [PATCH v4 20/21] refs: add LMDB refs storage backend
Date: Fri, 12 Feb 2016 20:23:43 -0500	[thread overview]
Message-ID: <1455326623.10888.3.camel@twopensource.com> (raw)
In-Reply-To: <56BE0FE1.5030407@alum.mit.edu>

On Fri, 2016-02-12 at 18:01 +0100, Michael Haggerty wrote:
> On 02/05/2016 08:44 PM, David Turner wrote:
> > Add a database backend for refs using LMDB.  This backend runs git
> > for-each-ref about 30% faster than the files backend with fully
> > -packed
> > refs on a repo with ~120k refs.  It's also about 4x faster than
> > using
> > fully-unpacked refs.  In addition, and perhaps more importantly, it
> > avoids case-conflict issues on OS X.
> > 
> > LMDB has a few features that make it suitable for usage in git:
> > 
> 
> 0. It is licensed under the OpenLDAP Public License, which is
> apparently
> GPL compatible [1].
> 
> [1] http://www.gnu.org/licenses/license-list.en.html#newOpenLDAP

Oh, yeah, I checked that before choosing it and then forgot all about
it. 

> > 1. It is relatively lightweight; it requires only one header file,
> > and
> > the library code takes under 64k at runtime.
> > 
> > 2. It is well-tested: it's been used in OpenLDAP for years.
> > 
> > 3. It's very fast.  LMDB's benchmarks show that it is among
> > the fastest key-value stores.
> > 
> > 4. It has a relatively simple concurrency story; readers don't
> > block writers and writers don't block readers.
> > 
> > Ronnie Sahlberg's original version of this patchset used tdb.  The
> > major disadvantage of tdb is that tdb is hard to build on OS X. 
> >  It's
> > also not in homebrew.  So lmdb seemed simpler.
> > 
> > To test this backend's correctness, I hacked test-lib.sh and
> > test-lib-functions.sh to run all tests under the refs backend.
> > Dozens
> > of tests use manual ref/reflog reading/writing, or create
> > submodules
> > without passing --ref-storage to git init.  If those tests are
> > changed to use the update-ref machinery or test-refs-lmdb-backend
> > (or,
> > in the case of packed-refs, corrupt refs, and dumb fetch tests, are
> > skipped), the only remaining failing tests are the git-new-workdir
> > tests and the gitweb tests.
> 
> Would it possible to build your "hack" into the test suite? For
> example,
> perhaps one could implement an option that requests tests to use LMDB
> wherever possible and skip the tests that are not LMDB-compatible.
> Over
> time, we could try to increase the fraction of tests that are
> LMDB-compatible by (for example) using `git update-ref` and `git
> rev-parse` rather than reading/writing reference files via the
> filesystem directly.
> 
> Otherwise, I'm worried that LMDB support will bitrot quickly because
> it
> will be too much of a nuisance to exercise the code.

I can add that.  It requires minor changes to a large number of tests,
but that's OK.

> > Signed-off-by: David Turner <dturner@twopensource.com>
> > ---
> >  .gitignore                                     |    1 +
> >  Documentation/config.txt                       |    7 +
> >  Documentation/git-clone.txt                    |    3 +-
> >  Documentation/git-init.txt                     |    2 +-
> >  Documentation/technical/refs-lmdb-backend.txt  |   52 +
> >  Documentation/technical/repository-version.txt |    5 +
> >  Makefile                                       |   12 +
> >  configure.ac                                   |   33 +
> >  contrib/workdir/git-new-workdir                |    3 +
> >  refs.c                                         |    3 +
> >  refs.h                                         |    2 +
> >  refs/lmdb-backend.c                            | 2052
> > ++++++++++++++++++++++++
> >  setup.c                                        |   11 +-
> >  test-refs-lmdb-backend.c                       |   64 +
> >  transport.c                                    |    7 +-
> >  15 files changed, 2250 insertions(+), 7 deletions(-)
> >  create mode 100644 Documentation/technical/refs-lmdb-backend.txt
> >  create mode 100644 refs/lmdb-backend.c
> >  create mode 100644 test-refs-lmdb-backend.c
> > 
> > diff --git a/.gitignore b/.gitignore
> > index 1c2f832..87d45a2 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -199,6 +199,7 @@
> >  /test-path-utils
> >  /test-prio-queue
> >  /test-read-cache
> > +/test-refs-lmdb-backend
> >  /test-regex
> >  /test-revision-walking
> >  /test-run-command
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index 06d3659..bc222c9 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -1153,6 +1153,13 @@ difftool.<tool>.cmd::
> >  difftool.prompt::
> >  	Prompt before each invocation of the diff tool.
> >  
> > +extensions.refsStorage::
> > +	Type of refs storage backend. Default is to use the
> > original
> > +	files based ref storage.  When set to "lmdb", refs are
> > stored in
> > +	the lmdb database backend.  This setting reflects the refs
> > +	backend that is currently in use; it is not possible to
> > change
> > +	the backend by changing this setting.
> > +
> 
> Let's give the users a pointer to how they *can* change this setting.
> Maybe
> 
> 	Type of refs storage backend. Default is to use the original
> 	files based ref storage.  When set to "lmdb", refs are stored
> in
> 	an LMDB database.  This setting reflects the refs storage
> 	backend that was chosen via the --ref-storage option when the
> 	repository was originally created. It is currently
> 	not possible to change the refs storage backend of an
> 	existing repository.

OK.

> >  fetch.recurseSubmodules::
> >  	This option can be either set to a boolean value or to 'on
> > -demand'.
> >  	Setting it to a boolean changes the behavior of fetch and
> > pull to
> > diff --git a/Documentation/git-clone.txt b/Documentation/git
> > -clone.txt
> > index 68f56a7..b9c52ce 100644
> > --- a/Documentation/git-clone.txt
> > +++ b/Documentation/git-clone.txt
> > @@ -227,7 +227,8 @@ objects from the source repository into a pack
> > in the cloned repository.
> >  
> >  --ref-storage=<name>::
> >  	Type of ref storage backend. Default is to use the
> > original files
> > -	based ref storage.
> > +	based ref storage. Set to "lmdb" to store refs in the lmdb
> > database
> > +	backend.
> 
> Should the second "lmdb" be upper-case? Maybe also in other places
> where
> the LMDB database is being referred to?

OK, I think I got most of them.

> >  <repository>::
> >  	The (possibly remote) repository to clone from.  See the
> > [...]
> 
> I still haven't reviewed the actual lmdb-backend code. I guess I'm
> going
> to have to make myself an LMDB expert one of these days...

I highly recommend it!  The API is pretty reasonable, and it does seem
to be fast.  The error messages could use some work, but that's my only
complaint.

  reply	other threads:[~2016-02-13  1:24 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-05 19:44 [PATCH v4 00/20] refs backend David Turner
2016-02-05 19:44 ` [PATCH v4 01/21] refs: add a backend method structure with transaction functions David Turner
2016-02-05 19:44 ` [PATCH v4 02/21] refs: add methods for misc ref operations David Turner
2016-02-11  7:45   ` Michael Haggerty
2016-02-12  1:09     ` David Turner
2016-02-05 19:44 ` [PATCH v4 03/21] refs: add methods for the ref iterators David Turner
2016-02-11  8:42   ` Michael Haggerty
2016-02-12  1:08     ` David Turner
2016-02-05 19:44 ` [PATCH v4 04/21] refs: add do_for_each_per_worktree_ref David Turner
2016-02-05 19:44 ` [PATCH v4 05/21] refs: add methods for reflog David Turner
2016-02-05 19:44 ` [PATCH v4 06/21] refs: add method for initial ref transaction commit David Turner
2016-02-05 19:44 ` [PATCH v4 07/21] refs: add method for delete_refs David Turner
2016-02-05 19:44 ` [PATCH v4 08/21] refs: add methods to init refs db David Turner
2016-02-11  8:54   ` Michael Haggerty
2016-02-11 21:15     ` David Turner
2016-02-05 19:44 ` [PATCH v4 09/21] refs: add method to rename refs David Turner
2016-02-11  9:00   ` Michael Haggerty
2016-02-11 21:12     ` David Turner
2016-02-05 19:44 ` [PATCH v4 10/21] refs: make lock generic David Turner
2016-02-05 19:44 ` [PATCH v4 11/21] refs: move duplicate check to common code David Turner
2016-02-05 19:44 ` [PATCH v4 12/21] refs: allow log-only updates David Turner
2016-02-11 10:03   ` Michael Haggerty
2016-02-11 21:23     ` David Turner
2016-02-05 19:44 ` [PATCH v4 13/21] refs: resolve symbolic refs first David Turner
2016-02-12 14:09   ` Michael Haggerty
2016-02-18  0:29     ` David Turner
2016-02-18 11:59       ` Michael Haggerty
2016-02-05 19:44 ` [PATCH v4 14/21] refs: always handle non-normal refs in files backend David Turner
2016-02-12 15:07   ` Michael Haggerty
2016-02-18  2:44     ` David Turner
2016-02-18 12:07       ` Michael Haggerty
2016-02-18 18:32         ` David Turner
2016-02-05 19:44 ` [PATCH v4 15/21] init: allow alternate ref strorage to be set for new repos David Turner
2016-02-12 15:26   ` Michael Haggerty
2016-02-17 20:47     ` David Turner
2016-02-18 14:12       ` Michael Haggerty
2016-02-05 19:44 ` [PATCH v4 16/21] refs: check submodules ref storage config David Turner
2016-02-05 19:44 ` [PATCH v4 17/21] clone: allow ref storage backend to be set for clone David Turner
2016-02-05 19:44 ` [PATCH v4 18/21] svn: learn ref-storage argument David Turner
2016-02-05 19:44 ` [PATCH v4 19/21] refs: add register_ref_storage_backends() David Turner
2016-02-12 15:42   ` Michael Haggerty
2016-02-17 20:32     ` David Turner
2016-02-05 19:44 ` [PATCH v4 20/21] refs: add LMDB refs storage backend David Turner
2016-02-11  8:48   ` Michael Haggerty
2016-02-11 21:21     ` David Turner
2016-02-12 17:01   ` Michael Haggerty
2016-02-13  1:23     ` David Turner [this message]
2016-02-14 12:04   ` Duy Nguyen
2016-02-15  9:57     ` Duy Nguyen
2016-02-16 22:01       ` David Turner
2016-02-17 20:32     ` David Turner
2016-02-05 19:44 ` [PATCH v4 21/21] refs: tests for lmdb backend David Turner
2016-02-08 23:37 ` [PATCH v4 00/20] refs backend Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1455326623.10888.3.camel@twopensource.com \
    --to=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).