git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Karsten Blees <karsten.blees@gmail.com>
Cc: git discussion list <git@vger.kernel.org>,
	Jeff King <peff@peff.net>, Vicent Marti <tanoku@gmail.com>,
	Brad King <brad.king@kitware.com>,
	Johan Herland <johan@herland.net>
Subject: Re: [RFC/WIP] Pluggable reference backends
Date: Wed, 12 Mar 2014 12:43:35 +0100	[thread overview]
Message-ID: <53204867.4010809@alum.mit.edu> (raw)
In-Reply-To: <531EEBCC.10409@gmail.com>

Karsten,

Thanks for your feedback!

On 03/11/2014 11:56 AM, Karsten Blees wrote:
> Am 10.03.2014 12:00, schrieb Michael Haggerty:
>> 
>> Reference transactions ----------------------
> 
> Very cool ideas indeed.
> 
> However, I'm concerned a bit that transactions are conceptual
> overkill. How many concurrent updates do you expect in a repository?
> Wouldn't a single repo-wide lock suffice (and be _much_ simpler to
> implement with any backend, esp. file-based)?

I am mostly thinking about long-running processes, like "gc" and
"prune-refs", which need to be made race-free without blocking other
processes for the whole time they are running (whereas it might be quite
tolerable to have them fail or only complete part of their work in any
given invocation).  Also, I work at GitHub, where we have quite a few
repositories, some of which are quite active :-)

Remember that I'm not yet proposing anything like hard-core ACID
reference transactions.  I'm just clearing the way for various possible
changes in reference handling.  I listed the ideas only to whet people's
appetites and motivate the refactoring, which will take a while before
it bears any real fruit.

> The API you posted in [1] doesn't look very much like a transaction
> API either (rather like batch-updates). E.g. there's no rollback, the
> queue* methods cannot report failure, and there's no way to read a
> ref as part of the transaction. So I'm afraid that backends that
> support transactions out of the box (e.g. RDBMSs) will be hard to
> adapt to this.

Gmane is down at the moment but I assume you are referring to my patch
series and the ref_transaction implementation therein.

No explicit rollback is necessary at this stage, because the "commit"
function first locks all of the references that it wants to change
(first verifying that they have the expected values), and then modifies
them all.  By the time the references are locked, the whole transaction
is guaranteed to succeed [1].  If the locks can't all be acquired, then
any locks that were obtained are released.

If a caller wants to rollback a transaction, it only needs to free the
transaction instead of committing.  I should probably make that clearer
by renaming free_ref_transaction() to rollback_ref_transaction().  By
the time we start implementing other reference backends, that function
will of course have to do more.  For that matter, maybe
create_ref_transaction() should be renamed to begin_ref_transaction().
Now would be a good time for concrete bikeshedding suggestions about
function names or other details of the API :-)

Yes, the queue_*() methods should probably later make a preliminary
check of the reference's old value and return an error if the expected
value is already incorrect.  This would allow callers to fail fast if
the transaction is doomed to failure.  But that wasn't needed yet for
the one existing caller, which builds up a transaction and commits it
immediately, so I didn't implement it yet.  And the early checks would
add overhead for this caller, so maybe they should be optional anyway.
Maybe these functions should already be declared to return an error
status, but there should be an option passed to create_ref_transaction()
that selects whether fast checks should be performed or not for that
transaction.

Really, all that this first patch series does is put a different API
around the mechanism that was already there, in update_refs().  There
will be a lot more steps before we see anything approaching real
reference transactions.  But I think your (implied) suggestion, to make
the API more reminiscent of something like database transactions, is a
good one and I will work on it.

Cheers,
Michael

[1] "Guaranteed" here is of course relative.  The commit could still
fail due to the process being killed, disk errors, etc.  But it can't
fail due to lock contention with another git process.

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

      reply	other threads:[~2014-03-12 11:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-10 11:00 [RFC/WIP] Pluggable reference backends Michael Haggerty
2014-03-10 11:44 ` Johan Herland
2014-03-10 14:30 ` Shawn Pearce
2014-03-10 15:51   ` Max Horn
2014-03-10 15:52   ` Jeff King
2014-03-10 16:14     ` David Kastrup
2014-03-10 16:28       ` David Lang
2014-03-10 19:42       ` Jeff King
2014-03-10 19:56         ` David Kastrup
2014-03-10 17:46     ` Junio C Hamano
2014-03-10 17:56       ` Jeff King
2014-03-10 21:07     ` Michael Haggerty
2014-03-11  2:39       ` Shawn Pearce
2014-03-12 10:26         ` egit vs. git behaviour (was: [RFC/WIP] Pluggable reference backends) Andreas Krey
2014-03-12 16:48           ` Shawn Pearce
2014-03-11 10:56 ` [RFC/WIP] Pluggable reference backends Karsten Blees
2014-03-12 11:43   ` Michael Haggerty [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53204867.4010809@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=brad.king@kitware.com \
    --cc=git@vger.kernel.org \
    --cc=johan@herland.net \
    --cc=karsten.blees@gmail.com \
    --cc=peff@peff.net \
    --cc=tanoku@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).