* [PATCH 0/6] receive-pack: quarantine pushed objects
@ 2016-09-30 19:35 Jeff King
From: Jeff King @ 2016-09-30 19:35 UTC
  To: git; +Cc: David Turner

I've mentioned before on the list that GitHub "quarantines" objects
while the pre-receive hook runs. Here are the patches to implement
that.

The basic problem is that as-is, index-pack admits pushed objects into
the main object database immediately, before the pre-receive hook runs.
It _has_ to, since the hook needs to be able to actually look at the
objects. However, this means that if the pre-receive hook rejects the
push, we still end up with the objects in the repository. We can't just
delete them as temporary files, because we don't know what other
processes might have started referencing them.

The solution here is to push into a "quarantine" directory that is
accessible only to pre-receive, check_connected(), etc., and to move
the objects into the main object database only after we've finished
those basic checks.
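
The flow described above can be sketched roughly as follows. This is a
minimal Python illustration of the idea, not the tmp-objdir code from
the patches: the function name, the flat directory layout, and the
`accept` callback (standing in for the pre-receive hook and
connectivity check) are all assumptions made for the example.

```python
import os
import shutil
import tempfile
from typing import Callable, Dict


def receive_with_quarantine(
    main_dir: str,
    incoming: Dict[str, bytes],
    accept: Callable[[str], bool],
) -> bool:
    """Hypothetical sketch: stage files in a quarantine directory and
    migrate them into main_dir only if accept() approves them."""
    # The quarantine lives under main_dir so that migration is a rename
    # on the same filesystem (an assumption of this sketch).
    quarantine = tempfile.mkdtemp(prefix="incoming-", dir=main_dir)
    try:
        # Incoming objects are written only to the quarantine.
        for name, data in incoming.items():
            with open(os.path.join(quarantine, name), "wb") as f:
                f.write(data)

        # The checks look at the quarantined files. On rejection nothing
        # has touched main_dir, so the files can simply be deleted.
        if not accept(quarantine):
            return False

        # Accepted: move each file into the main directory.
        for name in sorted(os.listdir(quarantine)):
            os.replace(
                os.path.join(quarantine, name),
                os.path.join(main_dir, name),
            )
        return True
    finally:
        # Remove the quarantine directory and anything left inside it.
        shutil.rmtree(quarantine, ignore_errors=True)
```

The point of the sketch is the ordering: nothing outside the quarantine
can see the files before `accept` returns, so a rejected push leaves
nothing behind for other processes to have referenced.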

One of the things we use it for at GitHub is object-size policy, which
we implement via a pre-receive hook (sort of; see below). This scheme
has been in use for about 2 years, though I did do a fair bit of
tweaking to make it ready for upstream (squashing bugfixes and merges
from upstream that came later, along with polishing a few rough edges I
saw while doing so). So I may have introduced new bugs. :)

The patches are:

  [1/6]: check_connected: accept an env argument
  [2/6]: sha1_file: always allow relative paths to alternates

    These two are preparatory.

  [3/6]: tmp-objdir: introduce API for temporary object directories
  [4/6]: receive-pack: quarantine objects until pre-receive accepts

    This is the interesting part.

  [5/6]: tmp-objdir: put quarantine information in the environment
  [6/6]: tmp-objdir: do not migrate files starting with '.'

    These are two changes that I ended up doing later to support another
    series. They're not strictly necessary here, but I think they're
    worth including now, as they change the visible behavior in minor
    ways. It seems like a good idea to start with what I think should be
    the final behavior.

    The other series is basically an optimization for the object-size
    policy. Without it, you are stuck walking the graph again in the
    pre-receive hook to find the new objects and check their sizes.

    But index-pack can do that for you very cheaply; it has the size of
    each object already. What it _doesn't_ do is produce nice error
    messages; it has no idea at what path the objects are found, and it
    doesn't know what kind of advice it should give the user.

    So what we can do is ask index-pack to make a note of any objects
    larger than N bytes, and write their sha1 and size into a file in
    the quarantine path. Then the pre-receive hook can look in that log
    and generate any nice message it wants. In the common case, the log
    is empty, and it does not have to do any work at all.

    These two patches set that up by letting index-pack and pre-receive
    know the quarantine path and use it to store arbitrary files that
    _don't_ get migrated to the main object database (i.e., the log file
    mentioned above).
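
    As a rough illustration of the hook side of that scheme, here is a
    Python sketch. The log file name (".large-objects"), its "<sha1>
    <size>" line format, and all function names are assumptions made up
    for the example; the message above does not specify them. How the
    hook learns the quarantine path is left out, so the path is passed
    in as a plain argument.

```python
import os
from typing import List, Tuple

# Hypothetical log name. It starts with '.', so under the rule from
# patch 6/6 it would not be migrated to the main object database.
LOG_NAME = ".large-objects"


def read_large_object_log(quarantine_path: str) -> List[Tuple[str, int]]:
    """Return (sha1, size) pairs from the log, or [] if there is none."""
    log = os.path.join(quarantine_path, LOG_NAME)
    if not os.path.exists(log):
        # Common case: nothing was flagged, so the hook has no work to do.
        return []
    entries = []
    with open(log) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Assumed format: "<sha1> <size-in-bytes>" per line.
            sha1, size = line.split()
            entries.append((sha1, int(size)))
    return entries


def rejection_messages(entries: List[Tuple[str, int]]) -> List[str]:
    """Build whatever user-facing advice the hook wants to print."""
    return [
        f"object {sha1} is too large ({size} bytes)"
        for sha1, size in entries
    ]


def migratable(names: List[str]) -> List[str]:
    """Files starting with '.' stay behind in the quarantine."""
    return [n for n in names if not n.startswith(".")]
```

    A hook built along these lines would reject the push whenever
    rejection_messages() returns anything, and would never need to walk
    the object graph itself.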

-Peff

Thread overview: 32+ messages
2016-09-30 19:35 [PATCH 0/6] receive-pack: quarantine pushed objects Jeff King
2016-09-30 19:35 ` [PATCH 1/6] check_connected: accept an env argument Jeff King
2016-09-30 19:36 ` [PATCH 2/6] sha1_file: always allow relative paths to alternates Jeff King
2016-10-02  9:07   ` René Scharfe
2016-10-02 13:03     ` Jeff King
2016-10-02 15:38     ` Jeff King
2016-10-02 16:59       ` Jeff King
2016-09-30 19:36 ` [PATCH 3/6] tmp-objdir: introduce API for temporary object directories Jeff King
2016-09-30 21:25   ` Junio C Hamano
2016-09-30 22:13     ` Jeff King
2016-09-30 21:32   ` David Turner
2016-09-30 22:44     ` Jeff King
2016-09-30 23:07       ` David Turner
2016-09-30 19:36 ` [PATCH 4/6] receive-pack: quarantine objects until pre-receive accepts Jeff King
2016-10-01  9:12   ` Jeff King
2017-04-08 14:53     ` Ævar Arnfjörð Bjarmason
2017-04-10 21:14       ` Jeff King
2016-09-30 19:36 ` [PATCH 5/6] tmp-objdir: put quarantine information in the environment Jeff King
2016-09-30 19:36 ` [PATCH 6/6] tmp-objdir: do not migrate files starting with '.' Jeff King
2016-10-02  9:20 ` [PATCH 0/6] receive-pack: quarantine pushed objects Christian Couder
2016-10-02 13:02   ` Jeff King
2016-10-03  6:45     ` Christian Couder
2016-10-03 20:48 ` [PATCH v2 0/5] " Jeff King
2016-10-03 20:49   ` [PATCH v2 1/5] check_connected: accept an env argument Jeff King
2016-10-05 19:01     ` Jakub Narębski
2016-10-05 19:06       ` Jeff King
2016-10-03 20:49   ` [PATCH v2 2/5] tmp-objdir: introduce API for temporary object directories Jeff King
2016-10-03 20:49   ` [PATCH v2 3/5] receive-pack: quarantine objects until pre-receive accepts Jeff King
2016-10-03 20:49   ` [PATCH v2 4/5] tmp-objdir: put quarantine information in the environment Jeff King
2016-10-03 20:49   ` [PATCH v2 5/5] tmp-objdir: do not migrate files starting with '.' Jeff King
2016-10-03 21:25   ` [PATCH v2 0/5] receive-pack: quarantine pushed objects Junio C Hamano
2016-10-03 21:28     ` Jeff King
