git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeff King <peff@peff.net>
Cc: Ian Harvey <iharvey@good.com>, git@vger.kernel.org
Subject: Re: [PATCH 4/4] archive: ignore blob objects when checking reachability
Date: Thu, 06 Jun 2013 09:57:32 +0200	[thread overview]
Message-ID: <51B040EC.6080707@alum.mit.edu> (raw)
In-Reply-To: <20130605224038.GD15607@sigill.intra.peff.net>

On 06/06/2013 12:40 AM, Jeff King wrote:
> We cannot create an archive from a blob object, so we would
> not expect anyone to provide one to us. And if they do, we
> will fail anyway just after the reachability check.  We can
> therefore optimize our reachability check to ignore blobs
> completely, and not even create a "struct blob" for them.
> 
> Depending on the repository size and the exact place we find
> the reachable object in the traversal, this can save 20-25%,
> a we can avoid many lookups in the object hash.
> 
> The downside of this is that a blob provided to a remote
> archive process will fail with "no such object" rather than
> "object is not a tree" (we could organize the code to retain
> the old message, but since we no longer know whether the
> blob is reachable or not, we would potentially be leaking
> information about the existence of unreachable objects).

Could we change the error message to "no such tree object" to be
non-committal about the reason for the failure?


For a moment I thought that one could get correct error messages while
retaining the speed gain in the usual case by doing a quick object
lookup, and then

    check type of object
    if object is missing:
        die(there is no such object)
    else if object is a blob:
        do reachability test including blobs
        if object is not reachable:
            die(there is no such object)
        else
            die(object is not a tree)
    else
        do reachability test excluding blobs
        etc

However, even this would leak information about the existence of
nonreachable objects to a client measuring time time for the response
because the death due to non-reachability would take longer than death
due to missing object.  So, if one would insist on correct error
messages and no information leakage, one could just skip the first
"object is missing" optimization (it should be pretty rare anyway!) like so:

    check type of object
    if object is missing or object is a blob:
        /* Force the same delay in either case: */
        do reachability test including blobs
        if object is missing or object is not reachable:
            die(there is no such object)
        else
            die(object is not a tree)
    else
        do reachability test excluding blobs
        etc

I'm not suggesting that the extra effort is worth it; I just wanted to
mention the possibility.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  reply	other threads:[~2013-06-06  7:57 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-10 21:18 [BUG] git archive broken in 1.7.8.1 Albert Astals Cid
2012-01-10 21:33 ` Carlos Martín Nieto
2012-01-10 22:05   ` Albert Astals Cid
2012-01-10 22:50     ` Carlos Martín Nieto
2012-01-10 23:21       ` Jeff King
2012-01-11 12:12         ` [PATCH] archive: re-allow HEAD:Documentation on a remote invocation Carlos Martín Nieto
2012-01-11 19:39           ` Jeff King
2012-01-11 19:42             ` [PATCH 1/2] get_sha1_with_context: report features used in resolution Jeff King
2012-01-12  2:36               ` Junio C Hamano
2012-01-12  2:51                 ` Jeff King
2012-01-11 19:42             ` [PATCH 2/2] archive: loosen restrictions on remote object lookup Jeff King
2013-05-29 12:05               ` Ian Harvey
2013-06-05 16:38                 ` Jeff King
2013-06-05 22:35                   ` [RFC/PATCH 0/4] real reachability checks for upload-archive Jeff King
2013-06-05 22:37                     ` [PATCH 1/4] clear parsed flag when we free tree buffers Jeff King
2013-06-06 17:55                       ` Junio C Hamano
2013-06-05 22:39                     ` [PATCH 2/4] upload-archive: restrict remote objects with reachability check Jeff King
2013-06-05 22:40                     ` [PATCH 3/4] list-objects: optimize "revs->blob_objects = 0" case Jeff King
2013-06-05 22:40                     ` [PATCH 4/4] archive: ignore blob objects when checking reachability Jeff King
2013-06-06  7:57                       ` Michael Haggerty [this message]
2013-06-07  0:50                       ` Eric Sunshine
2013-06-06 17:27                     ` [RFC/PATCH 0/4] real reachability checks for upload-archive Junio C Hamano
2012-01-12  2:46           ` [PATCH] archive: re-allow HEAD:Documentation on a remote invocation Junio C Hamano
2012-01-12  2:54             ` Jeff King
2012-01-12  2:59               ` Jeff King
2012-01-12  3:03               ` Junio C Hamano
2012-01-12  3:10                 ` Jeff King
2012-01-12  3:20                   ` Junio C Hamano
2012-01-10 23:01     ` [BUG] git archive broken in 1.7.8.1 Allan Wind
2012-01-11 12:51       ` Carlos Martín Nieto

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51B040EC.6080707@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=iharvey@good.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).