From: Jonathan Tan <jonathantanmy@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, markbt@efaref.net, git@jeffhostetler.com,
	kevin.david@microsoft.com
Subject: Re: Proposal for missing blob support in Git repos
Date: Mon, 1 May 2017 12:12:27 -0700
Message-ID: <193d1d84-2386-c4c8-81ef-0042f0d8bb02@google.com>
In-Reply-To: <xmqqinllgrfl.fsf@gitster.mtv.corp.google.com>

On 04/30/2017 08:57 PM, Junio C Hamano wrote:
> One thing I wonder is what the performance impact of a change like
> this would be on the codepath that wants to see if an object does _not_
> exist in the repository.  When creating a new object by hashing raw data,
> we see if an object with the same name already exists before writing
> the compressed loose object out (or comparing the payload to detect
> hash collision).  With a "missing blob" support, we'd essentially
> spawn an extra process every time we want to create a new blob
> locally, and most of the time that is done only to hear the external
> command say "no, we've never heard of such an object", with a
> possibly large latency.
>
> If we do not have to worry about that (or if it is no use to worry
> about it, because we cannot avoid it if we wanted to do the lazy
> loading of objects from elsewhere), then the patch presented here
> looked like a sensible first step towards the stated goal.
>
> Thanks.

Thanks for your comments. If you're referring to the codepath involving
write_sha1_file() (for example, builtin/hash-object -> index_fd or
builtin/unpack-objects), that path is unaffected: write_sha1_file()
invokes freshen_packed_object() and freshen_loose_object() directly to
check whether the object already exists, and so never invokes the new
mechanism in this patch.
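
To illustrate the distinction, here is a toy model in C. It is only a
sketch and not Git's actual code: every name in it (local_objects,
exists_locally(), ask_external_helper(), has_object(), write_object())
is hypothetical. exists_locally() stands in for the local-only checks
done via freshen_packed_object() and freshen_loose_object(), and
ask_external_helper() stands in for the new, potentially slow
missing-blob lookup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy object store: names of objects that are present locally. */
static const char *local_objects[] = { "aaaa", "bbbb" };

/* Counts how often the (simulated) external helper was consulted. */
static int remote_lookups;

/* Hypothetical local-only existence check; never leaves the repository. */
static bool exists_locally(const char *name)
{
	size_t n = sizeof(local_objects) / sizeof(local_objects[0]);

	for (size_t i = 0; i < n; i++)
		if (!strcmp(local_objects[i], name))
			return true;
	return false;
}

/*
 * Hypothetical stand-in for the missing-blob mechanism. In this toy
 * model it only records that it was called and reports "not found".
 */
static bool ask_external_helper(const char *name)
{
	(void)name;
	remote_lookups++;
	return false;
}

/*
 * Generic existence check: falls back to the external helper only
 * when the object is not found locally.
 */
static bool has_object(const char *name)
{
	return exists_locally(name) || ask_external_helper(name);
}

/*
 * Write path: consults only the local store, so creating a brand-new
 * object never triggers the external helper. Returns true if a new
 * object would have to be written (the write itself is omitted here).
 */
static bool write_object(const char *name)
{
	if (exists_locally(name))
		return false;	/* already present, nothing to write */
	return true;
}
```

In this model, write_object("cccc") leaves remote_lookups at 0, whereas
has_object("cccc") increments it, which is the kind of caller that would
still pay the lookup cost.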

Having said that, looking at other parts of the fetching mechanism,
there are a few calls to has_sha1_file() and similar functions that
might need to be checked. (We have already discussed one - the one in
rev-list when invoked to check connectivity.) I could take a look at
those, but was hoping for discussion on what I've sent so far (so that I
know that I'm on the right track, and because it somewhat works, albeit
slowly).

