git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Christian Couder <christian.couder@gmail.com>,
	Duy Nguyen <pclouds@gmail.com>, git <git@vger.kernel.org>
Subject: Re: [PATCH v2 4/4] bundle v3: the beginning
Date: Wed, 08 Jun 2016 11:05:20 -0700
Message-ID: <xmqq4m93mu8v.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <20160607202351.GA5726@sigill.intra.peff.net> (Jeff King's message of "Tue, 7 Jun 2016 16:23:51 -0400")

Jeff King <peff@peff.net> writes:

> This interface comes from my earlier patches, so I'll try to shed a
> little light on the decisions I made there.
>
> Because this "external odb" essentially acts as a git alternate, we
> would hit it only when we couldn't find an object through regular means.
> Git would then make the object available in the usual on-disk format
> (probably as a loose object).
>
> So in most processes, we would not need to consult the odb command at
> all. And when we do, the first thing would be to get its "have" list,
> which would at most run once per process.
>
> So the per-object cost is really calling "get", and my assumption there
> was that the cost of actually retrieving the object over the network
> would dwarf the fork/exec cost.

OK, presented that way, the design makes sense.  (I do not know
whether Christian's revised design and implementation do as well,
though, as I haven't seen them.)

As "check for non-existence" is important and costly, grabbing
"have" once is a good strategy, just like we open the .idx files of
available packfiles.
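
To illustrate the "grab it once" idea, here is a minimal Python sketch,
not anything from the patches.  The HaveCache class and the run_have
callable are hypothetical; run_have stands in for spawning
"<command> have" and returning its output lines, and the sketch assumes
the object name is the first whitespace-separated field on each line.

```python
from typing import Callable, Iterable, Optional, Set


class HaveCache:
    """Loads the external ODB's "have" list at most once per process."""

    def __init__(self, run_have: Callable[[], Iterable[str]]) -> None:
        # run_have is a hypothetical stand-in for running "<command> have".
        self._run_have = run_have
        self._names: Optional[Set[str]] = None
        self.invocations = 0  # how many times the helper was actually run

    def contains(self, sha1: str) -> bool:
        if self._names is None:
            # First lookup: pay the cost of the helper exactly once.
            self.invocations += 1
            self._names = {
                line.split()[0] for line in self._run_have() if line.strip()
            }
        # Later lookups, including misses, are answered from memory.
        return sha1 in self._names


def fake_have() -> Iterable[str]:
    # Canned output in place of a real helper process (assumed format).
    return ["e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 blob"]


cache = HaveCache(fake_have)
print(cache.contains("e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"))
print(cache.contains("0" * 40))
print(cache.invocations)
```

The second lookup is a miss, and it is answered without running the
helper again, which is the property that matters for the
"check for non-existence" case.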

>> >   - "<command> have": the command should output the sha1, size and
>> > type of all the objects the external ODB contains, one object per
>> > line.
>> 
>> Why do the clients need size and type at this point?  That is
>> more expensive to compute than just a bare list of object names.
>
> Yes, but it lets Git avoid doing a lot of "get" operations.

OK, so it is more like having richer information in pack-v4 index ;-)
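
As a rough sketch of why the extra columns help: if "have" reports
sha1, size and type per line, a caller can answer size and type
queries from that table alone and never issue a "get".  The exact
line format is not specified in this thread, so the
whitespace-separated "<sha1> <size> <type>" layout below is an
assumption, and parse_have and HaveEntry are hypothetical names.

```python
from typing import Dict, NamedTuple


class HaveEntry(NamedTuple):
    size: int
    type: str


def parse_have(output: str) -> Dict[str, HaveEntry]:
    """Parse assumed "<sha1> <size> <type>" lines into a lookup table."""
    index: Dict[str, HaveEntry] = {}
    for line in output.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        sha1, size, obj_type = line.split()
        index[sha1] = HaveEntry(int(size), obj_type)
    return index


# Canned "have" output; both names are real blob names (empty, "hello\n").
sample = (
    "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 blob\n"
    "ce013625030ba8dba906f756967f9e9ca394464a 6 blob\n"
)

index = parse_have(sample)
entry = index["ce013625030ba8dba906f756967f9e9ca394464a"]
print(entry.size, entry.type)  # answered without fetching the object
```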

>> >   - "<command> put <sha1> <size> <type>": the command should then read
>> > from stdin an object and store it in the external ODB.
>> 
>> Is ODB required to sanity check that <sha1> matches what the data
>> hashes down to?
>
> I think that would be up to the ODB, but it does seem like a good idea.
>
> Likewise, I'm not sure if "get" should be allowed to return contents
> that don't match the sha1.

Yes, this is what I was getting at.  It would be ideal to come up
with a way to do the large-blob offload without resorting to hacks
(like LFS and annex where "the same object contents will always
result in the same object name" is deliberately broken), and "object
name must match what the data hashes down to" is a basic requirement
for that.
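
For concreteness, the check could look like the Python sketch below.
It relies only on how Git names an object: the SHA-1 of the header
"<type> <size>" followed by a NUL byte and then the contents.  The
function names are hypothetical, and whether such a check belongs in
Git, in the helper, or in both is exactly the open question above.

```python
import hashlib


def git_object_name(obj_type: str, data: bytes) -> str:
    """Return the SHA-1 object name Git would give this type and content."""
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    return hashlib.sha1(header + data).hexdigest()


def verify(sha1: str, size: int, obj_type: str, data: bytes) -> bool:
    """Check data against the <sha1> <size> <type> it was advertised under."""
    return len(data) == size and git_object_name(obj_type, data) == sha1


name = git_object_name("blob", b"hello\n")
print(name)
print(verify(name, 6, "blob", b"hello\n"))   # contents match the name
print(verify(name, 6, "blob", b"HELLO\n"))   # same size, different contents
```

A "put" handler could run verify() on what it read from stdin before
storing it, and the caller of "get" could run it on what came back.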

Thanks.

