git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: git <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>,
	Jeff King <peff@peff.net>, Ben Peart <Ben.Peart@microsoft.com>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
	Mike Hommey <mh@glandium.org>,
	Lars Schneider <larsxschneider@gmail.com>,
	Eric Wong <e@80x24.org>,
	Christian Couder <chriscool@tuxfamily.org>
Subject: Re: [PATCH v6 09/40] Add initial external odb support
Date: Fri, 29 Sep 2017 13:36:14 -0700	[thread overview]
Message-ID: <20170929133614.1c0cd68ad80139fafdf68b86@google.com> (raw)
In-Reply-To: <CAP8UFD2D2NkKKy9oT-1Mwe0Gq6=UG+15t9GENKKbV-PbRf3Kkw@mail.gmail.com>

On Wed, 27 Sep 2017 18:46:30 +0200
Christian Couder <christian.couder@gmail.com> wrote:

> I am ok to split the patch series, but I am not sure that 01/40 to
> 09/40 is the right range for the first patch series.
> I would say that 01/40 to 07/40 is better as it can be seen as a
> separate refactoring.

I mentioned 09/40 because this is (as far as I can tell) the first one
that introduces a new design.

> I don't think single-shot processes would be a huge burden, because
> the code is simpler, and because for example for filters we already
> have single shot and long-running processes and no one complains about
> that. It's code that is useful as it makes it much easier for people
> to do some things (see the clone bundle example).
> 
> In fact in Git development we usually start to by first implementing
> simpler single-shot solutions, before thinking, when the need arise,
> to make it faster. So a perhaps an equally valid opinion could be to
> first only submit the patches for the single-shot protocol and later
> submit the rest of the series when we start getting feedback about how
> external odbs are used.

My concern is that, as far as I understand about the Microsoft use case,
we already know that we need the faster solution, so the need has
already arisen.

> And yeah I could change the order of the patch series to implement the
> long-running processes first and the single-shot process last, so that
> it could be possible to first get feedback about the long-running
> processes, before we decide to merge or not the single-shot stuff, but
> I don't think it would look like the most logical order.

My thinking was that we would just implement the long-running process
and not implement the single-shot process at all (besides maybe a script
in contrib/). If we are going to do both anyway, I agree that we should
do the single-shot process first.

> > And another possible issue is that we design ourselves into a corner.
> > Thinking about the use cases that I know about (the Android use case and
> > the Microsoft GVFS use case), I don't think we are doing that - for
> > Android, this means that large blob metadata needs to be part of the
> > design (and this patch series does provide for that), and for Microsoft
> > GVFS, "get" is relatively cheap, so a configuration option to not invoke
> > "have" first when loading a missing object might be sufficient.
> 
> If the helper does not advertise the "have" capability, the "have"
> instruction will not be sent to the helper, so the current design is
> already working for that case.

Ah, that's good to know.

> > And I think that my design can be extended to support a use case in
> > which, for example, blobs corresponding to a certain type of filename
> > (defined by a glob like in gitattributes) can be excluded during
> > fetch/clone, much like --blob-max-bytes, and they can be fetched either
> > through the built-in mechanism or through a custom hook.
> 
> Sure, we could probably rebuild something equivalent to what I did on
> top of your design.
> My opinion though is that if we want to eventually get to the same
> goal, it is better to first merge something that get us very close to
> the end goal and then add some improvements on top of it.

I agree - I mentioned that because I personally prefer to review smaller
patch sets at a time, and my patch set already includes a lot of the
same infrastructure needed by yours - for example, the places in the
code to dynamically fetch objects, exclusion of objects when fetching or
cloning, configuring the cloned repo when cloning, fsck, and gc.

> >  - I get compile errors when I "git am" these onto master. I think
> >    '#include "config.h"' is needed in some places.
> 
> It's strange because I get no compile errors even after a "make clean"
> from my branch.
> Could you show the actual errors?

I don't have the error messages with me now, but it was something about
a function being implicitly declared. You will probably get these errors
if you sync past commit e67a57f ("config: create config.h", 2017-06-15).

> > Any reason why you prefer to update the loose object functions than to
> > update the generic one (sha1_object_info_extended)? My concern with just
> > updating the loose object functions was that a caller might have
> > obtained the path by iterating through the loose object dirs, and in
> > that case we shouldn't query the external ODB for anything.
> 
> You are thinking about fsck or gc?
> Otherwise I don't think it would be clean to iterate through loose object dirs.

Yes, fsck and gc (well, prune, I think) do that. I agree that Git
typically doesn't do that (except for exceptional cases like fsck and
gc), but I was thinking about supporting existing code that does that
iteration, not introducing new code that does that.

  reply	other threads:[~2017-09-29 20:36 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-16  8:06 [PATCH v6 00/40] Add initial experimental external ODB support Christian Couder
2017-09-16  8:06 ` [PATCH v6 01/40] builtin/clone: get rid of 'value' strbuf Christian Couder
2017-09-16  8:06 ` [PATCH v6 02/40] t0021/rot13-filter: refactor packet reading functions Christian Couder
2017-09-16  8:06 ` [PATCH v6 03/40] t0021/rot13-filter: improve 'if .. elsif .. else' style Christian Couder
2017-09-16  8:06 ` [PATCH v6 04/40] t0021/rot13-filter: improve error message Christian Couder
2017-09-16  8:06 ` [PATCH v6 05/40] t0021/rot13-filter: add packet_initialize() Christian Couder
2017-09-16  8:06 ` [PATCH v6 06/40] t0021/rot13-filter: add capability functions Christian Couder
2017-09-16  8:06 ` [PATCH v6 07/40] Add Git/Packet.pm from parts of t0021/rot13-filter.pl Christian Couder
2017-09-16  8:06 ` [PATCH v6 08/40] sha1_file: prepare for external odbs Christian Couder
2017-09-16  8:07 ` [PATCH v6 09/40] Add initial external odb support Christian Couder
2017-09-19 17:45   ` Jonathan Tan
2017-09-27 16:46     ` Christian Couder
2017-09-29 20:36       ` Jonathan Tan [this message]
2017-10-02 14:34         ` Ben Peart
2017-10-03  9:45         ` Christian Couder
2017-10-04  0:15           ` Jonathan Tan
2017-09-16  8:07 ` [PATCH v6 10/40] odb-helper: add odb_helper_init() to send 'init' instruction Christian Couder
2017-09-16  8:07 ` [PATCH v6 11/40] t0400: add 'put_raw_obj' instruction to odb-helper script Christian Couder
2017-09-16  8:07 ` [PATCH v6 12/40] external odb: add 'put_raw_obj' support Christian Couder
2017-09-16  8:07 ` [PATCH v6 13/40] external-odb: accept only blobs for now Christian Couder
2017-09-16  8:07 ` [PATCH v6 14/40] t0400: add test for external odb write support Christian Couder
2017-09-16  8:07 ` [PATCH v6 15/40] Add GIT_NO_EXTERNAL_ODB env variable Christian Couder
2017-09-16  8:07 ` [PATCH v6 16/40] Add t0410 to test external ODB transfer Christian Couder
2017-09-16  8:07 ` [PATCH v6 17/40] lib-httpd: pass config file to start_httpd() Christian Couder
2017-09-16  8:07 ` [PATCH v6 18/40] lib-httpd: add upload.sh Christian Couder
2017-09-16  8:07 ` [PATCH v6 19/40] lib-httpd: add list.sh Christian Couder
2017-09-16  8:07 ` [PATCH v6 20/40] lib-httpd: add apache-e-odb.conf Christian Couder
2017-09-16  8:07 ` [PATCH v6 21/40] odb-helper: add odb_helper_get_raw_object() Christian Couder
2017-09-16  8:07 ` [PATCH v6 22/40] pack-objects: don't pack objects in external odbs Christian Couder
2017-09-16  8:07 ` [PATCH v6 23/40] Add t0420 to test transfer to HTTP external odb Christian Couder
2017-09-16  8:07 ` [PATCH v6 24/40] external-odb: add 'get_direct' support Christian Couder
2017-09-16  8:07 ` [PATCH v6 25/40] odb-helper: add 'script_mode' to 'struct odb_helper' Christian Couder
2017-09-16  8:07 ` [PATCH v6 26/40] odb-helper: add init_object_process() Christian Couder
2017-09-16  8:07 ` [PATCH v6 27/40] Add t0450 to test 'get_direct' mechanism Christian Couder
2017-09-16  8:07 ` [PATCH v6 28/40] Add t0460 to test passing git objects Christian Couder
2017-09-16  8:07 ` [PATCH v6 29/40] odb-helper: add put_object_process() Christian Couder
2017-09-16  8:07 ` [PATCH v6 30/40] Add t0470 to test passing raw objects Christian Couder
2017-09-16  8:07 ` [PATCH v6 31/40] odb-helper: add have_object_process() Christian Couder
2017-09-16  8:07 ` [PATCH v6 32/40] Add t0480 to test "have" capability and raw objects Christian Couder
2017-09-16  8:07 ` [PATCH v6 33/40] external-odb: use 'odb=magic' attribute to mark odb blobs Christian Couder
2017-09-16  8:07 ` [PATCH v6 34/40] Add Documentation/technical/external-odb.txt Christian Couder
2017-09-16  8:07 ` [PATCH v6 35/40] clone: add 'initial' param to write_remote_refs() Christian Couder
2017-09-16  8:07 ` [PATCH v6 36/40] clone: add --initial-refspec option Christian Couder
2017-09-16  8:07 ` [PATCH v6 37/40] clone: disable external odb before initial clone Christian Couder
2017-09-16  8:07 ` [PATCH v6 38/40] Add tests for 'clone --initial-refspec' Christian Couder
2017-09-16  8:07 ` [PATCH v6 39/40] Add t0430 to test cloning using bundles Christian Couder
2017-09-16  8:07 ` [PATCH v6 40/40] Doc/external-odb: explain transfering objects and metadata Christian Couder
2017-10-02 14:18 ` [PATCH v6 00/40] Add initial experimental external ODB support Ben Peart
2017-10-03  6:32   ` Christian Couder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170929133614.1c0cd68ad80139fafdf68b86@google.com \
    --to=jonathantanmy@google.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=chriscool@tuxfamily.org \
    --cc=christian.couder@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=larsxschneider@gmail.com \
    --cc=mh@glandium.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).