git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: Ben Peart <peartben@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, benpeart@microsoft.com,
	pclouds@gmail.com, christian.couder@gmail.com,
	git@jeffhostetler.com
Subject: Re: [PATCH v2 1/1] sha1_file: Add support for downloading blobs on demand
Date: Mon, 17 Jul 2017 11:06:02 -0700	[thread overview]
Message-ID: <20170717110602.6fac89ea@twelve2.svl.corp.google.com> (raw)
In-Reply-To: <20170714132651.170708-2-benpeart@microsoft.com>

About the difference between this patch and my patch set [1], besides
the fact that this patch does not spawn separate processes for each
missing object, which does seem like an improvement to me, this patch
(i) does not use a list of promised objects (but instead communicates
with the hook for each missing object), and (ii) provides backwards
compatibility with other Git code (that does not know about promised
objects) in a different way.

The costs and benefits of (i) are being discussed here [2]. As for (ii),
I still think that my approach is better - I have commented more about
this below.

Maybe the best approach is a combination of both our approaches.

[1] https://public-inbox.org/git/34efd9e9936fdab331655f5a33a098a72dc134f4.1499800530.git.jonathantanmy@google.com/

[2] https://public-inbox.org/git/20170713123951.5cab1adc@twelve2.svl.corp.google.com/

On Fri, 14 Jul 2017 09:26:51 -0400
Ben Peart <peartben@gmail.com> wrote:

> +------------------------
> +packet: git> command=get
> +packet: git> sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05
> +packet: git> 0000
> +------------------------

It would be useful to have this command support more than one SHA-1, so
that hooks that know how to batch can do so.

> +static int subprocess_map_initialized;
> +static struct hashmap subprocess_map;

The documentation of "tablesize" in "struct hashmap" states that it can
be used to check if the hashmap is initialized, so
subprocess_map_initialized is probably unnecessary.

>  static int check_and_freshen(const unsigned char *sha1, int freshen)
>  {
> -	return check_and_freshen_local(sha1, freshen) ||
> -	       check_and_freshen_nonlocal(sha1, freshen);
> +	int ret;
> +	int already_retried = 0;
> +
> +retry:
> +	ret = check_and_freshen_local(sha1, freshen) ||
> +		check_and_freshen_nonlocal(sha1, freshen);
> +	if (!ret && core_virtualize_objects && !already_retried) {
> +		already_retried = 1;
> +		if (!read_object_process(sha1))
> +			goto retry;
> +	}
> +
> +	return ret;
>  }

Is this change meant to ensure that Git code that operates on loose
objects directly (bypassing storage-agnostic functions such as
sha1_object_info_extended() and has_sha1_file()) still work? If yes,
this patch appears incomplete (for example, read_loose_object() needs to
be changed too), and this seems like a difficult task - in my patch set
[1], I ended up deciding to create a separate type of storage and
instead looked at the code that operates on *packed* objects directly
(because there were fewer such methods) to ensure that they would work
correctly in the presence of a separate type of storage.

[1] https://public-inbox.org/git/34efd9e9936fdab331655f5a33a098a72dc134f4.1499800530.git.jonathantanmy@google.com/

  parent reply	other threads:[~2017-07-17 18:06 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-14 13:26 [RFC/PATCH v2 0/1] Add support for downloading blobs on demand Ben Peart
2017-07-14 13:26 ` [PATCH v2 1/1] sha1_file: " Ben Peart
2017-07-14 15:18   ` Christian Couder
2017-07-17 18:06   ` Jonathan Tan [this message]
2017-07-17 20:09     ` Ben Peart
2017-07-17 23:24       ` Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170717110602.6fac89ea@twelve2.svl.corp.google.com \
    --to=jonathantanmy@google.com \
    --cc=benpeart@microsoft.com \
    --cc=christian.couder@gmail.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=peartben@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).