git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Jeff Hostetler <git@jeffhostetler.com>,
	git@vger.kernel.org, jeffhost@microsoft.com, peff@peff.net,
	markbt@efaref.net, benpeart@microsoft.com,
	jonathantanmy@google.com
Subject: Re: [PATCH 00/10] RFC Partial Clone and Fetch
Date: Mon, 08 May 2017 09:15:33 +0900	[thread overview]
Message-ID: <xmqqwp9scifu.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20170503182725.GC28740@aiede.svl.corp.google.com> (Jonathan Nieder's message of "Wed, 3 May 2017 11:27:26 -0700")

Jonathan Nieder <jrnieder@gmail.com> writes:

> - there shouldn't be any need for the blobs to even be mentioned in
>   the .pack stored locally.  The .idx file maps from sha1 to offset
>   within the packfile --- a special offset could mean "this is a
>   missing blob".

Clever.

> - However, the list of missing blobs can be inferred from the existing
>   pack format, without a change to the pack format used in git
>   protocol.  As part of constructing the idx, "git index-pack"
>   inflates every object in the pack file sent by the server.  This
>   means we know what blobs they reference, so we can easily produce a
>   list for the idx file without changing the pack file format.

A minor wrinkle to keep in mind if you were to go this route is that
you'd need a way to tell the reason why a blob that is referenced by
a tree in the pack stream is not in the same pack stream.  

If the resulting repository on the receiving side has that blob
after the transfer, it is likely that the reason why the blob does
not appear in the pack is because the want/have/ack exchange told
the sending side that the receiving side has a commit that contains
the blob.  But when the blob does not exist in the receiving side
after the transfer, we cannot tell between two possible cases.  The
server may have actively wanted to omit it (i.e. the case we are
interested in in this discussion thread).  Or the receiving end said
that it has a commit that contains the blob, but due to preexisting
corruption, the receiving repository was missing the blob in
reality.

  parent reply	other threads:[~2017-05-08  0:15 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-08 18:50 [PATCH 00/10] RFC Partial Clone and Fetch git
2017-03-08 18:50 ` [PATCH 01/10] pack-objects: eat CR in addition to LF after fgets git
2017-03-08 18:50 ` [PATCH 02/10] pack-objects: add --partial-by-size=n --partial-special git
2017-03-08 18:50 ` [PATCH 03/10] pack-objects: test for --partial-by-size --partial-special git
2017-03-08 18:50 ` [PATCH 04/10] upload-pack: add partial (sparse) fetch git
2017-03-08 18:50 ` [PATCH 05/10] fetch-pack: add partial-by-size and partial-special git
2017-03-08 18:50 ` [PATCH 06/10] rev-list: add --allow-partial option to relax connectivity checks git
2017-03-08 18:50 ` [PATCH 07/10] index-pack: add --allow-partial option to relax blob existence checks git
2017-03-08 18:50 ` [PATCH 08/10] fetch: add partial-by-size and partial-special arguments git
2017-03-08 18:50 ` [PATCH 09/10] clone: " git
2017-03-08 18:50 ` [PATCH 10/10] ls-partial: created command to list missing blobs git
2017-03-09 20:18 ` [PATCH 00/10] RFC Partial Clone and Fetch Jonathan Tan
2017-03-16 21:43   ` Jeff Hostetler
2017-03-17 14:13     ` Jeff Hostetler
2017-03-22 15:16 ` ankostis
2017-03-22 16:21   ` Johannes Schindelin
2017-03-22 17:51     ` Jeff Hostetler
2017-05-03 16:38 ` Jeff Hostetler
2017-05-03 18:27   ` Jonathan Nieder
2017-05-04 16:51     ` Jeff Hostetler
2017-05-04 18:41       ` Jonathan Nieder
2017-05-08  0:15     ` Junio C Hamano [this message]
2017-05-03 20:40   ` Jonathan Tan
2017-05-03 21:08     ` Jonathan Nieder
  -- strict thread matches above, loose matches on Subject: below --
2017-03-08 17:37 Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqwp9scifu.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=benpeart@microsoft.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=jeffhost@microsoft.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=markbt@efaref.net \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).