git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: Jonathan Tan <jonathantanmy@google.com>,
	Junio C Hamano <gitster@pobox.com>
Cc: Git mailing list <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH 00/13] WIP Partial clone part 1: object filtering
Date: Wed, 25 Oct 2017 11:39:00 -0400	[thread overview]
Message-ID: <4ac6ffd8-60bd-d2bc-6bc1-6c5bb1473416@jeffhostetler.com> (raw)
In-Reply-To: <CAGf8dg+cK3WpEqosgkbdcrDzrMXJxVYHiBZda6UM7k8Ggq=eBw@mail.gmail.com>



On 10/25/2017 2:46 AM, Jonathan Tan wrote:
> On Tue, Oct 24, 2017 at 10:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> OK, thanks for working well together.  So does this (1) build on
>> Jonathan's fsck-squelching series, or (2) ignores that and builds
>> filtering first, potentially leaving the codebase to a broken state
>> where it can create fsck-unclean repository until Jonathan's series
>> is rebased on top of this, or (3) something else?  [*1*]
> 
> Excluding the partialclone patch (patch 9), I think that the answer is
> (2), but I don't think that it leaves the codebase in a broken state.
> In particular, none of the code modifies the repo, so it can't create
> a fsck-unclean one.

My part 1 series starts with filtering, rev-list, and pack-objects.
So, yes, it add new features that no one will use yet.  But it is useful
by itself.  For example, you can use rev-list to ask for the set of
missing objects that you would need to do a checkout (assuming you had
commits and trees, but no blobs or no large blobs) *before* actually
starting the checkout.

I was then going to lay Jonathan's fsck/gc/dynamic object fetch
on top of that.  I started that here:
	https://github.com/jeffhostetler/git/pull/7

Patch 9 just adds the "extensions.partialclone*" fields and is prep
for my rev-list and his fsck changes.  I included it sooner rather than
later so I can test rev-list on a repo with hand-deleted blobs.
But yes, it can be omitted from this series and included with the fsck
changes.


> 
> Maybe one could say that this leaves the codebase with features that
> no one will ever use in the absence of partial clone, but I don't
> think that's true - rev-list with blob-size/sparse-specification
> filter seems independently useful, at least (for example, when
> desiring to operate on a repo in a sparse way without going through a
> workdir), and if we're planning to allow listing of objects, we
> probably should allow packing as well (especially since this doesn't
> add much code).
> 
> The above is relevant only if we can exclude the partialclone patch,
> but I think that we can and we should, as I wrote in my reply to Jeff
> Hostetler [1].
> 
> As for how this patch set (excluding the partialclone patch) interacts
> with my fsck series, they are relatively independent, as far as I can
> tell. I'll rebase my fsck, gc, and lazy object fetch patches (but not
> the fetch and clone parts, which we plan to instead adapt from Jeff
> Hostetler's patches, as far as I know) on top of these and resend
> those out once discussion on this has settled.

Yes, I want to get Jonathan's fsck/gc/lazy changes built into part 2.
They came over easily and are independent of how/why there are missing
objects.

For part 3, I'd like to take my version of clone, fetch, fetch-pack,
and upload-pack (that talks with the filters from part 1) and incorporate
Jonathan's promisor concepts.  That merge is a little messier, so I'd
like to parts 1 and 2 a chance to get some feedback first.


> [1] https://public-inbox.org/git/CAGf8dg+8AR=XfSV92ODAtKTNjBnD1+oVZp9rs4Y4Otz_eZyTfg@mail.gmail.com/
> 
>> I also saw a patch marked as "this is from Jonathan's earlier work",
>> taking the authorship (which to me implies that the changes were
>> extensive enough), so I am a bit at loss envisioning how this piece
>> fits in the bigger picture together with the other piece.
> 
> The patch you mentioned is the partialclone patch, which I think can
> be considered separately from the rest (as I said above).

A question of mailing-list etiquette: in patch 9, I took Jonathan's
ideas for adding the "extensions.partialclone" setting and extended it
with some helper functions.  His change was part of a larger change
with other code (fsck, IIRC) that I wasn't ready for.  What is the
preferred way to give credit for something like this?


Thanks
Jeff



  reply	other threads:[~2017-10-25 15:39 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-24 18:53 [PATCH 00/13] WIP Partial clone part 1: object filtering Jeff Hostetler
2017-10-24 18:53 ` [PATCH 01/13] dir: allow exclusions from blob in addition to file Jeff Hostetler
2017-10-25  4:05   ` Eric Sunshine
2017-10-25  6:43   ` Junio C Hamano
2017-10-25 14:54     ` Jeff Hostetler
2017-10-26  3:47       ` Junio C Hamano
2017-10-26 18:11         ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 02/13] list-objects-filter-map: extend oidmap to collect omitted objects Jeff Hostetler
2017-10-25  7:10   ` Junio C Hamano
2017-10-25 19:22     ` Jeff Hostetler
2017-10-26  4:12       ` Junio C Hamano
2017-10-24 18:53 ` [PATCH 03/13] list-objects: filter objects in traverse_commit_list Jeff Hostetler
2017-10-25  4:05   ` Jonathan Tan
2017-10-25 19:25     ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 04/13] list-objects-filter-blobs-none: add filter to omit all blobs Jeff Hostetler
2017-10-24 18:53 ` [PATCH 05/13] list-objects-filter-blobs-limit: add large blob filtering Jeff Hostetler
2017-10-24 18:53 ` [PATCH 06/13] list-objects-filter-sparse: add sparse filter Jeff Hostetler
2017-10-24 18:53 ` [PATCH 07/13] list-objects-filter-options: common argument parsing Jeff Hostetler
2017-10-25  4:14   ` Jonathan Tan
2017-10-25 19:28     ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 08/13] list-objects: add traverse_commit_list_filtered method Jeff Hostetler
2017-10-25  4:24   ` Jonathan Tan
2017-10-25 19:29     ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 09/13] extension.partialclone: introduce partial clone extension Jeff Hostetler
2017-10-24 18:53 ` [PATCH 10/13] rev-list: add list-objects filtering support Jeff Hostetler
2017-10-25  4:41   ` Jonathan Tan
2017-10-25 19:37     ` Jeff Hostetler
2017-10-24 18:53 ` [PATCH 11/13] t6112: rev-list object filtering test Jeff Hostetler
2017-10-24 18:53 ` [PATCH 12/13] pack-objects: add list-objects filtering Jeff Hostetler
2017-10-24 18:53 ` [PATCH 13/13] t5317: pack-objects object filtering test Jeff Hostetler
2017-10-25  4:57 ` [PATCH 00/13] WIP Partial clone part 1: object filtering Jonathan Tan
2017-10-25  5:00 ` Junio C Hamano
2017-10-25  6:46   ` Jonathan Tan
2017-10-25 15:39     ` Jeff Hostetler [this message]
2017-10-26  2:09       ` Junio C Hamano
2017-10-26  2:01     ` Junio C Hamano
2017-10-30 22:27     ` Jonathan Nieder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ac6ffd8-60bd-d2bc-6bc1-6c5bb1473416@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=jonathantanmy@google.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).