git@vger.kernel.org mailing list mirror (one of many)
From: Jonathan Tan <jonathantanmy@google.com>
To: gitster@pobox.com
Cc: jonathantanmy@google.com, git@vger.kernel.org
Subject: Re: [PATCH] fetch-pack: approximate no_dependents with filter
Date: Thu, 27 Sep 2018 11:37:18 -0700
Message-ID: <20180927183718.89804-1-jonathantanmy@google.com>
In-Reply-To: <xmqqh8idns9i.fsf@gitster-ct.c.googlers.com>

> It is very clear how you are churning the code, but it is utterly
> unclear from the description what you perceived as a problem and why
> this change is a good (if not the best) solution for that problem,
> at least to me.

Firstly, thanks for your comments and questions - it's sometimes hard
for me to think of the questions someone else would ask when reading one
of my patches. I have tried to rewrite the commit message (you can see
it at the end of this e-mail) following your questions.

The new paragraph 1 addresses what I perceive as the problem, and the new
paragraph 2 addresses the ideal solution and the partial one in this patch.

> After reading the above description, I cannot shake the feeling that
> this is tied too strongly to the tree:0 use case?  Does it help
> other use cases (e.g. would it be useful or harmful if a lazy clone
> was done to exclude blobs that are larger than certain threshold, or
> objects of all types that are not referenced by commits younger than
> certain threshold)?

Yes, it is solely for the tree:0 use case. But it doesn't hurt other use
cases, as I have explained in new paragraph 3.

I have retained old paragraph 3 as new paragraph 4, and removed old
paragraph 2 as it mostly duplicates the comments in the code. New commit
message follows:

[start commit message]

fetch-pack: exclude blobs when lazy-fetching trees

A partial clone with missing trees can be obtained using "git clone
--filter=tree:0 <repo>". In such a repository, when a tree needs to
be lazily fetched, any tree or blob it directly or indirectly references
is fetched as well, regardless of whether the original command required
those objects or whether the local repository already had some of them.

This is because the fetch protocol, which the lazy fetch uses, does not
allow clients to request that only the wanted objects be sent, which
would be the ideal solution. This patch implements a partial solution:
specify the "blob:none" filter, somewhat reducing the fetch payload.

This change has no effect when lazily fetching blobs (due to how filters
work). And if lazily fetching a commit (such repositories are difficult
to construct, and this is not a use case we support very well, but it is
possible), referenced commits and trees are still fetched - only the
blobs are not fetched.

The necessary code change is done in fetch_pack() instead of somewhere
closer to where the "filter" instruction is written to the wire so that
only one part of the code needs to be changed in order for users of all
protocol versions to benefit from this optimization.

[end commit message]
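
As a rough illustration of the partial solution described in the commit
message above (a sketch only, not the patch itself): the struct and
function names below are hypothetical stand-ins rather than the real
fetch-pack types, and the check that leaves an already-set filter alone
is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the arguments of a fetch. */
struct lazy_fetch_args {
	bool no_dependents;  /* true for a lazy fetch of specific objects */
	const char *filter;  /* filter spec to send, or NULL if none is set */
};

/*
 * The protocol cannot say "send only the wanted objects", so approximate
 * it: on a lazy fetch with no filter set, ask for "blob:none". Objects
 * named explicitly as wants are still sent, which is why a lazy fetch of
 * a blob is unaffected.
 */
static void approximate_no_dependents(struct lazy_fetch_args *args)
{
	/* Assumption for this sketch: never override a caller's filter. */
	if (args->no_dependents && !args->filter)
		args->filter = "blob:none";
}
```

Doing this once, before any request is written to the wire, is what lets
a single code path cover every protocol version.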

Thread overview: 7+ messages
2018-09-24 15:45 [PATCH] fetch-pack: approximate no_dependents with filter Jonathan Tan
2018-09-25 22:09 ` Junio C Hamano
2018-09-27 18:37   ` Jonathan Tan [this message]
2018-09-29 20:26 ` Junio C Hamano
2018-10-03 23:04 ` [PATCH v2 0/2] Lazy fetch bug fix (and feature that reveals it) Jonathan Tan
2018-10-03 23:04   ` [PATCH v2 1/2] fetch-pack: avoid object flags if no_dependents Jonathan Tan
2018-10-03 23:04   ` [PATCH v2 2/2] fetch-pack: exclude blobs when lazy-fetching trees Jonathan Tan
