From: Matthew DeVore <matvore@comcast.net>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: matvore@google.com, git@vger.kernel.org, jrn@google.com
Subject: Re: Proposal: object negotiation for partial clones
Date: Mon, 13 May 2019 17:09:00 -0700
Message-ID: <4A3BD0B7-894F-4FB0-A3BA-15675F60046C@comcast.net>
In-Reply-To: <20190509180022.91700-1-jonathantanmy@google.com>
> On 2019/05/09, at 11:00, Jonathan Tan <jonathantanmy@google.com> wrote:
>
> Thanks for the numbers. Let me think about it some more, but I'm still
> reluctant to introduce multiple filter support in the protocol and the
> implementation for the following reasons:
Correction to the original command: I was tweaking it while it was running and introduced an error that I didn’t notice. Here is a version that works for an entire repo:
$ git rev-list --objects --filter=blob:none HEAD: | awk '{print $1}' | xargs -n 1 git cat-file -s | awk '{ total += $1; print total }'
When run to completion, Chromium totaled 17 301 144 bytes.
>
> - For large projects like Linux and Chromium, it may be reasonable to
> expect that an infrequent checkout would result in a few-megabyte
> download.
Anyone developing on Chromium would definitely consider a 17 MB initial clone an improvement over the status quo, but it is still not ideal.
And the 17 MB initial download is incurred only once, *assuming* the next idea is implemented:
> - (After some in-office discussion) It may be possible to mitigate much
> of that by sending root trees that we have as "have" (e.g. by
> consulting the reflog), and that wouldn't need any protocol change.
This would complicate the code - not in Git itself, but in my FUSE-related logic. We would have to walk the reflog and find the commits closest in history to the target commit being checked out. This sounds a bit hacky and roundabout, and it assumes that the FUSE layer can detect a checkout cleanly and early enough (rather than only when one of the sub-sub-trees is accessed).
> - Supporting any combination of filter means that we have more to
> implement and test, especially if we want to support more filters in
> the future. In particular, the different filters (e.g. blob, tree)
> have different code paths now in Git. One way to solve it would be to
> combine everything into one monolith, but I would like to avoid it if
> possible (after having to deal with revision walking a few times...)
I don’t believe there is any need to introduce monolithic code. The bulk of the filter implementation is in list-objects-filter.c, and I don’t think the file will get much longer with an additional filter that “combines” the existing filters. The new filter is likely simpler than the sparse filter. Once I add the new filter and send out the initial patch set, we can discuss splitting up the file if that appears to be necessary.
My idea - if it is not clear already - is to add another OO-like interface to list-objects-filter.c which parallels the 5 that are already there.
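To make the idea concrete, here is a minimal sketch of what a combining filter could look like. It is not taken from list-objects-filter.c: every name (`sketch_object`, `sketch_filter`, `keep_combined`, and so on) is hypothetical, the callback is far simpler than anything Git would need, and it assumes that "combining" means an object is kept only when every sub-filter keeps it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified object record; not Git's struct object. */
enum sketch_type { SKETCH_BLOB, SKETCH_TREE };

struct sketch_object {
	enum sketch_type type;
	unsigned long size;	/* object size in bytes */
	unsigned depth;		/* depth below the root tree; the root is 0 */
};

/* OO-like interface: one callback plus private data per filter. */
struct sketch_filter {
	bool (*keep)(const struct sketch_object *obj, const void *data);
	const void *data;
};

/* Hypothetical filter: keep everything except blobs. */
static bool keep_no_blobs(const struct sketch_object *obj, const void *data)
{
	(void)data;	/* this filter has no private state */
	return obj->type != SKETCH_BLOB;
}

/* Hypothetical filter: keep objects whose depth is below *data. */
static bool keep_depth_below(const struct sketch_object *obj, const void *data)
{
	const unsigned *limit = data;
	return obj->depth < *limit;
}

/* Private data of the combining filter: the filters it wraps. */
struct sketch_combine_data {
	const struct sketch_filter *subs;
	size_t nr;
};

/*
 * The combining filter is just another implementation of the same
 * interface. Assumption: an object is kept only if all sub-filters
 * keep it.
 */
static bool keep_combined(const struct sketch_object *obj, const void *data)
{
	const struct sketch_combine_data *c = data;
	size_t i;

	assert(c && (c->subs || !c->nr));
	for (i = 0; i < c->nr; i++)
		if (!c->subs[i].keep(obj, c->subs[i].data))
			return false;
	return true;
}
```

A caller would fill an array of `struct sketch_filter`, point a `struct sketch_combine_data` at it, and use `keep_combined` wherever a single filter is accepted today. The existing filters would not have to know that they are being combined.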