git@vger.kernel.org mailing list mirror (one of many)
From: ZheNing Hu <adlternative@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com>,
	"Git List" <git@vger.kernel.org>,
	"Christian Couder" <christian.couder@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Jeff Hostetler" <jeffhost@microsoft.com>,
	"Derrick Stolee" <dstolee@microsoft.com>
Subject: Re: [PATCH] [RFC] list-objects-filter: introduce new filter sparse:buffer=<spec>
Date: Tue, 9 Aug 2022 14:13:46 +0800
Message-ID: <CAOLTT8TSyKArOKTjbuGO=OkpciS6DH0mmVPSmiidOSHxo4thNQ@mail.gmail.com>
In-Reply-To: <xmqqczdau2yd.fsf@gitster.g>

On Tue, Aug 9, 2022 at 00:15, Junio C Hamano <gitster@pobox.com> wrote:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > We already have `--filter=sparse:oid=<oid>`, which can be used
> > to clone a repository with a limited set of objects that match
> > the filter rules in the file corresponding to <oid> on the git
> > server. But it can only read filter rules that have already been
> > recorded on the git server.
>
> Was the reason why we have "we limit to an object we already have"
> restriction because we didn't want to blindly use a piece of
> uncontrolled arbitrary end-user data here?  Just wondering.
>

* An end user may not even have write access to the repository, so they
can't set up a filterspec file before git clone. What should they do
then?

* If thousands of different developers use the same git repo, and each
uses "--filter=sparse:oid" for a different partial clone, how many
filterspec files would the repo managers have to set up in advance?

* Why not carefully check the "uncontrolled arbitrary end-user data" here,
for example by adding a config like "partialclone.sparsebufferlimit" to
limit the size of the transported data, or by checking that the filterspec
is legal? And if a git server doesn't trust its users, we can add a config
to ban this filter. Within some companies, users can basically be trusted.

* I'm sure it would be beneficial to let users configure the filtering
rules, because many people now have this need: download only a few files
or directories of the repository.

* sparse-checkout + partial-clone is a good reference: we have
".git/info/sparse-checkout" to record what we actually want to check out
to the work tree, and the missing git objects for the paths recorded in
".git/info/sparse-checkout" are fetched from the git server. I know it
uses <oid> to fetch objects one by one instead of by "path"... but in
hindsight, its performance is extraordinarily bad as a result...
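
As a rough illustration of the check suggested in the third point above
(this is not code from the patch, and not how git does anything today), a
server could vet a client-supplied spec along these lines. The function
name check_filterspec and the 4096-byte default are made up for the
example; only the config name "partialclone.sparsebufferlimit" comes from
the text above, and it is itself only a proposal. The line handling is a
simplification of how sparse-checkout pattern files are read.

```python
# Hypothetical sketch: none of these names exist in git.

# Stand-in for the proposed "partialclone.sparsebufferlimit" (in bytes).
DEFAULT_LIMIT = 4096


def check_filterspec(spec: bytes, limit: int = DEFAULT_LIMIT) -> list[bytes]:
    """Return the patterns in a client-supplied spec, or raise ValueError."""
    # Refuse oversized input before looking at its contents.
    if len(spec) > limit:
        raise ValueError(f"filterspec is {len(spec)} bytes, limit is {limit}")
    if b"\0" in spec:
        raise ValueError("filterspec contains a NUL byte")

    patterns = []
    for raw in spec.split(b"\n"):
        line = raw.rstrip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(b"#"):
            continue
        patterns.append(line)

    if not patterns:
        raise ValueError("filterspec contains no patterns")
    return patterns
```

A server that does not trust its users at all would skip this and reject
the filter outright, which is the "config to ban this filter" case.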

Anyway, this patch represents some of my complaints about the current
partial-clone feature and I hope the community will move forward with it.

Thanks.

ZheNing Hu


Thread overview: 9+ messages
2022-08-08 11:29 [PATCH] [RFC] list-objects-filter: introduce new filter sparse:buffer=<spec> ZheNing Hu via GitGitGadget
2022-08-08 16:15 ` Junio C Hamano
2022-08-09  6:13   ` ZheNing Hu [this message]
2022-08-09 13:37   ` Derrick Stolee
2022-08-10 21:15     ` Jeff King
2022-08-12 15:49       ` ZheNing Hu
2022-08-14  6:54         ` Jeff King
2022-08-12 15:40     ` ZheNing Hu
2022-08-26  5:10     ` ZheNing Hu
