git@vger.kernel.org mailing list mirror (one of many)
From: "Brown, Chris" <chris.c.brown@siemens.com>
To: Rudy Rigot <rudy.rigot@gmail.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: Negative patterns in cone mode
Date: Sat, 8 Apr 2023 21:52:09 +0000
Message-ID: <GVXPR10MB8199695A0BD5D7A404CCCFD4B9979@GVXPR10MB8199.EURPRD10.PROD.OUTLOOK.COM>
In-Reply-To: <CANaDLWJ4XSFUULc4PGen_trsyJ1_K1qoufisoxgpjCfMhoDjKQ@mail.gmail.com>

Thanks for the input. I came to a similar conclusion after poring over the docs. Our situation is similar to yours: by excluding ~50 directories that are not required at build time, we can avoid 18GB of files on disk and halve the file count.

I ended up writing a Python script that uses git ls-tree to convert a few negative patterns specified by the developer into a huge set of positive patterns covering all directories in the tree *except* those that should be excluded. The performance is good: the script takes around 1s, and the sparse checkout can then operate in cone mode, which completes in seconds. That compares well with non-cone mode, which takes several minutes to sparse-checkout the same directories when they are expressed directly as negative patterns.
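For readers who want to see the shape of the idea, here is a minimal sketch of such a conversion. It is not the actual script. The function names, the HEAD default and the example paths are all hypothetical, and it assumes the exclusions are plain directory paths rather than glob patterns. It keeps every directory that is neither excluded nor an ancestor of an excluded directory, and only descends into the ancestors.

```python
import subprocess


def list_directories(rev="HEAD"):
    """Return every directory path in the tree at `rev`.

    Assumes it is run inside a git repository; `rev` is an illustrative default.
    """
    out = subprocess.run(
        ["git", "ls-tree", "-r", "-d", "--name-only", "-z", rev],
        check=True, capture_output=True, text=True,
    ).stdout
    return [p for p in out.split("\0") if p]


def positive_cone_dirs(all_dirs, excluded):
    """Turn a list of excluded directories into positive cone-mode directories.

    Limitation of this sketch: cone mode also checks out the files that sit
    directly inside each ancestor of a listed directory, so those files are
    not controlled here.
    """
    excluded = {d.strip("/") for d in excluded}

    # Group directories by their parent ("" stands for the repository root).
    children = {}
    for d in all_dirs:
        parent, _, _ = d.rpartition("/")
        children.setdefault(parent, []).append(d)

    def has_excluded_descendant(d):
        prefix = d + "/"
        return any(e.startswith(prefix) for e in excluded)

    result = []

    def walk(parent):
        for d in sorted(children.get(parent, [])):
            if d in excluded:
                continue  # drop this directory and everything below it
            if has_excluded_descendant(d):
                walk(d)  # cannot include it whole, so include its children
            else:
                result.append(d)  # safe to include recursively

    walk("")
    return result
```

The resulting list could then be passed as the directory arguments to git sparse-checkout set in cone mode. Directories with no excluded descendant are emitted once at the highest possible level, which keeps the pattern count lower than listing every leaf.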

This suggests to me that cone mode *could* be enhanced to natively support a restricted type of negative pattern (exclude this directory and all subdirs) without performance overhead.

The problem with my script is that it is quite complex, generates thousands of positive patterns, and I am not yet 100% convinced that the complexity is worth it over simply paying the cost to download the monorepo.

Chris

-----Original Message-----
From: Rudy Rigot <rudy.rigot@gmail.com> 
Sent: 08 April 2023 16:43
To: Brown, Chris (DI SW LCS CF) <chris.c.brown@siemens.com>
Cc: git@vger.kernel.org
Subject: Re: Negative patterns in cone mode

Hi,

> I'm facing an issue with negative patterns in cone mode.
> I can't tell from the docs or git code if I misunderstand the usage, 
> am trying something not supported, or if there is a bug.

My understanding so far, and I would appreciate it if someone could correct me if I'm wrong, is that the point of cone mode is that there can't be negative patterns: everything is a positive rule, so the match search can stop as soon as a positive rule is found.

My understanding has been that it was designed with the use case in mind of large mono-repos made of several independent applications, of which a given developer only needs a few. For instance, if I am an iOS developer, I will configure my sparse checkout to have the back-end code and the iOS code, but not the front-end code and the Android code.

I don't know if that's accurate because I'm not as well-versed in it as I should be, so I would appreciate it if someone could correct my understanding. It is the chief reason we are sticking with non-cone mode for our massive monolith at Salesforce: it is not a mono-repo of independent applications, but one massive monolith in which only a few (very large) files are not needed by all devs.

Thanks in advance for anyone who may have insights.


Thread overview: 3+ messages
2023-04-04 12:00 Negative patterns in cone mode Brown, Chris
2023-04-08 15:43 ` Rudy Rigot
2023-04-08 21:52   ` Brown, Chris [this message]
