From: Matheus Tavares Bernardino
Date: Fri, 22 May 2020 17:54:33 -0300
Subject: Re: [RFC PATCH v2 3/4] grep: honor sparse checkout patterns
To: Elijah Newren
Cc: Git Mailing List, Junio C Hamano, Derrick Stolee, Jonathan Tan
References: <20200522142611.1217757-1-newren@gmail.com>
X-Mailing-List: git@vger.kernel.org

Hi, Elijah

On Fri, May 22, 2020 at 12:36 PM Elijah Newren wrote:
>
> On Fri, May 22, 2020 at 7:26 AM Elijah Newren wrote:
> >
> > Hi Matheus,
> >
> > On Thu, May 21, 2020 at 10:49 PM Matheus Tavares Bernardino wrote:
> > >
> > > On Thu, May 21, 2020 at 2:52 PM Elijah Newren wrote:
> > > > >
> > > Does this seem like a good approach? Or is there another solution that
> > > I have not considered? Or even further, should we choose to skip the
> > > submodules in excluded paths? My only concern in this case is that it
> > > would be contrary to the design in git-sparse-checkout.txt. And the
> > > working tree grep and cached grep would differ even on a clean working
> > > tree.
> > >
> > Anyway, the wording in that file seems to be really important, so
> > let's fix it.
> >
>
> Let me also try to give a concrete proposal for grep behavior for the
> edge cases we've discussed:

Thank you for this proposal and for the previous comments as well.

> git -c sparse.restrictCmds=true grep --recurse-submodules $PATTERN
>
> This goes through all the files in the index (i.e. all tracked files)
> which do not have the SKIP_WORKTREE bit set. For each of these: If
> the file is a symlink, ignore it (like grep currently does). If the
> file is a regular file and is present in the working copy, search it.
> If the file is a submodule and it is initialized, recurse into it.

Sounds good. And when sparse.restrictCmds=false, we also search the
present files and the present, initialized submodules that have the
SKIP_WORKTREE bit set, right?

> git -c sparse.restrictCmds=true grep --recurse-submodules --cached $PATTERN
>
> This goes through all the files in the index (i.e. all tracked files)
> which do not have the SKIP_WORKTREE bit set. For each of these: Skip
> symlinks. Search regular files. Recurse into submodules if they are
> initialized.

OK.

> git -c sparse.restrictCmds=true grep --recurse-submodules $REVISION $PATTERN
>
> This goes through all the files in the given revision (i.e. all
> tracked files) which match the sparsity patterns (i.e. that would not
> have the SKIP_WORKTREE bit set if were we to checkout that commit).
> For each of these: Skip symlinks. Search regular files. Recurse into
> submodules if they are initialized.

OK.

> Further, for any of these, when recursing into submodules, make sure
> to load that submodules' core.sparseCheckout setting (and related
> settings) and the submodules' sparsity patterns, if any.
>
> Sound good?
>
> I think this addresses the edge cases we've discussed so far:
> interaction between submodules and sparsity patterns, and handling of
> files that are still present despite not matching the sparsity
> patterns. (Also note that files which are present-despite-the-rules
> are prone to be removed by the next `git sparse-checkout reapply` or
> anything that triggers a call to unpack_trees(); there's already
> multiple things that do and Stolee's proposed patches would add more).
> If I've missed edge cases, let me know.

Sounds great. This addresses all the edge cases we've mentioned before.
Thanks again for the detailed proposal, and for going through it case by
case.
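
As a rough illustration, the per-entry rules quoted above could be
modelled like this. It is only a Python sketch, not git code: `Entry`,
`grep_action`, and the mode names are hypothetical, `excluded` stands
for "has SKIP_WORKTREE set" (index modes) or "does not match the
sparsity patterns" (revision mode), and the restrict=False branch
assumes the answer to the question above is "yes".

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One tracked path; all field names are illustrative only."""
    kind: str                  # "regular", "symlink", or "submodule"
    excluded: bool = False     # SKIP_WORKTREE set / outside sparsity patterns
    present: bool = True       # file exists in the working copy
    initialized: bool = False  # only meaningful for submodules


def grep_action(entry: Entry, mode: str, restrict: bool = True) -> str:
    """Return "search", "recurse", or "skip" for a single entry.

    mode is "worktree", "cached", or "revision"; restrict models
    sparse.restrictCmds.
    """
    if mode not in ("worktree", "cached", "revision"):
        raise ValueError(f"unknown mode: {mode}")
    # With restriction on, entries outside the sparse checkout are ignored.
    if restrict and entry.excluded:
        return "skip"
    # Symlinks are never searched, in any mode.
    if entry.kind == "symlink":
        return "skip"
    # Submodules are recursed into only when initialized.
    if entry.kind == "submodule":
        return "recurse" if entry.initialized else "skip"
    if entry.kind == "regular":
        # A working tree grep can only search files that are on disk.
        if mode == "worktree" and not entry.present:
            return "skip"
        return "search"
    raise ValueError(f"unknown entry kind: {entry.kind}")
```

For example, a file that is present despite the rules would be skipped
by `grep_action(Entry("regular", excluded=True), "worktree")` but
searched once `restrict=False` is passed. Loading each submodule's own
sparsity settings while recursing is left out of this sketch.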