git@vger.kernel.org mailing list mirror (one of many)
From: Ben Peart <peartben@gmail.com>
To: Elijah Newren <newren@gmail.com>, Ben Peart <Ben.Peart@microsoft.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	"gitster@pobox.com" <gitster@pobox.com>,
	"pclouds@gmail.com" <pclouds@gmail.com>,
	"vmiklos@frugalware.org" <vmiklos@frugalware.org>,
	Alejandro Pauly <alpauly@microsoft.com>,
	"Johannes.Schindelin@gmx.de" <Johannes.Schindelin@gmx.de>,
	"eckhard.s.maass@googlemail.com" <eckhard.s.maass@googlemail.com>
Subject: Re: [PATCH v2] add status config and command line options for rename detection
Date: Thu, 10 May 2018 15:09:23 -0400
Message-ID: <bc81823e-7b7d-c516-dfc2-cd47bedb5a5a@gmail.com>
In-Reply-To: <CABPp-BGE6RXv3ka8wGXruFjk3W=kDEDJ6zpH3t5=_CGSTONCHQ@mail.gmail.com>



On 5/10/2018 12:19 PM, Elijah Newren wrote:
> Hi Ben,
> 
> On Thu, May 10, 2018 at 7:16 AM, Ben Peart <Ben.Peart@microsoft.com> wrote:
>> After performing a merge that has conflicts, git status will by default attempt
>> to detect renames which causes many objects to be examined.  In a virtualized
>> repo, those objects do not exist locally so the rename logic triggers them to be
>> fetched from the server. This results in the status call taking hours to
>> complete on very large repos.  Even in a small repo (the GVFS repo) turning off
>> break and rename detection has a significant impact:
> 
> It'd be nice if you could show that impact by comparing 'git status'
> to 'git status --no-renames', for some repo.  Showing only the latter
> gives us no way to assess the impact.
> 

Given that the example perf impact is arbitrary (the actual case that 
triggered this patch took status from 2+ hours to seconds) and can't be 
replicated using the current performance tools in git, I'm just going 
to drop the specific numbers.  I believe the patch is worthwhile just to 
give users the flexibility to control these behaviors.

>> git status --no-renames:
>> 31 secs., 105 loose object downloads
>>
>> git status --no-breaks
>> 7 secs., 17 loose object downloads
>>
>> git status --no-breaks --no-renames
>> 1 sec., 1 loose object download
> 
> This patch doesn't add a --no-breaks option and it doesn't exist
> previously, so adding it to the commit message serves to confuse
> rather than help.  I'd just drop the last two of these (and redo the
> timing for --no-renames assuming you are built on
> em/status-rename-config).
> 

OK

>> Add a new config status.renames setting to enable turning off rename detection
>> during status.  This setting will default to the value of diff.renames.
>>
>> Add a new config status.renamelimit setting to enable bounding the time spent
>> finding out inexact renames during status.  This setting will default to the
>> value of diff.renamelimit.
> 
> It may be worth mentioning that these config settings also affect 'git
> commit' (and it does, in my testing, which I think is a good thing).
> 

I agree this is a good thing as the other status settings behave the 
same way.  I'll update the documentation to reflect this as well.
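
To make the two settings concrete, here is a minimal sketch (not taken 
from the patch).  It assumes a git that already includes this series; 
the throwaway repository, the file names, the "Example" identity and the 
limit of 200 are all made up for illustration.

```shell
#!/bin/sh
# Sketch only: throwaway repo with one staged rename-plus-edit.
repo=$(mktemp -d) &&
git init -q "$repo" &&
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 >"$repo/original" &&
git -C "$repo" add . &&
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
	commit -q -m "Adding original file." &&
mv "$repo/original" "$repo/renamed" &&
echo 11 >>"$repo/renamed" &&
git -C "$repo" add . &&

# Set both values explicitly so the demo does not depend on user config.
# 200 is an arbitrary illustrative limit, not a recommended value.
git -C "$repo" config status.renames true &&
git -C "$repo" config status.renameLimit 200 &&
with_renames=$(git -C "$repo" status --short) &&

# Turn rename detection off for status in this repository.
git -C "$repo" config status.renames false &&
without_renames=$(git -C "$repo" status --short)

printf 'status.renames=true:\n%s\n' "$with_renames"
printf 'status.renames=false:\n%s\n' "$without_renames"
rm -rf "$repo"
```

With detection on, the change should be reported as a single rename; 
with it off, as a deletion of "original" plus an addition of "renamed".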

>> Add status --no-renames command line option that enables overriding the config
>> setting from the command line. Add --find-renames[=<n>] to enable detecting
>> renames and optionally setting the similarity index from the command line.
> 
> The command line options are specific to 'git status'.  I don't really
> have a strong opinion on whether they should also be added to
> git-commit; I suspect users would be more likely to use the config
> options in order to set it once and forget about it and that users
> would be more likely to want to override their config setting for
> status than for commit.
> 
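
As an illustration of the command line options overriding the config, 
a rough sketch follows.  It assumes a git with this patch applied; the 
temporary repository, file names and "Example" identity are 
hypothetical and only there to make the script self-contained.

```shell
#!/bin/sh
# Sketch only: show --find-renames / --no-renames winning over config.
repo=$(mktemp -d) &&
git init -q "$repo" &&
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 >"$repo/original" &&
git -C "$repo" add . &&
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
	commit -q -m "Adding original file." &&
mv "$repo/original" "$repo/renamed" &&
echo 11 >>"$repo/renamed" &&
git -C "$repo" add . &&

# Config says "no renames"; --find-renames overrides it for this call.
git -C "$repo" config status.renames false &&
cli_find=$(git -C "$repo" status --short --find-renames) &&

# Config says "renames"; --no-renames overrides it for this call.
git -C "$repo" config status.renames true &&
cli_none=$(git -C "$repo" status --short --no-renames) &&

# A 100% similarity threshold accepts only exact renames, so the edited
# file is expected to be listed as a delete plus an add.
cli_exact=$(git -C "$repo" status --short --find-renames=100%)

printf -- '--find-renames:\n%s\n' "$cli_find"
printf -- '--no-renames:\n%s\n' "$cli_none"
printf -- '--find-renames=100%%:\n%s\n' "$cli_exact"
rm -rf "$repo"
```
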
>> Note: I removed the --no-breaks command line option from the original patch as
>> it will no longer be needed once the default has been changed [1] to turn it off.
>>
>> [1] https://public-inbox.org/git/20180430093421.27551-2-eckhard.s.maass@gmail.com/
> 
> I'd just drop these lines from the commit message, and instead mention
> that your patch depends on em/status-rename-config.
> 

OK

>> +       if ((intptr_t)rename_score_arg != -1) {
>> +               s.detect_rename = DIFF_DETECT_RENAME;
> 
> I'd still prefer this was a
>          if (s.detect_rename < DIFF_DETECT_RENAME)
>                  s.detect_rename = DIFF_DETECT_RENAME;
> 
> If a user specifies they are willing to pay for copy detection, but
> then just passes --find-renames=40% because they want to find more
> renames, it seems odd to disable copy detection to me.
> 

I agree and will change it. It is unfortunate this will behave 
differently than it does with merge.  Fixing the merge behavior to match 
is outside the scope of this patch.
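
For what it's worth, the behavior being agreed to here could be checked 
by hand roughly as below.  This is only a sketch under the assumption 
that the "only raise detect_rename, never lower it" change is in the 
git being run; the repository, file names and "Example" identity are 
invented for the example.

```shell
#!/bin/sh
# Sketch only: copy detection should survive --find-renames=<n>.
repo=$(mktemp -d) &&
git init -q "$repo" &&
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 >"$repo/original" &&
git -C "$repo" add . &&
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
	commit -q -m "Adding original file." &&

# Copy the file, then modify the original so it is part of the staged
# change and therefore eligible as a copy source.
cp "$repo/original" "$repo/copy" &&
echo 11 >>"$repo/original" &&
git -C "$repo" add . &&

# The user has opted in to paying for copy detection.
git -C "$repo" config status.renames copies &&
copies_only=$(git -C "$repo" status --short) &&

# Asking for a looser rename threshold should not drop copy detection.
copies_with_score=$(git -C "$repo" status --short --find-renames=40%)

printf 'status.renames=copies:\n%s\n' "$copies_only"
printf 'status.renames=copies --find-renames=40%%:\n%s\n' "$copies_with_score"
rm -rf "$repo"
```

If the second listing showed "copy" as a plain addition, that would 
indicate --find-renames had downgraded copy detection to rename 
detection, which is the case Elijah is objecting to.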

>> +++ b/t/t7525-status-rename.sh
> 
> Testcases look good.  It'd be nice to also add a few testcases where
> copy detection is turned on -- in particular, I'd like to see one with
> --find-renames=$DIFFERENT_THAN_DEFAULT being passed when
> merge.renames=copies.
> 

OK.  I also added tests to verify the settings correctly impact commit.

> 
>> +test_expect_success 'setup' '
>> +       echo 1 >original &&
>> +       git add . &&
>> +       git commit -m"Adding original file." &&
>> +       mv original renamed &&
>> +       echo 2 >> renamed &&
>> +       git add .
>> +'
> 
> 
>> +cat >.gitignore <<\EOF
>> +.gitignore
>> +expect*
>> +actual*
>> +EOF
> 
> Can this just be included in the setup?
> 

OK

> 
> Everything else in the patch looked good to me.
> 
