git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Ben Peart <Ben.Peart@microsoft.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	"gitster@pobox.com" <gitster@pobox.com>,
	"pclouds@gmail.com" <pclouds@gmail.com>,
	"vmiklos@frugalware.org" <vmiklos@frugalware.org>,
	Alejandro Pauly <alpauly@microsoft.com>,
	"Johannes.Schindelin@gmx.de" <Johannes.Schindelin@gmx.de>,
	"eckhard.s.maass@googlemail.com" <eckhard.s.maass@googlemail.com>
Subject: Re: [PATCH v1] add status config and command line options for rename detection
Date: Wed, 9 May 2018 09:56:14 -0700	[thread overview]
Message-ID: <CABPp-BEkQN55diiovv+33P=Ouk+FcK8N4p85EZZqVmw-mxuL1A@mail.gmail.com> (raw)
In-Reply-To: <20180509144213.18032-1-benpeart@microsoft.com>

Hi Ben,

Overall I think this is good, but I have lots of nit-picky things to
bring up.  :-)


On Wed, May 9, 2018 at 7:42 AM, Ben Peart <Ben.Peart@microsoft.com> wrote:
> Add status --no-renames command line option that enables overriding the config
> setting from the command line. Add --find-renames[=<n>] to enable detecting
> renames and optionaly setting the similarity index from the command line.

s/optionaly/optionally/

> Notes:
>     Base Ref:

No base ref?  ;-)

> +status.renameLimit::
> +       The number of files to consider when performing rename detection;
> +       if not specified, defaults to the value of diff.renameLimit.
> +
> +status.renames::
> +       Whether and how Git detects renames.  If set to "false",
> +       rename detection is disabled. If set to "true", basic rename
> +       detection is enabled.  Defaults to the value of diff.renames.

I suspect that status.renames should mention "copy", just as
diff.renames does.  (We didn't mention it in merge.renames, because
merge isn't an operation for which copy detection can be useful -- at
least not until the diffstat at the end of the merge is controlled by
merge.renames as well...)

Also, do these two config settings only affect 'git status', or does
it also affect the status shown when composing a commit message
(assuming the user hasn't turned commit.status off)?  Does it affect
`git commit --dry-run` too?

> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -109,6 +109,10 @@ static int have_option_m;
>  static struct strbuf message = STRBUF_INIT;
>
>  static enum wt_status_format status_format = STATUS_FORMAT_UNSPECIFIED;
> +static int diff_detect_rename = -1;
> +static int status_detect_rename = -1;
> +static int diff_rename_limit = -1;
> +static int status_rename_limit = -1;

Could we replace these four variables with just two: detect_rename and
rename_limit?  Keeping these separate invites people to write code
using only one of the settings rather than the appropriate inherited
mixture of them, resulting in a weird bug.  More on this below...

> @@ -1259,11 +1273,29 @@ static int git_status_config(const char *k, const char *v, void *cb)
>                         return error(_("Invalid untracked files mode '%s'"), v);
>                 return 0;
>         }
> +       if (!strcmp(k, "diff.renamelimit")) {
> +               diff_rename_limit = git_config_int(k, v);
> +               return 0;
> +       }
> +       if (!strcmp(k, "status.renamelimit")) {
> +               status_rename_limit = git_config_int(k, v);
> +               return 0;
> +       }

Here, since you're already checking diff.renamelimit first, you can
just set rename_limit in both blocks and not need both
diff_rename_limit and status_rename_limit.  (Similar can be said for
diff.renames/status.renames.)

<snip>

> @@ -1297,6 +1329,10 @@ int cmd_status(int argc, const char **argv, const char *prefix)
>                   N_("ignore changes to submodules, optional when: all, dirty, untracked. (Default: all)"),
>                   PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
>                 OPT_COLUMN(0, "column", &s.colopts, N_("list untracked files in columns")),
> +               OPT_BOOL(0, "no-renames", &no_renames, N_("do not detect renames")),
> +               { OPTION_CALLBACK, 'M', "find-renames", &rename_score_arg,
> +                 N_("n"), N_("detect renames, optionally set similarity index"),
> +                 PARSE_OPT_OPTARG, opt_parse_rename_score },

Should probably also document these options in
Documentation/git-status.txt (and maybe Documentation/git-commit.txt
as well).

Not sure if we want to add a flag for copy detection (similar to
git-diff's -C/--find-copies and --find-copies-harder), or just leave
that for when someone finds a need.  If left out, might want to just
mention that it was considered and intentionally omitted for now in
the commit message.

> @@ -1336,6 +1372,27 @@ int cmd_status(int argc, const char **argv, const char *prefix)
>         s.ignore_submodule_arg = ignore_submodule_arg;
>         s.status_format = status_format;
>         s.verbose = verbose;
> +       s.detect_rename = no_renames >= 0 ? !no_renames :
> +                                         status_detect_rename >= 0 ? status_detect_rename :
> +                                         diff_detect_rename >= 0 ? diff_detect_rename :

Combining the four vars into two as mentioned above would allow
combining the last two lines above into one.

> +       if ((intptr_t)rename_score_arg != -1) {

I don't understand why rename_score_arg is a (char*) and then you need
to cast to intptr_t, but that may just be because I haven't done much
of anything with option parsing.  A quick look at the docs isn't
making it clear to me, though; could you enlighten me?

> +               s.detect_rename = DIFF_DETECT_RENAME;

What if status.renames is 'copy' but someone wants to override the
score for detecting renames and pass --find-renames=40?  Does the
--find-renames override and degrade the 'copy'?  I think it'd make
more sense to increase s.detect_rename to at least DIFF_DETECT_RENAME,
rather than just outright setting it here.

> +               if (rename_score_arg)
> +                       s.rename_score = parse_rename_score(&rename_score_arg);
> +       }
> +       s.rename_limit = status_rename_limit >= 0 ? status_rename_limit :
> +                                        diff_rename_limit >= 0 ? diff_rename_limit :

Again, combination of variables could allow these last two lines to be combined.

> +                                        s.rename_limit;
> +
> +       /*
> +        * We do not have logic to handle the detection of copies.  In
> +        * fact, it may not even make sense to add such logic: would we
> +        * really want a change to a base file to be propagated through
> +        * multiple other files by a merge?
> +        */
> +       if (s.detect_rename > DIFF_DETECT_RENAME)
> +               s.detect_rename = DIFF_DETECT_RENAME;

This comment and code made sense in merge-recursive.c (which doesn't
show detected diffs/renames/copies but just uses them for internal
processing logic).  It does not make sense here; git status could show
detected copies much like `git diff -C --name-status` shows it.  In
fact, a quick grep for DIFF_STATUS_COPIED shows multiple hits in
wt-status.c, so I suspect it already has the necessary logic for
displaying copies.


I looked over the rest of the patch.  Nice testcases.  Adding a couple
more testcases around copy detection could be useful.

  parent reply	other threads:[~2018-05-09 16:56 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-09 14:42 [PATCH v1] add status config and command line options for rename detection Ben Peart
2018-05-09 15:59 ` Duy Nguyen
2018-05-09 17:04   ` Ben Peart
2018-05-09 16:56 ` Elijah Newren [this message]
2018-05-09 19:54   ` Ben Peart
2018-05-10 14:16 ` [PATCH v2] " Ben Peart
2018-05-10 16:19   ` Elijah Newren
2018-05-10 19:09     ` Ben Peart
2018-05-10 22:31       ` Elijah Newren
2018-05-11 12:50         ` Ben Peart
2018-05-11  1:57     ` Junio C Hamano
2018-05-11  6:39   ` Junio C Hamano
2018-05-11 12:56 ` [PATCH v3] " Ben Peart
2018-05-11 14:33   ` Elijah Newren
2018-05-12  8:04   ` Eckhard Maaß
2018-05-14 12:57     ` Ben Peart
2018-05-11 15:38 ` [PATCH v4] " Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABPp-BEkQN55diiovv+33P=Ouk+FcK8N4p85EZZqVmw-mxuL1A@mail.gmail.com' \
    --to=newren@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=alpauly@microsoft.com \
    --cc=eckhard.s.maass@googlemail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=vmiklos@frugalware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).