git@vger.kernel.org mailing list mirror (one of many)
From: Philip Oakley <philipoakley@iee.email>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Junio C Hamano <gitster@pobox.com>
Cc: Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Denton Liu <liu.denton@gmail.com>
Subject: Re: [PATCH v3 00/13] ci: include a Visual Studio build & test in our Azure Pipeline
Date: Wed, 9 Oct 2019 14:57:22 +0100
Message-ID: <9ccbdb9a-845f-a534-29b6-52cfe9eb3229@iee.email>
In-Reply-To: <nycvar.QRO.7.76.6.1910081423250.46@tvgsbejvaqbjf.bet>

Hi Dscho,

On 08/10/2019 13:46, Johannes Schindelin wrote:
> Hi Junio,
>
> On Tue, 8 Oct 2019, Junio C Hamano wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>>>> I didn't quite understand this part, though.
>>>>
>>>>      The default creation factor is 60 (roughly speaking, it wants 60% of
>>>>      the lines to match between two patches, otherwise it considers the
>>>>      patches to be unrelated).
>>>>
>>>> Would the updated creation factor, which is 95, (roughly
>>>> speaking) want 95% of the lines to match between two patches?
>>>>
>>>> That would make the matching logic even pickier and reject more
>>>> pairings, so I must be reading the statement wrong X-<.
>>> No, I must have written the opposite of what I tried to say, is all.
>> So, cfactor of 60 means at most 60% is allowed to differ and the
>> two patches are still considered to be related, while 95 means only
>> 5% needs to be common?  That would make more sense to me.
> Okay, I not only wrote the opposite of what I wanted to say, I also
> misremembered.
>
> When `range-diff` tries to determine matching pairs of patches, it
> builds an `(m+n)x(m+n)` cost matrix, where `m` is the number of patches
> in the first commit range and `n` is the number of patches in the second
> one.
>
> Why not `m x n`? Well, that's the obvious matrix, and that's what it
> starts with, essentially assigning the number of lines of the diff
> between the diffs as "cost".
>
> But then `git range-diff` extends the cost matrix to allow for _all_ of
> the `m` patches to be considered deleted, and _all_ of the `n` patches
> to be added. As cost, it cannot use a "diff of diffs" because there is
> no second diff. So it uses the number of lines of the one diff it has,
> multiplied by the creation factor interpreted as a percentage.
>
> The naive creation factor would be 100%, which is (almost) as if we
> assumed an empty diff for the missing diff. But that would make the
> range-diff too eager to dismiss rewrites, as experience obviously showed
> (not my experience, but Thomas Rast's, who came up with `tbdiff` after
> all): the diff of diffs includes a diff header, for example.
>
> The interpretation I offered (although I inverted what I wanted to
> say), which is similar in spirit to that metric (which is not actually
> a metric, I believe, because I expect it to violate the triangle
> inequality), is obviously inaccurate: the number of lines of the diff
> of diffs does not say anything about the number of matching lines.
> Quite to the contrary, it correlates somewhat with the number of
> non-matching lines.
>
> So a better interpretation would have been:
>
> 	The default creation factor is 60 (roughly speaking, it wants at
> 	most 60% of the diffs' lines to differ, otherwise it considers
> 	them not to be a match).
>
> This is still inaccurate, but at least it gets the idea of the
> range-diff across.
>
> Of course, I will never be able to amend the commit message in
> GitGitGadget anyway, as I have merged that PR already.
>
> Ciao,
> Dscho
Medium term, is this something that could go in the algorithms section
of the range-diff man page, especially if the upstream commit message is
already in place?

#leftoverdocs ?

Philip
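
For readers following along, the cost matrix Dscho describes above can be
sketched roughly as follows. This is a minimal illustration based only on
the description in the quoted message, not git's actual code: the function
names (`build_cost_matrix`, `match_patches`), the brute-force assignment
search, and the sample line counts are all assumptions made up for the
example.

```python
from itertools import permutations


def build_cost_matrix(pair_cost, size_a, size_b, creation_factor=60):
    """Build the (m+n)x(m+n) matrix described in the quoted message.

    pair_cost[i][j]: lines in the diff between diff i (first range)
                     and diff j (second range), the "diff of diffs".
    size_a[i], size_b[j]: lines in each individual diff.
    """
    m, n = len(size_a), len(size_b)
    total = m + n
    cost = [[0] * total for _ in range(total)]
    for i in range(m):
        for j in range(n):
            cost[i][j] = pair_cost[i][j]
        # Extra columns: patch i of the first range is considered deleted.
        for j in range(n, total):
            cost[i][j] = size_a[i] * creation_factor // 100
    for j in range(n):
        # Extra rows: patch j of the second range is considered added.
        for i in range(m, total):
            cost[i][j] = size_b[j] * creation_factor // 100
    # The bottom-right block stays 0: pairing two placeholders is free.
    return cost


def match_patches(pair_cost, size_a, size_b, creation_factor=60):
    """Return the (i, j) pairs of the cheapest assignment.

    Brute force over all permutations, so only usable for tiny inputs;
    it stands in for a proper linear assignment solver.
    """
    m, n = len(size_a), len(size_b)
    cost = build_cost_matrix(pair_cost, size_a, size_b, creation_factor)
    total = m + n
    best = min(
        permutations(range(total)),
        key=lambda p: sum(cost[i][p[i]] for i in range(total)),
    )
    return [(i, best[i]) for i in range(m) if best[i] < n]


# Hypothetical numbers: two 100-line diffs whose diff of diffs has 130 lines.
print(match_patches([[130]], [100], [100], creation_factor=60))  # []
print(match_patches([[130]], [100], [100], creation_factor=95))  # [(0, 0)]
```

With these made-up numbers, a creation factor of 60 leaves the two patches
unmatched (dropping one and adding the other costs 60 + 60 = 120, less than
the 130-line diff of diffs), while 95 pairs them (95 + 95 = 190 exceeds
130). So a higher creation factor makes the matching more lenient, not
pickier.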
