git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Stefan Beller <sbeller@google.com>,
	Marc Branchaud <marcnarc@xiplink.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH 0/2] Make diff plumbing commands respect the indentHeuristic.
Date: Mon, 1 May 2017 12:34:38 +0200	[thread overview]
Message-ID: <CACBZZX5f81HKCjRjTDyXzNMVuef9Z_ECS+0SVk2xpbwXudgxCw@mail.gmail.com> (raw)
In-Reply-To: <20170428220450.olqitnuwhrxzg3pv@sigill.intra.peff.net>

On Sat, Apr 29, 2017 at 12:04 AM, Jeff King <peff@peff.net> wrote:
> On Fri, Apr 28, 2017 at 10:34:15AM -0700, Stefan Beller wrote:
>
>> > So instead I chose to make the indentHeuristic option part of diff's basic
>> > configuration, and in each of the diff plumbing commands I moved the call to
>> > git_config() before the call to init_revisions().
>> [...]
>>
>> The feature was included in v2.11 (released 2016-11-29) and we got no
>> negative feedback. Quite the opposite, all feedback we got, was positive.
>> This could be explained by having the feature as an experimental feature
>> and users who would be broken by it, did not test it yet or did not speak up.
>
> Yeah, if the point you're trying to make is "nobody will be mad if this
> is turned on by default", I don't think shipping it as a config option
> is very conclusive.
>
> I think a more interesting argument is: we turned on renames by default
> a few versions ago, which changes the diff in a much bigger way, and
> nobody complained.
>
>   As a side note, I do happen to know of one program that depends
>   heavily on diffs remaining stable. Imagine you have a Git hosting site
>   which lets people comment on lines of diffs. You need some way to
>   address the lines of the diff so that the annotations appear in the
>   correct position when you regenerate the diff later.
>
>   One way to do it is to just position the comment at the n'th line of
>   the diff. Which obviously breaks if the diff changes. IMHO that is a
>   bug in that program, which should be fixed to use the line numbers
>   from the original blob (which is still not foolproof, because a
>   different diff algorithm may move a change such that the line isn't
>   even part of the diff anymore).
>
>   I'm not worried about this particular program, as I happen to know it
>   has already been fixed. But it's possible others have made a similar
>   mistake.
>
>> So I'd propose to turn it on by default and anyone negatively impacted by that
>> could then use the config to turn it off for themselves (including plumbing).
>>
>> Something like this, maybe?
>
> Yeah, as long as this is on top of Marc's patches, I think it is the
> natural conclusion that we had planned.
>
> I don't know if we would want to be extra paranoid about patch-ids.
> There is no helping:
>
>   git rev-list HEAD | git diff-tree --stdin -p | git patch-id --stable
>
> because diff-tree doesn't know that it's trying for "--stable" output.
> But the diffs we compute internally for patch-id could disable the
> heuristics. I'm not sure if those matter, though. AFAIK those are used
> only for internal comparisons within a single program. I.e., we never
> compare them against input from the user, nor do we output them to the
> user. So they'll change, but I don't think anybody would care.

I have a few-million row table with commit_id as one column & patch_id
as another. I.e. a commit -> patch_id mapping.

This is used for an hourly reporting system of sorts, i.e. you can see
that you were working on commit X on Friday. It uses the patch-id to
de-duplicate that commit against some commit Y you may have pushed to
a topic branch, and already used for filling in hours.

I.e. it's a thing do DWYM in a rebase-heavy workflow where the commit
ids change, and where you're too lazy to have deleted the topic branch
afterwards (because they get pruned anyway).

It's fine for me if the patch-id changes, for most diffs it'll be the
same, and if not it'll auto-adjust eventually since the records I'm
inserting will all use the new algorithm.

But as the diff algorithm changes this system will start presenting
both X and Y in the UI as not-duplicate, because it'll have computed
the patch id of X with an older version of git, and the id of Y with
the new one.

I'm surely not the only one who's written something like this. Just
wanted to point out that systems like this do exist, I don't know of
any easier way with git to get a "is this rebased commit the same"
than patch-id, and for performance reasons you may be comparing patch
ids across git versions.

  parent reply	other threads:[~2017-05-01 10:35 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-25 17:21 BUG: diff-{index,files,tree} (and git-gui) do not respect the diff.indentHeuristic config setting Marc Branchaud
2017-04-25 21:27 ` Jeff King
2017-04-27 20:50 ` [PATCH 0/2] Make diff plumbing commands respect the indentHeuristic Marc Branchaud
2017-04-27 20:50   ` [PATCH 1/2] Make the indent heuristic part of diff's basic configuration Marc Branchaud
2017-04-28  7:59     ` Jeff King
2017-04-27 20:50   ` [PATCH 2/2] Have the diff-* builtins configure diff before initializing revisions Marc Branchaud
2017-04-28  8:06     ` Jeff King
2017-05-01  1:01       ` Junio C Hamano
2017-05-01  5:17         ` Jeff King
2017-05-01  5:29           ` Jeff King
2017-04-28  7:56   ` [PATCH 0/2] Make diff plumbing commands respect the indentHeuristic Jeff King
2017-04-28 17:34   ` Stefan Beller
2017-04-28 22:04     ` Jeff King
2017-04-28 22:13       ` Stefan Beller
2017-05-01 10:34       ` Ævar Arnfjörð Bjarmason [this message]
2017-05-09  3:16         ` Jeff King
2017-05-09  4:06           ` Junio C Hamano
2017-05-09  7:58           ` Ævar Arnfjörð Bjarmason
2017-05-09  8:04             ` Jeff King
2017-04-28 22:33   ` [PATCHv2 0/3] " Marc Branchaud
2017-04-28 22:33     ` [PATCHv2 1/3] diff: make the indent heuristic part of diff's basic configuration Marc Branchaud
2017-04-28 22:33     ` [PATCHv2 2/3] diff: have the diff-* builtins configure diff before initializing revisions Marc Branchaud
2017-04-28 22:33     ` [PATCH 3/3] diff: enable indent heuristic by default Marc Branchaud
2017-04-29 12:40     ` [PATCHv2 0/3] Make diff plumbing commands respect the indentHeuristic Jeff King
2017-04-29 13:14       ` Jeff King
2017-05-01  1:11         ` Junio C Hamano
2017-05-01  5:15           ` Jeff King
2017-05-01 23:25             ` Junio C Hamano
2017-05-01 22:13         ` [PATCHv3 0/4] " Marc Branchaud
2017-05-01 22:13           ` [PATCHv3 1/4] diff: make the indent heuristic part of diff's basic configuration Marc Branchaud
2017-05-01 22:13           ` [PATCHv3 2/4] diff: have the diff-* builtins configure diff before initializing revisions Marc Branchaud
2017-05-01 22:13           ` [PATCHv3 3/4] diff: enable indent heuristic by default Marc Branchaud
2017-05-01 22:20             ` Jeff King
2017-05-01 22:13           ` [PATCHv3 4/4] add--interactive: drop diff.indentHeuristic handling Marc Branchaud
2017-05-01 22:18           ` [PATCHv3 0/4] Make diff plumbing commands respect the indentHeuristic Stefan Beller
2017-04-29 13:15       ` [PATCH 4/3] add--interactive: drop diff.indentHeuristic handling Jeff King
2017-04-30  3:26       ` [PATCHv2 0/3] Make diff plumbing commands respect the indentHeuristic Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACBZZX5f81HKCjRjTDyXzNMVuef9Z_ECS+0SVk2xpbwXudgxCw@mail.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=marcnarc@xiplink.com \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).