From: Thomas Braun <thomas.braun@virtuell-zuhause.de>
To: Tom Ritter <tom@ritter.vg>, git@vger.kernel.org
Subject: Re: Can I convince the diff algorithm to behave better?
Date: Wed, 3 Mar 2021 13:41:00 +0100
Message-ID: <10c330f1-b3ae-38ab-1a8b-23c0b46f1557@virtuell-zuhause.de>
In-Reply-To: <CA+cU71=FfReSG411Feo=vmkw4MdK4KDgokP1jH6uwOkC_0AbYA@mail.gmail.com>

On 3/3/2021 3:03 AM, Tom Ritter wrote:

Hi Tom,

> (For a specific, nuanced, and personal definition of better...)
> 
> I have a frequent behavior that arises when I am copy/pasting chunks
> of code, typically in tests.  Here is an example:
> 
> My Original code:
> 
> def function():
>    line 1
>    line 2
>    line 3
>    line 4
>    line 5
>    line 6
> 
> --------------------------------
> I add, after it:
> 
> def function2():
>    line 1
>    line 2
>    line 3
>    line 4
>    line 5
>    line 6
> 
> --------------------------------
> My diff is:
> 
> +   line 3
> +   line 4
> +   line 5
> +   line 6
> +
> +def function2():
> +   line 1
> +   line 2
> 
> --------------------------------
> I'd like my diff to be
> 
> +
> +def function2():
> +   line 1
> +   line 2
> +   line 3
> +   line 4
> +   line 5
> +   line 6

I tried to reproduce this and got exactly the diff you wanted. I had to
add a blank line after the first "line 4" to get the unwanted diff.

Commit:

+++ b/test.py
@@ -0,0 +1,7 @@
+def function():
+    line 1
+    line 2
+    line 3
+    line 4
+    line 5
+    line 6

and then the following change:

--- a/test.py
+++ b/test.py
@@ -3,5 +3,14 @@ def function():
     line 2
     line 3
     line 4
+
+    line 5
+    line 6
+
+def function2():
+    line 1
+    line 2
+    line 3
+    line 4
     line 5
     line 6
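
To see why the tool has a choice at all, here is a minimal Python sketch
(not git code) built from the example in the quoted message. It applies
both diffs as plain insertions and checks that each one turns the
original file into the same result. The helper name apply_insertion is
made up for this illustration.

```python
# Tom's original file, as quoted above (3-space indent as in his message).
body = [f"   line {n}" for n in range(1, 7)]
old = ["def function():"] + body

# The file after pasting a copy of the function below the original.
new = old + ["", "def function2():"] + body


def apply_insertion(lines, index, added):
    """Return a copy of `lines` with `added` inserted before position `index`."""
    return lines[:index] + added + lines[index:]


# Diff A (what Tom got): eight lines inserted after the first "line 2".
diff_a = body[2:] + ["", "def function2():"] + body[:2]
# Diff B (what Tom wanted): eight lines appended at the end of the file.
diff_b = ["", "def function2():"] + body

result_a = apply_insertion(old, 3, diff_a)
result_b = apply_insertion(old, len(old), diff_b)

# Both insertions produce the same file and add the same number of lines.
print(result_a == new, result_b == new, len(diff_a) == len(diff_b))
# prints: True True True
```

Both diffs are correct and equally short, so which one is shown depends
only on how the diff algorithm breaks the tie.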

I usually play around with --anchored when I want to solve an issue like
that.

The documentation for --anchored says:

If a line exists in both the source and destination, exists only once,
and starts with this text, this algorithm attempts to prevent it from
appearing as a deletion or addition in the output. It uses the "patience
diff" algorithm internally.

But I can't get it to work here, because the "exists only once" premise
does not hold: after the paste, every "line N" appears twice in the new
file.
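
As a rough illustration of that condition, the following Python sketch
checks which lines would qualify as anchors under the wording of the
documentation quoted above. It is not git's implementation; the function
name anchor_candidates is hypothetical, and the sketch compares whole
lines including their indentation, which may differ from what git does.

```python
from collections import Counter

# File contents before and after the change shown in the diff above.
old = ["def function():"] + [f"    line {n}" for n in range(1, 7)]
new = (
    ["def function():", "    line 1", "    line 2", "    line 3", "    line 4",
     "", "    line 5", "    line 6", "", "def function2():"]
    + [f"    line {n}" for n in range(1, 7)]
)


def anchor_candidates(old_lines, new_lines, prefix):
    """Lines starting with `prefix` that occur exactly once in both files."""
    old_counts = Counter(old_lines)
    new_counts = Counter(new_lines)
    return [
        line
        for line in old_counts
        if line.startswith(prefix)
        and old_counts[line] == 1
        and new_counts[line] == 1
    ]


print(anchor_candidates(old, new, "    line 5"))     # []: occurs twice in new
print(anchor_candidates(old, new, "def function2"))  # []: missing from old
print(anchor_candidates(old, new, "def function():"))
```

Only the unchanged "def function():" header qualifies, and anchoring it
does not help, since it is already context in both diffs.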

Stepping back: it might also make sense to rethink the code, as repeating
the same 6 lines in every function might not be the best possible design.
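
One possible shape for that, as a sketch only: the repeated lines move
into a shared helper, so each new function adds just a few lines that do
not duplicate existing ones. The name shared_steps and the use of a list
as a stand-in for the real work are assumptions made for this example.

```python
def shared_steps(log):
    """Hypothetical helper holding the six steps both functions repeated."""
    for n in range(1, 7):
        log.append(f"line {n}")


def function():
    log = []
    shared_steps(log)
    return log


def function2():
    # A newly added function now contributes only these few lines to a diff.
    log = []
    shared_steps(log)
    return log


print(function() == function2())
```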

Thomas
