git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Git's rename detection requires a stable sort
@ 2019-09-25  8:35 Johannes Schindelin via GitGitGadget
  2019-09-25  8:36 ` [PATCH 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-09-25  8:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

With the en/merge-recursive-cleanup patches already having advanced to next,
the problem I discovered when rebasing Git for Windows' branch thicket
becomes quite relevant now: t3030.35 fails consistently in the MSVC build &
test (this part of the Azure Pipeline will be upstreamed later).

The solution: use a stable sort.

Note: this patch series is based on top of en/merge-recursive-cleanup.

Johannes Schindelin (2):
  Move git_sort(), a stable sort, into into libgit.a
  diffcore_rename(): use a stable sort

 Makefile                  | 2 +-
 compat/mingw.c            | 5 -----
 diffcore-rename.c         | 2 +-
 git-compat-util.h         | 4 +++-
 compat/qsort.c => qsort.c | 2 +-
 5 files changed, 6 insertions(+), 9 deletions(-)
 rename compat/qsort.c => qsort.c (97%)


base-commit: 4615a8cb5b3a8d4959c30338925b1fa3b948ae52
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-352%2Fdscho%2Frename-needs-stable-sort-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-352/dscho/rename-needs-stable-sort-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/352
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] Move git_sort(), a stable sort, into into libgit.a
  2019-09-25  8:35 [PATCH 0/2] Git's rename detection requires a stable sort Johannes Schindelin via GitGitGadget
@ 2019-09-25  8:36 ` Johannes Schindelin via GitGitGadget
  2019-09-28 10:14   ` Junio C Hamano
  2019-09-25  8:36 ` [PATCH 2/2] diffcore_rename(): use a stable sort Johannes Schindelin via GitGitGadget
  2019-09-30 17:21 ` [PATCH v2 0/2] Git's rename detection requires " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-09-25  8:36 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `qsort()` function is not guaranteed to be stable, i.e. it does not
promise to maintain the order of items it is told to consider equal. In
contrast, the `git_sort()` function we carry in `compat/qsort.c` _is_
stable, by virtue of implementing a merge sort algorithm.

In preparation for using a stable sort in Git's rename detection, move
the stable sort into `libgit.a` so that it is compiled in
unconditionally.

Note: this also makes the hack obsolete that was introduced in
fe21c6b285d (mingw: reencode environment variables on the fly (UTF-16
<-> UTF-8), 2018-10-30), where we included `compat/qsort.c` directly in
`compat/mingw.c` to use the stable sort.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile                  | 2 +-
 compat/mingw.c            | 5 -----
 git-compat-util.h         | 4 +++-
 compat/qsort.c => qsort.c | 2 +-
 4 files changed, 5 insertions(+), 8 deletions(-)
 rename compat/qsort.c => qsort.c (97%)

diff --git a/Makefile b/Makefile
index f9255344ae..1e144598e4 100644
--- a/Makefile
+++ b/Makefile
@@ -950,6 +950,7 @@ LIB_OBJS += prio-queue.o
 LIB_OBJS += progress.o
 LIB_OBJS += prompt.o
 LIB_OBJS += protocol.o
+LIB_OBJS += qsort.o
 LIB_OBJS += quote.o
 LIB_OBJS += range-diff.o
 LIB_OBJS += reachable.o
@@ -1714,7 +1715,6 @@ ifdef NO_GETPAGESIZE
 endif
 ifdef INTERNAL_QSORT
 	COMPAT_CFLAGS += -DINTERNAL_QSORT
-	COMPAT_OBJS += compat/qsort.o
 endif
 ifdef HAVE_ISO_QSORT_S
 	COMPAT_CFLAGS += -DHAVE_ISO_QSORT_S
diff --git a/compat/mingw.c b/compat/mingw.c
index 738f0a826a..77d4ef4d19 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1229,11 +1229,6 @@ static int wenvcmp(const void *a, const void *b)
 	return _wcsnicmp(p, q, p_len);
 }
 
-/* We need a stable sort to convert the environment between UTF-16 <-> UTF-8 */
-#ifndef INTERNAL_QSORT
-#include "qsort.c"
-#endif
-
 /*
  * Build an environment block combining the inherited environment
  * merged with the given list of settings.
diff --git a/git-compat-util.h b/git-compat-util.h
index 83be89de0a..2d46162897 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1094,9 +1094,9 @@ static inline int strtol_i(char const *s, int base, int *result)
 	return 0;
 }
 
-#ifdef INTERNAL_QSORT
 void git_qsort(void *base, size_t nmemb, size_t size,
 	       int(*compar)(const void *, const void *));
+#ifdef INTERNAL_QSORT
 #define qsort git_qsort
 #endif
 
@@ -1108,6 +1108,8 @@ static inline void sane_qsort(void *base, size_t nmemb, size_t size,
 		qsort(base, nmemb, size, compar);
 }
 
+#define QSORT_STABLE(base, n, compar) git_qsort((base), (n), sizeof(*(base)), compar)
+
 #ifndef HAVE_ISO_QSORT_S
 int git_qsort_s(void *base, size_t nmemb, size_t size,
 		int (*compar)(const void *, const void *, void *), void *ctx);
diff --git a/compat/qsort.c b/qsort.c
similarity index 97%
rename from compat/qsort.c
rename to qsort.c
index 7d071afb70..08f80eea09 100644
--- a/compat/qsort.c
+++ b/qsort.c
@@ -1,4 +1,4 @@
-#include "../git-compat-util.h"
+#include "git-compat-util.h"
 
 /*
  * A merge sort implementation, simplified from the qsort implementation
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] diffcore_rename(): use a stable sort
  2019-09-25  8:35 [PATCH 0/2] Git's rename detection requires a stable sort Johannes Schindelin via GitGitGadget
  2019-09-25  8:36 ` [PATCH 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
@ 2019-09-25  8:36 ` Johannes Schindelin via GitGitGadget
  2019-09-30 17:21 ` [PATCH v2 0/2] Git's rename detection requires " Johannes Schindelin via GitGitGadget
  2 siblings, 0 replies; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-09-25  8:36 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

During Git's rename detection, the file names are sorted. At the moment,
this job is performed by `qsort()`. As that function is not guaranteed
to implement a stable sort algorithm, this can lead to inconsistent
and/or surprising behavior: a rename might be detected differently
depending on the platform where Git was run.

The `qsort()` in MS Visual C's runtime does _not_ implement a stable
sort algorithm, and it even leads to an inconsistency leading to a test
failure in t3030.35 "merge-recursive remembers the names of all base
trees": a different code path than on Linux is taken in the rename
detection of an ambiguous rename between either `e` to `a` or
`a~Temporary merge branch 2_0` to `a` during a recursive merge,
unexpectedly resulting in a clean merge.

Let's use the stable sort provided by `git_sort()` to avoid this
inconsistency.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 diffcore-rename.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 9624864858..248f2be430 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -585,7 +585,7 @@ void diffcore_rename(struct diff_options *options)
 	stop_progress(&progress);
 
 	/* cost matrix sorted by most to least similar pair */
-	QSORT(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare);
+	QSORT_STABLE(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare);
 
 	rename_count += find_renames(mx, dst_cnt, minimum_score, 0);
 	if (detect_rename == DIFF_DETECT_COPY)
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] Move git_sort(), a stable sort, into into libgit.a
  2019-09-25  8:36 ` [PATCH 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
@ 2019-09-28 10:14   ` Junio C Hamano
  2019-09-30 11:38     ` Johannes Schindelin
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2019-09-28 10:14 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> ---
>  Makefile                  | 2 +-
>  compat/mingw.c            | 5 -----
>  git-compat-util.h         | 4 +++-
>  compat/qsort.c => qsort.c | 2 +-
>  4 files changed, 5 insertions(+), 8 deletions(-)
>  rename compat/qsort.c => qsort.c (97%)

Quite pleasing.

> diff --git a/compat/mingw.c b/compat/mingw.c
> index 738f0a826a..77d4ef4d19 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -1229,11 +1229,6 @@ static int wenvcmp(const void *a, const void *b)
>  	return _wcsnicmp(p, q, p_len);
>  }
>  
> -/* We need a stable sort to convert the environment between UTF-16 <-> UTF-8 */
> -#ifndef INTERNAL_QSORT
> -#include "qsort.c"
> -#endif
> -

Especially these ;-)

> diff --git a/compat/qsort.c b/qsort.c
> similarity index 97%
> rename from compat/qsort.c
> rename to qsort.c
> index 7d071afb70..08f80eea09 100644
> --- a/compat/qsort.c
> +++ b/qsort.c
> @@ -1,4 +1,4 @@
> -#include "../git-compat-util.h"
> +#include "git-compat-util.h"

I however do not think this goes far enough.  With a bit more effort
would make the intention of the API more obvious.

What we are saying now is that

 (1) some platforms do not even have qsort()
 (2) some codepaths do care the stability of sorting

the former used to be the reason why we called our implementation
git_qsort() and aliased qsort() to use git_qsort().

But now (2) is in the picture, we would probably want to make it
more clear that our own implementation is not about having a sort
function that behaves like qsort() and We want a lot more out of it
(namely, stability).  It probably is time to rename git_qsort() to
git_stable_qsort() or something like that.

Macros QSORT() and QSORT_S() are about having a convenient and less
error-prone thin wrapper that retains an interface similar to
qsort(3) and qsort_s(3) that developers are familiar with, so they
can and should stay the same name, and it is perfectly fine if the
former called qsort() that in turn calls git_stable_qsort() on some
platforms.

And that way, if/when the calling site of QSORT() (not the calling
site of STABLE_QSORT()) needs a more performant but unstable sort
implementation, we can safely give the most sensible name for it,
which is git_qsort().

Other than that, the two patches look good to me.  Thanks for
working on this topic.




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] Move git_sort(), a stable sort, into into libgit.a
  2019-09-28 10:14   ` Junio C Hamano
@ 2019-09-30 11:38     ` Johannes Schindelin
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Schindelin @ 2019-09-30 11:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin via GitGitGadget, git

Hi Junio,

On Sat, 28 Sep 2019, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > ---
> >  Makefile                  | 2 +-
> >  compat/mingw.c            | 5 -----
> >  git-compat-util.h         | 4 +++-
> >  compat/qsort.c => qsort.c | 2 +-
> >  4 files changed, 5 insertions(+), 8 deletions(-)
> >  rename compat/qsort.c => qsort.c (97%)
>
> Quite pleasing.
>
> > diff --git a/compat/mingw.c b/compat/mingw.c
> > index 738f0a826a..77d4ef4d19 100644
> > --- a/compat/mingw.c
> > +++ b/compat/mingw.c
> > @@ -1229,11 +1229,6 @@ static int wenvcmp(const void *a, const void *b)
> >  	return _wcsnicmp(p, q, p_len);
> >  }
> >
> > -/* We need a stable sort to convert the environment between UTF-16 <-> UTF-8 */
> > -#ifndef INTERNAL_QSORT
> > -#include "qsort.c"
> > -#endif
> > -
>
> Especially these ;-)
>
> > diff --git a/compat/qsort.c b/qsort.c
> > similarity index 97%
> > rename from compat/qsort.c
> > rename to qsort.c
> > index 7d071afb70..08f80eea09 100644
> > --- a/compat/qsort.c
> > +++ b/qsort.c
> > @@ -1,4 +1,4 @@
> > -#include "../git-compat-util.h"
> > +#include "git-compat-util.h"
>
> I however do not think this goes far enough.  With a bit more effort
> would make the intention of the API more obvious.
>
> What we are saying now is that
>
>  (1) some platforms do not even have qsort()
>  (2) some codepaths do care the stability of sorting
>
> the former used to be the reason why we called our implementation
> git_qsort() and aliased qsort() to use git_qsort().
>
> But now (2) is in the picture, we would probably want to make it
> more clear that our own implementation is not about having a sort
> function that behaves like qsort() and We want a lot more out of it
> (namely, stability).  It probably is time to rename git_qsort() to
> git_stable_qsort() or something like that.
>
> Macros QSORT() and QSORT_S() are about having a convenient and less
> error-prone thin wrapper that retains an interface similar to
> qsort(3) and qsort_s(3) that developers are familiar with, so they
> can and should stay the same name, and it is perfectly fine if the
> former called qsort() that in turn calls git_stable_qsort() on some
> platforms.
>
> And that way, if/when the calling site of QSORT() (not the calling
> site of STABLE_QSORT()) needs a more performant but unstable sort
> implementation, we can safely give the most sensible name for it,
> which is git_qsort().
>
> Other than that, the two patches look good to me.  Thanks for
> working on this topic.

Okay, I will make that change.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 0/2] Git's rename detection requires a stable sort
  2019-09-25  8:35 [PATCH 0/2] Git's rename detection requires a stable sort Johannes Schindelin via GitGitGadget
  2019-09-25  8:36 ` [PATCH 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
  2019-09-25  8:36 ` [PATCH 2/2] diffcore_rename(): use a stable sort Johannes Schindelin via GitGitGadget
@ 2019-09-30 17:21 ` Johannes Schindelin via GitGitGadget
  2019-09-30 17:21   ` [PATCH v2 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
                     ` (2 more replies)
  2 siblings, 3 replies; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-09-30 17:21 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Junio C Hamano

With the en/merge-recursive-cleanup patches already having advanced to next,
the problem I discovered when rebasing Git for Windows' branch thicket
becomes quite relevant now: t3030.35 fails consistently in the MSVC build &
test (this part of the Azure Pipeline will be upstreamed later).

The solution: use a stable sort.

Note: this patch series is based on top of en/merge-recursive-cleanup.

Changes since v1:

 * The function was renamed to git_stable_qsort(), as per Junio's
   suggestion.

Johannes Schindelin (2):
  Move git_sort(), a stable sort, into into libgit.a
  diffcore_rename(): use a stable sort

 Makefile                         |  2 +-
 compat/mingw.c                   | 11 +++--------
 diffcore-rename.c                |  2 +-
 git-compat-util.h                |  9 ++++++---
 compat/qsort.c => stable-qsort.c |  6 +++---
 5 files changed, 14 insertions(+), 16 deletions(-)
 rename compat/qsort.c => stable-qsort.c (89%)


base-commit: 4615a8cb5b3a8d4959c30338925b1fa3b948ae52
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-352%2Fdscho%2Frename-needs-stable-sort-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-352/dscho/rename-needs-stable-sort-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/352

Range-diff vs v1:

 1:  1202809db7 < -:  ---------- Move git_sort(), a stable sort, into into libgit.a
 -:  ---------- > 1:  8a99008a64 Move git_sort(), a stable sort, into into libgit.a
 2:  a95cdf1e94 ! 2:  442140438a diffcore_rename(): use a stable sort
     @@ -16,7 +16,7 @@
          `a~Temporary merge branch 2_0` to `a` during a recursive merge,
          unexpectedly resulting in a clean merge.
      
     -    Let's use the stable sort provided by `git_sort()` to avoid this
     +    Let's use the stable sort provided by `git_stable_qsort()` to avoid this
          inconsistency.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
     @@ -29,7 +29,7 @@
       
       	/* cost matrix sorted by most to least similar pair */
      -	QSORT(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare);
     -+	QSORT_STABLE(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare);
     ++	STABLE_QSORT(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare);
       
       	rename_count += find_renames(mx, dst_cnt, minimum_score, 0);
       	if (detect_rename == DIFF_DETECT_COPY)

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/2] Move git_sort(), a stable sort, into into libgit.a
  2019-09-30 17:21 ` [PATCH v2 0/2] Git's rename detection requires " Johannes Schindelin via GitGitGadget
@ 2019-09-30 17:21   ` Johannes Schindelin via GitGitGadget
  2019-09-30 17:21   ` [PATCH v2 2/2] diffcore_rename(): use a stable sort Johannes Schindelin via GitGitGadget
  2019-10-02  5:47   ` [PATCH v2 0/2] Git's rename detection requires " Junio C Hamano
  2 siblings, 0 replies; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-09-30 17:21 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `qsort()` function is not guaranteed to be stable, i.e. it does not
promise to maintain the order of items it is told to consider equal. In
contrast, the `git_sort()` function we carry in `compat/qsort.c` _is_
stable, by virtue of implementing a merge sort algorithm.

In preparation for using a stable sort in Git's rename detection, move
the stable sort into `libgit.a` so that it is compiled in
unconditionally, and rename it to `git_stable_qsort()`.

Note: this also makes the hack obsolete that was introduced in
fe21c6b285d (mingw: reencode environment variables on the fly (UTF-16
<-> UTF-8), 2018-10-30), where we included `compat/qsort.c` directly in
`compat/mingw.c` to use the stable sort.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile                         |  2 +-
 compat/mingw.c                   | 11 +++--------
 git-compat-util.h                |  9 ++++++---
 compat/qsort.c => stable-qsort.c |  6 +++---
 4 files changed, 13 insertions(+), 15 deletions(-)
 rename compat/qsort.c => stable-qsort.c (89%)

diff --git a/Makefile b/Makefile
index f9255344ae..60c809e580 100644
--- a/Makefile
+++ b/Makefile
@@ -983,6 +983,7 @@ LIB_OBJS += shallow.o
 LIB_OBJS += sideband.o
 LIB_OBJS += sigchain.o
 LIB_OBJS += split-index.o
+LIB_OBJS += stable-qsort.o
 LIB_OBJS += strbuf.o
 LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
@@ -1714,7 +1715,6 @@ ifdef NO_GETPAGESIZE
 endif
 ifdef INTERNAL_QSORT
 	COMPAT_CFLAGS += -DINTERNAL_QSORT
-	COMPAT_OBJS += compat/qsort.o
 endif
 ifdef HAVE_ISO_QSORT_S
 	COMPAT_CFLAGS += -DHAVE_ISO_QSORT_S
diff --git a/compat/mingw.c b/compat/mingw.c
index 738f0a826a..50af33b2b3 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -1229,11 +1229,6 @@ static int wenvcmp(const void *a, const void *b)
 	return _wcsnicmp(p, q, p_len);
 }
 
-/* We need a stable sort to convert the environment between UTF-16 <-> UTF-8 */
-#ifndef INTERNAL_QSORT
-#include "qsort.c"
-#endif
-
 /*
  * Build an environment block combining the inherited environment
  * merged with the given list of settings.
@@ -1272,8 +1267,8 @@ static wchar_t *make_environment_block(char **deltaenv)
 
 	/*
 	 * If there is a deltaenv, let's accumulate all keys into `array`,
-	 * sort them using the stable git_qsort() and then copy, skipping
-	 * duplicate keys
+	 * sort them using the stable git_stable_qsort() and then copy,
+	 * skipping duplicate keys
 	 */
 	for (p = wenv; p && *p; ) {
 		ALLOC_GROW(array, nr + 1, alloc);
@@ -1296,7 +1291,7 @@ static wchar_t *make_environment_block(char **deltaenv)
 		p += wlen + 1;
 	}
 
-	git_qsort(array, nr, sizeof(*array), wenvcmp);
+	git_stable_qsort(array, nr, sizeof(*array), wenvcmp);
 	ALLOC_ARRAY(result, size + delta_size);
 
 	for (p = result, i = 0; i < nr; i++) {
diff --git a/git-compat-util.h b/git-compat-util.h
index 83be89de0a..7ec2c8fe31 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1094,10 +1094,10 @@ static inline int strtol_i(char const *s, int base, int *result)
 	return 0;
 }
 
+void git_stable_qsort(void *base, size_t nmemb, size_t size,
+		      int(*compar)(const void *, const void *));
 #ifdef INTERNAL_QSORT
-void git_qsort(void *base, size_t nmemb, size_t size,
-	       int(*compar)(const void *, const void *));
-#define qsort git_qsort
+#define qsort git_stable_qsort
 #endif
 
 #define QSORT(base, n, compar) sane_qsort((base), (n), sizeof(*(base)), compar)
@@ -1108,6 +1108,9 @@ static inline void sane_qsort(void *base, size_t nmemb, size_t size,
 		qsort(base, nmemb, size, compar);
 }
 
+#define STABLE_QSORT(base, n, compar) \
+	git_stable_qsort((base), (n), sizeof(*(base)), compar)
+
 #ifndef HAVE_ISO_QSORT_S
 int git_qsort_s(void *base, size_t nmemb, size_t size,
 		int (*compar)(const void *, const void *, void *), void *ctx);
diff --git a/compat/qsort.c b/stable-qsort.c
similarity index 89%
rename from compat/qsort.c
rename to stable-qsort.c
index 7d071afb70..6cbaf39f7b 100644
--- a/compat/qsort.c
+++ b/stable-qsort.c
@@ -1,4 +1,4 @@
-#include "../git-compat-util.h"
+#include "git-compat-util.h"
 
 /*
  * A merge sort implementation, simplified from the qsort implementation
@@ -44,8 +44,8 @@ static void msort_with_tmp(void *b, size_t n, size_t s,
 	memcpy(b, t, (n - n2) * s);
 }
 
-void git_qsort(void *b, size_t n, size_t s,
-	       int (*cmp)(const void *, const void *))
+void git_stable_qsort(void *b, size_t n, size_t s,
+		      int (*cmp)(const void *, const void *))
 {
 	const size_t size = st_mult(n, s);
 	char buf[1024];
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/2] diffcore_rename(): use a stable sort
  2019-09-30 17:21 ` [PATCH v2 0/2] Git's rename detection requires " Johannes Schindelin via GitGitGadget
  2019-09-30 17:21   ` [PATCH v2 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
@ 2019-09-30 17:21   ` Johannes Schindelin via GitGitGadget
  2019-10-02  5:47   ` [PATCH v2 0/2] Git's rename detection requires " Junio C Hamano
  2 siblings, 0 replies; 9+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-09-30 17:21 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

During Git's rename detection, the file names are sorted. At the moment,
this job is performed by `qsort()`. As that function is not guaranteed
to implement a stable sort algorithm, this can lead to inconsistent
and/or surprising behavior: a rename might be detected differently
depending on the platform where Git was run.

The `qsort()` in MS Visual C's runtime does _not_ implement a stable
sort algorithm, and it even leads to an inconsistency leading to a test
failure in t3030.35 "merge-recursive remembers the names of all base
trees": a different code path than on Linux is taken in the rename
detection of an ambiguous rename between either `e` to `a` or
`a~Temporary merge branch 2_0` to `a` during a recursive merge,
unexpectedly resulting in a clean merge.

Let's use the stable sort provided by `git_stable_qsort()` to avoid this
inconsistency.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 diffcore-rename.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 9624864858..9042936aba 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -585,7 +585,7 @@ void diffcore_rename(struct diff_options *options)
 	stop_progress(&progress);
 
 	/* cost matrix sorted by most to least similar pair */
-	QSORT(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare);
+	STABLE_QSORT(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare);
 
 	rename_count += find_renames(mx, dst_cnt, minimum_score, 0);
 	if (detect_rename == DIFF_DETECT_COPY)
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/2] Git's rename detection requires a stable sort
  2019-09-30 17:21 ` [PATCH v2 0/2] Git's rename detection requires " Johannes Schindelin via GitGitGadget
  2019-09-30 17:21   ` [PATCH v2 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
  2019-09-30 17:21   ` [PATCH v2 2/2] diffcore_rename(): use a stable sort Johannes Schindelin via GitGitGadget
@ 2019-10-02  5:47   ` Junio C Hamano
  2 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2019-10-02  5:47 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> With the en/merge-recursive-cleanup patches already having advanced to next,
> the problem I discovered when rebasing Git for Windows' branch thicket
> becomes quite relevant now: t3030.35 fails consistently in the MSVC build &
> test (this part of the Azure Pipeline will be upstreamed later).
>
> The solution: use a stable sort.
>
> Note: this patch series is based on top of en/merge-recursive-cleanup.
>
> Changes since v1:
>
>  * The function was renamed to git_stable_qsort(), as per Junio's
>    suggestion.
>
> Johannes Schindelin (2):
>   Move git_sort(), a stable sort, into into libgit.a
>   diffcore_rename(): use a stable sort
>
>  Makefile                         |  2 +-
>  compat/mingw.c                   | 11 +++--------
>  diffcore-rename.c                |  2 +-
>  git-compat-util.h                |  9 ++++++---
>  compat/qsort.c => stable-qsort.c |  6 +++---
>  5 files changed, 14 insertions(+), 16 deletions(-)
>  rename compat/qsort.c => stable-qsort.c (89%)

Thanks.  All looked sensible.


>
>  1:  1202809db7 < -:  ---------- Move git_sort(), a stable sort, into into libgit.a
>  -:  ---------- > 1:  8a99008a64 Move git_sort(), a stable sort, into into libgit.a

This is where the current range-diff could be improved to shine
more.  If it were allowed to compute a pairwise diff with rename
detection, readers would be able to see that qsort.c from version 1
is renamed to stable-qsort.c in version 2 with minimum adjustment of
one function name.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-10-02  5:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-25  8:35 [PATCH 0/2] Git's rename detection requires a stable sort Johannes Schindelin via GitGitGadget
2019-09-25  8:36 ` [PATCH 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
2019-09-28 10:14   ` Junio C Hamano
2019-09-30 11:38     ` Johannes Schindelin
2019-09-25  8:36 ` [PATCH 2/2] diffcore_rename(): use a stable sort Johannes Schindelin via GitGitGadget
2019-09-30 17:21 ` [PATCH v2 0/2] Git's rename detection requires " Johannes Schindelin via GitGitGadget
2019-09-30 17:21   ` [PATCH v2 1/2] Move git_sort(), a stable sort, into into libgit.a Johannes Schindelin via GitGitGadget
2019-09-30 17:21   ` [PATCH v2 2/2] diffcore_rename(): use a stable sort Johannes Schindelin via GitGitGadget
2019-10-02  5:47   ` [PATCH v2 0/2] Git's rename detection requires " Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).