git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] merge-recursive.c: use QSORT macro
@ 2016-11-22 12:30 Nguyễn Thái Ngọc Duy
  2016-11-22 17:49 ` Jeff King
  2016-11-24 11:45 ` [PATCH v2] merge-recursive.c: use string_list_sort instead of qsort Nguyễn Thái Ngọc Duy
  0 siblings, 2 replies; 8+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-11-22 12:30 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, Nguyễn Thái Ngọc Duy

This is the follow up of rs/qsort series, merged in b8688ad (Merge
branch 'rs/qsort' - 2016-10-10), where coccinelle was used to do
automatic transformation.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
  coccinelle missed this place, understandably, because it can't know
  that
  
      sizeof(*entries->items)
  
  is the same as
  
      sizeof(*df_name_compare.items)
  
  without some semantic analysis.

 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 9041c2f..2d4dca9 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -451,7 +451,7 @@ static void record_df_conflict_files(struct merge_options *o,
 		string_list_append(&df_sorted_entries, next->string)->util =
 				   next->util;
 	}
-	qsort(df_sorted_entries.items, entries->nr, sizeof(*entries->items),
+	QSORT(df_sorted_entries.items, entries->nr,
 	      string_list_df_name_compare);
 
 	string_list_clear(&o->df_conflict_file_set, 1);
-- 
2.8.2.524.g6ff3d78


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] merge-recursive.c: use QSORT macro
  2016-11-22 12:30 [PATCH] merge-recursive.c: use QSORT macro Nguyễn Thái Ngọc Duy
@ 2016-11-22 17:49 ` Jeff King
  2016-11-23  9:43   ` Duy Nguyen
  2016-11-23 17:21   ` Junio C Hamano
  2016-11-24 11:45 ` [PATCH v2] merge-recursive.c: use string_list_sort instead of qsort Nguyễn Thái Ngọc Duy
  1 sibling, 2 replies; 8+ messages in thread
From: Jeff King @ 2016-11-22 17:49 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, René Scharfe

On Tue, Nov 22, 2016 at 07:30:19PM +0700, Nguyễn Thái Ngọc Duy wrote:

> This is the follow up of rs/qsort series, merged in b8688ad (Merge
> branch 'rs/qsort' - 2016-10-10), where coccinelle was used to do
> automatic transformation.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>   coccinelle missed this place, understandably, because it can't know
>   that
>   
>       sizeof(*entries->items)
>   
>   is the same as
>   
>       sizeof(*df_name_compare.items)
>   
>   without some semantic analysis.

That made me wonder why "entries" is used at all. Does it point to the
same struct? But no, df_name_compare is a string list we create with the
same list of strings.

Which is why...

> -	qsort(df_sorted_entries.items, entries->nr, sizeof(*entries->items),
> +	QSORT(df_sorted_entries.items, entries->nr,
>  	      string_list_df_name_compare);

...it's OK to use entries->nr here, and not df_sorted_entries.nr. It
still seems a bit odd, though. Maybe it's worth making this:

  QSORT(df_sorted_entries.items, df_sorted_entries.nr,
        string_list_df_name_compare);

while we're at it. Another possibility is:

  df_sorted_entries.cmp = string_list_df_name_compare;
  string_list_sort(&df_sorted_entries);

It's not any shorter, but maybe it's conceptually simpler.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] merge-recursive.c: use QSORT macro
  2016-11-22 17:49 ` Jeff King
@ 2016-11-23  9:43   ` Duy Nguyen
  2016-11-23 17:21   ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Duy Nguyen @ 2016-11-23  9:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, René Scharfe

On Wed, Nov 23, 2016 at 12:49 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Nov 22, 2016 at 07:30:19PM +0700, Nguyễn Thái Ngọc Duy wrote:
>
>> This is the follow up of rs/qsort series, merged in b8688ad (Merge
>> branch 'rs/qsort' - 2016-10-10), where coccinelle was used to do
>> automatic transformation.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>>   coccinelle missed this place, understandably, because it can't know
>>   that
>>
>>       sizeof(*entries->items)
>>
>>   is the same as
>>
>>       sizeof(*df_name_compare.items)
>>
>>   without some semantic analysis.
>
> That made me wonder why "entries" is used at all. Does it point to the
> same struct? But no, df_name_compare is a string list we create with the
> same list of strings.
>
> Which is why...
>
>> -     qsort(df_sorted_entries.items, entries->nr, sizeof(*entries->items),
>> +     QSORT(df_sorted_entries.items, entries->nr,
>>             string_list_df_name_compare);
>
> ...it's OK to use entries->nr here, and not df_sorted_entries.nr. It
> still seems a bit odd, though.

Argh.. I completely overlooked that entries->nr !

> Maybe it's worth making this:
>
>   QSORT(df_sorted_entries.items, df_sorted_entries.nr,
>         string_list_df_name_compare);
>
> while we're at it. Another possibility is:
>
>   df_sorted_entries.cmp = string_list_df_name_compare;
>   string_list_sort(&df_sorted_entries);
>
> It's not any shorter, but maybe it's conceptually simpler.

Agreed. Shall I re-roll with this?
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] merge-recursive.c: use QSORT macro
  2016-11-22 17:49 ` Jeff King
  2016-11-23  9:43   ` Duy Nguyen
@ 2016-11-23 17:21   ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2016-11-23 17:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Nguyễn Thái Ngọc Duy, git, René Scharfe

Jeff King <peff@peff.net> writes:

> Another possibility is:
>
>   df_sorted_entries.cmp = string_list_df_name_compare;
>   string_list_sort(&df_sorted_entries);
>
> It's not any shorter, but maybe it's conceptually simpler.

My first reaction to Duy's patch was: it is moronic for the
string-list API not to offer "I've done _append() to add many items
while avoiding the overhead of doing insertion sort all the time;
now I finished adding and I want the result sorted".

And then I looked at string-list.h and there it was ;-)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] merge-recursive.c: use string_list_sort instead of qsort
  2016-11-22 12:30 [PATCH] merge-recursive.c: use QSORT macro Nguyễn Thái Ngọc Duy
  2016-11-22 17:49 ` Jeff King
@ 2016-11-24 11:45 ` Nguyễn Thái Ngọc Duy
  2016-11-24 20:52   ` Jeff King
  1 sibling, 1 reply; 8+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-11-24 11:45 UTC (permalink / raw)
  To: git
  Cc: René Scharfe, Jeff King, Junio C Hamano,
	Nguyễn Thái Ngọc Duy

This started out to as a hunt for remaining qsort() calls after rs/qsort
series because qsort() API is a bit easy to get wrong (*). However,
since we have string_list_sort(), it's conceptually a better way to sort
here.

(*) In this particular case, it's even more confusing when you sort one
variable but you use the number of items and item size from an unrelated
variable (from a first glance)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 merge-recursive.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 9041c2f..90e83bd 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -388,12 +388,10 @@ static struct string_list *get_unmerged(void)
 	return unmerged;
 }
 
-static int string_list_df_name_compare(const void *a, const void *b)
+static int string_list_df_name_compare(const char *one, const char *two)
 {
-	const struct string_list_item *one = a;
-	const struct string_list_item *two = b;
-	int onelen = strlen(one->string);
-	int twolen = strlen(two->string);
+	int onelen = strlen(one);
+	int twolen = strlen(two);
 	/*
 	 * Here we only care that entries for D/F conflicts are
 	 * adjacent, in particular with the file of the D/F conflict
@@ -406,8 +404,8 @@ static int string_list_df_name_compare(const void *a, const void *b)
 	 * since in other cases any changes in their order due to
 	 * sorting cause no problems for us.
 	 */
-	int cmp = df_name_compare(one->string, onelen, S_IFDIR,
-				  two->string, twolen, S_IFDIR);
+	int cmp = df_name_compare(one, onelen, S_IFDIR,
+				  two, twolen, S_IFDIR);
 	/*
 	 * Now that 'foo' and 'foo/bar' compare equal, we have to make sure
 	 * that 'foo' comes before 'foo/bar'.
@@ -451,8 +449,8 @@ static void record_df_conflict_files(struct merge_options *o,
 		string_list_append(&df_sorted_entries, next->string)->util =
 				   next->util;
 	}
-	qsort(df_sorted_entries.items, entries->nr, sizeof(*entries->items),
-	      string_list_df_name_compare);
+	df_sorted_entries.cmp = string_list_df_name_compare;
+	string_list_sort(&df_sorted_entries);
 
 	string_list_clear(&o->df_conflict_file_set, 1);
 	for (i = 0; i < df_sorted_entries.nr; i++) {
-- 
2.8.2.524.g6ff3d78


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] merge-recursive.c: use string_list_sort instead of qsort
  2016-11-24 11:45 ` [PATCH v2] merge-recursive.c: use string_list_sort instead of qsort Nguyễn Thái Ngọc Duy
@ 2016-11-24 20:52   ` Jeff King
  2016-11-25 12:15     ` Duy Nguyen
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2016-11-24 20:52 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, René Scharfe, Junio C Hamano

On Thu, Nov 24, 2016 at 06:45:36PM +0700, Nguyễn Thái Ngọc Duy wrote:

> This started out to as a hunt for remaining qsort() calls after rs/qsort
> series because qsort() API is a bit easy to get wrong (*). However,
> since we have string_list_sort(), it's conceptually a better way to sort
> here.
> 
> (*) In this particular case, it's even more confusing when you sort one
> variable but you use the number of items and item size from an unrelated
> variable (from a first glance)

Makes sense, though I think I probably would have explained it in
reverse order:

  Merge-recursive sorts a string list using a raw qsort(), where it
  feeds the "items" from one struct but the "nr" and size fields from
  another struct. This isn't a bug because one list is a copy of the
  other, but it's unnecessarily confusing (and also caused our recent
  QSORT() cleanups via coccinelle to miss this call site).

  Let's use string_list_sort() instead, which is more concise and harder
  to get wrong. Note that we need to adjust our comparison function,
  which gets fed only the strings now, not the string_list_items. That's
  OK because we don't use the "util" field as part of our sort.

Feel free to use or ignore my description as you see fit. :)

> -static int string_list_df_name_compare(const void *a, const void *b)
> +static int string_list_df_name_compare(const char *one, const char *two)
>  {
> -	const struct string_list_item *one = a;
> -	const struct string_list_item *two = b;
> -	int onelen = strlen(one->string);
> -	int twolen = strlen(two->string);
> +	int onelen = strlen(one);
> +	int twolen = strlen(two);

I guess I haven't used string_list_sort() in a while, but I was
surprised to find that it just feeds the strings to the comparator. That
makes sense for using a raw strcmp() as the comparator, but I wonder if
any callers would ever want to take the util field into account (e.g.,
to break ties).

We don't seem to care here, though (which can be verified by reading the
code, but also because any mention of one->util would be a compilation
error after your patch). So I guess we can punt on it until the day that
some caller does need it.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] merge-recursive.c: use string_list_sort instead of qsort
  2016-11-24 20:52   ` Jeff King
@ 2016-11-25 12:15     ` Duy Nguyen
  2016-11-25 17:15       ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Duy Nguyen @ 2016-11-25 12:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, René Scharfe, Junio C Hamano

On Fri, Nov 25, 2016 at 3:52 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Nov 24, 2016 at 06:45:36PM +0700, Nguyễn Thái Ngọc Duy wrote:
>
>> This started out to as a hunt for remaining qsort() calls after rs/qsort
>> series because qsort() API is a bit easy to get wrong (*). However,
>> since we have string_list_sort(), it's conceptually a better way to sort
>> here.
>>
>> (*) In this particular case, it's even more confusing when you sort one
>> variable but you use the number of items and item size from an unrelated
>> variable (from a first glance)
>
> Makes sense, though I think I probably would have explained it in
> reverse order:
>
>   Merge-recursive sorts a string list using a raw qsort(), where it
>   feeds the "items" from one struct but the "nr" and size fields from
>   another struct. This isn't a bug because one list is a copy of the
>   other, but it's unnecessarily confusing (and also caused our recent
>   QSORT() cleanups via coccinelle to miss this call site).
>
>   Let's use string_list_sort() instead, which is more concise and harder
>   to get wrong. Note that we need to adjust our comparison function,
>   which gets fed only the strings now, not the string_list_items. That's
>   OK because we don't use the "util" field as part of our sort.
>
> Feel free to use or ignore my description as you see fit. :)

I delegate the decision to Junio. He can amend the commit if he
decides so. I suspect it's a good idea to do so.

>> -static int string_list_df_name_compare(const void *a, const void *b)
>> +static int string_list_df_name_compare(const char *one, const char *two)
>>  {
>> -     const struct string_list_item *one = a;
>> -     const struct string_list_item *two = b;
>> -     int onelen = strlen(one->string);
>> -     int twolen = strlen(two->string);
>> +     int onelen = strlen(one);
>> +     int twolen = strlen(two);
>
> I guess I haven't used string_list_sort() in a while, but I was
> surprised to find that it just feeds the strings to the comparator. That
> makes sense for using a raw strcmp() as the comparator, but I wonder if
> any callers would ever want to take the util field into account (e.g.,
> to break ties).
>
> We don't seem to care here, though (which can be verified by reading the
> code, but also because any mention of one->util would be a compilation
> error after your patch). So I guess we can punt on it until the day that
> some caller does need it.

Some callers do need it, or at least fmt-merge-msg.c:add_people_info()
does, maybe builtin/remote.c:show() and shortlog.c:shortlog_output()
too. But I'll stop here and get back to my worktree stuff.
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] merge-recursive.c: use string_list_sort instead of qsort
  2016-11-25 12:15     ` Duy Nguyen
@ 2016-11-25 17:15       ` Jeff King
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2016-11-25 17:15 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, René Scharfe, Junio C Hamano

On Fri, Nov 25, 2016 at 07:15:15PM +0700, Duy Nguyen wrote:

> > I guess I haven't used string_list_sort() in a while, but I was
> > surprised to find that it just feeds the strings to the comparator. That
> > makes sense for using a raw strcmp() as the comparator, but I wonder if
> > any callers would ever want to take the util field into account (e.g.,
> > to break ties).
> >
> > We don't seem to care here, though (which can be verified by reading the
> > code, but also because any mention of one->util would be a compilation
> > error after your patch). So I guess we can punt on it until the day that
> > some caller does need it.
> 
> Some callers do need it, or at least fmt-merge-msg.c:add_people_info()
> does, maybe builtin/remote.c:show() and shortlog.c:shortlog_output()
> too. But I'll stop here and get back to my worktree stuff.

I started to work on this, figuring it would be a nice warm-up for the
day. But it actually is a little complicated, and I think not worth
doing. :)

The obvious backwards-compatible way to do it is to add a "cmp_item"
field to the string list. Sorting should use that if non-NULL, and
fallback to the string-oriented "cmp" otherwise.

And that does work when you want to sort via string_list_sort, like:

  authors->cmp_item = cmp_string_list_util_as_integral;
  string_list_sort(authors);

(the example is from fmt-merge-message.c). But the original use of
sorting in string-list was to keep a sorted list as you go with
string_list_insert(). And in that call we have _only_ the newly added
string, and the caller has not yet had an opportunity to set the util
field. So:

  struct string_list list = STRING_LIST_INIT_DUP;
  list.cmp_item = cmp_util_fields;
  for (...)
	string_list_insert(&list, foo[i])->util = bar[i];

is nonsense. It would always see a NULL util field during the
comparison.

Certainly "don't do that" is a possible answer. But it's just a bad
interface. It encourages a nonsensical use, and it makes a natural use
(sorting after the fact) more clunky by making the caller set a field in
the struct rather than pass a parameter. The correct interface is more
like:

  string_list_sort_items(authors, cmp_string_list_util_as_integral);

but then we are not really saving much over the more generic:

  QSORT(authors->items, authors->nr, cmp_string_list_util_as_integral);

So I'm inclined to leave it as-is.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-11-25 17:23 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-22 12:30 [PATCH] merge-recursive.c: use QSORT macro Nguyễn Thái Ngọc Duy
2016-11-22 17:49 ` Jeff King
2016-11-23  9:43   ` Duy Nguyen
2016-11-23 17:21   ` Junio C Hamano
2016-11-24 11:45 ` [PATCH v2] merge-recursive.c: use string_list_sort instead of qsort Nguyễn Thái Ngọc Duy
2016-11-24 20:52   ` Jeff King
2016-11-25 12:15     ` Duy Nguyen
2016-11-25 17:15       ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).