git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 1/3] add QSORT
@ 2016-09-29 15:23 René Scharfe
  2016-09-29 15:27 ` [PATCH 2/3] use QSORT René Scharfe
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: René Scharfe @ 2016-09-29 15:23 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
size of the array elements and supports the convention of initializing
empty arrays with a NULL pointer, which we use in some places.

Calling qsort(3) directly with a NULL pointer is undefined -- even with
an element count of zero -- and allows the compiler to optimize away any
following NULL checks.  Using the macro avoids such surprises.

Add a semantic patch as well to demonstrate the macro's usage and to
automate the transformation of trivial cases.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 contrib/coccinelle/qsort.cocci | 19 +++++++++++++++++++
 git-compat-util.h              |  8 ++++++++
 2 files changed, 27 insertions(+)
 create mode 100644 contrib/coccinelle/qsort.cocci

diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci
new file mode 100644
index 0000000..a094e7c
--- /dev/null
+++ b/contrib/coccinelle/qsort.cocci
@@ -0,0 +1,19 @@
+@@
+expression base, nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(*base), compar);
++ QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(base[0]), compar);
++ QSORT(base, nmemb, compar);
+
+@@
+type T;
+T *base;
+expression nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(T), compar);
++ QSORT(base, nmemb, compar);
diff --git a/git-compat-util.h b/git-compat-util.h
index 8aab0c3..d7ed137 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -977,6 +977,14 @@ void git_qsort(void *base, size_t nmemb, size_t size,
 #define qsort git_qsort
 #endif
 
+#define QSORT(base, n, compar) sane_qsort((base), (n), sizeof(*(base)), compar)
+static void inline sane_qsort(void *base, size_t nmemb, size_t size,
+			      int(*compar)(const void *, const void *))
+{
+	if (nmemb > 1)
+		qsort(base, nmemb, size, compar);
+}
+
 #ifndef REG_STARTEND
 #error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd"
 #endif
-- 
2.10.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] use QSORT
  2016-09-29 15:23 [PATCH 1/3] add QSORT René Scharfe
@ 2016-09-29 15:27 ` René Scharfe
  2016-09-29 15:29 ` [PATCH 3/3] remove unnecessary check before QSORT René Scharfe
  2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano
  2 siblings, 0 replies; 13+ messages in thread
From: René Scharfe @ 2016-09-29 15:27 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT.  The resulting code is
shorter and supports empty arrays with NULL pointers.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
Freshly generated using coccicheck, compiles, survives make test.

 bisect.c                             |  2 +-
 builtin/describe.c                   |  2 +-
 builtin/fast-export.c                |  2 +-
 builtin/fmt-merge-msg.c              |  6 ++----
 builtin/index-pack.c                 |  8 +++-----
 builtin/mktree.c                     |  2 +-
 builtin/name-rev.c                   |  3 +--
 builtin/pack-objects.c               |  7 +++----
 builtin/remote.c                     |  3 +--
 diff.c                               |  6 +++---
 diffcore-delta.c                     |  5 +----
 diffcore-order.c                     |  2 +-
 diffcore-rename.c                    |  2 +-
 dir.c                                |  4 ++--
 fast-import.c                        |  4 ++--
 fetch-pack.c                         |  2 +-
 help.c                               | 15 +++++----------
 line-log.c                           |  2 +-
 pack-bitmap-write.c                  |  3 +--
 pack-check.c                         |  2 +-
 pack-write.c                         |  3 +--
 pathspec.c                           |  3 +--
 ref-filter.c                         |  2 +-
 refs/files-backend.c                 |  2 +-
 server-info.c                        |  2 +-
 sh-i18n--envsubst.c                  |  2 +-
 sha1-array.c                         |  2 +-
 string-list.c                        |  2 +-
 t/helper/test-dump-untracked-cache.c |  6 ++----
 tree.c                               |  3 +--
 30 files changed, 44 insertions(+), 65 deletions(-)

diff --git a/bisect.c b/bisect.c
index 6f512c2..21bc6da 100644
--- a/bisect.c
+++ b/bisect.c
@@ -215,7 +215,7 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n
 		array[cnt].distance = distance;
 		cnt++;
 	}
-	qsort(array, cnt, sizeof(*array), compare_commit_dist);
+	QSORT(array, cnt, compare_commit_dist);
 	for (p = list, i = 0; i < cnt; i++) {
 		char buf[100]; /* enough for dist=%d */
 		struct object *obj = &(array[i].commit->object);
diff --git a/builtin/describe.c b/builtin/describe.c
index 8a25abe..01490a1 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -352,7 +352,7 @@ static void describe(const char *arg, int last_one)
 			    oid_to_hex(oid));
 	}
 
-	qsort(all_matches, match_cnt, sizeof(all_matches[0]), compare_pt);
+	QSORT(all_matches, match_cnt, compare_pt);
 
 	if (gave_up_on) {
 		commit_list_insert_by_date(gave_up_on, &list);
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index c0652a7..1e815b5 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -347,7 +347,7 @@ static void show_filemodify(struct diff_queue_struct *q,
 	 * Handle files below a directory first, in case they are all deleted
 	 * and the directory changes to a file or symlink.
 	 */
-	qsort(q->queue, q->nr, sizeof(q->queue[0]), depth_first);
+	QSORT(q->queue, q->nr, depth_first);
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filespec *ospec = q->queue[i]->one;
diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index dc2e9e4..4976967 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -315,12 +315,10 @@ static void add_people_info(struct strbuf *out,
 			    struct string_list *committers)
 {
 	if (authors->nr)
-		qsort(authors->items,
-		      authors->nr, sizeof(authors->items[0]),
+		QSORT(authors->items, authors->nr,
 		      cmp_string_list_util_as_integral);
 	if (committers->nr)
-		qsort(committers->items,
-		      committers->nr, sizeof(committers->items[0]),
+		QSORT(committers->items, committers->nr,
 		      cmp_string_list_util_as_integral);
 
 	credit_people(out, authors, 'a');
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4a8b4ae..7657d0a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1190,10 +1190,8 @@ static void resolve_deltas(void)
 		return;
 
 	/* Sort deltas by base SHA1/offset for fast searching */
-	qsort(ofs_deltas, nr_ofs_deltas, sizeof(struct ofs_delta_entry),
-	      compare_ofs_delta_entry);
-	qsort(ref_deltas, nr_ref_deltas, sizeof(struct ref_delta_entry),
-	      compare_ref_delta_entry);
+	QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry);
+	QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry);
 
 	if (verbose || show_resolving_progress)
 		progress = start_progress(_("Resolving deltas"),
@@ -1356,7 +1354,7 @@ static void fix_unresolved_deltas(struct sha1file *f)
 	ALLOC_ARRAY(sorted_by_pos, nr_ref_deltas);
 	for (i = 0; i < nr_ref_deltas; i++)
 		sorted_by_pos[i] = &ref_deltas[i];
-	qsort(sorted_by_pos, nr_ref_deltas, sizeof(*sorted_by_pos), delta_pos_compare);
+	QSORT(sorted_by_pos, nr_ref_deltas, delta_pos_compare);
 
 	for (i = 0; i < nr_ref_deltas; i++) {
 		struct ref_delta_entry *d = sorted_by_pos[i];
diff --git a/builtin/mktree.c b/builtin/mktree.c
index 4282b62..de9b40f 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -46,7 +46,7 @@ static void write_tree(unsigned char *sha1)
 	size_t size;
 	int i;
 
-	qsort(entries, used, sizeof(*entries), ent_compare);
+	QSORT(entries, used, ent_compare);
 	for (size = i = 0; i < used; i++)
 		size += 32 + entries[i]->len;
 
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 57be35f..cd89d48 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -195,8 +195,7 @@ static const char *get_exact_ref_match(const struct object *o)
 		return NULL;
 
 	if (!tip_table.sorted) {
-		qsort(tip_table.table, tip_table.nr, sizeof(*tip_table.table),
-		      tipcmp);
+		QSORT(tip_table.table, tip_table.nr, tipcmp);
 		tip_table.sorted = 1;
 	}
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 166e52c..8aeba6a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1535,7 +1535,7 @@ static void get_object_details(void)
 	sorted_by_offset = xcalloc(to_pack.nr_objects, sizeof(struct object_entry *));
 	for (i = 0; i < to_pack.nr_objects; i++)
 		sorted_by_offset[i] = to_pack.objects + i;
-	qsort(sorted_by_offset, to_pack.nr_objects, sizeof(*sorted_by_offset), pack_offset_sort);
+	QSORT(sorted_by_offset, to_pack.nr_objects, pack_offset_sort);
 
 	for (i = 0; i < to_pack.nr_objects; i++) {
 		struct object_entry *entry = sorted_by_offset[i];
@@ -2257,7 +2257,7 @@ static void prepare_pack(int window, int depth)
 		if (progress)
 			progress_state = start_progress(_("Compressing objects"),
 							nr_deltas);
-		qsort(delta_list, n, sizeof(*delta_list), type_size_sort);
+		QSORT(delta_list, n, type_size_sort);
 		ll_find_deltas(delta_list, n, window+1, depth, &nr_done);
 		stop_progress(&progress_state);
 		if (nr_done != nr_deltas)
@@ -2449,8 +2449,7 @@ static void add_objects_in_unpacked_packs(struct rev_info *revs)
 	}
 
 	if (in_pack.nr) {
-		qsort(in_pack.array, in_pack.nr, sizeof(in_pack.array[0]),
-		      ofscmp);
+		QSORT(in_pack.array, in_pack.nr, ofscmp);
 		for (i = 0; i < in_pack.nr; i++) {
 			struct object *o = in_pack.array[i].object;
 			add_object_entry(o->oid.hash, o->type, "", 0);
diff --git a/builtin/remote.c b/builtin/remote.c
index 9f6a6b3..e52cf39 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -1197,8 +1197,7 @@ static int show(int argc, const char **argv)
 
 		info.width = info.width2 = 0;
 		for_each_string_list(&states.push, add_push_to_show_info, &info);
-		qsort(info.list->items, info.list->nr,
-			sizeof(*info.list->items), cmp_string_with_push);
+		QSORT(info.list->items, info.list->nr, cmp_string_with_push);
 		if (info.list->nr)
 			printf_ln(Q_("  Local ref configured for 'git push'%s:",
 				     "  Local refs configured for 'git push'%s:",
diff --git a/diff.c b/diff.c
index a178ed3..c2f09fb 100644
--- a/diff.c
+++ b/diff.c
@@ -2019,7 +2019,7 @@ static void show_dirstat(struct diff_options *options)
 		return;
 
 	/* Show all directories with more than x% of the changes */
-	qsort(dir.files, dir.nr, sizeof(dir.files[0]), dirstat_compare);
+	QSORT(dir.files, dir.nr, dirstat_compare);
 	gather_dirstat(options, &dir, changed, "", 0);
 }
 
@@ -2063,7 +2063,7 @@ static void show_dirstat_by_line(struct diffstat_t *data, struct diff_options *o
 		return;
 
 	/* Show all directories with more than x% of the changes */
-	qsort(dir.files, dir.nr, sizeof(dir.files[0]), dirstat_compare);
+	QSORT(dir.files, dir.nr, dirstat_compare);
 	gather_dirstat(options, &dir, changed, "", 0);
 }
 
@@ -4923,7 +4923,7 @@ static int diffnamecmp(const void *a_, const void *b_)
 void diffcore_fix_diff_index(struct diff_options *options)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
-	qsort(q->queue, q->nr, sizeof(q->queue[0]), diffnamecmp);
+	QSORT(q->queue, q->nr, diffnamecmp);
 }
 
 void diffcore_std(struct diff_options *options)
diff --git a/diffcore-delta.c b/diffcore-delta.c
index 4159748..2ebedb3 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -158,10 +158,7 @@ static struct spanhash_top *hash_chars(struct diff_filespec *one)
 		n = 0;
 		accum1 = accum2 = 0;
 	}
-	qsort(hash->data,
-		1ul << hash->alloc_log2,
-		sizeof(hash->data[0]),
-		spanhash_cmp);
+	QSORT(hash->data, 1ul << hash->alloc_log2, spanhash_cmp);
 	return hash;
 }
 
diff --git a/diffcore-order.c b/diffcore-order.c
index 69d41f7..1957f82 100644
--- a/diffcore-order.c
+++ b/diffcore-order.c
@@ -101,7 +101,7 @@ void order_objects(const char *orderfile, obj_path_fn_t obj_path,
 		objs[i].orig_order = i;
 		objs[i].order = match_order(obj_path(objs[i].obj));
 	}
-	qsort(objs, nr, sizeof(*objs), compare_objs_order);
+	QSORT(objs, nr, compare_objs_order);
 }
 
 static const char *pair_pathtwo(void *obj)
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 73d003a..54a2396 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -580,7 +580,7 @@ void diffcore_rename(struct diff_options *options)
 	stop_progress(&progress);
 
 	/* cost matrix sorted by most to least similar pair */
-	qsort(mx, dst_cnt * NUM_CANDIDATE_PER_DST, sizeof(*mx), score_compare);
+	QSORT(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare);
 
 	rename_count += find_renames(mx, dst_cnt, minimum_score, 0);
 	if (detect_rename == DIFF_DETECT_COPY)
diff --git a/dir.c b/dir.c
index 9e09bcb..3bad1ad 100644
--- a/dir.c
+++ b/dir.c
@@ -2005,8 +2005,8 @@ int read_directory(struct dir_struct *dir, const char *path, int len, const stru
 	if (!len || treat_leading_path(dir, path, len, simplify))
 		read_directory_recursive(dir, path, len, untracked, 0, simplify);
 	free_simplify(simplify);
-	qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name);
-	qsort(dir->ignored, dir->ignored_nr, sizeof(struct dir_entry *), cmp_name);
+	QSORT(dir->entries, dir->nr, cmp_name);
+	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
 	if (dir->untracked) {
 		static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS);
 		trace_printf_key(&trace_untracked_stats,
diff --git a/fast-import.c b/fast-import.c
index bf53ac9..cb545d7 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1460,9 +1460,9 @@ static void mktree(struct tree_content *t, int v, struct strbuf *b)
 	unsigned int i;
 
 	if (!v)
-		qsort(t->entries,t->entry_count,sizeof(t->entries[0]),tecmp0);
+		QSORT(t->entries, t->entry_count, tecmp0);
 	else
-		qsort(t->entries,t->entry_count,sizeof(t->entries[0]),tecmp1);
+		QSORT(t->entries, t->entry_count, tecmp1);
 
 	for (i = 0; i < t->entry_count; i++) {
 		if (t->entries[i]->versions[v].mode)
diff --git a/fetch-pack.c b/fetch-pack.c
index 85e77af..8a38d30 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -812,7 +812,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	int agent_len;
 
 	sort_ref_list(&ref, ref_compare_name);
-	qsort(sought, nr_sought, sizeof(*sought), cmp_ref_by_name);
+	QSORT(sought, nr_sought, cmp_ref_by_name);
 
 	if ((args->depth > 0 || is_repository_shallow()) && !server_supports("shallow"))
 		die("Server does not support shallow clients");
diff --git a/help.c b/help.c
index 2ff3b5a..53e2a67 100644
--- a/help.c
+++ b/help.c
@@ -170,8 +170,7 @@ void load_command_list(const char *prefix,
 
 	if (exec_path) {
 		list_commands_in_dir(main_cmds, exec_path, prefix);
-		qsort(main_cmds->names, main_cmds->cnt,
-		      sizeof(*main_cmds->names), cmdname_compare);
+		QSORT(main_cmds->names, main_cmds->cnt, cmdname_compare);
 		uniq(main_cmds);
 	}
 
@@ -190,8 +189,7 @@ void load_command_list(const char *prefix,
 		}
 		free(paths);
 
-		qsort(other_cmds->names, other_cmds->cnt,
-		      sizeof(*other_cmds->names), cmdname_compare);
+		QSORT(other_cmds->names, other_cmds->cnt, cmdname_compare);
 		uniq(other_cmds);
 	}
 	exclude_cmds(other_cmds, main_cmds);
@@ -238,8 +236,7 @@ void list_common_cmds_help(void)
 			longest = strlen(common_cmds[i].name);
 	}
 
-	qsort(common_cmds, ARRAY_SIZE(common_cmds),
-		sizeof(common_cmds[0]), cmd_group_cmp);
+	QSORT(common_cmds, ARRAY_SIZE(common_cmds), cmd_group_cmp);
 
 	puts(_("These are common Git commands used in various situations:"));
 
@@ -324,8 +321,7 @@ const char *help_unknown_cmd(const char *cmd)
 
 	add_cmd_list(&main_cmds, &aliases);
 	add_cmd_list(&main_cmds, &other_cmds);
-	qsort(main_cmds.names, main_cmds.cnt,
-	      sizeof(*main_cmds.names), cmdname_compare);
+	QSORT(main_cmds.names, main_cmds.cnt, cmdname_compare);
 	uniq(&main_cmds);
 
 	/* This abuses cmdname->len for levenshtein distance */
@@ -359,8 +355,7 @@ const char *help_unknown_cmd(const char *cmd)
 			levenshtein(cmd, candidate, 0, 2, 1, 3) + 1;
 	}
 
-	qsort(main_cmds.names, main_cmds.cnt,
-	      sizeof(*main_cmds.names), levenshtein_compare);
+	QSORT(main_cmds.names, main_cmds.cnt, levenshtein_compare);
 
 	if (!main_cmds.cnt)
 		die(_("Uh oh. Your system reports no Git commands at all."));
diff --git a/line-log.c b/line-log.c
index 916e724..65f3558 100644
--- a/line-log.c
+++ b/line-log.c
@@ -113,7 +113,7 @@ void sort_and_merge_range_set(struct range_set *rs)
 	int i;
 	int o = 0; /* output cursor */
 
-	qsort(rs->ranges, rs->nr, sizeof(struct range), range_cmp);
+	QSORT(rs->ranges, rs->nr, range_cmp);
 
 	for (i = 0; i < rs->nr; i++) {
 		if (rs->ranges[i].start == rs->ranges[i].end)
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index c30bcd0..9705596 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -385,8 +385,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 {
 	unsigned int i = 0, j, next;
 
-	qsort(indexed_commits, indexed_commits_nr, sizeof(indexed_commits[0]),
-	      date_compare);
+	QSORT(indexed_commits, indexed_commits_nr, date_compare);
 
 	if (writer.show_progress)
 		writer.progress = start_progress("Selecting bitmap commits", 0);
diff --git a/pack-check.c b/pack-check.c
index d123846..72440a8 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -99,7 +99,7 @@ static int verify_packfile(struct packed_git *p,
 		entries[i].offset = nth_packed_object_offset(p, i);
 		entries[i].nr = i;
 	}
-	qsort(entries, nr_objects, sizeof(*entries), compare_entries);
+	QSORT(entries, nr_objects, compare_entries);
 
 	for (i = 0; i < nr_objects; i++) {
 		void *data;
diff --git a/pack-write.c b/pack-write.c
index ea0b788..88bc7f9 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -61,8 +61,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 			if (objects[i]->offset > last_obj_offset)
 				last_obj_offset = objects[i]->offset;
 		}
-		qsort(sorted_by_sha, nr_objects, sizeof(sorted_by_sha[0]),
-		      sha1_compare);
+		QSORT(sorted_by_sha, nr_objects, sha1_compare);
 	}
 	else
 		sorted_by_sha = list = last = NULL;
diff --git a/pathspec.c b/pathspec.c
index 24e0dd5..eda13b5 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -446,8 +446,7 @@ void parse_pathspec(struct pathspec *pathspec,
 	if (pathspec->magic & PATHSPEC_MAXDEPTH) {
 		if (flags & PATHSPEC_KEEP_ORDER)
 			die("BUG: PATHSPEC_MAXDEPTH_VALID and PATHSPEC_KEEP_ORDER are incompatible");
-		qsort(pathspec->items, pathspec->nr,
-		      sizeof(struct pathspec_item), pathspec_item_cmp);
+		QSORT(pathspec->items, pathspec->nr, pathspec_item_cmp);
 	}
 }
 
diff --git a/ref-filter.c b/ref-filter.c
index 9adbb8a..44029b0 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1573,7 +1573,7 @@ static int compare_refs(const void *a_, const void *b_)
 void ref_array_sort(struct ref_sorting *sorting, struct ref_array *array)
 {
 	ref_sorting = sorting;
-	qsort(array->items, array->nr, sizeof(struct ref_array_item *), compare_refs);
+	QSORT(array->items, array->nr, compare_refs);
 }
 
 static void append_literal(const char *cp, const char *ep, struct ref_formatting_state *state)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0709f60..d16feb1 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -501,7 +501,7 @@ static void sort_ref_dir(struct ref_dir *dir)
 	if (dir->sorted == dir->nr)
 		return;
 
-	qsort(dir->entries, dir->nr, sizeof(*dir->entries), ref_entry_cmp);
+	QSORT(dir->entries, dir->nr, ref_entry_cmp);
 
 	/* Remove any duplicates: */
 	for (i = 0, j = 0; j < dir->nr; j++) {
diff --git a/server-info.c b/server-info.c
index 75dd677..7bc4e75 100644
--- a/server-info.c
+++ b/server-info.c
@@ -229,7 +229,7 @@ static void init_pack_info(const char *infofile, int force)
 	}
 
 	/* renumber them */
-	qsort(info, num_pack, sizeof(info[0]), compare_info);
+	QSORT(info, num_pack, compare_info);
 	for (i = 0; i < num_pack; i++)
 		info[i]->new_num = i;
 }
diff --git a/sh-i18n--envsubst.c b/sh-i18n--envsubst.c
index e06b2c1..3637a2a 100644
--- a/sh-i18n--envsubst.c
+++ b/sh-i18n--envsubst.c
@@ -231,7 +231,7 @@ static inline void
 string_list_sort (string_list_ty *slp)
 {
   if (slp->nitems > 0)
-    qsort (slp->item, slp->nitems, sizeof (slp->item[0]), cmp_string);
+    QSORT(slp->item, slp->nitems, cmp_string);
 }
 
 /* Test whether a sorted string list contains a given string.  */
diff --git a/sha1-array.c b/sha1-array.c
index 6f4a224..21188de 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -16,7 +16,7 @@ static int void_hashcmp(const void *a, const void *b)
 
 static void sha1_array_sort(struct sha1_array *array)
 {
-	qsort(array->sha1, array->nr, sizeof(*array->sha1), void_hashcmp);
+	QSORT(array->sha1, array->nr, void_hashcmp);
 	array->sorted = 1;
 }
 
diff --git a/string-list.c b/string-list.c
index 62d2084..8c83cac 100644
--- a/string-list.c
+++ b/string-list.c
@@ -225,7 +225,7 @@ static int cmp_items(const void *a, const void *b)
 void string_list_sort(struct string_list *list)
 {
 	compare_for_qsort = list->cmp ? list->cmp : strcmp;
-	qsort(list->items, list->nr, sizeof(*list->items), cmp_items);
+	QSORT(list->items, list->nr, cmp_items);
 }
 
 struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
diff --git a/t/helper/test-dump-untracked-cache.c b/t/helper/test-dump-untracked-cache.c
index 50112cc..f752532 100644
--- a/t/helper/test-dump-untracked-cache.c
+++ b/t/helper/test-dump-untracked-cache.c
@@ -18,10 +18,8 @@ static int compare_dir(const void *a_, const void *b_)
 static void dump(struct untracked_cache_dir *ucd, struct strbuf *base)
 {
 	int i, len;
-	qsort(ucd->untracked, ucd->untracked_nr, sizeof(*ucd->untracked),
-	      compare_untracked);
-	qsort(ucd->dirs, ucd->dirs_nr, sizeof(*ucd->dirs),
-	      compare_dir);
+	QSORT(ucd->untracked, ucd->untracked_nr, compare_untracked);
+	QSORT(ucd->dirs, ucd->dirs_nr, compare_dir);
 	len = base->len;
 	strbuf_addf(base, "%s/", ucd->name);
 	printf("%s %s", base->buf,
diff --git a/tree.c b/tree.c
index 2b5a5a8..ce345c5 100644
--- a/tree.c
+++ b/tree.c
@@ -180,8 +180,7 @@ int read_tree(struct tree *tree, int stage, struct pathspec *match)
 	 * Sort the cache entry -- we need to nuke the cache tree, though.
 	 */
 	cache_tree_free(&active_cache_tree);
-	qsort(active_cache, active_nr, sizeof(active_cache[0]),
-	      cmp_cache_name_compare);
+	QSORT(active_cache, active_nr, cmp_cache_name_compare);
 	return 0;
 }
 
-- 
2.10.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] remove unnecessary check before QSORT
  2016-09-29 15:23 [PATCH 1/3] add QSORT René Scharfe
  2016-09-29 15:27 ` [PATCH 2/3] use QSORT René Scharfe
@ 2016-09-29 15:29 ` René Scharfe
  2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano
  2 siblings, 0 replies; 13+ messages in thread
From: René Scharfe @ 2016-09-29 15:29 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Add a semantic patch for removing checks similar to the one that QSORT
already does internally and apply it to the code base.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 builtin/fmt-merge-msg.c        | 10 ++++------
 contrib/coccinelle/qsort.cocci | 18 ++++++++++++++++++
 sh-i18n--envsubst.c            |  3 +--
 3 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index 4976967..efab62f 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -314,12 +314,10 @@ static void add_people_info(struct strbuf *out,
 			    struct string_list *authors,
 			    struct string_list *committers)
 {
-	if (authors->nr)
-		QSORT(authors->items, authors->nr,
-		      cmp_string_list_util_as_integral);
-	if (committers->nr)
-		QSORT(committers->items, committers->nr,
-		      cmp_string_list_util_as_integral);
+	QSORT(authors->items, authors->nr,
+	      cmp_string_list_util_as_integral);
+	QSORT(committers->items, committers->nr,
+	      cmp_string_list_util_as_integral);
 
 	credit_people(out, authors, 'a');
 	credit_people(out, committers, 'c');
diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci
index a094e7c..22b93a9 100644
--- a/contrib/coccinelle/qsort.cocci
+++ b/contrib/coccinelle/qsort.cocci
@@ -17,3 +17,21 @@ expression nmemb, compar;
 @@
 - qsort(base, nmemb, sizeof(T), compar);
 + QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb)
+    QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb > 0)
+    QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- if (nmemb > 1)
+    QSORT(base, nmemb, compar);
diff --git a/sh-i18n--envsubst.c b/sh-i18n--envsubst.c
index 3637a2a..c3a2b5a 100644
--- a/sh-i18n--envsubst.c
+++ b/sh-i18n--envsubst.c
@@ -230,8 +230,7 @@ cmp_string (const void *pstr1, const void *pstr2)
 static inline void
 string_list_sort (string_list_ty *slp)
 {
-  if (slp->nitems > 0)
-    QSORT(slp->item, slp->nitems, cmp_string);
+  QSORT(slp->item, slp->nitems, cmp_string);
 }
 
 /* Test whether a sorted string list contains a given string.  */
-- 
2.10.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-09-29 15:23 [PATCH 1/3] add QSORT René Scharfe
  2016-09-29 15:27 ` [PATCH 2/3] use QSORT René Scharfe
  2016-09-29 15:29 ` [PATCH 3/3] remove unnecessary check before QSORT René Scharfe
@ 2016-09-29 22:36 ` Junio C Hamano
  2016-09-29 23:21   ` René Scharfe
  2016-10-01 16:19   ` René Scharfe
  2 siblings, 2 replies; 13+ messages in thread
From: Junio C Hamano @ 2016-09-29 22:36 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

René Scharfe <l.s.r@web.de> writes:

> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
> size of the array elements and supports the convention of initializing
> empty arrays with a NULL pointer, which we use in some places.
>
> Calling qsort(3) directly with a NULL pointer is undefined -- even with
> an element count of zero -- and allows the compiler to optimize away any
> following NULL checks.  Using the macro avoids such surprises.
>
> Add a semantic patch as well to demonstrate the macro's usage and to
> automate the transformation of trivial cases.
>
> Signed-off-by: Rene Scharfe <l.s.r@web.de>
> ---
>  contrib/coccinelle/qsort.cocci | 19 +++++++++++++++++++
>  git-compat-util.h              |  8 ++++++++
>  2 files changed, 27 insertions(+)
>  create mode 100644 contrib/coccinelle/qsort.cocci

The direct calls to qsort(3) that this series leaves behind are
interesting.

1. builtin/index-pack.c has this:

	if (1 < opts->anomaly_nr)
		qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32);

where opts->anomaly is coming from pack.h:

    struct pack_idx_option {
            unsigned flags;
            ...
            int anomaly_alloc, anomaly_nr;
            uint32_t *anomaly;
    };

I cannot quite see how the automated conversion misses it?  It's not
like base and nmemb are type-restricted in the rule (they are both
just "expression"s).

2. builtin/shortlog.c has this:

	qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
	      log->summary ? compare_by_counter : compare_by_list);

where log->list is coming from shortlog.h:

    struct shortlog {
            struct string_list list;
    };

and string-list.h says:

    struct string_list {
            struct string_list_item *items;
            unsigned int nr, alloc;
            ...
    };

which seems to be a good candidate for this rule:

    type T;
    T *base;
    expression nmemb, compar;
    @@
    - qsort(base, nmemb, sizeof(T), compar);
    + QSORT(base, nmemb, compar);

if we take "T == struct string_list_item".

3. builtin/show-branch.c does this:

    qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
          compare_ref_name);

where ref_name[] is a file-scope global:

    static char *ref_name[MAX_REVS + 1];

and top and bottom are plain integers.  The sizeof() does not take
the size of *base, so it is understandable that this does not get
automatically converted.

It seems that some calls to this function _could_ send the same top
and bottom, asking for 0 element array to be sorted, by the way.

Thanks for an amusing read.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano
@ 2016-09-29 23:21   ` René Scharfe
  2016-09-29 23:40     ` René Scharfe
  2016-10-01 16:19   ` René Scharfe
  1 sibling, 1 reply; 13+ messages in thread
From: René Scharfe @ 2016-09-29 23:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
> 
>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
>> size of the array elements and supports the convention of initializing
>> empty arrays with a NULL pointer, which we use in some places.
>>
>> Calling qsort(3) directly with a NULL pointer is undefined -- even with
>> an element count of zero -- and allows the compiler to optimize away any
>> following NULL checks.  Using the macro avoids such surprises.
>>
>> Add a semantic patch as well to demonstrate the macro's usage and to
>> automate the transformation of trivial cases.
>>
>> Signed-off-by: Rene Scharfe <l.s.r@web.de>
>> ---
>>  contrib/coccinelle/qsort.cocci | 19 +++++++++++++++++++
>>  git-compat-util.h              |  8 ++++++++
>>  2 files changed, 27 insertions(+)
>>  create mode 100644 contrib/coccinelle/qsort.cocci
> 
> The direct calls to qsort(3) that this series leaves behind are
> interesting.
> 
> 1. builtin/index-pack.c has this:
> 
> 	if (1 < opts->anomaly_nr)
> 		qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32);
> 
> where opts->anomaly is coming from pack.h:
> 
>     struct pack_idx_option {
>             unsigned flags;
>             ...
>             int anomaly_alloc, anomaly_nr;
>             uint32_t *anomaly;
>     };
> 
> I cannot quite see how the automated conversion misses it?  It's not
> like base and nmemb are type-restricted in the rule (they are both
> just "expression"s).
> 
> 2. builtin/shortlog.c has this:
> 
> 	qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
> 	      log->summary ? compare_by_counter : compare_by_list);
> 
> where log->list is coming from shortlog.h:
> 
>     struct shortlog {
>             struct string_list list;
>     };
> 
> and string-list.h says:
> 
>     struct string_list {
>             struct string_list_item *items;
>             unsigned int nr, alloc;
>             ...
>     };
> 
> which seems to be a good candidate for this rule:
> 
>     type T;
>     T *base;
>     expression nmemb, compar;
>     @@
>     - qsort(base, nmemb, sizeof(T), compar);
>     + QSORT(base, nmemb, compar);
> 
> if we take "T == struct string_list_item".

Transformations for these two are generated if we pass --all-includes
to spatch.  So let's do that.

-- >8 --
Subject: [PATCH] coccicheck: use --all-includes by default

Add a make variable, SPATCH_FLAGS, for specifying flags for spatch, and
set it to --all-includes by default.  This option lets it consider
header files which would otherwise be ignored.  That's important for
some rules that rely on type information.  It doubles the duration of
coccicheck, however.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 1aad150..d15bf8d 100644
--- a/Makefile
+++ b/Makefile
@@ -467,6 +467,7 @@ SPATCH = spatch
 export TCL_PATH TCLTK_PATH
 
 SPARSE_FLAGS =
+SPATCH_FLAGS = --all-includes
 
 
 
@@ -2314,7 +2315,7 @@ C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
 %.cocci.patch: %.cocci $(C_SOURCES)
 	@echo '    ' SPATCH $<; \
 	for f in $(C_SOURCES); do \
-		$(SPATCH) --sp-file $< $$f; \
+		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \
 	done >$@ 2>$@.log; \
 	if test -s $@; \
 	then \
-- 
2.10.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-09-29 23:21   ` René Scharfe
@ 2016-09-29 23:40     ` René Scharfe
  0 siblings, 0 replies; 13+ messages in thread
From: René Scharfe @ 2016-09-29 23:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

Am 30.09.2016 um 01:21 schrieb René Scharfe:
> Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
>> René Scharfe <l.s.r@web.de> writes:
>>
>>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
>>> size of the array elements and supports the convention of initializing
>>> empty arrays with a NULL pointer, which we use in some places.
>>>
>>> Calling qsort(3) directly with a NULL pointer is undefined -- even with
>>> an element count of zero -- and allows the compiler to optimize away any
>>> following NULL checks.  Using the macro avoids such surprises.
>>>
>>> Add a semantic patch as well to demonstrate the macro's usage and to
>>> automate the transformation of trivial cases.
>>>
>>> Signed-off-by: Rene Scharfe <l.s.r@web.de>
>>> ---
>>>  contrib/coccinelle/qsort.cocci | 19 +++++++++++++++++++
>>>  git-compat-util.h              |  8 ++++++++
>>>  2 files changed, 27 insertions(+)
>>>  create mode 100644 contrib/coccinelle/qsort.cocci
>>
>> The direct calls to qsort(3) that this series leaves behind are
>> interesting.
>>
>> 1. builtin/index-pack.c has this:
>>
>> 	if (1 < opts->anomaly_nr)
>> 		qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32);
>>
>> where opts->anomaly is coming from pack.h:
>>
>>     struct pack_idx_option {
>>             unsigned flags;
>>             ...
>>             int anomaly_alloc, anomaly_nr;
>>             uint32_t *anomaly;
>>     };
>>
>> I cannot quite see how the automated conversion misses it?  It's not
>> like base and nmemb are type-restricted in the rule (they are both
>> just "expression"s).
>>
>> 2. builtin/shortlog.c has this:
>>
>> 	qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
>> 	      log->summary ? compare_by_counter : compare_by_list);
>>
>> where log->list is coming from shortlog.h:
>>
>>     struct shortlog {
>>             struct string_list list;
>>     };
>>
>> and string-list.h says:
>>
>>     struct string_list {
>>             struct string_list_item *items;
>>             unsigned int nr, alloc;
>>             ...
>>     };
>>
>> which seems to be a good candidate for this rule:
>>
>>     type T;
>>     T *base;
>>     expression nmemb, compar;
>>     @@
>>     - qsort(base, nmemb, sizeof(T), compar);
>>     + QSORT(base, nmemb, compar);
>>
>> if we take "T == struct string_list_item".
> 
> Transformations for these two are generated if we pass --all-includes
> to spatch.  So let's do that.

And here's the result:

-- >8 --
Subject: [PATCH] use QSORT, part 2

Convert two more qsort(3) calls to QSORT to reduce code size and for
better safety and consistency.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
Squashable.

 builtin/index-pack.c | 3 +--
 builtin/shortlog.c   | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7657d0a..0a27bab 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1531,8 +1531,7 @@ static void read_v2_anomalous_offsets(struct packed_git *p,
 		opts->anomaly[opts->anomaly_nr++] = ntohl(idx2[off * 2 + 1]);
 	}
 
-	if (1 < opts->anomaly_nr)
-		qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32);
+	QSORT(opts->anomaly, opts->anomaly_nr, cmp_uint32);
 }
 
 static void read_idx_option(struct pack_idx_option *opts, const char *pack_name)
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 25fa8a6..ba0e115 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -308,7 +308,7 @@ void shortlog_output(struct shortlog *log)
 	struct strbuf sb = STRBUF_INIT;
 
 	if (log->sort_by_number)
-		qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
+		QSORT(log->list.items, log->list.nr,
 		      log->summary ? compare_by_counter : compare_by_list);
 	for (i = 0; i < log->list.nr; i++) {
 		const struct string_list_item *item = &log->list.items[i];
-- 
2.10.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano
  2016-09-29 23:21   ` René Scharfe
@ 2016-10-01 16:19   ` René Scharfe
  2016-10-03 16:46     ` Kevin Bracey
  2016-10-03 17:09     ` Kevin Bracey
  1 sibling, 2 replies; 13+ messages in thread
From: René Scharfe @ 2016-10-01 16:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
> 3. builtin/show-branch.c does this:
> 
>     qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
>           compare_ref_name);
> 
> where ref_name[] is a file-scope global:
> 
>     static char *ref_name[MAX_REVS + 1];
> 
> and top and bottom are plain integers.  The sizeof() does not take
> the size of *base, so it is understandable that this does not get
> automatically converted.
> 
> It seems that some calls to this function _could_ send the same top
> and bottom, asking for 0 element array to be sorted, by the way.

It's hard to imagine an implementation of qsort(3) that can't handle
zero elements.  QSORT's safety feature is that it prevents the compiler
from removing NULL checks for the array pointer.  E.g. the last two
lines in the following example could be optimized away:

	qsort(ptr, n, sizeof(*ptr), fn);
	if (!ptr)
		do_stuff();

You can see that on https://godbolt.org/g/JwS99b -- an awesome website
for exploring compilation results for small snippets, by the way.

This optimization is dangerous when combined with the convention of
using a NULL pointer for empty arrays.  Diagnosing an affected NULL
check is probably quite hard -- it's right there in the code after all
and not all compilers remove it.

builtin/show-branch.c never passes NULL, so it's not affected by that
hazard.  We can (and should, IMHO) still use QSORT there for
consistency and convenience, though:

-- >8 --
Subject: [PATCH] show-branch: use QSORT

Shorten the code by using QSORT instead of calling qsort(3) directly,
as the former determines the element size automatically and checks if
there are at least two elements to sort already.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 builtin/show-branch.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 623ca56..974f340 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -353,8 +353,7 @@ static int compare_ref_name(const void *a_, const void *b_)
 
 static void sort_ref_range(int bottom, int top)
 {
-	qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
-	      compare_ref_name);
+	QSORT(ref_name + bottom, top - bottom, compare_ref_name);
 }
 
 static int append_ref(const char *refname, const struct object_id *oid,
@@ -540,8 +539,7 @@ static void append_one_rev(const char *av)
 		if (saved_matches == ref_name_cnt &&
 		    ref_name_cnt < MAX_REVS)
 			error(_("no matching refs with %s"), av);
-		if (saved_matches + 1 < ref_name_cnt)
-			sort_ref_range(saved_matches, ref_name_cnt);
+		sort_ref_range(saved_matches, ref_name_cnt);
 		return;
 	}
 	die("bad sha1 reference %s", av);
-- 
2.10.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-10-01 16:19   ` René Scharfe
@ 2016-10-03 16:46     ` Kevin Bracey
  2016-10-03 17:09     ` Kevin Bracey
  1 sibling, 0 replies; 13+ messages in thread
From: Kevin Bracey @ 2016-10-03 16:46 UTC (permalink / raw)
  To: GIT Mailing-list

On 01/10/2016 19:19, René Scharfe wrote:
> It's hard to imagine an implementation of qsort(3) that can't handle
> zero elements.  QSORT's safety feature is that it prevents the compiler
> from removing NULL checks for the array pointer.  E.g. the last two
> lines in the following example could be optimized away:
>
> 	qsort(ptr, n, sizeof(*ptr), fn);
> 	if (!ptr)
> 		do_stuff();
>
> You can see that on https://godbolt.org/g/JwS99b -- an awesome website
> for exploring compilation results for small snippets, by the way.
>
> This optimization is dangerous when combined with the convention of
> using a NULL pointer for empty arrays.  Diagnosing an affected NULL
> check is probably quite hard -- it's right there in the code after all
> and not all compilers remove it.

Hang on, hang on. This is either a compiler bug, or you're wrong on your 
assumption about the specification of qsort.

Either way, the extra layer of indirection is not proper protection. The 
unwanted compiler optimisation you're inadvertently triggering could 
still be triggered through the inline.

Now, looking at the C standard, this isn't actually clear to me. The 
standard says that if you call qsort with nmemb zero, the pointer still 
has to be "valid". Not totally clear to me if NULL is valid, by their 
definition in C99 7.1.4. Googling hasn't given me a concrete answer.

The compiler seems to think that NULL wouldn't be valid, so because 
you've called qsort on it, you've invoked undefined behaviour if it's 
NULL, so it's free to elide the NULL check.

Kevin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-10-01 16:19   ` René Scharfe
  2016-10-03 16:46     ` Kevin Bracey
@ 2016-10-03 17:09     ` Kevin Bracey
  2016-10-03 22:00       ` René Scharfe
  1 sibling, 1 reply; 13+ messages in thread
From: Kevin Bracey @ 2016-10-03 17:09 UTC (permalink / raw)
  To: GIT Mailing-list, René Scharfe

On 01/10/2016 19:19, René Scharfe wrote:
>
> It's hard to imagine an implementation of qsort(3) that can't handle
> zero elements.  QSORT's safety feature is that it prevents the compiler
> from removing NULL checks for the array pointer.  E.g. the last two
> lines in the following example could be optimized away:
>
> 	qsort(ptr, n, sizeof(*ptr), fn);
> 	if (!ptr)
> 		do_stuff();
>
> You can see that on https://godbolt.org/g/JwS99b -- an awesome website
> for exploring compilation results for small snippets, by the way.
>
Ah, second attempt. Originally misread the original code, and didn't 
understand what it was doing.

I get it now.

A nasty trap I hadn't been aware of - I was under the impression NULL + 
zero length was generally legal, but the C standard does indeed not give 
you a specific out for NULL to library functions in that case.

As such, NULL checks can still be elided even with your change. If you 
effectively change your example to:

     if (nmemb > 1)
         qsort(array, nmemb, size, cmp);
     if (!array)
         printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing 
that in the compiler explorer - it effectively turns that into "else 
if".  To make that check really work, you have to do:

     if (array)
         qsort(array, nmemb, size, cmp);
     else
         printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.

Kevin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-10-03 17:09     ` Kevin Bracey
@ 2016-10-03 22:00       ` René Scharfe
  2016-10-04  5:28         ` Kevin Bracey
  0 siblings, 1 reply; 13+ messages in thread
From: René Scharfe @ 2016-10-03 22:00 UTC (permalink / raw)
  To: Kevin Bracey, GIT Mailing-list

Am 03.10.2016 um 19:09 schrieb Kevin Bracey:
> As such, NULL checks can still be elided even with your change. If you
> effectively change your example to:
>
>     if (nmemb > 1)
>         qsort(array, nmemb, size, cmp);
>     if (!array)
>         printf("array is NULL\n");
>
> array may only be checked for NULL if nmemb <= 1. You can see GCC doing
> that in the compiler explorer - it effectively turns that into "else
> if".

We don't support array == NULL together with nmemb > 1, so a segfault is 
to be expected in such cases, and thus NULL checks can be removed safely.

> To make that check really work, you have to do:
>
>     if (array)
>         qsort(array, nmemb, size, cmp);
>     else
>         printf("array is NULL\n");
>
> So maybe your "sane_qsort" should be checking array, not nmemb.

It would be safe, but arguably too much so, because non-empty arrays 
with NULL wouldn't segfault anymore, and thus become harder to identify 
as the programming errors they are.

The intention is to support NULL pointers only for empty arrays (in 
addition to valid pointers).  That we also support NULL pointers for 
arrays with a single member might be considered to be the result of a 
premature optimization, but it should be safe -- the compiler won't 
remove checks unexpectedly.

Does that make sense (it's getting late here, so my logic might already 
be resting..)?

René

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-10-03 22:00       ` René Scharfe
@ 2016-10-04  5:28         ` Kevin Bracey
  2016-10-04 20:31           ` René Scharfe
  0 siblings, 1 reply; 13+ messages in thread
From: Kevin Bracey @ 2016-10-04  5:28 UTC (permalink / raw)
  To: GIT Mailing-list; +Cc: René Scharfe

On 04/10/2016 01:00, René Scharfe wrote:
> Am 03.10.2016 um 19:09 schrieb Kevin Bracey:
>> As such, NULL checks can still be elided even with your change. If you
>> effectively change your example to:
>>
>>     if (nmemb > 1)
>>         qsort(array, nmemb, size, cmp);
>>     if (!array)
>>         printf("array is NULL\n");
>>
>> array may only be checked for NULL if nmemb <= 1. You can see GCC doing
>> that in the compiler explorer - it effectively turns that into "else
>> if".
>
> We don't support array == NULL together with nmemb > 1, so a segfault 
> is to be expected in such cases, and thus NULL checks can be removed 
> safely.
>
Possibly true in practice.

But technically wrong by the C standard - behaviour is undefined if the 
qsort pointer is invalid. You can't formally expect the defined 
behaviour of a segfault when sending NULL into qsort. (Hell, maybe the 
qsort has its own NULL check and silently returns! cf printf - some 
printfs will segfault when passed NULL, some print "(null)"). I've 
worked on systems that don't fault reads to NULL, only writes, so those 
might not segfault there, if NULL appeared sorted...

And obviously there's the language lawyer favourite possibility of the 
call causing nasal flying monkeys or whatever.

So if it's not a program error for array to be NULL and nmemb to be zero 
in your code, and you want a diagnostic for array=NULL, nmemb non-zero, 
I think you should put that diagnostic into sane_qsort as an assert or 
something, not rely on qsort's undefined behaviour being a segfault.

     sane_qsort(blah)
     {
          if (nmemb >= 1) {
              assert(array);
              qsort(array, nmemb, ...);
          }
     }

Can't invoke undefined behaviour from NULL without triggering the 
assert. (Could still have other invalid pointers, of course).

Usually I am on the side of "no NULL checks", as I make the assumption 
that we will get a segfault as soon as NULL pointers are used, and those 
are generally easy to diagnose. But seeing a compiler invoking this sort 
of new trickery due to invoking undefined behaviour is making me more 
nervous about doing so...

>> To make that check really work, you have to do:
>>
>>     if (array)
>>         qsort(array, nmemb, size, cmp);
>>     else
>>         printf("array is NULL\n");
>>
>> So maybe your "sane_qsort" should be checking array, not nmemb.
>
> It would be safe, but arguably too much so, because non-empty arrays 
> with NULL wouldn't segfault anymore, and thus become harder to 
> identify as the programming errors they are.
Well, you get the print. Although I guess you're worrying about the 
second if being real code, not a debugging check.

I must say, this is quite a courageous new optimisation from GCC. It 
strikes me as finding a language lawyer loophole that seems to have been 
intended for something else (mapping library functions directly onto 
CISCy CPU intrinsics), and using it to invent a whole new optimisation 
that seems more likely to trigger bugs than optimise any significant 
amount of code in a desirable way.

Doubly weird as there's no (standard) language support for this. I don't 
know how you'd define "my_qsort" that triggered the same optimisations.

I've seen similar 
library-knowledge-without-any-way-to-reproduce-in-user-code 
optimisations like "malloc returns a new pointer that doesn't alias with 
anything existing" (and no way to reproduce the optimisation with 
my_malloc_wrapper). But those seemed to have a clear performance 
benefit, without any obvious traps. Doubtful about this one.

Kevin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-10-04  5:28         ` Kevin Bracey
@ 2016-10-04 20:31           ` René Scharfe
  2016-10-05 15:00             ` Kevin Bracey
  0 siblings, 1 reply; 13+ messages in thread
From: René Scharfe @ 2016-10-04 20:31 UTC (permalink / raw)
  To: Kevin Bracey, GIT Mailing-list

Am 04.10.2016 um 07:28 schrieb Kevin Bracey:
> On 04/10/2016 01:00, René Scharfe wrote:
>> Am 03.10.2016 um 19:09 schrieb Kevin Bracey:
>>> As such, NULL checks can still be elided even with your change. If you
>>> effectively change your example to:
>>>
>>>     if (nmemb > 1)
>>>         qsort(array, nmemb, size, cmp);
>>>     if (!array)
>>>         printf("array is NULL\n");
>>>
>>> array may only be checked for NULL if nmemb <= 1. You can see GCC doing
>>> that in the compiler explorer - it effectively turns that into "else
>>> if".
>>
>> We don't support array == NULL together with nmemb > 1, so a segfault
>> is to be expected in such cases, and thus NULL checks can be removed
>> safely.
>>
> Possibly true in practice.
>
> But technically wrong by the C standard - behaviour is undefined if the
> qsort pointer is invalid. You can't formally expect the defined
> behaviour of a segfault when sending NULL into qsort. (Hell, maybe the
> qsort has its own NULL check and silently returns!

A qsort(3) implementation that doesn't segfault is inconvenient, but 
still safe.  I'm more concerned about NULL checks being removed from our 
code.

> So if it's not a program error for array to be NULL and nmemb to be zero
> in your code, and you want a diagnostic for array=NULL, nmemb non-zero,
> I think you should put that diagnostic into sane_qsort as an assert or
> something, not rely on qsort's undefined behaviour being a segfault.
>
>     sane_qsort(blah)
>     {
>          if (nmemb >= 1) {
>              assert(array);
>              qsort(array, nmemb, ...);
>          }
>     }
>
> Can't invoke undefined behaviour from NULL without triggering the
> assert. (Could still have other invalid pointers, of course).

We could do that, but I think it's not necessary.  We'd get a segfault 
when accessing the sorted array anyway.  (If we don't look at the data 
after sorting then we can get rid of the sorting step altogether.)

> Usually I am on the side of "no NULL checks", as I make the assumption
> that we will get a segfault as soon as NULL pointers are used, and those
> are generally easy to diagnose. But seeing a compiler invoking this sort
> of new trickery due to invoking undefined behaviour is making me more
> nervous about doing so...

I was shocked a bit myself when I learned about this, but let's not 
panic. :)

>>> To make that check really work, you have to do:
>>>
>>>     if (array)
>>>         qsort(array, nmemb, size, cmp);
>>>     else
>>>         printf("array is NULL\n");
>>>
>>> So maybe your "sane_qsort" should be checking array, not nmemb.
>>
>> It would be safe, but arguably too much so, because non-empty arrays
>> with NULL wouldn't segfault anymore, and thus become harder to
>> identify as the programming errors they are.
> Well, you get the print. Although I guess you're worrying about the
> second if being real code, not a debugging check.

Yes, but the optimization is valid: If nmemb > 0 then array can only be 
NULL if we have a bug, and then we'd get a segfault eventually.  So such 
checks can be removed safely.

> I must say, this is quite a courageous new optimisation from GCC. It
> strikes me as finding a language lawyer loophole that seems to have been
> intended for something else (mapping library functions directly onto
> CISCy CPU intrinsics), and using it to invent a whole new optimisation
> that seems more likely to trigger bugs than optimise any significant
> amount of code in a desirable way.

Yeah, and the bugs triggered are quite obscure in this case.  But having 
richer type information and thus restricting the range of possible 
values for variables *can* enable useful optimizations.

> Doubly weird as there's no (standard) language support for this. I don't
> know how you'd define "my_qsort" that triggered the same optimisations.

The nonnull attribute is a GCC extension, but it's also supported by clang:

   http://clang.llvm.org/docs/AttributeReference.html#nonnull-gnu-nonnull

I don't know if other compilers support it as well, or if there are 
efforts underway to standardize it.

> I've seen similar
> library-knowledge-without-any-way-to-reproduce-in-user-code
> optimisations like "malloc returns a new pointer that doesn't alias with
> anything existing" (and no way to reproduce the optimisation with
> my_malloc_wrapper). But those seemed to have a clear performance
> benefit, without any obvious traps. Doubtful about this one.

Still we have to deal with it..

So let's summarize; here's the effect of a raw qsort(3) call:

array == NULL  nmemb  bug  QSORT  following NULL check
-------------  -----  ---  -----  --------------------
             0      0  no   qsort  is skipped
             0     >0  no   qsort  is skipped
             1      0  no   qsort  is skipped (bad!)
             1     >0  yes  qsort  is skipped

Here's what the current implementation (nmemb > 1) does:

array == NULL  nmemb  bug  QSORT  following NULL check
-------------  -----  ---  -----  --------------------
             0      0  no   noop   is executed
             0      1  no   noop   is executed
             0     >1  no   qsort  is skipped
             1      0  no   noop   is executed
             1      1  yes  noop   is executed
             1     >1  yes  qsort  is skipped

With the micro-optimization removed (nmemb > 0) the matrix gets simpler:

array == NULL  nmemb  bug  QSORT  following NULL check
-------------  -----  ---  -----  --------------------
             0      0  no   noop   is executed
             0     >0  no   qsort  is skipped
             1      0  no   noop   is executed
             1     >0  yes  qsort  is skipped

And with your NULL check (array != NULL) we'd get:

array == NULL  nmemb  bug  QSORT  following NULL check
-------------  -----  ---  -----  --------------------
             0      0  no   qsort  reuses check result
             0     >0  no   qsort  reuses check result
             1      0  no   noop   reuses check result
             1     >0  yes  noop   reuses check result

Did I get it right?  AFAICS all variants (except raw qsort) are safe -- 
no useful NULL checks are removed, and buggy code should be noticed by 
segfaults in code accessing the sorted array.  So the advantage of the 
current code is that it won't call qsort for nmemb <= 1.  And the 
advantage of checking the pointer is that the result of that check can 
be reused by later checks.  I think the former is more useful, but only 
slightly.

René

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] add QSORT
  2016-10-04 20:31           ` René Scharfe
@ 2016-10-05 15:00             ` Kevin Bracey
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Bracey @ 2016-10-05 15:00 UTC (permalink / raw)
  To: René Scharfe; +Cc: GIT Mailing-list

On 04/10/2016 23:31, René Scharfe wrote:

>
> So let's summarize; here's the effect of a raw qsort(3) call:
>
> array == NULL  nmemb  bug  QSORT  following NULL check
> -------------  -----  ---  -----  --------------------
>             0      0  no   qsort  is skipped
>             0     >0  no   qsort  is skipped
>             1      0  no   qsort  is skipped (bad!) ******
>             1     >0  yes  qsort  is skipped ******
>
Right - row 3 may not be a bug from the point of view of your internals, 
but it means you violate the API of qsort.Therefore a fix is required.

> With the micro-optimization removed (nmemb > 0) the matrix gets simpler:
>
> array == NULL  nmemb  bug  QSORT  following NULL check
> -------------  -----  ---  -----  --------------------
>             0      0  no   noop   is executed
>             0     >0  no   qsort  is skipped
>             1      0  no   noop   is executed
>             1     >0  yes  qsort  is skipped ******
>
> And with your NULL check (array != NULL) we'd get:
>
> array == NULL  nmemb  bug  QSORT  following NULL check
> -------------  -----  ---  -----  --------------------
>             0      0  no   qsort  reuses check result
>             0     >0  no   qsort  reuses check result
>             1      0  no   noop   reuses check result
>             1     >0  yes  noop   reuses check result
>
> Did I get it right?  AFAICS all variants (except raw qsort) are safe 
> -- no useful NULL checks are removed, and buggy code should be noticed 
> by segfaults in code accessing the sorted array.
I think your tables are correct.

But I disagree that you could ever call invoking the "****" lines safe. 
Unless you have documentation on what limit GCC (and your other 
compilers) are prepared to put on the undefined behaviour of violating 
that "non-null" constraint.

Up to now dereferencing a null pointer has been implicitly (or 
explicitly?) defined as simply generating SIGSEGV. And that has 
naturally extended into NULL passed to library implementations. But 
that's no longer true - it seems bets are somewhat off.

But, as long as you are confident you never invoke that line without a 
program bug - ie an API precondition of your own QSORT is that NULL is 
legal iff nmemb is zero, then I guess it's fine. Behaviour is defined, 
as long as you don't violate your internal preconditions.

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-10-05 16:16 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-29 15:23 [PATCH 1/3] add QSORT René Scharfe
2016-09-29 15:27 ` [PATCH 2/3] use QSORT René Scharfe
2016-09-29 15:29 ` [PATCH 3/3] remove unnecessary check before QSORT René Scharfe
2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano
2016-09-29 23:21   ` René Scharfe
2016-09-29 23:40     ` René Scharfe
2016-10-01 16:19   ` René Scharfe
2016-10-03 16:46     ` Kevin Bracey
2016-10-03 17:09     ` Kevin Bracey
2016-10-03 22:00       ` René Scharfe
2016-10-04  5:28         ` Kevin Bracey
2016-10-04 20:31           ` René Scharfe
2016-10-05 15:00             ` Kevin Bracey

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).