* [PATCH 1/3] add QSORT @ 2016-09-29 15:23 René Scharfe 2016-09-29 15:27 ` [PATCH 2/3] use QSORT René Scharfe ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: René Scharfe @ 2016-09-29 15:23 UTC (permalink / raw) To: Git List; +Cc: Junio C Hamano Add the macro QSORT, a convenient wrapper for qsort(3) that infers the size of the array elements and supports the convention of initializing empty arrays with a NULL pointer, which we use in some places. Calling qsort(3) directly with a NULL pointer is undefined -- even with an element count of zero -- and allows the compiler to optimize away any following NULL checks. Using the macro avoids such surprises. Add a semantic patch as well to demonstrate the macro's usage and to automate the transformation of trivial cases. Signed-off-by: Rene Scharfe <l.s.r@web.de> --- contrib/coccinelle/qsort.cocci | 19 +++++++++++++++++++ git-compat-util.h | 8 ++++++++ 2 files changed, 27 insertions(+) create mode 100644 contrib/coccinelle/qsort.cocci diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci new file mode 100644 index 0000000..a094e7c --- /dev/null +++ b/contrib/coccinelle/qsort.cocci @@ -0,0 +1,19 @@ +@@ +expression base, nmemb, compar; +@@ +- qsort(base, nmemb, sizeof(*base), compar); ++ QSORT(base, nmemb, compar); + +@@ +expression base, nmemb, compar; +@@ +- qsort(base, nmemb, sizeof(base[0]), compar); ++ QSORT(base, nmemb, compar); + +@@ +type T; +T *base; +expression nmemb, compar; +@@ +- qsort(base, nmemb, sizeof(T), compar); ++ QSORT(base, nmemb, compar); diff --git a/git-compat-util.h b/git-compat-util.h index 8aab0c3..d7ed137 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -977,6 +977,14 @@ void git_qsort(void *base, size_t nmemb, size_t size, #define qsort git_qsort #endif +#define QSORT(base, n, compar) sane_qsort((base), (n), sizeof(*(base)), compar) +static void inline sane_qsort(void *base, size_t nmemb, size_t size, + int(*compar)(const void *, const void *)) +{ + if (nmemb > 1) + qsort(base, nmemb, size, compar); +} + #ifndef REG_STARTEND #error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd" #endif -- 2.10.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/3] use QSORT 2016-09-29 15:23 [PATCH 1/3] add QSORT René Scharfe @ 2016-09-29 15:27 ` René Scharfe 2016-09-29 15:29 ` [PATCH 3/3] remove unnecessary check before QSORT René Scharfe 2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano 2 siblings, 0 replies; 13+ messages in thread From: René Scharfe @ 2016-09-29 15:27 UTC (permalink / raw) To: Git List; +Cc: Junio C Hamano Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe <l.s.r@web.de> --- Freshly generated using coccicheck, compiles, survives make test. bisect.c | 2 +- builtin/describe.c | 2 +- builtin/fast-export.c | 2 +- builtin/fmt-merge-msg.c | 6 ++---- builtin/index-pack.c | 8 +++----- builtin/mktree.c | 2 +- builtin/name-rev.c | 3 +-- builtin/pack-objects.c | 7 +++---- builtin/remote.c | 3 +-- diff.c | 6 +++--- diffcore-delta.c | 5 +---- diffcore-order.c | 2 +- diffcore-rename.c | 2 +- dir.c | 4 ++-- fast-import.c | 4 ++-- fetch-pack.c | 2 +- help.c | 15 +++++---------- line-log.c | 2 +- pack-bitmap-write.c | 3 +-- pack-check.c | 2 +- pack-write.c | 3 +-- pathspec.c | 3 +-- ref-filter.c | 2 +- refs/files-backend.c | 2 +- server-info.c | 2 +- sh-i18n--envsubst.c | 2 +- sha1-array.c | 2 +- string-list.c | 2 +- t/helper/test-dump-untracked-cache.c | 6 ++---- tree.c | 3 +-- 30 files changed, 44 insertions(+), 65 deletions(-) diff --git a/bisect.c b/bisect.c index 6f512c2..21bc6da 100644 --- a/bisect.c +++ b/bisect.c @@ -215,7 +215,7 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n array[cnt].distance = distance; cnt++; } - qsort(array, cnt, sizeof(*array), compare_commit_dist); + QSORT(array, cnt, compare_commit_dist); for (p = list, i = 0; i < cnt; i++) { char buf[100]; /* enough for dist=%d */ struct object *obj = &(array[i].commit->object); diff --git a/builtin/describe.c b/builtin/describe.c index 8a25abe..01490a1 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -352,7 +352,7 @@ static void describe(const char *arg, int last_one) oid_to_hex(oid)); } - qsort(all_matches, match_cnt, sizeof(all_matches[0]), compare_pt); + QSORT(all_matches, match_cnt, compare_pt); if (gave_up_on) { commit_list_insert_by_date(gave_up_on, &list); diff --git a/builtin/fast-export.c b/builtin/fast-export.c index c0652a7..1e815b5 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -347,7 +347,7 @@ static void show_filemodify(struct diff_queue_struct *q, * Handle files below a directory first, in case they are all deleted * and the directory changes to a file or symlink. */ - qsort(q->queue, q->nr, sizeof(q->queue[0]), depth_first); + QSORT(q->queue, q->nr, depth_first); for (i = 0; i < q->nr; i++) { struct diff_filespec *ospec = q->queue[i]->one; diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c index dc2e9e4..4976967 100644 --- a/builtin/fmt-merge-msg.c +++ b/builtin/fmt-merge-msg.c @@ -315,12 +315,10 @@ static void add_people_info(struct strbuf *out, struct string_list *committers) { if (authors->nr) - qsort(authors->items, - authors->nr, sizeof(authors->items[0]), + QSORT(authors->items, authors->nr, cmp_string_list_util_as_integral); if (committers->nr) - qsort(committers->items, - committers->nr, sizeof(committers->items[0]), + QSORT(committers->items, committers->nr, cmp_string_list_util_as_integral); credit_people(out, authors, 'a'); diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 4a8b4ae..7657d0a 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1190,10 +1190,8 @@ static void resolve_deltas(void) return; /* Sort deltas by base SHA1/offset for fast searching */ - qsort(ofs_deltas, nr_ofs_deltas, sizeof(struct ofs_delta_entry), - compare_ofs_delta_entry); - qsort(ref_deltas, nr_ref_deltas, sizeof(struct ref_delta_entry), - compare_ref_delta_entry); + QSORT(ofs_deltas, nr_ofs_deltas, compare_ofs_delta_entry); + QSORT(ref_deltas, nr_ref_deltas, compare_ref_delta_entry); if (verbose || show_resolving_progress) progress = start_progress(_("Resolving deltas"), @@ -1356,7 +1354,7 @@ static void fix_unresolved_deltas(struct sha1file *f) ALLOC_ARRAY(sorted_by_pos, nr_ref_deltas); for (i = 0; i < nr_ref_deltas; i++) sorted_by_pos[i] = &ref_deltas[i]; - qsort(sorted_by_pos, nr_ref_deltas, sizeof(*sorted_by_pos), delta_pos_compare); + QSORT(sorted_by_pos, nr_ref_deltas, delta_pos_compare); for (i = 0; i < nr_ref_deltas; i++) { struct ref_delta_entry *d = sorted_by_pos[i]; diff --git a/builtin/mktree.c b/builtin/mktree.c index 4282b62..de9b40f 100644 --- a/builtin/mktree.c +++ b/builtin/mktree.c @@ -46,7 +46,7 @@ static void write_tree(unsigned char *sha1) size_t size; int i; - qsort(entries, used, sizeof(*entries), ent_compare); + QSORT(entries, used, ent_compare); for (size = i = 0; i < used; i++) size += 32 + entries[i]->len; diff --git a/builtin/name-rev.c b/builtin/name-rev.c index 57be35f..cd89d48 100644 --- a/builtin/name-rev.c +++ b/builtin/name-rev.c @@ -195,8 +195,7 @@ static const char *get_exact_ref_match(const struct object *o) return NULL; if (!tip_table.sorted) { - qsort(tip_table.table, tip_table.nr, sizeof(*tip_table.table), - tipcmp); + QSORT(tip_table.table, tip_table.nr, tipcmp); tip_table.sorted = 1; } diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 166e52c..8aeba6a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1535,7 +1535,7 @@ static void get_object_details(void) sorted_by_offset = xcalloc(to_pack.nr_objects, sizeof(struct object_entry *)); for (i = 0; i < to_pack.nr_objects; i++) sorted_by_offset[i] = to_pack.objects + i; - qsort(sorted_by_offset, to_pack.nr_objects, sizeof(*sorted_by_offset), pack_offset_sort); + QSORT(sorted_by_offset, to_pack.nr_objects, pack_offset_sort); for (i = 0; i < to_pack.nr_objects; i++) { struct object_entry *entry = sorted_by_offset[i]; @@ -2257,7 +2257,7 @@ static void prepare_pack(int window, int depth) if (progress) progress_state = start_progress(_("Compressing objects"), nr_deltas); - qsort(delta_list, n, sizeof(*delta_list), type_size_sort); + QSORT(delta_list, n, type_size_sort); ll_find_deltas(delta_list, n, window+1, depth, &nr_done); stop_progress(&progress_state); if (nr_done != nr_deltas) @@ -2449,8 +2449,7 @@ static void add_objects_in_unpacked_packs(struct rev_info *revs) } if (in_pack.nr) { - qsort(in_pack.array, in_pack.nr, sizeof(in_pack.array[0]), - ofscmp); + QSORT(in_pack.array, in_pack.nr, ofscmp); for (i = 0; i < in_pack.nr; i++) { struct object *o = in_pack.array[i].object; add_object_entry(o->oid.hash, o->type, "", 0); diff --git a/builtin/remote.c b/builtin/remote.c index 9f6a6b3..e52cf39 100644 --- a/builtin/remote.c +++ b/builtin/remote.c @@ -1197,8 +1197,7 @@ static int show(int argc, const char **argv) info.width = info.width2 = 0; for_each_string_list(&states.push, add_push_to_show_info, &info); - qsort(info.list->items, info.list->nr, - sizeof(*info.list->items), cmp_string_with_push); + QSORT(info.list->items, info.list->nr, cmp_string_with_push); if (info.list->nr) printf_ln(Q_(" Local ref configured for 'git push'%s:", " Local refs configured for 'git push'%s:", diff --git a/diff.c b/diff.c index a178ed3..c2f09fb 100644 --- a/diff.c +++ b/diff.c @@ -2019,7 +2019,7 @@ static void show_dirstat(struct diff_options *options) return; /* Show all directories with more than x% of the changes */ - qsort(dir.files, dir.nr, sizeof(dir.files[0]), dirstat_compare); + QSORT(dir.files, dir.nr, dirstat_compare); gather_dirstat(options, &dir, changed, "", 0); } @@ -2063,7 +2063,7 @@ static void show_dirstat_by_line(struct diffstat_t *data, struct diff_options *o return; /* Show all directories with more than x% of the changes */ - qsort(dir.files, dir.nr, sizeof(dir.files[0]), dirstat_compare); + QSORT(dir.files, dir.nr, dirstat_compare); gather_dirstat(options, &dir, changed, "", 0); } @@ -4923,7 +4923,7 @@ static int diffnamecmp(const void *a_, const void *b_) void diffcore_fix_diff_index(struct diff_options *options) { struct diff_queue_struct *q = &diff_queued_diff; - qsort(q->queue, q->nr, sizeof(q->queue[0]), diffnamecmp); + QSORT(q->queue, q->nr, diffnamecmp); } void diffcore_std(struct diff_options *options) diff --git a/diffcore-delta.c b/diffcore-delta.c index 4159748..2ebedb3 100644 --- a/diffcore-delta.c +++ b/diffcore-delta.c @@ -158,10 +158,7 @@ static struct spanhash_top *hash_chars(struct diff_filespec *one) n = 0; accum1 = accum2 = 0; } - qsort(hash->data, - 1ul << hash->alloc_log2, - sizeof(hash->data[0]), - spanhash_cmp); + QSORT(hash->data, 1ul << hash->alloc_log2, spanhash_cmp); return hash; } diff --git a/diffcore-order.c b/diffcore-order.c index 69d41f7..1957f82 100644 --- a/diffcore-order.c +++ b/diffcore-order.c @@ -101,7 +101,7 @@ void order_objects(const char *orderfile, obj_path_fn_t obj_path, objs[i].orig_order = i; objs[i].order = match_order(obj_path(objs[i].obj)); } - qsort(objs, nr, sizeof(*objs), compare_objs_order); + QSORT(objs, nr, compare_objs_order); } static const char *pair_pathtwo(void *obj) diff --git a/diffcore-rename.c b/diffcore-rename.c index 73d003a..54a2396 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -580,7 +580,7 @@ void diffcore_rename(struct diff_options *options) stop_progress(&progress); /* cost matrix sorted by most to least similar pair */ - qsort(mx, dst_cnt * NUM_CANDIDATE_PER_DST, sizeof(*mx), score_compare); + QSORT(mx, dst_cnt * NUM_CANDIDATE_PER_DST, score_compare); rename_count += find_renames(mx, dst_cnt, minimum_score, 0); if (detect_rename == DIFF_DETECT_COPY) diff --git a/dir.c b/dir.c index 9e09bcb..3bad1ad 100644 --- a/dir.c +++ b/dir.c @@ -2005,8 +2005,8 @@ int read_directory(struct dir_struct *dir, const char *path, int len, const stru if (!len || treat_leading_path(dir, path, len, simplify)) read_directory_recursive(dir, path, len, untracked, 0, simplify); free_simplify(simplify); - qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name); - qsort(dir->ignored, dir->ignored_nr, sizeof(struct dir_entry *), cmp_name); + QSORT(dir->entries, dir->nr, cmp_name); + QSORT(dir->ignored, dir->ignored_nr, cmp_name); if (dir->untracked) { static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS); trace_printf_key(&trace_untracked_stats, diff --git a/fast-import.c b/fast-import.c index bf53ac9..cb545d7 100644 --- a/fast-import.c +++ b/fast-import.c @@ -1460,9 +1460,9 @@ static void mktree(struct tree_content *t, int v, struct strbuf *b) unsigned int i; if (!v) - qsort(t->entries,t->entry_count,sizeof(t->entries[0]),tecmp0); + QSORT(t->entries, t->entry_count, tecmp0); else - qsort(t->entries,t->entry_count,sizeof(t->entries[0]),tecmp1); + QSORT(t->entries, t->entry_count, tecmp1); for (i = 0; i < t->entry_count; i++) { if (t->entries[i]->versions[v].mode) diff --git a/fetch-pack.c b/fetch-pack.c index 85e77af..8a38d30 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -812,7 +812,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, int agent_len; sort_ref_list(&ref, ref_compare_name); - qsort(sought, nr_sought, sizeof(*sought), cmp_ref_by_name); + QSORT(sought, nr_sought, cmp_ref_by_name); if ((args->depth > 0 || is_repository_shallow()) && !server_supports("shallow")) die("Server does not support shallow clients"); diff --git a/help.c b/help.c index 2ff3b5a..53e2a67 100644 --- a/help.c +++ b/help.c @@ -170,8 +170,7 @@ void load_command_list(const char *prefix, if (exec_path) { list_commands_in_dir(main_cmds, exec_path, prefix); - qsort(main_cmds->names, main_cmds->cnt, - sizeof(*main_cmds->names), cmdname_compare); + QSORT(main_cmds->names, main_cmds->cnt, cmdname_compare); uniq(main_cmds); } @@ -190,8 +189,7 @@ void load_command_list(const char *prefix, } free(paths); - qsort(other_cmds->names, other_cmds->cnt, - sizeof(*other_cmds->names), cmdname_compare); + QSORT(other_cmds->names, other_cmds->cnt, cmdname_compare); uniq(other_cmds); } exclude_cmds(other_cmds, main_cmds); @@ -238,8 +236,7 @@ void list_common_cmds_help(void) longest = strlen(common_cmds[i].name); } - qsort(common_cmds, ARRAY_SIZE(common_cmds), - sizeof(common_cmds[0]), cmd_group_cmp); + QSORT(common_cmds, ARRAY_SIZE(common_cmds), cmd_group_cmp); puts(_("These are common Git commands used in various situations:")); @@ -324,8 +321,7 @@ const char *help_unknown_cmd(const char *cmd) add_cmd_list(&main_cmds, &aliases); add_cmd_list(&main_cmds, &other_cmds); - qsort(main_cmds.names, main_cmds.cnt, - sizeof(*main_cmds.names), cmdname_compare); + QSORT(main_cmds.names, main_cmds.cnt, cmdname_compare); uniq(&main_cmds); /* This abuses cmdname->len for levenshtein distance */ @@ -359,8 +355,7 @@ const char *help_unknown_cmd(const char *cmd) levenshtein(cmd, candidate, 0, 2, 1, 3) + 1; } - qsort(main_cmds.names, main_cmds.cnt, - sizeof(*main_cmds.names), levenshtein_compare); + QSORT(main_cmds.names, main_cmds.cnt, levenshtein_compare); if (!main_cmds.cnt) die(_("Uh oh. Your system reports no Git commands at all.")); diff --git a/line-log.c b/line-log.c index 916e724..65f3558 100644 --- a/line-log.c +++ b/line-log.c @@ -113,7 +113,7 @@ void sort_and_merge_range_set(struct range_set *rs) int i; int o = 0; /* output cursor */ - qsort(rs->ranges, rs->nr, sizeof(struct range), range_cmp); + QSORT(rs->ranges, rs->nr, range_cmp); for (i = 0; i < rs->nr; i++) { if (rs->ranges[i].start == rs->ranges[i].end) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index c30bcd0..9705596 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -385,8 +385,7 @@ void bitmap_writer_select_commits(struct commit **indexed_commits, { unsigned int i = 0, j, next; - qsort(indexed_commits, indexed_commits_nr, sizeof(indexed_commits[0]), - date_compare); + QSORT(indexed_commits, indexed_commits_nr, date_compare); if (writer.show_progress) writer.progress = start_progress("Selecting bitmap commits", 0); diff --git a/pack-check.c b/pack-check.c index d123846..72440a8 100644 --- a/pack-check.c +++ b/pack-check.c @@ -99,7 +99,7 @@ static int verify_packfile(struct packed_git *p, entries[i].offset = nth_packed_object_offset(p, i); entries[i].nr = i; } - qsort(entries, nr_objects, sizeof(*entries), compare_entries); + QSORT(entries, nr_objects, compare_entries); for (i = 0; i < nr_objects; i++) { void *data; diff --git a/pack-write.c b/pack-write.c index ea0b788..88bc7f9 100644 --- a/pack-write.c +++ b/pack-write.c @@ -61,8 +61,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec if (objects[i]->offset > last_obj_offset) last_obj_offset = objects[i]->offset; } - qsort(sorted_by_sha, nr_objects, sizeof(sorted_by_sha[0]), - sha1_compare); + QSORT(sorted_by_sha, nr_objects, sha1_compare); } else sorted_by_sha = list = last = NULL; diff --git a/pathspec.c b/pathspec.c index 24e0dd5..eda13b5 100644 --- a/pathspec.c +++ b/pathspec.c @@ -446,8 +446,7 @@ void parse_pathspec(struct pathspec *pathspec, if (pathspec->magic & PATHSPEC_MAXDEPTH) { if (flags & PATHSPEC_KEEP_ORDER) die("BUG: PATHSPEC_MAXDEPTH_VALID and PATHSPEC_KEEP_ORDER are incompatible"); - qsort(pathspec->items, pathspec->nr, - sizeof(struct pathspec_item), pathspec_item_cmp); + QSORT(pathspec->items, pathspec->nr, pathspec_item_cmp); } } diff --git a/ref-filter.c b/ref-filter.c index 9adbb8a..44029b0 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -1573,7 +1573,7 @@ static int compare_refs(const void *a_, const void *b_) void ref_array_sort(struct ref_sorting *sorting, struct ref_array *array) { ref_sorting = sorting; - qsort(array->items, array->nr, sizeof(struct ref_array_item *), compare_refs); + QSORT(array->items, array->nr, compare_refs); } static void append_literal(const char *cp, const char *ep, struct ref_formatting_state *state) diff --git a/refs/files-backend.c b/refs/files-backend.c index 0709f60..d16feb1 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -501,7 +501,7 @@ static void sort_ref_dir(struct ref_dir *dir) if (dir->sorted == dir->nr) return; - qsort(dir->entries, dir->nr, sizeof(*dir->entries), ref_entry_cmp); + QSORT(dir->entries, dir->nr, ref_entry_cmp); /* Remove any duplicates: */ for (i = 0, j = 0; j < dir->nr; j++) { diff --git a/server-info.c b/server-info.c index 75dd677..7bc4e75 100644 --- a/server-info.c +++ b/server-info.c @@ -229,7 +229,7 @@ static void init_pack_info(const char *infofile, int force) } /* renumber them */ - qsort(info, num_pack, sizeof(info[0]), compare_info); + QSORT(info, num_pack, compare_info); for (i = 0; i < num_pack; i++) info[i]->new_num = i; } diff --git a/sh-i18n--envsubst.c b/sh-i18n--envsubst.c index e06b2c1..3637a2a 100644 --- a/sh-i18n--envsubst.c +++ b/sh-i18n--envsubst.c @@ -231,7 +231,7 @@ static inline void string_list_sort (string_list_ty *slp) { if (slp->nitems > 0) - qsort (slp->item, slp->nitems, sizeof (slp->item[0]), cmp_string); + QSORT(slp->item, slp->nitems, cmp_string); } /* Test whether a sorted string list contains a given string. */ diff --git a/sha1-array.c b/sha1-array.c index 6f4a224..21188de 100644 --- a/sha1-array.c +++ b/sha1-array.c @@ -16,7 +16,7 @@ static int void_hashcmp(const void *a, const void *b) static void sha1_array_sort(struct sha1_array *array) { - qsort(array->sha1, array->nr, sizeof(*array->sha1), void_hashcmp); + QSORT(array->sha1, array->nr, void_hashcmp); array->sorted = 1; } diff --git a/string-list.c b/string-list.c index 62d2084..8c83cac 100644 --- a/string-list.c +++ b/string-list.c @@ -225,7 +225,7 @@ static int cmp_items(const void *a, const void *b) void string_list_sort(struct string_list *list) { compare_for_qsort = list->cmp ? list->cmp : strcmp; - qsort(list->items, list->nr, sizeof(*list->items), cmp_items); + QSORT(list->items, list->nr, cmp_items); } struct string_list_item *unsorted_string_list_lookup(struct string_list *list, diff --git a/t/helper/test-dump-untracked-cache.c b/t/helper/test-dump-untracked-cache.c index 50112cc..f752532 100644 --- a/t/helper/test-dump-untracked-cache.c +++ b/t/helper/test-dump-untracked-cache.c @@ -18,10 +18,8 @@ static int compare_dir(const void *a_, const void *b_) static void dump(struct untracked_cache_dir *ucd, struct strbuf *base) { int i, len; - qsort(ucd->untracked, ucd->untracked_nr, sizeof(*ucd->untracked), - compare_untracked); - qsort(ucd->dirs, ucd->dirs_nr, sizeof(*ucd->dirs), - compare_dir); + QSORT(ucd->untracked, ucd->untracked_nr, compare_untracked); + QSORT(ucd->dirs, ucd->dirs_nr, compare_dir); len = base->len; strbuf_addf(base, "%s/", ucd->name); printf("%s %s", base->buf, diff --git a/tree.c b/tree.c index 2b5a5a8..ce345c5 100644 --- a/tree.c +++ b/tree.c @@ -180,8 +180,7 @@ int read_tree(struct tree *tree, int stage, struct pathspec *match) * Sort the cache entry -- we need to nuke the cache tree, though. */ cache_tree_free(&active_cache_tree); - qsort(active_cache, active_nr, sizeof(active_cache[0]), - cmp_cache_name_compare); + QSORT(active_cache, active_nr, cmp_cache_name_compare); return 0; } -- 2.10.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 3/3] remove unnecessary check before QSORT 2016-09-29 15:23 [PATCH 1/3] add QSORT René Scharfe 2016-09-29 15:27 ` [PATCH 2/3] use QSORT René Scharfe @ 2016-09-29 15:29 ` René Scharfe 2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano 2 siblings, 0 replies; 13+ messages in thread From: René Scharfe @ 2016-09-29 15:29 UTC (permalink / raw) To: Git List; +Cc: Junio C Hamano Add a semantic patch for removing checks similar to the one that QSORT already does internally and apply it to the code base. Signed-off-by: Rene Scharfe <l.s.r@web.de> --- builtin/fmt-merge-msg.c | 10 ++++------ contrib/coccinelle/qsort.cocci | 18 ++++++++++++++++++ sh-i18n--envsubst.c | 3 +-- 3 files changed, 23 insertions(+), 8 deletions(-) diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c index 4976967..efab62f 100644 --- a/builtin/fmt-merge-msg.c +++ b/builtin/fmt-merge-msg.c @@ -314,12 +314,10 @@ static void add_people_info(struct strbuf *out, struct string_list *authors, struct string_list *committers) { - if (authors->nr) - QSORT(authors->items, authors->nr, - cmp_string_list_util_as_integral); - if (committers->nr) - QSORT(committers->items, committers->nr, - cmp_string_list_util_as_integral); + QSORT(authors->items, authors->nr, + cmp_string_list_util_as_integral); + QSORT(committers->items, committers->nr, + cmp_string_list_util_as_integral); credit_people(out, authors, 'a'); credit_people(out, committers, 'c'); diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci index a094e7c..22b93a9 100644 --- a/contrib/coccinelle/qsort.cocci +++ b/contrib/coccinelle/qsort.cocci @@ -17,3 +17,21 @@ expression nmemb, compar; @@ - qsort(base, nmemb, sizeof(T), compar); + QSORT(base, nmemb, compar); + +@@ +expression base, nmemb, compar; +@@ +- if (nmemb) + QSORT(base, nmemb, compar); + +@@ +expression base, nmemb, compar; +@@ +- if (nmemb > 0) + QSORT(base, nmemb, compar); + +@@ +expression base, nmemb, compar; +@@ +- if (nmemb > 1) + QSORT(base, nmemb, compar); diff --git a/sh-i18n--envsubst.c b/sh-i18n--envsubst.c index 3637a2a..c3a2b5a 100644 --- a/sh-i18n--envsubst.c +++ b/sh-i18n--envsubst.c @@ -230,8 +230,7 @@ cmp_string (const void *pstr1, const void *pstr2) static inline void string_list_sort (string_list_ty *slp) { - if (slp->nitems > 0) - QSORT(slp->item, slp->nitems, cmp_string); + QSORT(slp->item, slp->nitems, cmp_string); } /* Test whether a sorted string list contains a given string. */ -- 2.10.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-09-29 15:23 [PATCH 1/3] add QSORT René Scharfe 2016-09-29 15:27 ` [PATCH 2/3] use QSORT René Scharfe 2016-09-29 15:29 ` [PATCH 3/3] remove unnecessary check before QSORT René Scharfe @ 2016-09-29 22:36 ` Junio C Hamano 2016-09-29 23:21 ` René Scharfe 2016-10-01 16:19 ` René Scharfe 2 siblings, 2 replies; 13+ messages in thread From: Junio C Hamano @ 2016-09-29 22:36 UTC (permalink / raw) To: René Scharfe; +Cc: Git List René Scharfe <l.s.r@web.de> writes: > Add the macro QSORT, a convenient wrapper for qsort(3) that infers the > size of the array elements and supports the convention of initializing > empty arrays with a NULL pointer, which we use in some places. > > Calling qsort(3) directly with a NULL pointer is undefined -- even with > an element count of zero -- and allows the compiler to optimize away any > following NULL checks. Using the macro avoids such surprises. > > Add a semantic patch as well to demonstrate the macro's usage and to > automate the transformation of trivial cases. > > Signed-off-by: Rene Scharfe <l.s.r@web.de> > --- > contrib/coccinelle/qsort.cocci | 19 +++++++++++++++++++ > git-compat-util.h | 8 ++++++++ > 2 files changed, 27 insertions(+) > create mode 100644 contrib/coccinelle/qsort.cocci The direct calls to qsort(3) that this series leaves behind are interesting. 1. builtin/index-pack.c has this: if (1 < opts->anomaly_nr) qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32); where opts->anomaly is coming from pack.h: struct pack_idx_option { unsigned flags; ... int anomaly_alloc, anomaly_nr; uint32_t *anomaly; }; I cannot quite see how the automated conversion misses it? It's not like base and nmemb are type-restricted in the rule (they are both just "expression"s). 2. builtin/shortlog.c has this: qsort(log->list.items, log->list.nr, sizeof(struct string_list_item), log->summary ? compare_by_counter : compare_by_list); where log->list is coming from shortlog.h: struct shortlog { struct string_list list; }; and string-list.h says: struct string_list { struct string_list_item *items; unsigned int nr, alloc; ... }; which seems to be a good candidate for this rule: type T; T *base; expression nmemb, compar; @@ - qsort(base, nmemb, sizeof(T), compar); + QSORT(base, nmemb, compar); if we take "T == struct string_list_item". 3. builtin/show-branch.c does this: qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]), compare_ref_name); where ref_name[] is a file-scope global: static char *ref_name[MAX_REVS + 1]; and top and bottom are plain integers. The sizeof() does not take the size of *base, so it is understandable that this does not get automatically converted. It seems that some calls to this function _could_ send the same top and bottom, asking for 0 element array to be sorted, by the way. Thanks for an amusing read. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano @ 2016-09-29 23:21 ` René Scharfe 2016-09-29 23:40 ` René Scharfe 2016-10-01 16:19 ` René Scharfe 1 sibling, 1 reply; 13+ messages in thread From: René Scharfe @ 2016-09-29 23:21 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git List Am 30.09.2016 um 00:36 schrieb Junio C Hamano: > René Scharfe <l.s.r@web.de> writes: > >> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the >> size of the array elements and supports the convention of initializing >> empty arrays with a NULL pointer, which we use in some places. >> >> Calling qsort(3) directly with a NULL pointer is undefined -- even with >> an element count of zero -- and allows the compiler to optimize away any >> following NULL checks. Using the macro avoids such surprises. >> >> Add a semantic patch as well to demonstrate the macro's usage and to >> automate the transformation of trivial cases. >> >> Signed-off-by: Rene Scharfe <l.s.r@web.de> >> --- >> contrib/coccinelle/qsort.cocci | 19 +++++++++++++++++++ >> git-compat-util.h | 8 ++++++++ >> 2 files changed, 27 insertions(+) >> create mode 100644 contrib/coccinelle/qsort.cocci > > The direct calls to qsort(3) that this series leaves behind are > interesting. > > 1. builtin/index-pack.c has this: > > if (1 < opts->anomaly_nr) > qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32); > > where opts->anomaly is coming from pack.h: > > struct pack_idx_option { > unsigned flags; > ... > int anomaly_alloc, anomaly_nr; > uint32_t *anomaly; > }; > > I cannot quite see how the automated conversion misses it? It's not > like base and nmemb are type-restricted in the rule (they are both > just "expression"s). > > 2. builtin/shortlog.c has this: > > qsort(log->list.items, log->list.nr, sizeof(struct string_list_item), > log->summary ? compare_by_counter : compare_by_list); > > where log->list is coming from shortlog.h: > > struct shortlog { > struct string_list list; > }; > > and string-list.h says: > > struct string_list { > struct string_list_item *items; > unsigned int nr, alloc; > ... > }; > > which seems to be a good candidate for this rule: > > type T; > T *base; > expression nmemb, compar; > @@ > - qsort(base, nmemb, sizeof(T), compar); > + QSORT(base, nmemb, compar); > > if we take "T == struct string_list_item". Transformations for these two are generated if we pass --all-includes to spatch. So let's do that. -- >8 -- Subject: [PATCH] coccicheck: use --all-includes by default Add a make variable, SPATCH_FLAGS, for specifying flags for spatch, and set it to --all-includes by default. This option lets it consider header files which would otherwise be ignored. That's important for some rules that rely on type information. It doubles the duration of coccicheck, however. Signed-off-by: Rene Scharfe <l.s.r@web.de> --- Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 1aad150..d15bf8d 100644 --- a/Makefile +++ b/Makefile @@ -467,6 +467,7 @@ SPATCH = spatch export TCL_PATH TCLTK_PATH SPARSE_FLAGS = +SPATCH_FLAGS = --all-includes @@ -2314,7 +2315,7 @@ C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ)) %.cocci.patch: %.cocci $(C_SOURCES) @echo ' ' SPATCH $<; \ for f in $(C_SOURCES); do \ - $(SPATCH) --sp-file $< $$f; \ + $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \ done >$@ 2>$@.log; \ if test -s $@; \ then \ -- 2.10.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-09-29 23:21 ` René Scharfe @ 2016-09-29 23:40 ` René Scharfe 0 siblings, 0 replies; 13+ messages in thread From: René Scharfe @ 2016-09-29 23:40 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git List Am 30.09.2016 um 01:21 schrieb René Scharfe: > Am 30.09.2016 um 00:36 schrieb Junio C Hamano: >> René Scharfe <l.s.r@web.de> writes: >> >>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the >>> size of the array elements and supports the convention of initializing >>> empty arrays with a NULL pointer, which we use in some places. >>> >>> Calling qsort(3) directly with a NULL pointer is undefined -- even with >>> an element count of zero -- and allows the compiler to optimize away any >>> following NULL checks. Using the macro avoids such surprises. >>> >>> Add a semantic patch as well to demonstrate the macro's usage and to >>> automate the transformation of trivial cases. >>> >>> Signed-off-by: Rene Scharfe <l.s.r@web.de> >>> --- >>> contrib/coccinelle/qsort.cocci | 19 +++++++++++++++++++ >>> git-compat-util.h | 8 ++++++++ >>> 2 files changed, 27 insertions(+) >>> create mode 100644 contrib/coccinelle/qsort.cocci >> >> The direct calls to qsort(3) that this series leaves behind are >> interesting. >> >> 1. builtin/index-pack.c has this: >> >> if (1 < opts->anomaly_nr) >> qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32); >> >> where opts->anomaly is coming from pack.h: >> >> struct pack_idx_option { >> unsigned flags; >> ... >> int anomaly_alloc, anomaly_nr; >> uint32_t *anomaly; >> }; >> >> I cannot quite see how the automated conversion misses it? It's not >> like base and nmemb are type-restricted in the rule (they are both >> just "expression"s). >> >> 2. builtin/shortlog.c has this: >> >> qsort(log->list.items, log->list.nr, sizeof(struct string_list_item), >> log->summary ? compare_by_counter : compare_by_list); >> >> where log->list is coming from shortlog.h: >> >> struct shortlog { >> struct string_list list; >> }; >> >> and string-list.h says: >> >> struct string_list { >> struct string_list_item *items; >> unsigned int nr, alloc; >> ... >> }; >> >> which seems to be a good candidate for this rule: >> >> type T; >> T *base; >> expression nmemb, compar; >> @@ >> - qsort(base, nmemb, sizeof(T), compar); >> + QSORT(base, nmemb, compar); >> >> if we take "T == struct string_list_item". > > Transformations for these two are generated if we pass --all-includes > to spatch. So let's do that. And here's the result: -- >8 -- Subject: [PATCH] use QSORT, part 2 Convert two more qsort(3) calls to QSORT to reduce code size and for better safety and consistency. Signed-off-by: Rene Scharfe <l.s.r@web.de> --- Squashable. builtin/index-pack.c | 3 +-- builtin/shortlog.c | 2 +- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 7657d0a..0a27bab 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1531,8 +1531,7 @@ static void read_v2_anomalous_offsets(struct packed_git *p, opts->anomaly[opts->anomaly_nr++] = ntohl(idx2[off * 2 + 1]); } - if (1 < opts->anomaly_nr) - qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), cmp_uint32); + QSORT(opts->anomaly, opts->anomaly_nr, cmp_uint32); } static void read_idx_option(struct pack_idx_option *opts, const char *pack_name) diff --git a/builtin/shortlog.c b/builtin/shortlog.c index 25fa8a6..ba0e115 100644 --- a/builtin/shortlog.c +++ b/builtin/shortlog.c @@ -308,7 +308,7 @@ void shortlog_output(struct shortlog *log) struct strbuf sb = STRBUF_INIT; if (log->sort_by_number) - qsort(log->list.items, log->list.nr, sizeof(struct string_list_item), + QSORT(log->list.items, log->list.nr, log->summary ? compare_by_counter : compare_by_list); for (i = 0; i < log->list.nr; i++) { const struct string_list_item *item = &log->list.items[i]; -- 2.10.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano 2016-09-29 23:21 ` René Scharfe @ 2016-10-01 16:19 ` René Scharfe 2016-10-03 16:46 ` Kevin Bracey 2016-10-03 17:09 ` Kevin Bracey 1 sibling, 2 replies; 13+ messages in thread From: René Scharfe @ 2016-10-01 16:19 UTC (permalink / raw) To: Junio C Hamano; +Cc: Git List Am 30.09.2016 um 00:36 schrieb Junio C Hamano: > 3. builtin/show-branch.c does this: > > qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]), > compare_ref_name); > > where ref_name[] is a file-scope global: > > static char *ref_name[MAX_REVS + 1]; > > and top and bottom are plain integers. The sizeof() does not take > the size of *base, so it is understandable that this does not get > automatically converted. > > It seems that some calls to this function _could_ send the same top > and bottom, asking for 0 element array to be sorted, by the way. It's hard to imagine an implementation of qsort(3) that can't handle zero elements. QSORT's safety feature is that it prevents the compiler from removing NULL checks for the array pointer. E.g. the last two lines in the following example could be optimized away: qsort(ptr, n, sizeof(*ptr), fn); if (!ptr) do_stuff(); You can see that on https://godbolt.org/g/JwS99b -- an awesome website for exploring compilation results for small snippets, by the way. This optimization is dangerous when combined with the convention of using a NULL pointer for empty arrays. Diagnosing an affected NULL check is probably quite hard -- it's right there in the code after all and not all compilers remove it. builtin/show-branch.c never passes NULL, so it's not affected by that hazard. We can (and should, IMHO) still use QSORT there for consistency and convenience, though: -- >8 -- Subject: [PATCH] show-branch: use QSORT Shorten the code by using QSORT instead of calling qsort(3) directly, as the former determines the element size automatically and checks if there are at least two elements to sort already. Signed-off-by: Rene Scharfe <l.s.r@web.de> --- builtin/show-branch.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/builtin/show-branch.c b/builtin/show-branch.c index 623ca56..974f340 100644 --- a/builtin/show-branch.c +++ b/builtin/show-branch.c @@ -353,8 +353,7 @@ static int compare_ref_name(const void *a_, const void *b_) static void sort_ref_range(int bottom, int top) { - qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]), - compare_ref_name); + QSORT(ref_name + bottom, top - bottom, compare_ref_name); } static int append_ref(const char *refname, const struct object_id *oid, @@ -540,8 +539,7 @@ static void append_one_rev(const char *av) if (saved_matches == ref_name_cnt && ref_name_cnt < MAX_REVS) error(_("no matching refs with %s"), av); - if (saved_matches + 1 < ref_name_cnt) - sort_ref_range(saved_matches, ref_name_cnt); + sort_ref_range(saved_matches, ref_name_cnt); return; } die("bad sha1 reference %s", av); -- 2.10.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-10-01 16:19 ` René Scharfe @ 2016-10-03 16:46 ` Kevin Bracey 2016-10-03 17:09 ` Kevin Bracey 1 sibling, 0 replies; 13+ messages in thread From: Kevin Bracey @ 2016-10-03 16:46 UTC (permalink / raw) To: GIT Mailing-list On 01/10/2016 19:19, René Scharfe wrote: > It's hard to imagine an implementation of qsort(3) that can't handle > zero elements. QSORT's safety feature is that it prevents the compiler > from removing NULL checks for the array pointer. E.g. the last two > lines in the following example could be optimized away: > > qsort(ptr, n, sizeof(*ptr), fn); > if (!ptr) > do_stuff(); > > You can see that on https://godbolt.org/g/JwS99b -- an awesome website > for exploring compilation results for small snippets, by the way. > > This optimization is dangerous when combined with the convention of > using a NULL pointer for empty arrays. Diagnosing an affected NULL > check is probably quite hard -- it's right there in the code after all > and not all compilers remove it. Hang on, hang on. This is either a compiler bug, or you're wrong on your assumption about the specification of qsort. Either way, the extra layer of indirection is not proper protection. The unwanted compiler optimisation you're inadvertently triggering could still be triggered through the inline. Now, looking at the C standard, this isn't actually clear to me. The standard says that if you call qsort with nmemb zero, the pointer still has to be "valid". Not totally clear to me if NULL is valid, by their definition in C99 7.1.4. Googling hasn't given me a concrete answer. The compiler seems to think that NULL wouldn't be valid, so because you've called qsort on it, you've invoked undefined behaviour if it's NULL, so it's free to elide the NULL check. Kevin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-10-01 16:19 ` René Scharfe 2016-10-03 16:46 ` Kevin Bracey @ 2016-10-03 17:09 ` Kevin Bracey 2016-10-03 22:00 ` René Scharfe 1 sibling, 1 reply; 13+ messages in thread From: Kevin Bracey @ 2016-10-03 17:09 UTC (permalink / raw) To: GIT Mailing-list, René Scharfe On 01/10/2016 19:19, René Scharfe wrote: > > It's hard to imagine an implementation of qsort(3) that can't handle > zero elements. QSORT's safety feature is that it prevents the compiler > from removing NULL checks for the array pointer. E.g. the last two > lines in the following example could be optimized away: > > qsort(ptr, n, sizeof(*ptr), fn); > if (!ptr) > do_stuff(); > > You can see that on https://godbolt.org/g/JwS99b -- an awesome website > for exploring compilation results for small snippets, by the way. > Ah, second attempt. Originally misread the original code, and didn't understand what it was doing. I get it now. A nasty trap I hadn't been aware of - I was under the impression NULL + zero length was generally legal, but the C standard does indeed not give you a specific out for NULL to library functions in that case. As such, NULL checks can still be elided even with your change. If you effectively change your example to: if (nmemb > 1) qsort(array, nmemb, size, cmp); if (!array) printf("array is NULL\n"); array may only be checked for NULL if nmemb <= 1. You can see GCC doing that in the compiler explorer - it effectively turns that into "else if". To make that check really work, you have to do: if (array) qsort(array, nmemb, size, cmp); else printf("array is NULL\n"); So maybe your "sane_qsort" should be checking array, not nmemb. Kevin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-10-03 17:09 ` Kevin Bracey @ 2016-10-03 22:00 ` René Scharfe 2016-10-04 5:28 ` Kevin Bracey 0 siblings, 1 reply; 13+ messages in thread From: René Scharfe @ 2016-10-03 22:00 UTC (permalink / raw) To: Kevin Bracey, GIT Mailing-list Am 03.10.2016 um 19:09 schrieb Kevin Bracey: > As such, NULL checks can still be elided even with your change. If you > effectively change your example to: > > if (nmemb > 1) > qsort(array, nmemb, size, cmp); > if (!array) > printf("array is NULL\n"); > > array may only be checked for NULL if nmemb <= 1. You can see GCC doing > that in the compiler explorer - it effectively turns that into "else > if". We don't support array == NULL together with nmemb > 1, so a segfault is to be expected in such cases, and thus NULL checks can be removed safely. > To make that check really work, you have to do: > > if (array) > qsort(array, nmemb, size, cmp); > else > printf("array is NULL\n"); > > So maybe your "sane_qsort" should be checking array, not nmemb. It would be safe, but arguably too much so, because non-empty arrays with NULL wouldn't segfault anymore, and thus become harder to identify as the programming errors they are. The intention is to support NULL pointers only for empty arrays (in addition to valid pointers). That we also support NULL pointers for arrays with a single member might be considered to be the result of a premature optimization, but it should be safe -- the compiler won't remove checks unexpectedly. Does that make sense (it's getting late here, so my logic might already be resting..)? René ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-10-03 22:00 ` René Scharfe @ 2016-10-04 5:28 ` Kevin Bracey 2016-10-04 20:31 ` René Scharfe 0 siblings, 1 reply; 13+ messages in thread From: Kevin Bracey @ 2016-10-04 5:28 UTC (permalink / raw) To: GIT Mailing-list; +Cc: René Scharfe On 04/10/2016 01:00, René Scharfe wrote: > Am 03.10.2016 um 19:09 schrieb Kevin Bracey: >> As such, NULL checks can still be elided even with your change. If you >> effectively change your example to: >> >> if (nmemb > 1) >> qsort(array, nmemb, size, cmp); >> if (!array) >> printf("array is NULL\n"); >> >> array may only be checked for NULL if nmemb <= 1. You can see GCC doing >> that in the compiler explorer - it effectively turns that into "else >> if". > > We don't support array == NULL together with nmemb > 1, so a segfault > is to be expected in such cases, and thus NULL checks can be removed > safely. > Possibly true in practice. But technically wrong by the C standard - behaviour is undefined if the qsort pointer is invalid. You can't formally expect the defined behaviour of a segfault when sending NULL into qsort. (Hell, maybe the qsort has its own NULL check and silently returns! cf printf - some printfs will segfault when passed NULL, some print "(null)"). I've worked on systems that don't fault reads to NULL, only writes, so those might not segfault there, if NULL appeared sorted... And obviously there's the language lawyer favourite possibility of the call causing nasal flying monkeys or whatever. So if it's not a program error for array to be NULL and nmemb to be zero in your code, and you want a diagnostic for array=NULL, nmemb non-zero, I think you should put that diagnostic into sane_qsort as an assert or something, not rely on qsort's undefined behaviour being a segfault. sane_qsort(blah) { if (nmemb >= 1) { assert(array); qsort(array, nmemb, ...); } } Can't invoke undefined behaviour from NULL without triggering the assert. (Could still have other invalid pointers, of course). Usually I am on the side of "no NULL checks", as I make the assumption that we will get a segfault as soon as NULL pointers are used, and those are generally easy to diagnose. But seeing a compiler invoking this sort of new trickery due to invoking undefined behaviour is making me more nervous about doing so... >> To make that check really work, you have to do: >> >> if (array) >> qsort(array, nmemb, size, cmp); >> else >> printf("array is NULL\n"); >> >> So maybe your "sane_qsort" should be checking array, not nmemb. > > It would be safe, but arguably too much so, because non-empty arrays > with NULL wouldn't segfault anymore, and thus become harder to > identify as the programming errors they are. Well, you get the print. Although I guess you're worrying about the second if being real code, not a debugging check. I must say, this is quite a courageous new optimisation from GCC. It strikes me as finding a language lawyer loophole that seems to have been intended for something else (mapping library functions directly onto CISCy CPU intrinsics), and using it to invent a whole new optimisation that seems more likely to trigger bugs than optimise any significant amount of code in a desirable way. Doubly weird as there's no (standard) language support for this. I don't know how you'd define "my_qsort" that triggered the same optimisations. I've seen similar library-knowledge-without-any-way-to-reproduce-in-user-code optimisations like "malloc returns a new pointer that doesn't alias with anything existing" (and no way to reproduce the optimisation with my_malloc_wrapper). But those seemed to have a clear performance benefit, without any obvious traps. Doubtful about this one. Kevin ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-10-04 5:28 ` Kevin Bracey @ 2016-10-04 20:31 ` René Scharfe 2016-10-05 15:00 ` Kevin Bracey 0 siblings, 1 reply; 13+ messages in thread From: René Scharfe @ 2016-10-04 20:31 UTC (permalink / raw) To: Kevin Bracey, GIT Mailing-list Am 04.10.2016 um 07:28 schrieb Kevin Bracey: > On 04/10/2016 01:00, René Scharfe wrote: >> Am 03.10.2016 um 19:09 schrieb Kevin Bracey: >>> As such, NULL checks can still be elided even with your change. If you >>> effectively change your example to: >>> >>> if (nmemb > 1) >>> qsort(array, nmemb, size, cmp); >>> if (!array) >>> printf("array is NULL\n"); >>> >>> array may only be checked for NULL if nmemb <= 1. You can see GCC doing >>> that in the compiler explorer - it effectively turns that into "else >>> if". >> >> We don't support array == NULL together with nmemb > 1, so a segfault >> is to be expected in such cases, and thus NULL checks can be removed >> safely. >> > Possibly true in practice. > > But technically wrong by the C standard - behaviour is undefined if the > qsort pointer is invalid. You can't formally expect the defined > behaviour of a segfault when sending NULL into qsort. (Hell, maybe the > qsort has its own NULL check and silently returns! A qsort(3) implementation that doesn't segfault is inconvenient, but still safe. I'm more concerned about NULL checks being removed from our code. > So if it's not a program error for array to be NULL and nmemb to be zero > in your code, and you want a diagnostic for array=NULL, nmemb non-zero, > I think you should put that diagnostic into sane_qsort as an assert or > something, not rely on qsort's undefined behaviour being a segfault. > > sane_qsort(blah) > { > if (nmemb >= 1) { > assert(array); > qsort(array, nmemb, ...); > } > } > > Can't invoke undefined behaviour from NULL without triggering the > assert. (Could still have other invalid pointers, of course). We could do that, but I think it's not necessary. We'd get a segfault when accessing the sorted array anyway. (If we don't look at the data after sorting then we can get rid of the sorting step altogether.) > Usually I am on the side of "no NULL checks", as I make the assumption > that we will get a segfault as soon as NULL pointers are used, and those > are generally easy to diagnose. But seeing a compiler invoking this sort > of new trickery due to invoking undefined behaviour is making me more > nervous about doing so... I was shocked a bit myself when I learned about this, but let's not panic. :) >>> To make that check really work, you have to do: >>> >>> if (array) >>> qsort(array, nmemb, size, cmp); >>> else >>> printf("array is NULL\n"); >>> >>> So maybe your "sane_qsort" should be checking array, not nmemb. >> >> It would be safe, but arguably too much so, because non-empty arrays >> with NULL wouldn't segfault anymore, and thus become harder to >> identify as the programming errors they are. > Well, you get the print. Although I guess you're worrying about the > second if being real code, not a debugging check. Yes, but the optimization is valid: If nmemb > 0 then array can only be NULL if we have a bug, and then we'd get a segfault eventually. So such checks can be removed safely. > I must say, this is quite a courageous new optimisation from GCC. It > strikes me as finding a language lawyer loophole that seems to have been > intended for something else (mapping library functions directly onto > CISCy CPU intrinsics), and using it to invent a whole new optimisation > that seems more likely to trigger bugs than optimise any significant > amount of code in a desirable way. Yeah, and the bugs triggered are quite obscure in this case. But having richer type information and thus restricting the range of possible values for variables *can* enable useful optimizations. > Doubly weird as there's no (standard) language support for this. I don't > know how you'd define "my_qsort" that triggered the same optimisations. The nonnull attribute is a GCC extension, but it's also supported by clang: http://clang.llvm.org/docs/AttributeReference.html#nonnull-gnu-nonnull I don't know if other compilers support it as well, or if there are efforts underway to standardize it. > I've seen similar > library-knowledge-without-any-way-to-reproduce-in-user-code > optimisations like "malloc returns a new pointer that doesn't alias with > anything existing" (and no way to reproduce the optimisation with > my_malloc_wrapper). But those seemed to have a clear performance > benefit, without any obvious traps. Doubtful about this one. Still we have to deal with it.. So let's summarize; here's the effect of a raw qsort(3) call: array == NULL nmemb bug QSORT following NULL check ------------- ----- --- ----- -------------------- 0 0 no qsort is skipped 0 >0 no qsort is skipped 1 0 no qsort is skipped (bad!) 1 >0 yes qsort is skipped Here's what the current implementation (nmemb > 1) does: array == NULL nmemb bug QSORT following NULL check ------------- ----- --- ----- -------------------- 0 0 no noop is executed 0 1 no noop is executed 0 >1 no qsort is skipped 1 0 no noop is executed 1 1 yes noop is executed 1 >1 yes qsort is skipped With the micro-optimization removed (nmemb > 0) the matrix gets simpler: array == NULL nmemb bug QSORT following NULL check ------------- ----- --- ----- -------------------- 0 0 no noop is executed 0 >0 no qsort is skipped 1 0 no noop is executed 1 >0 yes qsort is skipped And with your NULL check (array != NULL) we'd get: array == NULL nmemb bug QSORT following NULL check ------------- ----- --- ----- -------------------- 0 0 no qsort reuses check result 0 >0 no qsort reuses check result 1 0 no noop reuses check result 1 >0 yes noop reuses check result Did I get it right? AFAICS all variants (except raw qsort) are safe -- no useful NULL checks are removed, and buggy code should be noticed by segfaults in code accessing the sorted array. So the advantage of the current code is that it won't call qsort for nmemb <= 1. And the advantage of checking the pointer is that the result of that check can be reused by later checks. I think the former is more useful, but only slightly. René ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] add QSORT 2016-10-04 20:31 ` René Scharfe @ 2016-10-05 15:00 ` Kevin Bracey 0 siblings, 0 replies; 13+ messages in thread From: Kevin Bracey @ 2016-10-05 15:00 UTC (permalink / raw) To: René Scharfe; +Cc: GIT Mailing-list On 04/10/2016 23:31, René Scharfe wrote: > > So let's summarize; here's the effect of a raw qsort(3) call: > > array == NULL nmemb bug QSORT following NULL check > ------------- ----- --- ----- -------------------- > 0 0 no qsort is skipped > 0 >0 no qsort is skipped > 1 0 no qsort is skipped (bad!) ****** > 1 >0 yes qsort is skipped ****** > Right - row 3 may not be a bug from the point of view of your internals, but it means you violate the API of qsort.Therefore a fix is required. > With the micro-optimization removed (nmemb > 0) the matrix gets simpler: > > array == NULL nmemb bug QSORT following NULL check > ------------- ----- --- ----- -------------------- > 0 0 no noop is executed > 0 >0 no qsort is skipped > 1 0 no noop is executed > 1 >0 yes qsort is skipped ****** > > And with your NULL check (array != NULL) we'd get: > > array == NULL nmemb bug QSORT following NULL check > ------------- ----- --- ----- -------------------- > 0 0 no qsort reuses check result > 0 >0 no qsort reuses check result > 1 0 no noop reuses check result > 1 >0 yes noop reuses check result > > Did I get it right? AFAICS all variants (except raw qsort) are safe > -- no useful NULL checks are removed, and buggy code should be noticed > by segfaults in code accessing the sorted array. I think your tables are correct. But I disagree that you could ever call invoking the "****" lines safe. Unless you have documentation on what limit GCC (and your other compilers) are prepared to put on the undefined behaviour of violating that "non-null" constraint. Up to now dereferencing a null pointer has been implicitly (or explicitly?) defined as simply generating SIGSEGV. And that has naturally extended into NULL passed to library implementations. But that's no longer true - it seems bets are somewhat off. But, as long as you are confident you never invoke that line without a program bug - ie an API precondition of your own QSORT is that NULL is legal iff nmemb is zero, then I guess it's fine. Behaviour is defined, as long as you don't violate your internal preconditions. Kevin ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2016-10-05 16:16 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-09-29 15:23 [PATCH 1/3] add QSORT René Scharfe 2016-09-29 15:27 ` [PATCH 2/3] use QSORT René Scharfe 2016-09-29 15:29 ` [PATCH 3/3] remove unnecessary check before QSORT René Scharfe 2016-09-29 22:36 ` [PATCH 1/3] add QSORT Junio C Hamano 2016-09-29 23:21 ` René Scharfe 2016-09-29 23:40 ` René Scharfe 2016-10-01 16:19 ` René Scharfe 2016-10-03 16:46 ` Kevin Bracey 2016-10-03 17:09 ` Kevin Bracey 2016-10-03 22:00 ` René Scharfe 2016-10-04 5:28 ` Kevin Bracey 2016-10-04 20:31 ` René Scharfe 2016-10-05 15:00 ` Kevin Bracey
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).