From: Yann Dirson <ydirson@altern.org>
To: git@vger.kernel.org
Cc: Yann Dirson <ydirson@altern.org>
Subject: [PATCH 1/6] Introduce sorted-array binary-search function.
Date: Wed, 8 Dec 2010 23:51:30 +0100 [thread overview]
Message-ID: <1291848695-24601-2-git-send-email-ydirson@altern.org> (raw)
In-Reply-To: <1291848695-24601-1-git-send-email-ydirson@altern.org>
We use a cpp-based template mechanism to declare the array and its
management data, as well as a search function.
Thanks to Jonathan Nieder for this design idea.
Signed-off-by: Yann Dirson <ydirson@altern.org>
---
Documentation/technical/api-sorted-array.txt | 132 ++++++++++++++++++++++++++
Makefile | 1 +
sorted-array.h | 128 +++++++++++++++++++++++++
3 files changed, 261 insertions(+), 0 deletions(-)
create mode 100644 Documentation/technical/api-sorted-array.txt
create mode 100644 sorted-array.h
diff --git a/Documentation/technical/api-sorted-array.txt b/Documentation/technical/api-sorted-array.txt
new file mode 100644
index 0000000..b1238ad
--- /dev/null
+++ b/Documentation/technical/api-sorted-array.txt
@@ -0,0 +1,132 @@
+sorted-array API
+================
+
+The sorted-array API is meant to provide efficient binary-search
+functions into a C array of elements of arbitrary type, and insert
+functions that maintain ordering.
+
+It is meant for data structures where lookup time is much more
+important than insertion type: while lookup has o(log(n)) time
+complexity, insertion has o(n) and involves many copies.
+
+API overview
+------------
+
+The API allow to declare all variables and function as static or not,
+through the first argument to all macros, which can be `static` or
+empty.
+
+It is based on the idea of providing custom functions, which will then
+be used by generic algorithms:
+* one for comparing a search term with an array item
+* one for initializing a new array item from a search term after insertion
+
+A set of convenience wrapper macros are provided, to define a readable
+API for inserting and sorting.
+
+Note that the type of the search term can be different from the type
+of array elements (which can be more expensive to create).
+
+API reference
+-------------
+
+The macros can be categorized as follows:
+
+* the one used to define a sorted array, and its management variable
+ holding currently-allocated number of elements and number of
+ elements effectively used. The name of those (int) management
+ variables are crafted by using the `LISTNAME` and appending
+ respectively `_alloc` and `_nr`.
++
+----
+declare_sorted_array(MAYBESTATIC,ELEMTYPE,LISTNAME)
+----
++
+eg:
++
+----
+declare_sorted_array(static, struct diff_rename_dst, rename_dst);
+----
+
+* those defining ready-for-use search and insert functions. They
+ exist in several flavours, distinguished by their suffix, which
+ refer to different return-type semantics.
++
+----
+declare_sorted_array_search_*(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LISTNAME,CMP)
+declare_sorted_array_insert_*(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LISTNAME,CMP,INIT)
+----
++
+They all work by defining behind the scene a (static) generic search or insert
+function named with `_gen_` prepended to `FUNCNAME`. Such generic functions
+may be useful to know about if you need to define an insert function with
+the exact same parameters.
++
+All functions of a given type (search or insert) take the same kind of
+arguments:
+
+`MAYBESTATIC`::
+ the ubiquitous `static`or empty placeholder
+`ELEMTYPE`::
+ type of element to be stored in the array
+`FUNCNAME`::
+ name of the function to be defined
+`INITTYPE`::
+ type of search term / initializer passed as argument to the
+ generated function
+`LISTNAME`::
+ name of the array in which the search takes place, usually declared
+ through `declare_sorted_array()`
+`CMP`::
+ comparison function taking an INITTYPE argument and a
+ pointer to an ELEMTYPE argument, and returning an int with
+ the same meaning as `strcmp()`.
+`GENSEARCH`::
+ (insert only) generic search function generated by
+ `declare_gen_binsearch()`
+`INIT`::
+ (insert only) element-initializer function taking as arguments
+ pointer to the ELEMTYPE to be initialized, and an INITTYPE argument
+ to take information from
+
+
+Suffix meanings are as follows:
+
+`check`::
+
+ Returns the position of the element if found pre-existing in
+ the list, or if not found, (-pos - 1) where pos is where the
+ element would have been (for search funcs) or has been (for
+ insert funcs) inserted.
+
+`checkbool`::
+
+ Returns an int, just telling whether the searched element was
+ pre-existing in the list (1) or not (0).
+
+`elem`::
+
+ Returns NULL if the search term was not found (for search
+ funcs), or a pointer to the element found or newly inserted.
+
+* those defining even-more-ready-for-use insert functions, which do
+ not need to define the generic search function at all. Those macro
+ names are the same as the forms with an `GENSEARCH` argument, but
+ with a name ending in `_insertonly_*`, and without that `GENSEARCH`
+ argument. The generic (static) search functions are named as
+ `_gensearch` prepended to the `FUNCNAME` argument.
+
+* those defining the generic algorithms
++
+They rarely need to be used directly, they are usually expanded
+transparently from the `declare_sorted_array_*` macros.
++
+----
+declare_gen_binsearch(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE)
+declare_gen_sorted_insert(MAYBESTATIC,ELEMTYPE,FUNCNAME,SEARCHFUNC,INITTYPE)
+----
+
+* those defining the individual wrappers
++
+They rarely need to be used directly, they are usually expanded
+transparently from the `declare_sorted_array_*` macros.
diff --git a/Makefile b/Makefile
index 7eb948d..522261e 100644
--- a/Makefile
+++ b/Makefile
@@ -544,6 +544,7 @@ LIB_H += run-command.h
LIB_H += sha1-lookup.h
LIB_H += sideband.h
LIB_H += sigchain.h
+LIB_H += sorted-array.h
LIB_H += strbuf.h
LIB_H += string-list.h
LIB_H += submodule.h
diff --git a/sorted-array.h b/sorted-array.h
new file mode 100644
index 0000000..added0e
--- /dev/null
+++ b/sorted-array.h
@@ -0,0 +1,128 @@
+#ifndef SORTED_ARRAY_H_
+#define SORTED_ARRAY_H_
+
+/*
+ * Sorted-array management.
+ * See Documentation/technical/api-sorted-array.txt
+ */
+
+#define declare_sorted_array(MAYBESTATIC,ELEMTYPE,LIST) \
+MAYBESTATIC ELEMTYPE *LIST; \
+MAYBESTATIC int LIST##_nr, LIST##_alloc;
+
+#define declare_gen_binsearch(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE) \
+MAYBESTATIC int FUNCNAME( \
+ ELEMTYPE *list, int list_nr, \
+ int(*cmp_func)(INITTYPE ref, ELEMTYPE *elem), \
+ INITTYPE data) \
+{ \
+ int lo, hi; \
+ \
+ lo = 0; \
+ hi = list_nr; \
+ while (hi > lo) { \
+ int mid = (hi + lo) >> 1; \
+ int cmp = cmp_func(data, list + mid); \
+ if (!cmp) \
+ return mid; \
+ if (cmp < 0) \
+ hi = mid; \
+ else \
+ lo = mid + 1; \
+ } \
+ return -lo - 1; \
+}
+
+#define declare_gen_sorted_insert(MAYBESTATIC,ELEMTYPE,FUNCNAME,SEARCHFUNC,INITTYPE) \
+MAYBESTATIC int FUNCNAME( \
+ ELEMTYPE **list_p, int *list_nr_p, int *list_alloc_p, \
+ int(*cmp_func)(INITTYPE ref, ELEMTYPE *elem), \
+ void(*init_func)(ELEMTYPE *elem, INITTYPE init), \
+ INITTYPE data) \
+{ \
+ int pos = SEARCHFUNC(*list_p, *list_nr_p, cmp_func, data); \
+ if (pos >= 0) \
+ return pos; \
+ /* not found */ \
+ pos = -pos - 1; \
+ /* insert to make it at "pos" */ \
+ if (*list_alloc_p <= *list_nr_p) { \
+ (*list_alloc_p) = alloc_nr((*list_alloc_p)); \
+ *list_p = xrealloc(*list_p, \
+ (*list_alloc_p) * sizeof(**list_p)); \
+ } \
+ (*list_nr_p)++; \
+ if (pos < *list_nr_p) \
+ memmove(*list_p + pos + 1, *list_p + pos, \
+ (*list_nr_p - pos - 1) * sizeof(**list_p)); \
+ init_func(&(*list_p)[pos], data); \
+ return -pos - 1; \
+}
+
+#define _declare_sorted_array_search_check(MAYBESTATIC,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP) \
+MAYBESTATIC int FUNCNAME(INITTYPE data) \
+{ \
+ return GENSEARCH(LIST, LIST##_nr, CMP, data); \
+}
+#define declare_sorted_array_search_check(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP) \
+ declare_gen_binsearch(MAYBESTATIC,ELEMTYPE,_gen_##FUNCNAME,INITTYPE); \
+ _declare_sorted_array_search_check(MAYBESTATIC,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP);
+
+#define _declare_sorted_array_insert_check(MAYBESTATIC,FUNCNAME,INITTYPE,GENINSERT,LIST,CMP,INIT) \
+MAYBESTATIC int FUNCNAME(INITTYPE data) \
+{ \
+ return GENINSERT(&LIST, &LIST##_nr, &LIST##_alloc, \
+ CMP, INIT, data); \
+}
+#define declare_sorted_array_insert_check(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP,INIT) \
+ declare_gen_sorted_insert(static,ELEMTYPE,_gen_##FUNCNAME,GENSEARCH,INITTYPE); \
+ _declare_sorted_array_insert_check(MAYBESTATIC,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP,INIT);
+#define declare_sorted_array_insertonly_check(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP,INIT) \
+ declare_gen_binsearch(static,ELEMTYPE,_gensearch_##FUNCNAME,INITTYPE); \
+ declare_sorted_array_insert_check(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gensearch_##FUNCNAME,LIST,CMP,INIT);
+
+#define _declare_sorted_array_insert_checkbool(MAYBESTATIC,FUNCNAME,INITTYPE,GENINSERT,LIST,CMP,INIT) \
+MAYBESTATIC int FUNCNAME(INITTYPE data) \
+{ \
+ int idx = GENINSERT(&LIST, &LIST##_nr, &LIST##_alloc, \
+ CMP, INIT, data); \
+ if (idx < 0) \
+ return 0; \
+ return 1; \
+}
+#define declare_sorted_array_insert_checkbool(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP,INIT) \
+ declare_gen_sorted_insert(static,ELEMTYPE,_gen_##FUNCNAME,GENSEARCH,INITTYPE); \
+ _declare_sorted_array_insert_checkbool(MAYBESTATIC,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP,INIT);
+#define declare_sorted_array_insertonly_checkbool(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP,INIT) \
+ declare_gen_binsearch(static,ELEMTYPE,_gensearch_##FUNCNAME,INITTYPE); \
+ declare_sorted_array_insert_checkbool(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gensearch_##FUNCNAME,LIST,CMP,INIT);
+
+#define _declare_sorted_array_search_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP) \
+MAYBESTATIC ELEMTYPE *FUNCNAME(INITTYPE data) \
+{ \
+ int idx = GENSEARCH(LIST, LIST##_nr, CMP, data); \
+ if (idx < 0) \
+ return NULL; \
+ return &(LIST[idx]); \
+}
+#define declare_sorted_array_search_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP) \
+ declare_gen_binsearch(static,ELEMTYPE,_gen_##FUNCNAME,INITTYPE); \
+ _declare_sorted_array_search_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP);
+
+#define _declare_sorted_array_insert_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENINSERT,LIST,CMP,INIT) \
+MAYBESTATIC ELEMTYPE *FUNCNAME(INITTYPE data) \
+{ \
+ int idx = GENINSERT(&LIST, &LIST##_nr, &LIST##_alloc, \
+ CMP, INIT, data); \
+ if (idx < 0) \
+ idx = -idx - 1; \
+ return &(LIST[idx]); \
+}
+#define declare_sorted_array_insert_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP,INIT) \
+ declare_gen_sorted_insert(static,ELEMTYPE,_gen_##FUNCNAME,GENSEARCH,INITTYPE); \
+ _declare_sorted_array_insert_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP,INIT);
+#define declare_sorted_array_insertonly_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP,INIT) \
+ declare_gen_binsearch(static,ELEMTYPE,_gensearch_##FUNCNAME,INITTYPE); \
+ declare_sorted_array_insert_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gensearch_##FUNCNAME,LIST,CMP,INIT);
+
+#endif
--
1.7.2.3
next prev parent reply other threads:[~2010-12-08 22:52 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-12-08 22:51 [PATCH v6] generalizing sorted-array handling Yann Dirson
2010-12-08 22:51 ` Yann Dirson [this message]
2010-12-10 22:29 ` [PATCH 1/6] Introduce sorted-array binary-search function Junio C Hamano
2010-12-30 0:40 ` Yann Dirson
2010-12-30 1:06 ` Erik Faye-Lund
2010-12-30 10:49 ` Yann Dirson
2010-12-08 22:51 ` [PATCH 2/6] Convert diffcore-rename's rename_dst to the new sorted-array API Yann Dirson
2010-12-10 22:32 ` Junio C Hamano
2010-12-08 22:51 ` [PATCH 3/6] Convert diffcore-rename's rename_src " Yann Dirson
2010-12-08 22:51 ` [PATCH 4/6] Convert pack-objects.c " Yann Dirson
2010-12-08 22:51 ` [PATCH 5/6] Use sorted-array API for commit.c's commit_graft Yann Dirson
2010-12-08 22:51 ` [PATCH 6/6] [RFC] subvert sorted-array to replace binary-search in unpack-objects Yann Dirson
2010-12-10 23:00 ` Junio C Hamano
2010-12-10 23:22 ` [PATCH v6] generalizing sorted-array handling Junio C Hamano
2010-12-30 0:01 ` Yann Dirson
-- strict thread matches above, loose matches on Subject: below --
2010-12-05 10:34 [PATCH v5] " Yann Dirson
2010-12-05 10:34 ` [PATCH 1/6] Introduce sorted-array binary-search function Yann Dirson
2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
2010-11-29 22:57 ` [PATCH 1/6] Introduce sorted-array binary-search function Yann Dirson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1291848695-24601-2-git-send-email-ydirson@altern.org \
--to=ydirson@altern.org \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).