git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Yann Dirson <ydirson@altern.org>
To: git@vger.kernel.org
Cc: Yann Dirson <ydirson@altern.org>
Subject: [PATCH 1/6] Introduce sorted-array binary-search function.
Date: Wed,  8 Dec 2010 23:51:30 +0100	[thread overview]
Message-ID: <1291848695-24601-2-git-send-email-ydirson@altern.org> (raw)
In-Reply-To: <1291848695-24601-1-git-send-email-ydirson@altern.org>

We use a cpp-based template mechanism to declare the array and its
management data, as well as a search function.
Thanks to Jonathan Nieder for this design idea.

Signed-off-by: Yann Dirson <ydirson@altern.org>
---
 Documentation/technical/api-sorted-array.txt |  132 ++++++++++++++++++++++++++
 Makefile                                     |    1 +
 sorted-array.h                               |  128 +++++++++++++++++++++++++
 3 files changed, 261 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/technical/api-sorted-array.txt
 create mode 100644 sorted-array.h

diff --git a/Documentation/technical/api-sorted-array.txt b/Documentation/technical/api-sorted-array.txt
new file mode 100644
index 0000000..b1238ad
--- /dev/null
+++ b/Documentation/technical/api-sorted-array.txt
@@ -0,0 +1,132 @@
+sorted-array API
+================
+
+The sorted-array API is meant to provide efficient binary-search
+functions into a C array of elements of arbitrary type, and insert
+functions that maintain ordering.
+
+It is meant for data structures where lookup time is much more
+important than insertion type: while lookup has o(log(n)) time
+complexity, insertion has o(n) and involves many copies.
+
+API overview
+------------
+
+The API allow to declare all variables and function as static or not,
+through the first argument to all macros, which can be `static` or
+empty.
+
+It is based on the idea of providing custom functions, which will then
+be used by generic algorithms:
+* one for comparing a search term with an array item
+* one for initializing a new array item from a search term after insertion
+
+A set of convenience wrapper macros are provided, to define a readable
+API for inserting and sorting.
+
+Note that the type of the search term can be different from the type
+of array elements (which can be more expensive to create).
+
+API reference
+-------------
+
+The macros can be categorized as follows:
+
+* the one used to define a sorted array, and its management variable
+  holding currently-allocated number of elements and number of
+  elements effectively used.  The name of those (int) management
+  variables are crafted by using the `LISTNAME` and appending
+  respectively `_alloc` and `_nr`.
++
+----
+declare_sorted_array(MAYBESTATIC,ELEMTYPE,LISTNAME)
+----
++
+eg:
++
+----
+declare_sorted_array(static, struct diff_rename_dst, rename_dst);
+----
+
+* those defining ready-for-use search and insert functions.  They
+  exist in several flavours, distinguished by their suffix, which
+  refer to different return-type semantics.
++
+----
+declare_sorted_array_search_*(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LISTNAME,CMP)
+declare_sorted_array_insert_*(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LISTNAME,CMP,INIT)
+----
++
+They all work by defining behind the scene a (static) generic search or insert
+function named with `_gen_` prepended to `FUNCNAME`.  Such generic functions
+may be useful to know about if you need to define an insert function with
+the exact same parameters.
++
+All functions of a given type (search or insert) take the same kind of
+arguments:
+
+`MAYBESTATIC`::
+	the ubiquitous `static`or empty placeholder
+`ELEMTYPE`::
+	type of element to be stored in the array
+`FUNCNAME`::
+	name of the function to be defined
+`INITTYPE`::
+	type of search term / initializer passed as argument to the
+	generated function
+`LISTNAME`::
+	name of the array in which the search takes place, usually declared
+	through `declare_sorted_array()`
+`CMP`::
+	comparison function taking an INITTYPE argument and a
+	pointer to an ELEMTYPE argument, and returning an int with
+	the same meaning as `strcmp()`.
+`GENSEARCH`::
+	(insert only) generic search function generated by
+	`declare_gen_binsearch()`
+`INIT`::
+	(insert only) element-initializer function taking as arguments
+	pointer to the ELEMTYPE to be initialized, and an INITTYPE argument
+	to take information from
+
+
+Suffix meanings are as follows:
+
+`check`::
+
+	Returns the position of the element if found pre-existing in
+	the list, or if not found, (-pos - 1) where pos is where the
+	element would have been (for search funcs) or has been (for
+	insert funcs) inserted.
+
+`checkbool`::
+
+	Returns an int, just telling whether the searched element was
+	pre-existing in the list (1) or not (0).
+
+`elem`::
+
+	Returns NULL if the search term was not found (for search
+	funcs), or a pointer to the element found or newly inserted.
+
+* those defining even-more-ready-for-use insert functions, which do
+  not need to define the generic search function at all.  Those macro
+  names are the same as the forms with an `GENSEARCH` argument, but
+  with a name ending in `_insertonly_*`, and without that `GENSEARCH`
+  argument.  The generic (static) search functions are named as
+  `_gensearch` prepended to the `FUNCNAME` argument.
+
+* those defining the generic algorithms
++
+They rarely need to be used directly, they are usually expanded
+transparently from the `declare_sorted_array_*` macros.
++
+----
+declare_gen_binsearch(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE)
+declare_gen_sorted_insert(MAYBESTATIC,ELEMTYPE,FUNCNAME,SEARCHFUNC,INITTYPE)
+----
+
+* those defining the individual wrappers
++
+They rarely need to be used directly, they are usually expanded
+transparently from the `declare_sorted_array_*` macros.
diff --git a/Makefile b/Makefile
index 7eb948d..522261e 100644
--- a/Makefile
+++ b/Makefile
@@ -544,6 +544,7 @@ LIB_H += run-command.h
 LIB_H += sha1-lookup.h
 LIB_H += sideband.h
 LIB_H += sigchain.h
+LIB_H += sorted-array.h
 LIB_H += strbuf.h
 LIB_H += string-list.h
 LIB_H += submodule.h
diff --git a/sorted-array.h b/sorted-array.h
new file mode 100644
index 0000000..added0e
--- /dev/null
+++ b/sorted-array.h
@@ -0,0 +1,128 @@
+#ifndef SORTED_ARRAY_H_
+#define SORTED_ARRAY_H_
+
+/*
+ * Sorted-array management.
+ * See Documentation/technical/api-sorted-array.txt
+ */
+
+#define declare_sorted_array(MAYBESTATIC,ELEMTYPE,LIST)			\
+MAYBESTATIC ELEMTYPE *LIST;						\
+MAYBESTATIC int LIST##_nr, LIST##_alloc;
+
+#define declare_gen_binsearch(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE)	\
+MAYBESTATIC int FUNCNAME(						\
+	ELEMTYPE *list, int list_nr,					\
+	int(*cmp_func)(INITTYPE ref, ELEMTYPE *elem),			\
+	INITTYPE data)							\
+{									\
+	int lo, hi;							\
+									\
+	lo = 0;								\
+	hi = list_nr;							\
+	while (hi > lo) {						\
+		int mid = (hi + lo) >> 1;				\
+		int cmp = cmp_func(data, list + mid);			\
+		if (!cmp)						\
+			return mid;					\
+		if (cmp < 0)						\
+			hi = mid;					\
+		else							\
+			lo = mid + 1;					\
+	}								\
+	return -lo - 1;							\
+}
+
+#define declare_gen_sorted_insert(MAYBESTATIC,ELEMTYPE,FUNCNAME,SEARCHFUNC,INITTYPE) \
+MAYBESTATIC int FUNCNAME(						\
+	ELEMTYPE **list_p, int *list_nr_p, int *list_alloc_p,		\
+	int(*cmp_func)(INITTYPE ref, ELEMTYPE *elem),			\
+	void(*init_func)(ELEMTYPE *elem, INITTYPE init),		\
+	INITTYPE data)							\
+{									\
+	int pos = SEARCHFUNC(*list_p, *list_nr_p, cmp_func, data);	\
+	if (pos >= 0) 							\
+		return pos;						\
+	/* not found */							\
+	pos = -pos - 1;							\
+	/* insert to make it at "pos" */				\
+	if (*list_alloc_p <= *list_nr_p) {				\
+		(*list_alloc_p) = alloc_nr((*list_alloc_p));		\
+		*list_p = xrealloc(*list_p,				\
+				   (*list_alloc_p) * sizeof(**list_p)); \
+	}								\
+	(*list_nr_p)++;							\
+	if (pos < *list_nr_p)						\
+		memmove(*list_p + pos + 1, *list_p + pos,		\
+			(*list_nr_p - pos - 1) * sizeof(**list_p));	\
+	init_func(&(*list_p)[pos], data);				\
+	return -pos - 1;						\
+}
+
+#define _declare_sorted_array_search_check(MAYBESTATIC,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP) \
+MAYBESTATIC int FUNCNAME(INITTYPE data)					\
+{									\
+	return GENSEARCH(LIST, LIST##_nr, CMP, data);			\
+}
+#define declare_sorted_array_search_check(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP) \
+  declare_gen_binsearch(MAYBESTATIC,ELEMTYPE,_gen_##FUNCNAME,INITTYPE); \
+  _declare_sorted_array_search_check(MAYBESTATIC,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP);
+
+#define _declare_sorted_array_insert_check(MAYBESTATIC,FUNCNAME,INITTYPE,GENINSERT,LIST,CMP,INIT) \
+MAYBESTATIC int FUNCNAME(INITTYPE data)					\
+{									\
+	return GENINSERT(&LIST, &LIST##_nr, &LIST##_alloc,		\
+			 CMP, INIT, data);				\
+}
+#define declare_sorted_array_insert_check(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP,INIT) \
+  declare_gen_sorted_insert(static,ELEMTYPE,_gen_##FUNCNAME,GENSEARCH,INITTYPE); \
+  _declare_sorted_array_insert_check(MAYBESTATIC,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP,INIT);
+#define declare_sorted_array_insertonly_check(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP,INIT) \
+  declare_gen_binsearch(static,ELEMTYPE,_gensearch_##FUNCNAME,INITTYPE); \
+  declare_sorted_array_insert_check(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gensearch_##FUNCNAME,LIST,CMP,INIT);
+
+#define _declare_sorted_array_insert_checkbool(MAYBESTATIC,FUNCNAME,INITTYPE,GENINSERT,LIST,CMP,INIT) \
+MAYBESTATIC int FUNCNAME(INITTYPE data)					\
+{									\
+	int idx = GENINSERT(&LIST, &LIST##_nr, &LIST##_alloc,		\
+			    CMP, INIT, data);				\
+	if (idx < 0)							\
+		return 0;						\
+	return 1;							\
+}
+#define declare_sorted_array_insert_checkbool(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP,INIT) \
+  declare_gen_sorted_insert(static,ELEMTYPE,_gen_##FUNCNAME,GENSEARCH,INITTYPE); \
+  _declare_sorted_array_insert_checkbool(MAYBESTATIC,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP,INIT);
+#define declare_sorted_array_insertonly_checkbool(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP,INIT) \
+  declare_gen_binsearch(static,ELEMTYPE,_gensearch_##FUNCNAME,INITTYPE); \
+  declare_sorted_array_insert_checkbool(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gensearch_##FUNCNAME,LIST,CMP,INIT);
+
+#define _declare_sorted_array_search_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP) \
+MAYBESTATIC ELEMTYPE *FUNCNAME(INITTYPE data)				\
+{									\
+	int idx = GENSEARCH(LIST, LIST##_nr, CMP, data);		\
+	if (idx < 0)							\
+		return NULL;						\
+	return &(LIST[idx]);						\
+}
+#define declare_sorted_array_search_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP) \
+  declare_gen_binsearch(static,ELEMTYPE,_gen_##FUNCNAME,INITTYPE); \
+  _declare_sorted_array_search_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP);
+
+#define _declare_sorted_array_insert_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENINSERT,LIST,CMP,INIT) \
+MAYBESTATIC ELEMTYPE *FUNCNAME(INITTYPE data)				\
+{									\
+	int idx = GENINSERT(&LIST, &LIST##_nr, &LIST##_alloc,		\
+			    CMP, INIT, data);				\
+	if (idx < 0)							\
+		idx = -idx - 1;						\
+	return &(LIST[idx]);						\
+}
+#define declare_sorted_array_insert_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,GENSEARCH,LIST,CMP,INIT) \
+  declare_gen_sorted_insert(static,ELEMTYPE,_gen_##FUNCNAME,GENSEARCH,INITTYPE); \
+  _declare_sorted_array_insert_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gen_##FUNCNAME,LIST,CMP,INIT);
+#define declare_sorted_array_insertonly_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,LIST,CMP,INIT) \
+  declare_gen_binsearch(static,ELEMTYPE,_gensearch_##FUNCNAME,INITTYPE); \
+  declare_sorted_array_insert_elem(MAYBESTATIC,ELEMTYPE,FUNCNAME,INITTYPE,_gensearch_##FUNCNAME,LIST,CMP,INIT);
+
+#endif
-- 
1.7.2.3

  reply	other threads:[~2010-12-08 22:52 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-08 22:51 [PATCH v6] generalizing sorted-array handling Yann Dirson
2010-12-08 22:51 ` Yann Dirson [this message]
2010-12-10 22:29   ` [PATCH 1/6] Introduce sorted-array binary-search function Junio C Hamano
2010-12-30  0:40     ` Yann Dirson
2010-12-30  1:06       ` Erik Faye-Lund
2010-12-30 10:49         ` Yann Dirson
2010-12-08 22:51 ` [PATCH 2/6] Convert diffcore-rename's rename_dst to the new sorted-array API Yann Dirson
2010-12-10 22:32   ` Junio C Hamano
2010-12-08 22:51 ` [PATCH 3/6] Convert diffcore-rename's rename_src " Yann Dirson
2010-12-08 22:51 ` [PATCH 4/6] Convert pack-objects.c " Yann Dirson
2010-12-08 22:51 ` [PATCH 5/6] Use sorted-array API for commit.c's commit_graft Yann Dirson
2010-12-08 22:51 ` [PATCH 6/6] [RFC] subvert sorted-array to replace binary-search in unpack-objects Yann Dirson
2010-12-10 23:00   ` Junio C Hamano
2010-12-10 23:22 ` [PATCH v6] generalizing sorted-array handling Junio C Hamano
2010-12-30  0:01   ` Yann Dirson
  -- strict thread matches above, loose matches on Subject: below --
2010-12-05 10:34 [PATCH v5] " Yann Dirson
2010-12-05 10:34 ` [PATCH 1/6] Introduce sorted-array binary-search function Yann Dirson
2010-11-29 22:57 [PATCH v4] generalizing sorted-array handling Yann Dirson
2010-11-29 22:57 ` [PATCH 1/6] Introduce sorted-array binary-search function Yann Dirson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1291848695-24601-2-git-send-email-ydirson@altern.org \
    --to=ydirson@altern.org \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).