bug-gnulib@gnu.org mirror (unofficial)
 help / color / mirror / Atom feed
* [PATCH 1/1] hamt: Document the module in the Gnulib manual.
@ 2021-04-04  8:39 Marc Nieper-Wißkirchen
  0 siblings, 0 replies; only message in thread
From: Marc Nieper-Wißkirchen @ 2021-04-04  8:39 UTC (permalink / raw)
  To: bug-gnulib; +Cc: Marc Nieper-Wißkirchen

From: Marc Nieper-Wißkirchen <marc@nieper-wisskirchen.de>

Suggested by Bruno Haible in
<https://lists.gnu.org/archive/html/bug-gnulib/2021-04/msg00026.html>.
* doc/containers.texi: Add a subsection to section 15.11 Container
data types.
* lib/hamt.h: Improve documentation on how Hamt_entry is supposed
to be used.
---
 ChangeLog           | 10 +++++++++
 doc/containers.texi | 55 +++++++++++++++++++++++++++++++++++++++++++++
 lib/hamt.h          | 11 +++++----
 3 files changed, 72 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4a665c275..7b296de04 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2021-04-04  Marc Nieper-Wißkirchen  <marc@nieper-wisskirchen.de>
+
+	hamt: Document the module in the Gnulib manual.
+	Suggested by Bruno Haible in
+	<https://lists.gnu.org/archive/html/bug-gnulib/2021-04/msg00026.html>.
+	* doc/containers.texi: Add a subsection to section 15.11 Container
+	data types.
+	* lib/hamt.h: Improve documentation on how Hamt_entry is supposed
+	to be used.
+
 2021-04-03  Paul Eggert  <eggert@cs.ucla.edu>
 
 	savedir: avoid unlikely undefined behavior
diff --git a/doc/containers.texi b/doc/containers.texi
index 15c915b93..a8cfebc14 100644
--- a/doc/containers.texi
+++ b/doc/containers.texi
@@ -35,6 +35,9 @@ log
 Gnulib provides several generic container data types.  They can be used
 to organize collections of application-defined objects.
 
+@node Ordinary containers
+@subsection Ordinary container data types
+
 @multitable @columnfractions .15 .5 .1 .1 .15
 @headitem Data type
 @tab Details
@@ -599,6 +602,58 @@ For C++, Gnulib provides a C++ template class for each of these container data t
 @tab @code{"gl_omap.hh"}
 @end multitable
 
+@node Specialized containers
+@subsection Specialized container data types
+
+The @code{hamt} module implements the hash array mapped trie (HAMT) data
+structure.  This is a data structure that contains (key, value) pairs.
+Lookup of a (key, value) pair given the key is on average an @math{O(1)}
+operation, assuming a good hash function for the keys is employed.
+
+The HAMT data structure is useful when you want modifications (additions
+of pairs, removal, value changes) to be visible only to some part of
+your program, whereas other parts of the program continue to use the
+unmodified HAMT.  The HAMT makes this possible in a space-efficient
+manner: the modified and the unmodified HAMT share most of their
+allocated memory.  It is also time-efficient: Every such modification
+is @math{O(1)} on average, again assuming a good hash function for the keys.
+
+A HAMT can be used whenever an ordinary hash table would be used.  It
+does however, provide non-destructive updating operations without the
+need to copy the whole container. On the other hand, a hash table is
+simpler so that its performance may be better when non-destructive
+update operations are not needed.
+
+For example, a HAMT can be used to model the dynamic environment in a
+LISP interpreter.  Updating a value in the dynamic environment of one
+continuation frame would not modify values in earlier frames.
+
+To use the module, include @code{hamt.h} in your code.  The public
+interface is documented in that header file.  You have to provide a hash
+function and an equivalence relation, which defines key equality.  The
+module includes a test file @code{test-hamt.c}, which demonstrates how
+the API can be used.
+
+In the current implementation, each inner node of the HAMT can store up
+to @math{32 = 2^5} entries and subtries.  Whenever a collision between
+the initial bits of the hash values of two entries would happen, the
+next @math{5} bits of the hash values are examined and the two entries
+pushed down one level in the trie.
+
+HAMTs have the same average access times as hash tables but grow and
+shrink dynamically, so they use memory more economically and do not have
+to be periodically resized.
+
+They were described and analyzed in @cite{Phil Bagwell (2000). Ideal
+Hash Trees (Report). Infoscience Department, École Polytechnique
+Fédérale de Lausanne.}
+
+The persistence aspect of the HAMT data structure, which means that each
+updating operation (like inserting, replacing, or removing an entry)
+returns a new HAMT while leaving the original one intact, is achieved
+through structure sharing, which is even safe in the presence of
+multiple threads when the used C compiler supports atomics.
+
 @ifnottex
 @unmacro log
 @end ifnottex
diff --git a/lib/hamt.h b/lib/hamt.h
index 55ee964dd..25a0ad9f9 100644
--- a/lib/hamt.h
+++ b/lib/hamt.h
@@ -78,10 +78,13 @@ _GL_INLINE_HEADER_BEGIN
 /************/
 
 /* A hamt stores pointers to elements.  Each element has to be a
-   struct whose initial member is of the type Hamt_entry.  An element
-   is conceptually owned by a hamt as soon as it is inserted.  It will
-   be automatically freed as soon as the last hamt containing it is
-   freed.  */
+   struct whose initial member is of the type Hamt_entry.  You need to
+   define this struct yourself.  It will typically contain an
+   Hamt_entry, a key, and, optionally, a value.
+
+   An element is conceptually owned by a hamt as soon as it is
+   inserted.  It will be automatically freed as soon as the last hamt
+   containing it is freed.  */
 typedef struct
 {
 #if GL_HAMT_THREAD_SAFE
-- 
2.25.1



^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2021-04-04  8:39 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-04  8:39 [PATCH 1/1] hamt: Document the module in the Gnulib manual Marc Nieper-Wißkirchen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).