git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCHv3] attr: convert to new threadsafe API
@ 2016-10-12 22:41 Stefan Beller
  2016-10-12 23:33 ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Beller @ 2016-10-12 22:41 UTC (permalink / raw)
  To: gitster, bmwill; +Cc: git, j6t, jacob.keller, Stefan Beller

This revamps the API of the attr subsystem to be thread safe.
Before we had the question and its results in one struct type.
The typical usage of the API was

    static struct git_attr_check *check;

    if (!check)
        check = git_attr_check_initl("text", NULL);

    git_check_attr(path, check);
    act_on(check->value[0]);

This has a couple of issues when it comes to thread safety:

* the initialization is racy in this implementation. To make it
  thread safe, we need to acquire a mutex, such that only one
  thread is executing the code in git_attr_check_initl.
  As we do not want to introduce a mutex at each call site,
  this is best done in the attr code. However to do so, we need
  to have access to the `check` variable, i.e. the API has to
  look like
    git_attr_check_initl(struct git_attr_check*, ...);
  Then one of the threads calling git_attr_check_initl will
  acquire the mutex and init the `check`, while all other threads
  will wait on the mutex just to realize they're late to the
  party and they'll return with no work done.

* While the check for attributes to be questioned only need to
  be initalized once as that part will be read only after its
  initialisation, the answer may be different for each path.
  Because of that we need to decouple the check and the answer,
  such that each thread can obtain an answer for the path it
  is currently processing.

This commit changes the API and adds locking in
git_attr_check_initl that provides the thread safety.

The usage of the new API will be:

    /*
     * The initl call will thread-safely check whether the
     * struct git_attr_check has been initialized. We only
     * want to do the initialization work once, hence we do
     * that work inside a thread safe environment.
     */
    static struct git_attr_check *check;
    git_attr_check_initl(&check, "text", NULL);

    /*
     * Resize internal data structures in the result to match
     * the size of `check`:
     */
    struct git_attr_result *result = GIT_ATTR_RESULT_INIT;
    git_attr_result_init(&result, check);

    /* Perform the check and act on it: */
    git_check_attr(path, check, &result);
    act_on(check->value[0]);

    /*
     * Free the result.
     * The check is static, so doesn't require free'ing
     */
    git_attr_result_clear(&result);

Signed-off-by: Stefan Beller <sbeller@google.com>
---

 * Do not use the Double Check Locking Pattern
 * The callers are all thread safe now (i.e. no static result)
 * So this is roughly what we want for the first version of the conversion?
 
 Thanks,
 Stefan

 Documentation/technical/api-gitattributes.txt |  93 +++++++++++++------
 archive.c                                     |  18 ++--
 attr.c                                        | 125 +++++++++++++++++---------
 attr.h                                        |  80 +++++++++++------
 builtin/check-attr.c                          |  31 ++++---
 builtin/pack-objects.c                        |  20 +++--
 convert.c                                     |  43 +++++----
 ll-merge.c                                    |  28 +++---
 userdiff.c                                    |  24 +++--
 ws.c                                          |  17 ++--
 10 files changed, 314 insertions(+), 165 deletions(-)

diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt
index 92fc32a..ac97244 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -8,6 +8,18 @@ attributes to set of paths.
 Data Structure
 --------------
 
+extern struct git_attr *git_attr(const char *);
+
+/*
+ * Return the name of the attribute represented by the argument.  The
+ * return value is a pointer to a null-delimited string that is part
+ * of the internal data structure; it should not be modified or freed.
+ */
+extern const char *git_attr_name(const struct git_attr *);
+
+extern int attr_name_valid(const char *name, size_t namelen);
+extern void invalid_attr_name_message(struct strbuf *, const char *, int);
+
 `struct git_attr`::
 
 	An attribute is an opaque object that is identified by its name.
@@ -16,15 +28,17 @@ Data Structure
 	of no interest to the calling programs.  The name of the
 	attribute can be retrieved by calling `git_attr_name()`.
 
-`struct git_attr_check_elem`::
-
-	This structure represents one attribute and its value.
-
 `struct git_attr_check`::
 
-	This structure represents a collection of `git_attr_check_elem`.
+	This structure represents a collection of `struct git_attrs`.
 	It is passed to `git_check_attr()` function, specifying the
-	attributes to check, and receives their values.
+	attributes to check, and receives their values into a corresponding
+	`struct git_attr_result`.
+
+`struct git_attr_result`::
+
+	This structure represents a collection of results to its
+	corresponding `struct git_attr_check`, that has the same order.
 
 
 Attribute Values
@@ -53,19 +67,30 @@ value of the attribute for the path.
 Querying Specific Attributes
 ----------------------------
 
-* Prepare `struct git_attr_check` using git_attr_check_initl()
+* Prepare a `struct git_attr_check` using `git_attr_check_initl()`
   function, enumerating the names of attributes whose values you are
   interested in, terminated with a NULL pointer.  Alternatively, an
-  empty `struct git_attr_check` can be prepared by calling
-  `git_attr_check_alloc()` function and then attributes you want to
-  ask about can be added to it with `git_attr_check_append()`
-  function.
-
-* Call `git_check_attr()` to check the attributes for the path.
-
-* Inspect `git_attr_check` structure to see how each of the
-  attribute in the array is defined for the path.
-
+  empty `struct git_attr_check` as allocated by git_attr_check_alloc()
+  can be prepared by calling `git_attr_check_alloc()` function and
+  then attributes you want to ask about can be added to it with
+  `git_attr_check_append()` function.
+  `git_attr_check_initl()` is thread safe, i.e. you can call it
+  from different threads at the same time; when check determines
+  the initialzisation is still needed, the threads will use a
+  single global mutex to perform the initialization just once, the
+  others will wait on the the thread to actually perform the
+  initialization.
+
+* Prepare a `struct git_attr_result` using `git_attr_result_init()`
+  for the check struct. The call to initialize the result is not thread
+  safe, because different threads need their own thread local result
+  anyway.
+
+* Call `git_check_attr()` to check the attributes for the path,
+  the returned `git_attr_result` contains the result.
+
+* Inspect the returned `git_attr_result` structure to see how
+  each of the attribute in the array is defined for the path.
 
 Example
 -------
@@ -89,15 +114,20 @@ static void setup_check(void)
 
 ------------
 	const char *path;
+	struct git_attr_result *result;
 
 	setup_check();
-	git_check_attr(path, check);
+	result = git_check_attr(path, check);
 ------------
 
-. Act on `.value` member of the result, left in `check->check[]`:
+The `result` must not be free'd as it is owned by the attr subsystem.
+It is reused by the same thread, so a subsequent call to git_check_attr
+in the same thread will overwrite the result.
+
+. Act on `result->value[]`:
 
 ------------
-	const char *value = check->check[0].value;
+	const char *value = result->value[0];
 
 	if (ATTR_TRUE(value)) {
 		The attribute is Set, by listing only the name of the
@@ -123,6 +153,8 @@ the first step in the above would be different.
 static struct git_attr_check *check;
 static void setup_check(const char **argv)
 {
+	if (check)
+		return; /* already done */
 	check = git_attr_check_alloc();
 	while (*argv) {
 		struct git_attr *attr = git_attr(*argv);
@@ -138,17 +170,20 @@ Querying All Attributes
 
 To get the values of all attributes associated with a file:
 
-* Prepare an empty `git_attr_check` structure by calling
-  `git_attr_check_alloc()`.
+* Setup a local variables on the stack for both the question
+  `struct git_attr_check` as well as the result `struct git_attr_result`.
+  Zero them out via their respective _INIT macro.
 
-* Call `git_all_attrs()`, which populates the `git_attr_check`
-  with the attributes attached to the path.
+* Call `git_all_attrs()`
 
-* Iterate over the `git_attr_check.check[]` array to examine
-  the attribute names and values.  The name of the attribute
-  described by a  `git_attr_check.check[]` object can be retrieved via
-  `git_attr_name(check->check[i].attr)`.  (Please note that no items
+* Iterate over the `git_attr_check.attr[]` array to examine the
+  attribute names.  The name of the attribute described by a
+  `git_attr_check.attr[]` object can be retrieved via
+  `git_attr_name(check->attr[i])`.  (Please note that no items
   will be returned for unset attributes, so `ATTR_UNSET()` will return
   false for all returned `git_array_check` objects.)
+  The respective value for an attribute can be found in the same
+  index position in of `git_attr_result`.
 
-* Free the `git_array_check` by calling `git_attr_check_free()`.
+* Clear the local variables by calling `git_attr_check_clear()` and
+  `git_attr_result_clear()`.
diff --git a/archive.c b/archive.c
index 11e3951..15849a8 100644
--- a/archive.c
+++ b/archive.c
@@ -107,10 +107,12 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 		void *context)
 {
 	static struct strbuf path = STRBUF_INIT;
+	static struct git_attr_check *check;
+
 	struct archiver_context *c = context;
 	struct archiver_args *args = c->args;
 	write_archive_entry_fn_t write_entry = c->write_entry;
-	static struct git_attr_check *check;
+	struct git_attr_result result = GIT_ATTR_RESULT_INIT;
 	const char *path_without_prefix;
 	int err;
 
@@ -124,12 +126,16 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
 		strbuf_addch(&path, '/');
 	path_without_prefix = path.buf + args->baselen;
 
-	if (!check)
-		check = git_attr_check_initl("export-ignore", "export-subst", NULL);
-	if (!git_check_attr(path_without_prefix, check)) {
-		if (ATTR_TRUE(check->check[0].value))
+	git_attr_check_initl(&check, "export-ignore", "export-subst", NULL);
+	git_attr_result_init(&result, check);
+
+	if (!git_check_attr(path_without_prefix, check, &result)) {
+		if (ATTR_TRUE(result.value[0])) {
+			git_attr_result_clear(&result);
 			return 0;
-		args->convert = ATTR_TRUE(check->check[1].value);
+		}
+		args->convert = ATTR_TRUE(result.value[1]);
+		git_attr_result_clear(&result);
 	}
 
 	if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
diff --git a/attr.c b/attr.c
index 0faa69f..648c4f9 100644
--- a/attr.c
+++ b/attr.c
@@ -14,6 +14,7 @@
 #include "dir.h"
 #include "utf8.h"
 #include "quote.h"
+#include "thread-utils.h"
 
 const char git_attr__true[] = "(builtin)true";
 const char git_attr__false[] = "\0(builtin)false";
@@ -46,6 +47,19 @@ struct git_attr {
 static int attr_nr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
 
+#ifndef NO_PTHREADS
+
+static pthread_mutex_t attr_mutex;
+#define attr_lock()		pthread_mutex_lock(&attr_mutex)
+#define attr_unlock()		pthread_mutex_unlock(&attr_mutex)
+
+#else
+
+#define attr_lock()		(void)0
+#define attr_unlock()		(void)0
+
+#endif /* NO_PTHREADS */
+
 /*
  * NEEDSWORK: maybe-real, maybe-macro are not property of
  * an attribute, as it depends on what .gitattributes are
@@ -55,6 +69,16 @@ static struct git_attr *(git_attr_hash[HASHSIZE]);
  */
 static int cannot_trust_maybe_real;
 
+/*
+ * Send one or more git_attr_check to git_check_attrs(), and
+ * each 'value' member tells what its value is.
+ * Unset one is returned as NULL.
+ */
+struct git_attr_check_elem {
+	const struct git_attr *attr;
+	const char *value;
+};
+
 /* NEEDSWORK: This will become per git_attr_check */
 static struct git_attr_check_elem *check_all_attr;
 
@@ -781,7 +805,7 @@ static int macroexpand_one(int nr, int rem)
 
 static int attr_check_is_dynamic(const struct git_attr_check *check)
 {
-	return (void *)(check->check) != (void *)(check + 1);
+	return (void *)(check->attr) != (void *)(check + 1);
 }
 
 static void empty_attr_check_elems(struct git_attr_check *check)
@@ -799,7 +823,8 @@ static void empty_attr_check_elems(struct git_attr_check *check)
  * any and all attributes that are visible are collected in it.
  */
 static void collect_some_attrs(const char *path, int pathlen,
-			       struct git_attr_check *check, int collect_all)
+			       struct git_attr_check *check,
+			       struct git_attr_result *result, int collect_all)
 
 {
 	struct attr_stack *stk;
@@ -825,13 +850,11 @@ static void collect_some_attrs(const char *path, int pathlen,
 		check_all_attr[i].value = ATTR__UNKNOWN;
 
 	if (!collect_all && !cannot_trust_maybe_real) {
-		struct git_attr_check_elem *celem = check->check;
-
 		rem = 0;
 		for (i = 0; i < check->check_nr; i++) {
-			if (!celem[i].attr->maybe_real) {
+			if (!check->attr[i]->maybe_real) {
 				struct git_attr_check_elem *c;
-				c = check_all_attr + celem[i].attr->attr_nr;
+				c = check_all_attr + check->attr[i]->attr_nr;
 				c->value = ATTR__UNSET;
 				rem++;
 			}
@@ -845,38 +868,45 @@ static void collect_some_attrs(const char *path, int pathlen,
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 
 	if (collect_all) {
-		empty_attr_check_elems(check);
 		for (i = 0; i < attr_nr; i++) {
 			const struct git_attr *attr = check_all_attr[i].attr;
 			const char *value = check_all_attr[i].value;
 			if (value == ATTR__UNSET || value == ATTR__UNKNOWN)
 				continue;
-			git_attr_check_append(check, attr)->value = value;
+
+			git_attr_check_append(check, attr);
+
+			ALLOC_GROW(result->value,
+				   result->check_nr + 1,
+				   result->check_alloc);
+			result->value[result->check_nr++] = value;
 		}
 	}
 }
 
 static int git_check_attrs(const char *path, int pathlen,
-			   struct git_attr_check *check)
+			   struct git_attr_check *check,
+			   struct git_attr_result *result)
 {
 	int i;
-	struct git_attr_check_elem *celem = check->check;
 
-	collect_some_attrs(path, pathlen, check, 0);
+	collect_some_attrs(path, pathlen, check, result, 0);
 
 	for (i = 0; i < check->check_nr; i++) {
-		const char *value = check_all_attr[celem[i].attr->attr_nr].value;
+		const char *value = check_all_attr[check->attr[i]->attr_nr].value;
 		if (value == ATTR__UNKNOWN)
 			value = ATTR__UNSET;
-		celem[i].value = value;
+		result->value[i] = value;
 	}
 
 	return 0;
 }
 
-void git_all_attrs(const char *path, struct git_attr_check *check)
+void git_all_attrs(const char *path,
+		   struct git_attr_check *check,
+		   struct git_attr_result *result)
 {
-	collect_some_attrs(path, strlen(path), check, 1);
+	collect_some_attrs(path, strlen(path), check, result, 1);
 }
 
 void git_attr_set_direction(enum git_attr_direction new, struct index_state *istate)
@@ -892,36 +922,42 @@ void git_attr_set_direction(enum git_attr_direction new, struct index_state *ist
 	use_index = istate;
 }
 
-static int git_check_attr_counted(const char *path, int pathlen,
-				  struct git_attr_check *check)
+int git_check_attr(const char *path,
+		    struct git_attr_check *check,
+		    struct git_attr_result *result)
 {
+	if (result->check_nr != check->check_nr)
+		die("BUG: git_attr_result is not prepared correctly");
 	check->finalized = 1;
-	return git_check_attrs(path, pathlen, check);
+	return git_check_attrs(path, strlen(path), check, result);
 }
 
-int git_check_attr(const char *path, struct git_attr_check *check)
+void git_attr_check_initl(struct git_attr_check **check_,
+			  const char *one, ...)
 {
-	return git_check_attr_counted(path, strlen(path), check);
-}
-
-struct git_attr_check *git_attr_check_initl(const char *one, ...)
-{
-	struct git_attr_check *check;
 	int cnt;
 	va_list params;
 	const char *param;
+	struct git_attr_check *check;
+
+	attr_lock();
+	if (*check_) {
+		attr_unlock();
+		return;
+	}
 
 	va_start(params, one);
 	for (cnt = 1; (param = va_arg(params, const char *)) != NULL; cnt++)
 		;
 	va_end(params);
+
 	check = xcalloc(1,
-			sizeof(*check) + cnt * sizeof(*(check->check)));
+			sizeof(*check) + cnt * sizeof(*(check->attr)));
 	check->check_nr = cnt;
 	check->finalized = 1;
-	check->check = (struct git_attr_check_elem *)(check + 1);
+	check->attr = (const struct git_attr **)(check + 1);
 
-	check->check[0].attr = git_attr(one);
+	check->attr[0] = git_attr(one);
 	va_start(params, one);
 	for (cnt = 1; cnt < check->check_nr; cnt++) {
 		struct git_attr *attr;
@@ -932,10 +968,11 @@ struct git_attr_check *git_attr_check_initl(const char *one, ...)
 		attr = git_attr(param);
 		if (!attr)
 			die("BUG: %s: not a valid attribute name", param);
-		check->check[cnt].attr = attr;
+		check->attr[cnt] = attr;
 	}
 	va_end(params);
-	return check;
+	*check_ = check;
+	attr_unlock();
 }
 
 struct git_attr_check *git_attr_check_alloc(void)
@@ -943,18 +980,23 @@ struct git_attr_check *git_attr_check_alloc(void)
 	return xcalloc(1, sizeof(struct git_attr_check));
 }
 
-struct git_attr_check_elem *git_attr_check_append(struct git_attr_check *check,
-						  const struct git_attr *attr)
+void git_attr_result_init(struct git_attr_result *result,
+			  struct git_attr_check *check)
+{
+	ALLOC_GROW(result->value, check->check_nr, result->check_alloc);
+	result->check_nr = check->check_nr;
+}
+
+
+void git_attr_check_append(struct git_attr_check *check,
+			   const struct git_attr *attr)
 {
-	struct git_attr_check_elem *elem;
 	if (check->finalized)
 		die("BUG: append after git_attr_check structure is finalized");
 	if (!attr_check_is_dynamic(check))
 		die("BUG: appending to a statically initialized git_attr_check");
-	ALLOC_GROW(check->check, check->check_nr + 1, check->check_alloc);
-	elem = &check->check[check->check_nr++];
-	elem->attr = attr;
-	return elem;
+	ALLOC_GROW(check->attr, check->check_nr + 1, check->check_alloc);
+	check->attr[check->check_nr++] = attr;
 }
 
 void git_attr_check_clear(struct git_attr_check *check)
@@ -962,12 +1004,13 @@ void git_attr_check_clear(struct git_attr_check *check)
 	empty_attr_check_elems(check);
 	if (!attr_check_is_dynamic(check))
 		die("BUG: clearing a statically initialized git_attr_check");
-	free(check->check);
+	free(check->attr);
 	check->check_alloc = 0;
 }
 
-void git_attr_check_free(struct git_attr_check *check)
+void git_attr_result_clear(struct git_attr_result *result)
 {
-	git_attr_check_clear(check);
-	free(check);
+	free(result->value);
+	result->check_nr = 0;
+	result->check_alloc = 0;
 }
diff --git a/attr.h b/attr.h
index 163fcd6..620f6f8 100644
--- a/attr.h
+++ b/attr.h
@@ -10,6 +10,13 @@ struct git_attr;
  */
 extern struct git_attr *git_attr(const char *);
 
+/*
+ * Return the name of the attribute represented by the argument.  The
+ * return value is a pointer to a null-delimited string that is part
+ * of the internal data structure; it should not be modified or freed.
+ */
+extern const char *git_attr_name(const struct git_attr *);
+
 extern int attr_name_valid(const char *name, size_t namelen);
 extern void invalid_attr_name_message(struct strbuf *, const char *, int);
 
@@ -22,44 +29,65 @@ extern const char git_attr__false[];
 #define ATTR_FALSE(v) ((v) == git_attr__false)
 #define ATTR_UNSET(v) ((v) == NULL)
 
-/*
- * Send one or more git_attr_check to git_check_attrs(), and
- * each 'value' member tells what its value is.
- * Unset one is returned as NULL.
- */
-struct git_attr_check_elem {
-	const struct git_attr *attr;
-	const char *value;
-};
-
 struct git_attr_check {
 	int finalized;
 	int check_nr;
 	int check_alloc;
-	struct git_attr_check_elem *check;
+	const struct git_attr **attr;
 };
+#define GIT_ATTR_CHECK_INIT {0, 0, 0, NULL}
 
-extern struct git_attr_check *git_attr_check_initl(const char *, ...);
-extern int git_check_attr(const char *path, struct git_attr_check *);
+struct git_attr_result {
+	int check_nr;
+	int check_alloc;
+	const char **value;
+};
+#define GIT_ATTR_RESULT_INIT {0, 0, NULL}
 
+/*
+ * Initialize the `git_attr_check` via one of the following three functions:
+ *
+ * git_attr_check_alloc  allocates an empty check,
+ * git_attr_check_append add an attribute to the given git_attr_check
+ * git_attr_result_init  prepare the result struct for a given check struct
+ *
+ * git_all_attrs         allocates a check and fills in all attributes that
+ *                       are set for the given path.
+ * git_attr_check_initl  takes a pointer to where the check will be initialized,
+ *                       followed by all attributes that are to be checked.
+ *                       This makes it potentially thread safe as it could
+ *                       internally have a mutex for that memory location.
+ *                       Currently it is not thread safe!
+ */
 extern struct git_attr_check *git_attr_check_alloc(void);
-extern struct git_attr_check_elem *git_attr_check_append(struct git_attr_check *, const struct git_attr *);
+extern void git_attr_check_append(struct git_attr_check *,
+				  const struct git_attr *);
+extern void git_attr_check_initl(struct git_attr_check **,
+				 const char *, ...);
 
-extern void git_attr_check_clear(struct git_attr_check *);
-extern void git_attr_check_free(struct git_attr_check *);
+extern void git_attr_result_init(struct git_attr_result *,
+				 struct git_attr_check *);
 
-/*
- * Return the name of the attribute represented by the argument.  The
- * return value is a pointer to a null-delimited string that is part
- * of the internal data structure; it should not be modified or freed.
- */
-extern const char *git_attr_name(const struct git_attr *);
+extern void git_all_attrs(const char *path,
+			  struct git_attr_check *,
+			  struct git_attr_result *);
 
-/*
- * Retrieve all attributes that apply to the specified path.
- * check holds the attributes and their values.
+/* Query a path for its attributes */
+extern int git_check_attr(const char *path,
+			  struct git_attr_check *,
+			  struct git_attr_result *);
+
+
+
+/**
+ * Free or clear the result struct. The underlying strings are not free'd
+ * as they are globally known.
  */
-void git_all_attrs(const char *path, struct git_attr_check *check);
+extern void git_attr_result_free(struct git_attr_result *);
+extern void git_attr_result_clear(struct git_attr_result *);
+
+extern void git_attr_check_clear(struct git_attr_check *);
+
 
 enum git_attr_direction {
 	GIT_ATTR_CHECKIN,
diff --git a/builtin/check-attr.c b/builtin/check-attr.c
index ec61476..5e78d5d 100644
--- a/builtin/check-attr.c
+++ b/builtin/check-attr.c
@@ -24,13 +24,17 @@ static const struct option check_attr_options[] = {
 	OPT_END()
 };
 
-static void output_attr(struct git_attr_check *check, const char *file)
+static void output_attr(struct git_attr_check *check,
+			struct git_attr_result *result, const char *file)
 {
 	int j;
 	int cnt = check->check_nr;
 
+	if (check->check_nr != result->check_nr)
+		die("BUG: confused check and result internally");
+
 	for (j = 0; j < cnt; j++) {
-		const char *value = check->check[j].value;
+		const char *value = result->value[j];
 
 		if (ATTR_TRUE(value))
 			value = "set";
@@ -44,11 +48,11 @@ static void output_attr(struct git_attr_check *check, const char *file)
 			       "%s%c" /* attrname */
 			       "%s%c" /* attrvalue */,
 			       file, 0,
-			       git_attr_name(check->check[j].attr), 0, value, 0);
+			       git_attr_name(check->attr[j]), 0, value, 0);
 		} else {
 			quote_c_style(file, NULL, stdout, 0);
 			printf(": %s: %s\n",
-			       git_attr_name(check->check[j].attr), value);
+			       git_attr_name(check->attr[j]), value);
 		}
 	}
 }
@@ -59,16 +63,21 @@ static void check_attr(const char *prefix,
 {
 	char *full_path =
 		prefix_path(prefix, prefix ? strlen(prefix) : 0, file);
+	struct git_attr_result result = GIT_ATTR_RESULT_INIT;
+	struct git_attr_check local_check = GIT_ATTR_CHECK_INIT;
+
 	if (check != NULL) {
-		if (git_check_attr(full_path, check))
-			die("git_check_attr died");
-		output_attr(check, file);
+		git_attr_result_init(&result, check);
+		git_check_attr(full_path, check, &result);
 	} else {
-		check = git_attr_check_alloc();
-		git_all_attrs(full_path, check);
-		output_attr(check, file);
-		git_attr_check_free(check);
+		git_all_attrs(full_path, &local_check, &result);
+		check = &local_check;
 	}
+
+	output_attr(check, &result, file);
+
+	git_attr_result_clear(&result);
+	git_attr_check_clear(&local_check);
 	free(full_path);
 }
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 3918c07..14b764a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -900,13 +900,19 @@ static int no_try_delta(const char *path)
 {
 	static struct git_attr_check *check;
 
-	if (!check)
-		check = git_attr_check_initl("delta", NULL);
-	if (git_check_attr(path, check))
-		return 0;
-	if (ATTR_FALSE(check->check[0].value))
-		return 1;
-	return 0;
+	struct git_attr_result result = GIT_ATTR_RESULT_INIT;
+	int ret = 0;
+
+	git_attr_check_initl(&check, "delta", NULL);
+	git_attr_result_init(&result, check);
+
+	if (!git_check_attr(path, check, &result)) {
+		if (ATTR_FALSE(result.value[0]))
+			ret = 1;
+	}
+
+	git_attr_result_clear(&result);
+	return ret;
 }
 
 /*
diff --git a/convert.c b/convert.c
index bb2435a..01de8ef 100644
--- a/convert.c
+++ b/convert.c
@@ -718,10 +718,8 @@ static int ident_to_worktree(const char *path, const char *src, size_t len,
 	return 1;
 }
 
-static enum crlf_action git_path_check_crlf(struct git_attr_check_elem *check)
+static enum crlf_action git_path_check_crlf(const char *value)
 {
-	const char *value = check->value;
-
 	if (ATTR_TRUE(value))
 		return CRLF_TEXT;
 	else if (ATTR_FALSE(value))
@@ -735,10 +733,8 @@ static enum crlf_action git_path_check_crlf(struct git_attr_check_elem *check)
 	return CRLF_UNDEFINED;
 }
 
-static enum eol git_path_check_eol(struct git_attr_check_elem *check)
+static enum eol git_path_check_eol(const char *value)
 {
-	const char *value = check->value;
-
 	if (ATTR_UNSET(value))
 		;
 	else if (!strcmp(value, "lf"))
@@ -748,9 +744,8 @@ static enum eol git_path_check_eol(struct git_attr_check_elem *check)
 	return EOL_UNSET;
 }
 
-static struct convert_driver *git_path_check_convert(struct git_attr_check_elem *check)
+static struct convert_driver *git_path_check_convert(const char *value)
 {
-	const char *value = check->value;
 	struct convert_driver *drv;
 
 	if (ATTR_TRUE(value) || ATTR_FALSE(value) || ATTR_UNSET(value))
@@ -761,10 +756,8 @@ static struct convert_driver *git_path_check_convert(struct git_attr_check_elem
 	return NULL;
 }
 
-static int git_path_check_ident(struct git_attr_check_elem *check)
+static int git_path_check_ident(const char *value)
 {
-	const char *value = check->value;
-
 	return !!ATTR_TRUE(value);
 }
 
@@ -778,25 +771,29 @@ struct conv_attrs {
 static void convert_attrs(struct conv_attrs *ca, const char *path)
 {
 	static struct git_attr_check *check;
+	static int init_user_convert_tail;
 
-	if (!check) {
-		check = git_attr_check_initl("crlf", "ident",
-					     "filter", "eol", "text",
-					     NULL);
+	struct git_attr_result result = GIT_ATTR_RESULT_INIT;
+
+	git_attr_check_initl(&check, "crlf", "ident", "filter",
+			     "eol", "text", NULL);
+	git_attr_result_init(&result, check);
+
+	if (!init_user_convert_tail) {
 		user_convert_tail = &user_convert;
 		git_config(read_convert_config, NULL);
+		init_user_convert_tail = 1;
 	}
 
-	if (!git_check_attr(path, check)) {
-		struct git_attr_check_elem *ccheck = check->check;
-		ca->crlf_action = git_path_check_crlf(ccheck + 4);
+	if (!git_check_attr(path, check, &result)) {
+		ca->crlf_action = git_path_check_crlf(result.value[4]);
 		if (ca->crlf_action == CRLF_UNDEFINED)
-			ca->crlf_action = git_path_check_crlf(ccheck + 0);
+			ca->crlf_action = git_path_check_crlf(result.value[0]);
 		ca->attr_action = ca->crlf_action;
-		ca->ident = git_path_check_ident(ccheck + 1);
-		ca->drv = git_path_check_convert(ccheck + 2);
+		ca->ident = git_path_check_ident(result.value[1]);
+		ca->drv = git_path_check_convert(result.value[2]);
 		if (ca->crlf_action != CRLF_BINARY) {
-			enum eol eol_attr = git_path_check_eol(ccheck + 3);
+			enum eol eol_attr = git_path_check_eol(result.value[3]);
 			if (ca->crlf_action == CRLF_AUTO && eol_attr == EOL_LF)
 				ca->crlf_action = CRLF_AUTO_INPUT;
 			else if (ca->crlf_action == CRLF_AUTO && eol_attr == EOL_CRLF)
@@ -820,6 +817,8 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
 		ca->crlf_action = CRLF_AUTO_CRLF;
 	if (ca->crlf_action == CRLF_UNDEFINED && auto_crlf == AUTO_CRLF_INPUT)
 		ca->crlf_action = CRLF_AUTO_INPUT;
+
+	git_attr_result_clear(&result);
 }
 
 int would_convert_to_git_filter_fd(const char *path)
diff --git a/ll-merge.c b/ll-merge.c
index bc6479c..3abc77a 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -355,6 +355,7 @@ int ll_merge(mmbuffer_t *result_buf,
 {
 	static struct git_attr_check *check;
 	static const struct ll_merge_options default_opts;
+	static struct git_attr_result result;
 	const char *ll_driver_name = NULL;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 	const struct ll_merge_driver *driver;
@@ -368,13 +369,15 @@ int ll_merge(mmbuffer_t *result_buf,
 		normalize_file(theirs, path);
 	}
 
-	if (!check)
-		check = git_attr_check_initl("merge", "conflict-marker-size", NULL);
+	if (!check) {
+		git_attr_check_initl(&check, "merge", "conflict-marker-size", NULL);
+		git_attr_result_init(&result, check);
+	}
 
-	if (!git_check_attr(path, check)) {
-		ll_driver_name = check->check[0].value;
-		if (check->check[1].value) {
-			marker_size = atoi(check->check[1].value);
+	if (!git_check_attr(path, check, &result)) {
+		ll_driver_name = result.value[0];
+		if (result.value[1]) {
+			marker_size = atoi(result.value[1]);
 			if (marker_size <= 0)
 				marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 		}
@@ -394,14 +397,19 @@ int ll_merge(mmbuffer_t *result_buf,
 int ll_merge_marker_size(const char *path)
 {
 	static struct git_attr_check *check;
+
+	struct git_attr_result result = GIT_ATTR_RESULT_INIT;
 	int marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 
-	if (!check)
-		check = git_attr_check_initl("conflict-marker-size", NULL);
-	if (!git_check_attr(path, check) && check->check[0].value) {
-		marker_size = atoi(check->check[0].value);
+	git_attr_check_initl(&check, "conflict-marker-size", NULL);
+	git_attr_result_init(&result, check);
+
+	if (!git_check_attr(path, check, &result) && result.value[0]) {
+		marker_size = atoi(result.value[0]);
 		if (marker_size <= 0)
 			marker_size = DEFAULT_CONFLICT_MARKER_SIZE;
 	}
+	git_attr_result_clear(&result);
+
 	return marker_size;
 }
diff --git a/userdiff.c b/userdiff.c
index 46dfd32..62b1ed6 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -264,20 +264,30 @@ struct userdiff_driver *userdiff_find_by_path(const char *path)
 {
 	static struct git_attr_check *check;
 
-	if (!check)
-		check = git_attr_check_initl("diff", NULL);
+	struct git_attr_result result = GIT_ATTR_RESULT_INIT;
+	const char *value;
+
 	if (!path)
 		return NULL;
-	if (git_check_attr(path, check))
+
+	git_attr_check_initl(&check, "diff", NULL);
+	git_attr_result_init(&result, check);
+
+	if (git_check_attr(path, check, &result)) {
+		git_attr_result_clear(&result);
 		return NULL;
+	}
+
+	value = result.value[0];
+	git_attr_result_clear(&result);
 
-	if (ATTR_TRUE(check->check[0].value))
+	if (ATTR_TRUE(value))
 		return &driver_true;
-	if (ATTR_FALSE(check->check[0].value))
+	if (ATTR_FALSE(value))
 		return &driver_false;
-	if (ATTR_UNSET(check->check[0].value))
+	if (ATTR_UNSET(value))
 		return NULL;
-	return userdiff_find_by_name(check->check[0].value);
+	return userdiff_find_by_name(value);
 }
 
 struct userdiff_driver *userdiff_get_textconv(struct userdiff_driver *driver)
diff --git a/ws.c b/ws.c
index bb3270c..5e30fef 100644
--- a/ws.c
+++ b/ws.c
@@ -73,15 +73,20 @@ unsigned parse_whitespace_rule(const char *string)
 
 unsigned whitespace_rule(const char *pathname)
 {
-	static struct git_attr_check *attr_whitespace_rule;
+	static struct git_attr_check *check;
 
-	if (!attr_whitespace_rule)
-		attr_whitespace_rule = git_attr_check_initl("whitespace", NULL);
+	struct git_attr_result result = GIT_ATTR_RESULT_INIT;
+	const char *value;
+	int r;
 
-	if (!git_check_attr(pathname, attr_whitespace_rule)) {
-		const char *value;
+	git_attr_check_initl(&check, "whitespace", NULL);
+	git_attr_result_init(&result, check);
 
-		value = attr_whitespace_rule->check[0].value;
+	r = git_check_attr(pathname, check, &result);
+	value = result.value[0];
+	git_attr_result_clear(&result);
+
+	if (!r) {
 		if (ATTR_TRUE(value)) {
 			/* true (whitespace) */
 			unsigned all_rule = ws_tab_width(whitespace_rule_cfg);
-- 
2.10.1.432.g8a36cd8.dirty


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCHv3] attr: convert to new threadsafe API
  2016-10-12 22:41 [PATCHv3] attr: convert to new threadsafe API Stefan Beller
@ 2016-10-12 23:33 ` Junio C Hamano
  2016-10-13 18:42   ` Stefan Beller
  2016-10-14 15:37   ` Junio C Hamano
  0 siblings, 2 replies; 9+ messages in thread
From: Junio C Hamano @ 2016-10-12 23:33 UTC (permalink / raw)
  To: Stefan Beller; +Cc: bmwill, git, j6t, jacob.keller

Stefan Beller <sbeller@google.com> writes:

> @@ -89,15 +114,20 @@ static void setup_check(void)
>  
>  ------------
>  	const char *path;
> +	struct git_attr_result *result;
>  
>  	setup_check();
> -	git_check_attr(path, check);
> +	result = git_check_attr(path, check);

This looks stale by a few revisions of the other parts of the patch?

> diff --git a/archive.c b/archive.c
> index 11e3951..15849a8 100644
> --- a/archive.c
> +++ b/archive.c
> @@ -107,10 +107,12 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
>  		void *context)
>  {
>  	static struct strbuf path = STRBUF_INIT;
> +	static struct git_attr_check *check;
> +
>  	struct archiver_context *c = context;
>  	struct archiver_args *args = c->args;
>  	write_archive_entry_fn_t write_entry = c->write_entry;
> -	static struct git_attr_check *check;
> +	struct git_attr_result result = GIT_ATTR_RESULT_INIT;
>  	const char *path_without_prefix;
>  	int err;
>  
> @@ -124,12 +126,16 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
>  		strbuf_addch(&path, '/');
>  	path_without_prefix = path.buf + args->baselen;
>  
> -	if (!check)
> -		check = git_attr_check_initl("export-ignore", "export-subst", NULL);
> -	if (!git_check_attr(path_without_prefix, check)) {
> -		if (ATTR_TRUE(check->check[0].value))
> +	git_attr_check_initl(&check, "export-ignore", "export-subst", NULL);
> +	git_attr_result_init(&result, check);
> +
> +	if (!git_check_attr(path_without_prefix, check, &result)) {
> +		if (ATTR_TRUE(result.value[0])) {
> +			git_attr_result_clear(&result);
>  			return 0;
> -		args->convert = ATTR_TRUE(check->check[1].value);
> +		}
> +		args->convert = ATTR_TRUE(result.value[1]);
> +		git_attr_result_clear(&result);
>  	}

This is exactly what I meant by "can we avoid alloc/free of result
in leaf function when we _know_ how many attributes we are
interested in already, which is the majority of the case?".

Starting with a simple but unoptimized internal implementation of
the attr subsystem is one thing (which is good).  Exposing an API that
cannot be optimally implemented later without changing the callers
is another (which is bad).

By encapsulating each element into "struct git_attr_result", we can
extend the API without changing the API user [*1*].  

But I do not think of a way to allow an efficient implementation
later unless the source of the API user somehow says "this user is
only interested in this many attributes", like having this

	struct git_attr_result result[2];

(because this caller only wants "ignore" and "subst") on the API
user's side [*2*].  Without such a clue (like the patch above, that
only says "there is a structure called 'result'"), I do not think of
a way to avoid dynamic allocation on the API implementation side.

All the other callers in the patch (pack-objects, convert, ll-merge,
etc.) seem to share the exact same pattern.  Each of the leaf
functions knows a fixed set of attributes it is interested in, the
caller iterates over many paths and makes calls to these leaf
functions, and it is a waste to pay alloc/free overhead for each and
every iteration when we know how many elements result needs to
store.


[Footnote]

*1* Would we need a wrapping struct around the array of results?  If
    that is the case, we may need something ugly like this on the
    API user side:

	GIT_ATTR_RESULT_TYPE(2) result = {2,};

    with something like the following on the API implementation
    side:

        #define GIT_ATTR_RESULT_TYPE(n) \
            struct { \
                    int num_slots; \
                    const char *value[n]; \
            }

        struct git_attr_result {
                int num_slots;
                const char *value[FLEX_ARRAY];
        };
        git_attr_result_init(void *result_, struct git_attr_check *check)
        {
                struct git_attr_result *result = result_;

                assert(result->num_slots, check->num_attrs);
                ...
        }                
        git_check_attr(const char *path,
                       struct git_attr_check *check,
                       void *result_)
        {                       
                struct git_attr_result *result = result_;

                assert(result->num_slots, check->num_attrs);
                for (i = 0; i < check_num_attrs; i++)
                        result->value[i] = ... found value ...;
        }


*2* Or the uglier

	GIT_ATTR_RESULT_TYPE(2) result = {2,};

    I can see why the "check" side would benefit from a structure
    that contains an array, but I do not see why "result" side would
    want to, so I am hoping that we won't have to do this uglier
    variant and just go with the simple "array of resulting values".

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv3] attr: convert to new threadsafe API
  2016-10-12 23:33 ` Junio C Hamano
@ 2016-10-13 18:42   ` Stefan Beller
  2016-10-13 22:08     ` Stefan Beller
  2016-10-14 15:37   ` Junio C Hamano
  1 sibling, 1 reply; 9+ messages in thread
From: Stefan Beller @ 2016-10-13 18:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brandon Williams, git@vger.kernel.org, Johannes Sixt,
	Jacob Keller

On Wed, Oct 12, 2016 at 4:33 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> @@ -89,15 +114,20 @@ static void setup_check(void)
>>
>>  ------------
>>       const char *path;
>> +     struct git_attr_result *result;
>>
>>       setup_check();
>> -     git_check_attr(path, check);
>> +     result = git_check_attr(path, check);
>
> This looks stale by a few revisions of the other parts of the patch?

I'll update the documentation.

>
>> diff --git a/archive.c b/archive.c
>> index 11e3951..15849a8 100644
>> --- a/archive.c
>> +++ b/archive.c
>> @@ -107,10 +107,12 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
>>               void *context)
>>  {
>>       static struct strbuf path = STRBUF_INIT;
>> +     static struct git_attr_check *check;
>> +
>>       struct archiver_context *c = context;
>>       struct archiver_args *args = c->args;
>>       write_archive_entry_fn_t write_entry = c->write_entry;
>> -     static struct git_attr_check *check;
>> +     struct git_attr_result result = GIT_ATTR_RESULT_INIT;
>>       const char *path_without_prefix;
>>       int err;
>>
>> @@ -124,12 +126,16 @@ static int write_archive_entry(const unsigned char *sha1, const char *base,
>>               strbuf_addch(&path, '/');
>>       path_without_prefix = path.buf + args->baselen;
>>
>> -     if (!check)
>> -             check = git_attr_check_initl("export-ignore", "export-subst", NULL);
>> -     if (!git_check_attr(path_without_prefix, check)) {
>> -             if (ATTR_TRUE(check->check[0].value))
>> +     git_attr_check_initl(&check, "export-ignore", "export-subst", NULL);
>> +     git_attr_result_init(&result, check);
>> +
>> +     if (!git_check_attr(path_without_prefix, check, &result)) {
>> +             if (ATTR_TRUE(result.value[0])) {
>> +                     git_attr_result_clear(&result);
>>                       return 0;
>> -             args->convert = ATTR_TRUE(check->check[1].value);
>> +             }
>> +             args->convert = ATTR_TRUE(result.value[1]);
>> +             git_attr_result_clear(&result);
>>       }
>
> This is exactly what I meant by "can we avoid alloc/free of result
> in leaf function when we _know_ how many attributes we are
> interested in already, which is the majority of the case?".

We can avoid that. For now I settled with an implementation that
has no "answer" type, but uses a bare "const char *result[FLEX_ARRAY];",
or rather a const char **.

>
> Starting with a simple but unoptimized internal implementation of
> the attr subsystem is one thing (which is good).  Exposing an API that
> cannot be optimally implemented later without changing the callers
> is another (which is bad).
>
> By encapsulating each element into "struct git_attr_result", we can
> extend the API without changing the API user [*1*].

Oh I see.

So instead of a raw string we want to have

struct git_attr_result {
    const char *value;
};

just to have it extensible. Makes sense. I'll redo that.


>
> But I do not think of a way to allow an efficient implementation
> later unless the source of the API user somehow says "this user is
> only interested in this many attributes", like having this
>
>         struct git_attr_result result[2];

    const char *result[2] = {NULL, NULL};

as of now would be

    struct git_attr_result result[2];

but we'd lose the ability to set them to NULL easily. Probably not needed.

>
> (because this caller only wants "ignore" and "subst") on the API
> user's side [*2*].  Without such a clue (like the patch above, that
> only says "there is a structure called 'result'"), I do not think of
> a way to avoid dynamic allocation on the API implementation side.
>
> All the other callers in the patch (pack-objects, convert, ll-merge,
> etc.) seem to share the exact same pattern.  Each of the leaf
> functions knows a fixed set of attributes it is interested in, the
> caller iterates over many paths and makes calls to these leaf
> functions, and it is a waste to pay alloc/free overhead for each and
> every iteration when we know how many elements result needs to
> store.
>

Right.

>
> [Footnote]
>
> *1* Would we need a wrapping struct around the array of results?  If
>     that is the case, we may need something ugly like this on the
>     API user side:
>
>         GIT_ATTR_RESULT_TYPE(2) result = {2,};
>
>     with something like the following on the API implementation
>     side:
>
>         #define GIT_ATTR_RESULT_TYPE(n) \
>             struct { \
>                     int num_slots; \
>                     const char *value[n]; \
>             }
>
>         struct git_attr_result {
>                 int num_slots;
>                 const char *value[FLEX_ARRAY];
>         };
>         git_attr_result_init(void *result_, struct git_attr_check *check)
>         {
>                 struct git_attr_result *result = result_;
>
>                 assert(result->num_slots, check->num_attrs);
>                 ...
>         }
>         git_check_attr(const char *path,
>                        struct git_attr_check *check,
>                        void *result_)
>         {
>                 struct git_attr_result *result = result_;
>
>                 assert(result->num_slots, check->num_attrs);
>                 for (i = 0; i < check_num_attrs; i++)
>                         result->value[i] = ... found value ...;
>         }
>
>
> *2* Or the uglier
>
>         GIT_ATTR_RESULT_TYPE(2) result = {2,};
>
>     I can see why the "check" side would benefit from a structure
>     that contains an array, but I do not see why "result" side would
>     want to, so I am hoping that we won't have to do this uglier
>     variant and just go with the simple "array of resulting values".

So I currently have the "array of resulting values", but not wrapped.

Do we expect to get more than the values out of the attr system?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv3] attr: convert to new threadsafe API
  2016-10-13 18:42   ` Stefan Beller
@ 2016-10-13 22:08     ` Stefan Beller
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Beller @ 2016-10-13 22:08 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brandon Williams, git@vger.kernel.org, Johannes Sixt,
	Jacob Keller

> On Wed, Oct 12, 2016 at 4:33 PM, Junio C Hamano <gitster@pobox.com> wrote:
> so I am hoping that we won't have to do this uglier variant

---8<--- attr.h:
...
struct git_attr_result {
    int check_nr;
    /* Whether is was statically allocated and cannot be resized. */
    int static_alloc;
    const char *value[FLEX_ARRAY];
};

/* create a pointer pointing to a git_attr_result with a given size: */
#define GIT_ATTR_RESULT_INIT_FOR(name, size) \
    struct { \
        int check_nr; \
        int static_alloc; \
        const char *value[size]; \
    } local_##name; \
    struct git_attr_result *name = \
        (struct git_attr_result *)&(local_##name); \
    local_##name.static_alloc = 1;
...
---8<--- e.g. ws.c:

    static struct git_attr_check *attr_whitespace_rule;
    GIT_ATTR_RESULT_INIT_FOR(result, 1);

    git_attr_check_initl(&attr_whitespace_rule, "whitespace", NULL);

    if (!git_check_attr(pathname, attr_whitespace_rule, result)) {
        if (ATTR_TRUE(result->value[0])) {
            ...
        } else if (ATTR_FALSE(result->value[0])) {
            ...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv3] attr: convert to new threadsafe API
  2016-10-12 23:33 ` Junio C Hamano
  2016-10-13 18:42   ` Stefan Beller
@ 2016-10-14 15:37   ` Junio C Hamano
  2016-10-18 23:52     ` Stefan Beller
  1 sibling, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2016-10-14 15:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: bmwill, git, j6t, jacob.keller

Junio C Hamano <gitster@pobox.com> writes:

> *1* Would we need a wrapping struct around the array of results?

By the way, I do see a merit on the "check" side (tl;dr: but I do
not think "result" needs it, hence I do not see the need for the
"ugly" variants).

Take "archive" for example.  For each path, it wants to see the
attribute "export-ignore" to decide if it is to be omitted.  In
addition, the usual set of attributes used to smudge blobs into the
working tree representation are inspected by the convert.c API as
part of its implementation of convert_to_working_tree().  This
program has at least two sets of <"check", "result"> that are used
by two git_check_attr() callsites that are unaware of each other.

One of the optimizations we discussed is to trim down the attr-stack
(which caches the attributes read from .gitattributes files that are
in effect for the "last" directory that has the path for which
attrbiutes are queried for) by reading/keeping only the entries that
affect the attributes the caller is interested in.  But when there
are multiple callsites that are interested in different sets of
attributes, we obviously cannot do such an optimization without
taking too much cache-invalidation hit.  Because these callsites are
not unaware of each other, I do not think we can say "keep the
entries that affects the union of all active callsites" very easily,
even if it were possible.

But we could tie this cache to "check", which keeps a constant
subset of attributes that the caller is interested in (i.e. each
callsite would keep its own cache that is useful for its query).
While we are single-threaded, "struct git_attr_check" being a
wrapping struct around the array of "what attributes are of
interest?" is a good place to add that per-check attr-stack cache.
When we go multi-threaded, the attr-stack cache must become
per-thread, and needs to be moved to per-thread storage, and such a
per-thread storage would have multiple attr-stack, one per "check"
instance (i.e. looking up the attr-stack may have to say "who/what
thread am I?" to first go to the thread-local storage for the
current thread, where a table of pointers to attr-stacks is kept and
from there, index into that table to find the attr-stack that
corresponds to the particular "check").  We could use the address of
"check" as the key into this table, but "struct git_attr_check" that
wraps the array gives us another option to allocate a small
consecutive integer every time initl() creates a new "check" and use
it as the index into that attr-stack table, as that integer index
can be in the struct that wraps the array of wanted attributes.

	Note. none of the above is a suggestion to do the attr
	caching the way exactly described.  The above is primarily
	to illustrate how a wrapping struct may give us future
	flexibility without affecting a single line of code in the
	user of API.

It may turn out that we do not need to have anything other than the
array of wanted attributes in the "check" struct, but unlike
"result", "check" is shared across threads, and do not have to live
directly on the stack, so we can prepare for flexibility.

I do not foresee a similar need for wrapping struct for "result",
and given that we do want to keep the option of having them directly
on the stack, I am inclined to say we shouldn't introduce one.

If we were still to do the wrapping for result, I would say that
basing it around the FLEX_ARRAY idiom, i.e.

>         struct git_attr_result {
>                 int num_slots;
>                 const char *value[FLEX_ARRAY];
>         };

is a horrible idea.  It would be less horrible if it were

	struct git_attr_result {
		int num_slots;
		const char **value;
	};

then make the API user write via a convenience macro something like
this

	const char *result_values[NUM_ATTRS_OF_INTEREST];
	struct git_attr_result result = {
		ARRAY_SIZE(result_values), &result_values
	};                

instead.  That way, at least the side that implements git_check_attr()
would not have to be type-unsafe like the example of ugliness in the
message I am following-up on.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv3] attr: convert to new threadsafe API
  2016-10-14 15:37   ` Junio C Hamano
@ 2016-10-18 23:52     ` Stefan Beller
  2016-10-19  0:06       ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Beller @ 2016-10-18 23:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brandon Williams, git@vger.kernel.org, Johannes Sixt,
	Jacob Keller

On Fri, Oct 14, 2016 at 8:37 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> *1* Would we need a wrapping struct around the array of results?
>
> By the way, I do see a merit on the "check" side (tl;dr: but I do
> not think "result" needs it, hence I do not see the need for the
> "ugly" variants).

So we'd rather go with const char **result instead of our own new struct there.
Ok, got it.

>
> Take "archive" for example.  For each path, it wants to see the
> attribute "export-ignore" to decide if it is to be omitted.  In
> addition, the usual set of attributes used to smudge blobs into the
> working tree representation are inspected by the convert.c API as
> part of its implementation of convert_to_working_tree().  This
> program has at least two sets of <"check", "result"> that are used
> by two git_check_attr() callsites that are unaware of each other.
>
> One of the optimizations we discussed is to trim down the attr-stack
> (which caches the attributes read from .gitattributes files that are
> in effect for the "last" directory that has the path for which
> attrbiutes are queried for) by reading/keeping only the entries that
> affect the attributes the caller is interested in.  But when there
> are multiple callsites that are interested in different sets of
> attributes, we obviously cannot do such an optimization without
> taking too much cache-invalidation hit.  Because these callsites are
> not unaware of each other, I do not think we can say "keep the
> entries that affects the union of all active callsites" very easily,
> even if it were possible.
>
> But we could tie this cache to "check", which keeps a constant
> subset of attributes that the caller is interested in (i.e. each
> callsite would keep its own cache that is useful for its query).
> While we are single-threaded, "struct git_attr_check" being a
> wrapping struct around the array of "what attributes are of
> interest?" is a good place to add that per-check attr-stack cache.
> When we go multi-threaded, the attr-stack cache must become
> per-thread, and needs to be moved to per-thread storage, and such a
> per-thread storage would have multiple attr-stack, one per "check"
> instance (i.e. looking up the attr-stack may have to say "who/what
> thread am I?" to first go to the thread-local storage for the
> current thread, where a table of pointers to attr-stacks is kept and
> from there, index into that table to find the attr-stack that
> corresponds to the particular "check").  We could use the address of
> "check" as the key into this table, but "struct git_attr_check" that
> wraps the array gives us another option to allocate a small
> consecutive integer every time initl() creates a new "check" and use
> it as the index into that attr-stack table, as that integer index
> can be in the struct that wraps the array of wanted attributes.
>
>         Note. none of the above is a suggestion to do the attr
>         caching the way exactly described.  The above is primarily
>         to illustrate how a wrapping struct may give us future
>         flexibility without affecting a single line of code in the
>         user of API.
>
> It may turn out that we do not need to have anything other than the
> array of wanted attributes in the "check" struct, but unlike
> "result", "check" is shared across threads, and do not have to live
> directly on the stack, so we can prepare for flexibility.
>
> I do not foresee a similar need for wrapping struct for "result",
> and given that we do want to keep the option of having them directly
> on the stack, I am inclined to say we shouldn't introduce one.
>
> If we were still to do the wrapping for result, I would say that
> basing it around the FLEX_ARRAY idiom, i.e.
>
>>         struct git_attr_result {
>>                 int num_slots;
>>                 const char *value[FLEX_ARRAY];
>>         };
>
> is a horrible idea.  It would be less horrible if it were
>
>         struct git_attr_result {
>                 int num_slots;
>                 const char **value;
>         };

So const char** but with an additional number of slots, all we do
would be to compare this number of slots to the checks number of slots and
die("BUG:..."),  which is just a burden and no help.

>
> then make the API user write via a convenience macro something like
> this
>
>         const char *result_values[NUM_ATTRS_OF_INTEREST];
>         struct git_attr_result result = {
>                 ARRAY_SIZE(result_values), &result_values
>         };
>
> instead.  That way, at least the side that implements git_check_attr()
> would not have to be type-unsafe like the example of ugliness in the
> message I am following-up on.

Ok I will reroll with the const char** instead of the macro stuff that
I came up with,
(that would be type safe though uglier than the pure variant).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv3] attr: convert to new threadsafe API
  2016-10-18 23:52     ` Stefan Beller
@ 2016-10-19  0:06       ` Junio C Hamano
  2016-10-19  0:20         ` Stefan Beller
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2016-10-19  0:06 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Brandon Williams, git@vger.kernel.org, Johannes Sixt,
	Jacob Keller

On Tue, Oct 18, 2016 at 4:52 PM, Stefan Beller <sbeller@google.com> wrote:
>
> >
> > By the way, I do see a merit on the "check" side (tl;dr: but I do
> > not think "result" needs it, hence I do not see the need for the
> > "ugly" variants).
>
> So we'd rather go with const char **result instead of our own new struct there.
> Ok, got it.

I do not think you got it. I am talking about wrapping struct around
an array of element,
not each element in the array. IOW

> > If we were still to do the wrapping for result, I would say that
> > basing it around the FLEX_ARRAY idiom, i.e.
> >
> >>         struct git_attr_result {
> >>                 int num_slots;
> >>                 const char *value[FLEX_ARRAY];
> >>         };

the structure around the array of elements (value) that allows us to have
something other than value[] in it. That is what I said "I do not see
the need for".

It is perfectly fine future-proofing to have

struct git_attr_result_value {
  const char *value;
};

and have the users of API declare

struct git_attr_result value[5];

or whatever. That way we could fatten the structure later if we wanted
to without having to update the users of API, and there is no downside.

Having wrapping strut around the array does have a huge downside,
and that is what I said I see no need for.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv3] attr: convert to new threadsafe API
  2016-10-19  0:06       ` Junio C Hamano
@ 2016-10-19  0:20         ` Stefan Beller
  2016-10-19  0:40           ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Beller @ 2016-10-19  0:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brandon Williams, git@vger.kernel.org, Johannes Sixt,
	Jacob Keller

On Tue, Oct 18, 2016 at 5:06 PM, Junio C Hamano <gitster@pobox.com> wrote:
> On Tue, Oct 18, 2016 at 4:52 PM, Stefan Beller <sbeller@google.com> wrote:
>>
>> >
>> > By the way, I do see a merit on the "check" side (tl;dr: but I do
>> > not think "result" needs it, hence I do not see the need for the
>> > "ugly" variants).
>>
>> So we'd rather go with const char **result instead of our own new struct there.
>> Ok, got it.
>
> I do not think you got it. I am talking about wrapping struct around
> an array of element,
> not each element in the array. IOW
>
>> > If we were still to do the wrapping for result, I would say that
>> > basing it around the FLEX_ARRAY idiom, i.e.
>> >
>> >>         struct git_attr_result {
>> >>                 int num_slots;
>> >>                 const char *value[FLEX_ARRAY];
>> >>         };
>
> the structure around the array of elements (value) that allows us to have
> something other than value[] in it. That is what I said "I do not see
> the need for".
>
> It is perfectly fine future-proofing to have
>
> struct git_attr_result_value {
>   const char *value;
> };
>
> and have the users of API declare
>
> struct git_attr_result value[5];
>
> or whatever. That way we could fatten the structure later if we wanted
> to without having to update the users of API, and there is no downside.
>
> Having wrapping strut around the array does have a huge downside,
> and that is what I said I see no need for.

I am not sure if I see the upside on wrapping a single value except for
its future proofness, i.e. what if we want to transport information that
is valid for all values, e.g. an error code or that the result check was done
lazily (for non lazy you would need to do X) or ...

IOW I would expect there to be more use cases for information regarding
all the values and not each value enhanced by a thing.

We could just repeat the thing in each of the 5 values in
> struct git_attr_result value[5];
though.

I'll go with that.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv3] attr: convert to new threadsafe API
  2016-10-19  0:20         ` Stefan Beller
@ 2016-10-19  0:40           ` Junio C Hamano
  0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2016-10-19  0:40 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Brandon Williams, git@vger.kernel.org, Johannes Sixt,
	Jacob Keller

Stefan Beller <sbeller@google.com> writes:

> I am not sure if I see the upside on wrapping a single value except for
> its future proofness,

I do not see anything other than future-proofing, either.  If we
need to touch all the code that uses the attributes to update the
API, I'd prefer to avoid having to do that again in the future.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-10-19  0:40 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-12 22:41 [PATCHv3] attr: convert to new threadsafe API Stefan Beller
2016-10-12 23:33 ` Junio C Hamano
2016-10-13 18:42   ` Stefan Beller
2016-10-13 22:08     ` Stefan Beller
2016-10-14 15:37   ` Junio C Hamano
2016-10-18 23:52     ` Stefan Beller
2016-10-19  0:06       ` Junio C Hamano
2016-10-19  0:20         ` Stefan Beller
2016-10-19  0:40           ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).