[PATCHv3 0/8] fetch: make sure submodule oids are fetched

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCHv3 0/8] fetch: make sure submodule oids are fetched
@ 2018-09-21 22:35 Stefan Beller
  2018-09-21 22:35 ` [PATCH 1/8] sha1-array: provide oid_array_filter Stefan Beller
                   ` (7 more replies)
  0 siblings, 8 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

v3:
* I discovered some issues with v2 after sending,
  which is why I rewrote the later patches completely
  and now we pass around a "task" struct that contains everything to know
  about the things to work on and what needs free()ing afterwards.
* as it is no longer string list based, this drops adding string_list_{pop, last}

v2:
* extended commit messages,
* plugged a memory leak
* rewrote the patch "sha1-array: provide oid_array_filter" to be much more like 
  object_array_fiter
* fixed a typo pointed out by Ramsay.

The range diff is below.

Thanks,
Stefan

v1:
Currently when git-fetch is asked to recurse into submodules, it dispatches
a plain "git-fetch -C <submodule-dir>" (and some submodule related options
such as prefix and recusing strategy, but) without any information of the
remote or the tip that should be fetched.

This works surprisingly well in some workflows, not so well in others,
which this series aims to fix.

The first patches provide new basic functionality and do some refactoring;
the interesting part is in the two last patches.

This was discussed in
https://public-inbox.org/git/20180808221752.195419-1-sbeller@google.com/
and I think I addressed all feedback so far.

Stefan Beller (8):
  sha1-array: provide oid_array_filter
  submodule.c: fix indentation
  submodule.c: sort changed_submodule_names before searching it
  submodule: move global changed_submodule_names into fetch submodule
    struct
  submodule.c: do not copy around submodule list
  submodule: fetch in submodules git directory instead of in worktree
  fetch: retry fetching submodules if needed objects were not fetched
  builtin/fetch: check for submodule updates for non branch fetches

 builtin/fetch.c             |  14 +-
 sha1-array.c                |  17 +++
 sha1-array.h                |   9 ++
 submodule.c                 | 268 ++++++++++++++++++++++++++++--------
 t/t5526-fetch-submodules.sh |  23 +++-
 5 files changed, 268 insertions(+), 63 deletions(-)

-- 
2.19.0.444.g18242da7ef-goog

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
@ 2018-09-21 22:35 ` Stefan Beller
  2018-09-22 12:58   ` Ævar Arnfjörð Bjarmason
  2018-09-21 22:35 ` [PATCH 2/8] submodule.c: fix indentation Stefan Beller
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 sha1-array.c | 17 +++++++++++++++++
 sha1-array.h |  9 +++++++++
 2 files changed, 26 insertions(+)

diff --git a/sha1-array.c b/sha1-array.c
index b94e0ec0f5e..d922e94e3fc 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -77,3 +77,20 @@ int oid_array_for_each_unique(struct oid_array *array,
 	}
 	return 0;
 }
+
+void oid_array_filter(struct oid_array *array,
+		      for_each_oid_fn want,
+		      void *cb_data)
+{
+	unsigned nr = array->nr, src, dst;
+	struct object_id *oids = array->oid;
+
+	for (src = dst = 0; src < nr; src++) {
+		if (want(&oids[src], cb_data)) {
+			if (src != dst)
+				oidcpy(&oids[dst], &oids[src]);
+			dst++;
+		}
+	}
+	array->nr = dst;
+}
diff --git a/sha1-array.h b/sha1-array.h
index 232bf950172..275e5b02f5e 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -23,4 +23,13 @@ int oid_array_for_each_unique(struct oid_array *array,
 			      for_each_oid_fn fn,
 			      void *data);
 
+/*
+ * Apply want to each entry in array, retaining only the entries for
+ * which the function returns true.  Preserve the order of the entries
+ * that are retained.
+ */
+void oid_array_filter(struct oid_array *array,
+		      for_each_oid_fn want,
+		      void *cbdata);
+
 #endif /* SHA1_ARRAY_H */
-- 
2.19.0.444.g18242da7ef-goog


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/8] submodule.c: fix indentation
  2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
  2018-09-21 22:35 ` [PATCH 1/8] sha1-array: provide oid_array_filter Stefan Beller
@ 2018-09-21 22:35 ` Stefan Beller
  2018-09-21 22:35 ` [PATCH 3/8] submodule.c: sort changed_submodule_names before searching it Stefan Beller
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

The submodule subsystem is really bad at staying within 80 characters.
Fix it while we are here.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index ed05339b588..67469a8f513 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1245,7 +1245,8 @@ static int get_next_submodule(struct child_process *cp,
 		if (!submodule) {
 			const char *name = default_name_or_path(ce->name);
 			if (name) {
-				default_submodule.path = default_submodule.name = name;
+				default_submodule.path = name;
+				default_submodule.name = name;
 				submodule = &default_submodule;
 			}
 		}
@@ -1255,8 +1256,10 @@ static int get_next_submodule(struct child_process *cp,
 		default:
 		case RECURSE_SUBMODULES_DEFAULT:
 		case RECURSE_SUBMODULES_ON_DEMAND:
-			if (!submodule || !unsorted_string_list_lookup(&changed_submodule_names,
-							 submodule->name))
+			if (!submodule ||
+			    !unsorted_string_list_lookup(
+					&changed_submodule_names,
+					submodule->name))
 				continue;
 			default_argv = "on-demand";
 			break;
-- 
2.19.0.444.g18242da7ef-goog


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 3/8] submodule.c: sort changed_submodule_names before searching it
  2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
  2018-09-21 22:35 ` [PATCH 1/8] sha1-array: provide oid_array_filter Stefan Beller
  2018-09-21 22:35 ` [PATCH 2/8] submodule.c: fix indentation Stefan Beller
@ 2018-09-21 22:35 ` Stefan Beller
  2018-09-21 22:35 ` [PATCH 4/8] submodule: move global changed_submodule_names into fetch submodule struct Stefan Beller
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

We can string_list_insert() to maintain sorted-ness of the
list as we find new items, or we can string_list_append() to
build an unsorted list and sort it at the end just once.

To pick which one is more appropriate, we notice the fact
that we discover new items more or less in the already
sorted order.  That makes "append then sort" more
appropriate.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/submodule.c b/submodule.c
index 67469a8f513..2f5c19d0aac 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1257,7 +1257,7 @@ static int get_next_submodule(struct child_process *cp,
 		case RECURSE_SUBMODULES_DEFAULT:
 		case RECURSE_SUBMODULES_ON_DEMAND:
 			if (!submodule ||
-			    !unsorted_string_list_lookup(
+			    !string_list_lookup(
 					&changed_submodule_names,
 					submodule->name))
 				continue;
@@ -1351,6 +1351,7 @@ int fetch_populated_submodules(struct repository *r,
 	/* default value, "--submodule-prefix" and its value are added later */
 
 	calculate_changed_submodule_paths();
+	string_list_sort(&changed_submodule_names);
 	run_processes_parallel(max_parallel_jobs,
 			       get_next_submodule,
 			       fetch_start_failure,
-- 
2.19.0.444.g18242da7ef-goog


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 4/8] submodule: move global changed_submodule_names into fetch submodule struct
  2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
                   ` (2 preceding siblings ...)
  2018-09-21 22:35 ` [PATCH 3/8] submodule.c: sort changed_submodule_names before searching it Stefan Beller
@ 2018-09-21 22:35 ` Stefan Beller
  2018-09-21 22:35 ` [PATCH 5/8] submodule.c: do not copy around submodule list Stefan Beller
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

The `changed_submodule_names` are only used for fetching, so let's make it
part of the struct that is passed around for fetching submodules.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 42 +++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/submodule.c b/submodule.c
index 2f5c19d0aac..7b4d136849b 100644
--- a/submodule.c
+++ b/submodule.c
@@ -25,7 +25,7 @@
 #include "commit-reach.h"
 
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
-static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
+
 static int initialized_fetch_ref_tips;
 static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
@@ -1111,7 +1111,22 @@ void check_for_new_submodule_commits(struct object_id *oid)
 	oid_array_append(&ref_tips_after_fetch, oid);
 }
 
-static void calculate_changed_submodule_paths(void)
+struct submodule_parallel_fetch {
+	int count;
+	struct argv_array args;
+	struct repository *r;
+	const char *prefix;
+	int command_line_option;
+	int default_option;
+	int quiet;
+	int result;
+
+	struct string_list changed_submodule_names;
+};
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, STRING_LIST_INIT_DUP }
+
+static void calculate_changed_submodule_paths(
+	struct submodule_parallel_fetch *spf)
 {
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct string_list changed_submodules = STRING_LIST_INIT_DUP;
@@ -1149,7 +1164,8 @@ static void calculate_changed_submodule_paths(void)
 			continue;
 
 		if (!submodule_has_commits(path, commits))
-			string_list_append(&changed_submodule_names, name->string);
+			string_list_append(&spf->changed_submodule_names,
+					   name->string);
 	}
 
 	free_submodules_oids(&changed_submodules);
@@ -1186,18 +1202,6 @@ int submodule_touches_in_range(struct object_id *excl_oid,
 	return ret;
 }
 
-struct submodule_parallel_fetch {
-	int count;
-	struct argv_array args;
-	struct repository *r;
-	const char *prefix;
-	int command_line_option;
-	int default_option;
-	int quiet;
-	int result;
-};
-#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
-
 static int get_fetch_recurse_config(const struct submodule *submodule,
 				    struct submodule_parallel_fetch *spf)
 {
@@ -1258,7 +1262,7 @@ static int get_next_submodule(struct child_process *cp,
 		case RECURSE_SUBMODULES_ON_DEMAND:
 			if (!submodule ||
 			    !string_list_lookup(
-					&changed_submodule_names,
+					&spf->changed_submodule_names,
 					submodule->name))
 				continue;
 			default_argv = "on-demand";
@@ -1350,8 +1354,8 @@ int fetch_populated_submodules(struct repository *r,
 	argv_array_push(&spf.args, "--recurse-submodules-default");
 	/* default value, "--submodule-prefix" and its value are added later */
 
-	calculate_changed_submodule_paths();
-	string_list_sort(&changed_submodule_names);
+	calculate_changed_submodule_paths(&spf);
+	string_list_sort(&spf.changed_submodule_names);
 	run_processes_parallel(max_parallel_jobs,
 			       get_next_submodule,
 			       fetch_start_failure,
@@ -1360,7 +1364,7 @@ int fetch_populated_submodules(struct repository *r,
 
 	argv_array_clear(&spf.args);
 out:
-	string_list_clear(&changed_submodule_names, 1);
+	string_list_clear(&spf.changed_submodule_names, 1);
 	return spf.result;
 }
 
-- 
2.19.0.444.g18242da7ef-goog


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 5/8] submodule.c: do not copy around submodule list
  2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
                   ` (3 preceding siblings ...)
  2018-09-21 22:35 ` [PATCH 4/8] submodule: move global changed_submodule_names into fetch submodule struct Stefan Beller
@ 2018-09-21 22:35 ` Stefan Beller
  2018-09-21 22:35 ` [PATCH 6/8] submodule: fetch in submodules git directory instead of in worktree Stefan Beller
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

'calculate_changed_submodule_paths' uses a local list to compute the
changed submodules, and then produces the result by copying appropriate
items into the result list.

Instead use the result list directly and prune items afterwards
using string_list_remove_empty_items.

By doing so we'll have access to the util pointer for longer that
contains the commits that we need to fetch, which will be
useful in a later patch.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/submodule.c b/submodule.c
index 7b4d136849b..209680ec076 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1129,8 +1129,7 @@ static void calculate_changed_submodule_paths(
 	struct submodule_parallel_fetch *spf)
 {
 	struct argv_array argv = ARGV_ARRAY_INIT;
-	struct string_list changed_submodules = STRING_LIST_INIT_DUP;
-	const struct string_list_item *name;
+	struct string_list_item *name;
 
 	/* No need to check if there are no submodules configured */
 	if (!submodule_from_path(the_repository, NULL, NULL))
@@ -1147,9 +1146,9 @@ static void calculate_changed_submodule_paths(
 	 * Collect all submodules (whether checked out or not) for which new
 	 * commits have been recorded upstream in "changed_submodule_names".
 	 */
-	collect_changed_submodules(&changed_submodules, &argv);
+	collect_changed_submodules(&spf->changed_submodule_names, &argv);
 
-	for_each_string_list_item(name, &changed_submodules) {
+	for_each_string_list_item(name, &spf->changed_submodule_names) {
 		struct oid_array *commits = name->util;
 		const struct submodule *submodule;
 		const char *path = NULL;
@@ -1163,12 +1162,14 @@ static void calculate_changed_submodule_paths(
 		if (!path)
 			continue;
 
-		if (!submodule_has_commits(path, commits))
-			string_list_append(&spf->changed_submodule_names,
-					   name->string);
+		if (submodule_has_commits(path, commits)) {
+			oid_array_clear(commits);
+			*name->string = '\0';
+		}
 	}
 
-	free_submodules_oids(&changed_submodules);
+	string_list_remove_empty_items(&spf->changed_submodule_names, 1);
+
 	argv_array_clear(&argv);
 	oid_array_clear(&ref_tips_before_fetch);
 	oid_array_clear(&ref_tips_after_fetch);
@@ -1364,7 +1365,7 @@ int fetch_populated_submodules(struct repository *r,
 
 	argv_array_clear(&spf.args);
 out:
-	string_list_clear(&spf.changed_submodule_names, 1);
+	free_submodules_oids(&spf.changed_submodule_names);
 	return spf.result;
 }
 
-- 
2.19.0.444.g18242da7ef-goog


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 6/8] submodule: fetch in submodules git directory instead of in worktree
  2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
                   ` (4 preceding siblings ...)
  2018-09-21 22:35 ` [PATCH 5/8] submodule.c: do not copy around submodule list Stefan Beller
@ 2018-09-21 22:35 ` Stefan Beller
  2018-09-21 22:35 ` [PATCH 7/8] fetch: retry fetching submodules if needed objects were not fetched Stefan Beller
  2018-09-21 22:35 ` [PATCH 8/8] builtin/fetch: check for submodule updates for non branch fetches Stefan Beller
  7 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

This patch started as a refactoring to make 'get_next_submodule' more
readable, but upon doing so, I realized that "git fetch" of the submodule
actually doesn't need to be run in the submodules worktree. So let's run
it in its git dir instead.

That should pave the way towards fetching submodules that are currently
not checked out.

This patch leaks the cp->dir in get_next_submodule, as any further
callback in run_processes_parallel doesn't have access to the child
process any more. In an early iteration of this patch, the function
get_submodule_repo_for directly returned the string containing the
git directory, which would be a better design choice for this patch.

However the next patch both fixes the memory leak of cp->dir and also has
a use case for using the full repository handle of the submodule, so
it makes sense to introduce the get_submodule_repo_for here already.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c                 | 47 +++++++++++++++++++++++++++----------
 t/t5526-fetch-submodules.sh |  7 +++++-
 2 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/submodule.c b/submodule.c
index 209680ec076..8829e8d4eff 100644
--- a/submodule.c
+++ b/submodule.c
@@ -482,6 +482,12 @@ void prepare_submodule_repo_env(struct argv_array *out)
 			 DEFAULT_GIT_DIR_ENVIRONMENT);
 }
 
+static void prepare_submodule_repo_env_in_gitdir(struct argv_array *out)
+{
+	prepare_submodule_repo_env_no_git_dir(out);
+	argv_array_pushf(out, "%s=.", GIT_DIR_ENVIRONMENT);
+}
+
 /* Helper function to display the submodule header line prior to the full
  * summary output. If it can locate the submodule objects directory it will
  * attempt to lookup both the left and right commits and put them into the
@@ -1228,6 +1234,25 @@ static int get_fetch_recurse_config(const struct submodule *submodule,
 	return spf->default_option;
 }
 
+static struct repository *get_submodule_repo_for(struct repository *r,
+						 const char *path)
+{
+	struct repository *ret = xmalloc(sizeof(*ret));
+
+	if (repo_submodule_init(ret, r, path)) {
+		/* no entry in .gitmodules? */
+		struct strbuf gitdir = STRBUF_INIT;
+		strbuf_repo_worktree_path(&gitdir, r, "%s/.git", path);
+		if (repo_init(ret, gitdir.buf, NULL)) {
+			strbuf_release(&gitdir);
+			return NULL;
+		}
+		strbuf_release(&gitdir);
+	}
+
+	return ret;
+}
+
 static int get_next_submodule(struct child_process *cp,
 			      struct strbuf *err, void *data, void **task_cb)
 {
@@ -1235,12 +1260,11 @@ static int get_next_submodule(struct child_process *cp,
 	struct submodule_parallel_fetch *spf = data;
 
 	for (; spf->count < spf->r->index->cache_nr; spf->count++) {
-		struct strbuf submodule_path = STRBUF_INIT;
-		struct strbuf submodule_git_dir = STRBUF_INIT;
 		struct strbuf submodule_prefix = STRBUF_INIT;
 		const struct cache_entry *ce = spf->r->index->cache[spf->count];
-		const char *git_dir, *default_argv;
+		const char *default_argv;
 		const struct submodule *submodule;
+		struct repository *repo;
 		struct submodule default_submodule = SUBMODULE_INIT;
 
 		if (!S_ISGITLINK(ce->ce_mode))
@@ -1275,16 +1299,12 @@ static int get_next_submodule(struct child_process *cp,
 			continue;
 		}
 
-		strbuf_repo_worktree_path(&submodule_path, spf->r, "%s", ce->name);
-		strbuf_addf(&submodule_git_dir, "%s/.git", submodule_path.buf);
 		strbuf_addf(&submodule_prefix, "%s%s/", spf->prefix, ce->name);
-		git_dir = read_gitfile(submodule_git_dir.buf);
-		if (!git_dir)
-			git_dir = submodule_git_dir.buf;
-		if (is_directory(git_dir)) {
+		repo = get_submodule_repo_for(spf->r, ce->name);
+		if (repo) {
 			child_process_init(cp);
-			cp->dir = strbuf_detach(&submodule_path, NULL);
-			prepare_submodule_repo_env(&cp->env_array);
+			prepare_submodule_repo_env_in_gitdir(&cp->env_array);
+			cp->dir = xstrdup(repo->gitdir);
 			cp->git_cmd = 1;
 			if (!spf->quiet)
 				strbuf_addf(err, "Fetching submodule %s%s\n",
@@ -1294,10 +1314,11 @@ static int get_next_submodule(struct child_process *cp,
 			argv_array_push(&cp->args, default_argv);
 			argv_array_push(&cp->args, "--submodule-prefix");
 			argv_array_push(&cp->args, submodule_prefix.buf);
+
+			repo_clear(repo);
+			free(repo);
 			ret = 1;
 		}
-		strbuf_release(&submodule_path);
-		strbuf_release(&submodule_git_dir);
 		strbuf_release(&submodule_prefix);
 		if (ret) {
 			spf->count++;
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 6c2f9b2ba26..42692219a1a 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -566,7 +566,12 @@ test_expect_success 'fetching submodule into a broken repository' '
 
 	test_must_fail git -C dst status &&
 	test_must_fail git -C dst diff &&
-	test_must_fail git -C dst fetch --recurse-submodules
+
+	# git-fetch cannot find the git directory of the submodule,
+	# so it will do nothing, successfully, as it cannot distinguish between
+	# this broken submodule and a submodule that was just set active but
+	# not cloned yet
+	git -C dst fetch --recurse-submodules
 '
 
 test_expect_success "fetch new commits when submodule got renamed" '
-- 
2.19.0.444.g18242da7ef-goog


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 7/8] fetch: retry fetching submodules if needed objects were not fetched
  2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
                   ` (5 preceding siblings ...)
  2018-09-21 22:35 ` [PATCH 6/8] submodule: fetch in submodules git directory instead of in worktree Stefan Beller
@ 2018-09-21 22:35 ` Stefan Beller
  2018-09-21 22:35 ` [PATCH 8/8] builtin/fetch: check for submodule updates for non branch fetches Stefan Beller
  7 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Currently when git-fetch is asked to recurse into submodules, it dispatches
a plain "git-fetch -C <submodule-dir>" (with some submodule related options
such as prefix and recusing strategy, but) without any information of the
remote or the tip that should be fetched.

This works surprisingly well in some workflows (such as using submodules
as a third party library), while not so well in other scenarios, such
as in a Gerrit topic-based workflow, that can tie together changes
(potentially across repositories) on the server side. One of the parts
of such a Gerrit workflow is to download a change when wanting to examine
it, and you'd want to have its submodule changes that are in the same
topic downloaded as well. However these submodule changes reside in their
own repository in their own ref (refs/changes/<int>).

Retry fetching a submodule by object name if the object id that the
superproject points to, cannot be found.

This retrying does not happen when the "git fetch" done at the
superproject is not storing the fetched results in remote
tracking branches (i.e. instead just recording them to
FETCH_HEAD) in this step. A later patch will fix this.

builtin/fetch used to only inspect submodules when they were fetched
"on-demand", as in either on/off case it was clear whether the submodule
needs to be fetched. However to know whether we need to try fetching the
object ids, we need to identify the object names, which is done in this
function check_for_new_submodule_commits(), so we'll also run that code
in case the submodule recursion is set to "on".

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/fetch.c             |   9 +-
 submodule.c                 | 182 ++++++++++++++++++++++++++++++------
 t/t5526-fetch-submodules.sh |  16 ++++
 3 files changed, 174 insertions(+), 33 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0696abfc2a1..e3b03ad3bd3 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -707,8 +707,7 @@ static int update_local_ref(struct ref *ref,
 			what = _("[new ref]");
 		}
 
-		if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
-		    (recurse_submodules != RECURSE_SUBMODULES_ON))
+		if (recurse_submodules != RECURSE_SUBMODULES_OFF)
 			check_for_new_submodule_commits(&ref->new_oid);
 		r = s_update_ref(msg, ref, 0);
 		format_display(display, r ? '!' : '*', what,
@@ -723,8 +722,7 @@ static int update_local_ref(struct ref *ref,
 		strbuf_add_unique_abbrev(&quickref, &current->object.oid, DEFAULT_ABBREV);
 		strbuf_addstr(&quickref, "..");
 		strbuf_add_unique_abbrev(&quickref, &ref->new_oid, DEFAULT_ABBREV);
-		if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
-		    (recurse_submodules != RECURSE_SUBMODULES_ON))
+		if (recurse_submodules != RECURSE_SUBMODULES_OFF)
 			check_for_new_submodule_commits(&ref->new_oid);
 		r = s_update_ref("fast-forward", ref, 1);
 		format_display(display, r ? '!' : ' ', quickref.buf,
@@ -738,8 +736,7 @@ static int update_local_ref(struct ref *ref,
 		strbuf_add_unique_abbrev(&quickref, &current->object.oid, DEFAULT_ABBREV);
 		strbuf_addstr(&quickref, "...");
 		strbuf_add_unique_abbrev(&quickref, &ref->new_oid, DEFAULT_ABBREV);
-		if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
-		    (recurse_submodules != RECURSE_SUBMODULES_ON))
+		if (recurse_submodules != RECURSE_SUBMODULES_OFF)
 			check_for_new_submodule_commits(&ref->new_oid);
 		r = s_update_ref("forced-update", ref, 1);
 		format_display(display, r ? '!' : '+', quickref.buf,
diff --git a/submodule.c b/submodule.c
index 8829e8d4eff..ba98a94d4db 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1128,8 +1128,12 @@ struct submodule_parallel_fetch {
 	int result;
 
 	struct string_list changed_submodule_names;
+	struct get_next_submodule_task **retry;
+	int retry_nr, retry_alloc;
 };
-#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, STRING_LIST_INIT_DUP }
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, \
+		  STRING_LIST_INIT_DUP, \
+		  NULL, 0, 0}
 
 static void calculate_changed_submodule_paths(
 	struct submodule_parallel_fetch *spf)
@@ -1234,6 +1238,56 @@ static int get_fetch_recurse_config(const struct submodule *submodule,
 	return spf->default_option;
 }
 
+struct get_next_submodule_task {
+	struct repository *repo;
+	const struct submodule *sub;
+	unsigned free_sub : 1; /* Do we need to free the submodule? */
+	struct oid_array *commits;
+};
+
+static const struct submodule *get_default_submodule(const char *path)
+{
+	struct submodule *ret = NULL;
+	const char *name = default_name_or_path(path);
+
+	if (!name)
+		return NULL;
+
+	ret = xmalloc(sizeof(*ret));
+	memset(ret, 0, sizeof(*ret));
+	ret->path = name;
+	ret->name = name;
+
+	return (const struct submodule *) ret;
+}
+
+static struct get_next_submodule_task *get_next_submodule_task_create(
+	struct repository *r, const char *path)
+{
+	struct get_next_submodule_task *task = xmalloc(sizeof(*task));
+	memset(task, 0, sizeof(*task));
+
+	task->sub = submodule_from_path(r, &null_oid, path);
+	if (!task->sub) {
+		task->sub = get_default_submodule(path);
+		task->free_sub = 1;
+	}
+
+	return task;
+}
+
+static void get_next_submodule_task_release(struct get_next_submodule_task *p)
+{
+	if (p->free_sub)
+		free((void*)p->sub);
+	p->free_sub = 0;
+	p->sub = NULL;
+
+	if (p->repo)
+		repo_clear(p->repo);
+	FREE_AND_NULL(p->repo);
+}
+
 static struct repository *get_submodule_repo_for(struct repository *r,
 						 const char *path)
 {
@@ -1256,39 +1310,35 @@ static struct repository *get_submodule_repo_for(struct repository *r,
 static int get_next_submodule(struct child_process *cp,
 			      struct strbuf *err, void *data, void **task_cb)
 {
-	int ret = 0;
 	struct submodule_parallel_fetch *spf = data;
 
 	for (; spf->count < spf->r->index->cache_nr; spf->count++) {
-		struct strbuf submodule_prefix = STRBUF_INIT;
+		int recurse_config;
 		const struct cache_entry *ce = spf->r->index->cache[spf->count];
 		const char *default_argv;
-		const struct submodule *submodule;
-		struct repository *repo;
-		struct submodule default_submodule = SUBMODULE_INIT;
+		struct get_next_submodule_task *task;
 
 		if (!S_ISGITLINK(ce->ce_mode))
 			continue;
 
-		submodule = submodule_from_path(spf->r, &null_oid, ce->name);
-		if (!submodule) {
-			const char *name = default_name_or_path(ce->name);
-			if (name) {
-				default_submodule.path = name;
-				default_submodule.name = name;
-				submodule = &default_submodule;
-			}
+		task = get_next_submodule_task_create(spf->r, ce->name);
+
+		if (!task->sub) {
+			free(task);
+			continue;
 		}
 
-		switch (get_fetch_recurse_config(submodule, spf))
+		recurse_config = get_fetch_recurse_config(task->sub, spf);
+
+		switch (recurse_config)
 		{
 		default:
 		case RECURSE_SUBMODULES_DEFAULT:
 		case RECURSE_SUBMODULES_ON_DEMAND:
-			if (!submodule ||
+			if (!task->sub ||
 			    !string_list_lookup(
 					&spf->changed_submodule_names,
-					submodule->name))
+					task->sub->name))
 				continue;
 			default_argv = "on-demand";
 			break;
@@ -1299,12 +1349,13 @@ static int get_next_submodule(struct child_process *cp,
 			continue;
 		}
 
-		strbuf_addf(&submodule_prefix, "%s%s/", spf->prefix, ce->name);
-		repo = get_submodule_repo_for(spf->r, ce->name);
-		if (repo) {
+		task->repo = get_submodule_repo_for(spf->r,
+						       task->sub->path);
+		if (task->repo) {
+			struct strbuf submodule_prefix = STRBUF_INIT;
 			child_process_init(cp);
 			prepare_submodule_repo_env_in_gitdir(&cp->env_array);
-			cp->dir = xstrdup(repo->gitdir);
+			cp->dir = task->repo->gitdir;
 			cp->git_cmd = 1;
 			if (!spf->quiet)
 				strbuf_addf(err, "Fetching submodule %s%s\n",
@@ -1313,18 +1364,51 @@ static int get_next_submodule(struct child_process *cp,
 			argv_array_pushv(&cp->args, spf->args.argv);
 			argv_array_push(&cp->args, default_argv);
 			argv_array_push(&cp->args, "--submodule-prefix");
+
+			strbuf_addf(&submodule_prefix, "%s%s/",
+						       spf->prefix,
+						       task->sub->path);
 			argv_array_push(&cp->args, submodule_prefix.buf);
 
-			repo_clear(repo);
-			free(repo);
-			ret = 1;
-		}
-		strbuf_release(&submodule_prefix);
-		if (ret) {
 			spf->count++;
+			*task_cb = task;
+
+			strbuf_release(&submodule_prefix);
 			return 1;
+		} else {
+			get_next_submodule_task_release(task);
+			free(task);
 		}
 	}
+
+	if (spf->retry_nr) {
+		struct get_next_submodule_task *task = spf->retry[spf->retry_nr - 1];
+		struct strbuf submodule_prefix = STRBUF_INIT;
+		spf->retry_nr--;
+
+		strbuf_addf(&submodule_prefix, "%s%s/", spf->prefix, task->sub->path);
+
+		child_process_init(cp);
+		prepare_submodule_repo_env_in_gitdir(&cp->env_array);
+		cp->git_cmd = 1;
+		cp->dir = task->repo->gitdir;
+
+		argv_array_init(&cp->args);
+		argv_array_pushv(&cp->args, spf->args.argv);
+		argv_array_push(&cp->args, "on-demand");
+		argv_array_push(&cp->args, "--submodule-prefix");
+		argv_array_push(&cp->args, submodule_prefix.buf);
+
+		/* NEEDSWORK: have get_default_remote from s--h */
+		argv_array_push(&cp->args, "origin");
+		oid_array_for_each_unique(task->commits,
+					  append_oid_to_argv, &cp->args);
+
+		*task_cb = task;
+		strbuf_release(&submodule_prefix);
+		return 1;
+	}
+
 	return 0;
 }
 
@@ -1332,20 +1416,64 @@ static int fetch_start_failure(struct strbuf *err,
 			       void *cb, void *task_cb)
 {
 	struct submodule_parallel_fetch *spf = cb;
+	struct get_next_submodule_task *task = task_cb;
 
 	spf->result = 1;
 
+	get_next_submodule_task_release(task);
 	return 0;
 }
 
+static int commit_exists_in_sub(const struct object_id *oid, void *data)
+{
+	struct repository *subrepo = data;
+
+	enum object_type type = oid_object_info(subrepo, oid, NULL);
+
+	return type != OBJ_COMMIT;
+}
+
 static int fetch_finish(int retvalue, struct strbuf *err,
 			void *cb, void *task_cb)
 {
 	struct submodule_parallel_fetch *spf = cb;
+	struct get_next_submodule_task *task = task_cb;
+	const struct submodule *sub;
+
+	struct string_list_item *it;
+	struct oid_array *commits;
 
 	if (retvalue)
 		spf->result = 1;
 
+	if (!task)
+		return 0;
+
+	sub = task->sub;
+	if (!sub)
+		goto out;
+
+	it = string_list_lookup(&spf->changed_submodule_names, sub->name);
+	if (!it)
+		goto out;
+
+	commits = it->util;
+	oid_array_filter(commits,
+			 commit_exists_in_sub,
+			 task->repo);
+
+	trace_printf("checking for submodule: needs %d more commits", commits->nr);
+	if (commits->nr) {
+		task->commits = commits;
+		ALLOC_GROW(spf->retry, spf->retry_nr + 1, spf->retry_alloc);
+		spf->retry[spf->retry_nr] = task;
+		spf->retry_nr++;
+		return 0;
+	}
+
+out:
+	get_next_submodule_task_release(task);
+
 	return 0;
 }
 
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 42692219a1a..af12c50e7dd 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -605,4 +605,20 @@ test_expect_success "fetch new commits when submodule got renamed" '
 	test_cmp expect actual
 '
 
+test_expect_success "fetch new commits on-demand when they are not reachable" '
+	git checkout --detach &&
+	C=$(git -C submodule commit-tree -m "new change outside refs/heads" HEAD^{tree}) &&
+	git -C submodule update-ref refs/changes/1 $C &&
+	git update-index --cacheinfo 160000 $C submodule &&
+	git commit -m "updated submodule outside of refs/heads" &&
+	D=$(git rev-parse HEAD) &&
+	git update-ref refs/changes/2 $D &&
+	(
+		cd downstream &&
+		git fetch --recurse-submodules --recurse-submodules-default on-demand origin refs/changes/2:refs/heads/my_branch &&
+		git -C submodule cat-file -t $C &&
+		git checkout --recurse-submodules FETCH_HEAD
+	)
+'
+
 test_done
-- 
2.19.0.444.g18242da7ef-goog


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 8/8] builtin/fetch: check for submodule updates for non branch fetches
  2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
                   ` (6 preceding siblings ...)
  2018-09-21 22:35 ` [PATCH 7/8] fetch: retry fetching submodules if needed objects were not fetched Stefan Beller
@ 2018-09-21 22:35 ` Stefan Beller
  7 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-21 22:35 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Gerrit, the code review tool, has a different workflow than our mailing
list based approach. Usually users upload changes to a Gerrit server and
continuous integration and testing happens by bots. Sometimes however a
user wants to checkout a change locally and look at it locally. For this
use case, Gerrit offers a command line snippet to copy and paste to your
terminal, which looks like

  git fetch https://<host>/gerrit refs/changes/<id> &&
  git checkout FETCH_HEAD

For Gerrit changes that contain changing submodule gitlinks, it would be
easy to extend both the fetch and checkout with the '--recurse-submodules'
flag, such that this command line snippet would produce the state of a
change locally.

However the functionality added in the previous patch, which would
ensure that we fetch the objects in the submodule that the gitlink pointed
at, only works for remote tracking branches so far, not for FETCH_HEAD.

Make sure that fetching a superproject to its FETCH_HEAD, also respects
the existence checks for objects in the submodule recursion.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/fetch.c             | 5 ++++-
 t/t5526-fetch-submodules.sh | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index e3b03ad3bd3..f2d9e548bf0 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -894,11 +894,14 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 				rc |= update_local_ref(ref, what, rm, &note,
 						       summary_width);
 				free(ref);
-			} else
+			} else {
+				check_for_new_submodule_commits(&rm->old_oid);
 				format_display(&note, '*',
 					       *kind ? kind : "branch", NULL,
 					       *what ? what : "HEAD",
 					       "FETCH_HEAD", summary_width);
+			}
+
 			if (note.len) {
 				if (verbosity >= 0 && !shown_url) {
 					fprintf(stderr, _("From %.*s\n"),
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index af12c50e7dd..a509eabb044 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -615,7 +615,7 @@ test_expect_success "fetch new commits on-demand when they are not reachable" '
 	git update-ref refs/changes/2 $D &&
 	(
 		cd downstream &&
-		git fetch --recurse-submodules --recurse-submodules-default on-demand origin refs/changes/2:refs/heads/my_branch &&
+		git fetch --recurse-submodules origin refs/changes/2 &&
 		git -C submodule cat-file -t $C &&
 		git checkout --recurse-submodules FETCH_HEAD
 	)
-- 
2.19.0.444.g18242da7ef-goog

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-21 22:35 ` [PATCH 1/8] sha1-array: provide oid_array_filter Stefan Beller
@ 2018-09-22 12:58   ` Ævar Arnfjörð Bjarmason
  2018-09-25 19:26     ` Stefan Beller
  0 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-22 12:58 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git


On Fri, Sep 21 2018, Stefan Beller wrote:

> +/*
> + * Apply want to each entry in array, retaining only the entries for
> + * which the function returns true.  Preserve the order of the entries
> + * that are retained.
> + */
> +void oid_array_filter(struct oid_array *array,
> +		      for_each_oid_fn want,
> +		      void *cbdata);
> +
>  #endif /* SHA1_ARRAY_H */

The code LGTM, but this comment should instead be an update to the API
docs, see my recent 5cc044e025 ("get_short_oid: sort ambiguous objects
by type, then SHA-1", 2018-05-10) for an addition of a new function to
this API where I added some new docs.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-22 12:58   ` Ævar Arnfjörð Bjarmason
@ 2018-09-25 19:26     ` Stefan Beller
  2018-09-26  4:15       ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Stefan Beller @ 2018-09-25 19:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On Sat, Sep 22, 2018 at 5:58 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Fri, Sep 21 2018, Stefan Beller wrote:
>
> > +/*
> > + * Apply want to each entry in array, retaining only the entries for
> > + * which the function returns true.  Preserve the order of the entries
> > + * that are retained.
> > + */
> > +void oid_array_filter(struct oid_array *array,
> > +                   for_each_oid_fn want,
> > +                   void *cbdata);
> > +
> >  #endif /* SHA1_ARRAY_H */
>
> The code LGTM, but this comment should instead be an update to the API
> docs, see my recent 5cc044e025 ("get_short_oid: sort ambiguous objects
> by type, then SHA-1", 2018-05-10) for an addition of a new function to
> this API where I added some new docs.

ok will fix for consistency (this whole API is there).

Longer term (I thought) we were trying to migrate API docs
to headers instead?

Stefan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-25 19:26     ` Stefan Beller
@ 2018-09-26  4:15       ` Jeff King
  2018-09-26 17:10         ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2018-09-26  4:15 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Ævar Arnfjörð Bjarmason, git

On Tue, Sep 25, 2018 at 12:26:44PM -0700, Stefan Beller wrote:

> On Sat, Sep 22, 2018 at 5:58 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
> >
> >
> > On Fri, Sep 21 2018, Stefan Beller wrote:
> >
> > > +/*
> > > + * Apply want to each entry in array, retaining only the entries for
> > > + * which the function returns true.  Preserve the order of the entries
> > > + * that are retained.
> > > + */
> > > +void oid_array_filter(struct oid_array *array,
> > > +                   for_each_oid_fn want,
> > > +                   void *cbdata);
> > > +
> > >  #endif /* SHA1_ARRAY_H */
> >
> > The code LGTM, but this comment should instead be an update to the API
> > docs, see my recent 5cc044e025 ("get_short_oid: sort ambiguous objects
> > by type, then SHA-1", 2018-05-10) for an addition of a new function to
> > this API where I added some new docs.
> 
> ok will fix for consistency (this whole API is there).
> 
> Longer term (I thought) we were trying to migrate API docs
> to headers instead?

Yes, please. I think it prevents exactly this sort of confusion. :)

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26  4:15       ` Jeff King
@ 2018-09-26 17:10         ` Junio C Hamano
  2018-09-26 17:49           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2018-09-26 17:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, Ævar Arnfjörð Bjarmason, git

Jeff King <peff@peff.net> writes:

> On Tue, Sep 25, 2018 at 12:26:44PM -0700, Stefan Beller wrote:
>
>> On Sat, Sep 22, 2018 at 5:58 AM Ævar Arnfjörð Bjarmason
>> <avarab@gmail.com> wrote:
>> >
>> >
>> > On Fri, Sep 21 2018, Stefan Beller wrote:
>> >
>> > > +/*
>> > > + * Apply want to each entry in array, retaining only the entries for
>> > > + * which the function returns true.  Preserve the order of the entries
>> > > + * that are retained.
>> > > + */
>> > > +void oid_array_filter(struct oid_array *array,
>> > > +                   for_each_oid_fn want,
>> > > +                   void *cbdata);
>> > > +
>> > >  #endif /* SHA1_ARRAY_H */
>> >
>> > The code LGTM, but this comment should instead be an update to the API
>> > docs, see my recent 5cc044e025 ("get_short_oid: sort ambiguous objects
>> > by type, then SHA-1", 2018-05-10) for an addition of a new function to
>> > this API where I added some new docs.
>> 
>> ok will fix for consistency (this whole API is there).
>> 
>> Longer term (I thought) we were trying to migrate API docs
>> to headers instead?
>
> Yes, please. I think it prevents exactly this sort of confusion. :)

CodingGuidelines or SubmittingPatches update, perhaps?

 Documentation/CodingGuidelines | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 48aa4edfbd..b54684e807 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -358,7 +358,11 @@ For C programs:
    string_list for sorted string lists, a hash map (mapping struct
    objects) named "struct decorate", amongst other things.
 
- - When you come up with an API, document it.
+ - When you come up with an API, document it.  It used to be
+   encouraged to do so in Documentation/technical/, and the birds-eye
+   level overview may still be more suitable there, but detailed
+   function-by-function level of documentation is done by comments in
+   corresponding .h files these days.
 
  - The first #include in C files, except in platform specific compat/
    implementations, must be either "git-compat-util.h", "cache.h" or

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 17:10         ` Junio C Hamano
@ 2018-09-26 17:49           ` Ævar Arnfjörð Bjarmason
  2018-09-26 18:27             ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-26 17:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Stefan Beller, git


On Wed, Sep 26 2018, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
>
>> On Tue, Sep 25, 2018 at 12:26:44PM -0700, Stefan Beller wrote:
>>
>>> On Sat, Sep 22, 2018 at 5:58 AM Ævar Arnfjörð Bjarmason
>>> <avarab@gmail.com> wrote:
>>> >
>>> >
>>> > On Fri, Sep 21 2018, Stefan Beller wrote:
>>> >
>>> > > +/*
>>> > > + * Apply want to each entry in array, retaining only the entries for
>>> > > + * which the function returns true.  Preserve the order of the entries
>>> > > + * that are retained.
>>> > > + */
>>> > > +void oid_array_filter(struct oid_array *array,
>>> > > +                   for_each_oid_fn want,
>>> > > +                   void *cbdata);
>>> > > +
>>> > >  #endif /* SHA1_ARRAY_H */
>>> >
>>> > The code LGTM, but this comment should instead be an update to the API
>>> > docs, see my recent 5cc044e025 ("get_short_oid: sort ambiguous objects
>>> > by type, then SHA-1", 2018-05-10) for an addition of a new function to
>>> > this API where I added some new docs.
>>>
>>> ok will fix for consistency (this whole API is there).
>>>
>>> Longer term (I thought) we were trying to migrate API docs
>>> to headers instead?
>>
>> Yes, please. I think it prevents exactly this sort of confusion. :)
>
> CodingGuidelines or SubmittingPatches update, perhaps?
>
>  Documentation/CodingGuidelines | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 48aa4edfbd..b54684e807 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -358,7 +358,11 @@ For C programs:
>     string_list for sorted string lists, a hash map (mapping struct
>     objects) named "struct decorate", amongst other things.
>
> - - When you come up with an API, document it.
> + - When you come up with an API, document it.  It used to be
> +   encouraged to do so in Documentation/technical/, and the birds-eye
> +   level overview may still be more suitable there, but detailed
> +   function-by-function level of documentation is done by comments in
> +   corresponding .h files these days.
>
>   - The first #include in C files, except in platform specific compat/
>     implementations, must be either "git-compat-util.h", "cache.h" or

Thanks. I had not looked at this closely and was under the false
impression that it was going in the other direction. Good to have it
clarified.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 17:49           ` Ævar Arnfjörð Bjarmason
@ 2018-09-26 18:27             ` Junio C Hamano
  2018-09-26 18:34               ` Jeff King
  2018-09-26 18:43               ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2018-09-26 18:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jeff King, Stefan Beller, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Wed, Sep 26 2018, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>>>
>>> Yes, please. I think it prevents exactly this sort of confusion. :)
>>
>> CodingGuidelines or SubmittingPatches update, perhaps?
>>
>>  Documentation/CodingGuidelines | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
>> index 48aa4edfbd..b54684e807 100644
>> --- a/Documentation/CodingGuidelines
>> +++ b/Documentation/CodingGuidelines
>> @@ -358,7 +358,11 @@ For C programs:
>>     string_list for sorted string lists, a hash map (mapping struct
>>     objects) named "struct decorate", amongst other things.
>>
>> - - When you come up with an API, document it.
>> + - When you come up with an API, document it.  It used to be
>> +   encouraged to do so in Documentation/technical/, and the birds-eye
>> +   level overview may still be more suitable there, but detailed
>> +   function-by-function level of documentation is done by comments in
>> +   corresponding .h files these days.
>>
>>   - The first #include in C files, except in platform specific compat/
>>     implementations, must be either "git-compat-util.h", "cache.h" or
>
> Thanks. I had not looked at this closely and was under the false
> impression that it was going in the other direction. Good to have it
> clarified.

Heh, I knew people were in favor of one over the other but until
Peff chimed in to this thread, I didn't recall which one was
preferred, partly because I personally do not see a huge advantage
in using in-code comments as docs for programmers, and do not like
having to read them as in-code comments.

If somebody wants to wordsmith the text and send in a patch with
good log message, please do so, as I myself am not sure if what I
wrote is the consensus position.  It could be that they want to have
even birds-eye overview in the header files.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 18:27             ` Junio C Hamano
@ 2018-09-26 18:34               ` Jeff King
  2018-09-26 18:43               ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 51+ messages in thread
From: Jeff King @ 2018-09-26 18:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Stefan Beller, git

On Wed, Sep 26, 2018 at 11:27:53AM -0700, Junio C Hamano wrote:

> >> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> >> index 48aa4edfbd..b54684e807 100644
> >> --- a/Documentation/CodingGuidelines
> >> +++ b/Documentation/CodingGuidelines
> >> @@ -358,7 +358,11 @@ For C programs:
> >>     string_list for sorted string lists, a hash map (mapping struct
> >>     objects) named "struct decorate", amongst other things.
> >>
> >> - - When you come up with an API, document it.
> >> + - When you come up with an API, document it.  It used to be
> >> +   encouraged to do so in Documentation/technical/, and the birds-eye
> >> +   level overview may still be more suitable there, but detailed
> >> +   function-by-function level of documentation is done by comments in
> >> +   corresponding .h files these days.
> >>
> >>   - The first #include in C files, except in platform specific compat/
> >>     implementations, must be either "git-compat-util.h", "cache.h" or
> >
> > Thanks. I had not looked at this closely and was under the false
> > impression that it was going in the other direction. Good to have it
> > clarified.
> 
> Heh, I knew people were in favor of one over the other but until
> Peff chimed in to this thread, I didn't recall which one was
> preferred, partly because I personally do not see a huge advantage
> in using in-code comments as docs for programmers, and do not like
> having to read them as in-code comments.
> 
> If somebody wants to wordsmith the text and send in a patch with
> good log message, please do so, as I myself am not sure if what I
> wrote is the consensus position.  It could be that they want to have
> even birds-eye overview in the header files.

Yes, I would say that everything should go into the header files. For
the same reason that the function descriptions should go there: by being
close to the thing being changed, it is more likely that authors will
remember to adjust the documentation. It's not exactly literate
programming, but it's a step in that direction.

That's just my opinion, of course. I've been very happy so far with the
documentation that we have transitioned. We talked a while ago about a
script to extract the comments into a "just documentation" and build it
as asciidoc, but I doubt I would use such a thing myself.

I do agree in general that mentioning this in CodingGuidelines is a good
idea.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 18:27             ` Junio C Hamano
  2018-09-26 18:34               ` Jeff King
@ 2018-09-26 18:43               ` Ævar Arnfjörð Bjarmason
  2018-09-26 18:58                 ` Jeff King
  1 sibling, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-26 18:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Stefan Beller, git


On Wed, Sep 26 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Wed, Sep 26 2018, Junio C Hamano wrote:
>>
>>> Jeff King <peff@peff.net> writes:
>>>>
>>>> Yes, please. I think it prevents exactly this sort of confusion. :)
>>>
>>> CodingGuidelines or SubmittingPatches update, perhaps?
>>>
>>>  Documentation/CodingGuidelines | 6 +++++-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
>>> index 48aa4edfbd..b54684e807 100644
>>> --- a/Documentation/CodingGuidelines
>>> +++ b/Documentation/CodingGuidelines
>>> @@ -358,7 +358,11 @@ For C programs:
>>>     string_list for sorted string lists, a hash map (mapping struct
>>>     objects) named "struct decorate", amongst other things.
>>>
>>> - - When you come up with an API, document it.
>>> + - When you come up with an API, document it.  It used to be
>>> +   encouraged to do so in Documentation/technical/, and the birds-eye
>>> +   level overview may still be more suitable there, but detailed
>>> +   function-by-function level of documentation is done by comments in
>>> +   corresponding .h files these days.
>>>
>>>   - The first #include in C files, except in platform specific compat/
>>>     implementations, must be either "git-compat-util.h", "cache.h" or
>>
>> Thanks. I had not looked at this closely and was under the false
>> impression that it was going in the other direction. Good to have it
>> clarified.
>
> Heh, I knew people were in favor of one over the other but until
> Peff chimed in to this thread, I didn't recall which one was
> preferred, partly because I personally do not see a huge advantage
> in using in-code comments as docs for programmers, and do not like
> having to read them as in-code comments.
>
> If somebody wants to wordsmith the text and send in a patch with
> good log message, please do so, as I myself am not sure if what I
> wrote is the consensus position.  It could be that they want to have
> even birds-eye overview in the header files.

While we're on that subject, I'd much prefer if these technical docs and
other asciidoc-ish stuff we would be manpages build as part of our
normal "make doc" target. So e.g. this one would be readable via
something like "man 3 gitapi-oid-array".

They could still be maintained along with the code given a sufficiently
smart doc build system, e.g. perl does that:
https://github.com/Perl/perl5/blob/v5.26.0/av.c#L294-L387

Although doing that in some readable and sane way would mean getting rid
of your beloved multi-line /* ... */ comment formatting, at least for
that case. It would be a pain to have everyone configure their editor to
word-wrap lines with leading "*" without screwing up the asciidoc
format.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 18:43               ` Ævar Arnfjörð Bjarmason
@ 2018-09-26 18:58                 ` Jeff King
  2018-09-26 19:39                   ` Stefan Beller
                                     ` (3 more replies)
  0 siblings, 4 replies; 51+ messages in thread
From: Jeff King @ 2018-09-26 18:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Junio C Hamano, Stefan Beller, git

On Wed, Sep 26, 2018 at 08:43:14PM +0200, Ævar Arnfjörð Bjarmason wrote:

> While we're on that subject, I'd much prefer if these technical docs and
> other asciidoc-ish stuff we would be manpages build as part of our
> normal "make doc" target. So e.g. this one would be readable via
> something like "man 3 gitapi-oid-array".

I'm mildly negative on this, just because it introduces extra work on
people writing the documentation. Now it has to follow special
formatting rules, and sometimes the source is uglier (e.g., the horrible
+-continuation in lists). Are authors now responsible for formatting any
changes they make to make sure they look good in asciidoc? Or are people
who care about the formatted output going to come along afterwards and
submit fixup patches? Either way it seems like make-work.

Now I'll admit it seems like make-work to me because I would not plan to
ever look at the formatted output myself. But I guess I don't understand
the audience for this formatted output. These are APIs internal to Git
itself. We would not generally want to install gitapi-oid-array into
/usr/share/man, because only people actually working on Git would be
able to use it. So it sounds like a convenience for a handful of
developers (who like to look at this manpage versus the source). It
doesn't seem like the cost/benefit is there.

And if we were going to generate something external, would it make more
sense to write in a structured format like doxygen? I am not a big fan
of it myself, but at least from there you can generate a more richly
interconnected set of documentation.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 18:58                 ` Jeff King
@ 2018-09-26 19:39                   ` Stefan Beller
  2018-09-26 19:49                   ` Junio C Hamano
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-26 19:39 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Junio C Hamano, git

On Wed, Sep 26, 2018 at 11:58 AM Jeff King <peff@peff.net> wrote:

> Now I'll admit it seems like make-work to me because I would not plan to
> ever look at the formatted output myself. But I guess I don't understand
> the audience for this formatted output.

I have been watching from the side lines so far, as I have a view that is very
much like Peffs, if I were to dictate my opinion onto the project, we'd:
* put all internal API docs into headerfiles.
* get rid of all Documentation/technical/ that describes things
   at a function level. So the remaining things are high level docs such
   as protocol, hash transition plan, partial clones...

But I'd want to understand why we are not doing this (obvious to me)
best thing, I have the impression other people use the docs differently.

How are these docs (for example api-oid-array or
api-revision-walking) used?

I usually tend to use git-grep a lot when looking for a function, just to
jump to the header file and read the comment there and go around the
header file looking if we have a better suited thing.
(Also I tend to like to have fewer files open, such that I prefer
a header file of reasonable size over 2 files, one docs and one header)

So when I find a function documented in Docs/technical, I would first
read there, then go to the header to see the signature and then make
use of it.

If I recall an earlier discussion on this topic, I recall Junio saying that
this *may* pollute the header files, as for example sha1-array.h
is easy to grok in its entirety, and makes it easy to see what
functions regarding oid_arrays are available. A counter example would
be hashmap.h, as that requires some scrolling and skimming.

> These are APIs internal to Git
> itself. We would not generally want to install gitapi-oid-array into
> /usr/share/man, because only people actually working on Git would be
> able to use it. So it sounds like a convenience for a handful of
> developers (who like to look at this manpage versus the source). It
> doesn't seem like the cost/benefit is there.

If this type of docs makes Ævar more productive, I am all for keeping it.

> And if we were going to generate something external, would it make more
> sense to write in a structured format like doxygen? I am not a big fan
> of it myself, but at least from there you can generate a more richly
> interconnected set of documentation.

That sounds sensible to me as the additional burden of yet-another-tool
is only imposed to the developers, but doxygen would provide
a better output in my experience.

Stefan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 18:58                 ` Jeff King
  2018-09-26 19:39                   ` Stefan Beller
@ 2018-09-26 19:49                   ` Junio C Hamano
  2018-09-26 19:59                     ` Stefan Beller
  2018-09-26 20:44                   ` On shipping more of our technical docs as manpages Ævar Arnfjörð Bjarmason
  2018-09-27  6:48                   ` [PATCH 1/8] sha1-array: provide oid_array_filter Jeff King
  3 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2018-09-26 19:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Stefan Beller, git

Jeff King <peff@peff.net> writes:

> Now I'll admit it seems like make-work to me because I would not plan to
> ever look at the formatted output myself. But I guess I don't understand
> the audience for this formatted output. These are APIs internal to Git
> itself. We would not generally want to install gitapi-oid-array into
> /usr/share/man, because only people actually working on Git would be
> able to use it. So it sounds like a convenience for a handful of
> developers (who like to look at this manpage versus the source). It
> doesn't seem like the cost/benefit is there.
>
> And if we were going to generate something external, would it make more
> sense to write in a structured format like doxygen? I am not a big fan
> of it myself, but at least from there you can generate a more richly
> interconnected set of documentation.

I agree on both counts.  I just like to read these in plain text
while I am coding for Git (or reviewing patches coded for Git).  

The reason why I have mild preference to D/technical/ over in-header
doc is only because I find even these asterisks at the left-side-end
distracting; it is not that materials in D/technical could be passed
through AsciiDoc.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 19:49                   ` Junio C Hamano
@ 2018-09-26 19:59                     ` Stefan Beller
  2018-09-26 20:19                       ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Stefan Beller @ 2018-09-26 19:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Ævar Arnfjörð Bjarmason, git

> I agree on both counts.  I just like to read these in plain text
> while I am coding for Git (or reviewing patches coded for Git).
>
> The reason why I have mild preference to D/technical/ over in-header
> doc is only because I find even these asterisks at the left-side-end
> distracting; it is not that materials in D/technical could be passed
> through AsciiDoc.

    /**
    Would we be open to consider another commenting style?
    AFAICT the asterisks are a cultural thing and not enforced by
    and old compiler not understanding these comments.

    Just like ending the comment in an asymetric fashion ;-) */

(I am not seriously proposing unbalanced comments, but for
longer documenting comments we may want to consider,
omitting the asterisk?)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 19:59                     ` Stefan Beller
@ 2018-09-26 20:19                       ` Junio C Hamano
  2018-09-26 20:51                         ` Jeff King
  2018-09-26 21:22                         ` Stefan Beller
  0 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2018-09-26 20:19 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, Ævar Arnfjörð Bjarmason, git

Stefan Beller <sbeller@google.com> writes:

>> I agree on both counts.  I just like to read these in plain text
>> while I am coding for Git (or reviewing patches coded for Git).
>>
>> The reason why I have mild preference to D/technical/ over in-header
>> doc is only because I find even these asterisks at the left-side-end
>> distracting; it is not that materials in D/technical could be passed
>> through AsciiDoc.
>
>     /**
>     Would we be open to consider another commenting style?

No.  You still cannot write */ in that comment section, for example,
so you cannot do "plain text" still---that is not a great trade off
to deviate from the common comment style used for other stuff.

When I say I want plain text, I mean it.  Anything you write in .h
cannot be plain text, unless you make all readers agree with some
convention, and "there is always '*' at the left edge" is such an
acceptable convention can learn to live with, just like everybody
else does ; at least, that is consistent with other comments.

> (I am not seriously proposing unbalanced comments, but for
> longer documenting comments we may want to consider,
> omitting the asterisk?)

If this is not a serious proposal, then I'll stop wasting
everybody's time.

Now it is clear that the only pieces of consensus we got out of this
discussion so far is that we want to add to CodingGuidelines, that
my earlier "something like this" is not the text we want and that we
want everything in the header as comment.

So let's see if we can have usable alternative, apply that and move
on.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* On shipping more of our technical docs as manpages
  2018-09-26 18:58                 ` Jeff King
  2018-09-26 19:39                   ` Stefan Beller
  2018-09-26 19:49                   ` Junio C Hamano
@ 2018-09-26 20:44                   ` Ævar Arnfjörð Bjarmason
  2018-09-26 21:40                     ` Junio C Hamano
                                       ` (2 more replies)
  2018-09-27  6:48                   ` [PATCH 1/8] sha1-array: provide oid_array_filter Jeff King
  3 siblings, 3 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-26 20:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Stefan Beller, git

[Changing subject because this stopped being about Stefan's patches a
while ago]

On Wed, Sep 26 2018, Jeff King wrote:

> On Wed, Sep 26, 2018 at 08:43:14PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> While we're on that subject, I'd much prefer if these technical docs and
>> other asciidoc-ish stuff we would be manpages build as part of our
>> normal "make doc" target. So e.g. this one would be readable via
>> something like "man 3 gitapi-oid-array".

[responding out of order]

> Now I'll admit it seems like make-work to me because I would not plan to
> ever look at the formatted output myself. But I guess I don't understand
> the audience for this formatted output. These are APIs internal to Git
> itself. We would not generally want to install gitapi-oid-array into
> /usr/share/man, because only people actually working on Git would be
> able to use it. So it sounds like a convenience for a handful of
> developers (who like to look at this manpage versus the source). It
> doesn't seem like the cost/benefit is there.
>
> And if we were going to generate something external, would it make more
> sense to write in a structured format like doxygen? I am not a big fan
> of it myself, but at least from there you can generate a more richly
> interconnected set of documentation.

It's useful to have a single authoritative source for all documentation
that's easy to search through.

I don't really care where that text actually lives, i.e. whether it's
all in *.txt to begin with, or in *.c or *.h and pulled together by some
"make" step, or much how it's formatted, beyond it being easy to work
with. I'm not a big fan of asciidoc, but it's what we've got if we want
formatting, inter-linking etc.

My bias here is that I've also contributed actively to the perl project
in the past, and with that project you can get an overview of *all* of
the docs by typing:

    man perl

That includes stuff like perl585delta(1) which we'd stick in
Documentation/RelNotes, and "Internals and C Language Interface". Most
of what we'd put in Documentation/technical/api-* & headers is in
perlapi(1).

I spent an embarrassingly long time submitting patches to git before
discovering that Documentation/technical/api-*.txt even existed, I think
something like 1-2 years ago.

We have very well documented stuff like strbuf.h (mostly thanks to you),
but how is such documentation supposed to be discovered? Something like:

    git grep -A20 '^/\*$' -- *.h
    <hold in page down>

?

In terms of getting an overview it's indistinguishable from
comments. I.e. there's nothing like an index of:

    man git-api-strbuf     ==> working with strings
    man git-api-sha1-array ==> list, iterate and binary lookup SHA1s

etc.

Sometimes it's obvious where to look, but as someone who uses most of
these APIs very occasionally (because I contribute less) I keep
(re-)discovering various internal APIs.

I'm also not in the habit of opening the *.h file for something, I
usually just start reading the *.c and only later discover there's some
API docs in the relevant *.h (or not).

So in terms of implementation I'm a big fan of the perl.git approach of
having these docs as comments before the function implementation in the
*.c file.

It means you can't avoid seeing it or updating it when source
spelunking, and it leaves header files short, which is useful when you'd
like to get a general API overview without all the docs. Our documented
headers are quite fat, e.g. strbuf.h is 60% of the size of strbuf.c, but
less than 20% when you strip the docs.

> I'm mildly negative on this, just because it introduces extra work on
> people writing the documentation. Now it has to follow special
> formatting rules, and sometimes the source is uglier (e.g., the horrible
> +-continuation in lists). Are authors now responsible for formatting any
> changes they make to make sure they look good in asciidoc? Or are people
> who care about the formatted output going to come along afterwards and
> submit fixup patches? Either way it seems like make-work.

This part I'm slightly confused by. If you're just saying "let's
document stuff in *.h files in free-flowing text", fine. But if we're
talking about the difference between Documentation/technical/api-*.txt
and having the same stuff in manpages, then the stuff in api-*.txt *is*
already asciidoc, and we have a target to format it all up and ship it
with git, and distros ship that.

E.g. on Debian you can "apt install git-doc" and browse stuff like
file:///usr/share/doc/git-doc/technical/api-argv-array.html which is the
HTML formatted version, same for all the other api-*.txt docs.

We just don't have some central index of those, and they're not a
default target which means the likes of our own https://git-scm.com
don't build them, so they're harder to find.

Every perl installation also ships perlapi and a MB or so of obscure
internal docs to everyone, which makes it easier to find via Google et
al, which I find useful. So I guess I'm more on the side fence of
dropping a hypothetical gitapi-oid-array into /usr/share/man than you
are.

ANYWAY

This E-mail is much longer than I intended, sorry about that. I didn't
just mean to bitch about how we ship docs, but I was wondering if there
was a desire to move to something like what I've outlined above, or
whether the status quo was mostly by design and intended.

I just thought I'd write this rather lengthy E-Mails because this is one
of the rare areas where you can fully describe an idea in E-Mail without
writing any patches.

If the consensus is that something like the exhaustive index "perldoc
perl" provides wouldn't be useful for git I can just drop this, but if
people are enthusiastic about having that it would be useful to work on
it...

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 20:19                       ` Junio C Hamano
@ 2018-09-26 20:51                         ` Jeff King
  2018-09-26 21:22                         ` Stefan Beller
  1 sibling, 0 replies; 51+ messages in thread
From: Jeff King @ 2018-09-26 20:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, Ævar Arnfjörð Bjarmason, git

On Wed, Sep 26, 2018 at 01:19:20PM -0700, Junio C Hamano wrote:

> >     /**
> >     Would we be open to consider another commenting style?
> 
> No.  You still cannot write */ in that comment section, for example,
> so you cannot do "plain text" still---that is not a great trade off
> to deviate from the common comment style used for other stuff.

We can do:

#if HEADER_DOC

Write anything here except #endif!

#endif /* HEADER_DOC */

But I actually think that is more jarring than a big comment (when I am
reading C code, I expect to see C-like things mostly.

I do agree with you that a true plain text file is nicer in terms of
freedom to format, etc. But IMHO the tradeoff is easily worth it to keep
the related content (function declarations and their documentation)
close together.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 20:19                       ` Junio C Hamano
  2018-09-26 20:51                         ` Jeff King
@ 2018-09-26 21:22                         ` Stefan Beller
  1 sibling, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-26 21:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Ævar Arnfjörð Bjarmason, git

> > (I am not seriously proposing unbalanced comments, but for
> > longer documenting comments we may want to consider,
> > omitting the asterisk?)
>
> If this is not a serious proposal, then I'll stop wasting
> everybody's time.

I proposed two things, one of which I was not serious about.

The other has downsides as both you and Peff point out.
So living with comments may not be the worst after all.

I'll read&reply to Aevars proposal now.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: On shipping more of our technical docs as manpages
  2018-09-26 20:44                   ` On shipping more of our technical docs as manpages Ævar Arnfjörð Bjarmason
@ 2018-09-26 21:40                     ` Junio C Hamano
  2018-09-26 23:21                     ` Stefan Beller
  2018-09-27  6:20                     ` Jeff King
  2 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2018-09-26 21:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jeff King, Stefan Beller, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> So in terms of implementation I'm a big fan of the perl.git approach of
> having these docs as comments before the function implementation in the
> *.c file.
>
> It means you can't avoid seeing it or updating it when source
> spelunking, and it leaves header files short, which is useful when you'd
> like to get a general API overview without all the docs. Our documented
> headers are quite fat, e.g. strbuf.h is 60% of the size of strbuf.c, but
> less than 20% when you strip the docs.

Just like you I do not care too much where the authoritative source
lives, and I do agree with Peff that separate text file is much more
likely to go stale wrt the code, and that it is an acceptable trade
off to write docs in comment and live with slightly less freedom
than writing in plain text in order to keep the docs and source
closer together.  I do not think between Peff and me, neither of us
were contrasting in-header vs in-code comments.

And I tend to agree that it makes it even harder to let doc and code
drift apart to have the doc near the implementation (as opposed to
in the header).  It probably makes the result less useful, unless
the project invests in making the result accessible like "man
perlapi" does, than having them in the header for people who view
headers as guide to users of API, though.

> We just don't have some central index of those, and they're not a
> default target which means the likes of our own https://git-scm.com
> don't build them, so they're harder to find.
>
> Every perl installation also ships perlapi and a MB or so of obscure
> internal docs to everyone, which makes it easier to find via Google et
> al, which I find useful. So I guess I'm more on the side fence of
> dropping a hypothetical gitapi-oid-array into /usr/share/man than you
> are.

Regardless of where we place docs (between .c/.h or .txt), making
them indexable and investing in make target to actually produce
these docs with ToC is different matter.  I do not think people who
actually spent time on our docs had enough inclination to do so
themselves, but that should not prevent you or other like-minded
people from leading by example.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: On shipping more of our technical docs as manpages
  2018-09-26 20:44                   ` On shipping more of our technical docs as manpages Ævar Arnfjörð Bjarmason
  2018-09-26 21:40                     ` Junio C Hamano
@ 2018-09-26 23:21                     ` Stefan Beller
  2018-09-27  8:58                       ` Ævar Arnfjörð Bjarmason
  2018-09-27  6:20                     ` Jeff King
  2 siblings, 1 reply; 51+ messages in thread
From: Stefan Beller @ 2018-09-26 23:21 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jeff King, Junio C Hamano, git

On Wed, Sep 26, 2018 at 1:44 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

> > And if we were going to generate something external, would it make more
> > sense to write in a structured format like doxygen? I am not a big fan
> > of it myself, but at least from there you can generate a more richly
> > interconnected set of documentation.
>
> It's useful to have a single authoritative source for all documentation
> that's easy to search through.

If that is the case I would propose to keep it all in header files and
organize the headers.

> That includes stuff like perl585delta(1) which we'd stick in
> Documentation/RelNotes, and "Internals and C Language Interface". Most
> of what we'd put in Documentation/technical/api-* & headers is in
> perlapi(1).

This seems cool, but was also a recent introduction?
perl400delta seems to yield nothing for me (which may be because
I do not have an old version of perl installed?)

>
> Sometimes it's obvious where to look, but as someone who uses most of
> these APIs very occasionally (because I contribute less) I keep
> (re-)discovering various internal APIs.

Sometimes I have the same feeling. Maybe more structure in the
source files would help (e.g. datastructures/{strbuf, string-list}.h
and objects/{packfile.h, sha1*} ?

> Every perl installation also ships perlapi and a MB or so of obscure
> internal docs to everyone, which makes it easier to find via Google et
> al, which I find useful. So I guess I'm more on the side fence of
> dropping a hypothetical gitapi-oid-array into /usr/share/man than you
> are.
>

Personally I would not want to ship our internal docs everywhere
as it seems like overly wasteful. But then, only my early days
of childhood were guided by Internet that is not available anywhere
and everywhere. Today I would just take for granted that I can lookup
things in github/git/git when I am in not at my regular desk.

> ANYWAY
>
> This E-mail is much longer than I intended, sorry about that. I didn't
> just mean to bitch about how we ship docs, but I was wondering if there
> was a desire to move to something like what I've outlined above, or
> whether the status quo was mostly by design and intended.
>
> I just thought I'd write this rather lengthy E-Mails because this is one
> of the rare areas where you can fully describe an idea in E-Mail without
> writing any patches.
>
> If the consensus is that something like the exhaustive index "perldoc
> perl" provides wouldn't be useful for git I can just drop this, but if
> people are enthusiastic about having that it would be useful to work on
> it...

I agree with Junio, as that the discoverability is unrelated to where to store
the actual docs, Documentation/technical/* has the advantage that it
only has "good" stuff, i.e. some topic that someone cared enough to
write about and it is easy to guess if it is relevant to your need.
In *.h, we have a lot of false positives (xdiff-interface.h or cache.h
just pollute the searching space when looking for suitable storage
classes.)

So I wonder if we'd want to have a list (similar as in
command-list.txt) that describes the header files and gives
some classification of them, for example one class could be the
data structures (strbuf, stringlist, hashmap), algorithms
(diff, range-diff), OS specific stuff (run-command, trace, alloc)
or Git specific things (apply, fetch, submodule)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: On shipping more of our technical docs as manpages
  2018-09-26 20:44                   ` On shipping more of our technical docs as manpages Ævar Arnfjörð Bjarmason
  2018-09-26 21:40                     ` Junio C Hamano
  2018-09-26 23:21                     ` Stefan Beller
@ 2018-09-27  6:20                     ` Jeff King
  2018-09-27  6:34                       ` Jonathan Nieder
  2018-09-27 17:36                       ` Junio C Hamano
  2 siblings, 2 replies; 51+ messages in thread
From: Jeff King @ 2018-09-27  6:20 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Junio C Hamano, Stefan Beller, git

On Wed, Sep 26, 2018 at 10:44:33PM +0200, Ævar Arnfjörð Bjarmason wrote:

> My bias here is that I've also contributed actively to the perl project
> in the past, and with that project you can get an overview of *all* of
> the docs by typing:
> 
>     man perl
> 
> That includes stuff like perl585delta(1) which we'd stick in
> Documentation/RelNotes, and "Internals and C Language Interface". Most
> of what we'd put in Documentation/technical/api-* & headers is in
> perlapi(1).

I like the perl documentation, too. But I think most of what you're
talking about there is the public API, where you have many readers. But
here we're talking about internal functions (albeit ones we hope to see
reused across the codebase). My concern is mostly: is the work in
maintaining this documentation-formatting system worth the number of
readers it will get?

There are two numbers there, and I'm guessing at both of them. ;)

If you're interested in pulling documentation out of the header files
and generating asciidoc from it, I'm happy to at least try keeping it up
to date. When we started putting this information into header files, we
used "/**" to start the comment, as a special marker to indicate it was
worth pulling out. I don't know how well we've maintained that
convention, but it's a starting point.

> I spent an embarrassingly long time submitting patches to git before
> discovering that Documentation/technical/api-*.txt even existed, I think
> something like 1-2 years ago.

That's another thing that I think is improved by keeping the
documentation and code together. If you find one, you find the other.
You just have to "somehow" find one, which is what you're getting at
below.

> We have very well documented stuff like strbuf.h (mostly thanks to you),
> but how is such documentation supposed to be discovered? Something like:
> 
>     git grep -A20 '^/\*$' -- *.h
>     <hold in page down>
> 
> ?
> 
> In terms of getting an overview it's indistinguishable from
> comments. I.e. there's nothing like an index of:
> 
>     man git-api-strbuf     ==> working with strings
>     man git-api-sha1-array ==> list, iterate and binary lookup SHA1s

I agree that is a problem, especially for contributors less familiar
with the code base. But I think generating an index is a separate (and
much easier) problem than formatting all of the documentation.

We already have the /** convention I mentioned above. Could we have
another micro-format like:

  /** API:strbuf - working with strings */

in each header file? That would make generating such an index pretty
trivial.

> I'm also not in the habit of opening the *.h file for something, I
> usually just start reading the *.c and only later discover there's some
> API docs in the relevant *.h (or not).

Interesting. I'm not totally opposed to putting the documentation
alongside the actual code. It does feel a bit cluttered to me, but I
think you're right that it keeps everything as close together as
possible.

> It means you can't avoid seeing it or updating it when source
> spelunking, and it leaves header files short, which is useful when you'd
> like to get a general API overview without all the docs. Our documented
> headers are quite fat, e.g. strbuf.h is 60% of the size of strbuf.c, but
> less than 20% when you strip the docs.

I don't use folds in my editor, and I guess nobody else in this thread
does, either. But they may be a reasonable tool for "wow, there are
comments, declarations, and definitions all together and I just want to
view one of them". In vim, try "set foldmethod=syntax" and then "zc" to
close the folds.

I kind of hate it myself, but just another option to explore.

> > I'm mildly negative on this, just because it introduces extra work on
> > people writing the documentation. Now it has to follow special
> > formatting rules, and sometimes the source is uglier (e.g., the horrible
> > +-continuation in lists). Are authors now responsible for formatting any
> > changes they make to make sure they look good in asciidoc? Or are people
> > who care about the formatted output going to come along afterwards and
> > submit fixup patches? Either way it seems like make-work.
> 
> This part I'm slightly confused by. If you're just saying "let's
> document stuff in *.h files in free-flowing text", fine. But if we're
> talking about the difference between Documentation/technical/api-*.txt
> and having the same stuff in manpages, then the stuff in api-*.txt *is*
> already asciidoc, and we have a target to format it all up and ship it
> with git, and distros ship that.

Yes, the "free-flowing text" thing is more or less what I'm saying. I
suspect what's in technical/* currently probably has formatting
problems, and people update it (if we're lucky!) without regard to what
the result looks like.

> E.g. on Debian you can "apt install git-doc" and browse stuff like
> file:///usr/share/doc/git-doc/technical/api-argv-array.html which is the
> HTML formatted version, same for all the other api-*.txt docs.

IMHO this is just silly. What am I as an end user going to do with
api-argv-array.html?

> This E-mail is much longer than I intended, sorry about that. I didn't
> just mean to bitch about how we ship docs, but I was wondering if there
> was a desire to move to something like what I've outlined above, or
> whether the status quo was mostly by design and intended.

I think we're actually half-way between status quos. The stuff in
technical/api-* has existed for a long time. We've been slowly moving
stuff over to header files, but there's still a bit of stuff there.

Building those files as asciidoc is one of those things that just
cropped up along the way. I have never thought it was useful myself, but
somebody bothered to write the patches, so OK. The person who did so is
not even somebody who has otherwise worked on the project, AFAICT.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: On shipping more of our technical docs as manpages
  2018-09-27  6:20                     ` Jeff King
@ 2018-09-27  6:34                       ` Jonathan Nieder
  2018-09-27  6:40                         ` Jeff King
  2018-09-27 17:36                       ` Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Jonathan Nieder @ 2018-09-27  6:34 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Stefan Beller, git

Hi,

Jeff King wrote:
> On Wed, Sep 26, 2018 at 10:44:33PM +0200, Ævar Arnfjörð Bjarmason wrote:

>> In terms of getting an overview it's indistinguishable from
>> comments. I.e. there's nothing like an index of:
>>
>>     man git-api-strbuf     ==> working with strings
>>     man git-api-sha1-array ==> list, iterate and binary lookup SHA1s
>
> I agree that is a problem, especially for contributors less familiar
> with the code base. But I think generating an index is a separate (and
> much easier) problem than formatting all of the documentation.
>
> We already have the /** convention I mentioned above. Could we have
> another micro-format like:
>
>   /** API:strbuf - working with strings */
>
> in each header file? That would make generating such an index pretty
> trivial.

Can you spell out the problem for me a little more?  E.g. if we had a
convention that the first comment in strbuf.h should say

	/* strbuf - Git's standard string type */

or even just

	/* Git's standard string type */

would that do the trick?

>> I'm also not in the habit of opening the *.h file for something, I
>> usually just start reading the *.c and only later discover there's some
>> API docs in the relevant *.h (or not).
>
> Interesting. I'm not totally opposed to putting the documentation
> alongside the actual code. It does feel a bit cluttered to me, but I
> think you're right that it keeps everything as close together as
> possible.

I've experienced projects following both conventions.  One thing I like
a lot about documentation in the header file is that it encourages
people to make the API documentation self-contained.  That is, you
describe the contract in a way that doesn't require reading the
implementation for details.

It took me a while to get used to this kind of convention but now I
really like it.  So I really do prefer to keep putting API
documentation in the header files in Git.  (Implementation
documentation in the source files is of course also very welcome.)

>> It means you can't avoid seeing it or updating it when source
>> spelunking, and it leaves header files short, which is useful when you'd
>> like to get a general API overview without all the docs. Our documented
>> headers are quite fat, e.g. strbuf.h is 60% of the size of strbuf.c, but
>> less than 20% when you strip the docs.
>
> I don't use folds in my editor, and I guess nobody else in this thread
> does, either. But they may be a reasonable tool for "wow, there are
> comments, declarations, and definitions all together and I just want to
> view one of them". In vim, try "set foldmethod=syntax" and then "zc" to
> close the folds.

I use kythe to get an outline view of the header files.

[...]
>> E.g. on Debian you can "apt install git-doc" and browse stuff like
>> file:///usr/share/doc/git-doc/technical/api-argv-array.html which is the
>> HTML formatted version, same for all the other api-*.txt docs.
>
> IMHO this is just silly. What am I as an end user going to do with
> api-argv-array.html?

In Debian we just ship all the docs.  For something like
technical/pack-heuristics, it's quite nice.  For the API docs, I think
they belong in the headers.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: On shipping more of our technical docs as manpages
  2018-09-27  6:34                       ` Jonathan Nieder
@ 2018-09-27  6:40                         ` Jeff King
  0 siblings, 0 replies; 51+ messages in thread
From: Jeff King @ 2018-09-27  6:40 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Stefan Beller, git

On Wed, Sep 26, 2018 at 11:34:04PM -0700, Jonathan Nieder wrote:

> > We already have the /** convention I mentioned above. Could we have
> > another micro-format like:
> >
> >   /** API:strbuf - working with strings */
> >
> > in each header file? That would make generating such an index pretty
> > trivial.
> 
> Can you spell out the problem for me a little more?  E.g. if we had a
> convention that the first comment in strbuf.h should say
> 
> 	/* strbuf - Git's standard string type */
> 
> or even just
> 
> 	/* Git's standard string type */
> 
> would that do the trick?

I assumed that not all header files would want to advertise themselves
as a public API, so we'd want to some way to differentiate these from
first comments in "other" header files (and I assumed we did not want a
separate list of "these are the API files to go into the index", at
which point you might as well just write "standard string type" in that
list).

But I'm happy with any convention that lets Ævar generate the index he
wants.

> > Interesting. I'm not totally opposed to putting the documentation
> > alongside the actual code. It does feel a bit cluttered to me, but I
> > think you're right that it keeps everything as close together as
> > possible.
> 
> I've experienced projects following both conventions.  One thing I like
> a lot about documentation in the header file is that it encourages
> people to make the API documentation self-contained.  That is, you
> describe the contract in a way that doesn't require reading the
> implementation for details.
> 
> It took me a while to get used to this kind of convention but now I
> really like it.  So I really do prefer to keep putting API
> documentation in the header files in Git.  (Implementation
> documentation in the source files is of course also very welcome.)

Yeah, I'd agree with all of that.

> I use kythe to get an outline view of the header files.

Looks neat. Doesn't seem to be package for Debian, though. ;)

> In Debian we just ship all the docs.  For something like
> technical/pack-heuristics, it's quite nice.  For the API docs, I think
> they belong in the headers.

Yes, I'd agree with that. About half of technical/* is design docs, etc,
that might be of use to random folks. But the internal code APIs are
really only useful if you have an actual clone of git.git.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/8] sha1-array: provide oid_array_filter
  2018-09-26 18:58                 ` Jeff King
                                     ` (2 preceding siblings ...)
  2018-09-26 20:44                   ` On shipping more of our technical docs as manpages Ævar Arnfjörð Bjarmason
@ 2018-09-27  6:48                   ` Jeff King
  3 siblings, 0 replies; 51+ messages in thread
From: Jeff King @ 2018-09-27  6:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Junio C Hamano, Stefan Beller, git

On Wed, Sep 26, 2018 at 02:58:12PM -0400, Jeff King wrote:

> And if we were going to generate something external, would it make more
> sense to write in a structured format like doxygen? I am not a big fan
> of it myself, but at least from there you can generate a more richly
> interconnected set of documentation.

By the way, I tried running Doxygen on the current code-base. I _still_
hate the presentation of it (mostly because it's jarring to jump out of
my terminal to a web browser). But it actually does an OK job of showing
strbuf.h, with our overview attached to the struct and individual
functions correctly documented. There are a couple of formatting
hiccups, but it's actually pretty close.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: On shipping more of our technical docs as manpages
  2018-09-26 23:21                     ` Stefan Beller
@ 2018-09-27  8:58                       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-27  8:58 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, Junio C Hamano, git


On Wed, Sep 26 2018, Stefan Beller wrote:

> On Wed, Sep 26, 2018 at 1:44 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>
>> > And if we were going to generate something external, would it make more
>> > sense to write in a structured format like doxygen? I am not a big fan
>> > of it myself, but at least from there you can generate a more richly
>> > interconnected set of documentation.
>>
>> It's useful to have a single authoritative source for all documentation
>> that's easy to search through.
>
> If that is the case I would propose to keep it all in header files and
> organize the headers.
>
>> That includes stuff like perl585delta(1) which we'd stick in
>> Documentation/RelNotes, and "Internals and C Language Interface". Most
>> of what we'd put in Documentation/technical/api-* & headers is in
>> perlapi(1).
>
> This seems cool, but was also a recent introduction?
> perl400delta seems to yield nothing for me (which may be because
> I do not have an old version of perl installed?)

Depends on what you think is "recent" I suppose. Perl 5.4 is the first
version where deltas in POD format started being maintained
consistently, that version was released in mid-1997. Perl 4 was released
in 1991, see "perldoc perlhist". So ~everything consistently in POD has
been the case for ~20 years.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: On shipping more of our technical docs as manpages
  2018-09-27  6:20                     ` Jeff King
  2018-09-27  6:34                       ` Jonathan Nieder
@ 2018-09-27 17:36                       ` Junio C Hamano
  2018-09-27 18:25                         ` Jeff King
  2018-09-27 21:27                         ` [PATCH] Documentation/CodingGuidelines: How to document new APIs Stefan Beller
  1 sibling, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2018-09-27 17:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Stefan Beller, git

Jeff King <peff@peff.net> writes:

> If you're interested in pulling documentation out of the header files
> and generating asciidoc from it, I'm happy to at least try keeping it up
> to date. When we started putting this information into header files, we
> used "/**" to start the comment, as a special marker to indicate it was
> worth pulling out. I don't know how well we've maintained that
> convention, but it's a starting point.

I noticed some people add these extra asterisk at the beginning of
comment, but I do not recall that we declared it is a convention we
adopt, so I'd rather be surprised if we've "maintained" it.

Please have it in CodingGuidelines or somewhere once this thread
settles and we decide to keep that convention (I have no problem
with the convention; it is just I personally didn't think it was
worth doing myself at least until now that we might have found
somebody who wants to make use of the markings).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: On shipping more of our technical docs as manpages
  2018-09-27 17:36                       ` Junio C Hamano
@ 2018-09-27 18:25                         ` Jeff King
  2018-09-27 21:27                         ` [PATCH] Documentation/CodingGuidelines: How to document new APIs Stefan Beller
  1 sibling, 0 replies; 51+ messages in thread
From: Jeff King @ 2018-09-27 18:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Stefan Beller, git

On Thu, Sep 27, 2018 at 10:36:11AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > If you're interested in pulling documentation out of the header files
> > and generating asciidoc from it, I'm happy to at least try keeping it up
> > to date. When we started putting this information into header files, we
> > used "/**" to start the comment, as a special marker to indicate it was
> > worth pulling out. I don't know how well we've maintained that
> > convention, but it's a starting point.
> 
> I noticed some people add these extra asterisk at the beginning of
> comment, but I do not recall that we declared it is a convention we
> adopt, so I'd rather be surprised if we've "maintained" it.

True. _I_ declared it as a convention and used it when I migrated some
of the initial api-* documents, but I don't know if anybody else
followed that lead.

FWIW, it is not my own convention, but one used by other tools like
doxygen (which in turn got it from javadoc, I think).

> Please have it in CodingGuidelines or somewhere once this thread
> settles and we decide to keep that convention (I have no problem
> with the convention; it is just I personally didn't think it was
> worth doing myself at least until now that we might have found
> somebody who wants to make use of the markings).

Yeah, this is basically why I hadn't pushed harder for it. If nobody is
actually extracting based on the convention, there is not much point. So
I was waiting for somebody to show up with an interest in using an
extraction tool.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-27 17:36                       ` Junio C Hamano
  2018-09-27 18:25                         ` Jeff King
@ 2018-09-27 21:27                         ` Stefan Beller
  2018-09-27 21:43                           ` Junio C Hamano
  2018-09-27 23:27                           ` Jonathan Nieder
  1 sibling, 2 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-27 21:27 UTC (permalink / raw)
  To: gitster; +Cc: avarab, git, peff, sbeller

There are different opinions on how to document an API properly.
Discussion turns out nobody disagrees with documenting new APIs on the
function level in the header file and high level concepts in
Documentation/technical, so let's state that in the guidelines.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

 This is how I would approach the documentation patch.
 
 Thanks,
 Stefan
 
 Documentation/CodingGuidelines | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
 
diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 48aa4edfbd..15bfb8bbb8 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -358,7 +358,10 @@ For C programs:
    string_list for sorted string lists, a hash map (mapping struct
    objects) named "struct decorate", amongst other things.
 
- - When you come up with an API, document it.
+ - When you come up with an API, document the functions in the header
+   and highlevel concepts in Documentation/technical/. Note that this
+   directory still contains function level documentation as historically
+   everything was documented there.
 
  - The first #include in C files, except in platform specific compat/
    implementations, must be either "git-compat-util.h", "cache.h" or
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-27 21:27                         ` [PATCH] Documentation/CodingGuidelines: How to document new APIs Stefan Beller
@ 2018-09-27 21:43                           ` Junio C Hamano
  2018-09-27 22:21                             ` Ævar Arnfjörð Bjarmason
  2018-09-27 23:27                           ` Jonathan Nieder
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2018-09-27 21:43 UTC (permalink / raw)
  To: Stefan Beller; +Cc: avarab, git, peff

Stefan Beller <sbeller@google.com> writes:

> There are different opinions on how to document an API properly.
> Discussion turns out nobody disagrees with documenting new APIs on the
> function level in the header file and high level concepts in

Looks conditionally good ;-) Thanks for keeping the ball rolling.

I didn't get an impression that "next to implementation" vs "in
header to force people write clearly" was totally settled.  I'd wait
until Ævar says something on this ;-)


> Documentation/technical, so let's state that in the guidelines.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>
>  This is how I would approach the documentation patch.
>  
>  Thanks,
>  Stefan
>  
>  Documentation/CodingGuidelines | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>  
> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 48aa4edfbd..15bfb8bbb8 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -358,7 +358,10 @@ For C programs:
>     string_list for sorted string lists, a hash map (mapping struct
>     objects) named "struct decorate", amongst other things.
>  
> - - When you come up with an API, document it.
> + - When you come up with an API, document the functions in the header
> +   and highlevel concepts in Documentation/technical/. Note that this
> +   directory still contains function level documentation as historically
> +   everything was documented there.
>  
>   - The first #include in C files, except in platform specific compat/
>     implementations, must be either "git-compat-util.h", "cache.h" or

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-27 21:43                           ` Junio C Hamano
@ 2018-09-27 22:21                             ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-09-27 22:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, peff


On Thu, Sep 27 2018, Junio C Hamano wrote:

> Stefan Beller <sbeller@google.com> writes:
>
>> There are different opinions on how to document an API properly.
>> Discussion turns out nobody disagrees with documenting new APIs on the
>> function level in the header file and high level concepts in
>
> Looks conditionally good ;-) Thanks for keeping the ball rolling.
>
> I didn't get an impression that "next to implementation" vs "in
> header to force people write clearly" was totally settled.  I'd wait
> until Ævar says something on this ;-)

I think this patch is good and should go in. Having it be consistent is
a good thing, and we're drifting more in the *.h direction than *.txt.

The "next to implementation" suggestion was in the context of what the
perl project does, to get good use out of that we'd a) need to move all
the *.h docs and *.txt to *.c b) have something to slurp up those docs
so they're formatted. I'm not about to submit any patches for that.

>> Documentation/technical, so let's state that in the guidelines.
>>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>>
>>  This is how I would approach the documentation patch.
>>
>>  Thanks,
>>  Stefan
>>
>>  Documentation/CodingGuidelines | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
>> index 48aa4edfbd..15bfb8bbb8 100644
>> --- a/Documentation/CodingGuidelines
>> +++ b/Documentation/CodingGuidelines
>> @@ -358,7 +358,10 @@ For C programs:
>>     string_list for sorted string lists, a hash map (mapping struct
>>     objects) named "struct decorate", amongst other things.
>>
>> - - When you come up with an API, document it.
>> + - When you come up with an API, document the functions in the header
>> +   and highlevel concepts in Documentation/technical/. Note that this
>> +   directory still contains function level documentation as historically
>> +   everything was documented there.
>>
>>   - The first #include in C files, except in platform specific compat/
>>     implementations, must be either "git-compat-util.h", "cache.h" or

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-27 21:27                         ` [PATCH] Documentation/CodingGuidelines: How to document new APIs Stefan Beller
  2018-09-27 21:43                           ` Junio C Hamano
@ 2018-09-27 23:27                           ` Jonathan Nieder
  2018-09-28  1:11                             ` Jeff King
  1 sibling, 1 reply; 51+ messages in thread
From: Jonathan Nieder @ 2018-09-27 23:27 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, avarab, git, peff

Hi,

Stefan Beller wrote:

> There are different opinions on how to document an API properly.
> Discussion turns out nobody disagrees with documenting new APIs on the
> function level in the header file and high level concepts in
> Documentation/technical, so let's state that in the guidelines.

I disagree with this.  I think we should put all the API docs in the
header file, like we're already doing in e.g. strbuf.h.

Do you have a link to where in the discussion this split-docs strategy
was decided?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-27 23:27                           ` Jonathan Nieder
@ 2018-09-28  1:11                             ` Jeff King
  2018-09-28 16:50                               ` Junio C Hamano
  2018-09-28 17:14                               ` Stefan Beller
  0 siblings, 2 replies; 51+ messages in thread
From: Jeff King @ 2018-09-28  1:11 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Stefan Beller, gitster, avarab, git

On Thu, Sep 27, 2018 at 04:27:32PM -0700, Jonathan Nieder wrote:

> > There are different opinions on how to document an API properly.
> > Discussion turns out nobody disagrees with documenting new APIs on the
> > function level in the header file and high level concepts in
> > Documentation/technical, so let's state that in the guidelines.
> 
> I disagree with this.  I think we should put all the API docs in the
> header file, like we're already doing in e.g. strbuf.h.

Me too.

I think other high-level concepts that are _not_ APIs (e.g., file
formats, protocol, etc) could go into technical/.

(Though actually, those are the thing that I would not mind at all if
they get formatted into real manpages and shipped to end users. We do
not expect most users to dissect our file-formats, but they could at
least be useful to somebody poking around).

> Do you have a link to where in the discussion this split-docs strategy
> was decided?

I think the purpose of this patch is to spur people towards a decision.
:)

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-28  1:11                             ` Jeff King
@ 2018-09-28 16:50                               ` Junio C Hamano
  2018-09-28 17:30                                 ` [PATCH] strbuf.h: format according to coding guidelines Stefan Beller
                                                   ` (2 more replies)
  2018-09-28 17:14                               ` Stefan Beller
  1 sibling, 3 replies; 51+ messages in thread
From: Junio C Hamano @ 2018-09-28 16:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, Stefan Beller, avarab, git

Jeff King <peff@peff.net> writes:

> On Thu, Sep 27, 2018 at 04:27:32PM -0700, Jonathan Nieder wrote:
>
>> > There are different opinions on how to document an API properly.
>> > Discussion turns out nobody disagrees with documenting new APIs on the
>> > function level in the header file and high level concepts in
>> > Documentation/technical, so let's state that in the guidelines.
>> 
>> I disagree with this.  I think we should put all the API docs in the
>> header file, like we're already doing in e.g. strbuf.h.
>
> Me too.
>
> I think other high-level concepts that are _not_ APIs (e.g., file
> formats, protocol, etc) could go into technical/.
>
> (Though actually, those are the thing that I would not mind at all if
> they get formatted into real manpages and shipped to end users. We do
> not expect most users to dissect our file-formats, but they could at
> least be useful to somebody poking around).
>
>> Do you have a link to where in the discussion this split-docs strategy
>> was decided?
>
> I think the purpose of this patch is to spur people towards a decision.
> :)

OK.

-- >8 --
Subject: CodingGuidelines: document the API in *.h files

It makes it harder to let the API description and the reality drift
apart if the doc is kept close to the implementation or the header
of the API.  We have been slowly migrating API docs out of the
Documentation/technical/api-* to *.h files, and the development
community generally considers that how inline docs in strbuf.h is
done the best current practice.

We recommend documenting in the header over documenting near the
implementation to encourage people to write the docs that are
readable without peeking at the implemention.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/CodingGuidelines | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 6d265327c9..e87090c849 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -385,7 +385,11 @@ For C programs:
    string_list for sorted string lists, a hash map (mapping struct
    objects) named "struct decorate", amongst other things.
 
- - When you come up with an API, document it.
+ - When you come up with an API, document it the functions and the
+   structures in the header file that exposes the API to its callers.
+   Use what is in "strbuf.h" as a model to decide the appropriate tone
+   in which the description is given, and the level of details to put
+   in the description.
 
  - The first #include in C files, except in platform specific compat/
    implementations, must be either "git-compat-util.h", "cache.h" or

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-28  1:11                             ` Jeff King
  2018-09-28 16:50                               ` Junio C Hamano
@ 2018-09-28 17:14                               ` Stefan Beller
  2018-09-29  7:41                                 ` Jeff King
  1 sibling, 1 reply; 51+ messages in thread
From: Stefan Beller @ 2018-09-28 17:14 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Nieder, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, git

On Thu, Sep 27, 2018 at 6:11 PM Jeff King <peff@peff.net> wrote:
>
> On Thu, Sep 27, 2018 at 04:27:32PM -0700, Jonathan Nieder wrote:
>
> > > There are different opinions on how to document an API properly.
> > > Discussion turns out nobody disagrees with documenting new APIs on the
> > > function level in the header file and high level concepts in
> > > Documentation/technical, so let's state that in the guidelines.
> >
> > I disagree with this.  I think we should put all the API docs in the
> > header file, like we're already doing in e.g. strbuf.h.
>
> Me too.
>
> I think other high-level concepts that are _not_ APIs (e.g., file
> formats, protocol, etc) could go into technical/.

That is what I meant with high level concepts. Anything that talks
about or implies the existence of a function is clearly header level.

> (Though actually, those are the thing that I would not mind at all if
> they get formatted into real manpages and shipped to end users. We do
> not expect most users to dissect our file-formats, but they could at
> least be useful to somebody poking around).

Formats are sensible thing to present to the end user. I was also thinking
about partial-clone, which is a concept rather than a format.

>
> > Do you have a link to where in the discussion this split-docs strategy
> > was decided?
>
> I think the purpose of this patch is to spur people towards a decision.
> :)

For me it might end up in an exercise of writing clear and concise English.
Though after reading Junios patch below, I would hope that may find
consensus.

Stefan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] strbuf.h: format according to coding guidelines
  2018-09-28 16:50                               ` Junio C Hamano
@ 2018-09-28 17:30                                 ` Stefan Beller
  2018-09-28 19:54                                   ` Junio C Hamano
  2018-09-28 21:42                                   ` Junio C Hamano
  2018-09-28 19:47                                 ` [PATCH] Documentation/CodingGuidelines: How to document new APIs Martin Ågren
  2018-09-29  7:46                                 ` Jeff King
  2 siblings, 2 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-28 17:30 UTC (permalink / raw)
  To: gitster; +Cc: avarab, git, jrnieder, peff, sbeller

The previous patch suggested the strbuf header to be the leading example
of how we would want our APIs to be documented. This may lead to some
scrutiny of that code and the coding style (which is different from the
API documentation style) and hence might be taken as an example on how
to format code as well.

So let's format strbuf.h in a way that we'd like to see:
* omit the extern keyword from function declarations
* name all parameters (usually the parameters are obvious from its type,
  but consider exceptions like
  `int strbuf_getwholeline_fd(struct strbuf *, int, int);`
* break overly long lines

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Maybe this on top of Junios guideline patch?

Thanks,
Stefan

 strbuf.h | 148 ++++++++++++++++++++++++++++++-------------------------
 1 file changed, 81 insertions(+), 67 deletions(-)

diff --git a/strbuf.h b/strbuf.h
index 60a35aef16..bf18fddb5b 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -87,7 +87,7 @@ struct object_id;
  * Initialize the structure. The second parameter can be zero or a bigger
  * number to allocate memory, in case you want to prevent further reallocs.
  */
-extern void strbuf_init(struct strbuf *, size_t);
+void strbuf_init(struct strbuf *sb, size_t alloc);
 
 /**
  * Release a string buffer and the memory it used. After this call, the
@@ -97,7 +97,7 @@ extern void strbuf_init(struct strbuf *, size_t);
  * To clear a strbuf in preparation for further use without the overhead
  * of free()ing and malloc()ing again, use strbuf_reset() instead.
  */
-extern void strbuf_release(struct strbuf *);
+void strbuf_release(struct strbuf *sb);
 
 /**
  * Detach the string from the strbuf and returns it; you now own the
@@ -107,7 +107,7 @@ extern void strbuf_release(struct strbuf *);
  * The strbuf that previously held the string is reset to `STRBUF_INIT` so
  * it can be reused after calling this function.
  */
-extern char *strbuf_detach(struct strbuf *, size_t *);
+char *strbuf_detach(struct strbuf *sb, size_t *sz);
 
 /**
  * Attach a string to a buffer. You should specify the string to attach,
@@ -117,7 +117,7 @@ extern char *strbuf_detach(struct strbuf *, size_t *);
  * malloc()ed, and after attaching, the pointer cannot be relied upon
  * anymore, and neither be free()d directly.
  */
-extern void strbuf_attach(struct strbuf *, void *, size_t, size_t);
+void strbuf_attach(struct strbuf *sb, void *str, size_t len, size_t mem);
 
 /**
  * Swap the contents of two string buffers.
@@ -148,7 +148,7 @@ static inline size_t strbuf_avail(const struct strbuf *sb)
  * This is never a needed operation, but can be critical for performance in
  * some cases.
  */
-extern void strbuf_grow(struct strbuf *, size_t);
+void strbuf_grow(struct strbuf *sb, size_t amount);
 
 /**
  * Set the length of the buffer to a given value. This function does *not*
@@ -183,30 +183,30 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len)
  * Strip whitespace from the beginning (`ltrim`), end (`rtrim`), or both side
  * (`trim`) of a string.
  */
-extern void strbuf_trim(struct strbuf *);
-extern void strbuf_rtrim(struct strbuf *);
-extern void strbuf_ltrim(struct strbuf *);
+void strbuf_trim(struct strbuf *sb);
+void strbuf_rtrim(struct strbuf *sb);
+void strbuf_ltrim(struct strbuf *sb);
 
 /* Strip trailing directory separators */
-extern void strbuf_trim_trailing_dir_sep(struct strbuf *);
+void strbuf_trim_trailing_dir_sep(struct strbuf *sb);
 
 /**
  * Replace the contents of the strbuf with a reencoded form.  Returns -1
  * on error, 0 on success.
  */
-extern int strbuf_reencode(struct strbuf *sb, const char *from, const char *to);
+int strbuf_reencode(struct strbuf *sb, const char *from, const char *to);
 
 /**
  * Lowercase each character in the buffer using `tolower`.
  */
-extern void strbuf_tolower(struct strbuf *sb);
+void strbuf_tolower(struct strbuf *sb);
 
 /**
  * Compare two buffers. Returns an integer less than, equal to, or greater
  * than zero if the first buffer is found, respectively, to be less than,
  * to match, or be greater than the second buffer.
  */
-extern int strbuf_cmp(const struct strbuf *, const struct strbuf *);
+int strbuf_cmp(const struct strbuf *first, const struct strbuf *second);
 
 
 /**
@@ -233,37 +233,38 @@ static inline void strbuf_addch(struct strbuf *sb, int c)
 /**
  * Add a character the specified number of times to the buffer.
  */
-extern void strbuf_addchars(struct strbuf *sb, int c, size_t n);
+void strbuf_addchars(struct strbuf *sb, int c, size_t n);
 
 /**
  * Insert data to the given position of the buffer. The remaining contents
  * will be shifted, not overwritten.
  */
-extern void strbuf_insert(struct strbuf *, size_t pos, const void *, size_t);
+void strbuf_insert(struct strbuf *sb, size_t pos, const void *, size_t);
 
 /**
  * Remove given amount of data from a given position of the buffer.
  */
-extern void strbuf_remove(struct strbuf *, size_t pos, size_t len);
+void strbuf_remove(struct strbuf *sb, size_t pos, size_t len);
 
 /**
  * Remove the bytes between `pos..pos+len` and replace it with the given
  * data.
  */
-extern void strbuf_splice(struct strbuf *, size_t pos, size_t len,
-			  const void *, size_t);
+void strbuf_splice(struct strbuf *sb, size_t pos, size_t len,
+		   const void *data, size_t data_len);
 
 /**
  * Add a NUL-terminated string to the buffer. Each line will be prepended
  * by a comment character and a blank.
  */
-extern void strbuf_add_commented_lines(struct strbuf *out, const char *buf, size_t size);
+void strbuf_add_commented_lines(struct strbuf *out,
+				const char *buf, size_t size);
 
 
 /**
  * Add data of given length to the buffer.
  */
-extern void strbuf_add(struct strbuf *, const void *, size_t);
+void strbuf_add(struct strbuf *sb, const void *data, size_t len);
 
 /**
  * Add a NUL-terminated string to the buffer.
@@ -282,7 +283,7 @@ static inline void strbuf_addstr(struct strbuf *sb, const char *s)
 /**
  * Copy the contents of another buffer at the end of the current one.
  */
-extern void strbuf_addbuf(struct strbuf *sb, const struct strbuf *sb2);
+void strbuf_addbuf(struct strbuf *sb, const struct strbuf *sb2);
 
 /**
  * This function can be used to expand a format string containing
@@ -308,8 +309,13 @@ extern void strbuf_addbuf(struct strbuf *sb, const struct strbuf *sb2);
  * parameters to the callback, `strbuf_expand()` passes a context pointer,
  * which can be used by the programmer of the callback as she sees fit.
  */
-typedef size_t (*expand_fn_t) (struct strbuf *sb, const char *placeholder, void *context);
-extern void strbuf_expand(struct strbuf *sb, const char *format, expand_fn_t fn, void *context);
+typedef size_t (*expand_fn_t) (struct strbuf *sb,
+			       const char *placeholder,
+			       void *context);
+void strbuf_expand(struct strbuf *sb,
+		   const char *format,
+		   expand_fn_t fn,
+		   void *context);
 
 /**
  * Used as callback for `strbuf_expand()`, expects an array of
@@ -321,7 +327,9 @@ struct strbuf_expand_dict_entry {
 	const char *placeholder;
 	const char *value;
 };
-extern size_t strbuf_expand_dict_cb(struct strbuf *sb, const char *placeholder, void *context);
+size_t strbuf_expand_dict_cb(struct strbuf *sb,
+			     const char *placeholder,
+			     void *context);
 
 /**
  * Append the contents of one strbuf to another, quoting any
@@ -329,29 +337,29 @@ extern size_t strbuf_expand_dict_cb(struct strbuf *sb, const char *placeholder,
  * destination. This is useful for literal data to be fed to either
  * strbuf_expand or to the *printf family of functions.
  */
-extern void strbuf_addbuf_percentquote(struct strbuf *dst, const struct strbuf *src);
+void strbuf_addbuf_percentquote(struct strbuf *dst, const struct strbuf *src);
 
 /**
  * Append the given byte size as a human-readable string (i.e. 12.23 KiB,
  * 3.50 MiB).
  */
-extern void strbuf_humanise_bytes(struct strbuf *buf, off_t bytes);
+void strbuf_humanise_bytes(struct strbuf *buf, off_t bytes);
 
 /**
  * Add a formatted string to the buffer.
  */
 __attribute__((format (printf,2,3)))
-extern void strbuf_addf(struct strbuf *sb, const char *fmt, ...);
+void strbuf_addf(struct strbuf *sb, const char *fmt, ...);
 
 /**
  * Add a formatted string prepended by a comment character and a
  * blank to the buffer.
  */
 __attribute__((format (printf, 2, 3)))
-extern void strbuf_commented_addf(struct strbuf *sb, const char *fmt, ...);
+void strbuf_commented_addf(struct strbuf *sb, const char *fmt, ...);
 
 __attribute__((format (printf,2,0)))
-extern void strbuf_vaddf(struct strbuf *sb, const char *fmt, va_list ap);
+void strbuf_vaddf(struct strbuf *sb, const char *fmt, va_list ap);
 
 /**
  * Add the time specified by `tm`, as formatted by `strftime`.
@@ -361,9 +369,9 @@ extern void strbuf_vaddf(struct strbuf *sb, const char *fmt, va_list ap);
  * `suppress_tz_name`, when set, expands %Z internally to the empty
  * string rather than passing it to `strftime`.
  */
-extern void strbuf_addftime(struct strbuf *sb, const char *fmt,
-			    const struct tm *tm, int tz_offset,
-			    int suppress_tz_name);
+void strbuf_addftime(struct strbuf *sb, const char *fmt,
+		    const struct tm *tm, int tz_offset,
+		    int suppress_tz_name);
 
 /**
  * Read a given size of data from a FILE* pointer to the buffer.
@@ -373,14 +381,14 @@ extern void strbuf_addftime(struct strbuf *sb, const char *fmt,
  * `strbuf_read()`, `strbuf_read_file()` and `strbuf_getline_*()`
  * family of functions have the same behaviour as well.
  */
-extern size_t strbuf_fread(struct strbuf *, size_t, FILE *);
+size_t strbuf_fread(struct strbuf *sb, size_t size, FILE *file);
 
 /**
  * Read the contents of a given file descriptor. The third argument can be
  * used to give a hint about the file size, to avoid reallocs.  If read fails,
  * any partial read is undone.
  */
-extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
+ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint);
 
 /**
  * Read the contents of a given file descriptor partially by using only one
@@ -388,7 +396,7 @@ extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
  * file size, to avoid reallocs. Returns the number of new bytes appended to
  * the sb.
  */
-extern ssize_t strbuf_read_once(struct strbuf *, int fd, size_t hint);
+ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint);
 
 /**
  * Read the contents of a file, specified by its path. The third argument
@@ -396,19 +404,19 @@ extern ssize_t strbuf_read_once(struct strbuf *, int fd, size_t hint);
  * Return the number of bytes read or a negative value if some error
  * occurred while opening or reading the file.
  */
-extern ssize_t strbuf_read_file(struct strbuf *sb, const char *path, size_t hint);
+ssize_t strbuf_read_file(struct strbuf *sb, const char *path, size_t hint);
 
 /**
  * Read the target of a symbolic link, specified by its path.  The third
  * argument can be used to give a hint about the size, to avoid reallocs.
  */
-extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
+int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
 
 /**
  * Write the whole content of the strbuf to the stream not stopping at
  * NUL bytes.
  */
-extern ssize_t strbuf_write(struct strbuf *sb, FILE *stream);
+ssize_t strbuf_write(struct strbuf *sb, FILE *stream);
 
 /**
  * Read a line from a FILE *, overwriting the existing contents of
@@ -422,10 +430,10 @@ extern ssize_t strbuf_write(struct strbuf *sb, FILE *stream);
 typedef int (*strbuf_getline_fn)(struct strbuf *, FILE *);
 
 /* Uses LF as the line terminator */
-extern int strbuf_getline_lf(struct strbuf *sb, FILE *fp);
+int strbuf_getline_lf(struct strbuf *sb, FILE *fp);
 
 /* Uses NUL as the line terminator */
-extern int strbuf_getline_nul(struct strbuf *sb, FILE *fp);
+int strbuf_getline_nul(struct strbuf *sb, FILE *fp);
 
 /*
  * Similar to strbuf_getline_lf(), but additionally treats a CR that
@@ -434,14 +442,14 @@ extern int strbuf_getline_nul(struct strbuf *sb, FILE *fp);
  * that can come from platforms whose native text format is CRLF
  * terminated.
  */
-extern int strbuf_getline(struct strbuf *, FILE *);
+int strbuf_getline(struct strbuf *sb, FILE *file);
 
 
 /**
  * Like `strbuf_getline`, but keeps the trailing terminator (if
  * any) in the buffer.
  */
-extern int strbuf_getwholeline(struct strbuf *, FILE *, int);
+int strbuf_getwholeline(struct strbuf *sb, FILE *file, int term);
 
 /**
  * Like `strbuf_getwholeline`, but operates on a file descriptor.
@@ -449,19 +457,19 @@ extern int strbuf_getwholeline(struct strbuf *, FILE *, int);
  * use it unless you need the correct position in the file
  * descriptor.
  */
-extern int strbuf_getwholeline_fd(struct strbuf *, int, int);
+int strbuf_getwholeline_fd(struct strbuf *sb, int fd, int term);
 
 /**
  * Set the buffer to the path of the current working directory.
  */
-extern int strbuf_getcwd(struct strbuf *sb);
+int strbuf_getcwd(struct strbuf *sb);
 
 /**
  * Add a path to a buffer, converting a relative path to an
  * absolute one in the process.  Symbolic links are not
  * resolved.
  */
-extern void strbuf_add_absolute_path(struct strbuf *sb, const char *path);
+void strbuf_add_absolute_path(struct strbuf *sb, const char *path);
 
 /**
  * Canonize `path` (make it absolute, resolve symlinks, remove extra
@@ -475,7 +483,7 @@ extern void strbuf_add_absolute_path(struct strbuf *sb, const char *path);
  * Callers that don't mind links should use the more lightweight
  * strbuf_add_absolute_path() instead.
  */
-extern void strbuf_add_real_path(struct strbuf *sb, const char *path);
+void strbuf_add_real_path(struct strbuf *sb, const char *path);
 
 
 /**
@@ -483,13 +491,13 @@ extern void strbuf_add_real_path(struct strbuf *sb, const char *path);
  * normalize_path_copy() for details. If an error occurs, the contents of "sb"
  * are left untouched, and -1 is returned.
  */
-extern int strbuf_normalize_path(struct strbuf *sb);
+int strbuf_normalize_path(struct strbuf *sb);
 
 /**
  * Strip whitespace from a buffer. The second parameter controls if
  * comments are considered contents to be removed or not.
  */
-extern void strbuf_stripspace(struct strbuf *buf, int skip_comments);
+void strbuf_stripspace(struct strbuf *buf, int skip_comments);
 
 static inline int strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
 {
@@ -518,8 +526,8 @@ static inline int strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
  * For lighter-weight alternatives, see string_list_split() and
  * string_list_split_in_place().
  */
-extern struct strbuf **strbuf_split_buf(const char *, size_t,
-					int terminator, int max);
+struct strbuf **strbuf_split_buf(const char *str, size_t len,
+				 int terminator, int max);
 
 static inline struct strbuf **strbuf_split_str(const char *str,
 					       int terminator, int max)
@@ -528,7 +536,7 @@ static inline struct strbuf **strbuf_split_str(const char *str,
 }
 
 static inline struct strbuf **strbuf_split_max(const struct strbuf *sb,
-						int terminator, int max)
+					       int terminator, int max)
 {
 	return strbuf_split_buf(sb->buf, sb->len, terminator, max);
 }
@@ -549,23 +557,23 @@ static inline struct strbuf **strbuf_split(const struct strbuf *sb,
  *   'element1, element2, ..., elementN'
  * to str.  If only one element, just write "element1" to str.
  */
-extern void strbuf_add_separated_string_list(struct strbuf *str,
-					     const char *sep,
-					     struct string_list *slist);
+void strbuf_add_separated_string_list(struct strbuf *str,
+				      const char *sep,
+				      struct string_list *slist);
 
 /**
  * Free a NULL-terminated list of strbufs (for example, the return
  * values of the strbuf_split*() functions).
  */
-extern void strbuf_list_free(struct strbuf **);
+void strbuf_list_free(struct strbuf **list);
 
 /**
  * Add the abbreviation, as generated by find_unique_abbrev, of `sha1` to
  * the strbuf `sb`.
  */
-extern void strbuf_add_unique_abbrev(struct strbuf *sb,
-				     const struct object_id *oid,
-				     int abbrev_len);
+void strbuf_add_unique_abbrev(struct strbuf *sb,
+			      const struct object_id *oid,
+			      int abbrev_len);
 
 /**
  * Launch the user preferred editor to edit a file and fill the buffer
@@ -574,15 +582,21 @@ extern void strbuf_add_unique_abbrev(struct strbuf *sb,
  * run in. If the buffer is NULL the editor is launched as usual but the
  * file's contents are not read into the buffer upon completion.
  */
-extern int launch_editor(const char *path, struct strbuf *buffer, const char *const *env);
+int launch_editor(const char *path,
+		  struct strbuf *buffer,
+		  const char *const *env);
 
-extern void strbuf_add_lines(struct strbuf *sb, const char *prefix, const char *buf, size_t size);
+void strbuf_add_lines(struct strbuf *sb,
+		      const char *prefix,
+		      const char *buf,
+		      size_t size);
 
 /**
  * Append s to sb, with the characters '<', '>', '&' and '"' converted
  * into XML entities.
  */
-extern void strbuf_addstr_xml_quoted(struct strbuf *sb, const char *s);
+void strbuf_addstr_xml_quoted(struct strbuf *sb,
+			      const char *s);
 
 /**
  * "Complete" the contents of `sb` by ensuring that either it ends with the
@@ -612,8 +626,8 @@ static inline void strbuf_complete_line(struct strbuf *sb)
  * If "allowed" is non-zero, restrict the set of allowed expansions. See
  * interpret_branch_name() for details.
  */
-extern void strbuf_branchname(struct strbuf *sb, const char *name,
-			      unsigned allowed);
+void strbuf_branchname(struct strbuf *sb, const char *name,
+		       unsigned allowed);
 
 /*
  * Like strbuf_branchname() above, but confirm that the result is
@@ -621,15 +635,15 @@ extern void strbuf_branchname(struct strbuf *sb, const char *name,
  *
  * The return value is "0" if the result is valid, and "-1" otherwise.
  */
-extern int strbuf_check_branch_ref(struct strbuf *sb, const char *name);
+int strbuf_check_branch_ref(struct strbuf *sb, const char *name);
 
-extern void strbuf_addstr_urlencode(struct strbuf *, const char *,
-				    int reserved);
+void strbuf_addstr_urlencode(struct strbuf *sb, const char *name,
+			     int reserved);
 
 __attribute__((format (printf,1,2)))
-extern int printf_ln(const char *fmt, ...);
+int printf_ln(const char *fmt, ...);
 __attribute__((format (printf,2,3)))
-extern int fprintf_ln(FILE *fp, const char *fmt, ...);
+int fprintf_ln(FILE *fp, const char *fmt, ...);
 
 char *xstrdup_tolower(const char *);
 char *xstrdup_toupper(const char *);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-28 16:50                               ` Junio C Hamano
  2018-09-28 17:30                                 ` [PATCH] strbuf.h: format according to coding guidelines Stefan Beller
@ 2018-09-28 19:47                                 ` Martin Ågren
  2018-09-29  7:46                                 ` Jeff King
  2 siblings, 0 replies; 51+ messages in thread
From: Martin Ågren @ 2018-09-28 19:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Jonathan Nieder, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Git Mailing List

On Fri, 28 Sep 2018 at 18:51, Junio C Hamano <gitster@pobox.com> wrote:
> We recommend documenting in the header over documenting near the
> implementation to encourage people to write the docs that are
> readable without peeking at the implemention.

s/implemention/implementation/

> - - When you come up with an API, document it.
> + - When you come up with an API, document it the functions and the

s/it the/the/

> +   structures in the header file that exposes the API to its callers.
> +   Use what is in "strbuf.h" as a model to decide the appropriate tone
> +   in which the description is given, and the level of details to put
> +   in the description.

Martin

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] strbuf.h: format according to coding guidelines
  2018-09-28 17:30                                 ` [PATCH] strbuf.h: format according to coding guidelines Stefan Beller
@ 2018-09-28 19:54                                   ` Junio C Hamano
  2018-09-28 20:11                                     ` Junio C Hamano
  2018-09-28 21:42                                   ` Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2018-09-28 19:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: avarab, git, jrnieder, peff

Stefan Beller <sbeller@google.com> writes:

> So let's format strbuf.h in a way that we'd like to see:
> * omit the extern keyword from function declarations

OK

> * name all parameters (usually the parameters are obvious from its type,
>   but consider exceptions like
>   `int strbuf_getwholeline_fd(struct strbuf *, int, int);`

I am mildly against giving names to obvious ones.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] strbuf.h: format according to coding guidelines
  2018-09-28 19:54                                   ` Junio C Hamano
@ 2018-09-28 20:11                                     ` Junio C Hamano
  2018-09-28 21:38                                       ` Junio C Hamano
  2018-09-29  7:38                                       ` Jeff King
  0 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2018-09-28 20:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: avarab, git, jrnieder, peff

Junio C Hamano <gitster@pobox.com> writes:

> Stefan Beller <sbeller@google.com> writes:
>
>> So let's format strbuf.h in a way that we'd like to see:
>> * omit the extern keyword from function declarations
>
> OK
>
>> * name all parameters (usually the parameters are obvious from its type,
>>   but consider exceptions like
>>   `int strbuf_getwholeline_fd(struct strbuf *, int, int);`
>
> I am mildly against giving names to obvious ones.

Just to make sure nobody listening from sidelines do not
misunderstand, ones like getwholeline_fd() that takes more than one
parameter with the same time is a prime example of a function that
take non-obvious parameters that MUST be named.  

What I am mildly against is the rule that says "name ALL
parameters".  I tend to think these names make headers harder to
read, and prefer to keep them to the minimum.

I actually do not mind the rule to be more like

 * Use the same parameter names used in the function declaration when
   the description in the API documentation refers the parameter.

That _forces_ people to write

	/**
	 * Read a whole line up to the terminating character 
	 * TERM (typically LF or NUL) from the file descriptor FD
	 * and replace the contents of the strbuf SB with the
	 * result ...
	 */
	int strbuf_getwholeline_fd(struct strbuf *sb, int fd, int term);

which is mostly equivalent to having a rule to always name all
parameters, while still allowing "sb" to be omitted by rephrasing
"the contents of the given strbuf with the result ..." and I
consider that a good flexibility to have.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] strbuf.h: format according to coding guidelines
  2018-09-28 20:11                                     ` Junio C Hamano
@ 2018-09-28 21:38                                       ` Junio C Hamano
  2018-09-29  7:38                                       ` Jeff King
  1 sibling, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2018-09-28 21:38 UTC (permalink / raw)
  To: Stefan Beller; +Cc: avarab, git, jrnieder, peff

Junio C Hamano <gitster@pobox.com> writes:

> I actually do not mind the rule to be more like
>
>  * Use the same parameter names used in the function declaration when
>    the description in the API documentation refers the parameter.

Assuming that we adopt the above guideline, let's extending it to
the original patch's review.

The following is a good example.  FIRST and SECOND would have been
upcased if this followed my earlier illustration to make them stand
out as references to the parameters, but it is already readable
without upcasing _and_ naming parameters is helping here.

 /**
  * Compare two buffers. Returns an integer less than, equal to, or greater
  * than zero if the first buffer is found, respectively, to be less than,
  * to match, or be greater than the second buffer.
  */
-extern int strbuf_cmp(const struct strbuf *, const struct strbuf *);
+int strbuf_cmp(const struct strbuf *first, const struct strbuf *second);

The next one could be improved and say something like: "Remove LEN
bytes in the strbuf SB starting at offset POS", as it already had
'pos' and 'len' that are readily usable.  Notice that "Remove LEN
bytes starting at offset POS" is a sufficiently clear description
and that is why I do not think we should require all parameters to
be named.

 /**
  * Remove given amount of data from a given position of the buffer.
  */
-extern void strbuf_remove(struct strbuf *, size_t pos, size_t len);
+void strbuf_remove(struct strbuf *sb, size_t pos, size_t len);

The last example is a job half-done.  The original had pos and len
parameters and referred to them in the text, but just said "with the
given data".  Now we have data and data_len, "the given data" can
and should be clarified by referring to them.

 /**
  * Remove the bytes between `pos..pos+len` and replace it with the given
  * data.
  */
-extern void strbuf_splice(struct strbuf *, size_t pos, size_t len,
-			  const void *, size_t);
+void strbuf_splice(struct strbuf *sb, size_t pos, size_t len,
+		   const void *data, size_t data_len);

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] strbuf.h: format according to coding guidelines
  2018-09-28 17:30                                 ` [PATCH] strbuf.h: format according to coding guidelines Stefan Beller
  2018-09-28 19:54                                   ` Junio C Hamano
@ 2018-09-28 21:42                                   ` Junio C Hamano
  2018-09-28 21:54                                     ` Stefan Beller
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2018-09-28 21:42 UTC (permalink / raw)
  To: Stefan Beller; +Cc: avarab, git, jrnieder, peff

Stefan Beller <sbeller@google.com> writes:

> The previous patch suggested the strbuf header to be the leading example
> of how we would want our APIs to be documented. This may lead to some
> scrutiny of that code and the coding style (which is different from the
> API documentation style) and hence might be taken as an example on how
> to format code as well.
>
> So let's format strbuf.h in a way that we'd like to see:
> * omit the extern keyword from function declarations
> * name all parameters (usually the parameters are obvious from its type,
>   but consider exceptions like
>   `int strbuf_getwholeline_fd(struct strbuf *, int, int);`
> * break overly long lines
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>
> Maybe this on top of Junios guideline patch?

If we were to do this, I'd rather see us do a very good job on this
file first, with "We are going to use this file as the best current
practice model for an API header file" to begin its log message,
and then recommend its use as the model on top.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] strbuf.h: format according to coding guidelines
  2018-09-28 21:42                                   ` Junio C Hamano
@ 2018-09-28 21:54                                     ` Stefan Beller
  0 siblings, 0 replies; 51+ messages in thread
From: Stefan Beller @ 2018-09-28 21:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, git, Jonathan Nieder,
	Jeff King

On Fri, Sep 28, 2018 at 2:42 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Stefan Beller <sbeller@google.com> writes:
>
> > The previous patch suggested the strbuf header to be the leading example
> > of how we would want our APIs to be documented. This may lead to some
> > scrutiny of that code and the coding style (which is different from the
> > API documentation style) and hence might be taken as an example on how
> > to format code as well.
> >
> > So let's format strbuf.h in a way that we'd like to see:
> > * omit the extern keyword from function declarations
> > * name all parameters (usually the parameters are obvious from its type,
> >   but consider exceptions like
> >   `int strbuf_getwholeline_fd(struct strbuf *, int, int);`
> > * break overly long lines
> >
> > Signed-off-by: Stefan Beller <sbeller@google.com>
> > ---
> >
> > Maybe this on top of Junios guideline patch?
>
> If we were to do this, I'd rather see us do a very good job on this
> file first, with "We are going to use this file as the best current
> practice model for an API header file" to begin its log message,
> and then recommend its use as the model on top.

I started going through that file and undoing "the naming all parameters"
and additionally started to remove any obvious parameter name
(mostly the first argument, that is of struct strbuf, and sometimes
was called sb or out)

But it seems in addition to all this we also want to re-read the
documentation and make sure it reflects the API accurately and
describes it well.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] strbuf.h: format according to coding guidelines
  2018-09-28 20:11                                     ` Junio C Hamano
  2018-09-28 21:38                                       ` Junio C Hamano
@ 2018-09-29  7:38                                       ` Jeff King
  1 sibling, 0 replies; 51+ messages in thread
From: Jeff King @ 2018-09-29  7:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, avarab, git, jrnieder

On Fri, Sep 28, 2018 at 01:11:26PM -0700, Junio C Hamano wrote:

> > I am mildly against giving names to obvious ones.
> 
> Just to make sure nobody listening from sidelines do not
> misunderstand, ones like getwholeline_fd() that takes more than one
> parameter with the same time is a prime example of a function that
> take non-obvious parameters that MUST be named.  
> 
> What I am mildly against is the rule that says "name ALL
> parameters".  I tend to think these names make headers harder to
> read, and prefer to keep them to the minimum.
> 
> I actually do not mind the rule to be more like
> 
>  * Use the same parameter names used in the function declaration when
>    the description in the API documentation refers the parameter.

Yes, I agree very much with that rule (and your genera line of
thinking).

I am not personally against just naming every parameter, but I simply
don't care either way.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-28 17:14                               ` Stefan Beller
@ 2018-09-29  7:41                                 ` Jeff King
  0 siblings, 0 replies; 51+ messages in thread
From: Jeff King @ 2018-09-29  7:41 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Jonathan Nieder, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, git

On Fri, Sep 28, 2018 at 10:14:12AM -0700, Stefan Beller wrote:

> > I think other high-level concepts that are _not_ APIs (e.g., file
> > formats, protocol, etc) could go into technical/.
> 
> That is what I meant with high level concepts. Anything that talks
> about or implies the existence of a function is clearly header level.

Ah, OK. I thought you meant the "this is how strbuf roughly works"
overview, which IMHO should remain in strbuf.h.

I think we are on the same page, then.

> > (Though actually, those are the thing that I would not mind at all if
> > they get formatted into real manpages and shipped to end users. We do
> > not expect most users to dissect our file-formats, but they could at
> > least be useful to somebody poking around).
> 
> Formats are sensible thing to present to the end user. I was also thinking
> about partial-clone, which is a concept rather than a format.

Yeah, definitely. Another good example is technical/api-credentials.
The section on credential helpers there probably ought to be in more
user-visible documentation (probably gitcredentials(7)).

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] Documentation/CodingGuidelines: How to document new APIs
  2018-09-28 16:50                               ` Junio C Hamano
  2018-09-28 17:30                                 ` [PATCH] strbuf.h: format according to coding guidelines Stefan Beller
  2018-09-28 19:47                                 ` [PATCH] Documentation/CodingGuidelines: How to document new APIs Martin Ågren
@ 2018-09-29  7:46                                 ` Jeff King
  2 siblings, 0 replies; 51+ messages in thread
From: Jeff King @ 2018-09-29  7:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Stefan Beller, avarab, git

On Fri, Sep 28, 2018 at 09:50:14AM -0700, Junio C Hamano wrote:

> -- >8 --
> Subject: CodingGuidelines: document the API in *.h files
> 
> It makes it harder to let the API description and the reality drift
> apart if the doc is kept close to the implementation or the header
> of the API.  We have been slowly migrating API docs out of the
> Documentation/technical/api-* to *.h files, and the development
> community generally considers that how inline docs in strbuf.h is
> done the best current practice.
> 
> We recommend documenting in the header over documenting near the
> implementation to encourage people to write the docs that are
> readable without peeking at the implemention.

Yeah, I agree with all of that rationale.

> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 6d265327c9..e87090c849 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -385,7 +385,11 @@ For C programs:
>     string_list for sorted string lists, a hash map (mapping struct
>     objects) named "struct decorate", amongst other things.
>  
> - - When you come up with an API, document it.
> + - When you come up with an API, document it the functions and the
> +   structures in the header file that exposes the API to its callers.
> +   Use what is in "strbuf.h" as a model to decide the appropriate tone
> +   in which the description is given, and the level of details to put
> +   in the description.

I like the general idea here. I had trouble parsing the "in which the
description is given". Maybe just:

  When you come up with an API, document its functions and structures in
  the header file that exposes the API to its callers. Use what is in
  "strbuf.h" as a model for the appropriate tone and level of detail.

I like the idea you mentioned elsewhere of polishing up strbuf.h to
serve as the model (but I don't want to hold up this much simpler patch
if that seems likely to drag on).

Thanks for pushing this towards a concrete conclusion.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2018-09-29  7:46 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-21 22:35 [PATCHv3 0/8] fetch: make sure submodule oids are fetched Stefan Beller
2018-09-21 22:35 ` [PATCH 1/8] sha1-array: provide oid_array_filter Stefan Beller
2018-09-22 12:58   ` Ævar Arnfjörð Bjarmason
2018-09-25 19:26     ` Stefan Beller
2018-09-26  4:15       ` Jeff King
2018-09-26 17:10         ` Junio C Hamano
2018-09-26 17:49           ` Ævar Arnfjörð Bjarmason
2018-09-26 18:27             ` Junio C Hamano
2018-09-26 18:34               ` Jeff King
2018-09-26 18:43               ` Ævar Arnfjörð Bjarmason
2018-09-26 18:58                 ` Jeff King
2018-09-26 19:39                   ` Stefan Beller
2018-09-26 19:49                   ` Junio C Hamano
2018-09-26 19:59                     ` Stefan Beller
2018-09-26 20:19                       ` Junio C Hamano
2018-09-26 20:51                         ` Jeff King
2018-09-26 21:22                         ` Stefan Beller
2018-09-26 20:44                   ` On shipping more of our technical docs as manpages Ævar Arnfjörð Bjarmason
2018-09-26 21:40                     ` Junio C Hamano
2018-09-26 23:21                     ` Stefan Beller
2018-09-27  8:58                       ` Ævar Arnfjörð Bjarmason
2018-09-27  6:20                     ` Jeff King
2018-09-27  6:34                       ` Jonathan Nieder
2018-09-27  6:40                         ` Jeff King
2018-09-27 17:36                       ` Junio C Hamano
2018-09-27 18:25                         ` Jeff King
2018-09-27 21:27                         ` [PATCH] Documentation/CodingGuidelines: How to document new APIs Stefan Beller
2018-09-27 21:43                           ` Junio C Hamano
2018-09-27 22:21                             ` Ævar Arnfjörð Bjarmason
2018-09-27 23:27                           ` Jonathan Nieder
2018-09-28  1:11                             ` Jeff King
2018-09-28 16:50                               ` Junio C Hamano
2018-09-28 17:30                                 ` [PATCH] strbuf.h: format according to coding guidelines Stefan Beller
2018-09-28 19:54                                   ` Junio C Hamano
2018-09-28 20:11                                     ` Junio C Hamano
2018-09-28 21:38                                       ` Junio C Hamano
2018-09-29  7:38                                       ` Jeff King
2018-09-28 21:42                                   ` Junio C Hamano
2018-09-28 21:54                                     ` Stefan Beller
2018-09-28 19:47                                 ` [PATCH] Documentation/CodingGuidelines: How to document new APIs Martin Ågren
2018-09-29  7:46                                 ` Jeff King
2018-09-28 17:14                               ` Stefan Beller
2018-09-29  7:41                                 ` Jeff King
2018-09-27  6:48                   ` [PATCH 1/8] sha1-array: provide oid_array_filter Jeff King
2018-09-21 22:35 ` [PATCH 2/8] submodule.c: fix indentation Stefan Beller
2018-09-21 22:35 ` [PATCH 3/8] submodule.c: sort changed_submodule_names before searching it Stefan Beller
2018-09-21 22:35 ` [PATCH 4/8] submodule: move global changed_submodule_names into fetch submodule struct Stefan Beller
2018-09-21 22:35 ` [PATCH 5/8] submodule.c: do not copy around submodule list Stefan Beller
2018-09-21 22:35 ` [PATCH 6/8] submodule: fetch in submodules git directory instead of in worktree Stefan Beller
2018-09-21 22:35 ` [PATCH 7/8] fetch: retry fetching submodules if needed objects were not fetched Stefan Beller
2018-09-21 22:35 ` [PATCH 8/8] builtin/fetch: check for submodule updates for non branch fetches Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).