[PATCH 0/10] towards clean leak-checker output

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH 0/10] towards clean leak-checker output
@ 2017-09-05 13:01 Jeff King
  2017-09-05 13:03 ` [PATCH 01/10] test-lib: --valgrind should not override --verbose-log Jeff King
                   ` (11 more replies)
  0 siblings, 12 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:01 UTC (permalink / raw)
  To: git

Using leak-checking tools like valgrind or LSAN is usually not all that
productive with Git, because the output is far from clean. And _most_ of
these are just not interesting, as they're either:

  1. Real leaks, but ones that only be triggered a small, set number of
     times per program.

  2. Intentional leaks of variables as the main cmd_* functions go out
     of scope (as opposed to manually cleaning).

I focused here on getting t0000 and t0001 to run clean against a
leak-checking tool. That's a fraction of the total test suite, but my
goal was have a tractable sample which could let us see if the path
seems sane.

I feel positive overall about the approach this series takes. The
suppressions aren't too terrible to write, and I found some real (albeit
minor) leaks in these few tests. I hope it can serve as a base for an
ongoing effort to get the whole test suite clean.

A few things to note:

  - I switched from valgrind to ASAN/LSAN early on. It's just way
    faster, which makes the compile-test-fix cycle a lot more pleasant.
    With these patches, you should be able to do:

      make SANITIZE=leak && (cd t && ./t1234-* -v -i)

    and get a leak report for a single script dumped to stderr.

    If you want to try it with t0000, you'll need the lock-file series I
    just posted at:

      https://public-inbox.org/git/20170905121353.62zg3mtextmq5zrs@sigill.intra.peff.net/

    and the fix from Martin at:

      https://public-inbox.org/git/CAN0heSqn59sFF3A-eQ593G+ZDWnO9pKM5F=sgiSQk+prUr-nSQ@mail.gmail.com/

    to get a clean run.

  - I'm using LSAN instead of the full-on ASAN because it's faster. The
    docs warn that it's a bit experimental, and I did notice a few funny
    behaviors (like truncated backtraces), but overall it seems fine.
    You can also do:

      make SANITIZE=address &&
        (cd t && ASAN_OPTIONS=abort_on_error=1 ./t1234-* -v -i)

    to do a full ASAN run (the extra environment setting is necessary to
    override test-lib's defaults).

  - gcc's leak-checker doesn't seem to handle reachability correctly (or
    maybe I'm holding it wrong). As a simple case, building with ASAN
    and running git-init complains:

      $ make SANITIZE=address && ./git init foo
      ...
      Direct leak of 2 byte(s) in 1 object(s) allocated from:
          #0 0x7f011dc699e0 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xd99e0)
          #1 0x558eeedbdab5 in do_xmalloc /home/peff/compile/git/wrapper.c:60
          #2 0x558eeedbdbab in do_xmallocz /home/peff/compile/git/wrapper.c:100
          #3 0x558eeedbdd0d in xmallocz /home/peff/compile/git/wrapper.c:108
          #4 0x558eeedbdd0d in xmemdupz /home/peff/compile/git/wrapper.c:124
          #5 0x558eeedbdd0d in xstrndup /home/peff/compile/git/wrapper.c:130
          #6 0x558eeea0507a in main /home/peff/compile/git/common-main.c:39
          #7 0x7f011cf612e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)

    That line is the setting of argv0_path, which is a global (and thus
    shouldn't be marked as leaking). Interestingly, it only happens with
    -O2. Compiling with -O0 works fine. I'm not sure if it's a bug or
    what.

    I did most of my testing with clang-6.0, which gets this case right.

  - I have no idea who close or far this is to covering the whole suite.
    Only 98 test scripts complete with no leaks. But many of those
    failures will be hitting the same leaks over and over. It looks like
    running "git show' hits a few, which is going to affect a lot of
    scripts. But bear in mind when reading the patches that this might
    be the first step on a successful road, or it could be a dead end. :)

Most of this is actual leak fixes. The interesting part, I think, is the
UNLEAK annotation in patch 10.

  [01/10]: test-lib: --valgrind should not override --verbose-log
  [02/10]: test-lib: set LSAN_OPTIONS to abort by default
  [03/10]: add: free leaked pathspec after add_files_to_cache()
  [04/10]: update-index: fix cache entry leak in add_one_file()
  [05/10]: config: plug user_config leak
  [06/10]: reset: make tree counting less confusing
  [07/10]: reset: free allocated tree buffers
  [08/10]: repository: free fields before overwriting them
  [09/10]: set_git_dir: handle feeding gitdir to itself
  [10/10]: add UNLEAK annotation for reducing leak false positives

 Makefile               |  3 +++
 builtin/add.c          |  3 +++
 builtin/commit.c       |  1 +
 builtin/config.c       | 11 +++++++++--
 builtin/init-db.c      |  2 ++
 builtin/ls-files.c     |  1 +
 builtin/reset.c        | 24 +++++++++++++++++-------
 builtin/update-index.c |  4 +++-
 builtin/worktree.c     |  2 ++
 environment.c          |  4 +++-
 git-compat-util.h      |  7 +++++++
 repository.c           | 14 +++++++-------
 setup.c                |  5 -----
 t/test-lib.sh          |  7 ++++++-
 usage.c                | 13 +++++++++++++
 15 files changed, 77 insertions(+), 24 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 01/10] test-lib: --valgrind should not override --verbose-log
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
@ 2017-09-05 13:03 ` Jeff King
  2017-09-05 13:04 ` [PATCH 02/10] test-lib: set LSAN_OPTIONS to abort by default Jeff King
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:03 UTC (permalink / raw)
  To: git

The --verbose test option cannot be used with test harnesses
like "prove". Instead, you must use --verbose-log.

Since the --valgrind option implies --verbose, that means
that it cannot be used with prove. I.e., this does not work:

  prove t0000-basic.sh :: --valgrind

You'd think it could be fixed by doing:

  prove t0000-basic.sh :: --valgrind --verbose-log

but that doesn't work either, because the implied --verbose
takes precedence over --verbose-log. If the user has given
us a specific option, we should prefer that.

Signed-off-by: Jeff King <peff@peff.net>
---
I ended up not using valgrind for most of my tests, but this did bite me
early on when I tried to run "make GIT_TEST_OPTS=--valgrind prove".

 t/test-lib.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 5fbd8d4a90..62461a6e35 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -274,7 +274,7 @@ then
 	test -z "$verbose" && verbose_only="$valgrind_only"
 elif test -n "$valgrind"
 then
-	verbose=t
+	test -z "$verbose_log" && verbose=t
 fi
 
 if test -n "$color"
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 02/10] test-lib: set LSAN_OPTIONS to abort by default
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
  2017-09-05 13:03 ` [PATCH 01/10] test-lib: --valgrind should not override --verbose-log Jeff King
@ 2017-09-05 13:04 ` Jeff King
  2017-09-05 13:04 ` [PATCH 03/10] add: free leaked pathspec after add_files_to_cache() Jeff King
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:04 UTC (permalink / raw)
  To: git

We already set ASAN_OPTIONS to abort if it finds any errors.
As we start to experiment with LSAN, the leak sanitizer,
it's convenient if we give it the same treatment.

Note that ASAN is actually a superset of LSAN and can do the
leak detection itself. So this only has an effect if you
specifically build with "make SANITIZE=leak" (leak detection
but not the rest of ASAN). Building with just LSAN results
in a build that runs much faster. That makes the
build-test-fix cycle more pleasant.

In the long run, once we've fixed or suppressed all the
leaks, it will probably be worth turning leak-detection on
for ASAN and just using that (to check both leaks _and_
memory errors in a single test run). But there's still a lot
of work before we get there.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/test-lib.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 62461a6e35..a738540ef2 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -44,6 +44,11 @@ GIT_BUILD_DIR="$TEST_DIRECTORY"/..
 : ${ASAN_OPTIONS=detect_leaks=0:abort_on_error=1}
 export ASAN_OPTIONS

+# If LSAN is in effect we _do_ want leak checking, but we still
+# want to abort so that we notice the problems.
+: ${LSAN_OPTIONS=abort_on_error=1}
+export LSAN_OPTIONS
+
 ################################################################
 # It appears that people try to run tests without building...
 "$GIT_BUILD_DIR/git" >/dev/null
-- 
2.14.1.721.gc5bc1565f1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 03/10] add: free leaked pathspec after add_files_to_cache()
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
  2017-09-05 13:03 ` [PATCH 01/10] test-lib: --valgrind should not override --verbose-log Jeff King
  2017-09-05 13:04 ` [PATCH 02/10] test-lib: set LSAN_OPTIONS to abort by default Jeff King
@ 2017-09-05 13:04 ` Jeff King
  2017-09-05 13:04 ` [PATCH 04/10] update-index: fix cache entry leak in add_one_file() Jeff King
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:04 UTC (permalink / raw)
  To: git

After run_diff_files, we throw away the rev_info struct,
including the pathspec that we copied into it, leaking the
memory. this is probably not a big deal in practice. We
usually only run this once per process, and the leak is
proportional to the pathspec list we're already holding in
memory.

But it's still a leak, and it pollutes leak-checker output,
making it harder to find important leaks.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/add.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/builtin/add.c b/builtin/add.c
index c20548e4f5..ef625e3fb8 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -119,6 +119,7 @@ int add_files_to_cache(const char *prefix,
 	rev.diffopt.flags |= DIFF_OPT_OVERRIDE_SUBMODULE_CONFIG;
 	rev.max_count = 0; /* do not compare unmerged paths with stage #2 */
 	run_diff_files(&rev, DIFF_RACY_IS_MODIFIED);
+	clear_pathspec(&rev.prune_data);
 	return !!data.add_errors;
 }
 
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 04/10] update-index: fix cache entry leak in add_one_file()
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (2 preceding siblings ...)
  2017-09-05 13:04 ` [PATCH 03/10] add: free leaked pathspec after add_files_to_cache() Jeff King
@ 2017-09-05 13:04 ` Jeff King
  2017-09-05 13:04 ` [PATCH 05/10] config: plug user_config leak Jeff King
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:04 UTC (permalink / raw)
  To: git

When we fail to add the cache entry to the index, we end up
just leaking the struct. We should follow the pattern of the
early-return above and free it.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/update-index.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 655e6d60d3..bf7420b808 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -287,8 +287,10 @@ static int add_one_path(const struct cache_entry *old, const char *path, int len
 	}
 	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
 	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
-	if (add_cache_entry(ce, option))
+	if (add_cache_entry(ce, option)) {
+		free(ce);
 		return error("%s: cannot add to the index - missing --add option?", path);
+	}
 	return 0;
 }
 
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 05/10] config: plug user_config leak
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (3 preceding siblings ...)
  2017-09-05 13:04 ` [PATCH 04/10] update-index: fix cache entry leak in add_one_file() Jeff King
@ 2017-09-05 13:04 ` Jeff King
  2017-09-05 13:04 ` [PATCH 06/10] reset: make tree counting less confusing Jeff King
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:04 UTC (permalink / raw)
  To: git

We generate filenames for the user_config ("~/.gitconfig")
and the xdg config ("$XDG_CONFIG_HOME/git/config") and then
decide which to use by looking at the filesystem. But after
selecting one, the unused string is just leaked.

This is a tiny leak, but it creates noise in leak-checker
output.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/config.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index 70ff231e9c..52a4606243 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -518,10 +518,13 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 			die("$HOME not set");
 
 		if (access_or_warn(user_config, R_OK, 0) &&
-		    xdg_config && !access_or_warn(xdg_config, R_OK, 0))
+		    xdg_config && !access_or_warn(xdg_config, R_OK, 0)) {
 			given_config_source.file = xdg_config;
-		else
+			free(user_config);
+		} else {
 			given_config_source.file = user_config;
+			free(xdg_config);
+		}
 	}
 	else if (use_system_config)
 		given_config_source.file = git_etc_gitconfig();
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 06/10] reset: make tree counting less confusing
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (4 preceding siblings ...)
  2017-09-05 13:04 ` [PATCH 05/10] config: plug user_config leak Jeff King
@ 2017-09-05 13:04 ` Jeff King
  2017-09-05 13:04 ` [PATCH 07/10] reset: free allocated tree buffers Jeff King
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:04 UTC (permalink / raw)
  To: git

Depending on whether we're in --keep mode, git-reset may
feed one or two trees to unpack_trees(). We start a counter
at "1" and then increment it to "2" only for the two-tree
case. But that means we must always subtract one to find the
correct array slot to fill with each descriptor.

Instead, let's start at "0" and just increment our counter
after adding each tree. This skips the extra subtraction,
and will make things much easier when we start to actually
free our tree buffers.

While we're at it, let's make the first allocation use the
slot at "desc + nr", too, even though we know "nr" is 0 at
that point. It makes the two fill_tree_descriptor() calls
consistent (always "desc + nr", followed by always
incrementing "nr").

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/reset.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin/reset.c b/builtin/reset.c
index f1af9345e4..34839db8ca 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -44,7 +44,7 @@ static inline int is_merge(void)
 
 static int reset_index(const struct object_id *oid, int reset_type, int quiet)
 {
-	int nr = 1;
+	int nr = 0;
 	struct tree_desc desc[2];
 	struct tree *tree;
 	struct unpack_trees_options opts;
@@ -75,14 +75,16 @@ static int reset_index(const struct object_id *oid, int reset_type, int quiet)
 		struct object_id head_oid;
 		if (get_oid("HEAD", &head_oid))
 			return error(_("You do not have a valid HEAD."));
-		if (!fill_tree_descriptor(desc, &head_oid))
+		if (!fill_tree_descriptor(desc + nr, &head_oid))
 			return error(_("Failed to find tree of HEAD."));
 		nr++;
 		opts.fn = twoway_merge;
 	}
 
-	if (!fill_tree_descriptor(desc + nr - 1, oid))
+	if (!fill_tree_descriptor(desc + nr, oid))
 		return error(_("Failed to find tree of %s."), oid_to_hex(oid));
+	nr++;
+
 	if (unpack_trees(nr, desc, &opts))
 		return -1;
 
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 07/10] reset: free allocated tree buffers
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (5 preceding siblings ...)
  2017-09-05 13:04 ` [PATCH 06/10] reset: make tree counting less confusing Jeff King
@ 2017-09-05 13:04 ` Jeff King
  2017-09-05 13:04 ` [PATCH 08/10] repository: free fields before overwriting them Jeff King
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:04 UTC (permalink / raw)
  To: git

We read the tree objects with fill_tree_descriptor(), but
never actually free the resulting buffers, causing a memory
leak. This isn't a huge deal because we call this code at
most twice per program invocation. But it does potentially
double our heap usage if you have large root trees. Let's
free the trees before returning.


Signed-off-by: Jeff King <peff@peff.net>
---
Actually, in the two-call case we repeat one of the trees,
so we _could_ in theory stash it away. But we don't do that
currently, and it's probably not worth the effort. My main
motivation is just shutting up the leak checker so I can
find actual important leaks.

 builtin/reset.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/builtin/reset.c b/builtin/reset.c
index 34839db8ca..9e22b4a1dc 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -44,10 +44,11 @@ static inline int is_merge(void)
 
 static int reset_index(const struct object_id *oid, int reset_type, int quiet)
 {
-	int nr = 0;
+	int i, nr = 0;
 	struct tree_desc desc[2];
 	struct tree *tree;
 	struct unpack_trees_options opts;
+	int ret = -1;
 
 	memset(&opts, 0, sizeof(opts));
 	opts.head_idx = 1;
@@ -81,19 +82,26 @@ static int reset_index(const struct object_id *oid, int reset_type, int quiet)
 		opts.fn = twoway_merge;
 	}
 
-	if (!fill_tree_descriptor(desc + nr, oid))
-		return error(_("Failed to find tree of %s."), oid_to_hex(oid));
+	if (!fill_tree_descriptor(desc + nr, oid)) {
+		error(_("Failed to find tree of %s."), oid_to_hex(oid));
+		goto out;
+	}
 	nr++;
 
-	if (unpack_trees(nr, desc, &opts))
-		return -1;
+	if (unpack_trees(nr, desc, &opts)) 
+		goto out;
 
 	if (reset_type == MIXED || reset_type == HARD) {
 		tree = parse_tree_indirect(oid);
 		prime_cache_tree(&the_index, tree);
 	}
 
-	return 0;
+	ret = 0;
+
+out:
+	for (i = 0; i < nr; i++)
+		free((void *)desc[i].buffer);
+	return ret;
 }
 
 static void print_new_head_line(struct commit *commit)
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 08/10] repository: free fields before overwriting them
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (6 preceding siblings ...)
  2017-09-05 13:04 ` [PATCH 07/10] reset: free allocated tree buffers Jeff King
@ 2017-09-05 13:04 ` Jeff King
  2017-09-05 13:05 ` [PATCH 09/10] set_git_dir: handle feeding gitdir to itself Jeff King
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:04 UTC (permalink / raw)
  To: git

It's possible that the repository data may be initialized
twice (e.g., after doing a chdir() to the top of the
worktree we may have to adjust a relative git_dir path). We
should free() any existing fields before assigning to them
to avoid leaks.

This should be safe, as the fields are set based on the
environment or on other strings like the gitdir or
commondir. That makes it impossible that we are feeding an
alias to the just-freed string.

Signed-off-by: Jeff King <peff@peff.net>
---
 environment.c | 4 +++-
 repository.c  | 4 ++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/environment.c b/environment.c
index 3fd4b10845..f1f934b6fd 100644
--- a/environment.c
+++ b/environment.c
@@ -97,7 +97,7 @@ int ignore_untracked_cache_config;
 /* This is set by setup_git_dir_gently() and/or git_default_config() */
 char *git_work_tree_cfg;
 
-static const char *namespace;
+static char *namespace;
 
 static const char *super_prefix;
 
@@ -152,8 +152,10 @@ void setup_git_env(void)
 	if (getenv(NO_REPLACE_OBJECTS_ENVIRONMENT))
 		check_replace_refs = 0;
 	replace_ref_base = getenv(GIT_REPLACE_REF_BASE_ENVIRONMENT);
+	free(git_replace_ref_base);
 	git_replace_ref_base = xstrdup(replace_ref_base ? replace_ref_base
 							  : "refs/replace/");
+	free(namespace);
 	namespace = expand_namespace(getenv(GIT_NAMESPACE_ENVIRONMENT));
 	shallow_file = getenv(GIT_SHALLOW_FILE_ENVIRONMENT);
 	if (shallow_file)
diff --git a/repository.c b/repository.c
index f107af7d76..52f1821c6b 100644
--- a/repository.c
+++ b/repository.c
@@ -40,11 +40,15 @@ static void repo_setup_env(struct repository *repo)
 
 	repo->different_commondir = find_common_dir(&sb, repo->gitdir,
 						    !repo->ignore_env);
+	free(repo->commondir);
 	repo->commondir = strbuf_detach(&sb, NULL);
+	free(repo->objectdir);
 	repo->objectdir = git_path_from_env(DB_ENVIRONMENT, repo->commondir,
 					    "objects", !repo->ignore_env);
+	free(repo->graft_file);
 	repo->graft_file = git_path_from_env(GRAFT_ENVIRONMENT, repo->commondir,
 					     "info/grafts", !repo->ignore_env);
+	free(repo->index_file);
 	repo->index_file = git_path_from_env(INDEX_ENVIRONMENT, repo->gitdir,
 					     "index", !repo->ignore_env);
 }
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 09/10] set_git_dir: handle feeding gitdir to itself
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (7 preceding siblings ...)
  2017-09-05 13:04 ` [PATCH 08/10] repository: free fields before overwriting them Jeff King
@ 2017-09-05 13:05 ` Jeff King
  2017-09-07 19:06   ` Brandon Williams
  2017-09-05 13:05 ` [PATCH 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:05 UTC (permalink / raw)
  To: git

Ideally we'd free the existing gitdir field before assigning
the new one, to avoid a memory leak. But we can't do so
safely because some callers do the equivalent of:

  set_git_dir(get_git_dir());

We can detect that case as a noop, but there are even more
complicated cases like:

  set_git_dir(remove_leading_path(worktree, get_git_dir());

where we really do need to do some work, but the original
string must remain valid.

Rather than put the burden on callers to make a copy of the
string (only to free it later, since we'll make a copy of it
ourselves), let's solve the problem inside set_git_dir(). We
can make a copy of the pointer for the old gitdir, and then
avoid freeing it until after we've made our new copy.

Signed-off-by: Jeff King <peff@peff.net>
---
 repository.c | 10 +++-------
 setup.c      |  5 -----
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/repository.c b/repository.c
index 52f1821c6b..97c732bd48 100644
--- a/repository.c
+++ b/repository.c
@@ -56,16 +56,12 @@ static void repo_setup_env(struct repository *repo)
 void repo_set_gitdir(struct repository *repo, const char *path)
 {
 	const char *gitfile = read_gitfile(path);
+	char *old_gitdir = repo->gitdir;
 
-	/*
-	 * NEEDSWORK: Eventually we want to be able to free gitdir and the rest
-	 * of the environment before reinitializing it again, but we have some
-	 * crazy code paths where we try to set gitdir with the current gitdir
-	 * and we don't want to free gitdir before copying the passed in value.
-	 */
 	repo->gitdir = xstrdup(gitfile ? gitfile : path);
-
 	repo_setup_env(repo);
+
+	free(old_gitdir);
 }
 
 /*
diff --git a/setup.c b/setup.c
index 23950173fc..6d8380acd2 100644
--- a/setup.c
+++ b/setup.c
@@ -399,11 +399,6 @@ void setup_work_tree(void)
 	if (getenv(GIT_WORK_TREE_ENVIRONMENT))
 		setenv(GIT_WORK_TREE_ENVIRONMENT, ".", 1);
 
-	/*
-	 * NEEDSWORK: this call can essentially be set_git_dir(get_git_dir())
-	 * which can cause some problems when trying to free the old value of
-	 * gitdir.
-	 */
 	set_git_dir(remove_leading_path(git_dir, work_tree));
 	initialized = 1;
 }
-- 
2.14.1.721.gc5bc1565f1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (8 preceding siblings ...)
  2017-09-05 13:05 ` [PATCH 09/10] set_git_dir: handle feeding gitdir to itself Jeff King
@ 2017-09-05 13:05 ` Jeff King
  2017-09-05 22:05   ` Stefan Beller
                     ` (2 more replies)
  2017-09-05 17:50 ` [PATCH 0/10] towards clean leak-checker output Martin Ågren
  2017-09-08  6:38 ` [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
  11 siblings, 3 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 13:05 UTC (permalink / raw)
  To: git

It's a common pattern in git commands to allocate some
memory that should last for the lifetime of the program and
then not bother to free it, relying on the OS to throw it
away.

This keeps the code simple, and it's fast (we don't waste
time traversing structures or calling free at the end of the
program). But it also triggers warnings from memory-leak
checkers like valgrind or LSAN. They know that the memory
was still allocated at program exit, but they don't know
_when_ the leaked memory stopped being useful. If it was
early in the program, then it's probably a real and
important leak. But if it was used right up until program
exit, it's not an interesting leak and we'd like to suppress
it so that we can see the real leaks.

This patch introduces an UNLEAK() macro that lets us do so.
To understand its design, let's first look at some of the
alternatives.

Unfortunately the suppression systems offered by
leak-checking tools don't quite do what we want. A
leak-checker basically knows two things:

  1. Which blocks were allocated via malloc, and the
     callstack during the allocation.

  2. Which blocks were left un-freed at the end of the
     program (and which are unreachable, but more on that
     later).

Their suppressions work by mentioning the function or
callstack of a particular allocation, and marking it as OK
to leak.  So imagine you have code like this:

  int main(void)
  {
	/* this allocates some memory */
	char *p = some_function();
	printf("%s", p);
	return 0;
  }

You can say "ignore allocations from some_function(),
they're not leaks". But that's not right. That function may
be called elsewhere, too, and we would potentially want to
know about those leaks.

So you can say "ignore the callstack when main calls
some_function".  That works, but your annotations are
brittle. In this case it's only two functions, but you can
imagine that the actual allocation is much deeper. If any of
the intermediate code changes, you have to update the
suppression.

What we _really_ want to say is that "the value assigned to
p at the end of the function is not a real leak". But
leak-checkers can't understand that; they don't know about
"p" in the first place.

However, we can do something a little bit tricky if we make
some assumptions about how leak-checkers work. They
generally don't just report all un-freed blocks. That would
report even globals which are still accessible when the
leak-check is run.  Instead they take some set of memory
(like BSS) as a root and mark it as "reachable". Then they
scan the reachable blocks for anything that looks like a
pointer to a malloc'd block, and consider that block
reachable. And then they scan those blocks, and so on,
transitively marking anything reachable from a global as
"not leaked" (or at least leaked in a different category).

So we can mark the value of "p" as reachable by putting it
into a variable with program lifetime. One way to do that is
to just mark "p" as static. But that actually affects the
run-time behavior if the function is called twice (you
aren't likely to call main() twice, but some of our cmd_*()
functions are called from other commands).

Instead, we can trick the leak-checker by putting the value
into _any_ reachable bytes. This patch keeps a global
linked-list of bytes copied from "unleaked" variables. That
list is reachable even at program exit, which confers
recursive reachability on whatever values we unleak.

In other words, you can do:

  int main(void)
  {
	char *p = some_function();
	printf("%s", p);
	UNLEAK(p);
	return 0;
  }

to annotate "p" and suppress the leak report.

But wait, couldn't we just say "free(p)"? In this toy
example, yes. But using UNLEAK() has several advantages over
actually freeing the memory:

  1. It can be compiled conditionally. There's no need in
     normal runs to do this free(), and it just wastes time.
     By using a macro, we can get the benefit for leak-check
     builds with zero cost for normal builds (this patch
     uses a compile-time check, though we could clearly also
     make it a run-time check at very low cost).

     Of course one could also hide free() behind a macro, so
     this is really just arguing for having UNLEAK(), not
     for its particular implementation.

  2. It's recursive across structures. In many cases our "p"
     is not just a pointer, but a complex struct whose
     fields may have been allocated by a sub-function. And
     in some cases (e.g., dir_struct) we don't even have a
     function which knows how to free all of the struct
     members.

     By marking the struct itself as reachable, that confers
     reachability on any pointers it contains (including those
     found in embedded structs, or reachable by walking
     heap blocks recursively.

  3. It works on cases where we're not sure if the value is
     allocated or not. For example:

       char *p = argc > 1 ? argv[1] : some_function();

     It's safe to use UNLEAK(p) here, because it's not
     freeing any memory. In the case that we're pointing to
     argv here, the reachability checker will just ignore
     our bytes.

  4. Because it's not actually freeing memory, you can
     UNLEAK() before we are finished accessing the variable.
     This is helpful in cases like this:

       char *p = some_function();
       return another_function(p);

     Writing this with free() requires:

       int ret;
       char *p = some_function();
       ret = another_function(p);
       free(p);
       return ret;

     But with unleak we can just write:

       char *p = some_function();
       UNLEAK(p);
       return another_function(p);

This patch adds the UNLEAK() macro and enables it
automatically when Git is compiled with SANITIZE=leak.
It adds some UNLEAK() annotations to show off how the
feature works. On top of other recent leak fixes, these are
enough to get t0000 and t0001 to pass when compiled with
LSAN.

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile           |  3 +++
 builtin/add.c      |  2 ++
 builtin/commit.c   |  1 +
 builtin/config.c   |  4 ++++
 builtin/init-db.c  |  2 ++
 builtin/ls-files.c |  1 +
 builtin/worktree.c |  2 ++
 git-compat-util.h  |  7 +++++++
 usage.c            | 13 +++++++++++++
 9 files changed, 35 insertions(+)

diff --git a/Makefile b/Makefile
index f2bb7f2f63..c052f09bba 100644
--- a/Makefile
+++ b/Makefile
@@ -1036,6 +1036,9 @@ BASIC_CFLAGS += -fno-omit-frame-pointer
 ifneq ($(filter undefined,$(SANITIZERS)),)
 BASIC_CFLAGS += -DNO_UNALIGNED_LOADS
 endif
+ifneq ($(filter leak,$(SANITIZERS)),)
+BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+endif
 endif
 
 ifndef sysconfdir
diff --git a/builtin/add.c b/builtin/add.c
index ef625e3fb8..a648cf4c56 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -515,5 +515,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			die(_("Unable to write new index file"));
 	}
 
+	UNLEAK(pathspec);
+	UNLEAK(dir);
 	return exit_status;
 }
diff --git a/builtin/commit.c b/builtin/commit.c
index b3b04f5dd3..de775d906c 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1819,5 +1819,6 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 		print_summary(prefix, &oid, !current_head);
 
 	strbuf_release(&err);
+	UNLEAK(sb);
 	return 0;
 }
diff --git a/builtin/config.c b/builtin/config.c
index 52a4606243..d13daeeb55 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -631,6 +631,7 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_write();
 		check_argc(argc, 2, 2);
 		value = normalize_value(argv[0], argv[1]);
+		UNLEAK(value);
 		ret = git_config_set_in_file_gently(given_config_source.file, argv[0], value);
 		if (ret == CONFIG_NOTHING_SET)
 			error(_("cannot overwrite multiple values with a single value\n"
@@ -641,6 +642,7 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_write();
 		check_argc(argc, 2, 3);
 		value = normalize_value(argv[0], argv[1]);
+		UNLEAK(value);
 		return git_config_set_multivar_in_file_gently(given_config_source.file,
 							      argv[0], value, argv[2], 0);
 	}
@@ -648,6 +650,7 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_write();
 		check_argc(argc, 2, 2);
 		value = normalize_value(argv[0], argv[1]);
+		UNLEAK(value);
 		return git_config_set_multivar_in_file_gently(given_config_source.file,
 							      argv[0], value,
 							      CONFIG_REGEX_NONE, 0);
@@ -656,6 +659,7 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_write();
 		check_argc(argc, 2, 3);
 		value = normalize_value(argv[0], argv[1]);
+		UNLEAK(value);
 		return git_config_set_multivar_in_file_gently(given_config_source.file,
 							      argv[0], value, argv[2], 1);
 	}
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 47823f9aa4..c9b7946bad 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -579,6 +579,8 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 			set_git_work_tree(work_tree);
 	}
 
+	UNLEAK(real_git_dir);
+
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, flags);
 }
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1339e6d17..8c713c47ac 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -673,5 +673,6 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		return bad ? 1 : 0;
 	}
 
+	UNLEAK(dir);
 	return 0;
 }
diff --git a/builtin/worktree.c b/builtin/worktree.c
index c98e2ce5f5..de26849f55 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -381,6 +381,8 @@ static int add(int ac, const char **av, const char *prefix)
 		branch = opts.new_branch;
 	}
 
+	UNLEAK(path);
+	UNLEAK(opts);
 	return add_worktree(path, branch, &opts);
 }
 
diff --git a/git-compat-util.h b/git-compat-util.h
index 6678b488cc..01cde2e375 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1169,4 +1169,11 @@ static inline int is_missing_file_error(int errno_)
 
 extern int cmd_main(int, const char **);
 
+#ifdef SUPPRESS_ANNOTATED_LEAKS
+extern void unleak_memory(const void *ptr, size_t len);
+#define UNLEAK(var) unleak_memory(&(var), sizeof(var));
+#else
+#define UNLEAK(var)
+#endif
+
 #endif
diff --git a/usage.c b/usage.c
index 1ea7df9a20..780ed73be6 100644
--- a/usage.c
+++ b/usage.c
@@ -241,3 +241,16 @@ NORETURN void BUG(const char *fmt, ...)
 	va_end(ap);
 }
 #endif
+
+void unleak_memory(const void *ptr, size_t len)
+{
+	static struct suppressed_leak_root {
+		struct suppressed_leak_root *next;
+		char data[FLEX_ARRAY];
+	} *suppressed_leaks;
+	struct suppressed_leak_root *root;
+
+	FLEX_ALLOC_MEM(root, data, ptr, len);
+	root->next = suppressed_leaks;
+	suppressed_leaks = root;
+}
-- 
2.14.1.721.gc5bc1565f1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/10] towards clean leak-checker output
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (9 preceding siblings ...)
  2017-09-05 13:05 ` [PATCH 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
@ 2017-09-05 17:50 ` Martin Ågren
  2017-09-05 19:02   ` Jeff King
  2017-09-08  6:38 ` [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
  11 siblings, 1 reply; 42+ messages in thread
From: Martin Ågren @ 2017-09-05 17:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On 5 September 2017 at 15:01, Jeff King <peff@peff.net> wrote:
> Using leak-checking tools like valgrind or LSAN is usually not all that
> productive with Git, because the output is far from clean. And _most_ of
> these are just not interesting, as they're either:
>
>   1. Real leaks, but ones that only be triggered a small, set number of
>      times per program.
>
>   2. Intentional leaks of variables as the main cmd_* functions go out
>      of scope (as opposed to manually cleaning).
>
> I focused here on getting t0000 and t0001 to run clean against a
> leak-checking tool. That's a fraction of the total test suite, but my
> goal was have a tractable sample which could let us see if the path
> seems sane.
>
> I feel positive overall about the approach this series takes. The
> suppressions aren't too terrible to write, and I found some real (albeit
> minor) leaks in these few tests. I hope it can serve as a base for an
> ongoing effort to get the whole test suite clean.
>
> A few things to note:
>
>   - I switched from valgrind to ASAN/LSAN early on. It's just way
>     faster, which makes the compile-test-fix cycle a lot more pleasant.
>     With these patches, you should be able to do:
>
>       make SANITIZE=leak && (cd t && ./t1234-* -v -i)
>
>     and get a leak report for a single script dumped to stderr.
>
>     If you want to try it with t0000, you'll need the lock-file series I
>     just posted at:
>
>       https://public-inbox.org/git/20170905121353.62zg3mtextmq5zrs@sigill.intra.peff.net/
>
>     and the fix from Martin at:
>
>       https://public-inbox.org/git/CAN0heSqn59sFF3A-eQ593G+ZDWnO9pKM5F=sgiSQk+prUr-nSQ@mail.gmail.com/
>
>     to get a clean run.
>
>   - I'm using LSAN instead of the full-on ASAN because it's faster. The
>     docs warn that it's a bit experimental, and I did notice a few funny
>     behaviors (like truncated backtraces), but overall it seems fine.
>     You can also do:
>
>       make SANITIZE=address &&
>         (cd t && ASAN_OPTIONS=abort_on_error=1 ./t1234-* -v -i)
>
>     to do a full ASAN run (the extra environment setting is necessary to
>     override test-lib's defaults).
>
>   - gcc's leak-checker doesn't seem to handle reachability correctly (or
>     maybe I'm holding it wrong). As a simple case, building with ASAN
>     and running git-init complains:
>
>       $ make SANITIZE=address && ./git init foo
>       ...
>       Direct leak of 2 byte(s) in 1 object(s) allocated from:
>           #0 0x7f011dc699e0 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xd99e0)
>           #1 0x558eeedbdab5 in do_xmalloc /home/peff/compile/git/wrapper.c:60
>           #2 0x558eeedbdbab in do_xmallocz /home/peff/compile/git/wrapper.c:100
>           #3 0x558eeedbdd0d in xmallocz /home/peff/compile/git/wrapper.c:108
>           #4 0x558eeedbdd0d in xmemdupz /home/peff/compile/git/wrapper.c:124
>           #5 0x558eeedbdd0d in xstrndup /home/peff/compile/git/wrapper.c:130
>           #6 0x558eeea0507a in main /home/peff/compile/git/common-main.c:39
>           #7 0x7f011cf612e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
>
>     That line is the setting of argv0_path, which is a global (and thus
>     shouldn't be marked as leaking). Interestingly, it only happens with
>     -O2. Compiling with -O0 works fine. I'm not sure if it's a bug or
>     what.
>
>     I did most of my testing with clang-6.0, which gets this case right.

Hmmm, I got the same wrong results (IMHO) from Valgrind, which
classified this as "definitely lost". Like you I found that -O0 helped.
And yes, that was with gcc. Maybe gcc with optimization somehow manages
to hide the pointers from these tools. I know too little about the
technical details to have any real ideas, though. My searches did not
bring up anything useful. (gcc 5.4.0)

I guess clang will be my next attempt. And asan/lsan. Valgrind is slow..

I'll look through this series and get back if I have any input.

>   - I have no idea who close or far this is to covering the whole suite.
>     Only 98 test scripts complete with no leaks. But many of those
>     failures will be hitting the same leaks over and over. It looks like
>     running "git show' hits a few, which is going to affect a lot of
>     scripts. But bear in mind when reading the patches that this might
>     be the first step on a successful road, or it could be a dead end. :)
>
> Most of this is actual leak fixes. The interesting part, I think, is the
> UNLEAK annotation in patch 10.
>
>   [01/10]: test-lib: --valgrind should not override --verbose-log
>   [02/10]: test-lib: set LSAN_OPTIONS to abort by default
>   [03/10]: add: free leaked pathspec after add_files_to_cache()
>   [04/10]: update-index: fix cache entry leak in add_one_file()
>   [05/10]: config: plug user_config leak
>   [06/10]: reset: make tree counting less confusing
>   [07/10]: reset: free allocated tree buffers
>   [08/10]: repository: free fields before overwriting them
>   [09/10]: set_git_dir: handle feeding gitdir to itself
>   [10/10]: add UNLEAK annotation for reducing leak false positives
>
>  Makefile               |  3 +++
>  builtin/add.c          |  3 +++
>  builtin/commit.c       |  1 +
>  builtin/config.c       | 11 +++++++++--
>  builtin/init-db.c      |  2 ++
>  builtin/ls-files.c     |  1 +
>  builtin/reset.c        | 24 +++++++++++++++++-------
>  builtin/update-index.c |  4 +++-
>  builtin/worktree.c     |  2 ++
>  environment.c          |  4 +++-
>  git-compat-util.h      |  7 +++++++
>  repository.c           | 14 +++++++-------
>  setup.c                |  5 -----
>  t/test-lib.sh          |  7 ++++++-
>  usage.c                | 13 +++++++++++++
>  15 files changed, 77 insertions(+), 24 deletions(-)
>
> -Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/10] towards clean leak-checker output
  2017-09-05 17:50 ` [PATCH 0/10] towards clean leak-checker output Martin Ågren
@ 2017-09-05 19:02   ` Jeff King
  2017-09-05 20:41     ` Martin Ågren
  2017-09-06  1:42     ` Junio C Hamano
  0 siblings, 2 replies; 42+ messages in thread
From: Jeff King @ 2017-09-05 19:02 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git Mailing List

On Tue, Sep 05, 2017 at 07:50:10PM +0200, Martin Ågren wrote:

> >     That line is the setting of argv0_path, which is a global (and thus
> >     shouldn't be marked as leaking). Interestingly, it only happens with
> >     -O2. Compiling with -O0 works fine. I'm not sure if it's a bug or
> >     what.
> >
> >     I did most of my testing with clang-6.0, which gets this case right.
> 
> Hmmm, I got the same wrong results (IMHO) from Valgrind, which
> classified this as "definitely lost". Like you I found that -O0 helped.
> And yes, that was with gcc. Maybe gcc with optimization somehow manages
> to hide the pointers from these tools. I know too little about the
> technical details to have any real ideas, though. My searches did not
> bring up anything useful. (gcc 5.4.0)

Yeah, I think it is just optimizing out the variable entirely. If
RUNTIME_PREFIX isn't defined (and it's not for non-Windows platforms)
then we never look at the variable at all, and it's a dead assignment.
And the compiler can see that easily because it's got static linkage. So
it drops the variable completely, but it can't drop the call to
xstrdup() with the information in exec_cmd.c. It has to call the
function and throw away the result, resulting in the leak.

In fact, the whole extract_argv0_path thing is pointless without
RUNTIME_PREFIX. So I think one reasonable fix is just:

diff --git a/exec_cmd.c b/exec_cmd.c
index fb94aeba9c..09f05c3bc3 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -5,7 +5,10 @@
 #define MAX_ARGS	32
 
 static const char *argv_exec_path;
+
+#ifdef RUNTIME_PREFIX
 static const char *argv0_path;
+#endif
 
 char *system_path(const char *path)
 {
@@ -40,6 +43,7 @@ char *system_path(const char *path)
 
 void git_extract_argv0_path(const char *argv0)
 {
+#ifdef RUNTIME_PREFIX
 	const char *slash;
 
 	if (!argv0 || !*argv0)
@@ -49,6 +53,7 @@ void git_extract_argv0_path(const char *argv0)
 
 	if (slash)
 		argv0_path = xstrndup(argv0, slash - argv0);
+#endif
 }
 
 void git_set_argv_exec_path(const char *exec_path)

-Peff

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/10] towards clean leak-checker output
  2017-09-05 19:02   ` Jeff King
@ 2017-09-05 20:41     ` Martin Ågren
  2017-09-06 12:39       ` Jeff King
  2017-09-06  1:42     ` Junio C Hamano
  1 sibling, 1 reply; 42+ messages in thread
From: Martin Ågren @ 2017-09-05 20:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On 5 September 2017 at 21:02, Jeff King <peff@peff.net> wrote:
> On Tue, Sep 05, 2017 at 07:50:10PM +0200, Martin Ågren wrote:
>
>> >     That line is the setting of argv0_path, which is a global (and thus
>> >     shouldn't be marked as leaking). Interestingly, it only happens with
>> >     -O2. Compiling with -O0 works fine. I'm not sure if it's a bug or
>> >     what.
>> >
>> >     I did most of my testing with clang-6.0, which gets this case right.
>>
>> Hmmm, I got the same wrong results (IMHO) from Valgrind, which
>> classified this as "definitely lost". Like you I found that -O0 helped.
>> And yes, that was with gcc. Maybe gcc with optimization somehow manages
>> to hide the pointers from these tools. I know too little about the
>> technical details to have any real ideas, though. My searches did not
>> bring up anything useful. (gcc 5.4.0)
>
> Yeah, I think it is just optimizing out the variable entirely. If
> RUNTIME_PREFIX isn't defined (and it's not for non-Windows platforms)
> then we never look at the variable at all, and it's a dead assignment.
> And the compiler can see that easily because it's got static linkage. So
> it drops the variable completely, but it can't drop the call to
> xstrdup() with the information in exec_cmd.c. It has to call the
> function and throw away the result, resulting in the leak.

I see. Yeah, that makes sense.

FWIW, this series (combined with the other series you mentioned) makes
t0000 and t0001 pass for me with gcc/clang. There are actually some
leaks in t0000, they're just silently being reported to stderr, since
the exit statuses from git are hidden by pipes. Maybe you're already
aware of it. Depending on your definition of "running clean" it might be
out of scope for this series, which is of course still very interesting
and enlightening as it stands.

One is in cmd_show (you did mention git show) where we leak data in rev.
The other is some use of strdup. I can't immediately figure out how to
get a useful stacktrace (you mentioned this as well) and it's past
bed-time here. I'll try to play more with this tomorrow.

Note for self: getting rid of all pipes would probably also help flush
out a few leaks (or "introduce" them, depending on your viewpoint).

Martin

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-05 13:05 ` [PATCH 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
@ 2017-09-05 22:05   ` Stefan Beller
  2017-09-07  9:17     ` Jeff King
  2017-09-12 14:34     ` Kaartic Sivaraam
  2017-09-06 17:16   ` Martin Ågren
  2017-09-12 13:41   ` Kaartic Sivaraam
  2 siblings, 2 replies; 42+ messages in thread
From: Stefan Beller @ 2017-09-05 22:05 UTC (permalink / raw)
  To: Jeff King; +Cc: git@vger.kernel.org

On Tue, Sep 5, 2017 at 6:05 AM, Jeff King <peff@peff.net> wrote:

>   int main(void)

nit of the day:
  s/void/int argc, char *argv/ or in case we do not
  want to emphasize the argument list s/void//
  as that adds no uninteresting things.


>
> In other words, you can do:
>
>   int main(void)
>   {
>         char *p = some_function();
>         printf("%s", p);
>         UNLEAK(p);
>         return 0;
>   }
>
> to annotate "p" and suppress the leak report.

This sounds really cool so far.

After having a sneak peak at the implementation
it is O(1) in runtime for each added element, and the
space complexity is O(well).

> But wait, couldn't we just say "free(p)"? In this toy
> example, yes. But using UNLEAK() has several advantages over
> actually freeing the memory:

This is indeed the big question, that I have had.

>
>   1. It can be compiled conditionally. There's no need in
>      normal runs to do this free(), and it just wastes time.
>      By using a macro, we can get the benefit for leak-check
>      builds with zero cost for normal builds (this patch
>      uses a compile-time check, though we could clearly also
>      make it a run-time check at very low cost).
>
>      Of course one could also hide free() behind a macro, so
>      this is really just arguing for having UNLEAK(), not
>      for its particular implementation.

This is only a real argument in combination with (2), or in other
words you seem to hint at situations like these:

  struct *foo = obtain_new_foo();
  ...
  #if FREE_ANNOTATED_LEAKS
    /* special free() */
    release_foo(foo);
  #endif

With UNLEAK this situation works out nicely as we just
copy over all memory, ignoring elements allocated inside
foo, but for free() we'd have issues combining the preprocessor
magic with the special free implementation.

So how would we use syntactic sugar to made this
more comfortable? Roughly like

    MAYBE(release_foo(foo))

  #if (FREE_ANNOTATED_LEAKS)
  /* we rely on strict text substitution */
  /* as the function signature may change */
  #define MAYBE(fn) fn;
  #else
  #define MAYBE(fn)
  #endif

Me regurgitating this first argument is just a long way of saying
that it put me off even more after reading only the first argument.
Maybe reorder this argument to show up after the current second
argument, so the reader is guided better?

>   2. It's recursive across structures. In many cases our "p"
>      is not just a pointer, but a complex struct whose
>      fields may have been allocated by a sub-function. And
>      in some cases (e.g., dir_struct) we don't even have a
>      function which knows how to free all of the struct
>      members.
>
>      By marking the struct itself as reachable, that confers
>      reachability on any pointers it contains (including those
>      found in embedded structs, or reachable by walking
>      heap blocks recursively.
>
>   3. It works on cases where we're not sure if the value is
>      allocated or not. For example:
>
>        char *p = argc > 1 ? argv[1] : some_function();
>
>      It's safe to use UNLEAK(p) here, because it's not
>      freeing any memory. In the case that we're pointing to
>      argv here, the reachability checker will just ignore
>      our bytes.

This argument demonstrates why the MAYBE above is
inferior.

>
>   4. Because it's not actually freeing memory, you can
>      UNLEAK() before we are finished accessing the variable.
>      This is helpful in cases like this:
>
>        char *p = some_function();
>        return another_function(p);
>
>      Writing this with free() requires:
>
>        int ret;
>        char *p = some_function();
>        ret = another_function(p);
>        free(p);
>        return ret;
>
>      But with unleak we can just write:
>
>        char *p = some_function();
>        UNLEAK(p);
>        return another_function(p);

  5. It's not just about worrying if we can call UNLEAK
      once (in 4), but we also do not have to worry about
      calling it twice, or recursively. (This argument can be bad
      for cargo cult programmers, but we don't have these ;-)



> +#ifdef SUPPRESS_ANNOTATED_LEAKS
> +extern void unleak_memory(const void *ptr, size_t len);
> +#define UNLEAK(var) unleak_memory(&(var), sizeof(var));

As always with macros we have to be careful about its arguments.

  UNLEAK(a++)
  UNLEAK(baz())

won't work as intended.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/10] towards clean leak-checker output
  2017-09-05 19:02   ` Jeff King
  2017-09-05 20:41     ` Martin Ågren
@ 2017-09-06  1:42     ` Junio C Hamano
  2017-09-06 12:28       ` [PATCH 0/2] simplifying !RUNTIME_PREFIX Jeff King
  1 sibling, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2017-09-06  1:42 UTC (permalink / raw)
  To: Jeff King; +Cc: Martin Ågren, Git Mailing List

Jeff King <peff@peff.net> writes:

> In fact, the whole extract_argv0_path thing is pointless without
> RUNTIME_PREFIX. So I think one reasonable fix is just:
>
> diff --git a/exec_cmd.c b/exec_cmd.c
> index fb94aeba9c..09f05c3bc3 100644
> --- a/exec_cmd.c
> +++ b/exec_cmd.c
> @@ -5,7 +5,10 @@
>  #define MAX_ARGS	32
>  
>  static const char *argv_exec_path;
> +
> +#ifdef RUNTIME_PREFIX
>  static const char *argv0_path;
> +#endif
>  
>  char *system_path(const char *path)
>  {
> @@ -40,6 +43,7 @@ char *system_path(const char *path)
>  
>  void git_extract_argv0_path(const char *argv0)
>  {
> +#ifdef RUNTIME_PREFIX
>  	const char *slash;
>  
>  	if (!argv0 || !*argv0)
> @@ -49,6 +53,7 @@ void git_extract_argv0_path(const char *argv0)
>  
>  	if (slash)
>  		argv0_path = xstrndup(argv0, slash - argv0);
> +#endif
>  }
>  
>  void git_set_argv_exec_path(const char *exec_path)
>
> -Peff

Yeah, I kind of like this (I would have reduced the ifdef noise by
leaving a totally-unused argv0_path in the BSS, though).


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 0/2] simplifying !RUNTIME_PREFIX
  2017-09-06  1:42     ` Junio C Hamano
@ 2017-09-06 12:28       ` Jeff King
  2017-09-06 12:30         ` [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function Jeff King
  2017-09-06 12:32         ` [PATCH 2/2] git_extract_argv0_path: do nothing without RUNTIME_PREFIX Jeff King
  0 siblings, 2 replies; 42+ messages in thread
From: Jeff King @ 2017-09-06 12:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin Ågren, Git Mailing List

On Wed, Sep 06, 2017 at 10:42:07AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > In fact, the whole extract_argv0_path thing is pointless without
> > RUNTIME_PREFIX. So I think one reasonable fix is just:
> [...]
> Yeah, I kind of like this (I would have reduced the ifdef noise by
> leaving a totally-unused argv0_path in the BSS, though).

I wanted to drop it to make sure that I didn't miss any references to it
(and that we didn't add any in the future). But -Wunused also complains
if you leave it in. :)

I think we can reorganize the code a little to make the end result more
readable. Like this.

  [1/2]: system_path: move RUNTIME_PREFIX to a sub-function
  [2/2]: git_extract_argv0_path: do nothing without RUNTIME_PREFIX

 exec_cmd.c | 42 ++++++++++++++++++++++++++++--------------
 1 file changed, 28 insertions(+), 14 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function
  2017-09-06 12:28       ` [PATCH 0/2] simplifying !RUNTIME_PREFIX Jeff King
@ 2017-09-06 12:30         ` Jeff King
  2017-09-06 13:23           ` Johannes Schindelin
  2017-09-06 12:32         ` [PATCH 2/2] git_extract_argv0_path: do nothing without RUNTIME_PREFIX Jeff King
  1 sibling, 1 reply; 42+ messages in thread
From: Jeff King @ 2017-09-06 12:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin Ågren, Git Mailing List

The system_path() function has an #ifdef in the middle of
it. Let's move the conditional logic into a sub-function.
This isolates it more, which will make it easier to change
and add to.

Signed-off-by: Jeff King <peff@peff.net>
---
I find the diff hard to read because it shows the opposite of what I did
(moving the big block to its own function, rather than moving the little
bits to a new copy of the function).

 exec_cmd.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index fb94aeba9c..61092e9715 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -7,19 +7,12 @@
 static const char *argv_exec_path;
 static const char *argv0_path;
 
-char *system_path(const char *path)
-{
 #ifdef RUNTIME_PREFIX
-	static const char *prefix;
-#else
-	static const char *prefix = PREFIX;
-#endif
-	struct strbuf d = STRBUF_INIT;
 
-	if (is_absolute_path(path))
-		return xstrdup(path);
+static const char *system_prefix(void)
+{
+	static const char *prefix;
 
-#ifdef RUNTIME_PREFIX
 	assert(argv0_path);
 	assert(is_absolute_path(argv0_path));
 
@@ -32,9 +25,25 @@ char *system_path(const char *path)
 				"but prefix computation failed.  "
 				"Using static fallback '%s'.\n", prefix);
 	}
-#endif
+	return prefix;
+}
+#else
+
+static const char *system_prefix(void)
+{
+	return PREFIX;
+}
+
+#endif /* RUNTIME_PREFIX */
+
+char *system_path(const char *path)
+{
+	struct strbuf d = STRBUF_INIT;
+
+	if (is_absolute_path(path))
+		return xstrdup(path);
 
-	strbuf_addf(&d, "%s/%s", prefix, path);
+	strbuf_addf(&d, "%s/%s", system_prefix(), path);
 	return strbuf_detach(&d, NULL);
 }
 
-- 
2.14.1.757.g8fad538cea


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 2/2] git_extract_argv0_path: do nothing without RUNTIME_PREFIX
  2017-09-06 12:28       ` [PATCH 0/2] simplifying !RUNTIME_PREFIX Jeff King
  2017-09-06 12:30         ` [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function Jeff King
@ 2017-09-06 12:32         ` Jeff King
  1 sibling, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-06 12:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin Ågren, Git Mailing List

When the RUNTIME_PREFIX compile-time knob isn't set, we
never look at the argv0_path we extract. We can push its
declaration inside the #ifdef to make it more clear that the
extract code is effectively a noop.

This also un-confuses leak-checking of the argv0_path
variable when RUNTIME_PREFIX isn't set. The compiler is free
to drop this static variable that we set but never look at
(and "gcc -O2" does so).  But the compiler still must call
strbuf_detach(), since it doesn't know whether that function
has side effects; it just throws away the result rather than
putting it into the global.

Leak-checkers which work by scanning the data segment for
pointers to heap blocks would normally consider the block
as reachable at program end. But if the compiler removes the
variable entirely, there's nothing to find.

Signed-off-by: Jeff King <peff@peff.net>
---
 exec_cmd.c | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index 61092e9715..ce192a2d64 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -5,9 +5,9 @@
 #define MAX_ARGS	32
 
 static const char *argv_exec_path;
-static const char *argv0_path;
 
 #ifdef RUNTIME_PREFIX
+static const char *argv0_path;
 
 static const char *system_prefix(void)
 {
@@ -27,6 +27,20 @@ static const char *system_prefix(void)
 	}
 	return prefix;
 }
+
+void git_extract_argv0_path(const char *argv0)
+{
+	const char *slash;
+
+	if (!argv0 || !*argv0)
+		return;
+
+	slash = find_last_dir_sep(argv0);
+
+	if (slash)
+		argv0_path = xstrndup(argv0, slash - argv0);
+}
+
 #else
 
 static const char *system_prefix(void)
@@ -34,6 +48,10 @@ static const char *system_prefix(void)
 	return PREFIX;
 }
 
+void git_extract_argv0_path(const char *argv0)
+{
+}
+
 #endif /* RUNTIME_PREFIX */
 
 char *system_path(const char *path)
@@ -47,19 +65,6 @@ char *system_path(const char *path)
 	return strbuf_detach(&d, NULL);
 }
 
-void git_extract_argv0_path(const char *argv0)
-{
-	const char *slash;
-
-	if (!argv0 || !*argv0)
-		return;
-
-	slash = find_last_dir_sep(argv0);
-
-	if (slash)
-		argv0_path = xstrndup(argv0, slash - argv0);
-}
-
 void git_set_argv_exec_path(const char *exec_path)
 {
 	argv_exec_path = exec_path;
-- 
2.14.1.757.g8fad538cea

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/10] towards clean leak-checker output
  2017-09-05 20:41     ` Martin Ågren
@ 2017-09-06 12:39       ` Jeff King
  0 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-06 12:39 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git Mailing List

On Tue, Sep 05, 2017 at 10:41:51PM +0200, Martin Ågren wrote:

> FWIW, this series (combined with the other series you mentioned) makes
> t0000 and t0001 pass for me with gcc/clang. There are actually some
> leaks in t0000, they're just silently being reported to stderr, since
> the exit statuses from git are hidden by pipes. Maybe you're already
> aware of it. Depending on your definition of "running clean" it might be
> out of scope for this series, which is of course still very interesting
> and enlightening as it stands.

Yeah, I saw those. I don't think they're worth hunting down at this
point. It's quite probable that the same leaks are exercised elsewhere,
and we'll catch them later.

Once the whole test suite runs without reporting any leaks via abort(),
I'll feel more compelled to start grepping stderr for other cases. :)

> One is in cmd_show (you did mention git show) where we leak data in rev.
> The other is some use of strdup. I can't immediately figure out how to
> get a useful stacktrace (you mentioned this as well) and it's past
> bed-time here. I'll try to play more with this tomorrow.

I think I got funny truncated stack traces with just LSAN that went away
with ASAN, but I didn't dig. If you try that, remember that:

  - Technically just SANITIZE=address turns on the leak detector, but
    you need "leak" in the string to turn on the UNLEAK() magic. So use
    "SANITIZE=address,leak".

  - You'll have to set ASAN_OPTIONS=abort_on_error=1 in the environment
    manually to enable leak detection.

Once the test suite passes with leak detection on, I think we'd probably
do both of those automatically (but until then they make ASAN by itself
unusable).

> Note for self: getting rid of all pipes would probably also help flush
> out a few leaks (or "introduce" them, depending on your viewpoint).

Agreed. These leaks are really just a special form of bug. Those pipes
could be hiding other bugs, too. But again, I'm not too concerned. The
places that pipe git are doing so because they are not meant to be
testing that particular command (and it should generally be covered
elsewhere).

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function
  2017-09-06 12:30         ` [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function Jeff King
@ 2017-09-06 13:23           ` Johannes Schindelin
  2017-09-06 13:27             ` Jeff King
  0 siblings, 1 reply; 42+ messages in thread
From: Johannes Schindelin @ 2017-09-06 13:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Martin Ågren, Git Mailing List

Hi Peff,

On Wed, 6 Sep 2017, Jeff King wrote:

> I find the diff hard to read because it shows the opposite of what I did
> (moving the big block to its own function, rather than moving the little
> bits to a new copy of the function).

I guess --patience would have made it slightly more readable. I did not
find any commit in your fork on GitHub with the commit subject, else I
would have pasted the output of `git diff --patience` here.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function
  2017-09-06 13:23           ` Johannes Schindelin
@ 2017-09-06 13:27             ` Jeff King
  0 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-06 13:27 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Martin Ågren, Git Mailing List

On Wed, Sep 06, 2017 at 03:23:42PM +0200, Johannes Schindelin wrote:

> On Wed, 6 Sep 2017, Jeff King wrote:
> 
> > I find the diff hard to read because it shows the opposite of what I did
> > (moving the big block to its own function, rather than moving the little
> > bits to a new copy of the function).
> 
> I guess --patience would have made it slightly more readable. I did not
> find any commit in your fork on GitHub with the commit subject, else I
> would have pasted the output of `git diff --patience` here.

I hadn't pushed it yet (but it's there now, 130217f4a). Neither
--patience nor --histogram produces markedly different results. I think
the issue is just that what I actually did (moving the big inner block)
is not nearly as minimal a diff as moving what surrounds it.

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-05 13:05 ` [PATCH 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
  2017-09-05 22:05   ` Stefan Beller
@ 2017-09-06 17:16   ` Martin Ågren
  2017-09-07  9:00     ` Jeff King
  2017-09-12 13:41   ` Kaartic Sivaraam
  2 siblings, 1 reply; 42+ messages in thread
From: Martin Ågren @ 2017-09-06 17:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On 5 September 2017 at 15:05, Jeff King <peff@peff.net> wrote:
>   1. It can be compiled conditionally. There's no need in
>      normal runs to do this free(), and it just wastes time.
>      By using a macro, we can get the benefit for leak-check
>      builds with zero cost for normal builds (this patch
>      uses a compile-time check, though we could clearly also
>      make it a run-time check at very low cost).
>
>      Of course one could also hide free() behind a macro, so
>      this is really just arguing for having UNLEAK(), not
>      for its particular implementation.

Like Stefan, I didn't quite follow 1. until after I had read the points
below it. But it's still a very good commit message (as always).

> diff --git a/builtin/commit.c b/builtin/commit.c
> index b3b04f5dd3..de775d906c 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1819,5 +1819,6 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
>                 print_summary(prefix, &oid, !current_head);
>
>         strbuf_release(&err);
> +       UNLEAK(sb);
>         return 0;
>  }

These are both strbufs, so this ends up being a bit inconsistent. What
would be the ideal end state for these two and all other such
structures? My guess is "always UNLEAK", as opposed to carefully judging
whether foo_release() would/could add any significant overhead.

In other words, it would be ok/wanted with changes such as "let's UNLEAK
bar, because ..., and while at it, convert the existing foo_release to
UNLEAK for consistency" (or per policy, for smaller binary, whatever).
Or "if it ain't broken, don't fix it"? Did you think about this, or was
it more a random choice?

Martin

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-06 17:16   ` Martin Ågren
@ 2017-09-07  9:00     ` Jeff King
  0 siblings, 0 replies; 42+ messages in thread
From: Jeff King @ 2017-09-07  9:00 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git Mailing List

On Wed, Sep 06, 2017 at 07:16:00PM +0200, Martin Ågren wrote:

> > diff --git a/builtin/commit.c b/builtin/commit.c
> > index b3b04f5dd3..de775d906c 100644
> > --- a/builtin/commit.c
> > +++ b/builtin/commit.c
> > @@ -1819,5 +1819,6 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
> >                 print_summary(prefix, &oid, !current_head);
> >
> >         strbuf_release(&err);
> > +       UNLEAK(sb);
> >         return 0;
> >  }
> 
> These are both strbufs, so this ends up being a bit inconsistent. What
> would be the ideal end state for these two and all other such
> structures? My guess is "always UNLEAK", as opposed to carefully judging
> whether foo_release() would/could add any significant overhead.
> 
> In other words, it would be ok/wanted with changes such as "let's UNLEAK
> bar, because ..., and while at it, convert the existing foo_release to
> UNLEAK for consistency" (or per policy, for smaller binary, whatever).
> Or "if it ain't broken, don't fix it"? Did you think about this, or was
> it more a random choice?

To be honest, I didn't really think that deeply about it. I had a hammer
in my hand, and LSAN kept showing me nails to pound.

I agree that these two strbufs should probably be treated the same.

In general, I think I prefer using UNLEAK() because it's hard to get it
wrong (i.e., you don't have to care about double-frees or uninitialized
pointers). For strbufs, though, that's less of an issue because they are
always maintained in a consistent state.

As an aside, I'm pretty sure that "err" can never have been allocated
here, and this release is always a noop. It's filled in only when we get
an error from the ref update, which also causes us to die(). But in
general I'd prefer the code that causes readers to think the least
(i.e., just calling free or UNLEAK here rather than forcing the reader
to figure out whether it's possible to leak).

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-05 22:05   ` Stefan Beller
@ 2017-09-07  9:17     ` Jeff King
  2017-09-07 20:38       ` Stefan Beller
  2017-09-12 14:34     ` Kaartic Sivaraam
  1 sibling, 1 reply; 42+ messages in thread
From: Jeff King @ 2017-09-07  9:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org

On Tue, Sep 05, 2017 at 03:05:12PM -0700, Stefan Beller wrote:

> On Tue, Sep 5, 2017 at 6:05 AM, Jeff King <peff@peff.net> wrote:
> 
> >   int main(void)
> 
> nit of the day:
>   s/void/int argc, char *argv/ or in case we do not
>   want to emphasize the argument list s/void//
>   as that adds no uninteresting things.

That really is a nit. I chose not to provide argv because it's longer
than "void" and I wasn't going to use the arguments. And I chose not to
use an empty argument list because it violates our style (as well as
arguably the C standard, though it leaves room for implementations to
take other forms of main).

> > In other words, you can do:
> >
> >   int main(void)
> >   {
> >         char *p = some_function();
> >         printf("%s", p);
> >         UNLEAK(p);
> >         return 0;
> >   }
> >
> > to annotate "p" and suppress the leak report.
> 
> This sounds really cool so far.
> 
> After having a sneak peak at the implementation
> it is O(1) in runtime for each added element, and the
> space complexity is O(well).

I'm not sure if your "well" is "this does well" or "well, it could be
quite a lot". :)

It certainly has the potential to grow the heap without bound (since
after all, it's whole point is to make a giant list of variables that
are going out of scope). But in practice we'd sprinkle this over a
handful of variables just before program exit (and remember that it's
copying only what's on the stack already; so pointers get copied, not
whole heap-allocated blocks).

Plus it does nothing at all when not compiled with leak-checking. So I'm
not too worried about the extra memory usage or performance.

> >   1. It can be compiled conditionally. There's no need in
> >      normal runs to do this free(), and it just wastes time.
> >      By using a macro, we can get the benefit for leak-check
> >      builds with zero cost for normal builds (this patch
> >      uses a compile-time check, though we could clearly also
> >      make it a run-time check at very low cost).
> >
> >      Of course one could also hide free() behind a macro, so
> >      this is really just arguing for having UNLEAK(), not
> >      for its particular implementation.
> 
> This is only a real argument in combination with (2), or in other
> words you seem to hint at situations like these:

Well, the numbered list was meant to be a set of arguments, each of
which contributes to the overall conclusion. :) I agree that (1) is the
weakest. Since both you and Martin seemed to get hung up on it, I'll
re-organize it a bit for the re-roll.

>   5. It's not just about worrying if we can call UNLEAK
>       once (in 4), but we also do not have to worry about
>       calling it twice, or recursively. (This argument can be bad
>       for cargo cult programmers, but we don't have these ;-)

True. I didn't come across that case in any of the ones I converted. As
a more general rule, UNLEAK() doesn't access any pointed-to memory at
all. So it's fine with already-freed or even uninitialized memory (which
of course is technically wrong according to the standard, but in
practice would be fine, as we'd copy garbage that does not match a heap
block).

> > +#ifdef SUPPRESS_ANNOTATED_LEAKS
> > +extern void unleak_memory(const void *ptr, size_t len);
> > +#define UNLEAK(var) unleak_memory(&(var), sizeof(var));
> 
> As always with macros we have to be careful about its arguments.
> 
>   UNLEAK(a++)
>   UNLEAK(baz())
> 
> won't work as intended.

Yes, I intended this to be used only for actual variables. I couldn't
think of a way to enforce that at compile time with some kind of
BUILD_ASSERT (even requiring an lvalue isn't quite strict enough).

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 09/10] set_git_dir: handle feeding gitdir to itself
  2017-09-05 13:05 ` [PATCH 09/10] set_git_dir: handle feeding gitdir to itself Jeff King
@ 2017-09-07 19:06   ` Brandon Williams
  0 siblings, 0 replies; 42+ messages in thread
From: Brandon Williams @ 2017-09-07 19:06 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On 09/05, Jeff King wrote:
> Ideally we'd free the existing gitdir field before assigning
> the new one, to avoid a memory leak. But we can't do so
> safely because some callers do the equivalent of:
> 
>   set_git_dir(get_git_dir());
> 
> We can detect that case as a noop, but there are even more
> complicated cases like:
> 
>   set_git_dir(remove_leading_path(worktree, get_git_dir());
> 
> where we really do need to do some work, but the original
> string must remain valid.
> 
> Rather than put the burden on callers to make a copy of the
> string (only to free it later, since we'll make a copy of it
> ourselves), let's solve the problem inside set_git_dir(). We
> can make a copy of the pointer for the old gitdir, and then
> avoid freeing it until after we've made our new copy.
> 

This patch and the one before it look good to me.  Thanks for fixing
this issue!

> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  repository.c | 10 +++-------
>  setup.c      |  5 -----
>  2 files changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/repository.c b/repository.c
> index 52f1821c6b..97c732bd48 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -56,16 +56,12 @@ static void repo_setup_env(struct repository *repo)
>  void repo_set_gitdir(struct repository *repo, const char *path)
>  {
>  	const char *gitfile = read_gitfile(path);
> +	char *old_gitdir = repo->gitdir;
>  
> -	/*
> -	 * NEEDSWORK: Eventually we want to be able to free gitdir and the rest
> -	 * of the environment before reinitializing it again, but we have some
> -	 * crazy code paths where we try to set gitdir with the current gitdir
> -	 * and we don't want to free gitdir before copying the passed in value.
> -	 */
>  	repo->gitdir = xstrdup(gitfile ? gitfile : path);
> -
>  	repo_setup_env(repo);
> +
> +	free(old_gitdir);
>  }
>  
>  /*
> diff --git a/setup.c b/setup.c
> index 23950173fc..6d8380acd2 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -399,11 +399,6 @@ void setup_work_tree(void)
>  	if (getenv(GIT_WORK_TREE_ENVIRONMENT))
>  		setenv(GIT_WORK_TREE_ENVIRONMENT, ".", 1);
>  
> -	/*
> -	 * NEEDSWORK: this call can essentially be set_git_dir(get_git_dir())
> -	 * which can cause some problems when trying to free the old value of
> -	 * gitdir.
> -	 */
>  	set_git_dir(remove_leading_path(git_dir, work_tree));
>  	initialized = 1;
>  }
> -- 
> 2.14.1.721.gc5bc1565f1
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-07  9:17     ` Jeff King
@ 2017-09-07 20:38       ` Stefan Beller
  0 siblings, 0 replies; 42+ messages in thread
From: Stefan Beller @ 2017-09-07 20:38 UTC (permalink / raw)
  To: Jeff King; +Cc: git@vger.kernel.org

On Thu, Sep 7, 2017 at 2:17 AM, Jeff King <peff@peff.net> wrote:
>> After having a sneak peak at the implementation
>> it is O(1) in runtime for each added element, and the
>> space complexity is O(well).
>
> I'm not sure if your "well" is "this does well" or "well, it could be
> quite a lot". :)

Both actually. When I wrote it I thought the phonetic interpretation
was way too funny, but nobody can hear subtle humor on mailing
lists. :)

If UNLEAK is used correctly, then it sounds more like
"this does well (and we cannot do better anyway)".

> It certainly has the potential to grow the heap without bound (since
> after all, it's whole point is to make a giant list of variables that
> are going out of scope). But in practice we'd sprinkle this over a
> handful of variables just before program exit (and remember that it's
> copying only what's on the stack already; so pointers get copied, not
> whole heap-allocated blocks).
>
> Plus it does nothing at all when not compiled with leak-checking. So I'm
> not too worried about the extra memory usage or performance.

me neither.

Thanks for starting this series (I am really happy about this solution)!
Stefan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
                   ` (10 preceding siblings ...)
  2017-09-05 17:50 ` [PATCH 0/10] towards clean leak-checker output Martin Ågren
@ 2017-09-08  6:38 ` Jeff King
  2017-09-19 20:45   ` Jonathan Tan
  11 siblings, 1 reply; 42+ messages in thread
From: Jeff King @ 2017-09-08  6:38 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Martin Ågren, Stefan Beller, Ramsay Jones

On Tue, Sep 05, 2017 at 09:01:50AM -0400, Jeff King wrote:

> Most of this is actual leak fixes. The interesting part, I think, is the
> UNLEAK annotation in patch 10.
> 
>   [01/10]: test-lib: --valgrind should not override --verbose-log
>   [02/10]: test-lib: set LSAN_OPTIONS to abort by default
>   [03/10]: add: free leaked pathspec after add_files_to_cache()
>   [04/10]: update-index: fix cache entry leak in add_one_file()
>   [05/10]: config: plug user_config leak
>   [06/10]: reset: make tree counting less confusing
>   [07/10]: reset: free allocated tree buffers
>   [08/10]: repository: free fields before overwriting them
>   [09/10]: set_git_dir: handle feeding gitdir to itself
>   [10/10]: add UNLEAK annotation for reducing leak false positives

Here's a re-roll of just patch 10, which is the only one that got any
critical feedback. This version:

  1. Only compiles the definition of unleak_memory() when it will be
     used. Thanks, Ramsay.

  2. Tweaks the "why UNLEAK() is better than free()" to put the
     compelling arguments first. Thanks Stefan and Martin.

  3. I added an in-code comment to explain how UNLEAK() should be used.
     I resisted this at first because the commit message goes into more
     detail on how it works and the implications. But it's probably good
     to have at least some guidance for people who stumble on it in the
     code.

  4. Tweaks the examples to avoid the "argc vs void" question. :) I
     also made them "cmd_foo()" which is a more realistic example of
     when you'd want to use this, since our commands typically do not
     implement their own main() anyway.

-- >8 --
Subject: add UNLEAK annotation for reducing leak false positives

It's a common pattern in git commands to allocate some
memory that should last for the lifetime of the program and
then not bother to free it, relying on the OS to throw it
away.

This keeps the code simple, and it's fast (we don't waste
time traversing structures or calling free at the end of the
program). But it also triggers warnings from memory-leak
checkers like valgrind or LSAN. They know that the memory
was still allocated at program exit, but they don't know
_when_ the leaked memory stopped being useful. If it was
early in the program, then it's probably a real and
important leak. But if it was used right up until program
exit, it's not an interesting leak and we'd like to suppress
it so that we can see the real leaks.

This patch introduces an UNLEAK() macro that lets us do so.
To understand its design, let's first look at some of the
alternatives.

Unfortunately the suppression systems offered by
leak-checking tools don't quite do what we want. A
leak-checker basically knows two things:

  1. Which blocks were allocated via malloc, and the
     callstack during the allocation.

  2. Which blocks were left un-freed at the end of the
     program (and which are unreachable, but more on that
     later).

Their suppressions work by mentioning the function or
callstack of a particular allocation, and marking it as OK
to leak.  So imagine you have code like this:

  int cmd_foo(...)
  {
	/* this allocates some memory */
	char *p = some_function();
	printf("%s", p);
	return 0;
  }

You can say "ignore allocations from some_function(),
they're not leaks". But that's not right. That function may
be called elsewhere, too, and we would potentially want to
know about those leaks.

So you can say "ignore the callstack when main calls
some_function".  That works, but your annotations are
brittle. In this case it's only two functions, but you can
imagine that the actual allocation is much deeper. If any of
the intermediate code changes, you have to update the
suppression.

What we _really_ want to say is that "the value assigned to
p at the end of the function is not a real leak". But
leak-checkers can't understand that; they don't know about
"p" in the first place.

However, we can do something a little bit tricky if we make
some assumptions about how leak-checkers work. They
generally don't just report all un-freed blocks. That would
report even globals which are still accessible when the
leak-check is run.  Instead they take some set of memory
(like BSS) as a root and mark it as "reachable". Then they
scan the reachable blocks for anything that looks like a
pointer to a malloc'd block, and consider that block
reachable. And then they scan those blocks, and so on,
transitively marking anything reachable from a global as
"not leaked" (or at least leaked in a different category).

So we can mark the value of "p" as reachable by putting it
into a variable with program lifetime. One way to do that is
to just mark "p" as static. But that actually affects the
run-time behavior if the function is called twice (you
aren't likely to call main() twice, but some of our cmd_*()
functions are called from other commands).

Instead, we can trick the leak-checker by putting the value
into _any_ reachable bytes. This patch keeps a global
linked-list of bytes copied from "unleaked" variables. That
list is reachable even at program exit, which confers
recursive reachability on whatever values we unleak.

In other words, you can do:

  int cmd_foo(...)
  {
	char *p = some_function();
	printf("%s", p);
	UNLEAK(p);
	return 0;
  }

to annotate "p" and suppress the leak report.

But wait, couldn't we just say "free(p)"? In this toy
example, yes. But UNLEAK()'s byte-copying strategy has
several advantages over actually freeing the memory:

  1. It's recursive across structures. In many cases our "p"
     is not just a pointer, but a complex struct whose
     fields may have been allocated by a sub-function. And
     in some cases (e.g., dir_struct) we don't even have a
     function which knows how to free all of the struct
     members.

     By marking the struct itself as reachable, that confers
     reachability on any pointers it contains (including those
     found in embedded structs, or reachable by walking
     heap blocks recursively.

  2. It works on cases where we're not sure if the value is
     allocated or not. For example:

       char *p = argc > 1 ? argv[1] : some_function();

     It's safe to use UNLEAK(p) here, because it's not
     freeing any memory. In the case that we're pointing to
     argv here, the reachability checker will just ignore
     our bytes.

  3. Likewise, it works even if the variable has _already_
     been freed. We're just copying the pointer bytes. If
     the block has been freed, the leak-checker will skip
     over those bytes as uninteresting.

  4. Because it's not actually freeing memory, you can
     UNLEAK() before we are finished accessing the variable.
     This is helpful in cases like this:

       char *p = some_function();
       return another_function(p);

     Writing this with free() requires:

       int ret;
       char *p = some_function();
       ret = another_function(p);
       free(p);
       return ret;

     But with unleak we can just write:

       char *p = some_function();
       UNLEAK(p);
       return another_function(p);

This patch adds the UNLEAK() macro and enables it
automatically when Git is compiled with SANITIZE=leak.  In
normal builds it's a noop, so we pay no runtime cost.

It also adds some UNLEAK() annotations to show off how the
feature works. On top of other recent leak fixes, these are
enough to get t0000 and t0001 to pass when compiled with
LSAN.

Note the case in commit.c which actually converts a
strbuf_release() into an UNLEAK. This code was already
non-leaky, but the free didn't do anything useful, since
we're exiting. Converting it to an annotation means that
non-leak-checking builds pay no runtime cost. The cost is
minimal enough that it's probably not worth going on a
crusade to convert these kinds of frees to UNLEAKS. I did it
here for consistency with the "sb" leak (though it would
have been equally correct to go the other way, and turn them
both into strbuf_release() calls).

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile           |  3 +++
 builtin/add.c      |  2 ++
 builtin/commit.c   |  3 ++-
 builtin/config.c   |  4 ++++
 builtin/init-db.c  |  2 ++
 builtin/ls-files.c |  1 +
 builtin/worktree.c |  2 ++
 git-compat-util.h  | 20 ++++++++++++++++++++
 usage.c            | 15 +++++++++++++++
 9 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f2bb7f2f63..c052f09bba 100644
--- a/Makefile
+++ b/Makefile
@@ -1036,6 +1036,9 @@ BASIC_CFLAGS += -fno-omit-frame-pointer
 ifneq ($(filter undefined,$(SANITIZERS)),)
 BASIC_CFLAGS += -DNO_UNALIGNED_LOADS
 endif
+ifneq ($(filter leak,$(SANITIZERS)),)
+BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+endif
 endif

 ifndef sysconfdir
diff --git a/builtin/add.c b/builtin/add.c
index ef625e3fb8..a648cf4c56 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -515,5 +515,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			die(_("Unable to write new index file"));
 	}

+	UNLEAK(pathspec);
+	UNLEAK(dir);
 	return exit_status;
 }
diff --git a/builtin/commit.c b/builtin/commit.c
index b3b04f5dd3..58f9747c2f 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1818,6 +1818,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 	if (!quiet)
 		print_summary(prefix, &oid, !current_head);

-	strbuf_release(&err);
+	UNLEAK(err);
+	UNLEAK(sb);
 	return 0;
 }
diff --git a/builtin/config.c b/builtin/config.c
index 52a4606243..d13daeeb55 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -631,6 +631,7 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_write();
 		check_argc(argc, 2, 2);
 		value = normalize_value(argv[0], argv[1]);
+		UNLEAK(value);
 		ret = git_config_set_in_file_gently(given_config_source.file, argv[0], value);
 		if (ret == CONFIG_NOTHING_SET)
 			error(_("cannot overwrite multiple values with a single value\n"
@@ -641,6 +642,7 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_write();
 		check_argc(argc, 2, 3);
 		value = normalize_value(argv[0], argv[1]);
+		UNLEAK(value);
 		return git_config_set_multivar_in_file_gently(given_config_source.file,
 							      argv[0], value, argv[2], 0);
 	}
@@ -648,6 +650,7 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_write();
 		check_argc(argc, 2, 2);
 		value = normalize_value(argv[0], argv[1]);
+		UNLEAK(value);
 		return git_config_set_multivar_in_file_gently(given_config_source.file,
 							      argv[0], value,
 							      CONFIG_REGEX_NONE, 0);
@@ -656,6 +659,7 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_write();
 		check_argc(argc, 2, 3);
 		value = normalize_value(argv[0], argv[1]);
+		UNLEAK(value);
 		return git_config_set_multivar_in_file_gently(given_config_source.file,
 							      argv[0], value, argv[2], 1);
 	}
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 47823f9aa4..c9b7946bad 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -579,6 +579,8 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 			set_git_work_tree(work_tree);
 	}

+	UNLEAK(real_git_dir);
+
 	flags |= INIT_DB_EXIST_OK;
 	return init_db(git_dir, real_git_dir, template_dir, flags);
 }
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e1339e6d17..8c713c47ac 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -673,5 +673,6 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		return bad ? 1 : 0;
 	}

+	UNLEAK(dir);
 	return 0;
 }
diff --git a/builtin/worktree.c b/builtin/worktree.c
index c98e2ce5f5..de26849f55 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -381,6 +381,8 @@ static int add(int ac, const char **av, const char *prefix)
 		branch = opts.new_branch;
 	}

+	UNLEAK(path);
+	UNLEAK(opts);
 	return add_worktree(path, branch, &opts);
 }

diff --git a/git-compat-util.h b/git-compat-util.h
index 6678b488cc..003e444c46 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1169,4 +1169,24 @@ static inline int is_missing_file_error(int errno_)

 extern int cmd_main(int, const char **);

+/*
+ * You can mark a stack variable with UNLEAK(var) to avoid it being
+ * reported as a leak by tools like LSAN or valgrind. The argument
+ * should generally be the variable itself (not its address and not what
+ * it points to). It's safe to use this on pointers which may already
+ * have been freed, or on pointers which may still be in use.
+ *
+ * Use this _only_ for a variable that leaks by going out of scope at
+ * program exit (so only from cmd_* functions or their direct helpers).
+ * Normal functions, especially those which may be called multiple
+ * times, should actually free their memory. This is only meant as
+ * an annotation, and does nothing in non-leak-checking builds.
+ */
+#ifdef SUPPRESS_ANNOTATED_LEAKS
+extern void unleak_memory(const void *ptr, size_t len);
+#define UNLEAK(var) unleak_memory(&(var), sizeof(var));
+#else
+#define UNLEAK(var)
+#endif
+
 #endif
diff --git a/usage.c b/usage.c
index 1ea7df9a20..cdd534c9df 100644
--- a/usage.c
+++ b/usage.c
@@ -241,3 +241,18 @@ NORETURN void BUG(const char *fmt, ...)
 	va_end(ap);
 }
 #endif
+
+#ifdef SUPPRESS_ANNOTATED_LEAKS
+void unleak_memory(const void *ptr, size_t len)
+{
+	static struct suppressed_leak_root {
+		struct suppressed_leak_root *next;
+		char data[FLEX_ARRAY];
+	} *suppressed_leaks;
+	struct suppressed_leak_root *root;
+
+	FLEX_ALLOC_MEM(root, data, ptr, len);
+	root->next = suppressed_leaks;
+	suppressed_leaks = root;
+}
+#endif
-- 
2.14.1.769.g4f4ea7dfd3

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-05 13:05 ` [PATCH 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
  2017-09-05 22:05   ` Stefan Beller
  2017-09-06 17:16   ` Martin Ågren
@ 2017-09-12 13:41   ` Kaartic Sivaraam
  2017-09-12 15:29     ` Jeff King
  2 siblings, 1 reply; 42+ messages in thread
From: Kaartic Sivaraam @ 2017-09-12 13:41 UTC (permalink / raw)
  To: Jeff King, git

On Tue, 2017-09-05 at 09:05 -0400, Jeff King wrote:
> This patch introduces an UNLEAK() macro that lets us do so.
> To understand its design, let's first look at some of the
> alternatives.
> 

> This patch adds the UNLEAK() macro and enables it
> automatically when Git is compiled with SANITIZE=leak.
> It adds some UNLEAK() annotations to show off how the
> feature works. On top of other recent leak fixes, these are
> enough to get t0000 and t0001 to pass when compiled with
> LSAN.

My nit of the day ;-)

The above paragraphs seems to going against the following guideline of
Documentation/SubmittingPatches,

    Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
    instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
    to do frotz", as if you are giving orders to the codebase to change
    its behavior.  Try to make sure your explanation can be understood



    -- 
    Kaartic

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-05 22:05   ` Stefan Beller
  2017-09-07  9:17     ` Jeff King
@ 2017-09-12 14:34     ` Kaartic Sivaraam
  2017-09-12 15:05       ` Jeff King
  1 sibling, 1 reply; 42+ messages in thread
From: Kaartic Sivaraam @ 2017-09-12 14:34 UTC (permalink / raw)
  To: Stefan Beller, Jeff King; +Cc: git@vger.kernel.org

> On Tue, 2017-09-05 at 15:05 -0700, Stefan Beller wrote:
> 
> After having a sneak peak at the implementation
> it is O(1) in runtime for each added element, and the
> space complexity is O(well).
> 

Incidentally I was reading about "complexity of algorithms" and there
was the following paragraph in the book,


    Unfortunately, as Knuth observed, big-O notation is often used by careless writers and
    speakers as if it had the same meaning as big-Theta notation. Keep this in mind when you see
    big-O notation used. The recent trend has been to use big-Theta notation whenever both upper
    and lower bounds on the size of a function are needed.


So, if my interpretation is correct the above usage of O(1) and O(well)
should have been Θ(1) and Θ(well).

-- 
Kaartic

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-12 14:34     ` Kaartic Sivaraam
@ 2017-09-12 15:05       ` Jeff King
  2017-09-13  7:13         ` Kaartic Sivaraam
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff King @ 2017-09-12 15:05 UTC (permalink / raw)
  To: Kaartic Sivaraam; +Cc: Stefan Beller, git@vger.kernel.org

On Tue, Sep 12, 2017 at 08:04:52PM +0530, Kaartic Sivaraam wrote:

> > On Tue, 2017-09-05 at 15:05 -0700, Stefan Beller wrote:
> > 
> > After having a sneak peak at the implementation
> > it is O(1) in runtime for each added element, and the
> > space complexity is O(well).
> > 
> 
> Incidentally I was reading about "complexity of algorithms" and there
> was the following paragraph in the book,
> 
> 
>     Unfortunately, as Knuth observed, big-O notation is often used by careless writers and
>     speakers as if it had the same meaning as big-Theta notation. Keep this in mind when you see
>     big-O notation used. The recent trend has been to use big-Theta notation whenever both upper
>     and lower bounds on the size of a function are needed.
> 
> So, if my interpretation is correct the above usage of O(1) and O(well)
> should have been Θ(1) and Θ(well).

But theta-well isn't a pun. :P

It is true that prepending to a linked list is also Θ(1), but I'm not
sure if it's carelessness that causes many programmers to use big-O.
It's that what we care about is worst-case performance. So knowing that
we have a lower bound isn't usually that interesting. What we want to
know is whether it will blow up in our face as "n" gets large.

Plus typing non-ascii characters is annoying. :)

If you want to talk about sloppy analysis, the two most common problems
I see are:

  1. People talk about big-O complexity without discussing constants.
     For reasonable values of "n", the constants often matter (they're
     not wrong about big-O, but they are wrong about what will run fast
     in practice).

  2. Glossing over things like amortized costs. Hash tables aren't
     really O(1) because they eventually fill up and get collisions. You
     have to talk about load factor, resizing, etc.

I'm sure I'm guilty of doing those things sometimes, too.

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-12 13:41   ` Kaartic Sivaraam
@ 2017-09-12 15:29     ` Jeff King
  2017-09-13  6:44       ` Kaartic Sivaraam
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff King @ 2017-09-12 15:29 UTC (permalink / raw)
  To: Kaartic Sivaraam; +Cc: git

On Tue, Sep 12, 2017 at 07:11:38PM +0530, Kaartic Sivaraam wrote:

> On Tue, 2017-09-05 at 09:05 -0400, Jeff King wrote:
> > This patch introduces an UNLEAK() macro that lets us do so.
> > To understand its design, let's first look at some of the
> > alternatives.
> > 
> 
> > This patch adds the UNLEAK() macro and enables it
> > automatically when Git is compiled with SANITIZE=leak.
> > It adds some UNLEAK() annotations to show off how the
> > feature works. On top of other recent leak fixes, these are
> > enough to get t0000 and t0001 to pass when compiled with
> > LSAN.
> 
> My nit of the day ;-)
> 
> The above paragraphs seems to going against the following guideline of
> Documentation/SubmittingPatches,
> 
>     Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
>     instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
>     to do frotz", as if you are giving orders to the codebase to change
>     its behavior.  Try to make sure your explanation can be understood

Like all good writing rules, I think it's important to know when to
break them. :)

Writing in the imperative is _most_ important in the subject. You're
likely to see a lot of subjects in a list, and it makes the list easier
to read if they all match. It also tends to be shorter, which is good
for subjects.

For short commit messages, I think the imperative also keeps things
tight and to the point: describe the problem and then say how to fix it.
The recent 0db3dc75f is a good example (which I picked by skimming
recent "git log" output). But saying "this patch" is IMHO not that big a
problem there, as long as it isn't done excessively.

When you the explanation is longer or more complicated, the imperative
can actually be a bit _too_ terse. In longer text it helps to guide
readers in the direction you want their thoughts to take. Having a
three-paragraph explanation of the problem or current state of things
and then jumping right into "Do this. Do that." lacks context. A marker
like "this patch" helps the reader know that you're switching gears to
talking about the solution.

I also think that "I" is useful in avoiding the passive voice.  It can
certainly be used gratuitously and make things less clear, but in most
cases I'd rather see something like "I tested performance under these
conditions" than "Performance was tested under these conditions". I also
often use the "academic we" here even when I worked on something myself.

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-12 15:29     ` Jeff King
@ 2017-09-13  6:44       ` Kaartic Sivaraam
  0 siblings, 0 replies; 42+ messages in thread
From: Kaartic Sivaraam @ 2017-09-13  6:44 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Tuesday 12 September 2017 08:59 PM, Jeff King wrote:
> Like all good writing rules, I think it's important to know when to
> break them. :)

That's right. "Have guidelines but 'Be bold' enough to break them when
they seem to be inducing counter productivity."

> Writing in the imperative is _most_ important in the subject. You're
> likely to see a lot of subjects in a list, and it makes the list easier
> to read if they all match. It also tends to be shorter, which is good
> for subjects.
>
> For short commit messages, I think the imperative also keeps things
> tight and to the point: describe the problem and then say how to fix it.
> The recent 0db3dc75f is a good example (which I picked by skimming
> recent "git log" output). But saying "this patch" is IMHO not that big a
> problem there, as long as it isn't done excessively.
>
> When you the explanation is longer or more complicated, the imperative
> can actually be a bit _too_ terse. In longer text it helps to guide
> readers in the direction you want their thoughts to take. Having a
> three-paragraph explanation of the problem or current state of things
> and then jumping right into "Do this. Do that." lacks context. A marker
> like "this patch" helps the reader know that you're switching gears to
> talking about the solution.
>
> I also think that "I" is useful in avoiding the passive voice.  It can
> certainly be used gratuitously and make things less clear, but in most
> cases I'd rather see something like "I tested performance under these
> conditions" than "Performance was tested under these conditions". I also
> often use the "academic we" here even when I worked on something myself.

Thanks for taking the time to give the detailed and clear explanation.

---
Kaartic

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-12 15:05       ` Jeff King
@ 2017-09-13  7:13         ` Kaartic Sivaraam
  0 siblings, 0 replies; 42+ messages in thread
From: Kaartic Sivaraam @ 2017-09-13  7:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, git@vger.kernel.org

On Tuesday 12 September 2017 08:35 PM, Jeff King wrote:
> But theta-well isn't a pun. :P 

:)

> It is true that prepending to a linked list is also Θ(1), but I'm not
> sure if it's carelessness that causes many programmers to use big-O.
> It's that what we care about is worst-case performance. So knowing that
> we have a lower bound isn't usually that interesting. What we want to
> know is whether it will blow up in our face as "n" gets large.

This seems quite acceptable.

> Plus typing non-ascii characters is annoying. :)

I expected this to be a reason. :)

---
Kaartic

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-08  6:38 ` [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
@ 2017-09-19 20:45   ` Jonathan Tan
  2017-09-19 21:03     ` Jeff King
  0 siblings, 1 reply; 42+ messages in thread
From: Jonathan Tan @ 2017-09-19 20:45 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Junio C Hamano, Martin Ågren, Stefan Beller,
	Ramsay Jones

Thanks - this does look like a good thing to have. Sorry for the late
comments.

The following comments are assuming that we're going to standardize on
UNLEAK(var); (with the semicolon).

On Fri, 8 Sep 2017 02:38:41 -0400
Jeff King <peff@peff.net> wrote:

> +#ifdef SUPPRESS_ANNOTATED_LEAKS
> +extern void unleak_memory(const void *ptr, size_t len);
> +#define UNLEAK(var) unleak_memory(&(var), sizeof(var));

I would feel better if the semicolon was omitted. I don't think it
matters in this particular case, though.

> +#else
> +#define UNLEAK(var)

I think this should be defined to be something (for example, "do {}
while (0)"), at least so that we have compiler errors when UNLEAK(var)
is used incorrectly (for example, without the semicolon) when
SUPPRESS_ANNOTATED_LEAKS is not defined.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-19 20:45   ` Jonathan Tan
@ 2017-09-19 21:03     ` Jeff King
  2017-09-19 21:34       ` [PATCH for jk/leak-checkers] git-compat-util: make UNLEAK less error-prone Jonathan Tan
  2017-09-20  1:45       ` [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives Junio C Hamano
  0 siblings, 2 replies; 42+ messages in thread
From: Jeff King @ 2017-09-19 21:03 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: git, Junio C Hamano, Martin Ågren, Stefan Beller,
	Ramsay Jones

On Tue, Sep 19, 2017 at 01:45:52PM -0700, Jonathan Tan wrote:

> The following comments are assuming that we're going to standardize on
> UNLEAK(var); (with the semicolon).

Yeah, I assumed we would. We don't have to, since this really is sort-of
magical, but I think the code looks better with it.

> On Fri, 8 Sep 2017 02:38:41 -0400
> Jeff King <peff@peff.net> wrote:
> 
> > +#ifdef SUPPRESS_ANNOTATED_LEAKS
> > +extern void unleak_memory(const void *ptr, size_t len);
> > +#define UNLEAK(var) unleak_memory(&(var), sizeof(var));
> 
> I would feel better if the semicolon was omitted. I don't think it
> matters in this particular case, though.

You end up with a double semi-colon. Some linters might complain.

> > +#else
> > +#define UNLEAK(var)
> 
> I think this should be defined to be something (for example, "do {}
> while (0)"), at least so that we have compiler errors when UNLEAK(var)
> is used incorrectly (for example, without the semicolon) when
> SUPPRESS_ANNOTATED_LEAKS is not defined.

Seems reasonable.

I think both are worth doing, but the patch is already in next so we
need to do it on top. Do you want to prepare a patch?

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH for jk/leak-checkers] git-compat-util: make UNLEAK less error-prone
  2017-09-19 21:03     ` Jeff King
@ 2017-09-19 21:34       ` Jonathan Tan
  2017-09-19 21:46         ` Jeff King
  2017-09-20  1:45       ` [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives Junio C Hamano
  1 sibling, 1 reply; 42+ messages in thread
From: Jonathan Tan @ 2017-09-19 21:34 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Commit 0e5bba5 ("add UNLEAK annotation for reducing leak false
positives", 2017-09-08) introduced an UNLEAK macro to be used as
"UNLEAK(var);", but its existing definitions make it possible to be
invoked as "UNLEAK(var)" (without the trailing semicolon) too.

Therefore, modify its definitions to cause a compile-time error if
invoked without the trailing semicolon.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Sure, here is the patch.
---
 git-compat-util.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 003e444c4..9bc15b036 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1184,9 +1184,9 @@ extern int cmd_main(int, const char **);
  */
 #ifdef SUPPRESS_ANNOTATED_LEAKS
 extern void unleak_memory(const void *ptr, size_t len);
-#define UNLEAK(var) unleak_memory(&(var), sizeof(var));
+#define UNLEAK(var) unleak_memory(&(var), sizeof(var))
 #else
-#define UNLEAK(var)
+#define UNLEAK(var) do {} while (0)
 #endif
 
 #endif
-- 
2.14.1.821.g8fa685d3b7-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH for jk/leak-checkers] git-compat-util: make UNLEAK less error-prone
  2017-09-19 21:34       ` [PATCH for jk/leak-checkers] git-compat-util: make UNLEAK less error-prone Jonathan Tan
@ 2017-09-19 21:46         ` Jeff King
  2017-09-19 22:10           ` [PATCH for jk/leak-checkers v2] " Jonathan Tan
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff King @ 2017-09-19 21:46 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Tue, Sep 19, 2017 at 02:34:56PM -0700, Jonathan Tan wrote:

> Commit 0e5bba5 ("add UNLEAK annotation for reducing leak false
> positives", 2017-09-08) introduced an UNLEAK macro to be used as
> "UNLEAK(var);", but its existing definitions make it possible to be
> invoked as "UNLEAK(var)" (without the trailing semicolon) too.
> 
> Therefore, modify its definitions to cause a compile-time error if
> invoked without the trailing semicolon.

The patch itself looks good, but I think your commit message dances
around the real motivation: some parsers may complain about doubled
semicolons, or semicolons without an associated line of code (which is
really just a doubled semicolon, too, I guess).

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH for jk/leak-checkers v2] git-compat-util: make UNLEAK less error-prone
  2017-09-19 21:46         ` Jeff King
@ 2017-09-19 22:10           ` Jonathan Tan
  0 siblings, 0 replies; 42+ messages in thread
From: Jonathan Tan @ 2017-09-19 22:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Commit 0e5bba5 ("add UNLEAK annotation for reducing leak false
positives", 2017-09-08) introduced an UNLEAK macro to be used as
"UNLEAK(var);", but its existing definitions leave semicolons that act
as empty statements, which may cause some tools to complain.

Therefore, modify its definitions to not leave these semicolons.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
OK, here is the same patch with an updated commit message.
---
 git-compat-util.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 003e444c4..9bc15b036 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1184,9 +1184,9 @@ extern int cmd_main(int, const char **);
  */
 #ifdef SUPPRESS_ANNOTATED_LEAKS
 extern void unleak_memory(const void *ptr, size_t len);
-#define UNLEAK(var) unleak_memory(&(var), sizeof(var));
+#define UNLEAK(var) unleak_memory(&(var), sizeof(var))
 #else
-#define UNLEAK(var)
+#define UNLEAK(var) do {} while (0)
 #endif
 
 #endif
-- 
2.14.1.821.g8fa685d3b7-goog


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-19 21:03     ` Jeff King
  2017-09-19 21:34       ` [PATCH for jk/leak-checkers] git-compat-util: make UNLEAK less error-prone Jonathan Tan
@ 2017-09-20  1:45       ` Junio C Hamano
  2017-09-20  2:28         ` Jeff King
  1 sibling, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2017-09-20  1:45 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Tan, git, Martin Ågren, Stefan Beller, Ramsay Jones

Jeff King <peff@peff.net> writes:

> On Tue, Sep 19, 2017 at 01:45:52PM -0700, Jonathan Tan wrote:
>
>> The following comments are assuming that we're going to standardize on
>> UNLEAK(var); (with the semicolon).
>
> Yeah, I assumed we would. We don't have to, since this really is sort-of
> magical, but I think the code looks better with it.
>
>> On Fri, 8 Sep 2017 02:38:41 -0400
>> Jeff King <peff@peff.net> wrote:
>> 
>> > +#ifdef SUPPRESS_ANNOTATED_LEAKS
>> > +extern void unleak_memory(const void *ptr, size_t len);
>> > +#define UNLEAK(var) unleak_memory(&(var), sizeof(var));
>> 
>> I would feel better if the semicolon was omitted. I don't think it
>> matters in this particular case, though.
>
> You end up with a double semi-colon. Some linters might complain.

Yeah, and it makes

	if (some condition)
		UNLEAK(var);
	else
		do_something_else_to(var);

a syntax error.

Should have spotted during the review; sorry.

>
>> > +#else
>> > +#define UNLEAK(var)
>> 
>> I think this should be defined to be something (for example, "do {}
>> while (0)"), at least so that we have compiler errors when UNLEAK(var)
>> is used incorrectly (for example, without the semicolon) when
>> SUPPRESS_ANNOTATED_LEAKS is not defined.
>
> Seems reasonable.

Hmph, I am not so sure about this one.  But I agree that the
semicolon must go.


>
> I think both are worth doing, but the patch is already in next so we
> need to do it on top. Do you want to prepare a patch?
>
> -Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-20  1:45       ` [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives Junio C Hamano
@ 2017-09-20  2:28         ` Jeff King
  2017-09-20  5:12           ` Junio C Hamano
  0 siblings, 1 reply; 42+ messages in thread
From: Jeff King @ 2017-09-20  2:28 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Tan, git, Martin Ågren, Stefan Beller, Ramsay Jones

On Wed, Sep 20, 2017 at 10:45:05AM +0900, Junio C Hamano wrote:

> >> > +#else
> >> > +#define UNLEAK(var)
> >> 
> >> I think this should be defined to be something (for example, "do {}
> >> while (0)"), at least so that we have compiler errors when UNLEAK(var)
> >> is used incorrectly (for example, without the semicolon) when
> >> SUPPRESS_ANNOTATED_LEAKS is not defined.
> >
> > Seems reasonable.
> 
> Hmph, I am not so sure about this one.  But I agree that the
> semicolon must go.

I thought we had run into some issues with a compiler or linter
complaining about null statements before. But they _are_ pretty common,
so maybe I'm mis-remembering (or maybe it was something like the if/else
conditional you showed earlier).

At any rate, I think Jonathan's point is that writing:

  UNLEAK(foo)

will silently pass in a normal build, and only much later will somebody
run a leak-checking build and see the compile error.

Of course people adding UNLEAK()s are likely to be compiling leak-check
builds to confirm that they've silenced the warning, so maybe it's not
that important a case to catch.

-Peff

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives
  2017-09-20  2:28         ` Jeff King
@ 2017-09-20  5:12           ` Junio C Hamano
  0 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2017-09-20  5:12 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Tan, git, Martin Ågren, Stefan Beller, Ramsay Jones

Jeff King <peff@peff.net> writes:

> At any rate, I think Jonathan's point is that writing:
>
>   UNLEAK(foo)
>
> will silently pass in a normal build, and only much later will somebody
> run a leak-checking build and see the compile error.

Yeah, I think I understand that concern.

	#if SOME_CONDITION
	#define X(y) do_x(y)
	#else
	#define X(y) /* nothing */
	#endif

is something we quite often use, and I just found it a bit unusual
that all of a sudden we start to be extra careful only with this
macro.

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2017-09-20  5:12 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-05 13:01 [PATCH 0/10] towards clean leak-checker output Jeff King
2017-09-05 13:03 ` [PATCH 01/10] test-lib: --valgrind should not override --verbose-log Jeff King
2017-09-05 13:04 ` [PATCH 02/10] test-lib: set LSAN_OPTIONS to abort by default Jeff King
2017-09-05 13:04 ` [PATCH 03/10] add: free leaked pathspec after add_files_to_cache() Jeff King
2017-09-05 13:04 ` [PATCH 04/10] update-index: fix cache entry leak in add_one_file() Jeff King
2017-09-05 13:04 ` [PATCH 05/10] config: plug user_config leak Jeff King
2017-09-05 13:04 ` [PATCH 06/10] reset: make tree counting less confusing Jeff King
2017-09-05 13:04 ` [PATCH 07/10] reset: free allocated tree buffers Jeff King
2017-09-05 13:04 ` [PATCH 08/10] repository: free fields before overwriting them Jeff King
2017-09-05 13:05 ` [PATCH 09/10] set_git_dir: handle feeding gitdir to itself Jeff King
2017-09-07 19:06   ` Brandon Williams
2017-09-05 13:05 ` [PATCH 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
2017-09-05 22:05   ` Stefan Beller
2017-09-07  9:17     ` Jeff King
2017-09-07 20:38       ` Stefan Beller
2017-09-12 14:34     ` Kaartic Sivaraam
2017-09-12 15:05       ` Jeff King
2017-09-13  7:13         ` Kaartic Sivaraam
2017-09-06 17:16   ` Martin Ågren
2017-09-07  9:00     ` Jeff King
2017-09-12 13:41   ` Kaartic Sivaraam
2017-09-12 15:29     ` Jeff King
2017-09-13  6:44       ` Kaartic Sivaraam
2017-09-05 17:50 ` [PATCH 0/10] towards clean leak-checker output Martin Ågren
2017-09-05 19:02   ` Jeff King
2017-09-05 20:41     ` Martin Ågren
2017-09-06 12:39       ` Jeff King
2017-09-06  1:42     ` Junio C Hamano
2017-09-06 12:28       ` [PATCH 0/2] simplifying !RUNTIME_PREFIX Jeff King
2017-09-06 12:30         ` [PATCH 1/2] system_path: move RUNTIME_PREFIX to a sub-function Jeff King
2017-09-06 13:23           ` Johannes Schindelin
2017-09-06 13:27             ` Jeff King
2017-09-06 12:32         ` [PATCH 2/2] git_extract_argv0_path: do nothing without RUNTIME_PREFIX Jeff King
2017-09-08  6:38 ` [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives Jeff King
2017-09-19 20:45   ` Jonathan Tan
2017-09-19 21:03     ` Jeff King
2017-09-19 21:34       ` [PATCH for jk/leak-checkers] git-compat-util: make UNLEAK less error-prone Jonathan Tan
2017-09-19 21:46         ` Jeff King
2017-09-19 22:10           ` [PATCH for jk/leak-checkers v2] " Jonathan Tan
2017-09-20  1:45       ` [PATCH v2 10/10] add UNLEAK annotation for reducing leak false positives Junio C Hamano
2017-09-20  2:28         ` Jeff King
2017-09-20  5:12           ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).