git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access
@ 2013-05-30 13:51 Nguyễn Thái Ngọc Duy
  2013-05-30 13:51 ` [PATCH 2/2] git.txt: document GIT_TRACE_PACKET Nguyễn Thái Ngọc Duy
  2013-06-03 20:24 ` [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access Junio C Hamano
  0 siblings, 2 replies; 6+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-05-30 13:51 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

5f44324 (core: log offset pack data accesses happened - 2011-07-06)
provides a way to observe pack access patterns via a config
switch. Setting an environment variable looks more obvious than a
config var, especially when you just need to _observe_, and more
inline with other tracing knobs we have.

Document it as it may be useful for remote troubleshooting.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git.txt |  7 +++++++
 cache.h               |  3 ---
 config.c              |  3 ---
 sha1_file.c           | 10 ++++++++++
 4 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 9e302b0..3e74440 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -832,6 +832,13 @@ for further details.
 	as a file path and will try to write the trace messages
 	into it.
 
+'GIT_TRACE_PACK_ACCESS'::
+	If this variable is set to a path, a file will be created at
+	the given path logging all accesses to any packs. For each
+	access, the pack file name and an offset in the pack is
+	recorded. This may be helpful for troubleshooting some
+	pack-related performance problems.
+
 GIT_LITERAL_PATHSPECS::
 	Setting this variable to `1` will cause Git to treat all
 	pathspecs literally, rather than as glob patterns. For example,
diff --git a/cache.h b/cache.h
index 94ca1ac..9bfd76b 100644
--- a/cache.h
+++ b/cache.h
@@ -772,9 +772,6 @@ extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
 /* global flag to enable extra checks when accessing packed objects */
 extern int do_check_packed_object_crc;
 
-/* for development: log offset of pack access */
-extern const char *log_pack_access;
-
 extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
 extern int move_temp_to_file(const char *tmpfile, const char *filename);
diff --git a/config.c b/config.c
index aefd80b..ce074d7 100644
--- a/config.c
+++ b/config.c
@@ -675,9 +675,6 @@ static int git_default_core_config(const char *var, const char *value)
 		return 0;
 	}
 
-	if (!strcmp(var, "core.logpackaccess"))
-		return git_config_string(&log_pack_access, var, value);
-
 	if (!strcmp(var, "core.autocrlf")) {
 		if (value && !strcasecmp(value, "input")) {
 			if (core_eol == EOL_CRLF)
diff --git a/sha1_file.c b/sha1_file.c
index 67e815b..7b47bdc 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -36,6 +36,8 @@ static inline uintmax_t sz_fmt(size_t s) { return s; }
 
 const unsigned char null_sha1[20];
 
+static const char *log_pack_access = "";
+
 /*
  * This is meant to hold a *small* number of objects that you would
  * want read_sha1_file() to be able to return, but yet you do not want
@@ -1956,6 +1958,14 @@ static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
 {
 	static FILE *log_file;
 
+	if (!*log_pack_access) {
+		log_pack_access = getenv("GIT_TRACE_PACK_ACCESS");
+		if (!*log_pack_access)
+			log_pack_access = NULL;
+		if (!log_pack_access)
+			return;
+	}
+
 	if (!log_file) {
 		log_file = fopen(log_pack_access, "w");
 		if (!log_file) {
-- 
1.8.2.83.gc99314b

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] git.txt: document GIT_TRACE_PACKET
  2013-05-30 13:51 [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access Nguyễn Thái Ngọc Duy
@ 2013-05-30 13:51 ` Nguyễn Thái Ngọc Duy
  2013-06-03 20:24 ` [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access Junio C Hamano
  1 sibling, 0 replies; 6+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-05-30 13:51 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

"This can help with debugging object negotiation or other protocol
issues."

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git.txt | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 3e74440..12ef7a2 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -839,6 +839,11 @@ for further details.
 	recorded. This may be helpful for troubleshooting some
 	pack-related performance problems.
 
+'GIT_TRACE_PACKET'::
+	If this variable is set, it shows a trace of all packets
+	coming in or out of a given program. This can help with
+	debugging object negotiation or other protocol issues.
+
 GIT_LITERAL_PATHSPECS::
 	Setting this variable to `1` will cause Git to treat all
 	pathspecs literally, rather than as glob patterns. For example,
-- 
1.8.2.83.gc99314b

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access
  2013-05-30 13:51 [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access Nguyễn Thái Ngọc Duy
  2013-05-30 13:51 ` [PATCH 2/2] git.txt: document GIT_TRACE_PACKET Nguyễn Thái Ngọc Duy
@ 2013-06-03 20:24 ` Junio C Hamano
  2013-06-03 22:52   ` [PATCH v2] " Nguyễn Thái Ngọc Duy
  1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2013-06-03 20:24 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> diff --git a/sha1_file.c b/sha1_file.c
> index 67e815b..7b47bdc 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -36,6 +36,8 @@ static inline uintmax_t sz_fmt(size_t s) { return s; }
>  
>  const unsigned char null_sha1[20];
>  
> +static const char *log_pack_access = "";
> +
>  /*
>   * This is meant to hold a *small* number of objects that you would
>   * want read_sha1_file() to be able to return, but yet you do not want
> @@ -1956,6 +1958,14 @@ static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
>  {
>  	static FILE *log_file;
>  
> +	if (!*log_pack_access) {
> +		log_pack_access = getenv("GIT_TRACE_PACK_ACCESS");
> +		if (!*log_pack_access)
> +			log_pack_access = NULL;
> +		if (!log_pack_access)
> +			return;
> +	}

Have you ever tested this?

Once log_pack_access goes to NULL (e.g. when it sees the empty
string it was initialized to), this new test will happily
dereference NULL.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] core: use env variable instead of config var to turn on logging pack access
  2013-06-03 20:24 ` [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access Junio C Hamano
@ 2013-06-03 22:52   ` Nguyễn Thái Ngọc Duy
  2013-06-04  6:26     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-06-03 22:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

5f44324 (core: log offset pack data accesses happened - 2011-07-06)
provides a way to observe pack access patterns via a config
switch. Setting an environment variable looks more obvious than a
config var, especially when you just need to _observe_, and more
inline with other tracing knobs we have.

Document it as it may be useful for remote troubleshooting.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 >> +     if (!*log_pack_access) {
 >> +             log_pack_access = getenv("GIT_TRACE_PACK_ACCESS");
 >> +             if (!*log_pack_access)
 >> +                     log_pack_access = NULL;
 >> +             if (!log_pack_access)
 >> +                     return;
 >> +     }
 >
 > Have you ever tested this?
 >
 > Once log_pack_access goes to NULL (e.g. when it sees the empty
 > string it was initialized to), this new test will happily
 > dereference NULL.

 My bad. I did test when GIT_TRACE_PACK_ACCESS was set, not when it
 was set to an empty string. All cases tested now.

 Documentation/git.txt |  7 +++++++
 cache.h               |  3 ---
 config.c              |  3 ---
 sha1_file.c           | 10 ++++++++++
 4 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 68f1ee6..c760918 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -838,6 +838,13 @@ for further details.
 	as a file path and will try to write the trace messages
 	into it.
 
+'GIT_TRACE_PACK_ACCESS'::
+	If this variable is set to a path, a file will be created at
+	the given path logging all accesses to any packs. For each
+	access, the pack file name and an offset in the pack is
+	recorded. This may be helpful for troubleshooting some
+	pack-related performance problems.
+
 GIT_LITERAL_PATHSPECS::
 	Setting this variable to `1` will cause Git to treat all
 	pathspecs literally, rather than as glob patterns. For example,
diff --git a/cache.h b/cache.h
index df532f8..4f41606 100644
--- a/cache.h
+++ b/cache.h
@@ -772,9 +772,6 @@ extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
 /* global flag to enable extra checks when accessing packed objects */
 extern int do_check_packed_object_crc;
 
-/* for development: log offset of pack access */
-extern const char *log_pack_access;
-
 extern int check_sha1_signature(const unsigned char *sha1, void *buf, unsigned long size, const char *type);
 
 extern int move_temp_to_file(const char *tmpfile, const char *filename);
diff --git a/config.c b/config.c
index 830ee14..e68184f 100644
--- a/config.c
+++ b/config.c
@@ -675,9 +675,6 @@ static int git_default_core_config(const char *var, const char *value)
 		return 0;
 	}
 
-	if (!strcmp(var, "core.logpackaccess"))
-		return git_config_string(&log_pack_access, var, value);
-
 	if (!strcmp(var, "core.autocrlf")) {
 		if (value && !strcasecmp(value, "input")) {
 			if (core_eol == EOL_CRLF)
diff --git a/sha1_file.c b/sha1_file.c
index 67e815b..91c3627 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -36,6 +36,8 @@ static inline uintmax_t sz_fmt(size_t s) { return s; }
 
 const unsigned char null_sha1[20];
 
+static const char *log_pack_access = "";
+
 /*
  * This is meant to hold a *small* number of objects that you would
  * want read_sha1_file() to be able to return, but yet you do not want
@@ -1956,6 +1958,14 @@ static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
 {
 	static FILE *log_file;
 
+	if (!*log_pack_access) {
+		log_pack_access = getenv("GIT_TRACE_PACK_ACCESS");
+		if (log_pack_access && !*log_pack_access)
+			log_pack_access = NULL;
+		if (!log_pack_access)
+			return;
+	}
+
 	if (!log_file) {
 		log_file = fopen(log_pack_access, "w");
 		if (!log_file) {
-- 
1.8.2.83.gc99314b

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] core: use env variable instead of config var to turn on logging pack access
  2013-06-03 22:52   ` [PATCH v2] " Nguyễn Thái Ngọc Duy
@ 2013-06-04  6:26     ` Junio C Hamano
  2013-06-04  7:48       ` Duy Nguyen
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2013-06-04  6:26 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:

>  > Have you ever tested this?
>  >
>  > Once log_pack_access goes to NULL (e.g. when it sees the empty
>  > string it was initialized to), this new test will happily
>  > dereference NULL.
>
>  My bad. I did test when GIT_TRACE_PACK_ACCESS was set, not when it
>  was set to an empty string. All cases tested now.
>
> @@ -1956,6 +1958,14 @@ static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
>  {
>  	static FILE *log_file;
>  
> +	if (!*log_pack_access) {

The first time, we will see the static empty string and come into
this block...

> +		log_pack_access = getenv("GIT_TRACE_PACK_ACCESS");
> +		if (log_pack_access && !*log_pack_access)
> +			log_pack_access = NULL;
> +		if (!log_pack_access)
> +			return;
> +	}

This may 

 (1) not find the env-var, in which case log_pack_access becomes
     NULL here, and the function returns.

 (2) find the env-var to be an empty string, in which case
     log_pack_access becomes NULL in the next statement, and the
     function returns.

 (3) find the env-var to be a non-empty string, and the function
     does not return.

And the next time around, (3) above may work fine; the first if()
will fail and we do not come in.  But in either (1) or (2), don't
you keep checking the environment every time you come here?

Oh, no, it is worse than that.  Either case you set the variable to
NULL, so the very first "if (!*log_pack_access)" would dereference
NULL.

Why not start from NULL:

    static const char *log_pack_access;

treat that NULL as "unknown" state, and avoid running getenv over
and over again by treating non-NULL value as "known"?  Perhaps like
this?

	if (!log_pack_access) {
        	/* First time: is env set? */
                log_pack_access = getenv("GIT_TRACE_PACK_ACCESS");
                if (!log_pack_access)
                	log_pack_access = "";
	}
        /* Now GIT_TRACE_PACK_ACCESS is known */
        if (!*log_pack_access)
        	return; /* not wanted */

As you can no longer rely on "config is read before we do anything
else" by switching to lazy env checking, your guard at the caller
needs to be updated, if you want to reduce unneeded call-return
overhead:

	if (!log_pack_access || *log_pack_access)
		write_pack_access_log(...);

But the guard is purely for performance, not correctness, because
the function now does the "return no-op if we know we are not
wanted" itself.

Also you no longer need to have an extern variable in environment.c

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] core: use env variable instead of config var to turn on logging pack access
  2013-06-04  6:26     ` Junio C Hamano
@ 2013-06-04  7:48       ` Duy Nguyen
  0 siblings, 0 replies; 6+ messages in thread
From: Duy Nguyen @ 2013-06-04  7:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Tue, Jun 4, 2013 at 1:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>>  > Have you ever tested this?
>>  >
>>  > Once log_pack_access goes to NULL (e.g. when it sees the empty
>>  > string it was initialized to), this new test will happily
>>  > dereference NULL.
>>
>>  My bad. I did test when GIT_TRACE_PACK_ACCESS was set, not when it
>>  was set to an empty string. All cases tested now.
>>
>> @@ -1956,6 +1958,14 @@ static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
>>  {
>>       static FILE *log_file;
>>
>> +     if (!*log_pack_access) {
>
> The first time, we will see the static empty string and come into
> this block...
>
>> +             log_pack_access = getenv("GIT_TRACE_PACK_ACCESS");
>> +             if (log_pack_access && !*log_pack_access)
>> +                     log_pack_access = NULL;
>> +             if (!log_pack_access)
>> +                     return;
>> +     }
>
> This may
>
>  (1) not find the env-var, in which case log_pack_access becomes
>      NULL here, and the function returns.
>
>  (2) find the env-var to be an empty string, in which case
>      log_pack_access becomes NULL in the next statement, and the
>      function returns.
>
>  (3) find the env-var to be a non-empty string, and the function
>      does not return.
>
> And the next time around, (3) above may work fine; the first if()
> will fail and we do not come in.  But in either (1) or (2), don't
> you keep checking the environment every time you come here?
>
> Oh, no, it is worse than that.  Either case you set the variable to
> NULL, so the very first "if (!*log_pack_access)" would dereference
> NULL.

You found it out already. The only caller does this

if (log_pack_access) write_pack_access_log(p, obj_offset);

so in (1) and (2), write_pack_access_log() will never be called again.
Originally I had "log_pack_access && !*log_pack_access" in the first
if(), but I dropped it because of the caller's condition. It's a bit
fragile though. Imagine a new callsite is added.. (to be continued)

> Why not start from NULL:
>
>     static const char *log_pack_access;
>
> treat that NULL as "unknown" state, and avoid running getenv over
> and over again by treating non-NULL value as "known"?  Perhaps like
> this?
>
>         if (!log_pack_access) {
>                 /* First time: is env set? */
>                 log_pack_access = getenv("GIT_TRACE_PACK_ACCESS");
>                 if (!log_pack_access)
>                         log_pack_access = "";

or set log_pack_access to SomePredefinedButUselessString here. I
wanted to do this but couldn't because of global variable
initialization.

>         }
>         /* Now GIT_TRACE_PACK_ACCESS is known */
>         if (!*log_pack_access)
>                 return; /* not wanted */
>
> As you can no longer rely on "config is read before we do anything
> else" by switching to lazy env checking, your guard at the caller
> needs to be updated, if you want to reduce unneeded call-return
> overhead:
>
>         if (!log_pack_access || *log_pack_access)
>                 write_pack_access_log(...);

and this turns to "if (!log_.. || log != SomePrede...)", no more
peeking into log_pack_access.

>
> But the guard is purely for performance, not correctness, because
> the function now does the "return no-op if we know we are not
> wanted" itself.

(continued here) ..so yeah this looks better.

> Also you no longer need to have an extern variable in environment.c

will fix.
--
Duy

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-06-04  7:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-30 13:51 [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access Nguyễn Thái Ngọc Duy
2013-05-30 13:51 ` [PATCH 2/2] git.txt: document GIT_TRACE_PACKET Nguyễn Thái Ngọc Duy
2013-06-03 20:24 ` [PATCH 1/2] core: use env variable instead of config var to turn on logging pack access Junio C Hamano
2013-06-03 22:52   ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2013-06-04  6:26     ` Junio C Hamano
2013-06-04  7:48       ` Duy Nguyen

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).