git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: Jonathan Nieder <jrnieder@gmail.com>,
	Jeff Hostetler via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
	steadmon@google.com, avarab@gmail.com,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH v2 4/7] trace2: use system config for default trace2 settings
Date: Tue, 9 Apr 2019 11:58:01 -0400	[thread overview]
Message-ID: <a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com> (raw)
In-Reply-To: <20190403000032.GA190454@google.com>



On 4/2/2019 8:00 PM, Jonathan Nieder wrote:
> Hi,
> 
> Jeff Hostetler via GitGitGadget wrote:
> 
>> Teach git to read the system config (usually "/etc/gitconfig") for
>> default Trace2 settings.  This allows system-wide Trace2 settings to
>> be installed and inherited to make it easier to manage a collection of
>> systems.
> 
> Yay!  Thanks for writing this, and sorry for the slow review.
> 
> [...]
>> Only the system config file is used.  Trace2 config values are ignored
>> in local, global, and other config files.  Likewise, the "-c" command
>> line arguments are ignored for Trace2 values.  These limits are for
>> performance reasons.
> 
> Can you say a bit more about this?  If I'm willing to pay the
> performance cost, is there a way for me to turn on respecting other
> config files?  What is that performance cost?

Several thoughts here.  Some are performance-related and some are
startup chicken-n-egg problems.  I tried to add trace2 to the code
base in the least disruptive way possible and hopefully with minimal
side-effects on existing behaviors.  And also minimum impact on overall
performance when not in use.

So, with that in mind:

[] Trace2 should be initialized as early as and lightly as possible
    so that timers are started and it can capture events from other
    startup activities.  And so that we can tell as early as possible
    if trace2 should NOT be enabled and short-cut those calls and not
    waste time collecting unnecessary data.

[] Trace2 is initialized by calls from common-main.c:main().  This
    happens before cmd_main() is called, so we are very early in the
    startup process and very few things have been initialized.  For
    example, I had to let git_resolve_executable_dir() run so that
    we could find the location of the system config, but that's about
    it.  We have not even initialized "the_repository" yet.

[] WRT "-c var=val", this is processed by git.c:handle_options() and
    called from git.c:cmd_main() and git.c:handle_alias().  So the "-c"
    args won't be respected until after that point.

    Adding trace2 initialization after that point is a bit of a mess.
    Teaching trace2 to scan the argv for "-c" is just duplicating effort.
    And moving the "-c" processing earlier in the startup, changes the
    behavior of the "git-*" commands (which don't currently recognize
    the "-c" option).

    So that's why I'm suggesting we not respect "-c" for this.

[] By initializing trace2 inside main() we guarantee that it will get
    started.  If we wait to initialize it until after handle_options()
    returns, we miss events for commands that it handles itself (such
    as "git --exec-path" where it just prints and exits or syntax errors
    where it directly calls usage() and exits).

[] WRT to per-repo config values:

    In common-main.c:main() we have not yet discovered the .git dir,
    so repo-local config files are questionable at a very early point
    in the process startup.  Again, it comes down to whether we wait
    for gitdir discovery (and whatever file system scanning, --gitdir,
    or -C processing is required) before deciding whether to start
    trace2 or not.

    Also, per-repo trace2 config settings aren't available at the time
    of the clone, so just relying on them will leave our telemetry
    incomplete.  It's better to have a system (or maybe a user) setting
    and not a per-repo setting.

    Longer term, with submodules and plans to support them in-proc rather
    than via sub-processes, I don't think there should be any expectation
    that trace2 settings would adapt during recursive operations.  So for
    example, a top-level command might see different trace2 settings than
    a command run inside a submodule.  If submodule operations become
    in-proc, then to maintain that expected behavior we'd need to have
    per-repo trace2 settings.  (Granted, not impossible, but by saying no
    to that now, we can eliminate a possible pain point in the future.)

[] i'll add more on this topic at the bottom of this note in response to
    your per-user and includes questions.


> [...]
>> --- a/Documentation/technical/api-trace2.txt
>> +++ b/Documentation/technical/api-trace2.txt
>> @@ -117,6 +117,37 @@ values are recognized.
>>   Socket type can be either `stream` or `dgram`.  If the socket type is
>>   omitted, Git will try both.
>>   
>> +== Trace2 Settings in System Config
>> +
>> +Trace2 also reads configuration information from the system config.
>> +This is intended to help adminstrators to gather system-wide Git
>> +performance data.
>> +
>> +Trace2 only reads the system configuration, it does not read global,
>> +local, worktree, or `-c` config settings.
> 
> An additional limitation is that this doesn't appear to support
> include.* directives.  Intended?

I didn't specifically intend to support or not-support include files
here.  Frankly, the complexity of the config code makes my eyes bleed.
I'll investigate adding it.


> 
> [...]
>> --- a/t/t0210-trace2-normal.sh
>> +++ b/t/t0210-trace2-normal.sh
> [...]
>> +MOCK=./mock_system_config
>> +
>> +test_expect_success 'setup mocked /etc/gitconfig' '
>> +	git config --file $MOCK trace2.normalTarget "$(pwd)/trace.normal" &&
>> +	git config --file $MOCK trace2.normalBrief 1
>> +'
>> +
>> +test_expect_success 'using mock, normal stream, return code 0' '
>> +	test_when_finished "rm trace.normal actual expect" &&
>> +	GIT_TEST_TR2_SYSTEM_CONFIG="$MOCK" test-tool trace2 001return 0 &&
> 
> Tests run with GIT_CONFIG_NOSYSTEM=1 to protect themselves from any
> system config the user has.

Thanks for the pointer.  I'll update the tr2_sysenv_load() to check
git_system_config() before trying to use the system config file.

> 
> So this would be easier to test if we can use user-level config for
> it.

I suppose.  I think it just moves the testing problem to a trick using
$HOME, right?  But it does get rid of that GIT_TEST_TR2_SYSTEM_CONFIG
symbol, so yeah a net win.  I'll investigate.  Thanks.


> [...]
>> --- /dev/null
>> +++ b/trace2/tr2_sysenv.c
>> @@ -0,0 +1,128 @@
>> +#include "cache.h"
>> +#include "config.h"
>> +#include "dir.h"
>> +#include "tr2_sysenv.h"
>> +
>> +/*
>> + * Each entry represents a trace2 setting.
>> + * See Documentation/technical/api-trace2.txt
>> + */
>> +struct tr2_sysenv_entry {
>> +	const char *env_var_name;
>> +	const char *git_config_name;
>> +
>> +	char *value;
>> +	unsigned int getenv_called : 1;
>> +};
>> +
>> +/*
>> + * This table must match "enum tr2_sysenv_variable" in tr2_sysenv.h.
> 
> Can we deduplicate to avoid the need to match?
> 
> Perhaps using C99 array initializers would help:
> 
> 	[TR2_SYSENV_CFG_PARAM] = { ... },

Cool.  I hadn't seen that syntax before.

> 
> [...]
>> +/*
>> + * Load Trace2 settings from the system config (usually "/etc/gitconfig"
>> + * unless we were built with a runtime-prefix).  These are intended to
>> + * define the default values for Trace2 as requested by the administrator.
>> + */
>> +void tr2_sysenv_load(void)
>> +{
>> +	const char *system_config_pathname;
>> +	const char *test_pathname;
>> +
>> +	system_config_pathname = git_etc_gitconfig();
>> +
>> +	test_pathname = getenv("GIT_TEST_TR2_SYSTEM_CONFIG");
>> +	if (test_pathname) {
>> +		if (!*test_pathname || !strcmp(test_pathname, "0"))
>> +			return; /* disable use of system config */
>> +
>> +		/* mock it with given test file */
>> +		system_config_pathname = test_pathname;
>> +	}
>> +
>> +	if (file_exists(system_config_pathname))
>> +		git_config_from_file(tr2_sysenv_cb, system_config_pathname,
>> +				     NULL);
> 
> This duplicates functionality from config.c and misses some features
> along the way (e.g. support for include.*).
> 
> Would read_early_config work?  If we want a variant that doesn't
> discover_git_directory, we can change that function to handle it.
> 
> For our needs at Google, it would be very helpful to have support for
> include.* and reading settings from at least $HOME/.gitconfig in
> addition to /etc/gitconfig (even better if it supports per-repo config
> as well).  I believe it would simplify the code and tests, too.  If
> there's anything I can to help, please don't hesitate to ask.

read_early_config() might work, but it is pretty heavy and doesn't
address my above comments.

Perhaps a compromise would be to have it read just the system and
user config files (w/ includes turned on).  It shouldn't take too
much time and it shouldn't have the startup-order dependencies, I
think.

Thoughts??

> 
> Thanks,
> Jonathan
> 

Thanks,
Jeff

  reply	other threads:[~2019-04-09 15:58 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-28 13:30 [PATCH 0/4] trace2: load trace2 settings from system config Jeff Hostetler via GitGitGadget
2019-03-28 13:30 ` [PATCH 1/4] trace2: refactor setting process starting time Jeff Hostetler via GitGitGadget
2019-03-28 13:31 ` [PATCH 2/4] trace2: add absolute elapsed time to start event Jeff Hostetler via GitGitGadget
2019-03-28 13:31 ` [PATCH 3/4] trace2: find exec-dir before trace2 initialization Jeff Hostetler via GitGitGadget
2019-03-28 13:31 ` [PATCH 4/4] trace2: use system config for default trace2 settings Jeff Hostetler via GitGitGadget
2019-03-28 14:36   ` Ævar Arnfjörð Bjarmason
2019-03-28 18:50     ` Jeff Hostetler
2019-03-28 21:28   ` Josh Steadmon
2019-03-29 17:04 ` [PATCH v2 0/7] trace2: load trace2 settings from system config Jeff Hostetler via GitGitGadget
2019-03-29 17:04   ` [PATCH v2 1/7] trace2: refactor setting process starting time Jeff Hostetler via GitGitGadget
2019-03-29 17:04   ` [PATCH v2 2/7] trace2: add absolute elapsed time to start event Jeff Hostetler via GitGitGadget
2019-03-29 17:04   ` [PATCH v2 3/7] trace2: find exec-dir before trace2 initialization Jeff Hostetler via GitGitGadget
2019-03-29 17:04   ` [PATCH v2 4/7] trace2: use system config for default trace2 settings Jeff Hostetler via GitGitGadget
2019-04-01 21:00     ` Josh Steadmon
2019-04-01 21:06       ` Jeff Hostetler
2019-04-03  0:01       ` Jonathan Nieder
2019-04-03  0:00     ` Jonathan Nieder
2019-04-09 15:58       ` Jeff Hostetler [this message]
2019-03-29 17:04   ` [PATCH v2 5/7] trace2: report peak memory usage of the process Jeff Hostetler via GitGitGadget
2019-03-29 22:16     ` Ævar Arnfjörð Bjarmason
2019-04-01 21:05       ` Jeff Hostetler
2019-03-29 17:04   ` [PATCH v2 7/7] trace2: make SIDs more unique Jeff Hostetler via GitGitGadget
2019-03-29 22:12     ` Ævar Arnfjörð Bjarmason
2019-04-01 21:16       ` Jeff Hostetler
2019-03-29 17:04   ` [PATCH v2 6/7] trace2: clarify UTC datetime formatting Jeff Hostetler via GitGitGadget
2019-04-01 21:02   ` [PATCH v2 0/7] trace2: load trace2 settings from system config Josh Steadmon
2019-04-11 15:18   ` [PATCH v3 00/10] " Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 01/10] config: initialize opts structure in repo_read_config() Jeff Hostetler via GitGitGadget
2019-04-12  3:52       ` Jonathan Nieder
2019-04-15 14:34         ` Johannes Schindelin
2019-04-11 15:18     ` [PATCH v3 03/10] trace2: add absolute elapsed time to start event Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 02/10] trace2: refactor setting process starting time Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 04/10] trace2: find exec-dir before trace2 initialization Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 05/10] config: add read_very_early_config() Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 06/10] trace2: use system/global config for default trace2 settings Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 07/10] trace2: report peak memory usage of the process Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 08/10] trace2: clarify UTC datetime formatting Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 09/10] trace2: make SIDs more unique Jeff Hostetler via GitGitGadget
2019-04-11 15:18     ` [PATCH v3 10/10] trace2: update docs to describe system/global config settings Jeff Hostetler via GitGitGadget
2019-04-12  2:29     ` [PATCH v3 00/10] trace2: load trace2 settings from system config Junio C Hamano
2019-04-12 13:47       ` Jeff Hostetler
2019-04-15 20:39     ` [PATCH v4 " Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 01/10] config: initialize opts structure in repo_read_config() Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 02/10] trace2: refactor setting process starting time Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 03/10] trace2: add absolute elapsed time to start event Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 04/10] trace2: find exec-dir before trace2 initialization Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 05/10] config: add read_very_early_config() Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 06/10] trace2: use system/global config for default trace2 settings Jeff Hostetler via GitGitGadget
2019-04-27 13:43         ` SZEDER Gábor
2019-04-29 19:03           ` Jeff Hostetler
2019-04-15 20:39       ` [PATCH v4 07/10] trace2: report peak memory usage of the process Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 08/10] trace2: clarify UTC datetime formatting Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 09/10] trace2: make SIDs more unique Jeff Hostetler via GitGitGadget
2019-04-15 20:39       ` [PATCH v4 10/10] trace2: update docs to describe system/global config settings Jeff Hostetler via GitGitGadget
2019-04-29 20:14       ` [PATCH v5 00/11] trace2: load trace2 settings from system config Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 01/11] config: initialize opts structure in repo_read_config() Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 02/11] trace2: refactor setting process starting time Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 03/11] trace2: add absolute elapsed time to start event Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 04/11] trace2: find exec-dir before trace2 initialization Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 05/11] config: add read_very_early_config() Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 06/11] trace2: use system/global config for default trace2 settings Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 07/11] trace2: report peak memory usage of the process Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 09/11] trace2: make SIDs more unique Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 08/11] trace2: clarify UTC datetime formatting Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 10/11] trace2: update docs to describe system/global config settings Jeff Hostetler via GitGitGadget
2019-04-29 20:14         ` [PATCH v5 11/11] trace2: fixup access problem on /etc/gitconfig in read_very_early_config Jeff Hostetler via GitGitGadget
2019-04-29 20:21         ` [PATCH v5 00/11] trace2: load trace2 settings from system config Jeff Hostetler
2019-05-07  1:18           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    --cc=steadmon@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).