From: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
To: Tanay Abhra <tanayabh@gmail.com>
Cc: git@vger.kernel.org, Ramkumar Ramachandra <artagnon@gmail.com>,
Karsten Blees <karsten.blees@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Eric Sunshine <sunshine@sunshineco.com>
Subject: Re: [PATCH v4 1/2] add `config_set` API for caching config files
Date: Wed, 02 Jul 2014 11:14:10 +0200 [thread overview]
Message-ID: <vpqoax8m8bh.fsf@anie.imag.fr> (raw)
In-Reply-To: <1404280905-26763-2-git-send-email-tanayabh@gmail.com> (Tanay Abhra's message of "Tue, 1 Jul 2014 23:01:44 -0700")
Tanay Abhra <tanayabh@gmail.com> writes:
> Add a `config_set` construct that points to an ordered set of config files
> specified by the user, each of which represents what was read and cached
> in core as hashmaps. Add two external functions `git_configset_get_value`
> and `git_configset_get_value_multi` for querying from the config sets,
> which follow `last one wins` semantic(i.e. if there are multiple matches
Space before (
> for the queried key in the files of the configset the value returned will
> the last value in the last matching file of the configset) Add type
will _be_ ?
Does this remark also apply to git_configset_get_value_multi? My
understanding was that "last one wins" applied only to
git_configset_get_value.
> Add a default `config_set`, `the_config_set` to cache all key-value pairs
I don't like the name "the_config_set". It's not the only one. Perhaps
repo_config_set? (not totally satisfactory as it does not contain only
the repo, but the repo+user+system)
What do others think?
> read from usual config files(repo specific .git/config, user wide
(You always forget space before '('...)
> ~/.gitconfig and the global /etc/gitconfig). `the_config_set` uses a
> single hashmap populated using git_config(). The reason for doing so is
> twofold, one is to honour include directives, another is to guarantee O(1)
> lookups for usual config values, as values are queried for hundred of
> times during a run of a git program.
What is the reason to deal with `the_config_set` and other config sets
differently? You're giving arguments to store `the_config_set` as a
single hashmap, but what is the reason to store others as multiple
hashmaps?
And actually, I don't completely buy your arguments: having 3 or 4
hashmaps (.git/config, ~/.gitconfig, ~/.config/git/config and
/etc/gitconfig) would be a O(4) lookup, which is still O(1), and you
could deal with include directives by having ".git/config and included
files" in a hashmap, "~/.gitconfig and included files" in a second
hashmap, ...
My guess is that the real argument is "git_config already does what I
want and I'm too lazy to change it". And I do consider lazyness as a
quality for a computer-scientist ;-).
I would personally find it much simpler to have a single hashmap. We'd
lose the ability to invalidate the cache for only a single file, but I'm
not sure it's worth the code complexity.
> +Querying For Specific Variables
> +-------------------------------
> +
> +For programs wanting to query for specific variables in a non-callback
> +manner, the config API provides two functions `git_config_get_value`
> +and `git_config_get_value_multi`.They both read values from an internal
Space after .
> +`git_config_iter` gives a way to iterate over the keys in cache. Existing
> +`git_config` callback function signature is used for iterating.
"Existing" refers to the old state. It's OK in a commit message, but
won't make sense in the future, when your non-callback API and the old
callback API will live side by side.
> +The config API also provides type specific API functions which do conversion
> +as well as retrieval for the queried variable, including:
> +
> +`git_config_get_int`::
> +Parse the retrieved value to an integer, including unit factors. Dies on
> +error; otherwise, allocates and copies the integer into the dest parameter.
> +Returns 0 on success, or 1 if no value was found.
It seems strange to me to return 1 here, and -1 in git_config_get_value
to mean the same thing.
> +`git_config_bool`::
Isn't it git_config_get_bool?
> +/* config-hash.c */
You may point to the documentation in the comment too.
> +#define CONFIG_SET_INIT { NULL, 0, 0 }
> +
> +struct config_set {
> + struct config_set_item *item;
> + unsigned int nr, alloc;
> +};
> +
> +struct config_set_item {
> + struct hashmap config_hash;
> + char *filename;
Perhaps const char *?
> +static struct hashmap *add_config_hash(const char *filename, struct config_set *cs)
> +{
> + int ret = 0;
> + struct config_set_item *ct;
> + struct config_set_cb cb;
> + ALLOC_GROW(cs->item, cs->nr + 1, cs->alloc);
> + ct = &cs->item[cs->nr++];
> + ct->filename = xstrdup(filename);
> + config_hash_init(&ct->config_hash);
> + cb.cs = cs;
> + cb.ct = ct;
> + /*
> + * When filename is "default", hashmap is constructed from the usual set of
> + * config files (i.e repo specific .git/config, user wide ~/.gitconfig and the
> + * global /etc/gitconfig), used in `the_config_set`
> + */
> + if (!strcmp(filename, "default"))
How does one read a file actually called "default" with this API?
More generally, why do you need a special-case here? I think it would be
much better to leave this part unaware of the default, and have a
separate function like create_repo_config_hash (or
create_the_config_hash) that would call git_config(...). There actually
isn't much shared code between the "default" case and the others in your
function.
> +static struct hashmap *get_config_hash(const char *filename, struct config_set *cs)
> +{
> + int i;
> + for(i = 0; i < cs->nr; i++) {
> + if (!strcmp(cs->item[i].filename, filename))
> + return &cs->item[i].config_hash;
> + }
> + return add_config_hash(filename, cs);
> +}
> +
> +static struct config_hash_entry *config_hash_find_entry(const char *key, const char* filename,
> + struct config_set *cs)
I don't get the logic here.
Either the caller explicitly manages a config_set variable like
config_set my_set = git_configset_init();
git_configset_add_file(my_set, "my-file");
git_configset_get_value(my_set, "some.variable.name");
Or there's an implicit singleton mapping files to hashmaps to allow the
user to say
git_configset_get_value("my-file", "some.variable.name");
without asking the user to explicitly manipulate config_set variables.
It seems to me that your solution is a bad compromise between both
solutions: you do require the user to manipulate config_set variables,
but you also require a filename to look for the right entry in the list.
Did you miss the end of Junio's message:
http://thread.gmane.org/gmane.comp.version-control.git/252567
?
If the user explicitly asks for a single entry in the list, then why
group them in a list? I guess the question is the same as above, and the
more I read the patch, the more I think a config_set should contain a
single hashmap.
> +static int config_hash_add_value(const char *key, const char *value,
> + struct config_set_item *ct, struct config_set *cs)
> +{
> + struct hashmap *config_hash = &ct->config_hash;
> + struct config_hash_entry *e;
> + e = config_hash_find_entry(key, ct->filename, cs);
> +
> + if (!e) {
> + e = xmalloc(sizeof(*e));
> + hashmap_entry_init(e, strhash(key));
> + e->key = xstrdup(key);
> + memset(&e->value_list, 0, sizeof(e->value_list));
> + e->value_list.strdup_strings = 1;
> + hashmap_add(config_hash, e);
> + }
> + /*
> + * Since the values are being fed by git_config*() callback mechanism, they
> + * are already normalised.
We usually speak en_US rather than en_UK => normalized.
> +int git_config_get_value(const char *key, const char **value)
> +{
> + if (!the_config_set_initialised) {
> + git_configset_add_file(&the_config_set, "default");
> + the_config_set_initialised = 1;
> + }
> + return git_configset_get_value(&the_config_set, key, value);
> +}
> +
> +const struct string_list *git_config_get_value_multi(const char *key)
> +{
> + if (!the_config_set_initialised) {
> + git_configset_add_file(&the_config_set, "default");
> + the_config_set_initialised = 1;
> + }
> + return git_configset_get_value_multi(&the_config_set, key);
> +}
These if statements could be refactored, and the_config_set_initialised
could be hidden to the caller. Just make a function get_the_config_set()
that initializes if needed and returns a pointer to the_config_set.
--
Matthieu Moy
http://www-verimag.imag.fr/~moy/
next prev parent reply other threads:[~2014-07-02 9:14 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-02 6:01 [PATCH v4 0/3] git config cache & special querying api utilizing the cache Tanay Abhra
2014-07-02 6:01 ` [PATCH v4 1/2] add `config_set` API for caching config files Tanay Abhra
2014-07-02 9:14 ` Matthieu Moy [this message]
2014-07-02 11:58 ` Tanay Abhra
2014-07-02 14:32 ` Matthieu Moy
2014-07-02 17:09 ` Junio C Hamano
2014-07-04 4:58 ` Tanay Abhra
2014-07-04 9:17 ` Matthieu Moy
2014-07-04 9:25 ` Tanay Abhra
2014-07-04 9:43 ` Matthieu Moy
2014-07-07 17:02 ` Junio C Hamano
2014-07-02 17:00 ` Junio C Hamano
2014-07-02 17:09 ` Matthieu Moy
2014-07-03 17:05 ` Tanay Abhra
2014-07-02 6:01 ` [PATCH 2/2] test-config: Add tests for the config_set API Tanay Abhra
2014-07-02 9:29 ` Matthieu Moy
2014-07-02 12:04 ` Tanay Abhra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=vpqoax8m8bh.fsf@anie.imag.fr \
--to=matthieu.moy@grenoble-inp.fr \
--cc=artagnon@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karsten.blees@gmail.com \
--cc=sunshine@sunshineco.com \
--cc=tanayabh@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).