From: Tanay Abhra <tanayabh@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Ramkumar Ramachandra <artagnon@gmail.com>,
Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>,
Karsten Blees <karsten.blees@gmail.com>,
Eric Sunshine <sunshine@sunshineco.com>
Subject: Re: [PATCH v4 1/2] add `config_set` API for caching config files
Date: Thu, 03 Jul 2014 22:35:13 +0530 [thread overview]
Message-ID: <53B58D49.8010409@gmail.com> (raw)
In-Reply-To: <xmqqd2dnitm7.fsf@gitster.dls.corp.google.com>
On 7/2/2014 10:30 PM, Junio C Hamano wrote:
> Tanay Abhra <tanayabh@gmail.com> writes:
>
>> diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt
>> index 230b3a0..2c02fee 100644
>> --- a/Documentation/technical/api-config.txt
>> +++ b/Documentation/technical/api-config.txt
>> @@ -77,6 +77,75 @@ To read a specific file in git-config format, use
>> +`git_config_get_value` returns 0 on success, or -1 for no value found.
>> +
>> +`git_config_get_value_multi` allocates and returns a `string_list`
>> +containing all the values for the key passed as parameter, sorted in order
>> +of increasing priority (Note: caller has to call `string_list_clear` on
>> +the returned pointer and then free it).
>> +
>> +`git_config_get_value_multi` returns NULL for no value found.
>> +
>> +`git_config_clear` is provided for resetting and invalidating the cache.
>> +
>> +`git_config_iter` gives a way to iterate over the keys in cache. Existing
>> +`git_config` callback function signature is used for iterating.
>
> Instead of doing prose above and then enumeration below,
> consistently using the enumeration-style would make the descriptions
> of API functions easier to find and read. Also show what parameters
> each function takes. E.g.
>
Noted.
>
> A few random thoughts.
>
> - Are "a value for the variable is found" and "no value for the
> variable is found" the only possible outcome? Should somebody
> (not necessarily the calling code) be notified of an error---for
> example, if you opened a config file successfully but later found
> that the file could not be parsed due to a syntax error or an I/O
> error, is it possible the caller of this function to tell, or are
> these just some special cases of "variable not found"?
The syntactical errors when parsing the file are shown when it fails to
construct the hashmap. Whenever a caller calls for a value for the first
time, the file is read and parsed and if it fails during parsing it dies
raising a error specifying the lineno and filename.
Variable not found means that the variable is not present in the file,
nothing more. Note: the hashmap code uses git_config() parsing stack
so whatever error it raises in normal git_config() invocation, it
would be raised here too.
> - The same goes for the "multi" variant but it is a bit more grave,
> as a function that returns an "int" can later be updated to
> return different negative values to signal different kinds of
> errors, but a function that returns a pointer to string-list can
> only return one kind of NULL ;-)
>
Null pointer just means no variable found in the files. What other kind
of errors may arise?
> - For the purpose of "git_config_get_value()", what is the value
> returned for core.filemode when the configuration file says
> "[core] filemode\n", i.e. when git_config() callback would get a
> NULL for value to signal a boolean true?
NULL in value pointer with 0 return value denoting variable found.
> - Is there a reason why you need a separate and new "config_iter"?
> What does it do differently from the good-old git_config() and
> how? It does not belong to "Querying for Specific Variables"
> anyway.
>
You mentioned previously in the list for a analogue to git_config()
which supports iterating over the keys.
The link is [1] I think, gmane is not working for me currently.
http://thread.gmane.org/gmane.comp.version-control.git/252567
>> +The config API also provides type specific API functions which do conversion
>> +as well as retrieval for the queried variable, including:
>> +
>> +`git_config_get_int`::
>> +Parse the retrieved value to an integer, including unit factors. Dies on
>> +error; otherwise, allocates and copies the integer into the dest parameter.
>> +Returns 0 on success, or 1 if no value was found.
>
> "allocates and copies"???? Is a caller forced to do something like
> this?
>
> int *val;
>
> if (!git_config_get_int("int.one", &val)) {
> use(*val);
> free(val);
> }
>
> I'd expect it to be more like:
>
> int val;
> if (!git_config_get_int("int.one", &val))
> use(val);
>
Yup, you are right, my fault.
> Also, is there a reason why the "not found" signal is "1" (as
> opposed to "-1" for the lower-level get_value() API)?
>
Many of the type specific functions return -1 for wrongful parsing
like git_config_get_string and git_config_get_maybe_bool, that is
why I changed the return value for such functions.
>> +Custom Configsets
>> +-----------------
>> +
>> +A `config_set` points at an ordered set of config files, each of
>> +which represents what was read and cached in-core from a file.
>
> This over-specifies the implementation. Judging from the list of
> API functions below, an implementation is possible without having to
> keep track of what source files were fed in what order (i.e. it can
> just keep a single hash-table of read values and incrementally parse
> the new contents given via configset-add-file into it, without even
> recording the origins, and forget about the source files once it
> finishes parsing).
>
> As was pointed out during the previous review round, a stack of
> <hash, filename> tuples cannot represent the configuration when
> config-includes are involved. Also we would eventually want to be
> able to also read from non-file sources (e.g. from a blob, or a
> caller-supplied in-core strings).
>
> For the purpose of reporting errors at the point of value being
> used, I have a suspicion that you would need to associate the
> <file,line> pair with each individual value string you will keep
> after configset_add_file() parses the file. The implementation of
> the call-chain may look like:
>
>
> git_configset_add_file(cs, filename):
> file = open filename
> lineno = 0
> while read line from file:
> git_configset_parse_line(cs, filename, lineno++, line)
> close file
>
> git_configset_parse_line(cs, filename, lineno, line):
> ... this needs to implement a state machine
> ... that keeps track of what has been read halfway
> ... e.g. we may have seen "[core] crlf =\\\n"
> ... earlier, which is not a complete input yet,
> ... and now we may be looking at "auto" that lets
> ... us finally parse one item.
>
> if state in 'cs' and new 'line' gives us a complete input:
> determine key and value
> record <value, filename, lineno> for 'key' to cs.hash
> update state in 'cs'
>
> For processing "git -c section.variable=value", we probably would
> call git_configset_parse_line() on the_configset with filename
> set to "command line" or something.
>
> And then when the caller tries to actually use the value, e.g.
>
> git_configset_get_int(cs, key):
> look up <value, filename, lineno> for 'key' from the cs.hash
> if value is successfully parsed as int:
> return the parsed result
> else:
> error("not an int: '%s' (%s:%s)", value, filename, lineno)
>
Okay, lets see what I can do with it.
>
>> +It can be used to construct an in-memory cache for config files that
>> +the caller specifies. For example,
>
> This is almost good to help a reader decide if she might want to use
> it in her code, but we probably want to stress the fact that use of
> a config_set is done for a namespace separate from the main
> configuration system, e.g. ".gitmodules".
>
>> +---------------------------------------
>> +struct config_set gm_config = CONFIG_SET_INIT;
>> +int b;
>> +/* we add config files to the config_set */
>> +git_configset_add_file(&gm_config, ".gitmodules");
>> +git_configset_add_file(&gm_config, ".gitmodules_alt");
>> +
>> +if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore", &b)) {
>> +/* hack hack hack */
>> +}
>> +
>> +/* when we are done with the configset */
>> +git_configset_clear(&gm_config);
>> +----------------------------------------
>> +
>> +Configset API provides functions for the above mentioned work flow, including:
>> +
>> +`git_configset_init`::
>> +
>> +Returns an allocated and initialized struct `config_set` pointer.
>
> "allocated"??? Is a caller forced to do this, i.e. always have the
> config-set on heap?
>
> struct config_set *config = git_configset_init();
> ... use it ...
> git_configset_clear(config);
>
> I'd expect it be more like this:
>
> struct config_set config;
>
> git_configset_init(&config);
> ... use it...
> git_configset_clear(&config);
>
Yup, I prefer the second one.
Thanks for the review,
Tanay Abhra.
next prev parent reply other threads:[~2014-07-03 17:05 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-02 6:01 [PATCH v4 0/3] git config cache & special querying api utilizing the cache Tanay Abhra
2014-07-02 6:01 ` [PATCH v4 1/2] add `config_set` API for caching config files Tanay Abhra
2014-07-02 9:14 ` Matthieu Moy
2014-07-02 11:58 ` Tanay Abhra
2014-07-02 14:32 ` Matthieu Moy
2014-07-02 17:09 ` Junio C Hamano
2014-07-04 4:58 ` Tanay Abhra
2014-07-04 9:17 ` Matthieu Moy
2014-07-04 9:25 ` Tanay Abhra
2014-07-04 9:43 ` Matthieu Moy
2014-07-07 17:02 ` Junio C Hamano
2014-07-02 17:00 ` Junio C Hamano
2014-07-02 17:09 ` Matthieu Moy
2014-07-03 17:05 ` Tanay Abhra [this message]
2014-07-02 6:01 ` [PATCH 2/2] test-config: Add tests for the config_set API Tanay Abhra
2014-07-02 9:29 ` Matthieu Moy
2014-07-02 12:04 ` Tanay Abhra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53B58D49.8010409@gmail.com \
--to=tanayabh@gmail.com \
--cc=Matthieu.Moy@grenoble-inp.fr \
--cc=artagnon@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karsten.blees@gmail.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).