git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Glen Choo" <chooglen@google.com>,
	"Jonathan Tan" <jonathantanmy@google.com>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Emily Shaffer" <nasamuffin@google.com>,
	"Derrick Stolee" <derrickstolee@github.com>,
	"Calvin Wan" <calvinwan@google.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [RFC PATCH 0/5] bypass config.c global state with configset
Date: Fri, 17 Mar 2023 06:01:05 +0100	[thread overview]
Message-ID: <RFC-cover-0.5-00000000000-20230317T042408Z-avarab@gmail.com> (raw)
In-Reply-To: <pull.1463.v2.git.git.1678925506.gitgitgadget@gmail.com>

On Thu, Mar 16 2023, Glen Choo via GitGitGadget wrote:

> After reflecting on Ævar's responses on v1, I'm fairly convinced that
> "struct config_reader" shouldn't exist in the long term. I've written my
> thoughts on a good long term direction in the "Leftover bits" section. Based
> on that, I've also updated my WIP libification patches [1] to remove "struct
> config_reader" from the library interface, and think it looks a lot better
> as a result.

That libification url
(https://github.com/git/git/compare/master...chooglen:git:config-lib-parsing)
doesn't work for me, and I didn't find a branch with that name in your
published repo. So maybe you've already done all this
post-libification work...

> = Leftover bits
>
> We still need a global "the_reader" because config callbacks are reading
> auxiliary information about the config (e.g. line number, file name) via
> global functions (e.g. current_config_line(), current_config_name()). This
> is either because the callback uses this info directly (like
> builtin/config.c printing the filename and scope of the value) or for error
> reporting (like git_parse_int() reporting the filename of the value it
> failed to parse).
>
> If we had a way to plumb the state from "struct config_reader" to the config
> callback functions, we could initialize "struct config_reader" in the config
> machinery whenever we read config (instead of asking the caller to
> initialize "struct config_reader" themselves), and config reading could
> become a thread-safe operation. There isn't an obvious way to plumb this
> state to config callbacks without adding an additional arg to config_fn_t
> and incurring a lot of churn, but if we start replacing "config_fn_t" with
> the configset API (which we've independently wanted for some time), this may
> become feasible.

...in any case. This RFC expands a bit on my comments on the v1 (at
[1] and upthread). It doesn't get all the way there, but with the
small change in 5/5 we've gotten rid of current_config_line(), the
1-4/5 are trivial pre-refactorings to make that diff smaller
(e.g. moving the "struct key_value_info" around in config.h).

Maybe it still makes sense to go for this "the_reader" intermediate
step, but I can't help but think that we could just go for it all in
one leap, and that you've just got stuck on thinking that you needed
to change "config_fn_t" for all its callers.

As the 5/5 here shows we have various orthagonal uses of the
"config_fn_t" in config.c, and can just implement a new callback type
for the edge cases where we need the file & line info.

This still leave the current_config_name() etc, which
e.g. builtin/config.c still uses. In your series you've needed to add
the new "reader" parameter for everything from do_config_from(), but
if we're doing that can't we instead just go straight to passing a
"struct key_value_info *" (perhaps with an added "name" field) all the
way down, replacing "cf->linenr" etc?

Instead you end up extending "the_reader" everywhere, including to
e.g. configset_iter, which I think as the 5/5 here shows isn't needed,
but maybe I've missed something.

Similarly, you mention git_parse_int() wanting to report a filename
and/or line number. I'm aware that it can do that, but it doesn't do
so in the common case, e.g.:

	git -c format.filenameMaxLength=abc log
	fatal: bad numeric config value 'abc' for 'format.filenamemaxlength': invalid unit

And the same goes for writing it to e.g. ~/.gitconfig. It's only if
you use "git config --file" or similar that we'll report a filename.

So just as with the current_config_line() I wonder if you just grepped
for e.g. git_config_int() and thought because we have a lot of users
of it that all of them would require this data, but for e.g. this
log.c caller (and most or all of the others) we'll be reading the
normal config, and aren't getting any useful info from
die_bad_number() that we wouldn't get from an error function that
didn't need the "linenr" etc.

> And if we do this, "struct config_reader" itself will probably become
> obsolete, because we'd be able to plumb only the relevant state for the
> current operation, e.g. if we are parsing a config file, we'd pass only the
> config file parsing state, instead of "struct config_reader", which also
> contains config set iterating state. In such a scenario, we'd probably want
> to pass "struct key_value_info" to the config callback, since that's all the
> callback should be interested in anyway. Interestingly, this was proposed by
> Junio back in [4], and we didn't do this back then out of concern for the
> churn (just like in v1).

I think we can make it even simpler than that, and from playing around
with builtin/config.c a bit after the 5/5 here I got a POC working
(but am not posting it here, didn't have time to clean it up).

We can just make config_set_callback() and configset_iter()
non-static, so e.g. the builtin/config.c caller that implements
"--show-origin" can keep its config_with_options(...) call, but
instead of "streaming" the config, it'll buffer it up into a
configset.

The advantage of that is that with the configset API we'll get a
"struct key_value_info *" for free on the other end. I.e. we'll
configset_iter() with a fn=NULL, but with a defined "config_kvi_fn_t",
which 5/5 is adding.

But I haven't done that work (and am not planning to finish this), but
maybe this helps.

We'll also need to track the equivalent of "cf->linenr" etc. while we
do the actual initial parse. I think it might be simpler to start by
converting those "linenr" to a "struct key_value_info" that's placed
in the "config_source" right away.

I.e. when we pass it to the error handlers we'll need to give them
access to the "linenr", but we don't want to provide e.g. "eof" (which
is internal-only state).

I wonder how much else you're converting here is actually dead code in
the end (or can trivially be made dead). E.g. the
current_config_line() change you make in 1/8 is never going to use the
"cf_global" if combined with the 5/5 change here.

1. https://lore.kernel.org/git/230308.867cvrziac.gmgdl@evledraar.gmail.com/

Ævar Arnfjörð Bjarmason (5):
  config.h: move up "struct key_value_info"
  config.c: use "enum config_origin_type", not "int"
  config API: add a config_origin_type_name() helper
  config.c: refactor configset_iter()
  config API: add and use a repo_config_kvi()

 builtin/remote.c       | 11 +++----
 config.c               | 67 ++++++++++++++++++++++++++----------------
 config.h               | 29 +++++++++++++-----
 t/helper/test-config.c | 13 ++++----
 t/t5505-remote.sh      |  7 +++--
 5 files changed, 80 insertions(+), 47 deletions(-)

-- 
2.40.0.rc1.1034.g5867a1b10c5


  parent reply	other threads:[~2023-03-17  5:01 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-01  0:38 [PATCH 0/6] [RFC] config.c: use struct for config reading state Glen Choo via GitGitGadget
2023-03-01  0:38 ` [PATCH 1/6] config.c: plumb config_source through static fns Glen Choo via GitGitGadget
2023-03-03 18:02   ` Junio C Hamano
2023-03-01  0:38 ` [PATCH 2/6] config.c: don't assign to "cf" directly Glen Choo via GitGitGadget
2023-03-01  0:38 ` [PATCH 3/6] config.c: create config_reader and the_reader Glen Choo via GitGitGadget
2023-03-03 18:05   ` Junio C Hamano
2023-03-01  0:38 ` [PATCH 4/6] config.c: plumb the_reader through callbacks Glen Choo via GitGitGadget
2023-03-08  9:54   ` Ævar Arnfjörð Bjarmason
2023-03-08 18:00     ` Glen Choo
2023-03-08 18:07       ` Junio C Hamano
2023-03-01  0:38 ` [PATCH 5/6] config.c: remove current_config_kvi Glen Choo via GitGitGadget
2023-03-06 20:12   ` Calvin Wan
2023-03-01  0:38 ` [PATCH 6/6] config.c: remove current_parsing_scope Glen Choo via GitGitGadget
2023-03-06 19:57 ` [PATCH 0/6] [RFC] config.c: use struct for config reading state Jonathan Tan
2023-03-06 21:45   ` Glen Choo
2023-03-06 22:38     ` Jonathan Tan
2023-03-08 10:32       ` Ævar Arnfjörð Bjarmason
2023-03-08 23:09         ` Glen Choo
2023-03-07 11:57 ` Ævar Arnfjörð Bjarmason
2023-03-07 18:22   ` Glen Choo
2023-03-07 18:36     ` Ævar Arnfjörð Bjarmason
2023-03-07 19:36     ` Junio C Hamano
2023-03-07 22:53       ` Glen Choo
2023-03-08  9:17         ` Ævar Arnfjörð Bjarmason
2023-03-08 23:18           ` Glen Choo
2023-03-16  0:11 ` [PATCH v2 0/8] " Glen Choo via GitGitGadget
2023-03-16  0:11   ` [PATCH v2 1/8] config.c: plumb config_source through static fns Glen Choo via GitGitGadget
2023-03-16 21:16     ` Jonathan Tan
2023-03-16  0:11   ` [PATCH v2 2/8] config.c: don't assign to "cf_global" directly Glen Choo via GitGitGadget
2023-03-16 21:18     ` Jonathan Tan
2023-03-16 21:31       ` Junio C Hamano
2023-03-16 22:56       ` Glen Choo
2023-03-16  0:11   ` [PATCH v2 3/8] config.c: create config_reader and the_reader Glen Choo via GitGitGadget
2023-03-16 21:22     ` Jonathan Tan
2023-03-16  0:11   ` [PATCH v2 4/8] config.c: plumb the_reader through callbacks Glen Choo via GitGitGadget
2023-03-16  0:11   ` [PATCH v2 5/8] config.c: remove current_config_kvi Glen Choo via GitGitGadget
2023-03-16  0:11   ` [PATCH v2 6/8] config.c: remove current_parsing_scope Glen Choo via GitGitGadget
2023-03-16  0:11   ` [PATCH v2 7/8] config: report cached filenames in die_bad_number() Glen Choo via GitGitGadget
2023-03-16 22:22     ` Jonathan Tan
2023-03-16 23:05       ` Glen Choo
2023-03-16  0:11   ` [PATCH v2 8/8] config.c: rename "struct config_source cf" Glen Choo via GitGitGadget
2023-03-16  0:15   ` [PATCH v2 0/8] config.c: use struct for config reading state Glen Choo
2023-03-16 22:29   ` Jonathan Tan
2023-03-17  5:01   ` Ævar Arnfjörð Bjarmason [this message]
2023-03-17  5:01     ` [RFC PATCH 1/5] config.h: move up "struct key_value_info" Ævar Arnfjörð Bjarmason
2023-03-17  5:01     ` [RFC PATCH 2/5] config.c: use "enum config_origin_type", not "int" Ævar Arnfjörð Bjarmason
2023-03-17  5:01     ` [RFC PATCH 3/5] config API: add a config_origin_type_name() helper Ævar Arnfjörð Bjarmason
2023-03-17  5:01     ` [RFC PATCH 4/5] config.c: refactor configset_iter() Ævar Arnfjörð Bjarmason
2023-03-17  5:01     ` [RFC PATCH 5/5] config API: add and use a repo_config_kvi() Ævar Arnfjörð Bjarmason
2023-03-17 17:17       ` Junio C Hamano
2023-03-17 20:59       ` Jonathan Tan
2023-03-17 16:21     ` [RFC PATCH 0/5] bypass config.c global state with configset Junio C Hamano
2023-03-17 16:28     ` Glen Choo
2023-03-17 19:20     ` Glen Choo
2023-03-17 23:32       ` Glen Choo
2023-03-29 11:53       ` Ævar Arnfjörð Bjarmason
2023-03-28 17:51   ` [PATCH v3 0/8] config.c: use struct for config reading state Glen Choo via GitGitGadget
2023-03-28 17:51     ` [PATCH v3 1/8] config.c: plumb config_source through static fns Glen Choo via GitGitGadget
2023-03-28 17:51     ` [PATCH v3 2/8] config.c: don't assign to "cf_global" directly Glen Choo via GitGitGadget
2023-03-28 17:51     ` [PATCH v3 3/8] config.c: create config_reader and the_reader Glen Choo via GitGitGadget
2023-03-29 10:41       ` Ævar Arnfjörð Bjarmason
2023-03-29 18:57         ` Junio C Hamano
2023-03-29 20:02           ` Glen Choo
2023-03-30 17:51         ` Glen Choo
2023-03-28 17:51     ` [PATCH v3 4/8] config.c: plumb the_reader through callbacks Glen Choo via GitGitGadget
2023-03-28 17:51     ` [PATCH v3 5/8] config.c: remove current_config_kvi Glen Choo via GitGitGadget
2023-03-28 17:51     ` [PATCH v3 6/8] config.c: remove current_parsing_scope Glen Choo via GitGitGadget
2023-03-28 17:51     ` [PATCH v3 7/8] config: report cached filenames in die_bad_number() Glen Choo via GitGitGadget
2023-03-28 17:51     ` [PATCH v3 8/8] config.c: rename "struct config_source cf" Glen Choo via GitGitGadget
2023-03-28 18:00     ` [PATCH v3 0/8] config.c: use struct for config reading state Glen Choo
2023-03-28 20:14       ` Junio C Hamano
2023-03-28 21:24         ` Glen Choo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=RFC-cover-0.5-00000000000-20230317T042408Z-avarab@gmail.com \
    --to=avarab@gmail.com \
    --cc=calvinwan@google.com \
    --cc=chooglen@google.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=nasamuffin@google.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).