From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
git@vger.kernel.org, John Cai <johncai86@gmail.com>
Subject: Re: [PATCH v3 4/4] Documentation: add lint-fsck-msgids
Date: Fri, 28 Oct 2022 04:04:12 +0200 [thread overview]
Message-ID: <221028.86sfj88xdq.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <Y1su99Kc0ex1W7bX@coredump.intra.peff.net>
On Thu, Oct 27 2022, Jeff King wrote:
> On Wed, Oct 26, 2022 at 01:35:43PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > (I had a similar thought after getting hit on the head by the recent
>> > t0450-txt-doc-vs-help.sh).
>>
>> Sorry about that. FWIW I've wanted to assert that for a while, and to do
>> it by e.g. having the doc *.txt blurbs generated from running "$buildin
>> -h" during the build.
>>
>> But when I suggested that I gathered That Junio wasn't up for that
>> approach, it does have its downsides thorugh. E.g. to build the docs
>> you'd now have to compile C code, and e.g. that git-scm.com publisher
>> written in Ruby would have to compile the code to generate its docs.
>
> Yes, it would definitely break the git-scm.com importer. It might be
> possible to make it work by actually running "make man" (or maybe even
> some partial targets) in a local repo. The nightly update job pulls all
> of the data using GitHub's API, but it does run in a heroku dyno that
> has git and gcc. Doing a shallow clone of the tag to build is probably
> more expensive, but the cost of an actual build isn't that important.
> 99% of the scheduled job runs are noops because there's no new version
> of Git to build manpages for; as long as those remain cheap-ish, we're
> OK.
>
> I also think in the long run that the site should move to a system that
> builds off a local repo, which we can trigger manually or via GitHub
> Actions. But that doesn't exist yet, and I don't think anyone's actively
> working on it.
>
> I also think it would be nice if the git-scm.com system relied more on
> "make man USE_ASCIIDOCTOR=1", possibly by hooking into
> asciidoctor-extensions.rb or similar, rather than munging files in an
> ad-hoc manner. But that's also a big change that nobody is actively
> working on.
>
> All of which is to say that yes, having docs depend on C code will cause
> work on the site, but it may be work that takes us in the right
> direction.
Good to know.
>> Or we could do it the other way around, and scape the *.txt for the *.c
>> generation, but then we need to run a new script for building
>> builtin/*.o. Also possible, and I think eventually both are better than
>> what t0450-txt-doc-vs-help.sh's doing now, but that was a less invasive
>> change than both...
>
> I think this particular case is tricky in that direction, because it's a
> big set of dependencies that aren't necessarily one-to-one. E.g.,
> builtin/log.c needs to depend on git-log.txt, but also on git-show.txt
> and git-format-patch.txt.
I was thinking of just a generated usage-strings.c or whatever. I.e. one
file with every usage string in it. Then you don't need to hunt down
which thing goes in which file. We'll just have a variable named
"builtin_format_patch_usage". Include it in "builtin.h" and it doesn't
matter if that's in builtin/log.c or builtin/format-patch.c.
It does mean you need to rebuild the C code for other built-ins every
time one of the SYNOPSIS sections changes, but doesn't happen too often.
>> >> +# Unfortunately our dependency management of generated headers used
>> >> +# from within other headers suck, so we'll need to list every user of
>> >> +# fsck.h here, but not too bad...
>> >> +FSCK_MSGIDS_H_BUILTINS = fsck index-pack mktag receive-pack unpack-objects
>> >> +$(foreach f,$(FSCK_MSGIDS_H_BUILTINS:%=builtin/%),$f.sp $f.s $f.o): fsck-msgids.h
>> >> +FSCK_MSGIDS_H_LIBS = fetch-pack fsck
>> >> +$(foreach f,$(FSCK_MSGIDS_H_LIBS),$f.sp $f.s $f.o): fsck-msgids.h
>> >
>> > I don't understand the "used within other headers" part here. Computed
>> > dependencies will get this right. It's only the initial build (before we
>> > have any computed dependencies) that needs to know that the C files in
>> > question depend on the generated file. But that is true whether they do
>> > so directly or indirectly.
>>
>> I forgot the full story, but luckily I wrote it down in the WIP commits
>> :) FWIW if you want to scour that it's mainly:
>>
>> https://github.com/avar/git/commit/a00f1cb9ea5 # add advice-type.h
>> https://github.com/avar/git/commit/9e080923a11 # generate 'struct advice
>
> That seems like the same problem to me. It's just that if something is
> in cache.h, then it's needed by basically everything, and so the
> dependency list is big.
I think I either did or was planning to take it out of cache.h as part
of that, we put way too much in cache.h.
Even advice.c depends on cache.h for its advice.h *sigh*.
Trying it just now putting advice.h in builtin.h instead leaves 10
top-level files not-compiling (and they just need the header).
I think it's fine to include common utilties in our "included by
everything" headers, but if we've got ~500 *.c files and something is
only needed by ~20 of them (as in this case) we should probably just
include it in those 20.
>> Anyway, in partial summary, why (and sorry this got so long):
>>
>> * Yes, once we build e.g. fsck.o we have the full dependency tree,
>> yay....
>> * ...but only for gcc & clang, but we support other compilers.
>> * ...for other compilers (or gcc & clang without the dep detection
>> enabled) what should this do:
I didn't summarize this well enough, I should have said that whether you
have computed deps or not that.....
>>
>> make file-that-does-not-use-generated-header.o
>>
>> It sucks a bit to have e.g. every *.o depend on command-list.h, when
>> we only use it in 1-2 places, ditto the other headers.
> But that is already true of non-generated headers. If your system
> doesn't support computed deps, then all objects depend on all headers.
... this does not build e.g. command-list.h:
make clean
make grep.o
But this does:
make clean
make help.o
Because we've manually declared that.
[Re-arranging a bit]
> That all comes from b8ba629264 (Makefile: fold MISC_H into LIB_H,
> 2012-06-20).
Yes, but you get that for free whether you have computed deps or
not. I.e. your compiler is about to build the code anyway.
But it wasn't about to run a bunch of shellscripts to generate headers
it doesn't actually need.
> Yes, that sucks for you. But almost nobody actively develops on such a
> system, and people building Git to use rarely notice (because they are
> doing an initial build, or jumping versions, both of which generally
> require recompilation anyway).
I guess I'm "almost nobody" :) Not because I don't use computed deps,
but because I tend to e.g. do large interactive rebases with:
git rebase -i 'make grep.o'
Or some other subset of our built assets/objects.
On that front no one thing is the end of the world, i.e. sure, if we run
some shellscript to make a needed-header.h that takes 10ms we can live
with it.
But it adds up, which is one reason I've been slowly trickling in
patches to optimize various common parts of the Makefile & those
scripts.
I think before I started that the fixed cost even to have just make
decide it had nothing to do was often 500ms-1s.
I think that's down to 1/4 or so of that on "master", and I've got some
local patches to get it consistently down to <50ms.
It adds up, especially when combined with ccache & the like. I.e. if i'm
testing 10 commits I happen to have cached a "rebase -i" goes from
noticeably slow (~10s) to not being so slow as to be distracting.
Anyway, all of which is to say that that's one thing I was aiming for:
If I was going to submit patches for new generated headers to find some
good trade-off between complexity and over-declaring dependencies.
>> * We also have e.g. "make hdr-check", which works just fine now, but
>> it's becausue we have no *.h file that uses another generated *.h
>> file.
>
> I'm not that surprised. Probably it should depend on $(GENERATED_H), and
> possibly even be checking those files too.
>
>> * I may be wrong (I'm just looking at some old "here be dragons" note
>> of mine), but I think there's also an edge case where the dependency
>> graph of .depend.d/* is correct *once you've always had it*, but if a
>> file now e.g. depends on foo.h, and I make foo.h include a new
>> foo-generated.h things go boom.
>
> That should work because the timestamp on foo.h will have been updated,
> causing anything that includes it to rebuild (and thus updating its
> computed dependencies to include foo-generated.h).
Yes, in general it should work, maybe there's some Heisenbug there. I
can't recall or reproduce it, sorry.
But that's a good thing, maybe there's no issue :)
next prev parent reply other threads:[~2022-10-28 3:01 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-24 15:00 [PATCH 0/2] Document fsck msg ids John Cai via GitGitGadget
2022-10-24 15:00 ` [PATCH 1/2] fsck: remove the unused BAD_TAG_OBJECT John Cai via GitGitGadget
2022-10-24 16:57 ` Junio C Hamano
2022-10-24 18:16 ` Junio C Hamano
2022-10-24 18:33 ` John Cai
2022-10-24 23:39 ` Jeff King
2022-10-24 15:00 ` [PATCH 2/2] fsck: document msg-id John Cai via GitGitGadget
2022-10-24 17:33 ` Junio C Hamano
2022-10-25 9:41 ` Ævar Arnfjörð Bjarmason
2022-10-25 16:07 ` Junio C Hamano
2022-10-24 18:51 ` [PATCH 0/2] Document fsck msg ids Junio C Hamano
2022-10-25 3:17 ` [PATCH v2 " John Cai via GitGitGadget
2022-10-25 3:17 ` [PATCH v2 1/2] fsck: remove the unused BAD_TAG_OBJECT John Cai via GitGitGadget
2022-10-25 3:17 ` [PATCH v2 2/2] fsck: document msg-id John Cai via GitGitGadget
2022-10-25 4:30 ` [PATCH v2 0/2] Document fsck msg ids Junio C Hamano
2022-10-25 4:40 ` Junio C Hamano
2022-10-25 5:12 ` [PATCH] Documentation: add lint-fsck-msgids Junio C Hamano
2022-10-25 22:42 ` [PATCH v3 0/4] document fsck error message ids Junio C Hamano
2022-10-25 22:42 ` [PATCH v3 1/4] fsck: remove the unused BAD_TAG_OBJECT Junio C Hamano
2022-10-25 22:42 ` [PATCH v3 2/4] fsck: remove the unused MISSING_TREE_OBJECT Junio C Hamano
2022-10-25 22:42 ` [PATCH v3 3/4] fsck: document msg-id Junio C Hamano
2022-10-25 22:42 ` [PATCH v3 4/4] Documentation: add lint-fsck-msgids Junio C Hamano
2022-10-26 2:43 ` Ævar Arnfjörð Bjarmason
2022-10-26 5:34 ` Jeff King
2022-10-26 6:46 ` Junio C Hamano
2022-10-26 11:35 ` Ævar Arnfjörð Bjarmason
2022-10-28 1:23 ` Jeff King
2022-10-28 2:04 ` Ævar Arnfjörð Bjarmason [this message]
2022-10-28 5:32 ` Jeff King
2022-10-28 10:41 ` Ævar Arnfjörð Bjarmason
2022-10-28 3:02 ` John Cai
2022-10-28 3:11 ` Ævar Arnfjörð Bjarmason
2022-10-28 5:32 ` Junio C Hamano
2022-10-28 5:37 ` Jeff King
2022-10-28 5:35 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=221028.86sfj88xdq.gmgdl@evledraar.gmail.com \
--to=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=johncai86@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).