* RFC on packfile URIs and .gitmodules check
From: Jonathan Tan @ 2021-01-15 23:43 UTC
To: git; +Cc: peff, Jonathan Tan

Someone at $DAYJOB noticed that if a .gitmodules-containing tree and the
.gitmodules blob itself are sent in 2 separate packfiles during a fetch
(which can happen when packfile URIs are used), transfer.fsckobjects
causes the fetch to fail. You can reproduce it as follows (as of the
time of writing):

  $ git -c fetch.uriprotocols=https -c transfer.fsckobjects=true clone https://chromium.googlesource.com/chromiumos/codesearch
  Cloning into 'codesearch'...
  remote: Total 2242 (delta 0), reused 2242 (delta 0)
  Receiving objects: 100% (2242/2242), 1.77 MiB | 4.62 MiB/s, done.
  error: object 1f155c20935ee1154a813a814f03ef2b3976680f: gitmodulesMissing: unable to read .gitmodules blob
  fatal: fsck error in pack objects
  fatal: index-pack failed

This happens because the fsck part is currently being done in
index-pack, which operates on one pack at a time. When index-pack sees
the tree, it runs fsck on it (like any other object), and the fsck
subsystem remembers the .gitmodules target (specifically, in
gitmodules_found in fsck.c). Later, index-pack runs fsck_finish() which
checks if the target exists, but it doesn't, so it reports the failure.

One option is for fetch to do its own pass of checking all downloaded
objects once all packfiles have been downloaded, but that seems wasteful
as all trees would have to be re-inflated.

Another option is to do it within the connectivity check instead - so,
update rev-list and the object walking mechanism to be able to detect
.gitmodules in trees and fsck the target blob whenever such an entry
occurs. This has the advantage that there is no extra re-inflation,
although it might be strange to have object walking be able to fsck.

The simplest solution would be to just relax this - check the blob if it
exists, but if it doesn't, it's OK. Some things in favor of this
solution:

 - This is something we already do in the partial clone case (although
   it could be argued that in this case, we're already trusting the
   server for far more than .gitmodules, so just because it's OK in the
   partial clone case doesn't mean that it's OK in the regular case).

 - Also, the commit message for this feature (from ed8b10f631 ("fsck: check
   .gitmodules content", 2018-05-21)) gives a rationale of a newer
   server being able to protect older clients.
   - Servers using receive-pack (instead of fetch-pack) to obtain
     objects would still be protected, since receive-pack still only
     accepts one packfile at a time (and there are currently no plans
     to expand this).
   - Also, malicious .gitmodules files could still be crafted that pass
     fsck checking - for example, by containing a URL (of another
     server) that refers to a repo with a .gitmodules that would fail
     fsck.

So I would rather go with just relaxing the check, but if consensus is
that we should still do it, I'll investigate doing it in the
connectivity check.

^ permalink raw reply [flat|nested] 229+ messages in thread
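The failure mode described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions, not how index-pack is implemented: packs are modeled as plain dicts, and the function name `fsck_one_pack` and the object IDs are made up for illustration. Only the names gitmodules_found and fsck_finish() come from the message above.

```python
def fsck_one_pack(pack, already_in_store=frozenset()):
    """Toy model of index-pack running fsck on ONE pack.

    `pack` maps an object ID to a dict describing the object. This data
    model is hypothetical and exists only for this sketch.
    """
    # Plays the role of gitmodules_found: remember every blob that some
    # tree entry named ".gitmodules" points at.
    gitmodules_found = set()
    for oid, obj in pack.items():
        if obj["type"] != "tree":
            continue
        for name, target_oid in obj["entries"]:
            if name == ".gitmodules":
                gitmodules_found.add(target_oid)

    # Plays the role of fsck_finish(): every remembered target must be
    # readable from what is visible right now.
    visible = set(pack) | set(already_in_store)
    return sorted(gitmodules_found - visible)


# The inline pack carries the tree; the blob lives in a pack served by URI.
inline_pack = {
    "tree1": {"type": "tree", "entries": [(".gitmodules", "blob1")]},
}
uri_pack = {
    "blob1": {"type": "blob",
              "data": '[submodule "x"]\n\turl = https://example.com/x\n'},
}

# Checked on its own, the inline pack reports the blob as missing.
print(fsck_one_pack(inline_pack))
# If the other pack were already available, nothing would be reported.
print(fsck_one_pack(inline_pack, already_in_store=uri_pack))
```

The second call shows why the problem is one of ordering and scope rather than of the objects themselves: the same tree passes once the blob is visible to the check.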
* Re: RFC on packfile URIs and .gitmodules check
From: Junio C Hamano @ 2021-01-16 0:30 UTC
To: Jonathan Tan; +Cc: git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> Someone at $DAYJOB noticed that if a .gitmodules-containing tree and the
> .gitmodules blob itself are sent in 2 separate packfiles during a fetch
> (which can happen when packfile URIs are used), transfer.fsckobjects
> causes the fetch to fail. You can reproduce it as follows (as of the
> time of writing):
>
>   $ git -c fetch.uriprotocols=https -c transfer.fsckobjects=true clone https://chromium.googlesource.com/chromiumos/codesearch
>   Cloning into 'codesearch'...
>   remote: Total 2242 (delta 0), reused 2242 (delta 0)
>   Receiving objects: 100% (2242/2242), 1.77 MiB | 4.62 MiB/s, done.
>   error: object 1f155c20935ee1154a813a814f03ef2b3976680f: gitmodulesMissing: unable to read .gitmodules blob
>   fatal: fsck error in pack objects
>   fatal: index-pack failed
>
> This happens because the fsck part is currently being done in
> index-pack, which operates on one pack at a time. When index-pack sees
> the tree, it runs fsck on it (like any other object), and the fsck
> subsystem remembers the .gitmodules target (specifically, in
> gitmodules_found in fsck.c). Later, index-pack runs fsck_finish() which
> checks if the target exists, but it doesn't, so it reports the failure.

Is this because the gitmodules blob is contained in the base image
served via the pack URI mechanism, and the "dynamic" packfile for
the latest part of the history refers to the gitmodules file that is
unchanged, hence the latter one lacks it?
> Another option is to do it within the connectivity check instead - so,
> update rev-list and the object walking mechanism to be able to detect
> .gitmodules in trees and fsck the target blob whenever such an entry
> occurs. This has the advantage that there is no extra re-inflation,
> although it might be strange to have object walking be able to fsck.
>
> The simplest solution would be to just relax this - check the blob if it
> exists, but if it doesn't, it's OK. Some things in favor of this
> solution:
>
>  - This is something we already do in the partial clone case (although
>    it could be argued that in this case, we're already trusting the
>    server for far more than .gitmodules, so just because it's OK in the
>    partial clone case doesn't mean that it's OK in the regular case).
>
>  - Also, the commit message for this feature (from ed8b10f631 ("fsck: check
>    .gitmodules content", 2018-05-21)) gives a rationale of a newer
>    server being able to protect older clients.
>    - Servers using receive-pack (instead of fetch-pack) to obtain
>      objects would still be protected, since receive-pack still only
>      accepts one packfile at a time (and there are currently no plans
>      to expand this).
>    - Also, malicious .gitmodules files could still be crafted that pass
>      fsck checking - for example, by containing a URL (of another
>      server) that refers to a repo with a .gitmodules that would fail
>      fsck.
>
> So I would rather go with just relaxing the check, but if consensus is
> that we should still do it, I'll investigate doing it in the
> connectivity check.

You've listed two possible solutions, i.e.

 (1) punt and declare that we assume a missing and uncheckable blob
     is OK,

 (2) defer the check after transfer completes.

Between the two, my gut feeling is that the latter is preferable.
If we assume a missing and uncheckable one is OK, then even if a
blob is available to be checked, there is not much point in
checking, no?
As long as the quarantine of incoming pack works correctly, streaming the incoming packdata (and packfile downloaded out of line via a separate mechanism like pack URI) to index-pack that does not check to complete the transfer, with a separate step to check the sanity of these packs as a whole, should not harm the repository even if it is interrupted in the middle, after transfer is done but before checking says it is OK. As a potential third option, I wonder if it is easier for everybody involved (including third-party implementation of their index-pack/fsck equivalent) if we made it a rule that a pack that has a tree that refers to .git<something> must include the blob for it? Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC on packfile URIs and .gitmodules check 2021-01-16 0:30 ` Junio C Hamano @ 2021-01-16 3:22 ` Taylor Blau 2021-01-19 12:56 ` Derrick Stolee 2021-01-19 19:02 ` Jonathan Tan 0 siblings, 2 replies; 229+ messages in thread From: Taylor Blau @ 2021-01-16 3:22 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jonathan Tan, git, peff On Fri, Jan 15, 2021 at 04:30:07PM -0800, Junio C Hamano wrote: > Jonathan Tan <jonathantanmy@google.com> writes: > > > Someone at $DAYJOB noticed that if a .gitmodules-containing tree and the > > .gitmodules blob itself are sent in 2 separate packfiles during a fetch > > (which can happen when packfile URIs are used), transfer.fsckobjects > > causes the fetch to fail. You can reproduce it as follows (as of the > > time of writing): > > > > $ git -c fetch.uriprotocols=https -c transfer.fsckobjects=true clone https://chromium.googlesource.com/chromiumos/codesearch > > Cloning into 'codesearch'... > > remote: Total 2242 (delta 0), reused 2242 (delta 0) > > Receiving objects: 100% (2242/2242), 1.77 MiB | 4.62 MiB/s, done. > > error: object 1f155c20935ee1154a813a814f03ef2b3976680f: gitmodulesMissing: unable to read .gitmodules blob > > fatal: fsck error in pack objects > > fatal: index-pack failed > > > > This happens because the fsck part is currently being done in > > index-pack, which operates on one pack at a time. When index-pack sees > > the tree, it runs fsck on it (like any other object), and the fsck > > subsystem remembers the .gitmodules target (specifically, in > > gitmodules_found in fsck.c). Later, index-pack runs fsck_finish() which > > checks if the target exists, but it doesn't, so it reports the failure. > > Is this because the gitmodules blob is contained in the base image > served via the pack URI mechansim, and the "dynamic" packfile for > the latest part of the history refers to the gitmodules file that is > unchanged, hence the latter one lacks it? 
That seems like a likely explanation, although this seems ultimately up to what the pack CDN serves. > You've listed two possible solutions, i.e. > > (1) punt and declare that we assume an missing and uncheckable blob > is OK, > > (2) defer the check after transfer completes. > > Between the two, my gut feeling is that the latter is preferrable. > If we assume an missing and uncheckable one is OK, then even if a > blob is available to be checked, there is not much point in > checking, no? I'm going to second this. If this were a more benign check, then I'd perhaps feel differently, but .gitmodules fsck checks seem to get hardened fairly often during security releases, and so it seems important to keep performing them when the user asked for it. > As long as the quarantine of incoming pack works correctly, > streaming the incoming packdata (and packfile downloaded out of line > via a separate mechanism like pack URI) to index-pack that does not > check to complete the transfer, with a separate step to check the > sanity of these packs as a whole, should not harm the repository > even if it is interrupted in the middle, after transfer is done but > before checking says it is OK. Agreed. Bear in mind that I am pretty unfamiliar with this code, and so I'm not sure if it's 'easy' or not to change it in this way. The obvious downside, which Jonathan notes, is that you almost certainly have to reinflate all of the trees again. But, since the user is asking for transfer.fsckObjects explicitly, I don't think that it's a problem. > As a potential third option, I wonder if it is easier for everybody > involved (including third-party implementation of their > index-pack/fsck equivalent) if we made it a rule that a pack that > has a tree that refers to .git<something> must include the blob for > it? Interesting, but I'm sure CDN administrators would prefer to have as few restrictions in place as possible. 
A potential fourth option that I can think of is that we can try to
eagerly perform the .gitmodules fsck checks as we receive objects, under
the assumption that the .gitmodules blob and the tree which contains it
appear in the same pack.

If they do, then we ought to be able to check them as we currently do
(and avoid leaving them to the slow post-processing step). Any blobs
that we _can't_ find get placed into an array, and then that array is
iterated over after we have received all packs, including from the CDN.
Any blob that can't be found in the pack transferred from the
remote, the CDN, or the local repository (and isn't explicitly excluded
via an object --filter) is declared missing.

Thoughts?

> Thanks.

Thanks,
Taylor
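The fourth option above can be sketched roughly as follows. This is a toy model, not Git code: the dict-based pack layout, the function names (`blob_passes_check`, `check_pack`, `finish`) and the object IDs are all hypothetical, and the content check is a deliberately tiny stand-in for whatever fsck actually verifies in a .gitmodules blob.

```python
def blob_passes_check(data):
    """Stand-in for the real .gitmodules content checks (assumption):
    reject any `url` value that begins with a dash."""
    for line in data.splitlines():
        key, _, value = line.strip().partition("=")
        if key.strip() == "url" and value.strip().startswith("-"):
            return False
    return True


def check_pack(pack, pending, errors):
    """Eagerly check .gitmodules blobs found in the same pack as their
    tree; remember the ones that cannot be found yet."""
    for obj in pack.values():
        if obj["type"] != "tree":
            continue
        for name, target in obj["entries"]:
            if name != ".gitmodules":
                continue
            blob = pack.get(target)
            if blob is None:
                pending.append(target)  # defer until all packs arrived
            elif not blob_passes_check(blob["data"]):
                errors.append((target, "bad .gitmodules"))


def finish(pending, store, errors, filtered=frozenset()):
    """After every pack has arrived, revisit only the deferred blobs."""
    for oid in pending:
        if oid in filtered:  # excluded on purpose, e.g. by an object filter
            continue
        blob = store.get(oid)
        if blob is None:
            errors.append((oid, "gitmodulesMissing"))
        elif not blob_passes_check(blob["data"]):
            errors.append((oid, "bad .gitmodules"))


inline_pack = {"tree1": {"type": "tree", "entries": [(".gitmodules", "blob1")]}}
cdn_pack = {"blob1": {"type": "blob", "data": "url = https://example.com/x\n"}}

pending, errors = [], []
check_pack(inline_pack, pending, errors)
check_pack(cdn_pack, pending, errors)
finish(pending, {**inline_pack, **cdn_pack}, errors)
print(pending, errors)
```

Only the deferred blobs are touched in the final step, so trees do not need to be inflated a second time in this model.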
* Re: RFC on packfile URIs and .gitmodules check 2021-01-16 3:22 ` Taylor Blau @ 2021-01-19 12:56 ` Derrick Stolee 2021-01-19 19:13 ` Jonathan Tan 2021-01-19 19:02 ` Jonathan Tan 1 sibling, 1 reply; 229+ messages in thread From: Derrick Stolee @ 2021-01-19 12:56 UTC (permalink / raw) To: Taylor Blau, Junio C Hamano; +Cc: Jonathan Tan, git, peff On 1/15/2021 10:22 PM, Taylor Blau wrote: > On Fri, Jan 15, 2021 at 04:30:07PM -0800, Junio C Hamano wrote: >> Jonathan Tan <jonathantanmy@google.com> writes: >> >>> Someone at $DAYJOB noticed that if a .gitmodules-containing tree and the >>> .gitmodules blob itself are sent in 2 separate packfiles during a fetch >>> (which can happen when packfile URIs are used), transfer.fsckobjects >>> causes the fetch to fail. You can reproduce it as follows (as of the >>> time of writing): >>> >>> $ git -c fetch.uriprotocols=https -c transfer.fsckobjects=true clone https://chromium.googlesource.com/chromiumos/codesearch >>> Cloning into 'codesearch'... >>> remote: Total 2242 (delta 0), reused 2242 (delta 0) >>> Receiving objects: 100% (2242/2242), 1.77 MiB | 4.62 MiB/s, done. >>> error: object 1f155c20935ee1154a813a814f03ef2b3976680f: gitmodulesMissing: unable to read .gitmodules blob >>> fatal: fsck error in pack objects >>> fatal: index-pack failed I'm contributing a quick suggestion for just this item: >>> This happens because the fsck part is currently being done in >>> index-pack, which operates on one pack at a time. When index-pack sees >>> the tree, it runs fsck on it (like any other object), and the fsck >>> subsystem remembers the .gitmodules target (specifically, in >>> gitmodules_found in fsck.c). Later, index-pack runs fsck_finish() which >>> checks if the target exists, but it doesn't, so it reports the failure. 
>> >> Is this because the gitmodules blob is contained in the base image >> served via the pack URI mechansim, and the "dynamic" packfile for >> the latest part of the history refers to the gitmodules file that is >> unchanged, hence the latter one lacks it? > > That seems like a likely explanation, although this seems ultimately up > to what the pack CDN serves. >> You've listed two possible solutions, i.e. >> >> (1) punt and declare that we assume an missing and uncheckable blob >> is OK, >> >> (2) defer the check after transfer completes. >> >> Between the two, my gut feeling is that the latter is preferrable. >> If we assume an missing and uncheckable one is OK, then even if a >> blob is available to be checked, there is not much point in >> checking, no? > > I'm going to second this. If this were a more benign check, then I'd > perhaps feel differently, but .gitmodules fsck checks seem to get > hardened fairly often during security releases, and so it seems > important to keep performing them when the user asked for it. It might be nice to teach 'index-pack' a mode that says certain errors should be reported as warnings by writing the problematic OIDs to stdout/stderr. Then, the second check after all packs are present can focus on those problematic objects instead of re-scanning everything. Thanks, -Stolee ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC on packfile URIs and .gitmodules check 2021-01-19 12:56 ` Derrick Stolee @ 2021-01-19 19:13 ` Jonathan Tan 2021-01-20 1:04 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-01-19 19:13 UTC (permalink / raw) To: stolee; +Cc: me, gitster, jonathantanmy, git, peff > I'm contributing a quick suggestion for just this item: > > >>> This happens because the fsck part is currently being done in > >>> index-pack, which operates on one pack at a time. When index-pack sees > >>> the tree, it runs fsck on it (like any other object), and the fsck > >>> subsystem remembers the .gitmodules target (specifically, in > >>> gitmodules_found in fsck.c). Later, index-pack runs fsck_finish() which > >>> checks if the target exists, but it doesn't, so it reports the failure. > >> > >> Is this because the gitmodules blob is contained in the base image > >> served via the pack URI mechansim, and the "dynamic" packfile for > >> the latest part of the history refers to the gitmodules file that is > >> unchanged, hence the latter one lacks it? > > > > That seems like a likely explanation, although this seems ultimately up > > to what the pack CDN serves. > >> You've listed two possible solutions, i.e. > >> > >> (1) punt and declare that we assume an missing and uncheckable blob > >> is OK, > >> > >> (2) defer the check after transfer completes. > >> > >> Between the two, my gut feeling is that the latter is preferrable. > >> If we assume an missing and uncheckable one is OK, then even if a > >> blob is available to be checked, there is not much point in > >> checking, no? > > > > I'm going to second this. If this were a more benign check, then I'd > > perhaps feel differently, but .gitmodules fsck checks seem to get > > hardened fairly often during security releases, and so it seems > > important to keep performing them when the user asked for it. 
>
> It might be nice to teach 'index-pack' a mode that says certain
> errors should be reported as warnings by writing the problematic
> OIDs to stdout/stderr. Then, the second check after all packs are
> present can focus on those problematic objects instead of
> re-scanning everything.

My initial reaction was that stdout is already used to report the hash
part of the generated name and that stderr is already used for whatever
warnings there are, but looking at the documentation, index-pack
--fsck-objects is "[for] internal use only", so it might be fine to
extend the output format in this case and report the problematic OIDs
after the hash. I'll take a look.
* Re: RFC on packfile URIs and .gitmodules check
From: Junio C Hamano @ 2021-01-20 1:04 UTC
To: Jonathan Tan; +Cc: stolee, me, git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

>> It might be nice to teach 'index-pack' a mode that says certain
>> errors should be reported as warnings by writing the problematic
>> OIDs to stdout/stderr. Then, the second check after all packs are
>> present can focus on those problematic objects instead of
>> re-scanning everything.
>
> My initial reaction was that stdout is already used to report the hash
> part of the generated name and that stderr is already used for whatever
> warnings there are, but looking at the documentation, index-pack
> --fsck-objects is "[for] internal use only", so it might be fine to
> extend the output format in this case and report the problematic OIDs
> after the hash. I'll take a look.

If I am not mistaken, Taylor also mentioned the possibility to give
"these objects need reinspecting" to a later process, and it is an
excellent suggestion.

And I think it is perfectly fine to adjust the internal format used
purely for internal use.

Thanks.
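To make the idea of reporting problematic OIDs after the hash concrete, here is a minimal sketch of how a parent process might consume such output. The output layout is purely an assumption for illustration (a first line of the form `<kind><TAB><pack hash>`, followed by one OID per line for each .gitmodules blob that could not be found); the thread does not settle on a format, and the function names and object IDs below are hypothetical.

```python
def parse_index_pack_output(stdout_text):
    """Split the assumed output into (kind, pack_hash, dangling_oids)."""
    lines = stdout_text.splitlines()
    kind, pack_hash = lines[0].split("\t", 1)
    dangling = [line.strip() for line in lines[1:] if line.strip()]
    return kind, pack_hash, dangling


def oids_to_recheck(outputs):
    """Collect, in order and without duplicates, every OID that any
    per-pack run asked the parent to look at again."""
    seen = []
    for text in outputs:
        _, _, dangling = parse_index_pack_output(text)
        for oid in dangling:
            if oid not in seen:
                seen.append(oid)
    return seen


# Made-up hashes: one run reports a dangling blob, the other reports none.
inline_out = "pack\t" + "a" * 40 + "\n" + "b" * 40 + "\n"
cdn_out = "keep\t" + "c" * 40 + "\n"

print(parse_index_pack_output(inline_out))
print(oids_to_recheck([inline_out, cdn_out]))
```

With something along these lines, the check that runs after all packs are present only needs to read the listed blobs instead of re-scanning every object.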
* Re: RFC on packfile URIs and .gitmodules check 2021-01-16 3:22 ` Taylor Blau 2021-01-19 12:56 ` Derrick Stolee @ 2021-01-19 19:02 ` Jonathan Tan 1 sibling, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-19 19:02 UTC (permalink / raw) To: me; +Cc: gitster, jonathantanmy, git, peff > > Is this because the gitmodules blob is contained in the base image > > served via the pack URI mechansim, and the "dynamic" packfile for > > the latest part of the history refers to the gitmodules file that is > > unchanged, hence the latter one lacks it? > > That seems like a likely explanation, although this seems ultimately up > to what the pack CDN serves. In this case, yes, that is what is happening. > > You've listed two possible solutions, i.e. > > > > (1) punt and declare that we assume an missing and uncheckable blob > > is OK, > > > > (2) defer the check after transfer completes. > > > > Between the two, my gut feeling is that the latter is preferrable. > > If we assume an missing and uncheckable one is OK, then even if a > > blob is available to be checked, there is not much point in > > checking, no? > > I'm going to second this. If this were a more benign check, then I'd > perhaps feel differently, but .gitmodules fsck checks seem to get > hardened fairly often during security releases, and so it seems > important to keep performing them when the user asked for it. That makes sense. > > As long as the quarantine of incoming pack works correctly, > > streaming the incoming packdata (and packfile downloaded out of line > > via a separate mechanism like pack URI) to index-pack that does not > > check to complete the transfer, with a separate step to check the > > sanity of these packs as a whole, should not harm the repository > > even if it is interrupted in the middle, after transfer is done but > > before checking says it is OK. > > Agreed. 
Bear in mind that I am pretty unfamiliar with this code, and so > I'm not sure if it's 'easy' or not to change it in this way. The obvious > downside, which Jonathan notes, is that you almost certainly have to > reinflate all of the trees again. > > But, since the user is asking for transfer.fsckObjects explicitly, I > don't think that it's a problem. We might be able to avoid the reinflate if we do it as part of the connectivity check or somehow teach index-pack a way to communicate the dangling .gitmodules links (as you suggest below). > > As a potential third option, I wonder if it is easier for everybody > > involved (including third-party implementation of their > > index-pack/fsck equivalent) if we made it a rule that a pack that > > has a tree that refers to .git<something> must include the blob for > > it? > > Interesting, but I'm sure CDN administrators would prefer to have as few > restrictions in place as possible. That rule would help, but it also seems inelegant in that if we put commits that have the same .gitmodules in 2 or more different packs, there would be identical objects across those packs (besides the reason Taylor mentioned). > A potential fourth option that I can think of is that we can try to > eagerly perform the .gitmodules fsck checks as we receive objects, under > the assumption that the .gitmoudles blob and the tree which contains it > appear in the same pack. > > If they do, then we ought to be able to check them as we currently do > (and avoid leaving them to the slow post-processing step). Any blobs > that we _can't_ find get placed into an array, and then that array is > iterated over after we have received all packs, including from the CDN. > Any blobs that couldn't be found in the pack transferred from the > remote, the CDN, or the local repository (and isn't explicitly excluded > via an object --filter) is declared missing. > > Thoughts? The hard part is communicating this array to the parent fetch process. 
Stolee has a suggestion [1] which I will reply to directly.

[1] https://lore.kernel.org/git/d2ca2fec-a353-787a-15a7-3831a665523e@gmail.com/
* Re: RFC on packfile URIs and .gitmodules check 2021-01-15 23:43 RFC on packfile URIs and .gitmodules check Jonathan Tan 2021-01-16 0:30 ` Junio C Hamano @ 2021-01-20 8:07 ` Ævar Arnfjörð Bjarmason 2021-01-20 19:30 ` Jonathan Tan 2021-01-20 19:36 ` [PATCH] Doc: clarify contents of packfile sent as URI Jonathan Tan 2021-01-24 2:34 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Jonathan Tan 2021-02-22 19:20 ` [PATCH v2 " Jonathan Tan 3 siblings, 2 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-01-20 8:07 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, peff, Derrick Stolee, Taylor Blau On Sat, Jan 16 2021, Jonathan Tan wrote: > Someone at $DAYJOB noticed that if a .gitmodules-containing tree and the > .gitmodules blob itself are sent in 2 separate packfiles during a fetch > (which can happen when packfile URIs are used), transfer.fsckobjects > causes the fetch to fail. You can reproduce it as follows (as of the > time of writing): > > $ git -c fetch.uriprotocols=https -c transfer.fsckobjects=true clone https://chromium.googlesource.com/chromiumos/codesearch > Cloning into 'codesearch'... > remote: Total 2242 (delta 0), reused 2242 (delta 0) > Receiving objects: 100% (2242/2242), 1.77 MiB | 4.62 MiB/s, done. > error: object 1f155c20935ee1154a813a814f03ef2b3976680f: gitmodulesMissing: unable to read .gitmodules blob > fatal: fsck error in pack objects > fatal: index-pack failed > > This happens because the fsck part is currently being done in > index-pack, which operates on one pack at a time. When index-pack sees > the tree, it runs fsck on it (like any other object), and the fsck > subsystem remembers the .gitmodules target (specifically, in > gitmodules_found in fsck.c). Later, index-pack runs fsck_finish() which > checks if the target exists, but it doesn't, so it reports the failure. 
> > One option is for fetch to do its own pass of checking all downloaded > objects once all packfiles have been downloaded, but that seems wasteful > as all trees would have to be re-inflated. > > Another option is to do it within the connectivity check instead - so, > update rev-list and the object walking mechanism to be able to detect > .gitmodules in trees and fsck the target blob whenever such an entry > occurs. This has the advantage that there is no extra re-inflation, > although it might be strange to have object walking be able to fsck. > > The simplest solution would be to just relax this - check the blob if it > exists, but if it doesn't, it's OK. Some things in favor of this > solution: > > - This is something we already do in the partial clone case (although > it could be argued that in this case, we're already trusting the > server for far more than .gitmodules, so just because it's OK in the > partial clone case doesn't mean that it's OK in the regular case). > > - Also, the commit message for this feature (from ed8b10f631 ("fsck: check > .gitmodules content", 2018-05-21)) gives a rationale of a newer > server being able to protect older clients. > - Servers using receive-pack (instead of fetch-pack) to obtain > objects would still be protected, since receive-pack still only > accepts one packfile at a time (and there are currently no plans > to expand this). > - Also, malicious .gitobjects files could still be crafted that pass > fsck checking - for example, by containing a URL (of another > server) that refers to a repo with a .gitobjects that would fail > fsck. > > So I would rather go with just relaxing the check, but if consensus is > that we should still do it, I'll investigate doing it in the > connectivity check. Would this still behave if the $DAYJOB's packfile-uri server support was behaving as documented in packfile-uri.txt, or just because it has outside-spec behavior? I.e. 
the spec[1] says this:

    This is the implementation: a feature, marked experimental, that
    allows the server to be configured by one or more
    `uploadpack.blobPackfileUri=<sha1> <uri>` entries. Whenever the list
    of objects to be sent is assembled, all such blobs are excluded,
    replaced with URIs. The client will download those URIs, expecting
    them to each point to packfiles containing single blobs.

I can't see that leaving an opening for packfile-uri to do anything more
than serve up packfiles which each contain a single blob.

In that case it seems to me we'd be OK (but I haven't tested), because
fsck_finish() will call read_object_file() which'll try to read that
blob from the object store when it encounters the ".gitmodules" tree,
and because we'd have already downloaded the packfile with the blob
before moving onto the main dialog.

But as we discussed on-list before[2] this isn't the way packfile-uri
actually works in the wild. It's really just sending some arbitrary data
in a pack in that URI, with a server that knows what's in that pack and
will send the rest in such a way that everything ends up being
connected.

As far as I can tell the only reason this is called "packfile URI" and
behaves this way in git.git is because of the convenience of
instrumenting pack-objects.c with an "oidset excluded_by_config" to not
stream those blobs in a pack, but it isn't how the only (I'm pretty
sure) production server implementation in the wild behaves at all.

So *poke* about the reply I had in [3] late last year. I think the first
thing worth doing here is fixing the docs so they describe how this
works. You didn't get back on that (and I also forgot about it until
this thread), but it would be nice to know what you think about the
suggested prose there.

Re-reading it I'd add something like this to the spec:

 A. That the config is called "uploadpack.blobPackfileUri" in git.git
    has nothing to do with how this is expected to behave on the
    wire.
    It's just to serve the narrow support pack-objects.c has for
    crafting such a pack.

 B. It's then called "packfile-uris" on the wire, nothing to do with
    blobs. Just packs with a checksum that we'll validate. An older
    version of this spec said "packfiles containing single blobs",
    but a pack can contain any combination of blob/tree/commit data.

 C. A client is then expected to deal with any combination of data
    ordered/sliced/split up etc. in any possible way from such a
    combination of "packfile-uris" and PACK dialog, as long as the end
    result is valid.

Except that the result of this discussion will perhaps be a more narrow
definition for "C".

1. https://github.com/git/git/blob/cd8402e0fd8cfc0ec9fb10e22ffb6aabd992eae1/Documentation/technical/packfile-uri.txt#L37-L41
2. https://lore.kernel.org/git/20201125190957.1113461-1-jonathantanmy@google.com/
3. https://lore.kernel.org/git/87tut5vghw.fsf@evledraar.gmail.com/
* Re: RFC on packfile URIs and .gitmodules check 2021-01-20 8:07 ` Ævar Arnfjörð Bjarmason @ 2021-01-20 19:30 ` Jonathan Tan 2021-01-21 3:06 ` Junio C Hamano 2021-01-20 19:36 ` [PATCH] Doc: clarify contents of packfile sent as URI Jonathan Tan 1 sibling, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-01-20 19:30 UTC (permalink / raw) To: avarab; +Cc: jonathantanmy, git, peff, stolee, me > Would this still behave if the $DAYJOB's packfile-uri server support was > behaving as documented in packfile-uri.txt, or just because it has > outside-spec behavior? > > I.e. the spec[1] says this: > > This is the implementation: a feature, marked experimental, that > allows the server to be configured by one or more > `uploadpack.blobPackfileUri=<sha1> <uri>` entries. Whenever the list > of objects to be sent is assembled, all such blobs are excluded, > replaced with URIs. The client will download those URIs, expecting > them to each point to packfiles containing single blobs. > > Which I can't see leaving an opening for more than packfile-uri being to > serve up packfiles which each contain a single blob. I meant to leave an opening by referring to this just as a Minimum Viable Product and by explaining in Future Work that the protocol allows evolution of (among other things) which objects the server sends through a URI without any protocol changes. But in any case, this will also happen even if we constrain ourselves to excluding single blobs and sending them via other packfiles instead - see below. > In that case it seems to me we'd be OK (but I haven't tested), because > fsck_finish() will call read_object_file() which'll try to read that > "blob from the object store when it encounters the ".gitmodules" tree, > and because we'd have already downloaded the packfile with the blob > before moving onto the main dialog. We wouldn't be OK, actually. 
Suppose we have a separate packfile containing only the ".gitmodules" blob - when we call fsck_finish(), we would not have downloaded the other packfile yet. Git processes the entire fetch response by piping the inline packfile (after demux) into index-pack (which is the one that calls fsck_finish()) before it downloads any of the other packfile(s). > But as we discussed on-list before[2] this isn't the way packfile-uri > actually works in the wild. It's really just sending some arbitrary data > in a pack in that URI, with a server that knows what's in that pack and > will send the rest in such a way that everything ends up being > connected. > > As far as I can tell the only reason this is called "packfile URI" and > behaves this way in git.git is because of the convenience of > intrumenting pack-objects.c with an "oidset excluded_by_config" to not > stream those blobs in a pack, but it isn't how the only (I'm pretty > sure) production server implementation in the wild behaves at all. I don't know if this is the only production server implementation, but yes, this particular one (googlesource.com) can put objects of multiple types in the other packfile, not only a single blob. There is some JGit code here [1] that can send a URI corresponding to a "CachedPack" (which may contain all objects, not only blobs) if that pack is also available through a URI. [1] https://gerrit.googlesource.com/jgit/+/a004820858b54d18c6f72fc94dc33bce8b606d66 > So *poke* about the reply I had in [3] late last year. I think the first > thing worth doing here is fixing the docs so they describe how this > works. You didn't get back on that (and I also forgot about it until > this thread), but it would be nice to know what you think about the > suggested prose there. Rereading that, the issue is that uploadpack.blobPackfileUri is indeed how the current Git server handles it - it excludes a blob and sends a URI instead. 
The client is not supposed to see how the server has configured it, and should not be constrained by the fact that the server that is being shipped with it only excludes single blobs. > Re-reading it I'd add something like this to the spec: > > A. That the config is called "uploadpack.blobPackfileUri" in git.git > has nothing to do with how this is expected to behave on the > wire. It's just to serve the narrow support pack-objects.c has for > crafting such a pack. Yes, that's true. > B. It's then called "packfile-uris" on the wire, nothing to do with > blobs. Just packs with a checksum that we'll validate. An older > versions of this spec said "[a] packfiles containing single blobs" > but it can be any combination of blob/tree/commit data. Yes, we can delete that line. > C. A client is then expected to deal with any combination of data > ordered/sliced/split up etc. in any possible way from such a > combination of "packfile-uris" and PACK dialog, as long as the end > result is valid. > > Except that the result of this discussion will perhaps be a more narrow > definition for "C". Yes. I think all these can be done just by changing the last sentence in "Server design" - I'll send a patch. > 1. https://github.com/git/git/blob/cd8402e0fd8cfc0ec9fb10e22ffb6aabd992eae1/Documentation/technical/packfile-uri.txt#L37-L41 > 2. https://lore.kernel.org/git/20201125190957.1113461-1-jonathantanmy@google.com/ > 3. https://lore.kernel.org/git/87tut5vghw.fsf@evledraar.gmail.com/ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC on packfile URIs and .gitmodules check 2021-01-20 19:30 ` Jonathan Tan @ 2021-01-21 3:06 ` Junio C Hamano 2021-01-21 18:32 ` Jonathan Tan 0 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-01-21 3:06 UTC (permalink / raw) To: Jonathan Tan; +Cc: avarab, git, peff, stolee, me Jonathan Tan <jonathantanmy@google.com> writes: > We wouldn't be OK, actually. Suppose we have a separate packfile > containing only the ".gitmodules" blob - when we call fsck_finish(), we > would not have downloaded the other packfile yet. Git processes the > entire fetch response by piping the inline packfile (after demux) into > index-pack (which is the one that calls fsck_finish()) before it > downloads any of the other packfile(s). Is that order documented as a requirement for implementation? Naïvely, I would expect that a CDN offload would be to relieve servers from the burden of having to repack ancient part of the history all the time for any new "clone" clients and that is what the "here is a URI, go fetch it because I won't give you objects that already appear there" feature is about. Because we expect that the offloaded contents would not be up-to-date, the traditional packfile transfer would then be used to complete the history with objects necessary for the parts of the history newer than the offloaded contents. And from that viewpoint, it sounds totally backwards to start processing the up-to-the-minute fresh packfile that came via the traditional packfile transfer before the CDN offloaded contents are fetched and stored safely in our repository. We probably want to finish interaction with the live server as quickly as possible---it would go counter to that wish if we force the live part of the history to hang in flight, unprocessed, while the client downloads offloaded bulk from CDN and processes it, making the server side stuck waiting for some write(2) to go through. 
But I still wonder if it is an option to locally delay the processing of the up-to-the-minute-fresh part. Instead of feeding what comes from them directly to "index-pack --fsck-objects", would it make sense to spool it to a temporary, so that we can release the server early, but then make sure to fetch and process packfile URI material before coming back to process the spooled packdata. That would allow the newer part of the history to have newer trees that still reference the same old .gitmodules that is found in the frozen packfile that comes from CDN, no? Or can there be a situation where some objects in CDN pack are referred to by objects in the up-to-the-minute-fresh pack (e.g. a ".gitmodules" blob in CDN pack is still unchanged and used in an updated tree in the latest revision) and some other objects in CDN pack refer to an object in the live part of the history? If there is such a cyclic dependency, "index-pack --fsck" one pack at a time would not work, but I doubt such a cycle can arise. Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC on packfile URIs and .gitmodules check 2021-01-21 3:06 ` Junio C Hamano @ 2021-01-21 18:32 ` Jonathan Tan 2021-01-21 18:39 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-01-21 18:32 UTC (permalink / raw) To: gitster; +Cc: jonathantanmy, avarab, git, peff, stolee, me > Jonathan Tan <jonathantanmy@google.com> writes: > > > We wouldn't be OK, actually. Suppose we have a separate packfile > > containing only the ".gitmodules" blob - when we call fsck_finish(), we > > would not have downloaded the other packfile yet. Git processes the > > entire fetch response by piping the inline packfile (after demux) into > > index-pack (which is the one that calls fsck_finish()) before it > > downloads any of the other packfile(s). > > Is that order documented as a requirement for implementation? > > Naïvely, I would expect that a CDN offload would be to relieve > servers from the burden of having to repack ancient part of the > history all the time for any new "clone" clients and that is what > the "here is a URI, go fetch it because I won't give you objects > that already appear there" feature is about. Because we expect that > the offloaded contents would not be up-to-date, the traditional > packfile transfer would then is used to complete the history with > objects necessary for the parts of the history newer than the > offloaded contents. > > And from that viewpoint, it sounds totally backwards to start > processing the up-to-the-minute fresh packfile that came via the > traditional packfile transfer before the CDN offloaded contents are > fetched and stored safely in our repository. > > We probably want to finish interaction with the live server as > quickly as possible---it would go counter to that wish if we force > the live part of the history hang in flight, unprocessed, while the > client downloads offloaded bulk from CDN and processes it, making > the server side stuck waiting for some write(2) to go through. 
> > But I still wonder if it is an option to locally delay the > processing of the up-to-the-minute-fresh part. > > Instead of feeding what comes from them directly to "index-pack > --fsck-objects", would it make sense to spool it to a temporary, so > that we can release the server early, but then make sure to fetch > and process packfile URI material before coming back to process the > spooled packdata. That would allow the newer part of the history to > have newer trees that still reference the same old .gitmodules that > is found in the frozen packfile that comes from CDN, no? > > Or can there be a situation where some objects in CDN pack are > referred to by objects in the up-to-the-minute-fresh pack (e.g. a > ".gitmodules" blob in CDN pack is still unchanged and used in an > updated tree in the latest revision) and some other objects in CDN > pack refer to an object in the live part of the history? If there > is such a cyclic dependency, "index-pack --fsck" one pack at a time > would not work, but I doubt such a cycle can arise. My intention is that the order of the packfiles (and cyclic dependencies) would not matter, so we wouldn't need to delay any processing of the up-to-the-minute-fresh part. I'm currently working on getting index-pack to output a list of the dangling .gitmodules files, so that fetch-pack (its consumer) can do one final fsck on those files. Another way, as you said, is to say that the order of the packfiles matters (which potentially allows some simplification on the client side) but I don't think that we need to lose this flexibility. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC on packfile URIs and .gitmodules check 2021-01-21 18:32 ` Jonathan Tan @ 2021-01-21 18:39 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-01-21 18:39 UTC (permalink / raw) To: Jonathan Tan; +Cc: avarab, git, peff, stolee, me Jonathan Tan <jonathantanmy@google.com> writes: >> Jonathan Tan <jonathantanmy@google.com> writes: >> >> Or can there be a situation where some objects in CDN pack are >> referred to by objects in the up-to-the-minute-fresh pack (e.g. a >> ".gitmodules" blob in CDN pack is still unchanged and used in an >> updated tree in the latest revision) and some other objects in CDN >> pack refer to an object in the live part of the history? If there >> is such a cyclic dependency, "index-pack --fsck" one pack at a time >> would not work, but I doubt such a cycle can arise. > > My intention is that the order of the packfiles (and cyclic > dependencies) would not matter... > I'm currently working on > getting index-pack to output a list of the dangling .gitmodules files, > so that fetch-pack (its consumer) can do one final fsck on those files. In other words, it essentially becomes "we check everything we obtained as a single unit across multiple packs, but for performance we'll let index-pack work as much as possible on each individual pack while it has necessary data in its core, and then we conclude by checking the objects on the 'boundaries' that cannot be validated using info that is only in one pack". That does sound like the right approach. Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
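[Editorial illustration, not part of the thread.] The "check per pack, then check the boundaries" approach summarized above can be sketched roughly as follows. This is only a toy model under stated assumptions, not the code from the patch series: struct pack, pack_has(), check_one_pack() and check_dangling() are hypothetical names, object IDs are plain strings, and "checking" a .gitmodules blob is reduced to finding it in some pack. In Git itself the per-pack phase would be index-pack and the final phase would be fetch-pack consulting the object store.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DANGLING 16 /* arbitrary limit for this sketch */

/* Hypothetical stand-in for one downloaded packfile. */
struct pack {
	const char **blobs;      /* IDs of blobs stored in this pack, NULL-terminated */
	const char **gitmodules; /* IDs of .gitmodules blobs named by trees in this pack */
};

/* Return 1 if the pack contains a blob with the given ID. */
static int pack_has(const struct pack *p, const char *id)
{
	size_t i;

	for (i = 0; p->blobs[i]; i++)
		if (!strcmp(p->blobs[i], id))
			return 1;
	return 0;
}

/*
 * Phase 1 (per pack): a .gitmodules blob found in the same pack could be
 * checked immediately; one that is absent is recorded as dangling instead
 * of being reported as an error. Returns the new number of dangling IDs.
 */
static size_t check_one_pack(const struct pack *p,
			     const char **dangling, size_t nr)
{
	size_t i;

	for (i = 0; p->gitmodules[i]; i++) {
		if (pack_has(p, p->gitmodules[i]))
			continue; /* local to this pack, nothing left over */
		assert(nr < MAX_DANGLING);
		dangling[nr++] = p->gitmodules[i];
	}
	return nr;
}

/*
 * Phase 2 (after all packs are in): resolve every dangling ID against the
 * union of the packs. Returns how many are still missing.
 */
static size_t check_dangling(const struct pack *packs, size_t nr_packs,
			     const char **dangling, size_t nr)
{
	size_t i, j, missing = 0;

	for (i = 0; i < nr; i++) {
		int found = 0;

		for (j = 0; j < nr_packs && !found; j++)
			found = pack_has(&packs[j], dangling[i]);
		if (!found)
			missing++;
	}
	return missing;
}
```

With a tree in the inline pack naming a .gitmodules blob that lives only in the URI-referenced pack, phase 1 records one dangling ID and phase 2 resolves it once both packs are visible, regardless of the order in which the packs were processed.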
* [PATCH] Doc: clarify contents of packfile sent as URI 2021-01-20 8:07 ` Ævar Arnfjörð Bjarmason 2021-01-20 19:30 ` Jonathan Tan @ 2021-01-20 19:36 ` Jonathan Tan 1 sibling, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-20 19:36 UTC (permalink / raw) To: git; +Cc: Jonathan Tan, avarab Clarify that, when the packfile-uri feature is used, the client should not assume that the extra packfiles downloaded would only contain a single blob, but support packfiles containing multiple objects of all types. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- Documentation/technical/packfile-uri.txt | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt index 318713abc3..f7eabc6c76 100644 --- a/Documentation/technical/packfile-uri.txt +++ b/Documentation/technical/packfile-uri.txt @@ -37,8 +37,11 @@ at least so that we can test the client. This is the implementation: a feature, marked experimental, that allows the server to be configured by one or more `uploadpack.blobPackfileUri=<sha1> <uri>` entries. Whenever the list of objects to be sent is assembled, all such -blobs are excluded, replaced with URIs. The client will download those URIs, -expecting them to each point to packfiles containing single blobs. +blobs are excluded, replaced with URIs. As noted in "Future work" below, the +server can evolve in the future to support excluding other objects (or other +implementations of servers could be made that support excluding other objects) +without needing a protocol change, so clients should not expect that packfiles +downloaded in this way only contain single blobs. Client design ------------- -- 2.30.0.284.gd98b1dd5eaa7-goog ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH 0/4] Check .gitmodules when using packfile URIs 2021-01-15 23:43 RFC on packfile URIs and .gitmodules check Jonathan Tan 2021-01-16 0:30 ` Junio C Hamano 2021-01-20 8:07 ` Ævar Arnfjörð Bjarmason @ 2021-01-24 2:34 ` Jonathan Tan 2021-01-24 2:34 ` [PATCH 1/4] http: allow custom index-pack args Jonathan Tan ` (5 more replies) 2021-02-22 19:20 ` [PATCH v2 " Jonathan Tan 3 siblings, 6 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-24 2:34 UTC (permalink / raw) To: git; +Cc: Jonathan Tan This patch set resolves the .gitmodules-and-tree-in-separate-packfiles issue I mentioned in [1] by having index-pack print out all dangling .gitmodules (instead of returning with an error code) and then teaching fetch-pack to read those and run its own fsck checks after all index-pack invocations are complete. As part of this, index-pack has to output (1) the hash that goes into the name of the .pack/.idx file and (2) the hashes of all dangling .gitmodules. I just had (2) come after (1). If anyone has a better idea, I'm interested. I also discovered a bug in that different index-pack arguments were used when processing the inline packfile and when processing the ones referenced by URIs. Patch 1-3 fixes that bug by passing the arguments to use as a space-separated URL-encoded list. (URL-encoded so that we can have spaces in the arguments.) Again, if anyone has a better idea, I'm interested. It is only in patch 4 that we have the dangling .gitmodules fix. 
[1] https://lore.kernel.org/git/20210115234300.350442-1-jonathantanmy@google.com/ Jonathan Tan (4): http: allow custom index-pack args http-fetch: allow custom index-pack args fetch-pack: with packfile URIs, use index-pack arg fetch-pack: print and use dangling .gitmodules Documentation/git-http-fetch.txt | 9 ++- Documentation/git-index-pack.txt | 7 +- builtin/index-pack.c | 9 ++- builtin/receive-pack.c | 2 +- fetch-pack.c | 106 ++++++++++++++++++++++++++----- fsck.c | 16 +++-- fsck.h | 8 +++ http-fetch.c | 35 +++++++++- http.c | 15 +++-- http.h | 10 +-- pack-write.c | 8 ++- pack.h | 2 +- t/t5550-http-fetch-dumb.sh | 3 +- t/t5702-protocol-v2.sh | 47 ++++++++++++++ 14 files changed, 232 insertions(+), 45 deletions(-) -- 2.30.0.280.ga3ce27912f-goog ^ permalink raw reply [flat|nested] 229+ messages in thread
* [PATCH 1/4] http: allow custom index-pack args 2021-01-24 2:34 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Jonathan Tan @ 2021-01-24 2:34 ` Jonathan Tan 2021-01-24 2:34 ` [PATCH 2/4] http-fetch: " Jonathan Tan ` (4 subsequent siblings) 5 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-24 2:34 UTC (permalink / raw) To: git; +Cc: Jonathan Tan Currently, when fetching, packfiles referenced by URIs are run through index-pack without any arguments other than --stdin and --keep, no matter what arguments are used for the packfile that is inline in the fetch response. As a preparation for ensuring that all packs (whether inline or not) use the same index-pack arguments, teach the http subsystem to allow custom index-pack arguments. http-fetch has been updated to use the new API. For now, it passes --keep alone instead of --keep with a process ID, but this is only temporary because http-fetch itself will be taught to accept index-pack parameters (instead of using a hardcoded constant) in a subsequent commit. 
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- http-fetch.c | 6 +++++- http.c | 15 ++++++++------- http.h | 10 +++++----- 3 files changed, 18 insertions(+), 13 deletions(-) diff --git a/http-fetch.c b/http-fetch.c index c4ccc5fea9..2d1d9d054f 100644 --- a/http-fetch.c +++ b/http-fetch.c @@ -43,6 +43,9 @@ static int fetch_using_walker(const char *raw_url, int get_verbosely, return rc; } +static const char *index_pack_args[] = + {"index-pack", "--stdin", "--keep", NULL}; + static void fetch_single_packfile(struct object_id *packfile_hash, const char *url) { struct http_pack_request *preq; @@ -55,7 +58,8 @@ static void fetch_single_packfile(struct object_id *packfile_hash, if (preq == NULL) die("couldn't create http pack request"); preq->slot->results = &results; - preq->generate_keep = 1; + preq->index_pack_args = index_pack_args; + preq->preserve_index_pack_stdout = 1; if (start_active_slot(preq->slot)) { run_active_slot(preq->slot); diff --git a/http.c b/http.c index 8b23a546af..f8ea28bb2e 100644 --- a/http.c +++ b/http.c @@ -2259,6 +2259,9 @@ void release_http_pack_request(struct http_pack_request *preq) free(preq); } +static const char *default_index_pack_args[] = + {"index-pack", "--stdin", NULL}; + int finish_http_pack_request(struct http_pack_request *preq) { struct child_process ip = CHILD_PROCESS_INIT; @@ -2270,17 +2273,15 @@ int finish_http_pack_request(struct http_pack_request *preq) tmpfile_fd = xopen(preq->tmpfile.buf, O_RDONLY); - strvec_push(&ip.args, "index-pack"); - strvec_push(&ip.args, "--stdin"); ip.git_cmd = 1; ip.in = tmpfile_fd; - if (preq->generate_keep) { - strvec_pushf(&ip.args, "--keep=git %"PRIuMAX, - (uintmax_t)getpid()); + ip.argv = preq->index_pack_args ? 
preq->index_pack_args + : default_index_pack_args; + + if (preq->preserve_index_pack_stdout) ip.out = 0; - } else { + else ip.no_stdout = 1; - } if (run_command(&ip)) { ret = -1; diff --git a/http.h b/http.h index 5de792ef3f..bf3d1270ad 100644 --- a/http.h +++ b/http.h @@ -218,12 +218,12 @@ struct http_pack_request { char *url; /* - * If this is true, finish_http_pack_request() will pass "--keep" to - * index-pack, resulting in the creation of a keep file, and will not - * suppress its stdout (that is, the "keep\t<hash>\n" line will be - * printed to stdout). + * index-pack command to run. Must be terminated by NULL. + * + * If NULL, defaults to {"index-pack", "--stdin", NULL}. */ - unsigned generate_keep : 1; + const char **index_pack_args; + unsigned preserve_index_pack_stdout : 1; FILE *packfile; struct strbuf tmpfile; -- 2.30.0.280.ga3ce27912f-goog ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH 2/4] http-fetch: allow custom index-pack args 2021-01-24 2:34 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Jonathan Tan 2021-01-24 2:34 ` [PATCH 1/4] http: allow custom index-pack args Jonathan Tan @ 2021-01-24 2:34 ` Jonathan Tan 2021-01-24 11:52 ` Ævar Arnfjörð Bjarmason 2021-02-16 20:49 ` Josh Steadmon 2021-01-24 2:34 ` [PATCH 3/4] fetch-pack: with packfile URIs, use index-pack arg Jonathan Tan ` (3 subsequent siblings) 5 siblings, 2 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-24 2:34 UTC (permalink / raw) To: git; +Cc: Jonathan Tan This is the next step in teaching fetch-pack to pass its index-pack arguments when processing packfiles referenced by URIs. The "--keep" in fetch-pack.c will be replaced with a full message in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- Documentation/git-http-fetch.txt | 9 ++++++-- fetch-pack.c | 1 + http-fetch.c | 35 +++++++++++++++++++++++++++----- t/t5550-http-fetch-dumb.sh | 3 ++- 4 files changed, 40 insertions(+), 8 deletions(-) diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt index 4deb4893f5..aa171088e8 100644 --- a/Documentation/git-http-fetch.txt +++ b/Documentation/git-http-fetch.txt @@ -41,11 +41,16 @@ commit-id:: <commit-id>['\t'<filename-as-in--w>] --packfile=<hash>:: - Instead of a commit id on the command line (which is not expected in + For internal use only. Instead of a commit id on the command line (which is not expected in this case), 'git http-fetch' fetches the packfile directly at the given URL and uses index-pack to generate corresponding .idx and .keep files. The hash is used to determine the name of the temporary file and is - arbitrary. The output of index-pack is printed to stdout. + arbitrary. The output of index-pack is printed to stdout. Requires + --index-pack-args. + +--index-pack-args=<args>:: + For internal use only. The command to run on the contents of the + downloaded pack. 
Arguments are URL-encoded separated by spaces. --recover:: Verify that everything reachable from target is fetched. Used after diff --git a/fetch-pack.c b/fetch-pack.c index 876f90c759..274ae602f7 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -1645,6 +1645,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, strvec_pushf(&cmd.args, "--packfile=%.*s", (int) the_hash_algo->hexsz, packfile_uris.items[i].string); + strvec_push(&cmd.args, "--index-pack-args=index-pack --stdin --keep"); strvec_push(&cmd.args, uri); cmd.git_cmd = 1; cmd.no_stdin = 1; diff --git a/http-fetch.c b/http-fetch.c index 2d1d9d054f..12feb84e71 100644 --- a/http-fetch.c +++ b/http-fetch.c @@ -3,6 +3,7 @@ #include "exec-cmd.h" #include "http.h" #include "walker.h" +#include "strvec.h" static const char http_fetch_usage[] = "git http-fetch " "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url"; @@ -43,11 +44,9 @@ static int fetch_using_walker(const char *raw_url, int get_verbosely, return rc; } -static const char *index_pack_args[] = - {"index-pack", "--stdin", "--keep", NULL}; - static void fetch_single_packfile(struct object_id *packfile_hash, - const char *url) { + const char *url, + const char **index_pack_args) { struct http_pack_request *preq; struct slot_results results; int ret; @@ -90,6 +89,7 @@ int cmd_main(int argc, const char **argv) int packfile = 0; int nongit; struct object_id packfile_hash; + const char *index_pack_args = NULL; setup_git_directory_gently(&nongit); @@ -116,6 +116,8 @@ int cmd_main(int argc, const char **argv) packfile = 1; if (parse_oid_hex(p, &packfile_hash, &end) || *end) die(_("argument to --packfile must be a valid hash (got '%s')"), p); + } else if (skip_prefix(argv[arg], "--index-pack-args=", &p)) { + index_pack_args = p; } arg++; } @@ -128,10 +130,33 @@ int cmd_main(int argc, const char **argv) git_config(git_default_config, NULL); if (packfile) { - fetch_single_packfile(&packfile_hash, argv[arg]); + struct 
strvec encoded = STRVEC_INIT; + char **raw; + int i; + + if (!index_pack_args) + die(_("--packfile requires --index-pack-args")); + + strvec_split(&encoded, index_pack_args); + + CALLOC_ARRAY(raw, encoded.nr + 1); + for (i = 0; i < encoded.nr; i++) + raw[i] = url_percent_decode(encoded.v[i]); + + fetch_single_packfile(&packfile_hash, argv[arg], + (const char **) raw); + + for (i = 0; i < encoded.nr; i++) + free(raw[i]); + free(raw); + strvec_clear(&encoded); + return 0; } + if (index_pack_args) + die(_("--index-pack-args can only be used with --packfile")); + if (commits_on_stdin) { commits = walker_targets_stdin(&commit_id, &write_ref); } else { diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index 483578b2d7..af90e7efed 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -224,7 +224,8 @@ test_expect_success 'http-fetch --packfile' ' git init packfileclient && p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && ls objects/pack/pack-*.pack) && - git -C packfileclient http-fetch --packfile=$ARBITRARY "$HTTPD_URL"/dumb/repo_pack.git/$p >out && + git -C packfileclient http-fetch --packfile=$ARBITRARY \ + --index-pack-args="index-pack --stdin --keep" "$HTTPD_URL"/dumb/repo_pack.git/$p >out && grep "^keep.[0-9a-f]\{16,\}$" out && cut -c6- out >packhash && -- 2.30.0.280.ga3ce27912f-goog ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH 2/4] http-fetch: allow custom index-pack args 2021-01-24 2:34 ` [PATCH 2/4] http-fetch: " Jonathan Tan @ 2021-01-24 11:52 ` Ævar Arnfjörð Bjarmason 2021-01-28 0:32 ` Jonathan Tan 2021-02-16 20:49 ` Josh Steadmon 1 sibling, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-01-24 11:52 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On Sun, Jan 24 2021, Jonathan Tan wrote: > --packfile=<hash>:: > - Instead of a commit id on the command line (which is not expected in > + For internal use only. Instead of a commit id on the command line (which is not expected in Leaves the rest at ~79 and this long line at ~100. Perhaps a follow-up change to re-word-wrap would be in order? ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 2/4] http-fetch: allow custom index-pack args 2021-01-24 11:52 ` Ævar Arnfjörð Bjarmason @ 2021-01-28 0:32 ` Jonathan Tan 0 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-28 0:32 UTC (permalink / raw) To: avarab; +Cc: jonathantanmy, git > On Sun, Jan 24 2021, Jonathan Tan wrote: > > > --packfile=<hash>:: > > - Instead of a commit id on the command line (which is not expected in > > + For internal use only. Instead of a commit id on the command line (which is not expected in > > Leaves the rest at ~79 and this long line at ~100. Perhaps a follow-up > change to re-word-wrap would be in order? Hmm...I'll split that onto two lines then. I don't think it's worth the extra commit in history to have it exactly wrapped right, so I'll forgo the follow-up change for now. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 2/4] http-fetch: allow custom index-pack args 2021-01-24 2:34 ` [PATCH 2/4] http-fetch: " Jonathan Tan 2021-01-24 11:52 ` Ævar Arnfjörð Bjarmason @ 2021-02-16 20:49 ` Josh Steadmon 2021-02-16 22:57 ` Junio C Hamano 1 sibling, 1 reply; 229+ messages in thread From: Josh Steadmon @ 2021-02-16 20:49 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On 2021.01.23 18:34, Jonathan Tan wrote: > This is the next step in teaching fetch-pack to pass its index-pack > arguments when processing packfiles referenced by URIs. > > The "--keep" in fetch-pack.c will be replaced with a full message in a > subsequent commit. > > Signed-off-by: Jonathan Tan <jonathantanmy@google.com> > --- > Documentation/git-http-fetch.txt | 9 ++++++-- > fetch-pack.c | 1 + > http-fetch.c | 35 +++++++++++++++++++++++++++----- > t/t5550-http-fetch-dumb.sh | 3 ++- > 4 files changed, 40 insertions(+), 8 deletions(-) > > diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt > index 4deb4893f5..aa171088e8 100644 > --- a/Documentation/git-http-fetch.txt > +++ b/Documentation/git-http-fetch.txt > @@ -41,11 +41,16 @@ commit-id:: > <commit-id>['\t'<filename-as-in--w>] > > --packfile=<hash>:: > - Instead of a commit id on the command line (which is not expected in > + For internal use only. Instead of a commit id on the command line (which is not expected in > this case), 'git http-fetch' fetches the packfile directly at the given > URL and uses index-pack to generate corresponding .idx and .keep files. > The hash is used to determine the name of the temporary file and is > - arbitrary. The output of index-pack is printed to stdout. > + arbitrary. The output of index-pack is printed to stdout. Requires > + --index-pack-args. > + > +--index-pack-args=<args>:: > + For internal use only. The command to run on the contents of the > + downloaded pack. Arguments are URL-encoded separated by spaces. I'm a bit skeptical of using URL encoding to work around embedded spaces. 
I believe in Emily's config-based hooks series, she wrote an argument parser to pull repeated arguments into a strvec, could you do something like that here? I'm sympathetic to the idea that since this is an internal-only flag, we can be a bit weird with the argument format, though. > --recover:: > Verify that everything reachable from target is fetched. Used after > diff --git a/fetch-pack.c b/fetch-pack.c > index 876f90c759..274ae602f7 100644 > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -1645,6 +1645,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, > strvec_pushf(&cmd.args, "--packfile=%.*s", > (int) the_hash_algo->hexsz, > packfile_uris.items[i].string); > + strvec_push(&cmd.args, "--index-pack-args=index-pack --stdin --keep"); > strvec_push(&cmd.args, uri); > cmd.git_cmd = 1; > cmd.no_stdin = 1; > diff --git a/http-fetch.c b/http-fetch.c > index 2d1d9d054f..12feb84e71 100644 > --- a/http-fetch.c > +++ b/http-fetch.c > @@ -3,6 +3,7 @@ > #include "exec-cmd.h" > #include "http.h" > #include "walker.h" > +#include "strvec.h" > > static const char http_fetch_usage[] = "git http-fetch " > "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url"; > @@ -43,11 +44,9 @@ static int fetch_using_walker(const char *raw_url, int get_verbosely, > return rc; > } > > -static const char *index_pack_args[] = > - {"index-pack", "--stdin", "--keep", NULL}; > - > static void fetch_single_packfile(struct object_id *packfile_hash, > - const char *url) { > + const char *url, > + const char **index_pack_args) { > struct http_pack_request *preq; > struct slot_results results; > int ret; > @@ -90,6 +89,7 @@ int cmd_main(int argc, const char **argv) > int packfile = 0; > int nongit; > struct object_id packfile_hash; > + const char *index_pack_args = NULL; > > setup_git_directory_gently(&nongit); > > @@ -116,6 +116,8 @@ int cmd_main(int argc, const char **argv) > packfile = 1; > if (parse_oid_hex(p, &packfile_hash, &end) || *end) > 
die(_("argument to --packfile must be a valid hash (got '%s')"), p); > + } else if (skip_prefix(argv[arg], "--index-pack-args=", &p)) { > + index_pack_args = p; > } > arg++; > } > @@ -128,10 +130,33 @@ int cmd_main(int argc, const char **argv) > git_config(git_default_config, NULL); > > if (packfile) { > - fetch_single_packfile(&packfile_hash, argv[arg]); > + struct strvec encoded = STRVEC_INIT; > + char **raw; > + int i; > + > + if (!index_pack_args) > + die(_("--packfile requires --index-pack-args")); > + > + strvec_split(&encoded, index_pack_args); > + > + CALLOC_ARRAY(raw, encoded.nr + 1); > + for (i = 0; i < encoded.nr; i++) > + raw[i] = url_percent_decode(encoded.v[i]); > + > + fetch_single_packfile(&packfile_hash, argv[arg], > + (const char **) raw); > + > + for (i = 0; i < encoded.nr; i++) > + free(raw[i]); > + free(raw); > + strvec_clear(&encoded); > + > return 0; > } > > + if (index_pack_args) > + die(_("--index-pack-args can only be used with --packfile")); > + > if (commits_on_stdin) { > commits = walker_targets_stdin(&commit_id, &write_ref); > } else { > diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh > index 483578b2d7..af90e7efed 100755 > --- a/t/t5550-http-fetch-dumb.sh > +++ b/t/t5550-http-fetch-dumb.sh > @@ -224,7 +224,8 @@ test_expect_success 'http-fetch --packfile' ' > > git init packfileclient && > p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && ls objects/pack/pack-*.pack) && > - git -C packfileclient http-fetch --packfile=$ARBITRARY "$HTTPD_URL"/dumb/repo_pack.git/$p >out && > + git -C packfileclient http-fetch --packfile=$ARBITRARY \ > + --index-pack-args="index-pack --stdin --keep" "$HTTPD_URL"/dumb/repo_pack.git/$p >out && > > grep "^keep.[0-9a-f]\{16,\}$" out && > cut -c6- out >packhash && > -- > 2.30.0.280.ga3ce27912f-goog > ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 2/4] http-fetch: allow custom index-pack args 2021-02-16 20:49 ` Josh Steadmon @ 2021-02-16 22:57 ` Junio C Hamano 2021-02-17 19:46 ` Jonathan Tan 0 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-02-16 22:57 UTC (permalink / raw) To: Josh Steadmon; +Cc: Jonathan Tan, git Josh Steadmon <steadmon@google.com> writes: >> +--index-pack-args=<args>:: >> + For internal use only. The command to run on the contents of the >> + downloaded pack. Arguments are URL-encoded separated by spaces. > > I'm a bit skeptical of using URL encoding to work around embedded > spaces. I believe in Emily's config-based hooks series, she wrote an > argument parser to pull repeated arguments into a strvec, could you do > something like that here? > > I'm sympathetic to the idea that since this is an internal-only flag, we > can be a bit weird with the argument format, though. We tend to prefer quote.c::sq_quote*() suite of quoting; does this codepath have very different constraints that require different encoding? Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 2/4] http-fetch: allow custom index-pack args 2021-02-16 22:57 ` Junio C Hamano @ 2021-02-17 19:46 ` Jonathan Tan 0 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-02-17 19:46 UTC (permalink / raw) To: gitster; +Cc: steadmon, jonathantanmy, git > Josh Steadmon <steadmon@google.com> writes: > > >> +--index-pack-args=<args>:: > >> + For internal use only. The command to run on the contents of the > >> + downloaded pack. Arguments are URL-encoded separated by spaces. > > > > I'm a bit skeptical of using URL encoding to work around embedded > > spaces. I believe in Emily's config-based hooks series, she wrote an > > argument parser to pull repeated arguments into a strvec, could you do > > something like that here? > > > > I'm sympathetic to the idea that since this is an internal-only flag, we > > can be a bit weird with the argument format, though. > > We tend to prefer quote.c::sq_quote*() suite of quoting; does this > codepath have very different constraints that require different > encoding? My main issue was that I needed to join arbitrary strings and then split them, which is why I URL-encoded them (so that they would no longer contain spaces) and then used spaces as the "join" separator. With Josh's suggestion, I wouldn't need any sort of encoding or quoting, so I think I'll use that. ^ permalink raw reply [flat|nested] 229+ messages in thread
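For readers following the encoding discussion, here is a minimal self-contained sketch of the "percent-encode each argument, join with spaces, then split and decode" scheme described above. It is an illustration only, not the code from the patch: `is_unreserved()`, `encode_and_join()`, `decode_in_place()` and `split_and_decode()` are hypothetical helpers written for this example and do not use Git's strbuf/strvec API.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the RFC 3986 "unreserved" set (ASCII only). */
static int is_unreserved(unsigned char c)
{
	return (c < 0x80 && isalnum(c)) ||
	       c == '-' || c == '.' || c == '_' || c == '~';
}

/*
 * Percent-encode every argument and join them with single spaces.
 * Because spaces inside an argument become "%20", a literal space in
 * the result can only be a separator. The caller frees the result.
 */
static char *encode_and_join(const char **args, size_t nr)
{
	size_t i, len = 1;
	char *out, *p;

	for (i = 0; i < nr; i++)
		len += 3 * strlen(args[i]) + 1; /* worst case: all bytes -> %XX */
	out = p = malloc(len);
	if (!out)
		return NULL;
	for (i = 0; i < nr; i++) {
		const unsigned char *s = (const unsigned char *)args[i];

		if (i)
			*p++ = ' ';
		for (; *s; s++) {
			if (is_unreserved(*s))
				*p++ = (char)*s;
			else
				p += sprintf(p, "%%%02X", (unsigned)*s);
		}
	}
	*p = '\0';
	return out;
}

/* Decode one percent-encoded token in place. */
static void decode_in_place(char *s)
{
	char *w = s;

	while (*s) {
		if (s[0] == '%' && isxdigit((unsigned char)s[1]) &&
		    isxdigit((unsigned char)s[2])) {
			char hex[3] = { s[1], s[2], '\0' };

			*w++ = (char)strtol(hex, NULL, 16);
			s += 3;
		} else {
			*w++ = *s++;
		}
	}
	*w = '\0';
}

/*
 * Split "buf" on spaces and decode each token. Tokens point into "buf",
 * which is modified. Returns the number of tokens stored in "out".
 * Limitation of this sketch: empty arguments are lost, because strtok()
 * collapses consecutive separators.
 */
static size_t split_and_decode(char *buf, char **out, size_t max)
{
	size_t n = 0;
	char *tok = strtok(buf, " ");

	while (tok && n < max) {
		decode_in_place(tok);
		out[n++] = tok;
		tok = strtok(NULL, " ");
	}
	return n;
}
```

Joining `index-pack`, `--keep=fetch-pack 123 on host` and `--stdin` this way yields one string with exactly two separator spaces, and splitting it returns the three original arguments. Passing the option once per argument, as suggested in the review, avoids the encoding step altogether.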
* [PATCH 3/4] fetch-pack: with packfile URIs, use index-pack arg 2021-01-24 2:34 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Jonathan Tan 2021-01-24 2:34 ` [PATCH 1/4] http: allow custom index-pack args Jonathan Tan 2021-01-24 2:34 ` [PATCH 2/4] http-fetch: " Jonathan Tan @ 2021-01-24 2:34 ` Jonathan Tan 2021-01-24 2:34 ` [PATCH 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan ` (2 subsequent siblings) 5 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-24 2:34 UTC (permalink / raw) To: git; +Cc: Jonathan Tan Unify the index-pack arguments used when processing the inline pack and when downloading packfiles referenced by URIs. This is done by teaching get_pack() to also store the index-pack arguments whenever at least one packfile URI is given, and then when processing the packfile URI(s), using the stored arguments. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- fetch-pack.c | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 274ae602f7..fe69635eb5 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -797,12 +797,13 @@ static void write_promisor_file(const char *keep_name, } /* - * Pass 1 as "only_packfile" if the pack received is the only pack in this - * fetch request (that is, if there were no packfile URIs provided). + * If packfile URIs were provided, pass a non-NULL pointer to index_pack_args. + * The string to pass as the --index-pack-args argument to http-fetch will be + * stored there. (It must be freed by the caller.) 
*/ static int get_pack(struct fetch_pack_args *args, int xd[2], struct string_list *pack_lockfiles, - int only_packfile, + char **index_pack_args, struct ref **sought, int nr_sought) { struct async demux; @@ -845,7 +846,7 @@ static int get_pack(struct fetch_pack_args *args, strvec_push(&cmd.args, alternate_shallow_file); } - if (do_keep || args->from_promisor) { + if (do_keep || args->from_promisor || index_pack_args) { if (pack_lockfiles) cmd.out = -1; cmd_name = "index-pack"; @@ -863,7 +864,7 @@ static int get_pack(struct fetch_pack_args *args, "--keep=fetch-pack %"PRIuMAX " on %s", (uintmax_t)getpid(), hostname); } - if (only_packfile && args->check_self_contained_and_connected) + if (!index_pack_args && args->check_self_contained_and_connected) strvec_push(&cmd.args, "--check-self-contained-and-connected"); else /* @@ -901,7 +902,7 @@ static int get_pack(struct fetch_pack_args *args, : transfer_fsck_objects >= 0 ? transfer_fsck_objects : 0) { - if (args->from_promisor || !only_packfile) + if (args->from_promisor || index_pack_args) /* * We cannot use --strict in index-pack because it * checks both broken objects and links, but we only @@ -913,6 +914,19 @@ static int get_pack(struct fetch_pack_args *args, fsck_msg_types.buf); } + if (index_pack_args) { + struct strbuf joined = STRBUF_INIT; + int i; + + for (i = 0; i < cmd.args.nr; i++) { + if (i) + strbuf_addch(&joined, ' '); + strbuf_addstr_urlencode(&joined, cmd.args.v[i], + is_rfc3986_unreserved); + } + *index_pack_args = strbuf_detach(&joined, NULL); + } + cmd.in = demux.out; cmd.git_cmd = 1; if (start_command(&cmd)) @@ -1084,7 +1098,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, alternate_shallow_file = setup_temporary_shallow(si->shallow); else alternate_shallow_file = NULL; - if (get_pack(args, fd, pack_lockfiles, 1, sought, nr_sought)) + if (get_pack(args, fd, pack_lockfiles, NULL, sought, nr_sought)) die(_("git fetch-pack: fetch failed.")); all_done: @@ -1535,6 +1549,7 @@ static 
struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, int seen_ack = 0; struct string_list packfile_uris = STRING_LIST_INIT_DUP; int i; + char *index_pack_args = NULL; negotiator = &negotiator_alloc; fetch_negotiator_init(r, negotiator); @@ -1624,7 +1639,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, receive_packfile_uris(&reader, &packfile_uris); process_section_header(&reader, "packfile", 0); if (get_pack(args, fd, pack_lockfiles, - !packfile_uris.nr, sought, nr_sought)) + packfile_uris.nr ? &index_pack_args : NULL, + sought, nr_sought)) die(_("git fetch-pack: fetch failed.")); do_check_stateless_delimiter(args, &reader); @@ -1645,7 +1661,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, strvec_pushf(&cmd.args, "--packfile=%.*s", (int) the_hash_algo->hexsz, packfile_uris.items[i].string); - strvec_push(&cmd.args, "--index-pack-args=index-pack --stdin --keep"); + strvec_pushf(&cmd.args, "--index-pack-args=%s", index_pack_args); strvec_push(&cmd.args, uri); cmd.git_cmd = 1; cmd.no_stdin = 1; @@ -1681,6 +1697,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, packname)); } string_list_clear(&packfile_uris, 0); + FREE_AND_NULL(index_pack_args); if (negotiator) negotiator->release(negotiator); -- 2.30.0.280.ga3ce27912f-goog ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-24 2:34 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Jonathan Tan ` (2 preceding siblings ...) 2021-01-24 2:34 ` [PATCH 3/4] fetch-pack: with packfile URIs, use index-pack arg Jonathan Tan @ 2021-01-24 2:34 ` Jonathan Tan 2021-01-24 7:56 ` Junio C Hamano ` (3 more replies) 2021-01-24 6:29 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Junio C Hamano 2021-02-18 23:34 ` Junio C Hamano 5 siblings, 4 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-24 2:34 UTC (permalink / raw) To: git; +Cc: Jonathan Tan Teach index-pack to print dangling .gitmodules links after its "keep" or "pack" line instead of declaring an error, and teach fetch-pack to check such lines printed. This allows the tree side of the .gitmodules link to be in one packfile and the blob side to be in another without failing the fsck check, because it is now fetch-pack which checks such objects after all packfiles have been downloaded and indexed (and not index-pack on an individual packfile, as it is before this commit). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- Documentation/git-index-pack.txt | 7 ++- builtin/index-pack.c | 9 +++- builtin/receive-pack.c | 2 +- fetch-pack.c | 78 +++++++++++++++++++++++++++----- fsck.c | 16 +++++-- fsck.h | 8 ++++ pack-write.c | 8 +++- pack.h | 2 +- t/t5702-protocol-v2.sh | 47 +++++++++++++++++++ 9 files changed, 155 insertions(+), 22 deletions(-) diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt index af0c26232c..e74a4a1eda 100644 --- a/Documentation/git-index-pack.txt +++ b/Documentation/git-index-pack.txt @@ -78,7 +78,12 @@ OPTIONS Die if the pack contains broken links. For internal use only. --fsck-objects:: - Die if the pack contains broken objects. For internal use only. + For internal use only. ++ +Die if the pack contains broken objects. 
If the pack contains a tree +pointing to a .gitmodules blob that does not exist, prints the hash of +that blob (for the caller to check) after the hash that goes into the +name of the pack/idx file (see "Notes"). --threads=<n>:: Specifies the number of threads to spawn when resolving diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 557bd2f348..f995c15115 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1888,8 +1888,13 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) else close(input_fd); - if (do_fsck_object && fsck_finish(&fsck_options)) - die(_("fsck error in pack objects")); + if (do_fsck_object) { + struct fsck_options fo = FSCK_OPTIONS_STRICT; + + fo.print_dangling_gitmodules = 1; + if (fsck_finish(&fo)) + die(_("fsck error in pack objects")); + } free(objects); strbuf_release(&index_name_buf); diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index d49d050e6e..ed2c9b42e9 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -2275,7 +2275,7 @@ static const char *unpack(int err_fd, struct shallow_info *si) status = start_command(&child); if (status) return "index-pack fork failed"; - pack_lockfile = index_pack_lockfile(child.out); + pack_lockfile = index_pack_lockfile(child.out, NULL); close(child.out); status = finish_command(&child); if (status) diff --git a/fetch-pack.c b/fetch-pack.c index fe69635eb5..128362e0ba 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -796,6 +796,26 @@ static void write_promisor_file(const char *keep_name, strbuf_release(&promisor_name); } +static void parse_gitmodules_oids(int fd, struct oidset *gitmodules_oids) +{ + int len = the_hash_algo->hexsz + 1; /* hash + NL */ + + do { + char hex_hash[GIT_MAX_HEXSZ + 1]; + int read_len = read_in_full(fd, hex_hash, len); + struct object_id oid; + const char *end; + + if (!read_len) + return; + if (read_len != len) + die("invalid length read %d", read_len); + if (parse_oid_hex(hex_hash, &oid, &end) || *end != '\n') + 
die("invalid hash"); + oidset_insert(gitmodules_oids, &oid); + } while (1); +} + /* * If packfile URIs were provided, pass a non-NULL pointer to index_pack_args. * The string to pass as the --index-pack-args argument to http-fetch will be @@ -804,7 +824,8 @@ static void write_promisor_file(const char *keep_name, static int get_pack(struct fetch_pack_args *args, int xd[2], struct string_list *pack_lockfiles, char **index_pack_args, - struct ref **sought, int nr_sought) + struct ref **sought, int nr_sought, + struct oidset *gitmodules_oids) { struct async demux; int do_keep = args->keep_pack; @@ -812,6 +833,7 @@ static int get_pack(struct fetch_pack_args *args, struct pack_header header; int pass_header = 0; struct child_process cmd = CHILD_PROCESS_INIT; + int fsck_objects = 0; int ret; memset(&demux, 0, sizeof(demux)); @@ -846,8 +868,15 @@ static int get_pack(struct fetch_pack_args *args, strvec_push(&cmd.args, alternate_shallow_file); } - if (do_keep || args->from_promisor || index_pack_args) { - if (pack_lockfiles) + if (fetch_fsck_objects >= 0 + ? fetch_fsck_objects + : transfer_fsck_objects >= 0 + ? transfer_fsck_objects + : 0) + fsck_objects = 1; + + if (do_keep || args->from_promisor || index_pack_args || fsck_objects) { + if (pack_lockfiles || fsck_objects) cmd.out = -1; cmd_name = "index-pack"; strvec_push(&cmd.args, cmd_name); @@ -897,11 +926,7 @@ static int get_pack(struct fetch_pack_args *args, strvec_pushf(&cmd.args, "--pack_header=%"PRIu32",%"PRIu32, ntohl(header.hdr_version), ntohl(header.hdr_entries)); - if (fetch_fsck_objects >= 0 - ? fetch_fsck_objects - : transfer_fsck_objects >= 0 - ? 
transfer_fsck_objects - : 0) { + if (fsck_objects) { if (args->from_promisor || index_pack_args) /* * We cannot use --strict in index-pack because it @@ -931,10 +956,15 @@ static int get_pack(struct fetch_pack_args *args, cmd.git_cmd = 1; if (start_command(&cmd)) die(_("fetch-pack: unable to fork off %s"), cmd_name); - if (do_keep && pack_lockfiles) { - char *pack_lockfile = index_pack_lockfile(cmd.out); + if (do_keep && (pack_lockfiles || fsck_objects)) { + int is_well_formed; + char *pack_lockfile = index_pack_lockfile(cmd.out, &is_well_formed); + + if (!is_well_formed) + die(_("fetch-pack: invalid index-pack output")); if (pack_lockfile) string_list_append_nodup(pack_lockfiles, pack_lockfile); + parse_gitmodules_oids(cmd.out, gitmodules_oids); close(cmd.out); } @@ -969,6 +999,22 @@ static int cmp_ref_by_name(const void *a_, const void *b_) return strcmp(a->name, b->name); } +static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) +{ + struct oidset_iter iter; + const struct object_id *oid; + struct fsck_options fo = FSCK_OPTIONS_STRICT; + + if (!oidset_size(gitmodules_oids)) + return; + + oidset_iter_init(gitmodules_oids, &iter); + while ((oid = oidset_iter_next(&iter))) + register_found_gitmodules(oid); + if (fsck_finish(&fo)) + die("fsck failed"); +} + static struct ref *do_fetch_pack(struct fetch_pack_args *args, int fd[2], const struct ref *orig_ref, @@ -983,6 +1029,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, int agent_len; struct fetch_negotiator negotiator_alloc; struct fetch_negotiator *negotiator; + struct oidset gitmodules_oids = OIDSET_INIT; negotiator = &negotiator_alloc; fetch_negotiator_init(r, negotiator); @@ -1098,8 +1145,10 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, alternate_shallow_file = setup_temporary_shallow(si->shallow); else alternate_shallow_file = NULL; - if (get_pack(args, fd, pack_lockfiles, NULL, sought, nr_sought)) + if (get_pack(args, fd, pack_lockfiles, NULL, sought, 
nr_sought, + &gitmodules_oids)) die(_("git fetch-pack: fetch failed.")); + fsck_gitmodules_oids(&gitmodules_oids); all_done: if (negotiator) @@ -1550,6 +1599,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, struct string_list packfile_uris = STRING_LIST_INIT_DUP; int i; char *index_pack_args = NULL; + struct oidset gitmodules_oids = OIDSET_INIT; negotiator = &negotiator_alloc; fetch_negotiator_init(r, negotiator); @@ -1640,7 +1690,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, process_section_header(&reader, "packfile", 0); if (get_pack(args, fd, pack_lockfiles, packfile_uris.nr ? &index_pack_args : NULL, - sought, nr_sought)) + sought, nr_sought, &gitmodules_oids)) die(_("git fetch-pack: fetch failed.")); do_check_stateless_delimiter(args, &reader); @@ -1680,6 +1730,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, packname[the_hash_algo->hexsz] = '\0'; + parse_gitmodules_oids(cmd.out, &gitmodules_oids); + close(cmd.out); if (finish_command(&cmd)) @@ -1699,6 +1751,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, string_list_clear(&packfile_uris, 0); FREE_AND_NULL(index_pack_args); + fsck_gitmodules_oids(&gitmodules_oids); + if (negotiator) negotiator->release(negotiator); diff --git a/fsck.c b/fsck.c index f82e2fe9e3..04f3d342af 100644 --- a/fsck.c +++ b/fsck.c @@ -1243,6 +1243,11 @@ int fsck_error_function(struct fsck_options *o, return 1; } +void register_found_gitmodules(const struct object_id *oid) +{ + oidset_insert(&gitmodules_found, oid); +} + int fsck_finish(struct fsck_options *options) { int ret = 0; @@ -1262,10 +1267,13 @@ int fsck_finish(struct fsck_options *options) if (!buf) { if (is_promisor_object(oid)) continue; - ret |= report(options, - oid, OBJ_BLOB, - FSCK_MSG_GITMODULES_MISSING, - "unable to read .gitmodules blob"); + if (options->print_dangling_gitmodules) + printf("%s\n", oid_to_hex(oid)); + else + ret |= report(options, + oid, OBJ_BLOB, + 
FSCK_MSG_GITMODULES_MISSING, + "unable to read .gitmodules blob"); continue; } diff --git a/fsck.h b/fsck.h index 69cf715e79..4b8cf03445 100644 --- a/fsck.h +++ b/fsck.h @@ -41,6 +41,12 @@ struct fsck_options { int *msg_type; struct oidset skiplist; kh_oid_map_t *object_names; + + /* + * If 1, print the hashes of missing .gitmodules blobs instead of + * considering them to be errors. + */ + unsigned print_dangling_gitmodules:1; }; #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } @@ -62,6 +68,8 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); +void register_found_gitmodules(const struct object_id *oid); + /* * Some fsck checks are context-dependent, and may end up queued; run this * after completing all fsck_object() calls in order to resolve any remaining diff --git a/pack-write.c b/pack-write.c index 3513665e1e..f66ea8e5a1 100644 --- a/pack-write.c +++ b/pack-write.c @@ -272,7 +272,7 @@ void fixup_pack_header_footer(int pack_fd, fsync_or_die(pack_fd, pack_name); } -char *index_pack_lockfile(int ip_out) +char *index_pack_lockfile(int ip_out, int *is_well_formed) { char packname[GIT_MAX_HEXSZ + 6]; const int len = the_hash_algo->hexsz + 6; @@ -286,11 +286,17 @@ char *index_pack_lockfile(int ip_out) */ if (read_in_full(ip_out, packname, len) == len && packname[len-1] == '\n') { const char *name; + + if (is_well_formed) + *is_well_formed = 1; packname[len-1] = 0; if (skip_prefix(packname, "keep\t", &name)) return xstrfmt("%s/pack/pack-%s.keep", get_object_directory(), name); + return NULL; } + if (is_well_formed) + *is_well_formed = 0; return NULL; } diff --git a/pack.h b/pack.h index 9fc0945ac9..09cffec395 100644 --- a/pack.h +++ b/pack.h @@ -85,7 +85,7 @@ int verify_pack_index(struct packed_git *); int verify_pack(struct repository *, struct packed_git *, verify_fn fn, struct progress *, uint32_t); off_t 
write_pack_header(struct hashfile *f, uint32_t); void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t); -char *index_pack_lockfile(int fd); +char *index_pack_lockfile(int fd, int *is_well_formed); /* * The "hdr" output buffer should be at least this big, which will handle sizes diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index 7d5b17909b..8b8fb43dbc 100755 --- a/t/t5702-protocol-v2.sh +++ b/t/t5702-protocol-v2.sh @@ -936,6 +936,53 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails on bad object' test_i18ngrep "invalid author/committer line - missing email" error ' +test_expect_success 'packfile-uri with transfer.fsckobjects succeeds when .gitmodules is separate from tree' ' + P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && + rm -rf "$P" http_child && + + git init "$P" && + git -C "$P" config "uploadpack.allowsidebandall" "true" && + + echo "[submodule libfoo]" >"$P/.gitmodules" && + echo "path = include/foo" >>"$P/.gitmodules" && + echo "url = git://example.com/git/lib.git" >>"$P/.gitmodules" && + git -C "$P" add .gitmodules && + git -C "$P" commit -m x && + + configure_exclusion "$P" .gitmodules >h && + + sane_unset GIT_TEST_SIDEBAND_ALL && + git -c protocol.version=2 -c transfer.fsckobjects=1 \ + -c fetch.uriprotocols=http,https \ + clone "$HTTPD_URL/smart/http_parent" http_child && + + # Ensure that there are exactly 4 files (2 .pack and 2 .idx). 
+ ls http_child/.git/objects/pack/* >filelist && + test_line_count = 4 filelist +' + +test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodules separate from tree is invalid' ' + P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && + rm -rf "$P" http_child err && + + git init "$P" && + git -C "$P" config "uploadpack.allowsidebandall" "true" && + + echo "[submodule \"..\"]" >"$P/.gitmodules" && + echo "path = include/foo" >>"$P/.gitmodules" && + echo "url = git://example.com/git/lib.git" >>"$P/.gitmodules" && + git -C "$P" add .gitmodules && + git -C "$P" commit -m x && + + configure_exclusion "$P" .gitmodules >h && + + sane_unset GIT_TEST_SIDEBAND_ALL && + test_must_fail git -c protocol.version=2 -c transfer.fsckobjects=1 \ + -c fetch.uriprotocols=http,https \ + clone "$HTTPD_URL/smart/http_parent" http_child 2>err && + test_i18ngrep "disallowed submodule name" err +' + # DO NOT add non-httpd-specific tests here, because the last part of this # test script is only executed when httpd is available and enabled. -- 2.30.0.280.ga3ce27912f-goog ^ permalink raw reply related [flat|nested] 229+ messages in thread
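The patch above has fetch-pack read the dangling .gitmodules hashes that index-pack prints after its "keep"/"pack" line, one fixed-width hex hash per line. A rough standalone sketch of that kind of record parsing follows. It works on an in-memory buffer rather than a file descriptor, and `HEXSZ` and `parse_hash_lines()` are names invented for this example (Git itself uses `the_hash_algo->hexsz` and `parse_oid_hex()`), so treat it as an approximation of the idea, not the patch's implementation.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumption for this sketch: SHA-1 sized hashes (40 hex digits). */
#define HEXSZ 40

/*
 * Parse a buffer consisting only of fixed-width records, each made of
 * HEXSZ hex digits followed by '\n'. Every hash is copied, NUL-terminated,
 * into out[i]. Returns the number of records, or -1 if the input is
 * truncated, contains a non-hex character, lacks the newline, or holds
 * more than "max" records.
 */
static int parse_hash_lines(const char *buf, size_t len,
			    char out[][HEXSZ + 1], int max)
{
	const size_t rec = HEXSZ + 1; /* hash + NL */
	int n = 0;

	while (len) {
		size_t i;

		if (len < rec || n == max)
			return -1;
		for (i = 0; i < HEXSZ; i++)
			if (!isxdigit((unsigned char)buf[i]))
				return -1;
		if (buf[HEXSZ] != '\n')
			return -1;
		memcpy(out[n], buf, HEXSZ);
		out[n][HEXSZ] = '\0';
		n++;
		buf += rec;
		len -= rec;
	}
	return n;
}
```

An empty buffer is valid and yields zero records, which corresponds to the common case where index-pack found no dangling .gitmodules link.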
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-24 2:34 ` [PATCH 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan @ 2021-01-24 7:56 ` Junio C Hamano 2021-01-26 1:57 ` Junio C Hamano 2021-01-24 12:18 ` Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 3 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-01-24 7:56 UTC (permalink / raw) To: Jonathan Tan; +Cc: git Jonathan Tan <jonathantanmy@google.com> writes: > diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh > index 7d5b17909b..8b8fb43dbc 100755 > ... > + sane_unset GIT_TEST_SIDEBAND_ALL && > + git -c protocol.version=2 -c transfer.fsckobjects=1 \ > + -c fetch.uriprotocols=http,https \ > + clone "$HTTPD_URL/smart/http_parent" http_child && > + > + # Ensure that there are exactly 4 files (2 .pack and 2 .idx). Ehh, please don't. We may add multi-pack-index there, or perhaps reverse index files in the future. If you care about having two packs logically because you are exercising the out-of-band prepackaged packfile plus the dynamic transfer, make sure you have two packs (and probably the idx files that go with them). Don't assume there will be one .idx each for them *AND* nothing else there. > + ls http_child/.git/objects/pack/* >filelist && > + test_line_count = 4 filelist > +' IOW, d=http_child/.git/objects/pack/ ls "$d"/*.pack "$d"/*.idx >filelist && test_line_count = 4 filelist or something like that. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-24 7:56 ` Junio C Hamano @ 2021-01-26 1:57 ` Junio C Hamano 2021-01-28 1:04 ` Jonathan Tan 0 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-01-26 1:57 UTC (permalink / raw) To: Jonathan Tan; +Cc: git Junio C Hamano <gitster@pobox.com> writes: > Jonathan Tan <jonathantanmy@google.com> writes: > >> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh >> index 7d5b17909b..8b8fb43dbc 100755 >> ... >> + sane_unset GIT_TEST_SIDEBAND_ALL && >> + git -c protocol.version=2 -c transfer.fsckobjects=1 \ >> + -c fetch.uriprotocols=http,https \ >> + clone "$HTTPD_URL/smart/http_parent" http_child && >> + >> + # Ensure that there are exactly 4 files (2 .pack and 2 .idx). > > Ehh, please don't. We may add multi-pack-index there, or perhaps > reverse index files in the future. If you care about having two > packs logically because you are exercising the out-of-band > prepackaged packfile plus the dynamic transfer, make sure you have > two packs (and probably the idx files that go with them). Don't > assume there will be one .idx each for them *AND* nothing else > there. > >> + ls http_child/.git/objects/pack/* >filelist && >> + test_line_count = 4 filelist >> +' > > IOW, > > d=http_child/.git/objects/pack/ > ls "$d"/*.pack "$d"/*.idx >filelist && > test_line_count = 4 filelist > > or something like that. FYI, I have the following queued to make the tip of 'seen' pass the tests. ---- >8 -------- >8 -------- >8 -------- >8 -------- >8 -------- >8 ---- From: Junio C Hamano <gitster@pobox.com> Date: Mon, 25 Jan 2021 17:27:10 -0800 Subject: [PATCH] SQUASH??? 
test fix --- t/t5702-protocol-v2.sh | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index 8b8fb43dbc..b1bc73a9a9 100755 --- a/t/t5702-protocol-v2.sh +++ b/t/t5702-protocol-v2.sh @@ -847,8 +847,9 @@ test_expect_success 'part of packfile response provided as URI' ' test -f hfound && test -f h2found && - # Ensure that there are exactly 6 files (3 .pack and 3 .idx). - ls http_child/.git/objects/pack/* >filelist && + # Ensure that there are exactly 3 packfiles with associated .idx + ls http_child/.git/objects/pack/*.pack \ + http_child/.git/objects/pack/*.idx >filelist && test_line_count = 6 filelist ' @@ -901,8 +902,9 @@ test_expect_success 'packfile-uri with transfer.fsckobjects' ' -c fetch.uriprotocols=http,https \ clone "$HTTPD_URL/smart/http_parent" http_child && - # Ensure that there are exactly 4 files (2 .pack and 2 .idx). - ls http_child/.git/objects/pack/* >filelist && + # Ensure that there are exactly 2 packfiles with associated .idx + ls http_child/.git/objects/pack/*.pack \ + http_child/.git/objects/pack/*.idx >filelist && test_line_count = 4 filelist ' @@ -956,8 +958,9 @@ test_expect_success 'packfile-uri with transfer.fsckobjects succeeds when .gitmo -c fetch.uriprotocols=http,https \ clone "$HTTPD_URL/smart/http_parent" http_child && - # Ensure that there are exactly 4 files (2 .pack and 2 .idx). - ls http_child/.git/objects/pack/* >filelist && + # Ensure that there are exactly 2 packfiles with associated .idx + ls http_child/.git/objects/pack/*.pack \ + http_child/.git/objects/pack/*.idx >filelist && test_line_count = 4 filelist ' -- 2.30.0-509-gbbf2750a06 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-26 1:57 ` Junio C Hamano @ 2021-01-28 1:04 ` Jonathan Tan 0 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-01-28 1:04 UTC (permalink / raw) To: gitster; +Cc: jonathantanmy, git > > Ehh, please don't. We may add multi-pack-index there, or perhaps > > reverse index files in the future. If you care about having two > > packs logically because you are exercising the out-of-band > > prepackaged packfile plus the dynamic transfer, make sure you have > > two packs (and probably the idx files that go with them). Don't > > assume there will be one .idx each for them *AND* nothing else > > there. > > > >> + ls http_child/.git/objects/pack/* >filelist && > >> + test_line_count = 4 filelist > >> +' > > > > IOW, > > > > d=http_child/.git/objects/pack/ > > ls "$d"/*.pack "$d"/*.idx >filelist && > > test_line_count = 4 filelist > > > > or something like that. > > FYI, I have the following queued to make the tip of 'seen' pass the > tests. [snip] OK - I'll include these changes in the next version. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-24 2:34 ` [PATCH 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan 2021-01-24 7:56 ` Junio C Hamano @ 2021-01-24 12:18 ` Ævar Arnfjörð Bjarmason 2021-01-28 1:03 ` Jonathan Tan 2021-01-24 12:30 ` Ævar Arnfjörð Bjarmason 2021-02-17 19:27 ` Ævar Arnfjörð Bjarmason 3 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-01-24 12:18 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On Sun, Jan 24 2021, Jonathan Tan wrote: > +void register_found_gitmodules(const struct object_id *oid) > +{ > + oidset_insert(&gitmodules_found, oid); > +} > + In fsck.c we only use this variable to insert into it, or in fsck_blob() to do the actual check, but then we either abort early if we've found it, or right after that: if (object_on_skiplist(options, oid)) return 0; So (along with comments I have below...) you could just use the existing "skiplist" option instead, no? > int fsck_finish(struct fsck_options *options) > { > int ret = 0; > @@ -1262,10 +1267,13 @@ int fsck_finish(struct fsck_options *options) > if (!buf) { > if (is_promisor_object(oid)) > continue; > - ret |= report(options, > - oid, OBJ_BLOB, > - FSCK_MSG_GITMODULES_MISSING, > - "unable to read .gitmodules blob"); > + if (options->print_dangling_gitmodules) > + printf("%s\n", oid_to_hex(oid)); > + else > + ret |= report(options, > + oid, OBJ_BLOB, > + FSCK_MSG_GITMODULES_MISSING, > + "unable to read .gitmodules blob"); > continue; > } > > diff --git a/fsck.h b/fsck.h > index 69cf715e79..4b8cf03445 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -41,6 +41,12 @@ struct fsck_options { > int *msg_type; > struct oidset skiplist; > kh_oid_map_t *object_names; > + > + /* > + * If 1, print the hashes of missing .gitmodules blobs instead of > + * considering them to be errors. 
> + */ > + unsigned print_dangling_gitmodules:1; > }; > > #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } > @@ -62,6 +68,8 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); > int fsck_object(struct object *obj, void *data, unsigned long size, > struct fsck_options *options); > > +void register_found_gitmodules(const struct object_id *oid); > + > /* > * Some fsck checks are context-dependent, and may end up queued; run this > * after completing all fsck_object() calls in order to resolve any remaining This whole thing seems just like the bad path I took in earlier rounds of my in-flight mktag series. You don't need this new custom API. You just setup an error handler for your fsck which ignores / prints / logs / whatever the OIDs you want if you get a FSCK_MSG_GITMODULES_MISSING error, which you then "return 0" on. If you don't have FSCK_MSG_GITMODULES_MISSING punt and call fsck_error_function(). ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-24 12:18 ` Ævar Arnfjörð Bjarmason @ 2021-01-28 1:03 ` Jonathan Tan 2021-02-17 1:48 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-01-28 1:03 UTC (permalink / raw) To: avarab; +Cc: jonathantanmy, git > On Sun, Jan 24 2021, Jonathan Tan wrote: > > > +void register_found_gitmodules(const struct object_id *oid) > > +{ > > + oidset_insert(&gitmodules_found, oid); > > +} > > + > > In fsck.c we only use this variable to insert into it, or in fsck_blob() > to do the actual check, but then we either abort early if we've found > it, or right after that: By "this variable", do you mean gitmodules_found? fsck_finish() consumes it. > if (object_on_skiplist(options, oid)) > return 0; > > So (along with comments I have below...) you could just use the existing > "skiplist" option instead, no? I don't understand this part (in particular, the part you quoted). About "skiplist", I'll reply to your other email [1] which has more details. [1] https://lore.kernel.org/git/87czxu7c15.fsf@evledraar.gmail.com/ > This whole thing seems just like the bad path I took in earlier rounds > of my in-flight mktag series. You don't need this new custom API. You > just setup an error handler for your fsck which ignores / prints / logs > / whatever the OIDs you want if you get a FSCK_MSG_GITMODULES_MISSING > error, which you then "return 0" on. > > If you don't have FSCK_MSG_GITMODULES_MISSING punt and call > fsck_error_function(). I tried that first, and the issue is that IDs like FSCK_MSG_GITMODULES_MISSING are internal to fsck.c. As for whether we should start exposing the IDs publicly, I think we should wait until a few new cases like this come up, so that we more fully understand the requirements first. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 1:48 UTC
To: Jonathan Tan
Cc: git

On Thu, Jan 28 2021, Jonathan Tan wrote:

Sorry I managed to miss this at the time. Hopefully a late reply is
better than never.

>> On Sun, Jan 24 2021, Jonathan Tan wrote:
>>
>> > +void register_found_gitmodules(const struct object_id *oid)
>> > +{
>> > +	oidset_insert(&gitmodules_found, oid);
>> > +}
>> > +
>>
>> In fsck.c we only use this variable to insert into it, or in fsck_blob()
>> to do the actual check, but then we either abort early if we've found
>> it, or right after that:
>
> By "this variable", do you mean gitmodules_found? fsck_finish() consumes
> it.

Yes, consumes it to emit errors with report(), no?

>> 	if (object_on_skiplist(options, oid))
>> 		return 0;
>>
>> So (along with comments I have below...) you could just use the existing
>> "skiplist" option instead, no?
>
> I don't understand this part (in particular, the part you quoted). About
> "skiplist", I'll reply to your other email [1] which has more details.
>
> [1] https://lore.kernel.org/git/87czxu7c15.fsf@evledraar.gmail.com/

*nod*

>> This whole thing seems just like the bad path I took in earlier rounds
>> of my in-flight mktag series. You don't need this new custom API. You
>> just setup an error handler for your fsck which ignores / prints / logs
>> / whatever the OIDs you want if you get a FSCK_MSG_GITMODULES_MISSING
>> error, which you then "return 0" on.
>>
>> If you don't have FSCK_MSG_GITMODULES_MISSING punt and call
>> fsck_error_function().
>
> I tried that first, and the issue is that IDs like
> FSCK_MSG_GITMODULES_MISSING are internal to fsck.c. As for whether we
> should start exposing the IDs publicly, I think we should wait until a
> few new cases like this come up, so that we more fully understand the
> requirements first.

The requirement is that you want the object IDs we'd otherwise error
about in fsck_finish().

Yeah, we don't pass the "fsck_msg_id" down in the "report()" function,
but you can reliably strstr() it out of the message. We document & hard
rely on that already, since it's also a config key.

But yeah, we could just change the report function to pass down the id
and move the relevant macros from fsck.c to fsck.h.

I think that would be a smaller change conceptually than a special-case
flag in fsck_options for something we could otherwise do with the error
reporting.
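A minimal, self-contained sketch of the idea in the message above: the message IDs come from a single X-macro list that can live in a header, the reporting helper passes the ID to the error callback, and a caller-supplied callback swallows one specific ID while delegating everything else to the default handler. Every demo_* and DEMO_* name here is a hypothetical stand-in and not git's fsck.h API; the three message IDs and their severities are only illustrative.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; this is not git's fsck.h. */
enum demo_msg_type {
	DEMO_ERROR = 1,
	DEMO_WARN
};

/* One list drives both the ID enum and the default-severity table. */
#define FOREACH_DEMO_MSG_ID(FUNC) \
	FUNC(GITMODULES_MISSING, ERROR) \
	FUNC(GITMODULES_SYMLINK, ERROR) \
	FUNC(EMPTY_NAME, WARN)

#define MSG_ID(id, msg_type) DEMO_MSG_##id,
enum demo_msg_id {
	FOREACH_DEMO_MSG_ID(MSG_ID)
	DEMO_MSG_MAX
};
#undef MSG_ID

#define MSG_ID(id, msg_type) DEMO_##msg_type,
static const enum demo_msg_type demo_default_type[DEMO_MSG_MAX] = {
	FOREACH_DEMO_MSG_ID(MSG_ID)
};
#undef MSG_ID

struct demo_options;

/* The callback receives the message ID, so it need not parse the text. */
typedef int (*demo_error_fn)(struct demo_options *o, const char *oid_hex,
			     enum demo_msg_type msg_type,
			     enum demo_msg_id msg_id, const char *message);

struct demo_options {
	demo_error_fn error_func;
	int dangling;	/* count of missing .gitmodules blobs seen */
};

/* Default callback: warnings pass, anything else fails the check. */
int demo_error_function(struct demo_options *o, const char *oid_hex,
			enum demo_msg_type msg_type,
			enum demo_msg_id msg_id, const char *message)
{
	(void)o;
	(void)msg_id;
	fprintf(stderr, "%s in %s: %s\n",
		msg_type == DEMO_WARN ? "warning" : "error", oid_hex, message);
	return msg_type == DEMO_WARN ? 0 : 1;
}

/* Custom callback: print dangling .gitmodules IDs instead of failing. */
int demo_dangling_error_func(struct demo_options *o, const char *oid_hex,
			     enum demo_msg_type msg_type,
			     enum demo_msg_id msg_id, const char *message)
{
	if (msg_id == DEMO_MSG_GITMODULES_MISSING) {
		puts(oid_hex);
		o->dangling++;
		return 0;
	}
	return demo_error_function(o, oid_hex, msg_type, msg_id, message);
}

/* Reporting helper: looks up the default severity and forwards the ID. */
int demo_report(struct demo_options *o, const char *oid_hex,
		enum demo_msg_id msg_id, const char *message)
{
	assert(msg_id < DEMO_MSG_MAX);
	return o->error_func(o, oid_hex, demo_default_type[msg_id],
			     msg_id, message);
}
```

In this sketch a caller opts in by setting `error_func` to `demo_dangling_error_func` before running the checks, so no special-case flag is needed in the options struct and the reporting path itself stays generic.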
* [PATCH 00/14] fsck: API improvements 2021-02-17 1:48 ` Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 ` Ævar Arnfjörð Bjarmason 2021-02-17 21:02 ` Junio C Hamano ` (11 more replies) 2021-02-17 19:42 ` [PATCH 01/14] fsck.h: indent arguments to of fsck_set_msg_type Ævar Arnfjörð Bjarmason ` (14 subsequent siblings) 15 siblings, 12 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Jonathan Tan pointed out that the fsck error_func doesn't pass you the ID of the fsck failure in [1]. This series improves the API so it does, and moves the gitmodules_{found,done} variables into the fsck_options struct. The result is that instead of the "print_dangling_gitmodules" member in that series we can just implement that with the diff at the end of this cover letter (goes on top of a merge of this series & "seen"), and without any changes to fsck_finish(). This conflicts with other in-flight fsck changes but the conflict is rather trivial. Jeff King has another concurrent series to add a couple of new fsck checks, those need to be moved to fsck.h, and there's another trivial conflict in 2 hunks due to the gitmodules_{found,done} move. 1. 
https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/ Ævar Arnfjörð Bjarmason (14): fsck.h: indent arguments to of fsck_set_msg_type fsck.h: use use "enum object_type" instead of "int" fsck.c: rename variables in fsck_set_msg_type() for less confusion fsck.c: move definition of msg_id into append_msg_id() fsck.c: rename remaining fsck_msg_id "id" to "msg_id" fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum fsck.c: call parse_msg_type() early in fsck_set_msg_type() fsck.c: undefine temporary STR macro after use fsck.c: give "FOREACH_MSG_ID" a more specific name fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h fsck.c: pass along the fsck_msg_id in the fsck_error callback fsck.c: add an fsck_set_msg_type() API that takes enums fsck.h: update FSCK_OPTIONS_* for object_name fsck.c: move gitmodules_{found,done} into fsck_options builtin/fsck.c | 7 +- builtin/index-pack.c | 3 +- builtin/mktag.c | 7 +- builtin/unpack-objects.c | 3 +- fsck.c | 160 ++++++++++++--------------------------- fsck.h | 98 +++++++++++++++++++++--- 6 files changed, 152 insertions(+), 126 deletions(-) -- diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 82f381f854..22dfcfc5de 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1713,6 +1713,20 @@ static void show_pack_info(int stat_only) } } +static int index_pack_fsck_error_func(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) +{ + if (msg_id == FSCK_MSG_GITMODULES_MISSING) { + puts(oid_to_hex(oid)); + return 0; + } + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); +} + int cmd_index_pack(int argc, const char **argv, const char *prefix) { int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; @@ -1934,10 +1948,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) close(input_fd); if (do_fsck_object) { - struct 
fsck_options fo = FSCK_OPTIONS_STRICT; - - fo.print_dangling_gitmodules = 1; - if (fsck_finish(&fo)) + fsck_options.error_func = index_pack_fsck_error_func; + if (fsck_finish(&fsck_options)) die(_("fsck error in pack objects")); } diff --git a/fetch-pack.c b/fetch-pack.c index 0a337a04f1..9fc2ce86e4 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -40,6 +40,7 @@ static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; +static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; /* Remember to update object flag allocation in object.h */ #define COMPLETE (1U << 0) @@ -993,19 +994,34 @@ static int cmp_ref_by_name(const void *a_, const void *b_) return strcmp(a->name, b->name); } +static int fetch_pack_fsck_error_func(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) +{ + if (msg_id == FSCK_MSG_GITMODULES_MISSING) { + puts(oid_to_hex(oid)); + return 0; + } + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); +} + static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) { struct oidset_iter iter; const struct object_id *oid; - struct fsck_options fo = FSCK_OPTIONS_STRICT; if (!oidset_size(gitmodules_oids)) return; oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(oid); - if (fsck_finish(&fo)) + oidset_insert(&fsck_options.gitmodules_found, oid); + + fsck_options.error_func = fetch_pack_fsck_error_func; + if (fsck_finish(&fsck_options)) die("fsck failed"); } 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH 00/14] fsck: API improvements
From: Junio C Hamano @ 2021-02-17 21:02 UTC
To: Ævar Arnfjörð Bjarmason
Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Jonathan Tan pointed out that the fsck error_func doesn't pass you the
> ID of the fsck failure in [1]. This series improves the API so it
> does, and moves the gitmodules_{found,done} variables into the
> fsck_options struct.
>
> The result is that instead of the "print_dangling_gitmodules" member
> in that series we can just implement that with the diff at the end of
> this cover letter (goes on top of a merge of this series & "seen"),
> and without any changes to fsck_finish().
>
> This conflicts with other in-flight fsck changes but the conflict is
> rather trivial. Jeff King has another concurrent series to add a
> couple of new fsck checks, those need to be moved to fsck.h, and
> there's another trivial conflict in 2 hunks due to the
> gitmodules_{found,done} move.
>
> 1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/

Let's get this reviewed now, but with expectation that it will be
rebased after the dust settles.
* Re: [PATCH 00/14] fsck: API improvements
From: Ævar Arnfjörð Bjarmason @ 2021-02-18 0:00 UTC
To: Junio C Hamano
Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan

On Wed, Feb 17 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> Jonathan Tan pointed out that the fsck error_func doesn't pass you the
>> ID of the fsck failure in [1]. This series improves the API so it
>> does, and moves the gitmodules_{found,done} variables into the
>> fsck_options struct.
>>
>> The result is that instead of the "print_dangling_gitmodules" member
>> in that series we can just implement that with the diff at the end of
>> this cover letter (goes on top of a merge of this series & "seen"),
>> and without any changes to fsck_finish().
>>
>> This conflicts with other in-flight fsck changes but the conflict is
>> rather trivial. Jeff King has another concurrent series to add a
>> couple of new fsck checks, those need to be moved to fsck.h, and
>> there's another trivial conflict in 2 hunks due to the
>> gitmodules_{found,done} move.
>>
>> 1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/
>
> Let's get this reviewed now, but with expectation that it will be
> rebased after the dust settles.

Makes sense. Pending a review of this would you be interested in queuing
a v2 of this that doesn't conflict with in-flight topics?

Patches 01..09 & 13/14 can live conflict-free with what's in "seen" now
(I'd have made the 13th the 10th in v1 if I'd noticed). Then I could
re-roll the remainder of this once the other topics land.
* Re: [PATCH 00/14] fsck: API improvements
From: Junio C Hamano @ 2021-02-18 19:12 UTC
To: Ævar Arnfjörð Bjarmason
Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> Let's get this reviewed now, but with expectation that it will be
>> rebased after the dust settles.
>
> Makes sense. Pending a review of this would you be interested in queuing
> a v2 of this that doesn't conflict with in-flight topics?

Not really. I am not sure your recent patches are getting
sufficient review bandwidth they deserve.

> Patches 01..09 & 13/14 can live conflict-free with what's in "seen" now
> (I'd have made the 13th the 10th in v1 if I'd noticed). Then I could
> re-roll the remainder of this once the other topics land.
* Re: [PATCH 00/14] fsck: API improvements
From: Jeff King @ 2021-02-18 19:57 UTC
To: Junio C Hamano
Cc: Ævar Arnfjörð Bjarmason, git, Johannes Schindelin, Jonathan Tan

On Thu, Feb 18, 2021 at 11:12:26AM -0800, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> >> Let's get this reviewed now, but with expectation that it will be
> >> rebased after the dust settles.
> >
> > Makes sense. Pending a review of this would you be interested in queuing
> > a v2 of this that doesn't conflict with in-flight topics?
>
> Not really. I am not sure your recent patches are getting
> sufficient review bandwidth they deserve.

FWIW, I just read through v2 (without having looked at all at v1 yet!),
and they all seemed like quite reasonable cleanups. I left a few small
comments that might be worth a quick re-roll, but I would also be OK
with the patches being picked up as-is.

-Peff
* Re: [PATCH 00/14] fsck: API improvements
From: Junio C Hamano @ 2021-02-18 20:27 UTC
To: Jeff King
Cc: Ævar Arnfjörð Bjarmason, git, Johannes Schindelin, Jonathan Tan

Jeff King <peff@peff.net> writes:

> On Thu, Feb 18, 2021 at 11:12:26AM -0800, Junio C Hamano wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>> >> Let's get this reviewed now, but with expectation that it will be
>> >> rebased after the dust settles.
>> >
>> > Makes sense. Pending a review of this would you be interested in queuing
>> > a v2 of this that doesn't conflict with in-flight topics?
>>
>> Not really. I am not sure your recent patches are getting
>> sufficient review bandwidth they deserve.
>
> FWIW, I just read through v2 (without having looked at all at v1 yet!),
> and they all seemed like quite reasonable cleanups. I left a few small
> comments that might be worth a quick re-roll, but I would also be OK
> with the patches being picked up as-is.

That's good to hear. I shouldn't even have bothered to answer the
question, if the v2 were to have been sent to the list without waiting
for my reply ;-)
* Re: [PATCH 00/14] fsck: API improvements
From: Ævar Arnfjörð Bjarmason @ 2021-02-19 0:54 UTC
To: Junio C Hamano
Cc: Jeff King, git, Johannes Schindelin, Jonathan Tan

On Thu, Feb 18 2021, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
>
>> On Thu, Feb 18, 2021 at 11:12:26AM -0800, Junio C Hamano wrote:
>>
>>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>>
>>> >> Let's get this reviewed now, but with expectation that it will be
>>> >> rebased after the dust settles.
>>> >
>>> > Makes sense. Pending a review of this would you be interested in queuing
>>> > a v2 of this that doesn't conflict with in-flight topics?
>>>
>>> Not really. I am not sure your recent patches are getting
>>> sufficient review bandwidth they deserve.
>>
>> FWIW, I just read through v2 (without having looked at all at v1 yet!),
>> and they all seemed like quite reasonable cleanups. I left a few small
>> comments that might be worth a quick re-roll, but I would also be OK
>> with the patches being picked up as-is.
>
> That's good to hear. I shouldn't even have bothered to answer the
> question, if the v2 were to have sent to the list without waiting
> for my reply ;-)

FWIW it's not that I didn't care about the reply, but I'm somewhat
intermittently available time/network wise in the coming days. And
there's the TZ difference between us.

I sent v1 thinking you might be willing to pick it up & resolve the
conflict, but since you expressed an interest in deferring it until
conflicting work landed, I figured I'd ask (and then just sent the
patches) if you'd be interested in a conflict-free version to queue
alongside those changes.

If it was still "nah", fair enough, I'd just wait. But if not, those
patches would be there to pick up.

Thanks a lot to you & Jeff for the review on v2. I won't have time to
address all that today, and in any case I got the message that maybe I
should stop firehosing the list with patch series for a bit :)
* Re: [PATCH 00/14] fsck: API improvements
From: Junio C Hamano @ 2021-02-18 22:36 UTC
To: Ævar Arnfjörð Bjarmason, Jeff King
Cc: git, Johannes Schindelin, Jonathan Tan

Jeff King <peff@peff.net> writes:

> On Thu, Feb 18, 2021 at 11:12:26AM -0800, Junio C Hamano wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>> >> Let's get this reviewed now, but with expectation that it will be
>> >> rebased after the dust settles.
>> >
>> > Makes sense. Pending a review of this would you be interested in queuing
>> > a v2 of this that doesn't conflict with in-flight topics?
>>
>> Not really. I am not sure your recent patches are getting
>> sufficient review bandwidth they deserve.
>
> FWIW, I just read through v2 (without having looked at all at v1 yet!),
> and they all seemed like quite reasonable cleanups. I left a few small
> comments that might be worth a quick re-roll, but I would also be OK
> with the patches being picked up as-is.

Yeah, all except for a handful of minor nits looked good. Thanks for
writing and reviewing.

Perhaps a final reroll to tie the loose ends, or is it just a matter
of signing off one of them and dropping a couple of other ones (which
other ones)?
* [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason 2021-02-17 21:02 ` Junio C Hamano @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 22:19 ` Junio C Hamano ` (23 more replies) 2021-02-18 10:58 ` [PATCH v2 01/10] fsck.h: indent arguments to of fsck_set_msg_type Ævar Arnfjörð Bjarmason ` (9 subsequent siblings) 11 siblings, 24 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason As suggested in https://lore.kernel.org/git/87zh028ctp.fsf@evledraar.gmail.com/ a version of this that doesn't conflict with other in-flight topics. I can submit the rest later. Ævar Arnfjörð Bjarmason (10): fsck.h: indent arguments to of fsck_set_msg_type fsck.h: use "enum object_type" instead of "int" fsck.c: rename variables in fsck_set_msg_type() for less confusion fsck.c: move definition of msg_id into append_msg_id() fsck.c: rename remaining fsck_msg_id "id" to "msg_id" fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum fsck.c: call parse_msg_type() early in fsck_set_msg_type() fsck.c: undefine temporary STR macro after use fsck.c: give "FOREACH_MSG_ID" a more specific name fsck.h: update FSCK_OPTIONS_* for object_name builtin/fsck.c | 5 ++-- builtin/index-pack.c | 3 +- builtin/mktag.c | 3 +- builtin/unpack-objects.c | 3 +- fsck.c | 60 ++++++++++++++++++++-------------------- fsck.h | 26 +++++++++-------- 6 files changed, 54 insertions(+), 46 deletions(-) Range-diff: -: ----------- > 1: 88b347b74ed fsck.h: indent arguments to of fsck_set_msg_type 1: 1a60d65d2ca ! 
2: 868eac3d4d1 fsck.h: use use "enum object_type" instead of "int" @@ Metadata Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## Commit message ## - fsck.h: use use "enum object_type" instead of "int" + fsck.h: use "enum object_type" instead of "int" Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in 2: 24761f269b7 = 3: f599dc6c8f3 fsck.c: rename variables in fsck_set_msg_type() for less confusion 3: fb4c66f9305 = 4: 33f3b1942c1 fsck.c: move definition of msg_id into append_msg_id() 4: a129dbd9964 = 5: 28c9245e418 fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 5: d9bee41072e = 6: d25037c6f18 fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 6: 423568026c3 = 7: 66d0f1047cc fsck.c: call parse_msg_type() early in fsck_set_msg_type() 7: cb43e832738 = 8: 7643a5bf211 fsck.c: undefine temporary STR macro after use 8: 2cd14cb4e2a = 9: 7c64e2267ce fsck.c: give "FOREACH_MSG_ID" a more specific name 9: 1ada154ef23 < -: ----------- fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h 10: c4179445f22 < -: ----------- fsck.c: pass along the fsck_msg_id in the fsck_error callback 11: c1fc724f0e8 < -: ----------- fsck.c: add an fsck_set_msg_type() API that takes enums 12: 8de91fac068 = 10: a98a3512629 fsck.h: update FSCK_OPTIONS_* for object_name 13: 29ff97856ff < -: ----------- fsck.c: move gitmodules_{found,done} into fsck_options -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen')
From: Junio C Hamano @ 2021-02-18 22:19 UTC
To: Ævar Arnfjörð Bjarmason
Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> As suggested in
> https://lore.kernel.org/git/87zh028ctp.fsf@evledraar.gmail.com/ a
> version of this that doesn't conflict with other in-flight topics. I
> can submit the rest later.

And a bystander does not have a clue what this thing is about,
beyond that it tweaks the fsck API: how urgent is it, and what benefit
does it bring to us? That kind of thing is expected to be described
here.

The cover letter of v1 does not do a much better job, either, but is
it fair to understand that this primarily is about allowing the
callback functions (which handle the various problems the fsck
machinery finds) to learn which error was encountered, so that a
feature like the "enumerate missing .gitmodules blobs" one in 384c9d1c
(fetch-pack: print and use dangling .gitmodules, 2021-01-23) does not
have to be written by inserting very narrow custom code into the
general error reporting codepath, but can instead be written by
customizing the error reporting function?

If so, can we at least say something a bit more specific and
focused than the overly broad "API improvements"?

Thanks.
* [PATCH v3 00/22] fsck: API improvements
From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Now that jt/transfer-fsck-across-packs has been merged to master here's
a re-roll of v1[1]+v2[2] of this series. v2 was slimmed-down + had a
trivial typo fix, so I've done the range-diff against v1.

This makes the recent fetch-pack work use the fsck_msg_id API to
distinguish messages, and has other various cleanups and improvements
to make the fsck API easier to use in the future.

There's an easy merge conflict here with other in-flight changes to
fsck. I figured it was better to send this now than wait for those to
land.

1. https://lore.kernel.org/git/20210217194246.25342-1-avarab@gmail.com/
2.
https://lore.kernel.org/git/20210218105840.11989-1-avarab@gmail.com/ Ævar Arnfjörð Bjarmason (22): fsck.h: update FSCK_OPTIONS_* for object_name fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro fsck.h: indent arguments to of fsck_set_msg_type fsck.h: use "enum object_type" instead of "int" fsck.c: rename variables in fsck_set_msg_type() for less confusion fsck.c: move definition of msg_id into append_msg_id() fsck.c: rename remaining fsck_msg_id "id" to "msg_id" fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum fsck.h: re-order and re-assign "enum fsck_msg_type" fsck.c: call parse_msg_type() early in fsck_set_msg_type() fsck.c: undefine temporary STR macro after use fsck.c: give "FOREACH_MSG_ID" a more specific name fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h fsck.c: pass along the fsck_msg_id in the fsck_error callback fsck.c: add an fsck_set_msg_type() API that takes enums fsck.c: move gitmodules_{found,done} into fsck_options fetch-pack: don't needlessly copy fsck_options fetch-pack: use file-scope static struct for fsck_options fetch-pack: use new fsck API to printing dangling submodules Makefile | 1 + builtin/fsck.c | 7 +- builtin/index-pack.c | 30 ++----- builtin/mktag.c | 7 +- builtin/unpack-objects.c | 3 +- fetch-pack.c | 6 +- fsck-cb.c | 16 ++++ fsck.c | 175 ++++++++++++--------------------------- fsck.h | 132 ++++++++++++++++++++++++++--- 9 files changed, 211 insertions(+), 166 deletions(-) create mode 100644 fsck-cb.c Range-diff: 13: 8de91fac068 = 1: 9d809466bd1 fsck.h: update FSCK_OPTIONS_* for object_name -: ----------- > 2: 33e8b6d6545 fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} -: ----------- > 3: c23f7ce9e4a fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} -: ----------- > 4: 
5dde68df6c3 fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro 1: 88b347b74ed = 5: 7ae35a6e9d2 fsck.h: indent arguments to of fsck_set_msg_type 2: 1a60d65d2ca ! 6: dfb5f754b37 fsck.h: use use "enum object_type" instead of "int" @@ Metadata Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## Commit message ## - fsck.h: use use "enum object_type" instead of "int" + fsck.h: use "enum object_type" instead of "int" Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in 3: 24761f269b7 ! 7: fd58ec73c6b fsck.c: rename variables in fsck_set_msg_type() for less confusion @@ Commit message It was needlessly confusing that it took a "msg_type" argument, but then later declared another "msg_type" of a different type. - Let's rename that to "tmp", and rename "id" to "msg_id" and "msg_id" - to "msg_id_str" etc. This will make a follow-up change smaller. + Let's rename that to "severity", and rename "id" to "msg_id" and + "msg_id" to "msg_id_str" etc. This will make a follow-up change + smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> @@ fsck.c: int is_valid_msg_type(const char *msg_id, const char *msg_type) int i; - int *msg_type; - ALLOC_ARRAY(msg_type, FSCK_MSG_MAX); -+ int *tmp; -+ ALLOC_ARRAY(tmp, FSCK_MSG_MAX); ++ int *severity; ++ ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) - msg_type[i] = fsck_msg_type(i, options); - options->msg_type = msg_type; -+ tmp[i] = fsck_msg_type(i, options); -+ options->msg_type = tmp; ++ severity[i] = fsck_msg_type(i, options); ++ options->msg_type = severity; } - options->msg_type[id] = type; 4: fb4c66f9305 = 8: 48cb4d3bb70 fsck.c: move definition of msg_id into append_msg_id() 5: a129dbd9964 ! 9: 2c80ad32038 fsck.c: rename remaining fsck_msg_id "id" to "msg_id" @@ Commit message "msg_id". 
This change is relatively small, and is worth the churn for a later change where we have different id's in the "report" function. + Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> + ## fsck.c ## @@ fsck.c: void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); -: ----------- > 10: 92dfbdfb624 fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" 6: d9bee41072e ! 11: c1c476af69b fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum @@ Commit message - f27d05b1704 (fsck: allow upgrading fsck warnings to errors, 2015-06-22) + The reason these were defined in two different places is because we + use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are + used by external callbacks. + + Untangling that would take some more work, since we expose the new + "enum fsck_msg_type" to both. Similar to "enum object_type" it's not + worth structuring the API in such a way that only those who need + FSCK_{ERROR,WARN} pass around a different type. 
+ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## builtin/fsck.c ## @@ builtin/fsck.c: static int objerror(struct object *obj, const char *err) switch (msg_type) { case FSCK_WARN: + ## builtin/index-pack.c ## +@@ builtin/index-pack.c: static void show_pack_info(int stat_only) + static int print_dangling_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, +- int msg_type, const char *message) ++ enum fsck_msg_type msg_type, ++ const char *message) + { + /* + * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it + ## builtin/mktag.c ## @@ builtin/mktag.c: static int mktag_config(const char *var, const char *value, void *cb) static int mktag_fsck_error_func(struct fsck_options *o, @@ fsck.c: void list_config_fsck_msg_ids(struct string_list *list, const char *pref +static enum fsck_msg_type fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { -- int msg_type; -+ enum fsck_msg_type msg_type; - assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); + if (!options->msg_type) { +- int msg_type = msg_id_info[msg_id].msg_type; ++ enum fsck_msg_type msg_type = msg_id_info[msg_id].msg_type; + + if (options->strict && msg_type == FSCK_WARN) + msg_type = FSCK_ERROR; @@ fsck.c: static int fsck_msg_type(enum fsck_msg_id msg_id, - return msg_type; + return options->msg_type[msg_id]; } -static int parse_msg_type(const char *str) @@ fsck.c: void fsck_set_msg_type(struct fsck_options *options, if (!options->msg_type) { int i; -- int *tmp; -+ enum fsck_msg_type *tmp; - ALLOC_ARRAY(tmp, FSCK_MSG_MAX); +- int *severity; ++ enum fsck_msg_type *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) - tmp[i] = fsck_msg_type(i, options); + severity[i] = fsck_msg_type(i, options); @@ fsck.c: static int report(struct fsck_options *options, { va_list ap; @@ fsck.h -#define FSCK_ERROR 1 -#define FSCK_WARN 2 -#define FSCK_IGNORE 3 -- +enum fsck_msg_type { -+ FSCK_INFO = -2, ++ FSCK_INFO = -2, + 
FSCK_FATAL = -1, + FSCK_ERROR = 1, + FSCK_WARN, + FSCK_IGNORE +}; + struct fsck_options; struct object; - @@ fsck.h: typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, -: ----------- > 12: d55587719a5 fsck.h: re-order and re-assign "enum fsck_msg_type" 7: 423568026c3 = 13: 32828d1c78c fsck.c: call parse_msg_type() early in fsck_set_msg_type() 8: cb43e832738 = 14: 5c62066235c fsck.c: undefine temporary STR macro after use 9: 2cd14cb4e2a = 15: f8e50fbf7d3 fsck.c: give "FOREACH_MSG_ID" a more specific name 10: 1ada154ef23 ! 16: cd74dee8769 fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h @@ fsck.c ## fsck.h ## @@ fsck.h: enum fsck_msg_type { FSCK_WARN, - FSCK_IGNORE }; -+ + +#define FOREACH_FSCK_MSG_ID(FUNC) \ + /* fatal errors */ \ + FUNC(NUL_IN_HEADER, FATAL) \ 11: c4179445f22 ! 17: 234e287d081 fsck.c: pass along the fsck_msg_id in the fsck_error callback @@ builtin/fsck.c: static int objerror(struct object *obj, const char *err) switch (msg_type) { case FSCK_WARN: + ## builtin/index-pack.c ## +@@ builtin/index-pack.c: static int print_dangling_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, ++ enum fsck_msg_id msg_id, + const char *message) + { + /* +@@ builtin/index-pack.c: static int print_dangling_gitmodules(struct fsck_options *o, + printf("%s\n", oid_to_hex(oid)); + return 0; + } +- return fsck_error_function(o, oid, object_type, msg_type, message); ++ return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); + } + + int cmd_index_pack(int argc, const char **argv, const char *prefix) + ## builtin/mktag.c ## @@ builtin/mktag.c: static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, 12: c1fc724f0e8 ! 
18: 8049dc07391 fsck.c: add an fsck_set_msg_type() API that takes enums @@ fsck.c: int is_valid_msg_type(const char *msg_id, const char *msg_type) +{ + if (!options->msg_type) { + int i; -+ enum fsck_msg_type *tmp; -+ ALLOC_ARRAY(tmp, FSCK_MSG_MAX); ++ enum fsck_msg_type *severity; ++ ALLOC_ARRAY(severity, FSCK_MSG_MAX); + for (i = 0; i < FSCK_MSG_MAX; i++) -+ tmp[i] = fsck_msg_type(i, options); -+ options->msg_type = tmp; ++ severity[i] = fsck_msg_type(i, options); ++ options->msg_type = severity; + } + + options->msg_type[msg_id] = msg_type; @@ fsck.c: void fsck_set_msg_type(struct fsck_options *options, - if (!options->msg_type) { - int i; -- enum fsck_msg_type *tmp; -- ALLOC_ARRAY(tmp, FSCK_MSG_MAX); +- enum fsck_msg_type *severity; +- ALLOC_ARRAY(severity, FSCK_MSG_MAX); - for (i = 0; i < FSCK_MSG_MAX; i++) -- tmp[i] = fsck_msg_type(i, options); -- options->msg_type = tmp; +- severity[i] = fsck_msg_type(i, options); +- options->msg_type = severity; - } - - options->msg_type[msg_id] = msg_type; 14: 29ff97856ff ! 19: 4224a29d15c fsck.c: move gitmodules_{found,done} into fsck_options @@ Commit message fsck_options struct. It makes sense to keep all the context in the same place. + This requires changing the recently added register_found_gitmodules() + function added in 5476e1efde (fetch-pack: print and use dangling + .gitmodules, 2021-02-22) to take fsck_options. That function will be + removed in a subsequent commit, but as it'll require the new + gitmodules_found attribute of "fsck_options" we need this intermediate + step first. 
+ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> + ## fetch-pack.c ## +@@ fetch-pack.c: static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) + + oidset_iter_init(gitmodules_oids, &iter); + while ((oid = oidset_iter_next(&iter))) +- register_found_gitmodules(oid); ++ register_found_gitmodules(&fo, oid); + if (fsck_finish(&fo)) + die("fsck failed"); + } + ## fsck.c ## @@ #include "credential.h" @@ fsck.c: static int fsck_blob(const struct object_id *oid, const char *buf, if (object_on_skiplist(options, oid)) return 0; +@@ fsck.c: int fsck_error_function(struct fsck_options *o, + return 1; + } + +-void register_found_gitmodules(const struct object_id *oid) ++void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) + { +- oidset_insert(&gitmodules_found, oid); ++ oidset_insert(&options->gitmodules_found, oid); + } + + int fsck_finish(struct fsck_options *options) @@ fsck.c: int fsck_finish(struct fsck_options *options) struct oidset_iter iter; const struct object_id *oid; @@ fsck.h: struct fsck_options { kh_oid_map_t *object_names; }; --#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } --#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } -+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, OIDSET_INIT, OIDSET_INIT, NULL } -+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, OIDSET_INIT, OIDSET_INIT, NULL } +@@ fsck.h: struct fsck_options { + .walk = NULL, \ + .msg_type = NULL, \ + .skiplist = OIDSET_INIT, \ ++ .gitmodules_found = OIDSET_INIT, \ ++ .gitmodules_done = OIDSET_INIT, \ + .object_names = NULL, + #define FSCK_OPTIONS_COMMON_ERROR_FUNC \ + FSCK_OPTIONS_COMMON \ +@@ fsck.h: int fsck_walk(struct object *obj, void *data, struct fsck_options *options); + int fsck_object(struct object *obj, void *data, unsigned long size, + struct fsck_options *options); + +-void 
register_found_gitmodules(const struct object_id *oid); ++void register_found_gitmodules(struct fsck_options *options, ++ const struct object_id *oid); - /* descend in all linked child objects - * the return value is: + /* + * fsck a tag, and pass info about it back to the caller. This is -: ----------- > 20: 40b13468129 fetch-pack: don't needlessly copy fsck_options -: ----------- > 21: 8e418abfbd7 fetch-pack: use file-scope static struct for fsck_options -: ----------- > 22: 113de190f7d fetch-pack: use new fsck API to printing dangling submodules -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply [flat|nested] 229+ messages in thread
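For readers following the range-diff above, the severity handling it touches can be sketched in isolation. This is a minimal standalone illustration, not the code from fsck.c: the enum values are the ones shown in the range-diff, but every name prefixed with demo_ or DEMO_ (the message list, the options struct, both functions) is hypothetical and only mirrors the shape of fsck_msg_type() and fsck_set_msg_type(). It assumes the behavior visible in the diff: warnings are promoted to errors in strict mode, and the per-message override table is allocated lazily on the first override.

```c
#include <assert.h>
#include <stdlib.h>

/* Severity levels, with the values shown in the range-diff above. */
enum fsck_msg_type {
	FSCK_INFO = -2,
	FSCK_FATAL = -1,
	FSCK_ERROR = 1,
	FSCK_WARN,
	FSCK_IGNORE
};

/* Hypothetical, shortened message list; not git's real message IDs. */
enum demo_msg_id {
	DEMO_MSG_BROKEN_HEADER,
	DEMO_MSG_BAD_ENTRY,
	DEMO_MSG_ODD_STYLE,
	DEMO_MSG_MAX
};

/* Hypothetical default severities for the demo messages. */
static const enum fsck_msg_type demo_defaults[DEMO_MSG_MAX] = {
	[DEMO_MSG_BROKEN_HEADER] = FSCK_FATAL,
	[DEMO_MSG_BAD_ENTRY] = FSCK_ERROR,
	[DEMO_MSG_ODD_STYLE] = FSCK_WARN,
};

/* Hypothetical stand-in for "struct fsck_options". */
struct demo_options {
	int strict;
	enum fsck_msg_type *msg_type; /* NULL until the first override */
};

/* Look up the effective severity of one message. */
enum fsck_msg_type demo_msg_type(enum demo_msg_id msg_id,
				 const struct demo_options *options)
{
	assert(msg_id >= 0 && msg_id < DEMO_MSG_MAX);

	if (!options->msg_type) {
		enum fsck_msg_type msg_type = demo_defaults[msg_id];

		/* Strict mode promotes warnings to errors. */
		if (options->strict && msg_type == FSCK_WARN)
			msg_type = FSCK_ERROR;
		return msg_type;
	}

	return options->msg_type[msg_id];
}

/* Override one severity, snapshotting the defaults on first use. */
void demo_set_msg_type(struct demo_options *options,
		       enum demo_msg_id msg_id,
		       enum fsck_msg_type msg_type)
{
	if (!options->msg_type) {
		int i;
		enum fsck_msg_type *severity;

		severity = malloc(sizeof(*severity) * DEMO_MSG_MAX);
		if (!severity)
			abort();
		/* The table is still NULL here, so defaults are used. */
		for (i = 0; i < DEMO_MSG_MAX; i++)
			severity[i] = demo_msg_type(i, options);
		options->msg_type = severity;
	}

	options->msg_type[msg_id] = msg_type;
}
```

Typing the table and the locals as "enum fsck_msg_type" rather than "int" does not change behavior; it mainly lets the compiler and the reader see which values are valid severities.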
* Re: [PATCH v3 00/22] fsck: API improvements
From: Junio C Hamano @ 2021-03-07 23:04 UTC
To: Ævar Arnfjörð Bjarmason
Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Now that jt/transfer-fsck-across-packs has been merged to master
> here's a re-roll of v1[1]+v2[2] of this series.

Unfortunately, this is not a good time to review or help with any work
on this series, as the base topic introduced an unpleasant regression
and probably needs to either gain a band-aid or, in the worst case, be
reverted; of course, help resolving the issues on that topic would be
appreciated ;-)

Thanks.
* Re: [PATCH v3 00/22] fsck: API improvements
From: Ævar Arnfjörð Bjarmason @ 2021-03-08 9:16 UTC
To: Junio C Hamano
Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan

On Mon, Mar 08 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> Now that jt/transfer-fsck-across-packs has been merged to master
>> here's a re-roll of v1[1]+v2[2] of this series.
>
> Unfortunately, this is not a good time to review or help with any work
> on this series, as the base topic introduced an unpleasant regression
> and probably needs to either gain a band-aid or, in the worst case, be
> reverted; of course, help resolving the issues on that topic would be
> appreciated ;-)

I should have mentioned: I saw the bug & proposed fix thread for that.
I see that 2aec3bc4b64 (fetch-pack: do not mix --pack_header and
packfile uri, 2021-03-04) is now merged down to next.

My reading of that thread is that the reported bug is solved, but
perhaps we're not 100% happy with the solution?

In any case, that patch does not conflict with this series, and all
tests pass with/without the two merged together.

I don't foresee an issue with the two stepping on each other's toes,
since I'm just modifying the rather low-level fsck interface of
spewing out .gitmodules entries, not touching the logic of what's then
done with that information...
* [PATCH v4 00/22] fsck: API improvements 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason 2021-03-07 23:04 ` Junio C Hamano @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 19:35 ` Derrick Stolee ` (20 more replies) 2021-03-16 16:17 ` [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason ` (21 subsequent siblings) 23 siblings, 21 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason A re-send of a rebased v3, which I sent at: http://lore.kernel.org/git/20210306110439.27694-1-avarab@gmail.com as seen in the range-diff there are no changes since v3. I'm just sending this as a post-release bump of this, per https://lore.kernel.org/git/xmqqy2etczqi.fsf@gitster.g/ Ævar Arnfjörð Bjarmason (22): fsck.h: update FSCK_OPTIONS_* for object_name fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro fsck.h: indent arguments to of fsck_set_msg_type fsck.h: use "enum object_type" instead of "int" fsck.c: rename variables in fsck_set_msg_type() for less confusion fsck.c: move definition of msg_id into append_msg_id() fsck.c: rename remaining fsck_msg_id "id" to "msg_id" fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum fsck.h: re-order and re-assign "enum fsck_msg_type" fsck.c: call parse_msg_type() early in fsck_set_msg_type() fsck.c: undefine temporary STR macro after use fsck.c: give "FOREACH_MSG_ID" a more specific name fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h fsck.c: pass along the fsck_msg_id in the fsck_error callback fsck.c: add an fsck_set_msg_type() API that takes enums fsck.c: move gitmodules_{found,done} into fsck_options 
fetch-pack: don't needlessly copy fsck_options fetch-pack: use file-scope static struct for fsck_options fetch-pack: use new fsck API to printing dangling submodules Makefile | 1 + builtin/fsck.c | 7 +- builtin/index-pack.c | 30 ++----- builtin/mktag.c | 7 +- builtin/unpack-objects.c | 3 +- fetch-pack.c | 6 +- fsck-cb.c | 16 ++++ fsck.c | 175 ++++++++++++--------------------------- fsck.h | 132 ++++++++++++++++++++++++++--- 9 files changed, 211 insertions(+), 166 deletions(-) create mode 100644 fsck-cb.c Range-diff: 1: 9d809466bd = 1: 9cd942b526 fsck.h: update FSCK_OPTIONS_* for object_name 2: 33e8b6d654 = 2: d67966b838 fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 3: c23f7ce9e4 = 3: 211472e0c5 fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} 4: 5dde68df6c = 4: 70afee988d fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro 5: 7ae35a6e9d = 5: 1337d53352 fsck.h: indent arguments to of fsck_set_msg_type 6: dfb5f754b3 = 6: e4ef107bb4 fsck.h: use "enum object_type" instead of "int" 7: fd58ec73c6 = 7: 20bac3207e fsck.c: rename variables in fsck_set_msg_type() for less confusion 8: 48cb4d3bb7 = 8: 09c3bba9e9 fsck.c: move definition of msg_id into append_msg_id() 9: 2c80ad3203 = 9: 8067df53a2 fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 10: 92dfbdfb62 = 10: bdf5e13f3d fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" 11: c1c476af69 = 11: b03caa237f fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 12: d55587719a = 12: 7b1d13b4cc fsck.h: re-order and re-assign "enum fsck_msg_type" 13: 32828d1c78 = 13: a8e4ca7b19 fsck.c: call parse_msg_type() early in fsck_set_msg_type() 14: 5c62066235 = 14: 214c375a20 fsck.c: undefine temporary STR macro after use 15: f8e50fbf7d = 15: 19a2499a80 fsck.c: give "FOREACH_MSG_ID" a more specific name 16: cd74dee876 = 16: 6e1a7b6274 fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h 17: 234e287d08 = 17: 42af4e164c fsck.c: pass along the fsck_msg_id in the 
fsck_error callback 18: 8049dc0739 = 18: fa47f473a8 fsck.c: add an fsck_set_msg_type() API that takes enums 19: 4224a29d15 = 19: 4cc3880cc4 fsck.c: move gitmodules_{found,done} into fsck_options 20: 40b1346812 = 20: fd219d318a fetch-pack: don't needlessly copy fsck_options 21: 8e418abfbd = 21: e4cd8c250e fetch-pack: use file-scope static struct for fsck_options 22: 113de190f7 = 22: fdbc3c304c fetch-pack: use new fsck API to printing dangling submodules -- 2.31.0.260.g719c683c1d ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v4 00/22] fsck: API improvements 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason @ 2021-03-16 19:35 ` Derrick Stolee 2021-03-17 18:20 ` [PATCH v5 00/19] " Ævar Arnfjörð Bjarmason ` (19 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Derrick Stolee @ 2021-03-16 19:35 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan On 3/16/2021 12:17 PM, Ævar Arnfjörð Bjarmason wrote: > A re-send of a rebased v3, which I sent at: > http://lore.kernel.org/git/20210306110439.27694-1-avarab@gmail.com as > seen in the range-diff there are no changes since v3. I'm just sending > this as a post-release bump of this, per > https://lore.kernel.org/git/xmqqy2etczqi.fsf@gitster.g/ > > Ævar Arnfjörð Bjarmason (22): > fsck.h: update FSCK_OPTIONS_* for object_name > fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} > fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} > fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro > fsck.h: indent arguments to of fsck_set_msg_type > fsck.h: use "enum object_type" instead of "int" > fsck.c: rename variables in fsck_set_msg_type() for less confusion > fsck.c: move definition of msg_id into append_msg_id() > fsck.c: rename remaining fsck_msg_id "id" to "msg_id" > fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" > fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum > fsck.h: re-order and re-assign "enum fsck_msg_type" > fsck.c: call parse_msg_type() early in fsck_set_msg_type() > fsck.c: undefine temporary STR macro after use > fsck.c: give "FOREACH_MSG_ID" a more specific name > fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h > fsck.c: pass along the fsck_msg_id in the fsck_error callback > fsck.c: add an fsck_set_msg_type() API that takes enums > fsck.c: move gitmodules_{found,done} into fsck_options > fetch-pack: don't needlessly copy fsck_options > fetch-pack: use 
file-scope static struct for fsck_options
> fetch-pack: use new fsck API to printing dangling submodules

This series is carefully organized and motivated. It was quite easy to
read. My complaints were minor.

One was that patches 1-4 seemed to be unnecessarily granular. I'm not
sure that having four patches like that will be more helpful for
inspecting the history in the future. But, I don't care enough to say
this should be re-rolled.

Finally, the last issue is that fsck-cb.c is loosely justified with
only one method inside. If you have plans in the near future to add
similar methods there, then I think that is fine. Otherwise, it would
be simpler to avoid the extra file and code move.

Thanks,
-Stolee
* [PATCH v5 00/19] fsck: API improvements
From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason

A v5 with changes suggested by Derrick Stolee. Link to v4:
https://lore.kernel.org/git/20210316161738.30254-1-avarab@gmail.com/

Changes:

 * 1/19 is new, it's a simple refactoring of some git_config() code in
   fsck.c code I changed recently.

 * Squashed the first 4x patches of incrementally redefining two
   macros into one.

 * Squashed a whitespace-only change into another patch that changed
   the same code.

 * Got rid of fsck-cb.c, that one function just lives at the bottom of
   fsck.c now.
Ævar Arnfjörð Bjarmason (19): fsck.c: refactor and rename common config callback fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} fsck.h: use "enum object_type" instead of "int" fsck.c: rename variables in fsck_set_msg_type() for less confusion fsck.c: move definition of msg_id into append_msg_id() fsck.c: rename remaining fsck_msg_id "id" to "msg_id" fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum fsck.h: re-order and re-assign "enum fsck_msg_type" fsck.c: call parse_msg_type() early in fsck_set_msg_type() fsck.c: undefine temporary STR macro after use fsck.c: give "FOREACH_MSG_ID" a more specific name fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h fsck.c: pass along the fsck_msg_id in the fsck_error callback fsck.c: add an fsck_set_msg_type() API that takes enums fsck.c: move gitmodules_{found,done} into fsck_options fetch-pack: don't needlessly copy fsck_options fetch-pack: use file-scope static struct for fsck_options fetch-pack: use new fsck API to printing dangling submodules builtin/fsck.c | 14 ++- builtin/index-pack.c | 30 +----- builtin/mktag.c | 14 ++- builtin/unpack-objects.c | 3 +- fetch-pack.c | 6 +- fsck.c | 197 +++++++++++++++------------------------ fsck.h | 131 +++++++++++++++++++++++--- 7 files changed, 213 insertions(+), 182 deletions(-) Range-diff: 1: 9cd942b526 < -: ---------- fsck.h: update FSCK_OPTIONS_* for object_name 2: d67966b838 < -: ---------- fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 3: 211472e0c5 < -: ---------- fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} 4: 70afee988d < -: ---------- fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro 5: 1337d53352 < -: ---------- fsck.h: indent arguments to of fsck_set_msg_type -: ---------- > 1: fe33015e0d fsck.c: refactor and rename common config callback -: ---------- > 2: 72f2e53afa fsck.h: use designed initializers for 
FSCK_OPTIONS_{DEFAULT,STRICT} 6: e4ef107bb4 = 3: 237a280686 fsck.h: use "enum object_type" instead of "int" 7: 20bac3207e ! 4: 13b76c73dd fsck.c: rename variables in fsck_set_msg_type() for less confusion @@ Commit message "msg_id" to "msg_id_str" etc. This will make a follow-up change smaller. + While I'm at it properly indent the fsck_set_msg_type() argument list. + Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## fsck.c ## @@ fsck.c: int is_valid_msg_type(const char *msg_id, const char *msg_type) void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type) -+ const char *msg_id_str, const char *msg_type_str) ++ const char *msg_id_str, const char *msg_type_str) { - int id = parse_msg_id(msg_id), type; + int msg_id = parse_msg_id(msg_id_str), msg_type; @@ fsck.c: int is_valid_msg_type(const char *msg_id, const char *msg_type) } void fsck_set_msg_types(struct fsck_options *options, const char *values) + + ## fsck.h ## +@@ fsck.h: struct fsck_options; + struct object; + + void fsck_set_msg_type(struct fsck_options *options, +- const char *msg_id, const char *msg_type); ++ const char *msg_id, const char *msg_type); + void fsck_set_msg_types(struct fsck_options *options, const char *values); + int is_valid_msg_type(const char *msg_id, const char *msg_type); + 8: 09c3bba9e9 = 5: 4ae83403b7 fsck.c: move definition of msg_id into append_msg_id() 9: 8067df53a2 = 6: 82107f1dac fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 10: bdf5e13f3d = 7: 796096bf73 fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" 11: b03caa237f ! 
8: 3664abb23d fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum @@ builtin/index-pack.c: static void show_pack_info(int stat_only) * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it ## builtin/mktag.c ## -@@ builtin/mktag.c: static int mktag_config(const char *var, const char *value, void *cb) +@@ builtin/mktag.c: static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, @@ fsck.c: static int fsck_msg_type(enum fsck_msg_id msg_id, return FSCK_ERROR; @@ fsck.c: int is_valid_msg_type(const char *msg_id, const char *msg_type) void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id_str, const char *msg_type_str) + const char *msg_id_str, const char *msg_type_str) { - int msg_id = parse_msg_id(msg_id_str), msg_type; + int msg_id = parse_msg_id(msg_id_str); 12: 7b1d13b4cc = 9: 81e6d7ab45 fsck.h: re-order and re-assign "enum fsck_msg_type" 13: a8e4ca7b19 ! 10: 5c2e8e7b84 fsck.c: call parse_msg_type() early in fsck_set_msg_type() @@ Commit message ## fsck.c ## @@ fsck.c: void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id_str, const char *msg_type_str) + const char *msg_id_str, const char *msg_type_str) { int msg_id = parse_msg_id(msg_id_str); - enum fsck_msg_type msg_type; 14: 214c375a20 = 11: 7ffbf9af3f fsck.c: undefine temporary STR macro after use 15: 19a2499a80 = 12: 12ff0f75eb fsck.c: give "FOREACH_MSG_ID" a more specific name 16: 6e1a7b6274 = 13: 0c49dd5164 fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h 17: 42af4e164c = 14: 900263f503 fsck.c: pass along the fsck_msg_id in the fsck_error callback 18: fa47f473a8 ! 
15: 5f270e88a0 fsck.c: add an fsck_set_msg_type() API that takes enums @@ builtin/mktag.c: int cmd_mktag(int argc, const char **argv, const char *prefix) + fsck_set_msg_type_from_ids(&fsck_options, FSCK_MSG_EXTRA_HEADER_ENTRY, + FSCK_WARN); /* config might set fsck.extraHeaderEntry=* again */ - git_config(mktag_config, NULL); + git_config(git_fsck_config, &fsck_options); if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options, ## fsck.c ## @@ fsck.c: int is_valid_msg_type(const char *msg_id, const char *msg_type) +} + void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id_str, const char *msg_type_str) + const char *msg_id_str, const char *msg_type_str) { @@ fsck.c: void fsck_set_msg_type(struct fsck_options *options, if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) 19: 4cc3880cc4 = 16: 539d019712 fsck.c: move gitmodules_{found,done} into fsck_options 20: fd219d318a = 17: 1acf744236 fetch-pack: don't needlessly copy fsck_options 21: e4cd8c250e = 18: b47c3d5ac6 fetch-pack: use file-scope static struct for fsck_options 22: fdbc3c304c ! 19: f05fa5c3ec fetch-pack: use new fsck API to printing dangling submodules @@ Commit message manipulating the "gitmodules_found" member. A recent commit moved it into "fsck_options" so we could do this here. - Add a fsck-cb.c file similar to parse-options-cb.c, the alternative - would be to either define this directly in fsck.c as a public API, or - to create some library shared by fetch-pack.c ad builtin/index-pack. + I'm sticking this callback in fsck.c. Perhaps in the future we'd like + to accumulate such callbacks into another file (maybe fsck-cb.c, + similar to parse-options-cb.c?), but while we've got just the one + let's just put it into fsck.c. - I expect that there won't be many of these fsck utility functions in - the future, so just having a single fsck-cb.c makes sense. 
+ A better alternative in this case would be some library some more + obvious library shared by fetch-pack.c ad builtin/index-pack.c, but + there isn't such a thing. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> - ## Makefile ## -@@ Makefile: LIB_OBJS += fetch-negotiator.o - LIB_OBJS += fetch-pack.o - LIB_OBJS += fmt-merge-msg.o - LIB_OBJS += fsck.o -+LIB_OBJS += fsck-cb.o - LIB_OBJS += fsmonitor.o - LIB_OBJS += gettext.o - LIB_OBJS += gpg-interface.o - ## builtin/index-pack.c ## @@ builtin/index-pack.c: static int nr_threads; static int from_stdin; @@ fetch-pack.c: static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) die("fsck failed"); } - ## fsck-cb.c (new) ## -@@ -+#include "git-compat-util.h" -+#include "fsck.h" + ## fsck.c ## +@@ fsck.c: int fsck_error_function(struct fsck_options *o, + return 1; + } + +-void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) +-{ +- oidset_insert(&options->gitmodules_found, oid); +-} +- + int fsck_finish(struct fsck_options *options) + { + int ret = 0; +@@ fsck.c: int git_fsck_config(const char *var, const char *value, void *cb) + + return git_default_config(var, value, cb); + } ++ ++/* ++ * Custom error callbacks that are used in more than one place. 
++ */ + +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, @@ fsck-cb.c (new) + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); +} - ## fsck.c ## -@@ fsck.c: int fsck_error_function(struct fsck_options *o, - return 1; - } - --void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) --{ -- oidset_insert(&options->gitmodules_found, oid); --} -- - int fsck_finish(struct fsck_options *options) - { - int ret = 0; - ## fsck.h ## @@ fsck.h: int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, @@ fsck.h: int fsck_walk(struct object *obj, void *data, struct fsck_options *optio * fsck a tag, and pass info about it back to the caller. This is * exposed fsck_object() internals for git-mktag(1). @@ fsck.h: const char *fsck_describe_object(struct fsck_options *options, - int fsck_config_internal(const char *var, const char *value, void *cb, - struct fsck_options *options); + */ + int git_fsck_config(const char *var, const char *value, void *cb); +/* -+ * Initializations for callbacks in fsck-cb.c ++ * Custom error callbacks that are used in more than one place. + */ +#define FSCK_OPTIONS_MISSING_GITMODULES { \ + .strict = 1, \ + .error_func = fsck_error_cb_print_missing_gitmodules, \ + FSCK_OPTIONS_COMMON \ +} -+ -+/* -+ * Error callbacks in fsck-cb.c -+ */ +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, -- 2.31.0.260.g719c683c1d ^ permalink raw reply [flat|nested] 229+ messages in thread
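The shape this version settles on (shared defaults in one macro, plus an error callback that handles a single message itself and defers everything else to the default handler) can be sketched standalone. This is only an illustration under stated assumptions: all demo_ and DEMO_ names are hypothetical, the two counters stand in for the real OID bookkeeping, and the real fsck_error callback takes more arguments than shown here.

```c
#include <stddef.h>

/* Hypothetical, shortened message list; not git's real message IDs. */
enum demo_msg_id {
	DEMO_MSG_GITMODULES_MISSING,
	DEMO_MSG_OTHER
};

struct demo_options;

/* The callback receives the message ID, so it need not parse text. */
typedef int (*demo_error_fn)(struct demo_options *o,
			     enum demo_msg_id msg_id,
			     const char *message);

/* Hypothetical stand-in for "struct fsck_options". */
struct demo_options {
	demo_error_fn error_func;
	int strict;
	int missing_seen; /* stand-in for recording dangling OIDs */
	int errors_seen;
};

/* Default handler: count the problem and report failure. */
int demo_error_function(struct demo_options *o,
			enum demo_msg_id msg_id,
			const char *message)
{
	(void)msg_id;
	(void)message;
	o->errors_seen++;
	return 1;
}

/* Custom handler: note a missing .gitmodules, defer everything else. */
int demo_error_cb_missing_gitmodules(struct demo_options *o,
				     enum demo_msg_id msg_id,
				     const char *message)
{
	if (msg_id == DEMO_MSG_GITMODULES_MISSING) {
		o->missing_seen++;
		return 0;
	}
	return demo_error_function(o, msg_id, message);
}

/* Fields every variant shares; note the trailing comma. */
#define DEMO_OPTIONS_COMMON \
	.missing_seen = 0, \
	.errors_seen = 0,

#define DEMO_OPTIONS_STRICT { \
	.strict = 1, \
	.error_func = demo_error_function, \
	DEMO_OPTIONS_COMMON \
}

#define DEMO_OPTIONS_MISSING_GITMODULES { \
	.strict = 1, \
	.error_func = demo_error_cb_missing_gitmodules, \
	DEMO_OPTIONS_COMMON \
}

/* Dispatch one problem report through the configured callback. */
int demo_report(struct demo_options *o, enum demo_msg_id msg_id,
		const char *message)
{
	return o->error_func(o, msg_id, message);
}
```

Because the message ID is passed to the callback, a caller that only cares about one kind of problem can match on the enum and fall through to the default handler for the rest, which is roughly what the NEEDSWORK comment about plumbing the MSG_ID asked for.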
* Re: [PATCH v5 00/19] fsck: API improvements
From: Derrick Stolee @ 2021-03-17 20:30 UTC
To: Ævar Arnfjörð Bjarmason, git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan

On 3/17/2021 2:20 PM, Ævar Arnfjörð Bjarmason wrote:
> A v5 with changes suggested by Derrick Stolee. Link to v4:
> https://lore.kernel.org/git/20210316161738.30254-1-avarab@gmail.com/
>
> Changes:
>
>  * 1/19 is new, it's a simple refactoring of some git_config() code in
>    fsck.c code I changed recently.

This new patch is simple, as advertised.

>  * Squashed the first 4x patches of incrementally redefining two
>    macros into one.

Thanks.

>  * Squashed a whitespace-only change into another patch that changed
>    the same code.
>
>  * Got rid of fsck-cb.c, that one function just lives at the bottom of
>    fsck.c now.

I was late in giving you confirmation that fsck.c is a good place,
but you got there, anyway. Thanks!

This version LGTM.

-Stolee
* Re: [PATCH v5 00/19] fsck: API improvements
From: Junio C Hamano @ 2021-03-17 21:06 UTC
To: Ævar Arnfjörð Bjarmason
Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> A v5 with changes suggested by Derrick Stolee. Link to v4:
> https://lore.kernel.org/git/20210316161738.30254-1-avarab@gmail.com/
>
> Changes:
>
>  * 1/19 is new, it's a simple refactoring of some git_config() code in
>    fsck.c code I changed recently.

The new step makes sense.

I think the series and my comment e-mails crossed, and all of my
comments on the previous round still apply (including the fsck-cb.c
thing, which I think should be added to its sole user index-pack.c,
not to fsck.c).

Thanks, will queue.
* [PATCH v6 00/19] fsck: API improvements
From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason

To recap the goals in v1[1]: this series gets rid of the need for the
recently added "print_dangling_gitmodules" function in favor of a
better fsck API to get at that information.

Changes since v5[2]:

 * Addressed all outstanding feedback AFAICT

 * The fields we init to 0/NULL in the new designated initializer are
   gone

 * There were comments on the refactoring of append_msg_id(). It turns
   out that we can entirely remove that function, so a new commit got
   added and one was ejected to do that.

 * Clarifications in commit messages.

 * I'd still left behind a remnant of the old
   "print_dangling_gitmodules" code in v5's last commit. I.e. we had
   code that was accumulating its own list of gitmodules OIDs and then
   injecting it into the fsck state; now that the fsck state tracks
   those itself we can use that list directly instead.

1. https://lore.kernel.org/git/20210217194246.25342-1-avarab@gmail.com/
2.
https://lore.kernel.org/git/20210317182054.5986-1-avarab@gmail.com/ Ævar Arnfjörð Bjarmason (19): fsck.c: refactor and rename common config callback fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} fsck.h: use "enum object_type" instead of "int" fsck.c: rename variables in fsck_set_msg_type() for less confusion fsck.c: remove (mostly) redundant append_msg_id() function fsck.c: rename remaining fsck_msg_id "id" to "msg_id" fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum fsck.h: re-order and re-assign "enum fsck_msg_type" fsck.c: call parse_msg_type() early in fsck_set_msg_type() fsck.c: undefine temporary STR macro after use fsck.c: give "FOREACH_MSG_ID" a more specific name fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h fsck.c: pass along the fsck_msg_id in the fsck_error callback fsck.c: add an fsck_set_msg_type() API that takes enums fsck.c: move gitmodules_{found,done} into fsck_options fetch-pack: don't needlessly copy fsck_options fetch-pack: use file-scope static struct for fsck_options fetch-pack: use new fsck API to printing dangling submodules builtin/fsck.c | 14 ++- builtin/index-pack.c | 30 +----- builtin/mktag.c | 14 ++- builtin/unpack-objects.c | 3 +- fetch-pack.c | 31 ++---- fsck.c | 207 +++++++++++++-------------------------- fsck.h | 127 +++++++++++++++++++++--- 7 files changed, 210 insertions(+), 216 deletions(-) Range-diff: 1: fe33015e0d9 = 1: 579af32ab3e fsck.c: refactor and rename common config callback 2: 72f2e53afac ! 2: b17c982293e fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} @@ Commit message fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Refactor the definitions of FSCK_OPTIONS_{DEFAULT,STRICT} to use - designated initializers. - - While I'm at it add the "object_names" member to the - initialization. 
This was omitted in 7b35efd734e (fsck_walk(): - optionally name objects on the go, 2016-07-17) when the field was - added. - - I'm using a new FSCK_OPTIONS_COMMON and FSCK_OPTIONS_COMMON_ERROR_FUNC - helper macros to define what FSCK_OPTIONS_{DEFAULT,STRICT} have in - common, and define the two in terms of those macro. - - The FSCK_OPTIONS_COMMON macro will be used in a subsequent commit to - define other variants of common fsck initialization that wants to use - a custom error function, but share the rest of the defaults. + designated initializers. This allows us to omit those fields that + aren't initialized to zero or NULL. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> @@ fsck.h: struct fsck_options { -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } -+#define FSCK_OPTIONS_COMMON \ -+ .walk = NULL, \ -+ .msg_type = NULL, \ ++#define FSCK_OPTIONS_DEFAULT { \ + .skiplist = OIDSET_INIT, \ -+ .object_names = NULL, -+#define FSCK_OPTIONS_COMMON_ERROR_FUNC \ -+ FSCK_OPTIONS_COMMON \ -+ .error_func = fsck_error_function -+ -+#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON_ERROR_FUNC } -+#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON_ERROR_FUNC } ++ .error_func = fsck_error_function \ ++} ++#define FSCK_OPTIONS_STRICT { \ ++ .strict = 1, \ ++ .error_func = fsck_error_function, \ ++} /* descend in all linked child objects * the return value is: 3: 237a2806865 = 3: a721c396c50 fsck.h: use "enum object_type" instead of "int" 4: 13b76c73dd7 = 4: fcdba2f8fe8 fsck.c: rename variables in fsck_set_msg_type() for less confusion 5: 4ae83403b73 ! 
5: b07e8e026ac fsck.c: move definition of msg_id into append_msg_id() @@ Metadata Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## Commit message ## - fsck.c: move definition of msg_id into append_msg_id() + fsck.c: remove (mostly) redundant append_msg_id() function - Refactor code added in 71ab8fa840f (fsck: report the ID of the - error/warning, 2015-06-22) to resolve the msg_id to a string in the - function that wants it, instead of doing it in report(). + Remove the append_msg_id() function in favor of calling + prepare_msg_ids(). We already have code to compute the camel-cased + msg_id strings in msg_id_info, let's use it. + + When the append_msg_id() function was added in 71ab8fa840f (fsck: + report the ID of the error/warning, 2015-06-22) the prepare_msg_ids() + function didn't exist. When prepare_msg_ids() was added in + a46baac61eb (fsck: factor out msg_id_info[] lazy initialization code, + 2018-05-26) this code wasn't moved over to lazy initialization. + + This changes the behavior of the code to initialize all the messages + instead of just camel-casing the one we need on the fly. Since the + common case is that we're printing just one message this is mostly + redundant work. + + But that's OK in this case, reporting this fsck issue to the user + isn't performance-sensitive. If we were somehow doing so in a tight + loop (in a hopelessly broken repository?) this would help, since we'd + save ourselves from re-doing this work for identical messages, we + could just grab the prepared string from msg_id_info after the first + invocation. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> @@ fsck.c: void fsck_set_msg_types(struct fsck_options *options, const char *values } -static void append_msg_id(struct strbuf *sb, const char *msg_id) -+static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) +-{ +- for (;;) { +- char c = *(msg_id)++; +- +- if (!c) +- break; +- if (c != '_') +- strbuf_addch(sb, tolower(c)); +- else { +- assert(*msg_id); +- strbuf_addch(sb, *(msg_id)++); +- } +- } +- +- strbuf_addstr(sb, ": "); +-} +- + static int object_on_skiplist(struct fsck_options *opts, + const struct object_id *oid) { -+ const char *msg_id = msg_id_info[id].id_string; - for (;;) { - char c = *(msg_id)++; - @@ fsck.c: static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, msg_id_info[id].id_string); -+ append_msg_id(&sb, id); ++ prepare_msg_ids(); ++ strbuf_addf(&sb, "%s: ", msg_id_info[id].camelcased); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); 6: 82107f1dac0 ! 
6: 321b0c652de fsck.c: rename remaining fsck_msg_id "id" to "msg_id" @@ Commit message Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## fsck.c ## -@@ fsck.c: void fsck_set_msg_types(struct fsck_options *options, const char *values) - free(to_free); - } - --static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) -+static void append_msg_id(struct strbuf *sb, enum fsck_msg_id msg_id) - { -- const char *msg_id = msg_id_info[id].id_string; -+ const char *msg_id_str = msg_id_info[msg_id].id_string; - for (;;) { -- char c = *(msg_id)++; -+ char c = *(msg_id_str)++; - - if (!c) - break; - if (c != '_') - strbuf_addch(sb, tolower(c)); - else { -- assert(*msg_id); -- strbuf_addch(sb, *(msg_id)++); -+ assert(*msg_id_str); -+ strbuf_addch(sb, *(msg_id_str)++); - } - } - @@ fsck.c: static int object_on_skiplist(struct fsck_options *opts, __attribute__((format (printf, 5, 6))) static int report(struct fsck_options *options, @@ fsck.c: static int object_on_skiplist(struct fsck_options *opts, if (msg_type == FSCK_IGNORE) return 0; @@ fsck.c: static int report(struct fsck_options *options, - else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; -- append_msg_id(&sb, id); -+ append_msg_id(&sb, msg_id); + prepare_msg_ids(); +- strbuf_addf(&sb, "%s: ", msg_id_info[id].camelcased); ++ strbuf_addf(&sb, "%s: ", msg_id_info[msg_id].camelcased); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); 7: 796096bf73e = 7: 948689ad5c8 fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" 8: 3664abb23de = 8: 8ea468bf4d8 fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 9: 81e6d7ab450 ! 9: 9316b35cd3b fsck.h: re-order and re-assign "enum fsck_msg_type" @@ Commit message defined as "2". I'm confident that nothing relies on these values, we always compare - them explicitly. Let's not omit "0" so it won't be assumed that we're - using these as a boolean somewhere. + them for equality. 
Let's not omit "0" so it won't be assumed that + we're using these as a boolean somewhere. This also allows us to re-structure the fields to mark which are "private" v.s. "public". See the preceding commit for a rationale for 10: 5c2e8e7b842 = 10: d7f1c5d37de fsck.c: call parse_msg_type() early in fsck_set_msg_type() 11: 7ffbf9af3fa = 11: ae5efd745cf fsck.c: undefine temporary STR macro after use 12: 12ff0f75ebf = 12: 96995244806 fsck.c: give "FOREACH_MSG_ID" a more specific name 13: 0c49dd5164f = 13: 1b42aea3a64 fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h 14: 900263f503a = 14: 563e6a0e5e6 fsck.c: pass along the fsck_msg_id in the fsck_error callback 15: 5f270e88a0a = 15: 5e504f25c51 fsck.c: add an fsck_set_msg_type() API that takes enums 16: 539d0197129 ! 16: 611631dd779 fsck.c: move gitmodules_{found,done} into fsck_options @@ Commit message gitmodules_found attribute of "fsck_options" we need this intermediate step first. + An earlier version of this patch removed the small amount of + duplication we now have between FSCK_OPTIONS_{DEFAULT,STRICT} with a + FSCK_OPTIONS_COMMON macro. I don't think such de-duplication is worth + it for this amount of copy/pasting. 
+ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> ## fetch-pack.c ## @@ fsck.h: struct fsck_options { kh_oid_map_t *object_names; }; -@@ fsck.h: struct fsck_options { - .walk = NULL, \ - .msg_type = NULL, \ + #define FSCK_OPTIONS_DEFAULT { \ .skiplist = OIDSET_INIT, \ + .gitmodules_found = OIDSET_INIT, \ + .gitmodules_done = OIDSET_INIT, \ - .object_names = NULL, - #define FSCK_OPTIONS_COMMON_ERROR_FUNC \ - FSCK_OPTIONS_COMMON \ + .error_func = fsck_error_function \ + } + #define FSCK_OPTIONS_STRICT { \ + .strict = 1, \ ++ .gitmodules_found = OIDSET_INIT, \ ++ .gitmodules_done = OIDSET_INIT, \ + .error_func = fsck_error_function, \ + } + @@ fsck.h: int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); 17: 1acf7442365 = 17: 03d512c8448 fetch-pack: don't needlessly copy fsck_options 18: b47c3d5ac6f = 18: 581c87c63c6 fetch-pack: use file-scope static struct for fsck_options 19: f05fa5c3ec9 ! 
19: 6a38cade8c3 fetch-pack: use new fsck API to printing dangling submodules @@ fetch-pack.c: static int server_supports_filtering; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; -@@ fetch-pack.c: static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) +@@ fetch-pack.c: static int cmp_ref_by_name(const void *a_, const void *b_) + return strcmp(a->name, b->name); + } - oidset_iter_init(gitmodules_oids, &iter); - while ((oid = oidset_iter_next(&iter))) +-static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) +-{ +- struct oidset_iter iter; +- const struct object_id *oid; +- +- if (!oidset_size(gitmodules_oids)) +- return; +- +- oidset_iter_init(gitmodules_oids, &iter); +- while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fsck_options, oid); -+ oidset_insert(&fsck_options.gitmodules_found, oid); - if (fsck_finish(&fsck_options)) - die("fsck failed"); - } +- if (fsck_finish(&fsck_options)) +- die("fsck failed"); +-} +- + static struct ref *do_fetch_pack(struct fetch_pack_args *args, + int fd[2], + const struct ref *orig_ref, +@@ fetch-pack.c: static struct ref *do_fetch_pack(struct fetch_pack_args *args, + int agent_len; + struct fetch_negotiator negotiator_alloc; + struct fetch_negotiator *negotiator; +- struct oidset gitmodules_oids = OIDSET_INIT; + + negotiator = &negotiator_alloc; + fetch_negotiator_init(r, negotiator); +@@ fetch-pack.c: static struct ref *do_fetch_pack(struct fetch_pack_args *args, + else + alternate_shallow_file = NULL; + if (get_pack(args, fd, pack_lockfiles, NULL, sought, nr_sought, +- &gitmodules_oids)) ++ &fsck_options.gitmodules_found)) + die(_("git fetch-pack: fetch failed.")); +- fsck_gitmodules_oids(&gitmodules_oids); ++ if (fsck_finish(&fsck_options)) ++ die("fsck failed"); + + all_done: + if (negotiator) +@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, + struct string_list packfile_uris = 
STRING_LIST_INIT_DUP; + int i; + struct strvec index_pack_args = STRVEC_INIT; +- struct oidset gitmodules_oids = OIDSET_INIT; + + negotiator = &negotiator_alloc; + fetch_negotiator_init(r, negotiator); +@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, + process_section_header(&reader, "packfile", 0); + if (get_pack(args, fd, pack_lockfiles, + packfile_uris.nr ? &index_pack_args : NULL, +- sought, nr_sought, &gitmodules_oids)) ++ sought, nr_sought, &fsck_options.gitmodules_found)) + die(_("git fetch-pack: fetch failed.")); + do_check_stateless_delimiter(args, &reader); + +@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, + + packname[the_hash_algo->hexsz] = '\0'; + +- parse_gitmodules_oids(cmd.out, &gitmodules_oids); ++ parse_gitmodules_oids(cmd.out, &fsck_options.gitmodules_found); + + close(cmd.out); + +@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, + string_list_clear(&packfile_uris, 0); + strvec_clear(&index_pack_args); + +- fsck_gitmodules_oids(&gitmodules_oids); ++ if (fsck_finish(&fsck_options)) ++ die("fsck failed"); + + if (negotiator) + negotiator->release(negotiator); ## fsck.c ## @@ fsck.c: int fsck_error_function(struct fsck_options *o, @@ fsck.c: int git_fsck_config(const char *var, const char *value, void *cb) +} ## fsck.h ## +@@ fsck.h: int fsck_error_function(struct fsck_options *o, + const struct object_id *oid, enum object_type object_type, + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); ++int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, ++ const struct object_id *oid, ++ enum object_type object_type, ++ enum fsck_msg_type msg_type, ++ enum fsck_msg_id msg_id, ++ const char *message); + + struct fsck_options { + fsck_walk_func walk; +@@ fsck.h: struct fsck_options { + .gitmodules_done = OIDSET_INIT, \ + .error_func = fsck_error_function, \ + } ++#define FSCK_OPTIONS_MISSING_GITMODULES { \ ++ 
.strict = 1, \ ++ .gitmodules_found = OIDSET_INIT, \ ++ .gitmodules_done = OIDSET_INIT, \ ++ .error_func = fsck_error_cb_print_missing_gitmodules, \ ++} + + /* descend in all linked child objects + * the return value is: @@ fsck.h: int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); @@ fsck.h: int fsck_walk(struct object *obj, void *data, struct fsck_options *optio /* * fsck a tag, and pass info about it back to the caller. This is * exposed fsck_object() internals for git-mktag(1). -@@ fsck.h: const char *fsck_describe_object(struct fsck_options *options, - */ - int git_fsck_config(const char *var, const char *value, void *cb); - -+/* -+ * Custom error callbacks that are used in more than one place. -+ */ -+#define FSCK_OPTIONS_MISSING_GITMODULES { \ -+ .strict = 1, \ -+ .error_func = fsck_error_cb_print_missing_gitmodules, \ -+ FSCK_OPTIONS_COMMON \ -+} -+int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, -+ const struct object_id *oid, -+ enum object_type object_type, -+ enum fsck_msg_type msg_type, -+ enum fsck_msg_id msg_id, -+ const char *message); -+ - #endif -- 2.31.1.445.g087790d4945 ^ permalink raw reply [flat|nested] 229+ messages in thread
* [PATCH v6 01/19] fsck.c: refactor and rename common config callback 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason ` (18 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor code I recently changed in 1f3299fda9 (fsck: make fsck_config() re-usable, 2021-01-05) so that I could use fsck's config callback in mktag. I don't know what I was thinking in structuring the code this way, but it clearly makes no sense to have an fsck_config_internal() at all just so it can get a fsck_options when git_config() already supports passing along some void* data. Let's just make use of that instead, which lets us get rid of the two wrapper functions, and brings fsck's common config callback in line with other such reusable config callbacks. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 7 +------ builtin/mktag.c | 7 +------ fsck.c | 4 ++-- fsck.h | 3 +-- 4 files changed, 5 insertions(+), 16 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 821e7798c70..a56a2d0513a 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -71,11 +71,6 @@ static const char *printable_type(const struct object_id *oid, return ret; } -static int fsck_config(const char *var, const char *value, void *cb) -{ - return fsck_config_internal(var, value, cb, &fsck_obj_options); -} - static int objerror(struct object *obj, const char *err) { errors_found |= ERROR_OBJECT; @@ -803,7 +798,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) if (name_objects) fsck_enable_object_names(&fsck_walk_options); - git_config(fsck_config, NULL); + git_config(git_fsck_config, &fsck_obj_options); if (connectivity_only) { for_each_loose_object(mark_loose_for_connectivity, NULL, 0); diff --git a/builtin/mktag.c b/builtin/mktag.c index 41a399a69e4..23c4b8763fa 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -14,11 +14,6 @@ static int option_strict = 1; static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; -static int mktag_config(const char *var, const char *value, void *cb) -{ - return fsck_config_internal(var, value, cb, &fsck_options); -} - static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, @@ -93,7 +88,7 @@ int cmd_mktag(int argc, const char **argv, const char *prefix) fsck_options.error_func = mktag_fsck_error_func; fsck_set_msg_type(&fsck_options, "extraheaderentry", "warn"); /* config might set fsck.extraHeaderEntry=* again */ - git_config(mktag_config, NULL); + git_config(git_fsck_config, &fsck_options); if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options, &tagged_oid, &tagged_type)) die(_("tag on stdin did not pass our strict fsck check")); diff --git a/fsck.c b/fsck.c index e3030f3b358..5dfb99665ae 
100644 --- a/fsck.c +++ b/fsck.c @@ -1323,9 +1323,9 @@ int fsck_finish(struct fsck_options *options) return ret; } -int fsck_config_internal(const char *var, const char *value, void *cb, - struct fsck_options *options) +int git_fsck_config(const char *var, const char *value, void *cb) { + struct fsck_options *options = cb; if (strcmp(var, "fsck.skiplist") == 0) { const char *path; struct strbuf sb = STRBUF_INIT; diff --git a/fsck.h b/fsck.h index 733378f1260..f70d11c5594 100644 --- a/fsck.h +++ b/fsck.h @@ -109,7 +109,6 @@ const char *fsck_describe_object(struct fsck_options *options, * git_config() callback for use by fsck-y tools that want to support * fsck.<msg> fsck.skipList etc. */ -int fsck_config_internal(const char *var, const char *value, void *cb, - struct fsck_options *options); +int git_fsck_config(const char *var, const char *value, void *cb); #endif -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
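The pattern patch 01/19 relies on, a single reusable callback that recovers its options from the opaque "void *cb" argument instead of going through a per-caller wrapper, can be sketched in isolation. This is only a minimal illustration under stated assumptions: demo_cfg_options, demo_config() and demo_fsck_config() are hypothetical stand-ins, not git's struct fsck_options, git_config() or git_fsck_config(), and the key/value pairs are made up.

```c
#include <string.h>

/* Hypothetical stand-in for git's struct fsck_options. */
struct demo_cfg_options {
	int strict;
	const char *skiplist_path;
};

/* Same shape as a config callback: key, value, opaque caller data. */
typedef int (*demo_config_fn)(const char *var, const char *value, void *cb);

/*
 * Hypothetical stand-in for git_config(): hands every key/value pair
 * to the callback, passing the caller's "cb" pointer through untouched.
 */
static int demo_config(demo_config_fn fn, void *cb)
{
	static const char *pairs[][2] = {
		{ "fsck.skiplist", "/tmp/skip" },
		{ "core.bare", "false" },
	};
	size_t i;

	for (i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
		if (fn(pairs[i][0], pairs[i][1], cb) < 0)
			return -1;
	return 0;
}

/*
 * The reusable callback: it learns which options to fill in from "cb",
 * so no caller needs its own wrapper bound to a file-scope variable.
 */
static int demo_fsck_config(const char *var, const char *value, void *cb)
{
	struct demo_cfg_options *options = cb;

	if (!strcmp(var, "fsck.skiplist"))
		options->skiplist_path = value;
	return 0;
}
```

Two callers with different option structs would both call demo_config(demo_fsck_config, &their_options), which mirrors how builtin/fsck.c and builtin/mktag.c each pass their own struct in the diff above.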
* [PATCH v6 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 01/19] fsck.c: refactor and rename common config callback Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 17:15 ` Ramsay Jones 2021-03-28 13:15 ` [PATCH v6 03/19] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason ` (17 subsequent siblings) 19 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor the definitions of FSCK_OPTIONS_{DEFAULT,STRICT} to use designated initializers. This allows us to omit those fields that aren't initialized to zero or NULL. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/fsck.h b/fsck.h index f70d11c5594..73e8b9f3e4e 100644 --- a/fsck.h +++ b/fsck.h @@ -43,8 +43,14 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } +#define FSCK_OPTIONS_DEFAULT { \ + .skiplist = OIDSET_INIT, \ + .error_func = fsck_error_function \ +} +#define FSCK_OPTIONS_STRICT { \ + .strict = 1, \ + .error_func = fsck_error_function, \ +} /* descend in all linked child objects * the return value is: -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v6 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 2021-03-28 13:15 ` [PATCH v6 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason @ 2021-03-28 17:15 ` Ramsay Jones 2021-03-29 2:04 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Ramsay Jones @ 2021-03-28 17:15 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee On Sun, Mar 28, 2021 at 03:15:34PM +0200, Ævar Arnfjörð Bjarmason wrote: > Refactor the definitions of FSCK_OPTIONS_{DEFAULT,STRICT} to use > designated initializers. This allows us to omit those fields that > aren't initialized to zero or NULL. s/aren't/are/ [I apologize in advance - I am using mutt for the first time to reply to a ML post and I don't know if I should be using L-ist-reply or a g-roup-reply! :D ] ATB, Ramsay Jones ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v6 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 2021-03-28 17:15 ` Ramsay Jones @ 2021-03-29 2:04 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-29 2:04 UTC (permalink / raw) To: Ramsay Jones Cc: Ævar Arnfjörð Bjarmason, git, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee Ramsay Jones <ramsay@ramsayjones.plus.com> writes: > On Sun, Mar 28, 2021 at 03:15:34PM +0200, Ævar Arnfjörð Bjarmason wrote: >> Refactor the definitions of FSCK_OPTIONS_{DEFAULT,STRICT} to use >> designated initializers. This allows us to omit those fields that >> aren't initialized to zero or NULL. > > s/aren't/are/ Thanks; tweak applied while queuing. > [I apologize in advance - I am using mutt for the first time to reply > to a ML post and I don't know if I should be using L-ist-reply or a > g-roup-reply! :D ] FWIW, on lore (reading via nntp), the message I am responding to looks just fine. ^ permalink raw reply [flat|nested] 229+ messages in thread
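The point settled in this sub-thread, that designated initializers let fields which are zero or NULL be left out, can be checked with a small standalone sketch. The struct and macros below are simplified, hypothetical stand-ins (demo_init_options, DEMO_OPTIONS_DEFAULT, DEMO_OPTIONS_STRICT), not the definitions from fsck.h.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for struct fsck_options. */
struct demo_init_options {
	void *walk;
	int (*error_func)(const char *message);
	unsigned strict:1;
	int *msg_type;
	const char *object_names;
};

static int demo_error_function(const char *message)
{
	return message ? 1 : 0;
}

/*
 * Only the non-zero members are named. C guarantees that every member
 * not mentioned in a designated initializer is initialized as if it
 * had static storage duration, i.e. to 0 or NULL.
 */
#define DEMO_OPTIONS_DEFAULT { \
	.error_func = demo_error_function, \
}
#define DEMO_OPTIONS_STRICT { \
	.strict = 1, \
	.error_func = demo_error_function, \
}
```

Unlike the positional form that the patch removes, adding or reordering a member in the struct does not require touching these macros unless the new member needs a non-zero default.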
* [PATCH v6 03/19] fsck.h: use "enum object_type" instead of "int" 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 01/19] fsck.c: refactor and rename common config callback Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 04/19] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason ` (16 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in 355885d5315 (add generic, type aware object chain walker, 2008-02-25) we've used entries from object_type (OBJ_BLOB etc.). So this doesn't really change anything as far as the generated code is concerned, it just gives the compiler more information and makes this easier to read. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 3 ++- builtin/index-pack.c | 3 ++- builtin/unpack-objects.c | 3 ++- fsck.h | 3 ++- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index a56a2d0513a..ed5f2af6b5c 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -192,7 +192,8 @@ static int traverse_reachable(void) return !!result; } -static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_used(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options) { if (!obj) return 1; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 21899687e2c..f6e1178df90 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -212,7 +212,8 @@ static void cleanup_thread(void) free(thread_data); } -static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_link(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { if (!obj) return -1; diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index a4ba2ebac69..4a70b17f8fb 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -187,7 +187,8 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf) * that have reachability requirements and calls this function. * Verify its reachability and validity recursively and write it out. 
*/ -static int check_object(struct object *obj, int type, void *data, struct fsck_options *options) +static int check_object(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { struct obj_buffer *obj_buf; diff --git a/fsck.h b/fsck.h index 73e8b9f3e4e..f20f1259e84 100644 --- a/fsck.h +++ b/fsck.h @@ -23,7 +23,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type); * <0 error signaled and abort * >0 error signaled and do not abort */ -typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options); +typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options); /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v6 04/19] fsck.c: rename variables in fsck_set_msg_type() for less confusion 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (2 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 03/19] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 05/19] fsck.c: remove (mostly) redundant append_msg_id() function Ævar Arnfjörð Bjarmason ` (15 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Rename variables in a function added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22). It was needlessly confusing that it took a "msg_type" argument, but then later declared another "msg_type" of a different type. Let's rename that to "severity", and rename "id" to "msg_id" and "msg_id" to "msg_id_str" etc. This will make a follow-up change smaller. While I'm at it properly indent the fsck_set_msg_type() argument list. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 24 ++++++++++++------------ fsck.h | 2 +- 2 files changed, 13 insertions(+), 13 deletions(-) diff --git a/fsck.c b/fsck.c index 5dfb99665ae..7cc722a25cd 100644 --- a/fsck.c +++ b/fsck.c @@ -203,27 +203,27 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) } void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type) + const char *msg_id_str, const char *msg_type_str) { - int id = parse_msg_id(msg_id), type; + int msg_id = parse_msg_id(msg_id_str), msg_type; - if (id < 0) - die("Unhandled message id: %s", msg_id); - type = parse_msg_type(msg_type); + if (msg_id < 0) + die("Unhandled message id: %s", msg_id_str); + msg_type = parse_msg_type(msg_type_str); - if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL) - die("Cannot demote %s to %s", msg_id, msg_type); + if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) + die("Cannot demote %s to %s", msg_id_str, msg_type_str); if (!options->msg_type) { int i; - int *msg_type; - ALLOC_ARRAY(msg_type, FSCK_MSG_MAX); + int *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) - msg_type[i] = fsck_msg_type(i, options); - options->msg_type = msg_type; + severity[i] = fsck_msg_type(i, options); + options->msg_type = severity; } - options->msg_type[id] = type; + options->msg_type[msg_id] = msg_type; } void fsck_set_msg_types(struct fsck_options *options, const char *values) diff --git a/fsck.h b/fsck.h index f20f1259e84..30a3acabc50 100644 --- a/fsck.h +++ b/fsck.h @@ -11,7 +11,7 @@ struct fsck_options; struct object; void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type); + const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); int is_valid_msg_type(const char *msg_id, const char *msg_type); -- 2.31.1.445.g087790d4945 ^ permalink raw reply 
related [flat|nested] 229+ messages in thread
* [PATCH v6 05/19] fsck.c: remove (mostly) redundant append_msg_id() function 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (3 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 04/19] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 06/19] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason ` (14 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Remove the append_msg_id() function in favor of calling prepare_msg_ids(). We already have code to compute the camel-cased msg_id strings in msg_id_info, let's use it. When the append_msg_id() function was added in 71ab8fa840f (fsck: report the ID of the error/warning, 2015-06-22) the prepare_msg_ids() function didn't exist. When prepare_msg_ids() was added in a46baac61eb (fsck: factor out msg_id_info[] lazy initialization code, 2018-05-26) this code wasn't moved over to lazy initialization. This changes the behavior of the code to initialize all the messages instead of just camel-casing the one we need on the fly. Since the common case is that we're printing just one message this is mostly redundant work. But that's OK in this case, reporting this fsck issue to the user isn't performance-sensitive. If we were somehow doing so in a tight loop (in a hopelessly broken repository?) this would help, since we'd save ourselves from re-doing this work for identical messages, we could just grab the prepared string from msg_id_info after the first invocation. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/fsck.c b/fsck.c index 7cc722a25cd..25c697fa6a2 100644 --- a/fsck.c +++ b/fsck.c @@ -264,24 +264,6 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); } -static void append_msg_id(struct strbuf *sb, const char *msg_id) -{ - for (;;) { - char c = *(msg_id)++; - - if (!c) - break; - if (c != '_') - strbuf_addch(sb, tolower(c)); - else { - assert(*msg_id); - strbuf_addch(sb, *(msg_id)++); - } - } - - strbuf_addstr(sb, ": "); -} - static int object_on_skiplist(struct fsck_options *opts, const struct object_id *oid) { @@ -308,7 +290,8 @@ static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, msg_id_info[id].id_string); + prepare_msg_ids(); + strbuf_addf(&sb, "%s: ", msg_id_info[id].camelcased); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
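The trade-off described in patch 05/19, converting every message ID once and caching the result instead of camel-casing a single ID on each report, can be sketched as follows. This is an illustration only: the two-entry table and the demo_* names are hypothetical, and git's real msg_id_info[] and prepare_msg_ids() differ in detail.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down message table; the real one is much longer. */
static struct {
	const char *id_string;
	char *camelcased; /* filled in lazily by demo_prepare_msg_ids() */
} demo_msg_id_info[] = {
	{ "GITMODULES_MISSING", NULL },
	{ "BAD_TREE", NULL },
};

#define DEMO_MSG_COUNT (sizeof(demo_msg_id_info) / sizeof(demo_msg_id_info[0]))

/* "FOO_BAR" -> "fooBar"; the caller owns the returned buffer. */
static char *demo_camelcase(const char *id)
{
	char *out = malloc(strlen(id) + 1), *p = out;

	if (!out)
		abort();
	while (*id) {
		if (*id == '_' && id[1]) {
			id++;         /* drop the underscore ...          */
			*p++ = *id++; /* ... and keep the capital after it */
		} else {
			*p++ = tolower((unsigned char)*id++);
		}
	}
	*p = '\0';
	return out;
}

/* Convert every entry on the first call; later calls return at once. */
static void demo_prepare_msg_ids(void)
{
	size_t i;

	if (demo_msg_id_info[0].camelcased)
		return;
	for (i = 0; i < DEMO_MSG_COUNT; i++)
		demo_msg_id_info[i].camelcased =
			demo_camelcase(demo_msg_id_info[i].id_string);
}
```

A reporting function would call demo_prepare_msg_ids() and then read demo_msg_id_info[msg_id].camelcased, paying the conversion cost for the whole table on the first report and nothing on later ones.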
* [PATCH v6 06/19] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (4 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 05/19] fsck.c: remove (mostly) redundant append_msg_id() function Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 07/19] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Ævar Arnfjörð Bjarmason ` (13 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Rename the remaining variables of type fsck_msg_id from "id" to "msg_id". This change is relatively small, and is worth the churn for a later change where we have different id's in the "report" function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fsck.c b/fsck.c index 25c697fa6a2..a0463ea22cc 100644 --- a/fsck.c +++ b/fsck.c @@ -273,11 +273,11 @@ static int object_on_skiplist(struct fsck_options *opts, __attribute__((format (printf, 5, 6))) static int report(struct fsck_options *options, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_id id, const char *fmt, ...) + enum fsck_msg_id msg_id, const char *fmt, ...) 
{ va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(id, options), result; + int msg_type = fsck_msg_type(msg_id, options), result; if (msg_type == FSCK_IGNORE) return 0; @@ -291,7 +291,7 @@ static int report(struct fsck_options *options, msg_type = FSCK_WARN; prepare_msg_ids(); - strbuf_addf(&sb, "%s: ", msg_id_info[id].camelcased); + strbuf_addf(&sb, "%s: ", msg_id_info[msg_id].camelcased); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v6 07/19] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (5 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 06/19] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 08/19] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason ` (12 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor "if options->msg_type" and other code added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) to reduce the scope of the "int msg_type" variable. This is in preparation for changing its type in a subsequent commit; only using it in the "!options->msg_type" scope makes that change easier. This also brings the code in line with the fsck_set_msg_type() function (also added in 0282f4dced0), which does a similar check for "!options->msg_type". Another minor benefit is getting rid of the style violation of not having braces for the body of the "if". 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/fsck.c b/fsck.c index a0463ea22cc..8614ee2c2a0 100644 --- a/fsck.c +++ b/fsck.c @@ -167,19 +167,17 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) static int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { - int msg_type; - assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); - if (options->msg_type) - msg_type = options->msg_type[msg_id]; - else { - msg_type = msg_id_info[msg_id].msg_type; + if (!options->msg_type) { + int msg_type = msg_id_info[msg_id].msg_type; + if (options->strict && msg_type == FSCK_WARN) msg_type = FSCK_ERROR; + return msg_type; } - return msg_type; + return options->msg_type[msg_id]; } static int parse_msg_type(const char *str) -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
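The early-return shape of the refactored lookup can be sketched outside of git as follows. This is a minimal illustration, not the code from fsck.c: every sketch_ and SK_ name, the two message ids, and the defaults table are hypothetical stand-ins, and an enum is used for the severity even though at this point in the series the real code still uses "int".

```c
#include <stddef.h>

/* Hypothetical miniature of the fsck types; names and values are illustrative only. */
enum sketch_msg_type { SK_IGNORE, SK_INFO, SK_FATAL, SK_ERROR, SK_WARN };
enum sketch_msg_id { SK_MSG_BAD_DATE, SK_MSG_HAS_DOT, SK_MSG_MAX };

struct sketch_options {
	unsigned strict:1;
	enum sketch_msg_type *msg_type; /* NULL until a severity has been overridden */
};

/* Built-in severities, standing in for msg_id_info[].msg_type. */
static const enum sketch_msg_type sketch_defaults[SK_MSG_MAX] = {
	[SK_MSG_BAD_DATE] = SK_ERROR,
	[SK_MSG_HAS_DOT] = SK_WARN,
};

static enum sketch_msg_type sketch_msg_type(enum sketch_msg_id msg_id,
					    const struct sketch_options *options)
{
	if (!options->msg_type) {
		/* The temporary only lives in the branch that needs it. */
		enum sketch_msg_type msg_type = sketch_defaults[msg_id];

		if (options->strict && msg_type == SK_WARN)
			msg_type = SK_ERROR;
		return msg_type;
	}
	/* A per-options table, once present, wins over defaults and "strict". */
	return options->msg_type[msg_id];
}
```

With no override table, a warning-level id comes back as SK_WARN, or as SK_ERROR when "strict" is set; once a table is installed its entry is returned unchanged.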
* [PATCH v6 08/19] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (6 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 07/19] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 09/19] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason ` (11 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new fsck_msg_type enum. These defines were originally introduced in: - ba002f3b28a (builtin-fsck: move common object checking code to fsck.c, 2008-02-25) - f50c4407305 (fsck: disallow demoting grave fsck errors to warnings, 2015-06-22) - efaba7cc77f (fsck: optionally ignore specific fsck issues completely, 2015-06-22) - f27d05b1704 (fsck: allow upgrading fsck warnings to errors, 2015-06-22) The reason these were defined in two different places is that we use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are used by external callbacks. Untangling that would take some more work, since we expose the new "enum fsck_msg_type" to both. Similar to "enum object_type", it's not worth structuring the API in such a way that only those who need FSCK_{ERROR,WARN} pass around a different type.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 2 +- builtin/index-pack.c | 3 ++- builtin/mktag.c | 3 ++- fsck.c | 21 ++++++++++----------- fsck.h | 16 ++++++++++------ 5 files changed, 25 insertions(+), 20 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index ed5f2af6b5c..17940a4e24a 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -84,7 +84,7 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index f6e1178df90..8338b832b63 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1716,7 +1716,8 @@ static void show_pack_info(int stat_only) static int print_dangling_gitmodules(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { /* * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it diff --git a/builtin/mktag.c b/builtin/mktag.c index 23c4b8763fa..052a510ad7f 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -17,7 +17,8 @@ static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/fsck.c b/fsck.c index 8614ee2c2a0..c5a81e4ff05 100644 --- a/fsck.c +++ b/fsck.c @@ -22,9 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FSCK_FATAL -1 -#define FSCK_INFO -2 - #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ @@ -97,7 +94,7 @@ 
static struct { const char *id_string; const char *downcased; const char *camelcased; - int msg_type; + enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { FOREACH_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } @@ -164,13 +161,13 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) list_config_item(list, prefix, msg_id_info[i].camelcased); } -static int fsck_msg_type(enum fsck_msg_id msg_id, +static enum fsck_msg_type fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); if (!options->msg_type) { - int msg_type = msg_id_info[msg_id].msg_type; + enum fsck_msg_type msg_type = msg_id_info[msg_id].msg_type; if (options->strict && msg_type == FSCK_WARN) msg_type = FSCK_ERROR; @@ -180,7 +177,7 @@ static int fsck_msg_type(enum fsck_msg_id msg_id, return options->msg_type[msg_id]; } -static int parse_msg_type(const char *str) +static enum fsck_msg_type parse_msg_type(const char *str) { if (!strcmp(str, "error")) return FSCK_ERROR; @@ -203,7 +200,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { - int msg_id = parse_msg_id(msg_id_str), msg_type; + int msg_id = parse_msg_id(msg_id_str); + enum fsck_msg_type msg_type; if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); @@ -214,7 +212,7 @@ void fsck_set_msg_type(struct fsck_options *options, if (!options->msg_type) { int i; - int *severity; + enum fsck_msg_type *severity; ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) severity[i] = fsck_msg_type(i, options); @@ -275,7 +273,8 @@ static int report(struct fsck_options *options, { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(msg_id, options), result; + enum fsck_msg_type msg_type = fsck_msg_type(msg_id, options); + int result; if (msg_type == FSCK_IGNORE) return 0; @@ -1247,7 +1246,7 @@ int 
fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index 30a3acabc50..baf37620760 100644 --- a/fsck.h +++ b/fsck.h @@ -3,9 +3,13 @@ #include "oidset.h" -#define FSCK_ERROR 1 -#define FSCK_WARN 2 -#define FSCK_IGNORE 3 +enum fsck_msg_type { + FSCK_INFO = -2, + FSCK_FATAL = -1, + FSCK_ERROR = 1, + FSCK_WARN, + FSCK_IGNORE +}; struct fsck_options; struct object; @@ -29,17 +33,17 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); struct fsck_options { fsck_walk_func walk; fsck_error error_func; unsigned strict:1; - int *msg_type; + enum fsck_msg_type *msg_type; struct oidset skiplist; kh_oid_map_t *object_names; }; -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v6 09/19] fsck.h: re-order and re-assign "enum fsck_msg_type" 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (7 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 08/19] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 10/19] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason ` (10 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change the values in the "enum fsck_msg_type" from being manually assigned to using default C enum values. This means we end up with a FSCK_IGNORE=0, which was previously defined as "3". I'm confident that nothing relies on these values; we always compare them for equality. Let's not omit "0" so it won't be assumed that we're using these as a boolean somewhere. This also allows us to re-structure the fields to mark which are "private" vs. "public". See the preceding commit for a rationale for not simply splitting these into two enums, namely that this is used for both the private and public fsck API. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fsck.h b/fsck.h index baf37620760..a7e092d3fb4 100644 --- a/fsck.h +++ b/fsck.h @@ -4,11 +4,13 @@ #include "oidset.h" enum fsck_msg_type { - FSCK_INFO = -2, - FSCK_FATAL = -1, - FSCK_ERROR = 1, + /* for internal use only */ + FSCK_IGNORE, + FSCK_INFO, + FSCK_FATAL, + /* "public", fed to e.g. error_func callbacks */ + FSCK_ERROR, FSCK_WARN, - FSCK_IGNORE }; struct fsck_options; -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
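Why renumbering is safe as long as callers compare against the named constants can be shown with a small sketch. The SK_ names below are hypothetical stand-ins for the members of "enum fsck_msg_type", and sk_is_ignored() is an illustrative helper that does not exist in git.

```c
/* Hypothetical stand-in for "enum fsck_msg_type"; the SK_ names are not git's. */
enum sk_msg_type {
	/* for internal use only */
	SK_IGNORE,	/* implicitly 0, the first default C enum value */
	SK_INFO,
	SK_FATAL,
	/* "public" */
	SK_ERROR,
	SK_WARN
};

/*
 * Compares against the named constant, so it keeps working no matter
 * which integer the compiler assigns. A truthiness test such as
 * "if (msg_type)" would instead silently depend on SK_IGNORE being 0.
 */
static int sk_is_ignored(enum sk_msg_type msg_type)
{
	return msg_type == SK_IGNORE;
}
```

Code written in the style of sk_is_ignored() behaves the same before and after a change of the underlying values, which is the property the commit message relies on.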
* [PATCH v6 10/19] fsck.c: call parse_msg_type() early in fsck_set_msg_type() 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (8 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 09/19] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 11/19] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason ` (9 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason There's no reason to defer calling parse_msg_type() until after we've checked whether "msg_id < 0". This is not a hot codepath, and parse_msg_type() itself may die on invalid input. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index c5a81e4ff05..80365e62842 100644 --- a/fsck.c +++ b/fsck.c @@ -201,11 +201,10 @@ void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { int msg_id = parse_msg_id(msg_id_str); - enum fsck_msg_type msg_type; + enum fsck_msg_type msg_type = parse_msg_type(msg_type_str); if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); - msg_type = parse_msg_type(msg_type_str); if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v6 11/19] fsck.c: undefine temporary STR macro after use 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (9 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 10/19] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 12/19] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason ` (8 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason In f417eed8cde (fsck: provide a function to parse fsck message IDs, 2015-06-22) the "STR" macro was introduced, but that short macro name was not undefined after use as was done earlier in the same series for the MSG_ID macro in c99ba492f1c (fsck: introduce identifiers for fsck messages, 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fsck.c b/fsck.c index 80365e62842..1b12e824ef6 100644 --- a/fsck.c +++ b/fsck.c @@ -100,6 +100,7 @@ static struct { { NULL, NULL, NULL, -1 } }; #undef MSG_ID +#undef STR static void prepare_msg_ids(void) { -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v6 12/19] fsck.c: give "FOREACH_MSG_ID" a more specific name 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (10 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 11/19] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 13/19] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h Ævar Arnfjörð Bjarmason ` (7 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation for moving it over to fsck.h. It's good convention to name macros in *.h files in such a way as to clearly not clash with any other names in other files. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fsck.c b/fsck.c index 1b12e824ef6..31c9088e3f7 100644 --- a/fsck.c +++ b/fsck.c @@ -22,7 +22,7 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_MSG_ID(FUNC) \ +#define FOREACH_FSCK_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ FUNC(UNTERMINATED_HEADER, FATAL) \ @@ -83,7 +83,7 @@ static struct oidset gitmodules_done = OIDSET_INIT; #define MSG_ID(id, msg_type) FSCK_MSG_##id, enum fsck_msg_id { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) FSCK_MSG_MAX }; #undef MSG_ID @@ -96,7 +96,7 @@ static struct { const char *camelcased; enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } }; #undef MSG_ID -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v6 13/19] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (11 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 12/19] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 14/19] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason ` (6 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps define from fsck.c to fsck.h. This is in preparation for having non-static functions take the fsck_msg_id as an argument. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 66 ---------------------------------------------------------- fsck.h | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 66 insertions(+), 66 deletions(-) diff --git a/fsck.c b/fsck.c index 31c9088e3f7..150fe467e43 100644 --- a/fsck.c +++ b/fsck.c @@ -22,72 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_FSCK_MSG_ID(FUNC) \ - /* fatal errors */ \ - FUNC(NUL_IN_HEADER, FATAL) \ - FUNC(UNTERMINATED_HEADER, FATAL) \ - /* errors */ \ - FUNC(BAD_DATE, ERROR) \ - FUNC(BAD_DATE_OVERFLOW, ERROR) \ - FUNC(BAD_EMAIL, ERROR) \ - FUNC(BAD_NAME, ERROR) \ - FUNC(BAD_OBJECT_SHA1, ERROR) \ - FUNC(BAD_PARENT_SHA1, ERROR) \ - FUNC(BAD_TAG_OBJECT, ERROR) \ - FUNC(BAD_TIMEZONE, ERROR) \ - FUNC(BAD_TREE, ERROR) \ - FUNC(BAD_TREE_SHA1, ERROR) \ - FUNC(BAD_TYPE, ERROR) \ - FUNC(DUPLICATE_ENTRIES, ERROR) \ - FUNC(MISSING_AUTHOR, ERROR) \ - FUNC(MISSING_COMMITTER, ERROR) \ - FUNC(MISSING_EMAIL, ERROR) \ - 
FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ - FUNC(MISSING_OBJECT, ERROR) \ - FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ - FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ - FUNC(MISSING_TAG, ERROR) \ - FUNC(MISSING_TAG_ENTRY, ERROR) \ - FUNC(MISSING_TREE, ERROR) \ - FUNC(MISSING_TREE_OBJECT, ERROR) \ - FUNC(MISSING_TYPE, ERROR) \ - FUNC(MISSING_TYPE_ENTRY, ERROR) \ - FUNC(MULTIPLE_AUTHORS, ERROR) \ - FUNC(TREE_NOT_SORTED, ERROR) \ - FUNC(UNKNOWN_TYPE, ERROR) \ - FUNC(ZERO_PADDED_DATE, ERROR) \ - FUNC(GITMODULES_MISSING, ERROR) \ - FUNC(GITMODULES_BLOB, ERROR) \ - FUNC(GITMODULES_LARGE, ERROR) \ - FUNC(GITMODULES_NAME, ERROR) \ - FUNC(GITMODULES_SYMLINK, ERROR) \ - FUNC(GITMODULES_URL, ERROR) \ - FUNC(GITMODULES_PATH, ERROR) \ - FUNC(GITMODULES_UPDATE, ERROR) \ - /* warnings */ \ - FUNC(BAD_FILEMODE, WARN) \ - FUNC(EMPTY_NAME, WARN) \ - FUNC(FULL_PATHNAME, WARN) \ - FUNC(HAS_DOT, WARN) \ - FUNC(HAS_DOTDOT, WARN) \ - FUNC(HAS_DOTGIT, WARN) \ - FUNC(NULL_SHA1, WARN) \ - FUNC(ZERO_PADDED_FILEMODE, WARN) \ - FUNC(NUL_IN_COMMIT, WARN) \ - /* infos (reported as warnings, but ignored by default) */ \ - FUNC(GITMODULES_PARSE, INFO) \ - FUNC(BAD_TAG_NAME, INFO) \ - FUNC(MISSING_TAGGER_ENTRY, INFO) \ - /* ignored (elevated when requested) */ \ - FUNC(EXTRA_HEADER_ENTRY, IGNORE) - -#define MSG_ID(id, msg_type) FSCK_MSG_##id, -enum fsck_msg_id { - FOREACH_FSCK_MSG_ID(MSG_ID) - FSCK_MSG_MAX -}; -#undef MSG_ID - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { diff --git a/fsck.h b/fsck.h index a7e092d3fb4..66c4a71139a 100644 --- a/fsck.h +++ b/fsck.h @@ -13,6 +13,72 @@ enum fsck_msg_type { FSCK_WARN, }; +#define FOREACH_FSCK_MSG_ID(FUNC) \ + /* fatal errors */ \ + FUNC(NUL_IN_HEADER, FATAL) \ + FUNC(UNTERMINATED_HEADER, FATAL) \ + /* errors */ \ + FUNC(BAD_DATE, ERROR) \ + FUNC(BAD_DATE_OVERFLOW, ERROR) \ + FUNC(BAD_EMAIL, ERROR) \ + FUNC(BAD_NAME, ERROR) \ + FUNC(BAD_OBJECT_SHA1, ERROR) \ + FUNC(BAD_PARENT_SHA1, ERROR) \ + 
FUNC(BAD_TAG_OBJECT, ERROR) \ + FUNC(BAD_TIMEZONE, ERROR) \ + FUNC(BAD_TREE, ERROR) \ + FUNC(BAD_TREE_SHA1, ERROR) \ + FUNC(BAD_TYPE, ERROR) \ + FUNC(DUPLICATE_ENTRIES, ERROR) \ + FUNC(MISSING_AUTHOR, ERROR) \ + FUNC(MISSING_COMMITTER, ERROR) \ + FUNC(MISSING_EMAIL, ERROR) \ + FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_OBJECT, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_TAG, ERROR) \ + FUNC(MISSING_TAG_ENTRY, ERROR) \ + FUNC(MISSING_TREE, ERROR) \ + FUNC(MISSING_TREE_OBJECT, ERROR) \ + FUNC(MISSING_TYPE, ERROR) \ + FUNC(MISSING_TYPE_ENTRY, ERROR) \ + FUNC(MULTIPLE_AUTHORS, ERROR) \ + FUNC(TREE_NOT_SORTED, ERROR) \ + FUNC(UNKNOWN_TYPE, ERROR) \ + FUNC(ZERO_PADDED_DATE, ERROR) \ + FUNC(GITMODULES_MISSING, ERROR) \ + FUNC(GITMODULES_BLOB, ERROR) \ + FUNC(GITMODULES_LARGE, ERROR) \ + FUNC(GITMODULES_NAME, ERROR) \ + FUNC(GITMODULES_SYMLINK, ERROR) \ + FUNC(GITMODULES_URL, ERROR) \ + FUNC(GITMODULES_PATH, ERROR) \ + FUNC(GITMODULES_UPDATE, ERROR) \ + /* warnings */ \ + FUNC(BAD_FILEMODE, WARN) \ + FUNC(EMPTY_NAME, WARN) \ + FUNC(FULL_PATHNAME, WARN) \ + FUNC(HAS_DOT, WARN) \ + FUNC(HAS_DOTDOT, WARN) \ + FUNC(HAS_DOTGIT, WARN) \ + FUNC(NULL_SHA1, WARN) \ + FUNC(ZERO_PADDED_FILEMODE, WARN) \ + FUNC(NUL_IN_COMMIT, WARN) \ + /* infos (reported as warnings, but ignored by default) */ \ + FUNC(GITMODULES_PARSE, INFO) \ + FUNC(BAD_TAG_NAME, INFO) \ + FUNC(MISSING_TAGGER_ENTRY, INFO) \ + /* ignored (elevated when requested) */ \ + FUNC(EXTRA_HEADER_ENTRY, IGNORE) + +#define MSG_ID(id, msg_type) FSCK_MSG_##id, +enum fsck_msg_id { + FOREACH_FSCK_MSG_ID(MSG_ID) + FSCK_MSG_MAX +}; +#undef MSG_ID + struct fsck_options; struct object; -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
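For readers unfamiliar with the pattern, the way one FOREACH-style list feeds both the enum and the lookup table can be sketched as below. This is a reduced illustration with three entries; the SKETCH_ prefix, the shortened list, and the two-field table are assumptions made for brevity and differ from the real fsck.h and fsck.c.

```c
#include <stddef.h>

enum sketch_msg_type { SKETCH_FATAL, SKETCH_ERROR, SKETCH_WARN };

/* Single source of truth: each entry names an id and its default severity. */
#define FOREACH_SKETCH_MSG_ID(FUNC) \
	/* fatal errors */ \
	FUNC(NUL_IN_HEADER, FATAL) \
	/* errors */ \
	FUNC(BAD_DATE, ERROR) \
	/* warnings */ \
	FUNC(HAS_DOT, WARN)

/* First expansion: one enumerator per entry, plus a trailing count. */
#define MSG_ID(id, msg_type) SKETCH_MSG_##id,
enum sketch_msg_id {
	FOREACH_SKETCH_MSG_ID(MSG_ID)
	SKETCH_MSG_MAX
};
#undef MSG_ID

/* Second expansion: a table indexed by the enum built from the same list. */
#define STR(x) #x
#define MSG_ID(id, msg_type) { STR(id), SKETCH_##msg_type },
static const struct {
	const char *id_string;
	enum sketch_msg_type msg_type;
} sketch_msg_id_info[SKETCH_MSG_MAX] = {
	FOREACH_SKETCH_MSG_ID(MSG_ID)
};
#undef MSG_ID
#undef STR	/* short helper names are undefined again once they are no longer needed */
```

Because both expansions walk the same list, the table index and the enumerator cannot drift apart, and a header can export the list and the enum while the table stays private to the .c file.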
* [PATCH v6 14/19] fsck.c: pass along the fsck_msg_id in the fsck_error callback 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (12 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 13/19] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 15/19] fsck.c: add an fsck_set_msg_type() API that takes enums Ævar Arnfjörð Bjarmason ` (5 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change the fsck_error callback to also pass along the fsck_msg_id. Before this change the only way to get the message id was to parse it back out of the "message". Let's pass it down explicitly for the benefit of callers that might want to use it, as discussed in [1]. Passing the msg_type is now redundant, as you can always get it back from the msg_id, but I'm not changing that convention. It's really common to need the msg_type, and the report() function itself (which calls "fsck_error") needs to call fsck_msg_type() to discover it. Let's not needlessly re-do that work in the user callback. 1. 
https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 4 +++- builtin/index-pack.c | 3 ++- builtin/mktag.c | 1 + fsck.c | 6 ++++-- fsck.h | 6 ++++-- 5 files changed, 14 insertions(+), 6 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 17940a4e24a..70ff95837ae 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -84,7 +84,9 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message) + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 8338b832b63..2f93957fb5e 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1717,6 +1717,7 @@ static int print_dangling_gitmodules(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, const char *message) { /* @@ -1727,7 +1728,7 @@ static int print_dangling_gitmodules(struct fsck_options *o, printf("%s\n", oid_to_hex(oid)); return 0; } - return fsck_error_function(o, oid, object_type, msg_type, message); + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); } int cmd_index_pack(int argc, const char **argv, const char *prefix) diff --git a/builtin/mktag.c b/builtin/mktag.c index 052a510ad7f..96e63bc772a 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -18,6 +18,7 @@ static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, const char *message) { switch (msg_type) { diff --git a/fsck.c b/fsck.c index 150fe467e43..23a77fe2e0f 100644 --- a/fsck.c +++ b/fsck.c @@ -227,7 +227,7 @@ static int report(struct fsck_options 
*options, va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); result = options->error_func(options, oid, object_type, - msg_type, sb.buf); + msg_type, msg_id, sb.buf); strbuf_release(&sb); va_end(ap); @@ -1180,7 +1180,9 @@ int fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message) + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index 66c4a71139a..fa2d4955ab3 100644 --- a/fsck.h +++ b/fsck.h @@ -101,11 +101,13 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message); + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message); + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); struct fsck_options { fsck_walk_func walk; -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
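What a callback gains from receiving the message id directly can be sketched like this. The signature is deliberately reduced (no options, oid or object type), and every sk_ name is a hypothetical stand-in rather than git's API; the filtering callback is only loosely modelled on the kind of special-casing a caller such as index-pack might want.

```c
#include <stdio.h>

/* Hypothetical, reduced stand-ins for fsck_msg_type and fsck_msg_id. */
enum sk_msg_type { SK_ERROR, SK_WARN };
enum sk_msg_id { SK_MSG_GITMODULES_MISSING, SK_MSG_BAD_DATE };

/* Default reporter: warnings are printed and tolerated, errors count as failures. */
static int sk_default_error(enum sk_msg_type msg_type, enum sk_msg_id msg_id,
			    const char *message)
{
	(void)msg_id; /* unused here, passed along for callbacks that need it */
	fprintf(stderr, "%s: %s\n",
		msg_type == SK_WARN ? "warning" : "error", message);
	return msg_type == SK_WARN ? 0 : 1;
}

/*
 * Custom callback: special-cases one message by its id instead of
 * parsing the id back out of the formatted "message" string.
 */
static int sk_tolerate_missing_gitmodules(enum sk_msg_type msg_type,
					  enum sk_msg_id msg_id,
					  const char *message)
{
	if (msg_id == SK_MSG_GITMODULES_MISSING)
		return 0;
	return sk_default_error(msg_type, msg_id, message);
}
```

The msg_type argument stays in the signature for the reason given in the commit message: the reporting side has already computed it, so the callback does not need to look it up again.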
* [PATCH v6 15/19] fsck.c: add an fsck_set_msg_type() API that takes enums 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (13 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 14/19] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 16/19] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason ` (4 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change code I added in acf9de4c94e (mktag: use fsck instead of custom verify_tag(), 2021-01-05) to make use of a new API function that takes the fsck_msg_{id,type} types, instead of arbitrary strings that we'll (hopefully) parse into those types. At the time that the fsck_set_msg_type() API was introduced in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) it was only intended to be used to parse user-supplied data. For things that are purely internal to the C code it makes sense to have the compiler check these arguments, and to skip the sanity checking of the data in fsck_set_msg_type() which is redundant to checks we get from the compiler. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/mktag.c | 3 ++- fsck.c | 27 +++++++++++++++++---------- fsck.h | 3 +++ 3 files changed, 22 insertions(+), 11 deletions(-) diff --git a/builtin/mktag.c b/builtin/mktag.c index 96e63bc772a..dddcccdd368 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -88,7 +88,8 @@ int cmd_mktag(int argc, const char **argv, const char *prefix) die_errno(_("could not read from stdin")); fsck_options.error_func = mktag_fsck_error_func; - fsck_set_msg_type(&fsck_options, "extraheaderentry", "warn"); + fsck_set_msg_type_from_ids(&fsck_options, FSCK_MSG_EXTRA_HEADER_ENTRY, + FSCK_WARN); /* config might set fsck.extraHeaderEntry=* again */ git_config(git_fsck_config, &fsck_options); if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options, diff --git a/fsck.c b/fsck.c index 23a77fe2e0f..a59832a1650 100644 --- a/fsck.c +++ b/fsck.c @@ -132,6 +132,22 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) return 1; } +void fsck_set_msg_type_from_ids(struct fsck_options *options, + enum fsck_msg_id msg_id, + enum fsck_msg_type msg_type) +{ + if (!options->msg_type) { + int i; + enum fsck_msg_type *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); + for (i = 0; i < FSCK_MSG_MAX; i++) + severity[i] = fsck_msg_type(i, options); + options->msg_type = severity; + } + + options->msg_type[msg_id] = msg_type; +} + void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { @@ -144,16 +160,7 @@ void fsck_set_msg_type(struct fsck_options *options, if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); - if (!options->msg_type) { - int i; - enum fsck_msg_type *severity; - ALLOC_ARRAY(severity, FSCK_MSG_MAX); - for (i = 0; i < FSCK_MSG_MAX; i++) - severity[i] = fsck_msg_type(i, options); - options->msg_type = severity; - } - - options->msg_type[msg_id] = msg_type; + 
fsck_set_msg_type_from_ids(options, msg_id, msg_type); } void fsck_set_msg_types(struct fsck_options *options, const char *values) diff --git a/fsck.h b/fsck.h index fa2d4955ab3..d284bac3614 100644 --- a/fsck.h +++ b/fsck.h @@ -82,6 +82,9 @@ enum fsck_msg_id { struct fsck_options; struct object; +void fsck_set_msg_type_from_ids(struct fsck_options *options, + enum fsck_msg_id msg_id, + enum fsck_msg_type msg_type); void fsck_set_msg_type(struct fsck_options *options, const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
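The split between a typed setter for internal callers and a string-parsing wrapper for user-supplied values can be sketched as follows. All sk_ names are hypothetical, the parser knows only two ids and three severities, and the wrapper returns -1 on bad input, whereas the diff above shows git's fsck_set_msg_type() calling die() for an unknown id.

```c
#include <stdlib.h>
#include <string.h>

enum sk_type { SK_IGNORE, SK_ERROR, SK_WARN };
enum sk_id { SK_ID_EXTRA_HEADER_ENTRY, SK_ID_BAD_DATE, SK_ID_MAX };

struct sk_options {
	enum sk_type *msg_type; /* allocated lazily on the first override */
};

static const enum sk_type sk_defaults[SK_ID_MAX] = {
	[SK_ID_EXTRA_HEADER_ENTRY] = SK_IGNORE,
	[SK_ID_BAD_DATE] = SK_ERROR,
};

/* Typed setter for internal callers: nothing to parse, the compiler checks the arguments. */
static void sk_set_from_ids(struct sk_options *o, enum sk_id id, enum sk_type type)
{
	if (!o->msg_type) {
		o->msg_type = malloc(sizeof(sk_defaults));
		if (!o->msg_type)
			abort();
		memcpy(o->msg_type, sk_defaults, sizeof(sk_defaults));
	}
	o->msg_type[id] = type;
}

/* String setter for user input: validate, then delegate to the typed setter. */
static int sk_set_from_strings(struct sk_options *o, const char *id, const char *type)
{
	enum sk_id parsed_id;
	enum sk_type parsed_type;

	if (!strcmp(id, "extraheaderentry"))
		parsed_id = SK_ID_EXTRA_HEADER_ENTRY;
	else if (!strcmp(id, "baddate"))
		parsed_id = SK_ID_BAD_DATE;
	else
		return -1;

	if (!strcmp(type, "error"))
		parsed_type = SK_ERROR;
	else if (!strcmp(type, "warn"))
		parsed_type = SK_WARN;
	else if (!strcmp(type, "ignore"))
		parsed_type = SK_IGNORE;
	else
		return -1;

	sk_set_from_ids(o, parsed_id, parsed_type);
	return 0;
}
```

An internal caller would write sk_set_from_ids(&opts, SK_ID_EXTRA_HEADER_ENTRY, SK_WARN), so a misspelled id becomes a compile error instead of a run-time failure.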
* [PATCH v6 16/19] fsck.c: move gitmodules_{found,done} into fsck_options 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (14 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 15/19] fsck.c: add an fsck_set_msg_type() API that takes enums Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 17/19] fetch-pack: don't needlessly copy fsck_options Ævar Arnfjörð Bjarmason ` (3 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Move the gitmodules_{found,done} static variables added in 159e7b080bf (fsck: detect gitmodules files, 2018-05-02) into the fsck_options struct. It makes sense to keep all the context in the same place. This requires changing the register_found_gitmodules() function, recently added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22), to take fsck_options. That function will be removed in a subsequent commit, but as it'll require the new gitmodules_found attribute of "fsck_options" we need this intermediate step first. An earlier version of this patch removed the small amount of duplication we now have between FSCK_OPTIONS_{DEFAULT,STRICT} with a FSCK_OPTIONS_COMMON macro. I don't think such de-duplication is worth it for this amount of copy/pasting.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fetch-pack.c | 2 +- fsck.c | 23 ++++++++++------------- fsck.h | 9 ++++++++- 3 files changed, 19 insertions(+), 15 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index fb04a76ca26..0f898a5ae14 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -998,7 +998,7 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(oid); + register_found_gitmodules(&fo, oid); if (fsck_finish(&fo)) die("fsck failed"); } diff --git a/fsck.c b/fsck.c index a59832a1650..642bd2ef9da 100644 --- a/fsck.c +++ b/fsck.c @@ -19,9 +19,6 @@ #include "credential.h" #include "help.h" -static struct oidset gitmodules_found = OIDSET_INIT; -static struct oidset gitmodules_done = OIDSET_INIT; - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { @@ -606,7 +603,7 @@ static int fsck_tree(const struct object_id *oid, if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, @@ -620,7 +617,7 @@ static int fsck_tree(const struct object_id *oid, has_dotgit |= is_ntfs_dotgit(backslash); if (is_ntfs_dotgitmodules(backslash)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, FSCK_MSG_GITMODULES_SYMLINK, @@ -1132,9 +1129,9 @@ static int fsck_blob(const struct object_id *oid, const char *buf, struct fsck_gitmodules_data data; struct config_options config_opts = { 0 }; - if (!oidset_contains(&gitmodules_found, oid)) + if (!oidset_contains(&options->gitmodules_found, oid)) return 0; - oidset_insert(&gitmodules_done, oid); + oidset_insert(&options->gitmodules_done, oid); if (object_on_skiplist(options, oid)) 
return 0; @@ -1199,9 +1196,9 @@ int fsck_error_function(struct fsck_options *o, return 1; } -void register_found_gitmodules(const struct object_id *oid) +void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) { - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); } int fsck_finish(struct fsck_options *options) @@ -1210,13 +1207,13 @@ int fsck_finish(struct fsck_options *options) struct oidset_iter iter; const struct object_id *oid; - oidset_iter_init(&gitmodules_found, &iter); + oidset_iter_init(&options->gitmodules_found, &iter); while ((oid = oidset_iter_next(&iter))) { enum object_type type; unsigned long size; char *buf; - if (oidset_contains(&gitmodules_done, oid)) + if (oidset_contains(&options->gitmodules_done, oid)) continue; buf = read_object_file(oid, &type, &size); @@ -1241,8 +1238,8 @@ int fsck_finish(struct fsck_options *options) } - oidset_clear(&gitmodules_found); - oidset_clear(&gitmodules_done); + oidset_clear(&options->gitmodules_found); + oidset_clear(&options->gitmodules_done); return ret; } diff --git a/fsck.h b/fsck.h index d284bac3614..e20f9bcb394 100644 --- a/fsck.h +++ b/fsck.h @@ -118,15 +118,21 @@ struct fsck_options { unsigned strict:1; enum fsck_msg_type *msg_type; struct oidset skiplist; + struct oidset gitmodules_found; + struct oidset gitmodules_done; kh_oid_map_t *object_names; }; #define FSCK_OPTIONS_DEFAULT { \ .skiplist = OIDSET_INIT, \ + .gitmodules_found = OIDSET_INIT, \ + .gitmodules_done = OIDSET_INIT, \ .error_func = fsck_error_function \ } #define FSCK_OPTIONS_STRICT { \ .strict = 1, \ + .gitmodules_found = OIDSET_INIT, \ + .gitmodules_done = OIDSET_INIT, \ .error_func = fsck_error_function, \ } @@ -146,7 +152,8 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); -void register_found_gitmodules(const struct object_id *oid); 
+void register_found_gitmodules(struct fsck_options *options, + const struct object_id *oid); /* * fsck a tag, and pass info about it back to the caller. This is -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
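Editorial sketch for patch 16/19: the idea of moving bookkeeping from file-scope statics into the options struct can be illustrated outside of git. Everything below is hypothetical: "toy_set" is a tiny stand-in for git's oidset, and "toy_options", "toy_pending()" and the TOY_* macros are invented for this illustration only. It is a minimal sketch of the pattern, not the code from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for git's oidset: a tiny fixed-capacity set of ints. */
#define TOY_SET_CAP 16
struct toy_set {
	int items[TOY_SET_CAP];
	size_t nr;
};
#define TOY_SET_INIT { .nr = 0 }

int toy_set_contains(const struct toy_set *set, int id)
{
	size_t i;

	for (i = 0; i < set->nr; i++)
		if (set->items[i] == id)
			return 1;
	return 0;
}

/* Returns 1 if the id was already present, 0 if inserted, -1 if the set is full. */
int toy_set_insert(struct toy_set *set, int id)
{
	if (toy_set_contains(set, id))
		return 1;
	if (set->nr == TOY_SET_CAP)
		return -1;
	set->items[set->nr++] = id;
	return 0;
}

/* The bookkeeping lives in the options struct, not in file-scope statics. */
struct toy_options {
	unsigned strict:1;
	struct toy_set found;
	struct toy_set done;
};
#define TOY_OPTIONS_STRICT { \
	.strict = 1, \
	.found = TOY_SET_INIT, \
	.done = TOY_SET_INIT, \
}

/* Count ids that were recorded as found but never marked as done. */
size_t toy_pending(const struct toy_options *o)
{
	size_t i, n = 0;

	for (i = 0; i < o->found.nr; i++)
		if (!toy_set_contains(&o->done, o->found.items[i]))
			n++;
	return n;
}
```

With this layout two "toy_options" instances never see each other's entries, which is the property the commit message refers to when it says all the context is kept in the same place.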
* [PATCH v6 17/19] fetch-pack: don't needlessly copy fsck_options 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (15 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 16/19] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 18/19] fetch-pack: use file-scope static struct for fsck_options Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change the behavior of the .gitmodules validation added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so we're using one "fsck_options". I found that code confusing to read. One might think that not setting up the error_func earlier means that we're relying on the "error_func" not being set in some code in between the two hunks being modified here. But we're not, all we're doing in the rest of "cmd_index_pack()" is further setup by calling fsck_set_msg_types(), and assigning to do_fsck_object. So there was no reason in 5476e1efde to make a shallow copy of the fsck_options struct before setting error_func. Let's just do this setup at the top of the function, along with the "walk" assignment. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/index-pack.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 2f93957fb5e..5b7bc3c8947 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1761,6 +1761,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) read_replace_refs = 0; fsck_options.walk = mark_link; + fsck_options.error_func = print_dangling_gitmodules; reset_pack_idx_option(&opts); git_config(git_index_pack_config, &opts); @@ -1951,13 +1952,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) else close(input_fd); - if (do_fsck_object) { - struct fsck_options fo = fsck_options; - - fo.error_func = print_dangling_gitmodules; - if (fsck_finish(&fo)) - die(_("fsck error in pack objects")); - } + if (do_fsck_object && fsck_finish(&fsck_options)) + die(_("fsck error in pack objects")); free(objects); strbuf_release(&index_name_buf); -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v6 18/19] fetch-pack: use file-scope static struct for fsck_options 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (16 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 17/19] fetch-pack: don't needlessly copy fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-28 13:15 ` [PATCH v6 19/19] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason 2021-03-29 2:06 ` [PATCH v6 00/19] fsck: API improvements Junio C Hamano 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change code added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so that we use a file-scoped "static struct fsck_options" instead of defining one in the "fsck_gitmodules_oids()" function. We use this pattern in all of builtin/{fsck,index-pack,mktag,unpack-objects}.c. It's odd to see fetch-pack be the odd one out. One might think that we're using other fsck_options structs in fetch-pack, or doing an fsck twice there, but we're not. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fetch-pack.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 0f898a5ae14..4ec10a15852 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,6 +38,7 @@ static int server_supports_filtering; static int advertise_sid; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; +static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; @@ -991,15 +992,14 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) { struct oidset_iter iter; const struct object_id *oid; - struct fsck_options fo = FSCK_OPTIONS_STRICT; if (!oidset_size(gitmodules_oids)) return; oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fo, oid); - if (fsck_finish(&fo)) + register_found_gitmodules(&fsck_options, oid); + if (fsck_finish(&fsck_options)) die("fsck failed"); } -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v6 19/19] fetch-pack: use new fsck API to printing dangling submodules 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (17 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 18/19] fetch-pack: use file-scope static struct for fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 ` Ævar Arnfjörð Bjarmason 2021-03-29 2:06 ` [PATCH v6 00/19] fsck: API improvements Junio C Hamano 19 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-28 13:15 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor the check added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to make use of us now passing the "msg_id" to the user defined "error_func". We can now compare against the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated message. Let's also replace register_found_gitmodules() with directly manipulating the "gitmodules_found" member. A recent commit moved it into "fsck_options" so we could do this here. I'm sticking this callback in fsck.c. Perhaps in the future we'd like to accumulate such callbacks into another file (maybe fsck-cb.c, similar to parse-options-cb.c?), but while we've got just the one let's just put it into fsck.c. A better alternative in this case would be some more obvious library shared by fetch-pack.c and builtin/index-pack.c, but there isn't such a thing. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/index-pack.c | 21 +-------------------- fetch-pack.c | 31 ++++++++----------------------- fsck.c | 23 ++++++++++++++++++----- fsck.h | 15 ++++++++++++--- 4 files changed, 39 insertions(+), 51 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 5b7bc3c8947..15507b5cff0 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -120,7 +120,7 @@ static int nr_threads; static int from_stdin; static int strict; static int do_fsck_object; -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; static int verbose; static int show_resolving_progress; static int show_stat; @@ -1713,24 +1713,6 @@ static void show_pack_info(int stat_only) } } -static int print_dangling_gitmodules(struct fsck_options *o, - const struct object_id *oid, - enum object_type object_type, - enum fsck_msg_type msg_type, - enum fsck_msg_id msg_id, - const char *message) -{ - /* - * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it - * instead of relying on this string check. 
- */ - if (starts_with(message, "gitmodulesMissing")) { - printf("%s\n", oid_to_hex(oid)); - return 0; - } - return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); -} - int cmd_index_pack(int argc, const char **argv, const char *prefix) { int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; @@ -1761,7 +1743,6 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) read_replace_refs = 0; fsck_options.walk = mark_link; - fsck_options.error_func = print_dangling_gitmodules; reset_pack_idx_option(&opts); git_config(git_index_pack_config, &opts); diff --git a/fetch-pack.c b/fetch-pack.c index 4ec10a15852..c80eaee7694 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,7 +38,7 @@ static int server_supports_filtering; static int advertise_sid; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; @@ -988,21 +988,6 @@ static int cmp_ref_by_name(const void *a_, const void *b_) return strcmp(a->name, b->name); } -static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) -{ - struct oidset_iter iter; - const struct object_id *oid; - - if (!oidset_size(gitmodules_oids)) - return; - - oidset_iter_init(gitmodules_oids, &iter); - while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fsck_options, oid); - if (fsck_finish(&fsck_options)) - die("fsck failed"); -} - static struct ref *do_fetch_pack(struct fetch_pack_args *args, int fd[2], const struct ref *orig_ref, @@ -1017,7 +1002,6 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, int agent_len; struct fetch_negotiator negotiator_alloc; struct fetch_negotiator *negotiator; - struct oidset gitmodules_oids = OIDSET_INIT; negotiator = &negotiator_alloc; 
fetch_negotiator_init(r, negotiator); @@ -1134,9 +1118,10 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args, else alternate_shallow_file = NULL; if (get_pack(args, fd, pack_lockfiles, NULL, sought, nr_sought, - &gitmodules_oids)) + &fsck_options.gitmodules_found)) die(_("git fetch-pack: fetch failed.")); - fsck_gitmodules_oids(&gitmodules_oids); + if (fsck_finish(&fsck_options)) + die("fsck failed"); all_done: if (negotiator) @@ -1587,7 +1572,6 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, struct string_list packfile_uris = STRING_LIST_INIT_DUP; int i; struct strvec index_pack_args = STRVEC_INIT; - struct oidset gitmodules_oids = OIDSET_INIT; negotiator = &negotiator_alloc; fetch_negotiator_init(r, negotiator); @@ -1678,7 +1662,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, process_section_header(&reader, "packfile", 0); if (get_pack(args, fd, pack_lockfiles, packfile_uris.nr ? &index_pack_args : NULL, - sought, nr_sought, &gitmodules_oids)) + sought, nr_sought, &fsck_options.gitmodules_found)) die(_("git fetch-pack: fetch failed.")); do_check_stateless_delimiter(args, &reader); @@ -1721,7 +1705,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, packname[the_hash_algo->hexsz] = '\0'; - parse_gitmodules_oids(cmd.out, &gitmodules_oids); + parse_gitmodules_oids(cmd.out, &fsck_options.gitmodules_found); close(cmd.out); @@ -1742,7 +1726,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, string_list_clear(&packfile_uris, 0); strvec_clear(&index_pack_args); - fsck_gitmodules_oids(&gitmodules_oids); + if (fsck_finish(&fsck_options)) + die("fsck failed"); if (negotiator) negotiator->release(negotiator); diff --git a/fsck.c b/fsck.c index 642bd2ef9da..f5ed6a26358 100644 --- a/fsck.c +++ b/fsck.c @@ -1196,11 +1196,6 @@ int fsck_error_function(struct fsck_options *o, return 1; } -void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) -{ - 
oidset_insert(&options->gitmodules_found, oid); -} - int fsck_finish(struct fsck_options *options) { int ret = 0; @@ -1266,3 +1261,21 @@ int git_fsck_config(const char *var, const char *value, void *cb) return git_default_config(var, value, cb); } + +/* + * Custom error callbacks that are used in more than one place. + */ + +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) +{ + if (msg_id == FSCK_MSG_GITMODULES_MISSING) { + puts(oid_to_hex(oid)); + return 0; + } + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); +} diff --git a/fsck.h b/fsck.h index e20f9bcb394..7202c3c87e8 100644 --- a/fsck.h +++ b/fsck.h @@ -111,6 +111,12 @@ int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, const char *message); +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message); struct fsck_options { fsck_walk_func walk; @@ -135,6 +141,12 @@ struct fsck_options { .gitmodules_done = OIDSET_INIT, \ .error_func = fsck_error_function, \ } +#define FSCK_OPTIONS_MISSING_GITMODULES { \ + .strict = 1, \ + .gitmodules_found = OIDSET_INIT, \ + .gitmodules_done = OIDSET_INIT, \ + .error_func = fsck_error_cb_print_missing_gitmodules, \ +} /* descend in all linked child objects * the return value is: @@ -152,9 +164,6 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); -void register_found_gitmodules(struct fsck_options *options, - const struct object_id *oid); - /* * fsck a tag, and pass info about it back to the caller. 
This is * exposed fsck_object() internals for git-mktag(1). -- 2.31.1.445.g087790d4945 ^ permalink raw reply related [flat|nested] 229+ messages in thread
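Editorial sketch for patch 19/19: the change from matching a message prefix to comparing an enum can be shown in isolation. The names below ("toy_msg_id", "toy_error_cb()" and friends) are hypothetical and only mirror the shape of the callback in the patch; git's real IDs and callback signature are the ones in the diff above. The sketch assumes the callback returns 0 for "handled, not an error" and nonzero otherwise.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical message IDs; git's real list lives in fsck.h. */
enum toy_msg_id {
	TOY_MSG_GITMODULES_MISSING,
	TOY_MSG_BAD_DATE,
};

/* Old approach: infer the ID from the prefix of the formatted message. */
int toy_is_missing_by_string(const char *message)
{
	const char *prefix = "gitmodulesMissing";

	return !strncmp(message, prefix, strlen(prefix));
}

/* New approach: the caller passes the ID, so no string parsing is needed. */
int toy_is_missing_by_id(enum toy_msg_id msg_id)
{
	return msg_id == TOY_MSG_GITMODULES_MISSING;
}

/*
 * Callback in the shape of the patch: handle one ID specially and
 * report everything else as an error (nonzero return).
 */
int toy_error_cb(enum toy_msg_id msg_id, const char *hex, const char *message)
{
	if (toy_is_missing_by_id(msg_id)) {
		puts(hex);	/* print the object name for the caller to collect */
		return 0;
	}
	fprintf(stderr, "error: %s\n", message);
	return 1;
}
```

The enum comparison keeps working if the wording or formatting of the message changes, which the prefix check does not.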
* Re: [PATCH v6 00/19] fsck: API improvements 2021-03-28 13:15 ` [PATCH v6 " Ævar Arnfjörð Bjarmason ` (18 preceding siblings ...) 2021-03-28 13:15 ` [PATCH v6 19/19] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason @ 2021-03-29 2:06 ` Junio C Hamano 19 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-29 2:06 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > To recap on the goals in v1[1] this series gets rid of the need to > have the recently added "print_dangling_gitmodules" function in > favor of a better fsck API to get at that information. Read the whole series afresh, as well as "git diff @{1}" after replacing to see what changed since the previous round. Didn't find anything iffy. Unless somebody finds improvement opportunities in the coming couple of days, let's declare it is good enough and merge to 'next', polishing incrementally if needed. Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* [PATCH v5 01/19] fsck.c: refactor and rename common config callback 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason 2021-03-16 19:35 ` Derrick Stolee 2021-03-17 18:20 ` [PATCH v5 00/19] " Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason ` (17 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor code I recently changed in 1f3299fda9 (fsck: make fsck_config() re-usable, 2021-01-05) so that I could use fsck's config callback in mktag. I don't know what I was thinking in structuring the code this way, but it clearly makes no sense to have an fsck_config_internal() at all just so it can get a fsck_options when git_config() already supports passing along some void* data. Let's just make use of that instead, which gets us rid of the two wrapper functions, and brings fsck's common config callback in line with other such reusable config callbacks. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 7 +------ builtin/mktag.c | 7 +------ fsck.c | 4 ++-- fsck.h | 3 +-- 4 files changed, 5 insertions(+), 16 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 821e7798c7..a56a2d0513 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -71,11 +71,6 @@ static const char *printable_type(const struct object_id *oid, return ret; } -static int fsck_config(const char *var, const char *value, void *cb) -{ - return fsck_config_internal(var, value, cb, &fsck_obj_options); -} - static int objerror(struct object *obj, const char *err) { errors_found |= ERROR_OBJECT; @@ -803,7 +798,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) if (name_objects) fsck_enable_object_names(&fsck_walk_options); - git_config(fsck_config, NULL); + git_config(git_fsck_config, &fsck_obj_options); if (connectivity_only) { for_each_loose_object(mark_loose_for_connectivity, NULL, 0); diff --git a/builtin/mktag.c b/builtin/mktag.c index 41a399a69e..23c4b8763f 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -14,11 +14,6 @@ static int option_strict = 1; static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; -static int mktag_config(const char *var, const char *value, void *cb) -{ - return fsck_config_internal(var, value, cb, &fsck_options); -} - static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, @@ -93,7 +88,7 @@ int cmd_mktag(int argc, const char **argv, const char *prefix) fsck_options.error_func = mktag_fsck_error_func; fsck_set_msg_type(&fsck_options, "extraheaderentry", "warn"); /* config might set fsck.extraHeaderEntry=* again */ - git_config(mktag_config, NULL); + git_config(git_fsck_config, &fsck_options); if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options, &tagged_oid, &tagged_type)) die(_("tag on stdin did not pass our strict fsck check")); diff --git a/fsck.c b/fsck.c index e3030f3b35..5dfb99665a 
100644 --- a/fsck.c +++ b/fsck.c @@ -1323,9 +1323,9 @@ int fsck_finish(struct fsck_options *options) return ret; } -int fsck_config_internal(const char *var, const char *value, void *cb, - struct fsck_options *options) +int git_fsck_config(const char *var, const char *value, void *cb) { + struct fsck_options *options = cb; if (strcmp(var, "fsck.skiplist") == 0) { const char *path; struct strbuf sb = STRBUF_INIT; diff --git a/fsck.h b/fsck.h index 733378f126..f70d11c559 100644 --- a/fsck.h +++ b/fsck.h @@ -109,7 +109,6 @@ const char *fsck_describe_object(struct fsck_options *options, * git_config() callback for use by fsck-y tools that want to support * fsck.<msg> fsck.skipList etc. */ -int fsck_config_internal(const char *var, const char *value, void *cb, - struct fsck_options *options); +int git_fsck_config(const char *var, const char *value, void *cb); #endif -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
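Editorial sketch for patch 01/19: the refactoring relies on the common C idiom of threading caller state through an opaque "void *" argument instead of writing one wrapper function per caller. The names below are hypothetical: "toy_config()" is a stand-in for a config iterator that feeds a single hard-coded pair, and "toy.strict" is an invented key, not a real git setting.

```c
#include <string.h>

struct toy_options {
	int strict;
};

/* Generic callback type: the last argument is opaque caller data. */
typedef int (*toy_config_fn)(const char *var, const char *value, void *cb);

/* Hypothetical stand-in for a config iterator: feeds one key/value pair to fn. */
int toy_config(toy_config_fn fn, void *cb)
{
	return fn("toy.strict", "1", cb);
}

/* Reusable callback: recovers its options from the void pointer. */
int toy_fsck_config(const char *var, const char *value, void *cb)
{
	struct toy_options *options = cb;

	if (!strcmp(var, "toy.strict"))
		options->strict = !strcmp(value, "1");
	return 0;
}
```

Each caller passes a pointer to its own options, e.g. toy_config(toy_fsck_config, &my_options), so no per-caller wrapper that hard-codes a particular global struct is needed.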
* [PATCH v5 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (2 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 01/19] fsck.c: refactor and rename common config callback Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 03/19] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason ` (16 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor the definitions of FSCK_OPTIONS_{DEFAULT,STRICT} to use designated initializers. While I'm at it add the "object_names" member to the initialization. This was omitted in 7b35efd734e (fsck_walk(): optionally name objects on the go, 2016-07-17) when the field was added. I'm using new FSCK_OPTIONS_COMMON and FSCK_OPTIONS_COMMON_ERROR_FUNC helper macros to define what FSCK_OPTIONS_{DEFAULT,STRICT} have in common, and define the two in terms of those macros. The FSCK_OPTIONS_COMMON macro will be used in a subsequent commit to define other variants of common fsck initialization that want to use a custom error function, but share the rest of the defaults. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fsck.h b/fsck.h index f70d11c559..15e12f292f 100644 --- a/fsck.h +++ b/fsck.h @@ -43,8 +43,17 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } +#define FSCK_OPTIONS_COMMON \ + .walk = NULL, \ + .msg_type = NULL, \ + .skiplist = OIDSET_INIT, \ + .object_names = NULL, +#define FSCK_OPTIONS_COMMON_ERROR_FUNC \ + FSCK_OPTIONS_COMMON \ + .error_func = fsck_error_function + +#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON_ERROR_FUNC } +#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON_ERROR_FUNC } /* descend in all linked child objects * the return value is: -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
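Editorial sketch for patch 02/19: the macro layering in the diff can be tried out on a reduced struct. "toy_options", "toy_error_function()" and the TOY_* macros are hypothetical names that only mirror the structure of the patch; the member list is shortened for illustration.

```c
#include <stddef.h>

struct toy_options {
	void *walk;
	int (*error_func)(const char *message);
	unsigned strict:1;
	int *msg_type;
};

int toy_error_function(const char *message)
{
	(void)message;	/* unused in this sketch */
	return 1;
}

/* Members shared by every variant; note the trailing comma. */
#define TOY_OPTIONS_COMMON \
	.walk = NULL, \
	.msg_type = NULL,
/* Shared members plus the default error function. */
#define TOY_OPTIONS_COMMON_ERROR_FUNC \
	TOY_OPTIONS_COMMON \
	.error_func = toy_error_function

#define TOY_OPTIONS_DEFAULT { .strict = 0, TOY_OPTIONS_COMMON_ERROR_FUNC }
#define TOY_OPTIONS_STRICT { .strict = 1, TOY_OPTIONS_COMMON_ERROR_FUNC }
```

Because the initializers are designated, the order of members in the struct no longer has to match the order in the macro, and a variant with a custom error function can reuse TOY_OPTIONS_COMMON and supply its own ".error_func".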
* [PATCH v5 03/19] fsck.h: use "enum object_type" instead of "int" 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (3 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 02/19] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 04/19] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason ` (15 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in 355885d5315 (add generic, type aware object chain walker, 2008-02-25) we've used entries from object_type (OBJ_BLOB etc.). So this doesn't really change anything as far as the generated code is concerned, it just gives the compiler more information and makes this easier to read. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 3 ++- builtin/index-pack.c | 3 ++- builtin/unpack-objects.c | 3 ++- fsck.h | 3 ++- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index a56a2d0513..ed5f2af6b5 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -192,7 +192,8 @@ static int traverse_reachable(void) return !!result; } -static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_used(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options) { if (!obj) return 1; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index bad5748807..69f24fe9f7 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -212,7 +212,8 @@ static void cleanup_thread(void) free(thread_data); } -static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_link(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { if (!obj) return -1; diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index dd4a75e030..ca54fd1668 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -187,7 +187,8 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf) * that have reachability requirements and calls this function. * Verify its reachability and validity recursively and write it out. 
*/ -static int check_object(struct object *obj, int type, void *data, struct fsck_options *options) +static int check_object(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { struct obj_buffer *obj_buf; diff --git a/fsck.h b/fsck.h index 15e12f292f..e3edaff8e7 100644 --- a/fsck.h +++ b/fsck.h @@ -23,7 +23,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type); * <0 error signaled and abort * >0 error signaled and do not abort */ -typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options); +typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options); /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v5 04/19] fsck.c: rename variables in fsck_set_msg_type() for less confusion 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (4 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 03/19] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 05/19] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason ` (14 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Rename variables in a function added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22). It was needlessly confusing that it took a "msg_type" argument, but then later declared another "msg_type" of a different type. Let's rename that to "severity", and rename "id" to "msg_id" and "msg_id" to "msg_id_str" etc. This will make a follow-up change smaller. While I'm at it properly indent the fsck_set_msg_type() argument list. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 24 ++++++++++++------------ fsck.h | 2 +- 2 files changed, 13 insertions(+), 13 deletions(-) diff --git a/fsck.c b/fsck.c index 5dfb99665a..7cc722a25c 100644 --- a/fsck.c +++ b/fsck.c @@ -203,27 +203,27 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) } void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type) + const char *msg_id_str, const char *msg_type_str) { - int id = parse_msg_id(msg_id), type; + int msg_id = parse_msg_id(msg_id_str), msg_type; - if (id < 0) - die("Unhandled message id: %s", msg_id); - type = parse_msg_type(msg_type); + if (msg_id < 0) + die("Unhandled message id: %s", msg_id_str); + msg_type = parse_msg_type(msg_type_str); - if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL) - die("Cannot demote %s to %s", msg_id, msg_type); + if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) + die("Cannot demote %s to %s", msg_id_str, msg_type_str); if (!options->msg_type) { int i; - int *msg_type; - ALLOC_ARRAY(msg_type, FSCK_MSG_MAX); + int *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) - msg_type[i] = fsck_msg_type(i, options); - options->msg_type = msg_type; + severity[i] = fsck_msg_type(i, options); + options->msg_type = severity; } - options->msg_type[id] = type; + options->msg_type[msg_id] = msg_type; } void fsck_set_msg_types(struct fsck_options *options, const char *values) diff --git a/fsck.h b/fsck.h index e3edaff8e7..12ff99b56e 100644 --- a/fsck.h +++ b/fsck.h @@ -11,7 +11,7 @@ struct fsck_options; struct object; void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type); + const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); int is_valid_msg_type(const char *msg_id, const char *msg_type); -- 2.31.0.260.g719c683c1d ^ permalink raw reply 
related [flat|nested] 229+ messages in thread
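Editorial sketch for patch 04/19: the function being touched lazily allocates a per-options severity array, seeds it from the current defaults, and then overrides a single entry. A simplified, hypothetical version of that pattern follows. The TOY_* names and the three-entry default table are invented, and the sketch leaves out the parsing, the "strict" handling and the fatal-demotion check that the real fsck_set_msg_type() performs.

```c
#include <stdlib.h>

enum toy_severity { TOY_ERROR, TOY_WARN, TOY_IGNORE };
enum { TOY_MSG_MAX = 3 };

/* Hypothetical built-in severities, indexed by message ID. */
const int toy_defaults[TOY_MSG_MAX] = { TOY_ERROR, TOY_WARN, TOY_WARN };

struct toy_options {
	int *msg_type;	/* NULL until the first override */
};

/* Look up the severity: per-options override table if present, else defaults. */
int toy_msg_type(int msg_id, const struct toy_options *options)
{
	return options->msg_type ? options->msg_type[msg_id] : toy_defaults[msg_id];
}

/* Override one severity; allocates and seeds the table on first use. */
int toy_set_msg_type(struct toy_options *options, int msg_id, int msg_type)
{
	if (msg_id < 0 || msg_id >= TOY_MSG_MAX)
		return -1;
	if (!options->msg_type) {
		int i;
		int *severity = malloc(sizeof(*severity) * TOY_MSG_MAX);

		if (!severity)
			return -1;
		for (i = 0; i < TOY_MSG_MAX; i++)
			severity[i] = toy_msg_type(i, options);
		options->msg_type = severity;
	}
	options->msg_type[msg_id] = msg_type;
	return 0;
}
```

Naming the temporary array "severity", as the patch does, avoids having two different things called "msg_type" in the same function.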
* [PATCH v5 05/19] fsck.c: move definition of msg_id into append_msg_id() 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (5 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 04/19] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 06/19] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason ` (13 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor code added in 71ab8fa840f (fsck: report the ID of the error/warning, 2015-06-22) to resolve the msg_id to a string in the function that wants it, instead of doing it in report(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 7cc722a25c..ffb9115ddb 100644 --- a/fsck.c +++ b/fsck.c @@ -264,8 +264,9 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); } -static void append_msg_id(struct strbuf *sb, const char *msg_id) +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) { + const char *msg_id = msg_id_info[id].id_string; for (;;) { char c = *(msg_id)++; @@ -308,7 +309,7 @@ static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, msg_id_info[id].id_string); + append_msg_id(&sb, id); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v5 06/19] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (6 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 05/19] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 07/19] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Ævar Arnfjörð Bjarmason ` (12 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Rename the remaining variables of type fsck_msg_id from "id" to "msg_id". This change is relatively small, and is worth the churn for a later change where we have different id's in the "report" function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/fsck.c b/fsck.c index ffb9115ddb..a9a8783aeb 100644 --- a/fsck.c +++ b/fsck.c @@ -264,19 +264,19 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); } -static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id msg_id) { - const char *msg_id = msg_id_info[id].id_string; + const char *msg_id_str = msg_id_info[msg_id].id_string; for (;;) { - char c = *(msg_id)++; + char c = *(msg_id_str)++; if (!c) break; if (c != '_') strbuf_addch(sb, tolower(c)); else { - assert(*msg_id); - strbuf_addch(sb, *(msg_id)++); + assert(*msg_id_str); + strbuf_addch(sb, *(msg_id_str)++); } } @@ -292,11 +292,11 @@ static int object_on_skiplist(struct fsck_options *opts, __attribute__((format (printf, 5, 6))) static int report(struct fsck_options *options, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_id id, const 
char *fmt, ...) + enum fsck_msg_id msg_id, const char *fmt, ...) { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(id, options), result; + int msg_type = fsck_msg_type(msg_id, options), result; if (msg_type == FSCK_IGNORE) return 0; @@ -309,7 +309,7 @@ static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, id); + append_msg_id(&sb, msg_id); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
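For readers following the append_msg_id() changes in 05/19 and 06/19, the conversion loop can be sketched outside of git as below. This is only an illustration: camelcase_msg_id() and its fixed-size output buffer are hypothetical stand-ins, whereas the real function appends to a strbuf.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for append_msg_id(): turns an upper-case,
 * underscore-separated id such as "BAD_DATE" into "badDate".
 * Output is truncated if it does not fit in "out".
 */
static void camelcase_msg_id(const char *msg_id_str, char *out, size_t outlen)
{
	size_t n = 0;

	assert(outlen > 0);
	while (*msg_id_str && n + 1 < outlen) {
		char c = *msg_id_str++;

		if (c != '_') {
			/* Ordinary characters are lower-cased. */
			out[n++] = (char)tolower((unsigned char)c);
		} else {
			/* Drop the '_' and copy the next character unchanged. */
			assert(*msg_id_str);
			out[n++] = *msg_id_str++;
		}
	}
	out[n] = '\0';
}
```

Since the ids are stored in upper case, the character after each underscore stays capitalized, which is what produces the camelCase form.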
* [PATCH v5 07/19] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (7 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 06/19] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 08/19] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason ` (11 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor "if options->msg_type" and other code added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) to reduce the scope of the "int msg_type" variable. This is in preparation for changing its type in a subsequent commit; only using it in the "!options->msg_type" scope makes that change smaller. This also brings the code in line with the fsck_set_msg_type() function (also added in 0282f4dced0), which does a similar check for "!options->msg_type". Another minor benefit is getting rid of the style violation of not having braces for the body of the "if". 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/fsck.c b/fsck.c index a9a8783aeb..2f23255f99 100644 --- a/fsck.c +++ b/fsck.c @@ -167,19 +167,17 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) static int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { - int msg_type; - assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); - if (options->msg_type) - msg_type = options->msg_type[msg_id]; - else { - msg_type = msg_id_info[msg_id].msg_type; + if (!options->msg_type) { + int msg_type = msg_id_info[msg_id].msg_type; + if (options->strict && msg_type == FSCK_WARN) msg_type = FSCK_ERROR; + return msg_type; } - return msg_type; + return options->msg_type[msg_id]; } static int parse_msg_type(const char *str) -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
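The early-return shape that 07/19 gives fsck_msg_type() can be sketched in isolation as follows. All names here (lookup_severity, struct opts, the three-entry default table) are hypothetical simplifications, not git's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical severities and message count, for illustration only. */
enum { SEV_ERROR = 1, SEV_WARN, SEV_IGNORE };
enum { MSG_MAX = 3 };

static const int default_severity[MSG_MAX] = {
	SEV_ERROR, SEV_WARN, SEV_IGNORE
};

struct opts {
	unsigned strict:1;
	int *overrides; /* NULL until some severity has been customized */
};

static int lookup_severity(int msg_id, const struct opts *o)
{
	assert(msg_id >= 0 && msg_id < MSG_MAX);

	if (!o->overrides) {
		/* The temporary lives only in the branch that needs it. */
		int sev = default_severity[msg_id];

		if (o->strict && sev == SEV_WARN)
			sev = SEV_ERROR;
		return sev;
	}

	return o->overrides[msg_id];
}
```

With the variable confined to one branch, a later change of its type touches a single declaration.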
* [PATCH v5 08/19] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (8 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 07/19] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 09/19] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason ` (10 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new fsck_msg_type enum. These defines were originally introduced in: - ba002f3b28a (builtin-fsck: move common object checking code to fsck.c, 2008-02-25) - f50c4407305 (fsck: disallow demoting grave fsck errors to warnings, 2015-06-22) - efaba7cc77f (fsck: optionally ignore specific fsck issues completely, 2015-06-22) - f27d05b1704 (fsck: allow upgrading fsck warnings to errors, 2015-06-22) The reason these were defined in two different places is because we use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are used by external callbacks. Untangling that would take some more work, since we expose the new "enum fsck_msg_type" to both. Similar to "enum object_type" it's not worth structuring the API in such a way that only those who need FSCK_{ERROR,WARN} pass around a different type. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 2 +- builtin/index-pack.c | 3 ++- builtin/mktag.c | 3 ++- fsck.c | 21 ++++++++++----------- fsck.h | 16 ++++++++++------ 5 files changed, 25 insertions(+), 20 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index ed5f2af6b5..17940a4e24 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -84,7 +84,7 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 69f24fe9f7..56b8efaa89 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1716,7 +1716,8 @@ static void show_pack_info(int stat_only) static int print_dangling_gitmodules(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { /* * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it diff --git a/builtin/mktag.c b/builtin/mktag.c index 23c4b8763f..052a510ad7 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -17,7 +17,8 @@ static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/fsck.c b/fsck.c index 2f23255f99..e1e942821d 100644 --- a/fsck.c +++ b/fsck.c @@ -22,9 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FSCK_FATAL -1 -#define FSCK_INFO -2 - #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ @@ -97,7 +94,7 @@ static struct 
{ const char *id_string; const char *downcased; const char *camelcased; - int msg_type; + enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { FOREACH_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } @@ -164,13 +161,13 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) list_config_item(list, prefix, msg_id_info[i].camelcased); } -static int fsck_msg_type(enum fsck_msg_id msg_id, +static enum fsck_msg_type fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); if (!options->msg_type) { - int msg_type = msg_id_info[msg_id].msg_type; + enum fsck_msg_type msg_type = msg_id_info[msg_id].msg_type; if (options->strict && msg_type == FSCK_WARN) msg_type = FSCK_ERROR; @@ -180,7 +177,7 @@ static int fsck_msg_type(enum fsck_msg_id msg_id, return options->msg_type[msg_id]; } -static int parse_msg_type(const char *str) +static enum fsck_msg_type parse_msg_type(const char *str) { if (!strcmp(str, "error")) return FSCK_ERROR; @@ -203,7 +200,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { - int msg_id = parse_msg_id(msg_id_str), msg_type; + int msg_id = parse_msg_id(msg_id_str); + enum fsck_msg_type msg_type; if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); @@ -214,7 +212,7 @@ void fsck_set_msg_type(struct fsck_options *options, if (!options->msg_type) { int i; - int *severity; + enum fsck_msg_type *severity; ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) severity[i] = fsck_msg_type(i, options); @@ -294,7 +292,8 @@ static int report(struct fsck_options *options, { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(msg_id, options), result; + enum fsck_msg_type msg_type = fsck_msg_type(msg_id, options); + int result; if (msg_type == FSCK_IGNORE) return 0; @@ -1265,7 +1264,7 @@ int fsck_object(struct object 
*obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index 12ff99b56e..0fff04373e 100644 --- a/fsck.h +++ b/fsck.h @@ -3,9 +3,13 @@ #include "oidset.h" -#define FSCK_ERROR 1 -#define FSCK_WARN 2 -#define FSCK_IGNORE 3 +enum fsck_msg_type { + FSCK_INFO = -2, + FSCK_FATAL = -1, + FSCK_ERROR = 1, + FSCK_WARN, + FSCK_IGNORE +}; struct fsck_options; struct object; @@ -29,17 +33,17 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); struct fsck_options { fsck_walk_func walk; fsck_error error_func; unsigned strict:1; - int *msg_type; + enum fsck_msg_type *msg_type; struct oidset skiplist; kh_oid_map_t *object_names; }; -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v5 09/19] fsck.h: re-order and re-assign "enum fsck_msg_type" 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (9 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 08/19] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 10/19] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason ` (9 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change the values in the "enum fsck_msg_type" from being manually assigned to using default C enum values. This means we end up with a FSCK_IGNORE=0, which was previously defined as "3". I'm confident that nothing relies on these values; we always compare them explicitly. Let's not omit "0" so it won't be assumed that we're using these as a boolean somewhere. This also allows us to re-structure the fields to mark which are "private" vs. "public". See the preceding commit for a rationale for not simply splitting these into two enums, namely that this is used for both the private and public fsck API. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fsck.h b/fsck.h index 0fff04373e..25c456bbd3 100644 --- a/fsck.h +++ b/fsck.h @@ -4,11 +4,13 @@ #include "oidset.h" enum fsck_msg_type { - FSCK_INFO = -2, - FSCK_FATAL = -1, - FSCK_ERROR = 1, + /* for internal use only */ + FSCK_IGNORE, + FSCK_INFO, + FSCK_FATAL, + /* "public", fed to e.g. error_func callbacks */ + FSCK_ERROR, FSCK_WARN, - FSCK_IGNORE }; struct fsck_options; -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
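As a small illustration of why the renumbering in 09/19 is safe, the sketch below uses a hypothetical copy of the enum (sketch_msg_type, with SK_ prefixes) and a describe() helper that is not part of git. The enumerators take the default values 0..4, and the code only ever compares against the names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the re-ordered enum; values default to 0..4. */
enum sketch_msg_type {
	/* for internal use only */
	SK_IGNORE,
	SK_INFO,
	SK_FATAL,
	/* "public" */
	SK_ERROR,
	SK_WARN
};

/* Compares explicitly against each name, never against a raw number. */
static const char *describe(enum sketch_msg_type t)
{
	switch (t) {
	case SK_IGNORE: return "ignore";
	case SK_INFO:   return "info";
	case SK_FATAL:  return "fatal";
	case SK_ERROR:  return "error";
	case SK_WARN:   return "warn";
	}
	return "unknown";
}
```

Code written this way is unaffected by renumbering; only something like "if (msg_type)" would notice that one enumerator is now zero.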
* [PATCH v5 10/19] fsck.c: call parse_msg_type() early in fsck_set_msg_type() 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (10 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 09/19] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 11/19] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason ` (8 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason There's no reason to defer the calling of parse_msg_type() until after we've checked if the "id < 0". This is not a hot codepath, and parse_msg_type() itself may die on invalid input. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index e1e942821d..341c482fed 100644 --- a/fsck.c +++ b/fsck.c @@ -201,11 +201,10 @@ void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { int msg_id = parse_msg_id(msg_id_str); - enum fsck_msg_type msg_type; + enum fsck_msg_type msg_type = parse_msg_type(msg_type_str); if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); - msg_type = parse_msg_type(msg_type_str); if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v5 11/19] fsck.c: undefine temporary STR macro after use 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (11 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 10/19] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 12/19] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason ` (7 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason In f417eed8cde (fsck: provide a function to parse fsck message IDs, 2015-06-22) the "STR" macro was introduced, but that short macro name was not undefined after use as was done earlier in the same series for the MSG_ID macro in c99ba492f1c (fsck: introduce identifiers for fsck messages, 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fsck.c b/fsck.c index 341c482fed..e657636a6f 100644 --- a/fsck.c +++ b/fsck.c @@ -100,6 +100,7 @@ static struct { { NULL, NULL, NULL, -1 } }; #undef MSG_ID +#undef STR static void prepare_msg_ids(void) { -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v5 12/19] fsck.c: give "FOREACH_MSG_ID" a more specific name 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (12 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 11/19] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 13/19] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h Ævar Arnfjörð Bjarmason ` (6 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation for moving it over to fsck.h. It's good convention to name macros in *.h files in such a way as to clearly not clash with any other names in other files. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fsck.c b/fsck.c index e657636a6f..b64526ea35 100644 --- a/fsck.c +++ b/fsck.c @@ -22,7 +22,7 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_MSG_ID(FUNC) \ +#define FOREACH_FSCK_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ FUNC(UNTERMINATED_HEADER, FATAL) \ @@ -83,7 +83,7 @@ static struct oidset gitmodules_done = OIDSET_INIT; #define MSG_ID(id, msg_type) FSCK_MSG_##id, enum fsck_msg_id { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) FSCK_MSG_MAX }; #undef MSG_ID @@ -96,7 +96,7 @@ static struct { const char *camelcased; enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } }; #undef MSG_ID -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v5 13/19] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (13 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 12/19] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 14/19] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason ` (5 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps define from fsck.c to fsck.h. This is in preparation for having non-static functions take the fsck_msg_id as an argument. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 66 ---------------------------------------------------------- fsck.h | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 66 insertions(+), 66 deletions(-) diff --git a/fsck.c b/fsck.c index b64526ea35..49208ec636 100644 --- a/fsck.c +++ b/fsck.c @@ -22,72 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_FSCK_MSG_ID(FUNC) \ - /* fatal errors */ \ - FUNC(NUL_IN_HEADER, FATAL) \ - FUNC(UNTERMINATED_HEADER, FATAL) \ - /* errors */ \ - FUNC(BAD_DATE, ERROR) \ - FUNC(BAD_DATE_OVERFLOW, ERROR) \ - FUNC(BAD_EMAIL, ERROR) \ - FUNC(BAD_NAME, ERROR) \ - FUNC(BAD_OBJECT_SHA1, ERROR) \ - FUNC(BAD_PARENT_SHA1, ERROR) \ - FUNC(BAD_TAG_OBJECT, ERROR) \ - FUNC(BAD_TIMEZONE, ERROR) \ - FUNC(BAD_TREE, ERROR) \ - FUNC(BAD_TREE_SHA1, ERROR) \ - FUNC(BAD_TYPE, ERROR) \ - FUNC(DUPLICATE_ENTRIES, ERROR) \ - FUNC(MISSING_AUTHOR, ERROR) \ - FUNC(MISSING_COMMITTER, ERROR) \ - FUNC(MISSING_EMAIL, ERROR) \ - 
FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ - FUNC(MISSING_OBJECT, ERROR) \ - FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ - FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ - FUNC(MISSING_TAG, ERROR) \ - FUNC(MISSING_TAG_ENTRY, ERROR) \ - FUNC(MISSING_TREE, ERROR) \ - FUNC(MISSING_TREE_OBJECT, ERROR) \ - FUNC(MISSING_TYPE, ERROR) \ - FUNC(MISSING_TYPE_ENTRY, ERROR) \ - FUNC(MULTIPLE_AUTHORS, ERROR) \ - FUNC(TREE_NOT_SORTED, ERROR) \ - FUNC(UNKNOWN_TYPE, ERROR) \ - FUNC(ZERO_PADDED_DATE, ERROR) \ - FUNC(GITMODULES_MISSING, ERROR) \ - FUNC(GITMODULES_BLOB, ERROR) \ - FUNC(GITMODULES_LARGE, ERROR) \ - FUNC(GITMODULES_NAME, ERROR) \ - FUNC(GITMODULES_SYMLINK, ERROR) \ - FUNC(GITMODULES_URL, ERROR) \ - FUNC(GITMODULES_PATH, ERROR) \ - FUNC(GITMODULES_UPDATE, ERROR) \ - /* warnings */ \ - FUNC(BAD_FILEMODE, WARN) \ - FUNC(EMPTY_NAME, WARN) \ - FUNC(FULL_PATHNAME, WARN) \ - FUNC(HAS_DOT, WARN) \ - FUNC(HAS_DOTDOT, WARN) \ - FUNC(HAS_DOTGIT, WARN) \ - FUNC(NULL_SHA1, WARN) \ - FUNC(ZERO_PADDED_FILEMODE, WARN) \ - FUNC(NUL_IN_COMMIT, WARN) \ - /* infos (reported as warnings, but ignored by default) */ \ - FUNC(GITMODULES_PARSE, INFO) \ - FUNC(BAD_TAG_NAME, INFO) \ - FUNC(MISSING_TAGGER_ENTRY, INFO) \ - /* ignored (elevated when requested) */ \ - FUNC(EXTRA_HEADER_ENTRY, IGNORE) - -#define MSG_ID(id, msg_type) FSCK_MSG_##id, -enum fsck_msg_id { - FOREACH_FSCK_MSG_ID(MSG_ID) - FSCK_MSG_MAX -}; -#undef MSG_ID - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { diff --git a/fsck.h b/fsck.h index 25c456bbd3..7c868410eb 100644 --- a/fsck.h +++ b/fsck.h @@ -13,6 +13,72 @@ enum fsck_msg_type { FSCK_WARN, }; +#define FOREACH_FSCK_MSG_ID(FUNC) \ + /* fatal errors */ \ + FUNC(NUL_IN_HEADER, FATAL) \ + FUNC(UNTERMINATED_HEADER, FATAL) \ + /* errors */ \ + FUNC(BAD_DATE, ERROR) \ + FUNC(BAD_DATE_OVERFLOW, ERROR) \ + FUNC(BAD_EMAIL, ERROR) \ + FUNC(BAD_NAME, ERROR) \ + FUNC(BAD_OBJECT_SHA1, ERROR) \ + FUNC(BAD_PARENT_SHA1, ERROR) \ + 
FUNC(BAD_TAG_OBJECT, ERROR) \ + FUNC(BAD_TIMEZONE, ERROR) \ + FUNC(BAD_TREE, ERROR) \ + FUNC(BAD_TREE_SHA1, ERROR) \ + FUNC(BAD_TYPE, ERROR) \ + FUNC(DUPLICATE_ENTRIES, ERROR) \ + FUNC(MISSING_AUTHOR, ERROR) \ + FUNC(MISSING_COMMITTER, ERROR) \ + FUNC(MISSING_EMAIL, ERROR) \ + FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_OBJECT, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_TAG, ERROR) \ + FUNC(MISSING_TAG_ENTRY, ERROR) \ + FUNC(MISSING_TREE, ERROR) \ + FUNC(MISSING_TREE_OBJECT, ERROR) \ + FUNC(MISSING_TYPE, ERROR) \ + FUNC(MISSING_TYPE_ENTRY, ERROR) \ + FUNC(MULTIPLE_AUTHORS, ERROR) \ + FUNC(TREE_NOT_SORTED, ERROR) \ + FUNC(UNKNOWN_TYPE, ERROR) \ + FUNC(ZERO_PADDED_DATE, ERROR) \ + FUNC(GITMODULES_MISSING, ERROR) \ + FUNC(GITMODULES_BLOB, ERROR) \ + FUNC(GITMODULES_LARGE, ERROR) \ + FUNC(GITMODULES_NAME, ERROR) \ + FUNC(GITMODULES_SYMLINK, ERROR) \ + FUNC(GITMODULES_URL, ERROR) \ + FUNC(GITMODULES_PATH, ERROR) \ + FUNC(GITMODULES_UPDATE, ERROR) \ + /* warnings */ \ + FUNC(BAD_FILEMODE, WARN) \ + FUNC(EMPTY_NAME, WARN) \ + FUNC(FULL_PATHNAME, WARN) \ + FUNC(HAS_DOT, WARN) \ + FUNC(HAS_DOTDOT, WARN) \ + FUNC(HAS_DOTGIT, WARN) \ + FUNC(NULL_SHA1, WARN) \ + FUNC(ZERO_PADDED_FILEMODE, WARN) \ + FUNC(NUL_IN_COMMIT, WARN) \ + /* infos (reported as warnings, but ignored by default) */ \ + FUNC(GITMODULES_PARSE, INFO) \ + FUNC(BAD_TAG_NAME, INFO) \ + FUNC(MISSING_TAGGER_ENTRY, INFO) \ + /* ignored (elevated when requested) */ \ + FUNC(EXTRA_HEADER_ENTRY, IGNORE) + +#define MSG_ID(id, msg_type) FSCK_MSG_##id, +enum fsck_msg_id { + FOREACH_FSCK_MSG_ID(MSG_ID) + FSCK_MSG_MAX +}; +#undef MSG_ID + struct fsck_options; struct object; -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
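The list macro moved by 13/19 is an instance of the "X-macro" pattern: one list is expanded twice, once into an enum and once into a table indexed by that enum. A cut-down sketch, using a hypothetical three-entry list and DEMO_-prefixed names rather than git's own, could look like this:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, shortened list; the real one is FOREACH_FSCK_MSG_ID. */
#define FOREACH_DEMO_MSG_ID(FUNC) \
	FUNC(NUL_IN_HEADER, FATAL) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(HAS_DOTGIT, WARN)

enum demo_msg_type { DEMO_FATAL, DEMO_ERROR, DEMO_WARN };

/* First expansion: one enumerator per list entry. */
#define MSG_ID(id, msg_type) DEMO_MSG_##id,
enum demo_msg_id {
	FOREACH_DEMO_MSG_ID(MSG_ID)
	DEMO_MSG_MAX
};
#undef MSG_ID

/* Second expansion: a table in the same order as the enum. */
#define STR(x) #x
#define MSG_ID(id, msg_type) { STR(id), DEMO_##msg_type },
static const struct {
	const char *id_string;
	enum demo_msg_type msg_type;
} demo_msg_id_info[DEMO_MSG_MAX] = {
	FOREACH_DEMO_MSG_ID(MSG_ID)
};
#undef MSG_ID
#undef STR
```

Because both expansions come from the same list, the enum and the table cannot drift out of sync. Both helper macros are undefined after use, in line with 11/19.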
* [PATCH v5 14/19] fsck.c: pass along the fsck_msg_id in the fsck_error callback 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (14 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 13/19] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 15/19] fsck.c: add an fsck_set_msg_type() API that takes enums Ævar Arnfjörð Bjarmason ` (4 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change the fsck_error callback to also pass along the fsck_msg_id. Before this change the only way to get the message id was to parse it back out of the "message". Let's pass it down explicitly for the benefit of callers that might want to use it, as discussed in [1]. Passing the msg_type is now redundant, as you can always get it back from the msg_id, but I'm not changing that convention. It's really common to need the msg_type, and the report() function itself (which calls "fsck_error") needs to call fsck_msg_type() to discover it. Let's not needlessly re-do that work in the user callback. 1. 
https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 4 +++- builtin/index-pack.c | 3 ++- builtin/mktag.c | 1 + fsck.c | 6 ++++-- fsck.h | 6 ++++-- 5 files changed, 14 insertions(+), 6 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 17940a4e24..70ff95837a 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -84,7 +84,9 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message) + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 56b8efaa89..2b2266a4b7 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1717,6 +1717,7 @@ static int print_dangling_gitmodules(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, const char *message) { /* @@ -1727,7 +1728,7 @@ static int print_dangling_gitmodules(struct fsck_options *o, printf("%s\n", oid_to_hex(oid)); return 0; } - return fsck_error_function(o, oid, object_type, msg_type, message); + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); } int cmd_index_pack(int argc, const char **argv, const char *prefix) diff --git a/builtin/mktag.c b/builtin/mktag.c index 052a510ad7..96e63bc772 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -18,6 +18,7 @@ static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, const char *message) { switch (msg_type) { diff --git a/fsck.c b/fsck.c index 49208ec636..01b2724ac0 100644 --- a/fsck.c +++ b/fsck.c @@ -245,7 +245,7 @@ static int report(struct fsck_options 
*options, va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); result = options->error_func(options, oid, object_type, - msg_type, sb.buf); + msg_type, msg_id, sb.buf); strbuf_release(&sb); va_end(ap); @@ -1198,7 +1198,9 @@ int fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message) + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index 7c868410eb..80b1984f34 100644 --- a/fsck.h +++ b/fsck.h @@ -101,11 +101,13 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message); + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message); + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); struct fsck_options { fsck_walk_func walk; -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
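To illustrate what 14/19 makes possible, here is a sketch of a callback that branches on the message id directly instead of parsing it back out of the message text. Everything in it is hypothetical: the demo_ types are cut-down stand-ins, and the return convention (0 for "fine", 1 for "error") is an assumption of the sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-ins for the fsck enums. */
enum demo_msg_type { DEMO_ERROR, DEMO_WARN };
enum demo_msg_id { DEMO_MSG_BAD_DATE, DEMO_MSG_GITMODULES_MISSING };

/* Default reporter: assumed to return 1 for errors, 0 for warnings. */
static int demo_default_error(enum demo_msg_type msg_type,
			      enum demo_msg_id msg_id,
			      const char *message)
{
	(void)msg_id; /* the default reporter only needs the severity */
	fprintf(stderr, "%s: %s\n",
		msg_type == DEMO_WARN ? "warning" : "error", message);
	return msg_type == DEMO_ERROR;
}

/* Wrapper that tolerates one specific message id. */
static int demo_tolerate_missing_gitmodules(enum demo_msg_type msg_type,
					    enum demo_msg_id msg_id,
					    const char *message)
{
	if (msg_id == DEMO_MSG_GITMODULES_MISSING)
		return 0; /* decided from the id, no string matching */
	return demo_default_error(msg_type, msg_id, message);
}
```

The wrapper still receives msg_type alongside msg_id, so it does not need to look the severity up again.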
* [PATCH v5 15/19] fsck.c: add an fsck_set_msg_type() API that takes enums 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (15 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 14/19] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 16/19] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason ` (3 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change code I added in acf9de4c94e (mktag: use fsck instead of custom verify_tag(), 2021-01-05) to make use of a new API function that takes the fsck_msg_{id,type} types, instead of arbitrary strings that we'll (hopefully) parse into those types. At the time that the fsck_set_msg_type() API was introduced in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) it was only intended to be used to parse user-supplied data. For things that are purely internal to the C code it makes sense to have the compiler check these arguments, and to skip the sanity checking of the data in fsck_set_msg_type() which is redundant to checks we get from the compiler. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/mktag.c | 3 ++- fsck.c | 27 +++++++++++++++++---------- fsck.h | 3 +++ 3 files changed, 22 insertions(+), 11 deletions(-) diff --git a/builtin/mktag.c b/builtin/mktag.c index 96e63bc772..dddcccdd36 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -88,7 +88,8 @@ int cmd_mktag(int argc, const char **argv, const char *prefix) die_errno(_("could not read from stdin")); fsck_options.error_func = mktag_fsck_error_func; - fsck_set_msg_type(&fsck_options, "extraheaderentry", "warn"); + fsck_set_msg_type_from_ids(&fsck_options, FSCK_MSG_EXTRA_HEADER_ENTRY, + FSCK_WARN); /* config might set fsck.extraHeaderEntry=* again */ git_config(git_fsck_config, &fsck_options); if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options, diff --git a/fsck.c b/fsck.c index 01b2724ac0..307d454d92 100644 --- a/fsck.c +++ b/fsck.c @@ -132,6 +132,22 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) return 1; } +void fsck_set_msg_type_from_ids(struct fsck_options *options, + enum fsck_msg_id msg_id, + enum fsck_msg_type msg_type) +{ + if (!options->msg_type) { + int i; + enum fsck_msg_type *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); + for (i = 0; i < FSCK_MSG_MAX; i++) + severity[i] = fsck_msg_type(i, options); + options->msg_type = severity; + } + + options->msg_type[msg_id] = msg_type; +} + void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { @@ -144,16 +160,7 @@ void fsck_set_msg_type(struct fsck_options *options, if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); - if (!options->msg_type) { - int i; - enum fsck_msg_type *severity; - ALLOC_ARRAY(severity, FSCK_MSG_MAX); - for (i = 0; i < FSCK_MSG_MAX; i++) - severity[i] = fsck_msg_type(i, options); - options->msg_type = severity; - } - - options->msg_type[msg_id] = msg_type; + fsck_set_msg_type_from_ids(options, 
msg_id, msg_type); } void fsck_set_msg_types(struct fsck_options *options, const char *values) diff --git a/fsck.h b/fsck.h index 80b1984f34..344c3ddc74 100644 --- a/fsck.h +++ b/fsck.h @@ -82,6 +82,9 @@ enum fsck_msg_id { struct fsck_options; struct object; +void fsck_set_msg_type_from_ids(struct fsck_options *options, + enum fsck_msg_id msg_id, + enum fsck_msg_type msg_type); void fsck_set_msg_type(struct fsck_options *options, const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
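For readers following the refactoring in 15/19, the lazy per-message severity table can be sketched outside of git as below. This is only an illustration, not the fsck.c code: every demo_* name, the two message IDs, and the defaults table are made up for the example, and plain malloc() stands in for git's ALLOC_ARRAY(). The idea is that the override table stays NULL until the first caller overrides a severity, at which point it is seeded from the built-in defaults.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for fsck's severities and message IDs. */
enum demo_msg_type { DEMO_IGNORE, DEMO_WARN, DEMO_ERROR };
enum demo_msg_id {
	DEMO_MSG_BAD_DATE,
	DEMO_MSG_EXTRA_HEADER_ENTRY,
	DEMO_MSG_MAX
};

/* Built-in default severity for each message ID (assumed values). */
static const enum demo_msg_type demo_defaults[DEMO_MSG_MAX] = {
	DEMO_ERROR,  /* DEMO_MSG_BAD_DATE */
	DEMO_IGNORE, /* DEMO_MSG_EXTRA_HEADER_ENTRY */
};

struct demo_options {
	enum demo_msg_type *msg_type; /* NULL until the first override */
};

/* Effective severity: the override table if allocated, else the default. */
enum demo_msg_type demo_get_msg_type(const struct demo_options *o,
				     enum demo_msg_id id)
{
	return o->msg_type ? o->msg_type[id] : demo_defaults[id];
}

/* Enum-based setter: no string parsing, no validation of names needed. */
void demo_set_msg_type_from_ids(struct demo_options *o,
				enum demo_msg_id id,
				enum demo_msg_type type)
{
	if (!o->msg_type) {
		int i;
		enum demo_msg_type *severity =
			malloc(sizeof(*severity) * DEMO_MSG_MAX);

		if (!severity)
			abort();
		/* Seed the table so untouched IDs keep their defaults. */
		for (i = 0; i < DEMO_MSG_MAX; i++)
			severity[i] = demo_defaults[i];
		o->msg_type = severity;
	}
	o->msg_type[id] = type;
}
```

A string-based setter can then do its parsing and validation first and delegate to the enum-based one, which is the shape the diff above gives fsck_set_msg_type().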
* [PATCH v5 16/19] fsck.c: move gitmodules_{found,done} into fsck_options 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (16 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 15/19] fsck.c: add an fsck_set_msg_type() API that takes enums Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 17/19] fetch-pack: don't needlessly copy fsck_options Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Move the gitmodules_{found,done} static variables added in 159e7b080bf (fsck: detect gitmodules files, 2018-05-02) into the fsck_options struct. It makes sense to keep all the context in the same place. This requires changing the register_found_gitmodules() function, recently added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22), to take fsck_options. That function will be removed in a subsequent commit, but as that removal requires the new gitmodules_found member of "fsck_options" we need this intermediate step first. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fetch-pack.c | 2 +- fsck.c | 23 ++++++++++------------- fsck.h | 7 ++++++- 3 files changed, 17 insertions(+), 15 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 6a61a46428..82c3c2c043 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -998,7 +998,7 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(oid); + register_found_gitmodules(&fo, oid); if (fsck_finish(&fo)) die("fsck failed"); } diff --git a/fsck.c b/fsck.c index 307d454d92..00760b1f42 100644 --- a/fsck.c +++ b/fsck.c @@ -19,9 +19,6 @@ #include "credential.h" #include "help.h" -static struct oidset gitmodules_found = OIDSET_INIT; -static struct oidset gitmodules_done = OIDSET_INIT; - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { @@ -624,7 +621,7 @@ static int fsck_tree(const struct object_id *oid, if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, @@ -638,7 +635,7 @@ static int fsck_tree(const struct object_id *oid, has_dotgit |= is_ntfs_dotgit(backslash); if (is_ntfs_dotgitmodules(backslash)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, FSCK_MSG_GITMODULES_SYMLINK, @@ -1150,9 +1147,9 @@ static int fsck_blob(const struct object_id *oid, const char *buf, struct fsck_gitmodules_data data; struct config_options config_opts = { 0 }; - if (!oidset_contains(&gitmodules_found, oid)) + if (!oidset_contains(&options->gitmodules_found, oid)) return 0; - oidset_insert(&gitmodules_done, oid); + oidset_insert(&options->gitmodules_done, oid); if (object_on_skiplist(options, oid)) return 0; 
@@ -1217,9 +1214,9 @@ int fsck_error_function(struct fsck_options *o, return 1; } -void register_found_gitmodules(const struct object_id *oid) +void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) { - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); } int fsck_finish(struct fsck_options *options) @@ -1228,13 +1225,13 @@ int fsck_finish(struct fsck_options *options) struct oidset_iter iter; const struct object_id *oid; - oidset_iter_init(&gitmodules_found, &iter); + oidset_iter_init(&options->gitmodules_found, &iter); while ((oid = oidset_iter_next(&iter))) { enum object_type type; unsigned long size; char *buf; - if (oidset_contains(&gitmodules_done, oid)) + if (oidset_contains(&options->gitmodules_done, oid)) continue; buf = read_object_file(oid, &type, &size); @@ -1259,8 +1256,8 @@ int fsck_finish(struct fsck_options *options) } - oidset_clear(&gitmodules_found); - oidset_clear(&gitmodules_done); + oidset_clear(&options->gitmodules_found); + oidset_clear(&options->gitmodules_done); return ret; } diff --git a/fsck.h b/fsck.h index 344c3ddc74..b25ae9d8b9 100644 --- a/fsck.h +++ b/fsck.h @@ -118,6 +118,8 @@ struct fsck_options { unsigned strict:1; enum fsck_msg_type *msg_type; struct oidset skiplist; + struct oidset gitmodules_found; + struct oidset gitmodules_done; kh_oid_map_t *object_names; }; @@ -125,6 +127,8 @@ struct fsck_options { .walk = NULL, \ .msg_type = NULL, \ .skiplist = OIDSET_INIT, \ + .gitmodules_found = OIDSET_INIT, \ + .gitmodules_done = OIDSET_INIT, \ .object_names = NULL, #define FSCK_OPTIONS_COMMON_ERROR_FUNC \ FSCK_OPTIONS_COMMON \ @@ -149,7 +153,8 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); -void register_found_gitmodules(const struct object_id *oid); +void register_found_gitmodules(struct fsck_options *options, + const struct 
object_id *oid); /* * fsck a tag, and pass info about it back to the caller. This is -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
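The effect of 16/19 can be illustrated with a small standalone sketch. It is not git code: the demo_* names are invented, and a fixed-capacity set of integers stands in for git's oidset. With the "found" and "done" sets living in the options struct rather than in file-scope statics, two independent checks no longer share (or clobber) each other's state.

```c
#include <stddef.h>

#define DEMO_SET_MAX 8

/* Tiny fixed-capacity set of integer IDs; a stand-in for git's oidset. */
struct demo_set {
	int ids[DEMO_SET_MAX];
	size_t nr;
};
#define DEMO_SET_INIT { { 0 }, 0 }

int demo_set_contains(const struct demo_set *s, int id)
{
	size_t i;

	for (i = 0; i < s->nr; i++)
		if (s->ids[i] == id)
			return 1;
	return 0;
}

void demo_set_insert(struct demo_set *s, int id)
{
	if (!demo_set_contains(s, id) && s->nr < DEMO_SET_MAX)
		s->ids[s->nr++] = id;
}

/* All per-run state is carried by the options struct, none is global. */
struct demo_options {
	struct demo_set found; /* IDs a tree pointed at */
	struct demo_set done;  /* IDs whose content was actually checked */
};
#define DEMO_OPTIONS_INIT { .found = DEMO_SET_INIT, .done = DEMO_SET_INIT }

/* A tree entry named this ID as something to verify later. */
void demo_see_tree_entry(struct demo_options *o, int id)
{
	demo_set_insert(&o->found, id);
}

/* The object itself arrived; only check it if a tree asked for it. */
void demo_check_blob(struct demo_options *o, int id)
{
	if (!demo_set_contains(&o->found, id))
		return;
	demo_set_insert(&o->done, id);
}

/* Count IDs that were found but never checked, then reset the state. */
size_t demo_finish(struct demo_options *o)
{
	size_t i, missing = 0;

	for (i = 0; i < o->found.nr; i++)
		if (!demo_set_contains(&o->done, o->found.ids[i]))
			missing++;
	o->found.nr = 0;
	o->done.nr = 0;
	return missing;
}
```

With file-scope statics, the equivalent of demo_finish() on one context would also observe and clear entries recorded on behalf of another; here each struct is self-contained.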
* [PATCH v5 17/19] fetch-pack: don't needlessly copy fsck_options 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (17 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 16/19] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 18/19] fetch-pack: use file-scope static struct for fsck_options Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 19/19] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change the behavior of the .gitmodules validation added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so we're using one "fsck_options". I found that code confusing to read. One might think that not setting up the error_func earlier means that we're relying on the "error_func" not being set in some code in between the two hunks being modified here. But we're not, all we're doing in the rest of "cmd_index_pack()" is further setup by calling fsck_set_msg_types(), and assigning to do_fsck_object. So there was no reason in 5476e1efde to make a shallow copy of the fsck_options struct before setting error_func. Let's just do this setup at the top of the function, along with the "walk" assignment. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/index-pack.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 2b2266a4b7..5ad80b85b4 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1761,6 +1761,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) read_replace_refs = 0; fsck_options.walk = mark_link; + fsck_options.error_func = print_dangling_gitmodules; reset_pack_idx_option(&opts); git_config(git_index_pack_config, &opts); @@ -1951,13 +1952,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) else close(input_fd); - if (do_fsck_object) { - struct fsck_options fo = fsck_options; - - fo.error_func = print_dangling_gitmodules; - if (fsck_finish(&fo)) - die(_("fsck error in pack objects")); - } + if (do_fsck_object && fsck_finish(&fsck_options)) + die(_("fsck error in pack objects")); free(objects); strbuf_release(&index_name_buf); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v5 18/19] fetch-pack: use file-scope static struct for fsck_options 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (18 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 17/19] fetch-pack: don't needlessly copy fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:20 ` [PATCH v5 19/19] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Change code added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so that we use a file-scoped "static struct fsck_options" instead of defining one in the "fsck_gitmodules_oids()" function. We use this pattern in all of builtin/{fsck,index-pack,mktag,unpack-objects}.c. It's odd to see fetch-pack be the odd one out. One might think that we're using other fsck_options structs in fetch-pack, or running fsck twice there, but we're not. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fetch-pack.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 82c3c2c043..229fd8e2c2 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,6 +38,7 @@ static int server_supports_filtering; static int advertise_sid; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; +static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; @@ -991,15 +992,14 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) { struct oidset_iter iter; const struct object_id *oid; - struct fsck_options fo = FSCK_OPTIONS_STRICT; if (!oidset_size(gitmodules_oids)) return; oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fo, oid); - if (fsck_finish(&fo)) + register_found_gitmodules(&fsck_options, oid); + if (fsck_finish(&fsck_options)) die("fsck failed"); } -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v5 19/19] fetch-pack: use new fsck API to printing dangling submodules 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason ` (19 preceding siblings ...) 2021-03-17 18:20 ` [PATCH v5 18/19] fetch-pack: use file-scope static struct for fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 ` Ævar Arnfjörð Bjarmason 20 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 18:20 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Derrick Stolee, Ævar Arnfjörð Bjarmason Refactor the check added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to make use of the "msg_id" that we now pass to the user-defined "error_func". We can now compare against FSCK_MSG_GITMODULES_MISSING instead of parsing the generated message. Let's also replace register_found_gitmodules() with directly manipulating the "gitmodules_found" member. A recent commit moved it into "fsck_options" so we could do this here. I'm sticking this callback in fsck.c. Perhaps in the future we'd like to accumulate such callbacks into another file (maybe fsck-cb.c, similar to parse-options-cb.c?), but while we've got just the one let's just put it into fsck.c. A better alternative in this case would be some more obvious library shared by fetch-pack.c and builtin/index-pack.c, but there isn't such a thing. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/index-pack.c | 21 +-------------------- fetch-pack.c | 4 ++-- fsck.c | 23 ++++++++++++++++++----- fsck.h | 18 +++++++++++++++--- 4 files changed, 36 insertions(+), 30 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 5ad80b85b4..11f0fafd33 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -120,7 +120,7 @@ static int nr_threads; static int from_stdin; static int strict; static int do_fsck_object; -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; static int verbose; static int show_resolving_progress; static int show_stat; @@ -1713,24 +1713,6 @@ static void show_pack_info(int stat_only) } } -static int print_dangling_gitmodules(struct fsck_options *o, - const struct object_id *oid, - enum object_type object_type, - enum fsck_msg_type msg_type, - enum fsck_msg_id msg_id, - const char *message) -{ - /* - * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it - * instead of relying on this string check. 
- */ - if (starts_with(message, "gitmodulesMissing")) { - printf("%s\n", oid_to_hex(oid)); - return 0; - } - return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); -} - int cmd_index_pack(int argc, const char **argv, const char *prefix) { int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; @@ -1761,7 +1743,6 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) read_replace_refs = 0; fsck_options.walk = mark_link; - fsck_options.error_func = print_dangling_gitmodules; reset_pack_idx_option(&opts); git_config(git_index_pack_config, &opts); diff --git a/fetch-pack.c b/fetch-pack.c index 229fd8e2c2..008a3facd4 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,7 +38,7 @@ static int server_supports_filtering; static int advertise_sid; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; @@ -998,7 +998,7 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fsck_options, oid); + oidset_insert(&fsck_options.gitmodules_found, oid); if (fsck_finish(&fsck_options)) die("fsck failed"); } diff --git a/fsck.c b/fsck.c index 00760b1f42..048cf81937 100644 --- a/fsck.c +++ b/fsck.c @@ -1214,11 +1214,6 @@ int fsck_error_function(struct fsck_options *o, return 1; } -void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) -{ - oidset_insert(&options->gitmodules_found, oid); -} - int fsck_finish(struct fsck_options *options) { int ret = 0; @@ -1284,3 +1279,21 @@ int git_fsck_config(const char *var, const char *value, void *cb) return git_default_config(var, value, cb); } + +/* + * Custom error 
callbacks that are used in more than one place. + */ + +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) +{ + if (msg_id == FSCK_MSG_GITMODULES_MISSING) { + puts(oid_to_hex(oid)); + return 0; + } + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); +} diff --git a/fsck.h b/fsck.h index b25ae9d8b9..da58f585d7 100644 --- a/fsck.h +++ b/fsck.h @@ -153,9 +153,6 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); -void register_found_gitmodules(struct fsck_options *options, - const struct object_id *oid); - /* * fsck a tag, and pass info about it back to the caller. This is * exposed fsck_object() internals for git-mktag(1). @@ -203,4 +200,19 @@ const char *fsck_describe_object(struct fsck_options *options, */ int git_fsck_config(const char *var, const char *value, void *cb); +/* + * Custom error callbacks that are used in more than one place. + */ +#define FSCK_OPTIONS_MISSING_GITMODULES { \ + .strict = 1, \ + .error_func = fsck_error_cb_print_missing_gitmodules, \ + FSCK_OPTIONS_COMMON \ +} +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message); + #endif -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
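The callback change in 19/19 boils down to dispatching on an enum instead of sniffing the formatted message. A minimal standalone sketch of that pattern follows; it is an illustration under assumptions, not the fsck.c code: the demo_* names and the simplified callback signature (a hex string instead of a struct object_id) are invented for the example.

```c
#include <stdio.h>

/* Hypothetical message IDs; real fsck has many more. */
enum demo_msg_id {
	DEMO_MSG_GITMODULES_MISSING,
	DEMO_MSG_BAD_TREE
};

struct demo_options;
typedef int (*demo_error_func)(struct demo_options *o,
			       enum demo_msg_id msg_id,
			       const char *oid_hex,
			       const char *message);

struct demo_options {
	demo_error_func error_func;
};

/* Default handler: report on stderr and signal failure. */
int demo_error_function(struct demo_options *o,
			enum demo_msg_id msg_id,
			const char *oid_hex,
			const char *message)
{
	(void)o;
	(void)msg_id;
	fprintf(stderr, "error in %s: %s\n", oid_hex, message);
	return 1;
}

/*
 * Custom handler: a missing .gitmodules target is not fatal here; its
 * ID is printed on stdout for the caller to pick up. The decision is
 * made on the enum, so rewording the message cannot break it.
 */
int demo_error_cb_print_missing(struct demo_options *o,
				enum demo_msg_id msg_id,
				const char *oid_hex,
				const char *message)
{
	if (msg_id == DEMO_MSG_GITMODULES_MISSING) {
		puts(oid_hex);
		return 0;
	}
	return demo_error_function(o, msg_id, oid_hex, message);
}

/* What a checker would call when it finds a problem. */
int demo_report(struct demo_options *o, enum demo_msg_id msg_id,
		const char *oid_hex, const char *message)
{
	return o->error_func(o, msg_id, oid_hex, message);
}
```

The removed print_dangling_gitmodules() had to rely on starts_with(message, "gitmodulesMissing"), which couples the caller to the message format; passing the ID through avoids that coupling.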
* [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason 2021-03-07 23:04 ` Junio C Hamano 2021-03-16 16:17 ` [PATCH v4 " Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:35 ` Junio C Hamano 2021-03-19 14:43 ` Johannes Schindelin 2021-03-16 16:17 ` [PATCH v4 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason ` (20 subsequent siblings) 23 siblings, 2 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Add the object_name member to the initialization macro. This was omitted in 7b35efd734e (fsck_walk(): optionally name objects on the go, 2016-07-17) when the field was added. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fsck.h b/fsck.h index 733378f126..2274843ba0 100644 --- a/fsck.h +++ b/fsck.h @@ -43,8 +43,8 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } /* descend in all linked child objects * the return value is: -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name 2021-03-16 16:17 ` [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason @ 2021-03-17 18:35 ` Junio C Hamano 2021-03-19 14:43 ` Johannes Schindelin 1 sibling, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-17 18:35 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Add the object_name member to the initialization macro. This was > omitted in 7b35efd734e (fsck_walk(): optionally name objects on the > go, 2016-07-17) when the field was added. While this does not hurt, as the missing one was and is at the end of the struct members, this has no effect. As you'll be rewriting everything into designated initializers anyway, does it matter, I have to wonder (it would affect your commit count karma, but you already have enough of them ;-)? > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fsck.h b/fsck.h > index 733378f126..2274843ba0 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -43,8 +43,8 @@ struct fsck_options { > kh_oid_map_t *object_names; > }; > > -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } > -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } > +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } > +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } > > /* descend in all linked child objects > * the return value is: ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name 2021-03-16 16:17 ` [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason 2021-03-17 18:35 ` Junio C Hamano @ 2021-03-19 14:43 ` Johannes Schindelin 2021-03-20 9:16 ` Ævar Arnfjörð Bjarmason 1 sibling, 1 reply; 229+ messages in thread From: Johannes Schindelin @ 2021-03-19 14:43 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Jeff King, Jonathan Tan Hi Ævar, just a general note: this patch, which is the first of v4, is marked as replying to the cover letter of v3. That feels quite odd. If you use threading, why not let it reply to the cover letter of the same patch series iteration? In other words, would you mind using the `--thread=shallow` option in the future, for better structuring on the mailing list? Thanks, Johannes On Tue, 16 Mar 2021, Ævar Arnfjörð Bjarmason wrote: > Add the object_name member to the initialization macro. This was > omitted in 7b35efd734e (fsck_walk(): optionally name objects on the > go, 2016-07-17) when the field was added. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fsck.h b/fsck.h > index 733378f126..2274843ba0 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -43,8 +43,8 @@ struct fsck_options { > kh_oid_map_t *object_names; > }; > > -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } > -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } > +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } > +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } > > /* descend in all linked child objects > * the return value is: > -- > 2.31.0.260.g719c683c1d > > ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name 2021-03-19 14:43 ` Johannes Schindelin @ 2021-03-20 9:16 ` Ævar Arnfjörð Bjarmason 2021-03-20 20:04 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-20 9:16 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Junio C Hamano, Jeff King, Jonathan Tan On Fri, Mar 19 2021, Johannes Schindelin wrote: > Hi Ævar, > > just a general note: this patch, which is the first of v4, is marked as > replying to the cover letter of v3. That feels quite odd. If you use > threading, why not let it reply to the cover letter of the same patch > series iteration? > > In other words, would you mind using the `--thread=shallow` option in the > future, for better structuring on the mailing list? Not at all, I've set it in my config now. I've just been using the default configuration of format-patch --in-reply-to --cover-letter && send-email *.patch all this time. Looking around at other patch submissions (aside from GGG) this seems to be the norm though, but isn't documented in SubmittingPatches etc. AFAICT. So I wonder if I'm using some different process from the norm, or if most everyone else is just looking carefully at Message-ID/In-Reply-To norms before sending... > On Tue, 16 Mar 2021, Ævar Arnfjörð Bjarmason wrote: > >> Add the object_name member to the initialization macro. This was >> omitted in 7b35efd734e (fsck_walk(): optionally name objects on the >> go, 2016-07-17) when the field was added. 
>> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> >> --- >> fsck.h | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/fsck.h b/fsck.h >> index 733378f126..2274843ba0 100644 >> --- a/fsck.h >> +++ b/fsck.h >> @@ -43,8 +43,8 @@ struct fsck_options { >> kh_oid_map_t *object_names; >> }; >> >> -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } >> -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } >> +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } >> +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } >> >> /* descend in all linked child objects >> * the return value is: >> -- >> 2.31.0.260.g719c683c1d >> >> ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name 2021-03-20 9:16 ` Ævar Arnfjörð Bjarmason @ 2021-03-20 20:04 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-20 20:04 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Johannes Schindelin, git, Jeff King, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> In other words, would you mind using the `--thread=shallow` option in the >> future, for better structuring on the mailing list? > > Not at all, I've set it in my config now. > > I've just been using the default configuration of format-patch > --in-reply-to --cover-letter && send-email *.patch all this time. > ... > So I wonder if I'm using some different process from the norm, or if > most everyone else is just looking carefully at Message-ID/In-Reply-To > norms before sending... Interesting. I always let send-email assign the message IDs and haven't used --thread=<any> option at all. In other words, my format-patch output files have no message IDs in them or In-reply-to header fields. That in turn means that in-reply-to is decided not when format-patch is run, but when send-email sends things out, it gives them the ids and structures the in-reply-to chains. I guess we have too much flexibility in our tooling X-<. ^ permalink raw reply [flat|nested] 229+ messages in thread
* [PATCH v4 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (2 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 01/22] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 18:59 ` Derrick Stolee 2021-03-17 18:38 ` Junio C Hamano 2021-03-16 16:17 ` [PATCH v4 03/22] fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason ` (19 subsequent siblings) 23 siblings, 2 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/fsck.h b/fsck.h index 2274843ba0..40f3cb3f64 100644 --- a/fsck.h +++ b/fsck.h @@ -43,8 +43,22 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } +#define FSCK_OPTIONS_DEFAULT { \ + .walk = NULL, \ + .error_func = fsck_error_function, \ + .strict = 0, \ + .msg_type = NULL, \ + .skiplist = OIDSET_INIT, \ + .object_names = NULL, \ +} +#define FSCK_OPTIONS_STRICT { \ + .walk = NULL, \ + .error_func = fsck_error_function, \ + .strict = 1, \ + .msg_type = NULL, \ + .skiplist = OIDSET_INIT, \ + .object_names = NULL, \ +} /* descend in all linked child objects * the return value is: -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v4 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 2021-03-16 16:17 ` [PATCH v4 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason @ 2021-03-16 18:59 ` Derrick Stolee 2021-03-17 18:38 ` Junio C Hamano 1 sibling, 0 replies; 229+ messages in thread From: Derrick Stolee @ 2021-03-16 18:59 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan On 3/16/2021 12:17 PM, Ævar Arnfjörð Bjarmason wrote: > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.h | 18 ++++++++++++++++-- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/fsck.h b/fsck.h > index 2274843ba0..40f3cb3f64 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -43,8 +43,22 @@ struct fsck_options { > kh_oid_map_t *object_names; > }; > > -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } > -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } You just edited these lines in the previous patch. Seems unnecessary to split them. You can point out in the message here that the object_names portion was previously excluded. > +#define FSCK_OPTIONS_DEFAULT { \ > + .walk = NULL, \ > + .error_func = fsck_error_function, \ > + .strict = 0, \ > + .msg_type = NULL, \ > + .skiplist = OIDSET_INIT, \ > + .object_names = NULL, \ > +} > +#define FSCK_OPTIONS_STRICT { \ > + .walk = NULL, \ > + .error_func = fsck_error_function, \ > + .strict = 1, \ > + .msg_type = NULL, \ > + .skiplist = OIDSET_INIT, \ > + .object_names = NULL, \ > +} This explicit definition is better. Thanks, -Stolee ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v4 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 2021-03-16 16:17 ` [PATCH v4 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason 2021-03-16 18:59 ` Derrick Stolee @ 2021-03-17 18:38 ` Junio C Hamano 1 sibling, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-17 18:38 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.h | 18 ++++++++++++++++-- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/fsck.h b/fsck.h > index 2274843ba0..40f3cb3f64 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -43,8 +43,22 @@ struct fsck_options { > kh_oid_map_t *object_names; > }; > > -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } > -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } > +#define FSCK_OPTIONS_DEFAULT { \ > + .walk = NULL, \ > + .error_func = fsck_error_function, \ > + .strict = 0, \ > + .msg_type = NULL, \ > + .skiplist = OIDSET_INIT, \ > + .object_names = NULL, \ > +} > +#define FSCK_OPTIONS_STRICT { \ > + .walk = NULL, \ > + .error_func = fsck_error_function, \ > + .strict = 1, \ > + .msg_type = NULL, \ > + .skiplist = OIDSET_INIT, \ > + .object_names = NULL, \ > +} Being explicit is good, but spelling out zero initialization sounds more like cluttering than clarifying. I do not mind .strict = 0 in the DEFAULT one only because it contrasts well with .strict = 1 on the STRICT side, but it would be easier to read if the zero initialization of the .walk, .msg_type and .object_names members were omitted. > /* descend in all linked child objects > * the return value is: ^ permalink raw reply [flat|nested] 229+ messages in thread
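The point about omitting zero initializers rests on a C99 guarantee: members not named in a designated initializer list are initialized as if they had static storage duration, i.e. to zero or a null pointer. A small sketch, with a made-up demo_options struct loosely shaped like fsck_options (none of these names exist in git):

```c
#include <stddef.h>

/* Hypothetical struct, loosely shaped like fsck_options. */
struct demo_options {
	void *walk;
	int (*error_func)(const char *message);
	unsigned strict:1;
	int *msg_type;
	void *object_names;
};

int demo_error_function(const char *message)
{
	(void)message;
	return 1;
}

/*
 * Only the members with a meaningful non-default value are spelled
 * out, plus .strict = 0 for contrast with the STRICT variant. The
 * omitted .walk, .msg_type and .object_names members are implicitly
 * initialized to NULL.
 */
#define DEMO_OPTIONS_DEFAULT { \
	.error_func = demo_error_function, \
	.strict = 0, \
}
#define DEMO_OPTIONS_STRICT { \
	.error_func = demo_error_function, \
	.strict = 1, \
}
```

Whether the shorter form reads better is the judgment call being discussed in this subthread; the sketch only shows that both spellings produce the same object.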
* [PATCH v4 03/22] fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (3 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 04/22] fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro Ævar Arnfjörð Bjarmason ` (18 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Use a temporary macro to define what FSCK_OPTIONS_{DEFAULT,STRICT} have in common, and define the two in terms of that macro. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/fsck.h b/fsck.h index 40f3cb3f64..ea3a907ec3 100644 --- a/fsck.h +++ b/fsck.h @@ -43,22 +43,14 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { \ +#define FSCK_OPTIONS_COMMON \ .walk = NULL, \ .error_func = fsck_error_function, \ - .strict = 0, \ .msg_type = NULL, \ .skiplist = OIDSET_INIT, \ - .object_names = NULL, \ -} -#define FSCK_OPTIONS_STRICT { \ - .walk = NULL, \ - .error_func = fsck_error_function, \ - .strict = 1, \ - .msg_type = NULL, \ - .skiplist = OIDSET_INIT, \ - .object_names = NULL, \ -} + .object_names = NULL, +#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON } +#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON } /* descend in all linked child objects * the return value is: -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
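The "common fragment" trick in 03/22 works because a macro can expand to a comma-terminated run of designated initializers, and C permits a trailing comma inside a brace-enclosed initializer list. A standalone sketch with invented names (demo_options and its members are assumptions for the example, not fsck.h definitions):

```c
/* Hypothetical options struct with one member that differs per preset. */
struct demo_options {
	const char *name;
	int verbose;
	unsigned strict:1;
};

/*
 * Shared initializers. Note there are no braces, and the fragment ends
 * in a comma so it can be spliced into a larger initializer list.
 */
#define DEMO_OPTIONS_COMMON \
	.name = "demo", \
	.verbose = 0,

/* The presets differ only in .strict; everything else comes from COMMON. */
#define DEMO_OPTIONS_DEFAULT { .strict = 0, DEMO_OPTIONS_COMMON }
#define DEMO_OPTIONS_STRICT  { .strict = 1, DEMO_OPTIONS_COMMON }
```

One consequence of this layout is that a fragment macro must either end with a comma or be placed last in the list; the later FSCK_OPTIONS_COMMON_ERROR_FUNC macro in 04/22 takes the second route by ending without a comma.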
* [PATCH v4 04/22] fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (4 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 03/22] fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 19:06 ` Derrick Stolee 2021-03-16 16:17 ` [PATCH v4 05/22] fsck.h: indent arguments to of fsck_set_msg_type Ævar Arnfjörð Bjarmason ` (17 subsequent siblings) 23 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro for those that would like to use FSCK_OPTIONS_COMMON in their own initialization, but supply their own error functions. Nothing is being changed to use this yet, but in some subsequent commits we'll make use of this macro. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/fsck.h b/fsck.h index ea3a907ec3..dc35924cbf 100644 --- a/fsck.h +++ b/fsck.h @@ -45,12 +45,15 @@ struct fsck_options { #define FSCK_OPTIONS_COMMON \ .walk = NULL, \ - .error_func = fsck_error_function, \ .msg_type = NULL, \ .skiplist = OIDSET_INIT, \ .object_names = NULL, -#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON } -#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON } +#define FSCK_OPTIONS_COMMON_ERROR_FUNC \ + FSCK_OPTIONS_COMMON \ + .error_func = fsck_error_function + +#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON_ERROR_FUNC } +#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON_ERROR_FUNC } /* descend in all linked child objects * the return value is: -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v4 04/22] fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro 2021-03-16 16:17 ` [PATCH v4 04/22] fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro Ævar Arnfjörð Bjarmason @ 2021-03-16 19:06 ` Derrick Stolee 0 siblings, 0 replies; 229+ messages in thread From: Derrick Stolee @ 2021-03-16 19:06 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan On 3/16/2021 12:17 PM, Ævar Arnfjörð Bjarmason wrote: > Add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro for those that would like > to use FSCK_OPTIONS_COMMON in their own initialization, but supply > their own error functions. > > Nothing is being changed to use this yet, but in some subsequent > commits we'll make use of this macro. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.h | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/fsck.h b/fsck.h > index ea3a907ec3..dc35924cbf 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -45,12 +45,15 @@ struct fsck_options { > > #define FSCK_OPTIONS_COMMON \ > .walk = NULL, \ > - .error_func = fsck_error_function, \ > .msg_type = NULL, \ > .skiplist = OIDSET_INIT, \ > .object_names = NULL, > -#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON } > -#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON } > +#define FSCK_OPTIONS_COMMON_ERROR_FUNC \ > + FSCK_OPTIONS_COMMON \ > + .error_func = fsck_error_function > + > +#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON_ERROR_FUNC } > +#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON_ERROR_FUNC } OK. It seems like you are converging on your final definitions for these macros. At first glance, this seems like an unnecessary split to demonstrate the tiny changes between, but it could just be done with one change and a description of why you want the four different entry points as macros. Thanks, -Stolee ^ permalink raw reply [flat|nested] 229+ messages in thread
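The way the macros in 03/22 and 04/22 compose may be easier to see in isolation. The sketch below mirrors their shape under simplified, hypothetical names (demo_options, DEMO_OPTIONS_*): the "common" macro is a bare list of designated initializers ending in a comma, so a caller can append its own .error_func after it. DEMO_OPTIONS_CUSTOM is an assumed example of such a caller, not something taken from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct fsck_options. */
struct demo_options {
	int (*walk)(void *obj);
	int (*error_func)(const char *message);
	unsigned strict:1;
	int *msg_type;
};

int demo_default_error(const char *message) { (void)message; return 1; }
int demo_custom_error(const char *message) { (void)message; return 2; }

/* Shared members; the trailing comma lets more initializers follow. */
#define DEMO_OPTIONS_COMMON \
	.walk = NULL, \
	.msg_type = NULL,
/* Shared members plus the default error function. */
#define DEMO_OPTIONS_COMMON_ERROR_FUNC \
	DEMO_OPTIONS_COMMON \
	.error_func = demo_default_error

#define DEMO_OPTIONS_DEFAULT { .strict = 0, DEMO_OPTIONS_COMMON_ERROR_FUNC }
#define DEMO_OPTIONS_STRICT { .strict = 1, DEMO_OPTIONS_COMMON_ERROR_FUNC }

/* Assumed caller that reuses the common part with its own callback. */
#define DEMO_OPTIONS_CUSTOM { \
	.strict = 1, \
	DEMO_OPTIONS_COMMON \
	.error_func = demo_custom_error \
}

/* Calls the error function installed by DEMO_OPTIONS_STRICT. */
int demo_strict_result(void)
{
	struct demo_options o = DEMO_OPTIONS_STRICT;

	return o.strict ? o.error_func("example") : -1;
}

/* Calls the error function installed by DEMO_OPTIONS_CUSTOM. */
int demo_custom_result(void)
{
	struct demo_options o = DEMO_OPTIONS_CUSTOM;

	return o.error_func("example");
}
```

The split into two macros only pays off once a caller like DEMO_OPTIONS_CUSTOM exists, which is the point the review raises.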
* [PATCH v4 05/22] fsck.h: indent arguments to of fsck_set_msg_type 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (5 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 04/22] fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 06/22] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason ` (16 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fsck.h b/fsck.h index dc35924cbf..5e488cef6b 100644 --- a/fsck.h +++ b/fsck.h @@ -11,7 +11,7 @@ struct fsck_options; struct object; void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type); + const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); int is_valid_msg_type(const char *msg_id, const char *msg_type); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v4 06/22] fsck.h: use "enum object_type" instead of "int" 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (6 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 05/22] fsck.h: indent arguments to of fsck_set_msg_type Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 07/22] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason ` (15 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in 355885d5315 (add generic, type aware object chain walker, 2008-02-25) we've used entries from object_type (OBJ_BLOB etc.). So this doesn't really change anything as far as the generated code is concerned, it just gives the compiler more information and makes this easier to read. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 3 ++- builtin/index-pack.c | 3 ++- builtin/unpack-objects.c | 3 ++- fsck.h | 3 ++- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 821e7798c7..68f0329e69 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -197,7 +197,8 @@ static int traverse_reachable(void) return !!result; } -static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_used(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options) { if (!obj) return 1; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index bad5748807..69f24fe9f7 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -212,7 +212,8 @@ static void cleanup_thread(void) free(thread_data); } -static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_link(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { if (!obj) return -1; diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index dd4a75e030..ca54fd1668 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -187,7 +187,8 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf) * that have reachability requirements and calls this function. * Verify its reachability and validity recursively and write it out. 
*/ -static int check_object(struct object *obj, int type, void *data, struct fsck_options *options) +static int check_object(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { struct obj_buffer *obj_buf; diff --git a/fsck.h b/fsck.h index 5e488cef6b..f67edd8f1f 100644 --- a/fsck.h +++ b/fsck.h @@ -23,7 +23,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type); * <0 error signaled and abort * >0 error signaled and do not abort */ -typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options); +typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options); /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v4 07/22] fsck.c: rename variables in fsck_set_msg_type() for less confusion 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (7 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 06/22] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 08/22] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason ` (14 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename variables in a function added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22). It was needlessly confusing that it took a "msg_type" argument, but then later declared another "msg_type" of a different type. Let's rename that to "severity", and rename "id" to "msg_id" and "msg_id" to "msg_id_str" etc. This will make a follow-up change smaller. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/fsck.c b/fsck.c index e3030f3b35..0a9ac9ca07 100644 --- a/fsck.c +++ b/fsck.c @@ -203,27 +203,27 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) } void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type) + const char *msg_id_str, const char *msg_type_str) { - int id = parse_msg_id(msg_id), type; + int msg_id = parse_msg_id(msg_id_str), msg_type; - if (id < 0) - die("Unhandled message id: %s", msg_id); - type = parse_msg_type(msg_type); + if (msg_id < 0) + die("Unhandled message id: %s", msg_id_str); + msg_type = parse_msg_type(msg_type_str); - if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL) - die("Cannot demote %s to %s", msg_id, msg_type); + if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) + die("Cannot demote %s to %s", msg_id_str, msg_type_str); if (!options->msg_type) { int i; - int *msg_type; - ALLOC_ARRAY(msg_type, FSCK_MSG_MAX); + int *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) - msg_type[i] = fsck_msg_type(i, options); - options->msg_type = msg_type; + severity[i] = fsck_msg_type(i, options); + options->msg_type = severity; } - options->msg_type[id] = type; + options->msg_type[msg_id] = msg_type; } void fsck_set_msg_types(struct fsck_options *options, const char *values) -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v4 08/22] fsck.c: move definition of msg_id into append_msg_id() 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (8 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 07/22] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:45 ` Junio C Hamano 2021-03-16 16:17 ` [PATCH v4 09/22] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason ` (13 subsequent siblings) 23 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Refactor code added in 71ab8fa840f (fsck: report the ID of the error/warning, 2015-06-22) to resolve the msg_id to a string in the function that wants it, instead of doing it in report(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 0a9ac9ca07..b977493f57 100644 --- a/fsck.c +++ b/fsck.c @@ -264,8 +264,9 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); } -static void append_msg_id(struct strbuf *sb, const char *msg_id) +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) { + const char *msg_id = msg_id_info[id].id_string; for (;;) { char c = *(msg_id)++; @@ -308,7 +309,7 @@ static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, msg_id_info[id].id_string); + append_msg_id(&sb, id); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v4 08/22] fsck.c: move definition of msg_id into append_msg_id() 2021-03-16 16:17 ` [PATCH v4 08/22] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason @ 2021-03-17 18:45 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-17 18:45 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Refactor code added in 71ab8fa840f (fsck: report the ID of the > error/warning, 2015-06-22) to resolve the msg_id to a string in the > function that wants it, instead of doing it in report(). This reintroduces the same confusion 07/22 tried to get rid of, unless msg_id variable is renamed to msg_id_str in this step, instead of being left to the next step, no? > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/fsck.c b/fsck.c > index 0a9ac9ca07..b977493f57 100644 > --- a/fsck.c > +++ b/fsck.c > @@ -264,8 +264,9 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) > free(to_free); > } > > -static void append_msg_id(struct strbuf *sb, const char *msg_id) > +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) > { > + const char *msg_id = msg_id_info[id].id_string; > for (;;) { > char c = *(msg_id)++; > > @@ -308,7 +309,7 @@ static int report(struct fsck_options *options, > else if (msg_type == FSCK_INFO) > msg_type = FSCK_WARN; > > - append_msg_id(&sb, msg_id_info[id].id_string); > + append_msg_id(&sb, id); > > va_start(ap, fmt); > strbuf_vaddf(&sb, fmt, ap); ^ permalink raw reply [flat|nested] 229+ messages in thread
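For readers who do not have fsck.c open, the loop being moved around in 08/22 and 09/22 can be sketched on its own. This is a hedged, self-contained approximation: it writes into a plain char buffer instead of a strbuf, covers only the conversion loop visible in the diff context, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert an identifier such as "BAD_DATE" to "badDate": ordinary
 * characters are lower-cased, and each '_' is dropped while the
 * character following it is copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int demo_camelcase_msg_id(const char *msg_id_str, char *out, size_t out_size)
{
	size_t len = 0;

	if (!out_size)
		return -1;
	for (;;) {
		char c = *msg_id_str++;

		if (!c)
			break;
		if (c == '_') {
			/* An identifier must not end in '_'. */
			assert(*msg_id_str);
			c = *msg_id_str++;
		} else {
			c = (char)tolower((unsigned char)c);
		}
		if (len + 1 >= out_size)
			return -1;
		out[len++] = c;
	}
	out[len] = '\0';
	return 0;
}

/* Returns 1 if converting "in" yields exactly "expected". */
int demo_camelcase_equals(const char *in, const char *expected)
{
	char buf[64];

	if (demo_camelcase_msg_id(in, buf, sizeof(buf)) < 0)
		return 0;
	return !strcmp(buf, expected);
}
```

Note how the string cursor (msg_id_str here) and the enum value are different things, which is why the naming question in the review matters.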
* [PATCH v4 09/22] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (9 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 08/22] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 10/22] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Ævar Arnfjörð Bjarmason ` (12 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename the remaining variables of type fsck_msg_id from "id" to "msg_id". This change is relatively small, and is worth the churn for a later change where we have different id's in the "report" function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/fsck.c b/fsck.c index b977493f57..6b72ddaa51 100644 --- a/fsck.c +++ b/fsck.c @@ -264,19 +264,19 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); } -static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id msg_id) { - const char *msg_id = msg_id_info[id].id_string; + const char *msg_id_str = msg_id_info[msg_id].id_string; for (;;) { - char c = *(msg_id)++; + char c = *(msg_id_str)++; if (!c) break; if (c != '_') strbuf_addch(sb, tolower(c)); else { - assert(*msg_id); - strbuf_addch(sb, *(msg_id)++); + assert(*msg_id_str); + strbuf_addch(sb, *(msg_id_str)++); } } @@ -292,11 +292,11 @@ static int object_on_skiplist(struct fsck_options *opts, __attribute__((format (printf, 5, 6))) static int report(struct fsck_options *options, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_id 
id, const char *fmt, ...) + enum fsck_msg_id msg_id, const char *fmt, ...) { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(id, options), result; + int msg_type = fsck_msg_type(msg_id, options), result; if (msg_type == FSCK_IGNORE) return 0; @@ -309,7 +309,7 @@ static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, id); + append_msg_id(&sb, msg_id); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v4 10/22] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (10 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 09/22] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 11/22] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason ` (11 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Refactor "if options->msg_type" and other code added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) to reduce the scope of the "int msg_type" variable. This is in preparation for changing its type in a subsequent commit, only using it in the "!options->msg_type" scope makes that change smaller. This also brings the code in line with the fsck_set_msg_type() function (also added in 0282f4dced0), which does a similar check for "!options->msg_type". Another minor benefit is getting rid of the style violation of not having braces for the body of the "if". 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/fsck.c b/fsck.c index 6b72ddaa51..0988ab6579 100644 --- a/fsck.c +++ b/fsck.c @@ -167,19 +167,17 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) static int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { - int msg_type; - assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); - if (options->msg_type) - msg_type = options->msg_type[msg_id]; - else { - msg_type = msg_id_info[msg_id].msg_type; + if (!options->msg_type) { + int msg_type = msg_id_info[msg_id].msg_type; + if (options->strict && msg_type == FSCK_WARN) msg_type = FSCK_ERROR; + return msg_type; } - return msg_type; + return options->msg_type[msg_id]; } static int parse_msg_type(const char *str) -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
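The control flow after this refactoring can be sketched standalone as follows. This is only an approximation under assumed names: the two message ids and their default severities are hypothetical, and the override table is used as-is rather than being populated the way fsck_set_msg_type() does it.

```c
#include <assert.h>
#include <stddef.h>

enum demo_msg_type { DEMO_IGNORE, DEMO_INFO, DEMO_FATAL, DEMO_ERROR, DEMO_WARN };

/* Hypothetical message ids with hypothetical default severities. */
enum demo_msg_id { DEMO_MSG_SOME_ERROR, DEMO_MSG_SOME_WARNING, DEMO_MSG_MAX };

static const enum demo_msg_type demo_default_type[DEMO_MSG_MAX] = {
	[DEMO_MSG_SOME_ERROR] = DEMO_ERROR,
	[DEMO_MSG_SOME_WARNING] = DEMO_WARN,
};

struct demo_options {
	unsigned strict:1;
	enum demo_msg_type *msg_type; /* per-id overrides, or NULL */
};

/* Same shape as the refactored fsck_msg_type(): defaults first, then overrides. */
enum demo_msg_type demo_msg_type(enum demo_msg_id msg_id,
				 const struct demo_options *options)
{
	assert((int)msg_id >= 0 && msg_id < DEMO_MSG_MAX);

	if (!options->msg_type) {
		enum demo_msg_type msg_type = demo_default_type[msg_id];

		/* In strict mode, default warnings are promoted to errors. */
		if (options->strict && msg_type == DEMO_WARN)
			msg_type = DEMO_ERROR;
		return msg_type;
	}

	return options->msg_type[msg_id];
}

/* Resolve an id with no override table installed. */
enum demo_msg_type demo_resolve_default(enum demo_msg_id msg_id, int strict)
{
	struct demo_options options = { .strict = strict ? 1 : 0 };

	return demo_msg_type(msg_id, &options);
}

/* Resolve an id with an override table that ignores every message. */
enum demo_msg_type demo_resolve_overridden(enum demo_msg_id msg_id, int strict)
{
	enum demo_msg_type table[DEMO_MSG_MAX] = { DEMO_IGNORE, DEMO_IGNORE };
	struct demo_options options = { .strict = strict ? 1 : 0, .msg_type = table };

	return demo_msg_type(msg_id, &options);
}
```

With the local variable confined to the "no overrides" branch, changing its type later touches only that branch.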
* [PATCH v4 11/22] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (11 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 10/22] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:48 ` Junio C Hamano 2021-03-16 16:17 ` [PATCH v4 12/22] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason ` (10 subsequent siblings) 23 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new fsck_msg_type enum. These defines were originally introduced in: - ba002f3b28a (builtin-fsck: move common object checking code to fsck.c, 2008-02-25) - f50c4407305 (fsck: disallow demoting grave fsck errors to warnings, 2015-06-22) - efaba7cc77f (fsck: optionally ignore specific fsck issues completely, 2015-06-22) - f27d05b1704 (fsck: allow upgrading fsck warnings to errors, 2015-06-22) The reason these were defined in two different places is because we use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are used by external callbacks. Untangling that would take some more work, since we expose the new "enum fsck_msg_type" to both. Similar to "enum object_type" it's not worth structuring the API in such a way that only those who need FSCK_{ERROR,WARN} pass around a different type. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 2 +- builtin/index-pack.c | 3 ++- builtin/mktag.c | 3 ++- fsck.c | 21 ++++++++++----------- fsck.h | 16 ++++++++++------ 5 files changed, 25 insertions(+), 20 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 68f0329e69..d6d745dc70 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -89,7 +89,7 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 69f24fe9f7..56b8efaa89 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1716,7 +1716,8 @@ static void show_pack_info(int stat_only) static int print_dangling_gitmodules(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { /* * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it diff --git a/builtin/mktag.c b/builtin/mktag.c index 41a399a69e..1834394a9b 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -22,7 +22,8 @@ static int mktag_config(const char *var, const char *value, void *cb) static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/fsck.c b/fsck.c index 0988ab6579..fb7d071bbf 100644 --- a/fsck.c +++ b/fsck.c @@ -22,9 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FSCK_FATAL -1 -#define FSCK_INFO -2 - #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ @@ -97,7 +94,7 @@ static 
struct { const char *id_string; const char *downcased; const char *camelcased; - int msg_type; + enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { FOREACH_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } @@ -164,13 +161,13 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) list_config_item(list, prefix, msg_id_info[i].camelcased); } -static int fsck_msg_type(enum fsck_msg_id msg_id, +static enum fsck_msg_type fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); if (!options->msg_type) { - int msg_type = msg_id_info[msg_id].msg_type; + enum fsck_msg_type msg_type = msg_id_info[msg_id].msg_type; if (options->strict && msg_type == FSCK_WARN) msg_type = FSCK_ERROR; @@ -180,7 +177,7 @@ static int fsck_msg_type(enum fsck_msg_id msg_id, return options->msg_type[msg_id]; } -static int parse_msg_type(const char *str) +static enum fsck_msg_type parse_msg_type(const char *str) { if (!strcmp(str, "error")) return FSCK_ERROR; @@ -203,7 +200,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { - int msg_id = parse_msg_id(msg_id_str), msg_type; + int msg_id = parse_msg_id(msg_id_str); + enum fsck_msg_type msg_type; if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); @@ -214,7 +212,7 @@ void fsck_set_msg_type(struct fsck_options *options, if (!options->msg_type) { int i; - int *severity; + enum fsck_msg_type *severity; ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) severity[i] = fsck_msg_type(i, options); @@ -294,7 +292,8 @@ static int report(struct fsck_options *options, { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(msg_id, options), result; + enum fsck_msg_type msg_type = fsck_msg_type(msg_id, options); + int result; if (msg_type == FSCK_IGNORE) return 0; @@ -1265,7 +1264,7 @@ int fsck_object(struct 
object *obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index f67edd8f1f..2ecc15eee7 100644 --- a/fsck.h +++ b/fsck.h @@ -3,9 +3,13 @@ #include "oidset.h" -#define FSCK_ERROR 1 -#define FSCK_WARN 2 -#define FSCK_IGNORE 3 +enum fsck_msg_type { + FSCK_INFO = -2, + FSCK_FATAL = -1, + FSCK_ERROR = 1, + FSCK_WARN, + FSCK_IGNORE +}; struct fsck_options; struct object; @@ -29,17 +33,17 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); struct fsck_options { fsck_walk_func walk; fsck_error error_func; unsigned strict:1; - int *msg_type; + enum fsck_msg_type *msg_type; struct oidset skiplist; kh_oid_map_t *object_names; }; -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v4 11/22] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 2021-03-16 16:17 ` [PATCH v4 11/22] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason @ 2021-03-17 18:48 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-17 18:48 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new > fsck_msg_type enum. Nice. > Untangling that would take some more work, since we expose the new > "enum fsck_msg_type" to both. Similar to "enum object_type" it's not > worth structuring the API in such a way that only those who need > FSCK_{ERROR,WARN} pass around a different type. ^ permalink raw reply [flat|nested] 229+ messages in thread
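As a rough illustration of what the enum buys the callbacks touched by 11/22, here is a hedged sketch of an error callback that switches on the severity. The names are hypothetical, the enum is reduced to three members, and the default branch returns -1 where real code would more likely treat the case as a bug.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical reduced severity enum; the values mirror the old defines. */
enum demo_msg_type { DEMO_ERROR = 1, DEMO_WARN, DEMO_IGNORE };

/* Callback type taking the enum rather than a plain int. */
typedef int (*demo_error_cb)(enum demo_msg_type msg_type, const char *message);

/* Example callback: warnings are reported but do not count as failures. */
int demo_report(enum demo_msg_type msg_type, const char *message)
{
	switch (msg_type) {
	case DEMO_WARN:
		fprintf(stderr, "warning: %s\n", message);
		return 0;
	case DEMO_ERROR:
		fprintf(stderr, "error: %s\n", message);
		return 1;
	default:
		/* Callbacks are only expected to see ERROR or WARN. */
		fprintf(stderr, "unexpected msg_type %d\n", (int)msg_type);
		return -1;
	}
}

/* Invoke a callback with a fixed example message. */
int demo_dispatch(demo_error_cb cb, enum demo_msg_type msg_type)
{
	return cb(msg_type, "example message");
}
```

Since the parameter is an enum, a compiler can be asked to warn about enumerators a switch does not handle, which a plain int parameter does not allow.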
* [PATCH v4 12/22] fsck.h: re-order and re-assign "enum fsck_msg_type" 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (12 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 11/22] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:50 ` Junio C Hamano 2021-03-16 16:17 ` [PATCH v4 13/22] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason ` (9 subsequent siblings) 23 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the values in the "enum fsck_msg_type" from being manually assigned to using default C enum values. This means we end up with a FSCK_IGNORE=0, which was previously defined as "3". I'm confident that nothing relies on these values, we always compare them explicitly. Let's not omit "0" so it won't be assumed that we're using these as a boolean somewhere. This also allows us to re-structure the fields to mark which are "private" vs. "public". See the preceding commit for a rationale for not simply splitting these into two enums, namely that this is used for both the private and public fsck API. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fsck.h b/fsck.h index 2ecc15eee7..fce9981a0c 100644 --- a/fsck.h +++ b/fsck.h @@ -4,11 +4,13 @@ #include "oidset.h" enum fsck_msg_type { - FSCK_INFO = -2, - FSCK_FATAL = -1, - FSCK_ERROR = 1, + /* for internal use only */ + FSCK_IGNORE, + FSCK_INFO, + FSCK_FATAL, + /* "public", fed to e.g. error_func callbacks */ + FSCK_ERROR, FSCK_WARN, - FSCK_IGNORE }; struct fsck_options; -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v4 12/22] fsck.h: re-order and re-assign "enum fsck_msg_type" 2021-03-16 16:17 ` [PATCH v4 12/22] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-17 18:50 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-17 18:50 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Change the values in the "enum fsck_msg_type" from being manually > assigned to using default C enum values. > > This means we end up with a FSCK_IGNORE=0, which was previously > defined as "3". > > I'm confident that nothing relies on these values, we always compare > them explicitly. Let's not omit "0" so it won't be assumed that we're > using these as a boolean somewhere. Do you mean by "compare them explicitly", we always compare for equality? If the code had depended on constructs like "if (msg < FSCK_ERROR)", this change would break badly. > diff --git a/fsck.h b/fsck.h > index 2ecc15eee7..fce9981a0c 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -4,11 +4,13 @@ > #include "oidset.h" > > enum fsck_msg_type { > - FSCK_INFO = -2, > - FSCK_FATAL = -1, > - FSCK_ERROR = 1, > + /* for internal use only */ > + FSCK_IGNORE, > + FSCK_INFO, > + FSCK_FATAL, > + /* "public", fed to e.g. error_func callbacks */ > + FSCK_ERROR, > FSCK_WARN, > - FSCK_IGNORE > }; > > struct fsck_options; ^ permalink raw reply [flat|nested] 229+ messages in thread
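The concern raised in the reply above can be checked with a small sketch. The two enums below copy the old and new numbering from the diff under hypothetical DEMO_ names; the relational helpers stand in for the kind of "msg < FSCK_ERROR" construct the reply worries about, and are not taken from the code base.

```c
#include <assert.h>

/* Numbering before the patch: explicit values, remaining members implicit. */
enum demo_old_msg_type {
	DEMO_OLD_INFO = -2,
	DEMO_OLD_FATAL = -1,
	DEMO_OLD_ERROR = 1,
	DEMO_OLD_WARN,
	DEMO_OLD_IGNORE
};

/* Numbering after the patch: default C enum values starting at 0. */
enum demo_new_msg_type {
	DEMO_NEW_IGNORE,
	DEMO_NEW_INFO,
	DEMO_NEW_FATAL,
	DEMO_NEW_ERROR,
	DEMO_NEW_WARN
};

/* Hypothetical ordering-dependent test under the old numbering. */
int demo_old_below_error(enum demo_old_msg_type t)
{
	return t < DEMO_OLD_ERROR;
}

/* The same test under the new numbering gives a different answer for IGNORE. */
int demo_new_below_error(enum demo_new_msg_type t)
{
	return t < DEMO_NEW_ERROR;
}

/* Equality tests do not depend on the numeric values at all. */
int demo_old_is_warn(enum demo_old_msg_type t) { return t == DEMO_OLD_WARN; }
int demo_new_is_warn(enum demo_new_msg_type t) { return t == DEMO_NEW_WARN; }
```

So the renumbering is only safe if callers stick to equality tests and switch statements, which is exactly what the reply asks the author to confirm.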
* [PATCH v4 13/22] fsck.c: call parse_msg_type() early in fsck_set_msg_type() 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (13 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 12/22] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 14/22] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason ` (8 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason There's no reason to defer the calling of parse_msg_type() until after we've checked if the "id < 0". This is not a hot codepath, and parse_msg_type() itself may die on invalid input. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index fb7d071bbf..2ccf1a2f0f 100644 --- a/fsck.c +++ b/fsck.c @@ -201,11 +201,10 @@ void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { int msg_id = parse_msg_id(msg_id_str); - enum fsck_msg_type msg_type; + enum fsck_msg_type msg_type = parse_msg_type(msg_type_str); if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); - msg_type = parse_msg_type(msg_type_str); if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v4 14/22] fsck.c: undefine temporary STR macro after use 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (14 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 13/22] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-17 18:57 ` Junio C Hamano 2021-03-16 16:17 ` [PATCH v4 15/22] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason ` (7 subsequent siblings) 23 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason In f417eed8cde (fsck: provide a function to parse fsck message IDs, 2015-06-22) the "STR" macro was introduced, but that short macro name was not undefined after use as was done earlier in the same series for the MSG_ID macro in c99ba492f1c (fsck: introduce identifiers for fsck messages, 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fsck.c b/fsck.c index 2ccf1a2f0f..f4c924ed04 100644 --- a/fsck.c +++ b/fsck.c @@ -100,6 +100,7 @@ static struct { { NULL, NULL, NULL, -1 } }; #undef MSG_ID +#undef STR static void prepare_msg_ids(void) { -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v4 14/22] fsck.c: undefine temporary STR macro after use 2021-03-16 16:17 ` [PATCH v4 14/22] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason @ 2021-03-17 18:57 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-17 18:57 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > In f417eed8cde (fsck: provide a function to parse fsck message IDs, > 2015-06-22) the "STR" macro was introduced, but that short macro name > was not undefined after use as was done earlier in the same series for > the MSG_ID macro in c99ba492f1c (fsck: introduce identifiers for fsck > messages, 2015-06-22). Makes sense. Thanks. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/fsck.c b/fsck.c > index 2ccf1a2f0f..f4c924ed04 100644 > --- a/fsck.c > +++ b/fsck.c > @@ -100,6 +100,7 @@ static struct { > { NULL, NULL, NULL, -1 } > }; > #undef MSG_ID > +#undef STR > > static void prepare_msg_ids(void) > { ^ permalink raw reply [flat|nested] 229+ messages in thread
* [PATCH v4 15/22] fsck.c: give "FOREACH_MSG_ID" a more specific name 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (15 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 14/22] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 16/22] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h Ævar Arnfjörð Bjarmason ` (6 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation for moving it over to fsck.h. It's good convention to name macros in *.h files in such a way as to clearly not clash with any other names in other files. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fsck.c b/fsck.c index f4c924ed04..6fbc56e9fa 100644 --- a/fsck.c +++ b/fsck.c @@ -22,7 +22,7 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_MSG_ID(FUNC) \ +#define FOREACH_FSCK_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ FUNC(UNTERMINATED_HEADER, FATAL) \ @@ -83,7 +83,7 @@ static struct oidset gitmodules_done = OIDSET_INIT; #define MSG_ID(id, msg_type) FSCK_MSG_##id, enum fsck_msg_id { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) FSCK_MSG_MAX }; #undef MSG_ID @@ -96,7 +96,7 @@ static struct { const char *camelcased; enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } }; #undef MSG_ID -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v4 16/22] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (16 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 15/22] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 17/22] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason ` (5 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps define from fsck.c to fsck.h. This is in preparation for having non-static functions take the fsck_msg_id as an argument. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 66 ---------------------------------------------------------- fsck.h | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 66 insertions(+), 66 deletions(-) diff --git a/fsck.c b/fsck.c index 6fbc56e9fa..8a66168e51 100644 --- a/fsck.c +++ b/fsck.c @@ -22,72 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_FSCK_MSG_ID(FUNC) \ - /* fatal errors */ \ - FUNC(NUL_IN_HEADER, FATAL) \ - FUNC(UNTERMINATED_HEADER, FATAL) \ - /* errors */ \ - FUNC(BAD_DATE, ERROR) \ - FUNC(BAD_DATE_OVERFLOW, ERROR) \ - FUNC(BAD_EMAIL, ERROR) \ - FUNC(BAD_NAME, ERROR) \ - FUNC(BAD_OBJECT_SHA1, ERROR) \ - FUNC(BAD_PARENT_SHA1, ERROR) \ - FUNC(BAD_TAG_OBJECT, ERROR) \ - FUNC(BAD_TIMEZONE, ERROR) \ - FUNC(BAD_TREE, ERROR) \ - FUNC(BAD_TREE_SHA1, ERROR) \ - FUNC(BAD_TYPE, ERROR) \ - FUNC(DUPLICATE_ENTRIES, ERROR) \ - FUNC(MISSING_AUTHOR, ERROR) \ - FUNC(MISSING_COMMITTER, ERROR) \ - FUNC(MISSING_EMAIL, ERROR) \ - 
FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ - FUNC(MISSING_OBJECT, ERROR) \ - FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ - FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ - FUNC(MISSING_TAG, ERROR) \ - FUNC(MISSING_TAG_ENTRY, ERROR) \ - FUNC(MISSING_TREE, ERROR) \ - FUNC(MISSING_TREE_OBJECT, ERROR) \ - FUNC(MISSING_TYPE, ERROR) \ - FUNC(MISSING_TYPE_ENTRY, ERROR) \ - FUNC(MULTIPLE_AUTHORS, ERROR) \ - FUNC(TREE_NOT_SORTED, ERROR) \ - FUNC(UNKNOWN_TYPE, ERROR) \ - FUNC(ZERO_PADDED_DATE, ERROR) \ - FUNC(GITMODULES_MISSING, ERROR) \ - FUNC(GITMODULES_BLOB, ERROR) \ - FUNC(GITMODULES_LARGE, ERROR) \ - FUNC(GITMODULES_NAME, ERROR) \ - FUNC(GITMODULES_SYMLINK, ERROR) \ - FUNC(GITMODULES_URL, ERROR) \ - FUNC(GITMODULES_PATH, ERROR) \ - FUNC(GITMODULES_UPDATE, ERROR) \ - /* warnings */ \ - FUNC(BAD_FILEMODE, WARN) \ - FUNC(EMPTY_NAME, WARN) \ - FUNC(FULL_PATHNAME, WARN) \ - FUNC(HAS_DOT, WARN) \ - FUNC(HAS_DOTDOT, WARN) \ - FUNC(HAS_DOTGIT, WARN) \ - FUNC(NULL_SHA1, WARN) \ - FUNC(ZERO_PADDED_FILEMODE, WARN) \ - FUNC(NUL_IN_COMMIT, WARN) \ - /* infos (reported as warnings, but ignored by default) */ \ - FUNC(GITMODULES_PARSE, INFO) \ - FUNC(BAD_TAG_NAME, INFO) \ - FUNC(MISSING_TAGGER_ENTRY, INFO) \ - /* ignored (elevated when requested) */ \ - FUNC(EXTRA_HEADER_ENTRY, IGNORE) - -#define MSG_ID(id, msg_type) FSCK_MSG_##id, -enum fsck_msg_id { - FOREACH_FSCK_MSG_ID(MSG_ID) - FSCK_MSG_MAX -}; -#undef MSG_ID - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { diff --git a/fsck.h b/fsck.h index fce9981a0c..c3d3b47b88 100644 --- a/fsck.h +++ b/fsck.h @@ -13,6 +13,72 @@ enum fsck_msg_type { FSCK_WARN, }; +#define FOREACH_FSCK_MSG_ID(FUNC) \ + /* fatal errors */ \ + FUNC(NUL_IN_HEADER, FATAL) \ + FUNC(UNTERMINATED_HEADER, FATAL) \ + /* errors */ \ + FUNC(BAD_DATE, ERROR) \ + FUNC(BAD_DATE_OVERFLOW, ERROR) \ + FUNC(BAD_EMAIL, ERROR) \ + FUNC(BAD_NAME, ERROR) \ + FUNC(BAD_OBJECT_SHA1, ERROR) \ + FUNC(BAD_PARENT_SHA1, ERROR) \ + 
FUNC(BAD_TAG_OBJECT, ERROR) \ + FUNC(BAD_TIMEZONE, ERROR) \ + FUNC(BAD_TREE, ERROR) \ + FUNC(BAD_TREE_SHA1, ERROR) \ + FUNC(BAD_TYPE, ERROR) \ + FUNC(DUPLICATE_ENTRIES, ERROR) \ + FUNC(MISSING_AUTHOR, ERROR) \ + FUNC(MISSING_COMMITTER, ERROR) \ + FUNC(MISSING_EMAIL, ERROR) \ + FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_OBJECT, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_TAG, ERROR) \ + FUNC(MISSING_TAG_ENTRY, ERROR) \ + FUNC(MISSING_TREE, ERROR) \ + FUNC(MISSING_TREE_OBJECT, ERROR) \ + FUNC(MISSING_TYPE, ERROR) \ + FUNC(MISSING_TYPE_ENTRY, ERROR) \ + FUNC(MULTIPLE_AUTHORS, ERROR) \ + FUNC(TREE_NOT_SORTED, ERROR) \ + FUNC(UNKNOWN_TYPE, ERROR) \ + FUNC(ZERO_PADDED_DATE, ERROR) \ + FUNC(GITMODULES_MISSING, ERROR) \ + FUNC(GITMODULES_BLOB, ERROR) \ + FUNC(GITMODULES_LARGE, ERROR) \ + FUNC(GITMODULES_NAME, ERROR) \ + FUNC(GITMODULES_SYMLINK, ERROR) \ + FUNC(GITMODULES_URL, ERROR) \ + FUNC(GITMODULES_PATH, ERROR) \ + FUNC(GITMODULES_UPDATE, ERROR) \ + /* warnings */ \ + FUNC(BAD_FILEMODE, WARN) \ + FUNC(EMPTY_NAME, WARN) \ + FUNC(FULL_PATHNAME, WARN) \ + FUNC(HAS_DOT, WARN) \ + FUNC(HAS_DOTDOT, WARN) \ + FUNC(HAS_DOTGIT, WARN) \ + FUNC(NULL_SHA1, WARN) \ + FUNC(ZERO_PADDED_FILEMODE, WARN) \ + FUNC(NUL_IN_COMMIT, WARN) \ + /* infos (reported as warnings, but ignored by default) */ \ + FUNC(GITMODULES_PARSE, INFO) \ + FUNC(BAD_TAG_NAME, INFO) \ + FUNC(MISSING_TAGGER_ENTRY, INFO) \ + /* ignored (elevated when requested) */ \ + FUNC(EXTRA_HEADER_ENTRY, IGNORE) + +#define MSG_ID(id, msg_type) FSCK_MSG_##id, +enum fsck_msg_id { + FOREACH_FSCK_MSG_ID(MSG_ID) + FSCK_MSG_MAX +}; +#undef MSG_ID + struct fsck_options; struct object; -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
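For readers unfamiliar with the pattern that patches 14 to 16 touch, the following is a cut-down sketch of the "X-macro" technique, assuming a three-entry list. All DEMO_ names and the demo_msg_id_name() helper are hypothetical. One list macro is expanded twice with different definitions of MSG_ID, once to build the enum and once to build a parallel table, and the short helper macros are undefined after use so they cannot leak into other code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_msg_type { DEMO_IGNORE, DEMO_INFO, DEMO_FATAL, DEMO_ERROR, DEMO_WARN };

/* The single source of truth: one FUNC(id, severity) per message. */
#define FOREACH_DEMO_MSG_ID(FUNC) \
	FUNC(NUL_IN_HEADER, FATAL) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(HAS_DOT, WARN)

/* First expansion: generate the enum of message ids. */
#define MSG_ID(id, msg_type) DEMO_MSG_##id,
enum demo_msg_id {
	FOREACH_DEMO_MSG_ID(MSG_ID)
	DEMO_MSG_MAX
};
#undef MSG_ID

/* Second expansion: generate a table indexed by that enum. */
#define STR(x) #x
#define MSG_ID(id, msg_type) { STR(id), DEMO_##msg_type },
static const struct {
	const char *id_string;
	enum demo_msg_type msg_type;
} demo_msg_id_info[DEMO_MSG_MAX + 1] = {
	FOREACH_DEMO_MSG_ID(MSG_ID)
	{ NULL, DEMO_IGNORE }	/* terminator entry */
};
#undef MSG_ID
#undef STR	/* short names like STR should not outlive their use */

const char *demo_msg_id_name(enum demo_msg_id id)
{
	assert(id < DEMO_MSG_MAX);
	return demo_msg_id_info[id].id_string;
}

enum demo_msg_type demo_msg_id_default(enum demo_msg_id id)
{
	assert(id < DEMO_MSG_MAX);
	return demo_msg_id_info[id].msg_type;
}
```

Because the list macro ends up in a header in the real series, a name with a subsystem prefix (as in FOREACH_FSCK_MSG_ID) is less likely to clash than a generic one.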
* [PATCH v4 17/22] fsck.c: pass along the fsck_msg_id in the fsck_error callback 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (17 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 16/22] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-17 19:01 ` Junio C Hamano 2021-03-16 16:17 ` [PATCH v4 18/22] fsck.c: add an fsck_set_msg_type() API that takes enums Ævar Arnfjörð Bjarmason ` (4 subsequent siblings) 23 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the fsck_error callback to also pass along the fsck_msg_id. Before this change the only way to get the message id was to parse it back out of the "message". Let's pass it down explicitly for the benefit of callers that might want to use it, as discussed in [1]. Passing the msg_type is now redundant, as you can always get it back from the msg_id, but I'm not changing that convention. It's really common to need the msg_type, and the report() function itself (which calls "fsck_error") needs to call fsck_msg_type() to discover it. Let's not needlessly re-do that work in the user callback. 1. 
https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 4 +++- builtin/index-pack.c | 3 ++- builtin/mktag.c | 1 + fsck.c | 6 ++++-- fsck.h | 6 ++++-- 5 files changed, 14 insertions(+), 6 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index d6d745dc70..b71fac4cec 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -89,7 +89,9 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message) + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 56b8efaa89..2b2266a4b7 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1717,6 +1717,7 @@ static int print_dangling_gitmodules(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, const char *message) { /* @@ -1727,7 +1728,7 @@ static int print_dangling_gitmodules(struct fsck_options *o, printf("%s\n", oid_to_hex(oid)); return 0; } - return fsck_error_function(o, oid, object_type, msg_type, message); + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); } int cmd_index_pack(int argc, const char **argv, const char *prefix) diff --git a/builtin/mktag.c b/builtin/mktag.c index 1834394a9b..dc989c356f 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -23,6 +23,7 @@ static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, const char *message) { switch (msg_type) { diff --git a/fsck.c b/fsck.c index 8a66168e51..5a040eb4fd 100644 --- a/fsck.c +++ b/fsck.c @@ -245,7 +245,7 @@ static int report(struct fsck_options 
*options, va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); result = options->error_func(options, oid, object_type, - msg_type, sb.buf); + msg_type, msg_id, sb.buf); strbuf_release(&sb); va_end(ap); @@ -1198,7 +1198,9 @@ int fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message) + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index c3d3b47b88..33ecf3f3f1 100644 --- a/fsck.h +++ b/fsck.h @@ -101,11 +101,13 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message); + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message); + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); struct fsck_options { fsck_walk_func walk; -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v4 17/22] fsck.c: pass along the fsck_msg_id in the fsck_error callback 2021-03-16 16:17 ` [PATCH v4 17/22] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason @ 2021-03-17 19:01 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-17 19:01 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Change the fsck_error callback to also pass along the > fsck_msg_id. Before this change the only way to get the message id was > to parse it back out of the "message". Nice. > Let's pass it down explicitly for the benefit of callers that might > want to use it, as discussed in [1]. > > Passing the msg_type is now redundant, as you can always get it back > from the msg_id, but I'm not changing that convention. It's really > common to need the msg_type, and the report() function itself (which > calls "fsck_error") needs to call fsck_msg_type() to discover > it. Let's not needlessly re-do that work in the user callback. ^ permalink raw reply [flat|nested] 229+ messages in thread
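A small standalone sketch of why passing the id is more robust than parsing it back out of the message. This is not the fsck API itself: the demo_ types and functions are hypothetical stand-ins with a simplified signature, and a return value of 0 is assumed here to mean "tolerated".

```c
#include <assert.h>
#include <string.h>

enum demo_msg_type { DEMO_ERROR, DEMO_WARN };
enum demo_msg_id { DEMO_MSG_BAD_DATE, DEMO_MSG_GITMODULES_MISSING };

/* Hypothetical callback shape: severity, id, and the formatted message. */
typedef int (*demo_error_fn)(enum demo_msg_type msg_type,
			     enum demo_msg_id msg_id,
			     const char *message);

/* Old approach: recover the id by matching the message prefix. */
int demo_cb_by_string(enum demo_msg_type msg_type,
		      enum demo_msg_id msg_id,
		      const char *message)
{
	const char *prefix = "gitmodulesMissing";

	(void)msg_type;
	(void)msg_id;
	assert(message);
	return strncmp(message, prefix, strlen(prefix)) != 0;
}

/* New approach: compare the id that the caller passes along. */
int demo_cb_by_id(enum demo_msg_type msg_type,
		  enum demo_msg_id msg_id,
		  const char *message)
{
	(void)msg_type;
	(void)message;
	return msg_id != DEMO_MSG_GITMODULES_MISSING;
}

/* Stand-in for report(): hands everything it knows to the callback. */
int demo_report(demo_error_fn cb, enum demo_msg_type msg_type,
		enum demo_msg_id msg_id, const char *message)
{
	assert(cb);
	return cb(msg_type, msg_id, message);
}
```

The string-matching callback silently stops recognising the condition if the message text is ever reworded, while the id-based one does not depend on the text at all.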
* [PATCH v4 18/22] fsck.c: add an fsck_set_msg_type() API that takes enums 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (18 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 17/22] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 19/22] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason ` (3 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change code I added in acf9de4c94e (mktag: use fsck instead of custom verify_tag(), 2021-01-05) to make use of a new API function that takes the fsck_msg_{id,type} types, instead of arbitrary strings that we'll (hopefully) parse into those types. At the time that the fsck_set_msg_type() API was introduced in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) it was only intended to be used to parse user-supplied data. For things that are purely internal to the C code it makes sense to have the compiler check these arguments, and to skip the sanity checking of the data in fsck_set_msg_type() which is redundant to checks we get from the compiler. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/mktag.c | 3 ++- fsck.c | 27 +++++++++++++++++---------- fsck.h | 3 +++ 3 files changed, 22 insertions(+), 11 deletions(-) diff --git a/builtin/mktag.c b/builtin/mktag.c index dc989c356f..de67a94f24 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -93,7 +93,8 @@ int cmd_mktag(int argc, const char **argv, const char *prefix) die_errno(_("could not read from stdin")); fsck_options.error_func = mktag_fsck_error_func; - fsck_set_msg_type(&fsck_options, "extraheaderentry", "warn"); + fsck_set_msg_type_from_ids(&fsck_options, FSCK_MSG_EXTRA_HEADER_ENTRY, + FSCK_WARN); /* config might set fsck.extraHeaderEntry=* again */ git_config(mktag_config, NULL); if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options, diff --git a/fsck.c b/fsck.c index 5a040eb4fd..f26f47b2a1 100644 --- a/fsck.c +++ b/fsck.c @@ -132,6 +132,22 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) return 1; } +void fsck_set_msg_type_from_ids(struct fsck_options *options, + enum fsck_msg_id msg_id, + enum fsck_msg_type msg_type) +{ + if (!options->msg_type) { + int i; + enum fsck_msg_type *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); + for (i = 0; i < FSCK_MSG_MAX; i++) + severity[i] = fsck_msg_type(i, options); + options->msg_type = severity; + } + + options->msg_type[msg_id] = msg_type; +} + void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { @@ -144,16 +160,7 @@ void fsck_set_msg_type(struct fsck_options *options, if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); - if (!options->msg_type) { - int i; - enum fsck_msg_type *severity; - ALLOC_ARRAY(severity, FSCK_MSG_MAX); - for (i = 0; i < FSCK_MSG_MAX; i++) - severity[i] = fsck_msg_type(i, options); - options->msg_type = severity; - } - - options->msg_type[msg_id] = msg_type; + fsck_set_msg_type_from_ids(options, msg_id, 
msg_type); } void fsck_set_msg_types(struct fsck_options *options, const char *values) diff --git a/fsck.h b/fsck.h index 33ecf3f3f1..6c2fd9c5cc 100644 --- a/fsck.h +++ b/fsck.h @@ -82,6 +82,9 @@ enum fsck_msg_id { struct fsck_options; struct object; +void fsck_set_msg_type_from_ids(struct fsck_options *options, + enum fsck_msg_id msg_id, + enum fsck_msg_type msg_type); void fsck_set_msg_type(struct fsck_options *options, const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
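The split between a typed setter and a string-parsing wrapper can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the patch's code: the demo_ names are hypothetical, malloc() stands in for ALLOC_ARRAY(), and the defaults come from a fixed table rather than from fsck_msg_type(), which in the real code also takes the options into account.

```c
#include <assert.h>
#include <stdlib.h>

enum demo_msg_type { DEMO_IGNORE, DEMO_INFO, DEMO_FATAL, DEMO_ERROR, DEMO_WARN };
enum demo_msg_id { DEMO_MSG_BAD_DATE, DEMO_MSG_EXTRA_HEADER_ENTRY, DEMO_MSG_MAX };

/* Hypothetical built-in severities, indexed by message id. */
static const enum demo_msg_type demo_defaults[DEMO_MSG_MAX] = {
	DEMO_ERROR,	/* DEMO_MSG_BAD_DATE */
	DEMO_IGNORE	/* DEMO_MSG_EXTRA_HEADER_ENTRY */
};

struct demo_options {
	enum demo_msg_type *msg_type;	/* NULL until first override */
};

/* Look up the effective severity, falling back to the defaults. */
enum demo_msg_type demo_get_msg_type(const struct demo_options *o,
				     enum demo_msg_id id)
{
	assert(id < DEMO_MSG_MAX);
	return o->msg_type ? o->msg_type[id] : demo_defaults[id];
}

/*
 * Typed setter: the compiler checks the argument types, so no string
 * parsing or validation is needed. The override array is allocated
 * lazily and seeded with the defaults. Returns 0 on success, -1 if the
 * allocation fails.
 */
int demo_set_msg_type_from_ids(struct demo_options *o,
			       enum demo_msg_id id,
			       enum demo_msg_type type)
{
	assert(id < DEMO_MSG_MAX);
	if (!o->msg_type) {
		int i;
		enum demo_msg_type *severity;

		severity = malloc(sizeof(*severity) * DEMO_MSG_MAX);
		if (!severity)
			return -1;
		for (i = 0; i < DEMO_MSG_MAX; i++)
			severity[i] = demo_defaults[i];
		o->msg_type = severity;
	}
	o->msg_type[id] = type;
	return 0;
}
```

A string-based entry point for user-supplied configuration would parse and validate its arguments and then call the typed setter, which is the shape the patch gives fsck_set_msg_type().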
* [PATCH v4 19/22] fsck.c: move gitmodules_{found,done} into fsck_options 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (19 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 18/22] fsck.c: add an fsck_set_msg_type() API that takes enums Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 20/22] fetch-pack: don't needlessly copy fsck_options Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Move the gitmodules_{found,done} static variables added in 159e7b080bf (fsck: detect gitmodules files, 2018-05-02) into the fsck_options struct. It makes sense to keep all the context in the same place. This requires changing the register_found_gitmodules() function recently added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to take fsck_options. That function will be removed in a subsequent commit, but as it'll require the new gitmodules_found attribute of "fsck_options" we need this intermediate step first.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fetch-pack.c | 2 +- fsck.c | 23 ++++++++++------------- fsck.h | 7 ++++++- 3 files changed, 17 insertions(+), 15 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 6a61a46428..82c3c2c043 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -998,7 +998,7 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(oid); + register_found_gitmodules(&fo, oid); if (fsck_finish(&fo)) die("fsck failed"); } diff --git a/fsck.c b/fsck.c index f26f47b2a1..565274a946 100644 --- a/fsck.c +++ b/fsck.c @@ -19,9 +19,6 @@ #include "credential.h" #include "help.h" -static struct oidset gitmodules_found = OIDSET_INIT; -static struct oidset gitmodules_done = OIDSET_INIT; - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { @@ -624,7 +621,7 @@ static int fsck_tree(const struct object_id *oid, if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, @@ -638,7 +635,7 @@ static int fsck_tree(const struct object_id *oid, has_dotgit |= is_ntfs_dotgit(backslash); if (is_ntfs_dotgitmodules(backslash)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, FSCK_MSG_GITMODULES_SYMLINK, @@ -1150,9 +1147,9 @@ static int fsck_blob(const struct object_id *oid, const char *buf, struct fsck_gitmodules_data data; struct config_options config_opts = { 0 }; - if (!oidset_contains(&gitmodules_found, oid)) + if (!oidset_contains(&options->gitmodules_found, oid)) return 0; - oidset_insert(&gitmodules_done, oid); + oidset_insert(&options->gitmodules_done, oid); if (object_on_skiplist(options, oid)) return 0; 
@@ -1217,9 +1214,9 @@ int fsck_error_function(struct fsck_options *o, return 1; } -void register_found_gitmodules(const struct object_id *oid) +void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) { - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); } int fsck_finish(struct fsck_options *options) @@ -1228,13 +1225,13 @@ int fsck_finish(struct fsck_options *options) struct oidset_iter iter; const struct object_id *oid; - oidset_iter_init(&gitmodules_found, &iter); + oidset_iter_init(&options->gitmodules_found, &iter); while ((oid = oidset_iter_next(&iter))) { enum object_type type; unsigned long size; char *buf; - if (oidset_contains(&gitmodules_done, oid)) + if (oidset_contains(&options->gitmodules_done, oid)) continue; buf = read_object_file(oid, &type, &size); @@ -1259,8 +1256,8 @@ int fsck_finish(struct fsck_options *options) } - oidset_clear(&gitmodules_found); - oidset_clear(&gitmodules_done); + oidset_clear(&options->gitmodules_found); + oidset_clear(&options->gitmodules_done); return ret; } diff --git a/fsck.h b/fsck.h index 6c2fd9c5cc..bb59ef05b6 100644 --- a/fsck.h +++ b/fsck.h @@ -118,6 +118,8 @@ struct fsck_options { unsigned strict:1; enum fsck_msg_type *msg_type; struct oidset skiplist; + struct oidset gitmodules_found; + struct oidset gitmodules_done; kh_oid_map_t *object_names; }; @@ -125,6 +127,8 @@ struct fsck_options { .walk = NULL, \ .msg_type = NULL, \ .skiplist = OIDSET_INIT, \ + .gitmodules_found = OIDSET_INIT, \ + .gitmodules_done = OIDSET_INIT, \ .object_names = NULL, #define FSCK_OPTIONS_COMMON_ERROR_FUNC \ FSCK_OPTIONS_COMMON \ @@ -149,7 +153,8 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); -void register_found_gitmodules(const struct object_id *oid); +void register_found_gitmodules(struct fsck_options *options, + const struct 
object_id *oid); /* * fsck a tag, and pass info about it back to the caller. This is -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
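The effect of moving the two sets from file-scope statics into the options struct can be shown with a toy model. The following is a sketch only: a tiny fixed-size string set stands in for "struct oidset", and every demo_ name is hypothetical. The point is that each options instance now carries its own "found" and "done" state, so finishing one instance neither reports nor clears entries recorded on another.

```c
#include <assert.h>
#include <string.h>

#define DEMO_SET_MAX 8

/* Toy stand-in for struct oidset: a small set of string keys. */
struct demo_set {
	const char *items[DEMO_SET_MAX];
	int nr;
};

int demo_set_contains(const struct demo_set *s, const char *key)
{
	int i;

	for (i = 0; i < s->nr; i++)
		if (!strcmp(s->items[i], key))
			return 1;
	return 0;
}

void demo_set_insert(struct demo_set *s, const char *key)
{
	if (demo_set_contains(s, key))
		return;
	assert(s->nr < DEMO_SET_MAX);
	s->items[s->nr++] = key;
}

/* The per-instance context, as after the patch. */
struct demo_options {
	struct demo_set gitmodules_found;	/* referenced from a tree */
	struct demo_set gitmodules_done;	/* blob was actually checked */
};

void demo_options_init(struct demo_options *o)
{
	memset(o, 0, sizeof(*o));
}

/* A tree entry named .gitmodules points at this blob id. */
void demo_saw_tree_entry(struct demo_options *o, const char *oid)
{
	demo_set_insert(&o->gitmodules_found, oid);
}

/* A blob was seen; mark it done only if a tree referenced it. */
void demo_saw_blob(struct demo_options *o, const char *oid)
{
	if (demo_set_contains(&o->gitmodules_found, oid))
		demo_set_insert(&o->gitmodules_done, oid);
}

/* Count referenced-but-unchecked blobs, then reset this instance. */
int demo_finish(struct demo_options *o)
{
	int i, missing = 0;

	for (i = 0; i < o->gitmodules_found.nr; i++)
		if (!demo_set_contains(&o->gitmodules_done,
				       o->gitmodules_found.items[i]))
			missing++;
	demo_options_init(o);
	return missing;
}
```

With file-scope statics, the equivalent of calling demo_finish() on one instance would also have consumed whatever another caller had registered.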
* [PATCH v4 20/22] fetch-pack: don't needlessly copy fsck_options 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (20 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 19/22] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 21/22] fetch-pack: use file-scope static struct for fsck_options Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the behavior of the .gitmodules validation added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so we're using one "fsck_options". I found that code confusing to read. One might think that not setting up the error_func earlier means that we're relying on the "error_func" not being set in some code in between the two hunks being modified here. But we're not, all we're doing in the rest of "cmd_index_pack()" is further setup by calling fsck_set_msg_types(), and assigning to do_fsck_object. So there was no reason in 5476e1efde to make a shallow copy of the fsck_options struct before setting error_func. Let's just do this setup at the top of the function, along with the "walk" assignment. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/index-pack.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 2b2266a4b7..5ad80b85b4 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1761,6 +1761,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) read_replace_refs = 0; fsck_options.walk = mark_link; + fsck_options.error_func = print_dangling_gitmodules; reset_pack_idx_option(&opts); git_config(git_index_pack_config, &opts); @@ -1951,13 +1952,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) else close(input_fd); - if (do_fsck_object) { - struct fsck_options fo = fsck_options; - - fo.error_func = print_dangling_gitmodules; - if (fsck_finish(&fo)) - die(_("fsck error in pack objects")); - } + if (do_fsck_object && fsck_finish(&fsck_options)) + die(_("fsck error in pack objects")); free(objects); strbuf_release(&index_name_buf); -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v4 21/22] fetch-pack: use file-scope static struct for fsck_options 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (21 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 20/22] fetch-pack: don't needlessly copy fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 16:17 ` [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change code added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so that we use a file-scoped "static struct fsck_options" instead of defining one in the "fsck_gitmodules_oids()" function. We use this pattern in all of builtin/{fsck,index-pack,mktag,unpack-objects}.c. It's odd to see fetch-pack be the odd one out. One might think that we're using other fsck_options structs in fetch-pack, or doing an fsck twice there, but we're not.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fetch-pack.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 82c3c2c043..229fd8e2c2 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,6 +38,7 @@ static int server_supports_filtering; static int advertise_sid; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; +static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; @@ -991,15 +992,14 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) { struct oidset_iter iter; const struct object_id *oid; - struct fsck_options fo = FSCK_OPTIONS_STRICT; if (!oidset_size(gitmodules_oids)) return; oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fo, oid); - if (fsck_finish(&fo)) + register_found_gitmodules(&fsck_options, oid); + if (fsck_finish(&fsck_options)) die("fsck failed"); } -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason ` (22 preceding siblings ...) 2021-03-16 16:17 ` [PATCH v4 21/22] fetch-pack: use file-scope static struct for fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 ` Ævar Arnfjörð Bjarmason 2021-03-16 19:32 ` Derrick Stolee 2021-03-17 19:12 ` Junio C Hamano 23 siblings, 2 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-16 16:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Refactor the check added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to make use of us now passing the "msg_id" to the user-defined "error_func". We can now compare against the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated message. Let's also replace register_found_gitmodules() with directly manipulating the "gitmodules_found" member. A recent commit moved it into "fsck_options" so we could do this here. Add a fsck-cb.c file similar to parse-options-cb.c. The alternative would be to either define this directly in fsck.c as a public API, or to create some library shared by fetch-pack.c and builtin/index-pack. I expect that there won't be many of these fsck utility functions in the future, so just having a single fsck-cb.c makes sense.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- Makefile | 1 + builtin/index-pack.c | 21 +-------------------- fetch-pack.c | 4 ++-- fsck-cb.c | 16 ++++++++++++++++ fsck.c | 5 ----- fsck.h | 22 +++++++++++++++++++--- 6 files changed, 39 insertions(+), 30 deletions(-) create mode 100644 fsck-cb.c diff --git a/Makefile b/Makefile index dfb0f1000f..3faa8bd0d3 100644 --- a/Makefile +++ b/Makefile @@ -882,6 +882,7 @@ LIB_OBJS += fetch-negotiator.o LIB_OBJS += fetch-pack.o LIB_OBJS += fmt-merge-msg.o LIB_OBJS += fsck.o +LIB_OBJS += fsck-cb.o LIB_OBJS += fsmonitor.o LIB_OBJS += gettext.o LIB_OBJS += gpg-interface.o diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 5ad80b85b4..11f0fafd33 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -120,7 +120,7 @@ static int nr_threads; static int from_stdin; static int strict; static int do_fsck_object; -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; static int verbose; static int show_resolving_progress; static int show_stat; @@ -1713,24 +1713,6 @@ static void show_pack_info(int stat_only) } } -static int print_dangling_gitmodules(struct fsck_options *o, - const struct object_id *oid, - enum object_type object_type, - enum fsck_msg_type msg_type, - enum fsck_msg_id msg_id, - const char *message) -{ - /* - * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it - * instead of relying on this string check. 
- */ - if (starts_with(message, "gitmodulesMissing")) { - printf("%s\n", oid_to_hex(oid)); - return 0; - } - return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); -} - int cmd_index_pack(int argc, const char **argv, const char *prefix) { int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; @@ -1761,7 +1743,6 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) read_replace_refs = 0; fsck_options.walk = mark_link; - fsck_options.error_func = print_dangling_gitmodules; reset_pack_idx_option(&opts); git_config(git_index_pack_config, &opts); diff --git a/fetch-pack.c b/fetch-pack.c index 229fd8e2c2..008a3facd4 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,7 +38,7 @@ static int server_supports_filtering; static int advertise_sid; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; @@ -998,7 +998,7 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fsck_options, oid); + oidset_insert(&fsck_options.gitmodules_found, oid); if (fsck_finish(&fsck_options)) die("fsck failed"); } diff --git a/fsck-cb.c b/fsck-cb.c new file mode 100644 index 0000000000..465a49235a --- /dev/null +++ b/fsck-cb.c @@ -0,0 +1,16 @@ +#include "git-compat-util.h" +#include "fsck.h" + +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) +{ + if (msg_id == FSCK_MSG_GITMODULES_MISSING) { + puts(oid_to_hex(oid)); + return 0; + } + return fsck_error_function(o, oid, object_type, 
msg_type, msg_id, message); +} diff --git a/fsck.c b/fsck.c index 565274a946..b0089844db 100644 --- a/fsck.c +++ b/fsck.c @@ -1214,11 +1214,6 @@ int fsck_error_function(struct fsck_options *o, return 1; } -void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) -{ - oidset_insert(&options->gitmodules_found, oid); -} - int fsck_finish(struct fsck_options *options) { int ret = 0; diff --git a/fsck.h b/fsck.h index bb59ef05b6..ae3107638a 100644 --- a/fsck.h +++ b/fsck.h @@ -153,9 +153,6 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); -void register_found_gitmodules(struct fsck_options *options, - const struct object_id *oid); - /* * fsck a tag, and pass info about it back to the caller. This is * exposed fsck_object() internals for git-mktag(1). @@ -204,4 +201,23 @@ const char *fsck_describe_object(struct fsck_options *options, int fsck_config_internal(const char *var, const char *value, void *cb, struct fsck_options *options); +/* + * Initializations for callbacks in fsck-cb.c + */ +#define FSCK_OPTIONS_MISSING_GITMODULES { \ + .strict = 1, \ + .error_func = fsck_error_cb_print_missing_gitmodules, \ + FSCK_OPTIONS_COMMON \ +} + +/* + * Error callbacks in fsck-cb.c + */ +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message); + #endif -- 2.31.0.260.g719c683c1d ^ permalink raw reply related [flat|nested] 229+ messages in thread
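The difference between the removed string check and the new enum comparison can be sketched in isolation. This is a minimal stand-alone illustration, not git's code: the `demo_` enum, the helper names, and the simplified callback signature are all assumptions made so the fragment compiles without git's headers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a few of git's fsck message ids. */
enum demo_msg_id {
	DEMO_MSG_BAD_TREE,
	DEMO_MSG_GITMODULES_MISSING,
	DEMO_MSG_GITMODULES_URL
};

/* Old approach: infer the message id from the rendered message text. */
static int is_missing_by_string(const char *message)
{
	const char *prefix = "gitmodulesMissing";

	return !strncmp(message, prefix, strlen(prefix));
}

/* New approach: the id is handed to the callback, so compare it directly. */
static int is_missing_by_id(enum demo_msg_id msg_id)
{
	return msg_id == DEMO_MSG_GITMODULES_MISSING;
}

/*
 * Simplified error callback (assumed signature): print the object id and
 * report success for a missing .gitmodules blob, otherwise report an error.
 */
static int demo_error_cb(enum demo_msg_id msg_id, const char *hex_oid,
			 const char *message)
{
	if (is_missing_by_id(msg_id)) {
		puts(hex_oid);
		return 0;
	}
	fprintf(stderr, "error: object %s: %s\n", hex_oid, message);
	return 1;
}
```

The enum comparison does not depend on how the message text is formatted, which is the coupling the NEEDSWORK comment in the removed function pointed at.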
* Re: [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules 2021-03-16 16:17 ` [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason @ 2021-03-16 19:32 ` Derrick Stolee 2021-03-17 13:47 ` Ævar Arnfjörð Bjarmason 2021-03-17 19:12 ` Junio C Hamano 1 sibling, 1 reply; 229+ messages in thread From: Derrick Stolee @ 2021-03-16 19:32 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan On 3/16/2021 12:17 PM, Ævar Arnfjörð Bjarmason wrote: > Refactor the check added in 5476e1efde (fetch-pack: print and use > dangling .gitmodules, 2021-02-22) to make use of us now passing the > "msg_id" to the user defined "error_func". We can now compare against > the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated > message. > > Let's also replace register_found_gitmodules() with directly > manipulating the "gitmodules_found" member. A recent commit moved it > into "fsck_options" so we could do this here. > > Add a fsck-cb.c file similar to parse-options-cb.c, the alternative > would be to either define this directly in fsck.c as a public API, or > to create some library shared by fetch-pack.c ad builtin/index-pack. > > I expect that there won't be many of these fsck utility functions in > the future, so just having a single fsck-cb.c makes sense. I'm not convinced that having a single cb function merits its own file. But, if you expect this pattern to be expanded a couple more times, then I would say it is worth it. Do you have such plans? Thanks, -Stolee ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules 2021-03-16 19:32 ` Derrick Stolee @ 2021-03-17 13:47 ` Ævar Arnfjörð Bjarmason 2021-03-17 20:27 ` Derrick Stolee 0 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-17 13:47 UTC (permalink / raw) To: Derrick Stolee Cc: git, Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan On Tue, Mar 16 2021, Derrick Stolee wrote: > On 3/16/2021 12:17 PM, Ævar Arnfjörð Bjarmason wrote: >> Refactor the check added in 5476e1efde (fetch-pack: print and use >> dangling .gitmodules, 2021-02-22) to make use of us now passing the >> "msg_id" to the user defined "error_func". We can now compare against >> the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated >> message. >> >> Let's also replace register_found_gitmodules() with directly >> manipulating the "gitmodules_found" member. A recent commit moved it >> into "fsck_options" so we could do this here. >> >> Add a fsck-cb.c file similar to parse-options-cb.c, the alternative >> would be to either define this directly in fsck.c as a public API, or >> to create some library shared by fetch-pack.c ad builtin/index-pack. >> >> I expect that there won't be many of these fsck utility functions in >> the future, so just having a single fsck-cb.c makes sense. > > I'm not convinced that having a single cb function merits its > own file. But, if you expect this pattern to be expanded a > couple more times, then I would say it is worth it. Do you have > such plans? Not really, well. Vague ones, but nothing I have even local patches for. It just seemed odd to stick random callback functions shared by related programs into fsck.h's interface, but I guess with FSCK_OPTIONS_MISSING_GITMODULES I already did that. Do you suggest just putting it into fsck.c? ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules 2021-03-17 13:47 ` Ævar Arnfjörð Bjarmason @ 2021-03-17 20:27 ` Derrick Stolee 0 siblings, 0 replies; 229+ messages in thread From: Derrick Stolee @ 2021-03-17 20:27 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan On 3/17/2021 9:47 AM, Ævar Arnfjörð Bjarmason wrote: > > On Tue, Mar 16 2021, Derrick Stolee wrote: > >> On 3/16/2021 12:17 PM, Ævar Arnfjörð Bjarmason wrote: >>> I expect that there won't be many of these fsck utility functions in >>> the future, so just having a single fsck-cb.c makes sense. >> >> I'm not convinced that having a single cb function merits its >> own file. But, if you expect this pattern to be expanded a >> couple more times, then I would say it is worth it. Do you have >> such plans? > > Not really, well. Vague ones, but nothing I have even local patches for. > > It just seemed odd to stick random callback functions shared by related > programs into fsck.h's interface, but I guess with > FSCK_OPTIONS_MISSING_GITMODULES I already did that. > > Do you suggest just putting it into fsck.c? Yeah, if it is frequently paired with fsck operations, I think it makes the most sense there. And looking at it again, I'm not sure parse-options-cb.c has a good excuse for being separate from parse-options.c, but that's the current state so I wouldn't change it now. Thanks, -Stolee ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules 2021-03-16 16:17 ` [PATCH v4 22/22] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason 2021-03-16 19:32 ` Derrick Stolee @ 2021-03-17 19:12 ` Junio C Hamano 1 sibling, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-17 19:12 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > diff --git a/builtin/index-pack.c b/builtin/index-pack.c > index 5ad80b85b4..11f0fafd33 100644 > --- a/builtin/index-pack.c > +++ b/builtin/index-pack.c > @@ -120,7 +120,7 @@ static int nr_threads; > static int from_stdin; > static int strict; > static int do_fsck_object; > -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; > +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; Hmph, I do not think this is a good way to go. Specifically, fsck-cb.c with the definition of what this thing is, and in fsck.h file the normal "options" initializers being defined quite far away from where this is defined, it is hard to see what is different between the normal strict one and MISSING_GITMODULES one. Rather, it may be far simpler to keep only DEFAULT and STRICT, and override .error_func at runtime in the codepath(s) that needs to, which would make it more clear what is going on. That way, we do not need the split initializers with _ERROR_FUNC, which is another reason why the approach taken by this series is not a good idea (it does not scale---error-func may seem so special to deserve having two sets of macros that use the default one and leave the member unspecified, but it won't stay to be special forever). 
IOW, > @@ -1761,7 +1743,6 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) > > read_replace_refs = 0; > fsck_options.walk = mark_link; > - fsck_options.error_func = print_dangling_gitmodules; I doubt this hunk is an improvement. > diff --git a/fsck-cb.c b/fsck-cb.c > new file mode 100644 > index 0000000000..465a49235a > --- /dev/null > +++ b/fsck-cb.c > @@ -0,0 +1,16 @@ > +#include "git-compat-util.h" > +#include "fsck.h" > + > +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, > + const struct object_id *oid, > + enum object_type object_type, > + enum fsck_msg_type msg_type, > + enum fsck_msg_id msg_id, > + const char *message) > +{ > + if (msg_id == FSCK_MSG_GITMODULES_MISSING) { > + puts(oid_to_hex(oid)); > + return 0; > + } > + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); > +} As Derrick noticed, I do not know if we want to have a separate file for this single function. Shouldn't it be part of builtin/index-pack.c, or do we want other places to do the same kind of checks? Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
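The alternative suggested in this reply (keep only the DEFAULT and STRICT initializers and assign .error_func at runtime where needed) might look roughly like the following. It is a sketch under stated assumptions: the struct, the `demo_` names, the numeric message id, and the two-argument callback are simplified stand-ins, not git's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

struct demo_fsck_options;
typedef int (*demo_error_fn)(struct demo_fsck_options *o, int msg_id);

/* Hypothetical, much-reduced version of struct fsck_options. */
struct demo_fsck_options {
	int strict;
	demo_error_fn error_func;
};

/* Arbitrary illustrative value, not git's real enum constant. */
#define DEMO_MSG_GITMODULES_MISSING 42

/* Stand-in for the default error function: always reports an error. */
static int demo_default_error(struct demo_fsck_options *o, int msg_id)
{
	(void)o;
	(void)msg_id;
	return 1;
}

/* Special-case callback: tolerate a missing .gitmodules blob. */
static int demo_print_missing(struct demo_fsck_options *o, int msg_id)
{
	if (msg_id == DEMO_MSG_GITMODULES_MISSING)
		return 0;
	return demo_default_error(o, msg_id);
}

/* Only the ordinary initializer exists; no per-callback variant. */
#define DEMO_OPTIONS_STRICT { .strict = 1, .error_func = demo_default_error }

static struct demo_fsck_options demo_options = DEMO_OPTIONS_STRICT;

/* The code path that needs different behaviour overrides the member. */
static void demo_setup(void)
{
	demo_options.error_func = demo_print_missing;
}
```

With this shape the override is visible at the call site that needs it, and no second family of initializer macros is required.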
* [PATCH v3 01/22] fsck.h: update FSCK_OPTIONS_* for object_name 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason 2021-02-18 22:19 ` Junio C Hamano 2021-03-06 11:04 ` [PATCH v3 00/22] fsck: API improvements Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason ` (20 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Add the object_name member to the initialization macro. This was omitted in 7b35efd734e (fsck_walk(): optionally name objects on the go, 2016-07-17) when the field was added. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fsck.h b/fsck.h index 733378f1260..2274843ba0c 100644 --- a/fsck.h +++ b/fsck.h @@ -43,8 +43,8 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } /* descend in all linked child objects * the return value is: -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (2 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 01/22] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 03/22] fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason ` (19 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/fsck.h b/fsck.h index 2274843ba0c..40f3cb3f645 100644 --- a/fsck.h +++ b/fsck.h @@ -43,8 +43,22 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } +#define FSCK_OPTIONS_DEFAULT { \ + .walk = NULL, \ + .error_func = fsck_error_function, \ + .strict = 0, \ + .msg_type = NULL, \ + .skiplist = OIDSET_INIT, \ + .object_names = NULL, \ +} +#define FSCK_OPTIONS_STRICT { \ + .walk = NULL, \ + .error_func = fsck_error_function, \ + .strict = 1, \ + .msg_type = NULL, \ + .skiplist = OIDSET_INIT, \ + .object_names = NULL, \ +} /* descend in all linked child objects * the return value is: -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 03/22] fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (3 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 02/22] fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 04/22] fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro Ævar Arnfjörð Bjarmason ` (18 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Use a temporary macro to define what FSCK_OPTIONS_{DEFAULT,STRICT} have in common, and define the two in terms of that macro. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/fsck.h b/fsck.h index 40f3cb3f645..ea3a907ec3b 100644 --- a/fsck.h +++ b/fsck.h @@ -43,22 +43,14 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { \ +#define FSCK_OPTIONS_COMMON \ .walk = NULL, \ .error_func = fsck_error_function, \ - .strict = 0, \ .msg_type = NULL, \ .skiplist = OIDSET_INIT, \ - .object_names = NULL, \ -} -#define FSCK_OPTIONS_STRICT { \ - .walk = NULL, \ - .error_func = fsck_error_function, \ - .strict = 1, \ - .msg_type = NULL, \ - .skiplist = OIDSET_INIT, \ - .object_names = NULL, \ -} + .object_names = NULL, +#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON } +#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON } /* descend in all linked child objects * the return value is: -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
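The macro pattern in this patch can be tried outside git with a reduced struct. The following is an illustrative sketch only: `struct demo_options` and the `DEMO_` macros are hypothetical and carry fewer members than the real `struct fsck_options`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced options struct. */
struct demo_options {
	void *walk;
	int strict;
	int *msg_type;
	void *object_names;
};

/*
 * Shared designated initializers. The trailing comma lets the macro be
 * placed last inside the braces; C99 permits a trailing comma there.
 */
#define DEMO_OPTIONS_COMMON \
	.walk = NULL, \
	.msg_type = NULL, \
	.object_names = NULL,

#define DEMO_OPTIONS_DEFAULT { .strict = 0, DEMO_OPTIONS_COMMON }
#define DEMO_OPTIONS_STRICT  { .strict = 1, DEMO_OPTIONS_COMMON }

static const struct demo_options demo_default = DEMO_OPTIONS_DEFAULT;
static const struct demo_options demo_strict = DEMO_OPTIONS_STRICT;
```

Because designated initializers may appear in any order, the member that differs (.strict) can come first and the shared members can follow from the common macro.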
* [PATCH v3 04/22] fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (4 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 03/22] fsck.h: reduce duplication between FSCK_OPTIONS_{DEFAULT,STRICT} Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 05/22] fsck.h: indent arguments to of fsck_set_msg_type Ævar Arnfjörð Bjarmason ` (17 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro for those that would like to use FSCK_OPTIONS_COMMON in their own initialization, but supply their own error functions. Nothing is being changed to use this yet, but in some subsequent commits we'll make use of this macro. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/fsck.h b/fsck.h index ea3a907ec3b..dc35924cbf5 100644 --- a/fsck.h +++ b/fsck.h @@ -45,12 +45,15 @@ struct fsck_options { #define FSCK_OPTIONS_COMMON \ .walk = NULL, \ - .error_func = fsck_error_function, \ .msg_type = NULL, \ .skiplist = OIDSET_INIT, \ .object_names = NULL, -#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON } -#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON } +#define FSCK_OPTIONS_COMMON_ERROR_FUNC \ + FSCK_OPTIONS_COMMON \ + .error_func = fsck_error_function + +#define FSCK_OPTIONS_DEFAULT { .strict = 0, FSCK_OPTIONS_COMMON_ERROR_FUNC } +#define FSCK_OPTIONS_STRICT { .strict = 1, FSCK_OPTIONS_COMMON_ERROR_FUNC } /* descend in all linked child objects * the return value is: -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 05/22] fsck.h: indent arguments to of fsck_set_msg_type 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (5 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 04/22] fsck.h: add a FSCK_OPTIONS_COMMON_ERROR_FUNC macro Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 06/22] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason ` (16 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fsck.h b/fsck.h index dc35924cbf5..5e488cef6b3 100644 --- a/fsck.h +++ b/fsck.h @@ -11,7 +11,7 @@ struct fsck_options; struct object; void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type); + const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); int is_valid_msg_type(const char *msg_id, const char *msg_type); -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 06/22] fsck.h: use "enum object_type" instead of "int" 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (6 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 05/22] fsck.h: indent arguments to of fsck_set_msg_type Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 07/22] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason ` (15 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in 355885d5315 (add generic, type aware object chain walker, 2008-02-25) we've used entries from object_type (OBJ_BLOB etc.). So this doesn't really change anything as far as the generated code is concerned, it just gives the compiler more information and makes this easier to read. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 3 ++- builtin/index-pack.c | 3 ++- builtin/unpack-objects.c | 3 ++- fsck.h | 3 ++- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 821e7798c70..68f0329e69e 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -197,7 +197,8 @@ static int traverse_reachable(void) return !!result; } -static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_used(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options) { if (!obj) return 1; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index bad57488079..69f24fe9f76 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -212,7 +212,8 @@ static void cleanup_thread(void) free(thread_data); } -static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_link(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { if (!obj) return -1; diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index dd4a75e030d..ca54fd16688 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -187,7 +187,8 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf) * that have reachability requirements and calls this function. * Verify its reachability and validity recursively and write it out. 
*/ -static int check_object(struct object *obj, int type, void *data, struct fsck_options *options) +static int check_object(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { struct obj_buffer *obj_buf; diff --git a/fsck.h b/fsck.h index 5e488cef6b3..f67edd8f1f9 100644 --- a/fsck.h +++ b/fsck.h @@ -23,7 +23,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type); * <0 error signaled and abort * >0 error signaled and do not abort */ -typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options); +typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options); /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 07/22] fsck.c: rename variables in fsck_set_msg_type() for less confusion 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (7 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 06/22] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 08/22] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason ` (14 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename variables in a function added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22). It was needlessly confusing that it took a "msg_type" argument, but then later declared another "msg_type" of a different type. Let's rename that to "severity", and rename "id" to "msg_id" and "msg_id" to "msg_id_str" etc. This will make a follow-up change smaller. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/fsck.c b/fsck.c index e3030f3b358..0a9ac9ca070 100644 --- a/fsck.c +++ b/fsck.c @@ -203,27 +203,27 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) } void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type) + const char *msg_id_str, const char *msg_type_str) { - int id = parse_msg_id(msg_id), type; + int msg_id = parse_msg_id(msg_id_str), msg_type; - if (id < 0) - die("Unhandled message id: %s", msg_id); - type = parse_msg_type(msg_type); + if (msg_id < 0) + die("Unhandled message id: %s", msg_id_str); + msg_type = parse_msg_type(msg_type_str); - if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL) - die("Cannot demote %s to %s", msg_id, msg_type); + if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) + die("Cannot demote %s to %s", msg_id_str, msg_type_str); if (!options->msg_type) { int i; - int *msg_type; - ALLOC_ARRAY(msg_type, FSCK_MSG_MAX); + int *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) - msg_type[i] = fsck_msg_type(i, options); - options->msg_type = msg_type; + severity[i] = fsck_msg_type(i, options); + options->msg_type = severity; } - options->msg_type[id] = type; + options->msg_type[msg_id] = msg_type; } void fsck_set_msg_types(struct fsck_options *options, const char *values) -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 08/22] fsck.c: move definition of msg_id into append_msg_id() 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (8 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 07/22] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 09/22] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason ` (13 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Refactor code added in 71ab8fa840f (fsck: report the ID of the error/warning, 2015-06-22) to resolve the msg_id to a string in the function that wants it, instead of doing it in report(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 0a9ac9ca070..b977493f57a 100644 --- a/fsck.c +++ b/fsck.c @@ -264,8 +264,9 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); } -static void append_msg_id(struct strbuf *sb, const char *msg_id) +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) { + const char *msg_id = msg_id_info[id].id_string; for (;;) { char c = *(msg_id)++; @@ -308,7 +309,7 @@ static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, msg_id_info[id].id_string); + append_msg_id(&sb, id); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 09/22] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (9 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 08/22] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 10/22] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Ævar Arnfjörð Bjarmason ` (12 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename the remaining variables of type fsck_msg_id from "id" to "msg_id". This change is relatively small, and is worth the churn for a later change where we have different id's in the "report" function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/fsck.c b/fsck.c index b977493f57a..6b72ddaa51d 100644 --- a/fsck.c +++ b/fsck.c @@ -264,19 +264,19 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); } -static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id msg_id) { - const char *msg_id = msg_id_info[id].id_string; + const char *msg_id_str = msg_id_info[msg_id].id_string; for (;;) { - char c = *(msg_id)++; + char c = *(msg_id_str)++; if (!c) break; if (c != '_') strbuf_addch(sb, tolower(c)); else { - assert(*msg_id); - strbuf_addch(sb, *(msg_id)++); + assert(*msg_id_str); + strbuf_addch(sb, *(msg_id_str)++); } } @@ -292,11 +292,11 @@ static int object_on_skiplist(struct fsck_options *opts, __attribute__((format (printf, 5, 6))) static int report(struct fsck_options *options, const struct object_id *oid, enum object_type 
object_type, - enum fsck_msg_id id, const char *fmt, ...) + enum fsck_msg_id msg_id, const char *fmt, ...) { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(id, options), result; + int msg_type = fsck_msg_type(msg_id, options), result; if (msg_type == FSCK_IGNORE) return 0; @@ -309,7 +309,7 @@ static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, id); + append_msg_id(&sb, msg_id); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
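The loop in append_msg_id() turns an upper-case id string such as GITMODULES_MISSING into the camelCase form seen in error output (gitmodulesMissing). A stand-alone sketch of that conversion follows; it writes into a plain char buffer instead of a strbuf and leaves out whatever the real function appends after the loop, and the name `demo_camel_case` is made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Lower-case every character, except that an underscore is dropped and
 * the character following it is copied unchanged (so it stays upper-case).
 * Output is truncated if the buffer is too small.
 */
static void demo_camel_case(const char *id_string, char *out, size_t out_size)
{
	size_t n = 0;

	assert(out_size > 0);
	while (*id_string && n + 1 < out_size) {
		char c = *id_string++;

		if (c != '_') {
			out[n++] = (char)tolower((unsigned char)c);
		} else {
			/* An id string must not end in an underscore. */
			assert(*id_string);
			out[n++] = *id_string++;
		}
	}
	out[n] = '\0';
}
```

Calling `demo_camel_case("GITMODULES_MISSING", buf, sizeof(buf))` with a sufficiently large buffer leaves "gitmodulesMissing" in `buf`.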
* [PATCH v3 10/22] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (10 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 09/22] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 11/22] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason ` (11 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Refactor "if options->msg_type" and other code added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) to reduce the scope of the "int msg_type" variable. This is in preparation for changing its type in a subsequent commit; only using it in the "!options->msg_type" scope makes that change easier. This also brings the code in line with the fsck_set_msg_type() function (also added in 0282f4dced0), which does a similar check for "!options->msg_type". Another minor benefit is getting rid of the style violation of not having braces for the body of the "if".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/fsck.c b/fsck.c index 6b72ddaa51d..0988ab65792 100644 --- a/fsck.c +++ b/fsck.c @@ -167,19 +167,17 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) static int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { - int msg_type; - assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); - if (options->msg_type) - msg_type = options->msg_type[msg_id]; - else { - msg_type = msg_id_info[msg_id].msg_type; + if (!options->msg_type) { + int msg_type = msg_id_info[msg_id].msg_type; + if (options->strict && msg_type == FSCK_WARN) msg_type = FSCK_ERROR; + return msg_type; } - return msg_type; + return options->msg_type[msg_id]; } static int parse_msg_type(const char *str) -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
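The shape of the refactored lookup can be sketched in isolation as follows. This is an illustration under assumptions, not the fsck.c code: every demo_/DEMO_ name is hypothetical, and the defaults table holds just two ids.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the real fsck types. */
enum demo_msg_type { DEMO_ERROR = 1, DEMO_WARN, DEMO_IGNORE };
enum demo_msg_id { DEMO_MSG_BAD_DATE, DEMO_MSG_HAS_DOT, DEMO_MSG_MAX };

struct demo_options {
	unsigned strict:1;
	int *msg_type;	/* NULL until a severity has been overridden */
};

/* Built-in default severity for each message id. */
static const int demo_defaults[DEMO_MSG_MAX] = {
	[DEMO_MSG_BAD_DATE] = DEMO_ERROR,
	[DEMO_MSG_HAS_DOT] = DEMO_WARN,
};

static int demo_msg_type(enum demo_msg_id msg_id, struct demo_options *options)
{
	assert(msg_id < DEMO_MSG_MAX);

	if (!options->msg_type) {
		/* The temporary only lives in the "no overrides" branch. */
		int msg_type = demo_defaults[msg_id];

		if (options->strict && msg_type == DEMO_WARN)
			msg_type = DEMO_ERROR;
		return msg_type;
	}

	return options->msg_type[msg_id];
}
```

Note that "strict" only affects the defaults branch: once an override table exists, the lookup returns its entry unchanged.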
* [PATCH v3 11/22] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (11 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 10/22] fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 12/22] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason ` (10 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new fsck_msg_type enum. These defines were originally introduced in: - ba002f3b28a (builtin-fsck: move common object checking code to fsck.c, 2008-02-25) - f50c4407305 (fsck: disallow demoting grave fsck errors to warnings, 2015-06-22) - efaba7cc77f (fsck: optionally ignore specific fsck issues completely, 2015-06-22) - f27d05b1704 (fsck: allow upgrading fsck warnings to errors, 2015-06-22) The reason these were defined in two different places is because we use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are used by external callbacks. Untangling that would take some more work, since we expose the new "enum fsck_msg_type" to both. Similar to "enum object_type" it's not worth structuring the API in such a way that only those who need FSCK_{ERROR,WARN} pass around a different type. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 2 +- builtin/index-pack.c | 3 ++- builtin/mktag.c | 3 ++- fsck.c | 21 ++++++++++----------- fsck.h | 16 ++++++++++------ 5 files changed, 25 insertions(+), 20 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 68f0329e69e..d6d745dc702 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -89,7 +89,7 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 69f24fe9f76..56b8efaa89b 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1716,7 +1716,8 @@ static void show_pack_info(int stat_only) static int print_dangling_gitmodules(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { /* * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it diff --git a/builtin/mktag.c b/builtin/mktag.c index 41a399a69e4..1834394a9b6 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -22,7 +22,8 @@ static int mktag_config(const char *var, const char *value, void *cb) static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/fsck.c b/fsck.c index 0988ab65792..fb7d071bbf9 100644 --- a/fsck.c +++ b/fsck.c @@ -22,9 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FSCK_FATAL -1 -#define FSCK_INFO -2 - #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ @@ -97,7 +94,7 
@@ static struct { const char *id_string; const char *downcased; const char *camelcased; - int msg_type; + enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { FOREACH_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } @@ -164,13 +161,13 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) list_config_item(list, prefix, msg_id_info[i].camelcased); } -static int fsck_msg_type(enum fsck_msg_id msg_id, +static enum fsck_msg_type fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); if (!options->msg_type) { - int msg_type = msg_id_info[msg_id].msg_type; + enum fsck_msg_type msg_type = msg_id_info[msg_id].msg_type; if (options->strict && msg_type == FSCK_WARN) msg_type = FSCK_ERROR; @@ -180,7 +177,7 @@ static int fsck_msg_type(enum fsck_msg_id msg_id, return options->msg_type[msg_id]; } -static int parse_msg_type(const char *str) +static enum fsck_msg_type parse_msg_type(const char *str) { if (!strcmp(str, "error")) return FSCK_ERROR; @@ -203,7 +200,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { - int msg_id = parse_msg_id(msg_id_str), msg_type; + int msg_id = parse_msg_id(msg_id_str); + enum fsck_msg_type msg_type; if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); @@ -214,7 +212,7 @@ void fsck_set_msg_type(struct fsck_options *options, if (!options->msg_type) { int i; - int *severity; + enum fsck_msg_type *severity; ALLOC_ARRAY(severity, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) severity[i] = fsck_msg_type(i, options); @@ -294,7 +292,8 @@ static int report(struct fsck_options *options, { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(msg_id, options), result; + enum fsck_msg_type msg_type = fsck_msg_type(msg_id, options); + int result; if (msg_type == FSCK_IGNORE) return 0; @@ -1265,7 +1264,7 @@ int 
fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index f67edd8f1f9..2ecc15eee77 100644 --- a/fsck.h +++ b/fsck.h @@ -3,9 +3,13 @@ #include "oidset.h" -#define FSCK_ERROR 1 -#define FSCK_WARN 2 -#define FSCK_IGNORE 3 +enum fsck_msg_type { + FSCK_INFO = -2, + FSCK_FATAL = -1, + FSCK_ERROR = 1, + FSCK_WARN, + FSCK_IGNORE +}; struct fsck_options; struct object; @@ -29,17 +33,17 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); struct fsck_options { fsck_walk_func walk; fsck_error error_func; unsigned strict:1; - int *msg_type; + enum fsck_msg_type *msg_type; struct oidset skiplist; kh_oid_map_t *object_names; }; -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 12/22] fsck.h: re-order and re-assign "enum fsck_msg_type" 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (12 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 11/22] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 13/22] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason ` (9 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the values in the "enum fsck_msg_type" from being manually assigned to using default C enum values. This means we end up with a FSCK_IGNORE=0, which was previously defined as "3". I'm confident that nothing relies on these values; we always compare them explicitly. Let's not omit "0" so it won't be assumed that we're using these as a boolean somewhere. This also allows us to re-order the members to mark which are "private" vs. "public". See the preceding commit for a rationale for not simply splitting these into two enums, namely that this is used for both the private and public fsck API. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fsck.h b/fsck.h index 2ecc15eee77..fce9981a0cb 100644 --- a/fsck.h +++ b/fsck.h @@ -4,11 +4,13 @@ #include "oidset.h" enum fsck_msg_type { - FSCK_INFO = -2, - FSCK_FATAL = -1, - FSCK_ERROR = 1, + /* for internal use only */ + FSCK_IGNORE, + FSCK_INFO, + FSCK_FATAL, + /* "public", fed to e.g. error_func callbacks */ + FSCK_ERROR, FSCK_WARN, - FSCK_IGNORE }; struct fsck_options; -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
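To make the point about default enum values concrete, here is a small sketch that copies the new layout under a DEMO_ prefix. The helper demo_is_reported() is hypothetical; it only shows the "compare explicitly" style the commit message relies on.

```c
#include <assert.h>

/* Copy of the layout from the patch; the DEMO_ prefix marks it as such. */
enum demo_msg_type {
	/* for internal use only */
	DEMO_IGNORE,	/* 0 */
	DEMO_INFO,	/* 1 */
	DEMO_FATAL,	/* 2 */
	/* "public" */
	DEMO_ERROR,	/* 3 */
	DEMO_WARN,	/* 4 */
};

/*
 * Hypothetical helper: compare against a named constant instead of
 * testing truthiness, so the numeric values never matter.
 */
static int demo_is_reported(enum demo_msg_type t)
{
	return t != DEMO_IGNORE;
}
```

Code written as "if (msg_type)" would silently treat DEMO_IGNORE as false under this numbering, which is why explicit comparisons are the safer convention.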
* [PATCH v3 13/22] fsck.c: call parse_msg_type() early in fsck_set_msg_type() 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (13 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 12/22] fsck.h: re-order and re-assign "enum fsck_msg_type" Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 14/22] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason ` (8 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason There's no reason to defer calling parse_msg_type() until after we've checked whether "msg_id < 0". This is not a hot codepath, and parse_msg_type() itself may die on invalid input. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index fb7d071bbf9..2ccf1a2f0fd 100644 --- a/fsck.c +++ b/fsck.c @@ -201,11 +201,10 @@ void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { int msg_id = parse_msg_id(msg_id_str); - enum fsck_msg_type msg_type; + enum fsck_msg_type msg_type = parse_msg_type(msg_type_str); if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); - msg_type = parse_msg_type(msg_type_str); if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 14/22] fsck.c: undefine temporary STR macro after use 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (14 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 13/22] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 15/22] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason ` (7 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason In f417eed8cde (fsck: provide a function to parse fsck message IDs, 2015-06-22) the "STR" macro was introduced, but that short macro name was not undefined after use as was done earlier in the same series for the MSG_ID macro in c99ba492f1c (fsck: introduce identifiers for fsck messages, 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fsck.c b/fsck.c index 2ccf1a2f0fd..f4c924ed044 100644 --- a/fsck.c +++ b/fsck.c @@ -100,6 +100,7 @@ static struct { { NULL, NULL, NULL, -1 } }; #undef MSG_ID +#undef STR static void prepare_msg_ids(void) { -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 15/22] fsck.c: give "FOREACH_MSG_ID" a more specific name 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (15 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 14/22] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 16/22] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h Ævar Arnfjörð Bjarmason ` (6 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation for moving it over to fsck.h. It's good convention to name macros in *.h files in such a way as to clearly not clash with any other names in other files. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fsck.c b/fsck.c index f4c924ed044..6fbc56e9faa 100644 --- a/fsck.c +++ b/fsck.c @@ -22,7 +22,7 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_MSG_ID(FUNC) \ +#define FOREACH_FSCK_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ FUNC(UNTERMINATED_HEADER, FATAL) \ @@ -83,7 +83,7 @@ static struct oidset gitmodules_done = OIDSET_INIT; #define MSG_ID(id, msg_type) FSCK_MSG_##id, enum fsck_msg_id { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) FSCK_MSG_MAX }; #undef MSG_ID @@ -96,7 +96,7 @@ static struct { const char *camelcased; enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } }; #undef MSG_ID -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 16/22] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (16 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 15/22] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 17/22] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason ` (5 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps define from fsck.c to fsck.h. This is in preparation for having non-static functions take the fsck_msg_id as an argument. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 66 ---------------------------------------------------------- fsck.h | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 66 insertions(+), 66 deletions(-) diff --git a/fsck.c b/fsck.c index 6fbc56e9faa..8a66168e516 100644 --- a/fsck.c +++ b/fsck.c @@ -22,72 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_FSCK_MSG_ID(FUNC) \ - /* fatal errors */ \ - FUNC(NUL_IN_HEADER, FATAL) \ - FUNC(UNTERMINATED_HEADER, FATAL) \ - /* errors */ \ - FUNC(BAD_DATE, ERROR) \ - FUNC(BAD_DATE_OVERFLOW, ERROR) \ - FUNC(BAD_EMAIL, ERROR) \ - FUNC(BAD_NAME, ERROR) \ - FUNC(BAD_OBJECT_SHA1, ERROR) \ - FUNC(BAD_PARENT_SHA1, ERROR) \ - FUNC(BAD_TAG_OBJECT, ERROR) \ - FUNC(BAD_TIMEZONE, ERROR) \ - FUNC(BAD_TREE, ERROR) \ - FUNC(BAD_TREE_SHA1, ERROR) \ - FUNC(BAD_TYPE, ERROR) \ - FUNC(DUPLICATE_ENTRIES, ERROR) \ - FUNC(MISSING_AUTHOR, ERROR) \ - FUNC(MISSING_COMMITTER, ERROR) \ - 
FUNC(MISSING_EMAIL, ERROR) \ - FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ - FUNC(MISSING_OBJECT, ERROR) \ - FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ - FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ - FUNC(MISSING_TAG, ERROR) \ - FUNC(MISSING_TAG_ENTRY, ERROR) \ - FUNC(MISSING_TREE, ERROR) \ - FUNC(MISSING_TREE_OBJECT, ERROR) \ - FUNC(MISSING_TYPE, ERROR) \ - FUNC(MISSING_TYPE_ENTRY, ERROR) \ - FUNC(MULTIPLE_AUTHORS, ERROR) \ - FUNC(TREE_NOT_SORTED, ERROR) \ - FUNC(UNKNOWN_TYPE, ERROR) \ - FUNC(ZERO_PADDED_DATE, ERROR) \ - FUNC(GITMODULES_MISSING, ERROR) \ - FUNC(GITMODULES_BLOB, ERROR) \ - FUNC(GITMODULES_LARGE, ERROR) \ - FUNC(GITMODULES_NAME, ERROR) \ - FUNC(GITMODULES_SYMLINK, ERROR) \ - FUNC(GITMODULES_URL, ERROR) \ - FUNC(GITMODULES_PATH, ERROR) \ - FUNC(GITMODULES_UPDATE, ERROR) \ - /* warnings */ \ - FUNC(BAD_FILEMODE, WARN) \ - FUNC(EMPTY_NAME, WARN) \ - FUNC(FULL_PATHNAME, WARN) \ - FUNC(HAS_DOT, WARN) \ - FUNC(HAS_DOTDOT, WARN) \ - FUNC(HAS_DOTGIT, WARN) \ - FUNC(NULL_SHA1, WARN) \ - FUNC(ZERO_PADDED_FILEMODE, WARN) \ - FUNC(NUL_IN_COMMIT, WARN) \ - /* infos (reported as warnings, but ignored by default) */ \ - FUNC(GITMODULES_PARSE, INFO) \ - FUNC(BAD_TAG_NAME, INFO) \ - FUNC(MISSING_TAGGER_ENTRY, INFO) \ - /* ignored (elevated when requested) */ \ - FUNC(EXTRA_HEADER_ENTRY, IGNORE) - -#define MSG_ID(id, msg_type) FSCK_MSG_##id, -enum fsck_msg_id { - FOREACH_FSCK_MSG_ID(MSG_ID) - FSCK_MSG_MAX -}; -#undef MSG_ID - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { diff --git a/fsck.h b/fsck.h index fce9981a0cb..c3d3b47b88b 100644 --- a/fsck.h +++ b/fsck.h @@ -13,6 +13,72 @@ enum fsck_msg_type { FSCK_WARN, }; +#define FOREACH_FSCK_MSG_ID(FUNC) \ + /* fatal errors */ \ + FUNC(NUL_IN_HEADER, FATAL) \ + FUNC(UNTERMINATED_HEADER, FATAL) \ + /* errors */ \ + FUNC(BAD_DATE, ERROR) \ + FUNC(BAD_DATE_OVERFLOW, ERROR) \ + FUNC(BAD_EMAIL, ERROR) \ + FUNC(BAD_NAME, ERROR) \ + FUNC(BAD_OBJECT_SHA1, ERROR) \ + 
FUNC(BAD_PARENT_SHA1, ERROR) \ + FUNC(BAD_TAG_OBJECT, ERROR) \ + FUNC(BAD_TIMEZONE, ERROR) \ + FUNC(BAD_TREE, ERROR) \ + FUNC(BAD_TREE_SHA1, ERROR) \ + FUNC(BAD_TYPE, ERROR) \ + FUNC(DUPLICATE_ENTRIES, ERROR) \ + FUNC(MISSING_AUTHOR, ERROR) \ + FUNC(MISSING_COMMITTER, ERROR) \ + FUNC(MISSING_EMAIL, ERROR) \ + FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_OBJECT, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_TAG, ERROR) \ + FUNC(MISSING_TAG_ENTRY, ERROR) \ + FUNC(MISSING_TREE, ERROR) \ + FUNC(MISSING_TREE_OBJECT, ERROR) \ + FUNC(MISSING_TYPE, ERROR) \ + FUNC(MISSING_TYPE_ENTRY, ERROR) \ + FUNC(MULTIPLE_AUTHORS, ERROR) \ + FUNC(TREE_NOT_SORTED, ERROR) \ + FUNC(UNKNOWN_TYPE, ERROR) \ + FUNC(ZERO_PADDED_DATE, ERROR) \ + FUNC(GITMODULES_MISSING, ERROR) \ + FUNC(GITMODULES_BLOB, ERROR) \ + FUNC(GITMODULES_LARGE, ERROR) \ + FUNC(GITMODULES_NAME, ERROR) \ + FUNC(GITMODULES_SYMLINK, ERROR) \ + FUNC(GITMODULES_URL, ERROR) \ + FUNC(GITMODULES_PATH, ERROR) \ + FUNC(GITMODULES_UPDATE, ERROR) \ + /* warnings */ \ + FUNC(BAD_FILEMODE, WARN) \ + FUNC(EMPTY_NAME, WARN) \ + FUNC(FULL_PATHNAME, WARN) \ + FUNC(HAS_DOT, WARN) \ + FUNC(HAS_DOTDOT, WARN) \ + FUNC(HAS_DOTGIT, WARN) \ + FUNC(NULL_SHA1, WARN) \ + FUNC(ZERO_PADDED_FILEMODE, WARN) \ + FUNC(NUL_IN_COMMIT, WARN) \ + /* infos (reported as warnings, but ignored by default) */ \ + FUNC(GITMODULES_PARSE, INFO) \ + FUNC(BAD_TAG_NAME, INFO) \ + FUNC(MISSING_TAGGER_ENTRY, INFO) \ + /* ignored (elevated when requested) */ \ + FUNC(EXTRA_HEADER_ENTRY, IGNORE) + +#define MSG_ID(id, msg_type) FSCK_MSG_##id, +enum fsck_msg_id { + FOREACH_FSCK_MSG_ID(MSG_ID) + FSCK_MSG_MAX +}; +#undef MSG_ID + struct fsck_options; struct object; -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
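The FOREACH_FSCK_MSG_ID list moved above is an "X-macro": one list that is expanded several times with different definitions of the per-entry macro. A cut-down sketch of the pattern, with three entries and hypothetical DEMO_ names, might look like this:

```c
#include <assert.h>
#include <string.h>

enum demo_msg_type { DEMO_IGNORE, DEMO_FATAL, DEMO_ERROR, DEMO_WARN };

/* One list, expanded several times with different FUNC definitions. */
#define FOREACH_DEMO_MSG_ID(FUNC) \
	FUNC(NUL_IN_HEADER, FATAL) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(HAS_DOT, WARN)

/* Expansion 1: the enum of message ids (what would live in the header). */
#define MSG_ID(id, msg_type) DEMO_MSG_##id,
enum demo_msg_id {
	FOREACH_DEMO_MSG_ID(MSG_ID)
	DEMO_MSG_MAX
};
#undef MSG_ID

/* Expansion 2: a parallel table of names and default severities. */
#define STR(x) #x
#define MSG_ID(id, msg_type) { STR(id), DEMO_##msg_type },
static const struct {
	const char *id_string;
	enum demo_msg_type msg_type;
} demo_msg_id_info[DEMO_MSG_MAX] = {
	FOREACH_DEMO_MSG_ID(MSG_ID)
};
#undef MSG_ID
#undef STR	/* short helper names should not leak past their use */
```

Because the enum and the table come from the same list, the table index for DEMO_MSG_BAD_DATE always lines up with the "BAD_DATE" entry, and adding a message means touching one line only.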
* [PATCH v3 17/22] fsck.c: pass along the fsck_msg_id in the fsck_error callback 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (17 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 16/22] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 18/22] fsck.c: add an fsck_set_msg_type() API that takes enums Ævar Arnfjörð Bjarmason ` (4 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the fsck_error callback to also pass along the fsck_msg_id. Before this change the only way to get the message id was to parse it back out of the "message". Let's pass it down explicitly for the benefit of callers that might want to use it, as discussed in [1]. Passing the msg_type is now redundant, as you can always get it back from the msg_id, but I'm not changing that convention. It's really common to need the msg_type, and the report() function itself (which calls "fsck_error") needs to call fsck_msg_type() to discover it. Let's not needlessly re-do that work in the user callback. 1. 
https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 4 +++- builtin/index-pack.c | 3 ++- builtin/mktag.c | 1 + fsck.c | 6 ++++-- fsck.h | 6 ++++-- 5 files changed, 14 insertions(+), 6 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index d6d745dc702..b71fac4ceca 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -89,7 +89,9 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message) + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 56b8efaa89b..2b2266a4b7d 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1717,6 +1717,7 @@ static int print_dangling_gitmodules(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, const char *message) { /* @@ -1727,7 +1728,7 @@ static int print_dangling_gitmodules(struct fsck_options *o, printf("%s\n", oid_to_hex(oid)); return 0; } - return fsck_error_function(o, oid, object_type, msg_type, message); + return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); } int cmd_index_pack(int argc, const char **argv, const char *prefix) diff --git a/builtin/mktag.c b/builtin/mktag.c index 1834394a9b6..dc989c356f5 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -23,6 +23,7 @@ static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, const char *message) { switch (msg_type) { diff --git a/fsck.c b/fsck.c index 8a66168e516..5a040eb4fd5 100644 --- a/fsck.c +++ b/fsck.c @@ -245,7 +245,7 @@ static int report(struct fsck_options 
*options, va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); result = options->error_func(options, oid, object_type, - msg_type, sb.buf); + msg_type, msg_id, sb.buf); strbuf_release(&sb); va_end(ap); @@ -1198,7 +1198,9 @@ int fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message) + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index c3d3b47b88b..33ecf3f3f16 100644 --- a/fsck.h +++ b/fsck.h @@ -101,11 +101,13 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message); + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_type msg_type, const char *message); + enum fsck_msg_type msg_type, enum fsck_msg_id msg_id, + const char *message); struct fsck_options { fsck_walk_func walk; -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
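As an illustration of what the extra argument allows, a callback could act on one specific message id without matching on the message text. The sketch below is hypothetical throughout (reduced signature, DEMO_ enums, invented return convention) and is not the index-pack code.

```c
#include <assert.h>
#include <stdio.h>

enum demo_msg_type { DEMO_ERROR, DEMO_WARN };
enum demo_msg_id {
	DEMO_MSG_BAD_DATE,
	DEMO_MSG_GITMODULES_MISSING,
	DEMO_MSG_MAX
};

/* Reduced version of the callback signature: type, id, then the message. */
typedef int (*demo_error_fn)(enum demo_msg_type msg_type,
			     enum demo_msg_id msg_id,
			     const char *message);

/* Hypothetical callback: tolerate one message id, report everything else. */
static int demo_error_func(enum demo_msg_type msg_type,
			   enum demo_msg_id msg_id,
			   const char *message)
{
	if (msg_id == DEMO_MSG_GITMODULES_MISSING)
		return 0;	/* no string matching on "message" needed */

	fprintf(stderr, "%s: %s\n",
		msg_type == DEMO_WARN ? "warning" : "error", message);
	return msg_type == DEMO_ERROR;
}
```

Keeping msg_type in the signature, as the commit message argues, spares the callback from looking the severity up again.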
* [PATCH v3 18/22] fsck.c: add an fsck_set_msg_type() API that takes enums 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (18 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 17/22] fsck.c: pass along the fsck_msg_id in the fsck_error callback Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 19/22] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason ` (3 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change code I added in acf9de4c94e (mktag: use fsck instead of custom verify_tag(), 2021-01-05) to make use of a new API function that takes the fsck_msg_{id,type} types, instead of arbitrary strings that we'll (hopefully) parse into those types. At the time that the fsck_set_msg_type() API was introduced in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) it was only intended to be used to parse user-supplied data. For things that are purely internal to the C code it makes sense to have the compiler check these arguments, and to skip the sanity checking of the data in fsck_set_msg_type() which is redundant to checks we get from the compiler. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/mktag.c | 3 ++- fsck.c | 27 +++++++++++++++++---------- fsck.h | 3 +++ 3 files changed, 22 insertions(+), 11 deletions(-) diff --git a/builtin/mktag.c b/builtin/mktag.c index dc989c356f5..de67a94f24e 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -93,7 +93,8 @@ int cmd_mktag(int argc, const char **argv, const char *prefix) die_errno(_("could not read from stdin")); fsck_options.error_func = mktag_fsck_error_func; - fsck_set_msg_type(&fsck_options, "extraheaderentry", "warn"); + fsck_set_msg_type_from_ids(&fsck_options, FSCK_MSG_EXTRA_HEADER_ENTRY, + FSCK_WARN); /* config might set fsck.extraHeaderEntry=* again */ git_config(mktag_config, NULL); if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options, diff --git a/fsck.c b/fsck.c index 5a040eb4fd5..f26f47b2a10 100644 --- a/fsck.c +++ b/fsck.c @@ -132,6 +132,22 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) return 1; } +void fsck_set_msg_type_from_ids(struct fsck_options *options, + enum fsck_msg_id msg_id, + enum fsck_msg_type msg_type) +{ + if (!options->msg_type) { + int i; + enum fsck_msg_type *severity; + ALLOC_ARRAY(severity, FSCK_MSG_MAX); + for (i = 0; i < FSCK_MSG_MAX; i++) + severity[i] = fsck_msg_type(i, options); + options->msg_type = severity; + } + + options->msg_type[msg_id] = msg_type; +} + void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { @@ -144,16 +160,7 @@ void fsck_set_msg_type(struct fsck_options *options, if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); - if (!options->msg_type) { - int i; - enum fsck_msg_type *severity; - ALLOC_ARRAY(severity, FSCK_MSG_MAX); - for (i = 0; i < FSCK_MSG_MAX; i++) - severity[i] = fsck_msg_type(i, options); - options->msg_type = severity; - } - - options->msg_type[msg_id] = msg_type; + fsck_set_msg_type_from_ids(options, msg_id, 
msg_type); } void fsck_set_msg_types(struct fsck_options *options, const char *values) diff --git a/fsck.h b/fsck.h index 33ecf3f3f16..6c2fd9c5cc0 100644 --- a/fsck.h +++ b/fsck.h @@ -82,6 +82,9 @@ enum fsck_msg_id { struct fsck_options; struct object; +void fsck_set_msg_type_from_ids(struct fsck_options *options, + enum fsck_msg_id msg_id, + enum fsck_msg_type msg_type); void fsck_set_msg_type(struct fsck_options *options, const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
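The split between the string-parsing entry point and the enum-taking one rests on a lazily allocated override table. A standalone sketch of that part, assuming plain malloc() in place of ALLOC_ARRAY() and using hypothetical demo_ names, could read:

```c
#include <assert.h>
#include <stdlib.h>

enum demo_msg_type { DEMO_IGNORE, DEMO_ERROR, DEMO_WARN };
enum demo_msg_id {
	DEMO_MSG_BAD_DATE,
	DEMO_MSG_EXTRA_HEADER_ENTRY,
	DEMO_MSG_MAX
};

struct demo_options {
	enum demo_msg_type *msg_type;	/* NULL means "use the defaults" */
};

static const enum demo_msg_type demo_defaults[DEMO_MSG_MAX] = {
	[DEMO_MSG_BAD_DATE] = DEMO_ERROR,
	[DEMO_MSG_EXTRA_HEADER_ENTRY] = DEMO_IGNORE,
};

static enum demo_msg_type demo_get_msg_type(enum demo_msg_id msg_id,
					    const struct demo_options *options)
{
	return options->msg_type ? options->msg_type[msg_id]
				 : demo_defaults[msg_id];
}

/* Enum-typed setter: the compiler checks the arguments, so no parsing. */
static void demo_set_msg_type_from_ids(struct demo_options *options,
				       enum demo_msg_id msg_id,
				       enum demo_msg_type msg_type)
{
	if (!options->msg_type) {
		int i;
		enum demo_msg_type *severity =
			malloc(DEMO_MSG_MAX * sizeof(*severity));

		if (!severity)
			abort();
		/* Seed the table with the current defaults before overriding. */
		for (i = 0; i < DEMO_MSG_MAX; i++)
			severity[i] = demo_get_msg_type(i, options);
		options->msg_type = severity;
	}

	options->msg_type[msg_id] = msg_type;
}
```

A string-based wrapper would parse and validate its arguments first and then delegate to this setter, which is the structure the patch gives fsck_set_msg_type().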
* [PATCH v3 19/22] fsck.c: move gitmodules_{found,done} into fsck_options 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (19 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 18/22] fsck.c: add an fsck_set_msg_type() API that takes enums Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 20/22] fetch-pack: don't needlessly copy fsck_options Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Move the gitmodules_{found,done} static variables added in 159e7b080bf (fsck: detect gitmodules files, 2018-05-02) into the fsck_options struct. It makes sense to keep all the context in the same place. This requires changing the register_found_gitmodules() function, recently added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22), to take fsck_options. That function will be removed in a subsequent commit, but as that removal requires the new gitmodules_found member of "fsck_options", we need this intermediate step first.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fetch-pack.c | 2 +- fsck.c | 23 ++++++++++------------- fsck.h | 7 ++++++- 3 files changed, 17 insertions(+), 15 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 0cb59acc486..53d7ef00856 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -998,7 +998,7 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(oid); + register_found_gitmodules(&fo, oid); if (fsck_finish(&fo)) die("fsck failed"); } diff --git a/fsck.c b/fsck.c index f26f47b2a10..565274a946c 100644 --- a/fsck.c +++ b/fsck.c @@ -19,9 +19,6 @@ #include "credential.h" #include "help.h" -static struct oidset gitmodules_found = OIDSET_INIT; -static struct oidset gitmodules_done = OIDSET_INIT; - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { @@ -624,7 +621,7 @@ static int fsck_tree(const struct object_id *oid, if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, @@ -638,7 +635,7 @@ static int fsck_tree(const struct object_id *oid, has_dotgit |= is_ntfs_dotgit(backslash); if (is_ntfs_dotgitmodules(backslash)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, FSCK_MSG_GITMODULES_SYMLINK, @@ -1150,9 +1147,9 @@ static int fsck_blob(const struct object_id *oid, const char *buf, struct fsck_gitmodules_data data; struct config_options config_opts = { 0 }; - if (!oidset_contains(&gitmodules_found, oid)) + if (!oidset_contains(&options->gitmodules_found, oid)) return 0; - oidset_insert(&gitmodules_done, oid); + oidset_insert(&options->gitmodules_done, oid); if (object_on_skiplist(options, oid)) return 
0; @@ -1217,9 +1214,9 @@ int fsck_error_function(struct fsck_options *o, return 1; } -void register_found_gitmodules(const struct object_id *oid) +void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) { - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); } int fsck_finish(struct fsck_options *options) @@ -1228,13 +1225,13 @@ int fsck_finish(struct fsck_options *options) struct oidset_iter iter; const struct object_id *oid; - oidset_iter_init(&gitmodules_found, &iter); + oidset_iter_init(&options->gitmodules_found, &iter); while ((oid = oidset_iter_next(&iter))) { enum object_type type; unsigned long size; char *buf; - if (oidset_contains(&gitmodules_done, oid)) + if (oidset_contains(&options->gitmodules_done, oid)) continue; buf = read_object_file(oid, &type, &size); @@ -1259,8 +1256,8 @@ int fsck_finish(struct fsck_options *options) } - oidset_clear(&gitmodules_found); - oidset_clear(&gitmodules_done); + oidset_clear(&options->gitmodules_found); + oidset_clear(&options->gitmodules_done); return ret; } diff --git a/fsck.h b/fsck.h index 6c2fd9c5cc0..bb59ef05b68 100644 --- a/fsck.h +++ b/fsck.h @@ -118,6 +118,8 @@ struct fsck_options { unsigned strict:1; enum fsck_msg_type *msg_type; struct oidset skiplist; + struct oidset gitmodules_found; + struct oidset gitmodules_done; kh_oid_map_t *object_names; }; @@ -125,6 +127,8 @@ struct fsck_options { .walk = NULL, \ .msg_type = NULL, \ .skiplist = OIDSET_INIT, \ + .gitmodules_found = OIDSET_INIT, \ + .gitmodules_done = OIDSET_INIT, \ .object_names = NULL, #define FSCK_OPTIONS_COMMON_ERROR_FUNC \ FSCK_OPTIONS_COMMON \ @@ -149,7 +153,8 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); -void register_found_gitmodules(const struct object_id *oid); +void register_found_gitmodules(struct fsck_options *options, + const struct 
object_id *oid); /* * fsck a tag, and pass info about it back to the caller. This is -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
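To illustrate why keeping the found/done sets inside the options struct is useful, here is a minimal sketch in which two independent check contexts no longer share state. Everything in it is hypothetical: `struct idset` is a tiny integer set standing in for git's oidset, and `note_tree_entry`, `note_blob` and `finish` only loosely mirror what fsck_tree(), fsck_blob() and fsck_finish() do with those sets.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 8

/* Hypothetical tiny set of integer IDs, standing in for git's oidset. */
struct idset {
	int ids[MAX_IDS];
	size_t nr;
};

static int idset_contains(const struct idset *s, int id)
{
	size_t i;

	for (i = 0; i < s->nr; i++)
		if (s->ids[i] == id)
			return 1;
	return 0;
}

static void idset_insert(struct idset *s, int id)
{
	if (!idset_contains(s, id)) {
		assert(s->nr < MAX_IDS);
		s->ids[s->nr++] = id;
	}
}

/* All per-run context lives here instead of in file-scope statics. */
struct check_options {
	struct idset found; /* blobs some tree referred to as .gitmodules */
	struct idset done;  /* blobs whose contents were already checked */
};

/* A tree entry named .gitmodules points at blob "id". */
static void note_tree_entry(struct check_options *o, int id)
{
	idset_insert(&o->found, id);
}

/* A blob was seen; only blobs already known as .gitmodules are checked. */
static void note_blob(struct check_options *o, int id)
{
	if (idset_contains(&o->found, id))
		idset_insert(&o->done, id);
}

/* Count blobs that were referenced but never checked, then reset the sets. */
static size_t finish(struct check_options *o)
{
	size_t i, pending = 0;

	for (i = 0; i < o->found.nr; i++)
		if (!idset_contains(&o->done, o->found.ids[i]))
			pending++;
	o->found.nr = 0;
	o->done.nr = 0;
	return pending;
}
```

With file-scope statics, a second caller in the same process would see the first caller's entries; with the sets in the struct, each `struct check_options` can be finished on its own.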
* [PATCH v3 20/22] fetch-pack: don't needlessly copy fsck_options 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (20 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 19/22] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 21/22] fetch-pack: use file-scope static struct for fsck_options Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 22/22] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the behavior of the .gitmodules validation added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so we're using one "fsck_options". I found that code confusing to read. One might think that not setting up the error_func earlier means that we're relying on the "error_func" not being set in some code in between the two hunks being modified here. But we're not, all we're doing in the rest of "cmd_index_pack()" is further setup by calling fsck_set_msg_types(), and assigning to do_fsck_object. So there was no reason in 5476e1efde to make a shallow copy of the fsck_options struct before setting error_func. Let's just do this setup at the top of the function, along with the "walk" assignment. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/index-pack.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 2b2266a4b7d..5ad80b85b47 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1761,6 +1761,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) read_replace_refs = 0; fsck_options.walk = mark_link; + fsck_options.error_func = print_dangling_gitmodules; reset_pack_idx_option(&opts); git_config(git_index_pack_config, &opts); @@ -1951,13 +1952,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) else close(input_fd); - if (do_fsck_object) { - struct fsck_options fo = fsck_options; - - fo.error_func = print_dangling_gitmodules; - if (fsck_finish(&fo)) - die(_("fsck error in pack objects")); - } + if (do_fsck_object && fsck_finish(&fsck_options)) + die(_("fsck error in pack objects")); free(objects); strbuf_release(&index_name_buf); -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 21/22] fetch-pack: use file-scope static struct for fsck_options 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (21 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 20/22] fetch-pack: don't needlessly copy fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 2021-03-06 11:04 ` [PATCH v3 22/22] fetch-pack: use new fsck API to printing dangling submodules Ævar Arnfjörð Bjarmason 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change code added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) so that we use a file-scoped "static struct fsck_options" instead of defining one in the "fsck_gitmodules_oids()" function. We use this pattern in all of builtin/{fsck,index-pack,mktag,unpack-objects}.c. It's odd to see fetch-pack be the odd one out. One might think that we're using other fsck_options structs in fetch-pack, or doing fsck twice there, but we're not. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fetch-pack.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 53d7ef00856..f961c3067cd 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,6 +38,7 @@ static int server_supports_filtering; static int advertise_sid; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; +static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; @@ -991,15 +992,14 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) { struct oidset_iter iter; const struct object_id *oid; - struct fsck_options fo = FSCK_OPTIONS_STRICT; if (!oidset_size(gitmodules_oids)) return; oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fo, oid); - if (fsck_finish(&fo)) + register_found_gitmodules(&fsck_options, oid); + if (fsck_finish(&fsck_options)) die("fsck failed"); } -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v3 22/22] fetch-pack: use new fsck API to printing dangling submodules 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason ` (22 preceding siblings ...) 2021-03-06 11:04 ` [PATCH v3 21/22] fetch-pack: use file-scope static struct for fsck_options Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 ` Ævar Arnfjörð Bjarmason 23 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-03-06 11:04 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Refactor the check added in 5476e1efde (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to make use of us now passing the "msg_id" to the user defined "error_func". We can now compare against the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated message. Let's also replace register_found_gitmodules() with directly manipulating the "gitmodules_found" member. A recent commit moved it into "fsck_options" so we could do this here. Add a fsck-cb.c file similar to parse-options-cb.c, the alternative would be to either define this directly in fsck.c as a public API, or to create some library shared by fetch-pack.c and builtin/index-pack. I expect that there won't be many of these fsck utility functions in the future, so just having a single fsck-cb.c makes sense. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- Makefile | 1 + builtin/index-pack.c | 21 +-------------------- fetch-pack.c | 4 ++-- fsck-cb.c | 16 ++++++++++++++++ fsck.c | 5 ----- fsck.h | 22 +++++++++++++++++++--- 6 files changed, 39 insertions(+), 30 deletions(-) create mode 100644 fsck-cb.c diff --git a/Makefile b/Makefile index dd08b4ced01..5bf128c5d2c 100644 --- a/Makefile +++ b/Makefile @@ -879,6 +879,7 @@ LIB_OBJS += fetch-negotiator.o LIB_OBJS += fetch-pack.o LIB_OBJS += fmt-merge-msg.o LIB_OBJS += fsck.o +LIB_OBJS += fsck-cb.o LIB_OBJS += fsmonitor.o LIB_OBJS += gettext.o LIB_OBJS += gpg-interface.o diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 5ad80b85b47..11f0fafd33b 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -120,7 +120,7 @@ static int nr_threads; static int from_stdin; static int strict; static int do_fsck_object; -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; static int verbose; static int show_resolving_progress; static int show_stat; @@ -1713,24 +1713,6 @@ static void show_pack_info(int stat_only) } } -static int print_dangling_gitmodules(struct fsck_options *o, - const struct object_id *oid, - enum object_type object_type, - enum fsck_msg_type msg_type, - enum fsck_msg_id msg_id, - const char *message) -{ - /* - * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it - * instead of relying on this string check. 
- */ - if (starts_with(message, "gitmodulesMissing")) { - printf("%s\n", oid_to_hex(oid)); - return 0; - } - return fsck_error_function(o, oid, object_type, msg_type, msg_id, message); -} - int cmd_index_pack(int argc, const char **argv, const char *prefix) { int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; @@ -1761,7 +1743,6 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) read_replace_refs = 0; fsck_options.walk = mark_link; - fsck_options.error_func = print_dangling_gitmodules; reset_pack_idx_option(&opts); git_config(git_index_pack_config, &opts); diff --git a/fetch-pack.c b/fetch-pack.c index f961c3067cd..7fc305b65c4 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -38,7 +38,7 @@ static int server_supports_filtering; static int advertise_sid; static struct shallow_lock shallow_lock; static const char *alternate_shallow_file; -static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT; +static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES; static struct strbuf fsck_msg_types = STRBUF_INIT; static struct string_list uri_protocols = STRING_LIST_INIT_DUP; @@ -998,7 +998,7 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) oidset_iter_init(gitmodules_oids, &iter); while ((oid = oidset_iter_next(&iter))) - register_found_gitmodules(&fsck_options, oid); + oidset_insert(&fsck_options.gitmodules_found, oid); if (fsck_finish(&fsck_options)) die("fsck failed"); } diff --git a/fsck-cb.c b/fsck-cb.c new file mode 100644 index 00000000000..465a49235ac --- /dev/null +++ b/fsck-cb.c @@ -0,0 +1,16 @@ +#include "git-compat-util.h" +#include "fsck.h" + +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message) +{ + if (msg_id == FSCK_MSG_GITMODULES_MISSING) { + puts(oid_to_hex(oid)); + return 0; + } + return fsck_error_function(o, oid, object_type, 
msg_type, msg_id, message); +} diff --git a/fsck.c b/fsck.c index 565274a946c..b0089844db9 100644 --- a/fsck.c +++ b/fsck.c @@ -1214,11 +1214,6 @@ int fsck_error_function(struct fsck_options *o, return 1; } -void register_found_gitmodules(struct fsck_options *options, const struct object_id *oid) -{ - oidset_insert(&options->gitmodules_found, oid); -} - int fsck_finish(struct fsck_options *options) { int ret = 0; diff --git a/fsck.h b/fsck.h index bb59ef05b68..ae3107638ab 100644 --- a/fsck.h +++ b/fsck.h @@ -153,9 +153,6 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); -void register_found_gitmodules(struct fsck_options *options, - const struct object_id *oid); - /* * fsck a tag, and pass info about it back to the caller. This is * exposed fsck_object() internals for git-mktag(1). @@ -204,4 +201,23 @@ const char *fsck_describe_object(struct fsck_options *options, int fsck_config_internal(const char *var, const char *value, void *cb, struct fsck_options *options); +/* + * Initializations for callbacks in fsck-cb.c + */ +#define FSCK_OPTIONS_MISSING_GITMODULES { \ + .strict = 1, \ + .error_func = fsck_error_cb_print_missing_gitmodules, \ + FSCK_OPTIONS_COMMON \ +} + +/* + * Error callbacks in fsck-cb.c + */ +int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o, + const struct object_id *oid, + enum object_type object_type, + enum fsck_msg_type msg_type, + enum fsck_msg_id msg_id, + const char *message); + #endif -- 2.31.0.rc0.126.g04f22c5b82 ^ permalink raw reply related [flat|nested] 229+ messages in thread
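The core idea of patch 22/22, deciding on the message ID instead of matching a prefix of the formatted message, can be sketched in isolation. The callback signature here is deliberately cut down and hypothetical (the real one also receives the options struct, object type and message type), and `default_error` is only a placeholder for fsck_error_function().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical two-entry subset of the real message ID enum. */
enum msg_id { MSG_GITMODULES_MISSING, MSG_BAD_TREE };

/* Simplified callback type; the real fsck_error takes more arguments. */
typedef int (*error_fn)(enum msg_id id, const char *oid_hex,
			const char *message);

/* Placeholder for the default handler: report and signal failure. */
static int default_error(enum msg_id id, const char *oid_hex,
			 const char *message)
{
	(void)id;
	fprintf(stderr, "error in object %s: %s\n", oid_hex, message);
	return 1;
}

/*
 * Print the object ID of a missing .gitmodules blob on stdout and treat
 * it as non-fatal; hand everything else to the default handler.
 */
static int print_missing_gitmodules(enum msg_id id, const char *oid_hex,
				    const char *message)
{
	if (id == MSG_GITMODULES_MISSING) {
		puts(oid_hex);
		return 0;
	}
	return default_error(id, oid_hex, message);
}
```

Because the comparison is on an enum value, rewording the human-readable message cannot silently break the check, which was the risk with the earlier starts_with() test.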
* [PATCH v2 01/10] fsck.h: indent arguments to of fsck_set_msg_type 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason 2021-02-17 21:02 ` Junio C Hamano 2021-02-18 10:58 ` [PATCH v2 00/10] fsck: API improvements (no conflicts with 'seen') Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 10:58 ` [PATCH v2 02/10] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason ` (8 subsequent siblings) 11 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fsck.h b/fsck.h index 423c467feb7..df0b64a2163 100644 --- a/fsck.h +++ b/fsck.h @@ -11,7 +11,7 @@ struct fsck_options; struct object; void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type); + const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); int is_valid_msg_type(const char *msg_id, const char *msg_type); -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v2 02/10] fsck.h: use "enum object_type" instead of "int" 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason ` (2 preceding siblings ...) 2021-02-18 10:58 ` [PATCH v2 01/10] fsck.h: indent arguments to of fsck_set_msg_type Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 10:58 ` [PATCH v2 03/10] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason ` (7 subsequent siblings) 11 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in 355885d5315 (add generic, type aware object chain walker, 2008-02-25) we've used entries from object_type (OBJ_BLOB etc.). So this doesn't really change anything as far as the generated code is concerned, it just gives the compiler more information and makes this easier to read. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 3 ++- builtin/index-pack.c | 3 ++- builtin/unpack-objects.c | 3 ++- fsck.h | 3 ++- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 821e7798c70..68f0329e69e 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -197,7 +197,8 @@ static int traverse_reachable(void) return !!result; } -static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_used(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options) { if (!obj) return 1; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 54f74c48741..2f291a14d4a 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -212,7 +212,8 @@ static void cleanup_thread(void) free(thread_data); } -static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_link(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { if (!obj) return -1; diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index dd4a75e030d..ca54fd16688 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -187,7 +187,8 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf) * that have reachability requirements and calls this function. * Verify its reachability and validity recursively and write it out. 
*/ -static int check_object(struct object *obj, int type, void *data, struct fsck_options *options) +static int check_object(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { struct obj_buffer *obj_buf; diff --git a/fsck.h b/fsck.h index df0b64a2163..0c75789d219 100644 --- a/fsck.h +++ b/fsck.h @@ -23,7 +23,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type); * <0 error signaled and abort * >0 error signaled and do not abort */ -typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options); +typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options); /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v2 03/10] fsck.c: rename variables in fsck_set_msg_type() for less confusion 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason ` (3 preceding siblings ...) 2021-02-18 10:58 ` [PATCH v2 02/10] fsck.h: use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 19:45 ` Jeff King 2021-02-18 10:58 ` [PATCH v2 04/10] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason ` (6 subsequent siblings) 11 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename variables in a function added in 0282f4dced0 (fsck: offer a function to demote fsck errors to warnings, 2015-06-22). It was needlessly confusing that it took a "msg_type" argument, but then later declared another "msg_type" of a different type. Let's rename that to "tmp", and rename "id" to "msg_id" and "msg_id" to "msg_id_str" etc. This will make a follow-up change smaller. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/fsck.c b/fsck.c index 4b7f0b73d73..acccad243ec 100644 --- a/fsck.c +++ b/fsck.c @@ -203,27 +203,27 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) } void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type) + const char *msg_id_str, const char *msg_type_str) { - int id = parse_msg_id(msg_id), type; + int msg_id = parse_msg_id(msg_id_str), msg_type; - if (id < 0) - die("Unhandled message id: %s", msg_id); - type = parse_msg_type(msg_type); + if (msg_id < 0) + die("Unhandled message id: %s", msg_id_str); + msg_type = parse_msg_type(msg_type_str); - if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL) - die("Cannot demote %s to %s", msg_id, msg_type); + if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) + die("Cannot demote %s to %s", msg_id_str, msg_type_str); if (!options->msg_type) { int i; - int *msg_type; - ALLOC_ARRAY(msg_type, FSCK_MSG_MAX); + int *tmp; + ALLOC_ARRAY(tmp, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) - msg_type[i] = fsck_msg_type(i, options); - options->msg_type = msg_type; + tmp[i] = fsck_msg_type(i, options); + options->msg_type = tmp; } - options->msg_type[id] = type; + options->msg_type[msg_id] = msg_type; } void fsck_set_msg_types(struct fsck_options *options, const char *values) -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v2 03/10] fsck.c: rename variables in fsck_set_msg_type() for less confusion 2021-02-18 10:58 ` [PATCH v2 03/10] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason @ 2021-02-18 19:45 ` Jeff King 0 siblings, 0 replies; 229+ messages in thread From: Jeff King @ 2021-02-18 19:45 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Johannes Schindelin, Jonathan Tan On Thu, Feb 18, 2021 at 11:58:33AM +0100, Ævar Arnfjörð Bjarmason wrote: > Rename variables in a function added in 0282f4dced0 (fsck: offer a > function to demote fsck errors to warnings, 2015-06-22). > > It was needlessly confusing that it took a "msg_type" argument, but > then later declared another "msg_type" of a different type. > > Let's rename that to "tmp", and rename "id" to "msg_id" and "msg_id" > to "msg_id_str" etc. This will make a follow-up change smaller. I think this is an improvement, though maybe "severity" would be a less-generic term than "type". > void fsck_set_msg_type(struct fsck_options *options, > - const char *msg_id, const char *msg_type) > + const char *msg_id_str, const char *msg_type_str) > { > - int id = parse_msg_id(msg_id), type; > + int msg_id = parse_msg_id(msg_id_str), msg_type; I always get nervous when a refactoring renames something away from "foo", and then renames another thing _to_ "foo". Any untouched bits of code are vulnerable to confusing them. But I think the types are sufficiently different that we can mostly rely on the compiler (though things like numeric or bool comparisons can work with either pointers or ints), and the fact that we can see the entire function is small enough that we can see the entire thing in the context here. So I think it is OK. -Peff ^ permalink raw reply [flat|nested] 229+ messages in thread
* [PATCH v2 04/10] fsck.c: move definition of msg_id into append_msg_id()
  2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason
                     ` (4 preceding siblings ...)
  2021-02-18 10:58   ` [PATCH v2 03/10] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason
@ 2021-02-18 10:58   ` Ævar Arnfjörð Bjarmason
  2021-02-18 10:58   ` [PATCH v2 05/10] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 229+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan,
	Ævar Arnfjörð Bjarmason

Refactor code added in 71ab8fa840f (fsck: report the ID of the
error/warning, 2015-06-22) to resolve the msg_id to a string in the
function that wants it, instead of doing it in report().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 fsck.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index acccad243ec..1070071ffec 100644
--- a/fsck.c
+++ b/fsck.c
@@ -264,8 +264,9 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 	free(to_free);
 }
 
-static void append_msg_id(struct strbuf *sb, const char *msg_id)
+static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id)
 {
+	const char *msg_id = msg_id_info[id].id_string;
 	for (;;) {
 		char c = *(msg_id)++;
 
@@ -308,7 +309,7 @@ static int report(struct fsck_options *options,
 	else if (msg_type == FSCK_INFO)
 		msg_type = FSCK_WARN;
 
-	append_msg_id(&sb, msg_id_info[id].id_string);
+	append_msg_id(&sb, id);
 
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
-- 
2.30.0.284.gd98b1dd5eaa7

^ permalink raw reply related [flat|nested] 229+ messages in thread
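For readers unfamiliar with what append_msg_id() does to the ID string, the loop shown in the later hunks of this series lowercases every character and, on an underscore, drops it and copies the next character unchanged, turning an identifier like GITMODULES_MISSING into gitmodulesMissing. A standalone approximation, writing into a plain char buffer instead of a strbuf (the function name and buffer handling are assumptions made for this sketch), might look like this:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Convert an upper-case, underscore-separated ID such as
 * "GITMODULES_MISSING" into camelCase ("gitmodulesMissing").
 * Output is truncated if "out" is too small and is always NUL-terminated.
 */
static void msg_id_to_camel(const char *id, char *out, size_t outlen)
{
	size_t n = 0;

	assert(outlen > 0);
	while (*id && n + 1 < outlen) {
		char c = *id++;

		if (c != '_')
			out[n++] = (char)tolower((unsigned char)c);
		else if (*id)
			/* Skip the underscore, keep the next letter as-is. */
			out[n++] = *id++;
	}
	out[n] = '\0';
}
```

Unlike the original, which asserts that an underscore is never the last character, this sketch simply ignores a trailing underscore.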
* [PATCH v2 05/10] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason ` (5 preceding siblings ...) 2021-02-18 10:58 ` [PATCH v2 04/10] fsck.c: move definition of msg_id into append_msg_id() Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 22:23 ` Junio C Hamano 2021-02-18 10:58 ` [PATCH v2 06/10] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason ` (4 subsequent siblings) 11 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename the remaining variables of type fsck_msg_id from "id" to "msg_id". This change is relatively small, and is worth the churn for a later change where we have different id's in the "report" function. --- fsck.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/fsck.c b/fsck.c index 1070071ffec..dbb6f7c4ee2 100644 --- a/fsck.c +++ b/fsck.c @@ -264,19 +264,19 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) free(to_free); } -static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id msg_id) { - const char *msg_id = msg_id_info[id].id_string; + const char *msg_id_str = msg_id_info[msg_id].id_string; for (;;) { - char c = *(msg_id)++; + char c = *(msg_id_str)++; if (!c) break; if (c != '_') strbuf_addch(sb, tolower(c)); else { - assert(*msg_id); - strbuf_addch(sb, *(msg_id)++); + assert(*msg_id_str); + strbuf_addch(sb, *(msg_id_str)++); } } @@ -292,11 +292,11 @@ static int object_on_skiplist(struct fsck_options *opts, __attribute__((format (printf, 5, 6))) static int report(struct fsck_options *options, const struct object_id *oid, enum object_type object_type, - enum fsck_msg_id id, const char *fmt, ...) 
+ enum fsck_msg_id msg_id, const char *fmt, ...) { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(id, options), result; + int msg_type = fsck_msg_type(msg_id, options), result; if (msg_type == FSCK_IGNORE) return 0; @@ -309,7 +309,7 @@ static int report(struct fsck_options *options, else if (msg_type == FSCK_INFO) msg_type = FSCK_WARN; - append_msg_id(&sb, id); + append_msg_id(&sb, msg_id); va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v2 05/10] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" 2021-02-18 10:58 ` [PATCH v2 05/10] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason @ 2021-02-18 22:23 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-02-18 22:23 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Rename the remaining variables of type fsck_msg_id from "id" to > "msg_id". This change is relatively small, and is worth the churn for > a later change where we have different id's in the "report" function. > --- > fsck.c | 16 ++++++++-------- > 1 file changed, 8 insertions(+), 8 deletions(-) Up to this point I have no objections to the patches themselves, but this one is not signed off. > diff --git a/fsck.c b/fsck.c > index 1070071ffec..dbb6f7c4ee2 100644 > --- a/fsck.c > +++ b/fsck.c > @@ -264,19 +264,19 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) > free(to_free); > } > > -static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id) > +static void append_msg_id(struct strbuf *sb, enum fsck_msg_id msg_id) > { > - const char *msg_id = msg_id_info[id].id_string; > + const char *msg_id_str = msg_id_info[msg_id].id_string; > for (;;) { > - char c = *(msg_id)++; > + char c = *(msg_id_str)++; > > if (!c) > break; > if (c != '_') > strbuf_addch(sb, tolower(c)); > else { > - assert(*msg_id); > - strbuf_addch(sb, *(msg_id)++); > + assert(*msg_id_str); > + strbuf_addch(sb, *(msg_id_str)++); > } > } > > @@ -292,11 +292,11 @@ static int object_on_skiplist(struct fsck_options *opts, > __attribute__((format (printf, 5, 6))) > static int report(struct fsck_options *options, > const struct object_id *oid, enum object_type object_type, > - enum fsck_msg_id id, const char *fmt, ...) > + enum fsck_msg_id msg_id, const char *fmt, ...) 
> { > va_list ap; > struct strbuf sb = STRBUF_INIT; > - int msg_type = fsck_msg_type(id, options), result; > + int msg_type = fsck_msg_type(msg_id, options), result; > > if (msg_type == FSCK_IGNORE) > return 0; > @@ -309,7 +309,7 @@ static int report(struct fsck_options *options, > else if (msg_type == FSCK_INFO) > msg_type = FSCK_WARN; > > - append_msg_id(&sb, id); > + append_msg_id(&sb, msg_id); > > va_start(ap, fmt); > strbuf_vaddf(&sb, fmt, ap); ^ permalink raw reply [flat|nested] 229+ messages in thread
* [PATCH v2 06/10] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason ` (6 preceding siblings ...) 2021-02-18 10:58 ` [PATCH v2 05/10] fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 19:52 ` Jeff King 2021-02-18 10:58 ` [PATCH v2 07/10] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason ` (3 subsequent siblings) 11 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new fsck_msg_type enum. These defines were originally introduced in: - ba002f3b28a (builtin-fsck: move common object checking code to fsck.c, 2008-02-25) - f50c4407305 (fsck: disallow demoting grave fsck errors to warnings, 2015-06-22) - efaba7cc77f (fsck: optionally ignore specific fsck issues completely, 2015-06-22) - f27d05b1704 (fsck: allow upgrading fsck warnings to errors, 2015-06-22) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 2 +- builtin/mktag.c | 3 ++- fsck.c | 21 ++++++++++----------- fsck.h | 17 ++++++++++------- 4 files changed, 23 insertions(+), 20 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 68f0329e69e..d6d745dc702 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -89,7 +89,7 @@ static int objerror(struct object *obj, const char *err) static int fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/builtin/mktag.c b/builtin/mktag.c index 41a399a69e4..1834394a9b6 100644 --- a/builtin/mktag.c +++ b/builtin/mktag.c @@ -22,7 +22,8 @@ 
static int mktag_config(const char *var, const char *value, void *cb) static int mktag_fsck_error_func(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, + const char *message) { switch (msg_type) { case FSCK_WARN: diff --git a/fsck.c b/fsck.c index dbb6f7c4ee2..00e0fef21ca 100644 --- a/fsck.c +++ b/fsck.c @@ -22,9 +22,6 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FSCK_FATAL -1 -#define FSCK_INFO -2 - #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ @@ -97,7 +94,7 @@ static struct { const char *id_string; const char *downcased; const char *camelcased; - int msg_type; + enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { FOREACH_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } @@ -164,10 +161,10 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix) list_config_item(list, prefix, msg_id_info[i].camelcased); } -static int fsck_msg_type(enum fsck_msg_id msg_id, +static enum fsck_msg_type fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { - int msg_type; + enum fsck_msg_type msg_type; assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX); @@ -182,7 +179,7 @@ static int fsck_msg_type(enum fsck_msg_id msg_id, return msg_type; } -static int parse_msg_type(const char *str) +static enum fsck_msg_type parse_msg_type(const char *str) { if (!strcmp(str, "error")) return FSCK_ERROR; @@ -205,7 +202,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type) void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { - int msg_id = parse_msg_id(msg_id_str), msg_type; + int msg_id = parse_msg_id(msg_id_str); + enum fsck_msg_type msg_type; if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); @@ -216,7 +214,7 @@ void fsck_set_msg_type(struct fsck_options *options, if 
(!options->msg_type) { int i; - int *tmp; + enum fsck_msg_type *tmp; ALLOC_ARRAY(tmp, FSCK_MSG_MAX); for (i = 0; i < FSCK_MSG_MAX; i++) tmp[i] = fsck_msg_type(i, options); @@ -296,7 +294,8 @@ static int report(struct fsck_options *options, { va_list ap; struct strbuf sb = STRBUF_INIT; - int msg_type = fsck_msg_type(msg_id, options), result; + enum fsck_msg_type msg_type = fsck_msg_type(msg_id, options); + int result; if (msg_type == FSCK_IGNORE) return 0; @@ -1262,7 +1261,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message) + enum fsck_msg_type msg_type, const char *message) { if (msg_type == FSCK_WARN) { warning("object %s: %s", fsck_describe_object(o, oid), message); diff --git a/fsck.h b/fsck.h index 0c75789d219..c77e8ddf10b 100644 --- a/fsck.h +++ b/fsck.h @@ -3,10 +3,13 @@ #include "oidset.h" -#define FSCK_ERROR 1 -#define FSCK_WARN 2 -#define FSCK_IGNORE 3 - +enum fsck_msg_type { + FSCK_INFO = -2, + FSCK_FATAL = -1, + FSCK_ERROR = 1, + FSCK_WARN, + FSCK_IGNORE +}; struct fsck_options; struct object; @@ -29,17 +32,17 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); int fsck_error_function(struct fsck_options *o, const struct object_id *oid, enum object_type object_type, - int msg_type, const char *message); + enum fsck_msg_type msg_type, const char *message); struct fsck_options { fsck_walk_func walk; fsck_error error_func; unsigned strict:1; - int *msg_type; + enum fsck_msg_type *msg_type; struct oidset skiplist; kh_oid_map_t *object_names; }; -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related 
[flat|nested] 229+ messages in thread
* Re: [PATCH v2 06/10] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 2021-02-18 10:58 ` [PATCH v2 06/10] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason @ 2021-02-18 19:52 ` Jeff King 2021-02-18 22:27 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Jeff King @ 2021-02-18 19:52 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Johannes Schindelin, Jonathan Tan On Thu, Feb 18, 2021 at 11:58:36AM +0100, Ævar Arnfjörð Bjarmason wrote: > Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new > fsck_msg_type enum. Makes sense. As with my previous comment, I wonder if "severity" is a more descriptive term. > diff --git a/fsck.h b/fsck.h > index 0c75789d219..c77e8ddf10b 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -3,10 +3,13 @@ > > #include "oidset.h" > > -#define FSCK_ERROR 1 > -#define FSCK_WARN 2 > -#define FSCK_IGNORE 3 > - > +enum fsck_msg_type { > + FSCK_INFO = -2, > + FSCK_FATAL = -1, > + FSCK_ERROR = 1, > + FSCK_WARN, > + FSCK_IGNORE > +}; You kept the values the same as they were before, which is good in a refactoring step, but...wow, the ordering is weird and confusing. In FATAL/ERROR/WARN/IGNORE the number increases as severity decreases. Maybe reversed from how I'd do it, but at least the order makes sense. But somehow INFO is on the far side of FATAL? Again, not something to address in this patch, but I hope something we could maybe deal with in the longer term (perhaps along with fixing the weird "INFO is a warning from the user's perspective, but WARNING is generally an error" behavior). I also know that this is assigning WARN and IGNORE based on counting-by-one from ERROR, so it's correct. But I think it would be more obvious if you simply filled in the values manually, so a reader does not have to wonder why some are assigned and some are not. -Peff ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v2 06/10] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum 2021-02-18 19:52 ` Jeff King @ 2021-02-18 22:27 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-02-18 22:27 UTC (permalink / raw) To: Jeff King Cc: Ævar Arnfjörð Bjarmason, git, Johannes Schindelin, Jonathan Tan Jeff King <peff@peff.net> writes: > On Thu, Feb 18, 2021 at 11:58:36AM +0100, Ævar Arnfjörð Bjarmason wrote: > >> Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new >> fsck_msg_type enum. > > Makes sense. As with my previous comment, I wonder if "severity" is a > more descriptive term. > >> diff --git a/fsck.h b/fsck.h >> index 0c75789d219..c77e8ddf10b 100644 >> --- a/fsck.h >> +++ b/fsck.h >> @@ -3,10 +3,13 @@ >> >> #include "oidset.h" >> >> -#define FSCK_ERROR 1 >> -#define FSCK_WARN 2 >> -#define FSCK_IGNORE 3 >> - >> +enum fsck_msg_type { >> + FSCK_INFO = -2, >> + FSCK_FATAL = -1, >> + FSCK_ERROR = 1, >> + FSCK_WARN, >> + FSCK_IGNORE >> +}; > > You kept the values the same as they were before, which is good in a > refactoring step, but...wow, the ordering is weird and confusing. > > In FATAL/ERROR/WARN/IGNORE the number increases as severity decreases. > Maybe reversed from how I'd do it, but at least the order makes sense. > But somehow INFO is on the far side of FATAL? > > Again, not something to address in this patch, but I hope something we > could maybe deal with in the longer term (perhaps along with fixing the > weird "INFO is a warning from the user's perspective, but WARNING is > generally an error" behavior). > > I also know that this is assigning WARN and IGNORE based on > counting-by-one from ERROR, so it's correct. But I think it would be > more obvious if you simply filled in the values manually, so a reader > does not have to wonder why some are assigned and some are not. I had the same reaction, plus "Wow, we had FSCK_* constants in two different places and without colliding? Have we been lucky? 
Declaring it in one place, whether we use enum or not (as enum is not very useful in C as a type checking vehicle), makes a lot of sense but why does this come this late in the series, instead of being at the front as a trivial low-hanging fruit?" Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
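Peff's suggestion to fill in every value by hand could look like the minimal sketch below. The numeric values are the ones from the patch; the helper fsck_effective_msg_type() is a hypothetical name used only for illustration, and the INFO-to-WARN / FATAL-to-ERROR folding is an assumption modelled on what report() appears to do elsewhere in this series.

```c
#include <assert.h>

/*
 * Same values as in the patch, but with every enumerator spelled out
 * so a reader does not have to count up from FSCK_ERROR.
 */
enum fsck_msg_type {
	FSCK_INFO   = -2,
	FSCK_FATAL  = -1,
	FSCK_ERROR  = 1,
	FSCK_WARN   = 2,
	FSCK_IGNORE = 3
};

/*
 * Hypothetical helper (not part of git): fold the two "internal"
 * levels into the levels a caller's error callback would see.
 * Assumes FATAL is reported as ERROR and INFO as WARN.
 */
static enum fsck_msg_type fsck_effective_msg_type(enum fsck_msg_type t)
{
	if (t == FSCK_FATAL)
		return FSCK_ERROR;
	if (t == FSCK_INFO)
		return FSCK_WARN;
	return t;
}
```

Spelling out the values keeps the refactoring a no-op while making the odd placement of FSCK_INFO on the far side of FSCK_FATAL visible at a glance.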
* [PATCH v2 07/10] fsck.c: call parse_msg_type() early in fsck_set_msg_type() 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason ` (7 preceding siblings ...) 2021-02-18 10:58 ` [PATCH v2 06/10] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 22:29 ` Junio C Hamano 2021-02-18 10:58 ` [PATCH v2 08/10] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason ` (2 subsequent siblings) 11 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason There's no reason to defer the calling of parse_msg_type() until after we've checked if the "id < 0". This is not a hot codepath, and parse_msg_type() itself may die on invalid input. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 00e0fef21ca..7c53080ad48 100644 --- a/fsck.c +++ b/fsck.c @@ -203,11 +203,10 @@ void fsck_set_msg_type(struct fsck_options *options, const char *msg_id_str, const char *msg_type_str) { int msg_id = parse_msg_id(msg_id_str); - enum fsck_msg_type msg_type; + enum fsck_msg_type msg_type = parse_msg_type(msg_type_str); if (msg_id < 0) die("Unhandled message id: %s", msg_id_str); - msg_type = parse_msg_type(msg_type_str); if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) die("Cannot demote %s to %s", msg_id_str, msg_type_str); -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v2 07/10] fsck.c: call parse_msg_type() early in fsck_set_msg_type() 2021-02-18 10:58 ` [PATCH v2 07/10] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason @ 2021-02-18 22:29 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-02-18 22:29 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > There's no reason to defer the calling of parse_msg_type() until after > we've checked if the "id < 0". This is not a hot codepath, and > parse_msg_type() itself may die on invalid input. That explains why this change can be done, but does not justify why it is a good change. Unlike all the previous steps, I would rather say this is borderline needless churn. Let's keep reading as the picture may change as we touch more code around this area. Thanks. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/fsck.c b/fsck.c > index 00e0fef21ca..7c53080ad48 100644 > --- a/fsck.c > +++ b/fsck.c > @@ -203,11 +203,10 @@ void fsck_set_msg_type(struct fsck_options *options, > const char *msg_id_str, const char *msg_type_str) > { > int msg_id = parse_msg_id(msg_id_str); > - enum fsck_msg_type msg_type; > + enum fsck_msg_type msg_type = parse_msg_type(msg_type_str); > > if (msg_id < 0) > die("Unhandled message id: %s", msg_id_str); > - msg_type = parse_msg_type(msg_type_str); > > if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL) > die("Cannot demote %s to %s", msg_id_str, msg_type_str); ^ permalink raw reply [flat|nested] 229+ messages in thread
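One observable effect of the reordering discussed above is which problem gets reported first when both arguments are invalid. The sketch below illustrates that under stated assumptions: parse_id() and parse_type() are hypothetical stand-ins that return a failure value instead of calling die(), and the accepted strings are illustrative only.

```c
#include <assert.h>
#include <string.h>

enum set_result { SET_OK = 0, SET_BAD_ID, SET_BAD_TYPE };

/* Hypothetical stand-in for parse_msg_id(): -1 means "unknown id". */
static int parse_id(const char *s)
{
	return !strcmp(s, "nulInHeader") ? 0 : -1;
}

/* Hypothetical stand-in for parse_msg_type(): 0 means "would die()". */
static int parse_type(const char *s)
{
	if (!strcmp(s, "error"))
		return 1;
	if (!strcmp(s, "warn"))
		return 2;
	if (!strcmp(s, "ignore"))
		return 3;
	return 0;
}

/* Order before the patch: the id is validated before the type is parsed. */
static enum set_result check_id_first(const char *id, const char *type)
{
	if (parse_id(id) < 0)
		return SET_BAD_ID;
	if (!parse_type(type))
		return SET_BAD_TYPE;
	return SET_OK;
}

/* Order after the patch: the type is parsed up front, in the declaration. */
static enum set_result check_type_first(const char *id, const char *type)
{
	if (!parse_type(type))
		return SET_BAD_TYPE;
	if (parse_id(id) < 0)
		return SET_BAD_ID;
	return SET_OK;
}
```

Both orders accept and reject the same inputs; they differ only in which error surfaces first when both inputs are bad, which is consistent with Junio's reading that the change is safe but not obviously an improvement.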
* [PATCH v2 08/10] fsck.c: undefine temporary STR macro after use 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason ` (8 preceding siblings ...) 2021-02-18 10:58 ` [PATCH v2 07/10] fsck.c: call parse_msg_type() early in fsck_set_msg_type() Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 22:30 ` Junio C Hamano 2021-02-18 10:58 ` [PATCH v2 09/10] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason 2021-02-18 10:58 ` [PATCH v2 10/10] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason 11 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason In f417eed8cde (fsck: provide a function to parse fsck message IDs, 2015-06-22) the "STR" macro was introduced, but that short macro name was not undefined after use as was done earlier in the same series for the MSG_ID macro in c99ba492f1c (fsck: introduce identifiers for fsck messages, 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fsck.c b/fsck.c index 7c53080ad48..88884e91c89 100644 --- a/fsck.c +++ b/fsck.c @@ -100,6 +100,7 @@ static struct { { NULL, NULL, NULL, -1 } }; #undef MSG_ID +#undef STR static void prepare_msg_ids(void) { -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v2 08/10] fsck.c: undefine temporary STR macro after use 2021-02-18 10:58 ` [PATCH v2 08/10] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason @ 2021-02-18 22:30 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-02-18 22:30 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > In f417eed8cde (fsck: provide a function to parse fsck message IDs, > 2015-06-22) the "STR" macro was introduced, but that short macro name > was not undefined after use as was done earlier in the same series for > the MSG_ID macro in c99ba492f1c (fsck: introduce identifiers for fsck > messages, 2015-06-22). > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/fsck.c b/fsck.c > index 7c53080ad48..88884e91c89 100644 > --- a/fsck.c > +++ b/fsck.c > @@ -100,6 +100,7 @@ static struct { > { NULL, NULL, NULL, -1 } > }; > #undef MSG_ID > +#undef STR Good clean-up. > > static void prepare_msg_ids(void) > { ^ permalink raw reply [flat|nested] 229+ messages in thread
* [PATCH v2 09/10] fsck.c: give "FOREACH_MSG_ID" a more specific name 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason ` (9 preceding siblings ...) 2021-02-18 10:58 ` [PATCH v2 08/10] fsck.c: undefine temporary STR macro after use Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 19:56 ` Jeff King 2021-02-18 10:58 ` [PATCH v2 10/10] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason 11 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation for moving it over to fsck.h. It's good convention to name macros in *.h files in such a way as to clearly not clash with any other names in other files. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fsck.c b/fsck.c index 88884e91c89..1730acd698d 100644 --- a/fsck.c +++ b/fsck.c @@ -22,7 +22,7 @@ static struct oidset gitmodules_found = OIDSET_INIT; static struct oidset gitmodules_done = OIDSET_INIT; -#define FOREACH_MSG_ID(FUNC) \ +#define FOREACH_FSCK_MSG_ID(FUNC) \ /* fatal errors */ \ FUNC(NUL_IN_HEADER, FATAL) \ FUNC(UNTERMINATED_HEADER, FATAL) \ @@ -83,7 +83,7 @@ static struct oidset gitmodules_done = OIDSET_INIT; #define MSG_ID(id, msg_type) FSCK_MSG_##id, enum fsck_msg_id { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) FSCK_MSG_MAX }; #undef MSG_ID @@ -96,7 +96,7 @@ static struct { const char *camelcased; enum fsck_msg_type msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { - FOREACH_MSG_ID(MSG_ID) + FOREACH_FSCK_MSG_ID(MSG_ID) { NULL, NULL, NULL, -1 } }; #undef MSG_ID -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v2 09/10] fsck.c: give "FOREACH_MSG_ID" a more specific name 2021-02-18 10:58 ` [PATCH v2 09/10] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason @ 2021-02-18 19:56 ` Jeff King 0 siblings, 0 replies; 229+ messages in thread From: Jeff King @ 2021-02-18 19:56 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Johannes Schindelin, Jonathan Tan On Thu, Feb 18, 2021 at 11:58:39AM +0100, Ævar Arnfjörð Bjarmason wrote: > Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation > for moving it over to fsck.h. It's good convention to name macros > in *.h files in such a way as to clearly not clash with any other > names in other files. The patch to move it is not in this v2 of the series, so arguably this is less interesting. However, I think the resulting code is equally or more readable, so I don't mind it standing on its own. -Peff ^ permalink raw reply [flat|nested] 229+ messages in thread
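For readers unfamiliar with the pattern that patches 08/10 and 09/10 touch, here is a cut-down sketch of the "X-macro" technique. It is not the real fsck.c table: the list is shortened to the two FATAL entries visible in the diff plus one entry (EXAMPLE_WARNING) that is made up for illustration, and the info struct is reduced to two members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum fsck_msg_type {
	FSCK_INFO = -2, FSCK_FATAL = -1,
	FSCK_ERROR = 1, FSCK_WARN = 2, FSCK_IGNORE = 3
};

/* One list of (id, default type) pairs; EXAMPLE_WARNING is hypothetical. */
#define FOREACH_FSCK_MSG_ID(FUNC) \
	FUNC(NUL_IN_HEADER, FATAL) \
	FUNC(UNTERMINATED_HEADER, FATAL) \
	FUNC(EXAMPLE_WARNING, WARN)

/* First expansion: generate the enum of message ids. */
#define MSG_ID(id, msg_type) FSCK_MSG_##id,
enum fsck_msg_id {
	FOREACH_FSCK_MSG_ID(MSG_ID)
	FSCK_MSG_MAX
};
#undef MSG_ID

/* Second expansion: generate a table indexed by that enum. */
#define STR(x) #x
#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
static const struct {
	const char *id_string;
	enum fsck_msg_type msg_type;
} msg_id_info[FSCK_MSG_MAX + 1] = {
	FOREACH_FSCK_MSG_ID(MSG_ID)
	{ NULL, FSCK_FATAL } /* sentinel entry */
};
#undef MSG_ID
#undef STR

/* Both helper macros are temporary; neither should leak past this point. */
#ifdef STR
#error "STR should have been undefined after the table"
#endif
```

The list macro itself has to stay defined because it is expanded more than once, which is why it benefits from a collision-resistant name, while the short helper macros are defined and undefined around each use.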
* [PATCH v2 10/10] fsck.h: update FSCK_OPTIONS_* for object_name 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason ` (10 preceding siblings ...) 2021-02-18 10:58 ` [PATCH v2 09/10] fsck.c: give "FOREACH_MSG_ID" a more specific name Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 ` Ævar Arnfjörð Bjarmason 2021-02-18 19:56 ` Jeff King 2021-02-18 22:32 ` Junio C Hamano 11 siblings, 2 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 10:58 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Add the object_name member to the initialization macro. This was omitted in 7b35efd734e (fsck_walk(): optionally name objects on the go, 2016-07-17) when the field was added. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fsck.h b/fsck.h index c77e8ddf10b..5d44ff1c8e3 100644 --- a/fsck.h +++ b/fsck.h @@ -47,8 +47,8 @@ struct fsck_options { kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } /* descend in all linked child objects * the return value is: -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH v2 10/10] fsck.h: update FSCK_OPTIONS_* for object_name 2021-02-18 10:58 ` [PATCH v2 10/10] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason @ 2021-02-18 19:56 ` Jeff King 2021-02-18 22:33 ` Junio C Hamano 2021-02-18 22:32 ` Junio C Hamano 1 sibling, 1 reply; 229+ messages in thread
From: Jeff King @ 2021-02-18 19:56 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: git, Junio C Hamano, Johannes Schindelin, Jonathan Tan

On Thu, Feb 18, 2021 at 11:58:40AM +0100, Ævar Arnfjörð Bjarmason wrote:

> Add the object_name member to the initialization macro. This was
> omitted in 7b35efd734e (fsck_walk(): optionally name objects on the
> go, 2016-07-17) when the field was added.

We're correct either way here, because trailing fields that are not
initialized will get the usual zero-initialization. But I don't mind
trying to be more complete.

That said, we have embraced designated initializers these days, in which
case we usually omit the NULL ones. So perhaps:

	#define FSCK_OPTIONS_DEFAULT { \
		.error_func = fsck_error_function, \
		.skiplist = OIDSET_INIT, \
	}
	#define FSCK_OPTIONS_STRICT { \
		.error_func = fsck_error_function, \
		.skiplist = OIDSET_INIT, \
		.strict = 1, \
	}

would be more readable still?

-Peff

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v2 10/10] fsck.h: update FSCK_OPTIONS_* for object_name 2021-02-18 19:56 ` Jeff King @ 2021-02-18 22:33 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread
From: Junio C Hamano @ 2021-02-18 22:33 UTC (permalink / raw)
To: Jeff King
Cc: Ævar Arnfjörð Bjarmason, git, Johannes Schindelin, Jonathan Tan

Jeff King <peff@peff.net> writes:

> On Thu, Feb 18, 2021 at 11:58:40AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> Add the object_name member to the initialization macro. This was
>> omitted in 7b35efd734e (fsck_walk(): optionally name objects on the
>> go, 2016-07-17) when the field was added.
>
> We're correct either way here, because trailing fields that are not
> initialized will get the usual zero-initialization. But I don't mind
> trying to be more complete.
>
> That said, we have embraced designated initializers these days, in which
> case we usually omit the NULL ones. So perhaps:
>
>	#define FSCK_OPTIONS_DEFAULT { \
>		.error_func = fsck_error_function, \
>		.skiplist = OIDSET_INIT, \
>	}
>	#define FSCK_OPTIONS_STRICT { \
>		.error_func = fsck_error_function, \
>		.skiplist = OIDSET_INIT, \
>		.strict = 1, \
>	}
>
> would be more readable still?

Ahh, I should probably have read your reviews first before reading
patches myself ;-)

Thanks.

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v2 10/10] fsck.h: update FSCK_OPTIONS_* for object_name 2021-02-18 10:58 ` [PATCH v2 10/10] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason 2021-02-18 19:56 ` Jeff King @ 2021-02-18 22:32 ` Junio C Hamano 1 sibling, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-02-18 22:32 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Add the object_name member to the initialization macro. This was > omitted in 7b35efd734e (fsck_walk(): optionally name objects on the > go, 2016-07-17) when the field was added. This is more of a Meh to me. If this were to change us to designated initializers and omit NULL and 0 initialization, it would be more interesting. Thanks. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > fsck.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fsck.h b/fsck.h > index c77e8ddf10b..5d44ff1c8e3 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -47,8 +47,8 @@ struct fsck_options { > kh_oid_map_t *object_names; > }; > > -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } > -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT } > +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } > +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } > > /* descend in all linked child objects > * the return value is: ^ permalink raw reply [flat|nested] 229+ messages in thread
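The designated-initializer form that both reviewers lean toward can be tried out in isolation with the sketch below. Everything other than the member names of struct fsck_options is a stand-in so that the example compiles on its own: struct oidset, OIDSET_INIT, the void pointers and the callback body are simplified assumptions, not git's real definitions. The sketch assigns fsck_error_function to the error_func member, which is the member of type fsck_error in the struct shown in patch 06/10.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, not git's real types. */
struct oidset { int placeholder; };
#define OIDSET_INIT { 0 }

enum fsck_msg_type {
	FSCK_INFO = -2, FSCK_FATAL = -1,
	FSCK_ERROR = 1, FSCK_WARN = 2, FSCK_IGNORE = 3
};

struct fsck_options;
typedef int (*fsck_walk_func)(void *obj, int object_type,
			      void *data, struct fsck_options *options);
typedef int (*fsck_error)(struct fsck_options *o, const void *oid,
			  int object_type, enum fsck_msg_type msg_type,
			  const char *message);

/* Member names and order follow the struct quoted in the thread. */
struct fsck_options {
	fsck_walk_func walk;
	fsck_error error_func;
	unsigned strict:1;
	enum fsck_msg_type *msg_type;
	struct oidset skiplist;
	void *object_names; /* kh_oid_map_t * in git */
};

/* Placeholder body; the real function prints a warning or an error. */
static int fsck_error_function(struct fsck_options *o, const void *oid,
			       int object_type, enum fsck_msg_type msg_type,
			       const char *message)
{
	(void)o; (void)oid; (void)object_type; (void)message;
	return msg_type == FSCK_WARN ? 0 : 1;
}

/* Members that are not named are zero-initialized by the language. */
#define FSCK_OPTIONS_DEFAULT { \
	.error_func = fsck_error_function, \
	.skiplist = OIDSET_INIT, \
}
#define FSCK_OPTIONS_STRICT { \
	.error_func = fsck_error_function, \
	.skiplist = OIDSET_INIT, \
	.strict = 1, \
}

static struct fsck_options default_options = FSCK_OPTIONS_DEFAULT;
static struct fsck_options strict_options = FSCK_OPTIONS_STRICT;
```

With this form, adding a new trailing member such as object_names needs no change to either macro, which is the maintenance benefit the positional form in the patch lacks.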
* [PATCH 01/14] fsck.h: indent arguments to of fsck_set_msg_type 2021-02-17 1:48 ` Ævar Arnfjörð Bjarmason 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 ` Ævar Arnfjörð Bjarmason 2021-02-17 19:42 ` [PATCH 02/14] fsck.h: use use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason ` (13 subsequent siblings) 15 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fsck.h b/fsck.h index 423c467feb7..df0b64a2163 100644 --- a/fsck.h +++ b/fsck.h @@ -11,7 +11,7 @@ struct fsck_options; struct object; void fsck_set_msg_type(struct fsck_options *options, - const char *msg_id, const char *msg_type); + const char *msg_id, const char *msg_type); void fsck_set_msg_types(struct fsck_options *options, const char *values); int is_valid_msg_type(const char *msg_id, const char *msg_type); -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH 02/14] fsck.h: use use "enum object_type" instead of "int" 2021-02-17 1:48 ` Ævar Arnfjörð Bjarmason 2021-02-17 19:42 ` [PATCH 00/14] fsck: API improvements Ævar Arnfjörð Bjarmason 2021-02-17 19:42 ` [PATCH 01/14] fsck.h: indent arguments to of fsck_set_msg_type Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 ` Ævar Arnfjörð Bjarmason 2021-02-17 23:40 ` Junio C Hamano 2021-02-17 19:42 ` [PATCH 03/14] fsck.c: rename variables in fsck_set_msg_type() for less confusion Ævar Arnfjörð Bjarmason ` (12 subsequent siblings) 15 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in 355885d5315 (add generic, type aware object chain walker, 2008-02-25) we've used entries from object_type (OBJ_BLOB etc.). So this doesn't really change anything as far as the generated code is concerned, it just gives the compiler more information and makes this easier to read. 
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/fsck.c | 3 ++- builtin/index-pack.c | 3 ++- builtin/unpack-objects.c | 3 ++- fsck.h | 3 ++- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 821e7798c70..68f0329e69e 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -197,7 +197,8 @@ static int traverse_reachable(void) return !!result; } -static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_used(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options) { if (!obj) return 1; diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 54f74c48741..2f291a14d4a 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -212,7 +212,8 @@ static void cleanup_thread(void) free(thread_data); } -static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options) +static int mark_link(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { if (!obj) return -1; diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index dd4a75e030d..ca54fd16688 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -187,7 +187,8 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf) * that have reachability requirements and calls this function. * Verify its reachability and validity recursively and write it out. 
*/ -static int check_object(struct object *obj, int type, void *data, struct fsck_options *options) +static int check_object(struct object *obj, enum object_type type, + void *data, struct fsck_options *options) { struct obj_buffer *obj_buf; diff --git a/fsck.h b/fsck.h index df0b64a2163..0c75789d219 100644 --- a/fsck.h +++ b/fsck.h @@ -23,7 +23,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type); * <0 error signaled and abort * >0 error signaled and do not abort */ -typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options); +typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, + void *data, struct fsck_options *options); /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH 02/14] fsck.h: use use "enum object_type" instead of "int" 2021-02-17 19:42 ` [PATCH 02/14] fsck.h: use use "enum object_type" instead of "int" Ævar Arnfjörð Bjarmason @ 2021-02-17 23:40 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-02-17 23:40 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Johannes Schindelin, Jonathan Tan Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Subject: Re: [PATCH 02/14] fsck.h: use use "enum object_type" instead of "int" use use. > Change the fsck_walk_func to use an "enum object_type" instead of an > "int" type. The types are compatible, and ever since this was added in > 355885d5315 (add generic, type aware object chain walker, 2008-02-25) > we've used entries from object_type (OBJ_BLOB etc.). > > So this doesn't really change anything as far as the generated code is > concerned, it just gives the compiler more information and makes this > easier to read. Yup, as long as we won't trick the compiler into complaining "ah, but you are not covering OBJ_OFS_DELTA or OBJ_BAD values in your switch statement", I think a change like this is a good thing. 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > builtin/fsck.c | 3 ++- > builtin/index-pack.c | 3 ++- > builtin/unpack-objects.c | 3 ++- > fsck.h | 3 ++- > 4 files changed, 8 insertions(+), 4 deletions(-) > > diff --git a/builtin/fsck.c b/builtin/fsck.c > index 821e7798c70..68f0329e69e 100644 > --- a/builtin/fsck.c > +++ b/builtin/fsck.c > @@ -197,7 +197,8 @@ static int traverse_reachable(void) > return !!result; > } > > -static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options) > +static int mark_used(struct object *obj, enum object_type object_type, > + void *data, struct fsck_options *options) > { > if (!obj) > return 1; > diff --git a/builtin/index-pack.c b/builtin/index-pack.c > index 54f74c48741..2f291a14d4a 100644 > --- a/builtin/index-pack.c > +++ b/builtin/index-pack.c > @@ -212,7 +212,8 @@ static void cleanup_thread(void) > free(thread_data); > } > > -static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options) > +static int mark_link(struct object *obj, enum object_type type, > + void *data, struct fsck_options *options) > { > if (!obj) > return -1; > diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c > index dd4a75e030d..ca54fd16688 100644 > --- a/builtin/unpack-objects.c > +++ b/builtin/unpack-objects.c > @@ -187,7 +187,8 @@ static void write_cached_object(struct object *obj, struct obj_buffer *obj_buf) > * that have reachability requirements and calls this function. > * Verify its reachability and validity recursively and write it out. 
> */ > -static int check_object(struct object *obj, int type, void *data, struct fsck_options *options) > +static int check_object(struct object *obj, enum object_type type, > + void *data, struct fsck_options *options) > { > struct obj_buffer *obj_buf; > > diff --git a/fsck.h b/fsck.h > index df0b64a2163..0c75789d219 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -23,7 +23,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type); > * <0 error signaled and abort > * >0 error signaled and do not abort > */ > -typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct fsck_options *options); > +typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type, > + void *data, struct fsck_options *options); > > /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ > typedef int (*fsck_error)(struct fsck_options *o, ^ permalink raw reply [flat|nested] 229+ messages in thread
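Junio's caveat about switch statements can be illustrated with a small callback written against the new signature. This is a sketch under assumptions: the enum is a reduced stand-in for git's object_type (values as I understand them upstream, with several enumerators omitted), struct object is a dummy, and count_by_type() is a hypothetical callback. The return values follow the convention in the fsck.h comment quoted in the diff (0 for OK, negative to abort, positive for an error that does not abort).

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for git's enum object_type. */
enum object_type {
	OBJ_BAD = -1,
	OBJ_NONE = 0,
	OBJ_COMMIT = 1,
	OBJ_TREE = 2,
	OBJ_BLOB = 3,
	OBJ_TAG = 4,
	OBJ_OFS_DELTA = 6,
	OBJ_REF_DELTA = 7
};

struct object { int flags; };	/* dummy; the real struct differs */
struct fsck_options;

typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type,
			      void *data, struct fsck_options *options);

/*
 * Hypothetical callback: counts objects per type in an int array
 * passed through "data". The "default:" arm keeps compilers that
 * check enum coverage in switch statements from complaining about
 * the enumerators that are deliberately not handled.
 */
static int count_by_type(struct object *obj, enum object_type object_type,
			 void *data, struct fsck_options *options)
{
	int *counts = data;

	(void)options;
	if (!obj)
		return 1;	/* error signaled, do not abort */

	switch (object_type) {
	case OBJ_COMMIT:
	case OBJ_TREE:
	case OBJ_BLOB:
	case OBJ_TAG:
		counts[object_type]++;
		return 0;
	default:
		return -1;	/* error signaled, abort */
	}
}
```

Because the parameter is now an enum rather than an int, a switch without a default arm may draw a coverage warning on some compiler settings; the default arm above sidesteps that while still rejecting delta and bad types.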
* [PATCH 03/14] fsck.c: rename variables in fsck_set_msg_type() for less confusion
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Rename variables in a function added in 0282f4dced0 (fsck: offer a
function to demote fsck errors to warnings, 2015-06-22).

It was needlessly confusing that it took a "msg_type" argument, but
then later declared another "msg_type" of a different type.

Let's rename that to "tmp", and rename "id" to "msg_id" and "msg_id"
to "msg_id_str" etc. This will make a follow-up change smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 fsck.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/fsck.c b/fsck.c
index 4b7f0b73d73..acccad243ec 100644
--- a/fsck.c
+++ b/fsck.c
@@ -203,27 +203,27 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type)
 }

 void fsck_set_msg_type(struct fsck_options *options,
-		const char *msg_id, const char *msg_type)
+		const char *msg_id_str, const char *msg_type_str)
 {
-	int id = parse_msg_id(msg_id), type;
+	int msg_id = parse_msg_id(msg_id_str), msg_type;

-	if (id < 0)
-		die("Unhandled message id: %s", msg_id);
-	type = parse_msg_type(msg_type);
+	if (msg_id < 0)
+		die("Unhandled message id: %s", msg_id_str);
+	msg_type = parse_msg_type(msg_type_str);

-	if (type != FSCK_ERROR && msg_id_info[id].msg_type == FSCK_FATAL)
-		die("Cannot demote %s to %s", msg_id, msg_type);
+	if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL)
+		die("Cannot demote %s to %s", msg_id_str, msg_type_str);

 	if (!options->msg_type) {
 		int i;
-		int *msg_type;
-		ALLOC_ARRAY(msg_type, FSCK_MSG_MAX);
+		int *tmp;
+		ALLOC_ARRAY(tmp, FSCK_MSG_MAX);
 		for (i = 0; i < FSCK_MSG_MAX; i++)
-			msg_type[i] = fsck_msg_type(i, options);
-		options->msg_type = msg_type;
+			tmp[i] = fsck_msg_type(i, options);
+		options->msg_type = tmp;
 	}

-	options->msg_type[id] = type;
+	options->msg_type[msg_id] = msg_type;
 }

 void fsck_set_msg_types(struct fsck_options *options, const char *values)
-- 
2.30.0.284.gd98b1dd5eaa7

* [PATCH 04/14] fsck.c: move definition of msg_id into append_msg_id()
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Refactor code added in 71ab8fa840f (fsck: report the ID of the
error/warning, 2015-06-22) to resolve the msg_id to a string in the
function that wants it, instead of doing it in report().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 fsck.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index acccad243ec..1070071ffec 100644
--- a/fsck.c
+++ b/fsck.c
@@ -264,8 +264,9 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 	free(to_free);
 }

-static void append_msg_id(struct strbuf *sb, const char *msg_id)
+static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id)
 {
+	const char *msg_id = msg_id_info[id].id_string;
 	for (;;) {
 		char c = *(msg_id)++;

@@ -308,7 +309,7 @@ static int report(struct fsck_options *options,
 	else if (msg_type == FSCK_INFO)
 		msg_type = FSCK_WARN;

-	append_msg_id(&sb, msg_id_info[id].id_string);
+	append_msg_id(&sb, id);

 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
-- 
2.30.0.284.gd98b1dd5eaa7

* [PATCH 05/14] fsck.c: rename remaining fsck_msg_id "id" to "msg_id"
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Rename the remaining variables of type fsck_msg_id from "id" to
"msg_id". This change is relatively small, and is worth the churn for
a later change where we have different id's in the "report" function.
---
 fsck.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 1070071ffec..dbb6f7c4ee2 100644
--- a/fsck.c
+++ b/fsck.c
@@ -264,19 +264,19 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 	free(to_free);
 }

-static void append_msg_id(struct strbuf *sb, enum fsck_msg_id id)
+static void append_msg_id(struct strbuf *sb, enum fsck_msg_id msg_id)
 {
-	const char *msg_id = msg_id_info[id].id_string;
+	const char *msg_id_str = msg_id_info[msg_id].id_string;
 	for (;;) {
-		char c = *(msg_id)++;
+		char c = *(msg_id_str)++;

 		if (!c)
 			break;
 		if (c != '_')
 			strbuf_addch(sb, tolower(c));
 		else {
-			assert(*msg_id);
-			strbuf_addch(sb, *(msg_id)++);
+			assert(*msg_id_str);
+			strbuf_addch(sb, *(msg_id_str)++);
 		}
 	}

@@ -292,11 +292,11 @@ static int object_on_skiplist(struct fsck_options *opts,
 __attribute__((format (printf, 5, 6)))
 static int report(struct fsck_options *options,
 		  const struct object_id *oid, enum object_type object_type,
-		  enum fsck_msg_id id, const char *fmt, ...)
+		  enum fsck_msg_id msg_id, const char *fmt, ...)
 {
 	va_list ap;
 	struct strbuf sb = STRBUF_INIT;
-	int msg_type = fsck_msg_type(id, options), result;
+	int msg_type = fsck_msg_type(msg_id, options), result;

 	if (msg_type == FSCK_IGNORE)
 		return 0;
@@ -309,7 +309,7 @@ static int report(struct fsck_options *options,
 	else if (msg_type == FSCK_INFO)
 		msg_type = FSCK_WARN;

-	append_msg_id(&sb, id);
+	append_msg_id(&sb, msg_id);

 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
-- 
2.30.0.284.gd98b1dd5eaa7

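An aside on what the loop in append_msg_id() does: each character of the upper-case id is lowercased, except that an underscore is dropped and the character following it is copied unchanged, so BAD_DATE becomes badDate. The following is only a minimal standalone sketch of that conversion, assuming a caller-supplied char buffer in place of git's strbuf; msg_id_to_camel() is a hypothetical name used for illustration and is not a function in git.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): convert an upper-case,
 * underscore-separated id such as "BAD_DATE" to "badDate", using the
 * same per-character rule as the loop in append_msg_id().
 * Output is truncated to fit outlen; returns bytes written, excluding NUL.
 */
static size_t msg_id_to_camel(const char *msg_id_str, char *out, size_t outlen)
{
	size_t n = 0;

	while (*msg_id_str && n + 1 < outlen) {
		char c = *msg_id_str++;

		if (c != '_') {
			out[n++] = (char)tolower((unsigned char)c);
		} else {
			/* An id must not end in an underscore. */
			assert(*msg_id_str);
			/* Copy the character after '_' as-is (still upper case). */
			out[n++] = *msg_id_str++;
		}
	}
	if (outlen)
		out[n] = '\0';
	return n;
}
```

Calling msg_id_to_camel("MISSING_NAME_BEFORE_EMAIL", buf, sizeof(buf)) with a sufficiently large buf leaves "missingNameBeforeEmail" in buf.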
* [PATCH 06/14] fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new
fsck_msg_type enum.

These defines were originally introduced in:

 - ba002f3b28a (builtin-fsck: move common object checking code to
   fsck.c, 2008-02-25)
 - f50c4407305 (fsck: disallow demoting grave fsck errors to warnings,
   2015-06-22)
 - efaba7cc77f (fsck: optionally ignore specific fsck issues
   completely, 2015-06-22)
 - f27d05b1704 (fsck: allow upgrading fsck warnings to errors,
   2015-06-22)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/fsck.c  |  2 +-
 builtin/mktag.c |  3 ++-
 fsck.c          | 21 ++++++++++-----------
 fsck.h          | 17 ++++++++++-------
 4 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 68f0329e69e..d6d745dc702 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -89,7 +89,7 @@ static int objerror(struct object *obj, const char *err)
 static int fsck_error_func(struct fsck_options *o,
 			   const struct object_id *oid,
 			   enum object_type object_type,
-			   int msg_type, const char *message)
+			   enum fsck_msg_type msg_type, const char *message)
 {
 	switch (msg_type) {
 	case FSCK_WARN:
diff --git a/builtin/mktag.c b/builtin/mktag.c
index 41a399a69e4..1834394a9b6 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -22,7 +22,8 @@ static int mktag_config(const char *var, const char *value, void *cb)
 static int mktag_fsck_error_func(struct fsck_options *o,
 				 const struct object_id *oid,
 				 enum object_type object_type,
-				 int msg_type, const char *message)
+				 enum fsck_msg_type msg_type,
+				 const char *message)
 {
 	switch (msg_type) {
 	case FSCK_WARN:
diff --git a/fsck.c b/fsck.c
index dbb6f7c4ee2..00e0fef21ca 100644
--- a/fsck.c
+++ b/fsck.c
@@ -22,9 +22,6 @@
 static struct oidset gitmodules_found = OIDSET_INIT;
 static struct oidset gitmodules_done = OIDSET_INIT;

-#define FSCK_FATAL -1
-#define FSCK_INFO -2
-
 #define FOREACH_MSG_ID(FUNC) \
 	/* fatal errors */ \
 	FUNC(NUL_IN_HEADER, FATAL) \
@@ -97,7 +94,7 @@ static struct {
 	const char *id_string;
 	const char *downcased;
 	const char *camelcased;
-	int msg_type;
+	enum fsck_msg_type msg_type;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
 	FOREACH_MSG_ID(MSG_ID)
 	{ NULL, NULL, NULL, -1 }
@@ -164,10 +161,10 @@ void list_config_fsck_msg_ids(struct string_list *list, const char *prefix)
 		list_config_item(list, prefix, msg_id_info[i].camelcased);
 }

-static int fsck_msg_type(enum fsck_msg_id msg_id,
+static enum fsck_msg_type fsck_msg_type(enum fsck_msg_id msg_id,
 	struct fsck_options *options)
 {
-	int msg_type;
+	enum fsck_msg_type msg_type;

 	assert(msg_id >= 0 && msg_id < FSCK_MSG_MAX);

@@ -182,7 +179,7 @@ static int fsck_msg_type(enum fsck_msg_id msg_id,
 	return msg_type;
 }

-static int parse_msg_type(const char *str)
+static enum fsck_msg_type parse_msg_type(const char *str)
 {
 	if (!strcmp(str, "error"))
 		return FSCK_ERROR;
@@ -205,7 +202,8 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type)
 void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id_str, const char *msg_type_str)
 {
-	int msg_id = parse_msg_id(msg_id_str), msg_type;
+	int msg_id = parse_msg_id(msg_id_str);
+	enum fsck_msg_type msg_type;

 	if (msg_id < 0)
 		die("Unhandled message id: %s", msg_id_str);
@@ -216,7 +214,7 @@ void fsck_set_msg_type(struct fsck_options *options,

 	if (!options->msg_type) {
 		int i;
-		int *tmp;
+		enum fsck_msg_type *tmp;
 		ALLOC_ARRAY(tmp, FSCK_MSG_MAX);
 		for (i = 0; i < FSCK_MSG_MAX; i++)
 			tmp[i] = fsck_msg_type(i, options);
@@ -296,7 +294,8 @@ static int report(struct fsck_options *options,
 {
 	va_list ap;
 	struct strbuf sb = STRBUF_INIT;
-	int msg_type = fsck_msg_type(msg_id, options), result;
+	enum fsck_msg_type msg_type = fsck_msg_type(msg_id, options);
+	int result;

 	if (msg_type == FSCK_IGNORE)
 		return 0;
@@ -1262,7 +1261,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 int fsck_error_function(struct fsck_options *o,
 			const struct object_id *oid,
 			enum object_type object_type,
-			int msg_type, const char *message)
+			enum fsck_msg_type msg_type, const char *message)
 {
 	if (msg_type == FSCK_WARN) {
 		warning("object %s: %s", fsck_describe_object(o, oid), message);
diff --git a/fsck.h b/fsck.h
index 0c75789d219..c77e8ddf10b 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,10 +3,13 @@

 #include "oidset.h"

-#define FSCK_ERROR 1
-#define FSCK_WARN 2
-#define FSCK_IGNORE 3
-
+enum fsck_msg_type {
+	FSCK_INFO = -2,
+	FSCK_FATAL = -1,
+	FSCK_ERROR = 1,
+	FSCK_WARN,
+	FSCK_IGNORE
+};
 struct fsck_options;
 struct object;

@@ -29,17 +32,17 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type,
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct fsck_options *o,
 			  const struct object_id *oid, enum object_type object_type,
-			  int msg_type, const char *message);
+			  enum fsck_msg_type msg_type, const char *message);

 int fsck_error_function(struct fsck_options *o,
 			const struct object_id *oid, enum object_type object_type,
-			int msg_type, const char *message);
+			enum fsck_msg_type msg_type, const char *message);

 struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
-	int *msg_type;
+	enum fsck_msg_type *msg_type;
 	struct oidset skiplist;
 	kh_oid_map_t *object_names;
 };
-- 
2.30.0.284.gd98b1dd5eaa7

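A note on why the enum is more than cosmetic: the numeric values stay the same as the old defines (FSCK_WARN is still 2, FSCK_IGNORE still 3), but a switch over an enum lets compilers that support it (for example with -Wswitch) flag a missing case. A minimal sketch follows; the enum is copied from the fsck.h hunk, while msg_type_name() is a hypothetical helper written only for this illustration and does not exist in git.

```c
#include <assert.h>
#include <string.h>

/* Copied from the fsck.h hunk in this patch. */
enum fsck_msg_type {
	FSCK_INFO = -2,
	FSCK_FATAL = -1,
	FSCK_ERROR = 1,
	FSCK_WARN,
	FSCK_IGNORE
};

/*
 * Hypothetical helper (not in git): because the argument is an enum
 * rather than an int, dropping one of these cases can be diagnosed
 * at compile time.
 */
static const char *msg_type_name(enum fsck_msg_type msg_type)
{
	switch (msg_type) {
	case FSCK_INFO:
		return "info";
	case FSCK_FATAL:
		return "fatal";
	case FSCK_ERROR:
		return "error";
	case FSCK_WARN:
		return "warn";
	case FSCK_IGNORE:
		return "ignore";
	}
	/* Only reached for values outside the enum. */
	return "unknown";
}
```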
* [PATCH 07/14] fsck.c: call parse_msg_type() early in fsck_set_msg_type()
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

There's no reason to defer the calling of parse_msg_type() until after
we've checked if the "id < 0". This is not a hot codepath, and
parse_msg_type() itself may die on invalid input.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 fsck.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 00e0fef21ca..7c53080ad48 100644
--- a/fsck.c
+++ b/fsck.c
@@ -203,11 +203,10 @@ void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id_str, const char *msg_type_str)
 {
 	int msg_id = parse_msg_id(msg_id_str);
-	enum fsck_msg_type msg_type;
+	enum fsck_msg_type msg_type = parse_msg_type(msg_type_str);

 	if (msg_id < 0)
 		die("Unhandled message id: %s", msg_id_str);
-	msg_type = parse_msg_type(msg_type_str);

 	if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL)
 		die("Cannot demote %s to %s", msg_id_str, msg_type_str);
-- 
2.30.0.284.gd98b1dd5eaa7

* [PATCH 08/14] fsck.c: undefine temporary STR macro after use
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

In f417eed8cde (fsck: provide a function to parse fsck message IDs,
2015-06-22) the "STR" macro was introduced, but that short macro name
was not undefined after use as was done earlier in the same series for
the MSG_ID macro in c99ba492f1c (fsck: introduce identifiers for fsck
messages, 2015-06-22).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 fsck.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fsck.c b/fsck.c
index 7c53080ad48..88884e91c89 100644
--- a/fsck.c
+++ b/fsck.c
@@ -100,6 +100,7 @@ static struct {
 	{ NULL, NULL, NULL, -1 }
 };
 #undef MSG_ID
+#undef STR

 static void prepare_msg_ids(void)
 {
-- 
2.30.0.284.gd98b1dd5eaa7

* [PATCH 09/14] fsck.c: give "FOREACH_MSG_ID" a more specific name
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation
for moving it over to fsck.h. It's good convention to name macros in
*.h files in such a way as to clearly not clash with any other names
in other files.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 fsck.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fsck.c b/fsck.c
index 88884e91c89..1730acd698d 100644
--- a/fsck.c
+++ b/fsck.c
@@ -22,7 +22,7 @@
 static struct oidset gitmodules_found = OIDSET_INIT;
 static struct oidset gitmodules_done = OIDSET_INIT;

-#define FOREACH_MSG_ID(FUNC) \
+#define FOREACH_FSCK_MSG_ID(FUNC) \
 	/* fatal errors */ \
 	FUNC(NUL_IN_HEADER, FATAL) \
 	FUNC(UNTERMINATED_HEADER, FATAL) \
@@ -83,7 +83,7 @@ static struct oidset gitmodules_done = OIDSET_INIT;

 #define MSG_ID(id, msg_type) FSCK_MSG_##id,
 enum fsck_msg_id {
-	FOREACH_MSG_ID(MSG_ID)
+	FOREACH_FSCK_MSG_ID(MSG_ID)
 	FSCK_MSG_MAX
 };
 #undef MSG_ID
@@ -96,7 +96,7 @@ static struct {
 	const char *camelcased;
 	enum fsck_msg_type msg_type;
 } msg_id_info[FSCK_MSG_MAX + 1] = {
-	FOREACH_MSG_ID(MSG_ID)
+	FOREACH_FSCK_MSG_ID(MSG_ID)
 	{ NULL, NULL, NULL, -1 }
 };
 #undef MSG_ID
-- 
2.30.0.284.gd98b1dd5eaa7

* [PATCH 10/14] fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps
define from fsck.c to fsck.h. This is in preparation for having
non-static functions take the fsck_msg_id as an argument.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 fsck.c | 66 ---------------------------------------------------------
 fsck.h | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+), 66 deletions(-)

diff --git a/fsck.c b/fsck.c
index 1730acd698d..980ef2cb8fa 100644
--- a/fsck.c
+++ b/fsck.c
@@ -22,72 +22,6 @@
 static struct oidset gitmodules_found = OIDSET_INIT;
 static struct oidset gitmodules_done = OIDSET_INIT;

-#define FOREACH_FSCK_MSG_ID(FUNC) \
-	/* fatal errors */ \
-	FUNC(NUL_IN_HEADER, FATAL) \
-	FUNC(UNTERMINATED_HEADER, FATAL) \
-	/* errors */ \
-	FUNC(BAD_DATE, ERROR) \
-	FUNC(BAD_DATE_OVERFLOW, ERROR) \
-	FUNC(BAD_EMAIL, ERROR) \
-	FUNC(BAD_NAME, ERROR) \
-	FUNC(BAD_OBJECT_SHA1, ERROR) \
-	FUNC(BAD_PARENT_SHA1, ERROR) \
-	FUNC(BAD_TAG_OBJECT, ERROR) \
-	FUNC(BAD_TIMEZONE, ERROR) \
-	FUNC(BAD_TREE, ERROR) \
-	FUNC(BAD_TREE_SHA1, ERROR) \
-	FUNC(BAD_TYPE, ERROR) \
-	FUNC(DUPLICATE_ENTRIES, ERROR) \
-	FUNC(MISSING_AUTHOR, ERROR) \
-	FUNC(MISSING_COMMITTER, ERROR) \
-	FUNC(MISSING_EMAIL, ERROR) \
-	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
-	FUNC(MISSING_OBJECT, ERROR) \
-	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
-	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
-	FUNC(MISSING_TAG, ERROR) \
-	FUNC(MISSING_TAG_ENTRY, ERROR) \
-	FUNC(MISSING_TREE, ERROR) \
-	FUNC(MISSING_TREE_OBJECT, ERROR) \
-	FUNC(MISSING_TYPE, ERROR) \
-	FUNC(MISSING_TYPE_ENTRY, ERROR) \
-	FUNC(MULTIPLE_AUTHORS, ERROR) \
-	FUNC(TREE_NOT_SORTED, ERROR) \
-	FUNC(UNKNOWN_TYPE, ERROR) \
-	FUNC(ZERO_PADDED_DATE, ERROR) \
-	FUNC(GITMODULES_MISSING, ERROR) \
-	FUNC(GITMODULES_BLOB, ERROR) \
-	FUNC(GITMODULES_LARGE, ERROR) \
-	FUNC(GITMODULES_NAME, ERROR) \
-	FUNC(GITMODULES_SYMLINK, ERROR) \
-	FUNC(GITMODULES_URL, ERROR) \
-	FUNC(GITMODULES_PATH, ERROR) \
-	FUNC(GITMODULES_UPDATE, ERROR) \
-	/* warnings */ \
-	FUNC(BAD_FILEMODE, WARN) \
-	FUNC(EMPTY_NAME, WARN) \
-	FUNC(FULL_PATHNAME, WARN) \
-	FUNC(HAS_DOT, WARN) \
-	FUNC(HAS_DOTDOT, WARN) \
-	FUNC(HAS_DOTGIT, WARN) \
-	FUNC(NULL_SHA1, WARN) \
-	FUNC(ZERO_PADDED_FILEMODE, WARN) \
-	FUNC(NUL_IN_COMMIT, WARN) \
-	/* infos (reported as warnings, but ignored by default) */ \
-	FUNC(GITMODULES_PARSE, INFO) \
-	FUNC(BAD_TAG_NAME, INFO) \
-	FUNC(MISSING_TAGGER_ENTRY, INFO) \
-	/* ignored (elevated when requested) */ \
-	FUNC(EXTRA_HEADER_ENTRY, IGNORE)
-
-#define MSG_ID(id, msg_type) FSCK_MSG_##id,
-enum fsck_msg_id {
-	FOREACH_FSCK_MSG_ID(MSG_ID)
-	FSCK_MSG_MAX
-};
-#undef MSG_ID
-
 #define STR(x) #x
 #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type },
 static struct {
diff --git a/fsck.h b/fsck.h
index c77e8ddf10b..b4c53aaa08c 100644
--- a/fsck.h
+++ b/fsck.h
@@ -10,6 +10,73 @@ enum fsck_msg_type {
 	FSCK_WARN,
 	FSCK_IGNORE
 };
+
+#define FOREACH_FSCK_MSG_ID(FUNC) \
+	/* fatal errors */ \
+	FUNC(NUL_IN_HEADER, FATAL) \
+	FUNC(UNTERMINATED_HEADER, FATAL) \
+	/* errors */ \
+	FUNC(BAD_DATE, ERROR) \
+	FUNC(BAD_DATE_OVERFLOW, ERROR) \
+	FUNC(BAD_EMAIL, ERROR) \
+	FUNC(BAD_NAME, ERROR) \
+	FUNC(BAD_OBJECT_SHA1, ERROR) \
+	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_TAG_OBJECT, ERROR) \
+	FUNC(BAD_TIMEZONE, ERROR) \
+	FUNC(BAD_TREE, ERROR) \
+	FUNC(BAD_TREE_SHA1, ERROR) \
+	FUNC(BAD_TYPE, ERROR) \
+	FUNC(DUPLICATE_ENTRIES, ERROR) \
+	FUNC(MISSING_AUTHOR, ERROR) \
+	FUNC(MISSING_COMMITTER, ERROR) \
+	FUNC(MISSING_EMAIL, ERROR) \
+	FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_OBJECT, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \
+	FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \
+	FUNC(MISSING_TAG, ERROR) \
+	FUNC(MISSING_TAG_ENTRY, ERROR) \
+	FUNC(MISSING_TREE, ERROR) \
+	FUNC(MISSING_TREE_OBJECT, ERROR) \
+	FUNC(MISSING_TYPE, ERROR) \
+	FUNC(MISSING_TYPE_ENTRY, ERROR) \
+	FUNC(MULTIPLE_AUTHORS, ERROR) \
+	FUNC(TREE_NOT_SORTED, ERROR) \
+	FUNC(UNKNOWN_TYPE, ERROR) \
+	FUNC(ZERO_PADDED_DATE, ERROR) \
+	FUNC(GITMODULES_MISSING, ERROR) \
+	FUNC(GITMODULES_BLOB, ERROR) \
+	FUNC(GITMODULES_LARGE, ERROR) \
+	FUNC(GITMODULES_NAME, ERROR) \
+	FUNC(GITMODULES_SYMLINK, ERROR) \
+	FUNC(GITMODULES_URL, ERROR) \
+	FUNC(GITMODULES_PATH, ERROR) \
+	FUNC(GITMODULES_UPDATE, ERROR) \
+	/* warnings */ \
+	FUNC(BAD_FILEMODE, WARN) \
+	FUNC(EMPTY_NAME, WARN) \
+	FUNC(FULL_PATHNAME, WARN) \
+	FUNC(HAS_DOT, WARN) \
+	FUNC(HAS_DOTDOT, WARN) \
+	FUNC(HAS_DOTGIT, WARN) \
+	FUNC(NULL_SHA1, WARN) \
+	FUNC(ZERO_PADDED_FILEMODE, WARN) \
+	FUNC(NUL_IN_COMMIT, WARN) \
+	/* infos (reported as warnings, but ignored by default) */ \
+	FUNC(GITMODULES_PARSE, INFO) \
+	FUNC(BAD_TAG_NAME, INFO) \
+	FUNC(MISSING_TAGGER_ENTRY, INFO) \
+	/* ignored (elevated when requested) */ \
+	FUNC(EXTRA_HEADER_ENTRY, IGNORE)
+
+#define MSG_ID(id, msg_type) FSCK_MSG_##id,
+enum fsck_msg_id {
+	FOREACH_FSCK_MSG_ID(MSG_ID)
+	FSCK_MSG_MAX
+};
+#undef MSG_ID
+
 struct fsck_options;
 struct object;

-- 
2.30.0.284.gd98b1dd5eaa7

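For readers unfamiliar with the pattern being moved here: FOREACH_FSCK_MSG_ID is an "X-macro", a single list that is expanded several times with different definitions of the per-entry macro, so that the enum and the info table cannot drift out of sync. Below is a reduced sketch of the technique under stated assumptions: the id list is shortened to five entries for illustration, and the table keeps only two of the fields that the real msg_id_info[] in fsck.c has.

```c
#include <assert.h>
#include <string.h>

enum fsck_msg_type {
	FSCK_INFO = -2,
	FSCK_FATAL = -1,
	FSCK_ERROR = 1,
	FSCK_WARN,
	FSCK_IGNORE
};

/* Shortened list for illustration only; the full list is in the patch. */
#define FOREACH_FSCK_MSG_ID(FUNC) \
	FUNC(NUL_IN_HEADER, FATAL) \
	FUNC(BAD_DATE, ERROR) \
	FUNC(HAS_DOTGIT, WARN) \
	FUNC(BAD_TAG_NAME, INFO) \
	FUNC(EXTRA_HEADER_ENTRY, IGNORE)

/* First expansion: one enumerator per list entry. */
#define MSG_ID(id, msg_type) FSCK_MSG_##id,
enum fsck_msg_id {
	FOREACH_FSCK_MSG_ID(MSG_ID)
	FSCK_MSG_MAX
};
#undef MSG_ID

/* Second expansion: one table row per list entry, in the same order. */
#define STR(x) #x
#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type },
static const struct {
	const char *id_string;
	enum fsck_msg_type msg_type;
} msg_id_info[FSCK_MSG_MAX] = {
	FOREACH_FSCK_MSG_ID(MSG_ID)
};
#undef MSG_ID
#undef STR
```

Because both expansions walk the same list, msg_id_info[FSCK_MSG_BAD_DATE] is always the row generated from the BAD_DATE entry, whatever order the list is in.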
* [PATCH 11/14] fsck.c: pass along the fsck_msg_id in the fsck_error callback
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Change the fsck_error callback to also pass along the
fsck_msg_id. Before this change the only way to get the message id was
to parse it back out of the "message".

Let's pass it down explicitly for the benefit of callers that might
want to use it, as discussed in [1].

Passing the msg_type is now redundant, as you can always get it back
from the msg_id, but I'm not changing that convention. It's really
common to need the msg_type, and the report() function itself (which
calls "fsck_error") needs to call fsck_msg_type() to discover
it. Let's not needlessly re-do that work in the user callback.

1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/fsck.c  | 4 +++-
 builtin/mktag.c | 1 +
 fsck.c          | 6 ++++--
 fsck.h          | 6 ++++--
 4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index d6d745dc702..b71fac4ceca 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -89,7 +89,9 @@ static int objerror(struct object *obj, const char *err)
 static int fsck_error_func(struct fsck_options *o,
 			   const struct object_id *oid,
 			   enum object_type object_type,
-			   enum fsck_msg_type msg_type, const char *message)
+			   enum fsck_msg_type msg_type,
+			   enum fsck_msg_id msg_id,
+			   const char *message)
 {
 	switch (msg_type) {
 	case FSCK_WARN:
diff --git a/builtin/mktag.c b/builtin/mktag.c
index 1834394a9b6..dc989c356f5 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -23,6 +23,7 @@ static int mktag_fsck_error_func(struct fsck_options *o,
 				 const struct object_id *oid,
 				 enum object_type object_type,
 				 enum fsck_msg_type msg_type,
+				 enum fsck_msg_id msg_id,
 				 const char *message)
 {
 	switch (msg_type) {
diff --git a/fsck.c b/fsck.c
index 980ef2cb8fa..007f02b556a 100644
--- a/fsck.c
+++ b/fsck.c
@@ -247,7 +247,7 @@ static int report(struct fsck_options *options,
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
 	result = options->error_func(options, oid, object_type,
-				     msg_type, sb.buf);
+				     msg_type, msg_id, sb.buf);
 	strbuf_release(&sb);
 	va_end(ap);

@@ -1195,7 +1195,9 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 int fsck_error_function(struct fsck_options *o,
 			const struct object_id *oid,
 			enum object_type object_type,
-			enum fsck_msg_type msg_type, const char *message)
+			enum fsck_msg_type msg_type,
+			enum fsck_msg_id msg_id,
+			const char *message)
 {
 	if (msg_type == FSCK_WARN) {
 		warning("object %s: %s", fsck_describe_object(o, oid), message);
diff --git a/fsck.h b/fsck.h
index b4c53aaa08c..56536d7f29e 100644
--- a/fsck.h
+++ b/fsck.h
@@ -99,11 +99,13 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type,
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
 typedef int (*fsck_error)(struct fsck_options *o,
 			  const struct object_id *oid, enum object_type object_type,
-			  enum fsck_msg_type msg_type, const char *message);
+			  enum fsck_msg_type msg_type, enum fsck_msg_id msg_id,
+			  const char *message);

 int fsck_error_function(struct fsck_options *o,
 			const struct object_id *oid, enum object_type object_type,
-			enum fsck_msg_type msg_type, const char *message);
+			enum fsck_msg_type msg_type, enum fsck_msg_id msg_id,
+			const char *message);

 struct fsck_options {
 	fsck_walk_func walk;
-- 
2.30.0.284.gd98b1dd5eaa7

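To illustrate what a caller gains from receiving the msg_id directly, here is a small sketch of a callback that branches on the id instead of parsing it back out of the message string. Everything prefixed demo_ is hypothetical: struct demo_options is a reduced stand-in for struct fsck_options, the two-entry enum stands in for the full fsck_msg_id list, and the oid and object_type parameters are omitted. The policy shown (tolerating a missing .gitmodules blob) is purely illustrative and is not what this series implements.

```c
#include <assert.h>
#include <stddef.h>

enum fsck_msg_type {
	FSCK_INFO = -2,
	FSCK_FATAL = -1,
	FSCK_ERROR = 1,
	FSCK_WARN,
	FSCK_IGNORE
};

/* Two ids only, for illustration; the real enum is generated in fsck.h. */
enum fsck_msg_id {
	FSCK_MSG_BAD_DATE,
	FSCK_MSG_GITMODULES_MISSING,
	FSCK_MSG_MAX
};

/* Hypothetical, reduced stand-in for struct fsck_options. */
struct demo_options {
	unsigned seen[FSCK_MSG_MAX]; /* per-id counter kept by the callback */
};

/*
 * Hypothetical callback: counts every report by id, lets a missing
 * .gitmodules blob through, and fails on any other error.
 * Returns non-zero to signal failure, zero otherwise.
 */
static int demo_error_func(struct demo_options *o,
			   enum fsck_msg_type msg_type,
			   enum fsck_msg_id msg_id,
			   const char *message)
{
	(void)message; /* no string parsing is needed to identify the issue */
	o->seen[msg_id]++;

	if (msg_id == FSCK_MSG_GITMODULES_MISSING)
		return 0;
	return msg_type == FSCK_ERROR;
}
```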
* [PATCH 12/14] fsck.c: add an fsck_set_msg_type() API that takes enums
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Change code I added in acf9de4c94e (mktag: use fsck instead of custom
verify_tag(), 2021-01-05) to make use of a new API function that takes
the fsck_msg_{id,type} types, instead of arbitrary strings that
we'll (hopefully) parse into those types.

At the time that the fsck_set_msg_type() API was introduced in
0282f4dced0 (fsck: offer a function to demote fsck errors to warnings,
2015-06-22) it was only intended to be used to parse user-supplied
data.

For things that are purely internal to the C code it makes sense to
have the compiler check these arguments, and to skip the sanity
checking of the data in fsck_set_msg_type() which is redundant to
checks we get from the compiler.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/mktag.c |  3 ++-
 fsck.c          | 27 +++++++++++++++++----------
 fsck.h          |  3 +++
 3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/builtin/mktag.c b/builtin/mktag.c
index dc989c356f5..de67a94f24e 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -93,7 +93,8 @@ int cmd_mktag(int argc, const char **argv, const char *prefix)
 		die_errno(_("could not read from stdin"));

 	fsck_options.error_func = mktag_fsck_error_func;
-	fsck_set_msg_type(&fsck_options, "extraheaderentry", "warn");
+	fsck_set_msg_type_from_ids(&fsck_options, FSCK_MSG_EXTRA_HEADER_ENTRY,
+				   FSCK_WARN);
 	/* config might set fsck.extraHeaderEntry=* again */
 	git_config(mktag_config, NULL);
 	if (fsck_tag_standalone(NULL, buf.buf, buf.len, &fsck_options,
diff --git a/fsck.c b/fsck.c
index 007f02b556a..54632404de5 100644
--- a/fsck.c
+++ b/fsck.c
@@ -134,6 +134,22 @@ int is_valid_msg_type(const char *msg_id, const char *msg_type)
 	return 1;
 }

+void fsck_set_msg_type_from_ids(struct fsck_options *options,
+				enum fsck_msg_id msg_id,
+				enum fsck_msg_type msg_type)
+{
+	if (!options->msg_type) {
+		int i;
+		enum fsck_msg_type *tmp;
+		ALLOC_ARRAY(tmp, FSCK_MSG_MAX);
+		for (i = 0; i < FSCK_MSG_MAX; i++)
+			tmp[i] = fsck_msg_type(i, options);
+		options->msg_type = tmp;
+	}
+
+	options->msg_type[msg_id] = msg_type;
+}
+
 void fsck_set_msg_type(struct fsck_options *options,
 		const char *msg_id_str, const char *msg_type_str)
 {
@@ -146,16 +162,7 @@ void fsck_set_msg_type(struct fsck_options *options,
 	if (msg_type != FSCK_ERROR && msg_id_info[msg_id].msg_type == FSCK_FATAL)
 		die("Cannot demote %s to %s", msg_id_str, msg_type_str);

-	if (!options->msg_type) {
-		int i;
-		enum fsck_msg_type *tmp;
-		ALLOC_ARRAY(tmp, FSCK_MSG_MAX);
-		for (i = 0; i < FSCK_MSG_MAX; i++)
-			tmp[i] = fsck_msg_type(i, options);
-		options->msg_type = tmp;
-	}
-
-	options->msg_type[msg_id] = msg_type;
+	fsck_set_msg_type_from_ids(options, msg_id, msg_type);
 }

 void fsck_set_msg_types(struct fsck_options *options, const char *values)
diff --git a/fsck.h b/fsck.h
index 56536d7f29e..af145bb4596 100644
--- a/fsck.h
+++ b/fsck.h
@@ -80,6 +80,9 @@ enum fsck_msg_id {
 struct fsck_options;
 struct object;

+void fsck_set_msg_type_from_ids(struct fsck_options *options,
+				enum fsck_msg_id msg_id,
+				enum fsck_msg_type msg_type);
 void fsck_set_msg_type(struct fsck_options *options,
 		       const char *msg_id, const char *msg_type);
 void fsck_set_msg_types(struct fsck_options *options, const char *values);
-- 
2.30.0.284.gd98b1dd5eaa7

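The new setter relies on a lazily allocated override table: options->msg_type stays NULL until the first override, at which point it is filled with the current effective type of every id and only then is the single requested entry changed. A standalone sketch of that pattern follows, with loud assumptions: struct demo_options, demo_defaults[], demo_msg_type() and demo_set_msg_type_from_ids() are hypothetical stand-ins, plain malloc() replaces git's ALLOC_ARRAY, the id list has two entries, and the real fsck_msg_type() has additional logic that is not shown in this patch and not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

enum fsck_msg_type {
	FSCK_INFO = -2,
	FSCK_FATAL = -1,
	FSCK_ERROR = 1,
	FSCK_WARN,
	FSCK_IGNORE
};

/* Two ids only, for illustration. */
enum fsck_msg_id {
	FSCK_MSG_BAD_DATE,
	FSCK_MSG_EXTRA_HEADER_ENTRY,
	FSCK_MSG_MAX
};

/* Hypothetical default table standing in for msg_id_info[].msg_type. */
static const enum fsck_msg_type demo_defaults[FSCK_MSG_MAX] = {
	FSCK_ERROR,  /* BAD_DATE */
	FSCK_IGNORE, /* EXTRA_HEADER_ENTRY */
};

/* Hypothetical, reduced stand-in for struct fsck_options. */
struct demo_options {
	enum fsck_msg_type *msg_type; /* NULL until the first override */
};

/* Effective type: the override table if present, else the default. */
static enum fsck_msg_type demo_msg_type(enum fsck_msg_id msg_id,
					const struct demo_options *o)
{
	return o->msg_type ? o->msg_type[msg_id] : demo_defaults[msg_id];
}

/* Typed setter: no string parsing, so no "unknown id" failure path. */
static void demo_set_msg_type_from_ids(struct demo_options *o,
				       enum fsck_msg_id msg_id,
				       enum fsck_msg_type msg_type)
{
	if (!o->msg_type) {
		int i;
		enum fsck_msg_type *tmp = malloc(sizeof(*tmp) * FSCK_MSG_MAX);

		if (!tmp)
			abort();
		/* Seed the table with the current effective values. */
		for (i = 0; i < FSCK_MSG_MAX; i++)
			tmp[i] = demo_msg_type(i, o);
		o->msg_type = tmp;
	}

	o->msg_type[msg_id] = msg_type;
}
```

With a zero-initialized struct demo_options, calling demo_set_msg_type_from_ids(&o, FSCK_MSG_EXTRA_HEADER_ENTRY, FSCK_WARN) changes only that entry; FSCK_MSG_BAD_DATE still resolves to FSCK_ERROR. The caller owns o.msg_type and should free() it when done.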
* [PATCH 13/14] fsck.h: update FSCK_OPTIONS_* for object_name
From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC
To: git
Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason

Add the object_name member to the initialization macro. This was
omitted in 7b35efd734e (fsck_walk(): optionally name objects on the
go, 2016-07-17) when the field was added.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 fsck.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fsck.h b/fsck.h
index af145bb4596..28137a77df0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -119,8 +119,8 @@ struct fsck_options {
 	kh_oid_map_t *object_names;
 };

-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL }

 /* descend in all linked child objects
  * the return value is:
-- 
2.30.0.284.gd98b1dd5eaa7

* [PATCH 14/14] fsck.c: move gitmodules_{found,done} into fsck_options 2021-02-17 1:48 ` Ævar Arnfjörð Bjarmason ` (13 preceding siblings ...) 2021-02-17 19:42 ` [PATCH 13/14] fsck.h: update FSCK_OPTIONS_* for object_name Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 ` Ævar Arnfjörð Bjarmason 2021-02-17 20:05 ` [PATCH 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan 15 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:42 UTC (permalink / raw) To: git Cc: Junio C Hamano, Jeff King, Johannes Schindelin, Jonathan Tan, Ævar Arnfjörð Bjarmason Move the gitmodules_{found,done} static variables added in 159e7b080bf (fsck: detect gitmodules files, 2018-05-02) into the fsck_options struct. It makes sense to keep all the context in the same place. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 19 ++++++++----------- fsck.h | 6 ++++-- 2 files changed, 12 insertions(+), 13 deletions(-) diff --git a/fsck.c b/fsck.c index 54632404de5..f344b6be3d3 100644 --- a/fsck.c +++ b/fsck.c @@ -19,9 +19,6 @@ #include "credential.h" #include "help.h" -static struct oidset gitmodules_found = OIDSET_INIT; -static struct oidset gitmodules_done = OIDSET_INIT; - #define STR(x) #x #define MSG_ID(id, msg_type) { STR(id), NULL, NULL, FSCK_##msg_type }, static struct { @@ -621,7 +618,7 @@ static int fsck_tree(const struct object_id *oid, if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, @@ -635,7 +632,7 @@ static int fsck_tree(const struct object_id *oid, has_dotgit |= is_ntfs_dotgit(backslash); if (is_ntfs_dotgitmodules(backslash)) { if (!S_ISLNK(mode)) - oidset_insert(&gitmodules_found, oid); + oidset_insert(&options->gitmodules_found, oid); else retval += report(options, oid, OBJ_TREE, FSCK_MSG_GITMODULES_SYMLINK, @@ -1147,9 +1144,9 @@ static int 
fsck_blob(const struct object_id *oid, const char *buf, struct fsck_gitmodules_data data; struct config_options config_opts = { 0 }; - if (!oidset_contains(&gitmodules_found, oid)) + if (!oidset_contains(&options->gitmodules_found, oid)) return 0; - oidset_insert(&gitmodules_done, oid); + oidset_insert(&options->gitmodules_done, oid); if (object_on_skiplist(options, oid)) return 0; @@ -1220,13 +1217,13 @@ int fsck_finish(struct fsck_options *options) struct oidset_iter iter; const struct object_id *oid; - oidset_iter_init(&gitmodules_found, &iter); + oidset_iter_init(&options->gitmodules_found, &iter); while ((oid = oidset_iter_next(&iter))) { enum object_type type; unsigned long size; char *buf; - if (oidset_contains(&gitmodules_done, oid)) + if (oidset_contains(&options->gitmodules_done, oid)) continue; buf = read_object_file(oid, &type, &size); @@ -1251,8 +1248,8 @@ int fsck_finish(struct fsck_options *options) } - oidset_clear(&gitmodules_found); - oidset_clear(&gitmodules_done); + oidset_clear(&options->gitmodules_found); + oidset_clear(&options->gitmodules_done); return ret; } diff --git a/fsck.h b/fsck.h index 28137a77df0..99c77289688 100644 --- a/fsck.h +++ b/fsck.h @@ -116,11 +116,13 @@ struct fsck_options { unsigned strict:1; enum fsck_msg_type *msg_type; struct oidset skiplist; + struct oidset gitmodules_found; + struct oidset gitmodules_done; kh_oid_map_t *object_names; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, NULL } -#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, NULL } +#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT, OIDSET_INIT, OIDSET_INIT, NULL } +#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL, OIDSET_INIT, OIDSET_INIT, OIDSET_INIT, NULL } /* descend in all linked child objects * the return value is: -- 2.30.0.284.gd98b1dd5eaa7 ^ permalink raw reply related [flat|nested] 229+ messages in thread
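For readers following along, the found/done bookkeeping that this patch moves into the options struct can be sketched in isolation. This is a toy model under stated assumptions, not Git's code: `struct idset`, `struct check_ctx`, and `ctx_finish()` are hypothetical names, and small integers stand in for object IDs. It only aims to show that once the sets live in a per-caller context, two callers no longer share state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 16

/* Toy stand-in for struct oidset; small ints play the role of OIDs. */
struct idset {
	int ids[MAX_IDS];
	size_t nr;
};

/* Hypothetical per-caller context, mirroring the idea of the patch. */
struct check_ctx {
	struct idset found; /* blobs that some tree names as .gitmodules */
	struct idset done;  /* blobs that have already been checked */
};

static int idset_contains(const struct idset *s, int id)
{
	for (size_t i = 0; i < s->nr; i++)
		if (s->ids[i] == id)
			return 1;
	return 0;
}

static void idset_insert(struct idset *s, int id)
{
	if (idset_contains(s, id))
		return;
	assert(s->nr < MAX_IDS);
	s->ids[s->nr++] = id;
}

/* Count entries that were found but never checked, then reset the context. */
static size_t ctx_finish(struct check_ctx *ctx)
{
	size_t pending = 0;

	for (size_t i = 0; i < ctx->found.nr; i++)
		if (!idset_contains(&ctx->done, ctx->found.ids[i]))
			pending++;
	ctx->found.nr = ctx->done.nr = 0;
	return pending;
}
```

With file-level static sets, a second caller would see whatever the first one left behind; with one `check_ctx` per caller, each `ctx_finish()` only reports that caller's own pending entries.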
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-02-17 1:48 ` Ævar Arnfjörð Bjarmason ` (14 preceding siblings ...) 2021-02-17 19:42 ` [PATCH 14/14] fsck.c: move gitmodules_{found,done} into fsck_options Ævar Arnfjörð Bjarmason @ 2021-02-17 20:05 ` Jonathan Tan 15 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-02-17 20:05 UTC (permalink / raw) To: avarab; +Cc: jonathantanmy, git > > I tried that first, and the issue is that IDs like > > FSCK_MSG_GITMODULES_MISSING are internal to fsck.c. As for whether we > > should start exposing the IDs publicly, I think we should wait until a > > few new cases like this come up, so that we more fully understand the > > requirements first. > > The requirement is that you want the objects ids we'd otherwise error > about in fsck_finish(). Yeah we don't pass the "fsck_msg_id" down in the > "report()" function, but you can reliably strstr() it out of the > message. We can't strstr() because of false positives (if, e.g. there is a submodule name that contains the string we're looking for), but looking at report() in fsck.c, the message ID is the very first thing appended, so I think we can use starts_with(). > We document & hard rely on that already, since it's also a > config key. Ah, good point. > But yeah, we could just change the report function to pass down the id > and move the relevant macros from fsck.c to fsck.h. I think that would > be a smaller change conceptually than a special-case flag in > fsck_options for something we could otherwise do with the error > reporting. I agree - I thought this wouldn't be possible, but like you said, we can reliably make use of the string in report() (or pass the ID, like your patch set [1] does) so we should do this. What would be the best way to proceed, now that we have at least 2 patch sets (mine and yours) in play? 
I was thinking that I should update mine to use the string reported in report() (with starts_with()), so that both of our patch sets can be reviewed and merged in parallel, and after that, update the fsck code to use the ID instead of the string. [1] https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/ ^ permalink raw reply [flat|nested] 229+ messages in thread
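The false-positive concern with a substring match can be shown with a small standalone sketch. It is not taken from either patch set: `has_prefix()` is a hypothetical stand-in for Git's starts_with(), `mentions_id()` is a hypothetical wrapper around strstr(), and the second message string in the usage note is made up for illustration. It assumes, as described above, that the message ID is the first thing in the message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for Git's starts_with(); not the real helper. */
static int has_prefix(const char *str, const char *prefix)
{
	assert(str && prefix);
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Substring search: matches the ID anywhere in the message. */
static int mentions_id(const char *msg, const char *id)
{
	assert(msg && id);
	return strstr(msg, id) != NULL;
}
```

Given a made-up message such as "gitmodulesName: disallowed submodule name: gitmodulesMissing", `mentions_id()` would match the ID "gitmodulesMissing" even though the message is about something else, while `has_prefix()` would not.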
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-24 2:34 ` [PATCH 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan 2021-01-24 7:56 ` Junio C Hamano 2021-01-24 12:18 ` Ævar Arnfjörð Bjarmason @ 2021-01-24 12:30 ` Ævar Arnfjörð Bjarmason 2021-01-28 1:15 ` Jonathan Tan 2021-02-17 19:27 ` Ævar Arnfjörð Bjarmason 3 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-01-24 12:30 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On Sun, Jan 24 2021, Jonathan Tan wrote: > --fsck-objects:: > - Die if the pack contains broken objects. For internal use only. > + For internal use only. > ++ > +Die if the pack contains broken objects. If the pack contains a tree > +pointing to a .gitmodules blob that does not exist, prints the hash of > +that blob (for the caller to check) after the hash that goes into the > +name of the pack/idx file (see "Notes"). [I should have waited a bit and sent one E-Mail] Is this really generally usable as an IPC mechanism, what if we need another set of OIDs we care about? Shouldn't it at least be hidden behind some option so you don't get a deluge of output from index-pack if you're not in this packfile-uri mode? But, along with my other E-Mail... > [...] > +static void parse_gitmodules_oids(int fd, struct oidset *gitmodules_oids) > +{ > + int len = the_hash_algo->hexsz + 1; /* hash + NL */ > + > + do { > + char hex_hash[GIT_MAX_HEXSZ + 1]; > + int read_len = read_in_full(fd, hex_hash, len); > + struct object_id oid; > + const char *end; > + > + if (!read_len) > + return; > + if (read_len != len) > + die("invalid length read %d", read_len); > + if (parse_oid_hex(hex_hash, &oid, &end) || *end != '\n') > + die("invalid hash"); > + oidset_insert(gitmodules_oids, &oid); > + } while (1); > +} > + Doesn't this IPC mechanism already exist in the form of fsck.skipList? See my 1f3299fda9 (fsck: make fsck_config() re-usable, 2021-01-05) on "next". I.e. 
as noted in my just-sent e-mail you could probably just re-use the skiplist as-is. Or, if not, it seems to me that this whole IPC mechanism would be better done with a tempfile, passing it along like we already pass the fsck.skipList between these processes. I doubt it's going to be large enough to matter; we could just put it in .git/ somewhere, like we put gc.log etc. (but created with a mktemp() name...). Or, if we want to keep the "print <list> | process" model, we can refactor the existing fsck IPC noted in 1f3299fda9 a bit, so that e.g. you pass some version of "lines prefixed with 'fsck-skiplist: ' go into list xyz" via a command-line option. And then the existing option(s) and your potential new list (which, as noted, I think is probably redundant to the skiplist) can use it. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-24 12:30 ` Ævar Arnfjörð Bjarmason @ 2021-01-28 1:15 ` Jonathan Tan 2021-02-17 2:10 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-01-28 1:15 UTC (permalink / raw) To: avarab; +Cc: jonathantanmy, git > On Sun, Jan 24 2021, Jonathan Tan wrote: > > --fsck-objects:: > > - Die if the pack contains broken objects. For internal use only. > > + For internal use only. > > ++ > > +Die if the pack contains broken objects. If the pack contains a tree > > +pointing to a .gitmodules blob that does not exist, prints the hash of > > +that blob (for the caller to check) after the hash that goes into the > > +name of the pack/idx file (see "Notes"). > > [I should have waited a bit and sent one E-Mail] > > Is this really generally usable as an IPC mechanism, what if we need > another set of OIDs we care about? Shouldn't it at least be hidden > behind some option so you don't get a deluge of output from index-pack > if you're not in this packfile-uri mode? --fsck-objects is only for internal use, and it's only used by fetch-pack.c. So its only consumer does want the output. Junio also mentioned the possibility of another set of OIDs, and I replied [1]. [1] https://lore.kernel.org/git/20210128003536.3874866-1-jonathantanmy@google.com/ > But, along with my other E-Mail... > > > [...] 
> > +static void parse_gitmodules_oids(int fd, struct oidset *gitmodules_oids) > > +{ > > + int len = the_hash_algo->hexsz + 1; /* hash + NL */ > > + > > + do { > > + char hex_hash[GIT_MAX_HEXSZ + 1]; > > + int read_len = read_in_full(fd, hex_hash, len); > > + struct object_id oid; > > + const char *end; > > + > > + if (!read_len) > > + return; > > + if (read_len != len) > > + die("invalid length read %d", read_len); > > + if (parse_oid_hex(hex_hash, &oid, &end) || *end != '\n') > > + die("invalid hash"); > > + oidset_insert(gitmodules_oids, &oid); > > + } while (1); > > +} > > + > > Doesn't this IPC mechanism already exist in the form of fsck.skipList? > See my 1f3299fda9 (fsck: make fsck_config() re-usable, 2021-01-05) on > "next". I.e. as noted in my just-sent-E-Mail you could probably just > re-use skiplist as-is. I'm not sure how fsck.skipList could be used here. Before running fsck_finish() for the first time, we don't know which .gitmodules are missing and which are not. And when running fsck_finish() for the second time, we definitely do not want to skip any blobs. > Or if not it seems to me that this whole IPC mechanism would be better > done with a tempfile and passing it along like we already pass the > fsck.skipList between these processes. > > I doubt it's going to be large enough to matter, we could just put it in > .git/ somewhere, like we put gc.log etc (but created with a mktemp() > name...). > > Or if we want to keep the "print <list> | process" model we can refactor > the existing fsck IPC noted in 1f3299fda9 a bit, so e.g. you pass some > version of "lines prefixed with "fsck-skiplist: " go into list xyz via a > command-line option. And then existing option(s) and your potential new > list (which as noted, I think is probably redundant to the skiplist) can > use it. I think using stdout is superior to using a tempfile - we don't have to worry about interrupted invocations, for example. 
What do you mean by "the existing fsck IPC noted in 1f3299fda9"? If you mean the ability to pass a list of OIDs, for example using "-c fsck.skipList=filename.txt", I'm not sure that it solves anything. Firstly, I don't think that the skipList is useful here (as I said earlier). And secondly, I don't think that OID input is the issue - right now, the design is a process (index-pack, calling fsck_finish()) writing to its output which is then picked up by the calling process (fetch-pack). We are not sending the dangling .gitmodules through stdin anywhere. ^ permalink raw reply [flat|nested] 229+ messages in thread
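The line-oriented stdout format discussed above (one hex hash plus a newline per record) can be sketched without any Git internals. This is a simplified illustration, not the code from the patch: `count_oid_lines()` is a hypothetical name, it works on an in-memory buffer rather than a file descriptor, and `HEXSZ` is fixed at 40 on the assumption of SHA-1, where the real code uses the_hash_algo->hexsz.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define HEXSZ 40 /* assumption: SHA-1 hex length */

/*
 * Return the number of well-formed "<hex hash>\n" records in buf,
 * or -1 if any record is short or malformed.
 */
static int count_oid_lines(const char *buf, size_t len)
{
	const size_t rec = HEXSZ + 1; /* hash + NL */
	int n = 0;

	assert(buf || !len);
	if (len % rec)
		return -1; /* trailing partial record */
	for (size_t off = 0; off < len; off += rec) {
		for (size_t i = 0; i < HEXSZ; i++)
			if (!isxdigit((unsigned char)buf[off + i]))
				return -1;
		if (buf[off + HEXSZ] != '\n')
			return -1;
		n++;
	}
	return n;
}
```

Because every record has a fixed length, a reader of this shape rejects anything that is not exactly a hash followed by a newline, so any future change to the format would be noticed rather than silently misparsed.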
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-28 1:15 ` Jonathan Tan @ 2021-02-17 2:10 ` Ævar Arnfjörð Bjarmason 2021-02-17 20:10 ` Jonathan Tan 0 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-17 2:10 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On Thu, Jan 28 2021, Jonathan Tan wrote: >> On Sun, Jan 24 2021, Jonathan Tan wrote: >> > --fsck-objects:: >> > - Die if the pack contains broken objects. For internal use only. >> > + For internal use only. >> > ++ >> > +Die if the pack contains broken objects. If the pack contains a tree >> > +pointing to a .gitmodules blob that does not exist, prints the hash of >> > +that blob (for the caller to check) after the hash that goes into the >> > +name of the pack/idx file (see "Notes"). >> >> [I should have waited a bit and sent one E-Mail] >> >> Is this really generally usable as an IPC mechanism, what if we need >> another set of OIDs we care about? Shouldn't it at least be hidden >> behind some option so you don't get a deluge of output from index-pack >> if you're not in this packfile-uri mode? > > --fsck-objects is only for internal use, and it's only used by > fetch-pack.c. So its only consumer does want the output. > > Junio also mentioned the possibility of another set of OIDs, and I > replied [1]. > > [1] https://lore.kernel.org/git/20210128003536.3874866-1-jonathantanmy@google.com/ > >> But, along with my other E-Mail... >> >> > [...] 
>> > +static void parse_gitmodules_oids(int fd, struct oidset *gitmodules_oids) >> > +{ >> > + int len = the_hash_algo->hexsz + 1; /* hash + NL */ >> > + >> > + do { >> > + char hex_hash[GIT_MAX_HEXSZ + 1]; >> > + int read_len = read_in_full(fd, hex_hash, len); >> > + struct object_id oid; >> > + const char *end; >> > + >> > + if (!read_len) >> > + return; >> > + if (read_len != len) >> > + die("invalid length read %d", read_len); >> > + if (parse_oid_hex(hex_hash, &oid, &end) || *end != '\n') >> > + die("invalid hash"); >> > + oidset_insert(gitmodules_oids, &oid); >> > + } while (1); >> > +} >> > + >> >> Doesn't this IPC mechanism already exist in the form of fsck.skipList? >> See my 1f3299fda9 (fsck: make fsck_config() re-usable, 2021-01-05) on >> "next". I.e. as noted in my just-sent-E-Mail you could probably just >> re-use skiplist as-is. > > I'm not sure how fsck.skipList could be used here. Before running > fsck_finish() for the first time, we don't know which .gitmodules are > missing and which are not. And when running fsck_finish() for the second > time, we definitely do not want to skip any blobs. > >> Or if not it seems to me that this whole IPC mechanism would be better >> done with a tempfile and passing it along like we already pass the >> fsck.skipList between these processes. >> >> I doubt it's going to be large enough to matter, we could just put it in >> .git/ somewhere, like we put gc.log etc (but created with a mktemp() >> name...). >> >> Or if we want to keep the "print <list> | process" model we can refactor >> the existing fsck IPC noted in 1f3299fda9 a bit, so e.g. you pass some >> version of "lines prefixed with "fsck-skiplist: " go into list xyz via a >> command-line option. And then existing option(s) and your potential new >> list (which as noted, I think is probably redundant to the skiplist) can >> use it. > > I think using stdout is superior to using a tempfile - we don't have to > worry about interrupted invocations, for example. 
> > What do you mean by "the existing fsck IPC noted in 1f3299fda9"? If you > mean the ability to pass a list of OIDs, for example using "-c > fsck.skipList=filename.txt", I'm not sure that it solves anything. > Firstly, I don't think that the skipList is useful here (as I said > earlier). And secondly, I don't think that OID input is the issue - > right now, the design is a process (index-pack, calling fsck_finish()) > writing to its output which is then picked up by the calling process > (fetch-pack). We are not sending the dangling .gitmodules through stdin > anywhere. Sorry for being unclear here. I don't think (honestly I don't remember, it's been almost a month) that I meant to you should use the skipList. Looking at that code again we use object_on_skiplist() to do an early punt in report(), but also fsck_blob(), presumably you never want the latter, and that early punting wouldn't be needed if your report() function intercepted the modules blob id for stashing it away / later reporting / whatever. So yeah, I'm 99% sure now that's not what I meant :) What I meant with: Or if we want to keep the "print <list> | process"[...] Is that we have an existing ad-hoc IPC model for these commands in passing along the skipList, which is made more complex because sometimes the initial process reads the file, sometimes it passes it along as-is to the child. And then there's this patch that passes OIDs too, but through a different mechanism. I was suggesting that perhaps it made more sense to refactor both so they could use the same mechanism, because we're potentially passing two lists of OIDs between the two. Just one goes via line-at-a-time in the output, the other via a config option on the command-line. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-02-17 2:10 ` Ævar Arnfjörð Bjarmason @ 2021-02-17 20:10 ` Jonathan Tan 2021-02-18 12:07 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-02-17 20:10 UTC (permalink / raw) To: avarab; +Cc: jonathantanmy, git > Sorry for being unclear here. I don't think (honestly I don't remember, > it's been almost a month) that I meant to you should use the skipList. > > Looking at that code again we use object_on_skiplist() to do an early > punt in report(), but also fsck_blob(), presumably you never want the > latter, and that early punting wouldn't be needed if your report() > function intercepted the modules blob id for stashing it away / later > reporting / whatever. > > So yeah, I'm 99% sure now that's not what I meant :) > > What I meant with: > > Or if we want to keep the "print <list> | process"[...] > > Is that we have an existing ad-hoc IPC model for these commands in > passing along the skipList, which is made more complex because sometimes > the initial process reads the file, sometimes it passes it along as-is > to the child. > > And then there's this patch that passes OIDs too, but through a > different mechanism. > > I was suggesting that perhaps it made more sense to refactor both so > they could use the same mechanism, because we're potentially passing two > lists of OIDs between the two. Just one goes via line-at-a-time in the > output, the other via a config option on the command-line. Thanks for your explanation. I still think that they are quite different - skiplist is a user-written file containing a list of OIDs that will likely never change, whereas my list of dangling .gitmodules is a list of OIDs dynamically generated (and thus, always different) whenever a fetch is done. So I think it's quite reasonable to pass skiplist as a file name, and my list should be passed line-by-line. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-02-17 20:10 ` Jonathan Tan @ 2021-02-18 12:07 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-18 12:07 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On Wed, Feb 17 2021, Jonathan Tan wrote: >> Sorry for being unclear here. I don't think (honestly I don't remember, >> it's been almost a month) that I meant to you should use the skipList. >> >> Looking at that code again we use object_on_skiplist() to do an early >> punt in report(), but also fsck_blob(), presumably you never want the >> latter, and that early punting wouldn't be needed if your report() >> function intercepted the modules blob id for stashing it away / later >> reporting / whatever. >> >> So yeah, I'm 99% sure now that's not what I meant :) >> >> What I meant with: >> >> Or if we want to keep the "print <list> | process"[...] >> >> Is that we have an existing ad-hoc IPC model for these commands in >> passing along the skipList, which is made more complex because sometimes >> the initial process reads the file, sometimes it passes it along as-is >> to the child. >> >> And then there's this patch that passes OIDs too, but through a >> different mechanism. >> >> I was suggesting that perhaps it made more sense to refactor both so >> they could use the same mechanism, because we're potentially passing two >> lists of OIDs between the two. Just one goes via line-at-a-time in the >> output, the other via a config option on the command-line. > > Thanks for your explanation. I still think that they are quite different > - skiplist is a user-written file containing a list of OIDs that will > likely never change, whereas my list of dangling .gitmodules is a list > of OIDs dynamically generated (and thus, always different) whenever a > fetch is done. So I think it's quite reasonable to pass skiplist as a > file name, and my list should be passed line-by-line. 
Sure, but I'm not talking about passing it as a tempfile. Yes, I suggested that in the third-to-last paragraph of [1] but then went on to say that we could also move to some IPC mechanism where you spew in the list of dangling .gitmodules, and we also spew in the skipList and anything else we want to pass in. I'm not saying this needs to be part of this series. But let me rephrase: We now have some combination of {receive-pack,upload-pack,send-pack,fetch-pack,unpack-objects} that need to communicate locally or pass data back & forth, passing data either via a CLI option to read a file, packnames/refs on --stdin, or (now) a single list of OIDs on stdout. Let's say we don't just need to pass the .gitmodules OIDs, but also e.g. .mailmap OIDs or whatever (due to some future vulnerability). Would this IPC mechanism deal with that, or would we need to introduce a breaking change (Re: my recently sent mail about concurrent updates of libexec programs)? Can we use something like pkt-line to talk back & forth in an extensible way? Not needed now, just food for thought... 1. https://lore.kernel.org/git/87czxu7c15.fsf@evledraar.gmail.com/ ^ permalink raw reply [flat|nested] 229+ messages in thread
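As a rough illustration of what pkt-line framing would look like for such a list: each packet is four hex digits giving the total length (the four digits included), followed by the payload. The sketch below is standalone and does not use Git's own pkt-line helpers; `pkt_frame()` is a hypothetical name, and the "gitmodules <oid>" payload in the usage note is an invented key, not something any Git command emits.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PKT_MAX_PAYLOAD 65516 /* 65520 total minus the 4-byte length header */

/*
 * Frame payload as one pkt-line into out. Return the total number of
 * bytes in the packet (header + payload), or -1 if it does not fit.
 */
static int pkt_frame(char *out, size_t outsz, const char *payload)
{
	size_t n;

	assert(out && payload);
	n = strlen(payload);
	if (n > PKT_MAX_PAYLOAD || outsz < n + 5)
		return -1;
	/* The 4 hex digits count themselves plus the payload. */
	snprintf(out, outsz, "%04x%s", (unsigned)(n + 4), payload);
	return (int)(n + 4);
}
```

Framing a hypothetical "gitmodules 1f155c20935ee1154a813a814f03ef2b3976680f\n" payload yields a packet that starts with "0038". A reader that does not know the "gitmodules" key could skip the packet by its length instead of dying, which is the extensibility being asked about.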
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-01-24 2:34 ` [PATCH 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan ` (2 preceding siblings ...) 2021-01-24 12:30 ` Ævar Arnfjörð Bjarmason @ 2021-02-17 19:27 ` Ævar Arnfjörð Bjarmason 2021-02-17 20:11 ` Jonathan Tan 3 siblings, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-17 19:27 UTC (permalink / raw) To: Jonathan Tan; +Cc: git On Sun, Jan 24 2021, Jonathan Tan wrote: > diff --git a/builtin/index-pack.c b/builtin/index-pack.c > index 557bd2f348..f995c15115 100644 > --- a/builtin/index-pack.c > +++ b/builtin/index-pack.c > @@ -1888,8 +1888,13 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) > else > close(input_fd); > > - if (do_fsck_object && fsck_finish(&fsck_options)) > - die(_("fsck error in pack objects")); > + if (do_fsck_object) { > + struct fsck_options fo = FSCK_OPTIONS_STRICT; > + > + fo.print_dangling_gitmodules = 1; > + if (fsck_finish(&fo)) > + die(_("fsck error in pack objects")); > + } > [...] > +static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) > +{ > + struct oidset_iter iter; > + const struct object_id *oid; > + struct fsck_options fo = FSCK_OPTIONS_STRICT; > + > + if (!oidset_size(gitmodules_oids)) > + return; > + > + oidset_iter_init(gitmodules_oids, &iter); > + while ((oid = oidset_iter_next(&iter))) > + register_found_gitmodules(oid); > + if (fsck_finish(&fo)) > + die("fsck failed"); > +} > + What's the need for STRICT here & can't the former use the existing fsck_options in index-pack.c? 
With this on top we pass all tests: diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 18531199242..5464edf4778 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1933,10 +1933,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) close(input_fd); if (do_fsck_object) { - struct fsck_options fo = FSCK_OPTIONS_STRICT; - - fo.print_dangling_gitmodules = 1; - if (fsck_finish(&fo)) + fsck_options.print_dangling_gitmodules = 1; + if (fsck_finish(&fsck_options)) die(_("fsck error in pack objects")); } diff --git a/fetch-pack.c b/fetch-pack.c index 0a337a04f1f..a8754d97e3d 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -997,7 +997,7 @@ static void fsck_gitmodules_oids(struct oidset *gitmodules_oids) { struct oidset_iter iter; const struct object_id *oid; - struct fsck_options fo = FSCK_OPTIONS_STRICT; + struct fsck_options fo = FSCK_OPTIONS_DEFAULT; if (!oidset_size(gitmodules_oids)) return; ^ permalink raw reply related [flat|nested] 229+ messages in thread
* Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules 2021-02-17 19:27 ` Ævar Arnfjörð Bjarmason @ 2021-02-17 20:11 ` Jonathan Tan 0 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-02-17 20:11 UTC (permalink / raw) To: avarab; +Cc: jonathantanmy, git > What's the need for STRICT here & can't the former use the existing > fsck_options in index-pack.c? With this on top we pass all tests: [snip code] Good point - I'll do that. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 0/4] Check .gitmodules when using packfile URIs 2021-01-24 2:34 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Jonathan Tan ` (3 preceding siblings ...) 2021-01-24 2:34 ` [PATCH 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan @ 2021-01-24 6:29 ` Junio C Hamano 2021-01-28 0:35 ` Jonathan Tan 2021-02-18 23:34 ` Junio C Hamano 5 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-01-24 6:29 UTC (permalink / raw) To: Jonathan Tan; +Cc: git Jonathan Tan <jonathantanmy@google.com> writes: > As part of this, index-pack has to output (1) the hash that goes into > the name of the .pack/.idx file and (2) the hashes of all dangling > .gitmodules. I just had (2) come after (1). If anyone has a better idea, > I'm interested. I have this feeling that the "blobs that need to be validated across packs" will *not* be the last enhancement we'd need to make to the output from index-pack to allow richer communication between it and its invoker. While there is no reason to change how the first line of the output looks, we'd probably want to make sure that future versions of Git can easily tell the "list of blobs that require further validation" from other additional information. I am not comfortable recommending "ok, then let's add a delimiter line '---\n' if/when we need to have something after the list of blobs and append more stuff in future versions of Git", because we may find a need to emit new kinds of info before the list of blobs that need further validation, for example, in future versions of Git. Having said all that, the internal communication between index-pack and its caller does not need as much care about compatibility across versions as output visible to end-users, so when a future version of Git needs to send different kinds of information in a different order from what you created here, we can do so pretty much freely, I would guess. Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 0/4] Check .gitmodules when using packfile URIs 2021-01-24 6:29 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Junio C Hamano @ 2021-01-28 0:35 ` Jonathan Tan 2021-02-18 11:31 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-01-28 0:35 UTC (permalink / raw) To: gitster; +Cc: jonathantanmy, git > Jonathan Tan <jonathantanmy@google.com> writes: > > > As part of this, index-pack has to output (1) the hash that goes into > > the name of the .pack/.idx file and (2) the hashes of all dangling > > .gitmodules. I just had (2) come after (1). If anyone has a better idea, > > I'm interested. > > I have this feeling that the "blobs that need to be validated across > packs" will *not* be the last enhancement we'd need to make to the > output from index-pack to allow richer communication between it and > its invoker. While there is no reason to change how the first line > of the output looks like, we'd probably want to make sure that the > future versions of Git can easily tell "list of blobs that require > further validation" from other additional information. > > I am not comfortable to recommend "ok, then let's add a delimiter > line '---\n' if/when we need to have something after the list of > blobs and append more stuff in future versions of Git", because we > may find need to emit new kinds of info before the list of blobs > that needs further validation, for example, in future versions of > Git. > > Having said all that, the internal communication between the > index-pack and its caller do not need as much care about > compatibility across versions as output visible to end-users, so > when a future version of Git needs to send different kinds of > information in different order from what you created here, we can do > so pretty much freely, I would guess. Yeah, that's what I thought too - since this is an internal interface, we can evolve them in lockstep. 
If we're really worried about the Git binaries (on a user's system) getting out of sync, we could just make sure that subsequent updates to this protocol are non-backwards-compatible (e.g. have index-pack emit "foo <hash>", where "foo" is a string that describes the new check, so that current fetch-pack will reject "foo" since it is not a hash). ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH 0/4] Check .gitmodules when using packfile URIs
  2021-01-28  0:35 ` Jonathan Tan
@ 2021-02-18 11:31 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 229+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18 11:31 UTC
To: Jonathan Tan; +Cc: gitster, git, Patrick Steinhardt

On Thu, Jan 28 2021, Jonathan Tan wrote:

>> Jonathan Tan <jonathantanmy@google.com> writes:
>>
>> > As part of this, index-pack has to output (1) the hash that goes into
>> > the name of the .pack/.idx file and (2) the hashes of all dangling
>> > .gitmodules. I just had (2) come after (1). If anyone has a better idea,
>> > I'm interested.
>>
>> I have this feeling that the "blobs that need to be validated across
>> packs" will *not* be the last enhancement we'd need to make to the
>> output from index-pack to allow richer communication between it and
>> its invoker. While there is no reason to change how the first line
>> of the output looks like, we'd probably want to make sure that the
>> future versions of Git can easily tell "list of blobs that require
>> further validation" from other additional information.
>>
>> I am not comfortable to recommend "ok, then let's add a delimiter
>> line '---\n' if/when we need to have something after the list of
>> blobs and append more stuff in future versions of Git", because we
>> may find need to emit new kinds of info before the list of blobs
>> that needs further validation, for example, in future versions of
>> Git.
>>
>> Having said all that, the internal communication between the
>> index-pack and its caller do not need as much care about
>> compatibility across versions as output visible to end-users, so
>> when a future version of Git needs to send different kinds of
>> information in different order from what you created here, we can do
>> so pretty much freely, I would guess.
>
> Yeah, that's what I thought too - since this is an internal interface,
> we can evolve them in lockstep. If we're really worried about the Git
> binaries (on a user's system) getting out of sync,

Reading "getting out of sync", I think you may be missing an aspect of
the issue here. We're not talking about some abnormal error in some
packaging system, but about how we'd expect all installations of git to
behave if you update them with *.rpm, *.deb etc., e.g. when your
binaries are in /usr/libexec/git-core. I suppose NixOS or something
else with hash-based paths may be exempt from this.

On rpm/deb-style systems, if you've got a server serving concurrent
traffic and update the "git" package, you could expect failure if any
git process invoked by another is incompatible during such an upgrade.

This came up in the recent GIT_CONFIG_PARAMETERS discussion. I.e. even
if GIT_CONFIG_PARAMETERS is internal-only, we bent over backwards not
to change it in such a way that process A invokes process B and the two
don't understand each other because of such an upgrade.

That's exactly because of this case, where receive-pack may be started
on version A, someone runs "apt install git" in the background
concurrently, and now a version A receive-pack is talking to a
version B index-pack.

> we could just make sure that subsequent updates to this protocol are
> non-backwards-compatible (e.g. have index-pack emit "foo <hash>",
> where "foo" is a string that describes the new check, so that current
> fetch-pack will reject "foo" since it is not a hash).

And then presumably index-pack would die and receive-pack would die on
the push or whatever, so the push fails for the end user.
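To make the compatibility trade-off above concrete, here is a minimal sketch (not code from this series) of a strict reader for one entry in index-pack's list of dangling .gitmodules hashes. The names parse_oid_line and HEXSZ are hypothetical, and a 40-character SHA-1 hash is assumed. A reader built this way rejects a tagged line such as "foo <hash>": that is the behavior Jonathan proposes to rely on, and also the hard failure Ævar expects the end user to see if the two binaries are briefly out of step during an upgrade.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define HEXSZ 40 /* SHA-1 hex length; an assumption for this sketch */

/*
 * Hypothetical strict reader for one line of index-pack output.
 * Returns 0 and copies the ID into "out" if the line is exactly one
 * hex object ID, -1 otherwise. Anything else, including a tagged
 * line like "foo <hash>", is rejected rather than skipped.
 */
static int parse_oid_line(const char *line, char out[HEXSZ + 1])
{
	size_t i;

	assert(line != NULL && out != NULL);
	if (strlen(line) != HEXSZ)
		return -1;
	for (i = 0; i < HEXSZ; i++)
		if (!isxdigit((unsigned char)line[i]))
			return -1;
	memcpy(out, line, HEXSZ + 1); /* includes the trailing NUL */
	return 0;
}
```

Whether the caller then dies or merely warns on a -1 return is the policy question raised in the message above; the sketch only shows that an old reader cannot quietly ignore a line it does not understand.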
* Re: [PATCH 0/4] Check .gitmodules when using packfile URIs
  2021-01-24  2:34 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Jonathan Tan
  ` (4 preceding siblings ...)
  2021-01-24  6:29 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Junio C Hamano
@ 2021-02-18 23:34 ` Junio C Hamano
  2021-02-19  0:46 ` Jonathan Tan
  2021-02-19  1:08 ` Ævar Arnfjörð Bjarmason
  5 siblings, 2 replies; 229+ messages in thread
From: Junio C Hamano @ 2021-02-18 23:34 UTC
To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> This patch set resolves the .gitmodules-and-tree-in-separate-packfiles
> issue I mentioned in [1] by having index-pack print out all dangling
> .gitmodules (instead of returning with an error code) and then teaching
> fetch-pack to read those and run its own fsck checks after all
> index-pack invocations are complete.
>
> As part of this, index-pack has to output (1) the hash that goes into
> the name of the .pack/.idx file and (2) the hashes of all dangling
> .gitmodules. I just had (2) come after (1). If anyone has a better idea,
> I'm interested.
>
> I also discovered a bug in that different index-pack arguments were used
> when processing the inline packfile and when processing the ones
> referenced by URIs. Patch 1-3 fixes that bug by passing the arguments to
> use as a space-separated URL-encoded list. (URL-encoded so that we can
> have spaces in the arguments.) Again, if anyone has a better idea, I'm
> interested. It is only in patch 4 that we have the dangling .gitmodules
> fix.

This seems to have been stalled but I think it would be a better
approach to use a custom callback for error reporting, suggested by
Ævar, which would be where his fsck API clean-up topic would lead
to.

If it is not ultra-urgent, perhaps you can retract the ones that are
queued right now, work with Ævar to finish the error-callback work
and rebuild this topic on top of it? Thanks.
* Re: [PATCH 0/4] Check .gitmodules when using packfile URIs
  2021-02-18 23:34 ` Junio C Hamano
@ 2021-02-19  0:46 ` Jonathan Tan
  2021-02-20  3:31 ` Junio C Hamano
  2021-02-19  1:08 ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 229+ messages in thread
From: Jonathan Tan @ 2021-02-19 0:46 UTC
To: gitster; +Cc: jonathantanmy, git

> This seems to have been stalled but I think it would be a better
> approach to use a custom callback for error reporting, suggested by
> Ævar, which would be where his fsck API clean-up topic would lead
> to.
>
> If it is not ultra-urgent, perhaps you can retract the ones that are
> queued right now, work with Ævar to finish the error-callback work
> and rebuild this topic on top of it? Thanks.

OK - that works. My original idea was to rewrite it using an
error-callback but using starts_with() instead of the ID that Ævar's
work will provide, but seeing that at least one other contributor (Peff)
seems OK with the patches, rebasing mine on top of his works too. I'll
also take a look at his patches.
* Re: [PATCH 0/4] Check .gitmodules when using packfile URIs
  2021-02-19  0:46 ` Jonathan Tan
@ 2021-02-20  3:31 ` Junio C Hamano
  0 siblings, 0 replies; 229+ messages in thread
From: Junio C Hamano @ 2021-02-20 3:31 UTC
To: Jonathan Tan; +Cc: git, Ævar Arnfjörð Bjarmason

Jonathan Tan <jonathantanmy@google.com> writes:

>> This seems to have been stalled but I think it would be a better
>> approach to use a custom callback for error reporting, suggested by
>> Ævar, which would be where his fsck API clean-up topic would lead
>> to.
>>
>> If it is not ultra-urgent, perhaps you can retract the ones that are
>> queued right now, work with Ævar to finish the error-callback work
>> and rebuild this topic on top of it? Thanks.
>
> OK - that works. My original idea was to rewrite it using an
> error-callback but using starts_with() instead of the ID that Ævar's
> work will provide, but seeing that at least one other contributor (Peff)
> seems OK with the patches, rebasing mine on top of his works too. I'll
> also take a look at his patches.

Thanks, either way would work for me, but if the suggested route
forces you to review Ævar's code and work together, that would be a
good bonus point ;-)
* Re: [PATCH 0/4] Check .gitmodules when using packfile URIs
  2021-02-18 23:34 ` Junio C Hamano
  2021-02-19  0:46 ` Jonathan Tan
@ 2021-02-19  1:08 ` Ævar Arnfjörð Bjarmason
  2021-02-20  3:29 ` Junio C Hamano
  1 sibling, 1 reply; 229+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-19 1:08 UTC
To: Junio C Hamano; +Cc: Jonathan Tan, git

On Fri, Feb 19 2021, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> This patch set resolves the .gitmodules-and-tree-in-separate-packfiles
>> issue I mentioned in [1] by having index-pack print out all dangling
>> .gitmodules (instead of returning with an error code) and then teaching
>> fetch-pack to read those and run its own fsck checks after all
>> index-pack invocations are complete.
>>
>> As part of this, index-pack has to output (1) the hash that goes into
>> the name of the .pack/.idx file and (2) the hashes of all dangling
>> .gitmodules. I just had (2) come after (1). If anyone has a better idea,
>> I'm interested.
>>
>> I also discovered a bug in that different index-pack arguments were used
>> when processing the inline packfile and when processing the ones
>> referenced by URIs. Patch 1-3 fixes that bug by passing the arguments to
>> use as a space-separated URL-encoded list. (URL-encoded so that we can
>> have spaces in the arguments.) Again, if anyone has a better idea, I'm
>> interested. It is only in patch 4 that we have the dangling .gitmodules
>> fix.
>
> This seems to have been stalled but I think it would be a better
> approach to use a custom callback for error reporting, suggested by
> Ævar, which would be where his fsck API clean-up topic would lead
> to.
>
> If it is not ultra-urgent, perhaps you can retract the ones that are
> queued right now, work with Ævar to finish the error-callback work
> and rebuild this topic on top of it? Thanks.

If my vote counts for something I think it makes sense to have
Jonathan's series go first and just ignore my fsck API improvement
patches (well, the part of my v1[1] which conflicts with his work).

I'm also happy to help him queue his on top of a v1 version of my
series.

But the end result of doing so (shown after the "--" in [1]) is just a
small re-arrangement of code to get a cleaner fsck API use; it doesn't
actually matter to anyone using git.

Whereas his patches actually do: we have in-the-wild server/repo/clone
setups that are getting on-clone errors, and the window for 2.31 is
getting closer.

We can always do the small API use refactoring later. My interest in
barking up that tree was just that I've been poking at that part of the
fsck API and have some follow-up work that hasn't made it onto the list
yet that makes other use of the fsck API.

So in the longer term I wanted us to think about not needing N special
cases like "print_dangling_gitmodules" if we could help it, but in the
shorter term having it is a non-issue.

1. https://lore.kernel.org/git/20210217194246.25342-1-avarab@gmail.com/
* Re: [PATCH 0/4] Check .gitmodules when using packfile URIs
  2021-02-19  1:08 ` Ævar Arnfjörð Bjarmason
@ 2021-02-20  3:29 ` Junio C Hamano
  0 siblings, 0 replies; 229+ messages in thread
From: Junio C Hamano @ 2021-02-20 3:29 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Jonathan Tan, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Fri, Feb 19 2021, Junio C Hamano wrote:
>
>> This seems to have been stalled but I think it would be a better
>> approach to use a custom callback for error reporting, suggested by
>> Ævar, which would be where his fsck API clean-up topic would lead
>> to.
>>
>> If it is not ultra-urgent, perhaps you can retract the ones that are
>> queued right now, work with Ævar to finish the error-callback work
>> and rebuild this topic on top of it? Thanks.
>
> If my vote counts for something I think it makes sense to have
> Jonathan's series go first and just ignore my fsck API improvement
> patches (well, the part of my v1[1] which conflicts with his work).
>
> I'm also happy to help him queue his on top of a v1 version of my
> series.

Either would work for us, I would think.

Thanks.
* [PATCH v2 0/4] Check .gitmodules when using packfile URIs 2021-01-15 23:43 RFC on packfile URIs and .gitmodules check Jonathan Tan ` (2 preceding siblings ...) 2021-01-24 2:34 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Jonathan Tan @ 2021-02-22 19:20 ` Jonathan Tan 2021-02-22 19:20 ` [PATCH v2 1/4] http: allow custom index-pack args Jonathan Tan ` (4 more replies) 3 siblings, 5 replies; 229+ messages in thread From: Jonathan Tan @ 2021-02-22 19:20 UTC (permalink / raw) To: git; +Cc: Jonathan Tan, avarab, gitster Here's v2. I think I've addressed all the review comments, including passing the index-pack args as separate arguments (to avoid the necessity to somehow encode in order to get rid of spaces), and by using a custom error function instead of a specific option in fsck. This applies on master. I mentioned earlier [1] that I was planning to implement this on Ævar's fsck API improvements, but after looking at the latest v2, I see that it omits patch 11 from v1 (which is the one I need), so what I've done is to use a string check in the meantime. [1] https://lore.kernel.org/git/20210219004612.1181920-1-jonathantanmy@google.com/ Jonathan Tan (4): http: allow custom index-pack args http-fetch: allow custom index-pack args fetch-pack: with packfile URIs, use index-pack arg fetch-pack: print and use dangling .gitmodules Documentation/git-http-fetch.txt | 10 ++- Documentation/git-index-pack.txt | 7 ++- builtin/index-pack.c | 25 +++++++- builtin/receive-pack.c | 2 +- fetch-pack.c | 103 ++++++++++++++++++++++++++----- fsck.c | 5 ++ fsck.h | 2 + http-fetch.c | 20 +++++- http.c | 15 ++--- http.h | 10 +-- pack-write.c | 8 ++- pack.h | 2 +- t/t5550-http-fetch-dumb.sh | 5 +- t/t5702-protocol-v2.sh | 58 +++++++++++++++-- 14 files changed, 227 insertions(+), 45 deletions(-) Range-diff against v1: -: ---------- > 1: b7e376be16 http: allow custom index-pack args 1: 9fba6c9bcc ! 
2: 57220ceb84 http-fetch: allow custom index-pack args @@ Documentation/git-http-fetch.txt: commit-id:: --packfile=<hash>:: - Instead of a commit id on the command line (which is not expected in -+ For internal use only. Instead of a commit id on the command line (which is not expected in ++ For internal use only. Instead of a commit id on the command ++ line (which is not expected in this case), 'git http-fetch' fetches the packfile directly at the given URL and uses index-pack to generate corresponding .idx and .keep files. The hash is used to determine the name of the temporary file and is @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, strvec_pushf(&cmd.args, "--packfile=%.*s", (int) the_hash_algo->hexsz, packfile_uris.items[i].string); -+ strvec_push(&cmd.args, "--index-pack-args=index-pack --stdin --keep"); ++ strvec_push(&cmd.args, "--index-pack-arg=index-pack"); ++ strvec_push(&cmd.args, "--index-pack-arg=--stdin"); ++ strvec_push(&cmd.args, "--index-pack-arg=--keep"); strvec_push(&cmd.args, uri); cmd.git_cmd = 1; cmd.no_stdin = 1; @@ http-fetch.c: int cmd_main(int argc, const char **argv) int packfile = 0; int nongit; struct object_id packfile_hash; -+ const char *index_pack_args = NULL; ++ struct strvec index_pack_args = STRVEC_INIT; setup_git_directory_gently(&nongit); @@ http-fetch.c: int cmd_main(int argc, const char **argv) packfile = 1; if (parse_oid_hex(p, &packfile_hash, &end) || *end) die(_("argument to --packfile must be a valid hash (got '%s')"), p); -+ } else if (skip_prefix(argv[arg], "--index-pack-args=", &p)) { -+ index_pack_args = p; ++ } else if (skip_prefix(argv[arg], "--index-pack-arg=", &p)) { ++ strvec_push(&index_pack_args, p); } arg++; } @@ http-fetch.c: int cmd_main(int argc, const char **argv) if (packfile) { - fetch_single_packfile(&packfile_hash, argv[arg]); -+ struct strvec encoded = STRVEC_INIT; -+ char **raw; -+ int i; -+ -+ if (!index_pack_args) ++ if (!index_pack_args.nr) + 
die(_("--packfile requires --index-pack-args")); + -+ strvec_split(&encoded, index_pack_args); -+ -+ CALLOC_ARRAY(raw, encoded.nr + 1); -+ for (i = 0; i < encoded.nr; i++) -+ raw[i] = url_percent_decode(encoded.v[i]); -+ + fetch_single_packfile(&packfile_hash, argv[arg], -+ (const char **) raw); -+ -+ for (i = 0; i < encoded.nr; i++) -+ free(raw[i]); -+ free(raw); -+ strvec_clear(&encoded); ++ index_pack_args.v); + return 0; } -+ if (index_pack_args) ++ if (index_pack_args.nr) + die(_("--index-pack-args can only be used with --packfile")); + if (commits_on_stdin) { @@ t/t5550-http-fetch-dumb.sh: test_expect_success 'http-fetch --packfile' ' p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && ls objects/pack/pack-*.pack) && - git -C packfileclient http-fetch --packfile=$ARBITRARY "$HTTPD_URL"/dumb/repo_pack.git/$p >out && + git -C packfileclient http-fetch --packfile=$ARBITRARY \ -+ --index-pack-args="index-pack --stdin --keep" "$HTTPD_URL"/dumb/repo_pack.git/$p >out && ++ --index-pack-arg=index-pack --index-pack-arg=--stdin \ ++ --index-pack-arg=--keep \ ++ "$HTTPD_URL"/dumb/repo_pack.git/$p >out && grep "^keep.[0-9a-f]\{16,\}$" out && cut -c6- out >packhash && 2: 7c3244e79f ! 3: aa87335464 fetch-pack: with packfile URIs, use index-pack arg @@ fetch-pack.c: static void write_promisor_file(const char *keep_name, - * Pass 1 as "only_packfile" if the pack received is the only pack in this - * fetch request (that is, if there were no packfile URIs provided). + * If packfile URIs were provided, pass a non-NULL pointer to index_pack_args. -+ * The string to pass as the --index-pack-args argument to http-fetch will be ++ * The strings to pass as the --index-pack-arg arguments to http-fetch will be + * stored there. (It must be freed by the caller.) 
*/ static int get_pack(struct fetch_pack_args *args, int xd[2], struct string_list *pack_lockfiles, - int only_packfile, -+ char **index_pack_args, ++ struct strvec *index_pack_args, struct ref **sought, int nr_sought) { struct async demux; @@ fetch-pack.c: static int get_pack(struct fetch_pack_args *args, } + if (index_pack_args) { -+ struct strbuf joined = STRBUF_INIT; + int i; + -+ for (i = 0; i < cmd.args.nr; i++) { -+ if (i) -+ strbuf_addch(&joined, ' '); -+ strbuf_addstr_urlencode(&joined, cmd.args.v[i], -+ is_rfc3986_unreserved); -+ } -+ *index_pack_args = strbuf_detach(&joined, NULL); ++ for (i = 0; i < cmd.args.nr; i++) ++ strvec_push(index_pack_args, cmd.args.v[i]); + } + cmd.in = demux.out; @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, int seen_ack = 0; struct string_list packfile_uris = STRING_LIST_INIT_DUP; int i; -+ char *index_pack_args = NULL; ++ struct strvec index_pack_args = STRVEC_INIT; negotiator = &negotiator_alloc; fetch_negotiator_init(r, negotiator); @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, die(_("git fetch-pack: fetch failed.")); do_check_stateless_delimiter(args, &reader); +@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, + } + + for (i = 0; i < packfile_uris.nr; i++) { ++ int j; + struct child_process cmd = CHILD_PROCESS_INIT; + char packname[GIT_MAX_HEXSZ + 1]; + const char *uri = packfile_uris.items[i].string + @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, strvec_pushf(&cmd.args, "--packfile=%.*s", (int) the_hash_algo->hexsz, packfile_uris.items[i].string); -- strvec_push(&cmd.args, "--index-pack-args=index-pack --stdin --keep"); -+ strvec_pushf(&cmd.args, "--index-pack-args=%s", index_pack_args); +- strvec_push(&cmd.args, "--index-pack-arg=index-pack"); +- strvec_push(&cmd.args, "--index-pack-arg=--stdin"); +- strvec_push(&cmd.args, "--index-pack-arg=--keep"); ++ for (j = 0; j < 
index_pack_args.nr; j++) ++ strvec_pushf(&cmd.args, "--index-pack-arg=%s", ++ index_pack_args.v[j]); strvec_push(&cmd.args, uri); cmd.git_cmd = 1; cmd.no_stdin = 1; @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, packname)); } string_list_clear(&packfile_uris, 0); -+ FREE_AND_NULL(index_pack_args); ++ strvec_clear(&index_pack_args); if (negotiator) negotiator->release(negotiator); 3: 384c9d1c73 ! 4: e8b18d02e6 fetch-pack: print and use dangling .gitmodules @@ Documentation/git-index-pack.txt: OPTIONS Specifies the number of threads to spawn when resolving ## builtin/index-pack.c ## +@@ builtin/index-pack.c: static void show_pack_info(int stat_only) + } + } + ++static int print_dangling_gitmodules(struct fsck_options *o, ++ const struct object_id *oid, ++ enum object_type object_type, ++ int msg_type, const char *message) ++{ ++ /* ++ * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it ++ * instead of relying on this string check. ++ */ ++ if (starts_with(message, "gitmodulesMissing")) { ++ printf("%s\n", oid_to_hex(oid)); ++ return 0; ++ } ++ return fsck_error_function(o, oid, object_type, msg_type, message); ++} ++ + int cmd_index_pack(int argc, const char **argv, const char *prefix) + { + int i, fix_thin_pack = 0, verify = 0, stat_only = 0; @@ builtin/index-pack.c: int cmd_index_pack(int argc, const char **argv, const char *prefix) else close(input_fd); @@ builtin/index-pack.c: int cmd_index_pack(int argc, const char **argv, const char - if (do_fsck_object && fsck_finish(&fsck_options)) - die(_("fsck error in pack objects")); + if (do_fsck_object) { -+ struct fsck_options fo = FSCK_OPTIONS_STRICT; ++ struct fsck_options fo = fsck_options; + -+ fo.print_dangling_gitmodules = 1; ++ fo.error_func = print_dangling_gitmodules; + if (fsck_finish(&fo)) + die(_("fsck error in pack objects")); + } @@ fetch-pack.c: static void write_promisor_file(const char *keep_name, + /* * If packfile URIs were provided, pass a non-NULL pointer 
to index_pack_args. - * The string to pass as the --index-pack-args argument to http-fetch will be + * The strings to pass as the --index-pack-arg arguments to http-fetch will be @@ fetch-pack.c: static void write_promisor_file(const char *keep_name, static int get_pack(struct fetch_pack_args *args, int xd[2], struct string_list *pack_lockfiles, - char **index_pack_args, + struct strvec *index_pack_args, - struct ref **sought, int nr_sought) + struct ref **sought, int nr_sought, + struct oidset *gitmodules_oids) @@ fetch-pack.c: static struct ref *do_fetch_pack(struct fetch_pack_args *args, @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, struct string_list packfile_uris = STRING_LIST_INIT_DUP; int i; - char *index_pack_args = NULL; + struct strvec index_pack_args = STRVEC_INIT; + struct oidset gitmodules_oids = OIDSET_INIT; negotiator = &negotiator_alloc; @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, if (finish_command(&cmd)) @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, string_list_clear(&packfile_uris, 0); - FREE_AND_NULL(index_pack_args); + strvec_clear(&index_pack_args); + fsck_gitmodules_oids(&gitmodules_oids); + @@ fsck.c: int fsck_error_function(struct fsck_options *o, int fsck_finish(struct fsck_options *options) { int ret = 0; -@@ fsck.c: int fsck_finish(struct fsck_options *options) - if (!buf) { - if (is_promisor_object(oid)) - continue; -- ret |= report(options, -- oid, OBJ_BLOB, -- FSCK_MSG_GITMODULES_MISSING, -- "unable to read .gitmodules blob"); -+ if (options->print_dangling_gitmodules) -+ printf("%s\n", oid_to_hex(oid)); -+ else -+ ret |= report(options, -+ oid, OBJ_BLOB, -+ FSCK_MSG_GITMODULES_MISSING, -+ "unable to read .gitmodules blob"); - continue; - } - ## fsck.h ## -@@ fsck.h: struct fsck_options { - int *msg_type; - struct oidset skiplist; - kh_oid_map_t *object_names; -+ -+ /* -+ * If 1, print the hashes of missing .gitmodules 
blobs instead of -+ * considering them to be errors. -+ */ -+ unsigned print_dangling_gitmodules:1; - }; - - #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } @@ fsck.h: int fsck_walk(struct object *obj, void *data, struct fsck_options *options); int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); @@ pack.h: int verify_pack_index(struct packed_git *); * The "hdr" output buffer should be at least this big, which will handle sizes ## t/t5702-protocol-v2.sh ## +@@ t/t5702-protocol-v2.sh: test_expect_success 'part of packfile response provided as URI' ' + test -f hfound && + test -f h2found && + +- # Ensure that there are exactly 6 files (3 .pack and 3 .idx). +- ls http_child/.git/objects/pack/* >filelist && ++ # Ensure that there are exactly 3 packfiles with associated .idx ++ ls http_child/.git/objects/pack/*.pack \ ++ http_child/.git/objects/pack/*.idx >filelist && + test_line_count = 6 filelist + ' + +@@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobjects' ' + -c fetch.uriprotocols=http,https \ + clone "$HTTPD_URL/smart/http_parent" http_child && + +- # Ensure that there are exactly 4 files (2 .pack and 2 .idx). +- ls http_child/.git/objects/pack/* >filelist && ++ # Ensure that there are exactly 2 packfiles with associated .idx ++ ls http_child/.git/objects/pack/*.pack \ ++ http_child/.git/objects/pack/*.idx >filelist && + test_line_count = 4 filelist + ' + @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobjects fails on bad object' test_i18ngrep "invalid author/committer line - missing email" error ' @@ t/t5702-protocol-v2.sh: test_expect_success 'packfile-uri with transfer.fsckobje + -c fetch.uriprotocols=http,https \ + clone "$HTTPD_URL/smart/http_parent" http_child && + -+ # Ensure that there are exactly 4 files (2 .pack and 2 .idx). 
-+ ls http_child/.git/objects/pack/* >filelist && ++ # Ensure that there are exactly 2 packfiles with associated .idx ++ ls http_child/.git/objects/pack/*.pack \ ++ http_child/.git/objects/pack/*.idx >filelist && + test_line_count = 4 filelist +' + 4: da0d7b38ae < -: ---------- SQUASH??? test fix -- 2.30.0.617.g56c4b15f3c-goog ^ permalink raw reply [flat|nested] 229+ messages in thread
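The "custom error function" with a "string check" described in the v2 cover letter can be pictured roughly as follows. This is a self-contained sketch, not the code from patch 4: struct sketch_fsck_state and both functions are hypothetical stand-ins rather than git's fsck API, and the sketch records the object ID in memory where the patch prints it to stdout. Only the "gitmodulesMissing" message prefix is taken from the thread.

```c
#include <assert.h>
#include <string.h>

#define MAX_DANGLING 16 /* arbitrary cap for this sketch */

/* Hypothetical stand-in for caller state; not git's struct fsck_options. */
struct sketch_fsck_state {
	const char *dangling[MAX_DANGLING]; /* IDs of missing .gitmodules blobs */
	int nr_dangling;
	int nr_errors;
};

/* Default handler: every reported problem counts as a real error. */
static int sketch_default_error(struct sketch_fsck_state *st,
				const char *hex_oid, const char *message)
{
	(void)hex_oid;
	(void)message;
	st->nr_errors++;
	return 1;
}

/*
 * Custom handler: a "gitmodulesMissing" report is recorded for a later
 * cross-pack check and is not treated as an error. Everything else,
 * including a report that would overflow the list, goes to the default
 * handler.
 */
static int sketch_dangling_gitmodules(struct sketch_fsck_state *st,
				      const char *hex_oid, const char *message)
{
	static const char prefix[] = "gitmodulesMissing";

	assert(st != NULL && hex_oid != NULL && message != NULL);
	if (!strncmp(message, prefix, strlen(prefix)) &&
	    st->nr_dangling < MAX_DANGLING) {
		st->dangling[st->nr_dangling++] = hex_oid;
		return 0;
	}
	return sketch_default_error(st, hex_oid, message);
}
```

Matching on a message prefix is fragile, which is why the patch carries a NEEDSWORK comment about plumbing the message ID through instead.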
* [PATCH v2 1/4] http: allow custom index-pack args 2021-02-22 19:20 ` [PATCH v2 " Jonathan Tan @ 2021-02-22 19:20 ` Jonathan Tan 2021-02-22 19:20 ` [PATCH v2 2/4] http-fetch: " Jonathan Tan ` (3 subsequent siblings) 4 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-02-22 19:20 UTC (permalink / raw) To: git; +Cc: Jonathan Tan, avarab, gitster Currently, when fetching, packfiles referenced by URIs are run through index-pack without any arguments other than --stdin and --keep, no matter what arguments are used for the packfile that is inline in the fetch response. As a preparation for ensuring that all packs (whether inline or not) use the same index-pack arguments, teach the http subsystem to allow custom index-pack arguments. http-fetch has been updated to use the new API. For now, it passes --keep alone instead of --keep with a process ID, but this is only temporary because http-fetch itself will be taught to accept index-pack parameters (instead of using a hardcoded constant) in a subsequent commit. 
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- http-fetch.c | 6 +++++- http.c | 15 ++++++++------- http.h | 10 +++++----- 3 files changed, 18 insertions(+), 13 deletions(-) diff --git a/http-fetch.c b/http-fetch.c index c4ccc5fea9..2d1d9d054f 100644 --- a/http-fetch.c +++ b/http-fetch.c @@ -43,6 +43,9 @@ static int fetch_using_walker(const char *raw_url, int get_verbosely, return rc; } +static const char *index_pack_args[] = + {"index-pack", "--stdin", "--keep", NULL}; + static void fetch_single_packfile(struct object_id *packfile_hash, const char *url) { struct http_pack_request *preq; @@ -55,7 +58,8 @@ static void fetch_single_packfile(struct object_id *packfile_hash, if (preq == NULL) die("couldn't create http pack request"); preq->slot->results = &results; - preq->generate_keep = 1; + preq->index_pack_args = index_pack_args; + preq->preserve_index_pack_stdout = 1; if (start_active_slot(preq->slot)) { run_active_slot(preq->slot); diff --git a/http.c b/http.c index 8b23a546af..f8ea28bb2e 100644 --- a/http.c +++ b/http.c @@ -2259,6 +2259,9 @@ void release_http_pack_request(struct http_pack_request *preq) free(preq); } +static const char *default_index_pack_args[] = + {"index-pack", "--stdin", NULL}; + int finish_http_pack_request(struct http_pack_request *preq) { struct child_process ip = CHILD_PROCESS_INIT; @@ -2270,17 +2273,15 @@ int finish_http_pack_request(struct http_pack_request *preq) tmpfile_fd = xopen(preq->tmpfile.buf, O_RDONLY); - strvec_push(&ip.args, "index-pack"); - strvec_push(&ip.args, "--stdin"); ip.git_cmd = 1; ip.in = tmpfile_fd; - if (preq->generate_keep) { - strvec_pushf(&ip.args, "--keep=git %"PRIuMAX, - (uintmax_t)getpid()); + ip.argv = preq->index_pack_args ? 
preq->index_pack_args + : default_index_pack_args; + + if (preq->preserve_index_pack_stdout) ip.out = 0; - } else { + else ip.no_stdout = 1; - } if (run_command(&ip)) { ret = -1; diff --git a/http.h b/http.h index 5de792ef3f..bf3d1270ad 100644 --- a/http.h +++ b/http.h @@ -218,12 +218,12 @@ struct http_pack_request { char *url; /* - * If this is true, finish_http_pack_request() will pass "--keep" to - * index-pack, resulting in the creation of a keep file, and will not - * suppress its stdout (that is, the "keep\t<hash>\n" line will be - * printed to stdout). + * index-pack command to run. Must be terminated by NULL. + * + * If NULL, defaults to {"index-pack", "--stdin", NULL}. */ - unsigned generate_keep : 1; + const char **index_pack_args; + unsigned preserve_index_pack_stdout : 1; FILE *packfile; struct strbuf tmpfile; -- 2.30.0.617.g56c4b15f3c-goog ^ permalink raw reply related [flat|nested] 229+ messages in thread
* [PATCH v2 2/4] http-fetch: allow custom index-pack args 2021-02-22 19:20 ` [PATCH v2 " Jonathan Tan 2021-02-22 19:20 ` [PATCH v2 1/4] http: allow custom index-pack args Jonathan Tan @ 2021-02-22 19:20 ` Jonathan Tan 2021-02-23 13:17 ` Ævar Arnfjörð Bjarmason 2021-03-05 0:19 ` Jonathan Nieder 2021-02-22 19:20 ` [PATCH v2 3/4] fetch-pack: with packfile URIs, use index-pack arg Jonathan Tan ` (2 subsequent siblings) 4 siblings, 2 replies; 229+ messages in thread From: Jonathan Tan @ 2021-02-22 19:20 UTC (permalink / raw) To: git; +Cc: Jonathan Tan, avarab, gitster This is the next step in teaching fetch-pack to pass its index-pack arguments when processing packfiles referenced by URIs. The "--keep" in fetch-pack.c will be replaced with a full message in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- Documentation/git-http-fetch.txt | 10 ++++++++-- fetch-pack.c | 3 +++ http-fetch.c | 20 +++++++++++++++----- t/t5550-http-fetch-dumb.sh | 5 ++++- 4 files changed, 30 insertions(+), 8 deletions(-) diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt index 4deb4893f5..9fa17b60e4 100644 --- a/Documentation/git-http-fetch.txt +++ b/Documentation/git-http-fetch.txt @@ -41,11 +41,17 @@ commit-id:: <commit-id>['\t'<filename-as-in--w>] --packfile=<hash>:: - Instead of a commit id on the command line (which is not expected in + For internal use only. Instead of a commit id on the command + line (which is not expected in this case), 'git http-fetch' fetches the packfile directly at the given URL and uses index-pack to generate corresponding .idx and .keep files. The hash is used to determine the name of the temporary file and is - arbitrary. The output of index-pack is printed to stdout. + arbitrary. The output of index-pack is printed to stdout. Requires + --index-pack-args. + +--index-pack-args=<args>:: + For internal use only. 
The command to run on the contents of the + downloaded pack. Arguments are URL-encoded separated by spaces. --recover:: Verify that everything reachable from target is fetched. Used after diff --git a/fetch-pack.c b/fetch-pack.c index 876f90c759..aeac010b0b 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -1645,6 +1645,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, strvec_pushf(&cmd.args, "--packfile=%.*s", (int) the_hash_algo->hexsz, packfile_uris.items[i].string); + strvec_push(&cmd.args, "--index-pack-arg=index-pack"); + strvec_push(&cmd.args, "--index-pack-arg=--stdin"); + strvec_push(&cmd.args, "--index-pack-arg=--keep"); strvec_push(&cmd.args, uri); cmd.git_cmd = 1; cmd.no_stdin = 1; diff --git a/http-fetch.c b/http-fetch.c index 2d1d9d054f..fa642462a9 100644 --- a/http-fetch.c +++ b/http-fetch.c @@ -3,6 +3,7 @@ #include "exec-cmd.h" #include "http.h" #include "walker.h" +#include "strvec.h" static const char http_fetch_usage[] = "git http-fetch " "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url"; @@ -43,11 +44,9 @@ static int fetch_using_walker(const char *raw_url, int get_verbosely, return rc; } -static const char *index_pack_args[] = - {"index-pack", "--stdin", "--keep", NULL}; - static void fetch_single_packfile(struct object_id *packfile_hash, - const char *url) { + const char *url, + const char **index_pack_args) { struct http_pack_request *preq; struct slot_results results; int ret; @@ -90,6 +89,7 @@ int cmd_main(int argc, const char **argv) int packfile = 0; int nongit; struct object_id packfile_hash; + struct strvec index_pack_args = STRVEC_INIT; setup_git_directory_gently(&nongit); @@ -116,6 +116,8 @@ int cmd_main(int argc, const char **argv) packfile = 1; if (parse_oid_hex(p, &packfile_hash, &end) || *end) die(_("argument to --packfile must be a valid hash (got '%s')"), p); + } else if (skip_prefix(argv[arg], "--index-pack-arg=", &p)) { + strvec_push(&index_pack_args, p); } arg++; } @@ 
-128,10 +130,18 @@ int cmd_main(int argc, const char **argv) git_config(git_default_config, NULL); if (packfile) { - fetch_single_packfile(&packfile_hash, argv[arg]); + if (!index_pack_args.nr) + die(_("--packfile requires --index-pack-args")); + + fetch_single_packfile(&packfile_hash, argv[arg], + index_pack_args.v); + return 0; } + if (index_pack_args.nr) + die(_("--index-pack-args can only be used with --packfile")); + if (commits_on_stdin) { commits = walker_targets_stdin(&commit_id, &write_ref); } else { diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh index 483578b2d7..358b322e05 100755 --- a/t/t5550-http-fetch-dumb.sh +++ b/t/t5550-http-fetch-dumb.sh @@ -224,7 +224,10 @@ test_expect_success 'http-fetch --packfile' ' git init packfileclient && p=$(cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git && ls objects/pack/pack-*.pack) && - git -C packfileclient http-fetch --packfile=$ARBITRARY "$HTTPD_URL"/dumb/repo_pack.git/$p >out && + git -C packfileclient http-fetch --packfile=$ARBITRARY \ + --index-pack-arg=index-pack --index-pack-arg=--stdin \ + --index-pack-arg=--keep \ + "$HTTPD_URL"/dumb/repo_pack.git/$p >out && grep "^keep.[0-9a-f]\{16,\}$" out && cut -c6- out >packhash && -- 2.30.0.617.g56c4b15f3c-goog ^ permalink raw reply related [flat|nested] 229+ messages in thread
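As a rough illustration of why v2 passes one --index-pack-arg per argument instead of a single encoded string, the sketch below collects repeated options into a NULL-terminated argv. It is not the patch's code: collect_index_pack_args and sketch_skip_prefix are hypothetical helpers (the latter stands in for git's skip_prefix()), and the fixed-size output array replaces the strvec used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for git's skip_prefix(): on a match, point *out past the prefix. */
static int sketch_skip_prefix(const char *s, const char *prefix,
			      const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(s, prefix, n))
		return 0;
	*out = s + n;
	return 1;
}

/*
 * Gather the value of every "--index-pack-arg=<value>" in argv into
 * "out", in order, and NULL-terminate the result. "max" is the number
 * of slots in "out", including the one for the terminator.
 * Returns the number of values collected, or -1 if they do not fit.
 */
static int collect_index_pack_args(int argc, const char **argv,
				   const char **out, int max)
{
	int i, nr = 0;

	assert(argv != NULL && out != NULL && max > 0);
	for (i = 0; i < argc; i++) {
		const char *value;

		if (!sketch_skip_prefix(argv[i], "--index-pack-arg=", &value))
			continue;
		if (nr + 1 >= max)
			return -1; /* no room left for value plus terminator */
		out[nr++] = value;
	}
	out[nr] = NULL;
	return nr;
}
```

Because each value arrives as its own argv element, a value containing spaces needs no URL-encoding or splitting step, which was the review concern with the v1 design.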
* Re: [PATCH v2 2/4] http-fetch: allow custom index-pack args 2021-02-22 19:20 ` [PATCH v2 2/4] http-fetch: " Jonathan Tan @ 2021-02-23 13:17 ` Ævar Arnfjörð Bjarmason 2021-02-23 16:51 ` Jonathan Tan 2021-03-05 0:19 ` Jonathan Nieder 1 sibling, 1 reply; 229+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2021-02-23 13:17 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, gitster On Mon, Feb 22 2021, Jonathan Tan wrote: > diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt > index 4deb4893f5..9fa17b60e4 100644 > --- a/Documentation/git-http-fetch.txt > +++ b/Documentation/git-http-fetch.txt > @@ -41,11 +41,17 @@ commit-id:: > <commit-id>['\t'<filename-as-in--w>] > > --packfile=<hash>:: > - Instead of a commit id on the command line (which is not expected in > + For internal use only. Instead of a commit id on the command > + line (which is not expected in > this case), 'git http-fetch' fetches the packfile directly at the given > URL and uses index-pack to generate corresponding .idx and .keep files. > The hash is used to determine the name of the temporary file and is > - arbitrary. The output of index-pack is printed to stdout. > + arbitrary. The output of index-pack is printed to stdout. Requires > + --index-pack-args. > + > +--index-pack-args=<args>:: > + For internal use only. The command to run on the contents of the > + downloaded pack. Arguments are URL-encoded separated by spaces. > > --recover:: > Verify that everything reachable from target is fetched. 
Used after > diff --git a/fetch-pack.c b/fetch-pack.c > index 876f90c759..aeac010b0b 100644 > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -1645,6 +1645,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, > strvec_pushf(&cmd.args, "--packfile=%.*s", > (int) the_hash_algo->hexsz, > packfile_uris.items[i].string); > + strvec_push(&cmd.args, "--index-pack-arg=index-pack"); > + strvec_push(&cmd.args, "--index-pack-arg=--stdin"); > + strvec_push(&cmd.args, "--index-pack-arg=--keep"); The docs say --*-args, but the code checks --*arg, that seems like a mistake that should be fixed to make the code/tests use the plural form, no? ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v2 2/4] http-fetch: allow custom index-pack args 2021-02-23 13:17 ` Ævar Arnfjörð Bjarmason @ 2021-02-23 16:51 ` Jonathan Tan 0 siblings, 0 replies; 229+ messages in thread From: Jonathan Tan @ 2021-02-23 16:51 UTC (permalink / raw) To: avarab; +Cc: jonathantanmy, git, gitster > > diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt > > index 4deb4893f5..9fa17b60e4 100644 > > --- a/Documentation/git-http-fetch.txt > > +++ b/Documentation/git-http-fetch.txt > > @@ -41,11 +41,17 @@ commit-id:: > > <commit-id>['\t'<filename-as-in--w>] > > > > --packfile=<hash>:: > > - Instead of a commit id on the command line (which is not expected in > > + For internal use only. Instead of a commit id on the command > > + line (which is not expected in > > this case), 'git http-fetch' fetches the packfile directly at the given > > URL and uses index-pack to generate corresponding .idx and .keep files. > > The hash is used to determine the name of the temporary file and is > > - arbitrary. The output of index-pack is printed to stdout. > > + arbitrary. The output of index-pack is printed to stdout. Requires > > + --index-pack-args. > > + > > +--index-pack-args=<args>:: > > + For internal use only. The command to run on the contents of the > > + downloaded pack. Arguments are URL-encoded separated by spaces. > > > > --recover:: > > Verify that everything reachable from target is fetched. 
Used after > > diff --git a/fetch-pack.c b/fetch-pack.c > > index 876f90c759..aeac010b0b 100644 > > --- a/fetch-pack.c > > +++ b/fetch-pack.c > > @@ -1645,6 +1645,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args, > > strvec_pushf(&cmd.args, "--packfile=%.*s", > > (int) the_hash_algo->hexsz, > > packfile_uris.items[i].string); > > + strvec_push(&cmd.args, "--index-pack-arg=index-pack"); > > + strvec_push(&cmd.args, "--index-pack-arg=--stdin"); > > + strvec_push(&cmd.args, "--index-pack-arg=--keep"); > > The docs say --*-args, but the code checks --*arg, that seems like a > mistake that should be fixed to make the code/tests use the plural form, > no? Thanks for catching that. Originally it was plural since this single argument would give multiple arguments to index-pack, but now each argument gives only a single argument, so "arg" is correct. I'll update it in the next version. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH v2 2/4] http-fetch: allow custom index-pack args
  2021-02-22 19:20 ` [PATCH v2 2/4] http-fetch: " Jonathan Tan
  2021-02-23 13:17 ` Ævar Arnfjörð Bjarmason
@ 2021-03-05 0:19 ` Jonathan Nieder
  2021-03-05 1:16 ` [PATCH] fetch-pack: do not mix --pack_header and packfile uri Jonathan Tan
  1 sibling, 1 reply; 229+ messages in thread
From: Jonathan Nieder @ 2021-03-05 0:19 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, avarab, gitster, Nathan Mulcahey

Hi Jonathan,

Jonathan Tan wrote:

> This is the next step in teaching fetch-pack to pass its index-pack
> arguments when processing packfiles referenced by URIs.
>
> The "--keep" in fetch-pack.c will be replaced with a full message in a
> subsequent commit.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  Documentation/git-http-fetch.txt | 10 ++++++++--
>  fetch-pack.c                     |  3 +++
>  http-fetch.c                     | 20 +++++++++++++++-----
>  t/t5550-http-fetch-dumb.sh       |  5 ++++-
>  4 files changed, 30 insertions(+), 8 deletions(-)

This is producing an interesting symptom for me:

    git init repro
    cd repro
    git config fetch.uriprotocols https
    git config remote.origin.url https://fuchsia.googlesource.com/fuchsia
    git config remote.origin.fetch +refs/heads/*:refs/remotes/origin/*
    git fetch -p origin

Expected result: fetches

Actual result:

    fatal: pack has bad object at offset 12: unknown object type 5
    fatal: finish_http_pack_request gave result -1
    fatal: fetch-pack: expected keep then TAB at start of http-fetch output

Thanks to Nathan Mulcahey (cc-ed) for a clear report.

Bisects to b664e9ffa153189dae9b88f32d1c5fedcf85056a, which is part of
"next" and 2.31.0-rc1. Another report of the same is at
https://crbug.com/1184814.

Known problem?

Thanks,
Jonathan
* [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-05 0:19 ` Jonathan Nieder @ 2021-03-05 1:16 ` Jonathan Tan 2021-03-05 1:52 ` Junio C Hamano 2021-03-05 18:50 ` Junio C Hamano 0 siblings, 2 replies; 229+ messages in thread From: Jonathan Tan @ 2021-03-05 1:16 UTC (permalink / raw) To: git; +Cc: Jonathan Tan, jrnieder, nmulcahey When fetching (as opposed to cloning) from a repository with packfile URIs enabled, an error like this may occur: fatal: pack has bad object at offset 12: unknown object type 5 fatal: finish_http_pack_request gave result -1 fatal: fetch-pack: expected keep then TAB at start of http-fetch output This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs, use index-pack arg", 2021-02-22), when the index-pack args used when processing the inline packfile of a fetch response and when processing packfile URIs were unified. This bug happens because fetch, by default, partially reads (and consumes) the header of the inline packfile to determine if it should store the downloaded objects as a packfile or loose objects, and thus passes --pack_header=<...> to index-pack to inform it that some bytes are missing. However, when it subsequently fetches the additional packfiles linked by URIs, it reuses the same index-pack arguments, thus wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are missing. This does not happen when cloning because "git clone" always passes do_keep, which instructs the fetch mechanism to always retain the packfile, eliminating the need to read the header. There are a few ways to fix this, including filtering out pack_header arguments when downloading the additional packfiles, but I decided to stick to always using index-pack throughout when packfile URIs are present - thus, Git no longer needs to read the bytes, and no longer needs --pack_header here. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- Here's a fix for this issue. This is on jt/transfer-fsck-across-packs. 
One simplification that we could do is to eliminate the unpack-objects codepath. As far as I understand, the main advantage of writing loose objects is that we have automatic SHA-1 collision detection, but we have such mitigations when writing packs too, so that might not be as large a benefit as we think. This simplification would have enabled us to avoid this bug, I think. --- fetch-pack.c | 4 ++-- t/t5702-protocol-v2.sh | 21 +++++++++++++++++++++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index f9def5ac74..e990607742 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -852,7 +852,7 @@ static int get_pack(struct fetch_pack_args *args, else demux.out = xd[0]; - if (!args->keep_pack && unpack_limit) { + if (!args->keep_pack && unpack_limit && !index_pack_args) { if (read_pack_header(demux.out, &header)) die(_("protocol error: bad pack header")); @@ -885,7 +885,7 @@ static int get_pack(struct fetch_pack_args *args, strvec_push(&cmd.args, "-v"); if (args->use_thin_pack) strvec_push(&cmd.args, "--fix-thin"); - if (do_keep && (args->lock_pack || unpack_limit)) { + if ((do_keep || index_pack_args) && (args->lock_pack || unpack_limit)) { char hostname[HOST_NAME_MAX + 1]; if (xgethostname(hostname, sizeof(hostname))) xsnprintf(hostname, sizeof(hostname), "localhost"); diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index b1bc73a9a9..9df1ec82ca 100755 --- a/t/t5702-protocol-v2.sh +++ b/t/t5702-protocol-v2.sh @@ -853,6 +853,27 @@ test_expect_success 'part of packfile response provided as URI' ' test_line_count = 6 filelist ' +test_expect_success 'packfile URIs with fetch instead of clone' ' + P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && + rm -rf "$P" http_child log && + + git init "$P" && + git -C "$P" config "uploadpack.allowsidebandall" "true" && + + echo my-blob >"$P/my-blob" && + git -C "$P" add my-blob && + git -C "$P" commit -m x && + + configure_exclusion "$P" my-blob >h && + + git init http_child && + + 
GIT_TEST_SIDEBAND_ALL=1 \ + git -C http_child -c protocol.version=2 \ + -c fetch.uriprotocols=http,https \ + fetch "$HTTPD_URL/smart/http_parent" +' + test_expect_success 'fetching with valid packfile URI but invalid hash fails' ' P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && rm -rf "$P" http_child log && -- 2.30.1.766.gb4fecdf3b7-goog ^ permalink raw reply related [flat|nested] 229+ messages in thread
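The first error line in the report, "unknown object type 5" at offset 12, appears consistent with the explanation in the commit message above. The following is a sketch of the arithmetic, not something stated in the thread. It assumes the standard packfile entry encoding (the object type sits in bits 4-6 of an entry's first byte), and it assumes that index-pack, once told via --pack_header that the 12-byte header was already consumed, treats the next byte it reads as the start of the first object. When the stream is in fact a complete pack, that byte is the 'P' of the "PACK" signature. All helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Object type numbers in the pack format; 0 and 5 are not valid types. */
enum pack_obj_type {
	PACK_COMMIT = 1,
	PACK_TREE = 2,
	PACK_BLOB = 3,
	PACK_TAG = 4,
	PACK_OFS_DELTA = 6,
	PACK_REF_DELTA = 7
};

/* Extract the 3-bit type field from the first byte of an object entry. */
static unsigned pack_entry_type(unsigned char first_byte)
{
	return (first_byte >> 4) & 7;
}

static int pack_entry_type_is_valid(unsigned type)
{
	return type != 0 && type != 5;
}

/*
 * Type seen by a reader that believes the pack header has already been
 * consumed and therefore parses stream[0] as the first object entry.
 */
static unsigned first_entry_type(const unsigned char *stream, size_t len)
{
	assert(len > 0);
	return pack_entry_type(stream[0]);
}
```

Under these assumptions, 'P' is 0x50, so `(0x50 >> 4) & 7` yields 5, which is not a valid object type and matches the number in the error message.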
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-05 1:16 ` [PATCH] fetch-pack: do not mix --pack_header and packfile uri Jonathan Tan @ 2021-03-05 1:52 ` Junio C Hamano 2021-03-05 18:50 ` Junio C Hamano 1 sibling, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-05 1:52 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Jonathan Tan <jonathantanmy@google.com> writes: > One simplification that we could do is to eliminate the unpack-objects > codepath. As far as I understand, the main advantage of writing loose > objects is that we have automatic SHA-1 collision detection, but we have > such mitigations when writing packs too, so that might not be as large a > benefit as we think. This simplification would have enabled us to avoid > this bug, I think. My understanding is that the primary advantage of loose objects codepath is to help us avoid having too many little packs (instead, we can accumulate enough objects in the loose form and let GC pack them, at least the ones among them that are still reachable, into a single pack). Historically, the only mode of operation "repack" offers that reduces the number of remaining packs has been "do full reachability of the entire history, and pack everything into one", so avoiding creation of little packs and leaving things loose until we accumulate enough used to matter. With the geometric rolling repacking, it may not matter as much, and keeping everything packed, even in a small pack, might start to be overall win. So I am not opposed to such a simplification; we may not be ready for it right now, but I think it would be a sensible future direction. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-05 1:16 ` [PATCH] fetch-pack: do not mix --pack_header and packfile uri Jonathan Tan 2021-03-05 1:52 ` Junio C Hamano @ 2021-03-05 18:50 ` Junio C Hamano 2021-03-05 19:46 ` Junio C Hamano 2021-03-05 22:59 ` Jonathan Tan 1 sibling, 2 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-05 18:50 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Jonathan Tan <jonathantanmy@google.com> writes: > When fetching (as opposed to cloning) from a repository with packfile > URIs enabled, an error like this may occur: > > fatal: pack has bad object at offset 12: unknown object type 5 > fatal: finish_http_pack_request gave result -1 > fatal: fetch-pack: expected keep then TAB at start of http-fetch output > > This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs, > use index-pack arg", 2021-02-22), when the index-pack args used when > processing the inline packfile of a fetch response and when processing > packfile URIs were unified. > This bug happens because fetch, by default, partially reads (and > consumes) the header of the inline packfile to determine if it should > store the downloaded objects as a packfile or loose objects, and thus > passes --pack_header=<...> to index-pack to inform it that some bytes > are missing. ... and what the values in them are. > However, when it subsequently fetches the additional > packfiles linked by URIs, it reuses the same index-pack arguments, thus > wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are > missing. > > This does not happen when cloning because "git clone" always passes > do_keep, which instructs the fetch mechanism to always retain the > packfile, eliminating the need to read the header. > > There are a few ways to fix this, including filtering out pack_header > arguments when downloading the additional packfiles, but ... 
Avoiding the condition that exhibits the breakage is possible, and I
think it is what is done here, but I actually think that the only
right fix is to pass the correct arguments to commands we invoke in the
first place. Why are we reusing the same argument array to begin
with?

... goes back and reads the offending commit ...

    commit b664e9ffa153189dae9b88f32d1c5fedcf85056a
    Author: Jonathan Tan <jonathantanmy@google.com>
    Date:   Mon Feb 22 11:20:08 2021 -0800

        fetch-pack: with packfile URIs, use index-pack arg

        Unify the index-pack arguments used when processing the inline pack and
        when downloading packfiles referenced by URIs. This is done by teaching
        get_pack() to also store the index-pack arguments whenever at least one
        packfile URI is given, and then when processing the packfile URI(s),
        using the stored arguments.

This makes it sound like the entire idea of this offending commit
was wrong, and before it, the codepath that processed the packfile
fetched from the packfile URI was using index-pack correctly
by using index-pack arguments that are independent from the one that
is used to process the packfile given in-stream. Why isn't the fix
just a straight revert of the commit???

> This is on jt/transfer-fsck-across-packs.

Ouch. This definitely is -rc material.

> -	if (!args->keep_pack && unpack_limit) {
> +	if (!args->keep_pack && unpack_limit && !index_pack_args) {

This one makes sense as an "avoid conditions that reveal how badly
the code is broken" band-aid. When we have index-pack related
arguments, we cannot use the unpack-objects codepath even if we are
being fed a tiny pack, so there is no point peeking at the beginning
of the pack stream to find out how many objects it has.

OK.
> @@ -885,7 +885,7 @@ static int get_pack(struct fetch_pack_args *args, > strvec_push(&cmd.args, "-v"); > if (args->use_thin_pack) > strvec_push(&cmd.args, "--fix-thin"); > - if (do_keep && (args->lock_pack || unpack_limit)) { > + if ((do_keep || index_pack_args) && (args->lock_pack || unpack_limit)) { > char hostname[HOST_NAME_MAX + 1]; > if (xgethostname(hostname, sizeof(hostname))) > xsnprintf(hostname, sizeof(hostname), "localhost"); I do not quite get what this hunk is doing. Care to explain? Thanks. > diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh > index b1bc73a9a9..9df1ec82ca 100755 > --- a/t/t5702-protocol-v2.sh > +++ b/t/t5702-protocol-v2.sh > @@ -853,6 +853,27 @@ test_expect_success 'part of packfile response provided as URI' ' > test_line_count = 6 filelist > ' > > +test_expect_success 'packfile URIs with fetch instead of clone' ' > + P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && > + rm -rf "$P" http_child log && > + > + git init "$P" && > + git -C "$P" config "uploadpack.allowsidebandall" "true" && > + > + echo my-blob >"$P/my-blob" && > + git -C "$P" add my-blob && > + git -C "$P" commit -m x && > + > + configure_exclusion "$P" my-blob >h && > + > + git init http_child && > + > + GIT_TEST_SIDEBAND_ALL=1 \ > + git -C http_child -c protocol.version=2 \ > + -c fetch.uriprotocols=http,https \ > + fetch "$HTTPD_URL/smart/http_parent" > +' > + > test_expect_success 'fetching with valid packfile URI but invalid hash fails' ' > P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && > rm -rf "$P" http_child log && ^ permalink raw reply [flat|nested] 229+ messages in thread
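The remark "... and what the values in them are" and the first hunk reviewed above can be illustrated with a small sketch. The 12-byte layout (a "PACK" signature followed by a big-endian version and object count) is the documented pack header. The `<version>,<count>` form of the --pack_header value is an assumption about what fetch-pack passes and is not spelled out in this thread. `should_peek_header()` only mirrors the patched condition quoted above, and every name here is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PACK_HEADER_SIZE 12

struct pack_header_info {
	uint32_t version;
	uint32_t nr_objects;
};

/* Read a 32-bit big-endian integer. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse a 12-byte pack header; returns 0 on success, -1 on a bad signature. */
static int parse_pack_header(const unsigned char buf[PACK_HEADER_SIZE],
			     struct pack_header_info *out)
{
	if (memcmp(buf, "PACK", 4))
		return -1;
	out->version = read_be32(buf + 4);
	out->nr_objects = read_be32(buf + 8);
	return 0;
}

/* Mirrors the patched condition: peek only when no index-pack args are set. */
static int should_peek_header(int keep_pack, unsigned unpack_limit,
			      int have_index_pack_args)
{
	return !keep_pack && unpack_limit && !have_index_pack_args;
}

/*
 * Format the option that tells the consumer which header bytes were
 * already read (assumed form: --pack_header=<version>,<count>).
 */
static int format_pack_header_arg(char *dst, size_t dstlen,
				  const struct pack_header_info *h)
{
	int n = snprintf(dst, dstlen, "--pack_header=%lu,%lu",
			 (unsigned long)h->version,
			 (unsigned long)h->nr_objects);

	assert(n > 0 && (size_t)n < dstlen);
	return n;
}
```

Once the header has been consumed this way, the downstream command cannot read it from its input any more, so the option has to carry the values. That is also why the option is only valid for the one stream whose header was actually peeked at.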
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri
  2021-03-05 18:50 ` Junio C Hamano
@ 2021-03-05 19:46 ` Junio C Hamano
  2021-03-05 23:11 ` Jonathan Tan
  2021-03-05 23:20 ` Junio C Hamano
  2021-03-05 22:59 ` Jonathan Tan
  1 sibling, 2 replies; 229+ messages in thread
From: Junio C Hamano @ 2021-03-05 19:46 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey

Junio C Hamano <gitster@pobox.com> writes:

> Avoiding the condition that exhibits the breakage is possible, and I
> think it is what is done here, but I actually think that the only
> right fix is to pass the correct arguments to commands we invoke in the
> first place. Why are we reusing the same argument array to begin
> with?
>
> ... goes back and reads the offending commit ...
>
>     commit b664e9ffa153189dae9b88f32d1c5fedcf85056a
>     Author: Jonathan Tan <jonathantanmy@google.com>
>     Date:   Mon Feb 22 11:20:08 2021 -0800
>
>         fetch-pack: with packfile URIs, use index-pack arg
>
>         Unify the index-pack arguments used when processing the inline pack and
>         when downloading packfiles referenced by URIs. This is done by teaching
>         get_pack() to also store the index-pack arguments whenever at least one
>         packfile URI is given, and then when processing the packfile URI(s),
>         using the stored arguments.
>
> This makes it sound like the entire idea of this offending commit
> was wrong, and before it, the codepath that processed the packfile
> fetched from the packfile URI was using index-pack correctly
> by using index-pack arguments that are independent from the one that
> is used to process the packfile given in-stream. Why isn't the fix
> just a straight revert of the commit???

By the way, the band-aid in this patch may be OK for the upcoming
release (purely because it is easy to see that it is sufficient for
today's codebase), but I said the above because I worry about the
health of the codebase in the longer term. The "pass_header" may
not remain the only difference between the URI packfile and
in-stream packfile in the way they make index-pack invocations.

>> This is on jt/transfer-fsck-across-packs.
>
> Ouch. This definitely is -rc material.

Thanks.
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri
  2021-03-05 19:46 ` Junio C Hamano
@ 2021-03-05 23:11 ` Jonathan Tan
  2021-03-05 23:20 ` Junio C Hamano
  1 sibling, 0 replies; 229+ messages in thread
From: Jonathan Tan @ 2021-03-05 23:11 UTC (permalink / raw)
To: gitster; +Cc: jonathantanmy, git, jrnieder, nmulcahey

> By the way, the band-aid in this patch may be OK for the upcoming
> release (purely because it is easy to see that it is sufficient for
> today's codebase), but I said the above because I worry about the
> health of the codebase in the longer term. The "pass_header" may
> not remain the only difference between the URI packfile and
> in-stream packfile in the way they make index-pack invocations.

That is true, but at the same time, I think it's better to have the
arguments be the same because there are options (e.g. --promisor and
--fsck-objects) that have to be duplicated, and I think that for the
most part, the URI packfiles and the inline packfile will be processed
identically.
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri
  2021-03-05 19:46 ` Junio C Hamano
  2021-03-05 23:11 ` Jonathan Tan
@ 2021-03-05 23:20 ` Junio C Hamano
  1 sibling, 0 replies; 229+ messages in thread
From: Junio C Hamano @ 2021-03-05 23:20 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey

Junio C Hamano <gitster@pobox.com> writes:

>> This makes it sound like the entire idea of this offending commit
>> was wrong, and before it, the codepath that processed the packfile
>> fetched from the packfile URI was using index-pack correctly
>> by using index-pack arguments that are independent from the one that
>> is used to process the packfile given in-stream. Why isn't the fix
>> just a straight revert of the commit???
>
> By the way, the band-aid in this patch may be OK for the upcoming
> release (purely because it is easy to see that it is sufficient for
> today's codebase), but I said the above because I worry about the
> health of the codebase in the longer term. The "pass_header" may
> not remain the only difference between the URI packfile and
> in-stream packfile in the way they make index-pack invocations.

For example, the URI one is presumably a CDN-hosted long-term pack,
which may be a good candidate for --keep, and the in-stream one,
especially when the packfile URI feature is used, can be expected to
be recent small leftover bits that we likely do not want to keep (in
fact, if they are small enough, we'd prefer to keep them loose).
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-05 18:50 ` Junio C Hamano 2021-03-05 19:46 ` Junio C Hamano @ 2021-03-05 22:59 ` Jonathan Tan 2021-03-05 23:18 ` Junio C Hamano 1 sibling, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-03-05 22:59 UTC (permalink / raw) To: gitster; +Cc: jonathantanmy, git, jrnieder, nmulcahey > Jonathan Tan <jonathantanmy@google.com> writes: > > > When fetching (as opposed to cloning) from a repository with packfile > > URIs enabled, an error like this may occur: > > > > fatal: pack has bad object at offset 12: unknown object type 5 > > fatal: finish_http_pack_request gave result -1 > > fatal: fetch-pack: expected keep then TAB at start of http-fetch output > > > > This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs, > > use index-pack arg", 2021-02-22), when the index-pack args used when > > processing the inline packfile of a fetch response and when processing > > packfile URIs were unified. > > > This bug happens because fetch, by default, partially reads (and > > consumes) the header of the inline packfile to determine if it should > > store the downloaded objects as a packfile or loose objects, and thus > > passes --pack_header=<...> to index-pack to inform it that some bytes > > are missing. > > ... and what the values in them are. Ah, that's true. > > However, when it subsequently fetches the additional > > packfiles linked by URIs, it reuses the same index-pack arguments, thus > > wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are > > missing. > > > > This does not happen when cloning because "git clone" always passes > > do_keep, which instructs the fetch mechanism to always retain the > > packfile, eliminating the need to read the header. > > > > There are a few ways to fix this, including filtering out pack_header > > arguments when downloading the additional packfiles, but ... 
>
> Avoiding the condition that exhibits the breakage is possible, and I
> think it is what is done here, but I actually think that the only
> right fix is to pass the correct arguments to commands we invoke in the
> first place. Why are we reusing the same argument array to begin
> with?
>
> ... goes back and reads the offending commit ...
>
>     commit b664e9ffa153189dae9b88f32d1c5fedcf85056a
>     Author: Jonathan Tan <jonathantanmy@google.com>
>     Date:   Mon Feb 22 11:20:08 2021 -0800
>
>         fetch-pack: with packfile URIs, use index-pack arg
>
>         Unify the index-pack arguments used when processing the inline pack and
>         when downloading packfiles referenced by URIs. This is done by teaching
>         get_pack() to also store the index-pack arguments whenever at least one
>         packfile URI is given, and then when processing the packfile URI(s),
>         using the stored arguments.
>
> This makes it sound like the entire idea of this offending commit
> was wrong, and before it, the codepath that processed the packfile
> fetched from the packfile URI was using index-pack correctly
> by using index-pack arguments that are independent from the one that
> is used to process the packfile given in-stream. Why isn't the fix
> just a straight revert of the commit???

I should probably have written more in the commit message to justify the
unification, but it is also part of a bug fix (in particular,
--fsck-objects wasn't being passed to the index-pack that indexed the
packfiles linked by URI) and for code health purposes (to prevent future
bugs by eliminating the divergence). So reverting that commit would
reintroduce another bug.
> > @@ -885,7 +885,7 @@ static int get_pack(struct fetch_pack_args *args, > > strvec_push(&cmd.args, "-v"); > > if (args->use_thin_pack) > > strvec_push(&cmd.args, "--fix-thin"); > > - if (do_keep && (args->lock_pack || unpack_limit)) { > > + if ((do_keep || index_pack_args) && (args->lock_pack || unpack_limit)) { > > char hostname[HOST_NAME_MAX + 1]; > > if (xgethostname(hostname, sizeof(hostname))) > > xsnprintf(hostname, sizeof(hostname), "localhost"); > > I do not quite get what this hunk is doing. Care to explain? The "do_keep" part was unnecessarily restrictive and I used a band-aid solution to loosen it. I think this started from 88e2f9ed8e ("introduce fetch-object: fetch one promisor object", 2017-12-05) where I might have misunderstood what do_keep was meant to do, and taught fetch-pack to use "index-pack" if do_keep is true or args->from_promisor is true. What I should have done is to set do_keep to true if args->from_promisor is true. Future commits continued to do that with fsck_objects and index_pack_args. Maybe what I can do is to refactor get_pack() so that do_keep retains its original meaning of whether to use "index-pack" or "unpack-objects", and then we wouldn't need this line. What do you think (code-wise and whether this fits in with the release schedule, if we want to get this in before release)? ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-05 22:59 ` Jonathan Tan @ 2021-03-05 23:18 ` Junio C Hamano 2021-03-08 19:14 ` Jonathan Tan 0 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-03-05 23:18 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Jonathan Tan <jonathantanmy@google.com> writes: > I should probably have written more in the commit message to justify the > unification, but it is also part of a bug fix (in particular, > --fsck-objects wasn't being passed to the index-pack that indexed the > packfiles linked by URI) and for code health purposes (to prevent future > bugs by eliminating the divergence). So reverting that commit would > reintroduce another bug. Not necessarily. Unifying two that do not inherently have to be identical makes it impossible to pass two different things, and that is what we are seeing in the bug this patch is trying to fix (by forcing the two to be identical by eliminating the unpack-objects codepath in certain cases). The right "fix" for the original bug would have been to keep them still separate yet making it easy to pass args that must be used in both of them, no? >> > - if (do_keep && (args->lock_pack || unpack_limit)) { >> > + if ((do_keep || index_pack_args) && (args->lock_pack || unpack_limit)) { >> > char hostname[HOST_NAME_MAX + 1]; >> > if (xgethostname(hostname, sizeof(hostname))) >> > xsnprintf(hostname, sizeof(hostname), "localhost"); >> >> I do not quite get what this hunk is doing. Care to explain? > > The "do_keep" part was unnecessarily restrictive and I used a band-aid > solution to loosen it. I think this started from 88e2f9ed8e ("introduce > fetch-object: fetch one promisor object", 2017-12-05) where I might have > misunderstood what do_keep was meant to do, and taught fetch-pack to use > "index-pack" if do_keep is true or args->from_promisor is true. 
What I
> should have done is to set do_keep to true if args->from_promisor is
> true. Future commits continued to do that with fsck_objects and
> index_pack_args.

> Maybe what I can do is to refactor get_pack() so that do_keep retains
> its original meaning of whether to use "index-pack" or "unpack-objects",
> and then we wouldn't need this line. What do you think (code-wise and
> whether this fits in with the release schedule, if we want to get this
> in before release)?

How bad is the breakage this one is trying to fix? I know it would
only affect folks who have to interact with a server that uses the
packfile URI feature, but do they have a workaround, perhaps with a
configuration knob or command line option to ignore the packfile
URI, and how large is the affected population?

I cannot shake the feeling that we are seeing band-aid on top of
band-aid forced by having chosen to go in a wrong direction in the
beginning X-<, and prefer not to see the code drift even further in
the same direction; hence my earlier suggestion to go back to the
root cause by first reverting the wrong fix that introduced this bug
and fixing the original bug in a different way.

I dunno how involved the necessary surgery would be, though. If
this is easy to work around, perhaps it might be a better option for
the overall project to ship the upcoming release with this listed as
a known breakage.

Thanks.
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-05 23:18 ` Junio C Hamano @ 2021-03-08 19:14 ` Jonathan Tan 2021-03-08 19:34 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-03-08 19:14 UTC (permalink / raw) To: gitster; +Cc: jonathantanmy, git, jrnieder, nmulcahey > Jonathan Tan <jonathantanmy@google.com> writes: > > > I should probably have written more in the commit message to justify the > > unification, but it is also part of a bug fix (in particular, > > --fsck-objects wasn't being passed to the index-pack that indexed the > > packfiles linked by URI) and for code health purposes (to prevent future > > bugs by eliminating the divergence). So reverting that commit would > > reintroduce another bug. > > Not necessarily. Unifying two that do not inherently have to be > identical makes it impossible to pass two different things, and that > is what we are seeing in the bug this patch is trying to fix (by > forcing the two to be identical by eliminating the unpack-objects > codepath in certain cases). > > The right "fix" for the original bug would have been to keep them > still separate yet making it easy to pass args that must be used in > both of them, no? OK - I'll do this. > >> > - if (do_keep && (args->lock_pack || unpack_limit)) { > >> > + if ((do_keep || index_pack_args) && (args->lock_pack || unpack_limit)) { > >> > char hostname[HOST_NAME_MAX + 1]; > >> > if (xgethostname(hostname, sizeof(hostname))) > >> > xsnprintf(hostname, sizeof(hostname), "localhost"); > >> > >> I do not quite get what this hunk is doing. Care to explain? > > > > The "do_keep" part was unnecessarily restrictive and I used a band-aid > > solution to loosen it. 
I think this started from 88e2f9ed8e ("introduce > > fetch-object: fetch one promisor object", 2017-12-05) where I might have > > misunderstood what do_keep was meant to do, and taught fetch-pack to use > > "index-pack" if do_keep is true or args->from_promisor is true. What I > > should have done is to set do_keep to true if args->from_promisor is > > true. Future commits continued to do that with fsck_objects and > > index_pack_args. > > > Maybe what I can do is to refactor get_pack() so that do_keep retains > > its original meaning of whether to use "index-pack" or "unpack-objects", > > and then we wouldn't need this line. What do you think (code-wise and > > whether this fits in with the release schedule, if we want to get this > > in before release)? > > How bad is the breakage this one is trying to fix? I know it would > only affect folks who have to interact with the server that uses > packfile URI feature, but do they have a workaround, perhaps with a > configuration knob or command line option to ignore the packfile > URI, Yes, there's a workaround (to disable packfile URIs from the client side using a config variable). > and how large is the affected population? The only issues I've seen are within $DAYJOB, and there, we can carry our own patch to fix this issue. So the affected population (right now) is probably not much (if it even exists). > I cannot shake the feeling that we are seeing band-aid on top of > band-aid forced by having chosen to go in a wrong direction in the > beginning X-<, and prefer not to see the code drift even further into > the same direction; hence my earlier suggestion to go back to the > root cause by first reverting the wrong fix that introduced this bug > and fixing the original bug in a different way. > > I dunno how involved the necessary surgery would be, though. If > this is easy to work around, perhaps it might be a better option for > the overall project to ship the upcoming release with this listed as > a known breakage. 
I don't think it's too difficult - I think we'll only need to filter out the --pack_header when we figure out the arguments to pass for the packfiles given by URI. I'll take a look. ^ permalink raw reply [flat|nested] 229+ messages in thread
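For illustration, the filtering Jonathan describes here could look roughly like the sketch below. It is not the code in fetch-pack.c: struct arg_list and arg_push() are hypothetical stand-ins for git's strvec, copy_args_for_uri_packs() is an invented name, and the sketch assumes the option arrives as a single argument that starts with --pack_header.

```c
#include <assert.h>
#include <string.h>

#define MAX_ARGS 32

/* Hypothetical stand-in for git's strvec, with a fixed capacity. */
struct arg_list {
	const char *v[MAX_ARGS];
	int nr;
};

static void arg_push(struct arg_list *a, const char *s)
{
	assert(a->nr < MAX_ARGS);
	a->v[a->nr++] = s;
}

/* Returns 1 if arg is "--pack_header" or starts with "--pack_header=". */
static int is_pack_header_arg(const char *arg)
{
	static const char opt[] = "--pack_header";
	size_t len = sizeof(opt) - 1;

	if (strncmp(arg, opt, len))
		return 0;
	return arg[len] == '\0' || arg[len] == '=';
}

/*
 * Copy src into dst, skipping the one option that describes only the
 * in-stream pack (its header has already been consumed by the caller).
 */
static void copy_args_for_uri_packs(struct arg_list *dst,
				    const struct arg_list *src)
{
	int i;

	for (i = 0; i < src->nr; i++)
		if (!is_pack_header_arg(src->v[i]))
			arg_push(dst, src->v[i]);
}
```

Everything else in the source array is copied as-is, including the command name in slot 0, which is exactly the property the rest of the thread goes on to question.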
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-08 19:14 ` Jonathan Tan @ 2021-03-08 19:34 ` Junio C Hamano 2021-03-09 19:13 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-03-08 19:34 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Jonathan Tan <jonathantanmy@google.com> writes: >> I dunno how involved the necessary surgery would be, though. If >> this is easy to work around, perhaps it might be a better option for >> the overall project to ship the upcoming release with this listed as >> a known breakage. > > I don't think it's too difficult - I think we'll only need to filter out > the --pack_header when we figure out the arguments to pass for the > packfiles given by URI. I'll take a look. What you sent earlier is a much better band-aid than "keep the single args array but filter an element out in only one codepath" band-aid, I would think. Any change that is more involved than a single-liner trivial bugfix would be too late for this cycle, as we'd be cutting -rc2 by the end of tomorrow. Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-08 19:34 ` Junio C Hamano @ 2021-03-09 19:13 ` Junio C Hamano 2021-03-10 5:24 ` Junio C Hamano 2021-03-10 16:57 ` Jonathan Tan 0 siblings, 2 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-09 19:13 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Junio C Hamano <gitster@pobox.com> writes: > Jonathan Tan <jonathantanmy@google.com> writes: > >>> I dunno how involved the necessary surgery would be, though. If >>> this is easy to work around, perhaps it might be a better option for >>> the overall project to ship the upcoming release with this listed as >>> a known breakage. >> >> I don't think it's too difficult - I think we'll only need to filter out >> the --pack_header when we figure out the arguments to pass for the >> packfiles given by URI. I'll take a look. > > What you sent earlier is a much better band-aid than "keep the > single args array but filter an element out in only one codepath" > band-aid, I would think. > > Any change that is more involved than a single-liner trivial bugfix > would be too late for this cycle, as we'd be cutting -rc2 by the end > of tomorrow. I was looking at the index_pack_args vs pass_header codepath in fetch-pack.c again after finishing the -rc2 stuff, and noticed something curious. Before running the command to process in-stream packdata, we have this bit: if (index_pack_args) { int i; for (i = 0; i < cmd.args.nr; i++) strvec_push(index_pack_args, cmd.args.v[i]); } where cmd.args is what the original code (before the "we need to prepare the index pack arguments for the offline HTTP transfer" logic was bolted onto this codepath), so it could of course have things like "--fix-thin", "--promisor", when we are processing an in-stream packfile that has sufficiently large number of objects and choose "index-pack" to process it. 
None of them should be given to the "index-pack" that processes the offline packfile that is given via the packfile URI mechanism. Also, because this loop copies everything in cmd.args, if our in-stream packdata is small, cmd.args.v[0] would be "unpack-objects", and we end up asking the command to explode the (presumably large enough to be worth pre-generating and serving via CDN) packfile that is given via the packfile URI mechanism. What I think I am seeing in the code is that there are many things other than "pass_header" that fundamentally cannot be reused between the processing of the in-stream packdata and the offline packfile given by the packfile URI (e.g. the in-stream one may want to use "unpack-objects" to avoid accumulating too many tiny packs, so there is nothing to be shared with "index-pack" that will always be used for the offline one), and any attempt to "reuse" cmd.args while "filtering out" inappropriate bits is fragile and unfruitful. Instead, I think we should not touch index_pack in the earlier part of the function at all (both reading, writing, or even checking for NULL-ness), and use the "if (index_pack_args)" block we already have (i.e. the one before we call start_command() to process the in-stream packdata) to decide what the command line to process the offline pack should look like. That way, we won't ever risk such a confusion like running "unpack-objects" instead of "index-pack" (but we can choose to do so deliberately, of course---the important point is to recognise that the in-stream pack and the offline one are independent and we should decide how to cook them separately). ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-09 19:13 ` Junio C Hamano @ 2021-03-10 5:24 ` Junio C Hamano 2021-03-10 16:57 ` Jonathan Tan 1 sibling, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-10 5:24 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Junio C Hamano <gitster@pobox.com> writes: > Instead, I think we should not touch index_pack in the earlier part > of the function at all (both reading, writing, or even checking for > NULL-ness), ... I have to take the "NULL-ness" part back. As the NULL-ness of the variable is also used to convey that URI packfile is in use, which in turn means we have to tell "index-pack" we are going to use for processing in-stream packfile that the objects in the pack may be pointing at objects that are not yet available. So we do need to check for the NULL-ness in order to decide what command line to use to process the in-stream packdata. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-09 19:13 ` Junio C Hamano 2021-03-10 5:24 ` Junio C Hamano @ 2021-03-10 16:57 ` Jonathan Tan 2021-03-10 18:30 ` Junio C Hamano 1 sibling, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-03-10 16:57 UTC (permalink / raw) To: gitster; +Cc: jonathantanmy, git, jrnieder, nmulcahey > I was looking at the index_pack_args vs pass_header codepath in > fetch-pack.c again after finishing the -rc2 stuff, and noticed > something curious. > > Before running the command to process in-stream packdata, we have > this bit: > > if (index_pack_args) { > int i; > > for (i = 0; i < cmd.args.nr; i++) > strvec_push(index_pack_args, cmd.args.v[i]); > } > > where cmd.args is what the original code (before the "we need to > prepare the index pack arguments for the offline HTTP transfer" > logic was bolted onto this codepath), so it could of course have > things like "--fix-thin", "--promisor", when we are processing an > in-stream packfile that has sufficiently large number of objects and > choose "index-pack" to process it. None of them should be given to > the "index-pack" that processes the offline packfile that is given > via the packfile URI mechanism. Thanks for continuing to take a look at this. My thinking is that all packfiles (inline or through URI) should be processed in as similar a manner as possible. Looking at the potential arguments passed to index-pack: 1. --shallow-file (before "index-pack", that is, an argument passed to "git" itself and not the subcommand) 2. index-pack 3. --stdin 4. -v 5. --fix-thin 6. --keep 7. [--check-self-contained-and-connected is guarded by !index_pack_args so we won't be passing it] 8. --promisor 9. --pack_header 10. --fsck-objects 11. [--strict appears in an "else" block opposite index_pack_args so we won't be passing it] You mentioned --fix-thin (5) and --promisor (8). 
Why do you think that none of these should be given to the "index-pack" that processes the packfiles given by URI? Perhaps it could be argued that these extra packfiles don't need --fix-thin (but I would say that I think servers should be allowed to serve thin packfiles through URI too), but I think that --promisor is necessary (so that a server could, for example, offload all trees and commits to a packfile in a CDN, and offload all blobs to a separate packfile in a CDN). Looking at this list, I think that all the arguments (except 9, which has been fixed) are necessary (or at least useful) for indexing a packfile given by URI. > Also, because this loop copies everything in cmd.args, if our > in-stream packdata is small, cmd.args.v[0] would be "unpack-objects", > and we end up asking the command to explode the (presumably large > enough to be worth pre-generating and serving via CDN) packfile that > is given via the packfile URI mechanism. I specifically guard against this through the "if (do_keep || args->from_promisor || index_pack_args || fsck_objects) {" line (which is a complicated line, unfortunately). > What I think I am seeing in the code is that there are many things > other than "pass_header" that fundamentally cannot be reused between > the processing of the in-stream packdata and the offline packfile > given by the packfile URI (e.g. the in-stream one may want to use > "unpack-objects" to avoid accumulating too many tiny packs, so there > is nothing to be shared with "index-pack" that will always be used > for the offline one), and any attempt to "reuse" cmd.args while > "filtering out" inappropriate bits is fragile and unfruitful. > > Instead, I think we should not touch index_pack in the earlier part > of the function at all (both reading, writing, or even checking for > NULL-ness), and use the "if (index_pack_args)" block we already have > (i.e. 
the one before we call start_command() to process the > in-stream packdata) to decide what the command line to process the > offline pack should look like. That way, we won't ever risk such a > confusion like running "unpack-objects" instead of "index-pack" (but > we can choose to do so deliberately, of course---the important point > is to recognise that the in-stream pack and the offline one are > independent and we should decide how to cook them separately). We could do that, although I'm concerned that we would be repeating logic a lot (deciding whether or not to pass an argument). One other approach is for each "strvec_pushf?(&cmd.args" to also have another line that pushes to index_pack_args if it's relevant. But as I said earlier, I think that all or nearly all arguments will be relevant to both. ^ permalink raw reply [flat|nested] 229+ messages in thread
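The "push to both where relevant" alternative mentioned at the end of this message might be sketched as follows. This is only an illustration under stated assumptions: struct arg_list, arg_push(), push_arg() and build_index_pack_args() are hypothetical names (git itself uses strvec), the per-option relevance flags are guesses made for the example, and the --pack_header value is an arbitrary sample.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 32

/* Hypothetical stand-in for git's strvec, with a fixed capacity. */
struct arg_list {
	const char *v[MAX_ARGS];
	int nr;
};

static void arg_push(struct arg_list *a, const char *s)
{
	assert(a->nr < MAX_ARGS);
	a->v[a->nr++] = s;
}

/*
 * Always push to the in-stream command line. Additionally push to the
 * URI-pack list when one is being collected (uri_args != NULL) and the
 * call site marks the option as relevant to URI packfiles.
 */
static void push_arg(struct arg_list *cmd_args, struct arg_list *uri_args,
		     const char *arg, int relevant_to_uri)
{
	arg_push(cmd_args, arg);
	if (uri_args && relevant_to_uri)
		arg_push(uri_args, arg);
}

/* Example call sites; which options count as relevant is an assumption. */
static void build_index_pack_args(struct arg_list *cmd_args,
				  struct arg_list *uri_args,
				  int from_promisor)
{
	push_arg(cmd_args, uri_args, "index-pack", 1);
	push_arg(cmd_args, uri_args, "--stdin", 1);
	push_arg(cmd_args, uri_args, "--fix-thin", 1);
	if (from_promisor)
		push_arg(cmd_args, uri_args, "--promisor", 1);
	/* Describes only the in-stream pack, so it is not shared. */
	push_arg(cmd_args, uri_args, "--pack_header=2,2242", 0);
}
```

The relevance decision is made once per option at the place where the option is added, at the cost of one extra parameter on every push.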
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-10 16:57 ` Jonathan Tan @ 2021-03-10 18:30 ` Junio C Hamano 2021-03-10 19:56 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-03-10 18:30 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Jonathan Tan <jonathantanmy@google.com> writes: > You mentioned --fix-thin (5) and --promisor (8). Why do you think that > none of these should be given to the "index-pack" that processes the > packfiles given by URI? Actually, --fix-thin is probably even worse than that. As the code processes the in-stream packdata before processing or even downloading the pregenerated URI packfile, the objects necessary to fix a "thin" in-stream packdata are likely to be unavailable (it is exactly the same problem as the one that made us delay the fsckobjects done in index-pack when URI packfile is involved, isn't it?). Even if the client asks --thin, the server side shouldn't produce a thin pack for in-stream packdata, no? > Perhaps it could be argued that these extra > packfiles don't need --fix-thin (but I would say that I think servers > should be allowed to serve thin packfiles through URI too), I agree that URI packfile could be thin; after all, the server end chooses, based on what the client claims to have, which pregenerated packfile to hand out, so it is perfectly fine to hand out a pregenerated packfile that is thin if the client asks for a thin pack and says it has base objects missing from that packfile. And because it is (assumed to be) pregenerated, we can make a requirement that no URI packfile should depend on objects that are created later than that (which means it won't depend on in-stream packdata). But we cannot process a thin in-stream packdata, if we are to process it first, right? 
> but I think > that --promisor is necessary (so that a server could, for example, > offload all trees and commits to a packfile in a CDN, and offload all > blobs to a separate packfile in a CDN). Yes, both packfiles conceptually are given by that same server who promises to be always available to feed us everything we'd need later, so both packfiles should be marked to have come from the same promisor. So this is one example that happens to be sharable between the two. But I do not see it as an indication that the two packs inherently must be processed with the same options. > Looking at this list, I think that all the arguments (except 9, which > has been fixed) are necessary (or at least useful) for indexing a > packfile given by URI. I have to say that this is focusing too much on the current need by going through how the current code handles two packs. Of course, if we start from "two must be the same" viewpoint, and restrict what the code can do by "guarding" bits that require the two to be different out based on "if (index_pack_args)", then the resulting code would invoke two index-pack the same way. I am more worried about the longer term code health, so "currently mostly the same" does not make a convincing argument for the future why the two must be processed the same way. >> Also, because this loop copies everything in cmd.args, if our >> in-stream packdata is small, cmd.args.v[0] would be "unpack-objects", >> and we end up asking the command to explode the (presumably large >> enough to be worth pre-generating and serving via CDN) packfile that >> is given via the packfile URI mechanism. > > I specifically guard against this through the "if (do_keep || > args->from_promisor || index_pack_args || fsck_objects) {" line (which > is a complicated line, unfortunately). I am aware of that line that forbids the in-stream packdata from getting unpacked into loose objects. 
But unless we were told to keep the resulting pack, or run fsck-objects via the index-pack, I do not see an inherent reason why the "most recent leftover bits that are not in the pregenerated pack offloaded to CDN" objects must be kept in a separate packfile, especially if the number of objects in it is smaller than the unpack limit threshold. In other words, I view that "guard" as one of the things that blinds us into thinking that the two packs should be handled the same way. It is the other way around---the guard is there only because the code wanted to handle the two packs the same way. When cloning from a server that offers bulk of old history in a URI packfile and an in-stream packfile, shouldn't the result be like cloning from the server back when it had only the objects in the URI packfile, and then fetching from it again when it acquired objects that came in the in-stream packfile? The objects that come during the second fetch would be left loose if there aren't that many, so that the third and subsequent fetches and local activity can accumulate enough loose objects to be packed into a single new pack, avoiding accumulation of too many tiny packs. And the "guard" breaks that only because this codepath wants to reuse cmd.args that is unrelated to populate index_pack_args. Isn't that an artificial limitation that we may want to eventually fix? When we want to fix that, the "options are mostly the same when we use the index-pack command for both packdata, so let's copy the entire command line" would come back and haunt us. The person who is doing the fix may be somebody other than you, so it may not matter to you today, but it will hurt somebody tomorrow. I already said that I think 2aec3bc4 (fetch-pack: do not mix --pack_header and packfile uri, 2021-03-04) is OK as a short-term fix for the upcoming release, but it does not change the fact that it is piling technical debt on top of existing technical debt. 
And that is why I am reacting against your earlier mention of "filtering out" rather strongly. The approach continues the "keep the single args array in the belief that two must be mostly the same", which I view as a misguided starting point that must be rethought. Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-10 18:30 ` Junio C Hamano @ 2021-03-10 19:56 ` Junio C Hamano 2021-03-10 23:29 ` Jonathan Tan 0 siblings, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-03-10 19:56 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Junio C Hamano <gitster@pobox.com> writes: > I already said that I think 2aec3bc4 (fetch-pack: do not mix > --pack_header and packfile uri, 2021-03-04) is OK as a short-term > fix for the upcoming release, but it does not change the fact that > it is piling technical debt on top of existing technical debt. > > And that is why I am reacting against your earlier mention of > "filtering out" rather strongly. The approach continues the "keep > the single args array in the belief that two must be mostly the > same", which I view as a misguided starting point that must be > rethought. Another way to think about the codepath is this. Can the bulk of get_pack() that deals with a single incoming packfile (from the part that makes the decision to use either index-pack or unpack-objects and chooses what options to pass to the command, to the part that actually calls run_command() and feeds the packdata to the command) be made into a helper function that handles one packdata stream and nothing else? Such a helper would most likely take as its parameters - a stream to read the packdata from (for in-stream packfile that is handled by get_pack(), we already have it available) - fetch_pack_args and other options that are meant to affect the operation of fetch-pack, among which are two bits that are of interest in this topic: if we want to run fsck-objects and if the entire fetch-pack is dealing with more than one packfile (currently, the only source of need to process multiple packfiles is packfile URI mechanism, but that does not have to stay that way). Then get_pack() can move a lot of code out of it to this helper and just call it. 
The processing of the other packfile obtained by the packfile URI mechanism out of band can open the packstream and call the helper the same way. When packfile URI mechanism is in use, both invocations of the helper would get "you are not alone so fsck may hit missing objects" bit, if fsck-objects are asked for. That would avoid the "duplicated logic" and still allow the code to choose the best disposition of the incoming packdata per packfile. In an extreme case, it is not hard to imagine that somebody prepares a very small base packfile and feeds it via packfile URI mechanism, but has accumulated so many objects that are not yet rolled into an updated base packfile---cloning from such a repository may result in running unpack-objects for the packfile that came out of band, while processing the in-stream packfile with index-pack. Hmm? ^ permalink raw reply [flat|nested] 229+ messages in thread
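A rough sketch of the per-pack decision such a helper could make is shown below. It is illustrative only: struct fetch_opts, choose_pack_command() and the other names are hypothetical, the stream handling and process spawning are left out, and the keep rule (object count at or above the unpack limit) is a simplification of what get_pack() does. The choice between --fsck-objects and --strict follows the reasoning in the patch comment quoted in this thread, where --strict also checks links and so cannot be used when other packs may hold the referenced objects.

```c
#include <assert.h>
#include <string.h>

#define MAX_ARGS 16

/* Hypothetical stand-in for git's strvec, with a fixed capacity. */
struct arg_list {
	const char *v[MAX_ARGS];
	int nr;
};

static void arg_push(struct arg_list *a, const char *s)
{
	assert(a->nr < MAX_ARGS);
	a->v[a->nr++] = s;
}

static int has_arg(const struct arg_list *a, const char *s)
{
	int i;

	for (i = 0; i < a->nr; i++)
		if (!strcmp(a->v[i], s))
			return 1;
	return 0;
}

/* Hypothetical per-fetch settings; the field names are illustrative. */
struct fetch_opts {
	int from_promisor;
	int fsck_objects;
	int multiple_packs;	/* the "you are not alone" bit */
	int unpack_limit;
};

/*
 * Choose the command line for ONE pack, given its object count.
 * Called once for the in-stream pack and once per out-of-band pack.
 */
static void choose_pack_command(struct arg_list *out,
				const struct fetch_opts *o, int nr_objects)
{
	int keep = nr_objects >= o->unpack_limit;

	if (!keep && !o->from_promisor && !o->fsck_objects) {
		/* Small pack, nothing forcing a packfile: explode it. */
		arg_push(out, "unpack-objects");
		return;
	}

	arg_push(out, "index-pack");
	arg_push(out, "--stdin");
	if (o->from_promisor)
		arg_push(out, "--promisor");
	if (o->fsck_objects)
		/* Links may point into another pack, so skip link checks. */
		arg_push(out, (o->from_promisor || o->multiple_packs)
			      ? "--fsck-objects" : "--strict");
}
```

Because the function sees only one pack at a time, a small pack and a large pack in the same fetch can end up with different commands, which is the flexibility the message argues for.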
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-10 19:56 ` Junio C Hamano @ 2021-03-10 23:29 ` Jonathan Tan 2021-03-11 0:59 ` Junio C Hamano 2021-03-11 1:41 ` Junio C Hamano 0 siblings, 2 replies; 229+ messages in thread From: Jonathan Tan @ 2021-03-10 23:29 UTC (permalink / raw) To: gitster; +Cc: jonathantanmy, git, jrnieder, nmulcahey > Junio C Hamano <gitster@pobox.com> writes: > > > I already said that I think 2aec3bc4 (fetch-pack: do not mix > > --pack_header and packfile uri, 2021-03-04) is OK as a short-term > > fix for the upcoming release, but it does not change the fact that > > it is piling technical debt on top of existing technical debt. > > > > And that is why I am reacting against your earlier mention of > > "filtering out" rather strongly. The approach continues the "keep > > the single args array in the belief that two must be mostly the > > same", which I view as a misguided starting point that must be > > rethought. > > Another way to think about the codepath is this. > > Can the bulk of get_pack() that deals with a single incoming > packfile (from the part that makes the decision to use either > index-pack or unpack-objects and chooses what options to pass to the > command, to the part that actually calls run_command() and feeds the > packdata to the command) be made into a helper function that handles > one packdata stream and nothing else? Such a helper would most > likely take as its parameters > > - a stream to read the packdata from (for in-stream packfile that > is handled by get_pack(), we already have it available) > > - fetch_pack_args and other options that are meant to affect the > operation of fetch-pack, among which are two bits that are of > interest in this topic: if we want to run fsck-objects and if the > entire fetch-pack is dealing with more than one packfile > (currently, the only source of need to process multiple packfiles > is packfile URI mechanism, but that does not have to stay that > way). 
This probably means that fetch-pack.c itself (instead of finish_http_pack_request(), currently being called from a separate http_fetch process) should call index-pack for the out-of-band packfiles, which is conceptually reasonable. This means that finish_http_pack_request() will need to be able to refrain from running index-pack itself and instead just return where the pack was downloaded. > Then get_pack() can move a lot of code out of it to this helper and > just call it. The processing of the other packfile obtained by the > packfile URI mechanism out of band can open the packstream and call > the helper the same way. When packfile URI mechanism is in use, both > invocations of the helper would get "you are not alone so fsck may > hit missing objects" bit, if fsck-objects are asked for. > > That would avoid the "duplicated logic" and still allow the code to > choose the best disposition of the incoming packdata per packfile. > > In an extreme case, it is not hard to imagine that somebody prepares > a very small base packfile and feeds it via packfile URI mechanism, > but has accumulated so many objects that are not yet rolled into an > updated base packfile---cloning from such a repository may result in > running unpack-objects for the packfile that came out of band, while > processing the in-stream packfile with index-pack. > > Hmm? Your suggestion (as opposed to the current situation, in which we're locked into using index-pack for the out-of-band packfiles) would make this possible, yes. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-10 23:29 ` Jonathan Tan @ 2021-03-11 0:59 ` Junio C Hamano 2021-03-11 1:41 ` Junio C Hamano 1 sibling, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-11 0:59 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Jonathan Tan <jonathantanmy@google.com> writes: >> Then get_pack() can move a lot of code out of it to this helper and >> just call it. The processing of the other packfile obtained by the >> packfile URI mechanism out of band can open the packstream and call >> the helper the same way. When packfile URI mechanism is in use, both >> invocations of the helper would get "you are not alone so fsck may >> hit missing objects" bit, if fsck-objects are asked for. >> >> That would avoid the "duplicated logic" and still allow the code to >> choose the best disposition of the incoming packdata per packfile. >> >> In an extreme case, it is not hard to imagine that somebody prepares >> a very small base packfile and feeds it via packfile URI mechanism, >> but has accumulated so many objects that are not yet rolled into an >> updated base packfile---cloning from such a repository may result in >> running unpack-objects for the packfile that came out of band, while >> processing the in-stream packfile with index-pack. >> >> Hmm? > > Your suggestion (as opposed to the current situation, in which we're > locked into using index-pack for the out-of-band packfiles) would make > this possible, yes. Just to make sure, I am not interested in running unpack-objects on oob packfiles, as they are expected to be "so old, big and not changing that it is worth pre-generating" packfiles, so "yes the approach would make that useless thing possible" is not a useful criterion to judge how good the alternative approach would be. 
If the approach results in a cleaner design that gives us more flexibility without risking unnecessary code duplication, it would be a good sign that the approach is more sound than the direction we took so far, though. Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-10 23:29 ` Jonathan Tan 2021-03-11 0:59 ` Junio C Hamano @ 2021-03-11 1:41 ` Junio C Hamano 2021-03-11 17:22 ` Jonathan Tan 1 sibling, 1 reply; 229+ messages in thread From: Junio C Hamano @ 2021-03-11 1:41 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Jonathan Tan <jonathantanmy@google.com> writes: > This probably means that fetch-pack.c itself (instead of > finish_http_pack_request(), currently being called from a separate > http_fetch process) should call index-pack for the out-of-band > packfiles, which is conceptually reasonable. This means that > finish_http_pack_request() will need to be able to refrain from running > index-pack itself and instead just return where the pack was downloaded. The HTTP downloading for packfile specified via the packfile URI mechanism is so different from the rest of the HTTP codepaths in nature, isn't it? It is a straight "download a static file over the web, and we could even afford to resume, or send multiple requests to gain throughput" usecase, which does not exist anywhere else in Git (eh, other than the dumb HTTP protocol nobody sane should be using anymore). Since we are not in the business of writing a performant HTTP downloader, if we can update the codepath not to rely on our http.c code, and instead spawn one of the command line tools written specifically for the "download a single large file over HTTP" usecase (like curl, wget or aria2c), wait for it to do its thing and then concentrate on the processing specific to Git (like running index-pack with various options), it would take us closer to the "make clone resumable" dream, wouldn't it? Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-11 1:41 ` Junio C Hamano @ 2021-03-11 17:22 ` Jonathan Tan 2021-03-11 21:21 ` Junio C Hamano 0 siblings, 1 reply; 229+ messages in thread From: Jonathan Tan @ 2021-03-11 17:22 UTC (permalink / raw) To: gitster; +Cc: jonathantanmy, git, jrnieder, nmulcahey > Jonathan Tan <jonathantanmy@google.com> writes: > > > This probably means that fetch-pack.c itself (instead of > > finish_http_pack_request(), currently being called from a separate > > http_fetch process) should call index-pack for the out-of-band > > packfiles, which is conceptually reasonable. This means that > > finish_http_pack_request() will need to be able to refrain from running > > index-pack itself and instead just return where the pack was downloaded. > > The HTTP downloading for packfile specified via the packfile URI > mechanism is so different from the rest of the HTTP codepaths in > nature, isn't it? It is a straight "download a static file over the > web, and we could even afford to resume, or send multiple requests > to gain throughput" usecase, which does not exist anywhere else in > Git (eh, other than the dumb HTTP protocol nobody sane should be > using anymore). Yes - and I also noticed that finish_http_pack_request() is also used in http-push.c, but I'm not familiar with that. > Since we are not in the business of writing a performant HTTP > downloader, if we can update the codepath not to rely on our http.c > code, and instead spawn one of the command line tools written > specifically for the "download a single large file over HTTP" > usecase (like curl, wget or aria2c), wait for it to do its thing and > then concentrate on the processing specific to Git (like running > index-pack with various options), it would take us closer to the > "make clone resumable" dream, wouldn't it? > > Thanks. We would have to figure out how to communicate any Git HTTP config variables to curl/wget etc. 
(and also declare a dependency on such a tool), but that could be done. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri 2021-03-11 17:22 ` Jonathan Tan @ 2021-03-11 21:21 ` Junio C Hamano 0 siblings, 0 replies; 229+ messages in thread From: Junio C Hamano @ 2021-03-11 21:21 UTC (permalink / raw) To: Jonathan Tan; +Cc: git, jrnieder, nmulcahey Jonathan Tan <jonathantanmy@google.com> writes: >> Since we are not in the business of writing a performant HTTP >> downloader, if we can update the codepath not to rely on our http.c >> code, and instead spawn one of the command line tools written >> specifically for the "download a single large file over HTTP" >> usecase (like curl, wget or aria2c), wait for it to do its thing and >> then concentrate on the processing specific to Git (like running >> index-pack with various options), it would take us closer to the >> "make clone resumable" dream, wouldn't it? >> >> Thanks. > > We would have to figure out how to communicate any Git HTTP config > variables to curl/wget etc. (and also declare a dependency on such a > tool), but that could be done. Sure, and we do not have to go all the way there in a single step. We'd likely need to ship with a basic "download from this URL and store it in this specified temporary file" (or "to this fd") and use it as the default downloader. We just need to design the interface to that downloader (i.e. which we want to make replaceable) to be not too intimate with the details of the side that spawns the downloader (i.e. git and git-fetch), and other people can write replacement as a thin wrapper around curl/wget etc. to contribute to us. Thanks. ^ permalink raw reply [flat|nested] 229+ messages in thread
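One possible shape for the replaceable downloader interface described above is sketched here. Nothing in it exists in Git: pack_downloader, fetch_uri_packs() and stub_download() are hypothetical names, and the stub only counts calls instead of transferring anything. A real default implementation, or a thin wrapper around curl or wget, would sit behind the same function pointer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical interface: fetch url and store it at dest_path.
 * Returns 0 on success, non-zero on failure. ctx is opaque to the caller.
 */
typedef int (*pack_downloader_fn)(const char *url, const char *dest_path,
				  void *ctx);

struct pack_downloader {
	pack_downloader_fn download;
	void *ctx;
};

/*
 * Caller side: knows which URLs it wants and where they should land,
 * but nothing about how the bytes are transferred.
 */
static int fetch_uri_packs(const struct pack_downloader *d,
			   const char **urls, const char **dests, size_t nr)
{
	size_t i;

	assert(d && d->download);
	for (i = 0; i < nr; i++)
		if (d->download(urls[i], dests[i], d->ctx))
			return -1;	/* stop at the first failure */
	return 0;
}

/*
 * Stub used only to exercise the interface: counts calls in *ctx and
 * fails when either argument is missing. It performs no network I/O.
 */
static int stub_download(const char *url, const char *dest_path, void *ctx)
{
	int *calls = ctx;

	if (!url || !dest_path)
		return 1;
	(*calls)++;
	return 0;
}
```

Keeping the interface down to a URL, a destination and an opaque context is one way to avoid the intimacy with the spawning side that the message warns about. How Git's HTTP configuration would reach an external tool is left open here, as it is in the thread.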
* [PATCH v2 3/4] fetch-pack: with packfile URIs, use index-pack arg
  2021-02-22 19:20 ` [PATCH v2 " Jonathan Tan
  2021-02-22 19:20   ` [PATCH v2 1/4] http: allow custom index-pack args Jonathan Tan
  2021-02-22 19:20   ` [PATCH v2 2/4] http-fetch: " Jonathan Tan
@ 2021-02-22 19:20   ` Jonathan Tan
  2021-02-22 19:20   ` [PATCH v2 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan
  2021-02-22 20:12   ` [PATCH v2 0/4] Check .gitmodules when using packfile URIs Junio C Hamano
From: Jonathan Tan @ 2021-02-22 19:20 UTC
  To: git; +Cc: Jonathan Tan, avarab, gitster

Unify the index-pack arguments used when processing the inline pack and
when downloading packfiles referenced by URIs. This is done by teaching
get_pack() to also store the index-pack arguments whenever at least one
packfile URI is given, and then when processing the packfile URI(s),
using the stored arguments.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 fetch-pack.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index aeac010b0b..dd0a6c4b34 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -797,12 +797,13 @@ static void write_promisor_file(const char *keep_name,
 }
 
 /*
- * Pass 1 as "only_packfile" if the pack received is the only pack in this
- * fetch request (that is, if there were no packfile URIs provided).
+ * If packfile URIs were provided, pass a non-NULL pointer to index_pack_args.
+ * The strings to pass as the --index-pack-arg arguments to http-fetch will be
+ * stored there. (It must be freed by the caller.)
  */
 static int get_pack(struct fetch_pack_args *args,
 		    int xd[2], struct string_list *pack_lockfiles,
-		    int only_packfile,
+		    struct strvec *index_pack_args,
 		    struct ref **sought, int nr_sought)
 {
 	struct async demux;
@@ -845,7 +846,7 @@ static int get_pack(struct fetch_pack_args *args,
 		strvec_push(&cmd.args, alternate_shallow_file);
 	}
 
-	if (do_keep || args->from_promisor) {
+	if (do_keep || args->from_promisor || index_pack_args) {
 		if (pack_lockfiles)
 			cmd.out = -1;
 		cmd_name = "index-pack";
@@ -863,7 +864,7 @@ static int get_pack(struct fetch_pack_args *args,
 				     "--keep=fetch-pack %"PRIuMAX " on %s",
 				     (uintmax_t)getpid(), hostname);
 		}
-		if (only_packfile && args->check_self_contained_and_connected)
+		if (!index_pack_args && args->check_self_contained_and_connected)
 			strvec_push(&cmd.args, "--check-self-contained-and-connected");
 		else
 			/*
@@ -901,7 +902,7 @@ static int get_pack(struct fetch_pack_args *args,
 	    : transfer_fsck_objects >= 0
 	    ? transfer_fsck_objects
 	    : 0) {
-		if (args->from_promisor || !only_packfile)
+		if (args->from_promisor || index_pack_args)
 			/*
 			 * We cannot use --strict in index-pack because it
 			 * checks both broken objects and links, but we only
@@ -913,6 +914,13 @@ static int get_pack(struct fetch_pack_args *args,
 				     fsck_msg_types.buf);
 	}
 
+	if (index_pack_args) {
+		int i;
+
+		for (i = 0; i < cmd.args.nr; i++)
+			strvec_push(index_pack_args, cmd.args.v[i]);
+	}
+
 	cmd.in = demux.out;
 	cmd.git_cmd = 1;
 	if (start_command(&cmd))
@@ -1084,7 +1092,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 		alternate_shallow_file = setup_temporary_shallow(si->shallow);
 	else
 		alternate_shallow_file = NULL;
-	if (get_pack(args, fd, pack_lockfiles, 1, sought, nr_sought))
+	if (get_pack(args, fd, pack_lockfiles, NULL, sought, nr_sought))
 		die(_("git fetch-pack: fetch failed."));
 
  all_done:
@@ -1535,6 +1543,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	int seen_ack = 0;
 	struct string_list packfile_uris = STRING_LIST_INIT_DUP;
 	int i;
+	struct strvec index_pack_args = STRVEC_INIT;
 
 	negotiator = &negotiator_alloc;
 	fetch_negotiator_init(r, negotiator);
@@ -1624,7 +1633,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			receive_packfile_uris(&reader, &packfile_uris);
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfiles,
-				     !packfile_uris.nr, sought, nr_sought))
+				     packfile_uris.nr ? &index_pack_args : NULL,
+				     sought, nr_sought))
 				die(_("git fetch-pack: fetch failed."));
 			do_check_stateless_delimiter(args, &reader);
 
@@ -1636,6 +1646,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	}
 
 	for (i = 0; i < packfile_uris.nr; i++) {
+		int j;
 		struct child_process cmd = CHILD_PROCESS_INIT;
 		char packname[GIT_MAX_HEXSZ + 1];
 		const char *uri = packfile_uris.items[i].string +
@@ -1645,9 +1656,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 		strvec_pushf(&cmd.args, "--packfile=%.*s",
 			     (int) the_hash_algo->hexsz,
 			     packfile_uris.items[i].string);
-		strvec_push(&cmd.args, "--index-pack-arg=index-pack");
-		strvec_push(&cmd.args, "--index-pack-arg=--stdin");
-		strvec_push(&cmd.args, "--index-pack-arg=--keep");
+		for (j = 0; j < index_pack_args.nr; j++)
+			strvec_pushf(&cmd.args, "--index-pack-arg=%s",
+				     index_pack_args.v[j]);
 		strvec_push(&cmd.args, uri);
 		cmd.git_cmd = 1;
 		cmd.no_stdin = 1;
@@ -1683,6 +1694,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 						 packname));
 	}
 	string_list_clear(&packfile_uris, 0);
+	strvec_clear(&index_pack_args);
 
 	if (negotiator)
 		negotiator->release(negotiator);
-- 
2.30.0.617.g56c4b15f3c-goog
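
The forwarding step in this patch (each stored index-pack argument is
passed to http-fetch as its own "--index-pack-arg=<arg>" option, so that
no quoting or encoding of spaces is needed) can be sketched outside of
git as follows. This is a standalone illustration using plain libc rather
than git's strvec API; wrap_index_pack_args() and free_wrapped_args() are
hypothetical helper names, not functions from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Free an array returned by wrap_index_pack_args(). */
void free_wrapped_args(char **wrapped)
{
	size_t i;

	if (!wrapped)
		return;
	for (i = 0; wrapped[i]; i++)
		free(wrapped[i]);
	free(wrapped);
}

/*
 * Return a newly allocated, NULL-terminated array in which each of the
 * nr strings in args is prefixed with "--index-pack-arg=".
 * Returns NULL on allocation failure.
 */
char **wrap_index_pack_args(const char *const *args, size_t nr)
{
	static const char prefix[] = "--index-pack-arg=";
	char **out;
	size_t i;

	assert(args || !nr);
	/* calloc() zeroes the array, so it is always NULL-terminated. */
	out = calloc(nr + 1, sizeof(*out));
	if (!out)
		return NULL;
	for (i = 0; i < nr; i++) {
		size_t len = strlen(prefix) + strlen(args[i]) + 1;

		out[i] = malloc(len);
		if (!out[i]) {
			free_wrapped_args(out);
			return NULL;
		}
		snprintf(out[i], len, "%s%s", prefix, args[i]);
	}
	return out;
}
```

Because every argument travels as one option value, an argument that
itself contains spaces (such as the "--keep=fetch-pack <pid> on <host>"
string built in get_pack()) survives intact.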
* [PATCH v2 4/4] fetch-pack: print and use dangling .gitmodules
  2021-02-22 19:20 ` [PATCH v2 " Jonathan Tan
                     ` (2 preceding siblings ...)
  2021-02-22 19:20   ` [PATCH v2 3/4] fetch-pack: with packfile URIs, use index-pack arg Jonathan Tan
@ 2021-02-22 19:20   ` Jonathan Tan
  2021-02-22 20:12   ` [PATCH v2 0/4] Check .gitmodules when using packfile URIs Junio C Hamano
From: Jonathan Tan @ 2021-02-22 19:20 UTC
  To: git; +Cc: Jonathan Tan, avarab, gitster

Teach index-pack to print dangling .gitmodules links after its "keep" or
"pack" line instead of declaring an error, and teach fetch-pack to check
such lines printed.

This allows the tree side of the .gitmodules link to be in one packfile
and the blob side to be in another without failing the fsck check,
because it is now fetch-pack which checks such objects after all
packfiles have been downloaded and indexed (and not index-pack on an
individual packfile, as it is before this commit).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-index-pack.txt |  7 ++-
 builtin/index-pack.c             | 25 +++++++++-
 builtin/receive-pack.c           |  2 +-
 fetch-pack.c                     | 78 +++++++++++++++++++++++++++-----
 fsck.c                           |  5 ++
 fsck.h                           |  2 +
 pack-write.c                     |  8 +++-
 pack.h                           |  2 +-
 t/t5702-protocol-v2.sh           | 58 ++++++++++++++++++++++--
 9 files changed, 165 insertions(+), 22 deletions(-)

diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt
index af0c26232c..e74a4a1eda 100644
--- a/Documentation/git-index-pack.txt
+++ b/Documentation/git-index-pack.txt
@@ -78,7 +78,12 @@ OPTIONS
 	Die if the pack contains broken links. For internal use only.
 
 --fsck-objects::
-	Die if the pack contains broken objects. For internal use only.
+	For internal use only.
++
+Die if the pack contains broken objects. If the pack contains a tree
+pointing to a .gitmodules blob that does not exist, prints the hash of
+that blob (for the caller to check) after the hash that goes into the
+name of the pack/idx file (see "Notes").
 
 --threads=<n>::
 	Specifies the number of threads to spawn when resolving
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 557bd2f348..0444febeee 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1693,6 +1693,22 @@ static void show_pack_info(int stat_only)
 	}
 }
 
+static int print_dangling_gitmodules(struct fsck_options *o,
+				     const struct object_id *oid,
+				     enum object_type object_type,
+				     int msg_type, const char *message)
+{
+	/*
+	 * NEEDSWORK: Plumb the MSG_ID (from fsck.c) here and use it
+	 * instead of relying on this string check.
+	 */
+	if (starts_with(message, "gitmodulesMissing")) {
+		printf("%s\n", oid_to_hex(oid));
+		return 0;
+	}
+	return fsck_error_function(o, oid, object_type, msg_type, message);
+}
+
 int cmd_index_pack(int argc, const char **argv, const char *prefix)
 {
 	int i, fix_thin_pack = 0, verify = 0, stat_only = 0;
@@ -1888,8 +1904,13 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	else
 		close(input_fd);
 
-	if (do_fsck_object && fsck_finish(&fsck_options))
-		die(_("fsck error in pack objects"));
+	if (do_fsck_object) {
+		struct fsck_options fo = fsck_options;
+
+		fo.error_func = print_dangling_gitmodules;
+		if (fsck_finish(&fo))
+			die(_("fsck error in pack objects"));
+	}
 
 	free(objects);
 	strbuf_release(&index_name_buf);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index d49d050e6e..ed2c9b42e9 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -2275,7 +2275,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		status = start_command(&child);
 		if (status)
 			return "index-pack fork failed";
-		pack_lockfile = index_pack_lockfile(child.out);
+		pack_lockfile = index_pack_lockfile(child.out, NULL);
 		close(child.out);
 		status = finish_command(&child);
 		if (status)
diff --git a/fetch-pack.c b/fetch-pack.c
index dd0a6c4b34..f9def5ac74 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -796,6 +796,26 @@ static void write_promisor_file(const char *keep_name,
 	strbuf_release(&promisor_name);
 }
 
+static void parse_gitmodules_oids(int fd, struct oidset *gitmodules_oids)
+{
+	int len = the_hash_algo->hexsz + 1; /* hash + NL */
+
+	do {
+		char hex_hash[GIT_MAX_HEXSZ + 1];
+		int read_len = read_in_full(fd, hex_hash, len);
+		struct object_id oid;
+		const char *end;
+
+		if (!read_len)
+			return;
+		if (read_len != len)
+			die("invalid length read %d", read_len);
+		if (parse_oid_hex(hex_hash, &oid, &end) || *end != '\n')
+			die("invalid hash");
+		oidset_insert(gitmodules_oids, &oid);
+	} while (1);
+}
+
 /*
  * If packfile URIs were provided, pass a non-NULL pointer to index_pack_args.
  * The strings to pass as the --index-pack-arg arguments to http-fetch will be
@@ -804,7 +824,8 @@ static void write_promisor_file(const char *keep_name,
 static int get_pack(struct fetch_pack_args *args,
 		    int xd[2], struct string_list *pack_lockfiles,
 		    struct strvec *index_pack_args,
-		    struct ref **sought, int nr_sought)
+		    struct ref **sought, int nr_sought,
+		    struct oidset *gitmodules_oids)
 {
 	struct async demux;
 	int do_keep = args->keep_pack;
@@ -812,6 +833,7 @@ static int get_pack(struct fetch_pack_args *args,
 	struct pack_header header;
 	int pass_header = 0;
 	struct child_process cmd = CHILD_PROCESS_INIT;
+	int fsck_objects = 0;
 	int ret;
 
 	memset(&demux, 0, sizeof(demux));
@@ -846,8 +868,15 @@ static int get_pack(struct fetch_pack_args *args,
 		strvec_push(&cmd.args, alternate_shallow_file);
 	}
 
-	if (do_keep || args->from_promisor || index_pack_args) {
-		if (pack_lockfiles)
+	if (fetch_fsck_objects >= 0
+	    ? fetch_fsck_objects
+	    : transfer_fsck_objects >= 0
+	    ? transfer_fsck_objects
+	    : 0)
+		fsck_objects = 1;
+
+	if (do_keep || args->from_promisor || index_pack_args || fsck_objects) {
+		if (pack_lockfiles || fsck_objects)
 			cmd.out = -1;
 		cmd_name = "index-pack";
 		strvec_push(&cmd.args, cmd_name);
@@ -897,11 +926,7 @@ static int get_pack(struct fetch_pack_args *args,
 		strvec_pushf(&cmd.args, "--pack_header=%"PRIu32",%"PRIu32,
 			     ntohl(header.hdr_version),
 			     ntohl(header.hdr_entries));
-	if (fetch_fsck_objects >= 0
-	    ? fetch_fsck_objects
-	    : transfer_fsck_objects >= 0
-	    ? transfer_fsck_objects
-	    : 0) {
+	if (fsck_objects) {
 		if (args->from_promisor || index_pack_args)
 			/*
 			 * We cannot use --strict in index-pack because it
@@ -925,10 +950,15 @@ static int get_pack(struct fetch_pack_args *args,
 	cmd.git_cmd = 1;
 	if (start_command(&cmd))
 		die(_("fetch-pack: unable to fork off %s"), cmd_name);
-	if (do_keep && pack_lockfiles) {
-		char *pack_lockfile = index_pack_lockfile(cmd.out);
+	if (do_keep && (pack_lockfiles || fsck_objects)) {
+		int is_well_formed;
+		char *pack_lockfile = index_pack_lockfile(cmd.out, &is_well_formed);
+
+		if (!is_well_formed)
+			die(_("fetch-pack: invalid index-pack output"));
 		if (pack_lockfile)
 			string_list_append_nodup(pack_lockfiles, pack_lockfile);
+		parse_gitmodules_oids(cmd.out, gitmodules_oids);
 		close(cmd.out);
 	}
 
@@ -963,6 +993,22 @@ static int cmp_ref_by_name(const void *a_, const void *b_)
 	return strcmp(a->name, b->name);
 }
 
+static void fsck_gitmodules_oids(struct oidset *gitmodules_oids)
+{
+	struct oidset_iter iter;
+	const struct object_id *oid;
+	struct fsck_options fo = FSCK_OPTIONS_STRICT;
+
+	if (!oidset_size(gitmodules_oids))
+		return;
+
+	oidset_iter_init(gitmodules_oids, &iter);
+	while ((oid = oidset_iter_next(&iter)))
+		register_found_gitmodules(oid);
+	if (fsck_finish(&fo))
+		die("fsck failed");
+}
+
 static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 				 int fd[2],
 				 const struct ref *orig_ref,
@@ -977,6 +1023,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	int agent_len;
 	struct fetch_negotiator negotiator_alloc;
 	struct fetch_negotiator *negotiator;
+	struct oidset gitmodules_oids = OIDSET_INIT;
 
 	negotiator = &negotiator_alloc;
 	fetch_negotiator_init(r, negotiator);
@@ -1092,8 +1139,10 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 		alternate_shallow_file = setup_temporary_shallow(si->shallow);
 	else
 		alternate_shallow_file = NULL;
-	if (get_pack(args, fd, pack_lockfiles, NULL, sought, nr_sought))
+	if (get_pack(args, fd, pack_lockfiles, NULL, sought, nr_sought,
+		     &gitmodules_oids))
 		die(_("git fetch-pack: fetch failed."));
+	fsck_gitmodules_oids(&gitmodules_oids);
 
  all_done:
 	if (negotiator)
@@ -1544,6 +1593,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	struct string_list packfile_uris = STRING_LIST_INIT_DUP;
 	int i;
 	struct strvec index_pack_args = STRVEC_INIT;
+	struct oidset gitmodules_oids = OIDSET_INIT;
 
 	negotiator = &negotiator_alloc;
 	fetch_negotiator_init(r, negotiator);
@@ -1634,7 +1684,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfiles,
 				     packfile_uris.nr ? &index_pack_args : NULL,
-				     sought, nr_sought))
+				     sought, nr_sought, &gitmodules_oids))
 				die(_("git fetch-pack: fetch failed."));
 			do_check_stateless_delimiter(args, &reader);
 
@@ -1677,6 +1727,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 
 		packname[the_hash_algo->hexsz] = '\0';
 
+		parse_gitmodules_oids(cmd.out, &gitmodules_oids);
+
 		close(cmd.out);
 
 		if (finish_command(&cmd))
@@ -1696,6 +1748,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	string_list_clear(&packfile_uris, 0);
 	strvec_clear(&index_pack_args);
 
+	fsck_gitmodules_oids(&gitmodules_oids);
+
 	if (negotiator)
 		negotiator->release(negotiator);
 
diff --git a/fsck.c b/fsck.c
index f82e2fe9e3..49ef6569e8 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1243,6 +1243,11 @@ int fsck_error_function(struct fsck_options *o,
 	return 1;
 }
 
+void register_found_gitmodules(const struct object_id *oid)
+{
+	oidset_insert(&gitmodules_found, oid);
+}
+
 int fsck_finish(struct fsck_options *options)
 {
 	int ret = 0;
diff --git a/fsck.h b/fsck.h
index 69cf715e79..d75b723bd5 100644
--- a/fsck.h
+++ b/fsck.h
@@ -62,6 +62,8 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options);
 int fsck_object(struct object *obj, void *data, unsigned long size,
 	struct fsck_options *options);
 
+void register_found_gitmodules(const struct object_id *oid);
+
 /*
  * Some fsck checks are context-dependent, and may end up queued; run this
  * after completing all fsck_object() calls in order to resolve any remaining
diff --git a/pack-write.c b/pack-write.c
index 3513665e1e..f66ea8e5a1 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -272,7 +272,7 @@ void fixup_pack_header_footer(int pack_fd,
 	fsync_or_die(pack_fd, pack_name);
 }
 
-char *index_pack_lockfile(int ip_out)
+char *index_pack_lockfile(int ip_out, int *is_well_formed)
 {
 	char packname[GIT_MAX_HEXSZ + 6];
 	const int len = the_hash_algo->hexsz + 6;
@@ -286,11 +286,17 @@ char *index_pack_lockfile(int ip_out)
 	 */
 	if (read_in_full(ip_out, packname, len) == len && packname[len-1] == '\n') {
 		const char *name;
+
+		if (is_well_formed)
+			*is_well_formed = 1;
 		packname[len-1] = 0;
 		if (skip_prefix(packname, "keep\t", &name))
 			return xstrfmt("%s/pack/pack-%s.keep",
 				       get_object_directory(), name);
+		return NULL;
 	}
+	if (is_well_formed)
+		*is_well_formed = 0;
 	return NULL;
 }
 
diff --git a/pack.h b/pack.h
index 9fc0945ac9..09cffec395 100644
--- a/pack.h
+++ b/pack.h
@@ -85,7 +85,7 @@ int verify_pack_index(struct packed_git *);
 int verify_pack(struct repository *, struct packed_git *, verify_fn fn, struct progress *, uint32_t);
 off_t write_pack_header(struct hashfile *f, uint32_t);
 void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
-char *index_pack_lockfile(int fd);
+char *index_pack_lockfile(int fd, int *is_well_formed);
 
 /*
  * The "hdr" output buffer should be at least this big, which will handle sizes
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..b1bc73a9a9 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -847,8 +847,9 @@ test_expect_success 'part of packfile response provided as URI' '
 	test -f hfound &&
 	test -f h2found &&
 
-	# Ensure that there are exactly 6 files (3 .pack and 3 .idx).
-	ls http_child/.git/objects/pack/* >filelist &&
+	# Ensure that there are exactly 3 packfiles with associated .idx
+	ls http_child/.git/objects/pack/*.pack \
+	    http_child/.git/objects/pack/*.idx >filelist &&
 	test_line_count = 6 filelist
 '
 
@@ -901,8 +902,9 @@ test_expect_success 'packfile-uri with transfer.fsckobjects' '
 		-c fetch.uriprotocols=http,https \
 		clone "$HTTPD_URL/smart/http_parent" http_child &&
 
-	# Ensure that there are exactly 4 files (2 .pack and 2 .idx).
-	ls http_child/.git/objects/pack/* >filelist &&
+	# Ensure that there are exactly 2 packfiles with associated .idx
+	ls http_child/.git/objects/pack/*.pack \
+	    http_child/.git/objects/pack/*.idx >filelist &&
 	test_line_count = 4 filelist
 '
 
@@ -936,6 +938,54 @@ test_expect_success 'packfile-uri with transfer.fsckobjects fails on bad object'
 	test_i18ngrep "invalid author/committer line - missing email" error
 '
 
+test_expect_success 'packfile-uri with transfer.fsckobjects succeeds when .gitmodules is separate from tree' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo "[submodule libfoo]" >"$P/.gitmodules" &&
+	echo "path = include/foo" >>"$P/.gitmodules" &&
+	echo "url = git://example.com/git/lib.git" >>"$P/.gitmodules" &&
+	git -C "$P" add .gitmodules &&
+	git -C "$P" commit -m x &&
+
+	configure_exclusion "$P" .gitmodules >h &&
+
+	sane_unset GIT_TEST_SIDEBAND_ALL &&
+	git -c protocol.version=2 -c transfer.fsckobjects=1 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	# Ensure that there are exactly 2 packfiles with associated .idx
+	ls http_child/.git/objects/pack/*.pack \
+	    http_child/.git/objects/pack/*.idx >filelist &&
+	test_line_count = 4 filelist
+'
+
+test_expect_success 'packfile-uri with transfer.fsckobjects fails when .gitmodules separate from tree is invalid' '
+	P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	rm -rf "$P" http_child err &&
+
+	git init "$P" &&
+	git -C "$P" config "uploadpack.allowsidebandall" "true" &&
+
+	echo "[submodule \"..\"]" >"$P/.gitmodules" &&
+	echo "path = include/foo" >>"$P/.gitmodules" &&
+	echo "url = git://example.com/git/lib.git" >>"$P/.gitmodules" &&
+	git -C "$P" add .gitmodules &&
+	git -C "$P" commit -m x &&
+
+	configure_exclusion "$P" .gitmodules >h &&
+
+	sane_unset GIT_TEST_SIDEBAND_ALL &&
+	test_must_fail git -c protocol.version=2 -c transfer.fsckobjects=1 \
+		-c fetch.uriprotocols=http,https \
+		clone "$HTTPD_URL/smart/http_parent" http_child 2>err &&
+	test_i18ngrep "disallowed submodule name" err
+'
+
 # DO NOT add non-httpd-specific tests here, because the last part of this
 # test script is only executed when httpd is available and enabled.
 
-- 
2.30.0.617.g56c4b15f3c-goog
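
The output format that parse_gitmodules_oids() consumes (after the
"keep"/"pack" line, zero or more records of exactly hexsz hex digits
followed by a newline) can be illustrated with a small standalone
validator. This is only a sketch of the record format under simplifying
assumptions: it works on an in-memory buffer instead of a file
descriptor, it only counts records instead of inserting them into an
oidset, and count_hash_lines() is a hypothetical name that does not
appear in the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Check that buf (len bytes) consists of zero or more records, each
 * being exactly hexsz hexadecimal digits followed by '\n'.
 * Returns the number of records, or -1 if the buffer is malformed
 * (a short trailing record, a non-hex character, or a missing newline).
 */
long count_hash_lines(const char *buf, size_t len, size_t hexsz)
{
	size_t reclen = hexsz + 1; /* hash + NL, as in the patch */
	long n = 0;

	assert(hexsz > 0);
	if (len % reclen)
		return -1; /* corresponds to "invalid length read" */
	for (; len; buf += reclen, len -= reclen, n++) {
		size_t i;

		for (i = 0; i < hexsz; i++)
			if (!isxdigit((unsigned char)buf[i]))
				return -1; /* corresponds to "invalid hash" */
		if (buf[hexsz] != '\n')
			return -1;
	}
	return n;
}
```

An empty buffer is valid and yields zero records, which matches the
common case where index-pack found no dangling .gitmodules link and
fsck_gitmodules_oids() returns early.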
* Re: [PATCH v2 0/4] Check .gitmodules when using packfile URIs
  2021-02-22 19:20 ` [PATCH v2 " Jonathan Tan
                     ` (3 preceding siblings ...)
  2021-02-22 19:20   ` [PATCH v2 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan
@ 2021-02-22 20:12   ` Junio C Hamano
From: Junio C Hamano @ 2021-02-22 20:12 UTC
  To: Jonathan Tan; +Cc: git, avarab

Jonathan Tan <jonathantanmy@google.com> writes:

> Here's v2. I think I've addressed all the review comments, including
> passing the index-pack args as separate arguments (to avoid the
> necessity to somehow encode in order to get rid of spaces), and by using
> a custom error function instead of a specific option in fsck.
>
> This applies on master. I mentioned earlier [1] that I was planning to
> implement this on Ævar's fsck API improvements, but after looking at the
> latest v2, I see that it omits patch 11 from v1 (which is the one I
> need), so what I've done is to use a string check in the meantime.
>
> [1] https://lore.kernel.org/git/20210219004612.1181920-1-jonathantanmy@google.com/

I only looked at the difference between this round and what is in
'seen', but everything looked reasonable to me (including the code
that is near NEEDSWORK comment, and what the comment said).

Will queue. Thanks.
Jonathan Tan 2021-02-18 12:07 ` Ævar Arnfjörð Bjarmason 2021-02-17 19:27 ` Ævar Arnfjörð Bjarmason 2021-02-17 20:11 ` Jonathan Tan 2021-01-24 6:29 ` [PATCH 0/4] Check .gitmodules when using packfile URIs Junio C Hamano 2021-01-28 0:35 ` Jonathan Tan 2021-02-18 11:31 ` Ævar Arnfjörð Bjarmason 2021-02-18 23:34 ` Junio C Hamano 2021-02-19 0:46 ` Jonathan Tan 2021-02-20 3:31 ` Junio C Hamano 2021-02-19 1:08 ` Ævar Arnfjörð Bjarmason 2021-02-20 3:29 ` Junio C Hamano 2021-02-22 19:20 ` [PATCH v2 " Jonathan Tan 2021-02-22 19:20 ` [PATCH v2 1/4] http: allow custom index-pack args Jonathan Tan 2021-02-22 19:20 ` [PATCH v2 2/4] http-fetch: " Jonathan Tan 2021-02-23 13:17 ` Ævar Arnfjörð Bjarmason 2021-02-23 16:51 ` Jonathan Tan 2021-03-05 0:19 ` Jonathan Nieder 2021-03-05 1:16 ` [PATCH] fetch-pack: do not mix --pack_header and packfile uri Jonathan Tan 2021-03-05 1:52 ` Junio C Hamano 2021-03-05 18:50 ` Junio C Hamano 2021-03-05 19:46 ` Junio C Hamano 2021-03-05 23:11 ` Jonathan Tan 2021-03-05 23:20 ` Junio C Hamano 2021-03-05 22:59 ` Jonathan Tan 2021-03-05 23:18 ` Junio C Hamano 2021-03-08 19:14 ` Jonathan Tan 2021-03-08 19:34 ` Junio C Hamano 2021-03-09 19:13 ` Junio C Hamano 2021-03-10 5:24 ` Junio C Hamano 2021-03-10 16:57 ` Jonathan Tan 2021-03-10 18:30 ` Junio C Hamano 2021-03-10 19:56 ` Junio C Hamano 2021-03-10 23:29 ` Jonathan Tan 2021-03-11 0:59 ` Junio C Hamano 2021-03-11 1:41 ` Junio C Hamano 2021-03-11 17:22 ` Jonathan Tan 2021-03-11 21:21 ` Junio C Hamano 2021-02-22 19:20 ` [PATCH v2 3/4] fetch-pack: with packfile URIs, use index-pack arg Jonathan Tan 2021-02-22 19:20 ` [PATCH v2 4/4] fetch-pack: print and use dangling .gitmodules Jonathan Tan 2021-02-22 20:12 ` [PATCH v2 0/4] Check .gitmodules when using packfile URIs Junio C Hamano