git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Problem with fsck and invalid submodule path in history
@ 2019-07-29  7:58 Olivier Bornet
  2019-07-29  9:39 ` SZEDER Gábor
  0 siblings, 1 reply; 8+ messages in thread
From: Olivier Bornet @ 2019-07-29  7:58 UTC (permalink / raw)
  To: git; +Cc: Olivier Bornet

[-- Attachment #1: Type: text/plain, Size: 1398 bytes --]

Hello,

I have a git repository with an error in a submodule path in the history.
The submodule path is “-f”, which is not allowed. But for some reason, it’s in the history of the git, and I’m trying to find a way to manage it without having to rewrite the full history on the main gitlab (if possible)...

To reproduce this unwanted history:

    mkdir test-bad-history
    cd test-bad-history
    echo "Test git submodule problems" > README.md
    git init
    git add README.md
    git commit --message="Start test"
    git submodule add https://github.com/leachim6/hello-world.git
    git commit --message="Commit new submodule with correct path"
    # the bad part...
    git mv hello-world -- -f
    git commit --message="Move submodule to an invalid path"
    # correct it...
    git mv -- -f valid-path
    sed -i.bak 's/-f/valid-path/' .gitmodules
    git add .gitmodules
    git commit --message="Back to a valid path"

After that, even if the git is working correctly, we have a “bad” history if we check with fsck:

    $ git fsck
    Checking object directories: 100% (256/256), done.
    error in blob 19a97d3b70760c74b780c8134e33f5392292c2e6: gitmodulesPath: disallowed submodule path: -f

Is it possible to correct it? Must git handle this kind of errors?

Thanks in advance for any help.

--
Olivier Bornet
Olivier.Bornet@puck.ch


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 235 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Problem with fsck and invalid submodule path in history
  2019-07-29  7:58 Problem with fsck and invalid submodule path in history Olivier Bornet
@ 2019-07-29  9:39 ` SZEDER Gábor
  2019-07-29  9:59   ` [PATCH] Documentation/git-fsck.txt: include fsck.* config variables SZEDER Gábor
  2019-07-29 14:31   ` Problem with fsck and invalid submodule path in history Olivier Bornet
  0 siblings, 2 replies; 8+ messages in thread
From: SZEDER Gábor @ 2019-07-29  9:39 UTC (permalink / raw)
  To: Olivier Bornet; +Cc: git

On Mon, Jul 29, 2019 at 09:58:52AM +0200, Olivier Bornet wrote:
> I have a git repository with an error in a submodule path in the history.
> The submodule path is “-f”, which is not allowed. But for some reason, it’s in the history of the git, and I’m trying to find a way to manage it without having to rewrite the full history on the main gitlab (if possible)...
> 
> To reproduce this unwanted history:

> After that, even if the git is working correctly, we have a “bad” history if we check with fsck:
> 
>     $ git fsck
>     Checking object directories: 100% (256/256), done.
>     error in blob 19a97d3b70760c74b780c8134e33f5392292c2e6: gitmodulesPath: disallowed submodule path: -f
> 
> Is it possible to correct it? Must git handle this kind of errors?

To correct without rewriting history, no.

However, you can tell 'git fsck' to ignore it using the
'fsck.skipList' configuration variable (see 'git help config'; for
some reason it's not included in 'git fsck's documentation):

  $ cat <<EOF >.git-fsck-skiplist
  > # invalid submodule path
  > 19a97d3b70760c74b780c8134e33f5392292c2e6
  > EOF
  $ git config fsck.skipList .git-fsck-skiplist
  $ git fsck
  Checking object directories: 100% (256/256), done.

It may or may not be worth committing this file, I'm not quite sure
what the best practice is.  By committing it others don't have to
maintain such a skiplist file themselves, though they still have to
set the config variable.  OTOH, if anyone sets this config variable
and attempts to run 'git fsck' while on a branch that doesn't contain
this file, then they will get a 'fatal: could not open object name
list: .git-fsck-skiplist' error.

And it won't help anyone cloning the repository with
'fetch.fsckObjects' enabled.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] Documentation/git-fsck.txt: include fsck.* config variables
  2019-07-29  9:39 ` SZEDER Gábor
@ 2019-07-29  9:59   ` SZEDER Gábor
  2019-07-29 15:33     ` Ævar Arnfjörð Bjarmason
  2019-07-29 14:31   ` Problem with fsck and invalid submodule path in history Olivier Bornet
  1 sibling, 1 reply; 8+ messages in thread
From: SZEDER Gábor @ 2019-07-29  9:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Olivier Bornet, SZEDER Gábor

The 'fsck.skipList' and 'fsck.<msg-id>' config variables might be
easier to discover when they are documented in 'git fsck's man page.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 Documentation/git-fsck.txt | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index e0eae642c1..d72d15be5b 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -104,6 +104,11 @@ care about this output and want to speed it up further.
 	progress status even if the standard error stream is not
 	directed to a terminal.
 
+CONFIGURATION
+-------------
+
+include::config/fsck.txt[]
+
 DISCUSSION
 ----------
 
-- 
2.22.0.924.g7c03687a56


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Problem with fsck and invalid submodule path in history
  2019-07-29  9:39 ` SZEDER Gábor
  2019-07-29  9:59   ` [PATCH] Documentation/git-fsck.txt: include fsck.* config variables SZEDER Gábor
@ 2019-07-29 14:31   ` Olivier Bornet
  1 sibling, 0 replies; 8+ messages in thread
From: Olivier Bornet @ 2019-07-29 14:31 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Olivier Bornet, git

[-- Attachment #1: Type: text/plain, Size: 472 bytes --]

Hello,
> On 29 Jul 2019, at 11:39, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> 
> However, you can tell 'git fsck' to ignore it using the
> 'fsck.skipList' configuration variable (see 'git help config'; for
> some reason it's not included in 'git fsck's documentation):

Thanks for pointing it out. Thanks to it, we have managed to handle
the error with our gitlab automatic fsck procedure.

Have a nice day.

--
Olivier Bornet
Olivier.Bornet@puck.ch



[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 235 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Documentation/git-fsck.txt: include fsck.* config variables
  2019-07-29  9:59   ` [PATCH] Documentation/git-fsck.txt: include fsck.* config variables SZEDER Gábor
@ 2019-07-29 15:33     ` Ævar Arnfjörð Bjarmason
  2019-07-29 15:48       ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-07-29 15:33 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, git, Olivier Bornet


On Mon, Jul 29 2019, SZEDER Gábor wrote:

> The 'fsck.skipList' and 'fsck.<msg-id>' config variables might be
> easier to discover when they are documented in 'git fsck's man page.
>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> ---
>  Documentation/git-fsck.txt | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
> index e0eae642c1..d72d15be5b 100644
> --- a/Documentation/git-fsck.txt
> +++ b/Documentation/git-fsck.txt
> @@ -104,6 +104,11 @@ care about this output and want to speed it up further.
>  	progress status even if the standard error stream is not
>  	directed to a terminal.
>
> +CONFIGURATION
> +-------------
> +
> +include::config/fsck.txt[]

Before this include let's add:

    The below documentation is the same as what’s found in
    git-config(1):

As I did for a similar change in git-gc in b6a8d09f6d ("gc docs: include
the "gc.*" section from "config" in "gc"", 2019-04-07). Sometimes we
repeat ourselves, it helps the reader to know this isn't some slightly
different prose than what's in git-config.

> +
>  DISCUSSION
>  ----------

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Documentation/git-fsck.txt: include fsck.* config variables
  2019-07-29 15:33     ` Ævar Arnfjörð Bjarmason
@ 2019-07-29 15:48       ` Junio C Hamano
  2019-07-29 20:12         ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2019-07-29 15:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: SZEDER Gábor, git, Olivier Bornet

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Mon, Jul 29 2019, SZEDER Gábor wrote:
>
>> The 'fsck.skipList' and 'fsck.<msg-id>' config variables might be
>> easier to discover when they are documented in 'git fsck's man page.
>>
>> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
>> ---
>>  Documentation/git-fsck.txt | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
>> index e0eae642c1..d72d15be5b 100644
>> --- a/Documentation/git-fsck.txt
>> +++ b/Documentation/git-fsck.txt
>> @@ -104,6 +104,11 @@ care about this output and want to speed it up further.
>>  	progress status even if the standard error stream is not
>>  	directed to a terminal.
>>
>> +CONFIGURATION
>> +-------------
>> +
>> +include::config/fsck.txt[]
>
> Before this include let's add:
>
>     The below documentation is the same as what’s found in
>     git-config(1):

I actually do not think we would want to do that.  I am all for the
kind of 'include' proposed by this patch, and we should strive to
make it easier for us to make sure the duplicated text are in sync.

But that would mean that the readers will have to see the "is the
same as the other one" over and over.  If our documentation set is
consistent, they should not have to.

I think we *must* make such a note in a total opposite case,
i.e. "here are the summary of the most often used options; for full
list, see git-config(1)".

> As I did for a similar change in git-gc in b6a8d09f6d ("gc docs: include
> the "gc.*" section from "config" in "gc"", 2019-04-07). Sometimes we
> repeat ourselves, it helps the reader to know this isn't some slightly
> different prose than what's in git-config.

So, I think we should revert that part out of b6a8d09f6d, too.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Documentation/git-fsck.txt: include fsck.* config variables
  2019-07-29 15:48       ` Junio C Hamano
@ 2019-07-29 20:12         ` Jeff King
  2019-07-29 21:32           ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2019-07-29 20:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, SZEDER Gábor, git,
	Olivier Bornet

On Mon, Jul 29, 2019 at 08:48:28AM -0700, Junio C Hamano wrote:

> > Before this include let's add:
> >
> >     The below documentation is the same as what’s found in
> >     git-config(1):
> 
> I actually do not think we would want to do that.  I am all for the
> kind of 'include' proposed by this patch, and we should strive to
> make it easier for us to make sure the duplicated text are in sync.
> 
> But that would mean that the readers will have to see the "is the
> same as the other one" over and over.  If our documentation set is
> consistent, they should not have to.
> 
> I think we *must* make such a note in a total opposite case,
> i.e. "here are the summary of the most often used options; for full
> list, see git-config(1)".

I disagree. _We_ know that the content is the same, because we are
looking at the source that says "include". But as a user, how do I know
when I get to one section or the other that it is something I have
already read and can skip over?

Perhaps if Git were entirely consistent here, they could make the
assumption that "CONFIG" sections were always duplicated and know this
already.  But I think even that is asking a bit much. Unless they are
intimately familiar with our documentation, they don't know that we are,
in fact, consistent. And we are in an uphill battle with every other
thing the user has read, which may not agree with our assumptions of
consistency. ;)

So IMHO it's worth leaving a note that guides the reader, as long as
it's short (and I think this one is).


That said, I think an even _better_ solution would be to avoid includes,
and instead make it clear when we are pointing the user to shared
content. Then we get them to the right place _and_ explicitly instruct
them that concepts/content are shared. For config, for example, I've
worked with a previous system that did something like:

  - include fsck.* documentation in the git-fsck manpage

  - make a master table of config options in git-config.1 with _just_
    the names and the associated manpage where the definition can be
    found. This serves as an index if you don't know where to look.

This would probably involve creating new concept-based pages for some of
the groupings (e.g., where does "remote.*" config go?), but I think that
would probably help round out our documentation (if there is a concept
with related config options but we don't explain it anywhere, that is
probably a gap we should fix).

The biggest downside is that chasing down references in manpages sucks.
For the HTML documentation we'd ideally hyperlink from the git-config.1
index into each definition, but there's no way to do that with a regular
manpage.

-Peff

PS This is an approach I've advocated for a while:

    https://public-inbox.org/git/20110120233429.GB9442@sigill.intra.peff.net/

   but haven't actually done much about, so perhaps I should be putting
   my money where my mouth is. ;)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Documentation/git-fsck.txt: include fsck.* config variables
  2019-07-29 20:12         ` Jeff King
@ 2019-07-29 21:32           ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2019-07-29 21:32 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, SZEDER Gábor, git,
	Olivier Bornet

Jeff King <peff@peff.net> writes:

> On Mon, Jul 29, 2019 at 08:48:28AM -0700, Junio C Hamano wrote:
>
>> > Before this include let's add:
>> >
>> >     The below documentation is the same as what’s found in
>> >     git-config(1):
>> 
>> I actually do not think we would want to do that.  I am all for the
>> kind of 'include' proposed by this patch, and we should strive to
>> make it easier for us to make sure the duplicated text are in sync.
>> 
>> But that would mean that the readers will have to see the "is the
>> same as the other one" over and over.  If our documentation set is
>> consistent, they should not have to.
>> 
>> I think we *must* make such a note in a total opposite case,
>> i.e. "here are the summary of the most often used options; for full
>> list, see git-config(1)".
>
> I disagree. _We_ know that the content is the same, because we are
> looking at the source that says "include". But as a user, how do I know
> when I get to one section or the other that it is something I have
> already read and can skip over?

I want to raise the user expectation so that they would expect from
our documentation, unless we say "these are different", we would
never say conflicting things in two places.

So,... I disagree.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-07-29 21:32 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-29  7:58 Problem with fsck and invalid submodule path in history Olivier Bornet
2019-07-29  9:39 ` SZEDER Gábor
2019-07-29  9:59   ` [PATCH] Documentation/git-fsck.txt: include fsck.* config variables SZEDER Gábor
2019-07-29 15:33     ` Ævar Arnfjörð Bjarmason
2019-07-29 15:48       ` Junio C Hamano
2019-07-29 20:12         ` Jeff King
2019-07-29 21:32           ` Junio C Hamano
2019-07-29 14:31   ` Problem with fsck and invalid submodule path in history Olivier Bornet

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).