git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* git submodules implementation question
@ 2016-08-28 23:24 Uma Srinivasan
  2016-08-29 20:03 ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-28 23:24 UTC (permalink / raw)
  To: git

Hi,

I am new to git source and internal development. Recently I've been
looking at a problem where issuing "git status" in a corrupted
workspace handed over to me by a user, forks several thousand child
processes recursively.

The symptoms of the corrupted workspace to reproduce this problem are
as follows:

there needs to be an entry in the index file that looks something like this....
0 groc 160000 0 0.000000000

Also, there's a subdirectory under the source dir with the name in the
index file above (i.e 'groc') that has a partially populated .git sub
directory within it, not a .git file with the contents providing a
link to the module subdirectory.

The "git submodule status" command returns "No submodule mapping found
in .gitmodules for path 'groc'"

Doing a "git read-tree HEAD" will regenerate the index file and make
the problem go away. Basically the line entry above relating to 'groc'
goes away as it is not a submodule in the source repo.

This problem can be reproduced with all current versions of GIT
including 2.4, 2.8 and the latest version v2.10.0-rc1.

In looking into the git source code for is_submodule_modified() which
forks off the new child process, I see the following lines in
submodule.c:

git_dir = read_gitfile(buf.buf);
if (!git_dir)

                git_dir = buf.buf;

Can anyone explain to me why we are replacing a failed reading of a
git file with the original sub directory name? When can we expect a
.git directory under a submodule?

Thanks,
Uma

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-28 23:24 git submodules implementation question Uma Srinivasan
@ 2016-08-29 20:03 ` Junio C Hamano
  2016-08-29 21:03   ` Uma Srinivasan
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2016-08-29 20:03 UTC (permalink / raw)
  To: Uma Srinivasan; +Cc: git

Uma Srinivasan <usrinivasan@twitter.com> writes:

> git_dir = read_gitfile(buf.buf);
> if (!git_dir)
>
>                 git_dir = buf.buf;
>
> Can anyone explain to me why we are replacing a failed reading of a
> git file with the original sub directory name?

A top-level superproject can have a submodule bound at its "dir/"
directory, and "dir/.git" can either be a gitfile which you can read
with read_gitfile() and point into somewhere in ".git/modules/" of
the top-level superproject.  "dir/.git" can _ALSO_ be a fully valid
Git directory.  So at the top of a superproject, you could do

	git clone $URL ./dir2
        git add dir2

to clone an independent project into dir2 directory, and add it as a
new submodule.  The fallback is to support such a layout.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-29 20:03 ` Junio C Hamano
@ 2016-08-29 21:03   ` Uma Srinivasan
  2016-08-29 21:09     ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-29 21:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Thanks for the reply. However, in this case....

      git clone $URL ./dir2
      git add dir2

how will "dir2" get ever get registered as a submodule? I don't see
how one can reach the "is_submodule_modified" routine for the scenario
above.

My understanding is that a sub-directory can be registered as a
submodule only with the "git submodule add ...." command. In this case
only a gitlink file is created within the sub-directory and not a .git
subdirectory. Please correct me if I am wrong.

Thanks again,
Uma

On Mon, Aug 29, 2016 at 1:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Uma Srinivasan <usrinivasan@twitter.com> writes:
>
>> git_dir = read_gitfile(buf.buf);
>> if (!git_dir)
>>
>>                 git_dir = buf.buf;
>>
>> Can anyone explain to me why we are replacing a failed reading of a
>> git file with the original sub directory name?
>
> A top-level superproject can have a submodule bound at its "dir/"
> directory, and "dir/.git" can either be a gitfile which you can read
> with read_gitfile() and point into somewhere in ".git/modules/" of
> the top-level superproject.  "dir/.git" can _ALSO_ be a fully valid
> Git directory.  So at the top of a superproject, you could do
>
>         git clone $URL ./dir2
>         git add dir2
>
> to clone an independent project into dir2 directory, and add it as a
> new submodule.  The fallback is to support such a layout.
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-29 21:03   ` Uma Srinivasan
@ 2016-08-29 21:09     ` Junio C Hamano
  2016-08-29 21:13       ` Uma Srinivasan
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2016-08-29 21:09 UTC (permalink / raw)
  To: Uma Srinivasan; +Cc: Git Mailing List

On Mon, Aug 29, 2016 at 2:03 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
> On Mon, Aug 29, 2016 at 1:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> A top-level superproject can have a submodule bound at its "dir/"
>> directory, and "dir/.git" can either be a gitfile which you can read
>> with read_gitfile() and point into somewhere in ".git/modules/" of
>> the top-level superproject.  "dir/.git" can _ALSO_ be a fully valid
>> Git directory.  So at the top of a superproject, you could do
>>
>>         git clone $URL ./dir2
>>         git add dir2
>>
>> to clone an independent project into dir2 directory, and add it as a
>> new submodule.  The fallback is to support such a layout.
>>
> Thanks for the reply. However, in this case....
>
>       git clone $URL ./dir2
>       git add dir2
>
> how will "dir2" get ever get registered as a submodule?

With a separate invocation of "git config -f .gitmodules", of course.
The layout to use gitfile to point into .git/modules/ is a more recent
invention than the submodule support itself that "git add" knows about.
The code needs to support both layout as well as it can, and that
is what the "can we read it as gitfile?  If not, that directory itself may
be a git repository" codepath you asked about is doing.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-29 21:09     ` Junio C Hamano
@ 2016-08-29 21:13       ` Uma Srinivasan
  2016-08-29 23:04         ` Uma Srinivasan
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-29 21:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

Ok that makes sense. Thanks much.

Uma

On Mon, Aug 29, 2016 at 2:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
> On Mon, Aug 29, 2016 at 2:03 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
>> On Mon, Aug 29, 2016 at 1:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>>
>>> A top-level superproject can have a submodule bound at its "dir/"
>>> directory, and "dir/.git" can either be a gitfile which you can read
>>> with read_gitfile() and point into somewhere in ".git/modules/" of
>>> the top-level superproject.  "dir/.git" can _ALSO_ be a fully valid
>>> Git directory.  So at the top of a superproject, you could do
>>>
>>>         git clone $URL ./dir2
>>>         git add dir2
>>>
>>> to clone an independent project into dir2 directory, and add it as a
>>> new submodule.  The fallback is to support such a layout.
>>>
>> Thanks for the reply. However, in this case....
>>
>>       git clone $URL ./dir2
>>       git add dir2
>>
>> how will "dir2" get ever get registered as a submodule?
>
> With a separate invocation of "git config -f .gitmodules", of course.
> The layout to use gitfile to point into .git/modules/ is a more recent
> invention than the submodule support itself that "git add" knows about.
> The code needs to support both layout as well as it can, and that
> is what the "can we read it as gitfile?  If not, that directory itself may
> be a git repository" codepath you asked about is doing.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-29 21:13       ` Uma Srinivasan
@ 2016-08-29 23:04         ` Uma Srinivasan
  2016-08-29 23:15           ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-29 23:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

With respect to my original problem with a corrupted .git directory
under the submodule directory, I am thinking of adding the following 4
lines marked with ### to is_submodule_modified() to detect the
corrupted dir and die quickly instead of forking several child
processes:

               strbuf_addf(&buf, "%s/.git", path);
               git_dir = read_gitfile(buf.buf);
               if (!git_dir) {
                 ### strbuf_addf(&head_ref, "%s/HEAD",buf.buf);
                 ### if (strbuf_read_file(&temp_ref, head_ref.buf,0) < 0) {
                          ### die("Corrupted .git dir in submodule %s", path);
                 ###}
                         git_dir = buf.buf;
              }

This fixes my issue but what do you think? Is this the right way to
fix it? Is there a better way?

Thanks,
Uma

On Mon, Aug 29, 2016 at 2:13 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
> Ok that makes sense. Thanks much.
>
> Uma
>
> On Mon, Aug 29, 2016 at 2:09 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> On Mon, Aug 29, 2016 at 2:03 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
>>> On Mon, Aug 29, 2016 at 1:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>>>
>>>> A top-level superproject can have a submodule bound at its "dir/"
>>>> directory, and "dir/.git" can either be a gitfile which you can read
>>>> with read_gitfile() and point into somewhere in ".git/modules/" of
>>>> the top-level superproject.  "dir/.git" can _ALSO_ be a fully valid
>>>> Git directory.  So at the top of a superproject, you could do
>>>>
>>>>         git clone $URL ./dir2
>>>>         git add dir2
>>>>
>>>> to clone an independent project into dir2 directory, and add it as a
>>>> new submodule.  The fallback is to support such a layout.
>>>>
>>> Thanks for the reply. However, in this case....
>>>
>>>       git clone $URL ./dir2
>>>       git add dir2
>>>
>>> how will "dir2" get ever get registered as a submodule?
>>
>> With a separate invocation of "git config -f .gitmodules", of course.
>> The layout to use gitfile to point into .git/modules/ is a more recent
>> invention than the submodule support itself that "git add" knows about.
>> The code needs to support both layout as well as it can, and that
>> is what the "can we read it as gitfile?  If not, that directory itself may
>> be a git repository" codepath you asked about is doing.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-29 23:04         ` Uma Srinivasan
@ 2016-08-29 23:15           ` Junio C Hamano
  2016-08-29 23:34             ` Uma Srinivasan
  2016-08-30  0:02             ` Jacob Keller
  0 siblings, 2 replies; 31+ messages in thread
From: Junio C Hamano @ 2016-08-29 23:15 UTC (permalink / raw)
  To: Uma Srinivasan; +Cc: Git Mailing List, Jens Lehmann, Heiko Voigt, Stefan Beller

Uma Srinivasan <usrinivasan@twitter.com> writes:

> On Mon, Aug 29, 2016 at 2:13 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
>> Ok that makes sense. Thanks much.
>>
>> Uma
>>
> With respect to my original problem with a corrupted .git directory
> under the submodule directory, I am thinking of adding the following 4
> lines marked with ### to is_submodule_modified() to detect the
> corrupted dir and die quickly instead of forking several child
> processes:
>
>                strbuf_addf(&buf, "%s/.git", path);
>                git_dir = read_gitfile(buf.buf);
>                if (!git_dir) {
>                  ### strbuf_addf(&head_ref, "%s/HEAD",buf.buf);
>                  ### if (strbuf_read_file(&temp_ref, head_ref.buf,0) < 0) {
>                           ### die("Corrupted .git dir in submodule %s", path);
>                  ###}
>                          git_dir = buf.buf;
>               }
>
> This fixes my issue but what do you think? Is this the right way to
> fix it? Is there a better way?

I think we already have a helper function that does a lot better
than "does it have a file called HEAD" to ask "is this a git
directory?" and its name I think is "is_git_directory" (I know, we
are not imaginative when naming our functions).

As to the check makes sense in the context of this function, I am
not an expert to judge.  I'd expect Jens, Heiko and/or Stefan to
know better than I do.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-29 23:15           ` Junio C Hamano
@ 2016-08-29 23:34             ` Uma Srinivasan
  2016-08-30  0:02             ` Jacob Keller
  1 sibling, 0 replies; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-29 23:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Jens Lehmann, Heiko Voigt, Stefan Beller

Yes, is_git_directory() is much better. Thanks for the pointer.

I will submit a patch unless I hear more suggestions from others.

Uma



On Mon, Aug 29, 2016 at 4:15 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Uma Srinivasan <usrinivasan@twitter.com> writes:
>
>> On Mon, Aug 29, 2016 at 2:13 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
>>> Ok that makes sense. Thanks much.
>>>
>>> Uma
>>>
>> With respect to my original problem with a corrupted .git directory
>> under the submodule directory, I am thinking of adding the following 4
>> lines marked with ### to is_submodule_modified() to detect the
>> corrupted dir and die quickly instead of forking several child
>> processes:
>>
>>                strbuf_addf(&buf, "%s/.git", path);
>>                git_dir = read_gitfile(buf.buf);
>>                if (!git_dir) {
>>                  ### strbuf_addf(&head_ref, "%s/HEAD",buf.buf);
>>                  ### if (strbuf_read_file(&temp_ref, head_ref.buf,0) < 0) {
>>                           ### die("Corrupted .git dir in submodule %s", path);
>>                  ###}
>>                          git_dir = buf.buf;
>>               }
>>
>> This fixes my issue but what do you think? Is this the right way to
>> fix it? Is there a better way?
>
> I think we already have a helper function that does a lot better
> than "does it have a file called HEAD" to ask "is this a git
> directory?" and its name I think is "is_git_directory" (I know, we
> are not imaginative when naming our functions).
>
> As to the check makes sense in the context of this function, I am
> not an expert to judge.  I'd expect Jens, Heiko and/or Stefan to
> know better than I do.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-29 23:15           ` Junio C Hamano
  2016-08-29 23:34             ` Uma Srinivasan
@ 2016-08-30  0:02             ` Jacob Keller
  2016-08-30  0:12               ` Uma Srinivasan
  1 sibling, 1 reply; 31+ messages in thread
From: Jacob Keller @ 2016-08-30  0:02 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Uma Srinivasan, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

On Mon, Aug 29, 2016 at 4:15 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Uma Srinivasan <usrinivasan@twitter.com> writes:
>> This fixes my issue but what do you think? Is this the right way to
>> fix it? Is there a better way?
>
> I think we already have a helper function that does a lot better
> than "does it have a file called HEAD" to ask "is this a git
> directory?" and its name I think is "is_git_directory" (I know, we
> are not imaginative when naming our functions).
>
> As to the check makes sense in the context of this function, I am
> not an expert to judge.  I'd expect Jens, Heiko and/or Stefan to
> know better than I do.

One of my patches adds a "is_git_directory()" call to this, and if we
fail falls back to checking the .gitmodules and git-config for
information regarding the submodule should it no longer be checked
out. I suspect this patch will address your concern.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-30  0:02             ` Jacob Keller
@ 2016-08-30  0:12               ` Uma Srinivasan
  2016-08-30  6:09                 ` Jacob Keller
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-30  0:12 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Junio C Hamano, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

This is great! Thanks Jake. If you happen to have the patch ID it
would be helpful.

Uma

On Mon, Aug 29, 2016 at 5:02 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
> On Mon, Aug 29, 2016 at 4:15 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Uma Srinivasan <usrinivasan@twitter.com> writes:
>>> This fixes my issue but what do you think? Is this the right way to
>>> fix it? Is there a better way?
>>
>> I think we already have a helper function that does a lot better
>> than "does it have a file called HEAD" to ask "is this a git
>> directory?" and its name I think is "is_git_directory" (I know, we
>> are not imaginative when naming our functions).
>>
>> As to the check makes sense in the context of this function, I am
>> not an expert to judge.  I'd expect Jens, Heiko and/or Stefan to
>> know better than I do.
>
> One of my patches adds a "is_git_directory()" call to this, and if we
> fail falls back to checking the .gitmodules and git-config for
> information regarding the submodule should it no longer be checked
> out. I suspect this patch will address your concern.
>
> Thanks,
> Jake

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-30  0:12               ` Uma Srinivasan
@ 2016-08-30  6:09                 ` Jacob Keller
  2016-08-30  6:23                   ` Jacob Keller
  0 siblings, 1 reply; 31+ messages in thread
From: Jacob Keller @ 2016-08-30  6:09 UTC (permalink / raw)
  To: Uma Srinivasan
  Cc: Junio C Hamano, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

On Mon, Aug 29, 2016 at 5:12 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
> This is great! Thanks Jake. If you happen to have the patch ID it
> would be helpful.
>
> Uma
>

http://public-inbox.org/git/1472236108.28343.5.camel@intel.com/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-30  6:09                 ` Jacob Keller
@ 2016-08-30  6:23                   ` Jacob Keller
  2016-08-30 17:40                     ` Uma Srinivasan
  0 siblings, 1 reply; 31+ messages in thread
From: Jacob Keller @ 2016-08-30  6:23 UTC (permalink / raw)
  To: Uma Srinivasan
  Cc: Junio C Hamano, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

On Mon, Aug 29, 2016 at 11:09 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
> On Mon, Aug 29, 2016 at 5:12 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
>> This is great! Thanks Jake. If you happen to have the patch ID it
>> would be helpful.
>>
>> Uma
>>
>
> http://public-inbox.org/git/1472236108.28343.5.camel@intel.com/


Actually correct patch is
http://public-inbox.org/git/20160825233243.30700-6-jacob.e.keller@intel.com/

Thanks,
Jake

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-30  6:23                   ` Jacob Keller
@ 2016-08-30 17:40                     ` Uma Srinivasan
  2016-08-30 17:53                       ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-30 17:40 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Junio C Hamano, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

Thanks for the patch. Unfortunately, it doesn't help in my case as it
invokes the is_submodule_modified() routine which you didn't modify.
Here's my call trace....

#0  is_submodule_modified (path=path@entry=0x17c2f08 "groc", ignore_untracked=0)
    at submodule.c:939
#1  0x00000000004aa4dc in match_stat_with_submodule (
    diffopt=diffopt@entry=0x7fffffffde18, ce=ce@entry=0x17c2eb0,
    st=st@entry=0x7fffffffd840, ce_option=ce_option@entry=0,
    dirty_submodule=dirty_submodule@entry=0x7fffffffd83c) at diff-lib.c:81
#2  0x00000000004ab4f5 in run_diff_files (revs=revs@entry=0x7fffffffd920,
    option=option@entry=0) at diff-lib.c:217
#3  0x000000000054c0d4 in wt_status_collect_changes_worktree
(s=s@entry=0x7de280 <s>)
    at wt-status.c:559
#4  0x000000000054ecf6 in wt_status_collect (s=s@entry=0x7de280 <s>)
at wt-status.c:678
#5  0x0000000000422171 in cmd_status (argc=<optimized out>,
argv=<optimized out>,
    prefix=0x0) at builtin/commit.c:1390
#6  0x0000000000405abe in run_builtin (argv=<optimized out>,
argc=<optimized out>,
    p=<optimized out>) at git.c:352
#7  handle_builtin (argc=1, argv=0x7fffffffe570) at git.c:551
#8  0x0000000000405dd8 in run_argv (argv=0x7fffffffe320, argcp=0x7fffffffe32c)
    at git.c:606
#9  cmd_main (argc=1, argc@entry=2, argv=0x7fffffffe570,
argv@entry=0x7fffffffe568)
    at git.c:678
#10 0x0000000000405060 in main (argc=2, argv=0x7fffffffe568) at common-main.c:40


I think the following fix is still needed to is_submodule_modified():

        strbuf_addf(&buf, "%s/.git", path);
        git_dir = read_gitfile(buf.buf);
        if (!git_dir) {
                git_dir = buf.buf;
 ==>               if (!is_git_directory(git_dir)) {
 ==>                     die("Corrupted .git dir in submodule %s", path);
 ==>               }
        }


Thanks,
Uma

On Mon, Aug 29, 2016 at 11:23 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
> On Mon, Aug 29, 2016 at 11:09 PM, Jacob Keller <jacob.keller@gmail.com> wrote:
>> On Mon, Aug 29, 2016 at 5:12 PM, Uma Srinivasan <usrinivasan@twitter.com> wrote:
>>> This is great! Thanks Jake. If you happen to have the patch ID it
>>> would be helpful.
>>>
>>> Uma
>>>
>>
>> http://public-inbox.org/git/1472236108.28343.5.camel@intel.com/
>
>
> Actually correct patch is
> http://public-inbox.org/git/20160825233243.30700-6-jacob.e.keller@intel.com/
>
> Thanks,
> Jake

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-30 17:40                     ` Uma Srinivasan
@ 2016-08-30 17:53                       ` Junio C Hamano
  2016-08-31  2:54                         ` Uma Srinivasan
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2016-08-30 17:53 UTC (permalink / raw)
  To: Uma Srinivasan
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

Uma Srinivasan <usrinivasan@twitter.com> writes:

> I think the following fix is still needed to is_submodule_modified():
>
>         strbuf_addf(&buf, "%s/.git", path);
>         git_dir = read_gitfile(buf.buf);
>         if (!git_dir) {
>                 git_dir = buf.buf;
>  ==>               if (!is_git_directory(git_dir)) {
>  ==>                     die("Corrupted .git dir in submodule %s", path);
>  ==>               }
>         }

If it is so important that git_dir is a valid Git
repository after

	git_dir = read_gitfile(buf.buf);
        if (!git_dir)
        	git_dir = buf.buf;

is done to determine what "git_dir" to use, it seems to me that it
does not make much sense to check ONLY dir/.git that is a directory
and leave .git/modules/$name that dir/.git file points at unchecked.

But there is much bigger problem with the above addition, I think.
There also can be a case where dir/ does not even have ".git" in it.
A submodule the user is not interested in will just have an empty
directory there, and immediately after the above three lines I
reproduced above, we have this

	if (!is_directory(git_dir)) {
        	strbuf_release(&buf);
                return 0;
	}

The added check will break the use case.  If anything, that check,
if this code needs to verify that "git_dir" points at a valid Git
repository, should come _after_ that.

Shouldn't "git-status --porcelain" run in the submodule notice that
it is not a valid repository and quickly die anyway?  Why should we
even check before spawning that process in the first place?

I might suggest to update prepare_submodule_repo_env() so that the
spawned process will *NOT* have to guess where the working tree and
repository by exporting GIT_DIR (set to "git_dir" we discover above)
and GIT_WORK_TREE (set to "." as cp.dir is set to the path to the
working tree of the submodule).  That would stop the "git status" to
guess (and fooled by a corrupted dir/.git that is not a git
repository).


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-30 17:53                       ` Junio C Hamano
@ 2016-08-31  2:54                         ` Uma Srinivasan
  2016-08-31 16:42                           ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-31  2:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

On Tue, Aug 30, 2016 at 10:53 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Uma Srinivasan <usrinivasan@twitter.com> writes:
>
>> I think the following fix is still needed to is_submodule_modified():
>>
>>         strbuf_addf(&buf, "%s/.git", path);
>>         git_dir = read_gitfile(buf.buf);
>>         if (!git_dir) {
>>                 git_dir = buf.buf;
>>  ==>               if (!is_git_directory(git_dir)) {
>>  ==>                     die("Corrupted .git dir in submodule %s", path);
>>  ==>               }
>>         }
>
> If it is so important that git_dir is a valid Git
> repository after
>
>         git_dir = read_gitfile(buf.buf);
>         if (!git_dir)
>                 git_dir = buf.buf;
>
> is done to determine what "git_dir" to use, it seems to me that it
> does not make much sense to check ONLY dir/.git that is a directory
> and leave .git/modules/$name that dir/.git file points at unchecked.
>

Okay.

> But there is much bigger problem with the above addition, I think.
> There also can be a case where dir/ does not even have ".git" in it.
> A submodule the user is not interested in will just have an empty
> directory there, and immediately after the above three lines I
> reproduced above, we have this
>
>         if (!is_directory(git_dir)) {
>                 strbuf_release(&buf);
>                 return 0;
>         }
>

Good to know about this use case.

> The added check will break the use case.  If anything, that check,
> if this code needs to verify that "git_dir" points at a valid Git
> repository, should come _after_ that.

Okay.

>
> Shouldn't "git-status --porcelain" run in the submodule notice that
> it is not a valid repository and quickly die anyway?  Why should we
> even check before spawning that process in the first place?

I don't see any reason to disagree with this.

>
> I might suggest to update prepare_submodule_repo_env() so that the
> spawned process will *NOT* have to guess where the working tree and
> repository by exporting GIT_DIR (set to "git_dir" we discover above)
> and GIT_WORK_TREE (set to "." as cp.dir is set to the path to the
> working tree of the submodule).  That would stop the "git status" to
> guess (and fooled by a corrupted dir/.git that is not a git
> repository).
>

Here's where I am struggling with my lack of knowledge of git
internals and the implementation particularly in the context of how
environment variables are passed from the parent to the child process.

Are you suggesting that we set up the child process environment array
(the "out" argument) in prepare_submodule_repo_env() to include
GIT_DIR_ENVIRONMENT and GIT_WORK_TREE_ENVIRONMENT in addition to
CONFIG_DATA_ENVIRONMENT that is there now?  Can I use the strcmp and
argv_array_push() calls to do this? There are several callers for this
routine prepare_submodule_repo_env(). Would the caller pass the git
dir and work tree as arguments to this routine or would the caller set
up the environment variables as needed? Is there any documentation or
other example code where similar passing of env. vars to a child
process is being done?

Sorry for all these questions but I'm still a novice as far as the
code is concerned.

Thanks for your help.

Uma

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-31  2:54                         ` Uma Srinivasan
@ 2016-08-31 16:42                           ` Junio C Hamano
  2016-08-31 18:40                             ` Uma Srinivasan
  2016-09-01  4:09                             ` Junio C Hamano
  0 siblings, 2 replies; 31+ messages in thread
From: Junio C Hamano @ 2016-08-31 16:42 UTC (permalink / raw)
  To: Uma Srinivasan
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

Uma Srinivasan <usrinivasan@twitter.com> writes:

>> I might suggest to update prepare_submodule_repo_env() so that the
>> spawned process will *NOT* have to guess where the working tree and
>> repository by exporting GIT_DIR (set to "git_dir" we discover above)
>> and GIT_WORK_TREE (set to "." as cp.dir is set to the path to the
>> working tree of the submodule).  That would stop the "git status" to
>> guess (and fooled by a corrupted dir/.git that is not a git
>> repository).
>
> Here's where I am struggling with my lack of knowledge of git
> internals and the implementation particularly in the context of how
> environment variables are passed from the parent to the child process.

Ah, I was primarily addressing Jacob in the latter part of my
message, as he's looked at similar things in his recent topic.

> Are you suggesting that we set up the child process environment array
> (the "out" argument) in prepare_submodule_repo_env() to include
> GIT_DIR_ENVIRONMENT and GIT_WORK_TREE_ENVIRONMENT in addition to
> CONFIG_DATA_ENVIRONMENT that is there now?

I was wondering if we should unconditionally stuff GIT_DIR=<the
repository location for the submodule> in the cp.env_array passed to
the function prepare_submodule_repo_env().  As cp.dir will be set to
the submodule's working tree, there is no need to set GIT_WORK_TREE
and export it, I think, although it would not hurt.

After all, we _are_ going into a separate and different project's
repository (that is what a submodule is), and we _know_ where its
repository data (i.e. GIT_DIR) and the location of its working tree
(i.e. GIT_WORK_TREE).  There is no reason for the process that will
work in the submodule to go through the usual "do we have .git in
our $cwd that is a repository?  if not how about the parent directory
of $cwd?  go up until we find one and that directory is the top of
the working tree" discovery.

More importantly, this matters when your GIT_DIR for the submodule
is somehow corrupt.  The discovery process would say "there is .git
in $cwd but that is not a repository" and continue upwards, which
likely would find the repository for the top-level superproject,
which definitely is _not_ want you want to happen.  And one way to
avoid it is to tell the setup.c code that it should not do the
discovery by being explicit.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-31 16:42                           ` Junio C Hamano
@ 2016-08-31 18:40                             ` Uma Srinivasan
  2016-08-31 18:44                               ` Junio C Hamano
  2016-09-01  4:09                             ` Junio C Hamano
  1 sibling, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-31 18:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

FWIW, I tried the following quick change based on the info. provided
by Junio and it seems to "fix" my original problem. I'll let you
experts figure out if we need a more complete fix or not.

diff --git a/submodule.c b/submodule.c
index 5a62aa2..23443a7 100644
--- a/submodule.c
+++ b/submodule.c
@@ -960,6 +960,9 @@ unsigned is_submodule_modified(const char *path,
int ignore_untracked)
                return 0;

        }
+       /* stuff submodule git dir into env var */
+       set_git_dir(git_dir);
+
        strbuf_reset(&buf);

        if (ignore_untracked)
@@ -1279,5 +1282,9 @@ void prepare_submodule_repo_env(struct argv_array *out)
        for (var = local_repo_env; *var; var++) {
                if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
                        argv_array_push(out, *var);
+               if (strcmp(*var, GIT_DIR_ENVIRONMENT))
+                       argv_array_pushf(out, "%s=%s", GIT_DIR_ENVIRONMENT,
+                                        get_git_dir());
        }
+
 }

Thanks,
Uma

On Wed, Aug 31, 2016 at 9:42 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Uma Srinivasan <usrinivasan@twitter.com> writes:
>
>>> I might suggest to update prepare_submodule_repo_env() so that the
>>> spawned process will *NOT* have to guess where the working tree and
>>> repository by exporting GIT_DIR (set to "git_dir" we discover above)
>>> and GIT_WORK_TREE (set to "." as cp.dir is set to the path to the
>>> working tree of the submodule).  That would stop the "git status" to
>>> guess (and fooled by a corrupted dir/.git that is not a git
>>> repository).
>>
>> Here's where I am struggling with my lack of knowledge of git
>> internals and the implementation particularly in the context of how
>> environment variables are passed from the parent to the child process.
>
> Ah, I was primarily addressing Jacob in the latter part of my
> message, as he's looked at similar things in his recent topic.
>
>> Are you suggesting that we set up the child process environment array
>> (the "out" argument) in prepare_submodule_repo_env() to include
>> GIT_DIR_ENVIRONMENT and GIT_WORK_TREE_ENVIRONMENT in addition to
>> CONFIG_DATA_ENVIRONMENT that is there now?
>
> I was wondering if we should unconditionally stuff GIT_DIR=<the
> repository location for the submodule> in the cp.env_array passed to
> the function prepare_submodule_repo_env().  As cp.dir will be set to
> the submodule's working tree, there is no need to set GIT_WORK_TREE
> and export it, I think, although it would not hurt.
>
> After all, we _are_ going into a separate and different project's
> repository (that is what a submodule is), and we _know_ where its
> repository data (i.e. GIT_DIR) and the location of its working tree
> (i.e. GIT_WORK_TREE).  There is no reason for the process that will
> work in the submodule to go through the usual "do we have .git in
> our $cwd that is a repository?  if not how about the parent directory
> of $cwd?  go up until we find one and that directory is the top of
> the working tree" discovery.
>
> More importantly, this matters when your GIT_DIR for the submodule
> is somehow corrupt.  The discovery process would say "there is .git
> in $cwd but that is not a repository" and continue upwards, which
> likely would find the repository for the top-level superproject,
> which definitely is _not_ want you want to happen.  And one way to
> avoid it is to tell the setup.c code that it should not do the
> discovery by being explicit.
>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-31 18:40                             ` Uma Srinivasan
@ 2016-08-31 18:44                               ` Junio C Hamano
  2016-08-31 18:58                                 ` Uma Srinivasan
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2016-08-31 18:44 UTC (permalink / raw)
  To: Uma Srinivasan
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

Uma Srinivasan <usrinivasan@twitter.com> writes:

> diff --git a/submodule.c b/submodule.c
> index 5a62aa2..23443a7 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -960,6 +960,9 @@ unsigned is_submodule_modified(const char *path,
> int ignore_untracked)
>                 return 0;
>
>         }
> +       /* stuff submodule git dir into env var */
> +       set_git_dir(git_dir);

We want to affect only the process we are going to spawn to work
inside the submodule, not ourselves, which is what this call does;
this does not sound like a good idea.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-31 18:44                               ` Junio C Hamano
@ 2016-08-31 18:58                                 ` Uma Srinivasan
  2016-09-01  1:04                                   ` Uma Srinivasan
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-08-31 18:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

> We want to affect only the process we are going to spawn to work
> inside the submodule, not ourselves, which is what this call does;
> this does not sound like a good idea.

Okay, in that case I would have to pass the "git_dir" as a new
argument to prepare_submodule_repo_env(). I know what to pass from the
is_submodule_modified() caller. I don't think it's all that obvious
for the other callers.

Thanks,
Uma

On Wed, Aug 31, 2016 at 11:44 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Uma Srinivasan <usrinivasan@twitter.com> writes:
>
>> diff --git a/submodule.c b/submodule.c
>> index 5a62aa2..23443a7 100644
>> --- a/submodule.c
>> +++ b/submodule.c
>> @@ -960,6 +960,9 @@ unsigned is_submodule_modified(const char *path,
>> int ignore_untracked)
>>                 return 0;
>>
>>         }
>> +       /* stuff submodule git dir into env var */
>> +       set_git_dir(git_dir);
>
> We want to affect only the process we are going to spawn to work
> inside the submodule, not ourselves, which is what this call does;
> this does not sound like a good idea.
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-31 18:58                                 ` Uma Srinivasan
@ 2016-09-01  1:04                                   ` Uma Srinivasan
  0 siblings, 0 replies; 31+ messages in thread
From: Uma Srinivasan @ 2016-09-01  1:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

Here's one more attempt where I changed prepare_submodule_repo_env()
to take the submodule git dir as an argument.
Sorry for including this long code diff inline. If there's a better
way to have this discussion with code please let me know.

Thanks,
Uma

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 9d79f19..0741f6c 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -465,7 +465,7 @@ static int clone_submodule(const char *path, const
char *gitdir, const char *url
        argv_array_push(&cp.args, path);

        cp.git_cmd = 1;
-       prepare_submodule_repo_env(&cp.env_array);
+       prepare_submodule_repo_env(&cp.env_array, gitdir);
        cp.no_stdin = 1;

        return run_command(&cp);
diff --git a/submodule.c b/submodule.c
index 5a62aa2..3d9174a 100644
--- a/submodule.c
+++ b/submodule.c
@@ -522,11 +522,31 @@ static int has_remote(const char *refname, const
struct object_id *oid,
        return 1;
 }

+static const char *submodule_git_dir = NULL;
+const char *get_submodule_git_dir(const char *path, struct strbuf *bufptr)
+{
+       strbuf_addf(bufptr, "%s/.git", path);
+       submodule_git_dir = read_gitfile(bufptr->buf);
+       if (!submodule_git_dir)
+               submodule_git_dir = bufptr->buf;
+       if (!is_directory(submodule_git_dir)) {
+               return NULL;
+       }
+       return submodule_git_dir;
+}
+
 static int submodule_needs_pushing(const char *path, const unsigned
char sha1[20])
 {
        if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
                return 0;

+       struct strbuf gitdirbuf = STRBUF_INIT;
+        const char *sm_gitdir = get_submodule_git_dir(path, &gitdirbuf);
+       if (sm_gitdir == NULL) {
+               strbuf_release(&gitdirbuf);
+               return 0;
+       }
        if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
                struct child_process cp = CHILD_PROCESS_INIT;
                const char *argv[] = {"rev-list", NULL, "--not",
"--remotes", "-n", "1" , NULL};
@@ -535,7 +555,7 @@ static int submodule_needs_pushing(const char
*path, const unsigned char sha1[20

                argv[1] = sha1_to_hex(sha1);
                cp.argv = argv;
-               prepare_submodule_repo_env(&cp.env_array);
+               prepare_submodule_repo_env(&cp.env_array, sm_gitdir);
                cp.git_cmd = 1;
                cp.no_stdin = 1;
                cp.out = -1;
@@ -551,6 +571,7 @@ static int submodule_needs_pushing(const char
*path, const unsigned char sha1[20
                return needs_pushing;
        }

+       strbuf_release(&gitdirbuf);
        return 0;
 }

@@ -617,12 +638,18 @@ static int push_submodule(const char *path)
        if (add_submodule_odb(path))
                return 1;

+       struct strbuf gitdirbuf = STRBUF_INIT;
+        const char *sm_gitdir = get_submodule_git_dir(path, &gitdirbuf);
+       if (sm_gitdir == NULL) {
+               strbuf_release(&gitdirbuf);
+               return 0;
+       }
        if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
                struct child_process cp = CHILD_PROCESS_INIT;
                const char *argv[] = {"push", NULL};

                cp.argv = argv;
-               prepare_submodule_repo_env(&cp.env_array);
+               prepare_submodule_repo_env(&cp.env_array, sm_gitdir);
                cp.git_cmd = 1;
                cp.no_stdin = 1;
                cp.dir = path;
@@ -631,6 +658,7 @@ static int push_submodule(const char *path)
                close(cp.out);
        }

+       strbuf_release(&gitdirbuf);
        return 1;
 }

@@ -665,10 +693,16 @@ static int is_submodule_commit_present(const
char *path, unsigned char sha1[20])
                struct child_process cp = CHILD_PROCESS_INIT;
                const char *argv[] = {"rev-list", "-n", "1", NULL,
"--not", "--all", NULL};
                struct strbuf buf = STRBUF_INIT;
+               struct strbuf gitdirbuf = STRBUF_INIT;
+               const char *sm_gitdir = get_submodule_git_dir(path, &gitdirbuf);
+               if (sm_gitdir == NULL) {
+                       strbuf_release(&gitdirbuf);
+                       return 0;
+               }

                argv[3] = sha1_to_hex(sha1);
                cp.argv = argv;
-               prepare_submodule_repo_env(&cp.env_array);
+               prepare_submodule_repo_env(&cp.env_array, sm_gitdir);
                cp.git_cmd = 1;
                cp.no_stdin = 1;
                cp.dir = path;
@@ -676,6 +710,7 @@ static int is_submodule_commit_present(const char
*path, unsigned char sha1[20])
                        is_present = 1;

                strbuf_release(&buf);
+               strbuf_release(&gitdirbuf);
        }
        return is_present;
 }
@@ -851,7 +886,7 @@ static int get_next_submodule(struct child_process *cp,
                if (is_directory(git_dir)) {
                        child_process_init(cp);
                        cp->dir = strbuf_detach(&submodule_path, NULL);
-                       prepare_submodule_repo_env(&cp->env_array);
+                       prepare_submodule_repo_env(&cp->env_array, git_dir);
                        cp->git_cmd = 1;
                        if (!spf->quiet)
                                strbuf_addf(err, "Fetching submodule %s%s\n",
@@ -958,15 +993,14 @@ unsigned is_submodule_modified(const char *path,
int ignore_untracked)
                strbuf_release(&buf);
                /* The submodule is not checked out, so it is not modified */
                return 0;
-
        }
-       strbuf_reset(&buf);

        if (ignore_untracked)
                argv[2] = "-uno";

        cp.argv = argv;
-       prepare_submodule_repo_env(&cp.env_array);
+       prepare_submodule_repo_env(&cp.env_array, git_dir);
+       strbuf_reset(&buf);
        cp.git_cmd = 1;
        cp.no_stdin = 1;
        cp.out = -1;
@@ -1023,11 +1057,11 @@ int submodule_uses_gitfile(const char *path)
                strbuf_release(&buf);
                return 0;
        }
-       strbuf_release(&buf);

        /* Now test that all nested submodules use a gitfile too */
        cp.argv = argv;
-       prepare_submodule_repo_env(&cp.env_array);
+       prepare_submodule_repo_env(&cp.env_array, git_dir);
+       strbuf_release(&buf);
        cp.git_cmd = 1;
        cp.no_stdin = 1;
        cp.no_stderr = 1;
@@ -1052,6 +1086,7 @@ int ok_to_remove_submodule(const char *path)
        };
        struct strbuf buf = STRBUF_INIT;
        int ok_to_remove = 1;
+       const char *git_dir;

        if (!file_exists(path) || is_empty_dir(path))
                return 1;
@@ -1060,7 +1095,10 @@ int ok_to_remove_submodule(const char *path)
                return 0;

        cp.argv = argv;
-       prepare_submodule_repo_env(&cp.env_array);
+       strbuf_addf(&buf, "%s/.git", path);
+       git_dir = read_gitfile(buf.buf);
+       prepare_submodule_repo_env(&cp.env_array, git_dir);
+       strbuf_release(&buf);
        cp.git_cmd = 1;
        cp.no_stdin = 1;
        cp.out = -1;
@@ -1272,12 +1310,16 @@ int parallel_submodules(void)
        return parallel_jobs;
 }

-void prepare_submodule_repo_env(struct argv_array *out)
+void prepare_submodule_repo_env(struct argv_array *out, const char* git_dir)
 {
        const char * const *var;

        for (var = local_repo_env; *var; var++) {
                if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
                        argv_array_push(out, *var);
+               if (strcmp(*var, GIT_DIR_ENVIRONMENT))
+                       argv_array_pushf(out, "%s=%s", GIT_DIR_ENVIRONMENT,
+                                        real_path(git_dir));
        }
+
 }
diff --git a/submodule.h b/submodule.h
index d9e197a..4f8b0c7 100644
--- a/submodule.h
+++ b/submodule.h
@@ -73,6 +73,6 @@ int parallel_submodules(void);
  * a submodule by clearing any repo-specific envirionment variables, but
  * retaining any config in the environment.
  */
-void prepare_submodule_repo_env(struct argv_array *out);
+void prepare_submodule_repo_env(struct argv_array *out, const char *git_dir);

On Wed, Aug 31, 2016 at 11:58 AM, Uma Srinivasan
<usrinivasan@twitter.com> wrote:
>> We want to affect only the process we are going to spawn to work
>> inside the submodule, not ourselves, which is what this call does;
>> this does not sound like a good idea.
>
> Okay, in that case I would have to pass the "git_dir" as a new
> argument to prepare_submodule_repo_env(). I know what to pass from the
> is_submodule_modified() caller. I don't think it's all that obvious
> for the other callers.
>
> Thanks,
> Uma
>
> On Wed, Aug 31, 2016 at 11:44 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Uma Srinivasan <usrinivasan@twitter.com> writes:
>>
>>> diff --git a/submodule.c b/submodule.c
>>> index 5a62aa2..23443a7 100644
>>> --- a/submodule.c
>>> +++ b/submodule.c
>>> @@ -960,6 +960,9 @@ unsigned is_submodule_modified(const char *path,
>>> int ignore_untracked)
>>>                 return 0;
>>>
>>>         }
>>> +       /* stuff submodule git dir into env var */
>>> +       set_git_dir(git_dir);
>>
>> We want to affect only the process we are going to spawn to work
>> inside the submodule, not ourselves, which is what this call does;
>> this does not sound like a good idea.
>>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-08-31 16:42                           ` Junio C Hamano
  2016-08-31 18:40                             ` Uma Srinivasan
@ 2016-09-01  4:09                             ` Junio C Hamano
  2016-09-01 16:05                               ` Uma Srinivasan
  1 sibling, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2016-09-01  4:09 UTC (permalink / raw)
  To: Uma Srinivasan
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

Junio C Hamano <gitster@pobox.com> writes:

> I was wondering if we should unconditionally stuff GIT_DIR=<the
> repository location for the submodule> in the cp.env_array passed to
> the function prepare_submodule_repo_env().  As cp.dir will be set to
> the submodule's working tree, there is no need to set GIT_WORK_TREE
> and export it, I think, although it would not hurt.

After checking all callers of prepare_submodule_repo_env() and saw
that cp.dir is set to the top of working directory for the
submodule, I wonder if something as simple and stupid like the
attached patch is sufficient.  Our subprocess will go there, and
there should be a .git embedded directory or a .git file that points
into .git/modules/* in the superproject, and either way, it should
use .git directly underneath it.

 submodule.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/submodule.c b/submodule.c
index 1b5cdfb..e8258f0 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
 		if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
 			argv_array_push(out, *var);
 	}
+	argv_array_push(out, "GIT_DIR=.git");
 }


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01  4:09                             ` Junio C Hamano
@ 2016-09-01 16:05                               ` Uma Srinivasan
  2016-09-01 18:32                                 ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-09-01 16:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

Yes, this one line fix addresses my problem.

So, what is the next step? Will someone submit a patch or should I?
Please note that I've never submitted a patch before, but I don't mind
learning how to.

Thanks,
Uma

> --- a/submodule.c
> +++ b/submodule.c
> @@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
>                 if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
>                         argv_array_push(out, *var);
>         }
> +       argv_array_push(out, "GIT_DIR=.git");
>  }
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 16:05                               ` Uma Srinivasan
@ 2016-09-01 18:32                                 ` Junio C Hamano
  2016-09-01 18:37                                   ` Stefan Beller
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2016-09-01 18:32 UTC (permalink / raw)
  To: Uma Srinivasan
  Cc: Jacob Keller, Git Mailing List, Jens Lehmann, Heiko Voigt,
	Stefan Beller

Uma Srinivasan <usrinivasan@twitter.com> writes:

> Yes, this one line fix addresses my problem.
>
> So, what is the next step? Will someone submit a patch or should I?
> Please note that I've never submitted a patch before, but I don't mind
> learning how to.

The final version needs to be accompanied with tests to show the
effect of this change for callers.  A test would set up a top-level
and submodule, deliberately break submodule/.git/ repository and
show what breaks and how without this change.

For example, there is a call in ok_to_remove_submodule() to run "git
status" in the submodule directory.  A test writer needs to find who
calls ok_to_remove_submodule() in what codepath (e.g. builtin/rm.c).
Then after figuring out under what condition the check is triggered,
write a test that runs "git rm" in a way that this change makes a
difference.  Hopefully, the test will fail without this change, and
will pass with this change.

So it is a bit more involved than just a single liner patch with one
paragraph log message.  I may be able to find or make some time
after I tag the 2.10 final this weekend to do so myself.

>> --- a/submodule.c
>> +++ b/submodule.c
>> @@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
>>                 if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
>>                         argv_array_push(out, *var);
>>         }
>> +       argv_array_push(out, "GIT_DIR=.git");
>>  }
>>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 18:32                                 ` Junio C Hamano
@ 2016-09-01 18:37                                   ` Stefan Beller
  2016-09-01 19:19                                     ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Stefan Beller @ 2016-09-01 18:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Uma Srinivasan, Jacob Keller, Git Mailing List, Jens Lehmann,
	Heiko Voigt

> The final version needs to be accompanied with tests to show the
> effect of this change for callers.  A test would set up a top-level
> and submodule, deliberately break submodule/.git/ repository and
> show what breaks and how without this change.

Tests are really good at providing this context as well, or to communicate
the actual underlying problem, which is not quite clear to me.
That is why I refrained from jumping into the discussion as I think the
first few emails were dropped from the mailing list and I am missing context.

>
> So it is a bit more involved than just a single liner patch with one
> paragraph log message.  I may be able to find or make some time
> after I tag the 2.10 final this weekend to do so myself.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 18:37                                   ` Stefan Beller
@ 2016-09-01 19:19                                     ` Junio C Hamano
  2016-09-01 19:56                                       ` Uma Srinivasan
  2016-09-01 20:21                                       ` Junio C Hamano
  0 siblings, 2 replies; 31+ messages in thread
From: Junio C Hamano @ 2016-09-01 19:19 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Uma Srinivasan, Jacob Keller, Git Mailing List, Jens Lehmann,
	Heiko Voigt

Stefan Beller <sbeller@google.com> writes:

>> The final version needs to be accompanied with tests to show the
>> effect of this change for callers.  A test would set up a top-level
>> and submodule, deliberately break submodule/.git/ repository and
>> show what breaks and how without this change.
>
> Tests are really good at providing this context as well, or to communicate
> the actual underlying problem, which is not quite clear to me.
> That is why I refrained from jumping into the discussion as I think the
> first few emails were dropped from the mailing list and I am missing context.

I do not know where you started reading, but the gist of it is that
submodule.c spawns subprocess to run in the submodule's context by
assuming that chdir'ing into the <path> of the submodule and running
it (i.e. cp.dir set to <path> to drive start_command(&cp)) is
sufficient.  When <path>/.git (either it is a directory itself or it
points at a directory in .git/module/<name> in the superproject) is
a corrupt repository, running "git -C <path> command" would try to
auto-detect the repository, because it thinks <path>/.git is not a
repository and it thinks it is not at the top-level of the working
tree, and instead finds the repository of the top-level, which is
almost never what we want.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 19:19                                     ` Junio C Hamano
@ 2016-09-01 19:56                                       ` Uma Srinivasan
  2016-09-01 20:29                                         ` Junio C Hamano
  2016-09-01 20:21                                       ` Junio C Hamano
  1 sibling, 1 reply; 31+ messages in thread
From: Uma Srinivasan @ 2016-09-01 19:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, Jacob Keller, Git Mailing List, Jens Lehmann,
	Heiko Voigt

>>> The final version needs to be accompanied with tests to show the
>>> effect of this change for callers.  A test would set up a top-level
>>> and submodule, deliberately break submodule/.git/ repository and
>>> show what breaks and how without this change.

Agreed!

The repo where the original problem surfaced is huge and in fact "git
status" appears to go into an infinite loop forking a zillion
processes on the system. If the dev system is small, it brings it to a
standstill. However, the good news is that I could build myself a
smaller reproducer by doing the following:

1) mkdir sm_test
2) cd sm_test
3) git clone git://git.mysociety.org/commonlib commonlib
4) git init
5) git submodule add ./commonlib/
6) cd commonlib/.git
7) rm -f all files

After this "git status" will fork several thousand processes but it
will ultimately fail as it runs into the path length max limit. I
still don't know why this doesn't happen with my original problem
repo. Perhaps I have to let it run overnight or perhaps it runs into
other system limitations before then....

Anyway, with the fix "git status" fails quickly both in my reproducer
(and original repo) with the following message.....
fatal: Not a git repository: '.git'
fatal: 'git status --porcelain' failed in submodule commonlib

Hope this helps.

Uma


On Thu, Sep 1, 2016 at 12:19 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> The final version needs to be accompanied with tests to show the
>>> effect of this change for callers.  A test would set up a top-level
>>> and submodule, deliberately break submodule/.git/ repository and
>>> show what breaks and how without this change.
>>
>> Tests are really good at providing this context as well, or to communicate
>> the actual underlying problem, which is not quite clear to me.
>> That is why I refrained from jumping into the discussion as I think the
>> first few emails were dropped from the mailing list and I am missing context.
>
> I do not know where you started reading, but the gist of it is that
> submodule.c spawns subprocess to run in the submodule's context by
> assuming that chdir'ing into the <path> of the submodule and running
> it (i.e. cp.dir set to <path> to drive start_command(&cp)) is
> sufficient.  When <path>/.git (either it is a directory itself or it
> points at a directory in .git/module/<name> in the superproject) is
> a corrupt repository, running "git -C <path> command" would try to
> auto-detect the repository, because it thinks <path>/.git is not a
> repository and it thinks it is not at the top-level of the working
> tree, and instead finds the repository of the top-level, which is
> almost never what we want.
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 19:19                                     ` Junio C Hamano
  2016-09-01 19:56                                       ` Uma Srinivasan
@ 2016-09-01 20:21                                       ` Junio C Hamano
  2016-09-01 21:02                                         ` Junio C Hamano
  2016-09-01 21:04                                         ` Stefan Beller
  1 sibling, 2 replies; 31+ messages in thread
From: Junio C Hamano @ 2016-09-01 20:21 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Uma Srinivasan, Jacob Keller, Git Mailing List, Jens Lehmann,
	Heiko Voigt

Junio C Hamano <gitster@pobox.com> writes:

> Stefan Beller <sbeller@google.com> writes:
>
>>> The final version needs to be accompanied with tests to show the
>>> effect of this change for callers.  A test would set up a top-level
>>> and submodule, deliberately break submodule/.git/ repository and
>>> show what breaks and how without this change.
>>
>> Tests are really good at providing this context as well, or to communicate
>> the actual underlying problem, which is not quite clear to me.
>> That is why I refrained from jumping into the discussion as I think the
>> first few emails were dropped from the mailing list and I am missing context.
>
> I do not know where you started reading, but the gist of it is that
> submodule.c spawns subprocess to run in the submodule's context by
> assuming that chdir'ing into the <path> of the submodule and running
> it (i.e. cp.dir set to <path> to drive start_command(&cp)) is
> sufficient.  When <path>/.git (either it is a directory itself or it
> points at a directory in .git/module/<name> in the superproject) is
> a corrupt repository, running "git -C <path> command" would try to
> auto-detect the repository, because it thinks <path>/.git is not a
> repository and it thinks it is not at the top-level of the working
> tree, and instead finds the repository of the top-level, which is
> almost never what we want.

This is with a test that covers the call in get_next_submodule() for
the parallel fetch callback.  I think many of the codepaths will end
up recursing forever the same way without the fix in a submodule
repository that is broken in a similar way, but I didn't check, so
I do not consider this to be completed.

-- >8 --
Subject: submodule: avoid auto-discovery in prepare_submodule_repo_env()

The function is used to set up the environment variable used in a
subprocess we spawn in a submodule directory.  The callers set up a
child_process structure, find the working tree path of one submodule 
and set .dir field to it, and then use start_command() API to spawn
the subprocess like "status", "fetch", etc.

When this happens, we expect that the ".git" (either a directory or
a gitfile that points at the real location) in the current working
directory of the subprocess MUST be the repository for the submodule.

If this ".git" thing is a corrupt repository, however, because
prepare_submodule_repo_env() unsets GIT_DIR and GIT_WORK_TREE, the
subprocess will see ".git", thinks it is not a repository, and
attempt to find one by going up, likely to end up in finding the
repository of the superproject.  In some codepaths, this will cause
a command run with the "--recurse-submodules" option to recurse
forever.

By exporting GIT_DIR=.git, disable the auto-discovery logic in the
subprocess, which would instead stop it and report an error.

Not-signed-off-yet.
---
 submodule.c                 |  1 +
 t/t5526-fetch-submodules.sh | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/submodule.c b/submodule.c
index 1b5cdfb..e8258f0 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
 		if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
 			argv_array_push(out, *var);
 	}
+	argv_array_push(out, "GIT_DIR=.git");
 }
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 954d0e4..b2dee30 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -485,4 +485,33 @@ test_expect_success 'fetching submodules respects parallel settings' '
 	)
 '
 
+test_expect_success 'fetching submodule into a broken repository' '
+	# Prepare src and src/sub nested in it
+	git init src &&
+	(
+		cd src &&
+		git init sub &&
+		git -C sub commit --allow-empty -m "initial in sub" &&
+		git submodule add -- ./sub sub &&
+		git commit -m "initial in top"
+	) &&
+
+	# Clone the old-fashoned way
+	git clone src dst &&
+	git -C dst clone ../src/sub sub &&
+
+	# Make sure that old-fashoned layout is still supported
+	git -C dst status &&
+
+	# Recursive-fetch works fine
+	git -C dst fetch --recurse-submodules &&
+
+	# Break the receiving submodule
+	rm -f dst/sub/.git/HEAD &&
+
+	# Recursive-fetch must terminate
+	# NOTE: without fix this will recurse forever!
+	test_must_fail git -C dst fetch --recurse-submodules
+'
+
 test_done



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 19:56                                       ` Uma Srinivasan
@ 2016-09-01 20:29                                         ` Junio C Hamano
  0 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2016-09-01 20:29 UTC (permalink / raw)
  To: Uma Srinivasan
  Cc: Stefan Beller, Jacob Keller, Git Mailing List, Jens Lehmann,
	Heiko Voigt

Uma Srinivasan <usrinivasan@twitter.com> writes:

> Anyway, with the fix "git status" fails quickly both in my reproducer
> (and original repo) with the following message.....
> fatal: Not a git repository: '.git'
> fatal: 'git status --porcelain' failed in submodule commonlib

Thanks, that is exactly what I wanted to see as an outcome when I
did the one-liner patch.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 20:21                                       ` Junio C Hamano
@ 2016-09-01 21:02                                         ` Junio C Hamano
  2016-09-01 21:04                                         ` Stefan Beller
  1 sibling, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2016-09-01 21:02 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Uma Srinivasan, Jacob Keller, Git Mailing List, Jens Lehmann,
	Heiko Voigt

Junio C Hamano <gitster@pobox.com> writes:

If we add

	# "git diff" should terminate with an error.
	# NOTE: without fix this will recurse forever!
	test_must_fail git -C dst diff &&

after breaking the repository, we can also see "git diff" recurse
forever, because it wants to know if "sub" submodule is modified,
attempts to run "git status" in there, and ends up running that
command in the context of the parent repository.

I am tempted to cheat and commit this, even though this test is no
longer about fetching submodules.

-- >8 --
[PATCH] submodule: avoid auto-discovery in prepare_submodule_repo_env()

The function is used to set up the environment variable used in a
subprocess we spawn in a submodule directory.  The callers set up a
child_process structure, find the working tree path of one submodule
and set .dir field to it, and then use start_command() API to spawn
the subprocess like "status", "fetch", etc.

When this happens, we expect that the ".git" (either a directory or
a gitfile that points at the real location) in the current working
directory of the subprocess MUST be the repository for the submodule.

If this ".git" thing is a corrupt repository, however, because
prepare_submodule_repo_env() unsets GIT_DIR and GIT_WORK_TREE, the
subprocess will see ".git", thinks it is not a repository, and
attempt to find one by going up, likely to end up in finding the
repository of the superproject.  In some codepaths, this will cause
a command run with the "--recurse-submodules" option to recurse
forever.

By exporting GIT_DIR=.git, disable the auto-discovery logic in the
subprocess, which would instead stop it and report an error.

The test illustrates existing problems in a few callsites of this
function.  Without this fix, "git fetch --recurse-submodules", "git
status" and "git diff" keep recursing forever.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 submodule.c                 |  1 +
 t/t5526-fetch-submodules.sh | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/submodule.c b/submodule.c
index 4532b11..2801fbb 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
 		if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
 			argv_array_push(out, *var);
 	}
+	argv_array_push(out, "GIT_DIR=.git");
 }
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 954d0e4..f3b0a8d 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -485,4 +485,39 @@ test_expect_success 'fetching submodules respects parallel settings' '
 	)
 '
 
+test_expect_success 'fetching submodule into a broken repository' '
+	# Prepare src and src/sub nested in it
+	git init src &&
+	(
+		cd src &&
+		git init sub &&
+		git -C sub commit --allow-empty -m "initial in sub" &&
+		git submodule add -- ./sub sub &&
+		git commit -m "initial in top"
+	) &&
+
+	# Clone the old-fashoned way
+	git clone src dst &&
+	git -C dst clone ../src/sub sub &&
+
+	# Make sure that old-fashoned layout is still supported
+	git -C dst status &&
+
+	# "diff" would find no change
+	git -C dst diff --exit-code &&
+
+	# Recursive-fetch works fine
+	git -C dst fetch --recurse-submodules &&
+
+	# Break the receiving submodule
+	rm -f dst/sub/.git/HEAD &&
+
+	# NOTE: without the fix the following tests will recurse forever!
+	# They should terminate with an error.
+
+	test_must_fail git -C dst status &&
+	test_must_fail git -C dst diff &&
+	test_must_fail git -C dst fetch --recurse-submodules
+'
+
 test_done
-- 
2.10.0-rc2-314-g775ea9a




^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 20:21                                       ` Junio C Hamano
  2016-09-01 21:02                                         ` Junio C Hamano
@ 2016-09-01 21:04                                         ` Stefan Beller
  2016-09-01 21:12                                           ` Junio C Hamano
  1 sibling, 1 reply; 31+ messages in thread
From: Stefan Beller @ 2016-09-01 21:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Uma Srinivasan, Jacob Keller, Git Mailing List, Jens Lehmann,
	Heiko Voigt

On Thu, Sep 1, 2016 at 1:21 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Stefan Beller <sbeller@google.com> writes:
>>
>>>> The final version needs to be accompanied with tests to show the
>>>> effect of this change for callers.  A test would set up a top-level
>>>> and submodule, deliberately break submodule/.git/ repository and
>>>> show what breaks and how without this change.
>>>
>>> Tests are really good at providing this context as well, or to communicate
>>> the actual underlying problem, which is not quite clear to me.
>>> That is why I refrained from jumping into the discussion as I think the
>>> first few emails were dropped from the mailing list and I am missing context.
>>
>> I do not know where you started reading, but the gist of it is that
>> submodule.c spawns subprocess to run in the submodule's context by
>> assuming that chdir'ing into the <path> of the submodule and running
>> it (i.e. cp.dir set to <path> to drive start_command(&cp)) is
>> sufficient.  When <path>/.git (either it is a directory itself or it
>> points at a directory in .git/module/<name> in the superproject) is
>> a corrupt repository, running "git -C <path> command" would try to
>> auto-detect the repository, because it thinks <path>/.git is not a
>> repository and it thinks it is not at the top-level of the working
>> tree, and instead finds the repository of the top-level, which is
>> almost never what we want.
>
> This is with a test that covers the call in get_next_submodule() for
> the parallel fetch callback.  I think many of the codepaths will end
> up recursing forever the same way without the fix in a submodule
> repository that is broken in a similar way, but I didn't check, so
> I do not consider this to be completed.

Oh I see. That seems like a nasty bug.

>
> -- >8 --
> Subject: submodule: avoid auto-discovery in prepare_submodule_repo_env()
>
> The function is used to set up the environment variable used in a
> subprocess we spawn in a submodule directory.  The callers set up a
> child_process structure, find the working tree path of one submodule
> and set .dir field to it, and then use start_command() API to spawn
> the subprocess like "status", "fetch", etc.
>
> When this happens, we expect that the ".git" (either a directory or
> a gitfile that points at the real location) in the current working
> directory of the subprocess MUST be the repository for the submodule.
>
> If this ".git" thing is a corrupt repository, however, because
> prepare_submodule_repo_env() unsets GIT_DIR and GIT_WORK_TREE, the
> subprocess will see ".git", thinks it is not a repository, and
> attempt to find one by going up, likely to end up in finding the
> repository of the superproject.  In some codepaths, this will cause
> a command run with the "--recurse-submodules" option to recurse
> forever.
>
> By exporting GIT_DIR=.git, disable the auto-discovery logic in the
> subprocess, which would instead stop it and report an error.

and GIT_DIR=.git works for both .git files as well as the old fashioned way,
with the submodule repository at .git/, although that is not really documented.

>
> Not-signed-off-yet.
> ---
>  submodule.c                 |  1 +
>  t/t5526-fetch-submodules.sh | 29 +++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
>
> diff --git a/submodule.c b/submodule.c
> index 1b5cdfb..e8258f0 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
>                 if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
>                         argv_array_push(out, *var);
>         }
> +       argv_array_push(out, "GIT_DIR=.git");
>  }
> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> index 954d0e4..b2dee30 100755
> --- a/t/t5526-fetch-submodules.sh
> +++ b/t/t5526-fetch-submodules.sh
> @@ -485,4 +485,33 @@ test_expect_success 'fetching submodules respects parallel settings' '
>         )
>  '
>
> +test_expect_success 'fetching submodule into a broken repository' '
> +       # Prepare src and src/sub nested in it
> +       git init src &&
> +       (
> +               cd src &&
> +               git init sub &&
> +               git -C sub commit --allow-empty -m "initial in sub" &&
> +               git submodule add -- ./sub sub &&
> +               git commit -m "initial in top"
> +       ) &&

This is not needed, as setup() set up some repositories for you.

> +
> +       # Clone the old-fashoned way
> +       git clone src dst &&

if you don't go with your own setup, maybe:

        git clone . dst
        git -C dst clone ../submodule submodule

    # and further down s/sub/submodule/

> +       git -C dst clone ../src/sub sub &&
> +
> +       # Make sure that old-fashoned layout is still supported

fashioned

> +       git -C dst status &&
> +
> +       # Recursive-fetch works fine
> +       git -C dst fetch --recurse-submodules &&
> +
> +       # Break the receiving submodule
> +       rm -f dst/sub/.git/HEAD &&
> +
> +       # Recursive-fetch must terminate
> +       # NOTE: without fix this will recurse forever!
> +       test_must_fail git -C dst fetch --recurse-submodules

So in case we'd break it again in the future we'd notice by an
infinite time for the test suite, then we'd notice this comment and
know what's going on?

I would have suggested to run this actual test in a subshell and then
check if the expected failure would have recovered after a second or two
and depending on that trigger a test failure.

But it doesn't seem to be easy with shells, so this is fine (even the newer
signed off patch, apart from nits above)

> +'
> +
>  test_done
>
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: git submodules implementation question
  2016-09-01 21:04                                         ` Stefan Beller
@ 2016-09-01 21:12                                           ` Junio C Hamano
  0 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2016-09-01 21:12 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Uma Srinivasan, Jacob Keller, Git Mailing List, Jens Lehmann,
	Heiko Voigt

Stefan Beller <sbeller@google.com> writes:

>> +test_expect_success 'fetching submodule into a broken repository' '
>> +       # Prepare src and src/sub nested in it
>> +       git init src &&
>> +       (
>> +               cd src &&
>> +               git init sub &&
>> +               git -C sub commit --allow-empty -m "initial in sub" &&
>> +               git submodule add -- ./sub sub &&
>> +               git commit -m "initial in top"
>> +       ) &&
>
> This is not needed, as setup() set up some repositories for you.

I didn't want any random cruft left behind in the top-level by
previous tests, so this is very much deliberate.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2016-09-01 21:45 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-28 23:24 git submodules implementation question Uma Srinivasan
2016-08-29 20:03 ` Junio C Hamano
2016-08-29 21:03   ` Uma Srinivasan
2016-08-29 21:09     ` Junio C Hamano
2016-08-29 21:13       ` Uma Srinivasan
2016-08-29 23:04         ` Uma Srinivasan
2016-08-29 23:15           ` Junio C Hamano
2016-08-29 23:34             ` Uma Srinivasan
2016-08-30  0:02             ` Jacob Keller
2016-08-30  0:12               ` Uma Srinivasan
2016-08-30  6:09                 ` Jacob Keller
2016-08-30  6:23                   ` Jacob Keller
2016-08-30 17:40                     ` Uma Srinivasan
2016-08-30 17:53                       ` Junio C Hamano
2016-08-31  2:54                         ` Uma Srinivasan
2016-08-31 16:42                           ` Junio C Hamano
2016-08-31 18:40                             ` Uma Srinivasan
2016-08-31 18:44                               ` Junio C Hamano
2016-08-31 18:58                                 ` Uma Srinivasan
2016-09-01  1:04                                   ` Uma Srinivasan
2016-09-01  4:09                             ` Junio C Hamano
2016-09-01 16:05                               ` Uma Srinivasan
2016-09-01 18:32                                 ` Junio C Hamano
2016-09-01 18:37                                   ` Stefan Beller
2016-09-01 19:19                                     ` Junio C Hamano
2016-09-01 19:56                                       ` Uma Srinivasan
2016-09-01 20:29                                         ` Junio C Hamano
2016-09-01 20:21                                       ` Junio C Hamano
2016-09-01 21:02                                         ` Junio C Hamano
2016-09-01 21:04                                         ` Stefan Beller
2016-09-01 21:12                                           ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).