git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [BUG?] gitlink without .gitmodules no longer fails recursive clone
@ 2017-06-06  3:56 Jeff King
  2017-06-06 18:01 ` Stefan Beller
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2017-06-06  3:56 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git

While running some regression tests with v2.13, I noticed an odd
behavior. If I create a repository where there's a gitlink with no
matching .gitmodules entry:

  git init repo
  cd repo
  n10=1234abcdef
  n40=$n10$n10$n10$n10
  git update-index --add --cacheinfo 160000 $n40 foo
  git commit -m "gitlink without .gitmodule entry"

and then I clone it recursively with v2.12, it fails:

  $ git.v2.12.3 clone --recurse-submodules . dst; echo exit=$?
  Cloning into 'dst'...
  done.
  fatal: No url found for submodule path 'foo' in .gitmodules
  exit=128

But with v2.13, it silently ignores the submodule:

  $ git.v2.13.1 clone --recurse-submodules . dst; echo exit=$?
  Cloning into 'dst'...
  done.
  exit=0

This bisects to your bb62e0a99 (clone: teach --recurse-submodules to
optionally take a pathspec, 2017-03-17). That patch just sets
submodule.active by default, so I think the real issue is probably in
a086f921a (submodule: decouple url and submodule interest, 2017-03-17).

I also wasn't sure if this might be intentional. I.e., that we'd just
consider gitlink entries which aren't even configured as not-submodules
and ignore them. I couldn't certainly see an argument for moving in that
direction, but it is different than what we used to do. But I couldn't
find anything in any of the commit messages that mentioned this either
way, so I figured I'd punt and ask. :)

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-06  3:56 [BUG?] gitlink without .gitmodules no longer fails recursive clone Jeff King
@ 2017-06-06 18:01 ` Stefan Beller
  2017-06-06 18:10   ` Brandon Williams
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Beller @ 2017-06-06 18:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git@vger.kernel.org

On Mon, Jun 5, 2017 at 8:56 PM, Jeff King <peff@peff.net> wrote:
> While running some regression tests with v2.13, I noticed an odd
> behavior. If I create a repository where there's a gitlink with no
> matching .gitmodules entry:
>
>   git init repo
>   cd repo
>   n10=1234abcdef
>   n40=$n10$n10$n10$n10
>   git update-index --add --cacheinfo 160000 $n40 foo
>   git commit -m "gitlink without .gitmodule entry"
>
> and then I clone it recursively with v2.12, it fails:
>
>   $ git.v2.12.3 clone --recurse-submodules . dst; echo exit=$?
>   Cloning into 'dst'...
>   done.
>   fatal: No url found for submodule path 'foo' in .gitmodules
>   exit=128
>
> But with v2.13, it silently ignores the submodule:
>
>   $ git.v2.13.1 clone --recurse-submodules . dst; echo exit=$?
>   Cloning into 'dst'...
>   done.
>   exit=0
>
> This bisects to your bb62e0a99 (clone: teach --recurse-submodules to
> optionally take a pathspec, 2017-03-17). That patch just sets
> submodule.active by default, so I think the real issue is probably in
> a086f921a (submodule: decouple url and submodule interest, 2017-03-17).

It's a feature, not a bug, IMO.

When starting out the journey to improve submodules, one of the major
principle was to not interfere with gitlinks too much, as they are used in
ways git cannot fathom (cf git-series storing patches in gitlink form).

And building on that: You asked for recursing into *submodules*, not
into *gitlinks*. And submodules in the new Git have stronger requirements
w.r.t. the gitmodules file. (You have to tell us exactly how you want your
submodule to be treated, and we do not want to half-ass guess around
the shortcomings of a user not telling us about the submodule)

> I also wasn't sure if this might be intentional. I.e., that we'd just
> consider gitlink entries which aren't even configured as not-submodules
> and ignore them.

I think this is what we want to do, and we should do it consistently.
The only downside for this is that more unintentional gitlinks may be
added to repositories as Git will be very good at ignoring them.

> I couldn't certainly see an argument for moving in that
> direction, but it is different than what we used to do. But I couldn't
> find anything in any of the commit messages that mentioned this either
> way, so I figured I'd punt and ask. :)

Yeah, yesterday we had a big discussion if we want to publish our
roadmap and long term vision (as a team, as a company, or as a
community?) This would help newcomers and outsiders to see where
e.g. submodules are headed and people could speak up early if we miss
their use case.

Thanks for asking,
Stefan

>
> -Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-06 18:01 ` Stefan Beller
@ 2017-06-06 18:10   ` Brandon Williams
  2017-06-06 18:39     ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Brandon Williams @ 2017-06-06 18:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, git@vger.kernel.org

On 06/06, Stefan Beller wrote:
> On Mon, Jun 5, 2017 at 8:56 PM, Jeff King <peff@peff.net> wrote:
> > While running some regression tests with v2.13, I noticed an odd
> > behavior. If I create a repository where there's a gitlink with no
> > matching .gitmodules entry:
> >
> >   git init repo
> >   cd repo
> >   n10=1234abcdef
> >   n40=$n10$n10$n10$n10
> >   git update-index --add --cacheinfo 160000 $n40 foo
> >   git commit -m "gitlink without .gitmodule entry"
> >
> > and then I clone it recursively with v2.12, it fails:
> >
> >   $ git.v2.12.3 clone --recurse-submodules . dst; echo exit=$?
> >   Cloning into 'dst'...
> >   done.
> >   fatal: No url found for submodule path 'foo' in .gitmodules
> >   exit=128
> >
> > But with v2.13, it silently ignores the submodule:
> >
> >   $ git.v2.13.1 clone --recurse-submodules . dst; echo exit=$?
> >   Cloning into 'dst'...
> >   done.
> >   exit=0
> >
> > This bisects to your bb62e0a99 (clone: teach --recurse-submodules to
> > optionally take a pathspec, 2017-03-17). That patch just sets
> > submodule.active by default, so I think the real issue is probably in
> > a086f921a (submodule: decouple url and submodule interest, 2017-03-17).
> 
> It's a feature, not a bug, IMO.
> 
> When starting out the journey to improve submodules, one of the major
> principle was to not interfere with gitlinks too much, as they are used in
> ways git cannot fathom (cf git-series storing patches in gitlink form).
> 
> And building on that: You asked for recursing into *submodules*, not
> into *gitlinks*. And submodules in the new Git have stronger requirements
> w.r.t. the gitmodules file. (You have to tell us exactly how you want your
> submodule to be treated, and we do not want to half-ass guess around
> the shortcomings of a user not telling us about the submodule)

Just for some background on the new behavior and how this functionality
changed: My series changed how 'submodule init' behaved if you have
'submodule.active' set.  Once set (like how clone --recurse does now)
when not provided any path to a submodule, a list of 'active' submodules
matching the 'submodule.active' pathspec will be initialized.  One of
the requirements to be 'active' is to have an entry in the .gitmodules
file so gitlinks without an entry in the .gitmodules file will simply be
ignored now.

> 
> > I also wasn't sure if this might be intentional. I.e., that we'd just
> > consider gitlink entries which aren't even configured as not-submodules
> > and ignore them.
> 
> I think this is what we want to do, and we should do it consistently.
> The only downside for this is that more unintentional gitlinks may be
> added to repositories as Git will be very good at ignoring them.
> 
> > I couldn't certainly see an argument for moving in that
> > direction, but it is different than what we used to do. But I couldn't
> > find anything in any of the commit messages that mentioned this either
> > way, so I figured I'd punt and ask. :)
> 
> Yeah, yesterday we had a big discussion if we want to publish our
> roadmap and long term vision (as a team, as a company, or as a
> community?) This would help newcomers and outsiders to see where
> e.g. submodules are headed and people could speak up early if we miss
> their use case.
> 
> Thanks for asking,
> Stefan
> 
> >
> > -Peff

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-06 18:10   ` Brandon Williams
@ 2017-06-06 18:39     ` Jeff King
  2017-06-09 23:19       ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2017-06-06 18:39 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Stefan Beller, git@vger.kernel.org

On Tue, Jun 06, 2017 at 11:10:24AM -0700, Brandon Williams wrote:

> > > This bisects to your bb62e0a99 (clone: teach --recurse-submodules to
> > > optionally take a pathspec, 2017-03-17). That patch just sets
> > > submodule.active by default, so I think the real issue is probably in
> > > a086f921a (submodule: decouple url and submodule interest, 2017-03-17).
> > 
> > It's a feature, not a bug, IMO.
> > 
> > When starting out the journey to improve submodules, one of the major
> > principle was to not interfere with gitlinks too much, as they are used in
> > ways git cannot fathom (cf git-series storing patches in gitlink form).
> > 
> > And building on that: You asked for recursing into *submodules*, not
> > into *gitlinks*. And submodules in the new Git have stronger requirements
> > w.r.t. the gitmodules file. (You have to tell us exactly how you want your
> > submodule to be treated, and we do not want to half-ass guess around
> > the shortcomings of a user not telling us about the submodule)
> 
> Just for some background on the new behavior and how this functionality
> changed: My series changed how 'submodule init' behaved if you have
> 'submodule.active' set.  Once set (like how clone --recurse does now)
> when not provided any path to a submodule, a list of 'active' submodules
> matching the 'submodule.active' pathspec will be initialized.  One of
> the requirements to be 'active' is to have an entry in the .gitmodules
> file so gitlinks without an entry in the .gitmodules file will simply be
> ignored now.

OK, thanks for the explanation. I certainly agree that treating
.gitmodules as the source of truth is consistent and easy to explain.
I'll update my test to handle the new behavior (it's a regression test
for how GitHub Pages handles some broken setups).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-06 18:39     ` Jeff King
@ 2017-06-09 23:19       ` Jeff King
  2017-06-10  2:10         ` Junio C Hamano
  2017-06-13  9:14         ` Jeff King
  0 siblings, 2 replies; 23+ messages in thread
From: Jeff King @ 2017-06-09 23:19 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Stefan Beller, git@vger.kernel.org

On Tue, Jun 06, 2017 at 02:39:14PM -0400, Jeff King wrote:

> > Just for some background on the new behavior and how this functionality
> > changed: My series changed how 'submodule init' behaved if you have
> > 'submodule.active' set.  Once set (like how clone --recurse does now)
> > when not provided any path to a submodule, a list of 'active' submodules
> > matching the 'submodule.active' pathspec will be initialized.  One of
> > the requirements to be 'active' is to have an entry in the .gitmodules
> > file so gitlinks without an entry in the .gitmodules file will simply be
> > ignored now.
> 
> OK, thanks for the explanation. I certainly agree that treating
> .gitmodules as the source of truth is consistent and easy to explain.
> I'll update my test to handle the new behavior (it's a regression test
> for how GitHub Pages handles some broken setups).

I've been chatting with Pages folks about the kinds of errors they see,
and I was surprised how easy it is to get into this situation.

In an ideal world the user do:

  git submodule add git://host/repo.git path

which adds the gitlink and the .gitmodules entry. But it doesn't seem
unreasonable for somebody unfamiliar with submodules to do:

  git clone git://host/repo.git path
  git add path

This does add the entry as a gitlink, but doesn't write any sort of
.gitmodules entry. With the old code, cloning the repository (either by
another user, or in our case during a Pages build), a recursive clone or
submodule init would complain loudly. But now it's just quietly ignored.
Which seems unfortunate.

I actually don't think the old behavior was that great. Somebody _else_
would see the error and was responsible for filtering it back to the
original author. But what we really want is to warn the user when
they're creating the broken situation.

Should "git add" check whether there's a matching .gitmodules entry for
the path and issue a warning otherwise?

Sort of orthogonal, but I kind of wonder if the recursive clone should
also issue a warning. Not for an inactive submodule (which obviously is
normal) but for one that is inactive because it does not even have a
.gitmodules entry. I suppose that _could_ be considered normal in a
post-v2.13 world (e.g., if the owner of the repo "deactivates" a module
by deleting its .gitmodules entry).  But since it's indistinguishable
from the broken case, perhaps that would be better done with a specific
"active = false" flag in .gitmodules.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-09 23:19       ` Jeff King
@ 2017-06-10  2:10         ` Junio C Hamano
  2017-06-10  7:13           ` Jeff King
  2017-06-12  5:30           ` Stefan Beller
  2017-06-13  9:14         ` Jeff King
  1 sibling, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2017-06-10  2:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Stefan Beller, git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> In an ideal world the user do:
>
>   git submodule add git://host/repo.git path
>
> which adds the gitlink and the .gitmodules entry. But it doesn't seem
> unreasonable for somebody unfamiliar with submodules to do:
>
>   git clone git://host/repo.git path
>   git add path
>
> This does add the entry as a gitlink, but doesn't write any sort of
> .gitmodules entry.

I actually would think that is a perfectly valid state.  In that
original repository pair (i.e. the superproject with a submodule
without an entry in .gitmodules), as long as the configuration in
the submodule repository "path/.git/config" has necessary remote
definitions, "git push/fetch --recursive" etc., should also be able
to work without having to consult .gitmodules at the top-level
superproject, I would think.

> With the old code, cloning the repository (either by
> another user, or in our case during a Pages build), a recursive clone or
> submodule init would complain loudly. But now it's just quietly ignored.
> Which seems unfortunate.

Of course, if such an original superproject gets pushed to a
publishing location and then the result is cloned, without an entry
in .gitmodules, no information "git submodule" can use to work on
that "path" exists in that clone.  I would say it is OK to leave it
as-is when going "--recursive" (what you called "inactive because
it does not even have a .gitmodules entry).

But even in such a clone, once the user who cloned learns where the
submodule commit that is recorded in the superproject's tree can be
obtained out-of-band and makes a clone at "path" manually (which
replicates the state the original repository pair), things that only
need to look at "path/.git/config" should be able to work (e.g. "git
fetch --recursive"), I'd say.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-10  2:10         ` Junio C Hamano
@ 2017-06-10  7:13           ` Jeff King
  2017-06-10 11:12             ` Junio C Hamano
  2017-06-12  5:30           ` Stefan Beller
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2017-06-10  7:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, Stefan Beller, git@vger.kernel.org

On Sat, Jun 10, 2017 at 11:10:11AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > In an ideal world the user do:
> >
> >   git submodule add git://host/repo.git path
> >
> > which adds the gitlink and the .gitmodules entry. But it doesn't seem
> > unreasonable for somebody unfamiliar with submodules to do:
> >
> >   git clone git://host/repo.git path
> >   git add path
> >
> > This does add the entry as a gitlink, but doesn't write any sort of
> > .gitmodules entry.
> 
> I actually would think that is a perfectly valid state.  In that
> original repository pair (i.e. the superproject with a submodule
> without an entry in .gitmodules), as long as the configuration in
> the submodule repository "path/.git/config" has necessary remote
> definitions, "git push/fetch --recursive" etc., should also be able
> to work without having to consult .gitmodules at the top-level
> superproject, I would think.

Certainly I think it _could_ be a valid state. But traditionally it
caused "clone --recursive" to barf. And from what Stefan and Brandon
have said, we are moving in the opposite direction entirely, with
.gitmodules becoming the source of truth.

I could see arguments for going in either direction (gitlinks versus
.gitmodules as source of truth for submodules). But it bothers me that
it's so easy to create a state that will behave in a confusing manner.

> But even in such a clone, once the user who cloned learns where the
> submodule commit that is recorded in the superproject's tree can be
> obtained out-of-band and makes a clone at "path" manually (which
> replicates the state the original repository pair), things that only
> need to look at "path/.git/config" should be able to work (e.g. "git
> fetch --recursive"), I'd say.

I agree that's possible. But I think one problem that our submodule
support has traditionally suffered from is leaving open what is possible
and not making any policy choices, or providing any guidance. Unless you
are an expert user who plans on circumventing the usual "git submodule"
process (and having your collaborators do so, too), I don't think you'd
ever want to do a "git add" as above. You want "git submodule add".

Allowing both is not in itself wrong.  But because we provide no
guidance for the expert route, it's easy to shoot yourself in the foot
and not even realize it. Which is why I suggested a warning and not
forbidding the operation. Or better still, an advise() block that tells
you how to recover (probably "git submodule add", though it might need
to learn to handle already-present gitlinks). And which can be disabled
if you really are an expert user.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-10  7:13           ` Jeff King
@ 2017-06-10 11:12             ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2017-06-10 11:12 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Stefan Beller, git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> Certainly I think it _could_ be a valid state. But traditionally it
> caused "clone --recursive" to barf. And from what Stefan and Brandon
> have said, we are moving in the opposite direction entirely, with
> .gitmodules becoming the source of truth.
> 
> I could see arguments for going in either direction (gitlinks versus
> .gitmodules as source of truth for submodules). But it bothers me that
> it's so easy to create a state that will behave in a confusing manner.
> ...
> Allowing both is not in itself wrong.  But because we provide no
> guidance for the expert route, it's easy to shoot yourself in the foot
> and not even realize it. Which is why I suggested a warning and not
> forbidding the operation. Or better still, an advise() block that tells
> you how to recover (probably "git submodule add", though it might need
> to learn to handle already-present gitlinks). And which can be disabled
> if you really are an expert user.

Yeah, the current situation is bad in that having both .gitmodules
file and gitlinks in a tree allows for redundant pieces of
information to go out of sync and result in contradiction.  A gitlink
without an entry in the .gitmodules file is an obvious example.  An
entry in .gitmodules file whose path does not have a gitlink in the
tree would also be a broken case.  

While "git submodule" was the only/primary interface to the
submodule, it was rather easy to explain that ".gitmodules need to
be consistent with the index and the tree iff you do 'git submodule'
thing" (implying "if you only use gitlinks to bind two or more
projects locally without sharing with others, you do not need 'git
submodule' and you do not need .gitmodules"), but with the recent
push to add "--recurse-submodules" to everything, the extent of what
was historically "iff you do 'git submodule' thing" is getting
larger and it becomes more and more important to keep .gitmodules
consistent with gitlinks in the tree that house it.

I find it a bit sad, but I do not think of a better way X-<.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-10  2:10         ` Junio C Hamano
  2017-06-10  7:13           ` Jeff King
@ 2017-06-12  5:30           ` Stefan Beller
  1 sibling, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2017-06-12  5:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Brandon Williams, git@vger.kernel.org

On Fri, Jun 9, 2017 at 7:10 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> In an ideal world the user do:
>>
>>   git submodule add git://host/repo.git path
>>
>> which adds the gitlink and the .gitmodules entry. But it doesn't seem
>> unreasonable for somebody unfamiliar with submodules to do:
>>
>>   git clone git://host/repo.git path
>>   git add path
>>
>> This does add the entry as a gitlink, but doesn't write any sort of
>> .gitmodules entry.
>
> I actually would think that is a perfectly valid state.

me too.

But on the other hand I do not want to offend non-submodule-gitlink-users
too much. So maybe:

  $ git add <gitlink>
  # Adding a raw gitlink to the index,
  # in case you want to use a submodule,
  # use add a .gitmodules file or use 'git submodule add'

> In that
> original repository pair (i.e. the superproject with a submodule
> without an entry in .gitmodules), as long as the configuration in
> the submodule repository "path/.git/config" has necessary remote
> definitions, "git push/fetch --recursive" etc., should also be able
> to work without having to consult .gitmodules at the top-level
> superproject, I would think.

but these are the 2nd step, clone fails first.

>
>> With the old code, cloning the repository (either by
>> another user, or in our case during a Pages build), a recursive clone or
>> submodule init would complain loudly. But now it's just quietly ignored.
>> Which seems unfortunate.
>
> Of course, if such an original superproject gets pushed to a
> publishing location and then the result is cloned, without an entry
> in .gitmodules, no information "git submodule" can use to work on
> that "path" exists in that clone.  I would say it is OK to leave it
> as-is when going "--recursive" (what you called "inactive because
> it does not even have a .gitmodules entry).
>
> But even in such a clone, once the user who cloned learns where the
> submodule commit that is recorded in the superproject's tree can be
> obtained out-of-band and makes a clone at "path" manually (which
> replicates the state the original repository pair), things that only
> need to look at "path/.git/config" should be able to work (e.g. "git
> fetch --recursive"), I'd say.

But the user would never learn, because

  $ git clone --recurse-submodules ...

does not complain, but put an empty dir instead.


>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG?] gitlink without .gitmodules no longer fails recursive clone
  2017-06-09 23:19       ` Jeff King
  2017-06-10  2:10         ` Junio C Hamano
@ 2017-06-13  9:14         ` Jeff King
  2017-06-13  9:24           ` [PATCH 1/2] add: warn when adding an embedded repository Jeff King
  2017-06-13  9:24           ` [PATCH 2/2] t: move "git add submodule" into test blocks Jeff King
  1 sibling, 2 replies; 23+ messages in thread
From: Jeff King @ 2017-06-13  9:14 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Stefan Beller, Junio C Hamano, git@vger.kernel.org

On Fri, Jun 09, 2017 at 07:19:35PM -0400, Jeff King wrote:

> Should "git add" check whether there's a matching .gitmodules entry for
> the path and issue a warning otherwise?

Here's my attempt at that.

  [1/2]: add: warn when adding an embedded repository
  [2/2]: t: move "git add submodule" into test blocks

 Documentation/config.txt                     |  3 ++
 Documentation/git-add.txt                    |  7 +++++
 advice.c                                     |  2 ++
 advice.h                                     |  1 +
 builtin/add.c                                | 45 +++++++++++++++++++++++++++-
 git-submodule.sh                             |  5 ++--
 t/t4041-diff-submodule-option.sh             |  8 +++--
 t/t4060-diff-submodule-option-diff-format.sh |  8 +++--
 t/t7401-submodule-summary.sh                 |  8 +++--
 t/t7414-submodule-mistakes.sh                | 40 +++++++++++++++++++++++++
 10 files changed, 115 insertions(+), 12 deletions(-)
 create mode 100755 t/t7414-submodule-mistakes.sh


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-13  9:14         ` Jeff King
@ 2017-06-13  9:24           ` Jeff King
  2017-06-13 17:07             ` Stefan Beller
  2017-06-13 17:16             ` Junio C Hamano
  2017-06-13  9:24           ` [PATCH 2/2] t: move "git add submodule" into test blocks Jeff King
  1 sibling, 2 replies; 23+ messages in thread
From: Jeff King @ 2017-06-13  9:24 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Stefan Beller, Junio C Hamano, git@vger.kernel.org

It's an easy mistake to add a repository inside another
repository, like:

  git clone $url
  git add .

The resulting entry is a gitlink, but there's no matching
.gitmodules entry. Trying to use "submodule init" (or clone
with --recursive) doesn't do anything useful. Prior to
v2.13, such an entry caused git-submodule to barf entirely.
In v2.13, the entry is considered "inactive" and quietly
ignored. Either way, no clone of your repository can do
anything useful with the gitlink without the user manually
adding the submodule config.

In most cases, the user probably meant to either add a real
submodule, or they forgot to put the embedded repository in
their .gitignore file.

Let's issue a warning when we see this case. There are a few
things to note:

  - the warning will go in the git-add porcelain; anybody
    wanting to do low-level manipulation of the index is
    welcome to create whatever funny states they want.

  - we detect the case by looking for a newly added gitlink;
    updates via "git add submodule" are perfectly reasonable,
    and this avoids us having to investigate .gitmodules
    entirely

  - there's a command-line option to suppress the warning.
    This is needed for git-submodule itself (which adds the
    entry before adding any submodule config), but also
    provides a mechanism for other scripts doing
    submodule-like things.

We could make this a hard error instead of a warning.
However, we do add lots of sub-repos in our test suite. It's
not _wrong_ to do so. It just creates a state where users
may be surprised. Pointing them in the right direction with
a gentle hint is probably the best option.

There is a config knob that can disable the (long) hint. But
I intentionally omitted a config knob to disable the warning
entirely. Whether the warning is sensible or not is
generally about context, not about the user's preferences.
If there's a tool or workflow that adds gitlinks without
matching .gitmodules, it should probably be taught about the
new command-line option, rather than blanket-disabling the
warning.

Signed-off-by: Jeff King <peff@peff.net>
---
The check for "is this a gitlink" is done by looking for a
trailing "/" in the added path. This feels kind of hacky,
but actually seems to work well in practice. We've already
expanded the pathspecs to real filenames via dir.c, and that
omits trees. So anything with a trailing slash must be a
gitlink.

And I really didn't want to incur any extra cost in the
common case here (e.g., checking for "path/.git"). We could
do it at zero-cost by pushing the check much further down
(i.e., when we'd realize anyway that it's a gitlink), but I
didn't want to pollute read-cache.c with what is essentially
a porcelain warning. The actual check done there seems to be
checking S_ISDIR, but I didn't even want to incur an extra
stat per-file.

I also waffled on whether we should ask the submodule code
whether it knows about a particular path. Technically:

  git config submodule.foo.path foo
  git config submodule.foo.url git://...
  git add foo

is legal, but would still warn with this patch. I don't know
how much we should care (it would also be easy to do on
top).

 Documentation/config.txt      |  3 +++
 Documentation/git-add.txt     |  7 +++++++
 advice.c                      |  2 ++
 advice.h                      |  1 +
 builtin/add.c                 | 45 ++++++++++++++++++++++++++++++++++++++++++-
 git-submodule.sh              |  5 +++--
 t/t7414-submodule-mistakes.sh | 40 ++++++++++++++++++++++++++++++++++++++
 7 files changed, 100 insertions(+), 3 deletions(-)
 create mode 100755 t/t7414-submodule-mistakes.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dd4beec39..e909239bc 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -348,6 +348,9 @@ advice.*::
 	rmHints::
 		In case of failure in the output of linkgit:git-rm[1],
 		show directions on how to proceed from the current state.
+	addEmbeddedRepo::
+		Advice on what to do when you've accidentally added one
+		git repo inside of another.
 --
 
 core.fileMode::
diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 7ed63dce0..f4169fb1e 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -165,6 +165,13 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
 	be ignored, no matter if they are already present in the work
 	tree or not.
 
+--no-warn-embedded-repo::
+	By default, `git add` will warn when adding an embedded
+	repository to the index without using `git submodule add` to
+	create an entry in `.gitmodules`. This option will suppress the
+	warning (e.g., if you are manually performing operations on
+	submodules).
+
 --chmod=(+|-)x::
 	Override the executable bit of the added files.  The executable
 	bit is only changed in the index, the files on disk are left
diff --git a/advice.c b/advice.c
index b84ae4960..e0611d52b 100644
--- a/advice.c
+++ b/advice.c
@@ -15,6 +15,7 @@ int advice_detached_head = 1;
 int advice_set_upstream_failure = 1;
 int advice_object_name_warning = 1;
 int advice_rm_hints = 1;
+int advice_add_embedded_repo = 1;
 
 static struct {
 	const char *name;
@@ -35,6 +36,7 @@ static struct {
 	{ "setupstreamfailure", &advice_set_upstream_failure },
 	{ "objectnamewarning", &advice_object_name_warning },
 	{ "rmhints", &advice_rm_hints },
+	{ "addembeddedrepo", &advice_add_embedded_repo },
 
 	/* make this an alias for backward compatibility */
 	{ "pushnonfastforward", &advice_push_update_rejected }
diff --git a/advice.h b/advice.h
index b341a55ce..c84a44531 100644
--- a/advice.h
+++ b/advice.h
@@ -18,6 +18,7 @@ extern int advice_detached_head;
 extern int advice_set_upstream_failure;
 extern int advice_object_name_warning;
 extern int advice_rm_hints;
+extern int advice_add_embedded_repo;
 
 int git_default_advice_config(const char *var, const char *value);
 __attribute__((format (printf, 1, 2)))
diff --git a/builtin/add.c b/builtin/add.c
index d9a2491e4..ea88db281 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -249,6 +249,7 @@ N_("The following paths are ignored by one of your .gitignore files:\n");
 
 static int verbose, show_only, ignored_too, refresh_only;
 static int ignore_add_errors, intent_to_add, ignore_missing;
+static int warn_on_embedded_repo = 1;
 
 #define ADDREMOVE_DEFAULT 1
 static int addremove = ADDREMOVE_DEFAULT;
@@ -282,6 +283,8 @@ static struct option builtin_add_options[] = {
 	OPT_BOOL( 0 , "ignore-errors", &ignore_add_errors, N_("just skip files which cannot be added because of errors")),
 	OPT_BOOL( 0 , "ignore-missing", &ignore_missing, N_("check if - even missing - files are ignored in dry run")),
 	OPT_STRING( 0 , "chmod", &chmod_arg, N_("(+/-)x"), N_("override the executable bit of the listed files")),
+	OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
+			N_("warn when adding an embedded repository")),
 	OPT_END(),
 };
 
@@ -295,6 +298,44 @@ static int add_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
+static const char embedded_advice[] = N_(
+"You've added another git repository inside your current repository.\n"
+"Clones of the outer repository will not also contain the contents of\n"
+"the embedded repository. If you meant to add a submodule, use:\n"
+"\n"
+"	git submodule add <url> %s\n"
+"\n"
+"If you added this path by mistake, you can remove it from the\n"
+"index with:\n"
+"\n"
+"	git rm --cached %s\n"
+"\n"
+"See \"git help submodule\" for more information."
+);
+
+static void check_embedded_repo(const char *path)
+{
+	struct strbuf name = STRBUF_INIT;
+
+	if (!warn_on_embedded_repo)
+		return;
+	if (!ends_with(path, "/"))
+		return;
+
+	/* Drop trailing slash for aesthetics */
+	strbuf_addstr(&name, path);
+	strbuf_strip_suffix(&name, "/");
+
+	warning(_("adding embedded git repository: %s"), name.buf);
+	if (advice_add_embedded_repo) {
+		advise(embedded_advice, name.buf, name.buf);
+		/* there may be multiple entries; advise only once */
+		advice_add_embedded_repo = 0;
+	}
+
+	strbuf_release(&name);
+}
+
 static int add_files(struct dir_struct *dir, int flags)
 {
 	int i, exit_status = 0;
@@ -307,12 +348,14 @@ static int add_files(struct dir_struct *dir, int flags)
 		exit_status = 1;
 	}
 
-	for (i = 0; i < dir->nr; i++)
+	for (i = 0; i < dir->nr; i++) {
+		check_embedded_repo(dir->entries[i]->name);
 		if (add_file_to_index(&the_index, dir->entries[i]->name, flags)) {
 			if (!ignore_add_errors)
 				die(_("adding files failed"));
 			exit_status = 1;
 		}
+	}
 	return exit_status;
 }
 
diff --git a/git-submodule.sh b/git-submodule.sh
index c0d0e9a4c..e131760ee 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -213,7 +213,8 @@ cmd_add()
 		die "$(eval_gettext "'\$sm_path' already exists in the index and is not a submodule")"
 	fi
 
-	if test -z "$force" && ! git add --dry-run --ignore-missing "$sm_path" > /dev/null 2>&1
+	if test -z "$force" &&
+		! git add --dry-run --ignore-missing --no-warn-embedded-repo "$sm_path" > /dev/null 2>&1
 	then
 		eval_gettextln "The following path is ignored by one of your .gitignore files:
 \$sm_path
@@ -267,7 +268,7 @@ or you are unsure what this means choose another name with the '--name' option."
 	fi
 	git config submodule."$sm_name".url "$realrepo"
 
-	git add $force "$sm_path" ||
+	git add --no-warn-embedded-repo $force "$sm_path" ||
 	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
 
 	git config -f .gitmodules submodule."$sm_name".path "$sm_path" &&
diff --git a/t/t7414-submodule-mistakes.sh b/t/t7414-submodule-mistakes.sh
new file mode 100755
index 000000000..8059bcb7f
--- /dev/null
+++ b/t/t7414-submodule-mistakes.sh
@@ -0,0 +1,40 @@
+#!/bin/sh
+
+test_description='handling of common mistakes people may make with submodules'
+. ./test-lib.sh
+
+test_expect_success 'create embedded repository' '
+	git init embed &&
+	(
+		cd embed &&
+		test_commit one
+	)
+'
+
+test_expect_success 'git-add on embedded repository warns' '
+	test_when_finished "git rm --cached -f embed" &&
+	git add embed 2>stderr &&
+	test_i18ngrep warning stderr
+'
+
+test_expect_success '--no-warn-embedded-repo suppresses warning' '
+	test_when_finished "git rm --cached -f embed" &&
+	git add --no-warn-embedded-repo embed 2>stderr &&
+	test_i18ngrep ! warning stderr
+'
+
+test_expect_success 'no warning when updating entry' '
+	test_when_finished "git rm --cached -f embed" &&
+	git add embed &&
+	git -C embed commit --allow-empty -m two &&
+	git add embed 2>stderr &&
+	test_i18ngrep ! warning stderr
+'
+
+test_expect_success 'submodule add does not warn' '
+	test_when_finished "git rm -rf submodule .gitmodules" &&
+	git submodule add ./embed submodule 2>stderr &&
+	test_i18ngrep ! warning stderr
+'
+
+test_done
-- 
2.13.1.675.g57c06d071


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/2] t: move "git add submodule" into test blocks
  2017-06-13  9:14         ` Jeff King
  2017-06-13  9:24           ` [PATCH 1/2] add: warn when adding an embedded repository Jeff King
@ 2017-06-13  9:24           ` Jeff King
  2017-06-13 17:15             ` Stefan Beller
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2017-06-13  9:24 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Stefan Beller, Junio C Hamano, git@vger.kernel.org

Some submodule tests do some setup outside of a test_expect
block. This is bad because we won't actually check the
outcome of those commands. But it's doubly so because "git
add submodule" now produces a warning to stderr, which is
not suppressed by the test scripts in non-verbose mode.

This patch does the minimal to fix the annoying warnings.
All three of these scripts could use more cleanup of related
setup.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t4041-diff-submodule-option.sh             | 8 +++++---
 t/t4060-diff-submodule-option-diff-format.sh | 8 +++++---
 t/t7401-submodule-summary.sh                 | 8 +++++---
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/t/t4041-diff-submodule-option.sh b/t/t4041-diff-submodule-option.sh
index 2d9731b52..058ee0829 100755
--- a/t/t4041-diff-submodule-option.sh
+++ b/t/t4041-diff-submodule-option.sh
@@ -430,9 +430,11 @@ test_expect_success 'deleted submodule' '
 	test_cmp expected actual
 '
 
-test_create_repo sm2 &&
-head7=$(add_file sm2 foo8 foo9) &&
-git add sm2
+test_expect_success 'create second submodule' '
+	test_create_repo sm2 &&
+	head7=$(add_file sm2 foo8 foo9) &&
+	git add sm2
+'
 
 test_expect_success 'multiple submodules' '
 	git diff-index -p --submodule=log HEAD >actual &&
diff --git a/t/t4060-diff-submodule-option-diff-format.sh b/t/t4060-diff-submodule-option-diff-format.sh
index 33ec26d75..4b168d0ed 100755
--- a/t/t4060-diff-submodule-option-diff-format.sh
+++ b/t/t4060-diff-submodule-option-diff-format.sh
@@ -643,9 +643,11 @@ test_expect_success 'deleted submodule' '
 	test_cmp expected actual
 '
 
-test_create_repo sm2 &&
-head7=$(add_file sm2 foo8 foo9) &&
-git add sm2
+test_expect_success 'create second submodule' '
+	test_create_repo sm2 &&
+	head7=$(add_file sm2 foo8 foo9) &&
+	git add sm2
+'
 
 test_expect_success 'multiple submodules' '
 	git diff-index -p --submodule=diff HEAD >actual &&
diff --git a/t/t7401-submodule-summary.sh b/t/t7401-submodule-summary.sh
index 366746f0d..4e4c45550 100755
--- a/t/t7401-submodule-summary.sh
+++ b/t/t7401-submodule-summary.sh
@@ -241,9 +241,11 @@ EOF
 	test_cmp expected actual
 "
 
-test_create_repo sm2 &&
-head7=$(add_file sm2 foo8 foo9) &&
-git add sm2
+test_expect_success 'create second submodule' '
+	test_create_repo sm2 &&
+	head7=$(add_file sm2 foo8 foo9) &&
+	git add sm2
+'
 
 test_expect_success 'multiple submodules' "
 	git submodule summary >actual &&
-- 
2.13.1.675.g57c06d071

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-13  9:24           ` [PATCH 1/2] add: warn when adding an embedded repository Jeff King
@ 2017-06-13 17:07             ` Stefan Beller
  2017-06-13 17:16               ` Brandon Williams
  2017-06-14  6:36               ` Jeff King
  2017-06-13 17:16             ` Junio C Hamano
  1 sibling, 2 replies; 23+ messages in thread
From: Stefan Beller @ 2017-06-13 17:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Junio C Hamano, git@vger.kernel.org

On Tue, Jun 13, 2017 at 2:24 AM, Jeff King <peff@peff.net> wrote:
> It's an easy mistake to add a repository inside another
> repository, like:
>
>   git clone $url
>   git add .
>
> The resulting entry is a gitlink, but there's no matching
> .gitmodules entry. Trying to use "submodule init" (or clone
> with --recursive) doesn't do anything useful. Prior to
> v2.13, such an entry caused git-submodule to barf entirely.
> In v2.13, the entry is considered "inactive" and quietly
> ignored. Either way, no clone of your repository can do
> anything useful with the gitlink without the user manually
> adding the submodule config.
>
> In most cases, the user probably meant to either add a real
> submodule, or they forgot to put the embedded repository in
> their .gitignore file.
>
> Let's issue a warning when we see this case. There are a few
> things to note:
>
>   - the warning will go in the git-add porcelain; anybody
>     wanting to do low-level manipulation of the index is
>     welcome to create whatever funny states they want.
>
>   - we detect the case by looking for a newly added gitlink;
>     updates via "git add submodule" are perfectly reasonable,
>     and this avoids us having to investigate .gitmodules
>     entirely
>
>   - there's a command-line option to suppress the warning.
>     This is needed for git-submodule itself (which adds the
>     entry before adding any submodule config), but also
>     provides a mechanism for other scripts doing
>     submodule-like things.
>
> We could make this a hard error instead of a warning.
> However, we do add lots of sub-repos in our test suite. It's
> not _wrong_ to do so. It just creates a state where users
> may be surprised. Pointing them in the right direction with
> a gentle hint is probably the best option.

Sounds good up to here (and right).

> There is a config knob that can disable the (long) hint. But
> I intentionally omitted a config knob to disable the warning
> entirely. Whether the warning is sensible or not is
> generally about context, not about the user's preferences.
> If there's a tool or workflow that adds gitlinks without
> matching .gitmodules, it should probably be taught about the
> new command-line option, rather than blanket-disabling the
> warning.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> The check for "is this a gitlink" is done by looking for a
> trailing "/" in the added path. This feels kind of hacky,
> but actually seems to work well in practice.

This whole "slash at the end" thing comes from extensive use
of shell completion adding the slash at the end of a directory
IMHO. (cf. PATHSPEC_STRIP_SUBMODULE_SLASH_* is
the same underlying hack.)

> We've already
> expanded the pathspecs to real filenames via dir.c, and that
> omits trees. So anything with a trailing slash must be a
> gitlink.

Oh!

>
> And I really didn't want to incur any extra cost in the
> common case here (e.g., checking for "path/.git"). We could
> do it at zero-cost by pushing the check much further down
> (i.e., when we'd realize anyway that it's a gitlink), but I
> didn't want to pollute read-cache.c with what is essentially
> a porcelain warning. The actual check done there seems to be
> checking S_ISDIR, but I didn't even want to incur an extra
> stat per-file.

makes sense.

>
> I also waffled on whether we should ask the submodule code
> whether it knows about a particular path. Technically:
>
>   git config submodule.foo.path foo
>   git config submodule.foo.url git://...
>   git add foo
>
> is legal, but would still warn with this patch. I don't know
> how much we should care (it would also be easy to do on
> top).

And here I was thinking this is not legal, because you may override
anything *except* submodule.*.path in the config. That is because
all the other settings (such as url, active flag, branch,
shallow recommendation) are dependent on the use case, the user,
changes to the environment (url) or such. The name<->path mapping
however is only to be changed via changes to the tracked content.
That is why it would make sense to disallow overriding the path
outside the tracked content.

In my ideal dream world of submodules we would have the following:

  $ cat .gitmodules
  [submodule "sub42"]
    path = foo
  # path only in tree!

  $ cat .git/config
  ...
  [submodule]
    active = .
    active = :(exclude)Irrelevant/submodules/for/my/usecase/*
  # note how this is user centric

  $ git show refs/meta/magic/for/refs/heads/master:.gitmodules
  [submodule "sub42"]
    url = https://example.org/foo
    branch = .
  # Note how this is neither centering on the in-tree
  # contents, nor the user. Instead it focuses on the
  # project or group. It is *workflow* centric.
  # Workflows may change over time, e.g. the url could
  # be repointed to k.org or an in-house mirror without tree
  # changes.


But back to reviewing this patch.

>
>  Documentation/config.txt      |  3 +++
>  Documentation/git-add.txt     |  7 +++++++
>  advice.c                      |  2 ++
>  advice.h                      |  1 +
>  builtin/add.c                 | 45 ++++++++++++++++++++++++++++++++++++++++++-
>  git-submodule.sh              |  5 +++--
>  t/t7414-submodule-mistakes.sh | 40 ++++++++++++++++++++++++++++++++++++++
>  7 files changed, 100 insertions(+), 3 deletions(-)
>  create mode 100755 t/t7414-submodule-mistakes.sh
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index dd4beec39..e909239bc 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -348,6 +348,9 @@ advice.*::
>         rmHints::
>                 In case of failure in the output of linkgit:git-rm[1],
>                 show directions on how to proceed from the current state.
> +       addEmbeddedRepo::
> +               Advice on what to do when you've accidentally added one
> +               git repo inside of another.
>  --
>
>  core.fileMode::
> diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
> index 7ed63dce0..f4169fb1e 100644
> --- a/Documentation/git-add.txt
> +++ b/Documentation/git-add.txt
> @@ -165,6 +165,13 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
>         be ignored, no matter if they are already present in the work
>         tree or not.
>
> +--no-warn-embedded-repo::
> +       By default, `git add` will warn when adding an embedded
> +       repository to the index without using `git submodule add` to
> +       create an entry in `.gitmodules`. This option will suppress the
> +       warning (e.g., if you are manually performing operations on
> +       submodules).
> +
>  --chmod=(+|-)x::
>         Override the executable bit of the added files.  The executable
>         bit is only changed in the index, the files on disk are left
> diff --git a/advice.c b/advice.c
> index b84ae4960..e0611d52b 100644
> --- a/advice.c
> +++ b/advice.c
> @@ -15,6 +15,7 @@ int advice_detached_head = 1;
>  int advice_set_upstream_failure = 1;
>  int advice_object_name_warning = 1;
>  int advice_rm_hints = 1;
> +int advice_add_embedded_repo = 1;
>
>  static struct {
>         const char *name;
> @@ -35,6 +36,7 @@ static struct {
>         { "setupstreamfailure", &advice_set_upstream_failure },
>         { "objectnamewarning", &advice_object_name_warning },
>         { "rmhints", &advice_rm_hints },
> +       { "addembeddedrepo", &advice_add_embedded_repo },
>
>         /* make this an alias for backward compatibility */
>         { "pushnonfastforward", &advice_push_update_rejected }
> diff --git a/advice.h b/advice.h
> index b341a55ce..c84a44531 100644
> --- a/advice.h
> +++ b/advice.h
> @@ -18,6 +18,7 @@ extern int advice_detached_head;
>  extern int advice_set_upstream_failure;
>  extern int advice_object_name_warning;
>  extern int advice_rm_hints;
> +extern int advice_add_embedded_repo;
>
>  int git_default_advice_config(const char *var, const char *value);
>  __attribute__((format (printf, 1, 2)))
> diff --git a/builtin/add.c b/builtin/add.c
> index d9a2491e4..ea88db281 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -249,6 +249,7 @@ N_("The following paths are ignored by one of your .gitignore files:\n");
>
>  static int verbose, show_only, ignored_too, refresh_only;
>  static int ignore_add_errors, intent_to_add, ignore_missing;
> +static int warn_on_embedded_repo = 1;
>
>  #define ADDREMOVE_DEFAULT 1
>  static int addremove = ADDREMOVE_DEFAULT;
> @@ -282,6 +283,8 @@ static struct option builtin_add_options[] = {
>         OPT_BOOL( 0 , "ignore-errors", &ignore_add_errors, N_("just skip files which cannot be added because of errors")),
>         OPT_BOOL( 0 , "ignore-missing", &ignore_missing, N_("check if - even missing - files are ignored in dry run")),
>         OPT_STRING( 0 , "chmod", &chmod_arg, N_("(+/-)x"), N_("override the executable bit of the listed files")),
> +       OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
> +                       N_("warn when adding an embedded repository")),

We do not have a lot of OPT_HIDDEN_BOOLs throughout the code base.
We should use them more often.

It makes sense though in this case.

>         OPT_END(),
>  };
>
> @@ -295,6 +298,44 @@ static int add_config(const char *var, const char *value, void *cb)
>         return git_default_config(var, value, cb);
>  }
>
> +static const char embedded_advice[] = N_(
> +"You've added another git repository inside your current repository.\n"
> +"Clones of the outer repository will not also contain the contents of\n"
> +"the embedded repository. If you meant to add a submodule, use:\n"

The "will not also" sounds a bit off to me. Maybe:
  ...
  Clones of the outer repository will not contain the contents
  of the embedded repository and has no way of knowing how
  to obtain the inner repo. If you meant to add a submodule ...


> +"\n"
> +"      git submodule add <url> %s\n"
> +"\n"
> +"If you added this path by mistake, you can remove it from the\n"
> +"index with:\n"
> +"\n"
> +"      git rm --cached %s\n"
> +"\n"
> +"See \"git help submodule\" for more information."

Once the overhaul of the submodule documentation
comes along[1], we rather want to point at
"man 7 git-submodules", which explains the concepts and
then tell you about commands how to use it. For now the
git-submodule man page is ok.

[1] https://public-inbox.org/git/20170607185354.10050-1-sbeller@google.com/


> +);
> +
> +static void check_embedded_repo(const char *path)
> +{
> +       struct strbuf name = STRBUF_INIT;
> +
> +       if (!warn_on_embedded_repo)
> +               return;
> +       if (!ends_with(path, "/"))
> +               return;
> +
> +       /* Drop trailing slash for aesthetics */
> +       strbuf_addstr(&name, path);
> +       strbuf_strip_suffix(&name, "/");
> +
> +       warning(_("adding embedded git repository: %s"), name.buf);
> +       if (advice_add_embedded_repo) {
> +               advise(embedded_advice, name.buf, name.buf);
> +               /* there may be multiple entries; advise only once */
> +               advice_add_embedded_repo = 0;
> +       }
> +
> +       strbuf_release(&name);
> +}
> +
>  static int add_files(struct dir_struct *dir, int flags)
>  {
>         int i, exit_status = 0;
> @@ -307,12 +348,14 @@ static int add_files(struct dir_struct *dir, int flags)
>                 exit_status = 1;
>         }
>
> -       for (i = 0; i < dir->nr; i++)
> +       for (i = 0; i < dir->nr; i++) {
> +               check_embedded_repo(dir->entries[i]->name);
>                 if (add_file_to_index(&the_index, dir->entries[i]->name, flags)) {
>                         if (!ignore_add_errors)
>                                 die(_("adding files failed"));
>                         exit_status = 1;
>                 }
> +       }
>         return exit_status;
>  }
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index c0d0e9a4c..e131760ee 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -213,7 +213,8 @@ cmd_add()
>                 die "$(eval_gettext "'\$sm_path' already exists in the index and is not a submodule")"
>         fi
>
> -       if test -z "$force" && ! git add --dry-run --ignore-missing "$sm_path" > /dev/null 2>&1
> +       if test -z "$force" &&
> +               ! git add --dry-run --ignore-missing --no-warn-embedded-repo "$sm_path" > /dev/null 2>&1
>         then
>                 eval_gettextln "The following path is ignored by one of your .gitignore files:
>  \$sm_path
> @@ -267,7 +268,7 @@ or you are unsure what this means choose another name with the '--name' option."
>         fi
>         git config submodule."$sm_name".url "$realrepo"
>
> -       git add $force "$sm_path" ||
> +       git add --no-warn-embedded-repo $force "$sm_path" ||
>         die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
>
>         git config -f .gitmodules submodule."$sm_name".path "$sm_path" &&
> diff --git a/t/t7414-submodule-mistakes.sh b/t/t7414-submodule-mistakes.sh
> new file mode 100755
> index 000000000..8059bcb7f
> --- /dev/null
> +++ b/t/t7414-submodule-mistakes.sh
> @@ -0,0 +1,40 @@
> +#!/bin/sh
> +
> +test_description='handling of common mistakes people may make with submodules'

That is one way to say it. Do we have other tests for
"you think it is a bug, but it is features" ? ;)
I like it though. :)

> +. ./test-lib.sh
> +
> +test_expect_success 'create embedded repository' '
> +       git init embed &&
> +       (
> +               cd embed &&
> +               test_commit one
> +       )

shorter via:

  test_create_repo embed &&
  test_commit -C embed one

(and saves a shell IIRC)


> +test_expect_success 'git-add on embedded repository warns' '
> +       test_when_finished "git rm --cached -f embed" &&
> +       git add embed 2>stderr &&
> +       test_i18ngrep warning stderr
> +'
> +
> +test_expect_success '--no-warn-embedded-repo suppresses warning' '
> +       test_when_finished "git rm --cached -f embed" &&
> +       git add --no-warn-embedded-repo embed 2>stderr &&
> +       test_i18ngrep ! warning stderr
> +'
> +
> +test_expect_success 'no warning when updating entry' '
> +       test_when_finished "git rm --cached -f embed" &&
> +       git add embed &&
> +       git -C embed commit --allow-empty -m two &&
> +       git add embed 2>stderr &&
> +       test_i18ngrep ! warning stderr
> +'
> +
> +test_expect_success 'submodule add does not warn' '
> +       test_when_finished "git rm -rf submodule .gitmodules" &&
> +       git submodule add ./embed submodule 2>stderr &&
> +       test_i18ngrep ! warning stderr
> +'

Thanks for these tests.

This patch looks good to me, apart from the perceived wording nits.

Thanks,
Stefan

> +
> +test_done
> --
> 2.13.1.675.g57c06d071
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] t: move "git add submodule" into test blocks
  2017-06-13  9:24           ` [PATCH 2/2] t: move "git add submodule" into test blocks Jeff King
@ 2017-06-13 17:15             ` Stefan Beller
  0 siblings, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2017-06-13 17:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Junio C Hamano, git@vger.kernel.org

On Tue, Jun 13, 2017 at 2:24 AM, Jeff King <peff@peff.net> wrote:
> Some submodule tests do some setup outside of a test_expect
> block. This is bad because we won't actually check the
> outcome of those commands. But it's doubly so because "git
> add submodule" now produces a warning to stderr, which is
> not suppressed by the test scripts in non-verbose mode.

Makes sense.

> This patch does the minimal to fix the annoying warnings.
> All three of these scripts could use more cleanup of related
> setup.

agreed.

>
> Signed-off-by: Jeff King <peff@peff.net>

Reviewed-by: Stefan Beller <sbeller@google.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-13 17:07             ` Stefan Beller
@ 2017-06-13 17:16               ` Brandon Williams
  2017-06-14  6:36               ` Jeff King
  1 sibling, 0 replies; 23+ messages in thread
From: Brandon Williams @ 2017-06-13 17:16 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, Junio C Hamano, git@vger.kernel.org

On 06/13, Stefan Beller wrote:
> On Tue, Jun 13, 2017 at 2:24 AM, Jeff King <peff@peff.net> wrote:
> 
> > There is a config knob that can disable the (long) hint. But
> > I intentionally omitted a config knob to disable the warning
> > entirely. Whether the warning is sensible or not is
> > generally about context, not about the user's preferences.
> > If there's a tool or workflow that adds gitlinks without
> > matching .gitmodules, it should probably be taught about the
> > new command-line option, rather than blanket-disabling the
> > warning.
> >
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> > The check for "is this a gitlink" is done by looking for a
> > trailing "/" in the added path. This feels kind of hacky,
> > but actually seems to work well in practice.
> 
> This whole "slash at the end" thing comes from extensive use
> of shell completion adding the slash at the end of a directory
> IMHO. (cf. PATHSPEC_STRIP_SUBMODULE_SLASH_* is
> the same underlying hack.)

I got rid of PATHSPEC_STRIP_SUBMODULE_SLASH_* recently, just an fyi.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-13  9:24           ` [PATCH 1/2] add: warn when adding an embedded repository Jeff King
  2017-06-13 17:07             ` Stefan Beller
@ 2017-06-13 17:16             ` Junio C Hamano
  2017-06-14  6:38               ` Jeff King
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2017-06-13 17:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Stefan Beller, git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> I also waffled on whether we should ask the submodule code
> whether it knows about a particular path. Technically:
>
>   git config submodule.foo.path foo
>   git config submodule.foo.url git://...
>   git add foo
>
> is legal, but would still warn with this patch. 

Did you mean "git config -f .gitmodules" for the first two?  If so,
I think I agree that (1) it should be legal and (2) we should be
able to add the check on top of this patch.

Or do you really mean the case in which these are in .git/config?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-13 17:07             ` Stefan Beller
  2017-06-13 17:16               ` Brandon Williams
@ 2017-06-14  6:36               ` Jeff King
  2017-06-14 10:54                 ` [PATCH v2 0/2] " Jeff King
  2017-06-14 17:53                 ` [PATCH 1/2] add: warn when adding an embedded repository Stefan Beller
  1 sibling, 2 replies; 23+ messages in thread
From: Jeff King @ 2017-06-14  6:36 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git@vger.kernel.org

On Tue, Jun 13, 2017 at 10:07:43AM -0700, Stefan Beller wrote:

> > I also waffled on whether we should ask the submodule code
> > whether it knows about a particular path. Technically:
> >
> >   git config submodule.foo.path foo
> >   git config submodule.foo.url git://...
> >   git add foo
> >
> > is legal, but would still warn with this patch. I don't know
> > how much we should care (it would also be easy to do on
> > top).
> 
> And here I was thinking this is not legal, because you may override
> anything *except* submodule.*.path in the config. That is because
> all the other settings (such as url, active flag, branch,
> shallow recommendation) are dependent on the use case, the user,
> changes to the environment (url) or such. The name<->path mapping
> however is only to be changed via changes to the tracked content.
> That is why it would make sense to disallow overriding the path
> outside the tracked content.

It was probably a mistake to use normal config as the example. Junio
mentioned it as a case that could work if you communicate the submodule
URL to somebody else out-of-band. My understanding was that you could
set whatever you like in the regular config, but I think that is just
showing my ignorance of submodules.

Pretend like I said "-f .gitmodules" in each line above. ;)

> In my ideal dream world of submodules we would have the following:
> 
>   $ cat .gitmodules
>   [submodule "sub42"]
>     path = foo
>   # path only in tree!

TBH, I am not sure why we need "path"; couldn't we just use the
subsection name as an implicit path?

> > +       OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
> > +                       N_("warn when adding an embedded repository")),
> 
> We do not have a lot of OPT_HIDDEN_BOOLs throughout the code base.
> We should use them more often.
> 
> It makes sense though in this case.

Actually, my main reason is that it's nonsense to show
"--warn-embedded-repo" in the help, when it's already the default. I
would like to have written:

  OPT_NEGBOOL(0, "no-warn-embedded-repo", &warn_on_embedded_repo,
		N_("disable warning when adding an embedded repository"))

but we don't have such a thing (and the last discussion on it a few
months ago left a lot of open questions). So given that this really
isn't something I'd expect users to want, I figured hiding it was a good
idea. I mentioned it in the manpage for script writers, but it's really
not worth cluttering "git add -h".

> > +static const char embedded_advice[] = N_(
> > +"You've added another git repository inside your current repository.\n"
> > +"Clones of the outer repository will not also contain the contents of\n"
> > +"the embedded repository. If you meant to add a submodule, use:\n"
> 
> The "will not also" sounds a bit off to me. Maybe:
>   ...
>   Clones of the outer repository will not contain the contents
>   of the embedded repository and has no way of knowing how
>   to obtain the inner repo. If you meant to add a submodule ...

Yeah, I think we could just strike the "also" (I played around with the
wording here quite a bit and I think it was left from an earlier attempt
where it made more sense).

Your "no way of knowing" is probably a good thing to mention.

> > +"See \"git help submodule\" for more information."
> 
> Once the overhaul of the submodule documentation
> comes along[1], we rather want to point at
> "man 7 git-submodules", which explains the concepts and
> then tell you about commands how to use it. For now the
> git-submodule man page is ok.
> 
> [1] https://public-inbox.org/git/20170607185354.10050-1-sbeller@google.com/

Yeah, I poked around looking for a definitive "here's how submodules
work" intro. I'm happy one is in the works, and I agree this should
point there once it exists.

> > +++ b/t/t7414-submodule-mistakes.sh
> > @@ -0,0 +1,40 @@
> > +#!/bin/sh
> > +
> > +test_description='handling of common mistakes people may make with submodules'
> 
> That is one way to say it. Do we have other tests for
> "you think it is a bug, but it is features" ? ;)
> I like it though. :)

Heh. I didn't know how else to lump it together. Just "test git add on a
repository" felt like too little for its own script. I almost added it
to t7400, but I think that script is plenty long enough as it is (it's
also one of the longest-running scripts, I think).

> > +test_expect_success 'create embedded repository' '
> > +       git init embed &&
> > +       (
> > +               cd embed &&
> > +               test_commit one
> > +       )
> 
> shorter via:
> 
>   test_create_repo embed &&
>   test_commit -C embed one
> 
> (and saves a shell IIRC)

Right, I forgot we added -C there. Will change.

> Thanks for these tests.
> 
> This patch looks good to me, apart from the perceived wording nits.

Thanks. I'll re-roll with a few tweaks based on your feedback.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-13 17:16             ` Junio C Hamano
@ 2017-06-14  6:38               ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2017-06-14  6:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, Stefan Beller, git@vger.kernel.org

On Tue, Jun 13, 2017 at 10:16:15AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I also waffled on whether we should ask the submodule code
> > whether it knows about a particular path. Technically:
> >
> >   git config submodule.foo.path foo
> >   git config submodule.foo.url git://...
> >   git add foo
> >
> > is legal, but would still warn with this patch. 
> 
> Did you mean "git config -f .gitmodules" for the first two?  If so,
> I think I agree that (1) it should be legal and (2) we should be
> able to add the check on top of this patch.
> 
> Or do you really mean the case in which these are in .git/config?

I did mean .git/config, but I think it was because I was mostly confused
about what was possible. In either case I think what we'd want is to ask
"could this be used as an active submodule". And leave it to the
submodule code to decide whatever mechanisms are legal.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 0/2] warn when adding an embedded repository
  2017-06-14  6:36               ` Jeff King
@ 2017-06-14 10:54                 ` Jeff King
  2017-06-14 10:58                   ` [PATCH v2 1/2] add: " Jeff King
  2017-06-14 10:58                   ` [PATCH v2 2/2] t: move "git add submodule" into test blocks Jeff King
  2017-06-14 17:53                 ` [PATCH 1/2] add: warn when adding an embedded repository Stefan Beller
  1 sibling, 2 replies; 23+ messages in thread
From: Jeff King @ 2017-06-14 10:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git@vger.kernel.org

On Wed, Jun 14, 2017 at 02:36:14AM -0400, Jeff King wrote:

> > This patch looks good to me, apart from the perceived wording nits.
> 
> Thanks. I'll re-roll with a few tweaks based on your feedback.

Here it is. It just changes the wording and fixes the test as you
suggested.

  [1/2]: add: warn when adding an embedded repository
  [2/2]: t: move "git add submodule" into test blocks

I had some thoughts on adding a third patch to cover the case where
.gitmodules has already been updated. But I couldn't figure out how to
make it work. The patch I came up with is below:

diff --git a/builtin/add.c b/builtin/add.c
index d7cab1bd8..d20052462 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -322,6 +322,8 @@ static void check_embedded_repo(const char *path)
 		return;
 	if (!ends_with(path, "/"))
 		return;
+	if (is_submodule_initialized(path))
+		return;
 
 	/* Drop trailing slash for aesthetics */
 	strbuf_addstr(&name, path);
diff --git a/t/t7414-submodule-mistakes.sh b/t/t7414-submodule-mistakes.sh
index f2e7df59c..aee52b218 100755
--- a/t/t7414-submodule-mistakes.sh
+++ b/t/t7414-submodule-mistakes.sh
@@ -34,4 +34,14 @@ test_expect_success 'submodule add does not warn' '
 	test_i18ngrep ! warning stderr
 '
 
+test_expect_success 'making your own submodule does not warn' '
+	test_when_finished "git rm --cached -f embed &&
+			    git rm -f .gitmodules" &&
+	git config -f .gitmodules submodule.embed.path embed &&
+	git config -f .gitmodules submodule.embed.url \
+				  git://example.com/embed.git &&
+	git add embed 2>stderr &&
+	test_i18ngrep ! warning stderr
+'
+
 test_done

But that doesn't quite work. Because the submodule data is just in
.gitmodules, it's not actually "initialized". I think that might work if
instead it was:

  git config submodule.embed.url ...

but I'm not clear on whether that should work, or if it's even something
somebody would want to do. So I decided to punt on the whole thing and
let somebody work on it who a) knows more about how submodules work or
b) has some not-quite-submodule use case for gitlinks that they want to
have work (without a warning).

-Peff

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 1/2] add: warn when adding an embedded repository
  2017-06-14 10:54                 ` [PATCH v2 0/2] " Jeff King
@ 2017-06-14 10:58                   ` Jeff King
  2017-06-14 10:58                   ` [PATCH v2 2/2] t: move "git add submodule" into test blocks Jeff King
  1 sibling, 0 replies; 23+ messages in thread
From: Jeff King @ 2017-06-14 10:58 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git@vger.kernel.org

It's an easy mistake to add a repository inside another
repository, like:

  git clone $url
  git add .

The resulting entry is a gitlink, but there's no matching
.gitmodules entry. Trying to use "submodule init" (or clone
with --recursive) doesn't do anything useful. Prior to
v2.13, such an entry caused git-submodule to barf entirely.
In v2.13, the entry is considered "inactive" and quietly
ignored. Either way, no clone of your repository can do
anything useful with the gitlink without the user manually
adding the submodule config.

In most cases, the user probably meant to either add a real
submodule, or they forgot to put the embedded repository in
their .gitignore file.

Let's issue a warning when we see this case. There are a few
things to note:

  - the warning will go in the git-add porcelain; anybody
    wanting to do low-level manipulation of the index is
    welcome to create whatever funny states they want.

  - we detect the case by looking for a newly added gitlink;
    updates via "git add submodule" are perfectly reasonable,
    and this avoids us having to investigate .gitmodules
    entirely

  - there's a command-line option to suppress the warning.
    This is needed for git-submodule itself (which adds the
    entry before adding any submodule config), but also
    provides a mechanism for other scripts doing
    submodule-like things.

We could make this a hard error instead of a warning.
However, we do add lots of sub-repos in our test suite. It's
not _wrong_ to do so. It just creates a state where users
may be surprised. Pointing them in the right direction with
a gentle hint is probably the best option.

There is a config knob that can disable the (long) hint. But
I intentionally omitted a config knob to disable the warning
entirely. Whether the warning is sensible or not is
generally about context, not about the user's preferences.
If there's a tool or workflow that adds gitlinks without
matching .gitmodules, it should probably be taught about the
new command-line option, rather than blanket-disabling the
warning.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/config.txt      |  3 +++
 Documentation/git-add.txt     |  7 +++++++
 advice.c                      |  2 ++
 advice.h                      |  1 +
 builtin/add.c                 | 46 ++++++++++++++++++++++++++++++++++++++++++-
 git-submodule.sh              |  5 +++--
 t/t7414-submodule-mistakes.sh | 37 ++++++++++++++++++++++++++++++++++
 7 files changed, 98 insertions(+), 3 deletions(-)
 create mode 100755 t/t7414-submodule-mistakes.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f6278a5ae..781ce3e85 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -348,6 +348,9 @@ advice.*::
 	rmHints::
 		In case of failure in the output of linkgit:git-rm[1],
 		show directions on how to proceed from the current state.
+	addEmbeddedRepo::
+		Advice on what to do when you've accidentally added one
+		git repo inside of another.
 --
 
 core.fileMode::
diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 7ed63dce0..f4169fb1e 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -165,6 +165,13 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
 	be ignored, no matter if they are already present in the work
 	tree or not.
 
+--no-warn-embedded-repo::
+	By default, `git add` will warn when adding an embedded
+	repository to the index without using `git submodule add` to
+	create an entry in `.gitmodules`. This option will suppress the
+	warning (e.g., if you are manually performing operations on
+	submodules).
+
 --chmod=(+|-)x::
 	Override the executable bit of the added files.  The executable
 	bit is only changed in the index, the files on disk are left
diff --git a/advice.c b/advice.c
index b84ae4960..e0611d52b 100644
--- a/advice.c
+++ b/advice.c
@@ -15,6 +15,7 @@ int advice_detached_head = 1;
 int advice_set_upstream_failure = 1;
 int advice_object_name_warning = 1;
 int advice_rm_hints = 1;
+int advice_add_embedded_repo = 1;
 
 static struct {
 	const char *name;
@@ -35,6 +36,7 @@ static struct {
 	{ "setupstreamfailure", &advice_set_upstream_failure },
 	{ "objectnamewarning", &advice_object_name_warning },
 	{ "rmhints", &advice_rm_hints },
+	{ "addembeddedrepo", &advice_add_embedded_repo },
 
 	/* make this an alias for backward compatibility */
 	{ "pushnonfastforward", &advice_push_update_rejected }
diff --git a/advice.h b/advice.h
index b341a55ce..c84a44531 100644
--- a/advice.h
+++ b/advice.h
@@ -18,6 +18,7 @@ extern int advice_detached_head;
 extern int advice_set_upstream_failure;
 extern int advice_object_name_warning;
 extern int advice_rm_hints;
+extern int advice_add_embedded_repo;
 
 int git_default_advice_config(const char *var, const char *value);
 __attribute__((format (printf, 1, 2)))
diff --git a/builtin/add.c b/builtin/add.c
index d9a2491e4..d7cab1bd8 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -249,6 +249,7 @@ N_("The following paths are ignored by one of your .gitignore files:\n");
 
 static int verbose, show_only, ignored_too, refresh_only;
 static int ignore_add_errors, intent_to_add, ignore_missing;
+static int warn_on_embedded_repo = 1;
 
 #define ADDREMOVE_DEFAULT 1
 static int addremove = ADDREMOVE_DEFAULT;
@@ -282,6 +283,8 @@ static struct option builtin_add_options[] = {
 	OPT_BOOL( 0 , "ignore-errors", &ignore_add_errors, N_("just skip files which cannot be added because of errors")),
 	OPT_BOOL( 0 , "ignore-missing", &ignore_missing, N_("check if - even missing - files are ignored in dry run")),
 	OPT_STRING( 0 , "chmod", &chmod_arg, N_("(+/-)x"), N_("override the executable bit of the listed files")),
+	OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
+			N_("warn when adding an embedded repository")),
 	OPT_END(),
 };
 
@@ -295,6 +298,45 @@ static int add_config(const char *var, const char *value, void *cb)
 	return git_default_config(var, value, cb);
 }
 
+static const char embedded_advice[] = N_(
+"You've added another git repository inside your current repository.\n"
+"Clones of the outer repository will not contain the contents of\n"
+"the embedded repository and will not know how to obtain it.\n"
+"If you meant to add a submodule, use:\n"
+"\n"
+"	git submodule add <url> %s\n"
+"\n"
+"If you added this path by mistake, you can remove it from the\n"
+"index with:\n"
+"\n"
+"	git rm --cached %s\n"
+"\n"
+"See \"git help submodule\" for more information."
+);
+
+static void check_embedded_repo(const char *path)
+{
+	struct strbuf name = STRBUF_INIT;
+
+	if (!warn_on_embedded_repo)
+		return;
+	if (!ends_with(path, "/"))
+		return;
+
+	/* Drop trailing slash for aesthetics */
+	strbuf_addstr(&name, path);
+	strbuf_strip_suffix(&name, "/");
+
+	warning(_("adding embedded git repository: %s"), name.buf);
+	if (advice_add_embedded_repo) {
+		advise(embedded_advice, name.buf, name.buf);
+		/* there may be multiple entries; advise only once */
+		advice_add_embedded_repo = 0;
+	}
+
+	strbuf_release(&name);
+}
+
 static int add_files(struct dir_struct *dir, int flags)
 {
 	int i, exit_status = 0;
@@ -307,12 +349,14 @@ static int add_files(struct dir_struct *dir, int flags)
 		exit_status = 1;
 	}
 
-	for (i = 0; i < dir->nr; i++)
+	for (i = 0; i < dir->nr; i++) {
+		check_embedded_repo(dir->entries[i]->name);
 		if (add_file_to_index(&the_index, dir->entries[i]->name, flags)) {
 			if (!ignore_add_errors)
 				die(_("adding files failed"));
 			exit_status = 1;
 		}
+	}
 	return exit_status;
 }
 
diff --git a/git-submodule.sh b/git-submodule.sh
index c0d0e9a4c..e131760ee 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -213,7 +213,8 @@ cmd_add()
 		die "$(eval_gettext "'\$sm_path' already exists in the index and is not a submodule")"
 	fi
 
-	if test -z "$force" && ! git add --dry-run --ignore-missing "$sm_path" > /dev/null 2>&1
+	if test -z "$force" &&
+		! git add --dry-run --ignore-missing --no-warn-embedded-repo "$sm_path" > /dev/null 2>&1
 	then
 		eval_gettextln "The following path is ignored by one of your .gitignore files:
 \$sm_path
@@ -267,7 +268,7 @@ or you are unsure what this means choose another name with the '--name' option."
 	fi
 	git config submodule."$sm_name".url "$realrepo"
 
-	git add $force "$sm_path" ||
+	git add --no-warn-embedded-repo $force "$sm_path" ||
 	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
 
 	git config -f .gitmodules submodule."$sm_name".path "$sm_path" &&
diff --git a/t/t7414-submodule-mistakes.sh b/t/t7414-submodule-mistakes.sh
new file mode 100755
index 000000000..f2e7df59c
--- /dev/null
+++ b/t/t7414-submodule-mistakes.sh
@@ -0,0 +1,37 @@
+#!/bin/sh
+
+test_description='handling of common mistakes people may make with submodules'
+. ./test-lib.sh
+
+test_expect_success 'create embedded repository' '
+	git init embed &&
+	test_commit -C embed one
+'
+
+test_expect_success 'git-add on embedded repository warns' '
+	test_when_finished "git rm --cached -f embed" &&
+	git add embed 2>stderr &&
+	test_i18ngrep warning stderr
+'
+
+test_expect_success '--no-warn-embedded-repo suppresses warning' '
+	test_when_finished "git rm --cached -f embed" &&
+	git add --no-warn-embedded-repo embed 2>stderr &&
+	test_i18ngrep ! warning stderr
+'
+
+test_expect_success 'no warning when updating entry' '
+	test_when_finished "git rm --cached -f embed" &&
+	git add embed &&
+	git -C embed commit --allow-empty -m two &&
+	git add embed 2>stderr &&
+	test_i18ngrep ! warning stderr
+'
+
+test_expect_success 'submodule add does not warn' '
+	test_when_finished "git rm -rf submodule .gitmodules" &&
+	git submodule add ./embed submodule 2>stderr &&
+	test_i18ngrep ! warning stderr
+'
+
+test_done
-- 
2.13.1.766.g6bea926c5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 2/2] t: move "git add submodule" into test blocks
  2017-06-14 10:54                 ` [PATCH v2 0/2] " Jeff King
  2017-06-14 10:58                   ` [PATCH v2 1/2] add: " Jeff King
@ 2017-06-14 10:58                   ` Jeff King
  1 sibling, 0 replies; 23+ messages in thread
From: Jeff King @ 2017-06-14 10:58 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git@vger.kernel.org

Some submodule tests do some setup outside of a test_expect
block. This is bad because we won't actually check the
outcome of those commands. But it's doubly so because "git
add submodule" now produces a warning to stderr, which is
not suppressed by the test scripts in non-verbose mode.

This patch does the minimal to fix the annoying warnings.
All three of these scripts could use more cleanup of related
setup.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t4041-diff-submodule-option.sh             | 8 +++++---
 t/t4060-diff-submodule-option-diff-format.sh | 8 +++++---
 t/t7401-submodule-summary.sh                 | 8 +++++---
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/t/t4041-diff-submodule-option.sh b/t/t4041-diff-submodule-option.sh
index 2d9731b52..058ee0829 100755
--- a/t/t4041-diff-submodule-option.sh
+++ b/t/t4041-diff-submodule-option.sh
@@ -430,9 +430,11 @@ test_expect_success 'deleted submodule' '
 	test_cmp expected actual
 '
 
-test_create_repo sm2 &&
-head7=$(add_file sm2 foo8 foo9) &&
-git add sm2
+test_expect_success 'create second submodule' '
+	test_create_repo sm2 &&
+	head7=$(add_file sm2 foo8 foo9) &&
+	git add sm2
+'
 
 test_expect_success 'multiple submodules' '
 	git diff-index -p --submodule=log HEAD >actual &&
diff --git a/t/t4060-diff-submodule-option-diff-format.sh b/t/t4060-diff-submodule-option-diff-format.sh
index 33ec26d75..4b168d0ed 100755
--- a/t/t4060-diff-submodule-option-diff-format.sh
+++ b/t/t4060-diff-submodule-option-diff-format.sh
@@ -643,9 +643,11 @@ test_expect_success 'deleted submodule' '
 	test_cmp expected actual
 '
 
-test_create_repo sm2 &&
-head7=$(add_file sm2 foo8 foo9) &&
-git add sm2
+test_expect_success 'create second submodule' '
+	test_create_repo sm2 &&
+	head7=$(add_file sm2 foo8 foo9) &&
+	git add sm2
+'
 
 test_expect_success 'multiple submodules' '
 	git diff-index -p --submodule=diff HEAD >actual &&
diff --git a/t/t7401-submodule-summary.sh b/t/t7401-submodule-summary.sh
index 366746f0d..4e4c45550 100755
--- a/t/t7401-submodule-summary.sh
+++ b/t/t7401-submodule-summary.sh
@@ -241,9 +241,11 @@ EOF
 	test_cmp expected actual
 "
 
-test_create_repo sm2 &&
-head7=$(add_file sm2 foo8 foo9) &&
-git add sm2
+test_expect_success 'create second submodule' '
+	test_create_repo sm2 &&
+	head7=$(add_file sm2 foo8 foo9) &&
+	git add sm2
+'
 
 test_expect_success 'multiple submodules' "
 	git submodule summary >actual &&
-- 
2.13.1.766.g6bea926c5

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-14  6:36               ` Jeff King
  2017-06-14 10:54                 ` [PATCH v2 0/2] " Jeff King
@ 2017-06-14 17:53                 ` Stefan Beller
  2017-06-15  6:01                   ` Jeff King
  1 sibling, 1 reply; 23+ messages in thread
From: Stefan Beller @ 2017-06-14 17:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, Junio C Hamano, git@vger.kernel.org

On Tue, Jun 13, 2017 at 11:36 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Jun 13, 2017 at 10:07:43AM -0700, Stefan Beller wrote:
>
>> > I also waffled on whether we should ask the submodule code
>> > whether it knows about a particular path. Technically:
>> >
>> >   git config submodule.foo.path foo
>> >   git config submodule.foo.url git://...
>> >   git add foo
>> >
>> > is legal, but would still warn with this patch. I don't know
>> > how much we should care (it would also be easy to do on
>> > top).
>>
>> And here I was thinking this is not legal, because you may override
>> anything *except* submodule.*.path in the config. That is because
>> all the other settings (such as url, active flag, branch,
>> shallow recommendation) are dependent on the use case, the user,
>> changes to the environment (url) or such. The name<->path mapping
>> however is only to be changed via changes to the tracked content.
>> That is why it would make sense to disallow overriding the path
>> outside the tracked content.
>
> It was probably a mistake to use normal config as the example. Junio
> mentioned it as a case that could work if you communicate the submodule
> URL to somebody else out-of-band. My understanding was that you could
> set whatever you like in the regular config, but I think that is just
> showing my ignorance of submodules.
>
> Pretend like I said "-f .gitmodules" in each line above. ;)
>
>> In my ideal dream world of submodules we would have the following:
>>
>>   $ cat .gitmodules
>>   [submodule "sub42"]
>>     path = foo
>>   # path only in tree!
>
> TBH, I am not sure why we need "path"; couldn't we just use the
> subsection name as an implicit path?

That is what was done back in the time. But then people wanted to rename
submodules (i.e. move them around in the worktree), so the path is not
constant, so either we'd have to move around the git dir whenever the
submodule is renamed (bad idea IMO), or instead introduce a mapping
between (constant name <-> variable path). So that was done.

Historically (IIUC) we had submodule.path.url which then was changed
to submodule.name.url + name->path resolution. And as a hack(?) or
easy way out of a problem then, the name is often the same as the path
hence confusing people, when they see:

    [submodule "foo"]
        path = foo
        url = dadada/foo

What foo means what now? ;)
As a tangent: I want to make the default name different to the path.

So yeah, we want to keep the name and not mingle with implicit path.

I think we may even have bugs in our code base where the
name/path confusion shows.

Talking about another tangent:

  For files there is a rename detection available. For submodules
  It is hard to imagine that there ever will be such a rename detection
  as files have because of the explciit name<->path mapping.

  We *know* when a submodule was moved. So why even try
  to do rename detection? As we record only sha1s for a submodule
  you could swap two submodule object names by accident.
  Consider a superproject that contains different kernels, such as
  a kernel for your phone/embedded device and then a kernel for
  your workstation or other device. And these two kernels are different
  for technical reasons but share the same history.

  Now the inattentive user may make a mistake and git-add the
  "wrong" kernel submodule.  The smart Git would tell that it is a
  rename/move just as we have with files.

>
>> > +       OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
>> > +                       N_("warn when adding an embedded repository")),
>>
>> We do not have a lot of OPT_HIDDEN_BOOLs throughout the code base.
>> We should use them more often.
>>
>> It makes sense though in this case.
>
> Actually, my main reason is that it's nonsense to show
> "--warn-embedded-repo" in the help, when it's already the default. I
> would like to have written:
>
>   OPT_NEGBOOL(0, "no-warn-embedded-repo", &warn_on_embedded_repo,
>                 N_("disable warning when adding an embedded repository"))
>
> but we don't have such a thing (and the last discussion on it a few
> months ago left a lot of open questions). So given that this really
> isn't something I'd expect users to want, I figured hiding it was a good
> idea. I mentioned it in the manpage for script writers, but it's really
> not worth cluttering "git add -h".

ok :) If you really wanted, you could go with a raw OPTION though. ;)
This is fine with me though.

>
>> > +static const char embedded_advice[] = N_(
>> > +"You've added another git repository inside your current repository.\n"
>> > +"Clones of the outer repository will not also contain the contents of\n"
>> > +"the embedded repository. If you meant to add a submodule, use:\n"
>>
>> The "will not also" sounds a bit off to me. Maybe:
>>   ...
>>   Clones of the outer repository will not contain the contents
>>   of the embedded repository and has no way of knowing how
>>   to obtain the inner repo. If you meant to add a submodule ...
>
> Yeah, I think we could just strike the "also" (I played around with the
> wording here quite a bit and I think it was left from an earlier attempt
> where it made more sense).
>
> Your "no way of knowing" is probably a good thing to mention.
>
>> > +"See \"git help submodule\" for more information."
>>
>> Once the overhaul of the submodule documentation
>> comes along[1], we rather want to point at
>> "man 7 git-submodules", which explains the concepts and
>> then tell you about commands how to use it. For now the
>> git-submodule man page is ok.
>>
>> [1] https://public-inbox.org/git/20170607185354.10050-1-sbeller@google.com/
>
> Yeah, I poked around looking for a definitive "here's how submodules
> work" intro. I'm happy one is in the works, and I agree this should
> point there once it exists.
>
>> > +++ b/t/t7414-submodule-mistakes.sh
>> > @@ -0,0 +1,40 @@
>> > +#!/bin/sh
>> > +
>> > +test_description='handling of common mistakes people may make with submodules'
>>
>> That is one way to say it. Do we have other tests for
>> "you think it is a bug, but it is features" ? ;)
>> I like it though. :)
>
> Heh. I didn't know how else to lump it together. Just "test git add on a
> repository" felt like too little for its own script. I almost added it
> to t7400, but I think that script is plenty long enough as it is (it's
> also one of the longest-running scripts, I think).

Thanks for not doing that. :)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] add: warn when adding an embedded repository
  2017-06-14 17:53                 ` [PATCH 1/2] add: warn when adding an embedded repository Stefan Beller
@ 2017-06-15  6:01                   ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2017-06-15  6:01 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Brandon Williams, Junio C Hamano, git@vger.kernel.org

On Wed, Jun 14, 2017 at 10:53:12AM -0700, Stefan Beller wrote:

> >> In my ideal dream world of submodules we would have the following:
> >>
> >>   $ cat .gitmodules
> >>   [submodule "sub42"]
> >>     path = foo
> >>   # path only in tree!
> >
> > TBH, I am not sure why we need "path"; couldn't we just use the
> > subsection name as an implicit path?
> 
> That is what was done back in the time. But then people wanted to rename
> submodules (i.e. move them around in the worktree), so the path is not
> constant, so either we'd have to move around the git dir whenever the
> submodule is renamed (bad idea IMO), or instead introduce a mapping
> between (constant name <-> variable path). So that was done.

Ah, right. That makes sense. I forgot that in addition to the in-tree
path, we have to store the submodule repository itself as some name. The
extra level of indirection there isn't strictly necessary, but it lets
the "name" act as a unique id.

> Historically (IIUC) we had submodule.path.url which then was changed
> to submodule.name.url + name->path resolution. And as a hack(?) or
> easy way out of a problem then, the name is often the same as the path
> hence confusing people, when they see:
> 
>     [submodule "foo"]
>         path = foo
>         url = dadada/foo
> 
> What foo means what now? ;)

Right, I am such a person that has been confused. ;)

Thanks for explaining.

> Talking about another tangent:
> 
>   For files there is a rename detection available. For submodules
>   It is hard to imagine that there ever will be such a rename detection
>   as files have because of the explciit name<->path mapping.
> 
>   We *know* when a submodule was moved. So why even try
>   to do rename detection? As we record only sha1s for a submodule
>   you could swap two submodule object names by accident.
>   Consider a superproject that contains different kernels, such as
>   a kernel for your phone/embedded device and then a kernel for
>   your workstation or other device. And these two kernels are different
>   for technical reasons but share the same history.

Do you mean during the rename detection phase of a diff, check to see if
.gitmodules registered a change in path for a particular module (by
finding its entry in the diff and looking at both sides), and if so then
mark that as a rename for the submodule paths?

From a cursory glance, that sounds like an interesting approach.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-06-15  6:01 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-06  3:56 [BUG?] gitlink without .gitmodules no longer fails recursive clone Jeff King
2017-06-06 18:01 ` Stefan Beller
2017-06-06 18:10   ` Brandon Williams
2017-06-06 18:39     ` Jeff King
2017-06-09 23:19       ` Jeff King
2017-06-10  2:10         ` Junio C Hamano
2017-06-10  7:13           ` Jeff King
2017-06-10 11:12             ` Junio C Hamano
2017-06-12  5:30           ` Stefan Beller
2017-06-13  9:14         ` Jeff King
2017-06-13  9:24           ` [PATCH 1/2] add: warn when adding an embedded repository Jeff King
2017-06-13 17:07             ` Stefan Beller
2017-06-13 17:16               ` Brandon Williams
2017-06-14  6:36               ` Jeff King
2017-06-14 10:54                 ` [PATCH v2 0/2] " Jeff King
2017-06-14 10:58                   ` [PATCH v2 1/2] add: " Jeff King
2017-06-14 10:58                   ` [PATCH v2 2/2] t: move "git add submodule" into test blocks Jeff King
2017-06-14 17:53                 ` [PATCH 1/2] add: warn when adding an embedded repository Stefan Beller
2017-06-15  6:01                   ` Jeff King
2017-06-13 17:16             ` Junio C Hamano
2017-06-14  6:38               ` Jeff King
2017-06-13  9:24           ` [PATCH 2/2] t: move "git add submodule" into test blocks Jeff King
2017-06-13 17:15             ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).