git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] clone: add `--shallow-submodules` flag
@ 2016-04-25 18:30 Stefan Beller
  2016-04-25 21:04 ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Beller @ 2016-04-25 18:30 UTC (permalink / raw
  To: jacob.keller, larsxschneider; +Cc: git, gitster, Stefan Beller

When creating a shallow clone of a repository with submodules, the depth
argument does not influence the submodules, i.e. the submodules are done
as non-shallow clones. It is unclear what the best default is for the
depth of submodules of a shallow clone, so we need to have the possibility
to do all kinds of combinations:

* shallow super project with shallow submodules
  e.g. build bots starting always from scratch. They want to transmit
  the least amount of network data as well as using the least amount
  of space on their hard drive.
* shallow super project with unshallow submodules
  e.g. The superproject is just there to track a collection of repositories
  and it is not important to have the relationship between the repositories
  intact. However the history of the individual submodules matter.
* unshallow super project with shallow submodules
  e.g. The superproject is the actual project and the submodule is a
  library which is rarely touched.

The new switch to select submodules to be shallow or unshallow supports
all of these three cases.

It is easy to transition from the first to the second case by just
unshallowing the submodules (`git submodule foreach git fetch
--unshallow`), but it is not possible to transition from the second to the
first case (as we would have already transmitted the non shallow over
the network). That is why we want to make the first case the default in
case of a shallow super project. This leads to the inconvenience in the
second case with the shallow super project and unshallow submodules,
as you need to pass `--no-shallow-submodules`.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

  This replaces origin/sb/clone-shallow-passthru.
  
  Thanks Jacob and Lars for review!
  
Changes compared to origin/sb/clone-shallow-passthru:

 * Dropped the first patch (pass --[no-]local down to submodules).
   This was only used to ease testing in patch 3, but we can use the
   file protocol explicitely for now. Instead of a --no-local option Lars
   rather wants to see a more universal solution for specifying protocol
   issues (local/non local is just one of them, we may want to clone
   the submodules with http, while the parent project uses ssh. That option
   may go into the .gitmodules file as well as a "preferred protocol" option?)
 * squashed patches 1 & 2 together (2 was implementing the feature and 3 was
   testing only, so having just one should do as well.)
 * By squashing there is no need to reword the commit message of the 3rd patch.
 * improved documentation slightly, fixed typo in commit message.
 * deepended the history in the test cases to easier see the outcome, i.e.
   git clone --depth 2 --recurse-submodules implies a submodule depth of 1.
   
Thanks,
Stefan

 Documentation/git-clone.txt | 13 +++++--
 builtin/clone.c             |  7 ++++
 t/t5614-clone-submodules.sh | 85 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 102 insertions(+), 3 deletions(-)
 create mode 100755 t/t5614-clone-submodules.sh

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 6db7b6d..e1a21b7 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -14,8 +14,8 @@ SYNOPSIS
 	  [-o <name>] [-b <name>] [-u <upload-pack>] [--reference <repository>]
 	  [--dissociate] [--separate-git-dir <git dir>]
 	  [--depth <depth>] [--[no-]single-branch]
-	  [--recursive | --recurse-submodules] [--jobs <n>] [--] <repository>
-	  [<directory>]
+	  [--recursive | --recurse-submodules] [--[no-]shallow-submodules]
+	  [--jobs <n>] [--] <repository> [<directory>]
 
 DESCRIPTION
 -----------
@@ -190,7 +190,11 @@ objects from the source repository into a pack in the cloned repository.
 
 --depth <depth>::
 	Create a 'shallow' clone with a history truncated to the
-	specified number of revisions.
+	specified number of revisions. Implies `--single-branch` unless
+	`--no-single-branch` is given to fetch the histories near the
+	tips of all branches. This implies `--shallow-submodules`. If
+	you want to have a shallow superproject clone, but full submodules,
+	also pass `--no-shallow-submodules`.
 
 --[no-]single-branch::
 	Clone only the history leading to the tip of a single branch,
@@ -214,6 +218,9 @@ objects from the source repository into a pack in the cloned repository.
 	repository does not have a worktree/checkout (i.e. if any of
 	`--no-checkout`/`-n`, `--bare`, or `--mirror` is given)
 
+--[no-]shallow-submodules::
+	All submodules which are cloned will be shallow with a depth of 1.
+
 --separate-git-dir=<git dir>::
 	Instead of placing the cloned repository where it is supposed
 	to be, place the cloned repository at the specified directory,
diff --git a/builtin/clone.c b/builtin/clone.c
index b004fb4..ecdf308 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -40,6 +40,7 @@ static const char * const builtin_clone_usage[] = {
 
 static int option_no_checkout, option_bare, option_mirror, option_single_branch = -1;
 static int option_local = -1, option_no_hardlinks, option_shared, option_recursive;
+static int option_shallow_submodules = -1;
 static char *option_template, *option_depth;
 static char *option_origin = NULL;
 static char *option_branch = NULL;
@@ -91,6 +92,8 @@ static struct option builtin_clone_options[] = {
 		    N_("create a shallow clone of that depth")),
 	OPT_BOOL(0, "single-branch", &option_single_branch,
 		    N_("clone only one branch, HEAD or --branch")),
+	OPT_BOOL(0, "shallow-submodules", &option_shallow_submodules,
+		    N_("any cloned submodules will be shallow")),
 	OPT_STRING(0, "separate-git-dir", &real_git_dir, N_("gitdir"),
 		   N_("separate git dir from working tree")),
 	OPT_STRING_LIST('c', "config", &option_config, N_("key=value"),
@@ -727,6 +730,10 @@ static int checkout(void)
 		struct argv_array args = ARGV_ARRAY_INIT;
 		argv_array_pushl(&args, "submodule", "update", "--init", "--recursive", NULL);
 
+		if (option_shallow_submodules == 1
+		    || (option_shallow_submodules == -1 && option_depth))
+			argv_array_push(&args, "--depth=1");
+
 		if (max_jobs != -1)
 			argv_array_pushf(&args, "--jobs=%d", max_jobs);
 
diff --git a/t/t5614-clone-submodules.sh b/t/t5614-clone-submodules.sh
new file mode 100755
index 0000000..825f511
--- /dev/null
+++ b/t/t5614-clone-submodules.sh
@@ -0,0 +1,85 @@
+#!/bin/sh
+
+test_description='Test shallow cloning of repos with submodules'
+
+. ./test-lib.sh
+
+p=$(pwd)
+
+test_expect_success 'setup' '
+	git checkout -b master &&
+	test_commit commit1 &&
+	test_commit commit2 &&
+	mkdir sub &&
+	(
+		cd sub &&
+		git init &&
+		test_commit subcommit1 &&
+		test_commit subcommit2 &&
+		test_commit subcommit3
+	) &&
+	git submodule add "file://$p/sub" sub &&
+	git commit -m "add submodule"
+'
+
+test_expect_success 'nonshallow clone implies nonshallow submodule' '
+	test_when_finished "rm -rf super_clone" &&
+	git clone --recurse-submodules "file://$p/." super_clone &&
+	(
+		cd super_clone &&
+		git log --oneline >lines &&
+		test_line_count = 3 lines
+	) &&
+	(
+		cd super_clone/sub &&
+		git log --oneline >lines &&
+		test_line_count = 3 lines
+	)
+'
+
+test_expect_success 'shallow clone implies shallow submodule' '
+	test_when_finished "rm -rf super_clone" &&
+	git clone --recurse-submodules --depth 2 "file://$p/." super_clone &&
+	(
+		cd super_clone &&
+		git log --oneline >lines &&
+		test_line_count = 2 lines
+	) &&
+	(
+		cd super_clone/sub &&
+		git log --oneline >lines &&
+		test_line_count = 1 lines
+	)
+'
+
+test_expect_success 'shallow clone with non shallow submodule' '
+	test_when_finished "rm -rf super_clone" &&
+	git clone --recurse-submodules --depth 2 --no-shallow-submodules "file://$p/." super_clone &&
+	(
+		cd super_clone &&
+		git log --oneline >lines &&
+		test_line_count = 2 lines
+	) &&
+	(
+		cd super_clone/sub &&
+		git log --oneline >lines &&
+		test_line_count = 3 lines
+	)
+'
+
+test_expect_success 'non shallow clone with shallow submodule' '
+	test_when_finished "rm -rf super_clone" &&
+	git clone --recurse-submodules --no-local --shallow-submodules "file://$p/." super_clone &&
+	(
+		cd super_clone &&
+		git log --oneline >lines &&
+		test_line_count = 3 lines
+	) &&
+	(
+		cd super_clone/sub &&
+		git log --oneline >lines &&
+		test_line_count = 1 lines
+	)
+'
+
+test_done
-- 
2.7.0.rc0.39.g5327a53

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] clone: add `--shallow-submodules` flag
  2016-04-25 18:30 [PATCH] clone: add `--shallow-submodules` flag Stefan Beller
@ 2016-04-25 21:04 ` Junio C Hamano
  2016-04-25 21:25   ` Stefan Beller
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2016-04-25 21:04 UTC (permalink / raw
  To: Stefan Beller; +Cc: jacob.keller, larsxschneider, git

Stefan Beller <sbeller@google.com> writes:

>   This replaces origin/sb/clone-shallow-passthru.
> @@ -190,7 +190,11 @@ objects from the source repository into a pack in the cloned repository.
>  
>  --depth <depth>::
>  	Create a 'shallow' clone with a history truncated to the
> -	specified number of revisions.
> +	specified number of revisions. Implies `--single-branch` unless
> +	`--no-single-branch` is given to fetch the histories near the
> +	tips of all branches. This implies `--shallow-submodules`. If
> +	you want to have a shallow superproject clone, but full submodules,
> +	also pass `--no-shallow-submodules`.

This is not wrong per-se but the early half of the new text seems to
come from 28a1b569 (docs: clarify that passing --depth to git-clone
implies --single-branch, 2016-01-06), which was merged to 'master'
some time ago.

I've resolved the conflicts coming from the duplicates, so no need
to resend, but the resulting history would become misleading.  I am
undecided if I should rebasing this on top of 85705cfb (Merge branch
'ss/clone-depth-single-doc', 2016-01-20) or later.

> diff --git a/t/t5614-clone-submodules.sh b/t/t5614-clone-submodules.sh
> new file mode 100755
> index 0000000..825f511
> --- /dev/null
> +++ b/t/t5614-clone-submodules.sh
> @@ -0,0 +1,85 @@
> +#!/bin/sh
> +
> +test_description='Test shallow cloning of repos with submodules'
> +
> +. ./test-lib.sh
> +
> +p=$(pwd)

A single-letter lower-case global that needs to stay constant for
the entire file looks like a ticking time bomb screaming to be
broken by future updates.  I see $D used for this purpose in many
scripts, $here and $top in some, and $TRASH in yet some others.
Perhaps $D may be more appropriate if we wanted to keep this
ultra-short-and-cryptic while mimicking existing ones.

> +test_expect_success 'setup' '
> +	git checkout -b master &&
> +	test_commit commit1 &&
> +	test_commit commit2 &&
> +	mkdir sub &&
> +	(
> +		cd sub &&
> +		git init &&
> +		test_commit subcommit1 &&
> +		test_commit subcommit2 &&
> +		test_commit subcommit3
> +	) &&
> +	git submodule add "file://$p/sub" sub &&
> +	git commit -m "add submodule"
> +'
> +
> +test_expect_success 'nonshallow clone implies nonshallow submodule' '
> +	test_when_finished "rm -rf super_clone" &&
> +	git clone --recurse-submodules "file://$p/." super_clone &&
> +	(
> +		cd super_clone &&
> +		git log --oneline >lines &&
> +		test_line_count = 3 lines
> +	) &&
> +	(
> +		cd super_clone/sub &&
> +		git log --oneline >lines &&
> +		test_line_count = 3 lines
> +	)
> +'
> +
> +test_expect_success 'shallow clone implies shallow submodule' '
> +	test_when_finished "rm -rf super_clone" &&
> +	git clone --recurse-submodules --depth 2 "file://$p/." super_clone &&
> +	(
> +		cd super_clone &&
> +		git log --oneline >lines &&
> +		test_line_count = 2 lines
> +	) &&
> +	(
> +		cd super_clone/sub &&
> +		git log --oneline >lines &&
> +		test_line_count = 1 lines
> +	)
> +'
> +
> +test_expect_success 'shallow clone with non shallow submodule' '
> +	test_when_finished "rm -rf super_clone" &&
> +	git clone --recurse-submodules --depth 2 --no-shallow-submodules "file://$p/." super_clone &&
> +	(
> +		cd super_clone &&
> +		git log --oneline >lines &&
> +		test_line_count = 2 lines
> +	) &&
> +	(
> +		cd super_clone/sub &&
> +		git log --oneline >lines &&
> +		test_line_count = 3 lines
> +	)
> +'
> +
> +test_expect_success 'non shallow clone with shallow submodule' '
> +	test_when_finished "rm -rf super_clone" &&
> +	git clone --recurse-submodules --no-local --shallow-submodules "file://$p/." super_clone &&
> +	(
> +		cd super_clone &&
> +		git log --oneline >lines &&
> +		test_line_count = 3 lines
> +	) &&
> +	(
> +		cd super_clone/sub &&
> +		git log --oneline >lines &&
> +		test_line_count = 1 lines
> +	)
> +'
> +
> +test_done

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] clone: add `--shallow-submodules` flag
  2016-04-25 21:04 ` Junio C Hamano
@ 2016-04-25 21:25   ` Stefan Beller
  2016-04-25 22:37     ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Beller @ 2016-04-25 21:25 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Jacob Keller, Lars Schneider, git@vger.kernel.org

On Mon, Apr 25, 2016 at 2:04 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>   This replaces origin/sb/clone-shallow-passthru.
>> @@ -190,7 +190,11 @@ objects from the source repository into a pack in the cloned repository.
>>
>>  --depth <depth>::
>>       Create a 'shallow' clone with a history truncated to the
>> -     specified number of revisions.
>> +     specified number of revisions. Implies `--single-branch` unless
>> +     `--no-single-branch` is given to fetch the histories near the
>> +     tips of all branches. This implies `--shallow-submodules`. If
>> +     you want to have a shallow superproject clone, but full submodules,
>> +     also pass `--no-shallow-submodules`.
>
> This is not wrong per-se but the early half of the new text seems to
> come from 28a1b569 (docs: clarify that passing --depth to git-clone
> implies --single-branch, 2016-01-06), which was merged to 'master'
> some time ago.
>
> I've resolved the conflicts coming from the duplicates, so no need
> to resend, but the resulting history would become misleading.  I am
> undecided if I should rebasing this on top of 85705cfb (Merge branch
> 'ss/clone-depth-single-doc', 2016-01-20) or later.

I could reroll with another variable name on top of 85705cfb?

>
>> diff --git a/t/t5614-clone-submodules.sh b/t/t5614-clone-submodules.sh
>> new file mode 100755
>> index 0000000..825f511
>> --- /dev/null
>> +++ b/t/t5614-clone-submodules.sh
>> @@ -0,0 +1,85 @@
>> +#!/bin/sh
>> +
>> +test_description='Test shallow cloning of repos with submodules'
>> +
>> +. ./test-lib.sh
>> +
>> +p=$(pwd)
>
> A single-letter lower-case global that needs to stay constant for
> the entire file looks like a ticking time bomb screaming to be
> broken by future updates.

I agree. How about `currentdir`, `testdir` or `testtop` instead?
That is slightly longer than `D`, `here` or `top`, but is slightly more
informative. $TRASH would also work for me.

Talking about time bombs: I see there are many tests in single files
(e.g. 7412 has ~80 tests IIRC and adding a test in there I often spend
more time figuring out the current state of the testing directory rather than
writing the test.

So I'd rather see more testing files with fewer tests rather than a few
files with all the tests for one topic. (e.g. these tests could have gone
into the clone tests, or the basic submodule tests, but instead I chose
a new file to test this.)


> I see $D used for this purpose in many
> scripts, $here and $top in some, and $TRASH in yet some others.
> Perhaps $D may be more appropriate if we wanted to keep this
> ultra-short-and-cryptic while mimicking existing ones.

I was not keen on being ultrashort, rather I was toying around
to drop the patch introducing the --no-local option.

>
>> +test_expect_success 'setup' '
>> +     git checkout -b master &&
>> +     test_commit commit1 &&
>> +     test_commit commit2 &&
>> +     mkdir sub &&
>> +     (
>> +             cd sub &&
>> +             git init &&
>> +             test_commit subcommit1 &&
>> +             test_commit subcommit2 &&
>> +             test_commit subcommit3
>> +     ) &&
>> +     git submodule add "file://$p/sub" sub &&
>> +     git commit -m "add submodule"
>> +'
>> +
>> +test_expect_success 'nonshallow clone implies nonshallow submodule' '
>> +     test_when_finished "rm -rf super_clone" &&
>> +     git clone --recurse-submodules "file://$p/." super_clone &&
>> +     (
>> +             cd super_clone &&
>> +             git log --oneline >lines &&
>> +             test_line_count = 3 lines
>> +     ) &&
>> +     (
>> +             cd super_clone/sub &&
>> +             git log --oneline >lines &&
>> +             test_line_count = 3 lines
>> +     )
>> +'
>> +
>> +test_expect_success 'shallow clone implies shallow submodule' '
>> +     test_when_finished "rm -rf super_clone" &&
>> +     git clone --recurse-submodules --depth 2 "file://$p/." super_clone &&
>> +     (
>> +             cd super_clone &&
>> +             git log --oneline >lines &&
>> +             test_line_count = 2 lines
>> +     ) &&
>> +     (
>> +             cd super_clone/sub &&
>> +             git log --oneline >lines &&
>> +             test_line_count = 1 lines
>> +     )
>> +'
>> +
>> +test_expect_success 'shallow clone with non shallow submodule' '
>> +     test_when_finished "rm -rf super_clone" &&
>> +     git clone --recurse-submodules --depth 2 --no-shallow-submodules "file://$p/." super_clone &&
>> +     (
>> +             cd super_clone &&
>> +             git log --oneline >lines &&
>> +             test_line_count = 2 lines
>> +     ) &&
>> +     (
>> +             cd super_clone/sub &&
>> +             git log --oneline >lines &&
>> +             test_line_count = 3 lines
>> +     )
>> +'
>> +
>> +test_expect_success 'non shallow clone with shallow submodule' '
>> +     test_when_finished "rm -rf super_clone" &&
>> +     git clone --recurse-submodules --no-local --shallow-submodules "file://$p/." super_clone &&
>> +     (
>> +             cd super_clone &&
>> +             git log --oneline >lines &&
>> +             test_line_count = 3 lines
>> +     ) &&
>> +     (
>> +             cd super_clone/sub &&
>> +             git log --oneline >lines &&
>> +             test_line_count = 1 lines
>> +     )
>> +'
>> +
>> +test_done

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] clone: add `--shallow-submodules` flag
  2016-04-25 21:25   ` Stefan Beller
@ 2016-04-25 22:37     ` Junio C Hamano
  2016-04-26  1:08       ` Stefan Beller
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2016-04-25 22:37 UTC (permalink / raw
  To: Stefan Beller; +Cc: Jacob Keller, Lars Schneider, git@vger.kernel.org

Stefan Beller <sbeller@google.com> writes:

> I agree. How about `currentdir`, `testdir` or `testtop` instead?
> That is slightly longer than `D`, `here` or `top`, but is slightly more
> informative. $TRASH would also work for me.

I would not be happy to see a patch that adds yet another variable
that is never used so far, like currentdir, testdir, or testtop.

Among the ones I found that are already in use, $here is probably my
favourite, because it does _not_ have to be set to the top; it just
says "this is the directory my main focus is in".

Having said that, personally, I find $D (as long as D is in capital)
is distinctive and descriptive enough.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] clone: add `--shallow-submodules` flag
  2016-04-25 22:37     ` Junio C Hamano
@ 2016-04-26  1:08       ` Stefan Beller
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Beller @ 2016-04-26  1:08 UTC (permalink / raw
  To: Junio C Hamano; +Cc: Jacob Keller, Lars Schneider, git@vger.kernel.org

On Mon, Apr 25, 2016 at 3:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> I agree. How about `currentdir`, `testdir` or `testtop` instead?
>> That is slightly longer than `D`, `here` or `top`, but is slightly more
>> informative. $TRASH would also work for me.
>
> I would not be happy to see a patch that adds yet another variable
> that is never used so far, like currentdir, testdir, or testtop.

D seems to be popular in 5510, the submodule crowd seems to prefer
$pwd though, see
t5526-fetch-submodules.sh
t7406-submodule-update.sh
t7407-submodule-foreach.s
t7400-submodule-basic.sh

as grep -r "\$pwd" put it.

>
> Among the ones I found that are already in use, $here is probably my
> favourite, because it does _not_ have to be set to the top; it just
> says "this is the directory my main focus is in".
>
> Having said that, personally, I find $D (as long as D is in capital)
> is distinctive and descriptive enough.

I rather go with pwd for now (it keeps the submodule stuff consistent),

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-04-26  1:08 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-25 18:30 [PATCH] clone: add `--shallow-submodules` flag Stefan Beller
2016-04-25 21:04 ` Junio C Hamano
2016-04-25 21:25   ` Stefan Beller
2016-04-25 22:37     ` Junio C Hamano
2016-04-26  1:08       ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).