git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [GSoC][PATCH v1] Disallow git commands from within unpopulated submodules
@ 2017-04-06  6:00 Prathamesh Chavan
  2017-04-06  6:11 ` Prathamesh Chavan
  2017-04-06 18:15 ` Stefan Beller
  0 siblings, 2 replies; 4+ messages in thread
From: Prathamesh Chavan @ 2017-04-06  6:00 UTC (permalink / raw)
  To: git; +Cc: =sbeller, peff, Prathamesh Chavan

The main motivations for disallowing git commands within an
unpopulated submodule are:

Whenever we run "git -C status" within an unpopulated submodule, it
falls back to the superproject. This occurs since there is no .git
file in the submodule directory. So superproject's status gets displayed.
Also, the user's intention is not clear behind running the command
in an unpopulated submodule. Hence we prefer to error out.

When we run the command "git -C sub add ." within a submodule, the
results observed are:

In the case of the populated submodule, it acts like running “git add .“
inside the submodule. This is uncontroversial and runs as expected.

In the case of the unpopulated submodule, the user's intention behind
entering the above command is unclear. He may have intended to add
the submodule to the superproject or to add all files inside the
sub/ directory to the submodule or superproject. Hence we’ll prefer
to error out in these case.

Eventually, we use a check_prefix_inside_submodule to see check if the
path is inside an unpopulated submodule. If it is, then we report the
user about the unpopulated submodule.

Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---

Since this patch effectively uses RUN_SETUP, builtin commands like
'diff' and other non-builtin commands are not filtered.
For such cases, I think, we need to handle them separately.

Also since currently, git-submodule is not a builtin command, the
command for initializing and updating the submodule doesn't return an
error message, but once it is converted to builtin, we need to handle
its case explicitly.

The build report of this patch is available on:
https://travis-ci.org/pratham-pc/git/builds/219030999

Also, the above patch was initially my GSoC project topic, but I changed
it later on and added these bug fixes to my wishlist of the proposal.

 builtin/submodule--helper.c      |  4 ----
 git.c                            |  3 +++
 submodule.c                      | 45 ++++++++++++++++++++++++++++++++++++++++
 submodule.h                      |  6 ++++++
 t/t6134-pathspec-in-submodule.sh |  2 +-
 5 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 15a5430c0..4f7c7d7b8 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -219,10 +219,6 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
-struct module_list {
-	const struct cache_entry **entries;
-	int alloc, nr;
-};
 #define MODULE_LIST_INIT { NULL, 0, 0 }
 
 static int module_list_compute(int argc, const char **argv,
diff --git a/git.c b/git.c
index 33f52acbc..eefe3fb01 100644
--- a/git.c
+++ b/git.c
@@ -2,6 +2,7 @@
 #include "exec_cmd.h"
 #include "help.h"
 #include "run-command.h"
+#include "submodule.h"
 
 const char git_usage_string[] =
 	"git [--version] [--help] [-C <path>] [-c name=value]\n"
@@ -364,6 +365,8 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 		if (prefix)
 			die("can't use --super-prefix from a subdirectory");
 	}
+	if (prefix)
+		check_prefix_inside_submodule(prefix);
 
 	if (!help && p->option & NEED_WORK_TREE)
 		setup_work_tree();
diff --git a/submodule.c b/submodule.c
index 0a2831d84..d2c3023bf 100644
--- a/submodule.c
+++ b/submodule.c
@@ -545,6 +545,51 @@ void set_config_fetch_recurse_submodules(int value)
 	config_fetch_recurse_submodules = value;
 }
 
+#define MODULE_LIST_INIT { NULL, 0, 0 }
+
+void check_prefix_inside_submodule(const char *prefix)
+{
+	struct module_list list = MODULE_LIST_INIT;
+	int i;
+
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	for (i = 0; i < active_nr; i++) {
+		const struct cache_entry *ce = active_cache[i];
+
+		if (!S_ISGITLINK(ce->ce_mode))
+				continue;
+
+		ALLOC_GROW(list.entries, list.nr + 1, list.alloc);
+		list.entries[list.nr++] = ce;
+		while (i + 1 < active_nr &&
+			!strcmp(ce->name, active_cache[i + 1]->name))
+			 /*
+			  * Skip entries with the same name in different stages
+			  * to make sure an entry is returned only once.
+			  */
+			i++;
+	}
+
+	for(i = 0; i < list.nr; i++) {
+		if(strlen((*list.entries[i]).name) ==  strlen(prefix)) {
+			if (!strcmp((*list.entries[i]).name, prefix)) {
+				/* This case cannot happen because */
+				die("BUG: prefixes end with '/', but we do not record ending slashes in the index");
+			}
+		}
+		else if(strlen((*list.entries[i]).name) ==  strlen(prefix)-1) {
+			const char *out = NULL;
+			if(skip_prefix(prefix, (*list.entries[i]).name, &out)) {
+				if(strlen(out) == 1 && out[0] == '/')
+					die(_("command from inside unpopulated submodule '%s' not supported."), (*list.entries[i]).name);
+			}
+		}
+	}
+
+}
+
 static int has_remote(const char *refname, const struct object_id *oid,
 		      int flags, void *cb_data)
 {
diff --git a/submodule.h b/submodule.h
index 05ab674f0..5e41e5afc 100644
--- a/submodule.h
+++ b/submodule.h
@@ -31,6 +31,12 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
+struct module_list {
+	const struct cache_entry **entries;
+	int alloc, nr;
+};
+
+extern void check_prefix_inside_submodule(const char *prefix);
 extern int is_staging_gitmodules_ok(void);
 extern int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 extern int remove_path_from_gitmodules(const char *path);
diff --git a/t/t6134-pathspec-in-submodule.sh b/t/t6134-pathspec-in-submodule.sh
index fd401ca60..086cc4c47 100755
--- a/t/t6134-pathspec-in-submodule.sh
+++ b/t/t6134-pathspec-in-submodule.sh
@@ -25,7 +25,7 @@ test_expect_success 'error message for path inside submodule' '
 '
 
 cat <<EOF >expect
-fatal: Pathspec '.' is in submodule 'sub'
+fatal: command from inside unpopulated submodule 'sub' not supported.
 EOF
 
 test_expect_success 'error message for path inside submodule from within submodule' '
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [GSoC][PATCH v1] Disallow git commands from within unpopulated submodules
  2017-04-06  6:00 [GSoC][PATCH v1] Disallow git commands from within unpopulated submodules Prathamesh Chavan
@ 2017-04-06  6:11 ` Prathamesh Chavan
  2017-04-06 18:15 ` Stefan Beller
  1 sibling, 0 replies; 4+ messages in thread
From: Prathamesh Chavan @ 2017-04-06  6:11 UTC (permalink / raw)
  To: pc44800; +Cc: sbeller, git, peff

The main motivations for disallowing git commands within an
unpopulated submodule are:

Whenever we run "git -C status" within an unpopulated submodule, it
falls back to the superproject. This occurs since there is no .git
file in the submodule directory. So superproject's status gets displayed.
Also, the user's intention is not clear behind running the command
in an unpopulated submodule. Hence we prefer to error out.

When we run the command "git -C sub add ." within a submodule, the
results observed are:

In the case of the populated submodule, it acts like running “git add .“
inside the submodule. This is uncontroversial and runs as expected.

In the case of the unpopulated submodule, the user's intention behind
entering the above command is unclear. He may have intended to add
the submodule to the superproject or to add all files inside the
sub/ directory to the submodule or superproject. Hence we’ll prefer
to error out in these case.

Eventually, we use a check_prefix_inside_submodule to see check if the
path is inside an unpopulated submodule. If it is, then we report the
user about the unpopulated submodule.

Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
---

Since this patch effectively uses RUN_SETUP, builtin commands like
'diff' and other non-builtin commands are not filtered.
For such cases, I think, we need to handle them separately.

Also since currently, git-submodule is not a builtin command, the
command for initializing and updating the submodule doesn't return an
error message, but once it is converted to builtin, we need to handle
its case explicitly.

The build report of this patch is available on:
https://travis-ci.org/pratham-pc/git/builds/219030999

Also, the above patch was initially my GSoC project topic, but I changed
it later on and added these bug fixes to my wishlist of the proposal.

(Had to send the patch again since a typo occurred while sending previous the mail)

 builtin/submodule--helper.c      |  4 ----
 git.c                            |  3 +++
 submodule.c                      | 45 ++++++++++++++++++++++++++++++++++++++++
 submodule.h                      |  6 ++++++
 t/t6134-pathspec-in-submodule.sh |  2 +-
 5 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 15a5430c0..4f7c7d7b8 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -219,10 +219,6 @@ static int resolve_relative_url_test(int argc, const char **argv, const char *pr
 	return 0;
 }
 
-struct module_list {
-	const struct cache_entry **entries;
-	int alloc, nr;
-};
 #define MODULE_LIST_INIT { NULL, 0, 0 }
 
 static int module_list_compute(int argc, const char **argv,
diff --git a/git.c b/git.c
index 33f52acbc..eefe3fb01 100644
--- a/git.c
+++ b/git.c
@@ -2,6 +2,7 @@
 #include "exec_cmd.h"
 #include "help.h"
 #include "run-command.h"
+#include "submodule.h"
 
 const char git_usage_string[] =
 	"git [--version] [--help] [-C <path>] [-c name=value]\n"
@@ -364,6 +365,8 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 		if (prefix)
 			die("can't use --super-prefix from a subdirectory");
 	}
+	if (prefix)
+		check_prefix_inside_submodule(prefix);
 
 	if (!help && p->option & NEED_WORK_TREE)
 		setup_work_tree();
diff --git a/submodule.c b/submodule.c
index 0a2831d84..d2c3023bf 100644
--- a/submodule.c
+++ b/submodule.c
@@ -545,6 +545,51 @@ void set_config_fetch_recurse_submodules(int value)
 	config_fetch_recurse_submodules = value;
 }
 
+#define MODULE_LIST_INIT { NULL, 0, 0 }
+
+void check_prefix_inside_submodule(const char *prefix)
+{
+	struct module_list list = MODULE_LIST_INIT;
+	int i;
+
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	for (i = 0; i < active_nr; i++) {
+		const struct cache_entry *ce = active_cache[i];
+
+		if (!S_ISGITLINK(ce->ce_mode))
+				continue;
+
+		ALLOC_GROW(list.entries, list.nr + 1, list.alloc);
+		list.entries[list.nr++] = ce;
+		while (i + 1 < active_nr &&
+			!strcmp(ce->name, active_cache[i + 1]->name))
+			 /*
+			  * Skip entries with the same name in different stages
+			  * to make sure an entry is returned only once.
+			  */
+			i++;
+	}
+
+	for(i = 0; i < list.nr; i++) {
+		if(strlen((*list.entries[i]).name) ==  strlen(prefix)) {
+			if (!strcmp((*list.entries[i]).name, prefix)) {
+				/* This case cannot happen because */
+				die("BUG: prefixes end with '/', but we do not record ending slashes in the index");
+			}
+		}
+		else if(strlen((*list.entries[i]).name) ==  strlen(prefix)-1) {
+			const char *out = NULL;
+			if(skip_prefix(prefix, (*list.entries[i]).name, &out)) {
+				if(strlen(out) == 1 && out[0] == '/')
+					die(_("command from inside unpopulated submodule '%s' not supported."), (*list.entries[i]).name);
+			}
+		}
+	}
+
+}
+
 static int has_remote(const char *refname, const struct object_id *oid,
 		      int flags, void *cb_data)
 {
diff --git a/submodule.h b/submodule.h
index 05ab674f0..5e41e5afc 100644
--- a/submodule.h
+++ b/submodule.h
@@ -31,6 +31,12 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
+struct module_list {
+	const struct cache_entry **entries;
+	int alloc, nr;
+};
+
+extern void check_prefix_inside_submodule(const char *prefix);
 extern int is_staging_gitmodules_ok(void);
 extern int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 extern int remove_path_from_gitmodules(const char *path);
diff --git a/t/t6134-pathspec-in-submodule.sh b/t/t6134-pathspec-in-submodule.sh
index fd401ca60..086cc4c47 100755
--- a/t/t6134-pathspec-in-submodule.sh
+++ b/t/t6134-pathspec-in-submodule.sh
@@ -25,7 +25,7 @@ test_expect_success 'error message for path inside submodule' '
 '
 
 cat <<EOF >expect
-fatal: Pathspec '.' is in submodule 'sub'
+fatal: command from inside unpopulated submodule 'sub' not supported.
 EOF
 
 test_expect_success 'error message for path inside submodule from within submodule' '
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [GSoC][PATCH v1] Disallow git commands from within unpopulated submodules
  2017-04-06  6:00 [GSoC][PATCH v1] Disallow git commands from within unpopulated submodules Prathamesh Chavan
  2017-04-06  6:11 ` Prathamesh Chavan
@ 2017-04-06 18:15 ` Stefan Beller
  2017-04-06 20:48   ` Prathamesh Chavan
  1 sibling, 1 reply; 4+ messages in thread
From: Stefan Beller @ 2017-04-06 18:15 UTC (permalink / raw)
  To: Prathamesh Chavan; +Cc: git@vger.kernel.org, Jeff King

On Wed, Apr 5, 2017 at 11:00 PM, Prathamesh Chavan <pc44800@gmail.com> wrote:
> The main motivations for disallowing git commands within an
> unpopulated submodule are:
>
> Whenever we run "git -C status" within an unpopulated submodule, it
> falls back to the superproject. This occurs since there is no .git
> file in the submodule directory. So superproject's status gets displayed.
> Also, the user's intention is not clear behind running the command
> in an unpopulated submodule. Hence we prefer to error out.
>
> When we run the command "git -C sub add ." within a submodule, the
> results observed are:
>
> In the case of the populated submodule, it acts like running “git add .“
> inside the submodule. This is uncontroversial and runs as expected.
>
> In the case of the unpopulated submodule, the user's intention behind
> entering the above command is unclear. He may have intended to add
> the submodule to the superproject or to add all files inside the
> sub/ directory to the submodule or superproject. Hence we’ll prefer
> to error out in these case.
>
> Eventually, we use a check_prefix_inside_submodule to see check if the
> path is inside an unpopulated submodule. If it is, then we report the
> user about the unpopulated submodule.
>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> ---
>
> Since this patch effectively uses RUN_SETUP, builtin commands like
> 'diff' and other non-builtin commands are not filtered.
> For such cases, I think, we need to handle them separately.
>
> Also since currently, git-submodule is not a builtin command, the
> command for initializing and updating the submodule doesn't return an
> error message, but once it is converted to builtin, we need to handle
> its case explicitly.
>
> The build report of this patch is available on:
> https://travis-ci.org/pratham-pc/git/builds/219030999
>
> Also, the above patch was initially my GSoC project topic, but I changed
> it later on and added these bug fixes to my wishlist of the proposal.

Thanks for picking this topic up. :)

A couple of weeks back I floated a similar proposal of a patch[1], but
as far as I remember Peff hinted that it is a bad UI to do it on such a
generic early level[2]. And you also mention here that we'd not affect
git-diff or other commands that do not have RUN_SETUP set.

[1] https://public-inbox.org/git/20170119193023.26837-1-sbeller@google.com/
[2] from the same thread as [1]:
  https://public-inbox.org/git/20170120191728.l3ne5tt5pwbmafjh@sigill.intra.peff.net/

And I agree with Peff here that this high level catching is not the best way to
go. Rather we'd have to go through each command, e.g. in git-status I could
imagine it could look like: (white space mangled):

    diff --git a/builtin/commit.c b/builtin/commit.c
    index 4e288bc513..e3c44d4ac4 100644
    --- a/builtin/commit.c
    +++ b/builtin/commit.c
    @@ -1328,6 +1328,23 @@ static int git_status_config(const char *k,
const char *v, void *cb)
         return git_diff_ui_config(k, v, NULL);
     }

    +static void print_warning_inside_submodule(int status_format,
const char *prefix)
    +{
    +    switch (status_format) {
    +    case STATUS_FORMAT_UNSPECIFIED: /* fall through */
    +    case STATUS_FORMAT_NONE:     /* fall through */
    +    case STATUS_FORMAT_LONG:
    +        printf(_("\n\nWARNING: \n\nIn uninitialized submodule
'%s'\n\n\n"), prefix);
    +    case STATUS_FORMAT_SHORT:
    +        printf(_("WARNING: In uninitialized submodule '%s'\n"), prefix);
    +    case STATUS_FORMAT_PORCELAIN:
    +        /* cannot encode the warning in porcelain v1. */
    +        break;
    +    case STATUS_FORMAT_PORCELAIN_V2:
    +        printf("# WARNING prefix in submodule\n");
    +    }
    +}
    +
     int cmd_status(int argc, const char **argv, const char *prefix)
     {
         static struct wt_status s;
    @@ -1380,6 +1397,9 @@ int cmd_status(int argc, const char **argv,
const char *prefix)
         read_cache_preload(&s.pathspec);
         refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED,
&s.pathspec, NULL, NULL);

    +    if (prefix && check_prefix_inside_submodule(prefix))
    +        print_warning_inside_submodule(status_format, prefix);
    +
         fd = hold_locked_index(&index_lock, 0);

         s.is_initial = get_sha1(s.reference, oid.hash) ? 1 : 0;

>
> +#define MODULE_LIST_INIT { NULL, 0, 0 }

I'd keep this initializer macro near the definition, i.e. in the header
(compare to STRBUF_INIT or STRING_LIST_INIT_* for example),
as this can then be used where ever we can use the data structure.

> +void check_prefix_inside_submodule(const char *prefix)

I think we'll need this function in returning way, i.e.

    int check_prefix_inside_submodule(const char *prefix)
    {
        ... do check ...
        return result;
    }

> +{
> +       struct module_list list = MODULE_LIST_INIT;
> +       int i;
> +
> +       if (read_cache() < 0)
> +               die(_("index file corrupt"));
> +
> +       for (i = 0; i < active_nr; i++) {
> +               const struct cache_entry *ce = active_cache[i];
> +
> +               if (!S_ISGITLINK(ce->ce_mode))
> +                               continue;
> +
> +               ALLOC_GROW(list.entries, list.nr + 1, list.alloc);
> +               list.entries[list.nr++] = ce;
> +               while (i + 1 < active_nr &&
> +                       !strcmp(ce->name, active_cache[i + 1]->name))
> +                        /*
> +                         * Skip entries with the same name in different stages
> +                         * to make sure an entry is returned only once.
> +                         */
> +                       i++;
> +       }

The code up to here seems to be partially duplicate of
submodule--helper.c#module_list_compute().

At first I tried coming up with a nice code deduplication (i.e. put
the essential parts as a function somewhere), but I think this doesn't
make sense from an algorithmic point of view, because this runs in O(n)
of active_cache.

active_cache is sorted by (1st) alphabet and (2nd) the index stage.
The second sorting is why we have the while loop with the comment
in the code.

The problem we are trying to solve here, is "Does the active_index contain
prefix?", which can be done in O(log n) with a binary search, because
active_index is sorted by alphabet.

So I am not sure how much code we can reuse here. nevertheless:

> +       for(i = 0; i < list.nr; i++) {

Style nit: We prefer a whitespace between a control statement
(for, if, while) and the opening parens, i.e. "for (i = .."

> +               if(strlen((*list.entries[i]).name) ==  strlen(prefix)) {

(*list.entries[i]).name

can be simplified. The dereference *, combined with an access
of the struct member can be done as ->.
(*foo).bar is equal to foo->bar.

> +               }
> +               else if(strlen((*list.entries[i]).name) ==  strlen(prefix)-1) {

The Git coding style prefers to have the closing brace and the else
on the same line:

        ..
    } else if (..) {
        ..


> +                       const char *out = NULL;
> +                       if(skip_prefix(prefix, (*list.entries[i]).name, &out)) {
> +                               if(strlen(out) == 1 && out[0] == '/')

The strlen operation can take quite long potentially (O(n) with n the
length of out).
So you could put the cheaper operation first, or the following would
check the same
without having to compute the length:

    if (out && out[0] == '/' && !out + 1)
        ..

> +                                       die(_("command from inside unpopulated submodule '%s' not supported."), (*list.entries[i]).name);

Once we have this function inside each command, we can be more precise
the error message. :)

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [GSoC][PATCH v1] Disallow git commands from within unpopulated submodules
  2017-04-06 18:15 ` Stefan Beller
@ 2017-04-06 20:48   ` Prathamesh Chavan
  0 siblings, 0 replies; 4+ messages in thread
From: Prathamesh Chavan @ 2017-04-06 20:48 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, Jeff King

> A couple of weeks back I floated a similar proposal of a patch[1], but
> as far as I remember Peff hinted that it is a bad UI to do it on such a
> generic early level[2]. And you also mention here that we'd not affect
> git-diff or other commands that do not have RUN_SETUP set.
>
> [1] https://public-inbox.org/git/20170119193023.26837-1-sbeller@google.com/
> [2] from the same thread as [1]:
>   https://public-inbox.org/git/20170120191728.l3ne5tt5pwbmafjh@sigill.intra.peff.net/
>
> And I agree with Peff here that this high level catching is not the best way to
> go. Rather we'd have to go through each command, e.g. in git-status I could
> imagine it could look like: (white space mangled):
>
>     diff --git a/builtin/commit.c b/builtin/commit.c
>     index 4e288bc513..e3c44d4ac4 100644
>     --- a/builtin/commit.c
>     +++ b/builtin/commit.c
>     @@ -1328,6 +1328,23 @@ static int git_status_config(const char *k,
> const char *v, void *cb)
>          return git_diff_ui_config(k, v, NULL);
>      }
>
>     +static void print_warning_inside_submodule(int status_format,
> const char *prefix)
>     +{
>     +    switch (status_format) {
>     +    case STATUS_FORMAT_UNSPECIFIED: /* fall through */
>     +    case STATUS_FORMAT_NONE:     /* fall through */
>     +    case STATUS_FORMAT_LONG:
>     +        printf(_("\n\nWARNING: \n\nIn uninitialized submodule
> '%s'\n\n\n"), prefix);
>     +    case STATUS_FORMAT_SHORT:
>     +        printf(_("WARNING: In uninitialized submodule '%s'\n"), prefix);
>     +    case STATUS_FORMAT_PORCELAIN:
>     +        /* cannot encode the warning in porcelain v1. */
>     +        break;
>     +    case STATUS_FORMAT_PORCELAIN_V2:
>     +        printf("# WARNING prefix in submodule\n");
>     +    }
>     +}
>     +
>      int cmd_status(int argc, const char **argv, const char *prefix)
>      {
>          static struct wt_status s;
>     @@ -1380,6 +1397,9 @@ int cmd_status(int argc, const char **argv,
> const char *prefix)
>          read_cache_preload(&s.pathspec);
>          refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED,
> &s.pathspec, NULL, NULL);
>
>     +    if (prefix && check_prefix_inside_submodule(prefix))
>     +        print_warning_inside_submodule(status_format, prefix);
>     +
>          fd = hold_locked_index(&index_lock, 0);
>
>          s.is_initial = get_sha1(s.reference, oid.hash) ? 1 : 0;
>
Yes, even after finishing my patch and finding out that it doesn't work
for various commands, high level catching doesn't seems correct. Also
after you mention in the status command itself, as we don't expect it to
error out in every case, hence making the change in every command
individually will be an appropriate choice. Also in this way, we can also
include the case of git-diff and also for the non-bultin commands.

>>
>> +#define MODULE_LIST_INIT { NULL, 0, 0 }
>
> I'd keep this initializer macro near the definition, i.e. in the header
> (compare to STRBUF_INIT or STRING_LIST_INIT_* for example),
> as this can then be used where ever we can use the data structure.
>
I was a little confused about its location too. But I agree with you
that as it can be used where ever we can later, keeping it in
header will be more appropriate.

>> +void check_prefix_inside_submodule(const char *prefix)
>
> I think we'll need this function in returning way, i.e.
>
>     int check_prefix_inside_submodule(const char *prefix)
>     {
>         ... do check ...
>         return result;
>     }
>
>> +{
>> +       struct module_list list = MODULE_LIST_INIT;
>> +       int i;
>> +
>> +       if (read_cache() < 0)
>> +               die(_("index file corrupt"));
>> +
>> +       for (i = 0; i < active_nr; i++) {
>> +               const struct cache_entry *ce = active_cache[i];
>> +
>> +               if (!S_ISGITLINK(ce->ce_mode))
>> +                               continue;
>> +
>> +               ALLOC_GROW(list.entries, list.nr + 1, list.alloc);
>> +               list.entries[list.nr++] = ce;
>> +               while (i + 1 < active_nr &&
>> +                       !strcmp(ce->name, active_cache[i + 1]->name))
>> +                        /*
>> +                         * Skip entries with the same name in different stages
>> +                         * to make sure an entry is returned only once.
>> +                         */
>> +                       i++;
>> +       }
>
> The code up to here seems to be partially duplicate of
> submodule--helper.c#module_list_compute().
>
> At first I tried coming up with a nice code deduplication (i.e. put
> the essential parts as a function somewhere), but I think this doesn't
> make sense from an algorithmic point of view, because this runs in O(n)
> of active_cache.
>
> active_cache is sorted by (1st) alphabet and (2nd) the index stage.
> The second sorting is why we have the while loop with the comment
> in the code.

Sorry, I was unaware about the active_cache being sorted, hence went
for linear search.

>
> The problem we are trying to solve here, is "Does the active_index contain
> prefix?", which can be done in O(log n) with a binary search, because
> active_index is sorted by alphabet.
>
> So I am not sure how much code we can reuse here. nevertheless:
>
>> +       for(i = 0; i < list.nr; i++) {
>
> Style nit: We prefer a whitespace between a control statement
> (for, if, while) and the opening parens, i.e. "for (i = .."
>
>> +               if(strlen((*list.entries[i]).name) ==  strlen(prefix)) {
>
> (*list.entries[i]).name
>
> can be simplified. The dereference *, combined with an access
> of the struct member can be done as ->.
> (*foo).bar is equal to foo->bar.
>
>> +               }
>> +               else if(strlen((*list.entries[i]).name) ==  strlen(prefix)-1) {
>
> The Git coding style prefers to have the closing brace and the else
> on the same line:
>
>         ..
>     } else if (..) {
>         ..
>

Sorry for being sloppy with the coding style. In all my future patches
I'll make sure I learn from all the mistakes I made so far and also
be more careful.

>
>> +                       const char *out = NULL;
>> +                       if(skip_prefix(prefix, (*list.entries[i]).name, &out)) {
>> +                               if(strlen(out) == 1 && out[0] == '/')
>
> The strlen operation can take quite long potentially (O(n) with n the
> length of out).
> So you could put the cheaper operation first, or the following would
> check the same
> without having to compute the length:
>
>     if (out && out[0] == '/' && !out + 1)
>         ..
>
>> +                                       die(_("command from inside unpopulated submodule '%s' not supported."), (*list.entries[i]).name);
>
> Once we have this function inside each command, we can be more precise
> the error message. :)

Thanks for all the advices on this patch. I'll implement the above changes
in a new patch series.
The approach that I'll take will be as follows:
1. Create function check_prefix_inside_submodule which returns value
   accordingly. Also this function is implemented by checking the active_cache
   list and checking if it prefix is present in active_cache[i]->name,
   using binary search.
   Also is there some other way which is more quicker than searching
   for the prefix in the cache?
2. Individually call this in each command after the git environment is set
   and options are parsed.
3. Apply this change for appropriate options only, as suggested above for
   the status command.
This will ensure more accuracy.
I have also mentioned to work on this BUG in my git proposal as well, but I
kept it in my wishlist section. Hence, I'll continue to work on this as my time
permits. Currently, I'm also working on converting the git-submodule
subcommand 'foreach' from script to builtin, by first converting it
to a function in submodule--helper.c and then later converting it
to builtin.

Thanks,
Prathamesh Chavan

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-04-06 20:48 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-06  6:00 [GSoC][PATCH v1] Disallow git commands from within unpopulated submodules Prathamesh Chavan
2017-04-06  6:11 ` Prathamesh Chavan
2017-04-06 18:15 ` Stefan Beller
2017-04-06 20:48   ` Prathamesh Chavan

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).