git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: Junio C Hamano <gitster@pobox.com>, git <git@vger.kernel.org>
Subject: Re: [PATCH 7/9] submodule: fetch in submodules git directory instead of in worktree
Date: Tue, 23 Oct 2018 11:26:32 -0700
Message-ID: <CAGZ79kbunyc02802+4sjGwthBnj_=eS=+z5B_WQSRdbUAHw1tQ@mail.gmail.com> (raw)
In-Reply-To: <20181017225811.66554-1-jonathantanmy@google.com>

On Wed, Oct 17, 2018 at 3:58 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > This patch started as a refactoring to make 'get_next_submodule' more
> > readable, but upon doing so, I realized that "git fetch" of the submodule
> > actually doesn't need to be run in the submodules worktree. So let's run
> > it in its git dir instead.
>
> The commit message needs to be updated, I think - this patch does
> significantly more than fetching in the gitdir.

From my point of view, it is not significant, but refactoring.
I'll think how to write a better commit message.

> > This patch leaks the cp->dir in get_next_submodule, as any further
> > callback in run_processes_parallel doesn't have access to the child
> > process any more.
>
> The cp->dir is already leaked - probably better to write "cp->dir in
> get_next_submodule() is still leaked, but this will be fixed in a
> subsequent patch".

... which fails to mention the reason why (as it is hard to do given
the current design) but is more concise.

> > +static void prepare_submodule_repo_env_in_gitdir(struct argv_array *out)
> > +{
> > +     prepare_submodule_repo_env_no_git_dir(out);
> > +     argv_array_pushf(out, "%s=.", GIT_DIR_ENVIRONMENT);
>
> Why does GIT_DIR need to be set? Is it to avoid subcommands recursively
> checking the parent directories in case the CWD is a malformed Git
> repository? If yes, maybe it's worth adding a comment.

It is copying the structure from prepare_submodule_repo_env,
specifically 10f5c52656 (submodule: avoid auto-discovery in
prepare_submodule_repo_env(), 2016-09-01), which sounds
appealing (and brings real benefits for the working directory),
but I have not thought about this protection for the git dir.

Maybe another approach is to not set the cwd for the child process
and instead point GIT_DIR_ENVIRONMENT only to the right
directory.

Then the use of GIT_DIR_ENVIRONMENT is obvious and
is not just for protection of corner cases.

However I think this protection is really valuable for the
.git dir as well as the submodule may be broken and we do not
want to end up in an infinite loop (as the discovery would find
the superproject which then tries to recurse, again, into the
submodule with the broken git dir)

When adding the comment here, we'd also want to have
the comment in prepare_submodule_repo_env, which
could be its own preparation commit.

> > +static struct repository *get_submodule_repo_for(struct repository *r,
> > +                                              const struct submodule *sub)
> > +{
> > +     struct repository *ret = xmalloc(sizeof(*ret));
> > +
> > +     if (repo_submodule_init(ret, r, sub)) {
> > +             /*
> > +              * No entry in .gitmodules? Technically not a submodule,
> > +              * but historically we supported repositories that happen to be
> > +              * in-place where a gitlink is. Keep supporting them.
> > +              */
> > +             struct strbuf gitdir = STRBUF_INIT;
> > +             strbuf_repo_worktree_path(&gitdir, r, "%s/.git", sub->path);
> > +             if (repo_init(ret, gitdir.buf, NULL)) {
> > +                     strbuf_release(&gitdir);
> > +                     return NULL;
> > +             }
> > +             strbuf_release(&gitdir);
> > +     }
> > +
> > +     return ret;
> > +}
>
> This is the significant thing that this patch does more - an unskipped
> submodule is now something that either passes the checks in
> repo_submodule_init() or the checks in repo_init(), which seems to be
> stricter than the current check that ".git" points to a directory or is
> one. This means that we skip certain broken repositories, and this
> necessitates a change in the test.

I see. However there is no change in function, the check in repo_init
(or repo_submodule_init) is less strict than the check in the child process.
So if there are broken submodule repositories, the difference of this
patch is the layer at which it is caught, i.e. we would not spawn a child
that fails, but skip the submodule.

Thinking of that, maybe we need to announce that in get_next_submodule

>
> I think we should be more particular about what we're allowed to skip -
> in particular, maybe if we're planning to skip this submodule, its
> corresponding directory in the worktree (if one exists) needs to be
> empty.

If the working tree directory is empty for that submodule, it means
it is likely not initialized. But why would we use that as a signal to
skip the submodule?



> > -                     cp->dir = strbuf_detach(&submodule_path, NULL);
> > -                     prepare_submodule_repo_env(&cp->env_array);
> > +                     prepare_submodule_repo_env_in_gitdir(&cp->env_array);
> > +                     cp->dir = xstrdup(repo->gitdir);
>
> Here is where the functionality change (fetch in ".git") described in
> the commit message occurs.

True.

Thanks for the review, I'll try to split up this commit a bit more and
explain each part on its own.

  reply index

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-16 18:13 [PATCH 0/9] Resending sb/submodule-recursive-fetch-gets-the-tip Stefan Beller
2018-10-16 18:13 ` [PATCH 1/9] sha1-array: provide oid_array_filter Stefan Beller
2018-10-16 18:13 ` [PATCH 2/9] submodule.c: fix indentation Stefan Beller
2018-10-16 18:13 ` [PATCH 3/9] submodule.c: sort changed_submodule_names before searching it Stefan Beller
2018-10-17 21:21   ` Jonathan Tan
2018-10-16 18:13 ` [PATCH 4/9] submodule.c: move global changed_submodule_names into fetch submodule struct Stefan Beller
2018-10-17 21:26   ` Jonathan Tan
2018-10-18 19:09     ` Stefan Beller
2018-10-16 18:13 ` [PATCH 5/9] submodule.c: do not copy around submodule list Stefan Beller
2018-10-17 21:45   ` Jonathan Tan
2018-10-18  2:35     ` Junio C Hamano
2018-10-16 18:13 ` [PATCH 6/9] repository: repo_submodule_init to take a submodule struct Stefan Beller
2018-10-17 21:55   ` Jonathan Tan
2018-10-16 18:13 ` [PATCH 7/9] submodule: fetch in submodules git directory instead of in worktree Stefan Beller
2018-10-17 22:58   ` Jonathan Tan
2018-10-23 18:26     ` Stefan Beller [this message]
2018-10-23 22:55       ` Jonathan Tan
2018-10-23 23:01         ` Stefan Beller
2018-10-16 18:13 ` [PATCH 8/9] fetch: retry fetching submodules if needed objects were not fetched Stefan Beller
2018-10-18  0:39   ` Jonathan Tan
2018-10-23 22:37     ` Stefan Beller
2018-10-23 23:37       ` Jonathan Tan
2018-10-25 21:42         ` Stefan Beller
2018-10-16 18:13 ` [PATCH 9/9] builtin/fetch: check for submodule updates for non branch fetches Stefan Beller
2018-10-18  0:47   ` Jonathan Tan
2018-10-18  2:30 ` [PATCH 0/9] Resending sb/submodule-recursive-fetch-gets-the-tip Junio C Hamano
2018-10-18  7:30 ` Junio C Hamano
2018-10-18 18:00   ` Stefan Beller
  -- strict thread matches above, loose matches on Subject: below --
2018-09-11 23:49 [PATCH 0/9] fetch: make sure submodule oids are fetched Stefan Beller
2018-09-11 23:49 ` [PATCH 7/9] submodule: fetch in submodules git directory instead of in worktree Stefan Beller
2018-09-12 18:36   ` Junio C Hamano
2018-09-13 19:29     ` Stefan Beller
2018-09-17 21:35 ` [PATCHv2 0/9] fetch: make sure submodule oids are fetched Stefan Beller
2018-09-17 21:35   ` [PATCH 7/9] submodule: fetch in submodules git directory instead of in worktree Stefan Beller

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGZ79kbunyc02802+4sjGwthBnj_=eS=+z5B_WQSRdbUAHw1tQ@mail.gmail.com' \
    --to=sbeller@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox