git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: Yaroslav Halchenko <yoh@onerussian.com>
To: Stefan Beller <sbeller@google.com>
Cc: Prathamesh Chavan <pc44800@gmail.com>, "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: in case you want a use-case with lots of submodules
Date: Mon, 19 Jun 2017 16:20:21 -0400
Message-ID: <20170619202021.dmomy5ztwoeat3eg@hopa.kiewit.dartmouth.edu> (raw)
In-Reply-To: <CAGZ79kZhj31eBYnboyxDLuFp1ceeqk8kj0nrnQaCmpRJCVFU4w@mail.gmail.com>


On Mon, 19 Jun 2017, Stefan Beller wrote:

> On Mon, Jun 19, 2017 at 8:59 AM, Yaroslav Halchenko <yoh@onerussian.com> wrote:
> > Hi All,

> > On a recent trip I've listened to the git minutes podcast episode and
> > got excited to hear  Stefan Beller (CCed just in case) describing
> > ongoing work on submodules mechanism.  I got excited, since e.g.
> > performance improvements would be of great benefit to us too.

> If you're mostly interested in performance improvements of the status
> quo (i.e. "make git-submodule fast"), then the work of Prathamesh
> Chavan (cc'd) might be more interesting to you than what I do.
> He is porting git-submodule (which is mostly a shell script nowadays)
> to C, such that we can save a lot of process invocations and can do
> processing within one process.

ah -- cool.  I would be eager to test it out, thanks!  would be
interesting to see if it positively affects our overall performance.
Pointers to that development would be welcome!

> > http://datasets.datalad.org ATM provides quite a sizeable (ATM 370
> > repositories, up to 4 levels deep) hierarchy of git/git-annex
> > repositories all tied together via git submodules mechanism.  And as the
> > collection grows, interactions with it become slower, so additional
> > options (such as --ignore-submodules=dirty  to status) become our
> > friends.

> I am not as much concerned about the 370 number than about the
> 4 layers of nesting. In my experience the nested submodule case
> is a little bit error prone and the bug reports are not as frequent as
> there are not as many users of nesting, yet(?)

well -- part of the story here is that we are forced to use/have full
blown .git/ directories (for git-annex symlinks to content files to
work) within submodules instead of .git file with a reference under
parent's .git/modules.   So we can 'slice' at any level and I
guess that is why may be avoiding some possibly issues due to nesting
and the "parent has all .git/modules" approach.

> In a neighboring thread on the mailing list we have a discussion
> on the usefulness of being on branches than in detached HEAD
> in the submodules.
> https://public-inbox.org/git/0092CDD27C5F9D418B0F3E9B5D05BE08010287DF@SBS2011.opfingen.plc2.de/

> This would not break non-ambiguously, rather it would add
> ease of use.

that is indeed a common caveat... I am not sure if any heuristic
approach would provide a 'bullet proof' solution.  I might even prefer a
hardcoded 'branch-name' to be listed/associated with each submodule
within .gitmodules.  In the datalad case, detached HEAD is common
whenever someone installs "outdated" (branch of which progressed
forward) submodule.  In this case we just check if the branch after "git
clone"  (but before git submodule update) includes the pointed by
Subproject commit, and if so -- we announce that it must be the branch
(so far it is always "master" branch anyways ;) )

> > So I thought to share this as a use-case happen you need more
> > motivation or just a real-case test-bed for your work.  And thank
> > you again for making Git even Greater.

> Thanks for the motivation. :)

the least I could do ;)

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

  reply index

Thread overview: 4+ messages in thread (expand / mbox.gz / Atom feed / [top])
2017-06-19 15:59 Yaroslav Halchenko
2017-06-19 19:30 ` Stefan Beller
2017-06-19 20:20   ` Yaroslav Halchenko [this message]
2017-06-20  5:43     ` Stefan Beller

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply to all the recipients using the --to, --cc,
  and --in-reply-to switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170619202021.dmomy5ztwoeat3eg@hopa.kiewit.dartmouth.edu \
    --to=yoh@onerussian.com \
    --cc=git@vger.kernel.org \
    --cc=pc44800@gmail.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox