git@vger.kernel.org mailing list mirror (one of many)
* [GSoC] Discussion of "Submodule related work" project
@ 2017-03-10 11:27 Valery Tolstov
  2017-03-10 18:47 ` Stefan Beller
  0 siblings, 1 reply; 7+ messages in thread
From: Valery Tolstov @ 2017-03-10 11:27 UTC
  To: Stefan Beller; +Cc: christian.couder, Brandon Williams, git

I have some questions about the "Submodule related work" project.

First of all, I would like to add this task to the project, if I take it:
https://public-inbox.org/git/1488913150.8812.0@smtp.yandex.ru/T/
What do you think about this task?

> Cleanup our test suite. Do not use a repo itself as a submodule for itself

I am not quite familiar with submodules yet. Why is this (i.e. using a
repo as a submodule for itself) considered ineligible?

> (Advanced datastructure knowledge required?) Protect submodule from gc-ing
> interesting HEADS.

Can you provide a small example that shows the problem, please?
And why is advanced datastructure knowledge expected?

Maybe you have something else to say about this project.

Thanks,
  Valery Tolstov




* Re: [GSoC] Discussion of "Submodule related work" project
  2017-03-10 11:27 [GSoC] Discussion of "Submodule related work" project Valery Tolstov
@ 2017-03-10 18:47 ` Stefan Beller
  2017-03-10 19:30   ` Valery Tolstov
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Beller @ 2017-03-10 18:47 UTC
  To: Valery Tolstov; +Cc: Christian Couder, Brandon Williams, git@vger.kernel.org

On Fri, Mar 10, 2017 at 3:27 AM, Valery Tolstov <me@vtolstov.org> wrote:
> I have some questions about the "Submodule related work" project.
>
> First of all, I would like to add this task to the project, if I take it:
> https://public-inbox.org/git/1488913150.8812.0@smtp.yandex.ru/T/
> What do you think about this task?

That is a nice project, though my gut feeling is that it is too small
for a GSoC project on its own.

>> Cleanup our test suite. Do not use a repo itself as a submodule for itself
>
> I am not quite familiar with submodules yet. Why is this (i.e. using a
> repo as a submodule for itself) considered ineligible?

(a bit of background on submodules)

man gitglossary (then searching for submodule):
       submodule
           A repository that holds the history of a separate project inside
           another repository (the latter of which is called superproject).

       superproject
           A repository that references repositories of other projects in its
           working tree as submodules. The superproject knows about the names
           of (but does not hold copies of) commit objects of the contained
           submodules.

An example that I just found on GitHub [1]: it is a game
(so it includes graphics, game code, etc.), but it makes use of a library [2]
that could be used by different projects.

[1] https://github.com/stephank/orona
[2] https://github.com/stephank/villain

Now why would a repo be ineligible to use itself as a submodule?
There is nothing wrong with it *technically* (which is why we do such things
in the test suite.)

But what are the use cases for it? Why would a project bind itself
as a submodule? (You can get the same effect of having the source code
by just checking out that other version.) Well, now that I think about it,
it may be useful if you want to test old versions of yourself, e.g. for
networking compatibility. But for that you'd probably still not use submodules.

So the use case of using submodules for another copy of itself is
*very rare*, if it exists at all out there. And the Git test suite
should test the common case first rather than these weird corner cases.

I thought this project would have been solved partially already, but I was wrong
(see $ git grep "submodule add \./\."). This also doesn't seem large enough for
a summer project, after thinking about it further.

>> (Advanced datastructure knowledge required?) Protect submodule from gc-ing
>> interesting HEADS.
>
> Can you provide a small example that shows the problem, please?

Let's use this example from above:

$ git clone --recursive https://github.com/stephank/orona
    # now we have 2 repositories, the orona repo as well as its submodule
    # at node_modules/villain
    #
    # "Let's inspect the Readmes/license files, if they are ok to use
    # Oh! the submodule is MIT licensed but doesn't have the full
    # license text, I can contribute and make a patch for it."
$ cd node_modules/villain
$ $EDITOR LICENSE
$ git add LICENSE
$ git commit -a -m "add license full text"
$
$ cd ../.. # go back to the superproject
$ git add node_modules/villain
$ git commit -a -m "update game to include latest lib"
$ git checkout -b "fix_license"
    # note how I forget to actually push it / pull request it!
    # All we need for the demonstration is a local commit
    # in the submodule that is referenced by the superproject...
    #
    # ... "Let's test the pristine copy of the game!" ...
$ git checkout origin/master
$ git submodule update
    # ... which gets lost here. The submodule commit
    # is only referenced by a superproject commit.

.. time passes ..

    # "My disk is so full, maybe I can clean up all these random
    # accumulated projects, to have more disk space again."
    # my cleanup script may do this:

$ cd node_modules/villain
$ git reflog expire --all --expire=all
$ git gc --prune=all
$ cd ../..

$ git branch
    # "Oh, what about this 'fix_license' branch?
    #  Did I actually send that upstream?"
$ git checkout fix_license
$ git submodule update
error: no such remote ref 96016818b2ed9eb0ca72552b18f8339fc20850b4
Fetched in submodule path 'villain', but it did not contain
96016818b2ed9eb0ca72552b18f8339fc20850b4. Direct fetching of that
commit failed.

> And why is advanced datastructure knowledge expected?

I am not quite sure how to approach this problem, so I put
a "warning; it may be complicated" sticker on it. ;)

The problem is that a submodule until now was considered
its own repository, in full control of what to keep and delete,
how to name its branches and so on.

git-gc only pays attention to commits (and their history) reachable from
any branch, plus commits mentioned in the reflog
(which is why we had to delete the reflog; as we
were making the license commit on a "detached HEAD",
there was no branch that needed deleting).

However it should also consider all commits referenced
by the superproject valuable.

In this case the superproject has a branch "fix_license",
so the superproject commit is safe from gc in the
superproject. But the submodule commit it points to is
referenced only by that superproject commit, and the
gc operation in the submodule doesn't know about it.

One way to fix it is to figure out if there is a superproject
at gc time and then collect all valuable hashes (submodule
pointers) before actually performing the gc.
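
A minimal sketch of that collection step, run from the superproject's
top-level directory. The function names are made up for illustration and
nothing here is existing Git functionality beyond `git rev-list` and
`git ls-tree`; it relies on submodule pointers showing up in `git ls-tree`
output as entries with mode 160000 and type "commit":

```shell
# Sketch only: filter_gitlinks and collect_submodule_pins are
# hypothetical helper names, not part of Git.

# Keep only gitlink entries (mode 160000) from `git ls-tree -r` output
# and print the submodule commit each entry points at.
filter_gitlinks() {
    awk '$1 == "160000" && $2 == "commit" { print $3 }'
}

# Walk every commit reachable from any ref of the superproject in the
# current directory and print each referenced submodule commit once.
collect_submodule_pins() {
    git rev-list --all |
    while read -r commit; do
        git ls-tree -r "$commit"
    done | filter_gitlinks | sort -u
}
```

Calling `collect_submodule_pins` lists the pointers of all submodules
together; a real implementation would have to separate them per submodule
path. It also reads the full tree of every commit.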

But that may be expensive, so we would rather record
it on the fly, e.g. when making the commit in the superproject
we'd record in the submodule that the given hash by the
submodule pointer is valuable.

This could be done by having a ref (=branch) in the submodule
that points at all the interesting submodule commits.
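
As a rough sketch of the recording idea: the ref namespace and the function
names below are assumptions for illustration, Git defines nothing like
them. Since a single ref names exactly one object, this sketch creates one
ref per interesting commit rather than one ref for all of them:

```shell
# Hypothetical namespace; Git itself does not define these refs.
PIN_NAMESPACE=refs/superproject-pins

# Name of the ref that keeps one submodule commit reachable.
pin_ref_for() {
    printf '%s/%s\n' "$PIN_NAMESPACE" "$1"
}

# Record, inside the submodule at path $1, that commit $2 is referenced
# by the superproject. gc treats every ref as a starting point for
# reachability, so the commit survives as long as this ref exists.
record_pin() {
    git -C "$1" update-ref "$(pin_ref_for "$2")" "$2"
}
```

In the example above this could be called from the superproject as
`record_pin node_modules/villain 96016818b2ed9eb0ca72552b18f8339fc20850b4`
right after committing the new submodule pointer.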

So despite being prominent on the ideas page (because of a lot
of text), how to actually solve this may be controversial.

>
> Maybe you have something else to say about this project.

If I remember correctly, shell -> C conversion projects are
easy (both for writing the code and for mentoring).

> git archive(/bundle) to have a --recurse-submodules flag
> to include the submodule contents.

is actually an interesting project as well, despite its short description.

Thanks,
Stefan


* Re: [GSoC] Discussion of "Submodule related work" project
  2017-03-10 18:47 ` Stefan Beller
@ 2017-03-10 19:30   ` Valery Tolstov
  2017-03-10 21:13     ` Valery Tolstov
  0 siblings, 1 reply; 7+ messages in thread
From: Valery Tolstov @ 2017-03-10 19:30 UTC
  To: sbeller; +Cc: git, christian.couder, bmwill

So... I thought those items listed in "Submodule related work" are
considered too small to be complete projects separately, and that they
are just "subprojects" of a bigger project (maybe I have this thought
because I can't estimate complexity before truly digging in).
In your response you talk about them as independent projects...
Does this mean I can take any one of them as a starting point for
my proposal? Or maybe I misunderstood you?


Thanks,
  Valery Tolstov


* Re: [GSoC] Discussion of "Submodule related work" project
  2017-03-10 19:30   ` Valery Tolstov
@ 2017-03-10 21:13     ` Valery Tolstov
  2017-03-13 18:24       ` Stefan Beller
  0 siblings, 1 reply; 7+ messages in thread
From: Valery Tolstov @ 2017-03-10 21:13 UTC
  To: sbeller; +Cc: git, christian.couder, bmwill

> Does this mean I can take any one of them as a starting point for
> my proposal?

If that is true, then I'll try to take the sh->C transition for the submodule
command, and as an additional part of my whole project also this:
https://public-inbox.org/git/1488913150.8812.0@smtp.yandex.ru/T/

>> I have some questions about the "Submodule related work" project.
>>
>> First of all, I would like to add this task to the project, if I take it:
>> https://public-inbox.org/git/1488913150.8812.0@smtp.yandex.ru/T/
>> What do you think about this task?
>
> That is a nice project, though my gut feeling is that it is too small
> for a GSoC project on its own.

Does it sound good? If it does, then I'll begin to work on my proposal.

Thanks,
  Valery Tolstov


* Re: [GSoC] Discussion of "Submodule related work" project
  2017-03-10 21:13     ` Valery Tolstov
@ 2017-03-13 18:24       ` Stefan Beller
  2017-03-15 20:43         ` Valery Tolstov
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Beller @ 2017-03-13 18:24 UTC
  To: Valery Tolstov; +Cc: git@vger.kernel.org, Christian Couder, Brandon Williams

> So... I thought those items listed in "Submodule related work" are
> considered too small to be complete projects separately, and that they
> are just "subprojects" of a bigger project (maybe I have this thought
> because I can't estimate complexity before truly digging in).

When writing these points, I was not sure about the complexity
myself, but rather I wanted to produce a lot of different potential
projects, which can be discussed if they sound exciting and are
of good size.

> In your response you talk about them as independent projects...
> Does this mean I can take any one of them as a starting point for
> my proposal? Or maybe I misunderstood you?

Well I think some of them are too small to stand alone for a full GSoC
project. Others have a good size and complexity for GSoC already.

> If that is true, then I'll try to take the sh->C transition for the submodule
> command,

For shell -> C transitions, see
65e1449614d
b7d2a15b9f5
307de75c4
dec034a34e
as all of them are rewrites from sh -> C for different commands.
You might find common patterns (e.g. everything that a conversion needs,
such as slight updates to tests or documentation; certainly updating
the build process in the Makefile, and of course the translated code).

Most of these conversions start out with a patch that is quite a literal
translation, and optimizations are added afterwards.

Another approach for the conversion is outlined in 73c2779f42
(builtin-am: implement skeletal builtin am). I am not sure how
easy this approach is for a submodule-specific command.

> and as an additional part of my whole project also this:
> https://public-inbox.org/git/1488913150.8812.0@smtp.yandex.ru/T/

Yeah, that is also a good task :) Thanks for bringing it up.

> Does it sound good? If it does, then I'll begin to work on my proposal.

Sure.

Thanks,
Stefan


* Re: [GSoC] Discussion of "Submodule related work" project
  2017-03-13 18:24       ` Stefan Beller
@ 2017-03-15 20:43         ` Valery Tolstov
  2017-03-15 20:48           ` Stefan Beller
  0 siblings, 1 reply; 7+ messages in thread
From: Valery Tolstov @ 2017-03-15 20:43 UTC
  To: sbeller; +Cc: bmwill, christian.couder, git, me

I have a thought. Once the submodule command has been translated
to C, we may want to rename submodule--helper.c, and maybe hide
some of its functions from the subcommands list. Are there any examples of
a similar situation that has happened before?
I am not quite sure about this.

Thanks,
  Valery Tolstov


* Re: [GSoC] Discussion of "Submodule related work" project
  2017-03-15 20:43         ` Valery Tolstov
@ 2017-03-15 20:48           ` Stefan Beller
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Beller @ 2017-03-15 20:48 UTC
  To: Valery Tolstov; +Cc: Brandon Williams, Christian Couder, git@vger.kernel.org

On Wed, Mar 15, 2017 at 1:43 PM, Valery Tolstov <me@vtolstov.org> wrote:
> I have a thought. Once the submodule command has been translated
> to C, we may want to rename submodule--helper.c, and maybe hide
> some of its functions from the subcommands list. Are there any examples of
> a similar situation that has happened before?

I would keep the submodule--helper around as it is
and maybe even promote it to be a "plumbing" command.
Currently it is just an undocumented, internal command.

Once all commands in git-submodule.sh are translated to C
(when they all look similar to "git submodule init", which
is just a relay in the shell script), then we can make a full-grown
builtin/submodule.c that calls the functions of the submodule--helper
directly in C.
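
To illustrate what "just a relay" means here, a simplified sketch of such a
shell function. This is not the actual git-submodule.sh source, which
passes more options than shown; only the shape matters:

```shell
# Simplified sketch of a relay subcommand: the shell function carries
# no logic of its own and hands every argument to the C helper.
cmd_init() {
    git submodule--helper init "$@"
}
```

Once every cmd_* function in the script looks like this, the shell layer
only dispatches, which is what makes replacing it with a C builtin
straightforward.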

Stefan

