git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [GSoC][RFC] proposal: convert git-submodule to builtin script
@ 2019-04-02 20:32 Khalid Ali
  2019-04-03 18:48 ` Johannes Schindelin
  2019-04-04  8:30 ` Christian Couder
  0 siblings, 2 replies; 4+ messages in thread
From: Khalid Ali @ 2019-04-02 20:32 UTC (permalink / raw)
  To: git

Hi,

My name is Khalid Ali and I am looking to convert the git-submodule to
a builtin C script. The link below contains my first proposal draft
[1] and my microproject is at [2]. My main concern is that my second
task is not verbose enough. I am not sure if I should add a specific
breakdown of large items within the submodule command.

Outside of the draft, I was wondering whether this should be
implemented through multiple patches to the master branch or through a
separate, long-running feature branch that will be merged at the end
of the GSoC timeline?

Feedback is greatly appreciated!

[1] https://docs.google.com/document/d/1olGG8eJxFoMNyGt-4uMiTD3LjRYx15pttg67AJYliu8/edit?usp=sharing
[2] https://public-inbox.org/git/20190402014115.22478-1-khalludi123@gmail.com/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [GSoC][RFC] proposal: convert git-submodule to builtin script
  2019-04-02 20:32 [GSoC][RFC] proposal: convert git-submodule to builtin script Khalid Ali
@ 2019-04-03 18:48 ` Johannes Schindelin
  2019-04-03 21:07   ` Khalid Ali
  2019-04-04  8:30 ` Christian Couder
  1 sibling, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2019-04-03 18:48 UTC (permalink / raw)
  To: Khalid Ali; +Cc: git

Hi,

On Tue, 2 Apr 2019, Khalid Ali wrote:

> My name is Khalid Ali and I am looking to convert the git-submodule to
> a builtin C script. The link below contains my first proposal draft
> [1] and my microproject is at [2]. My main concern is that my second
> task is not verbose enough. I am not sure if I should add a specific
> breakdown of large items within the submodule command.

Nice!

Please note that while I used to be the mentor who basically helped all of
the GSoC/Outreachy students through their "convert to built-in" projects
in the recent years, I am not available to mentor this year.

Having said that, I think I can help you to improve your proposal.

When you talk about "Convert each main task in git-submodule into a C
function." and "If certain functionality is missing, add it to the correct
script.", it is a good idea to back that up by concrete examples.

Like, study `git-submodule.sh` and extract the list of "main tasks", and
then mention that in your proposal. I see that you listed 9 main tasks,
but it is not immediately clear whether you extracted that list from the
usage text, from the manual page, or from the script itself. If the latter
(which I think would be the best, given the goal of converting the code in
that script), it would make a ton of sense to mention the function names
and maybe add a permalink to the corresponding code (you could use e.g.
GitHub's permalinks).

And then look at one of those main tasks, think of something that you
believe should be covered in the test suite, describe it, then figure out
whether it is already covered. If it is, mention that, together with the
location, otherwise state which script would be the best location, and
why.

Further, I would like to caution you about "If there is still some
time"... The `git-submodule.sh` script weighs in with just over 1,000
lines. We had three GSoC projects to convert scripts last year, and they
converted scripts' weights (at the time) were 750 lines for
`git-stash.sh`, 674 lines for `git-rebase.sh` and 1,036 lines for
`git-rebase--interactive.sh`, respectively. That last number should be
taken with a big grain of salt, as is not quite the number of lines that
were converted: as part of the GSoC project, the
`git-rebase--preserve-merges.sh` script was split out, never intended to
be converted, but to be deprecated instead (in favor of `git rebase -r`),
and there were "only" some 283 lines to be converted to C remaining after
that.

Out of those three, the project converting the smallest number of lines
clearly got integrated first (and there was actually time to do more stuff
in that project, and those things are partially still being cooked). The
converted `git stash` is still not in `master`...

So... converting 1,000 lines of code is quite a challenge for 3 months.

Having said that, I would not consider your project a failure if even
"only" as much as half of the lines of code were converted to C.

Besides, if you care to have a bit of a deeper look into the
`git-submodule.sh` script, you will see a peculiar pattern in some of the
subcommands, e.g. in `cmd_foreach`:
https://github.com/git/git/blob/v2.21.0/git-submodule.sh#L320-L349

Essentially, it spends two handfuls of lines on option parsing, and then
the real business logic is performed by the `submodule--helper`, which is
*already* a built-in.

Even better: most of that business logic is implemented in a file that has
the very file name you proposed already: `submodule.c`.

So if I were you, I would add a section to your proposal (which in the end
would no doubt dwarf the existing sections) that has as subsections each
of those commands in `git-submodule.sh` that do *not* yet follow this
pattern "parse options then hand off to submodule--helper".

I would then study the commit history of the ones that *do* use the
`submodule--helper` to see how they were converted, what conventions were
used, whether there were recurring patterns, etc.

In each of those subsections, I would then discuss what the
still-to-be-converted commands do, try to find the closest command that
already uses the `submodule--helper`, and then assess what it would take
to convert them, how much code it would probably need, whether it could
reuse parts that are already in `submodule.c`, etc.

> Outside of the draft, I was wondering whether this should be
> implemented through multiple patches to the master branch or through a
> separate, long-running feature branch that will be merged at the end
> of the GSoC timeline?

Judging from past projects to convert scripts to C, I would say that the
most successful strategy was to chomp off manageable parts and move them
from the script to C. I am sure that you will find tons of good examples
for this strategy by looking at the commit history of `git-submodule.sh`
and then searching for the corresponding patches in the Git mailing list
archive (e.g. https://public-inbox.org/git/).

Do not expect those "chomped off" parts to hit `master` very quickly,
though. Most likely, you would work on one patch series (very closely with
your mentor at first, to avoid unnecessary blocks and to get a better feel
for the way the Git community works right from the start), then, when that
patch series is robust and solid and ready to be contributed, you would
send it to the Git mailing list and immediately start working on the next
patch series, all the while the reviews will trickle in. Those reviews
will help you to improve the patch series, and it is a good idea to
incorporate the good suggestions, and to discuss the ones you think are
not necessary, for a few days before sending the next patch series
iteration.

Essentially, you will work in parallel on a few patch series at all times.
Those patch series stack on top of each other, and they should, one after
the other, make it into `pu` first, then, when they are considered ready
for testing into `next`, and eventually to `master`. Whenever you
contribute a new patch series iteration, you then rebase the remaining
patch series on top. Ideally it will look a bit like a fern, with the
first patch series being along the farthest, and each subsequent patch
series at an earlier stage than its predecessor.

Phew. Long mail. Hopefully this amount of information does not scare you.
And maybe some of it will help you with the proposal and/or the project.

Ciao,
Johannes

> Feedback is greatly appreciated!
>
> [1] https://docs.google.com/document/d/1olGG8eJxFoMNyGt-4uMiTD3LjRYx15pttg67AJYliu8/edit?usp=sharing
> [2] https://public-inbox.org/git/20190402014115.22478-1-khalludi123@gmail.com/
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [GSoC][RFC] proposal: convert git-submodule to builtin script
  2019-04-03 18:48 ` Johannes Schindelin
@ 2019-04-03 21:07   ` Khalid Ali
  0 siblings, 0 replies; 4+ messages in thread
From: Khalid Ali @ 2019-04-03 21:07 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

First of all, thank you so much for the detailed feedback. I wasn't sure
how much to include in the proposal, but I see it still needs a lot of work.

> When you talk about "Convert each main task in git-submodule into a C
> function." and "If certain functionality is missing, add it to the correct
> script.", it is a good idea to back that up by concrete examples.
>
> Like, study `git-submodule.sh` and extract the list of "main tasks", and
> then mention that in your proposal. I see that you listed 9 main tasks,
> but it is not immediately clear whether you extracted that list from the
> usage text, from the manual page, or from the script itself. If the latter
> (which I think would be the best, given the goal of converting the code in
> that script), it would make a ton of sense to mention the function names
> and maybe add a permalink to the corresponding code (you could use e.g.
> GitHub's permalinks).

Yes, I actually did extract the tasks straight from git-submodule.sh. I will
definitely add the appropriate function names and permalinks to the
proposal.

> And then look at one of those main tasks, think of something that you
> believe should be covered in the test suite, describe it, then figure out
> whether it is already covered. If it is, mention that, together with the
> location, otherwise state which script would be the best location, and
> why.

Ah, alright. I'll have a look at the test suite to see what is covered and
include a section in my proposal.

> Besides, if you care to have a bit of a deeper look into the
> `git-submodule.sh` script, you will see a peculiar pattern in some of the
> subcommands, e.g. in `cmd_foreach`:
> https://github.com/git/git/blob/v2.21.0/git-submodule.sh#L320-L349
>
> Essentially, it spends two handfuls of lines on option parsing, and then
> the real business logic is performed by the `submodule--helper`, which is
> *already* a built-in.
>
> Even better: most of that business logic is implemented in a file that has
> the very file name you proposed already: `submodule.c`.
>
> So if I were you, I would add a section to your proposal (which in the end
> would no doubt dwarf the existing sections) that has as subsections each
> of those commands in `git-submodule.sh` that do *not* yet follow this
> pattern "parse options then hand off to submodule--helper".
>
> I would then study the commit history of the ones that *do* use the
> `submodule--helper` to see how they were converted, what conventions were
> used, whether there were recurring patterns, etc.
>
> In each of those subsections, I would then discuss what the
> still-to-be-converted commands do, try to find the closest command that
> already uses the `submodule--helper`, and then assess what it would take
> to convert them, how much code it would probably need, whether it could
> reuse parts that are already in `submodule.c`, etc.

I definitely noticed the option parsing in multiple parts of the function, but
the pattern didn't click until you mentioned it. I'll do as you recommended
and take a look at submodule.c to see how the code and functionality in
git-submodule.sh can be merged.

> Judging from past projects to convert scripts to C, I would say that the
> most successful strategy was to chomp off manageable parts and move them
> from the script to C. I am sure that you will find tons of good examples
> for this strategy by looking at the commit history of `git-submodule.sh`
> and then searching for the corresponding patches in the Git mailing list
> archive (e.g. https://public-inbox.org/git/).
>
> Do not expect those "chomped off" parts to hit `master` very quickly,
> though. Most likely, you would work on one patch series (very closely with
> your mentor at first, to avoid unnecessary blocks and to get a better feel
> for the way the Git community works right from the start), then, when that
> patch series is robust and solid and ready to be contributed, you would
> send it to the Git mailing list and immediately start working on the next
> patch series, all the while the reviews will trickle in. Those reviews
> will help you to improve the patch series, and it is a good idea to
> incorporate the good suggestions, and to discuss the ones you think are
> not necessary, for a few days before sending the next patch series
> iteration.
>
> Essentially, you will work in parallel on a few patch series at all times.
> Those patch series stack on top of each other, and they should, one after
> the other, make it into `pu` first, then, when they are considered ready
> for testing into `next`, and eventually to `master`. Whenever you
> contribute a new patch series iteration, you then rebase the remaining
> patch series on top. Ideally it will look a bit like a fern, with the
> first patch series being along the farthest, and each subsequent patch
> series at an earlier stage than its predecessor.

Ok, so I'd be doing each of the portions and submitting them to the mailing
list as I finish to let other coders take a look and give feedback.

> Phew. Long mail. Hopefully this amount of information does not scare you.
> And maybe some of it will help you with the proposal and/or the project.

Once again, thank you for the detailed feedback. This really gave me a good
idea of the project as a whole and what I need to include in my proposal.

Sincerely,

-Khalid

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [GSoC][RFC] proposal: convert git-submodule to builtin script
  2019-04-02 20:32 [GSoC][RFC] proposal: convert git-submodule to builtin script Khalid Ali
  2019-04-03 18:48 ` Johannes Schindelin
@ 2019-04-04  8:30 ` Christian Couder
  1 sibling, 0 replies; 4+ messages in thread
From: Christian Couder @ 2019-04-04  8:30 UTC (permalink / raw)
  To: Khalid Ali; +Cc: git

Hi,

On Tue, Apr 2, 2019 at 10:34 PM Khalid Ali <khalludi123@gmail.com> wrote:
>
> My name is Khalid Ali and I am looking to convert the git-submodule to
> a builtin C script. The link below contains my first proposal draft
> [1] and my microproject is at [2]. My main concern is that my second
> task is not verbose enough. I am not sure if I should add a specific
> breakdown of large items within the submodule command.

There was a GSoC project about the same subject a few years ago:

https://public-inbox.org/git/CAME+mvXtA6iZNfErTX5tYB-o-5xa1yesAG5h=iP_Z2_zL_kOnQ@mail.gmail.com/

I think you should take a look at the work that was done (merged and
not merged) and report about it in your proposal.

Thanks,
Christian.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-04-04  8:30 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-02 20:32 [GSoC][RFC] proposal: convert git-submodule to builtin script Khalid Ali
2019-04-03 18:48 ` Johannes Schindelin
2019-04-03 21:07   ` Khalid Ali
2019-04-04  8:30 ` Christian Couder

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).