From: Atharva Raykar <raykar.ath@gmail.com>
To: git@vger.kernel.org
Cc: christian.couder@gmail.com, shouryashukla.oo@gmail.com,
periperidip@gmail.com
Subject: [GSoC][Draft Proposal] Finish converting git submodule to builtin
Date: Sat, 3 Apr 2021 19:38:43 +0530
Message-ID: <E6E88000-9C18-4035-9A14-8B406617351A@gmail.com>
Hi all,
Below is my draft of my GSoC proposal. I have noticed that Chinmoy has already
submitted a proposal for the same idea before me, so would that be considered
"taken"? (I don't think I can submit another proposal for the other idea either,
because someone has already sent one for that as well.)
Since I have already put effort into this for a while, I thought I might as
well send it, but I'll accept whatever the mentors say about the eligibility of
this proposal.
Here is a prettier markdown version:
https://gist.github.com/tfidfwastaken/0c6ca9ef2a452f110a416351541e0f19
--8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<--
___________________
GSOC GIT PROPOSAL
Atharva Raykar
___________________
Table of Contents
_________________
1. Personal Details
2. Background
3. Me and Git
.. 1. Current knowledge of git
4. The Project: Finish converting `git submodule' to builtin
5. Prior work
6. General implementation strategy
7. Timeline (using the format dd/mm)
8. Beyond GSoC
9. Blogging
10. Final Remarks: A little more about me
1 Personal Details
==================
Name : Atharva Raykar
Major : Computer Science and Engineering
Email : raykar.ath@gmail.com
IRC nick : atharvaraykar on #git and #git-devel
Address : RB 103, Purva Riviera, Marathahalli, Bangalore
Postal Code : 560037
Time Zone : IST (UTC+5:30)
GitHub : http://github.com/tfidfwastaken
2 Background
============
I am Atharva Raykar, currently in my third year of studying Computer
Science and Engineering at PES University, Bangalore. I have
enjoyed programming since a young age, but my deep appreciation for
good program design and creating the right abstractions came during my
exploration of the various rabbit holes of knowledge originating from
communities around the internet. I have personally enjoyed learning
about Functional Programming, Database Architecture and Operating
Systems, and my interests keep expanding as I explore more in this
field.
I owe my appreciation of this rich field to these communities, and I
always wanted to give back. To that end, I restarted the [PES Open
Source] community on our campus, with the goal of creating spaces
where members could share knowledge, much in the same spirit as the
communities that kickstarted my journey in Computer Science. I learnt
a lot about collaborating in the open, maintainership, and reviewing
code. While I have made many small contributions to projects in the
past, I am hoping GSoC will help me make the leap to a larger and more
substantial contribution to one of my favourite projects that made it
all possible in my journey with Open Source.
[PES Open Source] <https://pesos.github.io>
3 Me and Git
============
Here are the various forms of contributions that I have made to Git:
- [Microproject] userdiff: add support for Scheme
Status: In progress, patch v2 pending
List:
<https://public-inbox.org/git/20210327173938.59391-1-raykar.ath@gmail.com/>
- [Git Education] Conducted a workshop with attendance of hundreds of
students new to git, and increased the prevalence of git's usage
on my campus.
Photos: <https://photos.app.goo.gl/T7CPk1zkHdK7mx6v7> and
<https://photos.app.goo.gl/bzTgdHMttxDen6z9A>
I intend to continue helping people out on the mailing list and IRC
and tending to patches wherever possible in the meantime.
3.1 Current knowledge of git
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I use git almost daily in some form, and I am fairly comfortable with
it. I have already read and understood the chapters from the Git
Book about submodules, along with the ones on objects, references,
packfiles and the refspec.
4 The Project: Finish converting `git submodule' to builtin
===========================================================
Git has historically had many components implemented in the form of
shell scripts. This was less than ideal for several reasons:
- Portability: Non-POSIX systems like Windows don't play nice with
shell script commands like grep, cd and printf, to name a few, and
these commands have to be reimplemented for the system. There are
also POSIX to Windows path conversion issues.
- No direct access to plumbing: Shell commands do not have direct
access to the low level git API, and a separate shell is spawned
just to carry out their operations.
- Performance: Shell scripts tend to create a lot of child processes
which slows down the functioning of these commands, especially with
large repositories.
Over the years, many GSoC students have converted the shell versions
of these commands to C. Git `submodule' is the last of these to be
converted.
5 Prior work
============
I will be taking advantage of the knowledge that was gained in the
process of converting the previous scripts, and avoiding the
gotchas that were encountered along the way. There may be a bunch of
useful helper functions in the previous patches that can be reused as
well (more investigation needed to determine what exactly is
reusable).
Currently the only commands left to be completed for `submodule'
are `add' and `update'. Work for `add' has already been started by a
previous GSoCer, Shourya Shukla, and needs to be picked up from there.
Reference:
<https://github.com/gitgitgadget/git/issues/541#issuecomment-769245064>
I'll have these as my references when I am working on the project:
His blog about his progress:
<https://shouryashukla.blogspot.com/2020/08/the-final-report.html>
(more has been implemented since)
Shourya's latest patch for `submodule add':
<https://lore.kernel.org/git/20201007074538.25891-1-shouryashukla.oo@gmail.com/>
For the most part, the implementation looks fairly complete, but there
seems to be a segfault occurring, along with a few changes suggested
by the reviewers. It will be helpful to contact Shourya to fully
understand what needs to be done.
Prathamesh's previous conversion work:
<https://lore.kernel.org/git/20170724203454.13947-1-pc44800@gmail.com/#t>
6 General implementation strategy
=================================
The way to port the shell code to C for `submodule' will largely
remain the same as in the earlier ports. There already exists the builtin
`submodule--helper.c' which contains most of the previous commands'
ports. For the previously completed ports, all that the shell script
`git-submodule.sh' does is parse the flags and then call
the helper, which does all the business logic.
So I will be moving out all the business logic that the shell script
is performing to `submodule--helper.c'. Any reusable functionality
that is introduced during the port will be added to `submodule.c' in
the top level.
For example: The general strategy for converting `cmd_update()' would
be to have the shell script call `submodule--helper' with a function
resembling something like `module_update()'. That function would
perform the work currently done by the shell script after the flags
are parsed, and make the necessary calls to `update_clone()' and to the
git interface in C for performing the merge, checkout or rebase
where necessary.
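As a rough illustration of that last step, the choice between checkout,
merge and rebase could be sketched in C as below. This is only a sketch
under assumptions: every `sketch_*' name is hypothetical and is not taken
from git's source, and a real port would call into git's C interface
rather than return a command string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for illustration; not git's own type. */
enum sketch_update_mode {
	SKETCH_UPDATE_CHECKOUT,
	SKETCH_UPDATE_REBASE,
	SKETCH_UPDATE_MERGE,
	SKETCH_UPDATE_NONE,
	SKETCH_UPDATE_UNKNOWN
};

/* Map an update mode name (e.g. from configuration) to the enum. */
enum sketch_update_mode sketch_parse_update_mode(const char *value)
{
	assert(value != NULL);
	if (!strcmp(value, "checkout"))
		return SKETCH_UPDATE_CHECKOUT;
	if (!strcmp(value, "rebase"))
		return SKETCH_UPDATE_REBASE;
	if (!strcmp(value, "merge"))
		return SKETCH_UPDATE_MERGE;
	if (!strcmp(value, "none"))
		return SKETCH_UPDATE_NONE;
	return SKETCH_UPDATE_UNKNOWN;
}

/*
 * Return the operation that would run inside the submodule for a mode,
 * or NULL when there is nothing to run (mode "none" or unknown).
 */
const char *sketch_update_command(enum sketch_update_mode mode)
{
	switch (mode) {
	case SKETCH_UPDATE_CHECKOUT:
		return "git checkout";
	case SKETCH_UPDATE_REBASE:
		return "git rebase";
	case SKETCH_UPDATE_MERGE:
		return "git merge";
	default:
		return NULL;
	}
}
```

A caller would parse the mode once and then branch on the enum, which
keeps the string comparisons out of the per-submodule loop.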
After this process, the builtin is added to the commands array in
`submodule--helper.c'. And since these two functions are the last bit
of functionality left to convert in submodules, an extended goal can
be to get rid of the shell script altogether, and make the helper into
the actual builtin [1].
[1]
<https://lore.kernel.org/git/nycvar.QRO.7.76.6.2011191327320.56@tvgsbejvaqbjf.bet/>
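To make the idea of a commands array concrete, here is a minimal
self-contained sketch of a name-to-function table with a dispatcher. All
`sketch_*' names are hypothetical; the actual table in
`submodule--helper.c' has its own types and entries, which this does not
attempt to reproduce.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handler signature: receives the arguments after the name. */
typedef int (*sketch_subcommand_fn)(int argc, const char **argv);

struct sketch_subcommand {
	const char *name;
	sketch_subcommand_fn fn;
};

static int sketch_module_add(int argc, const char **argv)
{
	(void)argv;
	printf("add: %d argument(s)\n", argc);
	return 0;
}

static int sketch_module_update(int argc, const char **argv)
{
	(void)argv;
	printf("update: %d argument(s)\n", argc);
	return 0;
}

/* Porting a subcommand ends with adding one entry to a table like this. */
static const struct sketch_subcommand sketch_commands[] = {
	{ "add", sketch_module_add },
	{ "update", sketch_module_update },
};

/*
 * Look up argv[0] in the table and run its handler with the remaining
 * arguments. Returns -1 when no name is given or the name is unknown.
 */
int sketch_dispatch(int argc, const char **argv)
{
	size_t i;

	assert(argv != NULL);
	if (argc < 1)
		return -1;
	for (i = 0; i < sizeof(sketch_commands) / sizeof(sketch_commands[0]); i++) {
		if (!strcmp(argv[0], sketch_commands[i].name))
			return sketch_commands[i].fn(argc - 1, argv + 1);
	}
	return -1;
}
```

With a table like this, the shell script only needs to forward the
subcommand name and its arguments.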
7 Timeline (using the format dd/mm)
===================================
Periods of limited availability (read: hectic chaos):
- From 13/04 to 20/04 I will be having project evaluations and lab
assessments for five of my courses.
- From 20/04 to 01/05 I have my in-semester exams.
- For a period of two weeks in the range of 08/05 to 29/05 I will be
having my end-semester exams.
My commitment: I will still have time during my finals to help people
out on the mailing list, get acquainted with the community and its
processes, and even review patches if I can. This is because we get
holidays between each exam, and my grades are good enough that I
can prioritise git over my studies ;-)
And on the safe side, I will still engage with the community from now
till 07/06 so that the community bonding period is not compromised in
any way.
Periods of abundant availability: After 29/05 all the way to the first
week of August, I will be having my summer break, so I can dedicate
myself to git full-time :-)
I would have also finished all my core courses, so even after that, I
will have enough time to give back to git past my GSoC period.
Phase 1: 07/06 to 14/06 -- Investigate and devise a strategy to port
the submodule functions
- This phase will be more diagrams in my notebook than code in my
editor -- I will go through all the methods used to port the other
submodule functions and see how to do the same for what is left.
- I will find the C equivalents of all the shell invocations in
`git-submodule.sh', and see what invocations have /no/ equivalent
and need to be created as helpers in C (Eg: What is the equivalent
to the `ensure-core-worktree' invocation in C?). For all the helpers
and new functionality that I do introduce, I will need to create the
testing strategy for the same.
- I will go through all the work done by Shourya in his patch, and try
to understand it properly. I will also see the mistakes that were
caught in all the reviews for previous submodule conversion patches
and try to learn from them before I jump to the code.
- Deliverable: I will create a checklist for all the work that needs
to be done with as much detail as I can with the help of inputs from
my mentor and all the knowledge I have gained in the process.
Phase 2: 14/06 to 28/06 -- Convert `add' to builtin in C
- I will work on completing `git submodule add'. One strategy would be
to reimplement the whole thing using what was learnt in
Shourya's attempt, but it is probably wiser to just take his patch
and modify it. I would know what to do by the time I reach this
phase.
- I will also add tests for this functionality. These would be unit
tests for the helpers introduced, and integration tests of `add' with
the other commands. I will also document my changes when required.
- Deliverable: Completely port `add' to C!
Phase 3: 28/06 to 16/08 -- Convert `update' to builtin
- Some work has already been done by Stefan Beller that moves the
functionality of `update' to `submodule--helper.c':
<https://github.com/git/git/commit/48308681b072a1d32e1361c255347324a8ad151e>,
but a lot of the business logic of going into the submodule and
checking out or merging or rebasing still needs to be converted.
Plenty to do here.
- As with `add', all of the appropriate tests need to be written and
the changes documented. As I have learnt from the Pro Git Book,
there are a lot of subtleties with how update does its work that I
need to watch out for.
- Deliverable: Completely port `update' to C!
Bonus Phase: If I am ahead of time -- Remove the need for a
`submodule--helper', and make it a proper C builtin.
- Once all the submodule functionality is ported, the shell script is
not really doing much more than parsing the arguments and passing them
to the helper. We won't need the script anymore once the helper
becomes the builtin.
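The argument parsing that remains in the shell script is the kind of
work that would then move into C. Below is a minimal hand-rolled sketch
of that step, assuming only a `--quiet' and a `--force' flag. The struct
and function names are hypothetical, and an actual builtin would more
likely use git's own option parsing facilities than a loop like this.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option set, for illustration only. */
struct sketch_update_opts {
	int quiet;
	int force;
	int first_path; /* index of the first non-option argument, or argc */
};

/*
 * Parse leading flags from argv. Stops at the first non-option argument
 * or after a literal "--". Returns 0 on success, -1 on an unknown option.
 */
int sketch_parse_update_args(int argc, const char **argv,
			     struct sketch_update_opts *opts)
{
	int i;

	assert(argv != NULL && opts != NULL);
	opts->quiet = 0;
	opts->force = 0;
	for (i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "-q") || !strcmp(argv[i], "--quiet")) {
			opts->quiet = 1;
		} else if (!strcmp(argv[i], "-f") || !strcmp(argv[i], "--force")) {
			opts->force = 1;
		} else if (!strcmp(argv[i], "--")) {
			i++; /* everything after "--" is a path */
			break;
		} else if (argv[i][0] == '-') {
			return -1; /* unknown option */
		} else {
			break; /* first path argument */
		}
	}
	opts->first_path = i;
	return 0;
}
```

The arguments from `first_path' onwards would be treated as submodule
paths by whatever function does the actual work.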
8 Beyond GSoC
=============
I love the process of working as a community more than anything else,
and I already felt very welcomed by the git community the moment I
started sending in my microproject patch series. Whether I am selected
or not, I will continue giving back to git wherever I can. Since my
final year is light on coursework, I will be able to mentor people and
help expand the git developer community through all the ways I can (be
it code review, helping people find the right resources or evangelism
of git).
9 Blogging
==========
I will be blogging about my progress on a weekly basis and posting
it on my website at <https://atharvaraykar.me> (probably will tuck it
away in a /gsoc path). Technical blogging is not particularly new to
me, and I hope my posts can help future contributors of git.
10 Final Remarks: A little more about me
========================================
These are some of my core values that I believe will be important to
pull off this project and make the most of my time in GSoC:
- Hard problems don't frustrate me, rather they excite me. Bugs make
my brain perk up (this sentence best left with context). I love
learning.
- I am pro-transparency. If I am having some trouble, I will be open
about it. I don't hesitate to ask questions and dig deep if I need
to.
- At the same time, when I ask a question, I only do so after I have
struggled with the problem for enough time and done my due diligence
in trying to solve it. Clear communication is very important to make
this work.
- I am also very comfortable with learning things all on my own (I
have barely known any other way), and working in a remote,
asynchronous setting.
I hope to make the world better in my own small way by contributing to
a tool that everyone uses and I like. It's more rewarding than any
internship that my peers are doing this year. I look forward to
learning more.