git@vger.kernel.org mailing list mirror (one of many)
* [RFC][PATCH] GSoC 2023 proposal: more sparse index integration
@ 2023-02-26  8:33 Vivan Garg
  2023-02-26  9:03 ` Ashutosh Pandey
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Vivan Garg @ 2023-02-26  8:33 UTC (permalink / raw)
  To: git, vdye; +Cc: christian.couder, hariom18599, Vivan Garg

Signed-off-by: Vivan Garg <gvivan6@gmail.com>
---
This is a rough draft of my proposal for the GSoC 2023 more sparse 
index integration project. I would greatly appreciate any feedback 
anyone has to offer. Thank you in advance!

 .../More-Sparse-Index-Integrations.txt        | 134 ++++++++++++++++++
 1 file changed, 134 insertions(+)
 create mode 100644 Documentation/More-Sparse-Index-Integrations.txt

diff --git a/Documentation/More-Sparse-Index-Integrations.txt b/Documentation/More-Sparse-Index-Integrations.txt
new file mode 100644
index 0000000000..dbe6da660f
--- /dev/null
+++ b/Documentation/More-Sparse-Index-Integrations.txt
@@ -0,0 +1,134 @@
+# More Sparse Index Integrations
+
+# Personal Information
+
+Full name: Vivan Garg
+
+E-mail: gvivan6@gmail.com 
+Alternate E-mail: v.garg.work@gmail.com
+Tel: (+1)437-987-2678
+
+Education: University of Waterloo (Canada)
+Major: Computer Science and Financial Management (Double-Major)
+Year: Rising Junior
+
+LinkedIn: https://www.linkedin.com/in/gvivan/
+GitHub: https://github.com/gvivan
+Website: https://gvivan.me/
+
+# Before GSoC
+
+## Synopsis
+
+I've chosen the "More Sparse Index Integrations" project idea from the SoC 2023 Ideas page. The goal of this project is to integrate the experimental "sparse-index" feature and "sparse-checkout" command with existing Git commands. 
+
+Git 2.25.0 introduced a new experimental `git sparse-checkout` command, which simplified the existing feature and improved performance for large repositories. It allows users to restrict their working directory to only the files they care about, allowing them to ensure the developer workflow is as fast as possible while maintaining all the benefits of a monorepo. 
+(Bring your monorepo down to size with sparse-checkout, Stolee).
+
+Pattern matching in Git's sparse-checkout feature becomes expensive as the sparse-checkout file and the repository grow: the cost grows quadratically, which can mean billions of pattern checks for a large repository. However, Git's newer mechanism, which matches on folder prefixes, drops the quadratic growth and matches M patterns across N files in O(M+N*d) time, where d is the maximum folder depth of a file.
+To further optimize the matching process, Git inspects files in a sorted order instead of an arbitrary order. When Git evaluates a file path, it checks whether the start of the folder path matches a recursive pattern exactly. If so, it marks everything in that folder as "included" without doing any further hashset lookups. Similarly, when Git detects the start of a folder that's outside of the specified cone, it marks everything in that folder as "excluded" without doing any further hashset lookups. This reduces the time to be closer to O(M+N) (Bring your monorepo down to size with sparse-checkout, Stolee).
+
+The Git Fundamentals team at GitHub has contributed a new feature to Git called the sparse index, which allows the index to focus on the files within the sparse-checkout cone in a monorepo. The sparse index stores only the information about the files within the sparse-checkout definition, instead of storing information for every file at HEAD, which can make the index much larger in a monorepo. When enabled with other performance features, the sparse index can have a significant impact on performance (Make your monorepo feel small with Git’s sparse index, Stolee).
+
+The sparse index differs from a normal "full" index in that it can store directory paths with the object ID for its tree object. It can be used to determine if an entire directory is out of the sparse-checkout cone and replace all of its contained file paths with a single directory path. The use of sparse index can significantly reduce the size of the index, resulting in faster operations (Make your monorepo feel small with Git’s sparse index, Stolee).
+
+Because "sparse-checkout" and "sparse-index" can influence the logic of other Git commands and Git's internal data structures, some work is required to optimize compatibility and user experience. That is exactly what my chosen idea proposes.
+
+## Benefits to Community
+
+By joining the community and working on this idea, I can collaborate with my mentor and fellow community members to improve the user experience for people who are working with large monorepos. Furthermore, I am committed to continuing my involvement beyond the GSoC program, not only by contributing to the community but also by sharing my experiences and mentoring future potential newcomers.
+
+
+## Microproject
+
+t4121: modernize test style
+Status: ready to merge
+Description: The test scripts in t4121-apply-diffs.sh are written in the old style, where the test_expect_success command and test title are written on
+separate lines. This patch updates the tests to adhere to the new style.
+
+## Other Contributions
+
+### Reviewing
+
+t9700: modernize test script
+Status: WIP
+Description: I reviewed this patch and pointed the contributor in the right direction by providing examples, links and mentioning the best practices.
+
+### Patches
+
+MyFirstContribution: add note about SMTP server config
+Status: WIP
+Description: The documentation on using git-send-email previously mentioned the need to configure git for your operating system and email provider, but did not provide specific details on the relevant configuration settings. This commit adds a note specifying that the relevant settings can be found under the 'sendemail' section of Git's configuration file, with a link to the relevant documentation. The aim is to provide users with a more complete understanding of the configuration process and help them avoid potential roadblocks in setting up git-send-email.
+
+
+### Related Work
+
+Prior work on this idea has been completed by my mentors and other community members, and it provides a good approximation of the approach I intend to take. Here are some examples of previous commits:
+Integration with “mv”
+Integration with “reset”
+Integration with “sparse-checkout”
+Integration with “clean”
+Integration with “blame”
+
+# In GSoC
+
+## Plan
+
+The proposed idea of increasing "sparse-index" integrations may seem straightforward at first glance. However, upon reviewing previous implementations, I discovered that this idea can introduce unforeseen difficulties for some functions. For example, to enable "sparse-index," we must ensure that "sparse-checkout" is compatible with the target Git command. Achieving this compatibility requires modifying the original command logic, which can lead to other unanticipated issues. Therefore, I have incorporated some additional steps in the plan outlined below to proactively address potential complications. It's worth noting that points 3-7 are part of the SoC 2023 Ideas proposed by the community and mentors.
+
+1. Conduct an investigation to determine if a Git command functions properly with sparse-checkout.
+
+2. Modify the logic of the Git command, if necessary, to ensure it functions properly with sparse-checkout. Develop corresponding tests to validate the modifications. 
+
+3. Add tests to t1092-sparse-checkout-compatibility.sh for the builtin, with a focus on what happens for paths outside of the sparse-checkout cone.
+
+4. Disable the command_requires_full_index setting in the builtin and ensure the tests pass.
+
+5. If the tests do not pass, then alter the logic to work with the sparse index.
+
+6. Add tests to check that a sparse index stays sparse.
+
+7. Add performance tests to demonstrate speedup.
+
+8. If any changes are made that affect the behavior of the Git command, update the documentation accordingly. Note that such changes should be rare.
+
+## Timeline
+
+During my discussion with Victoria, she told me that, given my commitment of 175 hours, I am expected to be able to fully integrate two commands with the sparse index during the GSoC program. My plan is to distribute the work for each command evenly over the course of the program. I am confident that I can start the project early, as I have already established communication with my mentors and familiarized myself with the related documentation, although my understanding may not be comprehensive.
+
+Based on my prior experience with the idea, I believe I will be able to quickly get up to speed and begin working on the project. The exact timeline for each integration is difficult to determine, but I estimate that I should be able to complete one integration every two months. I have already planned out my next term, and there are only three weeks during which I would prefer to focus on other things: June 23-30 and August 1-15. However, even without an extension, I should be able to manage this timeline. With the flexibility to extend the program, it should be even easier to accommodate any potential scheduling conflicts.	
+
+	
+## Availability
+
+I will respond to all communication daily and will be available throughout the duration of the program. Although I will be taking some summer courses at my university, I will not be enrolled in a typical full course load. As part of GSoC, I plan to commit to 175 hours. I have experience managing my time effectively while taking courses and working full-time internships in the past. My semester ends on August 15th, and I have no commitments for the following month, which allows me to continue working beyond the end of the semester. With the flexibility to extend the timeline of GSoC, I am confident that I will have ample time to complete the project. I have already discussed this with Victoria, the mentor for the project, and she has agreed to extend the deadline until October 2nd, if necessary. After August 15th, I will be able to work at least 8 hours per day, totaling ~360 hours of work until the October 2nd deadline. This exceeds the required commitment of 175 hours, ensuring that I will complete the project on time. Additionally, I am hoping to continue working on the project even after GSoC ends.
+
+# After GSoC
+
+I recognize the value of having our GSoC participants continue to engage with our community beyond the event. This is why I am committed to doing so myself. Participating in open-source projects, especially with a community that supports a widely-used development tool, is not only cool but also offers an opportunity to learn and grow. By continuing to participate in this community, I believe that I can make important contributions and continue to develop my skills.
+
+I am planning to establish an open source club at my university in the near future. The University of Waterloo is known for its strong emphasis on computer science and engineering, earning it the nickname "MIT of the North." Given this, I believe that there will be a great deal of interest in the club for a variety of reasons. Currently, there is another club called Blueprint that provides a valuable opportunity for real-world development experience through developing software products for charities. However, the entry process for this club is extremely competitive. By contrast, I think that an open source club would offer a similar experience but with a lower barrier to entry, thus making it accessible to more motivated students. Additionally, given the widespread use and vibrant community of Git, I plan to direct students to this community and am confident that many will be interested in contributing to its open source projects.
+
+# Some Credits to Myself
+
+I’ve previously completed three software developer internships at companies ranging from small startups to large enterprises. I am currently interning with Morgan Stanley on the architecture team, working on large-scale equity management software.
+
+I'm interested in open source development as a way to give back to the community while also growing as a developer. My background in the C programming language has made me particularly interested in contributing to Git, which is primarily written in C. I am also comfortable with concepts like memory allocation, thanks to my experience with C programming. Furthermore, I have studied shell scripting as part of my coursework, which makes me well-equipped to handle the project's language requirements. Another personal motivation for contributing to this project is that I have worked with monorepos before, and given that they are used by many of the larger tech companies, I want to learn more about them and help improve the user experience with them.
+
+Victoria mentioned that I was the first person to express interest in the project this year, either directly or via the mailing list. In my spare time, I've been contributing and reading documents while also working a full-time job (internship) and taking one course at my university. I expect to have a lot more time next term, so you can expect even more from me ;). Nonetheless, I became familiar and comfortable with the contribution process by writing, responding to, and auditing various types of patches in the community.
+
+With the patches I have submitted so far, I have been able to develop a deeper understanding of Git internals, project structures, commonly used APIs, test suites, required tech stacks, and coding guidelines. To further enhance my comprehension of Git, I have either read or skimmed through several relevant documents, including 'Submitting patches', 'Coding guidelines', 'Myfirstcontribution.txt', 'Git tutorial', 'Git everyday', 'readme', 'Hacking Git', drawing upon my prior knowledge where applicable. Additionally, I have been referring to the book 'Pro Git' on an as-needed basis. Furthermore, I have thoroughly read and referenced blogs such as 'Make your monorepo feel small with Git's sparse index', 'Bring your monorepo down to size with sparse-checkout', and 'Commits are snapshots, not diffs'. The advantage of having prior knowledge and experience with my proposed project idea is that I am well-prepared to tackle any upcoming challenges.
+
+# Closing remarks
+
+I am very motivated for this project because I have previously worked with monorepos and will most likely have to work with them again in my future internships. As a result, I intend to continue working on the remaining commands after GSoC whenever I have free time.
+
+I'd like to state that I'm a genuinely enthusiastic open-source newcomer who is very much looking forward to this opportunity. I am grateful for the opportunity to contribute to Git's development, and I am committed to working diligently to strengthen the open-source ecosystem. My ultimate goal is to use this opportunity to bring new energy and ideas to the table, and to make meaningful contributions that benefit the entire community.
+
+I am grateful for the community's support, especially Victoria's guidance and feedback. She promptly replied to my inquiries and provided me with several resources that were instrumental in helping me get started on the project. I am truly humbled by the dedication and hard work that the community puts in to nurture and enhance this ecosystem, and I feel fortunate to have received such warm and welcoming support as a new contributor. It is an honor to be a part of this community and to work towards advancing its mission.
+
+Thank you so much for reading through my proposal!
+
+Kind Regards,
+Vivan Garg
+

base-commit: d9d677b2d8cc5f70499db04e633ba7a400f64cbf
-- 
2.37.0 (Apple Git-136)
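
A note on the Synopsis in the patch above: the folder-prefix matching and
the "one directory entry instead of many file entries" idea can be
illustrated with a small toy model. The sketch below is written in Python
for brevity and is not Git's actual C implementation; the function names
(build_cone, in_cone, sparse_entries) and the example paths are
hypothetical, and real sparse-directory entries also carry a tree object
ID, which is omitted here.

```python
from posixpath import dirname


def build_cone(recursive_dirs):
    """Build the two hash sets used for prefix matching (toy model)."""
    recursive = set(recursive_dirs)
    parents = {""}  # files directly in the repository root are in the cone
    for folder in recursive:
        parent = dirname(folder)
        while parent:
            parents.add(parent)
            parent = dirname(parent)
    return recursive, parents


def in_cone(path, recursive, parents):
    """Decide membership with at most d set lookups (d = folder depth)."""
    folder = dirname(path)
    if folder in parents:
        return True  # file sits directly inside a parent folder of the cone
    while folder:
        if folder in recursive:
            return True  # some ancestor folder is included recursively
        folder = dirname(folder)
    return False


def sparse_entries(full_index, recursive, parents):
    """Collapse out-of-cone files into one entry per outermost excluded folder."""
    entries = []
    for path in sorted(full_index):
        if in_cone(path, recursive, parents):
            entries.append(path)
            continue
        # Walk up to the outermost ancestor folder that lies outside the cone.
        folder = dirname(path)
        while dirname(folder) not in parents:
            folder = dirname(folder)
        entry = folder + "/"
        if not entries or entries[-1] != entry:
            entries.append(entry)
    return entries


recursive, parents = build_cone(["A/B/C"])
full_index = [
    "README.md", "A/file.txt", "A/B/C/main.c", "A/B/C/x/y.txt",
    "A/other/z.c", "A/other/deep/w.c", "D/one", "D/two/three",
]
print(sparse_entries(full_index, recursive, parents))
```

In this model, eight file entries shrink to six: the two files under
A/other/ and the two under D/ are each replaced by a single directory
entry, while everything inside the cone is kept as-is.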



* Re: [RFC][PATCH] GSoC 2023 proposal: more sparse index integration
  2023-02-26  8:33 [RFC][PATCH] GSoC 2023 proposal: more sparse index integration Vivan Garg
@ 2023-02-26  9:03 ` Ashutosh Pandey
  2023-02-26 23:18   ` Vivan Garg
  2023-02-26 23:03 ` Victoria Dye
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Ashutosh Pandey @ 2023-02-26  9:03 UTC (permalink / raw)
  To: Vivan Garg; +Cc: git, vdye, christian.couder, hariom18599

On Sun, Feb 26, 2023 at 2:17 PM Vivan Garg <gvivan6@gmail.com> wrote:
> +## Microproject
> +
> +t4121: modernize test style
> +Status: ready to merge
> +Description: The test scripts in t4121-apply-diffs.sh are written in the old style, where the test_expect_success command and test title are written on
> +separate lines. This patch updates the tests to adhere to the new style.
You should include a link here to Junio's "What's cooking in git.git"
report where your contribution is mentioned.


* Re: [RFC][PATCH] GSoC 2023 proposal: more sparse index integration
  2023-02-26  8:33 [RFC][PATCH] GSoC 2023 proposal: more sparse index integration Vivan Garg
  2023-02-26  9:03 ` Ashutosh Pandey
@ 2023-02-26 23:03 ` Victoria Dye
  2023-02-26 23:52   ` Vivan Garg
  2023-02-27  0:46 ` [RFC][PATCH v2] " Vivan Garg
  2023-03-23  6:38 ` [RFC][PATCH v3] " Vivan Garg
  3 siblings, 1 reply; 11+ messages in thread
From: Victoria Dye @ 2023-02-26 23:03 UTC (permalink / raw)
  To: Vivan Garg, git; +Cc: christian.couder, hariom18599

Vivan Garg wrote:
> ## Synopsis

Please wrap your text to 72 (or up to 80) characters; doing that will make
this much easier for reviewers to format their emails. I've re-wrapped lines
I'm commenting on below.

> Git 2.25.0 introduced a new experimental `git sparse-checkout` command,
> which simplified the existing feature and improved performance for large
> repositories. It allows users to restrict their working directory to only
> the files they care about, allowing them to ensure the developer workflow
> is as fast as possible while maintaining all the benefits of a monorepo. 
> (Bring your monorepo down to size with sparse-checkout, Stolee).

References to other sources (e.g. web links) are usually made with [<N>]
footnotes. In this case, that might look something like:

"
Git 2.25.0 introduced a new experimental `git sparse-checkout` command,
which simplified the existing feature and improved performance for large
repositories. It allows users to restrict their working directory to only
the files they care about, allowing them to ensure the developer workflow is
as fast as possible while maintaining all the benefits of a monorepo. [1]
 
[1]: https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/
"

Same goes for the other references you've included.

> +## Microproject
> +
> +t4121: modernize test style
> +Status: ready to merge

To expand on the point made by Ashutosh [1], this microproject is not yet
tracked by Junio's "What's cooking" emails (most recent here: [2]), so it is
not "ready to merge." "Under review" would be a more appropriate
description. 

[1] https://lore.kernel.org/git/CACmM78QTptLOvNHs9oE2NNareSNDb+ydGFKr0VHuboCBWSZbSw@mail.gmail.com/
[2] https://lore.kernel.org/git/xmqq1qmeyfps.fsf@gitster.g/

> Integration with “mv”
> Integration with “reset”
> Integration with “sparse-checkout”
> Integration with “clean”
> Integration with “blame”

Please include mailing list archive links to these series.

> During my discussion with Victoria, she informed me that given my
> commitment of 175 hours, it is expected that I will be able to fully
> integrate two commands with sparse index during the GSOC program. My plan
> is to evenly distribute the work for each command over the course of the
> program. I am confident that I can start the project early as I have
> already established communication with my mentors and familiarized myself
> with the related documentation, although my understanding may not be
> comprehensive.

"Two commands per 175 hours" is what I characterized as "rough
expectations," but the actual number of commands integrated for the project
will vary based on the complexity of the commands chosen. In a proposal, I
would expect an applicant to present their own, more detailed reasoning around
how long various parts of the project will take, rather than simply quoting
my high-level estimate.

> ## Availability
> 
> I will respond to all communication daily and will be available throughout
> the duration of the program. Although I will be taking some summer courses
> at my university, I will not be enrolled in a typical full course load. As
> part of GSOC, I plan to commit to 175 hours. I have experience managing my
> time effectively while taking courses and working full-time internships in
> the past. My semester ends on August 15th, and I have no commitments for
> the following month, which allows me to continue working beyond the end of
> the semester. With the flexibility to extend the timeline of GSOC, I am
> confident that I will have ample time to complete the project. I have
> already discussed this with Victoria, the mentor for the project, and she
> has agreed to extend the deadline until October 2nd, if necessary. After
> August 15th, I will be able to work at least 8 hours per day, totaling
> ~360 hours of work until the October 2nd deadline.

I said that "I'd be willing to extend as far as Oct 2 (four weeks) if
needed", but that's a general statement about my own availability and does
not mean that I think such an extension is necessary in this case. The ~360
hours you mention is too large a margin over the 175 hours allocated for the
project to properly understand your planned availability. I would prefer a
more precise breakdown of the time you actually intend to spend on the
project.

> # Some Credits to Myself
> 

[...]

> Victoria mentioned that I was the first person to express interest in the
> project this year, either directly or via the mailing list. 

I want to be extremely clear on this point: while you were the first to
reach out to me, I do not consider that a weighting factor as to who is
ultimately selected for the project. The application deadline is April 4
[3], and I acknowledge that not every potential applicant has the time
available right now to get an early start on preparing their application.
Most importantly, I _absolutely do not_ want your comment to discourage
others from applying.

[3] https://developers.google.com/open-source/gsoc/timeline



* Re: [RFC][PATCH] GSoC 2023 proposal: more sparse index integration
  2023-02-26  9:03 ` Ashutosh Pandey
@ 2023-02-26 23:18   ` Vivan Garg
  0 siblings, 0 replies; 11+ messages in thread
From: Vivan Garg @ 2023-02-26 23:18 UTC (permalink / raw)
  To: Ashutosh Pandey; +Cc: Vivan Garg, git, vdye, christian.couder, hariom18599

> you should include a link here to Junio's what's cooking in git where
> your contribution is mentioned.

That's something I've already considered. I didn't do it yet because the
status might change by the time I submit my proposal, so I just put
a placeholder there for now. Thanks though!


* Re: [RFC][PATCH] GSoC 2023 proposal: more sparse index integration
  2023-02-26 23:03 ` Victoria Dye
@ 2023-02-26 23:52   ` Vivan Garg
  0 siblings, 0 replies; 11+ messages in thread
From: Vivan Garg @ 2023-02-26 23:52 UTC (permalink / raw)
  To: Victoria Dye; +Cc: Vivan Garg, git, christian.couder, hariom18599

> Please wrap your text to 72 (or up to 80) characters; doing that will make
> this much easier for reviewers to format their emails. I've re-wrapped lines
> I'm commenting on below.

I wrote it in Word, copied and pasted it, and then sent it to the mailing list.
However, I will send a revised version that is properly formatted.

> References to other sources (e.g. web links) are usually made with [<N>]
> footnotes. In this case, that might look something like:
>
> "
> Git 2.25.0 introduced a new experimental `git sparse-checkout` command,
> which simplified the existing feature and improved performance for large
> repositories. It allows users to restrict their working directory to only
> the files they care about, allowing them to ensure the developer workflow is
> as fast as possible while maintaining all the benefits of a monorepo. [1]
>
> [1]: https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/
> "
>
> Same goes for the other references you've included.

Actually, I had all of the titles in the Word document as hyperlinks. I'll
make this change for the reviewers on the mailing list, but do you
recommend changing it in the final proposal I'm submitting to Google?

> > +## Microproject
> > +
> > +t4121: modernize test style
> > +Status: ready to merge
>
> To expand on the point made by Ashutosh [1], this microproject is not yet
> tracked by Junio's "What's cooking" emails (most recent here: [2]), so it is
> not "ready to merge." "Under review" would be a more appropriate
> description.
>
> [1] https://lore.kernel.org/git/CACmM78QTptLOvNHs9oE2NNareSNDb+ydGFKr0VHuboCBWSZbSw@mail.gmail.com/
> [2] https://lore.kernel.org/git/xmqq1qmeyfps.fsf@gitster.g/

I only put that in as a placeholder because the status is likely to change by
the time I submit my proposal. However, I'll change the placeholder to WIP.

>
> > Integration with “mv”
> > Integration with “reset”
> > Integration with “sparse-checkout”
> > Integration with “clean”
> > Integration with “blame”
>
> Please include mailing list archive links to these series.

I also had these as hyperlinks. However, I will include the link here.

> "Two commands per 175 hours" is what I characterized as "rough
> expectations," but the actual number of commands integrated for the project
> will vary based on the complexity of the commands chosen. In a proposal, I
> would expect an applicant present their own, more detailed reasoning around
> how long various parts of the project will take, rather than simply quoting
> my high-level estimate.
>
> I said that "I'd be willing to extend as far as Oct 2 (four weeks) if
> needed", but that's a general statement about my own availability and does
> not mean that I think such an extension is necessary in this case. The ~360
> hours you mention is too large a margin over the 175 hours allocated for the
> project to properly understand your planned availability. I would prefer a
> more precise breakdown of the time you actually intend to spend on the
> project.

Is it sufficient to assign an approximate time I intend to devote to each step
in my plan?

Thanks!


* [RFC][PATCH v2] GSoC 2023 proposal: more sparse index integration
  2023-02-26  8:33 [RFC][PATCH] GSoC 2023 proposal: more sparse index integration Vivan Garg
  2023-02-26  9:03 ` Ashutosh Pandey
  2023-02-26 23:03 ` Victoria Dye
@ 2023-02-27  0:46 ` Vivan Garg
  2023-03-23  6:38 ` [RFC][PATCH v3] " Vivan Garg
  3 siblings, 0 replies; 11+ messages in thread
From: Vivan Garg @ 2023-02-27  0:46 UTC (permalink / raw)
  To: git, vdye; +Cc: christian.couder, hariom18599, Vivan Garg

Signed-off-by: Vivan Garg <gvivan6@gmail.com>
---
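To make the cone-mode matching described in the Synopsis below a bit
more concrete, here is a minimal Python sketch. It is only an
illustration under simplifying assumptions, not Git's actual
implementation (which is written in C): the names `parent_dirs` and
`in_cone` are hypothetical, root-level files are assumed to always
match, and only recursive directory patterns are modelled.

```python
from typing import Iterator, Set


def parent_dirs(path: str) -> Iterator[str]:
    """Yield every ancestor directory of path, shallowest first."""
    parts = path.split("/")[:-1]
    for depth in range(1, len(parts) + 1):
        yield "/".join(parts[:depth])


def in_cone(path: str, recursive_dirs: Set[str]) -> bool:
    """Return True if path is inside the (simplified) cone.

    Each ancestor directory is looked up in a hash set, so one check
    costs at most d lookups, where d is the directory depth of path.
    """
    if "/" not in path:
        # Assumption of this sketch: root-level files always match.
        return True
    return any(d in recursive_dirs for d in parent_dirs(path))
```

For example, with `recursive_dirs = {"src/lib", "docs"}`, the path
`src/lib/a/b.c` is inside the cone while `src/app/main.c` is not.
Checking N paths this way costs roughly N*d set lookups; the
sorted-order shortcut that brings the cost closer to O(M+N) is not
shown here.
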
 .../More-Sparse-Index-Integrations.txt        | 312 ++++++++++++++++++
 1 file changed, 312 insertions(+)
 create mode 100644 Documentation/More-Sparse-Index-Integrations.txt

diff --git a/Documentation/More-Sparse-Index-Integrations.txt b/Documentation/More-Sparse-Index-Integrations.txt
new file mode 100644
index 0000000000..2ab6b07f18
--- /dev/null
+++ b/Documentation/More-Sparse-Index-Integrations.txt
@@ -0,0 +1,312 @@
+# More Sparse Index Integrations
+
+# Personal Information
+
+Full name: Vivan Garg
+
+E-mail: gvivan6@gmail.com 
+Alternate E-mail: v.garg.work@gmail.com
+Tel: (+1)437-987-2678
+
+Education: University of Waterloo (Canada)
+Major: Computer Science and Financial Management (Double-Major)
+Year: Rising Junior
+
+LinkedIn: https://www.linkedin.com/in/gvivan/
+GitHub: https://github.com/gvivan
+Website: https://gvivan.me/
+
+# Before GSoC
+
+## Synopsis
+
+I've chosen the "More Sparse Index Integrations" project idea from the
+SoC 2023 Ideas page. The goal of this project is to integrate the 
+experimental "sparse-index" feature and "sparse-checkout" command with 
+existing Git commands. 
+
+Git 2.25.0 introduced a new experimental `git sparse-checkout` command, 
+which simplified the existing feature and improved performance for 
+large repositories. It allows users to restrict their working directory 
+to only the files they care about, allowing them to ensure the developer 
+workflow is as fast as possible while maintaining all the benefits of a 
+monorepo. 
+(Bring your monorepo down to size with sparse-checkout [1], Stolee).
+
+The pattern matching process in Git's sparse-checkout feature becomes 
+expensive as the sparse-checkout file and repository size increase, 
+growing quadratically. This can result in billions of pattern checks 
+for large repositories. However, Git's new mechanism for matching based 
+on folder prefix matches drops the quadratic growth, matching M patterns 
+across N files in O(M+N*d) time, where d is the maximum folder depth of a file. 
+To further optimize the matching process, Git inspects files in a sorted 
+order instead of an arbitrary order. When Git evaluates a file path, it 
+checks whether the start of the folder path matches a recursive pattern exactly. 
+If so, it marks everything in that folder as "included" without doing any further 
+hashset lookups. Similarly, when Git detects the start of a folder that's outside 
+of the specified cone, it marks everything in that folder as "excluded" without 
+doing any further hashset lookups. This reduces the time to be closer to O(M+N) 
+(Bring your monorepo down to size with sparse-checkout [1], Stolee).
+
+[1]: https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/
+
+The Git Fundamentals team at GitHub has contributed a new feature to Git called 
+the sparse index, which allows the index to focus on the files within the 
+sparse-checkout cone in a monorepo. The sparse index stores only the information 
+about the files within the sparse-checkout definition, instead of storing information 
+for every file at HEAD, which can make the index much larger in a monorepo. When 
+enabled with other performance features, the sparse index can have a significant 
+impact on performance (Make your monorepo feel small with Git’s sparse index [2], Stolee).
+
+[2]: https://github.blog/2021-11-10-make-your-monorepo-feel-small-with-gits-sparse-index/
+
+The sparse index differs from a normal "full" index in that it can store directory 
+paths with the object ID for its tree object. It can be used to determine if an 
+entire directory is out of the sparse-checkout cone and replace all of its contained 
+file paths with a single directory path. The use of sparse index can significantly 
+reduce the size of the index, resulting in faster operations 
+(Make your monorepo feel small with Git’s sparse index [3], Stolee).
+
+[3]: https://github.blog/2021-11-10-make-your-monorepo-feel-small-with-gits-sparse-index/
+
+Because "sparse-checkout" and "sparse-index" may influence the logic of 
+other Git commands and the internal data structures of Git, some work is required to 
+optimize compatibility and user experience. That is exactly what my chosen idea proposes.
+
+## Benefits to Community
+
+By joining the community and working on this idea, I can collaborate with my mentor 
+and fellow community members to improve the user experience for people who are working 
+with large monorepos. Furthermore, I am committed to continuing my involvement beyond 
+the GSoC program, not only by contributing to the community but also by sharing my 
+experiences and mentoring future potential newcomers.
+
+
+## Microproject
+
+t4121: modernize test style [4]
+Status: WIP
+Description: Test scripts in file t4121-apply-diffs.sh are written in old style, 
+where the test_expect_success command and test title are written on
+separate lines. Therefore update the tests to adhere to the new style.
+
+## Other Contributions
+
+### Reviewing
+
+t9700: modernize test script [5]
+Status: WIP
+Description: I reviewed this patch and pointed the contributor in the right direction 
+by providing examples, links and mentioning the best practices.
+
+### Patches
+
+MyFirstContribution: add note about SMTP server config [6]
+Status: WIP
+Description: The documentation on using git-send-email previously mentioned the need 
+to configure git for your operating system and email provider, but did not provide 
+specific details on the relevant configuration settings. This commit adds a note 
+specifying that the relevant settings can be found under the 'sendemail' section of 
+Git's configuration file, with a link to the relevant documentation. The aim is to 
+provide users with a more complete understanding of the configuration process and 
+help them avoid potential roadblocks in setting up git-send-email.
+
+[4]: https://lore.kernel.org/git/CACzddJrZ8YdJ72ng3UpMGN9CJx0qW1+fZfyi3q01z2487V8fxw@mail.gmail.com/T/#md53157af31a3f347dd899679fafdea7fcaf7ecfc
+[5]: https://lore.kernel.org/git/CADupsJPpZnjA=Pu_RZZZXy7Titj3UD7ppww48KvcHHHbrGx=rw@mail.gmail.com/T/#m122db9bdca463c12f0b9ccb259fd1d3229d75945
+[6]: https://lore.kernel.org/git/20230222011317.97943-1-gvivan6@gmail.com/
+
+
+### Related Work
+
+Prior works on the idea have been completed by my mentors and other community members, 
+and these works provide a good approximation of the approach I intend to take. Here 
+are some previous examples of commits:
+
+Integration with “mv” [7]
+Integration with “reset” [8]
+Integration with “sparse-checkout” [9]
+Integration with “clean” [10]
+Integration with “blame” [11]
+
+[7]: https://lore.kernel.org/git/20220331091755.385961-1-shaoxuan.yuan02@gmail.com/
+[8]: https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/
+[9]: https://lore.kernel.org/git/pull.1208.v3.git.1653313726.gitgitgadget@gmail.com/
+[10]: https://github.com/git/git/commit/1e9e10e04891a13e5ccd52b36cfadc55dfaa5066
+[11]: https://github.com/git/git/commit/add4c864b60766174ad4f74ba7be17e66d61ef16
+
+# In GSoC
+
+## Plan
+
+The proposed idea of increasing "sparse-index" integrations may seem 
+straightforward at first glance. However, upon reviewing previous 
+implementations, I discovered that this idea can introduce unforeseen 
+difficulties for some functions. For example, to enable "sparse-index," 
+we must ensure that "sparse-checkout" is compatible with the target Git 
+command. Achieving this compatibility requires modifying the original 
+command logic, which can lead to other unanticipated issues. Therefore, 
+I have incorporated some additional steps in the plan outlined below to 
+proactively address potential complications. It's worth noting that 
+points 3-7 are part of the SoC 2023 Ideas proposed by the community 
+and mentors.
+
+1. Conduct an investigation to determine if a Git command functions 
+properly with sparse-checkout.
+
+2. Modify the logic of the Git command, if necessary, to ensure it 
+functions properly with sparse-checkout. Develop corresponding tests 
+to validate the modifications. 
+
+3. Add tests to t1092-sparse-checkout-compatibility.sh for the 
+builtin, with a focus on what happens for paths outside of the 
+sparse-checkout cone.
+
+4. Disable the command_requires_full_index setting in the builtin 
+and ensure the tests pass.
+
+5. If the tests do not pass, then alter the logic to work with the 
+sparse index.
+
+6. Add tests to check that a sparse index stays sparse.
+
+7. Add performance tests to demonstrate speedup.
+
+8. If any changes are made that affect the behavior of the Git 
+command, update the documentation accordingly. Note that such 
+changes should be rare.
+
+## Timeline
+
+During my discussion with Victoria, she informed me that given my 
+commitment of 175 hours, it is expected that I will be able to fully 
+integrate two commands with sparse index during the GSOC program. My 
+plan is to evenly distribute the work for each command over the course 
+of the program. I am confident that I can start the project early as I 
+have already established communication with my mentors and familiarized 
+myself with the related documentation, although my understanding may 
+not be comprehensive.
+
+Based on my prior experience with the idea, I believe I will be able 
+to quickly get up to speed and begin working on the project. The exact 
+timeline for each integration is difficult to determine, but I estimate 
+that I should be able to complete one integration every two months. I 
+have already planned out my next term, and there are only three weeks 
+during which I would prefer to focus on other things: June 23-30 and 
+August 1-15. However, even without an extension, I should be able to 
+manage this timeline. With the flexibility to extend the program, it 
+should be even easier to accommodate any potential scheduling conflicts.
+
+
+## Availability
+
+I will respond to all communication daily and will be available throughout 
+the duration of the program. Although I will be taking some summer courses 
+at my university, I will not be enrolled in a typical full course load. As 
+part of GSOC, I plan to commit to 175 hours. I have experience managing my 
+time effectively while taking courses and working full-time internships in 
+the past. My semester ends on August 15th, and I have no commitments for the 
+following month, which allows me to continue working beyond the end of the 
+semester. With the flexibility to extend the timeline of GSOC, I am confident 
+that I will have ample time to complete the project. I have already discussed 
+this with Victoria, the mentor for the project, and she has agreed to extend 
+the deadline until October 2nd, if necessary. After August 15th, I will be 
+able to work at least 8 hours per day, totaling ~360 hours of work until the 
+October 2nd deadline. This exceeds the required commitment of 175 hours, 
+ensuring that I will complete the project on time. Additionally, I am hoping 
+to continue working on the project even after GSOC ends. 
+
+# After GSoC
+
+I recognize the value of having our GSoC participants continue to engage with 
+our community beyond the event. This is why I am committed to doing so myself. 
+Participating in open-source projects, especially with a community that supports 
+a widely-used development tool, is not only cool but also offers an opportunity 
+to learn and grow. By continuing to participate in this community, I believe 
+that I can make important contributions and continue to develop my skills.
+
+I am planning to establish an open source club at my university in the near 
+future. The University of Waterloo is known for its strong emphasis on 
+computer science and engineering, earning it the nickname "MIT of the North." 
+Given this, I believe that there will be a great deal of interest in the club 
+for a variety of reasons. Currently, there is another club called Blueprint 
+that provides a valuable opportunity for real-world development experience 
+through developing software products for charities. However, the entry process 
+for this club is extremely competitive. By contrast, I think that an open source 
+club would offer a similar experience but with a lower barrier to entry, thus 
+making it accessible to more motivated students. Additionally, given the 
+widespread use and vibrant community of Git, I plan to direct students to this 
+community and am confident that many will be interested in contributing to its 
+open source projects.
+
+# Some Credits to Myself
+
+I’ve previously completed three software developer internships at
+companies ranging from small startups to large enterprises. I am currently
+interning with Morgan Stanley on the architecture team, working on
+large-scale equity management software.
+
+I'm interested in open source development as a way to give back to the
+community while also growing as a developer. My background in the C
+programming language has made me particularly interested in contributing
+to Git, which is primarily written in C. I am also comfortable with concepts
+like memory allocation, thanks to my experience with C. Furthermore, I have
+studied shell scripting as part of my coursework, which makes me well-equipped
+to handle the project's language requirements. Another personal motivation
+for contributing to this project is that I have worked with monorepos before,
+and given that they are used by many of the larger tech companies, I want to
+learn more about them and help improve the user experience of working with them.
+
+Victoria mentioned that I was the first person to express interest in the 
+project this year, either directly or via the mailing list. In my spare time, 
+I've been contributing and reading documents while also working a full-time 
+job (internship) and taking one course at my university. I expect to have a 
+lot more time next term, so you can expect even more from me ;). Nonetheless, 
+I became familiar and comfortable with the contribution process by writing, 
+responding to, and auditing various types of patches in the community.
+
+With the patches I have submitted so far, I have been able to develop a deeper 
+understanding of Git internals, project structures, commonly used APIs, test 
+suites, required tech stacks, and coding guidelines. To further enhance my 
+comprehension of Git, I have either read or skimmed through several relevant 
+documents, including 'Submitting patches', 'Coding guidelines', 
+'Myfirstcontribution.txt', 'Git tutorial', 'Git everyday', 'readme', 
+'Hacking Git', drawing upon my prior knowledge where applicable. Additionally, 
+I have been referring to the book 'Pro Git' on an as-needed basis. Furthermore, 
+I have thoroughly read and referenced blogs such as 'Make your monorepo feel 
+small with Git's sparse index [12]', 'Bring your monorepo down to size with 
+sparse-checkout [13]', and 'Commits are snapshots, not diffs [14]'. The 
+advantage of having prior knowledge and experience with my proposed project 
+idea is that I am well-prepared to tackle any upcoming challenges.
+
+[12]: https://github.blog/2021-11-10-make-your-monorepo-feel-small-with-gits-sparse-index/
+[13]: https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/
+[14]: https://github.blog/2020-12-17-commits-are-snapshots-not-diffs/
+
+# Closing remarks
+
+I am very motivated for this project because I have previously worked with 
+monorepos and will most likely have to work with them again in my future 
+internships. As a result, I intend to continue working on remaining 
+commands after GSOC whenever I have free time. 
+
+I'd like to state that I'm a genuinely enthusiastic open-source newcomer 
+who is very much looking forward to this opportunity. I am grateful for 
+the opportunity to contribute to Git's development, and I am committed to 
+working diligently to strengthen the open-source ecosystem. My ultimate goal 
+is to use this opportunity to bring new energy and ideas to the table, and to 
+make meaningful contributions that benefit the entire community.
+
+I am grateful for the community's support, especially Victoria's guidance 
+and feedback. She promptly replied to my inquiries and provided me with 
+several resources that were instrumental in helping me get started on the 
+project. I am truly humbled by the dedication and hard work that the 
+community puts in to nurture and enhance this ecosystem, and I feel 
+fortunate to have received such warm and welcoming support as a new 
+contributor. It is an honor to be a part of this community and to 
+work towards advancing its mission.
+
+Thank you so much for reading through my proposal!
+
+Kind Regards,
+Vivan Garg
+
-- 
2.37.0 (Apple Git-136)



* [RFC][PATCH v3] GSoC 2023 proposal: more sparse index integration
  2023-02-26  8:33 [RFC][PATCH] GSoC 2023 proposal: more sparse index integration Vivan Garg
                   ` (2 preceding siblings ...)
  2023-02-27  0:46 ` [RFC][PATCH v2] " Vivan Garg
@ 2023-03-23  6:38 ` Vivan Garg
  2023-03-23  6:50   ` Vivan Garg
  2023-03-28 16:20   ` Victoria Dye
  3 siblings, 2 replies; 11+ messages in thread
From: Vivan Garg @ 2023-03-23  6:38 UTC (permalink / raw)
  To: git, vdye; +Cc: christian.couder, hariom18599, Vivan Garg

Signed-off-by: Vivan Garg <gvivan6@gmail.com>
---
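As a rough illustration of how a sparse index shrinks the index (see
the Synopsis below), here is a small Python sketch that replaces every
directory lying entirely outside the cone with a single directory
entry. It is a simplified model, not Git's implementation: the name
`collapse_index` is hypothetical, entries are plain path strings, and
the tree object ID that a real sparse directory entry carries is
omitted. Collapsed directories are marked here with a trailing slash.

```python
from typing import List, Set


def collapse_index(paths: List[str], cone: Set[str]) -> List[str]:
    """Collapse out-of-cone directories into single directory entries."""
    result: List[str] = []
    for path in sorted(paths):
        parts = path.split("/")
        entry = path
        for depth in range(1, len(parts)):
            prefix = "/".join(parts[:depth])
            if prefix in cone:
                break  # inside the cone: keep the file entry
            if not any(c.startswith(prefix + "/") for c in cone):
                # prefix is neither in the cone nor an ancestor of a
                # cone directory, so everything below it is outside.
                entry = prefix + "/"
                break
        # Sorted input keeps files of one directory adjacent, so
        # comparing with the last entry is enough to deduplicate.
        if not result or result[-1] != entry:
            result.append(entry)
    return result
```

With `cone = {"src/lib"}`, the paths `docs/a.txt` and `docs/sub/b.txt`
collapse into the single entry `docs/`, and the files under `src/app`
collapse into `src/app/`, while `src/Makefile` and `src/lib/x.c` stay
as individual file entries.
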
 .../More-Sparse-Index-Integrations.txt        | 319 ++++++++++++++++++
 1 file changed, 319 insertions(+)
 create mode 100644 Documentation/More-Sparse-Index-Integrations.txt

diff --git a/Documentation/More-Sparse-Index-Integrations.txt b/Documentation/More-Sparse-Index-Integrations.txt
new file mode 100644
index 0000000000..a1812b4e64
--- /dev/null
+++ b/Documentation/More-Sparse-Index-Integrations.txt
@@ -0,0 +1,319 @@
+# More Sparse Index Integrations
+
+# Personal Information
+
+Full name: Vivan Garg
+
+E-mail: gvivan6@gmail.com 
+Alternate E-mail: v.garg.work@gmail.com
+Tel: (+1)437-987-2678
+
+Education: University of Waterloo (Canada)
+Major: Computer Science and Financial Management (Double-Major)
+Year: Rising Junior
+
+LinkedIn: https://www.linkedin.com/in/gvivan/
+GitHub: https://github.com/gvivan
+Website: https://gvivan.me/
+
+# Before GSoC
+
+## Synopsis
+
+I've chosen the "More Sparse Index Integrations" project idea from the
+SoC 2023 Ideas page. The goal of this project is to integrate the 
+experimental "sparse-index" feature and "sparse-checkout" command with 
+existing Git commands. 
+
+Git 2.25.0 introduced a new experimental `git sparse-checkout` command, 
+which simplified the existing feature and improved performance for 
+large repositories. It allows users to restrict their working directory 
+to only the files they care about, allowing them to ensure the developer 
+workflow is as fast as possible while maintaining all the benefits of a 
+monorepo. 
+(Bring your monorepo down to size with sparse-checkout [1], Stolee).
+
+The pattern matching process in Git's sparse-checkout feature becomes 
+expensive as the sparse-checkout file and repository size increase, 
+growing quadratically. This can result in billions of pattern checks 
+for large repositories. However, Git's new mechanism for matching based 
+on folder prefix matches drops the quadratic growth, matching M patterns 
+across N files in O(M+N*d) time, where d is the maximum folder depth of a file. 
+To further optimize the matching process, Git inspects files in a sorted 
+order instead of an arbitrary order. When Git evaluates a file path, it 
+checks whether the start of the folder path matches a recursive pattern exactly. 
+If so, it marks everything in that folder as "included" without doing any further 
+hashset lookups. Similarly, when Git detects the start of a folder that's outside 
+of the specified cone, it marks everything in that folder as "excluded" without 
+doing any further hashset lookups. This reduces the time to be closer to O(M+N) 
+(Bring your monorepo down to size with sparse-checkout [1], Stolee).
+
+[1]: https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/
+
+The Git Fundamentals team at GitHub has contributed a new feature to Git called 
+the sparse index, which allows the index to focus on the files within the 
+sparse-checkout cone in a monorepo. The sparse index stores only the information 
+about the files within the sparse-checkout definition, instead of storing information 
+for every file at HEAD, which can make the index much larger in a monorepo. When 
+enabled with other performance features, the sparse index can have a significant 
+impact on performance (Make your monorepo feel small with Git’s sparse index [2], Stolee).
+
+[2]: https://github.blog/2021-11-10-make-your-monorepo-feel-small-with-gits-sparse-index/
+
+The sparse index differs from a normal "full" index in that it can store directory 
+paths with the object ID for its tree object. It can be used to determine if an 
+entire directory is out of the sparse-checkout cone and replace all of its contained 
+file paths with a single directory path. The use of sparse index can significantly 
+reduce the size of the index, resulting in faster operations 
+(Make your monorepo feel small with Git’s sparse index [3], Stolee).
+
+[3]: https://github.blog/2021-11-10-make-your-monorepo-feel-small-with-gits-sparse-index/
+
+Because "sparse-checkout" and "sparse-index" may influence the logic of 
+other Git commands and Git's internal data structures, some work is required to 
+optimize compatibility and user experience. That is exactly what my chosen idea proposes.
+
+## Benefits to Community
+
+By joining the community and working on this idea, I can collaborate with my mentor 
+and fellow community members to improve the user experience for people who are working 
+with large monorepos. Furthermore, I am committed to continuing my involvement beyond 
+the GSoC program, not only by contributing to the community but also by sharing my 
+experiences and mentoring future potential newcomers.
+
+
+## Microproject
+
+t4121: modernize test style [4]
+Status: WIP
+Description: The test scripts in t4121-apply-diffs.sh are written in the old style, 
+where the test_expect_success command and the test title are on
+separate lines. This patch updates the tests to follow the new style.
+
+## Other Contributions
+
+### Reviewing
+
+t9700: modernize test script [5]
+Status: WIP
+Description: I reviewed this patch and pointed the contributor in the right direction 
+by providing examples and links and by pointing out best practices.
+
+### Patches
+
+MyFirstContribution: add note about SMTP server config [6]
+Status: WIP
+Description: The documentation on using git-send-email previously mentioned the need 
+to configure git for your operating system and email provider, but did not provide 
+specific details on the relevant configuration settings. This commit adds a note 
+specifying that the relevant settings can be found under the 'sendemail' section of 
+Git's configuration file, with a link to the relevant documentation. The aim is to 
+provide users with a more complete understanding of the configuration process and 
+help them avoid potential roadblocks in setting up git-send-email.
+
+[4]: https://lore.kernel.org/git/CACzddJrZ8YdJ72ng3UpMGN9CJx0qW1+fZfyi3q01z2487V8fxw@mail.gmail.com/T/#md53157af31a3f347dd899679fafdea7fcaf7ecfc
+[5]: https://lore.kernel.org/git/CADupsJPpZnjA=Pu_RZZZXy7Titj3UD7ppww48KvcHHHbrGx=rw@mail.gmail.com/T/#m122db9bdca463c12f0b9ccb259fd1d3229d75945
+[6]: https://lore.kernel.org/git/20230222011317.97943-1-gvivan6@gmail.com/
+
+
+### Related Work
+
+Prior work on this idea has been completed by my mentors and other community members, 
+and it provides a good approximation of the approach I intend to take. Here 
+are some examples of earlier integrations:
+
+Integration with “mv” [7]
+Integration with “reset” [8]
+Integration with “sparse-checkout” [9]
+Integration with “clean” [10]
+Integration with “blame” [11]
+
+[7]: https://lore.kernel.org/git/20220331091755.385961-1-shaoxuan.yuan02@gmail.com/
+[8]: https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/
+[9]: https://lore.kernel.org/git/pull.1208.v3.git.1653313726.gitgitgadget@gmail.com/
+[10]: https://github.com/git/git/commit/1e9e10e04891a13e5ccd52b36cfadc55dfaa5066
+[11]: https://github.com/git/git/commit/add4c864b60766174ad4f74ba7be17e66d61ef16
+
+# In GSoC
+
+## Plan
+
+Plan
+
+The proposed idea of increasing "sparse-index" integrations may appear straightforward 
+initially. However, after reviewing previous implementations, I have found that this 
+idea can present unforeseen difficulties for some functions. For example, to enable 
+"sparse-index," we must ensure that "sparse-checkout" is compatible with the target 
+Git command. Achieving this compatibility requires modifying the original command 
+logic, which can lead to other unanticipated issues. Therefore, in addition to the 
+steps proposed by the community and mentors, I have incorporated extra steps into 
+the plan outlined below to proactively address potential complications.
+
+1.	Conduct an investigation to determine if a Git command functions properly with 
+    sparse-checkout. This step is estimated to take approximately 7-14 days.
+
+2.	Modify the logic of the Git command, if necessary, to ensure it functions 
+    properly with sparse-checkout. Develop corresponding tests to validate the 
+    modifications. This step is estimated to take approximately 7-14 days.
+
+3.	Add tests to t1092-sparse-checkout-compatibility.sh for the built-in, focusing 
+    on what happens for paths outside of the sparse-checkout cone.
+
+4.	Disable the command_requires_full_index setting in the built-in and ensure 
+    the tests pass.
+
+5.	If the tests do not pass, then alter the logic to work with the sparse index.
+
+6.	Add tests to check that a sparse index stays sparse.
+
+7.	Add performance tests to demonstrate speedup.
+
+8.	If any changes are made that affect the behavior of the Git command, update 
+    the documentation accordingly. Note that such changes should be rare.
+
+Points 3-8 combined should take approximately 15-30 days.
+
+To summarize, each integration will follow a similar schedule to the one outlined 
+above. Therefore, without extending the timeline, we can expect to complete 2-3 
+integrations during the GSoC program period.
+
+Timeline 
+
+Determining the exact time arrangement for each integration is difficult, as there 
+may be unforeseen challenges that arise during the process. However, based on my 
+estimation, I anticipate that each integration will take approximately 1.5 - 2 months 
+to complete, starting from May 29th. Please refer to the detailed breakdown of each 
+step in the plan section for a more accurate estimate.
+The proposed integration schedule is as follows:
+
+•	git describe
+•	git write-tree
+•	git diff-files
+
+This schedule is based on the order of difficulty outlined in GSoC 2023 Ideas.
+
+It's worth noting that each integration may require different amounts of time 
+and attention, and modifications to the schedule may be necessary as I delve 
+deeper into each command. Nevertheless, I am committed to delivering quality 
+results within the given timeframe.
+
+In summary, I anticipate that each integration will take an average of 1.5 months, 
+but I remain flexible and open to adjusting the schedule as needed to ensure the 
+success of the project.
+	
+Availability
+
+I commit to responding to all communication daily and being available throughout 
+the duration of the program. While I will be taking some summer courses at my 
+university, I will not be enrolled in a typical full course load. As part of GSoC, 
+I plan to commit to a medium-sized project of 175 hours. I have experience managing 
+my time effectively while taking courses and working full-time internships in the 
+past.
+
+The program is officially 16 weeks long. To ensure timely completion of the project, 
+I plan to spend 8 hours per week until August 15th, which is when my semester ends. 
+From August 16th until September 1st, I plan to dedicate 8 hours per day to the project. 
+There are only three weeks during which I would prefer to focus on other things: 
+June 23rd-30th (midterm week) and August 1st-15th (finals season). However, as I will be 
+committing 8 hours per day after August 15th, that should be ample time to make up for it.
+
+I am confident that I will have ample time to complete the project within the allocated 
+time frame. Additionally, I am hoping to continue working on the project even after 
+GSoC ends, as there are several commands that still need to be integrated.
+
+
+# After GSoC
+
+I recognize the value of having our GSoC participants continue to engage with 
+our community beyond the event. This is why I am committed to doing so myself. 
+Participating in open-source projects, especially with a community that supports 
+a widely-used development tool, is not only cool but also offers an opportunity 
+to learn and grow. By continuing to participate in this community, I believe 
+that I can make important contributions and continue to develop my skills.
+
+I am planning to establish an open source club at my university in the near 
+future. The University of Waterloo is known for its strong emphasis on 
+computer science and engineering, earning it the nickname "MIT of the North." 
+Given this, I believe that there will be a great deal of interest in the club 
+for a variety of reasons. Currently, there is another club called Blueprint 
+that provides a valuable opportunity for real-world development experience 
+through developing software products for charities. However, the entry process 
+for this club is extremely competitive. By contrast, I think that an open source 
+club would offer a similar experience but with a lower barrier to entry, thus 
+making it accessible to more motivated students. Additionally, given the 
+widespread use and vibrant community of Git, I plan to direct students to this 
+community and am confident that many will be interested in contributing to its 
+open source projects.
+
+# Some Credits to Myself
+
+I’ve previously completed three software developer internships, working 
+at companies ranging from small startups to large enterprises. I am currently 
+interning with Morgan Stanley on the architecture team, working on 
+large-scale equity management software. 
+
+I'm interested in open source development as a way to give back to the 
+community while also growing as a developer. My background in the C programming 
+language has made me particularly interested in contributing to Git, which 
+is primarily written in C. I am also comfortable with concepts like memory 
+allocation, thanks to my experience with C programming. Furthermore, I have 
+studied shell scripting as part of my coursework, which makes me well-equipped
+to handle the project's language requirements. Another personal motivation 
+for contributing to this project is that I have worked with monorepos before, 
+and given that they are used by many of the larger tech companies, I want to 
+learn more about them and help improve the user experience of working with them.
+
+Victoria mentioned that I was the first person to express interest in the 
+project this year, either directly or via the mailing list. In my spare time, 
+I've been contributing and reading documents while also working a full-time 
+job (internship) and taking one course at my university. I expect to have a 
+lot more time next term, so you can expect even more from me ;). Nonetheless, 
+I became familiar and comfortable with the contribution process by writing, 
+responding to, and auditing various types of patches in the community.
+
+With the patches I have submitted so far, I have been able to develop a deeper 
+understanding of Git internals, project structures, commonly used APIs, test 
+suites, required tech stacks, and coding guidelines. To further enhance my 
+comprehension of Git, I have either read or skimmed through several relevant 
+documents, including 'Submitting patches', 'Coding guidelines', 
+'MyFirstContribution.txt', 'Git tutorial', 'Git everyday', 'readme', and 
+'Hacking Git', drawing upon my prior knowledge where applicable. Additionally, 
+I have been referring to the book 'Pro Git' on an as-needed basis. Furthermore, 
+I have thoroughly read and referenced blogs such as 'Make your monorepo feel 
+small with Git's sparse index [12]', 'Bring your monorepo down to size with 
+sparse-checkout [13]', and 'Commits are snapshots, not diffs [14]'. The 
+advantage of having prior knowledge and experience with my proposed project 
+idea is that I am well-prepared to tackle any upcoming challenges.
+
+[12]: https://github.blog/2021-11-10-make-your-monorepo-feel-small-with-gits-sparse-index/
+[13]: https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/
+[14]: https://github.blog/2020-12-17-commits-are-snapshots-not-diffs/
+
+# Closing remarks
+
+I am very motivated for this project because I have previously worked with 
+monorepos and will most likely have to work with them again in my future 
+internships. As a result, I intend to continue working on the remaining 
+commands after GSoC whenever I have free time. 
+
+I'd like to state that I'm a genuinely enthusiastic open-source newcomer 
+who is very much looking forward to this opportunity. I am grateful for 
+the opportunity to contribute to Git's development, and I am committed to 
+working diligently to strengthen the open-source ecosystem. My ultimate goal 
+is to use this opportunity to bring new energy and ideas to the table, and to 
+make meaningful contributions that benefit the entire community.
+
+I am grateful for the community's support, especially Victoria's guidance 
+and feedback. She promptly replied to my inquiries and provided me with 
+several resources that were instrumental in helping me get started on the 
+project. I am truly humbled by the dedication and hard work that the 
+community puts in to nurture and enhance this ecosystem, and I feel 
+fortunate to have received such warm and welcoming support as a new 
+contributor. It is an honor to be a part of this community and to 
+work towards advancing its mission.
+
+Thank you so much for reading through my proposal!
+
+Kind Regards,
+Vivan Garg
+
-- 
2.37.0 (Apple Git-136)



* Re: [RFC][PATCH v3] GSoC 2023 proposal: more sparse index integration
  2023-03-23  6:38 ` [RFC][PATCH v3] " Vivan Garg
@ 2023-03-23  6:50   ` Vivan Garg
  2023-03-23 13:38     ` Derrick Stolee
  2023-03-28 16:20   ` Victoria Dye
  1 sibling, 1 reply; 11+ messages in thread
From: Vivan Garg @ 2023-03-23  6:50 UTC (permalink / raw)
  To: git, vdye; +Cc: christian.couder, hariom18599

I have taken into account Victoria's suggestions and made the necessary
changes to the previous draft. I would appreciate any feedback on the
revised version.

Additionally, I wanted to inform you that I had planned to begin working
on git describe, but unfortunately, I broke my ankle while skiing two to
three weeks ago. This unexpected event caused a delay in my plans.
However, now that I have adjusted to my new lifestyle, I am confident
that I can resume working on it.


* Re: [RFC][PATCH v3] GSoC 2023 proposal: more sparse index integration
  2023-03-23  6:50   ` Vivan Garg
@ 2023-03-23 13:38     ` Derrick Stolee
  0 siblings, 0 replies; 11+ messages in thread
From: Derrick Stolee @ 2023-03-23 13:38 UTC (permalink / raw)
  To: Vivan Garg, git, vdye; +Cc: christian.couder, hariom18599

On 3/23/2023 2:50 AM, Vivan Garg wrote:
> I have taken into account Victoria's suggestions and made the necessary
> changes to the previous draft. I would appreciate any feedback on the
> revised version.
> 
> Additionally, I wanted to inform you that I had planned to begin working
> on git describe, but unfortunately, I broke my ankle while skiing two to
> three weeks ago. This unexpected event caused a delay in my plans.
> However, now that I have adjusted to my new lifestyle, I am confident
> that I can resume working on it.

Ouch! Hopefully things are healing well and the stress of this project
won't delay that.

Thanks,
-Stolee


* Re: [RFC][PATCH v3] GSoC 2023 proposal: more sparse index integration
  2023-03-23  6:38 ` [RFC][PATCH v3] " Vivan Garg
  2023-03-23  6:50   ` Vivan Garg
@ 2023-03-28 16:20   ` Victoria Dye
  2023-03-28 17:54     ` Vivan Garg
  1 sibling, 1 reply; 11+ messages in thread
From: Victoria Dye @ 2023-03-28 16:20 UTC (permalink / raw)
  To: Vivan Garg, git; +Cc: christian.couder, hariom18599

Vivan Garg wrote:

Hi Vivan, 

Sorry for the delay in re-reviewing! You've largely addressed my original
comments, so I only had a few follow-up questions/notes to add.

> +# In GSoC
> +
> +## Plan
> +
> +Plan
> +
> +The proposed idea of increasing "sparse-index" integrations may appear straightforward 
> +initially. However, after reviewing previous implementations, I have found that this 
> +idea can present unforeseen difficulties for some functions. For example, to enable 
> +"sparse-index," we must ensure that "sparse-checkout" is compatible with the target 
> +Git command. Achieving this compatibility requires modifying the original command 
> +logic, which can lead to other unanticipated issues. Therefore, I have incorporated 
> +additional steps in the plan, to the steps proposed by the community and mentors, 
> +outlined below to proactively address potential complications.
> +
> +1.	Conduct an investigation to determine if a Git command functions properly with 
> +    sparse-checkout. This step is estimated to take approximately 7-14 days.
> +
> +2.	Modify the logic of the Git command, if necessary, to ensure it functions 
> +    properly with sparse-checkout. Develop corresponding tests to validate the 
> +    modifications. This step is estimated to take approximately 7-14 days.

I'm guessing these two steps will be much shorter if the command is already
compatible with sparse-checkout (<7 days for step 1, and you could skip step
2 entirely)?

> +
> +3.	Add tests to t1092-sparse-checkout-compatibility.sh for the built-in, focusing 
> +    on what happens for paths outside of the sparse-checkout cone.
> +
> +4.	Disable the command_requires_full_index setting in the built-in and ensure 
> +    the tests pass.
> +
> +5.	If the tests do not pass, then alter the logic to work with the sparse index.
> +
> +6.	Add tests to check that a sparse index stays sparse.
> +
> +7.	Add performance tests to demonstrate speedup.
> +
> +8.	If any changes are made that affect the behavior of the Git command, update 
> +    the documentation accordingly. Note that such changes should be rare.
> +
> +Points 3-8 combined should take approximately 15-30 days.

Does this also account for the time _after_ submission to the mailing list?
Responding to review comments, iterating on changes, etc?

> +
> +To summarize, each integration will follow a similar schedule to the one outlined 
> +above. Therefore, without extending the timeline, we can expect to complete 2-3 
> +integrations during the GSoC program period.
> +
> +Timeline 
> +
> +Determining the exact time arrangement for each integration is difficult, as there 
> +may be unforeseen challenges that arise during the process. However, based on my 
> +estimation, I anticipate that each integration will take approximately 1.5 - 2 months 
> +to complete, starting from May 29th. Please refer to the detailed breakdown of each 
> +step in the plan section for a more accurate estimate.
> +The proposed integration schedule is as follows:
> +
> +•	git describe
> +•	git write-tree
> +•	git diff-files

At this point, initial integrations for both 'git describe' [1] and 'git
diff-files' [2] have been submitted to the mailing list. To make your plan
more flexible/resilient to concurrent contributions, I think it'd be
reasonable to give a list of 5-6 commands you'll choose from to complete
your 2-3 planned integrations.

[1] https://lore.kernel.org/git/pull.1480.git.git.1679926829475.gitgitgadget@gmail.com/
[2] https://lore.kernel.org/git/20230322161820.3609-1-cheskaqiqi@gmail.com/

> +
> +This schedule is based on the order of difficulty outlined in GSoC 2023 Ideas.
> +
> +It's worth noting that each integration may require different amounts of time 
> +and attention, and modifications to the schedule may be necessary as I delve 
> +deeper into each command. Nevertheless, I am committed to delivering quality 
> +results within the given timeframe.
> +
> +In summary, I anticipate that each integration will take an average of 1.5 months, 
> +but I remain flexible and open to adjusting the schedule as needed to ensure the 
> +success of the project.
> +	
> +Availability
> +
> +I commit to responding to all communication daily and being available throughout 
> +the duration of the program. While I will be taking some summer courses at my 
> +university, I will not be enrolled in a typical full course load. As part of GSOC, 
> +I plan to commit to a medium-sized project of 175 hours. I have experience managing 
> +my time effectively while taking courses and working full-time internships in the 
> +past.
> +
> +The program is officially 16 weeks long. To ensure timely completion of the project, 
> +I plan to spend 8 hours per week until August 15th, which is when my semester ends. 
> +From August 16th until September 1st, I plan to dedicate 8 hours per day to the project. 
> +There are only three weeks during which I would prefer to focus on other things: 
> +June 23rd-30th (midterm week) and August 1st-15th (finals season). However, as I will be 
> +committing 8 hours per day following Aug 15th, it should be ample enough to make up for it.

Thanks for adding these availability details!

> +
> +I am confident that I will have ample time to complete the project within the allocated 
> +time frame. Additionally, I am hoping to continue working on the project even after 
> +GSOC ends, as there are several functions that need to be implemented.
> +


* Re: [RFC][PATCH v3] GSoC 2023 proposal: more sparse index integration
  2023-03-28 16:20   ` Victoria Dye
@ 2023-03-28 17:54     ` Vivan Garg
  0 siblings, 0 replies; 11+ messages in thread
From: Vivan Garg @ 2023-03-28 17:54 UTC (permalink / raw)
  To: Victoria Dye; +Cc: git, christian.couder, hariom18599

> Hi Vivan,
>
> Sorry for the delay in re-reviewing! You've largely addressed my original
> comments, so I only had a few follow-up questions/notes to add.

Thanks for re-reviewing!!

>
> I'm guessing these two steps will be much shorter if the command is already
> compatible with sparse-checkout (<7 days for step 1, and you could skip step
> 2 entirely)?

Yep, you got that right! Perhaps I'll add an optional tag to step 2 to
indicate that it isn't required for each command.

> Does this also account for the time _after_ submission to the mailing list?
> Responding to review comments, iterating on changes, etc?

It does account for time to reiterate until it reaches a reasonable state
(similar to my microproject [1], in the sense that even though it has not
yet been merged, it has received one approval), after which I plan to start
working on the next command and continue reviewing the patch for any
minor changes that may be required.

[1] https://lore.kernel.org/git/CACzddJrZ8YdJ72ng3UpMGN9CJx0qW1+fZfyi3q01z2487V8fxw@mail.gmail.com/T/#m792fa5cc6c77c5ccb114b488beb72c1ea6145e34

> At this point, initial integrations for both 'git describe' [1] and 'git
> diff-files' [2] have been submitted to the mailing list. To make your plan
> more flexible/resilient to concurrent contributions, I think it'd be
> reasonable to give a list of 5-6 commands you'll choose from to complete
> your 2-3 planned integrations.

I will do that! I didn't realise integration for 'git describe' had begun until
last week, when I began working on it. I believe I will have to abandon the work
I did over the past week because someone else started working on it
before me. However, I also feel that I might not be able to squeeze out enough
time in the coming week to be able to start and push another command
integration before the application deadline (Apr 4th).


>
> [1] https://lore.kernel.org/git/pull.1480.git.git.1679926829475.gitgitgadget@gmail.com/
> [2] https://lore.kernel.org/git/20230322161820.3609-1-cheskaqiqi@gmail.com/
>
> > +
> > +This schedule is based on the order of difficulty outlined in GSoC 2023 Ideas.
> > +
> > +It's worth noting that each integration may require different amounts of time
> > +and attention, and modifications to the schedule may be necessary as I delve
> > +deeper into each command. Nevertheless, I am committed to delivering quality
> > +results within the given timeframe.
> > +
> > +In summary, I anticipate that each integration will take an average of 1.5 months,
> > +but I remain flexible and open to adjusting the schedule as needed to ensure the
> > +success of the project.
> > +
> > +Availability
> > +
> > +I commit to responding to all communication daily and being available throughout
> > +the duration of the program. While I will be taking some summer courses at my
> > +university, I will not be enrolled in a typical full course load. As part of GSOC,
> > +I plan to commit to a medium-sized project of 175 hours. I have experience managing
> > +my time effectively while taking courses and working full-time internships in the
> > +past.
> > +
> > +The program is officially 16 weeks long. To ensure timely completion of the project,
> > +I plan to spend 8 hours per week until August 15th, which is when my semester ends.
> > +From August 16th until September 1st, I plan to dedicate 8 hours per day to the project.
> > +There are only three weeks during which I would prefer to focus on other things:
> > +June 23rd-30th (midterm week) and August 1st-15th (finals season). However, as I will be
> > +committing 8 hours per day following Aug 15th, it should be ample enough to make up for it.
>
> Thanks for adding these availability details!
>
> > +
> > +I am confident that I will have ample time to complete the project within the allocated
> > +time frame. Additionally, I am hoping to continue working on the project even after
> > +GSOC ends, as there are several functions that need to be implemented.
> > +


end of thread, other threads:[~2023-03-28 17:54 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-26  8:33 [RFC][PATCH] GSoC 2023 proposal: more sparse index integration Vivan Garg
2023-02-26  9:03 ` Ashutosh Pandey
2023-02-26 23:18   ` Vivan Garg
2023-02-26 23:03 ` Victoria Dye
2023-02-26 23:52   ` Vivan Garg
2023-02-27  0:46 ` [RFC][PATCH v2] " Vivan Garg
2023-03-23  6:38 ` [RFC][PATCH v3] " Vivan Garg
2023-03-23  6:50   ` Vivan Garg
2023-03-23 13:38     ` Derrick Stolee
2023-03-28 16:20   ` Victoria Dye
2023-03-28 17:54     ` Vivan Garg
