git@vger.kernel.org mailing list mirror (one of many)
* [GSoC][RFC][Proposal] Unify ref-filter formats with other pretty formats
From: Jack McGuinness @ 2022-04-19 13:48 UTC
  To: git@vger.kernel.org

Hi, thanks for checking out my proposal. I know that I really couldn't be any later with submitting this, but due to some time constraints I had to make do. Included is the plaintext version of my proposal, along with a link to the Google Doc version, which is what I am submitting to GSoC. While I would love feedback, I waited way too long, so I am just going to turn it in as is.

Thanks for your time,
-Jack McGuinness <jmcguinness2@ucmerced.edu>

Google Doc - https://docs.google.com/document/d/1KhVGkSgBaya8Q6X9z7tUwVJegmurwMS0xmzYj9DRMMY/edit?usp=sharing

Plaintext -


* Re: [GSoC][RFC][Proposal] Unify ref-filter formats with other pretty formats
From: Jack McGuinness @ 2022-04-19 13:49 UTC
  To: git@vger.kernel.org

Unify ref-filter formats with other pretty formats, for the Fourth Time

Contact Info:

Name: Jack McGuinness
Email: jmcguinness2@ucmerced.edu
University: U.C. Merced
Phone number: +1 714-855-9339
GitHub: https://github.com/JackMcGu
Time Zone: Pacific Time (GMT -07:00)

About me:

I am a computer science and engineering student from sunny California, studying at U.C. Merced. I have been using Git at a surface level for years, but recently, through my classes, I have started to use more of its features. I have experience with several programming languages, including C, which would be useful for this project. Over the past 3 years, I have spent a good portion of my time working on personal projects, normally some sort of simulation or creative graphic. I have wanted to contribute to a public project for a while, but I never took the leap because I didn’t think I could bring something worthwhile to the table. However, looking at GSoC and Git, I think that I could make a good contribution. I also spend a few hours each week teaching computer science concepts to underclassmen, which I believe is a skill that would be helpful when writing blog posts to explain what I am doing.

Personal About me:
If I'm not doing something productive, then I’m either biking, reading manga, or playing Magic: The Gathering. While working, I’m normally listening to a music playlist that is an amalgamation of hyperpop, 2000s rock, indie pop, and some hip hop. If I’m writing something by hand, I use a fountain pen, because I just find the way the ink flows soothing.

My Microproject:

For my microproject, I chose to modernize a test script, specifically t4202-log.sh.

I will admit, I had a bumpy run completing my microproject. Going into this, I had no experience with open source, and because of this, my first attempt was done incorrectly. In response, I went back and redid it; however, as of right now, the new version has not been reviewed by anyone. I believe that I corrected all of the improper syntax, but it’s entirely possible I missed something.

Note: I used gitgitgadget to submit my patches.

Patches (patch, purpose, status, link to email):

[GSoC][PATCH] Applicant Microproject - Modernizing t4202
  Purpose: The goal of this patch was to update some of the basic syntax of t4202.
  Status: Reviewed by Derrick Stolee - Done incorrectly.
  Link to Email: https://lore.kernel.org/git/pull.1218.git.1650096550631.gitgitgadget@gmail.com/

[PATCH 0/3] [GSoC][Patch] area: t4202-log.sh, modernizing test script
  Purpose: Cover letter for the following three patches.
  Status: Unreviewed
  Link to Email: https://lore.kernel.org/git/pull.1220.git.1650331876.gitgitgadget@gmail.com/

[PATCH 1/3] [GSoC][Patch] area: t4202-log.sh, modernizing test script
  Purpose: Removed blank lines in test bodies and replaced space indentation with tab indentation.
  Status: Unreviewed
  Link to Email: https://lore.kernel.org/git/0c6aa4e9103b301556437d35acefa535a90e6a1e.1650331876.git.gitgitgadget@gmail.com/

[PATCH 2/3] [GSoC][Patch] area: t4202-log.sh, modernizing test script p2
  Purpose: Removed whitespace after redirect operators.
  Status: Unreviewed
  Link to Email: https://lore.kernel.org/git/e5c29a12df61d0f587594664f50ec025b934fadf.1650331876.git.gitgitgadget@gmail.com/

[PATCH 3/3] [GSoC][Patch] area: t4202-log.sh, modernizing test script p3
  Purpose: Split up multiple-command lines to have one command per line.
  Status: Unreviewed
  Link to Email: https://lore.kernel.org/git/4f0f4619806b383f280e8f0d0b9000c189a3b540.1650331876.git.gitgitgadget@gmail.com/


History of Problem:

Like most open source projects, Git has been built up over time by many different people, all working towards a common goal of improvement but with different ideas of how to get there. While this is a large part of why open source software is such a brilliant idea, it can cause confusion within the project.
A prime example of this is formatting command output: different commands that overlap in the data they output use different logic to produce it, so people need to learn a separate system for each command.
Git has a history of mentored contributors working to amend this, starting in Outreachy Round 15, when Olga Telezhnaya mainly worked to migrate the logic in cat-file.c to the logic in ref-filter. Unfortunately, near the end she was forced to scrap her solution and restart, and her main work was never merged to master.
From what I could find, the next person to work on this was Hariom Verma during GSoC 2020. He started his project by looking over the work of his predecessor Olga, and decided to take a different approach to the problem. Over the summer, Hariom implemented a plethora of formatting options in pretty formats, implementing all of them in pretty-lib.
After Hariom, ZheNing Hu carried the torch and worked on the project for GSoC 2021. His main contributions revolved around refactoring cat-file, similar to Olga. However, he also made the notable decision to spend time optimizing the performance of ref-filter.


Proposal: Unify ref-filter formats with other pretty formats

My proposal is one of the ideas provided by Git-SCM: unifying the logic of ref-filter with pretty formats. This means that I would be rewriting the formatting logic currently used in ref-filter so that it can be used in pretty. Alongside this, I also have the goal of adding some small new functionality to the formatting, and possibly optimizing the logic as ZheNing did.
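
To illustrate what unifying the two could mean in practice, here is a minimal sketch in Python (not Git's actual C code). It assumes a hypothetical commit record and shows the `%h`, `%an`, and `%s` placeholders of `git log --format` and the `%(objectname:short)`, `%(authorname)`, and `%(subject)` atoms of `git for-each-ref --format` resolving through one shared table of field getters. The names `COMMIT`, `FIELDS`, `expand_pretty`, and `expand_ref_filter` are made up for this example.

```python
import re

# Hypothetical commit record; the keys are illustrative, not Git's internal structs.
COMMIT = {"hash": "1a2b3c4", "author": "A U Thor", "subject": "Fix typo"}

# One shared table of field getters: the single place where values are produced.
FIELDS = {
    "hash": lambda c: c["hash"],
    "author": lambda c: c["author"],
    "subject": lambda c: c["subject"],
}

# Each syntax only maps its own spelling onto the shared field names.
PRETTY_ALIASES = {"h": "hash", "an": "author", "s": "subject"}
REF_FILTER_ALIASES = {
    "objectname:short": "hash",
    "authorname": "author",
    "subject": "subject",
}


def expand_pretty(fmt, commit):
    """Expand git-log style placeholders such as %h, %an and %s."""
    # Try longer aliases first so "%an" is not read as a shorter placeholder.
    names = sorted(PRETTY_ALIASES, key=len, reverse=True)
    alternatives = "|".join(re.escape(n) for n in names)
    pattern = rf"%({alternatives})"
    return re.sub(pattern, lambda m: FIELDS[PRETTY_ALIASES[m.group(1)]](commit), fmt)


def expand_ref_filter(fmt, commit):
    """Expand for-each-ref style atoms such as %(authorname)."""
    # Unknown atoms raise KeyError here; real code would report a proper error.
    return re.sub(
        r"%\(([^)]+)\)",
        lambda m: FIELDS[REF_FILTER_ALIASES[m.group(1)]](commit),
        fmt,
    )


print(expand_pretty("%h %an: %s", COMMIT))
print(expand_ref_filter("%(objectname:short) %(authorname): %(subject)", COMMIT))
```

Both calls go through the same `FIELDS` table, so a fix or a new field would only need to be added once and then given a spelling in each syntax.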

Benefits of Proposal:

Completing this would improve the quality of life of people contributing to the formatting. The erasure of duplicate logic would make it simpler to understand the logic being used to format, and therefore simpler for a prospective contributor to implement a new feature, or alter a current one.

My Plan:

I looked over the proposals and blogs of the previous undertakers of this proposal, and with their work and struggles in mind, I have compiled the following plan.
Before I start working on the formatting logic, I want to learn, to a usable degree, how to use the following tools:
  - Valgrind
  - GDB
  - Tmux
  - Possibly Gprof and perf
My reasoning is that Olga commented in her blog that if she had started using the debugging tools earlier, she would have been far more on track. I want to go into this already knowing them so that I can apply them when needed and not needlessly waste time.
After this, I would focus on understanding the logic behind the formatting, by studying the relevant files and working on small contributions and patches, to better understand the system in place. From what I can see, most previous contributors took around a month before they started coding their main project. If possible, I would like to figure this out in under a month, but I know that that’s easier said than done.
Once I understand the formatting logic well enough, I want to start implementing the formatting options from other files. As I understand it, major progress has been made in unifying the formatting logic; however, there are still implementations that work separately from what we want the standard to be. Ideally, I would like to spend the majority of my time doing this, along with the debugging that goes along with it.
If I have misjudged the amount of logic left to be rectified, then my plan for my time would be to work to erase the current problems with the formatting logic.
Hariom mentioned that the following problems persisted after he finished GSoC:
  - 30% of log tests are failing
  - pretty-lib.{c,h} does not have apt handling for incorrect formatting
  - Olga's work needs attention
A lot of what ZheNing worked on covers the third problem, so I would like to tackle the first two.
The first problem arose because the branch it was tested on did not have mailmap logic, and the second problem also contributed to it. Because of this, if I go this route, my first step would be to implement handling of incorrect formatting. A simple form of this is already implemented; however, it currently causes a segmentation fault, which would need to be debugged.
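
As a rough idea of what handling an incorrect format could look like, here is a small sketch in Python (again, not the actual pretty-lib code, which is C). It checks a format string up front and reports a readable error instead of failing later. The set `KNOWN_ATOMS`, the function `validate_format`, and the error wording are assumptions made for this example.

```python
# Hypothetical subset of atom names, chosen only for this illustration.
KNOWN_ATOMS = {"objectname", "authorname", "subject"}


def validate_format(fmt):
    """Return the atoms used in fmt, or raise ValueError with a readable message."""
    atoms = []
    pos = 0
    while True:
        start = fmt.find("%(", pos)
        if start == -1:
            # No more atoms: everything seen so far was well-formed.
            return atoms
        end = fmt.find(")", start)
        if end == -1:
            # "%(" was opened but never closed.
            raise ValueError(f"malformed format string: unterminated atom at offset {start}")
        name = fmt[start + 2:end]
        if name not in KNOWN_ATOMS:
            raise ValueError(f"unknown field name: {name!r}")
        atoms.append(name)
        pos = end + 1


print(validate_format("%(objectname) %(subject)"))
for bad in ("%(bogus)", "%(subject"):
    try:
        validate_format(bad)
    except ValueError as err:
        print(f"rejected {bad!r}: {err}")
```

Validating the whole string before any expansion means a bad format is rejected in one place, which is the kind of behavior a segmentation fault should be replaced with.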


Prior Commitments:
To be completely transparent, I already have a job over the summer. It is a part-time job at my local fairgrounds, where I help out in the mornings. It doesn’t take too much time out of my day; I just want to be transparent about it.
I will also be spending a few days attending my cousin’s wedding in June, but I will be able to work on the project during this time, except for the day of the actual wedding.

Projected Timeline:

Prior to work start:
  In the time I have before the official start of work, I want to get to know the community better and gain a good understanding of the workflow. Alongside this, I want to look into the aforementioned software tools and also read the Pro Git book, as Olga said the later chapters were invaluable to her understanding.
Weeks 1-3:
  During the first three weeks, I plan to spend a majority of my time looking into the formatting logic. It is an important step, and if I start working without knowing exactly what I’m doing, then I could make a mistake and end up costing more time than I intended. During this time I want to make small patches and contributions, to keep me in practice and help me develop my relationship with the community.
Weeks 4-9:
  At this point, I want to start my actual work of unifying logic. I’m currently unsure of what file I would start with, but I believe that during the time allotted previously, I would be able to figure this out.
Weeks 10-12:
  I am leaving this time period for debugging and optimization. It is inevitable that what I make won’t work in some unexpected way. To improve my chances of having a master-ready project by the end, I want to make sure I have the time to take the work I've done over the summer and turn it into a polished final product.
After GSoC:
  After GSoC ends I will go back to school, which will limit the amount of time I have available. Even so, I plan to stay connected to the Git community in some way. At the very least, I plan to watch the mailing list and comment on other people's patches, and at the most I would want to keep working on a major part of Git and finish what I started.


Blogging:
I am currently a stranger to blogging, as I never thought of any reason somebody would want to read my thoughts. However, I do keep a private journal that I use to remember what I do each month, and plan out the next month, which I think can be translated to a blog.

If I am accepted, then I plan to have 14 total blog posts over the course of the project: 12 to summarize what I did each week, 1 before I do any work, to provide a reference point for me at the end and to help me collect what I know going in, and 1 at the end, to sum up my experience and describe my work in full.

Motivation:
My motivation for participating in GSoC:
I have always wanted to participate in an open source project, but I never knew how to take the first step. At times I considered contributing to some projects, but I was worried that my commits would get ignored, or that I would do it wrong and waste people's time. When I saw that Google had a program to connect people to mentors and help get new developers into open source, I thought “Wow, that sounds like exactly what I’m looking for.” I didn’t even find out about the stipend until later, which is an obvious plus.


My “Why Git?”:
When the list of organizations participating in GSoC 2022 came out, I went to the list, and compiled my own list of organizations I knew and would want to contribute to. I ended up creating a list of around 10 organizations, but when I looked at it, I just knew that Git stood out among them. It’s something that I use almost daily, and that I have always wanted to know more about the internals of. At that point I started looking into how to apply. To me Git is a backbone of all other programming projects, and exists as a testament to what open source can be.

Closing remarks:

I would be more than overjoyed to be accepted, but even aside from that, I think that I have already learned a lot from GSoC. The materials provided by Google and Git have given me a lot of advice and ideas for how I can personally contribute to something open source. If I don’t get accepted, I will still spend my summer contributing to open source. I may branch out a bit and focus on more than just Git, but the idea of contributing to a public project just excites me, and I know I have to follow through with it.



* Re: [GSoC][RFC][Proposal] Unify ref-filter formats with other pretty formats
From: Christian Couder @ 2022-04-25 13:34 UTC
  To: Jack McGuinness; +Cc: git@vger.kernel.org

On Wed, Apr 20, 2022 at 10:55 PM Jack McGuinness
<jmcguinness2@ucmerced.edu> wrote:
>
> Unify ref-filter formats with other pretty formats, for the Fourth Time

Thanks for your proposal!

> For my microproject, I chose to Modernize a test script, specifically t4202-log.sh
>
> I will admit, I had a bumpy run completing my Microproject. Going into this, I had no experience with open-source, and due to this, my first attempt was done incorrectly. In response to this, I went back and redid it, however as of right now it has not been reviewed by anyone. I believe that I corrected all of the improper syntaxes, but it’s entirely possible I missed something.
>
> Note: I used gitgitgadget to submit my patches.

I just took a look at version 2 and sent a reply with a few comments.
Note that you don't need to wait for all the patches to be fully
reviewed before sending another version.

> History of Problem:
>
> As most open source projects get developed, git has been built up over time by many different people,

s/git/Git/

> working towards a common goal of improvement, but all with different
> ideas of how to get there. While this is a great thing that is the
> reason open source software is such a brilliant idea, it can cause
> confusion within the project.
> A prime example of this is formatting command output, where different commands that overlap in what data they would output have different logic for getting said output, which causes people to need to know separate systems for each command.
> Git has had a history of having mentored contributors work to amend this, starting in Outreach Round 15,

s/Outreach/Outreachy/

Also it might be nice to say when that Outreachy round took place.

> where Olga Telezhnaya mainly worked to migrate the logic in cat-file.c to the logic in ref-filter, but unfortunately near the end, she was forced to scrap her solution and restart, and her main work was never merged to master.

You might want to explain the reason for that a bit more.

> From what I could find, the next person to work on this was Hariom Verma during GSoC 2020. He started his project by looking over the work of his predecessor Olga, and deciding to take a different approach to the problem. Over Hariom’s summer, he implemented a plethora of formatting options in pretty formats, implementing all formatting options in pretty-lib.

Hariom's project was not quite the same as Olga's. Olga's was focused
on `git cat-file` while Hariom's wasn't.

> After Hariom, ZheNing Hu carried the torch and worked on the project for the 2021 GSoC. His main contributions revolved around refactoring cat-file, similar to Olga.

Yeah, ZheNing's project was focused on `git cat-file` like Olga's.

> However, he also made the notable decision to spend time optimizing the performance of ref-filter.

It would be nice if you could say why this decision was made.

> Proposal: Unify ref-filter formats with other pretty formats
>
> My proposal is one of the ideas provided by Git-SCM, unifying the logic of ref-filter with pretty formats. What this means, is that I would be rewriting the formatting logic currently used in ref-filter, to be used in pretty. However, alongside doing this, I also have the goal of adding some small new functionality to the formatting, and possibly optimizing the logic as ZheNing did.

Your project is more like Hariom's than Olga's and ZheNing's as it's
not focused on `git cat-file`. If you can also finish what Olga and
ZheNing started, that would be a really nice bonus outcome though.

> Benefits of Proposal:
>
> Completing this would improve the quality of life of people contributing to the formatting. The erasure of duplicate logic would make it simpler to understand the logic being used to format,

It's likely that the old formatting logic will have to be kept for a
long time for backward compatibility and not breaking existing users
though.

> and therefore simpler for a prospective contributor to implement a new feature, or alter a current one.
>
> My Plan:
>
> I looked over the proposals and blogs of the previous undertakers of this proposal, and with their work and struggles in mind, I have compiled the following plan.
> Before I start working on the formatting logic, I want to learn to a usable degree how to use the following tools:
> Valgrind
> GDB
> Tmux
> Possibly Gprof and perf
> My reasoning for this is that reading Olga's blog, she commented how if she had started using the debugging assets earlier, she would have been far more on track. I want to go into this already knowing them so that I can apply them when needed, and not needlessly waste time.
> After this, I would focus on understanding the logic behind the formatting, by studying the relevant files and working on small contributions and patches, to better understand the system in place. From what I can see, most previous contributors took around a month before they started coding their main project. If possible, I would like to figure this out in under a month, but I know that that’s easier said than done.
> After I understand, to a well enough degree, the formatting logic, I want to start implementing the formatting options from other files.

I am not sure what "the formatting options from other files" refers to.

> As I understand it, major progress has been made in unifying the formatting logic, however, there are still implementations that work separately from what we want the standard to be. Ideally, I would like to spend the majority of my time doing this, along with the debugging that goes along with it.
> If I have misjudged the amount of logic left to be rectified, then my plan for my time would be to work to erase the current problems with the formatting logic.
> Hariom mentioned that the following problems persisted after he finished GSoC:
> 30% of log tests are failing
> pretty-lib.{c,h} does not have apt handling for incorrect formatting
> Olga's work needs attention
> A lot of what ZheNing worked on covers the third problem, so I would like to tackle the first two.

Yes, please.

> The reason for the first problem was due to the branch it was tested on not having mailmap logic, and also the second problem influenced it. Because of this, I think If I go this route, my first step would be to implement incorrect formatting handling. A simple form of this is already implemented, however it currently causes a segmentation fault, which would need to be debugged.

Ok.

> Prior Commitments:
> To be completely transparent, over the summer I already have a job, however, it is a part time job at my local fair grounds where I help out in the mornings. It doesn’t take too much time out of my day, I just want to be transparent about it.

Thanks for being transparent about it...

> I also will be spending a few days attending my cousin’s wedding in June, but I will be able to work on the project during this time, except for the day of the actual wedding.

...and about this.

> Projected Timeline:
>
> Week
> Goal
> Prior to work start
> In the time I have before the official start of work, I want to get to know the community better, and gain a good understanding of the workflow. Alongside this, I want to look into the aforementioned software tools, and also read Pro Git Book, as Olga said the later chapters were invaluable to her understanding.
> 1 - 3
> During the first three weeks, I plan to spend a majority of my time looking into the formatting logic. It is an important step, and if I start working without knowing exactly what I’m doing then I could make a mistake and end up costing more time than I intend to. During this time I want to make small patches and contributions, to keep me in practice and help me develop my relationship with the community.
> 4-9
> At this point, I want to start my actual work of unifying logic. At this point, I’m unsure of what file I would start with, but I believe that during the time allotted previously, I would be able to figure this out.
> 10-12
> I am leaving this time period for debugging and optimization. It is inevitable that what I make won’t work in some unexpected way. In order to best improve my chances of having a master ready project by the end, I want to make sure I have the time to take the work I've done over the summer and turn it into a polished final product.

The issue with that kind of timeline is that reviewers are not likely
to accept your patches if they look buggy or not polished or optimized
enough. So if you plan to only polish, fully debug and optimize
towards the end of your GSoC time, it is likely that nothing will get
merged before that time. Then if you are late (for example because
early steps took more time than planned) and decide for some reason
(which might be a very valid one) to stop working on the project at
the end of the GSoC time, it might mean that nothing will have been
merged.

So I think it would have been better to split the time in a way similar to:

  - weeks 1 - 4: improve incorrect formatting handling
  - weeks 5 - 8: add mailmap support
  - weeks 9 - 12: fix all remaining issues

where at the end of each of these steps hopefully something can be merged.

> After GSoC
> After GSoC ends I will go back to school, which will limit the amount of time I have available. However even so, I plan to stay connected to the git community in some way. At the very least, I plan to watch the mailing list, and provide commentary to other peoples patches, and at the most I would want to keep working on a major part of git, and finish what I started.

Nice!

> Blogging:
> I am currently a stranger to blogging, as I never thought of any reason somebody would want to read my thoughts. However, I do keep a private journal that I use to remember what I do each month, and plan out the next month, which I think can be translated to a blog.

We don't absolutely require blogging (especially public blogging), but
we think it can help you both during and after your GSoC.

> If I am accepted, then I plan to have 14 total blog posts over the course of the project. 12 of them to summarize what I did each week, 1 before I do any work to provide a reference point for me at the end, and to help me collect what I am going in knowing, and 1 at the end, to be the summation of my experience, and describe my experience and work in full.

Nice!

> Motivation:
> My motivation for participating in GSoC:
> I have always wanted to participate in an open source project, but I never knew how to take the first step. At times I considered contributing to some projects, but I was worried that my commits would get ignored, I would do it wrong and waste the time of people. When I say Google had a program

s/say/saw/

> to connect people to mentors and help get new developers into open source, I thought “Wow, that sounds like exactly what I’m looking for.” I didn’t even find out about the stipend until later, which is an obvious plus.
>
> My “Why Git?”:
> When the list of organizations participating in GSoC 2022 came out, I went to the list, and compiled my own list of organizations I knew and would want to contribute to. I ended up creating a list of around 10 organizations, but when I looked at it, I just knew that Git stood out among them. It’s something that I use almost daily, and that I have always wanted to know more about the internals of. At that point I started looking into how to apply. To me Git is a backbone of all other programming projects, and exists as a testament to what open source can be.
>
> Closing remarks:
>
> I would be more than overjoyed if I can be accepted, but even aside from that, I think that I have already learned a lot from GSoC. The materials provided by Google and Git have given me a lot of advice and ideas for how I can personally contribute to something open source. In the case that I don’t get accepted, then I will still spend my summer contributing to open source. I may branch out a bit and focus on more then just git, bit

s/then just git, bit/than just Git, but/

> the idea of contributing to a public project just excites me, and I know I have to follow through with it.

Great, thanks!
