git@vger.kernel.org list mirror (unofficial, one of many)
From: Elijah Newren <newren@gmail.com>
To: Sangeeta NB <sangunb09@gmail.com>
Cc: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>,
	Christian Couder <christian.couder@gmail.com>,
	Git List <git@vger.kernel.org>
Subject: Re: [Outreachy][Proposal] Accelerate rename detection and the range-diff
Date: Mon, 26 Oct 2020 09:52:30 -0700
Message-ID: <CABPp-BF3MEAkJmmLv_0fWBJV_2AMqh_8P7Dqk62c2_Uz9Pa3Lw@mail.gmail.com>
In-Reply-To: <CAHjREB6Hh+urW3j2c9p45ZudSdDv0rUP28Lb4e4TZasqTzRmDA@mail.gmail.com>

Hi and welcome!

On Mon, Oct 26, 2020 at 1:44 AM Sangeeta NB <sangunb09@gmail.com> wrote:
>
> Hey Everyone,
>
> I would love to participate in Outreachy this year with Git on the
> project "Accelerate rename detection and the range-diff command in
> Git". I have contributed to the microproject "Unify the meaning of
> dirty between diff and describe"[1], which is still under review, but
> through the process I have become familiar with the mailing list
> and patch review system. I am also contributing to another issue[2],
> which is still under discussion[3], about `git bisect` and `git
> rebase`.
>
> [1] https://lore.kernel.org/git/pull.751.git.1602781723670.gitgitgadget@gmail.com
> [2] https://github.com/gitgitgadget/git/issues/486
> [3] https://lore.kernel.org/git/pull.765.git.1603271344522.gitgitgadget@gmail.com/
>
> Coming to the project, I have read more about it[4] and have created
> the initial version for the timeline. I would really love to have
> comments on it.
>
> [4] https://github.com/gitgitgadget/git/issues/519

I might be the bearer of some bad or concerning news.  This email is
directed more to the mentors and others on the git mailing list, but
obviously may affect you as well:

I apologize for not stating my concerns more forcefully earlier, but
at the time I didn't have as many details or any idea how fast
merge-ort could be upstreamed.  Anyway, I'm still concerned that this
might not be a good project for Outreachy due to two factors, unclear
benefit and conflicts:

1) I've got merges down to the point where even if there is a massive
rename of 26000 files (e.g. renaming "drivers/" to "pilots/" in the
Linux kernel), rename detection is NOT the long pole in the tent for a
merge.  So although this project is interesting, it's not clear that
it will help us much.  It might be better to get my changes merged
first and then see if there's enough need for additional
optimizations.

2) Ignoring what I've already submitted, the remaining diffstat for
merge-ort is about 5500 lines....
  2a) If I break those ~5500 lines into patches of 50 lines each,
that's about 110 patches.  If I assume I can send 10-20 patches per
week without overwhelming folks, that's 6-11 weeks, pulling us
somewhere into mid-December or mid-January.  10-20 patches per week
might be over-optimistic given reviewer fatigue, which would push it
out even further.
  2b) Work is soon going to rotate me onto other non-git projects,
meaning even if the mailing list can review my changes aggressively,
there's a chance I might not be able to keep up with feeding them to
the list.
  2c) diffcore-rename.c is only ~700 lines right now.  My 5500 lines
of changes include over 1000 new lines for diffcore-rename.c and
about 150 line removals for it.  These changes are spread all over the
file; only four small functions remain untouched.  In fact, I even
made big changes to struct diff_rename_dst too, so any new uses of it
would almost certainly have textual conflicts.
  2d) My diffcore-rename.c changes probably do not make logical sense
to submit first.  They should come after some groundwork is laid for
merge-ort.
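
For anyone who wants to redo the back-of-the-envelope math in 2a with
different numbers, here is a rough sketch.  The 5500 lines, 50 lines
per patch, and 10-20 patches per week come from the estimate above;
using the date of this email as the starting point and rounding up to
whole weeks are my assumptions for illustration, not a real schedule.

```python
import math
from datetime import date, timedelta

REMAINING_LINES = 5500          # approximate remaining merge-ort diffstat
LINES_PER_PATCH = 50            # assumed average patch size
START = date(2020, 10, 26)      # assumption: start counting from this email


def estimate(patches_per_week):
    """Return (patch count, weeks needed, rough finish date)."""
    patches = math.ceil(REMAINING_LINES / LINES_PER_PATCH)
    weeks = math.ceil(patches / patches_per_week)  # round up to whole weeks
    return patches, weeks, START + timedelta(weeks=weeks)


for rate in (20, 10):
    patches, weeks, finish = estimate(rate)
    print(f"{rate} patches/week: {patches} patches, "
          f"{weeks} weeks, finishing around {finish}")
```

With exactly 5500 lines this gives 110 patches and the same 6-11 week
range; a slower review pace only stretches it further.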

Even though at a high level this project is complementary to the
optimizations I made in my 'merge-ort' work, I fear there will be LOTS
of intermediate conflicts as we both make changes to the same areas
at the same time and make a mess of things.

If you all think this is still a good project to have an intern work
on, I'll defer to you, but I am concerned.


Elijah
