Hi,

[replacing Ramsay's email address with a working one]

On Fri, 12 May 2017, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
> 
> > diff --git a/compat/regex/README b/compat/regex/README
> > new file mode 100644
> > index 0000000000..345d322d8c
> > --- /dev/null
> > +++ b/compat/regex/README
> > @@ -0,0 +1,21 @@
> > +This is the Git project's copy of the GNU awk (Gawk) regex
> > +engine. It's used when Git is built with e.g. NO_REGEX=NeedsStartEnd,
> > +or when the C library's regular expression functions are otherwise
> > +deficient.
> > +
> > +This is not a fork, but a source code copy. Upstream is the Gawk
> > +project, and the sources should be periodically updated from their
> > +copy, which can be done with:
> > +
> > +    for f in $(find . -name '*.[ch]' -printf "%f\n"); do wget http://git.savannah.gnu.org/cgit/gawk.git/plain/support/$f -O $f; done
> > +
> > +For ease of maintenance, and to intentionally make it inconvenient to
> > +diverge from upstream (since it makes it harder to re-merge) any local
> > +changes should be stored in the patches/ directory, which after doing
> > +the above can be applied as:
> > +
> > +    for p in patches/*; do patch -p3 < $p; done
> > +
> > +For any changes that aren't specific to the git.git copy please submit
> > +a patch to the Gawk project and/or to the GNU C library (the Gawk
> > +regex engine is a periodically & forked copy from glibc.git).
> 
> I am not a huge fan of placing patch files under version control.
> 
> If I were doing the "code drop from the outside world from time to
> time", I'd rather do the following every time we update:
> 
>  - have a topic branch for importing version N+1, and in its first
>    commit, replace compat/regex/ with the pristine copy of the files
>    we'll borrow from version N+1.
> 
>  - ask "git log -p compat/regex/" to grab all changes made to the
>    directory, and stop at the commit that imported the pristine copy
>    of the files we borrowed from version N.  These are the changes
>    we made to the pristine copy of version N to adjust it to our
>    needs.
> 
>  - cherry-pick these patches on the topic branch; some of them
>    hopefully have been upstreamed, the remainder of the patches are
>    presumably to adjust the code to our local needs.
> 
>  - make more changes, while still on the topic branch, to adjust the
>    code to our local and current needs.
> 
>  - once the result becomes buildable and tests OK, merge it back to
>    the mainline.
> 
> This may break bisectability, but I think it is OK (you should be
> able to skip and test only first-parent chain, treating as if these
> are squashed together into a single change).  The patch files your
> approach is keeping will become the individual patches on the topic
> branch, and will be explained and justified the same way as any
> other patches in their commit log message.
> 
> Having said all that, since I am not expecting to be the primary one
> working in this area, I'll let you (who I take to be volunteering to
> be the one) pick the approach that you would find the easiest and
> least error prone to handle this task.

FWIW I agree that Junio's proposed strategy would make more sense, with
one addition of my own:

- rather than scraping the files from the CGit website (which does not
  guarantee that the first scraped file will be from the same revision as
  the last scraped file), I would very strongly prefer the files to be
  copied from a clone of gawk.git, and the gawk.git revision from which
  they were copied should be recorded in git.git's commit adding them.
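
  A minimal sketch of that approach, assuming a local clone of gawk.git
  and that the regex sources live under support/ there (the path the
  wget URL above uses). The function name and its arguments are made up
  for illustration; nothing like it exists in git.git:

```shell
# Hypothetical helper: refresh every *.c/*.h file already present in
# $dest from ONE commit of a local gawk.git clone, then print that
# commit so it can be recorded in the git.git commit message.
import_regex_from_clone () {
    clone=$1    # path to a local clone of gawk.git (assumption)
    dest=$2     # e.g. compat/regex in a git.git worktree

    # Resolve HEAD once, so every file comes from the same revision.
    rev=$(git -C "$clone" rev-parse --verify "HEAD^{commit}") || return 1

    for path in "$dest"/*.[ch]
    do
        test -e "$path" || continue    # glob matched nothing
        f=${path##*/}
        # Write to a temporary name first, so a file missing upstream
        # does not truncate our existing copy.
        git -C "$clone" show "$rev:support/$f" >"$path.new" &&
        mv "$path.new" "$path" || { rm -f "$path.new"; return 1; }
    done

    echo "$rev"
}
```

  Called as rev=$(import_regex_from_clone ../gawk compat/regex), the
  printed revision can then be quoted in the message of the commit that
  adds the pristine files.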

Thanks,
Dscho