Hi,

[replacing Ramsay's email address with a working one]

On Fri, 12 May 2017, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason writes:
>
> > diff --git a/compat/regex/README b/compat/regex/README
> > new file mode 100644
> > index 0000000000..345d322d8c
> > --- /dev/null
> > +++ b/compat/regex/README
> > @@ -0,0 +1,21 @@
> > +This is the Git project's copy of the GNU awk (Gawk) regex
> > +engine. It's used when Git is build with e.g. NO_REGEX=NeedsStartEnd,
> > +or when the C library's regular expression functions are otherwise
> > +deficient.
> > +
> > +This is not a fork, but a source code copy. Upstream is the Gawk
> > +project, and the sources should be periodically updated from their
> > +copy, which can be done with:
> > +
> > +    for f in $(find . -name '*.[ch]' -printf "%f\n"); do wget http://git.savannah.gnu.org/cgit/gawk.git/plain/support/$f -O $f; done
> > +
> > +For ease of maintenance, and to intentionally make it inconvenient to
> > +diverge from upstream (since it makes it harder to re-merge) any local
> > +changes should be stored in the patches/ directory, which after doing
> > +the above can be applied as:
> > +
> > +    for p in patches/*; do patch -p3 < $p; done
> > +
> > +For any changes that aren't specific to the git.git copy please submit
> > +a patch to the Gawk project and/or to the GNU C library (the Gawk
> > +regex engine is a periodically & forked copy from glibc.git).
>
> I am not a huge fan of placing patch files under version control.
>
> If I were doing the "code drop from the outside world from time to
> time", I'd rather do the following every time we update:
>
>  - have a topic branch for importing version N+1, and in its first
>    commit, replace compat/regex/ with the pristine copy of the files
>    we'll borrow from version N+1.
>
>  - ask "git log -p compat/regex/" to grab all changes made to the
>    directory, and stop at the commit that imported the pristine copy
>    of the files we borrowed from version N.  These are the changes
>    we made to the pristine copy of version N to adjust it to our
>    needs.
>
>  - cherry-pick these patches on the topic branch; some of them
>    hopefully have been upstreamed, the remainder of the patches are
>    presumably to adjust the code to our local needs.
>
>  - make more changes, while still on the topic branch, to adjust the
>    code to our local and current needs.
>
>  - once the result becomes buildable and tests OK, merge it back to
>    the mainline.
>
> This may break bisectability, but I think it is OK (you should be
> able to skip and test only first-parent chain, treating as if these
> are squashed together into a single change).  The patch files your
> approach is keeping will become the individual patches on the topic
> branch, and will be explained and justified the same way as any
> other patches in their commit log message.
>
> Having said all that, since I am not expecting to be the primary one
> working in this area, I'll let you (who I take to be volunteering to
> be the one) pick the approach that you would find the easiest and
> least error prone to handle this task.

FWIW I agree that Junio's proposed strategy would make more sense, with
one addition of my own:

- rather than scraping the files from the CGit website (which does not
  guarantee that the first scraped file will be from the same revision as
  the last scraped file), I would very strongly prefer the files to be
  copied from a clone of gawk.git, and the gawk.git revision from which
  they were copied should be recorded in git.git's commit adding them.

Thanks,
Dscho
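
P.S. For illustration only, copying from a clone at one fixed revision
could look roughly like the sketch below. It is not a tested proposal.
The helper name import_regex_sources and its arguments are made up for
this example; it assumes a local clone of gawk.git that keeps the files
under support/ (as in the URL from the README), and it only refreshes
the *.c and *.h files sitting directly in the destination directory.

```shell
#!/bin/sh
# Sketch: refresh compat/regex/ from ONE fixed revision of a local
# gawk.git clone. import_regex_sources is a hypothetical helper name,
# not something that exists in git.git.
#
# usage: import_regex_sources <gawk-clone> <dest-dir> [<revision>]
import_regex_sources () {
	gawk_clone=$1
	dest=$2
	# Resolve the revision once, so that every file is taken from the
	# same commit (the guarantee that scraping CGit cannot give).
	rev=$(git -C "$gawk_clone" rev-parse --verify "${3:-HEAD}^{commit}") ||
		return 1
	for path in "$dest"/*.c "$dest"/*.h
	do
		test -f "$path" || continue
		f=${path##*/}
		# Write to a temporary file first, so that a failed lookup
		# leaves the existing copy untouched.
		git -C "$gawk_clone" show "$rev:support/$f" >"$path.new" &&
		mv "$path.new" "$path" || {
			rm -f "$path.new"
			echo "failed to import $f from $rev" >&2
			return 1
		}
	done
	# Print the revision so that the caller can record it in the
	# commit message of the import.
	echo "$rev"
}
```

The printed revision is what would go into the log message of the
importing commit, e.g. by capturing it with
rev=$(import_regex_sources ../gawk compat/regex) and mentioning $rev
when committing (the ../gawk path is, again, just an assumption).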