git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/1] xdiff: share xdiff between git and libgit2
@ 2022-02-09  1:29 Edward Thomson
  2022-02-09  1:33 ` [PATCH 1/1] xdiff: provide indirection to git functions Edward Thomson
  2022-02-17 15:44 ` [PATCH 0/1] xdiff: share xdiff between git and libgit2 Johannes Schindelin
  0 siblings, 2 replies; 11+ messages in thread
From: Edward Thomson @ 2022-02-09  1:29 UTC (permalink / raw)
  To: git; +Cc: johannes.schindelin

Hello from libgit2, where we borrowed your xdiff a few years ago and
have watched as we both hacked on it independently.  (For us, mostly it
was around tightening some things up around warnings and signed/unsigned
mismatches.)  However, we'd love to share a common xdiff implementation,
and we're happy if git is the home for that.

The next patch adds an indirection point, `git-xdiff.h`, that contains
the git-specific functionality in xdiff.  This keeps the core of xdiff
to standard functions.  Other xdiff users, like libgit2, can specify
their own compatibility functions in this header file.

I hope that this allows us to make progress on a common xdiff; we'd love
to go back to building it without warnings, but we'd like to not do that
in isolation.

Cheers-
-ed

Edward Thomson (1):
  xdiff: provide indirection to git functions

 xdiff/git-xdiff.h | 14 ++++++++++++++
 xdiff/xdiff.h     |  8 +++-----
 xdiff/xdiffi.c    | 20 ++++++++++----------
 xdiff/xinclude.h  |  2 +-
 4 files changed, 28 insertions(+), 16 deletions(-)
 create mode 100644 xdiff/git-xdiff.h

--
2.35.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/1] xdiff: provide indirection to git functions
  2022-02-09  1:29 [PATCH 0/1] xdiff: share xdiff between git and libgit2 Edward Thomson
@ 2022-02-09  1:33 ` Edward Thomson
  2022-02-09 11:07   ` Phillip Wood
  2022-02-15 23:40   ` Ævar Arnfjörð Bjarmason
  2022-02-17 15:44 ` [PATCH 0/1] xdiff: share xdiff between git and libgit2 Johannes Schindelin
  1 sibling, 2 replies; 11+ messages in thread
From: Edward Thomson @ 2022-02-09  1:33 UTC (permalink / raw)
  To: git; +Cc: johannes.schindelin

Provide an indirection layer into the git-specific functionality and
utilities in `git-xdiff.h`, prefixing those types and functions with
`xdl_` (and `XDL_` for macros).  This allows other projects that use
git's xdiff implementation to keep up-to-date; they can now take all the
files _except_ `git-xdiff.h`, which they have customized for their own
environment.

Signed-off-by: Edward Thomson <ethomson@edwardthomson.com>
---
 xdiff/git-xdiff.h | 14 ++++++++++++++
 xdiff/xdiff.h     |  8 +++-----
 xdiff/xdiffi.c    | 20 ++++++++++----------
 xdiff/xinclude.h  |  2 +-
 4 files changed, 28 insertions(+), 16 deletions(-)
 create mode 100644 xdiff/git-xdiff.h

diff --git a/xdiff/git-xdiff.h b/xdiff/git-xdiff.h
new file mode 100644
index 0000000000..5d47576551
--- /dev/null
+++ b/xdiff/git-xdiff.h
@@ -0,0 +1,14 @@
+#ifndef GIT_XDIFF_H
+#define GIT_XDIFF_H
+
+#define xdl_malloc(x) xmalloc(x)
+#define xdl_free(ptr) free(ptr)
+#define xdl_realloc(ptr,x) xrealloc(ptr,x)
+
+#define xdl_regex_t regex_t
+#define xdl_regmatch_t regmatch_t
+#define xdl_regexec_buf(p, b, s, n, m, f) regexec_buf(p, b, s, n, m, f)
+
+#define XDL_BUG(msg) BUG(msg)
+
+#endif
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 72e25a9ffa..fb47f63fbf 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -27,6 +27,8 @@
 extern "C" {
 #endif /* #ifdef __cplusplus */
 
+#include "git-xdiff.h"
+
 /* xpparm_t.flags */
 #define XDF_NEED_MINIMAL (1 << 0)
 
@@ -82,7 +84,7 @@ typedef struct s_xpparam {
 	unsigned long flags;
 
 	/* -I<regex> */
-	regex_t **ignore_regex;
+	xdl_regex_t **ignore_regex;
 	size_t ignore_regex_nr;
 
 	/* See Documentation/diff-options.txt. */
@@ -119,10 +121,6 @@ typedef struct s_bdiffparam {
 } bdiffparam_t;
 
 
-#define xdl_malloc(x) xmalloc(x)
-#define xdl_free(ptr) free(ptr)
-#define xdl_realloc(ptr,x) xrealloc(ptr,x)
-
 void *xdl_mmfile_first(mmfile_t *mmf, long *size);
 long xdl_mmfile_size(mmfile_t *mmf);
 
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 69689fab24..af31b7f4b3 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -832,7 +832,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
 			/* Shift the group backward as much as possible: */
 			while (!group_slide_up(xdf, &g))
 				if (group_previous(xdfo, &go))
-					BUG("group sync broken sliding up");
+					XDL_BUG("group sync broken sliding up");
 
 			/*
 			 * This is this highest that this group can be shifted.
@@ -848,7 +848,7 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
 				if (group_slide_down(xdf, &g))
 					break;
 				if (group_next(xdfo, &go))
-					BUG("group sync broken sliding down");
+					XDL_BUG("group sync broken sliding down");
 
 				if (go.end > go.start)
 					end_matching_other = g.end;
@@ -873,9 +873,9 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
 			 */
 			while (go.end == go.start) {
 				if (group_slide_up(xdf, &g))
-					BUG("match disappeared");
+					XDL_BUG("match disappeared");
 				if (group_previous(xdfo, &go))
-					BUG("group sync broken sliding to match");
+					XDL_BUG("group sync broken sliding to match");
 			}
 		} else if (flags & XDF_INDENT_HEURISTIC) {
 			/*
@@ -916,9 +916,9 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
 
 			while (g.end > best_shift) {
 				if (group_slide_up(xdf, &g))
-					BUG("best shift unreached");
+					XDL_BUG("best shift unreached");
 				if (group_previous(xdfo, &go))
-					BUG("group sync broken sliding to blank line");
+					XDL_BUG("group sync broken sliding to blank line");
 			}
 		}
 
@@ -927,11 +927,11 @@ int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
 		if (group_next(xdf, &g))
 			break;
 		if (group_next(xdfo, &go))
-			BUG("group sync broken moving to next group");
+			XDL_BUG("group sync broken moving to next group");
 	}
 
 	if (!group_next(xdfo, &go))
-		BUG("group sync broken at end of file");
+		XDL_BUG("group sync broken at end of file");
 
 	return 0;
 }
@@ -1011,11 +1011,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
 }
 
 static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
-	regmatch_t regmatch;
+	xdl_regmatch_t regmatch;
 	int i;
 
 	for (i = 0; i < xpp->ignore_regex_nr; i++)
-		if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
+		if (!xdl_regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
 				 &regmatch, 0))
 			return 1;
 
diff --git a/xdiff/xinclude.h b/xdiff/xinclude.h
index a4285ac0eb..bf66dc0a87 100644
--- a/xdiff/xinclude.h
+++ b/xdiff/xinclude.h
@@ -24,6 +24,7 @@
 #define XINCLUDE_H
 
 #include "git-compat-util.h"
+#include "git-xdiff.h"
 #include "xmacros.h"
 #include "xdiff.h"
 #include "xtypes.h"
@@ -32,5 +33,4 @@
 #include "xdiffi.h"
 #include "xemit.h"
 
-
 #endif /* #if !defined(XINCLUDE_H) */
-- 
2.35.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] xdiff: provide indirection to git functions
  2022-02-09  1:33 ` [PATCH 1/1] xdiff: provide indirection to git functions Edward Thomson
@ 2022-02-09 11:07   ` Phillip Wood
  2022-02-15 23:40   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 11+ messages in thread
From: Phillip Wood @ 2022-02-09 11:07 UTC (permalink / raw)
  To: Edward Thomson, git; +Cc: johannes.schindelin

Hi Edward

On 09/02/2022 01:33, Edward Thomson wrote:
> Provide an indirection layer into the git-specific functionality and
> utilities in `git-xdiff.h`, prefixing those types and functions with
> `xdl_` (and `XDL_` for macros).  This allows other projects that use
> git's xdiff implementation to keep up-to-date; they can now take all the
> files _except_ `git-xdiff.h`, which they have customized for their own
> environment.

This seems like a sensible way to make it easier to share a common 
xdiff. The patch looks good to me apart from

> diff --git a/xdiff/xinclude.h b/xdiff/xinclude.h
> index a4285ac0eb..bf66dc0a87 100644
> --- a/xdiff/xinclude.h
> +++ b/xdiff/xinclude.h
> @@ -24,6 +24,7 @@
>   #define XINCLUDE_H
>   
>   #include "git-compat-util.h"

I think you want to remove this

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] xdiff: provide indirection to git functions
  2022-02-09  1:33 ` [PATCH 1/1] xdiff: provide indirection to git functions Edward Thomson
  2022-02-09 11:07   ` Phillip Wood
@ 2022-02-15 23:40   ` Ævar Arnfjörð Bjarmason
  2022-02-16 11:02     ` Phillip Wood
  1 sibling, 1 reply; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-15 23:40 UTC (permalink / raw)
  To: Edward Thomson; +Cc: git, johannes.schindelin


On Wed, Feb 09 2022, Edward Thomson wrote:

> Provide an indirection layer into the git-specific functionality and
> utilities in `git-xdiff.h`, prefixing those types and functions with
> `xdl_` (and `XDL_` for macros).  This allows other projects that use
> git's xdiff implementation to keep up-to-date; they can now take all the
> files _except_ `git-xdiff.h`, which they have customized for their own
> environment.

It seems sensible to share code here, but...

> +#ifndef GIT_XDIFF_H
> +#define GIT_XDIFF_H
> +
> +#define xdl_malloc(x) xmalloc(x)
> +#define xdl_free(ptr) free(ptr)
> +#define xdl_realloc(ptr,x) xrealloc(ptr,x)

...I don't understand the need for prefixing every function that may be
used from git.git with xdl_*. In particular for these memory managing
functions shouldn't this Just Work per 8d128513429 (grep/pcre2: actually
make pcre2 use custom allocator, 2021-02-18) and cbe81e653fa
(grep/pcre2: move back to thread-only PCREv2 structures, 2021-02-18)?
I.e. link-time use of free().

Of course trivial wrappers would be needed for x*() variants...

> +#define xdl_regex_t regex_t

This is a type that's in POSIX. Why do we need an xdl_* prefix for it?

> +#define xdl_regmatch_t regmatch_t

ditto.

> +#define xdl_regexec_buf(p, b, s, n, m, f) regexec_buf(p, b, s, n, m, f)

But this is our own custom function, which brings me to...

> +#define XDL_BUG(msg) BUG(msg)

...unless libgit2 has a regexec_buf() or BUG() why do we need this
indirection? Let's just have xdiff() use a bug, and then either libgit2
will have a BUG() macro/function, or it'll fail at compile-time.

This seems to at least partly have been inspired by git.git's
546096a5cbb (xdiff: use BUG(...), not xdl_bug(...), 2021-06-07), i.e. we
used to have an xdl_bug(), but now we just use BUG().

I then see on your libgit2 side 1458fb56e (xdiff: include new xdiff from
git, 2022-01-29).

But why not simply?:

    #define BUG(msg) GIT_ASSERT(msg)

It would make things easier on the git.git side (etags and all).

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] xdiff: provide indirection to git functions
  2022-02-15 23:40   ` Ævar Arnfjörð Bjarmason
@ 2022-02-16 11:02     ` Phillip Wood
  2022-02-16 13:27       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 11+ messages in thread
From: Phillip Wood @ 2022-02-16 11:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Edward Thomson
  Cc: git, johannes.schindelin

On 15/02/2022 23:40, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Feb 09 2022, Edward Thomson wrote:
> 
>> Provide an indirection layer into the git-specific functionality and
>> utilities in `git-xdiff.h`, prefixing those types and functions with
>> `xdl_` (and `XDL_` for macros).  This allows other projects that use
>> git's xdiff implementation to keep up-to-date; they can now take all the
>> files _except_ `git-xdiff.h`, which they have customized for their own
>> environment.
> 
> It seems sensible to share code here, but...
> 
>> +#ifndef GIT_XDIFF_H
>> +#define GIT_XDIFF_H
>> +
>> +#define xdl_malloc(x) xmalloc(x)
>> +#define xdl_free(ptr) free(ptr)
>> +#define xdl_realloc(ptr,x) xrealloc(ptr,x)
> 
> ...I don't understand the need for prefixing every function that may be
> used from git.git with xdl_*. In particular for these memory managing
> functions shouldn't this Just Work per 8d128513429 (grep/pcre2: actually
> make pcre2 use custom allocator, 2021-02-18) and cbe81e653fa
> (grep/pcre2: move back to thread-only PCREv2 structures, 2021-02-18)?
> I.e. link-time use of free().

I read that paragraph a couple of times and I'm still not sure I 
understand what you're saying. It is not unusual for libraries to define 
their own allocation functions and the code base is already using 
xdl_malloc etc so these defines seem quite reasonable. As you point out 
below we'd need wrappers for xmalloc() etc anyway so I'm not sure what 
the problem is.

> Of course trivial wrappers would be needed for x*() variants...
> 
>> +#define xdl_regex_t regex_t
> 
> This is a type that's in POSIX. Why do we need an xdl_* prefix for it?
> 
>> +#define xdl_regmatch_t regmatch_t
> 
> ditto.
> 
>> +#define xdl_regexec_buf(p, b, s, n, m, f) regexec_buf(p, b, s, n, m, f)
> 
> But this is our own custom function, which brings me to...
> 
>> +#define XDL_BUG(msg) BUG(msg)
> 
> ...unless libgit2 has a regexec_buf() or BUG() why do we need this
> indirection? Let's just have xdiff() use a bug, and then either libgit2
> will have a BUG() macro/function, or it'll fail at compile-time.
> 
> This seems to at least partly have been inspired by git.git's
> 546096a5cbb (xdiff: use BUG(...), not xdl_bug(...), 2021-06-07), i.e. we
> used to have an xdl_bug(), but now we just use BUG().
> 
> I then see on your libgit2 side 1458fb56e (xdiff: include new xdiff from
> git, 2022-01-29).
> 
> But why not simply?:
> 
>      #define BUG(msg) GIT_ASSERT(msg)
> 
> It would make things easier on the git.git side (etags and all).

If we want xdiff to be usable for other projects I think we're going to 
have to accept that it is sensible to namespace its functions.

Best Wishes

Phillip


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] xdiff: provide indirection to git functions
  2022-02-16 11:02     ` Phillip Wood
@ 2022-02-16 13:27       ` Ævar Arnfjörð Bjarmason
       [not found]         ` <20220217012847.GA8@e5e602f6ad40>
  0 siblings, 1 reply; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16 13:27 UTC (permalink / raw)
  To: phillip.wood; +Cc: Edward Thomson, git, johannes.schindelin


On Wed, Feb 16 2022, Phillip Wood wrote:

> On 15/02/2022 23:40, Ævar Arnfjörð Bjarmason wrote:
>> On Wed, Feb 09 2022, Edward Thomson wrote:
>> 
>>> Provide an indirection layer into the git-specific functionality and
>>> utilities in `git-xdiff.h`, prefixing those types and functions with
>>> `xdl_` (and `XDL_` for macros).  This allows other projects that use
>>> git's xdiff implementation to keep up-to-date; they can now take all the
>>> files _except_ `git-xdiff.h`, which they have customized for their own
>>> environment.
>> It seems sensible to share code here, but...
>> 
>>> +#ifndef GIT_XDIFF_H
>>> +#define GIT_XDIFF_H
>>> +
>>> +#define xdl_malloc(x) xmalloc(x)
>>> +#define xdl_free(ptr) free(ptr)
>>> +#define xdl_realloc(ptr,x) xrealloc(ptr,x)
>> ...I don't understand the need for prefixing every function that may
>> be
>> used from git.git with xdl_*. In particular for these memory managing
>> functions shouldn't this Just Work per 8d128513429 (grep/pcre2: actually
>> make pcre2 use custom allocator, 2021-02-18) and cbe81e653fa
>> (grep/pcre2: move back to thread-only PCREv2 structures, 2021-02-18)?
>> I.e. link-time use of free().
>
> I read that paragraph a couple of times and I'm still not sure I
> understand what you're saying. It is not unusual for libraries to
> define their own allocation functions and the code base is already
> using xdl_malloc etc so these defines seem quite reasonable. As you
> point out below we'd need wrappers for xmalloc() etc anyway so I'm not
> sure what the problem is.

That you generally don't need to define such wrappers for free() and
malloc(), because that's something you can handle at link-time.

This is current libgit2, which seems to have a version of this patch
integrated:
    
    $ git reference; git -P grep '\bfree\(' src/xdiff
    c8450561d (Merge pull request #6216 from libgit2/ethomson/readme, 2022-02-13)
    src/xdiff/xmerge.c:             free(c);
    src/xdiff/xmerge.c:     free(next_m);

I.e. I think instead of having xdl_free(), xdl_regcomp() etc. it makes
sense to just slowly go in the other direction and call free(),
regcomp() etc. Since it seems we're going to be maintaining an xdiff
fork permanently.

>> Of course trivial wrappers would be needed for x*() variants...
>> 
>>> +#define xdl_regex_t regex_t
>> This is a type that's in POSIX. Why do we need an xdl_* prefix for
>> it?
>> 
>>> +#define xdl_regmatch_t regmatch_t
>> ditto.
>> 
>>> +#define xdl_regexec_buf(p, b, s, n, m, f) regexec_buf(p, b, s, n, m, f)
>> But this is our own custom function, which brings me to...
>> 
>>> +#define XDL_BUG(msg) BUG(msg)
>> ...unless libgit2 has a regexec_buf() or BUG() why do we need this
>> indirection? Let's just have xdiff() use a bug, and then either libgit2
>> will have a BUG() macro/function, or it'll fail at compile-time.
>> This seems to at least partly have been inspired by git.git's
>> 546096a5cbb (xdiff: use BUG(...), not xdl_bug(...), 2021-06-07), i.e. we
>> used to have an xdl_bug(), but now we just use BUG().
>> I then see on your libgit2 side 1458fb56e (xdiff: include new xdiff
>> from
>> git, 2022-01-29).
>> But why not simply?:
>>      #define BUG(msg) GIT_ASSERT(msg)
>> It would make things easier on the git.git side (etags and all).
>
> If we want xdiff to be usable for other projects I think we're going
> to have to accept that it is sensible to namespace its functions.

We're just talking about sharing code with libgit2, which I agree with
as a goal. I just don't see why we'd need to have e.g. XDL_BUG() as
opposed to libgit2 just providing a BUG() for its compatibility with our
xdiff.

We have other in-tree code with the same goal that does that, see
reftable/system.h.

It means that development in git.git can proceed without worrying about
the special-case, including stuff like this not doing what you think,
because you forgot the xdiff-specific alias:

    git grep -w BUG

And as long as libgit2 doesn't have a BUG() of its own (which it's
unlikely to do, since it's a generally usable library, and thus is
concerned about namespace conflicts) it can just provide the wrapper,
and providing that will be the same amount of work o that side, no?

This proposed wrapper is also BUGgy in that it's not __VA_ARGS__. It
just happens to work right now because none of xdiff/ uses >1 argument,
but that sort of thing is another reason to use BUG() and push the
compatibility headaches to whoever is doing the one-off import into
other codebases.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] xdiff: provide indirection to git functions
       [not found]         ` <20220217012847.GA8@e5e602f6ad40>
@ 2022-02-17  9:29           ` Ævar Arnfjörð Bjarmason
  2022-02-17 17:32             ` Junio C Hamano
  2022-02-17 22:58             ` Edward Thomson
  0 siblings, 2 replies; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-17  9:29 UTC (permalink / raw)
  To: Edward Thomson; +Cc: Git ML


On Thu, Feb 17 2022, Edward Thomson wrote:

[I'm assuming that dropping the list from CC was a mistake, re-CC-ing]

> On Wed, Feb 16, 2022 at 02:27:27PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> 
>> This is current libgit2, which seems to have a version of this patch
>> integrated:
>>     
>>     $ git reference; git -P grep '\bfree\(' src/xdiff
>>     c8450561d (Merge pull request #6216 from libgit2/ethomson/readme, 2022-02-13)
>>     src/xdiff/xmerge.c:             free(c);
>>     src/xdiff/xmerge.c:     free(next_m);
>
> Yikes!  A buggy version, in fact.  More on that in a moment.
>
>> I.e. I think instead of having xdl_free(), xdl_regcomp() etc. it makes
>> sense to just slowly go in the other direction and call free(),
>> regcomp() etc. Since it seems we're going to be maintaining an xdiff
>> fork permanently.
>
> Right, and that's explicitly what doesn't work for libgit2, and the goal
> of an abstraction layer that we can both use.  git uses `xmalloc` (for
> example), while libgit2 uses `git__alloc` to allocate memory.  This
> seems sensible enough to replace, but there's not a 1:1 mapping in our
> APIs because git lacks an `xfree`.
>
> If it were just a matter of `#define xmalloc git__alloc` then that might
> be a reasonable strategy for libgit2 to take for code re-use.  But
> `git__alloc` isn't necessarily just a wrapper around `malloc`, it's a
> pluggable allocator that a library user can supply.  So we can't call
> our allocator (`git__alloc`) and then the system's `free` because that
> will most certainly fail.  We supply a `git__free` for this reason.
>
> (I appreciate you pointing out that I missed this in our update to
> libgit2!  The ruby bindings make use of their own allocator and we can
> ship a patch before they update.)

Yes. I understand. I'm saying that for the purposes of a "free()" in the
text of the xdiff code you can have your cake and eat it too. You just
need to adjust the compilation of xdiff within libgit2 so that it
e.g. defines free() to git__free() or whatever before that code is
included, or to do the same at link-time.

The reason I pointed at the PCRE commits in git.git is that's exactly
what we ended up doing by accident with nedmalloc + PCRE. I.e. because
we use free() and malloc() we ended up with nedmalloc due to that shim,
but would then link to libpcre2 which would use the system malloc (not
compiled with those shims).

Since in this case we're talking about someone importing libgit2 into
their tree all of malloc() and free() can end up in the right place for
you.

>> 
>> >> Of course trivial wrappers would be needed for x*() variants...
>> >> 
>> >>> +#define xdl_regex_t regex_t
>> >> This is a type that's in POSIX. Why do we need an xdl_* prefix for
>> >> it?
>
> Precisely because it's a type in POSIX.  libgit2 doesn't necessarily
> build on POSIX systems, and a user could - again - supply their own
> regex engine like PCRE even if they do have the POSIX regex engine
> available on their installations.

Sure, but these renames aren't needed for that. In fact PCRE ships with
a POSIX shimmy layer which makes my point for me. See pcre2posix(3),
i.e. it'll redefine regcomp(), regexec(), regex_t etc.

So you're saying you need to renaming so you can get X, but
pcre2posix(3) is a working demonstration of X without that step :)

In this case though we do need the regexec_buf() semantics, but the
right thing to do for xdiff/* compatibility is to just split off the
trivial regexec_buf() shim in git-compat-util.h say a
compat/regexec_buf.c for your convenience, then you could import that
along with xdiff/*.

I haven't checked if pcre2posix supports that non-portable
*BSD-originated trickery in regexec_buf() to make the pmatch variables
carry the length (if not it would be trivial to make it do so), but
everything else above would be easy

> libgit2 could - I suppose - do some magic to ensure that we call it a
> `regex_t` even when it's a `pcre *`.  But any new person to our codebase
> would (rightly) expect a `regex_t` to be ... well, a `regex_t` and might
> (again, rightly) expect to find a `re_nsub` on it.  Hell, even I would
> expect this because I don't interact with the regex code on the regular.

I think if you're expecting shimmying layers for POSIX regexen to be in
play that people would expect it to work like pcre2posix(3). I.e. you
use the POSIX API but drop in a shim for a non-libc implementation.

As for the "new person to our codebase..." I don't think you're wrong
there, but that's an asthetic preference, not something that's required
for the stated aims of this series of dropping in compatibility shims.

The reason I started commenting here was because I was surprised that
this was needed at all for libgit2, since we do exactly that sort of
shimming without the renaming here.

>> We're just talking about sharing code with libgit2, which I agree with
>> as a goal. I just don't see why we'd need to have e.g. XDL_BUG() as
>> opposed to libgit2 just providing a BUG() for its compatibility with our
>> xdiff.
>
> I care very little about `BUG`, but I care very much about allocation
> and regex abstraction.  But now it makes more sense to me to have a
> common prefix for the abstraction rather than piecemeal.
>
> I'll supply a re-roll with the issues that you and Philip pointed out
> and I'm certain that we will continue the discussion.

For the asthetic preference?

If it's XDL_BUG() the primary project (git.git) needs to carry the
XDL_BUG() -> BUG() shim along with libgit2's XDL_BUG() ->
GIT_ASSERT(msg) .

If it's just BUG() we don't need the shim in git.git, but you'll need a
BUG() -> GIT_ASSERT(msg).

I don't see the benefit of requiring two shims instead of one, both in
terms of code, and the readability of the codebase in git.git
(i.e. grepping for "git grep -w BUG" or whatever, then remembering it's
prefixing everything...).


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/1] xdiff: share xdiff between git and libgit2
  2022-02-09  1:29 [PATCH 0/1] xdiff: share xdiff between git and libgit2 Edward Thomson
  2022-02-09  1:33 ` [PATCH 1/1] xdiff: provide indirection to git functions Edward Thomson
@ 2022-02-17 15:44 ` Johannes Schindelin
  1 sibling, 0 replies; 11+ messages in thread
From: Johannes Schindelin @ 2022-02-17 15:44 UTC (permalink / raw)
  To: Edward Thomson; +Cc: git

Hi Ed,

On Wed, 9 Feb 2022, Edward Thomson wrote:

> Hello from libgit2, where we borrowed your xdiff a few years ago and
> have watched as we both hacked on it independently.  (For us, mostly it
> was around tightening some things up around warnings and signed/unsigned
> mismatches.)  However, we'd love to share a common xdiff implementation,
> and we're happy if git is the home for that.

Great!

> The next patch adds an indirection point, `git-xdiff.h`, that contains
> the git-specific functionality in xdiff.  This keeps the core of xdiff
> to standard functions.  Other xdiff users, like libgit2, can specify
> their own compatibility functions in this header file.

I like this direction and looked over the patch: ACK!

> I hope that this allows us to make progress on a common xdiff; we'd love
> to go back to building it without warnings, but we'd like to not do that
> in isolation.

Yes, let's combine efforts.

Thank you for kicking this off,
Dscho

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] xdiff: provide indirection to git functions
  2022-02-17  9:29           ` Ævar Arnfjörð Bjarmason
@ 2022-02-17 17:32             ` Junio C Hamano
  2022-02-17 22:58             ` Edward Thomson
  1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2022-02-17 17:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Edward Thomson, Git ML

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> ...
> If it's XDL_BUG() the primary project (git.git) needs to carry the
> XDL_BUG() -> BUG() shim along with libgit2's XDL_BUG() ->
> GIT_ASSERT(msg) .
>
> If it's just BUG() we don't need the shim in git.git, but you'll need a
> BUG() -> GIT_ASSERT(msg).
>
> I don't see the benefit of requiring two shims instead of one, both in
> terms of code, and the readability of the codebase in git.git
> (i.e. grepping for "git grep -w BUG" or whatever, then remembering it's
> prefixing everything...).

Renaming symbols with preprocessor macro "#define"s, without forcing
people to change the names they have used in the code and have to
write in the future, sounds like a sensible direction to go in.

Thanks.  

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] xdiff: provide indirection to git functions
  2022-02-17  9:29           ` Ævar Arnfjörð Bjarmason
  2022-02-17 17:32             ` Junio C Hamano
@ 2022-02-17 22:58             ` Edward Thomson
  2022-04-15 15:55               ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 11+ messages in thread
From: Edward Thomson @ 2022-02-17 22:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git ML

On Thu, Feb 17, 2022 at 10:29:23AM +0100, Ævar Arnfjörð Bjarmason wrote:
> 
> [I'm assuming that dropping the list from CC was a mistake, re-CC-ing]

It was; many apologies, I don't use mutt very often any more.  Thanks!

> As for the "new person to our codebase..." I don't think you're wrong
> there, but that's an asthetic preference, not something that's required
> for the stated aims of this series of dropping in compatibility shims.

Sure, but avoiding a prefix is also not a technical decision but an
aesthetic and ergonomic one.

Is using a prefix here great?  No, it's not great, it's shit.  But it's
shit that's easy to reason about.

If somebody sees a call to `xdl_free` in some code, they say "wtf is
this `xdl_free` nonsense?"  And they grep around and figure it out and
understand the way that this project handles heap allocations.  It's
very transparent.

If somebody sees a call to `free` in their code, they say "great,
`free`".  But it merely *appears* very transparent; in fact, there's
some magic behind the scenes that turns a `free` into a `git__free`
without you knowing it.  You've not learned the way that this project
handles heap allocations, but you also don't know that there's anything
that you needed to learn.  These are the sorts of things that you think
you understand but only discover when you _need_ to discover it because
something's gone very wrong.

In my experience, calling a function what it _isn't_ is the sort of thing
that a developer discovers the hard way, and that often leads to them
not trusting the codebase because it doesn't do what it says it does.

Cheers-
-ed

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] xdiff: provide indirection to git functions
  2022-02-17 22:58             ` Edward Thomson
@ 2022-04-15 15:55               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-15 15:55 UTC (permalink / raw)
  To: Edward Thomson; +Cc: Git ML


On Thu, Feb 17 2022, Edward Thomson wrote:

> On Thu, Feb 17, 2022 at 10:29:23AM +0100, Ævar Arnfjörð Bjarmason wrote:
>> 
>> [I'm assuming that dropping the list from CC was a mistake, re-CC-ing]
>
> It was; many apologies, I don't use mutt very often any more.  Thanks!

No worries. Also, late reply but I remembered & referenced this thread
in
https://lore.kernel.org/git/220415.86bkx2bb0p.gmgdl@evledraar.gmail.com/,
and saw that I'd left this hanging...

>> As for the "new person to our codebase..." I don't think you're wrong
>> there, but that's an asthetic preference, not something that's required
>> for the stated aims of this series of dropping in compatibility shims.
>
> Sure, but avoiding a prefix is also not a technical decision but an
> aesthetic and ergonomic one.

Yes, I see that, but this code is maintained in git.git, not
libgit2.git, and having to remember to use custom malloc()/free()
per-namespace is very much negative asthetics & ergonomics in that
context.

So if the linker solution works...

> Is using a prefix here great?  No, it's not great, it's shit.  But it's
> shit that's easy to reason about.

I really don't see that, as noted in the linked newer reply above we
have bugs due to this sort of pattern where someone uses
mycustom_malloc(), forgets that, and then calls free() instead of
mycustom_free().

Which is a bug and potential segfault that's entirely preventable by not
using such wrappers at the per-file level (some one-off "this is where
we provide a custom malloc" file might of course have such complexity).

> If somebody sees a call to `xdl_free` in some code, they say "wtf is
> this `xdl_free` nonsense?"  And they grep around and figure it out and
> understand the way that this project handles heap allocations.  It's
> very transparent.
>
> If somebody sees a call to `free` in their code, they say "great,
> `free`".  But it merely *appears* very transparent; in fact, there's
> some magic behind the scenes that turns a `free` into a `git__free`
> without you knowing it.  You've not learned the way that this project
> handles heap allocations, but you also don't know that there's anything
> that you needed to learn.  These are the sorts of things that you think
> you understand but only discover when you _need_ to discover it because
> something's gone very wrong.

Because the reader assumed that when they saw malloc/free that it was
The Canonical Libc version, as opposed to whatever custom malloc the
library linked to?

> In my experience, calling a function what it _isn't_ is the sort of thing
> that a developer discovers the hard way, and that often leads to them
> not trusting the codebase because it doesn't do what it says it does.

But they aren't anything until you link to something that provides them.

Anyway, I think I see your point, you'd like names to always reflect
their different-ness, no linker shenanigans.

Anyway, since per [1] it seemed Junio was also more partial to sticking
with malloc/free *and* we're talking about a thing that gets
one-off-imported into libgit2 (not as a submodule, presumably) I don't
think there's any reason to really argue about this.

I.e. instead of importing the sources as-is why not just search-replace
malloc to mymalloc and free to myfree?

Which can be either a dumb "sed" script, or even better the same (and
guaranteed to understand C syntax) thing with coccinelle/spatch.

Which wouldn't require libgit2 to have a dependency on that, just
whatever dev runs that one-off import occasionally. The semantic patch
is just:
	
	@@
	expression E;
	@@
	- free(E);
	+ myfree(E);
	
	@@
	expression E;
	@@
	- malloc(E);
	+ mymalloc(E);

etc.

Wouldn't that also give you exactly what you want? Or was the plan to
have libgit2 have some option to build this *directly* from git.git
sources?

1. https://lore.kernel.org/git/xmqqfsohbdre.fsf@gitster.g/

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-04-15 16:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-09  1:29 [PATCH 0/1] xdiff: share xdiff between git and libgit2 Edward Thomson
2022-02-09  1:33 ` [PATCH 1/1] xdiff: provide indirection to git functions Edward Thomson
2022-02-09 11:07   ` Phillip Wood
2022-02-15 23:40   ` Ævar Arnfjörð Bjarmason
2022-02-16 11:02     ` Phillip Wood
2022-02-16 13:27       ` Ævar Arnfjörð Bjarmason
     [not found]         ` <20220217012847.GA8@e5e602f6ad40>
2022-02-17  9:29           ` Ævar Arnfjörð Bjarmason
2022-02-17 17:32             ` Junio C Hamano
2022-02-17 22:58             ` Edward Thomson
2022-04-15 15:55               ` Ævar Arnfjörð Bjarmason
2022-02-17 15:44 ` [PATCH 0/1] xdiff: share xdiff between git and libgit2 Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).