git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH] fast-export: factor out print_oid()
@ 2020-08-13 11:11 René Scharfe
  2020-08-13 11:58 ` Jeff King
  2020-08-13 15:18 ` Taylor Blau
  0 siblings, 2 replies; 10+ messages in thread
From: René Scharfe @ 2020-08-13 11:11 UTC
  To: Git Mailing List; +Cc: Junio C Hamano

Simplify the output code by splitting it up and reducing duplication.
Reuse the logic for printing object IDs -- anonymized if needed -- by
moving it to its own function, print_oid().

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 builtin/fast-export.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 9f37895d4cf..49bb50634ab 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -420,6 +420,14 @@ static const char *anonymize_oid(const char *oid_hex)
 	return anonymize_str(&objs, generate_fake_oid, oid_hex, len, NULL);
 }

+static void print_oid(const struct object_id *oid, int anonymize)
+{
+	const char *oid_hex = oid_to_hex(oid);
+	if (anonymize)
+		oid_hex = anonymize_oid(oid_hex);
+	fputs(oid_hex, stdout);
+}
+
 static void show_filemodify(struct diff_queue_struct *q,
 			    struct diff_options *options, void *data)
 {
@@ -470,21 +478,19 @@ static void show_filemodify(struct diff_queue_struct *q,
 		case DIFF_STATUS_TYPE_CHANGED:
 		case DIFF_STATUS_MODIFIED:
 		case DIFF_STATUS_ADDED:
+			printf("M %06o ", spec->mode);
 			/*
 			 * Links refer to objects in another repositories;
 			 * output the SHA-1 verbatim.
 			 */
 			if (no_data || S_ISGITLINK(spec->mode))
-				printf("M %06o %s ", spec->mode,
-				       anonymize ?
-				       anonymize_oid(oid_to_hex(&spec->oid)) :
-				       oid_to_hex(&spec->oid));
+				print_oid(&spec->oid, anonymize);
 			else {
 				struct object *object = lookup_object(the_repository,
 								      &spec->oid);
-				printf("M %06o :%d ", spec->mode,
-				       get_object_mark(object));
+				printf(":%d", get_object_mark(object));
 			}
+			putchar(' ');
 			print_path(spec->path);
 			string_list_insert(changed, spec->path);
 			putchar('\n');
@@ -724,12 +730,10 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
 		else
 			printf("merge ");
 		if (mark)
-			printf(":%d\n", mark);
+			printf(":%d", mark);
 		else
-			printf("%s\n",
-			       anonymize ?
-			       anonymize_oid(oid_to_hex(&obj->oid)) :
-			       oid_to_hex(&obj->oid));
+			print_oid(&obj->oid, anonymize);
+		putchar('\n');
 		i++;
 	}

--
2.28.0


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-13 11:11 [PATCH] fast-export: factor out print_oid() René Scharfe
@ 2020-08-13 11:58 ` Jeff King
  2020-08-13 17:17   ` René Scharfe
  2020-08-13 15:18 ` Taylor Blau
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff King @ 2020-08-13 11:58 UTC
  To: René Scharfe; +Cc: Git Mailing List, Junio C Hamano

On Thu, Aug 13, 2020 at 01:11:18PM +0200, René Scharfe wrote:

> Simplify the output code by splitting it up and reducing duplication.
> Reuse the logic for printing object IDs -- anonymized if needed -- by
> moving it to its own function, print_oid().

Looks sane overall, though somehow we added 4 extra lines while reducing
duplication. ;)

> +static void print_oid(const struct object_id *oid, int anonymize)
> +{
> +	const char *oid_hex = oid_to_hex(oid);
> +	if (anonymize)
> +		oid_hex = anonymize_oid(oid_hex);
> +	fputs(oid_hex, stdout);
> +}

Would anyone ever pass anything except the global "anonymize" into this
function? Certainly neither of the new callers does. I get that it
takes us on a possible road towards moving the globals to locals, but in
the meantime, shadowing the global name just seems more confusing to me.
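
To make the shadowing concern concrete, here is a minimal standalone
sketch. It is not taken from the patch: the helper names are hypothetical,
and only the file-scope flag mirrors the global in builtin/fast-export.c.

```c
/* File-scope flag, standing in for the global in builtin/fast-export.c. */
static int anonymize = 1;

/*
 * Hypothetical helper: the parameter has the same name as the global,
 * so inside the body only the argument is visible.
 */
static int wants_fake_oid(int anonymize)
{
	return anonymize != 0;
}

/* Same decision, but reading the file-scope flag directly. */
static int wants_fake_oid_global(void)
{
	return anonymize != 0;
}
```

Calling wants_fake_oid(0) yields 0 even though the file-scope flag is 1,
which is the kind of surprise a same-named parameter can cause.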

> @@ -470,21 +478,19 @@ static void show_filemodify(struct diff_queue_struct *q,
>  		case DIFF_STATUS_TYPE_CHANGED:
>  		case DIFF_STATUS_MODIFIED:
>  		case DIFF_STATUS_ADDED:
> +			printf("M %06o ", spec->mode);

This makes the output a bit more lego-like (i.e., hard to see what the
full line will look like from the code), but it already was pretty
bad because of using print_path(). I think that's fine.
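
As a rough illustration of the "lego" trade-off, the sketch below builds
the same filemodify line twice: once fragment by fragment and once with a
single format string. lines_match() is a hypothetical helper written for
this comparison, and it writes into local buffers rather than stdout so
the two results can be compared.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical comparison helper: returns 1 if the piecewise and the
 * single-format construction produce the same line.
 */
static int lines_match(unsigned mode, int mark, const char *path)
{
	char lego[256], whole[256];
	int n = 0;

	/* Paths are capped at 100 bytes here so neither buffer can overflow. */
	n += snprintf(lego + n, sizeof(lego) - n, "M %06o ", mode);
	n += snprintf(lego + n, sizeof(lego) - n, ":%d", mark);
	n += snprintf(lego + n, sizeof(lego) - n, " ");
	snprintf(lego + n, sizeof(lego) - n, "%.100s\n", path);

	/* The whole line is visible in one format string. */
	snprintf(whole, sizeof(whole), "M %06o :%d %.100s\n", mode, mark, path);

	return !strcmp(lego, whole);
}
```

Both variants emit identical bytes; the difference is only in how easily
a reader can see the final line layout from the code.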

> @@ -724,12 +730,10 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
>  		else
>  			printf("merge ");
>  		if (mark)
> -			printf(":%d\n", mark);
> +			printf(":%d", mark);

This line gets repeated, too. I guess we could have a print_mark(), but
there is really no logic here except "put a colon in front of it", so
probably not worthwhile.

-Peff


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-13 11:11 [PATCH] fast-export: factor out print_oid() René Scharfe
  2020-08-13 11:58 ` Jeff King
@ 2020-08-13 15:18 ` Taylor Blau
  2020-08-13 17:17   ` René Scharfe
  2020-08-13 18:19   ` Junio C Hamano
  1 sibling, 2 replies; 10+ messages in thread
From: Taylor Blau @ 2020-08-13 15:18 UTC
  To: René Scharfe; +Cc: Git Mailing List, Junio C Hamano

On Thu, Aug 13, 2020 at 01:11:18PM +0200, René Scharfe wrote:
> Simplify the output code by splitting it up and reducing duplication.
> Reuse the logic for printing object IDs -- anonymized if needed -- by
> moving it to its own function, print_oid().
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  builtin/fast-export.c | 26 +++++++++++++++-----------
>  1 file changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index 9f37895d4cf..49bb50634ab 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -420,6 +420,14 @@ static const char *anonymize_oid(const char *oid_hex)
>  	return anonymize_str(&objs, generate_fake_oid, oid_hex, len, NULL);
>  }
>
> +static void print_oid(const struct object_id *oid, int anonymize)
> +{
> +	const char *oid_hex = oid_to_hex(oid);
> +	if (anonymize)
> +		oid_hex = anonymize_oid(oid_hex);
> +	fputs(oid_hex, stdout);
> +}
> +

The fact that this calls fputs makes this patch (in my own opinion)
noisier than it needs to be. This is because of all of the factoring out
of the other printfs. I'd expect that this looks something more like:

  -				       anonymize ?
  -				       anonymize_oid(oid_to_hex(&spec->oid)) :
  -				       oid_to_hex(&spec->oid));
  +				       anonymize_oid(anonymize, &spec->oid));

without moving around all of the other printf code.

>  static void show_filemodify(struct diff_queue_struct *q,
>  			    struct diff_options *options, void *data)
>  {
> @@ -470,21 +478,19 @@ static void show_filemodify(struct diff_queue_struct *q,
>  		case DIFF_STATUS_TYPE_CHANGED:
>  		case DIFF_STATUS_MODIFIED:
>  		case DIFF_STATUS_ADDED:
> +			printf("M %06o ", spec->mode);
>  			/*
>  			 * Links refer to objects in another repositories;
>  			 * output the SHA-1 verbatim.
>  			 */
>  			if (no_data || S_ISGITLINK(spec->mode))
> -				printf("M %06o %s ", spec->mode,
> -				       anonymize ?
> -				       anonymize_oid(oid_to_hex(&spec->oid)) :
> -				       oid_to_hex(&spec->oid));
> +				print_oid(&spec->oid, anonymize);
>  			else {
>  				struct object *object = lookup_object(the_repository,
>  								      &spec->oid);
> -				printf("M %06o :%d ", spec->mode,
> -				       get_object_mark(object));
> +				printf(":%d", get_object_mark(object));
>  			}
> +			putchar(' ');

... That said, this transformation looks correct from a quick glance.

>  			print_path(spec->path);
>  			string_list_insert(changed, spec->path);
>  			putchar('\n');
> @@ -724,12 +730,10 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
>  		else
>  			printf("merge ");
>  		if (mark)
> -			printf(":%d\n", mark);
> +			printf(":%d", mark);
>  		else
> -			printf("%s\n",
> -			       anonymize ?
> -			       anonymize_oid(oid_to_hex(&obj->oid)) :
> -			       oid_to_hex(&obj->oid));
> +			print_oid(&obj->oid, anonymize);
> +		putchar('\n');

As does this one.

>  		i++;
>  	}
>
> --
> 2.28.0

So, I guess what I'm trying to say is that this patch doesn't look
wrong, other than that it seems more invasive than I would have expected
it to be.

Thanks,
Taylor


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-13 11:58 ` Jeff King
@ 2020-08-13 17:17   ` René Scharfe
  0 siblings, 0 replies; 10+ messages in thread
From: René Scharfe @ 2020-08-13 17:17 UTC
  To: Jeff King; +Cc: Git Mailing List, Junio C Hamano

Am 13.08.20 um 13:58 schrieb Jeff King:
> On Thu, Aug 13, 2020 at 01:11:18PM +0200, René Scharfe wrote:
>
>> Simplify the output code by splitting it up and reducing duplication.
>> Reuse the logic for printing object IDs -- anonymized if needed -- by
>> moving it to its own function, print_oid().
>
> Looks sane overall, though somehow we added 4 extra lines while reducing
> duplication. ;)

Yeah, but they are shorter. :)

>
>> +static void print_oid(const struct object_id *oid, int anonymize)
>> +{
>> +	const char *oid_hex = oid_to_hex(oid);
>> +	if (anonymize)
>> +		oid_hex = anonymize_oid(oid_hex);
>> +	fputs(oid_hex, stdout);
>> +}
>
> Would anyone ever pass anything except the global "anonymize" into this
> function (certainly neither of the new callers does). I get that it
> takes us on a possible road towards moving the globals to locals, but in
> the meantime, shadowing the global name just seems more confusing to me.

Good point.

>
>> @@ -470,21 +478,19 @@ static void show_filemodify(struct diff_queue_struct *q,
>>  		case DIFF_STATUS_TYPE_CHANGED:
>>  		case DIFF_STATUS_MODIFIED:
>>  		case DIFF_STATUS_ADDED:
>> +			printf("M %06o ", spec->mode);
>
> This makes the output a bit more lego-like (i.e., hard to see what the
> full line will look like from the code), but it already was pretty
> bad because of using print_path(). I think that's fine.

Yes, it was almost halfway to all-out lego style before, and the
patch moves it further in that direction.

>
>> @@ -724,12 +730,10 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
>>  		else
>>  			printf("merge ");
>>  		if (mark)
>> -			printf(":%d\n", mark);
>> +			printf(":%d", mark);
>
> This line gets repeated, too. I guess we could have a print_mark(), but
> there is really no logic here except "put a colon in front of it", so
> probably not worthwhile.

Right.

René


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-13 15:18 ` Taylor Blau
@ 2020-08-13 17:17   ` René Scharfe
  2020-08-13 17:25     ` Jeff King
  2020-08-13 18:19   ` Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: René Scharfe @ 2020-08-13 17:17 UTC
  To: Taylor Blau; +Cc: Git Mailing List, Junio C Hamano, Jeff King

Am 13.08.20 um 17:18 schrieb Taylor Blau:
> On Thu, Aug 13, 2020 at 01:11:18PM +0200, René Scharfe wrote:
>> Simplify the output code by splitting it up and reducing duplication.
>> Reuse the logic for printing object IDs -- anonymized if needed -- by
>> moving it to its own function, print_oid().
>>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>>  builtin/fast-export.c | 26 +++++++++++++++-----------
>>  1 file changed, 15 insertions(+), 11 deletions(-)
>>
>> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
>> index 9f37895d4cf..49bb50634ab 100644
>> --- a/builtin/fast-export.c
>> +++ b/builtin/fast-export.c
>> @@ -420,6 +420,14 @@ static const char *anonymize_oid(const char *oid_hex)
>>  	return anonymize_str(&objs, generate_fake_oid, oid_hex, len, NULL);
>>  }
>>
>> +static void print_oid(const struct object_id *oid, int anonymize)
>> +{
>> +	const char *oid_hex = oid_to_hex(oid);
>> +	if (anonymize)
>> +		oid_hex = anonymize_oid(oid_hex);
>> +	fputs(oid_hex, stdout);
>> +}
>> +
>
> The fact that this calls fputs makes this patch (in my own opinion)
> noisier than it needs to be. This is because of all of the factoring out
> of the other printfs. I'd expect that this looks something more like:
>
>   -				       anonymize ?
>   -				       anonymize_oid(oid_to_hex(&spec->oid)) :
>   -				       oid_to_hex(&spec->oid));
>   +				       anonymize_oid(anonymize, &spec->oid));
>
> without moving around all of the other printf code.

Moving that part to anonymize_oid() would reduce the line count while
still getting rid of the duplication.  But the function would need a
new name.

-- >8 --
Subject: [PATCH v2] fast-export: deduplicate anonymization handling

Move the code for converting an object_id to a hexadecimal string and
for handling of the default (not anonymizing) case from its callers to
anonymize_oid() and consequently rename it to anonymize_oid_if_needed().
This reduces code duplication.

Suggested-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 builtin/fast-export.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 9f37895d4cf..fcc3208727f 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -413,10 +413,13 @@ static char *generate_fake_oid(void *data)
 	return hash_to_hex_algop_r(hex, out, the_hash_algo);
 }

-static const char *anonymize_oid(const char *oid_hex)
+static const char *anonymize_oid_if_needed(const struct object_id *oid)
 {
 	static struct hashmap objs;
+	const char *oid_hex = oid_to_hex(oid);
 	size_t len = strlen(oid_hex);
+	if (!anonymize)
+		return oid_hex;
 	return anonymize_str(&objs, generate_fake_oid, oid_hex, len, NULL);
 }

@@ -476,9 +479,7 @@ static void show_filemodify(struct diff_queue_struct *q,
 			 */
 			if (no_data || S_ISGITLINK(spec->mode))
 				printf("M %06o %s ", spec->mode,
-				       anonymize ?
-				       anonymize_oid(oid_to_hex(&spec->oid)) :
-				       oid_to_hex(&spec->oid));
+				       anonymize_oid_if_needed(&spec->oid));
 			else {
 				struct object *object = lookup_object(the_repository,
 								      &spec->oid);
@@ -726,10 +727,7 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
 		if (mark)
 			printf(":%d\n", mark);
 		else
-			printf("%s\n",
-			       anonymize ?
-			       anonymize_oid(oid_to_hex(&obj->oid)) :
-			       oid_to_hex(&obj->oid));
+			printf("%s\n", anonymize_oid_if_needed(&obj->oid));
 		i++;
 	}

--
2.28.0


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-13 17:17   ` René Scharfe
@ 2020-08-13 17:25     ` Jeff King
  2020-08-15  7:14       ` René Scharfe
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff King @ 2020-08-13 17:25 UTC
  To: René Scharfe; +Cc: Taylor Blau, Git Mailing List, Junio C Hamano

On Thu, Aug 13, 2020 at 07:17:20PM +0200, René Scharfe wrote:

> -- >8 --
> Subject: [PATCH v2] fast-export: deduplicate anonymization handling
> 
> Move the code for converting an object_id to a hexadecimal string and
> for handling of the default (not anonymizing) case from its callers to
> anonymize_oid() and consequently rename it to anonymize_oid_if_needed().
> This reduces code duplication.

I think this is a bad direction unless you're going to do it for all of
the other anonymize_*() functions, as well, for consistency. And there
it gets tricky because the caller is able to use the anonymizing
knowledge in more places.

I actually liked your original version better.

-Peff


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-13 15:18 ` Taylor Blau
  2020-08-13 17:17   ` René Scharfe
@ 2020-08-13 18:19   ` Junio C Hamano
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-08-13 18:19 UTC
  To: Taylor Blau; +Cc: René Scharfe, Git Mailing List

Taylor Blau <me@ttaylorr.com> writes:

>> +static void print_oid(const struct object_id *oid, int anonymize)
>> +{
>> +	const char *oid_hex = oid_to_hex(oid);
>> +	if (anonymize)
>> +		oid_hex = anonymize_oid(oid_hex);
>> +	fputs(oid_hex, stdout);
>> +}
>> +
>
> The fact that this calls fputs makes this patch (in my own opinion)
> noisier than it needs to be. This is because of all of the factoring out
> of the other printfs. I'd expect that this looks something more like:
>
>   -				       anonymize ?
>   -				       anonymize_oid(oid_to_hex(&spec->oid)) :
>   -				       oid_to_hex(&spec->oid));
>   +				       anonymize_oid(anonymize, &spec->oid));
>
> without moving around all of the other printf code.
> ...
> So, I guess what I'm trying to say is that this patch doesn't look
> wrong, other than that it seems more invasive than I would have expected
> it to be.

Yes, that matches my knee-jerk reaction.  I also shared Peff's
reaction that the code to generate the message is now even more
fragmented into pieces.

Just for comparison purposes, I tried to fold anonymize_oid()'s body
into René's print_oid() and adjusted the calling sites, which did
not look too bad.

So, I dunno.

 builtin/fast-export.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 9f37895d4c..bef2c01bd8 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -413,11 +413,16 @@ static char *generate_fake_oid(void *data)
 	return hash_to_hex_algop_r(hex, out, the_hash_algo);
 }
 
-static const char *anonymize_oid(const char *oid_hex)
+static const char *anonymize_oid(const struct object_id *oid, int anonymize)
 {
-	static struct hashmap objs;
-	size_t len = strlen(oid_hex);
-	return anonymize_str(&objs, generate_fake_oid, oid_hex, len, NULL);
+	const char *oid_hex = oid_to_hex(oid);
+	if (anonymize) {
+		static struct hashmap objs;
+		size_t len = strlen(oid_hex);
+		return anonymize_str(&objs, generate_fake_oid, oid_hex, len, NULL);
+	} else {
+		return oid_hex;
+	}
 }
 
 static void show_filemodify(struct diff_queue_struct *q,
@@ -476,9 +481,7 @@ static void show_filemodify(struct diff_queue_struct *q,
 			 */
 			if (no_data || S_ISGITLINK(spec->mode))
 				printf("M %06o %s ", spec->mode,
-				       anonymize ?
-				       anonymize_oid(oid_to_hex(&spec->oid)) :
-				       oid_to_hex(&spec->oid));
+				       anonymize_oid(&spec->oid, anonymize));
 			else {
 				struct object *object = lookup_object(the_repository,
 								      &spec->oid);
@@ -726,10 +729,7 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
 		if (mark)
 			printf(":%d\n", mark);
 		else
-			printf("%s\n",
-			       anonymize ?
-			       anonymize_oid(oid_to_hex(&obj->oid)) :
-			       oid_to_hex(&obj->oid));
+			printf("%s\n", anonymize_oid(&obj->oid, anonymize));
 		i++;
 	}
 


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-13 17:25     ` Jeff King
@ 2020-08-15  7:14       ` René Scharfe
  2020-08-17 22:08         ` Jeff King
  0 siblings, 1 reply; 10+ messages in thread
From: René Scharfe @ 2020-08-15  7:14 UTC
  To: Jeff King; +Cc: Taylor Blau, Git Mailing List, Junio C Hamano

Am 13.08.20 um 19:25 schrieb Jeff King:
> On Thu, Aug 13, 2020 at 07:17:20PM +0200, René Scharfe wrote:
>
>> -- >8 --
>> Subject: [PATCH v2] fast-export: deduplicate anonymization handling
>>
>> Move the code for converting an object_id to a hexadecimal string and
>> for handling of the default (not anonymizing) case from its callers to
>> anonymize_oid() and consequently rename it to anonymize_oid_if_needed().
>> This reduces code duplication.
>
> I think this is a bad direction unless you're going to do it for all of
> the other anonymize_*() functions, as well, for consistency. And there
> it gets tricky because the caller is able to use the anonymizing
> knowledge in more places.
>
> I actually liked your original version better.

OK, how about embracing the static and doing something like this?

-- >8 --
Subject: [PATCH] fast-export: add format_oid() and format_path()

Add functions that handle anonymization, quoting and formatting of paths
and object IDs and return static buffers fit for use with printf().
Use them to generate each output line containing these components with a
single printf() format specification, which is easier to read.

format_oid() inherits the ability to be used for four different object
IDs in parallel from oid_to_hex() -- but here we only need one anyway.

format_path() has two sets of static buffers, which is just enough for
our purposes here.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 builtin/fast-export.c | 86 ++++++++++++++++++++++---------------------
 1 file changed, 45 insertions(+), 41 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 9f37895d4cf..a9e36dccf9e 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -368,17 +368,6 @@ static int depth_first(const void *a_, const void *b_)
 	return (a->status == 'R') - (b->status == 'R');
 }

-static void print_path_1(const char *path)
-{
-	int need_quote = quote_c_style(path, NULL, NULL, 0);
-	if (need_quote)
-		quote_c_style(path, NULL, stdout, 0);
-	else if (strchr(path, ' '))
-		printf("\"%s\"", path);
-	else
-		printf("%s", path);
-}
-
 static char *anonymize_path_component(void *data)
 {
 	static int counter;
@@ -387,18 +376,34 @@ static char *anonymize_path_component(void *data)
 	return strbuf_detach(&out, NULL);
 }

-static void print_path(const char *path)
+static const char *format_path(const char *path)
 {
-	if (!anonymize)
-		print_path_1(path);
-	else {
-		static struct hashmap paths;
-		static struct strbuf anon = STRBUF_INIT;
-
-		anonymize_path(&anon, path, &paths, anonymize_path_component);
-		print_path_1(anon.buf);
-		strbuf_reset(&anon);
+	static struct hashmap paths;
+	static struct strbuf anon_buffers[] = { STRBUF_INIT, STRBUF_INIT };
+	static struct strbuf quoted_buffers[] = { STRBUF_INIT, STRBUF_INIT };
+	static int which_buffer;
+	struct strbuf *anon = &anon_buffers[which_buffer];
+	struct strbuf *quoted = &quoted_buffers[which_buffer];
+
+	which_buffer++;
+	which_buffer %= ARRAY_SIZE(anon_buffers) + BUILD_ASSERT_OR_ZERO(
+			ARRAY_SIZE(anon_buffers) == ARRAY_SIZE(quoted_buffers));
+
+	if (anonymize) {
+		strbuf_reset(anon);
+		anonymize_path(anon, path, &paths, anonymize_path_component);
+		path = anon->buf;
+	}
+
+	strbuf_reset(quoted);
+	if (quote_c_style(path, quoted, NULL, 0))
+		return quoted->buf;
+	if (strchr(path, ' ')) {
+		strbuf_reset(quoted);
+		strbuf_addf(quoted, "\"%s\"", path);
+		return quoted->buf;
 	}
+	return path;
 }

 static char *generate_fake_oid(void *data)
@@ -420,6 +425,14 @@ static const char *anonymize_oid(const char *oid_hex)
 	return anonymize_str(&objs, generate_fake_oid, oid_hex, len, NULL);
 }

+static const char *format_oid(const struct object_id *oid)
+{
+	const char *oid_hex = oid_to_hex(oid);
+	if (anonymize)
+		oid_hex = anonymize_oid(oid_hex);
+	return oid_hex;
+}
+
 static void show_filemodify(struct diff_queue_struct *q,
 			    struct diff_options *options, void *data)
 {
@@ -438,10 +451,8 @@ static void show_filemodify(struct diff_queue_struct *q,

 		switch (q->queue[i]->status) {
 		case DIFF_STATUS_DELETED:
-			printf("D ");
-			print_path(spec->path);
+			printf("D %s\n", format_path(spec->path));
 			string_list_insert(changed, spec->path);
-			putchar('\n');
 			break;

 		case DIFF_STATUS_COPIED:
@@ -454,12 +465,10 @@ static void show_filemodify(struct diff_queue_struct *q,
 			 * copy or rename only if there was no change observed.
 			 */
 			if (!string_list_has_string(changed, ospec->path)) {
-				printf("%c ", q->queue[i]->status);
-				print_path(ospec->path);
-				putchar(' ');
-				print_path(spec->path);
+				printf("%c %s %s\n", q->queue[i]->status,
+				       format_path(ospec->path),
+				       format_path(spec->path));
 				string_list_insert(changed, spec->path);
-				putchar('\n');

 				if (oideq(&ospec->oid, &spec->oid) &&
 				    ospec->mode == spec->mode)
@@ -475,19 +484,17 @@ static void show_filemodify(struct diff_queue_struct *q,
 			 * output the SHA-1 verbatim.
 			 */
 			if (no_data || S_ISGITLINK(spec->mode))
-				printf("M %06o %s ", spec->mode,
-				       anonymize ?
-				       anonymize_oid(oid_to_hex(&spec->oid)) :
-				       oid_to_hex(&spec->oid));
+				printf("M %06o %s %s\n", spec->mode,
+				       format_oid(&spec->oid),
+				       format_path(spec->path));
 			else {
 				struct object *object = lookup_object(the_repository,
 								      &spec->oid);
-				printf("M %06o :%d ", spec->mode,
-				       get_object_mark(object));
+				printf("M %06o :%d %s\n", spec->mode,
+				       get_object_mark(object),
+				       format_path(spec->path));
 			}
-			print_path(spec->path);
 			string_list_insert(changed, spec->path);
-			putchar('\n');
 			break;

 		default:
@@ -726,10 +733,7 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
 		if (mark)
 			printf(":%d\n", mark);
 		else
-			printf("%s\n",
-			       anonymize ?
-			       anonymize_oid(oid_to_hex(&obj->oid)) :
-			       oid_to_hex(&obj->oid));
+			printf("%s\n", format_oid(&obj->oid));
 		i++;
 	}

--
2.28.0
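
The rotating static buffers described in the commit message can be
sketched in isolation, assuming fixed-size char arrays instead of strbufs.
quote_rotating() and both constants are hypothetical names, and unlike
format_path() this sketch always quotes and never anonymizes.

```c
#include <stdio.h>

#define NR_BUFFERS 2
#define BUFFER_LEN 64

/*
 * Hypothetical stand-in for format_path(): quote the path into one of
 * NR_BUFFERS static buffers, rotating on each call, so the results of
 * two consecutive calls stay valid at the same time.
 * Paths that do not fit in BUFFER_LEN are silently truncated here.
 */
static const char *quote_rotating(const char *path)
{
	static char buffers[NR_BUFFERS][BUFFER_LEN];
	static int which_buffer;
	char *buf = buffers[which_buffer];

	which_buffer = (which_buffer + 1) % NR_BUFFERS;
	snprintf(buf, BUFFER_LEN, "\"%s\"", path);
	return buf;
}
```

With two buffers, a single printf() can take two results at once, as the
copy/rename line in the patch does; a third call in the same statement
would overwrite the first result.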


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-15  7:14       ` René Scharfe
@ 2020-08-17 22:08         ` Jeff King
  2020-08-17 22:53           ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff King @ 2020-08-17 22:08 UTC
  To: René Scharfe; +Cc: Taylor Blau, Git Mailing List, Junio C Hamano

On Sat, Aug 15, 2020 at 09:14:49AM +0200, René Scharfe wrote:

> > I think this is a bad direction unless you're going to do it for all of
> > the other anonymize_*() functions, as well, for consistency. And there
> > it gets tricky because the caller is able to use the anonymizing
> > knowledge in more places.
> >
> > I actually liked your original version better.
> 
> OK, how about embracing the static and do something like this?
> 
> -- >8 --
> Subject: [PATCH] fast-export: add format_oid() and format_path()

TBH, I don't find it an improvement because of the extra buffer
handling. But I admit that I don't have a strong preference among the
solutions posted here. They all appear to be correct, and just trade
off various properties, so none is definitively better than the
others. (And none of them is so bad that I feel compelled to avoid it.)

So I'd be OK with any of them (or leaving it as-is).

-Peff


* Re: [PATCH] fast-export: factor out print_oid()
  2020-08-17 22:08         ` Jeff King
@ 2020-08-17 22:53           ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-08-17 22:53 UTC
  To: Jeff King; +Cc: René Scharfe, Taylor Blau, Git Mailing List

Jeff King <peff@peff.net> writes:

> On Sat, Aug 15, 2020 at 09:14:49AM +0200, René Scharfe wrote:
>
>> > I think this is a bad direction unless you're going to do it for all of
>> > the other anonymize_*() functions, as well, for consistency. And there
>> > it gets tricky because the caller is able to use the anonymizing
>> > knowledge in more places.
>> >
>> > I actually liked your original version better.
>> 
>> OK, how about embracing the static and do something like this?
>> 
>> -- >8 --
>> Subject: [PATCH] fast-export: add format_oid() and format_path()
>
> TBH, I don't find it an improvement because of the extra buffer
> handling. But I admit that I don't really care between any of the
> solutions posted here. They all appear to be correct, and just trading
> off various properties so that none is definitively better than the
> other. (And none of them is so bad that I feel compelled to avoid it).
>
> So I'd be OK with any of them (or leaving it as-is).
>
> -Peff

I've marked it as "retracted" per
https://lore.kernel.org/git/6e2d4472-8293-4f10-0ba6-82ae83f7a465@web.de/

Thanks.
