Re: [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

From: Jeff King <peff@peff.net>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Jakub Narebski" <jnareb@gmail.com>, "Ted Ts'o" <tytso@mit.edu>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Clemens Buchacher" <drizzd@aon.at>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	"David Barr" <davidbarr@google.com>
Subject: Re: [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers
Date: Wed, 13 Jul 2011 16:08:15 -0400	[thread overview]
Message-ID: <20110713200814.GD31965@sigill.intra.peff.net> (raw)
In-Reply-To: <20110713175250.GA1448@elie>

On Wed, Jul 13, 2011 at 12:52:50PM -0500, Jonathan Nieder wrote:

> > +void dump_longs(void)
> > +{
> > +	int i;
> > +	for (i = 0; i < longs.size; i++) {
> > +		struct object_decoration *e = decoration_slot(&longs, i);
> > +		unsigned long *value = (unsigned long *)e->decoration;
> > +
> > +		/* empty hash slot */
> > +		if (!e->base)
> > +			continue;
> > +
> > +		printf("%s -> %lu\n", sha1_to_hex(e->base->sha1), *value);
> 
> ... a cast is needed to use it.  Makes some sense.
> 
> What alignment guarantees are there for the field, if any?  I'm
> especially worried about platforms like sparc32 where the pointer
> width is 32 bits but some types need to be aligned to 64 bits.

We're packing these structs into a heap-allocated byte array. Each
struct takes up the size of the struct as defined by the compiler, plus
$width bytes. So with a 32-bit pointer and 32-bit data, you are probably
looking at data fields offset by 32-bits. Which might be wrong on
sparc32.

The result of malloc is guaranteed to be aligned for any type. But for
arrays, I'm not sure how that is handled. The only thing that makes
sense to me is that a sparc compiler with something like:

  struct foo {
    struct bar *pointer; /* pointers are 32-bits */
    double value; /* assume doubles need to be 64-bit aligned */
  };

would have to actually put in 32-bits of padding to meet the alignment
goals (and possibly some padding at the end so that the followup struct
in an array is aligned properly). But I don't know how this is handled,
so I'm just guessing.

And if that is the case, then yeah, there may well be an alignment issue
here. It gets even worse if you try to store something with an odd
alignment, like a 3-byte sequence.

So perhaps the safest thing would be to always memcpy into a real,
properly aligned destination. I'd have to tweak the get_value interface
a little bit, but it's not a big deal. It's going to be a little bit
slower, of course, but I doubt the extra few instructions will be
measurable.

I have to say, though, between the alignment issues and the strict
aliasing, I am tempted to scrap this whole approach and just use macros
to define the few functions we need. It's not like these containers are
heterogenous, or that we have a ton of types. Right now we want to map
"void *" and "uint32_t". In the future, I'd like to map a 20-byte sha1.

Doing something like:

  #define DECLARE_DECORATION(name, type) \
    void decoration_get_##name(struct decoration_##name *, \
                               struct object *, \
                               type value); \
    int decoration_set_##name(struct decoration_##name *, \
                              struct object *, \
                              type *value);

  DECLARE_DECORATION(void, void *);
  DECLARE_DECORATION(uint32, uint32_t);

  /* and then do an IMPLEMENT_DECORATION macro inside decorate.c */

is tempting. Writing your functions inside macros like this is ugly, but
then we know it's right, because the compiler knows what the sizes are
at compile time. And the optimizer can do its job properly, because
we're not fiddling the types at runtime.

> >  	for (i = 0; i < idnums.size; i++) {
> > +		struct object_decoration *deco = decoration_slot(&idnums, i);
> >  		if (deco->base && deco->base->type == 1) {
> > -			mark = ptr_to_mark(deco->decoration);
> > -			if (fprintf(f, ":%"PRIu32" %s\n", mark,
> > +			uint32_t *mark = (uint32_t *)deco->decoration;
> > +			if (fprintf(f, ":%"PRIu32" %s\n", *mark,
> >  				sha1_to_hex(deco->base->sha1)) < 0) {
> 
> Is this okay according to strict aliasing rules?  Maybe it would be
> safer to write
> 
> 			uint32_t mark;
> 			memcpy(&mark, deco->decoration, sizeof(mark));
> 
> which generates the same code in current versions of gcc on x86 if I
> remember correctly.

I guess you didn't read my comments on v1 of the patch. :)

I'm not sure if it's OK or not. Curiously, doing this:

  uint32_t mark = *(uint32_t *)deco->decoration;

generates a warning under -fstrict-aliasing, but:

  uint32_t *mark = (uint32_t *)deco->decoration;
  /* now use *mark */;

does not. I'm not sure if there's a subtlety in the strict aliasing
rules that makes the latter OK, or if it is simply a bug that it doesn't
trigger the compiler warning.

Using memcpy would be the safest thing, both from an alignment and a
strict-aliasing point of view.

> > --- a/decorate.c
> > +++ b/decorate.c
> > @@ -14,44 +14,48 @@ static unsigned int hash_obj(const struct object *obj, unsigned int n)
> >  	return hash % n;
> >  }
> >  
> > -static void *insert_decoration(struct decoration *n, const struct object *base, void *decoration)
> > +static int insert_decoration(struct decoration *n, const struct object *base,
> > +			     const void *decoration, void *old)
> >  {
> >  	int size = n->size;
> > -	struct object_decoration *hash = n->hash;
> > +	unsigned long width = decoration_width(n);
> 
> Micronit: why not size_t?

No reason.

> >  	unsigned int j = hash_obj(base, size);
> >  
> > -	while (hash[j].base) {
> > -		if (hash[j].base == base) {
> > -			void *old = hash[j].decoration;
> > -			hash[j].decoration = decoration;
> > -			return old;
> > +	while (1) {
> 
> Microoptimization: the modulo operation can avoid the conditional (j >= size):
> 
> 	for (j = hash_obj(base, size); ; j = (j + 1) % size) {
> 	}

Yeah, that would work. I doubt the performance is measurable. I couldn't
even get a measurable difference in iterating using pointers versus
using indices and converting them to pointers during each run through
the loop. So I suspect we simply don't actually go through this loop
very many times (i.e., the hash is doing its job and entries are spread
out appropriately). So micro-optimizing is probably not worth it.

> By the way, how do we know this loop will terminate?  Is it because
> the insertion function is careful to make sure the table never gets
> filled?

Exactly. See grow_decoration. I had the same question when I looked at
the loop (the termination condition is part of the original code, which
is Linus's).

> > +void *lookup_decoration(const struct decoration *n, const struct object *obj)
> > +{
> > +	void **v;
> > +
> > +	v = lookup_decoration_value(n, obj);
> > +	if (!v)
> > +		return NULL;
> > +	return *v;
> > +}
> 
> Maybe memcpy to avoid alignment problems?
> 
> 	unsigned char *p;
> 	void *v;
> 
> 	p = lookup_decoration_value(n, obj);
> 	if (!p)
> 		return NULL;
> 	memcpy(v, p, sizeof(v));
> 	return v;

Yeah, that would solve it.

> > +void *lookup_decoration_value(const struct decoration *n,
> > +			      const struct object *obj)
> >  {
> >  	unsigned int j;
> >  
> > @@ -77,7 +101,7 @@ void *lookup_decoration(struct decoration *n, const struct object *obj)
> >  		return NULL;
> >  	j = hash_obj(obj, n->size);
> >  	for (;;) {
> > -		struct object_decoration *ref = n->hash + j;
> > +		struct object_decoration *ref = decoration_slot(n, j);
> >  		if (ref->base == obj)
> >  			return ref->decoration;
> 
> I worry that this could have alignment trouble anyway.

I don't think it has alignment problems with the value inside the
struct, since we are just passing a pointer back to it as a void *; it
is the caller who will dereference it. We could pass it back as an
unsigned char *, to make it clear to the caller that they need to
memcpy.

But if you mean there might be an alignment issue looking at the ->base
field of the "struct object_decoration", then yeah, I am not sure of
that.

> >  struct object_decoration {
> >  	const struct object *base;
> > -	void *decoration;
> > +	unsigned char decoration[FLEX_ARRAY];
> >  };
> 
> On some platforms, this becomes
> 
> 	struct object_decoration {
> 		const struct object *base;
> 		unsigned char decoration[];
> 	};
> 
> which I hope would create a type with the alignment of a pointer
> (generally sufficient except in odd cases like sparc32).  But on
> old-fashioned platforms, it is
> 
> 	struct object_decoration {
> 		const struct object *base;
> 		unsigned char decoration[1];
> 	};
> 
> Will that be a problem, or is it standard for compilers to be smart
> enough to pad to a nice alignment?

I don't know. Thanks for mentioning it; it was another issue I had
noticed while writing, but forgot to bring up when I posted the patches.

> If we're willing to incur the cost of a copy that assumes unaligned
> objects, perhaps
> 
> 	extern int lookup_decoration_value(const struct decoration *n,
> 				const struct object *obj,
> 				void *result, size_t width);
> 
> would be safer.

Agreed.

> Aside from the alignment and strict-aliasing worries, this looks very
> nice.  Thanks for writing it.

Thanks for the review. When I wrote it, and even now, I'm still very
unsure that the alignment and aliasing issues are right. It seems to
work so far for me, but:

  1. I'm on x86_64, which is not one of the more oddball platforms for
     alignment issues.

  2. I'm putting in "void *" and "uint32_t" values. Those are about as
     vanilla as you can get. But I don't want to leave a time bomb for
     somebody who tries to store a 3-byte sequence.

So it makes me a bit nervous, and why I'm very tempted by the ugly macro
solution.

-Peff

next prev parent reply	other threads:[~2011-07-13 20:08 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-13  6:47 [RFC/PATCHv2 0/6] generation numbers for faster traversals Jeff King
2011-07-13  6:57 ` [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers Jeff King
2011-07-13 17:52   ` Jonathan Nieder
2011-07-13 20:08     ` Jeff King [this message]
2011-07-14 17:34       ` Jeff King
2011-07-14 17:51         ` [PATCH 1/3] implement generic key/value map Jeff King
2011-07-14 18:52           ` Bert Wesarg
2011-07-14 18:54             ` Bert Wesarg
2011-07-14 18:55               ` Jeff King
2011-07-14 19:07                 ` Bert Wesarg
2011-07-14 19:14                   ` Jeff King
2011-07-14 19:18                     ` Bert Wesarg
2011-07-14 17:52         ` [PATCH 2/3] fast-export: use object to uint32 map instead of "decorate" Jeff King
2011-07-15  9:40           ` Sverre Rabbelier
2011-07-15 20:00             ` Jeff King
2011-07-14 17:53         ` [PATCH 3/3] decorate: use "map" for the underlying implementation Jeff King
2011-07-14 21:06         ` [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers Junio C Hamano
2011-08-04 22:43           ` [RFC/PATCH 0/5] macro-based key/value maps Jeff King
2011-08-04 22:45             ` [PATCH 1/5] implement generic key/value map Jeff King
2011-08-04 22:46             ` [PATCH 2/5] fast-export: use object to uint32 map instead of "decorate" Jeff King
2011-08-04 22:46             ` [PATCH 3/5] decorate: use "map" for the underlying implementation Jeff King
2011-08-04 22:46             ` [PATCH 4/5] map: implement persistent maps Jeff King
2011-08-04 22:46             ` [PATCH 5/5] implement metadata cache subsystem Jeff King
2011-08-04 22:48             ` [RFC/PATCH 0/2] patch-id caching Jeff King
2011-08-04 22:49               ` [PATCH 1/2] cherry: read default config Jeff King
2011-08-04 22:49               ` [PATCH 2/2] cache patch ids on disk Jeff King
2011-08-04 22:52                 ` Jeff King
2011-08-05 11:03             ` [RFC/PATCH 0/5] macro-based key/value maps Jeff King
2011-08-05 15:31               ` René Scharfe
2011-08-06  6:30                 ` Jeff King
2011-07-13  7:04 ` [RFC/PATCHv2 2/6] add metadata-cache infrastructure Jeff King
2011-07-13  8:18   ` Bert Wesarg
2011-07-13  8:31     ` Jeff King
2011-07-13  8:45       ` Bert Wesarg
2011-07-13 19:18         ` Jeff King
2011-07-13 19:40       ` Junio C Hamano
2011-07-13 19:33   ` Junio C Hamano
2011-07-13 20:25     ` Jeff King
2011-07-13  7:05 ` [RFC/PATCHv2 3/6] commit: add commit_generation function Jeff King
2011-07-13 14:26   ` Eric Sunshine
2011-07-13  7:05 ` [RFC/PATCHv2 4/6] pretty: support %G to show the generation number of a commit Jeff King
2011-07-13  7:06 ` [RFC/PATCHv2 5/6] check commit generation cache validity against grafts Jeff King
2011-07-13 14:26   ` Eric Sunshine
2011-07-13 19:35     ` Jeff King
2011-07-13  7:06 ` [RFC/PATCHv2 6/6] limit "contains" traversals based on commit generation Jeff King
2011-07-13  7:23   ` Jeff King
2011-07-13 20:33     ` Junio C Hamano
2011-07-13 20:58       ` Jeff King
2011-07-13 21:12         ` Junio C Hamano
2011-07-13 21:18           ` Jeff King
2011-07-15 18:22   ` Junio C Hamano
2011-07-15 20:40     ` Jeff King
2011-07-15 21:04       ` Junio C Hamano
2011-07-15 21:14         ` Jeff King
2011-07-15 21:01 ` Generation numbers and replacement objects Jakub Narebski
2011-07-15 21:10   ` Jeff King
2011-07-16 21:10     ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110713200814.GD31965@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=davidbarr@google.com \
    --cc=drizzd@aon.at \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=spearce@spearce.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).