git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Eric Sunshine <sunshine@sunshineco.com>, Git List <git@vger.kernel.org>
Subject: Re: [PATCH 4/8] add functions for memory-efficient bitmaps
Date: Tue, 01 Jul 2014 09:57:13 -0700	[thread overview]
Message-ID: <xmqqegy5xbiu.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <20140630170732.GA16747@sigill.intra.peff.net> (Jeff King's message of "Mon, 30 Jun 2014 13:07:32 -0400")

Jeff King <peff@peff.net> writes:

> On Sun, Jun 29, 2014 at 03:41:37AM -0400, Eric Sunshine wrote:
>
>> > +static inline void bitset_set(unsigned char *bits, int n)
>> > +{
>> > +       bits[n / CHAR_BIT] |= 1 << (n % CHAR_BIT);
>> > +}
>> 
>> Is it intentional or an oversight that there is no way to clear a bit
>> in the set?
>
> Intentional in the sense that I had no need for it in my series, and I
> didn't think about it. I doubt many callers would want it, since commit
> traversals tend to propagate bits through the graph, and then clean them
> up all at once. And the right way to clean up slabbed data like this is
> to just clear the slab.
>
> Of course somebody may use the code for something besides commit
> traversals. But I'd rather avoid adding dead code on the off chance that
> somebody uses it later (and then gets to find out whether it even works
> or not!).

Another thing I noticed was that the definition of and the
commentary on bitset_equal() and bitset_empty() sounded somewhat
"undecided".  These functions take "max" that is deliberately named
differently from "num_bits" (the width of the bitsets involved),
inviting to use them for testing only earlier bits in the bitset as
long as the caller understands the caveat, but the caveat requires
that the partial bitset to test must be byte-aligned, which makes it
not very useful in practice, which means we probably do not want
them to be used for any "max" other than "num_bits".

They probably would want either:

 * be made to truly honor max < num_bits case, by special casing the
   last byte that has max-th bit, to officially allow them to be
   used for partial bitset test; or

 * take "num_bits", not "max", to clarify that callers must use them
   only on the full bitset.

In either case, there needs another item in the "caller's responsibility"
list at the beginning of bitset.h:

    4. Ensure that padding bits at the end of the bitset array are
       initialized to 0.

In the description of bitset_sizeof(), the comment hints it by using
xcalloc() in the example, but a careless user may be tempted to
implement bitset_clr() and then do:

        int i;
        unsigned char *bits = malloc(bitset_sizeof(nr));
        for (i = 0; i < nr; i++)
        	bitset_clr(bits, i);
	assert(bitset_empty(bits, nr));

and the implementation of bitset_empty(), even if we rename
s/max/num_bits/, will choke if (nr % CHAR_BIT) and malloc() gave us
non-zero bit in the padding.

For the sake of simplicity, I am inclined to vote for not allowing
their use on a partial-sub-bitset.

  reply	other threads:[~2014-07-01 16:57 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
2014-06-25 23:35 ` [PATCH 1/8] tag: allow --sort with -n Jeff King
2014-06-25 23:35 ` [PATCH 2/8] tag: factor out decision to stream tags Jeff King
2014-06-25 23:39 ` [PATCH 3/8] paint_down_to_common: use prio_queue Jeff King
2014-07-01 16:23   ` Junio C Hamano
2014-07-01 17:10     ` Jeff King
2014-06-25 23:40 ` [PATCH 4/8] add functions for memory-efficient bitmaps Jeff King
2014-06-26  3:15   ` Torsten Bögershausen
2014-06-26 15:51     ` Jeff King
2014-06-29  7:41   ` Eric Sunshine
2014-06-30 17:07     ` Jeff King
2014-07-01 16:57       ` Junio C Hamano [this message]
2014-07-01 17:18         ` Jeff King
2014-06-25 23:42 ` [PATCH 5/8] string-list: add pos to iterator callback Jeff King
2014-07-01 17:45   ` Junio C Hamano
2014-07-01 19:00     ` Jeff King
2014-06-25 23:47 ` [PATCH 6/8] commit: provide a fast multi-tip contains function Jeff King
2014-06-26 18:55   ` Junio C Hamano
2014-06-26 19:19     ` Junio C Hamano
2014-06-26 19:26       ` Junio C Hamano
2014-07-01 18:16       ` Junio C Hamano
2014-07-01 19:14         ` Junio C Hamano
2014-06-25 23:49 ` [PATCH 7/8] tag: use commit_contains Jeff King
2014-06-25 23:53 ` [PATCH 8/8] perf: add tests for tag --contains Jeff King
2014-06-26  0:01   ` Jeff King
2014-06-26  0:04     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqegy5xbiu.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).