git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Eric Sunshine <sunshine@sunshineco.com>
To: Jeff King <peff@peff.net>
Cc: Git List <git@vger.kernel.org>
Subject: Re: [PATCH 4/8] add functions for memory-efficient bitmaps
Date: Sun, 29 Jun 2014 03:41:37 -0400	[thread overview]
Message-ID: <CAPig+cSc=A=+PR7oF43yeLpcd4n=Bd1KU1AHPfMKXEu5wAF4Ug@mail.gmail.com> (raw)
In-Reply-To: <20140625234000.GD23146@sigill.intra.peff.net>

On Wed, Jun 25, 2014 at 7:40 PM, Jeff King <peff@peff.net> wrote:
> We already have a nice-to-use bitmap implementation in
> ewah/bitmap.c. It pretends to be infinitely long when asking
> for a bit (and just returns 0 for bits that haven't been
> allocated or set), and dynamically resizes as appropriate
> when you set bits.
>
> The cost to this is that each bitmap must store its own
> pointer and length, using up to 16 bytes per bitmap on top
> of the actual bit storage. This is a lot of storage (not to
> mention an extra level of pointer indirection) if you are
> going to store one bitmap per commit in a traversal.
>
> These functions provide an alternative bitmap implementation
> that can be used when you have a large number of fixed-size
> bitmaps. See the documentation in the header file for
> details and examples.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/bitset.h b/bitset.h
> new file mode 100644
> index 0000000..5fbc956
> --- /dev/null
> +++ b/bitset.h
> @@ -0,0 +1,113 @@
> +#ifndef BITSET_H
> +#define BITSET_H
> +
> + * Return the number of unsigned chars required to store num_bits bits.
> + *
> + * This is mostly used internally by the bitset functions, but you may need it
> + * when allocating the bitset. Example:
> + *
> + *   bits = xcalloc(1, bitset_sizeof(nr));
> + */
> +static inline int bitset_sizeof(int num_bits)
> +{
> +       return (num_bits + CHAR_BIT - 1) / CHAR_BIT;
> +}
> +
> +/*
> + * Set the bit at position "n". "n" is counted from zero, and must be
> + * smaller than the num_bits given to bitset_sizeof when allocating the bitset.
> + */
> +static inline void bitset_set(unsigned char *bits, int n)
> +{
> +       bits[n / CHAR_BIT] |= 1 << (n % CHAR_BIT);
> +}

Is it intentional or an oversight that there is no way to clear a bit
in the set?

> +/*
> + * Return the bit at position "n" (see bitset_set for a description of "n").
> + */
> +static inline int bitset_get(unsigned char *bits, int n)
> +{
> +       return !!(bits[n / CHAR_BIT] & (1 << (n % CHAR_BIT)));
> +}
> +
> +/*
> + * Return true iff the bitsets contain the same bits. Each bitset should be the
> + * same size, and should have been allocated using bitset_sizeof(max).
> + *
> + * Note that it is not safe to check partial equality by providing a smaller
> + * "max" (we assume any bits beyond "max" up to the next CHAR_BIT boundary are
> + * zeroed padding).
> + */
> +static inline int bitset_equal(unsigned char *a, unsigned char *b, int max)
> +{
> +       int i;
> +       for (i = bitset_sizeof(max); i > 0; i--)
> +               if (*a++ != *b++)
> +                       return 0;
> +       return 1;
> +}
> +
> +/*
> + * Bitwise-or the bitsets in "dst" and "src", and store the result in "dst".
> + *
> + * See bitset_equal for the definition of "max".
> + */
> +static inline void bitset_or(unsigned char *dst, const unsigned char *src, int max)
> +{
> +       int i;
> +       for (i = bitset_sizeof(max); i > 0; i--)
> +               *dst++ |= *src++;
> +}
> +
> +/*
> + * Returns true iff the bitset contains all zeroes.
> + *
> + * See bitset_equal for the definition of "max".
> + */
> +static inline int bitset_empty(const unsigned char *bits, int max)
> +{
> +       int i;
> +       for (i = bitset_sizeof(max); i > 0; i--, bits++)
> +               if (*bits)
> +                       return 0;
> +       return 1;
> +}
> +
> +#endif /* BITSET_H */
> --
> 2.0.0.566.gfe3e6b2

  parent reply	other threads:[~2014-06-29  7:41 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
2014-06-25 23:35 ` [PATCH 1/8] tag: allow --sort with -n Jeff King
2014-06-25 23:35 ` [PATCH 2/8] tag: factor out decision to stream tags Jeff King
2014-06-25 23:39 ` [PATCH 3/8] paint_down_to_common: use prio_queue Jeff King
2014-07-01 16:23   ` Junio C Hamano
2014-07-01 17:10     ` Jeff King
2014-06-25 23:40 ` [PATCH 4/8] add functions for memory-efficient bitmaps Jeff King
2014-06-26  3:15   ` Torsten Bögershausen
2014-06-26 15:51     ` Jeff King
2014-06-29  7:41   ` Eric Sunshine [this message]
2014-06-30 17:07     ` Jeff King
2014-07-01 16:57       ` Junio C Hamano
2014-07-01 17:18         ` Jeff King
2014-06-25 23:42 ` [PATCH 5/8] string-list: add pos to iterator callback Jeff King
2014-07-01 17:45   ` Junio C Hamano
2014-07-01 19:00     ` Jeff King
2014-06-25 23:47 ` [PATCH 6/8] commit: provide a fast multi-tip contains function Jeff King
2014-06-26 18:55   ` Junio C Hamano
2014-06-26 19:19     ` Junio C Hamano
2014-06-26 19:26       ` Junio C Hamano
2014-07-01 18:16       ` Junio C Hamano
2014-07-01 19:14         ` Junio C Hamano
2014-06-25 23:49 ` [PATCH 7/8] tag: use commit_contains Jeff King
2014-06-25 23:53 ` [PATCH 8/8] perf: add tests for tag --contains Jeff King
2014-06-26  0:01   ` Jeff King
2014-06-26  0:04     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPig+cSc=A=+PR7oF43yeLpcd4n=Bd1KU1AHPfMKXEu5wAF4Ug@mail.gmail.com' \
    --to=sunshine@sunshineco.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).