git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>,
	Kaartic Sivaram <kaartic.sivaraam@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Subject: [PATCH 0/5] [RFC] introduce Roaring bitmaps to Git
Date: Mon, 19 Sep 2022 17:47:34 +0000	[thread overview]
Message-ID: <pull.1357.git.1663609659.gitgitgadget@gmail.com> (raw)

Git currently uses ewah bitmaps ( which are based on run-length encoding) to
compress bitmaps. Ewah bitmaps stores bitmaps in the form of run-length
words i.e. instead of storing each and every bit, it tries to find
consecutive bits (having same value) and replace them with the value bit and
the range upto which the bit is present. It is simple and efficient. But one
downside of this approach is that we have to decompress the whole bitmap in
order to find the bit of a certain position.

For small (or medium sized) bitmaps, this is not an issue. But it can be an
issue for large (or extra large) bitmaps. In that case roaring bitmaps are
generally more efficient[1] than ewah itself. Some benchmarks suggests that
roaring bitmaps give more performance benefits than ewah or any other
similar compression technique.

This patch series is currently in RFC state and it aims to let Git use
roaring bitmaps. As this is an RFC patch series (for now), the code are not
fully accurate (i.e. some tests are failing). But it is backward-compatible
(tests related to ewah bitmaps are passing). Some commit messages might need
more explanation and some commits may need a split (specially the one that
implement writing roaring bitmaps). Overall, the structure and code are near
to ready to make the series a formal patch series.

I am submitting it as an RFC (after discussions with mentors) because the
GSoC coding period is about to end. I will continue to work on the patch
series.

Abhradeep Chakraborty (5):
  reachability-bitmaps: add CRoaring library to Git
  roaring.[ch]: apply Git specific changes to the roaring API
  roaring: teach Git to write roaring bitmaps
  roaring: introduce a new config option for roaring bitmaps
  roaring: teach Git to read roaring bitmaps

 Makefile                   |     3 +
 bitmap.c                   |   225 +
 bitmap.h                   |    33 +
 builtin/diff.c             |    10 +-
 builtin/multi-pack-index.c |     5 +
 builtin/pack-objects.c     |    81 +-
 ewah/bitmap.c              |    61 +-
 ewah/ewok.h                |    37 +-
 midx.c                     |     7 +
 midx.h                     |     1 +
 pack-bitmap-write.c        |   326 +-
 pack-bitmap.c              |   969 +-
 pack-bitmap.h              |    27 +-
 roaring/roaring.c          | 20047 +++++++++++++++++++++++++++++++++++
 roaring/roaring.h          |  1028 ++
 t/t5310-pack-bitmaps.sh    |    79 +-
 16 files changed, 22490 insertions(+), 449 deletions(-)
 create mode 100644 bitmap.c
 create mode 100644 bitmap.h
 create mode 100644 roaring/roaring.c
 create mode 100644 roaring/roaring.h


base-commit: d3fa443f97e3a8d75b51341e2d5bac380b7422df
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1357%2FAbhra303%2Froaring-bitmap-exp-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1357/Abhra303/roaring-bitmap-exp-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1357
-- 
gitgitgadget

             reply	other threads:[~2022-09-19 17:47 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-19 17:47 Abhradeep Chakraborty via GitGitGadget [this message]
2022-09-19 17:47 ` [PATCH 1/5] reachability-bitmaps: add CRoaring library to Git Abhradeep Chakraborty via GitGitGadget
2022-09-19 17:47 ` [PATCH 2/5] roaring.[ch]: apply Git specific changes to the roaring API Abhradeep Chakraborty via GitGitGadget
2022-09-19 18:33   ` Derrick Stolee
2022-09-19 22:02     ` Junio C Hamano
2022-09-20 12:19       ` Derrick Stolee
2022-09-20 15:09         ` Abhradeep Chakraborty
2022-09-21 16:58         ` Junio C Hamano
2022-09-20 14:46     ` Abhradeep Chakraborty
2022-09-19 17:47 ` [PATCH 3/5] roaring: teach Git to write roaring bitmaps Abhradeep Chakraborty via GitGitGadget
2022-09-30  6:20   ` Junio C Hamano
2022-09-30 16:23     ` Abhradeep Chakraborty
2022-10-30  6:35       ` Abhradeep Chakraborty
2022-10-30 19:46         ` Derrick Stolee
2022-10-31 14:30           ` Abhradeep Chakraborty
2022-10-31 16:06           ` Junio C Hamano
2022-10-31 17:51             ` C99 -> C11 or C17? (was: [PATCH 3/5] roaring: teach Git to write roaring bitmaps) Ævar Arnfjörð Bjarmason
2022-10-31 20:55               ` rsbecker
2022-11-01  6:58             ` [PATCH 3/5] roaring: teach Git to write roaring bitmaps Abhradeep Chakraborty
2022-09-19 17:47 ` [PATCH 4/5] roaring: introduce a new config option for " Abhradeep Chakraborty via GitGitGadget
2022-09-19 17:47 ` [PATCH 5/5] roaring: teach Git to read " Abhradeep Chakraborty via GitGitGadget
2022-09-19 18:18 ` [PATCH 0/5] [RFC] introduce Roaring bitmaps to Git Derrick Stolee
2022-09-20 14:05   ` Abhradeep Chakraborty
2022-09-20 21:59 ` Taylor Blau
2022-09-21 15:27   ` Abhradeep Chakraborty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1357.git.1663609659.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=chakrabortyabhradeep79@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=kaartic.sivaraam@gmail.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).