git@vger.kernel.org mailing list mirror (one of many)
* [GSoC] Abhradeep's GSoC blogs (25 Jul, 2022 IST)
@ 2022-07-25 15:23 Abhradeep Chakraborty
  2022-07-25 21:52 ` Taylor Blau
  0 siblings, 1 reply; 2+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-25 15:23 UTC
  To: git; +Cc: Abhradeep Chakraborty, Taylor Blau, Kaartic Sivaraam

Hello developers, this is the thread where you can find links to
my weekly GSoC blog posts.

My Project - Reachability bitmap improvements

Blog update
------------

Title - GSoC Week 6: using CRoaring library
Blog link - https://medium.com/@abhra303/gsoc-week-6-using-croaring-library-be309cfa89f5

Summary -

I missed the week 5 blog update, so this blog covers both
week 5 and week 6. I submitted the latest version of my
`lookup-table-extension` patch series. There are some issues
with CRoaring, e.g. it does not store data in network byte order (which I
confirmed in the RoaringBitmap Google group[1]). So I need to make
some changes to fix that. I have already finished implementing
`roaring_portable_network_serialize` and `..._deserialize`. My
next step is to use these functions in Git's codebase. I will
submit the patch series soon.
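
To illustrate the byte-order issue described above, here is a minimal
Python sketch. It is not CRoaring's serialization format and the helper
names are hypothetical; it only shows that the same unsigned 32-bit
values produce different bytes in little-endian and network (big-endian)
order, so a reader that assumes the wrong order gets different numbers.

```python
import struct

def serialize_u32s(values, network_order=True):
    """Pack unsigned 32-bit integers.

    '>' selects big-endian (network order), '<' selects little-endian.
    """
    prefix = ">" if network_order else "<"
    return struct.pack(f"{prefix}{len(values)}I", *values)

def deserialize_u32s(data, network_order=True):
    """Inverse of serialize_u32s for the same byte order."""
    prefix = ">" if network_order else "<"
    count = len(data) // 4
    return list(struct.unpack(f"{prefix}{count}I", data))

values = [1, 258, 65536]
net = serialize_u32s(values, network_order=True)   # big-endian bytes
le = serialize_u32s(values, network_order=False)   # little-endian bytes
```

Reading `le` back with `network_order=True` does not return `values`,
which is the kind of mismatch a fixed on-disk byte order avoids.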

Previous blogs 
---------------

-------------------------------------------------------

Title - GSoC Week 4: diving into roaring bitmaps
Blog link - https://medium.com/@abhra303/gsoc-week-4-diving-into-roaring-bitmaps-f028f931d873

Summary -

I am thinking of submitting a patch to explain the workings
of bitmaps. I will be creating a new file
`technical/reachability-bitmaps.txt` for that. This week I spent my
time diving deeper into CRoaring[1]. I tried to understand how it works
internally, the functions it offers, its serialization format, etc.
The serialization format[2] seems fine to me, but I still want to
know Kaartic's and Taylor's opinions. Another thing I noticed is
that each Roaring bitmap is designed to store a set of 32-bit
(unsigned) integers. Thus a Roaring bitmap can contain up to 4294967296
integers. I am not sure if this is sufficient for us.
My next step is to create the new bitmap format version 2 (with Roaring
bitmaps) and modify the rest of the code so that it can accept
the new bitmap format version.
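
As a quick check of the 32-bit limit mentioned above, the following
Python sketch computes the number of distinct unsigned 32-bit values.
The split into a high and a low 16-bit half reflects the general design
of Roaring bitmaps (values are grouped by their upper 16 bits); the
function name is hypothetical and this is not CRoaring code.

```python
U32_COUNT = 2 ** 32        # number of distinct unsigned 32-bit values
U32_MAX = U32_COUNT - 1    # largest value that fits in 32 bits

def split_u32(value):
    """Split a 32-bit value into (high 16 bits, low 16 bits)."""
    if not 0 <= value <= U32_MAX:
        raise ValueError("value does not fit in 32 unsigned bits")
    return value >> 16, value & 0xFFFF
```

Any position larger than `U32_MAX` is rejected here, which is the case
the "is this sufficient" question is about.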

-------------------------------------------------------
Title - GSoC Week 3: working on further improvements
Blog link - https://medium.com/@abhra303/gsoc-week-3-working-on-further-improvements-13a27db64cd5

Summary -

This week, I continued to work on further improvements to
the bitmap-lookup-table patch series. Some of the requested
changes are (1) improve the documentation and fix typos, (2) add
comments, (3) disable `pack.writeBitmapLookupTable` by default,
(4) fix alignment issues, (5) make a `bitmap_lookup_table_triple`
struct, and (6) subtract the table_size from index_end irrespective of
the value of GIT_TEST_READ_COMMIT_TABLE.

After implementing all the requested changes, I started working
on the idea I mentioned in my previous blog as my next step. The
idea is to stop the xor stack filling loop if the current xor
bitmap is already stored and assign `xor_bitmap` to it. As this
bitmap is already stored, we don't need to iterate further, since we
know that all the other bitmaps needed to parse this bitmap
have already been stored.
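
The early-stop idea above can be sketched roughly as follows. This is
an illustration in Python, not Git's C implementation: plain integers
stand in for the real bitmaps, and `resolve_bitmap`, `entries` and
`stored` are hypothetical names.

```python
def resolve_bitmap(pos, entries, stored):
    """Return (bitmap, pushed) for the entry at `pos`.

    entries: dict mapping position -> (raw_bits, xor_pos or None)
    stored:  dict mapping position -> fully resolved bitmap (a cache)
    pushed:  number of entries that had to be put on the stack
    """
    stack = []
    base = 0
    cur = pos
    # Fill the stack by following xor links, but stop as soon as we
    # reach a bitmap that is already stored.
    while cur is not None:
        if cur in stored:
            base = stored[cur]
            break
        stack.append(cur)
        cur = entries[cur][1]
    pushed = len(stack)
    # Unwind: each bitmap is its raw bits XORed with its resolved base.
    while stack:
        p = stack.pop()
        base ^= entries[p][0]
        stored[p] = base
    return base, pushed

entries = {0: (0b1010, None), 1: (0b0110, 0), 2: (0b0001, 1)}
stored = {}
first = resolve_bitmap(1, entries, stored)   # walks 1 -> 0
second = resolve_bitmap(2, entries, stored)  # stops at 1, already stored
```

In the second call only one entry is pushed, because the walk stops at
the bitmap resolved by the first call.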

My next step is to roughly implement roaring run bitmaps and
run performance tests to check if it's really worth it.

-------------------------------------------------------
Title - GSoC Week 2: redesign the table format
Blog link - https://medium.com/@abhra303/gsoc-week-2-redesign-the-table-format-829dae755a5

Summary - 

Last week, I worked on the reviews. Some of the major requested
changes are (1) use commit positions instead of commit oids in
the table, (2) use 8-byte offset positions instead of 4-byte ones,
(3) use an iterative approach for parsing xor bitmaps, and (4) use
`<commit_pos, offset, xor_pos>` triplets.
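
A rough sketch of how such triplets could be laid out as fixed-size
records is shown below, in Python. Only the 8-byte offset comes from the
list above; the 4-byte widths for `commit_pos` and `xor_pos`, the
big-endian byte order, and the helper names are assumptions made for
illustration, not the format used by the patch series.

```python
import struct

# commit_pos: 4 bytes (assumed), offset: 8 bytes, xor_pos: 4 bytes (assumed)
# '>' means big-endian with no padding, so each record is 16 bytes.
TRIPLET = struct.Struct(">IQI")

def pack_table(triplets):
    """Serialize (commit_pos, offset, xor_pos) tuples back to back."""
    return b"".join(TRIPLET.pack(*t) for t in triplets)

def unpack_table(data):
    """Parse the bytes produced by pack_table into a list of tuples."""
    return list(TRIPLET.iter_unpack(data))

# Example values are made up; the second offset does not fit in 4 bytes.
rows = [(5, 1024, 0), (9, 2 ** 33 + 7, 1)]
data = pack_table(rows)
```

The second row shows why a 4-byte offset would not be enough for files
larger than 4 GiB.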

While implementing these changes, I discovered some bugs in the
previous version and ran into several errors along the way, but I
finally managed to fix them. Taylor helped me get rid of
some of them.

I think that we can optimise the parsing of xor bitmaps further
by stopping the stack filling loop when we reach an already parsed
bitmap, since we know that the bitmaps it has xor relations with
have already been stored/parsed.

------------------------------------------------------- 
Title - GSoC Week 1: Let's Get started
Blog link - https://medium.com/@abhra303/gsoc-week-1-lets-get-started-fad78ec34dcf

Summary -

This is the first blog that I wrote for GSoC. Taylor
suggested that I should work on "integrating a lookup table
extension" first as it is smaller compared to other sub-projects.

The idea is to have a table at the end of the .bitmap file which
will contain the offsets (and xor-offsets) of the bitmaps of
selected commits. Whenever Git tries to get the bitmap of a
particular commit, instead of loading the bitmaps one by one,
Git will parse only the desired bitmap by using the offset and
xor-offset from the table. This will reduce the overhead of
loading each and every bitmap.
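
The lookup step described above could look roughly like this in Python.
It is a sketch under stated assumptions, not Git's code: the table is
assumed to be sorted by commit id so that it can be binary searched, and
the class name, method name, commit ids and offsets are all made up.

```python
from bisect import bisect_left

class LookupTable:
    """Rows of (commit_id, offset, xor_offset), kept sorted by commit_id."""

    def __init__(self, rows):
        self.rows = sorted(rows)
        self.keys = [row[0] for row in self.rows]

    def find(self, commit_id):
        """Return (offset, xor_offset) for commit_id, or None if absent."""
        i = bisect_left(self.keys, commit_id)
        if i < len(self.keys) and self.keys[i] == commit_id:
            _, offset, xor_offset = self.rows[i]
            return offset, xor_offset
        return None

table = LookupTable([("c3d4", 512, 128), ("a1b2", 128, 0), ("e5f6", 900, 512)])
```

With the offset in hand, only the bytes of the desired bitmap (and of
the bitmap it is xor-ed against) need to be parsed.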
-------------------------------------------------------


* Re: [GSoC] Abhradeep's GSoC blogs (25 Jul, 2022 IST)
  2022-07-25 15:23 [GSoC] Abhradeep's GSoC blogs (25 Jul, 2022 IST) Abhradeep Chakraborty
@ 2022-07-25 21:52 ` Taylor Blau
  0 siblings, 0 replies; 2+ messages in thread
From: Taylor Blau @ 2022-07-25 21:52 UTC
  To: Abhradeep Chakraborty; +Cc: git, Kaartic Sivaraam

Hi Abhradeep,

On Mon, Jul 25, 2022 at 08:53:26PM +0530, Abhradeep Chakraborty wrote:
> Title - GSoC Week 6: using CRoaring library
> Blog link - https://medium.com/@abhra303/gsoc-week-6-using-croaring-library-be309cfa89f5
>
> Summary -
>
> I missed the week 5 blog update, so this blog covers both
> week 5 and week 6. I submitted the latest version of my
> `lookup-table-extension` patch series. There are some issues
> with CRoaring, e.g. it does not store data in network byte order (which I
> confirmed in the RoaringBitmap Google group[1]). So I need to make
> some changes to fix that. I have already finished implementing
> `roaring_portable_network_serialize` and `..._deserialize`. My
> next step is to use these functions in Git's codebase. I will
> submit the patch series soon.

Thanks for another delightful blog post. Like I mentioned off-list, I am
really impressed with your progress, and with your ability to reach out
across multiple projects in order to make substantial licensing changes.

I'm looking forward to hearing how things go with the CRoaring folks,
and to helping out where I can. I think that the final implementation of
Roaring bitmaps into Git's reachability bitmaps will be higher quality
because we're able to rely on existing libraries.

Thanks again for all of your hard work. I have your lookup table series
on my list to review right after this, and I think that that should be
getting close to being ready.

Well done!

Thanks,
Taylor
