git@vger.kernel.org list mirror (unofficial, one of many)
From: Christian Couder <christian.couder@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Christian Couder <chriscool@tuxfamily.org>,
	Ramsay Jones <ramsay@ramsayjones.plus.com>,
	Jonathan Tan <jonathantanmy@google.com>
Subject: [PATCH v4 00/12] Rewrite packfile reuse code
Date: Wed, 18 Dec 2019 12:25:35 +0100
Message-ID: <20191218112547.4974-1-chriscool@tuxfamily.org>

This patch series rewrites the code that tries to reuse existing
packfiles.

The code in this patch series was written by GitHub, and Peff nicely
provided it in the following discussion:

https://public-inbox.org/git/3E56B0FD-EBE8-4057-A93A-16EBB09FBCE0@jramsay.com.au/

Previous versions of this patch series were discussed here:

v3: https://public-inbox.org/git/20191115141541.11149-1-chriscool@tuxfamily.org/
v2: https://public-inbox.org/git/20191019103531.23274-1-chriscool@tuxfamily.org/
v1: https://public-inbox.org/git/20190913130226.7449-1-chriscool@tuxfamily.org/

Thanks to the reviewers!

According to Peff, this new code is a lot smarter than what it
replaces. It allows "holes" in the chunks of the packfile that are
reused, and skips over them. It rewrites OFS_DELTA offsets as it goes
to account for the holes. So it's basically a linear walk over the
packfile, with the important distinction that the reused objects are
not added to the object_entry array, which makes them very lightweight
(especially in memory use, but they also aren't considered as bases
for finding new deltas, etc). It seems like a good compromise between
the cost of serving a clone and the quality of the resulting packfile.
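
To illustrate the offset rewriting described above, here is a minimal
sketch in C. It is not the code from this series: the struct
src_object type, its fields and the rewrite_offsets() function are
hypothetical names made up for this example. It assumes that every
OFS_DELTA base is itself reused and comes before its delta in the
pack, and it ignores the fact that a smaller offset may need fewer
bytes to encode in the entry header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one entry in the source packfile. */
struct src_object {
	size_t offset; /* byte offset of the entry in the source pack */
	size_t size;   /* on-disk size of the entry */
	int reused;    /* 1 if copied verbatim, 0 if it is a "hole" */
	int base;      /* index of the OFS_DELTA base, -1 if not an OFS_DELTA */
};

/*
 * Walk the entries in pack order. Holes take no room in the output,
 * so every reused entry after a hole moves closer to the start.
 * For each reused OFS_DELTA entry, new_rel[i] receives the rewritten
 * distance between the delta and its base in the output.
 *
 * "start" is the output offset of the first entry (for example the
 * size of the pack header). Returns the offset just past the last
 * reused entry.
 */
static size_t rewrite_offsets(const struct src_object *objs, size_t n,
			      size_t start,
			      size_t *new_offset, size_t *new_rel)
{
	size_t out = start;
	size_t i;

	for (i = 0; i < n; i++) {
		new_offset[i] = 0;
		new_rel[i] = 0;

		if (!objs[i].reused)
			continue; /* hole: skip over it */

		new_offset[i] = out;

		if (objs[i].base >= 0) {
			size_t b = (size_t)objs[i].base;

			/* assumption of this sketch: base is reused and earlier */
			assert(b < i && objs[b].reused);
			new_rel[i] = out - new_offset[b];
		}

		out += objs[i].size;
	}
	return out;
}
```

For example, with a 100-byte base at offset 12, a 50-byte hole at
offset 112 and a 30-byte OFS_DELTA entry at offset 162, the delta ends
up at offset 112 in the output, and its distance to the base shrinks
from 150 to 100.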

This series has been rebased onto current master ad05a3d8e5 (The fifth
batch, 2019-12-10).

The other changes since v3 were all suggested by Peff. Thanks to him!
They are the following:

  - Add note in commit message of patch 3/12.

  - Move previous patch 4/9 to patch 12/12 at the end of the series to
    avoid test failures.

  - Add new patches 5/12 and 6/12.

  - Improve commit message and documentation of pack.allowPackReuse in
    patch 8/12.

  - Improve commit message of patch 10/12.

  - Extract patch 11/12 from patch 10/12.

Jeff King (12):
  builtin/pack-objects: report reused packfile objects
  packfile: expose get_delta_base()
  ewah/bitmap: introduce bitmap_word_alloc()
  pack-bitmap: introduce bitmap_walk_contains()
  pack-bitmap: uninteresting oid can be outside bitmapped packfile
  pack-bitmap: simplify bitmap_has_oid_in_uninteresting()
  csum-file: introduce hashfile_total()
  pack-objects: introduce pack.allowPackReuse
  builtin/pack-objects: introduce obj_is_packed()
  pack-objects: improve partial packfile reuse
  pack-objects: add checks for duplicate objects
  pack-bitmap: don't rely on bitmap_git->reuse_objects

 Documentation/config/pack.txt |   7 +
 builtin/pack-objects.c        | 243 +++++++++++++++++++++++++++-------
 csum-file.h                   |   9 ++
 ewah/bitmap.c                 |  13 +-
 ewah/ewok.h                   |   1 +
 pack-bitmap.c                 | 192 ++++++++++++++++++---------
 pack-bitmap.h                 |   6 +-
 packfile.c                    |  10 +-
 packfile.h                    |   3 +
 9 files changed, 362 insertions(+), 122 deletions(-)

-- 
2.24.1.498.g561400140f

