git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð" <avarab@gmail.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 09/15] pack-objects: shrink z_delta_size field in struct object_entry
Date: Sat, 14 Apr 2018 17:35:07 +0200	[thread overview]
Message-ID: <20180414153513.9902-10-pclouds@gmail.com> (raw)
In-Reply-To: <20180414153513.9902-1-pclouds@gmail.com>

We only cache deltas when it's smaller than a certain limit. This limit
defaults to 1000 but save its compressed length in a 64-bit field.
Shrink that field down to 20 bits, so you can only cache 1MB deltas.
Larger deltas must be recomputed at when the pack is written down.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/config.txt |  3 ++-
 builtin/pack-objects.c   | 24 ++++++++++++++++++------
 pack-objects.h           |  3 ++-
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index d97f10722c..a62134264e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2459,7 +2459,8 @@ pack.deltaCacheLimit::
 	The maximum size of a delta, that is cached in
 	linkgit:git-pack-objects[1]. This cache is used to speed up the
 	writing object phase by not having to recompute the final delta
-	result once the best match for all objects is found. Defaults to 1000.
+	result once the best match for all objects is found.
+	Defaults to 1000. Maximum value is 65535.
 
 pack.threads::
 	Specifies the number of threads to spawn when searching for best
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ec02641d2e..211bb1ad0e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2099,12 +2099,19 @@ static void find_deltas(struct object_entry **list, unsigned *list_size,
 		 * between writes at that moment.
 		 */
 		if (entry->delta_data && !pack_to_stdout) {
-			entry->z_delta_size = do_compress(&entry->delta_data,
-							  entry->delta_size);
-			cache_lock();
-			delta_cache_size -= entry->delta_size;
-			delta_cache_size += entry->z_delta_size;
-			cache_unlock();
+			unsigned long size;
+
+			size = do_compress(&entry->delta_data, entry->delta_size);
+			if (size < (1U << OE_Z_DELTA_BITS)) {
+				entry->z_delta_size = size;
+				cache_lock();
+				delta_cache_size -= entry->delta_size;
+				delta_cache_size += entry->z_delta_size;
+				cache_unlock();
+			} else {
+				FREE_AND_NULL(entry->delta_data);
+				entry->z_delta_size = 0;
+			}
 		}
 
 		/* if we made n a delta, and if n is already at max
@@ -3087,6 +3094,11 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			depth, (1 << OE_DEPTH_BITS) - 1);
 		depth = (1 << OE_DEPTH_BITS) - 1;
 	}
+	if (cache_max_small_delta_size >= (1U << OE_Z_DELTA_BITS)) {
+		warning(_("pack.deltaCacheLimit is too high, forcing %d"),
+			(1U << OE_Z_DELTA_BITS) - 1);
+		cache_max_small_delta_size = (1U << OE_Z_DELTA_BITS) - 1;
+	}
 
 	argv_array_push(&rp, "pack-objects");
 	if (thin) {
diff --git a/pack-objects.h b/pack-objects.h
index e962dce3c0..9d0391c173 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -6,6 +6,7 @@
 #define OE_DFS_STATE_BITS	2
 #define OE_DEPTH_BITS		12
 #define OE_IN_PACK_BITS		10
+#define OE_Z_DELTA_BITS		20
 
 /*
  * State flags for depth-first search used for analyzing delta cycles.
@@ -77,7 +78,7 @@ struct object_entry {
 				     */
 	void *delta_data;	/* cached delta (uncompressed) */
 	unsigned long delta_size;	/* delta data size (uncompressed) */
-	unsigned long z_delta_size;	/* delta data size (compressed) */
+	unsigned z_delta_size:OE_Z_DELTA_BITS;
 	unsigned type_:TYPE_BITS;
 	unsigned in_pack_type:TYPE_BITS; /* could be delta */
 	unsigned type_valid:1;
-- 
2.17.0.367.g5dd2e386c3


  parent reply	other threads:[~2018-04-14 15:35 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-14 15:34 [PATCH 00/15] nd/pack-objects-pack-struct update Nguyễn Thái Ngọc Duy
2018-04-14 15:34 ` [PATCH 01/15] read-cache.c: make $GIT_TEST_SPLIT_INDEX boolean Nguyễn Thái Ngọc Duy
2018-04-14 19:53   ` Ævar Arnfjörð Bjarmason
2018-04-14 15:35 ` [PATCH 02/15] pack-objects: a bit of document about struct object_entry Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 03/15] pack-objects: turn type and in_pack_type to bitfields Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 04/15] pack-objects: use bitfield for object_entry::dfs_state Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 05/15] pack-objects: use bitfield for object_entry::depth Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 06/15] pack-objects: move in_pack_pos out of struct object_entry Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 07/15] pack-objects: move in_pack " Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 08/15] pack-objects: refer to delta objects by index instead of pointer Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` Nguyễn Thái Ngọc Duy [this message]
2018-04-14 15:35 ` [PATCH 10/15] pack-objects: don't check size when the object is bad Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 11/15] pack-objects: clarify the use of object_entry::size Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 12/15] pack-objects: shrink size field in struct object_entry Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 13/15] pack-objects: shrink delta_size " Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 14/15] pack-objects: reorder members to shrink " Nguyễn Thái Ngọc Duy
2018-04-14 15:35 ` [PATCH 15/15] ci: exercise the whole test suite with uncommon code in pack-objects Nguyễn Thái Ngọc Duy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180414153513.9902-10-pclouds@gmail.com \
    --to=pclouds@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).