git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com,
	avarab@gmail.com, Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v2 4/7] midx: refactor permutation logic
Date: Fri, 21 Dec 2018 08:28:40 -0800 (PST)	[thread overview]
Message-ID: <0c29a242fec32b0077291cb36a2831e6202b3453.1545409716.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.92.v2.git.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

When writing a multi-pack-index, we keep track of an integer
permutation, tracking the list of pack-files that we know about
(both from the existing multi-pack-index and the new pack-files
being introduced) and converting them into a sorted order for
the new multi-pack-index.

In anticipation of dropping pack-files from the existing multi-
pack-index, refactor the logic around how we track this permutation.

First, insert the permutation into the pack_list structure. This
allows us to grow the permutation dynamically as we add packs.

Second, fill the permutation with values corresponding to their
position in the list of pack-files, sorted as follows:

  1. The pack-files in the existing multi-pack-index,
     sorted lexicographically.

  2. The pack-files not in the existing multi-pack-index,
     sorted as discovered from the filesystem.

There is a subtle thing in how we initialize this permutation,
specifically how we use 'i' for the initial value. This will
matter more when we implement the logic for dropping existing
packs, as we will create holes in the ordering.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 midx.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/midx.c b/midx.c
index bb825ef816..3bd7183a53 100644
--- a/midx.c
+++ b/midx.c
@@ -380,9 +380,11 @@ static size_t write_midx_header(struct hashfile *f,
 struct pack_list {
 	struct packed_git **list;
 	char **names;
+	uint32_t *perm;
 	uint32_t nr;
 	uint32_t alloc_list;
 	uint32_t alloc_names;
+	uint32_t alloc_perm;
 	size_t pack_name_concat_len;
 	struct multi_pack_index *m;
 };
@@ -398,6 +400,7 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len,
 
 		ALLOC_GROW(packs->list, packs->nr + 1, packs->alloc_list);
 		ALLOC_GROW(packs->names, packs->nr + 1, packs->alloc_names);
+		ALLOC_GROW(packs->perm, packs->nr + 1, packs->alloc_perm);
 
 		packs->list[packs->nr] = add_packed_git(full_path,
 							full_path_len,
@@ -417,6 +420,7 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len,
 			return;
 		}
 
+		packs->perm[packs->nr] = packs->nr;
 		packs->names[packs->nr] = xstrdup(file_name);
 		packs->pack_name_concat_len += strlen(file_name) + 1;
 		packs->nr++;
@@ -443,7 +447,7 @@ static void sort_packs_by_name(char **pack_names, uint32_t nr_packs, uint32_t *p
 	ALLOC_ARRAY(pairs, nr_packs);
 
 	for (i = 0; i < nr_packs; i++) {
-		pairs[i].pack_int_id = i;
+		pairs[i].pack_int_id = perm[i];
 		pairs[i].pack_name = pack_names[i];
 	}
 
@@ -755,7 +759,6 @@ int write_midx_file(const char *object_dir)
 	struct hashfile *f = NULL;
 	struct lock_file lk;
 	struct pack_list packs;
-	uint32_t *pack_perm = NULL;
 	uint64_t written = 0;
 	uint32_t chunk_ids[MIDX_MAX_CHUNKS + 1];
 	uint64_t chunk_offsets[MIDX_MAX_CHUNKS + 1];
@@ -774,18 +777,22 @@ int write_midx_file(const char *object_dir)
 
 	packs.nr = 0;
 	packs.alloc_list = packs.m ? packs.m->num_packs : 16;
-	packs.alloc_names = packs.alloc_list;
+	packs.alloc_perm = packs.alloc_names = packs.alloc_list;
 	packs.list = NULL;
 	packs.names = NULL;
+	packs.perm = NULL;
 	packs.pack_name_concat_len = 0;
 	ALLOC_ARRAY(packs.list, packs.alloc_list);
 	ALLOC_ARRAY(packs.names, packs.alloc_names);
+	ALLOC_ARRAY(packs.perm, packs.alloc_perm);
 
 	if (packs.m) {
 		for (i = 0; i < packs.m->num_packs; i++) {
 			ALLOC_GROW(packs.list, packs.nr + 1, packs.alloc_list);
 			ALLOC_GROW(packs.names, packs.nr + 1, packs.alloc_names);
+			ALLOC_GROW(packs.perm, packs.nr + 1, packs.alloc_perm);
 
+			packs.perm[packs.nr] = i;
 			packs.list[packs.nr] = NULL;
 			packs.names[packs.nr] = xstrdup(packs.m->pack_names[i]);
 			packs.pack_name_concat_len += strlen(packs.names[packs.nr]) + 1;
@@ -802,10 +809,9 @@ int write_midx_file(const char *object_dir)
 		packs.pack_name_concat_len += MIDX_CHUNK_ALIGNMENT -
 					      (packs.pack_name_concat_len % MIDX_CHUNK_ALIGNMENT);
 
-	ALLOC_ARRAY(pack_perm, packs.nr);
-	sort_packs_by_name(packs.names, packs.nr, pack_perm);
+	sort_packs_by_name(packs.names, packs.nr, packs.perm);
 
-	entries = get_sorted_entries(packs.m, packs.list, pack_perm, packs.nr, &nr_entries);
+	entries = get_sorted_entries(packs.m, packs.list, packs.perm, packs.nr, &nr_entries);
 
 	for (i = 0; i < nr_entries; i++) {
 		if (entries[i].offset > 0x7fffffff)
@@ -923,8 +929,8 @@ int write_midx_file(const char *object_dir)
 
 	free(packs.list);
 	free(packs.names);
+	free(packs.perm);
 	free(entries);
-	free(pack_perm);
 	free(midx_name);
 	return 0;
 }
-- 
gitgitgadget


  parent reply	other threads:[~2018-12-21 16:28 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-10 18:06 [PATCH 0/5] Create 'expire' and 'repack' verbs for git-multi-pack-index Derrick Stolee via GitGitGadget
2018-12-10 18:06 ` [PATCH 1/5] multi-pack-index: prepare for 'expire' verb Derrick Stolee via GitGitGadget
2018-12-11  1:35   ` Stefan Beller
2018-12-11  1:59     ` SZEDER Gábor
2018-12-11 12:32       ` Derrick Stolee
2018-12-10 18:06 ` [PATCH 2/5] midx: refactor permutation logic Derrick Stolee via GitGitGadget
2018-12-10 18:06 ` [PATCH 3/5] multi-pack-index: implement 'expire' verb Derrick Stolee via GitGitGadget
2018-12-10 18:06 ` [PATCH 4/5] multi-pack-index: prepare 'repack' verb Derrick Stolee via GitGitGadget
2018-12-11  1:54   ` Stefan Beller
2018-12-11 12:45     ` Derrick Stolee
2018-12-10 18:06 ` [PATCH 5/5] midx: implement midx_repack() Derrick Stolee via GitGitGadget
2018-12-11  2:32   ` Stefan Beller
2018-12-11 13:00     ` Derrick Stolee
2018-12-12  7:40   ` Junio C Hamano
2018-12-13  4:23     ` Junio C Hamano
2018-12-21 16:28 ` [PATCH v2 0/7] Create 'expire' and 'repack' verbs for git-multi-pack-index Derrick Stolee via GitGitGadget
2018-12-21 16:28   ` [PATCH v2 1/7] repack: refactor pack deletion for future use Derrick Stolee via GitGitGadget
2018-12-21 16:28   ` [PATCH v2 2/7] Docs: rearrange subcommands for multi-pack-index Derrick Stolee via GitGitGadget
2018-12-21 16:28   ` [PATCH v2 3/7] multi-pack-index: prepare for 'expire' subcommand Derrick Stolee via GitGitGadget
2018-12-21 16:28   ` Derrick Stolee via GitGitGadget [this message]
2018-12-21 16:28   ` [PATCH v2 5/7] multi-pack-index: implement 'expire' verb Derrick Stolee via GitGitGadget
2018-12-21 16:28   ` [PATCH v2 6/7] multi-pack-index: prepare 'repack' subcommand Derrick Stolee via GitGitGadget
2018-12-21 16:28   ` [PATCH v2 7/7] midx: implement midx_repack() Derrick Stolee via GitGitGadget
2019-01-09 15:21   ` [PATCH v3 0/9] Create 'expire' and 'repack' verbs for git-multi-pack-index Derrick Stolee via GitGitGadget
2019-01-09 15:21     ` [PATCH v3 1/9] repack: refactor pack deletion for future use Derrick Stolee via GitGitGadget
2019-01-09 15:21     ` [PATCH v3 2/9] Docs: rearrange subcommands for multi-pack-index Derrick Stolee via GitGitGadget
2019-01-09 15:21     ` [PATCH v3 4/9] midx: simplify computation of pack name lengths Derrick Stolee via GitGitGadget
2019-01-09 15:21     ` [PATCH v3 3/9] multi-pack-index: prepare for 'expire' subcommand Derrick Stolee via GitGitGadget
2019-01-09 15:21     ` [PATCH v3 5/9] midx: refactor permutation logic and pack sorting Derrick Stolee via GitGitGadget
2019-01-23 21:00       ` Jonathan Tan
2019-01-24 17:34         ` Derrick Stolee
2019-01-24 19:17           ` Derrick Stolee
2019-01-09 15:21     ` [PATCH v3 6/9] multi-pack-index: implement 'expire' verb Derrick Stolee via GitGitGadget
2019-01-09 15:54       ` SZEDER Gábor
2019-01-10 18:05         ` Junio C Hamano
2019-01-23 22:13       ` Jonathan Tan
2019-01-24 17:36         ` Derrick Stolee
2019-01-09 15:21     ` [PATCH v3 7/9] multi-pack-index: prepare 'repack' subcommand Derrick Stolee via GitGitGadget
2019-01-09 15:56       ` SZEDER Gábor
2019-01-23 22:38       ` Jonathan Tan
2019-01-24 19:36         ` Derrick Stolee
2019-01-24 21:38           ` Jonathan Tan
2019-01-09 15:21     ` [PATCH v3 8/9] midx: implement midx_repack() Derrick Stolee via GitGitGadget
2019-01-23 22:33       ` Jonathan Tan
2019-01-09 15:21     ` [PATCH v3 9/9] multi-pack-index: test expire while adding packs Derrick Stolee via GitGitGadget
2019-01-17 15:27     ` [PATCH v3 0/9] Create 'expire' and 'repack' verbs for git-multi-pack-index Derrick Stolee
2019-01-23 22:44     ` Jonathan Tan
2019-01-24 21:51     ` [PATCH v4 00/10] " Derrick Stolee via GitGitGadget
2019-01-24 21:51       ` [PATCH v4 01/10] repack: refactor pack deletion for future use Derrick Stolee via GitGitGadget
2019-01-24 21:51       ` [PATCH v4 02/10] Docs: rearrange subcommands for multi-pack-index Derrick Stolee via GitGitGadget
2019-01-24 21:51       ` [PATCH v4 03/10] multi-pack-index: prepare for 'expire' subcommand Derrick Stolee via GitGitGadget
2019-01-24 21:51       ` [PATCH v4 04/10] midx: simplify computation of pack name lengths Derrick Stolee via GitGitGadget
2019-01-24 21:51       ` [PATCH v4 05/10] midx: refactor permutation logic and pack sorting Derrick Stolee via GitGitGadget
2019-01-24 21:51       ` [PATCH v4 07/10] multi-pack-index: prepare 'repack' subcommand Derrick Stolee via GitGitGadget
2019-01-25 23:24         ` Josh Steadmon
2019-01-24 21:51       ` [PATCH v4 06/10] multi-pack-index: implement 'expire' subcommand Derrick Stolee via GitGitGadget
2019-01-24 21:52       ` [PATCH v4 08/10] midx: implement midx_repack() Derrick Stolee via GitGitGadget
2019-01-26 17:10         ` Derrick Stolee
2019-01-27 22:50           ` Junio C Hamano
2019-01-24 21:52       ` [PATCH v4 09/10] multi-pack-index: test expire while adding packs Derrick Stolee via GitGitGadget
2019-01-24 21:52       ` [PATCH v4 10/10] midx: add test that 'expire' respects .keep files Derrick Stolee via GitGitGadget
2019-01-24 22:14       ` [PATCH v4 00/10] Create 'expire' and 'repack' verbs for git-multi-pack-index Jonathan Tan
2019-01-25 23:49       ` Josh Steadmon
2019-04-24 15:14       ` [PATCH v5 00/11] " Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 01/11] repack: refactor pack deletion for future use Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 02/11] Docs: rearrange subcommands for multi-pack-index Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 03/11] multi-pack-index: prepare for 'expire' subcommand Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 04/11] midx: simplify computation of pack name lengths Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 05/11] midx: refactor permutation logic and pack sorting Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 06/11] multi-pack-index: implement 'expire' subcommand Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 07/11] multi-pack-index: prepare 'repack' subcommand Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 08/11] midx: implement midx_repack() Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 09/11] multi-pack-index: test expire while adding packs Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 10/11] midx: add test that 'expire' respects .keep files Derrick Stolee
2019-04-24 15:14         ` [PATCH v5 11/11] t5319-multi-pack-index.sh: test batch size zero Derrick Stolee
2019-04-25  5:38         ` [PATCH v5 00/11] Create 'expire' and 'repack' verbs for git-multi-pack-index Junio C Hamano
2019-04-25 11:06           ` Derrick Stolee
2019-05-14 18:47         ` [PATCH v6 " Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 01/11] repack: refactor pack deletion for future use Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 02/11] Docs: rearrange subcommands for multi-pack-index Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 03/11] multi-pack-index: prepare for 'expire' subcommand Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 04/11] midx: simplify computation of pack name lengths Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 05/11] midx: refactor permutation logic and pack sorting Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 06/11] multi-pack-index: implement 'expire' subcommand Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 07/11] multi-pack-index: prepare 'repack' subcommand Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 08/11] midx: implement midx_repack() Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 09/11] multi-pack-index: test expire while adding packs Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 10/11] midx: add test that 'expire' respects .keep files Derrick Stolee
2019-05-14 18:47           ` [PATCH v6 11/11] t5319-multi-pack-index.sh: test batch size zero Derrick Stolee
2019-06-10 14:15           ` [PATCH v6 00/11] Create 'expire' and 'repack' verbs for git-multi-pack-index Derrick Stolee
2019-06-10 17:31             ` Junio C Hamano
2019-06-10 17:57               ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0c29a242fec32b0077291cb36a2831e6202b3453.1545409716.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=avarab@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).