git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: git@vger.kernel.org
Cc: dstolee@microsoft.com, stolee@gmail.com, git@jeffhostetler.com,
	peff@peff.net, gitster@pobox.com, Johannes.Shindelin@gmx.de,
	jrnieder@gmail.com
Subject: [RFC PATCH 05/18] midx: create midx builtin with --write mode
Date: Sun,  7 Jan 2018 13:14:46 -0500	[thread overview]
Message-ID: <20180107181459.222909-6-dstolee@microsoft.com> (raw)
In-Reply-To: <20180107181459.222909-1-dstolee@microsoft.com>

Commentary: As we extend the function of the midx builtin, I expand the
SYNOPSIS row of "git-midx.txt" but do not create multiple rows. If this
builtin doesn't change too much, I will rewrite the SYNOPSIS to be multi-
lined, such as in "git-branch.txt".

-- >8 --

Create, document, and implement the first ability of the midx builtin.

The --write subcommand creates a multi-pack-index for all indexed
packfiles within a given pack directory. If none is provided, the
objects/pack directory is implied. The arguments allow specifying the
pack directory so we can add MIDX files to alternates.

The packfiles are expected to be paired with pack-indexes and are
otherwise ignored. This simplifies the implementation and also keeps
compatibility with older versions of Git (or changing core.midx to
false).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 .gitignore                 |   1 +
 Documentation/git-midx.txt |  54 +++++++++++++
 Makefile                   |   1 +
 builtin.h                  |   1 +
 builtin/midx.c             | 195 +++++++++++++++++++++++++++++++++++++++++++++
 command-list.txt           |   1 +
 git.c                      |   1 +
 7 files changed, 254 insertions(+)
 create mode 100644 Documentation/git-midx.txt
 create mode 100644 builtin/midx.c

diff --git a/.gitignore b/.gitignore
index 833ef3b0b7..545e195f2a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -95,6 +95,7 @@
 /git-merge-subtree
 /git-mergetool
 /git-mergetool--lib
+/git-midx
 /git-mktag
 /git-mktree
 /git-name-rev
diff --git a/Documentation/git-midx.txt b/Documentation/git-midx.txt
new file mode 100644
index 0000000000..17464222c1
--- /dev/null
+++ b/Documentation/git-midx.txt
@@ -0,0 +1,54 @@
+git-midx(1)
+============
+
+NAME
+----
+git-midx - Write and verify multi-pack-indexes (MIDX files).
+
+
+SYNOPSIS
+--------
+[verse]
+'git midx' --write [--pack-dir <pack_dir>]
+
+DESCRIPTION
+-----------
+Write a MIDX file.
+
+OPTIONS
+-------
+
+--pack-dir <pack_dir>::
+	Use given directory for the location of packfiles, pack-indexes,
+	and MIDX files.
+
+--write::
+	If specified, write a new midx file to the pack directory using
+	the packfiles present. Outputs the hash of the result midx file.
+
+EXAMPLES
+--------
+
+* Write a MIDX file for the packfiles in your local .git folder.
++
+------------------------------------------------
+$ git midx --write
+------------------------------------------------
+
+* Write a MIDX file for the packfiles in a different folder
++
+---------------------------------------------------------
+$ git midx --write --pack-dir ../../alt/pack/
+---------------------------------------------------------
+
+CONFIGURATION
+-------------
+
+core.midx::
+	The midx command will fail if core.midx is false.
+	Also, the written MIDX files will be ignored by other commands
+	unless core.midx is true.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index d0d810951f..5c458705c1 100644
--- a/Makefile
+++ b/Makefile
@@ -980,6 +980,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
+BUILTIN_OBJS += builtin/midx.o
 BUILTIN_OBJS += builtin/mktag.o
 BUILTIN_OBJS += builtin/mktree.o
 BUILTIN_OBJS += builtin/mv.o
diff --git a/builtin.h b/builtin.h
index 42378f3aa4..880383e341 100644
--- a/builtin.h
+++ b/builtin.h
@@ -188,6 +188,7 @@ extern int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 extern int cmd_merge_file(int argc, const char **argv, const char *prefix);
 extern int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 extern int cmd_merge_tree(int argc, const char **argv, const char *prefix);
+extern int cmd_midx(int argc, const char **argv, const char *prefix);
 extern int cmd_mktag(int argc, const char **argv, const char *prefix);
 extern int cmd_mktree(int argc, const char **argv, const char *prefix);
 extern int cmd_mv(int argc, const char **argv, const char *prefix);
diff --git a/builtin/midx.c b/builtin/midx.c
new file mode 100644
index 0000000000..4aae14cf8e
--- /dev/null
+++ b/builtin/midx.c
@@ -0,0 +1,195 @@
+#include "builtin.h"
+#include "cache.h"
+#include "config.h"
+#include "dir.h"
+#include "git-compat-util.h"
+#include "lockfile.h"
+#include "packfile.h"
+#include "parse-options.h"
+#include "midx.h"
+
+static char const * const builtin_midx_usage[] = {
+	N_("git midx --write [--pack-dir <packdir>]"),
+	NULL
+};
+
+static struct opts_midx {
+	const char *pack_dir;
+	int write;
+} opts;
+
+static int build_midx_from_packs(
+	const char *pack_dir,
+	const char **pack_names, uint32_t nr_packs,
+	const char **midx_id)
+{
+	struct packed_git **packs;
+	const char **installed_pack_names;
+	uint32_t i, j, nr_installed_packs = 0;
+	uint32_t nr_objects = 0;
+	struct pack_midx_entry *objects;
+	struct pack_midx_entry **obj_ptrs;
+	uint32_t nr_total_packs = nr_packs;
+	uint32_t pack_offset = 0;
+	struct strbuf pack_path = STRBUF_INIT;
+	int baselen;
+
+	if (!nr_total_packs) {
+		*midx_id = NULL;
+		return 0;
+	}
+
+	ALLOC_ARRAY(packs, nr_total_packs);
+	ALLOC_ARRAY(installed_pack_names, nr_total_packs);
+
+	strbuf_addstr(&pack_path, pack_dir);
+	strbuf_addch(&pack_path, '/');
+	baselen = pack_path.len;
+	for (i = 0; i < nr_packs; i++) {
+		strbuf_setlen(&pack_path, baselen);
+		strbuf_addstr(&pack_path, pack_names[i]);
+
+		strbuf_strip_suffix(&pack_path, ".pack");
+		strbuf_addstr(&pack_path, ".idx");
+
+		packs[nr_installed_packs] = add_packed_git(pack_path.buf, pack_path.len, 0);
+
+		if (packs[nr_installed_packs] != NULL) {
+			if (open_pack_index(packs[nr_installed_packs]))
+				continue;
+
+			nr_objects += packs[nr_installed_packs]->num_objects;
+			installed_pack_names[nr_installed_packs] = pack_names[i];
+			nr_installed_packs++;
+		}
+	}
+	strbuf_release(&pack_path);
+
+	if (!nr_objects || !nr_installed_packs) {
+		FREE_AND_NULL(packs);
+		FREE_AND_NULL(installed_pack_names);
+		*midx_id = NULL;
+		return 0;
+	}
+
+	ALLOC_ARRAY(objects, nr_objects);
+	nr_objects = 0;
+
+	for (i = pack_offset; i < nr_installed_packs; i++) {
+		struct packed_git *p = packs[i];
+
+		for (j = 0; j < p->num_objects; j++) {
+			struct pack_midx_entry entry;
+
+			if (!nth_packed_object_oid(&entry.oid, p, j))
+				die("unable to get sha1 of object %u in %s",
+				    i, p->pack_name);
+
+			entry.pack_int_id = i;
+			entry.offset = nth_packed_object_offset(p, j);
+
+			objects[nr_objects] = entry;
+			nr_objects++;
+		}
+	}
+
+	ALLOC_ARRAY(obj_ptrs, nr_objects);
+	for (i = 0; i < nr_objects; i++)
+		obj_ptrs[i] = &objects[i];
+
+	*midx_id = write_midx_file(pack_dir, NULL,
+		installed_pack_names, nr_installed_packs,
+		obj_ptrs, nr_objects);
+
+	FREE_AND_NULL(packs);
+	FREE_AND_NULL(installed_pack_names);
+	FREE_AND_NULL(obj_ptrs);
+	FREE_AND_NULL(objects);
+
+	return 0;
+}
+
+static int midx_write(void)
+{
+	const char **pack_names = NULL;
+	uint32_t i, nr_packs = 0;
+	const char *midx_id = 0;
+	DIR *dir;
+	struct dirent *de;
+
+	dir = opendir(opts.pack_dir);
+	if (!dir) {
+		error_errno("unable to open object pack directory: %s",
+			    opts.pack_dir);
+		return 1;
+	}
+
+	nr_packs = 256;
+	ALLOC_ARRAY(pack_names, nr_packs);
+
+	i = 0;
+	while ((de = readdir(dir)) != NULL) {
+		if (is_dot_or_dotdot(de->d_name))
+			continue;
+
+		if (ends_with(de->d_name, ".pack")) {
+			ALLOC_GROW(pack_names, i + 1, nr_packs);
+			pack_names[i++] = xstrdup(de->d_name);
+		}
+	}
+
+	nr_packs = i;
+	closedir(dir);
+
+	if (!nr_packs)
+		goto cleanup;
+
+	if (build_midx_from_packs(opts.pack_dir, pack_names, nr_packs, &midx_id))
+		die("failed to build MIDX");
+
+	if (midx_id == NULL)
+		goto cleanup;
+
+	printf("%s\n", midx_id);
+
+cleanup:
+	if (pack_names)
+		FREE_AND_NULL(pack_names);
+	return 0;
+}
+
+int cmd_midx(int argc, const char **argv, const char *prefix)
+{
+	static struct option builtin_midx_options[] = {
+		{ OPTION_STRING, 'p', "pack-dir", &opts.pack_dir,
+			N_("dir"),
+			N_("The pack directory containing set of packfile and pack-index pairs.") },
+		OPT_BOOL('w', "write", &opts.write,
+			N_("write midx file")),
+		OPT_END(),
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_midx_usage, builtin_midx_options);
+
+	git_config(git_default_config, NULL);
+	if (!core_midx)
+		die(_("git-midx requires core.midx=true"));
+
+	argc = parse_options(argc, argv, prefix,
+			     builtin_midx_options,
+			     builtin_midx_usage, 0);
+
+	if (!opts.pack_dir) {
+		struct strbuf path = STRBUF_INIT;
+		strbuf_addstr(&path, get_object_directory());
+		strbuf_addstr(&path, "/pack");
+		opts.pack_dir = strbuf_detach(&path, NULL);
+	}
+
+	if (opts.write)
+		return midx_write();
+
+	usage_with_options(builtin_midx_usage, builtin_midx_options);
+	return 0;
+}
diff --git a/command-list.txt b/command-list.txt
index a1fad28fd8..a7b9412182 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -87,6 +87,7 @@ git-merge-index                         plumbingmanipulators
 git-merge-one-file                      purehelpers
 git-mergetool                           ancillarymanipulators
 git-merge-tree                          ancillaryinterrogators
+git-midx                                plumbingmanipulators
 git-mktag                               plumbingmanipulators
 git-mktree                              plumbingmanipulators
 git-mv                                  mainporcelain           worktree
diff --git a/git.c b/git.c
index c870b9719c..87fbda8463 100644
--- a/git.c
+++ b/git.c
@@ -431,6 +431,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP },
+	{ "midx", cmd_midx, RUN_SETUP },
 	{ "mktag", cmd_mktag, RUN_SETUP },
 	{ "mktree", cmd_mktree, RUN_SETUP },
 	{ "mv", cmd_mv, RUN_SETUP | NEED_WORK_TREE },
-- 
2.15.0


  parent reply	other threads:[~2018-01-07 18:16 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-07 18:14 [RFC PATCH 00/18] Multi-pack index (MIDX) Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 01/18] docs: Multi-Pack Index (MIDX) Design Notes Derrick Stolee
2018-01-08 19:32   ` Jonathan Tan
2018-01-08 20:35     ` Derrick Stolee
2018-01-08 22:06       ` Jonathan Tan
2018-01-07 18:14 ` [RFC PATCH 02/18] midx: specify midx file format Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 03/18] midx: create core.midx config setting Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 04/18] midx: write multi-pack indexes for an object list Derrick Stolee
2018-01-07 18:14 ` Derrick Stolee [this message]
2018-01-07 18:14 ` [RFC PATCH 06/18] midx: add t5318-midx.sh test script Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 07/18] midx: teach midx --write to update midx-head Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 08/18] midx: teach git-midx to read midx file details Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 09/18] midx: find details of nth object in midx Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 10/18] midx: use existing midx when writing Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 11/18] midx: teach git-midx to clear midx files Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 12/18] midx: teach git-midx to delete expired files Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 13/18] t5318-midx.h: confirm git actions are stable Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 14/18] midx: load midx files when loading packs Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 15/18] midx: use midx for approximate object count Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 16/18] midx: nth_midxed_object_oid() and bsearch_midx() Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 17/18] sha1_name: use midx for abbreviations Derrick Stolee
2018-01-07 18:14 ` [RFC PATCH 18/18] packfile: use midx for object loads Derrick Stolee
2018-01-07 22:42 ` [RFC PATCH 00/18] Multi-pack index (MIDX) Ævar Arnfjörð Bjarmason
2018-01-08  0:08   ` Derrick Stolee
2018-01-08 10:20     ` Jeff King
2018-01-08 10:27       ` Jeff King
2018-01-08 12:28         ` Ævar Arnfjörð Bjarmason
2018-01-08 13:43       ` Johannes Schindelin
2018-01-09  6:50         ` Jeff King
2018-01-09 13:05           ` Johannes Schindelin
2018-01-09 19:51             ` Stefan Beller
2018-01-09 20:12               ` Junio C Hamano
2018-01-09 20:16                 ` Stefan Beller
2018-01-09 21:31                   ` Junio C Hamano
2018-01-10 17:05               ` Johannes Schindelin
2018-01-10 10:57             ` Jeff King
2018-01-08 13:43       ` Derrick Stolee
2018-01-09  7:12         ` Jeff King
2018-01-08 11:43     ` Ævar Arnfjörð Bjarmason
2018-06-06  8:13     ` Ævar Arnfjörð Bjarmason
2018-06-06 10:27       ` [RFC PATCH 0/2] unconditional O(1) SHA-1 abbreviation Ævar Arnfjörð Bjarmason
2018-06-06 10:27       ` [RFC PATCH 1/2] config.c: use braces on multiple conditional arms Ævar Arnfjörð Bjarmason
2018-06-06 10:27       ` [RFC PATCH 2/2] sha1-name: add core.validateAbbrev & relative core.abbrev Ævar Arnfjörð Bjarmason
2018-06-06 12:04         ` Christian Couder
2018-06-06 11:24       ` [RFC PATCH 00/18] Multi-pack index (MIDX) Derrick Stolee
2018-01-10 18:25 ` Martin Fick
2018-01-10 19:39   ` Derrick Stolee
2018-01-10 21:01     ` Martin Fick

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180107181459.222909-6-dstolee@microsoft.com \
    --to=stolee@gmail.com \
    --cc=Johannes.Shindelin@gmx.de \
    --cc=dstolee@microsoft.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).