From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com,
peff@peff.net
Subject: [PATCH v2 0/8] pack-revindex: introduce on-disk '.rev' format
Date: Wed, 13 Jan 2021 17:28:01 -0500
Message-ID: <cover.1610576805.git.me@ttaylorr.com>
In-Reply-To: <cover.1610129989.git.me@ttaylorr.com>
Hi,
This is the second of two series to implement support for an on-disk format for
storing the reverse index. Note that this depends on the patches in the previous
series [1] (which was recently updated).
This version is largely unchanged from the original, with the following
exceptions:
- It has been rebased onto the patches in the first series.
- The operands of two comparisons in 'offset_to_pack_pos()' were swapped so
that the smaller of the two appears on the left-hand side of the comparison.
- A brown-paper-bag bug was fixed in tests so that they pass on Windows (last
night's integration broke 'seen' on Windows).
- The GIT_TEST_WRITE_REV_INDEX mode was enabled in the "all-features" test.
Thanks in advance for your review.
[1]: https://lore.kernel.org/git/cover.1610129796.git.me@ttaylorr.com/
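For readers skimming the comparison change listed above, here is a minimal
sketch of the two bounds checks, written with the object count on the
left-hand side as in the range-diff for patch 1. The helper names are
hypothetical and are not part of git. The looser second check accepts a
position equal to the object count; the sketch assumes that position stands
for the end of the pack, which the range-diff does not itself state.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not git's API) that mirror the shape of the two
 * comparisons: the object count is the left-hand operand in both.
 */

/* A position names an object only when it is strictly below the count. */
static int index_pos_out_of_bounds(uint32_t num_objects, uint32_t pos)
{
	return num_objects <= pos;
}

/*
 * For offsets, pos == num_objects is still accepted (assumed here to mean
 * "one past the last object"); only larger positions are rejected.
 */
static int offset_pos_out_of_bounds(uint32_t num_objects, uint32_t pos)
{
	return num_objects < pos;
}

/* Abort on a bad position, loosely imitating what BUG() does in the diff. */
static uint32_t checked_index_pos(uint32_t num_objects, uint32_t pos)
{
	assert(!index_pos_out_of_bounds(num_objects, pos));
	return pos;
}
```

The behavior is identical to the 'pos >= num_objects' and 'pos > num_objects'
spellings; only the operand order differs.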
Taylor Blau (8):
packfile: prepare for the existence of '*.rev' files
pack-write.c: prepare to write 'pack-*.rev' files
builtin/index-pack.c: write reverse indexes
builtin/pack-objects.c: respect 'pack.writeReverseIndex'
Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
t: prepare for GIT_TEST_WRITE_REV_INDEX
t: support GIT_TEST_WRITE_REV_INDEX
pack-revindex: ensure that on-disk reverse indexes are given
precedence
Documentation/config/pack.txt | 7 ++
Documentation/git-index-pack.txt | 20 ++--
Documentation/technical/pack-format.txt | 17 ++++
builtin/index-pack.c | 67 +++++++++++--
builtin/pack-objects.c | 9 ++
builtin/repack.c | 1 +
ci/run-build-and-tests.sh | 1 +
object-store.h | 3 +
pack-revindex.c | 116 ++++++++++++++++++++--
pack-revindex.h | 10 +-
pack-write.c | 123 +++++++++++++++++++++++-
pack.h | 4 +
packfile.c | 13 ++-
packfile.h | 1 +
t/README | 3 +
t/t5319-multi-pack-index.sh | 5 +-
t/t5325-reverse-index.sh | 97 +++++++++++++++++++
t/t5604-clone-reference.sh | 2 +-
t/t5702-protocol-v2.sh | 12 ++-
t/t6500-gc.sh | 6 +-
t/t9300-fast-import.sh | 5 +-
tmp-objdir.c | 4 +-
22 files changed, 485 insertions(+), 41 deletions(-)
create mode 100755 t/t5325-reverse-index.sh
-: ---------- > 1: e1aa89244a pack-revindex: introduce a new API
-: ---------- > 2: 0fca7d5812 write_reuse_object(): convert to new revindex API
-: ---------- > 3: 7676822a54 write_reused_pack_one(): convert to new revindex API
-: ---------- > 4: dd7133fdb7 write_reused_pack_verbatim(): convert to new revindex API
-: ---------- > 5: 8e93ca3886 check_object(): convert to new revindex API
-: ---------- > 6: 084bbf2145 bitmap_position_packfile(): convert to new revindex API
-: ---------- > 7: 68794e9484 show_objects_for_type(): convert to new revindex API
-: ---------- > 8: 31ac6f5703 get_size_by_pos(): convert to new revindex API
-: ---------- > 9: acd80069a2 try_partial_reuse(): convert to new revindex API
-: ---------- > 10: 569acdca7f rebuild_existing_bitmaps(): convert to new revindex API
-: ---------- > 11: 9881637724 get_delta_base_oid(): convert to new revindex API
-: ---------- > 12: df8bb571a5 retry_bad_packed_offset(): convert to new revindex API
-: ---------- > 13: 41b2e00947 packed_object_info(): convert to new revindex API
-: ---------- > 14: 8ad49d231f unpack_entry(): convert to new revindex API
-: ---------- > 15: e757476351 for_each_object_in_pack(): convert to new revindex API
-: ---------- > 16: a500311e33 builtin/gc.c: guess the size of the revindex
-: ---------- > 17: 67d14da04a pack-revindex: remove unused 'find_pack_revindex()'
-: ---------- > 18: 3b5c92be68 pack-revindex: remove unused 'find_revindex_position()'
-: ---------- > 19: cabafce4a1 pack-revindex: hide the definition of 'revindex_entry'
-: ---------- > 20: 8400ff6c96 pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()'
1: ddf47a0a50 ! 21: 6742c15c84 packfile: prepare for the existence of '*.rev' files
@@ pack-revindex.c: static void create_pack_revindex(struct packed_git *p)
+
int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)
{
- int lo = 0;
+ unsigned lo, hi;
@@ pack-revindex.c: int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)
uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos)
@@ pack-revindex.c: int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_
- if (!p->revindex)
+ if (!(p->revindex || p->revindex_data))
BUG("pack_pos_to_index: reverse index not yet loaded");
- if (pos >= p->num_objects)
+ if (p->num_objects <= pos)
BUG("pack_pos_to_index: out-of-bounds object at %"PRIu32, pos);
- return p->revindex[pos].nr;
+
@@ pack-revindex.c: int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_
- if (!p->revindex)
+ if (!(p->revindex || p->revindex_data))
BUG("pack_pos_to_index: reverse index not yet loaded");
- if (pos > p->num_objects)
+ if (p->num_objects < pos)
BUG("pack_pos_to_offset: out-of-bounds object at %"PRIu32, pos);
- return p->revindex[pos].offset;
+
@@ pack-revindex.c: int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_
+ return nth_packed_object_offset(p, pack_pos_to_index(p, pos));
}
+ ## pack-revindex.h ##
+@@ pack-revindex.h: struct packed_git;
+ /*
+ * load_pack_revindex populates the revindex's internal data-structures for the
+ * given pack, returning zero on success and a negative value otherwise.
++ *
++ * If a '.rev' file is present, it is checked for consistency, mmap'd, and
++ * pointers are assigned into it (instead of using the in-memory variant).
+ */
+ int load_pack_revindex(struct packed_git *p);
+
+@@ pack-revindex.h: uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos);
+ * If the reverse index has not yet been loaded, or the position is out of
+ * bounds, this function aborts.
+ *
+- * This function runs in constant time.
++ * This function runs in constant time under both in-memory and on-disk reverse
++ * indexes, but an additional step is taken to consult the corresponding .idx
++ * file when using the on-disk format.
+ */
+ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos);
+
+
## packfile.c ##
@@ packfile.c: void close_pack_index(struct packed_git *p)
}
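The updated comment on pack_pos_to_offset() above says the lookup stays
constant-time with an on-disk reverse index, at the cost of one extra step
through the .idx file. A toy model of that two-step lookup might look like
the following. The struct and function names are hypothetical, and the
sketch ignores the '.rev' file header, its on-disk encoding, and mmap
handling.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (not git's 'struct packed_git'):
 *  - rev[pos] is the index position of the object at pack position 'pos',
 *    which is the role the '.rev' table plays;
 *  - idx_offsets[i] is the pack offset of the object at index position 'i',
 *    which is what the '.idx' file provides.
 */
struct toy_pack {
	uint32_t num_objects;
	const uint32_t *rev;
	const uint64_t *idx_offsets;
};

/* Pack position -> index position: a single array read. */
static uint32_t toy_pos_to_index(const struct toy_pack *p, uint32_t pos)
{
	assert(pos < p->num_objects);
	return p->rev[pos];
}

/*
 * Pack position -> pack offset: still constant time, but it takes a second
 * read through the index-side table instead of a stored offset.
 */
static uint64_t toy_pos_to_offset(const struct toy_pack *p, uint32_t pos)
{
	return p->idx_offsets[toy_pos_to_index(p, pos)];
}
```

With three objects at offsets 120, 300 and 12 in index order, the table
{2, 0, 1} lists them in pack order, so position 0 resolves to offset 12.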
2: 88393e2662 = 22: 8648c87fa7 pack-write.c: prepare to write 'pack-*.rev' files
3: b0a7329824 ! 23: 5b18ada611 builtin/index-pack.c: write reverse indexes
@@ t/t5325-reverse-index.sh (new)
+ rm -f $rev &&
+ conf=$1 &&
+ shift &&
++ # remove the index since Windows won't overwrite an existing file
++ rm $packdir/pack-$pack.idx &&
+ git -c pack.writeReverseIndex=$conf index-pack "$@" \
+ $packdir/pack-$pack.pack
+}
4: e297a31875 = 24: 68bde3ea97 builtin/pack-objects.c: respect 'pack.writeReverseIndex'
5: 5d3e96a498 = 25: 38a253d0ce Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
6: 2288571fbe < -: ---------- t: prepare for GIT_TEST_WRITE_REV_INDEX
-: ---------- > 26: 12cdf2d67a t: prepare for GIT_TEST_WRITE_REV_INDEX
7: 3525c4d114 ! 27: 6b647d9775 t: support GIT_TEST_WRITE_REV_INDEX
@@ Commit message
Add a new option that unconditionally enables the pack.writeReverseIndex
setting in order to run the whole test suite in a mode that generates
- on-disk reverse indexes.
+ on-disk reverse indexes. Additionally, enable this mode in the second
+ run of tests under linux-gcc in 'ci/run-build-and-tests.sh'.
Once on-disk reverse indexes are proven out over several releases, we
can change the default value of that configuration to 'true', and drop
@@ builtin/pack-objects.c: int cmd_pack_objects(int argc, const char **argv, const
progress = isatty(2);
argc = parse_options(argc, argv, prefix, pack_objects_options,
+ ## ci/run-build-and-tests.sh ##
+@@ ci/run-build-and-tests.sh: linux-gcc)
+ export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
+ export GIT_TEST_MULTI_PACK_INDEX=1
+ export GIT_TEST_ADD_I_USE_BUILTIN=1
++ export GIT_TEST_WRITE_REV_INDEX=1
+ make test
+ ;;
+ linux-clang)
+
## pack-revindex.h ##
@@
- #ifndef PACK_REVINDEX_H
- #define PACK_REVINDEX_H
+ * can be found
+ */
+#define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX"
+
struct packed_git;
- int load_pack_revindex(struct packed_git *p);
+ /*
## t/README ##
@@ t/README: GIT_TEST_DEFAULT_HASH=<hash-algo> specifies which hash algorithm to
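As a rough illustration of the commit message above, a test knob that
unconditionally turns a setting on can be modeled as an environment check
that wins over the configured value. This is only a sketch: the helper
names and the accepted false spellings ("0", "false") are assumptions made
for the example, and git uses its own environment and config parsing.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: treat an unset or empty variable as 'fallback',
 * and "0" or "false" as false; anything else counts as true.
 */
static int toy_env_bool(const char *name, int fallback)
{
	const char *v = getenv(name);

	if (!v || !*v)
		return fallback;
	return strcmp(v, "0") != 0 && strcmp(v, "false") != 0;
}

/*
 * Decide whether to write a reverse index. 'configured' stands in for the
 * value of pack.writeReverseIndex; the test variable forces it on.
 */
static int toy_write_rev_index(int configured)
{
	if (toy_env_bool("GIT_TEST_WRITE_REV_INDEX", 0))
		return 1;
	return configured;
}
```

The variable can only force the behavior on; when it is unset or false the
configured value is used unchanged.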
8: 6e580d43d1 ! 28: 48926ae182 pack-revindex: ensure that on-disk reverse indexes are given precedence
@@ pack-revindex.c: static void create_pack_revindex(struct packed_git *p)
## pack-revindex.h ##
@@
- #define PACK_REVINDEX_H
+ */
#define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX"
+#define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY"
@@ t/t5325-reverse-index.sh: test_expect_success 'pack-objects respects pack.writeR
'
+test_expect_success 'reverse index is not generated when available on disk' '
-+ git index-pack --rev-index $packdir/pack-$pack.pack &&
++ test_index_pack true &&
++ test_path_is_file $rev &&
+
+ git rev-parse HEAD >tip &&
+ GIT_TEST_REV_INDEX_DIE_IN_MEMORY=1 git cat-file \
--
2.30.0.138.g6d7191ea01
Thread overview: 82+ messages
2021-01-08 18:19 [PATCH 0/8] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-08 18:19 ` [PATCH 1/8] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-08 18:20 ` [PATCH 2/8] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-08 18:20 ` [PATCH 3/8] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-08 18:20 ` [PATCH 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-08 18:20 ` [PATCH 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-08 18:20 ` [PATCH 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-12 17:11 ` Ævar Arnfjörð Bjarmason
2021-01-12 18:40 ` Taylor Blau
2021-01-08 18:20 ` [PATCH 7/8] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-12 16:49 ` Derrick Stolee
2021-01-12 17:34 ` Taylor Blau
2021-01-12 17:18 ` Ævar Arnfjörð Bjarmason
2021-01-12 17:39 ` Derrick Stolee
2021-01-12 18:17 ` Taylor Blau
2021-01-08 18:20 ` [PATCH 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-13 22:28 ` Taylor Blau [this message]
2021-01-13 22:28 ` [PATCH v2 1/8] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-14 7:22 ` Junio C Hamano
2021-01-14 12:07 ` Derrick Stolee
2021-01-14 19:57 ` Jeff King
2021-01-14 18:28 ` Taylor Blau
2021-01-14 7:26 ` Junio C Hamano
2021-01-14 18:13 ` Taylor Blau
2021-01-14 20:57 ` Junio C Hamano
2021-01-22 22:54 ` Jeff King
2021-01-25 17:44 ` Taylor Blau
2021-01-25 18:27 ` Jeff King
2021-01-25 19:04 ` Junio C Hamano
2021-01-25 19:23 ` Taylor Blau
2021-01-13 22:28 ` [PATCH v2 2/8] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-22 23:24 ` Jeff King
2021-01-25 19:15 ` Taylor Blau
2021-01-26 21:43 ` Jeff King
2021-01-13 22:28 ` [PATCH v2 3/8] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-22 23:53 ` Jeff King
2021-01-25 20:03 ` Taylor Blau
2021-01-13 22:28 ` [PATCH v2 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-22 23:57 ` Jeff King
2021-01-23 0:08 ` Jeff King
2021-01-25 20:21 ` Taylor Blau
2021-01-25 20:50 ` Jeff King
2021-01-13 22:28 ` [PATCH v2 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-13 22:28 ` [PATCH v2 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-13 22:28 ` [PATCH v2 7/8] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-13 22:28 ` [PATCH v2 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-25 23:37 ` [PATCH v3 00/10] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-25 23:37 ` [PATCH v3 01/10] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-29 0:27 ` Jeff King
2021-01-29 1:14 ` Taylor Blau
2021-01-30 8:39 ` Jeff King
2021-01-25 23:37 ` [PATCH v3 02/10] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-25 23:37 ` [PATCH v3 03/10] builtin/index-pack.c: allow stripping arbitrary extensions Taylor Blau
2021-01-29 0:28 ` Jeff King
2021-01-29 1:15 ` Taylor Blau
2021-01-25 23:37 ` [PATCH v3 04/10] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-25 23:37 ` [PATCH v3 05/10] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-25 23:37 ` [PATCH v3 06/10] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-29 0:30 ` Jeff King
2021-01-29 1:17 ` Taylor Blau
2021-01-30 8:41 ` Jeff King
2021-01-25 23:37 ` [PATCH v3 07/10] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-29 0:45 ` Jeff King
2021-01-29 1:09 ` Eric Sunshine
2021-01-29 1:21 ` Taylor Blau
2021-01-30 8:43 ` Jeff King
2021-01-29 2:42 ` Junio C Hamano
2021-01-25 23:37 ` [PATCH v3 08/10] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-29 0:47 ` Jeff King
2021-01-25 23:37 ` [PATCH v3 09/10] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-29 0:53 ` Jeff King
2021-01-29 1:25 ` Taylor Blau
2021-01-30 8:46 ` Jeff King
2021-01-25 23:37 ` [PATCH v3 10/10] t5325: check both on-disk and in-memory reverse index Taylor Blau
2021-01-29 1:04 ` Jeff King
2021-01-29 1:05 ` Jeff King
2021-01-29 1:32 ` Taylor Blau
2021-01-30 8:47 ` Jeff King
2021-01-26 2:36 ` [PATCH v3 00/10] pack-revindex: introduce on-disk '.rev' format Junio C Hamano
2021-01-26 2:49 ` Taylor Blau
2021-01-29 1:06 ` Jeff King
2021-01-29 1:34 ` Taylor Blau