git@vger.kernel.org mailing list mirror (one of many)
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com,
	peff@peff.net
Subject: [PATCH v2 0/8] pack-revindex: introduce on-disk '.rev' format
Date: Wed, 13 Jan 2021 17:28:01 -0500
Message-ID: <cover.1610576805.git.me@ttaylorr.com>
In-Reply-To: <cover.1610129989.git.me@ttaylorr.com>

Hi,

This is the second of two series to implement support for an on-disk format for
storing the reverse index. Note that this depends on the patches in the previous
series [1], which was recently updated.
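
For readers new to the term: a reverse index maps positions in pack order
(objects sorted by their offset in the packfile) back to positions in index
order. What follows is only a minimal in-memory sketch of that idea. The
'rev_entry' struct and 'build_reverse_index()' helper are hypothetical names
chosen for illustration; they are not part of this series, and the sketch says
nothing about the layout of the on-disk '.rev' file.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one object: where it starts in the pack,
 * and where it sits in index (.idx) order. */
struct rev_entry {
	uint64_t offset;
	uint32_t index_pos;
};

/* Order entries by ascending pack offset. */
static int cmp_offset(const void *a, const void *b)
{
	const struct rev_entry *x = a, *y = b;
	return (x->offset > y->offset) - (x->offset < y->offset);
}

/*
 * 'offsets' holds one pack offset per object, in index order.
 * On return, rev[pack_pos] describes the object at that pack position,
 * so rev[pack_pos].index_pos answers "which index entry is this?".
 */
void build_reverse_index(const uint64_t *offsets, uint32_t nr,
			 struct rev_entry *rev)
{
	uint32_t i;

	for (i = 0; i < nr; i++) {
		rev[i].offset = offsets[i];
		rev[i].index_pos = i;
	}
	qsort(rev, nr, sizeof(*rev), cmp_offset);
}
```

With offsets { 300, 12, 150 } in index order, the object at pack position 0 is
index entry 1, since it has the smallest offset.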

This version is largely unchanged from the original, with the following
exceptions:

  - It has been rebased onto the patches in the first series.
  - The operands of two comparisons in 'offset_to_pack_pos()' were swapped so
    that the smaller of the two appears on the left-hand side of the comparison.
  - A brown-paper-bag bug was fixed in tests so that they pass on Windows (last
    night's integration broke 'seen' on Windows).
  - The GIT_TEST_WRITE_REV_INDEX mode was enabled in the "all-features" test.
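
To illustrate the comparison-ordering change mentioned above, here is a small
sketch of a bounds check and an offset lookup written with the smaller operand
on the left. The function names and the plain array of sorted offsets are
assumptions made for this example; the real code operates on
'struct packed_git' and is shown in the range-diff below.

```c
#include <stdint.h>

/*
 * Returns 1 if 'pos' is a valid pack position, 0 otherwise.
 * The failing condition reads in number-line order: num_objects <= pos.
 */
int pos_in_bounds(uint32_t num_objects, uint32_t pos)
{
	if (num_objects <= pos)
		return 0;
	return 1;
}

/*
 * Binary search over offsets sorted in ascending (pack) order, using
 * unsigned bounds. Returns 0 and sets *pos when an object starts
 * exactly at 'ofs', or -1 when none does.
 */
int find_pos_by_offset(const uint64_t *sorted_offsets, uint32_t nr,
		       uint64_t ofs, uint32_t *pos)
{
	uint32_t lo = 0, hi = nr;

	while (lo < hi) {
		uint32_t mi = lo + (hi - lo) / 2;

		if (sorted_offsets[mi] == ofs) {
			*pos = mi;
			return 0;
		} else if (ofs < sorted_offsets[mi]) {
			hi = mi;
		} else {
			lo = mi + 1;
		}
	}
	return -1;
}
```

Keeping the smaller value on the left makes each condition read like a position
on a number line, which is the motivation I understand for this style.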

Thanks in advance for your review.

[1]: https://lore.kernel.org/git/cover.1610129796.git.me@ttaylorr.com/

Taylor Blau (8):
  packfile: prepare for the existence of '*.rev' files
  pack-write.c: prepare to write 'pack-*.rev' files
  builtin/index-pack.c: write reverse indexes
  builtin/pack-objects.c: respect 'pack.writeReverseIndex'
  Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
  t: prepare for GIT_TEST_WRITE_REV_INDEX
  t: support GIT_TEST_WRITE_REV_INDEX
  pack-revindex: ensure that on-disk reverse indexes are given
    precedence

 Documentation/config/pack.txt           |   7 ++
 Documentation/git-index-pack.txt        |  20 ++--
 Documentation/technical/pack-format.txt |  17 ++++
 builtin/index-pack.c                    |  67 +++++++++++--
 builtin/pack-objects.c                  |   9 ++
 builtin/repack.c                        |   1 +
 ci/run-build-and-tests.sh               |   1 +
 object-store.h                          |   3 +
 pack-revindex.c                         | 116 ++++++++++++++++++++--
 pack-revindex.h                         |  10 +-
 pack-write.c                            | 123 +++++++++++++++++++++++-
 pack.h                                  |   4 +
 packfile.c                              |  13 ++-
 packfile.h                              |   1 +
 t/README                                |   3 +
 t/t5319-multi-pack-index.sh             |   5 +-
 t/t5325-reverse-index.sh                |  97 +++++++++++++++++++
 t/t5604-clone-reference.sh              |   2 +-
 t/t5702-protocol-v2.sh                  |  12 ++-
 t/t6500-gc.sh                           |   6 +-
 t/t9300-fast-import.sh                  |   5 +-
 tmp-objdir.c                            |   4 +-
 22 files changed, 485 insertions(+), 41 deletions(-)
 create mode 100755 t/t5325-reverse-index.sh

 -:  ---------- >  1:  e1aa89244a pack-revindex: introduce a new API
 -:  ---------- >  2:  0fca7d5812 write_reuse_object(): convert to new revindex API
 -:  ---------- >  3:  7676822a54 write_reused_pack_one(): convert to new revindex API
 -:  ---------- >  4:  dd7133fdb7 write_reused_pack_verbatim(): convert to new revindex API
 -:  ---------- >  5:  8e93ca3886 check_object(): convert to new revindex API
 -:  ---------- >  6:  084bbf2145 bitmap_position_packfile(): convert to new revindex API
 -:  ---------- >  7:  68794e9484 show_objects_for_type(): convert to new revindex API
 -:  ---------- >  8:  31ac6f5703 get_size_by_pos(): convert to new revindex API
 -:  ---------- >  9:  acd80069a2 try_partial_reuse(): convert to new revindex API
 -:  ---------- > 10:  569acdca7f rebuild_existing_bitmaps(): convert to new revindex API
 -:  ---------- > 11:  9881637724 get_delta_base_oid(): convert to new revindex API
 -:  ---------- > 12:  df8bb571a5 retry_bad_packed_offset(): convert to new revindex API
 -:  ---------- > 13:  41b2e00947 packed_object_info(): convert to new revindex API
 -:  ---------- > 14:  8ad49d231f unpack_entry(): convert to new revindex API
 -:  ---------- > 15:  e757476351 for_each_object_in_pack(): convert to new revindex API
 -:  ---------- > 16:  a500311e33 builtin/gc.c: guess the size of the revindex
 -:  ---------- > 17:  67d14da04a pack-revindex: remove unused 'find_pack_revindex()'
 -:  ---------- > 18:  3b5c92be68 pack-revindex: remove unused 'find_revindex_position()'
 -:  ---------- > 19:  cabafce4a1 pack-revindex: hide the definition of 'revindex_entry'
 -:  ---------- > 20:  8400ff6c96 pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()'
 1:  ddf47a0a50 ! 21:  6742c15c84 packfile: prepare for the existence of '*.rev' files
    @@ pack-revindex.c: static void create_pack_revindex(struct packed_git *p)
     +
      int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)
      {
    - 	int lo = 0;
    + 	unsigned lo, hi;
     @@ pack-revindex.c: int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)

      uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos)
    @@ pack-revindex.c: int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_
     -	if (!p->revindex)
     +	if (!(p->revindex || p->revindex_data))
      		BUG("pack_pos_to_index: reverse index not yet loaded");
    - 	if (pos >= p->num_objects)
    + 	if (p->num_objects <= pos)
      		BUG("pack_pos_to_index: out-of-bounds object at %"PRIu32, pos);
     -	return p->revindex[pos].nr;
     +
    @@ pack-revindex.c: int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_
     -	if (!p->revindex)
     +	if (!(p->revindex || p->revindex_data))
      		BUG("pack_pos_to_index: reverse index not yet loaded");
    - 	if (pos > p->num_objects)
    + 	if (p->num_objects < pos)
      		BUG("pack_pos_to_offset: out-of-bounds object at %"PRIu32, pos);
     -	return p->revindex[pos].offset;
     +
    @@ pack-revindex.c: int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_
     +		return nth_packed_object_offset(p, pack_pos_to_index(p, pos));
      }

    + ## pack-revindex.h ##
    +@@ pack-revindex.h: struct packed_git;
    + /*
    +  * load_pack_revindex populates the revindex's internal data-structures for the
    +  * given pack, returning zero on success and a negative value otherwise.
    ++ *
    ++ * If a '.rev' file is present, it is checked for consistency, mmap'd, and
    ++ * pointers are assigned into it (instead of using the in-memory variant).
    +  */
    + int load_pack_revindex(struct packed_git *p);
    +
    +@@ pack-revindex.h: uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos);
    +  * If the reverse index has not yet been loaded, or the position is out of
    +  * bounds, this function aborts.
    +  *
    +- * This function runs in constant time.
    ++ * This function runs in constant time under both in-memory and on-disk reverse
    ++ * indexes, but an additional step is taken to consult the corresponding .idx
    ++ * file when using the on-disk format.
    +  */
    + off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos);
    +
    +
      ## packfile.c ##
     @@ packfile.c: void close_pack_index(struct packed_git *p)
      	}
 2:  88393e2662 = 22:  8648c87fa7 pack-write.c: prepare to write 'pack-*.rev' files
 3:  b0a7329824 ! 23:  5b18ada611 builtin/index-pack.c: write reverse indexes
    @@ t/t5325-reverse-index.sh (new)
     +	rm -f $rev &&
     +	conf=$1 &&
     +	shift &&
    ++	# remove the index since Windows won't overwrite an existing file
    ++	rm $packdir/pack-$pack.idx &&
     +	git -c pack.writeReverseIndex=$conf index-pack "$@" \
     +		$packdir/pack-$pack.pack
     +}
 4:  e297a31875 = 24:  68bde3ea97 builtin/pack-objects.c: respect 'pack.writeReverseIndex'
 5:  5d3e96a498 = 25:  38a253d0ce Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
 6:  2288571fbe <  -:  ---------- t: prepare for GIT_TEST_WRITE_REV_INDEX
 -:  ---------- > 26:  12cdf2d67a t: prepare for GIT_TEST_WRITE_REV_INDEX
 7:  3525c4d114 ! 27:  6b647d9775 t: support GIT_TEST_WRITE_REV_INDEX
    @@ Commit message

         Add a new option that unconditionally enables the pack.writeReverseIndex
         setting in order to run the whole test suite in a mode that generates
    -    on-disk reverse indexes.
    +    on-disk reverse indexes. Additionally, enable this mode in the second
    +    run of tests under linux-gcc in 'ci/run-build-and-tests.sh'.

         Once on-disk reverse indexes are proven out over several releases, we
         can change the default value of that configuration to 'true', and drop
    @@ builtin/pack-objects.c: int cmd_pack_objects(int argc, const char **argv, const
      	progress = isatty(2);
      	argc = parse_options(argc, argv, prefix, pack_objects_options,

    + ## ci/run-build-and-tests.sh ##
    +@@ ci/run-build-and-tests.sh: linux-gcc)
    + 	export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
    + 	export GIT_TEST_MULTI_PACK_INDEX=1
    + 	export GIT_TEST_ADD_I_USE_BUILTIN=1
    ++	export GIT_TEST_WRITE_REV_INDEX=1
    + 	make test
    + 	;;
    + linux-clang)
    +
      ## pack-revindex.h ##
     @@
    - #ifndef PACK_REVINDEX_H
    - #define PACK_REVINDEX_H
    +  *   can be found
    +  */

     +#define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX"
     +
      struct packed_git;

    - int load_pack_revindex(struct packed_git *p);
    + /*

      ## t/README ##
     @@ t/README: GIT_TEST_DEFAULT_HASH=<hash-algo> specifies which hash algorithm to
 8:  6e580d43d1 ! 28:  48926ae182 pack-revindex: ensure that on-disk reverse indexes are given precedence
    @@ pack-revindex.c: static void create_pack_revindex(struct packed_git *p)

      ## pack-revindex.h ##
     @@
    - #define PACK_REVINDEX_H
    +  */

      #define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX"
     +#define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY"
    @@ t/t5325-reverse-index.sh: test_expect_success 'pack-objects respects pack.writeR
      '

     +test_expect_success 'reverse index is not generated when available on disk' '
    -+	git index-pack --rev-index $packdir/pack-$pack.pack &&
    ++	test_index_pack true &&
    ++	test_path_is_file $rev &&
     +
     +	git rev-parse HEAD >tip &&
     +	GIT_TEST_REV_INDEX_DIE_IN_MEMORY=1 git cat-file \
--
2.30.0.138.g6d7191ea01
