git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Elijah Newren <newren@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 10/23] pack-bitmap-write.c: select pseudo-merge commits
Date: Mon, 29 Apr 2024 16:43:41 -0400	[thread overview]
Message-ID: <12b432e3a8adcda6228beae2b41b2363a6ce82a0.1714422410.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1714422410.git.me@ttaylorr.com>

Now that the pseudo-merge machinery has learned how to select
non-bitmapped commits and assign them into different pseudo-merge
group(s), invoke this new API from within the pack-bitmap internals and
store the results off.

Note that the selected pseudo-merge commits aren't actually used or
written anywhere yet. This will be done in the following commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config.txt                     |  2 +
 Documentation/config/bitmap-pseudo-merge.txt | 75 ++++++++++++++++++++
 Documentation/technical/bitmap-format.txt    | 26 +++++++
 pack-bitmap-write.c                          | 14 ++++
 4 files changed, 117 insertions(+)
 create mode 100644 Documentation/config/bitmap-pseudo-merge.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 70b448b1326..bbedb7b9a06 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -383,6 +383,8 @@ include::config/apply.txt[]
 
 include::config/attr.txt[]
 
+include::config/bitmap-pseudo-merge.txt[]
+
 include::config/blame.txt[]
 
 include::config/branch.txt[]
diff --git a/Documentation/config/bitmap-pseudo-merge.txt b/Documentation/config/bitmap-pseudo-merge.txt
new file mode 100644
index 00000000000..90b72522046
--- /dev/null
+++ b/Documentation/config/bitmap-pseudo-merge.txt
@@ -0,0 +1,75 @@
+bitmapPseudoMerge.<name>.pattern::
+	Regular expression used to match reference names. Commits
+	pointed to by references matching this pattern (and meeting
+	the below criteria, like `bitmapPseudoMerge.<name>.sampleRate`
+	and `bitmapPseudoMerge.<name>.threshold`) will be considered
+	for inclusion in a pseudo-merge bitmap.
++
+Commits are grouped into pseudo-merge groups based on whether or not
+any reference(s) that point at a given commit match the pattern, which
+is an extended regular expression.
++
+Within a pseudo-merge group, commits may be further grouped into
+sub-groups based on the capture groups in the pattern. These
+sub-groupings are formed from the regular expressions by concatenating
+any capture groups from the regular expression, with a '-' dash in
+between.
++
+For example, if the pattern is `refs/tags/`, then all tags (provided
+they meet the below criteria) will be considered candidates for the
+same pseudo-merge group. However, if the pattern is instead
+`refs/remotes/([0-9])+/tags/`, then tags from different remotes will
+be grouped into separate pseudo-merge groups, based on the remote
+number.
+
+bitmapPseudoMerge.<name>.decay::
+	Determines the rate at which consecutive pseudo-merge bitmap
+	groups decrease in size. Must be non-negative. This parameter
+	can be thought of as `k` in the function `f(n) = C *
+	n^(-k/100)`, where `f(n)` is the size of the `n`th group.
++
+Setting the decay rate equal to `0` will cause all groups to be the
+same size. Setting the decay rate equal to `100` will cause the `n`th
+group to be `1/n` the size of the initial group.  Higher values of the
+decay rate cause consecutive groups to shrink at an increasing rate.
+The default is `100`.
+
+bitmapPseudoMerge.<name>.sampleRate::
+	Determines the proportion of non-bitmapped commits (among
+	reference tips) which are selected for inclusion in an
+	unstable pseudo-merge bitmap. Must be between `0` and `100`
+	(inclusive). The default is `100`.
+
+bitmapPseudoMerge.<name>.threshold::
+	Determines the minimum age of non-bitmapped commits (among
+	reference tips, as above) which are candidates for inclusion
+	in an unstable pseudo-merge bitmap. The default is
+	`1.week.ago`.
+
+bitmapPseudoMerge.<name>.maxMerges::
+	Determines the maximum number of pseudo-merge commits among
+	which commits may be distributed.
++
+For pseudo-merge groups whose pattern does not contain any capture
+groups, this setting is applied for all commits matching the regular
+expression. For patterns that have one or more capture groups, this
+setting is applied for each distinct capture group.
++
+For example, if your capture group is `refs/tags/`, then this setting
+will distribute all tags into a maximum of `maxMerges` pseudo-merge
+commits. However, if your capture group is, say,
+`refs/remotes/([0-9]+)/tags/`, then this setting will be applied to
+each remote's set of tags individually.
++
+Must be non-negative. The default value is 64.
+
+bitmapPseudoMerge.<name>.stableThreshold::
+	Determines the minimum age of commits (among reference tips,
+	as above, however stable commits are still considered
+	candidates even when they have been covered by a bitmap) which
+	are candidates for a stable a pseudo-merge bitmap. The default
+	is `1.month.ago`.
+
+bitmapPseudoMerge.<name>.stableSize::
+	Determines the size (in number of commits) of a stable
+	psuedo-merge bitmap. The default is `512`.
diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 63a7177ac08..ed7edf98034 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -434,3 +434,29 @@ the end of a `.bitmap` file. The format is as follows:
 
 * An 8-byte unsigned value (in network byte-order) equal to the number
   of bytes in the pseudo-merge section (including this field).
+
+=== Pseudo-merge selection
+
+Pseudo-merge commits are selected among non-bitmapped commits at the
+tip of one or more reference(s). In addition, there are a handful of
+constraints to further refine this selection:
+
+`pack.bitmapPseudoMergeDecay`:: Defines the "decay rate", which
+corresponds to how quickly (or not) consecutive pseudo-merges decrease
+in size relative to one another.
+
+`pack.bitmapPseudoMergeGroups`:: Defines the maximum number of
+pseudo-merge groups.
+
+`pack.bitmapPseudoMergeSampleRate`:: Defines the percentage of commits
+(matching the above criteria) which are selected.
+
+`pack.bitmapPseudoMergeThreshold`:: Defines the minimum age of a commit
+in order to be considered for inclusion within one or more pseudo-merge
+bitmaps.
+
+The size of consecutive pseudo-merge groups decays according to a
+power-law decay function, where the size of the `n`-th group is `f(n) =
+C*n^-k`. The value of `C` is chosen accordingly to match the number of
+desired groups, and `k` is 1/100th of the value of
+`pack.bitmapPseudoMergeDecay`.
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index dab5bdea806..e06930e10b9 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -17,6 +17,7 @@
 #include "trace2.h"
 #include "tree.h"
 #include "tree-walk.h"
+#include "pseudo-merge.h"
 
 struct bitmapped_commit {
 	struct commit *commit;
@@ -39,6 +40,8 @@ struct bitmap_writer {
 	struct bitmapped_commit *selected;
 	unsigned int selected_nr, selected_alloc;
 
+	struct string_list pseudo_merge_groups;
+	kh_oid_map_t *pseudo_merge_commits; /* oid -> pseudo merge(s) */
 	uint32_t pseudo_merges_nr;
 
 	struct progress *progress;
@@ -56,6 +59,11 @@ static inline int bitmap_writer_selected_nr(void)
 void bitmap_writer_init(struct repository *r)
 {
 	writer.bitmaps = kh_init_oid_map();
+	writer.pseudo_merge_commits = kh_init_oid_map();
+
+	string_list_init_dup(&writer.pseudo_merge_groups);
+
+	load_pseudo_merges_from_config(&writer.pseudo_merge_groups);
 }
 
 void bitmap_writer_show_progress(int show)
@@ -686,6 +694,12 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 	}
 
 	stop_progress(&writer.progress);
+
+	select_pseudo_merges(&writer.pseudo_merge_groups,
+			     indexed_commits, indexed_commits_nr,
+			     writer.pseudo_merge_commits,
+			     &writer.pseudo_merges_nr,
+			     writer.show_progress);
 }
 
 
-- 
2.45.0.23.gc6f94b99219



  parent reply	other threads:[~2024-04-29 20:44 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-20 22:04 [PATCH 00/24] pack-bitmap: pseudo-merge reachability bitmaps Taylor Blau
2024-03-20 22:05 ` [PATCH 01/24] Documentation/technical: describe pseudo-merge bitmaps format Taylor Blau
2024-03-21 21:24   ` Junio C Hamano
2024-03-21 22:13     ` Taylor Blau
2024-03-21 22:22       ` Junio C Hamano
2024-03-20 22:05 ` [PATCH 02/24] config: repo_config_get_expiry() Taylor Blau
2024-04-10 17:54   ` Jeff King
2024-04-29 19:39     ` Taylor Blau
2024-03-20 22:05 ` [PATCH 03/24] ewah: implement `ewah_bitmap_is_subset()` Taylor Blau
2024-04-10 18:05   ` Jeff King
2024-04-29 19:47     ` Taylor Blau
2024-03-20 22:05 ` [PATCH 04/24] pack-bitmap: drop unused `max_bitmaps` parameter Taylor Blau
2024-04-10 18:06   ` Jeff King
2024-03-20 22:05 ` [PATCH 05/24] pack-bitmap: move some initialization to `bitmap_writer_init()` Taylor Blau
2024-04-10 18:10   ` Jeff King
2024-03-20 22:05 ` [PATCH 06/24] pseudo-merge.ch: initial commit Taylor Blau
2024-03-20 22:05 ` [PATCH 07/24] pack-bitmap-write: support storing pseudo-merge commits Taylor Blau
2024-03-20 22:05 ` [PATCH 08/24] pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()` Taylor Blau
2024-03-20 22:05 ` [PATCH 09/24] pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public Taylor Blau
2024-03-20 22:05 ` [PATCH 10/24] pseudo-merge: implement support for selecting pseudo-merge commits Taylor Blau
2024-03-20 22:05 ` [PATCH 11/24] pack-bitmap-write.c: select " Taylor Blau
2024-03-20 22:05 ` [PATCH 12/24] pack-bitmap-write.c: write pseudo-merge table Taylor Blau
2024-03-20 22:05 ` [PATCH 13/24] pack-bitmap: extract `read_bitmap()` function Taylor Blau
2024-03-20 22:05 ` [PATCH 14/24] pseudo-merge: scaffolding for reads Taylor Blau
2024-03-20 22:05 ` [PATCH 15/24] pack-bitmap.c: read pseudo-merge extension Taylor Blau
2024-03-20 22:05 ` [PATCH 16/24] pseudo-merge: implement support for reading pseudo-merge commits Taylor Blau
2024-03-20 22:05 ` [PATCH 17/24] ewah: implement `ewah_bitmap_popcount()` Taylor Blau
2024-03-20 22:05 ` [PATCH 18/24] pack-bitmap: implement test helpers for pseudo-merge Taylor Blau
2024-03-20 22:05 ` [PATCH 19/24] t/test-lib-functions.sh: support `--date` in `test_commit_bulk()` Taylor Blau
2024-03-20 22:05 ` [PATCH 20/24] pack-bitmap.c: use pseudo-merges during traversal Taylor Blau
2024-03-20 22:06 ` [PATCH 21/24] pack-bitmap: extra trace2 information Taylor Blau
2024-03-20 22:06 ` [PATCH 22/24] ewah: `bitmap_equals_ewah()` Taylor Blau
2024-03-20 22:06 ` [PATCH 23/24] pseudo-merge: implement support for finding existing merges Taylor Blau
2024-03-20 22:06 ` [PATCH 24/24] t/perf: implement performace tests for pseudo-merge bitmaps Taylor Blau
2024-03-21 19:50 ` [PATCH 00/24] pack-bitmap: pseudo-merge reachability bitmaps Junio C Hamano
2024-04-29 20:42 ` [PATCH v2 00/23] " Taylor Blau
2024-04-29 20:42   ` [PATCH v2 01/23] Documentation/technical: describe pseudo-merge bitmaps format Taylor Blau
2024-05-06 11:52     ` Patrick Steinhardt
2024-05-06 16:37       ` Taylor Blau
2024-05-10 11:46         ` Patrick Steinhardt
2024-05-13 19:47           ` Taylor Blau
2024-05-14  6:33             ` Patrick Steinhardt
2024-04-29 20:43   ` [PATCH v2 02/23] ewah: implement `ewah_bitmap_is_subset()` Taylor Blau
2024-04-29 20:43   ` [PATCH v2 03/23] pack-bitmap: drop unused `max_bitmaps` parameter Taylor Blau
2024-04-29 20:43   ` [PATCH v2 04/23] pack-bitmap: move some initialization to `bitmap_writer_init()` Taylor Blau
2024-05-06 11:52     ` Patrick Steinhardt
2024-05-06 18:24       ` Taylor Blau
2024-04-29 20:43   ` [PATCH v2 05/23] pseudo-merge.ch: initial commit Taylor Blau
2024-04-29 20:43   ` [PATCH v2 06/23] pack-bitmap-write: support storing pseudo-merge commits Taylor Blau
2024-05-06 11:52     ` Patrick Steinhardt
2024-05-06 18:48       ` Taylor Blau
2024-05-10 11:47         ` Patrick Steinhardt
2024-05-13 18:42     ` Jeff King
2024-05-13 20:19       ` Taylor Blau
2024-04-29 20:43   ` [PATCH v2 07/23] pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()` Taylor Blau
2024-04-29 20:43   ` [PATCH v2 08/23] pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public Taylor Blau
2024-05-13 18:50     ` Jeff King
2024-05-14  0:54       ` Taylor Blau
2024-04-29 20:43   ` [PATCH v2 09/23] pseudo-merge: implement support for selecting pseudo-merge commits Taylor Blau
2024-05-06 11:53     ` Patrick Steinhardt
2024-05-06 19:58       ` Taylor Blau
2024-05-13 19:03     ` Jeff King
2024-05-14  0:58       ` Taylor Blau
2024-05-16  8:07         ` Jeff King
2024-05-16 22:43           ` Junio C Hamano
2024-04-29 20:43   ` Taylor Blau [this message]
2024-05-06 11:53     ` [PATCH v2 10/23] pack-bitmap-write.c: select " Patrick Steinhardt
2024-05-06 20:05       ` Taylor Blau
2024-05-10 11:47         ` Patrick Steinhardt
2024-04-29 20:43   ` [PATCH v2 11/23] pack-bitmap-write.c: write pseudo-merge table Taylor Blau
2024-04-29 20:43   ` [PATCH v2 12/23] pack-bitmap: extract `read_bitmap()` function Taylor Blau
2024-04-29 20:43   ` [PATCH v2 13/23] pseudo-merge: scaffolding for reads Taylor Blau
2024-04-29 20:43   ` [PATCH v2 14/23] pack-bitmap.c: read pseudo-merge extension Taylor Blau
2024-04-29 20:44   ` [PATCH v2 15/23] pseudo-merge: implement support for reading pseudo-merge commits Taylor Blau
2024-04-29 20:44   ` [PATCH v2 16/23] ewah: implement `ewah_bitmap_popcount()` Taylor Blau
2024-04-29 20:44   ` [PATCH v2 17/23] pack-bitmap: implement test helpers for pseudo-merge Taylor Blau
2024-04-29 20:44   ` [PATCH v2 18/23] t/test-lib-functions.sh: support `--date` in `test_commit_bulk()` Taylor Blau
2024-04-29 20:44   ` [PATCH v2 19/23] pack-bitmap.c: use pseudo-merges during traversal Taylor Blau
2024-04-29 20:44   ` [PATCH v2 20/23] pack-bitmap: extra trace2 information Taylor Blau
2024-04-29 20:44   ` [PATCH v2 21/23] ewah: `bitmap_equals_ewah()` Taylor Blau
2024-04-29 20:44   ` [PATCH v2 22/23] pseudo-merge: implement support for finding existing merges Taylor Blau
2024-04-29 20:44   ` [PATCH v2 23/23] t/perf: implement performace tests for pseudo-merge bitmaps Taylor Blau
2024-04-30 20:03   ` [PATCH v2 00/23] pack-bitmap: pseudo-merge reachability bitmaps Junio C Hamano
2024-05-01 14:40     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=12b432e3a8adcda6228beae2b41b2363a6ce82a0.1714422410.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).