From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Elijah Newren <newren@gmail.com>,
Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 10/23] pack-bitmap-write.c: select pseudo-merge commits
Date: Mon, 29 Apr 2024 16:43:41 -0400
Message-ID: <12b432e3a8adcda6228beae2b41b2363a6ce82a0.1714422410.git.me@ttaylorr.com>
In-Reply-To: <cover.1714422410.git.me@ttaylorr.com>

Now that the pseudo-merge machinery has learned how to select
non-bitmapped commits and assign them into different pseudo-merge
group(s), invoke this new API from within the pack-bitmap internals and
store the results off.

Note that the selected pseudo-merge commits aren't actually used or
written anywhere yet. This will be done in the following commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/config.txt | 2 +
Documentation/config/bitmap-pseudo-merge.txt | 75 ++++++++++++++++++++
Documentation/technical/bitmap-format.txt | 26 +++++++
pack-bitmap-write.c | 14 ++++
4 files changed, 117 insertions(+)
create mode 100644 Documentation/config/bitmap-pseudo-merge.txt
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 70b448b1326..bbedb7b9a06 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -383,6 +383,8 @@ include::config/apply.txt[]
include::config/attr.txt[]
+include::config/bitmap-pseudo-merge.txt[]
+
include::config/blame.txt[]
include::config/branch.txt[]
diff --git a/Documentation/config/bitmap-pseudo-merge.txt b/Documentation/config/bitmap-pseudo-merge.txt
new file mode 100644
index 00000000000..90b72522046
--- /dev/null
+++ b/Documentation/config/bitmap-pseudo-merge.txt
@@ -0,0 +1,75 @@
+bitmapPseudoMerge.<name>.pattern::
+ Regular expression used to match reference names. Commits
+ pointed to by references matching this pattern (and meeting
+ the below criteria, like `bitmapPseudoMerge.<name>.sampleRate`
+ and `bitmapPseudoMerge.<name>.threshold`) will be considered
+ for inclusion in a pseudo-merge bitmap.
++
+Commits are grouped into pseudo-merge groups based on whether or not
+any reference(s) that point at a given commit match the pattern, which
+is an extended regular expression.
++
+Within a pseudo-merge group, commits may be further grouped into
+sub-groups based on the capture groups in the pattern. Each sub-group
+is identified by concatenating the text matched by each of the
+pattern's capture groups, with a '-' dash in between consecutive
+captures.
++
+For example, if the pattern is `refs/tags/`, then all tags (provided
+they meet the below criteria) will be considered candidates for the
+same pseudo-merge group. However, if the pattern is instead
+`refs/remotes/([0-9]+)/tags/`, then tags from different remotes will
+be grouped into separate pseudo-merge groups, based on the remote
+number.
+
+bitmapPseudoMerge.<name>.decay::
+ Determines the rate at which consecutive pseudo-merge bitmap
+ groups decrease in size. Must be non-negative. This parameter
+ can be thought of as `k` in the function `f(n) = C *
+ n^(-k/100)`, where `f(n)` is the size of the `n`th group.
++
+Setting the decay rate equal to `0` will cause all groups to be the
+same size. Setting the decay rate equal to `100` will cause the `n`th
+group to be `1/n` the size of the initial group. Higher values of the
+decay rate cause consecutive groups to shrink at an increasing rate.
+The default is `100`.
+
+bitmapPseudoMerge.<name>.sampleRate::
+ Determines the proportion of non-bitmapped commits (among
+ reference tips) which are selected for inclusion in an
+ unstable pseudo-merge bitmap. Must be between `0` and `100`
+ (inclusive). The default is `100`.
+
+bitmapPseudoMerge.<name>.threshold::
+ Determines the minimum age of non-bitmapped commits (among
+ reference tips, as above) which are candidates for inclusion
+ in an unstable pseudo-merge bitmap. The default is
+ `1.week.ago`.
+
+bitmapPseudoMerge.<name>.maxMerges::
+ Determines the maximum number of pseudo-merge commits among
+ which commits may be distributed.
++
+For pseudo-merge groups whose pattern does not contain any capture
+groups, this setting is applied for all commits matching the regular
+expression. For patterns that have one or more capture groups, this
+setting is applied for each distinct capture group.
++
+For example, if your pattern is `refs/tags/`, then this setting
+will distribute all tags into a maximum of `maxMerges` pseudo-merge
+commits. However, if your pattern is, say,
+`refs/remotes/([0-9]+)/tags/`, then this setting will be applied to
+each remote's set of tags individually.
++
+Must be non-negative. The default value is 64.
+
+bitmapPseudoMerge.<name>.stableThreshold::
+ Determines the minimum age of commits (among reference tips,
+ as above) which are candidates for a stable pseudo-merge
+ bitmap. Unlike unstable pseudo-merges, commits are still
+ considered candidates even when they are already covered by
+ a bitmap. The default is `1.month.ago`.
+
+bitmapPseudoMerge.<name>.stableSize::
+ Determines the size (in number of commits) of a stable
+ pseudo-merge bitmap. The default is `512`.
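
The sub-grouping rule documented above (join the text of each capture
group with a '-' dash) can be sketched as follows. This is only an
illustration using POSIX regcomp()/regexec(), not the code from this
series; the helper name subgroup_key() and the MAX_CAPTURES limit are
made up for the example.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Illustrative limit on the number of capture groups examined. */
#define MAX_CAPTURES 8

/*
 * Hypothetical helper (not part of this patch): match `refname` against
 * an extended regular expression and build a sub-group key by joining
 * the text of every capture group with '-'.
 *
 * Returns 1 on a match, 0 if the ref does not match, and -1 if the
 * pattern does not compile or the key does not fit in `key`.
 */
static int subgroup_key(const char *pattern, const char *refname,
			char *key, size_t keylen)
{
	regex_t re;
	regmatch_t m[MAX_CAPTURES + 1];
	size_t used = 0;
	int ret = 1;

	if (!keylen || regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	key[0] = '\0';

	if (regexec(&re, refname, MAX_CAPTURES + 1, m, 0)) {
		ret = 0;
		goto done;
	}

	/* m[0] is the whole match; capture groups start at m[1]. */
	for (size_t i = 1; i <= re.re_nsub && i <= MAX_CAPTURES; i++) {
		int len, n;

		if (m[i].rm_so < 0)
			continue; /* group did not take part in the match */

		len = (int)(m[i].rm_eo - m[i].rm_so);
		n = snprintf(key + used, keylen - used, "%s%.*s",
			     used ? "-" : "", len, refname + m[i].rm_so);
		if (n < 0 || (size_t)n >= keylen - used) {
			ret = -1; /* key buffer too small */
			goto done;
		}
		used += n;
	}
done:
	regfree(&re);
	return ret;
}
```

With the pattern `refs/remotes/([0-9]+)/tags/`, the ref
`refs/remotes/42/tags/v1.0` yields the key "42". A pattern without
capture groups, such as `refs/tags/`, yields an empty key, so every
matching ref lands in the same group.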
diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 63a7177ac08..ed7edf98034 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -434,3 +434,29 @@ the end of a `.bitmap` file. The format is as follows:
* An 8-byte unsigned value (in network byte-order) equal to the number
of bytes in the pseudo-merge section (including this field).
+
+=== Pseudo-merge selection
+
+Pseudo-merge commits are selected among non-bitmapped commits at the
+tip of one or more reference(s). In addition, there are a handful of
+constraints to further refine this selection:
+
+`pack.bitmapPseudoMergeDecay`:: Defines the "decay rate", which
+corresponds to how quickly (or not) consecutive pseudo-merges decrease
+in size relative to one another.
+
+`pack.bitmapPseudoMergeGroups`:: Defines the maximum number of
+pseudo-merge groups.
+
+`pack.bitmapPseudoMergeSampleRate`:: Defines the percentage of commits
+(matching the above criteria) which are selected.
+
+`pack.bitmapPseudoMergeThreshold`:: Defines the minimum age of a commit
+in order to be considered for inclusion within one or more pseudo-merge
+bitmaps.
+
+The size of consecutive pseudo-merge groups decays according to a
+power-law decay function, where the size of the `n`-th group is `f(n) =
+C*n^-k`. The value of `C` is chosen to match the desired number of
+groups, and `k` is 1/100th of the value of
+`pack.bitmapPseudoMergeDecay`.
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index dab5bdea806..e06930e10b9 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -17,6 +17,7 @@
#include "trace2.h"
#include "tree.h"
#include "tree-walk.h"
+#include "pseudo-merge.h"
struct bitmapped_commit {
struct commit *commit;
@@ -39,6 +40,8 @@ struct bitmap_writer {
struct bitmapped_commit *selected;
unsigned int selected_nr, selected_alloc;
+ struct string_list pseudo_merge_groups;
+ kh_oid_map_t *pseudo_merge_commits; /* oid -> pseudo merge(s) */
uint32_t pseudo_merges_nr;
struct progress *progress;
@@ -56,6 +59,11 @@ static inline int bitmap_writer_selected_nr(void)
void bitmap_writer_init(struct repository *r)
{
writer.bitmaps = kh_init_oid_map();
+ writer.pseudo_merge_commits = kh_init_oid_map();
+
+ string_list_init_dup(&writer.pseudo_merge_groups);
+
+ load_pseudo_merges_from_config(&writer.pseudo_merge_groups);
}
void bitmap_writer_show_progress(int show)
@@ -686,6 +694,12 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
}
stop_progress(&writer.progress);
+
+ select_pseudo_merges(&writer.pseudo_merge_groups,
+ indexed_commits, indexed_commits_nr,
+ writer.pseudo_merge_commits,
+ &writer.pseudo_merges_nr,
+ writer.show_progress);
}
--
2.45.0.23.gc6f94b99219