user/dev discussion of public-inbox itself
* [PATCH 16/18] cindex: do not guess integer maximum for Xapian
  2023-11-13 13:15  4% [PATCH 00/18] cindex: some --associate work Eric Wong
@ 2023-11-13 13:15  7% ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2023-11-13 13:15 UTC (permalink / raw)
  To: meta

We can return an array to allow the caller to omit the internal
`-m' arg entirely.  We'll also allow any non-positive value to
mean there's no limit, and we'll defer the "unlimited" case to
the XapHelper implementation.  This frees us from having to deal
with mismatches between Perl and Xapian if Xapian was compiled
with 64-bit docid support and we're stuck on a 32-bit Perl
build.
---
 lib/PublicInbox/CodeSearchIdx.pm | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/PublicInbox/CodeSearchIdx.pm b/lib/PublicInbox/CodeSearchIdx.pm
index 04c514fe..8e6b921d 100644
--- a/lib/PublicInbox/CodeSearchIdx.pm
+++ b/lib/PublicInbox/CodeSearchIdx.pm
@@ -501,11 +501,10 @@ sub shard_commit { # via wq_io_do
 	send($op_p, "shard_done $self->{shard}", 0);
 }
 
-sub assoc_max_init ($) {
+sub assoc_max_args ($) {
 	my ($self) = @_;
 	my $max = $self->{-opt}->{'associate-max'} // $ASSOC_MAX;
-	$max = $ASSOC_MAX if !$max;
-	$max < 0 ? ((2 ** 31) - 1) : $max;
+	$max <= 0 ? () : ('-m', $max);
 }
 
 sub start_xhc () {
@@ -538,7 +537,7 @@ sub dump_roots_start {
 	run_await(\@sort, $CMD_ENV, $sort_opt, \&cmd_done, $associate);
 	run_await(\@UNIQ_FOLD, $fold_env, $fold_opt, \&cmd_done, $associate);
 	my @arg = ((map { ('-A', $_) } @ASSOC_PFX), '-c',
-		'-m', assoc_max_init($self), $root2id, $QRY_STR);
+		assoc_max_args($self), $root2id, $QRY_STR);
 	for my $d ($self->shard_dirs) {
 		pipe(my $err_r, my $err_w);
 		$XHC->mkreq([$sort_w, $err_w], qw(dump_roots -d), $d, @arg);
@@ -556,6 +555,8 @@ sub dump_ibx { # sends to xap_helper.h
 	my $srch = $ibx->isrch or return warn <<EOM;
 W: $ekey not indexed for search
 EOM
+	# note: we don't send associate_max_args to dump_ibx since we
+	# have to post-filter non-patch messages
 	my @cmd = ('dump_ibx', $srch->xh_args,
 			(map { ('-A', $_) } @ASSOC_PFX), $ibx_id, $QRY_STR);
 	pipe(my $r, my $w);
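
The idiom the patch above relies on can be sketched in isolation: a sub
that returns an empty list simply disappears when it is interpolated into
a larger argument list. This is only an illustration; `max_args' is a
hypothetical stand-in for assoc_max_args, and the placeholder strings
are not the real arguments.

```perl
use strict;
use warnings;

# Hypothetical stand-in for assoc_max_args: returns an empty list for
# "no limit", otherwise the ('-m', $max) pair.
sub max_args {
	my ($max) = @_;
	$max <= 0 ? () : ('-m', $max);
}

# An empty list vanishes when flattened into a larger list, so the
# caller never has to special-case the unlimited setting or guess at
# the largest integer Xapian can accept.
my @limited = ('-c', max_args(50000), 'ROOT2ID', 'QUERY');
my @unlimited = ('-c', max_args(-1), 'ROOT2ID', 'QUERY');
print "@limited\n";   # -c -m 50000 ROOT2ID QUERY
print "@unlimited\n"; # -c ROOT2ID QUERY
```

With no `-m' sent at all, the choice of what "unlimited" means stays on
the helper side, which is the point of the change.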


* [PATCH 00/18] cindex: some --associate work
@ 2023-11-13 13:15  4% Eric Wong
  2023-11-13 13:15  7% ` [PATCH 16/18] cindex: do not guess integer maximum for Xapian Eric Wong
  0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2023-11-13 13:15 UTC (permalink / raw)
  To: meta

Still very much in flux, but some treewide cleanups in there...

And I've been wondering if "join" is a better word than
"associate" to denote the relationship between inboxes
and coderepos.

But "join" (even if we use join(1) internally) probably
implies strict relationships, whereas our current "associate"
is always going to be fuzzy due to patchids being fuzzy
and blob OIDs being abbreviated in patches.

I'm also thinking about moving --associate-* CLI switches
into suboptions (e.g. what getsubopt(3) supports), so:

	--associate=aggressive,prefixes=patchid+dfblob

But Perl doesn't ship with getsubopt(3) emulation
out-of-the-box.
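
For illustration only, a minimal sketch of what such an emulation might
look like in plain Perl. `parse_subopts' is a hypothetical helper, not
part of public-inbox, and treating `+' as a multi-value separator is an
assumption drawn from the example above; getsubopt(3) itself only
splits on commas and `='.

```perl
use strict;
use warnings;

# Hypothetical helper: split a getsubopt(3)-style string into a hash.
# Bare tokens become true flags; "key=value" tokens keep their value
# as an array ref, with "+" separating multiple values (an assumption).
sub parse_subopts {
	my ($str) = @_;
	my %opt;
	for my $tok (split /,/, $str) {
		next if $tok eq '';
		my ($k, $v) = split /=/, $tok, 2;
		$opt{$k} = defined($v) ? [ split /\+/, $v ] : 1;
	}
	\%opt;
}

my $opt = parse_subopts('aggressive,prefixes=patchid+dfblob');
print "aggressive: $opt->{aggressive}\n";  # aggressive: 1
print "prefixes: @{$opt->{prefixes}}\n";   # prefixes: patchid dfblob
```

A real implementation would also need to reject unknown keys, which
getsubopt(3) handles via its token table.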

Eric Wong (18):
  cindex: check `say' errors w/ close or ->flush
  tmpfile: check `stat' errors, use autodie for unlink
  cindex: use `local' for pipes between processes
  xap_helper_cxx: use write_file helper
  xap_helper_cxx: make the build process ccache-friendly
  xap_helper_cxx: use -pipe by default in CXXFLAGS
  xap_client: spawn C++ xap_helper directly
  treewide: update read_all to avoid eof|close checks
  spawn: don't append to scalarrefs on stdout/stderr
  cindex: imply --all with --associate w/o -I/--only
  cindex: delay associate until prune+indexing finish
  xap_helper: Perl dump_ibx respects `-m MAX'
  cidx_xap_helper_aux: complain about truncated inputs
  xap_helper: stricter and harsher error handling
  xap_helper: better variable naming for key buffer
  cindex: do not guess integer maximum for Xapian
  cindex: rename associate-max => window
  cindex: support --associate-aggressive shortcut

 lib/PublicInbox/CidxComm.pm         |   6 +-
 lib/PublicInbox/CidxXapHelperAux.pm |   6 +-
 lib/PublicInbox/CodeSearchIdx.pm    | 122 ++++++++++-----
 lib/PublicInbox/Gcf2.pm             |   3 +-
 lib/PublicInbox/IO.pm               |  18 ++-
 lib/PublicInbox/LeiInput.pm         |  10 +-
 lib/PublicInbox/LeiMirror.pm        |  10 +-
 lib/PublicInbox/LeiToMail.pm        |   3 +-
 lib/PublicInbox/Spawn.pm            |   4 +-
 lib/PublicInbox/TestCommon.pm       |   6 +-
 lib/PublicInbox/Tmpfile.pm          |  10 +-
 lib/PublicInbox/XapClient.pm        |  28 ++--
 lib/PublicInbox/XapHelper.pm        |  30 ++--
 lib/PublicInbox/XapHelperCxx.pm     |  55 +++----
 lib/PublicInbox/xap_helper.h        | 233 ++++++++++++----------------
 script/public-inbox-cindex          |   3 +-
 script/public-inbox-learn           |   2 +-
 script/public-inbox-mda             |   2 +-
 script/public-inbox-purge           |   2 +-
 t/spawn.t                           |   2 +-
 t/xap_helper.t                      |  27 ++--
 21 files changed, 287 insertions(+), 295 deletions(-)

Yay, less code!


Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
