user/dev discussion of public-inbox itself
* [PATCH] doc: add some recommendations around slow HDDs
@ 2020-07-17  3:57 Eric Wong
From: Eric Wong @ 2020-07-17  3:57 UTC (permalink / raw)
  To: meta

Even when serialized, grok-pull is still painful on an old
USB 2.0 HDD, but it can at least finish when run under flock(1)
with parallelization disabled.  Parallel "git fetch" doesn't
seem so bad, but parallel reads in Xapian exacerbate slow seeks,
so some updates can take days instead of hours.  The same
updates take only seconds or minutes on an SSD.
---
 Documentation/public-inbox-index.pod   | 10 ++++++++++
 examples/grok-pull.post_update_hook.sh |  6 ++++++
 2 files changed, 16 insertions(+)
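
The serialization described in the commit message can be illustrated
with a small standalone sketch.  It is not the hook itself: the
`run_serialized' helper, the lock file and the log file are
hypothetical names, and `sleep 1' stands in for a slow indexing run.
It only assumes flock(1) from util-linux is installed.

```sh
#!/bin/sh
# Sketch: two concurrent jobs contend for one lock, so their work
# never interleaves.  All names below are made up for illustration.
tmpdir=$(mktemp -d)
lock="$tmpdir/index.lock"
log="$tmpdir/index.log"
: >"$log"

run_serialized() {
	# flock(1) takes an exclusive lock on $lock (creating the file
	# if needed) and holds it until the child command exits
	flock "$lock" sh -c '
		echo "start $1" >>"$2"
		sleep 1	# stand-in for a slow, seek-heavy job
		echo "end $1" >>"$2"
	' sh "$1" "$log"
}

# start both jobs at once, as multiple pull_threads would
run_serialized a &
run_serialized b &
wait

# each "start" is followed by its own "end" before the next job begins
cat "$log"
```

Which job wins the lock first is not deterministic, but the second
job always waits for the first to finish, which is the property that
avoids seek contention on a slow disk.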

diff --git a/Documentation/public-inbox-index.pod b/Documentation/public-inbox-index.pod
index b1b24917b..ff2e54867 100644
--- a/Documentation/public-inbox-index.pod
+++ b/Documentation/public-inbox-index.pod
@@ -32,6 +32,16 @@ normal search functionality.
 
 =over
 
+=item --jobs=JOBS, -j
+
+Control the number of Xapian indexing jobs in a
+(L<public-inbox-v2-format(5)>) inbox.
+
+C<--jobs=0> is accepted as of public-inbox 1.6.0 (PENDING)
+to disable parallel indexing.
+
+Default: the number of existing Xapian shards
+
 =item --compact / -c
 
 Compacts the Xapian DBs after indexing.  This is recommended
diff --git a/examples/grok-pull.post_update_hook.sh b/examples/grok-pull.post_update_hook.sh
index 3ead39440..ec4ae93e8 100755
--- a/examples/grok-pull.post_update_hook.sh
+++ b/examples/grok-pull.post_update_hook.sh
@@ -1,4 +1,9 @@
 #!/bin/sh
+
+# use flock(1) from util-linux to avoid seek contention on slow HDDs
+# when using multiple `pull_threads' with grok-pull:
+# [ "${FLOCKER}" != "$0" ] && exec env FLOCKER="$0" flock "$0" "$0" "$@" || :
+
 # post_update_hook for repos.conf as used by grok-pull, takes a full
 # git repo path as it's first and only arg.
 full_git_dir="$1"
@@ -119,6 +124,7 @@ then
 		: v2 inboxes may be init-ed with an empty msgmap
 		;;
 	*)
+		# if on HDD and limited RAM, add `-j0' w/ public-inbox 1.6.0+
 		$EATMYDATA public-inbox-index -v "$inbox_dir"
 		;;
 	esac
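
One way to act on the `-j0' comment in the hook is to pick the flag
from the kernel's view of the disk.  The sketch below is only an
illustration under assumptions: `jobs_flag' and `index_cmd' are
hypothetical helpers, the caller is expected to pass the contents of
/sys/block/<dev>/queue/rotational (Linux only; 1 means a spinning
disk), RAM is not considered, and the command is printed rather than
run.  `-j0' itself requires public-inbox 1.6.0+ as noted in the patch.

```sh
#!/bin/sh
# Sketch: choose `-j0' for rotational disks.  Helper names are
# hypothetical and not part of public-inbox.

# $1 = contents of /sys/block/<dev>/queue/rotational (1 = HDD)
jobs_flag() {
	case "$1" in
	1) echo '-j0' ;;	# spinning disk: disable parallel indexing
	*) echo '' ;;		# SSD or unknown: keep the default
	esac
}

# $1 = rotational flag, $2 = inbox directory
index_cmd() {
	flag=$(jobs_flag "$1")
	# $flag is deliberately unquoted so an empty value disappears
	echo public-inbox-index $flag -v "$2"
}

index_cmd 1 /path/to/inbox
index_cmd 0 /path/to/inbox
```

Mapping an inbox directory to its underlying block device is left
out on purpose, since that depends on the local storage layout.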


This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/meta
	git clone --mirror http://czquwvybam4bgbro.onion/meta
	git clone --mirror http://hjrcffqmbrq6wope.onion/meta
	git clone --mirror http://ou63pmih66umazou.onion/meta

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 meta meta/ https://public-inbox.org/meta \
		meta@public-inbox.org
	public-inbox-index meta

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.mail.public-inbox.meta
	nntp://7fh6tueqddpjyxjmgtdiueylzoqt6pt7hec3pukyptlmohoowvhde4yd.onion/inbox.comp.mail.public-inbox.meta
	nntp://ie5yzdi7fg72h7s4sdcztq5evakq23rdt33mfyfcddc5u3ndnw24ogqd.onion/inbox.comp.mail.public-inbox.meta
	nntp://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/inbox.comp.mail.public-inbox.meta
	nntp://news.gmane.io/gmane.mail.public-inbox.general
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for project(s) associated with this inbox:

	https://80x24.org/public-inbox.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git