user/dev discussion of public-inbox itself
* [PATCH 2/2] lei_mirror: fetch most-recently-updated repos, first
  2023-02-12 23:18  6% [PATCH 0/2] lei_mirror: more tweaks Eric Wong
@ 2023-02-12 23:18  7% ` Eric Wong
  0 siblings, 0 replies; 2+ results
From: Eric Wong @ 2023-02-12 23:18 UTC (permalink / raw)
  To: meta

Within the same forkgroup, we can assume the most recently updated
repo has the most data, so fetch those, first.  We'll save new clones
for last since we can preserve {reference} ordering for them.
---
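A minimal, standalone sketch of the ordering described above, for
illustration only.  order_fgrp() and the sample repo names are
hypothetical; only the `-sort' key and the sort expression mirror
the patch below.  Resumed repos are sorted newest-first by their
modification time, and new clones are appended in their existing
(reference) order:

```perl
use strict;
use warnings;

# Hypothetical helper: takes [ resumed repos ] and [ new clones ],
# returns the fetch order for one forkgroup.
sub order_fgrp {
	my ($old, $new) = @_;
	# most recently modified first, same comparison as the patch
	my @ordered = sort { $b->{'-sort'} <=> $a->{'-sort'} } @$old;
	# new clones keep their incoming order and go last
	push @ordered, @$new;
	@ordered;
}

# Sample data (made up); '-sort' stands in for the manifest
# "modified" timestamp of a repo which already exists locally.
my @old = (
	{ name => 'a.git', '-sort' => 100 },
	{ name => 'b.git', '-sort' => 300 },
	{ name => 'c.git', '-sort' => 200 },
);
my @new = ({ name => 'd.git' }, { name => 'e.git' });

my @names = map { $_->{name} } order_fgrp(\@old, \@new);
print "@names\n"; # b.git c.git a.git d.git e.git
```

The sketch copies @$old instead of sorting in place, unlike the
patch, so the caller's arrays are left untouched.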
 lib/PublicInbox/LeiMirror.pm | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index dd6356bb..4dedac9b 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -23,7 +23,7 @@ use PublicInbox::SHA qw(sha256_hex sha1_hex);
 use POSIX qw(strftime);
 
 our $LIVE; # pid => callback
-our $FGRP_TODO; # objstore -> [ fgrp mirror objects ]
+our $FGRP_TODO; # objstore -> [[ to resume ], [ to clone ]]
 our $TODO; # reference => [ non-fgrp mirror objects ]
 our @PUH; # post-update hooks
 
@@ -404,9 +404,12 @@ sub fgrp_fetch_all {
 		(fetch_args($self->{lei}, $opt), qw(--no-tags --multiple));
 	};
 	push(@fetch, "-j$j") if $j;
-	while (my ($osdir, $fgrpv) = each %$todo) {
+	while (my ($osdir, $fgrp_old_new) = each %$todo) {
 		my $f = "$osdir/config";
 		return if !keep_going($self);
+		my ($fgrpv, $new) = @$fgrp_old_new;
+		@$fgrpv = sort { $b->{-sort} <=> $a->{-sort} } @$fgrpv;
+		push @$fgrpv, @$new; # $new is ordered by references
 
 		my $cmd = ['git', "--git-dir=$osdir", qw(config -f), $f ];
 		# clobber group from previous run atomically
@@ -568,7 +571,8 @@ sub fgrp_enqueue {
 	my ($fgrp, $end) = @_; # $end calls fgrp_fetch_all
 	return if !keep_going($fgrp);
 	++$fgrp->{chg}->{nr_chg};
-	push @{$FGRP_TODO->{$fgrp->{-osdir}}}, $fgrp;
+	my $dst = $FGRP_TODO->{$fgrp->{-osdir}} //= [ [], [] ]; # [ old, new ]
+	push @{$dst->[defined($fgrp->{-sort}) ? 0 : 1]}, $fgrp;
 }
 
 sub clone_v1 {
@@ -586,8 +590,12 @@ sub clone_v1 {
 	my $resume = -d $dst;
 	if (my $fgrp = forkgroup_prep($self, $uri)) {
 		$fgrp->{-fini} = $fini;
-		$resume ? cmp_fp_do($fgrp, \&fgrp_enqueue, $end)
-			: fgrp_enqueue($fgrp, $end);
+		if ($resume) {
+			$fgrp->{-sort} = $fgrp->{-ent}->{modified};
+			cmp_fp_do($fgrp, \&fgrp_enqueue, $end);
+		} else { # new repo, save for last
+			fgrp_enqueue($fgrp, $end);
+		}
 	} elsif ($resume) {
 		cmp_fp_do($self, \&resume_fetch, $uri, $fini);
 	} else { # normal clone

* [PATCH 0/2] lei_mirror: more tweaks
@ 2023-02-12 23:18  6% Eric Wong
  2023-02-12 23:18  7% ` [PATCH 2/2] lei_mirror: fetch most-recently-updated repos, first Eric Wong
  0 siblings, 1 reply; 2+ results
From: Eric Wong @ 2023-02-12 23:18 UTC (permalink / raw)
  To: meta

The proposed-for-git `fetch.hideRefs' isn't supported, yet;
I'm still testing to see if it's harmful for new clones
(I suspect so), and how to reduce its impact while still
being able to clone all kernel forks on kernel.org
on RAM-constrained systems.
https://public-inbox.org/git/20230212090426.M558990@dcvr/
("fetch: support hideRefs to speed up connectivity checks")

Eric Wong (2):
  lei_mirror: further reduce `git config' calls
  lei_mirror: fetch most-recently-updated repos, first

 lib/PublicInbox/LeiMirror.pm | 80 ++++++++++++++++++++++--------------
 1 file changed, 49 insertions(+), 31 deletions(-)

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
