user/dev discussion of public-inbox itself
From: Eric Wong <e@80x24.org>
To: meta@public-inbox.org
Subject: [PATCH 2/5] inbox: perform cleanup of Git objects for coderepos
Date: Thu, 31 Jan 2019 04:27:21 +0000
Message-ID: <20190131042724.2675-3-e@80x24.org>
In-Reply-To: <20190131042724.2675-1-e@80x24.org>

Otherwise, long-running but idle git processes may keep unlinked
packs around indefinitely and waste disk space.
---
 lib/PublicInbox/Git.pm   | 18 ++++++++++++++----
 lib/PublicInbox/Inbox.pm | 17 +++++++++++++++--
 t/git.t                  |  4 ++++
 3 files changed, 33 insertions(+), 6 deletions(-)
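
Note on the expiry check: the Git.pm change below relies on the mtime
of a pipe's read end to decide whether a "git cat-file" process has
been idle. A minimal standalone sketch of that check follows. The
helper name pipe_idle is hypothetical and not part of this patch, and
the result depends on the OS reporting a meaningful mtime for pipes
(the patch only claims this for FreeBSD 11.2 and Linux 4.20).

```perl
use strict;
use warnings;

# Hypothetical helper (not in the patch): true if the read end of a
# pipe has an mtime at or before $expire, i.e. the pipe looks idle.
# This is the inverse of the patch's "return if $mtime > $expire".
sub pipe_idle {
	my ($rfh, $expire) = @_;
	my $mtime = (stat($rfh))[9];
	defined $mtime or die "fstat failed: $!\n";
	return $mtime <= $expire;
}

pipe(my $r, my $w) or die "pipe failed: $!\n";
syswrite($w, "x") or die "write failed: $!\n";

# Compare against a cutoff 60 seconds in the past, as cleanup_task does
print "idle vs. time - 60: ", (pipe_idle($r, time - 60) ? 1 : 0), "\n";
# Compare against a cutoff in the future, as t/git.t does to force expiry
print "idle vs. time + 60: ", (pipe_idle($r, time + 60) ? 1 : 0), "\n";
```

On a system that behaves as the code comment describes, the first
check should report the pipe as active and the second as idle.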

diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index e844884..a756684 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -206,7 +206,15 @@ sub check {
 }
 
 sub _destroy {
-	my ($self, $in, $out, $pid) = @_;
+	my ($self, $in, $out, $pid, $expire) = @_;
+	my $rfh = $self->{$in} or return;
+	if (defined $expire) {
+		# at least FreeBSD 11.2 and Linux 4.20 update mtime of the
+		# read end of a pipe when the pipe is written to; dunno
+		# about other OSes.
+		my $mtime = (stat($rfh))[9];
+		return if $mtime > $expire;
+	}
 	my $p = delete $self->{$pid} or return;
 	foreach my $f ($in, $out) {
 		delete $self->{$f};
@@ -236,10 +244,12 @@ sub qx {
 	<$fh>
 }
 
+# returns true if there are pending "git cat-file" processes
 sub cleanup {
-	my ($self) = @_;
-	_destroy($self, qw(in out pid));
-	_destroy($self, qw(in_c out_c pid_c));
+	my ($self, $expire) = @_;
+	_destroy($self, qw(in out pid), $expire);
+	_destroy($self, qw(in_c out_c pid_c), $expire);
+	!!($self->{pid} || $self->{pid_c});
 }
 
 # assuming a well-maintained repo, this should be a somewhat
diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm
index d57e46d..6fe896f 100644
--- a/lib/PublicInbox/Inbox.pm
+++ b/lib/PublicInbox/Inbox.pm
@@ -22,12 +22,25 @@ my $cleanup_broken = $@;
 my $CLEANUP = {}; # string(inbox) -> inbox
 sub cleanup_task () {
 	$cleanup_timer = undef;
+	my $next = {};
 	for my $ibx (values %$CLEANUP) {
-		foreach my $f (qw(git mm search)) {
+		my $again;
+		foreach my $f (qw(mm search)) {
 			delete $ibx->{$f} if SvREFCNT($ibx->{$f}) == 1;
 		}
+		my $expire = time - 60;
+		if (my $git = $ibx->{git}) {
+			$again = $git->cleanup($expire);
+		}
+		if (my $gits = $ibx->{-repo_objs}) {
+			foreach my $git (@$gits) {
+				$again = 1 if $git->cleanup($expire);
+			}
+		}
+		$again ||= !!($ibx->{mm} || $ibx->{search});
+		$next->{"$ibx"} = $ibx if $again;
 	}
-	$CLEANUP = {};
+	$CLEANUP = $next;
 }
 
 sub _cleanup_later ($) {
diff --git a/t/git.t b/t/git.t
index 9c80fbb..d637e63 100644
--- a/t/git.t
+++ b/t/git.t
@@ -142,6 +142,10 @@ if ('alternates reloaded') {
 	open $fh, '<', "$alt/config" or die "open failed: $!\n";
 	my $config = eval { local $/; <$fh> };
 	is($$found, $config, 'alternates reloaded');
+
+	ok($gcf->cleanup(time - 30), 'cleanup did not expire');
+	ok(!$gcf->cleanup(time + 30), 'cleanup can expire');
+	ok(!$gcf->cleanup, 'cleanup idempotent');
 }
 
 use_ok 'PublicInbox::Git', qw(git_unquote git_quote);
-- 
EW
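
Note on the requeue logic: cleanup() now returns true while a process
is still alive, and cleanup_task keeps such inboxes queued for the
next pass. The sketch below illustrates that contract in isolation.
MockGit and cleanup_pass are hypothetical stand-ins, not code from
public-inbox, and a stored timestamp replaces the pipe mtime.

```perl
use strict;
use warnings;

# Hypothetical stand-in for PublicInbox::Git: tracks only a last-use time.
package MockGit;
sub new {
	my ($class, $last_used) = @_;
	bless { last_used => $last_used, pid => 1 }, $class;
}

# Same contract as the patched cleanup(): expire if idle (or if no
# cutoff is given), then return true if the process is still running.
sub cleanup {
	my ($self, $expire) = @_;
	if ($self->{pid} && (!defined($expire) || $self->{last_used} <= $expire)) {
		delete $self->{pid};
	}
	!!$self->{pid};
}

package main;

# One pass over the queue; returns the entries to revisit next time.
sub cleanup_pass {
	my ($queue, $now) = @_;
	my $next = {};
	for my $name (keys %$queue) {
		my $git = $queue->{$name};
		$next->{$name} = $git if $git->cleanup($now - 60);
	}
	$next;
}

my $now = time;
my $queue = {
	busy => MockGit->new($now - 5),   # used 5 seconds ago
	idle => MockGit->new($now - 300), # used 5 minutes ago
};
$queue = cleanup_pass($queue, $now);
print join(',', sort keys %$queue), "\n"; # prints "busy"
```

Only the recently used entry stays queued; the idle one is expired
and dropped, which is the behavior the three new t/git.t assertions
check against the real object.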


Thread overview: 6+ messages
2019-01-31  4:27 [PATCH 0/5] a few more solver fixups and improvements Eric Wong
2019-01-31  4:27 ` [PATCH 1/5] t/config.t: test PublicInbox::Git sharing between inboxes Eric Wong
2019-01-31  4:27 ` [PATCH 2/5] inbox: perform cleanup of Git objects for coderepos Eric Wong [this message]
2019-01-31  4:27 ` [PATCH 3/5] solvergit: allow searching on longer-than-needed OIDs Eric Wong
2019-01-31  4:27 ` [PATCH 4/5] solvergit: allow shorter-than-necessary OIDs from user Eric Wong
2019-01-31  4:27 ` [PATCH 5/5] viewvcs: support streaming large blobs Eric Wong