git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Linus Torvalds <torvalds@linuxfoundation.org>
To: "Frank Ch. Eigler" <fche@redhat.com>, Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: reducing prune sync()s
Date: Thu, 29 May 2008 17:27:35 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LFD.1.10.0805291656260.3141@woody.linux-foundation.org> (raw)
In-Reply-To: <20080529205743.GC17123@redhat.com>



On Thu, 29 May 2008, Frank Ch. Eigler wrote:
> 
> Would there be interest in making this sync disableable with
> git-config?

No, that's just too scary. But..

>	  Or perhaps having the blanket sync be replaced a
> list of fsync()s for only the relevant git repository files?

That would be much better. The code was ported from shell script, and 
there is no fsync() in shell, but the rule should basically be that you 
can remove all the objects that correspond to a pack-file after you have 
made sure that the pack-file (and it's index - we can re-generate the pack 
index, but realistically speaking it's *much* better to not have to) is 
stable on disk.

Soemthing like this *may* work. THIS IS TOTALLY UNTESTED. And when I say 
"TOTALLY UNTESTED", I mean it. Zero testing. None. Nada. Zilch. Testing is 
for people who are actually interested in the feature (hint, hint).

But if somebody tests this and actually does an strace and sees that yes, 
it does the fsync() on each pack-file it cares about, then I'll happily 
sign off on this patch. It's fairly obvious.

What it does is to make "has_sha1_pack()" return the actual packed_git 
pointer it found that contains the SHA1 in question, instead of just an 
integer boolean. That changes no users, since everybody tests it for just 
truthiness value anyway.

It then introduces a "sha1_is_stably_packed()" function that not just 
looks up the pack entry, but also does an fsync() on it if it isn't 
already marked stable. Fairly trivial, as I said. But this is important 
code, so it really does need actual real-life testing. Trivial code often 
has trivial bugs.

		Linus

---
 builtin-prune-packed.c |   23 +++++++++++++++++++++--
 cache.h                |    6 ++++--
 sha1_file.c            |    9 ++++++---
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/builtin-prune-packed.c b/builtin-prune-packed.c
index 23faf31..b64a5d4 100644
--- a/builtin-prune-packed.c
+++ b/builtin-prune-packed.c
@@ -10,6 +10,26 @@ static const char prune_packed_usage[] =
 
 static struct progress *progress;
 
+static int sha1_is_stably_packed(const unsigned char *sha1)
+{
+	struct packed_git *pack;
+
+	pack = has_sha1_pack(sha1, NULL);
+	if (!pack)
+		return 0;
+	if (!pack->pack_stable) {
+		int fd = pack->pack_fd;
+		if (fd < 0) {
+			if (open_packed_git(pack) < 0)
+				return 0;
+			fd = pack->pack_fd;
+		}
+		fsync(fd);
+		pack->pack_stable = 1;
+	}
+	return 1;
+}
+
 static void prune_dir(int i, DIR *dir, char *pathname, int len, int opts)
 {
 	struct dirent *de;
@@ -23,7 +43,7 @@ static void prune_dir(int i, DIR *dir, char *pathname, int len, int opts)
 		memcpy(hex+2, de->d_name, 38);
 		if (get_sha1_hex(hex, sha1))
 			continue;
-		if (!has_sha1_pack(sha1, NULL))
+		if (!sha1_is_stably_packed(sha1))
 			continue;
 		memcpy(pathname + len, de->d_name, 38);
 		if (opts & DRY_RUN)
@@ -85,7 +105,6 @@ int cmd_prune_packed(int argc, const char **argv, const char *prefix)
 		/* Handle arguments here .. */
 		usage(prune_packed_usage);
 	}
-	sync();
 	prune_packed_objects(opts);
 	return 0;
 }
diff --git a/cache.h b/cache.h
index eab1a17..3bb85eb 100644
--- a/cache.h
+++ b/cache.h
@@ -540,7 +540,7 @@ extern int write_sha1_from_fd(const unsigned char *sha1, int fd, char *buffer,
 extern int write_sha1_to_fd(int fd, const unsigned char *sha1);
 extern int move_temp_to_file(const char *tmpfile, const char *filename);
 
-extern int has_sha1_pack(const unsigned char *sha1, const char **ignore);
+extern struct packed_git *has_sha1_pack(const unsigned char *sha1, const char **ignore);
 extern int has_sha1_file(const unsigned char *sha1);
 
 extern int has_pack_file(const unsigned char *sha1);
@@ -646,7 +646,8 @@ extern struct packed_git {
 	int index_version;
 	time_t mtime;
 	int pack_fd;
-	int pack_local;
+	unsigned int pack_local:1,
+		     pack_stable:1;
 	unsigned char sha1[20];
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
@@ -708,6 +709,7 @@ extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 
 extern void pack_report(void);
 extern int open_pack_index(struct packed_git *);
+extern int open_packed_git(struct packed_git *p);
 extern unsigned char* use_pack(struct packed_git *, struct pack_window **, off_t, unsigned int *);
 extern void close_pack_windows(struct packed_git *);
 extern void unuse_pack(struct pack_window **);
diff --git a/sha1_file.c b/sha1_file.c
index 9679040..52e9824 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -703,7 +703,7 @@ static int open_packed_git_1(struct packed_git *p)
 	return 0;
 }
 
-static int open_packed_git(struct packed_git *p)
+int open_packed_git(struct packed_git *p)
 {
 	if (!open_packed_git_1(p))
 		return 0;
@@ -824,6 +824,7 @@ struct packed_git *add_packed_git(const char *path, int path_len, int local)
 	p->windows = NULL;
 	p->pack_fd = -1;
 	p->pack_local = local;
+	p->pack_stable = 0;
 	p->mtime = st.st_mtime;
 	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
 		hashclr(p->sha1);
@@ -2381,10 +2382,12 @@ int has_pack_file(const unsigned char *sha1)
 	return 1;
 }
 
-int has_sha1_pack(const unsigned char *sha1, const char **ignore_packed)
+struct packed_git *has_sha1_pack(const unsigned char *sha1, const char **ignore_packed)
 {
 	struct pack_entry e;
-	return find_pack_entry(sha1, &e, ignore_packed);
+	if (find_pack_entry(sha1, &e, ignore_packed))
+		return e.p;
+	return NULL;
 }
 
 int has_sha1_file(const unsigned char *sha1)

  reply	other threads:[~2008-05-30  0:28 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-05-29 20:57 reducing prune sync()s Frank Ch. Eigler
2008-05-30  0:27 ` Linus Torvalds [this message]
2008-05-30  0:32   ` Linus Torvalds
2008-05-30  1:50     ` Frank Ch. Eigler
2008-05-30 20:07     ` Florian Weimer
2008-05-30  1:51   ` David Dillow
2008-05-30  2:17     ` Linus Torvalds
2008-05-30  2:30       ` Linus Torvalds
2008-05-30 15:25   ` Frank Ch. Eigler
2008-05-30 15:57     ` Linus Torvalds
2008-05-30 16:08       ` [PATCH 1/2] Make pack creation always fsync() the result Linus Torvalds
2008-05-30 16:11         ` [PATCH 2/2] Remove now unnecessary 'sync()' calls Linus Torvalds
2008-05-30 20:27         ` [PATCH 1/2] Make pack creation always fsync() the result Nicolas Pitre
2008-05-31 14:19         ` Frank Ch. Eigler
2008-06-02 22:23           ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.1.10.0805291656260.3141@woody.linux-foundation.org \
    --to=torvalds@linuxfoundation.org \
    --cc=fche@redhat.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).