git@vger.kernel.org mailing list mirror (one of many)
* reducing prune sync()s
@ 2008-05-29 20:57 Frank Ch. Eigler
  2008-05-30  0:27 ` Linus Torvalds
  0 siblings, 1 reply; 15+ messages in thread
From: Frank Ch. Eigler @ 2008-05-29 20:57 UTC (permalink / raw)
  To: git

Hi -

In at least builtin-prune-packed.c, builtin-prune.c, and
git-repack.sh, there is an unconditional sync() call.  On a machine
with lots of dirty disk data being written to some slow device, this
sync can take a long time.

Would there be interest in making this sync disableable with
git-config?  Or perhaps having the blanket sync be replaced by a
list of fsync()s for only the relevant git repository files?

- FChE

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: reducing prune sync()s
  2008-05-29 20:57 reducing prune sync()s Frank Ch. Eigler
@ 2008-05-30  0:27 ` Linus Torvalds
  2008-05-30  0:32   ` Linus Torvalds
                     ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Linus Torvalds @ 2008-05-30  0:27 UTC (permalink / raw)
  To: Frank Ch. Eigler, Junio C Hamano; +Cc: Git Mailing List



On Thu, 29 May 2008, Frank Ch. Eigler wrote:
> 
> Would there be interest in making this sync disableable with
> git-config?

No, that's just too scary. But..

>	  Or perhaps having the blanket sync be replaced by a
> list of fsync()s for only the relevant git repository files?

That would be much better. The code was ported from shell script, and 
there is no fsync() in shell, but the rule should basically be that you 
can remove all the objects that correspond to a pack-file after you have 
made sure that the pack-file (and its index - we can re-generate the pack 
index, but realistically speaking it's *much* better to not have to) is 
stable on disk.

Something like this *may* work. THIS IS TOTALLY UNTESTED. And when I say 
"TOTALLY UNTESTED", I mean it. Zero testing. None. Nada. Zilch. Testing is 
for people who are actually interested in the feature (hint, hint).

But if somebody tests this and actually does an strace and sees that yes, 
it does the fsync() on each pack-file it cares about, then I'll happily 
sign off on this patch. It's fairly obvious.

What it does is to make "has_sha1_pack()" return the actual packed_git 
pointer it found that contains the SHA1 in question, instead of just an 
integer boolean. That changes no users, since everybody tests it only for 
truthiness anyway.

It then introduces a "sha1_is_stably_packed()" function that not only 
looks up the pack entry but also does an fsync() on it if it isn't 
already marked stable. Fairly trivial, as I said. But this is important 
code, so it really does need actual real-life testing. Trivial code often 
has trivial bugs.

		Linus

---
 builtin-prune-packed.c |   23 +++++++++++++++++++++--
 cache.h                |    6 ++++--
 sha1_file.c            |    9 ++++++---
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/builtin-prune-packed.c b/builtin-prune-packed.c
index 23faf31..b64a5d4 100644
--- a/builtin-prune-packed.c
+++ b/builtin-prune-packed.c
@@ -10,6 +10,26 @@ static const char prune_packed_usage[] =
 
 static struct progress *progress;
 
+static int sha1_is_stably_packed(const unsigned char *sha1)
+{
+	struct packed_git *pack;
+
+	pack = has_sha1_pack(sha1, NULL);
+	if (!pack)
+		return 0;
+	if (!pack->pack_stable) {
+		int fd = pack->pack_fd;
+		if (fd < 0) {
+			if (open_packed_git(pack) < 0)
+				return 0;
+			fd = pack->pack_fd;
+		}
+		fsync(fd);
+		pack->pack_stable = 1;
+	}
+	return 1;
+}
+
 static void prune_dir(int i, DIR *dir, char *pathname, int len, int opts)
 {
 	struct dirent *de;
@@ -23,7 +43,7 @@ static void prune_dir(int i, DIR *dir, char *pathname, int len, int opts)
 		memcpy(hex+2, de->d_name, 38);
 		if (get_sha1_hex(hex, sha1))
 			continue;
-		if (!has_sha1_pack(sha1, NULL))
+		if (!sha1_is_stably_packed(sha1))
 			continue;
 		memcpy(pathname + len, de->d_name, 38);
 		if (opts & DRY_RUN)
@@ -85,7 +105,6 @@ int cmd_prune_packed(int argc, const char **argv, const char *prefix)
 		/* Handle arguments here .. */
 		usage(prune_packed_usage);
 	}
-	sync();
 	prune_packed_objects(opts);
 	return 0;
 }
diff --git a/cache.h b/cache.h
index eab1a17..3bb85eb 100644
--- a/cache.h
+++ b/cache.h
@@ -540,7 +540,7 @@ extern int write_sha1_from_fd(const unsigned char *sha1, int fd, char *buffer,
 extern int write_sha1_to_fd(int fd, const unsigned char *sha1);
 extern int move_temp_to_file(const char *tmpfile, const char *filename);
 
-extern int has_sha1_pack(const unsigned char *sha1, const char **ignore);
+extern struct packed_git *has_sha1_pack(const unsigned char *sha1, const char **ignore);
 extern int has_sha1_file(const unsigned char *sha1);
 
 extern int has_pack_file(const unsigned char *sha1);
@@ -646,7 +646,8 @@ extern struct packed_git {
 	int index_version;
 	time_t mtime;
 	int pack_fd;
-	int pack_local;
+	unsigned int pack_local:1,
+		     pack_stable:1;
 	unsigned char sha1[20];
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
@@ -708,6 +709,7 @@ extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
 
 extern void pack_report(void);
 extern int open_pack_index(struct packed_git *);
+extern int open_packed_git(struct packed_git *p);
 extern unsigned char* use_pack(struct packed_git *, struct pack_window **, off_t, unsigned int *);
 extern void close_pack_windows(struct packed_git *);
 extern void unuse_pack(struct pack_window **);
diff --git a/sha1_file.c b/sha1_file.c
index 9679040..52e9824 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -703,7 +703,7 @@ static int open_packed_git_1(struct packed_git *p)
 	return 0;
 }
 
-static int open_packed_git(struct packed_git *p)
+int open_packed_git(struct packed_git *p)
 {
 	if (!open_packed_git_1(p))
 		return 0;
@@ -824,6 +824,7 @@ struct packed_git *add_packed_git(const char *path, int path_len, int local)
 	p->windows = NULL;
 	p->pack_fd = -1;
 	p->pack_local = local;
+	p->pack_stable = 0;
 	p->mtime = st.st_mtime;
 	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
 		hashclr(p->sha1);
@@ -2381,10 +2382,12 @@ int has_pack_file(const unsigned char *sha1)
 	return 1;
 }
 
-int has_sha1_pack(const unsigned char *sha1, const char **ignore_packed)
+struct packed_git *has_sha1_pack(const unsigned char *sha1, const char **ignore_packed)
 {
 	struct pack_entry e;
-	return find_pack_entry(sha1, &e, ignore_packed);
+	if (find_pack_entry(sha1, &e, ignore_packed))
+		return e.p;
+	return NULL;
 }
 
 int has_sha1_file(const unsigned char *sha1)


* Re: reducing prune sync()s
  2008-05-30  0:27 ` Linus Torvalds
@ 2008-05-30  0:32   ` Linus Torvalds
  2008-05-30  1:50     ` Frank Ch. Eigler
  2008-05-30 20:07     ` Florian Weimer
  2008-05-30  1:51   ` David Dillow
  2008-05-30 15:25   ` Frank Ch. Eigler
  2 siblings, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2008-05-30  0:32 UTC (permalink / raw)
  To: Frank Ch. Eigler, Junio C Hamano; +Cc: Git Mailing List



On Thu, 29 May 2008, Linus Torvalds wrote:
> On Thu, 29 May 2008, Frank Ch. Eigler wrote:
> > 
> >	  Or perhaps having the blanket sync be replaced by a
> > list of fsync()s for only the relevant git repository files?
> 
> That would be much better.

Side note: a lot of systems make "fsync()" pretty expensive too. It's one 
of my main disagreements with most log-based filesystems - fsync() can in 
theory be fast, but almost always implies flushing the whole log, even if 
99.9% of that log is totally unrelated to the actual file you want to 
fsync().

So fsync() isn't always all that much better than sync(). It *should* be, 
but reality sometimes bites. So testing should include at least some level 
of "yes, it actually improves things at least on xyz"...

			Linus


* Re: reducing prune sync()s
  2008-05-30  0:32   ` Linus Torvalds
@ 2008-05-30  1:50     ` Frank Ch. Eigler
  2008-05-30 20:07     ` Florian Weimer
  1 sibling, 0 replies; 15+ messages in thread
From: Frank Ch. Eigler @ 2008-05-30  1:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Hi --

On Thu, May 29, 2008 at 05:32:30PM -0700, Linus Torvalds wrote:

> [...]  Side note: a lot of systems make "fsync()" pretty expensive
> too. It's one of my main disagreements with most log-based
> filesystems - fsync() can in theory be fast, but almost always
> implies flushing the whole log, even if 99.9% of that log is totally
> unrelated to the actual file you want to fsync(). [...]

The scenario where I ran into this was different: a usb-drive
filesystem was busy receiving backups, collecting, oh, several
GB of dirty data, while git was working on a normal local disk on a
separate filesystem.  The system-wide git sync unnecessarily blocked
on the usb filesystem too.

- FChE


* Re: reducing prune sync()s
  2008-05-30  0:27 ` Linus Torvalds
  2008-05-30  0:32   ` Linus Torvalds
@ 2008-05-30  1:51   ` David Dillow
  2008-05-30  2:17     ` Linus Torvalds
  2008-05-30 15:25   ` Frank Ch. Eigler
  2 siblings, 1 reply; 15+ messages in thread
From: David Dillow @ 2008-05-30  1:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Frank Ch. Eigler, Junio C Hamano, Git Mailing List


On Thu, 2008-05-29 at 17:27 -0700, Linus Torvalds wrote:
> That would be much better. The code was ported from shell script, and 
> there is no fsync() in shell, but the rule should basically be that you 
> can remove all the objects that correspond to a pack-file after you have 
> made sure that the pack-file (and its index - we can re-generate the pack 
> index, but realistically speaking it's *much* better to not have to) is 
> stable on disk.

Even if the data is stable on disk, don't we also need to ensure the
pack's connectivity to the namespace is stable? Without an fsync()
of the directory that contains it, could it go away?

Of course, this is me recollecting a several-year-old exchange on LKML,
so I don't know whether it is still needed, on Linux or on any other
system.

Dave


* Re: reducing prune sync()s
  2008-05-30  1:51   ` David Dillow
@ 2008-05-30  2:17     ` Linus Torvalds
  2008-05-30  2:30       ` Linus Torvalds
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2008-05-30  2:17 UTC (permalink / raw)
  To: David Dillow
  Cc: Linus Torvalds, Frank Ch. Eigler, Junio C Hamano,
	Git Mailing List



On Thu, 29 May 2008, David Dillow wrote:
> 
> On Thu, 2008-05-29 at 17:27 -0700, Linus Torvalds wrote:
> > That would be much better. The code was ported from shell script, and 
> > there is no fsync() in shell, but the rule should basically be that you 
> > can remove all the objects that correspond to a pack-file after you have 
> > made sure that the pack-file (and its index - we can re-generate the pack 
> > index, but realistically speaking it's *much* better to not have to) is 
> > stable on disk.
> 
> Even if the data is stable on disk, don't we also need to ensure the
> pack's connectivity to the namespace is also stable? Without an fsync()
> of the directory that contains it, could it go away?

In theory, yes. That said, it would always be in lost+found, so the data 
wouldn't ever get really lost. In that sense it is no different from a lot 
of other theoretical git corruption issues - git in general does *not* 
guarantee that the repository will not need to be "fixed up", it just 
makes a strong case for

 - git will always at least *see* the corruption

   (ie it is by design very hard to corrupt a git repo silently and 
   subtly!)

 - git makes it very hard to lose data

   Old data is not overwritten, but that doesn't mean that you may not 
   have to _look_ for it!

An example of the latter is how a crash in the middle of "git commit" may 
actually cause partial *new* objects to be on disk (the objects themselves 
are not fsync'ed when written!) and may end up with the ref being updated 
while some object it points to was never written (again, a "git commit" does 
not wait until things are stable!), and the index file may be totally 
corrupt (again, no fsync anywhere!).

So if you have a system crash at a really bad time, you may have a git 
repository that needs manual intervention to actually be *usable*. I hope 
nobody ever believed anything else. That manual intervention may be things 
like:

 - throw away a corrupt .git/index file and re-create it (git read-tree)

 - possibly have to recover refs manually ("git fsck" + look at dangling 
   commits)

 - actually throw away broken commits, and re-create them (ie basically 
   doing a "git reset <known-good-state>" plus re-committing the working 
   tree or perhaps re-doing a whole "git am" series or something)

 - and yes, possibly recover lost inodes from /lost+found

Now, quite frankly, all of these are extremely rare. In most cases they 
will never happen at all, simply because the filesystem itself is more or 
less transactional. 

		Linus


* Re: reducing prune sync()s
  2008-05-30  2:17     ` Linus Torvalds
@ 2008-05-30  2:30       ` Linus Torvalds
  0 siblings, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2008-05-30  2:30 UTC (permalink / raw)
  To: David Dillow
  Cc: Linus Torvalds, Frank Ch. Eigler, Junio C Hamano,
	Git Mailing List



On Thu, 29 May 2008, Linus Torvalds wrote:
> 
> So if you have a system crash at a really bad time, you may have a git 
> repository that needs manual intervention to actually be *usable*. I hope 
> nobody ever believed anything else. That manual intervention may be things 
> like:
> ...
>  - actually throw away broken commits, and re-create them (ie basically 
>    doing a "git reset <known-good-state>" plus re-committing the working 
>    tree or perhaps re-doing a whole "git am" series or something)

The important part here is that it's only the *new* state that can end up as 
this kind of "broken commits". In other words, you'd never have to re-do actual 
*old* commits, just the commits you were doing as things crashed - the 
commits that you were in the middle of doing, and still have the data for.

Example from my case: I may have a series of 250+ commits that I create with 
"git am" when I sync up with Andrew, and I very much want the speed of 
being able to create all that new commit data without ever even causing a 
_single_ synchronous disk write.

So if the machine were to crash in the middle of the series, I might lose 
all of that data, but I still have my mailbox, so I'd just need to reset 
to the point before I even started the "git am", and re-do the whole 
series. My actual *base* repository objects would never get corrupted.

[ And one final notice: I don't know about others, but I've actually had 
  more corruption from disks going bad etc than from system crashes per 
  se. And when *that* happens, old data is obviously as easily gone as new 
  data is. So absolutely _nothing_ replaces backups. It doesn't matter if 
  you do a "fsync()" after every single byte write - a disk crash can and 
  will corrupt things that were "stable". So even "stable storage" is 
  very much unstable in the end. ]

			Linus


* Re: reducing prune sync()s
  2008-05-30  0:27 ` Linus Torvalds
  2008-05-30  0:32   ` Linus Torvalds
  2008-05-30  1:51   ` David Dillow
@ 2008-05-30 15:25   ` Frank Ch. Eigler
  2008-05-30 15:57     ` Linus Torvalds
  2 siblings, 1 reply; 15+ messages in thread
From: Frank Ch. Eigler @ 2008-05-30 15:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Hi -

On Thu, May 29, 2008 at 05:27:35PM -0700, Linus Torvalds wrote:
> [...]
> >	  Or perhaps having the blanket sync be replaced by a
> > list of fsync()s for only the relevant git repository files?
> [...]
> Something like this *may* work. THIS IS TOTALLY UNTESTED. And when I say 
> "TOTALLY UNTESTED", I mean it. Zero testing. None. Nada. Zilch. Testing is 
> for people who are actually interested in the feature (hint, hint).

The patch does add an fsync or two into the mix, but a "git gc" or 
"git repack -a" still goes through the "git-repack" shell script, which
still does its "sync".  How about this patch, which adds a "git-fsync"
builtin for the shell scripts?  Added to yours, it replaces all the
syncs with fsync()s, and tests fine in the same environment originally
reported.  (Lots of dirty data for another filesystem does not block
the fsyncs.)

diff --git a/.gitignore b/.gitignore
index 4ff2fec..708c5ac 100644
--- a/.gitignore
+++ b/.gitignore
@@ -47,6 +47,7 @@ git-for-each-ref
 git-format-patch
 git-fsck
 git-fsck-objects
+git-fsync
 git-gc
 git-get-tar-commit-id
 git-grep
diff --git a/Makefile b/Makefile
index 865e2bf..2148196 100644
--- a/Makefile
+++ b/Makefile
@@ -500,6 +500,7 @@ BUILTIN_OBJS += builtin-fetch.o
 BUILTIN_OBJS += builtin-fmt-merge-msg.o
 BUILTIN_OBJS += builtin-for-each-ref.o
 BUILTIN_OBJS += builtin-fsck.o
+BUILTIN_OBJS += builtin-fsync.o
 BUILTIN_OBJS += builtin-gc.o
 BUILTIN_OBJS += builtin-grep.o
 BUILTIN_OBJS += builtin-init-db.o
diff --git a/builtin-fsync.c b/builtin-fsync.c
new file mode 100644
index 0000000..8c4c7f7
--- /dev/null
+++ b/builtin-fsync.c
@@ -0,0 +1,31 @@
+/*
+ * Copyright (c) 2008 Frank Ch. Eigler
+ */
+#include "cache.h"
+#include "commit.h"
+#include "tar.h"
+#include "builtin.h"
+#include "quote.h"
+
+static const char fsync_usage[] =
+"git-fsync <filepattern>...\n";
+
+int cmd_fsync(int argc, const char **argv, const char *prefix)
+{
+	int i;
+	for (i=1; i<argc; i++) {
+		int fd = open (argv[i], O_RDONLY);
+		if (fd < 0) {
+			error("unable to open for fsync %s", argv[i]);
+			return 1;
+		} else {
+			int rc = fsync (fd);
+			if (rc < 0) {
+				error("unable to fsync %s", argv[i]);
+				return 1;
+			}
+			close (fd);
+		}
+	}
+	return 0;
+}
diff --git a/builtin-prune.c b/builtin-prune.c
index 25f9304..d79a6ff 100644
--- a/builtin-prune.c
+++ b/builtin-prune.c
@@ -155,8 +155,6 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 	}
 	mark_reachable_objects(&revs, 1);
 	prune_object_dir(get_object_directory());
-
-	sync();
 	prune_packed_objects(show_only);
 	remove_temporary_files();
 	return 0;
diff --git a/builtin.h b/builtin.h
index 95126fd..4cedc5d 100644
--- a/builtin.h
+++ b/builtin.h
@@ -41,6 +41,7 @@ extern int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix);
 extern int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
 extern int cmd_format_patch(int argc, const char **argv, const char *prefix);
 extern int cmd_fsck(int argc, const char **argv, const char *prefix);
+extern int cmd_fsync(int argc, const char **argv, const char *prefix);
 extern int cmd_gc(int argc, const char **argv, const char *prefix);
 extern int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 extern int cmd_grep(int argc, const char **argv, const char *prefix);
diff --git a/git-repack.sh b/git-repack.sh
index 501519a..650372c 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -125,12 +125,11 @@ then
 	# We know $existing are all redundant.
 	if [ -n "$existing" ]
 	then
-		sync
 		( cd "$PACKDIR" &&
 		  for e in $existing
 		  do
 			case " $fullbases " in
-			*" $e "*) ;;
+			*" $e "*) git-fsync "$e.pack" "$e.idx" ;;
 			*)	rm -f "$e.pack" "$e.idx" "$e.keep" ;;
 			esac
 		  done
diff --git a/git.c b/git.c
index 89b431f..21f3b7d 100644
--- a/git.c
+++ b/git.c
@@ -305,6 +305,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "format-patch", cmd_format_patch, RUN_SETUP },
 		{ "fsck", cmd_fsck, RUN_SETUP },
 		{ "fsck-objects", cmd_fsck, RUN_SETUP },
+		{ "fsync", cmd_fsync },
 		{ "gc", cmd_gc, RUN_SETUP },
 		{ "get-tar-commit-id", cmd_get_tar_commit_id },
 		{ "grep", cmd_grep, RUN_SETUP | USE_PAGER },


* Re: reducing prune sync()s
  2008-05-30 15:25   ` Frank Ch. Eigler
@ 2008-05-30 15:57     ` Linus Torvalds
  2008-05-30 16:08       ` [PATCH 1/2] Make pack creation always fsync() the result Linus Torvalds
  0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2008-05-30 15:57 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Linus Torvalds, Junio C Hamano, Git Mailing List



On Fri, 30 May 2008, Frank Ch. Eigler wrote:
> 
> On Thu, May 29, 2008 at 05:27:35PM -0700, Linus Torvalds wrote:
> > [...]
> > >	  Or perhaps having the blanket sync be replaced by a
> > > list of fsync()s for only the relevant git repository files?
> > [...]
> > Something like this *may* work. THIS IS TOTALLY UNTESTED. And when I say 
> > "TOTALLY UNTESTED", I mean it. Zero testing. None. Nada. Zilch. Testing is 
> > for people who are actually interested in the feature (hint, hint).
> 
> The patch does add an fsync or two into the mix, but a "git gc" or 
> "git repack -a" still goes through the "git-repack" shell script, which
> still does its "sync".

Yes.

But I actually think there is a simpler and more straightforward approach.

Instead of being careful when removing objects (whether old packs or loose 
objects that are made redundant by a new pack), the simpler approach is to 
just always fsync() the new pack when creating it.

I was always very careful to *not* make git depend on any serialized IO, 
but the reason for that was literally the fact that I wanted to make sure 
that I could batch up things efficiently, and do any serialization (if I 
wanted to) later. So it was literally always about the whole "apply 
several hundred patches in one go" kind of thing.

And the thing is, the repacking phase *is* the "serialize things later (if 
you want)" thing, so doing things synchronously at that point is actually 
perfectly fine.

And every single "let's remove objects" operation is literally always 
about the fact that we have a new better pack-file, making old objects 
redundant, so if we just create those new pack-files stably on disk, then 
any subsequent action pretty much by definition doesn't need any sync. 
Because we know that the only thing we can really care about *is* stable.

So this is a conceptually much more direct approach. Creating pack-files 
really is the special occasion, since it's (a) literally the event that 
causes other objects to potentially be stale (b) fairly rare and (c) not 
normally limited by disk-IO anyway (ie a "git fetch" will create a new 
pack-file, but it's normally limited by the network overhead or the cost 
of creating the pack-file, not by adding a fsync() to make sure that the 
end result is stable).

So I'll follow up with a two-patch series (the first to create pack-files 
and their indexes stably on disk, the second to just remove the now 
unnecessary 'sync()' calls). I'll give it *some* basic testing first, 
though.

		Linus


* [PATCH 1/2] Make pack creation always fsync() the result
  2008-05-30 15:57     ` Linus Torvalds
@ 2008-05-30 16:08       ` Linus Torvalds
  2008-05-30 16:11         ` [PATCH 2/2] Remove now unnecessary 'sync()' calls Linus Torvalds
                           ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Linus Torvalds @ 2008-05-30 16:08 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Linus Torvalds, Junio C Hamano, Git Mailing List



From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Fri, 30 May 2008 08:42:16 -0700

This means that we can depend on packs always being stable on disk,
simplifying a lot of the object serialization worries.  And unlike loose
objects, serializing pack creation IO isn't going to be a performance
killer.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

Ok, so this is pretty straightforward. I haven't given it a *lot* of 
testing, and while it can certainly still have bugs, it's not even trying 
to be "clever" like my previous attempt. 

I was always a bit leery about doing a 'fsync()' on a read-only file 
descriptor, and in general about doing an fsync() on a file that was 
created by something else. 

 builtin-pack-objects.c |    4 +++-
 cache.h                |    1 +
 csum-file.c            |    7 +++++--
 csum-file.h            |    6 +++++-
 fast-import.c          |    2 +-
 index-pack.c           |    1 +
 pack-write.c           |    2 +-
 write_or_die.c         |    7 +++++++
 8 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 70d2f5d..4c2e0cd 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -515,10 +515,12 @@ static void write_pack_file(void)
 		 * If so, rewrite it like in fast-import
 		 */
 		if (pack_to_stdout || nr_written == nr_remaining) {
-			sha1close(f, sha1, 1);
+			unsigned flags = pack_to_stdout ? CSUM_CLOSE : CSUM_FSYNC;
+			sha1close(f, sha1, flags);
 		} else {
 			int fd = sha1close(f, NULL, 0);
 			fixup_pack_header_footer(fd, sha1, pack_tmp_name, nr_written);
+			fsync_or_die(fd, pack_tmp_name);
 			close(fd);
 		}
 
diff --git a/cache.h b/cache.h
index eab1a17..092a997 100644
--- a/cache.h
+++ b/cache.h
@@ -761,6 +761,7 @@ extern ssize_t write_in_full(int fd, const void *buf, size_t count);
 extern void write_or_die(int fd, const void *buf, size_t count);
 extern int write_or_whine(int fd, const void *buf, size_t count, const char *msg);
 extern int write_or_whine_pipe(int fd, const void *buf, size_t count, const char *msg);
+extern void fsync_or_die(int fd, const char *);
 
 /* pager.c */
 extern void setup_pager(void);
diff --git a/csum-file.c b/csum-file.c
index 9728a99..ace64f1 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -32,21 +32,24 @@ static void sha1flush(struct sha1file *f, unsigned int count)
 	}
 }
 
-int sha1close(struct sha1file *f, unsigned char *result, int final)
+int sha1close(struct sha1file *f, unsigned char *result, unsigned int flags)
 {
 	int fd;
 	unsigned offset = f->offset;
+
 	if (offset) {
 		SHA1_Update(&f->ctx, f->buffer, offset);
 		sha1flush(f, offset);
 		f->offset = 0;
 	}
-	if (final) {
+	if (flags & (CSUM_CLOSE | CSUM_FSYNC)) {
 		/* write checksum and close fd */
 		SHA1_Final(f->buffer, &f->ctx);
 		if (result)
 			hashcpy(result, f->buffer);
 		sha1flush(f, 20);
+		if (flags & CSUM_FSYNC)
+			fsync_or_die(f->fd, f->name);
 		if (close(f->fd))
 			die("%s: sha1 file error on close (%s)",
 			    f->name, strerror(errno));
diff --git a/csum-file.h b/csum-file.h
index 1af7656..72c9487 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -16,9 +16,13 @@ struct sha1file {
 	unsigned char buffer[8192];
 };
 
+/* sha1close flags */
+#define CSUM_CLOSE	1
+#define CSUM_FSYNC	2
+
 extern struct sha1file *sha1fd(int fd, const char *name);
 extern struct sha1file *sha1fd_throughput(int fd, const char *name, struct progress *tp);
-extern int sha1close(struct sha1file *, unsigned char *, int);
+extern int sha1close(struct sha1file *, unsigned char *, unsigned int);
 extern int sha1write(struct sha1file *, void *, unsigned int);
 extern void crc32_begin(struct sha1file *);
 extern uint32_t crc32_end(struct sha1file *);
diff --git a/fast-import.c b/fast-import.c
index 93119bb..e72b286 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -890,7 +890,7 @@ static char *create_index(void)
 		SHA1_Update(&ctx, (*c)->sha1, 20);
 	}
 	sha1write(f, pack_data->sha1, sizeof(pack_data->sha1));
-	sha1close(f, NULL, 1);
+	sha1close(f, NULL, CSUM_FSYNC);
 	free(idx);
 	SHA1_Final(pack_data->sha1, &ctx);
 	return tmpfile;
diff --git a/index-pack.c b/index-pack.c
index aaba944..5ac91ba 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -694,6 +694,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
 	if (!from_stdin) {
 		close(input_fd);
 	} else {
+		fsync_or_die(output_fd, curr_pack_name);
 		err = close(output_fd);
 		if (err)
 			die("error while closing pack file: %s", strerror(errno));
diff --git a/pack-write.c b/pack-write.c
index c66c8af..f52cabe 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -139,7 +139,7 @@ char *write_idx_file(char *index_name, struct pack_idx_entry **objects,
 	}
 
 	sha1write(f, sha1, 20);
-	sha1close(f, NULL, 1);
+	sha1close(f, NULL, CSUM_FSYNC);
 	SHA1_Final(sha1, &ctx);
 	return index_name;
 }
diff --git a/write_or_die.c b/write_or_die.c
index 32f9914..630be4c 100644
--- a/write_or_die.c
+++ b/write_or_die.c
@@ -78,6 +78,13 @@ ssize_t write_in_full(int fd, const void *buf, size_t count)
 	return total;
 }
 
+void fsync_or_die(int fd, const char *msg)
+{
+	if (fsync(fd) < 0) {
+		die("%s: fsync error (%s)", msg, strerror(errno));
+	}
+}
+
 void write_or_die(int fd, const void *buf, size_t count)
 {
 	if (write_in_full(fd, buf, count) < 0) {
-- 
1.5.6.rc0.48.g5eea


* [PATCH 2/2] Remove now unnecessary 'sync()' calls
  2008-05-30 16:08       ` [PATCH 1/2] Make pack creation always fsync() the result Linus Torvalds
@ 2008-05-30 16:11         ` Linus Torvalds
  2008-05-30 20:27         ` [PATCH 1/2] Make pack creation always fsync() the result Nicolas Pitre
  2008-05-31 14:19         ` Frank Ch. Eigler
  2 siblings, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2008-05-30 16:11 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Linus Torvalds, Junio C Hamano, Git Mailing List


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Fri, 30 May 2008 08:54:46 -0700

Since the pack-files are now always created stably on disk, there is no
need to sync() before pruning loose objects or old stale pack-files.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

This literally just removes the syncs. The only things they wanted to 
protect were the pack-files, which are now created stably.

Yes, you can screw this up by doing direct filesystem operations on the 
pack-files (ie rsync/http walkers etc), but let's face it - those 
operations are pretty much fundamentally more problematic than anything we 
can do anyway, so I cannot bring myself to care.

Also, maybe I missed some case where we should fsync. I think this is all 
good, but having other people look at and think about this would be better 
still.

 builtin-prune-packed.c |    1 -
 builtin-prune.c        |    1 -
 git-repack.sh          |    1 -
 3 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/builtin-prune-packed.c b/builtin-prune-packed.c
index 23faf31..241afbb 100644
--- a/builtin-prune-packed.c
+++ b/builtin-prune-packed.c
@@ -85,7 +85,6 @@ int cmd_prune_packed(int argc, const char **argv, const char *prefix)
 		/* Handle arguments here .. */
 		usage(prune_packed_usage);
 	}
-	sync();
 	prune_packed_objects(opts);
 	return 0;
 }
diff --git a/builtin-prune.c b/builtin-prune.c
index 25f9304..bd3d2f6 100644
--- a/builtin-prune.c
+++ b/builtin-prune.c
@@ -156,7 +156,6 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 	mark_reachable_objects(&revs, 1);
 	prune_object_dir(get_object_directory());
 
-	sync();
 	prune_packed_objects(show_only);
 	remove_temporary_files();
 	return 0;
diff --git a/git-repack.sh b/git-repack.sh
index 10f735c..072d1b4 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -125,7 +125,6 @@ then
 	# We know $existing are all redundant.
 	if [ -n "$existing" ]
 	then
-		sync
 		( cd "$PACKDIR" &&
 		  for e in $existing
 		  do
-- 
1.5.6.rc0.48.g5eea


* Re: reducing prune sync()s
  2008-05-30  0:32   ` Linus Torvalds
  2008-05-30  1:50     ` Frank Ch. Eigler
@ 2008-05-30 20:07     ` Florian Weimer
  1 sibling, 0 replies; 15+ messages in thread
From: Florian Weimer @ 2008-05-30 20:07 UTC (permalink / raw)
  To: git

* Linus Torvalds:

> Side note: a lot of systems make "fsync()" pretty expensive too. It's one 
> of my main disagreements with most log-based filesystems - fsync() can in 
> theory be fast, but almost always implies flushing the whole log, even if 
> 99.9% of that log is totally unrelated to the actual file you want to 
> fsync().

And flushing the whole log might be less expensive than several partial
flushes with ordering constraints.  If Linux ever gets support for
partial log flushes, I suppose you could restore the previous
performance by using sync_file_range() with appropriate flags (to get
the data in flight to disk), followed by a second round of calls to
fsync() (to actually wait for I/O completion).

> So fsync() isn't always all that much better than sync().

sync() is potentially a no-op, particularly if some of the targeted
files are still open.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Make pack creation always fsync() the result
  2008-05-30 16:08       ` [PATCH 1/2] Make pack creation always fsync() the result Linus Torvalds
  2008-05-30 16:11         ` [PATCH 2/2] Remove now unnecessary 'sync()' calls Linus Torvalds
@ 2008-05-30 20:27         ` Nicolas Pitre
  2008-05-31 14:19         ` Frank Ch. Eigler
  2 siblings, 0 replies; 15+ messages in thread
From: Nicolas Pitre @ 2008-05-30 20:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Frank Ch. Eigler, Linus Torvalds, Junio C Hamano,
	Git Mailing List

On Fri, 30 May 2008, Linus Torvalds wrote:

> diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
> index 70d2f5d..4c2e0cd 100644
> --- a/builtin-pack-objects.c
> +++ b/builtin-pack-objects.c
> @@ -515,10 +515,12 @@ static void write_pack_file(void)
>  		 * If so, rewrite it like in fast-import
>  		 */
>  		if (pack_to_stdout || nr_written == nr_remaining) {
> -			sha1close(f, sha1, 1);
> +			unsigned flags = pack_to_stdout ? CSUM_CLOSE : CSUM_FSYNC;
> +			sha1close(f, sha1, flags);
>  		} else {

Micro nit:  wouldn't it look more obvious if it was written as:

	if (pack_to_stdout) {
		sha1close(f, sha1, CSUM_CLOSE);
	} else if (nr_written == nr_remaining) {
		sha1close(f, sha1, CSUM_FSYNC);
	} else {
		...

Otherwise looks sane to me.


Nicolas

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Make pack creation always fsync() the result
  2008-05-30 16:08       ` [PATCH 1/2] Make pack creation always fsync() the result Linus Torvalds
  2008-05-30 16:11         ` [PATCH 2/2] Remove now unnecessary 'sync()' calls Linus Torvalds
  2008-05-30 20:27         ` [PATCH 1/2] Make pack creation always fsync() the result Nicolas Pitre
@ 2008-05-31 14:19         ` Frank Ch. Eigler
  2008-06-02 22:23           ` Linus Torvalds
  2 siblings, 1 reply; 15+ messages in thread
From: Frank Ch. Eigler @ 2008-05-31 14:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

Hi -

On Fri, May 30, 2008 at 09:08:11AM -0700, Linus Torvalds wrote:

> [fsync on pack creation]
> This means that we can depend on packs always being stable on disk,
> simplifying a lot of the object serialization worries.  And unlike loose
> objects, serializing pack creation IO isn't going to be a performance
> killer. [...]

If you stabilize the outputs of the pack procedure rather than its
inputs, this makes me wonder if ordinary unpacked git objects would
also need some sort of fsync treatment.

- FChE

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Make pack creation always fsync() the result
  2008-05-31 14:19         ` Frank Ch. Eigler
@ 2008-06-02 22:23           ` Linus Torvalds
  0 siblings, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2008-06-02 22:23 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Junio C Hamano, Git Mailing List



On Sat, 31 May 2008, Frank Ch. Eigler wrote:
> 
> If you stabilize the outputs of the pack procedure rather than its
> inputs, this makes me wonder if ordinary unpacked git objects would
> also need some sort of fsync treatment.

No, see the earlier discussion about the difference between "old" and 
"new" objects.

Pack-files can contain old objects that were _previously_ stable, so we 
need to make sure that they are at least as stable as the objects they 
replace. In contrast, new loose objects never replace old data, so they 
can always be re-created by just re-doing the git operation.

		Linus

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2008-06-02 22:25 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-05-29 20:57 reducing prune sync()s Frank Ch. Eigler
2008-05-30  0:27 ` Linus Torvalds
2008-05-30  0:32   ` Linus Torvalds
2008-05-30  1:50     ` Frank Ch. Eigler
2008-05-30 20:07     ` Florian Weimer
2008-05-30  1:51   ` David Dillow
2008-05-30  2:17     ` Linus Torvalds
2008-05-30  2:30       ` Linus Torvalds
2008-05-30 15:25   ` Frank Ch. Eigler
2008-05-30 15:57     ` Linus Torvalds
2008-05-30 16:08       ` [PATCH 1/2] Make pack creation always fsync() the result Linus Torvalds
2008-05-30 16:11         ` [PATCH 2/2] Remove now unnecessary 'sync()' calls Linus Torvalds
2008-05-30 20:27         ` [PATCH 1/2] Make pack creation always fsync() the result Nicolas Pitre
2008-05-31 14:19         ` Frank Ch. Eigler
2008-06-02 22:23           ` Linus Torvalds
