git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Christoph Hellwig <hch@lst.de>,
	git@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] enable core.fsyncObjectFiles by default
Date: Wed, 17 Jan 2018 22:44:25 +0100
Message-ID: <87h8rki2iu.fsf@evledraar.gmail.com>
In-Reply-To: <xmqqd128s3wf.fsf@gitster.mtv.corp.google.com>


On Wed, Jan 17 2018, Junio C. Hamano jotted:

> Christoph Hellwig <hch@lst.de> writes:
>
>> fsync is required for data integrity as there is no guarantee that
>> data makes it to disk at any specified time without it.  Even for
>> ext3 with data=ordered mode the file system will only commit all
>> data at some point in time that is not guaranteed.
>
> It comes from this one:
>
> commit aafe9fbaf4f1d1f27a6f6e3eb3e246fff81240ef
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Wed Jun 18 15:18:44 2008 -0700
>
>     Add config option to enable 'fsync()' of object files
>
>     As explained in the documentation[*] this is totally useless on
>     filesystems that do ordered/journalled data writes, but it can be a
>     useful safety feature on filesystems like HFS+ that only journal the
>     metadata, not the actual file contents.
>
>     It defaults to off, although we could presumably in theory some day
>     auto-enable it on a per-filesystem basis.
>
>     [*] Yes, I updated the docs for the thing.  Hell really _has_ frozen
>         over, and the four horsemen are probably just beyond the horizon.
>         EVERYBODY PANIC!
>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index 0e25b2c92..9a1cec5c8 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -866,10 +866,8 @@ core.whitespace::
>>  core.fsyncObjectFiles::
>>  	This boolean will enable 'fsync()' when writing object files.
>>  +
>> -This is a total waste of time and effort on a filesystem that orders
>> -data writes properly, but can be useful for filesystems that do not use
>> -journalling (traditional UNIX filesystems) or that only journal metadata
>> -and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").
>> +This option is enabled by default and ensures actual data integrity
>> +by calling fsync after writing object files.
>
> I am somewhat sympathetic to the desire to flip the default to
> "safe" and allow those who know they are already safe to tweak the
> knob for performance, and it also makes sense to document that the
> default is "true" here.  But I do not see the point of removing the
> four lines from this paragraph; the sole effect of the removal is to
> rob information from readers that they can use to decide if they
> want to disable the configuration, no?

[CC'd the author of the current behavior]

Some points/questions:

 a) Is there some reliable way to test whether this is needed from
    userspace? I'm thinking something like `git update-index
    --test-untracked-cache` but for fsync().

 b) On the filesystems that don't need this, what's the performance
    impact?

    I ran a small test myself on CentOS 7 (kernel 3.10) with ext4
    data=ordered, using the perf tests I thought might do a lot of loose
    object writes:

      $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_OPTS="NO_OPENSSL=Y CFLAGS=-O3 -j56" ./run origin/master fsync-on~ fsync-on p3400-rebase.sh p0007-write-cache.sh
      [...]
      Test                                                            fsync-on~         fsync-on
      -------------------------------------------------------------------------------------------------------
      3400.2: rebase on top of a lot of unrelated changes             1.45(1.30+0.17)   1.45(1.28+0.20) +0.0%
      3400.4: rebase a lot of unrelated changes without split-index   4.34(3.71+0.66)   4.33(3.69+0.66) -0.2%
      3400.6: rebase a lot of unrelated changes with split-index      3.38(2.94+0.47)   3.38(2.93+0.47) +0.0%
      0007.2: write_locked_index 3 times (3214 files)                 0.01(0.00+0.00)   0.01(0.00+0.00) +0.0%

    No impact. However, I also did my own test of running the test suite
    10 times with and without this patch, and it runs about 9% slower:

      fsync-off: avg:21.59 21.50 21.50 21.52 21.53 21.54 21.57 21.59 21.61 21.63 21.95
       fsync-on: avg:23.43 23.21 23.25 23.26 23.26 23.27 23.32 23.49 23.51 23.83 23.88

    Test script at the end of this e-mail.

 c) What sort of guarantees in this regard do NFS-mounted filesystems
    commonly make?
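
As a rough cross-check of the figure in (b), the relative slowdown can be
recomputed from the two reported averages. This is only a sketch:
`slowdown_pct` is a hypothetical helper name, not part of git or its perf
framework, and the inputs are the rounded averages quoted above (21.59s and
23.43s).

```shell
# Hypothetical helper: percentage by which NEW is slower than BASE.
# Uses awk for the floating-point arithmetic, since POSIX sh has none.
slowdown_pct() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# fsync-off average vs. fsync-on average from the runs above.
slowdown_pct 21.59 23.43
```

With those averages this prints 8.5, so the "9% slower" above is a
rounded-up figure.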

Test script:

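use v5.10.0;
use strict;
use warnings;
use Time::HiRes qw(time);
use List::Util qw(sum);

# Pass the refs to compare as arguments. For each ref: check it out,
# build git, then time 10 full runs of the test suite.
my %time;
for my $ref (@ARGV) {
    # Abort if checkout or build fails, so we never time the wrong tree.
    system("git checkout $ref") == 0
        or die "checkout of $ref failed\n";
    system(qq[make -j56 CFLAGS="-O3 -g" NO_OPENSSL=Y all]) == 0
        or die "build of $ref failed\n";
    for (1..10) {
        my $t0 = time();
        # Test failures are not fatal here; only the wall-clock time matters.
        system "(cd t && NO_SVN_TESTS=1 GIT_TEST_HTTPD=0 prove -j56 --state=slow,save t[0-9]*.sh)";
        push @{$time{$ref}}, time() - $t0;
    }
}
# Report the mean, followed by the individual timings in ascending order.
for my $ref (sort keys %time) {
    my @runs = sort { $a <=> $b } @{$time{$ref}};
    printf "%20s: avg:%.2f %s\n",
        $ref,
        sum(@runs) / @runs,
        join(" ", map { sprintf "%.02f", $_ } @runs);
}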
