git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] refs: fix corruption by not correctly syncing packed-refs to disk
@ 2022-12-20 14:52 Patrick Steinhardt
  2022-12-20 15:44 ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: Patrick Steinhardt @ 2022-12-20 14:52 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1706 bytes --]

At GitLab we have recently received a report where a repository was left
with a corrupted `packed-refs` file after the node hard-crashed even
though `core.fsync=reference` was set. This is something that in theory
should not happen if we correctly did the atomic-rename dance to:

    1. Write the data into a temporary file.

    2. Synchronize the temporary file to disk.

    3. Rename the temporary file into place.

So if we crash in the middle of writing the `packed-refs` file we should
only ever see either the old or the new state of the file.

And while we do the dance when writing the `packed-refs` file, there is
indeed one gotcha: we use a `FILE *` stream to write the temporary file,
but don't flush it before synchronizing it to disk. As a consequence any
data that is still buffered will not get synchronized and a crash of the
machine may cause corruption.

Fix this bug by flushing the file stream before we fsync.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/packed-backend.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index c1c71d183e..6f5a0709fb 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1263,7 +1263,8 @@ static int write_with_updates(struct packed_ref_store *refs,
 		goto error;
 	}
 
-	if (fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
+	if (fflush(out) ||
+	    fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
 	    close_tempfile_gently(refs->tempfile)) {
 		strbuf_addf(err, "error closing file %s: %s",
 			    get_tempfile_path(refs->tempfile),
-- 
2.39.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] refs: fix corruption by not correctly syncing packed-refs to disk
  2022-12-20 14:52 [PATCH] refs: fix corruption by not correctly syncing packed-refs to disk Patrick Steinhardt
@ 2022-12-20 15:44 ` Jeff King
  2022-12-25 11:25   ` Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2022-12-20 15:44 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

On Tue, Dec 20, 2022 at 03:52:14PM +0100, Patrick Steinhardt wrote:

> And while we do the dance when writing the `packed-refs` file, there is
> indeed one gotcha: we use a `FILE *` stream to write the temporary file,
> but don't flush it before synchronizing it to disk. As a consequence any
> data that is still buffered will not get synchronized and a crash of the
> machine may cause corruption.

The problem description makes sense, and so does your fix.

Grepping for other uses of fsync_component(), this looks like the only
buggy case (loose refs use write() directly, and most other files go via
finalize_hashfile(), which does likewise).

> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index c1c71d183e..6f5a0709fb 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1263,7 +1263,8 @@ static int write_with_updates(struct packed_ref_store *refs,
>  		goto error;
>  	}
>  
> -	if (fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
> +	if (fflush(out) ||
> +	    fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) ||
>  	    close_tempfile_gently(refs->tempfile)) {

It kind of feels like this ought to be part of fsync_component() or
close_tempfile_gently(), but it would pollute those interfaces:

  - fsync_component() doesn't otherwise know about stdio

  - close_tempfile_gently() doesn't otherwise know about syncing (and it
    would have to learn about fsync_components to do it right).

So given that this is the only affected site, it makes sense to just fix
it for now and worry about a more generalized solution if we run into it
again.

-Peff

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] refs: fix corruption by not correctly syncing packed-refs to disk
  2022-12-20 15:44 ` Jeff King
@ 2022-12-25 11:25   ` Junio C Hamano
  0 siblings, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2022-12-25 11:25 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Jeff King, git

Jeff King <peff@peff.net> writes:

> On Tue, Dec 20, 2022 at 03:52:14PM +0100, Patrick Steinhardt wrote:
>
>> And while we do the dance when writing the `packed-refs` file, there is
>> indeed one gotcha: we use a `FILE *` stream to write the temporary file,
>> but don't flush it before synchronizing it to disk. As a consequence any
>> data that is still buffered will not get synchronized and a crash of the
>> machine may cause corruption.
>
> The problem description makes sense, and so does your fix.
>
> Grepping for other uses of fsync_component(), this looks like the only
> buggy case (loose refs use write() directly, and most other files go via
> finalize_hashfile(), which does likewise).
> ...
> So given that this is the only affected site, it makes sense to just fix
> it for now and worry about a more generalized solution if we run into it
> again.

Sounds good.

This came from bc22d845 (core.fsync: new option to harden
references, 2022-03-11), before which we did not even fsync() the
file, so let me apply directly on top of that commit.  Those who are
stuck on older versions of Git can choose to merge the result, even
though I may probably not bother merging it down to anything older
than 2.39 maintenance track.

Thanks.



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-12-25 11:26 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-20 14:52 [PATCH] refs: fix corruption by not correctly syncing packed-refs to disk Patrick Steinhardt
2022-12-20 15:44 ` Jeff King
2022-12-25 11:25   ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).