git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Elijah Newren <newren@gmail.com>
Cc: Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH v3] merge-ort: fix segmentation fault in read-only repositories
Date: Fri, 23 Sep 2022 11:55:27 +0200 (CEST)	[thread overview]
Message-ID: <916o55op-qpqo-5o41-931s-8q54p7301sr2@tzk.qr> (raw)
In-Reply-To: <CABPp-BGizu+dq+8Hbqo5EjKFAWU_Ni9RwQ6p+4oUMLp0oamu7w@mail.gmail.com>

Hi Elijah,

On Thu, 22 Sep 2022, Elijah Newren wrote:

> On Thu, Sep 22, 2022 at 12:46 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > diff --git a/merge-ort.c b/merge-ort.c
> > index 99dcee2db8a..f654296220e 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -3571,15 +3571,15 @@ static int tree_entry_order(const void *a_, const void *b_)
> >                                  b->string, strlen(b->string), bmi->result.mode);
> >  }
> >
> > -static void write_tree(struct object_id *result_oid,
> > -                      struct string_list *versions,
> > -                      unsigned int offset,
> > -                      size_t hash_size)
> > +static int write_tree(struct object_id *result_oid,
> > +                     struct string_list *versions,
> > +                     unsigned int offset,
> > +                     size_t hash_size)
> >  {
> >         size_t maxlen = 0, extra;
> >         unsigned int nr;
> >         struct strbuf buf = STRBUF_INIT;
> > -       int i;
> > +       int i, ret;
> >
> >         assert(offset <= versions->nr);
> >         nr = versions->nr - offset;
> > @@ -3605,8 +3605,9 @@ static void write_tree(struct object_id *result_oid,
> >         }
> >
> >         /* Write this object file out, and record in result_oid */
> > -       write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
> > +       ret = write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid);
>
> This sent me into a bit of a hunt to try to figure out what the return
> value of write_object_file() is.  I know it's non-zero on failure, but
> don't know if it's always positive or always negative or possibly a
> mix and how that maps to the idea of "clean_merge".  I gave up after a
> few indirections, to be honest.
>
> Anyway, regardless of my inability to find the answer to the above
> question, would this be a bit easier to read if we initialized ret to
> 0 and then did something like
>
>     if (write_object_file(buf.buf, buf.len, OBJ_TREE, result_oid))
>         ret = -1;
>
> ?

You're right, this is a better pattern to follow because it makes it
easier to wrap your head around.

> > [...]
> > @@ -3769,8 +3770,9 @@ static void write_completed_directory(struct merge_options *opt,
> >                  */
> >                 dir_info->is_null = 0;
> >                 dir_info->result.mode = S_IFDIR;
> > -               write_tree(&dir_info->result.oid, &info->versions, offset,
> > -                          opt->repo->hash_algo->rawsz);
> > +               if (write_tree(&dir_info->result.oid, &info->versions, offset,
> > +                              opt->repo->hash_algo->rawsz) < 0)
> > +                       ret = -1;
>
> This makes me wonder about write_tree() again.  What if its call to
> write_object_file() returns a value greater than 0?  Is that possible?

No, `write_object_file()` returns either 0 or -1, always. I followed all
the code paths.

Here is a tangent (feel free to skip over this lengthy analysis):

These code paths include a _very, very_ misleading one. In
`write_loose_object()` (which is called via the call chain
`write_object_file()` -> `write_object_file_flags()` ->
`write_object_file_flags()`), we call `finalize_object_file()`, see
https://github.com/git/git/blob/v2.37.3/object-file.c#L2030). And in that
`finalize_object_file()` function, we set `ret = errno;` in two places
(https://github.com/git/git/blob/v2.37.3/object-file.c#L1833 and
https://github.com/git/git/blob/v2.37.3/object-file.c#L1850).

When I read that, I chased down what the conventions are for `errno`
values. Now, while
https://pubs.opengroup.org/onlinepubs/007908799/xsh/errno.h.html does not
say anything about `errno` values being non-negative,
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html
very much says:

	Values for errno are now required to be distinct positive values
	rather than non-zero values. This change is for alignment with the
	ISO/IEC 9899:1999 standard.

Terrible, right? If we set `ret = errno`, that means that we return
something positive!

I then even dug up the origins of the `ret = errno` statements. Turns out
that yours truly introduced the first of these `ret = errno` statements in
9e48b389990 (Work around missing hard links on FAT formatted media,
2005-10-26). :facepalm:

So why did I do such a thing? Well, back then, the `ret` value _already_
contained an `errno`, thanks to calling `link_temp_to_file()`, which _did_
return an `errno`. In fact, there was no code path in
`move_temp_to_file()` (as the function was called back then) that would
return -1, it always returned either 0 or an `errno`.

And after coming so far, after one hour of analyzing, I finally, finally
realized that there is no `return ret;` in that function. That function
has a `ret` variable that is never returned _from_ that function, but it
used to be returned _to_ that function, by the `link_temp_to_file()`
function that does not even exist any longer.

In my defense, this terribly misleading code was not even introduced by
myself. It was introduced in 230f13225df (Create object subdirectories on
demand, 2005-10-08).

tl;dr all is good, `finalize_object_file()` returns 0 upon success, -1
upon failure.

> If it is, are those error cases or not?  About half the callers in the
> code base that check the return value of write_object_file() check for a
> non-zero value, the other half check for a value less than 0.

And then there is `apply.c` where the return value isn't even checked at
all most of the time.

> And I can't find any documentation.  And it seemed like too much time
> for me to figure out what its range of return values was.
>
> [...]
>
> Nice to see that it didn't take too much work to propagate the -1 value
>from write_tree() up the hierarchy.  Nice work!

Thank you!

> I think we've still got some separate problems related to propagating
> the return value of the other write_object_file() call, from
> handle_content_merge().  The direct call in that other case is okay, but
> the higher levels at process_renames() and process_entry() seem to
> fumble it.  Your fix for failed tree writes is still valid without
> fixing the failed blob writes, and I'm happy to tackle the other half if
> you'd prefer to hand it off.  But, if you'd like to tackle that half
> too, you might see fewer error messages when it fails to write to the
> read-only repository, since it'll fail early on the first blob write
> instead of only failing when it gets around to trying to write a new
> tree.

It'll be part of v4 ;-)

Ciao,
Dscho

  reply	other threads:[~2022-09-23  9:55 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-21 15:30 [PATCH] merge-tree: fix segmentation fault in read-only repositories Johannes Schindelin via GitGitGadget
2022-09-21 15:42 ` Taylor Blau
2022-09-21 20:12   ` Johannes Schindelin
2022-09-21 18:12 ` Junio C Hamano
2022-09-21 20:13   ` Johannes Schindelin
2022-09-21 22:08 ` [PATCH v2] " Johannes Schindelin via GitGitGadget
2022-09-21 22:24   ` Junio C Hamano
2022-09-22 19:52     ` Johannes Schindelin
2022-09-21 22:56   ` Elijah Newren
2022-09-21 23:11     ` Junio C Hamano
2022-09-22 17:24     ` Johannes Schindelin
2022-09-22 19:50     ` Johannes Schindelin
2022-09-23  1:47       ` Elijah Newren
2022-09-22 19:46   ` [PATCH v3] merge-ort: " Johannes Schindelin via GitGitGadget
2022-09-23  1:38     ` Elijah Newren
2022-09-23  9:55       ` Johannes Schindelin [this message]
2022-09-26 21:55     ` [PATCH v4 0/2] merge-tree: " Johannes Schindelin via GitGitGadget
2022-09-26 21:55       ` [PATCH v4 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
2022-09-26 21:55       ` [PATCH v4 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
2022-09-26 22:07         ` Junio C Hamano
2022-09-27  8:05         ` Elijah Newren
2022-09-27 16:45           ` Junio C Hamano
2022-09-27  8:11       ` [PATCH v4 0/2] merge-tree: fix segmentation fault in read-only repositories Elijah Newren
2022-09-28  7:29       ` [PATCH v5 " Johannes Schindelin via GitGitGadget
2022-09-28  7:29         ` [PATCH v5 1/2] merge-ort: " Johannes Schindelin via GitGitGadget
2022-09-28  7:29         ` [PATCH v5 2/2] merge-ort: return early when failing to write a blob Johannes Schindelin via GitGitGadget
2022-09-28 15:53           ` Junio C Hamano
2022-09-29  1:36             ` Elijah Newren
2022-09-29  1:49               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=916o55op-qpqo-5o41-931s-8q54p7301sr2@tzk.qr \
    --to=johannes.schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).