git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeff King <peff@peff.net>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"David Turner" <novalis@novalis.org>,
	git@vger.kernel.org
Subject: Re: [PATCH 18/20] commit_packed_refs(): use reference iteration
Date: Wed, 22 Mar 2017 09:42:58 +0100	[thread overview]
Message-ID: <b49b4ae0-bcef-ce1d-62d5-f76a11e84766@alum.mit.edu> (raw)
In-Reply-To: <20170320180532.vxzra6tpf3t7qjks@sigill.intra.peff.net>

On 03/20/2017 07:05 PM, Jeff King wrote:
> On Mon, Mar 20, 2017 at 05:33:23PM +0100, Michael Haggerty wrote:
> 
>> -/*
>> - * An each_ref_entry_fn that writes the entry to a packed-refs file.
>> - */
>> -static int write_packed_entry_fn(struct ref_entry *entry, void *cb_data)
>> -{
>> -	enum peel_status peel_status = peel_entry(entry, 0);
>> -
>> -	if (peel_status != PEEL_PEELED && peel_status != PEEL_NON_TAG)
>> -		error("internal error: %s is not a valid packed reference!",
>> -		      entry->name);
>> -	write_packed_entry(cb_data, entry->name, entry->u.value.oid.hash,
>> -			   peel_status == PEEL_PEELED ?
>> -			   entry->u.value.peeled.hash : NULL);
>> -	return 0;
>> -}
> 
> This assertion goes away. It can't be moved into write_packed_entry()
> because the peel status is only known in the caller.
> 
> But here:
> 
>> @@ -1376,8 +1362,18 @@ static int commit_packed_refs(struct files_ref_store *refs)
>>  		die_errno("unable to fdopen packed-refs descriptor");
>>  
>>  	fprintf_or_die(out, "%s", PACKED_REFS_HEADER);
>> -	do_for_each_entry_in_dir(get_packed_ref_dir(packed_ref_cache),
>> -				 write_packed_entry_fn, out);
>> +
>> +	iter = cache_ref_iterator_begin(packed_ref_cache->cache, NULL, 0);
>> +	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
>> +		struct object_id peeled;
>> +		int peel_error = ref_iterator_peel(iter, &peeled);
>> +
>> +		write_packed_entry(out, iter->refname, iter->oid->hash,
>> +				   peel_error ? NULL : peeled.hash);
>> +	}
> 
> Should we be checking that peel_error is only PEELED or NON_TAG?

This is a good question, and it took me a while to figure out the whole
answer. At first I deleted this check without much thought because it
was just an internal consistency check that should never trigger. But
actually the story is more complicated than that.

Tl;dr: the old code is slightly wrong and I think the new code is correct.

First the superficial answer: we can't `peel_error` in
`commit_packed_refs()` as you suggested, because `ref_iterator_peel()`
doesn't return an `enum peel_status`. It only returns 0 on OK /  nonzero
for problems. A legitimate reference should never have a status
`PEEL_INVALID`. That status is meant for stale packed refs that are
hidden by more recent loose refs, which *can* legitimately point at
objects that have since been GCed. Since `ref_iterator_peel()` is
someday meant to be an exposed part of the API, I didn't want it to give
out more information than pass/fail [1]. Also, the reason for a peel
failure is not necessarily known accurately (information is lost when a
reference is packed then the packed-refs file is read).

So, when could the old error message have been emitted? It is when there
is an entry in the packed-ref cache that is not `REF_KNOWS_PEELED`, and
when the peel is attempted, the result is `PEEL_INVALID`,
`PEEL_IS_SYMREF`, or `PEEL_BROKEN`. Since a packed ref cannot be a
symref, `PEEL_IS_SYMREF` and `PEEL_BROKEN` can be ruled out. So we're
left with `PEEL_INVALID`.

How could an entry get into the packed-refs cache without
`REF_KNOWS_PEELED`? One of the following:

* It was read from a `packed-refs` file that didn't have the
`fully-peeled` attribute. In that case, we *don't want* to emit an
error, because the broken value is presumably masked by a loose version
of the same reference (which we just don't happen to be packing this
time). The old code incorrectly emits the error message in this case. It
was probably never reported as a bug because this scenario is rare.

* It was a loose reference that was just added to the packed ref cache
by `files_packed_refs()` via `pack_if_possible_fn()` in preparation for
being packed. The latter function refuses to pack a reference for which
`entry_resolves_to_object()` returns false, and otherwise calls
`peel_entry()` itself and checks the return value. So a reference added
this way should always be `REF_KNOWS_PEELED`.

Therefore, I think it is a good thing to remove this check. I'll improve
the commit message to make this story clearer.

Michael

[1] We could change this policy, for example by only documenting the
pass/fail return value externally, while distinguishing between the
types of failure when iterating internal to the module.


  reply	other threads:[~2017-03-22  8:44 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-20 16:33 [PATCH 00/20] Separate `ref_cache` into a separate module Michael Haggerty
2017-03-20 16:33 ` [PATCH 01/20] get_ref_dir(): don't call read_loose_refs() for "refs/bisect" Michael Haggerty
2017-03-20 16:33 ` [PATCH 02/20] refs_read_raw_ref(): new function Michael Haggerty
2017-03-20 16:33 ` [PATCH 03/20] refs_ref_iterator_begin(): " Michael Haggerty
2017-03-20 16:33 ` [PATCH 04/20] refs_verify_refname_available(): implement once for all backends Michael Haggerty
2017-03-20 17:42   ` Jeff King
2017-03-20 22:20     ` Michael Haggerty
2017-03-20 16:33 ` [PATCH 05/20] refs_verify_refname_available(): use function in more places Michael Haggerty
2017-03-20 16:33 ` [PATCH 06/20] Rename `add_ref()` to `add_ref_entry()` Michael Haggerty
2017-03-20 16:33 ` [PATCH 07/20] Rename `find_ref()` to `find_ref_entry()` Michael Haggerty
2017-03-20 16:33 ` [PATCH 08/20] Rename `remove_entry()` to `remove_entry_from_dir()` Michael Haggerty
2017-03-20 16:33 ` [PATCH 09/20] refs: split `ref_cache` code into separate files Michael Haggerty
2017-03-20 17:49   ` Jeff King
2017-03-20 19:47     ` Junio C Hamano
2017-03-20 20:35       ` Stefan Beller
2017-03-20 22:40         ` Junio C Hamano
2017-03-20 16:33 ` [PATCH 10/20] ref-cache: introduce a new type, ref_cache Michael Haggerty
2017-03-20 16:33 ` [PATCH 11/20] refs: record the ref_store in ref_cache, not ref_dir Michael Haggerty
2017-03-20 17:51   ` Jeff King
2017-03-20 22:39     ` Michael Haggerty
2017-03-20 16:33 ` [PATCH 12/20] ref-cache: use a callback function to fill the cache Michael Haggerty
2017-03-20 16:33 ` [PATCH 13/20] refs: handle "refs/bisect/" in `loose_fill_ref_dir()` Michael Haggerty
2017-03-20 16:33 ` [PATCH 14/20] do_for_each_entry_in_dir(): eliminate `offset` argument Michael Haggerty
2017-03-20 16:33 ` [PATCH 15/20] get_loose_ref_dir(): function renamed from get_loose_refs() Michael Haggerty
2017-03-20 16:33 ` [PATCH 16/20] get_loose_ref_cache(): new function Michael Haggerty
2017-03-20 16:33 ` [PATCH 17/20] cache_ref_iterator_begin(): make function smarter Michael Haggerty
2017-03-20 16:33 ` [PATCH 18/20] commit_packed_refs(): use reference iteration Michael Haggerty
2017-03-20 18:05   ` Jeff King
2017-03-22  8:42     ` Michael Haggerty [this message]
2017-03-22 13:06       ` Jeff King
2017-03-20 16:33 ` [PATCH 19/20] files_pack_refs(): " Michael Haggerty
2017-03-20 16:33 ` [PATCH 20/20] do_for_each_entry_in_dir(): delete function Michael Haggerty
2017-03-20 17:25 ` [PATCH 00/20] Separate `ref_cache` into a separate module Junio C Hamano
2017-03-20 18:12 ` Jeff King
2017-03-20 18:24 ` Ævar Arnfjörð Bjarmason
2017-03-20 18:30   ` Jeff King
2017-03-20 22:32 ` Junio C Hamano
2017-03-20 22:48   ` Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b49b4ae0-bcef-ce1d-62d5-f76a11e84766@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=novalis@novalis.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).