* [RFC 1/1] packed-backend: ignore broken headers
2018-03-26 12:42 [RFC 0/1] Tolerate broken headers in `packed-refs` files Michael Haggerty
@ 2018-03-26 12:42 ` Michael Haggerty
2018-03-26 13:08 ` [RFC 0/1] Tolerate broken headers in `packed-refs` files Derrick Stolee
1 sibling, 0 replies; 5+ messages in thread
From: Michael Haggerty @ 2018-03-26 12:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git, Michael Haggerty
Prior to
9308b7f3ca read_packed_refs(): die if `packed-refs` contains bogus data, 2017-07-01
we silently ignored any lines in a `packed-refs` file that we didn't
understand. That policy was clearly wrong.
But at the time, unrecognized header lines were processed by the same
code as reference and peeled lines. This means that they were also
subject to the same liberal treatment. For example, any of the
following "header" lines would have been ignored:
* "# arbitrary data that looks like a comment"
* "# pack-refs with peeled fully-peeled" ← note: missing colon
* "# pack-refs"
Loosen up the parser to ignore any first line that begins with `#` but
doesn't start with the exact character sequence "# pack-refs with:".
(In fact, the old liberal policy meant that "comment" lines would have
been ignored anywhere in the file. But the file format isn't actually
documented to allow comments, and comments make little sense in this
file, so we won't go that far.)
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
I *don't* think we actually want to merge this patch. See the cover
letter for my reasoning.
refs/packed-backend.c | 21 +++++++++------------
t/t3210-pack-refs.sh | 33 ++++++++++++++++++++++++++++++++-
2 files changed, 41 insertions(+), 13 deletions(-)
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 65288c6472..f9d71bb60d 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -635,21 +635,18 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
- if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
- die_invalid_line(refs->path,
- snapshot->buf,
- snapshot->eof - snapshot->buf);
+ if (skip_prefix(tmp, "# pack-refs with:", (const char **)&p)) {
+ string_list_split_in_place(&traits, p, ' ', -1);
- string_list_split_in_place(&traits, p, ' ', -1);
+ if (unsorted_string_list_has_string(&traits, "fully-peeled"))
+ snapshot->peeled = PEELED_FULLY;
+ else if (unsorted_string_list_has_string(&traits, "peeled"))
+ snapshot->peeled = PEELED_TAGS;
- if (unsorted_string_list_has_string(&traits, "fully-peeled"))
- snapshot->peeled = PEELED_FULLY;
- else if (unsorted_string_list_has_string(&traits, "peeled"))
- snapshot->peeled = PEELED_TAGS;
+ sorted = unsorted_string_list_has_string(&traits, "sorted");
- sorted = unsorted_string_list_has_string(&traits, "sorted");
-
- /* perhaps other traits later as well */
+ /* perhaps other traits later as well */
+ }
/* The "+ 1" is for the LF character. */
snapshot->start = eol + 1;
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index afa27ffe2d..353ef3e655 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -20,7 +20,8 @@ test_expect_success \
'echo Hello > A &&
git update-index --add A &&
git commit -m "Initial commit." &&
- HEAD=$(git rev-parse --verify HEAD)'
+ HEAD=$(git rev-parse --verify HEAD) &&
+ git tag -m "A tag" annotated-tag'
SHA1=
@@ -221,6 +222,36 @@ test_expect_success 'reject packed-refs with a short SHA-1' '
test_cmp expected_err err
'
+test_expect_success 'handle packed-refs with a bogus header' '
+ git show-ref -d >expected &&
+ mv .git/packed-refs .git/packed-refs.bak &&
+ test_when_finished "mv .git/packed-refs.bak .git/packed-refs" &&
+ sed -e "s/^#.*/# whacked-refs/" <.git/packed-refs.bak >.git/packed-refs &&
+ git show-ref -d >actual 2>err &&
+ test_cmp expected actual &&
+ test_must_be_empty err
+'
+
+test_expect_success 'handle packed-refs with a truncated header' '
+ git show-ref -d >expected &&
+ mv .git/packed-refs .git/packed-refs.bak &&
+ test_when_finished "mv .git/packed-refs.bak .git/packed-refs" &&
+ sed -e "s/^#.*/# pack-refs/" <.git/packed-refs.bak >.git/packed-refs &&
+ git show-ref -d >actual 2>err &&
+ test_cmp expected actual &&
+ test_must_be_empty err
+'
+
+test_expect_success 'handle packed-refs with no traits in header' '
+ git show-ref -d >expected &&
+ mv .git/packed-refs .git/packed-refs.bak &&
+ test_when_finished "mv .git/packed-refs.bak .git/packed-refs" &&
+ sed -e "s/^#.*/# pack-refs with:/" <.git/packed-refs.bak >.git/packed-refs &&
+ git show-ref -d >actual 2>err &&
+ test_cmp expected actual &&
+ test_must_be_empty err
+'
+
test_expect_success 'timeout if packed-refs.lock exists' '
LOCK=.git/packed-refs.lock &&
>"$LOCK" &&
--
2.14.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC 0/1] Tolerate broken headers in `packed-refs` files
2018-03-26 12:42 [RFC 0/1] Tolerate broken headers in `packed-refs` files Michael Haggerty
2018-03-26 12:42 ` [RFC 1/1] packed-backend: ignore broken headers Michael Haggerty
@ 2018-03-26 13:08 ` Derrick Stolee
2018-03-26 14:33 ` Jeff King
2018-03-26 14:33 ` Edward Thomson
1 sibling, 2 replies; 5+ messages in thread
From: Derrick Stolee @ 2018-03-26 13:08 UTC (permalink / raw)
To: Michael Haggerty, Junio C Hamano; +Cc: Jeff King, git, ethomson
On 3/26/2018 8:42 AM, Michael Haggerty wrote:
> [...]
>
> But there might be some tools out in the wild that have been writing
> broken headers. In that case, users who upgrade Git might suddenly
> find that they can't read repositories that they could read before. In
> fact, a tool that we wrote and use internally at GitHub was doing
> exactly that, which is how we discovered this "problem".
>
> This patch shows what it would look like to relax the parsing again,
> albeit *only* for the first line of the file, and *only* for lines
> that start with '#'.
>
> The problem with this patch is that it would make it harder for people
> who implement broken tools in the future to discover their mistakes.
> The only result of the error would be that it is slower to work with
> the `packed-refs` files that they wrote. Such an error could go
> undiscovered for a long time.
My opinion is that we shouldn't maintain back-compat with formats that
may have been written by another tool because Git wasn't strict about
it. As long as Git never wrote files with these formats, then they
shouldn't be supported.
You are absolutely right that staying strict will help discover the
tools that are writing an incorrect format.
Since most heavily-used tools that didn't spawn Git processes use
LibGit2 to interact with Git repos, I added Ed Thomson to CC to see if
libgit2 could ever write these bad header comments.
Thanks for writing this RFC so we can have the discussion and more
quickly identify this issue if/when users are broken.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 5+ messages in thread