On 2020-07-23 at 05:31:15, Eric Sunshine wrote:
> Rather than having to worry about updating this each time the format
> changes, perhaps just say:
> 
>     A Git bundle consists of several parts.

Seems prudent.  The reader probably doesn't care exactly how many parts
there are, and if they do, they can count for themselves.

> I wonder if it would be simpler and cleaner to do something like this instead:
> 
>     cat >expect <<\-EOF &&
>     # v3 git bundle
>     @object-format=sha256
>     -[OID] message
>     [OID] refs/head/master
> 
>     EOF
>     head -n 5 "$D"/bundle1 | sanitize_oid >actual &&
>     test_cmp expect actual
> 
> where sanitize_oid() is a function which converts a hex OID into the
> literal string "[OID]" (or whatever). I believe I've seen you employ
> such sanitization functions already in these series in cases when you
> want to verify that an OID is present in some output but don't care
> about the actual value.
> 
> Or perhaps this approach is overkill?
> 
> Reading further in this patch, I see that you actually do employ this
> technique in a new test you add to t5607, though you don't bother with
> OID sanitation in that test.

Sure, I can do that.

> I worry about passing binary bundle data through 'sed' like this.
> Historically, some 'sed' implementations would drop the last part of a
> file if it didn't end with a newline. Even today, not all 'sed'
> implementations necessarily pass the binary data through unmolested.
> For instance, on Mac OS, 'sed' adds a newline at the end of the file
> if the binary bundle blob didn't end with a newline. Perhaps a more
> reliable approach would be to use Perl to read the entire bundle in as
> a blob, use s/// to munge the @object-format= line into the @unknown=
> line, and write it out.

Yeah, I was slightly worried about that, but as you mentioned in the
followup email, it's probably fine to just parse the header and verify
we reject that.
-- 
brian m. carlson: Houston, Texas, US