On 2020-07-23 at 05:31:15, Eric Sunshine wrote: > Rather than having to worry about updating this each time the format > changes, perhaps just say: > > A Git bundle consists of several parts. Seems prudent. The reader probably doesn't care exactly how many parts there are, and if they do, they can count for themselves. > I wonder if it would be simpler and cleaner to do something like this instead: > > cat >expect <<\-EOF && > # v3 git bundle > @object-format=sha256 > -[OID] message > [OID] refs/head/master > > EOF > head -n 5 "$D"/bundle1 | sanitize_oid >actual && > test_cmp expect actual > > where sanitize_oid() is a function which converts a hex OID into the > literal string "[OID]" (or whatever). I believe I've seen you employ > such sanitization functions already in these series in cases when you > want to verify that an OID is present in some output but don't care > about the actual value. > > Or perhaps this approach is overkill? > > Reading further in this patch, I see that you actually do employ this > technique in a new test you add to t5607, though you don't bother with > OID sanitation in that test. Sure, I can do that. > I worry about passing binary bundle data through 'sed' like this. > Historically, some 'sed' implementations would drop the last part of a > file if it didn't end with a newline. Even today, not all 'sed' > implementations necessarily pass the binary data through unmolested. > For instance, on Mac OS, 'sed' adds a newline at the end of the file > if the binary bundle blob didn't end with a newline. Perhaps a more > reliable approach would be to use Perl to read the entire bundle in as > a blob, use s/// to munge the @object-format= line into the @unknown= > line, and write it out. Yeah, I was slightly worried about that, but as you mentioned in the followup email, it's probably fine to just parse the header and verify we reject that. -- brian m. carlson: Houston, Texas, US