Hi Phillip, On Thu, 17 Nov 2022, Phillip Wood wrote: > On 16/11/2022 14:40, Phillip Wood wrote: > > > +        } else if (state == MBOX_IN_DIFF) { > > > +            switch (line[0]) { > > > +            case '\0': > > > +                continue; /* ignore empty lines after diff */ > > > +            case '+': > > > +            case '-': > > > +            case ' ': > > > +                if (!old_count && !new_count) > > > +                    break; > > > > This shouldn't happen in a well formed diff. Below we happily accept bad > > counts, is there a reason to reject them here? > > I think this might be picking up the "--" at the end of the patch as we don't > want to break here at the end of a hunk. If so then a comment would be > helpful. Agreed. And yes, it is picking up the "-- " line at the end of the patch. > > > +                if (old_count && line[0] != '+') > > > +                    old_count--; > > > +                if (new_count && line[0] != '-') > > > +                    new_count--; > > > > The diff is malformed if old_count == 0 and we see '-' or ' ' or new_count > > == 0 and we see '+' or ' '. The code is careful not to decrement the count > > in that case so I think it is harmless to accept diffs with bad line counts > > in the hunk header. I might be overly cautious here, but as you mentioned elsewhere, it is really bad if a `size_t` is decremented below 0, and `new_count`/`old_count` are of that type. > > > +                /* fallthrough */ > > > +            case '\\': > > > +                strbuf_addstr(&buf, line); > > > +                strbuf_addch(&buf, '\n'); > > > +                util->diffsize++; > > > > I think this might be a better place to break if old_count and new_count are > > both zero. > > It would be the right place to break at the end of each hunk, but I don't > think we want to do that. It would not even be the right place to break here then: think of the `\ No newline at end of file` lines: they come after the preceding line decremented `old_count`/`new_count`, yet we still want them to be part of the diff. > > > > +                continue; > > > +            case '@': > > > +                if (parse_hunk_header(line, &old_count, > > > +                              &new_count, &p)) > > > +                    break; > > > + > > > +                strbuf_addstr(&buf, "@@"); > > > +                if (current_filename && *p) > > > +                    strbuf_addf(&buf, " %s:", > > > +                            current_filename); > > > +                strbuf_addstr(&buf, p); > > > +                strbuf_addch(&buf, '\n'); > > > +                util->diffsize++; > > > +                continue; > > > +            } > > > > This is effectively the `default:` clause as it is executed when we don't > > handle the line above. We ignore the contents of this line which makes me > > wonder what happens if it is the start of another diff. > > We'll pick that up earlier with "if (starts_with(line, "diff --git"))" > > We only get here at the end of a patch (assuming it has the "--" line from > format-patch) We also get here in case of garbage in the middle of a diff ;-) Thank you for setting a fantastic example how to review code in a constructive, helpful manner! Dscho