From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, dstolee@microsoft.com, gitster@pobox.com,
jrnieder@gmail.com
Subject: Re: [PATCH v2 3/8] builtin/index-pack.c: write reverse indexes
Date: Mon, 25 Jan 2021 15:03:07 -0500 [thread overview]
Message-ID: <YA8j+3nq8nI/hcIQ@nand.local> (raw)
In-Reply-To: <YAtlZ49mTfTGg11/@coredump.intra.peff.net>
On Fri, Jan 22, 2021 at 06:53:11PM -0500, Jeff King wrote:
> On Wed, Jan 13, 2021 at 05:28:15PM -0500, Taylor Blau wrote:
>
> > OPTIONS
> > @@ -33,7 +34,14 @@ OPTIONS
> > file is constructed from the name of packed archive
> > file by replacing .pack with .idx (and the program
> > fails if the name of packed archive does not end
> > - with .pack).
> > + with .pack). Incompatible with `--rev-index`.
>
> I wondered which option was incompatible, but couldn't see from the
> context. It is "index-pack -o", which kind of makes sense. We can derive
> "foo.rev" from "foo.idx", but normally "-o" does not do any deriving.
>
> So I was all set to say "OK, we can live without it", but...
>
> > @@ -1824,7 +1851,16 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
> > if (from_stdin && hash_algo)
> > die(_("--object-format cannot be used with --stdin"));
> > if (!index_name && pack_name)
> > - index_name = derive_filename(pack_name, "idx", &index_name_buf);
> > + index_name = derive_filename(pack_name, ".pack", "idx", &index_name_buf);
> > +
> > + opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY);
> > + if (rev_index) {
> > + opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV;
> > + if (index_name)
> > + rev_index_name = derive_filename(index_name,
> > + ".idx", "rev",
> > + &rev_index_name_buf);
> > + }
>
> ...here we do end up deriving ".rev" from ".idx" anyway. So I guess we
> probably could support "-o". I also wonder what happens with "git
> index-pack -o foo.idx" when pack.writeReverseIndex is set. It looks like
> it would just work because of this block. But then shouldn't
> "--rev-index" work, too? And indeed, there is a test for that at the end
> of the patch! So is the documentation just wrong?
Hah! The documentation is just plain wrong. It's been a while, but I
have a vague recollection of writing this documentation before changing
the implementation of index-pack to allow this. Clearly, I forgot to go
back to update the broken documentation.
Hilariously, there is even a test in t5325 that demonstrates this
working! 'git index-pack --rev-index -o other.idx' writes both
'other.idx' and 'other.rev'. That was easy :-).
> I admit to finding the use of opts.flags versus the rev_index option a
> bit confusing. It seems like they are doing roughly the same thing, but
> influenced by different sources. It seems like we should be able to have
> a single local variable (then that goes on to set opts.flags for any
> sub-functions we call). Or maybe two, if we need to distinguish config
> versus command-line, but then they should have clear names
> (rev_index_config and rev_index_cmdline or something).
Yeah, I know. It's because we already pass a pointer to a struct
pack_idx_option to git_index_pack_config(), so in effect the 'flags' on
that struct *is* rev_index_config.
It's a little ugly, I agree, but I'm skeptical that the effort to clean
it up is worth it, mostly because the pack_idx_option struct probably
shouldn't be part of the index-pack builtin in the first place.
> As an aside, looking at derive_filename(), it seems a bit weird that one
> argument has a dot in the suffix and the other does not. I guess you are
> following the convention from write_special_file(), which omits it in
> the newly-added suffix. But it is slightly awkward to omit it for the
> old suffix in derive_filename(), because we want to strip_suffix() it
> all at once.
> Probably not that big a deal, but if anybody feels strongly, then
> derive_filename() could do:
>
> if (!strip_suffix(pack_name, strip, &len) ||
> !len || pack_name[len] != '.')
> die("does not end in .%s", strip);
That does make the callers look nicer, but it needs an extra two things:
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ef2874a8e6..c758f3b8e9 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1441,11 +1441,10 @@ static const char *derive_filename(const char *pack_name, const char *strip,
{
size_t len;
if (!strip_suffix(pack_name, strip, &len) || !len ||
- pack_name[len] != '.')
+ pack_name[len - 1] != '.')
die(_("packfile name '%s' does not end with '.%s'"),
pack_name, strip);
strbuf_add(buf, pack_name, len);
- strbuf_addch(buf, '.');
strbuf_addstr(buf, suffix);
return buf->buf;
}
And then it does what you are looking for. I'll pull that change out
with your Suggested-by as a preparatory commit right before this one.
> > @@ -1578,6 +1591,12 @@ static int git_index_pack_config(const char *k, const char *v, void *cb)
> > }
> > return 0;
> > }
> > + if (!strcmp(k, "pack.writereverseindex")) {
> > + if (git_config_bool(k, v))
> > + opts->flags |= WRITE_REV;
> > + else
> > + opts->flags &= ~WRITE_REV;
> > + }
> > return git_default_config(k, v, cb);
> > }
>
> IMHO we'll eventually want to turn this feature on by default. In which
> case we'll have to update every caller which is checking the config
> manually. Should we hide this in a function that looks up the config,
> and sets the default? Or alternatively, I guess, they could all use some
> shared initializer for "flags".
Note that there are only two such callers, so I'm not sure the effort to
extract this would be worth it.
> > + # Intentionally corrupt the reverse index.
> > + chmod u+w $rev &&
> > + printf "xxxx" | dd of=$rev bs=1 count=4 conv=notrunc &&
> > +
> > + test_must_fail git index-pack --rev-index --verify \
> > + $packdir/pack-$pack.pack 2>err &&
> > + grep "validation error" err
> > +'
>
> This isn't that subtle of a corruption, because we are corrupting the
> first 4 bytes, which is the magic signature. Maybe something further in
> the actual data would be interesting instead of or in addition?
>
> I dunno. There are a lot of edge cases around corruption (likewise, we
> might care how the normal reading code-path perceives a signature
> corruption like this). I'm not sure it's all that interesting to test
> all of them.
Agreed, I think what is here (even though it's not a severe corruption)
would be sufficient to make me feel good about our error handling in
general.
Thanks,
Taylor
next prev parent reply other threads:[~2021-01-25 20:07 UTC|newest]
Thread overview: 82+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-08 18:19 [PATCH 0/8] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-08 18:19 ` [PATCH 1/8] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-08 18:20 ` [PATCH 2/8] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-08 18:20 ` [PATCH 3/8] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-08 18:20 ` [PATCH 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-08 18:20 ` [PATCH 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-08 18:20 ` [PATCH 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-12 17:11 ` Ævar Arnfjörð Bjarmason
2021-01-12 18:40 ` Taylor Blau
2021-01-08 18:20 ` [PATCH 7/8] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-12 16:49 ` Derrick Stolee
2021-01-12 17:34 ` Taylor Blau
2021-01-12 17:18 ` Ævar Arnfjörð Bjarmason
2021-01-12 17:39 ` Derrick Stolee
2021-01-12 18:17 ` Taylor Blau
2021-01-08 18:20 ` [PATCH 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-13 22:28 ` [PATCH v2 0/8] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-13 22:28 ` [PATCH v2 1/8] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-14 7:22 ` Junio C Hamano
2021-01-14 12:07 ` Derrick Stolee
2021-01-14 19:57 ` Jeff King
2021-01-14 18:28 ` Taylor Blau
2021-01-14 7:26 ` Junio C Hamano
2021-01-14 18:13 ` Taylor Blau
2021-01-14 20:57 ` Junio C Hamano
2021-01-22 22:54 ` Jeff King
2021-01-25 17:44 ` Taylor Blau
2021-01-25 18:27 ` Jeff King
2021-01-25 19:04 ` Junio C Hamano
2021-01-25 19:23 ` Taylor Blau
2021-01-13 22:28 ` [PATCH v2 2/8] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-22 23:24 ` Jeff King
2021-01-25 19:15 ` Taylor Blau
2021-01-26 21:43 ` Jeff King
2021-01-13 22:28 ` [PATCH v2 3/8] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-22 23:53 ` Jeff King
2021-01-25 20:03 ` Taylor Blau [this message]
2021-01-13 22:28 ` [PATCH v2 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-22 23:57 ` Jeff King
2021-01-23 0:08 ` Jeff King
2021-01-25 20:21 ` Taylor Blau
2021-01-25 20:50 ` Jeff King
2021-01-13 22:28 ` [PATCH v2 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-13 22:28 ` [PATCH v2 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-13 22:28 ` [PATCH v2 7/8] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-13 22:28 ` [PATCH v2 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-25 23:37 ` [PATCH v3 00/10] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-25 23:37 ` [PATCH v3 01/10] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-29 0:27 ` Jeff King
2021-01-29 1:14 ` Taylor Blau
2021-01-30 8:39 ` Jeff King
2021-01-25 23:37 ` [PATCH v3 02/10] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-25 23:37 ` [PATCH v3 03/10] builtin/index-pack.c: allow stripping arbitrary extensions Taylor Blau
2021-01-29 0:28 ` Jeff King
2021-01-29 1:15 ` Taylor Blau
2021-01-25 23:37 ` [PATCH v3 04/10] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-25 23:37 ` [PATCH v3 05/10] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-25 23:37 ` [PATCH v3 06/10] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-29 0:30 ` Jeff King
2021-01-29 1:17 ` Taylor Blau
2021-01-30 8:41 ` Jeff King
2021-01-25 23:37 ` [PATCH v3 07/10] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-29 0:45 ` Jeff King
2021-01-29 1:09 ` Eric Sunshine
2021-01-29 1:21 ` Taylor Blau
2021-01-30 8:43 ` Jeff King
2021-01-29 2:42 ` Junio C Hamano
2021-01-25 23:37 ` [PATCH v3 08/10] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-29 0:47 ` Jeff King
2021-01-25 23:37 ` [PATCH v3 09/10] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-29 0:53 ` Jeff King
2021-01-29 1:25 ` Taylor Blau
2021-01-30 8:46 ` Jeff King
2021-01-25 23:37 ` [PATCH v3 10/10] t5325: check both on-disk and in-memory reverse index Taylor Blau
2021-01-29 1:04 ` Jeff King
2021-01-29 1:05 ` Jeff King
2021-01-29 1:32 ` Taylor Blau
2021-01-30 8:47 ` Jeff King
2021-01-26 2:36 ` [PATCH v3 00/10] pack-revindex: introduce on-disk '.rev' format Junio C Hamano
2021-01-26 2:49 ` Taylor Blau
2021-01-29 1:06 ` Jeff King
2021-01-29 1:34 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YA8j+3nq8nI/hcIQ@nand.local \
--to=me@ttaylorr.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jrnieder@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).