git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <dstolee@microsoft.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits'
Date: Fri, 3 Apr 2020 21:38:42 +0200	[thread overview]
Message-ID: <20200403193842.GA7859@szeder.dev> (raw)
In-Reply-To: <20200403184933.GA57202@syl.local>

On Fri, Apr 03, 2020 at 12:49:33PM -0600, Taylor Blau wrote:
> On Fri, Apr 03, 2020 at 02:30:57PM -0400, Jeff King wrote:
> > On Mon, Aug 05, 2019 at 10:02:40AM +0200, SZEDER Gábor wrote:
> >
> > > While 'git commit-graph write --stdin-commits' expects commit object
> > > ids as input, it accepts and silently skips over any invalid commit
> > > object ids, and still exits with success:
> > >
> > >   # nonsense
> > >   $ echo not-a-commit-oid | git commit-graph write --stdin-commits
> > >   $ echo $?
> > >   0
> > >   # sometimes I forgot that refs are not good...
> > >   $ echo HEAD | git commit-graph write --stdin-commits
> > >   $ echo $?
> > >   0
> > >   # valid tree OID, but not a commit OID
> > >   $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
> > >   $ echo $?
> > >   0
> > >   $ ls -l .git/objects/info/commit-graph
> > >   ls: cannot access '.git/objects/info/commit-graph': No such file or directory
> > >
> > > Check that all input records are indeed valid commit object ids and
> > > return with error otherwise, the same way '--stdin-packs' handles
> > > invalid input; see e103f7276f (commit-graph: return with errors during
> > > write, 2019-06-12).
> >
> > Can you explain more why the old behavior is a problem?

Because when I do:

   # sometimes I forgot that refs are not good...
   $ echo HEAD | git commit-graph write --stdin-commits

then I get _nothing_: neither an error, nor a commit-graph.

> > For reasons (see
> > below), we want to do something like:
> >
> >   git for-each-ref --format='%(objectname)' |
> >   git commit-graph write --stdin-commits
> >
> > In v2.23 and earlier, that worked exactly like --reachable, but now it
> > will blow up if there are any refs that point to a non-commit (e.g., a
> > tag of a blob).
> >
> > It can be worked around by asking for %(objecttype) and %(*objecttype)
> > and grepping the result, but that's awkward and much less efficient
> > (especially if you have a lot of annotated tags, as we may have to open
> > and parse each one).
> >
> > Now obviously you could just use --reachable for the code above. But
> > here are two plausible cases where you might not want to do that:
> >
> >  - you're limiting the graph to only a subset of refs (e.g., you want to
> >    graph refs/heads/ and refs/tags, but not refs/some-other-weird-area/).
> >
> >  - you're generating an incremental graph update. You know somehow that
> >    a few refs were updated, and you want to feed those tips to generate
> >    the incremental, but not the rest of the refs (not because it would
> >    be wrong to do so, but in the name of keeping it O(size of change)
> >    and not O(number of refs in the repo).
> >
> > The latter is the actual case that bit us. I suppose one could do
> > something like:
> >
> >   git rev-list --no-walk <maybe-commits |
> >   git commit-graph write --stdin-commits
> >
> > to use rev-list as a filter, but that feels kind of baroque.
> >
> > Normally I'm in favor of more error checking instead of less, but in
> > this case it feels like it's putting scripted use at a disadvantage
> > versus the internal code (e.g., the auto-write for git-fetch uses the
> > "--reachable" semantics for its internal invocation).
> 
> For what it's worth, (and in case it wasn't obvious) this came about
> because we feed '--stdin-commits' at GitHub, and observed exactly this
> error case. I wasn't sure what approach would be more palatable, so I
> prepared both in my fork at https://github.com/ttaylorr/git:
> 
>   - Branch 'tb/commit-graph-dont-check-oids' drops this checking
>     entirely.

Oh, no :)

>   - Branch 'tb/commit-graph-check-oids-option' adds a
>     '--[no-]check-oids', in case that this is generally desirable
>     behavior, by offering an opt-out of this OID checking.
> 
> Please let me know if you find either of these to be good options, and
> I'll happily send one of them to the list. Thanks.

Or introduce 'git commit-graph write --stdin-refs'?  Or teach
'--stdin-commits' to DWIM and accept and parse refs?  Though the
question still remains what to do with refs that can't be peeled back
to commits

> > -Peff
> >
> > PS As an aside, I think the internal git-fetch write could benefit from
> >    this same trick: feed the set of newly-stored ref tips to the
> >    commit-graph machinery, rather than using for_each_ref().
> 
> Thanks,
> Taylor

  reply	other threads:[~2020-04-03 19:38 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-05  8:02 [PATCH 0/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
2019-08-05  8:02 ` [PATCH 1/3] t5318-commit-graph: use 'test_expect_code' SZEDER Gábor
2019-08-05  8:02 ` [PATCH 2/3] commit-graph: turn a group of write-related macro flags into an enum SZEDER Gábor
2019-08-05  8:02 ` [PATCH 3/3] commit-graph: error out on invalid commit oids in 'write --stdin-commits' SZEDER Gábor
2019-08-05 13:57   ` Derrick Stolee
2019-08-05 17:57     ` SZEDER Gábor
2020-04-03 18:30   ` Jeff King
2020-04-03 18:49     ` Taylor Blau
2020-04-03 19:38       ` SZEDER Gábor [this message]
2020-04-03 19:51         ` Jeff King
2020-04-03 20:40           ` SZEDER Gábor
2020-04-03 23:10             ` Jeff King
2020-04-13 19:39               ` Taylor Blau
2020-04-13 21:25                 ` Jeff King
2020-04-14  2:04                   ` Taylor Blau
2020-04-03 19:55         ` Taylor Blau
2020-04-03 19:47       ` Junio C Hamano
2020-04-03 19:57         ` Taylor Blau
2019-08-05 10:14 ` [PATCH 0/3] " SZEDER Gábor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200403193842.GA7859@szeder.dev \
    --to=szeder.dev@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).