git@vger.kernel.org mailing list mirror (one of many)
From: "Baumann, Moritz" <moritz.baumann@sap.com>
To: Jeff King <peff@peff.net>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: Feature Request: Option to make "git rev-list --objects" output duplicate objects
Date: Tue, 28 Mar 2023 08:08:02 +0000
Message-ID: <AS1PR02MB8185DF947EBC583318481E1994889@AS1PR02MB8185.eurprd02.prod.outlook.com>
In-Reply-To: <20230324192848.GC536967@coredump.intra.peff.net>

> Another problem you might not have run into yet: the names given by
> rev-list are not quoted in any way, and will just omit newlines. So if
> your hook is trying to avoid malicious garbage like "foo\nbar", it won't
> work.

Thanks for that warning. I was not aware that rev-list didn't quote file names.

> Those names are really just intended as hints for pack-objects. I
> suspect the documentation could be more clear about these limitations.

That would indeed be great and would have likely prevented the obvious
misconceptions on my side.

> I'm not sure what you mean by "one by one", since that is inherently
> what rev-list is doing under the hood. If you mean "running a separate
> process for each commit", then yes, that will be slow.

Yes, that's what I meant to say.

> But if you want
> to know all of the names touched in a set of commits, I have used
> something like this before:
>
>   git rev-list $new --not --all |
>   git diff-tree --stdin --format= -r -c --name-only

Thanks, that looks promising and solves at least one of my use cases. The only
minor problem is that there seems to be no way to pipe the diff-tree output to
cat-file without massaging it with awk first.
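
A minimal sketch of that awk step, assuming diff-tree is run with --raw in
place of --name-only so that blob IDs are available, and that it prints full
object IDs (its default, as far as I know). The helper name raw_to_oid_path is
hypothetical, not a git command. It relies on the raw layout of one leading
colon per parent, then the modes, then the object IDs, then the status, with
the path after a tab. Paths with unusual characters are C-quoted by git in this
mode, so a stricter hook would probably want -z and a different parser.

```shell
# Hypothetical helper (not part of git): convert `git diff-tree --raw` lines
# into "<oid> <path>" lines suitable as input for `git cat-file --batch-check`.
raw_to_oid_path() {
    awk -F '\t' '
    {
        # Raw lines start with one colon per parent: ":" for ordinary
        # commits, "::" or more for merges shown with -c. Skip anything else.
        if (!match($1, /^:+/)) next
        parents = RLENGTH

        # Metadata layout: (parents + 1) modes, (parents + 1) object IDs,
        # then the status letters. The last object ID is the resulting blob.
        split($1, meta, " ")
        oid = meta[2 * (parents + 1)]

        # An all-zero ID means the path was deleted; there is no blob to check.
        if (oid ~ /^0+$/) next

        # $2 is everything after the tab, i.e. the path (spaces preserved).
        print oid " " $2
    }'
}

# Possible use inside a pre-receive hook ($new as in the quoted example):
#
#   git rev-list "$new" --not --all |
#   git diff-tree --stdin --format= -r -c --raw |
#   raw_to_oid_path |
#   git cat-file --batch-check='%(objectname) %(objectsize) %(rest)'
```

The %(rest) atom makes cat-file split each input line at the first whitespace
and echo the remainder, which is how the path survives the trip through
--batch-check.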

I have three use cases in my pre-receive hooks:

1. Filters solely based on the file name
   -> your suggestion works perfectly here
2. Filters based only on file contents
   -> git rev-list --objects + git cat-file provide everything I need
3. One filter based on file size and name (forbid large files, with exceptions)
   -> I'm guessing "git rev-list | git diff-tree --stdin | awk |
     git cat-file --batch-check" is the best solution to extract the necessary
     information from git in this case?
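
The last stage of use case 3 could be sketched roughly as follows, assuming
the pipeline ends in
git cat-file --batch-check='%(objectname) %(objectsize) %(rest)' so that each
line reads "<oid> <size> <path>". The function name find_oversized, the 1 MiB
limit, and the '^assets/' exception pattern are all made-up placeholders for
illustration, not values taken from any real hook.

```shell
# Hypothetical policy values (assumptions for illustration only).
MAX_BYTES=1048576          # size limit in bytes (1 MiB)
ALLOW_PATTERN='^assets/'   # paths matching this regex are exempt

# Reads "<oid> <size> <path>" lines on stdin, prints each path that exceeds
# MAX_BYTES and is not exempt, and returns non-zero if any were found.
find_oversized() {
    awk -v max="$MAX_BYTES" -v allow="$ALLOW_PATTERN" '
    {
        size = $2
        # Everything after the second field is the path (may contain spaces).
        path = $0
        sub(/^[^ ]+ [^ ]+ /, "", path)

        # Lines such as "<oid> missing" have a non-numeric size, which
        # compares as 0 here and is therefore ignored.
        if (size + 0 > max + 0 && path !~ allow) {
            print path " (" size " bytes)"
            bad = 1
        }
    }
    END { exit (bad ? 1 : 0) }'
}

# Possible use at the end of the pipeline in a pre-receive hook:
#
#   ... | git cat-file --batch-check='%(objectname) %(objectsize) %(rest)' |
#   find_oversized || { echo "push rejected: files too large" >&2; exit 1; }
```

Keeping the policy in a separate function means the size and name rules can be
exercised with hand-written input, without a repository at hand.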

-- Moritz
