git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* pathspec: problems with too long command line
@ 2018-11-21  9:23 Marc Strapetz
  2018-11-21 13:21 ` Jeff King
  0 siblings, 1 reply; 4+ messages in thread
From: Marc Strapetz @ 2018-11-21  9:23 UTC (permalink / raw)
  To: git

 From our GUI client we are invoking git operations on a possibly large 
set of files. This may result in pathspecs which are exceeding the 
maximum command line length, especially on Windows [1] and OSX [2]. To 
workaround this problem we are currently splitting up such operations by 
invoking multiple git commands. This works well for some commands (like 
add), but doesn't work well for others (like commit).

A possible solution could be to add another patchspec magic word which 
will read paths from a file instead of command line. A similar approach 
can be found in Mercurial with its "listfile:" pattern [3].

Does that sound reasonable? If so, we should be able to provide a 
corresponding patch.

-Marc

[1] https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/
[2] https://serverfault.com/questions/69430
[3] https://www.mercurial-scm.org/repo/hg/help/patterns



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pathspec: problems with too long command line
  2018-11-21  9:23 pathspec: problems with too long command line Marc Strapetz
@ 2018-11-21 13:21 ` Jeff King
  2018-11-21 13:37   ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2018-11-21 13:21 UTC (permalink / raw)
  To: Marc Strapetz; +Cc: git

On Wed, Nov 21, 2018 at 10:23:34AM +0100, Marc Strapetz wrote:

> From our GUI client we are invoking git operations on a possibly large set
> of files. This may result in pathspecs which are exceeding the maximum
> command line length, especially on Windows [1] and OSX [2]. To workaround
> this problem we are currently splitting up such operations by invoking
> multiple git commands. This works well for some commands (like add), but
> doesn't work well for others (like commit).
> 
> A possible solution could be to add another patchspec magic word which will
> read paths from a file instead of command line. A similar approach can be
> found in Mercurial with its "listfile:" pattern [3].
> 
> Does that sound reasonable? If so, we should be able to provide a
> corresponding patch.

Quite a few commands take --stdin, which can be used to send pathspecs
(and often other stuff) without size limits. I don't think either
"commit" or "add" does, but that might be another route.

I'm slightly nervous at a pathspec that starts reading arbitrary files,
because I suspect there may be interesting ways to abuse it for services
which expose Git. E.g., if I have a web service which can show the
history of a file, I might take a $file parameter from the client and
run "git rev-list -- $file" (handling shell quoting, of course). That's
OK now, but with the proposed pathspec magic, a malicious user could ask
for ":(from-file=/etc/passwd)" or whatever.

I dunno. Maybe that is overly paranoid, and certainly servers like that
are a subset of users. And perhaps such servers should be specifying
GIT_LITERAL_PATHSPECS=1 anyway.

-Peff

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pathspec: problems with too long command line
  2018-11-21 13:21 ` Jeff King
@ 2018-11-21 13:37   ` Junio C Hamano
  2018-11-21 20:56     ` Marc Strapetz
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2018-11-21 13:37 UTC (permalink / raw)
  To: Jeff King; +Cc: Marc Strapetz, git

Jeff King <peff@peff.net> writes:

> On Wed, Nov 21, 2018 at 10:23:34AM +0100, Marc Strapetz wrote:
>
>> From our GUI client we are invoking git operations on a possibly large set
>> of files. ...
>> command line length, especially on Windows [1] and OSX [2]. To workaround
>> this problem we are currently splitting up such operations by invoking
>> multiple git commands. This works well for some commands (like add), but
>> doesn't work well for others (like commit).

> Quite a few commands take --stdin, which can be used to send pathspecs
> (and often other stuff) without size limits. I don't think either
> "commit" or "add" does, but that might be another route.

A GUI client, like your server, should not be using end-user facing
Porcelain commands like "add" and "commit" anyway.  In the standard
"update-index" followed by "write-tree" followed-by "commit-tree"
followed by "update-ref" sequence, the only thing that needs to take
pathspec is the update-index step, and it already does take --stdin.

In any case, I share your gut feeling that this should not be a
magic pathspec, but should instead be "--stdin[-paths]", for command
line parsing's sanity.  Catchng random strings that begin with
double dash as fishy is much simpler and more robust than having to
tell if a string that is a risky or a benign magic pathspec.




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pathspec: problems with too long command line
  2018-11-21 13:37   ` Junio C Hamano
@ 2018-11-21 20:56     ` Marc Strapetz
  0 siblings, 0 replies; 4+ messages in thread
From: Marc Strapetz @ 2018-11-21 20:56 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: git

On 21.11.2018 14:37, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
>> On Wed, Nov 21, 2018 at 10:23:34AM +0100, Marc Strapetz wrote:
>>
>>>  From our GUI client we are invoking git operations on a possibly large set
>>> of files. ...
>>> command line length, especially on Windows [1] and OSX [2]. To workaround
>>> this problem we are currently splitting up such operations by invoking
>>> multiple git commands. This works well for some commands (like add), but
>>> doesn't work well for others (like commit).
> 
>> Quite a few commands take --stdin, which can be used to send pathspecs
>> (and often other stuff) without size limits. I don't think either
>> "commit" or "add" does, but that might be another route.
> 
> A GUI client, like your server, should not be using end-user facing
> Porcelain commands like "add" and "commit" anyway.  In the standard
> "update-index" followed by "write-tree" followed-by "commit-tree"
> followed by "update-ref" sequence, the only thing that needs to take
> pathspec is the update-index step, and it already does take --stdin.

In our case it's a desktop client. We didn't consider using plumbing 
commands in general but only in cases where no appropriate high level 
commands exist. One reason for this decision was definitely a lack of 
our understanding in the beginning (which is no excuse anymore :). 
Another reason is that our users have quite frequently requested to see 
invoked Git commands, for their own understanding and learning. I think 
this argument remains valid. A third reason is to reduce process 
invocations. Although we are quite experienced Git users now, one final 
reason may be that it's still desirable to rely on additional validation 
in high level commands.

Summed up, I would prefer to find a solution which allows to stick with 
"git add"s, "git commit"s, "git checkout"s, ... and providing 
--stdin-paths alternatively to <pathspec> would be a good solution from 
a GUI client developer's perspective. I'm probably too biased to see 
whether it will be beneficial to standalone Git, too?

>> I'm slightly nervous at a pathspec that starts reading arbitrary files,
>> because I suspect there may be interesting ways to abuse it for services
>> which expose Git. E.g., if I have a web service which can show the
>> history of a file, I might take a $file parameter from the client and
>> run "git rev-list -- $file" (handling shell quoting, of course). That's
>> OK now, but with the proposed pathspec magic, a malicious user could ask
>> for ":(from-file=/etc/passwd)" or whatever.
> 
> In any case, I share your gut feeling that this should not be a
> magic pathspec, but should instead be "--stdin[-paths]", for command
> line parsing's sanity.  Catchng random strings that begin with
> double dash as fishy is much simpler and more robust than having to
> tell if a string that is a risky or a benign magic pathspec.

These are interesting points. Then --stdin[-paths] is definitely the 
better choice.

-Marc

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-11-21 20:56 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-21  9:23 pathspec: problems with too long command line Marc Strapetz
2018-11-21 13:21 ` Jeff King
2018-11-21 13:37   ` Junio C Hamano
2018-11-21 20:56     ` Marc Strapetz

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).