git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/1] diff: release all handles before running external diff
@ 2019-07-04  9:16 Johannes Schindelin via GitGitGadget
  2019-07-04  9:16 ` [PATCH 1/1] diff: munmap() file contents " Johannes Schindelin via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-07-04  9:16 UTC
  To: git; +Cc: Junio C Hamano

On Windows, it is not possible to overwrite a file as long as any process
holds a read handle to it. Even keeping regions memory-mapped prevents that.

When git difftool calls git diff, it might be the user's intention to write
the file(s) via the diff tool, so let's make sure that they are not
memory-mapped at that stage.

Johannes Schindelin (1):
  diff: munmap() file contents before running external diff

 diff.c | 4 ++++
 1 file changed, 4 insertions(+)


base-commit: aa25c82427ae70aebf3b8f970f2afd54e9a2a8c6
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-213%2Fdscho%2Fmunmap-before-ext-diff-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-213/dscho/munmap-before-ext-diff-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/213
-- 
gitgitgadget


* [PATCH 1/1] diff: munmap() file contents before running external diff
  2019-07-04  9:16 [PATCH 0/1] diff: release all handles before running external diff Johannes Schindelin via GitGitGadget
@ 2019-07-04  9:16 ` Johannes Schindelin via GitGitGadget
  2019-07-08 21:54   ` Junio C Hamano
  2019-07-08 19:24 ` [PATCH 0/1] diff: release all handles " Junio C Hamano
  2019-07-11  8:23 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-07-04  9:16 UTC
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When running an external diff from, say, a diff tool, it is safe to
assume that we want to write the files in question. On Windows, that
means that there cannot be any other process holding an open handle to
said files.

So let's make sure that `git diff` itself is not holding any open handle
to the files in question.

This fixes https://github.com/git-for-windows/git/issues/1315

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 diff.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/diff.c b/diff.c
index 4d3cf83a27..0afb76bbca 100644
--- a/diff.c
+++ b/diff.c
@@ -4206,6 +4206,10 @@ static void run_external_diff(const char *pgm,
 	argv_array_pushf(&env, "GIT_DIFF_PATH_COUNTER=%d", ++o->diff_path_counter);
 	argv_array_pushf(&env, "GIT_DIFF_PATH_TOTAL=%d", q->nr);
 
+	if (one && one->should_munmap)
+		diff_free_filespec_data(one);
+	if (two && two->should_munmap)
+		diff_free_filespec_data(two);
 	if (run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv))
 		die(_("external diff died, stopping at %s"), name);
 
-- 
gitgitgadget


* Re: [PATCH 0/1] diff: release all handles before running external diff
  2019-07-04  9:16 [PATCH 0/1] diff: release all handles before running external diff Johannes Schindelin via GitGitGadget
  2019-07-04  9:16 ` [PATCH 1/1] diff: munmap() file contents " Johannes Schindelin via GitGitGadget
@ 2019-07-08 19:24 ` Junio C Hamano
  2019-07-11  8:23 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
  2 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2019-07-08 19:24 UTC
  To: Johannes Schindelin via GitGitGadget; +Cc: git

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> On Windows, it is not possible to overwrite a file as long as any process
> holds a read handle to it. Even keeping regions memory-mapped prevents that.

That second sentence was quite helpful, as I do recall us closing
after mmapping.  Without it, the justification becomes quite weak.
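
A minimal POSIX sketch, not part of the patch, of the point made here: closing the descriptor does not release a mapped region; only munmap() does. The helper name mapping_survives_close() and the temporary-file template are assumptions made for illustration. The code shows POSIX behavior only; the Windows overwrite restriction described in the cover letter is not reproduced by it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes `text` to a temporary file, maps it,
 * closes the descriptor, and checks that the mapping is still readable.
 * Returns 1 if the mapped bytes still match after close(), 0 if they
 * do not, and -1 on empty input or any system-call failure.
 */
int mapping_survives_close(const char *text)
{
	char path[] = "/tmp/mmap-demo-XXXXXX";
	size_t len = strlen(text);
	int fd = mkstemp(path);
	char *map;
	int ok;

	if (fd < 0)
		return -1;
	unlink(path); /* POSIX keeps the data until fd and mapping are gone */
	if (len == 0 || write(fd, text, len) != (ssize_t)len) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd); /* the descriptor is gone; the mapping (if any) remains */
	if (map == MAP_FAILED)
		return -1;

	ok = !memcmp(map, text, len); /* read through the mapping after close() */
	munmap(map, len);             /* only this releases the mapped region */
	return ok;
}
```

Because the region stays alive after close(), code that wants to let go of a file completely has to unmap it as well, which is what the patch arranges before the external diff runs.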




* Re: [PATCH 1/1] diff: munmap() file contents before running external diff
  2019-07-04  9:16 ` [PATCH 1/1] diff: munmap() file contents " Johannes Schindelin via GitGitGadget
@ 2019-07-08 21:54   ` Junio C Hamano
  2019-07-10 12:43     ` Johannes Schindelin
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2019-07-08 21:54 UTC
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> When running an external diff from, say, a diff tool, it is safe to
> assume that we want to write the files in question. On Windows, that
> means that there cannot be any other process holding an open handle to
> said files.

Please add "It is not enough to close the file descriptor; having a
region that is still mmapped keeps the file busy" or something like
that at the end.

> So let's make sure that `git diff` itself is not holding any open handle
> to the files in question.
>
> This fixes https://github.com/git-for-windows/git/issues/1315
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  diff.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/diff.c b/diff.c
> index 4d3cf83a27..0afb76bbca 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4206,6 +4206,10 @@ static void run_external_diff(const char *pgm,
>  	argv_array_pushf(&env, "GIT_DIFF_PATH_COUNTER=%d", ++o->diff_path_counter);
>  	argv_array_pushf(&env, "GIT_DIFF_PATH_TOTAL=%d", q->nr);
>  
> +	if (one && one->should_munmap)
> +		diff_free_filespec_data(one);
> +	if (two && two->should_munmap)
> +		diff_free_filespec_data(two);

I wondered if a single diff_filespec instance can be used in two
diff_filepair instances (e.g. file A is in-place modified and also
used to create file C), and if so after showing the diff for file A,
we have problems with showing file C.  But I do not think it should
pose a problem, as "free data after comparing a pair" is what we do
for the in-core codepath in builtin_diff().
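
A toy sketch of why freeing after each pair is harmless when one file spec is shared by two pairs, assuming the contents can be re-populated on demand. Every name here (toy_filespec, toy_populate(), toy_free_data(), toy_compare_pair()) is hypothetical and does not mirror the real diff.c definitions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a file spec whose contents load lazily. */
struct toy_filespec {
	const char *source; /* where the contents come from (not owned) */
	char *data;         /* NULL until populated */
	int loads;          /* how many times the contents were loaded */
};

/* Load the contents on demand; a no-op if they are already in memory. */
static int toy_populate(struct toy_filespec *s)
{
	size_t len;

	if (s->data)
		return 0;
	len = strlen(s->source) + 1;
	s->data = malloc(len);
	if (!s->data)
		return -1;
	memcpy(s->data, s->source, len);
	s->loads++;
	return 0;
}

/* Drop the in-memory copy; the spec itself stays usable. */
static void toy_free_data(struct toy_filespec *s)
{
	free(s->data);
	s->data = NULL;
}

/* Compare one pair, then release both sides. Returns 1 if they differ. */
static int toy_compare_pair(struct toy_filespec *one, struct toy_filespec *two)
{
	int differs = -1;

	if (!toy_populate(one) && !toy_populate(two))
		differs = strcmp(one->data, two->data) != 0;
	toy_free_data(one);
	toy_free_data(two);
	return differs;
}

/* File A takes part in two pairs: A -> A' (modified) and A -> C (copied). */
int toy_shared_source_demo(void)
{
	struct toy_filespec a = { "old\n", NULL, 0 };
	struct toy_filespec a_new = { "new\n", NULL, 0 };
	struct toy_filespec c = { "old\n", NULL, 0 };
	int first = toy_compare_pair(&a, &a_new); /* frees a.data */
	int second = toy_compare_pair(&a, &c);    /* reloads a.data */

	return first == 1 && second == 0 && a.loads == 2;
}
```

The second comparison simply reloads the shared side, so the only cost of freeing early in this model is the extra load.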

We can lose the NULL-ness test for one and two if this "free the
resource once we no longer need it" step is done inside "if (one && two)".
After all, once add_external_diff_name()[*1*] does its thing, we do
not need the data for these diff_filespec instances, right?

Also, just like builtin_diff() unconditionally frees the resources
held by diff_filespec instances, shouldn't this function do so, even
for the ones that are not marked with should_munmap?


>  	if (run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv))
>  		die(_("external diff died, stopping at %s"), name);


* Re: [PATCH 1/1] diff: munmap() file contents before running external diff
  2019-07-08 21:54   ` Junio C Hamano
@ 2019-07-10 12:43     ` Johannes Schindelin
  0 siblings, 0 replies; 8+ messages in thread
From: Johannes Schindelin @ 2019-07-10 12:43 UTC
  To: Junio C Hamano; +Cc: Johannes Schindelin via GitGitGadget, git

Hi Junio,

On Mon, 8 Jul 2019, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > When running an external diff from, say, a diff tool, it is safe to
> > assume that we want to write the files in question. On Windows, that
> > means that there cannot be any other process holding an open handle to
> > said files.
>
> Please add "It is not enough to close the file descriptor; having a
> region that is still mmapped keeps the file busy" or something like
> that at the end.

Good call.

> > So let's make sure that `git diff` itself is not holding any open handle
> > to the files in question.
> >
> > This fixes https://github.com/git-for-windows/git/issues/1315
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  diff.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/diff.c b/diff.c
> > index 4d3cf83a27..0afb76bbca 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -4206,6 +4206,10 @@ static void run_external_diff(const char *pgm,
> >  	argv_array_pushf(&env, "GIT_DIFF_PATH_COUNTER=%d", ++o->diff_path_counter);
> >  	argv_array_pushf(&env, "GIT_DIFF_PATH_TOTAL=%d", q->nr);
> >
> > +	if (one && one->should_munmap)
> > +		diff_free_filespec_data(one);
> > +	if (two && two->should_munmap)
> > +		diff_free_filespec_data(two);
>
> I wondered if a single diff_filespec instance can be used in two
> diff_filepair instances (e.g. file A is in-place modified and also
> used to create file C), and if so after showing the diff for file A,
> we have problems with showing file C.  But I do not think it should
> pose a problem, as "free data after comparing a pair" is what we do
> for the in-core codepath in builtin_diff().

Precisely.

> We can lose the NULL-ness test for one and two if this "free the
> resource once we no longer need it" step is done inside "if (one && two)".
> After all, once add_external_diff_name()[*1*] does its thing, we do
> not need the data for these diff_filespec instances, right?

Yes, but we still need the `should_munmap` test, I believe. So we have
that `if` anyway.

> Also, just like builtin_diff() unconditionally frees the resources
> held by diff_filespec instances, shouldn't this function do so, even
> for the ones that are not marked with should_munmap?

I have not inspected the code path rigorously, nor am I confident that
this wouldn't be broken easily. At least when regions are mapped, I am
fairly certain that my patch does not break anything.

But if you are confident that this won't break anything, I'll certainly
be happy with that assessment.

I'll make that change and trust the CI build to fail if your assumption
was incorrect.

Ciao,
Dscho

>
>
> >  	if (run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv))
> >  		die(_("external diff died, stopping at %s"), name);
>


* [PATCH v2 0/1] diff: release all handles before running external diff
  2019-07-04  9:16 [PATCH 0/1] diff: release all handles before running external diff Johannes Schindelin via GitGitGadget
  2019-07-04  9:16 ` [PATCH 1/1] diff: munmap() file contents " Johannes Schindelin via GitGitGadget
  2019-07-08 19:24 ` [PATCH 0/1] diff: release all handles " Junio C Hamano
@ 2019-07-11  8:23 ` Johannes Schindelin via GitGitGadget
  2019-07-11  8:23   ` [PATCH v2 1/1] diff: munmap() file contents " Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-07-11  8:23 UTC
  To: git; +Cc: Junio C Hamano

On Windows, it is not possible to overwrite a file as long as any process
holds a read handle to it. Even keeping regions memory-mapped prevents that.

When git difftool calls git diff, it might be the user's intention to write
the file(s) via the diff tool, so let's make sure that they are not
memory-mapped at that stage.

Changes since v1:

 * Clarified in the commit message that even mapped regions block
   writes/deletes.
 * The diff file pair is now released unconditionally, not only when it is
   mapped, for consistency (the CI build did not fail, and a cursory
   inspection of the code paths indicates that this should be safe, as from
   this point on only the external command accesses the file pair's
   contents, and they had to be written out to disk to that end).
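
A hedged sketch of what makes an unconditional release safe: a release routine that looks at ownership flags itself, so callers need no `if (s && s->should_munmap)` guard. The struct and function names (toy_spec, toy_release(), toy_map_tempfile()) are hypothetical; only the should_munmap flag name comes from the patch, should_free is an assumed counterpart, and the NULL tolerance is a design choice of this sketch rather than a claim about diff_free_filespec_data().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffer holder; the flags record how `data` was obtained. */
struct toy_spec {
	void *data;
	size_t size;
	unsigned should_free : 1;   /* data came from malloc() */
	unsigned should_munmap : 1; /* data came from mmap() */
};

/* Release whatever the spec holds; NULL, empty and repeated calls are fine. */
void toy_release(struct toy_spec *s)
{
	if (!s)
		return;
	if (s->should_free)
		free(s->data);
	else if (s->should_munmap)
		munmap(s->data, s->size);
	s->data = NULL;
	s->size = 0;
	s->should_free = s->should_munmap = 0;
}

/* Map a small temporary file read-only into `s`; returns 0 on success. */
static int toy_map_tempfile(struct toy_spec *s)
{
	static const char text[] = "mapped contents\n";
	char path[] = "/tmp/toy-spec-XXXXXX";
	size_t len = sizeof(text) - 1;
	int fd = mkstemp(path);
	void *m;

	if (fd < 0)
		return -1;
	unlink(path);
	if (write(fd, text, len) != (ssize_t)len) {
		close(fd);
		return -1;
	}
	m = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);
	if (m == MAP_FAILED)
		return -1;
	s->data = m;
	s->size = len;
	s->should_munmap = 1;
	return 0;
}

/* Returns 1 if both specs end up empty, -1 if the setup itself failed. */
int toy_release_demo(void)
{
	struct toy_spec heap = { 0 }, mapped = { 0 };

	heap.data = malloc(16);
	if (!heap.data)
		return -1;
	heap.size = 16;
	heap.should_free = 1;
	if (toy_map_tempfile(&mapped)) {
		toy_release(&heap);
		return -1;
	}

	toy_release(&heap);
	toy_release(&mapped);
	toy_release(&mapped); /* second call finds nothing left to release */
	toy_release(NULL);    /* a missing spec is ignored */
	return !heap.data && !mapped.data && !mapped.should_munmap;
}
```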

Johannes Schindelin (1):
  diff: munmap() file contents before running external diff

 diff.c | 2 ++
 1 file changed, 2 insertions(+)


base-commit: aa25c82427ae70aebf3b8f970f2afd54e9a2a8c6
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-213%2Fdscho%2Fmunmap-before-ext-diff-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-213/dscho/munmap-before-ext-diff-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/213

Range-diff vs v1:

 1:  bef83fc20b ! 1:  8a0213291b diff: munmap() file contents before running external diff
     @@ -5,11 +5,15 @@
          When running an external diff from, say, a diff tool, it is safe to
          assume that we want to write the files in question. On Windows, that
          means that there cannot be any other process holding an open handle to
     -    said files.
     +    said files, or even just a mapped region.
      
          So let's make sure that `git diff` itself is not holding any open handle
          to the files in question.
      
     +    In fact, we will just release the file pair right away, as the external
     +    diff uses the files we just wrote, so we do not need to hold the file
     +    contents in memory anymore.
     +
          This fixes https://github.com/git-for-windows/git/issues/1315
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
     @@ -21,10 +25,8 @@
       	argv_array_pushf(&env, "GIT_DIFF_PATH_COUNTER=%d", ++o->diff_path_counter);
       	argv_array_pushf(&env, "GIT_DIFF_PATH_TOTAL=%d", q->nr);
       
     -+	if (one && one->should_munmap)
     -+		diff_free_filespec_data(one);
     -+	if (two && two->should_munmap)
     -+		diff_free_filespec_data(two);
     ++	diff_free_filespec_data(one);
     ++	diff_free_filespec_data(two);
       	if (run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv))
       		die(_("external diff died, stopping at %s"), name);
       

-- 
gitgitgadget


* [PATCH v2 1/1] diff: munmap() file contents before running external diff
  2019-07-11  8:23 ` [PATCH v2 " Johannes Schindelin via GitGitGadget
@ 2019-07-11  8:23   ` Johannes Schindelin via GitGitGadget
  2019-07-11 19:03     ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-07-11  8:23 UTC
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When running an external diff from, say, a diff tool, it is safe to
assume that we want to write the files in question. On Windows, that
means that there cannot be any other process holding an open handle to
said files, or even just a mapped region.

So let's make sure that `git diff` itself is not holding any open handle
to the files in question.

In fact, we will just release the file pair right away, as the external
diff uses the files we just wrote, so we do not need to hold the file
contents in memory anymore.
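
The ordering described above can be sketched as follows, under stated assumptions: the contents are written to a temporary file, the in-memory copy is dropped, and only then is an external command started with the path as its argument. toy_run_tool_on_copy() and its parameters are hypothetical; the real code builds its command line and temporary files differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical flow: write `contents` to a temporary file, release the
 * in-memory copy, then run the shell snippet `tool`, which receives the
 * file's path as "$1". Returns the tool's exit status, or -1 on failure.
 */
int toy_run_tool_on_copy(const char *contents, const char *tool)
{
	char path[] = "/tmp/toy-ext-XXXXXX";
	size_t len = strlen(contents);
	char *in_core = malloc(len + 1);
	ssize_t written;
	int fd, status;
	pid_t pid;

	if (!in_core)
		return -1;
	memcpy(in_core, contents, len + 1);

	fd = mkstemp(path);
	if (fd < 0) {
		free(in_core);
		return -1;
	}
	written = write(fd, in_core, len);
	close(fd);
	free(in_core); /* nothing below needs the in-memory copy */
	if (written != (ssize_t)len) {
		unlink(path);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		unlink(path);
		return -1;
	}
	if (pid == 0) {
		/* child: the tool sees only the path, never our buffers */
		execl("/bin/sh", "sh", "-c", tool, "toy-tool", path, (char *)NULL);
		_exit(127);
	}
	if (waitpid(pid, &status, 0) < 0)
		status = -1;
	unlink(path);
	return (status != -1 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

For instance, `toy_run_tool_on_copy("hello\n", "echo changed > \"$1\"")` lets the tool overwrite the file; by the time it runs, the parent in this sketch holds neither a descriptor nor a buffer for it.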

This fixes https://github.com/git-for-windows/git/issues/1315

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 diff.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/diff.c b/diff.c
index 4d3cf83a27..42affb6dcf 100644
--- a/diff.c
+++ b/diff.c
@@ -4206,6 +4206,8 @@ static void run_external_diff(const char *pgm,
 	argv_array_pushf(&env, "GIT_DIFF_PATH_COUNTER=%d", ++o->diff_path_counter);
 	argv_array_pushf(&env, "GIT_DIFF_PATH_TOTAL=%d", q->nr);
 
+	diff_free_filespec_data(one);
+	diff_free_filespec_data(two);
 	if (run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv))
 		die(_("external diff died, stopping at %s"), name);
 
-- 
gitgitgadget


* Re: [PATCH v2 1/1] diff: munmap() file contents before running external diff
  2019-07-11  8:23   ` [PATCH v2 1/1] diff: munmap() file contents " Johannes Schindelin via GitGitGadget
@ 2019-07-11 19:03     ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2019-07-11 19:03 UTC
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> When running an external diff from, say, a diff tool, it is safe to
> assume that we want to write the files in question. On Windows, that
> means that there cannot be any other process holding an open handle to
> said files, or even just a mapped region.
>
> So let's make sure that `git diff` itself is not holding any open handle
> to the files in question.
>
> In fact, we will just release the file pair right away, as the external
> diff uses the files we just wrote, so we do not need to hold the file
> contents in memory anymore.
>
> This fixes https://github.com/git-for-windows/git/issues/1315
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  diff.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/diff.c b/diff.c
> index 4d3cf83a27..42affb6dcf 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4206,6 +4206,8 @@ static void run_external_diff(const char *pgm,
>  	argv_array_pushf(&env, "GIT_DIFF_PATH_COUNTER=%d", ++o->diff_path_counter);
>  	argv_array_pushf(&env, "GIT_DIFF_PATH_TOTAL=%d", q->nr);
>  
> +	diff_free_filespec_data(one);
> +	diff_free_filespec_data(two);
>  	if (run_command_v_opt_cd_env(argv.argv, RUN_USING_SHELL, NULL, env.argv))
>  		die(_("external diff died, stopping at %s"), name);

Looks sensible; will queue.  Thanks.

