git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
@ 2007-10-05  0:15 Steven Grimm
  2007-10-05  8:04 ` Andreas Ericsson
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Steven Grimm @ 2007-10-05  0:15 UTC
  To: git

git-svn dcommit, by virtue of rewriting history to insert svn revision IDs,
leaves old commits dangling.  Since dcommit is already unsafe to run
concurrently with other git commands, no additional risk is introduced
by making it prune those old objects as needed.

Signed-off-by: Steven Grimm <koreth@midwinter.com>
---

This is in response to a colleague who complained that, after I
installed the latest git release, he was getting lots of "too many
unreachable loose objects" errors from the new "git gc --auto" run.
Those objects turned out to be dangling commits from a year's worth of
git-svn usage, since every git-svn commit will abandon at least one
existing commit in order to rewrite it with the svn version data.

 git-svn.perl |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 777e436..be62ee1 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -441,6 +441,12 @@ sub cmd_dcommit {
 			}
 			command_noisy(@finish, $gs->refname);
 			$last_rev = $cmt_rev;
+
+			# rebase will have made the just-committed revisions
+			# unreachable; over time that can build up lots of
+			# loose objects in the repo. prune is unsafe to run
+			# concurrently but so is dcommit.
+			command_noisy(qw/gc --auto --prune/);
 		}
 	}
 }
-- 
1.5.3.4.203.gcc61a

* Re: [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
  2007-10-05  0:15 [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit Steven Grimm
@ 2007-10-05  8:04 ` Andreas Ericsson
  2007-10-05  8:27   ` Johannes Schindelin
  2007-10-05  8:21 ` Peter Baumann
  2007-10-05 23:54 ` Eric Wong
  2 siblings, 1 reply; 10+ messages in thread
From: Andreas Ericsson @ 2007-10-05  8:04 UTC
  To: Steven Grimm; +Cc: git

Steven Grimm wrote:
> git-svn dcommit, by virtue of rewriting history to insert svn revision IDs,
> leaves old commits dangling.  Since dcommit is already unsafe to run
> concurrently with other git commands, no additional risk is introduced
> by making it prune those old objects as needed.
> 
> Signed-off-by: Steven Grimm <koreth@midwinter.com>
> ---
> 
> This is in response to a colleague who complained that, after I
> installed the latest git release, he was getting lots of "too many
> unreachable loose objects" errors from the new "git gc --auto" run.
> Those objects turned out to be dangling commits from a year's worth of
> git-svn usage, since every git-svn commit will abandon at least one
> existing commit in order to rewrite it with the svn version data.
> 
>  git-svn.perl |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/git-svn.perl b/git-svn.perl
> index 777e436..be62ee1 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -441,6 +441,12 @@ sub cmd_dcommit {
>  			}
>  			command_noisy(@finish, $gs->refname);
>  			$last_rev = $cmt_rev;
> +
> +			# rebase will have made the just-committed revisions
> +			# unreachable; over time that can build up lots of
> +			# loose objects in the repo. prune is unsafe to run
> +			# concurrently but so is dcommit.
> +			command_noisy(qw/gc --auto --prune/);
>  		}
>  	}
>  }

I'd be surprised if this would ever prune anything, as git doesn't throw out
objects reachable by reflog (or, I assume, any of the objects reachable from
objects reachable from reflog).

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
  2007-10-05  0:15 [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit Steven Grimm
  2007-10-05  8:04 ` Andreas Ericsson
@ 2007-10-05  8:21 ` Peter Baumann
  2007-10-05 16:12   ` Steven Grimm
  2007-10-05 23:54 ` Eric Wong
  2 siblings, 1 reply; 10+ messages in thread
From: Peter Baumann @ 2007-10-05  8:21 UTC
  To: Steven Grimm; +Cc: git

On Thu, Oct 04, 2007 at 05:15:28PM -0700, Steven Grimm wrote:
> git-svn dcommit, by virtue of rewriting history to insert svn revision IDs,
> leaves old commits dangling.  Since dcommit is already unsafe to run
> concurrently with other git commands, no additional risk is introduced
> by making it prune those old objects as needed.
> 
> Signed-off-by: Steven Grimm <koreth@midwinter.com>
> ---
> 
> This is in response to a colleague who complained that, after I
> installed the latest git release, he was getting lots of "too many
> unreachable loose objects" errors from the new "git gc --auto" run.
> Those objects turned out to be dangling commits from a year's worth of
> git-svn usage, since every git-svn commit will abandon at least one
> existing commit in order to rewrite it with the svn version data.
> 

I don't like the automatic prune. What if someone has other objects in
there which shouldn't be pruned? Making git svn dcommit do the prune
would be surprising at the least, because how is one supposed to know that
doing a commit into svn will prune all your precious objects?

Sure, I can understand where you are coming from, but I'd prefer
if this could be specified on a case by case basis, e.g. from the
cmdline or as a config option.

-Peter


>  git-svn.perl |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/git-svn.perl b/git-svn.perl
> index 777e436..be62ee1 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -441,6 +441,12 @@ sub cmd_dcommit {
>  			}
>  			command_noisy(@finish, $gs->refname);
>  			$last_rev = $cmt_rev;
> +
> +			# rebase will have made the just-committed revisions
> +			# unreachable; over time that can build up lots of
> +			# loose objects in the repo. prune is unsafe to run
> +			# concurrently but so is dcommit.
> +			command_noisy(qw/gc --auto --prune/);
>  		}
>  	}
>  }
> -- 
> 1.5.3.4.203.gcc61a

* Re: [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
  2007-10-05  8:04 ` Andreas Ericsson
@ 2007-10-05  8:27   ` Johannes Schindelin
  0 siblings, 0 replies; 10+ messages in thread
From: Johannes Schindelin @ 2007-10-05  8:27 UTC
  To: Andreas Ericsson; +Cc: Steven Grimm, git

Hi,

On Fri, 5 Oct 2007, Andreas Ericsson wrote:

> Steven Grimm wrote:
> > git-svn dcommit, by virtue of rewriting history to insert svn revision IDs,
> > leaves old commits dangling.  Since dcommit is already unsafe to run
> > concurrently with other git commands, no additional risk is introduced
> > by making it prune those old objects as needed.
> > 
> > Signed-off-by: Steven Grimm <koreth@midwinter.com>
> > ---
> > 
> > This is in response to a colleague who complained that, after I
> > installed the latest git release, he was getting lots of "too many
> > unreachable loose objects" errors from the new "git gc --auto" run.
> > Those objects turned out to be dangling commits from a year's worth of
> > git-svn usage, since every git-svn commit will abandon at least one
> > existing commit in order to rewrite it with the svn version data.
> > 
> >  git-svn.perl |    6 ++++++
> >  1 files changed, 6 insertions(+), 0 deletions(-)
> > 
> > diff --git a/git-svn.perl b/git-svn.perl
> > index 777e436..be62ee1 100755
> > --- a/git-svn.perl
> > +++ b/git-svn.perl
> > @@ -441,6 +441,12 @@ sub cmd_dcommit {
> >  			}
> >  			command_noisy(@finish, $gs->refname);
> >  			$last_rev = $cmt_rev;
> > +
> > +			# rebase will have made the just-committed revisions
> > +			# unreachable; over time that can build up lots of
> > +			# loose objects in the repo. prune is unsafe to run
> > +			# concurrently but so is dcommit.
> > +			command_noisy(qw/gc --auto --prune/);
> >  		}
> >  	}
> >  }
> 
> I'd be surprised if this would ever prune anything, as git doesn't throw 
> out objects reachable by reflog (or, I assume, any of the objects 
> reachable from objects reachable from reflog).

It will, in due time.  Reflogs have an expiry date, and will be culled 
by git gc --auto.  So if you dcommit often (which I do), the objects will 
be pruned, eventually.
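
A minimal sketch of the effect under discussion, run in a throwaway
repository rather than a real git-svn checkout. A commit that has been
rewritten stays reachable through the reflog, so it only counts as
unreachable once reflog entries are ignored (or have expired). The
temporary repository, the demo identity and the variable names are
assumptions made for illustration, and "git commit --amend" merely
stands in for the rewrite that dcommit performs.

```shell
# Throwaway repository (hypothetical; not a real git-svn checkout).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Create a commit, then rewrite it so the original is abandoned.
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m first
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty --amend -m rewritten

# By default fsck treats reflog entries as reachability roots.
with_reflog=$(git fsck --unreachable 2>/dev/null \
    | grep -c '^unreachable commit' || true)

# Ignoring reflogs shows what is left once the entries have expired.
without_reflog=$(git fsck --unreachable --no-reflogs 2>/dev/null \
    | grep -c '^unreachable commit' || true)

echo "unreachable commits, counting reflogs:  $with_reflog"
echo "unreachable commits, ignoring reflogs: $without_reflog"
```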

Ciao,
Dscho

* Re: [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
  2007-10-05  8:21 ` Peter Baumann
@ 2007-10-05 16:12   ` Steven Grimm
  2007-10-05 16:15     ` [PATCH 3/2] Document the fact that git-svn now runs git-gc Steven Grimm
  2007-10-05 16:49     ` [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit Peter Baumann
  0 siblings, 2 replies; 10+ messages in thread
From: Steven Grimm @ 2007-10-05 16:12 UTC
  To: Peter Baumann; +Cc: git

Peter Baumann wrote:
> I don't like the automatic prune. What if someone has other objects in
> there which shouldn't be pruned? Making git svn dcommit do the prune
> would be surprising at the least, because how is one supposed to know that
> doing a commit into svn will prune all your precious objects?
>   

"git commit" already does garbage collection, so we've already set a 
precedent for a commit operation also doing some cleanup at the end. 
However, you're correct that this cleanup behavior (and the way to turn 
it off) should be documented so that there's some way to know about it. 
Doc patch forthcoming.

> Sure, I can understand where you are coming from, but I'd prefer
> if this could be specified on a case by case basis, e.g. from the
> cmdline or as a config option.
>   

This code (by virtue of only doing the prune if the "too many loose 
objects" test succeeds) will obey the existing gc.auto config option. So 
it's already possible to turn off as is. I'll note that in the doc patch.
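
For reference, a sketch of turning the automatic run off for a single
repository. The temporary repository is only there to keep the example
self-contained and is an assumption; in practice the "git config" line
would be run inside the git-svn checkout itself.

```shell
# Throwaway repository standing in for a git-svn checkout (assumption).
repo=$(mktemp -d)
git init -q "$repo"

# gc.auto = 0 disables the loose-object threshold check, so
# "git gc --auto" finds nothing to do in this repository.
git -C "$repo" config gc.auto 0

# Read the value back to confirm what is stored.
gc_auto=$(git -C "$repo" config --get gc.auto)
echo "gc.auto=$gc_auto"
```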

-Steve

* [PATCH 3/2] Document the fact that git-svn now runs git-gc
  2007-10-05 16:12   ` Steven Grimm
@ 2007-10-05 16:15     ` Steven Grimm
  2007-10-05 16:49     ` [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit Peter Baumann
  1 sibling, 0 replies; 10+ messages in thread
From: Steven Grimm @ 2007-10-05 16:15 UTC
  To: git

Signed-off-by: Steven Grimm <koreth@midwinter.com>
---
 Documentation/git-svn.txt |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-svn.txt b/Documentation/git-svn.txt
index e157c6a..26f0f39 100644
--- a/Documentation/git-svn.txt
+++ b/Documentation/git-svn.txt
@@ -125,7 +125,15 @@ and have no uncommitted changes.
 	alternative to HEAD.
 	This is advantageous over 'set-tree' (below) because it produces
 	cleaner, more linear history.
-+
+
+When the commit is finished, gitlink:git-gc[1] is run with the
+`--prune` and `--auto` options to clean up the git object database,
+including removing old unreachable objects (some of which are
+created by the process of committing to SVN.) Set the `gc.auto`
+config option to 0 if you don't want your repository to be cleaned,
+e.g., because you are intentionally keeping unreachable objects in
+your repository.
+
 --no-rebase;;
 	After committing, do not rebase or reset.
 --
-- 
1.5.3.4.203.gcc61a

* Re: [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
  2007-10-05 16:12   ` Steven Grimm
  2007-10-05 16:15     ` [PATCH 3/2] Document the fact that git-svn now runs git-gc Steven Grimm
@ 2007-10-05 16:49     ` Peter Baumann
  2007-10-05 17:48       ` Steven Grimm
  1 sibling, 1 reply; 10+ messages in thread
From: Peter Baumann @ 2007-10-05 16:49 UTC
  To: Steven Grimm; +Cc: git

On Fri, Oct 05, 2007 at 09:12:05AM -0700, Steven Grimm wrote:
> Peter Baumann wrote:
>> I don't like the automatic prune. What if someone has other objects in
>> there which shouldn't be pruned? Making git svn dcommit do the prune
>> would be surprising at the least, because how is one supposed to know that
>> doing a commit into svn will prune all your precious objects?
>>   
>
> "git commit" already does garbage collection, so we've already set a 
> precedent for a commit operation also doing some cleanup at the end. 
> However, you're correct that this cleanup behavior (and the way to turn it 
> off) should be documented so that there's some way to know about it. Doc 
> patch forthcoming.
>

That's new to me. Glancing over git-commit.sh, I could only find a
'git-gc --auto', but no prune. I am not against doing a 'git gc --auto',
but I am against the --prune, because this could make shared
repositories nonfunctional.

-Peter

* Re: [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
  2007-10-05 16:49     ` [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit Peter Baumann
@ 2007-10-05 17:48       ` Steven Grimm
  2007-10-06  8:15         ` Peter Baumann
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Grimm @ 2007-10-05 17:48 UTC
  To: Peter Baumann; +Cc: git

Peter Baumann wrote:
> That's new to me. Glancing over git-commit.sh, I could only find a
> 'git-gc --auto', but no prune. I am not against doing a 'git gc --auto',
> but I am against the --prune, because this could make shared
> repositories nonfunctional.
>   

Does anyone run "git svn dcommit" from a shared repository? That is the 
only command that will trigger this code path.

Given that you lose all the svn metadata if you do "git clone" (or "git 
clone -s") on a git-svn-managed repository, it's not clear to me that 
anyone would ever be bitten by this. Counterexamples welcome, of course.

How would you feel about a separate config option to specifically enable 
auto-pruning, and having "git svn clone" set that option by default? 
Presumably anyone who is setting up a shared git-svn repository will be 
up to the task of disabling the option.

-Steve

* Re: [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
  2007-10-05  0:15 [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit Steven Grimm
  2007-10-05  8:04 ` Andreas Ericsson
  2007-10-05  8:21 ` Peter Baumann
@ 2007-10-05 23:54 ` Eric Wong
  2 siblings, 0 replies; 10+ messages in thread
From: Eric Wong @ 2007-10-05 23:54 UTC
  To: Steven Grimm; +Cc: git

Steven Grimm <koreth@midwinter.com> wrote:
> git-svn dcommit, by virtue of rewriting history to insert svn revision IDs,
> leaves old commits dangling.  Since dcommit is already unsafe to run
> concurrently with other git commands, no additional risk is introduced
> by making it prune those old objects as needed.
> 
> Signed-off-by: Steven Grimm <koreth@midwinter.com>
> ---
> 
> This is in response to a colleague who complained that, after I
> installed the latest git release, he was getting lots of "too many
> unreachable loose objects" errors from the new "git gc --auto" run.
> Those objects turned out to be dangling commits from a year's worth of
> git-svn usage, since every git-svn commit will abandon at least one
> existing commit in order to rewrite it with the svn version data.

I'm not a fan of automatic gc in general, but I understand it can
help new users.  So as long as clueful users can easily disable it,
then it's fine by me...

>  git-svn.perl |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/git-svn.perl b/git-svn.perl
> index 777e436..be62ee1 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -441,6 +441,12 @@ sub cmd_dcommit {
>  			}
>  			command_noisy(@finish, $gs->refname);
>  			$last_rev = $cmt_rev;
> +
> +			# rebase will have made the just-committed revisions
> +			# unreachable; over time that can build up lots of
> +			# loose objects in the repo. prune is unsafe to run
> +			# concurrently but so is dcommit.
> +			command_noisy(qw/gc --auto --prune/);
>  		}
>  	}
>  }

This is better called outside of this loop.  We now do a rebase after
every revision committed (which gets us even more dangling commits);
but we only want to call git-gc after everything is committed.

It'll be faster since git-gc is only invoked once, and if git-gc takes a
very long time to repack, we won't have to worry about timing out a SVN
network connection.  It'll also reduce the window for somebody else to
commit a conflicting change that'll cause dcommit to fail midway
through.


As far as Peter's concerns for shared repositories go, I'm not sure...

I've never been comfortable with shared repositories myself (even in a
pure git environment without git-svn) and always just preferred using
full clones or copies[1] myself so I could rm -r any working directory
and not worry about any other repositories relying on it.

[1] - I usually go about using cp -al + libflcow :)

-- 
Eric Wong

* Re: [PATCH 2/2] Run garbage collection with loose object pruning after svn dcommit
  2007-10-05 17:48       ` Steven Grimm
@ 2007-10-06  8:15         ` Peter Baumann
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Baumann @ 2007-10-06  8:15 UTC
  To: Steven Grimm; +Cc: Eric Wong, git

On Fri, Oct 05, 2007 at 10:48:29AM -0700, Steven Grimm wrote:
> Peter Baumann wrote:
>> That's new to me. Glancing over git-commit.sh, I could only find a
>> 'git-gc --auto', but no prune. I am not against doing a 'git gc --auto',
>> but I am against the --prune, because this could make shared
>> repositories nonfunctional.
>>   
>
> Does anyone run "git svn dcommit" from a shared repository? That is the 
> only command that will trigger this code path.
>
> Given that you lose all the svn metadata if you do "git clone" (or "git 
> clone -s") on a git-svn-managed repository, it's not clear to me that 
> anyone would ever be bitten by this. Counterexamples welcome, of course.
>
> How would you feel about a separate config option to specifically enable 
> auto-pruning, and having "git svn clone" set that option by default? 
> Presumably anyone who is setting up a shared git-svn repository will be up 
> to the task of disabling the option.
>

Sorry, I looked at 'git commit' (as you said in your mail) and not
'git-svn dcommit'. Looking at git-svn now, I can see that a git-repack
is only done if the user *explicitly* asked for it on the cmdline by
specifying --repack. For this repack run, the default parameters include
-d and no --prune, so I do not think that we do a --prune run unless
we were _explicitly_ asked for it. As I said, I am totally fine with
doing a 'git-gc --auto', but I am a little worried about the --prune.

We advertise everywhere that GIT only adds new content/objects/data to the
repository and *never* deletes anything in the repo by itself, and now you
want to do a --prune, which obviously *does* delete data behind the user's
back in a dcommit/fetch operation, and no one would expect these commands
to have anything to do with deleting data. This worries me.
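
To make the concern concrete, here is a sketch of how one could list
what a prune would delete without deleting anything, using the dry-run
mode of "git prune". The temporary repository, the sample blob and the
variable names are assumptions for illustration only; "--expire=now" is
given explicitly so that the age of the object plays no role.

```shell
# Throwaway repository (hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Write a loose blob that no ref, index entry or reflog points at.
blob=$(echo "precious but unreferenced" | git hash-object -w --stdin)

# -n only reports the objects that would be removed; nothing is deleted.
would_prune=$(git prune -n --expire=now)
echo "$would_prune"

# The object is still present after the dry run.
blob_type=$(git cat-file -t "$blob")
echo "still present as: $blob_type"
```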

-Peter
