git@vger.kernel.org mailing list mirror (one of many)
* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
@ 2013-05-01  5:09 Ilya Basin
  2013-05-01  8:31 ` Re[2]: " Ilya Basin
  0 siblings, 1 reply; 18+ messages in thread
From: Ilya Basin @ 2013-05-01  5:09 UTC (permalink / raw)
  To: Git mailing list; +Cc: Ray Chen, Eric Wong

IB> +       return undef if (!keys $self->{_save_ph});
The correct line is: return undef if (!keys %{$self->{_save_ph}});

In my repo the placeholders change too often (in 1/4 commits). I'm
thinking of using:
'git config --unset "svn-remote.$repo_id.added-placeholder" path_regex'
instead of full rewrite.
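
Since the last argument to 'git config --unset' is matched as a regular
expression rather than a literal string, a path would first need its
metacharacters escaped. A minimal sketch, assuming the pattern uses POSIX
extended regex syntax; escape_ere is a hypothetical helper name, not
something git or git-svn provides:

```shell
#!/bin/sh
# escape_ere: hypothetical helper. Prints its argument with extended-regex
# metacharacters backslash-escaped so that it matches literally.
escape_ere() {
    printf '%s\n' "$1" | sed 's/[][\\.*^$+?(){}|]/\\&/g'
}

# Intended use (not run here; repo_id and path are illustrative values):
#   repo_id=svn
#   path='branches/1.0/empty.dir'
#   git config --unset "svn-remote.$repo_id.added-placeholder" \
#       "^$(escape_ere "$path")\$"
```

Anchoring the escaped path with ^ and $ keeps one path from also matching
longer paths that merely contain it.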


* Re[2]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-01  5:09 [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list() Ilya Basin
@ 2013-05-01  8:31 ` Ilya Basin
  2013-05-01 17:09   ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Ilya Basin @ 2013-05-01  8:31 UTC (permalink / raw)
  To: Git mailing list; +Cc: Junio C Hamano, Ray Chen, Eric Wong

IB> In my repo the placeholders change too often (in 1/4 commits). I'm
IB> thinking of using:
IB> 'git config --unset "svn-remote.$repo_id.added-placeholder" path_regex'
IB> instead of full rewrite.

I need your help. There are still problems:

    $ grep "define MAX_MATCHES" ~/builds/git/git-git/config.c
    #define MAX_MATCHES 8192

    $ grep added-placeholder .git/config | wc -l
    4430

One in four commits changes the list of placeholders, and usually only
one folder changes.
Clearing and re-adding the entries to the config takes ~1 minute.
Pressing Ctrl-C during that time leaves the list incomplete.

Re-adding all entries using 'config --add' is slow.
Does Git::config package have tools to modify multiple entries at once?
I wonder why 'git config --get-all' is used instead of some
Git::config routine.

Otherwise, to make this atomic, I think, the modification should be made
to a backup config file, then it should replace .git/config (or
rewrite it with signals blocked).
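
The write-then-replace idea can be sketched in plain shell. This is only an
illustration under assumptions: replace_file_atomically is a hypothetical
helper, and it relies on the temporary file living in the same directory
(hence the same filesystem) as the target so that the rename is atomic:

```shell
#!/bin/sh
# replace_file_atomically: hypothetical helper. Reads the new contents from
# stdin into a temporary file next to the target, then renames it over the
# target. A reader (or a Ctrl-C) sees either the old file or the complete
# new one, never a half-written list.
replace_file_atomically() {
    target=$1
    tmp=$(mktemp "$target.XXXXXX") || return 1
    # Remove the temporary file if we are interrupted before the rename.
    trap 'rm -f "$tmp"; exit 130' INT TERM
    if cat >"$tmp" && mv -f "$tmp" "$target"; then
        trap - INT TERM
        return 0
    fi
    rm -f "$tmp"
    trap - INT TERM
    return 1
}
```

Note that mktemp creates the file with restrictive permissions, so a real
implementation would also have to carry over the mode of the original file.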

How can I determine GIT_DIR from Fetcher.pm?

Maybe I can simply append a duplicate section
'[svn-remote "svn"]'. But then I would need to escape the values
myself.

Also, 'git config --unset-all' leaves one empty section: '[svn-remote "svn"]'.
Is it a bug?

-- 


* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-01  8:31 ` Re[2]: " Ilya Basin
@ 2013-05-01 17:09   ` Junio C Hamano
  2013-05-01 19:51     ` Re[2]: " Ilya Basin
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2013-05-01 17:09 UTC (permalink / raw)
  To: Ilya Basin; +Cc: Git mailing list, Ray Chen, Eric Wong

Ilya Basin <basinilya@gmail.com> writes:

> IB> In my repo the placeholders change too often (in 1/4 commits). I'm
> IB> thinking of using:
> IB> 'git config --unset "svn-remote.$repo_id.added-placeholder" path_regex'
> IB> instead of full rewrite.
>
> I need your help. There are still problems:
>
>     $ grep "define MAX_MATCHES" ~/builds/git/git-git/config.c
>     #define MAX_MATCHES 8192
>
>     $ grep added-placeholder .git/config | wc -l
>     4430
>
> 1/4 commits change the list of placeholders, usually 1 folder changes.
> Clearing and re-adding the entries to the config takes ~1 minute.

While I agree both "git config"'s external interface and internal
implementation are not suited for bulk update, I have a suspicion
that the config mechanism is not the right place to store this
information in the first place.  The config is a per-Git-repository
state that is not versioned, which means it is applicable regardless
of individual commits or trees (also it means it is designed not to
be shared across repositories).  But "You may see a file here that
otherwise should not be there only to mark that there should be an
empty directory" is an attribute to a particular tree, isn't it?

If you have a branch that git-svn adds a placeholder file (hence you
want to annotate that tree with "This directory is there only to
hold the placeholder file") and you want to perform a merge on the
Git side of that branch with another Git branch that does have real
contents in that directory, you would want the result to say "This
directory no longer is just for a placeholder", but you cannot say
that globally by updating the config file, as the config mechanism
is also applied to the original branch that came from git-svn, in
which the directory in question is still only to hold the placeholder
file.

A Subversion-only history does not have a reason to have .gitignore
file tracked in it; wouldn't a cleaner implementation be to consider a
directory that has .gitignore and nothing else as marked with "added
placeholder", without (ab)using the config mechanism?  If you are
worried about a corner case where the Subversion side adds the file,
even though it is not used there, probably you can add a single
comment line "# added by git-svn only to keep the directory" and
consider a directory that has nothing but .gitignore that consists
of only that exact comment line an "added placeholder" directory to
work it around.  Either approach would tie the information to the
tree state, which sounds like a much more correct approach to the
"keep empty directory" problem to me.
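
As a rough illustration of that check, here is a sketch that works on a
checked-out directory. It is an assumption-laden stand-in: git-svn itself
would more likely inspect tree objects, and is_added_placeholder_dir is a
hypothetical name:

```shell
#!/bin/sh
# The exact marker line suggested above.
PLACEHOLDER_MARKER='# added by git-svn only to keep the directory'

# is_added_placeholder_dir: hypothetical helper. Succeeds only if the
# directory contains exactly one entry, a regular .gitignore file, whose
# content is the marker line (trailing newlines are ignored).
is_added_placeholder_dir() {
    dir=$1
    [ -f "$dir/.gitignore" ] || return 1
    # List every entry, including dotfiles, except "." and "..".
    entries=$(ls -A "$dir") || return 1
    [ "$entries" = ".gitignore" ] || return 1
    [ "$(cat "$dir/.gitignore")" = "$PLACEHOLDER_MARKER" ]
}
```

A directory whose .gitignore holds real ignore patterns, or that has any
other entry, is not treated as a placeholder.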


* Re[2]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-01 17:09   ` Junio C Hamano
@ 2013-05-01 19:51     ` Ilya Basin
  2013-05-01 21:30       ` Eric Wong
  0 siblings, 1 reply; 18+ messages in thread
From: Ilya Basin @ 2013-05-01 19:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git mailing list, Ray Chen, Eric Wong

JCH> ...and you want to perform a merge on the
JCH> Git side of that branch with another Git branch that does have real
JCH> contents in that directory, you would want the result to say "This
JCH> directory no longer is just for a placeholder", but you cannot say
JCH> that globally by updating the config file
Placeholder files are managed when fetching from SVN. SVN doesn't
support Git-like merges.

JCH> comment line "# added by git-svn only to keep the directory" and
JCH> consider a directory that has nothing but .gitignore that consists
JCH> of only that exact comment line an "added placeholder" directory to
JCH> work it around.
Sounds good, but it's not I who decided to use the config file.


-- 


* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-01 19:51     ` Re[2]: " Ilya Basin
@ 2013-05-01 21:30       ` Eric Wong
  2013-05-01 21:53         ` Junio C Hamano
  2013-05-02  3:51         ` Re[2]: " Ilya Basin
  0 siblings, 2 replies; 18+ messages in thread
From: Eric Wong @ 2013-05-01 21:30 UTC (permalink / raw)
  To: Ilya Basin; +Cc: Junio C Hamano, Git mailing list, Ray Chen

Ilya Basin <basinilya@gmail.com> wrote:
> JCH> comment line "# added by git-svn only to keep the directory" and
> JCH> consider a directory that has nothing but .gitignore that consists
> JCH> of only that exact comment line an "added placeholder" directory to
> JCH> work it around.
> Sounds good, but it's not I who decided to use the config file.

Ugh, I didn't review Ray's original commit closely enough to notice
this :x

Perhaps we should migrate users to use YAML storage for this, instead
(we already use YAML for Git::SVN::Memoize::YAML).


Fwiw, I've never been a fan of placeholders; I only accepted the feature
since it's off by default and it worked well enough for Ray.

My personal philosophy has always been: git svn users should leave
no trace or indication they're using a non-standard SVN client.


* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-01 21:30       ` Eric Wong
@ 2013-05-01 21:53         ` Junio C Hamano
  2013-05-02  2:49           ` Eric Wong
  2013-05-02  3:51         ` Re[2]: " Ilya Basin
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2013-05-01 21:53 UTC (permalink / raw)
  To: Eric Wong; +Cc: Ilya Basin, Git mailing list, Ray Chen

Eric Wong <normalperson@yhbt.net> writes:

> Ilya Basin <basinilya@gmail.com> wrote:
>> JCH> comment line "# added by git-svn only to keep the directory" and
>> JCH> consider a directory that has nothing but .gitignore that consists
>> JCH> of only that exact comment line an "added placeholder" directory to
>> JCH> work it around.
>> Sounds good, but it's not I who decided to use the config file.
>
> Ugh, I didn't review Ray's original commit closely enough to notice
> this :x
>
> Perhaps we should migrate users to use YAML storage for this, instead
> (we already use YAML for Git::SVN::Memoize::YAML).

But does it solve the impedance mismatch between "per tree"
information and "per project" information?  Unless you key the
information not just with path but also with revision or tree object
name, use of YAML vs config would not make a difference in the
semantics, I am afraid.
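
To make the keying concrete, here is a small sketch in which each record
carries a revision as well as a path. The file layout and the helper names
mark_placeholder and is_placeholder_at are assumptions for illustration, not
an existing git-svn format:

```shell
#!/bin/sh
# Each line of the list file is "<revision><TAB><path>".
mark_placeholder() {    # usage: mark_placeholder LIST REV PATH
    printf '%s\t%s\n' "$2" "$3" >>"$1"
}

is_placeholder_at() {   # usage: is_placeholder_at LIST REV PATH
    # -F: fixed string, -x: the whole line must match, -q: quiet.
    grep -Fxq -- "$(printf '%s\t%s' "$2" "$3")" "$1"
}
```

With this layout the same path can be a placeholder at one revision and a
real file at another, which a path-only list cannot express.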

I am reading the placeholder-added flag as: "This .gitignore file
does not exist in the Subversion original; it is there only so that
we can keep the otherwise empty directory in the checkout, and it
should not be pushed back to the Subversion side".  Am I mistaken?

That however is not a property of the directory containing it (or
the path to that .gitignore file) that is valid throughout the
history of the project.  It is a property of a specific tree object
(or you could say it is a property of the revision).  When at some
point in the history the upstream project adds .gitignore there
because many people use git-svn to contribute to their project, it
stops being "should not be pushed back".

So it seems to me that the information this "placeholder added"
thing wants to express belongs to the tree object (and .gitignore
file itself is a natural place to have that information).

> Fwiw, I've never been a fan of placeholders only accepted it since it's
> off-by-default but it worked well enough for Ray.
>
> My personal philosophy has always been: git svn users should leave
> no trace or indication they're using a non-standard SVN client.


* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-01 21:53         ` Junio C Hamano
@ 2013-05-02  2:49           ` Eric Wong
  2013-05-02 17:31             ` Re[2]: " Ilya Basin
  2013-05-02 18:59             ` Ray Chen
  0 siblings, 2 replies; 18+ messages in thread
From: Eric Wong @ 2013-05-02  2:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ilya Basin, Git mailing list, Ray Chen

Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <normalperson@yhbt.net> writes:
> > Ilya Basin <basinilya@gmail.com> wrote:
> >> JCH> comment line "# added by git-svn only to keep the directory" and
> >> JCH> consider a directory that has nothing but .gitignore that consists
> >> JCH> of only that exact comment line an "added placeholder" directory to
> >> JCH> work it around.
> >> Sounds good, but it's not I who decided to use the config file.
> >
> > Ugh, I didn't review Ray's original commit closely enough to notice
> > this :x
> >
> > Perhaps we should migrate users to use YAML storage for this, instead
> > (we already use YAML for Git::SVN::Memoize::YAML).
> 
> But does it solve the impedance mismatch between "per tree"
> information and "per project" information?  Unless you key the
> information not just with path but also with revision or tree object
> name, use of YAML vs config would not make a difference in the
> semantics, I am afraid.

No, it doesn't solve the impedance mismatch, but YAML storage would
be more flexible than the git config file.

> I am reading the placeholder-added flag as: "This .gitignore file
> does not exist in the Subversion original; it is there only so that
> we can keep the otherwise empty directory in the checkout, and it
> should not be pushed back to the Subversion side".  Am I mistaken?

You're right, I had forgotten this feature completely :x

> That however is not a property of the directory containing it (or
> the path to that .gitignore file) that is valid throughout the
> history of the project.  It is a property of a specific tree object
> (or you could say it is a property of the revision).  When at some
> point in the history the upstream project adds .gitignore there
> because many people use git-svn to contribute to their project, it
> stops being "should not be pushed back".
> 
> So it seems to me that the information this "placeholder added"
> thing wants to express belongs to the tree object (and .gitignore
> file itself is a natural place to have that information).

Perhaps that was the better way to go...

How would (the presumably few) existing users of this feature be
affected?

Currently, with the config file, there are problems with interop between
git-svn users that do git <-> git repo sharing.  An updated version with
the "placeholder added" .gitignore would allow git <-> git repo sharing,
but only between users of newer git versions.  Perhaps that's fine and
better than the current situation.


* Re[2]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-01 21:30       ` Eric Wong
  2013-05-01 21:53         ` Junio C Hamano
@ 2013-05-02  3:51         ` Ilya Basin
  2013-05-02 20:09           ` Eric Wong
  1 sibling, 1 reply; 18+ messages in thread
From: Ilya Basin @ 2013-05-02  3:51 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Git mailing list, Ray Chen

EW> My personal philosophy has always been: git svn users should leave
EW> no trace or indication they're using a non-standard SVN client.
Placeholders aren't pushed back to svn.

-- 


* Re[2]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-02  2:49           ` Eric Wong
@ 2013-05-02 17:31             ` Ilya Basin
  2013-05-02 20:40               ` Eric Wong
  2013-05-06  8:58               ` Re[3]: " Ilya Basin
  2013-05-02 18:59             ` Ray Chen
  1 sibling, 2 replies; 18+ messages in thread
From: Ilya Basin @ 2013-05-02 17:31 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Git mailing list, Ray Chen

Hi. I won't send you updated patches until I import and test my huge
repo. Everything will be here:
https://github.com/basinilya/git/commits/v1.8.2.2-git-svn-fixes

At the moment I've decided not to implement Junio's proposal:
> >> JCH> comment line "# added by git-svn only to keep the directory" and
> >> JCH> consider a directory that has nothing but .gitignore that consists
> >> JCH> of only that exact comment line an "added placeholder" directory to
> >> JCH> work it around.

But the config file is not an option either: I have 400 tags, each has
200 empty folders.

Instead I decided to store the paths in a text file (see
https://github.com/basinilya/git/commit/a961aedd81cb8676a52cfe71ccb6eba0f9e64b90 ).
I'm not planning to push this change to you.
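
For readers who do not follow the link, a one-path-per-line store could look
roughly like the sketch below. This is not the code from that commit, whose
format may differ; placeholder_add and placeholder_remove are hypothetical
helpers, and paths containing newlines are not handled:

```shell
#!/bin/sh
placeholder_add() {     # usage: placeholder_add LIST PATH
    list=$1
    touch "$list" || return 1
    # Append only if the exact line is not already present.
    grep -Fxq -- "$2" "$list" || printf '%s\n' "$2" >>"$list"
}

placeholder_remove() {  # usage: placeholder_remove LIST PATH
    list=$1
    [ -f "$list" ] || return 0
    tmp=$(mktemp "$list.XXXXXX") || return 1
    # grep -v exits 1 when no lines remain; only a status above 1 is an error.
    grep -Fxv -- "$2" "$list" >"$tmp" || [ $? -eq 1 ] || { rm -f "$tmp"; return 1; }
    # Rename the filtered copy over the list in one step.
    mv -f "$tmp" "$list"
}
```

Changing one folder then costs one append or one filtered copy of the file,
rather than one 'git config' invocation per entry.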

The last error I encountered is:
r7009 = 39805bb078983e34f2fc8d2c8c02d695d00d11c0 (refs/remotes/DMC4_Basic)
Too many open files: Can't open file '/home/il/builds/sicap/gitsvn/prd_dmc4.svn/db/revs/0/786': Too many open files at /.snapshots/persist/builds/git/git-git/perl/blib/lib/Git/SVN/Ra.pm line 282.

I think it's unrelated to empty dirs.


* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-02  2:49           ` Eric Wong
  2013-05-02 17:31             ` Re[2]: " Ilya Basin
@ 2013-05-02 18:59             ` Ray Chen
  1 sibling, 0 replies; 18+ messages in thread
From: Ray Chen @ 2013-05-02 18:59 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Ilya Basin, Git mailing list

On Wed, May 1, 2013 at 10:49 PM, Eric Wong <normalperson@yhbt.net> wrote:
> Junio C Hamano <gitster@pobox.com> wrote:
>
>> Eric Wong <normalperson@yhbt.net> writes:
>
>> That however is not a property of the directory containing it (or
>> the path to that .gitignore file) that is valid throughout the
>> history of the project.  It is a property of a specific tree object
>> (or you could say it is a property of the revision).  When at some
>> point in the history the upstream project adds .gitignore there
>> because many people use git-svn to contribute to their project, it
>> stops being "should not be pushed back".
>>
>> So it seems to me that the information this "placeholder added"
>> thing wants to express belongs to the tree object (and .gitignore
>> file itself is a natural place to have that information).
>
> Perhaps that was the better way to go...
>
> How would (the presumably few) existing users of this feature be
> affected?
>
> Currently, with the config file, there are problems with interop between
> git-svn users that do git <-> git repo sharing.  An updated version with
> the "placeholder added" .gitignore would allow git <-> git repo sharing,
> but only between users of newer git versions.  Perhaps that's fine and
> better than the current situation.

The original patch was geared towards increasing the fidelity of a
one-time svn->git migration (ie. where svn won't be used anymore).  I
recall investigating a method to enforce this by disallowing future
git-svn fetches, but I can't remember if I was successful.  Given this
perspective, I'm not sure that existing users need to be supported.

Then, as Junio mentions, future versions of git that store placeholder
info in the tree/file object could open the possibility of proper
git<->git sharing and resync with the original svn repo.

- Ray


* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-02  3:51         ` Re[2]: " Ilya Basin
@ 2013-05-02 20:09           ` Eric Wong
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Wong @ 2013-05-02 20:09 UTC (permalink / raw)
  To: Ilya Basin; +Cc: Junio C Hamano, Git mailing list, Ray Chen

Ilya Basin <basinilya@gmail.com> wrote:
> EW> My personal philosophy has always been: git svn users should leave
> EW> no trace or indication they're using a non-standard SVN client.
> 
> Placeholders aren't pushed back to svn.

Right, I was confused, as I often am :x


* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-02 17:31             ` Re[2]: " Ilya Basin
@ 2013-05-02 20:40               ` Eric Wong
  2013-05-03  5:26                 ` Re[2]: " Ilya Basin
  2013-05-06  8:58               ` Re[3]: " Ilya Basin
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Wong @ 2013-05-02 20:40 UTC (permalink / raw)
  To: Ilya Basin; +Cc: Junio C Hamano, Git mailing list, Ray Chen

Ilya Basin <basinilya@gmail.com> wrote:
> Hi. I won't send you updated patches until I import and test my huge
> repo. Everything will be here:
> https://github.com/basinilya/git/commits/v1.8.2.2-git-svn-fixes
> 
> At the moment I've decided not to implement Junio's proposal:
> > >> JCH> comment line "# added by git-svn only to keep the directory" and
> > >> JCH> consider a directory that has nothing but .gitignore that consists
> > >> JCH> of only that exact comment line an "added placeholder" directory to
> > >> JCH> work it around.
> 
> But the config file is not an option either: I have 400 tags, each has
> 200 empty folders.
> 
> Instead I decided to store the paths in a text file (see
> https://github.com/basinilya/git/commit/a961aedd81cb8676a52cfe71ccb6eba0f9e64b90 ).
> I'm not planning to push this change to you.
> 
> The last error I encountered is:
> r7009 = 39805bb078983e34f2fc8d2c8c02d695d00d11c0 (refs/remotes/DMC4_Basic)
> Too many open files: Can't open file '/home/il/builds/sicap/gitsvn/prd_dmc4.svn/db/revs/0/786': Too many open files at /.snapshots/persist/builds/git/git-git/perl/blib/lib/Git/SVN/Ra.pm line 282.
> 
> I think it's unrelated to empty dirs.

Can you get an lsof on the git-svn process right before this?
What's your open files limit?
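
Both numbers can be gathered with a few lines of shell. The sketch below
assumes a Linux-style /proc and falls back to lsof otherwise;
count_open_fds and open_files_limit are hypothetical helper names:

```shell
#!/bin/sh
# count_open_fds: hypothetical helper. Prints how many file descriptors the
# given process currently holds.
count_open_fds() {
    pid=$1
    if [ -d "/proc/$pid/fd" ]; then
        ls "/proc/$pid/fd" | wc -l | tr -d ' '
    else
        # Fallback: skip the lsof header line. lsof also lists entries such
        # as cwd and mapped files, so this count is an over-estimate.
        lsof -p "$pid" | tail -n +2 | wc -l | tr -d ' '
    fi
}

# open_files_limit: prints the soft limit on open files for this shell.
open_files_limit() {
    ulimit -n
}
```

Passing the PID of the running git-svn process to count_open_fds and
comparing the result with open_files_limit shows how close the import is to
the limit.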


* Re[2]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-02 20:40               ` Eric Wong
@ 2013-05-03  5:26                 ` Ilya Basin
  2013-05-03  6:42                   ` Re[3]: " Ilya Basin
  0 siblings, 1 reply; 18+ messages in thread
From: Ilya Basin @ 2013-05-03  5:26 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Git mailing list, Ray Chen

EW> Ilya Basin <basinilya@gmail.com> wrote:
>> Hi. I won't send you updated patches until I import and test my huge
>> repo. Everything will be here:
>> https://github.com/basinilya/git/commits/v1.8.2.2-git-svn-fixes
>> 
>> At the moment I've decided not to implement Junio's proposal:
>> > >> JCH> comment line "# added by git-svn only to keep the directory" and
>> > >> JCH> consider a directory that has nothing but .gitignore that consists
>> > >> JCH> of only that exact comment line an "added placeholder" directory to
>> > >> JCH> work it around.
>> 
>> But the config file is not an option either: I have 400 tags, each has
>> 200 empty folders.
>> 
>> Instead I decided to store the paths in a text file (see
>> https://github.com/basinilya/git/commit/a961aedd81cb8676a52cfe71ccb6eba0f9e64b90 ).
>> I'm not planning to push this change to you.
>> 
>> The last error I encountered is:
>> r7009 = 39805bb078983e34f2fc8d2c8c02d695d00d11c0 (refs/remotes/DMC4_Basic)
>> Too many open files: Can't open file '/home/il/builds/sicap/gitsvn/prd_dmc4.svn/db/revs/0/786': Too many open files at /.snapshots/persist/builds/git/git-git/perl/blib/lib/Git/SVN/Ra.pm line 282.
>> 
>> I think it's unrelated to empty dirs.

EW> Can you get an lsof on the git-svn process right before this?
    /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/A4O_OTQxWc
    /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/LfpcENJduN
    /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/Dkk7pN4Mpz
    etc.

EW> What's your open files limit?
1024

-- 


* Re[3]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-03  5:26                 ` Re[2]: " Ilya Basin
@ 2013-05-03  6:42                   ` Ilya Basin
  2013-05-06  8:14                     ` Re[4]: " Ilya Basin
  0 siblings, 1 reply; 18+ messages in thread
From: Ilya Basin @ 2013-05-03  6:42 UTC (permalink / raw)
  To: Ilya Basin; +Cc: Eric Wong, Junio C Hamano, Git mailing list, Ray Chen

EW>> Ilya Basin <basinilya@gmail.com> wrote:
>>> Hi. I won't send you updated patches until I import and test my huge
>>> repo. Everything will be here:
>>> https://github.com/basinilya/git/commits/v1.8.2.2-git-svn-fixes
>>> 
>>> At the moment I've decided not to implement Junio's proposal:
>>> > >> JCH> comment line "# added by git-svn only to keep the directory" and
>>> > >> JCH> consider a directory that has nothing but .gitignore that consists
>>> > >> JCH> of only that exact comment line an "added placeholder" directory to
>>> > >> JCH> work it around.
>>> 
>>> But the config file is not an option either: I have 400 tags, each has
>>> 200 empty folders.
>>> 
>>> Instead I decided to store the paths in a text file (see
>>> https://github.com/basinilya/git/commit/a961aedd81cb8676a52cfe71ccb6eba0f9e64b90 ).
>>> I'm not planning to push this change to you.
>>> 
>>> The last error I encountered is:
>>> r7009 = 39805bb078983e34f2fc8d2c8c02d695d00d11c0 (refs/remotes/DMC4_Basic)
>>> Too many open files: Can't open file '/home/il/builds/sicap/gitsvn/prd_dmc4.svn/db/revs/0/786': Too many open files at /.snapshots/persist/builds/git/git-git/perl/blib/lib/Git/SVN/Ra.pm line 282.
>>> 
>>> I think it's unrelated to empty dirs.

EW>> Can you get an lsof on the git-svn process right before this?
IB>     /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/A4O_OTQxWc
IB>     /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/LfpcENJduN
IB>     /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/Dkk7pN4Mpz
IB>     etc.

EW>> What's your open files limit?
IB> 1024

Why no call to close() from temp_release() in Git.pm?


-- 


* Re[4]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-03  6:42                   ` Re[3]: " Ilya Basin
@ 2013-05-06  8:14                     ` Ilya Basin
  0 siblings, 0 replies; 18+ messages in thread
From: Ilya Basin @ 2013-05-06  8:14 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Git mailing list, Ray Chen

>>>> The last error I encountered is:
>>>> r7009 = 39805bb078983e34f2fc8d2c8c02d695d00d11c0 (refs/remotes/DMC4_Basic)
>>>> Too many open files: Can't open file '/home/il/builds/sicap/gitsvn/prd_dmc4.svn/db/revs/0/786': Too many open files at /.snapshots/persist/builds/git/git-git/perl/blib/lib/Git/SVN/Ra.pm line 282.
>>>> 
>>>> I think it's unrelated to empty dirs.

EW>>> Can you get an lsof on the git-svn process right before this?
IB>>     /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/A4O_OTQxWc
IB>>     /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/LfpcENJduN
IB>>     /.snapshots/persist/builds/sicap/gitsvn/aaa/.git/Dkk7pN4Mpz
IB>>     etc.

EW>>> What's your open files limit?
IB>> 1024

IB> Why no call to close() from temp_release() in Git.pm?

Found, fixed. It was related to empty dirs.



-- 


* Re[3]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-02 17:31             ` Re[2]: " Ilya Basin
  2013-05-02 20:40               ` Eric Wong
@ 2013-05-06  8:58               ` Ilya Basin
  2013-05-09  1:05                 ` Eric Wong
  2013-05-28 12:57                 ` Re[4]: " Ilya Basin
  1 sibling, 2 replies; 18+ messages in thread
From: Ilya Basin @ 2013-05-06  8:58 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Git mailing list, Ray Chen

Hi Eric. I'm out of spare time and I'm still unable to import my repo.
The code of SVN.pm is too complex. Please help me.
Here's the list of my issues:

* I think git-svn doesn't handle the case when a tag is deleted.
  I expected it to rename the ref from "tags/tagname" to
  "tags/tagname@rev", but that doesn't happen.
  If a tag is replaced, there's no way to tell what the previous
  state of that tag was: git-svn just rewrites the ref.
  By contrast, the temporary refs (with the "@rev" suffix) used to
  re-import subdir tags are kept after a successful re-import, although
  they are no longer used.

* As I said already, I have 25k revisions and 200 tags created from
  subdirs in trunk. This increases the import time from 2h to 12h.
  I could bear it if it had to be done only once, but fetching a new
  revision may cause a re-import of all 25k revisions too.
  You should implement some mechanism to find the parent branches of
  subdir tags. Maybe the unused refs I mentioned in the previous issue
  are good candidates for that, but I would name them differently
  to distinguish them from deleted/replaced tags/branches.

* There are mistaken commits in the svn history, similar to this:
    ------------------------------------------------------------------------
    r21255 | xxx_xxxxxx_xxxxxxxxx | 2012-03-02 18:46:30 +0300 (Fri, 02 Mar 2012) | 1 line
    Changed paths:
      A /tags/dmagentenabler-4.1.31/DMAgent (from /tags:20998)
    
    Delivery 4.1.31
    ------------------------------------------------------------------------
  git-svn tries to create a tag containing dirs with other tags.
  Technically, it behaves correctly, but it hangs because of the size of
  the commit.
  To solve it, I had to edit the svn dump file:

     Node-path: tags/dmagentenabler-4.1.31/DMAgent
     Node-kind: dir
     Node-action: add
    -Node-copyfrom-rev: 20998
    -Node-copyfrom-path: tags
    +Prop-content-length: 10
    +Content-length: 10
    +
    +PROPS-END
     
     
     Revision-number: 21256

  It creates an empty dir instead of copying. Since the author
  noticed the mistake, he immediately deleted the dir in the next
  revision, so it works.
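
The same edit could be scripted instead of done by hand. The following is
only a sketch under assumptions: strip_copyfrom is a hypothetical helper, it
treats the dump as lines of text (so file contents that look like node
headers, or binary data that awk mishandles, would break it), and it assumes
the node has no properties or text of its own:

```shell
#!/bin/sh
# strip_copyfrom: hypothetical helper. Reads an SVN dump on stdin and, for one
# node in one revision, drops the Node-copyfrom-* headers and gives the node
# an empty property block instead.
strip_copyfrom() {      # usage: strip_copyfrom REV NODE_PATH <in.dump >out.dump
    LC_ALL=C awk -v rev="$1" -v path="$2" '
        /^Revision-number: / { cur = $2; in_node = 0 }
        /^Node-path: / {
            in_node = (cur == rev && $0 == ("Node-path: " path))
            dropped = 0
        }
        in_node && /^Node-copyfrom-(rev|path): / { dropped = 1; next }
        in_node && $0 == "" {
            # End of the node headers: add the empty property block.
            if (dropped) {
                print "Prop-content-length: 10"
                print "Content-length: 10"
                print ""
                print "PROPS-END"
            }
            in_node = 0
        }
        { print }
    '
}
```

The length 10 is the byte count of "PROPS-END" plus its newline. Restricting
the match to one revision keeps the later delete of the same path untouched.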


* Re: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-06  8:58               ` Re[3]: " Ilya Basin
@ 2013-05-09  1:05                 ` Eric Wong
  2013-05-28 12:57                 ` Re[4]: " Ilya Basin
  1 sibling, 0 replies; 18+ messages in thread
From: Eric Wong @ 2013-05-09  1:05 UTC (permalink / raw)
  To: Ilya Basin; +Cc: Junio C Hamano, Git mailing list, Ray Chen

Ilya Basin <basinilya@gmail.com> wrote:
> Hi Eric. I'm out of spare time and I'm still unable to import my repo.
> The code of SVN.pm is too complex. Please help me.

Sorry, most of what I do nowadays for git-svn is ACK/NACK changes.

git-svn has made itself obsolete for most contributors, myself included;
so it's hard for us to devote significant amounts of time on it since
we no longer see SVN repos in our day-to-day work.

Given the differences between branching/tagging in SVN and git, I
suspect some history may always be too complex/convoluted to
automatically import.  Perhaps an interactive mode can be introduced
to follow history...

Anyways, thank you for documenting these issues and suggesting fixes.
Hopefully somebody with sufficient motivation can continue your work
down the line.


* Re[4]: [PATCH 4/5] git-svn: fix bottleneck in stash_placeholder_list()
  2013-05-06  8:58               ` Re[3]: " Ilya Basin
  2013-05-09  1:05                 ` Eric Wong
@ 2013-05-28 12:57                 ` Ilya Basin
  1 sibling, 0 replies; 18+ messages in thread
From: Ilya Basin @ 2013-05-28 12:57 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Git mailing list, Ray Chen

IB> * I think git-svn doesn't handle the case when a tag is deleted.
IB>   I expected it to rename the ref from "tags/tagname" to
IB>   "tags/tagname@rev", but that doesn't happen.
IB>   If a tag is replaced, there's no way to tell what the previous
IB>   state of that tag was: git-svn just rewrites the ref.

OK, I figured out that git-svn creates a merge commit whose parents are
the previous state of the tag and the state of the new copy source
folder.
