* Status of the svn remote helper project (Nov, 2010)
@ 2010-11-07 11:21 Jonathan Nieder
  2010-11-07 12:06 ` David Michael Barr
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Jonathan Nieder @ 2010-11-07 11:21 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, Sverre Rabbelier, David Barr, Sam Vilain,
	Stephen Bash, Tomas Carnecky

The svn remote helper project still has a long way to go.  In the
meantime, the svn-fe plumbing and Tomas's scripted prototype are
usable.

Here are some topics that might be roughly in their final form.  If
you would like to build on one of them, please let me know so I can
refrain from rewriting that piece of history.

A merge of these branches is available as

	git://repo.or.cz/git/jrn.git vcs-svn-pu

and individual topic branches are also available in that repository,
though for convenience they are not in the refs/heads namespace.

Thoughts and improvements welcome.

--------------------------------------------------
[Cooking]
* jn/svndiff0 (2010-11-06) 24 commits
 - vcs-svn: Allow deltas to copy from preimage
 - vcs-svn: Reject deltas that read past end of preimage
 - vcs-svn: Let deltas use data from postimage
 - vcs-svn: Reject deltas that do not consume all inline data
 - vcs-svn: Check declared number of output bytes
 - vcs-svn: Implement copyfrom_data delta instruction
 - vcs-svn: Read instructions from deltas
 - vcs-svn: Read inline data from deltas
 - vcs-svn: Read the preimage while applying deltas
 - vcs-svn: Skeleton of an svn delta parser
 - compat: helper for detecting unsigned overflow
 - vcs-svn: Learn to check for SVN\0 magic
 - vcs-svn: Learn to parse variable-length integers
 - vcs-svn: Add code to maintain a sliding view of a file
 - vcs-svn: Allow character-oriented input
 - vcs-svn: Allow input errors to be detected early
 - vcs-svn: Let callers peek ahead to find stream end
 - vcs-svn: Add binary-safe read() function
 - vcs-svn: Improve support for reading large files
 - vcs-svn: Make buffer_skip_bytes() report partial reads
 - vcs-svn: Teach line_buffer to handle multiple input files
 - vcs-svn: Collect line_buffer data in a struct
 - vcs-svn: Replace buffer_read_string() memory pool with a strbuf
 - vcs-svn: Eliminate global byte_buffer[] array

An SVN-format delta applier.  Seems okay, but it has not been heavily
exercised with real-world deltas.
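
As a rough illustration of two of the lower-level pieces named in the commit list (the "SVN\0" magic check and the variable-length integers), here is a minimal sketch. It assumes the usual svndiff encoding: big-endian groups of 7 bits, with the high bit set on every byte except the last. The function names are hypothetical and are not taken from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 0 if buf starts with the 4-byte svndiff0 header "SVN\0", -1 otherwise. */
int check_svndiff0_magic(const unsigned char *buf, size_t len)
{
	static const unsigned char magic[4] = {'S', 'V', 'N', '\0'};

	assert(buf);
	if (len < sizeof(magic) || memcmp(buf, magic, sizeof(magic)))
		return -1;
	return 0;
}

/*
 * Decode one variable-length integer.
 * Returns the number of bytes consumed, or 0 on truncated input
 * or if the value would not fit in 64 bits.
 */
size_t parse_varint(const unsigned char *buf, size_t len, uint64_t *result)
{
	uint64_t value = 0;
	size_t i;

	assert(buf && result);
	for (i = 0; i < len; i++) {
		unsigned char byte = buf[i];

		if (value > (UINT64_MAX >> 7))
			return 0;	/* shifting would drop high bits */
		value = (value << 7) | (byte & 0x7f);
		if (!(byte & 0x80)) {
			*result = value;
			return i + 1;
		}
	}
	return 0;	/* ran out of input before the final byte */
}
```

Returning 0 for both truncation and overflow keeps the sketch short; a real parser would probably want to tell the two apart when reporting errors.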

* db/fast-import-cat-blob (2010-11-07) 3 commits
 - fast-import: let importers retrieve blobs
 - fast-import: clarify documentation of "feature" command
 - fast-import: stricter parsing of integer options

As David says: "it has some significant consequences".

A start for bi-directional communication with fast-import (needed by
svn-fe to avoid keeping its own database of blobs).  Seems to be in
okay shape.
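
For readers unfamiliar with the backchannel: as documented for fast-import, a cat-blob request is answered on the cat-blob fd with a header line of the form "<sha1> SP 'blob' SP <size> LF", followed by the blob contents and a trailing LF. A minimal sketch of parsing that header line follows; the helper name is hypothetical, it assumes a 40-digit hex object name, and it is not the parser used in the series.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<40 hex digits> blob <size>" (trailing newline already stripped).
 * On success stores the size and returns 0; returns -1 on malformed input.
 */
int parse_blob_header(const char *line, uint64_t *size)
{
	static const char type[] = " blob ";
	const char *p = line;
	char *end;
	int i;

	assert(line && size);
	for (i = 0; i < 40; i++, p++)
		if (!isxdigit((unsigned char) *p))
			return -1;	/* also stops safely at an early NUL */
	if (strncmp(p, type, strlen(type)))
		return -1;
	p += strlen(type);
	if (!isdigit((unsigned char) *p))
		return -1;	/* reject empty, signed, or space-padded sizes */
	errno = 0;
	*size = strtoull(p, &end, 10);
	if (errno || *end != '\0')
		return -1;
	return 0;
}
```

The size from this line is what lets the reader know exactly how many bytes of blob data precede the trailing newline.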

* db/svn-fe-dumpfile3 (2010-11-07) 6 commits
 - vcs-svn: apply node text deltas
 - Merge branch 'jn/svndiff0' into db/svn-fe-dumpfile3
 - Merge branch 'db/fast-import-cat-blob' into db/svn-fe-dumpfile3
 - vcs-svn: Add output file param to buffer_copy_bytes()
 - vcs-svn: Find basis for deltified nodes; apply node prop deltas
 - vcs-svn: Teach dump parser about new header types
 (this branch uses jn/svndiff0 and db/fast-import-cat-blob.)

Adding support for dumpfiles with deltas (which is pretty close to
what the ra protocol sends over the wire) to svn-fe.

The tip commit could use some cleaning up.

* rr/svnfe-tests-no-perl (2010-11-07) 1 commit
 - t9010 (svn-fe): Eliminate dependency on svn perl bindings

Ejected from the jn/svndiff0 topic.  A noninvasive simplification;
what more could one ask for?

* jn/wrappers-no-libz (2010-11-06) 7 commits
 - Remove pack file handling dependency from wrapper.o
 - pack-objects: mark file-local variable static
 - wrapper: give zlib wrappers their own translation unit
 - strbuf: move strbuf_branchname to sha1_name.c
 - path helpers: move git_mkstemp* to wrapper.c
 - wrapper: move odb_* to environment.c
 - wrapper: move xmmap() to sha1_file.c

Approach seems reasonable.  More eyes on the tip commit would
be comforting.

* xx/wrappers-no-libz-svndiff0 (2010-11-07) 2 commits
 - svn-fe: stop linking to libz and libxdiff
 - Merge branch 'jn/svndiff0' into xx/wrappers-no-libz-svndiff0
 (this branch uses jn/wrappers-no-libz and jn/svndiff0.)

Example application of the jn/wrappers-no-libz topic.

--------------------------------------------------
[Not picked up yet]

* db/branch-mapper: $gmane/158375
 . contrib/svn-fe: Fast script to remap svn history

Could use a usage example (perhaps a test script).

* tc/remote-helper-usability: $gmane/157860
 . Register new packs after the remote helper is done fetching
 . Properly record history of the notes ref
 . Fix ls-remote output when displaying impure refs
 . Add git-remote-svn
 . Introduce the git fast-import-helper
 . Rename get_mode() to decode_tree_mode() and export it
 . Allow the transport fetch command to add additional refs
 . Allow more than one keepfile in the transport
 . Remote helper: accept ':<value> <name>' as a response to 'list'

The fourth-from-top seems a bit hard to review.  If it really is
necessary to introduce a separate program with a separate interface,
maybe a compile-time flag to choose between them would help?

* rr/remote-helper: http://github.com/artagnon/git
 . remote-svn: Write in fetch functionality
 . run-command: Protect the FD 3 from being grabbed
 . remote-svn: Build a pipeline for the import using svnrdump
 . run-command: Extend child_process to include a backchannel FD
 . Allow the transport fetch command to add additional refs
 . Remote helper: accept ':<value> <name>' as a response to 'list'
 . test-svn-fe: Allow for a dumpfile on stdin
 . contrib/svn-fe: Fast script to remap svn history
 . Add Tom's remote helper for reference
 . Add a stubby remote-svn remote helper
 . Add a correct svndiff applier

Work in progress, waiting on lower levels to be more functional
(in particular, svn-fe does not support incremental imports yet).

* sb/svn-fe-example: $gmane/159054

--------------------------------------------------
[Design note (vaporware)]

See $gmane/157141 for some hints about implementing incremental
imports.

$gmane means http://thread.gmane.org/gmane.comp.version-control.git

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Status of the svn remote helper project (Nov, 2010)
  2010-11-07 11:21 Status of the svn remote helper project (Nov, 2010) Jonathan Nieder
@ 2010-11-07 12:06 ` David Michael Barr
  2010-11-08  3:56   ` David Barr
  2010-11-07 12:50 ` Ramkumar Ramachandra
  2010-11-21  6:31 ` Status of the svn remote helper project (Nov 2010, #2) Jonathan Nieder
  2 siblings, 1 reply; 22+ messages in thread
From: David Michael Barr @ 2010-11-07 12:06 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
	Stephen Bash, Tomas Carnecky

Hi all,

> [Cooking]
> * jn/svndiff0 (2010-11-06) 24 commits

[...]

> An SVN-format delta applier.  Seems okay, but it has not been heavily
> exercised with real-world deltas.

I'm testing this version against the original ASF dump that I used previously.
Maybe one day we can try against the KDE repo - which is epic in proportions.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Status of the svn remote helper project (Nov, 2010)
  2010-11-07 11:21 Status of the svn remote helper project (Nov, 2010) Jonathan Nieder
  2010-11-07 12:06 ` David Michael Barr
@ 2010-11-07 12:50 ` Ramkumar Ramachandra
  2010-11-07 17:42   ` Jonathan Nieder
  2010-11-21  6:31 ` Status of the svn remote helper project (Nov 2010, #2) Jonathan Nieder
  2 siblings, 1 reply; 22+ messages in thread
From: Ramkumar Ramachandra @ 2010-11-07 12:50 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Sverre Rabbelier, David Barr, Sam Vilain, Stephen Bash,
	Tomas Carnecky

Hi,

Jonathan Nieder writes:
> The svn remote helper project still has a long way to go.  In the
> meantime, the svn-fe plumbing and Tomas's scripted prototype are
> usable.
> 
> Here are some topics that might be roughly in their final form.  If
> you would like to build on one of them, please let me know so I can
> refrain from rewriting that piece of history.
> 
> A merge of these branches is available as
> 
> 	git://repo.or.cz/git/jrn.git vcs-svn-pu

Thanks for doing this! Now we have an up-to-date index that tracks all
our work :)

A note to the others: If we merge too early, we will be forced to
either stick with bad decisions we made prematurely, or revert
them. Therefore, we have decided to develop this on the side, while
reporting progress on the list.

> * jn/svndiff0 (2010-11-06) 24 commits

Yeah, this requires some rigorous testing with real-world deltas. I'll
probably get some EC2 instances to churn on it after my $EXAMS are over.

> * db/fast-import-cat-blob (2010-11-07) 3 commits
> As David says: "it has some significant consequences".

rr/remote-helper already uses this, but there might be a better way to
do it; we should wait and see.

> * db/svn-fe-dumpfile3 (2010-11-07) 6 commits

You earlier said that it might be possible to justify merging this now
provided we supply a wrapper script to make it easy to invoke.

> * rr/svnfe-tests-no-perl (2010-11-07) 1 commit

Ok.

> * jn/wrappers-no-libz (2010-11-06) 7 commits
> * xx/wrappers-no-libz-svndiff0 (2010-11-07) 2 commits

These two are fairly independent of the rest of the series, no? Maybe
get these merged separately?

> * db/branch-mapper: $gmane/158375

I have some mapper ideas from the discussion thread following
sb/svn-fe-example. I'll finish it after $EXAMS.

> * tc/remote-helper-usability: $gmane/157860

It has some good ideas that I'm re-using in rr/remote-helper.

> * rr/remote-helper: http://github.com/artagnon/git

First, it's the wrong approach: I've hardcoded FD 3 into run-command
to mean backflow. This ugly inelegant design must be thrown
away. Second, it's very messy, and half the commits aren't even
used.

Anyway, I hope it gets the idea across: there's some functionality I
intend to reuse from tc/remote-helper-usability. Also, the "fetch"
command works so long as fetching from revision 0 is requested. So the
immediate priority is to get svn-fe to support incremental imports.

> * sb/svn-fe-example: $gmane/159054

The discussion thread following this has some good observations/
ideas.

-- Ram


* Re: Status of the svn remote helper project (Nov, 2010)
  2010-11-07 12:50 ` Ramkumar Ramachandra
@ 2010-11-07 17:42   ` Jonathan Nieder
  0 siblings, 0 replies; 22+ messages in thread
From: Jonathan Nieder @ 2010-11-07 17:42 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, Sverre Rabbelier, David Barr, Sam Vilain, Stephen Bash,
	Tomas Carnecky

Ramkumar Ramachandra wrote:
> Jonathan Nieder writes:

>> 	git://repo.or.cz/git/jrn.git vcs-svn-pu
>
> Thanks for doing this! Now we have an up-to-date index that tracks all
> our work :)
>
> A note to the others: If we merge too early, we will be forced to
> either stick with bad decisions we made prematurely, or revert
> them. Therefore, we have decided to develop this on the side, while
> reporting progress on the list.

More precisely: it would be nice to see the usual flow of patches into
git.git; from my point of view, nothing is significantly different
in that respect from before, and this is just a convenient place for
remote-svn patches to park and be tested without bothering Junio too
much.

This should make it easier to keep track of the current state of svn::
support without regularly sending 46-commit patch series to the list.

Thanks for the updates and kind words.
Jonathan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Status of the svn remote helper project (Nov, 2010)
  2010-11-07 12:06 ` David Michael Barr
@ 2010-11-08  3:56   ` David Barr
  2010-11-08  6:11     ` Jonathan Nieder
  0 siblings, 1 reply; 22+ messages in thread
From: David Barr @ 2010-11-08  3:56 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
	Stephen Bash, Tomas Carnecky

Hi all,

> > [Cooking]
> > * jn/svndiff0 (2010-11-06) 24 commits
> 
> [...]
> 
> > An SVN-format delta applier.  Seems okay, but it has not been heavily
> > exercised with real-world deltas.
> 
> I'm testing this version against the original ASF dump that I used previously.
> Maybe one day we can try against the KDE repo - which is epic in proportions.

I've successfully tested this series against the ASF repository
(940,166 revisions) and 5,636,613 blobs were faithfully reproduced.

--
David Barr.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Status of the svn remote helper project (Nov, 2010)
  2010-11-08  3:56   ` David Barr
@ 2010-11-08  6:11     ` Jonathan Nieder
  2010-11-08  6:20       ` David Barr
  0 siblings, 1 reply; 22+ messages in thread
From: Jonathan Nieder @ 2010-11-08  6:11 UTC (permalink / raw)
  To: David Barr
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
	Stephen Bash, Tomas Carnecky

David Barr wrote:

>>> [Cooking]
>>> * jn/svndiff0 (2010-11-06) 24 commits
[...]
> I've successfully tested this series against the ASF repository
> (940,166 revisions) and 5,636,613 blobs were faithfully reproduced.

Thanks!  Was that using svn-fe, test-svn-fe -d, or some other program
using the vcs-svn lib?

Jonathan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Status of the svn remote helper project (Nov, 2010)
  2010-11-08  6:11     ` Jonathan Nieder
@ 2010-11-08  6:20       ` David Barr
  0 siblings, 0 replies; 22+ messages in thread
From: David Barr @ 2010-11-08  6:20 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
	Stephen Bash, Tomas Carnecky

Hi,

> >>> [Cooking]
> >>> * jn/svndiff0 (2010-11-06) 24 commits
> [...]
> > I've successfully tested this series against the ASF repository
> > (940,166 revisions) and 5,636,613 blobs were faithfully reproduced.
> 
> Thanks!  Was that using svn-fe, test-svn-fe -d, or some other program
> using the vcs-svn lib?

That was using svn-fe:
svn-fe < svn-asf-public-r0:940166 3<backflow | git-fast-import --cat-blob-fd=3 3>backflow

--
David Barr.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Status of the svn remote helper project (Nov 2010, #2)
  2010-11-07 11:21 Status of the svn remote helper project (Nov, 2010) Jonathan Nieder
  2010-11-07 12:06 ` David Michael Barr
  2010-11-07 12:50 ` Ramkumar Ramachandra
@ 2010-11-21  6:31 ` Jonathan Nieder
  2010-11-21  9:38   ` David Michael Barr
  2010-12-05 11:37   ` Status of the svn remote helper project (Dec 2010, #1) Jonathan Nieder
  2 siblings, 2 replies; 22+ messages in thread
From: Jonathan Nieder @ 2010-11-21  6:31 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, Sverre Rabbelier, David Barr, Sam Vilain,
	Stephen Bash, Tomas Carnecky

Not much to see here.  There are lots of patches waiting for review;
still especially noteworthy are Tomas's fast-import changes.

Incremental updates after a one-shot conversion by svn-fe are not
supported yet.  A map from git revisions to svn revision numbers would
be needed for that, preferrably such that the time to look up the HEAD
commit does not scale with the number of revisions.
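
One possible shape for such a map, offered only as a sketch and not as the project's design: an append-only table of fixed-width records indexed by svn revision number. The most recently imported revision is then just the record count, and looking up its commit is a constant-time array access. All names below are hypothetical, and the sketch assumes revisions are recorded in order starting from r1.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define OID_HEXLEN 40

struct rev_map {
	char (*oid)[OID_HEXLEN + 1];	/* oid[i] is the commit for svn revision i + 1 */
	uint32_t nr, alloc;
};

/* Record the commit for the next svn revision; returns its number, or 0 on error. */
uint32_t rev_map_append(struct rev_map *map, const char *hex_oid)
{
	assert(map && hex_oid);
	if (strlen(hex_oid) != OID_HEXLEN)
		return 0;
	if (map->nr == map->alloc) {
		uint32_t alloc = map->alloc ? 2 * map->alloc : 64;
		void *p = realloc(map->oid, alloc * sizeof(*map->oid));

		if (!p)
			return 0;
		map->oid = p;
		map->alloc = alloc;
	}
	memcpy(map->oid[map->nr], hex_oid, OID_HEXLEN + 1);
	return ++map->nr;
}

/* Constant-time lookup of the commit for a given svn revision (NULL if unknown). */
const char *rev_map_lookup(const struct rev_map *map, uint32_t rev)
{
	assert(map);
	if (!rev || rev > map->nr)
		return NULL;
	return map->oid[rev - 1];
}

/* The most recently imported revision; no scan over history is needed. */
uint32_t rev_map_latest(const struct rev_map *map)
{
	assert(map);
	return map->nr;
}
```

An on-disk variant could use the same layout and seek to a record by offset; the reverse direction (from an arbitrary git commit to its svn revision) would need a separate index and is not covered here.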

A merge of the branches listed below is available as

	git://repo.or.cz/git/jrn.git vcs-svn-pu

and individual topic branches are also available in that repository
in the refs/topics namespace.  Please try to base your work on just
the topic branches you use; vcs-svn-pu itself is rebuilt each time it
is updated.

Complaints of all kinds welcome.

--------------------------------------------------
[Cooking]
* jn/svndiff0 (2010-11-06) 24 commits
 - vcs-svn: Allow deltas to copy from preimage
 - vcs-svn: Reject deltas that read past end of preimage
 - vcs-svn: Let deltas use data from postimage
 - vcs-svn: Reject deltas that do not consume all inline data
 - vcs-svn: Check declared number of output bytes
 - vcs-svn: Implement copyfrom_data delta instruction
 - vcs-svn: Read instructions from deltas
 - vcs-svn: Read inline data from deltas
 - vcs-svn: Read the preimage while applying deltas
 - vcs-svn: Skeleton of an svn delta parser
 - compat: helper for detecting unsigned overflow
 - vcs-svn: Learn to check for SVN\0 magic
 - vcs-svn: Learn to parse variable-length integers
 - vcs-svn: Add code to maintain a sliding view of a file
 - vcs-svn: Allow character-oriented input
 - vcs-svn: Allow input errors to be detected early
 - vcs-svn: Let callers peek ahead to find stream end
 - vcs-svn: Add binary-safe read() function
 - vcs-svn: Improve support for reading large files
 - vcs-svn: Make buffer_skip_bytes() report partial reads
 - vcs-svn: Teach line_buffer to handle multiple input files
 - vcs-svn: Collect line_buffer data in a struct
 - vcs-svn: Replace buffer_read_string() memory pool with a strbuf
 - vcs-svn: Eliminate global byte_buffer[] array

Well tested.  It's a library without a user except test-svn-fe -d, but
aside from that detail, this series should be ready for wide use.
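
To make the commit titles above more concrete, here is a minimal in-memory sketch of applying the instruction stream of one svndiff0 window. It assumes the usual encoding: the top two bits of the first byte select the instruction (0 copy from source, 1 copy from target, 2 copy from new data), the low six bits give the length (0 means a variable-length integer follows), and the two copy-from-view instructions carry an offset. Names are hypothetical; the series itself works on streams through line_buffer and a sliding window rather than on whole buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { COPYFROM_SOURCE = 0, COPYFROM_TARGET = 1, COPYFROM_DATA = 2 };

/* Decode a variable-length integer (7 bits per byte, high bit means "more"). */
static int read_int(const unsigned char **p, const unsigned char *end, size_t *out)
{
	size_t v = 0;

	while (*p < end) {
		unsigned char c = *(*p)++;

		if (v > ((size_t) -1 >> 7))
			return -1;
		v = (v << 7) | (c & 0x7f);
		if (!(c & 0x80)) {
			*out = v;
			return 0;
		}
	}
	return -1;
}

/* Returns the number of bytes written to out, or -1 on a malformed delta. */
long apply_instructions(const unsigned char *ins, size_t ins_len,
			const unsigned char *src, size_t src_len,
			const unsigned char *data, size_t data_len,
			unsigned char *out, size_t out_len)
{
	const unsigned char *end = ins + ins_len;
	size_t out_pos = 0, data_pos = 0;

	assert(ins && src && data && out);
	while (ins < end) {
		unsigned op = *ins >> 6;
		size_t len = *ins++ & 0x3f, off = 0, i;

		if (op > COPYFROM_DATA)
			return -1;	/* unknown instruction */
		if (!len && read_int(&ins, end, &len))
			return -1;
		if (op != COPYFROM_DATA && read_int(&ins, end, &off))
			return -1;
		if (len > out_len - out_pos)
			return -1;	/* more output than declared */
		switch (op) {
		case COPYFROM_SOURCE:
			if (off > src_len || len > src_len - off)
				return -1;	/* reads past end of preimage */
			memcpy(out + out_pos, src + off, len);
			break;
		case COPYFROM_TARGET:
			if (off >= out_pos)
				return -1;	/* must start inside the output so far */
			/* Byte-wise: the range may overlap what is being written. */
			for (i = 0; i < len; i++)
				out[out_pos + i] = out[off + i];
			break;
		case COPYFROM_DATA:
			if (len > data_len - data_pos)
				return -1;
			memcpy(out + out_pos, data + data_pos, len);
			data_pos += len;
			break;
		}
		out_pos += len;
	}
	if (data_pos != data_len || out_pos != out_len)
		return -1;	/* unused inline data or wrong output size */
	return (long) out_pos;
}
```

The copy-from-target case is deliberately a forward byte-by-byte loop: a delta may copy from a range that extends into the bytes it is currently producing, which is how runs of repeated data are expressed.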

* db/fast-import-cat-blob (2010-11-07) 3 commits
 - fast-import: Allow cat-blob requests at arbitrary points in stream
 - fast-import: let importers retrieve blobs
 - fast-import: clarify documentation of "feature" command
 - fast-import: stricter parsing of integer options

There are plans for an additional command to print information in
ls-tree format about a path.
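
For reference, "ls-tree format" here presumably means the layout git ls-tree prints for each entry: "<mode> SP <type> SP <object> TAB <path>", with the mode as six octal digits. A minimal sketch of producing such a line follows; the function name is hypothetical, the planned command's exact output is not specified in this thread, and the quoting that ls-tree applies to unusual path names is ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one entry as "<mode> SP <type> SP <object> TAB <path>".
 * Returns the snprintf() result: the length the full line needs,
 * which exceeds size - 1 if the output was truncated.
 */
int format_ls_tree_line(char *buf, size_t size, unsigned mode,
			const char *type, const char *hex_oid, const char *path)
{
	assert(buf && type && hex_oid && path);
	return snprintf(buf, size, "%06o %s %s\t%s", mode, type, hex_oid, path);
}
```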

* db/recognize-v3 (2010-11-20) 2 commits
 - vcs-svn: Allow simple v3 dumps (no deltas yet)
 - vcs-svn: Error out for v3 dumps

A bugfix and the framework for a feature.

* db/prop-delta (2010-11-20) 16 commits
 - vcs-svn: Simplify handling of deleted properties
 - vcs-svn: Implement Prop-delta handling
 - vcs-svn: Sharpen parsing of property lines
 - vcs-svn: Split off function for handling of individual properties
 - vcs-svn: Make source easier to read on small screens
 - vcs-svn: More dump format sanity checks
 - vcs-svn: Reject path nodes without Node-action
 - vcs-svn: Delay read of per-path properties
 - vcs-svn: Combine repo_replace and repo_modify functions
 - vcs-svn: Replace = Delete + Add
 - vcs-svn: handle_node: Handle deletion case early
 - vcs-svn: Use mark to indicate nodes with included text
 - vcs-svn: Unclutter handle_node by introducing have_props var
 - vcs-svn: Eliminate node_ctx.mark global
 - vcs-svn: Eliminate node_ctx.srcRev global
 - vcs-svn: Check for errors from open()
 (this branch uses db/recognize-v3.)

Needs review and testing.

* db/text-delta (2010-11-20) 10 commits
 - svn-fe: Test script for handling of dumps with --deltas
 - vcs-svn: Implement text-delta handling
 - Merge branch 'db/fast-import-cat-blob' into db/text-delta
 - vcs-svn: Teach line_buffer about temporary files
 - vcs-svn: Let caller set up sliding window for delta preimage
 - vcs-svn: Read delta preimage from file descriptor
 - vcs-svn: Introduce fd_buffer routines
 - vcs-svn: Introduce repo_read_path to check the content at a path
 - vcs-svn: Internal fast_export_save_blob helper
 - Merge branch 'jn/svndiff0' into db/text-delta
 (this branch uses db/recognize-v3, db/prop-delta, db/fast-import-cat-blob,
  and jn/svndiff0.)

A delta in r36 of <http://svn.apache.org/repos/asf> does not apply
with this brand of svn-fe.

* rr/svnfe-tests-no-perl (2010-11-07) 1 commit
 - t9010 (svn-fe): Eliminate dependency on svn perl bindings

Sent to list; hopefully will be in jch and we can stop tracking it
soon.

* jn/thinner-wrapper (2010-11-06) 7 commits
 - Remove pack file handling dependency from wrapper.o
 - pack-objects: mark file-local variable static
 - wrapper: give zlib wrappers their own translation unit
 - strbuf: move strbuf_branchname to sha1_name.c
 - path helpers: move git_mkstemp* to wrapper.c
 - wrapper: move odb_* to environment.c
 - wrapper: move xmmap() to sha1_file.c

From pu.

* xx/thinner-wrapper-svndiff0 (2010-11-07) 2 commits
 - svn-fe: stop linking to libz and libxdiff
 - Merge branch 'jn/svndiff0' into xx/thinner-wrapper-svndiff0
 (this branch uses jn/thinner-wrapper and jn/svndiff0.)

---------------------------------------------------
[Dropped]
* db/svn-fe-dumpfile3 (2010-11-07) 6 commits
 - vcs-svn: apply node text deltas
 - Merge branch 'jn/svndiff0' into db/svn-fe-dumpfile3
 - Merge branch 'db/fast-import-cat-blob' into db/svn-fe-dumpfile3
 - vcs-svn: Add output file param to buffer_copy_bytes()
 - vcs-svn: Find basis for deltified nodes; apply node prop deltas
 - vcs-svn: Teach dump parser about new header types
 (this branch uses jn/svndiff0 and db/fast-import-cat-blob.)

Ejected in favor of db/recognize-v3, db/prop-delta, and db/text-delta.

--------------------------------------------------
[Not picked up yet]

* db/branch-mapper: $gmane/158375
 . contrib/svn-fe: Fast script to remap svn history

Sent comments.  The choices this script makes can be arbitrary at
times.

* tc/remote-helper-usability: $gmane/157860
 . Register new packs after the remote helper is done fetching
 . Properly record history of the notes ref
 . Fix ls-remote output when displaying impure refs
 . Add git-remote-svn
 . Introduce the git fast-import-helper
 . Rename get_mode() to decode_tree_mode() and export it
 . Allow the transport fetch command to add additional refs
 . Allow more than one keepfile in the transport
 . Remote helper: accept ':<value> <name>' as a response to 'list'

The fourth-from-top seems a bit hard to review.  If it really is
necessary to introduce a separate program with a separate interface,
maybe a compile-time flag to choose between them would help?

* rr/remote-helper: http://github.com/artagnon/git
 . remote-svn: Write in fetch functionality
 . run-command: Protect the FD 3 from being grabbed
 . remote-svn: Build a pipeline for the import using svnrdump
 . run-command: Extend child_process to include a backchannel FD
 . Allow the transport fetch command to add additional refs
 . Remote helper: accept ':<value> <name>' as a response to 'list'
 . test-svn-fe: Allow for a dumpfile on stdin
 . contrib/svn-fe: Fast script to remap svn history
 . Add Tom's remote helper for reference
 . Add a stubby remote-svn remote helper
 . Add a correct svndiff applier

Work in progress, waiting on lower levels to be more functional
(in particular, svn-fe does not support incremental imports yet).

* sb/svn-fe-example: $gmane/159054

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Status of the svn remote helper project (Nov 2010, #2)
  2010-11-21  6:31 ` Status of the svn remote helper project (Nov 2010, #2) Jonathan Nieder
@ 2010-11-21  9:38   ` David Michael Barr
  2010-11-21 23:06     ` Jonathan Nieder
  2010-12-05 11:37   ` Status of the svn remote helper project (Dec 2010, #1) Jonathan Nieder
  1 sibling, 1 reply; 22+ messages in thread
From: David Michael Barr @ 2010-11-21  9:38 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
	Stephen Bash, Tomas Carnecky

Hi all,

> There are plans for an additional command to print information in
> ls-tree format about a path.

Sorry that progress has been quite slow this week.
My work-in-progress lays out the syntax of the command,
the initial parsing and object lookup.
I still need to break out the tree and path dereferencing logic.
I haven't started on printing yet but I presume it should be easy.

> * db/recognize-v3 (2010-11-20) 2 commits
> - vcs-svn: Allow simple v3 dumps (no deltas yet)
> - vcs-svn: Error out for v3 dumps
> 
> A bugfix and the framework for a feature.

Thanks for this, it should have gone in much earlier.

> * db/prop-delta (2010-11-20) 16 commits
> - vcs-svn: Simplify handling of deleted properties
> - vcs-svn: Implement Prop-delta handling
> - vcs-svn: Sharpen parsing of property lines
> - vcs-svn: Split off function for handling of individual properties
> - vcs-svn: Make source easier to read on small screens
> - vcs-svn: More dump format sanity checks
> - vcs-svn: Reject path nodes without Node-action
> - vcs-svn: Delay read of per-path properties
> - vcs-svn: Combine repo_replace and repo_modify functions
> - vcs-svn: Replace = Delete + Add
> - vcs-svn: handle_node: Handle deletion case early
> - vcs-svn: Use mark to indicate nodes with included text
> - vcs-svn: Unclutter handle_node by introducing have_props var
> - vcs-svn: Eliminate node_ctx.mark global
> - vcs-svn: Eliminate node_ctx.srcRev global
> - vcs-svn: Check for errors from open()
> (this branch uses db/recognize-v3.)
> 
> Needs review and testing.

Testing against my favourite, the ASF repo, as I write this.

> 
> * db/text-delta (2010-11-20) 10 commits
> - svn-fe: Test script for handling of dumps with --deltas
> - vcs-svn: Implement text-delta handling
> - Merge branch 'db/fast-import-cat-blob' into db/text-delta
> - vcs-svn: Teach line_buffer about temporary files
> - vcs-svn: Let caller set up sliding window for delta preimage
> - vcs-svn: Read delta preimage from file descriptor
> - vcs-svn: Introduce fd_buffer routines
> - vcs-svn: Introduce repo_read_path to check the content at a path
> - vcs-svn: Internal fast_export_save_blob helper
> - Merge branch 'jn/svndiff0' into db/text-delta
> (this branch uses db/recognize-v3, db/prop-delta, db/fast-import-cat-blob,
>  and jn/svndiff0.)
> 
> A delta in r36 of <http://svn.apache.org/repos/asf> does not apply
> with this brand of svn-fe.

That's odd, I was able to import up to r354 before receiving:
fatal: missing newline after cat-blob response

So there are some regressions to chase down since the last roll-up.

That's all I've got for now.

--
David Barr.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Status of the svn remote helper project (Nov 2010, #2)
  2010-11-21  9:38   ` David Michael Barr
@ 2010-11-21 23:06     ` Jonathan Nieder
  2010-11-22  2:06       ` David Barr
  0 siblings, 1 reply; 22+ messages in thread
From: Jonathan Nieder @ 2010-11-21 23:06 UTC (permalink / raw)
  To: David Michael Barr
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
	Stephen Bash, Tomas Carnecky

David Michael Barr wrote:
> Jonathan Nieder wrote:

>> A delta in r36 of <http://svn.apache.org/repos/asf> does not apply
>> with this brand of svn-fe.
>
> That's odd, I was able to import up to r354 before receiving:
> fatal: missing newline after cat-blob response

Apparently sometimes deltas use the whole preimage and sometimes they
don't.

Here's a fix (still needs a simple reproduction script).

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
diff --git a/test-svn-fe.c b/test-svn-fe.c
index 64f63cf..71de02b 100644
--- a/test-svn-fe.c
+++ b/test-svn-fe.c
@@ -11,7 +11,7 @@
 int main(int argc, char *argv[])
 {
 	static const char test_svnfe_usage[] =
-		"test-svn-fe (<dumpfile> | [-d] <preimage> <delta> <len>)";
+		"test-svn-fe (<dumpfile> | [-d] <preimage> [<preimage len>] <delta> <len>)";
 	if (argc < 2)
 		usage(test_svnfe_usage);
 	if (argc == 2) {
@@ -22,16 +22,26 @@ int main(int argc, char *argv[])
 		svndump_reset();
 		return 0;
 	}
-	if (argc == 5 && !strcmp(argv[1], "-d")) {
+	if ((argc == 5 || argc == 6) && !strcmp(argv[1], "-d")) {
+		char **arg = argv + 2;
 		struct line_buffer delta = LINE_BUFFER_INIT;
-		int preimage_fd = open(argv[2], O_RDONLY);
+		int preimage_fd = open(*arg++, O_RDONLY);
 		struct view preimage_view = {preimage_fd, 0, STRBUF_INIT};
+		off_t preimage_len;
 		if (preimage_fd < 0)
 			die_errno("cannot open preimage");
-		if (buffer_init(&delta, argv[3]))
+		if (argc == 6) {
+			preimage_len = (off_t) strtoull(*arg++, NULL, 0);
+		} else {
+			struct stat st;
+			if (fstat(preimage_fd, &st))
+				die_errno("cannot stat preimage");
+			preimage_len = st.st_size;
+		}
+		if (buffer_init(&delta, *arg++))
 			die_errno("cannot open delta");
-		if (svndiff0_apply(&delta, (off_t) strtoull(argv[4], NULL, 0),
-				   &preimage_view, stdout))
+		if (svndiff0_apply(&delta, (off_t) strtoull(*arg++, NULL, 0),
+				   &preimage_view, preimage_len, stdout))
 			return 1;
 		if (close(preimage_fd))
 			die_errno("cannot close preimage");
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index ceb1fc5..02456cf 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -119,6 +119,7 @@ static long apply_delta(uint32_t mark, off_t len, struct line_buffer *input,
 			uint32_t old_mark, uint32_t old_mode)
 {
 	long ret;
+	off_t preimage_len = 0;
 	struct view preimage = {REPORT_FILENO, 0, STRBUF_INIT};
 	FILE *out;
 
@@ -130,13 +131,12 @@ static long apply_delta(uint32_t mark, off_t len, struct line_buffer *input,
 		printf("cat-blob :%"PRIu32"\n", old_mark);
 		fflush(stdout);
 		response = get_response_line();
-		/* Not necessary, just for robustness */
-		if (parse_cat_response_line(response, &dummy))
+		if (parse_cat_response_line(response, &preimage_len))
 			die("invalid cat-blob response: %s", response);
 	}
 	if (old_mode == REPO_MODE_LNK)
 		strbuf_addstr(&preimage.buf, "link ");
-	if (svndiff0_apply(input, len, &preimage, out))
+	if (svndiff0_apply(input, len, &preimage, preimage_len, out))
 		die("cannot apply delta");
 	if (old_mark) {
 		/* Discard trailing newline from cat-blob-fd. */
diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c
index 308c734..8210561 100644
--- a/vcs-svn/svndiff.c
+++ b/vcs-svn/svndiff.c
@@ -283,7 +283,8 @@ static int apply_one_window(struct line_buffer *delta, off_t *delta_len,
 }
 
 int svndiff0_apply(struct line_buffer *delta, off_t delta_len,
-			struct view *preimage_view, FILE *postimage)
+			struct view *preimage_view, off_t preimage_len,
+			FILE *postimage)
 {
 	assert(delta && preimage_view && postimage);
 
@@ -302,5 +303,7 @@ int svndiff0_apply(struct line_buffer *delta, off_t delta_len,
 			return error("Delta ends early! (%"PRIu64" bytes remaining)",
 			      (uint64_t) delta_len);
 	}
+	if (move_window(preimage_view, preimage_len, 0))
+		return -1;
 	return 0;
 }
diff --git a/vcs-svn/svndiff.h b/vcs-svn/svndiff.h
index bb5afd0..640e04f 100644
--- a/vcs-svn/svndiff.h
+++ b/vcs-svn/svndiff.h
@@ -5,6 +5,7 @@
 #include "sliding_window.h"
 
 extern int svndiff0_apply(struct line_buffer *delta, off_t delta_len,
-				struct view *preimage_view, FILE *postimage);
+				struct view *preimage_view, off_t preimage_len,
+				FILE *postimage);
 
 #endif
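
The idea behind the patch above can be sketched in isolation. When a delta does not read the whole preimage, the unread bytes are still sitting in the cat-blob stream, so the next read finds blob data where the trailing newline was expected. Knowing the preimage length from the response line, the reader can skip to the end of the blob before checking for that newline. The helper below is a hypothetical stand-alone illustration using stdio, not the move_window() code from the series.

```c
#include <assert.h>
#include <stdio.h>

/*
 * After a consumer has read `consumed` bytes of a `total`-byte payload
 * from `in`, discard the rest and then require the terminating newline.
 * Returns 0 on success, -1 on a short read or a missing newline.
 */
int finish_payload(FILE *in, long total, long consumed)
{
	int ch;

	assert(in && consumed >= 0 && consumed <= total);
	for (; consumed < total; consumed++)
		if (getc(in) == EOF)
			return -1;
	ch = getc(in);
	return ch == '\n' ? 0 : -1;
}
```

Without the skip loop, a partially consumed blob would make the newline check fail, which would match the "missing newline after cat-blob response" error reported earlier in the thread.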

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: Status of the svn remote helper project (Nov 2010, #2)
  2010-11-21 23:06     ` Jonathan Nieder
@ 2010-11-22  2:06       ` David Barr
  0 siblings, 0 replies; 22+ messages in thread
From: David Barr @ 2010-11-22  2:06 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
	Stephen Bash, Tomas Carnecky

Hi,

> >> A delta in r36 of <http://svn.apache.org/repos/asf> does not apply
> >> with this brand of svn-fe.
> >
> > That's odd, I was able to import up to r354 before receiving:
> > fatal: missing newline after cat-blob response
> 
> Apparently sometimes deltas use the whole preimage and sometimes they
> don't.
> 
> Here's a fix (still needs a simple reproduction script).

I'm testing this patch along with the following changes.
The first just removes a compile-time warning.
The second fixes a memory leak.
Sorry, my send-email-fu is not up to scratch.

Signed-off-by: David Barr <david.barr@cordelta.com>
---
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 02456cf..a95a5c9 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -127,7 +127,6 @@ static long apply_delta(uint32_t mark, off_t len, struct line_buffer *input,
                die("cannot open temporary file for blob retrieval");
        if (old_mark) {
                const char *response;
-               off_t dummy;
                printf("cat-blob :%"PRIu32"\n", old_mark);
                fflush(stdout);
                response = get_response_line();
@@ -147,6 +146,7 @@ static long apply_delta(uint32_t mark, off_t len, struct line_buffer *input,
        ret = buffer_tmpfile_prepare_to_read(&postimage);
        if (ret < 0)
                die("cannot read temporary file for blob retrieval");
+       strbuf_release(&preimage.buf);
        return ret;
 }
 


* Status of the svn remote helper project (Dec 2010, #1)
  2010-11-21  6:31 ` Status of the svn remote helper project (Nov 2010, #2) Jonathan Nieder
  2010-11-21  9:38   ` David Michael Barr
@ 2010-12-05 11:37   ` Jonathan Nieder
  2010-12-08 18:26     ` Tomas Carnecky
  2011-01-05 23:39     ` Status of the svn remote helper project (Jan 2011, #1) Jonathan Nieder
  1 sibling, 2 replies; 22+ messages in thread
From: Jonathan Nieder @ 2010-12-05 11:37 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, Sverre Rabbelier, David Barr, Sam Vilain,
	Stephen Bash, Tomas Carnecky

It's been a couple of months since I mentioned that it would be
possible for a simple svn remote helper on top of svn-fe to be usable
in a couple of weeks.  I still believe a read-only importer is _just_
within grasp; but there are plenty of tempting and interesting topics
to work on in the stack below that, too.

That said, the collection of topics cooking is already satisfying.
Time for cleaning up additional branches for the tree seems short;
patch series and branches that are ready for wider testing as-is would
be fantastic (and should be based on a merge of 'master' and as few of
the topics listed below as possible).

Excluding changes taken from the 'jch' branch, relative to last time
we have

 Makefile                          |    4 +-
 contrib/svn-fe/svn-filter-root.py |  107 ++++++++++++++++++++++++
 contrib/svn-fe/testme.sh          |    4 +-
 fast-import.c                     |  164 ++++++++++++++++++++++++++++++++++---
 t/t9011-svn-da.sh                 |   10 +-
 t/t9300-fast-import.sh            |    6 +-
 test-treap.c                      |   11 ++-
 vcs-svn/repo_tree.c               |    4 +-
 vcs-svn/sliding_window.c          |   11 ++-
 vcs-svn/trp.h                     |    3 +-
 vcs-svn/trp.txt                   |   10 ++-
 11 files changed, 298 insertions(+), 36 deletions(-)

The added fast-import code implements the 'ls' command.

Most of the changes are one-liners and small cleanups.  The most
visible change in practice should be the fix for the mysterious
historical-files-appearing-as-directories bug in repo_tree.

As always, a merge of the branches listed below is available as

	git://repo.or.cz/git/jrn.git vcs-svn-pu

and individual topic branches are also available in that repository
under the refs/topics namespace.

--------------------------------------------------
[Cooking]
* jn/svndiff0 (2010-11-06) 24 commits
- vcs-svn: Allow deltas to copy from preimage
- vcs-svn: Reject deltas that read past end of preimage
- vcs-svn: Let deltas use data from postimage
- vcs-svn: Reject deltas that do not consume all inline data
- vcs-svn: Check declared number of output bytes
- vcs-svn: Implement copyfrom_data delta instruction
- vcs-svn: Read instructions from deltas
- vcs-svn: Read inline data from deltas
- vcs-svn: Read the preimage while applying deltas
- vcs-svn: Skeleton of an svn delta parser
- compat: helper for detecting unsigned overflow
- vcs-svn: Learn to check for SVN\0 magic
- vcs-svn: Learn to parse variable-length integers
- vcs-svn: Add code to maintain a sliding view of a file
- vcs-svn: Allow character-oriented input
- vcs-svn: Allow input errors to be detected early
- vcs-svn: Let callers peek ahead to find stream end
- vcs-svn: Add binary-safe read() function
- vcs-svn: Improve support for reading large files
- vcs-svn: Make buffer_skip_bytes() report partial reads
- vcs-svn: Teach line_buffer to handle multiple input files
- vcs-svn: Collect line_buffer data in a struct
- vcs-svn: Replace buffer_read_string() memory pool with a strbuf
- vcs-svn: Eliminate global byte_buffer[] array

Well tested and should be ready for wide use.

Is there some tool (a hex editor?) that can be used to easily read
and write deltas?  The "printf and test against 'svnadmin load'"
method is a bit time-consuming.

* db/recognize-v3 (2010-11-20) 2 commits
- vcs-svn: Allow simple v3 dumps (no deltas yet)
- vcs-svn: Error out for v3 dumps

A bugfix and the framework for a feature.  Probably the bugfix
should be pushed toward 'maint'.

* db/prop-delta (2010-11-20) 16 commits
- vcs-svn: Simplify handling of deleted properties
- vcs-svn: Implement Prop-delta handling
- vcs-svn: Sharpen parsing of property lines
- vcs-svn: Split off function for handling of individual properties
- vcs-svn: Make source easier to read on small screens
- vcs-svn: More dump format sanity checks
- vcs-svn: Reject path nodes without Node-action
- vcs-svn: Delay read of per-path properties
- vcs-svn: Combine repo_replace and repo_modify functions
- vcs-svn: Replace = Delete + Add
- vcs-svn: handle_node: Handle deletion case early
- vcs-svn: Use mark to indicate nodes with included text
- vcs-svn: Unclutter handle_node by introducing have_props var
- vcs-svn: Eliminate node_ctx.mark global
- vcs-svn: Eliminate node_ctx.srcRev global
- vcs-svn: Check for errors from open()
(this branch uses db/recognize-v3.)

All but the tip commit are from 'pu'.

* db/fast-import-blob-access (2010-12-04) 5 commits
- fast-import: add 'ls' command
- fast-import: Allow cat-blob requests at arbitrary points in stream
- fast-import: let importers retrieve blobs
- fast-import: clarify documentation of "feature" command
- fast-import: stricter parsing of integer options

A proof of concept for the protocol targeting another repository
format (hg?) would be a great comfort.

Synchronization overhead seems to be a big problem.  If someone can
use "top -b" output to produce a nice timechart explaining this then
I would be happy to take a look.

* db/fast-import-object-reuse (2010-11-24) 1 commit
- fast-import: insert new object entries at start of hash bucket

A speedup.  Acked by Shawn, taken from 'pu'.

* jn/fast-import-ondemand-checkpoint (2010-11-24) 1 commit
- fast-import: treat SIGUSR1 as a request to access objects early

Taken from 'pu'.

* jn/svn-fe-makefile (2010-12-04) 1 commit
- Makefile: dependencies for vcs-svn tests

Seems to have been forgotten?  Simplified commit message, otherwise
unchanged.

* xx/thinner-wrapper-svndiff0 (2010-11-07) 1 commit
- svn-fe: stop linking to libz and libxdiff
(this branch uses jn/svndiff0 and jn/thinner-wrapper.)

Simplification.  jn/thinner-wrapper is part of 'jch'.

* rr/svnfe-tests-no-perl (2010-11-23) 1 commit
- t9010 (svn-fe): Eliminate dependency on svn perl bindings

From 'pu'.

* db/text-delta (2010-11-20) 10 commits
- fixup! vcs-svn: tweak sliding window code to tolerate excessive
  readahead
- fixup! svn-fe: Test script for handling of dumps with --deltas
- svn-fe: Test script for handling of dumps with --deltas
- vcs-svn: Implement text-delta handling
- vcs-svn: Teach line_buffer about temporary files
- vcs-svn: tweak sliding window code to tolerate excessive readahead
- vcs-svn: Let caller set up sliding window for delta preimage
- vcs-svn: Read delta preimage from file descriptor
- vcs-svn: Introduce fd_buffer routines
- vcs-svn: Introduce repo_read_path to check the content at a path
- vcs-svn: Internal fast_export_save_blob helper
- Merge branch 'db/fast-import-blob-access' (early part) into
  db/text-delta
- Merge branch 'jn/svndiff0' into db/text-delta
(this branch uses db/recognize-v3, db/prop-delta, db/fast-import-blob-access,
and jn/svndiff0.)

It works!

* db/svn-extract-branches (2010-11-20) 1 commit
- svn-fe: Script to remap svn history

Very rough but let's merge it so it doesn't get forgotten.

* jn/maint-svn-fe (2010-12-05) 2 commits
- vcs-svn: fix intermittent repo_tree corruption
- treap: make treap_insert return inserted node

Fixes an old bug.  Hoping for feedback or an ack from someone familiar
with svn-fe internals; afterwards, would fast-track to maint.

---------------------------------------------------
[Graduated]
* jn/thinner-wrapper (2010-11-06) 7 commits
 - Remove pack file handling dependency from wrapper.o
 - pack-objects: mark file-local variable static
 - wrapper: give zlib wrappers their own translation unit
 - strbuf: move strbuf_branchname to sha1_name.c
 - path helpers: move git_mkstemp* to wrapper.c
 - wrapper: move odb_* to environment.c
 - wrapper: move xmmap() to sha1_file.c

Part of 'jch' now.

--------------------------------------------------
[Out of tree, stalled]

* tc/remote-helper-usability: $gmane/157860
 . Register new packs after the remote helper is done fetching
 . Properly record history of the notes ref
 . Fix ls-remote output when displaying impure refs
 . Add git-remote-svn
 . Introduce the git fast-import-helper
 . Rename get_mode() to decode_tree_mode() and export it
 . Allow the transport fetch command to add additional refs
 . Allow more than one keepfile in the transport
 . Remote helper: accept ':<value> <name>' as a response to 'list'

The fourth-from-top seems a bit hard to review.  If it really is
necessary to introduce a separate program with a separate interface,
maybe a compile-time flag to choose between them would help?

* rr/remote-helper: http://github.com/artagnon/git
 . remote-svn: Write in fetch functionality
 . run-command: Protect the FD 3 from being grabbed
 . remote-svn: Build a pipeline for the import using svnrdump
 . run-command: Extend child_process to include a backchannel FD
 . Allow the transport fetch command to add additional refs
 . Remote helper: accept ':<value> <name>' as a response to 'list'
 . test-svn-fe: Allow for a dumpfile on stdin
 . contrib/svn-fe: Fast script to remap svn history
 . Add Tom's remote helper for reference
 . Add a stubby remote-svn remote helper
 . Add a correct svndiff applier

Work in progress, waiting on lower levels to be more functional
(in particular, svn-fe does not support incremental imports yet).

* sb/svn-fe-example: $gmane/159054


* Re: Status of the svn remote helper project (Dec 2010, #1)
  2010-12-05 11:37   ` Status of the svn remote helper project (Dec 2010, #1) Jonathan Nieder
@ 2010-12-08 18:26     ` Tomas Carnecky
  2010-12-12  6:14       ` fast-import tweaks for remote helpers (Re: Status of the svn remote helper project (Dec 2010, #1)) Jonathan Nieder
  2011-01-05 23:39     ` Status of the svn remote helper project (Jan 2011, #1) Jonathan Nieder
  1 sibling, 1 reply; 22+ messages in thread
From: Tomas Carnecky @ 2010-12-08 18:26 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, David Barr,
	Sam Vilain, Stephen Bash

  On 12/5/10 12:37 PM, Jonathan Nieder wrote:
> --------------------------------------------------
> [Out of tree, stalled]
>
> * tc/remote-helper-usability: $gmane/157860
>   . Register new packs after the remote helper is done fetching
>   . Properly record history of the notes ref
>   . Fix ls-remote output when displaying impure refs
>   . Add git-remote-svn
>   . Introduce the git fast-import-helper
>   . Rename get_mode() to decode_tree_mode() and export it
>   . Allow the transport fetch command to add additional refs
>   . Allow more than one keepfile in the transport
>   . Remote helper: accept ':<value>  <name>' as a response to 'list'
>
> The fourth-from-top seems a bit hard to review.  If it really is
> necessary to introduce a separate program with a separate interface,
> maybe a compile-time flag to choose between them would help?

I simplified the code and the requirements on fast-import are much 
lighter now. All I need is a way to tell fast-import to stop writing 
refs and after each commit write its sha1 to stdout. It's possible to 
modify fast-import.c with a small patch to make it behave like that. 
However, I haven't followed the svn remote helper that much lately so I 
don't know whether one of the other patches already modifies fast-import 
in this way.

From the beginning my code was meant to be just an example of what the
interaction between git and the svn remote helper could look like. For
example I save the svn rev <-> sha1 mapping in notes, which appears
to work well. I'll take a look at whether I can use svn-fe in my
script.

tom


* fast-import tweaks for remote helpers (Re: Status of the svn remote helper project (Dec 2010, #1))
  2010-12-08 18:26     ` Tomas Carnecky
@ 2010-12-12  6:14       ` Jonathan Nieder
  2010-12-12  9:53         ` Sam Vilain
  0 siblings, 1 reply; 22+ messages in thread
From: Jonathan Nieder @ 2010-12-12  6:14 UTC (permalink / raw)
  To: Tomas Carnecky
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, David Barr,
	Sam Vilain, Stephen Bash

Tomas Carnecky wrote:

> I simplified the code and the requirements on fast-import are much
> lighter now. All I need is a way to tell fast-import to stop writing
> refs and after each commit write its sha1 to stdout.

That's good to hear.  What should be the syntax for asking fast-import
not to write to a ref?  Something like this?

	commit
	mark :1
	committer c o mitter <committer@example.com> now
	data <<END
	...

Writing the sha1 as each commit is written: how early does the
frontend need access to the sha1?  Would a facility to report marks
back to the frontend at the end of the stream take care of it?

Based on [1] I guess the main need is some way for fast-import to
tell the transport machinery what refs the transport machinery
should update (or at least ought to report as updated).  A hackish
way might be to make the remote helper send "progress" commands
with that information.

> It's possible to
> modify fast-import.c with a small patch to make it behave like
> that. However, I haven't followed the svn remote helper that much
> lately so I don't know whether one of the other patches already
> modifies fast-import in this way.

No, the patches have mostly been adding commands that send information
back to the frontend.

 cat-blob (<dataref> | <mark>):
	Sends back an old blob along with its length (in
	cat-file --batch format).  svn-fe uses this to acquire
	the preimage when applying deltas.

 ls <quoted-path>:
	Sends back information about the current state of a path
	in the commit being prepared (as a single line in ls-tree
	format).  svn-fe uses this to move around files and to find
	a <dataref> to use with cat-blob when applying deltas.

 ls (<dataref> | <mark>) <path>:
	Sends back information about a path in a previous revision
	(tag, commit, or tree), in ls-tree format.

 M 040000 (<dataref> | <mark>) <path>:
	Like "M 100644 <dataref> <path>", replaces an entry in the
	active commit with content of the frontend's choice.  This
	gets used to copy in old directories.
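
As an illustration of the cat-blob reply described above, here is a
minimal sketch in C of how a frontend might parse the response header.
It assumes the header line has the cat-file --batch shape
"<sha1> SP blob SP <size> LF".  The name parse_cat_blob_header is
hypothetical; it is not taken from svn-fe or fast-import.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a cat-blob response header of the assumed form
 *   <40 lowercase hex digits> SP "blob" SP <decimal size> LF
 * Returns 0 and stores the blob length in *len on success,
 * -1 if the line is malformed.  (Hypothetical helper.)
 */
int parse_cat_blob_header(const char *line, unsigned long *len)
{
	static const char type[] = " blob ";
	size_t hexlen = strspn(line, "0123456789abcdef");
	char *end;

	if (hexlen != 40)
		return -1;
	line += hexlen;
	if (strncmp(line, type, strlen(type)))
		return -1;
	line += strlen(type);
	/* Require a digit so strtoul cannot accept a sign or whitespace. */
	if (*line < '0' || *line > '9')
		return -1;
	errno = 0;
	*len = strtoul(line, &end, 10);
	/* The header must end with exactly one newline. */
	if (errno || *end != '\n' || end[1] != '\0')
		return -1;
	return 0;
}
```

After a successful parse the caller would read exactly *len bytes of
blob content, followed by the single newline that terminates the
response.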

> From the beginning my code was meant to be just an example of what
> the interaction between git and the svn remote helper could look like.

It makes a nice demo, too. :)

> For example I save the svn rev <-> sha1 mapping in notes, which
> appears to work well. I'll take a look if I'll be able to use the
> svn-fe in my script.

svn-fe needs a fast mapping svn rev -> sha1; it currently uses a marks
file for that.  (In the back of my mind, I have the idea of using a
file that allows O(1) access, perhaps of the form

	<commit name for rev 1> NL
	<commit name for rev 2> NL
	...

but as Ram has noted, keeping the whole table in memory is pretty
cheap already.)  A remote helper needs a fast mapping sha1 -> svn rev,
and imho notes are ideal for that[2].
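
A minimal sketch of the O(1) lookup such a file would allow, assuming
every record is exactly 40 hex digits plus a newline and that revisions
are numbered from 1 with no gaps.  The name lookup_rev and the record
layout constants are hypothetical, not existing svn-fe code.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define HEX_LEN 40			/* SHA-1 commit name in hex */
#define RECORD_LEN (HEX_LEN + 1)	/* plus the trailing NL */

/*
 * Look up the commit name for svn revision `rev` in a file of
 * fixed-width "<commit name> NL" records.  Because all records have
 * the same width, the record for rev N starts at (N - 1) * RECORD_LEN.
 * Returns 0 and fills out[] (NUL-terminated) on success, -1 if the
 * record is missing or malformed.
 */
int lookup_rev(FILE *map, uint32_t rev, char out[HEX_LEN + 1])
{
	char record[RECORD_LEN];

	if (!rev || rev - 1 > LONG_MAX / RECORD_LEN)
		return -1;
	if (fseek(map, (long) (rev - 1) * RECORD_LEN, SEEK_SET))
		return -1;
	if (fread(record, 1, RECORD_LEN, map) != RECORD_LEN)
		return -1;
	if (record[HEX_LEN] != '\n')
		return -1;
	memcpy(out, record, HEX_LEN);
	out[HEX_LEN] = '\0';
	return 0;
}
```

The cost is one seek and one 41-byte read per lookup, independent of
the number of revisions in the file.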

The way I imagine it, the authoritative mapping is in notes and the
reverse mapping (e.g. in a marks file) is rebuilt when needed.

[1] remote-helper branch at git://github.com/wereHamster/git.git
[2] Why?  When a project switches from one svn server to another,
revision numbers tend to change, so revision numbers are not permanent
enough to belong in the commit message imho.  (If only git-notes had
existed when git svn was written...)


* Re: fast-import tweaks for remote helpers (Re: Status of the svn remote helper project (Dec 2010, #1))
  2010-12-12  6:14       ` fast-import tweaks for remote helpers (Re: Status of the svn remote helper project (Dec 2010, #1)) Jonathan Nieder
@ 2010-12-12  9:53         ` Sam Vilain
  2010-12-12 17:16           ` fast-import tweaks for remote helpers Jonathan Nieder
  0 siblings, 1 reply; 22+ messages in thread
From: Sam Vilain @ 2010-12-12  9:53 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Tomas Carnecky, git, Ramkumar Ramachandra, Sverre Rabbelier,
	David Barr, Sam Vilain, Stephen Bash

On 12/12/10 19:14, Jonathan Nieder wrote:
> That's good to hear. What should be the syntax for asking fast-import
> not to write to a ref?  Something like this?
>
> 	commit
> 	mark :1
> 	committer c o mitter <committer@example.com> now
> 	data <<END
> 	...
>
> Writing the sha1 as each commit is written: how early does the
> frontend need access to the sha1?  Would a facility to report marks
> back to the frontend at the end of the stream take care of it?

What happened to --report-fd ?

> (In the back of my mind, I have the idea of using a
> file that allows O(1) access, perhaps of the form
>
> 	<commit name for rev 1>  NL
> 	<commit name for rev 2>  NL
> 	...

This doesn't scale to many branches; git-svn started with that and had 
to use a b-tree in the end.  Eg, consider a repository with 10,000 
branches and 600,000 revisions.
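
A rough back-of-the-envelope sketch of that concern, under the
assumption (not stated above) that each branch keeps its own dense
table with one 41-byte record per repository revision.  The name
flat_map_bytes and the record size constant are hypothetical.

```c
#include <stdint.h>

#define BYTES_PER_RECORD 41	/* assumed: 40 hex digits plus NL */

/*
 * Bytes needed if every branch keeps a dense table with one record
 * per revision in the repository.
 */
uint64_t flat_map_bytes(uint64_t revisions, uint64_t branches)
{
	return revisions * branches * BYTES_PER_RECORD;
}
```

With 600,000 revisions that is 24.6 MB per branch, and with 10,000
branches about 246 GB in total, most of it describing revisions that
never touched the branch in question.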

Thanks for continuing this work, it is most interesting to follow.

Cheers,
Sam


* Re: fast-import tweaks for remote helpers
  2010-12-12  9:53         ` Sam Vilain
@ 2010-12-12 17:16           ` Jonathan Nieder
  2011-01-05 21:20             ` fast-import --report-fd (Re: fast-import tweaks for remote helpers) Jonathan Nieder
  0 siblings, 1 reply; 22+ messages in thread
From: Jonathan Nieder @ 2010-12-12 17:16 UTC (permalink / raw)
  To: Sam Vilain
  Cc: Tomas Carnecky, git, Ramkumar Ramachandra, Sverre Rabbelier,
	David Barr, Stephen Bash

Sam Vilain wrote:

> What happened to --report-fd ?

The patch still works.  The main problem with report-fd is that it
introduced a synchronization point after every commit: the frontend
has to read the commit id before fast-import will continue.

So if the reports can be made optional ("report _this_ commit") or
batched ("report all marked commits") then the result would be easier
to use imho.

>> (In the back of my mind, I have the idea of using a
>> file that allows O(1) access, perhaps of the form
>>
>>	<commit name for rev 1>  NL
>>	<commit name for rev 2>  NL
>>	...
>
> This doesn't scale to many branches

Right.  Another reason to delay getting rid of the git branch with the
full repo.


* fast-import --report-fd (Re: fast-import tweaks for remote helpers)
  2010-12-12 17:16           ` fast-import tweaks for remote helpers Jonathan Nieder
@ 2011-01-05 21:20             ` Jonathan Nieder
  0 siblings, 0 replies; 22+ messages in thread
From: Jonathan Nieder @ 2011-01-05 21:20 UTC (permalink / raw)
  To: Sam Vilain
  Cc: Tomas Carnecky, git, Ramkumar Ramachandra, Sverre Rabbelier,
	David Barr, Stephen Bash

Jonathan Nieder wrote:
> Sam Vilain wrote:

>> What happened to --report-fd ?
>
> The patch still works.  The main problem with report-fd is that it
> introduced a synchronization point after every commit: the frontend
> has to read the commit id before fast-import will continue.

Correction: more precisely, that was and is the main problem with
svn-fe's use of bidirectional communication.  An application like
Tom's remote helper would probably not suffer so much from it, since
commit ids are just queued up as long as the pipe doesn't fill before
the frontend reads any.  It is transactions like

	FI>	r5 = 734987a9878b97c879c798a897c897ac
	FE>	cat 734987a9878b97c879c798a897c897ac
	FI>	734987a9878b97c879c798a897c897ac commit 448
		tree 8d5bcf0f24bdfea1fdab8d39ba3c8ba91a52547c
		parent 84279592b8b5816d00300ba5d4412adf05cc80d6
		parent 3ca7353cab4ed6c7efac0c8d7477c87112fc7350
		author Junio C Hamano <gitster@pobox.com> 1294187068 -0800
		committer Junio C Hamano <gitster@pobox.com> 1294187068 -0800

		Merge branch 'sr/gitweb-hilite-more' into pu

		* sr/gitweb-hilite-more:
		  gitweb: remove unnecessary test when closing file descriptor
		  gitweb: add extensions to highlight feature map

	FE>	cat 8d5bcf0f24bdfea1fdab8d39ba3c8ba91a52547c "main.c"

(i.e., round-trips) that were and are creating overhead in svn-fe.
See [1] if curious about details.

So please don't be dissuaded by the nonsense I sent. :)

[1] http://colabti.org/irclogger/irclogger_log_search/git-devel?search=overhead&action=search


* Status of the svn remote helper project (Jan 2011, #1)
  2010-12-05 11:37   ` Status of the svn remote helper project (Dec 2010, #1) Jonathan Nieder
  2010-12-08 18:26     ` Tomas Carnecky
@ 2011-01-05 23:39     ` Jonathan Nieder
  2011-01-07 14:00       ` David Michael Barr
  2011-02-11  9:09       ` Plans for the vcs-svn-pu branch Jonathan Nieder
  1 sibling, 2 replies; 22+ messages in thread
From: Jonathan Nieder @ 2011-01-05 23:39 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, Sverre Rabbelier, David Barr, Sam Vilain,
	Stephen Bash, Tomas Carnecky

Here are the topics that are cooking in vcs-svn-pu.

Hopefully v1.7.4-rc0 is treating you well and free git time has been
going into finding and fixing regressions.  So if you are bored and
looking for something to do, please skip to [1] and ignore the rest of
this message.

December was a busy month.  Excluding changes from the 'jch' branch:

 39 files changed, 1472 insertions(+), 2020 deletions(-)

That breaks down as

   1.7% Documentation/
   1.8% contrib/svn-fe/
  17.5% t/
  65.4% vcs-svn/

The other 13%:

 .gitignore                        |    3 -
 Makefile                          |   17 +--
 fast-import.c                     |  134 ++++++++-----
 quote.h                           |    3 +-

Users will probably notice that svn-fe requires ls and cat-blob
support from fast-import (alas); hopefully the lower memory footprint,
code simplification and incremental import support can justify that
cost.

The fast-import changes are a mixture of enhancements to the "ls"
command and general optimizations.  The optimizations are not in their
final form and likely have bugs but are being sent out now for some
early exposure.

As always, a merge of the branches listed below is available as

	git://repo.or.cz/git/jrn.git vcs-svn-pu

and individual topic branches are available in that repository under
the refs/topics namespace.

Let's get svn-fe3 polished so when the next merge window comes around
it is ready to be merged quickly.  

--------------------------------------------------
[Graduated to "master"]
* db/fast-import-object-reuse (2010-11-24) 1 commit
 - fast-import: insert new object entries at start of hash bucket

* jn/fast-import-ondemand-checkpoint (2010-11-24) 1 commit
 - fast-import: treat SIGUSR1 as a request to access objects early

* jn/svn-fe-makefile (2010-12-04) 1 commit
 - Makefile: dependencies for vcs-svn tests

* rr/svnfe-tests-no-perl (2010-11-23) 1 commit
 - t9010 (svn-fe): Eliminate dependency on svn perl bindings

* jn/maint-svn-fe (2010-12-05) 2 commits
 - vcs-svn: fix intermittent repo_tree corruption
 - treap: make treap_insert return inserted node

* db/fast-import-blob-access (2010-12-04) 4 commits
 - fast-import: Allow cat-blob requests at arbitrary points in stream
 - fast-import: let importers retrieve blobs
 - fast-import: clarify documentation of "feature" command
 - fast-import: stricter parsing of integer options

The old tip commit that adds an 'ls' command was reworked (see below).

--------------------------------------------------
[New Topics]
* jn/line-buffer-error (2010-12-28) 4 commits
 - vcs-svn: improve reporting of input errors
 - vcs-svn: make buffer_copy_bytes return length read
 - vcs-svn: make buffer_skip_bytes return length read
 - vcs-svn: allow input errors to be detected promptly

From jn/svndiff0 but expanded, waiting for feedback from the list.
These let the calling function take care of reporting the error with
more context; is that worth it or would it make more sense to die()
directly?

* jn/line-buffer-large-file (2010-12-24) 1 commit
 - vcs-svn: improve support for reading large files

From jn/svndiff0.  Will merge soon if there are no objections.

* jn/line-buffer (2011-01-02) 14 commits
 - Merge branch 'jn/line-buffer-large-file' into jn/line-buffer
 - Merge branch 'jn/line-buffer-error' into jn/line-buffer
 - vcs-svn: teach line_buffer about temporary files
 - vcs-svn: allow input from file descriptor
 - vcs-svn: allow character-oriented input
 - vcs-svn: add binary-safe read function
 - t0081 (line-buffer): add buffering tests
 - vcs-svn: tweak test-line-buffer to not assume line-oriented input
 - tests: give vcs-svn/line_buffer its own test script
 - vcs-svn: make test-line-buffer input format more flexible
 - vcs-svn: teach line_buffer to handle multiple input files
 - vcs-svn: collect line_buffer data in a struct
 - vcs-svn: replace buffer_read_string memory pool with a strbuf
 - vcs-svn: eliminate global byte_buffer
 (this branch uses jn/line-buffer-error and jn/line-buffer-large-file.)

From jn/svndiff0 and db/text-delta.  Putting temporary files with
meaningless names in /tmp is unfortunate (maybe buffer_tmpfile_init
could use a filename prefix argument?).

* jn/unsigned-overflow (2010-12-25) 1 commit
 - compat: helper for detecting unsigned overflow

From jn/svndiff0.  Should submit for separate inclusion.

* jn/sliding-window (2011-01-02) 1 commit
 - vcs-svn: learn to maintain a sliding view of a file
 (this branch uses jn/line-buffer, jn/line-buffer-error, and
  jn/line-buffer-large-file.)

From jn/svndiff0 but clarified somewhat.

* db/vcs-svn-incremental (2011-01-05) 20 commits
 - svn-fe: WIP testme.sh performance enhancements
 - svn-fe: testme.sh update
 - vcs-svn: use mark from previous import for parent commit
 - vcs-svn: handle filenames with dq correctly
 - vcs-svn: quote paths correctly for ls command
 - vcs-svn: eliminate repo_tree structure
 - vcs-svn: add a comment before each commit
 - vcs-svn: simplify repo_modify_path and repo_copy
 - vcs-svn: prepare to eliminate repo_tree structure
 - vcs-svn: do not rely on marks for old blobs
 - vcs-svn: split off function to export result from delta application
 - vcs-svn: make apply_delta caller retrieve preimage
 - vcs-svn: explicitly close streams used for delta application at exit
 - vcs-svn: introduce cat_mark function to retrieve a marked blob
 - vcs-svn: save marks for imported commits
 - vcs-svn: use higher mark numbers for blobs
 - vcs-svn: check for errors reading from cat-blob-fd
 - quote.h: simplify the inclusion
 - Makefile: update dependencies for test-svn-fe.c
 - Merge branch 'db/fast-import-blob-access' into db/vcs-svn-incremental
 (this branch uses db/text-delta, db/prop-delta, jn/svndiff0,
  jn/sliding-window, jn/line-buffer, jn/line-buffer-error,
  jn/line-buffer-large-file, and db/fast-import-blob-access.)

Support for importing different revs in different svn-fe runs.

* db/optimize-vcs-svn (2011-01-05) 9 commits
 - vcs-svn: use strchr to find RFC822 delimiter
 - vcs-svn: drop obj_pool.h
 - vcs-svn: drop trp.h
 - vcs-svn: drop string_pool
 - vcs-svn: factor out usage of string_pool
 - vcs-svn: implement perfect hash for top-level keys
 - vcs-svn: implement perfect hash for node-prop keys
 - vcs-svn: avoid using ls command twice
 - vcs-svn: pass paths through to fast-import
 (this branch uses db/vcs-svn-incremental, db/text-delta, db/prop-delta,
  jn/svndiff0, jn/sliding-window, jn/line-buffer, jn/line-buffer-error,
  jn/line-buffer-large-file, and db/fast-import-blob-access.)

The diffstat says it all. ;-)

* db/optimize-fast-import (2011-01-05) 3 commits
 - WIP
 - WIP
 - WIP Hash/bitmap combo
 (this branch uses db/fast-import-blob-access.)

Very rough.  Testers beware.

I suspect "struct hash_table" may provide a simpler approach to
avoiding filling fast-import's tables.

--------------------------------------------------
[Cooking]
* jn/svndiff0 (2011-01-05) 11 commits
 - vcs-svn: microcleanup in svndiff0 window-reading code
 - vcs-svn: let deltas use data from preimage
 - vcs-svn: let deltas use data from postimage
 - vcs-svn: verify that deltas consume all inline data
 - vcs-svn: implement copyfrom_data delta instruction
 - vcs-svn: read instructions from deltas
 - vcs-svn: read inline data from deltas
 - vcs-svn: read the preimage when applying deltas
 - vcs-svn: parse svndiff0 window header
 - vcs-svn: skeleton of an svn delta parser
 - Merge branch 'jn/unsigned-overflow' into jn/svndiff0
 (this branch uses jn/sliding-window, jn/line-buffer,
  jn/line-buffer-error, and jn/line-buffer-large-file.)

Well tested and should be ready for wide use.

Is there some tool (a hex editor?) that can be used to read and write
deltas?  The "printf and test against 'svnadmin load'" method is a bit
time-consuming.

* db/prop-delta (2010-12-09) 18 commits
 - vcs-svn: Simplify handling of deleted properties
 - vcs-svn: Allow change nodes for root of tree (/)
 - vcs-svn: Implement Prop-delta handling
 - vcs-svn: Sharpen parsing of property lines
 - vcs-svn: Split off function for handling of individual properties
 - vcs-svn: Make source easier to read on small screens
 - vcs-svn: More dump format sanity checks
 - vcs-svn: Reject path nodes without Node-action
 - vcs-svn: Delay read of per-path properties
 - vcs-svn: Combine repo_replace and repo_modify functions
 - vcs-svn: Replace = Delete + Add
 - vcs-svn: handle_node: Handle deletion case early
 - vcs-svn: Use mark to indicate nodes with included text
 - vcs-svn: Unclutter handle_node by introducing have_props var
 - vcs-svn: Eliminate node_ctx.mark global
 - vcs-svn: Eliminate node_ctx.srcRev global
 - vcs-svn: Check for errors from open()
 - Allow simple v3 dumps (no deltas yet)

All but the tip commit are in 'jch'.

* db/fast-import-blob-access (2011-01-03) 3 commits
 - fast-import: add 'ls' command
 - fast-import: treat filemodify with empty tree as delete
 - fast-import: clarify handling of cat-blob feature

A proof of concept for 'cat-blob' and 'ls' support targeting another
repository format (hg?) would be a great comfort.

Synchronization overhead seems to be a problem.  If someone can use
"top -b" output to produce a nice timechart then I would be happy to
take a look.

* db/text-delta (2011-01-04) 5 commits
 - svn-fe: Test script for handling of dumps with --deltas
 - vcs-svn: implement text-delta handling
 - vcs-svn: introduce repo_read_path to check the content at a path
 - Merge branch 'db/prop-delta' into db/text-delta
 - Merge branch 'jn/svndiff0' into db/text-delta
 (this branch uses db/prop-delta, jn/svndiff0, jn/sliding-window,
  jn/line-buffer, jn/line-buffer-error, and jn/line-buffer-large-file.)

Still seems to work. ;-)

* db/svn-extract-branches (2010-11-20) 1 commit
 - svn-fe: Script to remap svn history

Very rough but let's merge it so it doesn't get forgotten.

--------------------------------------------------
[Ejected]
* db/recognize-v3 (2010-11-20) 1 commit
 - vcs-svn: Allow simple v3 dumps (no deltas yet)

Not worth maintaining as a separate branch from db/prop-delta.

* xx/thinner-wrapper-svndiff0 (2010-11-07) 1 commit
 - svn-fe: stop linking to libz and libxdiff
 (this branch used jn/svndiff0.)

Fixed when jn/svndiff0 was rebased on top of jn/thinner-wrapper.

--------------------------------------------------
[Out of tree, stalled]

* tc/remote-helper-usability: $gmane/157860
 . Register new packs after the remote helper is done fetching
 . Properly record history of the notes ref
 . Fix ls-remote output when displaying impure refs
 . Add git-remote-svn
 . Introduce the git fast-import-helper
 . Rename get_mode() to decode_tree_mode() and export it
 . Allow the transport fetch command to add additional refs
 . Allow more than one keepfile in the transport
 . Remote helper: accept ':<value> <name>' as a response to 'list'

I hear there has been some work to make this work with the usual
fast-import.

* rr/remote-helper: http://github.com/artagnon/git
 . [WIP] Temporary commit
 . remote-svn: Write in fetch functionality
 . run-command: Protect the FD 3 from being grabbed
 . remote-svn: Build a pipeline for the import using svnrdump
 . run-command: Extend child_process to include a backchannel FD
 . Allow the transport fetch command to add additional refs
 . Remote helper: accept ':<value> <name>' as a response to 'list'
 . test-svn-fe: Allow for a dumpfile on stdin
 . Add Tom's remote helper for reference
 . Add a stubby remote-svn remote helper
 . Add a correct svndiff applier

Work in progress, waiting on lower levels to stabilize.

* sb/svn-fe-example: $gmane/159054

[1] Debian: 8 reports[2]
    Fedora: 6 reports[3]
    Gentoo: 1 report[4]
[2] http://bugs.debian.org/cgi-bin/pkgreport.cgi?src=git;include=tags:upstream;exclude=tags:fixed-upstream;exclude=severity:wishlist
[3] https://bugzilla.redhat.com/buglist.cgi?component=git&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED
[4] http://bugs.gentoo.org/buglist.cgi?short_desc_type=allwordssubstr&short_desc=dev-vcs/git&product=Gentoo+Linux&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED

* Re: Status of the svn remote helper project (Jan 2011, #1)
  2011-01-05 23:39     ` Status of the svn remote helper project (Jan 2011, #1) Jonathan Nieder
@ 2011-01-07 14:00       ` David Michael Barr
  2011-02-11  9:09       ` Plans for the vcs-svn-pu branch Jonathan Nieder
  1 sibling, 0 replies; 22+ messages in thread
From: David Michael Barr @ 2011-01-07 14:00 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
	Stephen Bash, Tomas Carnecky

Hi,

Jonathan wrote:
> Let's get svn-fe3 polished so when the next merge window comes around
> it is ready to be merged quickly.  

In short, I have tested the latest rollup and see no regressions when testing
against the ASF and KDE repos.

Thank you Jonathan for the hard work, the outstanding patches are
becoming quite numerous.

--
David Barr.

* Plans for the vcs-svn-pu branch
  2011-01-05 23:39     ` Status of the svn remote helper project (Jan 2011, #1) Jonathan Nieder
  2011-01-07 14:00       ` David Michael Barr
@ 2011-02-11  9:09       ` Jonathan Nieder
  2011-02-11 10:36         ` [PATCH] svn-fe: warn about experimental status Jonathan Nieder
  2011-02-11 15:49         ` Plans for the vcs-svn-pu branch Ramkumar Ramachandra
  1 sibling, 2 replies; 22+ messages in thread
From: Jonathan Nieder @ 2011-02-11  9:09 UTC (permalink / raw)
  To: git; +Cc: Ramkumar Ramachandra, David Barr, Sverre Rabbelier,
	Tomas Carnecky

(culled cc list for a new topic)
Hi,

Jonathan Nieder wrote:

> Here are the topics that are cooking in vcs-svn-pu.

Long time no pushout.  My old plan was to maintain a "vcs-svn" branch
and ask Junio to pull from it periodically.  I still plan to do that
since I think it's the only feasible way to roll out the larger
changes but I might be submitting smaller pieces as patches to the git
list directly.

The purpose of this email is to request help --- if you are
particularly interested in some subtopic then I would love to see a
patch series or pull request on the subject.  If I can forget about
vcs-svn-pu's existence and just review and accept ready-to-apply
patches on the topic as they come, then building a suitable vcs-svn
branch will become very, very simple.

The good news is that it's been a calm time, so most of these topics
are probably in good shape.

> * db/fast-import-blob-access (2011-01-03)
>  - fast-import: add 'ls' command

Needs a feature name.  Last seen at [1].

> * db/optimize-vcs-svn (2011-01-05) 9 commits
>  - vcs-svn: drop obj_pool.h
>  - vcs-svn: drop trp.h
>  - vcs-svn: drop string_pool
[...]

Applying this code removal early would make your local branch juggler
very, very happy.  Might make sense to pair it with some patches to
support cat-blob-fd in transport-helper's "import" support, so the UI
could be

	git fetch svn-dump::/tmp/file.dump

That could (perhaps) partially compensate for the loss of convenience
from requiring the cat-blob-fd to be plumbed correctly.  It's a shame
we weren't louder about svn-fe's command line usage being subject to
change.  Live and learn. :/
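
For anyone plumbing this by hand: with --cat-blob-fd, the frontend
writes a cat-blob command on the main stream and reads the reply from
the other descriptor.  As far as the fast-import documentation goes,
the reply starts with a header line of the form
"<sha1> SP 'blob' SP <size> LF", followed by the blob contents and a
trailing LF.  Below is a minimal C sketch of parsing that header.  The
name parse_cat_blob_header is made up for illustration and is not what
svn-fe uses.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse "<sha1> SP blob SP <size> LF".
 * Stores the 40-digit object name in sha1 and the blob size in *size.
 * Returns 0 on success, -1 if the line is malformed.
 */
static int parse_cat_blob_header(const char *line, char sha1[41], size_t *size)
{
	const char *p = line;
	char *end;
	unsigned long long n;

	/* exactly 40 lowercase hex digits */
	if (strspn(p, "0123456789abcdef") != 40)
		return -1;
	memcpy(sha1, p, 40);
	sha1[40] = '\0';
	p += 40;

	if (strncmp(p, " blob ", 6))
		return -1;
	p += 6;

	/* decimal size, terminated by LF */
	if (*p < '0' || *p > '9')
		return -1;
	n = strtoull(p, &end, 10);
	if (*end != '\n' || (size_t)n != n)
		return -1;
	*size = (size_t)n;
	return 0;
}
```

The caller would then read exactly *size bytes plus the trailing LF
from the same descriptor before sending the next command.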

>  (this branch uses db/vcs-svn-incremental, db/text-delta, db/prop-delta,
>   jn/svndiff0, jn/sliding-window, jn/line-buffer, jn/line-buffer-error,
>   jn/line-buffer-large-file, and db/fast-import-blob-access.)

That's silly --- this long overdue cleanup does not require anything
except db/fast-import-blob-access.  Building it on top of
db/prop-delta seems okay, since the latter is simple and known to work
well.

> * db/vcs-svn-incremental (2011-01-05) 20 commits
[...]
>  (this branch uses db/text-delta, db/prop-delta, jn/svndiff0,
>   jn/sliding-window, jn/line-buffer, jn/line-buffer-error,
>   jn/line-buffer-large-file, and db/fast-import-blob-access.)
>
> Support for importing different revs in different svn-fe runs.

Killer feature.  After db/optimize-vcs-svn it is easy.  It does not
need to wait for Text-delta and svndiff0 support.

[...]

With those three topics out of the way, the rest of the wishlist
should be easy.  Most of the patches are already written.

 - line-buffer enhancements to support Text-delta (multiple buffers,
   sliding window, temporary files)
 - svndiff0 parser/interpreter
 - Text-delta support
 - Prop-delta support
 - support for large files
 - improved error handling
 - fast-import scalability fixes
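
To give a flavour of the svndiff0 item: offsets and lengths in an
svndiff0 stream are stored as variable-length integers, seven bits per
byte, most significant group first, with the high bit set on every
byte except the last.  The following is a rough standalone sketch
under that assumption, not the code in jn/svndiff0; decode_varint is a
hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one variable-length integer from buf[0..len).
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or the value does not fit in 64 bits.
 */
static size_t decode_varint(const unsigned char *buf, size_t len, uint64_t *out)
{
	uint64_t val = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char c = buf[i];

		if (val > (UINT64_MAX >> 7))
			return 0;	/* the next shift would overflow */
		val = (val << 7) | (c & 0x7f);
		if (!(c & 0x80)) {	/* high bit clear: last byte */
			*out = val;
			return i + 1;
		}
	}
	return 0;	/* ran out of input before the final byte */
}
```

For example, the two bytes 0x81 0x02 decode to 130, and a lone 0x80 is
rejected as truncated.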

* [PATCH] svn-fe: warn about experimental status
  2011-02-11  9:09       ` Plans for the vcs-svn-pu branch Jonathan Nieder
@ 2011-02-11 10:36         ` Jonathan Nieder
  2011-02-11 15:49         ` Plans for the vcs-svn-pu branch Ramkumar Ramachandra
  1 sibling, 0 replies; 22+ messages in thread
From: Jonathan Nieder @ 2011-02-11 10:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ramkumar Ramachandra, David Barr, Sverre Rabbelier,
	Tomas Carnecky

svn-fe is young and some coming cleanups might involve backward
incompatible UI changes.  Add some words of warning to the manual so
early adopters that are not following the project closely don't get
burned.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Jonathan Nieder wrote:

> Applying this code removal early would make your local branch juggler
> very, very happy.  Might make sense to pair it with
[...]
> we weren't louder about svn-fe's command line usage being subject to
> change.  Live and learn. :/

That probably wasn't as clear as hoped.  All I mean is that svn-fe is
young and although the aforementioned cleanup involves a UI change
(going from

	... | svn-fe | git fast-import

to

	mkfifo replies
	... | svn-fe 3<replies | git fast-import --cat-blob-fd=3 3>replies

), such changes are to be expected.  So hopefully no one will be
confused, but it would be better to mention the experimental status
somewhere in svn-fe.txt, like so:

 contrib/svn-fe/svn-fe.txt |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.txt b/contrib/svn-fe/svn-fe.txt
index 35f84bd..cd075b9 100644
--- a/contrib/svn-fe/svn-fe.txt
+++ b/contrib/svn-fe/svn-fe.txt
@@ -18,6 +18,9 @@ Subversion repository mirrored on the local disk. Remote Subversion
 repositories can be mirrored on local disk using the `svnsync`
 command.
 
+Note: this tool is very young.  The details of its commandline
+interface may change in backward incompatible ways.
+
 INPUT FORMAT
 ------------
 Subversion's repository dump format is documented in full in
-- 
1.7.4

* Re: Plans for the vcs-svn-pu branch
  2011-02-11  9:09       ` Plans for the vcs-svn-pu branch Jonathan Nieder
  2011-02-11 10:36         ` [PATCH] svn-fe: warn about experimental status Jonathan Nieder
@ 2011-02-11 15:49         ` Ramkumar Ramachandra
  1 sibling, 0 replies; 22+ messages in thread
From: Ramkumar Ramachandra @ 2011-02-11 15:49 UTC (permalink / raw)
  To: Jonathan Nieder, Junio C Hamano
  Cc: git, David Barr, Sverre Rabbelier, Tomas Carnecky

Hi,

Jonathan Nieder writes:
> Jonathan Nieder wrote:
> 
> > Here are the topics that are cooking in vcs-svn-pu.

Thanks for the elaborate email. Some updates from my side:
- I've rewritten most of the svnload parser to resemble fast-import,
  and I'd like some preliminary feedback on the design.
- Although most of the dependent infrastructure is in place now, the
  remote-helper branch is still lagging. I'll shortly look into this.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
--8<--
/*
 * Produce a dumpfile v3 from a fast-import stream.
 * Load the dump into the SVN repository with:
 * svnrdump load <URL> <dumpfile
 *
 * Licensed under a two-clause BSD-style license.
 * See LICENSE for details.
 */

#include "cache.h"
#include "quote.h"
#include "git-compat-util.h"
#include "dump_export.h"
#include "dir_cache.h"

#define SVN_DATE_FORMAT "%Y-%m-%dT%H:%M:%S.000000Z"
#define SVN_DATE_LEN 27

struct ident
{
	struct strbuf name, email;
	char date[SVN_DATE_LEN + 1];
};

static FILE *infile;
static struct strbuf command_buf = STRBUF_INIT;
static struct strbuf log_buf = STRBUF_INIT;
static struct strbuf path_s = STRBUF_INIT;
static struct strbuf path_d = STRBUF_INIT;
static struct strbuf svn_author = STRBUF_INIT;
static struct ident author = {STRBUF_INIT, STRBUF_INIT, ""};
static struct ident committer = {STRBUF_INIT, STRBUF_INIT, ""};

static int read_next_command(void)
{
	return strbuf_getline(&command_buf, infile, '\n');
}

static void populate_revprops(struct strbuf *props, size_t author_len,
			const char *author, size_t log_len, const char *log,
			size_t date_len, const char *date)
{
	strbuf_reset(props);
	/* size_t is not necessarily unsigned long; cast to match %lu */
	strbuf_addf(props, "K 10\nsvn:author\nV %lu\n%s\n",
		    (unsigned long)author_len, author);
	strbuf_addf(props, "K 7\nsvn:log\nV %lu\n%s\n",
		    (unsigned long)log_len, log);
	if (date_len)
		/* SVN doesn't like an empty svn:date value */
		strbuf_addf(props, "K 8\nsvn:date\nV %lu\n%s\n",
			    (unsigned long)date_len, date);
	strbuf_add(props, "PROPS-END\n", 10);
}

/* buf is modified in place (split at NULs), so it must be writable */
static void parse_ident(char *buf, struct ident *identp)
{
	char *t, *tz_off;
	int tz_off_buf;
	const struct tm *tm_time;

	/* John Doe <johndoe@email.com> 1170199019 +0530 */
	strbuf_reset(&(identp->name));
	strbuf_reset(&(identp->email));

	if (!buf)
		goto error;
	if (!(tz_off = strrchr(buf, ' ')))
		goto error;
	*tz_off++ = '\0';
	if (!(t = strrchr(buf, ' ')))
		goto error;
	*(t - 1) = '\0'; /* Ignore '>' from email */
	t++;
	tz_off_buf = atoi(tz_off);
	if (tz_off_buf > 1200 || tz_off_buf  < -1200)
		goto error;
	tm_time = time_to_tm(strtoul(t, NULL, 10), tz_off_buf);
	strftime(identp->date, SVN_DATE_LEN + 1, SVN_DATE_FORMAT, tm_time);
	if (!(t = strchr(buf, '<')))
		goto error;
	*(t - 1) = '\0'; /* Ignore ' <' from email */
	t++;

	strbuf_add(&(identp->email), t, strlen(t));
	strbuf_add(&(identp->name), buf, strlen(buf));
	return;
error:
	die("Malformed ident line: %s", buf);
}

static void skip_optional_lf(void)
{
	/* read from infile, which is not necessarily stdin */
	int term_char = fgetc(infile);
	if (term_char != '\n' && term_char != EOF)
		ungetc(term_char, infile);
}

static void parse_data(struct strbuf *dst)
{
	if (prefixcmp(command_buf.buf, "data "))
		die("Expected 'data n' command, found: %s", command_buf.buf);

	if (!prefixcmp(command_buf.buf + 5, "<<")) {
		char *term = xstrdup(command_buf.buf + 5 + 2);
		size_t term_len = command_buf.len - 5 - 2;

		strbuf_reset(&command_buf);
		for (;;) {
			if (read_next_command() == EOF)
				die("EOF in data (terminator '%s' not found)", term);
			if (term_len == command_buf.len
			    && !memcmp(term, command_buf.buf, term_len))
				break;
			if (dst) {
				strbuf_addbuf(dst, &command_buf);
				strbuf_addch(dst, '\n');
			} else
				printf("%s\n", command_buf.buf);
		}
		free(term);
	} else {
		uintmax_t length;

		length = strtoumax(command_buf.buf + 5, NULL, 10);
		if ((size_t)length < length)
			die("Data is too large to use in this context");
		if (!dst) {
			strbuf_reset(&command_buf);
			/* buffer_copy_bytes(&command_buf, (size_t)length); */
		} else
			strbuf_fread(dst, (size_t)length, infile);
	}

	skip_optional_lf();
}

static const char *get_mode(const char *str, uint16_t *modep)
{
	unsigned char c;
	uint16_t mode = 0;

	while ((c = *str++) != ' ') {
		if (c < '0' || c > '7')
			return NULL;
		mode = (mode << 3) + (c - '0');
	}
	*modep = mode;
	return str;
}

static void file_change_m(void)
{
	const char *p;
	const char *endp;
	uint16_t mode;
	enum node_kind kind;

	p = get_mode(command_buf.buf + 2, &mode);
	if (!p)
		die("Corrupt mode: %s", command_buf.buf);
	switch (mode) {
	case 0644:
	case S_IFREG | 0644:
		kind = NODE_KIND_NORMAL;
		break;
	case 0755:
	case S_IFREG | 0755:
		kind = NODE_KIND_EXECUTABLE;
		break;
	case S_IFLNK:
		kind = NODE_KIND_SYMLINK;
		break;
	case S_IFGITLINK:
		die("Gitlinks unsupported"); /* TODO */
	case S_IFDIR:
		die("Subdirectories unsupported"); /* TODO */
	default:
		die("Corrupt mode: %s", command_buf.buf);
	}

	if (!prefixcmp(p, "inline"))
		p += 6;
	else
		die ("Non-inlined data unsupported");
	if (*p++ != ' ')
		die("Missing space after dataref: %s", command_buf.buf);

	/* parse out path into path_d */
	strbuf_reset(&path_d);
	if (!unquote_c_style(&path_d, p, &endp)) {
		if (*endp)
			die("Garbage after path in: %s", command_buf.buf);
	} else
		strbuf_addstr(&path_d, p);

	dump_export_m(path_d.buf, kind);
	read_next_command();
	parse_data(NULL); /* parse data and write it to stdout */
}

static void file_change_d(void)
{
	const char *p;
	const char *endp;
	
	p = command_buf.buf + 2;
	/* parse out path into path_d */
	strbuf_reset(&path_d);
	if (!unquote_c_style(&path_d, p, &endp)) {
		if (*endp)
			die("Garbage after path in: %s", command_buf.buf);
	} else
		strbuf_addstr(&path_d, p);
	dump_export_d(path_d.buf);
}

static void file_change_cr(int rename)
{
	const char *p;
	const char *endp;

	p = command_buf.buf + 2;
	strbuf_reset(&path_s);
	if (!unquote_c_style(&path_s, p, &endp)) {
		if (*endp != ' ')
			die("Missing space after source: %s", command_buf.buf);
	} else {
		endp = strchr(p, ' ');
		if (!endp)
			die("Missing space after source: %s", command_buf.buf);
		strbuf_add(&path_s, p, endp - p);
	}

	endp++;
	if (!*endp)
		die("Missing destination: %s", command_buf.buf);

	p = endp;
	strbuf_reset(&path_d);
	if (!unquote_c_style(&path_d, p, &endp)) {
		if (*endp)
			die("Garbage after destination in: %s", command_buf.buf);
	} else
		strbuf_addstr(&path_d, p);

	/* TODO: Check C "path/to/subdir" "" */
	if (rename)
		dump_export_d(path_s.buf);
	dump_export_c(path_d.buf, path_s.buf, 0);
}

static void parse_new_commit(const char *branch)
{
	/* branch name is passed in by the caller and ignored for now */
	read_next_command();
	if (!prefixcmp(command_buf.buf, "mark :"))
		/* parse and ignore mark line */
		read_next_command();
	if (!prefixcmp(command_buf.buf, "author ")) {
		parse_ident(command_buf.buf + 7, &author);
		read_next_command();
	}
	if (!prefixcmp(command_buf.buf, "committer ")) {
		parse_ident(command_buf.buf + 10, &committer);
		read_next_command();
	}
	if (!committer.name.len)
		die("Missing committer line in stream");
	parse_data(&log_buf);
	read_next_command();
	if (!prefixcmp(command_buf.buf, "from "))
		/* TODO: Support copyfrom */
		read_next_command();
	while (!prefixcmp(command_buf.buf, "merge "))
		/* TODO: Support merges */
		read_next_command();

	/* file_change_* */
	while (command_buf.len > 0) {
		if (!prefixcmp(command_buf.buf, "M "))
			file_change_m();
		else if (!prefixcmp(command_buf.buf, "D "))
			file_change_d();
		else if (!prefixcmp(command_buf.buf, "R "))
			file_change_cr(1);
		else if (!prefixcmp(command_buf.buf, "C "))
			file_change_cr(0);
		else if (!prefixcmp(command_buf.buf, "N "))
			; /* ignored */
		else if (!prefixcmp(command_buf.buf, "ls "))
			goto error; /* TODO */
		else if (!strcmp("deleteall", command_buf.buf))
			goto error; /* TODO */
		else
			break;
		if (read_next_command() == EOF)
			break;
	}
	return;
error:
	die("Unsupported command: %s", command_buf.buf);
}

void parse_new_tag()
{
	/* TODO: Support tags */
	return;
}

void parse_reset_branch()
{
	/* TODO */
	return;
}

void build_svn_author(struct ident *author, struct ident *committer)
{
	char *t, *email;

	strbuf_reset(&svn_author);
	email = author->email.len ? author->email.buf : committer->email.buf;
	if ((t = strchr(email, '@')))
		strbuf_add(&svn_author, email, t - email);
	else
		/* no '@' found, so t is NULL here; use the whole address */
		strbuf_addstr(&svn_author, email);
}

void svnload_read(void)
{
	char *val;
	while (read_next_command() != EOF) {
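		/*
		 * Point val past the command name but leave command_buf
		 * intact, so the prefixcmp() checks below can still match.
		 */
		if ((val = strchr(command_buf.buf, ' ')))
			val++;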

		if (!strcmp("blob", command_buf.buf))
			die("Non-inlined blobs unsupported");
		else if (!prefixcmp(command_buf.buf, "ls "))
			goto error; /* TODO */
		else if (!prefixcmp(command_buf.buf, "cat-blob "))
			goto error; /* TODO */
		else if (!prefixcmp(command_buf.buf, "commit "))
			parse_new_commit(val);
		else if (!prefixcmp(command_buf.buf, "tag "))
			parse_new_tag(val);
		else if (!prefixcmp(command_buf.buf, "reset "))
			parse_reset_branch(val);
		else if (!strcmp(command_buf.buf, "checkpoint")
			|| !prefixcmp(command_buf.buf, "progress ")
			|| !prefixcmp(command_buf.buf, "feature ")
			|| !prefixcmp(command_buf.buf, "option "))
			; /* ignored */
		else
			goto error;
	}
	return; /* clean EOF: do not fall through to the error path */
error:
	die("Unsupported command: %s", command_buf.buf);
}

int svnload_init(const char *filename)
{
	if (!(infile = filename ? fopen(filename, "r") : stdin))
		die("Cannot open %s: %s", filename, strerror(errno));
	dump_export_init();
	return 0;
}
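
For reference, the revision property block that populate_revprops
builds above is a sequence of "K <keylen> LF <key> LF V <vallen> LF
<value> LF" records terminated by PROPS-END.  Here is a standalone
sketch of one such record without strbuf, assuming NUL-terminated
values (the function above takes explicit lengths instead);
append_prop is a hypothetical helper and not part of vcs-svn.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append one K/V property record to buf, starting at offset used.
 * Returns the new number of bytes used, or -1 if the record does
 * not fit in bufsize bytes (including the terminating NUL).
 */
static long append_prop(char *buf, size_t bufsize, size_t used,
			const char *key, const char *val)
{
	int n;

	if (used >= bufsize)
		return -1;
	n = snprintf(buf + used, bufsize - used, "K %lu\n%s\nV %lu\n%s\n",
		     (unsigned long)strlen(key), key,
		     (unsigned long)strlen(val), val);
	if (n < 0 || (size_t)n >= bufsize - used)
		return -1;	/* output was truncated */
	return (long)(used + (size_t)n);
}
```

Calling it with "svn:author" and "alice" produces
"K 10\nsvn:author\nV 5\nalice\n"; a caller would append "PROPS-END\n"
after the last record.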

Thread overview: 22+ messages
2010-11-07 11:21 Status of the svn remote helper project (Nov, 2010) Jonathan Nieder
2010-11-07 12:06 ` David Michael Barr
2010-11-08  3:56   ` David Barr
2010-11-08  6:11     ` Jonathan Nieder
2010-11-08  6:20       ` David Barr
2010-11-07 12:50 ` Ramkumar Ramachandra
2010-11-07 17:42   ` Jonathan Nieder
2010-11-21  6:31 ` Status of the svn remote helper project (Nov 2010, #2) Jonathan Nieder
2010-11-21  9:38   ` David Michael Barr
2010-11-21 23:06     ` Jonathan Nieder
2010-11-22  2:06       ` David Barr
2010-12-05 11:37   ` Status of the svn remote helper project (Dec 2010, #1) Jonathan Nieder
2010-12-08 18:26     ` Tomas Carnecky
2010-12-12  6:14       ` fast-import tweaks for remote helpers (Re: Status of the svn remote helper project (Dec 2010, #1)) Jonathan Nieder
2010-12-12  9:53         ` Sam Vilain
2010-12-12 17:16           ` fast-import tweaks for remote helpers Jonathan Nieder
2011-01-05 21:20             ` fast-import --report-fd (Re: fast-import tweaks for remote helpers) Jonathan Nieder
2011-01-05 23:39     ` Status of the svn remote helper project (Jan 2011, #1) Jonathan Nieder
2011-01-07 14:00       ` David Michael Barr
2011-02-11  9:09       ` Plans for the vcs-svn-pu branch Jonathan Nieder
2011-02-11 10:36         ` [PATCH] svn-fe: warn about experimental status Jonathan Nieder
2011-02-11 15:49         ` Plans for the vcs-svn-pu branch Ramkumar Ramachandra
