git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: David Barr <david.barr@cordelta.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Ramkumar Ramachandra <artagnon@gmail.com>,
	Sverre Rabbelier <srabbelier@gmail.com>
Subject: Re: [PATCH 7/7] svn-fe: Use the --report-fd feature
Date: Tue, 12 Oct 2010 18:59:13 -0500	[thread overview]
Message-ID: <20101012235913.GJ15587@burratino> (raw)
In-Reply-To: <1286891424-2067-8-git-send-email-david.barr@cordelta.com>

David Barr wrote:

> On one hand, this makes the interface much uglier.  But on the
> other hand, it makes it possible to retrieve blobs by name, a
> facility we will be using soon.

Stale log message.

> Helped-by: David Barr <david.barr@cordelta.com>

I think you wrote most of the patch at this point.

> --- a/t/t9010-svn-fe.sh
> +++ b/t/t9010-svn-fe.sh
> @@ -34,10 +34,10 @@ test_dump () {
>  		svnadmin load "$label-svn" < "$TEST_DIRECTORY/$dump" &&
>  		svn_cmd export "file://$PWD/$label-svn" "$label-svnco" &&
>  		git init "$label-git" &&
> -		test-svn-fe "$TEST_DIRECTORY/$dump" >"$label.fe" &&
>  		(
> -			cd "$label-git" &&
> -			git fast-import < ../"$label.fe"
> +			cd "$label-git" && mkfifo backflow

Missing && (not that it matters here, just good to get in the habit).

> --- a/test-svn-fe.c
> +++ b/test-svn-fe.c
> @@ -23,10 +23,11 @@ int main(int argc, char *argv[])
>  	if (argc == 5 && !strcmp(argv[1], "-d")) {
>  		struct line_buffer preimage = LINE_BUFFER_INIT;
>  		struct line_buffer delta = LINE_BUFFER_INIT;
> +		struct view preimage_view = {&preimage, 0, STRBUF_INIT};
>  		buffer_init(&preimage, argv[2]);
>  		buffer_init(&delta, argv[3]);
>  		if (svndiff0_apply(&delta, (off_t) strtoull(argv[4], NULL, 0),
> -				   &preimage, stdout))
> +				   &preimage_view, stdout))

This interface change is really neat.  Probably it should get its
own commit.

> --- a/vcs-svn/fast_export.c
> +++ b/vcs-svn/fast_export.c
> @@ -8,8 +8,10 @@
[...]
> +#define REPORT_FILENO 3
>  
>  static uint32_t first_commit_done;
>  
> @@ -30,10 +32,109 @@ void fast_export_modify(uint32_t depth, uint32_t *path, uint32_t mode,
>  	putchar('\n');
>  }
>  
> +static void fast_export_read_bytes(size_t len, char buf[len])
> +{
> +	if (read_in_full(REPORT_FILENO, buf, len) != len)
> +		warning("early eof. Expecting %"PRIu64" bytes", (uint64_t) len);
> +}
> +

Would it make sense for functions like this to use the line_buffer
module?  If we set the O_NONBLOCK flag on the report fd, I think it
could work (with buffer_fdopen()), but I'm not sure how portable that
is.

But maybe it's easier to use fds directly.

> +static void fast_export_expect(const char *string)
> +{
> +	const char *p;
> +	for (p = string; *p; p++) {
> +		char buf[1];
> +		if (xread(REPORT_FILENO, buf, 1) <= 0) {

Might be simpler to read strlen(string) bytes at a time with
read_in_full() (my fault, I know).

[...]
> +static size_t fast_export_read_integer_at_eol(void)
> +{
> +	size_t result = 0;
> +	for (;;) {
> +		char buf[1];
> +		if (xread(REPORT_FILENO, buf, 1) <= 0) {

Could use fscanf() if O_NONBLOCK is usable.  Otherwise I fear
the read(1) is needed. :(

[...]
> +void fast_export_parse_commit(char tree_id[SHA1_HEX_LENGTH])
> +{
> +	size_t len;
> +
> +	fast_export_skip_bytes(SHA1_HEX_LENGTH);
> +	fast_export_expect(" commit ");
> +	len = fast_export_read_integer_at_eol();

I think you mentioned that you'd prefer for this to use

	fast_export_expect(commit_id);

?  Anyway, this functionality is not used in the current patch.

[...]
> +void fast_export_save_blob(FILE *out)
[...]
> +	if (!out) {
> +		warning("cannot open temporary: %s", strerror(errno));
> +		out = fopen("/dev/null", "w");
> +	}
> +	if (!out) {
> +		warning("cannot open /dev/null: %s", strerror(errno));
> +		return;
> +	}

I think the caller should take care of the error message.

[...]
> +	if (ferror(out))
> +		warning("cannot write temporary: %s", strerror(errno));

And this, too, probably (since only the caller knows if it's
temporary).

>  }
>  
> +FILE *preimage = NULL;
> +FILE *postimage = NULL;

static?  And the usual git style is to leave the "= NULL" implicit for
globals.

>  void fast_export_blob(uint32_t mode, uint32_t mark, uint32_t len,
>  			uint32_t delta, uint32_t srcMark, uint32_t srcMode,
>  			struct line_buffer *input)
>  {
> +	struct line_buffer preimage_buf = LINE_BUFFER_INIT;
> +	struct line_buffer postimage_buf = LINE_BUFFER_INIT;
> +	struct view preimage_view = {&preimage_buf, 0, STRBUF_INIT};
> +	long preimage_len = 0;
> +
> +	if (delta) {
> +		if (!preimage)
> +			preimage = tmpfile();

This is not closed until exit() takes care of it, right?

> +		if (srcMark) {
> +			printf("cat :%"PRIu32"\n", srcMark);
> +			fflush(stdout);
> +			if (srcMode == REPO_MODE_LNK)
> +				fwrite("link ", 1, 5, preimage);

strbuf_addstr(&preimage_buf.buf, "link ");

would allow the fast_export_save_blob() to use fd 3 directly,
perhaps.  If so,

> +			fast_export_save_blob(preimage);
> +		}
> +		preimage_len = ftell(preimage);

this would be read from the output of the "cat" command.

Currently we are not using the preimage_len at all as far as I
can tell.  It would not be hard to teach svndiff0_apply() to keep
track of it again.

> +		fseek(preimage, 0, SEEK_SET);
> +		preimage_buf.infile = preimage;
> +		if (!postimage)
> +			postimage = tmpfile();
> +		svndiff0_apply(input, len, &preimage_view, postimage);
> +		strbuf_release(&preimage_view.buf);
> +		len = ftell(postimage);

The postimage has to be a temporary file in the current design, since
we do not know the postimage length to pass to fast-import until we've
written the whole thing out.  The data <<delim form is not usable for
this because there is no forbidden string to use as a delimiter.
A separate pass to add up the postimage lengths from windows would
fail with deltas like those Ram first supplied for the test suite
(though have we checked that those come up in the wild?).

[...]
> +	if (!delta)
> +		buffer_copy_bytes(input, len);
> +	else
> +		buffer_copy_bytes(&postimage_buf, len);

Maintaining support for dumpfilev2.  Nice.

>  	fputc('\n', stdout);
> +
> +	if (preimage) {
> +		fseek(preimage, 0, SEEK_SET);
> +		ftruncate(fileno(preimage), 0);
> +	}
> +
> +	if (postimage) {
> +		fseek(postimage, 0, SEEK_SET);

I think postimage is seeked twice --- are both needed?

[...]
> --- a/vcs-svn/repo_tree.c
> +++ b/vcs-svn/repo_tree.c
> @@ -12,6 +12,9 @@
>  
>  #include "trp.h"
>  
> +/* git hash-object -t tree --stdin </dev/null */
> +#define EMPTY_TREE_HASH "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
> +
>  struct repo_dirent {
>  	uint32_t name_offset;
>  	struct trp_node children;
> @@ -25,6 +28,7 @@ struct repo_dir {
>  
>  struct repo_commit {
>  	uint32_t root_dir_offset;
> +	char tree_id[SHA1_HEX_LENGTH];
>  };

Not needed.

Thanks.

  parent reply	other threads:[~2010-10-13  0:03 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-12 13:50 [PATCH/RFC] Add support for subversion dump format v3 David Barr
2010-10-12 13:50 ` [PATCH 1/7] Teach fast-import to print the id of each imported commit David Barr
2010-10-12 14:42   ` Sverre Rabbelier
2010-10-12 18:48     ` Jonathan Nieder
2010-10-12 18:57       ` Sverre Rabbelier
2010-10-12 19:07         ` Jonathan Nieder
2010-10-12 22:06   ` Jonathan Nieder
2010-10-12 23:05     ` Sverre Rabbelier
2010-10-12 13:50 ` [PATCH 2/7] fast-import: Let importers retrieve the objects being written David Barr
2010-10-12 22:38   ` Jonathan Nieder
2010-10-12 23:07     ` Sverre Rabbelier
2010-10-13  0:07       ` Jonathan Nieder
2010-10-13 13:10         ` Sverre Rabbelier
2010-10-12 13:50 ` [PATCH 3/7] fast-import: Allow cat command with empty path David Barr
2010-10-12 22:47   ` Jonathan Nieder
2010-10-12 13:50 ` [PATCH 4/7] fast-import: Allow cat requests at arbitrary points in stream David Barr
2010-10-12 22:50   ` Jonathan Nieder
2010-10-12 13:50 ` [PATCH 5/7] vcs-svn: extend svndump to parse version 3 format David Barr
2010-10-12 22:56   ` Jonathan Nieder
2010-10-12 13:50 ` [PATCH 6/7] vcs-svn: implement prop-delta handling David Barr
2010-10-12 22:59   ` Jonathan Nieder
2010-10-12 13:50 ` [PATCH 7/7] svn-fe: Use the --report-fd feature David Barr
2010-10-12 14:55   ` Sverre Rabbelier
2010-10-12 23:03     ` Jonathan Nieder
2010-10-12 23:11       ` Sverre Rabbelier
2010-10-12 23:36         ` Jonathan Nieder
2010-10-12 23:59   ` Jonathan Nieder [this message]
2010-10-13  2:24 ` [PATCH/RFC] Add support for subversion dump format v3 Jonathan Nieder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101012235913.GJ15587@burratino \
    --to=jrnieder@gmail.com \
    --cc=artagnon@gmail.com \
    --cc=david.barr@cordelta.com \
    --cc=git@vger.kernel.org \
    --cc=srabbelier@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).