git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	David Michael Barr <david.barr@cordelta.com>,
	Sverre Rabbelier <srabbelier@gmail.com>,
	Sam Vilain <sam@vilain.net>
Subject: [PATCH 03/11] vcs-svn: Read the preimage while applying deltas
Date: Wed, 13 Oct 2010 04:30:37 -0500	[thread overview]
Message-ID: <20101013093037.GD32608@burratino> (raw)
In-Reply-To: <20101013091714.GA32608@burratino>

The source view offset heading each svndiff0 window represents a
number of bytes past the beginning of the preimage.  Together with the
source view length, it instructs the delta applier about what portion
of the preimage instructions will refer to.  Read in that data right
away using the sliding window code.

Maybe some day we will mmap() to prepare to read data more lazily.

For compatibility with Subversion's implementation, tolerate source
view offsets pointing past the end of the preimage file (a later
patch will remove this flexibility).  For simplicity, also permit
source views that start within the preimage and end outside of it,
even though Subversion does not.

This does not teach the delta applier to read instructions or copy
data from the source view yet.  Deltas that would produce nonempty
output are still rejected.

Helped-by: Ramkumar Ramachandra <artagnon@gmail.com>
Helped-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
It occurs to me that Sam Vilain may well have something valuable to
say about this series, having implemented something similar[1].

Sam, this series adds an svndiff0 parser for git to use in parsing
v3 dumps (which are way easier to produce with remote access to an
svn repository than v2 dumps).  The beginning of the series is at [2],
though that cover letter is out of date: now, modulo any new bugs
I've introduced with this reroll, it is known to successfully apply
all the deltas involved in a complete dump of the ASF repo.

I am interested in improvements and complaints of all kinds.

[1] http://search.cpan.org/~samv/Parse-SVNDiff-0.03/lib/Parse/SVNDiff.pm
[2] http://thread.gmane.org/gmane.comp.version-control.git/151086/focus=158731

 t/t9011-svn-da.sh |   38 ++++++++++++++++++++++++++++++++++++++
 vcs-svn/svndiff.c |   22 +++++++++++++++-------
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh
index 8dccd16..b9aad70 100755
--- a/t/t9011-svn-da.sh
+++ b/t/t9011-svn-da.sh
@@ -79,4 +79,42 @@ test_expect_success 'nonempty (but unused) preimage view' '
 	test_cmp empty actual
 '
 
+test_expect_success 'preimage view: right endpoint cannot backtrack' '
+	printf "SVNQ%b%b" "Q\003QQQ" "Q\002QQQ" |
+		q_to_nul >clear.backtrack &&
+	test_must_fail test-svn-fe -d preimage clear.backtrack 14
+'
+
+test_expect_success 'preimage view: left endpoint can advance' '
+	printf "SVNQ%b%b" "Q\003QQQ" "\001\002QQQ" |
+		q_to_nul >clear.preshrink &&
+	printf "SVNQ%b%b" "Q\003QQQ" "\001\001QQQ" |
+		q_to_nul >clear.shrinkbacktrack &&
+	test-svn-fe -d preimage clear.preshrink 14 >actual &&
+	test_must_fail test-svn-fe -d preimage clear.shrinkbacktrack 14 &&
+	test_cmp empty actual
+'
+
+test_expect_success 'preimage view: offsets compared by value' '
+	printf "SVNQ%b%b" "\001\001QQQ" "\0200Q\003QQQ" |
+		q_to_nul >clear.noisybacktrack &&
+	printf "SVNQ%b%b" "\001\001QQQ" "\0200\001\002QQQ" |
+		q_to_nul >clear.noisyadvance &&
+	test_must_fail test-svn-fe -d preimage clear.noisybacktrack 15
+	test-svn-fe -d preimage clear.noisyadvance 15 &&
+	test_cmp empty actual
+'
+
+test_expect_success 'preimage view: accept truncated preimage' '
+	printf "SVNQ%b" "\010QQQQ" | q_to_nul >clear.lateemptyread &&
+	printf "SVNQ%b" "\010\001QQQ" | q_to_nul >clear.latenonemptyread &&
+	printf "SVNQ%b" "\001\010QQQ" | q_to_nul >clear.longread &&
+	test-svn-fe -d preimage clear.lateemptyread 9 >actual.emptyread &&
+	test-svn-fe -d preimage clear.latenonemptyread 9 >actual.nonemptyread &&
+	test-svn-fe -d preimage clear.longread 9 >actual.longread &&
+	test_cmp empty actual.emptyread &&
+	test_cmp empty actual.nonemptyread &&
+	test_cmp empty actual.longread
+'
+
 test_done
diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c
index e572a93..f2876b3 100644
--- a/vcs-svn/svndiff.c
+++ b/vcs-svn/svndiff.c
@@ -4,6 +4,7 @@
  */
 
 #include "git-compat-util.h"
+#include "sliding_window.h"
 #include "line_buffer.h"
 
 /*
@@ -122,21 +123,28 @@ static int apply_one_window(struct line_buffer *delta, off_t *delta_len)
 int svndiff0_apply(struct line_buffer *delta, off_t delta_len,
 		   struct line_buffer *preimage, FILE *postimage)
 {
+	struct view preimage_view = {preimage, 0, STRBUF_INIT};
 	assert(delta && preimage && postimage);
 
 	if (read_magic(delta, &delta_len))
-		return -1;
+		goto fail;
 	while (delta_len > 0) {	/* For each window: */
-		off_t pre_off;
+		off_t pre_off = pre_off;
 		size_t pre_len;
 		if (read_offset(delta, &pre_off, &delta_len) ||
 		    read_length(delta, &pre_len, &delta_len) ||
+		    move_window(&preimage_view, pre_off, pre_len) ||
 		    apply_one_window(delta, &delta_len))
-			return -1;
-		if (delta_len && buffer_at_eof(delta))
-			return error("Delta ends early! "
-				     "(%"PRIu64" bytes remaining)",
-				     (uint64_t) delta_len);
+			goto fail;
+		if (delta_len && buffer_at_eof(delta)) {
+			error("Delta ends early! (%"PRIu64" bytes remaining)",
+			      (uint64_t) delta_len);
+			goto fail;
+		}
 	}
+	strbuf_release(&preimage_view.buf);
 	return 0;
+ fail:
+	strbuf_release(&preimage_view.buf);
+	return -1;
 }
-- 
1.7.2.3

  parent reply	other threads:[~2010-10-13  9:34 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-15 16:22 [PATCH 0/8] Resurrect rr/svn-export Ramkumar Ramachandra
2010-07-15 16:22 ` [PATCH 1/8] Export parse_date_basic() to convert a date string to timestamp Ramkumar Ramachandra
2010-07-15 17:25   ` Jonathan Nieder
2010-07-15 22:54     ` Junio C Hamano
2010-07-15 16:22 ` [PATCH 2/8] Introduce vcs-svn lib Ramkumar Ramachandra
2010-07-15 17:46   ` Jonathan Nieder
2010-07-15 19:15     ` Ramkumar Ramachandra
2010-07-15 16:22 ` [PATCH 3/8] Add memory pool library Ramkumar Ramachandra
2010-07-15 18:57   ` Jonathan Nieder
2010-07-15 19:12     ` Ramkumar Ramachandra
2010-07-15 16:23 ` [PATCH 4/8] Add treap implementation Ramkumar Ramachandra
2010-07-15 19:09   ` Jonathan Nieder
2010-07-15 19:18     ` Ramkumar Ramachandra
2010-07-15 16:23 ` [PATCH 5/8] Add string-specific memory pool Ramkumar Ramachandra
2010-07-15 16:23 ` [PATCH 6/8] Add stream helper library Ramkumar Ramachandra
2010-07-15 19:19   ` Jonathan Nieder
2010-07-15 16:23 ` [PATCH 7/8] Add infrastructure to write revisions in fast-export format Ramkumar Ramachandra
2010-07-15 19:28   ` Jonathan Nieder
2010-07-15 16:23 ` [PATCH 8/8] Add SVN dump parser Ramkumar Ramachandra
2010-07-15 19:52   ` Jonathan Nieder
2010-07-15 20:04     ` Jonathan Nieder
2010-07-16 10:13 ` [PATCH 0/8] Resurrect rr/svn-export Jonathan Nieder
2010-07-16 10:16   ` [PATCH 3/9] Add memory pool library Jonathan Nieder
2010-07-16 10:23   ` [PATCH 4/9] Add treap implementation Jonathan Nieder
2010-07-16 18:26     ` Jonathan Nieder
2010-08-09 21:57   ` [PATCH 0/10] rr/svn-export reroll Jonathan Nieder
2010-08-09 22:01     ` [PATCH 01/10] Export parse_date_basic() to convert a date string to timestamp Jonathan Nieder
2010-08-09 22:04     ` [PATCH 02/10] Introduce vcs-svn lib Jonathan Nieder
2010-08-09 22:11     ` [PATCH 03/10] Add memory pool library Jonathan Nieder
2010-08-09 22:17     ` [PATCH 04/10] Add treap implementation Jonathan Nieder
2010-08-12 17:22       ` Junio C Hamano
2010-08-12 22:02         ` Jonathan Nieder
2010-08-12 22:11         ` Jonathan Nieder
2010-08-12 22:44           ` Junio C Hamano
2010-08-09 22:34     ` [PATCH 05/10] Add string-specific memory pool Jonathan Nieder
2010-08-12 17:22       ` Junio C Hamano
2010-08-12 21:30         ` Jonathan Nieder
2010-08-09 22:39     ` [PATCH 06/10] Add stream helper library Jonathan Nieder
2010-08-09 22:48     ` [PATCH 07/10] Infrastructure to write revisions in fast-export format Jonathan Nieder
2010-08-09 22:55     ` [PATCH 08/10] SVN dump parser Jonathan Nieder
2010-08-12 17:22       ` Junio C Hamano
2010-08-09 22:55     ` PATCH 09/10] Update svn-fe manual Jonathan Nieder
2010-08-09 22:58     ` [PATCH 10/10] svn-fe manual: Clarify warning about deltas in dump files Jonathan Nieder
2010-08-10 12:53     ` [PATCH 0/10] rr/svn-export reroll Ramkumar Ramachandra
2010-08-11  1:53       ` Jonathan Nieder
2010-10-11  2:34       ` [PATCH/WIP 00/16] svn delta applier Jonathan Nieder
2010-10-11  2:37         ` [PATCH 01/16] vcs-svn: Eliminate global byte_buffer[] array Jonathan Nieder
2010-10-11  2:39         ` [PATCH 03/16] vcs-svn: Collect line_buffer data in a struct Jonathan Nieder
2010-10-11  2:41         ` [PATCH 04/16] vcs-svn: Teach line_buffer to handle multiple input files Jonathan Nieder
2010-10-11  2:44         ` [PATCH 05/16] vcs-svn: Make buffer_skip_bytes() report partial reads Jonathan Nieder
2010-10-11  2:46         ` [PATCH 06/16] vcs-svn: Improve support for reading large files Jonathan Nieder
2010-10-11  2:47         ` [PATCH 07/16] vcs-svn: Add binary-safe read() function Jonathan Nieder
2010-10-11  2:47         ` [PATCH 08/16] vcs-svn: Let callers peek ahead to find stream end Jonathan Nieder
2010-10-11  2:51         ` [PATCH 09/16] vcs-svn: Allow input errors to be detected early Jonathan Nieder
2010-10-11  2:52         ` [PATCH 10/16] vcs-svn: Allow character-oriented input Jonathan Nieder
2010-10-11  2:53         ` [PATCH 11/16] vcs-svn: Add code to maintain a sliding view of a file Jonathan Nieder
2010-10-11  2:55         ` [PATCH 12/16] vcs-svn: Learn to parse variable-length integers Jonathan Nieder
2010-10-11  2:58         ` [PATCH 13/16] vcs-svn: Learn to check for SVN\0 magic Jonathan Nieder
2010-10-11  2:59         ` [PATCH 14/16] compat: helper for detecting unsigned overflow Jonathan Nieder
2010-10-11  3:00         ` [PATCH 15/16] t9010 (svn-fe): Eliminate dependency on svn perl bindings Jonathan Nieder
2010-10-11  3:11         ` [PATCH 02/16] vcs-svn: Replace buffer_read_string() memory pool with a strbuf Jonathan Nieder
2010-10-11  4:01         ` [PATCH/RFC 16'/16] vcs-svn: Add svn delta parser Jonathan Nieder
2010-10-13  9:17           ` [PATCH/RFC 0/11] Building up the " Jonathan Nieder
2010-10-13  9:19             ` [PATCH 01/11] fixup! vcs-svn: Learn to parse variable-length integers Jonathan Nieder
2010-10-13  9:21             ` [PATCH 02/11] vcs-svn: Skeleton of an svn delta parser Jonathan Nieder
2010-10-13  9:30             ` Jonathan Nieder [this message]
2010-10-14 21:45               ` [PATCH 03/11] vcs-svn: Read the preimage while applying deltas Sam Vilain
2010-10-14 23:40                 ` Jonathan Nieder
2010-10-13  9:35             ` [PATCH 04/11] vcs-svn: Read inline data from deltas Jonathan Nieder
2010-10-13  9:38             ` [PATCH 05/11] vcs-svn: Read instructions " Jonathan Nieder
2010-10-13  9:39             ` [PATCH 06/11] vcs-svn: Implement copyfrom_data delta instruction Jonathan Nieder
2010-10-13  9:41             ` [PATCH 07/11] vcs-svn: Check declared number of output bytes Jonathan Nieder
2010-10-13  9:48             ` [PATCH 08/11] vcs-svn: Reject deltas that do not consume all inline data Jonathan Nieder
2010-10-13  9:50             ` [PATCH 09/11] vcs-svn: Let deltas use data from postimage Jonathan Nieder
2010-10-13  9:53             ` [PATCH 10/11] vcs-svn: Reject deltas that read past end of preimage Jonathan Nieder
2010-10-13  9:58             ` [PATCH 11/11] vcs-svn: Allow deltas to copy from preimage Jonathan Nieder
2010-10-13 10:00             ` Jonathan Nieder
2010-10-18 17:00             ` [PATCH/RFC 0/11] Building up the delta parser Ramkumar Ramachandra
2010-10-18 17:03               ` Jonathan Nieder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101013093037.GD32608@burratino \
    --to=jrnieder@gmail.com \
    --cc=artagnon@gmail.com \
    --cc=david.barr@cordelta.com \
    --cc=git@vger.kernel.org \
    --cc=sam@vilain.net \
    --cc=srabbelier@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).