git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] git-remote-mediawiki: bugfix for pages w/ >500 revisions
@ 2013-09-22 18:44 Benoit Person
  2013-09-22 19:27 ` Matthieu Moy
  0 siblings, 1 reply; 3+ messages in thread
From: Benoit Person @ 2013-09-22 18:44 UTC (permalink / raw
  To: git; +Cc: Matthieu Moy, Benoit Person

Mediawiki introduced a new API for queries w/ more than 500 results in
version 1.21. That change triggered an infinite loop while cloning a
mediawiki with such a page.

Fix that while still preserving the old behavior for old APIs.

Signed-off-by: Benoit Person <benoit.person@gmail.fr>
Reported-by: Benjamin Cathey
---

Patch tested for all mediawiki versions from 1.19 to 1.21.

For now, if the tests suite is run without the fix, the new test
introduces an infinite loop. I am not sure if this should be handled ?
(a timeout of some kind maybe ?)

 contrib/mw-to-git/git-remote-mediawiki.perl     | 14 ++++++++++++--
 contrib/mw-to-git/t/t9365-continuing-queries.sh | 24 ++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100755 contrib/mw-to-git/t/t9365-continuing-queries.sh

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
index c9a4805..2d7af57 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -625,6 +625,9 @@ sub fetch_mw_revisions_for_page {
 		rvstartid => $fetch_from,
 		rvlimit => 500,
 		pageids => $id,
+
+                # let the mediawiki knows that we support the latest API
+                continue => '',
 	};
 
 	my $revnum = 0;
@@ -640,8 +643,15 @@ sub fetch_mw_revisions_for_page {
 			push(@page_revs, $page_rev_ids);
 			$revnum++;
 		}
-		last if (!$result->{'query-continue'});
-		$query->{rvstartid} = $result->{'query-continue'}->{revisions}->{rvstartid};
+
+                if ($result->{'query-continue'}) { # For legacy APIs
+                    $query->{rvstartid} = $result->{'query-continue'}->{revisions}->{rvstartid};
+                } elsif ($result->{continue}) { # For newer APIs
+                    $query->{rvstartid} = $result->{continue}->{rvcontinue};
+                    $query->{continue} = $result->{continue}->{continue};
+                } else {
+                    last;
+                }
 	}
 	if ($shallow_import && @page_revs) {
 		print {*STDERR} "  Found 1 revision (shallow import).\n";
diff --git a/contrib/mw-to-git/t/t9365-continuing-queries.sh b/contrib/mw-to-git/t/t9365-continuing-queries.sh
new file mode 100755
index 0000000..6fb5df4
--- /dev/null
+++ b/contrib/mw-to-git/t/t9365-continuing-queries.sh
@@ -0,0 +1,24 @@
+#!/bin/sh
+
+test_description='Test the Git Mediawiki remote helper: queries w/ more than 500 results'
+
+. ./test-gitmw-lib.sh
+. ./push-pull-tests.sh
+. $TEST_DIRECTORY/test-lib.sh
+
+test_check_precond
+
+test_expect_success 'creating page w/ >500 revisions' '
+	wiki_reset &&
+	for i in $(seq 1 501)
+	do
+		echo "creating revision $i"
+		wiki_editpage foo "revision $i<br/>" true
+	done
+'
+
+test_expect_success 'cloning page w/ >500 revisions' '
+	git clone mediawiki::'"$WIKI_URL"' mw_dir
+'
+
+test_done
-- 
1.8.4.GIT

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] git-remote-mediawiki: bugfix for pages w/ >500 revisions
  2013-09-22 18:44 [PATCH] git-remote-mediawiki: bugfix for pages w/ >500 revisions Benoit Person
@ 2013-09-22 19:27 ` Matthieu Moy
  2013-09-23 18:26   ` Jonathan Nieder
  0 siblings, 1 reply; 3+ messages in thread
From: Matthieu Moy @ 2013-09-22 19:27 UTC (permalink / raw
  To: Benoit Person; +Cc: git, Benoit Person

Benoit Person <benoit.person@gmail.com> writes:

> Mediawiki introduced a new API for queries w/ more than 500 results in
> version 1.21. That change triggered an infinite loop while cloning a
> mediawiki with such a page.
>
> Fix that while still preserving the old behavior for old APIs.

That would be nice to explain a bit more here. Where did the infinite
loop come from? How does your patch fix it?

> For now, if the tests suite is run without the fix, the new test
> introduces an infinite loop. I am not sure if this should be handled ?
> (a timeout of some kind maybe ?)

If the patch fix this, then it's not a really big problem. The test
failure is an infinite loop. That would be problematic if ran
non-interactively, but I think it's Ok since we only run the testsuite
manually.

> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
> index c9a4805..2d7af57 100755
> --- a/contrib/mw-to-git/git-remote-mediawiki.perl
> +++ b/contrib/mw-to-git/git-remote-mediawiki.perl
> @@ -625,6 +625,9 @@ sub fetch_mw_revisions_for_page {
>  		rvstartid => $fetch_from,
>  		rvlimit => 500,
>  		pageids => $id,
> +
> +                # let the mediawiki knows that we support the latest API

s/knows/know/

> +                continue => '',

Indentation with spaces. Please, use tabs.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] git-remote-mediawiki: bugfix for pages w/ >500 revisions
  2013-09-22 19:27 ` Matthieu Moy
@ 2013-09-23 18:26   ` Jonathan Nieder
  0 siblings, 0 replies; 3+ messages in thread
From: Jonathan Nieder @ 2013-09-23 18:26 UTC (permalink / raw
  To: Matthieu Moy; +Cc: Benoit Person, git, Benoit Person

Matthieu Moy wrote:
> Benoit Person <benoit.person@gmail.com> writes:

>> For now, if the tests suite is run without the fix, the new test
>> introduces an infinite loop. I am not sure if this should be handled ?
>> (a timeout of some kind maybe ?)
>
> If the patch fix this, then it's not a really big problem. The test
> failure is an infinite loop.

Yes, I think it's fine.

>                              That would be problematic if ran
> non-interactively, but I think it's Ok since we only run the testsuite
> manually.

Some distros (e.g., Debian) occasionally do run the testsuite
automatically, but it is still fine since they have a timeout that
varies by platform to detect if the test has stalled.  I suppose
ideally git's test harness could learn to do the same thing some day,
but for now it's easier one level above since an appropriate timeout
depends on the speed on the platform, what else is creating load on
the test machine, and other factors that are probably not easy for us
to guess.

(other tweaks snipped)

Thanks, both.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-09-23 18:26 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-22 18:44 [PATCH] git-remote-mediawiki: bugfix for pages w/ >500 revisions Benoit Person
2013-09-22 19:27 ` Matthieu Moy
2013-09-23 18:26   ` Jonathan Nieder

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).