phabricator.wikimedia.org

⚓ T106614 F9. Run LQT conversion script on mediawiki.org Project:Support desk

  • ️Wed Jul 22 2015

TL;DR: the conversion seems to be getting slower. For now we can probably just leave it running, but it might take around 10 days. If it gets even slower, we will probably have to stop it and fix the problem (the script is resumable):

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| (SELECT COUNT(*) FROM page WHERE page_namespace = 90 AND page_title LIKE 'Project:Support_desk/%' AND page_title NOT LIKE '%/reply%%%%' AND page_is_redirect) / (SELECT COUNT(*) FROM page WHERE page_namespace = 90 AND page_title LIKE 'Project:Support_desk/% |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                                                                                                                                                           0.5324 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)

It does seem to be slowing down on the topic iteration:

[2015-07-30 22:12:46] Imported 2 revisions for header
[2015-07-30 22:12:47] Importing new topic
--
[2015-07-30 22:12:48]    Finished importing post with 1 revisions
[2015-07-30 22:13:25] Importing new topic
--
[2015-07-30 22:13:26]    Finished importing post with 1 revisions
[2015-07-30 22:13:31] Importing new topic
--
[2015-07-30 22:13:32]   Finished importing post with 1 revisions
[2015-07-30 22:13:37] Importing new topic
--
[2015-07-30 22:13:38]     Finished importing post with 1 revisions
[2015-07-30 22:13:49] Importing new topic
--
[2015-07-30 22:13:51]     Finished importing post with 1 revisions
[2015-07-30 22:13:58] Importing new topic
--
[2015-07-30 22:13:58]    Finished importing post with 1 revisions
[2015-07-30 22:14:02] Importing new topic
--
[2015-07-30 22:14:03]   Finished importing post with 1 revisions
[2015-07-30 22:14:07] Importing new topic
--
[2015-07-30 22:14:09]     Finished importing post with 1 revisions
[2015-07-30 22:14:15] Importing new topic
--
[2015-07-30 22:14:20]         Finished importing post with 2 revisions
[2015-07-30 22:14:28] Importing new topic
--
[2015-08-05 00:26:26]                 Finished importing post with 1 revisions
[2015-08-05 00:44:02] Importing new topic
--
[2015-08-05 00:44:19]  Finished importing post with 1 revisions
[2015-08-05 00:46:07] Importing new topic
--
[2015-08-05 00:46:34]   Finished importing post with 1 revisions
[2015-08-05 00:49:18] Importing new topic
--
[2015-08-05 00:50:19]      Finished importing post with 1 revisions
[2015-08-05 00:55:46] Importing new topic
--
[2015-08-05 00:56:25]    Finished importing post with 1 revisions
[2015-08-05 01:00:01] Importing new topic
--
[2015-08-05 01:01:23]     Finished importing post with 1 revisions
[2015-08-05 01:06:41] Importing new topic
--
[2015-08-05 01:10:03]                 Finished importing post with 1 revisions
[2015-08-05 01:29:08] Importing new topic
--
[2015-08-05 01:30:09]       Finished importing post with 1 revisions
[2015-08-05 01:36:03] Importing new topic
--
[2015-08-05 01:41:10]   Finished importing post with 3 revisions
[2015-08-05 02:05:49] Importing new topic
--
[2015-08-05 02:06:15]   Finished importing post with 1 revisions
[2015-08-05 02:08:51] Importing new topic

Something is slowing down the iteration of topics. At the beginning there were only a few seconds between finishing one post and starting the next topic, but now the gaps run to several minutes.
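To put numbers on those gaps, here is a rough Python sketch that measures the time between each "Finished importing post" line and the "Importing new topic" line right after it. It assumes the log format shown in the excerpt above, including the `--` separators; the function name `topic_gaps` is made up for illustration and is not part of the conversion script.

```python
import re
from datetime import datetime

# Matches lines such as "[2015-07-30 22:13:26]    Finished importing post with 1 revisions"
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+(.*)$")


def topic_gaps(lines):
    """Return the gaps, in seconds, between a 'Finished importing post'
    line and an 'Importing new topic' line that directly follows it."""
    gaps = []
    last_finished = None
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            # Separator ("--") or unrelated line: the next pair starts fresh.
            last_finished = None
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(2)
        if message.startswith("Finished importing post"):
            last_finished = stamp
        elif message.startswith("Importing new topic"):
            if last_finished is not None:
                gaps.append((stamp - last_finished).total_seconds())
            last_finished = None
        else:
            last_finished = None
    return gaps
```

Run over the whole log, the gaps could be averaged per hour or per day to see when the slowdown started.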

I think the query is roughly:

SELECT  thread_id,thread_id,thread_subject,thread_article_namespace,thread_article_title,thread_parent,thread_ancestor,thread_created,thread_modified,thread_author_id,thread_author_name,thread_summary_page,thread_root,thread_type,thread_signature  FROM `thread`   WHERE thread_article_namespace = '0' AND thread_article_title = 'Test_LQT' AND (thread_id>='674') AND (thread_type != '2')  ORDER BY thread_id LIMIT 501

(that's taken from my test wiki debugging, but can be changed to match prod).

Even running the production version without the LIMIT doesn't take long:

SELECT  thread_id,thread_id,thread_subject,thread_article_namespace,thread_article_title,thread_parent,thread_ancestor,thread_created,thread_modified,thread_author_id,thread_author_name,thread_summary_page,thread_root,thread_type,thread_signature  FROM `thread`   WHERE thread_article_namespace = '4' AND thread_article_title = 'Support_desk/LQT_Archive_1' AND (thread_id>='674') AND (thread_type != '2')  ORDER BY thread_id;

27456 rows in set (0.42 sec)
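The `thread_id>=` condition together with `LIMIT 501` in the test-wiki query looks like batched iteration by ID: fetch 500 rows plus one look-ahead row, then restart from that extra row's `thread_id`. That reading is a guess, not something confirmed from the importer code. Below is a minimal Python sketch of the pattern against a generic DB-API connection such as `sqlite3`. The function name `iter_threads`, the reduced column list, and the batch size of 500 are all assumptions for illustration.

```python
import sqlite3  # only needed for the stand-in connection; any DB-API driver with "?" placeholders works


def iter_threads(conn, namespace, title, batch_size=500):
    """Yield (thread_id, thread_subject) rows in thread_id order,
    issuing one bounded query per batch."""
    start_id = 0
    while True:
        rows = conn.execute(
            "SELECT thread_id, thread_subject FROM thread "
            "WHERE thread_article_namespace = ? AND thread_article_title = ? "
            "AND thread_id >= ? AND thread_type != 2 "
            "ORDER BY thread_id LIMIT ?",
            (namespace, title, start_id, batch_size + 1),
        ).fetchall()
        # The extra row, if present, only marks where the next batch starts.
        for row in rows[:batch_size]:
            yield row
        if len(rows) <= batch_size:
            return
        start_id = rows[batch_size][0]
```

If the importer does work this way, each batch query should stay cheap no matter how far the iteration has progressed, which fits the 0.42 sec result above. That would mean the growing gaps come from something other than this query.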