⚓ T194697 Multiblocks — Allow for multiple, simultaneous blocks with different expiration dates
- ️Mon May 14 2018
Community Tech are looking at this, following its high ranking in the 2023 Community Wishlist Survey. Three people have said to me that we should consider just dropping the unique index constraint. But I'm still pretty keen on keeping the block conflict feature I introduced in 2006 (1d9922db6442dec82e4be07c45c2e8a3a2035602). In the multiblocks context, the feature means that if two admins try to block a user or IP address simultaneously, and the user has no prior blocks, one admin should succeed and the other should get a confirmation page with details of the first block. If we just remove the unique index, both admins would succeed in creating a block. I can think of three approaches to supporting multiple blocks with block conflict prevention: The unique index is dropped, and we use gap locking to prevent insertion of rows with the same (or adjacent) addresses. The problem is that it will generate deadlocks, both in the conflict case and when adjacent IP addresses or users are blocked. According to the MySQL manual, deadlocks are not meant to be a big deal. You can just reissue the transaction. But this conflicts with MediaWiki's current transaction system, in which there is a single transaction for the whole web request, with no control over it except via DeferredUpdates. This makes it difficult to write deadlock-tolerant code. The target information is duplicated into a summary table. The unique index is dropped from ipblocks, but the summary table has a unique index. Secondary block insertion: This is my idea from 2018 T194697#4490345. Although we could keep the ipblocks table and field names the same for easier migration. The plan would be to add the summary table as in option 2 (now called block_target), then add ipb_target, then drop the duplicated fields from ipblocks such as ipb_address. Secondary block insertion: Migration options: Considerations: @daniel noted that user blocks and IP blocks are not really the same. In terms of schema design, ipblocks is a polymorphic table, since it is trying to represent at least two types of target (IP range and user). For an IP block, ipb_user is zero. for a user block, ipb_range_start is empty and ipb_address contains the denormalized username, requiring a special case in RenameuserSQL. Arguably single IP addresses and IP ranges are distinct kinds of blocks. There is a fixme in tables.json saying that, for efficiency, ipb_range_start/ipb_range_end should be empty for single-IP blocks. In the PHP code, they have distinct type constants Block::TYPE_IP and Block::TYPE_RANGE. The conventional representation for polymorphic data, referring to e.g. Polymorphic Association – bad SQL Smell! by Tom Gillies, is to have separate tables for separate types, and to reverse the direction of the foreign key association. In such a design, the block_target table would just have a primary key, a count, and maybe a type enum, and we would have tables such as blocked_user with bu_target → block_target.bt_id. It would be a far-reaching change to the conceptual model, ideally including a split of the DatabaseBlock class.1. Gap locking with FOR UPDATE
BEGIN;
SELECT COUNT(*) FROM ipblocks WHERE ipb_address='...' FOR UPDATE;
INSERT INTO ipblocks ...;
COMMIT;
2. Denormalized summary table
BEGIN;
INSERT INTO ipblocks_summary (ips_address, ips_count) VALUES (...);
INSERT INTO ipblocks (ipb_address, ...) VALUES (...);
COMMIT;
BEGIN;
UPDATE ipblocks_summary SET ips_count=2 WHERE ips_address='...', ips_count=1;
INSERT INTO ipblocks ...;
COMMIT;
3. Normalized schema
BEGIN;
INSERT INTO block_target ...;
INSERT INTO ipblocks (ipb_target, ...) VALUES ($bt_id, ...);
COMMIT;
BEGIN;
UPDATE block_target SET bt_count=2 WHERE bt_id=$id..., bt_count=1;
INSERT INTO ipblocks (ipb_target, ...) VALUES ($id, ...);
COMMIT;
Bonus level: fixing polymorphic ipblocks