Introduction
Recently, while working on user permission control, I encountered a requirement that can be abstracted as follows: user's city attribute consists of multiple cities, and we need to list users whose city attributes are subsets of a given city set.
The initial user-city relationship table has sample data like below, with the first column being the user ID and the second column being the city:
1 anshan
1 beijing
1 baotou
1 baoding
1 beihai
1 baoji
1 chongqing
1 chengdu
2 anshan
2 beijing
3 baotou
3 baoding
3 beihai
4 baoji
4 shanghai
For example, given a city set [anshan, beijing, baotou, baoding, beihai, baoji], we need to filter out users whose city attributes are subsets of this set. This means users whose cities are anshan or beijing, etc., but do not contain cities outside this set. For instance, user 2 has cities anshan and beijing, and user 3 has cities baotou, baoding, and beihai, both of which meet the criteria. User 4, although having baoji, also has shanghai, which is a city outside the specified set, so it does not meet the criteria.
Implementation Principle
After testing, bit operations can be used. First, separate the cities into a separate table, with city IDs represented in binary numbers. Each city sets a different bit to 1, meaning city IDs are powers of 2:
000000001 anshan
000000010 beijing
000000100 baotou
000001000 baoding
000010000 beihai
000100000 baoji
001000000 chongqing
010000000 chengdu
100000000 shanghai
Then, the user-city relationship is changed to the sum of city IDs for all cities of a user:
| ID | Binary |
|---|---|
| 1 | 000111111 |
| 2 | 000000011 |
| 3 | 000011100 |
| 4 | 100100000 |
| 5 | 110000001 |
Here, the set [anshan, beijing, baotou, baoding, beihai, baoji] can be represented as 111111. During filtering, users whose bitwise AND result with the current user equals the user value itself are considered matching.
Usage
In the above demonstration data, integers are represented in binary. In actual database storage, they are stored as decimal numbers. The following is a specific SQL statement example for filtering users with city set 111111, where 111111 has been converted to decimal number 63:
select user_id, city_sum from user_city where (city_sum & 63)=city_sum;
In this solution, city IDs are powers of 2, which requires a large field length. The maximum value for MySQL unsigned BIGINT is 18446744073709551615. Therefore, both city IDs and users' city attributes use unsigned BIGINT.
Finally, here are the actual SQL statements that can be used in this example. This is for MySQL database, and other databases can be inferred accordingly:
CREATE TABLE `city` (
`city_id` bigint(20) unsigned NOT NULL default '0',
`city_name` varchar(20) NOT NULL,
PRIMARY KEY (`city_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='City, city ID is a binary number incremented by bit';
INSERT INTO `city` VALUES (1, 'anshan');
INSERT INTO `city` VALUES (2, 'beijing');
INSERT INTO `city` VALUES (4, 'baotou');
INSERT INTO `city` VALUES (8, 'baoding');
INSERT INTO `city` VALUES (16, 'beihai');
INSERT INTO `city` VALUES (32, 'baoji');
INSERT INTO `city` VALUES (64, 'chongqing');
INSERT INTO `city` VALUES (128, 'chengdu');
INSERT INTO `city` VALUES (256, 'shanghai');
CREATE TABLE `user_city` (
`user_id` int(11) NOT NULL,
`city_sum` bigint(20) unsigned default '0',
PRIMARY KEY (`user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='User-City Relationship';
INSERT INTO `user_city` VALUES (1, 63);
INSERT INTO `user_city` VALUES (2, 3);
INSERT INTO `user_city` VALUES (3, 28);
INSERT INTO `user_city` VALUES (4, 288);
INSERT INTO `user_city` VALUES (5, 385);