| Patrick Galbraith ( @ 2008-06-02 13:01:00 |
Deleting mismatches - which is more optimal?
I have tables that should be 1:1 relations but occasionally, they become mismatched, one table having more records than the other. I have been wondering which is the most optimal way to delete things like this.
I have a delete using a join (item_id is primary key):
vs. using a subquery:
Sometimes, the 2nd seems faster, but using a join seems more orthodox to me. I'm curious what people think about this.
If I make these into select queries (instead of delete) and run explain, I get:
'rows' is about the same. Index usage appears similar. Though using a subquery seems like it could be more expensive. Maybe this is a simple question I'm asking.
I have tables that should be 1:1 relations but occasionally, they become mismatched, one table having more records than the other. I have been wondering which is the most optimal way to delete things like this.
I have a delete using a join (item_id is primary key):
mysql> delete from items_keys using items_keys left join items using (item_id) where items.item_id is NULL;
vs. using a subquery:
mysql> delete from items_keys where item_id not in (select item_id from items);
Sometimes, the 2nd seems faster, but using a join seems more orthodox to me. I'm curious what people think about this.
If I make these into select queries (instead of delete) and run explain, I get:
mysql> explain select count(*) from items_keys left join items using (item_id) where items.item_id is NULL;+----+-------------+------------+--------+---------------+---------+---------+--------------------------+-------+--------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+--------+---------------+---------+---------+--------------------------+-------+--------------------------------------+ | 1 | SIMPLE | items_keys | index | NULL | PRIMARY | 8 | NULL | 73915 | Using index | | 1 | SIMPLE | items | eq_ref | PRIMARY | PRIMARY | 8 | grazr.items_keys.item_id | 1 | Using where; Using index; Not exists | +----+-------------+------------+--------+---------------+---------+---------+--------------------------+-------+--------------------------------------+ 2 rows in set (0.03 sec) mysql> explain select count(*) from items_keys where item_id not in (select item_id from items); +----+--------------------+------------+-----------------+---------------+---------+---------+------+-------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+------------+-----------------+---------------+---------+---------+------+-------+--------------------------+ | 1 | PRIMARY | items_keys | index | NULL | PRIMARY | 8 | NULL | 73918 | Using where; Using index | | 2 | DEPENDENT SUBQUERY | items | unique_subquery | PRIMARY | PRIMARY | 8 | func | 1 | Using index | +----+--------------------+------------+-----------------+---------------+---------+---------+------+-------+--------------------------+ 2 rows in set (0.01 sec)
'rows' is about the same. Index usage appears similar. Though using a subquery seems like it could be more expensive. Maybe this is a simple question I'm asking.