Before change:
-rw-rw---- 1 mysql mysql 1177773692 2008-06-05 12:37 items_text.MYD -rw-rw---- 1 mysql mysql 1136713728 2008-06-05 12:37 items_text.MYI
After adding more indexes:
-rw-rw---- 1 mysql mysql 1156516200 2008-06-04 17:14 items_text.MYD -rw-rw---- 1 mysql mysql 1978787840 2008-06-04 17:14 items_text.MYI
Furthermore, this made the table much harder to update. Replication kept lagging last night (nagios was complaining loudly).
I've since reverted back to the way I had it, not using the index, which is the least worse of my options currently. Someone asked me to try using this query:
select count(*) from items_text where MATCH (content) AGAINST ('+iron +man' IN BOOLEAN MODE) or MATCH (title) AGAINST ('+iron +man' IN BOOLEAN MODE)
This doesn't use indexes the way you would hope either! As soon as you add one more index to the query, even with "OR", you get this:
mysql> explain select count(*) from items_text where MATCH (content) AGAINST ('+iron +man' IN BOOLEAN MODE)\G*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: items_text
type: fulltext
possible_keys: content
key: content
key_len: 0
ref:
rows: 1
Extra: Using where
1 row in set (0.00 sec)
mysql> explain select count(*) from items_text where MATCH (content) AGAINST ('+iron +man' IN BOOLEAN MODE) or MATCH (title) AGAINST ('+iron +man' IN BOOLEAN MODE)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: items_text
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1325662
Extra: Using where
1 row in set (0.00 sec)
So, back to the drawing board. Try another fulltext indexing option. This sure doesn't cut it.
Anonymous
June 5 2008, 18:31:47 UTC 3 years ago
FULLTEXT indexes
I found that FULLTEXT index is slow even when optimal (and it needs frequent optimizations). Since the next-best solution (Sphinx) is not yet usable , I recommend SolR/Lucene. Apart for much faster search the search itself is more flexible (you can find "iron" and "man" if there are no more than 3 words apart - for example). I know this doesn't exactly fit in Mysql replication chain but is the best solution I've fund so far. Maybe Sun should spend a bit of time to add SolR/Lucene as a storage engine.June 6 2008, 15:17:02 UTC 3 years ago
Re: FULLTEXT indexes
It wouldn't even necessarily necessitate a storage engine - perhaps some glue to make it work as a internal index.Anonymous
June 5 2008, 19:04:15 UTC 3 years ago
Combine the fields into one column
At first glance it appears you have 2 problems that are intertwined performance and replication.I've hit this wall before w/ fulltext indexes. To get around it I recommend creating a new column that stores all the data you want to search (you can use triggers or add some code to populate the column).
SELECT COUNT(*) FROM items_text
WHERE (MATCH (new_column) AGAINST ('+iron +man' IN BOOLEAN MODE));
Because all the data is in the new_column you won't have to deal composite indexes and it will scale! As you add a column to the items_text, just add it to the new_column also. You're using boolean searches... you could get away with removing the single column fulltext indexes which will lessen the your diskio which should solve your replication issue.
-sean
June 5 2008, 20:55:33 UTC 3 years ago
Sphinx.
Tried it?http://www.sphinxsearch.com/
June 6 2008, 15:14:40 UTC 3 years ago
Re: Sphinx.
Not yet, but I will very soon. Our CTO is under the impression that documentation is lacking, but it's certainly worth a shot.June 6 2008, 14:25:02 UTC 3 years ago
The problem with "OR" is that MySQL can only use 1 index per query (with exceptions, yes, but it's a general good rule).
You'll get much better performance if you do the queries separately. If you must do it in one query, use a UNION:
select count(*) from items_text
where MATCH (content) AGAINST ('+iron +man' IN BOOLEAN MODE)
UNION
select count(*) from items_text
where MATCH (title) AGAINST ('+iron +man' IN BOOLEAN MODE);
June 6 2008, 15:15:57 UTC 3 years ago