Patrick Galbraith (capttofu) wrote,

FULLTEXT lesson/reminder of the day

I've been using MySQL fulltext indexes on a table where I keep a few varchar and one text column that is used for searches. I've had it defined as:

CREATE TABLE `items_text` (
  `item_id` bigint(20) NOT NULL,
  `fts` varchar(4) NOT NULL default 'grzr',
  `author` varchar(80) NOT NULL default '',
  `title` varchar(255) NOT NULL default '',
  `content` text NOT NULL,
  PRIMARY KEY  (`item_id`),
  FULLTEXT KEY `title` (`title`),
  FULLTEXT KEY `author` (`author`),
  FULLTEXT KEY `fts` (`fts`),
  FULLTEXT KEY `content` (`content`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8


One of my colleagues pointed out he was experiencing slow performance with this query:

select count(*) from items_text where (MATCH (title, author, content) AGAINST ('+iron +man' IN BOOLEAN MODE))


I ran EXPLAIN just to make sure that the index was being used:

mysql> explain select count(*) from items_text where (MATCH (title, author, content) AGAINST ('+iron +man' IN BOOLEAN MODE))\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: items_text
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1286218
        Extra: Using where


Ug! I thought "surely since I have indexes on all of those columns, that the fulltext indexes should be used." This is not the case.

In order for this query to use the a fulltext index for those columns, there needs to be another index, multi-column, which has the exact same columns specified in the query.

 create fulltext index combined_idx on items_text (content,author,title)

mysql> explain select count(*) from items_text where MATCH (content,author,title) AGAINST ('+iron +man' IN BOOLEAN MODE)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: items_text_foo
         type: fulltext
possible_keys: combined_idx
          key: combined_idx
      key_len: 0
          ref: 
         rows: 1
        Extra: Using where
1 row in set (0.02 sec)


Note: they don't have to have the exact same order.


The manual (which I missed this part when I initially read it and was low on caffeine at the time) says this:

"The MATCH() column list must match exactly the column list in some FULLTEXT index definition for the table, unless this MATCH() is IN BOOLEAN MODE. Boolean-mode searches can be done on non-indexed columns, although they are likely to be slow."

Hit me with a clue stick.

On our site, we allow a user to specify any or all columns to be searched upon, so any possible combination in the "MATCH (..." part of the query needs to have a corresponding index.

Unfortunately, this means I have to create indexes for every combination. This is a table of around 1.5M rows, so this will take a while to create.

The moral of the story: Never assume you are using index(es). Run explain.

Now, I wish there was a way to not have to create so many indexes to facilitate this functionality that we provide.

One last note: You also cannot have MATCH(...) against FULLTEXT indexes across tables.
Tags: fulltext indexes, index, mysql, schema design, search functionality
  • Post a new comment

    Error

    Anonymous comments are disabled in this journal

    default userpic
  • 0 comments