Steve Meyers has 15 experience as a MySQL DBA, web software architect, and system administrator, with an emphasis on scalability

Other kinds of indexes

+ No comment yet

B-Tree Index

As MySQL's default index type, a b-tree index is reasonably fast to insert new items into, and is fast at looking up individual items or even ranges.  The index is stored as a branching tree, so looking up any one item is not an expensive operation.  Each level of branching allows an order of magnitude more items.  It's not always the best solution, but in some cases it may be the only one.  B-tree indexes are supported on all standard storage engines.

Hash Index

Hash indexes are similar to associative arrays (or hashes) in many programming languages.  They are very fast for looking up individual items, but cannot look up ranges.  They are stored as a series of "buckets", and a hash algorithm is used for determining which bucket to put an item into.  Good hash algorithms are very inexpensive, and spread input values evenly among the buckets.  It is because they are spread out that range operations become impossible.  Hash indexes are currently only supported for MEMORY and NDB (MySQL Cluster) storage engines.

Fulltext Index

Fulltext indexes are good for searching out individual words inside a long block of text.  "Good" is a relative term; MySQL's fulltext indexes perform better than no index, but for decent performance on larger tables, you'll probably want to use a dedicated search engine such as Sphinx, Solr, Lucene, or other similar applications.  I personally use Sphinx, because you can connect to it from MySQL using the SphinxSE storage engine.  Fulltext indexes are only available in the MyISAM storage engine, which is another reason to look into other solutions.

Spatial Index

If you're working with spatial data, such as latitude and longitude, you might be able to get a speed boost with spatial indexes.  Be careful with the spatial functions though; if you're not using MySQL 5.6 or MariaDB 5.5 (or higher), the functions that test relationships between polygons operate in "minimum bounding rectangle" mode.  Spatial indexes are only available in the MyISAM storage engine, although spatial data can be stored in InnoDB.  In order to protect your data, I would recommend storing your data in InnoDB, and creating a cache table in MyISAM for doing spatial lookups on.

Roll Your Own Index

Sometimes, it makes sense to create your own indexing system.  I'm not talking about hacking one into the MySQL source code, but rather finding other ways to index your data.  In a thread on the MySQL mailing list from 2001, I discussed the benefits of using a separately calculated hash column to index a column containing URLs.  The crux of it is that you don't want the length of your indexes to be too long, because you want MySQL to be able to fit all of your indexes in memory.  It's also more expensive to write longer indexes to disk.

At another company, we found that as fast as spatial indexes are, integer b-tree indexes are faster.  We rounded our latitude and longitude and put them together to create a grid, and then we were able to just request the items in a list of grid squares.  We also included the more precise fields in our where clause, in order to avoid getting extra points back from our query.

So if the default index types aren't working for you, try thinking outside the box.  There may be a solution that nobody else has thought of before, that will increase your index speed substantially.

Post a Comment