Steve Meyers has 15 experience as a MySQL DBA, web software architect, and system administrator, with an emphasis on scalability

Redundant Indexes

+ No comment yet
One of the more egregious forms of "too many indexes" is redundant indexes.  For example, let us consider the following table:

MariaDB [test]> desc presentations;
+---------------+-----------------------+------+-----+---------+----------------+
| Field         | Type                  | Null | Key | Default | Extra          |
+---------------+-----------------------+------+-----+---------+----------------+
| id            | int(10) unsigned      | NO   | PRI | NULL    | auto_increment |
| ts            | datetime              | NO   | MUL | NULL    |                |
| room          | mediumint(8) unsigned | NO   |     | NULL    |                |
| ss_presid     | int(10) unsigned      | NO   |     | NULL    |                |
| first_name    | varchar(255)          | NO   |     | NULL    |                |
| last_name     | varchar(255)          | NO   |     | NULL    |                |
| title         | varchar(255)          | NO   |     | NULL    |                |
| len_min       | mediumint(8) unsigned | NO   |     | NULL    |                |
| votes         | int(10) unsigned      | NO   |     | NULL    |                |
| timesid       | int(10) unsigned      | NO   |     | NULL    |                |
| num_conflicts | int(10) unsigned      | NO   |     | NULL    |                |
+---------------+-----------------------+------+-----+---------+----------------+
11 rows in set (0.00 sec)

Notice that the "Key" column doesn't really give us a ton of information.  We know that "ts" has multiple indexes, but nothing more than that.  As an aside, MySQL uses the terms "key" and "index" interchangeably.  To get better information about our keys or indexes, use the "SHOW KEYS FROM" query:

MariaDB [test]> SHOW KEYS FROM presentations;
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table         | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| presentations |          0 | PRIMARY  |            1 | id          | A         |          84 |     NULL | NULL   |      | BTREE      |         |               |
| presentations |          0 | ts_room  |            1 | ts          | A         |        NULL |     NULL | NULL   |      | BTREE      |         |               |
| presentations |          0 | ts_room  |            2 | room        | A         |          84 |     NULL | NULL   |      | BTREE      |         |               |
| presentations |          1 | ts       |            1 | ts          | A         |        NULL |     NULL | NULL   |      | BTREE      |         |               |
+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)

You'll notice that this is much more informative than the DESC command, at least as far as indexes are concerned.  The format takes some getting used to, though.  Use the "Key_name" column to determine which key we're talking about, as indexes can cross multiple columns.

Knowing that, we can tell that there are three keys: "PRIMARY", "ts_room", and "ts".  The "ts_room" index has two columns.  The "Seq_in_index" column tells us which comes first, although the "SHOW KEYS FROM" command will always return them in the proper order.

We have the primary key ("PRIMARY") which covers just the "id" column.  Primary keys are always unique indexes, which means that no two rows are allowed to have the same value for that key.  The "ts_room" index spans the "ts" and "room" columns, and is also unique.  Last, the "ts" index just covers the "ts" column, and is not unique.

So where is the redundant index?  First, we must understand how indexes work.  The left-most prefix of any index can be used by itself.  If we have an index on a varchar column, a SELECT query asking for rows with a value LIKE 'prefix%' will use the index on that column; a similar query looking for rows with a value LIKE '%suffix' will not use the index, since the left-most part is unspecified.

In multi-column indexes, this property also applies.  The columns in the index are processed in order, so a query on our presentations table with just the "room" specified would not be able to use the "ts_room" index.  However, a query with just the "ts" specified COULD use the "ts_room" index, since the "ts" is the leftmost part of it.

So now we've identified the redundant index: "ts" will never be needed, since its duties can be handled by "ts_room".  So, how do we get rid of it?

MariaDB [test]> ALTER TABLE presentations DROP INDEX ts;
Query OK, 84 rows affected (0.01 sec)              
Records: 84  Duplicates: 0  Warnings: 0

Post a Comment