Let's look at a typical user table:
+--------------+----------------------+------+-----+---------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------------+----------------------+------+-----+---------------------+----------------+ | userid | smallint(5) unsigned | NO | PRI | NULL | auto_increment | | username | varchar(20) | NO | UNI | | | | password | varchar(32) | NO | | | | | firstname | varchar(15) | NO | | | | | lastname | varchar(15) | NO | | | | | email | varchar(255) | NO | | | | | perms | varchar(255) | NO | | | | | last_login | datetime | NO | | 0000-00-00 00:00:00 | | +--------------+----------------------+------+-----+---------------------+----------------+
We have a userid field that is an auto-increment field, and a username that has a unique index on it. But what if we want to look up a user by email address? There is no index on the email field, and so the database server will need to do a "full table scan" in order to find the row(s) you are looking for. If we EXPLAIN our query, we can see that the key being used is NULL, and the rows being examined are many.
EXPLAIN SELECT * FROM users WHERE email='demo@example.com'; +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 79824 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
Whenever you do an EXPLAIN, the crucial columns you want to look at our "key" and "rows". Ideally, we always want the "rows" column to be 1, or as close to 1 as possible.
We can add an index to the email field like so:
ALTER TABLE users ADD INDEX idx_email (email);
Now, when we run the same EXPLAIN query as above, we get this result:
EXPLAIN SELECT * FROM users WHERE email='demo@example.com'; +------+-------------+-------+------+---------------+-----------+---------+-------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+-----------+---------+-------+------+-----------------------+ | 1 | SIMPLE | users | ref | idx_email | idx_email | 257 | const | 1 | Using index condition | +------+-------------+-------+------+---------------+-----------+---------+-------+------+-----------------------+
As you can see, the query is now using the idx_email key, and the value of rows is 1, as we had hoped.
Do indexes always help? Let's look at a similar example, but this time we are looking for users on the hotmail.com domain.
EXPLAIN SELECT * FROM users WHERE email LIKE '%@hotmail.com'; +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 79824 | Using where | +------+-------------+-------+------+---------------+------+---------+------+-------+-------------+
Despite the existence of an index on the email column, the query optimizer has chosen to do a full table scan. Your typical B-tree index operates from left to right. If you don't specify the left part of the field, the database server cannot use your index.
Post a Comment