MySql Explain ignoring the unique index in a particular query

I started looking into Index(es) in depth for the first time and started analyzing our db beginning from the users table for the first time. I searched SO to find a similar question but was not able to frame my search well, I guess.

I was going through a particular concept and this first observation left me wondering - The difference in these Explain(s) [Difference : First query is using'a%'while the second query is using'ab%']

[Total number of rows inuserstable =9193]:

1) explain select * from users where email_address like 'a%';

MySql Explain ignoring the unique index in a particular query


(Actually matching columns =1240)

2) explain select * from users where email_address like 'ab%';

MySql Explain ignoring the unique index in a particular query

(Actually matching columns =109)

The index looks like this :MySql Explain ignoring the unique index in a particular query

My question:Why is the index totally ignored in the first query? Does mySql think that it is a better idea not to use the index in thecase 1? If yes, why?

If the probability, based statistics mysql collects on distribution of the values, is above a certain ratio of the total rows (typically 1/11 of the total), mysql deems it more efficient to simply scan the whole table reading the disks pages in sequentially, rather than use the index jumping around the disk pages in random order.

You could try your luck with this query, which may use the index:

where email_address between 'a' and 'az'

Although doing the full scan may actually be faster.

This is not a direct answer to your question but I still want to point it out (in case you already don't know):

Try:

explain select email_address from users where email_address like 'a%';
explain select email_address from users where email_address like 'ab%';

MySQL would now use indexes in both the queries above since the columns of interest are directly available from the index.

Probably in the case where you do a "select *", index access is more costly since the optmizer has to go through the index records, find the row ids and then go back to the table to retrieve other column values.

But in the query above where you only do a "select email_address", the optmizer knows all the information desired is available right from the index and hence it would use the index irrespective of the 30% rule.

Experts, please correct me if I am wrong.

What Others Are Reading