Thursday, July 7, 2011

MySQL LIMIT clause.. and when you shouldn't use it

Today I learned something that just amazed me about my trusty MySQL and most of all, the fact that I'm only learning this today, after years of using MySQL.

The senerio: a coworker of mine was running a query on a MySQL table that held around 10 million records. For context: the table was using the MyISAM engine and was not being modified in any way (no updates/inserts, only selects). The table also had an auto-incremented indexed ID field. The query itsself was very simple:

select id_content, content from `tablename` limit 1, 100000;


The limit offset was actually being used as a sliding window, and the limit count was a constant 100 thousand. Watching the the mysql server status in real time showed something strange, as the script was running locks started appearing and blocking as well as slowing down dramatically for each new window.

Reason: The limit clause. Little did we know, the MySQL limit has a "bizarre" implementation that doesn't work like I would have expected, when using an offset/count pair, the server actually selects ALL the records up until the offset, then selects `offset` more records and returns the later.

So for example, when do a "select * from tablename limit 100, 10;" the server is actually selecting the 110 first results then only returning the last 10 of that result set. Not too bad for a small table, but for a larger table this is just horrible: "Limit 100000, 10" actually selects (and allocates memory for) 100010 records then using the result of that, returns the last 10.

Better solution: stick with the indexed field and use where clauses instead.
select id_content, content from `tablename` where id_content > 10000 and id_content < 100010;

Cheers!

Edit*

I would like to clarify that using the LIMIT clause shouldn't be banned altogther, simply the limit clauses that use an offset and on large tables ;)