Selecting random record from MySQL database table.

The simplest way of selecting random rows from the MySQL database is to use "ORDER BY RAND()" clause in the query.

Solution 1 [SQL]
SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1;

The problem with this method is that it is very slow. The reason for it being so slow is that MySQL creates a temporary table with all the result rows and assigns each one of them a random sorting index. The results are then sorted and returned.

There are several workarounds to speed things up.

The basic idea is to get a random number and then select a specific row using this number.

In the case that all the rows have unique ids we will just have to pick a random number between the smallest and the biggest id and then select the row with id that equals that number. To make this method work when ids are not evenly distributed we will have to use ">=" operator instead of "=" in the last query.

To get the minimum and maximum id values in the entire table we will use MAX() and MIN() aggregate functions. These functions will return minimum and maximum value in the specified group. The group in our case is all the values of `id` column in our table.

Solution 2 [PHP]
$range_result = mysql_query( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM `table` ");
$range_row = mysql_fetch_object( $range_result );
$random = mt_rand( $range_row->min_id , $range_row->max_id );
$result = mysql_query( " SELECT * FROM `table` WHERE `id` >= $random LIMIT 0,1 ");

As we mentioned this method is limited to tables with unique id for each row. What to do if it's not the case?

The solution is to use the MySQL LIMIT clause. LIMIT accepts two arguments. The first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1).

To calculate the offset to the first row we will generate a random number between 0 and 1 using MySQL's RAND() function. Then we will multiply this number by number of records in the table, which we will get using COUNT() function. Since LIMIT arguments must be integers and not float values we will round the resulting number using FLOOR() function. FLOOR() is an arithmetic function that calculates the largest integer value that is smaller than or equal to the expression. The resulting code will look like this:

Solution 3 [PHP]
$offset_result = mysql_query( " SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `table` ");
$offset_row = mysql_fetch_object( $offset_result );
$offset = $offset_row->offset;
$result = mysql_query( " SELECT * FROM `table` LIMIT $offset, 1 " );

In MySQL 4.1 and later we can combine two previous methods using subquery like so:

Solution 4 [SQL]
SELECT * FROM `table` WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `table` ) ORDER BY id LIMIT 1;

This solution has the same weakness as the solution 2 e.g. it only works for tables with unique ids.

Remember the reason we started looked for alternative ways of selecting random rows? Speed! So how do these methods compare in terms of execution times. I am not going to go into specifics of hardware and software configuration or give precise numbers. The approximate results are:

  • The slowest method is solution 1. Let's say that it took 100% of time to execute.
  • Solution 2 took 79%.
  • Solution 3 - 13%.
  • Solution 4 - 16%.

The winner is solution 3.

Tagged:
mysql
php

Comments

Using order by rand() is bad for big tables. I have a better solution using ONLY MySQL query. It involves User Defined Variables. See http://www.mysqldiary.com/how-to-produce-random-rows-from-a-table/
Thanks for the great article! I found your solution 3 to be exactly what I need. Yaaaay!
This code is faster then all in 100 once: $sql = "SELECT MAX(id) maxid FROM table"; $id = mysql_fetch_object(mysql_query($sql)); do { $rndid = rand(1, $id->maxid); $sql = "SELECT id FROM table WHERE id = ".$rndid; $rnd = mysql_fetch_object(mysql_query($sql)); } while (!$rnd); If you have a deleted id's it's scale down. Don't care if you have a half of all id's deleted it's be faster then all 4 methods.
  • Gravatar
  • James
  • 2009-02-05 19:33:27
As someone said: * * Someone * 2006-08-29 03:11:07 Method 2 is buggy it will, in general, not select each row with equal probability. Exmple: three rows with IDs 0,1, and 1000000 will almost always select the last one. ----- This also applies to method 4. Method 3 is the only alternative that will still be random. Though if you want to select more than one random row, you would have to run query multiple times, and generate the three random non-repeating offsets before hand.
  • Gravatar
  • Pascal
  • 2009-01-16 19:01:11
Solution 4 took 1 minute on a table with 598420 rows. I ended up doing this in PHP (table is static at this point) $r = rand(1,598420); $sql = "SELECT * FROM table WHERE id = $r"; Takes 0.01 secs.
But: min id = 1 max id = 10 and I removed record with id = 6 Your php (max() & min() method) method doesn't work.
  • Gravatar
  • Gene
  • 2008-12-12 06:58:34
I have a question and an idea:
The idea: Since we can optimize Solution 2 to Solution 4, why can't we optimize Solution 3 in the same way?
The question: Solution 1 is slower, but it can generate several random records at the same time while the other four can only generate consequent records. Does this mean that Solution 1 may still be the choice if I need several records each time.
  • Gravatar
  • Wombat
  • 2008-07-27 17:45:49
Great article! I was thinking that there had to be a better way of doing this than ORDER BY RAND() that wasn't a page long, and this is definitely it. As for your question about using ROUND(), G.C., I think the reason to use FLOOR() is because the result from it is used as an offset in the LIMIT clause. Since the offset is zero-based, the range of acceptable values is 0 to COUNT-1. If you used ROUND(), you would (very rarely, but still every once in a while) get an offset equal to COUNT, which if I'm correct would return nothing since there would never be a row with that offset.
  • Gravatar
  • G. C.
  • 2008-04-15 17:12:47
Round vs Floor Wouldn't "ROUND" be a better function to use instead of "FLOOR?" As floor would always return the lowest value, 3.9 & 3.4 would be covertered to an integer 3. Obviously when you use "ROUND" 3.9 = 4 and 3.4 =3. This provides a more "random" spread.
  • Gravatar
  • Lee
  • 2008-04-03 12:46:36
Thank you for this article First thank you for good article for multiple random value and you don't need duplicate 1. count total rows $_count=mysql_result(mysql_query("SELECT count(*) FROM table WHERE somefield='this'"),0); 2. then define an array to keep used offset value $used_id=array(); [Loop start] 3. start the random selection and store selected id in the array do { $rand_id_offset=mt_rand(1,$_count); } while (in_array($rand_id_offset,$used_id)); $used_id[]=$rand_id_offset; 4. fetch the row $sql = mysql_query("SELECT * FROM table WHERE somefield='this' LIMIT $rand_id_offset, 1"); $result=mysql_fetch_array($sql); [Loop end] repeat 3,4 until you get X total number of result this will get you 600% faster than using Solution 3 for multiple row selection
  • Gravatar
  • this is vey good technique
  • 2008-04-03 02:38:42
good technique until now i do not know that it has 4 types of solutinos the 4th query is great as far as prrformence is concerned
  • Gravatar
  • Marco
  • 2008-02-25 21:14:40
How to fast my perl Search function? Anyone know how to change this one to 'solution 4' to fast my search? $query = "SELECT * FROM 'table' WHERE MATCH(name) AGAINST('$qstring' IN BOOLEAN MODE) ORDER BY rand() LIMIT 10"; that's fulltext search funtion. Here was my test: $query = "SELECT * FROM 'table' WHERE MATCH(name) AGAINST('$qstring' IN BOOLEAN MODE) AND id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `table` ) ORDER BY id LIMIT 10"; It work but only show less result than it was. Please!
  • Gravatar
  • Maxwell
  • 2008-02-19 11:27:36
Non-repeating?? Solution 3 works like a charm. What's the best method to select 3 random and non-repeating rows? Thank you.
  • Gravatar
  • Hari
  • 2008-02-08 19:12:34
Thanks!!! Solution 3 help me to more than rest of the others, thank u soo much
  • Gravatar
  • Mark, Soe Min
  • 2008-01-28 04:41:19
Thanks! Solution 3 is most suitable one for me.
  • Gravatar
  • Sean
  • 2008-01-16 16:52:26
You hit the nail on the head Thankyou ! Exactly what I was looking for. I was scratching my head trying to work that one out !
  • Gravatar
  • Anand
  • 2008-01-14 12:47:44
Least Bandwidth Thanks a lot..Could also tell us which of these methods will consume the least bandwidth. I shall be using this query in a page that shall be executed a lot many times in succession, so I dont want to lose bandwidth because of this..
  • Gravatar
  • Dipraj Kuwar
  • 2008-01-10 01:05:04
Thanks Perfect....
  • Gravatar
  • Muhamamd Irfan Ch
  • 2007-12-24 04:23:03
Nice example Goood
  • Gravatar
  • Josh
  • 2007-12-22 03:38:44
Very usefull Thanks Buddy Working on a table with about 2million rows RAND() was actually generating such massive tmp tables that the OS couldnt handle it... solution 4 was a 17 second process, solution 3 took less than 1 second. Great tips! Do you have any good scripts for making a good weighted random selection? I have been told that the best way to do it is to "order by rand()*weight". But that is still using the slow "order by rand()".
  • Gravatar
  • Josh
  • 2007-12-11 01:09:31
How about a Weighted Selection... Do you have any good scripts for making a good weighted random selection? I have been told that the best way to do it is to "order by rand()*weight". But that is still using the slow "order by rand()".
  • Gravatar
  • Yordan Milenkov
  • 2007-12-10 03:00:38
Very usefull Thank You, it's very usefull :)
  • Gravatar
  • Syed Zaeem-ud-DIn
  • 2007-11-04 20:57:49
Thanks Buddy Thanks for helping me, i really need such type of help for my new site.
  • Gravatar
  • Sergiy
  • 2007-10-12 02:55:26
#3 is the best. Thanks for posting this analysis. It saved my time and efforts.
  • Gravatar
  • Javier
  • 2007-09-18 19:06:02
Noob Question this is more of a question. What exactly "->" does on $offset = $offset_row->offset; Thanks
  • Gravatar
  • Maurizio
  • 2007-08-31 10:59:50
Uncomplete tip Solution 1 takes about 9 seconds. Solution 4: no idea. Surely more than 10 minutes. I've interrupted it. My table has 300,000 entries.
  • Gravatar
  • Arlo
  • 2007-08-10 17:01:57
Great tip Working on a table with about 2million rows RAND() was actually generating such massive tmp tables that the OS couldnt handle it... solution 4 was a 17 second process, solution 3 took less than 1 second. Great tips!
  • Gravatar
  • Boris
  • 2007-07-25 19:32:04
Thanks, it helped me alot!
  • Gravatar
  • Manpreet Singh Mann
  • 2007-07-25 01:32:09
Nice code but how to prevent Repetition ?
  • Gravatar
  • Vikas
  • 2007-07-24 07:09:37
What about multiple row What about more the one row, if we want to select randomly
  • Gravatar
  • Thomas
  • 2007-07-23 07:36:30
What would be faster? // Option 1: SELECT id, title, type, source FROM ads_banners WHERE live = 1 AND ads_zones_id = 1 AND id >= (SELECT FLOOR(MAX(id) * RAND()) FROM ads_banners WHERE live = 1 AND ads_zones_id = 1) ORDER BY id LIMIT 1 // Option 2: SELECT id, title, type, source FROM ads_banners WHERE live = 1 AND ads_zones_id = " . intval($zone_id) . " ORDER BY RAND() LIMIT 1 ----------------------------------------------------------------- */
  • Gravatar
  • Fasil
  • 2007-07-12 05:57:56
Selecting random record from MySQL database table. good one
  • Gravatar
  • Madhukar
  • 2007-05-02 06:21:00
Good One Thanks! It is really help random row. The best part of this article is comparison among different approaches. Thanks again - Madhukar
  • Gravatar
  • asghar
  • 2007-03-27 08:09:06
solve the random number genrated problem Hi this code sovled my random number genrated problem SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1; Thanks
  • Gravatar
  • Port80
  • 2007-03-26 10:55:24
ASP version? Does anyone has ASP version of aboe example. I am not php person. Thanks.
  • Gravatar
  • Pierre Fontenelle
  • 2007-01-30 04:27:35
About Solution 3 I just happened to come across this through my search for the same issue I'm going to mention. If you are using solution 3, make sure it's done on small data sets. The thing about MySQL's usage of limit offsets, apparently LIMIT x,y retrieves x+y rows, and then discards x rows before sending the data back to the client. It's not a viable solution for picking a random record from millions of records. Solution 4 would be better in such a case.
  • Gravatar
  • Domino
  • 2007-01-07 22:05:57
Thanks I was looking for a method of taking random IDs from a database that lacks dome IDs. For example I have 1,3,6,8 ... Third method worked just great. Thanks a lot.
  • Gravatar
  • Someone
  • 2006-08-29 03:12:36
Why not use offset? count = select count(*) from table; index = randomInt( count); select * from table offset index count 1;
  • Gravatar
  • Someone
  • 2006-08-29 03:11:07
Method 2 is buggy it will, in general, not select each row with equal probability. Exmple: three rows with IDs 0,1, and 1000000 will almost always select the last one.
  • Gravatar
  • kail
  • 2006-06-30 09:15:16
Thanks for tip This is realy helpfull informacion. Thanks again.
  • Gravatar
  • Brian
  • 2006-05-04 11:55:08
indexed columns Solution 2 assumes an even distribution of values in the indexed fields. And solution 3 was very slow on a large table (> 4 million rows). Faster was to select only indexed fields (instead of '*') in solution 3.
  • Gravatar
  • poly@omino.com
  • 2006-05-03 02:52:00
Thanks, too! Just what I was looking for.
  • Gravatar
  • DrTebi
  • 2006-04-25 13:03:00
Thanks for this tip This helped me out, 'cause mysql's RAND() really is slow. Solution 2 worked out the best for me on a table with over 5000 rows.

© 2003 — 2014 Akinas
All rights reserved