Selecting random record from MySQL database table.

March 19, 2006

The simplest way of selecting random rows from the MySQL database is to use "ORDER BY RAND()" clause in the query.

Solution 1
[SQL]
SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1;

The problem with this method is that it is very slow. The reason for it being so slow is that MySQL creates a temporary table with all the result rows and assigns each one of them a random sorting index. The results are then sorted and returned.

There are several workarounds to speed things up.

The basic idea is to get a random number and then select a specific row using this number.

In the case that all the rows have unique ids we will just have to pick a random number between the smallest and the biggest id and then select the row with id that equals that number. To make this method work when ids are not evenly distributed we will have to use ">=" operator instead of "=" in the last query.

To get the minimum and maximum id values in the entire table we will use MAX() and MIN() aggregate functions. These functions will return minimum and maximum value in the specified group. The group in our case is all the values of `id` column in our table.

Solution 2
[PHP]
$range_result = mysql_query( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM `table` ");
$range_row = mysql_fetch_object( $range_result );
$random = mt_rand( $range_row->min_id , $range_row->max_id );
$result = mysql_query( " SELECT * FROM `table` WHERE `id` >= $random LIMIT 0,1 ");

As we mentioned this method is limited to tables with unique id for each row. What to do if it's not the case?

The solution is to use the MySQL LIMIT clause. LIMIT accepts two arguments. The first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1).

To calculate the offset to the first row we will generate a random number between 0 and 1 using MySQL's RAND() function. Then we will multiply this number by number of records in the table, which we will get using COUNT() function. Since LIMIT arguments must be integers and not float values we will round the resulting number using FLOOR() function. FLOOR() is an arithmetic function that calculates the largest integer value that is smaller than or equal to the expression. The resulting code will look like this:

Solution 3
[PHP]
$offset_result = mysql_query( " SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `table` ");
$offset_row = mysql_fetch_object( $offset_result );
$offset = $offset_row->offset;
$result = mysql_query( " SELECT * FROM `table` LIMIT $offset, 1 " );

In MySQL 4.1 and later we can combine two previous methods using subquery like so:

Solution 4
[SQL]
SELECT * FROM `table` WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `table` ) ORDER BY id LIMIT 1;

This solution has the same weakness as the solution 2 e.g. it only works for tables with unique ids.

Remember the reason we started looked for alternative ways of selecting random rows? Speed! So how do these methods compare in terms of execution times. I am not going to go into specifics of hardware and software configuration or give precise numbers. The approximate results are:

The winner is solution 3.

 

Round vs Floor

Wouldn't "ROUND" be a better function to use instead of "FLOOR?" As floor would always return the lowest value, 3.9 & 3.4 would be covertered to an integer 3. Obviously when you use "ROUND" 3.9 = 4 and 3.4 =3. This provides a more "random" spread.

Posted by: G. C. on April 15, 2008.


Thank you for this article

First thank you for good article for multiple random value and you don't need duplicate 1. count total rows $_count=mysql_result(mysql_query("SELECT count(*) FROM table WHERE somefield='this'"),0); 2. then define an array to keep used offset value $used_id=array(); [Loop start] 3. start the random selection and store selected id in the array do { $rand_id_offset=mt_rand(1,$_count); } while (in_array($rand_id_offset,$used_id)); $used_id[]=$rand_id_offset; 4. fetch the row $sql = mysql_query("SELECT * FROM table WHERE somefield='this' LIMIT $rand_id_offset, 1"); $result=mysql_fetch_array($sql); [Loop end] repeat 3,4 until you get X total number of result this will get you 600% faster than using Solution 3 for multiple row selection

Posted by: Lee on April 3, 2008.


good technique

until now i do not know that it has 4 types of solutinos the 4th query is great as far as prrformence is concerned

Posted by: this is vey good technique on April 3, 2008.


How to fast my perl Search function?

Anyone know how to change this one to 'solution 4' to fast my search? $query = "SELECT * FROM 'table' WHERE MATCH(name) AGAINST('$qstring' IN BOOLEAN MODE) ORDER BY rand() LIMIT 10"; that's fulltext search funtion. Here was my test: $query = "SELECT * FROM 'table' WHERE MATCH(name) AGAINST('$qstring' IN BOOLEAN MODE) AND id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `table` ) ORDER BY id LIMIT 10"; It work but only show less result than it was. Please!

Posted by: Marco on February 25, 2008.


Non-repeating??

Solution 3 works like a charm. What's the best method to select 3 random and non-repeating rows? Thank you.

Posted by: Maxwell on February 19, 2008.


Thanks!!!

Solution 3 help me to more than rest of the others, thank u soo much

Posted by: Harikrishnan on February 8, 2008.


Thanks

Thanks! Solution 3 is most suitable one for me.

Posted by: Mark, Soe Min on January 28, 2008.


You hit the nail on the head

Thankyou ! Exactly what I was looking for. I was scratching my head trying to work that one out !

Posted by: Sean on January 16, 2008.


Least Bandwidth

Thanks a lot..Could also tell us which of these methods will consume the least bandwidth. I shall be using this query in a page that shall be executed a lot many times in succession, so I dont want to lose bandwidth because of this..

Posted by: Anand on January 14, 2008.


Thanks

Perfect....

Posted by: Dipraj Kuwar on January 10, 2008.


Nice example

Goood

Posted by: Muhamamd Irfan Ch on December 24, 2007.


Very usefull Thanks Buddy

Working on a table with about 2million rows RAND() was actually generating such massive tmp tables that the OS couldnt handle it... solution 4 was a 17 second process, solution 3 took less than 1 second. Great tips! Do you have any good scripts for making a good weighted random selection? I have been told that the best way to do it is to "order by rand()*weight". But that is still using the slow "order by rand()".

Posted by: Josh on December 22, 2007.


How about a Weighted Selection...

Do you have any good scripts for making a good weighted random selection? I have been told that the best way to do it is to "order by rand()*weight". But that is still using the slow "order by rand()".

Posted by: Josh on December 11, 2007.


Very usefull

Thank You, it's very usefull :)

Posted by: Yordan Milenkov on December 10, 2007.


Thanks Buddy

Thanks for helping me, i really need such type of help for my new site.

Posted by: Syed Zaeem-ud-DIn on November 4, 2007.


#3 is the best. Thanks for posting this analysis. It saved my time and efforts.

Posted by: Sergiy on October 12, 2007.


Noob Question

this is more of a question. What exactly "->" does on $offset = $offset_row->offset; Thanks

Posted by: Javier on September 18, 2007.


Uncomplete tip

Solution 1 takes about 9 seconds. Solution 4: no idea. Surely more than 10 minutes. I've interrupted it. My table has 300,000 entries.

Posted by: Maurizio on August 31, 2007.


Great tip

Working on a table with about 2million rows RAND() was actually generating such massive tmp tables that the OS couldnt handle it... solution 4 was a 17 second process, solution 3 took less than 1 second. Great tips!

Posted by: Arlo on August 10, 2007.


Just to say thanks

Thanks, it helped me alot!

Posted by: Boris on July 25, 2007.


prevent Repetition

nice code but how to prevent Repetition ?

Posted by: Manpreet Singh Mann on July 25, 2007.


What about multiple row

What about more the one row, if we want to select randomly

Posted by: Vikas on July 24, 2007.


What would be faster?

// Option 1: SELECT id, title, type, source FROM ads_banners WHERE live = 1 AND ads_zones_id = 1 AND id >= (SELECT FLOOR(MAX(id) * RAND()) FROM ads_banners WHERE live = 1 AND ads_zones_id = 1) ORDER BY id LIMIT 1 // Option 2: SELECT id, title, type, source FROM ads_banners WHERE live = 1 AND ads_zones_id = " . intval($zone_id) . " ORDER BY RAND() LIMIT 1 ----------------------------------------------------------------- */

Posted by: Thomas on July 23, 2007.


Selecting random record from MySQL database table.

good one

Posted by: Fasil on July 12, 2007.


Good One

Thanks! It is really help random row. The best part of this article is comparison among different approaches. Thanks again - Madhukar

Posted by: Madhukar on May 2, 2007.


solve the random number genrated problem

Hi this code sovled my random number genrated problem SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1; Thanks

Posted by: asghar on March 27, 2007.


ASP version?

Does anyone has ASP version of aboe example. I am not php person. Thanks.

Posted by: Port80 on March 26, 2007.


About Solution 3

I just happened to come across this through my search for the same issue I'm going to mention. If you are using solution 3, make sure it's done on small data sets. The thing about MySQL's usage of limit offsets, apparently LIMIT x,y retrieves x+y rows, and then discards x rows before sending the data back to the client. It's not a viable solution for picking a random record from millions of records. Solution 4 would be better in such a case.

Posted by: Pierre Fontenelle on January 30, 2007.


Thanks

I was looking for a method of taking random IDs from a database that lacks dome IDs. For example I have 1,3,6,8 ... Third method worked just great. Thanks a lot.

Posted by: Domino on January 7, 2007.


Why not use offset?

count = select count(*) from table; index = randomInt( count); select * from table offset index count 1;

Posted by: Someone on August 29, 2006.


Method 2 is buggy

it will, in general, not select each row with equal probability. Exmple: three rows with IDs 0,1, and 1000000 will almost always select the last one.

Posted by: Someone on August 29, 2006.


Thanks for tip

This is realy helpfull informacion. Thanks again.

Posted by: kail on June 30, 2006.


indexed columns

Solution 2 assumes an even distribution of values in the indexed fields. And solution 3 was very slow on a large table (> 4 million rows). Faster was to select only indexed fields (instead of '*') in solution 3.

Posted by: Brian on May 4, 2006.


Thanks, too!

Just what I was looking for.

Posted by: poly@omino.com on May 3, 2006.


Thanks for this tip

This helped me out, 'cause mysql's RAND() really is slow. Solution 2 worked out the best for me on a table with over 5000 rows.

Posted by: DrTebi on April 25, 2006.