![]() |
|
The simplest way of selecting random rows from the MySQL database is to use "ORDER BY RAND()" clause in the query.
SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1;The problem with this method is that it is very slow. The reason for it being so slow is that MySQL creates a temporary table with all the result rows and assigns each one of them a random sorting index. The results are then sorted and returned.
There are several workarounds to speed things up.
The basic idea is to get a random number and then select a specific row using this number.
In the case that all the rows have unique ids we will just have to pick a random number between the smallest and the biggest id and then select the row with id that equals that number. To make this method work when ids are not evenly distributed we will have to use ">=" operator instead of "=" in the last query.
To get the minimum and maximum id values in the entire table we will use MAX() and MIN() aggregate functions. These functions will return minimum and maximum value in the specified group. The group in our case is all the values of `id` column in our table.
$range_result = mysql_query( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM `table` ");
$range_row = mysql_fetch_object( $range_result );
$random = mt_rand( $range_row->min_id , $range_row->max_id );
$result = mysql_query( " SELECT * FROM `table` WHERE `id` >= $random LIMIT 0,1 "); As we mentioned this method is limited to tables with unique id for each row. What to do if it's not the case?
The solution is to use the MySQL LIMIT clause. LIMIT accepts two arguments. The first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1).
To calculate the offset to the first row we will generate a random number between 0 and 1 using MySQL's RAND() function. Then we will multiply this number by number of records in the table, which we will get using COUNT() function. Since LIMIT arguments must be integers and not float values we will round the resulting number using FLOOR() function. FLOOR() is an arithmetic function that calculates the largest integer value that is smaller than or equal to the expression. The resulting code will look like this:
$offset_result = mysql_query( " SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `table` ");
$offset_row = mysql_fetch_object( $offset_result );
$offset = $offset_row->offset;
$result = mysql_query( " SELECT * FROM `table` LIMIT $offset, 1 " ); In MySQL 4.1 and later we can combine two previous methods using subquery like so:
SELECT * FROM `table` WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `table` ) ORDER BY id LIMIT 1;This solution has the same weakness as the solution 2 e.g. it only works for tables with unique ids.
Remember the reason we started looked for alternative ways of selecting random rows? Speed! So how do these methods compare in terms of execution times. I am not going to go into specifics of hardware and software configuration or give precise numbers. The approximate results are:
The winner is solution 3.
Wouldn't "ROUND" be a better function to use instead of "FLOOR?" As floor would always return the lowest value, 3.9 & 3.4 would be covertered to an integer 3. Obviously when you use "ROUND" 3.9 = 4 and 3.4 =3. This provides a more "random" spread.
Posted by: G. C. on April 15, 2008.
First thank you for good article for multiple random value and you don't need duplicate 1. count total rows $_count=mysql_result(mysql_query("SELECT count(*) FROM table WHERE somefield='this'"),0); 2. then define an array to keep used offset value $used_id=array(); [Loop start] 3. start the random selection and store selected id in the array do { $rand_id_offset=mt_rand(1,$_count); } while (in_array($rand_id_offset,$used_id)); $used_id[]=$rand_id_offset; 4. fetch the row $sql = mysql_query("SELECT * FROM table WHERE somefield='this' LIMIT $rand_id_offset, 1"); $result=mysql_fetch_array($sql); [Loop end] repeat 3,4 until you get X total number of result this will get you 600% faster than using Solution 3 for multiple row selection
Posted by: Lee on April 3, 2008.
until now i do not know that it has 4 types of solutinos the 4th query is great as far as prrformence is concerned
Posted by: this is vey good technique on April 3, 2008.
Anyone know how to change this one to 'solution 4' to fast my search? $query = "SELECT * FROM 'table' WHERE MATCH(name) AGAINST('$qstring' IN BOOLEAN MODE) ORDER BY rand() LIMIT 10"; that's fulltext search funtion. Here was my test: $query = "SELECT * FROM 'table' WHERE MATCH(name) AGAINST('$qstring' IN BOOLEAN MODE) AND id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `table` ) ORDER BY id LIMIT 10"; It work but only show less result than it was. Please!
Posted by: Marco on February 25, 2008.
Solution 3 works like a charm. What's the best method to select 3 random and non-repeating rows? Thank you.
Posted by: Maxwell on February 19, 2008.
Solution 3 help me to more than rest of the others, thank u soo much
Posted by: Harikrishnan on February 8, 2008.
Thanks! Solution 3 is most suitable one for me.
Posted by: Mark, Soe Min on January 28, 2008.
Thankyou ! Exactly what I was looking for. I was scratching my head trying to work that one out !
Posted by: Sean on January 16, 2008.
Thanks a lot..Could also tell us which of these methods will consume the least bandwidth. I shall be using this query in a page that shall be executed a lot many times in succession, so I dont want to lose bandwidth because of this..
Posted by: Anand on January 14, 2008.
Perfect....
Posted by: Dipraj Kuwar on January 10, 2008.
Goood
Posted by: Muhamamd Irfan Ch on December 24, 2007.
Working on a table with about 2million rows RAND() was actually generating such massive tmp tables that the OS couldnt handle it... solution 4 was a 17 second process, solution 3 took less than 1 second. Great tips! Do you have any good scripts for making a good weighted random selection? I have been told that the best way to do it is to "order by rand()*weight". But that is still using the slow "order by rand()".
Posted by: Josh on December 22, 2007.
Do you have any good scripts for making a good weighted random selection? I have been told that the best way to do it is to "order by rand()*weight". But that is still using the slow "order by rand()".
Posted by: Josh on December 11, 2007.
Thank You, it's very usefull :)
Posted by: Yordan Milenkov on December 10, 2007.
Thanks for helping me, i really need such type of help for my new site.
Posted by: Syed Zaeem-ud-DIn on November 4, 2007.
#3 is the best. Thanks for posting this analysis. It saved my time and efforts.
Posted by: Sergiy on October 12, 2007.
this is more of a question. What exactly "->" does on $offset = $offset_row->offset; Thanks
Posted by: Javier on September 18, 2007.
Solution 1 takes about 9 seconds. Solution 4: no idea. Surely more than 10 minutes. I've interrupted it. My table has 300,000 entries.
Posted by: Maurizio on August 31, 2007.
Working on a table with about 2million rows RAND() was actually generating such massive tmp tables that the OS couldnt handle it... solution 4 was a 17 second process, solution 3 took less than 1 second. Great tips!
Posted by: Arlo on August 10, 2007.
Thanks, it helped me alot!
Posted by: Boris on July 25, 2007.
nice code but how to prevent Repetition ?
Posted by: Manpreet Singh Mann on July 25, 2007.
What about more the one row, if we want to select randomly
Posted by: Vikas on July 24, 2007.
// Option 1: SELECT id, title, type, source FROM ads_banners WHERE live = 1 AND ads_zones_id = 1 AND id >= (SELECT FLOOR(MAX(id) * RAND()) FROM ads_banners WHERE live = 1 AND ads_zones_id = 1) ORDER BY id LIMIT 1 // Option 2: SELECT id, title, type, source FROM ads_banners WHERE live = 1 AND ads_zones_id = " . intval($zone_id) . " ORDER BY RAND() LIMIT 1 ----------------------------------------------------------------- */
Posted by: Thomas on July 23, 2007.
good one
Posted by: Fasil on July 12, 2007.
Thanks! It is really help random row. The best part of this article is comparison among different approaches. Thanks again - Madhukar
Posted by: Madhukar on May 2, 2007.
Hi this code sovled my random number genrated problem SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1; Thanks
Posted by: asghar on March 27, 2007.
Does anyone has ASP version of aboe example. I am not php person. Thanks.
Posted by: Port80 on March 26, 2007.
I just happened to come across this through my search for the same issue I'm going to mention. If you are using solution 3, make sure it's done on small data sets. The thing about MySQL's usage of limit offsets, apparently LIMIT x,y retrieves x+y rows, and then discards x rows before sending the data back to the client. It's not a viable solution for picking a random record from millions of records. Solution 4 would be better in such a case.
Posted by: Pierre Fontenelle on January 30, 2007.
I was looking for a method of taking random IDs from a database that lacks dome IDs. For example I have 1,3,6,8 ... Third method worked just great. Thanks a lot.
Posted by: Domino on January 7, 2007.
count = select count(*) from table; index = randomInt( count); select * from table offset index count 1;
Posted by: Someone on August 29, 2006.
it will, in general, not select each row with equal probability. Exmple: three rows with IDs 0,1, and 1000000 will almost always select the last one.
Posted by: Someone on August 29, 2006.
This is realy helpfull informacion. Thanks again.
Posted by: kail on June 30, 2006.
Solution 2 assumes an even distribution of values in the indexed fields. And solution 3 was very slow on a large table (> 4 million rows). Faster was to select only indexed fields (instead of '*') in solution 3.
Posted by: Brian on May 4, 2006.
Just what I was looking for.
Posted by: poly@omino.com on May 3, 2006.
This helped me out, 'cause mysql's RAND() really is slow. Solution 2 worked out the best for me on a table with over 5000 rows.
Posted by: DrTebi on April 25, 2006.