So, we had a cron job hanging for hours.  No idea why.  So, I started debugging.  It all came down to a call to in_array().  See, this job is importing data from a huge XML file into MySQL.  After it is done, we want to compare the data we just added/updated to the data in the table so we can deactivate any data we did not update.  We were using a mod_time field in mysql in the past.  But, that proved to be an issue when we wanted to start skipping rows from the XML that were present but unchanged.  Doing that saved a lot of MySQL writes and sped up the process.

So, anyhow, we have this huge array of ids accumulated during the import.  So, an in clause with 2 million parts would suck.  So, we suck back all the ids in the database that exist and stick that into an array.  We then compared the two arrays by looping one array and using in_array() to check if the value was in the second array.  Here is a pseudo example that shows the idea:

[sourcecode language='php']

foreach($arr1 as $key=>$i){

if(in_array($i, $arr2)){

unset($arr1[$key]);

}
}

[/sourcecode]

So, that was running for hours with about 400k items.  Our data did not contain the value as the key, but it could as the value was unique.  So, I added it.  So, now, the code looks like:

[sourcecode language='php']

foreach($arr1 as $key=>$i){

if(isset($arr2[$i])){

unset($arr1[$key]);

}
}

[/sourcecode]

Yeah, that runs in .8 seconds.  Much better.

So, why were we using in_array to start with if in_array is clearly not the right solution to this problem?  Well, it was basic code evolution.  Originally, these imports would be maybe 100 items.  But, things changed.

FWIW,  I tried array_diff() as well.  It took 25 seconds.  Way better than looping and calling in_array, but still not as quick as a simple isset check.  There was refactoring needed to put the values into the keys of the array.

UPDATE: I updated this post to properly reflect that there is nothing wrong with in_array, but simply that it was not the right solution to this problem.  I wrote this late and did not properly express this.  Thanks to all those people in the comments that helped explain this.