Memory exhausted when processing a lot of records

I got a problem when trying to import more than 30,000 data to database from excel file. Each data needs to be validated first before inserted to database. I use AR only for validating data, then switch to query builder when inserting them.




$query = Yii::app()->db->createCommand()->insert('material', array(				

  ‘id'=>$material->id,

  'category_id'=>$material->category_id,								      

  'name'=>$material->name,

  'image_type'=>$material->image_type,

  'thickness'=>$material->thickness,

   …..

  'fuel'=>$material->fuel

));



The problem is application always runs out of memory:

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 71 bytes)

Additional details:

PHP version: 5.2.17 (the script works fine in PHP 5.3.1, but not in PHP 5.2 – due to this issue, IMO). Also, the web hosting I use doesn’t support PHP 5.3.

Memory limit: 128 MB (maximum memory allowed by my web hosting - cannot be increased again)

Yii version: 1.1.8

Is there any way to reduce memory usage and process them in PHP 5.2?

Done.

Maybe use some kind of batch processing? Let the script call itself with progress parameter indicating the last processed record.

Thank you!

I don’t understand what you mean. Could you please elaborate more about this?

Could you post your import code please? Maybe we can find things to improve. Spawning additional PHP processes like I mentioned might not be necessary.

Ok, I will try to make it as short as possible.




// get instance of excel file

$file = CUploadedFile::getInstance($this,'fileName');


if (isset($file))

{	

   // initialize total material that will be imported (for logging procedure)

   $materialTotal = 0;

	

   // set the path of file

   $fileName = Yii::app()->user->getState('user_id'). '_' .date('H_i_s'). '_' .$file->name;

   $location = $this->_directory.$fileName;

   $file->saveAs($location);


   // import PHPExcelReader extension

   Yii::import('ext.phpexcelreader.JPhpExcelReader');

   $data = new JPhpExcelReader($location, false);

	

   // check file contents first			

   if ($this->validateColumn($data))

   {

      $category = Category::model()->findByPk($this->categoryId);

      

      // traverse all columns and rows in excel file				

      switch($category->name)

      {

 	 case 'Cabinet':										

	    $numberColumns = array(9,10,11,12,14,15); 

		

	    // start from row 2 since first row is header 

	    for ($i = 2; $i <= $data->sheets[0]['numRows']; $i++) 

	    {

		// check if item number is empty

		if (empty($data->sheets[0]['cells'][$i][1]))

		   continue;

					

		// create material object

		$material = new Material;

		$material->category_id = $this->categoryId;

					

		// get all data from excel file

		for ($j = 1; $j <= $data->sheets[0]['numCols']; $j++) 

		{

		   $content = $data->sheets[0]['cells'][$i][$j];

		   

                   // replace empty content with zero (for several columns only)												

		   if ((empty($content)) && (in_array($j, $numberColumns, true)))

			$content = 0;

						

		   // fill material properties (based on column)

		   switch($j)

		   {

		      case 1: $material->id = $content; break;

		      case 2: $material->name = utf8_encode($content); break;

		      ... (there are several other attributes that will be set here)

		      case 15: $material->fuel = $content; break;

		   }							

		}

		

                // do validation here			

		if ($material->validate())

		{

		   $materialTotal++;

				

		   $query = Yii::app()->db->createCommand()->insert('material', array(

			'id'=>$material->id,

			'category_id'=>$material->category_id,

			.... (the other fields)

			'fuel'=>$material->fuel

		   ));

		}					

		else

		{

		   // record error

		   $success = 1;

						

		   // get all error messages

		   $errorTxt = '';

		   foreach (array_values($material->getErrors()) as $errors)

		   {

		      foreach ($errors as $error)

			 $errorTxt .= $error. "\n";

		   }

						

		   $rowError++;

						

		   // set array properties for PDF report

		   $errorLog[$rowError]['row'] = $i;

		   $errorLog[$rowError]['material_id'] = $oldId;

		   $errorLog[$rowError]['generated_material_id'] = $material->id;

		   $errorLog[$rowError]['description'] = $errorTxt;

		}															

	  }						

	

	  break;

	case 'Granite':

	  // do similar processing like cabinet items

	  .....

	  break;				

	}

   }

   else

   {

	$success = 0;

	$columnValid = false;

	$details = 'Column order in the spreadsheet is not correct. Please check it again.';

   }	

}



To summarize what I try to achieve:

  1. Import data from uploaded excel file to database.

  2. Validate each item in the file.

  3. If there are invalid items, display errors as PDF report.

Also, I notice that memory usage raise sharply when application tries to validate data.


$material->validate()

This


$data = new JPhpExcelReader($location, false);

might be a problem since it reads the whole file into memory. If you could read chunks of the file and then process each chunk in a loop, memory usage would decrease since only the current chunk is in memory, not the whole file. I don’t know how excel files are setup, but if they are setup like csv (1 line = 1 row), then you could read the first 1.000 lines, save them into a temp file and then pass the temp file path to JPhpExcelReader.

Thanks for the suggestion, Y!!. I just do some experiments for this.

Memory usage when application imports 1,000 rows:

Initial: 7,196,864 bytes, Peak: 53,693,768 bytes, End: 51,670,928 bytes

Then import 1,000 rows 2 times, but memory keep increasing:

Initial: 7,197,832 bytes, Peak: 97,226,920 bytes, End: 95,157,488 bytes

Finally, import 1,000 rows 3 times:

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 86 bytes)

However, if I omit validation and saving process:

Initial: 7,195,032 bytes, Peak: 11,583,856 bytes, End: 8,372,120 bytes (1,000 rows)

Initial: 7,195,032 bytes, Peak: 11,733,624 bytes, End: 8,423,672 bytes (2 x 1,000 rows)

Can we minimize memory usage on saving and AR validating process?

Have you turned off YII_DEBUG?

Do you mean the one in index.php?


define('YII_DEBUG',false);

I had set it to false yesterday. Is there anything else that needs to be set?

Well, YII_DEBUG mode is the main reason for consuming more and more memory each time. That’s because Yii is collecting a lot of log and trace info in this mode.

What may become the cause of large memory consumption if YII_DEBUG has been disabled? I tried to use CSV (instead of excel spreadsheet) without any luck; the memory consumption is still large.

I wonder if there is a solution for this … :( (beside upgrading to PHP 5.3)

Avoid behaviors, if any.

/Tommy

Thanks, Tommy, but I ended up moving to another hosting that supports PHP 5.3. Can’t be helped. :(