Batch insert duplicates

Hello,

I am inserting about 1,500 rows of data into a table and there are often duplicate ID’s because the data in the source system has been updated.

What is the most efficient and effective method to perform a batch insert and update records with matching ID?

PS: I tried a foreach loop but got a "run out of memory warning". The batch insert function works well but now I get a duplicate record warning.

Many thanks

Here is my current code:


  $insertCount = Yii::$app->db->createCommand()

            ->batchInsert($tableName, $columnNameArray, $data)

            ->execute();

Consider using raw SQL query. If this is MySQL there is INSERT … ON DUPLICATE syntax available.

ok, grand. I will do that.

How do I perform batch validation ahead of the insert?

If you are using insert on duplicate it’s done on the fly.

Thanks Bizley,

I have followed your guidance and created a working solution, as seen below. Can you confirm that I have done this in the safest / most appropriate way? It is working but just need to know that I am not opening myself up to a sql injection.

Many thanks for your help.


        $params = [

            [

                'option_id' => 100,

                'option_text' => 'test'

            ], [

                'option_id' => 101,

                'option_text' => 'test'

            ]

        ];


        $data = [];

        foreach ($params as $param) {

            $data[] = "('" . $param['option_id'] . "','" . $param['option_text'] . "')";

        }

        $str = implode(",", $data);


        $sql = 'insert into test (option_id, option_text) values ';

        $sql .= $str . ' ON DUPLICATE KEY UPDATE option_text = VALUES(option_text)';

        $insertCount = Yii::$app->db->createCommand($sql)->execute();

It depends on the source of option_id and option_text. You should validate it for sure if it comes from user.

Take a look at PDO parameters binding.

I am importing the data from an API on an external website (survey monkey). So, yes, the data is coming from multiple end users and I am downloading the batches for insertion into my database.

The problem is that if I use PDO binding (which I really want to) then I cannot use "ON DUPLICATE" and I really need a mechanism to handle duplicates.

The alternative is to query the database, find any duplicates, delete them and then do a batch insert.

The other alternative is to extend the batch insert function on Yii so that I can include the ON DUPLICATE statement but I don’t know how to do that. I have raised the question today on the forum but no answers yet.

Your help is most appreciated, thanks.

Why can’t you use ‘on duplicate’ with binding?

I think this should work:




Yii::$app->db->createCommand(

    "INSERT INTO test (option_id, option_text) VALUES (:opt_id, :opt_text) ON DUPLICATE KEY UPDATE option_text = VALUES(option_text)"

)->bindValues([

    ':opt_id' => 100,

    ':opt_text' => 'test',

])->execute();



It just the matter of setting all the bindings now.

Because I am doing a batch load of data with the same keys:


$params = [


   ['option_id'=>'100', 'option_text'=>'test100'],

   ['option_id'=>'101', 'option_text'=>'test101'],

   ['option_id'=>'102', 'option_text'=>'test102'],

   ['option_id'=>'103', 'option_text'=>'test103'],


];

Binding a multidimensional array causes an error.

The only alternative I can think of is to recreate the array…


INSERT INTO test (option_id, option_text) VALUES (:id1, :text1),(:id2, :text2),(:id3, :text3), 


$params = [

   ':id1'=>'100',

   ':text1'=>'test100',

   ':id2'=>'102',

   ':text2'=>'test102',

   ':id3'=>'103',

   ':text3'=>'test103',

];

Am I missing something?

Yes, more or less something like it. Send it in few packs, not all at once, keep integers as integers and not strings.

Thanks Bizley.

Here is my working implementation for others in case they can use it.

Within my controller, here is how I call and execute the new function


        $sql = new Toolkit_BulkInsert();

        $sql->tableName='survey_response';

        $sql->data=$responses;

        $sql->connect();

        $sql->buildSQL();

        $count = $sql->execute();



I have created this class within models.


<?php


namespace app\models;


use Yii;




class Toolkit_BulkInsert extends \yii

{


    public $tableName;

    public $data=[];


    private $values_to_bind=[];

    private $colNames;




    public function buildSQL(){


        // get list of columns

        // IMPORTANT: requires that all arrays have the same columns in the same order!

        $this->colNames="(".implode(",",array_keys($this->data[0])).") ";


        // now build the sql in preparation for binding

        $x=0;

        $sql_values=[]; // eg insert into x values (:a,:<img src='http://www.yiiframework.com/forum/public/style_emoticons/default/cool.gif' class='bbc_emoticon' alt='B)' />,(:c,<img src='http://www.yiiframework.com/forum/public/style_emoticons/default/biggrin.gif' class='bbc_emoticon' alt=':D' />)...

        $this->values_to_bind=[]; // eg [:x=>abc]


        foreach ($this->data as $arr) {

            $ADO_id = [];

            $x++;

            foreach ($arr as $key=>$value){

                $ADO_id[]= " :".$key."_".$x.""; // :a_1, :b_1

                $this->values_to_bind = $this->values_to_bind + [":".$key."_".$x=>$value];

            }

            $sql_values[] = "(".implode(",",$ADO_id).")";

        }


        // create the update instruction

        // eg a=VALUES(a), b=VALUES(<img src='http://www.yiiframework.com/forum/public/style_emoticons/default/cool.gif' class='bbc_emoticon' alt='B)' />

        $sql_update=[];

        foreach($this->data[0] as $key=>$value){

            $sql_update[] = $key." = VALUES (".$key.") ";

        }


        $this->sql =

            "INSERT INTO ".$this->tableName." ".

            $this->colNames.

            "VALUES ". implode(",",$sql_values) ." ".

            "ON DUPLICATE KEY UPDATE ".implode(",",$sql_update);


       // echo $this->sql."<br>";

       // Toolkit::printObject($this->values_to_bind,0);

    }




    public function connect(){

        $this->connection = Yii::$app->getDb();

        return $this->connection;

    }




    public function execute(){

        $command = $this->connection->createCommand($this->sql);

        $command->bindValues($this->values_to_bind);

        return $command->execute(); // will return the count of records inserted

    }




}



the following snippet can also be used:




        $command = \Yii::$app->db->createCommand()->batchInsert(

            Trade::tableName(),

            ['timestamp', 'trade_id', 'price', 'type', 'amount'],

            $data

        );

        $sql = $command->getRawSql();

        $sql .= ' ON DUPLICATE KEY UPDATE trade_id=trade_id';

        $command->setRawSql($sql);

        $command->execute();



i dont find a reason why your custom class Toolkit_BulkInsert is extending Yii.