Batch insert duplicates


(G Milborrow) #1

Hello,

I am inserting about 1,500 rows of data into a table and there are often duplicate ID’s because the data in the source system has been updated.

What is the most efficient and effective method to perform a batch insert and update records with matching ID?

PS: I tried a foreach loop but got a "run out of memory warning". The batch insert function works well but now I get a duplicate record warning.

Many thanks

Here is my current code:


  $insertCount = Yii::$app->db->createCommand()

            ->batchInsert($tableName, $columnNameArray, $data)

            ->execute();


(Bizley) #2

Consider using raw SQL query. If this is MySQL there is INSERT … ON DUPLICATE syntax available.


(G Milborrow) #3

ok, grand. I will do that.

How do I perform batch validation ahead of the insert?


(Bizley) #4

If you are using insert on duplicate it’s done on the fly.


(G Milborrow) #5

Thanks Bizley,

I have followed your guidance and created a working solution, as seen below. Can you confirm that I have done this in the safest / most appropriate way? It is working but just need to know that I am not opening myself up to a sql injection.

Many thanks for your help.


        $params = [

            [

                'option_id' => 100,

                'option_text' => 'test'

            ], [

                'option_id' => 101,

                'option_text' => 'test'

            ]

        ];


        $data = [];

        foreach ($params as $param) {

            $data[] = "('" . $param['option_id'] . "','" . $param['option_text'] . "')";

        }

        $str = implode(",", $data);


        $sql = 'insert into test (option_id, option_text) values ';

        $sql .= $str . ' ON DUPLICATE KEY UPDATE option_text = VALUES(option_text)';

        $insertCount = Yii::$app->db->createCommand($sql)->execute();


(Bizley) #6

It depends on the source of option_id and option_text. You should validate it for sure if it comes from user.

Take a look at PDO parameters binding.


(G Milborrow) #7

I am importing the data from an API on an external website (survey monkey). So, yes, the data is coming from multiple end users and I am downloading the batches for insertion into my database.

The problem is that if I use PDO binding (which I really want to) then I cannot use "ON DUPLICATE" and I really need a mechanism to handle duplicates.

The alternative is to query the database, find any duplicates, delete them and then do a batch insert.

The other alternative is to extend the batch insert function on Yii so that I can include the ON DUPLICATE statement but I don’t know how to do that. I have raised the question today on the forum but no answers yet.

Your help is most appreciated, thanks.


(Bizley) #8

Why can’t you use ‘on duplicate’ with binding?

I think this should work:




Yii::$app->db->createCommand(

    "INSERT INTO test (option_id, option_text) VALUES (:opt_id, :opt_text) ON DUPLICATE KEY UPDATE option_text = VALUES(option_text)"

)->bindValues([

    ':opt_id' => 100,

    ':opt_text' => 'test',

])->execute();



It just the matter of setting all the bindings now.


(G Milborrow) #9

Because I am doing a batch load of data with the same keys:


$params = [


   ['option_id'=>'100', 'option_text'=>'test100'],

   ['option_id'=>'101', 'option_text'=>'test101'],

   ['option_id'=>'102', 'option_text'=>'test102'],

   ['option_id'=>'103', 'option_text'=>'test103'],


];

Binding a multidimensional array causes an error.

The only alternative I can think of is to recreate the array…


INSERT INTO test (option_id, option_text) VALUES (:id1, :text1),(:id2, :text2),(:id3, :text3), 


$params = [

   ':id1'=>'100',

   ':text1'=>'test100',

   ':id2'=>'102',

   ':text2'=>'test102',

   ':id3'=>'103',

   ':text3'=>'test103',

];

Am I missing something?


(Bizley) #10

Yes, more or less something like it. Send it in few packs, not all at once, keep integers as integers and not strings.


(G Milborrow) #11

Thanks Bizley.

Here is my working implementation for others in case they can use it.

Within my controller, here is how I call and execute the new function


        $sql = new Toolkit_BulkInsert();

        $sql->tableName='survey_response';

        $sql->data=$responses;

        $sql->connect();

        $sql->buildSQL();

        $count = $sql->execute();



I have created this class within models.


<?php


namespace app\models;


use Yii;




class Toolkit_BulkInsert extends \yii

{


    public $tableName;

    public $data=[];


    private $values_to_bind=[];

    private $colNames;




    public function buildSQL(){


        // get list of columns

        // IMPORTANT: requires that all arrays have the same columns in the same order!

        $this->colNames="(".implode(",",array_keys($this->data[0])).") ";


        // now build the sql in preparation for binding

        $x=0;

        $sql_values=[]; // eg insert into x values (:a,:<img src='http://www.yiiframework.com/forum/public/style_emoticons/default/cool.gif' class='bbc_emoticon' alt='B)' />,(:c,<img src='http://www.yiiframework.com/forum/public/style_emoticons/default/biggrin.gif' class='bbc_emoticon' alt=':D' />)...

        $this->values_to_bind=[]; // eg [:x=>abc]


        foreach ($this->data as $arr) {

            $ADO_id = [];

            $x++;

            foreach ($arr as $key=>$value){

                $ADO_id[]= " :".$key."_".$x.""; // :a_1, :b_1

                $this->values_to_bind = $this->values_to_bind + [":".$key."_".$x=>$value];

            }

            $sql_values[] = "(".implode(",",$ADO_id).")";

        }


        // create the update instruction

        // eg a=VALUES(a), b=VALUES(<img src='http://www.yiiframework.com/forum/public/style_emoticons/default/cool.gif' class='bbc_emoticon' alt='B)' />

        $sql_update=[];

        foreach($this->data[0] as $key=>$value){

            $sql_update[] = $key." = VALUES (".$key.") ";

        }


        $this->sql =

            "INSERT INTO ".$this->tableName." ".

            $this->colNames.

            "VALUES ". implode(",",$sql_values) ." ".

            "ON DUPLICATE KEY UPDATE ".implode(",",$sql_update);


       // echo $this->sql."<br>";

       // Toolkit::printObject($this->values_to_bind,0);

    }




    public function connect(){

        $this->connection = Yii::$app->getDb();

        return $this->connection;

    }




    public function execute(){

        $command = $this->connection->createCommand($this->sql);

        $command->bindValues($this->values_to_bind);

        return $command->execute(); // will return the count of records inserted

    }




}




(Info) #12

the following snippet can also be used:




        $command = \Yii::$app->db->createCommand()->batchInsert(

            Trade::tableName(),

            ['timestamp', 'trade_id', 'price', 'type', 'amount'],

            $data

        );

        $sql = $command->getRawSql();

        $sql .= ' ON DUPLICATE KEY UPDATE trade_id=trade_id';

        $command->setRawSql($sql);

        $command->execute();




(Buttflattery) #13

i dont find a reason why your custom class Toolkit_BulkInsert is extending Yii.