I have a really weird problem I hope someone can help me with:
I’m trying to load a webpage, and save the text from the page into a MySQL database. Before saving I’m replacing all control characters and most special characters with a single space. The text string comes out just right after processing, but whenever I try to save it to my database, it gets truncated. To show the problem in code:
Here I’m inserting some random characters into $data[':raw_text'] for debugging. This works fine: the string has the appropriate length in the database and is printed in full in the die(). However, when I remove the debugging code, the string inserted into the db gets truncated to about 4000-5000 characters. The die() still prints the whole string, without any truncation. The db column is a LONGTEXT.
that’s called earlier in the code. $model->fulltext is not modified elsewhere. And when I output $data[':raw_text'], which is a copy of $model->fulltext, in die(), the value is not truncated, so something must be happening while the query is executed.
Try submitting a test string (for example, 10000 letters “a”) instead of the real text. The data may be getting truncated at some weird character that you missed during the replacement.
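A minimal sketch of that test, in case it helps. The helper name `describeTruncation()` is made up for illustration, and `$stored` stands for whatever you read back from the table with a SELECT after the insert. It reports how many bytes survived and which byte came next in the original, which is usually the character MySQL choked on.

```php
<?php
// Plain ASCII probe: if this round-trips intact, the column and query are fine
// and the problem is in the content of the real text.
$probe = str_repeat('a', 10000);

// Hypothetical helper: compare what was sent with what was read back.
// Returns null when both are identical, otherwise the number of leading bytes
// that match and the hex value of the first byte of $sent that was lost.
function describeTruncation(string $sent, string $stored): ?array
{
    if ($sent === $stored) {
        return null;
    }
    // XOR gives "\0" wherever the two strings agree, so counting the leading
    // "\0" bytes gives the length of the common prefix.
    $kept = strspn($sent ^ $stored, "\0");
    return array(
        'keptBytes'   => $kept,
        'nextByteHex' => $kept < strlen($sent) ? bin2hex($sent[$kept]) : null,
    );
}
```

If `nextByteHex` comes back as something like `a3` rather than an ASCII value, the text probably contains bytes that are not valid in the encoding the connection expects.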
This is a step in the right direction, but it still didn’t solve the issue. I’ve set emulatePrepare to false in the main config, and I’m also disabling emulated prepares with
No effect. The character that seems to be the culprit is the pound sign (£), so I tried removing it for debugging, but even that doesn’t work. I’ve tried preg_replace and str_replace, but the pound sign just won’t budge. The source text is UTF-8, my script is UTF-8, and the PHP internal encoding and the MySQL table encoding are UTF-8…
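One possible explanation for both symptoms, offered as a guess rather than a confirmed diagnosis: the page may be labelled UTF-8 while the pound sign actually arrives as the single ISO-8859-1 byte 0xA3. In UTF-8 the pound sign is the two bytes 0xC2 0xA3, so a search for "£" typed in a UTF-8 script would never match the lone byte, and a lone 0xA3 is not valid UTF-8. The sketch below checks for that; `firstInvalidUtf8Offset()` is a hypothetical helper, not something from the post.

```php
<?php
// Hypothetical helper: byte offset of the first byte that is not part of a
// well-formed UTF-8 sequence, or null if the whole string is well-formed.
function firstInvalidUtf8Offset(string $s): ?int
{
    // Byte-level pattern (no /u flag) describing well-formed UTF-8 sequences.
    // The possessive *+ consumes the longest valid prefix without backtracking.
    $wellFormed = '/\A(?:
          [\x00-\x7F]
        | [\xC2-\xDF][\x80-\xBF]
        | \xE0[\xA0-\xBF][\x80-\xBF]
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
        | \xED[\x80-\x9F][\x80-\xBF]
        | \xF0[\x90-\xBF][\x80-\xBF]{2}
        | [\xF1-\xF3][\x80-\xBF]{3}
        | \xF4[\x80-\x8F][\x80-\xBF]{2}
    )*+/x';
    preg_match($wellFormed, $s, $m);
    $validBytes = strlen($m[0]);
    return $validBytes === strlen($s) ? null : $validBytes;
}

// Made-up sample data: the same text with the pound sign in each encoding.
$latin1 = "Price: \xA3100";      // pound sign as one ISO-8859-1 byte
$utf8   = "Price: \xC2\xA3100";  // pound sign as two UTF-8 bytes

// Replacing the UTF-8 pound sign leaves the ISO-8859-1 string untouched.
$afterReplace = str_replace("\xC2\xA3", ' ', $latin1);
```

Running the helper on the real text just before the insert would show whether the truncation point in the database lines up with the first invalid byte.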
Also, I tried inserting some pound signs into the db from the same script, but that doesn’t work any better. Any ideas?
Just to clarify, this script is being run as a console command. And yes, I’m using console.php as the config file, but the same db configuration exists also in main.php.
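Since the console command reads console.php, it may be worth double-checking that the db component there sets the connection charset as well, because the table encoding and the connection encoding are configured separately. A sketch of what that component could look like, assuming Yii 1.x; the connection string, username, and password are placeholders, not values from this thread:

```php
<?php
// protected/config/console.php (sketch; all values below are placeholders)
return array(
    'components' => array(
        'db' => array(
            'connectionString' => 'mysql:host=localhost;dbname=mydb',
            'emulatePrepare'   => false,
            'username'         => 'user',
            'password'         => 'secret',
            // Charset used for the connection itself, independent of the
            // encoding declared on the table or column.
            'charset'          => 'utf8',
        ),
    ),
);
```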
And this works! However, the actual data I want to insert still isn’t working. The data should be UTF-8 encoded to begin with, but I’ve also tried converting it to UTF-8 just to be sure, to no avail. All my source files are UTF-8 encoded.
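A note on the conversion step, as a sketch under assumptions: converting text that is already valid UTF-8 a second time double-encodes it, and converting from the wrong source encoding changes nothing useful, so it can help to convert only when the bytes fail a UTF-8 check. `toUtf8()` is a hypothetical helper, and ISO-8859-1 as the real source encoding is a guess that would need checking against the page being scraped. It requires the mbstring extension.

```php
<?php
// Hypothetical helper: leave valid UTF-8 alone, otherwise convert from an
// assumed source encoding (ISO-8859-1 here is a guess, not a known fact).
function toUtf8(string $s, string $assumedSource = 'ISO-8859-1'): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already well-formed, converting again would double-encode
    }
    return mb_convert_encoding($s, 'UTF-8', $assumedSource);
}
```

For example, `toUtf8("\xA3")` yields the two-byte UTF-8 pound sign, while a string that already contains `"\xC2\xA3"` is returned unchanged.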