I am running into trouble when using the length validator: it reports a length error on my UTF-8 input fields (which contain some umlauts) even though the length is OK.
This seems to happen because the validator uses strlen instead of mb_strlen.
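To illustrate the difference, here is a minimal sketch (assuming the mbstring extension is loaded and the input is UTF-8; the variable names are made up for the example). An umlaut takes two bytes in UTF-8, so strlen counts one more than the number of visible characters:

```php
<?php
// "für" written with explicit UTF-8 bytes: "ü" is the two bytes 0xC3 0xBC.
$value = "f\xC3\xBCr";

// strlen() counts bytes, mb_strlen() counts characters in the given encoding.
$byteLength = strlen($value);
$charLength = mb_strlen($value, 'UTF-8');

echo $byteLength, " bytes, ", $charLength, " characters\n";
```

So a field limited to 3 characters would be rejected by a strlen-based check even though the user typed only 3 characters.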
I could not find the call that sets the encoding for multibyte functions in the Yii code either (e.g. mb_internal_encoding("UTF-8")).
I think mbstring is a non-default module for PHP, so many providers might not have it compiled in. Maybe it's better to use the function overloading feature of this module:
This is really a PHP problem. The framework can't fix this if the mbstring extension is not compiled into PHP. But it can easily be fixed by activating the function overloading of the mbstring extension in the Apache VirtualHost configuration:
Hmm, the PHP manual makes it sound as if the overloading functionality is meant for adapting older software to multibyte support without having to change it. The other problem is that many providers won't allow you to change the virtual host settings for your package.
But if this issue only occurs with the string validator, I could just write a new one.
Hmm, the PHP manual makes it sound as if the overloading functionality is meant for adapting older software to multibyte support without having to change it.
I don't think there is anything wrong with using function overloading to fix the problem. That's what this feature is for. Real Unicode support will be available in PHP 6, AFAIK. If strlen() is replaced with mb_strlen() in the framework code, a lot of people might complain that errors are thrown because mbstring is missing. The described fix avoids this.
I'll add it to the Unicode cookbook article for now.
I see your point; working with UTF-8 in PHP is still quite a hack as of now.
It's a good idea to mention some issues and solutions in the cookbook, thanks.
We should also provide another validator (or enhance the string validator) to check for valid UTF-8 strings using mb_check_encoding(). Otherwise users can submit invalid byte sequences that cause database exceptions (e.g. Incorrect string value: '\xFC').
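A rough sketch of what such a check could look like outside of any validator class (isValidUtf8 is just a hypothetical helper name, and this assumes mbstring is available):

```php
<?php
// Hypothetical helper: true if $value is a well-formed UTF-8 string.
function isValidUtf8($value)
{
    return mb_check_encoding((string)$value, 'UTF-8');
}

// "ü" as a single ISO-8859-1 byte: not valid UTF-8 on its own.
$latin1 = "\xFC";
// "ü" as the proper two-byte UTF-8 sequence.
$utf8 = "\xC3\xBC";

var_dump(isValidUtf8($latin1)); // bool(false)
var_dump(isValidUtf8($utf8));   // bool(true)
```

The lone \xFC byte is the same one that shows up in the database error message above, so rejecting it during validation would catch the problem before the query runs.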
I could write a new multibyte string validator which offers this whole functionality, maybe this could help other people too. Where should I put it?
OK, here is my solution for now. I will use this validator instead of the CStringValidator (length) validator. I renamed some attributes (e.g. 'max' to 'maxlength'). Feel free to use, improve or criticize it:
<?php
class mbstring extends CValidator {
    public $maxlength; // maximum allowed string length
    public $minlength; // minimum allowed string length
    public $islength; // required exact length
    public $tooShort; // custom message for short string
    public $tooLong; // custom message for long string
    public $wrongCharset; // custom message for wrong character set
    public $allowEmpty=true;

    protected function validateAttribute($object,$attribute) {
        mb_internal_encoding(Yii::app()->charset);
        $value = $object->$attribute;
        if($this->allowEmpty && ($value === null || $value === ''))
            return;
        if (!mb_check_encoding($value)) {
            $message=$this->wrongCharset !== null ? $this->wrongCharset : Yii::t('yii','{attribute} has wrong character set.');
            $this->addError($object,$attribute,$message);
        }
        $length = mb_strlen($value);
        if($this->minlength !== null && $length < $this->minlength) {
            $message=$this->tooShort!==null?$this->tooShort:Yii::t('yii','{attribute} is too short (minimum is {min} characters).');
            $this->addError($object,$attribute,$message,array('{min}'=>$this->minlength));
        }
        if($this->maxlength!==null && $length>$this->maxlength) {
            $message=$this->tooLong!==null?$this->tooLong:Yii::t('yii','{attribute} is too long (maximum is {max} characters).');
            $this->addError($object,$attribute,$message,array('{max}'=>$this->maxlength));
        }
        if($this->islength!==null && $length!==$this->islength) {
            $message=$this->message!==null?$this->message:Yii::t('yii','{attribute} is of the wrong length (should be {length} characters).');
            $this->addError($object,$attribute,$message,array('{length}'=>$this->islength));
        }
    }
}
How do I switch completely to the mbstring validation?
When I'm validating the model, the specified field is validated by both mbstring and CStringValidator, and of course CStringValidator returns the length error.
I defined the rules as follows (I renamed the length properties back to the standard max/min):
In Yii 1.1 you can pass a charset to the string validator, so it actually supports multibyte now.
public function rules() {
    return array(
        array('title_ru','length','max'=>45,'encoding'=>Yii::app()->charset)
    );
}
I still wrote my own version of the validator, so that it also checks for valid characters and I don't have to pass the 'encoding' parameter every time. Here is my new mbLength class:
class mbLength extends CStringValidator {
    public $wrongCharset; // custom message for wrong character set

    public function __construct() {
        $this->encoding = Yii::app()->charset;
    }

    protected function validateAttribute($object,$attribute) {
        $value = $object->$attribute;
        if (!$this->isCharsetCorrect($value)) {
            $message=$this->wrongCharset !== null ? $this->wrongCharset : Yii::t('yii','Wrong character set.');
            $object->$attribute = '';
            $this->addError($object,$attribute,$message);
        }
        parent::validateAttribute($object,$attribute);
    }

    public function isCharsetCorrect($string) {
        $string = (string)$string;
        $convertCS = 'UTF-8';
        $sourceCS = Yii::app()->charset;
        return $string === mb_convert_encoding(mb_convert_encoding($string, $convertCS, $sourceCS), $sourceCS, $convertCS);
    }
}
To use this, you only have to specify one validator (mbLength) in the rule:
public function rules() {
    return array(
        array('title_ru','mbLength','max'=>45)
    );
}