I’m working on an application that stores unique values in a field, so I put a unique index on that column. Everything was fine until I started inserting multi-byte characters into it.

The characters inserted fine. The problem was that I’m using the INSERT … ON DUPLICATE KEY UPDATE syntax to avoid checking whether the value already exists, and after debugging where the application was breaking I got this:

Query failed: Cannot add or update a child row: a foreign key constraint fails 
(`trubaseek/mapKeywordName`, CONSTRAINT `fkDomainMapNameId` FOREIGN KEY (`NameId`) 
REFERENCES `names` (`nameId`) ON DELETE NO ACTION ON UPDATE NO ACTION) 
SQL: INSERT INTO `mapKeywordName` (`keywordId`, `nameId`) VALUES ('1', '326');
 

So I spent about an hour changing encodings without success, but after some testing I discovered that to MySQL ‘hôteles’ was the same as ‘hotelës’. I ran several tests from the Linux console, the MySQL browser, and a PHP script, and they all showed the same behavior.
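You can reproduce the comparison directly in the mysql client. A minimal sketch, assuming the column uses the default utf8_general_ci collation, which folds accented letters onto their base letters for comparison:

SELECT _utf8'hôteles' = _utf8'hotelës' COLLATE utf8_general_ci AS same_general_ci,
       _utf8'hôteles' = _utf8'hotelës' COLLATE utf8_bin        AS same_bin;
-- same_general_ci returns 1 (equal), same_bin returns 0 (different)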

When I ran this:

INSERT INTO `names` (`name`, `price`, `priceCurrency`, `bids`, `traffic`, `lastUpdate`) 
VALUES ('hotelës', '0', '$US', '0', '0', NOW()) 
ON DUPLICATE KEY UPDATE `price`='0', `priceCurrency`='$US', `bids`='0', `traffic`='0', `lastUpdate` = NOW();
 

I got this:

Query OK, 2 rows affected (0.01 sec)

Called from a PHP script, mysql_insert_id() returned an id as if the row had been inserted rather than updated, but a SELECT for that id failed because it didn’t exist.
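The “2 rows affected” is how MySQL reports that the UPDATE branch of ON DUPLICATE KEY UPDATE was taken, but mysql_insert_id() isn’t reliable in that case. A hedged sketch, assuming `nameId` is the AUTO_INCREMENT primary key of `names`: MySQL’s documented LAST_INSERT_ID(expr) trick makes the id of the updated row available even when no insert happened.

INSERT INTO `names` (`name`, `price`, `priceCurrency`, `bids`, `traffic`, `lastUpdate`)
VALUES ('hotelës', '0', '$US', '0', '0', NOW())
ON DUPLICATE KEY UPDATE
  `nameId` = LAST_INSERT_ID(`nameId`),  -- make LAST_INSERT_ID() return the existing row's id
  `price` = '0', `priceCurrency` = '$US', `bids` = '0', `traffic` = '0', `lastUpdate` = NOW();

SELECT LAST_INSERT_ID();  -- now points at the row that was actually updated (or inserted)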

After some research, I found that changing that column’s collation to utf8_bin fixed the issue.
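For reference, the change itself is a single ALTER. A sketch, assuming the column is a utf8 VARCHAR(255) called `name` (MODIFY requires restating the full column definition, so adjust the type to match the real one):

ALTER TABLE `names`
  MODIFY `name` VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin;

With utf8_bin the index compares the raw character codes, so ‘hôteles’ and ‘hotelës’ count as distinct keys instead of duplicates.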

I haven’t tested this thoroughly, but so far it appears to be working.

This is one of many data-consistency quirks you have to fight when using MySQL in a project.