Sunday, February 24, 2008

MySQL and UTF-8

After reading the article (Live aus der Marschrutka » MySQL and UTF-8 — no more question marks!) I’ve noticed that specifying this (I’m using cp1251 user interface and UTF-8 database):
init_connect = 'SET CHARACTER SET cp1251'
in my.ini file doesn’t work for me.

Executing the 'SET CHARACTER SET cp1251' query at the beginning of the SQL session, however, works fine.
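
One possible explanation, offered as an assumption rather than a diagnosis of my setup: MySQL does not execute init_connect for accounts that hold the SUPER privilege (root, for example), so the setting can look ignored when you test with such an account. A rough check, run from the affected session, might look like this:

```sql
-- Was the setting actually read from my.ini?
-- An empty Value means the server never picked it up.
SHOW GLOBAL VARIABLES LIKE 'init_connect';

-- Does the current account hold SUPER (or ALL PRIVILEGES ON *.*)?
-- If so, the server skips init_connect for this account.
SHOW GRANTS FOR CURRENT_USER();

-- What the session actually ended up with.
SHOW SESSION VARIABLES LIKE 'character_set_%';
```

Two other things worth ruling out: the init_connect line has to sit in the [mysqld] section of my.ini, and the quotes around the statement have to be plain straight quotes, not the curly ones a blog or word processor may substitute.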


I'm reposting a small chunk of the article here:

UTF-8, a Unicode encoding, is probably already the most used character encoding for new web applications, except maybe for Asia. The most popular open source database is MySQL. (But don’t miss the most advanced open source database, which I prefer.)

Communication with the database
The other side of the problem is the data that comes from and gets sent to the client. MySQL offers a lot of features here; you can have different character sets at almost every stage of data processing. To be all UTF-8, issue the following statement just after you’ve made the connection to the database server:

SET NAMES utf8;

This sets the character_set_client, character_set_connection and character_set_results variables to utf8. The full article explains the meaning of each of these variables.
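
To confirm that the statement took effect, you can read the three variables back in the same session. This is only a sketch; it uses nothing beyond the variable names already mentioned above:

```sql
-- Switch the connection to UTF-8.
SET NAMES utf8;

-- Read back the three variables that SET NAMES changes.
SHOW SESSION VARIABLES
WHERE Variable_name IN ('character_set_client',
                        'character_set_connection',
                        'character_set_results');
```

All three rows should report utf8 (newer servers may display the same character set as utf8mb3).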

Communication with the database also concerns SQL files you read with the MySQL command line client, or upload with phpMyAdmin. Put the statement at the top of every SQL file, like this:

SET NAMES utf8;
INSERT INTO gadgets (name, rating) VALUES ('iPod', 45);

If you’re talking to mysql from a command line that doesn’t understand UTF-8 (likely in Windows and older Linuxes), use the following statement to tell MySQL which character set you’re using on the client side:

SET CHARACTER SET cp1250;

This sets the character_set_client and character_set_results variables to cp1250. Upon arriving on the server, your data will be converted from CP1250 to UTF-8. Results returned to you will be converted from UTF-8 to CP1250.
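
A small sketch of what this looks like from inside a session. The byte value 0xE1 is just an illustrative sample and does not come from the article:

```sql
-- Declare that this client sends and expects cp1250.
SET CHARACTER SET cp1250;

-- character_set_client and character_set_results now read cp1250;
-- character_set_connection follows the default database's character set.
SHOW SESSION VARIABLES LIKE 'character_set_%';

-- Illustrative conversion: byte 0xE1 is 'á' in cp1250.
-- Converted to utf8 it becomes the two bytes C3 A1.
SELECT HEX(CONVERT(_cp1250 X'E1' USING utf8)) AS utf8_bytes;
```

The last query only demonstrates the kind of conversion the server performs. In normal use you never call CONVERT yourself, because the server converts based on the session variables.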


For Russian-speaking folks, there's another (to some extent more detailed) article on the same topic: http://www.phpfaq.ru/charset
