Clear, usable interfaces. Clean, accessible code.

Route 19

Logbook.

PHP, MySQL and UTF-8
Jun. 18 at 11:40 am

I’ve always had trouble getting PHP and UTF-8 to play nice with a database (mostly because I don’t have a complete understanding of the issue). Through trial and error I think I’ve found a way to prevent extended characters (usually French accents) from being garbled into question marks or black diamonds when output. Documenting here for future reference.

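Whenever database content is displayed on a page (even as form input or textarea content), pass it through htmlentities with the character set defined. Use ENT_QUOTES rather than ENT_NOQUOTES if the value can end up inside an attribute such as an input's value, so that quotes in the content can't break the markup:

htmlentities($string, ENT_QUOTES, "UTF-8")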
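To avoid repeating the flags and character set on every call, the htmlentities call can be wrapped in a small helper. This is only a sketch: the function name escape_html and the sample value are made up for illustration, and it assumes ENT_QUOTES because the value is printed inside an attribute.

```php
<?php
// Hypothetical helper (not from the original post): escape a value for
// HTML output, always declaring UTF-8 as the character set.
function escape_html($value)
{
    // ENT_QUOTES also encodes " and ', which matters inside attribute values.
    return htmlentities((string) $value, ENT_QUOTES, 'UTF-8');
}

// Sample value standing in for a field read from the database.
$name = 'Café "Chez Rémi"';

// The same helper works for input values and textarea content.
echo '<input type="text" name="name" value="' . escape_html($name) . '" />', "\n";
echo '<textarea name="notes">' . escape_html($name) . '</textarea>', "\n";
```

The file itself has to be saved as UTF-8 for the accented literal to be read correctly.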

Whenever content is written to the database through a form submission, set the form tag’s accept-charset attribute:

<form action="/path/to/post" method="post" accept-charset="utf-8">
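The accept-charset attribute is only a request to the browser, so it may be worth checking what actually arrives before writing it to the database. A minimal sketch, assuming the mbstring extension is available and PHP 7 or later for the ?? operator; the function name clean_utf8_input and the field name comment are hypothetical.

```php
<?php
// Hypothetical helper: return the submitted string if it is valid UTF-8,
// or null if it is missing, not a string, or not valid UTF-8.
function clean_utf8_input($value)
{
    if (!is_string($value) || !mb_check_encoding($value, 'UTF-8')) {
        return null;
    }
    return $value;
}

// "comment" is a placeholder field name for this example.
$comment = clean_utf8_input($_POST['comment'] ?? null);

if ($comment === null) {
    // Reject the submission or ask the user to try again.
}
```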

Ensure all pages have their character set defined in a meta tag and/or an http header:

<?php header("Content-type: text/html; charset=utf-8"); ?>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

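Finally, make sure the database character set is also UTF-8. You may need to run these statements right after opening your MySQL connection:

SET NAMES 'utf8';
SET CHARACTER SET utf8;

SET NAMES on its own is usually enough. SET CHARACTER SET sets the client and result character sets but resets the connection character set to the database default, so running it after SET NAMES can undo part of it unless the database default is also UTF-8.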
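If the site connects through PDO, the character set can also be requested in the connection string instead of issuing SET NAMES by hand. This is a sketch under assumptions, not what the sites above use: the helper names utf8_dsn and connect_utf8 and all credentials are placeholders, and the charset DSN option needs PHP 5.3.6 or later.

```php
<?php
// Hypothetical helper: build a PDO DSN that asks MySQL for a UTF-8 connection.
// 'utf8' matches the statements above; 'utf8mb4' is MySQL's name for full
// four-byte UTF-8 and can be passed instead.
function utf8_dsn($host, $dbname, $charset = 'utf8')
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: open the connection. All arguments are placeholders
// to be replaced with real credentials.
function connect_utf8($host, $dbname, $user, $password)
{
    $pdo = new PDO(utf8_dsn($host, $dbname), $user, $password);
    // Throw exceptions on errors instead of failing silently.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    return $pdo;
}
```

With mysqli, the equivalent step would be calling set_charset on the connection object once it is open.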

This solution is currently working well for me on two bilingual sites in limited testing. I’ll update here if there are issues. If you have a less complicated or more robust solution, I’d love to hear about it.
