Could not Store French Accent Marks Correctly in Postgres

Started by Wang, Mary Y over 15 years ago | 4 messages | general
#1 Wang, Mary Y
mary.y.wang@boeing.com

Hi,

I'm having a problem right now. Some of our French users uploaded some files with file names that had French accent marks, and those file names were inserted into the Postgres database. When I examined the value of those file names, they all had some weird characters (the weird characters were in the same position where the accent marks were entered). I do not know how to handle this kind of situation. Most of my users are US based, but I have been told that there will be more international users in the future.

So my questions are:
(1) What is the best character encoding that would work for most of those languages that have accent marks?
(2) I assume that I also need to do some kind of conversion in the front end (PHP) as well.

I'm running on Linux and Postgres 8.3.8.

Any ideas?

Thanks in advance.
Mary Wang

#2 John R Pierce
pierce@hogranch.com
In reply to: Wang, Mary Y (#1)
Re: Could not Store French Accent Marks Correctly in Postgres

On 08/20/10 2:10 PM, Wang, Mary Y wrote:

Hi,
I'm having a problem right now. Some of our French users uploaded
some files with file names that had French accent marks, and those
file names were inserted into the Postgres database. When I examined
the value of those file names, they all had some weird characters (the
weird characters were in the same position where the accent marks were
entered). I do not know how to handle this kind of situation. Most
of my users are US based, but I have been told that there will be more
international users in the future.
So my questions are:
(1) What is the best character encoding that would work for most
of those languages that have accent marks?
(2) I assume that I also need to do some kind of conversion in the
front end (PHP) as well.

UTF8 is the answer to your questions.

#3 Ludwig Kniprath
ludwig@kni-online.de
In reply to: Wang, Mary Y (#1)
Re: Could not Store French Accent Marks Correctly in Postgres

On 20.08.2010 23:10, Wang, Mary Y wrote:

Hi,
I'm having a problem right now. Some of our French users uploaded
some files with file names that had French accent marks, and those
file names were inserted into the Postgres database. When I examined
the value of those file names, they all had some weird characters (the
weird characters were in the same position where the accent marks were
entered). I do not know how to handle this kind of situation. Most
of my users are US based, but I have been told that there will be more
international users in the future.
So my questions are:
(1) What is the best character encoding that would work for most
of those languages that have accent marks?
(2) I assume that I also need to do some kind of conversion in the
front end (PHP) as well.
I'm running on Linux and Postgres 8.3.8.
Any ideas?
Thanks in advance.
Mary Wang

Hi,
our solution for storing uploaded files in the database/filesystem with PHP
stores the file names in the database as UTF-8 and, in PHP, replaces some
special characters with ASCII equivalents. In our case these are the German
umlauts and the sharp s (ä, ö, ü, ß). Without the replacement these
characters were translated inconsistently (PHP uses UTF-8, German Windows
uses CP-1252), which made the names unusable in download links. You can use
the function below; just add your special characters to the $trans array.
As a further benefit, the function returns a unique file name that can be
used for storing the file in a target directory.

<SNIP>
public static function get_unique_file_name($target_dir, $current_file_name) {
    // Replace German umlauts and the sharp s with ASCII equivalents;
    // add further special characters to this array as needed.
    $trans = array("ä" => "ae", "ö" => "oe", "ü" => "ue", "ß" => "ss",
                   "Ä" => "Ae", "Ö" => "Oe", "Ü" => "Ue");
    $target_file_name = strtr($current_file_name, $trans);
    // Prefix a counter until the name is unused in the target directory.
    $i = 0;
    $old_target_file_name = $target_file_name;
    while (file_exists($target_dir . '/' . $target_file_name)) {
        $i++;
        $target_file_name = $i . $old_target_file_name;
    }
    return $target_file_name;
}
</SNIP>
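
For the French file names in the original question, the same replacement idea could be extended along these lines. This is only a sketch: the function name transliterate_file_name and the exact set of mappings are illustrative assumptions, not part of the code above, and it assumes both the PHP source file and the incoming file name are UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): replaces accented
// characters with ASCII equivalents. Assumes UTF-8 input and a UTF-8 source file.
function transliterate_file_name(string $file_name): string
{
    $trans = array(
        // French accents (illustrative, not exhaustive)
        "é" => "e", "è" => "e", "ê" => "e", "ë" => "e",
        "à" => "a", "â" => "a",
        "î" => "i", "ï" => "i",
        "ô" => "o",
        "ù" => "u", "û" => "u",
        "ç" => "c",
        "É" => "E", "È" => "E", "À" => "A", "Ç" => "C",
        // German umlauts and sharp s, as in the function above
        "ä" => "ae", "ö" => "oe", "ü" => "ue", "ß" => "ss",
        "Ä" => "Ae", "Ö" => "Oe", "Ü" => "Ue",
    );
    // With an array argument, strtr() replaces whole substrings, so
    // multi-byte UTF-8 characters are matched as units.
    return strtr($file_name, $trans);
}

echo transliterate_file_name("résumé_été.pdf"), "\n"; // prints "resume_ete.pdf"
```

Characters that are missing from the array pass through unchanged, so the list would need to grow with the languages actually in use.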

Ludwig

#4 Jonathan Bond-Caron
jbondc@openmv.com
In reply to: Wang, Mary Y (#1)
Re: Could not Store French Accent Marks Correctly in Postgres

On Fri Aug 20 05:10 PM, Wang, Mary Y wrote:

So my questions are:
(1) What is the best character encoding that would work for most of 
those languages that have accent marks?

Store data in PostgreSQL as UTF-8

(2) I assume that I also need to do some kind of conversion in the
front end (PHP) as well.

I'm running on Linux and Postgres 8.3.8.

If users are submitting the file names using an HTML form, use:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
...

If using some other method, the common encoding in Europe and North America is
CP1252 (Windows-1252). In PHP you can convert to UTF-8 using:
mb_convert_encoding($str, 'UTF-8', 'Windows-1252');
http://www.php.net/manual/en/function.mb-convert-encoding.php
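
A minimal sketch of how that call might be wrapped so that input which is already UTF-8 is not converted twice. The helper name to_utf8 is hypothetical, and the fallback to Windows-1252 is an assumption about where the data comes from. The check is only a heuristic, since a few Windows-1252 byte sequences also happen to be valid UTF-8. It requires the mbstring extension.

```php
<?php
// Hypothetical helper: returns $str as UTF-8, assuming that any input which
// is not already valid UTF-8 was encoded as Windows-1252.
function to_utf8(string $str): string
{
    // Leave the string alone if it is already well-formed UTF-8.
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str;
    }
    return mb_convert_encoding($str, 'UTF-8', 'Windows-1252');
}

$cp1252 = "caf\xE9"; // "café" as Windows-1252 bytes
echo bin2hex(to_utf8($cp1252)), "\n"; // prints "636166c3a9"
```

The converted value can then be stored in a UTF-8 database as suggested above.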