Changing DB Encodings
Why have I been told that I need to do a re initdb to change the char
encoding? The man says i can just createdb foodb -E utf8 so why
would i need to dump/initdb/create/restore? cant i just dump/create/restore?
It'd save all the messing around with changing the data dirs etc.
Naz Gassiep <naz@mira.net> writes:
Why have I been told that I need to do a re initdb to change the char
encoding? The man says i can just createdb foodb -E utf8 so why
would i need to dump/initdb/create/restore? cant i just dump/create/restore?
Are you clear on the difference between encoding and locale?
You can make new databases with whatever encoding you say, but the
server's lc_collate and lc_ctype are frozen at initdb, and it will
not work well to select an encoding that is incompatible with the
locale setting. In practice this means that you can only use multiple
database encodings when you initdb'd in "C" locale; all other locale
settings imply a particular encoding.
Yes, this is pretty annoying. No, it's not easy to fix.
regards, tom lane
Tom Lane wrote:
Are you clear on the difference between encoding and locale?
I confidently reply with "maybe".
You can make new databases with whatever encoding you say, but the
server's lc_collate and lc_ctype are frozen at initdb, and it will
not work well to select an encoding that is incompatible with the
locale setting. In practice this means that you can only use multiple
database encodings when you initdb'd in "C" locale; all other locale
settings imply a particular encoding.Yes, this is pretty annoying. No, it's not easy to fix.
OK I understand now, I think. How do I check what locale is currently
set? I've successfully created a utf8 database, does that imply that
because I was able to create a DB with a different encoding to the ones
all the others use (SQL_ASCII) that my locale is set to "C" ?
Regards,
- Naz.
Naz Gassiep <naz@mira.net> writes:
OK I understand now, I think. How do I check what locale is currently
set?
"show lc_collate" (maybe check the other lc_ settings too for luck).
I've successfully created a utf8 database, does that imply that
because I was able to create a DB with a different encoding to the ones
all the others use (SQL_ASCII) that my locale is set to "C" ?
No, that implies a lack of error checking.
regards, tom lane
Tom Lane wrote:
I've successfully created a utf8 database, does that imply that
because I was able to create a DB with a different encoding to the ones
all the others use (SQL_ASCII) that my locale is set to "C" ?No, that implies a lack of error checking.
Surely, then, that's a bug? Shouldn't postmaster check if a DB is in an
encoding that differs from the selected locale?
lc_collate and lc_type both report C as the locale. Does that mean I can
safely bulk load a pg_dump into it?
When doing the dump, do I have to use --encoding=utf8 as a param?
Please excuse me if I sound ignorant of all of this, I'm currently
forced to address an issue that I don't yet feel I have a sufficient
grasp of.
Naz Gassiep <naz@mira.net> writes:
Tom Lane wrote:
No, that implies a lack of error checking.
Surely, then, that's a bug? Shouldn't postmaster check if a DB is in an
encoding that differs from the selected locale?
Yeah, it should. Whether it can is a different question.
Part of the problem here is the lack of a reliable way to tell
*which* encoding is implied by a locale. On some systems you
can get a poorly-standardized string name for the locale's encoding;
on others you can't get anything.
There's been some experimental code in initdb for awhile now that tries
to guess encoding from locale. I have not heard reports of it failing
lately, so maybe we could promote it into a hard error check, or at
least a backend-side warning at CREATE DATABASE time. It still won't
help on old systems without nl_langinfo(CODESET), though. (But how
many of those are left? That call is specified by the Single Unix
Spec. Anybody know if it works on Windows?)
regards, tom lane
I would like to see a CREATE Variable dataname datatype added. The scope of these variables would be global. Along this same line I would like to see a way to have a trigger or rule fired upon connection initialization. This would allow for these type of variables to be SET along with other defaults that need to be setup on a per connection basis from the server side instead of the client application.
Best Regards,
Michael Gould
All Coast Intermodal Services, Inc.
904-376-7030
_____
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
To: Naz Gassiep [mailto:naz@mira.net]
Cc: pgsql-general@postgresql.org
Sent: Sat, 07 Jul 2007 19:50:03 -0400
Subject: Re: [GENERAL] Changing DB Encodings
Naz Gassiep <naz@mira.net> writes:
Tom Lane wrote:
No, that implies a lack of error checking.
Surely, then, that's a bug? Shouldn't postmaster check if a DB is in an
encoding that differs from the selected locale?
Yeah, it should. Whether it can is a different question.
Part of the problem here is the lack of a reliable way to tell
*which* encoding is implied by a locale. On some systems you
can get a poorly-standardized string name for the locale's encoding;
on others you can't get anything.
There's been some experimental code in initdb for awhile now that tries
to guess encoding from locale. I have not heard reports of it failing
lately, so maybe we could promote it into a hard error check, or at
least a backend-side warning at CREATE DATABASE time. It still won't
help on old systems without nl_langinfo(CODESET), though. (But how
many of those are left? That call is specified by the Single Unix
Spec. Anybody know if it works on Windows?)
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
[Please start a new message rather than replying to an old one to
start a new topic.]
On Jul 8, 2007, at 10:16 , Mike Gould wrote:
I would like to see a CREATE Variable dataname datatype added. The
scope of these variables would be global.
CREATE TABLE global_int_variables
(
global_int_variable TEXT PRIMARY KEY
, global_int_value INTEGER NOT NULL
);
-- set a new variable
INSERT INTO global_int_variables (global_int_variable,
global_int_value) VALUES ('foo', 42);
-- get current value
SELECT global_int_value FROM global_int_variables WHERE
global_int_variable = 'foo';
-- assign a new value to an existing variable
UPDATE global_int_variables SET global_int_value = 3 WHERE
global_int_variable = 'foo';
-- undefine a variable
DELETE FROM global_int_variables WHERE global_int_variable = 'foo';
You can also wrap these in functions for convenience (untested):
CREATE FUNCTION set_global_int(p_var TEXT, p_val INTEGER) RETURNS VOID
LANGUAGE plpgsql AS $_$
BEGIN
UPDATE global_int_variables
SET global_int_value = p_val
WHERE global_int_variable = p_var;
IF NOT FOUND THEN
INSERT INTO global_int_variables (global_int_variable,
global_int_value)
VALUES (p_var, p_val);
END IF;
RETURN;
END
$_$;
CREATE FUNCTION get_global_int(TEXT) RETURNS INTEGER
LANGUAGE SQL as $_$
SELECT global_int_value FROM global_int_variables WHERE
global_int_variable = $1;
$_$;
CREATE FUNCTION undef_global_int(TEXT) RETURNS VOID
LANGUAGE SQL as $_$
DELETE FROM global_int_variables WHERE global_int_variable = $1;
$_$;
test=# SELECT set_global_int('foo', 5);
set_global_int
----------------
(1 row)
test=# SELECT get_global_int('foo');
get_global_int
----------------
5
(1 row)
test=# SELECT get_global_int('bar');
get_global_int
----------------
(1 row)
test=# SELECT set_global_int('bar', 9);
set_global_int
----------------
(1 row)
test=# SELECT get_global_int('bar');
get_global_int
----------------
9
(1 row)
test=# SELECT set_global_int('bar', 10);
set_global_int
----------------
(1 row)
test=# SELECT get_global_int('bar');
get_global_int
----------------
10
(1 row)
test=# SELECT undef_global_int('bar');
undef_global_int
------------------
(1 row)
test=# SELECT get_global_int('bar');
get_global_int
----------------
(1 row)
Along this same line I would like to see a way to have a trigger or
rule fired upon connection initialization. This would allow for
these type of variables to be SET along with other defaults that
need to be setup on a per connection basis from the server side
instead of the client application.
If you want the variables to be used on a per-connection basis,
you'll want to use TEMP tables. As for setup per connection, you may
be able to do something with a psqlrc file if using psql. Otherwise I
believe you'll need to handle this setup yourself (which I believe
could be someone fully automated using stored procedures). I know
this has come up on the lists before but I don't recall the specifics
of the discussion (e.g., if it's been decided that this isn't
something that will be added to PostgreSQL or whether no one's made a
specific proposal). You may want to search the archives for more
information.
Michael Glaesemann
grzm seespotcode net