could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
Hi!
Postgres 12 database dump is created in Debian 10 using pg_dump .
Trying to restore it in Windows 10 using
pg_restore --clean --create --if-exists --dbname=postgres --jobs=8 --no-password
mydb.backup
produces strange message
pg_restore: WARNING: could not determine encoding for locale "et_EE.UTF-8":
codeset is "CPUTF-8"
How to fix this ?
Debian and Windows computer have same settings:
Latest Postgres 12 is used
OS and database locales are Estonian
Database encoding is UTF-8
Andrus.
On 3/28/20 2:39 PM, Andrus wrote:
Hi!
Postgres 12 database dump is created in Debian 10 using pg_dump .
Trying to restore it in Windows 10ļæ½ using
pg_restore --clean --create --if-exists --dbname=postgres --jobs=8
--no-password mydb.backupproduces strange message
pg_restore: WARNING:ļæ½ could not determine encoding for locale
"et_EE.UTF-8": codeset is "CPUTF-8"
In the Debian Postgres instance in psql what does \l show for the databases?
In the Windows 10 command prompt what does systeminfo show?
How to fix this ?
Debian and Windows computer have same settings:
Latest Postgres 12ļæ½ is used
OS and database locales areļæ½ Estonian
Database encoding is UTF-8Andrus.
--
Adrian Klaver
adrian.klaver@aklaver.com
Hi!
In the Debian Postgres instance in psql what does \l show for the
databases?
#psql namm postgres
psql (12.2 (Debian 12.2-2.pgdg100+1))
Type "help" for help.
namm=# \l namm
List of databases
Name | Owner | Encoding | Collate | Ctype | Access
privileges
------+------------+----------+-------------+-------------+---------------------------
namm | namm_owner | UTF8 | et_EE.UTF-8 | et_EE.UTF-8 | =Tc/namm_owner
+
| | | | |
namm_owner=CTc/namm_owner
(1 row)
In the Windows 10 command prompt what does systeminfo show?
Host Name: SERVER2
OS Name: Microsoft Windows 10 Pro
OS Version: 10.0.18363 N/A Build 18363
OS Manufacturer: Microsoft Corporation
OS Configuration: Standalone Workstation
OS Build Type: Multiprocessor Free
Registered Owner: Windows User
Registered Organization:
Product ID: 00330-70008-16217-AAOEM
Original Install Date: 05.09.2019, 9:16:41
System Boot Time: 28.03.2020, 11:05:23
System Manufacturer: Gigabyte Technology Co., Ltd.
System Model: Q270M-D3H
System Type: x64-based PC
Processor(s): 1 Processor(s) Installed.
[01]: 192.168.91.154
GenuineIntel ~3601 Mhz
BIOS Version: American Megatrends Inc. F1, 09.01.2017
Windows Directory: C:\WINDOWS
System Directory: C:\WINDOWS\system32
Boot Device: \Device\HarddiskVolume3
System Locale: et;Eesti
Input Locale: et;Eesti
Time Zone: (UTC+02:00) Helsingi, Kiiev, Riia, Sofia,
Tallinn, Vilnius
Total Physical Memory: 16 286 MB
Available Physical Memory: 12 032 MB
Virtual Memory: Max Size: 18 718 MB
Virtual Memory: Available: 14 867 MB
Virtual Memory: In Use: 3 851 MB
Page File Location(s): C:\pagefile.sys
Domain: WORKGROUP
Logon Server: \\SERVER2
Hotfix(s): 18 Hotfix(s) Installed.
[01]: 192.168.91.154
[02]: fe80::94d:b1c:3945:bc8a
[03]: 2001:7d0:4c83:4c80:257f:b077:e1f7:21e1
[04]: 2001:7d0:4c83:4c80:94d:b1c:3945:bc8a Hyper-V Requirements: VM Monitor Mode Extensions: Yes Virtualization Enabled In Firmware: Yes Second Level Address Translation: Yes Data Execution Prevention Available: Yes
[05]: KB4515383
[06]: KB4515530
[07]: KB4516115
[08]: KB4517245
[09]: KB4520390
[10]: KB4521863
[11]: KB4524244
[12]: KB4524569
[13]: KB4528759
[14]: KB4532441
[15]: KB4537759
[16]: KB4538674
[17]: KB4541338
[18]: KB4551762 Network Card(s): 2 NIC(s) Installed.
Network Card(s): 2 NIC(s) Installed.
[01]: 192.168.91.154
Connection Name: Ethernet 4
Status: Media disconnected
[02]: fe80::94d:b1c:3945:bc8a
Connection Name: Ethernet 3
DHCP Enabled: Yes
DHCP Server: 192.168.91.1
IP address(es)
[01]: 192.168.91.154
[02]: fe80::94d:b1c:3945:bc8a
[03]: 2001:7d0:4c83:4c80:257f:b077:e1f7:21e1
2001:7d0:4c83:4c80:257f:b077:e1f7:21e1
[04]: 2001:7d0:4c83:4c80:94d:b1c:3945:bc8a Hyper-V Requirements: VM Monitor Mode Extensions: Yes Virtualization Enabled In Firmware: Yes Second Level Address Translation: Yes Data Execution Prevention Available: Yes
Hyper-V Requirements: VM Monitor Mode Extensions: Yes
Virtualization Enabled In Firmware: Yes
Second Level Address Translation: Yes
Data Execution Prevention Available: Yes
Andrus.
On 3/28/20 3:31 PM, Andrus wrote:
Hi!
In the Debian Postgres instance in psql what does \l show for the
databases?#psql namm postgres
psql (12.2 (Debian 12.2-2.pgdg100+1))
Type "help" for help.namm=# \l namm
�������������������������������� List of databases
Name |�� Owner��� | Encoding |�� Collate�� |��� Ctype��� |���� Access
privileges
------+------------+----------+-------------+-------------+---------------------------namm | namm_owner | UTF8���� | et_EE.UTF-8 | et_EE.UTF-8 | =Tc/namm_owner +
���� |����������� |��������� |������������ |������������ |
namm_owner=CTc/namm_owner
(1 row)In the Windows 10 command prompt what does systeminfo show?
System Locale:������������ et;Eesti
Input Locale:������������� et;Eesti
Hmm, I was expecting to see et_EE though I will admit to not truly
understanding how Windows does locales.
I should have asked earlier, in the Postgres instance on Windows what
does \l show for template0?
Andrus.
--
Adrian Klaver
adrian.klaver@aklaver.com
"Andrus" <kobruleht2@hot.ee> writes:
Postgres 12 database dump is created in Debian 10 using pg_dump .
Trying to restore it in Windows 10 using
pg_restore --clean --create --if-exists --dbname=postgres --jobs=8 --no-password
mydb.backup
produces strange message
pg_restore: WARNING: could not determine encoding for locale "et_EE.UTF-8":
codeset is "CPUTF-8"
How to fix this ?
Generally speaking, locale names from Unix systems won't work at all on
Windows. You need to create the database manually with whatever seems
to be the closest Windows locale match, and then restore its contents
without using --create.
(I do agree that that message isn't the most helpful thing. It looks
like chklocale.c is overoptimistically assuming that what it's handed
is valid, even if GetLocaleInfoEx says it isn't.)
regards, tom lane
I wrote:
(I do agree that that message isn't the most helpful thing. It looks
like chklocale.c is overoptimistically assuming that what it's handed
is valid, even if GetLocaleInfoEx says it isn't.)
After further digging, that optimism isn't *completely* without
foundation; it seems to be based on the fact that we know setlocale()
thought the locale string was OK. Which is interesting --- apparently
Microsoft is more willing to accept Unix-style locale names than
I thought. But they couldn't be bothered to make GetLocaleInfoEx()
and setlocale() take the same set of strings ...
regards, tom lane
Hi!
System Locale: et;Eesti
Input Locale: et;EestiHmm, I was expecting to see et_EE though I will admit to not truly
understanding how Windows does locales.
I should have asked earlier, in the Postgres instance on Windows what does
\l show for template0?
"D:\Program Files\PostgreSQL\12\bin\psql" postgres postgres
psql (12.2)
WARNING: Console code page (775) differs from Windows code page (1257)
8-bit characters might not work correctly. See psql reference
page "Notes for Windows users" for details.
Type "help" for help.
postgres=# \l template0
List of databases
Name | Owner | Encoding | Collate | Ctype
| Access privileges
-----------+----------+----------+-----------------------+-----------------------+-----------------------
template0 | postgres | UTF8 | Estonian_Estonia.1257 |
Estonian_Estonia.1257 | =c/postgres +
| | | |
| postgres=CTc/postgres
(1 row)
Andrus.
Hi!
Generally speaking, locale names from Unix systems won't work at all on
Windows. You need to create the database manually with whatever seems
to be the closest Windows locale match, and then restore its contents
without using --create.
This is unattended script running in every night from .bat file.
How to replace pg_restore --create option with psql and/or createdb calls
and specify proper locale for them ?
Currently everthing has "Estonian_Estonia.1257" locale in windows.
Which locale name should be specified in Windows instead of this?
Or maybe creating new template with proper encoding or changing template0
encoding helps?
Andrus.
On 3/29/20 12:56 AM, Andrus wrote:
Hi!
System Locale:������������ et;Eesti
Input Locale:������������� et;EestiHmm, I was expecting to see et_EE though I will admit to not truly
understanding how Windows does locales.
I should have asked earlier, in the Postgres instance on Windows what
does \l show for template0?
I rarely use Windows anymore so take the below with that in mind.
"D:\Program Files\PostgreSQL\12\bin\psql"ļæ½ postgres postgres
psql (12.2)
WARNING: Console code page (775) differs from Windows code page (1257)
������� 8-bit characters might not work correctly. See psql reference
������� page "Notes for Windows users" for details.
There seems to a difference of opinion of what Baltic Code Page to use:
https://en.wikipedia.org/wiki/Code_page_775
https://en.wikipedia.org/wiki/Windows-1257
The post below shows a users method of dealing with this for another CP:
/messages/by-id/549275CC.4010607@gmail.com
Type "help" for help.
postgres=# \l template0
������������������������������������������ List of databases
� Name��� |� Owner�� | Encoding |������� Collate������� |�������� Ctype
|�� Access privileges
-----------+----------+----------+-----------------------+-----------------------+-----------------------template0 | postgres | UTF8���� | Estonian_Estonia.1257 |
Estonian_Estonia.1257 | =c/postgres��������� +
��������� |��������� |��������� |���������������������� | |
I'm guessing it is picking up Estonian_Estonia.1257 from the system.
The Windows Postgres instance was installed from the EDB installer?
postgres=CTc/postgres
(1 row)Andrus.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 3/29/20 1:02 AM, Andrus wrote:
Hi!
Generally speaking, locale names from Unix systems won't work at all on
Windows. You need to create the database manually with whatever seems
to be the closest Windows locale match, and then restore its contents
without using --create.This is unattended script running in every night from .bat file.
Per my previous post, you might try adding something like:
cmd.exe /c chcp 1257
to the top of the batch file. This idea came from here:
How to replace pg_restore --create option with psql and/or createdb
calls
and specify proper locale for them ?Currently everthing has "Estonian_Estonia.1257" locale in windows.
Which locale name should be specified in Windows instead of this?
Or maybe creating new template with proper encoding or changing
template0 encoding helps?
If I am following the error correctly then the issue is that the
Postgres console programs are using CP755 and that is not something for
which there is an automatic conversion:
https://www.postgresql.org/docs/12/multibyte.html#id-1.6.10.5.7
There is a conversion for 1257 clients so having your console run as
1257 should solve the problem. Someone with more current experience on
Windows will need to comment on whether that is the viable or best solution.
Andrus.
--
Adrian Klaver
adrian.klaver@aklaver.com
Hi!
There seems to a difference of opinion of what Baltic Code Page to use:
https://en.wikipedia.org/wiki/Code_page_775
https://en.wikipedia.org/wiki/Windows-1257
The post below shows a users method of dealing with this for another CP:
/messages/by-id/549275CC.4010607@gmail.com
Console code page 775 message appears since psql is console application
running from command line.
It does not have any relation to pg_dump/pg_restore issue since console code
page is not used in this case.
There is Estonian locale everywhere.
Maybe this warning is harmless since Linux code page is ignored and default
collation is used.
All table definitions in restored database contain references to default
collation:
CREATE TABLE firma1.acquirpo
(
kassanr numeric(3,0) NOT NULL,
policyid character(2) COLLATE pg_catalog."default" NOT NULL,
trantype character(6) COLLATE pg_catalog."default",
tacdefault character(10) COLLATE pg_catalog."default",
tacdenial character(10) COLLATE pg_catalog."default",
taconline character(10) COLLATE pg_catalog."default",
floorlimit numeric(12,0),
randselthr numeric(12,0),
minrandper numeric(2,0),
maxrandper numeric(2,0),
CONSTRAINT acquirpo_pkey PRIMARY KEY (kassanr, policyid)
)
TABLESPACE pg_default;
Same warning appears two times. This command execute by pg_restore probably
causes this (harmless?) warning:
CREATE DATABASE mydb
WITH
OWNER = mydb_owner
ENCODING = 'UTF8'
LC_COLLATE = 'et_EE.UTF-8'
LC_CTYPE = 'et_EE.UTF-8'
TABLESPACE = pg_default
CONNECTION LIMIT = -1;
If new database is created manually in windows cluster the following command
is generated:
CREATE DATABASE mydbmanually
WITH
OWNER = mydbmanually_owner
ENCODING = 'UTF8'
LC_COLLATE = 'Estonian_Estonia.1257'
LC_CTYPE = 'Estonian_Estonia.1257'
TABLESPACE = pg_default
CONNECTION LIMIT = -1;
Andrus.
Hi!
Per my previous post, you might try adding something like:
cmd.exe /c chcp 1257
to the top of the batch file. This idea came from here:
https://stackoverflow.com/questions/20794035/postgresql-warning-console-code-page-437-differs-from-windows-code-page-125
If I am following the error correctly then the issue is that the Postgres
console programs are using CP755 and that is not something for which there
is an automatic conversion:
https://www.postgresql.org/docs/12/multibyte.html#id-1.6.10.5.7
There is a conversion for 1257 clients so having your console run as 1257
should solve the problem. Someone with more current experience on Windows
will need to comment on whether that is the viable or best solution.
Both servers have UTF-8 encoding.
Non-unicode code page 755 referes only to command line applications like
psql.
Postgres service, pg_dump and pg_restore do not use console codepages any
way, they operate using only UTF-8 character set since both databases are in
UTF-8
I think console code page warning message is not related to this issue.
Andrus.
On 3/29/20 11:12 AM, Andrus wrote:
Hi!
There seems to a difference of opinion of what Baltic Code Page to use:
https://en.wikipedia.org/wiki/Code_page_775
https://en.wikipedia.org/wiki/Windows-1257
The post below shows a users method of dealing with this for another CP:
/messages/by-id/549275CC.4010607@gmail.comConsole code page 775 message appears since psql is console application
running from command line.It does not have any relation to pg_dump/pg_restore issue since console
code page is not used in this case.There is Estonian locale everywhere.
Maybe this warning is harmless since Linux code page is ignored and
default collation is used.
All table definitions in restored database contain references to default
collation:CREATE TABLE firma1.acquirpo
(
�� kassanr numeric(3,0) NOT NULL,
�� policyid character(2) COLLATE pg_catalog."default" NOT NULL,
�� trantype character(6) COLLATE pg_catalog."default",
�� tacdefault character(10) COLLATE pg_catalog."default",
�� tacdenial character(10) COLLATE pg_catalog."default",
�� taconline character(10) COLLATE pg_catalog."default",
�� floorlimit numeric(12,0),
�� randselthr numeric(12,0),
�� minrandper numeric(2,0),
�� maxrandper numeric(2,0),
�� CONSTRAINT acquirpo_pkey PRIMARY KEY (kassanr, policyid)
)TABLESPACE pg_default;
Same warning appears two times. This command execute by pg_restore
probably causes this (harmless?)ļæ½ warning:
What warning?
I cranked up a Windows 7 instance and tried to migrate a Postgres 11
database from Ubuntu and it failed on the CREATE DATABASE step because
of this line in the dump file:
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';
When I manually changed it in the plain text version of the dump file to:
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United
States.1252';
borrowing from Tom's suggestion here:
/messages/by-id/22672.1585442578@sss.pgh.pa.us
it worked.
So basically what you see below.
If you followed Tom's suggestion fully you could restore into a manually
created database on the Windows side by dropping the --create and
pointing -d at the previously created database.
CREATE DATABASE mydb
�� WITH
�� OWNER = mydb_owner
�� ENCODING = 'UTF8'
�� LC_COLLATE = 'et_EE.UTF-8'
�� LC_CTYPE = 'et_EE.UTF-8'
�� TABLESPACE = pg_default
�� CONNECTION LIMIT = -1;If new database is created manually in windows cluster the following
command is generated:CREATE DATABASE mydbmanually
�� WITH
�� OWNER = mydbmanually_owner
�� ENCODING = 'UTF8'
�� LC_COLLATE = 'Estonian_Estonia.1257'
�� LC_CTYPE = 'Estonian_Estonia.1257'
�� TABLESPACE = pg_default
�� CONNECTION LIMIT = -1;Andrus.
--
Adrian Klaver
adrian.klaver@aklaver.com
Hi!
Same warning appears two times. This command execute by pg_restore
probably causes this (harmless?) warning:What warning?
pg_restore: WARNING: could not determine encoding for locale "et_EE.UTF-8":
codeset is "CPUTF-8"
I cranked up a Windows 7 instance and tried to migrate a Postgres 11
database from Ubuntu and it failed on the CREATE DATABASE step because of
this line in the dump file:
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';
I ran this statemnt it in Windows 10 with Postgres 12 successfully. Result
was:
WARNING: could not determine encoding for locale "en_US.UTF-8": codeset is
"CPUTF-8"
WARNING: could not determine encoding for locale "en_US.UTF-8": codeset is
"CPUTF-8"
CREATE DATABASE
Query returned successfully in 1 secs 75 msec.
redmine database was created. I dont understand why it failed in your test.
When I manually changed it in the plain text version of the dump file to:
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United
States.1252';
I verifed that data was restored using pg_restore without manually changing
anything.
Andrus.
On 3/29/20 2:47 PM, Andrus wrote:
Hi!
Same warning appears two times. This command execute by pg_restore
probably causes this (harmless?)ļæ½ warning:What warning?
pg_restore: WARNING:ļæ½ could not determine encoding for locale
"et_EE.UTF-8":
codeset is "CPUTF-8"I cranked up a Windows 7 instance and tried to migrate a Postgres 11
database from Ubuntu and it failed on the CREATE DATABASE step because
of this line in the dump file:
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';I ran this statemnt it in Windows 10 with Postgres 12 successfully.
Result was:WARNING:ļæ½ could not determine encoding for locale "en_US.UTF-8": codeset
is "CPUTF-8"
WARNING:ļæ½ could not determine encoding for locale "en_US.UTF-8": codeset
is "CPUTF-8"
CREATE DATABASEQuery returned successfully in 1 secs 75 msec.
redmine database was created. I dont understand why it failed in your test.
Not sure but:
1) I was on Windows 7
2) Using Postgres 11
3) My Windows skills have atrophied, especially with the Windows command
line.
When I manually changed it in the plain text version of the dump file to:
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United
States.1252';I verifed that data was restored using pg_restore without manually
changing anything.
So was this the same for the database you originally posted about, it
actually restored it just threw warnings?
If so I misunderstood the situation and thought the database was not
loading.
Andrus.
--
Adrian Klaver
adrian.klaver@aklaver.com
Hi!
Not sure but:
1) I was on Windows 7
2) Using Postgres 11
3) My Windows skills have atrophied, especially with the Windows command
line.
So was this the same for the database you originally posted about, it
actually restored it just threw warnings?
Looks like it restored. I havent checked restored data.
If so I misunderstood the situation and thought the database was not
loading.
I tried
CREATE DATABASE redmine
WITH
ENCODING = 'UTF8'
LC_COLLATE = 'foo'
LC_CTYPE = 'bar' template template0
in Linux and in Windows using Postgres 12.2
In Linux it throws error
ERROR: invalid locale name: "foo"
In Windows it creates database and throws warning only.
Without template template0 clause it throws error in Windows also.
In Linux
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8'
LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United
States.1252';
also throws error
ERROR: invalid locale name: "English_United States.1252"
So it looks like pg_dump/pg_restore with --create works only from Linux to
Windows and does not work from Windows to Linux.
I expect that it should work from Windows to Linux also.
Andrus.