Thread: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
"Andrus"
Date:
Hi! Postgres 12 database dump is created in Debian 10 using pg_dump . Trying to restore it in Windows 10 using pg_restore --clean --create --if-exists --dbname=postgres --jobs=8 --no-password mydb.backup produces strange message pg_restore: WARNING: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" How to fix this ? Debian and Windows computer have same settings: Latest Postgres 12 is used OS and database locales are Estonian Database encoding is UTF-8 Andrus.
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is"CPUTF-8" in pg_restore
From
Adrian Klaver
Date:
On 3/28/20 2:39 PM, Andrus wrote: > Hi! > > Postgres 12 database dump is created in Debian 10 using pg_dump . > > Trying to restore it in Windows 10 using > > pg_restore --clean --create --if-exists --dbname=postgres --jobs=8 > --no-password mydb.backup > > produces strange message > > pg_restore: WARNING: could not determine encoding for locale > "et_EE.UTF-8": codeset is "CPUTF-8" In the Debian Postgres instance in psql what does \l show for the databases? In the Windows 10 command prompt what does systeminfo show? > > How to fix this ? > > Debian and Windows computer have same settings: > > Latest Postgres 12 is used > OS and database locales are Estonian > Database encoding is UTF-8 > > Andrus. > > -- Adrian Klaver adrian.klaver@aklaver.com
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
"Andrus"
Date:
Hi! >In the Debian Postgres instance in psql what does \l show for the >databases? #psql namm postgres psql (12.2 (Debian 12.2-2.pgdg100+1)) Type "help" for help. namm=# \l namm List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges ------+------------+----------+-------------+-------------+--------------------------- namm | namm_owner | UTF8 | et_EE.UTF-8 | et_EE.UTF-8 | =Tc/namm_owner + | | | | | namm_owner=CTc/namm_owner (1 row) >In the Windows 10 command prompt what does systeminfo show? Host Name: SERVER2 OS Name: Microsoft Windows 10 Pro OS Version: 10.0.18363 N/A Build 18363 OS Manufacturer: Microsoft Corporation OS Configuration: Standalone Workstation OS Build Type: Multiprocessor Free Registered Owner: Windows User Registered Organization: Product ID: 00330-70008-16217-AAOEM Original Install Date: 05.09.2019, 9:16:41 System Boot Time: 28.03.2020, 11:05:23 System Manufacturer: Gigabyte Technology Co., Ltd. System Model: Q270M-D3H System Type: x64-based PC Processor(s): 1 Processor(s) Installed. [01]: Intel64 Family 6 Model 158 Stepping 9 GenuineIntel ~3601 Mhz BIOS Version: American Megatrends Inc. F1, 09.01.2017 Windows Directory: C:\WINDOWS System Directory: C:\WINDOWS\system32 Boot Device: \Device\HarddiskVolume3 System Locale: et;Eesti Input Locale: et;Eesti Time Zone: (UTC+02:00) Helsingi, Kiiev, Riia, Sofia, Tallinn, Vilnius Total Physical Memory: 16 286 MB Available Physical Memory: 12 032 MB Virtual Memory: Max Size: 18 718 MB Virtual Memory: Available: 14 867 MB Virtual Memory: In Use: 3 851 MB Page File Location(s): C:\pagefile.sys Domain: WORKGROUP Logon Server: \\SERVER2 Hotfix(s): 18 Hotfix(s) Installed. [01]: KB4534132 [02]: KB4497165 [03]: KB4498523 [04]: KB4503308 [05]: KB4515383 [06]: KB4515530 [07]: KB4516115 [08]: KB4517245 [09]: KB4520390 [10]: KB4521863 [11]: KB4524244 [12]: KB4524569 [13]: KB4528759 [14]: KB4532441 [15]: KB4537759 [16]: KB4538674 [17]: KB4541338 [18]: KB4551762 Network Card(s): 2 NIC(s) Installed. [01]: TAP-Windows Adapter V9 Connection Name: Ethernet 4 Status: Media disconnected [02]: Intel(R) Ethernet Connection (2) I219-LM Connection Name: Ethernet 3 DHCP Enabled: Yes DHCP Server: 192.168.91.1 IP address(es) [01]: 192.168.91.154 [02]: fe80::94d:b1c:3945:bc8a [03]: 2001:7d0:4c83:4c80:257f:b077:e1f7:21e1 [04]: 2001:7d0:4c83:4c80:94d:b1c:3945:bc8a Hyper-V Requirements: VM Monitor Mode Extensions: Yes Virtualization Enabled In Firmware: Yes Second Level Address Translation: Yes Data Execution Prevention Available: Yes Andrus.
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is"CPUTF-8" in pg_restore
From
Adrian Klaver
Date:
On 3/28/20 3:31 PM, Andrus wrote: > Hi! > >> In the Debian Postgres instance in psql what does \l show for the >> databases? > > #psql namm postgres > psql (12.2 (Debian 12.2-2.pgdg100+1)) > Type "help" for help. > > namm=# \l namm > List of databases > Name | Owner | Encoding | Collate | Ctype | Access > privileges > ------+------------+----------+-------------+-------------+--------------------------- > > namm | namm_owner | UTF8 | et_EE.UTF-8 | et_EE.UTF-8 | =Tc/namm_owner + > | | | | | > namm_owner=CTc/namm_owner > (1 row) > >> In the Windows 10 command prompt what does systeminfo show? > > System Locale: et;Eesti > Input Locale: et;Eesti Hmm, I was expecting to see et_EE though I will admit to not truly understanding how Windows does locales. I should have asked earlier, in the Postgres instance on Windows what does \l show for template0? > > Andrus. > -- Adrian Klaver adrian.klaver@aklaver.com
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
Tom Lane
Date:
"Andrus" <kobruleht2@hot.ee> writes: > Postgres 12 database dump is created in Debian 10 using pg_dump . > Trying to restore it in Windows 10 using > pg_restore --clean --create --if-exists --dbname=postgres --jobs=8 --no-password > mydb.backup > produces strange message > pg_restore: WARNING: could not determine encoding for locale "et_EE.UTF-8": > codeset is "CPUTF-8" > How to fix this ? Generally speaking, locale names from Unix systems won't work at all on Windows. You need to create the database manually with whatever seems to be the closest Windows locale match, and then restore its contents without using --create. (I do agree that that message isn't the most helpful thing. It looks like chklocale.c is overoptimistically assuming that what it's handed is valid, even if GetLocaleInfoEx says it isn't.) regards, tom lane
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
Tom Lane
Date:
I wrote: > (I do agree that that message isn't the most helpful thing. It looks > like chklocale.c is overoptimistically assuming that what it's handed > is valid, even if GetLocaleInfoEx says it isn't.) After further digging, that optimism isn't *completely* without foundation; it seems to be based on the fact that we know setlocale() thought the locale string was OK. Which is interesting --- apparently Microsoft is more willing to accept Unix-style locale names than I thought. But they couldn't be bothered to make GetLocaleInfoEx() and setlocale() take the same set of strings ... regards, tom lane
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
"Andrus"
Date:
Hi! >> System Locale: et;Eesti >> Input Locale: et;Eesti >Hmm, I was expecting to see et_EE though I will admit to not truly >understanding how Windows does locales. >I should have asked earlier, in the Postgres instance on Windows what does >\l show for template0? "D:\Program Files\PostgreSQL\12\bin\psql" postgres postgres psql (12.2) WARNING: Console code page (775) differs from Windows code page (1257) 8-bit characters might not work correctly. See psql reference page "Notes for Windows users" for details. Type "help" for help. postgres=# \l template0 List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges -----------+----------+----------+-----------------------+-----------------------+----------------------- template0 | postgres | UTF8 | Estonian_Estonia.1257 | Estonian_Estonia.1257 | =c/postgres + | | | | | postgres=CTc/postgres (1 row) Andrus.
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
"Andrus"
Date:
Hi! >Generally speaking, locale names from Unix systems won't work at all on >Windows. You need to create the database manually with whatever seems >to be the closest Windows locale match, and then restore its contents >without using --create. This is unattended script running in every night from .bat file. How to replace pg_restore --create option with psql and/or createdb calls and specify proper locale for them ? Currently everthing has "Estonian_Estonia.1257" locale in windows. Which locale name should be specified in Windows instead of this? Or maybe creating new template with proper encoding or changing template0 encoding helps? Andrus.
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is"CPUTF-8" in pg_restore
From
Adrian Klaver
Date:
On 3/29/20 12:56 AM, Andrus wrote: > Hi! > >>> System Locale: et;Eesti >>> Input Locale: et;Eesti >> Hmm, I was expecting to see et_EE though I will admit to not truly >> understanding how Windows does locales. >> I should have asked earlier, in the Postgres instance on Windows what >> does \l show for template0? I rarely use Windows anymore so take the below with that in mind. > > "D:\Program Files\PostgreSQL\12\bin\psql" postgres postgres > > psql (12.2) > WARNING: Console code page (775) differs from Windows code page (1257) > 8-bit characters might not work correctly. See psql reference > page "Notes for Windows users" for details. There seems to a difference of opinion of what Baltic Code Page to use: https://en.wikipedia.org/wiki/Code_page_775 https://en.wikipedia.org/wiki/Windows-1257 The post below shows a users method of dealing with this for another CP: https://www.postgresql.org/message-id/549275CC.4010607%40gmail.com > Type "help" for help. > > postgres=# \l template0 > List of databases > Name | Owner | Encoding | Collate | Ctype > | Access privileges > -----------+----------+----------+-----------------------+-----------------------+----------------------- > > template0 | postgres | UTF8 | Estonian_Estonia.1257 | > Estonian_Estonia.1257 | =c/postgres + > | | | | | I'm guessing it is picking up Estonian_Estonia.1257 from the system. The Windows Postgres instance was installed from the EDB installer? > postgres=CTc/postgres > (1 row) > > Andrus. -- Adrian Klaver adrian.klaver@aklaver.com
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is"CPUTF-8" in pg_restore
From
Adrian Klaver
Date:
On 3/29/20 1:02 AM, Andrus wrote: > Hi! > >> Generally speaking, locale names from Unix systems won't work at all on >> Windows. You need to create the database manually with whatever seems >> to be the closest Windows locale match, and then restore its contents >> without using --create. > > This is unattended script running in every night from .bat file. Per my previous post, you might try adding something like: cmd.exe /c chcp 1257 to the top of the batch file. This idea came from here: https://stackoverflow.com/questions/20794035/postgresql-warning-console-code-page-437-differs-from-windows-code-page-125 > > How to replace pg_restore --create option with psql and/or createdb > calls > and specify proper locale for them ? > > Currently everthing has "Estonian_Estonia.1257" locale in windows. > Which locale name should be specified in Windows instead of this? > Or maybe creating new template with proper encoding or changing > template0 encoding helps? If I am following the error correctly then the issue is that the Postgres console programs are using CP755 and that is not something for which there is an automatic conversion: https://www.postgresql.org/docs/12/multibyte.html#id-1.6.10.5.7 There is a conversion for 1257 clients so having your console run as 1257 should solve the problem. Someone with more current experience on Windows will need to comment on whether that is the viable or best solution. > > Andrus. > > -- Adrian Klaver adrian.klaver@aklaver.com
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
"Andrus"
Date:
Hi! >There seems to a difference of opinion of what Baltic Code Page to use: >https://en.wikipedia.org/wiki/Code_page_775 >https://en.wikipedia.org/wiki/Windows-1257 >The post below shows a users method of dealing with this for another CP: >https://www.postgresql.org/message-id/549275CC.4010607%40gmail.com Console code page 775 message appears since psql is console application running from command line. It does not have any relation to pg_dump/pg_restore issue since console code page is not used in this case. There is Estonian locale everywhere. Maybe this warning is harmless since Linux code page is ignored and default collation is used. All table definitions in restored database contain references to default collation: CREATE TABLE firma1.acquirpo ( kassanr numeric(3,0) NOT NULL, policyid character(2) COLLATE pg_catalog."default" NOT NULL, trantype character(6) COLLATE pg_catalog."default", tacdefault character(10) COLLATE pg_catalog."default", tacdenial character(10) COLLATE pg_catalog."default", taconline character(10) COLLATE pg_catalog."default", floorlimit numeric(12,0), randselthr numeric(12,0), minrandper numeric(2,0), maxrandper numeric(2,0), CONSTRAINT acquirpo_pkey PRIMARY KEY (kassanr, policyid) ) TABLESPACE pg_default; Same warning appears two times. This command execute by pg_restore probably causes this (harmless?) warning: CREATE DATABASE mydb WITH OWNER = mydb_owner ENCODING = 'UTF8' LC_COLLATE = 'et_EE.UTF-8' LC_CTYPE = 'et_EE.UTF-8' TABLESPACE = pg_default CONNECTION LIMIT = -1; If new database is created manually in windows cluster the following command is generated: CREATE DATABASE mydbmanually WITH OWNER = mydbmanually_owner ENCODING = 'UTF8' LC_COLLATE = 'Estonian_Estonia.1257' LC_CTYPE = 'Estonian_Estonia.1257' TABLESPACE = pg_default CONNECTION LIMIT = -1; Andrus.
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
"Andrus"
Date:
Hi! >Per my previous post, you might try adding something like: >cmd.exe /c chcp 1257 >to the top of the batch file. This idea came from here: >https://stackoverflow.com/questions/20794035/postgresql-warning-console-code-page-437-differs-from-windows-code-page-125 >If I am following the error correctly then the issue is that the Postgres >console programs are using CP755 and that is not something for which there >is an automatic conversion: >https://www.postgresql.org/docs/12/multibyte.html#id-1.6.10.5.7 >There is a conversion for 1257 clients so having your console run as 1257 >should solve the problem. Someone with more current experience on Windows >will need to comment on whether that is the viable or best solution. Both servers have UTF-8 encoding. Non-unicode code page 755 referes only to command line applications like psql. Postgres service, pg_dump and pg_restore do not use console codepages any way, they operate using only UTF-8 character set since both databases are in UTF-8 I think console code page warning message is not related to this issue. Andrus.
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is"CPUTF-8" in pg_restore
From
Adrian Klaver
Date:
On 3/29/20 11:12 AM, Andrus wrote: > Hi! > >> There seems to a difference of opinion of what Baltic Code Page to use: >> https://en.wikipedia.org/wiki/Code_page_775 >> https://en.wikipedia.org/wiki/Windows-1257 >> The post below shows a users method of dealing with this for another CP: >> https://www.postgresql.org/message-id/549275CC.4010607%40gmail.com > > Console code page 775 message appears since psql is console application > running from command line. > > It does not have any relation to pg_dump/pg_restore issue since console > code page is not used in this case. > > There is Estonian locale everywhere. > Maybe this warning is harmless since Linux code page is ignored and > default collation is used. > All table definitions in restored database contain references to default > collation: > > CREATE TABLE firma1.acquirpo > ( > kassanr numeric(3,0) NOT NULL, > policyid character(2) COLLATE pg_catalog."default" NOT NULL, > trantype character(6) COLLATE pg_catalog."default", > tacdefault character(10) COLLATE pg_catalog."default", > tacdenial character(10) COLLATE pg_catalog."default", > taconline character(10) COLLATE pg_catalog."default", > floorlimit numeric(12,0), > randselthr numeric(12,0), > minrandper numeric(2,0), > maxrandper numeric(2,0), > CONSTRAINT acquirpo_pkey PRIMARY KEY (kassanr, policyid) > ) > > TABLESPACE pg_default; > > Same warning appears two times. This command execute by pg_restore > probably causes this (harmless?) warning: What warning? I cranked up a Windows 7 instance and tried to migrate a Postgres 11 database from Ubuntu and it failed on the CREATE DATABASE step because of this line in the dump file: CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8'; When I manually changed it in the plain text version of the dump file to: CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United States.1252'; borrowing from Tom's suggestion here: https://www.postgresql.org/message-id/22672.1585442578%40sss.pgh.pa.us it worked. So basically what you see below. If you followed Tom's suggestion fully you could restore into a manually created database on the Windows side by dropping the --create and pointing -d at the previously created database. > > CREATE DATABASE mydb > WITH > OWNER = mydb_owner > ENCODING = 'UTF8' > LC_COLLATE = 'et_EE.UTF-8' > LC_CTYPE = 'et_EE.UTF-8' > TABLESPACE = pg_default > CONNECTION LIMIT = -1; > > If new database is created manually in windows cluster the following > command is generated: > > > CREATE DATABASE mydbmanually > WITH > OWNER = mydbmanually_owner > ENCODING = 'UTF8' > LC_COLLATE = 'Estonian_Estonia.1257' > LC_CTYPE = 'Estonian_Estonia.1257' > TABLESPACE = pg_default > CONNECTION LIMIT = -1; > > Andrus. > -- Adrian Klaver adrian.klaver@aklaver.com
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
"Andrus"
Date:
Hi! >> Same warning appears two times. This command execute by pg_restore >> probably causes this (harmless?) warning: >What warning? pg_restore: WARNING: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" >I cranked up a Windows 7 instance and tried to migrate a Postgres 11 >database from Ubuntu and it failed on the CREATE DATABASE step because of >this line in the dump file: >CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' >LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8'; I ran this statemnt it in Windows 10 with Postgres 12 successfully. Result was: WARNING: could not determine encoding for locale "en_US.UTF-8": codeset is "CPUTF-8" WARNING: could not determine encoding for locale "en_US.UTF-8": codeset is "CPUTF-8" CREATE DATABASE Query returned successfully in 1 secs 75 msec. redmine database was created. I dont understand why it failed in your test. >When I manually changed it in the plain text version of the dump file to: >CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' >LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United >States.1252'; I verifed that data was restored using pg_restore without manually changing anything. Andrus.
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is"CPUTF-8" in pg_restore
From
Adrian Klaver
Date:
On 3/29/20 2:47 PM, Andrus wrote: > Hi! > >>> Same warning appears two times. This command execute by pg_restore >>> probably causes this (harmless?) warning: >> What warning? > > pg_restore: WARNING: could not determine encoding for locale > "et_EE.UTF-8": > codeset is "CPUTF-8" > >> I cranked up a Windows 7 instance and tried to migrate a Postgres 11 >> database from Ubuntu and it failed on the CREATE DATABASE step because >> of this line in the dump file: >> CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' >> LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8'; > > I ran this statemnt it in Windows 10 with Postgres 12 successfully. > Result was: > > WARNING: could not determine encoding for locale "en_US.UTF-8": codeset > is "CPUTF-8" > WARNING: could not determine encoding for locale "en_US.UTF-8": codeset > is "CPUTF-8" > CREATE DATABASE > > Query returned successfully in 1 secs 75 msec. > > redmine database was created. I dont understand why it failed in your test. Not sure but: 1) I was on Windows 7 2) Using Postgres 11 3) My Windows skills have atrophied, especially with the Windows command line. > >> When I manually changed it in the plain text version of the dump file to: >> CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' >> LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United >> States.1252'; > > I verifed that data was restored using pg_restore without manually > changing anything. So was this the same for the database you originally posted about, it actually restored it just threw warnings? If so I misunderstood the situation and thought the database was not loading. > > Andrus. > -- Adrian Klaver adrian.klaver@aklaver.com
Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore
From
"Andrus"
Date:
Hi! >Not sure but: >1) I was on Windows 7 >2) Using Postgres 11 >3) My Windows skills have atrophied, especially with the Windows command >line. >So was this the same for the database you originally posted about, it >actually restored it just threw warnings? Looks like it restored. I havent checked restored data. >If so I misunderstood the situation and thought the database was not >loading. I tried CREATE DATABASE redmine WITH ENCODING = 'UTF8' LC_COLLATE = 'foo' LC_CTYPE = 'bar' template template0 in Linux and in Windows using Postgres 12.2 In Linux it throws error ERROR: invalid locale name: "foo" In Windows it creates database and throws warning only. Without template template0 clause it throws error in Windows also. In Linux CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United States.1252'; also throws error ERROR: invalid locale name: "English_United States.1252" So it looks like pg_dump/pg_restore with --create works only from Linux to Windows and does not work from Windows to Linux. I expect that it should work from Windows to Linux also. Andrus.