Thread: Re: [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed
On Sun, Sep 16, 2012 at 12:38:37PM +0800, Rural Hunter wrote: > >OK, I see many new ALTER TABLE commands, but nothing that would cause a > >difference in relation count. > > > >Attached is a patch that will return the OID of the old/new mismatched > >entries. Please research the pg_class objects on the old/new clusters > >that have the mismatch and let me know. It might be something that > >isn't in the old cluster, or not in the new cluster. > > > I ran the pg_upgrade with the patch and found the problematic object > is a toast object. > Copying user relation files > /raid/pgsql/base/6087920/6088238 > Mismatch of relation OID in database "forummon": old OID 16439148, > new OID 16439322 > > In old cluster: > # select * from pg_class WHERE oid=16439148; > relname | relnamespace | reltype | reloftype | relowner | relam | > relfilenode | reltablespace | relpages | reltuples | reltoastrelid | > reltoastidxid | relhasindex | relisshared | relpersistence | relkind > | relnatts | relchecks | relhasoids | relhaspkey | relhasrules | > relhastriggers | relhassubclass | relfrozenxid | relacl | reloptions > -------------------+--------------+----------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+------------+-------------+----------------+----------------+--------------+--------+------------ > pg_toast_16439145 | 99 | 16439149 | 0 | 10 | 0 | 16439148 | 0 | 0 | > 0 | 0 | 16439150 | t | f | p | t | 3 | 0 | f | t | f | f | f | > 630449585 | | > (1 row) > > But it doesn't exist in new cluster: > select * from pg_class WHERE oid=16439148; > relname | relnamespace | reltype | reloftype | relowner | relam | > relfilenode | reltablespace | relpages | reltuples | relallvisible | > reltoastrelid | reltoastidxid | relhasindex | relisshared | > relpersistence | relkind | relnatts | relchecks | relhasoids | > relhaspkey | relhasrules | relhastriggers | relhassubclass | > relfrozenxid | relacl | reloptions > ---------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+------------+-------------+----------------+----------------+--------------+--------+------------ > (0 rows) [ Thread moved to hackers list.] OK, this is exactly what I wanted to see, and it explains why pg_dump didn't show it. Can you find out what table references this toast table? Try this query on the old cluster: select oid, * from pg_class WHERE reltoastrelid = 16439148; I believe it will have an oid of 16439145, or it might not exist. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Sun, Sep 16, 2012 at 12:38:37PM +0800, Rural Hunter wrote: >> I ran the pg_upgrade with the patch and found the problematic object >> is a toast object. > OK, this is exactly what I wanted to see, and it explains why pg_dump > didn't show it. Can you find out what table references this toast > table? Try this query on the old cluster: > select oid, * from pg_class WHERE reltoastrelid = 16439148; > I believe it will have an oid of 16439145, or it might not exist. Most likely what's happened is that the table has a toast table that it doesn't need, as a result of having dropped the only wide column(s) in it. So when the table is recreated in the new cluster, there's no toast table for it. So what you need to do is get rid of that check, or relax it so that it doesn't insist on toast tables matching up exactly. It seems possible that there could be discrepancies in the other direction too, ie, new cluster created a toast table when old cluster didn't have one. regards, tom lane
于2012年9月17日 1:17:46,Bruce Momjian写到: > On Sun, Sep 16, 2012 at 12:38:37PM +0800, Rural Hunter wrote: >>> OK, I see many new ALTER TABLE commands, but nothing that would cause a >>> difference in relation count. >>> >>> Attached is a patch that will return the OID of the old/new mismatched >>> entries. Please research the pg_class objects on the old/new clusters >>> that have the mismatch and let me know. It might be something that >>> isn't in the old cluster, or not in the new cluster. >>> >> I ran the pg_upgrade with the patch and found the problematic object >> is a toast object. >> Copying user relation files >> /raid/pgsql/base/6087920/6088238 >> Mismatch of relation OID in database "forummon": old OID 16439148, >> new OID 16439322 >> >> In old cluster: >> # select * from pg_class WHERE oid=16439148; >> relname | relnamespace | reltype | reloftype | relowner | relam | >> relfilenode | reltablespace | relpages | reltuples | reltoastrelid | >> reltoastidxid | relhasindex | relisshared | relpersistence | relkind >> | relnatts | relchecks | relhasoids | relhaspkey | relhasrules | >> relhastriggers | relhassubclass | relfrozenxid | relacl | reloptions >> -------------------+--------------+----------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+------------+-------------+----------------+----------------+--------------+--------+------------ >> pg_toast_16439145 | 99 | 16439149 | 0 | 10 | 0 | 16439148 | 0 | 0 | >> 0 | 0 | 16439150 | t | f | p | t | 3 | 0 | f | t | f | f | f | >> 630449585 | | >> (1 row) >> >> But it doesn't exist in new cluster: >> select * from pg_class WHERE oid=16439148; >> relname | relnamespace | reltype | reloftype | relowner | relam | >> relfilenode | reltablespace | relpages | reltuples | relallvisible | >> reltoastrelid | reltoastidxid | relhasindex | relisshared | >> relpersistence | relkind | relnatts | relchecks | relhasoids | >> relhaspkey | relhasrules | relhastriggers | relhassubclass | >> relfrozenxid | relacl | reloptions >> ---------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+------------+-------------+----------------+----------------+--------------+--------+------------ >> (0 rows) > > [ Thread moved to hackers list.] > > OK, this is exactly what I wanted to see, and it explains why pg_dump > didn't show it. Can you find out what table references this toast > table? Try this query on the old cluster: > > select oid, * from pg_class WHERE reltoastrelid = 16439148; > > I believe it will have an oid of 16439145, or it might not exist. > # select oid, * from pg_class WHERE reltoastrelid = 16439148; oid | relname | relnamespace | reltype | reloftype| relowner | relam | relfilenode | reltablespace | relpages | reltuples | reltoastrelid | reltoastidxid | relhasindex | relisshared | relpersistence | relkind | relnatts | relchecks | relhasoids | relhaspkey | relhasrules | relhastriggers | relhassubclass | relfrozenxid | relacl | reloptions ----------+--------------+--------------+----------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+------------+-------------+----------------+----------------+--------------+-----------------------------------------+------------16439145 |sql_features | 16438995 | 16439147 | 0 | 10 | 0 | 16439145 | 0 | 0 | 0 | 16439148 | 0 | f | f | p | r | 7 | 0 | f | f | f | f | f | 630449585 | {postgres=arwdDxt/postgres,=r/postgres} | (1 row) It's not a table. I haven't seen this name before. not sure why it exists. So what's the next thing I can do?
Rural Hunter <ruralhunter@gmail.com> writes: > # select oid, * from pg_class WHERE reltoastrelid = 16439148; > oid | relname | relnamespace | reltype | reloftype | > relowner | relam | relfilenode | reltablespace | relpages | reltuples | > reltoastrelid | reltoastidxid | relhasindex | relisshared | > relpersistence | relkind | relnatts | relchecks | relhasoids | > relhaspkey | relhasrules | relhastriggers | relhassubclass | > relfrozenxid | relacl | reloptions > ----------+--------------+--------------+----------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+------------+-------------+----------------+----------------+--------------+-----------------------------------------+------------ > 16439145 | sql_features | 16438995 | 16439147 | 0 | > 10 | 0 | 16439145 | 0 | 0 | 0 | > 16439148 | 0 | f | f | p | > r | 7 | 0 | f | f | f > | f | f | 630449585 | > {postgres=arwdDxt/postgres,=r/postgres} | > (1 row) Well, that's even stranger, because (1) information_schema.sql_features ought to have a toast table in either version, and (2) neither pg_dump nor pg_upgrade ought to be attempting to dump or transfer that table. I wonder whether you dropped and recreated the information_schema in the lifetime of this database? We have recommended doing that in the past, IIRC. Could such a thing have confused pg_dump? regards, tom lane
于2012年9月17日 9:48:58,Tom Lane写到: > Rural Hunter <ruralhunter@gmail.com> writes: >> # select oid, * from pg_class WHERE reltoastrelid = 16439148; >> oid | relname | relnamespace | reltype | reloftype | >> relowner | relam | relfilenode | reltablespace | relpages | reltuples | >> reltoastrelid | reltoastidxid | relhasindex | relisshared | >> relpersistence | relkind | relnatts | relchecks | relhasoids | >> relhaspkey | relhasrules | relhastriggers | relhassubclass | >> relfrozenxid | relacl | reloptions >> ----------+--------------+--------------+----------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+------------+-------------+----------------+----------------+--------------+-----------------------------------------+------------ >> 16439145 | sql_features | 16438995 | 16439147 | 0 | >> 10 | 0 | 16439145 | 0 | 0 | 0 | >> 16439148 | 0 | f | f | p | >> r | 7 | 0 | f | f | f >> | f | f | 630449585 | >> {postgres=arwdDxt/postgres,=r/postgres} | >> (1 row) > > Well, that's even stranger, because (1) information_schema.sql_features > ought to have a toast table in either version, and (2) neither pg_dump > nor pg_upgrade ought to be attempting to dump or transfer that table. > > I wonder whether you dropped and recreated the information_schema in > the lifetime of this database? We have recommended doing that in the > past, IIRC. Could such a thing have confused pg_dump? > > regards, tom lane > No, I have never manually re-created the table. This is the first time I see the name. But I'm not sure other things I installed before recreated it or not, such as pg_buffercache etc. One more thing, is this a hidden table? I can see it with '\d information_schema.sql_features' but it's not in the list of '\d'.
Rural Hunter <ruralhunter@gmail.com> writes: > 于2012年9月17日 9:48:58,Tom Lane写到: >> I wonder whether you dropped and recreated the information_schema in >> the lifetime of this database? We have recommended doing that in the >> past, IIRC. Could such a thing have confused pg_dump? > No, I have never manually re-created the table. I think you must have, because the query output shows that sql_features, its rowtype, and the information_schema all have OIDs much larger than they would have had in a virgin installation. The large relfilenode could have been explained by a VACUUM FULL, but the other OIDs wouldn't have been changed by that. > This is the first time > I see the name. But I'm not sure other things I installed before > recreated it or not, such as pg_buffercache etc. One more thing, is > this a hidden table? I can see it with '\d > information_schema.sql_features' but it's not in the list of '\d'. That just means that information_schema is not in your search_path. regards, tom lane
On Sun, Sep 16, 2012 at 06:04:16PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Sun, Sep 16, 2012 at 12:38:37PM +0800, Rural Hunter wrote: > >> I ran the pg_upgrade with the patch and found the problematic object > >> is a toast object. > > > OK, this is exactly what I wanted to see, and it explains why pg_dump > > didn't show it. Can you find out what table references this toast > > table? Try this query on the old cluster: > > > select oid, * from pg_class WHERE reltoastrelid = 16439148; > > > I believe it will have an oid of 16439145, or it might not exist. > > Most likely what's happened is that the table has a toast table that > it doesn't need, as a result of having dropped the only wide column(s) > in it. So when the table is recreated in the new cluster, there's no > toast table for it. > > So what you need to do is get rid of that check, or relax it so that it > doesn't insist on toast tables matching up exactly. It seems possible > that there could be discrepancies in the other direction too, ie, > new cluster created a toast table when old cluster didn't have one. pg_dump.c already has this code: if (OidIsValid(pg_class_reltoastrelid)) { /* * One complexity is that the table definitionmight not require * the creation of a TOAST table, and the TOAST table might have * beencreated long after table creation, when the table was * loaded with wide data. By setting the TOAST oid weforce * creation of the TOAST heap and TOAST index by the backend so we * can cleanly copy the filesduring binary upgrade. */ appendPQExpBuffer(upgrade_buffer, "SELECT binary_upgrade.set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n", pg_class_reltoastrelid); /* every toast table has an index */ appendPQExpBuffer(upgrade_buffer, "SELECTbinary_upgrade.set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n", pg_class_reltoastidxid); } As you can see, we look at the existing TOAST usage and force the new cluster to match. As I remember we replay the DROP COLUMN in binary upgrade mode so the new cluster always matches the old cluster's TOAST usage. I certainly have never seen this bug reported before. I think the big question is why did this case fail? I can say that the query that pulls details from each cluster skips information_schema or oid < FirstNormalObjectId. I wonder if there is a mismatch between what pg_dump filters out and pg_upgrade. Can you tell us the schema of the 'sql_features' table? Also, does it appear in the pg_dump --schema-only output? I don't think it does because it wasn't reported in the pg_dump --schema-only diff I requested, and pg_dump wouldn't have dumped it from the new cluster. What that means is that 'sql_features' got a TOAST table in the old cluster but while 'sql_features' also has a TOAST table in the new cluster, it isn't processed by pg_upgrade because it is in the information schema and has an oid < FirstNormalObjectId. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Sun, Sep 16, 2012 at 09:48:58PM -0400, Tom Lane wrote: > Rural Hunter <ruralhunter@gmail.com> writes: > > # select oid, * from pg_class WHERE reltoastrelid = 16439148; > > oid | relname | relnamespace | reltype | reloftype | > > relowner | relam | relfilenode | reltablespace | relpages | reltuples | > > reltoastrelid | reltoastidxid | relhasindex | relisshared | > > relpersistence | relkind | relnatts | relchecks | relhasoids | > > relhaspkey | relhasrules | relhastriggers | relhassubclass | > > relfrozenxid | relacl | reloptions > > ----------+--------------+--------------+----------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+------------+------------+-------------+----------------+----------------+--------------+-----------------------------------------+------------ > > 16439145 | sql_features | 16438995 | 16439147 | 0 | > > 10 | 0 | 16439145 | 0 | 0 | 0 | > > 16439148 | 0 | f | f | p | > > r | 7 | 0 | f | f | f > > | f | f | 630449585 | > > {postgres=arwdDxt/postgres,=r/postgres} | > > (1 row) > > Well, that's even stranger, because (1) information_schema.sql_features > ought to have a toast table in either version, and (2) neither pg_dump > nor pg_upgrade ought to be attempting to dump or transfer that table. I bet pg_upgrade is picking it up from the old cluster because it has an oid >= FirstNormalObjectId and the table is not in the information schema. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Sun, Sep 16, 2012 at 09:48:58PM -0400, Tom Lane wrote: >> Well, that's even stranger, because (1) information_schema.sql_features >> ought to have a toast table in either version, and (2) neither pg_dump >> nor pg_upgrade ought to be attempting to dump or transfer that table. > I bet pg_upgrade is picking it up from the old cluster because it has an > oid >= FirstNormalObjectId and the table is not in the information > schema. If it *isn't* in information_schema, but is just some random table that happens to be named sql_features, then it's hard to explain why there's anything going wrong at all. My money is on the OP having done a reload of the information_schema (as per, eg, the release notes for 9.1.2), and somehow that's confusing pg_dump and/or pg_upgrade. regards, tom lane
于2012年9月17日 12:32:36,Bruce Momjian写到: > On Sun, Sep 16, 2012 at 06:04:16PM -0400, Tom Lane wrote: >> Bruce Momjian <bruce@momjian.us> writes: >>> On Sun, Sep 16, 2012 at 12:38:37PM +0800, Rural Hunter wrote: >>>> I ran the pg_upgrade with the patch and found the problematic object >>>> is a toast object. >> >>> OK, this is exactly what I wanted to see, and it explains why pg_dump >>> didn't show it. Can you find out what table references this toast >>> table? Try this query on the old cluster: >> >>> select oid, * from pg_class WHERE reltoastrelid = 16439148; >> >>> I believe it will have an oid of 16439145, or it might not exist. >> >> Most likely what's happened is that the table has a toast table that >> it doesn't need, as a result of having dropped the only wide column(s) >> in it. So when the table is recreated in the new cluster, there's no >> toast table for it. >> >> So what you need to do is get rid of that check, or relax it so that it >> doesn't insist on toast tables matching up exactly. It seems possible >> that there could be discrepancies in the other direction too, ie, >> new cluster created a toast table when old cluster didn't have one. > > pg_dump.c already has this code: > > if (OidIsValid(pg_class_reltoastrelid)) > { > /* > * One complexity is that the table definition might not require > * the creation of a TOAST table, and the TOAST table might have > * been created long after table creation, when the table was > * loaded with wide data. By setting the TOAST oid we force > * creation of the TOAST heap and TOAST index by the backend so we > * can cleanly copy the files during binary upgrade. > */ > > appendPQExpBuffer(upgrade_buffer, > "SELECT binary_upgrade.set_next_toast_pg_class_oid('%u'::pg_catalog.oid);\n", > pg_class_reltoastrelid); > > /* every toast table has an index */ > appendPQExpBuffer(upgrade_buffer, > "SELECT binary_upgrade.set_next_index_pg_class_oid('%u'::pg_catalog.oid);\n", > pg_class_reltoastidxid); > } > > As you can see, we look at the existing TOAST usage and force the new > cluster to match. As I remember we replay the DROP COLUMN in binary > upgrade mode so the new cluster always matches the old cluster's TOAST > usage. I certainly have never seen this bug reported before. > > I think the big question is why did this case fail? I can say that the > query that pulls details from each cluster skips information_schema or > oid < FirstNormalObjectId. I wonder if there is a mismatch between what > pg_dump filters out and pg_upgrade. Can you tell us the schema of the > 'sql_features' table? # select * from pg_tables where tablename='sql_features'; schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers --------------------+--------------+------------+------------+------------+----------+-------------information_schema | sql_features| postgres | | f | f | f (1 row) > > Also, does it appear in the pg_dump --schema-only output? I don't think > it does because it wasn't reported in the pg_dump --schema-only diff I > requested, and pg_dump wouldn't have dumped it from the new cluster. right. I checked the dump from the old cluster and it's not there. > > What that means is that 'sql_features' got a TOAST table in the old > cluster but while 'sql_features' also has a TOAST table in the new > cluster, it isn't processed by pg_upgrade because it is in the > information schema and has an oid < FirstNormalObjectId. >
于2012年9月17日 12:47:11,Tom Lane写到: > Bruce Momjian <bruce@momjian.us> writes: >> On Sun, Sep 16, 2012 at 09:48:58PM -0400, Tom Lane wrote: >>> Well, that's even stranger, because (1) information_schema.sql_features >>> ought to have a toast table in either version, and (2) neither pg_dump >>> nor pg_upgrade ought to be attempting to dump or transfer that table. > >> I bet pg_upgrade is picking it up from the old cluster because it has an >> oid >= FirstNormalObjectId and the table is not in the information >> schema. > > If it *isn't* in information_schema, but is just some random table that > happens to be named sql_features, then it's hard to explain why there's > anything going wrong at all. My money is on the OP having done a reload > of the information_schema (as per, eg, the release notes for 9.1.2), and > somehow that's confusing pg_dump and/or pg_upgrade. ah yes yes, now I can remember it! I have followed the release notes and re-created the whole information_schema schema. > > regards, tom lane >
On Mon, Sep 17, 2012 at 01:03:37PM +0800, Rural Hunter wrote: > >As you can see, we look at the existing TOAST usage and force the new > >cluster to match. As I remember we replay the DROP COLUMN in binary > >upgrade mode so the new cluster always matches the old cluster's TOAST > >usage. I certainly have never seen this bug reported before. > > > >I think the big question is why did this case fail? I can say that the > >query that pulls details from each cluster skips information_schema or > >oid < FirstNormalObjectId. I wonder if there is a mismatch between what > >pg_dump filters out and pg_upgrade. Can you tell us the schema of the > >'sql_features' table? > # select * from pg_tables where tablename='sql_features'; > schemaname | tablename | tableowner | tablespace | > hasindexes | hasrules | hastriggers > --------------------+--------------+------------+------------+------------+----------+------------- > information_schema | sql_features | postgres | | f > | f | f > (1 row) OK, good to know. This is the query pg_upgrade 9.2 uses to pull information from 9.1 and 9.2: SELECT c.oid, n.nspname, c.relname, c.relfilenode, c.reltablespace, t.spclocation FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespacen ON c.relnamespace = n.oid LEFT OUTER JOIN pg_catalog.pg_tablespace t ON c.reltablespace =t.oid WHERE relkind IN ('r','t', 'i', 'S') AND ((n.nspname !~ '^pg_temp_' AND n.nspname !~ '^pg_toast_temp_' AND n.nspname NOT IN ('pg_catalog', 'information_schema', 'binary_upgrade') AND c.oid >= 16384 ) OR (n.nspname = 'pg_catalog' AND relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata','pg_largeobject_metadata_oid_index') ) )ORDER BY 1; Based on the fact that sql_features exists in the information_schema schema, I don't think 'sql_features' table is actually being processed by pg_upgrade, but I think its TOAST table, because it has a high oid, is being processed because it is in the pg_toast schema. This is causing the mismatch between the old and new clusters. I am thinking this query needs to be split apart into a UNION where the second part handles TOAST tables and looks at the schema of the _owner_ of the TOAST table. Needs to be backpatched too. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, Sep 17, 2012 at 05:07:23PM -0400, Bruce Momjian wrote: > > # select * from pg_tables where tablename='sql_features'; > > schemaname | tablename | tableowner | tablespace | > > hasindexes | hasrules | hastriggers > > --------------------+--------------+------------+------------+------------+----------+------------- > > information_schema | sql_features | postgres | | f > > | f | f > > (1 row) > > OK, good to know. This is the query pg_upgrade 9.2 uses to pull > information from 9.1 and 9.2: > > SELECT c.oid, n.nspname, c.relname, c.relfilenode, c.reltablespace, t.spclocation > FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid > LEFT OUTER JOIN pg_catalog.pg_tablespace t ON c.reltablespace = t.oid > WHERE relkind IN ('r','t', 'i', 'S') AND > ((n.nspname !~ '^pg_temp_' AND > n.nspname !~ '^pg_toast_temp_' AND > n.nspname NOT IN ('pg_catalog', 'information_schema', 'binary_upgrade') AND > c.oid >= 16384 > ) > OR > (n.nspname = 'pg_catalog' AND > relname IN > ('pg_largeobject', 'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index') > ) > ) > ORDER BY 1; > > Based on the fact that sql_features exists in the information_schema > schema, I don't think 'sql_features' table is actually being processed > by pg_upgrade, but I think its TOAST table, because it has a high oid, > is being processed because it is in the pg_toast schema. This is > causing the mismatch between the old and new clusters. > > I am thinking this query needs to be split apart into a UNION where the > second part handles TOAST tables and looks at the schema of the _owner_ > of the TOAST table. Needs to be backpatched too. OK, I am at a conference now so will not be able to write-up a patch until perhaps next week. You can drop the information schema in the old database and pg_upgrade should run fine. I will test your failure once I create a patch. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Tue, Sep 18, 2012 at 07:22:39PM -0400, Bruce Momjian wrote: > > Based on the fact that sql_features exists in the information_schema > > schema, I don't think 'sql_features' table is actually being processed > > by pg_upgrade, but I think its TOAST table, because it has a high oid, > > is being processed because it is in the pg_toast schema. This is > > causing the mismatch between the old and new clusters. > > > > I am thinking this query needs to be split apart into a UNION where the > > second part handles TOAST tables and looks at the schema of the _owner_ > > of the TOAST table. Needs to be backpatched too. > > OK, I am at a conference now so will not be able to write-up a patch > until perhaps next week. You can drop the information schema in the old > database and pg_upgrade should run fine. I will test your failure once > I create a patch. One good thing is that pg_upgrade detected there was a bug in the code and threw an error, rather than producing an inaccurate dump. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
于 2012/9/19 7:22, Bruce Momjian 写道: > On Mon, Sep 17, 2012 at 05:07:23PM -0400, Bruce Momjian wrote: >>> # select * from pg_tables where tablename='sql_features'; >>> schemaname | tablename | tableowner | tablespace | >>> hasindexes | hasrules | hastriggers >>> --------------------+--------------+------------+------------+------------+----------+------------- >>> information_schema | sql_features | postgres | | f >>> | f | f >>> (1 row) >> OK, good to know. This is the query pg_upgrade 9.2 uses to pull >> information from 9.1 and 9.2: >> >> SELECT c.oid, n.nspname, c.relname, c.relfilenode, c.reltablespace, t.spclocation >> FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid >> LEFT OUTER JOIN pg_catalog.pg_tablespace t ON c.reltablespace = t.oid >> WHERE relkind IN ('r','t', 'i', 'S') AND >> ((n.nspname !~ '^pg_temp_' AND >> n.nspname !~ '^pg_toast_temp_' AND >> n.nspname NOT IN ('pg_catalog', 'information_schema', 'binary_upgrade') AND >> c.oid >= 16384 >> ) >> OR >> (n.nspname = 'pg_catalog' AND >> relname IN >> ('pg_largeobject', 'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index') >> ) >> ) >> ORDER BY 1; >> >> Based on the fact that sql_features exists in the information_schema >> schema, I don't think 'sql_features' table is actually being processed >> by pg_upgrade, but I think its TOAST table, because it has a high oid, >> is being processed because it is in the pg_toast schema. This is >> causing the mismatch between the old and new clusters. >> >> I am thinking this query needs to be split apart into a UNION where the >> second part handles TOAST tables and looks at the schema of the _owner_ >> of the TOAST table. Needs to be backpatched too. > OK, I am at a conference now so will not be able to write-up a patch > until perhaps next week. You can drop the information schema in the old > database and pg_upgrade should run fine. I will test your failure once > I create a patch. > OK. I will try. I also found some problems on initdb when re-init my pg9.2 db. 1. initdb doesn't create the pg_log dir so pg can not be started after initdb before I create the dir manually. 2. The case issue of db charset name. I installed pg9.1 and pg9.2 with zh_CN.UTF8. But somehow it seems the actual chaset name is stored with lowercase 'zh_CN.utf8' during the install. In this case, I can run the pg_upgrade without problem since they are both lowercase. But when I re-init pg9.2 with option '-E zh_CN.UTF8', pg_upgrade will fail and report that encoding/charset mis-match: one is uppercase and another is lowercase. If I run initdb with '-E zh_CN.utf8', it will tell me there is no such charset in the system. I found a workaround to run initdb with '--lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8'. But the case problem is really confusing.
On Fri, Sep 21, 2012 at 05:16:46PM +0800, Rural Hunter wrote: > >>I am thinking this query needs to be split apart into a UNION where the > >>second part handles TOAST tables and looks at the schema of the _owner_ > >>of the TOAST table. Needs to be backpatched too. > >OK, I am at a conference now so will not be able to write-up a patch > >until perhaps next week. You can drop the information schema in the old > >database and pg_upgrade should run fine. I will test your failure once > >I create a patch. > > > OK. I will try. I also found some problems on initdb when re-init my > pg9.2 db. > 1. initdb doesn't create the pg_log dir so pg can not be started > after initdb before I create the dir manually. > 2. The case issue of db charset name. I installed pg9.1 and pg9.2 > with zh_CN.UTF8. But somehow it seems the actual chaset name is > stored with lowercase 'zh_CN.utf8' during the install. In this case, > I can run the pg_upgrade without problem since they are both > lowercase. But when I re-init pg9.2 with option '-E zh_CN.UTF8', > pg_upgrade will fail and report that encoding/charset mis-match: one > is uppercase and another is lowercase. If I run initdb with '-E > zh_CN.utf8', it will tell me there is no such charset in the system. > I found a workaround to run initdb with '--lc-collate=zh_CN.utf8 > --lc-ctype=zh_CN.utf8 --lc-messages=zh_CN.utf8 > --lc-monetary=zh_CN.utf8 --lc-numeric=zh_CN.utf8 > --lc-time=zh_CN.utf8'. But the case problem is really confusing. Yes, it sounds very confusing. I wonder if pg_upgrade should do a case-insentive comprison of encodings? Comments? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Fri, 2012-09-21 at 17:16 +0800, Rural Hunter wrote: > If I run initdb with '-E zh_CN.utf8', it will tell me there > is no such charset in the system. Because that is the name of a locale, not an encoding. > I found a workaround to run initdb > with '--lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 > --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 > --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8'. But the case problem is > really confusing. Try initdb --locale='zn_CN.utf8'.
于2012年9月23日 20:33:48,Peter Eisentraut写到: > On Fri, 2012-09-21 at 17:16 +0800, Rural Hunter wrote: >> If I run initdb with '-E zh_CN.utf8', it will tell me there >> is no such charset in the system. > > Because that is the name of a locale, not an encoding. > >> I found a workaround to run initdb >> with '--lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 >> --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 >> --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8'. But the case problem is >> really confusing. > > Try initdb --locale='zn_CN.utf8'. > > Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' works. --locale='zh_CN.UTF8' also works. But still the question is, should the encoding name be case sensitive?
On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote: > Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' > works. --locale='zh_CN.UTF8' also works. But still the question is, > should the encoding name be case sensitive? PostgreSQL treats encoding names as case insensitive. But it depends on the operating system whether locale names are case sensitive.
On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote: > On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote: > > Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' > > works. --locale='zh_CN.UTF8' also works. But still the question is, > > should the encoding name be case sensitive? > > PostgreSQL treats encoding names as case insensitive. > > But it depends on the operating system whether locale names are case > sensitive. I can confirm that pg_upgrade does case-insensitive comparisons of encoding/locale names: static voidcheck_locale_and_encoding(ControlData *oldctrl, ControlData *newctrl){ /* These areoften defined with inconsistent case, so use pg_strcasecmp(). */ if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate)!= 0) pg_log(PG_FATAL, "old and new cluster lc_collate values do not match\n"); if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0) pg_log(PG_FATAL, "old andnew cluster lc_ctype values do not match\n"); if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0) pg_log(PG_FATAL, "old and new cluster encoding values do not match\n");} -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 9/24/12 8:55 AM, Bruce Momjian wrote: > I can confirm that pg_upgrade does case-insensitive comparisons of > encoding/locale names: > > static void > check_locale_and_encoding(ControlData *oldctrl, > ControlData *newctrl) > { > /* These are often defined with inconsistent case, so use pg_strcasecmp(). */ > if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0) > pg_log(PG_FATAL, > "old and new cluster lc_collate values do not match\n"); > if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0) > pg_log(PG_FATAL, > "old and new cluster lc_ctype values do not match\n"); I seem to recall that at some point in the distant past, somehow some Linux distributions changed the canonical spelling of locale names from xx_YY.UTF-8 to xx_YY.utf8. So if people are upgrading old PostgreSQL instances that use the old spelling, pg_upgrade will probably fail. A fix might be to take the locale name you find in pg_control and run it through setlocale() to get the new canonical name.
On Mon, Sep 24, 2012 at 09:06:04AM -0400, Peter Eisentraut wrote: > On 9/24/12 8:55 AM, Bruce Momjian wrote: > > I can confirm that pg_upgrade does case-insensitive comparisons of > > encoding/locale names: > > > > static void > > check_locale_and_encoding(ControlData *oldctrl, > > ControlData *newctrl) > > { > > /* These are often defined with inconsistent case, so use pg_strcasecmp(). */ > > if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0) > > pg_log(PG_FATAL, > > "old and new cluster lc_collate values do not match\n"); > > if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0) > > pg_log(PG_FATAL, > > "old and new cluster lc_ctype values do not match\n"); > > I seem to recall that at some point in the distant past, somehow some > Linux distributions changed the canonical spelling of locale names from > xx_YY.UTF-8 to xx_YY.utf8. So if people are upgrading old PostgreSQL > instances that use the old spelling, pg_upgrade will probably fail. A > fix might be to take the locale name you find in pg_control and run it > through setlocale() to get the new canonical name. Or we could just remove dashes from the name before comparisons. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
于 2012/9/24 20:55, Bruce Momjian 写道: > On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote: >> On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote: >>> Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' >>> works. --locale='zh_CN.UTF8' also works. But still the question is, >>> should the encoding name be case sensitive? >> PostgreSQL treats encoding names as case insensitive. >> >> But it depends on the operating system whether locale names are case >> sensitive. > I can confirm that pg_upgrade does case-insensitive comparisons of > encoding/locale names: > > static void > check_locale_and_encoding(ControlData *oldctrl, > ControlData *newctrl) > { > /* These are often defined with inconsistent case, so use pg_strcasecmp(). */ > if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0) > pg_log(PG_FATAL, > "old and new cluster lc_collate values do not match\n"); > if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0) > pg_log(PG_FATAL, > "old and new cluster lc_ctype values do not match\n"); > if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0) > pg_log(PG_FATAL, > "old and new cluster encoding values do not match\n"); > } > strange. not sure what happened. I reviewed the log and here is what I did: 1. initdb without encoding/locale parameter: $ initdb The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "zh_CN.UTF-8". The default database encoding has accordingly been set to "UTF8". initdb: could not find suitable text search configuration for locale "zh_CN.UTF-8" The default text search configuration will be set to "simple". 2. Run pg_upgrade: $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c Performing Consistency Checks ----------------------------- Checking current, bin, and data directories ok Checking cluster versions ok Checking database user is a superuser ok Checking for prepared transactions ok Checking for reg* system OID user data types ok Checking for contrib/isn with bigint-passing mismatch ok old and new cluster lc_collate values do not match Failure, exiting 3. initdb with --lc-collate: $ initdb --lc-collate=zh_CN.utf8 The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locales COLLATE: zh_CN.utf8 CTYPE: zh_CN.UTF-8 MESSAGES: zh_CN.UTF-8 MONETARY: zh_CN.UTF-8 NUMERIC: zh_CN.UTF-8 TIME: zh_CN.UTF-8 The default database encoding has accordingly been set to "UTF8". initdb: could not find suitable text search configuration for locale "zh_CN.UTF-8" The default text search configuration will be set to "simple". 4. try pg_upgrade again: $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c Performing Consistency Checks ----------------------------- Checking current, bin, and data directories ok Checking cluster versions ok Checking database user is a superuser ok Checking for prepared transactions ok Checking for reg* system OID user data types ok Checking for contrib/isn with bigint-passing mismatch ok old and new cluster lc_ctype values do not match Failure, exiting 5. Run initdb with all those locale settings: $ initdb --lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8 The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "zh_CN.utf8". The default database encoding has accordingly been set to "UTF8". initdb: could not find suitable text search configuration for locale "zh_CN.utf8" The default text search configuration will be set to "simple". 6. Run pg_upgrade. this time it worked.
Bruce Momjian <bruce@momjian.us> writes: >>> I can confirm that pg_upgrade does case-insensitive comparisons of >>> encoding/locale names: > Or we could just remove dashes from the name before comparisons. That would merely move the breakage somewhere else. I think you are already assuming far too much about the OS' interpretation of locale names by assuming they are case-insensitive. Assuming that dashes aren't significant seems 100% wrong. FWIW, what I found out last time I touched this code is that on many systems setlocale doesn't bother to return a canonicalized spelling; it just gives back the string you gave it. It might be worth doing what Peter suggests, just to be consistent with what we are doing elsewhere, but I'm not sure how much it will help. regards, tom lane
On Mon, Sep 24, 2012 at 09:59:02PM +0800, Rural Hunter wrote: > 于 2012/9/24 20:55, Bruce Momjian 写道: > >On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote: > >>On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote: > >>>Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' > >>>works. --locale='zh_CN.UTF8' also works. But still the question is, > >>>should the encoding name be case sensitive? > >>PostgreSQL treats encoding names as case insensitive. > >> > >>But it depends on the operating system whether locale names are case > >>sensitive. > >I can confirm that pg_upgrade does case-insensitive comparisons of > >encoding/locale names: > > > > static void > > check_locale_and_encoding(ControlData *oldctrl, > > ControlData *newctrl) > > { > > /* These are often defined with inconsistent case, so use pg_strcasecmp(). */ > > if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0) > > pg_log(PG_FATAL, > > "old and new cluster lc_collate values do not match\n"); > > if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0) > > pg_log(PG_FATAL, > > "old and new cluster lc_ctype values do not match\n"); > > if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0) > > pg_log(PG_FATAL, > > "old and new cluster encoding values do not match\n"); > > } > > > strange. not sure what happened. I reviewed the log and here is what I did: > 1. initdb without encoding/locale parameter: > $ initdb > The files belonging to this database system will be owned by user > "postgres". > This user must also own the server process. > > The database cluster will be initialized with locale "zh_CN.UTF-8". > The default database encoding has accordingly been set to "UTF8". > initdb: could not find suitable text search configuration for locale > "zh_CN.UTF-8" > The default text search configuration will be set to "simple". > > 2. Run pg_upgrade: > $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B > /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c > Performing Consistency Checks > ----------------------------- > Checking current, bin, and data directories ok > Checking cluster versions ok > Checking database user is a superuser ok > Checking for prepared transactions ok > Checking for reg* system OID user data types ok > Checking for contrib/isn with bigint-passing mismatch ok > > old and new cluster lc_collate values do not match > Failure, exiting > > 3. initdb with --lc-collate: > $ initdb --lc-collate=zh_CN.utf8 > The files belonging to this database system will be owned by user > "postgres". > This user must also own the server process. > > The database cluster will be initialized with locales > COLLATE: zh_CN.utf8 > CTYPE: zh_CN.UTF-8 > MESSAGES: zh_CN.UTF-8 > MONETARY: zh_CN.UTF-8 > NUMERIC: zh_CN.UTF-8 > TIME: zh_CN.UTF-8 > The default database encoding has accordingly been set to "UTF8". > initdb: could not find suitable text search configuration for locale > "zh_CN.UTF-8" > The default text search configuration will be set to "simple". > > 4. try pg_upgrade again: > $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B > /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c > Performing Consistency Checks > ----------------------------- > Checking current, bin, and data directories ok > Checking cluster versions ok > Checking database user is a superuser ok > Checking for prepared transactions ok > Checking for reg* system OID user data types ok > Checking for contrib/isn with bigint-passing mismatch ok > > old and new cluster lc_ctype values do not match > Failure, exiting > > 5. Run initdb with all those locale settings: > $ initdb --lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 > --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 > --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8 > The files belonging to this database system will be owned by user > "postgres". > This user must also own the server process. > > The database cluster will be initialized with locale "zh_CN.utf8". > The default database encoding has accordingly been set to "UTF8". > initdb: could not find suitable text search configuration for locale > "zh_CN.utf8" > The default text search configuration will be set to "simple". > > 6. Run pg_upgrade. this time it worked. OK, that is good information. pg_upgrade gets the locale and encoding from the template0 database settings: "SELECT datcollate, datctype ""FROM pg_catalog.pg_database ""WHERE datname = 'template0' "); If your operating system locale/encoding names changed after the initdb of the old cluster, this would not be reflected in template0. I think Peter is right that this might be as dash issue, utf8 vs utf-8. Look at the initdb output: > 3. initdb with --lc-collate: > $ initdb --lc-collate=zh_CN.utf8 > The files belonging to this database system will be owned by user > "postgres". > This user must also own the server process. > > The database cluster will be initialized with locales > COLLATE: zh_CN.utf8 > CTYPE: zh_CN.UTF-8 > MESSAGES: zh_CN.UTF-8 > MONETARY: zh_CN.UTF-8 > NUMERIC: zh_CN.UTF-8 > TIME: zh_CN.UTF-8 Notice colldate does not have dash, while ctype does. Peter, ideas on a solution? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, Sep 24, 2012 at 10:13:45AM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > >>> I can confirm that pg_upgrade does case-insensitive comparisons of > >>> encoding/locale names: > > > Or we could just remove dashes from the name before comparisons. > > That would merely move the breakage somewhere else. I think you are > already assuming far too much about the OS' interpretation of locale > names by assuming they are case-insensitive. Assuming that dashes > aren't significant seems 100% wrong. > > FWIW, what I found out last time I touched this code is that on many > systems setlocale doesn't bother to return a canonicalized spelling; > it just gives back the string you gave it. It might be worth doing > what Peter suggests, just to be consistent with what we are doing > elsewhere, but I'm not sure how much it will help. This comment in initdb.c doesn't sound hopeful: * If successful, and canonname isn't NULL, a malloc'd copy of the locale's* canonical name is stored there. This is especiallyuseful for figuring out* what locale name "" means (ie, the environment value). (Actually,* it seems that on mostimplementations that's the only thing it's good for;* we could wish that setlocale gave back a canonically spelled versionof* the locale name, but typically it doesn't.) -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Sep 24, 2012 at 10:13:45AM -0400, Tom Lane wrote: >> FWIW, what I found out last time I touched this code is that on many >> systems setlocale doesn't bother to return a canonicalized spelling; >> it just gives back the string you gave it. It might be worth doing >> what Peter suggests, just to be consistent with what we are doing >> elsewhere, but I'm not sure how much it will help. > This comment in initdb.c doesn't sound hopeful: > * If successful, and canonname isn't NULL, a malloc'd copy of the locale's > * canonical name is stored there. This is especially useful for figuring out > * what locale name "" means (ie, the environment value). (Actually, > * it seems that on most implementations that's the only thing it's good for; > * we could wish that setlocale gave back a canonically spelled version of > * the locale name, but typically it doesn't.) Yeah, I wrote that. We can hope that the OP is running on a platform where setlocale does canonicalize the name, in which case doing the same thing in pg_upgrade that initdb does would fix his problem. But I'm not going to predict success. regards, tom lane
于 2012/9/24 22:26, Bruce Momjian 写道: > On Mon, Sep 24, 2012 at 09:59:02PM +0800, Rural Hunter wrote: >> 于 2012/9/24 20:55, Bruce Momjian 写道: >>> On Sun, Sep 23, 2012 at 06:46:33PM -0400, Peter Eisentraut wrote: >>>> On Sun, 2012-09-23 at 22:20 +0800, Rural Hunter wrote: >>>>> Ah yes, seems I used a wrong parameter. The --locale='zh_CN.utf8' >>>>> works. --locale='zh_CN.UTF8' also works. But still the question is, >>>>> should the encoding name be case sensitive? >>>> PostgreSQL treats encoding names as case insensitive. >>>> >>>> But it depends on the operating system whether locale names are case >>>> sensitive. >>> I can confirm that pg_upgrade does case-insensitive comparisons of >>> encoding/locale names: >>> >>> static void >>> check_locale_and_encoding(ControlData *oldctrl, >>> ControlData *newctrl) >>> { >>> /* These are often defined with inconsistent case, so use pg_strcasecmp(). */ >>> if (pg_strcasecmp(oldctrl->lc_collate, newctrl->lc_collate) != 0) >>> pg_log(PG_FATAL, >>> "old and new cluster lc_collate values do not match\n"); >>> if (pg_strcasecmp(oldctrl->lc_ctype, newctrl->lc_ctype) != 0) >>> pg_log(PG_FATAL, >>> "old and new cluster lc_ctype values do not match\n"); >>> if (pg_strcasecmp(oldctrl->encoding, newctrl->encoding) != 0) >>> pg_log(PG_FATAL, >>> "old and new cluster encoding values do not match\n"); >>> } >>> >> strange. not sure what happened. I reviewed the log and here is what I did: >> 1. initdb without encoding/locale parameter: >> $ initdb >> The files belonging to this database system will be owned by user >> "postgres". >> This user must also own the server process. >> >> The database cluster will be initialized with locale "zh_CN.UTF-8". >> The default database encoding has accordingly been set to "UTF8". >> initdb: could not find suitable text search configuration for locale >> "zh_CN.UTF-8" >> The default text search configuration will be set to "simple". >> >> 2. Run pg_upgrade: >> $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B >> /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c >> Performing Consistency Checks >> ----------------------------- >> Checking current, bin, and data directories ok >> Checking cluster versions ok >> Checking database user is a superuser ok >> Checking for prepared transactions ok >> Checking for reg* system OID user data types ok >> Checking for contrib/isn with bigint-passing mismatch ok >> >> old and new cluster lc_collate values do not match >> Failure, exiting >> >> 3. initdb with --lc-collate: >> $ initdb --lc-collate=zh_CN.utf8 >> The files belonging to this database system will be owned by user >> "postgres". >> This user must also own the server process. >> >> The database cluster will be initialized with locales >> COLLATE: zh_CN.utf8 >> CTYPE: zh_CN.UTF-8 >> MESSAGES: zh_CN.UTF-8 >> MONETARY: zh_CN.UTF-8 >> NUMERIC: zh_CN.UTF-8 >> TIME: zh_CN.UTF-8 >> The default database encoding has accordingly been set to "UTF8". >> initdb: could not find suitable text search configuration for locale >> "zh_CN.UTF-8" >> The default text search configuration will be set to "simple". >> >> 4. try pg_upgrade again: >> $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B >> /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c >> Performing Consistency Checks >> ----------------------------- >> Checking current, bin, and data directories ok >> Checking cluster versions ok >> Checking database user is a superuser ok >> Checking for prepared transactions ok >> Checking for reg* system OID user data types ok >> Checking for contrib/isn with bigint-passing mismatch ok >> >> old and new cluster lc_ctype values do not match >> Failure, exiting >> >> 5. Run initdb with all those locale settings: >> $ initdb --lc-collate=zh_CN.utf8 --lc-ctype=zh_CN.utf8 >> --lc-messages=zh_CN.utf8 --lc-monetary=zh_CN.utf8 >> --lc-numeric=zh_CN.utf8 --lc-time=zh_CN.utf8 >> The files belonging to this database system will be owned by user >> "postgres". >> This user must also own the server process. >> >> The database cluster will be initialized with locale "zh_CN.utf8". >> The default database encoding has accordingly been set to "UTF8". >> initdb: could not find suitable text search configuration for locale >> "zh_CN.utf8" >> The default text search configuration will be set to "simple". >> >> 6. Run pg_upgrade. this time it worked. > OK, that is good information. pg_upgrade gets the locale and encoding > from the template0 database settings: > > "SELECT datcollate, datctype " > "FROM pg_catalog.pg_database " > "WHERE datname = 'template0' "); > > If your operating system locale/encoding names changed after the initdb > of the old cluster, this would not be reflected in template0. No. It's not changed. look at my system settings: LANG=zh_CN.UTF-8 $ cat /var/lib/locales/supported.d/local zh_CN.UTF-8 UTF-8 I think the problem is on the options when I installed pgsql(both 9.1 and 9.2) Select the locale to be used by the new database cluster. Locale [1] [Default locale] [2] C [3] POSIX [4] zh_CN.utf8 [5] zh_HK.utf8 [6] zh_SG.utf8 [7] zh_TW.utf8 Please choose an option [1] : 4 I chose 4 instead of 1. I guess the default locale(option 1) is with dash. > I think > Peter is right that this might be as dash issue, utf8 vs utf-8. Look at > the initdb output: > >> 3. initdb with --lc-collate: >> $ initdb --lc-collate=zh_CN.utf8 >> The files belonging to this database system will be owned by user >> "postgres". >> This user must also own the server process. >> >> The database cluster will be initialized with locales >> COLLATE: zh_CN.utf8 >> CTYPE: zh_CN.UTF-8 >> MESSAGES: zh_CN.UTF-8 >> MONETARY: zh_CN.UTF-8 >> NUMERIC: zh_CN.UTF-8 >> TIME: zh_CN.UTF-8 > Notice colldate does not have dash, while ctype does. > > Peter, ideas on a solution? >
On Mon, Sep 24, 2012 at 10:45:34PM +0800, Rural Hunter wrote: > >If your operating system locale/encoding names changed after the initdb > >of the old cluster, this would not be reflected in template0. > No. It's not changed. look at my system settings: > LANG=zh_CN.UTF-8 > $ cat /var/lib/locales/supported.d/local > zh_CN.UTF-8 UTF-8 > > I think the problem is on the options when I installed pgsql(both > 9.1 and 9.2) > Select the locale to be used by the new database cluster. > > Locale > > [1] [Default locale] > [2] C > [3] POSIX > [4] zh_CN.utf8 > [5] zh_HK.utf8 > [6] zh_SG.utf8 > [7] zh_TW.utf8 > Please choose an option [1] : 4 > I chose 4 instead of 1. I guess the default locale(option 1) is with dash. Well, if you run that query on template0 in the old and new cluster, you will see something different in the two of them. Could you have used default in one and a non-dash in the other. Did we change the way we canonicalize the locale between 9.1 and 9.2? I can send you a patch to test if the setlocale canonicalization works. Can you test it if I send it? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > Well, if you run that query on template0 in the old and new cluster, you > will see something different in the two of them. Could you have used > default in one and a non-dash in the other. Did we change the way we > canonicalize the locale between 9.1 and 9.2? IIRC, we didn't try to canonicalize locale names at all before 9.2. That initdb code you're quoting is of fairly recent vintage. regards, tom lane
On Mon, Sep 24, 2012 at 11:04:32AM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Well, if you run that query on template0 in the old and new cluster, you > > will see something different in the two of them. Could you have used > > default in one and a non-dash in the other. Did we change the way we > > canonicalize the locale between 9.1 and 9.2? > > IIRC, we didn't try to canonicalize locale names at all before 9.2. > That initdb code you're quoting is of fairly recent vintage. Ah, so that would explain the change he is seeing. I will work on a patch. I am working on the information_schema patch now. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 9/24/12 10:13 AM, Tom Lane wrote: > FWIW, what I found out last time I touched this code is that on many > systems setlocale doesn't bother to return a canonicalized spelling; > it just gives back the string you gave it. It might be worth doing > what Peter suggests, just to be consistent with what we are doing > elsewhere, but I'm not sure how much it will help. It might not have anything to do with the current problem, but if initdb canonicalizes locale names, then pg_upgrade also has to. Otherwise, whenever an operating system changes its locale canonicalization rules, pg_upgrade will fail.
On Mon, Sep 24, 2012 at 11:24:04AM -0400, Peter Eisentraut wrote: > On 9/24/12 10:13 AM, Tom Lane wrote: > > FWIW, what I found out last time I touched this code is that on many > > systems setlocale doesn't bother to return a canonicalized spelling; > > it just gives back the string you gave it. It might be worth doing > > what Peter suggests, just to be consistent with what we are doing > > elsewhere, but I'm not sure how much it will help. > > It might not have anything to do with the current problem, but if initdb > canonicalizes locale names, then pg_upgrade also has to. Otherwise, > whenever an operating system changes its locale canonicalization rules, > pg_upgrade will fail. Agreed. I will work on that soon. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, Sep 24, 2012 at 11:22:22AM -0400, Peter Eisentraut wrote: > On 9/24/12 11:04 AM, Tom Lane wrote: > > Bruce Momjian <bruce@momjian.us> writes: > >> Well, if you run that query on template0 in the old and new cluster, you > >> will see something different in the two of them. Could you have used > >> default in one and a non-dash in the other. Did we change the way we > >> canonicalize the locale between 9.1 and 9.2? > > > > IIRC, we didn't try to canonicalize locale names at all before 9.2. > > That initdb code you're quoting is of fairly recent vintage. > > initdb has changed POSIX to C with glibc at least since 8.3. The code > you're quoting is just a refactoring, AFAICT. Frankly, I assumed the values assigned in pg_database for template0 were canonical. Tom is saying that canonicalization behavior changed between 9.1 to 9.2, and the user is reporting this. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Sep 24, 2012 at 11:22:22AM -0400, Peter Eisentraut wrote: >> initdb has changed POSIX to C with glibc at least since 8.3. The code >> you're quoting is just a refactoring, AFAICT. > Frankly, I assumed the values assigned in pg_database for template0 were > canonical. Tom is saying that canonicalization behavior changed > between 9.1 to 9.2, and the user is reporting this. It was not just a refactoring: we now pass the locale names through setlocale() which we didn't before. See commit c7cea267de3ca05b29a57b9d113b95ef3793c8d8. regards, tom lane
On 9/24/12 11:04 AM, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: >> Well, if you run that query on template0 in the old and new cluster, you >> will see something different in the two of them. Could you have used >> default in one and a non-dash in the other. Did we change the way we >> canonicalize the locale between 9.1 and 9.2? > > IIRC, we didn't try to canonicalize locale names at all before 9.2. > That initdb code you're quoting is of fairly recent vintage. initdb has changed POSIX to C with glibc at least since 8.3. The code you're quoting is just a refactoring, AFAICT.
On Mon, Sep 24, 2012 at 11:04:32AM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Well, if you run that query on template0 in the old and new cluster, you > > will see something different in the two of them. Could you have used > > default in one and a non-dash in the other. Did we change the way we > > canonicalize the locale between 9.1 and 9.2? > > IIRC, we didn't try to canonicalize locale names at all before 9.2. > That initdb code you're quoting is of fairly recent vintage. OK, I have developed two patches. The first fixes the problem of toast tables having oid > FirstNormalObjectId due to recreating the information_schema as outlined in the 9.1 release notes. In fact, there are several cases this fixes, but information_schema was the one reported. The basic problem is that TOAST tables can't be restricted by schema -- you have to gather the relations, and then get the toast tables. The good news is that pg_upgrade caught its own bug and threw an error. I was able to test this patch by testing the information_schema recreation, and I checked to see the regression database had the expected info.c relation count. The second patch canonicalizes the old cluster's collation and ctype values pulled from the template0 database. I was recreate the fix my Debian Squeeze system. Can someone suggestion a way? I updated pg_database on the old 9.1 cluster to be en_US.UTF8, while the new cluster defaults to en_US.UTF-8, but pg_upgrade kept them the same after the setlocale() call and pg_upgrade threw a mismatch error. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Attachment
于 2012/9/24 22:57, Bruce Momjian 写道: > On Mon, Sep 24, 2012 at 10:45:34PM +0800, Rural Hunter wrote: >>> If your operating system locale/encoding names changed after the initdb >>> of the old cluster, this would not be reflected in template0. >> No. It's not changed. look at my system settings: >> LANG=zh_CN.UTF-8 >> $ cat /var/lib/locales/supported.d/local >> zh_CN.UTF-8 UTF-8 >> >> I think the problem is on the options when I installed pgsql(both >> 9.1 and 9.2) >> Select the locale to be used by the new database cluster. >> >> Locale >> >> [1] [Default locale] >> [2] C >> [3] POSIX >> [4] zh_CN.utf8 >> [5] zh_HK.utf8 >> [6] zh_SG.utf8 >> [7] zh_TW.utf8 >> Please choose an option [1] : 4 >> I chose 4 instead of 1. I guess the default locale(option 1) is with dash. > Well, if you run that query on template0 in the old and new cluster, you > will see something different in the two of them. Could you have used > default in one and a non-dash in the other. Yes, that's true. The upgrade is fine with both fresh installs(9.1 and 9.2) with option above(without-dash). The problem only happens when I inited the 9.2 db with default locale(I guess that one has the dash). Just wondering why pg installer provides options without dash. > Did we change the way we > canonicalize the locale between 9.1 and 9.2? > > I can send you a patch to test if the setlocale canonicalization works. > Can you test it if I send it? >
On Tue, Sep 25, 2012 at 08:41:19AM +0800, Rural Hunter wrote: > >>I think the problem is on the options when I installed pgsql(both > >>9.1 and 9.2) > >>Select the locale to be used by the new database cluster. > >> > >>Locale > >> > >>[1] [Default locale] > >>[2] C > >>[3] POSIX > >>[4] zh_CN.utf8 > >>[5] zh_HK.utf8 > >>[6] zh_SG.utf8 > >>[7] zh_TW.utf8 > >>Please choose an option [1] : 4 > >>I chose 4 instead of 1. I guess the default locale(option 1) is with dash. > >Well, if you run that query on template0 in the old and new cluster, you > >will see something different in the two of them. Could you have used > >default in one and a non-dash in the other. > Yes, that's true. The upgrade is fine with both fresh installs(9.1 > and 9.2) with option above(without-dash). The problem only happens > when I inited the 9.2 db with default locale(I guess that one has OK, that is good to know. I developed the attached C program that does the setlocale canonical test. On Debian Squeeze, I could not see any change: if I pass en_US.UTF-8, I get en_US.UTF-8 returned; if I pass en_US.UTF8, I get en_US.UTF8 returned. Can anyone test this and find a case where the local is canonicalized? Run it this way: $ canonical LC_COLLATE = 3 LC_CTYPE = 0 $ canonical 0 en_US.UTF8 en_US.UTF8 We are looking for cases where the second argument produces a non-matching locale name as output. I have also attached a patch that reports the mismatching locale or encoding names --- this should at least help with debugging and show that a dash is the problem. > the dash). Just wondering why pg installer provides options without > dash. No idea. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Attachment
于 2012/9/25 11:00, Bruce Momjian 写道: > On Tue, Sep 25, 2012 at 08:41:19AM +0800, Rural Hunter wrote: >>>> I think the problem is on the options when I installed pgsql(both >>>> 9.1 and 9.2) >>>> Select the locale to be used by the new database cluster. >>>> >>>> Locale >>>> >>>> [1] [Default locale] >>>> [2] C >>>> [3] POSIX >>>> [4] zh_CN.utf8 >>>> [5] zh_HK.utf8 >>>> [6] zh_SG.utf8 >>>> [7] zh_TW.utf8 >>>> Please choose an option [1] : 4 >>>> I chose 4 instead of 1. I guess the default locale(option 1) is with dash. >>> Well, if you run that query on template0 in the old and new cluster, you >>> will see something different in the two of them. Could you have used >>> default in one and a non-dash in the other. >> Yes, that's true. The upgrade is fine with both fresh installs(9.1 >> and 9.2) with option above(without-dash). The problem only happens >> when I inited the 9.2 db with default locale(I guess that one has > OK, that is good to know. I developed the attached C program that does > the setlocale canonical test. On Debian Squeeze, I could not see any > change: if I pass en_US.UTF-8, I get en_US.UTF-8 returned; if I pass > en_US.UTF8, I get en_US.UTF8 returned. Can anyone test this and find a > case where the local is canonicalized? Run it this way: > > $ canonical > LC_COLLATE = 3 > LC_CTYPE = 0 > $ canonical 0 en_US.UTF8 > en_US.UTF8 > > We are looking for cases where the second argument produces a > non-matching locale name as output. It matches on my system(ubuntu 10.10 server): $ ./canonical LC_COLLATE = 3 LC_CTYPE = 0 $ ./canonical 0 zh_CN.UTF-8 zh_CN.UTF-8 $ ./canonical 0 zh_CN.UTF8 zh_CN.UTF8 $ ./canonical 0 zh_CN.utf8 zh_CN.utf8 $ ./canonical 0 zh_CN.utf-8 zh_CN.utf-8 I tested the checker with the patch: $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c Performing Consistency Checks ----------------------------- Checking current, bin, and data directories ok Checking cluster versions ok Checking database user is a superuser ok Checking for prepared transactions ok Checking for reg* system OID user data types ok Checking for contrib/isn with bigint-passing mismatch ok lc_collate cluster values do not match: old "zh_CN.utf8", new "zh_CN.UTF-8" Failure, exiting zh_CN.utf8 is provided by the installer and zh_CN.UTF-8 is my system default. > > I have also attached a patch that reports the mismatching locale or > encoding names --- this should at least help with debugging and show > that a dash is the problem. > >> the dash). Just wondering why pg installer provides options without >> dash. > No idea. >
On Tue, Sep 25, 2012 at 12:22:43PM +0800, Rural Hunter wrote: > >OK, that is good to know. I developed the attached C program that does > >the setlocale canonical test. On Debian Squeeze, I could not see any > >change: if I pass en_US.UTF-8, I get en_US.UTF-8 returned; if I pass > >en_US.UTF8, I get en_US.UTF8 returned. Can anyone test this and find a > >case where the local is canonicalized? Run it this way: > > > > $ canonical > > LC_COLLATE = 3 > > LC_CTYPE = 0 > > $ canonical 0 en_US.UTF8 > > en_US.UTF8 > > > >We are looking for cases where the second argument produces a > >non-matching locale name as output. > It matches on my system(ubuntu 10.10 server): > $ ./canonical > LC_COLLATE = 3 > LC_CTYPE = 0 > $ ./canonical 0 zh_CN.UTF-8 > zh_CN.UTF-8 > $ ./canonical 0 zh_CN.UTF8 > zh_CN.UTF8 > $ ./canonical 0 zh_CN.utf8 > zh_CN.utf8 > $ ./canonical 0 zh_CN.utf-8 > zh_CN.utf-8 > > I tested the checker with the patch: > $ /opt/PostgreSQL/9.2/bin/pg_upgrade -b /opt/PostgreSQL/9.1/bin -B > /opt/PostgreSQL/9.2/bin -d /raid/pgsql -D /raid/pg92data -c > Performing Consistency Checks > ----------------------------- > Checking current, bin, and data directories ok > Checking cluster versions ok > Checking database user is a superuser ok > Checking for prepared transactions ok > Checking for reg* system OID user data types ok > Checking for contrib/isn with bigint-passing mismatch ok > > lc_collate cluster values do not match: old "zh_CN.utf8", new "zh_CN.UTF-8" > Failure, exiting > > zh_CN.utf8 is provided by the installer and zh_CN.UTF-8 is my system > default. OK, this tells us that the canonicalization code used in initdb is not going to help us in pg_upgrade, at least not on your system, and not on mine. I think we should apply the patch that fixes the TOAST problem with information_schema, and the patch that outputs the old/new values for easier debugging. Other than that, I don't know what else we can do except to ignore dashes when comparing locale names, which I am told is unacceptable. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Tue, Sep 25, 2012 at 09:10:33AM -0400, Bruce Momjian wrote: > > lc_collate cluster values do not match: old "zh_CN.utf8", new "zh_CN.UTF-8" > > Failure, exiting > > > > zh_CN.utf8 is provided by the installer and zh_CN.UTF-8 is my system > > default. > > OK, this tells us that the canonicalization code used in initdb is not > going to help us in pg_upgrade, at least not on your system, and not on > mine. > > I think we should apply the patch that fixes the TOAST problem with > information_schema, and the patch that outputs the old/new values for > easier debugging. Other than that, I don't know what else we can do > except to ignore dashes when comparing locale names, which I am told is > unacceptable. Based on this great bug report and submitter leg-work, I have applied three patches to pg_upgrade in head and 9.2, all attached: * try to get the canonical locale names, and report old/new values on mismatch * update query to skip toast tables for system objects * improve error reporting when the object counts don't match None of these bugs caused pg_upgrade to produce an incorrect upgraded cluster, so I am not going to panic and try to force them into 9.1, which probably isn't being used by many people anymore anyway. I think this closes this report. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +