Thread: Possible savepoint bug
As you can see, we have duplicates within the table (heap) of a primary key value. The index itself only references one of these tuples. Nearly all data inserted into this table is wrapped in a subtransaction, and is created a single tuple per subtransaction. About 20% of entries are duplicate, so we catch the UNIQUE VIOLATION and restore to the savepoint. I did keep a copy of the table. Compressed it is about 24MB. After trying everything below, I also gave it a run with vacuum full. It did not change the output. ssdb=# select version(); version ---------------------------------------------------------------------------PostgreSQL 8.0.3 on sparc-sun-solaris2.9, compiledby GCC gcc (GCC) 3.4.2 (1 row) ssdb=# \d feature_keyword_supply_google Table "public.feature_keyword_supply_google" Column | Type | Modifiers ----------------------------------------------+-----------------------+-----------account_instance_id | integer | not nullkeyword_id | integer | not nullfeature_keyword_supply_google_score | natural_number | not nullfeature_keyword_supply_google_last_collected | ss_timestamp_recorded | not null Indexes: "feature_keyword_supply_google_pkey" PRIMARY KEY, btree (account_instance_id, keyword_id) Foreign-key constraints: "feature_keyword_supply_google_account_instance_id_fkey" FOREIGN KEY (account_instance_id, keyword_id) REFERENCES feature_keyword(account_instance_id, keyword_id) ON UPDATE CASCADE ON DELETE CASCADE ssdb=# set enable_indexscan TO off; SET ssdb=# select ctid, xmin, cmin, xmax, account_instance_id = 11916, keyword_id = 1985374 from feature_keyword_supply_google where keyword_id = 1985374 and account_instance_id = 11916; ctid | xmin | cmin | xmax | ?column? | ?column? -----------+-----------+------+-----------+----------+----------(4277,60) | 506766160 | 3593 | 744608069 | t | t(4277,72)| 397750949 | 4828 | 506766160 | t | t (2 rows) ssdb=# set enable_indexscan TO on; SET ssdb=# select ctid, xmin, cmin, xmax, account_instance_id = 11916, keyword_id = 1985374 from feature_keyword_supply_google where keyword_id = 1985374 and account_instance_id = 11916; ctid | xmin | cmin | xmax | ?column? | ?column? -----------+-----------+------+-----------+----------+----------(4277,60) | 506766160 | 3593 | 744608069 | t | t (1 row) ssdb=# vacuum feature_keyword_supply_google; VACUUM ssdb=# set enable_indexscan = off; SET ssdb=# select ctid, xmin, cmin, xmax, account_instance_id = 11916, keyword_id = 1985374 from feature_keyword_supply_google where keyword_id = 1985374 and account_instance_id = 11916; ctid | xmin | cmin | xmax | ?column? | ?column? -----------+-----------+------+-----------+----------+----------(4277,60) | 506766160 | 3593 | 744608069 | t | t(4277,72)| 397750949 | 4828 | 506766160 | t | t (2 rows) [root@DB1 rbt]# pg_controldata --version pg_controldata (PostgreSQL) 8.0.3 [root@DB1 rbt]# pg_controldata /var/opt/pgsql/data pg_control version number: 74 Catalog version number: 200411041 Database system identifier: 4769850195962887551 Database cluster state: in production pg_control last modified: Wed Nov 09 10:10:26 2005 Current log file ID: 860 Next log file segment: 170 Latest checkpoint location: 35C/A14C8D60 Prior checkpoint location: 35C/908D6470 Latest checkpoint's REDO location: 35C/A1440E20 Latest checkpoint's UNDO location: 0/0 Latest checkpoint's TimeLineID: 1 Latest checkpoint's NextXID: 745383235 Latest checkpoint's NextOID: 30513944 Time of latest checkpoint: Wed Nov 09 09:42:53 2005 Database block size: 8192 Blocks per segment of large relation: 131072 Bytes per WAL segment: 16777216 Maximum length of identifiers: 64 Maximum number of function arguments: 32 Date/time type storage: floating-point numbers Maximum length of locale name: 128 LC_COLLATE: en_US LC_CTYPE: en_US --
Rod Taylor <pg@rbt.ca> writes: > As you can see, we have duplicates within the table (heap) of a primary > key value. The index itself only references one of these tuples. Can you put together a test case to reproduce this? It doesn't have to fail every time, as long as it fails once in awhile ... regards, tom lane
On Wed, 2005-11-09 at 14:20 -0500, Tom Lane wrote: > Rod Taylor <pg@rbt.ca> writes: > > As you can see, we have duplicates within the table (heap) of a primary > > key value. The index itself only references one of these tuples. > > Can you put together a test case to reproduce this? It doesn't have to > fail every time, as long as it fails once in awhile ... I'll see what I can put together. I see I have a similar issue on another structure which involves the same type of process.
On Wed, 2005-11-09 at 14:20 -0500, Tom Lane wrote: > Rod Taylor <pg@rbt.ca> writes: > > As you can see, we have duplicates within the table (heap) of a primary > > key value. The index itself only references one of these tuples. > > Can you put together a test case to reproduce this? It doesn't have to > fail every time, as long as it fails once in awhile ... Seems not. I've done millions of iterations of the same type of functionality that happens with these structures and haven't produced a single case. These are fairly low usage structures, so I think I've done about 3 months worth of work, which in production had 20 bad tuples. I tried playing with various delays, vacuum schedules, and number of parallel processes. Whatever is happening is from interaction not contained within the structures showing the symptoms. I'll watch it a bit closer now that I know the problem exists to see if I can find a pattern. --
Rod Taylor schrieb: > On Wed, 2005-11-09 at 14:20 -0500, Tom Lane wrote: >> Rod Taylor <pg@rbt.ca> writes: >>> As you can see, we have duplicates within the table (heap) of a primary >>> key value. The index itself only references one of these tuples. >> Can you put together a test case to reproduce this? It doesn't have to >> fail every time, as long as it fails once in awhile ... > > Seems not. I've done millions of iterations of the same type of > functionality that happens with these structures and haven't produced a > single case. These are fairly low usage structures, so I think I've done > about 3 months worth of work, which in production had 20 bad tuples. I > tried playing with various delays, vacuum schedules, and number of > parallel processes. I am seeing a similar unique index bug here... This is PostgreSQL 8.1.1 on RHEL 3, Intel Xeon (i686). We don't use SAVEPOINTs and we don't use autovacuum. It's quite unlikely that the problem is directly related to VACUUM since that is only run via cron during night hours. The symptoms are duplicate entries in a unique index. billing=> \d properties Table "billing.properties" Column | Type | Modifiers ----------+-------------------+----------- language | character(2) | not null key_name | character varying | not nullvalue | character varying | not null Indexes: "pk_properties" PRIMARY KEY, btree ("language", key_name) Check constraints: "tc_properties_key_name" CHECK (key_name::text ~ '^[a-zA-Z][a-zA-Z0-9_.]+$'::text) "tc_properties_language" CHECK ("language" = 'de'::bpchar OR "language" = 'en'::bpchar) billing=> reindex table properties; ERROR: could not create unique index DETAIL: Table contains duplicated values. billing=> select ctid,xmin,xmax,cmin,cmax,language,key_name from properties where key_name = 'enum.server_task_log.status.keys'; ctid | xmin | xmax | cmin | cmax | language | key_name ---------+--------+------+------+------+----------+---------------------------------- (31,64) | 505433 | 0 | 5 | 0 | de | enum.server_task_log.status.keys (31,57) | 505261 | 0 | 7 | 0 | de | enum.server_task_log.status.keys (31,56) | 505261 | 0 | 5 | 0 | en | enum.server_task_log.status.keys (3 rows) The state is the effect of only UPDATEs of the rows after a SELECT ... FOR UPDATE in the same transaction. It happend twice right now but I deleted the other rows... the table should still contain the data. I have disabled scheduled vacuums for now. I could send the index and table files off-list. This is the only effected table right now. It is not updated frequently but is rather static. I upgraded to 8.1.1 around Dec 21, there should have been near zero updates since then until today. Perhaps it's a problem with multi-column unique indexes? Best Regards, Michael Paesold
Michael Paesold <mpaesold@gmx.at> writes: > I am seeing a similar unique index bug here... > This is PostgreSQL 8.1.1 on RHEL 3, Intel Xeon (i686). It looks like the problem is that index entries are being inserted out of order. I find this pair that should be in the other order: Item 124 -- Length: 44 Offset: 2360 (0x0938) Flags: USED Block Id: 3 linp Index: 55 Size: 44 Has Nulls: 0 Has Varwidths:16384 0938: 00000300 37002c40 06000000 64650000 ....7.,@....de.. 0948: 1a000000 656e756d 2e626f6f 6b696e67 ....enum.booking0958: 2e747970 652e6b65 79730000 .type.keys.. Item 125 -- Length: 40 Offset: 2320 (0x0910) Flags: USED Block Id: 3 linp Index: 48 Size: 40 Has Nulls: 0 Has Varwidths:16384 0910: 00000300 30002840 06000000 64650000 ....0.(@....de.. 0920: 17000000 656e756d 2e626f6f 6b696e67 ....enum.booking0930: 2e747970 652e4c00 .type.L. and likewise here: Item 60 -- Length: 52 Offset: 5060 (0x13c4) Flags: USED Block Id: 4 linp Index: 38 Size: 52 Has Nulls: 0 Has Varwidths:16384 13c4: 00000400 26003440 06000000 64650000 ....&.4@....de.. 13d4: 24000000 656e756d 2e736572 7665725f $...enum.server_13e4: 7461736b 5f6c6f67 2e737461 7475732e task_log.status. 13f4: 6b657973 keys Item 61 -- Length: 56 Offset: 5004 (0x138c) Flags: USED Block Id: 4 linp Index: 37 Size: 56 Has Nulls: 0 Has Varwidths:16384 138c: 00000400 25003840 06000000 64650000 ....%.8@....de.. 139c: 27000000 656e756d 2e736572 7665725f '...enum.server_13ac: 7461736b 5f6c6f67 2e737461 7475732e task_log.status. 13bc: 52554e4e 494e4700 RUNNING. All four of the referenced tuples were inserted by XMIN 986, CMIN 0, which I assume was probably a COPY command. So the breakage occurred long before the update operations. Did you create the index before or after loading the initial data? If you have the original dump that was loaded, it'd be interesting to see if re-loading it reproduces the corrupt index. regards, tom lane
I wrote: > Michael Paesold <mpaesold@gmx.at> writes: >> I am seeing a similar unique index bug here... >> This is PostgreSQL 8.1.1 on RHEL 3, Intel Xeon (i686). > It looks like the problem is that index entries are being inserted out > of order. After further investigation, it seems that the original sort order of the index was not C-locale, but something else --- I can reproduce the current index ordering except for a small number of new-ish tuples if I sort the data in en_US. We go out of our way to prevent the backend's locale from changing after initdb. Did you do something to override that? Another theory is that this is a manifestation of the known problem with plperl sometimes changing the backend's locale setting. Is it possible that the index was created in a session that had previously run some plperl functions? regards, tom lane
Tom Lane wrote: > I wrote: >> Michael Paesold <mpaesold@gmx.at> writes: >>> I am seeing a similar unique index bug here... >>> This is PostgreSQL 8.1.1 on RHEL 3, Intel Xeon (i686). > >> It looks like the problem is that index entries are being inserted out >> of order. > > After further investigation, it seems that the original sort order of > the index was not C-locale, but something else --- I can reproduce the > current index ordering except for a small number of new-ish tuples if > I sort the data in en_US. > > We go out of our way to prevent the backend's locale from changing after > initdb. Did you do something to override that? No, I am sure I did not do anything to change the locale itentionally. The cluster was initialized with "initdb --no-locale"... (and this is what it still is). > Another theory is that this is a manifestation of the known problem with > plperl sometimes changing the backend's locale setting. Is it possible > that the index was created in a session that had previously run some > plperl functions? This is a theory. The whole database was loaded using pg_restore, I still have the original dump so I will have a look at the dump now. The database actually contains some plperl functions. Restoring to a file I find some perhaps interesting facts perhaps relevant: *) SET check_function_bodies = false; So at least the syntax checking function should not be called. *) Old plperl call handler: The dump from 7.4.x created the public.plperl_call_handler() function, which I only dropped after the full dump was loaded. CREATE FUNCTION plperl_call_handler() RETURNS language_handler AS '$libdir/plperl', 'plperl_call_handler' LANGUAGEc; ALTER FUNCTION public.plperl_call_handler() OWNER TO postgres; CREATE TRUSTED PROCEDURAL LANGUAGE plperl HANDLER plperl_call_handler; *) There is a single plperl function that is only used in a view. (Btw. this view is totally unrelated to the given table and should never be used in the same backend session.) From the points above, I don't think the plperl function should have been called during load. Perhaps I am mistaken and plperl did really override the locale setting. Looking at the environment set for the "postgres" unix user, which is used to run Postgres, I see that LANG is set to the default value of en_US.UTF-8. So it seems possible that setting LANG to C here, could fix the problem. This still doesn't explain why the initial sort order is wrong here. The creation order in the dump is: CREATE TABLE... (without indexes) COPY ... ALTER TABLE ONLY properties ADD CONSTRAINT pk_properties... Please tell me if you need further information. Best Regards, Michael Paesold
Michael Paesold <mpaesold@gmx.at> writes: > This is a theory. The whole database was loaded using pg_restore, I still > have the original dump so I will have a look at the dump now. The database > actually contains some plperl functions. OK, I think I have reproduced the problem. initdb in C locale, then start postmaster with LANG=en_US.UTF-8 in its environment. Then: z1=# create language plperl; CREATE LANGUAGE z1=# select 'enum.server_task_log.status.RUNNING'::varchar < 'enum.server_task_log.status.keys'::varchar;?column? ----------t -- correct result for C locale (1 row) z1=# \c z1 You are now connected to database "z1". z1=# SET check_function_bodies = false; SET z1=# create or replace function perlf() returns text as $$ z1$# return 'foo'; z1$# $$ language plperl; CREATE FUNCTION z1=# select 'enum.server_task_log.status.RUNNING'::varchar < 'enum.server_task_log.status.keys'::varchar;?column? ----------f -- WRONG result for C locale (1 row) So the mere act of defining a plperl function, even with check_function_bodies = false, is sufficient to send control through that bit of libperl code that does setlocale(LC_ALL, ""). Ugh. This is much worse than I thought. The reason I had not seen it before is that lc_collate_is_c caches its result, which means that if you do any text/varchar comparisons before first invoking libperl, you won't see any misbehavior (at least not when you started in C locale). The reconnect in the middle of the above test sequence is essential to reproduce the failure. We were talking last week about forcing the LANG/LC_ environment variables to match our desired settings within the postmaster. I think this example raises the priority of doing that by several notches :-( In the meantime, Michael, I'd suggest modifying your postmaster start script to force LANG=C, and then reindexing all indexes you have on text/varchar/char columns. That should get you out of the immediate problem and prevent it from recurring before we have a fix. (The system catalogs should be OK because they use "name" which is not locale-sensitive.) regards, tom lane
I wrote: > So the mere act of defining a plperl function, even with > check_function_bodies = false, is sufficient to send control through > that bit of libperl code that does setlocale(LC_ALL, ""). Ugh. > This is much worse than I thought. It seems one ingredient in this is that the plperl function validator fails to honor check_function_bodies, and hence is calling libperl anyway. I wonder if that explains the sudden rise in incidents in 8.1? regards, tom lane
Tom Lane said: > I wrote: >> So the mere act of defining a plperl function, even with >> check_function_bodies = false, is sufficient to send control through >> that bit of libperl code that does setlocale(LC_ALL, ""). Ugh. >> This is much worse than I thought. > > It seems one ingredient in this is that the plperl function validator > fails to honor check_function_bodies, and hence is calling libperl > anyway. I wonder if that explains the sudden rise in incidents in 8.1? > That's probably because I was unaware of its existence. It should certainly be fixed, but surely at best this would only delay seeing the ugly locale effect - as soon as you call a perl function you'll be back in the same boat regardless. cheers andrew
"Andrew Dunstan" <andrew@dunslane.net> writes: > It should certainly be fixed, but surely at best this would only delay > seeing the ugly locale effect - as soon as you call a perl function you'll > be back in the same boat regardless. Right, I was not suggesting that fixing the validator would avoid the bug. It just surprised me that libperl was getting called in the restore-a-dump scenario. BTW, I was just about to go change the validator, but if you want to do it go ahead... regards, tom lane
Tom Lane said: > "Andrew Dunstan" <andrew@dunslane.net> writes: >> It should certainly be fixed, but surely at best this would only delay >> seeing the ugly locale effect - as soon as you call a perl function >> you'll be back in the same boat regardless. > > Right, I was not suggesting that fixing the validator would avoid the > bug. It just surprised me that libperl was getting called in the > restore-a-dump scenario. > > BTW, I was just about to go change the validator, but if you want to do > it go ahead... > no, please do. cheers andrew
Tom Lane wrote: > Michael Paesold <mpaesold@gmx.at> writes: >> This is a theory. The whole database was loaded using pg_restore, I still >> have the original dump so I will have a look at the dump now. The database >> actually contains some plperl functions. > > OK, I think I have reproduced the problem. initdb in C locale, then > start postmaster with LANG=en_US.UTF-8 in its environment. Then: I had reproduced the problem here with a stripped down dump file from my backup, but your test case is much simpler, as usual. :-) > In the meantime, Michael, I'd suggest modifying your postmaster start > script to force LANG=C, and then reindexing all indexes you have on > text/varchar/char columns. That should get you out of the immediate > problem and prevent it from recurring before we have a fix. I had already reindexed all tables in a clean session and have now added "export LANG=C" to the profile of the postgres unix account. I cannot reproduce the bug after doing so. Thank you for your quick help debugging the problem. Best Regards, Michael Paesold
Tom Lane wrote: >Michael Paesold <mpaesold@gmx.at> writes: > > >>This is a theory. The whole database was loaded using pg_restore, I still >>have the original dump so I will have a look at the dump now. The database >>actually contains some plperl functions. >> >> > >OK, I think I have reproduced the problem. initdb in C locale, then >start postmaster with LANG=en_US.UTF-8 in its environment. Then: > >z1=# create language plperl; >CREATE LANGUAGE >z1=# select 'enum.server_task_log.status.RUNNING'::varchar < 'enum.server_task_log.status.keys'::varchar; > ?column? >---------- > t -- correct result for C locale >(1 row) > >z1=# \c z1 >You are now connected to database "z1". >z1=# SET check_function_bodies = false; >SET >z1=# create or replace function perlf() returns text as $$ >z1$# return 'foo'; >z1$# $$ language plperl; >CREATE FUNCTION >z1=# select 'enum.server_task_log.status.RUNNING'::varchar < 'enum.server_task_log.status.keys'::varchar; > ?column? >---------- > f -- WRONG result for C locale >(1 row) > > > Unfortunately we have not fixed this on Windows. I have confirmed the effect on 8.1.1, and I still see this effect on HEAD. We have fixed the check_function_bodies bit, but if that is on, or if I call a plperl func, I get the bad result shown above. The log message from the commit that was supposed to fix this says: Arrange to set the LC_XXX environment variables to match our locale setup. This protects against undesired changes in localebehavior if someone carelessly does setlocale(LC_ALL, "") (and we know who you are, perl guys). However, to the best of my knowledge, Windows does NOT consult the environment when set_locale is called. ISTM we probablyneed to call set_locale ourselves on Windows with the desired values. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > However, to the best of my knowledge, Windows does NOT consult the environment when set_locale is called. ISTM we probablyneed to call set_locale ourselves on Windows with the desired values. We already do, during process startup. The question becomes where the heck is Perl looking for the locale information, when on Windows? regards, tom lane
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>However, to the best of my knowledge, Windows does NOT consult the environment when set_locale is called. ISTM we probablyneed to call set_locale ourselves on Windows with the desired values. >> >> > >We already do, during process startup. The question becomes where the >heck is Perl looking for the locale information, when on Windows? > > The Windows docs at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt_setlocale.2c_._wsetlocale.asp say: |setlocale( LC_ALL, "" );| Sets the locale to the default, which is the user-default ANSI code page obtained from theoperating system. Does libperl call this only at interpreter startup? If so, maybe we should probably save out the settings and then restore them after the interpreter has started. cheers andrew ||
Andrew Dunstan wrote: > > > Tom Lane wrote: > >> Andrew Dunstan <andrew@dunslane.net> writes: >> >> >>> However, to the best of my knowledge, Windows does NOT consult the >>> environment when set_locale is called. ISTM we probably need to call >>> set_locale ourselves on Windows with the desired values. >>> >> >> >> We already do, during process startup. The question becomes where the >> heck is Perl looking for the locale information, when on Windows? >> >> > > The Windows docs at > http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt_setlocale.2c_._wsetlocale.asp > say: > > |setlocale( LC_ALL, "" );| > Sets the locale to the default, which is the user-default ANSI code > page obtained from the operating system. > > Does libperl call this only at interpreter startup? If so, maybe we > should probably save out the settings and then restore them after the > interpreter has started. > > After some analysis of perl's locale.c, I came up with the attached patch, which seems to cure the previously observed problem on my Windows box. The questions are: a) is this an acceptable solution, and b) should we do this for all the LC_* settings (on Windows at least)? Especially, should we do it for LC_CTYPE? cheers andrew Index: plperl.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/pl/plperl/plperl.c,v retrieving revision 1.98 diff -c -r1.98 plperl.c *** plperl.c 29 Dec 2005 14:28:31 -0000 1.98 --- plperl.c 8 Jan 2006 00:49:57 -0000 *************** *** 45,50 **** --- 45,51 ---- #include <ctype.h> #include <fcntl.h> #include <unistd.h> + #include <stdlib.h> /* postgreSQL stuff */ #include "commands/trigger.h" *************** *** 60,65 **** --- 61,72 ---- extern DLLIMPORT bool check_function_bodies; /* perl stuff */ + + /* prevent perl headers from hijacking stdio and lots of other stuff */ + #ifdef WIN32 + #define WIN32IO_IS_STDIO + #endif + #include "EXTERN.h" #include "perl.h" #include "XSUB.h" *************** *** 263,268 **** --- 270,297 ---- "", "-e", PERLBOOT }; + + char lang_buf[256], col_buf[256] ; + char * env_entry; + + lang_buf[0] = col_buf[0] = '\0'; + + env_entry= getenv("LANG"); + + if (env_entry && strlen(env_entry) > 0) + { + snprintf(lang_buf,256, "LANG=%s", env_entry); + unsetenv("LANG"); + } + + env_entry = getenv("LC_COLLATE"); + + if (env_entry && strlen(env_entry) > 0) + { + snprintf(col_buf,256,"LC_COLLATE=%s",env_entry); + unsetenv("LC_COLLATE"); + } + plperl_interp = perl_alloc(); if (!plperl_interp) elog(ERROR, "could not allocate Perl interpreter"); *************** *** 272,277 **** --- 301,311 ---- perl_run(plperl_interp); plperl_proc_hash = newHV(); + + if (strlen(lang_buf) > 0) + putenv(lang_buf); + if(strlen(col_buf) > 0) + putenv(col_buf); }
I wrote: > > > After some analysis of perl's locale.c, I came up with the attached > patch, which seems to cure the previously observed problem on my > Windows box. > > Further testing shows the problem persisting. Back to the drawing board. cheers andrew
I wrote: > > Further testing shows the problem persisting. Back to the drawing board. > > The attached patch against cvs tip does seem to work. Instead of playing with the environment, we simply allow perl to do its worst and then put things back the way we wanted them. Comments? cheers andrew Index: plperl.c =================================================================== RCS file: /cvsroot/pgsql/src/pl/plperl/plperl.c,v retrieving revision 1.98 diff -c -r1.98 plperl.c *** plperl.c 29 Dec 2005 14:28:31 -0000 1.98 --- plperl.c 8 Jan 2006 16:54:53 -0000 *************** *** 45,50 **** --- 45,51 ---- #include <ctype.h> #include <fcntl.h> #include <unistd.h> + #include <locale.h> /* postgreSQL stuff */ #include "commands/trigger.h" *************** *** 263,268 **** --- 264,303 ---- "", "-e", PERLBOOT }; + #ifdef WIN32 + + /* + * The perl library on startup does horrible things like call + * setlocale(LC_ALL,""). We have protected against that on most + * platforms by setting the environment appropriately. However, on + * Windows, setlocale() does not consult the environment, so we need + * to save the excisting locale settings before perl has a chance to + * mangle them and restore them after its dirty deeds are done. + * + * MSDN ref: + * http://msdn.microsoft.com/library/en-us/vclib/html/_crt_locale.asp + * + * It appaers that we only need to do this on interpreter startup, and + * subsequent calls to the interpreter don't mess with the locale + * settings. + */ + + char *loc; + char *save_collate, *save_ctype, *save_monetary, *save_numeric, *save_time; + + loc = setlocale(LC_COLLATE,NULL); + save_collate = loc ? pstrdup(loc) : NULL; + loc = setlocale(LC_CTYPE,NULL); + save_ctype = loc ? pstrdup(loc) : NULL; + loc = setlocale(LC_MONETARY,NULL); + save_monetary = loc ? pstrdup(loc) : NULL; + loc = setlocale(LC_NUMERIC,NULL); + save_numeric = loc ? pstrdup(loc) : NULL; + loc = setlocale(LC_TIME,NULL); + save_time = loc ? pstrdup(loc) : NULL; + + #endif + plperl_interp = perl_alloc(); if (!plperl_interp) elog(ERROR, "could not allocate Perl interpreter"); *************** *** 272,277 **** --- 307,328 ---- perl_run(plperl_interp); plperl_proc_hash = newHV(); + + #ifdef WIN32 + + if (save_collate != NULL) + setlocale(LC_COLLATE,save_collate); + if (save_ctype != NULL) + setlocale(LC_CTYPE,save_ctype); + if (save_monetary != NULL) + setlocale(LC_MONETARY,save_monetary); + if (save_numeric != NULL) + setlocale(LC_NUMERIC,save_numeric); + if (save_time != NULL) + setlocale(LC_TIME,save_time); + + #endif + }
Andrew Dunstan <andrew@dunslane.net> writes: > The attached patch against cvs tip does seem to work. Instead of playing > with the environment, we simply allow perl to do its worst and then put > things back the way we wanted them. Doesn't that screw up Perl, though? Its idea of what the locale is will be wrong. Maybe that's the least bad alternative available, but it doesn't seem like we're there yet. (Of course, the *big* problem with this approach is that it's Perl-specific, and won't do a thing for any other libraries that might try to do setlocale(LC_ALL, "").) regards, tom lane
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>The attached patch against cvs tip does seem to work. Instead of playing >>with the environment, we simply allow perl to do its worst and then put >>things back the way we wanted them. >> >> > >Doesn't that screw up Perl, though? Its idea of what the locale is >will be wrong. > >Maybe that's the least bad alternative available, but it doesn't seem >like we're there yet. > >(Of course, the *big* problem with this approach is that it's >Perl-specific, and won't do a thing for any other libraries that >might try to do setlocale(LC_ALL, "").) > > > > All true, which is why I tried to avoid this solution. However, I have not found another one that works. Specifically, unsetting LANG and LC_COLLATE did not work as I thought it would. If anyone can provide a better solution I will be very happy. cheers andrew
> Andrew Dunstan <andrew@dunslane.net> writes: > > The attached patch against cvs tip does seem to work. Instead of playing > > with the environment, we simply allow perl to do its worst and then put > > things back the way we wanted them. How does that affect to the API calls you can make from Perl back into the database? What if you change the locale and then issue a query from within Perl? -- greg
Greg Stark wrote: >>Andrew Dunstan <andrew@dunslane.net> writes: >> >> >>>The attached patch against cvs tip does seem to work. Instead of playing >>>with the environment, we simply allow perl to do its worst and then put >>>things back the way we wanted them. >>> >>> > >How does that affect to the API calls you can make from Perl back into the >database? What if you change the locale and then issue a query from within >Perl? > > > If you deliberately change the locale settings (especially LC_COLLATE), all bets are off, surely. REINDEX will be in your future. Calling setlocale() is in fact a forbidden operation in trusted plperl. AFAICT, perl doesn't keep any state about locale settings, it just reacts to whatever the current settings are, I think, but I could be wrong. My main concern has been that we are pushing out a point release that advertises a fix for a problem, when the fix doesn't work on Windows. Either we need to find a fix (and I tried to supply one) or we need to change what we say about the release. I'm also a bit distressed that nobody else has tested this, and we have just assumed that the fix would work, despite what we already know about how setlocale() works on Windows. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > AFAICT, perl doesn't keep any state about locale settings, it just > reacts to whatever the current settings are, I think, but I could be wrong. If that's the case, why would it be bothering to issue setlocale during startup at all? If you look in locale.c in the Perl sources, it's pretty clear that it saves away state about the settings during Perl_init_i18nl10n(). I'm too lazy to track down where that state is used or what the consequences are if it's wrong, but it sure looks to me like *something* will be broken if we just change the locale back to what we want afterward. regards, tom lane
On Mon, 2006-01-09 at 11:03 -0500, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > AFAICT, perl doesn't keep any state about locale settings, it just > > reacts to whatever the current settings are, I think, but I could be wrong. > > If that's the case, why would it be bothering to issue setlocale during > startup at all? If you look in locale.c in the Perl sources, it's > pretty clear that it saves away state about the settings during > Perl_init_i18nl10n(). I'm too lazy to track down where that state is > used or what the consequences are if it's wrong, but it sure looks to > me like *something* will be broken if we just change the locale back > to what we want afterward. > I don't know. Reading that code just makes my head spin ... I should have thought a library shouldn't make too many assumptions about locale settings anyway. What is to prevent another library from doing the same thing. Then we'd have duelling settings, dependent on who got called last. I had thought, from reading perl's locale.c, that unsetting LC_COLLATE and LANG would inhibit the calls to setlocale(LC_ALL,"") and setlocale(LC_COLLLATE,""). But my testing seemed to prove that wrong. Of course, it's possible that ActiveState's perl is not doing quite the same thing. At any rate, what I think we know from that code is that on Windows, just setting LANG and LC_* is precisely the wrong thing, since the presence of those values will trigger a call to setlocale(LC_foo,"") but the relevant environment value will not actually be used. I'm just about out of ideas and right out of time to spend on this. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > I don't know. Reading that code just makes my head spin ... Yeah, too many ifdefs :-(. But I suppose that the initial "#ifdef LOCALE_ENVIRON_REQUIRED" block is not compiled on sane platforms, meaning that the first code in the routine is the unconditional if (! setlocale(LC_ALL, ""))setlocale_failure = TRUE; > I should have thought a library shouldn't make too many assumptions > about locale settings anyway. Indeed; I think a pretty strong case can be made that this is a Perl bug. It's reasonable to be doing the setlocale call in a standalone Perl executable, but libperl should just work with whatever locale settings have been chosen by the surrounding program (ie, all these calls should be setlocale(LC_xxx, NULL) in the libperl case). > I'm just about out of ideas and right out of time to spend on this. We could just file a Perl bug report and wait for them to fix it. regards, tom lane
On Mon, 2006-01-09 at 12:06 -0500, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > I don't know. Reading that code just makes my head spin ... > > Yeah, too many ifdefs :-(. But I suppose that the initial > "#ifdef LOCALE_ENVIRON_REQUIRED" block is not compiled on sane > platforms, meaning that the first code in the routine is the > unconditional > if (! setlocale(LC_ALL, "")) > setlocale_failure = TRUE; > *doh!* I had misread that. Now I see. On Windows that pretty much gives the game away. > > > I'm just about out of ideas and right out of time to spend on this. > > We could just file a Perl bug report and wait for them to fix it. > What's the data risk? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On Mon, 2006-01-09 at 12:06 -0500, Tom Lane wrote: >> We could just file a Perl bug report and wait for them to fix it. > What's the data risk? Given that it took us this long to identify the problem, I'm guessing that it doesn't affect too many people. For starters you'd have to run the postmaster under a locale environment different from what initdb saw (or was told). regards, tom lane
Tom Lane wrote: >>I'm just about out of ideas and right out of time to spend on this. >> >> > >We could just file a Perl bug report and wait for them to fix it. > > > > done cheers andrew
Is there a TODO here, even if the Perl folks are supposed to fix it? --------------------------------------------------------------------------- Andrew Dunstan wrote: > > > Tom Lane wrote: > > >>I'm just about out of ideas and right out of time to spend on this. > >> > >> > > > >We could just file a Perl bug report and wait for them to fix it. > > > > > > > > > > done > > cheers > > andrew > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Is there a TODO here, even if the Perl folks are supposed to fix it? When and if they fix it, it'd be useful for us to document the gotcha someplace (not sure where, though). Maybe we should even go so far as to refuse to work with older libperls on Windows. regards, tom lane
It has probably been sufficiently mitigated on *nix. On Windows, the choice seems to be between living with the risk and trying my "put the locales back where they were" patch, which as Tom and Greg point out might have other consequences. Take your pick. cheers andrew Bruce Momjian wrote: >Is there a TODO here, even if the Perl folks are supposed to fix it? > >--------------------------------------------------------------------------- > >Andrew Dunstan wrote: > > >>Tom Lane wrote: >> >> >> >>>>I'm just about out of ideas and right out of time to spend on this. >>>> >>>> >>>> >>>> >>>We could just file a Perl bug report and wait for them to fix it. >>> >>> >>> >>> >>> >>> >>done >> >> >> >
I can put it in the Win32 section of the TODO list. If we have something not working on Win32, I would like to document it. Is it:plperl changes the locale in Win32? --------------------------------------------------------------------------- Andrew Dunstan wrote: > > It has probably been sufficiently mitigated on *nix. On Windows, the > choice seems to be between living with the risk and trying my "put the > locales back where they were" patch, which as Tom and Greg point out > might have other consequences. Take your pick. > > cheers > > andrew > > Bruce Momjian wrote: > > >Is there a TODO here, even if the Perl folks are supposed to fix it? > > > >--------------------------------------------------------------------------- > > > >Andrew Dunstan wrote: > > > > > >>Tom Lane wrote: > >> > >> > >> > >>>>I'm just about out of ideas and right out of time to spend on this. > >>>> > >>>> > >>>> > >>>> > >>>We could just file a Perl bug report and wait for them to fix it. > >>> > >>> > >>> > >>> > >>> > >>> > >>done > >> > >> > >> > > > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: >I can put it in the Win32 section of the TODO list. If we have >something not working on Win32, I would like to document it. > >Is it: > plperl changes the locale in Win32? > > > As long as the locale is consistent I think we're OK (is that right, Tom?) Would that mean not using any of initdb's locale settings? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > As long as the locale is consistent I think we're OK (is that right, Tom?) Right. > Would that mean not using any of initdb's locale settings? Yeah, you'd want to not use the --locale switch for initdb, and also not to change the system-wide locale settings afterwards. regards, tom lane