Thread: default_isolation_level='serializable' crashes on Windows
A customer reported that when you set default_isolation_level='serializable' in postgresql.conf on Windows, and try to start up the database, it crashes immediately. And sure enough, it does, on REL9_1_STABLE as well as on master. Stack trace: postgres!RecoveryInProgress+0x3a [c:\postgresql\src\backend\access\transam\xlog.c @ 7125] postgres!check_XactIsoLevel+0x162 [c:\postgresql\src\backend\commands\variable.c @ 617] postgres!call_string_check_hook+0x6d [c:\postgresql\src\backend\utils\misc\guc.c @ 8226] postgres!set_config_option+0x13e5 [c:\postgresql\src\backend\utils\misc\guc.c @ 5652] postgres!read_nondefault_variables+0x27f [c:\postgresql\src\backend\utils\misc\guc.c @ 7677] postgres!SubPostmasterMain+0x227 [c:\postgresql\src\backend\postmaster\postmaster.c @ 4101] postgres!main+0x1e9 [c:\postgresql\src\backend\main\main.c @ 187] postgres!__tmainCRTStartup+0x192 [f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crtexe.c @ 586] postgres!mainCRTStartup+0xe [f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crtexe.c @ 403] kernel32!BaseThreadInitThunk+0xd ntdll!RtlUserThreadStart+0x1d The problem is that when a postmaster subprocess is launched, it calls read_nondefault_variables() very early, before shmem initialization, to read the non-default config options from the file that postmaster wrote. When check_XactIsoLevel() calls RecoveryInProgress(), it crashes, because XLogCtl is NULL. I'm not sure what the cleanest fix for this would be. It seems that we could should just trust the values the postmaster passes to us and accept them without checking RecoveryInProgress(), but there's no straightforward way to tell that within check_XactIsoLevel(). Another thought is that there's really no need to pass XactIsoLevel from postmaster to a backend anyway, because it's overwritten from default_transaction_isolation as soon as you begin a transaction. There's also a call to RecoveryInProgress() in check_transaction_read_only() as well, but it seems to not have this problem. That's more by accident than by design, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > The problem is that when a postmaster subprocess is launched, it calls > read_nondefault_variables() very early, before shmem initialization, to > read the non-default config options from the file that postmaster wrote. > When check_XactIsoLevel() calls RecoveryInProgress(), it crashes, > because XLogCtl is NULL. Hm, how did the same code fail to crash in the postmaster itself, when the postmaster read the setting from postgresql.conf? A larger point is that I think it's broken for any GUC assignment function to be calling something as transient as RecoveryInProgress to start with. We probably ought to re-think the logic, not just band-aid this by having it skip the check when shmem isn't initialized yet. I'm thinking that the check has to occur somewhere outside GUC. regards, tom lane
On 12.08.2012 17:39, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >> The problem is that when a postmaster subprocess is launched, it calls >> read_nondefault_variables() very early, before shmem initialization, to >> read the non-default config options from the file that postmaster wrote. >> When check_XactIsoLevel() calls RecoveryInProgress(), it crashes, >> because XLogCtl is NULL. > > Hm, how did the same code fail to crash in the postmaster itself, when > the postmaster read the setting from postgresql.conf? It's not the check function for default_transaction_isolation that crashes, but the one for transaction_isolation. I'm not exactly sure how transaction_isolation gets set to a non-default value, though. The default for transaction_isolation is 'default', so it's understandable that the underlying XactIsoLevel variable gets set to XACT_SERIALIZABLE, but AFAICS the code to read/write the GUCs from/to file only cares about the string value of the guc, not the integer value of the underlying global variable. > A larger point is that I think it's broken for any GUC assignment > function to be calling something as transient as RecoveryInProgress to > start with. We probably ought to re-think the logic, not just band-aid > this by having it skip the check when shmem isn't initialized yet. > I'm thinking that the check has to occur somewhere outside GUC. Hmm, it seems like the logical place to complain if you do a manual "SET transaction_isolation='serializable'". But I think we should only do the check if we're not in a transaction. Setting the guc won't have any effect outside a transaction anyway, because StartTransaction will overwrite it from default_transaction_isolation as soon as you begin a transaction. While playing around, I bumped into another related bug, and after googling around I found out that it was already reported by Robert Haas earlier, but still not fixed: http://archives.postgresql.org/message-id/CA%2BTgmoa0UM2W1YkjjneEgJctzxopC3G53ocYPaCyoEOWT3aKiA%40mail.gmail.com. Kevin, the last message on that thread (http://archives.postgresql.org/pgsql-hackers/2012-04/msg01394.php) says you'll write a patch for that. Ping? Or would you like me to try that? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > On 12.08.2012 17:39, Tom Lane wrote: >> A larger point is that I think it's broken for any GUC assignment >> function to be calling something as transient as RecoveryInProgress to >> start with. > Hmm, it seems like the logical place to complain if you do a manual "SET > transaction_isolation='serializable'". But I think we should only do the > check if we're not in a transaction. You mean "if we *are* in a transaction"? Possibly that would work. The concerns I've got about running this type of test in a GUC hook are (1) what if you're not in a transaction or (2) what if you aren't in a process that has access to shared memory; the early-startup-in- EXEC_BACKEND case loses on both counts. But I think we can test our local in-transaction state without touching shared memory, and then go from there. > While playing around, I bumped into another related bug, and after > googling around I found out that it was already reported by Robert Haas > earlier, but still not fixed: > http://archives.postgresql.org/message-id/CA%2BTgmoa0UM2W1YkjjneEgJctzxopC3G53ocYPaCyoEOWT3aKiA%40mail.gmail.com That sounds like a different bug, although perhaps with the same theme of overreliance on what a GUC hook can do. regards, tom lane
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Heikki Linnakangas Sent: Monday, August 13, 2012 12:14 PM On 12.08.2012 17:39, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >>> The problem is that when a postmaster subprocess is launched, it calls >>> read_nondefault_variables() very early, before shmem initialization, to >>> read the non-default config options from the file that postmaster wrote. >>> When check_XactIsoLevel() calls RecoveryInProgress(), it crashes, >>> because XLogCtl is NULL. > >> Hm, how did the same code fail to crash in the postmaster itself, when >> the postmaster read the setting from postgresql.conf? >It's not the check function for default_transaction_isolation that >crashes, but the one for transaction_isolation. > I 'm not exactly sure how transaction_isolation gets set to a non-default > value, though. The default for transaction_isolation is 'default', so > it's understandable that the underlying XactIsoLevel variable gets set > to XACT_SERIALIZABLE, but AFAICS the code to read/write the GUCs from/to > file only cares about the string value of the guc, not the integer value > of the underlying global variable. Here What I am able to trace is that function read_nondefault_variables(), reads all variables from config_exec_params which contains both default_transaction_isolation and transaction_isolation. 1. it first reads default_transaction_isolation and sets value of DefaultXactIsoLevel to 'serializable'. 2. As for parameter default_transaction_isolation, there is no check function it passes. 3. After that when variable transaction_isolation is processed, function check_XactIsoLevel() sets XactIsoLevel to XACT_SERIALIZABLE which causes crash. Actually function read_nondefault_variables(), should only process non default values (default_transaction_isolation) not transaction_isolation, but currently it processes both? With Regards, Amit Kapila.
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Amit Kapila Sent: Monday, August 13, 2012 12:47 PM From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Heikki Linnakangas Sent: Monday, August 13, 2012 12:14 PM On 12.08.2012 17:39, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >>> Hm, how did the same code fail to crash in the postmaster itself, when >>> the postmaster read the setting from postgresql.conf? >>It's not the check function for default_transaction_isolation that >>crashes, but the one for transaction_isolation. >> I 'm not exactly sure how transaction_isolation gets set to a non-default >> value, though. The default for transaction_isolation is 'default', so >> it's understandable that the underlying XactIsoLevel variable gets set >> to XACT_SERIALIZABLE, but AFAICS the code to read/write the GUCs from/to >> file only cares about the string value of the guc, not the integer value >> of the underlying global variable. > Here What I am able to trace is that function read_nondefault_variables(), > reads all variables > from config_exec_params which contains both default_transaction_isolation > and transaction_isolation. > 1. it first reads default_transaction_isolation and sets value of > DefaultXactIsoLevel to 'serializable'. > 2. As for parameter default_transaction_isolation, there is no check > function it passes. > 3. After that when variable transaction_isolation is processed, function > check_XactIsoLevel() sets > XactIsoLevel to XACT_SERIALIZABLE which causes crash. > Actually function read_nondefault_variables(), should only process non > default values (default_transaction_isolation) > not transaction_isolation, but currently it processes both? transaction_isolation is getting written to config_exec_params file as function write_one_nondefault_variable() checks if conf source is not PGC_S_DEFAULT, then it writes to file. For transaction_isolation, the conf source is set to PGC_S_OVERRIDE in function InitializeGUCOptions() so this also gets written to config_exec_params file. Should the parameter transaction_isolation be written to config_exec_params? With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > While playing around, I bumped into another related bug, and > after googling around I found out that it was already reported by > Robert Haas earlier, but still not fixed: > http://archives.postgresql.org/message-id/CA%2BTgmoa0UM2W1YkjjneEgJctzxopC3G53ocYPaCyoEOWT3aKiA%40mail.gmail.com > Kevin, the last message on that thread ( > http://archives.postgresql.org/pgsql-hackers/2012-04/msg01394.php > ) says you'll write a patch for that. Ping? Or would you like me > to try that? Let me try to redeem myself by taking a shot at it tonight. -Kevin