Thread: default_isolation_level='serializable' crashes on Windows

default_isolation_level='serializable' crashes on Windows

From
Heikki Linnakangas
Date:
A customer reported that when you set 
default_isolation_level='serializable' in postgresql.conf on Windows, 
and try to start up the database, it crashes immediately. And sure 
enough, it does, on REL9_1_STABLE as well as on master.

Stack trace:

postgres!RecoveryInProgress+0x3a 
[c:\postgresql\src\backend\access\transam\xlog.c @ 7125]
postgres!check_XactIsoLevel+0x162 
[c:\postgresql\src\backend\commands\variable.c @ 617] postgres!call_string_check_hook+0x6d 
[c:\postgresql\src\backend\utils\misc\guc.c @ 8226]
postgres!set_config_option+0x13e5 
[c:\postgresql\src\backend\utils\misc\guc.c @ 5652] postgres!read_nondefault_variables+0x27f 
[c:\postgresql\src\backend\utils\misc\guc.c @ 7677]
postgres!SubPostmasterMain+0x227 
[c:\postgresql\src\backend\postmaster\postmaster.c @ 4101]
postgres!main+0x1e9 [c:\postgresql\src\backend\main\main.c @ 187]
postgres!__tmainCRTStartup+0x192 
[f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crtexe.c @ 586]
postgres!mainCRTStartup+0xe 
[f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crtexe.c @ 403]
kernel32!BaseThreadInitThunk+0xd
ntdll!RtlUserThreadStart+0x1d

The problem is that when a postmaster subprocess is launched, it calls 
read_nondefault_variables() very early, before shmem initialization, to 
read the non-default config options from the file that postmaster wrote. 
When check_XactIsoLevel() calls RecoveryInProgress(), it crashes, 
because XLogCtl is NULL.

I'm not sure what the cleanest fix for this would be. It seems that we 
could should just trust the values the postmaster passes to us and 
accept them without checking RecoveryInProgress(), but there's no 
straightforward way to tell that within check_XactIsoLevel(). Another 
thought is that there's really no need to pass XactIsoLevel from 
postmaster to a backend anyway, because it's overwritten from 
default_transaction_isolation as soon as you begin a transaction.

There's also a call to RecoveryInProgress() in 
check_transaction_read_only() as well, but it seems to not have this 
problem. That's more by accident than by design, though.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com



Re: default_isolation_level='serializable' crashes on Windows

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> The problem is that when a postmaster subprocess is launched, it calls 
> read_nondefault_variables() very early, before shmem initialization, to 
> read the non-default config options from the file that postmaster wrote. 
> When check_XactIsoLevel() calls RecoveryInProgress(), it crashes, 
> because XLogCtl is NULL.

Hm, how did the same code fail to crash in the postmaster itself, when
the postmaster read the setting from postgresql.conf?

A larger point is that I think it's broken for any GUC assignment
function to be calling something as transient as RecoveryInProgress to
start with.  We probably ought to re-think the logic, not just band-aid
this by having it skip the check when shmem isn't initialized yet.
I'm thinking that the check has to occur somewhere outside GUC.
        regards, tom lane



Re: default_isolation_level='serializable' crashes on Windows

From
Heikki Linnakangas
Date:
On 12.08.2012 17:39, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>> The problem is that when a postmaster subprocess is launched, it calls
>> read_nondefault_variables() very early, before shmem initialization, to
>> read the non-default config options from the file that postmaster wrote.
>> When check_XactIsoLevel() calls RecoveryInProgress(), it crashes,
>> because XLogCtl is NULL.
>
> Hm, how did the same code fail to crash in the postmaster itself, when
> the postmaster read the setting from postgresql.conf?

It's not the check function for default_transaction_isolation that 
crashes, but the one for transaction_isolation.

I'm not exactly sure how transaction_isolation gets set to a non-default 
value, though. The default for transaction_isolation is 'default', so 
it's understandable that the underlying XactIsoLevel variable gets set 
to XACT_SERIALIZABLE, but AFAICS the code to read/write the GUCs from/to 
file only cares about the string value of the guc, not the integer value 
of the underlying global variable.

> A larger point is that I think it's broken for any GUC assignment
> function to be calling something as transient as RecoveryInProgress to
> start with.  We probably ought to re-think the logic, not just band-aid
> this by having it skip the check when shmem isn't initialized yet.
> I'm thinking that the check has to occur somewhere outside GUC.

Hmm, it seems like the logical place to complain if you do a manual "SET 
transaction_isolation='serializable'". But I think we should only do the 
check if we're not in a transaction. Setting the guc won't have any 
effect outside a transaction anyway, because StartTransaction will 
overwrite it from default_transaction_isolation as soon as you begin a 
transaction.

While playing around, I bumped into another related bug, and after 
googling around I found out that it was already reported by Robert Haas 
earlier, but still not fixed: 
http://archives.postgresql.org/message-id/CA%2BTgmoa0UM2W1YkjjneEgJctzxopC3G53ocYPaCyoEOWT3aKiA%40mail.gmail.com. 
Kevin, the last message on that thread 
(http://archives.postgresql.org/pgsql-hackers/2012-04/msg01394.php) says 
you'll write a patch for that. Ping? Or would you like me to try that?

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com



Re: default_isolation_level='serializable' crashes on Windows

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 12.08.2012 17:39, Tom Lane wrote:
>> A larger point is that I think it's broken for any GUC assignment
>> function to be calling something as transient as RecoveryInProgress to
>> start with.

> Hmm, it seems like the logical place to complain if you do a manual "SET 
> transaction_isolation='serializable'". But I think we should only do the 
> check if we're not in a transaction.

You mean "if we *are* in a transaction"?  Possibly that would work.
The concerns I've got about running this type of test in a GUC hook
are (1) what if you're not in a transaction or (2) what if you aren't
in a process that has access to shared memory; the early-startup-in-
EXEC_BACKEND case loses on both counts.  But I think we can test our
local in-transaction state without touching shared memory, and then
go from there.

> While playing around, I bumped into another related bug, and after 
> googling around I found out that it was already reported by Robert Haas 
> earlier, but still not fixed: 
> http://archives.postgresql.org/message-id/CA%2BTgmoa0UM2W1YkjjneEgJctzxopC3G53ocYPaCyoEOWT3aKiA%40mail.gmail.com

That sounds like a different bug, although perhaps with the same theme
of overreliance on what a GUC hook can do.
        regards, tom lane



Re: default_isolation_level='serializable' crashes on Windows

From
Amit Kapila
Date:
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Heikki Linnakangas
Sent: Monday, August 13, 2012 12:14 PM
On 12.08.2012 17:39, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:
>>> The problem is that when a postmaster subprocess is launched, it calls
>>> read_nondefault_variables() very early, before shmem initialization, to
>>> read the non-default config options from the file that postmaster wrote.
>>> When check_XactIsoLevel() calls RecoveryInProgress(), it crashes,
>>> because XLogCtl is NULL.
>
>> Hm, how did the same code fail to crash in the postmaster itself, when
>> the postmaster read the setting from postgresql.conf?

>It's not the check function for default_transaction_isolation that 
>crashes, but the one for transaction_isolation.

> I 'm not exactly sure how transaction_isolation gets set to a non-default 
> value, though. The default for transaction_isolation is 'default', so 
> it's understandable that the underlying XactIsoLevel variable gets set 
> to XACT_SERIALIZABLE, but AFAICS the code to read/write the GUCs from/to 
> file only cares about the string value of the guc, not the integer value 
> of the underlying global variable.

Here What I am able to trace is that function read_nondefault_variables(),
reads all variables
from config_exec_params which contains both default_transaction_isolation
and transaction_isolation.

1. it first reads default_transaction_isolation and sets value of
DefaultXactIsoLevel to 'serializable'.
2. As for parameter default_transaction_isolation, there is no check
function it passes.
3. After that when variable transaction_isolation is processed, function
check_XactIsoLevel() sets   XactIsoLevel to XACT_SERIALIZABLE which causes crash.

Actually function read_nondefault_variables(), should only process non
default values (default_transaction_isolation)
not transaction_isolation, but currently it processes both?


With Regards,
Amit Kapila.





Re: default_isolation_level='serializable' crashes on Windows

From
Amit Kapila
Date:
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Amit Kapila
Sent: Monday, August 13, 2012 12:47 PM
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Heikki Linnakangas
Sent: Monday, August 13, 2012 12:14 PM
On 12.08.2012 17:39, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  writes:


>>> Hm, how did the same code fail to crash in the postmaster itself, when
>>> the postmaster read the setting from postgresql.conf?

>>It's not the check function for default_transaction_isolation that 
>>crashes, but the one for transaction_isolation.

>> I 'm not exactly sure how transaction_isolation gets set to a non-default

>> value, though. The default for transaction_isolation is 'default', so 
>> it's understandable that the underlying XactIsoLevel variable gets set 
>> to XACT_SERIALIZABLE, but AFAICS the code to read/write the GUCs from/to 
>> file only cares about the string value of the guc, not the integer value 
>> of the underlying global variable.

> Here What I am able to trace is that function read_nondefault_variables(),
> reads all variables
> from config_exec_params which contains both default_transaction_isolation
> and transaction_isolation.

> 1. it first reads default_transaction_isolation and sets value of
> DefaultXactIsoLevel to 'serializable'.
> 2. As for parameter default_transaction_isolation, there is no check
> function it passes.
> 3. After that when variable transaction_isolation is processed, function
> check_XactIsoLevel() sets 
>   XactIsoLevel to XACT_SERIALIZABLE which causes crash.

> Actually function read_nondefault_variables(), should only process non
> default values (default_transaction_isolation)
> not transaction_isolation, but currently it processes both?

transaction_isolation is getting written to config_exec_params file as
function 
write_one_nondefault_variable() checks if conf source is not PGC_S_DEFAULT,
then it writes to file.
For transaction_isolation, the conf source is set to PGC_S_OVERRIDE in
function InitializeGUCOptions()
so this also gets written to config_exec_params file.
Should the parameter transaction_isolation be written to config_exec_params?


With Regards,
Amit Kapila.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




Re: default_isolation_level='serializable' crashes on Windows

From
"Kevin Grittner"
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> While playing around, I bumped into another related bug, and
> after googling around I found out that it was already reported by
> Robert Haas earlier, but still not fixed: 
>
http://archives.postgresql.org/message-id/CA%2BTgmoa0UM2W1YkjjneEgJctzxopC3G53ocYPaCyoEOWT3aKiA%40mail.gmail.com
> Kevin, the last message on that thread (
>
http://archives.postgresql.org/pgsql-hackers/2012-04/msg01394.php
> ) says you'll write a patch for that. Ping? Or would you like me
> to try that?
Let me try to redeem myself by taking a shot at it tonight.
-Kevin