Re: SSI bug? - Mailing list pgsql-hackers

From yamt@mwd.biglobe.ne.jp (YAMAMOTO Takashi)
Subject Re: SSI bug?
Date 2011-03-28 03:16:16
Msg-id 20110328031616.CDAA719CEDF@mail.netbsd.org
In response to Re: SSI bug?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: SSI bug?
List pgsql-hackers
hi,

>>> (6) Does the application continue to run relatively sanely, or
>>> does it fall over at this point?
>> 
>> my application just exits on the error.
>> 
>> if I re-run the application without restarting postgres, it seems
>> that I get the error sooner than on the first run (but that might
>> just be luck).
>  
> If your application hits this again, could you check pg_stat_activity
> and pg_locks and see if any SIReadLock entries are lingering after
> the owning transaction and all overlapping transactions are
> completed?  If anything is lingering between runs of your
> application, it *should* show up in one or the other of these.
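
A query along these lines shows whether any SIReadLock entries linger
and what they point at (a sketch; grouping by relation is only for
readability):

    select locktype, relation::regclass as rel, count(*)
      from pg_locks
     where mode = 'SIReadLock'
     group by locktype, relation
     order by count(*) desc;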

This is commit 71ac48fd9cebd3d2a873635a04df64096c981f73 with your two patches applied.
This psql session was the only activity on the server at this point.

hoge=# select * from pg_stat_activity;
-[ RECORD 1 ]----+--------------------------------
datid            | 16384
datname          | hoge
procpid          | 7336
usesysid         | 10
usename          | takashi
application_name | psql
client_addr      | 
client_hostname  | 
client_port      | -1
backend_start    | 2011-03-26 12:28:21.882226+09
xact_start       | 2011-03-28 11:55:19.300027+09
query_start      | 2011-03-28 11:55:19.300027+09
waiting          | f
current_query    | select * from pg_stat_activity;

hoge=# select count(*) from pg_locks where mode='SIReadLock';
-[ RECORD 1 ]
count | 7057

hoge=# select locktype,count(*) from pg_locks group by locktype;
-[ RECORD 1 ]--------
locktype | virtualxid
count    | 1
-[ RECORD 2 ]--------
locktype | relation
count    | 1
-[ RECORD 3 ]--------
locktype | tuple
count    | 7061

hoge=# 

>  
>>> (7) The message hint would help pin it down, or a stack trace at
>>> the point of the error would help more. Is it possible to get
>>> either? Looking over the code, it appears that the only places
>>> that SSI could generate that error, it would cancel that
>>> transaction with the hint "You might need to increase
>>> max_pred_locks_per_transaction." and otherwise allow normal
>>> processing.
>> 
>> No message hints. These errors are not generated by the SSI code,
>> at least not directly.
>  
> Right, that's because we were using HASH_ENTER instead of
> HASH_ENTER_NULL.  I've posted a patch which should correct that.
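
(For reference, a sketch of the pattern that change implies,
reconstructed from the error fields below rather than copied verbatim
from predicate.c: with HASH_ENTER, dynahash raises "out of shared
memory" itself, while HASH_ENTER_NULL returns NULL so the caller can
attach the SSI-specific hint.)

    target = (PREDICATELOCKTARGET *)
        hash_search(PredicateLockTargetHash,
                    targettag,
                    HASH_ENTER_NULL,    /* HASH_ENTER before the patch */
                    &found);
    if (!target)
        ereport(ERROR,
                (errcode(ERRCODE_OUT_OF_MEMORY),
                 errmsg("out of shared memory"),
                 errhint("You might need to increase max_pred_locks_per_transaction.")));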

Sure, with your patch it seems they turned into different errors:

PG_DIAG_SEVERITY: WARNING
PG_DIAG_SQLSTATE: 53200
PG_DIAG_MESSAGE_PRIMARY: out of shared memory
PG_DIAG_SOURCE_FILE: shmem.c
PG_DIAG_SOURCE_LINE: 190
PG_DIAG_SOURCE_FUNCTION: ShmemAlloc

PG_DIAG_SEVERITY: ERROR
PG_DIAG_SQLSTATE: 53200
PG_DIAG_MESSAGE_PRIMARY: out of shared memory
PG_DIAG_MESSAGE_HINT: You might need to increase max_pred_locks_per_transaction.
PG_DIAG_SOURCE_FILE: predicate.c
PG_DIAG_SOURCE_LINE: 2049
PG_DIAG_SOURCE_FUNCTION: CreatePredicateLock
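
(The hint refers to this postgresql.conf setting; the value below is
only illustrative, and a server restart is required because the
predicate-lock shared memory is sized from it at startup:)

    max_pred_locks_per_transaction = 256        # default is 64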

>>> Even with the above information it may be far from clear where
>>> allocations are going past their maximum, since one HTAB could
>>> grab more than its share and starve another which is staying below
>>> its "maximum". I'll take a look at the possibility of adding a
>>> warning or some such when an HTAB expands past its maximum size.
>  
> I see from your later post that you are running with this patch.  Has
> that reported anything yet?

I got nothing except the following, in the server log:

WARNING:  hash table "ShmemIndex" has more entries than expected
DETAIL:  The maximum was set to 32 on creation.
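
(A purely hypothetical sketch of what such a check might look like
inside dynahash's insert path; the max_entries and warned fields are
invented names here, not necessarily how the actual patch does it:)

    /* hypothetical: warn the first time an HTAB grows past the entry
     * count it was created with; hashp/hctl are dynahash's HTAB and
     * HASHHDR for the table being expanded */
    if (hctl->max_entries > 0 &&
        hash_get_num_entries(hashp) > hctl->max_entries &&
        !hashp->warned)
    {
        hashp->warned = true;
        ereport(WARNING,
                (errmsg("hash table \"%s\" has more entries than expected",
                        hashp->tabname),
                 errdetail("The maximum was set to %ld on creation.",
                           hctl->max_entries)));
    }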

YAMAMOTO Takashi

>  
> Thanks,
>  
> -Kevin

