Re: BUG #5798: Some weird error with pl/pgsql procedure - Mailing list pgsql-bugs
From | Maxim Boguk |
---|---|
Subject | Re: BUG #5798: Some weird error with pl/pgsql procedure |
Date | |
Msg-id | AANLkTi==2o9JPJwwfOsL2CnFpcw1YYy_AY_qCiKtg05W@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #5798: Some weird error with pl/pgsql procedure (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: BUG #5798: Some weird error with pl/pgsql procedure
|
List | pgsql-bugs |
It happened again in much simpler case (no slony or deadlock message involved... no slony at all on the database). The same error DBD::Pg::db selectall_arrayref failed: ERROR: cannot extract system attribute from virtual tuple CONTEXT: PL/pgSQL function "__clear_table_tail" line 50 at FOR over EXECUTE statement It happened when function worked with small table with only 33 pages size and 600 live tuples. What made situations even more easy to track it is I know this table don't have any inserts or deletes for ages (only select and update) and I have full log of database queries during problem. Log of call of the procedure: 2010-12-23 00:21:04 MSK pgsql@billing 55717 ERROR: cannot extract system attribute from virtual tuple 2010-12-23 00:21:04 MSK pgsql@billing 55717 CONTEXT: PL/pgSQL function "__clear_table_tail" line 50 at FOR over EXECUTE statement 2010-12-23 00:21:04 MSK pgsql@billing 55717 STATEMENT: SELECT * from __clear_table_tail('ctrl_servers', 'bill', 'opt', 21,= 1) In the same time log all actions involving changes on table ctrl_servers during +/- 3 second from ERROR: 2010-12-23 00:21:03 MSK bill@billing 42078 LOG: duration: 0.736 ms statement: UPDATE ctrl_servers SET last_connect =3D 'now' WHERE ( id =3D '1514' ) Just one query. Line where ERROR happens: FOR _new_tid IN EXECUTE 'UPDATE ONLY ' || quote_ident(arg_relnamespace) || '.' || quote_ident(arg_relname) || ' SET ' || arg_field || ' =3D ' || arg_field || ' WHERE ctid=3DANY($1) RETURNING ctid' USING _current_tid_list LOOP Need be said _current_tid_list at start of procedure call contained all possible tids for 21 page of relation (not only for live tuples but all possible (in range like {21,1} to {21,300+})). I have an idea the error happens because some rare race conditions with tid scan and concurrent updates in the same page (probably HOT somehow involved... because both updateable fields in function and concurrent update query updated non-indexed fields). Currently I work on creating self contained test case (unfortunately without much success until now). About stack backtrace I not sure how to get it (and even worse the database is heavily loaded production DB so I not sure creating core dump would be good idea). Still I would like to receive any help with creating a stack backtrace. On Wed, Dec 22, 2010 at 5:22 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Dec 20, 2010 at 9:48 PM, Maxim Boguk <Maxim.Boguk@gmail.com> wrot= e: >> Anyone can enlighten me what happens here? > > That does look weird, but without a simple test case I think it's > going to be hard to investigate this. =C2=A0Installing Slony and your code > and building a 130GB is more effort than I'm willing to put in... > > Any way you can get a stack backtrace at the point the error occurs? > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > --=20 Maxim Boguk Senior Postgresql DBA. Skype: maxim.boguk Jabber: maxim.boguk@gmail.com LinkedIn profile: http://nz.linkedin.com/in/maximboguk =D0=9C=D0=BE=D0=B9=D0=9A=D1=80=D1=83=D0=B3: http://mboguk.moikrug.ru/ =D0=A1=D0=B8=D0=BB=D0=B0 =D1=81=D0=BE=D0=BB=D0=BE=D0=BC=D1=83 =D0=BB=D0=BE= =D0=BC=D0=B8=D1=82, =D0=BD=D0=BE =D0=BD=D0=B5 =D0=B2=D1=81=D0=B5 =D0=B2 =D0= =BD=D0=B0=D1=88=D0=B5=D0=B9 =D0=B6=D0=B8=D0=B7=D0=BD=D0=B8 - =D1=81=D0=BE= =D0=BB=D0=BE=D0=BC=D0=B0, =D0=B4=D0=B0 =D0=B8 =D1=81=D0=B8=D0=BB=D0=B0 =D0= =B4=D0=B0=D0=BB=D0=B5=D0=BA=D0=BE =D0=BD=D0=B5 =D0=B2=D1=81=D0=B5.
pgsql-bugs by date: