Re: Re: Monitoring activities of PostgreSQL ("Everlasting" function execution) - Mailing list pgsql-general

From Allan Kamau
Subject Re: Re: Monitoring activities of PostgreSQL ("Everlasting" function execution)
Date
Msg-id AANLkTikC2x8dlT-Ct6kxBZbJfh52AsvPEEYZleitq_ha@mail.gmail.com
Whole thread Raw
In response to Re: Re: Monitoring activities of PostgreSQL ("Everlasting" function execution)  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-general
On Wed, Jun 16, 2010 at 1:15 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> On 16/06/10 16:56, Allan Kamau wrote:
>
>> The function I have mentioned above seems to "hang" indefinitely (now
>> 12hours and counting) while it took only 2secs for the many times it
>> was previously invoked from the client application (with slightly
>> different parameter value at each such time).
>
> Is there anything in pg_catalog.pg_locks that relates to that query?
>
> If you attach a debugger to the stalled backend and get a backtrace,
> what's the output? See:
>
> http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
>
> If you strace the problem backend, does it appear to be doing anything?
>
> --
> Craig Ringer
>
> Tech-related writing: http://soapyfrogs.blogspot.com/
>


The pg_catalog.pg_locks table does indeed have 20 records pertaining
to the "everlasting" function
(mining.perform_itemset_fragment_mining()) execution. Please have a
look at the query output below.


po12=> SELECT a.* FROM pg_stat_activity a;
  datid  | datname | procpid | usesysid | usename |
         current_query                              | waiting |
  xact_start           |
      query_start          |         backend_start         |
client_addr | client_port

---------+---------+---------+----------+---------+------------------------------------------------------------------------+---------+-------------------------------+----
---------------------------+-------------------------------+-------------+-------------
 1936449 | po12    |   14277 |    16392 | test    | SELECT a.* FROM
pg_stat_activity a;                                    | f       |
2010-06-17 13:56:50.129498+03 | 201
0-06-17 13:56:50.129498+03 | 2010-06-17 13:15:23.018817+03 |
  |          -1
 1936449 | po12    |   14657 |    16392 | test    | SELECT
mining.perform_itemset_fragment_mining($1,$2,$3,$4,$5,$6,$7,$8) | f
   | 2010-06-17 13:33:13.315609+03 | 201
0-06-17 13:33:13.319398+03 | 2010-06-17 13:22:33.606603+03 | 127.0.0.1
  |       36211
(2 rows)

po12=> SELECT a.* FROM pg_catalog.pg_locks a;
   locktype    | database | relation | page | tuple | virtualxid |
transactionid | classid |  objid  | objsubid | virtualtransaction |
pid  |        mode         | grant
ed

---------------+----------+----------+------+-------+------------+---------------+---------+---------+----------+--------------------+-------+---------------------+------
---
 object        |  1936449 |          |      |       |            |
          |    1247 | 3546410 |        0 | 1/6500             | 14657
| AccessExclusiveLock | t
 virtualxid    |          |          |      |       | 1/6500     |
          |         |         |          | 1/6500             | 14657
| ExclusiveLock       | t
 object        |        0 |          |      |       |            |
          |    1260 |   16392 |        0 | 1/6500             | 14657
| AccessShareLock     | t
 relation      |  1936449 |  3529708 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessShareLock     | t
 relation      |  1936449 |  3546485 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessExclusiveLock | t
 transactionid |          |          |      |       |            |
  1548543 |         |         |          | 1/6500             | 14657
| ExclusiveLock       | t
 relation      |  1936449 |  3546409 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessExclusiveLock | t
 relation      |  1936449 |  3529699 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessShareLock     | t
 relation      |  1936449 |  3546477 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessExclusiveLock | t
 relation      |  1936449 |  3529709 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessShareLock     | t
 virtualxid    |          |          |      |       | 2/271      |
          |         |         |          | 2/271              | 14277
| ExclusiveLock       | t
 relation      |  1936449 |  3546483 |      |       |            |
          |         |         |          | 1/6500             | 14657
| ShareLock           | t
 relation      |  1936449 |    10969 |      |       |            |
          |         |         |          | 2/271              | 14277
| AccessShareLock     | t
 relation      |  1936449 |  3529706 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessShareLock     | t
 relation      |  1936449 |  3546486 |      |       |            |
          |         |         |          | 1/6500             | 14657
| RowExclusiveLock    | t
 relation      |  1936449 |  3546486 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessExclusiveLock | t
 relation      |  1936449 |  3546479 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessShareLock     | t
 relation      |  1936449 |  3546479 |      |       |            |
          |         |         |          | 1/6500             | 14657
| RowExclusiveLock    | t
 relation      |  1936449 |  3546479 |      |       |            |
          |         |         |          | 1/6500             | 14657
| ShareLock           | t
 relation      |  1936449 |  3546479 |      |       |            |
          |         |         |          | 1/6500             | 14657
| AccessExclusiveLock | t
(20 rows)





I did try collecting the trace of the process 14657 using gdb (as
advised in the suggested resource
"http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD")
and this is how it looks like.

Continuing.

Program received signal SIGINT, Interrupt.
0x0000000000a1a8cd in pg_detoast_datum (datum=0xe52318c) at fmgr.c:2217
2217        if (VARATT_IS_EXTENDED(datum))
#0  0x0000000000a1a8cd in pg_detoast_datum (datum=0xe52318c) at fmgr.c:2217
#1  0x00000000009771b3 in bitcmp (fcinfo=0x7fff4ebd2170) at varbit.c:857
#2  0x0000000000a5c9d0 in myFunctionCall2 (flinfo=0x341c0b0,
arg1=240267660, arg2=162198044) at tuplesort.c:2511
#3  0x0000000000a5cb65 in inlineApplySortFunction
(sortFunction=0x341c0b0, sk_flags=0, datum1=240267660, isNull1=0
'\000', datum2=162198044, isNull2=0 '\000') at tuplesort.c:2551
#4  0x0000000000a5cce8 in comparetup_heap (a=0x7fc84c82fe70,
b=0x7fc84c82fe88, state=0x341be88) at tuplesort.c:2594
#5  0x0000000000a5c4ba in tuplesort_heap_siftup (state=0x341be88,
checkIndex=1 '\001') at tuplesort.c:2435
#6  0x0000000000a5af75 in dumptuples (state=0x341be88, alltuples=0
'\000') at tuplesort.c:2070
#7  0x0000000000a57d1c in puttuple_common (state=0x341be88,
tuple=0x7fff4ebd2700) at tuplesort.c:1097
#8  0x0000000000a5754d in tuplesort_puttupleslot (state=0x341be88,
slot=0x342d7b8) at tuplesort.c:921
#9  0x00000000006eee50 in ExecSort (node=0x347ac98) at nodeSort.c:107
#10 0x00000000006ba46b in ExecProcNode (node=0x347ac98) at execProcnode.c:423
#11 0x00000000006ef4f4 in ExecUnique (node=0x347ab80) at nodeUnique.c:69
#12 0x00000000006ba54f in ExecProcNode (node=0x347ab80) at execProcnode.c:439
#13 0x00000000006f75be in SubqueryNext (node=0x347a8d8) at nodeSubqueryscan.c:58
#14 0x00000000006cc9ee in ExecScan (node=0x347a8d8, accessMtd=0x6f7588
<SubqueryNext>) at execScan.c:103
#15 0x00000000006f7620 in ExecSubqueryScan (node=0x347a8d8) at
nodeSubqueryscan.c:85
#16 0x00000000006ba26a in ExecProcNode (node=0x347a8d8) at execProcnode.c:381
#17 0x00000000006ef4f4 in ExecUnique (node=0x347a7c0) at nodeUnique.c:69
#18 0x00000000006ba54f in ExecProcNode (node=0x347a7c0) at execProcnode.c:439
#19 0x00000000006f75be in SubqueryNext (node=0x342df00) at nodeSubqueryscan.c:58
#20 0x00000000006cc9ee in ExecScan (node=0x342df00, accessMtd=0x6f7588
<SubqueryNext>) at execScan.c:103
#21 0x00000000006f7620 in ExecSubqueryScan (node=0x342df00) at
nodeSubqueryscan.c:85
#22 0x00000000006ba26a in ExecProcNode (node=0x342df00) at execProcnode.c:381
#23 0x00000000006b4faa in ExecutePlan (estate=0x342c038,
planstate=0x342df00, operation=CMD_INSERT, numberTuples=0,
direction=ForwardScanDirection, dest=0xe7bf40) at execMain.c:1504
#24 0x00000000006b1496 in standard_ExecutorRun (queryDesc=0x36fb6d8,
direction=ForwardScanDirection, count=0) at execMain.c:309
#25 0x00000000006b122d in ExecutorRun (queryDesc=0x36fb6d8,
direction=ForwardScanDirection, count=0) at execMain.c:258
#26 0x00000000007062cb in _SPI_pquery (queryDesc=0x36fb6d8,
fire_triggers=1 '\001', tcount=0) at spi.c:2009
#27 0x0000000000705b9c in _SPI_execute_plan (plan=0x7fff4ebd2dd0,
paramLI=0x339b578, snapshot=0x0, crosscheck_snapshot=0x0, read_only=0
'\000', fire_triggers=1 '\001', tcount=0) at spi.c:1831
#28 0x0000000000700d2b in SPI_execute_with_args (src=0x27ae950 "
INSERT INTO _itemset_fragment_major ( itemset_fragment_item_bit_code
,itemset_fragment_item_bit_code_overhang ) SELECT DISTINCT
itemset_fragment_item_bit_code
,itemset_fragment_item_bit_code_overhang"..., nargs=6,
argtypes=0x27aedc0, Values=0x27aedf8, Nulls=0x27ae518 "
~\177\330XG\003", read_only=0 '\000', tcount=0) at spi.c:495
#29 0x00007fc851dae0f7 in exec_stmt_dynexecute (estate=0x7fff4ebd31b0,
stmt=0x2864d80) at pl_exec.c:2945
#30 0x00007fc851da8505 in exec_stmt (estate=0x7fff4ebd31b0,
stmt=0x2864d80) at pl_exec.c:1301
#31 0x00007fc851da7f2f in exec_stmts (estate=0x7fff4ebd31b0,
stmts=0x28640c0) at pl_exec.c:1200
#32 0x00007fc851da8b88 in exec_stmt_if (estate=0x7fff4ebd31b0,
stmt=0x2873290) at pl_exec.c:1435
#33 0x00007fc851da81d2 in exec_stmt (estate=0x7fff4ebd31b0,
stmt=0x2873290) at pl_exec.c:1249
#34 0x00007fc851da7f2f in exec_stmts (estate=0x7fff4ebd31b0,
stmts=0x2863418) at pl_exec.c:1200
#35 0x00007fc851da7cd2 in exec_stmt_block (estate=0x7fff4ebd31b0,
block=0x2886810) at pl_exec.c:1137
#36 0x00007fc851da5301 in plpgsql_exec_function (func=0x2855050,
fcinfo=0x7fff4ebd3480) at pl_exec.c:315
#37 0x00007fc851d9da08 in plpgsql_call_handler (fcinfo=0x7fff4ebd3480)
at pl_handler.c:95
#38 0x00000000006c06ba in ExecMakeFunctionResult (fcache=0x28b5550,
econtext=0x28b5420, isNull=0x28b6238
"\177~\177\177\177\177\177\177x\215B\003", isDone=0x28b6378) at
execQual.c:1789
#39 0x00000000006c1750 in ExecEvalFunc (fcache=0x28b5550,
econtext=0x28b5420, isNull=0x28b6238
"\177~\177\177\177\177\177\177x\215B\003", isDone=0x28b6378) at
execQual.c:2215
#40 0x00000000006cbe44 in ExecTargetList (targetlist=0x28b6340,
econtext=0x28b5420, values=0x28b6218, isnull=0x28b6238
"\177~\177\177\177\177\177\177x\215B\003", itemIsDone=0x28b6378,
isDone=0x7fff4ebd3a34) at execQual.c:5108
#41 0x00000000006cc743 in ExecProject (projInfo=0x28b6258,
isDone=0x7fff4ebd3a34) at execQual.c:5323
#42 0x00000000006ec5b0 in ExecResult (node=0x28b5308) at nodeResult.c:155
#43 0x00000000006ba0db in ExecProcNode (node=0x28b5308) at execProcnode.c:344
#44 0x00000000006b4faa in ExecutePlan (estate=0x28b50d8,
planstate=0x28b5308, operation=CMD_SELECT, numberTuples=0,
direction=ForwardScanDirection, dest=0x2723890) at execMain.c:1504
#45 0x00000000006b1496 in standard_ExecutorRun (queryDesc=0x2777a38,
direction=ForwardScanDirection, count=0) at execMain.c:309
#46 0x00000000006b122d in ExecutorRun (queryDesc=0x2777a38,
direction=ForwardScanDirection, count=0) at execMain.c:258
#47 0x000000000087d517 in PortalRunSelect (portal=0x2787398, forward=1
'\001', count=0, dest=0x2723890) at pquery.c:953
#48 0x000000000087ce98 in PortalRun (portal=0x2787398,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x2723890,
altdest=0x2723890, completionTag=0x7fff4ebd3e60 "") at pquery.c:779
#49 0x0000000000874ad3 in exec_execute_message (portal_name=0x2723478
"", max_rows=9223372036854775807) at postgres.c:1928
#50 0x0000000000879efd in PostgresMain (argc=4, argv=0x2678b50,
username=0x2678900 "test") at postgres.c:3671
#51 0x0000000000809631 in BackendRun (port=0x2699140) at postmaster.c:3449
#52 0x0000000000808352 in BackendStartup (port=0x2699140) at postmaster.c:3063
#53 0x0000000000802cd0 in ServerLoop () at postmaster.c:1387
#54 0x0000000000801b19 in PostmasterMain (argc=6, argv=0x26746e0) at
postmaster.c:1040
#55 0x0000000000720646 in main (argc=6, argv=0x26746e0) at main.c:188
Continuing.

Program received signal SIGINT, Interrupt.
0x0000003775ce0fcb in __mcount_internal () from /lib64/libc.so.6
#0  0x0000003775ce0fcb in __mcount_internal () from /lib64/libc.so.6
#1  0x0000003775ce1c74 in mcount () from /lib64/libc.so.6
#2  0x00007fff4ebd1990 in ?? ()
#3  0x0000000000000003 in ?? ()
#4  0x00000000009781b2 in bitxor (fcinfo=0x7fff4ebd1950) at varbit.c:1167
#5  0x00007fff4ebd1950 in ?? ()
#6  0x00007fff4ebd1990 in ?? ()
#7  0x00000000006bcf45 in ExecEvalVar (exprstate=0x7fff4ebd1990,
econtext=0x7fff4ebd1950, isNull=0x7fff4ebd1950 "", isDone=0x3) at
execQual.c:748
#8  0x0000000000600000 in check_index_is_clusterable
(OldHeap=0xc78948de89488845, indexOid=2336784477, recheck=-1 '\377')
at cluster.c:446
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Continuing.

Program received signal SIGINT, Interrupt.
0x0000000000a5cd99 in comparetup_heap (a=0x7fc84c99f880,
b=0x7fc84c99f898, state=0x341be88) at tuplesort.c:2615
2615            datum1 = heap_getattr(<up, attno, tupDesc, &isnull1);
#0  0x0000000000a5cd99 in comparetup_heap (a=0x7fc84c99f880,
b=0x7fc84c99f898, state=0x341be88) at tuplesort.c:2615
#1  0x0000000000a5c4ba in tuplesort_heap_siftup (state=0x341be88,
checkIndex=1 '\001') at tuplesort.c:2435
#2  0x0000000000a5af75 in dumptuples (state=0x341be88, alltuples=0
'\000') at tuplesort.c:2070
#3  0x0000000000a57d1c in puttuple_common (state=0x341be88,
tuple=0x7fff4ebd2700) at tuplesort.c:1097
#4  0x0000000000a5754d in tuplesort_puttupleslot (state=0x341be88,
slot=0x342d7b8) at tuplesort.c:921
#5  0x00000000006eee50 in ExecSort (node=0x347ac98) at nodeSort.c:107
#6  0x00000000006ba46b in ExecProcNode (node=0x347ac98) at execProcnode.c:423
#7  0x00000000006ef4f4 in ExecUnique (node=0x347ab80) at nodeUnique.c:69
#8  0x00000000006ba54f in ExecProcNode (node=0x347ab80) at execProcnode.c:439
#9  0x00000000006f75be in SubqueryNext (node=0x347a8d8) at nodeSubqueryscan.c:58
#10 0x00000000006cc9ee in ExecScan (node=0x347a8d8, accessMtd=0x6f7588
<SubqueryNext>) at execScan.c:103
#11 0x00000000006f7620 in ExecSubqueryScan (node=0x347a8d8) at
nodeSubqueryscan.c:85
#12 0x00000000006ba26a in ExecProcNode (node=0x347a8d8) at execProcnode.c:381
#13 0x00000000006ef4f4 in ExecUnique (node=0x347a7c0) at nodeUnique.c:69
#14 0x00000000006ba54f in ExecProcNode (node=0x347a7c0) at execProcnode.c:439
#15 0x00000000006f75be in SubqueryNext (node=0x342df00) at nodeSubqueryscan.c:58
#16 0x00000000006cc9ee in ExecScan (node=0x342df00, accessMtd=0x6f7588
<SubqueryNext>) at execScan.c:103
#17 0x00000000006f7620 in ExecSubqueryScan (node=0x342df00) at
nodeSubqueryscan.c:85
#18 0x00000000006ba26a in ExecProcNode (node=0x342df00) at execProcnode.c:381
#19 0x00000000006b4faa in ExecutePlan (estate=0x342c038,
planstate=0x342df00, operation=CMD_INSERT, numberTuples=0,
direction=ForwardScanDirection, dest=0xe7bf40) at execMain.c:1504
#20 0x00000000006b1496 in standard_ExecutorRun (queryDesc=0x36fb6d8,
direction=ForwardScanDirection, count=0) at execMain.c:309
#21 0x00000000006b122d in ExecutorRun (queryDesc=0x36fb6d8,
direction=ForwardScanDirection, count=0) at execMain.c:258
#22 0x00000000007062cb in _SPI_pquery (queryDesc=0x36fb6d8,
fire_triggers=1 '\001', tcount=0) at spi.c:2009
#23 0x0000000000705b9c in _SPI_execute_plan (plan=0x7fff4ebd2dd0,
paramLI=0x339b578, snapshot=0x0, crosscheck_snapshot=0x0, read_only=0
'\000', fire_triggers=1 '\001', tcount=0) at spi.c:1831
#24 0x0000000000700d2b in SPI_execute_with_args (src=0x27ae950 "
INSERT INTO _itemset_fragment_major ( itemset_fragment_item_bit_code
,itemset_fragment_item_bit_code_overhang ) SELECT DISTINCT
itemset_fragment_item_bit_code
,itemset_fragment_item_bit_code_overhang"..., nargs=6,
argtypes=0x27aedc0, Values=0x27aedf8, Nulls=0x27ae518 "
~\177\330XG\003", read_only=0 '\000', tcount=0) at spi.c:495
#25 0x00007fc851dae0f7 in exec_stmt_dynexecute (estate=0x7fff4ebd31b0,
stmt=0x2864d80) at pl_exec.c:2945
#26 0x00007fc851da8505 in exec_stmt (estate=0x7fff4ebd31b0,
stmt=0x2864d80) at pl_exec.c:1301
#27 0x00007fc851da7f2f in exec_stmts (estate=0x7fff4ebd31b0,
stmts=0x28640c0) at pl_exec.c:1200
#28 0x00007fc851da8b88 in exec_stmt_if (estate=0x7fff4ebd31b0,
stmt=0x2873290) at pl_exec.c:1435
#29 0x00007fc851da81d2 in exec_stmt (estate=0x7fff4ebd31b0,
stmt=0x2873290) at pl_exec.c:1249
#30 0x00007fc851da7f2f in exec_stmts (estate=0x7fff4ebd31b0,
stmts=0x2863418) at pl_exec.c:1200
#31 0x00007fc851da7cd2 in exec_stmt_block (estate=0x7fff4ebd31b0,
block=0x2886810) at pl_exec.c:1137
#32 0x00007fc851da5301 in plpgsql_exec_function (func=0x2855050,
fcinfo=0x7fff4ebd3480) at pl_exec.c:315
#33 0x00007fc851d9da08 in plpgsql_call_handler (fcinfo=0x7fff4ebd3480)
at pl_handler.c:95
#34 0x00000000006c06ba in ExecMakeFunctionResult (fcache=0x28b5550,
econtext=0x28b5420, isNull=0x28b6238
"\177~\177\177\177\177\177\177x\215B\003", isDone=0x28b6378) at
execQual.c:1789
#35 0x00000000006c1750 in ExecEvalFunc (fcache=0x28b5550,
econtext=0x28b5420, isNull=0x28b6238
"\177~\177\177\177\177\177\177x\215B\003", isDone=0x28b6378) at
execQual.c:2215
#36 0x00000000006cbe44 in ExecTargetList (targetlist=0x28b6340,
econtext=0x28b5420, values=0x28b6218, isnull=0x28b6238
"\177~\177\177\177\177\177\177x\215B\003", itemIsDone=0x28b6378,
isDone=0x7fff4ebd3a34) at execQual.c:5108
#37 0x00000000006cc743 in ExecProject (projInfo=0x28b6258,
isDone=0x7fff4ebd3a34) at execQual.c:5323
#38 0x00000000006ec5b0 in ExecResult (node=0x28b5308) at nodeResult.c:155
#39 0x00000000006ba0db in ExecProcNode (node=0x28b5308) at execProcnode.c:344
#40 0x00000000006b4faa in ExecutePlan (estate=0x28b50d8,
planstate=0x28b5308, operation=CMD_SELECT, numberTuples=0,
direction=ForwardScanDirection, dest=0x2723890) at execMain.c:1504
#41 0x00000000006b1496 in standard_ExecutorRun (queryDesc=0x2777a38,
direction=ForwardScanDirection, count=0) at execMain.c:309
#42 0x00000000006b122d in ExecutorRun (queryDesc=0x2777a38,
direction=ForwardScanDirection, count=0) at execMain.c:258
#43 0x000000000087d517 in PortalRunSelect (portal=0x2787398, forward=1
'\001', count=0, dest=0x2723890) at pquery.c:953
#44 0x000000000087ce98 in PortalRun (portal=0x2787398,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x2723890,
altdest=0x2723890, completionTag=0x7fff4ebd3e60 "") at pquery.c:779
#45 0x0000000000874ad3 in exec_execute_message (portal_name=0x2723478
"", max_rows=9223372036854775807) at postgres.c:1928
#46 0x0000000000879efd in PostgresMain (argc=4, argv=0x2678b50,
username=0x2678900 "test") at postgres.c:3671
#47 0x0000000000809631 in BackendRun (port=0x2699140) at postmaster.c:3449
#48 0x0000000000808352 in BackendStartup (port=0x2699140) at postmaster.c:3063
#49 0x0000000000802cd0 in ServerLoop () at postmaster.c:1387
#50 0x0000000000801b19 in PostmasterMain (argc=6, argv=0x26746e0) at
postmaster.c:1040
#51 0x0000000000720646 in main (argc=6, argv=0x26746e0) at main.c:188

pgsql-general by date:

Previous
From: Allan Kamau
Date:
Subject: Re: Monitoring activities of PostgreSQL
Next
From: stefano bonnin
Date:
Subject: Re: Given N, finding the interval of N hours with max(sum(..))