Thread: Out of memory on SELECT (from sort?) in 8.3

Out of memory on SELECT (from sort?) in 8.3

From
"Matt Magoffin"
Date:
I have a SELECT query that causes an out-of-memory error on my production
Postgres 8.3 server. I believe the culprit is from the sort in the query,
but its running out on queries that return 300,000 results which I didn't
think was very large.

work_mem is set to 128MB, but I thought the server would change to an
on-disk sort method if it couldn't perform the sort in memory. Is this not
the case?

In the log file, I see the following information:

2008-08-16 02:02:43 CDT lms_nna ERROR:  out of memory
2008-08-16 02:02:43 CDT lms_nna DETAIL:  Failed on request of size 16384.
TopMemoryContext: 105376 total in 12 blocks; 9288 free (10 chunks); 96088
used
  LibxmlContext: 8380416 total in 10 blocks; 3828016 free (0 chunks);
4552400 used
  TopTransactionContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544
used
  PL/PgSQL function context: 8192 total in 1 blocks; 5512 free (4 chunks);
2680 used
  Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks);
12688 used
  PL/PgSQL function context: 8192 total in 1 blocks; 4648 free (7 chunks);
3544 used
  CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  Rendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 chunks);
6512 used
  PLpgSQL function cache: 24224 total in 2 blocks; 3744 free (0 chunks);
20480 used
  Record information cache: 24576 total in 2 blocks; 15984 free (5
chunks); 8592 used
  Prepared Queries: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
  Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512
used
  Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks);
12688 used
  MessageContext: 8192 total in 1 blocks; 6976 free (1 chunks); 1216 used
  smgr relation table: 24576 total in 2 blocks; 13904 free (4 chunks);
10672 used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks);
32 used
  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PortalMemory: 8192 total in 1 blocks; 7616 free (0 chunks); 576 used
    PortalHeapMemory: 2048 total in 1 blocks; 424 free (0 chunks); 1624 used
      ExecutorState: 841031232 total in 51159 blocks; 1712232 free (56
chunks); 839319000 used
        TIDBitmap: 2088960 total in 8 blocks; 243200 free (25 chunks);
1845760 used
        TupleSort: 28303408 total in 13 blocks; 5346944 free (14 chunks);
22956464 used
        ExprContext: 8192 total in 1 blocks; 8096 free (0 chunks); 96 used
        ExprContext: 8192 total in 1 blocks; 8096 free (0 chunks); 96 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
    PortalHeapMemory: 1024 total in 1 blocks; 944 free (0 chunks); 80 used
  Relcache by OID: 24576 total in 2 blocks; 13872 free (3 chunks); 10704 used
  CacheMemoryContext: 817392 total in 20 blocks; 23488 free (1 chunks);
793904 used
    CachedPlan: 15360 total in 4 blocks; 1528 free (0 chunks); 13832 used
    CachedPlanSource: 15360 total in 4 blocks; 1440 free (0 chunks); 13920
used
    unnamed prepared statement: 8192 total in 1 blocks; 3856 free (2
chunks); 4336 used
    CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used
    CachedPlanSource: 1024 total in 1 blocks; 56 free (0 chunks); 968 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
    CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 3072 total in 2 blocks; 488 free (0 chunks); 2584 used
    CachedPlanSource: 1024 total in 1 blocks; 24 free (0 chunks); 1000 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used
    CachedPlanSource: 1024 total in 1 blocks; 656 free (0 chunks); 368 used
    pg_toast_18005_index: 2048 total in 1 blocks; 608 free (0 chunks);
1440 used
    reporting_modified_idx: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    reporting_last_processed_date_idx: 2048 total in 1 blocks; 752 free (0
chunks); 1296 used
    reporting_date_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344
used
    lead_created_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lead_processing_step_idx: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    lead_destination_idx: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    lead_modified_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lead_source_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    processing_state_idx: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    lead_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used
    CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used
    CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used
    CachedPlanSource: 1024 total in 1 blocks; 56 free (0 chunks); 968 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    locks_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
    CachedPlan: 3072 total in 2 blocks; 648 free (0 chunks); 2424 used
    CachedPlanSource: 3072 total in 2 blocks; 1248 free (1 chunks); 1824 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used
    CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used
    xslt_style_sheet_pkey: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    external_system_user_name_key: 2048 total in 1 blocks; 752 free (0
chunks); 1296 used
    external_system_name_key: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    external_system_pkey: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    dealer_external_system_id_idx: 2048 total in 1 blocks; 752 free (0
chunks); 1296 used
    dealer_dealer_code_key: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    dealer_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_index_indrelid_index: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    pg_ts_dict_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
    pg_aggregate_fnoid_index: 3072 total in 2 blocks; 1744 free (3
chunks); 1328 used
    pg_language_name_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
    pg_statistic_relid_att_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
    pg_ts_dict_dictname_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_namespace_nspname_index: 3072 total in 2 blocks; 1696 free (2
chunks); 1376 used
    pg_opfamily_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
    pg_opclass_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
    pg_ts_parser_prsname_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_amop_fam_strat_index: 3072 total in 2 blocks; 1384 free (2 chunks);
1688 used
    pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1576 free (3
chunks); 1496 used
    pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
    pg_cast_source_target_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
    pg_auth_members_role_member_index: 3072 total in 2 blocks; 1648 free
(2 chunks); 1424 used
    pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
    pg_ts_config_cfgname_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_authid_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
    pg_ts_config_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
    pg_conversion_default_index: 3072 total in 2 blocks; 1432 free (3
chunks); 1640 used
    pg_language_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
    pg_enum_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328
used
    pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1576 free (3
chunks); 1496 used
    pg_ts_parser_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
    pg_database_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
    pg_conversion_name_nsp_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_class_relname_nsp_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
    pg_attribute_relid_attnam_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_class_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376
used
    pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1384 free (2
chunks); 1688 used
    pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1384 free (2
chunks); 1688 used
    pg_index_indexrelid_index: 3072 total in 2 blocks; 1696 free (2
chunks); 1376 used
    pg_type_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376
used
    pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_authid_rolname_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
    pg_auth_members_member_role_index: 3072 total in 2 blocks; 1648 free
(2 chunks); 1424 used
    pg_enum_typid_label_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_constraint_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
    pg_conversion_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
    pg_ts_template_tmplname_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_ts_config_map_index: 3072 total in 2 blocks; 1624 free (3 chunks);
1448 used
    pg_namespace_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
    pg_type_typname_nsp_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
    pg_operator_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
    pg_amop_opr_fam_index: 3072 total in 2 blocks; 1600 free (2 chunks);
1472 used
    pg_proc_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376
used
    pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1624 free (3
chunks); 1448 used
    pg_ts_template_oid_index: 3072 total in 2 blocks; 1744 free (3
chunks); 1328 used
  MdSmgr: 8192 total in 1 blocks; 7104 free (0 chunks); 1088 used
  LOCALLOCK hash: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
  Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used
  ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used



Re: Out of memory on SELECT (from sort?) in 8.3

From
Tom Lane
Date:
"Matt Magoffin" <postgresql.org@msqr.us> writes:
> I have a SELECT query that causes an out-of-memory error on my production
> Postgres 8.3 server.

8.3.which, and what exactly is the query and its EXPLAIN plan?

> I believe the culprit is from the sort in the query,

No, the problem seems to be here

>       ExecutorState: 841031232 total in 51159 blocks; 1712232 free (56
> chunks); 839319000 used

and it's impossible to guess what that's about without a lot more
details than you provided.  The sort's only eating 28MB:

>         TupleSort: 28303408 total in 13 blocks; 5346944 free (14 chunks);
> 22956464 used

            regards, tom lane

Re: Out of memory on SELECT (from sort?) in 8.3

From
Alvaro Herrera
Date:
Matt Magoffin wrote:
> I have a SELECT query that causes an out-of-memory error on my production
> Postgres 8.3 server. I believe the culprit is from the sort in the query,
> but its running out on queries that return 300,000 results which I didn't
> think was very large.

It's not the sort.  There's a memory leak in the system somewhere.  Can
you post a reproducible test case?  (If you can do it in the next couple
of days, there's a chance the fix will make it to 8.3.4).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Out of memory on SELECT (from sort?) in 8.3

From
"Matt Magoffin"
Date:
>> I have a SELECT query that causes an out-of-memory error on my
>> production
>> Postgres 8.3 server.
>
> 8.3.which, and what exactly is the query and its EXPLAIN plan?
>
>> I believe the culprit is from the sort in the query,
>
> No, the problem seems to be here
>
>>       ExecutorState: 841031232 total in 51159 blocks; 1712232 free (56
>> chunks); 839319000 used
>
> and it's impossible to guess what that's about without a lot more
> details than you provided.

I see, I was looking at the log entries wrong, sorry. This is Postgres
8.3.3 + an xml.c patch (from Kris -
http://archives.postgresql.org/pgsql-general/2008-07/msg00110.php) to fix
a memory leak there. The EXPLAIN for this query was this:

             QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=238861.67..239149.85 rows=115274 width=70)
   Sort Key: id
   ->  Bitmap Heap Scan on lead  (cost=3828.21..229170.17 rows=115274
width=70)
         Recheck Cond: (((date(timezone('-06:00:00'::interval, created))
>= '2008-08-15'::date) AND (date(timezone('-06:00:00'::interval,
created)) <= '2008-08-15'::date)) OR
((date(timezone('-06:00:00'::interval, modified)) >=
'2008-08-15'::date) AND (date(timezone('-06:00:00'::interval,
modified)) <= '2008-08-15'::date)))
         ->  BitmapOr  (cost=3828.21..3828.21 rows=115274 width=0)
               ->  Bitmap Index Scan on reporting_date_idx
(cost=0.00..4.91 rows=1 width=0)
                     Index Cond: ((date(timezone('-06:00:00'::interval,
created)) >= '2008-08-15'::date) AND
(date(timezone('-06:00:00'::interval, created)) <=
'2008-08-15'::date))
               ->  Bitmap Index Scan on reporting_modified_idx
(cost=0.00..3765.67 rows=115274 width=0)
                     Index Cond: ((date(timezone('-06:00:00'::interval,
modified)) >= '2008-08-15'::date) AND
(date(timezone('-06:00:00'::interval, modified)) <=
'2008-08-15'::date))
(9 rows)

This particular query fails with this out-of-memory error when the result
size is about 300k or so.

I'm also seeing this TopMemoryContext output in the logs from some other
situations (sorry I'm not familiar with what/how this gets logged), like
this autovacuum task that appears to dump this into the log:

2008-07-17 00:19:26 CDT  ERROR:  canceling autovacuum task
2008-07-17 00:19:26 CDT  CONTEXT:  automatic vacuum of table
"lms_infiniti._lms.sl_log_2"
TopMemoryContext: 129952 total in 15 blocks; 32088 free (139 chunks);
97864 used
  LibxmlContext: 8380416 total in 10 blocks; 3123904 free (0 chunks);
5256512 used
  TopTransactionContext: 8192 total in 1 blocks; 7648 free (0 chunks); 544
used
  PL/PgSQL function context: 8192 total in 1 blocks; 4400 free (3 chunks);
3792 used
  PL/PgSQL function context: 8192 total in 1 blocks; 5512 free (4 chunks);
2680 used
  Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks);
12688 used
  Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks);
12688 used
  Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512
used
  PL/PgSQL function context: 24576 total in 2 blocks; 21000 free (10
chunks); 3576 used
  CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  Rendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 chunks);
6512 used
  PLpgSQL function cache: 24224 total in 2 blocks; 3744 free (0 chunks);
20480 used
  Record information cache: 24576 total in 2 blocks; 15984 free (5
chunks); 8592 used
  Prepared Queries: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
  MessageContext: 8192 total in 1 blocks; 6976 free (1 chunks); 1216 used
  smgr relation table: 24576 total in 2 blocks; 9776 free (4 chunks);
14800 used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks);
32 used
  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PortalMemory: 8192 total in 1 blocks; 7616 free (0 chunks); 576 used
    PortalHeapMemory: 2048 total in 1 blocks; 360 free (0 chunks); 1688 used
      ExecutorState: 1019641424 total in 92998 blocks; 6834024 free (66
chunks); 1012807400 used
        TIDBitmap: 4186112 total in 9 blocks; 1774784 free (26 chunks);
2411328 used
        TupleSort: 31449136 total in 13 blocks; 3451184 free (12 chunks);
27997952 used
        ExprContext: 8192 total in 1 blocks; 8096 free (0 chunks); 96 used
        ExprContext: 8192 total in 1 blocks; 8096 free (0 chunks); 96 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
    PortalHeapMemory: 1024 total in 1 blocks; 944 free (0 chunks); 80 used
  Relcache by OID: 24576 total in 2 blocks; 12832 free (3 chunks); 11744 used
  CacheMemoryContext: 1341680 total in 21 blocks; 620264 free (1633
chunks); 721416 used
    pg_toast_16580_index: 2048 total in 1 blocks; 608 free (0 chunks);
1440 used
    CachedPlan: 15360 total in 4 blocks; 1528 free (0 chunks); 13832 used
    CachedPlanSource: 15360 total in 4 blocks; 1440 free (0 chunks); 13920
used
    reporting_modified_idx: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    reporting_last_processed_date_idx: 2048 total in 1 blocks; 752 free (0
chunks); 1296 used
    reporting_date_idx: 2048 total in 1 blocks; 704 free (0 chunks); 1344
used
    lead_created_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lead_processing_step_idx: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    lead_destination_idx: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    lead_modified_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lead_source_idx: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    processing_state_idx: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    lead_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    unnamed prepared statement: 8192 total in 1 blocks; 3856 free (2
chunks); 4336 used
    locks_pkey: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
    CachedPlan: 3072 total in 2 blocks; 648 free (0 chunks); 2424 used
    pg_language_oid_index: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    xslt_style_sheet_pkey: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    offline_report_key_key: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    offline_report_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296
used
    pg_type_typname_nsp_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
    pg_statistic_relid_att_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
    pg_amop_opr_fam_index: 2048 total in 1 blocks; 608 free (0 chunks);
1440 used
    external_system_user_name_key: 2048 total in 1 blocks; 752 free (0
chunks); 1296 used
    external_system_name_key: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    external_system_pkey: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
    pg_index_indrelid_index: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    pg_cast_source_target_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
    pg_operator_oprname_l_r_n_index: 2048 total in 1 blocks; 392 free (0
chunks); 1656 used
    pg_class_relname_nsp_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
    pg_type_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
    pg_proc_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
    pg_proc_proname_args_nsp_index: 2048 total in 1 blocks; 584 free (0
chunks); 1464 used
    pg_namespace_oid_index: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
    pg_namespace_nspname_index: 2048 total in 1 blocks; 704 free (0
chunks); 1344 used
    CachedPlan: 3072 total in 2 blocks; 1160 free (0 chunks); 1912 used
    CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used
    CachedPlanSource: 1024 total in 1 blocks; 56 free (0 chunks); 968 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
    CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
    CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
    CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
    CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
    CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 3072 total in 2 blocks; 1984 free (0 chunks); 1088 used
    CachedPlanSource: 3072 total in 2 blocks; 1792 free (1 chunks); 1280 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 3072 total in 2 blocks; 1984 free (0 chunks); 1088 used
    CachedPlanSource: 3072 total in 2 blocks; 1792 free (1 chunks); 1280 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 3072 total in 2 blocks; 1984 free (0 chunks); 1088 used
    CachedPlanSource: 3072 total in 2 blocks; 1792 free (1 chunks); 1280 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 3072 total in 2 blocks; 1984 free (0 chunks); 1088 used
    CachedPlanSource: 3072 total in 2 blocks; 1792 free (1 chunks); 1280 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used
    CachedPlanSource: 1024 total in 1 blocks; 56 free (0 chunks); 968 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
    CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlanSource: 1024 total in 1 blocks; 24 free (0 chunks); 1000 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used
    CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used
    CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used
    CachedPlanSource: 1024 total in 1 blocks; 56 free (0 chunks); 968 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlanSource: 3072 total in 2 blocks; 1248 free (1 chunks); 1824 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 1024 total in 1 blocks; 816 free (0 chunks); 208 used
    CachedPlanSource: 1024 total in 1 blocks; 664 free (1 chunks); 360 used
    pg_opclass_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
    pg_amop_fam_strat_index: 3072 total in 2 blocks; 1384 free (2 chunks);
1688 used
    pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
    pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
    pg_class_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376
used
    pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1384 free (2
chunks); 1688 used
    pg_index_indexrelid_index: 3072 total in 2 blocks; 1696 free (2
chunks); 1376 used
    pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
    pg_operator_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
  MdSmgr: 8192 total in 1 blocks; 7296 free (54 chunks); 896 used
  LOCALLOCK hash: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
  Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used
  ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used

If there is any other information I can provide, please let me know.

-- m@


Re: Out of memory on SELECT (from sort?) in 8.3

From
"Matt Magoffin"
Date:
> It's not the sort.  There's a memory leak in the system somewhere.  Can
> you post a reproducible test case?  (If you can do it in the next couple
> of days, there's a chance the fix will make it to 8.3.4).

I'm not sure how to go about creating a test case for this. I have had a
nagging suspicion about a memory leak somewhere in this system (starting
back with this thread:
http://archives.postgresql.org/pgsql-general/2008-07/msg00048.php) but I
really don't have any ideas on were to even begin with creating a test
case. I did have a small test case with that thread
(http://archives.postgresql.org/pgsql-general/2008-07/msg00063.php) but
that was based on my guess about the XML stuff causing the leak. I have no
real proof of this, however, nor do I have any idea of a way to narrow
down the problem.

-- m@

Re: Out of memory on SELECT (from sort?) in 8.3

From
Tom Lane
Date:
"Matt Magoffin" <postgresql.org@msqr.us> writes:
> I see, I was looking at the log entries wrong, sorry. This is Postgres
> 8.3.3 + an xml.c patch (from Kris -
> http://archives.postgresql.org/pgsql-general/2008-07/msg00110.php) to fix
> a memory leak there. The EXPLAIN for this query was this:

Is this particular query using xml features, and if so which?  Actually,
could you just show us the whole query and the schemas of the tables
it's using?

> I'm also seeing this TopMemoryContext output in the logs from some other
> situations (sorry I'm not familiar with what/how this gets logged), like
> this autovacuum task that appears to dump this into the log:

The memory context dump happens when we're about to report an "out of
memory" error, so actually the relevant ERROR line would come after the
context dump not before.  I'm supposing it's another instance of the
same problem, since the main bloat seems to be here:

>       ExecutorState: 1019641424 total in 92998 blocks; 6834024 free (66
> chunks); 1012807400 used
>         TIDBitmap: 4186112 total in 9 blocks; 1774784 free (26 chunks);
> 2411328 used
>         TupleSort: 31449136 total in 13 blocks; 3451184 free (12 chunks);
> 27997952 used

Thinking about it, it's actually a bit hard to see how it could be an
xml.c problem --- AFAIR that code should only be running in per-tuple
contexts.  A leak at the ExecutorState level might perhaps have
something to do with misbehaving aggregates, or some such.  Anyway we
really need an example query to test in order to find it.

            regards, tom lane

Re: Out of memory on SELECT (from sort?) in 8.3

From
"Matt Magoffin"
Date:
> Is this particular query using xml features, and if so which?  Actually,
> could you just show us the whole query and the schemas of the tables
> it's using?

No, other than returning a column of type xml. The query for the EXPLAIN
output I posted before is this:

select
    lead.id as leadId,
    lead.xml as xml,
    lead.processing_state as processingState,
    lead.processing_step as processingStep,
    lead.processing_attempts as processingAttempts,
    lead.created as createdDate,
    lead.last_processed as lastProcessedDate
from lead lead
where
    (date(lead.created at time zone interval '-06')
        between date('2008-08-15') and date('2008-08-15'))
    or
    (date(lead.modified at time zone interval '-06')
        between date('2008-08-15') and date('2008-08-15'))
    order by lead.id

And the schema of the lead table is this:

lms_nna=# \d lead
                    Table "public.lead"
       Column        |           Type           | Modifiers
---------------------+--------------------------+-----------
 id                  | bigint                   | not null
 xml                 | xml                      |
 source              | character varying(50)    | not null
 destination         | character varying(50)    |
 processing_state    | character varying(20)    | not null
 created             | timestamp with time zone | not null
 modified            | timestamp with time zone | not null
 last_processed      | timestamp with time zone |
 processing_step     | integer                  |
 processing_attempts | integer                  |
Indexes:
    "lead_pkey" PRIMARY KEY, btree (id), tablespace "lms_index"
    "lead_created_idx" btree (created), tablespace "lms_index"
    "lead_destination_idx" btree (destination), tablespace "lms_index"
    "lead_modified_idx" btree (modified), tablespace "lms_index"
    "lead_processing_step_idx" btree (processing_step), tablespace
"lms_index"
    "lead_source_idx" btree (source), tablespace "lms_index"
    "processing_state_idx" btree (processing_state), tablespace "lms_index"
    "reporting_date_idx" btree (date(timezone('-06:00:00'::interval,
created))), tablespace "lms_index"
    "reporting_last_processed_date_idx" btree
(date(timezone('-06:00:00'::interval, last_processed))), tablespace
"lms_index"
    "reporting_modified_idx" btree (date(timezone('-06:00:00'::interval,
modified)))
Foreign-key constraints:
    "fk_lead_destination" FOREIGN KEY (destination) REFERENCES
external_system(name) ON UPDATE RESTRICT ON DELETE RESTRICT
    "fk_lead_source" FOREIGN KEY (source) REFERENCES external_system(name)
ON UPDATE RESTRICT ON DELETE RESTRICT
Triggers:
    _lms_logtrigger_11 AFTER INSERT OR DELETE OR UPDATE ON lead FOR EACH
ROW EXECUTE PROCEDURE _lms.logtrigger('_lms', '11', 'kvvvvvvvvv')
    create_lead_reporting_data AFTER INSERT OR UPDATE ON lead FOR EACH ROW
EXECUTE PROCEDURE update_lead_reporting_data()
    set_modified BEFORE INSERT OR UPDATE ON lead FOR EACH ROW EXECUTE
PROCEDURE update_modified()

Would you also like the schemas for the referenced FOREIGN KEY tables? You
can see we have some insert/update triggers there, which I can provide the
source for if you need (update_lead_reporting_data() is quite large,
though).

-- m@



Re: Out of memory on SELECT (from sort?) in 8.3

From
Tom Lane
Date:
"Matt Magoffin" <postgresql.org@msqr.us> writes:
>> Is this particular query using xml features, and if so which?  Actually,
>> could you just show us the whole query and the schemas of the tables
>> it's using?

> No, other than returning a column of type xml.

Huh, I was overthinking the problem.  xml_out has a memory leak :-(

More than one in fact: a large leak (equal to size of output) in
the executor context (hence query lifespan) and a small one in
LibxmlContext (hence transaction lifespan).

Try the patch here:
http://archives.postgresql.org/pgsql-committers/2008-09/msg00159.php

(In the long run it might be better if we didn't run datatype output
functions in a query-lifespan context ...)

            regards, tom lane

Re: Out of memory on SELECT (from sort?) in 8.3

From
"Matt Magoffin"
Date:
> Huh, I was overthinking the problem.  xml_out has a memory leak :-(
>
> More than one in fact: a large leak (equal to size of output) in
> the executor context (hence query lifespan) and a small one in
> LibxmlContext (hence transaction lifespan).
>
> Try the patch here:
> http://archives.postgresql.org/pgsql-committers/2008-09/msg00159.php

I've applied this patch now to our staging and production environments,
and Postgres performed quite well after testing very large result sets
that were previously failing with out of memory errors... so the patch is
successful for this problem I was having. I didn't see any significant
increase in memory over the course of the query execution, in fact.

Thanks!

-- m@

Re: Out of memory on SELECT (from sort?) in 8.3

From
Tom Lane
Date:
"Matt Magoffin" <postgresql.org@msqr.us> writes:
>> Try the patch here:
>> http://archives.postgresql.org/pgsql-committers/2008-09/msg00159.php

> I've applied this patch now to our staging and production environments,
> and Postgres performed quite well after testing very large result sets
> that were previously failing with out of memory errors... so the patch is
> successful for this problem I was having. I didn't see any significant
> increase in memory over the course of the query execution, in fact.

Good to hear.  That patch is baked into 8.3.4, if you want to go back to
using a stock release...

            regards, tom lane