Thread: A problem with the IN clause
To all,

We are running postgresql 7.4.1 on a G5 with dual procs, OSX 10.3.3 server, 8GB mem, attached to a fully configured 3.5TB XRaid box via fibre channel.

I think we have run into this issue before but I thought the code was fixed. :-(

I have the following SQL:

SELECT subage, COUNT( DISTINCT (t1.session_key)), COUNT( DISTINCT(t1.userid_key))
FROM f_pageviews t1 JOIN segmented_sub t0 ON (t1.userid_key = t0.id)
WHERE t1.date_key BETWEEN 496 AND 502
AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)
GROUP BY 1

When I run this against our warehouse instance I get an out of memory error. If I remove the

AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)

portion it runs fine.

We had a problem in the past with the IN clause having a small number of values and some sort of hash table size allocation being called with an enormous value, which caused the individual postgresql process to run out of memory. In this case we have 11 strings, each 8 characters in length, in the SELECT newsletterid FROM t_newscontentstatic result set.

This query has been running fine for weeks. The only change has been the increase in the number of values returned in the IN clause result set; it goes up by 1 or 2 each week. I ran this query by hand, populating the IN clause with 4 of the 11 strings, and it ran just fine.

I have included the relevant postgresql logfile and the successful and unsuccessful explain analyze outputs below. I know Tom Lane was kind enough to take a look last time we had this issue and we would be more than happy to let him into our system again to debug. :-)

Thanks.

--sean

=============================================================
=============================================================
explain analyze for the failed run:

tripmaster=# explain analyze SELECT subage, COUNT( DISTINCT (t1.session_key)), COUNT( DISTINCT(t1.userid_key))
tripmaster-# FROM f_pageviews t1 JOIN segmented_sub t0 ON (t1.userid_key = t0.id)
tripmaster-# WHERE t1.date_key BETWEEN 496 AND 502
tripmaster-# AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)
tripmaster-# GROUP BY 1;
                                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=585501.75..585501.77 rows=1 width=12) (actual time=192050.482..192055.137 rows=14 loops=1)
   ->  Sort  (cost=585501.75..585501.75 rows=1 width=12) (actual time=192046.408..192047.231 rows=3214 loops=1)
         Sort Key: t0.subage
         ->  Nested Loop  (cost=585495.82..585501.74 rows=1 width=12) (actual time=189303.056..192042.130 rows=3214 loops=1)
               ->  Hash Join  (cost=585495.82..585495.87 rows=1 width=8) (actual time=189253.209..189384.639 rows=6691 loops=1)
                     Hash Cond: ("outer".newsletterid = "inner".newsletterid_key)
                     ->  HashAggregate  (cost=9.10..9.10 rows=8 width=12) (actual time=61.695..61.726 rows=11 loops=1)
                           ->  Seq Scan on t_newscontentstatic  (cost=0.00..9.08 rows=8 width=12) (actual time=16.280..61.660 rows=11 loops=1)
                     ->  Hash  (cost=585486.72..585486.72 rows=1 width=24) (actual time=189190.574..189190.574 rows=0 loops=1)
                           ->  Seq Scan on f_pageviews t1  (cost=0.00..585486.72 rows=1 width=24) (actual time=60502.415..-463715.543 rows=24422838 loops=1)
                                 Filter: ((date_key >= 496) AND (date_key <= 502))
               ->  Index Scan using idx_segmented_sub on segmented_sub t0  (cost=0.00..5.85 rows=1 width=8) (actual time=0.391..0.394 rows=0 loops=6691)
                     Index Cond: ("outer".userid_key = t0.id)
 Total runtime: 194187.416 ms
(14 rows)

=============================================================
=============================================================
explain analyze from a successful run:

tripmaster=# explain analyze SELECT subage, COUNT( DISTINCT (t1.session_key)), COUNT( DISTINCT(t1.userid_key))
tripmaster-# FROM f_pageviews t1 JOIN segmented_sub t0 ON (t1.userid_key = t0.id)
tripmaster-# WHERE t1.date_key BETWEEN 496 AND 502
tripmaster-# AND t1.newsletterid_key IN ( '20040429', '20040330', '20040506', '20040520')
tripmaster-# GROUP BY 1;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=876145.08..876145.09 rows=1 width=12) (actual time=194282.825..194283.361 rows=11 loops=1)
   ->  Sort  (cost=876145.08..876145.08 rows=1 width=12) (actual time=194281.394..194281.579 rows=999 loops=1)
         Sort Key: t0.subage
         ->  Nested Loop  (cost=0.00..876145.07 rows=1 width=12) (actual time=66567.739..194279.092 rows=999 loops=1)
               ->  Seq Scan on f_pageviews t1  (cost=0.00..876139.20 rows=1 width=8) (actual time=66124.874..192232.450 rows=1961 loops=1)
                     Filter: ((date_key >= 496) AND (date_key <= 502) AND ((newsletterid_key = '20040429'::text) OR (newsletterid_key = '20040330'::text) OR (newsletterid_key = '20040506'::text) OR (newsletterid_key = '20040520'::text)))
               ->  Index Scan using idx_segmented_sub on segmented_sub t0  (cost=0.00..5.85 rows=1 width=8) (actual time=1.031..1.033 rows=1 loops=1961)
                     Index Cond: ("outer".userid_key = t0.id)
 Total runtime: 194283.802 ms
(9 rows)

=============================================================
=============================================================
capture of relevant part of postgresql logfile

2004-05-18 01:47:43 LOG: recycled transaction log file "0000041800000007"
2004-05-18 01:48:15 LOG: recycled transaction log file "0000041800000008"
2004-05-18 01:48:47 LOG: recycled transaction log file "0000041800000009"
2004-05-18 01:48:47 LOG: recycled transaction log file "000004180000000A"
2004-05-18 01:49:52 LOG: recycled transaction log file "000004180000000B"
2004-05-18 01:50:24 LOG: recycled transaction log file "000004180000000C"
2004-05-18 01:50:24 LOG: recycled transaction log file "000004180000000D"
2004-05-18 01:50:56 LOG: recycled transaction log file "000004180000000E"
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=1048576) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
TopMemoryContext: 40960 total in 4 blocks; 10336 free (7 chunks); 30624 used
TopTransactionContext: 8192 total in 1 blocks; 8136 free (0 chunks); 56 used
DeferredTriggerXact: 0 total in 0 blocks; 0 free (0 chunks); 0 used
MessageContext: 57344 total in 3 blocks; 4688 free (3 chunks); 52656 used
PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used
PortalHeapMemory: 1024 total in 1 blocks; 912 free (0 chunks); 112 used
ExecutorState: 24576 total in 2 blocks; 8552 free (2 chunks); 16024 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: -978331468 total in 414 blocks; 44200 free (788 chunks); -978375668 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 364112 total in 1 blocks; 0 free (0 chunks); 364112 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
AggContext: 8192 total in 1 blocks; 7968 free (0 chunks); 224 used
DynaHashTable: 8192 total in 1 blocks; 6112 free (0 chunks); 2080 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
AggContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
CacheMemoryContext: 1040384 total in 7 blocks; 463824 free (2 chunks); 576560 used
t_newscontentstatic_pkey: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_pageviews_primary: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
idx_segmented_sub: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_segmented_sub: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_toast_16408_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
d_userid_user_id_key: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
d_userid_pkey: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
t_user_email_key: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
t_user_pkey: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_user_country: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_user_status: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_description_o_c_o_index: 2048 total in 1 blocks; 768 free (0 chunks); 1280 used
pg_depend_depender_index: 2048 total in 1 blocks; 768 free (0 chunks); 1280 used
pg_depend_reference_index: 2048 total in 1 blocks; 768 free (0 chunks); 1280 used
d_date_pkey: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_date_5: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
idx_date_4: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_date_3: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_date_2: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_date_1: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_index_indrelid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_attrdef_adrelid_adnum_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_conversion_default_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_group_sysid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_proc_proname_args_nsp_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_namespace_nspname_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_conversion_name_nsp_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_inherits_relid_seqno_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_amop_opr_opc_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_amproc_opc_procnum_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_shadow_usename_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_amop_opc_strategy_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_conversion_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_cast_source_target_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_class_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_language_name_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_proc_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_statistic_relid_att_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_class_relname_nsp_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_language_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_opclass_am_name_nsp_index: 2048 total in 1 blocks; 768 free (0 chunks); 1280 used
pg_type_typname_nsp_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_type_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_aggregate_fnoid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_namespace_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_group_name_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_operator_oprname_l_r_n_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_operator_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_opclass_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_shadow_usesysid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
MdSmgr: 8192 total in 1 blocks; 5928 free (0 chunks); 2264 used
DynaHash: 8192 total in 1 blocks; 6912 free (0 chunks); 1280 used
DynaHashTable: 8192 total in 1 blocks; 2008 free (0 chunks); 6184 used
DynaHashTable: 8192 total in 1 blocks; 5080 free (0 chunks); 3112 used
DynaHashTable: 8192 total in 1 blocks; 2008 free (0 chunks); 6184 used
DynaHashTable: 8192 total in 1 blocks; 1984 free (0 chunks); 6208 used
DynaHashTable: 8192 total in 1 blocks; 3520 free (0 chunks); 4672 used
DynaHashTable: 24576 total in 2 blocks; 13240 free (4 chunks); 11336 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
2004-05-18 02:04:53 ERROR: out of memory
DETAIL: Failed on request of size 77.
2004-05-18 02:04:57 ERROR: current transaction is aborted, commands ignored until end of transaction block
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
Sean Shanny <shannyconsulting@earthlink.net> writes:
> When I run this against our warehouse instance I get an out of memory
> error. If I remove the
> AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)
> portion it runs fine.

I think the problem is not there at all, but with drastic underestimation
of the number of rows coming from f_pageviews:

>    ->  Seq Scan on f_pageviews t1  (cost=0.00..585486.72 rows=1 width=24)
>        (actual time=60502.415..-463715.543 rows=24422838 loops=1)
>          Filter: ((date_key >= 496) AND (date_key <= 502))

The plan you say is failing is trying to load this result into a
hashtable ... and since it's only expecting 1 row, it's not going to
try to partition the hashtable or anything like that.

Are your ANALYZE stats for f_pageviews up to date? Perhaps you need to
increase the stats target for date_key to get more resolution in the
stats.

			regards, tom lane
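For reference, the stats-target bump Tom describes would look roughly like the following (the table and column names come from the query above; the target value of 100 is only an illustrative choice, not something given in this thread):

    -- Raise the per-column statistics target and refresh the stats.
    ALTER TABLE f_pageviews ALTER COLUMN date_key SET STATISTICS 100;
    ANALYZE f_pageviews;

    -- Then re-check the planner's row estimate for the date filter alone:
    EXPLAIN SELECT 1 FROM f_pageviews WHERE date_key BETWEEN 496 AND 502;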
Tom,
It does not make sense, then, that the smaller set of values in the IN clause would work, does it?
We took a look at the code we compiled from, and indeed the patch you provided was not applied; our fault. The patch is in the 7.4.2 code base. We upgraded today and I will be running the query again to see what happens.
You were right on the analyze; we do that infrequently, as it takes a whole bunch of time over this much data. Something to cron in the middle of the night, I think.
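A nightly job could be as simple as having cron feed psql something along these lines (table names are taken from the query above; which tables actually need a nightly refresh, and the schedule itself, are left open):

    -- Hypothetical nightly maintenance, run from cron via psql:
    ANALYZE f_pageviews;
    ANALYZE segmented_sub;
    ANALYZE t_newscontentstatic;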
Thanks.
--sean
Tom Lane wrote:
> Sean Shanny <shannyconsulting@earthlink.net> writes:
>> When I run this against our warehouse instance I get an out of memory
>> error. If I remove the
>> AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)
>> portion it runs fine.
>
> I think the problem is not there at all, but with drastic underestimation
> of the number of rows coming from f_pageviews:
>
>>    ->  Seq Scan on f_pageviews t1  (cost=0.00..585486.72 rows=1 width=24)
>>        (actual time=60502.415..-463715.543 rows=24422838 loops=1)
>>          Filter: ((date_key >= 496) AND (date_key <= 502))
>
> The plan you say is failing is trying to load this result into a
> hashtable ... and since it's only expecting 1 row, it's not going to
> try to partition the hashtable or anything like that.
>
> Are your ANALYZE stats for f_pageviews up to date? Perhaps you need to
> increase the stats target for date_key to get more resolution in the
> stats.
>
> 			regards, tom lane
Sean Shanny <shannyconsulting@earthlink.net> writes:
>> I think the problem is not there at all, but with drastic
>> underestimation of the number of rows coming from f_pageviews:

> It does not make sense, then, that the smaller set of values in the IN
> clause would work, does it?

Look at your two plans. In one, the set of rows extracted from
f_pageviews is loaded into a hashtable; in the other, it's not. This is
exactly the sort of plan changeover that I'd expect to happen given a
change in the selectivity of a WHERE clause. The fact that the problem
appears or disappears depending on how you change the IN clause doesn't
mean that the IN clause itself is where the problem is. And certainly an
11-row IN clause subselect isn't going to run anything out of memory, so
it's pretty implausible that this failure is coming from right there.

You should probably check with plain EXPLAIN that the production
database is generating these same plans, but based on the dev machine's
EXPLAIN ANALYZE results I don't see where else the out-of-memory could
be coming from than the hashtable for f_pageviews.

			regards, tom lane
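That sanity check is just the original query run under plain EXPLAIN on the production instance; a sketch, with the query text copied from the failing run above:

    EXPLAIN
    SELECT subage, COUNT( DISTINCT (t1.session_key)), COUNT( DISTINCT(t1.userid_key))
    FROM f_pageviews t1 JOIN segmented_sub t0 ON (t1.userid_key = t0.id)
    WHERE t1.date_key BETWEEN 496 AND 502
    AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)
    GROUP BY 1;
    -- If production also shows the Hash node fed by "Seq Scan on f_pageviews ... rows=1",
    -- it is likely to hit the same single-batch hashtable blowup.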