Re: Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows - Mailing list pgsql-hackers
From | Kouhei Kaigai |
---|---|
Subject | Re: Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows |
Date | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F801134EB9@BPXM15GP.gisp.nec.co.jp |
In response to | Re: Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows (David Rowley <david.rowley@2ndquadrant.com>) |
Responses | Re: Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows |
List | pgsql-hackers |
I reported a similar issue before.

* Re: DBT-3 with SF=20 got failed
  http://www.postgresql.org/message-id/557A19D1.9050107@2ndquadrant.com

I didn't get a server crash at that time; instead, palloc() complained
about a request size >= 1GB. So we may need a broader overhaul around
HashJoin to support large volumes of data, not only the nbuckets values
around 0x80000000.

We also need to pay attention to the reliability of the planner's scale
estimation. Even though the plan below says the join generates
60521928028 rows, it actually generates 776157676 rows (about 1.3% of
the estimate).

tpcds100=# EXPLAIN ANALYZE select ws1.ws_order_number,ws1.ws_warehouse_sk wh1,ws2.ws_warehouse_sk wh2
           from web_sales ws1, web_sales ws2
           where ws1.ws_order_number = ws2.ws_order_number
             and ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk;
                                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=25374644.08..1160509591.61 rows=60521928028 width=24) (actual time=138347.979..491889.343 rows=776157676 loops=1)
   Merge Cond: (ws1.ws_order_number = ws2.ws_order_number)
   Join Filter: (ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)
   Rows Removed by Join Filter: 127853313
   ->  Sort  (cost=12687322.04..12867325.16 rows=72001248 width=16) (actual time=73252.300..79017.420 rows=72001237 loops=1)
         Sort Key: ws1.ws_order_number
         Sort Method: quicksort  Memory: 7083296kB
         ->  Seq Scan on web_sales ws1  (cost=0.00..3290612.48 rows=72001248 width=16) (actual time=0.023..39951.201 rows=72001237 loops=1)
   ->  Sort  (cost=12687322.04..12867325.16 rows=72001248 width=16) (actual time=65095.655..128885.811 rows=904010978 loops=1)
         Sort Key: ws2.ws_order_number
         Sort Method: quicksort  Memory: 7083296kB
         ->  Seq Scan on web_sales ws2  (cost=0.00..3290612.48 rows=72001248 width=16) (actual time=0.014..31046.888 rows=72001237 loops=1)
 Planning time: 0.232 ms
 Execution time: 530176.521 ms
(14 rows)

So, even if we allow nodeHash.c to allocate hash buckets larger than
1GB, the initial size should be determined carefully. Probably, 1GB is
a good starting point even if the table is expanded later.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of David Rowley
> Sent: Wednesday, August 19, 2015 10:07 AM
> To: Tom Lane
> Cc: Kevin Grittner; Kaigai Kouhei(海外 浩平); pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Bug? ExecChooseHashTableSize() got assertion failed with
> crazy number of rows
>
> On 19 August 2015 at 12:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > David Rowley <david.rowley@2ndquadrant.com> writes:
> > > david=# set work_mem = '94GB';
> > > ERROR:  98566144 is outside the valid range for parameter "work_mem"
> > > (64 .. 2097151)
> >
> > Apparently you're testing on a 32-bit server.  64-bit servers allow
> > work_mem to go up to INT_MAX kilobytes.
>
> hmm, no, but it does appear that sizeof(long) is 4 bytes for me, despite
> 64 bit OS.
>
> I guess this accounts for my misunderstanding that work_mem was limited
> to 2GB even on 64 bit machines.
>
> From guc.h
>
> #if SIZEOF_SIZE_T > 4 && SIZEOF_LONG > 4
> #define MAX_KILOBYTES INT_MAX
> #else
> #define MAX_KILOBYTES (INT_MAX / 1024)
> #endif
>
> Apologies for the noise.
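For reference, the numbers in the quoted error message line up with that
guc.h cap. A quick standalone check of the arithmetic (a hypothetical
program, not PostgreSQL source):

/*
 * '94GB' expressed in kilobytes is 98566144, which exceeds the
 * MAX_KILOBYTES cap of INT_MAX/1024 = 2097151 kB that guc.h applies
 * when sizeof(long) == 4.
 */
#include <stdio.h>
#include <limits.h>

int main(void)
{
    long long kb_requested = 94LL * 1024 * 1024;  /* '94GB' in kB = 98566144 */
    long long kb_cap_32bit = INT_MAX / 1024;      /* MAX_KILOBYTES = 2097151 */

    printf("94GB = %lld kB, 32-bit-long cap = %lld kB -> %s\n",
           kb_requested, kb_cap_32bit,
           kb_requested > kb_cap_32bit ? "rejected" : "accepted");
    return 0;
}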
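Returning to the initial bucket sizing suggested above: a minimal sketch
of the clamping idea, assuming a 64-bit build with 8-byte bucket
pointers. This is illustrative only, not the actual nodeHash.c code;
MAX_INITIAL_BUCKET_MEM and choose_initial_nbuckets() are hypothetical
names.

/*
 * Pick an initial hash bucket count from the planner's row estimate,
 * but clamp the bucket array itself to 1GB and rely on run-time
 * expansion beyond that.
 */
#include <stddef.h>

#define MAX_INITIAL_BUCKET_MEM  ((size_t) 1 << 30)  /* 1GB cap */

typedef void *HashJoinTuple;    /* stand-in for the real per-bucket pointer */

size_t
choose_initial_nbuckets(double ntuples)
{
    /* 1GB / 8 bytes per bucket = 2^27 buckets on a 64-bit build */
    size_t  max_buckets = MAX_INITIAL_BUCKET_MEM / sizeof(HashJoinTuple);
    size_t  nbuckets = 1;

    if (ntuples < 1.0)
        ntuples = 1.0;
    /* round up to a power of 2, but never past the 1GB cap */
    while (nbuckets < max_buckets && (double) nbuckets < ntuples)
        nbuckets <<= 1;
    return nbuckets;    /* the hash table can still grow at run time */
}

With 8-byte bucket pointers the cap works out to 2^27 buckets, far below
the 0x80000000 value that tripped the assertion, while run-time
expansion covers cases where the planner underestimates.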