Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F8011347B1@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
Responses Re: Bug? ExecChooseHashTableSize() got assertion failed with crazy number of rows  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-hackers
Hello,

I noticed ExecChooseHashTableSize() in nodeHash.c got failed by
Assert(nbuckets > 0), when crazy number of rows are expected.

BACKTRACE:

#0  0x0000003f79432625 in raise () from /lib64/libc.so.6
#1  0x0000003f79433e05 in abort () from /lib64/libc.so.6
#2  0x000000000092600a in ExceptionalCondition (conditionName=0xac1ea0 "!(nbuckets > 0)",
    errorType=0xac1d88 "FailedAssertion", fileName=0xac1d40 "nodeHash.c", lineNumber=545) at assert.c:54
#3  0x00000000006851ff in ExecChooseHashTableSize (ntuples=60521928028, tupwidth=8, useskew=1 '\001',
    numbuckets=0x7fff146bff04, numbatches=0x7fff146bff00, num_skew_mcvs=0x7fff146bfefc) at nodeHash.c:545
#4  0x0000000000701735 in initial_cost_hashjoin (root=0x253a318, workspace=0x7fff146bffc0, jointype=JOIN_SEMI,
    hashclauses=0x257e4f0, outer_path=0x2569a40, inner_path=0x2569908, sjinfo=0x2566f40, semifactors=0x7fff146c0168)
    at costsize.c:2592
#5  0x000000000070e02a in try_hashjoin_path (root=0x253a318, joinrel=0x257d940, outer_path=0x2569a40,
inner_path=0x2569908,
    hashclauses=0x257e4f0, jointype=JOIN_SEMI, extra=0x7fff146c0150) at joinpath.c:543


See the following EXPLAIN output, at the configuration without --enable-cassert.
Planner expects 60.5B rows towards the self join by a relation with 72M rows.
(Probably, this estimation is too much.)

[kaigai@ayu ~]$ (echo EXPLAIN; cat ~/tpcds/query95.sql) | psql tpcds100
                                                                                       QUERY PLAN
     

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=9168667273.07..9168667273.08 rows=1 width=20)
   CTE ws_wh
     ->  Custom Scan (GpuJoin)  (cost=3342534.49..654642911.88 rows=60521928028 width=24)
           Bulkload: On (density: 100.00%)
           Depth 1: Logic: GpuHashJoin, HashKeys: (ws_order_number), JoinQual: ((ws_warehouse_sk <> ws_warehouse_sk)
AND(ws_order_number = ws_order_number)), nrows (ratio: 84056.77%) 
           ->  Custom Scan (BulkScan) on web_sales ws1_1  (cost=0.00..3290612.48 rows=72001248 width=16)
           ->  Seq Scan on web_sales ws2  (cost=0.00..3290612.48 rows=72001248 width=16)
   ->  Sort  (cost=8514024361.19..8514024361.20 rows=1 width=20)
         Sort Key: (count(DISTINCT ws1.ws_order_number))
      :

This crash was triggered by Assert(nbuckets > 0), and nbuckets is calculated
as follows.

    /*
     * If there's not enough space to store the projected number of tuples and
     * the required bucket headers, we will need multiple batches.
     */
    if (inner_rel_bytes + bucket_bytes > hash_table_bytes)
    {
        /* We'll need multiple batches */
        long        lbuckets;
        double      dbatch;
        int         minbatch;
        long        bucket_size;

        /*
         * Estimate the number of buckets we'll want to have when work_mem is
         * entirely full.  Each bucket will contain a bucket pointer plus
         * NTUP_PER_BUCKET tuples, whose projected size already includes
         * overhead for the hash code, pointer to the next tuple, etc.
         */
        bucket_size = (tupsize * NTUP_PER_BUCKET + sizeof(HashJoinTuple));
        lbuckets = 1 << my_log2(hash_table_bytes / bucket_size);
        lbuckets = Min(lbuckets, max_pointers);
        nbuckets = (int) lbuckets;
        bucket_bytes = nbuckets * sizeof(HashJoinTuple);
          :
          :
    }
    Assert(nbuckets > 0);
    Assert(nbatch > 0);

In my case, the hash_table_bytes was 101017630802, and bucket_size was 48.
So, my_log2(hash_table_bytes / bucket_size) = 31, then lbuckets will have
negative number because both "1" and my_log2() is int32.
So, Min(lbuckets, max_pointers) chooses 0x80000000, then it was set on
the nbuckets and triggers the Assert().

Attached patch fixes the problem.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: allowing wal_level change at run time
Next
From: Robert Haas
Date:
Subject: Re: [PROPOSAL] VACUUM Progress Checker.