RE: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan - Mailing list pgsql-performance

From Fd Habash
Subject RE: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan
Date
Msg-id 5b16c5d0.1c69fb81.81abf.5c97@mx.google.com
Whole thread Raw
In response to Re: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan  (Matthew Hall <mhall@mhcomputing.net>)
List pgsql-performance

Indexes are being redone as per these insights. Appreciate the great support.

 

----------------
Thank you

 

From: Matthew Hall
Sent: Tuesday, June 5, 2018 10:42 AM
To: Fred Habash
Cc: pgsql-performance@lists.postgresql.org
Subject: Re: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan

 

Probably the cardinality of "vclf_number" is really bad. So the scan on that index is returning many million or billion rows and then you get a recheck which takes semi-forever. So you need an index on cl_value or both vclf_number and cl_value. If you know some properties of the values actually stored inside of those that will help. 

Matthew Hall


On Jun 5, 2018, at 7:17 AM, Fred Habash <fmhabash@gmail.com> wrote:

Trying to optimize the Elapsed Time (ET) of this query. Currently, it is hovering around 3 hrs.

Running a 'vaccum analyse' had no effect on ET. Even forcing an 'indexonly' scan by disabling 'enable_seqscan', still around the 3 hrs.

The table is around 4.6B rows,

 explain select cit_id, cl_value from reflink.citation_locators where cl_value = '1507617681' and vclf_number = 1 ;
                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
 Bitmap Heap Scan on citation_locators  (cost=5066559.01..50999084.79 rows=133 width=23)
   Recheck Cond: (vclf_number = 1)
   Filter: (cl_value = '1507617681'::text)
   ->  Bitmap Index Scan on cl_indx_fk02  (cost=0.00..5066558.97 rows=493984719 width=0)
         Index Cond: (vclf_number = 1)
(5 rows)

reflink.citation_locators 
                                Table "reflink.citation_locators"
      Column      |           Type           | Modifiers | Storage  | Stats target | Description 
------------------+--------------------------+-----------+----------+--------------+-------------
 cl_id            | bigint                   | not null  | plain    |              | 
 cl_value         | text                     | not null  | extended |              | 
 vclf_number      | integer                  | not null  | plain    |              | 
 cit_id           | bigint                   | not null  | plain    |              | 
 cl_date_created  | timestamp with time zone | not null  | plain    |              | 
 cl_date_modified | timestamp with time zone |           | plain    |              | 
Indexes:
    "cl_pk" PRIMARY KEY, btree (cl_id)
    "cl_cnst_uk01" UNIQUE CONSTRAINT, btree (cit_id, vclf_number, cl_value)
    "cl_indx_fk01" btree (cit_id)
    "cl_indx_fk02" btree (vclf_number)
Foreign-key constraints:
    "cl_cnst_fk01" FOREIGN KEY (cit_id) REFERENCES citations(cit_id) NOT VALID    "cl_cnst_fk02" FOREIGN KEY (vclf_number) REFERENCES valid_cit_locator_fields(vclf_number)

 

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Simple Query Elapsed Time ~ 3 hrs Using Bitmap Index/Heap Scan
Next
From: e-blokos
Date:
Subject: Re: Client Server performance & UDS