Re: Questions about btree_gin vs btree_gist for low cardinalitycolumns - Mailing list pgsql-general

From Peter J. Holzer
Subject Re: Questions about btree_gin vs btree_gist for low cardinalitycolumns
Date
Msg-id 20190602073732.kvan34sfotdlyc7s@hjp.at
Whole thread Raw
In response to Re: Questions about btree_gin vs btree_gist for low cardinality columns  (Morris de Oryx <morrisdeoryx@gmail.com>)
Responses Re: Questions about btree_gin vs btree_gist for low cardinality columns
List pgsql-general
On 2019-06-02 09:10:25 +1000, Morris de Oryx wrote:
> Peter, thanks a lot for picking up on what I started, improving it, and
> reporting back. I thought I was providing timing estimates from the EXPLAIN
> cost dumps. Seems not. Well, there's another thing that I've learned.

The cost is how long the optimizer thinks it will take (in arbitrary
units). But it's just an estimate, and estimates can be off - sometimes
quite dramatically.

To get the real timings with explain, use explain (analyze). I often
combine this with buffers to get I/O stats as well:

wdsah=> explain (analyze, buffers) select min(date) from facttable_stat_fta4 where partnerregion = 'USA' and sitcr4 =
'7522'; 

╔══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                                                  QUERY PLAN
                                                      ║ 

╟──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Aggregate  (cost=694.23..694.24 rows=1 width=4) (actual time=7.568..7.568 rows=1 loops=1)
                                                      ║ 
║   Buffers: shared hit=3 read=148 dirtied=114
                                                      ║ 
║   ->  Index Scan using facttable_stat_fta4_sitcr4_partnerregion_idx on facttable_stat_fta4  (cost=0.57..693.09
rows=455width=4) (actual time=0.515..7.493 rows=624 loops=1) ║ 
║         Index Cond: (((sitcr4)::text = '7522'::text) AND ((partnerregion)::text = 'USA'::text))
                                                      ║ 
║         Buffers: shared hit=3 read=148 dirtied=114
                                                      ║ 
║ Planning time: 0.744 ms
                                                      ║ 
║ Execution time: 7.613 ms
                                                      ║ 

╚══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(7 rows)

And when you don't need the costs, you can turn them off:

wdsah=> explain (analyze, buffers, costs off) select min(date) from facttable_stat_fta4 where partnerregion = 'USA' and
sitcr4= '7522';  

╔════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                               QUERY PLAN
                ║ 

╟────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Aggregate (actual time=0.598..0.598 rows=1 loops=1)
                ║ 
║   Buffers: shared hit=140
                ║ 
║   ->  Index Scan using facttable_stat_fta4_sitcr4_partnerregion_idx on facttable_stat_fta4 (actual time=0.054..0.444
rows=624loops=1) ║ 
║         Index Cond: (((sitcr4)::text = '7522'::text) AND ((partnerregion)::text = 'USA'::text))
                ║ 
║         Buffers: shared hit=140
                ║ 
║ Planning time: 0.749 ms
                ║ 
║ Execution time: 0.647 ms
                ║ 

╚════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(7 rows)

See https://www.postgresql.org/docs/current/sql-explain.html for details.

> Can you tell me how you get timing results into state_test_times?

In this case I just entered them manually (cut and paste from psql
\timing output). If I wanted to repeat that test on another database, I
would write a Python script (I'm sure you can do that in pgsql, too, but
I feel more comfortable in Python). I don't think there is a way to get
time timings in plain SQL.

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Attachment

pgsql-general by date:

Previous
From: Tom K
Date:
Subject: Re: psql: FATAL: the database system is starting up
Next
From: Morris de Oryx
Date:
Subject: Re: Questions about btree_gin vs btree_gist for low cardinality columns