Home > mailing lists

Re: Questions about btree_gin vs btree_gist for low cardinalitycolumns - Mailing list pgsql-general

From	Peter J. Holzer
Subject	Re: Questions about btree_gin vs btree_gist for low cardinalitycolumns
Date	June 2, 2019 10:37:32
Msg-id	20190602073732.kvan34sfotdlyc7s@hjp.at Whole thread Raw
In response to	Re: Questions about btree_gin vs btree_gist for low cardinality columns (Morris de Oryx <morrisdeoryx@gmail.com>)
Responses	Re: Questions about btree_gin vs btree_gist for low cardinality columns
List	pgsql-general

Tree view

On 2019-06-02 09:10:25 +1000, Morris de Oryx wrote:
> Peter, thanks a lot for picking up on what I started, improving it, and
> reporting back. I thought I was providing timing estimates from the EXPLAIN
> cost dumps. Seems not. Well, there's another thing that I've learned.

The cost is how long the optimizer thinks it will take (in arbitrary
units). But it's just an estimate, and estimates can be off - sometimes
quite dramatically.

To get the real timings with explain, use explain (analyze). I often
combine this with buffers to get I/O stats as well:

wdsah=> explain (analyze, buffers) select min(date) from facttable_stat_fta4 where partnerregion = 'USA' and sitcr4 =
'7522'; 

╔══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                                                  QUERY PLAN
                                                      ║ 

╟──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Aggregate  (cost=694.23..694.24 rows=1 width=4) (actual time=7.568..7.568 rows=1 loops=1)
                                                      ║ 
║   Buffers: shared hit=3 read=148 dirtied=114
                                                      ║ 
║   ->  Index Scan using facttable_stat_fta4_sitcr4_partnerregion_idx on facttable_stat_fta4  (cost=0.57..693.09
rows=455width=4) (actual time=0.515..7.493 rows=624 loops=1) ║ 
║         Index Cond: (((sitcr4)::text = '7522'::text) AND ((partnerregion)::text = 'USA'::text))
                                                      ║ 
║         Buffers: shared hit=3 read=148 dirtied=114
                                                      ║ 
║ Planning time: 0.744 ms
                                                      ║ 
║ Execution time: 7.613 ms
                                                      ║ 

╚══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(7 rows)

And when you don't need the costs, you can turn them off:

wdsah=> explain (analyze, buffers, costs off) select min(date) from facttable_stat_fta4 where partnerregion = 'USA' and
sitcr4= '7522';  

╔════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                               QUERY PLAN
                ║ 

╟────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Aggregate (actual time=0.598..0.598 rows=1 loops=1)
                ║ 
║   Buffers: shared hit=140
                ║ 
║   ->  Index Scan using facttable_stat_fta4_sitcr4_partnerregion_idx on facttable_stat_fta4 (actual time=0.054..0.444
rows=624loops=1) ║ 
║         Index Cond: (((sitcr4)::text = '7522'::text) AND ((partnerregion)::text = 'USA'::text))
                ║ 
║         Buffers: shared hit=140
                ║ 
║ Planning time: 0.749 ms
                ║ 
║ Execution time: 0.647 ms
                ║ 

╚════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(7 rows)

See https://www.postgresql.org/docs/current/sql-explain.html for details.

> Can you tell me how you get timing results into state_test_times?

In this case I just entered them manually (cut and paste from psql
\timing output). If I wanted to repeat that test on another database, I
would write a Python script (I'm sure you can do that in pgsql, too, but
I feel more comfortable in Python). I don't think there is a way to get
time timings in plain SQL.

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Attachment

signature.asc

pgsql-general by date:

From: Tom K
Date: 02 June 2019, 06:07:52
Subject: Re: psql: FATAL: the database system is starting up

From: Morris de Oryx
Date: 02 June 2019, 11:03:09
Subject: Re: Questions about btree_gin vs btree_gist for low cardinality columns

Re: Questions about btree_gin vs btree_gist for low cardinalitycolumns - Mailing list pgsql-general

Attachment

Previous

Next