View performance with implicit cast - Mailing list pgsql-sql
From | Tomasz Szypowski |
---|---|
Subject | View performance with implicit cast |
Date | |
Msg-id | DU0PR04MB94192D3922E05C8E0A18F5F399142@DU0PR04MB9419.eurprd04.prod.outlook.com Whole thread Raw |
Responses |
Re: View performance with implicit cast
|
List | pgsql-sql |
Hello
I have got an example, in which PostgreSQL could speed up:
First, create and populate structures:
create table test1 (date_1 timestamp);
insert into test1
select generate_series( '2020-01-01', '2025-01-01', interval '1 minute');
CREATE INDEX idx_test1_date_1 on test1(date_1);
create table test2 (date_2 date);
insert into test2
select generate_series( '1900-01-01', '2200-01-01', interval '1 day');
CREATE INDEX idx_test2_date_2 on test2(date_2);
Then create a view:
create or replace view v_test as
select date_1 from test1
union all
select date_2 from test2
and now run the query:
explain analyze
select * from v_test
where (date_1>='2024-12-09') and (date_1<'2025-01-01');
You would see:
Subquery Scan on v_test (cost=0.00..95710.28 rows=13702 width=8) (actual time=440.998..472.586 rows=33143 loops=1)
Filter: ((v_test.date_1 >= '2024-12-09 00:00:00'::timestamp without time zone) AND (v_test.date_1 < '2025-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 2707311
-> Append (cost=0.00..54603.47 rows=2740454 width=8) (actual time=0.038..332.601 rows=2740454 loops=1)
-> Seq Scan on test1 (cost=0.00..37950.81 rows=2630881 width=8) (actual time=0.035..165.235 rows=2630881 loops=1)
-> Subquery Scan on "*SELECT* 2" (cost=0.00..2950.39 rows=109573 width=8) (actual time=0.025..16.285 rows=109573 loops=1)
-> Seq Scan on test2 (cost=0.00..1580.73 rows=109573 width=4) (actual time=0.022..4.974 rows=109573 loops=1)
Planning Time: 0.263 ms
Execution Time: 473.598 ms
I can imagine, that test2 cannot be searched by index due to the implicite cast of the column date -> timestamp, but why the query doesn`t use index idx_test1_date_1? Furthermore it could use as well index idx_test2_date_2
Now let`s modify the view`s definition only to add explicit cast:
create or replace view v_test as
select date_1 from test1
union all
select date_2::timestamp from test2
Run the query again:
explain analyze
select * from v_test
where (date_1>='2024-12-09') and (date_1<'2025-01-01');
Append (cost=0.43..3812.87 rows=31903 width=8) (actual time=0.024..17.287 rows=33143 loops=1)
-> Index Only Scan using idx_test1_date_1 on test1 (cost=0.43..975.53 rows=31355 width=8) (actual time=0.023..5.500 rows=33120 loops=1)
Index Cond: ((date_1 >= '2024-12-09 00:00:00'::timestamp without time zone) AND (date_1 < '2025-01-01 00:00:00'::timestamp without time zone))
Heap Fetches: 0
-> Seq Scan on test2 (cost=0.00..2677.83 rows=548 width=8) (actual time=3.744..8.610 rows=23 loops=1)
Filter: (((date_2)::timestamp without time zone >= '2024-12-09 00:00:00'::timestamp without time zone) AND ((date_2)::timestamp without time zone < '2025-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 109550
Planning Time: 0.230 ms
Execution Time: 18.937 ms
So
- Why the index is being used only when the column types are the same?
- Why th other indexes are not being used?
Maybe you can help me to rewrite the query
Regards
Thomas Szypowski