OVERLAPS is slow - Mailing list pgsql-performance
From | Chris Browne |
---|---|
Subject | OVERLAPS is slow |
Date | |
Msg-id | 608wxtcb2o.fsf@dba2.int.libertyrms.com Whole thread Raw |
List | pgsql-performance |
I'm doing some analysis on temporal usages, and was hoping to make use of OVERLAPS, but it does not appear that it makes use of indices. Couching this in an example... I created a table, t1, thus: metadata=# \d t1 Table "public.t1" Column | Type | Modifiers --------+--------------------------+------------------------------------------------------- id | integer | not null default nextval('t1_id_seq'::regclass) t1 | timestamp with time zone | not null default now() t2 | timestamp with time zone | not null default 'infinity'::timestamp with time zone data | text | not null Indexes: "t1_pkey" PRIMARY KEY, btree (id) "f2" btree (id) WHERE t2 = 'infinity'::timestamp with time zone "t1t1" btree (t1) "t1t2" btree (t2) When entries go in, they default to having an effective date range from now() until 'infinity'. I then went off and seeded a bunch of data into the table, inserting values: for i in `cat /etc/dictionaries-common/words | head 2000`; do psql -d metadata -c "insert into t1 (data) values ('$i');" done Then, I started doing temporal updates, thus: for i in `cat /etc/dictionaries-common/words`; do psql -d metadata -c "insert into t1 (data) values ('$i');update t1 set t2 = now() where t2 = 'infinity' and id in (selectid from t1 where t2 = 'infinity' order by random() limit 1);" done This terminates many of those entries, and creates a new one that is effective "to infinity." After running this for a while, I have a reasonably meaningful amount of data in the table: metadata=# select count(*) from t1; select count(*) from t1 where t2 = 'infinity'; count -------- 125310 (1 row) count ------- 2177 (1 row) Searching for the "active" items in the table, via a constructed 'overlap': metadata=# explain analyze select count(*) from t1 where t1 <= now() and t2 >= now(); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Aggregate (cost=98.13..98.14 rows=1 width=0) (actual time=8.104..8.105 rows=1 loops=1) -> Index Scan using t1t2 on t1 (cost=0.00..93.95 rows=1671 width=0) (actual time=0.116..6.374 rows=2177 loops=1) Index Cond: (t2 >= now()) Filter: (t1 <= now()) Total runtime: 8.193 ms (5 rows) Note, that makes use of the index on column t2, and runs nice and quick. (And notice that the rows found, 2177, agrees with the earlier count.) Unfortunately, when I try using OVERLAPS, it reverts to a Seq Scan. metadata=# explain analyze select * from t1 where (t1,t2) overlaps (now(), now()); QUERY PLAN ----------------------------------------------------------------------------------------------------------- Seq Scan on t1 (cost=0.00..3156.59 rows=43135 width=24) (actual time=171.248..205.941 rows=2177 loops=1) Filter: "overlaps"(t1, t2, now(), now()) Total runtime: 207.508 ms (3 rows) I would surely think that I have enough data in the table for the stats to be good, and the first query certainly does harness the index on t2 to determine if records are overlapping (now(),now()). Is it possible that we need to have some improvement to the optimizer so that OVERLAPS could make use of the indices? -- select 'cbbrowne' || '@' || 'cbbrowne.com'; http://linuxfinances.info/info/lsf.html "Very little is known about the War of 1812 because the Americans lost it." -- Eric Nicol
pgsql-performance by date: