Thread: Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
"Andrus"
Date:
There are columns kuupaev date, cr char(10), db char(10) and regular indexes for all those fields. bilkaib table contains large number of rows. The following query takes too much time. How to make it faster ? I think PostgreSql should use multiple indexes as bitmaps to speed it. I can re-write this query in any way or split to multiple statements if this makes it faster. Andrus. explain analyze select max(kuupaev) from bilkaib where kuupaev<=date'2008-11-01' and (cr='00' or db='00') "Result (cost=339.75..339.76 rows=1 width=0) (actual time=52432.256..52432.260 rows=1 loops=1)" " InitPlan" " -> Limit (cost=0.00..339.75 rows=1 width=4) (actual time=52432.232..52432.236 rows=1 loops=1)" " -> Index Scan Backward using bilkaib_kuupaev_idx on bilkaib (cost=0.00..1294464.73 rows=3810 width=4) (actual time=52432.222..52432.222 rows=1 loops=1)" " Index Cond: (kuupaev <= '2008-11-01'::date)" " Filter: ((kuupaev IS NOT NULL) AND ((cr = '00'::bpchar) OR (db = '00'::bpchar)))" "Total runtime: 52432.923 ms" "PostgreSQL 8.1.4 on i686-pc-linux-gnu, compiled by GCC i686-pc-linux-gnu-gcc (GCC) 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0, pie-8.7.9)"
Re: Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
Matthew Wakeling
Date:
Firstly, please upgrade to Postgres 8.3 if possible. On Wed, 12 Nov 2008, Andrus wrote: > There are columns > kuupaev date, cr char(10), db char(10) > and regular indexes for all those fields. Create a single index on (cr, db, datecol). Matthew -- Those who do not understand Unix are condemned to reinvent it, poorly. -- Henry Spencer
Re: Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
"Vladimir Sitnikov"
Date:
On Wed, Nov 12, 2008 at 9:02 AM, Andrus <kobruleht2@hot.ee> wrote:
There are columns
kuupaev date, cr char(10), db char(10)
and regular indexes for all those fields.
bilkaib table contains large number of rows.
The following query takes too much time.
How to make it faster ?
I think PostgreSql should use multiple indexes as bitmaps to speed it.
I am afraid I do not see a way to use bitmaps to get any improvement here: the server will still need to read the whole indices to figure out the answer.
I suggest you to create two more indices:
create index date_with_zero_cr on bilkaib(date) where cr='00';
create index date_with_zero_db on bilkaib(date) where db='00';
And rewrite query as follows:
select greatest(
(select max(date) from bilkaib where datecol<=date'2008-11-01' and cr='00'),
(select max(date) from bilkaib where datecol<=date'2008-11-01' and db='00'))
Regards,
Vladimir Sitnikov
I suggest you to create two more indices:
create index date_with_zero_cr on bilkaib(date) where cr='00';
create index date_with_zero_db on bilkaib(date) where db='00';
And rewrite query as follows:
select greatest(
(select max(date) from bilkaib where datecol<=date'2008-11-01' and cr='00'),
(select max(date) from bilkaib where datecol<=date'2008-11-01' and db='00'))
Regards,
Vladimir Sitnikov
Re: Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
Matthew Wakeling
Date:
On Wed, 12 Nov 2008, Vladimir Sitnikov wrote: > And rewrite query as follows: > select greatest( > (select max(date) from bilkaib where datecol<=date'2008-11-01' and cr='00'), > (select max(date) from bilkaib where datecol<=date'2008-11-01' and db='00')) Oops, yes, I missed the "OR" in the query. This rewrite is good - my suggested index would not have helped. > I suggest you to create two more indices: > > create index date_with_zero_cr on bilkaib(date) where cr='00'; > create index date_with_zero_db on bilkaib(date) where db='00'; Alternatively if you create an index on (cr, bilkaib) and one on (db, bilkaib) then you will be able to use other values in the query too. Matthew -- Contrary to popular belief, Unix is user friendly. It just happens to be very selective about who its friends are. -- Kyle Hearn
Re: Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
"Andrus"
Date:
Matthew, Thank you. bilkaib table contains GL transactions for every day. 00 records are initial balance records and they appear only in start of year or start of month. They may present or may be not present for some month if initial balance is not calculated yet. If 00 records are present, usuallly there are lot of them for single date for db and cr columns. This query finds initial balance date befeore given date. bilkaib table contains several year transactions so it is large. >Alternatively if you create an index on (cr, bilkaib) and one on (db, >bilkaib) then you will be able to use other values in the query too. I'm sorry I do'nt understand this. What does the (cr, bilkaib) syntax mean? Should I create two functions indexes and re-write query as Vladimir suggests or is there better appoach ? Andrus.
Re: Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
"Vladimir Sitnikov"
Date:
This query finds initial balance date befeore given date.
If you are not interested in other balances except initial ones (the ones that have '00') the best way is to create partial indices that I have suggested.
That will keep size of indices small, while providing good performance (constant response time)
That will keep size of indices small, while providing good performance (constant response time)
bilkaib table contains several year transactions so it is large.
That is not a problem for the particular case. However, when you evaluate query performance, it really makes sense giving number of rows in each table (is 100K rows a "large" table? what about 10M rows?) and other properties of the data stored in the table (like number of rows that have cr='00')
Alternatively if you create an index on (cr, bilkaib) and one on (db, bilkaib) then you will be able to use other values in the query too.
That means if you create one index on biklaib (cr, datecol) and another index on (db, datecol) you will be able to improve queries like
select greatest(
(select max(date) from bilkaib where datecol<=date'2008-11-01' and cr=XXX),
(select max(date) from bilkaib where datecol<=date'2008-11-01' and db=YYY)).
with arbitrary XXX and YYY. I am not sure if you really want this.
select greatest(
(select max(date) from bilkaib where datecol<=date'2008-11-01' and cr=XXX),
(select max(date) from bilkaib where datecol<=date'2008-11-01' and db=YYY)).
with arbitrary XXX and YYY. I am not sure if you really want this.
I'm sorry I do'nt understand this.
What does the (cr, bilkaib) syntax mean?
I believe that should be read as (cr, datecol).
Should I create two functions indexes and re-write query as Vladimir suggests or is there better appoach ?
I am afraid PostgreSQL is not smart enough to rewrite query with "or" into two separate index scans. There is no way to improve the query significantly without rewriting it.
Note: for this case indices on (datecol), (cr) and (db) are not very helpful.
Regards,
Vladimir Sitnikov
Note: for this case indices on (datecol), (cr) and (db) are not very helpful.
Regards,
Vladimir Sitnikov
Re: Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
hubert depesz lubaczewski
Date:
On Wed, Nov 12, 2008 at 07:02:10PM +0200, Andrus wrote: > explain analyze select max(kuupaev) from bilkaib where > kuupaev<=date'2008-11-01' and (cr='00' or db='00') do you always have this: "(cr='00' or db='00')"? or do the values (00) change? if they don't change, or *most* of the queries have "(cr='00' or db='00')", than the biggest time difference you will get after creating this index: create index test on bilkaib (kuupaev) where cr='00' or db='00'; depesz -- Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/ jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
Re: Increasing select max(datecol) from bilkaib where datecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
"Andrus"
Date:
Vladimir, >I am afraid PostgreSQL is not smart enough to rewrite query with "or" into >two separate index scans. There is no way to improve the query >significantly without rewriting it. >Note: for this case indices on (datecol), (cr) and (db) are not very >helpful. Thank you very much. I added you indexes to db and re-write query. Now it runs fast. Andrus.
Re: Increasing select max(datecol) from bilkaib wheredatecol<=date'2008-11-01' and (cr='00' or db='00') speed
From
"Andrus"
Date:
Depesz, > do you always have this: "(cr='00' or db='00')"? or do the values (00) > change? > if they don't change, or *most* of the queries have "(cr='00' or > db='00')", than the biggest time difference you will get after creating > this index: > create index test on bilkaib (kuupaev) where cr='00' or db='00'; I have always cr='00' or db='00' clause. Separate values are never tested. I changed by queries back to old values and created this single index. This seems to be even better that Vladimir suggestion. Thank you very much. Andrus.