Thread: suggestion to improve planer

suggestion to improve planer

From
Ľubomír Varga
Date:
Hi.

I hope, that this is right mailing list.

SELECT date, value FROM t_eventWHERE t_event.id in (SELECT id FROM t_event    WHERE date < '2009-08-25'    ORDER BY
dateDESC LIMIT 1)ORDER BY date; 
cost 6.4

SELECT date, value FROM t_eventWHERE t_event.id = (SELECT id FROM t_event    WHERE date < '2009-08-25'    ORDER BY date
DESCLIMIT 1)ORDER BY date; 
cost 6.36..6.37


Why that two query dont have equal cost? If it is not problem, try add some
planer code to recognize that sublesect HAVE TO return just one row (limit 1)
and in plan could be used filter/index scan instead of hash aggregate. I have
also some complex query examples where cost difference is more visible.

Have a nice day.
--
Odborník na všetko je zlý odborník. Ja sa snažím byť výnimkou potvrdzujúcou
pravidlo.


Re: suggestion to improve planer

From
Peter Eisentraut
Date:
On Thu, 2009-09-03 at 10:35 +0200, Ľubomír Varga wrote:
> Hi.
> 
> I hope, that this is right mailing list.
> 
> SELECT date, value FROM t_event
>     WHERE t_event.id in (SELECT id FROM t_event
>         WHERE date < '2009-08-25'
>         ORDER BY date DESC LIMIT 1)
>     ORDER BY date;
> cost 6.4
> 
> SELECT date, value FROM t_event
>     WHERE t_event.id = (SELECT id FROM t_event
>         WHERE date < '2009-08-25'
>         ORDER BY date DESC LIMIT 1)
>     ORDER BY date;
> cost 6.36..6.37
> 
> 
> Why that two query dont have equal cost? If it is not problem, try add some 
> planer code to recognize that sublesect HAVE TO return just one row (limit 1) 
> and in plan could be used filter/index scan instead of hash aggregate.

Well, there is always a tradeoff between more planner analysis and more
complicated and slow planning.  Seeing that the cost estimates are close
enough for practical purposes, it doesn't seem worthwhile to fix
anything here.

>  I have 
> also some complex query examples where cost difference is more visible.

Having real examples where a change might actually improve runtime is
always more interesting than an academic exercise like the above.



Re: suggestion to improve planer

From
Ľubomír Varga
Date:
On Wednesday 09 September 2009 14:11:41 Peter Eisentraut wrote:
> On Thu, 2009-09-03 at 10:35 +0200, Ľubomír Varga wrote:
> > Hi.
> >
> > I hope, that this is right mailing list.
> >
> > SELECT date, value FROM t_event
> >     WHERE t_event.id in (SELECT id FROM t_event
> >         WHERE date < '2009-08-25'
> >         ORDER BY date DESC LIMIT 1)
> >     ORDER BY date;
> > cost 6.4
> >
> > SELECT date, value FROM t_event
> >     WHERE t_event.id = (SELECT id FROM t_event
> >         WHERE date < '2009-08-25'
> >         ORDER BY date DESC LIMIT 1)
> >     ORDER BY date;
> > cost 6.36..6.37
> >
> >
> > Why that two query dont have equal cost? If it is not problem, try add
> > some planer code to recognize that sublesect HAVE TO return just one row
> > (limit 1) and in plan could be used filter/index scan instead of hash
> > aggregate.
>
> Well, there is always a tradeoff between more planner analysis and more
> complicated and slow planning.  Seeing that the cost estimates are close
> enough for practical purposes, it doesn't seem worthwhile to fix
> anything here.
>
> >  I have
> > also some complex query examples where cost difference is more visible.
>
> Having real examples where a change might actually improve runtime is
> always more interesting than an academic exercise like the above.

Oka, real world example is attached as txt file. There are 3x2 queries and its
costs on my system/database. System/database is also real, and production.
First set for short time interval and second for long time interval.

Main poblem is, that I want to generate some statistical report and I dont
know for how many "devices" Iam going to generate it for. So if I make some
program, there will be something like:
..."t_device.imei in (" + this.getDevicesImeis() + ")"...
If i have only one device, there could be "=" instead of "in".

PS: Iam very confused about actual costs. I dont have exact query on which I
have seen that "in" for one device in array, is much worse than "=" for one
device. Actual costs are somehow different and "=" select gives about ten
time worse cost.

So probably just ignore my mail and keep planner as is.
My version of postgre: "PostgreSQL 8.2.6 on x86_64-pc-linux-gnu, compiled by
GCC x86_64-pc-linux-gnu-gcc (GCC) 4.1.1 (Gentoo 4.1.1-r3)"

--
Odborník na všetko je zlý odborník. Ja sa snažím byť výnimkou potvrdzujúcou
pravidlo.
explain
SELECT spz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk = (SELECT id FROM
t_event_typeWHERE type = 10 LIMIT 1) AND    device.imei = (SELECT imei FROM t_device WHERE id = 3 limit 1) AND
device.id= t_event.device_fk AND    t_device_cache.imei = device.imei AND    date between '2009-08-25' AND
'2009-08-27'UNIONSELECTspz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk =
(SELECTid FROM t_event_type WHERE type = 10 LIMIT 1) AND    device.imei = (SELECT imei FROM t_device WHERE id = 3 limit
1)AND    device.id = t_event.device_fk AND    t_device_cache.imei = device.imei AND    t_event.id = (SELECT id FROM
t_event       WHERE date > '2009-08-27' AND        event_type_fk = (SELECT id FROM t_event_type WHERE type = 10 LIMIT
1)AND        t_event.device_fk = device.id        ORDER BY date ASC LIMIT 1)ORDER BY spz_number, date;
 
cost 2573



explain
SELECT spz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk in (SELECT id FROM
t_event_typeWHERE type = 10 LIMIT 1) AND    device.imei in (SELECT imei FROM t_device WHERE id = 3 limit 1) AND
device.id= t_event.device_fk AND    t_device_cache.imei = device.imei AND    date between '2009-08-25' AND
'2009-08-27'UNIONSELECTspz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk in
(SELECTid FROM t_event_type WHERE type = 10 LIMIT 1) AND    device.imei in (SELECT imei FROM t_device WHERE id = 3
limit1) AND    device.id = t_event.device_fk AND    t_device_cache.imei = device.imei AND    t_event.id = (SELECT id
FROMt_event        WHERE date > '2009-08-27' AND        event_type_fk in (SELECT id FROM t_event_type WHERE type = 10
LIMIT1) AND        t_event.device_fk = device.id        ORDER BY date ASC LIMIT 1)ORDER BY spz_number, date;
 
cost 2109



explain
SELECT spz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk = (SELECT id FROM
t_event_typeWHERE type = 10 LIMIT 1) AND    device.imei in (SELECT imei FROM t_device WHERE id = 3 limit 1) AND
device.id= t_event.device_fk AND    t_device_cache.imei = device.imei AND    date between '2009-08-25' AND
'2009-08-27'UNIONSELECTspz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk =
(SELECTid FROM t_event_type WHERE type = 10 LIMIT 1) AND    device.imei in (SELECT imei FROM t_device WHERE id = 3
limit1) AND    device.id = t_event.device_fk AND    t_device_cache.imei = device.imei AND    t_event.id = (SELECT id
FROMt_event        WHERE date > '2009-08-27' AND        event_type_fk = (SELECT id FROM t_event_type WHERE type = 10
LIMIT1) AND        t_event.device_fk = device.id        ORDER BY date ASC LIMIT 1)ORDER BY spz_number, date;
 
cost 2039


---------------------------------------------------------------------------------------
Extendet time interval:
explain
SELECT spz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk = (SELECT id FROM
t_event_typeWHERE type = 10 LIMIT 1) AND    device.imei = (SELECT imei FROM t_device WHERE id = 3 limit 1) AND
device.id= t_event.device_fk AND    t_device_cache.imei = device.imei AND    date between '2009-08-25' AND
'2009-08-27'UNIONSELECTspz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk =
(SELECTid FROM t_event_type WHERE type = 10 LIMIT 1) AND    device.imei = (SELECT imei FROM t_device WHERE id = 3 limit
1)AND    device.id = t_event.device_fk AND    t_device_cache.imei = device.imei AND    t_event.id = (SELECT id FROM
t_event       WHERE date > '2009-08-27' AND        event_type_fk = (SELECT id FROM t_event_type WHERE type = 10 LIMIT
1)AND        t_event.device_fk = device.id        ORDER BY date ASC LIMIT 1)ORDER BY spz_number, date;
 
cost 32614



explain
SELECT spz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk in (SELECT id FROM
t_event_typeWHERE type = 10 LIMIT 1) AND    device.imei in (SELECT imei FROM t_device WHERE id = 3 limit 1) AND
device.id= t_event.device_fk AND    t_device_cache.imei = device.imei AND    date between '2009-08-25' AND
'2009-08-27'UNIONSELECTspz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk in
(SELECTid FROM t_event_type WHERE type = 10 LIMIT 1) AND    device.imei in (SELECT imei FROM t_device WHERE id = 3
limit1) AND    device.id = t_event.device_fk AND    t_device_cache.imei = device.imei AND    t_event.id = (SELECT id
FROMt_event        WHERE date > '2009-08-27' AND        event_type_fk in (SELECT id FROM t_event_type WHERE type = 10
LIMIT1) AND        t_event.device_fk = device.id        ORDER BY date ASC LIMIT 1)ORDER BY spz_number, date;
 
cost 2711

explain
SELECT spz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk = (SELECT id FROM
t_event_typeWHERE type = 10 LIMIT 1) AND    device.imei in (SELECT imei FROM t_device WHERE id = 34 limit 1) AND
device.id= t_event.device_fk AND    t_device_cache.imei = device.imei AND    date between '2009-07-25' AND
'2009-08-25'UNIONSELECTspz_number, date, value FROM t_event, t_device_cache, t_device device    WHERE event_type_fk =
(SELECTid FROM t_event_type WHERE type = 10 LIMIT 1) AND    device.imei in (SELECT imei FROM t_device WHERE id = 34
limit1) AND    device.id = t_event.device_fk AND    t_device_cache.imei = device.imei AND    t_event.id = (SELECT id
FROMt_event        WHERE date > '2009-08-25' AND        event_type_fk = (SELECT id FROM t_event_type WHERE type = 10
LIMIT1) AND        t_event.device_fk = device.id        ORDER BY date ASC LIMIT 1)ORDER BY spz_number, date;
 
cost 2732