Re: Planning without reason. - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Planning without reason.
Date
Msg-id 20060623152638.GI8900@svana.org
Whole thread Raw
In response to Re: Planning without reason.  (Tzahi Fadida <Tzahi.ML@gmail.com>)
List pgsql-hackers
On Fri, Jun 23, 2006 at 06:10:33PM +0300, Tzahi Fadida wrote:
> On Friday 23 June 2006 17:47, Martijn van Oosterhout wrote:
> > On Fri, Jun 23, 2006 at 05:12:14PM +0300, Tzahi Fadida wrote:
> > > My initial reasoning was to avoid extra sorts but i guess that the
> > > planner just doesn't get the LIMIT 1. I see now that UNION should be
> > > better for the planner to undestand (not performance wise).
> > > However, UNION alone, doesn't seem to cut it.
> > > Following is an example. t7 has 2 attributes and a non-unique index on
> > > one attribute. here is a printout:
> > > explain analyze (select * from t7 where a4=113 LIMIT 1) UNION (select *
> > > from t7 where a2=139 LIMIT 1);
> >
> > What are the indexes on?  If you only have an index on a4, the latter
> > query has to be an index scan and there's no way to optimise it way.
>
> That's my point, it should have only used a sequence scan and not also
> do an index scan.
> In other words, it should have consolidated the two nodes of index scan and
> sequence scan into a single plan node where you only scan sequentially the
> relation and choose a tuple for each UNION clause.

I see what you're saying, but that's not necessarily faster. If you
consider the case where the tuple found by the seq scan is near the
beginning of the table and the tuple by the index near the end, it
could be worse to fold them. If you did only a single seq scan it would
have to scan the whole table, whereas now it only has to scan the
beginning.

Just because it says "Seq Scan" doesn't mean it actually scans the
whole table. There isn't a way in postgres to specify the plan you
want. LIMIT only works on the output of entire nodes, you can't say
"output one tuple matching A, one tuple matching B" in a single node.

> Example. i have a tuple T i am searching for.
> T contains attribute1, attribute2. I have T in a
> heap_deformtuple(T) manner, i.e., i have T->v and T->n (for nulls).
> Currently i am doing (loosely):
> "(SELECT * FROM R where attribute1=" + convertDatumToCharString(T->v[0])+
> " AND attribute2=" + convertDatumToCharString(T->v[1]) +" LIMIT 1)"
> + "UNION"
> ... as above.
>
> I can use prepare without conversions but i still have to construct the long
> query each time. I can't do prepare just once because the where clauses
> structures are always changing. Thus, i was wondering if i can
> also construct the part in the plan where i request to SELECT * FROM R...
> I.e. not to use strings at all. The structure of the query is the same all the
> time. I.e. there is the SELECT * FROM R and the WHERE clause with LIMIT 1
> nodes with UNION  ALL between SELECTS.

I see. You could do:

(SELECT * FROM R WHERE attribute1=$1 and attribute2=$2 LIMIT 1) ... etc ...

That saves the conversions to cstrings, but you're looking for something
more abstract than that. I'm not sure that exists.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

pgsql-hackers by date:

Previous
From: mark@mark.mielke.cc
Date:
Subject: Re: vacuum, performance, and MVCC
Next
From: Andrew Dunstan
Date:
Subject: Re: Anyone still care about Cygwin? (was Re: [CORE] GPL