Re: Planning without reason. - Mailing list pgsql-hackers

From Tzahi Fadida
Subject Re: Planning without reason.
Date
Msg-id 200606231810.33462.Tzahi.ML@gmail.com
Whole thread Raw
In response to Re: Planning without reason.  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Planning without reason.  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
On Friday 23 June 2006 17:47, Martijn van Oosterhout wrote:
> On Fri, Jun 23, 2006 at 05:12:14PM +0300, Tzahi Fadida wrote:
> > My initial reasoning was to avoid extra sorts but i guess that the
> > planner just doesn't get the LIMIT 1. I see now that UNION should be
> > better for the planner to undestand (not performance wise).
> > However, UNION alone, doesn't seem to cut it.
> > Following is an example. t7 has 2 attributes and a non-unique index on
> > one attribute. here is a printout:
> > explain analyze (select * from t7 where a4=113 LIMIT 1) UNION (select *
> > from t7 where a2=139 LIMIT 1);
>
> What are the indexes on?  If you only have an index on a4, the latter
> query has to be an index scan and there's no way to optimise it way.

That's my point, it should have only used a sequence scan and not also
do an index scan.
In other words, it should have consolidated the two nodes of index scan and
sequence scan into a single plan node where you only scan sequentially the
relation and choose a tuple for each UNION clause.

>
> > > Also, couldn't you just do:
> > >
> > > SELECT * FROM R
> > > WHERE (a=3, b=6, ...)
> > > OR (b=5, d=2, ...)
> > > etc
> >
> > No, a filtering action is not enough since my goal is to only use indices
> > when retrieving single tuples each time thus, if i will use OR i cannot
> > control the number of tuples returned by each Or clause.
>
> I must admit, this is a really strange way of doing it. For example, if
> multiple rows match, the tuples eventually returned will be a random
> selection of the rows that matched. Especially with the "limit 1"
> there's no way the optimiser could combine the individual scans.

It is a query i use for full disjunction which is a part of the algorithm.
I am doing it manually, so i don't see why it can't do it itself.
I.e.: Scan sequentially R. for each UNION clause find a matching tuple.
the end.

>
> If you really need the "LIMIT 1" and you don't have full index coverage
> then you're quite limited as to how it can be optimised.

You misunderstood me, i wish the planner to only use sequence scan in the
event where even one node is a sequential scan.

>
> > > > I am currently just writing the query as a string and open a cursor.
> > > > Is there a simple way to use Datums instead of converting the
> > > > attributes to strings to create a plan for SPI.
> > > > 10x.
> > >
> > > I imagine SPI_prepare() and SPI_execp() would be used for this.
> >
> > I am already using SPI_prepare but it uses a query of the form of a char
> > string, which i need to prepare and is quite long. I.e. if i have 100
> > tuples i wish to retrieve it can be very wasteful to prepare the string
> > in memory and use SPI_prepare to prepare and later execute it.
> > better to use directly the datums (which i already have deformed from
> > previous operations).
>
> I'm confused here too. I thought the datums you're talking about were
> arguments, thus you could push them straight to SPI_execp(). But you
> seem to be suggesting parts of the actual query are in datum form also?

Example. i have a tuple T i am searching for.
T contains attribute1, attribute2. I have T in a
heap_deformtuple(T) manner, i.e., i have T->v and T->n (for nulls).
Currently i am doing (loosely):
"(SELECT * FROM R where attribute1=" + convertDatumToCharString(T->v[0])+
" AND attribute2=" + convertDatumToCharString(T->v[1]) +" LIMIT 1)"
+ "UNION"
... as above.

I can use prepare without conversions but i still have to construct the long
query each time. I can't do prepare just once because the where clauses
structures are always changing. Thus, i was wondering if i can
also construct the part in the plan where i request to SELECT * FROM R...
I.e. not to use strings at all. The structure of the query is the same all the
time. I.e. there is the SELECT * FROM R and the WHERE clause with LIMIT 1
nodes with UNION  ALL between SELECTS.

>
> Have a nice day,

--
Regards,
��������Tzahi.
--
Tzahi Fadida
Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info
WARNING TO SPAMMERS: �see at
http://members.lycos.co.uk/my2nis/spamwarning.html


pgsql-hackers by date:

Previous
From: Csaba Nagy
Date:
Subject: Re: vacuum, performance, and MVCC
Next
From: Martijn van Oosterhout
Date:
Subject: Re: Webcluster session storage, was vacuum, performance, and MVCC