Query planner/stored procedure cost - Mailing list pgsql-hackers

From Jason M. Felice
Subject Query planner/stored procedure cost
Date
Msg-id 20030220162617.GC2031@argo.eraserhead.net
Whole thread Raw
Responses Re: Query planner/stored procedure cost
List pgsql-hackers
Hello...

I haven't been subscribed in a while, but I've got an issue and am trying to
determine if the Right Way(tm) is the quickest way to fix it.

Basically, I have some very expensive stored procedures that determine whether
a user should have access to particular rows in a query (not a postgresql
user, we only use one postgresql user... the user is passed as a parameter to
the function).  The logic--per row--contains about a dozen queries and probably
averages eight queries per run, with short-circuiting and all.

So it is _very_ expensive.  Given that I use this function in lots of queries
with hairy joins and all, I'd much like for the optimizer to know what to do
with the function.  Empirically, I deduce that the optimizer treats all
procedures as inexpensive (it seems to always just tack it on to the `Filter'
slot when scanning the related table).

Currently I'm using stored procedures returning multiple rows to get around
the planner on these and defer the expensive procedure until the last possible
moment (so that joins and other table criteria have a chance to filter out
a lot of records).  This typically shaves 75% of the time off of these
queries.

So, the question is:

What am I looking at in doing the following:

1) Adding a mechanism to tell PostgreSQL how expensive a procedure is  (a system table which can be updated manually,
oran existing system  table if there is a logical place for it).
 

2) Updating the planner to consider the procedure's cost in estimates.

3) Changing the query planner to consider "bubbling up" the function to  an outer filter slot.

Possibly, also:

4) Changing the planner to order expressions in a `Filter' slot by cost.

although I don't mind doing this manually and I know the order can determine
which indices PostgreSQL uses.

I'm still mulling it over, and I'm guessing the real problem here is if it
is a wise generalization that we can "bubble-up" the function.  What if the
function has side effects?  Does this break?  We can at least do procedures
with `iscachable' flag.


Disclaimer:  I haven't every really hacked the planner code, but I have a
good feel for how it works from lots and _lots_ of experience with it <g>

-Jay 'Eraserhead' Felice


pgsql-hackers by date:

Previous
From: "Dave Page"
Date:
Subject: Re: request for sql3 compliance for the update command
Next
From: "Hiroshi Inoue"
Date:
Subject: Re: A bad behavior under autocommit off mode