Re: [SQL] 7.4 - FK constraint performance - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [SQL] 7.4 - FK constraint performance
Date
Msg-id 24413.1077125249@sss.pgh.pa.us
Whole thread Raw
In response to Re: [SQL] 7.4 - FK constraint performance  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
List pgsql-hackers
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> I did this, and changed the foreign keys to use it, but I haven't managed
> to build a fk case where I could actually detect a change in the plan
> chosen.  Since the queries are only a simple scan on the one table I'm
> wondering if it's basically just modifying both costs by the same value
> which means there's no real effect at all.

After fooling with this, I think you are right.  The planner is modeling
both cases as a linear slope from zero to estimated-total-cost, and
since the total number of tuples to be returned is the same, taking a
percentage won't change the outcome.

The only way we could improve the situation would be to somehow instruct
the planner that even though we don't know the parameter value to be
used when we are planning, it should expect that that value is *not* in
the table, rather than expecting that it has a distribution similar to
what is in the table.  Seems pretty messy.

I have occasionally speculated about postponing planning of
parameterized queries until they are first executed, and then using the
actual parameter values supplied in that first execution for purposes of
estimating costs and selectivity.  That would work pretty nicely in this
particular case, but in the real world I think it'd be mighty dangerous;
you could end up optimizing for an outlier case that isn't
representative of the queries you'll see later.

Another interesting line of thought is to let the user supply
representative values to be used for planning purposes.  In PREPARE,
you could imagine saying something like
PREPARE myquery (int = 42, text = 'http://www...') AS    SELECT ... where url like $2 ...

and then using the sample value 'http://www...' for purposes of
estimating the LIKE result.  Then it'd be on the user's head to pick
good representatives --- but he could make sure that they really were
representative, and not have to worry about luck of the draw from the
first live query.

I'm not sure if we could use such a feature automatically to bias FK
queries in the right direction, but it's something to think about.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: OIDs, CTIDs, updateable cursors and friends
Next
From: Bruce Momjian
Date:
Subject: Re: [PATCHES] NO WAIT ...