Re: [HACKERS] Hypothetical suggestions for planner, indexing - Mailing list pgsql-performance

From Peter Childs
Subject Re: [HACKERS] Hypothetical suggestions for planner, indexing
Date
Msg-id Pine.LNX.4.44.0305060810080.11841-100000@RedDragon.Childs
Whole thread Raw
In response to Re: [HACKERS] Hypothetical suggestions for planner, indexing improvement  (Josh Berkus <josh@agliodbs.com>)
List pgsql-performance
On Mon, 5 May 2003, Josh Berkus wrote:

> Tom,
>
> > It's a thought.  Keeping complete cross-column correlation stats (for
> > every combination of columns in the DB) is obviously out of the
> > question.  If you're gonna do it you need a heuristic to tell you which
> > combinations of columns are worth keeping track of --- and foreign-key
> > relationships seem like a reasonable guide to the interesting
> > combinations.
>
> Yes.  It would also make FKs something more than just an annoying (and slow)
> constraint in PostgreSQL.   And it would be a performance feature that most
> other RDBMSs don't have ;-)

    That statement seams really strange, If FKs are really only a
safety lock to stop you from putting bad data in your database, It makes
them a bit pointless if you want a nice fast database and you can trust
your users! This does not make them useless and I still have them but from
a purely performance point of view they don't help currently!
    It may be worth adding Partial Matching FKs so that a user can
mark that they think might be a useful match to do. This would help the
fact that in many data sets NULL can mean more than one different thing.
(Don't Know, None, etc) plus using the index on IS NOT NULL queries would
be very handy when you need to know about all the records that you need to
find the information out for, or all the records with no relationship.

Peter Childs

>
> > I'm not sure about the long-term usefulness of optimizing EXISTS per se.
> > Seems to me that a lot of the present uses of EXISTS are workarounds
> > for Postgres' historic mistreatment of IN ... which we've attacked more
> > directly for 7.4.  But cross-column correlations are certainly useful
> > for estimating join sizes in general.
>
> EXISTS is more flexible than IN; how can you do a 3-column corellation on an
> IN clause?
>
> The reason that I mention EXISTS is because that's where the lack of
> cross-column corellation is most dramatic; the planner seems to estimate a
> flat 50% for EXISTS clauses regardless of the content.
>
>


pgsql-performance by date:

Previous
From: Reiner Dassing
Date:
Subject: Select on timestamp-day slower than timestamp alone
Next
From: Ron Johnson
Date:
Subject: Re: Looking for a cheap upgrade (RAID)