Re: Why does a simple query not use an obvious index? - Mailing list pgsql-performance

From Mr Pink
Subject Re: Why does a simple query not use an obvious index?
Date
Msg-id 20040901125002.43149.qmail@web41102.mail.yahoo.com
Whole thread Raw
In response to Re: Why does a simple query not use an obvious index?  (Greg Stark <gsstark@mit.edu>)
List pgsql-performance
Hi Greg, Tom, etal

It's true that oracle only peeks during a hard parse, and this can have good or bad results
depending on the situation. Basically, the first value used in that query will determine the plan
until that query is bumped from the sql cache or the server is restarted. As far as I know, there
is no option to disable that feature in Oracle, I don't know about postgres.

Overall, I think it's a good feature because it helps us in the goal of reducing hardparsing (that
was it's real purpose in oracle). The trick as with all good features is to use it cleverly. For
example, you could run scripts on server startup that run such queries with optimal values before
any one gets back on. If your application has optimal use of bind variables allowing re-use of
query plan, and the sql cache has enough memory then the query plans you created at server startup
could be expected to be current for the life of that instance.

I write all this from my knowlegdge of Oracle, but I can't be sure how it applies to postgres.
Come to think about it, I don't think I've seen a good discussion of plan caching, hard parsing
and such like specifically related to pg. I'd really like to know more about how pg treats that
stuff.

regards
Mr Pink

--- Greg Stark <gsstark@mit.edu> wrote:

>
>
> [I'm actually responding to the previous post from Tom Lane, but I've deleted
> it and the archives seem to be down again.]
>
>
> The assumption being made is that the first provided result is representative
> of all future results. I don't see any reason that making this assumption of
> all stable functions should be less scary than making the assumption about
> user provided parameters.
>
> However I have the complementary reaction. I find peeking at the first
> bind parameter to be scary as hell. Functions seem slightly less scary.
>
> On Oracle Peeking at bind parameters is a feature explicitly intended for DSS
> data warehouse type systems. The use of placeholders there was purely for
> security and programming ease, not efficiency, since the queries are only
> planned executed a small number of times per plan. These are systems that
> suffered enormously without the parameter values. They often involved full
> table scans or bitmap index scans and without the statistics produced awful
> plans.
>
> For OLTP systems peeking at placeholders is more a danger than a benefit. The
> query will be executed thousands of times and if it's planned based on a
> single unusual value initially the entire system could fail.
>
> Consider the following scenario which isn't farfetched at all. In fact I think
> it well describes my current project:
>
> I have a table with a few million records. 99% of the time users are working
> with only a few hundred records at most. There's an index on the column
> they're keying off of. 1% of the key values have an unusually large number of
> records.
>
> Without peeking at placeholders the system should see that virtually all the
> key values are well under the threshold for an index scan to be best. So it
> always uses an index scan. 1% of the time it takes longer than that it would
> have with a sequential scan, but only by a small factor. (On the whole we're
> probably still better off avoiding the cache pollution anyways.)
>
> With peeking at placeholders 99% of the backends would perform the same way.
> However 1 backend in 100 sees one of these unusual values for its first query.
> This backend will use a sequential scan for *every* request. Executing a
> sequential table scan of this big table once a second this backend will drive
> the entire system into the ground.
>
> This means every time I start the system up I stand a small but significant
> chance of it just completely failing to perform properly. Worse, apache is
> designed to periodically start new processes, so at any given time the system
> could just randomly fall over and die.
>
> I would rather incur a 10% penalty on every query than have a 1% chance of it
> keeling over and dieing. Given this I would when I upgrade to 8.0 have to
> ensure that my application driver is either not using placeholders at all (at
> the protocol level -- I always prefer them at the api level) or ensure that
> postgres is *not* peeking at the value.
>
> I like the feature but I just want to be sure that it's optional.
>
> --
> greg
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>




__________________________________
Do you Yahoo!?
Yahoo! Mail - 50x more storage than other providers!
http://promotions.yahoo.com/new_mail

pgsql-performance by date:

Previous
From: Richard Huxton
Date:
Subject: Re: [ADMIN] slower every day
Next
From: "Michael Paesold"
Date:
Subject: Re: [ADMIN] slower every day