Re: bug in query planning? - Mailing list pgsql-general

From DeJuan Jackson
Subject Re: bug in query planning?
Date
Msg-id 3FE8E0C3.3060702@speedfc.com
Whole thread Raw
In response to Re: bug in query planning?  (Steven D.Arnold <stevena@neosynapse.net>)
Responses Re: bug in query planning?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
The queries are listed here for the referentially (yes that's a pun)
challenged.

Query 1:
SELECT COUNT(message_id)
 FROM messages m
      LEFT JOIN accounts a
       ON  m.account_id::bigint = a.account_id::bigint
 WHERE a.email = 'stevena@neosynapse.net';

Query 2:
SELECT COUNT(message_id)
 FROM accounts a
      LEFT JOIN messages m
       ON  a.account_id::bigint = m.account_id::bigint
 WHERE a.email = 'stevena@neosynapse.net';

Query 3:
SELECT COUNT(message_id)
 FROM messages m, accounts a
 WHERE m.account_id::bigint = a.account_id::bigint
       AND a.email = 'stevena@neosynapse.net';

 From what I can see they are not the same query and therefore shouldn't
use the same plan.

The first query is saying go get all the messages (best done with a seq
scan since there is no where to limit the results of the message table
[using an index scan would just add the overhead of reading the pages
for the index, the computational time to resolve the index entries, and
turn the table access into a random sector read instead of sequential
without actually limiting what gets returned]) match that with as many
accounts as you can and return a row for all of the messages (note the
LEFT JOIN).  Next filter all of the results on the account email  (which
only eliminates 1100 messages out of 52000).  Now count how many
messages are left which should return 51419.

The second query is saying get all of the accounts filter by email
address (it can get this from the where this time) giving 1 row.  Now
match that to every message for this account_id and return at least one
row even if there are no messages for this account (note again the LEFT
JOIN) (which uses the index scan because it expects the index
selectivity to be a approximately 1/4 of the full table [it's wrong]).
Now count how many messages I have which returns 51419.

The third query is saying give me all of the messages for the accounts
where my email = 'stevena@neosynapse.net' and I don't care where you
start from.  The optimizer,  after going through consideration of
various possible plans, is then smart enough to realize the email =
'blah' is indexed and it's selectivity is 1 row which means that we now
return to the situation in query 2 with one small change if there are no
messages for the account in question you would get no row returned,
leading to a more efficient aggregation step.

Steven D.Arnold wrote:

>
> On Dec 21, 2003, at 11:47 PM, Tom Lane wrote:
>
>> "Steven D.Arnold" <stevena@neosynapse.net> writes:
>>
>>> Query (2) below is the same query, but we reverse the order of the
>>> tables.  It's obviously not quite the same query semantically, even
>>> though in my case it should always produce the same result.
>>
You are correct the queries produce the same results, but they are
telling the planner to do completely different things.  The query
doesn't show it bu if the behavior you are desiring happened in postgres
(unless show the relational algebra that makes it work), I would have to
start looking for a new database (that's a disturbing thought).

>>
>> Since it is in fact not the same query, I'm unclear on why you expect
>> it to produce the same plan.
>
>
> What I expect is for both queries to use the index on the messages
> table!  Why is it not doing that?

Because of the table ordering and the left join in 7.3.x
Because of the left join in 7.4

>> FWIW, I believe that 7.4 will recognize that (1) and (3) are
>> semantically equivalent.
>
>
> I will try 7.4 and report back.

I don't believe the optimiser (in any database that cares about giving
you the correct results) can determine that a non-constrained primary
table in a left join can be rewritten as either of your other two
queries (but there are smarter people than me working on Postgres, so I
could be wrong).

>
> steve

My suggestion would be to place the more selective table first in a
JOIN, and get rid of the LEFT JOIN's unless that's exactly what you
want.  For more information about the different JOIN methods RTFM.

I would also suggest that you might want to tune your random page cost
toward 1, because obviously random access is being over estimated for
your hardware.  (You might just want to look at tuning your parameters
in general.)

And in the future you should run a query at least one extra time to note
the different caching makes (the second run for an explain analyze is
usually quite different than the first for tables of this size).

DeJuan


pgsql-general by date:

Previous
From: Paul Thomas
Date:
Subject: Re: getting configure options?
Next
From: bpalmer
Date:
Subject: Re: Mirrors that don't suck.