Views, joins and LIMIT - Mailing list pgsql-performance

From Dawid Kuroczko
Subject Views, joins and LIMIT
Date
Msg-id 758d5e7f04101102543eab8dca@mail.gmail.com
Whole thread Raw
Responses Re: Views, joins and LIMIT
List pgsql-performance
I've been wondering...

Suppose we have two tables
CREATE TABLE messages (
   message_id serial PRIMARY KEY,
   message text NOT NULL
);
CREATE TABLE entries (
  entry_id serial PRIMARY KEY,
  message_id integer NOT NULL REFERENCES messages
);

And we have a join:
SELECT entry_id,message FROM entries NATURAL JOIN messages ORDER BY
entry_id DESC LIMIT 10;

The typical planners order of doing things is -- join the tables,
perform sort, perform limit.

But in the above case (which I guess is quite common) few things can be assumed.
1) to perform ORDER BY we don't need any join (entry_id is in our
entries table).
2) entries.entry_id references PRIMARY KEY, which is unique, so we
will have not less, not more but exactly one row per join (one row
from messages per one row from entries)
3) Knowing above, instead of performing join on each of thousands of
entries rows, we could perform ORDER BY and LIMIT before JOINing.
4) And then, after LIMITing we could JOIN those 5 rows.

This I guess would be quite benefitial for VIEWs. :)

Other thing that would be, I guess, benefitial for views would be
special handling of lines like this:

SELECT entry_id,message_id FROM entries NATURAL JOIN messages;

Here there is no reason to perform JOIN at all -- the data will not be used.
As above, since entries.message_id IS NOT NULL REFERENCES messages
and messages is UNIQUE (PRIMARY KEY) we are sure there will be one-to-one(*)
mapping between two tables.  And since these keys are not used, no need to
waste time and perform JOIN.

I wonder what you all think about it. :)

   Regards,
        Dawid

(*) not exactly one-to-one, because same messages.message_id can be
references many times from entries.message_id, but the join will
return exactly the same number of lines as would select * from
entries;

pgsql-performance by date:

Previous
From: Dennis Bjorklund
Date:
Subject: Re: First set of OSDL Shared Mem scalability results, some
Next
From: Mike Harding
Date:
Subject: COPY slows down?