Hi Andreas,
[New to this list, forgive my ignorance.]
On 05/01/2014 01:26 PM, Andreas Joseph Krogh wrote:
> I'm using PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu
My machine has PostgreSQL 9.1.13 on x86_64-unknown-linux-gnu.
> I have a schema where I have lots of messages and some users who might
> have read some of them. When a message is read by a user I create an
> entry i a table message_property holding the property (is_read) for
> that user.
> The schema is as follows:
> drop table if exists message_property;
> drop table if exists message;
> drop table if exists person;
> create table person(
> id serial primary key,
> username varchar not null unique
> );
> create table message(
> id serial primary key,
> subject varchar
> );
> create table message_property(
> message_id integer not null references message(id),
> person_id integer not null references person(id),
> is_read boolean not null default false,
> unique(message_id, person_id)
> );
[snip]
> So, for person 1 there are 10 unread messages, out of a total 1mill. 5
> of those unread does not have an entry in message_property and 5 have
> an entry and is_read set to FALSE.
> I have the following query to list all un-read messages for person
> with id=1:
> SELECT
> m.id AS message_id,
> prop.person_id,
> coalesce(prop.is_read, FALSE) AS is_read,
> m.subject
> FROM message m
> LEFT OUTER JOIN message_property prop ON prop.message_id = m.id
> AND prop.person_id = 1
> WHERE 1 = 1
> AND NOT EXISTS(SELECT
> *
> FROM message_property pr
> WHERE pr.message_id = m.id AND pr.person_id =
> prop.person_id AND prop.is_read = TRUE)
> ;
>
> The problem is that it's not quite efficient and performs badly,
> explain analyze shows:
[snip]
> Does anyone have suggestions on how to optimize the query or schema?
I'm getting better performance with:
SELECT
m.id AS message_id,
1 AS person_id,
FALSE AS is_read,
m.subject
FROM message m
WHERE 1 = 1
AND NOT EXISTS(SELECT
*
FROM message_property pr
WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);
You then lose the distinction between message_property with is_read =
FALSE, and nonexistent message_property for the message row.
If that is essential, I'm getting a roughly 2x speedup on my non-tuned
PostgreSQL with:
SELECT
m.id AS message_id,
prop.person_id,
coalesce(prop.is_read, FALSE) AS is_read,
m.subject
FROM message m
LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
prop.person_id = 1
WHERE not coalesce(prop.is_read, false);
HTH,
Jochem
--
Jochem Berndsen | jochem@functor.nl