Re: Optimize query for listing un-read messages - Mailing list pgsql-performance

From Andreas Joseph Krogh
Subject Re: Optimize query for listing un-read messages
Date
Msg-id OfficeNetEmail.2b.83381fd877c3dffc.145b99dd5ea@prod2
Whole thread Raw
In response to Re: Optimize query for listing un-read messages  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Optimize query for listing un-read messages  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-performance
På torsdag 01. mai 2014 kl. 23:02:13, skrev Pavel Stehule <pavel.stehule@gmail.com>:
 
 
2014-05-01 22:30 GMT+02:00 Andreas Joseph Krogh <andreas@visena.com>:
På torsdag 01. mai 2014 kl. 21:53:32, skrev Pavel Stehule <pavel.stehule@gmail.com>:
 
 
2014-05-01 21:39 GMT+02:00 Andreas Joseph Krogh <andreas@visena.com>:
På torsdag 01. mai 2014 kl. 21:30:39, skrev Pavel Stehule <pavel.stehule@gmail.com>:
Hello
[snip]
 
I had a perfect success on similar use case with descent ordered partial index

http://www.postgresql.org/docs/9.3/interactive/sql-createindex.html
 
I'm not getting good performance. Are you able to craft an example using my schema and partial index?
 
maybe some like
 
CREATE INDEX ON message_property (person_id, message_id) WHERE pr.is_read
 
When I am thinking about your schema, it is designed well, but it is not index friendly, so for some fast access you should to hold a cache (table) of unread messages
 
Ah, that's what I was hoping to not having to do. In my system, messages arrive all the time and having to update a cache for all new messages for all users seems messy... Seems I could just as well create a message_property for all users when a new message arrives, so I can INNER JOIN it and get good performance. But that table will quickly grow *very* large...
 
What you need is a JOIN index, that is not possible in Postgres.
 
I afraid so some "ugly" solutions is necessary (when you require extra fast access). You need a index (small index) and it require some existing set - you cannot do index on the difference two sets.
 
I expect so some flag on the relation "message" - like "it should not be not read" can helps little bit - and can be used in partial index as conditions. Other possibility is some variant of partitioning - you can divide a messages and users to distinct sets and then you decrease a number of possible combinations.
 
Just curious:
Is such a JOIN index possible in other DBs, if so - which?
Can other DBs do index on difference between two sets?
Will PG ever have this, is it even possible?
 
In my real system the message_property holds other properties for a message also, so I have to LEFT OUTER JOIN with it to get the properties where they exist when listing messages. The system works by assuming that when an entry in message_property does not exist for a given message for a given user then the property is equal to "false".
 
I don't quite see how maintaining a message_properrty entry for all users, for all messages, is a good idea. That's quite some work to be done when adding/removing users etc.
 
Thanks for having this discussion.
 
--
Andreas Jospeh Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 
Attachment

pgsql-performance by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Optimize query for listing un-read messages
Next
From: David G Johnston
Date:
Subject: Re: Optimize query for listing un-read messages