Thread: Slow Query

Slow Query

From
Parth Shah
Date:
Hi all,

We've been struggling with a slow query! -- and it's been exploding as rows have been added to relevant tables. It seems like a fairly common workflow, so we think we're overlooking the potential for an index or rewriting the query.

I've linked a document compiling the information as per the Postgresql recommendation for Slow Query Questions. Here's the link: https://docs.google.com/document/d/10qO5jkQNVtKw2Af1gcKAKiNw7tYFNQruzOQrUYXd4hk/edit?usp=sharing (we've enabled commenting)

Here's a high-level summary of the issue:
______

We’re trying to show a list of active conversations. Each conversation (named a spool in the database) has multiple threads, kind of like Slack channels. And the messages are stored in each thread. We want to return the 30 most recent conversations with recency determined as the most recent message in any thread of the conversation you are a participant of (you may not be a participant of certain threads in a conversation so it’s important those don’t leak sensitive data).

We found that as the number of threads increases, the query slowed down dramatically. We think the issue has to do with the fact that there is no easy way to go from a thread you are a participant to its most recent message, however, it is possible the issue is elsewhere. We’ve provided the full query and a simplified query of where we think the issue is, along with the EXPLAIN ANALYZE BUFFERS result. We figure this is not exactly an uncommon use case, so it’s likely that we are overlooking the potential for some missing indices or a better way to write the query. We appreciate the help and any advice!

______

We'd really appreciate any help and advice!

Best,
Parth

Re: Slow Query

From
Michael Lewis
Date:
Is there no index on thread.spool? What about notification.user? How about message.time (without thread as a leading column). Those would all seem very significant. Your row counts are very low to have a query perform so badly. Work_mem could probably be increased above 4MB, but it isn't hurting this query in particular.

My primary concern is that the query is rather chaotic at a glance. It would be great to re-write and remove the unneeded keywords, double quotes, totally worthless parentheses, etc. Something like the below may help you see the crux of the query and what could be done and understand how many rows might be coming out of those subqueries. I re-ordered some joins and there might be syntax errors, but give it a shot once you've added the indexes suggested above.

SELECT

spool.id,

handle.handle,

spool.name,

thread.id,

case.closed,

notification.read,

notification2.time,

message.message,

message.time,

message.author,

thread.name,

location.geo 

FROM

spool

JOIN handle ON handle.id = spool.id

JOIN thread ON thread.spool = spool.id

JOIN message ON message.thread = thread.id

LEFT JOIN location ON location.id = spool.location

LEFT JOIN case ON case.id = spool.id 

LEFT JOIN notification ON notification.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND 

notification.id = (

SELECT

notification3.id 

FROM

notification AS notification3

JOIN notification_thread ON notification_thread.id = notification3.id

JOIN thread AS thread2 ON thread2.id = notification_thread.thread 

WHERE

thread2.spool = spool.id

AND notification3.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND notification3.time <= '2020-09-30 16:32:38.054558'

ORDER BY

notification3.time DESC 

LIMIT 1 

)

LEFT JOIN notification AS notification2 ON notification2.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND notification2.id = (

SELECT

notification3.id 

FROM

notification AS notification3

JOIN notification_thread ON notification_thread.id = notification3.id

JOIN thread AS thread2 ON thread2.id = notification_thread.thread 

WHERE

thread2.spool = spool.id

AND notification3.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND notification3.time > '2020-09-30 16:32:38.054558'

ORDER BY

notification3.time DESC 

LIMIT 1 

) 

WHERE

message.time = (

SELECT

MAX ( message2.time ) 

FROM

message AS message2

JOIN thread AS thread2 ON thread2.id = message2.thread

JOIN participant ON participant.thread = thread2.id

JOIN identity ON identity.id = participant.identity

LEFT JOIN relation ON relation.to = identity.id

AND relation.from = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND relation.manages = TRUE

WHERE

NOT message2.draft 

AND ( identity.id = 'b16690e4-a3c5-4868-945e-c2458c27a525' OR NOT relation.to IS NULL )

AND thread2.spool = spool.id

LIMIT 1

) 

AND notification.id IS NOT NULL

ORDER BY

message.time DESC 

LIMIT 31;

Re: Slow Query

From
Parth Shah
Date:
Hi all,

Thanks, Michael (and Martin other thread)! We added those indexes you suggested, and went ahead and added indexes for all our foreign keys. We also added one combination index on notification (user, time). It led to a small constant factor speed up (2x) but is still taking a 13+ seconds. :( Still seems aggressively bad.

I've attached the updated, cleaned up query and explain analyze result (the extra chaos was due to the fact that we're using https://hackage.haskell.org/package/esqueleto-3.2.3/docs/Database-Esqueleto.html to generate the SQL). Maybe we're missing some multi-column indexes?

Best,
Parth

On Wed, Oct 14, 2020 at 3:18 PM Michael Lewis <mlewis@entrata.com> wrote:
Is there no index on thread.spool? What about notification.user? How about message.time (without thread as a leading column). Those would all seem very significant. Your row counts are very low to have a query perform so badly. Work_mem could probably be increased above 4MB, but it isn't hurting this query in particular.

My primary concern is that the query is rather chaotic at a glance. It would be great to re-write and remove the unneeded keywords, double quotes, totally worthless parentheses, etc. Something like the below may help you see the crux of the query and what could be done and understand how many rows might be coming out of those subqueries. I re-ordered some joins and there might be syntax errors, but give it a shot once you've added the indexes suggested above.

SELECT

spool.id,

handle.handle,

spool.name,

thread.id,

case.closed,

notification.read,

notification2.time,

message.message,

message.time,

message.author,

thread.name,

location.geo 

FROM

spool

JOIN handle ON handle.id = spool.id

JOIN thread ON thread.spool = spool.id

JOIN message ON message.thread = thread.id

LEFT JOIN location ON location.id = spool.location

LEFT JOIN case ON case.id = spool.id 

LEFT JOIN notification ON notification.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND 

notification.id = (

SELECT

notification3.id 

FROM

notification AS notification3

JOIN notification_thread ON notification_thread.id = notification3.id

JOIN thread AS thread2 ON thread2.id = notification_thread.thread 

WHERE

thread2.spool = spool.id

AND notification3.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND notification3.time <= '2020-09-30 16:32:38.054558'

ORDER BY

notification3.time DESC 

LIMIT 1 

)

LEFT JOIN notification AS notification2 ON notification2.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND notification2.id = (

SELECT

notification3.id 

FROM

notification AS notification3

JOIN notification_thread ON notification_thread.id = notification3.id

JOIN thread AS thread2 ON thread2.id = notification_thread.thread 

WHERE

thread2.spool = spool.id

AND notification3.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND notification3.time > '2020-09-30 16:32:38.054558'

ORDER BY

notification3.time DESC 

LIMIT 1 

) 

WHERE

message.time = (

SELECT

MAX ( message2.time ) 

FROM

message AS message2

JOIN thread AS thread2 ON thread2.id = message2.thread

JOIN participant ON participant.thread = thread2.id

JOIN identity ON identity.id = participant.identity

LEFT JOIN relation ON relation.to = identity.id

AND relation.from = 'b16690e4-a3c5-4868-945e-c2458c27a525'

AND relation.manages = TRUE

WHERE

NOT message2.draft 

AND ( identity.id = 'b16690e4-a3c5-4868-945e-c2458c27a525' OR NOT relation.to IS NULL )

AND thread2.spool = spool.id

LIMIT 1

) 

AND notification.id IS NOT NULL

ORDER BY

message.time DESC 

LIMIT 31;

Attachment

Re: Slow Query

From
Michael Lewis
Date:
Based on the execution plan, it looks like the part that takes 13 seconds of the total 14.4 seconds is just calculating the max time used in the where clause. Anytime I see an OR involved in a plan gone off the rails, I always always check if re-writing the query some other way may be faster. How's the plan for something like this?


WHERE message.time = greatest( sub1.time, sub2.time )

/* sub1.time */
(
select
MAX ( message2.time )
FROM
message AS message2
JOIN thread AS thread2 ON thread2.id = message2.thread
JOIN participant ON participant.thread = thread2.id
WHERE
NOT message2.draft
AND participant.identity = 'b16690e4-a3c5-4868-945e-c2458c27a525'
AND thread2.spool = spool.id
)

/* sub2.time */
(
select
MAX ( message2.time )
FROM
message AS message2
JOIN thread AS thread2 ON thread2.id = message2.thread
JOIN participant ON participant.thread = thread2.id
JOIN relation ON relation.to = participant.identity
AND relation.from = 'b16690e4-a3c5-4868-945e-c2458c27a525'
AND relation.manages = TRUE
WHERE
NOT message2.draft
AND thread2.spool = spool.id
)